BUILDING HIGH PROFILE 
EDITORIAL SITES 
YANN MALET 
2014.DJANGOCON.EU 
MAY 2014
ABOUT THIS TALK 
● It comes after 
− Data Herding: How to Shepherd Your Flock Through 
Valleys of Darkness (2010) 
− Breaking down the process of building a custom 
CMS (2010) 
− Stop Tilting at Windmills - Spotting Bottlenecks 
(2011)
AGENDA 
● Foreword 
● Multi layer cache to protect your database 
● Image management on responsive site 
● Devops
HIGH PERFORMANCE 
Django is web scale...
… AS ANYTHING ELSE ...
AGENDA 
● Foreword 
● Multi layer cache to protect your database 
● Image management on responsive site 
● Devops
VARNISH CACHE
VARNISH 
● Varnish Cache is a web application accelerator 
− aka caching HTTP reverse proxy 
− 10 – 1000 times faster 
● This is hard stuff; don't try to reinvent this wheel
VARNISH: TIPS AND TRICKS 
● Strip cookies 
● Saint Mode 
● A custom error page is better than Guru Meditation
STRIP COOKIES 
● Increasing the hit rate is all about reducing what responses vary on
− Vary: on request headers such as
● Accept-Language 
● Cookie
STRIP COOKIES 
sub vcl_recv {
    # unless sessionid/csrftoken is in the request,
    # don't pass ANY cookies (referral_source, utm, etc)
    if (req.request == "GET" &&
        (req.url ~ "^/static" ||
         (req.http.cookie !~ "sessionid" &&
          req.http.cookie !~ "csrftoken"))) {
        remove req.http.Cookie;
    }
    ...
}
sub vcl_fetch {
    # pass through for anything with a session/csrftoken set
    if (beresp.http.set-cookie ~ "sessionid" ||
        beresp.http.set-cookie ~ "csrftoken") {
        return (pass);
    } else {
        return (deliver);
    }
    ...
}
VARNISH: SAINT MODE 
● Varnish Saint Mode lets you serve stale content 
from cache, even when your backend servers are 
unavailable. 
− http://lincolnloop.com/blog/varnish-saint-mode/
VARNISH: SAINT MODE 1/2 
# /etc/varnish/default.vcl
backend default {
    .host = "127.0.0.1";
    .port = "8000";
    .saintmode_threshold = 0;
    .probe = {
        .url = "/";
        .interval = 1s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}
sub vcl_recv {
    if (req.backend.healthy) {
        set req.grace = 1h;
        set req.ttl = 5s;
    } else {
        # Accept serving stale object (extend TTL by 6h)
        set req.grace = 6h;
    }
}
VARNISH: SAINT MODE 2/2 
sub vcl_fetch {
    # keep all objects for 6h beyond their TTL
    set beresp.grace = 6h;
    # If we fetch a 500, 502 or 503, serve stale content instead
    if (beresp.status == 500 ||
        beresp.status == 502 ||
        beresp.status == 503) {
        set beresp.saintmode = 30s;
        return(restart);
    }
}
VARNISH: SAINT MODE 
.url: Build the default probe request with this URL.
.timeout: How fast the probe must finish. You must specify a time unit with the number, such as “0.1 s”, “1230 ms” or even “1 h”.
.interval: How long to wait between polls. You must specify a time unit here too. Note that this is an interval, not a rate: the lowest poll rate is (.timeout + .interval).
.window: How many of the latest polls to consider when determining if the backend is healthy.
.threshold: How many of the last .window polls must be good for the backend to be declared healthy.
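The interplay between .window and .threshold can be illustrated with a small model. This is only a sketch of the rule described above, not Varnish's implementation; the `ProbeWindow` class and its method names are hypothetical, and details such as the probe's initial state are ignored.

```python
from collections import deque


class ProbeWindow:
    """Simplified model: healthy if enough of the last `window` polls were good."""

    def __init__(self, window=5, threshold=3):
        # deque(maxlen=...) silently drops the oldest result once it is full
        self.results = deque(maxlen=window)
        self.threshold = threshold

    def record(self, ok):
        """Store the outcome of one poll (True = good, False = failed)."""
        self.results.append(bool(ok))

    @property
    def healthy(self):
        # Count the good polls still inside the window
        return sum(self.results) >= self.threshold
```

With the values from the slide (.window = 5, .threshold = 3), a backend needs three good polls among the last five before it is considered healthy again.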
VARNISH: CUSTOM ERROR PAGE 
sub vcl_error {
    ...
    # Otherwise, return the custom error page
    # (std.fileread needs "import std;" at the top of the VCL file)
    set obj.http.Content-Type = "text/html; charset=utf-8";
    synthetic std.fileread("/var/www/example_com/varnish_error.html");
    return(deliver);
}
● Use a nicely formatted error page instead of the default white Guru Meditation page
CACHING STRATEGY IN YOUR APP
INEVITABLE QUOTE 
„THERE ARE ONLY TWO HARD THINGS IN
COMPUTER SCIENCE:
CACHE INVALIDATION AND NAMING
THINGS, AND OFF-BY-ONE ERRORS.“
– PHIL KARLTON
CACHING STRATEGY 
● Russian doll caching 
● Randomize your cache invalidation for the
HTML cache
● Cache buster URL for your HTML cache 
● Cache database queries 
● More resilient cache backend
RUSSIAN DOLL CACHING 
● Nested cache with increasing TTL as you walk down
{% cache MIDDLE_TTL "article_list" request.GET.page last_article.id last_article.last_modified %}
    {% include "includes/article/list_header.html" %}
    <div class="article-list">
    {% for article in article_list %}
        {% cache LONG_TTL "article_list_teaser_" article.id article.last_modified %}
            {% include "includes/article/article_teaser.html" %}
        {% endcache %}
    {% endfor %}
    </div>
{% endcache %}
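The same nesting can be sketched outside the template layer. The snippet below is a minimal illustration, assuming an in-memory dict in place of memcached; the helper names (`cached`, `render_teaser`, `render_list`) and the TTL values are hypothetical and not taken from the talk.

```python
import time

_store = {}            # stand-in for memcached: key -> (expires_at, value)
calls = {"teaser": 0}  # counts real renders, to show what the cache saves

MIDDLE_TTL = 60 * 10       # assumed values, for illustration only
LONG_TTL = 60 * 60 * 24


def cached(key, ttl, build):
    """Return the cached value for key, or build, store and return it."""
    hit = _store.get(key)
    if hit is not None and hit[0] > time.time():
        return hit[1]
    value = build()
    _store[key] = (time.time() + ttl, value)
    return value


def render_teaser(article):
    calls["teaser"] += 1
    return "<li>%s</li>" % article["title"]


def render_list(articles):
    # Outer doll: keyed on the newest article, shorter TTL.
    last = articles[0]
    outer_key = "article_list:%s:%s" % (last["id"], last["last_modified"])

    def build():
        # Inner dolls: one fragment per article, longer TTL. The key
        # contains last_modified, so an edited article gets a new key.
        parts = [
            cached(
                "teaser:%s:%s" % (a["id"], a["last_modified"]),
                LONG_TTL,
                lambda a=a: render_teaser(a),
            )
            for a in articles
        ]
        return "<ul>%s</ul>" % "".join(parts)

    return cached(outer_key, MIDDLE_TTL, build)
```

When one article changes, only its own teaser is rendered again; the other inner fragments are reused while the outer fragment is rebuilt around them.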
RUSSIAN DOLL CACHING 
It gets faster as traffic increases
RANDOMIZED CACHE TTL
● Do not invalidate all the `X_TTL` at the same time
− Modify cache templatetag: TTL +/- 20%
● Fork the {% cache … %} templatetag
# needs: from random import randint
try:
    expire_time = int(expire_time)
    # randint() needs integer bounds, so truncate the +/- 20% limits
    expire_time = randint(int(expire_time * 0.8), int(expire_time * 1.2))
except (ValueError, TypeError):
    raise TemplateSyntaxError(
        '"cache" tag got a non-integer timeout value: %r' % expire_time)
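The jitter itself can be isolated in a small helper. This is a sketch under assumptions: the function name `jittered_ttl` and its parameters are hypothetical, and integer arithmetic is used so the bounds passed to `randint` are always whole numbers.

```python
import random


def jittered_ttl(ttl, spread_percent=20, rng=random):
    """Return an integer TTL picked uniformly within +/- spread_percent of ttl."""
    ttl = int(ttl)
    # Integer math keeps both randint() bounds whole numbers
    delta = ttl * spread_percent // 100
    return rng.randint(ttl - delta, ttl + delta)
```

Fragments cached at the same moment with a 300 second TTL would then expire somewhere between 240 and 360 seconds later, which spreads the regeneration load.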
CENTRAL TTL DEFINITION 
● Context processor to set TTL 
− SHORT_TTL 
− MIDDLE_TTL 
− LONG_TTL 
− FOR_EVER_TTL (* not really)
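A context processor for this can be very small. The sketch below is an assumption about how it might look: the constant names come from the slide, but the values and the function name `cache_ttl` are hypothetical. A Django context processor is a plain function that takes the request and returns a dict merged into every template context.

```python
# Assumed values in seconds; tune them per project
SHORT_TTL = 60
MIDDLE_TTL = 60 * 10
LONG_TTL = 60 * 60 * 24
FOR_EVER_TTL = 60 * 60 * 24 * 30  # "not really" forever


def cache_ttl(request):
    """Expose the TTL constants to every template."""
    return {
        "SHORT_TTL": SHORT_TTL,
        "MIDDLE_TTL": MIDDLE_TTL,
        "LONG_TTL": LONG_TTL,
        "FOR_EVER_TTL": FOR_EVER_TTL,
    }
```

Once the function is listed in the project's context processor settings, templates can use `{% cache MIDDLE_TTL ... %}` without each view passing the value.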
RESILIENT CACHE BACKEND 
● Surviving node outages is not included 
− Wrap the Django cache backend in try / except 
− You might also want to report it in New Relic 
● Fork Django cache backend
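The try / except idea can be sketched as a thin wrapper. This is an illustration only, not the forked backend from the talk: `ResilientCache` is a hypothetical name, it wraps any object with `get` and `set` methods, and it logs with the standard library where a real project might report to its monitoring service.

```python
import logging

logger = logging.getLogger(__name__)


class ResilientCache:
    """Wrap a cache client so backend errors degrade to cache misses."""

    def __init__(self, backend):
        self.backend = backend

    def get(self, key, default=None):
        try:
            return self.backend.get(key, default)
        except Exception:
            # A dead cache node should look like a miss, not a 500
            logger.exception("cache get failed for %r", key)
            return default

    def set(self, key, value, timeout=None):
        try:
            self.backend.set(key, value, timeout)
            return True
        except Exception:
            logger.exception("cache set failed for %r", key)
            return False
```

The trade-off is that a cache outage turns into extra load on the database instead of errors, so the logging matters for noticing it.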
CACHE BUSTER URL 
● http://example.com/*/?PURGE_CACHE_HTML
● This URL
− traverses your stack
− purges the HTML cache fragment
− generates a fresh one
● Fork the {% cache … %} templatetag
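The purge logic inside such a forked tag can be sketched as follows. This is a hypothetical outline, not the talk's implementation: `cached_fragment` and its arguments are assumed names, `cache` is any object with `get(key)` and `set(key, value, ttl)`, and `query_params` stands for the request's GET parameters.

```python
PURGE_PARAM = "PURGE_CACHE_HTML"


def cached_fragment(cache, key, ttl, render, query_params):
    """Return a cached HTML fragment, re-rendering it when a purge is requested."""
    purge = PURGE_PARAM in query_params
    if not purge:
        html = cache.get(key)
        if html is not None:
            return html
    # Cache miss or explicit purge: render and overwrite the stored fragment
    html = render()
    cache.set(key, html, ttl)
    return html
```

Because the fresh fragment overwrites the old one, visitors who request the page without the parameter get the regenerated HTML straight from the cache.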
CACHING DB QUERIES 
● Johnny Cache
− It is a middleware, so there are surprising side effects
− If you change the DB outside the request / response cycle, enable it explicitly
# johnny/cache.py
def enable():
    """Enable johnny-cache, for use in scripts, management
    commands, async workers, or other code outside the Django
    request flow."""
    get_backend().patch()
MULTIPLE CACHE BACKENDS 
CACHES = {
    'default': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'OPTIONS': cache_opts,
        'VERSION': 1},
    'html': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'TEMPLATETAG_CACHE': True,
        'VERSION': 1},
    'session': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'VERSION': 1,
        'OPTIONS': cache_opts},
    'johnny': {
        'BACKEND': 'myproject.apps.core.backends.cache.JohnnyPyLibMCCache',
        'JOHNNY_CACHE': True,
        'VERSION': 1},
}
CACHED_DB SESSION 
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"
SESSION_CACHE_ALIAS = "session"
AGENDA 
● Foreword 
● Multi layer cache to protect your database 
● Image management on responsive site 
● Devops
RESPONSIVE DESIGN IMPACTS 
● 3x more image sizes 
− Desktop 
− Tablet 
− Mobile
IMAGE MANAGEMENT 
● Django-filer 
● Easy-thumbnails 
● Cloudfiles (cloud containers) 
● The assumption of a fast & reliable disk should be forgotten
− The software stack is not helping, a lot of work is left to you
● Forked
− Django-filer (fork)
− Easy-thumbnails (fork, very close to being able to drop it)
− Django-cumulus (81 forks)
− Monkey patch pyrax
− ...
Huh?!!!
DJANGO-CUMULUS 
● The truth is much worse
− Log everything from the swiftclient
● Target 0 calls to the API and DB on a hot page
− The main repo is getting better ...
'loggers': {
    ...
    'django.db': {
        'handlers': ['console'],
        'level': 'DEBUG',
        'propagate': True,
    },
    'swiftclient': {
        'handlers': ['console'],
        'level': 'DEBUG',
        'propagate': True,
    },
},
DJANGO-CUMULUS 
● Django storage backend for Cloudfiles from Rackspace
− Be straight to the point when talking to a slow API
diff --git a/cumulus/storage.py b/cumulus/storage.py
@@ -201,6 +202,19 @@ class SwiftclientStorage(Storage):
...
+    def save(self, name, content):
+        """
+        Don't check for an available name before saving, just overwrite.
+        """
+        # Get the proper name for the file, as it will actually be saved.
+        if name is None:
+            name = content.name
+        name = self._save(name, content)
+        # Store filenames with forward slashes, even on Windows
+        return force_text(name.replace('\\', '/'))
DJANGO-CUMULUS 
Trust unreliable API at scale 
diff --git a/cumulus/storage.py b/cumulus/storage.py
@@ -150,8 +150,11 @@ class SwiftclientStorage(Storage):
     def _get_object(self, name):
         """
         Helper function to retrieve the requested Object.
         """
-        if self.exists(name):
+        try:
             return self.container.get_object(name)
+        except pyrax.exceptions.NoSuchObject as err:
+            pass
@@ -218,7 +221,7 @@ class SwiftclientStorage(Storage):
     def exists(self, name):
         """
         exists in the storage system, or False if the name is
         available for a new file.
         """
-        return name in self.container.get_object_names()
+        return bool(self._get_object(name))
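The effect of this patch on the number of API round trips can be shown with a stand-in container. Everything below is hypothetical scaffolding (`FakeContainer`, the local `NoSuchObject`, and the two helper functions); it only mimics the shape of the calls in the diff and does not use pyrax.

```python
class NoSuchObject(Exception):
    """Stand-in for pyrax.exceptions.NoSuchObject."""


class FakeContainer:
    """Hypothetical container that counts simulated API round trips."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.api_calls = 0

    def get_object_names(self):
        self.api_calls += 1
        return list(self.objects)

    def get_object(self, name):
        self.api_calls += 1
        try:
            return self.objects[name]
        except KeyError:
            raise NoSuchObject(name)


def get_object_checked(container, name):
    """Original approach: list the names first, then fetch (two calls on a hit)."""
    if name in container.get_object_names():
        return container.get_object(name)
    return None


def get_object_direct(container, name):
    """Patched approach: fetch directly and treat a miss as None (one call)."""
    try:
        return container.get_object(name)
    except NoSuchObject:
        return None
```

On a hit the checked version costs two round trips and the direct version one, which adds up quickly when every call goes to a slow remote API.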
PATCH PYRAX 
● Assume the best
− Reduce the auth attempts
− Reduce the connection timeout
def patch_things():
    # Automatically generate thumbnails for all aliases
    models.signals.post_save.connect(queue_thumbnail_generation)
    # Force the retries for pyrax to 1, to stop the request doubling
    pyrax.cf_wrapper.client.AUTH_ATTEMPTS = 1
    pyrax.cf_wrapper.client.CONNECTION_TIMEOUT = 2
GENERATE THE THUMBS 
● Generate the thumbs as soon as possible
− post save signals that offload to a task
− easy-thumbnails
def queue_thumbnail_generation(sender, instance, **kwargs):
    """
    Iterate over the sender's fields, and if there
    is a FileField instance (or a subclass like
    MultiStorageFileField) send the instance to a
    task to generate all the thumbnails defined
    in settings.THUMBNAIL_ALIASES.
    """
    …
PICTUREFILL.JS 
… A Responsive Images approach that you can use today that mimics the proposed picture element using spans...
− The old API (1.2.1) is demonstrated here
<span data-picture data-alt="A giant stone face in Angkor Thom, Cambodia">
    <span data-src="small.jpg"></span>
    <span data-src="medium.jpg" data-media="(min-width: 400px)"></span>
    <span data-src="large.jpg" data-media="(min-width: 800px)"></span>
    <span data-src="extralarge.jpg" data-media="(min-width: 1000px)"></span>
    <!-- Fallback content for non-JS browsers. Same img src as the initial,
         unqualified source element. -->
    <noscript>
        <img src="small.jpg" alt="A giant stone face in Angkor Thom, Cambodia">
    </noscript>
</span>
PUTTING IT ALL TOGETHER 1/2 
● Iterate through article_list 
● Nested cache 
<!-- article_list.html -->
{% extends "base.html" %}
{% load image_tags cache_tags pagination_tags %}
{% block content %}
{% cache MIDDLE_TTL "article_list_" category author tag request.GET.page all_pages %}
    <div class="article-list archive-list">
    {% for article in object_list %}
        {% cache LONG_TTL "article_teaser_" article.id article.modified %}
            {% include "newsroom/includes/article_teaser.html" with columntype="categorylist" %}
        {% endcache %}
    {% endfor %}
    </div>
{% endcache %}
{% endblock %}
PUTTING IT ALL TOGETHER 2/2 
<!-- article_teaser.html -->
{% load image_tags %}
<section class="blogArticleSection">
    {% if article.image %}
    <a href="{{ article.get_absolute_url }}" class="thumbnail">
        <span data-picture data-alt="{{ article.image.default_alt_text }}">
            <span data-src="{{ article.image|thumbnail_url:"large" }}"></span>
            <span data-src="{{ article.image|thumbnail_url:"medium" }}" data-media="(min-width: 480px)"></span>
            <span data-src="{{ article.image|thumbnail_url:"small" }}" data-media="(min-width: 600px)"></span>
            <noscript>
                <img src="{{ article.image|thumbnail_url:"small" }}" alt="{{ article.image.default_alt_text }}">
            </noscript>
        </span>
    </a>
    {% endif %}
    ...
Use Picturefill to render your images
AGENDA 
● Foreword 
● Multi layer cache to protect your database 
● Image management on responsive site 
● Devops
DEVOPS 
● Configuration management 
● Single command deployment for all 
environments 
● Settings parity
CONFIGURATION MANAGEMENT 
● Pick one that fits your brain & skillset 
− Puppet 
− Chef 
− Ansible 
− Salt 
● At Lincoln Loop we are using Salt 
− One master per project 
− Minion installed on all the cloud servers
SALT 
● Provision & deploy a server role 
● +X app servers to absorb a traffic spike 
● Replace an unsupported OS 
● Update a package 
● Run a one-liner command 
− Restart a service on all instances 
● Varnish, memcached, ... 
− Check the version
SINGLE COMMAND DEPLOYMENT 
● One-liner or you will get it wrong 
● Consistency for each role is critical 
− Avoid endless debugging of pseudo-random issues
SETTINGS PARITY
● It is the utopia you want to tend toward, but …
− There are some differences 
● Avoid logic in settings.py 
● Fetch data from external sources: .env
SETTINGS.PY READS FROM .ENV 
import os
import ConfigParser  # Python 2 module name (configparser on Python 3)
from superproject.settings.base import *
TEMPLATE_LOADERS = (
    ('django.template.loaders.cached.Loader', TEMPLATE_LOADERS),)
# Environment-specific values live in an INI-style .env file
config = ConfigParser.ConfigParser()
config.read(os.path.abspath(VAR_ROOT + "/../.env"))
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': config.get("mysql", "mysql_name"),
        'USER': config.get("mysql", "mysql_user"),
        'PASSWORD': config.get("mysql", "mysql_password"),
        'HOST': config.get("mysql", "mysql_host"),
        'PORT': config.get("mysql", "mysql_port"),
    }
}
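The same pattern can be written for Python 3, where the module is named `configparser`. The sketch below is an illustration under assumptions: the helper name `load_database_settings` is hypothetical, and the `.env` file is assumed to be INI-formatted with a `[mysql]` section holding the keys used on the slide.

```python
import configparser


def load_database_settings(env_path):
    """Build a Django-style DATABASES dict from the [mysql] section of env_path."""
    config = configparser.ConfigParser()
    # read() returns the list of files it managed to parse; empty means missing
    if not config.read(env_path):
        raise RuntimeError("settings file not found: %s" % env_path)
    section = config["mysql"]
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "NAME": section["mysql_name"],
            "USER": section["mysql_user"],
            "PASSWORD": section["mysql_password"],
            "HOST": section["mysql_host"],
            "PORT": section["mysql_port"],
        }
    }
```

Failing loudly when the file is missing avoids a deployment that silently starts with empty credentials; `settings.py` itself stays free of per-environment logic.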
CONCLUSION 
● Multi-layer Cache to protect your database 
− Varnish 
− Russian doll cache for the HTML fragments 
● Smart key naming and invalidation condition 
● Cache buster URL 
● Image management 
− Harder on high traffic responsive site 
− Software stack not mature 
● Devops 
− Configuration management is a must 
− Try to have settings parity between your environments
HIGH PERFORMANCE DJANGO 
Kickstarter 
http://lloop.us/hpd
BACKUP SLIDES
A WORD ABOUT LEGACY MIGRATION 
● This is often the hardest part to estimate
− Huge volume of data
− Often inconsistent
− Unknown implicit business logic
● At scale, if something can go wrong, it will
● It always takes longer
REUSING PUBLISHED 
APPLICATIONS 
● Careful review before adding an external requirement
− Read the code 
● Best practice 
● Security audit 
− Can operate at your targeted scale 
− In line with the rest of your project 
● It is not a binary choice; you can
− extract a very small part 
− Write your own version based on what you learned
Challenges when building high profile editorial sites
