Cache That!: Gopal Vijayaraghavan Yahoo Inc

This document discusses caching strategies to optimize website performance. It recommends caching content at various levels from the browser to backend databases. It focuses on two PHP caching tools - APC and Memcached. APC provides opcode caching and works well for small amounts of mostly static data, while Memcached is better for larger, more dynamic datasets due to its distributed nature and ability to handle cache expiration. The document provides tuning tips for APC and strategies for preventing "cache slams" when cache entries expire.

Copyright
© Attribution Non-Commercial (BY-NC)

Cache That!

Gopal Vijayaraghavan
Yahoo Inc.

July 26th, 2007


Cache ? Why ?
- Easiest optimisation you could do
- Maximum return on work invested
- Hopefully transparent
- Probably doesn't need a complete redesign
- especially of business logic
- Tradeoff between speed and accuracy
- 80+% of Web content is not mission critical
- scale is more important
Sandwich Theory

http://flickr.com/photos/ozute/87465787/
Sandwich Theory
- Pages are dynamic
- Like a sandwich made to order.
- The whole page is built to order, but bits of content require varying “freshness”
- A jar of pre-cut olives is the next best thing after sliced bread
“fast”: latency & throughput

http://flickr.com/photos/brtsergio/184026033/
Cache: strategies and levels
- Browser land
- Content proxying
- Pre-generating content
- Active data caching
- Backend caches
So, what's this talk about ?

http://flickr.com/photos/frogmuseum2/238601344
Common Sense: Repeated
- Browser caches ?
- DNS, Expires:, Cache-Control:
- Squid reverse proxies ?
- Ad-hoc semi-static content
- Content pre-gen ?
- RSS feeds & other periodically updated content
- Backend caches ?
- DB query caches, disk/OS caches
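A minimal sketch of the browser-cache bullet above: a hypothetical helper, `cache_headers()`, builds matching `Cache-Control:` and `Expires:` values for a response that may be reused for a given number of seconds. The helper name and the 300-second lifetime are assumptions for illustration only.

```php
<?php
// Hypothetical helper: build browser-cache headers for a response
// that may be reused for $ttl seconds, counted from the timestamp $now.
function cache_headers($ttl, $now)
{
    return array(
        'Cache-Control: public, max-age=' . (int) $ttl,
        // HTTP dates are always expressed in GMT
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $ttl) . ' GMT',
    );
}

// Usage: send the headers before any output
if (!headers_sent()) {
    foreach (cache_headers(300, time()) as $line) {
        header($line);
    }
}
```

Keeping the two headers in one place makes it harder for them to drift apart, since both are derived from the same lifetime.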
Active Data Caching
- We're actually in PHP land now
- Two main cache tools
- APC
- Memcached
- APC is PHP-specific
- Memcached is more generic and distributed
Choices

http://flickr.com/photos/elsie/347928333/
Red or Blue ?
APC                         Memcached
Single server               Distributed
Stores arrays as-is         Needs serialization
Small data                  Large data safe
Data churn causes issues    Handles data churn better
Does opcodes too            Only data
APC: Opcode Cache
- Drop-in & use
- Prevents disk access & syscalls
- Is really old hat by now
- Tuning & tweaking
- nostat mode
- include_once_override
- locking modes
Tuning APC – check list
- You need to tweak APC if you've got
- a large number of files (> 1024)
- a large cache footprint (> 64 M)
- a mess of include_onces
- a highly OOP design
- very static php files
- rsync and inode re-use
APC Tuneables
- apc.ini
- apc.shm_size (64 M)
- apc.num_files_hint (512)
- apc.stat (On)
- apc.stat_ctime (Off)
- apc.include_once_override (Off)
- apc.filters ()
- apc.localcache (Off)
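For reference, the tuneables above could be written out as an `apc.ini` fragment along these lines. The values are the ones shown in parentheses on the slide. The comments are one reading of each directive, and both the defaults and the accepted value formats vary between APC releases, so check them against the installed version.

```ini
; apc.ini: values as listed on the slide

; shared memory segment size, in MB
apc.shm_size = 64
; rough number of distinct source files expected in the cache
apc.num_files_hint = 512
; On: stat() each file on every request to detect changes
apc.stat = 1
; Off: compare mtime only, not ctime
apc.stat_ctime = 0
; Off: keep the stock include_once/require_once behaviour
apc.include_once_override = 0
; empty: no files are excluded from the cache
apc.filters =
; Off: no per-process shadow cache
apc.localcache = 0
```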
APC monitoring
- Monitor APC using the apc.php in the pkg
- Provides basic information about
- memory usage
- fragmentation
- cache request/insert rates
- user and file
- expunge/cache-full count
Tune your Code - I
- Reduce include/include_once calls
- Template & stitch
- R3 (http://rthree.sf.net/)
- Use full include paths
- include("./z.php"); instead of include("z.php");
- Use constant includes
- don't wrap include with your functions
- Avoid eval() at all costs
Inclued
Inclued: Smarty
APC: var_export gotchas
- PHP opcodes are cached in APC
- So, putting a constant array on a file seems OK
- But here's the “constant” array in PHP

line    #   op                       ext   operands
-------------------------------------------------------
   2    0   INIT_ARRAY                     ~0, 'z', 'x'
   3    1   ADD_ARRAY_ELEMENT              ~0, 'b', 'a'
   4    2   ADD_ARRAY_ELEMENT              ~0, 'c', 'b'
   5    3   ADD_ARRAY_ELEMENT              ~0, 'd', 'c'
        4   ASSIGN                         !0, ~0
   7    5   RETURN                         1
        6   ZEND_HANDLE_EXCEPTION
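A sketch of the pattern this slide warns about, assuming the "constant" array file is generated with `var_export()`. The array contents and the temporary file are made up for illustration. The file's opcodes can be cached, but each `include` still executes them and rebuilds the array element by element, as in the dump above.

```php
<?php
// Illustrative data; any array would do.
$data = array('a' => 'x', 'b' => 'y', 'c' => 'z');

// Generate a PHP file that returns the array as a literal.
$file = tempnam(sys_get_temp_dir(), 'arr');
file_put_contents($file, '<?php return ' . var_export($data, true) . ';');

// Each include runs the array-building opcodes again, cached or not.
$loaded = include $file;
unlink($file);

var_dump($loaded === $data);
```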
APC: Dynamic Style
- apc_fetch()/apc_store()
- The interface is simple enough
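// Use the $success flag so that a cached empty or false value
// is not mistaken for a cache miss
$data = apc_fetch('data', $success);
if (!$success) {
    $data = array( .... );
    apc_store('data', $data);
}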

- But with a Time To Live (TTL) setting, data expires
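A hedged sketch of the TTL variant: `apc_store()` takes an optional third argument, the lifetime in seconds, after which `apc_fetch()` reports a miss and the data has to be rebuilt. The function name `get_data()`, the key, the 60-second lifetime and the `build_data()` stand-in are assumptions; the code needs the APC extension loaded.

```php
<?php
// Stand-in for whatever expensive work produces the data (hypothetical).
function build_data()
{
    return array('built_at' => time());
}

// Fetch from the APC user cache; rebuild and re-store on a miss,
// which includes the case where the entry has expired.
function get_data($ttl = 60)
{
    $data = apc_fetch('data', $success);
    if (!$success) {
        $data = build_data();
        apc_store('data', $data, $ttl); // entry expires $ttl seconds from now
    }
    return $data;
}
```

Note that nothing here stops several requests from seeing the same miss at once and all calling `build_data()` together.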


Lies, Damned Lies & Statistics

http://flickr.com/photos/matthijs/82616861/
APC vs Files
                            Writes        Reads
file + includes              17809    1856348.5
file + serialized arrays     13804     517380.2
apc                          98586     359790.3
[Chart: read/write performance for Includes, Serialized and APC; less time means better]

http://pooteeweet.org/blog/721
APC vs Files
- Serialized arrays are the fastest to write to
- APC is the slowest to write to, but the fastest to read from
- Disk files with arrays lose out on both fronts
- But then why would you ever use files ?
- Because they expire gracefully, that's why
- So, what happens when a cache entry expires ?
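For comparison, the "file + serialized arrays" row might look like the sketch below. The helper names and the write-to-temp-then-`rename()` step are assumptions, not taken from the benchmark. The rename is one possible reading of "expire gracefully": readers keep seeing the old file until the new one is complete.

```php
<?php
// Write $data to $path without ever exposing a half-written file.
function file_cache_store($path, $data)
{
    // Create the temp file in the same directory so rename() stays
    // on one filesystem.
    $tmp = tempnam(dirname($path), 'tmp');
    if ($tmp === false || file_put_contents($tmp, serialize($data)) === false) {
        return false;
    }
    // rename() replaces the old file in one step on POSIX filesystems
    return rename($tmp, $path);
}

// Read the cached value back, or return false if the file is unreadable.
function file_cache_fetch($path)
{
    $raw = @file_get_contents($path);
    return $raw === false ? false : unserialize($raw);
}

// Usage, with a made-up path and value
$path = sys_get_temp_dir() . '/cache_that_demo.ser';
file_cache_store($path, array('z' => 'x'));
print_r(file_cache_fetch($path));
unlink($path);
```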
Tune your Code - II
- Use APC user cache
- WPcache2, smarty_cache_apc
- Re-order your nested lookups
- NO: $a = apc_fetch("foo"); echo $a["baz"];
- YES: $a = apc_fetch("foo.baz"); echo $a;
- Avoid data race conditions
- shouldn't I fix the race conditions ?
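A sketch of the re-ordered lookup above: rather than caching one large array under `foo` and fetching all of it to read a single element, store each element under its own dotted key. The helper name `store_flat()` and the sample data are made up; the code assumes the APC extension is loaded.

```php
<?php
// Store every element of $values under "<prefix>.<key>" so that a reader
// can fetch just the element it needs.
function store_flat($prefix, array $values, $ttl = 0)
{
    foreach ($values as $key => $value) {
        apc_store($prefix . '.' . $key, $value, $ttl);
    }
}

// Usage, with made-up data:
//   store_flat('foo', array('baz' => 'hello', 'bar' => 'world'));
//   echo apc_fetch('foo.baz');   // fetches one small value, not all of foo
```

The trade-off is more cache entries, which matters if the number of entries or fragmentation is already a concern.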
Anatomy of a Cache Slam
What works (or has worked)
- Give it more (virtual) memory to burn
- Use APC user cache for infrequently updated data
- Update your APC cache out of band
- Cron & Curl are APC's friends
- Always check your APC info for
- cache full count
- fragmentation
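A minimal sketch of the out-of-band update, assuming a hypothetical `refresh.php` served by the same web server whose cache it fills. A command-line PHP process typically has its own shared memory, so a cron job cannot write to the web server's APC cache directly; it asks the web server to do so via curl. The URL, schedule, key and `build_data()` stand-in are all assumptions, and the code needs the APC extension loaded.

```php
<?php
// refresh.php (hypothetical). Example crontab entry, also hypothetical:
//   */5 * * * * curl -s http://localhost/refresh.php > /dev/null

// Stand-in for the expensive work that produces the cached data.
function build_data()
{
    return array('built_at' => time());
}

// Rebuild first, then overwrite the entry in one call. With no TTL the
// entry never expires on its own, so page requests keep reading the
// previous copy until the new one is stored.
function refresh_cache($key)
{
    return apc_store($key, build_data());
}

// Only refresh when called from the local machine.
if (isset($_SERVER['REMOTE_ADDR']) && $_SERVER['REMOTE_ADDR'] === '127.0.0.1') {
    echo refresh_cache('data') ? "refreshed\n" : "failed\n";
}
```

Because page requests never rebuild the entry themselves, an expiry cannot trigger a burst of simultaneous rebuilds.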
Thank you for listening
- Resources
- http://pecl.php.net/package/APC
- http://pecl.php.net/package/memcache
- Slides
- http://t3.dotgnu.info/slides/oscon07.pdf
- My Blog
- http://t3.dotgnu.info/blog/php/
