Caching Policies and Strategies
When data is accessed from a cache, there are two possible outcomes: cache hit
and cache miss. A cache hit occurs when the requested data is found in the cache,
allowing for fast retrieval without accessing the slower main memory or external
resources. On the other hand, a cache miss happens when the requested data is
not present in the cache, requiring the system to fetch the data from the main
memory or external storage. Cache hit rates measure the effectiveness of the
cache in serving requests without needing to access slower external storage, while
cache miss rates indicate how often the cache fails to serve requested data.
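To make these outcomes concrete, here is a minimal sketch in Python of a lookup path that counts hits and misses and reports the hit rate; the 50 ms backing-store latency is an arbitrary assumption for illustration:

```python
import time

cache = {}          # in-memory cache: key -> value
hits = misses = 0   # counters used to compute the hit rate

def fetch_from_source(key):
    """Stand-in for a slow backing store (assumed ~50 ms latency)."""
    time.sleep(0.05)
    return f"value-for-{key}"

def get(key):
    global hits, misses
    if key in cache:               # cache hit: fast path
        hits += 1
        return cache[key]
    misses += 1                    # cache miss: fall back to the slow source
    value = fetch_from_source(key)
    cache[key] = value             # populate the cache for future requests
    return value

for k in ["a", "b", "a", "a", "c", "b"]:
    get(k)

print(f"hit rate = {hits / (hits + misses):.0%}")   # 50% for this access pattern
```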
This chapter will cover important information to help you understand how to use
data caching effectively. We'll talk about cache eviction policies, which are rules
for deciding when to remove data from the cache to make retrieval of important
data faster. We’ll also cover cache invalidation, which ensures the cached data is
always correct and matches the real underlying data source. The chapter will also
discuss caching strategies for both read-intensive and write-intensive applications. We'll also
cover how to actually put caching into action, including where to put the caches to
get the best results. You’ll also learn about different ways caching works and why
Content Delivery Networks (CDNs) are important. And finally, you’ll learn about
two popular open-source caching solutions. So, let’s get started with the benefits
of caching.
Caching Benefits
Caches play a crucial role in improving system performance and reducing latency
for several reasons:
Faster Access
Caches help reduce latency by reducing the need to access slower stor-
age resources. By serving data from a cache hit, the system avoids the
delay associated with fetching data from main memory or external
sources, thereby reducing overall latency.
Bandwidth Optimization
Improved Throughput
Amdahl’s Law and the Pareto distribution provide further insights into the benefits
of caching:
Amdahl’s Law
Pareto Distribution
The Pareto distribution, also known as the 80/20 rule, states that a sig-
nificant portion of the system’s workload is driven by a small fraction of
the data. Caching aligns well with this distribution by allowing the fre-
quently accessed data to reside in a fast cache, serving the most critical
operations efficiently. By focusing caching efforts on the most accessed
data, the Pareto distribution can be leveraged to optimize performance for the most important workloads.
In summary, caches provide faster access to frequently accessed data, reducing la-
tency and improving overall system performance. They help optimize bandwidth,
increase throughput, and align with principles such as Amdahl’s Law and the
Pareto distribution to maximize performance benefits.
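As a rough, back-of-the-envelope illustration of why hit rates matter, the expected latency of a request is a weighted average of the cache and backing-store latencies; the millisecond figures below are assumptions, not measurements:

```python
def average_latency(hit_rate, cache_latency_ms, source_latency_ms):
    """Expected per-request latency for a given cache hit rate."""
    return hit_rate * cache_latency_ms + (1 - hit_rate) * source_latency_ms

# Assume 1 ms for a cache hit and 50 ms for the backing data source.
for hit_rate in (0.0, 0.80, 0.95):
    print(f"hit rate {hit_rate:.0%}: ~{average_latency(hit_rate, 1, 50):.1f} ms per request")
```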
The next section will cover different cache eviction policies, such as least recently used (LRU) and least frequently used (LFU), to help you choose the best caching policy for different situations.
Cache Eviction Policies
Belady’s Algorithm
Belady’s algorithm is an optimal caching algorithm that evicts the data item that
will be used furthest in the future. It requires knowledge of the future access pat-
tern, which is usually impractical to obtain. Belady’s algorithm serves as a theoret-
ical benchmark for evaluating the performance of other caching policies.
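Because it needs the entire future access trace, Belady's algorithm can only be simulated offline; a minimal simulation sketch (assuming the whole trace is known up front) might look like this:

```python
def belady_hits(trace, capacity):
    """Count cache hits under Belady's optimal policy for a known trace."""
    cache, hits = set(), 0
    for i, key in enumerate(trace):
        if key in cache:
            hits += 1
            continue
        if len(cache) >= capacity:
            future = trace[i + 1:]
            # Evict the cached item whose next use is furthest away
            # (or that is never used again).
            def next_use(item):
                return future.index(item) if item in future else float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(key)
    return hits

print(belady_hits(["a", "b", "c", "a", "b", "d", "a"], capacity=2))   # 2 hits
```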
Queue-Based Policies
FIFO (First-In-First-Out)
FIFO is a simple caching policy that evicts the oldest data item
from the cache. It follows the principle that the first data item
inserted into the cache is the first one to be evicted when the
cache is full. FIFO is easy to implement but may suffer from
the “aging” problem, where recently accessed items are
evicted prematurely.
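A minimal FIFO sketch, using a queue to remember insertion order (the capacity of 2 in the usage example is arbitrary):

```python
from collections import deque

class FIFOCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.order = deque()              # insertion order only; reads don't change it

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            oldest = self.order.popleft() # evict the first item ever inserted
            del self.data[oldest]
        if key not in self.data:
            self.order.append(key)
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

cache = FIFOCache(capacity=2)
cache.put("a", 1); cache.put("b", 2); cache.put("c", 3)
print(cache.get("a"))   # None: "a" was inserted first, so it was evicted
```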
LIFO (Last-In-First-Out)
MRU (Most Recently Used)
MRU evicts the most recently accessed data item from the
cache. It assumes that the most recently accessed item is likely
to be accessed again soon. MRU can be useful in scenarios
where a small subset of items is accessed frequently.
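A compact sketch of recency-based eviction, covering both MRU (described above) and LRU (mentioned earlier in this chapter), built on an ordered dictionary; the capacity is an arbitrary choice for the example:

```python
from collections import OrderedDict

class RecencyCache:
    """Evicts the most or least recently used item depending on `policy`."""

    def __init__(self, capacity, policy="MRU"):
        self.capacity = capacity
        self.policy = policy
        self.items = OrderedDict()        # ordered from least to most recently used

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)       # mark the key as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.capacity:
            # MRU evicts from the "recent" end, LRU from the "old" end.
            self.items.popitem(last=(self.policy == "MRU"))
        self.items[key] = value

mru = RecencyCache(capacity=2, policy="MRU")
mru.put("a", 1); mru.put("b", 2); mru.put("c", 3)
print(list(mru.items))   # ['a', 'c']: "b", the most recently used item, was evicted
```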
Frequency-Based Policies
Frequency-based cache eviction policies prioritize retaining items in the
cache based on how often they are accessed. The cache replaces items
that have been accessed the least frequently, assuming that rarely ac-
cessed data may not be as critical for performance optimization.
LFU (Least Frequently Used)
LFU evicts the least frequently accessed data item from the
cache. It assumes that items with lower access frequency are
less likely to be accessed in the future. LFU requires maintain-
ing access frequency counts for each item, which can be mem-
ory-intensive.
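A minimal LFU sketch that keeps a per-key access counter, which also illustrates the memory overhead mentioned above:

```python
from collections import Counter

class LFUCache:
    """Evicts the key with the lowest access count when the cache is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.counts = Counter()           # per-key access frequency (extra memory)

    def get(self, key):
        if key not in self.data:
            return None
        self.counts[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.data, key=self.counts.__getitem__)
            del self.data[victim]
            del self.counts[victim]
        self.data[key] = value
        self.counts[key] += 1

cache = LFUCache(capacity=2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a"); cache.get("a")            # "a" becomes the hot key
cache.put("c", 3)                         # "b", the least frequently used, is evicted
print(sorted(cache.data))                 # ['a', 'c']
```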
Allowlist Policy
An allowlist policy for cache replacement is a mechanism that defines a set of pri-
oritized items eligible for retention in a cache when space is limited. Instead of us-
ing a traditional cache eviction policy that removes the least recently used or least
frequently accessed items, an allowlist policy focuses on explicitly specifying
which items should be preserved in the cache. This policy ensures that important
or high-priority data remains available in the cache, even during periods of cache
pressure. By allowing specific items to remain in the cache while evicting others,
the allowlist policy optimizes cache utilization and improves performance for criti-
cal data access scenarios.
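One possible sketch of this idea is an LRU-style cache that simply skips allowlisted keys when choosing a victim; the key names and capacity below are hypothetical:

```python
from collections import OrderedDict

class AllowlistCache:
    """LRU-style cache that never evicts keys on the allowlist."""

    def __init__(self, capacity, allowlist):
        self.capacity = capacity
        self.allowlist = set(allowlist)
        self.items = OrderedDict()        # least recently used keys come first

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.capacity:
            # Evict the least recently used key that is NOT allowlisted.
            # (If every key were allowlisted, this sketch would simply grow.)
            for candidate in self.items:
                if candidate not in self.allowlist:
                    del self.items[candidate]
                    break
        self.items[key] = value

cache = AllowlistCache(capacity=2, allowlist={"config:site"})
cache.put("config:site", {"theme": "dark"})
cache.put("page:home", "<html>...</html>")
cache.put("page:about", "<html>...</html>")   # evicts page:home, never config:site
print(list(cache.items))                      # ['config:site', 'page:about']
```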
Caching policies serve different purposes and exhibit varying performance charac-
teristics based on the access patterns and workload of the system. Choosing the
right caching policy depends on the specific requirements and characteristics of
the application.
By understanding and implementing these caching policies effectively, system designers and developers can optimize cache utilization, improve data retrieval performance, and enhance the overall user experience. Let's discuss different cache invalidation strategies, which are applied after identifying which data to evict based on the above cache eviction policies.
Cache Invalidation
Cache invalidation is a crucial aspect of cache management that ensures the
cached data remains consistent with the underlying data source. Effective cache
invalidation strategies help maintain data integrity and prevent stale or outdated
data from being served. Here are four common cache invalidation techniques: active invalidation, invalidating on modification, invalidating on read, and time-to-live (TTL).
Active Invalidation
Invalidating on Modification
Invalidating on Read
In invalidating on read, the cache is invalidated when the cached data is
accessed or read. Upon receiving a read request, the cache checks if the
data is still valid or has expired. If the data is expired or flagged as in-
valid, the cache fetches the latest data from the data source and updates
the cache before serving the request. This approach guarantees that
fresh data is always served, but it adds overhead to each read operation
since the cache must validate the data’s freshness before responding.
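A minimal sketch of that freshness check, using a per-entry timestamp; the 30-second lifetime and the loader function are illustrative assumptions:

```python
import time

CACHE = {}                  # key -> (value, cached_at)
MAX_AGE_SECONDS = 30        # assumed freshness window

def read(key, load_from_source):
    entry = CACHE.get(key)
    if entry is not None:
        value, cached_at = entry
        if time.time() - cached_at < MAX_AGE_SECONDS:
            return value                    # still considered fresh
        del CACHE[key]                      # expired: invalidate on read
    value = load_from_source(key)           # fetch the latest data
    CACHE[key] = (value, time.time())       # update the cache before serving
    return value

print(read("user:42", lambda key: {"id": 42, "name": "Ada"}))
```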
Time-to-Live (TTL)
The choice of cache invalidation strategy depends on factors such as the nature of
the data, the frequency of updates, the performance requirements, and the de-
sired consistency guarantees. Active invalidation offers precise control but re-
quires active management, invalidating on modification ensures immediate data
freshness, invalidating on read guarantees fresh data on every read operation, and
TTL-based invalidation provides a time-based expiration mechanism.
Understanding the characteristics of the data and the system’s requirements helps
in selecting the appropriate cache invalidation strategy to maintain data consis-
tency and improve overall performance.
The next section covers different caching strategies to ensure that data remains consistent between the cache and the underlying data source.
Caching Strategies
Caching strategies define how data is managed and synchronized between the
cache and the underlying data source. In this section, we will explore several
caching strategies as shown in Figure 4-1, including cache-aside, read-through, refresh-ahead, write-through, write-around, and write-back.
The left-hand side of the diagram displays read-intensive caching strategies, fo-
cusing on optimizing the retrieval of data that is frequently read or accessed. The
goal of a read-intensive caching strategy is to minimize the latency and improve
the overall performance of read operations by serving the cached data directly
from memory, which is much faster than fetching it from a slower and more dis-
tant data source. This strategy is particularly beneficial for applications where the
majority of operations involve reading data rather than updating or writing data.
Cache-Aside
Read-Through
The read-through caching strategy retrieves data from the cache if available; otherwise, it fetches the data from the underlying data source. When a
cache miss occurs for a read operation, the cache retrieves the data from
the data source, stores it in the cache for future use, and returns the
data to the caller. Subsequent read requests for the same data can be
served directly from the cache, improving the overall read performance.
Unlike the cache-aside strategy, this approach offloads the responsibility of managing cache lookups from the application, providing a simplified data retrieval process.
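A hedged sketch of a read-through wrapper, in which the cache layer itself owns the loader so the application never handles misses directly; the loader function here is a stand-in for a real database query:

```python
class ReadThroughCache:
    """The cache owns the loader, so callers never talk to the data source."""

    def __init__(self, loader):
        self.loader = loader               # e.g. a function that queries the database
        self.store = {}

    def get(self, key):
        if key not in self.store:                # cache miss
            self.store[key] = self.loader(key)   # the cache fetches and stores the value
        return self.store[key]                   # hit, or the freshly loaded value

def load_product(product_id):
    return {"id": product_id, "name": f"product-{product_id}"}   # stand-in query

catalog = ReadThroughCache(loader=load_product)
print(catalog.get("p-100"))   # miss: loaded through the cache, then stored
print(catalog.get("p-100"))   # hit: served directly from the cache
```

In a cache-aside design, by contrast, the miss-handling logic would live in the application code rather than inside the cache layer.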
Refresh-Ahead
The right-hand side of the diagram displays the write-intensive strategies, focusing on optimizing the storage and management of data that is frequently
updated or written. Unlike read-intensive caching, where the focus is on optimiz-
ing data retrieval, a write-intensive caching strategy aims to enhance the effi-
ciency of data updates and writes, while still maintaining acceptable performance
levels. In a write-intensive caching strategy, the cache is designed to handle fre-
quent write operations, ensuring that updated data is stored temporarily in the
cache before being eventually synchronized with the underlying data source, such
as a database or a remote server. This approach can help reduce the load on the
primary data store and improve the application’s responsiveness by acknowledging
write operations more quickly.
Write-Through
Write-Back
Each caching strategy has its own advantages and considerations, and the selec-
tion of an appropriate strategy depends on the specific requirements and charac-
teristics of the system.
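Before moving on, here is a hedged sketch contrasting the write-through and write-back strategies named above: write-through persists to the data store on every write, while write-back acknowledges writes immediately and synchronizes later. The explicit flush() call is purely illustrative; real write-back caches typically flush on timers, eviction, or batch thresholds:

```python
class WriteThroughCache:
    def __init__(self, store):
        self.cache, self.store = {}, store

    def put(self, key, value):
        self.cache[key] = value
        self.store[key] = value            # written to the data store immediately

class WriteBackCache:
    def __init__(self, store):
        self.cache, self.store, self.dirty = {}, store, set()

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)                # acknowledged now, persisted later

    def flush(self):
        for key in self.dirty:             # synchronize deferred writes
            self.store[key] = self.cache[key]
        self.dirty.clear()

database = {}
orders = WriteBackCache(database)
orders.put("order:1", "pending")
print(database)    # {}: the write exists only in the cache so far
orders.flush()
print(database)    # {'order:1': 'pending'}
```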
Caching Deployment
When deploying a cache, various deployment options are available depending on the specific requirements and architecture of the system. Here are three common
cache deployment approaches: in-process caching, inter-process caching, and re-
mote caching.
In-Process Caching
In in-process caching, the cache resides within the same process or ap-
plication as the requesting component. The cache is typically imple-
mented as an in-memory data store and is directly accessible by the ap-
plication or service. In-process caching provides fast data access and low
latency since the cache is located within the same process, enabling di-
rect access to the cached data. This deployment approach is suitable for
scenarios where data sharing and caching requirements are limited to a
single application or process.
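A common in-process example in Python is a memoized function: functools.lru_cache keeps results inside the application's own memory, so lookups never leave the process. The function and its inputs below are hypothetical:

```python
from functools import lru_cache

@lru_cache(maxsize=1024)                  # cache lives inside this process's memory
def price_with_tax(price_cents, tax_rate_bps):
    # Stand-in for an expensive computation or lookup.
    return price_cents + price_cents * tax_rate_bps // 10_000

print(price_with_tax(1999, 825))          # computed on the first call
print(price_with_tax(1999, 825))          # served from the in-process cache
print(price_with_tax.cache_info())        # reports hits=1, misses=1
```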
Inter-Process Caching
Remote Caching
The choice of cache deployment depends on factors such as the scale of the sys-
tem, performance requirements, data sharing needs, and architectural considera-
tions. In-process caching offers low latency and direct access to data within a sin-
gle process, inter-process caching enables sharing and caching data across multi-
ple applications or processes, and remote caching provides distributed caching ca-
pabilities across multiple machines or locations. Understanding the specific re-
quirements and characteristics of the system helps in selecting the appropriate
cache deployment strategy to optimize performance and resource utilization. Let’s
cover different caching mechanisms to improve application performance in the
next section.
Caching Mechanisms
In this section, we will explore different caching mechanisms, including client-side
caching, CDN caching, web server caching, application caching, database caching,
query-level caching, and object-level caching.
Client-side Caching
CDN Caching
Application Caching
Database Caching
Query-Level Caching
Object-Level Caching
Object-level caching caches individual data objects or records retrieved from the database. This mechanism is useful when
accessing specific objects frequently or when the database is
relatively static. Object-level caching reduces the need for fre-
quent database queries, improving overall application
performance.
Out of the above mechanisms, Content Delivery Networks (CDNs) play a crucial role in improving the performance and availability of web content for end users: by caching content at edge locations, they reduce latency and enhance scalability. Let's cover CDNs in detail in the next section.
Content Delivery Networks (CDNs)
CDNs can be categorized into two main types: push and pull CDNs.
Push CDN
In a push CDN, content is pre-cached and distributed to edge servers in advance. The CDN provider proactively pushes content to edge locations
based on predicted demand or predetermined rules. With push CDNs,
content is only uploaded when it is new or changed, reducing traffic
while maximizing storage efficiency. This approach ensures faster con-
tent delivery as the content is readily available at the edge servers when
requested by end-users.
Push CDNs are suitable for websites with low traffic or content that
doesn’t require frequent updates. Instead of regularly pulling content
from the server, it is uploaded to the CDNs once and remains there until
changes occur.
Pull CDN
In a pull CDN, content is cached on-demand. The CDN servers pull the
content from the origin server when the first user requests it. The con-
tent is then cached at the edge servers for subsequent requests, opti-
mizing delivery for future users. The duration for which content is
cached is determined by a time-to-live (TTL) setting. Pull CDNs minimize
storage space on the CDN, but there can be redundant traffic if files are
pulled before their expiration, resulting in unnecessary data transfer.
Pull CDNs are well-suited for websites with high traffic since recently-
requested content remains on the CDN, evenly distributing the load.
Content Fragmentation
Breaking down dynamic content into smaller fragments to enable partial caching and efficient updates.
Content Personalization
DNS Redirection
Client Multiplexing
Ensuring content consistency across multiple edge servers within a CDN is crucial
to deliver the most up-to-date and accurate content. CDNs employ various meth-
ods to maintain content consistency, including:
Periodic Polling
CDNs periodically poll the origin server to check for updates or changes
in content. This ensures that cached content is refreshed to reflect the
latest version.
Time-to-Live (TTL)
Leases
Note
AWS offers Amazon CloudFront, a pull CDN offering built for high performance, security, and developer convenience, which we will cover in more detail in Chapter 9 - AWS Network Services.
Before ending the section, let's also understand that using a CDN can come with certain drawbacks related to cost, stale content, and URL changes. CDNs may involve significant costs depending on the amount of traffic. However, it's important to consider these costs in comparison to the expenses you would incur without utilizing a CDN. If content is updated before the TTL expires, there is a possibility of stale content being served until it is refreshed on the CDN. Finally, CDNs require modifying URLs for static content to point to the CDN, which can be an additional task to manage.
Overall, CDNs offer benefits in terms of performance and scalability but require
careful consideration of these factors and the specific needs of your website. At
the end of this chapter, let’s dive deeper into two popular open-source caching so-
lutions to understand their architecture and how they implement the caching con-
cepts discussed in the chapter.
Open Source Caching Solutions
Open source caching solutions, such as Redis and Memcached, have gained popu-
larity due to their efficiency, scalability, and ease of use. Let’s take a closer look at
Memcached and Redis, two widely adopted open-source caching solutions.
Memcached
Memcached is an open-source, high-performance caching solution widely used in
web applications. It operates as a distributed memory object caching system, stor-
ing data in memory across multiple servers. Here are some key features and bene-
fits of Memcached:
Horizontal Scalability
Protocol Compatibility
Memcached Architecture
Memcached follows a simple client-server architecture in which each server stores and retrieves cached data independently; servers do not communicate with each other, and the client library is responsible for distributing keys across them. When a client sends a request to store or retrieve data, the server handles the request and interacts with the underlying memory allocation strategy.
When a new item is added to the cache, Memcached determines the appropriate
slab size for the item based on its size. If an existing slab with enough free space is
available, the item is stored in that slab. Otherwise, Memcached allocates a new
slab from the available memory pool and adds the item to that slab. The slab allo-
cation strategy enables efficient memory utilization and allows Memcached to
store a large number of items in memory while maintaining optimal performance.
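As a rough illustration of slab selection (the chunk sizes below are made-up values, not Memcached's actual slab-class growth factors):

```python
# Hypothetical slab classes: each class stores items up to a fixed chunk size.
SLAB_CLASSES = [64, 128, 256, 512, 1024]   # bytes

def pick_slab_class(item_size):
    """Return the smallest slab class whose chunk size fits the item."""
    for chunk_size in SLAB_CLASSES:
        if item_size <= chunk_size:
            return chunk_size
    raise ValueError("item too large for any slab class")

print(pick_slab_class(90))    # 128: stored in a 128-byte chunk
print(pick_slab_class(700))   # 1024
```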
Redis
Redis, short for Remote Dictionary Server, is a server-based in-memory data struc-
ture store that can serve as a high-performance cache. Unlike traditional databases
that rely on iterating, sorting, and ordering rows, Redis organizes data in custom-
izable data structures from the ground up, supporting a wide range of data types,
including strings, bitmaps, bitfields, lists, sets, hashes, geospatial indexes, HyperLogLogs, and more, making it versatile for various caching use cases. Here are some key features
and benefits of Redis:
High Performance
Persistence Options
Redis architecture is designed for high performance, low latency, and simplicity. It
provides a range of deployment options for ensuring high availability based on the
requirements and cost constraints. Let's go over availability in Redis deployments in detail, followed by the persistence models for Redis durability and memory management in Redis.
Redis Sentinel
Redis Cluster
Redis provides two persistence models for data durability: RDB files (Redis
Database Files) and AOF (Append-Only File). These persistence mechanisms en-
sure that data is not lost in case of system restarts or crashes. Let’s explore both
models in more detail:
Snapshot-based Persistence
RDB files are highly efficient in terms of disk space usage and data load-
ing speed. They are compact and can be loaded back into Redis quickly,
making it suitable for scenarios where fast recovery is essential.
RDB files provide full data recovery as they contain the entire dataset. In
case of system failures, Redis can restore the data by loading the most
recent RDB file available.
However, it’s worth noting that RDB files have some limitations. Since they are
snapshots, they do not provide real-time durability and may result in data loss if a
crash occurs between two snapshot points. Additionally, restoring large RDB files
can take time and impact the system’s performance during the recovery process.
AOF persistence is an alternative persistence model in Redis that logs every write
operation to an append-only file. AOF captures a sequential log of write opera-
tions, enabling Redis to reconstruct the dataset by replaying the log. Here are key
features and considerations of AOF persistence:
Write-ahead Log
Append-only Nature
AOF appends new write operations to the end of the file, ensuring that
the original dataset is never modified. This approach protects against
data corruption caused by crashes or power failures.
However, AOF persistence comes with its own considerations. The append-only
file can grow larger over time, potentially occupying significant disk space. Redis
offers options for AOF file rewriting to compact the log and reduce its size.
Additionally, AOF persistence typically has a slightly higher performance overhead
compared to RDB files due to the need to write every command to disk.
In practice, Redis users often employ a combination of RDB and AOF persistence
based on their specific requirements and trade-offs between performance, durabil-
ity, and recovery time objectives.
It’s important to note that Redis also provides an option to use no persistence
(volatile mode) if durability is not a primary concern or if data can be regenerated
from an external source in the event of a restart or crash.
Redis leverages forking and copy-on-write (COW) techniques to facilitate data per-
sistence efficiently within its single-threaded architecture. When Redis performs a
snapshot (RDB) or background saving operation, it follows these steps:
1. Forking: Redis uses the fork() system call to create a child process, which is an identical copy of the parent process. Forking is a lightweight operation as it creates a copy-on-write clone of the parent's memory.
2. Copy-on-Write (COW): Initially, the child process shares the same memory pages with the parent process. However, when either the parent or child process modifies a memory page, COW comes into play. Instead of immediately duplicating the modified page, the operating system creates a new copy only when necessary.
Memory Efficiency
When the child process is initially created, it shares the same memory
pages with the parent process. This shared memory approach consumes
minimal additional memory. Only the modified pages are copied when
necessary, saving memory resources.
Performance
Since only the modified pages are duplicated, Redis can take advantage
of the COW mechanism to perform persistence operations without incur-
ring a significant performance overhead. This is particularly beneficial for
large datasets where copying the entire dataset for persistence would be
time-consuming.
Fork Safety
Redis uses fork-based persistence to avoid blocking the main event loop
during the snapshot process. By forking a child process, the parent
process can continue serving client requests while the child process per-
forms the persistence operation independently. This ensures high re-
sponsiveness and uninterrupted service.
It’s important to note that while forking and COW provide memory efficiency and
performance benefits, they also have considerations. Forking can result in in-
creased memory usage during the copy-on-write process if many modified pages
need to be duplicated. Additionally, the fork operation may be slower on systems
with large memory footprints.
Overall, Redis effectively utilizes forking and copy-on-write mechanisms within its single-threaded architecture to achieve efficient data persistence. By employing
these techniques, Redis can perform snapshots and background saving operations
without significantly impacting its performance or memory usage.
Overall, Redis offers developers a powerful and flexible data storage solution with
various deployment options and capabilities.
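As a concrete illustration, here is a minimal cache-aside snippet using the redis-py client; it assumes a Redis server reachable at localhost:6379, an arbitrary 60-second TTL, and a hypothetical get_user lookup:

```python
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)                       # cache lookup
    if cached is not None:
        return json.loads(cached)             # cache hit
    user = {"id": user_id, "name": "Ada"}     # stand-in for a real database query
    r.setex(key, 60, json.dumps(user))        # cache the result with a 60 s TTL
    return user

print(get_user(42))
```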
Both Redis and Memcached are excellent open-source caching solutions with their
unique strengths. The choice between them depends on specific requirements and
use cases. Redis is suitable for scenarios requiring versatile data structures, persis-
tence, pub/sub messaging, and advanced caching features. On the other hand,
Memcached shines in simple, lightweight caching use cases that prioritize scalabil-
ity and ease of integration.
Note
AWS offers Amazon ElastiCache, compatible with both Redis and Memcached, for
real-time use cases like caching, session stores, gaming, geo-spatial services, real-
time analytics, and queuing, which we will cover in more detail in Chapter 10 -
AWS Storage Services.
Conclusion
In concluding this chapter on caching, we have journeyed through a comprehen-
sive exploration of the fundamental concepts and strategies that empower effi-
cient data caching. We’ve covered cache eviction policies, cache invalidation mech-
anisms, and a plethora of caching strategies, equipping you with the knowledge to
optimize data access and storage. We’ve delved into caching deployment, under-
standing how strategic placement can maximize impact, and explored the diverse
caching mechanisms available. Additionally, we’ve touched upon Content Delivery
Networks (CDNs) and open-source caching solutions, including Redis and Memcached, which offer robust options for enhancing performance. By incorporat-
ing Redis or Memcached into your architecture, you can significantly improve ap-
plication performance, reduce response times, and enhance the overall user expe-
rience by leveraging the power of in-memory caching.
As we move forward in our exploration of enhancing system performance, the next chapter will cover scaling and load balancing strategies.
Scaling is a pivotal aspect of modern computing, allowing systems to handle in-
creased loads gracefully. We will also delve into strategies for load balancing in
distributing incoming traffic efficiently. Together, these topics will empower you
to design and maintain high-performing systems that can handle the demands of
today’s dynamic digital landscape.