Fail-Safe Starvation-Free Durable Priority Queues in Redis
Jesse H. Willett
jhw@prosperworks.com
https://github.com/jhwillett
ProsperWorks is a multi-tenant
CRM-as-a-Service.
ProsperWorks was built with three
basic principles in mind:
● Keep it simple.
● Show what matters.
● Make it actionable.
Who are We?
We help businesses sell more with a CRM teams actually love to use.
I am a server architect focused on storage, scaling, and asynchronous workloads.
I have worked at scale on public-facing live services built on many stacks:
● ProsperWorks: Postgres/Citus+Redis+Elasticsearch Ruby on Rails
● Lyft: MongoDB+Redis Doctrine/PHP
● Zynga: Memcache+Membase PHP
I have also worked on image processing grids, feature phone games, text search
engines, PC strategy games, and desktop publishing suites.
All of these systems had queues. Queues naturally manage the impedance
mismatch between systems with different time or cost signatures.
Who am I?
Presenting Ick, a Redis-based priority queue which we have used in our
Postgres-to-Elasticsearch pipeline since Q3 2015.
Ick extends Redis Sorted Sets with 175 LoC of Lua. The combination neatly
solves many problems in asynchronous job processing.
“Ick” was my gut reaction to the idea of closing a race condition by deploying Lua
to Redis. Once successful, we adopted the backronym “Ick == Indexing QUeue”.
Ick is available via Ruby bindings in the gem redis-ick under the MIT License.
So far only ProsperWorks uses redis-ick, and I am the only maintainer.
What is This?
● Redis Reliable Queue Pattern
○ Does not support deduplication or reordering.
○ Ick is RPOPLPUSH for Sorted Sets with a custom score update mode.
● Redis Streams
○ Does not support deduplication or reordering.
○ Still, we might have used Streams if they had been available in 2015.
● Apache Kafka
○ Log compaction could serve our deduplication needs, no reordering.
○ Too costly to own or rent for a small team in 2015, yet another storage service.
● Amazon Kinesis
○ Does not support deduplication or reordering.
○ Cost effective, yet another storage service.
Ick Comparables
● Our primary store is Postgres with a normalized entity-relationship model.
● Elasticsearch hosts search over a de-normalized form of our entities.
○ ES provides scale and advanced search features.
○ Mapping from PG to ES is coupled to our business logic, lives best in our code.
● Challenges keeping ES up-to-date with live changes in PG.
○ High-frequency fast PG updates from our web layer and from asynchronous jobs.
○ Low-frequency slow ES Bulk API calls.
○ A few seconds of latency in the PG ⇒ ES pipeline is acceptable.
○ UX degrades with minutes of latency. Hours of latency is unacceptable.
● A Natural Pattern:
○ When the app writes to PG, also put ids of dirty entities in a Redis queue.
○ In some cases, we also search out dirty entities in PG directly.
○ A background consumer process takes batches of dirty ids and updates ES in bulk.
Problem Space
# in producer
redis.rpush(key, msg)
# in consumer
batch = batch_size.times.map { redis.lpop(key) }.compact # messages no longer in Redis
process_batch_slowly(batch)
● Advantages:
○ Simple
○ Many implementations: Resque works like this w/ batch_size 1
○ Scaling to many workers is straightforward.
● Disadvantages:
○ Messages lost on failure.
○ Unconstrained backlog growth when ES falls behind.
Solution 1: Basic List Pattern
Sometimes we see hot data: entities which are dirtied several times per second.
Under heavy load our ES Bulk API calls can take 5s or more.
With too much hot data, our backlog can grow without bound.
To get leverage over this problem we need to deduplicate messages.
We prefer deduplication at the queue level. We considered and rejected:
● One lock per message at enqueue time - brittle and expensive.
● Version information in the message - large decrease in solution generality.
We Really Care about Deduplication!
In Redis, this means we prefer Sorted Sets:
Sorted sets are a data type which is similar to a mix between a Set and a Hash. Like sets, sorted
sets are composed of unique, non-repeating string elements, so in some sense a sorted set is a set
as well.
However while elements inside sets are not ordered, every element in a sorted set is associated with
a floating point value, called the score [...]
Moreover, elements in a sorted sets are taken in order [by score].
Sorted Set accesses cost O(log N) versus the O(1) of Lists, but deduplicate.
Sorted Sets support FIFO-like behavior if we use timestamps as scores.
Sorted Sets for Deduplication
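Deduplication with timestamp scores can be seen in a tiny in-memory model. This is a plain Ruby Hash standing in for ZADD semantics, for illustration only; no Redis required:

```ruby
# member => score; like ZADD, a later write for the same member
# folds into the existing entry rather than adding a duplicate
zset = {}

zset["entity:7"] = 100.0   # first dirty signal
zset["entity:9"] = 101.0
zset["entity:7"] = 102.0   # re-dirtied: folded, score bumped to "now"

p zset.size                               # => 2: one entry per entity
p zset.sort_by { |_, s| s }.map(&:first)  # => ["entity:9", "entity:7"]
```

Note the side effect that motivates the rest of this deck: re-dirtying `entity:7` also moved it behind `entity:9` in score order.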
# in producer
redis.zadd(key, Time.now.to_f, msg)                             # Time.now for score ==> FIFO-like
# in consumer
batch = redis.zrange(key, 0, batch_size - 1, with_scores: true) # critical section start
process_batch_slowly(batch)
redis.zrem(key, *batch.map(&:first))                            # critical section end
● Advantages:
○ Messages preserved across failure.
○ De-duplication aka write-folding constrains backlog growth.
○ 1 + 2/batch_size Redis ops per message, down from 2 ops/message.
● Disadvantages:
○ Race condition between zadd and process_batch_slowly can lead to dropped messages.
○ Hot data can starve if continually re-added with a higher score.
Solution 2: Basic Sorted Set Pattern
# in producer
redis.zadd(key, Time.now.to_f, msg)                              # variadic ZADD is an option
# in consumer
batch = redis.zrange(key, 0, batch_size - 1, with_scores: true)
process_batch_slowly(batch)
batch2 = redis.zrange(key, 0, batch_size - 1, with_scores: true) # critical section start
unchanged = batch & batch2                                       # drop msgs whose scores have changed
redis.zrem(key, *unchanged.map(&:first))                         # critical section end
● Advantages:
○ Critical section is smaller.
○ Critical section is not exposed to process_batch_slowly.
○ Messages only dropped from Redis after success (i.e. ZREM as ACK)
● Disadvantages:
○ Extra Redis op per cycle.
○ Hot data can starve if continually re-added with a higher score.
Solution 3: Improved Sorted Set Pattern
The Sorted Set solutions have a critical section where dirty signals can be lost,
and also a more subtle problem with hot data.
Hot data is continually re-added with higher scores.
During periods of intermediate load, we might carry a steady-state backlog which
is larger than a single batch size for an extended period.
When these conditions coincide, hot data may keep dancing away from the
low-score end of the Sorted Set for hours.
We call this the Hot Data Starvation Problem.
We Really Care about Hot Data!
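The mechanism can be demonstrated with a small in-memory simulation. This is pure Ruby standing in for ZADD/ZRANGE/ZREM (no Redis required): while a backlog persists, a hot member that is re-scored to "now" on every dirty never reaches the low end of the set.

```ruby
scores = {}                    # member => score, models plain ZADD semantics
consumed = []
clock = 0.0

# seed a small backlog of cold entities plus one hot entity
4.times { |i| scores["cold-#{i}"] = (clock += 1) }
scores["hot"] = (clock += 1)

next_id = 4
20.times do
  # producer: two fresh cold entities per cycle, and the hot entity is
  # dirtied again -- plain ZADD bumps its score to "now"
  2.times { scores["cold-#{next_id}"] = (clock += 1); next_id += 1 }
  scores["hot"] = (clock += 1)

  # consumer: take the two lowest-scored members (ZRANGE + ZREM)
  batch = scores.sort_by { |_, s| s }.first(2).map(&:first)
  batch.each { |m| scores.delete(m) }
  consumed.concat(batch)
end

p consumed.include?("hot")   # => false: "hot" starves behind the backlog
```

Forty cold messages were consumed while "hot" never made it to the front of the line.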
An Ick is a pair of Redis Sorted Sets: a producer set and a consumer set.
● ICKADD adds messages to the producer set.
● ICKRESERVE moves lowest-score messages from the pset to the cset, then returns the cset.
● ICKCOMMIT removes messages from the cset.
● On duplicates, ICKADD and ICKRESERVE both select the minimum score.
ICKADD [score,msg]* app ==> Redis pset
ICKRESERVE n Redis pset ==> Redis cset up to size N ==> app return batch
ICKCOMMIT msgs* Redis cset removed
Introducing Ick
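The three operations above can be sketched as a pure-Ruby, in-memory model. This hypothetical `MiniIck` class is for illustration only; the real Ick is 175 lines of Lua running inside Redis, accessed through the redis-ick gem.

```ruby
# In-memory sketch of Ick semantics: two "sorted sets" modeled as Hashes.
class MiniIck
  def initialize
    @pset = {}   # producer set: member => score
    @cset = {}   # consumer set: member => score
  end

  # ICKADD: on duplicates, keep the minimum score.
  def ickadd(*pairs)
    pairs.each_slice(2) do |score, member|
      @pset[member] = [score, @pset[member] || Float::INFINITY].min
    end
  end

  # ICKRESERVE: top the cset up to max_size with the lowest-scored pset
  # members, then return the cset sorted by score.
  def ickreserve(max_size)
    while @cset.size < max_size && !@pset.empty?
      member, score = @pset.min_by { |_, s| s }
      @pset.delete(member)
      @cset[member] = [score, @cset[member] || Float::INFINITY].min
    end
    @cset.sort_by { |_, s| s }
  end

  # ICKCOMMIT: acknowledge success by removing members from the cset.
  def ickcommit(*members)
    members.each { |m| @cset.delete(m) }
  end
end

ick = MiniIck.new
ick.ickadd(12, "a", 10, "b", 13, "c")
p ick.ickreserve(2)          # => [["b", 10], ["a", 12]]
ick.ickadd(5, "a")           # re-add "a": lands in the pset, lower score wins
ick.ickcommit("a")
p ick.ickreserve(2)          # => [["a", 5], ["b", 10]]
```

The last two calls show the dual-presence property: after being reserved, "a" was re-added to the pset, committed out of the cset, and then reserved again at its new, lower score.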
# push 'a' and 'b' into the Ick
Ick.new(redis).ickadd(key, 123, 'a', 456, 'b')  # pset [[123,'a'],[456,'b']]
# re-push 'b' with higher score, nothing changes
Ick.new(redis).ickadd(key, 789, 'b')            # pset [[123,'a'],[456,'b']] unchanged
# re-push 'b' with lower score, score changes
Ick.new(redis).ickadd(key, 100, 'b')            # pset [[100,'b'],[123,'a']] moved b to 100
ICKADD adds to the producer set. Duplicates are assigned the minimum score.
Almost ZADD XX but more predictable.
Assuming the scores of new messages trend up over time, there is no starvation:
a given message's score never increases, so every message drifts toward the
lowest score, where it is consumed.
ICKADD
# push some messages into the Ick
Ick.new(redis).ickadd(key, 12, 'a', 10, 'b', 13, 'c') # pset [[10,'b'],[12,'a'],[13,'c']]
# reserve a batch
batch = Ick.new(redis).ickreserve(key, 2) # pset [[13,'c']] removed b and a
                                          # cset [[10,'b'],[12,'a']] added b and a
                                          # batch [['b',10],['a',12]] per ZRANGE w/ scores
# a repeated ICKRESERVE just re-fetches the consumer set
batch = Ick.new(redis).ickreserve(key, 2) # pset [[13,'c']] unchanged
                                          # cset [[10,'b'],[12,'a']] unchanged
                                          # batch [['b',10],['a',12]] unchanged
ICKRESERVE fills up the consumer set by moving the lowest-score messages from
the producer set, then returns the consumer set.
This merge respects the minimum score rule.
ICKRESERVE
# push some messages into the Ick
Ick.new(redis).ickadd(key, 12, 'a', 10, 'b', 13, 'c') # pset [[10,'b'],[12,'a'],[13,'c']]
# reserve a batch
batch = Ick.new(redis).ickreserve(key, 2) # pset [[13,'c']] removed b and a
                                          # cset [[10,'b'],[12,'a']] added b and a
                                          # batch [['b',10],['a',12]] per ZRANGE w/ scores
# commit 'a' to acknowledge success
Ick.new(redis).ickcommit(key, 'a')        # pset [[13,'c']] unchanged
                                          # cset [[10,'b']] removed a
ICKCOMMIT forgets messages in the consumer set.
ICKCOMMIT
● All Ick ops are bulk operations and support multiple messages per Redis op.
● Duplicate messages are always resolved to the minimum score.
● We use current timestamps for scores.
○ The scores of new messages tend to increase.
● Even hot data does not lose its place in line.
● A message can be present in both the pset and the cset.
○ When it is re-added after being reserved.
○ Good: this reifies the critical section where PG vs ES agreement is indeterminate.
Properties of Icks
# in producer
Ick.new(redis).ickadd(key, Time.now.to_f, msg)     # supports variadic bulk ICKADD
# in consumer
batch = Ick.new(redis).ickreserve(key, batch_size)
process_batch_slowly(batch)
Ick.new(redis).ickcommit(key, *batch.map(&:first)) # critical section only in Redis tx
● Advantages:
○ Critical section is bundled up in a Redis transaction.
○ Hot data starvation solved by constraining scores to only decrease, never increase.
○ Messages only dropped from Redis after success (i.e. ICKCOMMIT as ACK)
● Disadvantages:
○ Must deploy Lua to your Redis.
○ Not inherently scalable.
Solution 4: Ick Pattern
Ick support for multiple Ick consumers was considered but rejected:
● Consumer processes would need to identify themselves somehow.
● How are messages allocated to consumers?
● How do consumers come and go?
● Will this break deduplication or other serializability guarantees?
● How can the app customize?
We scale at the app level by hashing messages over many Ick+consumer pairs.
This suffers from head-of-line blocking but keeps these hard problems in
higher-level code which we can monitor and tie to business logic more easily.
Dealing with Scale
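That app-level routing can be sketched in a few lines. The `ick_key_for` helper and CRC32 choice are hypothetical illustrations, not the redis-ick API; any stable hash of the message works, since all that matters is that duplicates of one message always land on the same Ick:

```ruby
require "zlib"

NUM_SHARDS = 8

# Route a message to one of NUM_SHARDS Ick keys, each owned by its own
# dedicated consumer process.
def ick_key_for(msg)
  "ick-shard-#{Zlib.crc32(msg) % NUM_SHARDS}"
end

# producer side (sketch): route the dirty id to its shard's Ick
#   Ick.new(redis).ickadd(ick_key_for(id), Time.now.to_f, id)

p ick_key_for("entity:42")   # always the same shard for the same id...
p ick_key_for("entity:42")   # ...so queue-level deduplication still works
```

Stable routing is what preserves deduplication across shards; a random or round-robin assignment would scatter duplicates of one message over several Icks.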
We usually use the current time for score in our Icks.
This is FIFO-like: any backlog has priority over current demand, which has
priority over future demand.
Unfortunately, resources are finite. We alert when the scores of the current batch
get older than our service level objectives.
Unfortunately, demand is bursty. For bulk operations we offset the scores by 5
seconds plus 1 second per 100 messages.
That is, as bulk operations get bulkier they also get nicer.
Advanced Ick Patterns: Hilbert’s SLA
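The niceness rule above can be captured in a small helper. The `bulk_score` name is a hypothetical illustration, assuming float-seconds timestamps for scores as elsewhere in this deck:

```ruby
# Score for a message enqueued as part of a bulk operation: offset by
# 5 seconds plus 1 second per 100 messages in the bulk, so bulkier
# operations yield more to interactive traffic.
def bulk_score(now, bulk_size)
  now + 5.0 + (bulk_size / 100.0)
end

now = 1_000_000.0
p bulk_score(now, 100)     # => 1000006.0  (5s + 1s behind "now")
p bulk_score(now, 10_000)  # => 1000105.0  (5s + 100s: much nicer)
```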
I recently added a new Ick operation which combines ICKCOMMIT of the last
batch with ICKRESERVE for the next batch:
last_batch = []
while still_going() do
  next_batch = Ick.new(redis).ickexchange(key, batch_size, *last_batch.map(&:first))
  process_batch_slowly(next_batch)
  last_batch = next_batch
end
Ick.new(redis).ickexchange(key, 0, *last_batch.map(&:first))
It is gratifying to have two-phase commit without doubling the Redis ops.
This pattern would be useful in any two-phase commit or pipeline system.
Advanced Ick Patterns: ICKEXCHANGE
I anticipate using Ick to schedule delayed jobs by using scores as “release date”.
To support this I added an option to ICKRESERVE:
# push messages and reserve an initial batch
Ick.new(redis).ickadd(key, 12, 'a', 10, 'b', 13, 'c') # pset [[10,'b'],[12,'a'],[13,'c']]
Ick.new(redis).ickreserve(key, 2)         # pset [[13,'c']] moved b and a
                                          # cset [[10,'b'],[12,'a']] moved b and a
# no commits, but a younger message is added
Ick.new(redis).ickadd(key, 7, 'x')        # pset [[7,'x'],[13,'c']] 7 sorts first
                                          # cset [[10,'b'],[12,'a']] but cset is full
# a plain reserve is wedged, but backwash unblocks it
Ick.new(redis).ickreserve(key, 2)         # pset [[7,'x'],[13,'c']] no change
                                          # cset [[10,'b'],[12,'a']] full
Ick.new(redis).ickreserve(key, 2, backwash: true) # pset [[12,'a'],[13,'c']] backwashed a and b!
                                                  # cset [[7,'x'],[10,'b']] unblocked x!
Advanced Ick Patterns: Backwash
Thank You
Jesse H. Willett
jhw@prosperworks.com
https://github.com/jhwillett
RedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in Redis