A short introduction to replication and consistency in the Cassandra distributed database. Delivered April 28th, 2010 at the Seattle Scalability Meetup.
2. Dynamo + BigTable = Cassandra
   Dynamo: cluster management, replication, fault tolerance
   BigTable: sparse, columnar data model, storage architecture
3. Dynamo-like Features
   - Symmetric, P2P architecture; no special nodes, no SPOFs
   - Gossip-based cluster management
   - Distributed hash table for data placement
   - Pluggable partitioning
   - Pluggable topology discovery
   - Pluggable placement strategies
   - Tunable, eventual consistency
4. BigTable-like Features
   - Sparse, "columnar" data model
   - Optional, 2-level maps called Super Column Families
   - SSTable disk storage:
     - Append-only commit log
     - Memtable (buffer and sort)
     - Immutable SSTable files
   - Hadoop integration
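To make the write path concrete, here is a minimal Python sketch of the commit log / memtable / SSTable flow listed above; the class and method names are illustrative, not Cassandra's internals.

```python
# Sketch of the append-only write path: commit log for durability,
# memtable for buffering and sorting, immutable SSTables on flush.
class Store:
    def __init__(self, flush_threshold=4):
        self.commit_log = []           # append-only durability log
        self.memtable = {}             # in-memory write buffer
        self.sstables = []             # immutable, sorted runs
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. append to commit log
        self.memtable[key] = value             # 2. update memtable
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # 3. write the memtable out as an immutable, sorted SSTable
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable.clear()
        self.commit_log.clear()                # log entries now durable on disk

    def read(self, key):
        if key in self.memtable:               # newest data first
            return self.memtable[key]
        for table in reversed(self.sstables):  # then newest SSTable backwards
            for k, v in table:
                if k == key:
                    return v
        return None
```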
8. Consistency Level
   How many replicas must respond to declare success? (e.g. W=2 for writes, R=2 for reads)
9. CL Options

   WRITE                                     READ
   Level    Description                      Level    Description
   ZERO     Cross fingers                    -        -
   ANY      1st response (including HH)      -        -
   ONE      1st response                     ONE      1st response (WEAK)
   QUORUM   N/2 + 1 replicas                 QUORUM   N/2 + 1 replicas (STRONG)
   ALL      All replicas                     ALL      All replicas (STRONG)
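The table reduces to simple arithmetic over the replication factor N. A hypothetical helper mirroring the levels above:

```python
# Map each consistency level to the number of replica acknowledgements
# required, given replication factor n.
def replicas_required(level, n):
    table = {
        "ZERO": 0,             # fire and forget
        "ANY": 1,              # first response; hinted handoff counts
        "ONE": 1,              # first replica response
        "QUORUM": n // 2 + 1,  # majority of replicas
        "ALL": n,              # every replica
    }
    return table[level]

assert replicas_required("QUORUM", 3) == 2
assert replicas_required("ALL", 3) == 3
```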
10. A Side Note on CL
    Consistency level is based on the replication factor (N), not on the
    number of nodes in the system.
11. A Question of Time
    [diagram: a row holding columns, each column carrying a value and a timestamp]
    - All columns have a value and a timestamp
    - Timestamps are provided by clients
    - Microsecond resolution by convention
    - Latest timestamp wins
    - Vector clocks may be introduced in 0.7
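Last-write-wins reconciliation is trivial to express; a sketch, assuming versions arrive as (value, timestamp) pairs:

```python
# "Latest timestamp wins": when replicas disagree, each column resolves
# to the version with the highest client-supplied timestamp (microseconds).
def reconcile(versions):
    # versions: list of (value, timestamp) pairs for one column
    return max(versions, key=lambda v: v[1])

print(reconcile([("old", 1272400000000000), ("new", 1272400000000123)]))
# -> ('new', 1272400000000123)
```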
12. Read Repair
    - Query all replicas on every read
    - Data is returned from one replica; checksums/timestamps from all others
    - If there is a mismatch:
      - Pull all data and merge
      - Write back to out-of-sync replicas
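A conceptual sketch of that flow; the Replica class, its get/put/checksum methods, and the use of MD5 digests here are illustrative assumptions, not Cassandra's internal API:

```python
import hashlib

class Replica:
    def __init__(self):
        self.rows = {}
    def get(self, key):
        return self.rows.get(key, (None, 0))   # (value, timestamp)
    def put(self, key, data):
        self.rows[key] = data
    def checksum(self, key):
        return hashlib.md5(repr(self.get(key)).encode()).hexdigest()

def read_with_repair(key, replicas):
    # full data from one replica, checksums from all the others
    data = replicas[0].get(key)
    digest = hashlib.md5(repr(data).encode()).hexdigest()
    mismatched = [r for r in replicas[1:] if r.checksum(key) != digest]
    if mismatched:
        # on mismatch: pull everything, merge by latest timestamp...
        candidates = [data] + [r.get(key) for r in mismatched]
        data = max(candidates, key=lambda v: v[1])
        for r in replicas:                      # ...and write back
            r.put(key, data)
    return data
```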
13. Weak vs. Strong
    - Weak consistency (reads): perform repair after returning results
    - Strong consistency (reads): perform repair before returning results
14. R + W > N
    Please imagine this inequality has huge fangs, dripping with the blood
    of innocent enterprise developers, so you can best appreciate the terror
    it inspires.
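In code form the inequality is a one-liner; the point is the overlap it guarantees:

```python
# If R + W > N, every read quorum overlaps every write quorum in at least
# one replica, so a read always sees the latest acknowledged write.
def strongly_consistent(n, w, r):
    return r + w > n

assert strongly_consistent(n=3, w=2, r=2)      # QUORUM/QUORUM: overlap
assert not strongly_consistent(n=3, w=1, r=1)  # ONE/ONE: reads may miss writes
```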
19. Tokens
    - A TOKEN is a partitioner-dependent element on the ring
    - Each NODE has a single, unique TOKEN
    - Each NODE claims a RANGE of the ring, from its TOKEN to the token of
      the previous node on the ring
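A sketch of range lookup on the ring, assuming a sorted list of (token, node) pairs:

```python
# Each node claims the range from the previous node's token (exclusive)
# up to its own token (inclusive); lookups wrap around the ring.
from bisect import bisect_left

def node_for_token(token, ring):
    # ring: list of (token, node) pairs, sorted by token
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, token)
    return ring[i % len(ring)][1]   # wrap past the last token

ring = [(25, "A"), (50, "B"), (75, "C"), (100, "D")]
print(node_for_token(60, ring))   # -> 'C' (range 51-75)
print(node_for_token(110, ring))  # -> 'A' (wraps around)
```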
20. Partitioning
    Maps from key space to token space.
    - RandomPartitioner
      - Tokens are integers in the range 0 to 2^127
      - MD5(key) -> token
      - Good: even key distribution; Bad: inefficient range queries
    - OrderPreservingPartitioner
      - Tokens are UTF-8 strings in the range '' to ∞
      - Key -> token
      - Good: efficient range queries; Bad: uneven key distribution
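Both partitioners sketched in Python; the modulo into the 127-bit range is an illustrative simplification of RandomPartitioner's actual token computation:

```python
import hashlib

# RandomPartitioner, sketched: MD5 of the key taken as an integer token
# in [0, 2^127). Even distribution, but keys lose their ordering.
def random_partitioner_token(key):
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** 127)

# OrderPreservingPartitioner simply uses the key itself as the token,
# so lexically adjacent keys land on the same or neighboring nodes.
def order_preserving_token(key):
    return key
```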
21. Snitching
    Maps from nodes to physical location.
    - EndpointSnitch: guess at rack and datacenter based on IP address octets
    - DatacenterEndpointSnitch: specify IP subnets for racks, grouped per
      datacenter
    - PropertySnitch: specify arbitrary mappings from individual IP addresses
      to racks and datacenters
    - Or write your own!
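A toy snitch in the octet-guessing spirit of EndpointSnitch; which octet means rack and which means datacenter is an assumption for illustration:

```python
# Infer topology from the IP address alone: here the 2nd octet names the
# datacenter and the 3rd names the rack (illustrative convention only).
def datacenter_for(ip):
    return ip.split(".")[1]

def rack_for(ip):
    return ip.split(".")[2]

print(datacenter_for("10.1.2.15"), rack_for("10.1.2.15"))  # -> 1 2
```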
22. Placement
    Maps from token space to nodes.
    The first replica is always placed on the node that claims the range in
    which the token falls. Strategies determine where the rest of the
    replicas are placed.
23. RackUnaware
    Place replicas on the N-1 subsequent nodes around the ring, ignoring
    topology.
    [diagram: datacenter A and datacenter B, each with rack 1 and rack 2]
24. RackAware
    Place the second replica in another datacenter, and the remaining N-2
    replicas on nodes in other racks in the same datacenter.
    [diagram: datacenter A and datacenter B, each with rack 1 and rack 2]
25. DatacenterShard
    Place M of the N replicas in another datacenter, and the remaining
    N - (M + 1) replicas on nodes in other racks in the same datacenter.
    [diagram: datacenter A and datacenter B, each with rack 1 and rack 2]
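RackUnaware is the easiest of the three to sketch: walk the ring from the primary replica's position, ignoring topology (RackAware and DatacenterShard would add rack and datacenter constraints to the same walk):

```python
# RackUnaware placement: the first replica goes to the node owning the
# token's range, the remaining N-1 to the next nodes around the ring.
def rack_unaware_replicas(primary_index, ring, n):
    return [ring[(primary_index + i) % len(ring)][1] for i in range(n)]

ring = [(25, "A"), (50, "B"), (75, "C"), (100, "D")]
print(rack_unaware_replicas(2, ring, n=3))  # -> ['C', 'D', 'A']
```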
29. References
    - Amazon Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
    - Google BigTable: http://labs.google.com/papers/bigtable.html
    - Facebook Cassandra: http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf