Beyond (No)SQL

Sarah Mei
Pivotal Labs

@sarahmei
sarah@pivotallabs.com

Fsck CS.
-DHH (paraphrased)

Photo by jasonwg (http://www.flickr.com/photos/jasonwg/1382036808)

Agenda

• The data storage landscape
• Relational model & SQL (RELATIONAL ALGEBRA)
• Evaluating data stores

Photo by TANAKA Juuyoh (http://www.flickr.com/photos/tanaka_juuyoh/3121538767/)

“NoSQL”

• Key-value: Memcache, Project Voldemort, Redis, Tokyo Cabinet
• Document: MongoDB, CouchDB, Riak
• Other: Bigtable, Cassandra, HBase

Photo from mga (http://www.flickr.com/photos/mgiraldo/420642350)

The Relational Model

Diagram by Wikipedia user AutumnSnow (http://en.wikipedia.org/wiki/File:Relational_model_concepts.png)

Sets

• No duplicates
• Unordered

More sets

• TABLE: a set of columns and a set of rows
• COLUMN: a unique name and a type
• ROW: a set of name-value pairs
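
The set-based definitions above can be sketched in a few lines of Python. This is only an illustration: the helper `row` is a hypothetical name, and the sample values are borrowed from the join example later in the deck.

```python
def row(**values):
    # A row is a set of (name, value) pairs. frozenset makes the row
    # hashable, so rows can themselves be members of a set.
    return frozenset(values.items())

# A table's rows form a set: the duplicate row collapses,
# and the order in which rows are listed carries no meaning.
foo = {
    row(bat="me", ipsum=5),
    row(bat="you", ipsum=4),
    row(bat="me", ipsum=5),  # duplicate, silently dropped
}

same_rows_other_order = {row(ipsum=4, bat="you"), row(ipsum=5, bat="me")}

print(len(foo))
print(foo == same_rows_other_order)
```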

Relational Algebra

Operations you can do on tables (“relations”)

Operations
• projection: subset of available columns
• selection: subset of available rows
• cartesian product
• set union
• set intersection
• rename
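
As a rough sketch, each operation in the list above can be written as a small function over sets of rows. The function names (`project`, `select`, `cartesian`, `rename`) and the `row` helper are assumptions made for this illustration, not part of any database API; union and intersection are simply Python's built-in set operators.

```python
from itertools import product

def row(**values):
    # A row as a hashable set of (name, value) pairs.
    return frozenset(values.items())

def project(table, names):
    # Projection: keep only the named columns of every row.
    return {frozenset((n, v) for n, v in r if n in names) for r in table}

def select(table, predicate):
    # Selection: keep only the rows for which the predicate holds.
    return {r for r in table if predicate(dict(r))}

def cartesian(left, right):
    # Cartesian product: every left row combined with every right row.
    # Assumes the two tables share no column names.
    return {l | r for l, r in product(left, right)}

def rename(table, old, new):
    # Rename: change one column name, leaving the values untouched.
    return {frozenset((new if n == old else n, v) for n, v in r) for r in table}

foo = {row(bat="me", ipsum=5), row(bat="you", ipsum=4)}
bar = {row(cat="us", lorem=5), row(cat="them", lorem=4)}

bats = project(foo, {"bat"})
big = select(foo, lambda r: r["ipsum"] > 4)
both = cartesian(foo, bar)
union = foo | bar          # set union
common = foo & bar         # set intersection
```

Every function takes tables and returns a table, which is what lets the operations be chained freely.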

A Join
select * from foo, bar where foo.ipsum = bar.lorem

First you take the cartesian product....

foo:             bar:
bat  ipsum       cat   lorem
me   5       X   us    5
you  4           them  4

=

bat  ipsum  cat   lorem
me   5      us    5
me   5      them  4
you  4      us    5
you  4      them  4

A Join

Then you select the rows that satisfy the join condition:

bat  ipsum  cat   lorem
me   5      us    5
me   5      them  4
you  4      us    5
you  4      them  4

=>

bat  ipsum  cat   lorem
me   5      us    5
you  4      them  4
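
The two steps of the join can be replayed literally in Python, using the same foo and bar tables. This is a sketch of the textbook definition only; real query planners generally avoid materializing the full product.

```python
from itertools import product

foo = [{"bat": "me", "ipsum": 5}, {"bat": "you", "ipsum": 4}]
bar = [{"cat": "us", "lorem": 5}, {"cat": "them", "lorem": 4}]

# Step 1: cartesian product, every foo row paired with every bar row.
pairs = [{**f, **b} for f, b in product(foo, bar)]

# Step 2: selection, keeping rows that satisfy foo.ipsum = bar.lorem.
joined = [r for r in pairs if r["ipsum"] == r["lorem"]]

print(len(pairs))   # size of the product: len(foo) * len(bar)
for r in joined:
    print(r)
```

The intermediate list grows as the product of the two table sizes, which is the cost the next slide puts numbers on.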

A Join

foo (8,000,000 rows) X bar (100,000 rows) = 800,000,000,000 rows

Your choices

• Scale the database
• Try a different approach

What SQL Gets You

• Speed
(when data is highly structured and small enough)
• Aggregation
• Relational searching
• ACID - guaranteed full consistency

Image by captcreate (http://www.flickr.com/photos/27845211@N02/2662264721)

What if you gave up
data aggregation?


Document databases:
MongoDB
CouchDB
Riak

What if you gave up
where clauses?

Key-value stores:
Memcache
Project Voldemort
Redis
Tokyo Cabinet
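
To make "giving up where clauses" concrete, here is a toy in-memory key-value store. The class `KeyValueStore` and its methods are hypothetical and do not mirror the API of Memcache, Redis or any other store named above.

```python
class KeyValueStore:
    """Toy key-value store: the only access path is the key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def scan(self):
        # Walks every stored value; the store knows nothing about
        # what is inside a value, so it cannot filter for you.
        return list(self._data.values())

store = KeyValueStore()
store.put("user:1", {"name": "me", "ipsum": 5})
store.put("user:2", {"name": "you", "ipsum": 4})

# Lookup by key is a single direct access.
user = store.get("user:1")

# The equivalent of "where ipsum = 4" has to be done by the client,
# by reading everything and filtering in application code.
matches = [v for v in store.scan() if v["ipsum"] == 4]
```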

What if you gave up
consistency?


Then things get interesting.

ACID & BASE

• Atomicity      • Basically Available
• Consistency    • Soft state
• Isolation      • Eventually consistent
• Durability
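
"Eventually consistent" can be illustrated with two toy replicas that only agree after a sync step. Everything here is an assumption made for the sketch: the `Replica` class, the integer version numbers and the last-write-wins rule are deliberately simple and are not how any particular store resolves conflicts.

```python
class Replica:
    """Toy replica holding key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Last write wins: accept only versions newer than what we hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

def sync(a, b):
    # Exchange entries in both directions; the newer version survives.
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)

a, b = Replica(), Replica()
a.write("cart", ["book"], version=1)

stale = b.read("cart")   # b has not heard about the write yet
sync(a, b)
fresh = b.read("cart")   # after syncing, both replicas agree
```

Between the write and the sync, a reader of `b` sees old data; that window is the consistency being traded away.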

Photo by Marcus Vegas (http://www.flickr.com/photos/vegas/413159909)

Fully ACID                                   Fully BASE

CouchDB              Redis              Bigtable


Questions to ask about data

• Where can I compromise aggregation?
• Where can I compromise where clauses?
• Where can I compromise consistency?
• Where can I localize consistency?

CAP Theorem
Pick any two:
• Consistency
• Availability
• Partition tolerance

Summary

Every system at scale will have to
compromise consistency at some level.


Do it mindfully.

Questions?
Twitter: @sarahmei
Email: sarah@pivotallabs.com
Slides: http://bit.ly/9xS2PK

Please rate this talk! http://bit.ly/9MCtX9
