A brief overview of currently popular & available key/value, column-oriented & document-oriented databases, along with implementation suggestions for the CakePHP web application framework.
2. @JPERRAS - JOEL PERRAS
Canadian Geek
Blog: http://nerderati.com
GitHub: http://github.com/jperras
CakePHP Core since Early 2009, PHP dev. since 2001
McGill University, Montréal, Canada - Physics, Mathematics & Computer Science
Employer: Plank Design (http://plankdesign.com)
(Twitter: @plankdesign)
3. RELATIONAL DATABASES
Many different vendors: MySQL, PostgreSQL, SQLite, Oracle, ...
Same basic implementation:
B(+)-Trees for pages
B(+)-Trees or hash tables for secondary indexes
Possibly R-Trees for spatial indexes
5. Schemas (relational models)
Familiar BCNF structure
Strong consistency
Transactions
Very “mature” & well tested (mostly)
Easy adoption/integration
6. RDBMS’ES ARE NOT GOING ANYWHERE
FriendFeed
Wikipedia
Google AdWords
Facebook
7. Most small to medium size applications will never need to go beyond a single database server.
8. Always try and follow the Golden Web Application Development Rule:
9. DON’T TRY TO SOLVE A PROBLEM YOU DON’T HAVE
10. The web has created new problem domains in data storage and querying.
11. MODERN WEB APPS
Often use variable schemas
Optional fields: contact lists, addresses, favourite movies/books, etc.
NULL-itis: null values should not be permitted in BCNF, but are everywhere in web applications.
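To make the point concrete, a minimal illustrative sketch (names and fields invented): two contact records as schema-less PHP arrays, each carrying only the optional fields it actually has, where a normalized table would need NULL-able columns or extra join tables.

<?php
// Illustrative only: two contacts with different optional fields.
// A strict relational schema would model these as NULL-able columns
// or additional join tables; a document or key/value store simply
// persists whatever fields are present.
$alice = array(
    'name'    => 'Alice',
    'city'    => 'Montréal',
    'twitter' => '@alice',                    // optional, present here
);
$bob = array(
    'name'            => 'Bob',
    'favourite_books' => array('SICP', 'Refactoring'),
    // no city, no twitter -- and no NULL placeholders either
);
// Either can be stored as-is, e.g. json_encode($alice) into CouchDB,
// or serialize($bob) under a key in a key/value store.
echo json_encode($alice);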
12. MODERN WEB APPS
‘Social’ apps => high write/read ratios
Complex Many-to-Many relationships
Joins become a problem in federated architectures
Eventual consistency is usually acceptable
Downtime unacceptable
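One concrete (illustrative) way this plays out: instead of resolving a follower relationship with a JOIN that may span shards, the list is denormalized under a single key and accepted as eventually consistent.

<?php
// Illustrative sketch: a denormalized follower list kept under one key
// instead of a cross-shard JOIN over a join table. Host/port and key
// names are invented; the read-modify-write below is not atomic, which
// is exactly the kind of trade-off "eventual consistency is usually
// acceptable" refers to (stores like Redis offer atomic set operations
// for this case).
$store = new Memcache();
$store->connect('127.0.0.1', 11211);

// Write side: user 99 follows user 42.
$followers = $store->get('user:42:followers');
$followers = ($followers === false) ? array() : $followers;
$followers[] = 99;
$store->set('user:42:followers', array_values(array_unique($followers)));

// Read side: one key fetch, no JOIN.
$followers = $store->get('user:42:followers');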
14. RULES OF APP AGING
http://push.cx/2009/rules-of-database-app-aging
1. All fields become optional
2. All relationships become many-to-many
3. Chatter (comments explaining hacks) grows with time.
15. SOME GOOD PROBLEMS TO HAVE
Even if they are “Hard” ones to solve.
16. Load Balancing
(you can only live with one machine for so long)
17. High Availability
(because disks fail, and replication fails)
18. What’s a web application developer to do?
20. Not a silver bullet.
These can solve some problems, but cause others and have their own limitations.
It’s up to you to weigh the cost/benefit of your chosen solution.
21. THE LANDSCAPE
Key/Value Stores/Distributed Hash Tables (DHT)
Document-oriented databases
Column-oriented databases
22. KEY/VALUE STORES
Voldemort
Scalaris
Tokyo Cabinet
Redis
MemcacheDB
23. DOCUMENT ORIENTED DATA STORES
CouchDB <- (my favourite!)
MongoDB
SimpleDB (Amazon)
24. COLUMN-ORIENTED STORES
BigTable (Google)
HBase (Hadoop Database)
Hypertable (BigTable Open Source clone)
Cassandra (Facebook)
25. How do we use these technologies alongside CakePHP?
27. CASE STUDY - COUCHDB
http://github.com/jperras/divan
(I will make zip/tar available when more stable - stay tuned)
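For context (this is not the Divan code itself, just the raw interface any CouchDB datasource ultimately wraps): CouchDB stores schema-less JSON documents behind plain HTTP, so a document round-trip looks like this. Host, database name and document ID below are placeholders.

<?php
// Minimal sketch of CouchDB's HTTP/JSON API (not the Divan datasource).
// Assumes a CouchDB server on localhost:5984 and a 'contacts' database.
$doc = json_encode(array('name' => 'Alice', 'city' => 'Montréal'));

// PUT a document under a chosen ID.
$ch = curl_init('http://localhost:5984/contacts/alice');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT');
curl_setopt($ch, CURLOPT_POSTFIELDS, $doc);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = json_decode(curl_exec($ch), true);
curl_close($ch);
// $result holds the new revision, e.g. array('ok' => true, 'id' => 'alice', 'rev' => '1-...')

// GET it back.
$ch = curl_init('http://localhost:5984/contacts/alice');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$fetched = json_decode(curl_exec($ch), true);
curl_close($ch);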
28. CASE STUDY - TOKYO CABINET/TYRANT
http://github.com/jperras/tyrannical
(I will make zip/tar available when more stable - stay tuned)
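Again as background rather than the Tyrannical code itself: Tokyo Tyrant (ttserver) exposes Tokyo Cabinet over the network and also speaks a memcached-compatible protocol, so a quick way to poke at it from PHP is the Memcache extension. Host and port below are assumptions for a local instance.

<?php
// Quick experiment against Tokyo Tyrant via its memcached-compatible
// protocol (not the Tyrannical datasource itself). Host/port are
// assumptions for a local ttserver.
$tt = new Memcache();
$tt->connect('127.0.0.1', 1978);           // ttserver's default port

$tt->set('user:42:name', 'Joel');          // plain key/value write
echo $tt->get('user:42:name');             // => 'Joel'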
30. So don’t try to force the interface to be relational.
31. DESIGNING A NON-RELATIONAL DATASOURCE
Favour simplicity over transparency
Don’t try to implement everything that the MySQL driver implements
Use the strengths of the alternative store
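As a rough sketch of what that means in CakePHP 1.x (illustrative, not the actual Divan or Tyrannical source): a datasource only needs to implement the handful of callbacks models actually use, and can simply omit the SQL-specific machinery. The internal $_store array below stands in for whatever client library a real datasource would wrap.

<?php
// Illustrative sketch of a minimal key/value DataSource for CakePHP 1.x.
// Method names follow the DataSource base class; $_store is a stand-in
// for a real client (Redis, Tokyo Tyrant, CouchDB, ...).
class KeyValueSource extends DataSource {

    var $description = 'Generic key/value datasource (illustrative)';
    var $_store = array();   // stand-in for the real connection/client

    // A key/value store has no tables to enumerate or describe.
    function listSources() {
        return null;
    }

    function describe(&$model) {
        return array();
    }

    // Reads are plain key lookups: no joins, no SQL condition parsing.
    // (A real adapter would inspect $queryData['conditions'] more carefully.)
    function read(&$model, $queryData = array()) {
        $key = $model->alias . ':' . $queryData['conditions'][$model->alias . '.id'];
        if (!isset($this->_store[$key])) {
            return array();
        }
        return array(array($model->alias => $this->_store[$key]));
    }

    function create(&$model, $fields = null, $values = null) {
        $data = array_combine($fields, $values);
        $this->_store[$model->alias . ':' . $model->id] = $data;
        return true;
    }

    function update(&$model, $fields = null, $values = null) {
        return $this->create($model, $fields, $values);
    }

    function delete(&$model, $id = null) {
        unset($this->_store[$model->alias . ':' . $id]);
        return true;
    }
}

A real adapter obviously handles connections, errors and store-specific features (views and revisions for CouchDB, for instance), but the surface the framework asks for stays about this small.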
33. KEY/VALUE STORES
Most have atomic increment/decrement operations
Great for API rate limiters (e.g. 300 API reqs/hour/account)
Counts & sums of normalized data
Most popular items, votes, ratings, some statistics
And more.
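A minimal sketch of the rate-limiter case (the key scheme, the 300/hour limit and the use of the memcached protocol are illustrative; MemcacheDB and Tokyo Tyrant both speak that protocol, and Redis offers its own INCR command):

<?php
// Sketch of an API rate limiter built on an atomic increment.
function withinRateLimit(Memcache $store, $accountId, $limit = 300) {
    // One counter per account per hour; expires on its own.
    $key = 'api:' . $accountId . ':' . date('YmdH');

    // add() only succeeds if the key does not exist yet, so the
    // counter is initialised exactly once per window.
    $store->add($key, 0, 0, 3600);

    // increment() is atomic on the server, so concurrent requests
    // from several web frontends cannot race each other.
    $count = $store->increment($key);

    return $count !== false && $count <= $limit;
}

$store = new Memcache();
$store->connect('127.0.0.1', 11211);

if (!withinRateLimit($store, 'account-42')) {
    header('HTTP/1.1 403 Forbidden');
    exit('API rate limit exceeded');
}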
34. DOCUMENT STORES
Filesystem objects (pdfs, images, excel sheets etc.) - stored as document attachments (size limited).
Allows you to reduce reliance on shared filesystems (NFS)
Address book
Volatile schema situations
CouchDB has a very interesting feature set
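For the attachment case, CouchDB's standalone attachment API is just another HTTP PUT against the document. A sketch (database, document ID, revision and file path are placeholders):

<?php
// Sketch: attach a PDF to an existing CouchDB document via the
// standalone attachment API. 'contacts', 'alice', the revision and
// the file path are placeholders.
$rev  = '1-967a00dff5e02add41819138abb3284d';   // current _rev of the doc
$file = file_get_contents('/tmp/resume.pdf');

$url = 'http://localhost:5984/contacts/alice/resume.pdf?rev=' . $rev;

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT');
curl_setopt($ch, CURLOPT_POSTFIELDS, $file);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/pdf'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = json_decode(curl_exec($ch), true);   // new revision comes back
curl_close($ch);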
36. Thanks to the DataSource adapter implementation in CakePHP, creating a model-based interface is simple.
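Concretely (an illustrative sketch; exact connection keys vary by CakePHP version and datasource), a model is pointed at such a datasource with one config entry plus $useDbConfig, and is then used like any other model:

<?php
// app/config/database.php -- illustrative connection entries; the
// non-relational keys below are placeholders, not the real Divan config.
class DATABASE_CONFIG {
    var $default = array(
        'driver'   => 'mysql',
        'host'     => 'localhost',
        'login'    => 'app',
        'password' => 'secret',
        'database' => 'app',
    );
    var $couch = array(
        'datasource' => 'divan',       // the custom DataSource class
        'host'       => 'localhost',
        'port'       => 5984,
        'database'   => 'contacts',
    );
}

// app/models/contact.php -- point the model at the new connection.
class Contact extends AppModel {
    var $useDbConfig = 'couch';
}

// Somewhere in a controller, the model API stays the same:
// $this->Contact->save(array('Contact' => array('name' => 'Alice')));
// $contact = $this->Contact->find('first',
//     array('conditions' => array('Contact.id' => 'alice')));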
37. Thank you!
@jperras
http://nerderati.com
http://github.com/jperras
38. CODE
Divan - CouchDB datasource
Yantra - State Machine component for application control flow
CakePHP TextMate Bundle
CakeMate - TextMate/Vim Plugin
Tyrannical - Tokyo Tyrant datasource
Originally by Martin Samson ([email protected])
Working to improve code - commits coming soon.
Currently working on a framework-agnostic, distributed, plugin/library server.