SlideShare a Scribd company logo
Cassandra and Rails
What we learned on our first project




                               Mike Wynholds
                                  Carbon Five
                                 @mwynholds
Agenda

Introduction

What is Cassandra?

Cassandra and Rails

Emergent data patterns

Deployment
What is Cassandra?
Column Family database

Distributed design - “eventually consistent”

Open sourced by Facebook, now Apache

Used by Facebook, Twitter, Digg, Rackspace...

  Largest cluster: 100 TB, 150 nodes

Still very immature - at version 0.6.2
Relational vs Column Family

      Schema-ful                       Schema-less (mostly)

      Row-based                        Column-based

      Robust SQL queries               No query language

      Transactional                    Eventually consistent *

      ODBC/JDBC                        Thrift *

      Fast - 300/350ms                 Blazing - .12/15ms **

* Cassandra-specific
** MySQL vs Cassandra, > 50GB data, write/read
Relational                          Column Based
Database                               Keyspace
                                       Column Family
Table 1
 col1     col2   col3    col4   col5    key1: col1:val     col3:val


                                        key2: col1:val1    col2:val    col3:val

                                       key3: col3:val

Table 2          Table 3               Super Column Family
 col1   col2      col1   col2   col3
                                                col1:      col1:
                                       key1:
                                                col1:val   col1:val   col2:val


                                       key2::
What about ORM?


Cassandra is NOT a relational database

No ActiveRecord support (currently)

No Hibernate support (currently)

OCFM? Lots of room for jars/gems
An example in Rails


bitchroom - a place to bitch and whine

Twitter-like features - user post timelines

Digg-like features - up/down, fav, reply
Custom Rails config
-- /config/cassandra.yml --

development:
 servers: "127.0.0.1:9160"
 keyspace: "bitchroom_development"
 timeout: 3
 retries: 2

-- /initializers/initialize_app.rb --

require 'cassandra'
env = ENV['RAILS_ENV'] || 'development'
cfg = YAML.load_file('#{RAILS_ROOT}/config/cassandra.yml')[env]
thrift_options = { :timeout => cfg['timeout'], :retries => cfg['retries'] }
$cassandra = Cassandra.new(cfg['keyspace'], cfg['servers'], thrift_options)
$cassandra.disable_node_auto_discovery!
Cassandra API

   get(keyspace, key, column_path, level)
   get_slice(keyspace, key, column_parent, predicate, level)
   multiget_slice(keyspace, keys, column_parent, predicate, level)
   get_count(keyspace, key, column_parent, level)
   get_range_slices(keyspace, column_parent, predicate, range, level)
   insert(keyspace, ey, column_path, value, timestamp, level)
   batch_mutate(keyspace, mutation_map, level)
   remove(keyspace, key, column_path, timestamp, level)
   describe_*(...)



Version 0.6.x only
Sample save
def save
 uuid = SimpleUUID::UUID.new(Time.now)
 @id = uuid.to_guid

 post_hash = self.to_cassandra_hash
 cassandra.insert(:Posts, @id, post_hash)

 pointer = {uuid => @id}
 cassandra.insert(:Timelines, "main", pointer)
 cassandra.insert(:Timelines, "user-#{@author_id}")

 return self
end
Emergent data patterns


 Simple object map

 Object relationship map

 Timeline <-- this one is the key
Simple Object Map
      Row id = object id (primary key)

      Attribute column names

      String column values

ids = [ 1, 2, 3 ]
posts = cassandra.multi_get(:Posts, ids, { })
Object Relationship Map
      Row id = object id (primary key)

      Relationship attribute column names

      External ID super-column values

ids = [ 1, 2, 3 ]
prs = cassandra.multi_get(:PostRelationships, ids, { })
Timeline
       TimeUUID column names

       External ID column values


result = []
options = { :count => 20, :reverse => true }
timelines = [ 'user-1', 'user-2', 'user-3' ]
multi_get_result = cassandra.multi_get(:Timelines, timelines, options)
multi_get_result.values().each do |timeline|
  timeline.each { |uuid, id| result << [uuid, id] }
end

result = result.sort { |a, b| b[0] <=> a[0] }.slice(0, options[:count])
Deployment
Load balancing + failover


                :80



               haproxy
                :81      :9160   cassandra
                nginx
                :5000              :9160



      :9160
               mongrel           cassandra
Resources


http://cassandra.apache.org/

http://wiki.apache.org/cassandra/API

http://github.com/fauna/cassandra

http://github.com/ryanking/simple_uuid
Ad

More Related Content

What's hot (20)

Bulk Loading into Cassandra
Bulk Loading into CassandraBulk Loading into Cassandra
Bulk Loading into Cassandra
Brian Hess
 
Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa...
Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa...Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa...
Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa...
DataStax Academy
 
Background Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbitBackground Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbit
Redis Labs
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
Tanu Siwag
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Redis Labs
 
Get started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosGet started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache Mesos
Joe Stein
 
8b. Column Oriented Databases Lab
8b. Column Oriented Databases Lab8b. Column Oriented Databases Lab
8b. Column Oriented Databases Lab
Fabio Fumarola
 
Cassandra and Spark
Cassandra and Spark Cassandra and Spark
Cassandra and Spark
datastaxjp
 
Cassandra Summit 2015: Intro to DSE Search
Cassandra Summit 2015: Intro to DSE SearchCassandra Summit 2015: Intro to DSE Search
Cassandra Summit 2015: Intro to DSE Search
Caleb Rackliffe
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache Mesos
Joe Stein
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014
Patrick McFadin
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
shsedghi
 
Apache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on MesosApache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on Mesos
Joe Stein
 
Redis and it's data types
Redis and it's data typesRedis and it's data types
Redis and it's data types
Aniruddha Chakrabarti
 
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
DataStax Academy
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
DataStax
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
DataStax
 
Cassandra Tutorial
Cassandra TutorialCassandra Tutorial
Cassandra Tutorial
mubarakss
 
Redis introduction
Redis introductionRedis introduction
Redis introduction
Federico Daniel Colombo Gennarelli
 
Boosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and SparkBoosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and Spark
Dvir Volk
 
Bulk Loading into Cassandra
Bulk Loading into CassandraBulk Loading into Cassandra
Bulk Loading into Cassandra
Brian Hess
 
Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa...
Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa...Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa...
Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa...
DataStax Academy
 
Background Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbitBackground Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbit
Redis Labs
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
Tanu Siwag
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Redis Labs
 
Get started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosGet started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache Mesos
Joe Stein
 
8b. Column Oriented Databases Lab
8b. Column Oriented Databases Lab8b. Column Oriented Databases Lab
8b. Column Oriented Databases Lab
Fabio Fumarola
 
Cassandra and Spark
Cassandra and Spark Cassandra and Spark
Cassandra and Spark
datastaxjp
 
Cassandra Summit 2015: Intro to DSE Search
Cassandra Summit 2015: Intro to DSE SearchCassandra Summit 2015: Intro to DSE Search
Cassandra Summit 2015: Intro to DSE Search
Caleb Rackliffe
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache Mesos
Joe Stein
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014
Patrick McFadin
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
shsedghi
 
Apache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on MesosApache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on Mesos
Joe Stein
 
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
DataStax Academy
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
DataStax
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
DataStax
 
Cassandra Tutorial
Cassandra TutorialCassandra Tutorial
Cassandra Tutorial
mubarakss
 
Boosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and SparkBoosting Machine Learning with Redis Modules and Spark
Boosting Machine Learning with Redis Modules and Spark
Dvir Volk
 

Similar to Cassandra and Rails at LA NoSQL Meetup (20)

Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
Murat Çakal
 
MariaDB Cassandra Interoperability
MariaDB Cassandra InteroperabilityMariaDB Cassandra Interoperability
MariaDB Cassandra Interoperability
Colin Charles
 
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraCassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Dave Gardner
 
MariaDB and Cassandra Interoperability
MariaDB and Cassandra InteroperabilityMariaDB and Cassandra Interoperability
MariaDB and Cassandra Interoperability
Colin Charles
 
Scala active record
Scala active recordScala active record
Scala active record
鉄平 土佐
 
Mysqlconf2013 mariadb-cassandra-interoperability
Mysqlconf2013 mariadb-cassandra-interoperabilityMysqlconf2013 mariadb-cassandra-interoperability
Mysqlconf2013 mariadb-cassandra-interoperability
Sergey Petrunya
 
Maria db cassandra interoperability cassandra storage engine in mariadb
Maria db cassandra interoperability cassandra storage engine in mariadbMaria db cassandra interoperability cassandra storage engine in mariadb
Maria db cassandra interoperability cassandra storage engine in mariadb
YUCHENG HU
 
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
DataStax
 
Cassandra Overview
Cassandra OverviewCassandra Overview
Cassandra Overview
Sergey Titov, Ph.D.
 
C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin C...
C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin C...C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin C...
C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin C...
DataStax Academy
 
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan OttTrivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
Patrick McFadin
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Anastasios Skarlatidis
 
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Natalino Busa
 
Scylla Summit 2018: Introducing ValuStor, A Memcached Alternative Made to Run...
Scylla Summit 2018: Introducing ValuStor, A Memcached Alternative Made to Run...Scylla Summit 2018: Introducing ValuStor, A Memcached Alternative Made to Run...
Scylla Summit 2018: Introducing ValuStor, A Memcached Alternative Made to Run...
ScyllaDB
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
Robert Stupp
 
Developing with Cassandra
Developing with CassandraDeveloping with Cassandra
Developing with Cassandra
Sperasoft
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
Christian Johannsen
 
Apache Spark for Library Developers with William Benton and Erik Erlandson
 Apache Spark for Library Developers with William Benton and Erik Erlandson Apache Spark for Library Developers with William Benton and Erik Erlandson
Apache Spark for Library Developers with William Benton and Erik Erlandson
Databricks
 
NoSql Database
NoSql DatabaseNoSql Database
NoSql Database
Suresh Parmar
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
Murat Çakal
 
MariaDB Cassandra Interoperability
MariaDB Cassandra InteroperabilityMariaDB Cassandra Interoperability
MariaDB Cassandra Interoperability
Colin Charles
 
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraCassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Dave Gardner
 
MariaDB and Cassandra Interoperability
MariaDB and Cassandra InteroperabilityMariaDB and Cassandra Interoperability
MariaDB and Cassandra Interoperability
Colin Charles
 
Mysqlconf2013 mariadb-cassandra-interoperability
Mysqlconf2013 mariadb-cassandra-interoperabilityMysqlconf2013 mariadb-cassandra-interoperability
Mysqlconf2013 mariadb-cassandra-interoperability
Sergey Petrunya
 
Maria db cassandra interoperability cassandra storage engine in mariadb
Maria db cassandra interoperability cassandra storage engine in mariadbMaria db cassandra interoperability cassandra storage engine in mariadb
Maria db cassandra interoperability cassandra storage engine in mariadb
YUCHENG HU
 
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
DataStax
 
C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin C...
C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin C...C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin C...
C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin C...
DataStax Academy
 
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan OttTrivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
Patrick McFadin
 
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Natalino Busa
 
Scylla Summit 2018: Introducing ValuStor, A Memcached Alternative Made to Run...
Scylla Summit 2018: Introducing ValuStor, A Memcached Alternative Made to Run...Scylla Summit 2018: Introducing ValuStor, A Memcached Alternative Made to Run...
Scylla Summit 2018: Introducing ValuStor, A Memcached Alternative Made to Run...
ScyllaDB
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
Robert Stupp
 
Developing with Cassandra
Developing with CassandraDeveloping with Cassandra
Developing with Cassandra
Sperasoft
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
Christian Johannsen
 
Apache Spark for Library Developers with William Benton and Erik Erlandson
 Apache Spark for Library Developers with William Benton and Erik Erlandson Apache Spark for Library Developers with William Benton and Erik Erlandson
Apache Spark for Library Developers with William Benton and Erik Erlandson
Databricks
 
Ad

Recently uploaded (20)

Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Ad

Cassandra and Rails at LA NoSQL Meetup

  • 1. Cassandra and Rails What we learned on our first project Mike Wynholds Carbon Five @mwynholds
  • 2. Agenda Introduction What is Cassandra? Cassandra and Rails Emergent data patterns Deployment
  • 3. What is Cassandra? Column Family database Distributed design - “eventually consistent” Open sourced by Facebook, now Apache Used by Facebook, Twitter, Digg, Rackspace... Largest cluster: 100 TB, 150 nodes Still very immature - at version 0.6.2
  • 4. Relational vs Column Family Schema-ful Schema-less (mostly) Row-based Column-based Robust SQL queries No query language Transactional Eventually consistent * ODBC/JDBC Thrift * Fast - 300/350ms Blazing - .12/15ms ** * Cassandra-specific ** MySQL vs Cassandra, > 50GB data, write/read
  • 5. Relational Column Based Database Keyspace Column Family Table 1 col1 col2 col3 col4 col5 key1: col1:val col3:val key2: col1:val1 col2:val col3:val key3: col3:val Table 2 Table 3 Super Column Family col1 col2 col1 col2 col3 col1: col1: key1: col1:val col1:val col2:val key2::
  • 6. What about ORM? Cassandra is NOT a relational database No ActiveRecord support (currently) No Hibernate support (currently) OCFM? Lots of room for jars/gems
  • 7. An example in Rails bitchroom - a place to bitch and whine Twitter-like features - user post timelines Digg-like features - up/down, fav, reply
  • 8. Custom Rails config -- /config/cassandra.yml -- development: servers: "127.0.0.1:9160" keyspace: "bitchroom_development" timeout: 3 retries: 2 -- /initializers/initialize_app.rb -- require 'cassandra' env = ENV['RAILS_ENV'] || 'development' cfg = YAML.load_file('#{RAILS_ROOT}/config/cassandra.yml')[env] thrift_options = { :timeout => cfg['timeout'], :retries => cfg['retries'] } $cassandra = Cassandra.new(cfg['keyspace'], cfg['servers'], thrift_options) $cassandra.disable_node_auto_discovery!
  • 9. Cassandra API get(keyspace, key, column_path, level) get_slice(keyspace, key, column_parent, predicate, level) multiget_slice(keyspace, keys, column_parent, predicate, level) get_count(keyspace, key, column_parent, level) get_range_slices(keyspace, column_parent, predicate, range, level) insert(keyspace, ey, column_path, value, timestamp, level) batch_mutate(keyspace, mutation_map, level) remove(keyspace, key, column_path, timestamp, level) describe_*(...) Version 0.6.x only
  • 10. Sample save def save uuid = SimpleUUID::UUID.new(Time.now) @id = uuid.to_guid post_hash = self.to_cassandra_hash cassandra.insert(:Posts, @id, post_hash) pointer = {uuid => @id} cassandra.insert(:Timelines, "main", pointer) cassandra.insert(:Timelines, "user-#{@author_id}") return self end
  • 11. Emergent data patterns Simple object map Object relationship map Timeline <-- this one is the key
  • 12. Simple Object Map Row id = object id (primary key) Attribute column names String column values ids = [ 1, 2, 3 ] posts = cassandra.multi_get(:Posts, ids, { })
  • 13. Object Relationship Map Row id = object id (primary key) Relationship attribute column names External ID super-column values ids = [ 1, 2, 3 ] prs = cassandra.multi_get(:PostRelationships, ids, { })
  • 14. Timeline TimeUUID column names External ID column values result = [] options = { :count => 20, :reverse => true } timelines = [ 'user-1', 'user-2', 'user-3' ] multi_get_result = cassandra.multi_get(:Timelines, timelines, options) multi_get_result.values().each do |timeline| timeline.each { |uuid, id| result << [uuid, id] } end result = result.sort { |a, b| b[0] <=> a[0] }.slice(0, options[:count])
  • 15. Deployment Load balancing + failover :80 haproxy :81 :9160 cassandra nginx :5000 :9160 :9160 mongrel cassandra