SlideShare a Scribd company logo
Developing with Cassandra
In memory Key-Value:
Redis
CouchBase
File system
BerkeleyDB
Distributed file system
HDFS
Velocity
Volume
Variety
In memory NewSQL /
domain specific
VoltDB
Traditional
[SQL] RDBMS
MongoDB
Distributed DBMS
Cassandra
HBase
CAP theorem:
• Consistency
• Availability
• Partition tolerance
- pick two
Eventual consistency
Transactional?
• No
Choose the DB
Why select Cassandra?
• [relatively] easy to setup
• [relatively] easy to use
• ~zero routine ops
• it works (!!) as promised:
o real-time replication
o node/site failure recovery
o zero load writes
o double of nodes = double of speed
Because Cassandra is Fast!
But needs some time to deliver
• 12'000 WPS on a laptop
• ~0.1 / 1 ms constant latency for writes/reads
Check References:
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf
Scalability
Good For:
• log-like data
TTL helps
• massive writes
1M WPS enough?
• simple real-time analytics
Not So Good For:
• dump of junk
(consider HDFS)
• OLAP
(depends on "O")
Good and Not So Good
Distributed DBMS
Just DBMS - closed monolithic solution
o not a platform to run custom code (as MongoDB);
o not an extension (as HBase);
o highly optimized
No-master, eventually consistent
NoSQL
Data model - Key-Value
http://cassandra.apache.org
Apache Cassandra
Developed at Facebook for Inbox search
Released to open source in 2008
In use:
• Netflix - main non-content data store~500 Cassandra nodes (2012)
• eBay - recommendation system"dozens of nodes", 200 TB storage (2012)
• Twitter - tweet analysis100 + TB of data
• More clients: (http://www.datastax.com/cassandrausers)
History
1.0 - October 2011
1.1 - April 2012
1.2 - January 2013
2.0 - expected this summer (2013)
June 26 2013 - 158 bugs, 89 worth to notice
Sperasoft Experience:
• hit 1 bug in production (stability issue)
• hit 1 bug in QA (in a crafted case)
Mature & Agile
Apache .tar.gz and Debian
packageshttp://cassandra.apache.org/download/
DataStax DSC - Cassandra + OpsCenter
http://planetcassandra.org/Download/DataStaxCommunityEdition
Embedded – for funct. tests on Java apps
Maven
Documentation
http://wiki.apache.org/cassandra/
http://www.datastax.com/docs
Distributions
http://www.slideshare.net/planetcassandra/5-andy-cobley-raspberry-pi
CPU: ARM 700 MHz
RAM: 500 MB
Storage: SD card
Price: $25
200 WPS!
What Hardware?
Bare metal
CPU: 8 cores (4 works too)
RAM: 16 - 64 GB (min 8 GB)
Storage: rotating disks 3 - 5 TB total (SSD better)
VM works too, but...
Storage: local disks, avoid NAS
More on Hardware
Software Yes & No (production)
?
Software for Development: All Yes
Plus more: Java, Python, C#, PHP, Ruby, Clojure, Go, R, ...
KeyspaceKeyspace
Column
family
Column
family
~ RDBMS DB / Schema
~ RDBMS table
Key1
Key2
Value
Column
Row
Clustering key
Partitioning key Map < ... , Map < ... , ... > >
Data Model
Node 3Node 2Node 1
CFCFCF
1
2
3
Client
Parallel reads,writes
1
2
3
Data on Discs
https://github.com/datastax/java-driver
Client API Options
Thrift RPC Native protocol + CQL3
Apache Thrift Custom protocol
Synchronous Asynchronous
Schema-less Static schema
Store & Forward Cursors promised in 2.0
API for any language Java; Python, C# coming
Cryptic API JDBC-like API
Supported yet Going forward
• Forget RDB design principles
• Forget abstract data model
- shape data for queries
• No joins - materialized views
• Data duplication - OK
• Remember eventual consistency
• Queries are precious
• Use right data types - timestamp, uuid
Why? Because NoSQL is a low level tool for high optimization.
Data Modeling for NoSQL
Do & Don't
Patterns and Anti-patterns
Query
Wait
Read
Query
Wait
Read
Parallel it:
slo-o-ow
Sequential Read
CREATE TABLE timeline (
event uuid,
timestamp timeuuid,
...
PRIMARY KEY (event, timestamp)
);
event:date
timestamp
...
CREATE TABLE timeline (
event uuid,
date long,
timestamp timeuuid,
...
PRIMARY KEY ((event, date), timestamp)
);
event
timestamp
...
Still bad - need sharding
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
Long rows - Cassandra handle 2G columns, but... slo-o-ow
Timeline
Insert = Update = Delete
A B C D1
Y1
Z1
1
A Y Z1
a b c d
UPDATE ... SET b = 'Y' WHERE id = 1
INSERT INTO ... SET (id, c) values (1, 'Z')
DELETE d FROM ... WHERE id = 1
SELECT * FROM ... WHERE id = 1
have to fetch 4 rows
slo-o-ow
Plan Data Immutable
http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
Q
Queue: INSERT INTO ...
SET( name, enqueued_at, payload )
VALUES ( 'Q', now(), ... )
Dequeue: DELETE payload FROM ...
WHERE name = 'Q'
AND enqueued_at = ...
Pick the next: SELECT * FROM ...
WHERE id = 1 LIMIT 1
*Q
*Q
*Q
*Q
Q
Q
*? ? ?Q
have to fetch 4 rowsslo-o-ow
Queue
Remember - eventual consistency.
Concurrent updates
=> wrong count
SELECT count(*)
FROM ... WHERE .... ;
Full scan over the selection
=>
Default 10'000 rows limit
=> wrong count
Have an integer column and increment it
CREATE TABLE count_table (
id
uuid,
value counter,
PRIMARY KEY (id)
);
...
UPDATE count_table
SET value = value + 1
WHERE id = ... ;
Counter column family
http://www.datastax.com/documentation/cassandr
a/1.2/cassandra/cql_using/use_counter_t.html
slo-o-ow
mess
mess
How Many?
CREATE TABLE blob (
id uuid,
data blob,
PRIMARY KEY (id)
);
id
chunk_no
data
CREATE TABLE blob (
id uuid,
chunk_no int,
data blob,
PRIMARY KEY (id, chunk_no)
);
id data
http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage
http://wiki.apache.org/cassandra/CassandraLimitations
OutOfMemory
Blobs
Unbounded Queries
Result set
Client
OutOfMemory
OutOfMemory
Cursors. Planned to 2.0.
https://issues.apache.org/jira/browse/CASSANDRA-4415
C* clients use RPC yet
Cassandra
slo-o-ow
Column names - data... yet
Keep them short.
Node 3Node 1
CF
3
Client
2
Node 3Node 2
Why??
slo-o-ow
hot spots
Limit Client to a node
Helpful Links
Sperasoft @ slideshare: http://www.slideshare.net/Sperasoft
Sperasoft @ speakerdeck: https://speakerdeck.com/sperasoft
Sperasoft @ github: https://github.com/Sperasoft/Workshop
Ad

More Related Content

What's hot (18)

Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big Data
DataStax Academy
 
Tuning tips for Apache Spark Jobs
Tuning tips for Apache Spark JobsTuning tips for Apache Spark Jobs
Tuning tips for Apache Spark Jobs
Samir Bessalah
 
Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015
Cosmin Lehene
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt Stump
DataStax
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al Tobey
DataStax Academy
 
C* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonC* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick Branson
DataStax Academy
 
Debugging & Tuning in Spark
Debugging & Tuning in SparkDebugging & Tuning in Spark
Debugging & Tuning in Spark
Shiao-An Yuan
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
DataStax
 
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake SolutionCeph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
Karan Singh
 
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
DataStax Academy
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
DataStax Academy
 
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
ScyllaDB
 
AWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsAWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data Analytics
Keeyong Han
 
Scylla Summit 2018: What's New in Scylla Manager?
Scylla Summit 2018: What's New in Scylla Manager?Scylla Summit 2018: What's New in Scylla Manager?
Scylla Summit 2018: What's New in Scylla Manager?
ScyllaDB
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0
J.B. Langston
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Redis Labs
 
Automation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure DataAutomation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure Data
Yan Wang
 
Mesosphere and Contentteam: A New Way to Run Cassandra
Mesosphere and Contentteam: A New Way to Run CassandraMesosphere and Contentteam: A New Way to Run Cassandra
Mesosphere and Contentteam: A New Way to Run Cassandra
DataStax Academy
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big Data
DataStax Academy
 
Tuning tips for Apache Spark Jobs
Tuning tips for Apache Spark JobsTuning tips for Apache Spark Jobs
Tuning tips for Apache Spark Jobs
Samir Bessalah
 
Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015
Cosmin Lehene
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt Stump
DataStax
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al Tobey
DataStax Academy
 
C* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonC* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick Branson
DataStax Academy
 
Debugging & Tuning in Spark
Debugging & Tuning in SparkDebugging & Tuning in Spark
Debugging & Tuning in Spark
Shiao-An Yuan
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
DataStax
 
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake SolutionCeph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
Karan Singh
 
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
DataStax Academy
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
DataStax Academy
 
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
ScyllaDB
 
AWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsAWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data Analytics
Keeyong Han
 
Scylla Summit 2018: What's New in Scylla Manager?
Scylla Summit 2018: What's New in Scylla Manager?Scylla Summit 2018: What's New in Scylla Manager?
Scylla Summit 2018: What's New in Scylla Manager?
ScyllaDB
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0
J.B. Langston
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Redis Labs
 
Automation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure DataAutomation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure Data
Yan Wang
 
Mesosphere and Contentteam: A New Way to Run Cassandra
Mesosphere and Contentteam: A New Way to Run CassandraMesosphere and Contentteam: A New Way to Run Cassandra
Mesosphere and Contentteam: A New Way to Run Cassandra
DataStax Academy
 

Viewers also liked (12)

BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache Cassandra
Victor Coustenoble
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
Victor Coustenoble
 
Real-time Cassandra
Real-time CassandraReal-time Cassandra
Real-time Cassandra
Acunu
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
jbellis
 
Cassandra DataTables Using RESTful API
Cassandra DataTables Using RESTful APICassandra DataTables Using RESTful API
Cassandra DataTables Using RESTful API
Simran Kedia
 
Using Cassandra with your Web Application
Using Cassandra with your Web ApplicationUsing Cassandra with your Web Application
Using Cassandra with your Web Application
supertom
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS Paris
Duyhai Doan
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
shsedghi
 
Application Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a ServiceApplication Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a Service
WSO2
 
Node.js and Cassandra
Node.js and CassandraNode.js and Cassandra
Node.js and Cassandra
Stratio
 
NodeJS : Communication and Round Robin Way
NodeJS : Communication and Round Robin WayNodeJS : Communication and Round Robin Way
NodeJS : Communication and Round Robin Way
Edureka!
 
69 claves para conocer Big Data
69 claves para conocer Big Data69 claves para conocer Big Data
69 claves para conocer Big Data
Stratebi
 
BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache Cassandra
Victor Coustenoble
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
Victor Coustenoble
 
Real-time Cassandra
Real-time CassandraReal-time Cassandra
Real-time Cassandra
Acunu
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
jbellis
 
Cassandra DataTables Using RESTful API
Cassandra DataTables Using RESTful APICassandra DataTables Using RESTful API
Cassandra DataTables Using RESTful API
Simran Kedia
 
Using Cassandra with your Web Application
Using Cassandra with your Web ApplicationUsing Cassandra with your Web Application
Using Cassandra with your Web Application
supertom
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS Paris
Duyhai Doan
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
shsedghi
 
Application Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a ServiceApplication Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a Service
WSO2
 
Node.js and Cassandra
Node.js and CassandraNode.js and Cassandra
Node.js and Cassandra
Stratio
 
NodeJS : Communication and Round Robin Way
NodeJS : Communication and Round Robin WayNodeJS : Communication and Round Robin Way
NodeJS : Communication and Round Robin Way
Edureka!
 
69 claves para conocer Big Data
69 claves para conocer Big Data69 claves para conocer Big Data
69 claves para conocer Big Data
Stratebi
 
Ad

Similar to Developing with Cassandra (20)

Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
Stavros Kontopoulos
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
Pollfish
 
High concurrency,
Low latency analytics
using Spark/Kudu
 High concurrency,
Low latency analytics
using Spark/Kudu High concurrency,
Low latency analytics
using Spark/Kudu
High concurrency,
Low latency analytics
using Spark/Kudu
Chris George
 
Making KVS 10x Scalable
Making KVS 10x ScalableMaking KVS 10x Scalable
Making KVS 10x Scalable
Sadayuki Furuhashi
 
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them AllScylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
ScyllaDB
 
Quick trip around the Cosmos - Things every astronaut supposed to know
Quick trip around the Cosmos - Things every astronaut supposed to knowQuick trip around the Cosmos - Things every astronaut supposed to know
Quick trip around the Cosmos - Things every astronaut supposed to know
Rafał Hryniewski
 
ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016
Tzach Livyatan
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Skills Matter
 
What's new in Jewel and Beyond
What's new in Jewel and BeyondWhat's new in Jewel and Beyond
What's new in Jewel and Beyond
Sage Weil
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
Ben Stopford
 
Using ScyllaDB for Extreme Scale Workloads
Using ScyllaDB for Extreme Scale WorkloadsUsing ScyllaDB for Extreme Scale Workloads
Using ScyllaDB for Extreme Scale Workloads
MarisaDelao3
 
k8s-batch-sig_-_Dask_on_Kubernetes.pptx__1_.pdf
k8s-batch-sig_-_Dask_on_Kubernetes.pptx__1_.pdfk8s-batch-sig_-_Dask_on_Kubernetes.pptx__1_.pdf
k8s-batch-sig_-_Dask_on_Kubernetes.pptx__1_.pdf
RyzaAlvieMancunian
 
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
DataStax Academy
 
Puppet and Apache CloudStack
Puppet and Apache CloudStackPuppet and Apache CloudStack
Puppet and Apache CloudStack
Puppet
 
Infrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStackInfrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStack
ke4qqq
 
Postgres the hardway
Postgres the hardwayPostgres the hardway
Postgres the hardway
Dave Pitts
 
OSv at Cassandra Summit
OSv at Cassandra SummitOSv at Cassandra Summit
OSv at Cassandra Summit
Don Marti
 
Reusable, composable, battle-tested Terraform modules
Reusable, composable, battle-tested Terraform modulesReusable, composable, battle-tested Terraform modules
Reusable, composable, battle-tested Terraform modules
Yevgeniy Brikman
 
Under The Hood Of A Shard-Per-Core Database Architecture
Under The Hood Of A Shard-Per-Core Database ArchitectureUnder The Hood Of A Shard-Per-Core Database Architecture
Under The Hood Of A Shard-Per-Core Database Architecture
ScyllaDB
 
OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"
Giivee The
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
Pollfish
 
High concurrency,
Low latency analytics
using Spark/Kudu
 High concurrency,
Low latency analytics
using Spark/Kudu High concurrency,
Low latency analytics
using Spark/Kudu
High concurrency,
Low latency analytics
using Spark/Kudu
Chris George
 
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them AllScylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
ScyllaDB
 
Quick trip around the Cosmos - Things every astronaut supposed to know
Quick trip around the Cosmos - Things every astronaut supposed to knowQuick trip around the Cosmos - Things every astronaut supposed to know
Quick trip around the Cosmos - Things every astronaut supposed to know
Rafał Hryniewski
 
ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016
Tzach Livyatan
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Skills Matter
 
What's new in Jewel and Beyond
What's new in Jewel and BeyondWhat's new in Jewel and Beyond
What's new in Jewel and Beyond
Sage Weil
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
Ben Stopford
 
Using ScyllaDB for Extreme Scale Workloads
Using ScyllaDB for Extreme Scale WorkloadsUsing ScyllaDB for Extreme Scale Workloads
Using ScyllaDB for Extreme Scale Workloads
MarisaDelao3
 
k8s-batch-sig_-_Dask_on_Kubernetes.pptx__1_.pdf
k8s-batch-sig_-_Dask_on_Kubernetes.pptx__1_.pdfk8s-batch-sig_-_Dask_on_Kubernetes.pptx__1_.pdf
k8s-batch-sig_-_Dask_on_Kubernetes.pptx__1_.pdf
RyzaAlvieMancunian
 
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
DataStax Academy
 
Puppet and Apache CloudStack
Puppet and Apache CloudStackPuppet and Apache CloudStack
Puppet and Apache CloudStack
Puppet
 
Infrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStackInfrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStack
ke4qqq
 
Postgres the hardway
Postgres the hardwayPostgres the hardway
Postgres the hardway
Dave Pitts
 
OSv at Cassandra Summit
OSv at Cassandra SummitOSv at Cassandra Summit
OSv at Cassandra Summit
Don Marti
 
Reusable, composable, battle-tested Terraform modules
Reusable, composable, battle-tested Terraform modulesReusable, composable, battle-tested Terraform modules
Reusable, composable, battle-tested Terraform modules
Yevgeniy Brikman
 
Under The Hood Of A Shard-Per-Core Database Architecture
Under The Hood Of A Shard-Per-Core Database ArchitectureUnder The Hood Of A Shard-Per-Core Database Architecture
Under The Hood Of A Shard-Per-Core Database Architecture
ScyllaDB
 
OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"
Giivee The
 
Ad

More from Sperasoft (20)

особенности работы с Locomotion в Unreal Engine 4
особенности работы с Locomotion в Unreal Engine 4особенности работы с Locomotion в Unreal Engine 4
особенности работы с Locomotion в Unreal Engine 4
Sperasoft
 
концепт и архитектура геймплея в Creach: The Depleted World
концепт и архитектура геймплея в Creach: The Depleted Worldконцепт и архитектура геймплея в Creach: The Depleted World
концепт и архитектура геймплея в Creach: The Depleted World
Sperasoft
 
Опыт разработки VR игры для UE4
Опыт разработки VR игры для UE4Опыт разработки VR игры для UE4
Опыт разработки VR игры для UE4
Sperasoft
 
Организация работы с UE4 в команде до 20 человек
Организация работы с UE4 в команде до 20 человек Организация работы с UE4 в команде до 20 человек
Организация работы с UE4 в команде до 20 человек
Sperasoft
 
Gameplay Tags
Gameplay TagsGameplay Tags
Gameplay Tags
Sperasoft
 
Data Driven Gameplay in UE4
Data Driven Gameplay in UE4Data Driven Gameplay in UE4
Data Driven Gameplay in UE4
Sperasoft
 
Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks
Sperasoft
 
The theory of relational databases
The theory of relational databasesThe theory of relational databases
The theory of relational databases
Sperasoft
 
Automated layout testing using Galen Framework
Automated layout testing using Galen FrameworkAutomated layout testing using Galen Framework
Automated layout testing using Galen Framework
Sperasoft
 
Sperasoft talks: Android Security Threats
Sperasoft talks: Android Security ThreatsSperasoft talks: Android Security Threats
Sperasoft talks: Android Security Threats
Sperasoft
 
Sperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft Talks: RxJava Functional Reactive Programming on AndroidSperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft
 
Sperasoft‬ talks j point 2015
Sperasoft‬ talks j point 2015Sperasoft‬ talks j point 2015
Sperasoft‬ talks j point 2015
Sperasoft
 
Effective Мeetings
Effective МeetingsEffective Мeetings
Effective Мeetings
Sperasoft
 
Unreal Engine 4 Introduction
Unreal Engine 4 IntroductionUnreal Engine 4 Introduction
Unreal Engine 4 Introduction
Sperasoft
 
JIRA Development
JIRA DevelopmentJIRA Development
JIRA Development
Sperasoft
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
Sperasoft
 
MOBILE DEVELOPMENT with HTML, CSS and JS
MOBILE DEVELOPMENT with HTML, CSS and JSMOBILE DEVELOPMENT with HTML, CSS and JS
MOBILE DEVELOPMENT with HTML, CSS and JS
Sperasoft
 
Quick Intro Into Kanban
Quick Intro Into KanbanQuick Intro Into Kanban
Quick Intro Into Kanban
Sperasoft
 
ECMAScript 6 Review
ECMAScript 6 ReviewECMAScript 6 Review
ECMAScript 6 Review
Sperasoft
 
Console Development in 15 minutes
Console Development in 15 minutesConsole Development in 15 minutes
Console Development in 15 minutes
Sperasoft
 
особенности работы с Locomotion в Unreal Engine 4
особенности работы с Locomotion в Unreal Engine 4особенности работы с Locomotion в Unreal Engine 4
особенности работы с Locomotion в Unreal Engine 4
Sperasoft
 
концепт и архитектура геймплея в Creach: The Depleted World
концепт и архитектура геймплея в Creach: The Depleted Worldконцепт и архитектура геймплея в Creach: The Depleted World
концепт и архитектура геймплея в Creach: The Depleted World
Sperasoft
 
Опыт разработки VR игры для UE4
Опыт разработки VR игры для UE4Опыт разработки VR игры для UE4
Опыт разработки VR игры для UE4
Sperasoft
 
Организация работы с UE4 в команде до 20 человек
Организация работы с UE4 в команде до 20 человек Организация работы с UE4 в команде до 20 человек
Организация работы с UE4 в команде до 20 человек
Sperasoft
 
Gameplay Tags
Gameplay TagsGameplay Tags
Gameplay Tags
Sperasoft
 
Data Driven Gameplay in UE4
Data Driven Gameplay in UE4Data Driven Gameplay in UE4
Data Driven Gameplay in UE4
Sperasoft
 
Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks
Sperasoft
 
The theory of relational databases
The theory of relational databasesThe theory of relational databases
The theory of relational databases
Sperasoft
 
Automated layout testing using Galen Framework
Automated layout testing using Galen FrameworkAutomated layout testing using Galen Framework
Automated layout testing using Galen Framework
Sperasoft
 
Sperasoft talks: Android Security Threats
Sperasoft talks: Android Security ThreatsSperasoft talks: Android Security Threats
Sperasoft talks: Android Security Threats
Sperasoft
 
Sperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft Talks: RxJava Functional Reactive Programming on AndroidSperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft
 
Sperasoft‬ talks j point 2015
Sperasoft‬ talks j point 2015Sperasoft‬ talks j point 2015
Sperasoft‬ talks j point 2015
Sperasoft
 
Effective Мeetings
Effective МeetingsEffective Мeetings
Effective Мeetings
Sperasoft
 
Unreal Engine 4 Introduction
Unreal Engine 4 IntroductionUnreal Engine 4 Introduction
Unreal Engine 4 Introduction
Sperasoft
 
JIRA Development
JIRA DevelopmentJIRA Development
JIRA Development
Sperasoft
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
Sperasoft
 
MOBILE DEVELOPMENT with HTML, CSS and JS
MOBILE DEVELOPMENT with HTML, CSS and JSMOBILE DEVELOPMENT with HTML, CSS and JS
MOBILE DEVELOPMENT with HTML, CSS and JS
Sperasoft
 
Quick Intro Into Kanban
Quick Intro Into KanbanQuick Intro Into Kanban
Quick Intro Into Kanban
Sperasoft
 
ECMAScript 6 Review
ECMAScript 6 ReviewECMAScript 6 Review
ECMAScript 6 Review
Sperasoft
 
Console Development in 15 minutes
Console Development in 15 minutesConsole Development in 15 minutes
Console Development in 15 minutes
Sperasoft
 

Recently uploaded (20)

Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 

Developing with Cassandra

  • 2. In memory Key-Value: Redis CouchBase File system BerkeleyDB Distributed file system HDFS Velocity Volume Variety In memory NewSQL / domain specific VoltDB Traditional [SQL] RDBMS MongoDB Distributed DBMS Cassandra HBase CAP theorem: • Consistency • Availability • Partition tolerance - pick two Eventual consistency Transactional? • No Choose the DB
  • 3. Why select Cassandra? • [relatively] easy to setup • [relatively] easy to use • ~zero routine ops • it works (!!) as promised: o real-time replication o node/site failure recovery o zero load writes o double of nodes = double of speed
  • 4. Because Cassandra is Fast! But needs some time to deliver • 12'000 WPS on a laptop • ~0.1 / 1 ms constant latency for writes/reads
  • 6. Good For: • log-like data TTL helps • massive writes 1M WPS enough? • simple real-time analytics Not So Good For: • dump of junk (consider HDFS) • OLAP (depends on "O") Good and Not So Good
  • 7. Distributed DBMS Just DBMS - closed monolithic solution o not a platform to run custom code (as MongoDB); o not an extension (as HBase); o highly optimized No-master, eventually consistent NoSQL Data model - Key-Value http://cassandra.apache.org Apache Cassandra
  • 8. Developed at Facebook for Inbox search Released to open source in 2008 In use: • Netflix - main non-content data store~500 Cassandra nodes (2012) • eBay - recommendation system"dozens of nodes", 200 TB storage (2012) • Twitter - tweet analysis100 + TB of data • More clients: (http://www.datastax.com/cassandrausers) History
  • 9. 1.0 - October 2011 1.1 - April 2012 1.2 - January 2013 2.0 - expected this summer (2013) June 26 2013 - 158 bugs, 89 worth to notice Sperasoft Experience: • hit 1 bug in production (stability issue) • hit 1 bug in QA (in a crafted case) Mature & Agile
  • 10. Apache .tar.gz and Debian packageshttp://cassandra.apache.org/download/ DataStax DSC - Cassandra + OpsCenter http://planetcassandra.org/Download/DataStaxCommunityEdition Embedded – for funct. tests on Java apps Maven Documentation http://wiki.apache.org/cassandra/ http://www.datastax.com/docs Distributions
  • 12. Bare metal CPU: 8 cores (4 works too) RAM: 16 - 64 GB (min 8 GB) Storage: rotating disks 3 - 5 TB total (SSD better) VM works too, but... Storage: local disks, avoid NAS More on Hardware
  • 13. Software Yes & No (production) ?
  • 14. Software for Development: All Yes Plus more: Java, Python, C#, PHP, Ruby, Clojure, Go, R, ...
  • 15. KeyspaceKeyspace Column family Column family ~ RDBMS DB / Schema ~ RDBMS table Key1 Key2 Value Column Row Clustering key Partitioning key Map < ... , Map < ... , ... > > Data Model
  • 16. Node 3Node 2Node 1 CFCFCF 1 2 3 Client Parallel reads,writes 1 2 3 Data on Discs
  • 17. https://github.com/datastax/java-driver Client API Options Thrift RPC Native protocol + CQL3 Apache Thrift Custom protocol Synchronous Asynchronous Schema-less Static schema Store & Forward Cursors promised in 2.0 API for any language Java; Python, C# coming Cryptic API JDBC-like API Supported yet Going forward
  • 18. • Forget RDB design principles • Forget abstract data model - shape data for queries • No joins - materialized views • Data duplication - OK • Remember eventual consistency • Queries are precious • Use right data types - timestamp, uuid Why? Because NoSQL is a low level tool for high optimization. Data Modeling for NoSQL
  • 19. Do & Don't Patterns and Anti-patterns
  • 21. CREATE TABLE timeline ( event uuid, timestamp timeuuid, ... PRIMARY KEY (event, timestamp) ); event:date timestamp ... CREATE TABLE timeline ( event uuid, date long, timestamp timeuuid, ... PRIMARY KEY ((event, date), timestamp) ); event timestamp ... Still bad - need sharding http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra Long rows - Cassandra handle 2G columns, but... slo-o-ow Timeline
  • 22. Insert = Update = Delete A B C D1 Y1 Z1 1 A Y Z1 a b c d UPDATE ... SET b = 'Y' WHERE id = 1 INSERT INTO ... SET (id, c) values (1, 'Z') DELETE d FROM ... WHERE id = 1 SELECT * FROM ... WHERE id = 1 have to fetch 4 rows slo-o-ow Plan Data Immutable
  • 23. http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets Q Queue: INSERT INTO ... SET( name, enqueued_at, payload ) VALUES ( 'Q', now(), ... ) Dequeue: DELETE payload FROM ... WHERE name = 'Q' AND enqueued_at = ... Pick the next: SELECT * FROM ... WHERE id = 1 LIMIT 1 *Q *Q *Q *Q Q Q *? ? ?Q have to fetch 4 rowsslo-o-ow Queue
  • 24. Remember - eventual consistency. Concurrent updates => wrong count SELECT count(*) FROM ... WHERE .... ; Full scan over the selection => Default 10'000 rows limit => wrong count Have an integer column and increment it CREATE TABLE count_table ( id uuid, value counter, PRIMARY KEY (id) ); ... UPDATE count_table SET value = value + 1 WHERE id = ... ; Counter column family http://www.datastax.com/documentation/cassandr a/1.2/cassandra/cql_using/use_counter_t.html slo-o-ow mess mess How Many?
  • 25. CREATE TABLE blob ( id uuid, data blob, PRIMARY KEY (id) ); id chunk_no data CREATE TABLE blob ( id uuid, chunk_no int, data blob, PRIMARY KEY (id, chunk_no) ); id data http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage http://wiki.apache.org/cassandra/CassandraLimitations OutOfMemory Blobs
  • 26. Unbounded Queries Result set Client OutOfMemory OutOfMemory Cursors. Planned to 2.0. https://issues.apache.org/jira/browse/CASSANDRA-4415 C* clients use RPC yet Cassandra slo-o-ow
  • 27. Column names - data... yet Keep them short.
  • 28. Node 3Node 1 CF 3 Client 2 Node 3Node 2 Why?? slo-o-ow hot spots Limit Client to a node
  • 29. Helpful Links Sperasoft @ slideshare: http://www.slideshare.net/Sperasoft Sperasoft @ speakerdeck: https://speakerdeck.com/sperasoft Sperasoft @ github: https://github.com/Sperasoft/Workshop