STAMPEDECON 2014
CASSANDRA IN THE REAL WORLD
Nate McCall
@zznate
Co-Founder & Sr. Technical Consultant
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
About The Last Pickle.
Work with clients to deliver and improve Apache Cassandra based solutions.
Based in New Zealand & USA.
“…in the Real World?”
Lots of hype,
stats get attention,
as do big names
“Real World?”
“…1.1 million client writes per second. Data was automatically replicated across all three zones making a total of 3.3 million writes per second across the cluster.”
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
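The arithmetic behind the quoted figures is simple: every client write is stored once per zone, so the cluster-wide write rate is the client rate multiplied by the number of copies. A minimal sketch, with a hypothetical function name chosen for illustration:

```python
def replica_writes_per_second(client_writes_per_second, copies):
    """Writes landing on nodes when each client write is stored `copies` times."""
    return client_writes_per_second * copies


# Figures from the quote above: 1.1 million client writes, three zones.
total = replica_writes_per_second(1_100_000, 3)
print(f"{total:,} writes per second across the cluster")
```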
“Real World?”
“+10 clusters, +100s nodes, 250TB provisioned, 9 billion writes/day, 5 billion reads/day”
http://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-cassandra-summit-2013
“Real World?”
…
• “but I don’t have an ∞ AMZN budget”
• “maybe one day I’ll have that much data”
“Real World!”
Most folks needed:
real fault tolerance,
scale out characteristics
“Real World!”
Most folks have:
3 to 12 nodes with 2-15TB,
commodity hardware,
small teams
Cassandra in the Real World.
Cassandra at 10k feet
Case Studies
Common Best Practices
Cassandra Architecture (briefly).
[Diagram labels: Clients; API's, Cluster Aware, Cluster Unaware; Disk]
Cassandra Cluster Architecture (briefly).
[Diagram labels: Clients; Node 1 and Node 2, each with API's, Cluster Aware, Cluster Unaware, Disk]
Dynamo Cluster Architecture (briefly).
[Diagram labels: Clients; Node 1 and Node 2, each with API's, Dynamo, Database, Disk]
Cassandra Architecture (briefly).
API
Dynamo
Database
API Transports.
Thrift
Native Binary
Thrift transport.
Extremely performant for specific workloads
Astyanax,
disruptor-based HSHA in 2.0
Native Binary Transport.
Focus of future development
Uses Netty,
CQL 3 only,
asynchronous
API Services.
JMX
Thrift
CQL 3
Cassandra Architecture (briefly).
API
Dynamo
Database
Please see:
http://www.slideshare.net/aaronmorton/cassandra-community-webinar-introduction-to-apache-cassandra-12-20353118
http://www.slideshare.net/planetcassandra/c-summit-eu-2013-cassandra-internals
http://www.slideshare.net/aaronmorton/cassandra-community-webinar-august-29th-2013-in-case-of-emergency-break-glass
Cassandra in the Real World.
Cassandra at 10k feet
Case Studies
Common Best Practices
Case Studies.
AdTech

Sensor Data	

Mobile Device Diagnostics
AdTech.	

Latency = $$$
AdTech.	

Large “Hot Data” set	

active users,	

targeting,	

display count
AdTech.	

Huge Long Tail	

who saw what,	

used for billing,	

campaign effectiveness over time,	

all sorts of analytics
AdTech: Software.	

Java 	

CQL via DataStax Java Driver	

Python	

Pycassa (Thrift)
AdTech: Cluster.
Cluster	

12 nodes,	

2 datacenters,	

{DC1:R1:3,DC2:R2:3}
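Reading {DC1:R1:3,DC2:R2:3} as datacenter : rack : replication factor (an assumption, since the slide does not spell out the notation), the replica counts behind the quorum-style consistency levels can be worked out by hand. The sketch below uses the standard quorum formula, floor(RF / 2) + 1; the variable names are illustrative only.

```python
def quorum(replication_factor):
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


# Assumed reading of the slide: three replicas in each of two datacenters.
replication = {"DC1": 3, "DC2": 3}

# Replicas that must answer within one datacenter (LOCAL_QUORUM style).
local_quorum = {dc: quorum(rf) for dc, rf in replication.items()}

# Replicas that must answer across the whole cluster (QUORUM style).
cluster_quorum = quorum(sum(replication.values()))

print(local_quorum)    # replicas needed per datacenter
print(cluster_quorum)  # replicas needed cluster-wide
```

With three replicas per datacenter, a local quorum needs two of them, so each datacenter can lose one replica of a given row and still serve local-quorum requests.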
AdTech: Systems.
Physical Hardware	

commodity 1U 8xSSD,	

36GB RAM, 	

10gigE + 4x1gigE
Case Studies.
AdTech	

Sensor Data	

Mobile Device Diagnostics
Sensor Data.	

Latency != $$$
Sensor Data.	

High Write Throughput:	

consistent “shape”,	

immutable data,	

large sequential reads,	

high uptime (for writes)
Sensor Data: Software.	

REST application:	

separate reader service,	

writes to kafka,	

ELB to multiple regions
Sensor Data: Software.	

Java:	

Thrift via Astyanax,	

read from Kafka and batch insertions to optimal size
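Batching a stream of records into fixed-size groups before insertion could look roughly like the sketch below. The slide gives no figure for the "optimal size", so `batch_size` is a placeholder to be found by measurement, and a plain Python iterable stands in for the Kafka consumer; the Astyanax write itself is not shown.

```python
from itertools import islice


def batches(records, batch_size):
    """Yield lists of at most `batch_size` records from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Illustrative only: seven fake readings grouped three at a time.
for batch in batches(range(7), 3):
    print(batch)
```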
Sensor Data: Cluster.	

Cluster	

9 nodes,	

1 availability zone,	

{RF:3}
Sensor Data: Systems.	

m1.xlarge:	

15GB, 2TB RAID0	

“high”,	

tablesnap for backup
Case Studies.
AdTech	

Sensor Data	

Mobile Device Diagnostics
Device Diagnostics.	

Latency = battery
Device Diagnostics.	

Write Bursts	

large single payloads,	

large hot data set
Device Diagnostics.	

Huge long tail	

but irrelevant after 2 months,	

external partner API*
*thar be dragons
Device Diagnostics: Software.	

Java	

CQL / DataStax Java Driver
Device Diagnostics: Software.	

REST application	

Payloads to S3,	

pointer in kafka to payload
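The "payload to S3, pointer in Kafka" arrangement can be sketched with in-memory stand-ins: a dict plays the object store and a deque plays the topic. The key layout and message fields below are hypothetical, not taken from the talk.

```python
import hashlib
import json
from collections import deque

object_store = {}        # stand-in for S3: key -> payload bytes
pointer_queue = deque()  # stand-in for a Kafka topic of small messages


def accept_payload(device_id, payload):
    """Store the large payload, then enqueue only a small pointer to it."""
    key = f"{device_id}/{hashlib.sha256(payload).hexdigest()}"
    object_store[key] = payload
    pointer = {"device_id": device_id, "key": key, "size": len(payload)}
    pointer_queue.append(json.dumps(pointer))
    return key


def next_payload():
    """Consume one pointer and fetch the payload it refers to."""
    pointer = json.loads(pointer_queue.popleft())
    return pointer["device_id"], object_store[pointer["key"]]


accept_payload("device-1", b"x" * 1000)
device, data = next_payload()
print(device, len(data))
```

Keeping the payload outside the queue keeps queue messages small, and the stored payloads remain available if the pipeline ever has to be replayed.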
Device Diagnostics: Cluster.	

Cluster	

12 nodes,	

3 availability zones	

{us-east-1:1}
Device Diagnostics: Systems.	

i2.2xlarge	

61GB, 1.8TB RAID0 SSD

“Enhanced Networking”,	

dedicated ENI
Device Diagnostics: Systems.
No Backups.
“Replay the front end.”
Cassandra in the Real World.
Cassandra at 10k feet
Case Studies
Common Best Practices
Common Best Practices.
[Diagram labels: Clients; API's, Cluster Aware, Cluster Unaware; Disk]
Client Best Practices.	

Decouple!	

buffer writes for event based systems,

use asynchronous operations
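One way to decouple event producers from the database is a bounded in-memory buffer drained by a background worker. The sketch below uses only the Python standard library; `BufferedWriter` and its `sink` callable are hypothetical names, and the sink stands in for whatever driver call performs the actual write. Error handling and retries are deliberately left out.

```python
import queue
import threading

_STOP = object()  # sentinel telling the worker to exit


class BufferedWriter:
    """Accept events quickly; a background thread forwards them to `sink`."""

    def __init__(self, sink, max_pending=1000):
        self._sink = sink
        self._queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, event):
        # Blocks only when the buffer is full, which applies back-pressure.
        self._queue.put(event)

    def _drain(self):
        while True:
            event = self._queue.get()
            if event is _STOP:
                return
            self._sink(event)

    def close(self):
        # Everything submitted before close() is written before the join returns.
        self._queue.put(_STOP)
        self._worker.join()


stored = []  # stand-in for the database
writer = BufferedWriter(stored.append)
for i in range(5):
    writer.submit({"event": i})
writer.close()
print(len(stored), "events written")
```

The bounded queue is the design choice that matters here: an unbounded buffer hides a slow database until memory runs out, while a bounded one pushes the slowdown back to the producer.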
Client Best Practices.	

Use Official Drivers	

(but there are exceptions)
Client Best Practices.	

CQL3:	

collections, 	

user defined types,	

tooling available
API Best Practices.	

Understand Replication!
API Best Practices.	

Monitor & Instrument
Cluster Best Practices.	

Understand Replication!
learn all you can about topology options
Cluster Best Practices.	

Verify Assumptions:	

test failure scenarios explicitly
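Before taking nodes down for real, it helps to write down what is expected to keep working, so the test has something to confirm or refute. The sketch below lists which of three Cassandra consistency levels (ONE, QUORUM, ALL) can still be met for a row when some of its replicas are unavailable. It is a paper model only, with a hypothetical function name, and is no substitute for the actual failure test.

```python
# Replicas that must respond for each consistency level, given the replication factor.
REQUIRED = {
    "ONE": lambda rf: 1,
    "QUORUM": lambda rf: rf // 2 + 1,
    "ALL": lambda rf: rf,
}


def available_levels(replication_factor, replicas_down):
    """Consistency levels still satisfiable with `replicas_down` replicas lost."""
    alive = replication_factor - replicas_down
    return [
        level
        for level, needed in REQUIRED.items()
        if alive >= needed(replication_factor)
    ]


# Expectations for a replication factor of 3.
for down in range(4):
    print(down, "replicas down:", available_levels(3, down))
```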
Systems Best Practices.	

Better to have a lot of a little
commodity hardware*,
32-64GB of RAM (or more)
*10gigE is now commodity
Systems Best Practices.	

BUT: do you have staff that can tune kernels?
larger hardware needs tuning:
“receive packet steering”
Systems Best Practices.	

EC2	

SSD instances if you can,
Use VPCs, Deployment groups and ENIs
Storage Best Practices.
Dependent on workload
You can mix and match:
rotational for commitlog and system,
SSD for data
Storage Best Practices.	

SSD	

consider JBOD,	

consumer grade works fine
Storage Best Practices.
“What about SANs?”
NO.
(You would be moving a distributed system onto a centralized component)
Storage Best Practices.	

Backups:	

tablesnap on EC2,	

rsync (immutable data FTW!)
Storage Best Practices.
Backups:
combine rebuild+replay for best results
(Bonus: loading production data to staging is testing your backups!)
Thanks.
Nate McCall
@zznate
Co-Founder & Sr. Technical Consultant
www.thelastpickle.com
