The Short and Straight Road That Leads from Cassandra to Scylla
Felipe Móz
Big Data Specialist
Presenter bio
First of all: I’m not a Jedi.
I believe that without data you only have an opinion.
I’m a big data specialist @Natura
Three brands, one vision
▪ 3.2 thousand stores
▪ 18 thousand co-workers
▪ 72 countries
▪ 1.8 million sales consultants
Some figures
▪ 6.5 thousand co-workers
▪ 100 million consumers
▪ 8 countries (Americas & Europe)
▪ 28 Natura stores
▪ 3.5 million online consumers
TECH TALK
AWS
Sensedia
API Gateway
OAM / AD
Authentication
Scheduler
Oracle
Batch replication
Private
access
Streaming + RDD
Monitoring: Dynatrace
~40 long batches per day
12 x m4.4xlarge (autoscaling)
~10 builds/day
+380k messages/day
Streaming Statistics: ~160K completed RDDs/day
Processing Time: Avg 76 ms
Scheduling Delay: Avg 0 ms
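A rough back-of-the-envelope reading of the streaming figures above, as a minimal sketch. It assumes the messages and completed RDDs are spread evenly over 24 hours, which the slide does not state, so the derived rates are only indicative. Under that assumption the average processing time is a small fraction of the average gap between batches, which is consistent with the reported 0 ms scheduling delay.

```python
# Figures quoted on the slide.
messages_per_day = 380_000      # "+380k messages/day"
rdds_per_day = 160_000          # "~160K completed rdd/day"
avg_processing_ms = 76          # "Processing Time: Avg: 76 ms"

SECONDS_PER_DAY = 24 * 60 * 60

# ASSUMPTION: load is spread evenly across the day (not stated on the slide).
messages_per_second = messages_per_day / SECONDS_PER_DAY
avg_batch_interval_ms = 1000 * SECONDS_PER_DAY / rdds_per_day

# Fraction of each average batch interval spent processing.
utilisation = avg_processing_ms / avg_batch_interval_ms

print(f"~{messages_per_second:.1f} messages/s")
print(f"one completed RDD every ~{avg_batch_interval_ms:.0f} ms on average")
print(f"processing occupies ~{utilisation:.0%} of each interval")
```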
ARCHITECTURE
SCALING
Snapshot: https://snapshot.raintank.io/dashboard/snapshot/GnPAoY6J1RCRgj6y6vMOkTHovu34Q0IB?orgId=2
THE DECISION
Why ScyllaDB?
▪ No more JVM, and we are glad of it;
▪ Many performance issues on C*/DSE;
▪ Support works! They know our use case and data model;
▪ Direct access to the developers when you need it;
▪ Flexible contracts: choose the right way to scale your database, by cores rather than nodes;
▪ DataStax overpricing during the renewal process;
▪ The ScyllaDB roadmap is better than that of C*/DSE.
SCYLLA METRICS
C*/DSE METRICS
C*/DSE SIZING
NSG: RedeNaturaAzure-SP
Subnet: SUBNETNATPRD-AZ1-Isolated
dc0prdvm0 (#0): Standard DS14 v2 (16 CPU, 112 GB DDR), LUN 11 x P20 SSD
dc0prdvm1 (#1): Standard DS14 v2 (16 CPU, 112 GB DDR), LUN 11 x P20 SSD
dc0prdvm3 (#3): Standard DS14 v2 (16 CPU, 112 GB DDR), LUN 11 x P20 SSD
dc0prdvm4 (#4): Standard DS14 v2 (16 CPU, 112 GB DDR), LUN 11 x P20 SSD
dc0prdvm5 (#5): Standard DS14 v2 (16 CPU, 112 GB DDR), LUN 11 x P20 SSD
opscenterprd: Standard DS1 v2 (1 CPU, 3.5 GB DDR)
C*/DSE Cloud Hardware Costs
Each node @Azure
Location: Brazil
Type: Ubuntu 16 LTS
Size: DS14_V2
Managed Disks: 11 x P20 SSD
Hours: 744 hours
dc0prdvm0 (#0): US$ 2,008.44
dc0prdvm1 (#1): US$ 2,008.44
dc0prdvm3 (#3): US$ 2,008.44
dc0prdvm4 (#4): US$ 2,008.44
dc0prdvm5 (#5): US$ 2,008.44
*List price without reservation (DSE license not included)
SCYLLADB SIZING
NSG: AWS
Subnet: SUBNETNATPRD-AZ1-Isolated
node (#0): i3.4xlarge (16 CPU, 112 GB DDR), 3.5 TB ephemeral NVMe
node (#1): i3.4xlarge (16 CPU, 112 GB DDR), 3.5 TB ephemeral NVMe
node (#3): i3.4xlarge (16 CPU, 112 GB DDR), 3.5 TB ephemeral NVMe
node (#4): i3.4xlarge (16 CPU, 112 GB DDR), 3.5 TB ephemeral NVMe
node (#5): i3.4xlarge (16 CPU, 112 GB DDR), 3.5 TB ephemeral NVMe
grafana: t2.medium (1 CPU, 24GB)
SCYLLADB Cloud Hardware Costs
Each node @AWS
Location: Brazil
Type: ami-2c311640 (helpful)
Size: i3.4xlarge
Managed Disks: 2 x 1900 GB NVMe SSD
Hours: 744 hours
node (#0): US$ 913.54
node (#1): US$ 913.54
node (#3): US$ 913.54
node (#4): US$ 913.54
node (#5): US$ 913.54
*On-demand list price, no contract (ScyllaDB license not included)
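The two cost slides can be compared with a short calculation. This is a minimal sketch that uses only the per-node list prices quoted above and assumes five data nodes per cluster, as drawn on the sizing slides. It leaves out the monitoring hosts (opscenterprd, grafana) and both database licenses, none of which are priced in the deck, so the totals are hardware-only estimates.

```python
# Per-node list prices for 744 hours, as quoted on the slides (US$).
AZURE_DS14_V2_PER_NODE = 2008.44    # C*/DSE node, DSE license not included
AWS_I3_4XLARGE_PER_NODE = 913.54    # ScyllaDB node, ScyllaDB license not included
NODES = 5                           # ASSUMPTION: five data nodes, as on the sizing slides


def cluster_cost(per_node: float, nodes: int = NODES) -> float:
    """Total hardware cost for a cluster of identical nodes."""
    return round(per_node * nodes, 2)


dse_total = cluster_cost(AZURE_DS14_V2_PER_NODE)
scylla_total = cluster_cost(AWS_I3_4XLARGE_PER_NODE)
saving = round(dse_total - scylla_total, 2)
saving_pct = round(100 * saving / dse_total, 1)

print(f"C*/DSE:   US$ {dse_total:,.2f}")
print(f"ScyllaDB: US$ {scylla_total:,.2f}")
print(f"Saving:   US$ {saving:,.2f} ({saving_pct}%)")
```

On these figures the ScyllaDB cluster hardware costs a little under half of the C*/DSE cluster hardware for the same node count.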
RESULTS
▪ Batch processing time reduced to 10% or less of what it was
▪ In some cases the run time went from 6 hours to under 10 minutes (1/36th!)
▪ Saving a lot of money
▪ Almost zero rebuilding
▪ Write latency went from milliseconds to microseconds
Streaming Statistics: ~160K completed RDDs/day
Processing Time: Avg 6 ms
Scheduling Delay: Avg 0 ms
Streaming Statistics: ~160K completed RDDs/day
Processing Time: Avg 76 ms
Scheduling Delay: Avg 0 ms
DSE average, 95th percentile: ~220 ms (by operation)
ScyllaDB average, 95th percentile: ~500 µs (by operation)
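The ratios behind the results above can be checked with a few lines of arithmetic. This is a minimal sketch using only the approximate figures quoted on the slides, so the outputs are order-of-magnitude indications rather than measurements.

```python
def speedup(before: float, after: float) -> float:
    """How many times smaller `after` is than `before` (same units)."""
    return before / after


# Best-case batch run: 6 hours down to (below) 10 minutes, both in minutes.
batch_speedup = speedup(6 * 60, 10)

# 95th-percentile latency: ~220 ms versus ~500 µs, both in microseconds.
p95_speedup = speedup(220 * 1000, 500)

# The two streaming processing-time averages shown: 76 ms and 6 ms.
streaming_speedup = speedup(76, 6)

print(f"batch run time:       {batch_speedup:.0f}x shorter")
print(f"p95 latency:          {p95_speedup:.0f}x lower")
print(f"streaming processing: {streaming_speedup:.1f}x faster")
```

The batch figure reproduces the 1/36th quoted on the slide, and the latency gap works out to more than two orders of magnitude.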
TIPS
ScyllaDB + Spark (MUST read)
https://docs.scylladb.com/kb/scylla-and-spark-integration/
https://www.scylladb.com/2018/07/31/spark-scylla/
https://pt.slideshare.net/ScyllaDB/spark-powered-by-scylla
They aren't our provider; they are part of our team.
Thank You
Any questions?
Please stay in touch
felipemoz@natura.net

Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to Scylla