Getting started with Apache Cassandra™
DuyHai DOAN
Apache Cassandra™ Evangelist
1 Apache Cassandra™ use-cases
2 Why do I need Apache Cassandra™ ?
3 Distribution, replication & consistency model
4 Features Summary
© DataStax, All Rights Reserved. 2
Apache Cassandra™ use-cases
Before (< 2016)
© 2016 DataStax, All Rights Reserved. 4
Messaging
Collections/Playlists
Fraud detection
Recommendation/Personalization
Internet of things/Sensor data
Today (≥ 2016)
Why do I need Apache Cassandra™ ?
Linear Scalability
[Figure: cluster sizes, from NetcoSports (3 nodes, ≈3GB) up to 1k+ nodes, PB+; "YOU" sits on that scale]
Continuous availability
•  thanks to the Dynamo architecture
Multi data-centers/cloud native
•  out-of-the-box (config only)
•  AWS/GCE/Azure/CloudStack support
•  Cloud/Bare-metal
Multi-DC usages
Data locality, disaster recovery
[Figure: two Cassandra rings, New York (DC1) and London (DC2), linked by async replication]
Multi-DC usages
Virtual DC for workload segregation
[Figure: two Cassandra rings in the same room, Production (LIVE) and Analytics (Spark), linked by async replication]
Multi-DC usages
Prod data copy for back-up/benchmark
[Figure: production ring replicating asynchronously to "My tiny test DC", which is READ-ONLY]
Use LOCAL_XXX consistency levels
Operational simplicity
•  1 node = 1 process + 2 config files (cassandra.yaml + cassandra-rackdc.properties)
•  deployment automation (Ansible …)
•  No role between nodes, perfect symmetry
Eco System
•  Apache Spark – Apache Cassandra integration
•  analytics
•  joins, aggregation
•  SparkSQL/Dataframe integration with CQL (predicate push-down)
•  Apache Zeppelin – Apache Cassandra integration
•  web-based notebook
•  tabular/graph display
Q & A
Apache Cassandra™ Architecture
The Tokens
Random hash of #partition → token = hash(#p)
Hash: ] −x, x ]
Hash range: 2^64 values
x = 2^64 / 2
[Figure: ring of eight Cassandra nodes]
Token Ranges
A: ] −x, −3x/4 ]
B: ] −3x/4, −2x/4 ]
C: ] −2x/4, −x/4 ]
D: ] −x/4, 0 ]
E: ] 0, x/4 ]
F: ] x/4, 2x/4 ]
G: ] 2x/4, 3x/4 ]
H: ] 3x/4, x ]
[Figure: ring of nodes A to H, each owning one token range]
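The mapping from partition key to token to owning node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD5 from the standard library stands in for the real partitioner hash, the eight nodes get equal ranges as on the slide, and the names `token`, `token_ranges` and `owner` are hypothetical helpers, not Cassandra APIs.

```python
import hashlib

RANGE_SIZE = 2 ** 64          # hash range: 2^64 values
X = RANGE_SIZE // 2           # tokens live in ]-X, X]
NODES = "ABCDEFGH"            # the eight nodes of the slide's ring


def token(partition_key: str) -> int:
    """Map a partition key to a token in ]-X, X].

    MD5 is only a stand-in taken from the standard library; it is an
    assumption of this sketch, not the hash a real cluster uses.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    unsigned = int.from_bytes(digest[:8], "big")   # 0 .. 2^64 - 1
    return unsigned - X + 1                        # -X + 1 .. X


def token_ranges(nodes: str = NODES) -> dict:
    """Split ]-X, X] into equal ranges ]low, high], one per node.

    Assumes the number of nodes divides 2^64 evenly (true for 8 nodes).
    """
    width = RANGE_SIZE // len(nodes)
    return {
        node: (-X + i * width, -X + (i + 1) * width)
        for i, node in enumerate(nodes)
    }


def owner(tok: int, nodes: str = NODES) -> str:
    """Return the node whose range ]low, high] contains the token."""
    for node, (low, high) in token_ranges(nodes).items():
        if low < tok <= high:
            return node
    raise ValueError(f"token {tok} is outside ]-X, X]")
```

Calling `owner(token("user_id1"))` returns one of the letters A to H; which one depends on the stand-in hash, so no specific placement is implied here.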
Distributed Tables
user_id1
user_id2
user_id3
user_id4
user_id5
CREATE TABLE users(
user_id int,
…,
PRIMARY KEY(user_id)
);
[Figure: rows user_id1 to user_id5 distributed over the ring of nodes A to H, each row stored on the node that owns the token of its user_id]
Linear Scalability
Today = high load, production
In danger
[Figure: ring of nodes A to H under high load]
Scaling Out
+2 nodes to lower the pressure
[Figure: ring extended from eight to ten nodes, A to J]
Q & A
Data Replication
Failure Tolerance
Replication factor (RF) = 3
[Figure: ring of nodes A to H with three partitions and their replica sets]
1: {A, H, G}
2: {B, A, H}
3: {C, B, A}
Coordinator Node
Responsible for handling requests (read/write)
Every node can be coordinator
•  masterless
•  round robin master for each request
•  no SPOF
•  proxy role
[Figure: a request reaches a coordinator node on the ring of nodes A to H; replicas 1, 2 and 3 are marked]
Consistency Model
Consistency Level
Tunable at runtime
•  ONE
•  QUORUM (strict majority w.r.t RF)
•  ALL
Applicable to any request (read/write)
Consistency In Action
RF = 3, Write ONE, Read ONE
Write ONE: B → ack
data replication in progress …
Replicas: B A A
Read ONE: A
Consistency In Action
RF = 3, Write ONE, Read QUORUM
Write ONE: B → ack
data replication in progress …
Replicas: B A A
Read QUORUM: A
Consistency In Action
RF = 3, Write ONE, Read ALL
Write ONE: B → ack
data replication in progress …
Replicas: B A A
Read ALL: B
Last Write Wins
[Figure: the coordinator reads the value back from replicas 1, 2 and 3, which hold B (t2), A (t1) and A (t1); the value with the most recent timestamp is returned]
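A minimal sketch of the reconciliation step, assuming each replica answers with a value and its write timestamp and that t1 < t2. The `Cell` type and the `last_write_wins` function are hypothetical names used for illustration only.

```python
from typing import List, NamedTuple


class Cell(NamedTuple):
    value: str
    timestamp: int   # write timestamp; larger means more recent


def last_write_wins(responses: List[Cell]) -> Cell:
    """Return the response carrying the most recent write timestamp."""
    if not responses:
        raise ValueError("no replica responded")
    return max(responses, key=lambda cell: cell.timestamp)


T1, T2 = 1, 2   # hypothetical timestamps, assuming t1 < t2
replicas = [Cell("B", T2), Cell("A", T1), Cell("A", T1)]

# The coordinator heard from all three replicas: the newer write wins.
read_all = last_write_wins(replicas)

# The coordinator heard only from the two replicas still holding A.
read_stale = last_write_wins(replicas[1:])
```

The second call shows why the set of replicas contacted matters: reconciliation can only pick the newest value among the answers it actually received.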
Consistency In Action
RF = 3, Write QUORUM, Read ONE
Write QUORUM: B → ack
data replication in progress …
Replicas: B B A
Read ONE: A
Consistency In Action
RF = 3, Write QUORUM, Read QUORUM
Write QUORUM: B → ack
data replication in progress …
Replicas: B B A
Read QUORUM: B (any 2 of the 3 replicas include at least one B)
Consistency Level = Trade-off
Consistency Level
ONE
Fast, may not read latest written value
Consistency Level
QUORUM
Strict majority w.r.t. Replication Factor
Good balance
Consistency Level
ALL
Paranoid
Slow, loss of high availability
Consistency Level Common Patterns
ONE read + ONE write
☞ available for read/write even with (RF − 1) replicas down
QUORUM read + QUORUM write
☞ available for read/write as long as a strict majority of replicas is up, e.g. with 1 replica down when RF = 3
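The arithmetic behind these patterns can be sketched as follows. The function names are hypothetical; the only assumptions are that QUORUM means a strict majority of RF, and that a read is sure to see the latest acknowledged write when the read and write replica sets must overlap (R + W > RF).

```python
def quorum(rf: int) -> int:
    """Strict majority of the replicas: floor(RF / 2) + 1."""
    return rf // 2 + 1


def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must answer for a given consistency level."""
    required = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}
    return required[level]


def tolerated_failures(level: str, rf: int) -> int:
    """Replicas that may be down while requests at this level still succeed."""
    return rf - replicas_required(level, rf)


def reads_see_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    """True when every read set overlaps every acknowledged write set."""
    needed = replicas_required(read_level, rf) + replicas_required(write_level, rf)
    return needed > rf
```

With RF = 3, `tolerated_failures("ONE", 3)` is 2 and `tolerated_failures("QUORUM", 3)` is 1, while `reads_see_latest_write` is true for QUORUM read + QUORUM write and for ALL read + ONE write, and false for ONE read + ONE write.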
Q & A
Features Overview
What is a keyspace/schema ?
Simple table container
Defines how data are replicated in the cluster (RF)
[Figure: 1 cluster contains n keyspaces/schemas; 1 keyspace/schema contains n tables]
Keyspace/Schema Creation
Single Data-center
CREATE KEYSPACE single_dc WITH REPLICATION =
{ 'class': 'SimpleStrategy', 'replication_factor': 3};
Multi Data-center
CREATE KEYSPACE multi_dc WITH REPLICATION =
{ 'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3, 'DC3': 1, …};
DDL Syntax
CREATE TABLE users (
login text,
name text,
age int,
…
PRIMARY KEY(login));
ALTER TABLE users ADD address text;
ALTER TABLE users DROP address;
…
DROP TABLE
DML Syntax
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
UPDATE users SET age = 34 WHERE login = 'jdoe';
DELETE age FROM users WHERE login = 'jdoe';
SELECT age FROM users WHERE login = 'jdoe';
Built-In Security Features
Role
•  CREATE ROLE x WITH PASSWORD = 'y' AND SUPERUSER = true | false
•  ALTER ROLE x WITH PASSWORD = 'y' AND SUPERUSER = true | false
•  DROP ROLE x
Permissions
•  GRANT <xxx> PERMISSION ON <resource> TO <role_name>
•  REVOKE <xxx> PERMISSION ON <resource> FROM <role_name>
Collections
CREATE TABLE xxx(
…,
li list<text>,
se set<text>,
ma map<int, text>,
…
);
UPDATE xxx SET li = li + [append] …
UPDATE xxx SET se = se + {append}
UPDATE xxx SET ma[key] = value …
User Defined Types
CREATE TYPE address (
number int,
street text,
zipcode text,
city text,
country text
);
LightWeight Transactions
INSERT INTO users(…) VALUES(...) IF NOT EXISTS;
DELETE FROM users WHERE ... IF EXISTS;
UPDATE users SET age = xxx WHERE ... IF age = 30;
Linearizable writes on a single partition
Time To Live
INSERT INTO users(…) VALUES(...) USING TTL 3600;
UPDATE users USING TTL 3600 SET age = xxx
WHERE ...;
User Defined Functions/Aggregates
CREATE FUNCTION toUpperCase(input text)
RETURNS NULL ON NULL INPUT
RETURNS text
LANGUAGE java
AS $$ return input.toUpperCase(); $$;
SELECT toUpperCase(firstname) FROM users WHERE …
SELECT max(salary) FROM users WHERE ...
Materialized Views
CREATE MATERIALIZED VIEW user_by_country
AS SELECT * FROM users
WHERE user_id IS NOT NULL AND country IS NOT NULL
PRIMARY KEY ((country), user_id);
JSON Syntax for INSERT/UPDATE/DELETE
CREATE TABLE users (
id text PRIMARY KEY,
age int,
state text );
INSERT INTO users JSON '{"id": "user123", "age": 42, "state": "TX"}';
INSERT INTO users(id, age, state) VALUES('me', fromJson('20'), 'CA');
UPDATE users SET age = fromJson('25') WHERE id = fromJson('"me"');
DELETE FROM users WHERE id = fromJson('"me"');
JSON Syntax for SELECT
SELECT JSON * FROM users WHERE id = 'me';
[json]
----------------------------------------
{"id": "me", "age": 25, "state": "CA"}
SELECT JSON age,state FROM users WHERE id = 'me';
[json]
----------------------------------------
{"age": 25, "state": "CA"}
SELECT age, toJson(state) FROM users WHERE id = 'me';
age | system.tojson(state)
-----+----------------------
25 | "CA"
Q & A

 
Apache zeppelin, the missing component for the big data ecosystem
Apache zeppelin, the missing component for the big data ecosystemApache zeppelin, the missing component for the big data ecosystem
Apache zeppelin, the missing component for the big data ecosystem
Duyhai Doan
 
Distributed algorithms for big data @ GeeCon
Distributed algorithms for big data @ GeeConDistributed algorithms for big data @ GeeCon
Distributed algorithms for big data @ GeeCon
Duyhai Doan
 
Algorithmes distribues pour le big data @ DevoxxFR 2015
Algorithmes distribues pour le big data @ DevoxxFR 2015Algorithmes distribues pour le big data @ DevoxxFR 2015
Algorithmes distribues pour le big data @ DevoxxFR 2015
Duyhai Doan
 

Recently uploaded (20)

Content and eLearning Standards: Finding the Best Fit for Your-Training
Content and eLearning Standards: Finding the Best Fit for Your-TrainingContent and eLearning Standards: Finding the Best Fit for Your-Training
Content and eLearning Standards: Finding the Best Fit for Your-Training
Rustici Software
 
SDG 9000 Series: Unleashing multigigabit everywhere
SDG 9000 Series: Unleashing multigigabit everywhereSDG 9000 Series: Unleashing multigigabit everywhere
SDG 9000 Series: Unleashing multigigabit everywhere
Adtran
 
Dev Dives: System-to-system integration with UiPath API Workflows
Dev Dives: System-to-system integration with UiPath API WorkflowsDev Dives: System-to-system integration with UiPath API Workflows
Dev Dives: System-to-system integration with UiPath API Workflows
UiPathCommunity
 
Cognitive Chasms - A Typology of GenAI Failure Failure Modes
Cognitive Chasms - A Typology of GenAI Failure Failure ModesCognitive Chasms - A Typology of GenAI Failure Failure Modes
Cognitive Chasms - A Typology of GenAI Failure Failure Modes
Dr. Tathagat Varma
 
GDG Cloud Southlake #43: Tommy Todd: The Quantum Apocalypse: A Looming Threat...
GDG Cloud Southlake #43: Tommy Todd: The Quantum Apocalypse: A Looming Threat...GDG Cloud Southlake #43: Tommy Todd: The Quantum Apocalypse: A Looming Threat...
GDG Cloud Southlake #43: Tommy Todd: The Quantum Apocalypse: A Looming Threat...
James Anderson
 
Measuring Microsoft 365 Copilot and Gen AI Success
Measuring Microsoft 365 Copilot and Gen AI SuccessMeasuring Microsoft 365 Copilot and Gen AI Success
Measuring Microsoft 365 Copilot and Gen AI Success
Nikki Chapple
 
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Lorenzo Miniero
 
Kubernetes Cloud Native Indonesia Meetup - May 2025
Kubernetes Cloud Native Indonesia Meetup - May 2025Kubernetes Cloud Native Indonesia Meetup - May 2025
Kubernetes Cloud Native Indonesia Meetup - May 2025
Prasta Maha
 
Security Operations and the Defense Analyst - Splunk Certificate
Security Operations and the Defense Analyst - Splunk CertificateSecurity Operations and the Defense Analyst - Splunk Certificate
Security Operations and the Defense Analyst - Splunk Certificate
VICTOR MAESTRE RAMIREZ
 
AI Trends - Mary Meeker
AI Trends - Mary MeekerAI Trends - Mary Meeker
AI Trends - Mary Meeker
Razin Mustafiz
 
Droidal: AI Agents Revolutionizing Healthcare
Droidal: AI Agents Revolutionizing HealthcareDroidal: AI Agents Revolutionizing Healthcare
Droidal: AI Agents Revolutionizing Healthcare
Droidal LLC
 
AI Emotional Actors: “When Machines Learn to Feel and Perform"
AI Emotional Actors:  “When Machines Learn to Feel and Perform"AI Emotional Actors:  “When Machines Learn to Feel and Perform"
AI Emotional Actors: “When Machines Learn to Feel and Perform"
AkashKumar809858
 
Cyber security cyber security cyber security cyber security cyber security cy...
Cyber security cyber security cyber security cyber security cyber security cy...Cyber security cyber security cyber security cyber security cyber security cy...
Cyber security cyber security cyber security cyber security cyber security cy...
pranavbodhak
 
System Card: Claude Opus 4 & Claude Sonnet 4
System Card: Claude Opus 4 & Claude Sonnet 4System Card: Claude Opus 4 & Claude Sonnet 4
System Card: Claude Opus 4 & Claude Sonnet 4
Razin Mustafiz
 
European Accessibility Act & Integrated Accessibility Testing
European Accessibility Act & Integrated Accessibility TestingEuropean Accessibility Act & Integrated Accessibility Testing
European Accessibility Act & Integrated Accessibility Testing
Julia Undeutsch
 
Agentic AI - The New Era of Intelligence
Agentic AI - The New Era of IntelligenceAgentic AI - The New Era of Intelligence
Agentic AI - The New Era of Intelligence
Muzammil Shah
 
A Comprehensive Guide on Integrating Monoova Payment Gateway
A Comprehensive Guide on Integrating Monoova Payment GatewayA Comprehensive Guide on Integrating Monoova Payment Gateway
A Comprehensive Guide on Integrating Monoova Payment Gateway
danielle hunter
 
Master tester AI toolbox - Kari Kakkonen at Testaus ja AI 2025 Professio
Master tester AI toolbox - Kari Kakkonen at Testaus ja AI 2025 ProfessioMaster tester AI toolbox - Kari Kakkonen at Testaus ja AI 2025 Professio
Master tester AI toolbox - Kari Kakkonen at Testaus ja AI 2025 Professio
Kari Kakkonen
 
Offshore IT Support: Balancing In-House and Offshore Help Desk Technicians
Offshore IT Support: Balancing In-House and Offshore Help Desk TechniciansOffshore IT Support: Balancing In-House and Offshore Help Desk Technicians
Offshore IT Support: Balancing In-House and Offshore Help Desk Technicians
john823664
 
Introducing Ensemble Cloudlet vRouter
Introducing Ensemble  Cloudlet vRouterIntroducing Ensemble  Cloudlet vRouter
Introducing Ensemble Cloudlet vRouter
Adtran
 
Content and eLearning Standards: Finding the Best Fit for Your-Training
Content and eLearning Standards: Finding the Best Fit for Your-TrainingContent and eLearning Standards: Finding the Best Fit for Your-Training
Content and eLearning Standards: Finding the Best Fit for Your-Training
Rustici Software
 
SDG 9000 Series: Unleashing multigigabit everywhere
SDG 9000 Series: Unleashing multigigabit everywhereSDG 9000 Series: Unleashing multigigabit everywhere
SDG 9000 Series: Unleashing multigigabit everywhere
Adtran
 
Dev Dives: System-to-system integration with UiPath API Workflows
Dev Dives: System-to-system integration with UiPath API WorkflowsDev Dives: System-to-system integration with UiPath API Workflows
Dev Dives: System-to-system integration with UiPath API Workflows
UiPathCommunity
 
Cognitive Chasms - A Typology of GenAI Failure Failure Modes
Cognitive Chasms - A Typology of GenAI Failure Failure ModesCognitive Chasms - A Typology of GenAI Failure Failure Modes
Cognitive Chasms - A Typology of GenAI Failure Failure Modes
Dr. Tathagat Varma
 
GDG Cloud Southlake #43: Tommy Todd: The Quantum Apocalypse: A Looming Threat...
GDG Cloud Southlake #43: Tommy Todd: The Quantum Apocalypse: A Looming Threat...GDG Cloud Southlake #43: Tommy Todd: The Quantum Apocalypse: A Looming Threat...
GDG Cloud Southlake #43: Tommy Todd: The Quantum Apocalypse: A Looming Threat...
James Anderson
 
Measuring Microsoft 365 Copilot and Gen AI Success
Measuring Microsoft 365 Copilot and Gen AI SuccessMeasuring Microsoft 365 Copilot and Gen AI Success
Measuring Microsoft 365 Copilot and Gen AI Success
Nikki Chapple
 
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Lorenzo Miniero
 
Kubernetes Cloud Native Indonesia Meetup - May 2025
Kubernetes Cloud Native Indonesia Meetup - May 2025Kubernetes Cloud Native Indonesia Meetup - May 2025
Kubernetes Cloud Native Indonesia Meetup - May 2025
Prasta Maha
 
Security Operations and the Defense Analyst - Splunk Certificate
Security Operations and the Defense Analyst - Splunk CertificateSecurity Operations and the Defense Analyst - Splunk Certificate
Security Operations and the Defense Analyst - Splunk Certificate
VICTOR MAESTRE RAMIREZ
 
AI Trends - Mary Meeker
AI Trends - Mary MeekerAI Trends - Mary Meeker
AI Trends - Mary Meeker
Razin Mustafiz
 
Droidal: AI Agents Revolutionizing Healthcare
Droidal: AI Agents Revolutionizing HealthcareDroidal: AI Agents Revolutionizing Healthcare
Droidal: AI Agents Revolutionizing Healthcare
Droidal LLC
 
AI Emotional Actors: “When Machines Learn to Feel and Perform"
AI Emotional Actors:  “When Machines Learn to Feel and Perform"AI Emotional Actors:  “When Machines Learn to Feel and Perform"
AI Emotional Actors: “When Machines Learn to Feel and Perform"
AkashKumar809858
 
Cyber security cyber security cyber security cyber security cyber security cy...
Cyber security cyber security cyber security cyber security cyber security cy...Cyber security cyber security cyber security cyber security cyber security cy...
Cyber security cyber security cyber security cyber security cyber security cy...
pranavbodhak
 
System Card: Claude Opus 4 & Claude Sonnet 4
System Card: Claude Opus 4 & Claude Sonnet 4System Card: Claude Opus 4 & Claude Sonnet 4
System Card: Claude Opus 4 & Claude Sonnet 4
Razin Mustafiz
 
European Accessibility Act & Integrated Accessibility Testing
European Accessibility Act & Integrated Accessibility TestingEuropean Accessibility Act & Integrated Accessibility Testing
European Accessibility Act & Integrated Accessibility Testing
Julia Undeutsch
 
Agentic AI - The New Era of Intelligence
Agentic AI - The New Era of IntelligenceAgentic AI - The New Era of Intelligence
Agentic AI - The New Era of Intelligence
Muzammil Shah
 
A Comprehensive Guide on Integrating Monoova Payment Gateway
A Comprehensive Guide on Integrating Monoova Payment GatewayA Comprehensive Guide on Integrating Monoova Payment Gateway
A Comprehensive Guide on Integrating Monoova Payment Gateway
danielle hunter
 
Master tester AI toolbox - Kari Kakkonen at Testaus ja AI 2025 Professio
Master tester AI toolbox - Kari Kakkonen at Testaus ja AI 2025 ProfessioMaster tester AI toolbox - Kari Kakkonen at Testaus ja AI 2025 Professio
Master tester AI toolbox - Kari Kakkonen at Testaus ja AI 2025 Professio
Kari Kakkonen
 
Offshore IT Support: Balancing In-House and Offshore Help Desk Technicians
Offshore IT Support: Balancing In-House and Offshore Help Desk TechniciansOffshore IT Support: Balancing In-House and Offshore Help Desk Technicians
Offshore IT Support: Balancing In-House and Offshore Help Desk Technicians
john823664
 
Introducing Ensemble Cloudlet vRouter
Introducing Ensemble  Cloudlet vRouterIntroducing Ensemble  Cloudlet vRouter
Introducing Ensemble Cloudlet vRouter
Adtran
 

Datastax day 2016 introduction to apache cassandra

  • 1. Getting started with Apache Cassandra™. DuyHai DOAN, Apache Cassandra™ Evangelist
  • 2. (1) Apache Cassandra™ use-cases; (2) Why do I need Apache Cassandra™?; (3) Distribution, replication & consistency model; (4) Features summary. © DataStax, All Rights Reserved.
  • 4. Before (< 2016): Messaging; Collections/Playlists; Fraud detection; Recommendation/Personalization; Internet of things/Sensor data
  • 7. Today (≥ 2016)
  • 12. Why do I need Apache Cassandra™ ?
  • 13. Linear Scalability. [Diagram: C* clusters of increasing size, from NetcoSports (3 nodes, ≈3GB) up to 1k+ nodes, PB+, with a marker labelled "YOU"]
  • 14. Continuous availability, thanks to the Dynamo architecture
  • 15. Multi data-centers/cloud native: out-of-the-box (config only); AWS/GCE/Azure/CloudStack support; cloud/bare-metal
  • 16. Multi-DC usages: data locality, disaster recovery. [Diagram: C* nodes in New York (DC1) and London (DC2), linked by async replication]
  • 17. Multi-DC usages: virtual DC for workload segregation. [Diagram: C* nodes in a Production (LIVE) DC and an Analytics (Spark) DC, same room, linked by async replication]
  • 18. Multi-DC usages: prod data copy for back-up/benchmark. Use LOCAL_XXX consistency levels. [Diagram: C* nodes replicating asynchronously to "My tiny test DC", marked READ-ONLY!!!]
  • 19. Operational simplicity: 1 node = 1 process + 2 config files (cassandra.yaml + cassandra-rackdc.properties); deployment automation (Ansible …); no roles between nodes, perfect symmetry
  • 20. Eco System. Apache Spark – Apache Cassandra integration: analytics; joins, aggregation; SparkSQL/Dataframe integration with CQL (predicate push-down). Apache Zeppelin – Apache Cassandra integration: web-based notebook; tabular/graph display
  • 21. Q & A
  • 23. The Tokens. Random hash of #partition → token = hash(#p). Hash: ]−x, x]; hash range: 2^64 values; x = 2^64/2. [Diagram: ring of 8 C* nodes]
  • 24. Token Ranges. A: ]−x, −3x/4]; B: ]−3x/4, −2x/4]; C: ]−2x/4, −x/4]; D: ]−x/4, 0]; E: ]0, x/4]; F: ]x/4, 2x/4]; G: ]2x/4, 3x/4]; H: ]3x/4, x]. [Diagram: ring of nodes A to H, one range per node]
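The mapping from a partition key to one of the eight ranges can be sketched in a few lines of Python. This is an illustrative sketch only: the `token` function below uses a truncated MD5 digest as a stand-in hash (an assumption; it is not the partitioner Cassandra uses), and the names `X`, `NODES`, `UPPER_BOUNDS`, `token` and `owner` are hypothetical.

```python
import bisect
import hashlib

X = 2**63  # x = 2^64 / 2, so tokens live in ]-x, x]
NODES = ["A", "B", "C", "D", "E", "F", "G", "H"]
STEP = 2 * X // len(NODES)  # width of one range: 2x / 8 = x / 4

# Inclusive upper bound of each node's range: -3x/4, -2x/4, ..., 3x/4, x
UPPER_BOUNDS = [-X + (i + 1) * STEP for i in range(len(NODES))]


def token(partition_key: str) -> int:
    """Stand-in hash (assumption): first 8 bytes of MD5 as a signed 64-bit integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def owner(tok: int) -> str:
    """Return the node whose range ]lower, upper] contains the token."""
    # bisect_left finds the first upper bound >= tok, so upper bounds are inclusive.
    return NODES[bisect.bisect_left(UPPER_BOUNDS, tok)]


if __name__ == "__main__":
    for key in ["user_id1", "user_id2", "user_id3"]:
        print(key, token(key), owner(token(key)))
```

Because the hash output is spread roughly uniformly over the token space, each node ends up owning about one eighth of the partitions.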
  • 25. Distributed Tables. CREATE TABLE users(user_id int, …, PRIMARY KEY(user_id)); [Diagram: rows user_id1, user_id2, user_id3, user_id4, user_id5 next to the ring of nodes A to H]
  • 26. Distributed Tables. [Diagram: rows user_id1, user_id2, user_id3, user_id4, user_id5 placed on the ring of nodes A to H]
  • 27. Linear Scalability. Today = high load, production in danger. [Diagram: ring of nodes A to H]
  • 28. Scaling Out. +2 nodes to lower the pressure. [Diagram: ring of nodes A to H plus new nodes I and J]
  • 29. Q & A
  • 31. Failure Tolerance. Replication factor (RF) = 3. [Diagram: ring of nodes A to H; replicas 1, 2, 3 hold ranges {A, H, G}, {B, A, H}, {C, B, A}]
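The replica sets on this slide follow a simple pattern: each node stores its own range plus the ranges of the two nodes before it on the ring. A minimal Python sketch of that pattern, assuming the ring order A, B, …, H and a placement rule in the style of SimpleStrategy (an assumption; rack- and DC-aware placement is more involved). The names `RING`, `replicas_for` and `ranges_stored_on` are hypothetical.

```python
RING = ["A", "B", "C", "D", "E", "F", "G", "H"]  # assumed clockwise ring order


def replicas_for(token_range: str, rf: int = 3) -> list[str]:
    """Nodes holding a copy of a range: its owner plus the next rf - 1 nodes clockwise."""
    i = RING.index(token_range)
    return [RING[(i + k) % len(RING)] for k in range(rf)]


def ranges_stored_on(node: str, rf: int = 3) -> list[str]:
    """Ranges stored on a node: its own plus those of the rf - 1 preceding nodes."""
    i = RING.index(node)
    return [RING[(i - k) % len(RING)] for k in range(rf)]


if __name__ == "__main__":
    for node in ["A", "B", "C"]:
        print(node, "stores", ranges_stored_on(node))
```

With RF = 3, losing any single node still leaves two copies of every range on the ring.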
  • 32. Coordinator Node. Responsible for handling requests (read/write). Every node can be coordinator: masterless; round-robin master for each request; no SPOF; proxy role. [Diagram: a request reaches the coordinator on the ring of nodes A to H, with replicas 1, 2, 3]
  • 34. Consistency Level. Tunable at runtime: ONE; QUORUM (strict majority w.r.t. RF); ALL. Applicable to any request (read/write)
  • 35. Consistency In Action. RF = 3, Write ONE, Read ONE. Write ONE: B is acked while data replication is still in progress (replicas hold B, A, A). Read ONE: A
  • 36. Consistency In Action. RF = 3, Write ONE, Read QUORUM. Write ONE: B is acked while data replication is still in progress (replicas hold B, A, A). Read QUORUM: A
  • 37. Consistency In Action. RF = 3, Write ONE, Read ALL. Write ONE: B is acked while data replication is still in progress (replicas hold B, A, A). Read ALL: B
  • 38. Last Write Wins. Read the value back: replicas 1, 2, 3 return B (t2), A (t1), A (t1) to the coordinator, and the value with the most recent timestamp (B) wins. [Diagram: ring of nodes A to H with the coordinator]
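The reconciliation step on this slide can be sketched as picking the response with the highest write timestamp. This is a simplified illustration, not Cassandra's actual read path; the `Cell` type and the `resolve` function are hypothetical names, and tie-breaking between equal timestamps is deliberately left out.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    value: str
    timestamp: int  # write timestamp attached to the value


def resolve(responses: list[Cell]) -> Cell:
    """Last write wins: keep the response carrying the most recent timestamp."""
    return max(responses, key=lambda cell: cell.timestamp)


if __name__ == "__main__":
    # Replicas 1, 2, 3 answer B (t2), A (t1), A (t1), as on the slide.
    winner = resolve([Cell("B", 2), Cell("A", 1), Cell("A", 1)])
    print(winner.value)
```

The majority value (A) does not matter here: a single replica holding the newer write is enough for the coordinator to return B.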
  • 39. Consistency In Action. RF = 3, Write QUORUM, Read ONE. Write QUORUM: B is acked while data replication is still in progress (replicas hold B, B, A). Read ONE: A
  • 40. Consistency In Action. RF = 3, Write QUORUM, Read QUORUM. Write QUORUM: B is acked while data replication is still in progress (replicas hold B, B, A). Read QUORUM: B, because any two replicas read include at least one of the two that already hold B
  • 41. Consistency Level = Trade-off
  • 42. Consistency Level ONE: fast, may not read the latest written value
  • 43. Consistency Level QUORUM: strict majority w.r.t. replication factor; good balance
  • 44. Consistency Level ALL: paranoid; slow, loss of high availability
  • 45. Consistency Level Common Patterns. ONE read + ONE write ☞ available for read/write even if (RF − 1) replicas are down. QUORUM read + QUORUM write ☞ available for read/write as long as a strict majority of replicas is up (1 replica down for RF = 3)
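The arithmetic behind these patterns is small enough to sketch in Python. The helper names below are hypothetical, and the model is deliberately simplified: it only counts replicas and assumes the write itself succeeded at its consistency level.

```python
def quorum(rf: int) -> int:
    """Strict majority of the replicas."""
    return rf // 2 + 1


def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must answer for a given consistency level."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def tolerated_down(level: str, rf: int) -> int:
    """How many replicas can be down while requests at this level still succeed."""
    return rf - replicas_required(level, rf)


def read_sees_latest_write(write_level: str, read_level: str, rf: int) -> bool:
    """True when the read and write replica sets must overlap (W + R > RF)."""
    return replicas_required(write_level, rf) + replicas_required(read_level, rf) > rf


if __name__ == "__main__":
    for w, r in [("ONE", "ONE"), ("ONE", "ALL"), ("QUORUM", "ONE"), ("QUORUM", "QUORUM")]:
        print(f"write {w}, read {r}: overlap guaranteed = {read_sees_latest_write(w, r, 3)}")
```

For RF = 3 this reproduces the scenarios walked through in the "Consistency In Action" slides: only Write ONE + Read ALL and Write QUORUM + Read QUORUM are guaranteed to read B.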
  • 46. Q & A
  • 48. What is a keyspace/schema? Simple table container. Defines how data are replicated in the cluster (RF). [Diagram: Cluster 1..n Keyspace/Schema, Keyspace/Schema 1..n Table]
  • 49. Keyspace/Schema Creation.
      Single data-center: CREATE KEYSPACE single_dc WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
      Multi data-center: CREATE KEYSPACE multi_dc WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3, 'DC3': 1, …};
  • 50. DDL Syntax.
      CREATE TABLE users (login text, name text, age int, … PRIMARY KEY(login));
      ALTER TABLE users ADD address text;
      ALTER TABLE users DROP address;
      … DROP TABLE
  • 51. DML Syntax.
      INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
      UPDATE users SET age = 34 WHERE login = 'jdoe';
      DELETE age FROM users WHERE login = 'jdoe';
      SELECT age FROM users WHERE login = 'jdoe';
  • 52. Built-In Security Features.
      Role:
      CREATE ROLE x WITH PASSWORD = 'y' AND SUPERUSER = true | false
      ALTER ROLE x WITH PASSWORD = 'y' AND SUPERUSER = true | false
      DROP ROLE x
      Permissions:
      GRANT <xxx> PERMISSION ON <resource> TO <role_name>
      REVOKE <xxx> PERMISSION ON <resource> FROM <role_name>
  • 53. Collections.
      CREATE TABLE xxx(…, li list<text>, se set<text>, ma map<int, text>, …);
      UPDATE xxx SET li = li + [append] …
      UPDATE xxx SET se = se + {append}
      UPDATE xxx SET ma[key] = value …
  • 54. User Defined Types. CREATE TYPE address (number int, street text, zipcode text, city text, country text);
  • 55. LightWeight Transactions. Linearizable writes on a single partition.
      INSERT INTO users(…) VALUES(...) IF NOT EXISTS;
      DELETE FROM users WHERE ... IF EXISTS;
      UPDATE users SET age = xxx WHERE ... IF age = 30;
  • 56. Time To Live.
      INSERT INTO users(…) VALUES(...) USING TTL 3600;
      UPDATE users USING TTL 3600 SET age = xxx WHERE ...;
  • 57. User Defined Functions/Aggregates.
      CREATE FUNCTION toUpperCase(input text) RETURNS NULL ON NULL INPUT RETURNS text LANGUAGE java AS $$ return input.toUpperCase(); $$;
      SELECT toUpperCase(firstname) FROM users WHERE …
      SELECT max(salary) FROM users WHERE ...
  • 58. Materialized Views. CREATE MATERIALIZED VIEW user_by_country AS SELECT * FROM users WHERE user_id IS NOT NULL AND country IS NOT NULL PRIMARY KEY ((country), user_id);
  • 59. JSON Syntax for INSERT/UPDATE/DELETE.
      CREATE TABLE users (id text PRIMARY KEY, age int, state text);
      INSERT INTO users JSON '{"id": "user123", "age": 42, "state": "TX"}';
      INSERT INTO users(id, age, state) VALUES('me', fromJson('20'), 'CA');
      UPDATE users SET age = fromJson('25') WHERE id = fromJson('"me"');
      DELETE FROM users WHERE id = fromJson('"me"');
  • 60. JSON Syntax for SELECT.
      SELECT JSON * FROM users WHERE id = 'me';
        [json]
        ----------------------------------------
        {"id": "me", "age": 25, "state": "CA"}
      SELECT JSON age,state FROM users WHERE id = 'me';
        [json]
        ----------------------------------------
        {"age": 25, "state": "CA"}
      SELECT age, toJson(state) FROM users WHERE id = 'me';
         age | system.tojson(state)
        -----+----------------------
          25 | "CA"
  • 61. Q & A