@doanduyhai
Apache Zeppelin, the missing GUI
for your BigData eco-system
DuyHai DOAN, Technical Advocate
Who am I?
Duy Hai DOAN
Cassandra technical advocate
•  talks, meetups, confs
•  open-source devs (Achilles, …)
•  OSS Cassandra point of contact
☞ duy_hai.doan@datastax.com
☞ @doanduyhai
DataStax
•  Founded in April 2010
•  We contribute a lot to Apache Cassandra™
•  400+ customers (25 of the Fortune 100), 400+ employees
•  Headquarters in the San Francisco Bay Area
•  EU headquarters in London, offices in France and Germany
•  DataStax Enterprise = OSS Cassandra + extra features
What is Apache Zeppelin?
Presentation
Architecture
Zeppelin Presentation
Demo
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Zeppelin Architecture
[Diagram: Zeppelin Server (REST, WebSocket) and Zeppelin Engine; a Zeppelin Interpreter Factory; Spark Interpreter Group (Spark, SparkSQL), Tajo Interpreter, Flink Interpreter and Cassandra Interpreter. The diagram shows five JVM boxes.]
What does Zeppelin provide?
Front-end & display system for free
Generic back-end with REST APIs & WebSocket
Pluggable interpreters system
Task scheduler (à la CRON)
Zeppelin UI Layout
Notebook
Paragraph
UI elements
Demo
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Zeppelin Display System
Raw, Table, HTML
Available graphs
View modes
Dynamic form
Iframe export
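As a rough illustration of the Table display: an interpreter's output can begin with a display directive such as %table, followed by tab-separated columns and newline-separated rows. The sketch below assumes that convention. The class and method names (TableOutput, toTable) are hypothetical and not part of Zeppelin.

```java
import java.util.List;

// Hypothetical helper, not part of Zeppelin: builds a paragraph output
// that the display system could render as a table.
final class TableOutput {
    private TableOutput() {}

    // Assumes the %table convention: columns separated by tabs, rows by newlines.
    static String toTable(List<String> header, List<List<String>> rows) {
        StringBuilder out = new StringBuilder("%table ");
        out.append(String.join("\t", header)).append('\n');
        for (List<String> row : rows) {
            out.append(String.join("\t", row)).append('\n');
        }
        return out.toString();
    }
}
```

Swapping the leading directive for %html would ask for the output to be rendered as HTML instead of a table.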
Demo
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Interpreter system
Core interpreters
Third-party interpreters
Interpreters conf & usage
Interpreter processing lifecycle
①  Receive input commands/data
•  as raw text
•  from form data
②  Have the external back-end process the input commands/data
③  Format the response using Zeppelin display system
④  Send response back to the Zeppelin engine
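The four steps above can be sketched in a few lines. This is only an illustration under stated assumptions: LifecycleSketch is a hypothetical class, the back-end is modelled as a plain function, and form data is applied by replacing ${name} placeholders, which simplifies how Zeppelin's dynamic forms actually work.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for an interpreter, used only to illustrate the four steps.
final class LifecycleSketch {
    // The external back-end, modelled here as a simple function (assumption).
    private final Function<String, String> backEnd;

    LifecycleSketch(Function<String, String> backEnd) {
        this.backEnd = backEnd;
    }

    // Step 1: receive the input as raw text plus form data.
    // Form values replace ${name} placeholders in this sketch (assumption).
    String receive(String rawText, Map<String, String> formData) {
        String command = rawText;
        for (Map.Entry<String, String> e : formData.entrySet()) {
            command = command.replace("${" + e.getKey() + "}", e.getValue());
        }
        return command;
    }

    // Steps 2 to 4: process with the back-end, format for the display
    // system, and return the response to the engine.
    String run(String rawText, Map<String, String> formData) {
        String command = receive(rawText, formData);
        String result = backEnd.apply(command);    // step 2
        // Step 3: no HTML escaping is done in this sketch.
        return "%html <pre>" + result + "</pre>";  // returned to the caller as step 4
    }
}
```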
Core interpreters
•  Spark (Spark core, SparkSQL/DataFrame, PySpark)
   •  Spark core = default (or %spark)
   •  SparkSQL = %sql
•  Shell (%sh)
•  Markdown (%md)
•  AngularJS (%angular)
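To make the prefix convention concrete, here is a minimal sketch of how a paragraph's leading %name could select an interpreter, falling back to Spark when no prefix is given. ParagraphRouter and interpreterFor are hypothetical names; Zeppelin's own lookup logic is not shown in these slides and may differ.

```java
// Hypothetical helper, not Zeppelin code: picks the interpreter name
// from the first token of a paragraph.
final class ParagraphRouter {
    // Spark core is the default interpreter according to the slide.
    static final String DEFAULT = "spark";

    private ParagraphRouter() {}

    static String interpreterFor(String paragraph) {
        String text = paragraph.trim();
        if (!text.startsWith("%")) {
            return DEFAULT;
        }
        // The name runs from after '%' up to the first whitespace character.
        int end = 1;
        while (end < text.length() && !Character.isWhitespace(text.charAt(end))) {
            end++;
        }
        String name = text.substring(1, end);
        return name.isEmpty() ? DEFAULT : name;
    }
}
```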
Third-party interpreters
•  Hive
•  Phoenix
•  Tajo
•  Flink
•  Ignite
•  Lens
•  Cassandra
•  Geode
•  PostgreSQL
•  Kylin
Interpreter conf & usage
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Writing An Interpreter
How To
Simple interpreter example (AsciiDoc)
Complex interpreter example (Cassandra)
Steps to write your own interpreter
•  Create a class that extends the Interpreter base class
•  Register it in a static block
•  Optionally define default config params
// Minimal registration: interpreter name and implementing class
static {
    Interpreter.register("MyInterpreterName", MyClassName.class.getName());
}

// Registration with default configuration properties
static {
    Interpreter.register("MyInterpreterName", MyClassName.class.getName(),
        new InterpreterPropertyBuilder()
            .add("property1", "default value", "Description of property1").build());
}
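To show how the three steps fit together without depending on the Zeppelin jars, the sketch below uses a stand-in base class. SketchInterpreter, EchoInterpreter and the "prefix" property are all hypothetical. The real Interpreter base class has more methods and different signatures, so treat this as an outline of the pattern rather than code to copy.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Stand-in for Zeppelin's Interpreter base class, reduced to what this
// sketch needs. Everything here is an assumption for illustration.
abstract class SketchInterpreter {
    private static final Map<String, String> REGISTRY = new HashMap<>();
    protected final Properties properties;

    protected SketchInterpreter(Properties properties) {
        this.properties = properties;
    }

    static void register(String name, String className) {
        REGISTRY.put(name, className);
    }

    static String registeredClass(String name) {
        return REGISTRY.get(name);
    }

    abstract void open();
    abstract void close();
    abstract String interpret(String input);
}

// Step 1: extend the base class.
final class EchoInterpreter extends SketchInterpreter {
    // Step 2: register in a static block, run when the class is initialized.
    static {
        SketchInterpreter.register("echo", EchoInterpreter.class.getName());
    }

    private boolean opened;

    EchoInterpreter(Properties properties) {
        super(properties);
    }

    @Override void open() { opened = true; }

    @Override void close() { opened = false; }

    @Override String interpret(String input) {
        if (!opened) {
            throw new IllegalStateException("open() must be called first");
        }
        // Step 3: fall back to a default value when the property is not set.
        String prefix = properties.getProperty("prefix", "echo: ");
        return prefix + input;
    }
}
```

The static block only runs once the class is initialized, so in this sketch the registry entry appears after the first EchoInterpreter is created.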
To register your interpreter as default
•  Edit the enum ZeppelinConfiguration.ConfVars
•  Add your interpreter's FQCN (fully qualified class name) to the property ZEPPELIN_INTERPRETERS
To register your interpreter in config files
•  Create conf/zeppelin-site.xml from conf/zeppelin-site.xml.template
•  Add your interpreter's FQCN to the property zeppelin.interpreters
<property>
<name>zeppelin.interpreters</name>
<value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,
org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,
org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,
org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter
</value>
</property>
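The property value above is a comma-separated list of class names, wrapped over several lines on the slide. A minimal sketch of splitting such a value into individual names follows. InterpreterListParser is a hypothetical helper, and whether Zeppelin itself tolerates line breaks inside the value is not confirmed by the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Zeppelin code: splits a zeppelin.interpreters
// style value into fully qualified class names.
final class InterpreterListParser {
    private InterpreterListParser() {}

    static List<String> parse(String value) {
        List<String> names = new ArrayList<>();
        for (String part : value.split(",")) {
            // trim() also drops the line breaks left over from wrapped values.
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```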
Simple AsciiDoc Interpreter
[Diagram, steps ① to ④: a raw text block goes from the Zeppelin Server (Zeppelin Engine) to the AsciiDoc Interpreter, each in its own JVM; the text is converted to HTML, and the HTML output is returned.]
Simple interpreter (AsciiDoc)
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Cassandra Interpreter Architecture
[Diagram, steps ① to ⑥: a raw text block goes from the Zeppelin Server (JVM) to the Cassandra Interpreter (JVM), which sends async CQL statements to Cassandra through the Cassandra Java Driver, renders HTML, and displays the results as HTML.]
Cassandra Interpreter Commands
Native CQL statements
•  SELECT * FROM …;
•  INSERT INTO …;
•  …
Schema commands
•  DESCRIBE TABLE …;
•  DESCRIBE KEYSPACE …;
•  …
Prepared statement commands
•  @prepare …;
•  @bind …;
•  @remove_prepared …;
Help command
•  HELP;
Option commands
•  @consistency …;
•  @retryPolicy …;
•  @fetchSize …;
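The command families above can be told apart by their leading token. The sketch below is a hypothetical classifier built only from the prefixes listed on the slide. It is not the Cassandra interpreter's actual parser, and it treats any other @-prefixed line as an option command, which is an assumption.

```java
import java.util.Locale;

// Hypothetical classifier, not the interpreter's real parser: sorts a
// statement into one of the command families listed on the slide.
final class CassandraCommandKind {
    enum Kind { PREPARED, OPTION, SCHEMA, HELP, CQL }

    private CassandraCommandKind() {}

    static Kind classify(String statement) {
        String s = statement.trim();
        String upper = s.toUpperCase(Locale.ROOT);
        if (s.startsWith("@prepare") || s.startsWith("@bind")
                || s.startsWith("@remove_prepared")) {
            return Kind.PREPARED;
        }
        if (s.startsWith("@")) {
            // Assumption: remaining @-commands are options such as @consistency.
            return Kind.OPTION;
        }
        if (upper.startsWith("DESCRIBE ")) {
            return Kind.SCHEMA;
        }
        if (upper.startsWith("HELP")) {
            return Kind.HELP;
        }
        // Everything else is passed through as a native CQL statement.
        return Kind.CQL;
    }
}
```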
Complex interpreter (Cassandra)
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Cassandra Online Interpreter Docs
•  http://zeppelin.incubator.apache.org/docs/interpreter/cassandra.html
Cassandra Interactive Help
•  Type HELP; in the interpreter
Zeppelin future
Roadmap
Roadmap & future
•  More graph options (map visualization, ZEPPELIN-157)
•  Helium project: packaging Zeppelin views, logic (code) & resources into applications
•  Interpreter packaging redesign
   •  ship & compile core interpreters only
   •  third-party interpreters can be pulled from a repository
   •  which interpreters are core? Who will maintain them? Community…
•  Integrate security (Apache Shiro, pull request #53 by Hayssam Saleh)
Roadmap & future
•  Leave incubation and become a first-class Apache project
Q & A
Thank You
@doanduyhai
duy_hai.doan@datastax.com
http://zeppelin.incubator.apache.org/

Apache zeppelin, the missing component for the big data ecosystem