(Big Data)2
How YARN Timeline Service v.2 Unlocks 360-Degree
Platform Insights at Scale
Sangjin Lee @sjlee (Twitter)
Li Lu (Hortonworks)
Vrushali Channapattan @vrushalivc (Twitter)
Outline
• Why v.2?
• Highlights
• Developing for Timeline Service v.2
• Setting up Timeline Service v.2
• Milestones
• Demo
Why v.2?
• YARN Timeline Service v 1.x
• Gained good adoption: Tez, Hive, Pig, etc.
• Keeps improving with v 1.5 APIs and storage implementation
• Still facing some fundamental challenges...
Why v.2?
• Scalability and reliability challenges
• Single instance of Timeline Server
• Storage (single local LevelDB instance)
• Usability
• Flow
• Metrics and configuration as first-class citizens
• Metrics aggregation up the entity hierarchy
Highlights
v.1 | v.2
Single writer/reader Timeline Server | Distributed writer/collector architecture
Single local LevelDB storage* | Scalable storage (HBase)
v.1 entity model | New v.2 entity model
No aggregation | Metrics aggregation
REST API | Richer query REST API
Architecture
• Separation of writers (“collectors”) and readers
• Distributed collectors: one collector for each app
• Dedicated RM collector for RM-generated data
• Collector discovery via RM
• Pluggable storage with HBase as default storage
Distributed collectors & readers
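[Diagram: write flow and read flow]
• Write flow: the AM (worker node running AM) sends app metrics/events to the timeline collector on its NM
• Write flow: NMs (worker nodes running containers) send container events/metrics
• Write flow: the RM has its own timeline collector for app/container events
• Collectors write to Storage
• Read flow: user queries go to the timeline reader pool (several timeline readers), which reads from Storage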
Collector discovery
[Diagram: RM, and a worker node hosting the NM, the timeline collector, and the AM with its timeline client]
• The RM keeps a mapping: app id => address
• 1. start AM container
• 2. node heartbeat
• 3. allocate response
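The discovery steps in the diagram can be pictured with a toy in-memory model. This is only a sketch under assumptions: the class and method names (`ResourceManager`, `nodeHeartbeat`, `allocate`) and the sample address are hypothetical stand-ins, not the YARN API. It shows the RM learning the "app id => address" mapping from a node heartbeat and handing it back in an allocate response.

```java
import java.util.HashMap;
import java.util.Map;

public class CollectorDiscoverySketch {
    /** Toy stand-in for the RM's "app id => address" table (hypothetical, not YARN code). */
    static final class ResourceManager {
        private final Map<String, String> collectorByApp = new HashMap<>();

        /** Step 2: an NM reports the collectors it hosts in its node heartbeat. */
        void nodeHeartbeat(Map<String, String> reportedCollectors) {
            collectorByApp.putAll(reportedCollectors);
        }

        /** Step 3: the allocate response carries the collector address, or null if unknown. */
        String allocate(String appId) {
            return collectorByApp.get(appId);
        }
    }

    public static void main(String[] args) {
        ResourceManager rm = new ResourceManager();
        String appId = "application_1_0001"; // made-up id for illustration

        // Before any heartbeat, the RM has nothing to tell the AM.
        System.out.println(rm.allocate(appId));

        // The NM that started the collector reports its address.
        Map<String, String> heartbeat = new HashMap<>();
        heartbeat.put(appId, "worker-7:8048"); // made-up address
        rm.nodeHeartbeat(heartbeat);

        // The AM's next allocate call learns where to write.
        System.out.println(rm.allocate(appId));
    }
}
```

One consequence of this shape is that the timeline client may have to wait for an allocate response before it knows where to send data.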
New entity model
• Flows and flow runs as parents of YARN application entities
• First-class configuration (key-value pairs)
• First-class metrics (single-value or time series)
• Designed to handle multi-cluster environment out of the box
What is a flow?
• A flow is a group of YARN applications that are launched as parts of a logical app
• Oozie, Scalding, Pig, etc.
• name: “frequent_visitor_stat”
• run id: 1466097809000
• version: “b9b9068”
Configuration and metrics
• Now explicit top-level attributes of entities
• Fine-grained updates and queries made possible
• “update metric A to value x”
• “query entities where config A = B”
container 1_1 (before update)
metric: A = 10
metric: B = 100
config: "Foo" = "bar"
container 1_1 (after updating metric A)
metric: A = 50
metric: B = 100
config: "Foo" = "bar"
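The two operations quoted on the slide can be sketched with a simplified in-memory entity. The `Entity` class and the helpers `updateMetric` and `whereConfig` are hypothetical illustrations, not the real `TimelineEntity` API. The sketch only shows why keeping metrics and configs as separate keyed maps makes a single-metric update and a config-equality query straightforward.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EntitySketch {
    /** Hypothetical, simplified entity with first-class metrics and configs. */
    static final class Entity {
        final String id;
        final Map<String, Long> metrics = new HashMap<>();
        final Map<String, String> configs = new HashMap<>();

        Entity(String id) {
            this.id = id;
        }
    }

    /** "update metric A to value x": touches one metric and leaves the rest alone. */
    static void updateMetric(Entity e, String name, long value) {
        e.metrics.put(name, value);
    }

    /** "query entities where config A = B". */
    static List<Entity> whereConfig(List<Entity> all, String key, String value) {
        List<Entity> out = new ArrayList<>();
        for (Entity e : all) {
            if (value.equals(e.configs.get(key))) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Entity c = new Entity("container 1_1");
        c.metrics.put("A", 10L);
        c.metrics.put("B", 100L);
        c.configs.put("Foo", "bar");

        updateMetric(c, "A", 50L); // only metric A changes

        List<Entity> all = new ArrayList<>();
        all.add(c);
        System.out.println(c.metrics);                              // A is now 50, B still 100
        System.out.println(whereConfig(all, "Foo", "bar").size());  // 1 match
    }
}
```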
HBase Storage
• Scalable backend
• Row Key structure
• efficient range scans
• KeyPrefixRegionSplitPolicy
• Filter pushdown
• Coprocessors for flow aggregation (“readless” aggregation)
• Cell tags for metadata (application id, aggregation operation)
• Cell timestamps generated during put
• left shifted with app id added to avoid overwrites
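The last bullet, shifting the cell timestamp and adding the app id, can be illustrated as follows. This is a sketch under stated assumptions: the decimal multiplier, the use of an app sequence number as the low digits, and the names `supplement` and `original` are all hypothetical choices for illustration, and the slides do not give the exact scheme. The point is that two apps writing the same metric in the same millisecond get distinct cell timestamps, so neither write overwrites the other.

```java
public class CellTimestampSketch {
    /** Hypothetical shift factor: leaves six decimal digits for the app sequence number. */
    static final long MULTIPLIER = 1_000_000L;

    /** Shifts the write time left and fills the low digits from the app's sequence number. */
    static long supplement(long writeTimeMillis, long appSequence) {
        return writeTimeMillis * MULTIPLIER + (appSequence % MULTIPLIER);
    }

    /** Recovers the original write time by dropping the low digits. */
    static long original(long supplemented) {
        return supplemented / MULTIPLIER;
    }

    public static void main(String[] args) {
        long now = 1466097809000L;       // same millisecond for both writes
        long a = supplement(now, 1L);    // app with sequence number 1
        long b = supplement(now, 2L);    // app with sequence number 2

        System.out.println(a != b);               // distinct cell timestamps
        System.out.println(original(a) == now);   // write time is still recoverable
    }
}
```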
Tables in HBase
• flow run
• application
• entity
• flow activity
• app to flow
table: flow run
Row key: clusterId!userName!flowName!inverted(flowRunId)
most recent flow run stored first
coprocessor enabled
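Why inverting the run id puts the most recent flow run first can be seen in a small sketch. It assumes inversion means `Long.MAX_VALUE - value` and uses a zero-padded decimal string as a readable substitute for a binary row key; both are illustrative assumptions, and `rowKey` is a hypothetical helper. Because row keys sort in ascending order, a larger run id yields a smaller inverted value and therefore sorts earlier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FlowRunRowKeySketch {
    /** Inverts a non-negative long so that larger inputs produce smaller outputs. */
    static long invert(long value) {
        return Long.MAX_VALUE - value;
    }

    /** Readable stand-in for a row key; 19-digit padding keeps string order equal to numeric order. */
    static String rowKey(String cluster, String user, String flow, long runId) {
        return cluster + "!" + user + "!" + flow + "!" + String.format("%019d", invert(runId));
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        keys.add(rowKey("c1", "alice", "frequent_visitor_stat", 1466097809000L)); // older run
        keys.add(rowKey("c1", "alice", "frequent_visitor_stat", 1466184209000L)); // newer run

        Collections.sort(keys); // ascending, like a scan over sorted row keys
        for (String k : keys) {
            System.out.println(k); // the newer run is printed first
        }
    }
}
```

A scan that starts at the `clusterId!userName!flowName!` prefix then returns runs newest-first without any extra sorting on the reader side.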
table: application
Row key: clusterId!userName!flowName!inverted(flowRunId)!AppId
applications within a flow run stored together
most recent flow run stored first
table: entity
Row key: userName!clusterId!flowName!inverted(flowRunId)!AppId!entityType!entityId
entities within an application within a flow run stored together per type
• for example, all containers within a YARN application will be stored together
pre-split table
stores information per entity run like info, relatesTo, relatedTo, events, metrics, config
table: flow activity
Row key: clusterId!inverted(TopOfTheDay)!userName!flowName
shows the flows that ran on that day
stores information per flow like number of runs, the run ids, versions
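The `inverted(TopOfTheDay)` component can be sketched like this. It assumes "top of the day" means the timestamp truncated to midnight UTC and reuses the `Long.MAX_VALUE - value` inversion; the helper names and the padded string encoding are hypothetical. All activity of a flow on one day maps to the same row, and more recent days sort first.

```java
public class FlowActivityKeySketch {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    /** Truncates an epoch-millisecond timestamp to the start of its UTC day. */
    static long topOfTheDay(long timestampMillis) {
        return timestampMillis - (timestampMillis % MILLIS_PER_DAY);
    }

    /** Readable stand-in for the flow activity row key. */
    static String rowKey(String cluster, long timestampMillis, String user, String flow) {
        long inverted = Long.MAX_VALUE - topOfTheDay(timestampMillis);
        return cluster + "!" + String.format("%019d", inverted) + "!" + user + "!" + flow;
    }

    public static void main(String[] args) {
        long runA = 1466097809000L;       // a run during the day
        long runB = 1466035200001L;       // another run, same UTC day
        long previousDay = 1466035199999L;

        System.out.println(topOfTheDay(runA));
        System.out.println(rowKey("c1", runA, "alice", "frequent_visitor_stat")
            .equals(rowKey("c1", runB, "alice", "frequent_visitor_stat")));        // same row
        System.out.println(rowKey("c1", runA, "alice", "frequent_visitor_stat")
            .compareTo(rowKey("c1", previousDay, "alice", "frequent_visitor_stat")) < 0); // newer day first
    }
}
```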
table: appToFlow
Row key: clusterId!appId
- stores mapping of appId to flowName and flowRunId
Metrics aggregation
• Application level
• Rolls up sub-application metrics
• Performed in real time in the collectors in memory
• Flow run level
• Rolls up app level metrics
• Performed in HBase region servers via coprocessors
• Offline aggregation (TBD)
• Rolls up on user, queue, and flow offline periodically
• Phoenix tables
[Diagram: example roll-up of a “bytes” metric]
• Container 1_1: 23, Container 1_2: 135 → App1: 158
• Container 2_1: 50 → App2: 50
• Container 3_1: 64 → App3: 64
• App1 + App2 → flow1: 208; App3 → flow2: 64
• flow1 + flow2 → user1: 272; queue1: 272
• App aggregation: in collector
• Flow aggregation: in HBase
• User and queue level: offline aggregation
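The roll-up in the example can be reproduced with a short sum-by-parent sketch. The `rollUp` helper and the container-to-app and app-to-flow mappings are hypothetical, inferred from the names and totals on the slide, and the code says nothing about where each level actually runs (collector, coprocessor, or offline job). It only checks the arithmetic of the hierarchy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RollupSketch {
    /** Sums each child's value into its parent, given a child -> parent mapping. */
    static Map<String, Long> rollUp(Map<String, Long> childValues, Map<String, String> parentOf) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : childValues.entrySet()) {
            totals.merge(parentOf.get(e.getKey()), e.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Long> containers = new LinkedHashMap<>();
        containers.put("Container 1_1", 23L);
        containers.put("Container 1_2", 135L);
        containers.put("Container 2_1", 50L);
        containers.put("Container 3_1", 64L);

        Map<String, String> parentOf = new LinkedHashMap<>();
        parentOf.put("Container 1_1", "App1");
        parentOf.put("Container 1_2", "App1");
        parentOf.put("Container 2_1", "App2");
        parentOf.put("Container 3_1", "App3");
        parentOf.put("App1", "flow1");
        parentOf.put("App2", "flow1");
        parentOf.put("App3", "flow2");
        parentOf.put("flow1", "user1");
        parentOf.put("flow2", "user1");

        Map<String, Long> apps = rollUp(containers, parentOf);  // application level
        Map<String, Long> flows = rollUp(apps, parentOf);       // flow level
        Map<String, Long> users = rollUp(flows, parentOf);      // user level

        System.out.println(apps);   // {App1=158, App2=50, App3=64}
        System.out.println(flows);  // {flow1=208, flow2=64}
        System.out.println(users);  // {user1=272}
    }
}
```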
FlowRun aggregation via the HBase coprocessor
[Diagram: app metrics cells in HBase are combined into a FlowRun metric sum]
Reader REST API: paths
• URLs under /ws/v2/timeline
• Canonical REST style URLs: /ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_name/runs/run_id
• Path elements may be omitted if they can be inferred
• flow context can be inferred by app id
• default cluster is assumed if cluster is omitted
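A small helper shows how the canonical path is assembled and how the cluster element can be left out. `flowRunPath` is a hypothetical helper written for illustration, not part of any client library; only the path layout comes from the slide. Path-segment escaping is left out to keep the sketch short.

```java
public class ReaderPathSketch {
    static final String BASE = "/ws/v2/timeline";

    /** Builds the flow run path; a null cluster is omitted so the default cluster applies. */
    static String flowRunPath(String cluster, String user, String flow, long runId) {
        StringBuilder sb = new StringBuilder(BASE);
        if (cluster != null) {
            sb.append("/clusters/").append(cluster);
        }
        sb.append("/users/").append(user)
          .append("/flows/").append(flow)
          .append("/runs/").append(runId);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Fully qualified path.
        System.out.println(flowRunPath("c1", "alice", "frequent_visitor_stat", 1466097809000L));
        // Cluster omitted: the reader is expected to assume the default cluster.
        System.out.println(flowRunPath(null, "alice", "frequent_visitor_stat", 1466097809000L));
    }
}
```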
Reader REST API: query params
• limit, createdTimeStart, createdTimeEnd: constrain the entities
• fields (ALL | EVENTS | INFO | CONFIGS | METRICS | RELATES_TO | IS_RELATED_TO): limit the contents to return
• metricsToRetrieve, confsToRetrieve: further limit the contents to return
• metricsLimit: limits the number of values in a time series
Reader REST API: query params
• relatesTo, isRelatedTo: filters by association
• *Filters: filters by info, config, metric, event, …
• Supports complex filters including operators
• metricFilter=(((metric1 eq 50) AND (metric2 gt 40)) OR (metric1 lt 20))
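Filter expressions contain spaces and parentheses, so they need URL encoding before they go into a query string. The sketch below uses only the JDK's `URLEncoder`; the `query` helper is hypothetical, and the parameter names are copied from the slide as written, so they may differ from the names in a released reader.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class ReaderQuerySketch {
    /** Joins parameters into a query string, URL-encoding each value. */
    static String query(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sb.length() == 0 ? '?' : '&');
            sb.append(e.getKey()).append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("limit", "10");
        params.put("fields", "METRICS");
        params.put("metricFilter", "(((metric1 eq 50) AND (metric2 gt 40)) OR (metric1 lt 20))");

        // Spaces become '+', parentheses become %28 and %29.
        System.out.println(query(params));
    }
}
```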
Developing: TimelineClient
In your application master:
// create TimelineClient v.2 style
TimelineClient client = TimelineClient.createTimelineClient(appId);
client.init(conf);
client.start();
// bind it to AM/RM client to receive the collector address
amRMClient.registerTimelineClient(client);
// create and write timeline entities
TimelineEntity entity = new TimelineEntity();
client.putEntities(entity);
// when the app is complete, stop the timeline client
client.stop();
Developing: Flow context
In your app submitter:
ApplicationSubmissionContext appContext =
app.getApplicationSubmissionContext();
// set the flow context as YARN application tags
Set<String> tags = new HashSet<>();
tags.add(TimelineUtils.generateFlowNameTag("distributed grep"));
tags.add(TimelineUtils.generateFlowVersionTag(
"3df8b0d6100530080d2e0decf9e528e57c42a90a"));
tags.add(TimelineUtils.generateFlowRunIdTag(System.currentTimeMillis()));
appContext.setApplicationTags(tags);
Setting up Timeline Service v.2
• Set up the HBase cluster (1.1.x)
• Add the timeline service jar to HBase
• Install the flow run coprocessor
• Create tables via TimelineSchemaCreator utility
• Configure the YARN cluster
• Enable Timeline Service v.2
• Add hbase-site.xml for the timeline collector and readers
• Start the timeline reader daemon
Milestone 1 ("Alpha 1")
• Merge discussion (YARN-2928) in progress as we speak!
✓ Complete end-to-end read/write flow
✓ Real time application and flow aggregation
✓ New entity model
✓ HBase Storage
✓ Rich REST API
✓ Integration with Distributed Shell and MapReduce
✓ YARN generic events and system metrics
Milestones - Future
• Milestone 2 (“Alpha 2”)
• Integration with new YARN UI
• Integration with more frameworks
• Beta
• Freeze API and storage schema
• Security
• Collectors as containers
• Storage fault tolerance
• Production-ready
• Migration-ready
Demo
Contributors
• Li Lu, Junping Du, Vinod Kumar Vavilapalli (Hortonworks)
• Varun Saxena, Naganarasimha G. R. (Huawei)
• Sangjin Lee, Vrushali Channapattan, Joep Rottinghuis (Twitter)
• Zhijie Shen (now at Facebook)
• The HBase and Phoenix community!
Thank you!

Timeline Service v.2 (Hadoop Summit 2016)
