2017 February 07
Hieu LE (hieulq@vn.fujitsu.com)
Fujitsu Vietnam Limited
PODC (Platform Offshore Development Center)
Vietnam OpenStack Community - VFOSSA
Logging/Request Tracing in Distributed Environment
Copyright 2017 Fujitsu Vietnam Limited
/me
2 APRICOT 2017
Hieu LE
Vietnam Official OpenStack Community Organizer
VFOSSA Executive Member
OpenStack Project leader @ Fujitsu
OpenStack ATC/AUC
Email: hieulq@vn.fujitsu.com
Outline
1. Intro
2. Current Logging solution
– Pros
– Cons
3. Tracing requirements
4. Request tracing
– Demo with OpenStack
Intro
• Distributed Environment:
– Cloud Computing, Fog Computing.
– IoT environment.
– Micro-services architecture.
IoT – Fog – Cloud
[Figure: IoT – Fog – Cloud architecture. Labels: (Virtual) Storage Services/Servers; Virtual Compute Resources; Virtual Network; O2M2, Thingworx, DeviceHive and other platforms; Multiple Clouds; Routing (optimizing paths, data pre-processing).]
• What if something goes wrong in our system?
• How can we resolve problems as quickly as possible?
Current Logging solution (1)
• ELK, Graylog:
– Collect logs from systems and appliances.
– Index and filter logs → root cause analysis (RCA).
– Multiple alert/notification mechanisms.
– Visualization based on users' needs.
Current Logging solution (2)
• Pros:
– Quickly troubleshoot problems in systems/appliances.
– Reduce the cost of storing logs to meet PCI DSS or HIPAA requirements.
• Cons:
– Mostly depend on system/appliance logs.
– Require more effort to size, deploy, maintain and operate the logging solution.
– Eat up resources (mostly storage) → may not be suitable for small sensors.
Current Logging solution (3)
• Example 01:
– A single request to launch one VM in an OpenStack cloud can pass through at least four micro-services.
– INFO-level logs sometimes contain misleading or insufficient information for troubleshooting.
– Turning on the DEBUG log level produces too much information and eats up storage.
– The overhead threshold is hard to control.
Current Logging solution (4)
• Example 02:
– ELK/Graylog needs tweaking and extra effort for visualization, collection, profiling and RCA in a distributed environment.
– Consider the following queries in environments with more than 10 services:
– “Find me the root cause of all failed requests that process business X.”
– “Find me requests where the user was logged in, the request took more than two seconds, and a DB transaction was held open for more than 500 ms.”
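Once every request is recorded as a structured trace, the second query becomes a simple filter. A minimal sketch, assuming a hypothetical in-memory trace record (the `Span` and `Trace` classes, their fields and the sample data are illustrative only, not part of ELK, Graylog or any tracing tool):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    # One timed step inside a request; names and fields are hypothetical.
    service: str
    kind: str            # e.g. "http", "db_transaction"
    duration_ms: float


@dataclass
class Trace:
    request_id: str
    user_logged_in: bool
    total_ms: float
    spans: List[Span] = field(default_factory=list)


def slow_logged_in_with_long_db_tx(traces, min_total_ms=2000, min_db_ms=500):
    """Return IDs of requests by logged-in users that took longer than
    min_total_ms and held a DB transaction open longer than min_db_ms."""
    hits = []
    for t in traces:
        if not t.user_logged_in or t.total_ms <= min_total_ms:
            continue
        if any(s.kind == "db_transaction" and s.duration_ms > min_db_ms
               for s in t.spans):
            hits.append(t.request_id)
    return hits


# Made-up sample traces: only req-1 satisfies all three conditions.
traces = [
    Trace("req-1", True, 2500, [Span("api", "http", 2500),
                                Span("db", "db_transaction", 700)]),
    Trace("req-2", True, 2500, [Span("db", "db_transaction", 100)]),
    Trace("req-3", False, 3000, [Span("db", "db_transaction", 900)]),
]
print(slow_logged_in_with_long_db_tx(traces))
```

With plain per-service logs, the same question requires correlating lines from every service by hand, because the login state, the total latency and the DB timing live in different log files.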
Tracing Requirements
Operator Needs
• Address the Data Explosion: logs, metrics, events, active/passive checks, …
• End-to-End Debugging: understand what the real issue is and what is affected when errors occur
• Visibility: deliver centralized intelligence for cloud operations at scale
• Resource Utilization: understand resource availability and utilization
Solution Requirements
• Able to collect, store and access all types of data in one place
• Highly performant and scalable platform
• Flexible processing pipeline that can support multiple use cases: diagnostics, root cause analysis, SLA calculations, utilization reporting, …
• Extensible platform that can be extended to support new types of data and processing
Tracing Requirements
• Users need a centralized solution that provides enough information for both machine-centric (monitoring) and workflow-centric (tracing) views.
– Provide an overall picture of every workflow: the communication steps and the request/response time of each step, for performance review.
– Show monitoring metrics of the hardware/services involved in each step at the time of investigation.
– Provide a general-purpose RCA method for quick troubleshooting.
Quick survey of workflow-centric solutions
Many solutions target workflow-centric tracing. They fall into 3 categories [1]:
1. Explicit metadata propagation: inject tracing metadata into the existing system (Zipkin, Kieker, X-Trace, Tracelytics, Cloudera HTrace, ExplorViz, OpenTracing - CNCF).
2. Schema-based: rely on the event semantics of the system and use a temporal schema of custom log messages for tracing (Magpie).
3. Black-box tracing: rely on log analysis to infer relationships among events (FChain, NetMedic).
[1] HANSEL: Diagnosing Faults in OpenStack – IBM Research
Workflow centric solutions (1)
[Figure: traditional workflow. A request (Req) crosses Service A, Service B, Service C and Service D.]
Workflow centric solutions (2)
• Explicit metadata propagation
– Inject tracing metadata into each request/response and send it to a tracing mechanism (Zipkin, Dapper, ...).
[Figure: explicit metadata tracing workflow. A request (Req) crosses Service A, Service B, Service C and Service D, with the services connected to a Tracing Mechanism.]
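The propagation idea can be sketched in a few lines. This is a minimal simulation under stated assumptions, not how Zipkin or Dapper are implemented: the header names `X-Trace-Id` and `X-Span-Id`, the `call_service` helper and the in-memory `COLLECTOR` list are all hypothetical stand-ins.

```python
import time
import uuid

COLLECTOR = []  # stands in for the tracing mechanism (hypothetical)


def call_service(name, headers, downstream=()):
    """Simulate one service handling a request and calling the next one."""
    # Reuse the trace ID from the incoming headers, or start a new trace.
    trace_id = headers.get("X-Trace-Id") or uuid.uuid4().hex
    parent_id = headers.get("X-Span-Id")
    span_id = uuid.uuid4().hex[:16]
    start = time.monotonic()

    if downstream:
        # Propagate the metadata: same trace ID, this span becomes the parent.
        child_headers = {"X-Trace-Id": trace_id, "X-Span-Id": span_id}
        call_service(downstream[0], child_headers, downstream[1:])

    # Report the finished span to the tracing mechanism.
    COLLECTOR.append({
        "trace_id": trace_id,
        "span_id": span_id,
        "parent_id": parent_id,
        "service": name,
        "duration_ms": (time.monotonic() - start) * 1000,
    })
    return trace_id


trace_id = call_service("Service A", {}, ("Service B", "Service C", "Service D"))
for span in COLLECTOR:
    print(span["service"], span["parent_id"], "->", span["span_id"])
```

Because every span carries the same trace ID and a pointer to its parent, the collector can rebuild the full call tree of a request without guessing from timestamps.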
Workflow centric solutions (3)
• Explicit metadata propagation
– Pros:
• Gives enough detail to trace problems.
• Highly scalable.
– Cons:
• The code base must be modified to inject metadata into the header of each request and response.
• Increases packet size slightly (for example, around 500 bytes with Zipkin).
Workflow centric solutions (4)
• Schema-based: relies on the semantics of events generated by the system (including OS, services and applications), then joins all related event schemas for the final inference.
[Figure: schema-based workflow. Labels: Req; Service A, Service B, Service C, Service D; Authenticate (three times); Get Image; Create port, IP and attach; Read/Write DB; Event Listener.]
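The joining step can be illustrated with a small sketch. It is a simplification under stated assumptions: the event fields, the `JOIN_KEYS` schema and the sample events are hypothetical, and the temporal validity rules that a real schema would need are ignored.

```python
from collections import defaultdict

# Events as an event listener might collect them; all fields are hypothetical.
EVENTS = [
    {"ts": 1.0, "service": "A", "type": "http_request", "conn": "c1"},
    {"ts": 1.2, "service": "B", "type": "authenticate", "conn": "c1", "token": "t9"},
    {"ts": 1.5, "service": "C", "type": "get_image", "token": "t9", "image": "img7"},
    {"ts": 1.9, "service": "D", "type": "create_port", "token": "t9"},
    {"ts": 2.0, "service": "A", "type": "http_request", "conn": "c2"},
]

# The schema: attributes whose equal values mean "same request".
JOIN_KEYS = ("conn", "token")


def join_events(events, join_keys=JOIN_KEYS):
    """Group events that share a value for any join key (union-find)."""
    parent = list(range(len(events)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    seen = {}  # (key, value) -> index of the first event carrying it
    for i, ev in enumerate(events):
        for key in join_keys:
            if key in ev:
                j = seen.setdefault((key, ev[key]), i)
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, ev in enumerate(events):
        groups[find(i)].append(ev)
    # Order each reconstructed request by timestamp.
    return [sorted(g, key=lambda e: e["ts"]) for g in groups.values()]


for request in join_events(EVENTS):
    print([f'{e["service"]}:{e["type"]}' for e in request])
```

No service had to carry a trace ID here, but the result is only as good as the schema: if an event lacks a joinable attribute, it drops out of the reconstructed request.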
Workflow centric solutions (5)
• Schema-based
– Pros:
• Fewer modifications to the code base.
– Cons:
• Low scalability (the result is delayed until all events are collected).
• Less detail than explicit metadata (the event semantics, the event list and the way schemas are joined determine the success of this approach → we need to build a warehouse of event semantics).
Workflow centric solutions (6)
• Black-box tracing: collect the logs of all services, then analyze them to infer the root cause of the problem.
[Figure: black-box workflow. Service A, Service B, Service C, Service D and the DB each send Logs to a Log Collector and Analyzer.]
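A toy version of such log analysis: guess which service calls which purely from how closely their log lines follow each other. This is an illustrative heuristic under stated assumptions, not the algorithm used by FChain or NetMedic; the `infer_edges` function, its thresholds and the sample log data are made up.

```python
from collections import Counter

# (timestamp in seconds, service) pairs parsed from collected logs; made-up data.
LOGS = [
    (0.00, "A"), (0.05, "B"), (0.09, "C"),
    (1.00, "A"), (1.04, "B"), (1.30, "D"),
    (2.00, "A"), (2.06, "B"), (2.11, "C"),
]


def infer_edges(logs, window=0.1, min_support=0.6):
    """Guess 'X calls Y' when a Y log line often follows an X line closely."""
    logs = sorted(logs)
    totals = Counter(svc for _, svc in logs)
    follows = Counter()
    for i, (t1, s1) in enumerate(logs):
        # Take the first later line from a different service inside the window.
        for t2, s2 in logs[i + 1:]:
            if t2 - t1 > window:
                break
            if s2 != s1:
                follows[(s1, s2)] += 1
                break
    # Keep only edges seen for a large enough share of X's log lines.
    return {edge: n / totals[edge[0]] for edge, n in follows.items()
            if n / totals[edge[0]] >= min_support}


print(infer_edges(LOGS))
```

On this sample the heuristic finds A → B and B → C, but it misses the slower B → D call at 1.30 s because it falls outside the time window. Nothing in the services was modified, yet the inferred picture is only a statistical guess.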
Workflow centric solutions (7)
• Black-box tracing:
– Pros:
• No modification to the code base.
– Cons:
• High error rate (most approaches are probabilistic data mining).
Example (1)
Magpie: Schema-based
Example (2)
Zipkin: Explicit metadata propagation
Demo with OpenStack
OSProfiler: a small library for explicit metadata propagation
Q & A
THANK YOU!

railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
Reagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptxReagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptx
AlejandroOdio
 
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Journal of Soft Computing in Civil Engineering
 
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Journal of Soft Computing in Civil Engineering
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
ELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdfELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdf
Shiju Jacob
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
Avnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights FlyerAvnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights Flyer
WillDavies22
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
Introduction to FLUID MECHANICS & KINEMATICS
Introduction to FLUID MECHANICS &  KINEMATICSIntroduction to FLUID MECHANICS &  KINEMATICS
Introduction to FLUID MECHANICS & KINEMATICS
narayanaswamygdas
 
DSP and MV the Color image processing.ppt
DSP and MV the  Color image processing.pptDSP and MV the  Color image processing.ppt
DSP and MV the Color image processing.ppt
HafizAhamed8
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
Level 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical SafetyLevel 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical Safety
JoseAlbertoCariasDel
 
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
inmishra17121973
 
Data Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptxData Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptx
RushaliDeshmukh2
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
Compiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptxCompiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptx
RushaliDeshmukh2
 
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptxLidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
RishavKumar530754
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
Reagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptxReagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptx
AlejandroOdio
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
ELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdfELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdf
Shiju Jacob
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
Avnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights FlyerAvnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights Flyer
WillDavies22
 

Apricot2017 Request tracing in distributed environment

  • 1. Logging/Request Tracing in Distributed Environment. 2017 February 07. Hieu LE (hieulq@vn.fujitsu.com), Fujitsu Vietnam Limited, PODC (Platform Offshore Development Center), Vietnam OpenStack Community - VFOSSA. Copyright 2017 Fujitsu Vietnam Limited
  • 2. /me (APRICOT 2017): Hieu LE. Vietnam Official OpenStack Community Organizer; VFOSSA Executive Member; OpenStack Project Leader @ Fujitsu; OpenStack ATC/AUC. Email: hieulq@vn.fujitsu.com
  • 3. Outline: 1. Intro; 2. Current logging solution (pros, cons); 3. Tracing requirements; 4. Request tracing (demo with OpenStack)
  • 4. Intro. Distributed environments: cloud computing and fog computing; IoT environments; micro-services architecture.
  • 5. IoT – Fog – Cloud (diagram labels): (Virtual) Storage; Services/Servers; Virtual Compute Resources; Virtual Network; O2M2, Thingworx, DeviceHive, other platforms; multiple clouds; routing, optimizing paths, data pre-processing.
  • 6. What if something goes wrong in our system? How can we resolve the problem as quickly as possible?
  • 7. Current Logging solution (1). ELK, Graylog:
      - Collect logs from systems and appliances.
      - Index and filter logs for root cause analysis (RCA).
      - Multiple alert/notification mechanisms.
      - Visualization based on the user's needs.
  • 8. Current Logging solution (2).
      Pros:
      - Quickly troubleshoot problems in systems/appliances.
      - Reduce the cost of storing logs as required by PCI DSS or HIPAA.
      Cons:
      - Depend mostly on system/appliance logs.
      - Require more effort for sizing, deploying, maintaining and operating the logging solution.
      - Eat up resources (mostly storage), so they may not be suitable for small sensors.
  • 9. Current Logging solution (3). Example 01:
      - A single request to launch one VM in an OpenStack cloud can go through at least four micro-services.
      - INFO-level logs sometimes contain misleading information, or not enough information for troubleshooting.
      - Turning on the DEBUG log level produces too much information and eats up storage.
      - The overhead threshold is hard to control.
  • 10. Current Logging solution (4). Example 02:
      - ELK/Graylog require tweaking and effort for visualization, collection, profiling and RCA in a distributed environment.
      - Consider the following queries in an environment with more than 10 services:
        “Find me the root cause of all error requests where the requests process X business.”
        “Find me requests where the user was logged in and the request took more than two seconds and a DB transaction was held open for more than 500 ms.”
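To make the second query concrete, here is a minimal Python sketch of how it could be answered once every record carries a request-wide identifier. The record layout (`trace_id`, `name`, `duration_ms`, `logged_in`) and the sample data are assumptions made for illustration; they are not taken from the slides or from any particular tracing tool.

```python
from collections import defaultdict

# Hypothetical span records: one entry per timed operation, tagged with the
# identifier of the request (trace) it belongs to.
SPANS = [
    {"trace_id": "t1", "name": "http.request", "duration_ms": 2600, "logged_in": True},
    {"trace_id": "t1", "name": "db.transaction", "duration_ms": 740},
    {"trace_id": "t2", "name": "http.request", "duration_ms": 2300, "logged_in": True},
    {"trace_id": "t2", "name": "db.transaction", "duration_ms": 120},
    {"trace_id": "t3", "name": "http.request", "duration_ms": 900, "logged_in": False},
    {"trace_id": "t3", "name": "db.transaction", "duration_ms": 800},
]


def slow_logged_in_requests(spans, min_request_ms=2000, min_db_ms=500):
    """Return trace ids where a logged-in request was slow AND held a long DB transaction."""
    by_trace = defaultdict(list)
    for span in spans:
        by_trace[span["trace_id"]].append(span)

    matches = []
    for trace_id, group in by_trace.items():
        slow_request = any(
            s["name"] == "http.request"
            and s.get("logged_in")
            and s["duration_ms"] > min_request_ms
            for s in group
        )
        long_db_tx = any(
            s["name"] == "db.transaction" and s["duration_ms"] > min_db_ms
            for s in group
        )
        if slow_request and long_db_tx:
            matches.append(trace_id)
    return sorted(matches)


print(slow_logged_in_requests(SPANS))
```

The grouping step is the part that plain per-service logs make hard: without a shared identifier, the HTTP record and the DB record of the same request cannot be joined reliably.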
  • 11. Tracing Requirements.
      Operator needs:
      - Address the data explosion: logs, metrics, events, active/passive checks, …
      - End-to-end debugging: understand what the real issue is and what is affected when errors occur.
      - Visibility: deliver centralized intelligence for cloud operations at scale.
      - Resource utilization: understand resource availability and utilization.
      Solution requirements:
      - Able to collect, store and access all types of data in one place.
      - Highly performant and scalable platform.
      - Flexible processing pipeline that can support multiple use cases: diagnostics, root cause analysis, SLA calculations, utilization reporting, …
      - Extensible platform that can be extended to support new types of data and processing.
  • 12. Tracing Requirements. Users need a centralized solution that provides enough information from both the machine-centric view (monitoring) and the workflow-centric view (tracing):
      - Provide a general picture of every workflow: the communication steps and the request/response time of each step, for performance review.
      - Show the monitoring metrics of the hardware/services involved in each step at the time of investigation.
      - Provide a general-purpose RCA method for quick troubleshooting.
  • 13. Workflow-centric solutions: quick survey. Many solutions aim at workflow-centric tracing. They fall into three categories [1]:
      1. Explicit metadata propagation: inject tracing metadata into the current system (Zipkin, Kieker, X-Trace, Tracelytics, Cloudera HTrace, ExplorViz, OpenTracing - CNCF).
      2. Schema-based: rely on the event semantics of the system and use a temporal schema of custom log messages for tracing (Magpie).
      3. Black-box tracing: rely on log analysis to infer relationships among events (FChain, NetMedic).
      [1] HANSEL: Diagnosing Faults in OpenStack – IBM Research
  • 14. Workflow-centric solutions (1). Figure of a traditional workflow: Req -> Service A -> Service B -> Service C -> Service D
  • 15. Workflow-centric solutions (2). Explicit metadata propagation. Figure of the explicit metadata tracing workflow: metadata is injected into each request/response and sent to a tracing mechanism (Zipkin, Dapper, etc.). Req -> Service A -> Service B -> Service C -> Service D, each reporting to the tracing mechanism.
  • 16. Workflow-centric solutions (3). Explicit metadata propagation.
      Pros:
      - Gives enough detail to trace problems.
      - High scalability.
      Cons:
      - The code base must be modified to inject metadata into the header of each request and response.
      - Increases network packet size (only slightly; around 500 bytes in the case of Zipkin).
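The propagation idea can be sketched in a few lines of Python. This is a toy in-process model, not Zipkin's or OSProfiler's API: the header names `X-Trace-Id` and `X-Parent-Span-Id`, the `handle` function and the in-memory `COLLECTOR` are all hypothetical stand-ins for real HTTP headers and a real tracing backend.

```python
import uuid

TRACE_HEADER = "X-Trace-Id"          # hypothetical header name
PARENT_HEADER = "X-Parent-Span-Id"   # hypothetical header name
COLLECTOR = []                       # stand-in for the tracing mechanism


def handle(service, headers, downstream=()):
    """Simulate one service handling a request and calling its downstream services."""
    # Reuse the incoming trace id, or start a new trace at the entry point.
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    span_id = uuid.uuid4().hex[:16]

    # Report this hop to the collector, linked to its caller's span.
    COLLECTOR.append({
        "trace_id": trace_id,
        "span_id": span_id,
        "parent_id": headers.get(PARENT_HEADER),
        "service": service,
    })

    # Inject the metadata into the headers of every outgoing call.
    outgoing = dict(headers)
    outgoing[TRACE_HEADER] = trace_id
    outgoing[PARENT_HEADER] = span_id
    for name, children in downstream:
        handle(name, outgoing, children)
    return trace_id


# Req -> Service A -> Service B -> Service C -> Service D
trace = handle("A", {}, [("B", [("C", [("D", [])])])])
for span in COLLECTOR:
    print(span["service"], span["span_id"], "parent:", span["parent_id"])
```

Both drawbacks listed on the slide are visible here: every service has to run the inject/extract code, and every outgoing request carries the extra headers.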
  • 17. Workflow-centric solutions (4). Schema-based: based on the semantics of the events generated by the system (including the OS, services and applications); all related event schemas are then joined for the final inference. Figure: Req -> Service A -> Service B -> Service C -> Service D, with an event listener capturing events such as Authenticate, Get Image, Create port, IP and attach, and Read/Write DB.
  • 18. Workflow-centric solutions (5). Schema-based.
      Pros:
      - Fewer modifications to the code base.
      Cons:
      - Low scalability (the result is delayed until all events are collected).
      - Less detail than explicit metadata. The semantics of the events, the event list and the way schemas are joined determine the success of this approach, so a warehouse of event semantics must be built.
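As a rough illustration of "joining related events", the Python sketch below groups events that share the value of any attribute named in a join schema. It is a simplification in the spirit of schema-based tools, not Magpie's actual algorithm; the attribute names (`token`, `instance_id`, `port_id`) and the sample events are invented for the example.

```python
# Hypothetical join schema: events sharing a value for any of these
# attributes are assumed to belong to the same request.
JOIN_ATTRS = ("token", "instance_id", "port_id")

EVENTS = [
    {"name": "A.authenticate", "token": "tok-1"},
    {"name": "B.get_image", "token": "tok-1", "instance_id": "vm-9"},
    {"name": "C.create_port", "instance_id": "vm-9", "port_id": "p-3"},
    {"name": "D.attach_port", "port_id": "p-3"},
    {"name": "A.authenticate", "token": "tok-2"},
]


def join_events(events, join_attrs=JOIN_ATTRS):
    """Group events transitively linked by shared attribute values (union-find)."""
    parent = list(range(len(events)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # (attribute, value) -> index of first event carrying it
    for idx, event in enumerate(events):
        for attr in join_attrs:
            if attr in event:
                key = (attr, event[attr])
                if key in first_seen:
                    parent[find(idx)] = find(first_seen[key])
                else:
                    first_seen[key] = idx

    groups = {}
    for idx, event in enumerate(events):
        groups.setdefault(find(idx), []).append(event["name"])
    return sorted(groups.values(), key=len)


for request in join_events(EVENTS):
    print(request)
```

The sketch also shows why the approach depends on the event warehouse: if `C.create_port` did not log `instance_id`, the chain would break into two unrelated groups.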
  • 19. Workflow-centric solutions (6). Black-box tracing: collect the logs of all services, then analyze them to infer the root cause of the problem. Figure: Service A, Service B, Service C, Service D and the DB each send logs to a log collector and analyzer.
  • 20. Workflow-centric solutions (7). Black-box tracing.
      Pros:
      - No modification to the code base.
      Cons:
      - High error rate (most are probabilistic data-mining approaches).
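A toy Python sketch of the black-box idea follows: it guesses "service X calls service Y" purely from how often a log line of Y follows one of X within a short time window. The window size, the support threshold and the sample log are assumptions chosen for the example; this is not the method used by FChain or NetMedic.

```python
# Hypothetical log: (timestamp in milliseconds, service that wrote the line).
LOGS = [
    (0, "A"), (100, "B"), (200, "C"),
    (5000, "A"), (5100, "B"), (5200, "C"),
    (9000, "D"), (9050, "A"),   # unrelated lines that happen to be close in time
]


def infer_edges(log_lines, window_ms=150, min_support=2):
    """Infer likely caller -> callee pairs from temporal proximity alone."""
    events = sorted(log_lines)
    counts = {}
    for i, (t1, s1) in enumerate(events):
        for t2, s2 in events[i + 1:]:
            if t2 - t1 > window_ms:
                break  # later lines are even further away
            if s1 != s2:
                counts[(s1, s2)] = counts.get((s1, s2), 0) + 1
    # Keep only pairs seen often enough to be more than coincidence.
    return {pair: n for pair, n in counts.items() if n >= min_support}


print(infer_edges(LOGS))
```

With `min_support=1` the coincidental pair `("D", "A")` would be reported as a dependency, which is the kind of error the slide warns about.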
  • 21. Example (1). Magpie: schema-based.
  • 22. Example (2). Zipkin: explicit metadata propagation.
  • 23. Demo with OpenStack. OSProfiler: a small library for explicit metadata propagation.
  • 24. Q & A. THANK YOU!