Monitor OpenStack Environments
From the Bottom Up and Front to Back
Thomas Stocking, Director of Systems Engineering, Founder, GroundWork Open Source Inc.
February 16, 2016 | Icinga Camp
What’s ahead?
• Overview—the impact of virtualization on IT operations
• How OpenStack fits into the virtualization landscape
• Monitoring the changing landscape of IT infrastructure
• New monitoring concepts
• Selecting the right tools to fit the right process
• Conclusion
www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. CONFIDENTIAL
Start with the facts: OpenStack is for real!
• OpenStack deployments are not just happening in a faraway land…
• It’s all open source.
• 5000 active members, and growing.
• Not just a geek movement.
• Serious deployments for IT operations, not just in Silicon Valley.
• It’s disruptive and requires serious retooling for IT operations.
• Many corporate sponsors.
Let’s review the challenges and discover which tools best fit the new realities…
Once upon a time…
SysAdmin task:
Add to IT infrastructure {
  Deploy servers into the data center
  Provision applications
  Define monitoring for each element
  Monitoring: SSH checks and port gets
} repeat
Check for up/down and send email
[Diagram: MONITOR connected to SERVERs running APPS, SWITCHes, a FIREWALL, and a ROUTER; links labeled PORT GET, SSH, and SNMP.]
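The "port get" style of up/down check described on this slide can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the talk: the function name `port_is_up` and the two-second default timeout are assumptions, and a real check would go on to send the notification email.

```python
import socket


def port_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A caller would run this per element on a schedule, for example `port_is_up("server1", 22)` for an SSH port, and alert when the result flips from True to False.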
Then came along compute virtualization…
Data center compute optimization
• Server resources were virtualized to improve efficiency, which was <15%.
• Products like ESX allowed resource optimization without disturbing the provisioning process.
Rollout
• Virtual machines were provisioned.
• Applications were installed.
• Monitoring was defined as before.
An API was added to the virtualization manager (for example, the vSphere API).
[Diagram: MONITOR connected to SERVERs running APPS on a VIRTUALIZATION layer, SWITCHes, a FIREWALL, and a ROUTER; links labeled PORT GET, SSH, SNMP, and REST API.]
But it didn’t stop there…
Software-defined everything
• Compute (hypervisors)
• Storage (SDS)
• Network (SDN)
Hybrid Cloud Public/Private
• Amazon Web Services
• Rackspace
• Azure
Change created blind spots in coverage. Suddenly, SSH/SNMP methods don’t cover everything.
[Diagram: MONITOR connected to SERVERs running APPS on a VIRTUALIZATION layer, SWITCHes, a FIREWALL, and a ROUTER, now joined by VIRTUALIZED NETWORK layers with their own switches; links labeled PORT GET, SSH, SNMP, and REST API.]
The infrastructure landscape completely changed.
[Diagram labels: Files, Devices, VMs, Shared I/O Fog, Network & Storage; five SILO VIEW columns and one HYPERVISOR MANAGER VIEW.]
And the moment you think you’ve seen it all…
DevOps is pushing the envelope even further…
Linux Containers are the new kids in town
API for deployment: Monitoring is somebody else’s job
[Diagram: MONITOR connected to SERVERs running APPS on a VIRTUALIZATION layer, SWITCHes, a FIREWALL, and a ROUTER, with VIRTUALIZED NETWORK layers and three DOCKER HOSTs; links labeled PORT GET, SSH, and REST API.]
Application Isolation
Rapid Deployment
Elastic Scalability
Diverse Virtualization Stacks
• ESX, NSX, vSAN
• Nova, Neutron, Cinder, Glance
• EC2, VPC, S3
• Linux KVM, Network, NFS
Don’t panic.
What happened over the last 5 years:
• Virtualize everything
• Private and public clouds
• API-centric world is not just for applications
• REST API is a standard
• CORBA, SOAP and proprietary APIs are now classified as dinosaurs
• Browser-based UI/JavaScript is king
• Learn to speak REST and JSON
[Diagram labels: vSphere API, OpenStack API, AWS API, oVirt API]
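As a small illustration of "speaking REST and JSON", the sketch below parses a JSON body and turns it into something a monitor can act on. The sample body is made up for this example and is only shaped like a server listing from the OpenStack Compute API; the function names and the extra server names are assumptions, not part of the talk.

```python
import json

# Illustrative response body; the IDs and most names are invented.
SAMPLE_RESPONSE = """
{"servers": [
  {"id": "a1", "name": "eng-slicer-1", "status": "ACTIVE"},
  {"id": "b2", "name": "eng-slicer-2", "status": "SHUTOFF"},
  {"id": "c3", "name": "eng-slicer-3", "status": "ERROR"}
]}
"""


def servers_by_status(body: str) -> dict[str, list[str]]:
    """Group server names by the status string reported in the JSON body."""
    grouped: dict[str, list[str]] = {}
    for server in json.loads(body)["servers"]:
        grouped.setdefault(server["status"], []).append(server["name"])
    return grouped


def not_active(body: str) -> list[str]:
    """Return the sorted names of servers whose status is not ACTIVE."""
    return sorted(
        server["name"]
        for server in json.loads(body)["servers"]
        if server["status"] != "ACTIVE"
    )
```

In practice the body would come from an authenticated HTTP GET against the API rather than from a constant, but the parsing step stays the same.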
Sounds really good, doesn’t it?
• Service-oriented architecture
• Pluggable hypervisors, network, storage to support a wide range of technologies
• Elastic Compute Units for better virtualization efficiency
• Standardized APIs to all services
But what’s the reality?
How does OpenStack fit in?
[Diagram: OpenStack services, each with its own API, alongside a Façade Service.]
• Object Store: Swift
• Dashboard: Horizon
• Image: Glance
• Compute/Hypervisor: Nova
• Storage: Cinder
• Network: Quantum/Neutron
• Identity: Keystone
OpenStack Profile Editor
[Two screenshots of the OpenStack Profile Editor.]
API is good, but comprehensive coverage is even better...
[Diagram: a Unified View on top of Standardized Data Collection, with these labeled inputs.]
• Performance Data: Virtual Infrastructure, Container Metrics
• Provisioning/Configuration: ManageIQ/CloudForms, Ansible (API)
• Manager Stack: HP Helion, VMware vRealize, Mirantis Fuel (API)
• OpenStack API: Dashboard, Object Store, Image, Compute, Storage, Network, Identity
• Legacy Network, Servers, Storage Racks
• Applications, Infrastructure Checks: SSH/SNMP/port get
Unified View
Silo Tools
• Impossible to correlate
• Naming mismatch
• Over-monitoring
• Encourages “It’s not my problem, check your system”
• No Big Picture dashboards
Need For
• Stack monitoring bottom to top
• Aliasing of names
• Combine best-of-breed collectors
• Correlated metrics across all infrastructure
• Dashboard for each customer/client
Stack monitoring for OpenStack
Function | Source | Host name | Alias
Network, Storage | Netflow, SNMP | Dsw1-422 | os-eng-h1
Hardware | IPMI, SNMP | Drac-server1 | os-eng-h1
Identity server | check-mysql, check-port | server1:port | os-eng-h1
Operating System | check-proc, check-mem, check-load | server1 | os-eng-h1
OpenStack API Hypervisor | CPU/mem metrics, network, storage | 10.10.0.1 | os-eng-h1
OpenStack API VM | CPU/mem, network, storage | 10.10.10.123 | eng-slicer-1
Operating System | check-proc, check-mem, check-load | eng-cent6-actg | eng-slicer-1
DockerHost | Memory, CPU | eng-cent6-actg | eng-slicer-1
Container | Memory, CPU, Procs | FAC3443DA77 | eng-slicer-1
Application | Check_https | 172.28.102.51 | eng-slicer-1
[Row groups on the slide are labeled HYPERVISOR, VM, and APP.]
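The aliasing idea in this table can be sketched as a lookup that maps each collector's own host name to one shared alias, so results from different tools line up under the same entity. The host names and aliases below are copied from the slide; the function names, the tuple layout, and the handling of a ":port" suffix are assumptions made for this illustration.

```python
from collections import defaultdict

# Host name -> alias pairs taken from the slide's table.
ALIASES = {
    "Dsw1-422": "os-eng-h1",
    "Drac-server1": "os-eng-h1",
    "server1": "os-eng-h1",
    "10.10.0.1": "os-eng-h1",
    "10.10.10.123": "eng-slicer-1",
    "eng-cent6-actg": "eng-slicer-1",
    "FAC3443DA77": "eng-slicer-1",
    "172.28.102.51": "eng-slicer-1",
}


def resolve_alias(host_name: str) -> str:
    """Map a collector-specific host name to its shared alias.

    A trailing ":port" is ignored. Unknown names come back without the
    port suffix, so they still group consistently.
    """
    base = host_name.split(":", 1)[0]
    return ALIASES.get(base, base)


def group_by_alias(results):
    """Group (host_name, check, status) tuples under their alias."""
    grouped = defaultdict(list)
    for host_name, check, status in results:
        grouped[resolve_alias(host_name)].append((check, status))
    return dict(grouped)
```

With this in place, an IPMI result reported for `Drac-server1` and a `check-mysql` result reported for `server1:3306` both land under `os-eng-h1`, which is what makes cross-layer correlation possible.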
Automation/Continuous configuration discovery
Adding virtual machines and containers is automatic… so is the monitoring
Monitoring is like a flight recorder: collecting all active data
“Continuous configuration discovery” by…
• Synchronization of configuration over APIs
• Auto-registration of agents/download of plugins
• Collection of availability and performance data
[Diagram: VMs (V) and containers (C) on a HYPERVISOR with Management; a Monitoring System with a Configuration collector, a Data collector, Plugins, and Dashboards receives DATA.]
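The synchronization step in continuous configuration discovery comes down to comparing what the virtualization API reports with what the monitoring system already knows. The sketch below shows that comparison only; the function name `plan_sync` and the use of plain name sets are assumptions, and a real collector would then apply the plan through the monitoring system's own configuration interface.

```python
def plan_sync(discovered: set[str], monitored: set[str]) -> dict[str, list[str]]:
    """Compare discovered inventory with monitored hosts.

    Returns the names to start monitoring ("add") and the names that no
    longer exist and should be retired ("remove"), each sorted.
    """
    return {
        "add": sorted(discovered - monitored),
        "remove": sorted(monitored - discovered),
    }
```

Running this on every discovery cycle keeps monitoring in step with VMs and containers as they appear and disappear, without anyone editing configuration by hand.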
OpenStack Monitoring: Tools Selection
• Fuel: OpenStack deployment
• Murano: application deployment
• Fuel Plugin: generate configuration
• Fuel Plugin: configure monitoring
[Diagram: Fuel and the Murano Application Software Catalog beside an Icinga2 setup that uses the OpenStack API. Collectors (IPMI, Port, Agent) cover Hypervisor Hardware, the OpenStack Linux OS Platform, the OpenStack Controller (Keystone), and OpenStack Storage (Cinder), and feed Data Management and a Unified Monitoring View. Nodes and VMs run Icinga 2, OpenTSDB, and Docker.]
Performance Data
Monitoring Systems
Centralized Data Collector
• Monitoring systems send performance data to a single API
Expandable Storage Cluster
• Expand on demand
Dashboard to visualize and drill down
• Historic, raw performance data
• Group like metrics to find outliers
Grafana Dashboards
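Grouping like metrics to find outliers can be illustrated with a median-based test: take the same metric from every member of a group and flag the members that sit far from the group median. This is one possible technique chosen for the sketch, not the method shown in the talk; the function name, the 3.5 threshold, and the sample values are assumptions.

```python
from statistics import median


def find_outliers(values: dict[str, float], threshold: float = 3.5) -> list[str]:
    """Return the names whose value is far from the group median.

    Uses the modified z-score based on the median absolute deviation (MAD).
    """
    med = median(values.values())
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        # At least half the group is identical; anything different stands out.
        return sorted(name for name, v in values.items() if v != med)
    return sorted(
        name
        for name, v in values.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )
```

For example, given CPU percentages of 20, 22, 21, 19, and 95 across five VMs, only the VM at 95 is reported. The median is used instead of the mean so that the outlier itself does not drag the baseline.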
Reference Architecture Monitoring
• Aliasing and data normalization in the backend
• A growing number of APIs requires an integration hub
[Diagram: three stages, each fronted by an API. Collect: collectors feeding a Data Collector Hub. Integrate: data integration, normalization, aliasing. Visualize.]
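The normalization step in the hub can be sketched as a function that takes plugin performance data in the common `label=value[unit];warn;crit;min;max` form and emits uniform data points tagged with the aliased host. The output dictionaries are shaped like time-series data points (metric, timestamp, value, tags); the function name, the metric naming scheme, and the decision to discard units and thresholds are assumptions for this illustration.

```python
import re
import shlex

# Leading number followed by an optional unit suffix such as "GB" or "%".
VALUE_RE = re.compile(r"([-+]?\d*\.?\d+)([A-Za-z%]*)")


def perfdata_to_points(alias: str, check: str, perfdata: str, timestamp: int) -> list[dict]:
    """Convert a perfdata string into a list of normalized data points."""
    points = []
    # shlex.split keeps quoted labels such as 'disk usage' together.
    for token in shlex.split(perfdata):
        label, sep, rest = token.partition("=")
        # Only the first field is the value; thresholds after ";" are ignored.
        match = VALUE_RE.fullmatch(rest.split(";")[0])
        if not sep or match is None:
            continue  # Skip anything that is not label=value.
        points.append({
            "metric": f"{check}.{label.replace(' ', '_')}",
            "timestamp": timestamp,
            "value": float(match.group(1)),  # The unit suffix is discarded here.
            "tags": {"host": alias},
        })
    return points
```

Because every collector's output passes through the same function and carries the same alias tag, the visualization layer can query one metric name across all sources.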
Benefits
Automation
• Reduced cost of maintenance
• Dynamic configuration
• Streamlined operations
Personnel
• System Administrator and DevOps functions merging
Capacity & Resource Planning
• Complete Bottom-to-Top (Network-to-Application) data collection
• Allocate resources at the right level
Hardware
• No vendor lock-in
• Transparency through API
Lessons learned…
• Virtualization and containerization require new monitoring techniques
• Legacy hardware/software will be around for a while, so don’t throw tools away
• Everything API needs integration
• Aliasing is hard, but may be easier than cross-silo cooperation
• Use the best tool for the task
• Don’t over-monitor to cover a gap
Conclusion
• Open source tools are the driving force for innovation
• DevOps teams just select the best tools to do the job
• A single monitoring tool to “rule them all” doesn’t exist
• Integration is a complex task: don’t expect teams to agree on naming, process, and workflow
• Automation and pragmatism will prevail, as DevOps has demonstrated
• “Virtualize everything” will continue
• Automation and continuous discovery are necessary for rapid scale-out
• OpenStack is the open source virtualization platform, but monitoring coverage needs major improvements
Thank you.
Thomas Stocking
tstocking@gwos.com


Monitor OpenStack Environments from the bottom up and front to back

  • 1. Monitor OpenStack Environments From the Bottom Up and Front to Back Thomas Stocking, Director of Systems Engineering, Founder. GroundWork Open Source Inc. February 16, 2016 | Icinga Camp
  • 2. What’s ahead? • Overview—the impact of virtualization on IT operations • How OpenStack fits into the virtualization landscape • Monitoring the changing landscape of IT infrastructure • New monitoring concepts • Selecting the right tools to fit the right process • Conclusion 2 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. CONFIDENTIAL
  • 3. Start with the facts: OpenStack is for real! 3 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. OpenStack deployments are not just happening in a far away land… It’s all open source. 5000 active members, and growing. Not just a geek movement. Serious deployments for IT operations, not just in the Silicon Valley. It’s disruptive and requires serious retooling for IT operations. Many corporate sponsors. Let’s review the challenges and discover which tools best fit the new realities…
  • 4. Once upon a time… SysAdmin task: Add to IT infrastructure { Deploy servers into data center Provisioning applications Define monitoring for each element Monitoring: SSH checks and port gets } repeat Checking for up down and send email 4 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. MONITOR SERVER A P P S A P P S SERVER A P P S A P P S SERVER A P P S SWITCH SWITCH FIREWALL ROUTER PORT GET SSH SNMP SNMP SNMP
  • 5. VIRTUALIZATION Then came along compute virtualization… Data center compute optimization • Server resources were virtualized to improve efficiency which was <15%. • Products like ESX allowed resource optimization not disturbing the provisioning process. Rollout • VM machines were provisioned. • Applications installed. • Monitoring defined as before. API was added to the virtualization manager (example Vsphere API) 5 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. MONITOR SERVER A P P S A P P S SERVER A P P S A P P S SERVER A P P S SWITCH SWITCH FIREWALL ROUTER PORT GET SSH SNMP SNMP SNMP REST API
  • 6. MONITOR SNMP VIRTUALIZATION But it didn’t stop there… Software defined everything • Compute (hypervisors) • Storage (SDS) • Network (SDN) Hybrid Cloud Public/Private • Amazon Web Services • Rackspace • Azure Change created blind spots in coverage. Suddenly, SSH/SNMP methods don’t cover everything. 6 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. SERVER A P P S A P P S SERVER A P P S A P P S SERVER A P P S SWITCH SWITCH FIREWALL ROUTER PORT GET SSH REST API REST API SNMP SNMP VIRTUALIZED NETWORK VIRTUALIZED NETWORK SWITCHES SWITCHESSWITCHES
  • 7. Files Devices VMs Shared I/O Fog Network & Storage The infrastructure landscape completely changed. 7 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. SILOVIEWSILOVIEWSILOVIEWSILOVIEWSILOVIEW HYPERVISORMANAGERVIEW
  • 8. VIRTUALIZATION And the moment you think you’ve seen it all… DevOps is pushing the envelope even further… Linux Containers are the new kids in town API for deployment: Monitoring is somebody else’s job 8 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. MONITOR SERVER A P P S A P P S SERVER A P P S A P P S SERVER A P P S SWITCH SWITCH FIREWALL ROUTER PORT GET SSH VIRTUALIZED NETWORK VIRTUALIZED NETWORK SWITCHES SWITCHESSWITCHES REST API REST API DOCKER HOST DOCKER HOST DOCKER HOST Application Isolation Rapid Deployment Elastic Scalability
  • 9. ESX NSX vSan Nova Neutron Cinder Glance EC2 VPC S3 Linux KVM Network NFS Diverse Virtualization Stacks Don’t panic. What happened over the last 5 years: • Virtualize everything • Private and public clouds • API-centric world is not just for applications • REST API is a standard • CORBA, SOAP and proprietary API’s are now classified as dinosaurs • Browser based UI/JavaScript is king • Learn to speak REST and JSON 9 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. vSphere API OpenStack API AWS API oVirt API
  • 10. Sounds really good, doesn’t it? • Service-oriented architecture • Pluggable hypervisors, network, storage to support a wide range of technologies • Elastic Compute Units for better virtualization efficiency • Standardized API’s to all services But what’s the reality? How does OpenStack fit in? 10 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. Object Store–Swift Dashboard– Horizon Image– Glance Compute/Hyperviso r–Nova Storage–Cinder Network– Quantum-Neutron Identity– Keystone Façade Service API API API API API API
  • 11. OpenStack Profile Editor 11 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide.
  • 12. OpenStack Profile Editor 12 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide.
  • 13. Performance Data Virtual Infrastructure Container Metrics Provisioning/Config uration ManageIQ/ Cloud Forms Ansible API Manager Stack API HP HelionAPI VMWare VRealize Mirantis Fuel API API API is good, but comprehensive coverage is even better... 13www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. Dashboard Object Store Image Computer Storage Network Identity OpenStack API Legacy Network Servers ServersServers Storage Racks Applications, Infrastructure Checks SSH/SNMP/port get Unified View Standardized Data Collection
  • 14. Unified View Silo Tools • Impossible to correlate • Naming mismatch • Over-monitoring • Encourages “It’s not my problem, check your system” • No Big Picture dashboards 14 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. Need For • Stack monitoring bottom to top • Aliasing of names • Combine best of breed collectors • Correlated metrics across all infrastructure • Dashboard for each customer/client
  • 15. Stack monitoring for OpenStack 15 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. Function Source Host name Alias Network, Storage Netflow SNMP Dsw1-422 os-eng-h1 Hardware IPMI SNMP Drac-server1 os-eng-h1 Identity server check-mysql check-port server1:port os-eng-h1 Operating System check-proc check-mem check-load server1 os-eng-h1 OpenStack-API Hypervisor CPU/Mem metrics, Network storage 10.10.0.1 os-eng-h1 OpenStack API VM CPU/mem, Network, storage 10.10.10.123 eng-slicer-1 Operating system check-proc check-mem check-load eng-cent6-actg eng-slicer-1 DockerHost Memory, CPU eng-cent6-actg eng-slicer-1 Container Memory, CPU, Procs FAC3443DA77 eng-slicer-1 Application Check_https 172.28.102.51 eng-slicer-1 HYPERVISORVMAPP
  • 16. Automation/Continuous configuration discovery Adding virtual machines, containers is automatic… Monitoring is like a flight recorder—collecting all active data “Continuous configuration discovery” by… Synchronization of configuration over APIs Auto-registration of agents/Download of plugins Collection of data availability and performance 16 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. V C HYPERVISOR Management Monitoring System Configuration collector Data collector Plugins Dashboards D A T A C C V V V V V C so is the monitoring
  • 17. OpenStack Monitoring: Tools Selection • Fuel—OpenStack deployment • Murano—Application deployment • Fuel Plugin—Configuration generate • Fuel Plugin—Configure monitoring 17 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. OpenStack API Icinga2 setup OpenStack Linux OS Platform Hypervisor Hardware IPMI Port Agent OpenStack ControllerKeystone OpenStack StorageCinder Unified Monitoring View Data Management CollectorCollectorCollector Fuel Murano Application Software Catalog Icinga 2 Docker Node 2 Node 3 Docker VM 1 OPEN TSDB Icinga2 VM 1 VM 2 VM 3 VM 4 OPEN TSDB Docker
  • 18. Performance Data Monitoring Systems 18 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. Centralized Data Collector • Monitoring send Perf Data to single API Expandable Storage Cluster • Expand on demand Dashboard to visualize and drill down • Historic, raw performance data • Group alike metrics to find outliers Grafana Dashboards
  • 19. Reference Architecture Monitoring 19 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. Aliasing and data normalization in backend A P I Data Collector Hub Collector Collector Collector Collect Data Integration Normalization Aliasing A P I Integrate A P I Visualize Growing number of API’s require integration Hub
  • 20. Benefits Automation • Reduced cost of maintenance • Dynamic configuration • Streamlined operations Personnel • System Administrator and DevOps functions merging 20 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide. Capacity & Resource Planning • Complete Bottom-to-Top (Network-to-Application) data collection • Allocate resources at the right level Hardware • No vendor lock-in • Transparency through API
  • 21. Lessons learned… • Virtualization and containerization require new monitoring techniques • Legacy hardware/software will be around for awhile, so don’t throw tools away • Everything API needs integration • Aliasing is hard, but may be easier than cross silo cooperation • Use the best tool for the task • Don’t over-monitor to cover a gap 21 www.gwos.com ©2016 Groundwork Open Source, Inc. All rights reserved worldwide.
  • 22. Conclusion • Open source tools are the driving force for innovation • DevOps teams just select the best tools to do the job • A single monitoring tool to “rule them all” doesn’t exist • Integration is a complex task: don’t expect teams to agree on naming, process, and workflow • Automation and pragmatism will prevail, as DevOps has demonstrated • “Virtualize everything” will continue • Automation and continuous discovery are necessary for rapid scale-out • OpenStack is the open source virtualization platform, but monitoring coverage needs major improvements
  • 23. Thank you. Thomas Stocking [email protected]

Editor's Notes

  • #2: Thanks for attending Icinga day San Francisco! I’m happy to be here and give this little talk about OpenStack monitoring. I think this will take about 40-45 minutes, maybe less if I talk fast. I’ll be happy to take your questions at the end, so write them down or remember them, and ask away when we get there. I’ll try to keep it interesting.
  • #3: So, I work for GroundWork Open Source Inc. We are involved in the effort to make OpenStack more easily managed and monitored, and this presentation will show you a little more about that, and how the various tools fit in, and where. I’ll talk about how the space is evolving, and tell the story of monitoring, as it were. Things are changing, and with DevOps, automation, and a changing landscape, monitoring will have to change. We will go over some high-level concepts and ideas we have, and in conclusion share some of the things we have learned after 12 years in the space.
  • #4: Lots of companies are adopting OpenStack, for lots of reasons, mostly efficiency (more on that later). It’s for real, and growing. It’s open source, which really means that innovation is free to move ahead, and it is. OpenStack is innovative and flexible, and is getting a lot of attention. Its newness is driving more waves of change, and retooling for OpenStack management is under way. Geeks love it because it is new and shiny, but it’s gaining ground because it’s not just a geek tool, or a Silicon Valley fad. There are a lot of serious members, and corporate sponsors.
  • #5: Let me tell you a story about the old way we used to do deployments, maybe 10 years ago or more. This will be familiar to some of you, I’m sure. You want to deploy an app? Make a multi-tiered network. Put some servers into racks and plug them in to the top-of-rack switches. Load the apps on the servers. Add monitoring to each one. Add your SSH keys and plugins, and turn on SNMP. Lather, rinse, and repeat. You get a nice dashboard and email alerts. It works.
  • #6: But virtualization came along, for a lot of reasons. Ease of deployment, sure, but application isolation in VMs also made the apps more secure. A big driver was overprovisioning: in order to handle peaks, most pre-virtualization servers were averaging 15% of capacity. Afterwards this jumped to 35% (some claim more). It’s a big jump. Fundamentally, though, monitoring didn’t change much, except that you were checking VMs, so the results for CPU and memory were not exactly true when checked with plugins. An important addition was that the hypervisors were instrumented with APIs (like vSphere). These exposed some data on the hypervisor, and if you knew how, you could grab it for monitoring.
  • #7: But the story continued. Soon the network was virtual, and the switch fabric more and more became defined in software. We are moving to software-defined everything; even storage systems use virtualized disks, LUNs, etc. Cloud services started getting more widely used, and APIs were offered. MS deprecated their Azure API a while back, and are re-launching in a more hybrid-cloud way. We will see what happens! At this point, monitoring became blind in some areas. We could no longer get to what we wanted with SNMP and SSH with plugins. Also, things were getting deployed and redeployed too fast for the monitoring configuration to keep up. Something had to be done.
  • #8: This is another view of the landscape. Notice that the silo views have multiplied here. Not that it was a new problem, but the areas of coverage for monitoring are really fragmented now, with the hypervisor seeing a few layers, and the apps seeing the top. There is shared fog on the network side, since the old way of looking at traffic can’t really pick out flows and find problems when services (like storage) are increasingly virtualized, and so hit a network somewhere in the process. Everything was always connected to everything else, but now it’s more dependent.
  • #9: And just when you thought it was safe to go back in the water, there are containers floating around. Containers are interesting from a monitoring perspective: they are opaque from the host OS. You can’t see into them to find out what’s going on. In fact, you need another container to see the details, which is exactly what we did with Boxspy (it’s on GitHub if you need it). Containers are great for app isolation. They complicate monitoring the apps quite a bit. They are also great for dynamic provisioning: just spin up more or fewer of a given component, like app or web servers, or more cached systems, to scale up or down. The focus so far has been on provisioning, though, and monitoring is a low priority. The tools we use have to account for this situation, and the likely trends towards more micro-services and isolation.
  • #10: So, that’s a lot of things to deal with. But don’t panic. Let’s summarize: There are a lot of virtual stacks around these days, but all of them have APIs that you can use to access them. The flavor of the day is REST APIs, with JSON, so we had better learn how to speak REST. Pretty much no one uses SOAP any more, and does anyone even remember CORBA? XML-RPC is also considered really old by now. The trend is also strongly towards mobile-friendly web UIs, and that usually means JavaScript. This is the direction we are seeing the big players take, and through systems like OpenStack they are driving more and more into companies that we have as customers. We have to learn to speak REST and JSON.
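As a minimal sketch of what “speaking REST and JSON” means for a collector, the Python below turns a JSON response body into metric readings. The payload shape and the field names (`servers`, `status`, `cpu_util`) are hypothetical and not taken from any OpenStack API; a real collector would first fetch the body over HTTP from whichever endpoint it monitors.

```python
import json

# Hypothetical JSON body, shaped the way a monitoring-related REST endpoint
# might return it. The field names are assumptions for illustration only.
SAMPLE_BODY = """
{"servers": [
  {"name": "vm-1", "status": "ACTIVE", "cpu_util": 12.5},
  {"name": "vm-2", "status": "ERROR", "cpu_util": 0.0}
]}
"""


def parse_servers(body):
    """Return a list of (name, status, cpu_util) tuples from a JSON body."""
    doc = json.loads(body)
    return [(s["name"], s["status"], float(s["cpu_util"])) for s in doc["servers"]]


def unhealthy(readings):
    """Return the names of servers whose status is anything other than ACTIVE."""
    return [name for name, status, _ in readings if status != "ACTIVE"]
```

Because the parsing is separate from the transport, the same two functions work whether the body came from a live API call or from a saved sample.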
  • #11: So how does OpenStack fit in? Let’s go over how it works at a high level – deeper dives are the subject of much longer talks. We will be in Austin in April for those. OpenStack uses a service-oriented architecture. It splits off compute, network, storage, authentication, and authorization all into their own services. Using a pluggable model, you can replace services with other compatible components. For instance, you can swap out the default KVM hypervisor for VMware, or even Hyper-V if you want. On the hardware side, it’s completely transparent – the APIs make hardware vendor choices irrelevant to programming, as long as they are supported. A lot of hardware types are supported – the device vendors are on board, which accounts for a lot of the success of OpenStack overall, actually. The other important thing OpenStack does is break down resource allocation into small “elastic compute units”, or ECUs. These are used like quanta – you have a multiplier on a small unit to get a bigger unit. That makes the edges of these units align in memory, storage, transport, etc. across the whole system, and makes performance more predictable, and easier to optimize. Contrast this to VMware, where you can set the resource level you want for a VM, and then the hypervisor figures out how to deliver that to you when you need it. Wow, it sounds good! So why hasn’t OpenStack taken over the world yet? The reality is a little more complicated. This is the OpenStack architecture, and the black box services it contains. Each service has an API that allows control and visibility into the black box. The focus so far in development has been on provisioning (like with Docker), and monitoring is largely under-coded, and spread among the APIs. What we are working on at GroundWork is a monitoring façade API that gets all the metrics wherever they are and allows simple interaction with the OpenStack installation. We coded this, and are prepping it for incubation in the OpenStack project, but that process takes time and attention from the managers. In the meantime we put it in our commercial product, but ultimately we want to contribute it to OpenStack.
  • #12: Here are some counters and metrics we can pull from the façade.
  • #13: Also for VMs. We found that VMs (part of compute) were better instrumented for monitoring than other parts of the system. Let’s hope that the other managers put some emphasis on monitoring in future!
  • #14: OK, this is where it gets interesting. At a high level, when you are implementing OpenStack, you really aren’t doing it in a vacuum. There are always other components around that you will be monitoring in the normal, older way with SSH and plugins or SNMP, or other common techniques. There’s going to be a network that is not software-defined. SNMP is dead – long live SNMP – and unless you are at a startup which uses only OpenStack all the time (anyone?), there will be legacy systems running. Also, you are going to have the provisioning systems around, with tools like Mirantis, ManageIQ, Ansible, Chef, etc. You need to monitor those. In OpenStack itself, also, there are places you need to check that aren’t API-accessible – like, is the MySQL instance running? Are the ports for the APIs available? This is basic monitoring that makes sense outside the black boxes of OpenStack. And all of these systems will produce performance data that you will want to capture and store and compare. More on that in a minute. What we are talking about here from a high-level management perspective is a unified view – all the data from all the monitoring combined into one place to allow for analysis and alerting. To get that, we need to standardize the data.
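The basic checks mentioned in this note (is the database listening, are the API ports reachable) can be sketched as a plain TCP connect test. This is an illustration using only the Python standard library, not the plugin GroundWork or Icinga ships; the helper names `port_open` and `check_ports` are made up for the example.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports):
    """Map each port number to a status string, 'OK' or 'CRITICAL'."""
    return {p: "OK" if port_open(host, p) else "CRITICAL" for p in ports}
```

A call such as `check_ports("controller.example", [3306, 5000])` would cover a MySQL listener and an identity API port, assuming those services run on their commonly used ports; both the hostname and the port choice are assumptions to adapt to the actual installation.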
  • #15: You also have (in any environment) a plethora of monitoring tools. Usually these are what we call “silo tools” – specific to certain groups of admins, with certain functions like network, database, security, etc. This makes for all sorts of problems, from finger-pointing to over-monitoring. The biggest issue, in our opinion, is the lack of an overall big-picture dashboard, which allows you to see what’s going on from multiple angles, provided by various monitoring methods and tools. This unified monitoring dashboard has a feature we call “stack monitoring” – from bottom to top – which means that you get to see a given resource and all the status and metrics associated with it in a single view. The collection of the metrics can be done by the tool of your choice, be it an API call, Icinga, or any other tool, even one from a silo you don’t control. The stack allows you to correlate metrics across the whole infrastructure. It also allows you to create filtered sets of infrastructure that are of interest to individual clients (like exec, networking, security, support, etc.).
  • #16: If you are collecting and aggregating data from multiple systems, you will run into the problem of aliasing. For example, containers, VMs, and storage systems may all have different names for the same resource. Or hostnames under some OSes are capitalized (any guesses here?), while others are not. By far the biggest issue is agreement among humans. In our experience, as hard as it is to build an effective aliasing system, it’s much harder to convince people to use consistent host naming conventions. And some places have no conventions at all – one person we know named a stack of systems “who?”, “when?”, “where?”, and “are you crazy?”. The results, while funny, were far from productive! So here you see two examples. In the first, network interface data and NetFlow data combine with IPMI and ID server data, and someone even reported back results under an IPv4 address, but in the unified view we want it all under one device/hostname. The alias we report all of this data under is os-eng-h1, which is an OpenStack installation grouped under hypervisors. There’s a predictive alert here from IPMI saying that the power supply is apt to fail. Might be good to pay some attention to that soon. The second example is a Docker container, but we are bringing in data from Boxspy, from the API of the hypervisor where the VM Docker host is running, and from that VM itself, all to the alias of the container. So you can see that aliasing gives us the unification we want and need. Full auto-discovery and name alignment just doesn’t exist (yet). You need this kind of aliasing.
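The aliasing described in this note can be sketched as a lookup table that maps every name a source might report to one canonical name, after normalizing case. This is an illustrative sketch, not GroundWork’s implementation: the alias os-eng-h1 comes from the slide, while the fully qualified name, the IPv4 address, and the function names are hypothetical.

```python
# Hypothetical alias table: each name a data source may report, mapped to the
# canonical name used in the unified view. Keys are stored in lowercase.
ALIASES = {
    "os-eng-h1.example.com": "os-eng-h1",
    "10.0.0.21": "os-eng-h1",
}


def canonical(name, aliases=ALIASES):
    """Normalize case and whitespace, then resolve the name through the table."""
    key = name.strip().lower()
    return aliases.get(key, key)


def unify(readings, aliases=ALIASES):
    """Group (reported_name, metric, value) readings under canonical names."""
    view = {}
    for reported, metric, value in readings:
        view.setdefault(canonical(reported, aliases), {})[metric] = value
    return view
```

Lowercasing before the lookup handles the capitalized-hostname case without needing a table entry for every spelling; anything the table does not know simply keeps its normalized name.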
  • #17: But what about automation? We are definitely seeing a trend towards no manual anything as an ideal, and the closer we get in monitoring, the more time we can spend on important things like root cause analysis. So how do we deal with the fact that virtualization makes dynamic provisioning of VMs and containers really easy, and in fact, really useful? We want to monitor the newly added (and remove the newly deleted) systems right away. There’s no way we will do this by hand – there’s no time, and it’s going to happen at 3am, when no one will make any changes anyway. So we need a way to make things show up automatically in monitoring. We can spin up instances with agents installed. These will auto-register and start monitoring with plugins, and extra plugins can be automatically downloaded when needed. We can also collect the data from the API and start metric collection on the new VMs and containers automatically, so the data will quickly appear in the monitoring system. We call this process continuous configuration discovery.
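At its core, continuous configuration discovery is a repeated comparison between what the platform API reports and what monitoring already knows about. The sketch below shows that comparison only; how the inventory is fetched and how hosts are actually added or removed depends on the tools in use, and the function name `plan_changes` is an assumption for the example.

```python
def plan_changes(discovered, monitored):
    """Compare the instances reported by the platform with the monitored set.

    Returns the names to start monitoring and the names to stop monitoring,
    each as a sorted list so repeated runs give stable output.
    """
    discovered, monitored = set(discovered), set(monitored)
    return {
        "add": sorted(discovered - monitored),
        "remove": sorted(monitored - discovered),
    }
```

Running this on a short interval and applying the result keeps the monitoring configuration in step with provisioning, including the changes that happen at 3am.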
  • #18: Let’s take a look at some of the tools we can use to set up OpenStack for monitoring. In the context of deployment, you will need a way to gather all the particulars of the OpenStack installation together, deploy the installation, and then use that same tool to manage it and as a repository of the data needed to monitor it. We find Fuel (open source) to be excellent at this. It has the ability to fully validate the parameters of the install, including the problematic issues with networking like VLAN tagging, addressing schema, etc. that are the cause of most issues with new deployments. Once validated, you can deploy it with one click, and half an hour later have a guaranteed working install of OpenStack. You can then use another tool like Murano, which is also open source, and provides a repo of apps in VM form. You can then deploy your apps, including Icinga for monitoring if you like, and the agents you might want or need, like Boxspy to monitor containers, and OpenTSDB to collect performance data. Using Murano, you can be sure to have consistency with the versions you deploy in OpenStack. What about monitoring configuration? Well, Fuel has a lot of details you may need, like SSH keys, the addresses of database servers, and credentials you need for monitoring. Fuel has a lot of plugins, some of which allow you to grab this data out and use it to configure Icinga. Once you get that going, you can forward all the data from Icinga, the APIs, the agents, etc. to your unified view, and send the performance data to OpenTSDB.
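To illustrate the “configuration generation” step, the sketch below renders an Icinga 2 `Host` object definition from inventory data of the kind a deployment tool holds. It is not the Fuel plugin itself: the function name, the node name, the address, and the custom variable are hypothetical, and the `generic-host` template is assumed to exist, as it does in the sample configuration Icinga 2 ships with.

```python
def icinga_host(name, address, custom_vars=None):
    """Render an Icinga 2 Host object definition as configuration text."""
    lines = [
        f'object Host "{name}" {{',
        '  import "generic-host"',  # assumes this template is defined
        f'  address = "{address}"',
    ]
    # Custom variables become vars.<key> entries, sorted for stable output.
    for key, value in sorted((custom_vars or {}).items()):
        lines.append(f'  vars.{key} = "{value}"')
    lines.append("}")
    return "\n".join(lines)
```

For example, `icinga_host("node-2", "10.20.0.4", {"role": "compute"})` yields a definition that could be written to a file under the Icinga 2 configuration directory before a reload; values containing quotes would need escaping, which this sketch leaves out.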
  • #19: Let’s talk about OpenTSDB. It’s become more and more important to gather and store raw performance data as time series for graphing and correlation, and ultimately root cause analysis and pattern recognition. We have found that the combination of OpenTSDB and Grafana (both open source) is excellent for this. We even integrated them into GroundWork! They can provide a centralized data collector for performance data. OpenTSDB is very storage-efficient, and can be expanded on demand, and trimmed as needed. Grafana is a front end that allows you to make nice dashboards, ad hoc as well, and to find the outliers in the time series of metrics you store.
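Sending performance data to a single API can be sketched by building the JSON body that the OpenTSDB 2.x HTTP API accepts on its `/api/put` endpoint: a list of data points, each with a metric name, a timestamp, a value, and at least one tag. The metric name and tag below are examples, and the helper names are made up; the sketch only builds the request body and leaves the HTTP POST to whichever client the collector uses.

```python
import json
import time


def datapoint(metric, value, tags, timestamp=None):
    """Build one data point in the shape expected by OpenTSDB's /api/put."""
    if not tags:
        raise ValueError("OpenTSDB requires at least one tag per data point")
    when = time.time() if timestamp is None else timestamp
    return {
        "metric": metric,
        "timestamp": int(when),  # seconds since the epoch
        "value": value,
        "tags": dict(tags),
    }


def put_body(points):
    """Serialize a list of data points into a UTF-8 JSON request body."""
    return json.dumps(points).encode("utf-8")
```

The resulting bytes would be posted with a JSON content type to the TSD, for example `http://tsdb.example:4242/api/put`, where the hostname is hypothetical and 4242 is assumed to be the port the TSD listens on.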
  • #20: So, to summarize again, we have built up an architecture for monitoring OpenStack and other similar systems. Using tools of choice, which have APIs we can use and query, we collect the metrics and compare them to configured thresholds to create status data. We then send this to our REST API internally, normalize it, apply aliases, and present it in a UI (through another API, actually). The performance data follows a similar path, and ends up where we point it – we can send it to RRDs, or to OpenTSDB. The good part of this design is that it is easy to write and update the collectors. If the data we want and need is available in a REST API, one collector looks much like another, and can quickly be connected to the collector hub. We just recently finished the first draft of the Icinga connector, for example, so that Icinga data can be easily integrated into the GroundWork product. Using this approach, we get true unified monitoring of OpenStack.
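The collector step described in this note, comparing a metric to configured thresholds to create status data and then normalizing it for the hub, can be sketched as follows. The status names follow the OK/WARNING/CRITICAL convention used by Nagios-style plugins; the record layout and function names are assumptions, not the GroundWork REST API format.

```python
def status_for(value, warning, critical):
    """Classify a metric where higher values are worse."""
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"


def normalize(host, service, value, warning, critical):
    """Build one normalized status record ready to send to a collector hub."""
    return {
        "host": host.strip().lower(),  # consistent casing across collectors
        "service": service,
        "value": value,
        "status": status_for(value, warning, critical),
    }
```

Since every collector emits the same record shape, the hub can apply aliases and feed the UI without knowing which tool produced the reading.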
  • #21: Let’s go over the advantages and benefits of this approach. Automation reduces cost of maintenance and improves quality of data, especially in dynamic configuration scenarios. It leads to streamlined operations, which is good, since operations staff are getting trimmed while infrastructures grow and become more complex. In terms of hardware, you avoid lock-in by using Linux and OpenStack for everything, and you gain visibility into the hardware you use for OpenStack via the API, in terms of configuration, status, and performance. With stack monitoring you can reduce finger-pointing and over-monitoring, and gain a view of your applications that is complete, bottom to top and front to back. This helps you optimize resource allocation and put the resources in the right place, at the right level. Finally, we see more and more that the roles of DevOps and sysadmins are merging (more ops and troubleshooting in DevOps). This is part of a trend we see towards more systems per person, even as high as 1000 to 1 in some cases.
  • #22: So the conclusions we have reached here actually take two slides to explain. Sorry for the repetition… First of all, we need to change our tools. Old techniques still have a role, but leave us blind – we need to have a new approach. Don’t throw away the old tools. Integrate everything with an API. It’s the only way to keep up. Aliasing is easier than arguing. Never underestimate the human capacity to delay and deny! Pick the best tool for the task – no one tool can do everything, and you need the specialization. Don’t overlap your monitoring tools. Monitoring is expensive, and you are wasting more than just bandwidth. People need to respond to alerts, even if they automate configuring them.
  • #23: Open source tools are still driving innovation. We will see advances in open source before we see them in proprietary code, usually. Part of DevOps is to automate, and part of automation means having tools that respond to automation. Some will be proprietary, some open, but all should be possible to configure and monitor, preferably via a REST API! You will not find one tool to rule them all. If you do, someone will install something it can’t monitor within 15 minutes. This is complicated stuff. Don’t expect agreement, especially on names. We see some trends here: Pragmatic approaches will rule, as DevOps shows. More and more virtualization and app isolation will happen. Again, automation and continuous configuration discovery are needed to make it possible to keep up with dynamic changes. OpenStack has feet of clay. It’s real, and will evolve, but it still needs work to get fully instrumented, even with the architecture we are advocating here. We do think that this is the right approach, and we will keep working on developing the façade, as well as the connectors, as the APIs mature and add more monitoring metrics.