Closed Loop Automation
Telemetry/Analytics-ML/Orchestration
Q3 2019
Emma Collins/Tong Zhang
Legal Disclaimer
General Disclaimer:
© Copyright 2019 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, the Intel Inside logo, Intel.
Experience What’s Inside are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and
brands may be claimed as the property of others.
Technology Disclaimer:
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software
or service activation. Performance varies depending on system configuration. No computer system can be absolutely
secure. Check with your system manufacturer or retailer or learn more at [intel.com].
Performance Disclaimers:
Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified
circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does
not guarantee any costs or cost reduction.
Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and
provided to you for informational purposes. Any differences in your system hardware, software or configuration may
affect your actual performance.
Scale Efficiency with Data-Driven, Closed Loop Automation
Intel Platform Features are part of intelligent, closed loop solutions
that are reactive, proactive and predictive, delivering new levels of
efficiency for IT and network infrastructure.
Automated Action Telemetry Analysis
Software and Services
Telemetry
IA Platform Telemetry
Fine-grained Hardware and software insights feeding operational intelligence and automation
Intel's Ingredients for Closed Loop Automation
Analytics
• Intel driven Analytics solutions using IA feature data
OR
• Integrate with proprietary/commercial network monitoring and analytics solutions
Orchestration
• Intel driven MANO solutions to provide scale/heal/placement actions that incorporate IA features
OR
• Integrate with proprietary/commercial MANO solutions
IA Platform Telemetry
IA Feature Metrics and
Events Exposure
IA Feature Detection and
Provisioning
Power PMU RDT RAS Other…
The Closed Loop
Intel Platform
Intel telemetry collection and publication
Platform
Compute Networking Memory Storage Acceleration
Collectd South Bound Plugins
Collectd
Collectd North Bound Plugins
OpenStack
MANO
Platform/
NFVI
Monitoring /Analytics
Systems
Telemetry Publication
Telemetry Consumption
Kubernetes
ONAP
Telemetry Collection
Application
Intel Telemetry Coverage
Compute Network Storage
Hypervisor
NFVI
Virtualised Compute
Virtualised Network
Virtualised Storage
Collectd
PMU
counters
NIC counters
vSwitch
counters
Common / Standard Open APIs
VM Stall
Detection/
RT Stall Detection
Enterprise and Network Management Tools
RAS
Hypervisor/Container
Counters
Intel® Node Manager
Open Platform
Collector
Intel® Run Sure Technology
MCA* PCIe AER
Resilient System Technology
Resilient Memory Technology
SDDC DDDC+1 Mirroring
RAID
Intel® Rapid Storage Technology
Intel® Management Engine
IPMI
Intel® RDT: CMT, CAT, MBM, CDP
POWER
VIM
Intel® Infrastructure Management Technologies
Redfish SYSLOG Kafka SNMP API VES Plugin Prometheus
OpenStack Kubernetes
Tying the Intel Platform to the Customer Business Case
Customer Business Use Case
Platform
Slicing
(Part of 5G Network
Slicing)
Platform
Resiliency
Security
(side channel attacks)
Power
Optimisation
Intel Platform
Power PMU RDT RAS Other…
MANO
Analytics
Machine Learning Will Complete the Evolution
Learn
Watch Decide
Collect
Act
App
Server/Storage/Network
Watch
Expose Platform and
App data through
APIs
Act
Mechanism for
Policy Activation
and Enforcement
Learn
Machine Learning
for continuous
improvement
Decide
Analytics to define
correct response
Moving From Automation to Self-Optimization
Taking the manpower out of networking, cloud…
Delivering Closed Loop Automation for NFV
Automated
Action
IA Platform
Telemetry
Telemetry
Analysis
1. Enable IA Service Assurance Telemetry
through instrumentation and exposure in
an industry standard manner
2. Enable IA Telemetry Analytics through
Telemetry Compaction, KPI identification
and prediction
3. Enable Closed Loop Automation through
Orchestration enabling and industry
proof points
Case study: ML for NFV service assurance
• Efficient dynamic network management is one major challenge for NFV
• Machine learning plays an important role in addressing this challenge by
analyzing gathered data for various purposes:
• dynamic resource allocation
• security threats alert
• performance degradation detection
• demand prediction
“Cognet: A Network Management Architecture Featuring Cognitive Capabilities,” Proc. Euro. Conf. Networks and Commun., June 2016
1. Data pre-processing
Feature Engineering/Reduction
2. KPI prediction/forecasting
Regression/Classification
3. Closed loop optimization
Reinforcement Learning
NFV test systems
vEPC – virtual Evolved Packet Core
vCMTS – virtual Cable Modem Termination System
• Telemetry data dumped through Collectd includes CPU, PMU, Memory, Load, etc.
• Hundreds to thousands of telemetry metrics in total, sampled at a configured interval (1 s or 10 s)
• Target KPI (Key Performance Indicator): packet drop rate
Data pre-processing
• Data filtering – remove irrelevant data (e.g. control plane data for this case)
→ 1065 features remaining
• Data alignment and interpolation
• Feature selection – remove features with no change over time
→ 726 features remaining
• Data normalization
• Data splitting to training, validation and testing sets
→ Tens of thousands of samples split into 8:1:1
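The selection, normalization and splitting steps above can be sketched in a few lines of NumPy. This is only an illustration on assumed inputs: the function name `preprocess` is hypothetical, min-max scaling and a chronological (unshuffled) split are assumptions the slide does not specify, and data filtering, alignment and interpolation are left out.

```python
import numpy as np

def preprocess(X, y):
    """Drop constant features, min-max normalize, and split 8:1:1 in time order."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Feature selection: remove features with no change over time.
    keep = X.max(axis=0) != X.min(axis=0)
    X = X[:, keep]

    # Split into training, validation and testing sets (8:1:1).
    # Assumption: samples are kept in time order rather than shuffled.
    n = len(X)
    n_train, n_val = n * 8 // 10, n // 10
    train = slice(0, n_train)
    val = slice(n_train, n_train + n_val)
    test = slice(n_train + n_val, n)

    # Normalization: scale to [0, 1] using statistics of the training set only.
    lo = X[train].min(axis=0)
    span = X[train].max(axis=0) - lo
    span[span == 0] = 1.0  # a feature may be constant within the training set
    Xn = (X - lo) / span

    return (Xn[train], y[train]), (Xn[val], y[val]), (Xn[test], y[test]), keep
```

Fitting the scaling on the training split alone avoids leaking validation or test statistics into training; values in the other two splits can therefore fall slightly outside [0, 1].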
Telemetry feature compaction
• Feature Selection
• The process of selecting a subset of relevant features for use in model construction
• Filter methods – select features based on scoring from statistical measures, e.g. SelectKBest
• Wrapper methods – search for the feature combination that gives the best predictive results, e.g. Recursive Feature Elimination, Boruta
• Unsupervised learning methods – group features that behave similarly, e.g. FeatureAgglomeration
• Feature Transformation (dimension reduction)
• Convert the feature vector into a lower-dimensional space with learned transformations
• Supervised: PLS (Partial Least Squares), CCA (Canonical Correlation Analysis), LDA (Linear Discriminant Analysis)
• Unsupervised: PCA (Principal Component Analysis)
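As a rough illustration of the method families listed above, the sketch below applies one representative of each to synthetic data using scikit-learn. The data, the choice of `f_regression` as the scoring function, `LinearRegression` as the wrapped estimator and the target size `k = 5` are all assumptions made for the example, not details from the slides.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for telemetry: 200 samples, 30 features, one KPI.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=200)

k = 5  # hypothetical compacted feature count

# Filter method: score each feature independently against the KPI.
X_filter = SelectKBest(score_func=f_regression, k=k).fit_transform(X, y)

# Wrapper method: repeatedly fit a model and drop the weakest feature.
X_wrapper = RFE(LinearRegression(), n_features_to_select=k).fit_transform(X, y)

# Unsupervised grouping: cluster similar features and pool each cluster.
X_agglo = FeatureAgglomeration(n_clusters=k).fit_transform(X)

# Unsupervised transformation: project onto the top principal components.
X_pca = PCA(n_components=k).fit_transform(X)

print(X_filter.shape, X_wrapper.shape, X_agglo.shape, X_pca.shape)
```

The filter and wrapper outputs keep original telemetry columns, so the collector can stop sampling the rest; the agglomeration and PCA outputs are combinations of all inputs, so every original metric still has to be collected.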
Top Telemetry Features Selected By ML
Recursive feature elimination stepping down the target from 40 → 20 → 10
10 20 40 Feature 10 20 40 Feature 10 20 40 Feature
• cpu_value_idle_18 • • intel_rdt_value_bytes_llc_20 • • • intel_rdt_value_memory_bandwidth_local_12
• cpu_value_idle_49 • • intel_rdt_value_bytes_llc_27 • intel_rdt_value_memory_bandwidth_local_3
• • cpu_value_interrupt_18 • intel_rdt_value_bytes_llc_41 • intel_rdt_value_memory_bandwidth_local_33
• cpu_value_interrupt_20 • • intel_rdt_value_bytes_llc_50 • intel_rdt_value_memory_bandwidth_local_53
• cpu_value_system_39 • intel_rdt_value_bytes_llc_53 • • intel_rdt_value_memory_bandwidth_local_9
• • cpu_value_user_13 • • intel_rdt_value_bytes_llc_6 • intel_rdt_value_memory_bandwidth_remote_2
• cpu_value_user_3 • intel_rdt_value_bytes_llc_8 • intel_rdt_value_memory_bandwidth_remote_6
• cpu_value_user_46 • intel_rdt_value_ipc_nan_22 • • ipmi_value_fanspeed_System Fan 1 fan_cooling (29.1)
• cpu_value_user_9 • • intel_rdt_value_ipc_nan_23 • ipmi_value_temperature_HSBP 1 Temp drive_backplane (15.1)
• df_value_free_etc-hosts • intel_rdt_value_ipc_nan_26 • ipmi_value_temperature_LAN NICTemp system_board (7.1)
• df_value_used_etc-hosts • intel_rdt_value_ipc_nan_5 • ipmi_value_temperature_P2 DTS Therm Mgn processor (3.2)
• disk_read_disk_time_sda • • intel_rdt_value_ipc_nan_50 • • • irq_value_TLB
• intel_pmu_value_branches_42 • intel_rdt_value_ipc_nan_53 • • • load_longterm
• intel_pmu_value_instructions_14 • intel_rdt_value_ipc_nan_54 • • • load_midterm
• intel_pmu_value_page-faults_51 • intel_rdt_value_ipc_nan_9 • load_shortterm
• intel_pmu_value_page-faults_all • numa_value_other_node_node1 • • • memory_value_cached
• • memory_value_slab_unrecl
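One way to read the stepping-down procedure is to run recursive feature elimination once per target size. The sketch below does that with scikit-learn's `RFE` on synthetic data; the data, the wrapped `LinearRegression` estimator and the `selected` dictionary are assumptions for illustration, and the slides do not say which estimator was wrapped.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for telemetry: 300 samples, 60 features.
# Only the first five features actually drive the KPI.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 60))
y = X[:, :5] @ np.array([3.0, 2.0, 1.5, 1.0, 0.5]) + 0.1 * rng.normal(size=300)

selected = {}
for target in (40, 20, 10):
    # step=1 removes the single weakest feature per iteration.
    rfe = RFE(LinearRegression(), n_features_to_select=target, step=1).fit(X, y)
    selected[target] = set(np.flatnonzero(rfe.support_).tolist())

for target, columns in selected.items():
    print(target, sorted(columns)[:10])
```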
Feature selection results
• Applying various feature selection algorithms to reduce the number of telemetry metrics sampled without compromising the prediction accuracy
• A smaller set of telemetry data saves training/inference time
• vEPC test data set with original feature dimension ~400
• GradientBoostRegressor algorithm used for packet loss rate prediction
• Select the top features from the feature importance output of GradientBoostRegressor
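A minimal sketch of importance-based selection follows, assuming that the slide's GradientBoostRegressor refers to scikit-learn's `GradientBoostingRegressor`. The synthetic data, the 400/100 train/test split and the choice of the top 10 features are illustrative assumptions, not the vEPC data set or the authors' settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Synthetic stand-in: 40 features, only columns 3 and 7 drive the KPI.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))
y = 3.0 * X[:, 3] + 2.0 * X[:, 7] + 0.1 * rng.normal(size=500)
X_train, X_test, y_train, y_test = X[:400], X[400:], y[:400], y[400:]

# Fit on all features and rank them by the model's feature importance.
full = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
top = np.argsort(full.feature_importances_)[::-1][:10]

# Refit on the compacted feature set and compare prediction error.
compact = GradientBoostingRegressor(random_state=0).fit(X_train[:, top], y_train)
mse_full = mean_squared_error(y_test, full.predict(X_test))
mse_compact = mean_squared_error(y_test, compact.predict(X_test[:, top]))
print(f"MSE, all features: {mse_full:.4f}; MSE, top 10: {mse_compact:.4f}")
```

Comparing the two errors is the check implied by "without compromising the prediction accuracy": the compacted model is acceptable when its error stays close to that of the full model.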
KPI prediction & forecasting – supervised learning
• Regression
• Train the model to predict target KPI using reduced telemetry data samples
• Example: Packet drop rate
• Accuracy Measured by MSE (Mean Squared Error) – the smaller the better
• Classification
• Detect packet drop from telemetry data
• Measurement
• Accuracy_score = correct_prediction / total_sample
• Precision/Recall/F1-score, etc.
• Time Series Forecasting
• Predict future KPI value based on historical telemetry data and observed KPI trend
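The measures named above can be written out directly. The helper names `mse` and `classification_scores` below are hypothetical, and the convention of returning 0.0 when a denominator is zero is an assumption for the example.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error for regression; smaller is better."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def classification_scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary packet-drop detection."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))    # drops correctly detected
    fp = int(np.sum(~y_true & y_pred))   # false alarms
    fn = int(np.sum(y_true & ~y_pred))   # missed drops
    accuracy = float(np.mean(y_true == y_pred))  # correct_prediction / total_sample
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

When packet drops are rare, accuracy alone can look high for a model that never predicts a drop, which is why precision, recall and F1 are listed alongside it.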
KPI forecasting using LSTM
• vCMTS downlink test data
• KPI – scheduling packet loss rate
• Input: previous 60 seconds of 20 selected telemetry metrics + KPI
• Output: predicted future value (5 seconds later) of target KPI
• TensorFlow BasicLSTMCell: two layers, each layer 150 neurons
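The input/output framing above can be sketched as a windowing step that turns a telemetry time series into training pairs for a sequence model. The function name `make_windows` is hypothetical, and the sketch assumes a 1 s sampling interval so that 60 seconds correspond to 60 samples; the LSTM itself is not shown.

```python
import numpy as np

def make_windows(features, kpi, history=60, horizon=5):
    """Build (input window, future KPI) pairs.

    features: array of shape (T, F) holding the selected telemetry metrics
    kpi:      array of shape (T,) holding the observed KPI
    Returns X of shape (N, history, F + 1) and y of shape (N,).
    """
    features = np.asarray(features, dtype=float)
    kpi = np.asarray(kpi, dtype=float)
    series = np.column_stack([features, kpi])   # telemetry + KPI per time step
    n = len(series) - history - horizon + 1     # number of complete pairs
    if n <= 0:
        raise ValueError("not enough samples for a single window")
    X = np.stack([series[i:i + history] for i in range(n)])
    # Target: the KPI value `horizon` steps after the last sample of each window.
    y = kpi[history + horizon - 1:]
    return X, y
```

With 20 selected metrics plus the KPI, each input has shape (60, 21), which matches the (time steps, features) layout that recurrent layers generally expect.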
Closed-loop Automation
Xeon
vCMTS - 0
Traffic Generator
vCMTS - 0
VNFs
NIC
Machine Learning
Modules
collectd
InfluxDB
Traffic
forecasting
Reinforcement
learning
HW
Optimization
(RDT, DVFS, WL
consolidation, etc.)
Compacted
metrics
Using ML:
• Track/forecast workload, performance
• Dynamically adjust resource allocation
Benefits:
• Reduced TCO through power saving and increased HW utilization
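The loop described above (collect metrics, forecast the workload, adjust the hardware) can be sketched as follows. This is a deliberately simplified stand-in: a naive trend forecast replaces the traffic forecasting model, a threshold rule replaces the reinforcement learning agent, and the frequency steps, capacities and headroom factor are hypothetical values, not measurements from the slides.

```python
# Hypothetical frequency steps and the traffic each step can sustain.
FREQ_STEPS_MHZ = [1200, 1800, 2400]
CAPACITY_GBPS = {1200: 2.0, 1800: 4.0, 2400: 6.0}

def forecast(history):
    """Naive forecast: last value plus the most recent trend."""
    if len(history) < 2:
        return history[-1]
    return max(0.0, history[-1] + (history[-1] - history[-2]))

def decide(predicted_gbps, headroom=1.2):
    """Pick the lowest frequency whose capacity covers the forecast plus headroom."""
    for freq in FREQ_STEPS_MHZ:
        if CAPACITY_GBPS[freq] >= predicted_gbps * headroom:
            return freq
    return FREQ_STEPS_MHZ[-1]

def run_loop(traffic_gbps):
    """Watch, decide and act once per telemetry sample; return the chosen frequencies."""
    history, actions = [], []
    for sample in traffic_gbps:
        history.append(sample)              # watch: new telemetry sample
        predicted = forecast(history)       # decide: forecast the workload
        actions.append(decide(predicted))   # act: frequency to apply next
    return actions

print(run_loop([1.0, 1.0, 2.0, 3.0, 3.0, 1.0]))
# prints [1200, 1200, 1800, 2400, 1800, 1200]
```

Choosing the lowest frequency that still covers the forecast is where the power saving would come from; a learned policy would replace `decide` while the surrounding loop stays the same.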
Frequency tuning
ENI PoC – Network Slice Lifecycle Management
AI-based predictor:
• For generating new scale up/down and converting the intent to suggested configuration
• LSTM is used for traffic prediction
TNSM:
• Provides underlay network control to satisfy the network slice requests
• FlexE and a FlexE-based optimization algorithm are used for underlay network slice creation and modification
CNSM:
• Provides core network control to satisfy the network slice requests
ETSI ISG ENI – Experiential Networked Intelligence
Traffic prediction for resource optimization
Blue: actual traffic
Orange: predicted traffic
AI system architecture
Source:
Intel Confidential
Backup
Further Resources
Learn more from these helpful sites:
https://networkbuilders.intel.com/network-technologies/serviceassurance
https://wiki.opnfv.org/display/fastpath/Barometer+Home
https://wiki.openstack.org/wiki/Telemetry
https://01.org/openstack/blogs/2015/openstack-enhanced-platform-awareness-white-paper
Collectd 101 materials
• Collectd 101
• https://wiki.opnfv.org/display/fastpath/Collectd+101
• Write simple read plugin
• https://wiki.opnfv.org/display/fastpath/Collectd+how+to+implement+a+simple+plugin
OPNFV Barometer – Intel platform feature plugins
Barometer Strategy:
• Ensure platform metrics/events are accessible through open industry standard interfaces.
• Demonstrate IA platform technologies can be monitored, consumed and actioned in real time
One Click Install:
• Easy install/configuration for customers
• One command to install Collectd/InfluxDB/Grafana
Three container approach for Collectd:
• Stable Container: latest stable branch
• Master Container: up to date with master
• Experimental Container: cherry pick features of interest
Barometer Links
Barometer Home: https://wiki.opnfv.org/display/fastpath/Barometer+Home
Metrics/Events through Barometer (not on Collectd site): https://wiki.opnfv.org/display/fastpath/Collectd+Metrics+and+Events#CollectdMetricsandEvents-Metrics
Barometer “One-click” install: https://wiki.opnfv.org/display/fastpath/One+Click+Install+of+Barometer+Containers

Closed Loop Platform Automation - Tong Zhong & Emma Collins

  • 2. 2 Legal Disclaimer General Disclaimer: © Copyright 2019 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, the Intel Inside logo, Intel. Experience What’s Inside are trademarks of Intel. Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. Technology Disclaimer: Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at [intel.com]. Performance Disclaimers: Cost reduction scenarios described are intended as examples of how a given Intel- based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.
  • 3. ScaleEfficiencywithData-Driven,ClosedLoopAutomation Intel Platform Features are part of intelligent, closed loop solutions that are reactive, proactive and predictive, delivering new levels of efficiency for IT and network infrastructure. Automated Action Telemetry Analysis Software and Services Telemetry IA Platform Telemetry Fine-grained Hardware and software insights feeding operational intelligence and automation
  • 4. Intel's Ingredients for Closed Loop Automation  Intel driven Analytics solutions using IA feature data OR  Integrate with proprietary/commercial network monitoring and analytics solutionsOrchestration IA Platform Telemetry Analytics  Intel driven MANO solutions to provide scale/heal/placement actions that incorporate IA features OR  Integrate with proprietary/commercial MANO solutions IA Feature Metrics and Events Exposure IA Feature Detection and Provisioning Power PMU RDT RAS Other… The Closed Loop Intel Platform
  • 5. Inteltelemetrycollectionandpublication Platform Compute Networking Memory Storage Acceleration Collectd South Bound Plugins Collectd Collectd North Bound Plugins Open stack MANO Platform/ NFVI Monitoring /Analytics Systems Telemetry Publication Telemetry Consumption Kubernetes ONAP Telemetry Collection Application
  • 6. 6 IntelTelemetryCoverage Compute Network Storage Hypervisor NFVIVirtualised Compute Virtualised Network Virtualised Storage Collectd PMU counters NIC counters vSwitch counters Common / Standard Open APIs VM Stall Detection/ RT Stall Detection Enterprise and Network Management Tools RAS Hypervisor/Container Counters Intel® Node Manager Open Platform Collector Intel® Run Sure Technology MCA* PCIe AER Resilient System Technology Resilient Memory Technology SDDC DDDC+1 Mirroring RAID Intel® Rapid Storage Technology Intel® Management Engine IPMI C M T Intel® RDT C A T M B M C D P P O W E R VIM Intel® Infrastructure Management Technologies Redfish SYSLOG KafkaSNMP API VES Plugin Prometheus OpenStack Kubernetes
  • 7. TyingtheIntelPlatformtotheCustomerBusinessCase Customer Business Use Case Platform Slicing (Part of 5G Network Slicing) Platform Resiliency Security (side channel attacks) Power Optimisation Intel Platform Power PMU RDT RAS Other… MANO Analytics
  • 8. MachineLearningWillCompletetheEvolution Learn Learn Learn Learn Watch Decide Collect Act App Server/Storage/Network Watch Expose Platform and App data through APIs Act Mechanism for Policy Activation and Enforcement Learn Machine Learning for continuous improvement Decide Analytics to define correct response Moving From Automation to Self-OptimizationTaking the manpower out of networking, cloud…
  • 9. DeliveringClosedLoopAutomationforNFV Automated Action IA Platform Telemetry Telemetry Analysis 1. Enable IA Service Assurance Telemetry through instrumentation and exposure in an industry standard manner 2. Enable IA Telemetry Analytics through Telemetry Compaction, KPI identification and prediction 3. Enable Closed Loop Automation through Orchestration enabling and industry proof points
  • 10. Casestudy:MLforNFVserviceassurance 10 • Efficient dynamic network management is one major challenge for NFV • Machine learning plays an important role in addressing this challenge by analyzing gathered data for various purposes: • dynamic resource allocation • security threats alert • performance degradation detection • demand prediction “Cognet: A Network Management Architecture Featuring Cognitive Capabilities,” Proc. Euro. Conf. Networks and Commun., June 2016 1. Data pre-processing Feature Engineering/Reduction 2. KPI prediction/forecasting Regression/Classification 3. Closed loop optimization Reinforcement Learning
  • 11. NFVtestsystems 11 vEPC – virtual Evolved Packet Core vCMTS – virtual Cable Modem Termination System • Telemetry data dumped through Collectd includes CPU, PMU, Memory, Load, etc. • Total number of telemetry data hundreds to thousands sampled at configured interval (1s or 10s) • Target KPI (Key Performance Indicator): packet drop rate
  • 12. Datapre-processing 12 • Data filtering – remove irrelevant data (e.g. control plane data for this case)  1065 features remaining • Data alignment and interpolation • Feature selection – remove features with no change over time  726 features remaining • Data normalization • Data splitting to training, validation and testing sets  Tens of thousands of samples split into 8:1:1
  • 13. Telemetryfeaturecompaction • Feature Selection • The process of selecting a subset of relevant features for use in the model construction • Filter methods – Select features based on scoring from statistical measures e.g. SelectKBest • Wrapper methods – search for optimal feature combination that results in best predictive results e.g. Recursive Feature Elimination, Boruta • Unsupervised learning methods – group features that behave similarly e.g. FeatureAgglomeration • Feature Transformation (dimension reduction) • Convert the feature vector into lower dimension space with learned transformations • Supervised: PLS (Partial Least Squares), CCA (Canonical Correlation Analysis), LDA (Linear Discriminant Analysis) • Unsupervised: PCA (Principal Component Analysis) 13
  • 14. 10 20 40 Feature 10 20 40 Feature 10 20 40 Feature • cpu_value_idle_18 • • intel_rdt_value_bytes_llc_20 • • • intel_rdt_value_memory_bandwidth_local_12 • cpu_value_idle_49 • • intel_rdt_value_bytes_llc_27 • intel_rdt_value_memory_bandwidth_local_3 • • cpu_value_interrupt_18 • intel_rdt_value_bytes_llc_41 • intel_rdt_value_memory_bandwidth_local_33 • cpu_value_interrupt_20 • • intel_rdt_value_bytes_llc_50 • intel_rdt_value_memory_bandwidth_local_53 • cpu_value_system_39 • intel_rdt_value_bytes_llc_53 • • intel_rdt_value_memory_bandwidth_local_9 • • cpu_value_user_13 • • intel_rdt_value_bytes_llc_6 • intel_rdt_value_memory_bandwidth_remote_2 • cpu_value_user_3 • intel_rdt_value_bytes_llc_8 • intel_rdt_value_memory_bandwidth_remote_6 • cpu_value_user_46 • intel_rdt_value_ipc_nan_22 • • ipmi_value_fanspeed_System Fan 1 fan_cooling (29.1) • cpu_value_user_9 • • intel_rdt_value_ipc_nan_23 • ipmi_value_temperature_HSBP 1 Temp drive_backplane (15.1) • df_value_free_etc-hosts • intel_rdt_value_ipc_nan_26 • ipmi_value_temperature_LAN NICTemp system_board (7.1) • df_value_used_etc-hosts • intel_rdt_value_ipc_nan_5 • ipmi_value_temperature_P2 DTS Therm Mgn processor (3.2) • disk_read_disk_time_sda • • intel_rdt_value_ipc_nan_50 • • • irq_value_TLB • intel_pmu_value_branches_42 • intel_rdt_value_ipc_nan_53 • • • load_longterm • intel_pmu_value_instructions_14 • intel_rdt_value_ipc_nan_54 • • • load_midterm • intel_pmu_value_page-faults_51 • intel_rdt_value_ipc_nan_9 • load_shortterm • intel_pmu_value_page-faults_all • numa_value_other_node_node1 • • • memory_value_cached • • memory_value_slab_unrecl Top TopTop TopTelemetryFeaturesSelectedByML 14 Recursive feature elimination stepping down the target from 40  20  10
  • 15. Feature selection results
    • Applying various feature selection algorithms to reduce the amount of telemetry data sampled without compromising prediction accuracy
    • A smaller set of telemetry data saves training/inference time
    • vEPC test data set with an original feature dimension of ~400
    • GradientBoostRegressor algorithm used for packet loss rate prediction
    • Select the top N features from the feature importance output of GradientBoostRegressor
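The last step on this slide, picking the top N features from a model's feature-importance output, can be sketched as below. This is only an illustration: the importance values are invented, the helper name `top_n_features` is hypothetical, and the slide does not say which library produced the scores (scikit-learn's GradientBoostingRegressor, for example, exposes them as `feature_importances_`).

```python
def top_n_features(importances, n):
    """Return the n highest-importance feature names and the share of
    total importance that those features retain."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:n]
    total = sum(importances.values())
    retained = sum(score for _, score in kept) / total if total else 0.0
    return [name for name, _ in kept], retained


# Invented importance scores for a handful of telemetry features.
importances = {
    "intel_rdt_value_bytes_llc_20": 0.40,
    "cpu_value_user_13": 0.25,
    "load_midterm": 0.20,
    "memory_value_cached": 0.10,
    "irq_value_TLB": 0.05,
}

names, retained = top_n_features(importances, n=2)
print(names, round(retained, 2))
```

Tracking the retained share alongside the names gives a quick sanity check on how much of the model's signal survives each reduction step.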
  • 16. KPI prediction & forecasting – supervised learning
    • Regression
      • Train the model to predict the target KPI using reduced telemetry data samples
      • Example: packet drop rate
      • Accuracy measured by MSE (Mean Squared Error): the smaller, the better
    • Classification
      • Detect packet drop from telemetry data
      • Measurement
        • Accuracy_score = correct_prediction / total_sample
        • Precision/Recall/F1-score, etc.
    • Time Series Forecasting
      • Predict the future KPI value based on historical telemetry data and the observed KPI trend
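The measurements named on this slide can be written out directly. The sketch below computes MSE for the regression case and accuracy, precision, recall and F1 for a binary "packet drop / no packet drop" classifier; the label encoding (1 = drop) and all sample values are assumptions made for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; smaller is better."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = packet drop)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),  # correct_prediction / total_sample
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical regression output: predicted vs. observed packet drop rate.
mse = mean_squared_error([0.10, 0.20, 0.30], [0.10, 0.25, 0.20])

# Hypothetical classifier output: 1 = packet drop detected.
metrics = classification_metrics(
    y_true=[1, 0, 1, 1, 0, 0, 1, 0],
    y_pred=[1, 0, 0, 1, 0, 1, 1, 0],
)
print(mse, metrics)
```

Accuracy alone can mislead when packet drops are rare, which is one reason to report precision and recall alongside it.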
  • 17. KPI forecasting using LSTM
    • vCMTS downlink test data
    • KPI – scheduling packet loss rate
    • Input: previous 60 seconds of 20 selected telemetry data + KPI
    • Output: predicted future value (5 seconds later) of target KPI
    • TensorFlow BasicLSTMCell: two layers, each layer 150 neurons
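The input/output framing on this slide amounts to a sliding-window dataset. The sketch below builds such windows in plain Python, assuming one sample per second (the slide gives durations, not a sampling rate); the helper name `make_windows` and the synthetic data are hypothetical, and the LSTM itself is left out.

```python
def make_windows(telemetry, kpi, lookback=60, horizon=5):
    """Build (input window, target) pairs for KPI forecasting.

    telemetry: list of per-second rows, each with the selected telemetry values
    kpi:       list of per-second KPI values, same length as telemetry
    Each input is `lookback` consecutive rows of telemetry + KPI; the target
    is the KPI value `horizon` seconds after the last row of the window.
    """
    if len(telemetry) != len(kpi):
        raise ValueError("telemetry and kpi must have the same length")
    inputs, targets = [], []
    last_start = len(kpi) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback  # exclusive end of the input window
        window = [list(telemetry[t]) + [kpi[t]] for t in range(start, end)]
        inputs.append(window)
        targets.append(kpi[end - 1 + horizon])
    return inputs, targets


# Synthetic stand-in data: 100 seconds, 20 telemetry values per second.
telemetry = [[float(t)] * 20 for t in range(100)]
kpi = [t / 100 for t in range(100)]

X, y = make_windows(telemetry, kpi)
print(len(X), len(X[0]), len(X[0][0]))  # windows, time steps, values per step
```

Each window then has shape 60 × 21 (20 telemetry values plus the KPI itself), which is the per-sample input a recurrent model of this kind would consume.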
  • 18. Closed-loop Automation
    • Diagram components: Traffic Generator; Xeon platform with NIC running vCMTS VNFs (vCMTS - 0); collectd; InfluxDB; compacted metrics; Machine Learning Modules (traffic forecasting, reinforcement learning); HW Optimization (RDT, DVFS, WL consolidation, etc.)
    • Using ML:
      • Track/forecast workload, performance
      • Dynamically adjust resource allocation
    • Benefits:
      • Reduced TCO through power saving and increased HW utilization
  • 20. ENI PoC – Network Slice Lifecycle Management
    • ETSI ISG ENI – Experiential Networked Intelligence
    • AI-based predictor:
      • For generating new scale up/down and converting the intent to suggested configuration
      • LSTM is used for traffic prediction
    • TNSM:
      • Provides underlay network control to satisfy the network slice requests
      • FlexE and a FlexE-based optimization algorithm are used for underlay network slice creation and modification
    • CNSM:
      • Provides core network control to satisfy the network slice requests
  • 24. Further Resources
    • Learn more from these helpful sites:
      • https://networkbuilders.intel.com/network-technologies/serviceassurance
      • https://wiki.opnfv.org/display/fastpath/Barometer+Home
      • https://wiki.openstack.org/wiki/Telemetry
      • https://01.org/openstack/blogs/2015/openstack-enhanced-platform-awareness-white-paper
  • 25. Collectd 101 materials
    • Collectd 101
      • https://wiki.opnfv.org/display/fastpath/Collectd+101
    • Write a simple read plugin
      • https://wiki.opnfv.org/display/fastpath/Collectd+how+to+implement+a+simple+plugin
  • 26. OPNFV Barometer – Intel platform feature plugins
    • Barometer Strategy:
      • Ensure platform metrics/events are accessible through open industry standard interfaces
      • Demonstrate that IA platform technologies can be monitored, consumed and actioned in real time
    • One Click Install:
      • Easy install/configuration for customers
      • One command to install Collectd/InfluxDB/Grafana
    • Three container approach for Collectd:
      • Stable Container: latest stable branch
      • Master Container: up to date with master
      • Experimental Container: cherry pick features of interest
  • 27. Barometer Links
    • Barometer Home: https://wiki.opnfv.org/display/fastpath/Barometer+Home
    • Metrics/Events through Barometer (not on Collectd site): https://wiki.opnfv.org/display/fastpath/Collectd+Metrics+and+Events#CollectdMetricsandEvents-Metrics
    • Barometer “One-click” install: https://wiki.opnfv.org/display/fastpath/One+Click+Install+of+Barometer+Containers