Scalable AutoML for Time Series Forecasting using Ray
Shengsheng Huang
Intel Corporation
Jason Dai
Intel Corporation
Agenda
Background
Analytics Zoo, Time Series, AutoML, Ray
Scalable AutoML for Time Series
Architecture, workflow, etc.
Use Case Sharing & Learnings
Use case study, what we learned from early
users, future work
Background
AI on Big Data
Accelerating Data Analytics + AI Solutions At Scale
Distributed, High-Performance
Deep Learning Framework
for Apache Spark*
https://github.com/intel-analytics/bigdl
Unified Analytics + AI Platform
for TensorFlow*, PyTorch*, Keras*, BigDL, Ray* and
Apache Spark*
https://github.com/intel-analytics/analytics-zoo
Analytics Zoo
https://github.com/intel-analytics/analytics-zoo
Unified Data Analytics and AI Platform
Time Series In a Nutshell
Time Series data
▪ A series of data that is observed sequentially in time.
▪ stock prices, sales volume, CPU/IO monitoring metrics,
KPIs in telecom networks ...
Time Series Analysis
▪ Time Series Forecasting
▪ Anomaly Detection
▪ Time Series Classification, clustering, etc.
Applications
▪ Demand forecasting
▪ network quality management
▪ predictive maintenance
▪ AIOps
Figure: total volume of taxi passengers in NYC from 2014/07 to 2015/02 (source:
https://github.com/intel-analytics/analytics-zoo/blob/master/apps/anomaly-detection/anomaly-detection-nyc-taxi.ipynb)
Time Series Forecasting
Problem Definition
▪ Given the history of t observations y_1, …, y_t, predict the
values of the next h steps, y_(t+1), …, y_(t+h)
▪ Usually only the last k steps are used as lookback: y_(t−k+1), …, y_t
Forecasting Methods
▪ Autoregression, Exponential Smoothing, ARIMA, …
▪ Machine Learning and Deep Learning methods
Figure: timeline y_1, y_2, …, y_(t−k+1), …, y_t, y_(t+1), …, y_(t+h), with a lookback of k steps and a forecast of h steps forward
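The problem definition above can be sketched as a sliding window over the series. This is a minimal illustration in plain Python; the function name `make_windows` and the sample values are made up for the example and are not part of Analytics Zoo.

```python
from typing import List, Tuple


def make_windows(
    series: List[float], lookback: int, horizon: int
) -> List[Tuple[List[float], List[float]]]:
    """Slice a series into (last k values, next h values) training pairs."""
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be positive")
    samples = []
    # A window is valid only while lookback + horizon values still fit.
    for start in range(len(series) - lookback - horizon + 1):
        past = series[start:start + lookback]
        future = series[start + lookback:start + lookback + horizon]
        samples.append((past, future))
    return samples


# Hypothetical toy series: k = 3 lookback steps, h = 2 forecast steps.
series = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
windows = make_windows(series, lookback=3, horizon=2)
```

Each pair holds the model input (the k most recent observations) and the target (the h values that follow). A series shorter than k + h yields no samples.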
AutoML
Source: Yao, Q., et al., "Taking the Human out of Learning Applications: A Survey on Automated Machine Learning."
Ray and RayOnSpark
Ray
▪ A distributed framework for emerging AI applications
Ray Tune
▪ A library on Ray for experiment execution and
hyperparameter tuning
RayOnSpark
▪ A feature in Analytics Zoo
▪ Directly run Ray programs on a big data cluster
▪ Seamlessly integrate Ray into the Spark data processing
pipeline
https://ray.readthedocs.io/en/latest/
https://analytics-zoo.github.io/master/#ProgrammingGuide/rayonspark/
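The pattern that Ray Tune distributes across a cluster is running many independent trials and collecting their scores. The sketch below mimics that pattern locally, using Python's standard `concurrent.futures` thread pool as a stand-in. It does not use Ray or RayOnSpark, and the scoring function and candidate values are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def mean_forecast_error(series, lookback):
    """Score one trial: forecast each point as the mean of the previous `lookback` values."""
    errors = []
    for t in range(lookback, len(series)):
        forecast = sum(series[t - lookback:t]) / lookback
        errors.append(abs(series[t] - forecast))
    return sum(errors) / len(errors)


# Hypothetical data and search candidates.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
candidates = [1, 2, 4]

# Each trial is independent, so the trials can run side by side.
with ThreadPoolExecutor(max_workers=3) as pool:
    scores = list(pool.map(lambda k: mean_forecast_error(series, k), candidates))

# Keep the configuration with the lowest mean absolute error.
best_lookback = candidates[scores.index(min(scores))]
```

Because the trials share no state, the same structure carries over to a cluster scheduler: only the executor changes, not the trial function.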
Scalable AutoML for Time Series
Time Series Solution In Analytics Zoo
Rich algorithms
statistical, neural-network, hybrid,
state-of-the-art models, etc.
AutoML
for automatic feature generation,
model selection, hyper-parameter
tuning, etc.
Seamless scaling
with integrated analytics and AI
pipelines
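As a rough illustration of the kind of input automatic feature generation can start from, the sketch below derives calendar features from a timestamp. The feature names and the function `datetime_features` are hypothetical; the slides do not list which features Analytics Zoo generates.

```python
from datetime import datetime


def datetime_features(ts: datetime) -> dict:
    """Derive simple calendar features that often carry seasonality."""
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),            # Monday = 0 ... Sunday = 6
        "is_weekend": int(ts.weekday() >= 5),   # Saturday or Sunday
        "month": ts.month,
    }


# Hypothetical timestamp: Saturday, 5 July 2014, 13:30.
features = datetime_features(datetime(2014, 7, 5, 13, 30))
```

A search procedure can then decide which of these candidate features to keep, which is the selection step AutoML is meant to take over.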
Software Stack
AutoML Framework
▪ FeatureTransformer
▪ Model
▪ SearchEngine
▪ Pipeline
Time Series upon AutoML
▪ TimeSequencePredictor
▪ TimeSequencePipeline
https://medium.com/riselab/scalable-automl-for-time-series-prediction-using-ray-and-analytics-zoo-b79a6fd08139
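One way the four framework components could fit together is sketched below. The classes reuse the names from the slide only as labels: the bodies are toy stand-ins written for this example, not the Analytics Zoo implementation, and the grid search is an assumed search strategy.

```python
class FeatureTransformer:
    """Turns a raw series into (window, target) pairs using a lookback."""

    def __init__(self, lookback):
        self.lookback = lookback

    def transform(self, series):
        k = self.lookback
        return [(series[i - k:i], series[i]) for i in range(k, len(series))]


class Model:
    """Toy forecaster: blends the last value with the window mean."""

    def __init__(self, alpha):
        self.alpha = alpha

    def predict(self, window):
        mean = sum(window) / len(window)
        return self.alpha * window[-1] + (1 - self.alpha) * mean


class Pipeline:
    """A fitted transformer + model pair that can score and forecast."""

    def __init__(self, transformer, model):
        self.transformer = transformer
        self.model = model

    def evaluate(self, series):
        # Mean absolute error of one-step-ahead forecasts.
        pairs = self.transformer.transform(series)
        return sum(abs(y - self.model.predict(x)) for x, y in pairs) / len(pairs)

    def predict(self, series):
        return self.model.predict(series[-self.transformer.lookback:])


class SearchEngine:
    """Tries every configuration in the search space and keeps the best."""

    def __init__(self, search_space):
        self.search_space = search_space

    def run(self, series):
        best_score, best_pipeline = None, None
        for lookback in self.search_space["lookback"]:
            for alpha in self.search_space["alpha"]:
                candidate = Pipeline(FeatureTransformer(lookback), Model(alpha))
                score = candidate.evaluate(series)
                if best_score is None or score < best_score:
                    best_score, best_pipeline = score, candidate
        return best_pipeline


# Hypothetical series and search space.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
engine = SearchEngine({"lookback": [2, 4], "alpha": [0.0, 1.0]})
pipeline = engine.run(series)
```

The search engine returns a pipeline rather than a bare model, so the chosen feature settings travel with the model when it is reused.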
Training at Runtime
A glimpse of API
Training a Pipeline
▪ fit (w/ automl)
▪ recipe
▪ distributed mode
Using a Pipeline
▪ save/load
▪ evaluate/predict
▪ fit (incremental)
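The "using a pipeline" half of the API (save/load, evaluate/predict, incremental fit) can be illustrated with a toy class. `MeanPipeline` is a hypothetical stand-in that forecasts the running mean; its method names follow the slide, but the signatures are assumptions and are not those of `TimeSequencePipeline`.

```python
import json
import os
import tempfile


class MeanPipeline:
    """Toy pipeline: forecasts the running mean of everything it has seen."""

    def __init__(self, total=0.0, count=0):
        self.total = total
        self.count = count

    def fit(self, values):
        # Incremental: a later call folds new data into the existing state.
        self.total += sum(values)
        self.count += len(values)
        return self

    def predict(self):
        return self.total / self.count

    def evaluate(self, values):
        # Mean absolute error of the current forecast against new values.
        forecast = self.predict()
        return sum(abs(v - forecast) for v in values) / len(values)

    def save(self, path):
        with open(path, "w") as f:
            json.dump({"total": self.total, "count": self.count}, f)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            state = json.load(f)
        return cls(state["total"], state["count"])


# Train, persist, restore, then continue training on new data.
pipeline = MeanPipeline().fit([2.0, 4.0])
path = os.path.join(tempfile.mkdtemp(), "pipeline.json")
pipeline.save(path)

restored = MeanPipeline.load(path)
restored.fit([6.0])
```

Saving the fitted state rather than the training data is what lets a restored pipeline keep learning without access to the original history.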
Project Zouwu
Use case
▪ reference time series use cases for Telco (such as
network traffic forecasting, etc.)
Models
▪ built-in models for time series analysis
(such as LSTM and MTNet)
“AutoTS”
▪ AutoML support for building E2E time series analysis
pipelines (including automatic feature generation,
model selection and hyperparameter tuning)
Diagram: Project Zouwu (use-case, autots, model) with Built-in Models, ML Workflow, AutoML Workflow, and Integrated Analytics & AI Pipelines
https://github.com/intel-analytics/analytics-zoo/tree/master/pyzoo/zoo/zouwu
Use Case Sharing & Learnings
Network Traffic KPI Forecasting in Telco
Usage Scenario
▪ KPI/metrics forecasting is widely used in
Telco applications (e.g., energy saving,
network slicing, etc.)
▪ Use aggregated traffic KPIs (i.e., total bytes and
average rate in Mbps/Gbps) from the past
week to forecast the KPIs for the next two
hours.
Two ways to solve this
problem using Zouwu
▪ Use built-in “Forecaster” models for
training and forecasting (notebook link)
▪ Use “AutoTS” (with built-in AutoML
support) to train an E2E time series
analysis pipeline, and forecast
(notebook link)
Figure: example result of network traffic average rate forecasting on the test period
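The usage scenario (one week of history, two hours of forecast) maps to a lookback and a horizon only once a sampling interval is fixed. The slides do not state the interval, so the 1-hour and 5-minute values below are assumptions used purely for the arithmetic.

```python
from datetime import timedelta


def steps(span: timedelta, interval: timedelta) -> int:
    """Number of samples covering `span` at a fixed sampling `interval`."""
    if span % interval != timedelta(0):
        raise ValueError("span must be a whole multiple of the interval")
    return span // interval


history = timedelta(weeks=1)   # "the past week"
ahead = timedelta(hours=2)     # "the next two hours"

# Assumed sampling intervals (not given in the slides).
hourly = timedelta(hours=1)
five_min = timedelta(minutes=5)

lookback_hourly = steps(history, hourly)
horizon_hourly = steps(ahead, hourly)
lookback_5min = steps(history, five_min)
horizon_5min = steps(ahead, five_min)
```

At hourly sampling the model sees 168 past points and predicts 2. At 5-minute sampling the same scenario becomes 2016 past points and 24 predicted points, which changes the model's input and output sizes considerably.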
Network Quality Prediction in SK Telecom
Diagram labels: Spark-SQL; Data Loading; Data Loader; Data Source APIs (File, HTTP, Kafka); DRAM Store; Flash Store; tiering; Preprocess; RDD of Tensor; Model Code of TF; DL Training & Inferencing; Data; Model; SIMD Acceleration (annotations: "forked", "customized")
https://databricks.com/session_eu19/apache-spark-ai-use-case-in-telco-network-quality-analysis-and-prediction-with-geospatial-visualization
Forecast-based Analysis for AIOps at Neusoft
https://platform.neusoft.com/2020/01/17/xw-intel.html
https://platform.neusoft.com/2017/03/04/qt-baomaqiche.html
Takeaways from Early Users
Highlights
▪ Additional features allowed
▪ Less effort in tuning
▪ Satisfactory accuracy
Data quality matters
▪ Missing values, outliers, etc.
Scale, scale, scale
▪ hundreds of thousands of cells × KPIs => millions of time series
▪ millions of servers/containers × metrics => hundreds of millions of time series
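The data quality point (missing values, outliers) can be made concrete with a small cleaning pass. This is a sketch under stated assumptions: forward fill for gaps and a median-absolute-deviation rule with a threshold of 3 for outliers are common choices picked for the example, not methods named in the slides.

```python
from statistics import median


def forward_fill(values):
    """Replace each missing value (None) with the last observed value."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last  # stays None if the series starts with a gap
        filled.append(v)
        last = v
    return filled


def flag_outliers(values, threshold=3.0):
    """Flag points whose distance from the median exceeds threshold x MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(v - med) / mad > threshold for v in values]


# Hypothetical KPI readings with one gap and one spike.
raw = [10.0, None, 11.0, 9.0, 100.0, 10.0]
filled = forward_fill(raw)
flags = flag_outliers(filled)
```

Median-based statistics are used here because a single large spike would inflate a mean and standard deviation enough to hide itself.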
Future Work
Extremely high dimensional time series
Search algo, meta-learning, ensemble, …
More models, features, …
Automatic data preprocessing (e.g., missing data and outliers)
Accelerate Your Data Analytics & AI Journey with Intel