STARBUCKS
TECHNOLOGY
Simplifying Deep Learning
with HorovodRunner at Starbucks
About the presenters
Denny Lee
Denny Lee is a Technology
Evangelist with Databricks; he
is a hands-on data sciences
engineer with more than 15
years of experience
developing internet-scale
infrastructure, data platforms,
and distributed systems for
both on-premises and cloud.
His key focuses surround
solving complex large scale
data problems – providing not
only architectural direction
but the hands-on
implementation of these
systems.
Vishwanath Subramanian
Vishwanath Subramanian is a
Director of Data and Analytics
Engineering at Starbucks.
Vishwanath has over 15 years of
experience with a background in
distributed systems, product
management, software
engineering and analytics.
At Starbucks, his key focus is on
providing Next Generation
Analytics platforms and enabling
large scale data processing and
machine learning to enable
Business Intelligence and Data
Services across Starbucks.
Scenarios
• On-Demand one click Provisioning
of Seamlessly integrated
Infrastructure Bill of Material for
Data Science and Intelligent Apps.
• Secured Connectivity to Enterprise
Data Platform completely
abstracted from Analytics teams.
• Solution template containing
organization of deployments to
enable Adhoc experiments, shared
data engineering and Intelligent
App Development
• Smarter checkout experiences
• Predicting customer traffic
• Planogram Analysis
• And more…
Current State
• Solving complex / streaming image and video analytics is
hard
• It also typically involves distributing the problem to multiple
nodes
• But how do I run Keras + TensorFlow in a distributed
environment?
Convolutional Neural Networks
[Figure: CNN architecture. Feature extraction: Convolution (32 filters), Convolution (64 filters), Subsampling with stride (2,2); feature map sizes 28 x 28, 28 x 28, 14 x 14. Classification: Dropout, Fully Connected, output classes 0 through 9.]
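The feature map sizes in the figure (28 x 28, 28 x 28, 14 x 14) can be reproduced with the standard output-size formula for a sliding window. The sketch below is only an illustration: the slide does not state kernel sizes or padding, so the 3x3 kernels with one pixel of zero padding and the 2x2 subsampling window are assumptions, and `conv_output_size` is a hypothetical helper name.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window slides over n pixels."""
    return (n + 2 * padding - kernel) // stride + 1


# Assumed hyperparameters (not stated on the slide): 3x3 kernels with one
# pixel of zero padding, and a 2x2 subsampling window moved with stride 2.
input_size = 28
after_conv1 = conv_output_size(input_size, kernel=3, padding=1)   # 32 filters
after_conv2 = conv_output_size(after_conv1, kernel=3, padding=1)  # 64 filters
after_pool = conv_output_size(after_conv2, kernel=2, stride=2)    # stride (2,2)

print(after_conv1, after_conv2, after_pool)  # 28 28 14
```

Without padding, the same 3x3 kernels would shrink each feature map by two pixels per convolution, so the sizes shown on the slide suggest padded convolutions.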
DEMO
Running Keras CNNs Standalone
Keras, TensorFlow, HorovodRunner, and MLflow: https://dbricks.co/2D58PDw
Introducing HorovodRunner
• HorovodRunner is a general API to run distributed learning workloads
on Databricks using Uber’s Horovod framework
• Combining Horovod with Apache Spark’s barrier mode allows longer-
running deep learning training jobs
• A Horovod MPI job is embedded as a Spark job using barrier
execution mode
HorovodRunner
• HorovodRunner takes a Python
method that contains DL training code
with Horovod hooks
• The first executor collects the IP
address of all of the task executors
using BarrierTaskContext
• Then it triggers a Horovod job using
mpirun.
• Each Python MPI process loads the
pickled program back, deserializes it,
and runs it.
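The serialize, ship, deserialize, and run flow described above can be imitated in a few lines. This is a toy, single-process simulation and not HorovodRunner's implementation: there is no Spark, no mpirun, and no networking, and it substitutes the standard library's `marshal` for the pickling step named on the slide. The names `train`, `serialize`, and `run_on_worker` are hypothetical.

```python
import marshal
import types


def train(rank, size):
    # Stand-in for DL training code with Horovod hooks.
    return "rank %d of %d finished" % (rank, size)


def serialize(fn):
    # Driver side: turn the function's bytecode into bytes that can be shipped.
    return marshal.dumps(fn.__code__)


def run_on_worker(payload, rank, size):
    # Worker side: load the payload back, rebuild the function, and run it.
    # The rebuilt function gets an empty global namespace, so this only works
    # for functions that do not reference module-level names.
    code = marshal.loads(payload)
    fn = types.FunctionType(code, {})
    return fn(rank, size)


payload = serialize(train)
# Simulate three workers, each receiving the same payload and its own rank.
results = [run_on_worker(payload, rank, 3) for rank in range(3)]
print(results)
```

Every simulated worker runs the same program and differs only in its rank, which is the same shape as an MPI job where each process learns its rank at startup.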
HorovodRunner
[Diagram: driver and workers]
def runCNN():
    model.add(Conv2D(32, …))
    model.add(Conv2D(64, …))
    model.add(MaxPooling2D(…))
    model.add(Dense(128, …))
    model.add(Dense(10, activation='softmax'))
    optimizer = keras.optimizers.Adadelta(1.0)
In standalone or hvd local mode, the code runs on the driver.
HorovodRunner
[Diagram: driver and workers; variables]
def runCNN_hvd():
    hvd.init()
    config = tf.ConfigProto()
    # Original code
    runCNN()
    callbacks = []
With HorovodRunner, we wrap the original code; the code and
variables are then pushed to the workers.
HorovodRunner
[Diagram: driver and workers]
Variables are transferred from the driver to the workers.
Code is executed on the workers.
Migrate to HorovodRunner
# Primary code differences are noted below
+ hvd.init()
+ config = tf.ConfigProto()
+ config.gpu_options.allow_growth = True
+ config.gpu_options.visible_device_list = str(hvd.local_rank())
+ epochs = int(math.ceil(12.0 / hvd.size()))
+ callbacks = [
+ hvd.callbacks.BroadcastGlobalVariablesCallback(0),
+ ]
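The `epochs` line in the diff splits a fixed budget of 12 epochs across the Horovod processes and rounds up. A small worked example, assuming `hvd.size()` returns the number of Horovod processes; `epochs_per_worker` is a hypothetical helper that takes that count as a plain argument so the arithmetic can run without Horovod installed.

```python
import math


def epochs_per_worker(total_epochs, num_workers):
    # Same arithmetic as the slide: int(math.ceil(12.0 / hvd.size()))
    return int(math.ceil(float(total_epochs) / num_workers))


# num_workers stands in for hvd.size().
for num_workers in (1, 2, 4, 5, 8):
    print(num_workers, epochs_per_worker(12, num_workers))
# 1 12
# 2 6
# 4 3
# 5 3
# 8 2
```

Because of the rounding, a worker count that does not divide 12 evenly (5 or 8 here) performs slightly more total work than the single-node run.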
Comparing the runs using MLflow
DEMO
Object Detection
Keras, TensorFlow, HorovodRunner, and MLflow
Object Detection Approaches
R-CNN (arXiv 2013, CVPR 2014)
• Region proposal algorithms give you a set of regions in the image that are likely
to contain objects.
• Run the image regions inside those bounding boxes through a pre-trained AlexNet to compute
the features for each bounding box.
• Use a support vector machine (SVM) to classify the object in each region.
• Run the box through a linear regression model to output tighter coordinates
for the box.
• R-CNN -> Fast R-CNN -> Faster R-CNN
Rich feature hierarchies for accurate object detection and semantic segmentation - Girshick, Donahue, Darrell, Malik
Fast R-CNN - Girshick
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks - Ren, He, Girshick, Sun
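The last R-CNN step, tightening a proposal box with a regression model, can be sketched as follows. The parameterization follows the form used in the cited R-CNN paper (shift the center in proportion to the box size, scale width and height exponentially). The function name and the numbers are hypothetical, and the deltas are hard-coded here where a trained regressor would normally predict them.

```python
import math


def refine_box(proposal, deltas):
    """Apply regression offsets to a proposal box.

    proposal: (center_x, center_y, width, height)
    deltas:   (dx, dy, dw, dh) as a regressor would predict them
    """
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    cx = pw * dx + px        # shift center x by a fraction of the width
    cy = ph * dy + py        # shift center y by a fraction of the height
    w = pw * math.exp(dw)    # scale width
    h = ph * math.exp(dh)    # scale height
    return (cx, cy, w, h)


# Hypothetical proposal and offsets, chosen only for illustration.
refined = refine_box((50.0, 40.0, 20.0, 10.0), (0.25, -0.5, 0.0, 0.0))
print(refined)  # (55.0, 35.0, 20.0, 10.0)
```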
Object Detection Approaches (contd.)
• YOLO – detection as a regression problem
• Not a traditional classifier
• Divide the image into a grid; each cell is responsible for predicting n bounding boxes
• Outputs a confidence score that each predicted bounding box encloses an object
• Gives a probability distribution over all the classes it's trained on
• The confidence score and class prediction are combined into a score for
object classification
• Based on a threshold, we determine the relevant boxes.
• All the boxes are fed to the neural network all at once.
You Only Look Once: Unified, Real-Time Object Detection - Redmon, Divvala, Girshick, Farhadi
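The scoring and thresholding steps in the list above can be sketched in plain Python. This is a minimal illustration, not the YOLO implementation: the input format, the function name `score_boxes`, and the example numbers are all hypothetical, and later post-processing of overlapping boxes is left out.

```python
def score_boxes(boxes, threshold):
    """Combine box confidence with class probabilities and filter by threshold.

    boxes: list of (box_confidence, class_probabilities) pairs
    returns: list of (box_index, best_class_index, score) for kept boxes
    """
    kept = []
    for index, (confidence, class_probs) in enumerate(boxes):
        # Pick the most probable class for this box.
        best_class = max(range(len(class_probs)), key=lambda c: class_probs[c])
        # Class-specific score: box confidence times class probability.
        score = confidence * class_probs[best_class]
        if score >= threshold:
            kept.append((index, best_class, score))
    return kept


# Hypothetical predictions for three boxes over three classes.
boxes = [
    (0.5, [0.5, 0.25, 0.25]),
    (1.0, [0.0, 0.75, 0.25]),
    (0.25, [0.25, 0.25, 0.5]),
]
kept = score_boxes(boxes, threshold=0.3)
print(kept)  # [(1, 1, 0.75)]
```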
TALENTED TECHNOLOGISTS
DELIVERING TODAY
LEADING INTO THE FUTURE
https://www.starbucks.com/careers/

More Related Content

What's hot (20)

Build, Scale, and Deploy Deep Learning Pipelines Using Apache Spark
Build, Scale, and Deploy Deep Learning Pipelines Using Apache SparkBuild, Scale, and Deploy Deep Learning Pipelines Using Apache Spark
Build, Scale, and Deploy Deep Learning Pipelines Using Apache Spark
Databricks
 
Koalas: How Well Does Koalas Work?
Koalas: How Well Does Koalas Work?Koalas: How Well Does Koalas Work?
Koalas: How Well Does Koalas Work?
Databricks
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment
Databricks
 
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDPBuild Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Databricks
 
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Databricks
 
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Spark Summit
 
Spark Summit EU talk by Zoltan Zvara
Spark Summit EU talk by Zoltan ZvaraSpark Summit EU talk by Zoltan Zvara
Spark Summit EU talk by Zoltan Zvara
Spark Summit
 
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Databricks
 
From Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterFrom Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim Hunter
Databricks
 
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Databricks
 
Yggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Yggdrasil: Faster Decision Trees Using Column Partitioning In SparkYggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Yggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Jen Aman
 
Deploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Deploying MLlib for Scoring in Structured Streaming with Joseph BradleyDeploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Deploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Databricks
 
Productionizing Machine Learning Pipelines with Databricks and Azure ML
Productionizing Machine Learning Pipelines with Databricks and Azure MLProductionizing Machine Learning Pipelines with Databricks and Azure ML
Productionizing Machine Learning Pipelines with Databricks and Azure ML
Databricks
 
Operationalizing Machine Learning at Scale with Sameer Nori
Operationalizing Machine Learning at Scale with Sameer NoriOperationalizing Machine Learning at Scale with Sameer Nori
Operationalizing Machine Learning at Scale with Sameer Nori
Databricks
 
Scaling Ride-Hailing with Machine Learning on MLflow
Scaling Ride-Hailing with Machine Learning on MLflowScaling Ride-Hailing with Machine Learning on MLflow
Scaling Ride-Hailing with Machine Learning on MLflow
Databricks
 
A Predictive Analytics Workflow on DICOM Images using Apache Spark with Anahi...
A Predictive Analytics Workflow on DICOM Images using Apache Spark with Anahi...A Predictive Analytics Workflow on DICOM Images using Apache Spark with Anahi...
A Predictive Analytics Workflow on DICOM Images using Apache Spark with Anahi...
Databricks
 
Accelerating Data Science with Better Data Engineering on Databricks
Accelerating Data Science with Better Data Engineering on DatabricksAccelerating Data Science with Better Data Engineering on Databricks
Accelerating Data Science with Better Data Engineering on Databricks
Databricks
 
Scalable Deep Learning Platform On Spark In Baidu
Scalable Deep Learning Platform On Spark In BaiduScalable Deep Learning Platform On Spark In Baidu
Scalable Deep Learning Platform On Spark In Baidu
Jen Aman
 
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic RepartitioningHandling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Spark Summit
 
Spark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Autotuning: Spark Summit East talk by Lawrence SpracklenSpark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Summit
 
Build, Scale, and Deploy Deep Learning Pipelines Using Apache Spark
Build, Scale, and Deploy Deep Learning Pipelines Using Apache SparkBuild, Scale, and Deploy Deep Learning Pipelines Using Apache Spark
Build, Scale, and Deploy Deep Learning Pipelines Using Apache Spark
Databricks
 
Koalas: How Well Does Koalas Work?
Koalas: How Well Does Koalas Work?Koalas: How Well Does Koalas Work?
Koalas: How Well Does Koalas Work?
Databricks
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment
Databricks
 
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDPBuild Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Databricks
 
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Databricks
 
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Spark Summit
 
Spark Summit EU talk by Zoltan Zvara
Spark Summit EU talk by Zoltan ZvaraSpark Summit EU talk by Zoltan Zvara
Spark Summit EU talk by Zoltan Zvara
Spark Summit
 
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Databricks
 
From Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterFrom Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim Hunter
Databricks
 
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Databricks
 
Yggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Yggdrasil: Faster Decision Trees Using Column Partitioning In SparkYggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Yggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Jen Aman
 
Deploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Deploying MLlib for Scoring in Structured Streaming with Joseph BradleyDeploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Deploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Databricks
 
Productionizing Machine Learning Pipelines with Databricks and Azure ML
Productionizing Machine Learning Pipelines with Databricks and Azure MLProductionizing Machine Learning Pipelines with Databricks and Azure ML
Productionizing Machine Learning Pipelines with Databricks and Azure ML
Databricks
 
Operationalizing Machine Learning at Scale with Sameer Nori
Operationalizing Machine Learning at Scale with Sameer NoriOperationalizing Machine Learning at Scale with Sameer Nori
Operationalizing Machine Learning at Scale with Sameer Nori
Databricks
 
Scaling Ride-Hailing with Machine Learning on MLflow
Scaling Ride-Hailing with Machine Learning on MLflowScaling Ride-Hailing with Machine Learning on MLflow
Scaling Ride-Hailing with Machine Learning on MLflow
Databricks
 
A Predictive Analytics Workflow on DICOM Images using Apache Spark with Anahi...
A Predictive Analytics Workflow on DICOM Images using Apache Spark with Anahi...A Predictive Analytics Workflow on DICOM Images using Apache Spark with Anahi...
A Predictive Analytics Workflow on DICOM Images using Apache Spark with Anahi...
Databricks
 
Accelerating Data Science with Better Data Engineering on Databricks
Accelerating Data Science with Better Data Engineering on DatabricksAccelerating Data Science with Better Data Engineering on Databricks
Accelerating Data Science with Better Data Engineering on Databricks
Databricks
 
Scalable Deep Learning Platform On Spark In Baidu
Scalable Deep Learning Platform On Spark In BaiduScalable Deep Learning Platform On Spark In Baidu
Scalable Deep Learning Platform On Spark In Baidu
Jen Aman
 
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic RepartitioningHandling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Spark Summit
 
Spark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Autotuning: Spark Summit East talk by Lawrence SpracklenSpark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Summit
 

Similar to Simplify Distributed TensorFlow Training for Fast Image Categorization at Starbucks (20)

Khan farhan cv
Khan farhan cvKhan farhan cv
Khan farhan cv
farhan0039
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
Marco Parenzan
 
Deep learning with_computer_vision
Deep learning with_computer_visionDeep learning with_computer_vision
Deep learning with_computer_vision
Anand Narayanan
 
Project Hydrogen, HorovodRunner, and Pandas UDF: Distributed Deep Learning Tr...
Project Hydrogen, HorovodRunner, and Pandas UDF: Distributed Deep Learning Tr...Project Hydrogen, HorovodRunner, and Pandas UDF: Distributed Deep Learning Tr...
Project Hydrogen, HorovodRunner, and Pandas UDF: Distributed Deep Learning Tr...
Anyscale
 
Demystifying-AI-Frameworks-TensorFlow-PyTorch-JAX-and-More (1).pptx
Demystifying-AI-Frameworks-TensorFlow-PyTorch-JAX-and-More (1).pptxDemystifying-AI-Frameworks-TensorFlow-PyTorch-JAX-and-More (1).pptx
Demystifying-AI-Frameworks-TensorFlow-PyTorch-JAX-and-More (1).pptx
Anant Garg
 
NVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA 深度學習教育機構 (DLI): Approaches to object detectionNVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA Taiwan
 
AI meets Big Data
AI meets Big DataAI meets Big Data
AI meets Big Data
Jan Wiegelmann
 
SMART RECOGNITION FOR OBJECT DETECTION.pptx
SMART RECOGNITION FOR OBJECT DETECTION.pptxSMART RECOGNITION FOR OBJECT DETECTION.pptx
SMART RECOGNITION FOR OBJECT DETECTION.pptx
divyasindhu040
 
Anomaly Detection with Azure and .net
Anomaly Detection with Azure and .netAnomaly Detection with Azure and .net
Anomaly Detection with Azure and .net
Marco Parenzan
 
Data Science Powered Apps for Internet of Things
Data Science Powered Apps for Internet of ThingsData Science Powered Apps for Internet of Things
Data Science Powered Apps for Internet of Things
VMware Tanzu
 
Deep Learning with Databricks
Deep Learning with Databricks  Deep Learning with Databricks
Deep Learning with Databricks
Henning Kropp
 
ICCV 2019 - A view
ICCV 2019 - A viewICCV 2019 - A view
ICCV 2019 - A view
LiberiFatali
 
Tensorflow a brief introduction 2nd Sess.pptx
Tensorflow a brief introduction 2nd Sess.pptxTensorflow a brief introduction 2nd Sess.pptx
Tensorflow a brief introduction 2nd Sess.pptx
AnandMenon54
 
Data Science in business World
Data Science in business World Data Science in business World
Data Science in business World
DeepikaGauriBaijal
 
Microsoft DevOps for AI with GoDataDriven
Microsoft DevOps for AI with GoDataDrivenMicrosoft DevOps for AI with GoDataDriven
Microsoft DevOps for AI with GoDataDriven
GoDataDriven
 
We Must Go Deeper
We Must Go DeeperWe Must Go Deeper
We Must Go Deeper
The Software House
 
End-to-End Deep Learning with Horovod on Apache Spark
End-to-End Deep Learning with Horovod on Apache SparkEnd-to-End Deep Learning with Horovod on Apache Spark
End-to-End Deep Learning with Horovod on Apache Spark
Databricks
 
Automatic Attendace using convolutional neural network Face Recognition
Automatic Attendace using convolutional neural network Face RecognitionAutomatic Attendace using convolutional neural network Face Recognition
Automatic Attendace using convolutional neural network Face Recognition
vatsal199567
 
Spark-Zeppelin-ML on HWX
Spark-Zeppelin-ML on HWXSpark-Zeppelin-ML on HWX
Spark-Zeppelin-ML on HWX
Kirk Haslbeck
 
Deep Learning for Autonomous Driving
Deep Learning for Autonomous DrivingDeep Learning for Autonomous Driving
Deep Learning for Autonomous Driving
Jan Wiegelmann
 
Khan farhan cv
Khan farhan cvKhan farhan cv
Khan farhan cv
farhan0039
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
Marco Parenzan
 
Deep learning with_computer_vision
Deep learning with_computer_visionDeep learning with_computer_vision
Deep learning with_computer_vision
Anand Narayanan
 
Project Hydrogen, HorovodRunner, and Pandas UDF: Distributed Deep Learning Tr...
Project Hydrogen, HorovodRunner, and Pandas UDF: Distributed Deep Learning Tr...Project Hydrogen, HorovodRunner, and Pandas UDF: Distributed Deep Learning Tr...
Project Hydrogen, HorovodRunner, and Pandas UDF: Distributed Deep Learning Tr...
Anyscale
 
Demystifying-AI-Frameworks-TensorFlow-PyTorch-JAX-and-More (1).pptx
Demystifying-AI-Frameworks-TensorFlow-PyTorch-JAX-and-More (1).pptxDemystifying-AI-Frameworks-TensorFlow-PyTorch-JAX-and-More (1).pptx
Demystifying-AI-Frameworks-TensorFlow-PyTorch-JAX-and-More (1).pptx
Anant Garg
 
NVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA 深度學習教育機構 (DLI): Approaches to object detectionNVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA Taiwan
 
SMART RECOGNITION FOR OBJECT DETECTION.pptx
SMART RECOGNITION FOR OBJECT DETECTION.pptxSMART RECOGNITION FOR OBJECT DETECTION.pptx
SMART RECOGNITION FOR OBJECT DETECTION.pptx
divyasindhu040
 
Anomaly Detection with Azure and .net
Anomaly Detection with Azure and .netAnomaly Detection with Azure and .net
Anomaly Detection with Azure and .net
Marco Parenzan
 
Data Science Powered Apps for Internet of Things
Data Science Powered Apps for Internet of ThingsData Science Powered Apps for Internet of Things
Data Science Powered Apps for Internet of Things
VMware Tanzu
 
Deep Learning with Databricks
Deep Learning with Databricks  Deep Learning with Databricks
Deep Learning with Databricks
Henning Kropp
 
ICCV 2019 - A view
ICCV 2019 - A viewICCV 2019 - A view
ICCV 2019 - A view
LiberiFatali
 
Tensorflow a brief introduction 2nd Sess.pptx
Tensorflow a brief introduction 2nd Sess.pptxTensorflow a brief introduction 2nd Sess.pptx
Tensorflow a brief introduction 2nd Sess.pptx
AnandMenon54
 
Data Science in business World
Data Science in business World Data Science in business World
Data Science in business World
DeepikaGauriBaijal
 
Microsoft DevOps for AI with GoDataDriven
Microsoft DevOps for AI with GoDataDrivenMicrosoft DevOps for AI with GoDataDriven
Microsoft DevOps for AI with GoDataDriven
GoDataDriven
 
End-to-End Deep Learning with Horovod on Apache Spark
End-to-End Deep Learning with Horovod on Apache SparkEnd-to-End Deep Learning with Horovod on Apache Spark
End-to-End Deep Learning with Horovod on Apache Spark
Databricks
 
Automatic Attendace using convolutional neural network Face Recognition
Automatic Attendace using convolutional neural network Face RecognitionAutomatic Attendace using convolutional neural network Face Recognition
Automatic Attendace using convolutional neural network Face Recognition
vatsal199567
 
Spark-Zeppelin-ML on HWX
Spark-Zeppelin-ML on HWXSpark-Zeppelin-ML on HWX
Spark-Zeppelin-ML on HWX
Kirk Haslbeck
 
Deep Learning for Autonomous Driving
Deep Learning for Autonomous DrivingDeep Learning for Autonomous Driving
Deep Learning for Autonomous Driving
Jan Wiegelmann
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop

Simplify Distributed TensorFlow Training for Fast Image Categorization at Starbucks

  • 2. About the presenters
      • Denny Lee is a Technology Evangelist with Databricks; he is a hands-on data science engineer with more than 15 years of experience developing internet-scale infrastructure, data platforms, and distributed systems, both on-premises and in the cloud. His key focus is solving complex, large-scale data problems, providing not only architectural direction but also the hands-on implementation of these systems.
      • Vishwanath Subramanian is a Director of Data and Analytics Engineering at Starbucks. He has over 15 years of experience, with a background in distributed systems, product management, software engineering, and analytics. At Starbucks, his key focus is on providing next-generation analytics platforms and enabling large-scale data processing and machine learning to support Business Intelligence and Data Services across Starbucks.
  • 3. Scenarios
      • On-demand, one-click provisioning of a seamlessly integrated infrastructure bill of materials for data science and intelligent apps.
      • Secured connectivity to the Enterprise Data Platform, completely abstracted from analytics teams.
      • Solution template that organizes deployments to enable ad hoc experiments, shared data engineering, and intelligent app development.
      • Smarter checkout experiences
      • Predicting customer traffic
      • Planogram analysis
      • And more…
  • 4. Current State
      • Solving complex or streaming image and video analytics problems is hard.
      • It typically involves distributing the problem across multiple nodes.
      • But how do I run Keras + TensorFlow in a distributed environment?
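  • 6. Convolutional Neural Networks [Figure: CNN for digit classification. Feature extraction: convolution (32 filters), convolution (64 filters), subsampling with stride (2,2); feature-map sizes shown as 28 x 28, 28 x 28, and 14 x 14. Classification: fully connected layer with dropout; output classes 0, 1, …, 8, 9.]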
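The feature-map sizes on this slide (28 x 28, 28 x 28, 14 x 14) can be checked with the standard output-size formula for convolution and pooling layers. The sketch below is only an illustration: the slide does not state kernel sizes or padding, so the 3 x 3 kernel with one pixel of padding and the 2 x 2 pooling window are assumptions, and `output_size` is a hypothetical helper name.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer.

    Standard formula: floor((n + 2 * padding - kernel) / stride) + 1.
    """
    return (n + 2 * padding - kernel) // stride + 1


# Assumed layer settings (not given on the slide):
# 3 x 3 convolutions with padding 1 keep the spatial size unchanged.
after_conv1 = output_size(28, kernel=3, stride=1, padding=1)
after_conv2 = output_size(after_conv1, kernel=3, stride=1, padding=1)
# Subsampling with stride (2, 2) and an assumed 2 x 2 window halves it.
after_pool = output_size(after_conv2, kernel=2, stride=2, padding=0)

print(after_conv1, after_conv2, after_pool)
```

With different padding the intermediate sizes would shrink at each convolution, so the numbers only match the figure under the assumptions stated above.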
  • 7. DEMO: Running Keras CNNs Standalone. Keras, TensorFlow, HorovodRunner, and MLflow: https://dbricks.co/2D58PDw
  • 8. Introducing HorovodRunner
      • HorovodRunner is a general API to run distributed deep learning workloads on Databricks using Uber's Horovod framework.
      • Combining Horovod with Apache Spark's barrier mode allows longer-running deep learning training jobs.
      • A Horovod MPI job is embedded as a Spark job using barrier execution mode.
  • 9. HorovodRunner
      • HorovodRunner takes a Python method that contains DL training code with Horovod hooks; the method is pickled on the driver and sent to the workers.
      • The first executor collects the IP addresses of all task executors using BarrierTaskContext.
      • It then triggers a Horovod job using mpirun.
      • Each Python MPI process loads the pickled program back, deserializes it, and runs it.
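The flow described on this slide could look roughly like the following from the notebook side. This is a minimal sketch, not the presenters' code: it assumes a Databricks Runtime for Machine Learning cluster where `sparkdl.HorovodRunner` and Horovod are installed, and the names `train_hvd`, `launch`, `num_processes`, and `learning_rate` are placeholders chosen for illustration. The imports sit inside the functions so the definitions themselves load without those libraries.

```python
def train_hvd(learning_rate=1.0):
    """Training entry point; HorovodRunner runs it once per worker process."""
    # Imported here so the import is resolved on the worker that runs it.
    import horovod.tensorflow.keras as hvd

    hvd.init()  # initialize Horovod so this process knows its rank and size
    # ... build, compile, and fit the Keras model here ...
    print("worker", hvd.rank(), "of", hvd.size(), "lr", learning_rate)


def launch(num_processes=2, learning_rate=1.0):
    """Start the distributed job from the driver (assumes Databricks ML runtime)."""
    from sparkdl import HorovodRunner

    hr = HorovodRunner(np=num_processes)  # np: number of parallel processes
    # Keyword arguments are forwarded to train_hvd on every worker.
    hr.run(train_hvd, learning_rate=learning_rate)
```

Calling `launch()` in a notebook cell would hand `train_hvd` to HorovodRunner; outside such a cluster the call fails at the `sparkdl` import.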
  • 11. HorovodRunner [Diagram: driver and workers] In standalone or hvd local mode, the code runs on the driver:
        def runCNN():
            model.add(Conv2D(32, …))
            model.add(Conv2D(64, …))
            model.add(MaxPooling2D(…))
            model.add(Dense(128, …))
            model.add(Dense(10, 'softmax'))
            optimizer = keras.optimizers.Adadelta(1.0)
  • 12. HorovodRunner [Diagram: driver, workers, variables] With HorovodRunner, we wrap the original code, and the code and variables are pushed to the workers:
        def runCNN_hvd():
            hvd.init()
            config = tf.ConfigProto()
            # Original code
            runCNN()
            callbacks = []
  • 13 to 15. HorovodRunner [Diagram: driver and workers, animation frames] With HorovodRunner, we wrap the original code, and the code and variables are pushed to the workers.
  • 16. HorovodRunner [Diagram: driver and workers] Variables are transferred from the driver to the workers. Code is executed on the workers.
  • 17. Migrate to HorovodRunner
        # Primary code differences are noted below
        + hvd.init()
        + config = tf.ConfigProto()
        + config.gpu_options.allow_growth = True
        + config.gpu_options.visible_device_list = str(hvd.local_rank())
        + epochs = int(math.ceil(12.0 / hvd.size()))
        + callbacks = [
        +     hvd.callbacks.BroadcastGlobalVariablesCallback(0),
        + ]
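The `epochs` line in this diff divides the original 12 epochs by the number of Horovod processes and rounds up. A small stand-alone sketch of that arithmetic follows; `epochs_per_worker` is a hypothetical helper, and `num_workers` stands in for the value `hvd.size()` would return at run time.

```python
import math


def epochs_per_worker(total_epochs, num_workers):
    """Mirror of `int(math.ceil(12.0 / hvd.size()))` from the slide."""
    return int(math.ceil(float(total_epochs) / num_workers))


# num_workers plays the role of hvd.size(); 12 is the epoch count on the slide.
for num_workers in (1, 2, 4, 5, 8):
    print(num_workers, "workers ->", epochs_per_worker(12, num_workers), "epochs each")
```

Rounding up means the combined epoch count across workers can exceed the original 12 when the division is not exact, for example with 5 or 8 workers.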
  • 18. Comparing the runs using MLflow
  • 19. DEMO: Object Detection. Keras, TensorFlow, HorovodRunner, and MLflow
  • 20. Object Detection Approaches: R-CNN (2013)
      • Region proposal algorithms give you a set of regions in the image that are likely to contain objects.
      • Run the image regions inside those bounding boxes through a pre-trained AlexNet to compute the features for each bounding box.
      • Use a support vector machine to classify the object in each region.
      • Run the box through a linear regression model to output tighter coordinates for the box.
      • R-CNN -> Fast R-CNN -> Faster R-CNN
      References:
      • Rich feature hierarchies for accurate object detection and semantic segmentation - Girshick, Donahue, Darrell, Malik
      • Fast R-CNN - Girshick
      • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks - Ren, He, Girshick, Sun
  • 21. Object Detection Approaches (contd.)
      • YOLO treats detection as a regression problem.
      • It is not a traditional classifier.
      • The image is divided into a grid; each cell is responsible for predicting n bounding boxes.
      • Each prediction outputs a confidence score for the predicted bounding box.
      • It also gives a probability distribution over all the classes it is trained on.
      • The confidence score and class prediction are combined into a score for object classification.
      • Based on a threshold, we determine the relevant boxes.
      • All the boxes are predicted by the neural network at once, in a single pass.
      Reference:
      • You Only Look Once: Unified, Real-Time Object Detection - Redmon, Divvala, Girshick, Farhadi
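The scoring and thresholding steps on this slide can be sketched in plain Python. This is an illustration only, not the demo code: it assumes the combined score is the product of the box confidence and the class probability, as in the YOLO paper, and the function names, the tuple layout of `predictions`, and the sample numbers are invented for the example.

```python
def class_scores(box_confidence, class_probs):
    """Combine box confidence with each class probability (assumed: a product)."""
    return [box_confidence * p for p in class_probs]


def keep_boxes(predictions, threshold):
    """Keep boxes whose best class score reaches the threshold.

    predictions: list of (box, box_confidence, class_probs) tuples, where
    box is (x, y, w, h). Returns (box, class_index, score) tuples.
    """
    kept = []
    for box, confidence, probs in predictions:
        scores = class_scores(confidence, probs)
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            kept.append((box, best, scores[best]))
    return kept


# Made-up predictions for two boxes over three classes.
predictions = [
    ((0.5, 0.5, 0.2, 0.3), 0.9, [0.1, 0.8, 0.1]),  # confident box, class 1
    ((0.1, 0.2, 0.4, 0.4), 0.2, [0.5, 0.3, 0.2]),  # low-confidence box
]
print(keep_boxes(predictions, threshold=0.5))
```

A full detector would usually follow this filter with non-maximum suppression to merge overlapping boxes, which the slide does not cover.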
  • 22. TALENTED TECHNOLOGISTS DELIVERING TODAY, LEADING INTO THE FUTURE https://www.starbucks.com/careers/