© 2014 MapR Technologies 1© 2014 MapR Technologies
Distributed Deep Learning on Spark
Mathieu Dumoulin - Data Engineer
MapR Professional Services APAC
Tonight’s Presentation FAQ-Style
• Short intro on machine learning
• What’s deep learning?
• Why distributed? Why do we need a computer cluster?
• Why run it on Spark?
• How does it work?
– Case study of SparkNet: Training Deep Networks in Spark
– Case Study of CaffeOnSpark
• Can I see a Demo?
– Installation Process
– Caffe demo
– CaffeOnSpark demo
Machine Learning is all around us!
• Internet search with Google and Bing
• Contextual ads (Adsense)
• Apple iOS 9 and 10
• Google GMail/Inbox (Priority Inbox, Spam filtering)
• Fraud Detection
• Recommendations (Amazon)
• Image recognition (I can see… cats!)
• Language Modeling & Speech Recognition (Siri, Google Now, Google Translate)
Classification of images
Why Deep Learning?
• Because it works really, really well!
• Deep learning is the state of the art in applied machine learning
– Winning entries in major machine learning competitions
• Kaggle
• ImageNet
• Especially well suited for:
– Images (classification, object detection, etc)
– Sounds (speech, music)
– Text (translation)
• Deep learning is very compute intensive (CPU and GPU)
– More processing for better models
– More processing for faster training
MNIST digits task
• Classify handwritten digits as the correct number, 0 to 9 (60,000 training images, 10,000 test images)
Taken from Wikipedia (https://en.wikipedia.org/wiki/MNIST_database)
More deep learning results: (http://yann.lecun.com/exdb/mnist/)
Type                            Error rate (%)
K-Nearest Neighbors             0.52
Support vector machine          0.56
Deep neural network             0.35
Convolutional neural network    0.23
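To put these error rates in perspective, here is a small sketch that converts each rate into a count of misclassified digits. It assumes the figures listed above were measured on the standard 10,000-image MNIST test set; the function name is only illustrative.

```python
# Error rates (%) as listed above. The 10,000-image test set size is an
# assumption about how these figures were measured.
TEST_SET_SIZE = 10_000

error_rates = {
    "K-Nearest Neighbors": 0.52,
    "Support vector machine": 0.56,
    "Deep neural network": 0.35,
    "Convolutional neural network": 0.23,
}


def misclassified(rate_percent, n=TEST_SET_SIZE):
    """Number of wrongly classified images for an error rate given in %."""
    return round(rate_percent / 100 * n)


for model, rate in error_rates.items():
    print(f"{model}: {misclassified(rate)} errors out of {TEST_SET_SIZE}")
```

Under that assumption, the gap between the best and worst entry is a few dozen images out of ten thousand.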
Results are now competitive with humans!
Why Distributed?
“training can be time consuming, often requiring multiple days on a
single GPU using [SGD]” - Moritz et al - SparkNet
• A single physical node typically holds at most 3-4 GPUs
• A cluster can spread the CPU/GPU load at the cost of increased complexity
• Google built such software from scratch in the early 2010s.
How to Distribute: Parameter Server
• Li et al. proposed the “Parameter Server” approach in 2014
– https://www.cs.cmu.edu/~dga/papers/osdi14-paper-li_mu.pdf
Figure from Arimo’s Distributed TensorFlow blog post
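The parameter-server idea can be illustrated with a toy, single-process simulation: a server object holds the shared weights, and each worker pulls them, computes a gradient on its own data shard, and pushes the gradient back. This is only a sketch under simplifying assumptions (workers take turns instead of running asynchronously, and the model is a two-parameter linear fit); `ParameterServer` and `worker_gradient` are illustrative names, not the API of any real framework.

```python
# Toy, single-process simulation of the parameter-server pattern.
# All names here are hypothetical and chosen for illustration only.

class ParameterServer:
    """Holds the shared model weights and applies pushed gradients."""

    def __init__(self, weights, learning_rate):
        self.weights = list(weights)
        self.learning_rate = learning_rate

    def pull(self):
        # Workers fetch a copy of the current weights.
        return list(self.weights)

    def push(self, gradient):
        # Apply one SGD step using a gradient sent by a worker.
        self.weights = [w - self.learning_rate * g
                        for w, g in zip(self.weights, gradient)]


def worker_gradient(weights, shard):
    """Gradient of the mean squared error for the model y = w0 * x + w1."""
    w0, w1 = weights
    g0 = g1 = 0.0
    for x, y in shard:
        err = (w0 * x + w1) - y
        g0 += 2 * err * x
        g1 += 2 * err
    n = len(shard)
    return [g0 / n, g1 / n]


# Noise-free data generated from y = 3x + 1, split into one shard per worker.
data = [(x / 10, 3 * (x / 10) + 1) for x in range(20)]
shards = [data[0::2], data[1::2]]

server = ParameterServer(weights=[0.0, 0.0], learning_rate=0.1)
for step in range(2000):
    for shard in shards:  # each loop body plays the role of one worker
        w = server.pull()
        server.push(worker_gradient(w, shard))

print(server.weights)  # should approach [3.0, 1.0]
```

In a real deployment the workers run concurrently on different machines and the weights are usually sharded across several server nodes, which is where most of the engineering difficulty lies.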
Why Spark?
• Integrates well with existing “big data” batch processing
frameworks (Hadoop/MapReduce)
• Allows data to be kept in memory from start to finish
• Work with a single computational framework
• Relatively easy to implement a parameter server
New Frameworks for Spark-Based Distributed DL
• CaffeOnSpark (Yahoo America)
• SparkNet (UC Berkeley’s AMPLab)
• DeepLearning4J (Skymind)
• Elephas (Keras extension for Spark)
• Distributed TensorFlow (Arimo)
SparkNet implementation
From: https://arxiv.org/pdf/1511.06051v4.pdf
SparkNet implementation 2
From: https://arxiv.org/pdf/1511.06051v4.pdf
SparkNet implementation 3
From: https://arxiv.org/pdf/1511.06051v4.pdf
We need a Solver: Caffe
● (+) Good for feedforward networks and image processing
● (+) Good for finetuning existing networks
● (+) Train models without writing any code
● (+) Python interface is pretty useful
● (-) Need to write C++ / CUDA for new GPU layers
● (-) Not good for recurrent networks
● (-) Cumbersome for big networks (GoogLeNet, ResNet)
● (-) Not extensible, bit of a hairball
● (-) No commercial support
Taken from: http://deeplearning4j.org/compare-dl4j-torch7-pylearn.html#caffe
Distributed SGD and Parameter Server
From: https://arxiv.org/pdf/1511.06051v4.pdf
SparkNet’s implementation of DSGD
From: https://arxiv.org/pdf/1511.06051v4.pdf
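The scheme described in the SparkNet paper can be summarized as: broadcast the current weights, let every worker run a fixed number of SGD steps on its own data partition, then average the resulting weights and repeat. The following is a single-process sketch of that loop on a toy one-parameter model. The constants (`TAU`, learning rate, number of workers) and all function names are assumptions made for illustration, not values or code taken from SparkNet.

```python
import random

# Sketch of "local SGD, then average the weights", simulated in one process.
# TAU, the toy model and every name below are illustrative assumptions.

TAU = 5               # local SGD steps per worker between synchronizations
LEARNING_RATE = 0.05
NUM_WORKERS = 4
ROUNDS = 200


def sgd_step(w, x, y, lr):
    """One SGD step for the scalar model y = w * x with squared error."""
    grad = 2 * (w * x - y) * x
    return w - lr * grad


def local_training(w, shard, rng):
    """What one worker does: TAU SGD steps starting from the broadcast w."""
    for _ in range(TAU):
        x, y = rng.choice(shard)
        w = sgd_step(w, x, y, LEARNING_RATE)
    return w


rng = random.Random(0)
# Noise-free data from y = 2x, partitioned across the workers.
data = [(x / 10, 2 * x / 10) for x in range(1, 41)]
shards = [data[i::NUM_WORKERS] for i in range(NUM_WORKERS)]

w = 0.0
for _ in range(ROUNDS):
    # "Broadcast" w, train locally on every worker, then average the results.
    local_weights = [local_training(w, shard, rng) for shard in shards]
    w = sum(local_weights) / NUM_WORKERS

print(w)  # should approach 2.0
```

The appeal of this design is that workers only communicate once every `TAU` steps, which suits a cluster where network bandwidth between nodes is limited.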
Benefits of the approach
From: https://arxiv.org/pdf/1511.06051v4.pdf
Scaling performance of SparkNet
From: https://arxiv.org/pdf/1511.06051v4.pdf
CaffeOnSpark
• Mixed Java and Scala implementation
• Developed and used in production at Yahoo America
• Much easier to install than SparkNet, less buggy
• Can take advantage of an InfiniBand network
• Enhanced Caffe to use multi-GPU
• CaffeOnSpark executors communicate with each other via an MPI allreduce-style interface
• Spark+MPI architecture achieves performance similar to dedicated deep learning clusters
– Peer-to-peer parameter server
• Faster than SparkNet
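For readers unfamiliar with allreduce: it is a collective operation after which every worker holds the combined (for example summed) values from all workers, with no central server involved. The sketch below simulates one well-known way to implement it, a ring allreduce, in plain Python. It is meant to show the peer-to-peer idea only and is not a description of how CaffeOnSpark implements its exchange; the function name and the one-element-per-chunk simplification are assumptions.

```python
def ring_allreduce(buffers):
    """Sum equal-length vectors across workers with a simulated ring allreduce.

    buffers[r] is worker r's vector. For simplicity its length must equal
    the number of workers, so that each element acts as one chunk.
    """
    n = len(buffers)
    assert all(len(b) == n for b in buffers)
    data = [list(b) for b in buffers]

    # Phase 1, reduce-scatter: in each step every worker sends one chunk to
    # its right-hand neighbour, which adds it to its own copy. After n - 1
    # steps worker r holds the complete sum of chunk (r + 1) % n.
    for step in range(n - 1):
        messages = []
        for r in range(n):
            chunk = (r - step) % n
            messages.append((r, chunk, data[r][chunk]))
        for sender, chunk, value in messages:
            data[(sender + 1) % n][chunk] += value

    # Phase 2, allgather: the completed chunks travel around the ring and
    # overwrite the partial sums held by the other workers.
    for step in range(n - 1):
        messages = []
        for r in range(n):
            chunk = (r + 1 - step) % n
            messages.append((r, chunk, data[r][chunk]))
        for sender, chunk, value in messages:
            data[(sender + 1) % n][chunk] = value

    return data


gradients = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
summed = ring_allreduce(gradients)
averaged = [[value / len(gradients) for value in row] for row in summed]
print(summed[0])    # every worker ends up with the same summed vector
print(averaged[0])
```

Each worker only ever talks to its neighbour, so the traffic per node stays roughly constant as workers are added, which is one reason peer-to-peer exchange can outperform a single central parameter server.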
CaffeOnSpark System Architecture
From: http://yahoohadoop.tumblr.com/post/129872361846/large-scale-distributed-deep-learning-on-hadoop
CaffeOnSpark vs. SparkNet
• Much faster communication between nodes (Infiniband capability)
• Peer-to-peer parameter exchange model is a much faster implementation
• Enhanced multi-GPU Caffe also faster
Comparison of Frameworks (Spark Summit 2016)
By Yu Cao (EMC) and Zhe Dong (EMC) (Slideshare)
Benchmark 2
By Yu Cao (EMC) and Zhe Dong (EMC) (Slideshare)
Installing CaffeOnSpark
• I recommend CentOS 7 or Ubuntu 14+
• Process is very “touchy”, easy to mess up
• Go step by step!
Process:
1. Update the OS and kernel, install dev tools (gcc, etc.), then reboot
a. Disable the “nouveau” driver!!!
2. Install the latest NVIDIA drivers, CUDA 7.5, and cuDNN 4
3. Install Caffe
a. Install all Caffe dependencies; make sure it compiles and the examples run.
4. Install CaffeOnSpark
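Because the process is easy to get wrong, a small pre-flight check before building Caffe can save time. The following is a hypothetical sketch: it looks for the `nouveau` kernel module in `/proc/modules` (Linux only) and checks that a few tools are on the `PATH`. The list of tools and the function names are illustrative choices, and the script does not verify driver, CUDA, or cuDNN versions.

```python
import shutil
from pathlib import Path

# Hypothetical pre-flight check for the steps above. The tool list and the
# function names are illustrative, not part of Caffe or CaffeOnSpark.

REQUIRED_TOOLS = ["gcc", "make", "nvidia-smi", "nvcc"]


def nouveau_loaded(modules_text):
    """True if the nouveau kernel module appears in /proc/modules content."""
    return any(line.split()[0] == "nouveau"
               for line in modules_text.splitlines() if line.strip())


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def preflight():
    """Collect human-readable problems; an empty list means no issue found."""
    problems = []
    modules = Path("/proc/modules")
    if modules.exists() and nouveau_loaded(modules.read_text()):
        problems.append("nouveau driver is still loaded")
    for tool in missing_tools():
        problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Running it after step 2 and again after step 3 gives a quick signal that the driver and compiler toolchain are visible before moving on to CaffeOnSpark.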
Installing Caffe
Good tutorials are hard to find!
• Ubuntu works more “out of the box”: the default paths are all correct
• CentOS 7: a few changes are needed, but it’s still OK
The Caffe web site instructions for CentOS are a bit outdated.
Demos
• Running an example on Caffe
– Caffe deep network description files
– MNIST example
• Running an example with CaffeOnSpark
– MNIST example
– running on YARN/Spark Standalone
Q&A time
