Deep Dive into Apache MXNet on AWS

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Radhika Ravirala | Solutions Architect | AWS
August 17th, 2017

Deep Learning is a coming-of-age for neural networks
and is being used to solve previously intractable
machine learning problems.
[Chart: “deep learning” trend in the past 10 years]
• image understanding
• speech recognition
• natural language processing
• autonomy

Agenda
• Applications
• Apache MXNet Overview
• Framework Comparison
• Mechanics of Apache MXNet
• Walkthrough | MXNet Jupyter Notebook
• Developer Tools and Resources

Apache MXNet
• Programmable: Simple syntax, multiple languages
• Portable: Highly efficient models for mobile and IoT
• High Performance: Near linear scaling across hundreds of GPUs
• Most Open: Accepted into the Apache Incubator
• Best On AWS: Optimized for deep learning on AWS

Amazon AI: Democratized Artificial Intelligence
• AI Services: Amazon Rekognition, Amazon Polly, Amazon Lex, more to come in 2017
• AI Platform: Amazon Machine Learning, Amazon Elastic MapReduce, Spark & SparkML, more to come in 2017
• AI Engines: Apache MXNet, Caffe, Theano, Keras, Torch, CNTK, TensorFlow
• Hardware: P2, ECS, Lambda, AWS Greengrass, FPGA, EMR/Spark, more to come in 2017

Amazon Strategy | Apache MXNet
• Integrate with AWS Services: Bring Scalable Deep Learning to AWS Services such as Amazon EMR, AWS Lambda and Amazon ECS.
• Foundation for AI Services: Amazon AI API Services, Internal AI Research, Amazon Core AI Development
• Leverage the Community: Community brings velocity and innovation with no single project owner or controller

Deep Learning using MXNet @Amazon
• Applied Research
• Core Research
• Alexa
• Demand Forecasting
• Risk Analytics
• Search
• Recommendations
• AI Services | Rek, Lex, Polly
• Q&A Systems
• Supply Chain Optimization
• Advertising
• Machine Translation
• Video Content Analysis
• Robotics
• Lots of Computer Vision..
• Lots of NLP/U..
*Teams are either actively evaluating, in development, or transitioning to scale production

Collaborations and Community
4th DL Framework in Popularity (Outpacing Torch, CNTK and Theano)
[Bar chart, scale 0 to 200, as of 2/11/17. Frameworks shown: Torch, CNTK, DL4J, Theano, Apache MXNet, Keras, Caffe, TensorFlow]
Diverse Community (Spans Industry and Academia)
[Bar chart, scale 0 to 60,000, as of 3/30/17. Contributors shown:]
• Yutian Li (Stanford)
• Nan Zhu (MSFT)
• Liang Depeng (Sun Yat-sen U.)
• Xingjian Shi (HKUST)
• Tianjun Xiao (Tesla)
• Chiyuan Zhang (MIT)
• Yao Wang (AWS)
• Jian Guo (TuSimple)
• Yizhi Liu (Mediav)
• Sandeep K. (AWS)
• Sergey Kolychev (Whitehat)
• Eric Xie (AWS)
• Tianqi Chen (UW)
• Mu Li (AWS)
• Bing Su (Apple)

Deep Learning Framework Comparison
Industry Owner
• Apache MXNet: N/A (Apache Community)
• TensorFlow: Google
• Cognitive Toolkit: Microsoft
Programmability
• Apache MXNet: Imperative and Declarative
• TensorFlow: Declarative only
• Cognitive Toolkit: Declarative only
Language Support
• Apache MXNet: R, Python, Scala, Julia, C++, JavaScript, Go, Matlab and more
• TensorFlow: Python, C++; experimental Go and Java
• Cognitive Toolkit: Python, C++, BrainScript
Code Length | AlexNet (Python)
• Apache MXNet: 44 sloc
• TensorFlow: 107 sloc using TF.Slim
• Cognitive Toolkit: 214 sloc
Memory Footprint (LSTM)
• Apache MXNet: 2.6GB
• TensorFlow: 7.2GB
• Cognitive Toolkit: N/A
*sloc: source lines of code

Multi-GPU Scaling With MXNet
[Chart: horizontal axis ticks 1, 2, 4, 8, 16; vertical axis 0 to 16. Series: Ideal, Inception v3, Resnet, Alexnet. Annotation: 91% Efficiency]

Multi-Machine Scaling With MXNet
[Chart: horizontal axis ticks 1, 2, 4, 8, 16, 32, 64, 128, 256; vertical axis 0 to 256. Series: Ideal, Inception v3, Resnet, Alexnet. Annotation: 88% Efficiency]
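
The efficiency figures on the two scaling charts can be turned into speedups with simple arithmetic. The sketch below assumes that "efficiency" means achieved speedup divided by ideal (linear) speedup, and that the 91% and 88% annotations apply to the largest points on each chart (16 and 256 GPUs). The slides state neither, so both are assumptions, and the function names are illustrative.

```python
def scaling_efficiency(speedup, num_gpus):
    """Achieved speedup as a fraction of ideal linear speedup (assumed definition)."""
    return speedup / num_gpus


def implied_speedup(efficiency, num_gpus):
    """Speedup over one GPU implied by a given efficiency."""
    return efficiency * num_gpus


# Assumption: 91% is quoted at 16 GPUs, 88% at 256 GPUs.
multi_gpu = implied_speedup(0.91, 16)
multi_machine = implied_speedup(0.88, 256)

print(f"16 GPUs at 91% efficiency: about {multi_gpu:.1f}x one GPU")
print(f"256 GPUs at 88% efficiency: about {multi_machine:.1f}x one GPU")
```

Under those assumptions, the gap between the ideal line and the measured curves is the fraction of GPU time lost to communication and synchronization.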

Apache MXNet | The Basics
• NDArray: Manipulate multi-dimensional arrays in a command line
paradigm (imperative).
• Symbol: Symbolic expression for neural networks (declarative).
• Module: Intermediate-level and high-level interface for neural
network training and inference.
• Loading Data: Feeding data into training/inference programs.
• Mixed Programming: Training algorithms developed using
NDArrays in concert with Symbols.

Anatomy of a Deep Learning Model
[Figure: deep learning models built from layers that map inputs to outputs, with feature vectors such as 0.2, -0.1, ..., 0.7 passed between them. A 3x3 grid (1 1 1 / 1 0 1 / 0 0 0) labeled Input/Weights is shown with the value 3.]
Inputs
• Image
• Video
• Speech
• Text
• Events
Layers and API calls
• mx.sym.Convolution(data, kernel=(5,5), num_filter=20)
• mx.sym.Pooling(data, pool_type="max", kernel=(2,2), stride=(2,2))
  Example window 4 2 / 2 0: 4 = Max, 2 = Avg
• mx.sym.FullyConnected(data, num_hidden=128)
• mx.sym.Activation(data, act_type="xxxx"), where act_type is "relu", "tanh", "sigmoid" or "softrelu"
• mx.symbol.Embedding(data, input_dim, output_dim = k)
  Example (Queen): cos(w, queen) = cos(w, king) - cos(w, man) + cos(w, woman)
• lstm.lstm_unroll(num_lstm_layer, seq_len, len, num_hidden, num_embed)
• mx.sym.SoftmaxOutput
• mx.model.FeedForward, model.fit
Outputs
• Image Labels: Bicycle, People, Road, Sport
• Image Caption: “People Riding Bikes”
• Machine Translation: “People Riding Bikes” to “Οι άνθρωποι ιππασίας ποδήλατα”
• Neural Art
• Face Search
• Image Segmentation
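
The pooling and activation layers named on this slide can be illustrated without MXNet. The following is a minimal pure-Python sketch, not MXNet's implementation: `pool2d` and `ACTIVATIONS` are hypothetical helper names, and "softrelu" is taken to mean log(1 + exp(x)). It reproduces the slide's 2x2 example, where the window 4 2 / 2 0 pools to 4 (max) or 2 (average).

```python
import math


def pool2d(grid, size=2, stride=2, mode="max"):
    """Pool a 2-D list of numbers with a square window (no padding)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            window = [grid[r + i][c + j] for i in range(size) for j in range(size)]
            if mode == "max":
                out_row.append(max(window))
            else:  # any other mode is treated as average pooling
                out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


# Scalar versions of the four activation types listed on the slide.
ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "tanh": math.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "softrelu": lambda x: math.log1p(math.exp(x)),
}

window = [[4, 2], [2, 0]]
max_out = pool2d(window, mode="max")
avg_out = pool2d(window, mode="avg")
print(max_out, avg_out)

for name, fn in ACTIVATIONS.items():
    print(name, [round(fn(x), 3) for x in (-2.0, 0.0, 2.0)])
```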

Imperative Programming
import numpy as np
a = np.ones(10)
b = np.ones(10) * 2
c = b * a
d = c + 1
Easy to tweak in Python
PROS
• Straightforward and flexible.
• Takes advantage of language-native features (loops, conditions, debugger).
• E.g. NumPy, Matlab, Torch, …
CONS
• Hard to optimize

Declarative Programming
A = Variable('A')
B = Variable('B')
C = B * A
D = C + 1
f = compile(D)
d = f(A=np.ones(10),
      B=np.ones(10)*2)
[Figure: computation graph in which A and B feed a multiply node (X), whose result feeds an add node (+) together with the constant 1]
C can share memory with D because C is deleted later
PROS
• More chances for optimization
• Works across different languages
• E.g. TensorFlow, Theano, Caffe
CONS
• Less flexible
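
The `Variable` / `compile` snippet on this slide is pseudocode. A minimal runnable sketch of the same idea follows: build a graph first, then evaluate it later with concrete inputs. Everything here is hypothetical scaffolding (the `Node` class, `compile_graph` in place of `compile` to avoid shadowing Python's builtin, and plain lists in place of NumPy arrays); it does no optimization and is not how MXNet or any other framework is implemented.

```python
class Node:
    """One node of a tiny expression graph: a variable, a constant, or an op."""

    def __init__(self, op, inputs=(), name=None, value=None):
        self.op, self.inputs, self.name, self.value = op, inputs, name, value

    def __mul__(self, other):
        return Node("mul", (self, _wrap(other)))

    def __add__(self, other):
        return Node("add", (self, _wrap(other)))


def Variable(name):
    return Node("var", name=name)


def _wrap(x):
    # Turn plain Python numbers into constant nodes.
    return x if isinstance(x, Node) else Node("const", value=x)


def _elementwise(fn, a, b):
    # Apply fn element by element, broadcasting scalars against lists.
    if isinstance(a, list) and isinstance(b, list):
        return [fn(x, y) for x, y in zip(a, b)]
    if isinstance(a, list):
        return [fn(x, b) for x in a]
    if isinstance(b, list):
        return [fn(a, y) for y in b]
    return fn(a, b)


def compile_graph(root):
    """Return a function that evaluates the graph for given variable values."""
    ops = {"mul": lambda x, y: x * y, "add": lambda x, y: x + y}

    def run(**feeds):
        def evaluate(node):
            if node.op == "var":
                return feeds[node.name]
            if node.op == "const":
                return node.value
            left, right = (evaluate(n) for n in node.inputs)
            return _elementwise(ops[node.op], left, right)

        return evaluate(root)

    return run


# Nothing is computed while the graph is being described...
A = Variable("A")
B = Variable("B")
C = B * A
D = C + 1

# ...only when the compiled function is called with data.
f = compile_graph(D)
d = f(A=[1.0] * 10, B=[2.0] * 10)
print(d)
```

Because the whole graph is known before any data arrives, a real framework can plan memory reuse (such as letting C and D share a buffer) ahead of time; this sketch only shows the deferred-execution part.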

Mixed Programming Paradigm
IMPERATIVE NDARRAY API
>>> import mxnet as mx
>>> a = mx.nd.zeros((100, 50))
>>> b = mx.nd.ones((100, 50))
>>> c = a + b
>>> c += 1
>>> print(c)
DECLARATIVE SYMBOLIC EXECUTOR
>>> import mxnet as mx
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, num_hidden=128)
>>> net = mx.symbol.SoftmaxOutput(data=net)
>>> texec = mx.module.Module(net)
>>> texec.forward(data=c)
>>> texec.backward()
NDArray can be set as input to the graph

Mixed Programming Paradigm
Embed symbolic expressions into imperative programming
texec = mx.module.Module(net)
for batch in train_data:
    texec.forward(batch)
    texec.backward()
    for param, grad in zip(texec.get_params(), texec.get_grads()):
        param -= 0.2 * grad
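
The loop on this slide is schematic. The same pattern can be sketched in plain Python: an imperative training loop that calls a forward/backward routine and then updates parameters by hand with the slide's step size of 0.2. The model (a single weight `w` fitted to y = 2x), the toy data, and the `forward_backward` helper are all assumptions made for illustration; the helper stands in for the symbolic executor and is not an MXNet API.

```python
def forward_backward(w, batch):
    """Stand-in for the symbolic part: gradient of the mean squared error
    of the prediction w * x with respect to w, over one batch."""
    grad = 0.0
    for x, y in batch:
        grad += 2.0 * (w * x - y) * x
    return grad / len(batch)


# Toy data: two batches of (x, y) pairs drawn from y = 2x.
train_data = [
    [(0.5, 1.0), (1.0, 2.0)],
    [(1.5, 3.0), (2.0, 4.0)],
]

w = 0.0  # the single trainable parameter
for epoch in range(20):
    for batch in train_data:
        grad = forward_backward(w, batch)  # "forward" and "backward"
        w -= 0.2 * grad                    # imperative update, as on the slide

print(f"learned w = {w:.6f}")
```

Keeping the update step in ordinary Python is what makes it easy to swap in a different rule (momentum, clipping, a custom schedule) without touching the graph definition.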

Amalgamation
• Fit the core library with all dependencies into a single C++ source file
• Easy to compile on any platform
• BlindTool by Joseph Paul Cohen, demo on Nexus 4
• Runs in browser with JavaScript
And now, even easier with Apple’s Core ML

Roadmap / Areas of Investment
• Usability
• Keras Integration / Gluon Interface
• Apple’s Core ML Converter
• MinPy being merged (dynamic computation graphs, standard NumPy interface)
• Documentation (installation, native documents, etc.)
• Tutorials, examples | Jupyter Notebooks
• Platform support
(Linux, Windows, OS X, mobile …)
• Language bindings
(Python, C++, R, Scala, Julia, JavaScript …)
• Sparse datatypes and LSTM performance improvements
• Deploy your model your way: Lambda (+Greengrass), Amazon EC2/Docker,
Raspberry Pi

Apache MXNet | Developer Tools and Resources

AWS Deep Learning AMI
One-Click GPU or CPU Deep Learning
• Up to ~40k CUDA cores
• Apache MXNet, TensorFlow, Theano, Caffe, Torch, Keras
• Pre-configured CUDA drivers, MKL
• Anaconda, Python3
• Ubuntu and Amazon Linux
• + AWS CloudFormation template
• + Container image

Application Examples | Jupyter Notebooks
• https://github.com/dmlc/mxnet-notebooks
• Basic concepts
• NDArray - multi-dimensional array computation
• Symbol - symbolic expression for neural networks
• Module - neural network training and inference
• Applications
• MNIST: recognize handwritten digits
• Check out the distributed training results
• Predict with pre-trained models
• LSTMs for sequence learning
• Recommender systems
• Train a state of the art Computer Vision model (CNN)
• Lots more..

Developer Resources
MXNet Resources:
• MXNet Blog Post | AWS Endorsement
• Read up on MXNet and Learn More: mxnet.io
• MXNet Github Repo
• MXNet Recommender Systems Talk | Leo Dirac
Developer Resources:
• Deep Learning AMI | Amazon Linux
• Deep Learning AMI | Ubuntu
• CloudFormation Template Instructions
• Deep Learning Benchmark
• MXNet on Lambda
• MXNet on ECS/Docker
• MXNet on Raspberry Pi | Image Detector using Inception Network

Apache MXNet | Jupyter Notebook Demo
Training MNIST on MXNet

Thank You!
spisakj@amazon.com
gernest@amazon.com
