Deep Learning: DL4J and DataVec

skymind.io | deeplearning4j.org | gitter.im/deeplearning4j
DL4J and DataVec
Building Production Class Deep Learning Workflows for the Enterprise
Josh Patterson / Director Field Org
MLConf 2016 / Atlanta, GA

Josh Patterson
Director Field Engineering / Skymind
Co-Author: O’Reilly’s “Deep Learning: A Practitioner’s Approach”
Past:
Self-Organizing Mesh Networks / Meta-Heuristics Research
Smartgrid work / TVA + NERC
Principal Field Architect / Cloudera

Topics
• Deep Learning in Production for the Enterprise
• DL4J and DataVec
• Example Workflow: Modeling Sensor Data with RNNs

Defining Deep Learning
Higher neuron counts than in previous generation neural networks
Different and evolved ways to connect layers inside neural networks
More computing power to train
Automated Feature Learning
“machines that learn to represent the world”

Quick Usage Guide
• If I have Timeseries or Audio Input: Use a Recurrent Neural Network
• If I have Image input: Use a Convolutional Neural Network
• If I have Video input: Use a hybrid Convolutional + Recurrent Architecture!
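The rule of thumb above can be written down as a small lookup. This is only an illustrative sketch: the class name ArchitectureGuide, the InputType enum, and the returned strings are hypothetical and are not part of DL4J.

```java
/** Hypothetical helper (not a DL4J class): maps an input modality to the architecture family from the usage guide. */
final class ArchitectureGuide {

    enum InputType { TIMESERIES, AUDIO, IMAGE, VIDEO }

    /** Returns the architecture family suggested for the given kind of input. */
    static String suggest(InputType input) {
        switch (input) {
            case TIMESERIES:
            case AUDIO:
                return "Recurrent Neural Network";
            case IMAGE:
                return "Convolutional Neural Network";
            case VIDEO:
                return "Hybrid Convolutional + Recurrent Network";
            default:
                throw new IllegalArgumentException("Unknown input type: " + input);
        }
    }

    public static void main(String[] args) {
        // Print the suggestion for every input type
        for (InputType t : InputType.values()) {
            System.out.println(t + " -> " + suggest(t));
        }
    }
}
```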

The Challenge of the Fortune 500
Take a business problem and translate it into a productizable solution
• Get data together
• Understand modeling, pull together expertise
Get the right data workflow / infrastructure architecture to productionize the application
• Security
• Integration

“Google is living a few years in the future and sending the rest of us messages”
-- Doug Cutting in 2013
However
Most organizations are not built like Google
(and Jeff Dean does not work at your company…)
Anyone building Next-Gen infrastructure has to consider these things

Production Considerations
• Security – even though I can build a model, will IT let me run it?
• Data Warehouse Integration – can I easily run this in the existing IT footprint?
• Speedup – once I need to go faster, how hard is it to speed up modeling?

DL4J and DataVec
• DL4J – Apache 2.0 licensed JVM platform for enterprise deep learning
• DataVec – a tool for machine learning ETL (Extract, Transform, Load) operations
• Both run natively on Spark, with CPU or GPU backends
• DL4J suite certified on CDH5, HDP2.4, and the upcoming IBM IOP platform

ND4J: The Need for Speed
JavaCPP
• Auto generate JNI Bindings for C++
• Allows for easy maintenance and deployment of C++ binaries in Java
CPU Backends
• OpenMP (multithreading within native operations)
• OpenBLAS or MKL (BLAS operations)
• SIMD-extensions
GPU Backends
• DL4J supports CUDA 7.5 (+cuBLAS) at the moment and will support 8.0 as soon as it comes out.
• Leverages cuDNN as well
https://github.com/deeplearning4j/dl4j-benchmark

Prepping Data is Time Consuming
http://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#633ea7f67f75

Preparing Data for Modeling is Hard

DL4J Workflow Toolchain
• ETL (DataVec)
• Vectorization (DataVec)
• Modeling (DL4J)
• Evaluation (Arbiter)
• Execution Platforms: Spark/Hadoop, Single Machine
• ND4J - Linear Algebra Runtime: CPU, GPU

Modeling Sensor Data with RNNs and DL4J

NERC Sensor Data Collection
openPDC PMU Data Collection circa 2009
• 120 Sensors
• 30 samples/second
• 4.3B Samples/day
• Housed in Hadoop

Classifying UCI Sensor Data: Trends
A – Downward Trend
B – Cyclic
C – Normal
D – Upward Shift
E – Upward Trend
F – Downward Shift

Loading and Transforming Timeseries Data with DataVec
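import org.datavec.api.records.reader.SequenceRecordReader;
import org.datavec.api.records.reader.impl.csv.CSVSequenceRecordReader;
import org.datavec.api.split.NumberedFileInputSplit;
import org.deeplearning4j.datasets.datavec.SequenceRecordReaderDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize;

// featuresDirTrain and labelsDirTrain are java.io.File directories defined elsewhere.
// Each sequence lives in its own numbered file: 0.csv through 449.csv.
SequenceRecordReader trainFeatures = new CSVSequenceRecordReader();
trainFeatures.initialize(new NumberedFileInputSplit(featuresDirTrain.getAbsolutePath() + "/%d.csv", 0, 449));
SequenceRecordReader trainLabels = new CSVSequenceRecordReader();
trainLabels.initialize(new NumberedFileInputSplit(labelsDirTrain.getAbsolutePath() + "/%d.csv", 0, 449));

int minibatch = 10;
int numLabelClasses = 6;
// false = classification (not regression); ALIGN_END aligns each label with the last time step of its sequence
DataSetIterator trainData = new SequenceRecordReaderDataSetIterator(trainFeatures, trainLabels, minibatch,
        numLabelClasses, false, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);

// Normalize the training data
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(trainData);              // Collect training data statistics
trainData.reset();
trainData.setPreProcessor(normalizer);  // Use previously collected statistics to normalize on-the-fly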
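The fit-then-apply pattern used for the normalizer can be sketched in plain Java. This is a minimal illustration of standardization (subtract the mean, divide by the standard deviation), assuming a single feature and a population standard deviation. StandardizeSketch is a hypothetical class, not the NormalizerStandardize implementation, and the library's exact estimator may differ.

```java
import java.util.Arrays;

/** Hypothetical sketch of standardization: statistics come from training data only. */
final class StandardizeSketch {
    private double mean;
    private double std;

    /** Collects mean and (population) standard deviation from the training values. */
    void fit(double[] train) {
        if (train.length == 0) {
            throw new IllegalArgumentException("train must not be empty");
        }
        double sum = 0.0;
        for (double v : train) {
            sum += v;
        }
        mean = sum / train.length;
        double squaredDeviations = 0.0;
        for (double v : train) {
            squaredDeviations += (v - mean) * (v - mean);
        }
        std = Math.sqrt(squaredDeviations / train.length);
    }

    /** Applies the previously collected statistics to any data (train or test). */
    double[] transform(double[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // Guard against a constant feature, where std would be zero
            out[i] = std == 0.0 ? 0.0 : (values[i] - mean) / std;
        }
        return out;
    }

    public static void main(String[] args) {
        StandardizeSketch s = new StandardizeSketch();
        s.fit(new double[] {2, 4, 4, 4, 5, 5, 7, 9});
        System.out.println(Arrays.toString(s.transform(new double[] {5, 9, 1})));
    }
}
```

The same fitted statistics should be reused for the test data, so that the test set is scaled exactly as the training set was.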

Configuring a Recurrent Neural Network with DL4J
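import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.GradientNormalization;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.Updater;
import org.deeplearning4j.nn.conf.layers.GravesLSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// One LSTM layer (1 input feature, 10 units) feeding a softmax output layer over the label classes
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT).iterations(1)
        .updater(Updater.NESTEROVS).momentum(0.9).learningRate(0.005)
        .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
        .gradientNormalizationThreshold(0.5)
        .list()
        .layer(0, new GravesLSTM.Builder().activation("tanh").nIn(1).nOut(10).build())
        .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .activation("softmax").nIn(10).nOut(numLabelClasses).build())
        .pretrain(false).backprop(true).build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();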
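The configuration clips gradients element-wise by absolute value with a threshold of 0.5. A minimal plain-Java sketch of that idea, assuming the gradient is a flat array, looks like the following. GradientClipSketch and clipElementWise are hypothetical names for illustration and are not DL4J's internal implementation.

```java
import java.util.Arrays;

/** Hypothetical sketch of element-wise gradient clipping by absolute value. */
final class GradientClipSketch {

    /** Returns a copy of the gradient with every element limited to [-threshold, +threshold]. */
    static double[] clipElementWise(double[] gradient, double threshold) {
        if (threshold <= 0.0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        double[] clipped = new double[gradient.length];
        for (int i = 0; i < gradient.length; i++) {
            clipped[i] = Math.max(-threshold, Math.min(threshold, gradient[i]));
        }
        return clipped;
    }

    public static void main(String[] args) {
        double[] gradient = {-2.0, 0.3, 0.5, 1.7};
        System.out.println(Arrays.toString(clipElementWise(gradient, 0.5)));
    }
}
```

Clipping of this kind is commonly used with recurrent networks to keep occasional very large gradients from destabilizing training.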

Train the Network on Local Machine
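import org.deeplearning4j.eval.Evaluation;

// testData is a DataSetIterator over the test split, built and normalized like trainData (not shown on the slide)
int nEpochs = 40;
String str = "Test set evaluation at epoch %d: Accuracy = %.2f, F1 = %.2f";
for (int i = 0; i < nEpochs; i++) {
    net.fit(trainData);

    // Evaluate on the test set:
    Evaluation evaluation = net.evaluate(testData);
    System.out.println(String.format(str, i, evaluation.accuracy(), evaluation.f1()));

    // Rewind both iterators for the next epoch
    testData.reset();
    trainData.reset();
}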
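The accuracy and F1 figures printed in the loop can be illustrated with a plain-Java sketch computed from a confusion matrix. This assumes a square matrix indexed as confusion[actual][predicted] and a macro-averaged F1 (the mean of per-class F1 scores), which is one common definition. DL4J's Evaluation class may aggregate differently, and EvalSketch is a hypothetical name.

```java
/** Hypothetical sketch: accuracy and macro-averaged F1 from a square confusion matrix. */
final class EvalSketch {

    /** Fraction of examples on the diagonal; confusion[actual][predicted] holds example counts. */
    static double accuracy(int[][] confusion) {
        long correct = 0;
        long total = 0;
        for (int a = 0; a < confusion.length; a++) {
            for (int p = 0; p < confusion[a].length; p++) {
                total += confusion[a][p];
                if (a == p) {
                    correct += confusion[a][p];
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    /** Mean of per-class F1 scores; classes with no support or no predictions contribute 0. */
    static double macroF1(int[][] confusion) {
        int n = confusion.length;
        double sum = 0.0;
        for (int c = 0; c < n; c++) {
            long truePositives = confusion[c][c];
            long actual = 0;     // row sum: examples whose true class is c
            long predicted = 0;  // column sum: examples predicted as c
            for (int k = 0; k < n; k++) {
                actual += confusion[c][k];
                predicted += confusion[k][c];
            }
            double precision = predicted == 0 ? 0.0 : (double) truePositives / predicted;
            double recall = actual == 0 ? 0.0 : (double) truePositives / actual;
            double denominator = precision + recall;
            sum += denominator == 0.0 ? 0.0 : 2.0 * precision * recall / denominator;
        }
        return n == 0 ? 0.0 : sum / n;
    }

    public static void main(String[] args) {
        int[][] confusion = {{2, 0}, {1, 1}};
        System.out.println(String.format("Accuracy = %.2f, F1 = %.2f",
                accuracy(confusion), macroF1(confusion)));
    }
}
```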

Train the Network on Spark
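import org.deeplearning4j.spark.api.TrainingMaster;
import org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer;
import org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster;

// sc is the JavaSparkContext; trainDataRDD is a JavaRDD<DataSet> of training examples (both defined elsewhere)
TrainingMaster tm = new ParameterAveragingTrainingMaster(true, executors_count, 1, batchSizePerWorker, 1, 0);

// Create Spark multi layer network from configuration
SparkDl4jMultiLayer sparkNetwork = new SparkDl4jMultiLayer(sc, net, tm);

int nEpochs = 40;
String str = "Test set evaluation at epoch %d: Accuracy = %.2f, F1 = %.2f";
for (int i = 0; i < nEpochs; i++) {
    // fit() returns the network holding the averaged parameters
    MultiLayerNetwork trained = sparkNetwork.fit(trainDataRDD);

    // Evaluate on the test set (locally, on the driver):
    Evaluation evaluation = trained.evaluate(testData);
    System.out.println(String.format(str, i, evaluation.accuracy(), evaluation.f1()));
    testData.reset();  // training reads from the RDD, so only the test iterator needs a reset
}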
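The core step behind parameter averaging can be sketched in plain Java: each worker trains its own copy of the parameters, and the copies are averaged element by element into one vector. This is a simplified illustration that assumes equally weighted workers and flat parameter arrays. ParameterAveragingSketch is a hypothetical class, not the ParameterAveragingTrainingMaster implementation.

```java
import java.util.Arrays;

/** Hypothetical sketch of the averaging step in parameter-averaging training. */
final class ParameterAveragingSketch {

    /** Averages the parameter vectors produced by each worker, element by element. */
    static double[] average(double[][] workerParams) {
        if (workerParams.length == 0) {
            throw new IllegalArgumentException("need at least one worker");
        }
        int length = workerParams[0].length;
        double[] averaged = new double[length];
        for (double[] params : workerParams) {
            if (params.length != length) {
                throw new IllegalArgumentException("all workers must have the same number of parameters");
            }
            for (int i = 0; i < length; i++) {
                averaged[i] += params[i];
            }
        }
        for (int i = 0; i < length; i++) {
            averaged[i] /= workerParams.length;
        }
        return averaged;
    }

    public static void main(String[] args) {
        double[][] workerParams = {{1.0, 2.0}, {3.0, 6.0}};
        System.out.println(Arrays.toString(average(workerParams)));
    }
}
```

In a real cluster the averaged vector would be sent back to the workers before the next round of training.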

Thank you!
Please visit skymind.io/learn for more information
OR
Visit us at booth P33
