
Image credit: NVIDIA

Deep Learning Explained


The future of Artificial Intelligence and Smart Networks

Melanie Swan
Purdue University
Scientech [email protected]
Indianapolis IN, May 6, 2019
Slides: http://slideshare.net/LaBlogga
Melanie Swan, Technology Theorist
 Philosophy Department, Purdue University,
Indiana, USA
 Founder, Institute for Blockchain Studies
 Singularity University Instructor; Institute for Ethics and
Emerging Technology Affiliate Scholar; EDGE
Essayist; FQXi Advisor
Background: traditional markets, economics and financial theory, leadership

New Economies research group


https://www.facebook.com/groups/NewEconomies

Source: http://www.melanieswan.com, http://blockchainstudies.org/NSNE.pdf, http://blockchainstudies.org/Metaphilosophy_CFP.pdf


6 May 2019
1
Deep Learning
Deep Learning Smart Network Thesis

(1) Deep learning (machine learning) is one of the latest and most important Artificial Intelligence technologies.
This is in the bigger context that
(2) Humanity is embarked on a Digital Transformation Journey, evolving into a Computation-harnessing Society with Smart Network Technologies.
(Smart networks: autonomous computing networks such as deep learning nets, blockchains, and UAV fleets)

6 May 2019 Source: Swan, M., and dos Santos, R.P. In prep. Smart Network Field Theory: The Technophysics of Blockchain and Deep Learning.
https://www.researchgate.net/publication/328051668_Smart_Network_Field_Theory_The_Technophysics_of_Blockchain_and_Deep_Learning 2
Deep Learning
Agenda
 Digital Transformation Journey
 Artificial Intelligence
 Deep Learning
 Definition
 How does it work?
 Technical details
 Applications
 Near-term
 Future
 Conclusion
 Research and Risks

Image Source: http://www.opennn.net


6 May 2019
3
Deep Learning
Digital Transformation Journey
 Digital transformation: digitizing information and processes
 $3.8 trillion global IT spend 2019 (Gartner)
 $3.9 trillion global business value derived from AI in 2022
 $1.3 trillion Digital Transformation Technologies (IDC)
 $77.6 billion spend on AI systems in 2022
 Digital transformation: technology once used to make existing work more efficient; now technology is transforming the work itself
 Blockchain, IoT, AI, Cloud technologies

Source: https://www.gartner.com/en/newsroom/press-releases/2019-01-28-gartner-says-global-it-spending-to-reach--3-8-trillio,
6 May 2019 https://www.idc.com/getdoc.jsp?containerId=prUS43381817 4
Deep Learning
Philosophy of Economic Theory
Future of the Digital Economy
Traditional Economy (1700-1970): Physical Infrastructure (Natural Resources, Electricity), Transportation Networks
Digital Economy, Phase 1 Digitization (1970-2015): Digital Infrastructure (Data, Communications), Digital Networks
Digital Economy, Phase 2 Intelligence (2015-2050, Now): Smart Infrastructure (Blockchain, Deep Learning), Intelligent Networks

6 May 2019
5


Deep Learning
Philosophy of Economic Theory
Longer-term Economic Futures
Traditional Economy (1700-1970; Atoms): Natural resources, Electricity, Manufacturing
Digital Economy, Phase 1 Digitization (1970-2015; Bits): Social Networks, Apps, Payments
Digital Economy, Phase 2 Intelligence (2015-2050, Now; Value): Blockchain, Deep Learning
Biological Economy (2020-2080; Cells): CRISPR, Bioprinting, Cellular Therapies
Space Economy (2025-2100; Energy): Mining, Settlement, Exploration

6 May 2019
6


Deep Learning
Big Data ≠ Smart Data
 Exascale supercomputing 2021e
 Exabyte global data volume 2020e: 40 EB
 Scientific, governmental, corporate, and personal

Only 6% of data is protected, and only 42% of companies say they know how to extract meaningful insights from the data available to them (Oxford Economics Workforce 2020)

6 May 2019 Sources: http://www.oyster-ims.com/media/resources/dealing-information-growth-dark-data-six-practical-steps/, https://www.theverge.com/2019/3/18/18271328/supercomputer-build-date-exascale-intel-argonne-national-laboratory-energy
7
Deep Learning
Why do we need Learning Technologies?
 Big data is not smart data (i.e. usable)
 New data science methods needed for data growth,
older learning algorithms under-performing

Source: http://blog.algorithmia.com/introduction-to-deep-learning-2016
6 May 2019
8
Deep Learning
Agenda
 Digital Transformation Journey
 Artificial Intelligence
 Deep Learning
 Definition
 How does it work?
 Technical details
 Applications
 Near-term
 Future
 Conclusion
 Research and Risks

Image Source: http://www.opennn.net


6 May 2019
9
Deep Learning
Artificial Intelligence (AI) Argument
 Artificial intelligence is using
computers to do cognitive work
(physical or mental) that usually
requires a human
 Deep Learning/Machine Learning
is the biggest area in AI

Ke Jie vs. AlphaGo AI Go player, Future of Go Summit, Wuzhen, China, May 2017

Source: Swan, M. Philosophy of Deep Learning Networks: Reality Automation Modules.


6 May 2019
10
Deep Learning
Progression in AI Learning Machines

Deep Blue (1997): Hard-coded AI machine; Single-purpose AI (hard-coded rules)
Watson (2011): Deep Learning prototype; Question-answering AI (natural-language processing)
AlphaGo (2016): Deep Learning machine; Multi-purpose AI (algorithm detects rules, reusable template)

6 May 2019
11
Deep Learning
What is Deep Learning?

Conceptual Definition:
Deep learning is a computer program that can
identify what something is

Technical Definition:
Deep learning is a class of machine learning
algorithms in the form of a neural network that
uses a cascade of layers of processing units to
extract features from data sets in order to make
predictive guesses about new data

Source: Extending Yann LeCun, http://spectrum.ieee.org/automaton/robotics/artificial-intelligence/facebook-ai-director-yann-lecun-on-deep-learning

6 May 2019
12
Deep Learning
How are AI and Deep Learning related?
 Artificial intelligence:
 Using computers to do cognitive work that usually requires a human
 Machine learning:
 Computers with the capability to learn using patterns and inference as opposed to explicit instructions
 Neural network:
 A computer system modeled on the human brain and nervous system
 Deep learning:
 Program that can recognize objects

Within the Computer Science discipline, in the field of Artificial Intelligence, Deep Learning is a class of Machine Learning algorithms in the form of a Neural Network
[Diagram: nested sets, Artificial Intelligence > Machine Learning > Neural Nets > Deep Learning]

Source: Machine Learning Guide, 9. Deep Learning


6 May 2019
13
Deep Learning
What is a Neural Net?
 Intuition: create an Artificial Neural Network to solve
problems in the same way as the human brain

6 May 2019
14
Deep Learning
Technophysics and Statistical Mechanics
Deep Learning is inspired by Physics
 Sigmoid function suggested as a model for neurons,
per statistical mechanical behavior (Cowan, 1972)
 Stationary solutions for dynamic models (asymmetric
weights create an oscillator to model neuron signaling)
 Hopfield Neural Network: content-addressable
memory system with binary threshold nodes,
converges to a local minimum (Hopfield, 1982)
 Can use statistical mechanics (Ising model of
ferromagnetism) for neurons
 Boltzmann Machine (Hinton & Sejnowski, 1983); Restricted Boltzmann Machine (Smolensky, 1986)
 Statistical mechanics and condensed matter: Boltzmann
distribution, free energy, Gibbs sampling, renormalization;
stochastic processing units with binary output
Source: https://www.quora.com/Is-deep-learning-related-to-statistical-physics-particularly-network-science
6 May 2019
15
Deep Learning
Agenda
 Digital Transformation Journey
 Artificial Intelligence
 Deep Learning
 Definition
 How does it work?
 Technical details
 Applications
 Near-term
 Future
 Conclusion
 Research and Risks

Image Source: http://www.opennn.net


6 May 2019
16
Deep Learning
Why is it called “Deep” Learning?
 Hidden layers of processing (2-20 intermediary layers)
 “Deep” networks (3+ layers) versus “shallow” (1-2 layers)
 Basic deep learning network: 5 layers; GoogleNet: 22 layers

Sandwich Architecture:
visible Input and Output layers
with hidden processing layers

GoogleNet:
22 layers

6 May 2019
17
Deep Learning
Why Deep “Learning”?
 System is “dumb” (i.e. mechanistic)
 “Learns” by having big data (lots of input examples), and making
trial-and-error guesses to adjust weights to find key features
 Creates a predictive system to identify new examples
 Usual AI argument: big enough data is what makes a
difference (“simple” algorithms run over large data sets)

Input: Big Data (e.g., many examples) -> Method: Trial-and-error guesses to adjust node weights -> Output: system identifies new examples

6 May 2019
18
Deep Learning
Sample task: is that a Car?
 Create an image recognition system that determines
which features are relevant (at increasingly higher levels
of abstraction) and correctly identifies new examples

Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf


6 May 2019
19
Deep Learning
Two classes of Learning Systems
Supervised and Unsupervised Learning
 Supervised
 Classify labeled data

 Unsupervised
 Find patterns in
unlabeled data

Source: https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning
6 May 2019
20
Deep Learning
Early success in Supervised Learning (2011)
 YouTube: user-classified data
perfect for Supervised Learning

Source: Google Brain: Le, QV, Dean, Jeff, Ng, Andrew, et al. 2012. Building high-level features using large scale unsupervised
6 May 2019 learning. https://arxiv.org/abs/1112.6209
21
Deep Learning
2 main kinds of Deep Learning neural nets
 Convolutional Neural Nets
 Image recognition
 Convolve: roll up to higher
levels of abstraction to identify
feature sets
 Recurrent Neural Nets
 Speech, text, audio recognition
 Recur: iterate over sequential
inputs with a memory function
 LSTM (Long Short-Term
Memory) remembers
sequences and avoids
gradient vanishing
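For concreteness, here is a minimal convolutional-net sketch in Python, assuming the Keras API (input shape, filter count, and class count are illustrative, not from the slides); a matching LSTM sketch accompanies the sequence-recognition slide later.

```python
import tensorflow as tf

# Minimal CNN sketch: convolve, pool (roll up to higher abstraction), classify.
# Assumed 28x28 grayscale inputs and 10 classes, as in MNIST-style tasks.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),   # pooling rolls features up a level
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```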

Source: Yann LeCun, CVPR 2015 keynote (Computer Vision ), "What's wrong with Deep Learning" http://t.co/nPFlPZzMEJ
6 May 2019
22
Deep Learning
Image Recognition and Computer Vision
History

Marvin Minsky (1966): "summer project"; Jeff Hawkins (2004): Hierarchical Temporal Memory (HTM); Quoc Le (2011): Google Brain cat recognition

Current state of
the art - 2019

Convolutional net for autonomous driving, http://cs231n.github.io/convolutional-networks

Source: Quoc Le, https://arxiv.org/abs/1112.6209; Yann LeCun, NIPS 2016,


6 May 2019 https://drive.google.com/file/d/0BxKBnD5y2M8NREZod0tVdW5FLTQ/view
23
Deep Learning
Image Classification
 Human-level image recognition and captioning

Source: https://cs.stanford.edu/people/karpathy/deepimagesent/?hn
6 May 2019
24
Deep Learning
Image Understanding
 “Understanding” is the system’s three-step process
 Image -> internal representation -> text
 Labels “tennis racket” = concepts
 Machine learning: Kantian-level object recognition, not Hegelian

Source: https://cs.stanford.edu/people/karpathy/deepimagesent/?hn
6 May 2019
25
Deep Learning
Famous Image Nets
 Image recognition (<10% error rate)
 AlexNet (2012) - 8 layers (5 convolutional)
 Error rate 15.3% versus 26.2% for the runner-up
 VGGNet (2014) - 19 CNN layers
 GoogleNet (2014) - 22 CNN layers
 BatchNorm (between Conv and Pooling)

 Microsoft ResNet (2015) - diverse layers

Sources: https://towardsdatascience.com/an-overview-of-resnet-and-its-variants-5281e2f56035,
6 May 2019 https://medium.com/coinmonks/paper-review-of-vggnet-1st-runner-up-of-ilsvlc-2014-image-classification-d02355543a11 26
Deep Learning
Speed and size of Deep Learning nets?
 Google Brain cat recognition, 2011
 1 bn connections, 10 mn images (200x200 pixel),
1,000 machines (16,000 cores), 3 days

 State of the art, 2016-2019


 NVIDIA facial recognition, 100 million images, 10
layers, 1 bn parameters, 30 exaflops, 30 GPU days
 Google Net, 11.2 bn parameter system
 Lawrence Livermore Lab, 15 bn parameter system
 Digital Reasoning, “cognitive computing” (Nashville
TN), 160 bn parameters, trains on three multi-core
computers overnight

Parameters: variables that determine the network structure


6 May 2019 Sources: https://futurism.com/biggest-neural-network-ever-pushes-ai-deep-learning, Digital Reasoning paper: https://arxiv.org/pdf/1506.02338v3.pdf 27
Deep Learning
Agenda
 Digital Transformation Journey
 Artificial Intelligence
 Deep Learning
 Definition
 How does it work?
 Technical details
 Applications
 Near-term
 Future
 Conclusion
 Research and Risks

Image Source: http://www.opennn.net


6 May 2019
28
Deep Learning
Problem: correctly recognize “apple”

Source: Michael A. Nielsen, Neural Networks and Deep Learning


6 May 2019
29
Deep Learning
Modular Processing Units
 Unit: processing unit, logit (logistic
regression unit), perceptron, artificial neuron

1. Input -> 2. Hidden layers -> 3. Output

[Diagram: grid of processing units arranged into an input layer, several hidden layers, and an output layer]
Source: http://deeplearning.stanford.edu/tutorial
6 May 2019
30
Deep Learning
Image Recognition
Digitize Input Data into Vectors
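A minimal Python sketch of this digitization step, assuming the Pillow imaging library and a hypothetical 28x28 grayscale file digit.png:

```python
import numpy as np
from PIL import Image

# Digitize an image into a flat numeric input vector
img = Image.open("digit.png").convert("L").resize((28, 28))  # grayscale
x = np.asarray(img, dtype=np.float32).flatten() / 255.0      # 784 values in [0, 1]
print(x.shape)  # (784,)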

Source: Quoc V. Le, A Tutorial on Deep Learning, Part 1: Nonlinear Classifiers and The Backpropagation Algorithm, 2015, Google
6 May 2019 Brain, https://cs.stanford.edu/~quocle/tutorial1.pdf
31
Deep Learning
Image Recognition
Log features and trial-and-error test
1. Input -> 2. Hidden layers -> 3. Output -> Guess (inference, 0 or 1) vs. Actual

Feed-forward pass: input features (Feature 1, 2, 3) propagate through weighted nodes (initial weights, e.g., .5) to output guesses: .75 -> guess 1, actual 1 (correct); .25 -> guess 0, actual 1 (incorrect)
Backward pass: update probabilities (weights) per correct guess

 Mathematical methods used to update the weights


 Linear algebra: matrix multiplications of input vectors
 Statistics: logistic regression units (Y/N (0,1)), probability weighting
and updating, inference for outcome prediction
 Calculus: optimization (minimization), gradient descent in back-
propagation to avoid local minima with saddle points

Source: http://deeplearning.stanford.edu/tutorial; MNIST dataset: http://yann.lecun.com/exdb/mnist


6 May 2019
32
Deep Learning
Image Recognition
Levels of Abstraction Object Recognition
 Layer 1: Log all features (line, edge, unit of sound)
 Layer 2: Identify more complicated features (jaw line,
corner, combination of speech sounds)
 Layer 3+: Push features to higher levels of abstraction
until full objects can be recognized

Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf


6 May 2019
33
Deep Learning
Image Recognition
Higher Abstractions of Feature Recognition

Source: https://adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html
6 May 2019
34
Deep Learning
Example: NVIDIA Facial Recognition
 First hidden layer extracts all possible low-level features
from data (lines, edges, contours); next layers abstract
into more complex features of possible relevance

Source: NVIDIA
6 May 2019
35
Deep Learning
Deep Learning

Source: Quoc V. Le et al, Building high-level features using large scale unsupervised learning, 2011, https://arxiv.org/abs/1112.6209
6 May 2019
36
Deep Learning
Speech, Text, Audio Recognition
Sequence-to-sequence Recognition + LSTM
 LSTM: Long Short Term Memory
 Technophysics technique: each subsequent layer remembers
data for twice as long (fractal-type model)
 The “grocery store” not the “grocery church”
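A minimal sequence-model sketch with an LSTM layer, assuming the Keras API in TensorFlow (vocabulary size and layer widths are illustrative):

```python
import tensorflow as tf

# Minimal sequence-recognition sketch: embed tokens, remember across the
# sequence with an LSTM (whose gates mitigate vanishing gradients), classify.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```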

Source: Andrew Ng
6 May 2019
37
Deep Learning
Agenda
 Digital Transformation Journey
 Artificial Intelligence
 Deep Learning
 Definition
 How does it work?
 Technical details
 Applications
 Near-term
 Future
 Conclusion
 Research and Risks

Image Source: http://www.opennn.net


6 May 2019
38
Deep Learning
3 Key Technical Aspects of Deep Learning
 Logistic regression, Lego-like structure of layers of
processing units, and finding the minimum of the curve
1. Sigmoid Function
 What: squash values into a sigmoidal S-curve: binary values (Y/N, 0/1), probability values (0 to 1), or tanh values (-1 to 1)
 Why: a non-linear curve (logistic regression) means manipulability

2. Perceptron Structure
 What: core processing unit (input-processing-output); levers: weights and bias
 Why: the "dumb" system learns by adjusting parameters and checking against outcomes

3. Loss Function
 What: reduce combinatoric dimensionality
 Why: the loss function optimizes the efficiency of the solution
6 May 2019
39
Deep Learning
1. Regression
Linear Regression
 Regression: how does one variable relate to another

[Scatter plot: house price vs. size (square feet), with fitted line y = mx + b]
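A minimal Python sketch fitting y = mx + b by ordinary least squares (the data points are hypothetical):

```python
import numpy as np

# Fit house price vs. size with ordinary least squares: y = m*x + b
size = np.array([1000.0, 1500.0, 2000.0, 2500.0])   # square feet
price = np.array([200.0, 290.0, 405.0, 495.0])      # price in $1,000s
m, b = np.polyfit(size, price, 1)                   # slope m, intercept b
print(f"price = {m:.3f} * sqft + {b:.1f}")
```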

Source: https://www.statcrunch.com/5.0/viewreport.php?reportid=5647
6 May 2019
40
Deep Learning
Logistic Regression

Source: http://www.simafore.com/blog/bid/99443/Understand-3-critical-steps-in-developing-logistic-regression-models
6 May 2019
41
Deep Learning
Logistic Regression
 Higher-order mathematical formulation: the Sigmoid Function
 Sigmoid function
 S-shaped and bounded
 Maps the whole real axis into a finite
interval (0-1)
 Non-linear
 Can fit probability
 Can apply optimization techniques

 Deep Learning classification


predictions are in the form of a
probability value
[Figure: unit step function, for comparison with the smooth sigmoid]
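A minimal Python sketch of the sigmoid mapping the whole real axis into the finite interval (0, 1):

```python
import numpy as np

# Sigmoid: S-shaped, bounded, non-linear; outputs can be read as probabilities
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-6.0, 0.0, 6.0])))  # ~[0.0025, 0.5, 0.9975]
```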

Source: https://www.quora.com/Logistic-Regression-Why-sigmoid-function
6 May 2019
42
Deep Learning
Sigmoid function: Taleb
 Thesis: mapping a phenomenon to an s-curve ("convexifying" it) means its risk may be controlled
 Antifragility = convexity = risk-manageable
 Fragility = concavity
 Non-linear dose response in medicine
suggests treatment optimality
 U-shaped, j-shaped curves in hormesis
(biphasic response); Bell’s theorem

Source: Swan, M. (2019). Blockchain Theory of Programmable Risk: Black Swan Smart Contracts. In Blockchain Economics: Implications
6 May 2019 of Distributed Ledgers - Markets, communications networks, and algorithmic reality. London: World Scientific.
43
Deep Learning
Regression (summary)

Linear Regression vs. Logistic Regression (sigmoid function, 0 to 1; or tanh, -1 to 1)

 Linear regression  Logistic regression


 Predict continuous set  Predict binary outcomes:
of values (house prices)  Perceptron (0 or 1)
 Predict probabilities:
 Sigmoid Neuron (values 0-1)
 Tanh Hyperbolic Tangent
Neuron (values (-1)-1)

6 May 2019
44
Deep Learning
2. Lego-like layers of processing units: Modular Processing Units

Deep Learning Architecture

Source: Michael A. Nielsen, Neural Networks and Deep Learning


6 May 2019
45
Deep Learning
More complicated in actual use
 Convolutional neural net scale-up for
number recognition
 Example data: MNIST dataset
 http://yann.lecun.com/exdb/mnist

Source: http://www.kdnuggets.com/2016/04/deep-learning-vs-svm-random-forest.html
6 May 2019
46
Deep Learning
Node Structure: Computation Graph

[Diagram: computation graph. Edges carry values; nodes apply operations.]

Architecture Example 1: input edges 3 and 4 flow into an Add node; the output edge carries 3 + 4 = 7
Example 2: the same input edges flow into a Multiply node; the output edge carries 3 * 4 = 12
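A minimal Python sketch of the same two graphs, with plain functions standing in for the Add and Multiply nodes:

```python
# Computation-graph sketch: nodes are operations, edges carry values
def add_node(a, b):
    return a + b        # Example 1: Add node

def mul_node(a, b):
    return a * b        # Example 2: Multiply node

print(add_node(3, 4))   # 7
print(mul_node(3, 4))   # 12
```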
6 May 2019
47
Deep Learning
Basic node with Weights and Bias
 Basic node structure is fixed: input-processing-output
 Weight and bias are variable parameters that are
adjusted as the system iterates and “learns”

Basic Node Structure (fixed): Input -> Processing (node operation) -> Output
Basic Node with Weights and Bias (variable): input values carry edge weights w; nodes have a bias b

Worked example:
 Input value 4 with edge weight w1 = .25: w1*x1 = .25*4 = 1
 Input value 16 with edge weight w2 = .75: w2*x2 = .75*16 = 12
 Node operation N + b: (1 + 12) + 2 = 15
 Output value = 15

Mimics NAND gate
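A minimal Python sketch reproducing the worked example above (one node, two weighted inputs, a bias):

```python
import numpy as np

# One processing unit: weighted sum of inputs plus bias
def node(x, w, b):
    return float(np.dot(w, x) + b)

# Worked example from the slide: .25*4 + .75*16 = 13, plus bias 2 = 15
print(node(np.array([4.0, 16.0]), np.array([0.25, 0.75]), 2.0))  # 15.0
```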

Source: http://neuralnetworksanddeeplearning.com/chap1.html
6 May 2019
48
Deep Learning
Image Recognition
Log features and trial-and-error test
1. Input -> 2. Hidden layers -> 3. Output -> Guess (inference, 0 or 1) vs. Actual

Feed-forward pass: input features (Feature 1, 2, 3) propagate through weighted nodes (initial weights, e.g., .5) to output guesses: .75 -> guess 1, actual 1 (correct); .25 -> guess 0, actual 1 (incorrect)
Backward pass: update probabilities (weights) per correct guess

 Mathematical methods used to update the weights


 Linear algebra: matrix multiplications of input vectors
 Statistics: logistic regression units (Y/N (0,1)), probability weighting
and updating, inference for outcome prediction
 Calculus: optimization (minimization), gradient descent in back-
propagation to avoid local minima with saddle points

Source: http://deeplearning.stanford.edu/tutorial; MNIST dataset: http://yann.lecun.com/exdb/mnist


6 May 2019
49
Deep Learning
Actual: same structure, more complicated

6 May 2019
50
Deep Learning
Same structure, more complicated values

Source: https://medium.com/@karpathy/software-2-0-a64152b37c35
6 May 2019
51
Deep Learning
Neural net: massive scale-up of nodes

Source: http://neuralnetworksanddeeplearning.com/chap1.html
6 May 2019
52
Deep Learning
Same Structure

6 May 2019
53
Deep Learning
How does the neural net actually “learn”?
 Structural system based on cascading layers of
neurons with variable parameters: weight and bias
 Vary the weights
and biases to see if
a better outcome is
obtained
 Repeat until the net
correctly classifies
the data

Source: http://neuralnetworksanddeeplearning.com/chap2.html
6 May 2019
54
Deep Learning
3. Loss function optimization
Backpropagation
 Problem: Combinatorial complexity
 Inefficient to test all possible parameter variations

 Solution: Backpropagation (1986 Nature paper)


 Optimization method used to calculate the error
contribution of each neuron after a batch of data is
processed

Source: http://neuralnetworksanddeeplearning.com/chap2.html
6 May 2019
55
Deep Learning
Backpropagation of errors
1. Calculate the total error
2. Calculate the contribution to the error at each step
going backwards
 Variety of Error Calculation methods: Mean Square Error
(MSE), sum of squared errors of prediction (SSE), Cross-
Entropy (Softmax), Softplus
 Goal: identify which feature solutions have a higher
power of potential accuracy
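As a concrete instance of one method, here is a minimal Python sketch of the softmax that underlies cross-entropy classification:

```python
import numpy as np

# Softmax: turn raw output scores into a probability distribution
def softmax(z):
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.66, 0.24, 0.10], sums to 1
```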

6 May 2019
56
Deep Learning
Backpropagation
 Heart of Deep Learning
 Backpropagation: algorithm dynamically calculates
the gradient (derivative) of the loss function with
respect to the weights in a network to find the
minimum and optimize the function from there
 Algorithms optimize the performance of the network by adjusting the weights, e.g., in the gradient descent algorithm
 Error and gradient are computed for each node
 Intermediate errors transmitted backwards through the
network (backpropagation)
 Objective: optimize the weights so the network can
learn how to correctly map arbitrary inputs to outputs
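A minimal backpropagation sketch in Python for a single sigmoid neuron with squared error (the input, target, and learning rate are illustrative, not the network from the slides): the forward pass computes the output; the backward pass pushes the error derivative through the chain rule to update weight and bias.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x, y = np.array([0.5, -1.0]), 1.0      # input vector and target label
w, b = rng.normal(size=2), 0.0         # initial weights and bias

for _ in range(500):
    a = sigmoid(w @ x + b)             # forward pass
    err = a - y                        # dE/da for E = 0.5 * (a - y)^2
    delta = err * a * (1.0 - a)        # chain rule through the sigmoid
    w -= 0.5 * delta * x               # gradient step on the weights
    b -= 0.5 * delta                   # ...and on the bias

print(round(float(sigmoid(w @ x + b)), 3))  # output approaches the target 1.0
```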

Source: http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4,
6 May 2019 https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
57
Deep Learning
Gradient Descent
 Gradient: derivative used to find the minimum of a function
 Gradient descent: optimization algorithm that steps against the gradient to reduce the biggest errors (reach minima) most quickly
 Error measures: MSE, log loss, cross-entropy, i.e., how far predictions are from correctly identifying the data
 Technophysics methods: spin glass, simulated
annealing
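A minimal gradient-descent sketch in Python on a one-parameter quadratic loss (the loss, learning rate, and iteration count are illustrative):

```python
# Gradient descent on L(w) = (w - 3)^2, whose minimum is at w = 3
w, lr = 0.0, 0.1
for _ in range(100):
    grad = 2 * (w - 3)   # dL/dw
    w -= lr * grad       # step against the gradient
print(round(w, 4))       # ~3.0, the minimum
```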

Source: http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4
6 May 2019
58
Deep Learning
Loss Function
 Optimization Technique
 Mathematical tool used in statistics, finance, decision theory, biological modeling, computational neuroscience
 State as non-linear equation to optimize
 Minimize loss or cost
 Maximize reward, utility, profit, or fitness
 Loss function links instance of an event to its cost
 Accident (event) means $1,000 damage on average (cost)
 5 cm height (event) confers 5% fitness advantage (reward)
 Deep learning: system feedback loop
 Apply cost penalty for incorrect classifications in training
 Methods: CNN (classification): cross-entropy; RNN
(regression): MSE
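Minimal Python sketches of the two loss functions named above (the arrays are illustrative):

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: common for regression
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true, p_pred):
    # Binary cross-entropy: penalizes confident wrong classifications
    return -np.mean(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))

print(mse(np.array([1.0, 2.0]), np.array([1.1, 1.8])))        # 0.025
print(cross_entropy(np.array([1, 0]), np.array([0.9, 0.2])))  # ~0.164
```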
6 May 2019
59
Deep Learning
Known problems: Overfitting
 Regularization
 Introduce additional information
such as a lambda parameter in the
cost function (to update the theta
parameters in the gradient descent
algorithm)
 Dropout: prevent complex
adaptations on training data by
dropping out units (both hidden and
visible)
 Test new datasets
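Minimal Python sketches of the two regularization ideas above (the dropout rate and lambda are illustrative assumptions):

```python
import numpy as np

def dropout(activations, rate=0.5, rng=np.random.default_rng(0)):
    # Randomly zero units during training; rescale so expected values match
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

def l2_cost(base_cost, weights, lam=0.01):
    # Lambda penalty added to the cost function to discourage large weights
    return base_cost + lam * np.sum(weights ** 2)

print(dropout(np.ones(6)))                    # roughly half zeroed, rest scaled
print(l2_cost(0.35, np.array([0.5, -1.5])))   # 0.35 + 0.01*2.5 = 0.375
```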

6 May 2019
60
Deep Learning
Agenda
 Digital Transformation Journey
 Artificial Intelligence
 Deep Learning
 Definition
 How does it work?
 Technical details
 Applications
 Near-term
 Future
 Conclusion
 Research and Risks

Image Source: http://www.opennn.net


6 May 2019
61
Deep Learning
Applications: Cats to Cancer to Cognition

Computational imaging: Machine learning for 3D microscopy


https://www.nature.com/nature/journal/v523/n7561/full/523416a.html

Source: Yann LeCun, CVPR 2015 keynote (Computer Vision ), "What's wrong with Deep Learning" http://t.co/nPFlPZzMEJ
6 May 2019
62
Deep Learning
Radiology: Tumor Image Recognition
 Computer-Aided
Diagnosis with
Deep Learning
 Breast tissue
lesions in images
 Pulmonary nodules
in CT Scans

Source: https://www.nature.com/articles/srep24454
6 May 2019
63
Deep Learning
Melanoma Image Recognition

2017

Source: Nature, volume 542, pages 115-118 (02 February 2017), http://www.nature.com/nature/journal/v542/n7639/full/nature21056.html

6 May 2019
64
Deep Learning
Melanoma Classification
 Diagnose skin cancer using deep learning CNNs
 Algorithm trained to detect skin cancer (melanoma)
using 130,000 images of skin lesions representing over
2,000 different diseases

Source: https://www.techemergence.com/machine-learning-medical-diagnostics-4-current-applications/
6 May 2019
65
Deep Learning
DIY Image Recognition: use Contrast
Apple or Orange? Melanoma risk or healthy skin?

Degree of contrast in photo colors?

How many orange pixels?

Source: https://developer.clarifai.com/models
6 May 2019
66
Deep Learning
Deep Learning and Genomics: RNNs
 Large classes of hypothesized but unknown correlations
 Genotype-phenotype disease linkage unknown
 Computer-identifiable patterns in genomic data
 RNN: textual analysis; CNN: genome symmetry

Source: http://ieeexplore.ieee.org/document/7347331
6 May 2019
67
Deep Learning
AI Medical Diagnosis
 Earlier stage diagnosis, personalized, world health clinic
 Smartphone-based diagnostic tools with AI for optical
detection and EVA (enhanced visual assessment)

Source: https://spectrum.ieee.org/biomedical/devices/ai-medicine-comes-to-africas-rural-clinics
6 May 2019
68
Deep Learning
Deep Learning World Clinic
 WHO estimates 400 million people without
access to essential health services
 6% in extreme poverty due to healthcare costs
 Next leapfrog technology: Deep Learning
 Last-mile build out of brick-and-mortar clinics
does not make sense in era of digital medicine
 Medical diagnosis via image recognition, natural
language processing symptoms description
 Convergence Solution: Digital Health Wallet
 Deep Learning medical diagnosis + Blockchain-
based EMRs (electronic medical records)
 Empowerment Effect: Deep learning = "tool I use," not hierarchically "doctor-administered"

[Figure: Digital Health Wallet: Deep Learning diagnosis + Blockchain-based EMRs]

Source: http://www.who.int/mediacentre/news/releases/2015/uhc-report/en/
6 May 2019
69
Deep Learning
Deep Learning and the Brain

6 May 2019
70
Deep Learning
Deep Qualia machine? General purpose AI
Mutual inspiration of neurological and computing research

 Deep learning neural networks are inspired by the


structure of the cerebral cortex
 The processing unit, perceptron, artificial neuron is the
mathematical representation of a biological neuron
 In the cerebral cortex, there can be several layers of
interconnected perceptrons

6 May 2019
71
Deep Learning
Brain is hierarchically organized
 Visual cortex is hierarchical with intermediate layers
 The ventral (recognition) pathway in the visual cortex has multiple
stages: Retina - LGN - V1 - V2 - V4 - PIT – AIT
 Human brain simulation projects
 Swiss Blue Brain project, European Human Brain Project

Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf


6 May 2019
72
Deep Learning
Agenda
 Digital Transformation Journey
 Artificial Intelligence
 Deep Learning
 Definition
 How does it work?
 Technical details
 Applications
 Near-term
 Future
 Conclusion
 Research and Risks

Image Source: http://www.opennn.net


6 May 2019
73
Deep Learning
The farther future is not a better horse but a new technology:

better horse -> "horseless carriage" => car
6 May 2019
74
Deep Learning
Autonomous Driving
 Deep Learning
 Identify what things are
 CNNs: core element of machine
vision systems
 Scenario-based decision-making

6 May 2019
75
Deep Learning
The Very Small
Deep Learning in Cells
 On-board pacemaker data security,
software updates, patient monitoring
 Medical nanorobotics for cell repair
 Deep Learning: identify what things are
(diagnosis)
 Blockchain: secure automation technology
 Bio-cryptoeconomics: secure automation
of medical nanorobotics for cell repair
 Medical nanorobotics as coming-onboard
repair platform for the human body
 High number of agents and “transactions”
 Identification and automation is obvious

Sources: Swan, M. Blockchain Thinking: The Brain as a DAC (Decentralized Autonomous Corporation)., IEEE 2015; 34(4): 41-52 , Swan,
6 May 2019 M. Forthcoming. Technophysics, Smart Health Networks, and the Bio-cryptoeconomy: Quantized Fungible Global Health Care Equivalency
Units for Health and Well-being. In Boehm, F. Ed., Nanotechnology, Nanomedicine, and AI. Boca Raton FL: CRC Press 76
Deep Learning
The Very Small
Human Brain/Cloud Interface

Sources: Martins, Swan, Freitas Jr., et. al. 2019. Human Brain/Cloud Interface. Front. Neurosci.
6 May 2019
77
Deep Learning
The Very Large
Deep Learning in Space
 Satellite networks
 Automated space
construction bots/agents
 Deep Learning: identify
what things are
(classification)
 Blockchain: secure
automation technology
 Applications: asteroid
mining, terraforming,
radiation-monitoring,
space-based solar power,
debris tracking net
6 May 2019
78
Deep Learning
Quantum Machine Learning
 Quantum Computing: assign an amplitude (not a
probability) for possible states of the world
 Amplitudes can interfere destructively and cancel out,
be complex numbers, not sum to 1
 Feynman: “QM boils down to the minus signs”
 QC: a device that maintains a state that is a
superposition for every configuration of bits
 Turn amplitude into probabilities (event probability is
the squared absolute value of its amplitude)
 Challenge: obtain speed advantage by exploiting
amplitudes, need to choreograph a pattern of
interference (not measure random configurations)
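A minimal Python sketch of the amplitude-to-probability rule (event probability is the squared absolute value of its amplitude), for an assumed two-state superposition:

```python
import numpy as np

# Amplitudes may be negative or complex; probabilities are |amplitude|^2
amps = np.array([1 / np.sqrt(2), -1 / np.sqrt(2)], dtype=complex)
probs = np.abs(amps) ** 2
print(probs, probs.sum())  # [0.5 0.5] and 1.0
```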

Sources: Scott Aaronson; and Biamonte, Lloyd, et al. (2017). Quantum machine learning. Nature. 549:195–202.
6 May 2019
79
Deep Learning
Agenda
 Digital Transformation Journey
 Artificial Intelligence
 Deep Learning
 Definition
 How does it work?
 Technical details
 Applications
 Near-term
 Future
 Conclusion
 Research and Risks

Image Source: http://www.opennn.net


6 May 2019
80
Deep Learning
Research Topics
 Layer depth vs. height: (1x9, 3x3, etc.); L1/2 slow-downs

 Dark knowledge: data compression, compress dark


(unseen) knowledge into a single summary model
 Adversarial networks: two networks, adversary network
generates false data and discriminator network identifies
 Reinforcement networks: goal-oriented algorithm for
system to attain a complex objective over many steps
Source: http://cs231n.github.io/convolutional-networks, https://arxiv.org/abs/1605.09304,
6 May 2019 https://www.iro.umontreal.ca/~bengioy/talks/LondonParisMeetup_15April2015.pdf
81
Deep Learning
Research Topics
 Language representation models
 BERT (Bidirectional Encoder Representations from Transformers)
 Deep Belief Network
 Connections between layers, not units
 Find initial weighting guesses for units as system pre-processing step
 Deep Boltzmann Machine
 Stochastic recurrent neural network
 Internal representations of learning
 Represent and solve combinatoric problems

[Figures: Deep Belief Network; Deep Boltzmann Machine]

Sources: Devlin et al. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,
6 May 2019 http://prog3.com/sbdm/blog/zouxy09/article/details/8781396 82
Deep Learning
Google Deep Dream net
 Deep dream generated images
 Not random pasting of dog snouts
 System synthesizes every pixel in
context, and determines good places
for dog snouts

Source: Georges Seurat, Un dimanche après-midi à l'Île de la Grande Jatte, 1884-1886;


6 May 2019 http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722; Google DeepDream uses algorithmic pareidolia (seeing an image
when none is present) to create a dream-like hallucinogenic appearance 83
Deep Learning
Hardware and Software Innovation

6 May 2019
84
Deep Learning
Hardware advance
TPU and GPU clusters
 Chip design and cloud data center
architecture
 GPU chips (graphics processing unit): 3D
graphics cards for fast matrix multiplication
 Google TPU chip (tensor processing unit):
flow through matrix multiplications without
storing interim values in memory (AlphaGo)
[Image: Google TPU cloud and chip]
 Chip design advances
 Google Cloud TPUs: ML accelerators for
TensorFlow; TPU 3.0 pod (8x more
powerful, up to 100 petaflops (2018))
 NVIDIA DGX-1 integrated deep learning
system (Eight Tesla P100 GPU
accelerators)
NVIDIA DGX-1
Source: http://www.techradar.com/news/computing-components/processors/google-s-tensor-processing-unit-explained-this-is-what-
6 May 2019 the-future-of-computing-looks-like-1326915
85
Deep Learning
Software advance
What is TensorFlow? Google's open-source machine learning library

 “Tensor” = multidimensional arrays used in NN operations


 “Flow” directly through tensor operations (matrix multiplications)
without needing to store intermediate values in memory

[Figures: computation graph design in TensorFlow; TensorBoard visualization; Python code invoking TensorFlow]
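A minimal sketch of tensors flowing through a matrix multiplication, assuming TensorFlow 2's eager execution:

```python
import tensorflow as tf

# "Tensors" (multidimensional arrays) "flow" through a matrix multiplication
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0], [0.5]])
print(tf.matmul(a, b))  # a 2x1 tensor: [[2.0], [5.0]]
```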


Source: https://www.youtube.com/watch?v=uHaKOFPpphU
6 May 2019
86
Deep Learning
Network advance
Edge Device-based Machine Learning
 Surveillance camera, USB and
Browser-based Machine Learning
 Intel: Movidius Visual Processing
Unit (VPU): USB ML for IOT
 Security cameras, industrial
equipment, robots, drones
 Apple: ML acquisition Turi (Dato)
 Browser-based Deep Learning
 ConvNetJS; TensorFire
 Javascript library to run Deep
Learning nets in a browser
 Smart Network in a browser
 JavaScript Deep Learning
 Blockchain EtherWallets

Source: http://cs.stanford.edu/people/karpathy/convnetjs/, http://www.infoworld.com/article/3212884/machine-learning/machine-learning-


6 May 2019 comes-to-your-browser-via-javascript.html
87
Deep Learning
Risks and Limitations of Deep Learning
 Complicated conceptually and technically
 Skilled workforce
 Limited solution 2018

 So far, restricted to a specific range of applications (supervised


learning for image and text recognition)
 Plateau: cheap hardware and already-labeled data sets; need
to model complex network science relationships between data
 Non-generalizable intelligence
 DeepMind's agents learn each game from scratch (AlphaGo for Go; the Atari players for each arcade game)
 How does the “black box” system work?
 Claim: no “learning,” just a clever mapping of the input data
vector space to output solution vector space

Source: Battaglia et al. 2018. Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261.
6 May 2019
88
Deep Learning
Conclusion
 Deep learning is an AI software technology for identifying objects
 Applications: healthcare, autonomous driving, robotics
 Deep learning is not merely an AI technique or a software program, but a new class of smart network information technology that is changing the concept of the modern technology project by offering real-time engagement with reality
 Deep learning is a data automation method that replaces hard-coded software with a capacity, in the form of a learning network that is trained to perform a task
6 May 2019 89
Deep Learning
Deep Learning Smart Network Thesis

(1) Deep learning (machine learning) is one of the latest and most important Artificial Intelligence technologies.
This is in the bigger context that
(2) Humanity is embarked on a Digital Transformation Journey, evolving into a Computation-harnessing Society with Smart Network Technologies.
(Smart networks: autonomous computing networks such as deep learning nets, blockchains, and UAV fleets)

6 May 2019 Source: Swan, M., and dos Santos, R.P. In prep. Smart Network Field Theory: The Technophysics of Blockchain and Deep Learning.
https://www.researchgate.net/publication/328051668_Smart_Network_Field_Theory_The_Technophysics_of_Blockchain_and_Deep_Learning 90
Deep Learning
Possibility space of Intelligence
 Machine intelligence as its own species

Sources: http://hplusmagazine.com/2015/09/02/the-space-of-mind-designs-and-the-human-mental-model/,
6 May 2019 https://www.nature.com/articles/s41586-019-1138-y 91
Deep Learning
Smart networks
 The network is the computer

Computer networking Computer networks Computing networks


1970-1980 1990-2010 2015+

Source: https://towardsdatascience.com/a-weird-introduction-to-deep-learning-7828803693b0
6 May 2019
92
Deep Learning
Resources
 Neural Networks and Deep Learning, Michael Nielsen,
http://neuralnetworksanddeeplearning.com/

 Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, http://www.deeplearningbook.org/
 Machine Learning Guide podcast, Tyler Renelle,
http://ocdevel.com/podcasts/machine-learning

 notMNIST dataset http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html


 Metacademy; Fast.ai; Keras.io
 https://www.deeplearning.ai/
 Distill (visual ML journal), http://distill.pub

Source: http://cs231n.stanford.edu
6 May 2019
93
Deep Learning
Deep Learning frameworks and libraries

Source: http://www.infoworld.com/article/3163525/analytics/review-the-best-frameworks-for-machine-learning-and-deep-
6 May 2019 learning.html#tk.ifw-ifwsb
94
Deep Learning
Future of AI and Smart Networks
Source: https://www.nvidia.com/en-us/deep-learning-ai/industries
Image credit: NVIDIA

Deep Learning Explained


The future of Artificial Intelligence and Smart Networks

Thank You! Questions?

Melanie Swan
Purdue University
Scientech [email protected]
Indianapolis IN, May 6, 2019
Slides: http://slideshare.net/LaBlogga
Technophysics Research Program: Application of physics principles to technology

Biophysics
• Disease causality: role of cellular dysfunction and environmental degradation
• Concentration limits in short and long range inter-cellular signaling
• Boltzmann distribution and diffusion limits in RNAi and SiRNA delivery

Econophysics
• Path integrals extend point calculations in dynamical systems
• General (not only specialized) Schrödinger for Black-Scholes option pricing
• Quantum game theory (greater than fixed sum options), Quantum finance

Technophysics: The application of physics principles to the study of technology (particularly statistical physics and information theory for the control of complex networks)

General Topics
• Apply renormalization group to system criticality and phase transition detection (Aygun, Goldenfeld) and extend tensor network renormalization (Evenbly, Vidal)
• Unifying principles: same probability functions used for spin glasses (statistical physics), error-correcting (LDPC) codes (information theory), and randomized algorithms (computer science) (Mézard)

Smart Networks (intelligent self-operating networks)
• Technologies: Blockchain; Deep Learning; UAV, HFT, RTB, IoT; Satellite, nanorobot
• Tools: Smart network field theory; Optimal control theory

Research Topics
• Define relationships between statistical physics and information theory: generalized temperature and Fisher information, partition functions and free energy, and Gibbs' inequality and entropy (Merhav)
• Apply complexity theory to blockchain and deep learning (dos Santos)
• Apply spin glass models to blockchain and deep learning (LeCun, Auffinger, Stein)
• Apply deep learning to particle physics (Radovic)

Data Science Method: Science Modules
Scientific Paradigms: Mechanics (16-17c), Steam (18-19c), Light and Electromagnetics (20c), Information (21c)
Quantum Computation: Computational Complexity, Black Holes, and Quantum Gravity (Aaronson, Susskind, Zenil)
6 May 2019
97
Deep Learning
Deep Learning Timeline

Source: F. Vazquez, https://towardsdatascience.com/a-weird-introduction-to-deep-learning-7828803693b0


6 May 2019
98
Deep Learning
What is a Neural Net?
 Structure: input-processing-output
 Mimic neuronal signal firing structure of brain with
computational processing units

Source: https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning,
6 May 2019 http://cs231n.github.io/convolutional-networks/
99
Deep Learning
Deep Learning vocabulary
What do these terms mean?

 Deep Learning, Machine Learning, Artificial Intelligence


 Perceptron, Artificial Neuron, Logit
 Deep Belief Net, Artificial Neural Net, Boltzmann Machine
 Google DeepDream, Google Brain, Google DeepMind
 Supervised and Unsupervised Learning
 Convolutional Neural Nets
 Recurrent NN & LSTM (Long Short Term Memory)
 Activation Function ReLU (Rectified Linear Unit)
 Deep Learning libraries and frameworks
 TensorFlow, Caffe, Theano, Torch, DL4J
 Backpropagation, gradient descent, loss function

6 May 2019
100
Deep Learning
