SDS Deep Learning For Spatial Application

This document provides an overview of deep learning and mathematical modeling techniques. It begins with an introduction to artificial intelligence concepts such as computer vision, natural language processing, and machine learning algorithms like random forests, logistic regression, and support vector machines. It then discusses mathematical modeling, including developing computational algorithms, performing test calculations, and communicating results. The remainder of the document focuses on neural networks: a history of the field beginning in the 1940s, early neuron models such as the McCulloch-Pitts neuron, the development of backpropagation, and types of learning such as supervised and unsupervised learning. It also discusses deep learning and why it is useful for learning representations from large amounts of data.

SDS Bootcamp

Deep Learning for Spatial Application
Marcelinus A.S. Adhiwibawa, S.P., M.Stat.
Statistician/Data Scientist/Research Assistant
MRCPP, Univ. Ma Chung Malang
Artificial Intelligence

[Diagram: nested fields of artificial intelligence]
• Artificial Intelligence: computer vision, natural language processing, speech recognition
• Machine Learning: random forest, logistic regression, linear regression, SVM
• Deep Learning: convolutional neural networks
Mathematical Modelling of
Neural Network
Mathematical Modeling
• Creates a mathematical representation of some
phenomenon to better understand it.
• Matches observation with symbolic
representation.
• Informs theory and explanation.

The success of a mathematical model depends on how easily it can be used, how accurately it predicts, and how well it explains the phenomenon being studied.
Mathematical Modeling and
the Scientific Method
How do we incorporate mathematical
modeling/computational science in the
scientific method?
Mathematical Modeling
Problem-Solving Steps
• Identify problem area
• Conduct background research
• State project goal
• Define relationships
• Develop mathematical model
• Develop computational algorithm
• Perform test calculations
• Interpret results
• Communicate results
Introduction to Mathematical Modeling
• Prerequisite: Linear algebra and calculus
• Goals
• Learn to build models
• Understand mathematics behind models
• Improve understanding of mathematical concepts
• Communicate mathematics
• Models may be mathematical equations,
spreadsheets, or computer simulations.
Linear Regression: Classic Approach
Experience (X) | Salary (Y)
-------------- | ----------
2              | 3
1              | 2.5
3              | 3.2
2              | 2.9
1              | 2.5
6              | 3.9
5              | 3.4
7              | 3.5
9              | 3.9
10             | 4

Y = b + m·X + error
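
As a quick sketch, the classic fit can be reproduced in base R with lm(), using the ten (X, Y) pairs from the table:

# Classic least-squares fit of Y = b + m*X + error in base R
experience <- c(2, 1, 3, 2, 1, 6, 5, 7, 9, 10)
salary     <- c(3, 2.5, 3.2, 2.9, 2.5, 3.9, 3.4, 3.5, 3.9, 4)
fit <- lm(salary ~ experience)
coef(fit)  # estimated intercept b and slope m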
Linear Regression: Optimization Approach

Y = Bias + W·X + error
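
Viewed as an optimization problem, the same line can be found by repeatedly nudging Bias and W downhill on the squared error. A minimal gradient descent sketch in base R (learning rate and iteration count are arbitrary choices):

# Gradient descent on mean squared error for Y = Bias + W*X
x <- c(2, 1, 3, 2, 1, 6, 5, 7, 9, 10)
y <- c(3, 2.5, 3.2, 2.9, 2.5, 3.9, 3.4, 3.5, 3.9, 4)
bias <- 0; w <- 0; lr <- 0.01
for (i in 1:10000) {
  err  <- (bias + w * x) - y           # residuals
  bias <- bias - lr * mean(err)        # gradient step for the bias
  w    <- w    - lr * mean(err * x)    # gradient step for the weight
}
c(bias, w)  # converges toward the lm() coefficients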
History of Artificial Neural Networks
The idea of neural networks began as a model of how neurons in the brain function. In 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts described it with a simple electrical circuit.
Donald Hebb took the idea further in his book, The Organization of Behavior (1949).
Two key concepts that are precursors of neural networks are:
• 'Threshold Logic': converting continuous inputs into discrete outputs
• 'Hebbian Learning': a model of learning based on neural plasticity, proposed by Donald Hebb in The Organization of Behavior, and often summarized by the phrase: "Cells that fire together, wire together."
Both were proposed in the 1940s. In the 1950s, researchers began trying to translate these networks into computational systems.
McCulloch-Pitts Neuron

• Boolean inputs
• Boolean output
• Can model logic functions such as OR, AND, and NOT
• Has no ability to learn, so the threshold b must be adjusted analytically to fit the desired output
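
A minimal sketch of a McCulloch-Pitts unit in base R: it sums its Boolean inputs and fires when the sum reaches the hand-set threshold b (the thresholds below are chosen analytically, since the unit cannot learn):

# McCulloch-Pitts neuron: fire (1) if the sum of Boolean inputs >= threshold b
mp_neuron <- function(inputs, b) as.integer(sum(inputs) >= b)

mp_neuron(c(1, 1), b = 2)  # AND(1, 1) = 1
mp_neuron(c(1, 0), b = 2)  # AND(1, 0) = 0
mp_neuron(c(1, 0), b = 1)  # OR(1, 0)  = 1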
In the 1950s, Frank Rosenblatt, a psychologist at Cornell, was trying to understand the comparatively simpler decision system present in the eye of a fly.
In an attempt to understand and quantify this process, he proposed the idea of the Perceptron in 1958, calling it the Mark I Perceptron.
It was a system with simple input-output relationships and weights, modeled on the McCulloch-Pitts neuron.
In 1969, Marvin Minsky and Seymour Papert published Perceptrons, a historic text that would change the course of artificial intelligence research for decades. In it, Minsky and Papert proved that a single perceptron, the grandfather of the computational units that make up modern neural networks, is incapable of learning the exclusive-or (XOR) function.
In 1986, Hinton, Rumelhart, and Williams published the paper "Learning representations by back-propagating errors", introducing the concepts of backpropagation and hidden layers; in that sense they can be said to have given birth to Multilayer Perceptrons (MLPs).
Backpropagation: a procedure for repeatedly adjusting the weights to minimize the difference between the actual and predicted outputs.
Hidden layers: stacked nodes of neurons between the input and output that allow a neural network to learn more complicated features (such as XOR logic).
ANNs from a Statistical Point of View
“There has been much publicity about the ability of artificial neural
networks to learn and generalize. In fact, the most commonly used
artificial neural networks, called multilayer perceptrons, are nothing
more than nonlinear regression and discriminant models that can be
implemented with standard statistical software.”

Sarle, W.S. 1994. Neural networks and statistical models. In Proc. of the Nineteenth Annual SAS Users Group International Conference.
Jargon
Variables are called features
Independent variables are called inputs
Dependent variables are called targets or training values
Predicted values are called outputs
Residuals are called errors
Estimation is called training, learning, adaptation, or self-organization
An estimation criterion is called an error function or cost function
Observations are called patterns or training pairs
Parameter estimates are called (synaptic) weights
Interactions are called higher-order neurons
Transformations are called functional links
Regression and discriminant analysis are called supervised learning or heteroassociation
Data reduction is called unsupervised learning, encoding, or autoassociation
Cluster analysis is called competitive learning or adaptive vector quantization
Interpolation and extrapolation are called generalization
To Explain or To Predict?
“Statistical modeling is a powerful tool for developing and testing
theories by way of causal explanation, prediction, and description. In
many disciplines there is near-exclusive use of statistical modeling
for causal explanation and the assumption that models with high
explanatory power are inherently of high predictive power. Conflation
between explanation and prediction is common, yet the distinction
must be understood for progressing scientific knowledge.”

Shmueli, G. 2010. To explain or to predict? Statistical Science, 25(3), 289-310.
Types of Learning
Supervised: learning with a labeled training set
Example: email classification with already-labeled emails

Unsupervised: discover patterns in unlabeled data
Example: cluster similar documents based on text

Reinforcement learning: learn to act based on feedback/reward
Example: learn to play Go; reward: win or lose
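
For a concrete flavor of the first two types, here is a minimal base R sketch using the built-in iris data (the choice of data and models is illustrative only):

# Supervised: learn a label from features (logistic regression)
data(iris)
fit <- glm(I(Species == "versicolor") ~ Sepal.Length + Sepal.Width,
           data = iris, family = binomial)

# Unsupervised: discover clusters without ever seeing the labels
km <- kmeans(iris[, 1:4], centers = 3)
table(km$cluster, iris$Species)  # compare found clusters to true species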

[Figure: example task types: classification, regression, clustering, anomaly detection, sequence labeling]
http://mbjoseph.github.io/2013/11/27/measure.html

What is a Neural Net?
• Structure: input-processing-output
• Mimics the neuronal signal-firing structure of the brain with computational processing units

Source: https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning,
http://cs231n.github.io/convolutional-networks/
Artificial Neural Network
• Weights
• Activation functions
• How do we train?

For a network with 3 inputs, 4 hidden neurons, and 2 outputs:
4 + 2 = 6 neurons (not counting inputs)
[3 x 4] + [4 x 2] = 20 weights
4 + 2 = 6 biases
26 learnable parameters
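
A sketch of the same 3-4-2 network in R, assuming the keras package with a TensorFlow backend is installed; count_params() should report the 26 learnable parameters counted above:

library(keras)
model <- keras_model_sequential() %>%
  layer_dense(units = 4, activation = "relu", input_shape = 3) %>%  # [3 x 4] weights + 4 biases
  layer_dense(units = 2, activation = "softmax")                    # [4 x 2] weights + 2 biases
count_params(model)  # 26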
What is Deep Learning (DL)?
• A machine learning subfield of learning representations of data.
• Exceptionally effective at learning patterns.
• Deep learning algorithms attempt to learn (multiple levels of) representation by using a hierarchy of multiple layers.
• If you provide the system tons of information, it begins to understand it and respond in useful ways.

https://www.xenonstack.com/blog/static/public/uploads/media/machine-learning-vs-deep-learning.png
Conceptual definition:
Deep learning is a computer program that can identify what something is.

Technical definition:
Deep learning is a class of machine learning algorithms in the form of a neural network that uses a cascade of layers (tiers) of processing units to extract features from data and make predictive guesses about new data.

Source: extending Yann LeCun, http://spectrum.ieee.org/automaton/robotics/artificial-intelligence/facebook-ai-director-yann-lecun-on-deep-learning
Why is DL useful?
o Manually designed features are often over-specified, incomplete and
take a long time to design and validate
o Learned Features are easy to adapt, fast to learn
o Deep learning provides a very flexible, (almost?) universal, learnable
framework for representing world, visual and linguistic information.
o Can learn both unsupervised and supervised
o Effective end-to-end joint system learning
o Utilize large amounts of training data

In ~2010, DL started outperforming other ML techniques, first in speech and vision, then NLP.
Why is it called Deep Learning?
• Deep: hidden layers (cascading tiers) of processing
• "Deep" networks (3+ layers) versus "shallow" (1-2 layers)
• Learning: algorithms "learn" from data by modeling features and updating the probability weights assigned to feature nodes, testing how relevant specific features are in determining the general type of item

In short: Deep = hidden processing layers; Learning = updating probability weights for feature importance.
2 main kinds of Deep Learning neural nets
• Convolutional Neural Nets
  • Image recognition
  • Convolve: roll up to higher levels of abstraction in feature sets
• Recurrent Neural Nets
  • Speech, text, and audio recognition
  • Recur: iterate over sequential inputs with a memory function
  • LSTM (Long Short-Term Memory) remembers sequences and avoids the vanishing gradient problem
Source: Yann LeCun, CVPR 2015 keynote (Computer Vision ), "What's wrong with Deep Learning" http://t.co/nPFlPZzMEJ
Convolutional Neural Networks (CNNs)
A Convolutional Neural Network (CNN) is one of the more recent developments in artificial neural networks. It is inspired by the neural networks of the human brain and is commonly used on image data to detect and recognize objects in an image. A CNN consists of neurons that have weights, biases, and activation functions.

[Figure: convolution of an input matrix with a 3x3 filter]
http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
Convolutional Layer
The convolutional layer performs convolution operations using linear filters on local areas. It is the first stage that receives an image fed into the deep learning architecture. The input is filtered with a filter matrix of a certain length (pixels), width (pixels), and dimension matching the image channels/bands of the data. These filters shift across the image according to the stride parameter, moving from left to right and from top to bottom of the image matrix. At each position, the shift performs a "dot" operation between the input and the filter values, producing an output called the activation map or feature map. Figure 4 shows the convolution process in the convolution layer and Figure 5 shows how the convolution value is calculated.
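
The sliding "dot" operation can be spelled out in a few lines of base R. A minimal sketch with made-up input and filter values, stride 1, and no padding:

# Valid 2D convolution: slide the filter and take the elementwise "dot" sum
conv2d <- function(img, filt, stride = 1) {
  f <- nrow(filt)
  out_rows <- (nrow(img) - f) %/% stride + 1
  out_cols <- (ncol(img) - f) %/% stride + 1
  out <- matrix(0, out_rows, out_cols)
  for (i in seq_len(out_rows)) {
    for (j in seq_len(out_cols)) {
      r <- (i - 1) * stride + 1
      c <- (j - 1) * stride + 1
      out[i, j] <- sum(img[r:(r + f - 1), c:(c + f - 1)] * filt)
    }
  }
  out  # the activation map / feature map
}

img  <- matrix(sample(0:1, 25, replace = TRUE), 5, 5)  # toy 5x5 image
filt <- matrix(c(1, 0, 1, 0, 1, 0, 1, 0, 1), 3, 3)     # toy 3x3 filter
conv2d(img, filt)                                      # 3x3 feature map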
Pooling Layer
Main CNN idea for text: compute vectors for n-grams and group them afterwards.

[Animation: max pooling with 2x2 filters and stride 2]
https://shafeentejani.github.io/assets/images/pooling.gif
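
A minimal base R sketch of the 2x2, stride-2 max pooling shown in the animation:

# Max pooling: keep the largest value in each window
max_pool <- function(fmap, size = 2, stride = 2) {
  out_rows <- (nrow(fmap) - size) %/% stride + 1
  out_cols <- (ncol(fmap) - size) %/% stride + 1
  out <- matrix(0, out_rows, out_cols)
  for (i in seq_len(out_rows)) {
    for (j in seq_len(out_cols)) {
      r <- (i - 1) * stride + 1
      c <- (j - 1) * stride + 1
      out[i, j] <- max(fmap[r:(r + size - 1), c:(c + size - 1)])
    }
  }
  out
}

max_pool(matrix(1:16, 4, 4))  # 4x4 feature map -> 2x2 pooled output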
Fully Connected Layer
• The fully connected layer takes the feature map output by the pooling layer as its input. The feature map is still a multidimensional array, so the network reshapes (flattens) it and generates an n-dimensional vector, where n is the number of output classes the program must choose from. For example, if the layer consists of 500 neurons, softmax is applied to return the probabilities for each of the 10 class labels as the final classification of the network. Figure 7 shows the process in the fully connected layer.
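
A sketch of the flatten-dense-softmax sequence in R keras, following the 500-neuron, 10-class example above (the 7x7x64 input feature-map shape is an arbitrary assumption):

library(keras)
head_of_network <- keras_model_sequential() %>%
  layer_flatten(input_shape = c(7, 7, 64)) %>%       # reshape feature map into a vector
  layer_dense(units = 500, activation = "relu") %>%  # fully connected layer
  layer_dense(units = 10, activation = "softmax")    # probabilities for the 10 classes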
CNN
Architecture
Evaluation metrics of deep learning training

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where
TP = True Positives,
TN = True Negatives,
FP = False Positives, and
FN = False Negatives.
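
In base R, the same accuracy can be computed from a confusion matrix (the labels and predictions below are made up):

truth <- factor(c(1, 1, 0, 0, 1, 0, 1, 0))
pred  <- factor(c(1, 0, 0, 0, 1, 1, 1, 0))
cm <- table(Predicted = pred, Actual = truth)
sum(diag(cm)) / sum(cm)  # (TP + TN) / (TP + TN + FP + FN) = 0.75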
How does the neural net actually learn?
• Structural system based on cascading layers of neurons with variable parameters: weight and bias
• Vary the weights and biases to see if a better outcome is obtained
• Repeat until the net correctly classifies the data

Source: http://neuralnetworksanddeeplearning.com/chap2.html
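
Concretely, the vary-and-repeat loop is gradient descent with backpropagation. A minimal base R sketch on the XOR problem from the history section, with a 2-4-1 sigmoid network (initialization, learning rate, and iteration count are arbitrary choices):

set.seed(42)
sigmoid <- function(z) 1 / (1 + exp(-z))
X <- matrix(c(0, 0, 0, 1, 1, 0, 1, 1), ncol = 2, byrow = TRUE)
y <- c(0, 1, 1, 0)                       # XOR targets
W1 <- matrix(rnorm(8, sd = 0.5), 2, 4)   # input -> hidden weights
b1 <- rep(0, 4)                          # hidden biases
W2 <- matrix(rnorm(4, sd = 0.5), 4, 1)   # hidden -> output weights
b2 <- 0
lr <- 0.5
for (step in 1:20000) {
  # forward pass
  H   <- sigmoid(sweep(X %*% W1, 2, b1, "+"))
  out <- sigmoid(as.vector(H %*% W2) + b2)
  # backward pass: gradients of the squared error
  d_out <- (out - y) * out * (1 - out)
  d_H   <- outer(d_out, as.vector(W2)) * H * (1 - H)
  # vary the weights and biases in the downhill direction
  W2 <- W2 - lr * t(H) %*% d_out
  b2 <- b2 - lr * sum(d_out)
  W1 <- W1 - lr * t(X) %*% d_H
  b1 <- b1 - lr * colSums(d_H)
}
round(out, 2)  # should approach the targets 0, 1, 1, 0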
Deep Learning frameworks and libraries

Source: http://www.infoworld.com/article/3163525/analytics/review-the-best-frameworks-for-machine-learning-and-deep-
learning.html#tk.ifw-ifwsb
Hardware
• Advances in chip design
• GPU chips (graphics processing unit): 3D graphics cards designed to do fast matrix multiplication
• Google TPU chip (tensor processing unit): custom ASICs for machine learning, used in AlphaGo
  • TPUs process matrix multiplications without storing intermediate values in memory
  [Image: Google TPU chip (Tensor Processing Unit), 2016]
• NVIDIA DGX-1 integrated deep learning system
  • Eight Tesla P100 GPU accelerators
  [Image: NVIDIA DGX-1 Deep Learning System]
Source: http://www.techradar.com/news/computing-components/processors/google-s-tensor-processing-unit-explained-this-is-what-
the-future-of-computing-looks-like-1326915
USB-based Machine Learning
• Intel Movidius Visual Processing Unit (VPU): USB ML for IoT
• Security cameras, industrial equipment, robots, drones
How big are Deep Learning neural nets?
• Google Deep Brain cat recognition, 2011
  • 1 billion connections, 10 million images (200x200 pixel), 1,000 machines (16,000 cores), 3 days; each instantiation of the network spanned 170 servers, with 20,000 object categories
• State of the art, 2016-2017
  • NVIDIA facial recognition: 100 million images, 10 layers, 1 bn parameters, 30 exaflops, 30 GPU-days
  • Google: 11.2-billion parameter system
  • Lawrence Livermore Lab: 15-billion parameter system
  • Digital Reasoning, cognitive computing (Nashville, TN): 160 billion parameters, trained on three multi-core computers overnight

Source: https://futurism.com/biggest-neural-network-ever-pushes-ai-deep-learning; Digital Reasoning paper: https://arxiv.org/pdf/1506.02338v3.pdf
What is TensorFlow?
• Originally developed by researchers and engineers working on the Google Brain
Team for the purposes of conducting machine learning and deep neural
networks research.
• Open source software (Apache v2.0 license)
• Hardware independent
• CPU (via Eigen and BLAS)
• GPU (via CUDA and cuDNN)
• TPU (Tensor Processing Unit)
• Supports automatic differentiation
• Distributed execution and large datasets
What is a tensor?
• Spoiler alert: it's an array

Tensor dimensionality | R object class       | Example
--------------------- | -------------------- | ----------------
0                     | Vector of length one | Point value
1                     | Vector               | Weights
2                     | Matrix               | Time series
3                     | Array                | Grey-scale image
4                     | Array                | Colour images
5                     | Array                | Video

Note that the first dimension is always used for the observations, thus "adding" a dimension.
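
A minimal base R sketch of those dimensionalities (shapes are illustrative):

x0 <- 3.14                                # 0D: a point value (length-one vector)
x1 <- c(0.2, -1.5, 0.7)                   # 1D: e.g. a vector of weights
x2 <- matrix(rnorm(12), 3, 4)             # 2D: e.g. 3 series x 4 time steps
x3 <- array(0, dim = c(10, 28, 28))       # 3D: 10 grey-scale 28x28 images
x4 <- array(0, dim = c(10, 28, 28, 3))    # 4D: 10 colour images (RGB channels)
dim(x4)  # the first dimension indexes the observations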
Uses of TensorFlow
• Image classification
• Time series forecasting
• Classifying peptides for cancer immunotherapy
• Credit card fraud detection using an autoencoder
• Classifying duplicate questions from Quora
• Predicting customer churn
• Learning word embeddings for Amazon reviews

https://tensorflow.rstudio.com/gallery/
What are layers?
• Data transformation functions parameterized by weights
• A layer is a geometric transformation function on the data that goes
through it (transformations must be differentiable for stochastic gradient
descent)
• Weights determine the data transformation behavior of a layer
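
That idea fits in a few lines of base R: a dense layer is just a weight matrix, a bias vector, and a differentiable nonlinearity (all values below are made up):

relu <- function(z) pmax(z, 0)
dense_layer <- function(x, W, b) relu(sweep(x %*% W, 2, b, "+"))
x <- matrix(rnorm(6), 2, 3)   # 2 observations, 3 input features
W <- matrix(rnorm(12), 3, 4)  # weights: 3 inputs -> 4 units
b <- rep(0, 4)                # biases
dense_layer(x, W, b)          # 2x4 transformed output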
R examples in the gallery
• https://tensorflow.rstudio.com/gallery/

• Image classification on small datasets


• Time series forecasting with recurrent networks
• Deep learning for cancer immunotherapy
• Credit card fraud detection using an autoencoder
• Classifying duplicate questions from Quora
• Deep learning to predict customer churn
• Learning word embeddings for Amazon reviews
• Work on explainability of predictions
Recommended reading
• Chollet and Allaire, Deep Learning with R
• Goodfellow, Bengio & Courville, Deep Learning
Case Study
Landslide Hazard Potential Assessment Using ML

X variables
B1="slope.tif"
B2="ndvi.tif"
B3="landcover.tif"
B4="elevation.tif"
B5="curvature.tif"
Y variable
Landslide occurrence
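
A sketch of how the X and Y rasters might be assembled in R with the terra package (the package choice is an assumption, "landslides.tif" is a hypothetical file name for the Y variable, and all layers must share the same grid):

library(terra)
preds <- rast(c("slope.tif", "ndvi.tif", "landcover.tif",
                "elevation.tif", "curvature.tif"))        # X variables B1-B5
names(preds) <- c("slope", "ndvi", "landcover", "elevation", "curvature")
y <- rast("landslides.tif")                 # Y: landslide occurrence (0/1)
d <- na.omit(as.data.frame(c(preds, y)))    # one row of X and Y per pixel
head(d)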
