
Predictive Modeling BI 4

Uploaded by

ikher.shivin

Techniques for Predictive Modeling: Learning Objectives

• Understand the concept and definitions of artificial
neural networks (ANN)
• Learn the different types of ANN architectures
• Know how learning happens in ANN
• Become familiar with ANN applications
• Understand the sensitivity analysis in ANN
• Understand the concept and structure of support
vector machines (SVM)
• Comparison

Application: Predictive Modeling Helps Better Understand and Manage Complex
Medical Procedures, …
A Process Map for Training and Testing Four Predictive Models
The Comparison of Four Models: An Example
Neural Network Concepts
• Neural networks (NN): a brain metaphor for
information processing
• Neural computing
• Artificial neural network (ANN)
• Many uses for ANN
– pattern recognition, forecasting, prediction, and
classification
• Many application areas
– finance, marketing, manufacturing, operations,
information systems, and so on
Application Case 6.1
Neural Networks Are Helping to Save Lives in the Mining Industry

Questions for Discussion
1. How did neural networks help save lives in the mining
industry?
2. What were the challenges, the proposed solution, and the
obtained results?
Elements of ANN

• Processing element (PE)
• Network architecture
– Hidden layers
– Parallel processing
• Network information processing
– Inputs
– Outputs
– Connection weights
– Summation function
Elements of ANN

Figure: Neural Network with One Hidden Layer. Inputs x1, x2, x3 enter the input layer; the processing elements (PE) of the hidden layer and output layer each apply a weighted sum (S) followed by a transfer function (f) to produce the output Y1.
Elements of ANN

Figure: Summation Function for a Single Neuron (a), and Several Neurons (b)
PE: Processing Element (or neuron)

(a) Single neuron, inputs x1 and x2 with weights w1 and w2:
Y = X1W1 + X2W2

(b) Multiple neurons, inputs x1 and x2 with weights w11, w12, w21, w22, w23:
Y1 = X1W11 + X2W21
Y2 = X1W12 + X2W22
Y3 = X2W23
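The summation functions for a single neuron and for several neurons can be sketched in a few lines of Python. This is only an illustration: the input and weight values are made up, and the helper names `weighted_sum` and `layer_sums` are hypothetical.

```python
# Weighted-sum (summation) function for one or several processing elements.
# All numeric values below are made up for illustration.

def weighted_sum(inputs, weights):
    """Return X1*W1 + X2*W2 + ... for a single neuron."""
    return sum(x * w for x, w in zip(inputs, weights))

def layer_sums(inputs, weight_matrix):
    """weight_matrix[i][j] is the weight from input i to neuron j (Wij)."""
    n_neurons = len(weight_matrix[0])
    return [
        sum(inputs[i] * weight_matrix[i][j] for i in range(len(inputs)))
        for j in range(n_neurons)
    ]

x = [1.0, 2.0]                 # X1, X2
# Rows are inputs, columns are neurons.
# W13 is 0 because x1 does not feed the third neuron (Y3 = X2W23).
W = [[0.5, 0.1, 0.0],          # W11, W12, W13
     [0.2, 0.3, 0.4]]          # W21, W22, W23

y_single = weighted_sum(x, [0.5, 0.2])   # case (a)
y1, y2, y3 = layer_sums(x, W)            # case (b)
print(y_single, y1, y2, y3)
```

Storing the weights as a matrix keeps case (b) a direct generalization of case (a): each column is one neuron's weight vector.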
Elements of ANN
• Transformation (Transfer) Function
– Linear function
– Sigmoid (logical activation) function [0, 1]
– Tangent Hyperbolic function [-1, 1]

Example for one processing element (PE):
– Inputs: X1 = 3, X2 = 1, X3 = 2
– Weights: W1 = 0.2, W2 = 0.4, W3 = 0.1
– Summation function: Y = 3(0.2) + 1(0.4) + 2(0.1) = 1.2
– Transfer function: YT = 1/(1 + e^(-1.2)) = 0.77
❖ Threshold value?
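The worked example on this slide can be reproduced with a short Python sketch. The inputs and weights come from the slide; the threshold of 0.5 is an assumption added only to illustrate the "threshold value" question.

```python
import math

def sigmoid(y):
    """Sigmoid transfer function; output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-y))

inputs = [3, 1, 2]           # X1, X2, X3 from the slide
weights = [0.2, 0.4, 0.1]    # W1, W2, W3 from the slide

# Summation function: Y = 3(0.2) + 1(0.4) + 2(0.1)
y = sum(x * w for x, w in zip(inputs, weights))

# Transfer functions applied to the same weighted sum
yt = sigmoid(y)              # range [0, 1]
yt_tanh = math.tanh(y)       # hyperbolic tangent, range [-1, 1]

# Hypothetical threshold (not given on the slide)
threshold = 0.5
fires = yt >= threshold

print(round(y, 2), round(yt, 2), fires)   # 1.2 0.77 True
```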
Neural Network Architectures
• Architecture of a neural network is driven by the
task it is intended to address
– Classification, regression, clustering, general
optimization, association, ….
• Most popular architecture: Feedforward, multilayered
perceptron with backpropagation learning algorithm
– Used for both classification and regression type
problems
• Others – Recurrent, self-organizing feature maps,
Hopfield networks, …
Neural Network Architectures
Feed-Forward Neural Networks

Figure: Feed-forward MLP with 1 Hidden Layer. Input layer: socio-demographic, religious, financial, and other variables. Hidden layer. Output layer: voted “yes” or “no” to legalizing gaming (predicted vs. actual).
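A forward pass through a feed-forward network with one hidden layer can be sketched as follows. The weights and input values are made up, bias terms are omitted for brevity, and the function name `forward` is hypothetical; in practice the weights would be learned with backpropagation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_output):
    """One forward pass: input layer -> hidden layer -> single output.

    w_hidden holds one weight list per hidden neuron;
    w_output holds one weight per hidden neuron.
    """
    hidden = [sigmoid(sum(xi * w for xi, w in zip(x, neuron)))
              for neuron in w_hidden]
    return sigmoid(sum(h * w for h, w in zip(hidden, w_output)))

# Four inputs scaled to [0, 1] (hypothetical values standing in for the
# socio-demographic, religious, financial, and other variables).
x = [0.3, 0.8, 0.5, 0.1]
w_hidden = [[0.4, -0.6, 0.2, 0.1],    # hidden neuron 1
            [-0.3, 0.5, 0.7, -0.2]]   # hidden neuron 2
w_output = [1.2, -0.8]

p_yes = forward(x, w_hidden, w_output)   # read as the score for a "yes" vote
print(round(p_yes, 3))
```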
Neural Network Architectures
Recurrent Neural Networks
Testing a Trained ANN Model
• Data is split into three parts
– Training (~60%)
– Validation (~20%)
– Testing (~20%)
• k-fold cross validation
– Less bias
– Time consuming
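Both splitting schemes can be sketched with the standard library alone. The function names, the fixed random seed, and the interleaved fold assignment are illustrative assumptions, not something the slide prescribes.

```python
import random

def three_way_split(n, train=0.6, valid=0.2, seed=0):
    """Shuffle record indices 0..n-1 and cut them into three parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_valid = round(n * valid)
    return (idx[:n_train],
            idx[n_train:n_train + n_valid],
            idx[n_train + n_valid:])

def k_fold(n, k):
    """Yield (train_idx, test_idx) pairs; each record is tested exactly once."""
    idx = list(range(n))
    for fold in range(k):
        test = idx[fold::k]            # every k-th record, starting at `fold`
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test

train_idx, valid_idx, test_idx = three_way_split(100)
folds = list(k_fold(10, 5))
print(len(train_idx), len(valid_idx), len(test_idx), len(folds))
```

k-fold cross validation trains the model k times, which is why it is listed as time consuming; in exchange, every record is used for testing once, which reduces the bias of a single lucky or unlucky split.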
ANN Learning Process
A Supervised Learning Process
Three-step process:
1. Compute temporary outputs.
2. Compare outputs with desired targets.
3. Adjust the weights and repeat the process.

Figure: Flowchart of the learning loop. The ANN model computes the output; if the desired output is not achieved, the weights are adjusted and the output is computed again; if it is achieved, learning stops.
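The three-step loop can be illustrated with a single sigmoid neuron trained on the logical OR function. This is a minimal sketch under stated assumptions: the learning rate, stopping tolerance, epoch limit, and simple error-times-input update rule are illustrative choices, and a multilayer network would use backpropagation instead.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, targets, rate=0.5, tolerance=0.2, max_epochs=5000):
    """Train one neuron; return (weights, bias, epochs_used)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        worst = 0.0
        for x, target in zip(samples, targets):
            # Step 1: compute a temporary output.
            output = sigmoid(sum(xi * wi for xi, wi in zip(x, weights)) + bias)
            # Step 2: compare the output with the desired target.
            error = target - output
            worst = max(worst, abs(error))
            # Step 3: adjust the weights in proportion to the error.
            weights = [wi + rate * error * xi for xi, wi in zip(x, weights)]
            bias += rate * error
        # "Is desired output achieved?" If yes, stop learning.
        if worst < tolerance:
            return weights, bias, epoch
    return weights, bias, max_epochs

samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 1]                      # logical OR
weights, bias, epochs_used = train_neuron(samples, targets)
predictions = [
    round(sigmoid(sum(xi * wi for xi, wi in zip(x, weights)) + bias))
    for x in samples
]
print(predictions, epochs_used)
```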
Support Vector Machines (SVM)

• SVM are among the most popular machine-learning techniques.
• SVM belong to the family of generalized linear models… (capable of
representing non-linear relationships in a linear fashion).
• SVM achieve a classification or regression decision based on the value of
the linear combination of input features.
• Because of their architectural similarities, SVM are also closely
associated with ANN.
Support Vector Machines (SVM)

• Goal of SVM: to generate mathematical functions that map input
variables to desired outputs for classification or regression type
prediction problems.
– First, SVM uses nonlinear kernel functions to transform non-linear relationships
among the variables into linearly separable feature spaces.
– Then, the maximum-margin hyperplanes are constructed to optimally separate
different classes from each other based on the training dataset.
• SVM has a solid mathematical foundation.
Support Vector Machines (SVM)

• A hyperplane is a geometric concept used to describe the separation
surface between different classes of things.
– In SVM, two parallel hyperplanes are constructed on each side of the separation
space with the aim of maximizing the distance between them.
• A kernel function in SVM uses the kernel trick (a method for using a
linear classifier algorithm to solve a nonlinear problem)
– The most commonly used kernel function is the radial basis function (RBF).
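A small Python sketch may help make the kernel trick concrete. It uses a degree-2 polynomial kernel, chosen here only because its feature map is short enough to write out, to show that a kernel value equals a dot product in a higher-dimensional feature space. It then defines the RBF kernel in its usual form exp(-gamma * ||a - b||^2). The function names and the value gamma = 0.5 are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def poly2_kernel(a, b):
    """Degree-2 polynomial kernel on 2-D inputs: (a . b) ** 2."""
    return dot(a, b) ** 2

def poly2_features(v):
    """Explicit 3-D feature map whose dot product equals poly2_kernel."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

a, b = (1.0, 2.0), (3.0, 4.0)
implicit = poly2_kernel(a, b)                              # computed in 2-D
explicit = dot(poly2_features(a), poly2_features(b))       # computed in 3-D
print(implicit, round(explicit, 6))
print(rbf_kernel((0.0, 0.0), (1.0, 1.0)))                  # equals exp(-1)
```

The kernel gives the same number as the explicit mapping without ever building the higher-dimensional vectors, which is what lets a linear classifier handle a nonlinear problem cheaply.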
Support Vector Machines (SVM)

Figure: Two panels over axes X1 and X2 showing candidate separating lines L1, L2, and L3, and the maximum-margin hyperplane with its margin.

➢ Many linear classifiers (hyperplanes) may separate the data


Application Case 6.4
Managing Student Retention with Predictive Modeling
Questions for Discussion
1. Why is attrition one of the most important issues in
higher education?
2. How can predictive analytics (ANN, SVM, and so forth)
be used to better manage student retention?
3. What are the main challenges and potential solutions to
the use of analytics in retention management?
How Does an SVM Work?

• Following a machine-learning process, an SVM learns from the historic
cases.
• The Process of Building SVM
1. Preprocess the data
• Scrub and transform the data.
2. Develop the model.
• Select the kernel type (RBF is often a natural choice).
• Determine the kernel parameters for the selected kernel type.
• If the results are satisfactory, finalize the model; otherwise change the kernel type and/or kernel
parameters to achieve the desired accuracy level.
3. Extract and deploy the model.
The Process of Building an SVM

Figure: Three-stage process that turns training data into a prediction model.

Pre-Process the Data (input: training data; output: pre-processed data)
✓ Scrub the data: “Identify and handle missing, incorrect, and noisy”
✓ Transform the data: “Numerisize, normalize and standardize the data”

Develop the Model (experimentation: “Training/Testing”; output: validated SVM model)
✓ Select the kernel type: “Choose from RBF, Sigmoid or Polynomial kernel types”
✓ Determine the kernel values: “Use v-fold cross validation or employ ‘grid-search’”

Deploy the Model (output: prediction model)
✓ Extract the model coefficients
✓ Code the trained model into the decision support system
✓ Monitor and maintain the model
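The pre-processing stage can be sketched with a few helper functions. These are illustrative assumptions rather than the method behind the slide: missing values are filled with the column mean, nominal values are turned into 0/1 indicator columns, and the helper names are hypothetical.

```python
import statistics

def impute_mean(column):
    """Scrub: replace missing values (None) with the mean of the known ones."""
    known = [v for v in column if v is not None]
    mean = statistics.fmean(known)
    return [mean if v is None else v for v in column]

def one_hot(column):
    """Numerisize: turn nominal values into 0/1 indicator columns."""
    levels = sorted(set(column))
    return [[1 if v == level else 0 for level in levels] for v in column]

def min_max(column):
    """Normalize to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:                       # constant column: nothing to scale
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def z_score(column):
    """Standardize to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(v - mean) / sd for v in column]

ages = impute_mean([20, None, 40, 60])     # made-up sample column
normalized = min_max(ages)
standardized = z_score(ages)
encoded = one_hot(["A", "B", "A"])
print(ages, normalized, encoded)
```

Scaling matters for SVMs in particular because kernel values such as the RBF depend on distances, so a variable with a large numeric range would otherwise dominate.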
SVM Applications
• SVMs are the most widely used kernel-learning
algorithms for a wide range of classification and
regression problems
• SVMs represent the state-of-the-art by virtue of their
excellent generalization performance, superior
prediction power, ease of use, and rigorous theoretical
foundation
• Most comparative studies show their superiority in both
regression and classification type prediction problems.
• SVM versus ANN?
k-Nearest Neighbor Method (k-NN)

• ANNs and SVMs rely on time-demanding, computationally intensive iterative
derivations
• k-NN is a simple and logical prediction method that produces very
competitive results
• k-NN is a prediction method for classification as well as regression types
(similar to ANN & SVM)
• k-NN is a type of instance-based learning – most of the work takes place
at the time of prediction (not at modeling)
• k : the number of neighbors used
k-Nearest Neighbor Method (k-NN)

Figure: Scatter plot over axes X and Y with a new case at (Xi, Yi) and the neighborhoods for k = 3 and k = 5.

The answer depends on the value of k
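The effect of k can be reproduced with a toy example. The data points, the labels "circle" and "square", and the function names are made up for illustration; Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter

def k_nearest(train, query, k):
    """Return the k training cases closest to the query (Euclidean distance)."""
    return sorted(train, key=lambda row: math.dist(row[0], query))[:k]

def knn_classify(train, query, k):
    """Classification: majority vote among the k nearest cases."""
    votes = Counter(label for _, label in k_nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k):
    """Regression: average target value of the k nearest cases."""
    neighbors = k_nearest(train, query, k)
    return sum(value for _, value in neighbors) / len(neighbors)

cases = [((1.0, 0.0), "circle"), ((0.0, 1.2), "circle"),
         ((1.5, 0.0), "square"), ((0.0, -2.0), "square"),
         ((-2.2, 0.0), "square")]
query = (0.0, 0.0)

label_k3 = knn_classify(cases, query, 3)   # 2 circles, 1 square -> "circle"
label_k5 = knn_classify(cases, query, 5)   # 2 circles, 3 squares -> "square"

priced = [((1.0,), 10.0), ((2.0,), 20.0), ((10.0,), 100.0)]
estimate = knn_regress(priced, (1.4,), 2)  # mean of 10.0 and 20.0
print(label_k3, label_k5, estimate)
```

No model is fitted in advance: all of the work happens inside `k_nearest` at prediction time, which is what instance-based learning means.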
The Process of k-NN Method

Figure: Historic data is split into a training set and a validation set, which feed parameter setting; the tuned method is then applied to new data.

Parameter Setting
✓ Distance measure
✓ Value of “k”

Predicting
Classify (or forecast) new cases using k number of most similar cases
k-NN Model Parameter
1. Similarity Measure: The Distance Metric
– Numeric versus nominal values?
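A few common choices can be sketched as follows: Euclidean and Manhattan distance for numeric values, and a simple mismatch count for nominal values. The mixed measure at the end is an illustrative assumption (numeric values are presumed already scaled to [0, 1]), not a measure named on the slide.

```python
import math

def euclidean(a, b):
    """Straight-line distance for numeric vectors."""
    return math.dist(a, b)

def manhattan(a, b):
    """Sum of absolute differences for numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def overlap(a, b):
    """Nominal values: count the positions where the two cases differ."""
    return sum(x != y for x, y in zip(a, b))

def mixed_distance(a, b, nominal):
    """Hypothetical mixed measure: mismatch (0 or 1) at nominal positions,
    absolute difference at numeric positions (assumed scaled to [0, 1])."""
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in nominal:
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y)
    return total

print(euclidean((0, 0), (3, 4)))                        # 5.0
print(manhattan((0, 0), (3, 4)))                        # 7
print(overlap(("red", "suv"), ("red", "sedan")))        # 1
print(mixed_distance((0.2, "red"), (0.5, "blue"), nominal={1}))
```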


k-NN Model Parameter
2. Number of Neighbors (the value of k)
– The best value depends on the data
– Larger values reduce the effect of noise but also
make boundaries between classes less distinct
– An “optimal” value can be found heuristically
• Cross Validation is often used to determine the best
value for k and the distance measure
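One way to pick k is leave-one-out cross validation, sketched here on a made-up one-dimensional dataset with a single noisy case (the "B" at 2.6 inside the "A" cluster). The data, the candidate values of k, and the function names are illustrative assumptions.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loocv_accuracy(data, k):
    """Leave-one-out: predict each case from all the other cases."""
    hits = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_classify(rest, point, k) == label
    return hits / len(data)

data = [((1.0,), "A"), ((2.0,), "A"), ((3.0,), "A"), ((4.0,), "A"),
        ((2.6,), "B"),                                   # noisy case
        ((6.5,), "B"), ((7.5,), "B"), ((8.5,), "B"), ((9.5,), "B")]

accuracy = {k: loocv_accuracy(data, k) for k in (1, 3, 5, 7)}
best_k = max(accuracy, key=accuracy.get)   # first k with the highest accuracy
print(accuracy, best_k)
```

On this toy data k = 1 is misled by the noisy case, k = 3 and k = 5 smooth it out, and k = 7 is large enough to blur the boundary between the two classes, so the search settles on k = 3.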
Application Case 6.5
Efficient Image Recognition and Categorization with k-NN

Questions for Discussion
1. Why is image recognition/classification a worthy
but difficult problem?
2. How can k-NN be effectively used for image
recognition/classification applications?
