AI notes Week 11
Machine Learning
CS-412
Week-11-Fall 2024
Why “Learn” ?
■ Machine learning is programming computers to optimize
a performance criterion using example data or past
experience.
■ There is no need to “learn” to calculate payroll
■ Learning is used when:
◻ Human expertise does not exist (navigating on Mars),
◻ Humans are unable to explain their expertise (speech
recognition)
◻ Solution changes in time (routing on a computer network)
◻ Solution needs to be adapted to particular cases (user
biometrics)
2
What We Talk About When We
Talk About “Learning”
■ Learning general models from data of particular
examples
■ Data is cheap and abundant (data warehouses, data
marts); knowledge is expensive and scarce.
■ Example in retail: Customer transactions to consumer
behavior:
People who bought “product X” also bought “product Y”
(www.amazon.com)
■ Build a model that is a good and useful approximation to
the data.
3
Data Mining
■ Retail: Market basket analysis, Customer relationship
management (CRM)
■ Finance: Credit scoring, fraud detection
■ Manufacturing: Optimization, troubleshooting
■ Medicine: Medical diagnosis
■ Telecommunications: Quality of service optimization
■ Bioinformatics: Motifs (protein sequence patterns),
alignment
■ Web mining: Search engines
■ ...
4
What is Machine Learning?
■ Optimize a performance criterion using example data or
past experience.
■ Role of Statistics: Inference from a sample
■ Role of Computer science: Efficient algorithms to
◻ Solve the optimization problem
◻ Represent and evaluate the model for inference
5
Applications
■ Association
■ Supervised Learning
◻ Classification
◻ Regression
■ Unsupervised Learning
■ Reinforcement Learning
6
Learning Associations
■ Basket analysis:
P (Y | X ) probability that somebody who buys X also
buys Y where X and Y are products/services.
7
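As a concrete illustration, P(Y | X) can be estimated from transaction data as the fraction of baskets containing X that also contain Y. A minimal Python sketch; the products and transactions are made up for illustration:

```python
# Hypothetical toy transactions; in practice these come from sales records.
transactions = [
    {"chips", "beer"},
    {"chips", "soda"},
    {"chips", "beer", "soda"},
    {"soda"},
]

def confidence(x, y, transactions):
    """Estimate P(Y | X): fraction of X-containing baskets that also contain Y."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0
    return sum(1 for t in with_x if y in t) / len(with_x)

print(confidence("chips", "beer", transactions))  # 2 of the 3 chips baskets contain beer
```

In association-rule terminology this estimate is the confidence of the rule X → Y.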
Classification
■ Example: Credit
scoring
■ Differentiating
between low-risk
and high-risk
customers from their
income and savings
8
Classification: Applications
■ Aka Pattern recognition
■ Face recognition: Pose, lighting, occlusion (glasses,
beard), make-up, hair style
■ Character recognition: Different handwriting styles.
■ Speech recognition: Temporal dependency.
◻ Use of a dictionary or the syntax of the language.
◻ Sensor fusion: Combine multiple modalities; eg, visual (lip
image) and acoustic for speech
■ Medical diagnosis: From symptoms to illnesses
■ ...
9
Face Recognition
Training examples of a person
Test images
10
Regression
■ Example: Price of a used car
■ x : car attributes
y : price
■ Linear model: y = wx + w0
■ In general: y = g(x | θ), where
g ( ) is the model and
θ its parameters
11
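The linear model y = wx + w0 can be fit by ordinary least squares. A minimal sketch with made-up car data (x = age in years, y = price in $1000s; the numbers are illustrative, not real):

```python
# Hypothetical car data: x = age in years, y = price in $1000s.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [19.0, 17.0, 15.0, 13.0, 11.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Closed-form least-squares solution: w = cov(x, y) / var(x), w0 = mean_y - w * mean_x
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
w0 = mean_y - w * mean_x
print(w, w0)  # these points lie exactly on a line: w = -2, w0 = 21
```

Here θ = (w, w0) are the parameters of the model g.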
Supervised Learning: Uses
■ Prediction of future cases: Use the rule to predict the
output for future inputs
■ Knowledge extraction: The rule is easy to understand
■ Compression: The rule is simpler than the data it
explains
■ Outlier detection: Exceptions that are not covered by the
rule, e.g., fraud
12
Unsupervised Learning
■ Learning “what normally happens”
■ No output
■ Clustering: Grouping similar instances
■ Example applications
◻ Customer segmentation in CRM
◻ Image compression: Color quantization
◻ Bioinformatics: Learning motifs
13
Reinforcement Learning
■ The “reinforcement” in reinforcement learning refers to
how certain behaviors are encouraged, and others
discouraged.
■ Behaviors are reinforced through rewards which are
gained through experiences with the environment.
■ Learning a policy: A sequence of outputs
■ Credit assignment problem
■ Game playing
■ Robot in a maze
14
Resources: Datasets
■ UCI Repository:
http://www.ics.uci.edu/~mlearn/MLRepository.html
■ UCI KDD Archive:
http://kdd.ics.uci.edu/summary.data.application.html
■ Statlib: http://lib.stat.cmu.edu/
■ Delve: http://www.cs.utoronto.ca/~delve/
15
Resources: Journals
■ Journal of Machine Learning Research www.jmlr.org
■ Machine Learning
■ Neural Computation
■ Neural Networks
■ IEEE Transactions on Neural Networks
■ IEEE Transactions on Pattern Analysis and Machine
Intelligence
■ Annals of Statistics
■ Journal of the American Statistical Association
■ ...
16
Resources: Conferences
■ International Conference on Machine Learning (ICML)
◻ ICML05: http://icml.ais.fraunhofer.de/
■ European Conference on Machine Learning (ECML)
◻ ECML05: http://ecmlpkdd05.liacc.up.pt/
■ Neural Information Processing Systems (NIPS)
◻ NIPS05: http://nips.cc/
■ Uncertainty in Artificial Intelligence (UAI)
◻ UAI05: http://www.cs.toronto.edu/uai2005/
■ Computational Learning Theory (COLT)
◻ COLT05: http://learningtheory.org/colt2005/
■ International Joint Conference on Artificial Intelligence (IJCAI)
◻ IJCAI05: http://ijcai05.csd.abdn.ac.uk/
■ International Conference on Artificial Neural Networks (ICANN)
◻ ICANN05: http://www.ibspan.waw.pl/ICANN-2005/
■ ...
17
Supervised Learning
An example application
■ An emergency room in a hospital measures 17
variables (e.g., blood pressure, age, etc) of newly
admitted patients.
■ A decision is needed: whether to put a new patient
in an intensive-care unit.
■ Due to the high cost of ICU, those patients who
may survive less than a month are given higher
priority.
■ Problem: to predict high-risk patients and
discriminate them from low-risk patients.
19
Another application
■ A credit card company receives thousands of
applications for new cards. Each application
contains information about an applicant,
◻ age
◻ Marital status
◻ annual salary
◻ outstanding debts
◻ credit rating
◻ etc.
■ Problem: to decide whether an application should
be approved, or to classify applications into two
categories, approved and not approved.
20
Machine learning and our focus
■ Like human learning from past experiences.
■ A computer does not have “experiences”.
■ A computer system learns from data, which
represent some “past experiences” of an
application domain.
■ Our focus: learn a target function that can be used
to predict the values of a discrete class attribute,
e.g., approve or not-approved, and high-risk or low
risk.
■ The task is commonly called: Supervised learning,
classification, or inductive learning.
21
The data and the goal
■ Data: A set of data records (also called examples,
instances or cases) described by
◻ k attributes: A1, A2, … Ak.
◻ a class: Each example is labelled with a pre-defined class.
22
An example: data (loan application)
Approved or not
23
An example: the learning task
■ Learn a classification model from the data
■ Use the model to classify future loan applications
into
◻ Yes (approved) and
◻ No (not approved)
■ What is the class for following case/instance?
24
Supervised vs. unsupervised Learning
■ Supervised learning: classification is seen as supervised
learning from examples.
◻ Supervision: The data (observations, measurements, etc.) are
labeled with pre-defined classes, as if a “teacher” had given
the classes (hence the supervision).
◻ Test data are classified into these classes too.
■ Unsupervised learning (clustering)
◻ Class labels of the data are unknown
◻ Given a set of data, the task is to establish the existence of
classes or clusters in the data
25
Supervised learning process: two
steps
■ Learning (training): Learn a model using the
training data
■ Testing: Test the model using unseen test
data to assess the model accuracy
26
What do we mean by learning?
■ Given
◻ a data set D,
◻ a task T, and
◻ a performance measure M,
a computer system is said to learn from D to perform the
task T if after learning the system’s performance on T
improves as measured by M.
■ In other words, the learned model helps the system to
perform T better as compared to no learning.
27
An example
■ Data: Loan application data
■ Task: Predict whether a loan should be approved or not.
■ Performance measure: accuracy.
28
Fundamental assumption of learning
Assumption: The distribution of training examples is
identical to the distribution of test examples (including
future unseen examples).
29
Introduction
■ Decision tree learning is one of the most widely used
techniques for classification.
◻ Its classification accuracy is competitive with other methods,
and
◻ it is very efficient.
30
The loan data
Approved or not
31
A decision tree from the loan data
■ Decision nodes and leaf nodes (classes)
32
Use the decision tree
No
33
Is the decision tree unique?
■ No. Here is a simpler tree.
■ We want a tree that is both small and accurate.
■ Smaller trees are easier to understand and tend to perform better.
34
From a decision tree to a set of rules
■ A decision tree can
be converted to a
set of rules
■ Each path from the
root to a leaf is a
rule.
35
Algorithm for decision tree learning
■ Basic algorithm (a greedy divide-and-conquer algorithm)
◻ Assume attributes are categorical now (continuous attributes
can be handled too)
◻ Tree is constructed in a top-down recursive manner
◻ At start, all the training examples are at the root
◻ Examples are partitioned recursively based on selected
attributes
◻ Attributes are selected on the basis of an impurity function (e.g.,
information gain)
■ Conditions for stopping partitioning
◻ All examples for a given node belong to the same class
◻ There are no remaining attributes for further partitioning –
majority class is the leaf
◻ There are no examples left
36
Decision tree learning algorithm
37
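The basic greedy, top-down algorithm above can be sketched in a few lines. This is a simplified ID3-style illustration rather than C4.5 itself, and the loan-like data set, attribute names and labels are all hypothetical:

```python
from collections import Counter
import math

def entropy(labels):
    """Impurity of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Pick the attribute with maximum information gain."""
    base = entropy(labels)
    def gain(a):
        rem = 0.0
        for v in set(r[a] for r in rows):
            subset = [l for r, l in zip(rows, labels) if r[a] == v]
            rem += len(subset) / len(labels) * entropy(subset)
        return base - rem
    return max(attributes, key=gain)

def build_tree(rows, labels, attributes):
    # Stopping conditions: pure node, or no attributes left -> majority-class leaf
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(rows, labels, attributes)
    tree = {a: {}}
    for v in set(r[a] for r in rows):
        sub = [(r, l) for r, l in zip(rows, labels) if r[a] == v]
        tree[a][v] = build_tree([r for r, _ in sub], [l for _, l in sub],
                                [x for x in attributes if x != a])
    return tree

# Hypothetical loan-like data: own_house / has_job -> approve?
rows = [
    {"own_house": "yes", "has_job": "no"},
    {"own_house": "yes", "has_job": "no"},
    {"own_house": "no",  "has_job": "yes"},
    {"own_house": "no",  "has_job": "no"},
    {"own_house": "no",  "has_job": "no"},
]
labels = ["Yes", "Yes", "Yes", "No", "No"]
tree = build_tree(rows, labels, ["own_house", "has_job"])
print(tree)
```

The recursion mirrors the slide: partition on the best attribute, then repeat on each subset until a stopping condition holds.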
Choose an attribute to partition data
■ The key to building a decision tree is deciding which
attribute to branch on.
■ The objective is to reduce impurity or uncertainty in data
as much as possible.
◻ A subset of data is pure if all instances belong to the same class.
■ The heuristic in C4.5 is to choose the attribute with the
maximum Information Gain or Gain Ratio based on
information theory.
38
The loan data (reproduced)
Approved or not
39
Two possible roots, which is better?
40
Information theory
■ Information theory provides a mathematical
basis for measuring the information content.
■ To understand the notion of information, think
about it as providing the answer to a question,
for example, whether a coin will come up heads.
◻ If one already has a good guess about the answer,
then the actual answer is less informative.
◻ If one already knows that the coin is rigged so that it
will come up heads with probability 0.99, then a
message (advance information) about the actual
outcome of a flip is worth less than it would be for an
honest coin (50-50).
41
Information theory (cont …)
■ For a fair (honest) coin, you have no
information, and you are willing to pay more
(say in terms of $) for advance information -
the less you know, the more valuable the
information.
■ Information theory uses this same intuition,
but instead of measuring the value for
information in dollars, it measures information
contents in bits.
■ One bit of information is enough to answer a
yes/no question about which one has no idea,
such as the flip of a fair coin
42
Information theory: Entropy measure
■ The entropy formula:
entropy(D) = − Σ_j Pr(c_j) log2 Pr(c_j)
where Pr(c_j) is the proportion of class c_j in data set D.
43
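The entropy of a data set depends only on its class proportions. A small sketch to get a feel for the measure; the example distributions are illustrative:

```python
import math

def entropy(probs):
    """entropy(D) = -sum_j Pr(c_j) * log2 Pr(c_j); 0 * log 0 is taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # 1.0: maximum impurity for two classes
print(entropy([1.0, 0.0]))    # 0.0: a pure data set
print(entropy([0.99, 0.01]))  # close to 0: almost pure
```

Entropy is largest when the classes are evenly mixed and zero when the data is pure, which is exactly the behavior an impurity function needs.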
Entropy measure: let us get a
feeling
45
Information gain (cont …)
■ Information gained by selecting attribute Ai to
branch or to partition the data is
gain(D, Ai) = entropy(D) − Σ_j (|Dj| / |D|) × entropy(Dj)
where D1, …, Dv are the subsets of D induced by the v values of Ai.
46
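The gain is the entropy of the whole data set minus the weighted entropy of the partitions the attribute induces. A minimal self-contained sketch; the label lists are made up:

```python
import math

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    counts = {}
    for l in labels:
        counts[l] = counts.get(l, 0) + 1
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def info_gain(partitions):
    """gain = entropy(D) - sum_j |Dj|/|D| * entropy(Dj),
    where `partitions` lists the label multisets Dj induced by an attribute."""
    all_labels = [l for part in partitions for l in part]
    total = len(all_labels)
    remainder = sum(len(part) / total * entropy(part) for part in partitions)
    return entropy(all_labels) - remainder

# An attribute splitting 3 Yes / 3 No into two pure parts: maximal gain
print(info_gain([["Yes", "Yes", "Yes"], ["No", "No", "No"]]))  # 1.0
```

An attribute whose partitions are as mixed as the original data gives a gain of 0, so the attribute with the highest gain is the most informative branch.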
An example
47
We build the final tree
48
Handling continuous attributes
■ Handle continuous attribute by splitting into two intervals
(can be more) at each node.
■ How to find the best threshold to divide?
◻ Use information gain or gain ratio again
◻ Sort all the values of a continuous attribute in increasing order
{v1, v2, …, vr}.
◻ A possible threshold lies between each pair of adjacent values vi and vi+1.
Try all possible thresholds and find the one that maximizes the
gain (or gain ratio).
49
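The threshold search above can be sketched as follows; the 'age' values and class labels are a made-up example:

```python
import math

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    counts = {}
    for l in labels:
        counts[l] = counts.get(l, 0) + 1
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def best_threshold(values, labels):
    """Try a threshold at the midpoint of each pair of adjacent sorted values;
    return the (threshold, gain) pair with maximum information gain."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, -1.0)
    for i in range(len(pairs) - 1):
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # no threshold fits between equal values
        t = (pairs[i][0] + pairs[i + 1][0]) / 2
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(labels)
        if gain > best[1]:
            best = (t, gain)
    return best

# Hypothetical 'age' attribute: under-30 applicants rejected in this toy sample
t, g = best_threshold([22, 25, 28, 35, 40, 50],
                      ["No", "No", "No", "Yes", "Yes", "Yes"])
print(t, g)  # the split between 28 and 35 separates the classes perfectly
```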
An example in a continuous space
50
Avoid overfitting in classification
■ Overfitting: A tree may overfit the training data
◻ Good accuracy on training data but poor on test data
◻ Symptoms: tree too deep and too many branches,
some may reflect anomalies due to noise or outliers
■ Two approaches to avoid overfitting
◻ Pre-pruning: Halt tree construction early
■ Difficult to decide because we do not know what may happen
subsequently if we keep growing the tree.
◻ Post-pruning: Remove branches or sub-trees from a
“fully grown” tree.
■ This method is commonly used. C4.5 uses a statistical method to
estimate the error at each node for pruning.
■ A validation set may be used for pruning as well.
51
An example
Likely to overfit the data
52
Other issues in decision tree
learning
■ From tree to rules, and rule pruning
■ Handling of missing values
■ Handling skewed distributions
■ Handling attributes and classes with different costs.
■ Attribute construction
■ Etc.
53
Evaluating classification methods
■ Predictive accuracy
■ Efficiency
◻ time to construct the model
◻ time to use the model
■ Robustness: handling noise and missing values
■ Scalability: efficiency in disk-resident databases
■ Interpretability:
◻ understandability of, and insight provided by, the model
■ Compactness of the model: size of the tree, or the
number of rules.
54
Evaluation methods
■ Holdout set: The available data set D is divided into
two disjoint subsets,
◻ the training set Dtrain (for learning a model)
◻ the test set Dtest (for testing the model)
■ Important: training set should not be used in testing
and the test set should not be used in learning.
◻ The unseen test set provides an unbiased estimate of accuracy.
■ The test set is also called the holdout set. (the
examples in the original data set D are all labeled
with classes.)
■ This method is mainly used when the data set D is
large.
55
Evaluation methods (cont…)
■ n-fold cross-validation: The available data is
partitioned into n equal-size disjoint subsets.
■ Use each subset as the test set and combine the rest
n-1 subsets as the training set to learn a classifier.
■ The procedure is run n times, which gives n
accuracies.
■ The final estimated accuracy of learning is the
average of the n accuracies.
■ 10-fold and 5-fold cross-validations are commonly
used.
■ This method is used when the available data is not
large.
56
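The n-fold procedure can be sketched generically: partition the data, then train and test once per fold. The learner below is a deliberately trivial majority-class predictor, and the data is synthetic:

```python
def cross_validate(data, n, train_and_test):
    """Split `data` into n disjoint folds; each fold serves once as the test
    set while the remaining n-1 folds form the training set. Returns the n
    accuracies produced by `train_and_test(train, test)` and their mean."""
    folds = [data[i::n] for i in range(n)]  # simple round-robin partition
    accuracies = []
    for i in range(n):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        accuracies.append(train_and_test(train, test))
    return accuracies, sum(accuracies) / n

def majority_learner(train, test):
    """Dummy learner: always predict the majority class of the training set."""
    yes = sum(1 for _, label in train if label == "Yes")
    pred = "Yes" if yes >= len(train) - yes else "No"
    return sum(1 for _, label in test if label == pred) / len(test)

# Synthetic data: two-thirds "Yes", one-third "No"
data = [(i, "Yes" if i % 3 else "No") for i in range(12)]
accs, mean_acc = cross_validate(data, 4, majority_learner)
print(accs, mean_acc)
```

In practice the data should be shuffled (often stratified by class) before partitioning; the round-robin split here is kept minimal for clarity.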
Evaluation methods (cont…)
■ Leave-one-out cross-validation: This method is used
when the data set is very small.
■ It is a special case of cross-validation
■ Each fold of the cross validation has only a single test
example and all the rest of the data is used in training.
■ If the original data has m examples, this is m-fold
cross-validation
57
Evaluation methods (cont…)
■ Validation set: the available data is divided into
three subsets,
◻ a training set,
◻ a validation set and
◻ a test set.
■ A validation set is used frequently for estimating
parameters in learning algorithms.
■ In such cases, the values that give the best
accuracy on the validation set are used as the final
parameter values.
■ Cross-validation can be used for parameter
estimating as well.
58
Classification measures
■ Accuracy is only one measure (error = 1-accuracy).
■ Accuracy is not suitable in some applications.
■ In text mining, we may only be interested in the
documents of a particular topic, which are only a
small portion of a big document collection.
■ In classification involving skewed or highly
imbalanced data, e.g., network intrusion and
financial fraud detections, we are interested only in
the minority class.
◻ High accuracy does not mean any intrusion is detected.
◻ E.g., with 1% intrusions, a classifier that detects
nothing still achieves 99% accuracy.
■ The class of interest is commonly called the
positive class, and the rest negative classes.
59
Precision and recall measures
■ Used in information retrieval and text classification.
■ We use a confusion matrix to introduce them.
60
Precision and recall measures (cont…)
p = TP / (TP + FP)    r = TP / (TP + FN)
where TP, FP and FN are the confusion-matrix counts for the positive class.
62
F1-value (also called F1-score)
■ It is hard to compare two classifiers using two measures. The F1
score combines precision and recall into one measure:
F1 = 2pr / (p + r), the harmonic mean of precision (p) and recall (r).
63
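Putting the three measures together, a small sketch computing precision p = TP/(TP+FP), recall r = TP/(TP+FN) and F1 = 2pr/(p+r) from hypothetical confusion-matrix counts:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute p = TP/(TP+FP), r = TP/(TP+FN) and F1 = 2pr/(p+r)
    from confusion-matrix counts for the positive class."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(p, r, f1)  # 0.8, 0.666..., F1 as their harmonic mean (~0.727)
```

Because F1 is a harmonic mean, it is dragged down by whichever of precision or recall is lower, so a classifier cannot score well by optimizing only one of them.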
Receiver operating characteristic (ROC) curve
64
Sensitivity and Specificity
■ In statistics, there are two other evaluation measures:
◻ Sensitivity: Same as TPR
◻ Specificity: Also called True Negative Rate (TNR)
■ Then we have
TPR = sensitivity = TP / (TP + FN)
FPR = 1 − specificity = FP / (FP + TN)
65
Example ROC curves
66
Area under the curve (AUC)
■ Which classifier is better, C1 or C2?
◻ It depends on which region you talk about.
■ Can we have one measure?
◻ Yes, we compute the area under the curve (AUC)
■ If the AUC of Ci is greater than that of Cj, Ci is said to be
better than Cj.
◻ If a classifier is perfect, its AUC value is 1.
◻ If a classifier makes purely random guesses, its AUC value is 0.5.
67
Drawing an ROC curve
68
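One common way to draw the curve is to sort the examples by decreasing classifier score and sweep the threshold down, emitting one (FPR, TPR) point per example; the AUC then follows from the trapezoid rule. A sketch on made-up scores and labels (tied scores are not handled here):

```python
def roc_points(scores, labels):
    """Sort examples by decreasing score; sweeping the decision threshold
    down gives one (FPR, TPR) point per example, starting at (0, 0)."""
    pos = sum(labels)              # labels are 1 (positive) or 0 (negative)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical scores: positives (label 1) mostly ranked above negatives
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5]
labels = [1, 1, 0, 1, 0, 0]
pts = roc_points(scores, labels)
print(auc(pts))  # one of nine positive-negative pairs is misranked
```

The result equals the fraction of positive-negative pairs the classifier ranks correctly, which is another common reading of AUC.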