Unit 2
Probability theory
Probability theory is a branch of mathematics that deals with the analysis of random
phenomena. It aims to assign numerical values to the likelihood of events occurring. The
main concepts and ideas in probability theory include:
Complementary Probability is the probability that an event will not occur; it is calculated by
subtracting the probability of the event from 1, i.e. P(A') = 1 - P(A).
1. Addition Rule (Union): P(A ∪ B) = P(A) + P(B) - P(A ∩ B), where A and B are events.
2. Multiplication Rule (Intersection): P(A ∩ B) = P(A) * P(B|A), where A and B are events, and
P(B|A) is the probability of B given A has occurred.
3. Independence: Two events A and B are independent if P(B|A) = P(B).
4. Conditional Probability: P(A|B) is the probability of A occurring given that B has occurred.
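A minimal Python sketch of the rules above, using a toy equally likely sample space (a fair six-sided die); the events A and B below are illustrative assumptions, not from the text:
```python
from fractions import Fraction

# Toy sample space: a fair six-sided die (illustrative assumption).
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}        # "roll is even"
B = {4, 5, 6}        # "roll is greater than 3"

# Complementary probability: P(not A) = 1 - P(A)
print(1 - prob(A))                       # 1/2

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
print(prob(A) + prob(B) - prob(A & B))   # 2/3
print(prob(A | B))                       # same result, computed directly

# Conditional probability: P(B|A) = P(A ∩ B) / P(A)
p_B_given_A = prob(A & B) / prob(A)

# Multiplication rule: P(A ∩ B) = P(A) * P(B|A)
print(prob(A) * p_B_given_A == prob(A & B))   # True

# Independence check: A and B are independent iff P(B|A) = P(B)
print(p_B_given_A == prob(B))            # False here (2/3 vs 1/2)
```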
Random variables are functions that assign numerical values to the outcomes of a random
experiment. Probability distributions describe the likelihood of the different values that a
random variable can take; common examples include the Bernoulli, binomial, Poisson,
uniform, and normal (Gaussian) distributions.
Probability Density Functions (PDFs) and Cumulative Distribution Functions (CDFs) are used
to describe continuous probability distributions: the PDF gives the relative likelihood of the
variable taking a value near a given point, while the CDF gives the probability that the
variable takes a value less than or equal to that point.
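As a small illustration (assuming SciPy is available), the standard normal distribution's PDF and CDF can be evaluated as follows; the evaluation points are arbitrary choices:
```python
from scipy.stats import norm

# Standard normal distribution: mean 0, standard deviation 1.
x = 1.0

pdf_value = norm.pdf(x)   # density at x (relative likelihood, not a probability)
cdf_value = norm.cdf(x)   # P(X <= x)

print(pdf_value)          # ~0.2420
print(cdf_value)          # ~0.8413

# The CDF is the integral of the PDF, so P(a < X <= b) = CDF(b) - CDF(a).
print(norm.cdf(1.0) - norm.cdf(-1.0))   # ~0.6827 (the "68%" of the 68-95-99.7 rule)
```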
Finally, Expectation (or the mean) and Variance are essential concepts for understanding the
behavior of random variables and their distributions. Expectation is a measure of the central
tendency, while Variance is a measure of the dispersion or spread of a distribution.
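A short sketch computing the expectation and variance of a discrete random variable directly from its distribution; the distribution below (a fair die) is an illustrative assumption:
```python
# Discrete random variable: outcome of a fair six-sided die (illustrative assumption).
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# Expectation (mean): E[X] = sum of x * P(x)
mean = sum(x * p for x, p in zip(values, probs))

# Variance: Var(X) = E[(X - E[X])^2] = sum of (x - mean)^2 * P(x)
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

print(mean)      # 3.5
print(variance)  # ~2.9167
```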
Bayes rule:
Bayes' Theorem, named after the Reverend Thomas Bayes, is a
fundamental concept in probability theory that allows us to reverse the conditional
probability relationship between two events. It is particularly useful in situations where we
have prior knowledge or information about one event and want to update our beliefs when
new evidence becomes available.
P(A|B) = P(B|A) * P(A) / P(B)
Here,
- P(A|B) is the probability of event A occurring given that event B has occurred.
- P(B|A) is the probability of event B occurring given that event A has occurred.
- P(A) is the prior probability of event A, which is the probability of A occurring without
considering B.
- P(B) is the probability of event B occurring, also known as the marginal probability of B.
By using Bayes' Theorem, we can calculate the conditional probability P(A|B), which might
be difficult or impossible to determine directly. This is particularly useful when we want to
update our beliefs about the probability of an event based on new evidence or information.
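A worked numerical sketch of this belief update, using a hypothetical diagnostic-test scenario; the prevalence, sensitivity, and false-positive rate below are made-up numbers for illustration only:
```python
# Hypothetical numbers for illustration only.
p_disease = 0.01            # P(A): prior probability of having the disease
p_pos_given_disease = 0.95  # P(B|A): test sensitivity
p_pos_given_healthy = 0.05  # false-positive rate

# Marginal probability of a positive test, P(B), via total probability.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(round(p_disease_given_pos, 3))  # ~0.161: the prior 0.01 is updated to ~0.16
```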
Bayes' Theorem plays a crucial role in various fields, including statistics, machine learning,
data science, and scientific inquiry. Some common applications include spam filtering,
medical diagnosis, and Bayesian inference in machine learning.
Concept learning is the task of inferring a general concept, or classification rule, from
labeled training examples. Here are the key components and steps involved in concept learning:
Examples: Concept learning begins with a set of examples or instances that are labeled with
their corresponding classes or categories. These examples serve as the training data for the
learning algorithm.
Training Algorithm: The training algorithm is used to search the hypothesis space and find a
concept that best fits the training data. This involves evaluating and comparing different
hypotheses based on how well they explain the examples.
Generalization: Once a concept is learned from the training data, the goal is to generalize
this concept to new, unseen examples. Generalization ensures that the learned concept can
accurately classify instances that were not part of the training set.
Evaluation: The learned concept is evaluated using performance metrics such as accuracy,
precision, recall, and F1 score to assess its effectiveness in classifying new instances.
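As an illustration of these evaluation metrics, here is a small sketch that computes accuracy, precision, recall, and F1 score from binary predictions; the labels below are made up:
```python
# Made-up true labels and predicted labels for a binary classification task.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)  # 0.8, 0.8, 0.8, 0.8
```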
Bayes' Theorem:
Bayes' theorem states that the probability of a hypothesis (class) given the evidence
(features) equals the probability of the evidence given the hypothesis, multiplied by the
prior probability of the hypothesis and divided by the probability of the evidence:
P(class|features) = P(features|class) * P(class) / P(features).
The Naive Bayes classifier applies this rule under the simplifying ("naive") assumption that
the features are conditionally independent given the class.
The Naive Bayes algorithm is computationally efficient, especially for high-dimensional data,
and can perform well even with relatively small training datasets. However, its assumption of
feature independence may not hold true in all cases, leading to potential inaccuracies,
especially when features are correlated.
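A minimal Gaussian Naive Bayes sketch in NumPy, showing how the theorem and the independence assumption combine; the tiny dataset is made up, and in practice a library implementation (e.g. scikit-learn's GaussianNB) would normally be used instead:
```python
import numpy as np

# Made-up 2-feature training data: class 0 clustered near (1, 1), class 1 near (5, 5).
X = np.array([[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [5.0, 5.2], [4.8, 5.0], [5.1, 4.9]])
y = np.array([0, 0, 0, 1, 1, 1])

classes = np.unique(y)
priors, means, variances = {}, {}, {}
for c in classes:
    Xc = X[y == c]
    priors[c] = len(Xc) / len(X)              # P(class)
    means[c] = Xc.mean(axis=0)                # per-feature mean
    variances[c] = Xc.var(axis=0) + 1e-9      # per-feature variance (smoothed)

def log_gaussian(x, mean, var):
    """Log of the Gaussian density, evaluated per feature."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def predict(x):
    # For each class: log P(class) + sum of per-feature log-likelihoods.
    # The sum encodes the "naive" conditional-independence assumption;
    # P(features) is a shared constant, so it can be ignored when comparing classes.
    scores = {c: np.log(priors[c]) + log_gaussian(x, means[c], variances[c]).sum()
              for c in classes}
    return max(scores, key=scores.get)

print(predict(np.array([1.0, 1.1])))  # 0
print(predict(np.array([4.9, 5.1])))  # 1
```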
The Expectation-Maximization (EM) algorithm, on the other hand, can also handle latent variables
(variables that are not directly observable and must instead be inferred from the values of other
observed variables) and estimate their values, provided that the general form of the probability
distribution governing those latent variables is known. This algorithm is at the base of many
unsupervised clustering algorithms in machine learning. It was proposed, explained, and given its
name in a 1977 paper by Arthur Dempster, Nan Laird, and Donald Rubin. It is used to find local
maximum-likelihood parameters of a statistical model in cases where latent variables are involved
and the data is missing or incomplete.
Algorithm:
1. Given a set of incomplete data, start from a set of initial parameters.
2. Expectation step (E-step): Using the observed data of the dataset and the
current parameters, estimate (guess) the values of the missing data.
3. Maximization step (M-step): Use the complete data generated in the
E-step to update the parameters.
4. Repeat steps 2 and 3 until convergence.
The essence of the Expectation-Maximization algorithm is to use the available
observed data of the dataset to estimate the missing data and then use that
completed data to update the values of the parameters. Let us understand the EM
algorithm in detail.
• Initially, a set of initial values of the parameters is chosen. A set of
incomplete observed data is given to the system, with the assumption that the
observed data comes from a specific model.
• The next step is known as the “Expectation” step, or E-step. In this step, we use
the observed data and the current parameter estimates to estimate or guess the
values of the missing or incomplete data. It is basically used to update the variables.
• The next step is known as the “Maximization” step, or M-step. In this step, we
use the complete data generated in the preceding E-step to update the values of
the parameters. It is basically used to update the hypothesis.
• Finally, it is checked whether the values are converging. If they are, we stop;
otherwise we repeat the E-step and the M-step until convergence occurs.
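A compact sketch of these E- and M-steps for a two-component, one-dimensional Gaussian mixture in NumPy; here the latent variable is each point's unknown component membership, and the data and initial parameters are made up for illustration:
```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up observed data: a mixture of two 1-D Gaussians (component labels are latent).
data = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

# Step 1: initial parameter guesses (mixture weights, means, standard deviations).
weights = np.array([0.5, 0.5])
means = np.array([-1.0, 1.0])
stds = np.array([1.0, 1.0])

for iteration in range(100):
    # E-step: "fill in" the latent memberships as responsibilities,
    # i.e. the posterior probability of each component for each data point.
    likelihoods = np.stack([w * gaussian_pdf(data, m, s)
                            for w, m, s in zip(weights, means, stds)])
    responsibilities = likelihoods / likelihoods.sum(axis=0)

    # M-step: update the parameters using the responsibility-weighted data.
    n_k = responsibilities.sum(axis=1)
    new_means = (responsibilities * data).sum(axis=1) / n_k
    new_stds = np.sqrt((responsibilities * (data - new_means[:, None]) ** 2).sum(axis=1) / n_k)
    new_weights = n_k / len(data)

    # Convergence check: stop when the means no longer change appreciably.
    if np.allclose(new_means, means, atol=1e-6):
        break
    weights, means, stds = new_weights, new_means, new_stds

print(means)   # close to the true component means, roughly [0, 5]
```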
(Figure: Flow chart for the EM algorithm)
Usage of EM algorithm
• It can be used to fill the missing data in a sample.
• It can be used as the basis of unsupervised learning of clusters.
• It can be used to estimate the parameters of a Hidden Markov Model (HMM).
• It can be used for discovering the values of latent variables.
Advantages of EM algorithm
• It is guaranteed that the likelihood will not decrease with each iteration.
• The E-step and M-step are often straightforward to implement for many problems.
• Solutions to the M-step often exist in closed form.