3. Bayesian Modeling
Machine Learning: Bayesian techniques are widely used in machine learning for
tasks like classification, regression, and anomaly detection.
Signal Processing: These models are valuable for filtering noise and extracting signals
from noisy data.
Overall, Bayesian modeling provides a flexible and powerful framework for building
statistical models that can capture uncertainty and incorporate prior knowledge
effectively. It is widely used in both academic research and practical applications across
diverse domains.
Bayes' theorem
Bayes' theorem is a fundamental concept in probability theory and statistics, named after
the Reverend Thomas Bayes, the 18th-century British mathematician who first formulated
it. It provides a way to update the probability of a hypothesis in light of new
evidence. Mathematically, Bayes' theorem can be expressed as:
P(A|B) = P(B|A) * P(A) / P(B)
Where:
- P(A|B) is the probability of event A occurring given that event B has occurred. This is
called the posterior probability.
- P(B|A) is the probability of event B occurring given that event A has occurred. This is
called the likelihood.
- P(A) is the probability of event A occurring. This is called the prior probability.
- P(B) is the probability of event B occurring. This is called the marginal likelihood or
evidence.
In words, Bayes' theorem states that the probability of A given B equals the
probability of B given A, multiplied by the prior probability of A and divided by the
marginal probability of B.
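As a minimal sketch of how the theorem looks in code (the helper name bayes_posterior and the expansion of P(B) by the law of total probability are illustrative additions, not from the text), the following function is reused in the worked examples later in this section:

```python
def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return P(A|B) = P(B|A) * P(A) / P(B).

    P(B) is expanded by the law of total probability:
    P(B) = P(B|A) * P(A) + P(B|not A) * P(not A).
    """
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence
```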
Bayes' theorem is particularly useful in situations where we want to update our beliefs or
probabilities about an event or hypothesis in light of new evidence.
It provides a formal framework for combining prior knowledge or beliefs with observed
data to arrive at a more informed or updated probability distribution.
This makes it a powerful tool in various fields including statistics, machine learning, and
artificial intelligence.
Consider a worked example: a person is tested for a rare Disease X. Given that the test
came back positive, the probability that the person actually has Disease X works out to
approximately 32.4%. Even though the test is quite accurate, the relatively low prior
probability of having the disease means that a positive test result doesn't guarantee
that the person has the disease.
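The underlying numbers for this example are not shown above; one set of assumptions consistent with the quoted 32.4% (a reconstruction, not necessarily the original's figures) is a 1% prior, 95% sensitivity, and a 2% false-positive rate:

```latex
P(\text{disease} \mid +)
  = \frac{P(+ \mid \text{disease})\,P(\text{disease})}
         {P(+ \mid \text{disease})\,P(\text{disease})
          + P(+ \mid \text{no disease})\,P(\text{no disease})}
  = \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.02 \times 0.99}
  \approx 0.324
```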
Bayes’ Theorem — Example Solution
P(A|B) = P(B|A)*P(A)/P(B)
Problem 1:
Let’s work through a simple NLP problem using Bayes’ theorem: detecting spam e-mails in
my inbox. Assume that the word ‘offer’ occurs in 80% of the spam messages in my account,
and in 10% of my desired e-mails. If 30% of the received e-mails are spam and I receive a
new message containing ‘offer’, what is the probability that it is spam?
Now, suppose I received 100 e-mails. Since 30% of all e-mails are spam, I have 30 spam
e-mails and 70 desired e-mails. The word ‘offer’ occurs in 80% of spam e-mails, that is,
in 24 of the 30 spam e-mails, while the remaining 6 do not contain ‘offer’. The word
‘offer’ occurs in 10% of the desired e-mails, that is, in 7 of the 70 desired e-mails,
while the remaining 63 do not.
The question was: what is the probability that an e-mail is spam given that it contains
the word ‘offer’?
1. Find the total number of e-mails that contain ‘offer’:
24 + 7 = 31 e-mails contain the word ‘offer’.
2. Find the probability of spam given that the e-mail contains ‘offer’:
of the 31 e-mails with ‘offer’, 24 are spam, so the probability is 24/31 ≈ 0.774 (77.4%).
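The counting chart translates directly into a throwaway Python sketch (variable names are illustrative):

```python
total = 100
spam = round(0.30 * total)                  # 30 spam e-mails
desired = total - spam                      # 70 desired e-mails
spam_offer = round(0.80 * spam)             # 24 spam e-mails contain 'offer'
desired_offer = round(0.10 * desired)       # 7 desired e-mails contain 'offer'
offer_total = spam_offer + desired_offer    # 31 e-mails contain 'offer'
print(round(spam_offer / offer_total, 3))   # 0.774
```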
NOTE: In this example, I chose percentages that give integers after calculation. As a
general approach, you can imagine starting with 100 units, so non-integer results are not
a problem: we cannot speak of 15.3 e-mails, but we can speak of 15.3 units.
Solution with Bayes’ Equation:
A = the e-mail is spam, B = the e-mail contains the word ‘offer’
P(A) = 0.3, P(B|A) = 0.8, P(B|not A) = 0.1
Now we will find the probability of an e-mail containing the word ‘offer’. We can compute
it by combining the occurrences of ‘offer’ in spam and desired e-mails:
P(B) = P(B|A) * P(A) + P(B|not A) * P(not A) = 0.8 * 0.3 + 0.1 * 0.7 = 0.31
P(A|B) = P(B|A) * P(A) / P(B) = (0.8 * 0.3) / 0.31 = 0.24 / 0.31 ≈ 0.774
As can be seen, both approaches give the same result: in the first part I solved the
question with a simple counting chart, and in the second part I solved it with Bayes’
theorem.
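Plugging the same numbers into the bayes_posterior helper sketched earlier gives the identical result:

```python
# P(spam) = 0.3, P('offer' | spam) = 0.8, P('offer' | desired) = 0.1
p = bayes_posterior(prior=0.3, likelihood=0.8, likelihood_given_not=0.1)
print(round(p, 3))  # 0.774, matching both solutions above
```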
Problem 2:
I want to solve one more example from a popular topic: Covid-19. As you know, Covid-19
tests are common nowadays, but some test results are wrong. Let’s assume a diagnostic
test has 99% accuracy and 60% of all people have Covid-19. If a patient tests positive,
what is the probability that they actually have the disease?
Again, think of 100 units: 60 units have Covid-19 and 40 do not. Of the 60 infected
units, 99% test positive, which gives 59.4 true-positive units; of the 40 healthy units,
1% test positive, which gives 0.4 false-positive units.
The total units which have positive results = 59.4 + 0.4 = 59.8
59.4 true-positive units out of 59.8 positive units gives 59.4/59.8 ≈ 0.993 (99.3%
probability).
With Bayes’:
P(positive|covid19) = 0.99
P(covid19) = 0.6
P(positive) = 0.6 * 0.99 + 0.4 * 0.01 = 0.598
P(covid19|positive) = P(positive|covid19) * P(covid19) / P(positive) = 0.594 / 0.598 ≈ 0.993
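The bayes_posterior helper from earlier reproduces this in one call:

```python
p = bayes_posterior(prior=0.6, likelihood=0.99, likelihood_given_not=0.01)
print(round(p, 3))  # 0.993, matching the chart solution
```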
Again, we find the same answer with the chart. There are many examples for learning
applications of Bayes’ theorem, such as the Monty Hall problem, a little puzzle in which
you face 3 doors; behind them are 2 goats and 1 car. You are asked to select one door to
find the car. After you select a door, the host opens one of the unselected doors,
revealing a goat. Then you are asked to either switch doors or stick with your first
choice. By running this process a thousand times in simulation, you can estimate the
probability of winning with each strategy and build intuition for Bayes’ theorem and
Bayesian statistics in general through the Monty Hall problem.
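A simulation along the lines the text describes might look like this in Python (function and variable names are illustrative; the win rates of roughly 1/3 for staying and 2/3 for switching are the well-known result):

```python
import random

def monty_hall(switch, trials=100_000):
    """Estimate the probability of winning the car for a given strategy."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)       # door hiding the car
        choice = random.randrange(3)    # contestant's first pick
        # Host opens a door that is neither the pick nor the car, revealing a goat.
        opened = random.choice([d for d in range(3) if d != choice and d != car])
        if switch:
            # Switch to the one remaining unopened door.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == car)
    return wins / trials

print(f"stay:   {monty_hall(switch=False):.3f}")  # ~0.333
print(f"switch: {monty_hall(switch=True):.3f}")   # ~0.667
```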
When we think of Bayes’ theorem in a machine learning context, it provides a way to
calculate the probability of a hypothesis given observed data, using the relationship
between the data and the hypothesis. It is also the first step toward understanding the
True Positive, False Positive, True Negative, and False Negative concepts in data science
classification problems and the Naive Bayes classifier.
Bayesian networks
Bayesian networks, also known as belief networks or probabilistic graphical models, are
powerful tools for representing and reasoning about uncertain knowledge. They provide a
formalism for capturing probabilistic relationships among a set of variables in a domain.
Here's an overview of Bayesian networks:
Components of Bayesian Networks:
1. Nodes (Vertices):
- Each node represents a random variable in the domain. These variables can be discrete,
continuous, or hybrid.
- Nodes may represent observable quantities (e.g., symptoms in a medical diagnosis) or
latent variables (e.g., underlying causes).
2. Directed Edges (Arcs):
- Edges encode direct probabilistic dependencies between variables; each node carries a
conditional probability distribution given its parents in the graph.
Advantages of Bayesian Networks:
1. Uncertainty Handling: They provide a principled way to represent and reason under
uncertainty, crucial for many real-world applications.
2. Efficient Inference: Various algorithms exist for efficient probabilistic inference in Bayesian
networks, allowing for queries about the probability distribution of variables given evidence.
Applications of Bayesian Networks:
1. Medical Diagnosis: Bayesian networks are widely used in medical diagnosis systems to
combine patient symptoms with medical knowledge for accurate diagnosis.
2. Risk Assessment: They are used in risk assessment and decision-making processes, such as
determining the likelihood of failure in engineering systems or assessing financial risks.
3. Natural Language Processing: Bayesian networks are used in various natural language
processing tasks, such as language modeling, part-of-speech tagging, and parsing.
4. Computer Vision: They are used for object recognition, image segmentation, and scene
understanding in computer vision applications.
Overall, Bayesian networks provide a flexible framework for modeling uncertain domains
and are widely applied across different fields for decision-making, prediction, and inference
tasks.
Inference in Bayesian networks refers to the process of using the network's structure and
data to estimate the probability of different outcomes (states) for variables in the network.
Inference Algorithms:
There are several inference algorithms used to perform this task efficiently (a small
end-to-end sketch follows the list), including:
1. Variable Elimination: This algorithm computes the marginal probability distribution
of a subset of variables by systematically eliminating variables from the network
based on evidence.
2. Junction Tree Algorithm: This algorithm constructs a junction tree (also known as a
clique tree) from the Bayesian network and uses it to perform efficient inference.
3. Gibbs Sampling and Markov Chain Monte Carlo (MCMC) Methods: These methods
are used for approximate inference in Bayesian networks, especially when the
network structure is complex and exact inference is computationally infeasible.
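All of these algorithms speed up what can, on a small network, be done by brute force: summing the joint distribution over the unobserved variables. Here is a minimal Python sketch on a hypothetical Rain/Sprinkler/GrassWet network (the structure and every probability below are illustrative assumptions, not from the text):

```python
from itertools import product

# Hypothetical three-variable network: Rain -> Sprinkler, and both -> GrassWet.
P_rain = {True: 0.2, False: 0.8}                    # P(Rain)
P_sprinkler = {True: {True: 0.01, False: 0.4},      # P(Sprinkler=s | Rain=r),
               False: {True: 0.99, False: 0.6}}     # keyed as [s][r]
P_wet = {(True, True): 0.99, (True, False): 0.9,    # P(Wet=True | Sprinkler=s, Rain=r),
         (False, True): 0.8, (False, False): 0.0}   # keyed as (s, r)

def joint(rain, sprinkler, wet):
    """Joint probability via the chain rule implied by the network structure."""
    p_wet_true = P_wet[(sprinkler, rain)]
    return (P_rain[rain]
            * P_sprinkler[sprinkler][rain]
            * (p_wet_true if wet else 1.0 - p_wet_true))

# Query P(Rain=True | GrassWet=True): sum the joint over the hidden variable.
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
print(f"P(Rain | GrassWet) = {num / den:.3f}")      # ~0.358 with these numbers
```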
Challenges of Inference: