AI Unit 3
What to Represent:
Following are the kinds of knowledge which need to be represented in AI systems:
o Object: All the facts about objects in our world domain. E.g., guitars contain
strings; trumpets are brass instruments.
o Events: Events are the actions which occur in our world.
o Performance: It describes behavior which involves knowledge about how to do
things.
o Meta-knowledge: It is knowledge about what we know.
o Facts: Facts are the truths about the real world and what we represent.
o Knowledge-Base: The central component of a knowledge-based agent is the
knowledge base, represented as KB. The knowledge base is a group of sentences
(here, "sentence" is used as a technical term and is not identical to a sentence in
the English language).
Types of knowledge
Following are the various types of knowledge:
1. Declarative Knowledge:
2. Procedural Knowledge
3. Meta-knowledge:
4. Heuristic knowledge:
5. Structural knowledge:
o Structural knowledge is the basic knowledge required for problem-solving.
o It describes relationships between various concepts such as kind of, part of, and
grouping of something.
o It describes the relationship that exists between concepts or objects.
Let's suppose you meet some person who is speaking in a language which you don't
know. How will you be able to act on that? The same thing applies to the intelligent
behavior of agents.
As we can see in the diagram below, there is one decision maker which acts by sensing the
environment and using knowledge. But if the knowledge part is not present, then it
cannot display intelligent behavior.
AI knowledge cycle:
An Artificial intelligence system has the following components for displaying intelligent
behavior:
o Perception
o Learning
o Knowledge Representation and Reasoning
o Planning
o Execution
The above diagram shows how an AI system can interact with the real world and
which components help it to show intelligence. An AI system has a Perception component by
which it retrieves information from its environment. It can be visual, audio, or another
form of sensory input. The learning component is responsible for learning from the data
captured by the Perception component. In the complete cycle, the main components are
Knowledge Representation and Reasoning. These two components are involved in
making machines show human-like intelligence. They are independent of each other but
also coupled together. Planning and execution depend on the analysis of knowledge
representation and reasoning.
Approaches to knowledge representation:
Following are the main approaches to knowledge representation:
1. Relational knowledge:
o It is the simplest way of storing facts which uses the relational method; each
fact about a set of objects is set out systematically in columns.
o This approach of knowledge representation is famous in database systems where
the relationship between different entities is represented.
o This approach has little opportunity for inference.
Player1 65 23
Player2 58 18
Player3 75 24
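For illustration, here is a minimal sketch of the relational approach in Python. The column names "weight" and "age" are assumptions made for this sketch; the table above only gives the raw values.

# A minimal sketch of the relational approach: facts about a set of objects
# stored systematically in columns, much like a database table.
# (The column names "weight" and "age" are assumptions for illustration.)

players = [
    {"name": "Player1", "weight": 65, "age": 23},
    {"name": "Player2", "weight": 58, "age": 18},
    {"name": "Player3", "weight": 75, "age": 24},
]

# The representation supports simple lookups, but offers little opportunity
# for inference beyond filtering and selecting columns.
young_players = [p["name"] for p in players if p["age"] < 20]
print(young_players)  # ['Player2']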
2. Inheritable knowledge:
o In the inheritable knowledge approach, all data must be stored into a hierarchy of
classes.
o All classes should be arranged in a generalized form or a hierarchical manner.
o In this approach, we apply inheritance property.
o Elements inherit values from other members of a class.
o This approach contains inheritable knowledge which shows a relation between
instance and class, and it is called instance relation.
o Every individual frame can represent the collection of attributes and its value.
o In this approach, objects and values are represented in Boxed nodes.
o We use Arrows which point from objects to their values.
o Example:
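For illustration, a minimal sketch of the inheritable-knowledge approach using Python class inheritance is given below; the class names and attribute values are assumptions made purely for this sketch.

# A minimal sketch of the inheritable-knowledge approach using Python class
# inheritance: elements inherit attribute values from the classes above them
# in the hierarchy. (The class names and attributes are illustrative.)

class Animal:
    can_breathe = True

class Mammal(Animal):          # "kind of" relation: Mammal is a kind of Animal
    has_fur = True

class Cat(Mammal):             # Cat is a kind of Mammal
    says = "meow"

jerry = Cat()                  # instance relation: jerry is an instance of Cat

# jerry inherits values defined higher up the hierarchy.
print(jerry.says)          # meow  (from Cat)
print(jerry.has_fur)       # True  (inherited from Mammal)
print(jerry.can_breathe)   # True  (inherited from Animal)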
3. Inferential knowledge:
o The inferential knowledge approach represents knowledge in the form of formal logic.
o This approach can be used to derive more facts from the facts already stored.
4. Procedural knowledge:
o The procedural knowledge approach uses small programs and code which describe
how to do specific things and how to proceed.
o In this approach, one important rule is used, which is the If-Then rule.
o With this knowledge, we can use various coding languages such as LISP
and Prolog.
o We can easily represent heuristic or domain-specific knowledge using this
approach.
o However, it is not necessary that all cases can be represented with this approach.
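For illustration, here is a minimal sketch of an If-Then rule as procedural knowledge, written in Python rather than LISP or Prolog; the domain, rule, and threshold are assumptions made for this sketch.

# A minimal sketch of the procedural-knowledge approach: a small piece of code
# that describes HOW to do something, driven by an If-Then rule.
# (The diagnosis rule and the 38 degree threshold are illustrative assumptions.)

def diagnose_fever(temperature_c: float) -> str:
    # IF temperature is above 38 THEN report fever, ELSE report normal.
    if temperature_c > 38.0:
        return "fever: take rest and drink fluids"
    return "temperature is normal"

print(diagnose_fever(39.2))  # fever: take rest and drink fluids
print(diagnose_fever(36.8))  # temperature is normal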
Requirements for a knowledge representation system:
A good knowledge representation system must possess the following properties:
1. Representational Accuracy:
The KR system should have the ability to represent all kinds of required knowledge.
2. Inferential Adequacy:
The KR system should have the ability to manipulate the representational structures to
produce new knowledge corresponding to existing structures.
3. Inferential Efficiency:
The ability to direct the inferential mechanism into the most productive directions by
storing appropriate guides.
4. Acquisitional Efficiency:
The ability to acquire new knowledge easily using automatic methods.
Techniques of knowledge representation:
There are mainly four ways of knowledge representation, which are given as follows:
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
1. Logical Representation
Logical representation is a language with some concrete rules which deals with
propositions and has no ambiguity in representation. Logical representation means
drawing a conclusion based on various conditions. This representation lays down some
important communication rules. It consists of precisely defined syntax and semantics
which support sound inference. Each sentence can be translated into logic using
syntax and semantics.
Syntax:
o Syntax is the set of rules which decides how we can construct legal sentences in the logic.
o It determines which symbols we can use in knowledge representation.
o It also determines how to write those symbols.
Semantics:
o Semantics are the rules by which we can interpret the sentence in the logic.
o Semantic also involves assigning a meaning to each sentence.
Logical representation can be categorised mainly into two logics:
a. Propositional logic
b. Predicate logic
Note: We will discuss propositional logic and predicate logic in later chapters.
Disadvantages of logical representation:
1. Logical representations have some restrictions and are challenging to work with.
2. The logical representation technique may not be very natural, and inference may not be very
efficient.
Note: Do not be confused with logical representation and logical reasoning as logical
representation is a representation language and reasoning is a process of thinking logically.
2. Semantic Network Representation
In semantic networks, we represent knowledge in the form of graphical networks consisting of
nodes, which represent objects, and arcs, which describe the relationships between those objects.
Example: Following are some statements which we need to represent in the form of
nodes and arcs.
Statements:
a. Jerry is a cat.
b. Jerry is a mammal
c. Jerry is owned by Priya.
d. Jerry is brown colored.
e. All mammals are animals.
In the above example, we have represented different types of knowledge in the form
of nodes and arcs. Each object is connected with another object by some relation.
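For illustration, the statements above can be stored as a small semantic network in Python, with arcs of the form (subject, relation, object); the relation names are assumptions chosen for this sketch.

# A minimal sketch of the semantic-network idea: the statements about Jerry
# stored as labelled arcs (subject, relation, object) between nodes.

arcs = [
    ("Jerry", "is_a", "Cat"),
    ("Jerry", "is_a", "Mammal"),
    ("Jerry", "owned_by", "Priya"),
    ("Jerry", "has_color", "Brown"),
    ("Mammal", "is_a", "Animal"),
]

def related(subject, relation):
    # Traverse the network and collect objects reachable via the given relation.
    return [o for (s, r, o) in arcs if s == subject and r == relation]

# Answering a question means traversing the network, which can be slow for
# large networks (one of the drawbacks listed below).
print(related("Jerry", "is_a"))      # ['Cat', 'Mammal']
print(related("Mammal", "is_a"))     # ['Animal']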
Drawbacks in Semantic representation:
1. Semantic networks take more computational time at runtime as we need to traverse the
complete network tree to answer some questions. It might be possible in the worst case
scenario that after traversing the entire tree, we find that the solution does not exist in
this network.
2. Semantic networks try to model human-like memory (which has billions of neurons and on the order of 10^15 links)
to store the information, but in practice, it is not possible to build such a vast semantic
network.
3. These types of representations are inadequate as they do not have any equivalent
quantifier, e.g., for all, for some, none, etc.
4. Semantic networks do not have any standard definition for the link names.
5. These networks are not intelligent and depend on the creator of the system.
3. Frame Representation
A frame is a record-like structure which consists of a collection of attributes and their
values to describe an entity in the world. Frames are an AI data structure which divides
knowledge into substructures by representing stereotyped situations. A frame consists of a
collection of slots and slot values. These slots may be of any type and size. Slots have
names and values which are called facets.
Facets: The various aspects of a slot are known as facets. Facets are features of frames
which enable us to put constraints on the frames. Example: IF-NEEDED facets are called
when the data of a particular slot is needed. A frame may consist of any number of slots,
a slot may include any number of facets, and a facet may have any number of values.
A frame is also known as slot-filler knowledge representation in artificial intelligence.
Frames are derived from semantic networks and later evolved into our modern-day
classes and objects. A single frame is not very useful on its own; a frame system consists of a
collection of frames which are connected. In a frame, knowledge about an object or
event can be stored together in the knowledge base. The frame is a type of technology
which is widely used in various applications, including natural language processing and
machine vision.
Example: 1
Let's take an example of a frame for a book
Slots    Fillers
Year     1996
Page     1152
Example 2:
Let's suppose we are taking an entity, Peter. Peter is a doctor by profession, his
age is 25, he lives in the city of London, and the country is England. The following is the frame
representation for this:
Slots        Fillers
Name         Peter
Profession   Doctor
Age          25
Weight       78
City         London
Country      England
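For illustration, the Peter frame can be sketched in Python as a dictionary of slots and fillers; modelling the IF-NEEDED facet as a function attached to a slot is an assumption made purely for this sketch.

# A minimal sketch of a frame as a dictionary of slots and fillers, using the
# "Peter" entity described above. An IF-NEEDED facet is modelled as a small
# function that computes a value only when the slot is asked for.

from datetime import date

peter_frame = {
    "Name": "Peter",
    "Profession": "Doctor",
    "Age": 25,
    "Weight": 78,
    "City": "London",
    "Country": "England",
    # IF-NEEDED facet: compute the birth year only when it is needed.
    "BirthYear": lambda frame: date.today().year - frame["Age"],
}

def get_slot(frame, slot):
    value = frame[slot]
    return value(frame) if callable(value) else value

print(get_slot(peter_frame, "Profession"))  # Doctor
print(get_slot(peter_frame, "BirthYear"))   # depends on the current year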
4. Production Rules
A production rules system consists of (condition, action) pairs, which mean "If condition
then action". It has mainly three parts:
o The set of production rules
o Working memory
o The recognize-act cycle
In production rules, the agent checks for the condition, and if the condition holds then the
production rule fires and the corresponding action is carried out. The condition part of the
rule determines which rule may be applied to a problem, and the action part carries out
the associated problem-solving steps. This complete process is called a recognize-act
cycle.
The working memory contains the description of the current state of problem-solving,
and a rule can write knowledge to the working memory. This knowledge may then match
and fire other rules.
If a new situation (state) is generated, multiple production rules may be triggered
together; this set of candidate rules is called the conflict set. In this situation, the agent needs
to select one rule from this set, and this selection is called conflict resolution.
Example:
o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down).
o IF (on bus AND unpaid) THEN action (pay charges).
o IF (bus arrives at destination) THEN action (get down from the bus).
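For illustration, here is a minimal sketch of a production system for the bus rules above; the extra starting facts and the very simple conflict-resolution strategy (take the first matching rule) are assumptions of this sketch.

# A minimal sketch of a production system: working memory holds the current
# facts, each rule is a (condition, action) pair, and the recognize-act cycle
# repeatedly fires one applicable rule.

def ride_bus():
    wm = {"at bus stop", "bus arrives"}        # working memory (current state)

    rules = [
        # IF (at bus stop AND bus arrives) THEN get into the bus
        ({"at bus stop", "bus arrives"}, lambda: (wm.discard("at bus stop"),
                                                  wm.add("on bus"))),
        # IF (on bus AND unpaid) THEN pay charges
        ({"on bus", "unpaid"},           lambda: (wm.discard("unpaid"),
                                                  wm.add("paid"))),
        # IF (on bus AND paid AND empty seat) THEN sit down
        ({"on bus", "paid", "empty seat"}, lambda: wm.add("seated")),
    ]

    wm.update({"unpaid", "empty seat"})        # extra starting facts (assumed)

    # Recognize-act cycle: build the conflict set of matching rules, pick one
    # (here simply the first), fire it, and repeat until nothing new applies.
    while True:
        conflict_set = [act for cond, act in rules if cond <= wm]
        if not conflict_set or "seated" in wm:
            break
        conflict_set[0]()                       # very simple conflict resolution

    return wm

print(ride_bus())   # contains 'on bus', 'paid' and 'seated'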
Disadvantages of Production rules:
1. The production rule system does not exhibit any learning capabilities, as it does not store the
result of the problem for future use.
2. During the execution of the program, many rules may be active; hence rule-based
production systems can be inefficient.
LOGIC
The knowledge bases consist of sentences. These sentences are expressed according to the syntax of the
representation language, which specifies all the sentences that are well formed. For example,
"x + y = 4" is a well-formed sentence, whereas "x+y+ =" is not.
A logic must also define the semantics or meaning of sentences. The semantics defines the truth of each
sentence with respect to each possible world. For example, the semantics for arithmetic specifies that the
sentence "x + y = 4" is true in a world where x is 2 and y is 2, but false in a world where x is 1 and y is
1.
When we need to be precise, we use the term model. Models are mathematical abstractions, each of which
simply fixes the truth or falsehood of every relevant sentence. For example, with x men and y women
sitting at a table playing bridge, the sentence x + y = 4 is true when there are four people in total.
Formally, the possible models are just all possible assignments of real numbers to the variables x and y.
If a sentence α is true in model m, we say that m satisfies α, or sometimes that m is a model of α.
What is Logic?
Logic is the basis of all mathematical reasoning and all automated
reasoning. The rules of logic specify the meaning of mathematical
statements. These rules help us understand and reason with statements
such as –
∃ x such that x ≠ a² + b² for any integers a and b, where x, a, b ∈ Z,
which in simple English means "There exists an integer that is not the
sum of two squares".
Importance of Mathematical Logic
The rules of logic give precise meaning to mathematical statements.
These rules are used to distinguish between valid and invalid
mathematical arguments. Apart from its importance in understanding
mathematical reasoning, logic has numerous applications in Computer
Science, varying from the design of digital circuits to the construction of
computer programs and verification of the correctness of programs.
Propositional Logic
What is a Proposition? A proposition is the basic building block of logic.
It is defined as a declarative sentence that is either True or False, but not
both. The Truth Value of a proposition is True(denoted as T) if it is a true
statement, and False(denoted as F) if it is a false statement. For
Example,
1. The sun rises in the East and sets in the West.
2. 1 + 1 = 2
3. 'b' is a vowel.
All of the above sentences are propositions, where the first two are
true and the third one is false. Some sentences that do
not have a truth value, or may have more than one truth value, are not
propositions. For example,
1. What time is it?
2. Go out and Play
3. x + 1 = 2
The above sentences are not propositions, as the first two do not have a
truth value and the third one may be true or false depending on the value of x. To represent
propositions, propositional variables are used. By convention, these
variables are represented by lowercase letters such as p, q, r, s. The area of
logic which deals with propositions is called propositional
calculus or propositional logic. It also includes producing new
propositions using existing ones. Propositions constructed using one or
more propositions are called compound propositions. The propositions
are combined together using Logical Connectives or Logical
Operators.
Truth Table
Since we need to know the truth value of a proposition in all possible
scenarios, we consider all the possible combinations of the propositions
which are joined together by Logical Connectives to form the given
compound proposition. This compilation of all possible scenarios in a
tabular format is called a truth table. Most Common Logical
Connectives-
1. Negation
If p is a proposition, then the negation of p is denoted by ¬p , which
when translated to simple English means- “It is not the case that p” or
simply "not p". The truth value of ¬p is the opposite of the truth value of p.
The truth table of ¬p is:
p ¬p
T F
F T
Example: The negation of "It is raining today" is "It is not the case that it is
raining today", or simply "It is not raining today".
2. Conjunction
For any two propositions p and q , their conjunction is denoted by p∧q ,
which means “ p and q “. The conjunction p∧q is True when
both p and q are True, otherwise False. The truth table of p∧q is:
p q p∧q
T T T
T F F
F T F
F F F
3. Disjunction
For any two propositions p and q, their disjunction is denoted by p∨q,
which means "p or q". The disjunction p∨q is True when at least one
of p and q is True, otherwise False. The truth table of p∨q is:
p q p∨q
T T T
T F T
F T T
F F F
4. Exclusive Or
For any two propositions p and q, their exclusive or is denoted by p⊕q,
which means "either p or q, but not both". p⊕q is True when exactly one
of p and q is True, otherwise False. The truth table of p⊕q is:
p q p⊕q
T T F
T F T
F T T
F F F
Example: The exclusive or of the propositions p – "Today is Friday" and q –
"It is raining today", p⊕q, is "Either today is Friday or it is raining today,
but not both". This proposition is true on any day that is either a Friday or a
rainy day (not including rainy Fridays), and is false on any day other than
Friday when it does not rain, as well as on rainy Fridays.
5. Implication
For any two propositions p and q , the statement “if p then q ” is called
an implication and it is denoted by p→q . In the implication p→q , p is
called the hypothesis or antecedent or premise and q is called
the conclusion or consequence. The implication p→q is also called
a conditional statement. The implication is false when p is true and q is
false; otherwise it is true. The truth table of p→q is:
p q p→q
T T T
T F F
F T T
F F T
One might wonder why p→q is true when p is false. This is
because the implication guarantees that when p and q are true, the
implication is true. But the implication does not guarantee anything when
the premise p is false. There is no way of knowing whether or not the
implication is false, since p did not happen. This situation is similar to the
"innocent until proven guilty" stance, which means that the
implication p→q is considered true until proven false. Since we cannot call
the implication p→q false when p is false, our only alternative is to call it
true.
This follows from the Explosion Principle, which says: "A false
statement implies anything." Conditional statements play a very important
role in mathematical reasoning, thus a variety of terminology is used to
express p→q, some of which is listed below:
"if p, then q"
"p is sufficient for q"
"q when p"
"a necessary condition for p is q"
"p only if q"
"q unless ¬p"
"q follows from p"
Example, “If it is Friday then it is raining today” is a proposition which is
of the form p→q . The above proposition is true if it is not Friday(premise
is false) or if it is Friday and it is raining, and it is false when it is Friday
but it is not raining.
6. Biconditional or Double Implication
For any two propositions p and q , the statement “p if and only if(iff) q ”
is called a biconditional and it is denoted by p↔q . The
statement p↔q is also called a bi-implication. p↔q has the same truth
value as (p→q)∧(q→p). The biconditional is true when p and q have the same
truth value, and is false otherwise. The truth table of p↔q is:
p q p↔q
T T T
T F F
F T F
F F T
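For illustration, the truth tables of the connectives above can be generated with a short Python sketch; only the standard definitions of the connectives are used, and the output formatting is an arbitrary choice.

# A small sketch that prints the truth tables of the connectives described
# above by enumerating all combinations of truth values of p and q.

from itertools import product

connectives = {
    "p∧q": lambda p, q: p and q,
    "p∨q": lambda p, q: p or q,
    "p⊕q": lambda p, q: p != q,
    "p→q": lambda p, q: (not p) or q,   # false only when p is T and q is F
    "p↔q": lambda p, q: p == q,
}

def fmt(value):
    return "T" if value else "F"

print("p q  " + "  ".join(connectives))
for p, q in product([True, False], repeat=2):
    row = "    ".join(fmt(f(p, q)) for f in connectives.values())
    print(f"{fmt(p)} {fmt(q)}  {row}")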
Example:
a) It is Sunday.
b) The Sun rises from the West. (False proposition)
c) 3 + 3 = 7. (False proposition)
d) 5 is a prime number.
a. Atomic Propositions: atomic propositions are simple propositions consisting of a single
proposition symbol. Example: "2 + 2 = 4" is an atomic proposition.
b. Compound Propositions: compound propositions are constructed by combining simpler or
atomic propositions using logical connectives. Example: "It is raining today, and the street is
wet" is a compound proposition.
Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a
sentence logically. We can create compound propositions with the help of logical
connectives. There are mainly five connectives, which are given as follows: negation (¬),
conjunction (∧), disjunction (∨), implication (→), and biconditional (⇔).
Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional
connectors or logical operators. This order should be followed while evaluating a
propositional problem. Following is the list of the precedence order for operators:
Precedence            Operators
First precedence      Parenthesis
Second precedence     Negation
Third precedence      Conjunction (AND)
Fourth precedence     Disjunction (OR)
Fifth precedence      Implication
Sixth precedence      Biconditional
Note: For a better understanding, use parentheses to make sure of the correct interpretation.
For example, ¬R∨Q can be interpreted as (¬R) ∨ Q.
Logical equivalence:
Logical equivalence is one of the features of propositional logic. Two propositions
are said to be logically equivalent if and only if the columns in the truth table are
identical to each other.
Let's take two propositions A and B; for logical equivalence we can write A⇔B. In the truth
table below, we can see that the columns for ¬A∨B and A→B are identical; hence ¬A∨B is
equivalent to A→B.
A B ¬A ¬A∨B A→B
T T F  T    T
T F F  F    F
F T T  T    T
F F T  T    T
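For illustration, this equivalence (and one of the De Morgan's laws listed below) can be checked with a short Python sketch that compares truth-table columns.

# A quick sketch that checks logical equivalence by comparing truth-table
# columns: two formulas are equivalent iff they agree on every row.

from itertools import product

def equivalent(f, g):
    return all(f(a, b) == g(a, b) for a, b in product([True, False], repeat=2))

# ¬A ∨ B is equivalent to A → B (the table shown above)
print(equivalent(lambda a, b: (not a) or b,
                 lambda a, b: b if a else True))          # True

# De Morgan's law: ¬(A ∧ B) is equivalent to ¬A ∨ ¬B
print(equivalent(lambda a, b: not (a and b),
                 lambda a, b: (not a) or (not b)))        # True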
Properties of Operators:
o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ True= True.
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o DE Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
Predicate Logic
Predicate logic deals with predicates, which are propositions that contain variables.
Quantifier:
The variable of predicates is quantified by quantifiers. There are two types of
quantifier in predicate logic - Existential Quantifier and Universal Quantifier.
Existential Quantifier:
If p(x) is a proposition over the universe U, then it is denoted as ∃x p(x) and read as
"There exists at least one value in the universe of variable x such that p(x) is true." The
quantifier ∃ is called the existential quantifier.
There are several ways to write a proposition with an existential quantifier, i.e.,
(∃x∈A) p(x), or ∃x∈A such that p(x), or (∃x) p(x), or "p(x) is true for some x∈A".
Universal Quantifier:
If p(x) is a proposition over the universe U, then it is denoted as ∀x p(x) and read as
"For every x∈U, p(x) is true." The quantifier ∀ is called the universal quantifier.
The two rules for negation of a quantified proposition are as follows. These are also
called DeMorgan's Law:
1. ~(∀x p(x)) = ∃x ~p(x)
2. ~(∃x p(x)) = ∀x ~p(x)
For example, ~((∃x∈U)(x + 6 = 25)) = (∀x∈U)(x + 6 ≠ 25), and
~(∃x p(x) ∨ ∀y q(y)) = ∀x ~p(x) ∧ ∃y ~q(y).
For a proposition which contains both universal and existential quantifiers, the order
of those quantifiers cannot be exchanged without altering the meaning of the
proposition. For example, the proposition ∃x ∀y p(x, y) means "There exists some x such that
p(x, y) is true for every y."
Example: Write the negation for each of the following. Determine whether the
resulting statement is true or false. Assume U = R.
1. ∀x ∃m (x² < m)
Sol: The negation of ∀x ∃m (x² < m) is ∃x ∀m (x² ≥ m). The meaning of ∃x ∀m (x² ≥ m) is
that there exists some x such that x² ≥ m for every m. The resulting statement is false: for
any x we can choose m = x² + 1 > x², so no x satisfies x² ≥ m for every m (equivalently, the
original statement is true).
2. ∃m ∀x (x² < m)
Sol: The negation of ∃m ∀x (x² < m) is ∀m ∃x (x² ≥ m). The resulting statement is true, since
for every m we can choose some x with x² ≥ m (for example, x = |m| + 1).
First-Order logic:
o First-order logic is another way of knowledge representation in artificial
intelligence. It is an extension to propositional logic.
o FOL is sufficiently expressive to represent the natural language statements in a
concise way.
o First-order logic is also known as Predicate logic or First-order predicate
logic. First-order logic is a powerful language that expresses information
about objects in an easier way and can also express the relationships
between those objects.
o First-order logic (like natural language) does not only assume that the world
contains facts like propositional logic but also assumes the following things in
the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits,
wumpus, ......
o Relations: It can be a unary relation such as: red, round, is adjacent, or
an n-ary relation such as: the sister of, brother of, has color, comes
between
o Function: Father of, best friend, third inning of, end of, ......
o As a natural language, first-order logic also has two main parts:
a. Syntax
b. Semantics
Following are some of the basic elements of first-order logic:
Variables      x, y, z, a, b, ....
Connectives    ∧, ∨, ¬, ⇒, ⇔
Equality       ==
Quantifier     ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These
sentences are formed from a predicate symbol followed by a parenthesis with
a sequence of terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term
n).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using connectives.
Subject and Predicate:
Consider the statement "x is an integer."; it consists of two parts: the first part, x, is
the subject of the statement, and the second part, "is an integer," is known as the predicate.
Universal Quantifier:
Universal quantifier is a symbol of logical representation, which specifies that the
statement within its range is true for everything or every instance of a particular
thing.
The Universal quantifier is represented by a symbol ∀, which resembles an inverted
A.
o For all x
o For each x
o For every x.
Example:
All men drink coffee.
Let x be a variable which refers to a man, so all x can be represented in the UOD as below:
∀x man(x) → drink(x, coffee)
It will be read as: For all x, where x is a man, x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement
within its scope is true for at least one instance of something.
If x is a variable, then the existential quantifier will be ∃x or ∃(x), and it will be read as:
"There exists an x", "For some x", or "For at least one x".
Example:
Some boys are intelligent.
∃x: boys(x) ∧ intelligent(x)
It will be read as: There are some x where x is a boy who is intelligent.
Points to remember:
o The main connective for universal quantifier ∀ is implication →.
o The main connective for existential quantifier ∃ is and ∧.
Properties of Quantifiers:
o In universal quantifier, ∀x∀y is similar to ∀y∀x.
o In Existential quantifier, ∃x∃y is similar to ∃y∃x.
o ∃x∀y is not similar to ∀y∃x.
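For illustration, the last property can be checked on a small finite domain with the following Python sketch; the predicate p(x, y) meaning "x + y = 4" and the domain {1, 2, 3} are assumptions chosen for this sketch.

# A small sketch illustrating that the order of mixed quantifiers matters:
# with p(x, y) meaning "x + y = 4" over the domain {1, 2, 3},
# ∀y ∃x p(x, y) holds, but ∃x ∀y p(x, y) does not.

domain = [1, 2, 3]

def p(x, y):
    return x + y == 4

forall_y_exists_x = all(any(p(x, y) for x in domain) for y in domain)
exists_x_forall_y = any(all(p(x, y) for y in domain) for x in domain)

print(forall_y_exists_x)   # True  (for every y there is some x with x + y = 4)
print(exists_x_forall_y)   # False (no single x works for every y)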
Resolution in FOL
Resolution
Resolution is a theorem-proving technique that proceeds by building refutation proofs, i.e.,
proofs by contradiction. It was invented by the mathematician John Alan Robinson in 1965.
Resolution is used when various statements are given and we need to prove a conclusion
from those statements. Unification is a key concept in proofs by resolution. Resolution is a single
inference rule which can efficiently operate on the conjunctive normal form or clausal form.
Clause: A disjunction of literals (atomic sentences) is called a clause. A clause containing only
one literal is known as a unit clause.
Note: To better understand this topic, first learn FOL in AI.
This rule is also called the binary resolution rule because it only resolves exactly two literals.
Example:
We can resolve two clauses whose two complementary literals are Loves(f(x), x) and ¬Loves(a, b).
These literals can be unified with the unifier θ = [a/f(x), b/x], and resolving the clauses will
generate a resolvent clause.
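For illustration, a much simplified unification sketch in Python is given below for the two literals mentioned above; the term encoding is an assumption of this sketch, and refinements of full first-order unification such as the occurs-check are omitted.

# A simplified sketch of unification for the literals Loves(f(x), x) and
# Loves(a, b). Terms are nested tuples, variables are the names in VARS,
# and the occurs-check is omitted for brevity.

VARS = {"x", "a", "b"}

def substitute(term, theta):
    if isinstance(term, tuple):
        return tuple(substitute(t, theta) for t in term)
    return theta.get(term, term)

def unify(t1, t2, theta=None):
    theta = {} if theta is None else theta
    t1, t2 = substitute(t1, theta), substitute(t2, theta)
    if t1 == t2:
        return theta
    if isinstance(t2, str) and t2 in VARS:
        return {**theta, t2: t1}
    if isinstance(t1, str) and t1 in VARS:
        return {**theta, t1: t2}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for s1, s2 in zip(t1, t2):
            theta = unify(s1, s2, theta)
            if theta is None:
                return None
        return theta
    return None   # unification failure

loves1 = ("Loves", ("f", "x"), "x")   # Loves(f(x), x)
loves2 = ("Loves", "a", "b")          # Loves(a, b)

print(unify(loves1, loves2))          # {'a': ('f', 'x'), 'b': 'x'}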
The steps for resolution are:
1. Conversion of facts into first-order logic.
2. Conversion of FOL statements into CNF.
3. Negation of the statement which needs to be proved (proof by contradiction).
4. Drawing the resolution graph (resolution tree) using unification.
To better understand all the above steps, we will take an example in which we will apply
resolution.
Example:
In the first step, we will convert all the given statements into first-order logic.
In first-order logic resolution, it is required to convert the FOL statements into CNF, as the CNF
form makes resolution proofs easier.
Note: Statements "food(Apple) Λ food(vegetables)" and "eats (Anil, Peanuts) Λ alive(Anil)" can be
written in two separate statements.
In this step, we apply negation to the conclusion statement, which will be written as
¬likes(John, Peanuts).
Now in this step, we will solve the problem by resolution tree using substitution. For the above
problem, it will be given as follows:
Hence the negation of the conclusion has been proved as a complete contradiction with the
given set of statements.
Probability
We assign a probability measure P(A) to an event A. This is a value
between 0 and 1 that shows how likely the event is. If P(A) is close to 0, it is
very unlikely that event A occurs. On the other hand, if P(A) is close
to 1, A is very likely to occur. The main subject of probability theory is to
develop tools and techniques to calculate probabilities of different events.
Probability theory is based on some axioms that act as the foundation for
the theory, so let us state and explain these axioms.
Probability: Probability can be defined as the chance that an uncertain event will occur. It
is the numerical measure of the likelihood that an event will occur. The value of
probability always remains between 0 and 1, the two ideal certainties (impossible and certain).
We can find the probability of an uncertain event by using the below formula:
P(A) = (number of outcomes favourable to A) / (total number of possible outcomes)
Sample space: The collection of all possible outcomes is called sample space.
Random variables: Random variables are used to represent the events in the real
world.
The first one is that the probability of an event is always between 0 and 1: a probability of 1
indicates that some outcome of the event is certain to occur, and a probability of 0 indicates
that no outcome of the event is possible.
The second one is that the probability of the entire sample space (the event containing all
possible outcomes) is 1.
And the third one is that the probability of the event containing any possible outcome of two
mutually disjoint events is the sum of their individual probabilities.
1. Probability of Event
The first axiom of probability is that the probability of any event is between 0 and 1.
As we know, the formula of probability is that we divide the number of outcomes in the
event by the total number of outcomes in the sample space.
And the event is a subset of the sample space, so the event cannot have more outcomes than
the sample space.
Clearly, this value is going to be between 0 and 1, since the denominator is always greater
than or equal to the numerator.
These mutually exclusive events mean that such events cannot occur together; in other
words, they don't have common outcomes, or we can say their intersection is empty (null). We can
also represent such events as follows:
A ∩ B = ∅
This means that the intersection is empty, or that they do not have any common outcome. For
example, if
Event A is getting a number greater than 4 after rolling a die, the possible outcomes
would be 5 and 6.
Event B is getting a number less than 3 on rolling a die; here the possible outcomes
would be 1 and 2.
Clearly, both these events cannot have any common outcome. An interesting thing to note
here is that events A and B are not complements of each other, yet they are mutually exclusive.
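For illustration, the die example can be checked with a short Python sketch, confirming that the two events are disjoint and that the probability of their union equals the sum of their probabilities.

# A quick sketch of the die example above: events A (number > 4) and
# B (number < 3) are mutually exclusive, so P(A ∪ B) = P(A) + P(B).

from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {x for x in sample_space if x > 4}      # {5, 6}
B = {x for x in sample_space if x < 3}      # {1, 2}

def prob(event):
    return Fraction(len(event), len(sample_space))

print(A & B == set())                       # True (disjoint: empty intersection)
print(prob(A | B) == prob(A) + prob(B))     # True (third axiom of probability)
print(prob(A), prob(B), prob(A | B))        # 1/3 1/3 2/3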
Probabilistic reasoning in Artificial intelligence
Uncertainty:
Till now, we have learned knowledge representation using first-order logic and
propositional logic with certainty, which means we were sure about the predicates. With
this knowledge representation, we might write A→B, which means if A is true then B is
true. But consider a situation where we are not sure whether A is true or not; then
we cannot express this statement, and this situation is called uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we
need uncertain reasoning or probabilistic reasoning.
Following are some leading causes of uncertainty in the real world: information from
unreliable sources, experimental errors, equipment faults, and unpredictable environmental variation.
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the
concept of probability to indicate the uncertainty in knowledge. In probabilistic
reasoning, we combine probability theory with logic to handle the uncertainty.
In the real world, there are lots of scenarios, where the certainty of something is not
confirmed, such as "It will rain today," "behavior of someone for some situations," "A
match between two teams or two players." These are probable sentences for which
we can assume that they will happen, but we are not sure about them, so here we use
probabilistic reasoning.
Conditional probability:
Conditional probability is the probability of an event occurring when another event has
already happened.
Let's suppose we want to calculate the probability of event A when event B has already
occurred, "the probability of A under the conditions of B"; it can be written as:
P(A|B) = P(A⋀B) / P(B)
Where P(A⋀B) = joint probability of A and B
P(B) = probability of B.
Example:
In a class, 70% of the students like English and 40% of the students like both English and
mathematics. What is the percentage of students who like English that also like mathematics?
Solution:
Let A be the event that a student likes mathematics and B be the event that a student likes
English. Then
P(A|B) = P(A⋀B) / P(B) = 0.4 / 0.7 = 0.57 (approximately)
Hence, 57% of the students who like English also like mathematics.
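For illustration, the calculation can be reproduced with a short Python sketch using the conditional probability formula.

# A quick sketch of the classroom example above using P(A|B) = P(A ∧ B) / P(B).

p_english = 0.70            # P(B): student likes English
p_english_and_math = 0.40   # P(A ∧ B): student likes both

p_math_given_english = p_english_and_math / p_english
print(round(p_math_given_english * 100))   # 57 (percent)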
Bayes' theorem:
In probability theory, Bayes' theorem relates the conditional probability and the marginal
probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian
inference is an application of Bayes' theorem, which is fundamental to Bayesian statistics.
Bayes' theorem allows updating the probability prediction of an event by observing new
information of the real world.
Example: If cancer corresponds to one's age then by using Bayes' theorem, we can determine
the probability of cancer more accurately with the help of age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A with
known event B:
As from the product rule, we can write: P(A⋀B) = P(A|B) P(B), and
similarly, the probability of event B with known event A: P(A⋀B) = P(B|A) P(A).
Equating the right-hand sides of the above two equations, we get:
P(A|B) = P(B|A) P(A) / P(B) ........ (a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of
most modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate; it is read as the probability of
hypothesis A when evidence B has occurred.
P(B|A) is called the likelihood, in which we consider that the hypothesis is true and then calculate the
probability of the evidence.
P(A) is called the prior probability, the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability, the probability of the evidence.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai); hence Bayes' rule can also be
written as:
P(Ai|B) = P(B|Ai) P(Ai) / Σk P(B|Ak) P(Ak)
Where A1, A2, A3, ........, An is a set of mutually exclusive and exhaustive events.
Example-1:
Question: What is the probability that a patient has the disease meningitis given a stiff neck?
Given Data:
A doctor is aware that the disease meningitis causes a patient to have a stiff neck 80%
of the time. He is also aware of some more facts, which are given as follows:
o The known probability that a patient has meningitis is 1/30000.
o The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the patient has
meningitis, so we can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 = 1/750 ≈ 0.0013
Hence, we can assume that 1 patient out of 750 patients has meningitis disease with a stiff neck.
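For illustration, the calculation can be verified with a short Python sketch of Bayes' rule.

# A quick sketch verifying the meningitis example with Bayes' rule:
# P(b|a) = P(a|b) * P(b) / P(a).

p_stiff_given_meningitis = 0.8        # P(a|b)
p_meningitis = 1 / 30000              # P(b)
p_stiff_neck = 0.02                   # P(a)

p_meningitis_given_stiff = p_stiff_given_meningitis * p_meningitis / p_stiff_neck
print(p_meningitis_given_stiff)              # about 0.00133, i.e. roughly 1 in 750
print(round(1 / p_meningitis_given_stiff))   # 750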
Example-2:
Question: From a standard deck of playing cards, a single card is drawn. The probability
that the card is a king is 4/52. Calculate the posterior probability P(King|Face), which
means the probability that the drawn face card is a king.
Solution:
P(King|Face) = P(Face|King) P(King) / P(Face) ........ (i)
P(King): probability that the card is a king = 4/52 = 1/13
P(Face): probability that the card is a face card = 12/52 = 3/13 (the face cards are the jacks,
queens, and kings)
P(Face|King): probability of a face card given that the card is a king = 1
Putting all values into equation (i): P(King|Face) = (1 × 1/13) / (3/13) = 1/3, which is the
probability that a drawn face card is a king.
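For illustration, here is a short Python sketch of this calculation using exact fractions; the values P(Face|King) = 1 and P(Face) = 12/52 follow from a standard 52-card deck.

# A quick sketch of the playing-card example: P(King|Face) by Bayes' rule.

from fractions import Fraction

p_king = Fraction(4, 52)          # P(King)
p_face = Fraction(12, 52)         # P(Face): jacks, queens and kings
p_face_given_king = Fraction(1)   # P(Face|King): every king is a face card

p_king_given_face = p_face_given_king * p_king / p_face
print(p_king_given_face)          # 1/3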
Application of Bayes' theorem in Artificial intelligence:
o It is used to calculate the next step of the robot when the already executed step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
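For illustration, a small simulation sketch of the Monty Hall problem (standard three-door setup assumed) shows that switching wins about 2/3 of the time, which matches the answer obtained with Bayes' theorem.

# A small Monty Hall simulation: the host always opens a door that is neither
# the contestant's pick nor the car, and the contestant may switch.

import random

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        doors = [0, 1, 2]
        car = random.choice(doors)
        pick = random.choice(doors)
        # Host opens a door that is neither the pick nor the car.
        host = random.choice([d for d in doors if d != pick and d != car])
        if switch:
            pick = next(d for d in doors if d != pick and d != host)
        wins += (pick == car)
    return wins / trials

print(round(play(switch=False), 2))   # about 0.33
print(round(play(switch=True), 2))    # about 0.67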