Topic 3 Dealing With Uncertainty Slides
Uncertainty
Imprecise language
• Our natural language is ambiguous and
imprecise.
• We describe facts with such terms as often and
sometimes, frequently and hardly ever.
• As a result, it can be difficult to express
knowledge in the precise IF-THEN form of
production rules.
Types of uncertain knowledge
Imprecise language
• However, if the meaning of the facts is
quantified, it can be used in expert systems.
• In 1944, Ray Simpson asked 355 high school and
college students to place 20 terms like often on
a scale between 1 and 100.
• In 1968, Milton Hakel repeated this experiment.
Quantification of ambiguous and imprecise terms on a time-frequency scale

Ray Simpson (1944)                      Milton Hakel (1968)
Term                      Mean value    Term                      Mean value
Always                    99            Always                    100
Very often                88            Very often                87
Usually                   85            Usually                   79
Often                     78            Often                     74
Generally                 78            Rather often              74
Frequently                73            Frequently                72
Rather often              65            Generally                 72
About as often as not     50            About as often as not     50
Now and then              20            Now and then              34
Sometimes                 20            Sometimes                 29
Occasionally              20            Occasionally              28
Once in a while           15            Once in a while           22
Not often                 13            Not often                 16
Usually not               10            Usually not               16
Seldom                    10            Seldom                    9
Hardly ever               7             Hardly ever               8
Very seldom               6             Very seldom               7
Rarely                    5             Rarely                    5
Almost never              3             Almost never              2
Never                     0             Never                     0
Unknown data
p(A|B)
Note:
1. The probability that event A will occur if event B occurs is called the conditional probability.
2. The vertical bar represents GIVEN.
3. The expression p(A|B), the mathematical expression of conditional probability, is read as “the conditional probability of event A occurring given that event B has occurred”.
Joint probability of A and B
p(A∩B)
• The number of times A and B can occur, or the probability that both A and B will occur, is called the joint probability of A and B. The intersection symbol ∩ represents AND.
• The number of ways B can occur is the probability of B, p(B), and thus
$$p(A|B) = \frac{p(A \cap B)}{p(B)} = \frac{p(B|A)\, p(A)}{p(B)}$$
where the vertical bar represents GIVEN.
Bayes’ rule
$$p(A|B) = \frac{p(B|A)\, p(A)}{p(B)}$$
where:
• p(A|B) is the conditional probability that event A
occurs given that event B has occurred;
• p(B|A) is the conditional probability of event B
occurring given that event A has occurred;
• p(A) is the probability of event A occurring;
• p(B) is the probability of event B occurring.
The joint probability
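• If the occurrence of event A depends on only two mutually exclusive events, B and NOT B, we obtain:
$$p(A) = p(A|B)\, p(B) + p(A|\neg B)\, p(\neg B)$$
• Similarly,
$$p(B) = p(B|A)\, p(A) + p(B|\neg A)\, p(\neg A)$$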
An expert determines :
❖ the prior probabilities for possible hypotheses
p(H) and p(¬H), and
❖ the conditional probabilities for observing
evidence E :
✓ if hypothesis H is true, p(E|H), and
✓ if hypothesis H is false, p(E|¬H).
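A minimal sketch of how the expert's numbers turn into a posterior: Bayes' rule with the evidence probability expanded over the two mutually exclusive cases H and ¬H. The function name and the sample numbers are illustrative assumptions, not values from the slides.

```python
def posterior(p_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return p(H|E) from the prior p(H) and the conditionals p(E|H), p(E|not H)."""
    p_not_h = 1.0 - p_h
    # Total probability of the evidence over the two mutually exclusive cases.
    p_e = p_e_given_h * p_h + p_e_given_not_h * p_not_h
    return p_e_given_h * p_h / p_e


# Illustrative numbers only (assumed, not taken from the slides).
p = posterior(p_h=0.3, p_e_given_h=0.8, p_e_given_not_h=0.2)
print(round(p, 3))
```

Evidence that is more likely under H than under ¬H raises the posterior above the prior; here the prior of 0.3 roughly doubles.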
Bayesian reasoning
$$p(H_i \mid E_1 E_3) = \frac{p(E_1 \mid H_i)\, p(E_3 \mid H_i)\, p(H_i)}{\sum_{k=1}^{3} p(E_1 \mid H_k)\, p(E_3 \mid H_k)\, p(H_k)}, \quad i = 1, 2, 3$$
Hence,
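$$p(H_1 \mid E_1 E_3) = \frac{0.3 \times 0.6 \times 0.40}{0.3 \times 0.6 \times 0.40 + 0.8 \times 0.7 \times 0.35 + 0.5 \times 0.9 \times 0.25} = 0.19$$
$$p(H_2 \mid E_1 E_3) = \frac{0.8 \times 0.7 \times 0.35}{0.3 \times 0.6 \times 0.40 + 0.8 \times 0.7 \times 0.35 + 0.5 \times 0.9 \times 0.25} = 0.52$$
$$p(H_3 \mid E_1 E_3) = \frac{0.5 \times 0.9 \times 0.25}{0.3 \times 0.6 \times 0.40 + 0.8 \times 0.7 \times 0.35 + 0.5 \times 0.9 \times 0.25} = 0.30$$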
Hypothesis H2 has now become the most likely one.
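The computation above can be checked with a short sketch. It uses the priors 0.40, 0.35, 0.25 and the conditional probabilities that appear in the worked fractions, and, like the formula, it treats E1 and E3 as conditionally independent given each hypothesis. The function name `posteriors` is an illustrative assumption.

```python
def posteriors(priors, *likelihoods):
    """Normalize prior * product of likelihoods over all hypotheses."""
    joint = []
    for i, prior in enumerate(priors):
        term = prior
        for likelihood in likelihoods:
            term *= likelihood[i]
        joint.append(term)
    total = sum(joint)
    return [term / total for term in joint]


priors = [0.40, 0.35, 0.25]   # p(H1), p(H2), p(H3)
p_e1 = [0.3, 0.8, 0.5]        # p(E1|Hi)
p_e3 = [0.6, 0.7, 0.9]        # p(E3|Hi)

post = posteriors(priors, p_e1, p_e3)
print([round(p, 3) for p in post])
```

H2 has the largest posterior even though H1 had the largest prior.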
Probability    Hypothesis
               i = 1    i = 2    i = 3
p(Hi)          0.40     0.35     0.25
After observing evidence E2, the final posterior
$$p(A|B) = \frac{p(B|A)\, p(A)}{p(B)}$$
Application in medical diagnosis (Example 2)
Data
– 99% of the time that you have the plague, you also have a high temperature. P(HT|P) = 0.99
– At any one time, 1 in every 1,000,000,000 people has the plague. P(P) = 1/1,000,000,000 = 0.000000001
– 1 in every 1000 people has a high temperature. P(HT) = 1/1000 = 0.001
• What is the relative likelihood of cold and plague? P(C|HT) / P(P|HT) = ?
Application in medical diagnosis (Example 2, continued)
• Medical diagnosis
– P(HT|C) = 0.8
– Probability of cold, P(C) = 0.0001
– Probability of plague, P(P) is 0.000000001
– 99% of the time you have a plague, you also
have a high temperature. P(HT|P) = 0.99
• Relative likelihood of cold and plague
$$P(C \mid HT) = \frac{P(HT \mid C)\, P(C)}{P(HT)}, \qquad P(P \mid HT) = \frac{P(HT \mid P)\, P(P)}{P(HT)}$$
$$\frac{P(C \mid HT)}{P(P \mid HT)} = \frac{P(HT \mid C)\, P(C)}{P(HT \mid P)\, P(P)}$$
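Plugging the data into the ratio gives a concrete answer. The sketch below uses only the four values listed on this slide; P(HT) cancels, so it is not needed. Variable names are illustrative.

```python
p_ht_given_cold = 0.8        # P(HT|C)
p_cold = 0.0001              # P(C)
p_ht_given_plague = 0.99     # P(HT|P)
p_plague = 0.000000001       # P(P)

# P(HT) appears in both posteriors and cancels in the ratio.
ratio = (p_ht_given_cold * p_cold) / (p_ht_given_plague * p_plague)
print(f"A cold is about {ratio:,.0f} times more likely than the plague")
```

With these numbers, a high temperature points to a cold roughly eighty thousand times more strongly than to the plague.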
Term              Certainty factor
Definitely not    −1.0

IF <evidence>
THEN <hypothesis> {cf}
where cf is the belief in hypothesis H given that evidence E has occurred.
Certainty factors theory
MB(H, E): measure of belief
MD(H, E): measure of disbelief
$$cf = \frac{MB(H, E) - MD(H, E)}{1 - \min[MB(H, E),\, MD(H, E)]}$$
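As a small sketch, the certainty factor can be computed directly from the two measures, cf = (MB − MD) / (1 − min[MB, MD]). The function name and the sample values are illustrative assumptions.

```python
def certainty_factor(mb: float, md: float) -> float:
    """Certainty factor from a measure of belief and a measure of disbelief."""
    denominator = 1.0 - min(mb, md)
    if denominator == 0.0:
        # Happens only when both measures equal 1, which is contradictory.
        raise ValueError("MB and MD cannot both be 1")
    return (mb - md) / denominator


# Illustrative values (assumed): strong belief, weak disbelief.
cf = certainty_factor(mb=0.8, md=0.2)
print(round(cf, 2))
```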
Certainty factors theory - Example
• Consider a simple rule:
IF A is X
THEN B is Y
• An expert may not be absolutely certain that this
rule holds.
• Also suppose it has been observed that in some
cases, even when the IF part of the rule is
satisfied and object A takes on value X, object B
can acquire some different value Z.
IF A is X
THEN B is Y {cf 0.7};
B is Z {cf 0.2}
Certainty factor
• The certainty factor assigned by a rule is
propagated through the reasoning chain.
• This involves establishing the net certainty of
the rule consequent when the evidence in the
rule antecedent is uncertain:
cf(H, e) = cf(E, e) × cf(H, E)
Notes:
• cf(H, e): certainty factor of hypothesis H as influenced by evidence e
• cf(E, e): certainty factor of evidence E as influenced by evidence e
• cf(H, E): certainty factor of hypothesis H assuming the evidence E is certain, i.e. when cf(E, e) = 1
Certainty factor - Example
IF sky is clear
THEN the forecast is sunny {cf 0.8}
and the current certainty factor of ‘sky is clear’ is 0.5, then
cf(H, e) = 0.5 × 0.8 = 0.4
This result can be interpreted as “It may be
sunny”.
For conjunctive rules such as
IF <evidence E1>
………
AND <evidence En>
THEN <hypothesis H> {cf}
the net certainty of the consequent is
cf(H, E1∩…∩En) = min[cf(E1, e), …, cf(En, e)] × cf
For example,
IF sky is clear
AND the forecast is sunny
THEN the action is ‘wear sunglasses’ {cf 0.8}
and the certainty of ‘sky is clear’ is 0.9 and the certainty of ‘the forecast is sunny’ is 0.7,
then cf(H, E1∩E2) = min[0.9, 0.7] × 0.8 = 0.7 × 0.8 = 0.56
For disjunctive rules such as
IF <evidence E1>
………
OR <evidence En>
THEN <hypothesis H> {cf}
the net certainty of the consequent is
cf(H, E1∪…∪En) = max[cf(E1, e), …, cf(En, e)] × cf
For example,
IF sky is overcast
OR the forecast is rain
THEN the action is ‘take an umbrella’ {cf 0.9}
and the certainty of ‘sky is overcast’ is 0.6 and the certainty of ‘the forecast is rain’ is 0.8,
then cf(H, E1∪E2) = max[0.6, 0.8] × 0.9 = 0.8 × 0.9 = 0.72
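The three propagation cases (single evidence, AND, OR) can be sketched in a few lines and checked against the sunny, sunglasses and umbrella examples. Function names are illustrative assumptions.

```python
def cf_single(cf_evidence: float, cf_rule: float) -> float:
    """Net certainty of the consequent for a rule with one antecedent."""
    return cf_evidence * cf_rule


def cf_and(cf_evidences, cf_rule: float) -> float:
    """Conjunctive rule: the weakest piece of evidence limits the result."""
    return min(cf_evidences) * cf_rule


def cf_or(cf_evidences, cf_rule: float) -> float:
    """Disjunctive rule: the strongest piece of evidence drives the result."""
    return max(cf_evidences) * cf_rule


sunny = cf_single(0.5, 0.8)            # 'sky is clear' known with cf 0.5
sunglasses = cf_and([0.9, 0.7], 0.8)   # clear sky AND sunny forecast
umbrella = cf_or([0.6, 0.8], 0.9)      # overcast sky OR rain forecast
print(round(sunny, 2), round(sunglasses, 2), round(umbrella, 2))
```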
Combine Certainty factor
• When the same consequent is obtained as a
result of the execution of two or more rules,
the individual certainty factors of these rules
must be merged to give a combined certainty
factor for a hypothesis.
• Suppose the knowledge base consists of the
following rules:
Rule 1: IF A is X
        THEN C is Z {cf 0.8}
Rule 2: IF B is Y
        THEN C is Z {cf 0.6}
Combine Certainty factor
• If we have two pieces of evidence (A is X
and B is Y) from different sources (Rule 1
and Rule 2) supporting the same
hypothesis (C is Z),
• then the confidence in this hypothesis
should increase and become stronger
than if only one piece of evidence had
been obtained.
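Equation to calculate a combined certainty factor
$$cf(cf_1, cf_2) = \begin{cases} cf_1 + cf_2\,(1 - cf_1) & \text{if } cf_1 > 0 \text{ and } cf_2 > 0 \\ \dfrac{cf_1 + cf_2}{1 - \min[\,|cf_1|, |cf_2|\,]} & \text{if exactly one of } cf_1, cf_2 \text{ is negative} \\ cf_1 + cf_2\,(1 + cf_1) & \text{if } cf_1 < 0 \text{ and } cf_2 < 0 \end{cases}$$
where:
cf1 is the confidence in hypothesis H established by Rule 1;
cf2 is the confidence in hypothesis H established by Rule 2;
|cf1| and |cf2| are the absolute magnitudes of cf1 and cf2, respectively.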
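A sketch of the combination step, assuming the standard three-case formula from certainty factors theory: cf1 + cf2(1 − cf1) when both are positive, (cf1 + cf2) / (1 − min[|cf1|, |cf2|]) when the signs differ, and cf1 + cf2(1 + cf1) when both are negative. The function name is an illustrative assumption; the values 0.8 and 0.6 are the certainty factors of Rule 1 and Rule 2, taking the evidence of both rules as certain.

```python
def combine_cf(cf1: float, cf2: float) -> float:
    """Merge two certainty factors obtained for the same hypothesis."""
    if cf1 >= 0 and cf2 >= 0:
        # Two supporting rules reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two opposing rules reinforce the disbelief.
        return cf1 + cf2 * (1 + cf1)
    # Conflicting evidence; undefined when one value is +1 and the other -1.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))


both_support = combine_cf(0.8, 0.6)
conflict = combine_cf(0.8, -0.6)
both_oppose = combine_cf(-0.8, -0.6)
print(round(both_support, 2), round(conflict, 2), round(both_oppose, 2))
```

With both rules supporting C is Z, the combined value exceeds either individual certainty factor, which matches the expectation that confidence should increase.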
Certainty factors theory
• The certainty factors theory provides a
practical alternative to Bayesian reasoning.
• The heuristic manner of combining certainty
factors is different from the manner in which
they would be combined if they were
probabilities.
• The certainty theory is not “mathematically
pure” but does mimic the thinking process of
a human expert.
Comparison of Bayesian reasoning and
certainty factors
• Probability theory is the oldest and best-
established technique to deal with inexact
knowledge and random data.
• It works well in such areas as forecasting and
planning, where statistical data is usually
available and accurate probability statements
can be made.
Comparison of Bayesian reasoning and
certainty factors (continue)
i. Fact: x is A’
ii. Conclusion: y is B’
[Figure: membership function A’ on X (x is A’) and the resulting B’ on Y (y is B’).]
Fuzzy Reasoning
(Single rule with multiple antecedent)
i. Rule: if x is A and y is B then z is C
ii. Fact: x is A’ and y is B’
iii. Conclusion: z is C’
Graphic representation:
[Figure: A’ and A on X, B’ and B on Y; a T-norm gives the firing strength w, which is applied to C on Z to produce C’ (x is A’, y is B’, z is C’).]
Fuzzy Reasoning
• Multiple rules with multiple antecedent
i. Rule 1: if x is A1 and y is B1 then z is C1
ii. Rule 2: if x is A2 and y is B2 then z is C2
iii. Fact: x is A’ and y is B’
iv. Conclusion: z is C’
• Graphic Representation: (next slide)
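Multiple rules with multiple antecedent
Graphic representation:
[Figure: Rule 1 matches A’ against A1 on X and B’ against B1 on Y to give firing strength w1, applied to C1 on Z; Rule 2 matches A’ against A2 and B’ against B2 to give w2, applied to C2 on Z. Firing strengths use a T-norm; the result is C’ on Z (x is A’, y is B’, z is C’).]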
Membership Functions
• The simplest membership functions are formed using
straight lines.
• The simplest is the triangular membership function,
and it has the function name trimf.
• The trapezoidal membership function, trapmf, has a
flat top and really is just a truncated triangle curve.
• These straight line membership functions have the
advantage of simplicity.
[Figure: membership function plots over Temp. (°F), horizontal axis from 10 to 110, with membership values 0.3 and 0.7 marked.]
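A minimal pure-Python sketch of the two straight-line shapes. The functions borrow the names trimf and trapmf for readability, but the scalar signatures used here are an assumption for illustration and not the toolbox's own interface.

```python
def trimf(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: feet at a and c, apex at b (a <= b <= c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


def trapmf(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership: feet at a and d, flat top between b and c."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge


print(trimf(30, 10, 50, 90), trapmf(20, 10, 30, 50, 70))
```

Checking the apex first lets the same function handle shoulder-shaped sets where the apex coincides with one foot, for example feet at 10 and 25 with the apex at 10.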
Membership Functions (Gaussian distribution curve)
• Two membership functions are built on the Gaussian
distribution curve:
• a simple Gaussian curve and a two-sided composite of
two different Gaussian curves.
• The two functions are gaussmf and gauss2mf.
• The generalized bell membership has the function name
gbellmf.
• Because of their smoothness and concise notation,
Gaussian and bell membership functions are popular
methods for specifying fuzzy sets. Both of these curves
have the advantage of being smooth and nonzero at all
points.
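A sketch of the simple Gaussian and the generalized bell curves, using the usual definitions exp(−(x − c)² / (2σ²)) and 1 / (1 + |(x − c)/a|^(2b)). The scalar signatures are assumptions for illustration; the two-sided gauss2mf is not covered here.

```python
import math


def gaussmf(x: float, sigma: float, c: float) -> float:
    """Gaussian membership centred at c with width sigma."""
    return math.exp(-((x - c) ** 2) / (2 * sigma ** 2))


def gbellmf(x: float, a: float, b: float, c: float) -> float:
    """Generalized bell centred at c; a sets the width, b the slope steepness."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


print(round(gaussmf(7, 2, 5), 4), gbellmf(7, 2, 4, 5))
```

For the bell curve, the membership is exactly 0.5 at a distance a from the centre, whatever the value of b.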
Membership Functions (Sigmoidal membership function)
• Gaussian membership functions and bell membership functions
are unable to specify asymmetric membership functions, which are
important in certain applications.
• A sigmoidal membership function is open either to the left or to the right.
• Asymmetric and closed (i.e. not open to the left or right)
membership functions can be synthesized using two sigmoidal
functions, so in addition to the basic sigmf, you also have the
difference between two sigmoidal functions, dsigmf, and the
product of two sigmoidal functions psigmf.
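A sketch of how two sigmoids yield a closed shape, assuming the usual definition 1 / (1 + exp(−a(x − c))). The scalar signatures and the sample parameters are assumptions for illustration.

```python
import math


def sigmf(x: float, a: float, c: float) -> float:
    """Sigmoid crossing 0.5 at c; opens right for a > 0, left for a < 0.

    Very large |a * (x - c)| can overflow math.exp in this simple sketch.
    """
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def dsigmf(x: float, a1: float, c1: float, a2: float, c2: float) -> float:
    """Difference of two sigmoids: closed shape when both open the same way."""
    return sigmf(x, a1, c1) - sigmf(x, a2, c2)


def psigmf(x: float, a1: float, c1: float, a2: float, c2: float) -> float:
    """Product of two sigmoids: closed shape when they open opposite ways."""
    return sigmf(x, a1, c1) * sigmf(x, a2, c2)


# Both calls describe a plateau between roughly 2 and 7.
print(round(dsigmf(5, 5, 2, 5, 7), 3), round(psigmf(5, 5, 2, -5, 7), 3))
```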
Membership Functions (Polynomial based curves )
• Polynomial based curves account for several of the
membership functions in the toolbox.
• Three related membership functions are the Z, S, and Pi
curves, all named because of their shape.
• The function zmf is the asymmetrical polynomial curve
open to the left, smf is the mirror-image function that
opens to the right, and pimf is zero on both extremes
with a rise in the middle.
Rules As Knowledge
Rule-based reasoning
• One can often represent the expertise that
someone uses to do an expert task as rules.
• A rule means a structure which has an if
component and a then component.
The Edwin Smith papyrus
Reasoning with production rules
[Figure: inference engine connected to working memory and rule memory; arrows labelled observed data, select, fire, modify and output.]
Linguistic approximation
➢ Instead of the defuzzification module, a linguistic approximation module is needed to:
✓ find the linguistic term that is closest to the obtained fuzzy set;
✓ use a technique for measuring the distance between fuzzy sets.
Scheduler
➢Controls all the processes in the fuzzy expert system.
➢Determines the rules to be executed and sequence of
their executions.
➢Provides an explanation function for the result.
Knowledge base
➢ Holds the principal design parameters for a fuzzy logic controller.
➢ Contains knowledge of the application domain and the control goals.
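Fuzzy expert systems
[Figure: block diagram of a fuzzy expert system. Components: knowledge base, fuzzification interface, scheduler, inference engine and linguistic approximation. User input enters at the fuzzification interface; the output leaves from linguistic approximation.]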
Fuzzy logic controller
[Figure: fuzzy partitions of the normalized range −1 to +1.]
Labels: N: negative, Z: zero, P: positive; NM: negative medium; NS: negative small; ZE: zero.
[Figure, left: a fuzzy partition in a 2-dimensional input space, with x1 partitioned into NB, NS, ZO, PS and PB; the maximum number of control rules = 20 (5 × 4). Right: a fuzzy partition having three rules, with the inputs partitioned into small and big.]
Rule base
Source of fuzzy control rules
Derivation of fuzzy control rules
a) Heuristic method
i. Rules are formed by analyzing the behavior
of a controlled process.
ii. The derivation relies on the qualitative
knowledge of process behavior.
b) Deterministic method
i. Can systematically determine the linguistic
structure of rules.
Rule base
4 modes of derivation of fuzzy control rules
➢Expert experience and control engineering knowledge:
operating manual and questionnaire
➢Based on operators’ control actions :
observation of human controller’s actions in terms of input-
output operating data
➢Based on the fuzzy model of a process :
linguistic description of the dynamic characteristics of a process
➢Based on learning :
ability to modify control rules such as self-organizing controller
Types of fuzzy control rules:
1. State evaluation fuzzy control rules
A collection of rules of the form (MISO version)
R1: if x is A1, … and y is B1 then z is C1
R2: if x is A2, … and y is B2 then z is C2
…
Rn: if x is An, … and y is Bn then z is Cn
where x, …, y and z are linguistic variables representing the process state variables and the control variable.
Ai, …, Bi and Ci are linguistic values of the variables x, …, y and z in the universes of discourse U, …, V and W, respectively, for i = 1, 2, …, n.
That is,
x ∈ U, Ai ⊂ U, …, y ∈ V, Bi ⊂ V, z ∈ W, Ci ⊂ W
Types of fuzzy control rules:
2. Object evaluation fuzzy control rules
To predict present and future control actions, and evaluate
control objectives (predictive fuzzy control)
Typical rules
R1: if (z is C1 → (x is A1 and y is B1)) then z is C1.
…
Rn: if (z is Cn → (x is An and y is Bn)) then z is Cn.
• Control action is determined by an objective evaluation that
satisfies the desired states and objectives.
• x and y are performance indices for the evaluation and z is
control command.
• Ai and Bi are fuzzy values such as NM and PS.
• The most likely control rule is selected through predicting the
results (x, y) corresponding to every control command Ci, i = 1,
2, … , n.
Decision making logic (Inference)
Mamdani method
➢ minimum operator for a fuzzy implication
➢ max-min operator for the composition
Decision making logic (Inference)
Larsen method
➢ product operator (•) for a fuzzy implication
➢ max-product operator for the composition
Note: the product operator multiplies the terms of a sequence or partial sequence.
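The difference between the two implication operators is easiest to see on a sampled consequent set: the minimum operator clips the set at the firing strength, while the product operator scales it. The sketch below assumes a discretized membership list and a firing strength w already computed from the antecedents; function names and sample values are illustrative.

```python
def mamdani_implication(w: float, mu_c):
    """Minimum operator: clip the consequent set at firing strength w."""
    return [min(w, m) for m in mu_c]


def larsen_implication(w: float, mu_c):
    """Product operator: scale the consequent set by firing strength w."""
    return [w * m for m in mu_c]


def aggregate(*rule_outputs):
    """Max composition: pointwise maximum over the outputs of all rules."""
    return [max(values) for values in zip(*rule_outputs)]


mu_c = [0.0, 0.5, 1.0, 0.5, 0.0]   # sampled triangular consequent (assumed)
w = 0.6                            # firing strength of the rule (assumed)

clipped = mamdani_implication(w, mu_c)
scaled = larsen_implication(w, mu_c)
print(clipped, scaled)
```

Clipping flattens the top of the set, whereas scaling keeps its shape; both results share the same peak height w.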
Decision making logic (Inference)
Tsukamoto method
Consequent part: fuzzy set with a monotonic membership function
Decision making logic (Inference)
TSK method
Consequent part: function of the input variables
Defuzzification
• Defuzzification is the fuzzy-to-crisp conversion.
• Fuzzification: a mapping that converts crisp values into fuzzy values.
• Defuzzification: a mapping that converts fuzzy results into crisp results.
• This process generates a non-fuzzy control action.
Mean of maximum method (MOM)
The MOM strategy generates a control action which
represents the mean value of all control actions, whose
membership functions reach the maximum.
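$$z_0 = \sum_{j=1}^{k} \frac{z_j}{k}$$
where z_j are the control actions whose membership reaches the maximum and k is the number of such actions.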
Center of area method (COA)
COA generates the center of gravity of the possibility
distribution of a fuzzy set C .
$$z_0 = \frac{\sum_{j=1}^{n} \mu_C(z_j)\, z_j}{\sum_{j=1}^{n} \mu_C(z_j)}$$
Bisector of area (BOA)
The BOA generates the action (z0) which partitions the
area into two regions with the same area .
$$\int_{\alpha}^{z_0} \mu_C(z)\, dz = \int_{z_0}^{\beta} \mu_C(z)\, dz$$
where α = min{z | z ∈ W} and β = max{z | z ∈ W}.
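The three defuzzification strategies can be sketched on a discretized output set, given as parallel lists of domain samples z and memberships mu. The bisector here is a discrete approximation (the first sample at which the running sum reaches half of the total), which is an assumption of this sketch rather than the integral form. Function names and sample data are illustrative.

```python
def mean_of_maximum(z, mu):
    """MOM: mean of the samples whose membership equals the maximum."""
    peak = max(mu)
    maxima = [zj for zj, m in zip(z, mu) if m == peak]
    return sum(maxima) / len(maxima)


def center_of_area(z, mu):
    """COA: membership-weighted mean of the samples."""
    total = sum(mu)
    if total == 0:
        raise ValueError("empty fuzzy set")
    return sum(m * zj for zj, m in zip(z, mu)) / total


def bisector_of_area(z, mu):
    """BOA (discrete): first sample where the running sum reaches half the total."""
    total = sum(mu)
    if total == 0:
        raise ValueError("empty fuzzy set")
    running = 0.0
    for zj, m in zip(z, mu):
        running += m
        if running >= total / 2:
            return zj


z = [0, 1, 2, 3, 4, 5, 6]
mu = [0.0, 0.5, 1.0, 1.0, 0.5, 0.5, 0.0]   # assumed sample output set

print(mean_of_maximum(z, mu), round(center_of_area(z, mu), 3), bisector_of_area(z, mu))
```

On this skewed set the three methods disagree: MOM looks only at the plateau, while COA and BOA are pulled toward the longer right tail.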
Lookup table
FuzzySet init
Fuzzy Set class
• In the context of a fuzzy variable, all the sets
will have the same minimum, maximum and
resolution values.
• As we are dealing with a discretized domain, it
will be necessary to adjust any value used to set
or retrieve the degree-of-membership to the
closest value in the domain array.
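A minimal sketch of the snapping step described above. The class below is a hypothetical stand-in and not the FuzzySet class referred to on the slides; its name, constructor arguments and indexing interface are all assumptions for illustration.

```python
class DiscreteFuzzySet:
    """Hypothetical fuzzy set stored over an evenly spaced, discretized domain."""

    def __init__(self, name, minimum, maximum, resolution):
        # resolution is the number of domain points and must be at least 2.
        self.name = name
        self.minimum = minimum
        self.maximum = maximum
        self.step = (maximum - minimum) / (resolution - 1)
        self.domain = [minimum + i * self.step for i in range(resolution)]
        self.dom = [0.0] * resolution   # degree of membership per domain point

    def _closest_index(self, x):
        # Clamp to the domain, then snap to the nearest domain point.
        x = min(max(x, self.minimum), self.maximum)
        return round((x - self.minimum) / self.step)

    def __setitem__(self, x, degree):
        self.dom[self._closest_index(x)] = degree

    def __getitem__(self, x):
        return self.dom[self._closest_index(x)]


cold = DiscreteFuzzySet("Cold", 10, 40, 31)   # one domain point per degree
cold[17.4] = 0.5                              # stored at the domain point 17
print(cold[17], cold[16.6], cold[18])
```

Any value between domain points reads and writes the nearest stored point, so lookups never fall outside the array.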
Rule execution
Fuzzy System Class
(Bringing it all together)
• At the topmost level of this architecture, we have
the FuzzySystem that coordinates all activities
between the FuzzyVariables and FuzzyRules.
• Hence the system contains the input and output
variables, that are stored in python dictionaries
using variable-names as keys and a list of the
rules.
• One of the challenges at this stage is the method
that the end-user will use to add rules, that
should ideally abstract the implementation detail
of the FuzzyClause classes.
Fuzzy System Class
(Bringing it all together)
• The method that was implemented consists of
providing two python dictionaries that will
contain the antecedent and consequent
clauses of the rule in the following format;
variable name : set name
• A more user-friendly method is to provide the
rule as a string and then parse that string to
create the rule.
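One way the string-based approach could look is sketched below: a rule string is split into two dictionaries in the variable name : set name format. The function `parse_rule` is hypothetical and not part of the implementation described on the slides; it handles only clauses joined by 'and' and written with a lowercase ' is '. The set name 'Wet' is an assumed example.

```python
import re


def parse_rule(text):
    """Split 'IF a is X and b is Y THEN c is Z' into (antecedent, consequent) dicts."""
    match = re.fullmatch(r"\s*if\s+(.+?)\s+then\s+(.+?)\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not an if-then rule: {text!r}")

    def clauses(part):
        result = {}
        for clause in re.split(r"\s+and\s+", part, flags=re.IGNORECASE):
            variable, separator, set_name = clause.partition(" is ")
            if not separator:
                raise ValueError(f"malformed clause: {clause!r}")
            result[variable.strip()] = set_name.strip()
        return result

    return clauses(match.group(1)), clauses(match.group(2))


antecedent, consequent = parse_rule(
    "IF Temperature is Cold and Humidity is Wet THEN Speed is Slow"
)
print(antecedent, consequent)
```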
Fuzzy System Class
Addition of a new rule to the
FuzzySystem
The execution of the inference process can be achieved with the following steps:
1. The output distribution sets of all the output variables are
cleared.
2. The input values to the system are passed to the
corresponding input variables (each set in the variable can
determine its degree-of-membership for that input value)
3. Execution of the Fuzzy Rules takes place (the output
distribution sets of all the output variables will now
contain the union of the contributions from each rule)
4. The output distribution sets are defuzzified using a
centre-of-gravity defuzzifier to obtain the crisp result.
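The four steps can be sketched end to end with plain functions and dictionaries. This is a simplified stand-in, not the class-based implementation described on the slides: the rule format, the min operator for 'and', the clipping of consequents, and every set except 'Cold', 'Slow' and 'Fast' are assumptions made for illustration.

```python
def triangular(a, b, c):
    """Return a triangular membership function with feet a, c and apex b."""
    def mu(x):
        if x == b:
            return 1.0
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x < b else (c - x) / (c - b)
    return mu


input_sets = {
    "Temperature": {"Cold": triangular(10, 10, 25), "Hot": triangular(25, 40, 40)},
    "Humidity": {"Dry": triangular(20, 20, 60), "Wet": triangular(60, 100, 100)},
}
output_sets = {
    "Speed": {"Slow": triangular(0, 0, 50), "Fast": triangular(50, 100, 100)},
}
output_domain = [float(i) for i in range(101)]   # discretized output axis

rules = [   # (antecedent clauses, consequent clauses)
    ({"Temperature": "Cold", "Humidity": "Dry"}, {"Speed": "Slow"}),
    ({"Temperature": "Hot", "Humidity": "Wet"}, {"Speed": "Fast"}),
]


def evaluate(crisp_inputs):
    # Step 1: clear the output distribution of every output variable.
    distributions = {name: [0.0] * len(output_domain) for name in output_sets}
    # Step 2: pass the inputs in; each set reports its degree of membership.
    degrees = {
        var: {name: mu(crisp_inputs[var]) for name, mu in sets.items()}
        for var, sets in input_sets.items()
    }
    # Step 3: execute the rules and take the union (max) of their contributions.
    for antecedent, consequent in rules:
        strength = min(degrees[var][name] for var, name in antecedent.items())
        for var, name in consequent.items():
            mu = output_sets[var][name]
            dist = distributions[var]
            for i, z in enumerate(output_domain):
                dist[i] = max(dist[i], min(strength, mu(z)))
    # Step 4: centre-of-gravity defuzzification (None if no rule fired).
    result = {}
    for var, dist in distributions.items():
        total = sum(dist)
        result[var] = (
            sum(m * z for m, z in zip(dist, output_domain)) / total if total else None
        )
    return result


low = evaluate({"Temperature": 12, "Humidity": 30})
high = evaluate({"Temperature": 38, "Humidity": 90})
print(low, high)
```

A cold, dry input fires only the first rule and yields a speed in the lower part of the range; a hot, wet input does the opposite.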
The use of the fuzzy inference system
• Designing a fuzzy system begins with the choice of the input and output variables and the design of the fuzzy sets that describe each variable.
• Each variable requires a lower and an upper limit and, as we will be dealing with discrete fuzzy sets, the resolution of the system.
• A variable definition therefore looks as follows:
temp = FuzzyInputVariable('Temperature', 10, 40, 100)
• where the variable ‘Temperature’ ranges between
10 and 40 degrees and is discretized in 100 bins.
Fuzzy inference system
• The fuzzy sets defined for the variable require different parameters depending on their shape.
• In the case of triangular sets, for example, three
parameters are needed, two for the lower and upper
extremes having a degree of membership of 0 and
one for the apex which has a degree-of-membership
of 1.
• A triangular set definition for variable ‘Temperature’
can, therefore, look as follows;
temp.add_triangular('Cold', 10, 10, 25)
• where the set called ‘Cold’ has extremes at 10 and 25
and apex at 10 degrees.
Fuzzy inference system
• Suppose the system consists of two input variables, ‘Temperature’ and ‘Humidity’, and a single output variable, ‘Speed’. Each variable is described by three fuzzy sets.
• The definition of the output variable ‘Speed’
looks as follows:
motor_speed = FuzzyOutputVariable('Speed', 0, 100, 100)
motor_speed.add_triangular('Slow', 0, 0, 50)
motor_speed.add_triangular('Moderate', 10, 50, 90)
motor_speed.add_triangular('Fast', 50, 100, 100)
Fuzzy inference system
As we have seen before, the fuzzy system is the
entity that will contain these variables and fuzzy
rules. Hence the variables will have to be added
to a system as follows:
system = FuzzySystem()
system.add_input_variable(temp)
system.add_input_variable(humidity)
system.add_output_variable(motor_speed)
Fuzzy Rules
• A fuzzy system operates by executing fuzzy rules of the form
If x1 is S and x2 is M then y is S
• where the If part of the rule contains several antecedent clauses and the then part may include several consequent clauses.
• For simplification, we will assume rules that require
an antecedent clause from each input variable and
are only linked together with an ‘and’ statement.
• It is possible to have statements linked by ‘or’ and
statements can also contain operators on the sets like
‘not’.
Fuzzy Rules
• The simplest way to add a fuzzy rule to the
system is to provide a list of the antecedent
clauses and consequent clauses.
• One method of doing so is to use Python dictionaries whose entries have the form
Variable:Set
output = system.evaluate_output({
'Temperature':18,
'Humidity':60 })