Sample Solutions
5/16/01
1 Bayes’ Nets
I am a professor trying to predict the performance of my class on an exam.
After much thought, it is apparent that the students who do well are those
that studied and do not have a headache when they take the exam. My vast
medical knowledge leads me to believe that headaches can only be caused by
being tired or by having the flu. Studying, the flu, and being tired are pairwise
independent.
[Diagrams: three candidate network structures over F (flu), T (tired), S (studied), H (headache), and E (exam performance).]
b) The first model makes flu, tiredness, and studying only conditionally independent (the first two conditioned on H and the third conditioned on E). The second model has the right relations, but many unnecessary dependencies. If the situation is as described in the problem, it will produce the same joint distribution as the third model, because the extra dependencies will have no effect (for example, P(S|F, T) will be the same as P(S)).
c) If we assume that we need to hold only one value, P(X = true), to represent the a priori probability of a single variable, then we need 2^n entries in the conditional probability table for a node with n parents. So, the original network has a complexity of 1 + 1 + 1 + 4 + 4 = 11 and the new network has a complexity of 1 + 1 + 4 + 4 + 4 = 14. The size of the information necessary to store the network has therefore increased by about 27%, but we have gained only a little more accuracy.
[Network diagram over F, T, S, and H.]
d)
P(¬E|F) = Σ_{H,S} P(¬E|H, S, F) P(H, S|F)
        = Σ_{H,S} P(¬E|H, S) P(H, S|F)
        = Σ_{H,S} P(¬E|H, S) P(H|F, S) P(S|F)
        = Σ_{H,S} P(¬E|H, S) P(H|F, S) P(S)
        = Σ_{H,S} P(¬E|H, S) P(S) Σ_T P(H|F, T) P(T)
e)
P(S|E) = P(E|S) P(S) / P(E)

P(E|S) = Σ_H P(E|S, H) P(H)
       = Σ_H P(E|S, H) Σ_{F,T} P(H|F, T) P(F) P(T)

P(E) = Σ_{H,S} P(E|S, H) P(S) Σ_{F,T} P(H|F, T) P(F) P(T)
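As a sanity check on the factorization in (d), here is a small Python sketch that evaluates P(¬E|F) both by the final summation and by brute-force enumeration of the joint. The CPT numbers are made up for illustration; only the structure (F, T, S as roots; H depending on F and T; E depending on H and S) comes from the problem.

    from itertools import product

    # Made-up CPTs (placeholders): P(F), P(T), P(S), P(H|F,T), P(E|H,S)
    pF, pT, pS = 0.1, 0.3, 0.6
    pH = {(True, True): 0.9, (True, False): 0.7, (False, True): 0.6, (False, False): 0.05}
    pE = {(True, True): 0.3, (True, False): 0.1, (False, True): 0.9, (False, False): 0.5}

    def p(prob, value):               # P(X = value) from P(X = true)
        return prob if value else 1.0 - prob

    # Final line of the derivation: sum over H and S, with P(H|F) expanded over T.
    def p_notE_given_F(f=True):
        total = 0.0
        for h, s in product([True, False], repeat=2):
            pH_given_F = sum(p(pH[(f, t)], h) * p(pT, t) for t in [True, False])
            total += p(pE[(h, s)], False) * p(pS, s) * pH_given_F
        return total

    # Brute-force check: enumerate the full joint and condition on F.
    def brute_force(f=True):
        num = den = 0.0
        for t, s, h, e in product([True, False], repeat=4):
            joint = p(pF, f) * p(pT, t) * p(pS, s) * p(pH[(f, t)], h) * p(pE[(h, s)], e)
            den += joint
            if not e:
                num += joint
        return num / den

    print(p_notE_given_F(), brute_force())   # the two values should agree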
5. False.
6. False.
7. False.
8. True.
9. True.
[Decision tree: the talk branch has expected value -50; the quiet branch yields -100 if the partner talks (probability 0.40) and -10 if the partner keeps quiet (probability 0.60), for an expected value of -46.]
The decision tree shows that the best choice is to keep quiet: the expected value is higher than that for talking. If we want to find out how much trust in our partner is necessary to make keeping quiet the better choice, we need to find the value of P(p-quiet) for which the expected value of keeping quiet equals -50.
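Writing q = P(p-quiet) and using the payoffs from the tree (-10 if both keep quiet, -100 if we keep quiet and our partner talks):

-10q - 100(1 - q) = -50
90q = 50
q = 5/9 ≈ 0.56

So keeping quiet remains the better choice as long as we trust our partner to stay quiet with probability greater than about 0.56.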
4 Searching
a) A*-search would be a good strategy. Straight-line distance is known to be an admissible heuristic and should be a good approximation of the distance remaining to our goal from any point. And we know that A*-search will return the shortest path, allowing us to beat out our trading competitors. (A minimal sketch of the search appears after part (b).)
b) I can still use the straight-line distances. Heuristics are admissible so long
as they do not overestimate the distance to a goal.
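A minimal sketch of the A*-search from part (a), in Python, with a made-up city graph and straight-line-distance table; all names and numbers here are illustrative, not from the problem.

    import heapq

    def a_star(graph, h, start, goal):
        """graph: {city: [(neighbor, road_distance), ...]}, h: {city: straight_line_to_goal}."""
        frontier = [(h[start], 0, start, [start])]   # entries are (f = g + h, g, city, path)
        best_g = {start: 0}
        while frontier:
            f, g, city, path = heapq.heappop(frontier)
            if city == goal:
                return g, path
            for nxt, cost in graph.get(city, []):
                g2 = g + cost
                if g2 < best_g.get(nxt, float("inf")):
                    best_g[nxt] = g2
                    heapq.heappush(frontier, (g2 + h[nxt], g2, nxt, path + [nxt]))
        return None

    # Tiny made-up example: straight-line distances never exceed road distances,
    # so the heuristic is admissible and the returned path is shortest.
    graph = {"A": [("B", 4), ("C", 3)], "B": [("D", 5)], "C": [("D", 8)], "D": []}
    h = {"A": 7, "B": 5, "C": 7, "D": 0}
    print(a_star(graph, h, "A", "D"))   # (9, ['A', 'B', 'D'])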
5 Overfitting
In regression and in crafting generative models, we can make our model fit the training points too closely, which often means that we are modeling noise. Or we may start with a model that is more complex than necessary to fit the points. In either case, future data are likely to fit our model less well than they would have fit a simpler model of the training data. The standard approach is to add a penalty to the error criterion based on the complexity (irregularity) of the model. This makes the best-fit calculation balance the complexity of the model against the error it produces on the training data, and reduces the likelihood of overfitting.
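As a concrete illustration of the penalty idea, here is a small numpy sketch that fits a high-degree polynomial to noisy points with and without a ridge (squared-weight) penalty; all of the data and parameter values are made up.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 10)
    y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(10)   # noisy training data

    degree, lam = 9, 1e-3                      # 9th-degree polynomial, small penalty weight
    X = np.vander(x, degree + 1)               # design matrix of polynomial features

    # The unpenalized fit minimizes ||Xw - y||^2 ; ridge adds lam * ||w||^2 to the criterion.
    w_plain = np.linalg.lstsq(X, y, rcond=None)[0]
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ y)

    print(np.linalg.norm(w_plain), np.linalg.norm(w_ridge))   # the ridge weights are much smaller

The penalized fit trades a little training error for much smaller (less wiggly) coefficients, which is exactly the balance described above.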
In classification models, we can encounter a similar problem. The classifier
will separate the training data precisely, but the boundary may not generalize
well. We can resist overfitting by using simpler models, either with fewer features
or (in multi-layer nets) with fewer hidden nodes. Other possibilities are to use
support vector machines to find the maximal-margin separator, which should
be more resistant to error, or to add some noise to the training data to make
the learned classifier more robust.
6 Planning
There are no constraints on the order in which the actions must be executed. They are all in the same layer, which indicates that they can be performed in parallel.
8 More GraphPlan
Yes, this is an admissible heuristic because it never overestimates the distance to a solution (no solution can be nearer than the first layer in which all of the solution propositions are pairwise non-mutex).
9 Bayesian Networks
Which of the following conditional independence assumptions are true?
1. False.
2. True.
3. False.
4. True.
5. False.
6. False.
7. False.
8. False.
Neither network is equivalent to the original one. The second network can encode the same joint probability because its conditional independence relations are a subset of the original network's relations.
10 Automated Inference
The algorithm attempts to avoid calculating unnecessary information. It is easier
to understand what is going on if we look at the description and derivation,
rather than the pseudocode. α and β indicate normalizing factors that can be
computed with information from the known probability tables and the return
values of previously made recursive calls.
Now, we know every value in this equation except for P (L|C), so we make
a recursive call to calculate it.
Again, all of these values can be looked up in the network’s conditional
probability tables, except for P (L|D).
Finally, we have arrived at a point where every element of the equation can
be directly looked up in a known conditional probability table. Notice that we
avoided calculating information for the F-G-H chain, or for I. These help to make
our calculation more efficient than the brute-force method of reconstructing the
entire joint probability table.
11 Sampling
You would use sampling-based inference in a very large Bayesian network or one
with undirected cycles that make exact inference procedures infeasible. How-
ever, in cases where you wish to evaluate the probability of events with very
small probability, sampling is not an effective strategy.
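To make the idea concrete, here is a minimal rejection-sampling sketch on a made-up two-node network A → B; none of these numbers come from the exam.

    import random

    P_A = 0.3                       # P(A = true)
    P_B = {True: 0.9, False: 0.2}   # P(B = true | A)

    def estimate_P_A_given_B(n=100_000):
        """Estimate P(A | B = true) by sampling the network and rejecting samples with B = false."""
        accepted = hits = 0
        for _ in range(n):
            a = random.random() < P_A
            b = random.random() < P_B[a]
            if b:                    # keep only samples consistent with the evidence
                accepted += 1
                hits += a
        return hits / accepted

    exact = (0.9 * 0.3) / (0.9 * 0.3 + 0.2 * 0.7)    # Bayes' rule, about 0.66
    print(estimate_P_A_given_B(), exact)

The weakness mentioned above shows up directly: if the evidence B = true were itself very unlikely, almost every sample would be rejected and the estimate would rest on very few accepted samples.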
12 Medical Decisions
[Decision tree without the test: surgery costs -10 whether or not there is a bone chip (expected value -10), while doing nothing costs -100 with a chip (probability 0.30) and 0 without (probability 0.70), for an expected value of -30.]

[Decision tree with the test (which costs 2): after either test result, surgery costs -12 regardless of the chip, while doing nothing costs -102 with a chip and -2 without.]
It’s too hard to fit the math on the diagram, so here are the relevant prob-
abilities and expected values. Abbreviations are: p - positive test, n - negative
test, bc - bone chip, s - surgery.
P(p|bc) = x          P(n|bc) = 1 − x
P(p|¬bc) = y         P(n|¬bc) = 1 − y

P(p) = P(p|bc) P(bc) + P(p|¬bc) P(¬bc)
     = 0.3x + 0.7y

P(n) = P(n|bc) P(bc) + P(n|¬bc) P(¬bc)
     = 0.3(1 − x) + 0.7(1 − y)
     = 1 − 0.3x − 0.7y

P(bc|p) = P(p|bc) P(bc) / P(p)
        = 0.3x / (0.3x + 0.7y)

P(¬bc|p) = P(p|¬bc) P(¬bc) / P(p)
         = 0.7y / (0.3x + 0.7y)

P(bc|n) = P(n|bc) P(bc) / P(n)
        = (0.3 − 0.3x) / (1 − 0.3x − 0.7y)

P(¬bc|n) = P(n|¬bc) P(¬bc) / P(n)
         = (0.7 − 0.7y) / (1 − 0.3x − 0.7y)

E[s|p] = −12 P(bc|p) − 12 P(¬bc|p) = −12

E[¬s|p] = −102 P(bc|p) − 2 P(¬bc|p)
        = (−30.6x − 1.4y) / (0.3x + 0.7y)

E[s|n] = −12

E[¬s|n] = −102 P(bc|n) − 2 P(¬bc|n)
        = (−32 + 30.6x + 1.4y) / (1 − 0.3x − 0.7y)
Now it gets ugly. Let’s refer to our decision to take the test as D.
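If we follow the natural policy of operating on a positive result and skipping surgery on a negative one, then

E[D] = E[s|p] P(p) + E[¬s|n] P(n)
     = −12(0.3x + 0.7y) + (−32 + 30.6x + 1.4y)
     = 27x − 7y − 32,

and taking the test beats immediate surgery when 27x − 7y − 32 > −10, that is, when 27x − 7y > 22.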
Any values of x and y which satisfy this relationship (and, of course, which
are less than or equal to 1, so they are valid probabilities) will give us a better
expected value than just deciding on surgery without having the examination.
[Decision trees for the one-step strategies: from s1, action a1 has expected value 1.1 and a2 has 1.5; from s2, a1 has expected value 0.6 and a2 has 0.9. With a 0.5/0.5 prior over the starting state, always taking a2 has expected value 1.2.]
The decision tree above gives the optimal one-step strategy, assuming no knowledge of the starting state: clearly, we should take a2.
[Tree of one-step outcomes from an unknown starting state (0.5/0.5): taking a1 ends in s1 with probability 0.35, while taking a2 ends in s1 with probability 0.7.]
And now, the value functions. Because this is a simple system, we can
calculate them directly and avoid value-iteration.
We have two unknowns and two linear equations, so we can easily solve them and get V(s1) ≈ 6.69 and V(s2) ≈ 5.96.
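For concreteness, here is one way to set up and solve such a pair of Bellman equations with numpy. The transition probabilities, rewards, and discount below are placeholders rather than the exam's numbers, so the printed values will not be 6.69 and 5.96.

    import numpy as np

    gamma = 0.9                          # placeholder discount factor
    P = np.array([[0.7, 0.3],            # placeholder transition probabilities under the
                  [0.4, 0.6]])           # action chosen in s1 and s2 respectively
    R = np.array([1.0, 0.0])             # placeholder expected immediate rewards

    # The fixed-policy Bellman equations V = R + gamma * P V are linear in V,
    # so we can solve (I - gamma * P) V = R directly instead of iterating.
    V = np.linalg.solve(np.eye(2) - gamma * P, R)
    print(V)                             # [V(s1), V(s2)]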
3. The four-element network has a complexity of 15, while the five-element network requires only 9 conditional probability values. This illustrates the use of a hidden variable to simplify a network.
[Diagrams: the four-node network over A, B, C, and D, and the five-node network that adds a hidden variable.]
So

dE/dz = Σ_i 2(z − y_i) z(1 − z).

Setting this to zero and dropping the common factor z(1 − z) leaves Σ_i (z − y_i) = 0; with 80 target values of 1 and 20 of 0, this gives

80(z − 1) + 20(z − 0) = 0
100z = 80
z = 0.8
P(A|¬B) = P(A, ¬B) / P(¬B)
        = P(A, ¬B, ¬C) / (P(A, ¬B, ¬C) + P(C, ¬A, ¬B))
        = (1/3) / (1/3 + 1/3)
        = 0.5
So, now that you know that B is being pardoned, the probability that you will be executed is 50%. Before, your probability of execution was only 1/3.
Weird, huh? This is similar to another famous probability scenario known
as the “Monty Hall” problem.
6. If f (X) = 0 or f (X) = 1, meaning that X · W was greater than b or
less than −b, there is no gradient. In the other cases, we can derive the
following gradient equation.
s = ((1/2b)(X · W + b) − d)^2

ds/dW_i = ((1/2b)(X · W + b) − d)(X_i / b)
E = (f − d)^2
E = (g2(w2 · v) − d)^2

dE/dw_2n = 2(g2 − d) · g2(1 − g2) · v_n

dE/dw_1n = 2(g2 − d) · g2(1 − g2) · w_20 · g1(1 − g1) · x_n
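A quick finite-difference check of these two formulas, for a network with a single sigmoid hidden unit feeding a sigmoid output unit; all numbers are made up, and the scalar w2 below plays the role of w_20.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Tiny network: one sigmoid hidden unit feeding one sigmoid output unit.
    x = np.array([0.5, -1.2, 2.0])       # made-up input
    w1 = np.array([0.1, 0.4, -0.3])      # input-to-hidden weights
    w2 = 0.7                             # hidden-to-output weight (the w_20 above)
    d = 1.0                              # target

    def error(w1, w2):
        g1 = sigmoid(w1 @ x)
        g2 = sigmoid(w2 * g1)
        return (g2 - d) ** 2

    # Analytic gradients from the formulas above.
    g1 = sigmoid(w1 @ x)
    g2 = sigmoid(w2 * g1)
    dE_dw2 = 2 * (g2 - d) * g2 * (1 - g2) * g1
    dE_dw1 = 2 * (g2 - d) * g2 * (1 - g2) * w2 * g1 * (1 - g1) * x

    # Numerical check by central finite differences.
    eps = 1e-6
    num_dw2 = (error(w1, w2 + eps) - error(w1, w2 - eps)) / (2 * eps)
    num_dw1 = np.array([(error(w1 + eps * np.eye(3)[i], w2) -
                         error(w1 - eps * np.eye(3)[i], w2)) / (2 * eps) for i in range(3)])

    print(dE_dw2, num_dw2)    # should agree to several decimal places
    print(dE_dw1, num_dw1)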
((¬∃x.P(x)) ∨ Q(A)) ∧ ∃y.(P(y) ∧ ¬Q(A))
(∀x.¬P(x) ∨ Q(A)) ∧ ∃y.(P(y) ∧ ¬Q(A))

1. ¬P(x) ∨ Q(A)
2. P(fred)
3. ¬Q(A)
4. ¬P(x) (3,1)
5. False (4,2, fred/x)

1. ¬P(fred) ∨ Q(A)
2. P(x)
3. ¬Q(A)
4. ¬P(fred) (3,1)
5. False (4,2)
Third example: Hmmm, I'm a bit suspicious. Maybe we can find a counterexample.

Q(a) = false
P(fred) = true
P(ned) = false

∃x.(P(x) → Q(a)) → ∀x.(P(x) → Q(a))

We need a situation in which the left-hand side is true and the right-hand side is false. Consider satisfying the left-hand side with x = ned. Because P(ned) = false, the implication P(ned) → Q(a) is true even though Q(a) is always false, so the left-hand side holds. The right-hand side, however, is universally quantified and fails: P(fred) is true, which makes P(fred) → Q(a) false. So the entire expression is false, because we have a true left-hand side and a false right-hand side.
9. ¬On(x, z) ∨ ¬Above(z, y) ∨ Above(x, y)
10. The first-order logic description is:
1. P(F) ∨ P(y) ∨ G(y)
2. ¬B(F) ∨ P(y) ∨ G(y)
3. B(x) ∨ G(x)
4. ¬B(x) ∨ ¬G(x)
5. P(x) ∨ ¬P(y) ∨ B(y)
6. P(O1)
7. ¬P(O2)
And then we can prove that there is a green object:
8. ¬G(x)
9. ¬B(F) ∨ G(O2) (2,7)
10. G(F) ∨ G(O2) (3,9)
11. G(O2) (8,10)
12. False (8,11)