A2-24S3-test-template
Student name:
Student ID:
RMIT Classification: Trusted
Important information
About this document: You must ALWAYS keep your work private and never share it with
anybody in or outside the course, except your teammates (if it is a teamwork project), even
after the course is completed. You are not allowed to make another repository copy outside
the provided Classroom without the written permission of the teaching staff.
Late submissions & extensions: Not allowed. Extensions will only be permitted
in exceptional circumstances under the University’s rules.
Code of Honour
We expect every RMIT student taking this course to adhere to the Code of Honour, under
which every student should:
Submission
Submit your solutions for Assignment 2 in the exam paper provided to you. You must provide
your name and your student number on the first page of this paper.
A. Yes
B. No
Solution:
A. α entails β
B. M(α) ⊆ M(β)
Solution:
5. It is believed that: (0) RUVN students are smart; (1) if you are smart and you study hard, you
will pass the final exam; (2) if you are lucky and you are either smart or study hard, you will
pass the final exam.
Construct a KB of propositional sentences using the five propositional variables (RUVN, smart,
study, lucky, pass) and logical connectives (∧,∨,¬,→). Encode each of the three sentences into
one or more sentences in conjunctive normal form (CNF).
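Once a KB is in CNF, entailment can be checked mechanically by enumerating truth assignments. The sketch below is for illustration only (it is not the intended solution, and the toy KB in it is an assumption, not the answer to this question): clauses are sets of literal strings, with "-" marking negation.

```python
from itertools import product

# Tiny truth-table entailment checker over CNF clauses.
# A clause is a set of literals; "-x" means the negation of x.
def entails(kb_clauses, query_clauses, symbols):
    """Return True if every model of kb_clauses satisfies query_clauses."""
    def holds(clauses, model):
        # A literal "x" is true iff model[x] is True; "-x" iff model[x] is False.
        return all(any(model[l.lstrip("-")] != l.startswith("-") for l in c)
                   for c in clauses)
    for values in product([False, True], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if holds(kb_clauses, model) and not holds(query_clauses, model):
            return False
    return True

# Hypothetical toy KB: (smart AND study) -> pass, in CNF {-smart, -study, pass},
# plus the facts smart and study.
kb = [{"-smart", "-study", "pass"}, {"smart"}, {"study"}]
print(entails(kb, [{"pass"}], ["smart", "study", "pass"]))  # True
```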
KB = { P ⇒ Q, Q ⇒ R, P ∨ Q }
2. Choose a vocabulary of predicates, constants, and functions appropriate for representing the following
information in First-order Logic (FOL), then represent each sentence in FOL.
a) Every student studies Python programming language.
b) Every student except Yankee studies Python programming language.
Solution:
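For illustration only, one commonly chosen vocabulary is the predicates Student(x) and Studies(x, y) with constants Python and Yankee; these names are assumptions, not the prescribed ones. Under that vocabulary, typical readings are:

```latex
% (a) Every student studies Python:
\forall x\, (Student(x) \rightarrow Studies(x, Python))
% (b) Every student except Yankee studies Python (and Yankee does not):
\forall x\, ((Student(x) \land x \neq Yankee) \rightarrow Studies(x, Python))
  \land \neg Studies(Yankee, Python)
```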
Rewrite the sentence with the substitution x/A, and put the antecedent into CNF.
Solution:
4. Determine whether the sentence in Question II.3 is valid (i.e., a tautology). Provide an
explanation or a proof.
Solution:
¬P(w) ∨ Q(w)
¬Q(y) ∨ S(y)
P(x) ∨ R(x)
¬R(z) ∨ S(z)
Prove: S(A).
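A resolution refutation can be checked mechanically once the clauses are grounded. The sketch below is an illustration, not the required proof: it instantiates every variable with the constant A, adds the negated goal ¬S(A), and saturates the clause set until the empty clause appears.

```python
from itertools import combinations

# Propositional resolution over clauses represented as frozensets of
# literal strings; "~" marks negation.
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """All resolvents of two clauses."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out

def refutes(clauses):
    """Return True if the empty clause is derivable (contradiction)."""
    clauses = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True  # empty clause derived
                new.add(frozenset(r))
        if new <= clauses:
            return False
        clauses |= new

# Ground instances of the four clauses, plus the negated goal ~S(A).
kb = [{"~P(A)", "Q(A)"}, {"~Q(A)", "S(A)"},
      {"P(A)", "R(A)"}, {"~R(A)", "S(A)"}, {"~S(A)"}]
print(refutes(kb))  # True, so S(A) follows from the clauses
```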
Question 2-4. Consider the Blocks World problem. Suppose we use logical predicates to
represent states as in Tutorial No. 06 of this course.
5. The blocks world is one of the most famous planning domains in artificial intelligence. It is an
NP-hard problem, and we want to find an efficient way to solve it. Several algorithms can be
used, such as DFS, BFS, and A*. Can you propose one heuristic for solving the problem with
A*? Provide a short description of your proposal.
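One commonly cited heuristic (offered here as an illustration only, with an assumed state encoding) counts the blocks not yet in their goal position; since every misplaced block must be moved at least once, the count never overestimates the remaining moves, so it is admissible for A*.

```python
# States map each block to what it rests on: another block or "table".
# The block names and this encoding are assumptions for the sketch.
def misplaced_blocks(state, goal):
    """Admissible heuristic: number of blocks whose support differs from the goal."""
    return sum(1 for block, under in goal.items() if state.get(block) != under)

state = {"A": "table", "B": "A", "C": "B"}   # tower C-B-A
goal  = {"A": "B", "B": "C", "C": "table"}   # tower A-B-C
print(misplaced_blocks(state, goal))  # 3
```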
Part 4: Probability
1. (Prior Probability) A probability distribution on a random variable gives the probabilities of
all the possible values of that random variable.
Given: Weather = <sunny, rainy, cloudy, snow>. Fill in the missing value p:
P(Weather) = <0.74, 0.1, p, 0.1> (normalized, i.e., sums to 1)
Solution: p =
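Since the distribution is normalized, the missing entry is fixed by the other three; a quick check (the variable names are illustrative):

```python
# The probabilities must sum to 1, so p is one minus the known mass.
known = {"sunny": 0.74, "rainy": 0.1, "snow": 0.1}
p = round(1.0 - sum(known.values()), 2)
print(p)  # 0.06
```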
2. A joint probability distribution on a set of random variables gives the probabilities of all
combinations of the values of those variables. P(Weather, Fever) is a 4 × 2 table of probabilities
as follows. Fill in the missing value p.
Weather =       sunny   rainy   cloudy  snow
Fever = true    0.14    0.02    0.01    0.02
Fever = false   0.57    0.08    p       0.08
Solution: p =
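As with any joint distribution, all eight entries must sum to 1, which determines the missing cell; a quick check (variable names are illustrative):

```python
# Sum the seven known joint entries and subtract from 1.
fever_true  = [0.14, 0.02, 0.01, 0.02]
fever_false = [0.57, 0.08, 0.08]   # the cloudy entry p is missing
p = round(1.0 - (sum(fever_true) + sum(fever_false)), 2)
print(p)  # 0.08
```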
4. The probability of arriving on time when taking a taxi is 0.8. The probability of arriving on time
when taking a bus is 0.4. The student takes a taxi more frequently than the bus: on average,
for every 10 days, 7 are by taxi and 3 are by bus. What is the probability that the student
will be on time on a given day?
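This is an application of the law of total probability over the two transport modes; a quick check:

```python
# P(on time) = P(taxi) * P(on time | taxi) + P(bus) * P(on time | bus)
p_taxi, p_bus = 0.7, 0.3
p_ontime = round(p_taxi * 0.8 + p_bus * 0.4, 2)
print(p_ontime)  # 0.68
```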
5.
Part 5: MDPs
Consider a room-cleaning robot. It can be either in the room or at its charging station. The room
can be clean or dirty. So there are four states: RD (in the room, dirty), RC (in the room, clean),
CD (at the charger, dirty), and CC (at the charger, clean). The robot can either choose to suck up
dirt or return to its charger. Reward for being in the charging station when the room is clean is 0;
reward for being in the charging station when the room is dirty is -10; reward for other states is -1.
Assume also that after the robot has gotten a -10 penalty for entering the charging station when
the room is still dirty, it will get rewards of 0 thereafter, no matter what it does. Assume that if the
robot decides to suck up dirt while it is in the room, then the probability of going from a dirty to a
clean floor is 0.5. The return action always takes the robot to the charging station, leaving the
dirtiness of the room unchanged. The discount factor is 0.8.
1. What is the value V*(CC) (the value of being in the CC state)?
2. Write the Bellman equation for V*(RC). What is the value V*(RC)?
3. If V0(RD) = 0 (that is, the initial value assigned to this state is 0), what is V1(RD), the value
of RD with one step to go (calculated via one iteration of value iteration)?
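The dynamics above can be sketched with value iteration. This is an illustration only, not the required working: the reward-timing convention (reward for the current state plus the discounted value of the successor) and the extra absorbing DONE state (modelling "rewards of 0 thereafter" once the -10 penalty has been taken) are assumptions.

```python
# Value-iteration sketch for the cleaning-robot MDP.
GAMMA = 0.8
R = {"RD": -1, "RC": -1, "CD": -10, "CC": 0, "DONE": 0}

# T[state][action] = list of (probability, next_state)
T = {
    "RD": {"suck": [(0.5, "RC"), (0.5, "RD")], "ret": [(1.0, "CD")]},
    "RC": {"suck": [(1.0, "RC")],              "ret": [(1.0, "CC")]},
    "CD": {"stay": [(1.0, "DONE")]},   # -10 taken once, 0 thereafter
    "CC": {"stay": [(1.0, "CC")]},
    "DONE": {"stay": [(1.0, "DONE")]},
}

# Synchronous value iteration from V0 = 0 for every state.
V = {s: 0.0 for s in T}
for _ in range(100):
    V = {s: R[s] + GAMMA * max(sum(p * V[s2] for p, s2 in outs)
                               for outs in T[s].values())
         for s in T}

print(round(V["CC"], 3))
print(round(V["RC"], 3))
```

Under these assumptions the first sweep from V0 = 0 also shows the one-step value of RD directly: V1(RD) = R(RD) + GAMMA * 0 = -1.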