AI(U4)
UNIT IV
Planning and Learning: Planning with state space search - partial order planning - planning graphs - conditional planning - continuous planning - Multi-Agent planning. Forms of learning - inductive learning - learning decision trees - ensemble learning - Neural Net learning and Genetic learning
Planning is carried out by a planning agent, or planner, to solve a problem in the current world state. The planner uses certain algorithms to construct solutions to the given problem; such an algorithm is called an ideal planner algorithm.
– Unify action and goal representation to allow selection (use a logical language for both)
– Divide-and-conquer by subgoaling
– Relax the requirement for sequential construction of solutions
(ii) What is the difference between a planner and a problem-solving agent?
Problem-solving agent: it represents the task of the problem and solves it using search techniques or other algorithms.
Planner: it overcomes the difficulties that arise with the problem-solving agent.
The 3 key ideas in the planning approach are: (1) open up the representation of states, goals and actions using a logical language (FOL); (2) the planner is free to add actions to the plan whenever and wherever they are needed, rather than building an incremental sequence from the initial state; (3) most parts of the world are independent of the other parts, so the problem can be attacked through subgoals.
Pre-condition
Effect
7. What is STRIPS?
STRIPS (the Stanford Research Institute Problem Solver) introduced a restricted representation language for planning that provides:
– tidily arranged action descriptions
– efficient planning algorithms
The planner takes the current situation and searches for it in the KB. If a plan for that situation is already present it is reused; otherwise a new plan is made for the situation and executed.
– Progression: an algorithm that searches for the goal state by searching through the states generated by actions that can be performed in the given state, starting from the initial state.
– Regression: an algorithm that searches backward from the goal state by finding actions whose effects satisfy one or more of the posted goals, and posting the chosen action's preconditions as goals (goal regression).
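As an illustration of progression, here is a minimal Python sketch (not part of the original notes) of forward state-space search over STRIPS-style actions, where a state is a set of ground facts and each action carries precondition, add and delete lists; the Action type and the breadth-first strategy are assumptions made only for this example.

    from collections import deque
    from typing import FrozenSet, NamedTuple

    class Action(NamedTuple):
        name: str
        precond: FrozenSet[str]   # facts that must hold before the action
        add: FrozenSet[str]       # facts the action makes true
        delete: FrozenSet[str]    # facts the action makes false

    def progression_search(initial, goal, actions):
        """Breadth-first search forward from the initial state to a state
        containing every goal fact; returns the action names of one plan."""
        frontier = deque([(frozenset(initial), [])])
        seen = {frozenset(initial)}
        while frontier:
            state, plan = frontier.popleft()
            if frozenset(goal) <= state:          # all goal facts satisfied
                return plan
            for a in actions:
                if a.precond <= state:            # action is applicable
                    nxt = frozenset((state - a.delete) | a.add)
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append((nxt, plan + [a.name]))
        return None                               # no plan found

A regression planner would work in the opposite direction, starting from the goal facts and regressing them through the actions' preconditions.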
A partial plan is an incomplete plan, which may be produced during the initial phase of planning. There are 2 main operations allowed on plans:
– Refinement operators, which add constraints to a partial plan
– Modification operators (all other plan operators)
Causal link: Si --c--> Sj, recording that step Si achieves the condition c needed by step Sj. It implies
Si < Sj (Si is ordered before Sj)
c ∈ Effects(Si)
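To make the Si --c--> Sj notation concrete, the bookkeeping of a partial-order plan can be captured with small records; the CausalLink and PartialPlan names below are illustrative choices, not terms from the notes.

    from dataclasses import dataclass, field
    from typing import List, Set, Tuple

    @dataclass(frozen=True)
    class CausalLink:
        producer: str     # step Si whose effect supplies the condition c
        condition: str    # the condition c itself
        consumer: str     # step Sj whose precondition c is being achieved

    @dataclass
    class PartialPlan:
        steps: Set[str] = field(default_factory=set)
        orderings: Set[Tuple[str, str]] = field(default_factory=set)  # (Si, Sj) means Si < Sj
        links: List[CausalLink] = field(default_factory=list)

        def add_link(self, si, c, sj):
            """Record Si --c--> Sj: c is in Effects(Si) and Si must precede Sj."""
            self.links.append(CausalLink(si, c, sj))
            self.orderings.add((si, sj))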
14. What are the Properties of POP?
The learning element takes some knowledge about the performance element and some feedback on how the agent is doing, and determines how the performance element should be modified to (hopefully) do better in the future.
The critic is designed to tell the learning element how well the agent is doing. The
critic employs a fixed standard of performance. This is necessary because the
percepts themselves provide no indication of the agent's success.
The problem generator is responsible for suggesting actions that will lead to new and informative experiences.
The point is that if the performance element had its way, it would keep doing the actions
that are best, given what it knows. But if the agent is willing to explore a little, and do
some perhaps suboptimal actions in the short run, it might discover much better actions
for the long run.
The learning element is also responsible for improving the efficiency of the
performance element. For example, when asked to make a trip to a new destination,
the taxi might take a while to consult its map and plan the best route. But the next
time a similar trip is requested, the planning process should be much faster. This is
called speedup learning.
A means to infer relevant properties of the world from the percept sequence.
Information about the results of possible actions the agent can take.
Goals that describe classes of states whose achievement maximizes the agent's
utility.
Supervised learning
Unsupervised learning
The path for a restaurant full of patrons, with an estimated wait of 10-30 minutes, when the agent is not hungry, is expressed by the logical sentence
∀r Patrons(r, Full) ∧ WaitEstimate(r, 10-30) ∧ Hungry(r, N) ⇒ WillWait(r)
Majority function is one which returns 1 if more than half of its inputs are 1.
28. Draw an example decision tree
29. Explain the terms Positive example, negative example and training set.
An example is described by the values of the attributes and the value of the goal predicate. We call the value of the goal predicate the classification of the example. If the goal predicate is true for some example, we call it a positive example; otherwise we call it a negative example. Consider a set of examples X1, ..., X12 for the restaurant domain. The positive examples are the ones where the goal WillWait is true (X1, X3, ...) and the negative examples are the ones where it is false (X2, X5, ...). The complete set of examples is called the training set.
30. Explain the methodology used for assessing the performance of a learning algorithm.
1. Collect a large set of examples.
2. Divide it into two disjoint sets: the training set and the test set.
3. Use the learning algorithm with the training set as examples to generate a hypothesis H.
4. Measure the percentage of examples in the test set that are correctly classified by H.
5. Repeat steps 1 to 4 for different sizes of training sets and different randomly selected training sets of each size.
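A minimal Python sketch of this train/test methodology is given below; the learn and classify callables are placeholders standing in for whatever learning algorithm is being assessed, and the (x, label) example format is an assumption of the sketch.

    import random

    def evaluate(examples, learn, classify, train_sizes, trials=20):
        """For each training-set size n (with n < len(examples)), repeatedly split
        the examples into a training set and a test set, learn a hypothesis H from
        the training set, and record the fraction of test examples H classifies
        correctly; return the average accuracy for each size."""
        results = {}
        for n in train_sizes:
            scores = []
            for _ in range(trials):
                shuffled = examples[:]
                random.shuffle(shuffled)
                train, test = shuffled[:n], shuffled[n:]
                H = learn(train)
                correct = sum(1 for x, label in test if classify(H, x) == label)
                scores.append(correct / len(test))
            results[n] = sum(scores) / len(scores)
        return results

Plotting the results against the training-set size gives the learning curve for the algorithm.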
Whenever there is a large set of possible hypotheses, one has to be careful not to use
the resulting freedom to find meaningless "regularity" in the data. This problem is
called overfitting. It is a very general phenomenon, and occurs even when the target
function is not at all random. It afflicts every kind of learning algorithm, not just
decision trees.
The probability that the attribute is really irrelevant can be calculated with a standard statistical significance test (for example the χ² test); attributes that are likely to be irrelevant are not split on, or are removed from the tree. This is called decision-tree pruning.
Cross-validation is another technique that reduces the danger of overfitting. The basic idea of cross-validation is to estimate how well the current hypothesis will predict unseen data. This is done by setting aside some fraction of the known data and using it to test the prediction performance of a hypothesis induced from the rest of the known data.
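A minimal sketch of k-fold cross-validation in this spirit; the learn and accuracy callables, and the round-robin fold split, are assumptions of the example.

    def k_fold_cross_validation(examples, learn, accuracy, k=10):
        """Estimate how well a hypothesis will predict unseen data by holding out
        each of k folds in turn and training on the remaining data."""
        folds = [examples[i::k] for i in range(k)]   # simple round-robin split
        scores = []
        for i in range(k):
            held_out = folds[i]
            training = [x for j, fold in enumerate(folds) if j != i for x in fold]
            h = learn(training)
            scores.append(accuracy(h, held_out))
        return sum(scores) / k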
In many domains, not all the attribute values will be known for every example. The
values may not have been recorded, or they may be too expensive to obtain. This
gives rise to two problems. First, given a complete decision tree, how should one
classify an object that is missing one of the test attributes? Second, how should one
modify the information gain formula when some examples have unknown values for
the attribute?
When an attribute has a large number of possible values, the information gain measure gives an inappropriate indication of the attribute's usefulness. Consider the extreme case where every example has a different value for the attribute - for instance, if we were to use an attribute RestaurantName in the restaurant domain. In such a case, each subset of examples is a singleton and therefore has a unique classification, so the information gain measure would have its highest value for this attribute. Nevertheless, the attribute may be irrelevant or useless; one solution is to use the gain ratio, which divides the gain by the information content of the attribute's own value distribution.
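The following sketch (names and data layout are illustrative, not from the notes) computes both the information gain and the gain ratio, showing how the split information penalizes many-valued attributes such as RestaurantName.

    from collections import Counter
    from math import log2

    def entropy(values):
        """Entropy (in bits) of a list of discrete values."""
        n = len(values)
        return -sum((c / n) * log2(c / n) for c in Counter(values).values())

    def information_gain(examples, attr):
        """examples: list of (attribute_dict, label) pairs."""
        labels = [label for _, label in examples]
        remainder = 0.0
        for value in {x[attr] for x, _ in examples}:
            subset = [label for x, label in examples if x[attr] == value]
            remainder += len(subset) / len(examples) * entropy(subset)
        return entropy(labels) - remainder

    def gain_ratio(examples, attr):
        """Gain divided by the split information (the entropy of the attribute's
        own value distribution): high for genuinely informative attributes,
        low for attributes that merely fragment the examples."""
        split_info = entropy([x[attr] for x, _ in examples])
        return information_gain(examples, attr) / split_info if split_info else 0.0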
Attributes such as Height and Weight have a large or infinite set of possible values.
They are therefore not well-suited for decision-tree learning in raw form. An obvious
way to deal with this problem is to discretize the attribute. For example, the Price
attribute for restaurants was discretized into $, $$, and $$$ values. Normally, such
discrete ranges would be defined by hand. A better approach is to preprocess the raw
attribute values during the tree-growing process in order to find out which ranges
give the most useful information for classification purposes.
Version space methods are probably not practical in most real-world learning problems, mainly because of noise, but they provide a good deal of insight into the logical structure of hypothesis space.
Any hypothesis that is seriously wrong will almost certainly be "found out"
with high probability after a small number of examples, because it will make an
incorrect prediction. Thus, any hypothesis that is consistent with a sufficiently large
set of training examples is unlikely to be seriously wrong—that is, it must be
Probably Approximately Correct. PAC-learning is the subfield of computational
learning theory that is devoted to this idea.
function DECISION-LIST-LEARNING(examples) returns a decision list, or failure
    if examples is empty then return the trivial decision list No
    t <- a test that matches a nonempty subset examples_t of examples
         such that the members of examples_t are all positive or all negative
    if there is no such t then return failure
    if the examples in examples_t are positive then o <- Yes else o <- No
    return a decision list with initial test t and outcome o and remaining elements given by
         DECISION-LIST-LEARNING(examples - examples_t)
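Below is a rough Python rendering of the pseudocode above, assuming each example is a pair (attribute_dict, label) and that candidate tests are single attribute = value checks; both representation choices go beyond the pseudocode itself.

    def decision_list_learning(examples):
        """Return a decision list as [(test, outcome), ...] ending with (None, 'No'),
        or None on failure.  A test is an (attribute, value) pair."""
        if not examples:
            return [(None, "No")]                      # trivial decision list
        for attr in examples[0][0]:
            for value in {x[attr] for x, _ in examples}:
                matched = [(x, lbl) for x, lbl in examples if x[attr] == value]
                labels = {lbl for _, lbl in matched}
                if len(labels) == 1:                   # all positive or all negative
                    outcome = labels.pop()
                    rest = [(x, lbl) for x, lbl in examples if x[attr] != value]
                    tail = decision_list_learning(rest)
                    if tail is not None:
                        return [((attr, value), outcome)] + tail
        return None                                     # no suitable test exists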
State space search is a process used in the field of computer science, including
artificial intelligence (AI), in which successive configurations or states of an instance
are considered, with the goal of finding a goal state with a desired property.
45. How could you differentiate normal and decision tree? (Apr-May’17)
A Decision Tree is a flow-chart of if-then-else tests drawn using insights from data. The process of making a decision tree involves greedily extracting the if-else splits (and their order) such that the total entropy (or some other impurity measure) at the leaves is lower than at the root.
A normal tree, by contrast, is simply a data structure whose nodes have many useful properties. The depth of a node is the length of the path (the number of edges) from the root to that node. The height of a node is the longest path from that node to its leaves. A binary tree is one in which each node has zero, one, or two children.
Inductive learning is a kind of learning in which, given a set of examples, an agent tries to estimate or create an evaluation function. Most inductive learning is supervised learning, in which examples are provided together with their classifications. (The alternative is clustering.) More formally, an example is a pair (x, f(x)), where x is the input and f(x) is the output of the function applied to x. The task of pure inductive inference (or induction) is, given a set of examples of f, to find a hypothesis h that approximates f.
11 Marks
1. What is planning in AI? Explain the Planning done by an agent? (Nov'13) (Apr'15)
Planning problem
– Find a sequence of actions that achieves a given goal when executed from a given initial world state. That is, given
– a set of operator descriptions (defining the possible primitive actions by the agent),
– an initial state description, and
– a goal state description,
compute a sequence of operator instances such that executing them in the initial state will change the world to a state satisfying the goal-state description.
An Agent Architecture
Planning vs. problem solving
– Planning and problem solving methods can often solve the same sorts of problems
– Search often proceeds through plan space rather than state space (though there
are also state-space planners)
– Sub goals can be planned independently, reducing the complexity of the planning
problem
Typical assumptions
Blocks world
The blocks world is a micro-world that consists of a table, a set of blocks and a
robot hand. Some domain constraints apply (for example, only one block can be directly on top of another, and the hand can hold only one block at a time). States are described with predicates such as:
ontable(a)
ontable(c)
– The General Problem Solver (GPS) system was an early planner (Newell, Shaw, and
Simon)
– GPS generated actions that reduced the difference between some state and a goal state
– GPS used Means-Ends Analysis
– Compare what is given or known with what is desired and select a reasonable
thing to do next
• Intuition: Represent the planning problem using first-order logic (situation calculus)
– Goal state:
(∃s) At(Home,s) ∧ Have(Milk,s) ∧ Have(Bananas,s) ∧ Have(Drill,s)
– Operators are descriptions of how the world changes as a result of the agent's actions:
(∀a,s) Have(Milk,Result(a,s)) ⇔ ((a = Buy(Milk) ∧ At(Grocery,s)) ∨ (Have(Milk,s) ∧ a ≠ Drop(Milk)))
– Action sequences are also useful: Result'(l,s) is the result of executing the list of actions l starting in s:
(∀s) Result'([],s) = s
(∀a,p,s) Result'([a|p],s) = Result'(p, Result(a,s))
– Do not need to fully specify state
– Non-specified either don’t-care or assumed false
– Represent many cases in small storage
– Often only represent changes in state rather than entire situation
– Unlike theorem prover, not seeking whether the goal is true, but is there a
sequence of actions to attain it
Operator/action representation
Example:
Op[Action: Go(there),
Precond: At(here) ^ Path(here,there),
Effect: At(there) ^ ~At(here)]
– putdown(X): put block X on the table
Each action will be represented by
– a list of preconditions
– a list of new facts to be added (add-effects)
– a list of facts to be removed (delete-effects)
– a set of constraints on the variables
For example, the stack action:
preconditions(stack(X,Y), [holding(X),clear(Y)])
deletes(stack(X,Y), [holding(X),clear(Y)])
adds(stack(X,Y), [handempty,on(X,Y),clear(X)])
constraints(stack(X,Y), [X\==Y,Y\==table,X\==table])
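For comparison, the same stack(X,Y) operator can be encoded as Python data; the dictionary layout below is just one possible encoding of the precondition/add/delete/constraint lists and is not taken from the notes.

    def stack(x, y):
        """STRIPS-style description of stacking block x onto block y."""
        assert x != y and x != "table" and y != "table"          # constraints
        return {
            "name": f"stack({x},{y})",
            "preconditions": {f"holding({x})", f"clear({y})"},
            "delete": {f"holding({x})", f"clear({y})"},          # facts removed
            "add": {"handempty", f"on({x},{y})", f"clear({x})"}, # facts added
        }

    def apply(state, action):
        """Progress a state (a set of ground facts) through an applicable action."""
        assert action["preconditions"] <= state
        return (state - action["delete"]) | action["add"]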
STRIPS planning
Typical BW planning problem. Initial state:
clear(a)
clear(b)
clear(c)
ontable(a)
ontable(b)
ontable(c)
handempty
Goal interaction
• Simple planning algorithms assume that the goals to be achieved are independent
– Solving on(A,B) first (by doing unstack(C,A), stack(A,B)) will be undone when solving the second goal on(B,C) (by doing unstack(A,B), stack(B,C)).
• Classic STRIPS could not handle this, although minor modifications can get it to do
simple cases
State-space planning
– We initially have a space of situations (where you are, what you have, etc.)
– The plan is a solution found by “searching” through the situations to get to the goal
– A progression planner searches forward from initial state to goal state
– A regression planner searches backward from the goal
– This works if operators have enough information to go both ways
– Ideally this leads to reduced branching: you are only considering things that are relevant to the goal
2. What is learning and its representation in AI? Explain.
Information processes that improve their performance or enlarge their knowledge
bases are said to learn.
Why is it hard?
Knowledge acquisition
Taking advice
-- Similar to rote learning although the knowledge that is input may need to be
transformed (or operationalized) in order to be used effectively.
Problem Solving
-- If we solve a problem, we may learn from the experience. The next time we see a similar problem we can solve it more efficiently. This does not usually involve gathering new knowledge but may involve reorganization of data or remembering how to achieve the solution.
Induction
-- One can learn from examples. Humans often classify things in the world
without knowing explicit rules. Usually involves a teacher or trainer to aid the
classification.
Discovery
Here one learns knowledge without the aid of a teacher.
Analogy
If a system can recognize similarities in information already stored, then it may be able to transfer some knowledge to improve the solution of the task at hand.
General model:
Learning element
• Design of a learning
element is affected by
– Which components of the performance element are to be learned
– What feedback is available to learn these components
– What representation is used for the components
• Type of feedback:
– Supervised learning: correct answers for each example
– Unsupervised learning: correct answers not given
– Reinforcement learning: occasional rewards
Inductive learning is a kind of learning in which, given a set of examples, an agent tries to estimate or create an evaluation function. Most inductive learning is supervised learning, in which examples are provided together with their classifications. (The alternative is clustering.) More formally, an example is a pair (x, f(x)), where x is the input and f(x) is the output of the function applied to x. The task of pure inductive inference (or induction) is, given a set of examples of f, to find a hypothesis h that approximates f.
o Given an example: a pair <x, f(x)>, where x is the input and f(x) is the result
o Generate a hypothesis: a function h(x) that approximates f(x)
o That will generalize well: correctly predict values for unseen samples
How do we do this?
An example is a pair (x, f(x)), e.g.,
• Construct/adjust h to agree with f on training set (h is consistent if it agrees
with f on all examples)
• E.g., curve fitting:
"The most likely hypothesis is the simplest one consistent with the data" (Ockham's razor).
Since there can be noise in the measurements, in practice we need to make a tradeoff
between simplicity of the hypothesis and how well it fits the data.
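The simplicity-versus-fit tradeoff can be seen in a small curve-fitting experiment; the NumPy-based sketch below uses made-up noisy data and is purely illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 12)
    y = 2 * x + 0.5 + rng.normal(0, 0.1, size=x.shape)   # noisy samples of a linear f

    for degree in (1, 3, 7):
        coeffs = np.polyfit(x, y, degree)                 # hypothesis h of given complexity
        train_error = np.mean((np.polyval(coeffs, x) - y) ** 2)
        print(degree, round(float(train_error), 4))
    # Higher-degree polynomials fit the training points better and better,
    # but the degree-7 hypothesis is fitting the noise rather than the underlying f.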
The reason reinforcement learning is harder than supervised learning is that the
agent is never told what the right action is, only whether it is doing well or poorly,
and in some cases (such as chess) it may only receive feedback after a long string of
actions.
There are two basic kinds of information an agent can try to learn.
Utility function -- The agent learns the utility of being in various states, and chooses actions to maximize the expected utility of their outcomes. This requires the agent to keep a model of the environment.
Action-value (Q) function -- The agent learns the value of taking each action in each state; this does not require a model of the environment (this is what Q-learning, described below, does).
A passive learning agent keeps an estimate U of the utility of each state, a table N of
how many times each state was seen, and a table M of transition probabilities. There
are a variety of ways the agent can update its table U.
Naive Updating
One simple updating method is the least mean squares (LMS) approach [Widrow and
Hoff, 1960]. It assumes that the observed reward-to-go of a state in a sequence
provides direct evidence of the actual reward-to-go. The approach is simply to keep
the utility as a running average of the rewards based upon the number of times the
state has been seen. This approach minimizes the mean square error with respect to
the observed data.
This approach converges very slowly, because it ignores the fact that the actual
utility of a state is the probability-weighted average of its successors' utilities,
plus its own reward. LMS disregards these probabilities.
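A minimal sketch of the LMS running-average bookkeeping; the table layout (plain dictionaries U and N) and the function names are assumptions made for this example.

    def lms_update(U, N, state, reward_to_go):
        """Keep U[state] as the running average of observed rewards-to-go,
        using N[state] as the number of times the state has been seen."""
        N[state] = N.get(state, 0) + 1
        U[state] = U.get(state, 0.0) + (reward_to_go - U.get(state, 0.0)) / N[state]
        return U[state]

    def observe_epoch(U, N, states, rewards):
        """After one training epoch, the reward-to-go of each visited state is
        the sum of the rewards from that point to the end of the sequence."""
        total = 0.0
        for state, r in reversed(list(zip(states, rewards))):
            total += r
            lms_update(U, N, state, total)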
If the transition probabilities and the rewards of the states are known (which will
usually happen after a reasonably small set of training examples), then the actual
utilities can be computed directly by solving the value-determination equations
U(i) = R(i) + Σj Mij U(j)
where U(i) is the utility of state i, R(i) is its reward, and Mij is the probability of a transition from state i to state j. This is identical to a single value determination in the policy iteration algorithm for Markov decision processes. Adaptive dynamic
programming is any kind of reinforcement learning method that works by solving the
utility equations using a dynamic programming algorithm. It is exact, but of course
highly inefficient in large state spaces.
Temporal Difference Learning
[Richard Sutton] Temporal difference learning uses the difference in utility values
between successive states to adjust them from one epoch to another. The key idea is
to use the observed transitions to adjust the values of the observed states so that they
agree with the ADP constraint equations. Practically, this means updating the utility
of state i so that it agrees better with its successor j. This is done with the temporal-
difference (TD) equation
U(i) <- U(i) + a(R(i) + U(j) - U(i))
where a is a learning rate parameter.
This approach will cause U(i) to converge to the correct value if the learning rate
parameter decreases with the number of times a state has been visited [Dayan, 1992].
In general, as the number of training sequences tends to infinity, TD will converge on
the same utilities as ADP.
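The TD update just described can be written directly in Python; the dictionary defaults and the fixed alpha are illustrative assumptions (in practice alpha should decay with the visit count, as noted above).

    def td_update(U, i, j, reward_i, alpha=0.1):
        """Adjust U[i] toward agreement with its observed successor j:
        U(i) <- U(i) + alpha * (R(i) + U(j) - U(i))."""
        U.setdefault(i, 0.0)
        U.setdefault(j, 0.0)
        U[i] += alpha * (reward_i + U[j] - U[i])
        return U[i]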
Since neither temporal difference learning nor LMS actually use the model M of state
transition probabilities, they will operate unchanged in an unknown environment.
The ADP approach, however, updates its estimated model of an unknown
environment after each step, and this model is used to revise the utility estimates.
Any method for learning stochastic functions can be used to learn the environment
model; in particular, in a simple environment the transition probability Mij is just the
percentage of times state i has transitioned to j.
The basic difference between TD and ADP is that TD adjusts a state to agree with the
observed successor, while ADP makes a state agree with all successors that might
occur, weighted by their probabilities. More importantly, ADP's adjustments may
need to be propagated across all of the utility equations, while TD's affect only the
current equation. TD is essentially a crude first approximation to ADP.
A middle-ground can be found by bounding or ordering the number of adjustments
made in ADP, beyond the simple one made in TD. The prioritized-sweeping heuristic
prefers only to make adjustments to states whose likely successors have just
undergone large adjustments in their utility estimates. Such approximate ADP
systems can be very nearly as efficient as ADP in terms of convergence, but operate
much more quickly.
The difference between active and passive agents is that passive agents learn a fixed
policy, while the active agent must decide what action to take and how it will affect its
rewards. To represent an active agent, the environment model M is extended to give
the probability of a transition from a state i to a state j, given an action a. Utility is
modified to be the reward of the state plus the maximum utility expected depending
upon the agent's action:
U(i) = R(i) + maxa Σj M(a,i,j) U(j)
where M(a,i,j) is the probability of reaching state j from state i when action a is taken.
An ADP agent is extended to learn transition probabilities given actions; this is simply
another dimension in its transition table. A TD agent must similarly be extended to
have a model of the environment.
The equations for Q-learning are similar to those for state-based learning agents. The
difference is that Q-learning agents do not need models of the world. The equilibrium
equation, which can be used directly (as with ADP agents), is
Q(a,i) = R(i) + Σj M(a,i,j) maxa' Q(a',j)
The temporal difference version does not require that a model be learned; its update
equation is
Q(a, i) <- Q(a, i) + a(R(i) + maxa' Q(a', j) - Q(a, i))
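Written as Python, the model-free update looks as follows; the (action, state) keying, the defaultdict, and the action list for the successor state are assumptions of this sketch.

    from collections import defaultdict

    Q = defaultdict(float)            # Q[(action, state)], initially 0

    def q_update(state_i, action, reward_i, state_j, actions_j, alpha=0.1):
        """Model-free TD update:
        Q(a,i) <- Q(a,i) + alpha * (R(i) + max_a' Q(a',j) - Q(a,i))."""
        best_next = max((Q[(a2, state_j)] for a2 in actions_j), default=0.0)
        Q[(action, state_i)] += alpha * (reward_i + best_next - Q[(action, state_i)])
        return Q[(action, state_i)]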
Applications of Reinforcement Learning
The first significant reinforcement learning system was used in Arthur Samuel's
checker-playing program. It used a weighted linear function to evaluate positions,
though it did not use observed rewards in its learning process.
Neural Networks
More specifically, a neural network consists of a set of nodes (or units), links that
connect one node to another, and weights associated with each link. Some nodes
receive inputs via links; others directly from the environment, and some nodes send
outputs out of the network. Learning usually occurs by adjusting the weights on the
links.
Each unit has a set of weighted inputs, an activation level, and a way to compute its
activation level at the next time step. This is done by applying an activation function
to the weighted sum of the node's inputs. Generally, the weighted sum (also called the
input function) is a strictly linear sum, while the activation function may be nonlinear.
If the value of the activation function is above a threshold, the node "fires."
Generally, all nodes share the same activation function and threshold value, and only
the topology and weights change.
o g is a non-linear function which takes as input a weighted sum of the input link signals (as well as an intrinsic bias weight) and outputs a certain signal strength.
o g is commonly a threshold function or a sigmoid function.
Network Structures
The two fundamental types of network structure are feed-forward and recurrent. A
feed-forward network is a directed acyclic graph; information flows in one direction
only, and there are no cycles. Such networks cannot represent internal state. Usually, neural networks are also layered, meaning that nodes are organized into groups, or layers, and links only go from nodes to nodes in adjacent layers.
Recurrent networks allow loops, and as a result can represent state, though they are
much more complex to analyze. Hopfield networks and Boltzmann machines are
examples of recurrent networks; Hopfield networks are the best understood. All
connections in Hopfield networks are bidirectional with symmetric weights, all units
have outputs of 1 or -1, and the activation function is the sign function. Also, all nodes
in a Hopfield network are both input and output nodes. Interestingly, it has been
shown that a Hopfield network can reliably recognize 0.138N training examples,
where N is the number of units in the network.
Boltzmann machines allow non-input/output units, and they use a stochastic
evaluation function that is based upon the sum of the total weighted input. Boltzmann
machines are formally equivalent to a certain kind of belief network evaluated with a
stochastic simulation algorithm.
One problem in building neural networks is deciding on the initial topology, e.g., how
many nodes there are and how they are connected. Genetic algorithms have been
used to explore this problem, but it is a large search space and this is a
computationally intensive approach. The optimal brain damage method uses
information theory to determine whether weights can be removed from the network
without loss of performance, and possibly improving it. The alternative of making the
network larger has been tested with the tiling algorithm [Mezard and Nadal, 1989]
which takes an approach similar to induction on decision trees; it expands a unit by
adding new ones to cover instances it misclassified. Cross-validation techniques can
be used to determine when the network size is right.
Perceptrons
Perceptrons are single-layer, feed-forward networks that were first studied in the 1950s. They are only capable of learning linearly separable functions. That is, if we view F features as defining an F-dimensional space, the network can recognize any class that involves placing a single hyperplane between the instances of two classes.
So, for example, they can easily represent AND, OR, or NOT, but cannot represent
XOR.
Perceptrons learn by updating the weights on their links in response to the
difference between their output value and the correct output value. The updating rule
(due to Frank Rosenblatt, 1960) is as follows. Define Err as the difference between
the correct output and actual output. Then the learning rule for each weight is
Wj <- Wj + A x Ij x Err
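A sketch of this update rule for a single threshold unit is shown below; the step activation, the bias term, and the epoch loop are conventional choices added for the example rather than details from the text.

    def perceptron_train(examples, n_inputs, learning_rate=0.1, epochs=50):
        """examples: list of (inputs, target) pairs with targets 0 or 1.
        Applies Wj <- Wj + learning_rate * Ij * Err to every weight."""
        w = [0.0] * n_inputs
        bias = 0.0
        for _ in range(epochs):
            for inputs, target in examples:
                activation = sum(wj * ij for wj, ij in zip(w, inputs)) + bias
                output = 1 if activation > 0 else 0
                err = target - output
                w = [wj + learning_rate * ij * err for wj, ij in zip(w, inputs)]
                bias += learning_rate * err
        return w, bias

    # The linearly separable AND function is learnable; XOR is not.
    and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    weights, bias = perceptron_train(and_examples, n_inputs=2)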
The back-propagation rule is similar to the perceptron learning rule. If Erri is the
error at the output node, then the weight update for the link from unit j to unit i (the
output node) is
Wj,i <- Wj,i + A x aj x Erri x g'(ini)
where g' is the derivative of the activation function, and aj is the activation of the unit
j. (Note that this means the activation function must have a derivative, so the sigmoid
function is usually used rather than the step function.) Define Di as Erri x g'(ini).
This updates the weights leading to the output node. To update the weights on the interior links, we use the idea that the hidden node j is responsible for part of the error in each of the nodes to which it connects. Thus the error at the output is divided according to the strength of the connection between the output node and the hidden node, and propagated backward to previous layers. Specifically,
Dj = g'(inj) x Σi Wj,i x Di
Lastly, the weight-updating rule for the weights from the input layer to the hidden layer is
Wk,j <- Wk,j + A x Ik x Dj
where k is the input node and j the hidden node, and Ik is the input value of k.
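The following sketch carries out one forward and backward pass for a network with a single hidden layer and one output unit, using the Di and Dj definitions above; the sigmoid activation and the list-based weight layout are assumptions of the example.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def backprop_step(I, W_in, W_out, target, alpha=0.5):
        """One training step.  W_in[k][j] connects input k to hidden unit j,
        W_out[j] connects hidden unit j to the single output unit."""
        # Forward pass
        in_hidden = [sum(I[k] * W_in[k][j] for k in range(len(I)))
                     for j in range(len(W_out))]
        a_hidden = [sigmoid(v) for v in in_hidden]
        in_out = sum(a_hidden[j] * W_out[j] for j in range(len(W_out)))
        output = sigmoid(in_out)
        # Backward pass: for the sigmoid, g'(x) = g(x) * (1 - g(x))
        D_out = (target - output) * output * (1 - output)          # Di = Erri * g'(ini)
        D_hidden = [a_hidden[j] * (1 - a_hidden[j]) * W_out[j] * D_out
                    for j in range(len(W_out))]                     # Dj = g'(inj) * Wj,i * Di
        # Weight updates: Wj,i += alpha * aj * Di  and  Wk,j += alpha * Ik * Dj
        for j in range(len(W_out)):
            W_out[j] += alpha * a_hidden[j] * D_out
            for k in range(len(I)):
                W_in[k][j] += alpha * I[k] * D_hidden[j]
        return output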
A neural network requires 2^n/n hidden units to represent all Boolean functions of n
inputs. For m training examples and W weights, each epoch in the learning process
takes O(mW) time; but in the worst case, the number of epochs can be exponential in
the number of inputs.
In general, if the number of hidden nodes is too large, the network may learn only the
training examples, while if the number is too small it may never converge on a set of
weights consistent with the training examples.
Multi-layer feed-forward networks can represent any continuous function with a
single hidden layer, and any function with two hidden layers [Cybenko, 1988, 1989].
John Denker remarked that "neural networks are the second best way of doing just
about anything." They provide passable performance on a wide variety of problems
that are difficult to solve well using other methods.
NETtalk [Sejnowski and Rosenberg, 1987] was designed to learn how to pronounce
written text. Input was a seven-character window of text centered on the target character, and
output was a set of Booleans controlling the form of the sound to be produced. It
learned 95% accuracy on its training set, but had only 78% accuracy on the test set.
Not spectacularly good, but important because it impressed many people with the
potential of neural networks.
Other applications include a ZIP code recognition [Le Cun et al., 1989] system that
achieves 99% accuracy on handwritten codes, and driving [Pomerleau, 1993] in the
ALVINN system at CMU. ALVINN controls the NavLab vehicles, and translates inputs
from a video image into steering control directions. ALVINN performs exceptionally
well on the particular road-type
it learns, but poorly on other terrain types. The extended MANIAC system [Jochem et
al., 1993] has multiple ALVINN subnets combined to handle different road types.
8. Describe continuous planning. (Apr/May'14) (Nov'15)
9. Explain Multi-Agent Planning in detail.
Agents can be divided into different types ranging from simple to complex. Some categories
suggested to define these types include:
Passive agents or agent without goals (like obstacle, apple or key in any simple
simulation)
Active agents with simple goals (like birds in flocking, or wolf-sheep in a prey-predator
model)
Cognitive agents (which contain complex calculations)
Virtual Environment
Discrete Environment
Continuous Environment
Agent environments can also be organized according to various properties like: accessibility
(depending on if it is possible to gather complete information about the environment),
determinism (if an action performed in the environment causes a definite effect), dynamics (how
many entities influence the environment in the moment), discreteness (whether the number of
possible actions in
the environment is finite), episodicity (whether agent actions in certain time periods influence
other periods), and dimensionality (whether spatial characteristics are important factors of the
environment and the agent considers space in its decision making). Agent actions in the
environment are typically mediated via an appropriate middleware. This middleware offers a first-
class design abstraction for multi-agent systems, providing means to govern resource access and
agent coordination.
Characteristics
QUESTION BANK & UNIVERSITY QUESTIONS
1. Write about:
a. Partial Order Planning
b. Conditional Planning (Nov'15)(Dec’17) (Pg.No. 30, Q.No. 7)
2. Discuss the syntax and semantics of associate network.
3. Briefly explain the basic aspects of planning. (Nov'13) (Pg.No. 10, Q.No. 1)
4. (a) Is there any connection between learning and planning methods? Explain. (8)
(b) Explain the components of a replanning agent.
5. Describe how Decision Trees could be used for inductive learning. Explain its
effectiveness with a suitable example. (Pg.No. 19, Q.No. 3)
6. Discuss Partial Order Planning in detail with an example.
7. Explain the process of planning with state space search with an example.
(Apr'15)(April’18) (Pg.No. 10, Q.No. 1)
8. Explain decision tree learning in detail with suitable examples. (Apr'15)
9. Discuss in detail about genetic algorithm with an example. (Nov'15) (Apr/May'16)
(Apr’18)
10. (a) What is a genetic algorithm? What are the advantages and disadvantages of genetic algorithms? (Nov'15)(Dec'17)
(b) Describe the three major issues affected in the design of a learning agent.
(Nov'15)
11. (a) What is learning? Explain learning by problem solving with suitable
examples. (Pg.No. 16, Q.No. 2)
(b) Explain Rote learning in detail. (Nov/Dec'14)
12. Elaborate on Multi agent planning. (Nov/Dec'14)(Apr-May-16)
13. Discuss in detail about neural networks. (Pg.No. 26, Q.No. 6)
14. Discuss about learning decision trees. (Nov'13) (Apr/May'14)
15. Write about: (Apr/May'14) (Nov'15)
a. Continuous Planning (Pg.No. 32, Q.No. 8)
b. Conditional Planning (Pg.No. 30, Q.No. 7)
16. Explain about partial order planning algorithm with an example. (Nov'18)