AI Unit 2 Continued
We extend classical search algorithms (e.g., BFS, DFS, A*) to handle complex
scenarios such as optimization problems, nondeterministic actions, partial
observability, and unknown environments. Key focus areas include local
search, stochastic environments, and online search
In many problems, however, the path to the goal is irrelevant. For example, in the
8-queens problem, what matters is the final configuration of queens, not the
order in which they are added. If the path to the goal does not matter, we might
consider a different class of algorithms, ones that do not worry about paths at all.
Local search algorithms operate using a single current node (rather than multiple
paths) and generally move only to neighbors of that node. Typically, the paths
followed by the search are not retained. Although local search algorithms are not
systematic, they have two key advantages:
(1) they use very little memory—usually a constant amount
(2) they can often find reasonable solutions in large or infinite (continuous) state
spaces for which systematic algorithms are unsuitable
In addition to finding goals, local search algorithms are useful for solving pure
optimization problems, in which the aim is to find the best state according to an
objective function
If elevation corresponds to cost, then the aim is to find the lowest valley—a global
minimum; if elevation corresponds to an objective function, then the aim is to
find the highest peak—a global maximum. Local search algorithms explore this
state-space landscape. A complete local search algorithm always finds a goal if
one exists; an optimal algorithm always finds a global minimum/maximum
Performance on 8-Queens
Hill climbing is sometimes called greedy local search because it grabs a good
neighbor state without thinking ahead about where to go next. Unfortunately, hill
climbing often gets stuck for the following reasons: local maxima (peaks that are
higher than each of their neighbors but lower than the global maximum), ridges,
and plateaux (flat areas of the landscape).
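A minimal sketch of steepest-ascent hill climbing on the 8-queens problem, assuming the usual complete-state formulation (one queen per column, state[i] = row of the queen in column i); the helper names here are illustrative, not from the text:

import random

def conflicts(state):
    """Number of pairs of queens attacking each other.
    state[i] = row of the queen in column i."""
    n = len(state)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_row = state[i] == state[j]
            same_diag = abs(state[i] - state[j]) == abs(i - j)
            if same_row or same_diag:
                count += 1
    return count

def hill_climb(n=8):
    """Steepest-ascent hill climbing; returns a state and its conflict count.
    May stop at a local minimum where conflicts > 0."""
    state = [random.randrange(n) for _ in range(n)]
    while True:
        current = conflicts(state)
        best_move, best_value = None, current
        for col in range(n):
            for row in range(n):
                if row == state[col]:
                    continue
                neighbour = state[:]
                neighbour[col] = row
                value = conflicts(neighbour)
                if value < best_value:
                    best_move, best_value = (col, row), value
        if best_move is None:          # no neighbour improves: local minimum
            return state, current
        state[best_move[0]] = best_move[1]

state, cost = hill_climb()
print(state, "conflicts:", cost)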
Hill-Climbing Variants
Pure Hill Climbing is incomplete — it gets stuck at local maxima and never moves
“downhill” (to worse states). Random Walk is complete — it eventually finds a
solution by exploring freely, but it’s very inefficient.
Therefore, it seems reasonable to try to combine hill climbing with a random walk
in some way that yields both efficiency and completeness. Simulated annealing is
such an algorithm
In metallurgy, annealing is the process used to temper or harden metals and glass
by heating them to a high temperature and then gradually cooling them, thus
allowing the material to reach a low-energy crystalline state.
Simulated annealing was first used extensively to solve VLSI layout problems in
the early 1980s. It has been applied widely to factory scheduling and other large-
scale optimization tasks.
Simulated annealing may perform better in landscapes with many deceptive local
optima. While random-restart hill climbing resets completely, simulated
annealing continuously explores, making better use of the current search
trajectory.
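A short, generic sketch of simulated annealing for minimization, assuming user-supplied cost and neighbour functions and a simple geometric cooling schedule (the schedule and parameter values are illustrative, not prescribed by the text):

import math, random

def simulated_annealing(initial, cost, neighbour,
                        t0=1.0, cooling=0.995, t_min=1e-4):
    """Minimize cost(state). Worse moves are accepted with probability
    exp(-delta / T), so the search can escape local minima while T is high."""
    state = initial
    t = t0
    while t > t_min:
        candidate = neighbour(state)
        delta = cost(candidate) - cost(state)
        if delta < 0 or random.random() < math.exp(-delta / t):
            state = candidate
        t *= cooling                       # gradually "cool" the system
    return state

# Example: minimize a bumpy 1-D function with many local minima.
f = lambda x: x * x + 3 * math.sin(5 * x)
result = simulated_annealing(
    initial=5.0, cost=f,
    neighbour=lambda x: x + random.uniform(-0.5, 0.5))
print(result, f(result))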
The local beam search algorithm keeps track of k states rather than just one. It
begins with k randomly generated states. At each step, all the successors of all k
states are generated. If any one is a goal, the algorithm halts. Otherwise, it selects
the k best successors from the complete list and repeats.
A local beam search with k states might seem to be nothing more than running k
random restarts in parallel instead of in sequence. In a random-restart search,
each search process runs independently of the others. In a local beam search,
useful information is passed among the parallel search threads. The algorithm
quickly abandons unfruitful searches and moves its resources to where the most
progress is being made.
In its simplest form, local beam search can suffer from a lack of diversity among
the k states—they can quickly become concentrated in a small region of the state
space, making the search little more than an expensive version of hill climbing
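A compact sketch of local beam search, under the assumption that a higher value() is better; the function signatures (random_state, successors, value, is_goal) are illustrative:

def local_beam_search(k, random_state, successors, value, is_goal,
                      max_steps=1000):
    """Keep the k best states found so far; all successors of all k states
    compete for the next generation (higher value is better)."""
    beam = [random_state() for _ in range(k)]
    for _ in range(max_steps):
        for s in beam:
            if is_goal(s):
                return s
        pool = [child for s in beam for child in successors(s)]
        if not pool:
            break
        pool.sort(key=value, reverse=True)
        beam = pool[:k]            # information flows between the k searches
    return max(beam, key=value)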
Like beam searches, genetic algorithms (GAs) begin with a set of k randomly
generated states, called the population. Each state, or individual, is represented
as a string over a finite alphabet—most commonly, a string of 0s and 1s.
Step-by-step Process (single-point crossover of two 8-queens states, each encoded as a string of eight digits giving the row of the queen in each column):
Parent 1: 32752411
Parent 2: 24748552
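A small sketch of single-point crossover (plus a simple mutation operator) applied to the two parent strings above; the crossover point of 3 and the mutation rate are arbitrary choices for illustration:

import random

def crossover(p1, p2, point=None):
    """Single-point crossover: each child takes a prefix from one parent
    and the remaining suffix from the other."""
    if point is None:
        point = random.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(individual, alphabet="12345678", rate=0.1):
    """With small probability, replace each digit by a random one."""
    return "".join(c if random.random() > rate else random.choice(alphabet)
                   for c in individual)

c1, c2 = crossover("32752411", "24748552", point=3)
print(c1, c2)   # 32748552 and 24752411: each child keeps a block of one parent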
Like stochastic beam search, genetic algorithms combine an uphill tendency with
random exploration and exchange of information among parallel search threads.
The primary advantage, if any, of genetic algorithms comes from the crossover
operation. For example, it could be that putting the first three queens in positions
2, 4, and 6 (where they do not attack each other) constitutes a useful block that
can be combined with other blocks to construct a solution
Schema Theory
Only first-choice hill climbing and simulated annealing can naturally handle
continuous spaces.
Local search methods suffer from local maxima, ridges, and plateaux in
continuous state spaces just as much as in discrete spaces. Random restarts and
simulated annealing can be used and are often helpful. High-dimensional
continuous spaces are, however, big places in which it is easy to get lost.
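A sketch of first-choice hill climbing in a continuous space: generate random perturbations of the current point and accept the first one that improves the objective (the step size, stopping rule, and example objective are illustrative assumptions):

import random

def first_choice_hill_climb(x, f, step=0.1, max_tries=100, max_steps=1000):
    """Maximize f over a vector x of real numbers by accepting the first
    randomly generated neighbour that improves the objective."""
    for _ in range(max_steps):
        for _ in range(max_tries):
            candidate = [xi + random.gauss(0, step) for xi in x]
            if f(candidate) > f(x):
                x = candidate
                break
        else:                       # no improving neighbour found: stop
            return x
    return x

# Example: a concave objective with its maximum at (1, 2).
g = lambda v: -((v[0] - 1) ** 2 + (v[1] - 2) ** 2)
print(first_choice_hill_climb([0.0, 0.0], g))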
Constrained Optimization
Some problems have constraints (e.g., airports must be inside Romania and
on land).
Types:
1. Linear Programming (LP):
Constraints are linear inequalities.
Objective function is linear.
Constraint set forms a convex set.
Solvable in polynomial time.
It is a special case of convex optimization (a small example follows this list).
2. Convex Optimization:
More general than LP.
Constraints form any convex region.
Objective function is convex within constraint region.
Under certain conditions, convex optimization problems are
also polynomially solvable and may be feasible in practice with
thousands of variables
Several important problems in machine learning and control
theory can be formulated as convex optimization problems
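A tiny linear program as an illustration, assuming SciPy is available; scipy.optimize.linprog minimizes a linear objective subject to linear inequality constraints, so a maximization problem is expressed by negating the objective:

from scipy.optimize import linprog

# Maximize 3x + 2y subject to x + y <= 4, x <= 2, x >= 0, y >= 0.
# linprog minimizes, so the objective coefficients are negated.
res = linprog(c=[-3, -2],
              A_ub=[[1, 1], [1, 0]],
              b_ub=[4, 2],
              bounds=[(0, None), (0, None)])
print(res.x, -res.fun)    # optimal point (2, 2) and objective value 10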
Algorithm Summary:
Cyclic Solutions –
A cyclic solution is a plan that includes loops — meaning, the agent may repeat
certain actions or return to earlier states until a goal is achieved.
Example: In the slippery vacuum world, Move Right might leave the agent
in the same position.
No finite sequence of actions guarantees the goal.
So we need to retry until we get the right result.
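A minimal sketch of a cyclic solution for the slippery Move Right example: the agent simply repeats the action, observing the result each time, until the goal is reached (the 0.7 success probability and the grid of positions are arbitrary illustrations):

import random

def slippery_right(position):
    """Illustrative nondeterministic Move Right: sometimes the wheels slip
    and the agent stays where it is."""
    return position + 1 if random.random() < 0.7 else position

def move_right_until_at(target, position=0):
    """Cyclic plan: keep executing the action and observing the result
    until the goal position is reached."""
    while position < target:
        position = slippery_right(position)   # act, then observe
    return position

print(move_right_until_at(3))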
When the agent's percepts provide no information at all, we have what is called a
sensorless problem (sometimes a conformant problem). Sensorless agents can
be surprisingly useful, primarily because they don't rely on sensors working
properly; also, sensing can be quite expensive.
In conformant planning, you're not always sure about the initial state. So, you
work with belief states, which are sets of possible world states.
👉 This helps prune the search space: if a plan works for a belief state, it also
works for any subset of that belief state, so you can skip belief states that are
"less general" (i.e., subsets) of ones you've already solved.
"Build a solution that works for all initial states, checking one at a time."
Step 1: Prediction
The agent first predicts the belief state that results from its action: the set of
all states that could be reached by applying the action in any state of the
current belief state.
Step 2: Possible Percepts
After acting, the agent gets a percept (e.g., “I see a wall” or “I hear a beep”).
But since the predicted belief state contains many possible world states, we consider:
o What are all the possible percepts I could get from this belief state?
This gives us a list of possible observations, depending on which actual
state the agent was in.
Step 3: Update
Once the actual percept is received, the agent filters the belief state.
It keeps only the states that could have produced that percept.
This shrinks the belief state → the agent becomes more certain about where
it might be.
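A minimal sketch of the three belief-state steps above, representing a belief state as a Python set of world states; the tiny world model at the end (locations 1 to 4, a possibly slipping Right action, a "wall" percept only at location 4) is purely illustrative:

def predict(belief, action, results):
    """Step 1: states the agent might be in after taking the action."""
    return {s2 for s in belief for s2 in results(s, action)}

def possible_percepts(belief, percept_of):
    """Step 2: every percept some state in the belief state could produce."""
    return {percept_of(s) for s in belief}

def update(belief, percept, percept_of):
    """Step 3: keep only the states consistent with the observed percept."""
    return {s for s in belief if percept_of(s) == percept}

# Illustrative world: locations 1..4, Right may slip and leave the agent
# where it is; the agent perceives a wall only at location 4.
results = lambda s, a: {s, min(s + 1, 4)} if a == "Right" else {s}
percept_of = lambda s: "wall" if s == 4 else "clear"

b = predict({1, 2, 3}, "Right", results)      # {1, 2, 3, 4}
print(possible_percepts(b, percept_of))       # {'clear', 'wall'}
print(update(b, "wall", percept_of))          # {4}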
Solving partially observable problems
Use AND–OR search on belief-state space.
The solution is a conditional plan based on updated belief states, not actual
states.
Optimization techniques like subset checking and incremental search also apply
here.
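A compact sketch of AND–OR search for nondeterministic problems, returning an acyclic conditional plan as a list [action, subplans], where subplans maps each possible outcome state to its own plan; the problem interface (actions, results, is_goal) is an assumed illustration, and the search is started with or_search(initial_state, problem, ()):

def or_search(state, problem, path):
    """Return a conditional plan [action, subplans] from this state,
    or None if there is no acyclic plan."""
    if problem.is_goal(state):
        return []                              # empty plan: already at a goal
    if state in path:
        return None                            # avoid loops along this path
    for action in problem.actions(state):
        subplans = and_search(problem.results(state, action), problem,
                              (state,) + path)
        if subplans is not None:
            return [action, subplans]
    return None

def and_search(states, problem, path):
    """Every state the action might lead to needs its own sub-plan."""
    plans = {}
    for s in states:
        plan = or_search(s, problem, path)
        if plan is None:
            return None
        plans[s] = plan
    return plans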
Offline Search:
o Computes a complete solution before interacting with the
environment.
o Suitable for static, deterministic domains where computation time
isn't a penalty.
Online Search:
o Interleaves computation and action: Takes an action, observes the
result, then decides the next action.
o Beneficial in:
Dynamic or semidynamic domains: There is a time penalty for
long computations.
Nondeterministic domains: Allows agent to focus on actual
contingencies instead of unlikely ones.
Unknown environments: Agent learns by exploring
(experimentation).
Assumptions:
We compare the actual path cost against the optimal path cost (if the state space
were known). This ratio is called the competitive ratio; we would like it to be as
small as possible.
Limitations:
Dead ends: Irreversible actions can trap the agent. If some actions are
irreversible (i.e., they lead to a state from which no action leads back to
the previous state), the online search might accidentally reach a dead-end
state from which no goal state is reachable.
Adversary argument: An adversary can design state spaces that force the
agent into suboptimal decisions. To an online search algorithm that has
visited only states S and A, two such state spaces look identical, so it must
make the same decision in both; therefore, it will fail in one of them. This is
an example of an adversary argument.
Safely explorable environments:
Even with reversible actions, paths of unbounded cost can lead to arbitrarily bad efficiency.
Hence, algorithm performance is measured in terms of the size of the state space, not the depth of the shallowest goal.
Interleaving of planning and action means that online search algorithms are quite
different from offline search algorithms. Offline algorithms such as A∗ can
expand a node in one part of the space and then immediately expand a node in
another part of the space, because node expansion involves simulated rather
than real actions.
An online algorithm, on the other hand, can discover successors only for a node
that it physically occupies. To avoid traveling all the way across the tree to expand
the next node, it seems better to expand nodes in a local order. Depth-first search
has exactly this property because (except when backtracking) the next node
expanded is a child of the previous node expanded.
Online-DFS-Agent:
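A Python sketch in the spirit of the ONLINE-DFS-AGENT pseudocode: the agent records the observed result of each action, tries untried actions first, and physically backtracks when none remain. It assumes actions are reversible (a safely explorable space); the callable-agent interface here is an assumption:

class OnlineDFSAgent:
    """Online depth-first exploration: the agent can expand only the state it
    physically occupies, so it tries untried actions first and physically
    backtracks when none remain."""

    def __init__(self, actions, is_goal):
        self.actions = actions        # actions(state) -> list of actions
        self.is_goal = is_goal
        self.result = {}              # (state, action) -> observed next state
        self.untried = {}             # state -> actions not yet executed there
        self.unbacktracked = {}       # state -> predecessor states to revisit
        self.s, self.a = None, None   # previous state and action

    def __call__(self, state):
        """Called with the current state (the percept); returns the next
        action to execute, or None to stop."""
        if self.is_goal(state):
            return None
        if state not in self.untried:
            self.untried[state] = list(self.actions(state))
        if self.s is not None:
            self.result[(self.s, self.a)] = state
            self.unbacktracked.setdefault(state, []).insert(0, self.s)
        if self.untried[state]:
            action = self.untried[state].pop()
        elif self.unbacktracked.get(state):
            target = self.unbacktracked[state].pop(0)
            # pick an already-recorded action that leads back to `target`
            action = next(a for (s, a), nxt in self.result.items()
                          if s == state and nxt == target)
        else:
            return None               # nothing left to explore
        self.s, self.a = state, action
        return action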
Performance:
Limitations:
Random Walk:
Advantages:
Limitations:
Second, local search agents can acquire more accurate estimates of the cost of
each state by using local updating rules, as in LRTA∗. The updates eventually
converge to exact values for every state, provided that the agent explores the
state space in the right way. Once exact values are known, optimal decisions can
be taken simply by moving to the lowest-cost successor—that is, pure hill climbing
is then an optimal strategy.
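A sketch of an LRTA∗-style agent illustrating the local updating rule described above: H[s] holds the current cost-to-goal estimate, and after each move H of the previous state is reset to the best one-step-lookahead value. The agent interface and the optimistic default h(s) for untried actions follow the usual textbook presentation but are assumptions here:

class LRTAStarAgent:
    """LRTA*-style agent: H[s] holds the current cost-to-goal estimate of s.
    After each move, H of the previous state is updated from the best
    one-step lookahead, so the estimates improve with experience."""

    def __init__(self, actions, h, step_cost, is_goal):
        self.actions = actions        # actions(state) -> list of actions
        self.h = h                    # initial heuristic estimate h(state)
        self.step_cost = step_cost    # step_cost(s, a, s2) -> action cost
        self.is_goal = is_goal
        self.H = {}                   # learned cost-to-goal estimates
        self.result = {}              # (state, action) -> observed next state
        self.s, self.a = None, None   # previous state and action

    def _cost(self, s, a):
        """Estimated cost of taking a in s; optimistic if never tried."""
        s2 = self.result.get((s, a))
        if s2 is None:
            return self.h(s)
        return self.step_cost(s, a, s2) + self.H[s2]

    def __call__(self, state):
        if self.is_goal(state):
            return None
        self.H.setdefault(state, self.h(state))
        if self.s is not None:
            self.result[(self.s, self.a)] = state
            # local updating rule: best one-step lookahead from the old state
            self.H[self.s] = min(self._cost(self.s, b)
                                 for b in self.actions(self.s))
        action = min(self.actions(state), key=lambda b: self._cost(state, b))
        self.s, self.a = state, action
        return action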