Admissibility, Consistency, and Dominance of Heuristics

Admissibility and Consistency

Now that we’ve discussed heuristics and how they are applied in both greedy and A* search, let’s spend
some time discussing what constitutes a good heuristic. To do so, let’s first reformulate the methods
used for determining priority queue ordering in UCS, greedy search, and A* with the following
definitions:

• g(n) - The function representing total backwards cost computed by UCS.
• h(n) - The heuristic value function, or estimated forward cost, used by greedy search.
• f(n) - The function representing estimated total cost, used by A* search. f(n) = g(n) + h(n).
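As a quick illustration, here is a minimal Python sketch of the priority each of these algorithms would assign to a node when ordering its fringe (the arguments g and h are assumed to be functions supplied by the caller, not part of any particular implementation):

def ucs_priority(g, h, node):
    """UCS orders the fringe by total backwards cost g(n) alone."""
    return g(node)

def greedy_priority(g, h, node):
    """Greedy search orders the fringe by estimated forward cost h(n) alone."""
    return h(node)

def astar_priority(g, h, node):
    """A* orders the fringe by estimated total cost f(n) = g(n) + h(n)."""
    return g(node) + h(node)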

Before attacking the question of what constitutes a "good" heuristic, we must first answer the question
of whether A* maintains its properties of completeness and optimality regardless of the heuristic
function we use. Indeed, it’s very easy to find heuristics that break these two coveted properties. As an
example, consider the heuristic function h(n) = 1 − g(n). Regardless of the search problem, using this
heuristic yields

f(n) = g(n) + h(n)
     = g(n) + (1 − g(n))
     = 1

Hence, such a heuristic reduces A* search to BFS, where all edge costs are equivalent. As we’ve already
shown, BFS is not guaranteed to be optimal in the general case where edge weights are not constant.

The condition required for optimality when using A* tree search is known as admissibility. The admissibility constraint states that the value estimated by an admissible heuristic is neither negative nor an overestimate. Defining h∗(n) as the true optimal forward cost to reach a goal state from a given node n, we can formulate the admissibility constraint mathematically as follows:


∀n, 0 ≤ h(n) ≤ h∗(n)
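As a small sketch of what this constraint looks like in code, suppose we already know the true optimal forward cost h∗(n) for every node of a small test problem (the arguments below are assumed placeholders); then admissibility can be checked node by node:

def is_admissible(h, h_star, nodes):
    """Check 0 <= h(n) <= h*(n) for every node n.

    h, h_star: functions mapping a node to its estimated / true forward cost.
    nodes: an iterable of every node to check (assumed finite and small).
    """
    return all(0 <= h(n) <= h_star(n) for n in nodes)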

Theorem. For a given search problem, if the admissibility constraint is satisfied by a heuristic function h,
using A* tree search with h on that search problem will yield an optimal solution.

Proof. Assume two reachable goal states are located in the search tree for a given search problem, an
optimal goal A and a suboptimal goal B. Some ancestor n of A (including perhaps A itself) must currently
be on the fringe, since A is reachable from the start state. We claim n will be selected for expansion
before B, using the following three statements:

1. g(A) < g(B). Because A is given to be optimal and B is given to be suboptimal, we can conclude
that A has a lower backwards cost to the start state than B.
2. h(A) = h(B) = 0, because we are given that our heuristic satisfies the admissibility constraint. Since both A and B are goal states, the true optimal cost to a goal state from A or B is simply h∗(n) = 0; hence 0 ≤ h(n) ≤ 0.
3. f(n) ≤ f(A), because, through admissibility of h, f(n) = g(n) + h(n) ≤ g(n) + h∗(n) = g(A) = f(A). The total cost through node n is at most the true backwards cost of A, which is also the total cost of A.

We can combine statements 1. and 2. to conclude that f(A) < f(B) as follows:

f(A) = g(A) + h(A) = g(A) < g(B) = g(B) + h(B) = f(B)

A simple consequence of combining the above derived inequality with statement 3. is the following:

f(n) ≤ f(A) ∧ f(A) < f(B) ⟹ f(n) < f(B)

Hence, we can conclude that n is expanded before B. Because we have proven this for arbitrary n, we
can conclude that all ancestors of A (including A itself) expand before B. ∎

One problem we found above with tree search was that in some cases it could fail to ever find a
solution, getting stuck searching the same cycle in the state space graph infinitely. Even in situations
where our search technique doesn’t involve such an infinite loop, it’s often the case that we revisit the
same node multiple times because there are multiple ways to get to that same node. This leads to
exponentially more work, and the natural solution is to simply keep track of which states you’ve already
expanded, and never expand them again. More explicitly, maintain a "closed" set of expanded nodes
while utilizing your search method of choice. Then, ensure that each node isn’t already in the set before
expansion and add it to the set after expansion if it’s not. Tree search with this added optimization is
known as graph search; a sketch of the procedure is given below.
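This is a minimal Python sketch, assuming a problem object exposing start, is_goal(state), and successors(state) (yielding (next_state, edge_cost) pairs) together with a heuristic h; these names are placeholders rather than any library's actual API:

import heapq
import itertools

def astar_graph_search(problem, h):
    """A* graph search with a closed set of already-expanded states.

    Assumed problem interface (placeholder names):
      problem.start              -- the start state
      problem.is_goal(state)     -- True if state is a goal state
      problem.successors(state)  -- iterable of (next_state, edge_cost) pairs
    h(state) is the heuristic, an estimate of the forward cost to a goal.
    Returns a list of states from start to a goal, or None if no goal is found.
    """
    tie = itertools.count()  # tie-breaker so states never need to be compared
    start = problem.start
    fringe = [(h(start), next(tie), 0, start, [start])]  # (f, tie, g, state, path)
    closed = set()  # states that have already been expanded

    while fringe:
        f, _, g, state, path = heapq.heappop(fringe)
        if problem.is_goal(state):
            return path
        if state in closed:
            continue  # this state was already expanded earlier
        closed.add(state)
        for next_state, edge_cost in problem.successors(state):
            if next_state not in closed:
                new_g = g + edge_cost
                new_f = new_g + h(next_state)
                heapq.heappush(fringe, (new_f, next(tie), new_g,
                                        next_state, path + [next_state]))
    return None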

Note that in implementation, it's critically important to store the closed set as a set (e.g., a hash set) and not a list. Storing it as a list requires O(n) operations to check for membership, which eliminates the
performance improvement graph search is intended to provide. An additional caveat of graph search is
that it tends to ruin the optimality of A*, even under admissible heuristics. Consider the following simple
state space graph and corresponding search tree, annotated with weights and heuristic values:

In the above example, it's clear that the optimal route is to follow S → A → C → G, yielding a total path cost of 1 + 1 + 3 = 5. The only other path to the goal, S → B → C → G, has a path cost of 1 + 2 + 3 = 6. However,
because the heuristic value of node A is so much larger than the heuristic value of node B, node C is first
expanded along the second, suboptimal path as a child of node B. It’s then placed into the "closed" set,
and so A* graph search fails to reexpand it when it visits it as a child of A, so it never finds the optimal
solution. Hence, to maintain completeness and optimality under A* graph search, we need an even
stronger property than admissibility: consistency. The central idea of consistency is that we enforce not
only that a heuristic underestimates the total distance to a goal from any given node, but also the
cost/weight of each edge in the graph. The cost of an edge as measured by the heuristic function is
simply the difference in heuristic values for two connected nodes. Mathematically, the consistency
constraint can be expressed as follows:

∀A,C h(A) − h(C) ≤ cost(A,C)
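As a sketch, for a small graph whose edges are listed explicitly (the (a, c, cost) triples below are an assumed encoding of the graph, not a standard format), the constraint is a per-edge check:

def is_consistent(h, edges):
    """Check h(A) - h(C) <= cost(A, C) for every directed edge (A, C).

    h: function mapping a node to its heuristic value.
    edges: iterable of (a, c, cost) triples, one per edge in the graph.
    """
    return all(h(a) - h(c) <= cost for a, c, cost in edges)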

Theorem. For a given search problem, if the consistency constraint is satisfied by a heuristic function h,
using A* graph search with h on that search problem will yield an optimal solution.

Proof. In order to prove the above theorem, we first prove that when running A* graph search with a
consistent heuristic, whenever we remove a node for expansion, we’ve found the optimal path to that
node. Using the consistency constraint, we can show that the values of f(n) for nodes along any plan are
nondecreasing. Define two nodes, n and n′, where n′ is a successor of n. Then:

f(n′) = g(n′) + h(n′)
      = g(n) + cost(n, n′) + h(n′)
      ≥ g(n) + h(n)
      = f(n)
If for every parent-child pair (n, n′) along a path, f(n′) ≥ f(n), then it must be the case that the values of f(n) are nondecreasing along that path. We can check that the above graph violates this rule between f(A) and f(C). With this information, we can now show that whenever a node n is removed for expansion,
its optimal path has been found. Assume towards a contradiction that this is false - that when n is
removed from the fringe, the path found to n is suboptimal. This means that there must be some
ancestor of n, n′′, on the fringe that was never expanded but is on the optimal path to n. Contradiction!
We’ve already shown that values of f along a path are nondecreasing, and so n′′ would have been
removed for expansion before n.

All we have left to show to complete our proof is that an optimal goal A will always be removed for
expansion and returned before any suboptimal goal B. This is trivial, since h(A) = h(B) = 0, so

f(A) = g(A) < g(B) = f(B)

just as in our proof of optimality of A* tree search under the admissibility constraint. Hence, we can
conclude that A* graph search is optimal under a consistent heuristic. ∎

A couple of important highlights from the discussion above before we proceed: for a heuristic to be either admissible or consistent, it must by definition be the case that h(G) = 0 for any goal state G. Additionally, consistency is not just a stronger constraint than admissibility; consistency implies admissibility. This stems simply from the fact that if no edge costs are overestimated (as guaranteed by consistency), the total estimated cost from any node to a goal will also fail to be an overestimate.

Consider the following three-node network for an example of an admissible but inconsistent heuristic:

The red dotted line corresponds to the total estimated goal distance. If h(A) = 4, then the heuristic is admissible, as the true distance from A to the goal is 4 ≥ h(A), and similarly h(C) = 1 ≤ 3, the true distance from C to the goal. However, the heuristic cost from A to C is h(A) − h(C) = 4 − 1 = 3. Our heuristic estimates the cost of the edge between A and C to be 3 while the true value is cost(A,C) = 1, a smaller value. Since h(A) − h(C) > cost(A,C), this
heuristic is not consistent. Running the same computation for h(A) = 2, however, yields h(A) − h(C) = 2 −
1 = 1 ≤ cost(A,C). Thus, using h(A) = 2 makes our heuristic consistent.
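The same computation can be written out as a short check (the dictionaries below are simply an assumed encoding of the three-node example, with cost(A,C) = 1, cost(C,G) = 3, and h(G) = 0):

# The three-node example: A --1--> C --3--> G, with h(G) = 0.
edges = [("A", "C", 1), ("C", "G", 3)]
h_star = {"A": 4, "C": 3, "G": 0}  # true optimal forward costs

def check(h):
    admissible = all(0 <= h[n] <= h_star[n] for n in h_star)
    consistent = all(h[a] - h[c] <= cost for a, c, cost in edges)
    return admissible, consistent

print(check({"A": 4, "C": 1, "G": 0}))  # (True, False): admissible but not consistent
print(check({"A": 2, "C": 1, "G": 0}))  # (True, True): lowering h(A) to 2 restores consistency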

Dominance

Now that we’ve established the properties of admissibility and consistency and their roles in maintaining
the optimality of A* search, we can return to our original problem of creating "good" heuristics, and
how to tell if one heuristic is better than another. The standard metric for this is that of dominance. If
heuristic a is dominant over heuristic b, then the estimated goal distance for a is at least as great as the estimated goal distance for b for every node in the state space graph. Mathematically,

∀n : h_a(n) ≥ h_b(n)
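Written as a check (with nodes an assumed iterable of every state in the graph):

def dominates(h_a, h_b, nodes):
    """True if h_a dominates h_b, i.e. h_a(n) >= h_b(n) for every node n."""
    return all(h_a(n) >= h_b(n) for n in nodes)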

Dominance very intuitively captures the idea of one heuristic being better than another: if one admissible/consistent heuristic is dominant over another, it must be better because it will always more closely estimate the distance to a goal from any given state. Additionally, the trivial heuristic is defined as h(n) = 0, and using it reduces A* search to UCS. All admissible heuristics dominate the trivial heuristic. The trivial heuristic is often incorporated at the base of a semi-lattice for a search problem, a dominance hierarchy at whose bottom it sits. Below is an example of a semi-lattice that incorporates various heuristics h_a, h_b, and h_c, ranging from the trivial heuristic at the bottom to the exact goal distance at the top:

As a general rule, the max function applied to multiple admissible heuristics will also always be admissible. This is simply a consequence of all values output by the heuristics for any given state being constrained by the admissibility condition, 0 ≤ h(n) ≤ h∗(n). The maximum of numbers in this range must also fall in the same range. The same can be shown easily for multiple consistent heuristics as well. It's common practice to generate multiple admissible/consistent heuristics for any given search problem and compute the max over the values output by them to generate a heuristic that dominates (and hence is better than) all of them individually.
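A sketch of that practice in Python, combining any number of heuristic functions (each assumed admissible or consistent) into a single heuristic by taking the pointwise maximum:

def max_heuristic(*heuristics):
    """Combine heuristics by taking the pointwise maximum.

    If every input heuristic is admissible (or consistent), so is the
    result, and it dominates each of the inputs individually.
    """
    return lambda n: max(h(n) for h in heuristics)

# Hypothetical usage, assuming h_a and h_b are admissible heuristics:
#   h = max_heuristic(h_a, h_b)
#   path = astar_graph_search(problem, h)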
