Dr. Mark Humphrys

School of Computing. Dublin City University.

Home      Blog      Teaching      Research      Contact

Search:

CA216      CA249      CA318

CA400      CA651      CA668


More Heuristic Search


Distance from start

In many problems, we are interested not just in any solution but in the shortest path to a solution.
The evaluation function f of state n would incorporate distance from start:

f(n) = g(n) + h(n)
where g(n) = actual distance from start state to state n
h(n) = heuristic estimated distance from n to goal

In this case, lowest f is best.



Algorithm A - Best-first search with distance from start

If you implement best-first with distance from start you get a best-first which has more of a breadth-first flavour.



Example of Algorithm A

8-puzzle.
Heuristic h(n) = no. of tiles out of place.

Luger Fig 4.15 (See goal state at bottom)
e.g. middle choice has 2,8,1 out of place, h(n) = 3, g(n) = 1, f(n) = 4


How this would direct search:

Image courtesy of Ralph Morelli.
See also Luger Fig 4.16

Exercise: Put in all the h values.

Progression of OPEN and CLOSED: See Luger Notes



4.3 Admissibility

We said designing heuristics is an empirical problem. How do we study when one heuristic is better than another?

A heuristic is said to be admissible if it provably finds the (joint) shortest path to the goal where it exists.
Breadth-first search is admissible, but impractical.



Algorithm A

Algorithm A: Best-first heuristic search with this heuristic evaluation function:

f(n) = g(n) + h(n)
where g(n) = distance from start state to state n along current, known path (haven't yet searched other paths, may find there are multiple different ways to get to n)
h(n) = heuristic estimated distance from n to goal

Consider:

f*(n) = g*(n) + h*(n)
where g*(n) = minimum distance from start state to state n along any path
h*(n) = actual, minimum distance from n to goal along any path

Best-first search with f* is admissible.
But we do not know f* (it would be an "oracle").
We want f to be as close to f* as possible.

Note that g(n) may be too high - it may not have found the shortest path to n.

For all A algorithms: g(n) ≥ g*(n) for all n.



A* algorithms

A* search algorithm

An A* algorithm is algorithm A where h(n) ≤ h*(n) for all n.

i.e. All states have a worse distance to goal than predicted, or equal, but never better.
h is always too optimistic.
h(n) = 0 for all n, for example.

Theorem: All A* algorithms are admissible.
Proof: See here.
Roughly: When A* terminates, it has by definition found a path whose actual cost is lower than the estimated cost of any path through any unsearched node. But since those estimates are all optimistic (real cost will be equal or higher), A* can ignore those nodes.


Discussion

Q. But this means you just set h(n) = 0 for all n, and you have an admissible algorithm that will guarantee to find shortest path!
A. Exactly! You have just defined breadth-first search. f(n) = g(n) + 0.
Breadth-first is an A* algorithm. Problem is it takes too long.

The trick is to find a non-zero h(n) that will prune the search, but that is still always an underestimate.

The states searched by any A* algorithm will be a subset of the states searched by breadth-first.


Example

8-puzzle.
Examples of A* algorithms:
  1. h(n) = 0
  2. h(n) = no. of pieces out of place (will need this no. of moves or more)

Shows that it may not be hard to find a non-zero h(n) that is still always optimistic. Then you have an algorithm guaranteed to find optimal path (eventually).
Obviously some h(n)'s are better (weed more states) than others.



4.3.3 Informedness

When is one heuristic better than another?

Imagine we have two A* heuristics.
h1(n) ≤ h*(n) for all n.
h2(n) ≤ h*(n) for all n.

If h1(n) ≤ h2(n) for all n, we say that heuristic h2 is more informed (i.e. better) than h1

But still need h2(n) ≤ h*(n) for all n.



Example

With 8-puzzle, where h1 = 0 (breadth-first), h2 = no. of tiles out of place:
h1(n) ≤ h2(n) ≤ h*(n) for all n
h2 is more informed (better) than h1



h2 searches only a subset of the states searched by h1
Image courtesy of Ralph Morelli.
See Luger Fig 4.18
h1 searches all states shown. h2 searches shaded area.

One can imagine a sequence of search spaces, each smaller than the previous one, converging on the optimal solution.



4.5 Complexity issues

Heuristic prunes state space.
But heuristic itself may take non-trivial time to compute, if it is complex.
Time to compute heuristic for every state may waste all the time saved by pruning the space. Trade-off.

Luger Fig 4.28
"Control strategy cost" = Cost of calculating heuristic.
"Rule application cost" = Number of states searched.

LHS - Simple heuristic. Quick to calculate heuristic. But doesn't prune much - huge space to search.
RHS - Good heuristic. Cuts space. But calculating heuristic takes time.
Trade-off - Find middle point where computational cost of problem solving is lowest.




Beam search

Beam search: Reduce space to be searched in best-first.
Fixed size OPEN list.
Only keep the n most promising (according to heuristic) unvisited states on OPEN.



Feeds      w2mind.org

On Internet since 1987.