Some Search Algorithms


General Search Algorithm

Russell and Norvig have an elegant definition of a general algorithm for solving search problems. This algorithm can be "instantiated" to express specific search strategies.

A problem is represented as a structure with components INITIAL-STATE, OPERATORS, GOAL-TEST, and PATH-COST-FUNCTION.
The search graph will use nodes with components STATE, PARENT-NODE, OPERATOR, DEPTH, and PATH-COST.
The representation of a state will be problem and search strategy dependent.

function General-Search(problem, QUEUING-FN) returns a solution, or failure
{
  nodes <-- MAKE-QUEUE(MAKE-NODE(INITIAL-STATE[problem]))
  loop {
    if nodes is empty then return failure
    node <-- REMOVE-FRONT(nodes)
    if GOAL-TEST[problem] applied to STATE(node) succeeds then return node
    nodes <-- QUEUING-FN(nodes, EXPAND(node, OPERATORS[problem]))
  }
}

Clearly this general definition cannot express every possible variation of a search algorithm. For example, it does not cover the case where a node is not fully expanded at once and instead only one additional successor is generated at a time. But it gives a good picture of the general strategy for searching.
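
As a concrete illustration, here is a minimal Python sketch of this scheme. The problem interface (initial_state, goal_test, successors) and the node fields are assumptions made for this example; the queuing function alone determines the strategy (FIFO gives breadth-first search, LIFO gives depth-first search).

from collections import deque
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Node:
    # node components as in the text: STATE, PARENT-NODE, OPERATOR, DEPTH, PATH-COST
    state: Any
    parent: Optional["Node"] = None
    operator: Any = None
    depth: int = 0
    path_cost: float = 0.0

def expand(node, problem):
    # problem.successors(state) is assumed to yield (operator, state, step_cost) triples
    return [Node(s, node, op, node.depth + 1, node.path_cost + c)
            for op, s, c in problem.successors(node.state)]

def general_search(problem, queuing_fn):
    nodes = deque([Node(problem.initial_state)])       # MAKE-QUEUE(MAKE-NODE(...))
    while nodes:                                       # empty queue means failure
        node = nodes.popleft()                         # REMOVE-FRONT
        if problem.goal_test(node.state):
            return node
        nodes = queuing_fn(nodes, expand(node, problem))
    return None                                        # failure

# Two possible queuing functions:
fifo = lambda nodes, new: deque(list(nodes) + list(new))   # breadth-first
lifo = lambda nodes, new: deque(list(new) + list(nodes))   # depth-first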

IDA*: Iterative Deepening A*

Iterative Deepening is used in Depth-First Search to limit storage requirements. Here the idea is applied to the A* algorithm (from Russell-Norvig):

function IDA*(problem) returns a solution sequence
  static: f-limit, the current f-Cost limit
          root, a node
{
  root <-- MAKE-NODE(INITIAL-STATE[problem])
  f-limit <-- f-Cost(root)
  loop {
    solution, f-limit <-- DFS-CONTOUR(root, f-limit)
    if solution is non-null then return solution
    if f-limit is INFINITE then return failure
  }
}

function DFS-CONTOUR(node, f-limit) returns a solution sequence and a new f-Cost limit
  static: next-f, the f-Cost limit for the next contour, initially INFINITE
{
  if f-Cost[node] > f-limit then return null, f-Cost[node]
  if GOAL-TEST[problem](STATE[node]) then return node, f-limit
  for each node s in SUCCESSORS(node) do {
    solution, new-f <-- DFS-CONTOUR(s, f-limit)
    if solution is non-null then return solution, f-limit
    next-f <-- MIN(next-f, new-f)
  }
  return null, next-f
}

IDA* is complete and optimal in the same sense as A*, while its memory requirement grows only linearly with the depth of the deepest path explored.
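
A minimal Python sketch of the same idea, assuming a problem object with initial_state, goal_test(state), successors(state) returning (state, step_cost) pairs, and a heuristic h(state); these names are illustrative, not part of the pseudocode above.

import math

def ida_star(problem):
    """Iterative deepening A*: repeated depth-first searches, each limited
    to the current f-cost contour."""
    def f(state, g):
        return g + problem.h(state)

    def dfs_contour(state, g, f_limit, path):
        # returns (solution path or None, f-limit for the next contour)
        if f(state, g) > f_limit:
            return None, f(state, g)
        if problem.goal_test(state):
            return path, f_limit
        next_f = math.inf
        for succ, step_cost in problem.successors(state):
            if succ in path:                     # skip cycles along the current path
                continue
            solution, new_f = dfs_contour(succ, g + step_cost, f_limit, path + [succ])
            if solution is not None:
                return solution, f_limit
            next_f = min(next_f, new_f)
        return None, next_f

    root = problem.initial_state
    f_limit = f(root, 0)
    while True:
        solution, f_limit = dfs_contour(root, 0, f_limit, [root])
        if solution is not None:
            return solution
        if f_limit == math.inf:
            return None                          # failure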

SMA*: Simplified Memory Bounded A*

Unlike IDA*, SMA* remembers between iterations not just the current f-cost limit but as many nodes as the available memory allows. The search is restricted to the nodes that can be reached from the root along a path whose nodes fit in memory, and it returns the best path among those restricted to these nodes. Again from Russell-Norvig:

function SMA*(problem) returns a solution sequence
  static: Queue, a queue of nodes ordered by f-cost
{
  Queue <-- MAKE-QUEUE(MAKE-NODE(INITIAL-STATE[problem]))
  loop {
    if Queue is empty then return failure
    n <-- deepest least-f-cost node in Queue
    if GOAL-TEST(n) then return success
    s <-- NEXT-SUCCESSOR(n)
    if s is not a goal and is at maximum depth then
      f(s) <-- INFINITY
    else
      f(s) <-- MAX(f(n), g(s)+h(s))
    if all of n's successors have been generated then
      update n's f-cost and those of its ancestors if necessary
    if SUCCESSORS(n) all in memory then remove n from Queue
    if memory is full then {
      delete shallowest, highest-f-cost node in Queue
      remove it from its parent's successor list
      insert its parent on Queue if necessary
    }
    insert s in Queue
  }
}
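
A full SMA* implementation needs careful bookkeeping for partially expanded and forgotten nodes, so the Python sketch below illustrates only a few of its ingredients: which node the queue selects for expansion, which node it evicts when memory is full, and how the evicted node's f-cost is backed up into its parent. The Node fields and function names are assumptions for this example, not a faithful SMA* implementation.

import math
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    state: str
    f: float
    depth: int
    parent: Optional["Node"] = None
    forgotten_f: float = math.inf     # best f among successors dropped from memory

def node_to_expand(queue: List[Node]) -> Node:
    # deepest node among those of least f-cost
    return min(queue, key=lambda n: (n.f, -n.depth))

def node_to_drop(queue: List[Node]) -> Node:
    # shallowest node among those of highest f-cost
    return max(queue, key=lambda n: (n.f, -n.depth))

def drop_when_full(queue: List[Node]) -> Node:
    """Remove the shallowest highest-f node, remember its f-value in its
    parent, and put the parent back on the queue if it is not there."""
    victim = node_to_drop(queue)
    queue.remove(victim)
    if victim.parent is not None:
        victim.parent.forgotten_f = min(victim.parent.forgotten_f, victim.f)
        if victim.parent not in queue:
            queue.append(victim.parent)   # the parent may regenerate the victim later
    return victim

# tiny demo
root = Node('root', f=5, depth=0)
a = Node('a', f=7, depth=1, parent=root)
b = Node('b', f=5, depth=1, parent=root)
queue = [a, b]
print(node_to_expand(queue).state)   # 'b' (least f)
print(drop_when_full(queue).state)   # 'a' (highest f)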

Simulated Annealing

Simulated Annealing search is an improvement upon hill-climbing (ascent and steepest-ascent) searches.

"Annealing" is defined as cooling a metal so that it reaches crystalline state, which is with minimal energy, and not a local energy minimum which leaves a metal in an amorphous state.
It appears that as the metal cools, instead of following a "Hill-Climbing" (or descending) algorithm towards a minimum state, at times it jumps around among energy levels going from low to high. This "nonsensical" move is characterised by the Delta between energy levels. As the metal cools, this Delta tends to become smaller. The Annealing Schedule is the ordered sequence (in time) of the temperatures used during cooling.

Mathematically, the Boltzmann Distribution specifies the probability P of being at energy level E when at temperature T:

  P = e^(-E/T) / Z

where Z is a normalization constant that makes P a probability distribution (non-negative, with sum 1).

At any given temperature, states with low energy are more probable than states with high energy. If DE is the increase in energy when moving from a low-energy state to a high-energy state, the probability of such a transition is

  e^(-DE/T)

Clearly, as T decreases, the probability of such energy jumps decreases.
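
For example, for an uphill move of DE = 1 the acceptance probability e^(-DE/T) shrinks quickly as T drops; a tiny Python check:

import math

for T in (10.0, 1.0, 0.1):
    print(T, math.exp(-1.0 / T))   # ~0.905, ~0.368, ~0.000045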

The Simulated Annealing Search algorithm adheres to these ideas:

SIMULATED-ANNEALING(AnnealingSchedule, Parameters, EndCondition, EquilibriumCondition)
;; Here Parameters play the role of the state
{
  Previous-Energy <-- Energy(Parameters)
  loop for next T in AnnealingSchedule while not EndCondition {
    loop until EquilibriumCondition {
      loop for each Parameter of the optimization problem {
        Trial-Parameters <-- Parameters with Parameter + random variation
        DEnergy <-- Energy(Trial-Parameters) - Previous-Energy
        if DEnergy <= 0 or Uniform-Rand(0,1) <= exp(-DEnergy/T) then {
          Parameters <-- Trial-Parameters
          Previous-Energy <-- Energy(Parameters)
        }
      }
    }
  }
  return Parameters (the current state)
}
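
The same scheme in runnable form: a small Python sketch that minimizes a toy one-dimensional energy function. The geometric cooling schedule, the step size, and the stopping temperature are arbitrary illustrative choices, not part of the algorithm above.

import math, random

def energy(x):
    return x**4 - 3 * x**2 + x          # global minimum near x = -1.3

def simulated_annealing(energy, x0, t0=10.0, cooling=0.95, steps_per_t=100, t_min=1e-3):
    x, e = x0, energy(x0)
    t = t0
    while t > t_min:                     # the annealing schedule
        for _ in range(steps_per_t):     # "equilibrium" at this temperature
            trial = x + random.gauss(0, 0.5)        # random variation of the parameter
            d_e = energy(trial) - e
            # accept downhill moves always, uphill moves with probability exp(-dE/T)
            if d_e <= 0 or random.random() <= math.exp(-d_e / t):
                x, e = trial, energy(trial)
        t *= cooling
    return x, e

print(simulated_annealing(energy, x0=2.0))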

And-Or Graphs

The search algorithms considered so far operate on graphs called Or-graphs, since from a node we can select any one of its successors. There are cases where a problem can be decomposed into two or more subproblems that can be solved independently and whose solutions, taken together, solve the original problem. In this case, instead of arcs going from a node to a single node, we have hyper-arcs, or connectors (k-connectors if they have k end points), that is, arcs with one source and a number of end points, each end point representing a part of the solution.

Solution Base G' of an and-or graph G:
It is a subgraph G' of G such that:
  1. G' contains the start node s of G
  2. if n is in G' then n is reachable from s within G'
  3. no node in G' is labeled UNSOLVABLE [this notion is discussed below]
  4. if a node is in G' then it has at most one outgoing connector in G'
Solution Graph G' of an and-or graph G:
It is a solution base G' of G whose leaf nodes are all goal nodes. When searching an and-or graph we try to extend solution bases into solution graphs.

Graph Evaluation function f1:
It is used to compare solution bases. It is usually just the cost of s, where s is the start node.
Node Evaluation Function f2:
It is used to compare nodes within a single solution base (for example, given a connector to both n1 and n2, should I expand n1 or n2 first?). It is usually the cost of n plus the cost of the path from s to n.
Heuristic Function h(n):
An estimate of the cost of an optimal solution graph with root n. The true optimum is h*(n).
Cost of a node n, c(n):
It is h(n) if n is a leaf; otherwise, if n1, n2, ..., nk are the endpoints of the outgoing connector of n, it is c(n1)+c(n2)+...+c(nk) plus the cost of the connector (usually assumed to be k).
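
A small Python sketch of this cost computation on a toy and-or graph. The graph, the heuristic values, and the choice of the cheapest connector when a node has several (as the best solution base would) are illustrative assumptions:

GRAPH = {
    'n': [('n1', 'n2')],     # one outgoing 2-connector
    'n1': [('n3',)],         # one outgoing 1-connector
    'n2': [], 'n3': [],      # leaves
}
H = {'n2': 4, 'n3': 1}       # heuristic estimates at the leaves

def cost(node, graph, h):
    """c(n): h(n) at a leaf, otherwise the cheapest connector's cost
    (its number of endpoints, k) plus the costs of all its endpoints."""
    if not graph[node]:
        return h[node]
    return min(len(k) + sum(cost(m, graph, h) for m in k) for k in graph[node])

print(cost('n', GRAPH, H))   # 2 + c(n1) + c(n2) = 2 + 2 + 4 = 8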

GBF Algorithm (J. Pearl)

  1. Put the start node s on OPEN, set CLOSED to empty, and set G' to s.
  2. From the current search graph G' compute the most promising solution base G0 using f1 and h.
  3. Using f2, select a node n that is both in OPEN and in G0, remove it from OPEN, and place it on CLOSED.
  4. Expand n and add its successors to OPEN and to G', with back pointers to n. For each successor n' of n provide heuristic information that characterizes the set of solution graphs rooted at n'.
  5. If a successor n' is terminal, label it SOLVED (if it is a goal node) or UNSOLVABLE (otherwise) and propagate these labels to its ancestors in G'. If s is labeled SOLVED, exit with the current best solution base as a solution graph; if s is labeled UNSOLVABLE, exit with failure.
  6. Go to Step 2.
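
A compact Python sketch of these steps on a toy acyclic and-or graph. It is a simplification of GBF, not a faithful implementation: the SOLVED/UNSOLVABLE labels are kept implicit (a goal costs 0, a dead end costs infinity), the first open tip of the most promising solution base is expanded rather than ranking tips with f2, and the graph and heuristic values are illustrative.

import math

# node -> list of connectors (tuples of successors); leaves map to [].
GRAPH = {
    's': [('a',), ('b', 'c')],   # either a 1-connector to a, or a 2-connector to b and c
    'a': [('d', 'e')],
    'b': [], 'c': [], 'd': [], 'e': [],
}
GOALS = {'b', 'c', 'd'}          # 'e' is a non-goal leaf, hence unsolvable
H = {'s': 2, 'a': 1, 'b': 0, 'c': 0, 'd': 0, 'e': 0}

def gbf(graph, h, goals, start):
    """Grow an explicit graph top-down, always expanding a tip of the
    currently most promising solution base; return its cost, or None."""
    expanded = set()                     # nodes whose connectors are already explicated

    def cost(n):
        """Cost and best connector of n over the explicit graph: 0 at goals,
        h(n) at unexpanded tips, cheapest connector (k + successor costs) otherwise."""
        if n in goals:
            return 0, None
        if n not in expanded:
            return h[n], None
        best, best_k = math.inf, None
        for k in graph[n]:
            c = len(k) + sum(cost(m)[0] for m in k)
            if c < best:
                best, best_k = c, k
        return best, best_k

    while True:
        # step 2: trace the most promising solution base, collecting its open tips
        tips, all_solved = [], True
        def walk(n):
            nonlocal all_solved
            if n in goals:
                return
            c, k = cost(n)
            if n not in expanded:
                tips.append(n)           # open tip of the solution base
                all_solved = False
            elif k is None:
                all_solved = False       # expanded but blocked (unsolvable)
            else:
                for m in k:
                    walk(m)
        walk(start)
        if all_solved:
            return cost(start)[0]        # the base is a solution graph: success
        if not tips:
            return None                  # the best base is blocked: failure
        # steps 3-4: expand a tip (here simply the first one found)
        expanded.add(tips[0])

print(gbf(GRAPH, H, GOALS, 's'))         # prints 2: s --(2-connector)--> b, c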

References

  • Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach.
    Prentice-Hall, 1995.
  • Pearl, J.: Heuristics: Intelligent Search Strategies for Computer Problem Solving.
    Addison-Wesley, 1984.
