Search in Complex Environments

Local Search and Optimization Problems

In many optimization problems, path is irrelevant; the goal state itself is the solution In problems where the requirement is to find the optimal configuration (example: TSP):

The state space is the set of “complete” configurations
The goal is the find the configuration satisfying constraints (example: timetable)

In such cases, we can use iterative improvement algorithms; Where we keep a single “current” state and try to improve it

Local search algorithms operate by searching from a start state to neighboring states, without keeping track of the paths, nor the set of states that have been reached.

They are not systematic
- They might never explore a portion of the search space where a solution actually resides
They start with any complete tour, then perform pairwise exchanges
Variants of this approach get within 1% of optimal very quickly with thousands of cities

Properties:

they use very little memory
they can often find reasonable solutions in large or infinite state spaces for which systematic algorithms are unsuitable.

Local search algorithms can also solve optimization problems, in which the aim is to find the best state according to an objective function.

Example: Travelig Salesperson Problem (TSP)

Example: n-queens

Put n queens on an n × n board with no two queens on the same row, column, or diagonal
Move a queen to reduce number of conflicts
Almost always solves n-queens problems almost instantaneously for very large n, e.g., n= 1 million

Hill-climbing

The hill-climbing search algorithm keeps track of one current state and on each iteration moves to the neighboring state with highest value—that is, it heads in the direction that provides the steepest ascent. It terminates when it reaches a “peak” where no neighbor has a higher value. Hill climbing does not look ahead beyond the immediate neighbors of the current state.

This resembles trying to find the top of Mount Everest in a thick fog while suffering from amnesia. Note that one way to use hill-climbing search is to use the negative of a heuristic cost function as the objective function; that will climb locally to the state with smallest heuristic distance to the goal.

function Hill-Climbing( problem) returns a state that is a local maximum
  inputs: problem, a problem
  local variables:  current, a node
                    neighbor, a node
  current← Make-Node(Initial-State [problem])
  loop do
    neighbor← a highest-valued successor of current
    if Value[neighbor] ≤ Value[current] then return State[current]
    current← neighbor
  end

Hill climbing is sometimes called greedy local search because it grabs a good neighbor state without thinking ahead about where to go next. Hill climbing can make rapid progress toward a solution because it is usually quite easy to improve a bad state.

Challenges: Hill climbing can get stuck for any of the following reasons:

Local maxima: A local maximum is a peak that is higher than each of its neighboring states but lower than the global maximum.
- Hill-climbing algorithms that reach the vicinity of a local maximum will be drawn upward toward the peak but will then be stuck with nowhere else to go
Ridges:
- Ridges result in a sequence of local maxima that is very difficult for greedy algorithms to navigate
Plateaus:
- A plateau is a flat area of the state-space landscape.
- It can be a flat local maximum, from which no uphill exit exists, or a shoulder, from which progress is possible.
- A hill-climbing search can get lost wandering on the plateau

Solution:

Allow a sideways move
- Random sideways moves escape from shoulders loop on flat maxima
First-choice hill climbing
- implement Stochastic hill climbing by generating successors randomly until one is generated that is better than the current state.
Random-restart hill climbing
- It conducts a series of hill-climbing searches from randomly generated initial states, until a goal is found
- overcomes local maxima—trivially complete

Simulated annealing

A hill-climbing algorithm that never makes “downhill” moves toward states with lower value (or higher cost) is always vulnerable to getting stuck in a local maximum. In contrast, a purely random walk that moves to a successor state without concern for the value will eventually stumble upon the global maximum, but will be extremely inefficient. Therefore, it seems reasonable to try to combine hill climbing with a random walk in a way that yields both efficiency and completeness.

*Idea: escape local maxima by allowing some “bad” moves, but gradually decrease their size and frequency

function Simulated-Annealing( problem,schedule) returns a solution state
  inputs: problem, a problem
          schedule,a mapping from time to “temperature”
  local variables:  current, a node
                    next, a node
                    T, a “temperature” controlling prob. of downward steps
current← Make-Node(Initial-State [problem])
for t← 1 to ∞ do
  T ← schedule[t]
  if T = 0 then return current
  next← a randomly selected successor of current
  ∆E← Value[next] – Value[current]
  if ∆E > 0 then current← next
  else current← next only with probability e∆ E/T

The overall structure of the simulated-annealing algorithm is similar to hill climbing. Instead of picking the best move, however, it picks a random move. If the move improves the situation, it is always accepted. Otherwise, the algorithm accepts the move with some probability less than 1. The probability decreases exponentially with the “badness” of the move—the amount ∆E by which the evaluation is worsened. The probability also decreases as the “temperature” $T$ goes down: “bad” moves are more likely to be allowed at the start when $T$ is high, and they become more unlikely as $T$ decreases. If the schedule lowers $T$ to 0 slowly enough, then a property of the Boltzmann distribution, $e^{∆E/T}$, is that all the probability is concentrated on the global maxima, which the algorithm will find with probability approaching 1.

Properties of simulated annealin

At fixed “temperature” T, state occupation probability reaches Boltzman distribution

\[p(x)=ae^{\frac{E(x)}{kT}}\]

T decreased slowly enough =⇒ always reach best state x* because $\frac{e^{\frac{E(x)}{kT}}}{e^{\frac{E(x)}{kT}}} = e^{\frac{E(x)-E(x)}{kT}} = 1 \text{ for small }T$

based on Metropolis et al., 1953, for physical process modelling
Simulated annealing was used to solve VLSI layout problems beginning in the 1980s
this is widely used in VLSI layout, airline scheduling, etc

Local beam search

The local beam search algorithm keeps track of k states rather than just one. It begins with k randomly generated states. At each step, all the successors of all k states are generated. If any one is a goal, the algorithm halts. Otherwise, it selects the k best successors from the complete list and repeats.

The two algorithms are quite different.

In a random-restart search, each search process runs independently of the others.
In a local beam search, useful information is passed among parallel search threads.
The algorithm quickly abandons unfruitful searches and moves its resources to where the most progress is being mad

Idea:

keep k states instead of 1;
choose top k of all their successors
Not the same as k searches run in parallel
Searches that find good states recruit other searches to join them

Problem:

Local beam search can suffer from a lack of diversity among the k states — they can become clustered in a small region of the state space, making the search little more than a k-times-slower version of hill climbing.
quite often, all k states end up on same local hill

Stochastic Beam search This variant analogous to stochastic hill climbing, helps alleviate this problem. Instead of choosing the top k successors, stochastic beam search chooses successors with probability proportional to the successor’s value, thus increasing diversity.

Idea:

choose ksuccessors randomly, biased towards good ones
Observe the close analogy to natural selection!

Evolutionary algorithms

Evolutionary algorithms can be seen as variants of stochastic beam search that are explicitly motivated by the metaphor of natural selection in biology
- there is a population of individuals (states), in which the fittest (highest value) individuals produce offspring (successor states) that populate the next generation, a process called recombination.

There are endless forms of evolutionary algorithms, varying in the following ways:

size of the population
representation of each individual
selection process for selectin
The recombination procedure
The mutation rate
The makeup of the next generation

Genetic algorithms

stochastic local beam search + generate successors from pairs of states

GAs require states encoded as strings
Crossover helps iff substrings are meaningful components
GAs /= evolution: e.g., real genes encode replication machinery!

Local search in continuous spaces

A continuous action space has an infinite branching factor, and thus can not be handled by most of the algorithms covered.

One way to deal with a continuous state space is to discretize it. For example, instead of allowing the $(x_i ,y_i)$ locations to be any point in continuous two-dimensional space, we could limit them to fixed points on a rectangular grid with spacing of size δ (delta).

Search with Nondeterministic Actions

The Agent does not know the state its transitioned to after action, the environment is nondeterministic. Rather, it will know the possible states it will be in, which is called “belief state”

Examples:

The erratic vacuum world
- (if-then-else) steps. If statement tests to know the current state.
AND-OR search trees
- Two possible actions (OR nodes)
- Branching that happens from a choice (AND nodes)
Try, try again
- A cyclic plan where minimum condition
- (every leaf = goal state & reachable from other points in the plan)

Search in Partially Observable Environments

Problem of partial observability:
where the agent’s percepts are not enough to pin down the exact state.

Searching with no observation:
Agent’s percepts provide no information at all, sensorless problem (or a conformant problem).

Solution: sequence of actions, not a conditional plan

Searching in partially observable environments requires a function that monitors or estimatesthe environment to maintain the belief state

Link: Metropolis et al., 1953: https://www.aliquote.org/pub/metropolis-et-al-1953.pdf Understanding the Metropolis-Hastings Algorithm: https://github.com/user-attachments/files/17469951/Understanding_the_Metropolis-H.pdf Effiient Metropolis Jumping Rules: https://stat.columbia.edu/~gelman/research/published/baystat5.pdf The Metropolis algorithm: https://stat.columbia.edu/~gelman/research/published/baystat5.pdf

Informed (Heuristic) Search Strategies

Adversarial Search And Games