# «The Robotics Institute Carnegie Mellon University Pittsburgh, PA 15213 May 13, 2005 Submitted in partial fulfillment of the requirements for the ...»

To explain the semantics of resource parameters at other points in a path, it is important to show how they evolve over a search. Goal DPARMS states and objective function values are used as a starting point for state expansions during the search. The state transition function β generates the change in state for DPARMS parameters as well as the costs that are reflected in the objective function. It subtracts quantities that would be added in the forward direction and vice versa. Returning to the example, the duration resource declines in the forward direction, and hence accumulates

** MISSION-DIRECTED PATH PLANNING FOR PLANETARY ROVER EXPLORATION**

in backwards expansions. Figure 3-1 depicts the evolution of a monotonic duration resource over an ISE search. The plot’s horizontal axis represents the sequence of states in a path in the direction of search, from the goal (extreme right) to the start (extreme left). The vertical axis represents duration, from zero to the maximum allowable duration D max. The solid curve starts at the goal at zero duration remaining. Moving left from the goal, arcs successively add duration to the total. Under this model, it is important to understand that resource parameters do not represent the instantaneous value of the quantity, but rather the minimum amount allowable to satisfy the goal conditions under the current optimal arc sequence.

Figure 3-1: Plots of Duration, a Monotonic Resource Parameter, Over an ISE Search. The solid curve depicts a path that satisfies the global duration constraint, the dashed curve shows a duration profile that exceeds the maximum duration before reaching the start. Steeper slopes indicate slower progress to the goal.

To constrain a monotonic resource, the ISE user must design β or the objective function to reject arcs, either through outright rejection or via high path cost, that exceed the minimum or maximum parameter value. The example duration constraint rejects arcs that force the duration parameter above D max (see the dashed curve in Figure 3-1); any such arc would force the robot to start with a greater duration allocation to reach the goal with no less than zero upon reaching the goal. The arc rejection mechanism is easily defined in either the state transition function β or within the objective function, as described in Sections 3.1.3, 3.1.4 and 3.1.6. Feasible paths arrive at the start state with margin remaining with respect to the maximum allocation (see the solid curve in Figure 3-1).

** INCREMENTAL SEARCH**

Non-monotonic resources are not as intuitive as their monotonic counterparts. Non-monotonic resources can go either up or down over any arc. Therefore, parameters that have finite ranges typically require upper and lower limits. Take, for example, a memory resource, where some arcs correspond to net data being stored, and others, net data being removed. Positive increments in memory in the forward direction (erase actions) appear as reductions in backwards search, providing greater margin with respect to the memory capacity. However, successive reduction arcs could drive the memory parameter to zero. Herein lies the subtlety: zero in the memory scale does not indicate that ISE should reject the arc. Instead, zero indicates that the goal could be reached from the current state, even starting with no available memory. ISE must prevent the parameter from dropping below the minimum - there is no meaning to a resource with “less-than-empty” conditions. However, the parameter remains legal, and can remain at zero or increase above zero in later state expansions.

Figure 3-2: Computer Memory, a Non-Monotonic Resource Parameter, Over an ISE Search. The three curves depict different outcomes of the search: a path that requires more than the maximum available memory (Rejected path 1), a path that exceeds the memory available at the start (Rejected path 2), and a path that meets both the maximum and initial memory requirements (Legal path).

Figure 3-2 illustrates the memory example. Similar to Figure 3-1, the horizontal axis represents a sequence of states from the goal (extreme right) to the start (extreme left). The vertical axis represents remaining unused memory, and ranges from zero remaining memory to M max, the full capacity of the storage device. All the example curves start at some allowable memory level (possibly non-zero). Observe that each curve displays non-monotonic behavior. The

curve labeled “Rejected path 1” behaves very similar to the rejected path in Figure 3-1. It ultimately exceeds the maximum capacity of the storage device, meaning that the particular series of arcs taken by the path required there to be more memory than available to meet the goal conditions. “Rejected path 2” is also rejected, but for different reasons. In that case, the path never requires more than the storage capacity, but requires more than is available at the start. Perhaps all the data in memory is critical and cannot be deleted or moved out of memory. A solution that required deletion of this data would be illegal. This is rejected on the basis of feasibility, and is alluded to in Section 3.3.1. Finally, the “Legal path” illustrates how a resource curve can be reduced to zero and remain valid. The portion of the curve that dips to zero (from right to left) indicates that the arcs move more data out of memory than in. The flat part of the curve at zero memory indicates that the capacity for data removal during those arcs is sufficient to allow a full memory at that point in the path, and still meet the goal conditions downstream. Furthermore, the starting memory requirement falls below the amount of memory available at the start.

To represent non-monotonic resource parameters using DPARMS, the ISE user must design transition function β to reject states that exceed one extremum, and saturate the values that would otherwise exceed the other. A resource parameter r represents a minimum allowable level, and has upper bound r max and lower bound r min. For successive states r i in a backwards search, where ∆r ( a ) is the change in resource level for a given arc a from a given state s in

**the backward direction, the next resource level could be given by:**

and would reject r i + 1 r max. The implementation of non-monotonic resources in the objective function is not as simple, and requires a special formulation, described in Section 3.1.7.

3.1.6 Path Cost As with A* and D*, ISE combines an objective function and a heuristic function to encode the cost of a candidate path. ISE searches in a backwards-chaining order, from one or more goal states to a start state. Backwards arc transitions between a state X and a second state Y result in a positive cost defined by the arc cost function c ( X, Y ). If Y does not have a forward arc to X, then c ( X, Y ) is undefined. The cost to traverse a sequence of states S = { x 1, x 2, …, x n }, such that x i + 1 = β ( x i ), is the sum of the arc costs incurred in backwards transitions along the sequence.

## INCREMENTAL SEARCH

The estimate of the cost of a path from an initial state R, through a state X, to a goal G is based on two functions,**h ( X ) and g ( X, R ). Given a sequence of states S from a goal G to state X:**

function h ( X ) is the objective function that returns the cost accrued over the sequence of n arcs, and can be viewed as an estimate of the optimal cost from the goal to the state.

Since h ( X ) is a summation over a path, it can represent a monotonically-increasing global constraint parameter. An application can simultaneously optimize the parameter and limit its expansion within the same objective function. In the case where it violates a constraint, the cost for the arc c ( x i, x i + 1 ), and hence h ( X ) itself, is set to a very large num

<

**ber that approximates infinity.**

Function g ( X, R ) is a focussing heuristic that estimates the cost from the state X to the state R. The heuristic function must be admissible - it must never over-estimate the cost between two states. A good heuristic approximates the actual cost as closely as possible without ever exceeding it. Both functions must be monotonic - h ( X ) must monotonically non-decrease with number of steps from the goal, and g ( X, R ) must monotonically non-decrease with number of steps from state R. The sum f ( X, R ) = h ( X ) + g ( X, R ) is the estimated total cost between a goal G and a state R, through state X.

3.1.7 Non-Monotonic Path Cost In some cases, it is desirable to optimize a path in terms of a non-monotonic resource. As an extension of standard cost functions like distance, duration or resource expense, ISE enables a user to encode composite objective functions that involve more than one parameter. One use for this mechanism is to regulate the search for an optimal solution in terms of a non-monotonic resource.

Non-monotonic parameters cannot be optimized directly using the incremental search approach. Heuristic search prioritizes its effort on the estimated lowest-cost solutions, based on the objective function. If the objective function is the sum of successive positive or negative resource costs, then, in general, the objective function can assume arbi

trarily small values. In cases where negative cost arcs are consistently reachable, the search will never terminate pursuing negative-cost arcs will yield continually lower objective function values with continually higher priority.

One solution is to add an additional term to the objective function which causes the sum to be monotonic. The new

**objective function takes the form:**

The summation component is the same as in Equation 3-4, but may also include constraint mechanisms like in Equation 3-1. In the new term nK, n is the number of arcs in the sequence, and K is defined as follows. If c min is the

**minimum cost taken over the entire state space S and action space A :**

In words, if negative costs exist in the set of possible transitions, K is the absolute value of the most negative cost

**over the state space. Therefore, the following always holds true:**

K exactly cancels out the largest possible negative cost, and outweighs all others. At every step, the objective function increments by the sum K + c ( x i, x j ). Hence, the h ( X ) defined in Equation 3-5 is monotonically non-decreasing.

As with the original objective function, the summation of costs can also be governed by global constraint mechanisms like in Equation 3-1. Note that the new objective function does not optimize the sum of original costs c ( x i, x j ).

Because K is added at every step in the plan, the term nK is a measure of plan length. Therefore, the path solution that results from a search under a composite objective function is optimal in combined terms of the cost and number of plan steps.

## INCREMENTAL SEARCH

This formulation is useful for optimizing non-monotonic resources in many applications: battery energy, fuel in a tank, data in memory, or thermal energy in an electronics component.3.2 Efficiency Mechanisms ISE employs three mechanisms - dynamic state generation, resolution equivalence pruning and state dominance - to counteract state proliferation during search. The following sections describe how each is used to manage the search memory and enable efficient path planning in greater than two dimensions.

3.2.1 Dynamic State Generation While many search algorithms store all states in the space explicitly, ISE generates states only as they are encountered during the search, and deletes them if they become irrelevant. ISE employs the arc transition function on existing states to generate new states. ISE defines a state set for each combination of IPARMS values. Each state set stores states with common IPARMS but potentially differing DPARMS. A state set stores the DPARMS for each state possessing its particular IPARMS. The IPARMS are implicit in each state set.

At the beginning of search, only the goal states are stored in the sets. ISE generates states to search the space, but manages state proliferation by continually checking for state redundancy and irrelevance. Unnecessary states are deleted from the sets, thereby saving memory.

Returning to the earlier example, each pair of IPARMS values ( x ∈ X, y ∈ Y ) defines an IPARMS set containing DPARMS values ( t ∈ T ). A transition function might describe nine legal actions - a motion to each of eight neighbors, and a wait action that remains in the current (x,y) location for fixed duration. For each action, it defines ∆T = f ( ∆X, ∆Y ). APARMS for each IPARMS combination might include parameters to define robot speed as a function of position, or encode whether an obstacle prevents the action from occurring.

ISE deletes states as they become irrelevant for planning. The following two sections describe two mechanisms for state deletion - resolution pruning and state dominance.

3.2.2 Resolution Equivalence In search over a discrete space, IPARMS dimensions can be represented at arbitrary resolutions, based only on requirements of the problem domain. Since changes in each axis can be considered independently, discretizing the space is a simple matter of independently discretizing each dimension to provide adequate resolution.

DPARMS change as functions of changes in independent or other dependent parameters. This leads to an important difference between the independent and dependent parameters. Transitions in IPARMS are integer multiples of the