«The Robotics Institute Carnegie Mellon University Pittsburgh, PA 15213 May 13, 2005 Submitted in partial fulfillment of the requirements for the ...»
axis interval; dependent state transitions are, in general, real-valued. DPARMS must also be integer-valued. Real valued transitions must be rounded or truncated to the discrete intervals of the dimension. To avoid excessive rounding errors over many transitions, the resolution of DPARMS axes must be sufficiently high. However, in selecting an adequately high resolution, the number of discrete intervals in the dimension multiplies, leading to a vast increase in state space size and in time to search the space.
ISE enforces user-defined DPARMS equivalence classes to address this issue. In each IPARMS set, equivalence classes group DPARMS over ranges of values into coarse bins. The user selects the bin size to reflect an acceptable range over which DPARMS can be considered equivalent for the particular application. During a search, ISE uses the DPARMS equivalence classes to prune away states that are “resolution equivalent” but inferior in terms of cost. This dual resolution representation enables ISE to classify and prune away sufficiently similar but inferior states while avoiding the excessive rounding errors that would emerge in using only the coarse resolution.
Returning to the example, assume the office area is 100 x 100 meters. From the ISE perspective, the resolutions for the X and Y axes are arbitrary, but might be selected to be 1 meter. If transitions are only possible to the eight neighboring cells in the grid, the base resolution of T must be selected to approximate changes in time for both transitions to principal-direction neighbors and to neighbors along diagonals. For example, if the robot in this example moves at
0.1 meters per second, then the time to cross a grid cell in the principal and diagonal directions would be 10.0 and
14.1 seconds, respectively. A time parameter T with a 10-second resolution would not differentiate between the time costs in these directions; 10.0 and 14.1 both round to one 10-second time unit. However, a one-second resolution might be sufficient for the application.
Problematically, if the time parameter T spans one hour, the total number of states in the space is 36 million. However, by dividing the DPARMS into 60 bins of one minute each, the state space drops to 6 × 10 state bins while maintaining the rounding error for transitions at ± 0.5 seconds per step. The drawback is that resolution pruning reduces the accuracy with which path solutions match the initial state of a robot. Errors can be up to one half the resolution equivalence class. In this example, solutions could be mismatched to the current robot time by as much as 30 seconds.
3.2.3 State Dominance In some applications, states obey a dominance relationship in which one state can be guaranteed to always produce a lower cost solution than another state. More formally, let S and Q be any two states in the graph and G be the goal
INCREMENTAL SEARCHstate. Let h ( S, Q, G ) be the cost of the optimal path starting at S, passing through Q and ending at G. Let h ( S, Q, G )
be infinity if no such path exists. State A dominates B for a given G if:
for all remaining states S. If a state is dominated by another, then it can never be a member of an optimal path and can be removed from consideration.
ISE enables the user to define the specific conditions for dominance, if any, amongst states with common IPARMS but possibly different DPARMS.
3.3 Search ISE searches to find optimal paths from a start state R to one or more goal states g ∈ G. The search is in backwardschaining order, originating at the goal states. To search the space, ISE “expands” states from a prioritized list called the OPEN list. Initially, this list contains only the goal states. Given a final state, ISE uses the backwards arc transition function β to dynamically generate all possible initial states from which the final state is reachable. Through state expansion from the goals, ISE builds a directed graph whose nodes are states, and whose edges are the transition arcs. By definition, every node in the graph is a state from which one of the goals is reachable. The heuristic function focusses expansion on states predicted to reach the start state with minimum cost. The search terminates when the search expands states in the state set containing R that meet the criteria for an optimal solution.
In an initial search, ISE results are similar to those from A* . With changes to transitions, either in arcs or objective function costs, ISE operates far more efficiently than A* in re-planning paths. The A* algorithm is forced to re-plan the entire space from scratch if any transition arcs change. ISE, like D*, can repair the search graph in the area of the changes, thereby drastically reducing the time for re-plans. Using incremental graph theory, ISE repairs the feasible set of solutions and the optimal path within it.
Sets whose arc parameters have changed have their member states placed back into the OPEN list. The effects of the changes propagate through the search graph under the same machinery used in initial search. The focussing heuristic limits repairs to states affecting the start state, making the algorithm much more time efficient than A*. The algo
MISSION-DIRECTED PATH PLANNING FOR PLANETARY ROVER EXPLORATION
rithm search direction is optimized for applications in which new arc information appears close to the robot’s current state, for example from robot sensors. The search is backwards-chaining so that changes near the start state only affect “leaf” states at the very edge of the graph. If the search were in the forward direction, changes near the robot’s state might have to propagate through the entire search graph, essentially as costly as a new search from scratch.
Another advantage of a backwards-chaining search is that state variables can be expressed in terms of goal state requirements. By initializing a goal state parameter to the minimum or maximum quantity allowed to achieve the goal, search state expansions propagate these requirements to other states. The semantics of the state parameter is the same at any step, including the first step in a plan solution - it reflects the minimum or maximum quantity allowable, from that state, to achieve the goal condition. This is a particularly advantageous operational tool. At any step in the plan, a robot can monitor and immediately determine, based on its current estimate of the parameter, whether the plan can feasibly achieve the goal. In contrast, a plan from a forward search is less insightful. Since state parameters in a forward search have their origin at the current robot state, they can only provide an estimate of the expected state of the robot at each step. The plan generated with a forward search does not provide a threshold for making operational decisions during execution. To generate a goal-referenced plan with a forward search would require multiple search iterations; it is not clear what iteration strategy would efficiently achieve the desired result.
3.3.1 Modes and Search Termination ISE operates in one of two search modes, selected by the user. BESTPCOST mode finds the minimum cost path.
BESTDPARMS mode finds the solution with the “best” start state DPARMS that falls below a maximum path cost. A more detailed description of these modes appears in Appendix 1.
In each case, search can terminate when the lowest value on the OPEN list equals or exceeds the path cost from the start state R. Since the cost of expanded states is monotonically non-decreasing, the OPEN list cannot possibly find a LOWER state that has a low enough path cost (effect of the objective function h ( X ) ) and that is “close” enough to the start state to be able to reduce the path cost from R when it reaches it through subsequent expansion (effect of the heuristic function h ( X, R ) ). Candidate start states must also meet a feasibility condition, typically to guarantee sufficient proximity of the candidate to the actual start state, or satisfaction of necessary start conditions.
3.3.2 Path Extraction To extract the optimal path from the search graph, ISE uses the forwards arc transition function Φ to locate successive states in the sequence. Beginning with the start state, ISE calls Φ to dynamically generate candidate parent states in the plan. ISE compares each generated state to existing states with the same IPARMS set to find a state with matching DPARMS and path cost. The matching state is the next state in the optimal path, and is used to generate its candidate parents from the search. The process repeats until a goal state is reached.
3.4 Experimental Results ISE can be adapted to a wide range of problems, and can be configured in multiple ways to solve the same problem.
The configuration that one applies affects memory and computational performance, the quantity to be optimized and the thoroughness of the search. This section quantifies ISE performance under a variety of problem and configuration parameters, and explores the other effects stemming from the ISE configuration.
3.4.1 Test Domain At the risk of losing generality, ISE experiments were conducted in a specific planning domain. The tests all solved for an optimal path between two positions. To emphasize ISE novel features, the problem sought optimal paths with respect to a renewable energy resource rather than the typical path length. The test domain is a two-dimensional grid of positions, with arc transitions defined from any grid position to each of its eight immediate neighbors (no stationary actions were possible). Arc transitions result in time and energy costs, randomly pre-generated in advance of each test for all cells in the map. Time costs were uniformly sampled over a strictly positive range, while energy costs were uniformly sampled over a range that enabled positive or negative costs.
The domain state space comprises four parameters: two position variables x and y, a time variable t, and the energy variable e. The two position variables define axes on a regular grid, and are ISE independent parameters (IPARMS).
The time variable represents the time left to reach the goal, and is an ISE dependent parameter (DPARMS). The energy variable is the battery energy required to reach the minimum goal energy. As a renewable resource, energy is non-monotonic. Depending on the ISE mode used, as will be explained below, energy is represented in ISE in one of two ways. In both cases, e is constrained to be within the legal range of battery levels, from zero to some maximum
value e max
3.4.2 Comparison of Two Solution Approaches This study contrasts two approaches for solving this resource management problem, summarized in Table 3-1.
Approach 1 uses the BESTDPARMS mode (Section 3.3.1). In this approach, the energy variable e is represented as a second DPARMS parameter, yielding four dimensions for the search. Under BESTDPARMS, ISE finds the “best” DPARMS solution (lowest energy requirement to reach the goal, as defined by the function better ( X, Y ) ) whose cost falls below a given maximum duration. Observe that even though the objective function for Approach 1 is duration, the BESTDPARMS uses it only to constrain the duration of solutions that have a minimal energy expense.
Approach 2 uses BESTPCOST mode (Section 3.3.1) and the composite objective function mechanism (Section 3.1.7), with energy as the non-monotonic resource parameter. Under BESTPCOST, ISE finds the minimum cost path in combined terms of energy and number of plan steps. The objective function represents the energy variable e as a
MISSION-DIRECTED PATH PLANNING FOR PLANETARY ROVER EXPLORATION
global constraint parameter, and hence without using a DPARMS parameter. Therefore, the DPARMS e dimension used in Approach 1 is unnecessary in Approach 2. This yields a three dimensional search and a vast reduction in search space. Removing the energy dimension comes at a price, however, and is discussed later.
The objective function of Approach 2 formulates the resource cost c ( x i, x i + 1 ) of Equation 3-5 to enable a representation of the energy variable that is independent of the ISE IPARMS-DPARMS state. As mentioned earlier, e represents the energy required to reach the goal. This quantity can never be less than zero, since there is no meaning to negative stored energy. Furthermore, no feasible plan can require more than a full battery to achieve a goal. Both of these constraints must be reflected in the energy representation. If ∆e i is the change in energy predicted by the arc
transition function β ( x i ) from state x i, the resulting energy can be constrained to be above zero:
Under Equation 3-11, the greatest change in energy possible in an arc is ∆e i. To limit the upper bound on energy, the
resource cost term from Equation 3-5 then becomes:
If the resulting energy does not exceed the capacity of the battery, then the cost is simply the change in energy. If the resulting energy does exceed capacity, the path is infeasible and is given an infinite cost. For feasible trajectories, the summation of resource costs over a path correctly (Equation 3-5) represents the energy parameter e. The other term from Equation 3-5, path length cost K, is equal to the absolute value of the most negative energy cost over all possible transitions. The sum of the resource and path length costs is always non-negative, since the resource cost can never increment more than ∆e i. So, this adaptation of Equation 3-5 continues to guarantee a monotonic increase over the path, and also enables a full representation of the energy state parameter without the use of another state space dimension. Finally, the domain transition model does not depend on the energy variable to derive time or energy
INCREMENTAL SEARCHcosts. Hence, an energy state DPARMS is redundant with the parameter in the objective function, and can be removed from the DPARMS of Approach 1.
The two approaches differ in substantial ways. Most obviously, they operate on state spaces of different dimensionality. In terms of time and memory, the 3-D Approach 2 was expected to have a notable advantage over the 4-D Approach 1.