Cost-Optimal Algorithms for Planning with Procedural Control Knowledge


Vikas Shivashankar (Knexus Research Corporation, National Harbor, MD, vikas.shivashankar@knexusresearch.com); Ron Alford (MITRE, McLean, VA, ralford@mitre.org); Mark Roberts (NRC Postdoctoral Fellow, Naval Research Laboratory, Washington DC); David W. Aha (Naval Research Laboratory, Washington DC)

There is an impressive body of work on developing heuristics and other reasoning algorithms to guide search in optimal and anytime planning algorithms for classical planning. However, very little effort has been directed towards developing analogous techniques to guide search towards high-quality solutions in hierarchical planning formalisms like HTN planning, which allows using additional domain-specific procedural control knowledge. In the absence of such techniques, this control knowledge often needs to provide the necessary search guidance to the planning algorithm, which imposes a substantial burden on the domain author and can yield brittle or error-prone domain models. We address this gap by extending recent work on a new hierarchical goal-based planning formalism called Hierarchical Goal Network (HGN) Planning to develop the Hierarchically-Optimal Goal Decomposition Planner (HOpGDP), an HGN planning algorithm that computes hierarchically-optimal plans. HOpGDP is guided by $h_{HL}$, a new HGN planning heuristic that extends existing admissible landmark-based heuristics from classical planning to compute admissible cost estimates for HGN planning problems. Our experimental evaluation across three benchmark planning domains shows that HOpGDP compares favorably both to optimal classical planners, due to its ability to use domain-specific procedural knowledge, and to a blind-search version of HOpGDP, due to the search guidance provided by $h_{HL}$.


1 Motivation and Background

Formalisms for automated planning (to represent and solve planning problems) broadly fall into either domain-independent planning or domain-configurable planning. Domain-independent planning formalisms, such as classical planning, require that users provide only models of the base actions executable in the domain. In contrast, domain-configurable planning formalisms (e.g., Hierarchical Task Network (HTN) planning) allow users to supplement action models with additional domain-specific knowledge structures that increase the expressivity and scalability of planning systems.

An impressive body of work exploring search heuristics has been developed for classical planning that has helped speed up generation of high-quality solutions. More specifically, search heuristics such as the relaxed planning graph heuristic [9], landmark generation algorithms [10, 17], and landmark-based heuristics [17, 11] dramatically improved optimal and anytime planning algorithms by guiding search towards (near-) optimal solutions to planning problems.

Yet relatively little effort has been devoted to developing analogous techniques to guide search towards high-quality solutions in domain-configurable planning systems. In the absence of such search heuristics, domain-configurable planners often require additional domain-specific knowledge to provide the necessary search guidance. This requirement not only imposes a significant burden on the user, but also sometimes leads to brittle or error-prone domain models. In fact, getting the best of heuristic search and hierarchical procedural knowledge (to decompose planning tasks) has remained an unsolved problem since planning competitions first focused on heuristic search at AIPS-98 [12].

In this paper, we address this gap by developing the Hierarchically-Optimal Goal Decomposition Planner (HOpGDP), a hierarchical planning algorithm that uses admissible heuristic estimates to generate hierarchically-optimal plans (i.e., plans that are valid and optimal with respect to the given hierarchical knowledge). HOpGDP leverages recent work on a new hierarchical planning formalism called Hierarchical Goal Network (HGN) Planning [19, 18], which combines the hierarchical structure of HTN planning with the goal-based nature of classical planning.

In particular, our contributions are as follows:

  • Admissible Heuristic: We present an HGN planning heuristic, $h_{HL}$ (HGN Landmark heuristic), that extends landmark-based admissible classical planning heuristics to derive admissible cost estimates for HGN planning problems. To the best of our knowledge, $h_{HL}$ is the first non-trivial admissible hierarchical planning heuristic.

  • Optimal Planning Algorithm: We introduce HOpGDP, an A* search algorithm that uses $h_{HL}$ to generate hierarchically-optimal plans.

  • Experimental Evaluation: We describe an empirical study on three benchmark planning domains in which HOpGDP outperforms optimal classical planners due to its ability to exploit hierarchical knowledge. We also found that $h_{HL}$ provides useful search guidance; despite substantial computational overhead, HOpGDP compares favorably in terms of runtime and nodes explored to HOpGDP$_{blind}$, which uses the trivial heuristic $h = 0$.

2 Preliminaries

In this section we detail the classical planning model, review how landmarks are constructed for classical planning, present an admissible landmark-based heuristic $h_L$, and describe HGN planning using examples from assembly planning.

2.1 Classical Planning

We define a classical planning domain $D_{classical}$ as a finite-state transition system in which each state $s$ is a finite set of ground atoms of a first-order language $\mathcal{L}$, and each action $a$ is a ground instance of a planning operator $o$. A planning operator is a 4-tuple $o = (head(o), pre(o), eff(o), cost(o))$, where $pre(o)$ and $eff(o)$ are conjuncts of literals called $o$'s preconditions and effects, and $head(o)$ includes $o$'s name and argument list (a list of the variables in $pre(o)$ and $eff(o)$). $cost(o)$ represents the non-negative cost of applying operator $o$.

Actions. An action $a$ is executable in a state $s$ if $s \models pre(a)$, in which case the resulting state is $\gamma(s, a) = (s - eff^-(a)) \cup eff^+(a)$, where $eff^+(a)$ and $eff^-(a)$ are the atoms and negated atoms, respectively, in $eff(a)$. A plan $\pi = \langle a_1, \ldots, a_n \rangle$ is executable in $s$ if each $a_i$ is executable in the state produced by $a_{i-1}$; and in this case $\gamma(s, \pi)$ is the state produced by executing $\pi$. If $\pi$ and $\pi'$ are plans or actions, then their concatenation is $\pi \circ \pi'$.

We define the cost of $\pi$ as the sum of the costs of the actions in the plan, i.e., $cost(\pi) = \sum_{i=1}^{n} cost(a_i)$.
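The state-transition semantics and plan cost defined above can be illustrated with a minimal Python sketch. The `Action` class, the helper names, and the two logistics-style actions are hypothetical and chosen only for illustration; states are modeled as frozensets of ground atoms.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre_pos: frozenset       # atoms that must be true in the state
    pre_neg: frozenset       # atoms that must be false in the state
    add: frozenset           # eff+: atoms made true
    delete: frozenset        # eff-: atoms made false
    cost: float = 1.0        # non-negative action cost

def executable(state, action):
    """An action is executable if the state satisfies its preconditions."""
    return action.pre_pos <= state and not (action.pre_neg & state)

def progress(state, action):
    """Successor state: remove the negative effects, then add the positive ones."""
    if not executable(state, action):
        raise ValueError(f"{action.name} is not executable")
    return (state - action.delete) | action.add

def execute(state, plan):
    """State produced by executing the plan action by action."""
    for action in plan:
        state = progress(state, action)
    return state

def plan_cost(plan):
    """Cost of a plan: the sum of the costs of its actions."""
    return sum(action.cost for action in plan)

# Hypothetical two-action example.
load = Action("load", frozenset({"at(pkg,A)", "at(truck,A)"}), frozenset(),
              frozenset({"in(pkg,truck)"}), frozenset({"at(pkg,A)"}), 1.0)
drive = Action("drive", frozenset({"at(truck,A)"}), frozenset(),
               frozenset({"at(truck,B)"}), frozenset({"at(truck,A)"}), 2.0)
s0 = frozenset({"at(pkg,A)", "at(truck,A)"})
final_state = execute(s0, [load, drive])
total_cost = plan_cost([load, drive])
```

Here `final_state` contains the truck at B with the package loaded, and `total_cost` is the sum of the two action costs.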

2.2 Generating Landmarks for Classical Planning

There are several landmark generation algorithms suggested in the literature [10, 17]. The general approach used in generating sound landmarks is to relax the planning problem, generate sound landmarks for the relaxed version, and then use those for the original planning problem. In this paper, we use LAMA’s landmark generation algorithm [17], which uses relaxed planning graphs and domain-transition graphs in tandem to generate landmarks.

2.3 $h_L$: an Admissible Landmark-based Heuristic for Classical Planning

We provide some background on $h_L$, the landmark-based admissible heuristic for classical planning problems proposed by Karpas and Domshlak [11], which we use in our heuristic.

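Consider a classical planning problem $P = (D_{classical}, s_0, g)$ and a landmark graph $LG = (L, Ord)$ computed using any of the off-the-shelf landmark generation algorithms mentioned in Section 2.2. Then, we can define $L(s, \pi)$ to be the set of landmarks that need to be achieved from $s$ onwards, assuming we got to $s$ using $\pi$. Note that $L(s, \pi)$ is path-dependent: it can vary for the same state when reached by different paths. It can be computed as follows:

$$L(s, \pi) = (L \setminus Accepted(s, \pi)) \cup ReqAgain(s, \pi)$$

where $Accepted(s, \pi) \subseteq L$ is the set of landmarks that were true at some point along $\pi$. $ReqAgain(s, \pi) \subseteq Accepted(s, \pi)$ is the set of landmarks that were accepted but are required again; an accepted landmark $\phi$ is required again if (1) it does not hold true in $s$, and (2) it is greedy-necessarily ordered before another landmark in $L$ that is not accepted.

Karpas and Domshlak show that it is possible to partition the costs of the actions in $A$ over the landmarks in $L(s, \pi)$ to derive an admissible cost estimate for the state $s$ as follows: let $cost(\phi)$ be the cost assigned to the landmark $\phi$, and $cost(a, \phi)$ be the portion of $a$'s cost assigned to $\phi$. Furthermore, let us suppose these costs satisfy the following set of inequations:

$$\forall a \in A: \sum_{\phi \in L(a \mid s, \pi)} cost(a, \phi) \le cost(a), \qquad \forall \phi \in L(s, \pi): cost(\phi) \le \min_{a \in ach(\phi \mid s, \pi)} cost(a, \phi) \tag{1}$$

where $ach(\phi \mid s, \pi) \subseteq A$ is the set of possible achievers of $\phi$ along any suffix of $\pi$, and $L(a \mid s, \pi) = \{\phi \in L(s, \pi) \mid a \in ach(\phi \mid s, \pi)\}$.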

Informally, these inequations encode a scheme that partitions the cost of each action across all the landmarks it could possibly achieve, and assigns to each landmark a cost no more than the minimum cost assigned to it by any of its achievers. Given this, they prove the following useful theorem:

Theorem 1.

Given a set of action-to-landmark and landmark-to-action costs satisfying Eqn. 1, $h_L(s, \pi) = \sum_{\phi \in L(s, \pi)} cost(\phi)$ is an admissible estimate of the optimal plan cost from $s$.

Note that the choice of exactly how to do the cost-partitioning is left open. One of the schemes Karpas and Domshlak propose is an optimal cost-partitioning scheme that uses an LP solver to solve the constraints in Eqn. 1 with the objective function $\max \sum_{\phi \in L(s, \pi)} cost(\phi)$. This has the useful property that given two sets of landmarks $L$ and $L'$, if $L \subseteq L'$, then the estimate computed from $L$ is no greater than the estimate computed from $L'$. In other words, the more landmarks you provide to $h_L$, the more informed the heuristic estimate.
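As an illustration of cost partitioning, the following Python sketch computes a landmark-based estimate with a uniform scheme: each action's cost is split equally among the remaining landmarks it can achieve, and each landmark receives the minimum share offered by its achievers, which satisfies the two constraints described above. It is a simplified sketch, not the LP-based optimal scheme; the function name and the dictionary-based inputs are assumptions made for this example.

```python
def uniform_landmark_estimate(landmarks, achievers, action_cost):
    """Uniform cost-partitioning estimate (illustrative sketch).

    landmarks:   landmarks that still need to be achieved
    achievers:   dict mapping each landmark to the set of actions that can achieve it
    action_cost: dict mapping each action to its non-negative cost
    """
    landmarks = list(landmarks)
    # Count how many remaining landmarks each action can achieve.
    shared_by = {}
    for lm in landmarks:
        for a in achievers[lm]:
            shared_by[a] = shared_by.get(a, 0) + 1
    total = 0.0
    for lm in landmarks:
        if not achievers[lm]:
            return float("inf")  # a required landmark that no action achieves
        # Each achiever offers an equal share of its cost; the landmark gets the minimum.
        total += min(action_cost[a] / shared_by[a] for a in achievers[lm])
    return total

# Hypothetical example: action "a" can achieve both landmarks, "b" only "q".
estimate = uniform_landmark_estimate(
    ["p", "q"],
    {"p": {"a"}, "q": {"a", "b"}},
    {"a": 4.0, "b": 1.0},
)
```

In this example, `p` receives half of the cost of `a` (2.0) and `q` receives the full cost of `b` (1.0), so the estimate is 3.0. Note that the monotonicity property mentioned above is stated for the optimal partitioning; this uniform sketch does not guarantee it.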

2.4 Goal Networks and HGN Methods

We extend the definitions of HGN planning [19] to work with partially-ordered sets of goals, which we call goal networks.

A goal network is a way to represent the objective of satisfying a partially ordered multiset of goals. Formally, it is a pair $gn = (T, \prec)$ such that:

  • $T$ is a finite nonempty set of nodes;

  • each node $t \in T$ contains a goal $g_t$ that is a DNF (disjunctive normal form) formula over ground literals;

  • $\prec$ is a partial order over $T$.

Figure 1: Three generic goal networks we use for examples of the various relationships within a goal network.

We will provide examples of both generic and concrete goal networks. Figure 1 shows three generic goal networks. Each subfigure is itself a goal network denoted . Directed arcs indicate a subgoal pair (e.g., from ) such that the first goal must be satisfied before the second goal. Consider the network where is a subgoal of , then . Network shows a partial ordering, where . Similarly, and this implies both must occur before . Consider a network that is composed of and . Then ). Note that is a partially ordered forest of goal networks.

Figure 2: Goal Network for an Automated Manufacturing domain

Figure 2 shows a concrete goal network for an automated manufacturing domain. joined denotes the goal of assembling the parts and together, while at represents the goal of getting to location . In this goal network, joined and joined are unordered with respect to one another. Furthermore, joined has three subgoals that need to be achieved before achieving it, i.e the goals of getting the parts , and the to the assembly table. These subgoals are also unordered with respect to one another, indicating that the goals can be accomplished in any order.

HGN Methods

An HGN method $m$ is a 4-tuple $(head(m), pre(m), goal(m), network(m))$ where the head $head(m)$ and preconditions $pre(m)$ are similar to those of a planning operator. $goal(m)$ is a conjunct of literals representing the goal that $m$ decomposes. $network(m)$ is the goal network that $m$ decomposes $goal(m)$ into. By convention, $network(m)$ has a last node $t_g$ containing the goal $goal(m)$ to ensure that $m$ accomplishes its own goal.

Figure 3 describes the goal network that the deliver-obj method, a method responsible for solving problems related to delivering parts and tools to their destinations, decomposes a goal into. This method is relevant to at goals (since that’s the last node), and its preconditions are .

Figure 3: Subgoal network of deliver-obj, an HGN method to deliver the part to using .

Whether a node has predecessors impacts the kinds of operations we allow. We refer to any node in a goal network having no predecessors as an unconstrained node of , otherwise the node is constrained. The constrained nodes of Figure 1 include and the remaining are unconstrained. The unconstrained nodes in Figure 2 include all the at nodes as well as the joined node.

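We define the following operations over any goal network $gn = (T, \prec)$:

  1. Goal Release: Let $t \in T$ be an unconstrained node. Then the removal of $t$ from $gn$, denoted by $gn - t$, results in the goal network $gn' = (T', \prec')$ where $T' = T \setminus \{t\}$ and $\prec'$ is the restriction of $\prec$ to $T'$.

  2. Method Application: Let $t \in T$ be an unconstrained node. Also, let $m$ be a method applied to $t$ whose subgoal network is $(T_m, \prec_m)$. Finally, recall that the subgoal network of $m$ always contains a 'last' node that contains the goal $m$ decomposes; let $t_g$ be this node. Then the application of $m$ to $gn$ via $t$, denoted by $gn \circ_t m$, results in the goal network $gn' = (T', \prec')$ where $T' = T \cup T_m$ and $\prec' = \prec \cup \prec_m \cup \{(t_g, t)\}$. Informally, this operation adds the elements of the subgoal network of $m$ to $gn$, preserving the order specified by $\prec$ and $\prec_m$ and setting $t_g$ as a predecessor of $t$.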
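The two operations above can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions: nodes are plain string identifiers standing in for the goals they contain, the ordering is stored as explicit (before, after) pairs, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GoalNetwork:
    nodes: frozenset   # node identifiers (each stands for the goal it contains)
    order: frozenset   # pairs (u, v): u must be achieved before v

    def unconstrained(self):
        """Nodes with no predecessors."""
        constrained = {v for (_, v) in self.order}
        return self.nodes - constrained

def release(gn, t):
    """Goal release: drop an unconstrained node and restrict the ordering."""
    if t not in gn.unconstrained():
        raise ValueError(f"{t} is not an unconstrained node")
    nodes = gn.nodes - {t}
    order = frozenset((u, v) for (u, v) in gn.order if u in nodes and v in nodes)
    return GoalNetwork(nodes, order)

def apply_method(gn, t, subnetwork, last):
    """Method application: add the method's subnetwork and order its last node before t."""
    if t not in gn.unconstrained():
        raise ValueError(f"{t} is not an unconstrained node")
    if last not in subnetwork.nodes:
        raise ValueError("the last node must belong to the method's subnetwork")
    if gn.nodes & subnetwork.nodes:
        raise ValueError("node identifiers must be disjoint")
    return GoalNetwork(gn.nodes | subnetwork.nodes,
                       gn.order | subnetwork.order | {(last, t)})

# Hypothetical example: g1 must precede g2; a method decomposes g1 into a, b, then g1_last.
gn = GoalNetwork(frozenset({"g1", "g2"}), frozenset({("g1", "g2")}))
sub = GoalNetwork(frozenset({"a", "b", "g1_last"}),
                  frozenset({("a", "g1_last"), ("b", "g1_last")}))
decomposed = apply_method(gn, "g1", sub, "g1_last")
```

After the decomposition, only `a` and `b` are unconstrained; `g1` becomes unconstrained again only once `a`, `b`, and `g1_last` have been released.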

2.5 HGN Domains, Problems and Solutions

An HGN domain is a pair $D = (D_{classical}, M)$ where $D_{classical}$ is a classical planning domain and $M$ is a set of HGN methods.

An HGN planning problem is a triple $P = (D, s_0, gn_0)$, where $D$ is an HGN domain, $s_0$ is the initial state, and $gn_0$ is the initial goal network.

Definition 2 (Solutions to HGN Planning Problems).

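The set of solutions for an HGN planning problem $P = (D, s_0, gn_0)$ is defined as follows:

Base Case.

If $gn_0$ is empty, the empty plan is a solution for $P$.

In the following cases, let $t$ be an unconstrained node of $gn_0$ containing the goal $g_t$.

Unconstrained Goal Satisfaction.

If $s_0 \models g_t$, then any solution for $P' = (D, s_0, gn_0 - t)$, where $gn_0 - t$ is the network obtained by releasing $t$, is also a solution for $P$.

Action Application.

If action $a$ is applicable in $s_0$ and $a$ is relevant to $g_t$, and $\pi$ is a solution for $P' = (D, \gamma(s_0, a), gn_0)$, then $a \circ \pi$ is a solution for $P$.

Method Decomposition.

If $m$ is a method applicable in $s_0$ and relevant to $g_t$, then any solution to $P' = (D, s_0, gn_0 \circ_t m)$, where $gn_0 \circ_t m$ is the network obtained by applying $m$ to $gn_0$ via $t$, is also a solution to $P$.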

Note that HGN planning allows an action to be applied only if it is relevant to an unconstrained node in the goal network; this prevents unrestricted chaining of applicable actions as done in classical planning and allows for tighter control of solutions as in HTN planning.

Let us denote by $S(P)$ the set of solutions to an HGN planning problem $P$ as allowed by Definition 2. Then we can define what it means for a solution to be hierarchically optimal with respect to $P$ as follows:

Definition 3 (Hierarchically Optimal Solutions).

A solution $\pi^* \in S(P)$ is hierarchically optimal with respect to $P$ if $cost(\pi^*) = \min_{\pi \in S(P)} cost(\pi)$.

3 The HOpGDP Algorithm

Algorithm 1 describes HOpGDP. It takes as input an HGN domain $D$, the initial state $s_0$ and the initial goal network $gn_0$. It performs an A* search using the admissible HGN heuristic $h_{HL}$ (described in Section 4) to compute a hierarchically optimal solution to the problem; it returns a plan if it finds one, or failure if the problem is unsolvable.

Initialization. It starts by initializing the open list (Line 2), a priority queue that sorts the HGN search nodes yet to be expanded by their $f$-value, where $f = g + h$ and $g$ is the cost of the plan generated so far. The open list initially contains the initial search node $(s_0, gn_0, \langle\rangle)$. It also initializes the visited set (Line 3), which records all nodes seen during the search. This data structure keeps track of the best known path for each $(s, gn)$ pair, and is thus helpful for detecting when we find a cheaper path to a previously seen HGN search node.

1:function ()
4:     while  is not empty do
5:         rem. with lowest -value from
6:         if  is empty then  return          
7:         successors
8:         for  successors do
9:              if  then
10:                  if  then
11:                       replace with                               in
12:                  else continue                   
13:              else add to               
14:              eval. -value of and add to               
15:     return
17:function ()
18:     successors
19:     for unconstrained satisfied in  do
20:         add the node to successors      
21:      actions in applicable in and relevant to an unconstrained goal in
22:     for  do
23:         add the node to successors      
24:      s.t. is applicable in and relevant to an unconstrained goal in
25:     for  do
26:         add the node to successors      
27:     return successors
Algorithm 1 Pseudocode of HOpGDP. It takes as arguments the domain description $D$, the initial state $s_0$, and the initial goal network $gn_0$. It returns a plan if it finds one, or failure if it does not.

Search. HOpGDP now performs an A* search in the space of HGN search nodes starting from the initial node. While the open list is not empty, it does the following (Lines 4-14): it removes the HGN search node $(s, gn, \pi)$ with the best $f$-value from the open list (Line 5) and first checks whether $gn$ is empty (Line 6). If so, all the goals in the initial goal network have been solved, and $\pi$ is the optimal solution to the HGN planning problem.

If $gn$ is not empty, the algorithm uses the successor-generation subroutine to compute the node's successors (Line 7). For each successor node $(s', gn', \pi')$, it checks whether another path $\pi''$ to $(s', gn')$ already exists in the visited set (Line 9). If so, and $\pi''$ is costlier than $\pi'$ (Line 10), it updates the visited set with the new path (Line 11) and reopens the search node (Line 14); otherwise, the known path is at least as cheap as the new plan $\pi'$, and the algorithm simply skips this successor (Line 12).

If $(s', gn')$ has not been seen before, the algorithm adds the node to the visited set to track the currently best-known plan to $(s', gn')$ (Line 13). It also evaluates the $f$-value of the node (this is where $h_{HL}$ is called) and adds it to the open list (Line 14).

If no nodes are left in the open list, the search space has been exhausted without finding a solution, and the algorithm returns failure (Line 15).

Computing Successors. The successor-generation procedure (Lines 17-27) computes the successors of a given HGN search node $(s, gn, \pi)$ in accordance with Definition 2. First, we check whether there are any unconstrained goals in $gn$ that are satisfied in the current state $s$, and create new HGN search nodes by removing each such goal from $gn$ (Lines 19-20). Next, we compute all actions applicable in $s$ and relevant to an unconstrained goal in $gn$ (Line 21) and create new search nodes by progressing $s$ using these actions (Lines 22-23). We then compute all pairs $(m, t)$ such that $m$ is an HGN method applicable in $s$ and relevant to the goal of an unconstrained node $t$ in $gn$ (Line 24) and create new search nodes by decomposing $t$ in $gn$ using $m$ (Lines 25-26). Finally, we return the set of generated successor nodes (Line 27).
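The search loop and successor generation described above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the function names are hypothetical, reopening is realized by pushing a new queue entry and discarding stale ones, and the toy domain uses a totally ordered goal network (a tuple whose first element is the only unconstrained node), integer positions as states, and a blind heuristic.

```python
import heapq
import itertools

def hgn_astar(s0, gn0, successors, h):
    """A* over (state, goal network) nodes; returns (plan, cost) or None."""
    counter = itertools.count()        # tie-breaker so the heap never compares nodes
    best_g = {(s0, gn0): 0.0}          # cheapest known cost for each (s, gn) pair
    open_list = [(h(s0, gn0, ()), next(counter), 0.0, s0, gn0, ())]
    while open_list:
        _, _, g, s, gn, plan = heapq.heappop(open_list)
        if g > best_g[(s, gn)]:
            continue                   # stale entry: a cheaper path was found later
        if not gn:                     # empty goal network: all goals are solved
            return list(plan), g
        for s2, gn2, action, step_cost in successors(s, gn):
            g2 = g + step_cost
            if g2 >= best_g.get((s2, gn2), float("inf")):
                continue               # known path is at least as cheap: skip
            best_g[(s2, gn2)] = g2     # new or cheaper path: (re)open the node
            plan2 = plan + (action,) if action is not None else plan
            heapq.heappush(open_list,
                           (g2 + h(s2, gn2, plan2), next(counter), g2, s2, gn2, plan2))
    return None

def toy_successors(s, gn):
    """Toy domain: reach integer positions listed in a totally ordered goal network."""
    target = gn[0]                     # the only unconstrained goal
    if s == target:                    # goal release
        yield s, gn[1:], None, 0.0
    for name, delta, cost in (("move+1", 1, 1.0), ("move-1", -1, 1.0), ("jump+2", 2, 1.5)):
        if s + delta == target:        # action must be relevant to the unconstrained goal
            yield s + delta, gn, name, cost
    if abs(target - s) > 1:            # method: first reach the position next to the target
        sub = target - 1 if target > s else target + 1
        yield s, (sub,) + gn, None, 0.0

plan, cost = hgn_astar(0, (3,), toy_successors, lambda s, gn, plan: 0.0)
```

In the toy problem no action is directly relevant to the goal of reaching position 3 from position 0, so the search must decompose it; the cheapest plan found jumps to position 2 and then moves to 3.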

4 $h_{HL}$: An Admissible Heuristic for HGN Planning

1:function ()
2:     queueSeeds
3:     queue
4:     while queueSeeds is not empty do
5:         choose a w/o successors from queueSeeds, and remove it along with all associated orderings
6:         , add to queue
7:         add any orderings shares with other goals from already added to LG
8:         while queue is not empty do
9:              pop landmark from queue and use  to generate the new set of landmarks
10:              for  do addLM                              
11:     return LG
13:function ()
14:     if  is a fact and  then
15:         remove from LG and all orderings it is part of      
16:     if  then return      
17:     if  then add to queue and return      
19:function ()
21:     add ordering to LG
Algorithm 2 Procedure for computing landmarks for relaxed HGN planning problems.

As mentioned in Section 3, HOpGDP uses $h_{HL}$ to compute the $h$-values (and thus, the $f$-values) of search nodes. In this section, we describe how to construct $h_{HL}$ in three steps:

  1. We define a relaxation of HGN planning that ignores the provided methods and allows unrestricted action chaining as in classical planning, which expands the set of allowed solutions.

  2. We extend landmark generation algorithms for classical planning problems to compute sound landmark graphs for the relaxed HGN planning problems, which in turn are sound with respect to the original HGN planning problems as well.

  3. We use admissible classical planning heuristics like $h_L$ on these landmark graphs to compute admissible cost estimates for HGN planning problems.

4.1 Relaxed HGN Planning

Definition 4 (Relaxed HGN Planning).

A relaxed HGN planning problem is a triple $P_{relaxed} = (D_{classical}, s_0, gn_0)$ where $D_{classical}$ is a classical planning domain, $s_0$ is the initial state, and $gn_0$ is the initial goal network. Any sequence of actions $\pi$ that is executable in state $s_0$ and achieves the goals in $gn_0$ in an order consistent with the constraints in $gn_0$ is a valid solution to $P_{relaxed}$.

Relaxed HGN planning can thus be viewed as an extension of classical planning to solve for goal networks, where there are no HGN methods and the objective is to generate sequences of actions that satisfy the goals in $gn_0$ in an order consistent with its ordering constraints. In fact, it is easy to show that relaxed HGN planning, in contrast to HGN planning, is no more expressive than classical planning, and relaxed HGN planning problems can be compiled into classical planning problems quite easily.

Next, we will show how to leverage landmark generation algorithms for classical planning to generate landmark graphs for relaxed HGN planning.

4.2 Generating Landmarks for Relaxed HGN Planning

This section describes a landmark discovery technique that can use any landmark discovery technique for classical planning (referred to here as the classical landmark generator), such as [17], to compute landmarks for relaxed HGN planning problems. The main difference is that while classical planning problems are $(s, g)$ pairs, relaxed HGN planning problems are $(s, gn)$ pairs; every goal in the goal network can be thought of as a landmark. Therefore, there is now a partially ordered set of goals to compute landmarks from, as opposed to a single goal in classical planning.

We therefore need to generalize classical planning landmark generation techniques to work for relaxed HGN planning problems. Algorithm 2 describes one such generalization. At a high level, it proceeds by computing landmark graphs for each goal in $gn$ (each of which is in fact a classical planning problem) and merging them all together to create the final landmark graph $LG$.

Algorithm 2 takes as input a relaxed HGN planning problem and generates $LG$, a graph of landmarks. First, queueSeeds is initialized with a copy of $gn$ (Line 2). This is because, unlike in classical planning where we generate landmarks for a single goal, in HGN planning we have a partially ordered set of goals to seed landmark generation; queueSeeds stores these seeds. We also initialize queue, the open list of landmarks, as empty (Line 3).

While there is a goal $g$ from $gn$ that we have not yet computed landmarks for (Line 4), we do the following: we remove it from queueSeeds along with all induced orderings and add it to queue (Lines 5-6). We also add $g$ to $LG$, together with any ordering constraints it has with other elements of $gn$ that have already been added to $LG$ (Line 7). This queue is then used as the starting point for landmark generation. We iteratively use the classical landmark generator to pop landmarks off the queue and generate new landmarks by backchaining until no more landmarks can be generated (Lines 8-10). Each new landmark is added to $LG$ by the landmark-insertion procedure. Once all goals in $gn$ have been handled, the landmark generation process is complete and the algorithm returns $LG$ (Line 11).

The landmark-insertion procedure takes as input a computed landmark $\phi$, adds it to $LG$ and returns the landmark that represents $\phi$ in $LG$. There are three cases to consider:

  • $\phi$ subsumes another landmark $\psi$ in $LG$, implying we can remove $\psi$ and replace it with $\phi$ (since $\phi$ is a stronger version of $\psi$), and return $\phi$ (Lines 14-15)

  • $\phi$ is subsumed by another landmark $\psi$ in $LG$, implying we can ignore $\phi$ (Line 16). In this case, we do not add any new landmark to $LG$ and simply return $\psi$

  • $\phi$ is a new landmark, in which case we simply add it to $LG$ and return $\phi$ (Line 17)

The ordering procedure takes as input a landmark $\phi$ and an ordering constraint between $\phi$ and an existing landmark, and adds them to $LG$. More precisely, it adds $\phi$ to $LG$ using the landmark-insertion procedure, which returns the added landmark $\phi'$. It then adds the ordering constraint between $\phi'$ and the existing landmark in $LG$ (Line 21).
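The three insertion cases can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: a landmark is modeled as a frozenset of facts (a singleton is a fact landmark, a larger set is a disjunctive landmark), a fact landmark subsumes any disjunction that contains it, and the class and method names are hypothetical. Queue handling from Algorithm 2 is omitted.

```python
class LandmarkGraph:
    def __init__(self):
        self.landmarks = set()    # each landmark is a frozenset of facts
        self.orderings = set()    # pairs (before, after)

    def add_landmark(self, facts):
        """Insert a landmark and return the landmark that represents it in the graph."""
        lm = frozenset(facts)
        # Case 1: a fact landmark replaces every weaker disjunction that contains it.
        if len(lm) == 1:
            weaker = {other for other in self.landmarks if lm < other}
            self.landmarks -= weaker
            self.orderings = {(a, b) for (a, b) in self.orderings
                              if a not in weaker and b not in weaker}
        # Case 2: an equal or stronger landmark already exists; return it instead.
        for other in self.landmarks:
            if other <= lm:
                return other
        # Case 3: a genuinely new landmark.
        self.landmarks.add(lm)
        return lm

    def add_landmark_and_ordering(self, facts, successor):
        """Insert a landmark, then order the inserted landmark before `successor`."""
        added = self.add_landmark(facts)
        if added != successor:
            self.orderings.add((added, successor))
        return added

# Hypothetical example: a disjunctive landmark is later replaced by a fact landmark.
graph = LandmarkGraph()
goal = graph.add_landmark({"at(p,dest)"})
either = graph.add_landmark_and_ordering({"in(p,t1)", "in(p,t2)"}, goal)
fact = graph.add_landmark({"in(p,t1)"})
```

After the last call, the disjunction and its ordering have been removed and only the two fact landmarks remain; re-inserting a weaker disjunction that contains `in(p,t1)` returns the existing fact landmark.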

Figure 4: (a) LM graph on goal network containing a single goal at. (b) LM graph after decomposing at with deliver-obj. The double-circled landmarks represent new landmarks inferred after the method decomposition, while the landmarks colored gray are new landmarks that subsumed an existing one in (a).

LM graph computation example. Figure 4 illustrates the working of Algorithm 2. Let us assume the goal network $gn$ contains only one goal, an at goal. Figure 4(a) illustrates the output of Algorithm 2 on $gn$. This is identical to what the classical landmark generator would produce, since $gn$ contains only one goal, making the relaxed HGN problem equivalent to a classical planning problem.

Now, let us assume that we decompose this goal using the deliver-obj method, and get the new goal network $gn'$, which essentially looks like an instantiated version of the network in Figure 3. If we now run Algorithm 2 on $gn'$, we generate the landmark graph in Figure 4(b), which is a more focused version of the first landmark graph. This is because the goals in $gn'$ are landmarks that must be accomplished, which constrains the set of valid solutions that can be generated. For instance, since we have committed to a particular agent, every solution we can generate from $gn'$ will involve the use of that agent. We can, as a result, generate more focused landmarks than we could have from the top-level goal alone. This includes fact landmarks that replace disjunctive landmarks (the ones in gray in Figure 4(b)) as well as completely new landmarks that arise as a result of the method; e.g., the reserved fact is not a valid landmark for $gn$, but is one for $gn'$.

An important point to note is that the subgoals in $gn'$ are not true landmarks for $gn$; they are landmarks only once we commit to applying the method. However, this ends up being useful, since it allows us to generate different landmark graphs for different methods; for instance, had we committed to a different method, we would have obtained a different set of landmarks specific to it. Landmark-based heuristics applied to these two graphs would yield different heuristic estimates, allowing HOpGDP to differentiate between the two methods using the specific subgoals each method introduces.

It is easy to show that Algorithm 2 generates sound landmark graphs for relaxed HGN planning problems:

Claim 5.

Given a relaxed HGN planning problem $P_{relaxed}$, the landmark graph that Algorithm 2 computes for $P_{relaxed}$ is a sound landmark graph for $P_{relaxed}$.

Let $P$ be an HGN planning problem, and let $P_{relaxed}$ be the corresponding relaxed version. Then by definition, any solution to $P$ is a solution to $P_{relaxed}$. Therefore, a landmark of $P_{relaxed}$ is also a sound landmark of $P$. More generally, a landmark graph generated for $P_{relaxed}$ is sound with respect to $P$ as well:

Claim 6.

Given an HGN planning problem $P$ and its relaxed version $P_{relaxed}$, the landmark graph that Algorithm 2 computes for $P_{relaxed}$ is a sound landmark graph for $P$.

4.3 Computing $h_{HL}$

The main insight behind $h_{HL}$ is the following: since Algorithm 2 generates sound landmarks and orderings for relaxed (and therefore regular) HGN planning problems, we can use any admissible landmark-based heuristic from classical planning to derive an admissible cost estimate for HGN planning problems.

In particular, $h_{HL}$ uses $h_L$ as follows: given an HGN search node $(s, gn, \pi)$, let $LG$ be the landmark graph that Algorithm 2 computes for $gn$. Then

$$h_{HL}(s, gn, \pi) = h_L(LG, s, \pi) \tag{2}$$

where $\pi$ is the plan generated to get to $s$.

A couple of important implementation details: when using $h_L$ to guide classical planners, it is sufficient to compute the landmark graph just once upfront, since it can be reused in every state along the plan because the goal stays the same. This is not the case in HGN planning: method decomposition can change the goal network, so $h_{HL}$ requires re-computing the landmark graph at each node. In our implementation, we optimize this process by computing landmark graphs for each goal network we encounter from the initial state and caching them for use in future nodes containing the same goal network. Section 5.2 discusses the impact of this overhead in the experiments. Secondly, while the optimal cost partitioning scheme in $h_L$ provides more informed heuristic estimates, we chose to use the uniform cost partitioning scheme in our implementation, since the former requires solving an LP at each search node, which is costly.
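The caching idea can be illustrated with a short Python sketch. It assumes goal networks are hashable and that a landmark generator is supplied as a function; the class name and the stand-in generator below are hypothetical and only demonstrate that each distinct goal network triggers one computation.

```python
class LandmarkGraphCache:
    """Memoizes landmark graphs per goal network (illustrative sketch)."""

    def __init__(self, generate):
        self._generate = generate   # function: goal network -> landmark graph
        self._graphs = {}
        self.computations = 0       # how many times the generator actually ran

    def get(self, goal_network):
        """Return the cached landmark graph, computing it on first use."""
        if goal_network not in self._graphs:
            self.computations += 1
            self._graphs[goal_network] = self._generate(goal_network)
        return self._graphs[goal_network]

# Stand-in generator: a real one would run landmark generation from the initial state.
def fake_generator(goal_network):
    return frozenset(goal_network)

cache = LandmarkGraphCache(fake_generator)
first = cache.get(("at(p,dest)",))
second = cache.get(("at(p,dest)",))             # same goal network: served from the cache
third = cache.get(("at(t,src)", "at(p,dest)"))  # new goal network: computed once
```

Search nodes that share a goal network but differ in state or plan reuse the same graph, so the generation cost is paid once per distinct goal network rather than once per node.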

4.4 Admissibility of $h_{HL}$

Claim 6 shows that given an HGN problem $P$, the landmark graph $LG$ computed by Algorithm 2 is a sound landmark graph with respect to $P$. Furthermore, Theorem 1 shows that $h_L(LG, s, \pi)$ provides an admissible cost estimate of the optimal plan starting from $s$ that achieves all the landmarks in $LG$. Since every solution to $P$ has to achieve all the landmarks in $LG$ in a consistent order, $h_L(LG, s, \pi)$ provides an admissible estimate of the optimal cost of solving $P$ as well. By Eq. 2, $h_{HL}(s, gn, \pi) = h_L(LG, s, \pi)$. Therefore, we have the following theorem:

Theorem 7 (Admissibility of $h_{HL}$).

Given an HGN planning domain $D$, a search node $(s, gn, \pi)$ and its cost-optimal solution $\pi^*$, $h_{HL}(s, gn, \pi) \le cost(\pi^*)$.
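The argument can be summarized as a chain of inequalities. The symbols $c^*_{relaxed}$ and $c^*_{HGN}$ are introduced here only for illustration: they denote the optimal solution costs of the relaxed problem and of the original HGN problem from the search node.

```latex
\begin{align*}
h_{HL}(s, gn, \pi)
  &= h_L(LG, s, \pi)   && \text{(Eq. 2)} \\
  &\le c^*_{relaxed}   && \text{(Theorem 1, with $LG$ sound for the relaxed problem)} \\
  &\le c^*_{HGN}       && \text{(every HGN solution is also a relaxed solution)}
\end{align*}
```

The last step holds because relaxing the problem can only enlarge the set of solutions, so the cheapest relaxed solution is never costlier than the cheapest hierarchical one.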

Figure 5: Number of nodes expanded and running times of the planners across the Logistics, Blocks-World, and Depots domains. Each data point is the average over 25 randomly generated problems. Data points for which not all problems were solved were discarded.

5 Experimental Evaluation

We implemented HOpGDP within the Fast-Downward codebase, and extended LAMA's landmark generation code to develop $h_{HL}$, our HGN planning heuristic.

We tested two hypotheses in our study:

H1: HOpGDP's ability to exploit hierarchical planning knowledge enables it to outperform state-of-the-art optimal classical planners. To test this, we compared the performance of HOpGDP with A*-$h_L$ [11], the optimal classical planner whose heuristic we extended to develop $h_{HL}$.

It might seem that H1 is obviously true due to the dominance of hierarchical planners (e.g., SHOP2 and GDP) over classical planners, but these are merely satisficing planners. It is not clear whether this advantage would carry over to optimal planning because HOpGDP needs to do an optimal search in the possibly larger space of $(s, gn)$ pairs, in contrast to classical planners, which search in the space of states.

H2: The heuristic used by HOpGDP, $h_{HL}$, provides useful search guidance. To test this, we compared the performance of HOpGDP with HOpGDP$_{blind}$, which is identical to HOpGDP except that it uses the trivial heuristic estimate of $h = 0$.

5.1 Experimental Results

We evaluated HOpGDP, HOpGDP$_{blind}$, and A*-$h_L$ on three well-known planning benchmarks: Logistics, Blocks World and Depots. We chose these domains because, from a control-knowledge standpoint, they capture a wide spectrum: Logistics contains only enough control knowledge to define allowed solutions, Blocks-World is at the other extreme, defining sophisticated knowledge that significantly prunes the search space, and Depots incorporates elements of both.

For each domain, we randomly generated 25 problem instances per problem size. We ran all problems on a Xeon E5-2639 with a per problem limit of 4 GB of RAM and 25 minutes of planning time. Data points were discarded if the planner did not solve all of the corresponding problem instances within the time limit.

Logistics. We modified the standard PDDL Logistics model to limit the capacity of all vehicles to one to ensure the HGN and non-HGN planners compute the same solutions. We generated 25 random logistics problems for each problem size ranging from 4,6,…,14 packages. For HOpGDP and HOpGDP$_{blind}$, we provided the HGN methods used in GDP's experimental evaluation [18]. There are three methods in this knowledge base that together capture all the possible (minimal) solutions to a Logistics problem; these are (1) a method to move packages within the same city using trucks, (2) a method to move packages between airports using planes, and (3) a method that combines the previous two to move packages across different cities.

Figures 5(a) and 5(d) show the performance of the three planners in terms of number of nodes expanded and overall planning time. Both HOpGDP and HOpGDP$_{blind}$ could solve problems up to size 10 (i.e., within the time limit), while A*-$h_L$ could solve problems only up to size 6.

In terms of nodes expanded, Figure 5(a) shows that the heuristic in HOpGDP helped to modestly decrease the number of nodes expanded; HOpGDP on average expanded 22% fewer nodes than HOpGDP$_{blind}$. We did not include A*-$h_L$ because it expanded many orders of magnitude more nodes than either HOpGDP variant (e.g., for problems of size 6, A*-$h_L$ on average expanded nodes). With regard to running time, Figure 5(d) shows that the modest gain by HOpGDP was outweighed by the computational overhead of running the heuristic (on average about 35% of the total running time). HOpGDP$_{blind}$, despite its blind search, was slightly faster than HOpGDP.

Blocks World. We generated 25 random blocks-world problems for problem sizes ranging from 4,6,…,20 blocks. As in our study with Logistics, we use the same HGN methods used in ’s evaluation [18]. In contrast to our Logistics study, the methods encode sophisticated knowledge that allows the planners to prune search paths that don’t lead to good solutions (e.g., it contains a recursively defined axiom that checks if a block is in its final position and only then builds towers on top of it).

Figures (b)b and (e)e show the performance of the three planners on these blocks-world problems.  could solve problems up to size 10,  to size 16, and  could solve problems up to size 18.

Figure (b) displays the number of nodes expanded by the three planners. In this domain, the guidance provided by  helped substantially;  on average expanded 76% fewer nodes than . These savings far outweighed the heuristic computation overhead (on average about 48% of the total running time), resulting in smaller overall planning times for , as can be seen in Figure (e).

Depots. We generated 25 random depots problems for problem sizes ranging from 4, 5, …, 10 crates. Since the Depots domain combines aspects of Logistics (moving cargo around) and Blocks-World (stacking it in a particular manner), the HGN methods for Depots are a combination of the HGN methods used in Logistics and Blocks-World.

Figures (c) and (f) show the performance of the three planners on the generated problems.  could solve problems up to only size 6, while both  and  could solve problems up to size 9. Figure (c) shows the average number of nodes expanded by the three planners. The  heuristic in  provides good search guidance, reducing the number of nodes expanded by about 46% when compared to . As in Logistics, we did not show the nodes expanded by  since it was many orders of magnitude more than either  variant; for size 6 problems, on average, it expanded nodes.

In terms of planning time (Figure (f)), the provided domain knowledge clearly helps both  variants in scaling much better than . Furthermore, the additional search guidance provided by  results in overall lower runtimes for  in comparison to , even with the computation overhead of the heuristic (which is about 56% of the total time).

5.2 Interpretation of Results

There are two main takeaways from this empirical study:

Support for H1.

Hierarchical planning knowledge helps in scaling up solving of optimal planning problems. In all three benchmark domains, both of the  variants solved more problems while requiring less time and expanding fewer nodes than , showing that the additional overhead of searching through the space of pairs was outweighed by the benefit that hierarchical planning knowledge can provide in terms of more focused search.

Support for H2.

The HGN heuristic  provides useful guidance when searching for hierarchically optimal plans. We can conclude this from the decrease in the number of nodes expanded in  as compared to  in all three benchmark domains. The Logistics results only weakly support this due to only a modest decrease in the number of nodes expanded (22%), while the results from Blocks-World and Depots are more conclusive, registering large savings in number of nodes expanded (76% and 46% respectively).

We posit that the reduction in number of nodes expanded by  is a function of the input HGN knowledge. For instance, the Logistics methods do not encode any expert knowledge and instead only model the minimum knowledge required to capture the three ways to move a package: by truck, by plane, and by a combination of the two. Therefore, the goal networks always contain landmarks or more focused versions of landmarks (e.g. instead of ) that can be detected by landmark generation algorithms. This means that the landmarks generated do not change much after a method application, implying that the heuristic estimates are unlikely to change much either. In contrast, methods in both Blocks-World and Depots contain specialized knowledge that, when applied, yield goal networks containing subgoals and orderings that cannot be detected by landmark generation algorithms. That is, when landmark generation is run on these goal networks, because the subgoals in the goal network serve as seeds for landmark generation, a richer set of landmarks will be generated, resulting in more informed heuristic estimates.

Another important takeaway from the experiments is the following: the current implementation of  imposes a substantial overhead on . On average, it uses 35%, 48% and 56% of the total planning time in Logistics, Blocks-World, and Depots respectively. This is partly due to the current implementation not being optimized. For instance, unlike landmark-based classical planners where the landmark graph needs to be computed only once for the final goal,  needs to compute landmark graphs for every goal network it generates during search. Reusing the computed landmark graphs more effectively can potentially help in substantially reducing planning times.
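One simple form of the reuse mentioned above is to memoize landmark computations so that a landmark set is generated at most once per distinct input. The following Python sketch illustrates the idea only; it is not the implementation evaluated in this paper. It assumes, as a simplification, that a goal network can be represented as an unordered set of goal literals, and `LandmarkCache` and `toy_generator` are hypothetical names introduced for this example. The generator is a placeholder standing in for a real (and expensive) landmark generation procedure.

```python
from typing import Callable, Dict, FrozenSet, Tuple

State = FrozenSet[str]
GoalNetwork = FrozenSet[str]  # simplification: orderings between goals are ignored
Landmarks = FrozenSet[str]


class LandmarkCache:
    """Memoizes landmark sets keyed by (state, goal network)."""

    def __init__(self, generate: Callable[[State, GoalNetwork], Landmarks]):
        self._generate = generate
        self._cache: Dict[Tuple[State, GoalNetwork], Landmarks] = {}
        self.misses = 0  # number of times the expensive generator was run

    def landmarks(self, state: State, goals: GoalNetwork) -> Landmarks:
        key = (state, goals)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._generate(state, goals)
        return self._cache[key]


def toy_generator(state: State, goals: GoalNetwork) -> Landmarks:
    # Placeholder only: treats every goal not yet true in the state as a landmark.
    return frozenset(goals - state)


cache = LandmarkCache(toy_generator)
state = frozenset({"at(p1, depot)"})
goals = frozenset({"at(p1, depot)", "at(p2, airport)"})
first = cache.landmarks(state, goals)
second = cache.landmarks(state, goals)  # served from the cache, no regeneration
```

Exact-match caching only helps when the same state and goal network recur during search. Incrementally updating a landmark graph after a method application would be a stronger form of reuse, but it is beyond what this sketch shows.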

6 Related Work

HTN planners solve planning problems by (1) forward state-space search, such as in the SHOP [16] and SHOP2 [15] HTN planners, or (2) partial-order causal-link planning (POCL) techniques, such as in UMCP [8] and in the hybrid planning literature [7, 6].

HGN planning can be translated to HTN planning in a plan-preserving manner [3], meaning we can, in theory, use any optimal HTN planner for optimal HGN planning. However, there is little research on search heuristics for forward-search HTN planning [2, 1]. Therefore, planners often provide other domain-specific mechanisms for users to encode search strategies. For example, SHOP2 allows domain-specific knowledge, known as HTN methods, to be specified in a ‘good’ order according to the user, and attempts to apply them in the same order. SHOP2 also provides support for external function calls [15] that can call arbitrary code to perform intensive computations, thus minimizing the choices that need to be made during search. For example, in the 2002 Planning Competition for hand-tailored planners, the authors of SHOP2 used a graph-algorithm library that SHOP2 could call externally to generate shortest paths [15].

Waisbrot et al. [23] developed , an HTN planner that augments SHOP2 with classical planning heuristics to make local decisions on which method to apply next by estimating how close the method’s goal is to the current state. However, it retains the depth-first search structure of SHOP2, making it difficult to generate high-quality plans.

Marthi et al. [13, 14] propose an HTN-like formalism called angelic hierarchical planning that allows users to annotate abstract tasks with additional domain-specific information (i.e., lower and upper bounds on the costs of the possible plans they can be used to generate). They then use this information to compute hierarchically-optimal plans. In contrast, we require the costs of only the primitive actions and use domain-independent search heuristics to compute hierarchically-optimal plans.

There has been recent work on developing search heuristics for POCL HTN planners [7, 6]. However, these heuristics typically estimate how many more plan refinement steps need to be taken from a search node to obtain a solution. This differs from estimating plan quality, which is our focus in this paper.

Hierarchical Goal Network (HGN) Planning combines the hierarchical structure of HTN planning with the goal-based nature of classical planning. It therefore allows for easier infusion of techniques from classical planning into hierarchical planning, such as adapting the FF heuristic for method ordering in the  planner [19], and using landmark-based techniques to plan with partial amounts of domain knowledge in  [18]. However, both planners use depth-first search and inadmissible heuristics, so they cannot provide any guarantees of plan quality.

Another less-related domain-configurable planning formalism is Planning with Control Rules [4], where domain-specific knowledge is encoded as linear-temporal logic (LTL) formulas. TLPlan, one of the earliest planners developed under this formalism, used control rules written in LTL to prune trajectories deemed suboptimal by the user. There have also been attempts to develop heuristic search planners that can plan with LTL, a simplified version of LTL that works with finite traces. This has been used to incorporate search heuristics to solve for temporally extended goals written in LTL [5], planning for preferences [22], as well as to express landmark-based heuristics that guide classical planners [21].

7 Conclusion

Despite the popularity of hierarchical planning techniques in theory and practice, little effort has been devoted to developing domain-independent search heuristics that can provide useful search guidance towards high-quality solutions. As a result, end-users need to encode domain-specific heuristics into the domain models, which can make the domain-modeling process tedious and error-prone.

To address this issue, we leverage recent work on HGN planning, which allows tighter integration of hierarchical and classical planning, to develop (1) , an admissible HGN planning heuristic, and (2) , an A* search algorithm guided by  to compute hierarchically-optimal plans. Our experimental study showed that  outperforms optimal heuristic search classical planners (due to its ability to exploit domain-specific planning knowledge) and optimal blind search HGN planners (due to the search guidance provided by ).
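To illustrate why an admissible heuristic can reduce node expansions without sacrificing optimality, the following is a minimal, generic A* sketch in Python. It is not the planner described in this paper: the search nodes are cells of a toy 5×5 grid with unit-cost moves, and `astar`, `grid_successors`, and `manhattan` are hypothetical names introduced only for this example. Running the same search with a zero heuristic gives the blind-search baseline.

```python
import heapq
from itertools import count


def astar(start, is_goal, successors, h):
    """Generic A* with re-opening. Returns (cost, path, nodes_expanded)."""
    tie = count()  # tie-breaker so the heap never compares nodes directly
    frontier = [(h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    expanded = 0
    while frontier:
        _, _, g, node, path = heapq.heappop(frontier)
        if g > best_g[node]:
            continue  # stale entry: a cheaper path to this node was found later
        if is_goal(node):
            return g, path, expanded
        expanded += 1
        for nxt, step_cost in successors(node):
            new_g = g + step_cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    frontier, (new_g + h(nxt), next(tie), new_g, nxt, path + [nxt])
                )
    return None, [], expanded


SIZE = 5
GOAL = (4, 4)


def grid_successors(cell):
    """Unit-cost moves to the four neighbours that lie inside the grid."""
    x, y = cell
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < SIZE and 0 <= ny < SIZE:
            yield (nx, ny), 1


def manhattan(cell):
    """Admissible on this grid: never overestimates the remaining cost."""
    return abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])


def is_goal(cell):
    return cell == GOAL


guided_cost, _, guided_expanded = astar((2, 2), is_goal, grid_successors, manhattan)
blind_cost, _, blind_expanded = astar((2, 2), is_goal, grid_successors, lambda cell: 0)
reduction = 100.0 * (blind_expanded - guided_expanded) / blind_expanded
print(f"cost {guided_cost} vs {blind_cost}; "
      f"expanded {guided_expanded} vs {blind_expanded} ({reduction:.0f}% fewer)")
```

Both runs return a plan of the same (optimal) cost; only the number of expansions differs. This is the quantity compared between the heuristic and blind variants in the experiments.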

There are several directions for future work, such as:

  • Extension to Anytime Planning: An obvious and practically useful extension of this work is to extend  to work in an anytime manner (i.e., generate a solution quickly such that a solution is available at any time during execution and then iteratively/continuously improve the plan’s quality over time) instead of trying to compute the optimal solution up-front. We can of course adapt techniques used in anytime classical planners like LAMA, which runs a series of weighted-A* searches. However, we also plan to explore the use of block-deordering [20], a technique for continual plan improvement that seems to lend itself well to plans that are hierarchically structured.

  • Extension to Temporal Planning: We also plan on investigating temporal extensions of HGN planning and  to develop search heuristics and hierarchical planners that can leverage procedural knowledge to find high-quality plans and schedules.
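The anytime scheme mentioned in the first item, a series of weighted-A* searches with decreasing weights, can be sketched generically as follows. This is an illustrative Python sketch under stated assumptions, not LAMA's implementation and not part of the planner described in this paper: `weighted_astar`, `anytime_search`, the weight schedule, and the four-node example graph are all hypothetical. Each iteration prunes any path whose cost cannot beat the best solution found so far, so every reported solution is strictly cheaper than the previous one.

```python
import heapq
from itertools import count


def weighted_astar(start, is_goal, successors, h, weight, bound=float("inf")):
    """Weighted A* (f = g + weight * h) that only accepts paths cheaper than bound."""
    tie = count()
    frontier = [(weight * h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, _, g, node, path = heapq.heappop(frontier)
        if g > best_g[node]:
            continue  # stale entry
        if is_goal(node):
            return g, path
        for nxt, step_cost in successors(node):
            new_g = g + step_cost
            # Skip paths that cannot improve on the incumbent or on a known path.
            if new_g >= bound or new_g >= best_g.get(nxt, float("inf")):
                continue
            best_g[nxt] = new_g
            heapq.heappush(
                frontier, (new_g + weight * h(nxt), next(tie), new_g, nxt, path + [nxt])
            )
    return None


def anytime_search(start, is_goal, successors, h, weights=(5, 2, 1)):
    """Yields (weight, cost, path) each time a strictly better solution is found."""
    best = float("inf")
    for w in weights:
        result = weighted_astar(start, is_goal, successors, h, w, bound=best)
        if result is not None:
            best, path = result
            yield w, best, path


# Hypothetical example graph: the greedy-looking route S-A-G costs 11,
# while the optimal route S-B-G costs 8.
graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 10)], "B": [("G", 4)], "G": []}
h_values = {"S": 5, "A": 1, "B": 4, "G": 0}  # admissible estimates

solutions = list(
    anytime_search("S", lambda n: n == "G", lambda n: graph[n], h_values.get)
)
costs = [cost for _, cost, _ in solutions]
```

With an admissible heuristic and a final weight of 1, the last solution reported is optimal: either the weight-1 search finds it, or that search exhausts all cheaper paths and confirms the incumbent.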

This work is sponsored in part by OSD ASD (R&E). The information in this paper does not necessarily reflect the position or policy of the sponsors, and no official endorsement should be inferred. Ron Alford performed part of this work under an ASEE postdoctoral fellowship at NRL. We would like to thank the anonymous reviewers at ECAI 2016 for their insightful comments, and the reviewers at HSDIP 2016 for useful feedback on a preliminary version of this paper.

References


  • [1] Ron Alford, Gregor Behnke, Daniel Höller, Susanne Biundo, Pascal Bercher, and David W. Aha, ‘Bound to plan: Exploiting classical heuristics via automatic translations of tail-recursive HTN problems’, in Proc. of the 26th Int. Conf. on Automated Planning and Scheduling (ICAPS). AAAI Press, (2016).
  • [2] Ron Alford, Vikas Shivashankar, Ugur Kuter, and Dana S. Nau, ‘On the feasibility of planning graph style heuristics for HTN planning’, in Proc. of the 24th Int. Conf. on Automated Planning and Scheduling (ICAPS), pp. 2–10. AAAI Press, (2014).
  • [3] Ron Alford, Vikas Shivashankar, Mark Roberts, Jeremy Frank, and David W. Aha, ‘Hierarchical planning: relating task and goal decomposition with task sharing’, in Proc. of the 25th Int. Joint Conf. on Artificial Intelligence (IJCAI). AAAI Press, (2016).
  • [4] Fahiem Bacchus and Froduald Kabanza, ‘Using temporal logics to express search control knowledge for planning’, Artif. Intell., 116, 123–191, (2000).
  • [5] Jorge A. Baier and Sheila A. McIlraith, ‘Planning with first-order temporally extended goals using heuristic search’, in AAAI Conference on Artificial Intelligence, (2006).
  • [6] Pascal Bercher, Shawn Keen, and Susanne Biundo, ‘Hybrid planning heuristics based on task decomposition graphs’, in Proc. of the Seventh Annual Symposium on Combinatorial Search (SoCS), pp. 35–43. AAAI Press, (2014).
  • [7] Mohamed Elkawkagy, Pascal Bercher, Bernd Schattenberg, and Susanne Biundo, ‘Improving hierarchical planning performance by the use of landmarks’, in AAAI Conference on Artificial Intelligence, pp. 1763–1769, (2012).
  • [8] Kutluhan Erol, James Hendler, and Dana S. Nau, ‘UMCP: A sound and complete procedure for hierarchical task-network planning’, pp. 249–254, (June 1994). ICAPS 2009 influential paper honorable mention.
  • [9] J. Hoffmann and Bernhard Nebel, ‘The FF planning system’, Journal of Artificial Intelligence Research, 14, 253–302, (2001).
  • [10] Jörg Hoffmann, Julie Porteous, and Laura Sebastia, ‘Ordered landmarks in planning’, Journal of Artificial Intelligence Research, 22, 215–278, (2004).
  • [11] Erez Karpas and Carmel Domshlak, ‘Cost-optimal planning with landmarks’, in IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, July 11-17, 2009, ed., Craig Boutilier, pp. 1728–1733, (2009).
  • [12] Derek Long, Henry A. Kautz, Bart Selman, Blai Bonet, Hector Geffner, Jana Koehler, Michael Brenner, Jorg Hoffmann, Frank Rittinger, Corin R. Anderson, Daniel S. Weld, David E. Smith, and Maria Fox, ‘The AIPS-98 planning competition’, AI Magazine, 21, 13–33, (2000).
  • [13] B. Marthi, S.J. Russell, and J. Wolfe, ‘Angelic semantics for high-level actions’, in International Conference on Automated Planning and Scheduling, (2007).
  • [14] B. Marthi, S.J. Russell, and J. Wolfe, ‘Angelic hierarchical planning: Optimal and online algorithms’, in International Conference on Automated Planning and Scheduling, pp. 222–231, (2008).
  • [15] Dana S. Nau, Tsz-Chiu Au, Okhtay Ilghami, Ugur Kuter, J William Murdock, Dan Wu, and Fusun Yaman, ‘SHOP2: An HTN planning system’, Journal of Artificial Intelligence Research, 20, 379–404, (2003).
  • [16] Dana S. Nau, Yue Cao, Amnon Lotem, and Héctor Muñoz-Avila, ‘SHOP: Simple hierarchical ordered planner’, in International Joint Conference on Artificial Intelligence, ed., Thomas Dean, pp. 968–973, (August 1999).
  • [17] Silvia Richter and Matthias Westphal, ‘The LAMA planner: Guiding cost-based anytime planning with landmarks’, J. Artif. Intell. Res. (JAIR), 39, 127–177, (2010).
  • [18] Vikas Shivashankar, Ron Alford, Ugur Kuter, and Dana Nau, ‘The GoDeL planning system: a more perfect union of domain-independent and hierarchical planning’, in Proc. of the 23rd Int. Joint Conf. on Artificial Intelligence (IJCAI), pp. 2380–2386. AAAI Press, (2013).
  • [19] Vikas Shivashankar, Ugur Kuter, Dana Nau, and Ron Alford, ‘A hierarchical goal-based formalism and algorithm for single-agent planning’, in Proc. of the 11th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS), volume 2, pp. 981–988. Int. Foundation for Autonomous Agents and Multiagent Systems, (June 2012).
  • [20] Fazlul Hasan Siddiqui and Patrik Haslum, ‘Continuing plan quality optimisation’, J. Artif. Intell. Res. (JAIR), 54, 369–435, (2015).
  • [21] Salome Simon and Gabriele Roger, ‘Finding and exploiting LTL trajectory constraints in heuristic search’, in Symposium on Combinatorial Search, (2015).
  • [22] Shirin Sohrabi, Jorge Baier, and Sheila McIlraith, ‘HTN planning with preferences’, in IJCAI, (2009).
  • [23] Nathaniel Waisbrot, Ugur Kuter, and Tolga Konik, ‘Combining heuristic search with hierarchical task-network planning: A preliminary report’, in International Conference of the Florida Artificial Intelligence Research Society, (2008).