Cost-Optimal Algorithms
for Planning with Procedural Control Knowledge
Abstract
There is an impressive body of work on developing heuristics and other reasoning algorithms to guide search in optimal and anytime planning algorithms for classical planning. However, very little effort has been directed towards developing analogous techniques to guide search towards high-quality solutions in hierarchical planning formalisms like HTN planning, which allows using additional domain-specific procedural control knowledge. In lieu of such techniques, this control knowledge often needs to provide the necessary search guidance to the planning algorithm, which imposes a substantial burden on the domain author and can yield brittle or error-prone domain models. We address this gap by extending recent work on a new hierarchical goal-based planning formalism called Hierarchical Goal Network (HGN) Planning to develop the Hierarchically-Optimal Goal Decomposition Planner (HOpGDP), an HGN planning algorithm that computes hierarchically-optimal plans. HOpGDP is guided by $h_{HL}$, a new HGN planning heuristic that extends existing admissible landmark-based heuristics from classical planning to compute admissible cost estimates for HGN planning problems. Our experimental evaluation across three benchmark planning domains shows that HOpGDP compares favorably both to optimal classical planners, due to its ability to use domain-specific procedural knowledge, and to a blind-search version of HOpGDP, due to the search guidance provided by $h_{HL}$.
1 Motivation and Background
Formalisms for automated planning (to represent and solve planning problems) broadly fall into either domain-independent planning or domain-configurable planning. Domain-independent planning formalisms, such as classical planning, require that users provide only models of the base actions executable in the domain. In contrast, domain-configurable planning formalisms (e.g., Hierarchical Task Network (HTN) planning) allow users to supplement action models with additional domain-specific knowledge structures that increase the expressivity and scalability of planning systems.
An impressive body of work exploring search heuristics has been developed for classical planning that has helped speed up generation of high-quality solutions. More specifically, search heuristics such as the relaxed planning graph heuristic [9], landmark generation algorithms [10, 17], and landmark-based heuristics [17, 11] dramatically improved optimal and anytime planning algorithms by guiding search towards (near-)optimal solutions to planning problems.
Yet relatively little effort has been devoted to developing analogous techniques to guide search towards high-quality solutions in domain-configurable planning systems. In lieu of such search heuristics, domain-configurable planners often require additional domain-specific knowledge to provide the necessary search guidance. This requirement not only imposes a significant burden on the user, but also sometimes leads to brittle or error-prone domain models. In fact, getting the best of heuristic search and hierarchical procedural knowledge (to decompose planning tasks) has remained an unsolved problem since planning competitions first focused on heuristic search at AIPS98 [12].
In this paper, we address this gap by developing the Hierarchically-Optimal Goal Decomposition Planner (HOpGDP), a hierarchical planning algorithm that uses admissible heuristic estimates to generate hierarchically-optimal plans (i.e., plans that are valid and optimal with respect to the given hierarchical knowledge). HOpGDP leverages recent work on a new hierarchical planning formalism called Hierarchical Goal Network (HGN) Planning [19, 18], which combines the hierarchical structure of HTN planning with the goal-based nature of classical planning.
In particular, our contributions are as follows:

Admissible Heuristic: We present an HGN planning heuristic, $h_{HL}$ (HGN Landmark heuristic), that extends landmark-based admissible classical planning heuristics to derive admissible cost estimates for HGN planning problems. To the best of our knowledge, $h_{HL}$ is the first non-trivial admissible hierarchical planning heuristic.

Optimal Planning Algorithm: We introduce HOpGDP, an A* search algorithm that uses $h_{HL}$ to generate hierarchically-optimal plans.

Experimental Evaluation: We describe an empirical study on three benchmark planning domains in which HOpGDP outperforms optimal classical planners due to its ability to exploit hierarchical knowledge. We also found that $h_{HL}$ provides useful search guidance; despite substantial computational overhead, HOpGDP compares favorably in terms of runtime and nodes explored to a blind-search variant of HOpGDP that uses the trivial heuristic $h = 0$.
2 Preliminaries
In this section we detail the classical planning model, review how landmarks are constructed for classical planning along with the admissible landmark-based heuristic $h_L$, and describe HGN planning using examples from assembly planning.
2.1 Classical Planning
We define a classical planning domain $D$ as a finite state-transition system in which each state $s$ is a finite set of ground atoms of a first-order language $\mathcal{L}$, and each action $a$ is a ground instance of a planning operator $o$. A planning operator is a 4-tuple $o = (head(o), pre(o), eff(o), cost(o))$, where $pre(o)$ and $eff(o)$ are conjuncts of literals called $o$’s preconditions and effects, and $head(o)$ includes $o$’s name and argument list (a list of the variables in $pre(o)$ and $eff(o)$). $cost(o)$ represents the non-negative cost of applying operator $o$.
Actions. An action $a$ is executable in a state $s$ if $s \models pre(a)$, in which case the resulting state is $\gamma(s, a) = (s - eff^{-}(a)) \cup eff^{+}(a)$, where $eff^{+}(a)$ and $eff^{-}(a)$ are the atoms and negated atoms, respectively, in $eff(a)$. A plan $\pi = \langle a_1, \ldots, a_n \rangle$ is executable in $s$ if each $a_i$ is executable in the state produced by $a_{i-1}$; and in this case $\gamma(s, \pi)$ is the state produced by executing $\pi$. If $\pi$ and $\pi'$ are plans or actions, then their concatenation is $\pi \circ \pi'$.
We define the cost of $\pi$ as the sum of the costs of the actions in the plan, i.e., $cost(\pi) = \sum_{i=1}^{n} cost(a_i)$.
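The definitions above can be illustrated with a minimal Python sketch. It is not taken from the paper's implementation: the `Action` record, the split of preconditions into positive and negative atom sets, and the example actions `move` and `load` are all assumptions made for illustration.

```python
from typing import FrozenSet, NamedTuple, Sequence, Tuple

State = FrozenSet[str]  # a state is a finite set of ground atoms


class Action(NamedTuple):
    """Hypothetical ground action; field names are illustrative only."""
    name: str
    pre_pos: FrozenSet[str]  # atoms that must hold
    pre_neg: FrozenSet[str]  # atoms that must not hold
    add: FrozenSet[str]      # positive effects
    delete: FrozenSet[str]   # negative effects
    cost: float              # non-negative action cost


def executable(state: State, action: Action) -> bool:
    """An action is executable if the state satisfies its preconditions."""
    return action.pre_pos <= state and not (action.pre_neg & state)


def progress(state: State, action: Action) -> State:
    """Successor state: remove the negative effects, then add the positive ones."""
    if not executable(state, action):
        raise ValueError(f"{action.name} is not executable in this state")
    return (state - action.delete) | action.add


def run_plan(state: State, plan: Sequence[Action]) -> Tuple[State, float]:
    """Execute a plan step by step and return the final state and plan cost."""
    total = 0.0
    for action in plan:
        state = progress(state, action)
        total += action.cost
    return state, total


# Toy example (assumed, not from the paper's domains).
move = Action("move-a-b", frozenset({"at-a"}), frozenset(),
              frozenset({"at-b"}), frozenset({"at-a"}), 2.0)
load = Action("load", frozenset({"at-b"}), frozenset({"loaded"}),
              frozenset({"loaded"}), frozenset(), 1.0)
final_state, plan_cost = run_plan(frozenset({"at-a"}), [move, load])
```

Here the plan cost is simply the sum of the action costs, matching the definition of plan cost given above.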
2.2 Generating Landmarks for Classical Planning
There are several landmark generation algorithms suggested in the literature [10, 17]. The general approach used in generating sound landmarks is to relax the planning problem, generate sound landmarks for the relaxed version, and then use those for the original planning problem. In this paper, we use LAMA’s landmark generation algorithm [17], which uses relaxed planning graphs and domain-transition graphs in tandem to generate landmarks.
2.3 $h_L$: an Admissible Landmark-based Heuristic for Classical Planning
We provide some background on $h_L$, the landmark-based admissible heuristic for classical planning problems proposed by Karpas and Domshlak [11] that we will be using in our heuristic.
Consider a classical planning problem and a landmark graph with landmark set $L$, computed using any of the off-the-shelf landmark generation algorithms mentioned in Section 2.2. Then, we can define $L(s, \pi)$ to be the set of landmarks that need to be achieved from $s$ onwards, assuming we got to $s$ using $\pi$. Note that $L(s, \pi)$ is path-dependent: it can vary for the same state when reached by different paths. It can be computed as follows:
$$L(s, \pi) = \big(L \setminus Accepted(s, \pi)\big) \cup ReqAgain(s, \pi)$$
where $Accepted(s, \pi) \subseteq L$ is the set of landmarks that were true at some point along $\pi$. $ReqAgain(s, \pi) \subseteq Accepted(s, \pi)$ is the set of landmarks that were accepted but are required again; an accepted landmark is required again if (1) it does not hold true in $s$, and (2) it is greedy-necessarily ordered before another landmark in $L$ that is not accepted.
Karpas and Domshlak show that it is possible to partition the costs of the domain's actions (denoted $A$) over the landmarks in $L(s, \pi)$ to derive an admissible cost estimate for the state $s$ as follows: let $cost(\phi)$ be the cost assigned to the landmark $\phi$, and $cost(a, \phi)$ be the portion of $a$’s cost assigned to $\phi$. Furthermore, let us suppose these costs satisfy the following set of inequalities:
$$\forall a \in A: \sum_{\phi \in L(a \mid s, \pi)} cost(a, \phi) \le cost(a), \qquad \forall \phi \in L(s, \pi): cost(\phi) \le \min_{a \in ach(\phi \mid s, \pi)} cost(a, \phi) \tag{1}$$
where $ach(\phi \mid s, \pi) \subseteq A$ is the set of possible achievers of $\phi$ along any suffix of $\pi$, and $L(a \mid s, \pi) = \{\phi \in L(s, \pi) \mid a \in ach(\phi \mid s, \pi)\}$.
Informally, these inequalities encode a scheme that partitions the cost of each action across all the landmarks it could possibly achieve, and assigns to each landmark a cost no greater than the minimum cost assigned to it by any of its achievers. Given this, they prove the following useful theorem:
Theorem 1.
Given a set of action-to-landmark and landmark-to-action costs satisfying Eqn. 1, $h_L(s, \pi) = \sum_{\phi \in L(s, \pi)} cost(\phi)$ is an admissible estimate of the optimal plan cost from $s$.
Note that the choice of exactly how to do the cost-partitioning is left open. One of the schemes Karpas and Domshlak propose is an optimal cost-partitioning scheme that uses an LP solver to solve the constraints in Eqn. 1 with the objective function $\max \sum_{\phi \in L(s, \pi)} cost(\phi)$. This has the useful property that given two sets of landmarks $L$ and $L'$, if $L \subseteq L'$, then the estimate computed from $L$ is no greater than the estimate computed from $L'$. In other words, the more landmarks you provide to $h_L$, the more informed the heuristic estimate.
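As a rough illustration of cost partitioning, the sketch below implements a uniform scheme, in which each action's cost is split equally among the landmarks it may achieve and each landmark receives the minimum share offered by its achievers. The function name, the dictionary-based inputs, and the treatment of a landmark with no achievers as a dead end are assumptions for this example, not details taken from the paper or from Karpas and Domshlak's implementation.

```python
from typing import Dict, Set


def uniform_landmark_estimate(achievers: Dict[str, Set[str]],
                              action_cost: Dict[str, float]) -> float:
    """Sum of landmark costs under a uniform cost partitioning.

    achievers maps each remaining landmark to the actions that may achieve it;
    action_cost maps each action to its non-negative cost.
    """
    # Count how many remaining landmarks each action may achieve.
    shares: Dict[str, int] = {}
    for actions in achievers.values():
        for action in actions:
            shares[action] = shares.get(action, 0) + 1

    estimate = 0.0
    for landmark, actions in achievers.items():
        if not actions:
            # Assumption: a required landmark nobody can achieve is a dead end.
            return float("inf")
        # Each landmark gets no more than the smallest share any achiever offers,
        # so no action's cost is counted more than once in total.
        estimate += min(action_cost[a] / shares[a] for a in actions)
    return estimate


# Toy example: a1 may achieve both landmarks, a2 only landmark q.
example_estimate = uniform_landmark_estimate(
    {"p": {"a1"}, "q": {"a1", "a2"}},
    {"a1": 4.0, "a2": 1.0},
)
```

In the toy example, `a1` splits its cost of 4 into two shares of 2, so landmark `p` costs 2 and landmark `q` costs min(2, 1) = 1, for an estimate of 3. Any plan achieving `p` must use `a1` at cost 4, so the estimate does not overestimate.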
2.4 Goal Networks and HGN Methods
We extend the definitions of HGN planning [19] to work with partially ordered sets of goals, which we call goal networks.
A goal network is a way to represent the objective of satisfying a partially ordered multiset of goals. Formally, it is a pair $gn = (T, \prec)$ such that:

$T$ is a finite nonempty set of nodes;

each node $t \in T$ contains a goal $g_t$ that is a DNF (disjunctive normal form) formula over ground literals;

$\prec$ is a partial order over $T$.
We will provide examples of both generic and concrete goal networks. Figure 1 shows three generic goal networks. Each subfigure is itself a goal network denoted . Directed arcs indicate a subgoal pair (e.g., from ) such that the first goal must be satisfied before the second goal. Consider the network where is a subgoal of , then . Network shows a partial ordering, where . Similarly, and this implies both must occur before . Consider a network that is composed of and . Then ). Note that is a partially ordered forest of goal networks.
Figure 2 shows a concrete goal network for an automated manufacturing domain. joined denotes the goal of assembling the parts and together, while at represents the goal of getting to location . In this goal network, joined and joined are unordered with respect to one another. Furthermore, joined has three subgoals that need to be achieved before achieving it, i.e the goals of getting the parts , and the to the assembly table. These subgoals are also unordered with respect to one another, indicating that the goals can be accomplished in any order.
HGN Methods
An HGN method is a 4tuple where the head and preconditions are similar to those of a planning operator. is a conjunct of literals representing the goal decomposes. is the goal network that decomposes into. By convention, has a last node containing the goal to ensure that accomplishes its own goal.
Figure 3 describes the goal network that the deliverobj method, a method responsible for solving problems related to delivering parts and tools to their destinations, decomposes a goal into. This method is relevant to at goals (since that’s the last node), and its preconditions are .
Whether a node has predecessors impacts the kinds of operations we allow. We refer to any node in a goal network having no predecessors as an unconstrained node of , otherwise the node is constrained. The constrained nodes of Figure 1 include and the remaining are unconstrained. The unconstrained nodes in Figure 2 include all the at nodes as well as the joined node.
We define the following operations over any goal network :

Goal Release: Let be an unconstrained node. Then the removal of from , denoted by , results in the goal network where and is the restriction of to .

Method Application: Let be an unconstrained node. Also, let be a method applied to with . Finally, recall that always contains a ’last’ node that contains ; let be this node. Then the application of to via , denoted by , results in the goal network where and . Informally, this operation adds the elements of to , preserving the order specified by and setting as a predecessor of .
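The two operations above can be sketched with a small Python data structure. This is an illustrative sketch only: the class name `GoalNetwork`, the use of string node identifiers, goals stored as opaque strings, and the toy assembly goals are all assumptions rather than the paper's representation.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class GoalNetwork:
    """Hypothetical goal network: node id -> goal, plus (before, after) orderings."""
    goals: Dict[str, str]
    order: Set[Tuple[str, str]] = field(default_factory=set)

    def unconstrained(self) -> Set[str]:
        """Nodes with no predecessors."""
        constrained = {after for _, after in self.order}
        return {node for node in self.goals if node not in constrained}

    def release(self, node: str) -> "GoalNetwork":
        """Goal release: drop an unconstrained node and restrict the ordering."""
        if node not in self.unconstrained():
            raise ValueError(f"{node} is not an unconstrained node")
        goals = {n: g for n, g in self.goals.items() if n != node}
        order = {(a, b) for a, b in self.order if node not in (a, b)}
        return GoalNetwork(goals, order)

    def apply_method(self, node: str, sub: "GoalNetwork", last: str) -> "GoalNetwork":
        """Method application: add the method's subnetwork and order its
        last node before the decomposed node."""
        if node not in self.unconstrained():
            raise ValueError(f"{node} is not an unconstrained node")
        if set(sub.goals) & set(self.goals):
            raise ValueError("subnetwork node ids must be fresh")
        order = self.order | sub.order | {(last, node)}
        return GoalNetwork({**self.goals, **sub.goals}, order)


# Toy example loosely modeled on the assembly goals discussed above.
top = GoalNetwork({"t1": "joined(a,b)"})
sub = GoalNetwork(
    {"n1": "at(a,table)", "n2": "at(b,table)", "n3": "joined(a,b)"},
    {("n1", "n3"), ("n2", "n3")},
)
decomposed = top.apply_method("t1", sub, last="n3")
after_release = decomposed.release("n1")
```

Because only nodes without predecessors are ever removed, restricting the ordering to the remaining nodes never loses a constraint between two surviving nodes, even when the ordering is stored without its transitive closure.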
2.5 HGN Domains, Problems and Solutions
An HGN domain is a pair $D = (D', M)$ where $D'$ is a classical planning domain and $M$ is a set of HGN methods.
An HGN planning problem is a triple $P = (D, s_0, gn_0)$, where $D$ is an HGN domain, $s_0$ is the initial state, and $gn_0$ is the initial goal network.
Definition 2 (Solutions to HGN Planning Problems).
The set of solutions for is defined as follows:
 Base Case.

If is empty, the empty plan is a solution for .
In the following cases, let be an unconstrained node.
 Unconstrained Goal Satisfaction.

If , then any solution for is also a solution for .
 Action Application.

If action is applicable in and is relevant to , and is a solution for , then is a solution for .
 Method Decomposition.

If is a method applicable in and relevant to , then any solution to is also a solution to .
Note that HGN planning allows an action to be applied only if it is relevant to an unconstrained node in ; this prevents unrestricted chaining of applicable actions as done in classical planning and allows for tighter control of solutions as in HTN planning.
Let us denote by $S(P)$ the set of solutions to an HGN planning problem $P$ as allowed by Definition 2. Then we can define what it means for a solution to be hierarchically optimal with respect to $P$ as follows:
Definition 3 (Hierarchically Optimal Solutions).
A solution $\pi^{*}$ is hierarchically optimal with respect to $P$ if $\pi^{*} \in \arg\min_{\pi \in S(P)} cost(\pi)$.
3 The HOpGDP Algorithm
Algorithm 1 describes HOpGDP. It takes as input an HGN domain $D$, the initial state $s_0$ and the initial goal network $gn_0$. It does an A* search using the admissible HGN heuristic $h_{HL}$ (described in Section 4) to compute a hierarchically optimal solution to the problem; it either returns a plan if it finds one, or reports that the problem is unsolvable.
Initialization. It starts off by initializing the open list (Line 2), which is a priority queue that sorts the HGN search nodes yet to be expanded by their $f$-value, where $f(s, gn, \pi) = cost(\pi) + h_{HL}(s, gn, \pi)$. The open list initially contains the initial search node $(s_0, gn_0, \langle\rangle)$. It also initializes the set of all nodes seen during the search process (Line 3). This data structure keeps track of the best known path for each $(s, gn)$ pair, and is thus helpful to detect when we find a cheaper path to a previously seen HGN search node.
Search. HOpGDP now proceeds to do an A* search in the space of HGN search nodes starting from the initial node. While the open list is not empty, it does the following (Lines 4–14): it removes the HGN search node $(s, gn, \pi)$ with the best $f$-value from the open list (Line 5) and first checks if $gn$ is empty (Line 6). If this is true, this means that all the goals in the initial goal network have been solved, and $\pi$ is the optimal solution to the HGN planning problem.
If $gn$ is not empty, then the algorithm proceeds by using the successor-generation subroutine to compute the node’s successor nodes (Line 7). For each successor node $(s', gn', \pi')$, it does the following: it checks whether another path $\pi''$ to $(s', gn')$ is already recorded in the seen set (Line 9). If this is the case and $\pi''$ is costlier than $\pi'$ (Line 10), it updates the seen set with the new path and reopens the search node (Line 14); if $\pi''$ is cheaper than the new plan $\pi'$, it simply skips this successor (Line 12).
If $(s', gn')$ has not been seen before, it adds the node to the seen set to track the currently best-known plan to $(s', gn')$ (Line 13). It also evaluates the $f$-value of the node (note that this is where $h_{HL}$ is called) and adds it to the open list (Line 14).
If there are no more nodes left in the open list, this implies that it has exhausted the search space without finding a solution, and therefore reports that no solution exists (Line 15).
Computing Successors. The successor-generation procedure computes the successors of a given HGN search node $(s, gn, \pi)$ in accordance with Definition 2. First, we check to see if there are any unconstrained goals in $gn$ that are satisfied in the current state $s$. We then proceed to create new HGN search nodes by removing all such goals from $gn$ (Lines 19–20). Next, we compute all actions applicable in $s$ and relevant to an unconstrained goal in $gn$ (Line 21) and create new search nodes by progressing $s$ using these actions (Lines 22–23). We compute all pairs $(m, t)$ such that $m$ is an HGN method applicable in $s$ and relevant to an unconstrained goal $t$ in $gn$ (Line 24) and create new search nodes by decomposing $t$ in $gn$ using $m$ (Lines 25–26). Finally, we return the set of generated successor nodes (Line 27).
4 : An Admissible Heuristic for HGN Planning
As mentioned in Section 3, HOpGDP uses $h_{HL}$ to compute the $h$-values (and thus, the $f$-values) of search nodes. In this section, we describe how to construct $h_{HL}$ as follows:

We define a relaxation of HGN planning that ignores the provided methods and allows unrestricted action chaining as in classical planning, which expands the set of allowed solutions,

We will extend landmark generation algorithms for classical planning problems to compute sound landmark graphs for the relaxed HGN planning problems, which in turn are sound with respect to the original HGN planning problems as well, and finally

We will use admissible classical planning heuristics like $h_L$ on these landmark graphs to compute admissible cost estimates for HGN planning problems.
4.1 Relaxed HGN Planning
Definition 4 (Relaxed HGN Planning).
A relaxed HGN planning problem is a triple where is a classical planning domain, is the initial state, and is the initial goal network. Any sequence of actions that is executable in state and achieves the goals in in an order consistent with the constraints in is a valid solution to .
Relaxed HGN planning can thus be viewed as an extension of classical planning to solve for goal networks, where there are no HGN methods and the objective is to generate sequences of actions that satisfy the goals in in an order consistent with . In fact, it is easy to show that relaxed HGN planning, in contrast to HGN planning, is no more expressive than classical planning, and relaxed HGN planning problems can be compiled into classical planning problems quite easily.
Next, we will show how to leverage landmark generation algorithms for classical planning to generate landmark graphs for relaxed HGN planning.
4.2 Generating Landmarks for Relaxed HGN Planning
This section describes a landmark discovery technique that can use any landmark discovery technique for classical planning, such as LAMA’s [17], to compute landmarks for relaxed HGN planning problems. The main difference here is that while classical planning problems are (state, goal) pairs, relaxed HGN planning problems are (state, goal network) pairs; every goal in the goal network can be thought of as a landmark. Therefore, there is now a partially ordered set of goals to compute landmarks from, as opposed to a single goal in classical planning.
We therefore need to generalize classical planning landmark generation techniques to work for relaxed HGN planning problems. The algorithm (Algorithm 2) describes one such generalization. At a high level, proceeds by computing landmark graphs for each goal in (which in fact is a classical planning problem) and merging them all together to create the final landmark graph .
takes as input a relaxed HGN planning problem and generates , a graph of landmarks. First, queueSeeds is initialized with a copy of (Line 2). This is because unlike in classical planning where we generate landmarks for a single goal, in HGN planning we have a partially ordered set of goals to seed landmark generation; queueSeeds stores these seeds. We also initialize queue, the openlist of landmarks, to .
While there is a goal from that we have not yet computed landmarks for (Line 4), we do the following: we remove it from queueSeeds along with all induced orderings and add it to queue (Lines 5–6). We also add to using ; we also add any ordering constraints it might have with other elements of that have already been added to . This queue is then used as a starting point by to begin landmark generation. We iteratively use to pop landmarks off the queue and generate new landmarks by backchaining until we can no longer generate any more landmarks (Lines 8–10). Each new landmark is added to by the procedure. Once all goals in have been handled, the landmark generation process is completed and the algorithm returns .
The procedure takes as input a computed landmark , adds it to and returns a landmark . There are three cases to consider:
The procedure takes as input a landmark and an ordering constraint and adds them to . More precisely, it adds to using , which returns the added landmark . It then adds the ordering constraint between and in .
LM graph computation example. Figure 4 illustrates the working of . Let us assume the goal network contains only one goal . Figure 4(a) illustrates the output of on . This is identical to what would generate, since contains only one goal, making the relaxed HGN problem equivalent to a classical planning problem.
Now, let us assume that we decompose using the , and get the new goal network , which essentially looks like an instantiated version of the network in Figure 3. Now if we run on , we end up generating the landmark graph in Figure 4(b), which is a more focused version of the first landmark graph. This is because the goals in are landmarks that must be accomplished, which constrains the set of valid solutions that can be generated. For instance, since we’ve committed to agent , every solution we can generate from will involve the use of . We can, as a result, generate more focused landmarks than we otherwise could have from just the top-level goal . This includes fact landmarks that replace disjunctive landmarks (the ones in gray in Fig. 4(b)) as well as completely new landmarks that arise as a result of the method; e.g. reserved is not a valid landmark for , but is one for .
An important point to note at this point is that the subgoals in are not true landmarks for ; they are landmarks once we commit to applying method . However, this actually ends up being useful to us, since it allows us to generate different landmark graphs for different methods; for instance, if we had committed to , we would have obtained a different set of landmarks specific to . Now, landmarkbased heuristics when applied to these two graphs would get us different heuristic estimates, thus allowing to differentiate between these two methods by using the specific subgoals each method introduces.
It is easy to show that generates sound landmark graphs for relaxed HGN planning problems:
Claim 5.
Given a relaxed HGN planning problem , is a sound landmark graph for .
Let be an HGN planning problem, and let be the corresponding relaxed version. Then by definition, any solution to is a solution to . Therefore, it is easy to see that a landmark of is also a sound landmark of . More generally, a landmark graph generated for is going to be sound with respect to as well:
Claim 6.
Given an HGN planning problem , then is a sound landmark graph for .
4.3 Computing
The main insight behind is the following: since the algorithm generates sound landmarks and orderings for relaxed (and therefore regular) HGN planning problems, we can use any admissible landmarkbased heuristic from classical planning to derive an admissible cost estimate for HGN planning problems.
In particular, uses as follows: given an HGN search node , the landmark graph is given by . Then
(2) 
where is the plan generated to get to .
A couple of important implementation details: when using $h_L$ to guide classical planners, it is sufficient to compute the landmark graph just once upfront, since the goal stays the same and the graph can be reused in every state along the plan. This isn’t the case in HGN planning; method decomposition can change the goal network. So, $h_{HL}$ requires recomputing the landmark graph at each node. In our implementation, we try to optimize this process by computing landmark graphs for each goal network we encounter from the initial state and caching them for use in future nodes containing the same goal network. Section 5.2 discusses the impact of this overhead in the experiments. Secondly, while the optimal cost partitioning scheme in $h_L$ provides more informed heuristic estimates, we chose to use the uniform cost partitioning scheme in our implementation since the former requires solving an LP at each search node, which is costly.
4.4 Admissibility of
Claim 6 shows that given an HGN problem , is a sound landmark graph with respect to . Furthermore, Theorem 1 shows that provides an admissible cost estimate of the optimal plan starting from that achieves all the landmarks in . Since every solution to has to achieve all the landmarks in in a consistent order, provides an admissible estimate of the optimal cost to as well. However, from Eq. 2, . Therefore, we have the following theorem:
Theorem 7 (Admissibility of ).
Given an HGN planning domain , a search node and its costoptimal solution , .
5 Experimental Evaluation
We implemented HOpGDP within the Fast Downward codebase, and extended LAMA’s landmark generation code to develop $h_{HL}$, our HGN planning heuristic.
We tested two hypotheses in our study:

H1: HOpGDP’s ability to exploit hierarchical planning knowledge enables it to outperform state-of-the-art optimal classical planners. To test this, we compared the performance of HOpGDP with that of the optimal classical planner of Karpas and Domshlak [11], whose heuristic we extended to develop $h_{HL}$.
It might seem that H1 is obviously true due to the dominance of hierarchical planners (e.g., SHOP2 and GDP) over classical planners, but these are merely satisficing planners. It is not clear whether this advantage would carry over to optimal planning because HOpGDP needs to do an optimal search in the possibly larger space of $(s, gn)$ pairs, in contrast to classical planners, which search in the space of states.

H2: The heuristic used by HOpGDP, $h_{HL}$, provides useful search guidance. To test this, we compared the performance of HOpGDP with that of a blind-search variant, which is identical to HOpGDP except that it uses the trivial heuristic estimate $h = 0$.
5.1 Experimental Results
We evaluated HOpGDP, its blind-search variant, and the classical planner on three well-known planning benchmarks: Logistics, Blocks World and Depots. We chose these three domains because, from a control-knowledge standpoint, they capture a wide spectrum: Logistics contains only enough control knowledge to define allowed solutions, Blocks World is at the other extreme, defining sophisticated knowledge that significantly prunes the search space, and Depots incorporates elements of both.
For each domain, we randomly generated 25 problem instances per problem size. We ran all problems on a Xeon E5-2639 with a per-problem limit of 4 GB of RAM and 25 minutes of planning time. Data points were discarded if the planner did not solve all of the corresponding problem instances within the time limit.
Logistics. We modified the standard PDDL Logistics model to limit the capacity of all vehicles to one to ensure the HGN and non-HGN planners compute the same solutions. We generated 25 random Logistics problems for each problem size ranging from 4,6,…,14 packages. For both HOpGDP variants, we provided the HGN methods from the experimental evaluation in [18]. There are three methods in this knowledge base that together capture all the possible (minimal) solutions to a Logistics problem; these are (1) a method to move packages within the same city using trucks, (2) a method to move packages between airports using planes, and (3) a method that combines the previous two to move packages across different cities.
Figures (a) and (d) show the performance of the three planners in terms of the number of nodes expanded and overall planning time. Both HOpGDP variants could solve problems up to size 10 (i.e., within the time limit), while the classical planner could solve problems only up to size 6.
In terms of nodes expanded, Figure (a)a shows that the heuristic in helped to modestly decrease the number of nodes expanded; on average expanded 22% fewer nodes than . We did not include because it expanded many orders of magnitude more nodes than either variant (e.g., for problems of size 6, on average expanded nodes). With regard to running time, Figure (d)d shows that the modest gain by was outweighed by the computational overhead of running the heuristic (on average about 35% of the total running time). , despite its blind search, was slightly faster than .
Blocks World. We generated 25 random blocksworld problems for problem sizes ranging from 4,6,…,20 blocks. As in our study with Logistics, we use the same HGN methods used in ’s evaluation [18]. In contrast to our Logistics study, the methods encode sophisticated knowledge that allows the planners to prune search paths that don’t lead to good solutions (e.g., it contains a recursively defined axiom that checks if a block is in its final position and only then builds towers on top of it).
Figures (b) and (e) show the performance of the three planners on these Blocks World problems. The classical planner could solve problems up to size 10, the blind-search variant up to size 16, and HOpGDP could solve problems up to size 18.
Figure (b) displays the number of nodes expanded by the three planners. In this domain, the guidance provided by $h_{HL}$ helped substantially; on average HOpGDP expanded 76% fewer nodes than the blind-search variant. These savings far outweighed the heuristic computation overhead (on average about 48% of the total running time), resulting in smaller overall planning times for HOpGDP, as can be seen in Figure (e).
Depots. We generated 25 random Depots problems for problem sizes ranging from 4,5,…,10 crates. Since the Depots domain combines aspects of Logistics (moving cargo around) and Blocks World (stacking them in a particular manner), the HGN methods for Depots are a combination of the HGN methods used in Logistics and Blocks World.
Figures (c)c and (f)f show the performance of the three planners on the generated problems. could solve problems up to only size 6, while both and could solve problems up to size 9. Figure (c)c shows the average number of nodes expanded by the three planners. The heuristic in provides good search guidance, reducing the number of nodes expanded by about 46% when compared to . As in Logistics, we didn’t show the nodes expanded by since it was many orders of magnitude more than either variant; for size 6 problems, on average, it expanded nodes.
In terms of planning time (Figure (f)), the provided domain knowledge clearly helps both HOpGDP variants scale much better than the classical planner. Furthermore, the additional search guidance provided by $h_{HL}$ results in overall lower runtimes for HOpGDP in comparison to the blind-search variant, even with the computation overhead of the heuristic (which is about 56% of the total time).
5.2 Interpretation of Results
There are two main takeaways from this empirical study:
 Support for H1.

Hierarchical planning knowledge helps in scaling up the solving of optimal planning problems. In all three benchmark domains, both of the HOpGDP variants solved more problems while requiring less time and expanding fewer nodes than the classical planner, showing that the additional overhead of searching through the space of $(s, gn)$ pairs was outweighed by the benefit that hierarchical planning knowledge can provide in terms of more focused search.
 Support for H2.

The HGN heuristic $h_{HL}$ provides useful guidance when searching for hierarchically optimal plans. We can conclude this from the decrease in the number of nodes expanded by HOpGDP as compared to its blind-search variant in all three benchmark domains. The Logistics results only weakly support this due to only a modest decrease in the number of nodes expanded (22%), while the results from Blocks World and Depots are more conclusive, registering large savings in number of nodes expanded (76% and 46% respectively).
We posit that the reduction in number of nodes expanded by is a function of the input HGN knowledge. For instance, the Logistics methods do not encode any expert knowledge and instead only model the minimum knowledge required to capture the three ways to move a package: by truck, by plane, and by a combination of the two. Therefore, the goal networks always contain landmarks or more focused versions of landmarks (e.g. instead of ) that can be detected by landmark generation algorithms. This means that the landmarks generated do not change much after a method application, implying that the heuristic estimates are unlikely to change much either. In contrast, methods in both BlocksWorld and Depots contain specialized knowledge that, when applied, yield goal networks containing subgoals and orderings that cannot be detected by landmark generation algorithms. That is, when landmark generation is run on these goal networks, because the subgoals in the goal network serve as seeds for landmark generation, a richer set of landmarks will be generated, resulting in more informed heuristic estimates.
Another important takeaway from the experiments is the following: the current implementation of $h_{HL}$ imposes a substantial overhead on HOpGDP. On average, it uses 35%, 48% and 56% of the total planning time in Logistics, Blocks World, and Depots respectively. This is partly due to the current implementation not being optimized. For instance, unlike landmark-based classical planners, where the landmark graph needs to be computed only once for the final goal, HOpGDP needs to compute landmark graphs for every goal network it generates during search. Reusing the computed landmark graphs more effectively can potentially help in substantially reducing planning times.
6 Related Work
HTN planners solve planning problems by (1) forward state-space search, such as in the SHOP [16] and SHOP2 [15] HTN planners, or (2) partial-order causal-link (POCL) planning techniques, such as in UMCP [8] and in the hybrid planning literature [7, 6].
HGN planning can be translated to HTN planning in a plan-preserving manner [3], meaning we can, in theory, use any optimal HTN planner for optimal HGN planning. However, there is little research on search heuristics for forward-search HTN planning [2, 1]. Therefore, planners often provide other domain-specific mechanisms for users to encode search strategies. For example, SHOP2 allows domain-specific knowledge, known as HTN methods, to be specified in a ‘good’ order according to the user, and attempts to apply them in the same order. SHOP2 also provides support for external function calls [15] that can call arbitrary code to perform intensive computations, thus minimizing the choices that need to be made during search. For example, in the 2002 Planning Competition for hand-tailored planners, the authors of SHOP2 used a graph-algorithm library that SHOP2 could call externally to generate shortest paths [15].
Waisbrot et al. [23] developed an HTN planner that augments SHOP2 with classical planning heuristics to make local decisions on which method to apply next by estimating how close the method’s goal is to the current state. However, this planner retains the depth-first search structure of SHOP2, making it difficult to generate high-quality plans.
Marthi et al. [13, 14] propose an HTN-like formalism called angelic hierarchical planning that allows users to annotate abstract tasks with additional domain-specific information (i.e., lower and upper bounds on the costs of the possible plans they can be used to generate). They then use this information to compute hierarchically-optimal plans. In contrast, we require the costs of only the primitive actions and use domain-independent search heuristics to compute hierarchically-optimal plans.
There has been recent work on developing search heuristics for POCL HTN planners [7, 6]. However, these heuristics typically estimate how many more plan refinement steps need to be taken from a search node to obtain a solution. This differs from plan quality estimates, which are our focus in this paper.
Hierarchical Goal Network (HGN) Planning combines the hierarchical structure of HTN planning with the goal-based nature of classical planning. It therefore allows for easier infusion of techniques from classical planning into hierarchical planning, such as adapting the FF heuristic for method ordering in the HGN planner of [19], and using landmark-based techniques to plan with partial amounts of domain knowledge in GoDeL [18]. However, both planners use depth-first search and inadmissible heuristics, so they cannot provide any guarantees of plan quality.
Another less-related domain-configurable planning formalism is Planning with Control Rules [4], where domain-specific knowledge is encoded as linear temporal logic (LTL) formulas. TLPlan, one of the earliest planners developed under this formalism, used control rules written in LTL to prune trajectories deemed suboptimal by the user. There have also been attempts to develop heuristic search planners that can plan with a simplified version of LTL that works with finite traces. This finite-trace variant has been used to incorporate search heuristics for solving temporally extended goals [5], to plan with preferences [22], and to express landmark-based heuristics that guide classical planners [21].
7 Conclusion
Despite the popularity of hierarchical planning techniques in theory and practice, little effort has been devoted to developing domain-independent search heuristics that can provide useful search guidance towards high-quality solutions. As a result, end users need to encode domain-specific heuristics into the domain models, which can make the domain-modeling process tedious and error-prone.
To address this issue, we leverage recent work on HGN planning, which allows tighter integration of hierarchical and classical planning, to develop (1) an admissible HGN planning heuristic, and (2) the Hierarchically-Optimal Goal Decomposition Planner, an A* search algorithm guided by this heuristic to compute hierarchically-optimal plans. Our experimental study showed that our planner outperforms optimal heuristic search classical planners (due to its ability to exploit domain-specific planning knowledge) and optimal blind search HGN planners (due to the search guidance provided by the heuristic).
There are several directions for future work, such as:

Extension to Anytime Planning: An obvious and practically useful extension of this work is to make our planner work in an anytime manner (i.e., generate a solution quickly, so that a solution is available at any time during execution, and then iteratively improve the plan’s quality over time) instead of trying to compute the optimal solution upfront. We can of course adapt techniques used in anytime classical planners like LAMA [17], which runs a series of weighted-A* searches. However, we also plan to explore the use of block deordering [20], a technique for continual plan improvement that seems to lend itself well to plans that are hierarchically structured.

Extension to Temporal Planning: We also plan to investigate temporal extensions of HGN planning and to develop search heuristics and hierarchical planners that can leverage procedural knowledge to find high-quality plans and schedules.
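To illustrate the anytime direction listed above, the following Python sketch runs a series of weighted-A* searches with decreasing weights and reports each strictly better plan as it is found. It is a simplified restart scheme on an explicit toy graph, not LAMA's implementation and not an HGN planner. The function names, the weight schedule, and the graph are all assumptions made for illustration. With weight 1 and an admissible heuristic, the last search is ordinary A* and returns an optimal path.

```python
import heapq
from typing import Dict, Iterator, List, Optional, Tuple

Graph = Dict[str, List[Tuple[str, float]]]


def weighted_a_star(graph: Graph, h: Dict[str, float], start: str,
                    goal: str, w: float) -> Optional[Tuple[float, List[str]]]:
    """A* with f = g + w * h; returns (cost, path) or None if no path exists."""
    frontier = [(w * h[start], 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path to node was found later
        for succ, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(succ, float("inf")):
                best_g[succ] = g2
                heapq.heappush(frontier, (g2 + w * h[succ], g2, succ, path + [succ]))
    return None


def anytime_search(graph: Graph, h: Dict[str, float], start: str, goal: str,
                   weights=(5.0, 2.0, 1.0)) -> Iterator[Tuple[float, float, List[str]]]:
    """Yields (weight, cost, path) each time a strictly cheaper plan is found."""
    best = float("inf")
    for w in weights:
        result = weighted_a_star(graph, h, start, goal, w)
        if result is not None and result[0] < best:
            best = result[0]
            yield w, result[0], result[1]


# Toy problem: the heuristic h is admissible but makes the costlier route look attractive.
graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 9)], "B": [("G", 4)]}
h = {"S": 5, "A": 1, "B": 4, "G": 0}
results = list(anytime_search(graph, h, "S", "G"))
for weight, cost, path in results:
    print(weight, cost, path)
```

Restarting from scratch at every weight is the simplest design choice. A practical anytime planner would typically reuse search effort across iterations and prune with the cost of the best plan found so far.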
Acknowledgements.
This work is sponsored in part by OSD ASD (R&E). The information in this paper does not necessarily reflect the position or policy of the sponsors, and no official endorsement should be inferred. Ron Alford performed part of this work under an ASEE postdoctoral fellowship at NRL. We also would like to thank the anonymous reviewers at ECAI 2016 for their insightful comments. We would also like to thank the reviewers at HSDIP 2016 for useful feedback on a preliminary version of this paper.
References
 [1] Ron Alford, Gregor Behnke, Daniel Höller, Susanne Biundo, Pascal Bercher, and David W. Aha, ‘Bound to plan: Exploiting classical heuristics via automatic translations of tail-recursive HTN problems’, in Proc. of the 26th Int. Conf. on Automated Planning and Scheduling (ICAPS). AAAI Press, (2016).
 [2] Ron Alford, Vikas Shivashankar, Ugur Kuter, and Dana S. Nau, ‘On the feasibility of planning graph style heuristics for HTN planning’, in Proc. of the 24th Int. Conf. on Automated Planning and Scheduling (ICAPS), pp. 2–10. AAAI Press, (2014).
 [3] Ron Alford, Vikas Shivashankar, Mark Roberts, Jeremy Frank, and David W. Aha, ‘Hierarchical planning: relating task and goal decomposition with task sharing’, in Proc. of the 25th Int. Joint Conf. on Artificial Intelligence (IJCAI). AAAI Press, (2016).
 [4] Fahiem Bacchus and Froduald Kabanza, ‘Using temporal logics to express search control knowledge for planning’, Artif. Intell., 116, 123–191, (2000).
 [5] Jorge A. Baier and Sheila A. McIlraith, ‘Planning with first-order temporally extended goals using heuristic search’, in AAAI Conference on Artificial Intelligence, (2006).
 [6] Pascal Bercher, Shawn Keen, and Susanne Biundo, ‘Hybrid planning heuristics based on task decomposition graphs’, in Proc. of the Seventh Annual Symposium on Combinatorial Search (SoCS), pp. 35–43. AAAI Press, (2014).
 [7] Mohamed Elkawkagy, Pascal Bercher, Bernd Schattenberg, and Susanne Biundo, ‘Improving hierarchical planning performance by the use of landmarks’, in AAAI Conference on Artificial Intelligence, pp. 1763–1769, (2012).
 [8] Kutluhan Erol, James Hendler, and Dana S. Nau, ‘UMCP: A sound and complete procedure for hierarchical task-network planning’, pp. 249–254, (June 1994). ICAPS 2009 influential paper honorable mention.
 [9] J. Hoffmann and Bernhard Nebel, ‘The FF planning system’, Journal of Artificial Intelligence Research, 14, 253–302, (2001).
 [10] Jörg Hoffmann, Julie Porteous, and Laura Sebastia, ‘Ordered landmarks in planning’, Journal of Artificial Intelligence Research, 22, 215–278, (2004).
 [11] Erez Karpas and Carmel Domshlak, ‘Cost-optimal planning with landmarks’, in IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, July 11–17, 2009, ed., Craig Boutilier, pp. 1728–1733, (2009).
 [12] Derek Long, Henry A. Kautz, Bart Selman, Blai Bonet, Hector Geffner, Jana Koehler, Michael Brenner, Jorg Hoffmann, Frank Rittinger, Corin R. Anderson, Daniel S. Weld, David E. Smith, and Maria Fox, ‘The AIPS-98 planning competition’, AI Magazine, 21, 13–33, (2000).
 [13] B. Marthi, S.J. Russell, and J. Wolfe, ‘Angelic semantics for highlevel actions’, in International Conference on Automated Planning and Scheduling, (2007).
 [14] B. Marthi, S.J. Russell, and J. Wolfe, ‘Angelic hierarchical planning: Optimal and online algorithms’, in International Conference on Automated Planning and Scheduling, pp. 222–231, (2008).
 [15] Dana S. Nau, Tsz-Chiu Au, Okhtay Ilghami, Ugur Kuter, J William Murdock, Dan Wu, and Fusun Yaman, ‘SHOP2: An HTN planning system’, Journal of Artificial Intelligence Research, 20, 379–404, (2003).
 [16] Dana S. Nau, Yue Cao, Amnon Lotem, and Héctor Muñoz-Avila, ‘SHOP: Simple hierarchical ordered planner’, in International Joint Conference on Artificial Intelligence, ed., Thomas Dean, pp. 968–973, (August 1999).
 [17] Silvia Richter and Matthias Westphal, ‘The LAMA planner: Guiding cost-based anytime planning with landmarks’, J. Artif. Intell. Res. (JAIR), 39, 127–177, (2010).
 [18] Vikas Shivashankar, Ron Alford, Ugur Kuter, and Dana Nau, ‘The GoDeL planning system: a more perfect union of domain-independent and hierarchical planning’, in Proc. of the 23rd Int. Joint Conf. on Artificial Intelligence (IJCAI), pp. 2380–2386. AAAI Press, (2013).
 [19] Vikas Shivashankar, Ugur Kuter, Dana Nau, and Ron Alford, ‘A hierarchical goal-based formalism and algorithm for single-agent planning’, in Proc. of the 11th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS), volume 2, pp. 981–988. Int. Foundation for Autonomous Agents and Multiagent Systems, (June 2012).
 [20] Fazlul Hasan Siddiqui and Patrik Haslum, ‘Continuing plan quality optimisation’, J. Artif. Intell. Res. (JAIR), 54, 369–435, (2015).
 [21] Salome Simon and Gabriele Roger, ‘Finding and exploiting LTL trajectory constraints in heuristic search’, in Symposium on Combinatorial Search, (2015).
 [22] Shirin Sohrabi, Jorge Baier, and Sheila McIlraith, ‘HTN planning with preferences’, in IJCAI, (2009).
 [23] Nathaniel Waisbrot, Ugur Kuter, and Tolga Konik, ‘Combining heuristic search with hierarchical tasknetwork planning: A preliminary report’, in International Conference of the Florida Artificial Intelligence Research Society, (2008).