Planning and Synthesis Under Assumptions

Planning and Synthesis Under Assumptions

Benjamin Aminof
JKU Linz and TU Wien
Austria
aminof@forsyte.at &Giuseppe De Giacomo
Sapienza Univ. Roma
Rome, Italy
degiacomo@dis.uniroma1.it &Aniello Murano
Univ. Federico II
Naples, Italy
murano@unina.it &Sasha Rubin
Univ. Federico II
Naples, Italy
rubin@unina.it
Abstract

In Reasoning about Action and Planning, one synthesizes the agent plan by taking advantage of the assumption on how the environment works (that is, one exploits the environment’s effects, its fairness, its trajectory constraints). In this paper we study this form of synthesis in detail. We consider assumptions as constraints on the possible strategies that the environment can have in order to respond to the agent’s actions. Such constraints may be given in the form of a planning domain (or action theory), as linear-time formulas over infinite or finite runs, or as a combination of the two. We argue though that not all assumption specifications are meaningful: they need to be consistent, which means that there must exist an environment strategy fulfilling the assumption in spite of the agent actions. For such assumptions, we study how to do synthesis/planning for agent goals, ranging from a classical reachability to goal on traces specified in LTL and LTLf/LDLf, characterizing the problem both mathematically and algorithmically.

1 Introduction

Reasoning about actions and planning concern the representation of a dynamic system. This representation consists of a description of the interaction between an agent and its environment and aims at enabling reasoning and deliberation on the possible course of action for the agent [\citeauthoryearReiter2001]. Planning in fully observable nondeterministic domains (FOND), say in Planning Domain Definition Language (PDDL), [\citeauthoryearGhallab, Nau, and Traverso2004, \citeauthoryearGeffner and Bonet2013] exemplifies the standard methodology for expressing dynamic systems: it represents the world using finitely many fluents under the control of the environment and a finitely many actions under the control of the agent. Using these two elements a model of the dynamics of world is given. Agent goals, e.g., reachability objectives, or, say, temporally extended objectives written in LTL [\citeauthoryearBacchus and Kabanza2000, \citeauthoryearCamacho et al.2017, \citeauthoryearDe Giacomo and Rubin2018], are expressed over such models in terms of such fluents and actions.

An important observation is that, in devising plans, the agent takes advantage of such a representation of the world. Such a representation corresponds to knowledge that the agent has of the world. In other words, the agent assumes that the world works in a certain way, and exploits such an assumption in devising its plans. A question immediately comes to mind:

Which kinds of environment assumptions can the agent make?

Obviously the planning domain itself (including the initial state) with its preconditions and effects is such an assumption. That is, as long as the agent sticks to its preconditions, the environment acts as described by the domain. So, the agent can exploit the effect of its actions in order to reach a certain goal (state of affairs). Another common assumption is to assume the domain is fair, i.e., so-called fair FOND [\citeauthoryearDaniele, Traverso, and Vardi1999, \citeauthoryearPistore and Traverso2001, \citeauthoryearCimatti et al.2003, \citeauthoryearCamacho et al.2017, \citeauthoryearD’Ippolito, Rodríguez, and Sardiña2018]. In this case the agent can exploit not only the effects, but also the guarantee that by continuing to execute an action from a given state the environment will eventually respond with all its possible nondeterministic effects.111There are two notions of fairness in planning. One stems from the fact that nondeterminism is resolved stochastically. The other is a logical notion analogous to that used in the formal-methods literature. These two notions coincide in the context reachability goals [\citeauthoryearD’Ippolito, Rodríguez, and Sardiña2018], but diverge with more general LTL goals [\citeauthoryearPnueli1983, \citeauthoryearPnueli and Zuck1993]. In this paper, we focus on the logical notion. More recently [\citeauthoryearBonet and Geffner2015, \citeauthoryearBonet et al.2017] trajectory constraints over the domain, expressed in LTL, have been proposed to model general restrictions on the possible environment behavior. But is any kind of LTL formula on the fluents and actions of the domain a possible trajectory constraint for the environment? The answer is obviously not! To see this, consider a formula expressing that eventually a certain possible action must actually be performed (the agent may decide not to do it). But then

Which trajectory constraints are suitable as assumptions in a given domain?

Focusing on LTL, the question can be rephrased as:

Can any linear-time specification be used as an assumption for the environment?

We can summarize these questions, ultimately, by asking:

What is an environment assumption?

This is what we investigate in this paper. We take the view that environment assumptions are ways to talk about the set of strategies the environment can enact. Moreover, the plan for the goal, i.e., the agent strategy for fulfilling the goal, need only fulfill the goal against the strategies of the environment from the given set of environment strategies. We formalize this insight and define synthesis/planning under assumptions and the relationship between the two in a general linear-time setting. In particular, our definitions only allow linear-time properties to be assumptions if the environment can enforce them. In doing this we answer the above questions.

We also concretize the study and express goals and assumptions in LTL, automata over infinite words (deterministic parity word automata) [\citeauthoryearGrädel, Thomas, and Wilke2002], as well as formalisms over finite traces, i.e., LTLf/LDLf [\citeauthoryearDe Giacomo and Vardi2013, \citeauthoryearDe Giacomo and Rubin2018] and finite word automata. This allows us to study how to solve synthesis/planning under assumptions problems. One may think that the natural way to solve such synthesis problems is to have the agent synthesize a strategy for the implication

 Assumption⊃Goal

where both and are expressed, say, in LTL. A first problem with such an implication is that the agent should not devise strategies that make false, because in this case the agent would lose its model of the world without necessarily fulfilling its . This undesirable situation is avoided by our very notion of environment assumption. A second issue is this:

Does synthesis/planning under assumptions amount to synthesizing for the above implication?

We show that this is not the case. Note that an agent that synthesizes for the implication is too pessimistic: the agent, having chosen a candidate agent strategy, considers as possible all environment strategies that satisfy against the specific candidate strategy it is analyzing. But, in this way the agent gives too much power to the environment, since, in fact, the environment does not know the agent’s chosen strategy. On the other hand, surprisingly, we show that if there is an agent strategy fulfilling under , then also there exists one that indeed enforces the implication. Thus, even if the implication cannot be used for characterizing the problem of synthesis/planning under assumptions, it can be used to solve it. Exploiting this result, we give techniques to solve synthesis/planning under assumptions, and study the worst case complexity of the problems when goals and assumptions are expressed in the logics and automata mentioned above.

2 Synthesis and Linear-time specifications

Synthesis is the problem of producing a module that satisfies a given property no matter how the environment behaves  [\citeauthoryearPnueli and Rosner1989]. Synthesis can be thought of in the terminology of games. Let be a finite set of Boolean variables (also called atoms), and assume it is partitioned into two sets: , those controllable by the agent, and , those controllable by the environment. Let be the set of actions and the set of environment states (note the symmetry: we think of as a set of actions that are compactly represented as assignments of the variables in ). The game consists of infinitely many phases. In each phase of the game, both players assign values to their variables, with the environment going first. These assignments are given by strategies: an agent strategy and an environment strategy . The resulting infinite sequence of assignments is denoted .222Formally, we say that complies with if for all ; and we say that complies with if for all . Then is defined to be the unique infinite trace that complies with both strategies.

In classic synthesis the agent is trying to ensure that the produced sequence satisfies a given linear-time property. In what follows we write to denote a generic formalism for defining linear-time properties. Thus, the reader may substitute their favorite formalism for , e.g., one can take to be linear temporal logic, or deterministic parity automata. We use logical notation throughout. For instance, when refers to a logical formula, then refers to conjunction of formulas, but when refers to an automaton then refers to intersection of automata. If write for the set it defines. For instance, if then is the set of infinite sequences that satisfy , but when is an automaton operating on infinite sequences, then is the set of infinite sequences accepted by the automaton. Moreover, in both cases we say that the sequence satisfies .

We say that realizes (written ) if , i.e., if no matter which strategy the environment uses, the resulting sequence satisfies . Similarly, we say that realizes (written ) if . We write (resp. ) for the set of environment (resp. agent) strategies that realize , and in case this set is non-empty we say that is environment (resp. agent) realizable. We write (resp. ) for the set of all environment (resp. agent) strategies.

Solving environment- (resp. agent-) synthesis asks, given to decide if is environment- (resp. agent-) realizable, and to return such a finite-state strategy (if one exists). In other words, realizability is the recognition problem associated to synthesis. We now recall two concrete specification formalisms , namely LTL (linear temporal logic) and DPW (deterministic parity word automata), and then state that results about solving LTL/DPW synthesis.

Linear temporal Logic (LTL)

Formulas of , or simply LTL, are generated by the following grammar:

 φ::=p∣φ∨φ∣¬φ∣Xφ∣φUφ

where . The size of a formula is the number of symbols in it. LTL formulas are interpreted over infinite sequences . Define the satisfaction relation as follows: {enumerate*}

iff ;

iff for some ;

iff it is not the case that ;

iff ;

iff there exists such that and for all , . Write if and say that satisfies and is a model of . An LTL formula defines the set . We use the usual abbreviations, , , , , . Write for the set of Boolean formulas over . We remark that every result in this paper that mentions LTL also holds for LDL (linear dynamic logic) [\citeauthoryearVardi2011, \citeauthoryearEisner and Fisman2006].

Deterministic Parity Word Automata (DPW)

A DPW over is a tuple where is a finite set of states, is an initial state, is the transition function, and is the coloring. The run of on the input word is the infinite sequence of transitions such that . A run is successful if the largest color occurring infinitely often is even. In this case, we say that the input word is accepted. The DPW defines the set consisting of all input words it accepts. The size of , written , is the cardinality of . The number of colors of is the cardinality of .

DPWs are effectively closed under Boolean operations, see e.g., [\citeauthoryearGrädel, Thomas, and Wilke2002]:

Lemma 1.

Let be DPW with states and colors, respectively.

1. One can effectively form a DPW with states and colors for the complement of .

2. One can effectively form a DPW with with many states and many colors, where , for the disjunction .

Thus, e.g., from DPW one can build a DPW for whose number of states is and whose number of colors is .

Every LTL formula can be translated into an equivalent DPW , i.e., , see e.g. [\citeauthoryearVardi1995, \citeauthoryearPiterman2007]. Moreover, the cost of this translation and the size of are at most doubly exponential in the size of , and the number of colors of is at most singly exponential in the size of .

Here is a summary of the complexity of solving synthesis:

Theorem 2 (Solving Synthesis).
1. Solving LTL environment (resp. agent) synthesis is exptime-complete [\citeauthoryearPnueli and Rosner1989].

2. Solving DPW environment (resp. agent) synthesis is ptime in the size of the automaton and exptime in the number of its colors [\citeauthoryearPnueli and Rosner1989, \citeauthoryearFinkbeiner2016].

3 Synthesis under Assumptions

In this section we give core definitions of environment assumptions and synthesis under such assumptions. Intuitively, the assumptions are used to select the environment strategies that the agent considers possible, i.e., although the agent does not know the particular environment strategy it will encounter, it knows that it comes from such a set. We begin in the abstract, and then move to declarative specifications. Unless explicitly specified, we assume fixed sets and of environment and agent atoms.

Here are the main definitions of this paper:

Definition 1 (Environment Assumptions – abstract).

We call any non-empty set of environment strategies an environment assumption.

Informally, the set represents the set of environment strategies that the agent considers possible.

Definition 2 (Agent Goals – abstract).

We call any set of traces an agent goal.

Definition 3 (Synthesis under assumptions – abstract).

Let be an environment assumption and an agent goal. We say that an agent strategy realizes assuming if

 ∀σenv∈Ω.πσag,σenv∈Γ
Remark 1 (On the non-emptiness of Ω).

Note that the requirement that be non-empty is a consistency requirement; if it were empty then there would be no to test for membership in and so synthesis under assumptions would trivialize and all agent strategies would realize all goals.

For the rest of this paper we will specify agent goals and environment assumptions as linear-time properties. In particular, we assume that is a formalism for specifying linear-time properties over , e.g., or .

How should determine an assumption ? In general, talks about the interaction between the agent and the environment. However, we want that the agent can be guaranteed that whatever it does the resulting play satisfies . Thus, a given induces the set consisting of all environment strategies such that for all agent strategies the resulting trace satisfies . In particular, for to be non-empty (as required for it to be an environment assumption) we must have that is environment realizable. This justifies the following definitions.

Definition 4 (Synthesis under Assumptions – linear-time).
1. We call an environment assumption if it is environment realizable.

2. We call any an agent goal.

3. An synthesis under assumptions problem is a tuple where is an environment assumption and is an agent goal.

4. We say that an agent strategy realizes assuming , or that it solves , if .

5. The corresponding decision problem is to decide, given , if there is an agent strategy solving .

For instance, solving LTL synthesis under assumptions means, given with environment assumption and agent goal , to decide if there is an agent strategy solving , and to return such a finite-state strategy (if one exists). We remark that solving LTL synthesis under assumptions is not immediate; we will provide algorithms in the next section. For now, we point out that deciding whether is an environment assumption amounts to checking if is environment realizable, itself a problem that can be solved by known results (i.e., Theorem 2).

Theorem 3.
1. Deciding if an LTL formula is an environment assumption is exptime-complete.

2. Deciding if a DPW is an environment assumption is in ptime in the size of the DPW and exponential in its number of colors.

We illustrate such notions with some examples.

Example 1.
1. The set , definable in LTL by the formula , is an environment assumption. It captures the situation that the agent assumes that the environment will use any of the strategies in .

2. In robot-action planning problems, typical environment assumptions encode the physical space, e.g., “if robot is in Room 1 and does action then in the next step it can only be in Rooms 1 or 4”. The set of environment strategies that realize these properties is an environment assumption, definable in LTL by a conjunction of formulas of the form . We will generalize this example by showing that the set of environment strategies in a planning domain can be viewed as an environment assumption definable in LTL.

4 Solving Synthesis under Assumptions

In this section we show how to solve synthesis under assumptions when the environment assumptions and agent goals are given in LTL or by DPW. The general idea is to reduce synthesis under assumptions to ordinary synthesis, i.e., synthesis of the implication . Although correct, understanding why it is correct is not immediate.

Lemma 4.

Let be an environment assumption and an agent goal. Then, every agent strategy that realizes also realizes assuming .

Proof.

Let be an agent strategy realizing (a). To show that realizes assuming let be an environment strategy realizing (b). Now consider the trace . We must show that satisfies . By (a) satisfies and by (b) satisfies .

We now observe that the converse is not true. Consider and , and let and . First note that is an environment assumption formula (indeed, the environment can realize by playing at the first step). Moreover, every environment strategy realizing begins by playing (since otherwise the agent could play on its first turn and falsify ). Thus, every agent strategy realizes assuming (since the environment’s first move is to play which makes true no matter what the agent does). On the other hand, not every agent strategy realizes (indeed, the strategy which plays on its first turn fails to satisfy the implication on the trace in which the environment plays on its first turn). In spite of the failure of the converse, the realizability problems are inter-reducible:333For all reasonable expressions , e.g., that define Borel sets [\citeauthoryearMartin1975].

Theorem 5.

Suppose is an environment assumption. The following are equivalent:

1. There is an agent strategy realizing .

2. There is an agent strategy realizing assuming .

Proof.

The previous lemma gives us . For the converse, suppose does not hold, i.e., is not agent-realizable. Now, an immediate consequence of Martin’s Borel Determinacy Theorem [\citeauthoryearMartin1975] is that for every in any reasonable specification formalism (including all the ones mentioned in this paper), is not agent realizable iff is environment realizable. Thus, is environment-realizable, i.e., . Note in particular that realizes , i.e., . Now, suppose for a contradiction that holds, and take realizing assuming . Then by definition of realizability under assumptions and using the fact that we have that . On the other hand, we have already seen that , a contradiction.

Moreover, we see that one can actually extract a strategy solving synthesis by assumptions simply by extracting a strategy for solving the implication , which itself can be done by known results, i.e., for LTL use Theorem 2 (part 1), and for DPW use Lemma 1 and Theorem 2 (part 2).

Theorem 6.
1. Solving LTL synthesis under assumptions is exptime-complete.

2. Solving DPW synthesis under assumptions is in ptime in the size of the automata and in exptime in the number of colors of the automata.

5 Planning under Assumptions

In this section we define planning under assumptions, that is synthesis wrt a domain444Domains can be thought of as compact representations of the arenas in games on graphs [\citeauthoryearGrädel, Thomas, and Wilke2002]. The player chooses actions, also represented compactly, and the environment resolves the nondeterminism. In addition, not every action needs to be available in every vertex of the arena.. We begin with a representation of fully-observable non-deterministic (FOND) domains [\citeauthoryearGhallab, Nau, and Traverso2004, \citeauthoryearGeffner and Bonet2013]. Our representation considers actions symmetrically to fluents, i.e., as assignments to certain variables.

A domain consists of:

• a non-empty set of environment Boolean variables, also called fluents; the elements of are called environment states,

• a non-empty set (disjoint from ) of action Boolean variables; the elements of are called actions,

• a non-empty set of initial environment states,

• a relation of available actions such that for every there is an with (we say that is available in ), and

• a relation such that implies that .

As is customary in planning and reasoning about actions, we assume domains are represented compactly by tuples where , , and (here ). This data induces the domain where

1. iff ,

2. iff ,

3. iff .

We emphasize that when measuring the size of we use this compact representation:

Definition 5.

The size of , written , is .

We remark that in PDDL action preconditions are declared using :precondition, conditional effects using the when operator, and nondeterministic outcomes using the oneof operator (note that we code actions with action variables).

Example 2 (Universal Domain).

Given and define the universal domain where , and .

We now define the set of environment strategies induced by a domain. We do this by describing a property , that itself can be represented in LTL and DPW, as shown below.

Definition 6.

Fix a domain . Define a property (over atoms ) as consisting of all traces such that

1. and

2. for all , if is available in for every then .

Observe that is an environment assumption since, by the definition of domain, whenever an action is available in a state there is at least one possible successor state. Intuitively, an environment strategy is in if i) its first move is to pick an initial environment state, and ii) thereafter, if the current action is available in the current environment state (and the same holds in all earlier steps) then the next environment state is constrained so that . Notice that is unconstrained the moment is not available in , e.g., in PDDL these would be actions for which the preconditions are not satisfied. Intuitively, this means that it is in the interest of the agent to play available actions because otherwise the agent can’t rely on the fact that the trace comes from the domain.

Remark 2.

The reader may be wondering why the above definition does not say i’) and ii’) for all , . Consider the linear-time property consisting of traces satisfying i’ and ii’. Observe that, in general, is not environment realizable. Indeed, condition ii’ implies that is available in . However, no environment strategy can force the agent to play an available action.

We now observe that one can express in LTL.

Lemma 7.

For every domain there is an LTL formula equivalent to . Furthermore, the size of the LTL formula is linear in the size of .

To see this, say domain is represented compactly by . For the LTL formula, let be the formula formed from the formula by replacing every term of the form by . Note that iff . The promised formula is

 init∧(Gδ′∨δ′U¬pre).

One can also express directly by a DPW.

Lemma 8.

For every domain there is a DPW equivalent to . Furthermore, the size of the DPW is at most exponential in the size of and has two colors.

To do this we define the DPW directly rather than translate the LTL formula (which would give a double exponential bound). Define the DPW over as follows. Introduce fresh symbols . Let be the initial state. Define . Define , and for all . For all the transitions are given in Table 1. Intuitively, on reading the input the DPW goes to the rejecting sink if (resp. ) is not respected, it goes to the accepting sink if (resp. ) is respected but is not, and otherwise it continues (and accepts).

Definition 7.

Let be a domain.

• A set is an environment assumption for the domain if is non-empty.

• is an environment assumption for the domain if is non-empty, i.e., if is an environment assumption for the domain .

We illustrate the notion with some examples.

Example 3.
1. is an environment assumption for since is environment realizable.

2. Let denote the following property: iff for all , if there are infinitely many such that and , then for every with there are infinitely many such that and . In words, this says that if a state-action pair occurs infinitely often, then infinitely often this is followed by every possible effect.

Note that is an environment assumption for domain since, e.g., the strategy that resolves the effects in a round-robin way realizes . Note that is definable in LTL by a formula of size exponential in :

 ⋀s∈E⋀a∈A(GF(s∧a)⊃⋀s′:(s,a,s′)∈ΔGF(s∧a∧Xs′)).
3. In planning, trajectory constraints, e.g., expressed in LTL, have been introduced for expressing temporally extended goals  [\citeauthoryearBacchus and Kabanza2000, \citeauthoryearGerevini et al.2009]. More recently, especially in the context of generalized planning, they have been used to describe restrictions on the environment as well [\citeauthoryearBonet and Geffner2015, \citeauthoryearDe Giacomo et al.2016, \citeauthoryearBonet et al.2017]. However, not all trajectory constraints can be used as assumptions. In fact, Definition 7, which says that a formula is an environment assumption for the domain if is environment realizable, characterizes those formulas that can serve as trajectory constraints.

We can check if is an environment assumption for by converting it to a DPW , converting into the DPW (as above), and then checking if the DPW is environment realizable. Hence we have:

Theorem 9.
1. Deciding if an LTL formula is an environment assumption for the domain is exptime-complete. Moreover, it can be solved in exptime in the size of and exptime in the size of .

2. Deciding if a DPW is an environment assumption for the domain is in exptime. Moreover, it can be solved in exptime in the size of and ptime in the size of and exptime in the number of colors of .

For the lower bound take to be the universal domain and apply the lower bound from Theorem 2.

Now we turn to planning under assumptions.

Definition 8 (Planning under Assumptions – abstract).
1. A planning under assumptions problem is a tuple where

• is a domain,

• is an environment assumption for , and

• is an agent goal.

2. We say that an agent strategy solves if

 ∀σenv∈Strenv(ωD)∩Ω.πσag,σenv∈Γ

We can instantiate this definition to environment assumptions and agent goals definable in .

Definition 9 (Planning under Assumptions – linear-time).
1. An planning under assumptions problem is a tuple where is an environment assumption for and is an agent goal.

2. We say that an agent strategy realizes assuming , or that it solves , if

 ∀σenv⊳(ωD∧ω).πσag,σenv⊨γ

The corresponding decision problem asks, given an planning under assumptions problem to decide whether there is an agent strategy that solves . For instance, LTL planning under assumptions asks, given with , to decide if there is an agent strategy that solves , and to return such a finite-state strategy (if one exists). Similar definitions apply to DPW planning under assumptions, etc.

It turns out that virtually all forms of planning (with linear-time temporally extended goals) in the literature are special cases of planning under LTL assumptions, i.e., the set of strategies that solve a given planning problem are exactly the set of strategies that solve the corresponding planning under assumptions problem. In the following, , and is the LTL formula expressing that if an action is done then its precondition holds.

Example 4.
1. FOND planning with reachability goals [\citeauthoryearRintanen2004] corresponds to LTL planning under assumptions with and .

2. FOND planning with LTL (temporally extended) goals [\citeauthoryearBacchus and Kabanza2000, \citeauthoryearPistore and Traverso2001, \citeauthoryearCamacho et al.2017]. corresponds to LTL planning under assumptions with and goal .

3. FOND planning with LTL trajectory constraints and LTL (temporally extended) goals [\citeauthoryearBonet and Geffner2015, \citeauthoryearDe Giacomo et al.2016, \citeauthoryearBonet et al.2017] corresponds to LTL planning under assumptions with assumptions and goal .

4. Fair FOND planning with reachability goals [\citeauthoryearDaniele, Traverso, and Vardi1999, \citeauthoryearGeffner and Bonet2013, \citeauthoryearD’Ippolito, Rodríguez, and Sardiña2018] corresponds to planning under assumptions with and .

5. Fair FOND planning with (temporally extended) goals as defined in [\citeauthoryearPatrizi, Lipovetzky, and Geffner2013, \citeauthoryearCamacho et al.2017] corresponds to planning under assumptions with and goal .

6. Obviously adding LTL trajectory constraints to fair FOND planning with (temporally extended) goals corresponds to planning under assumptions with and goal .

We also observe that the Fair FOND planning problems just mentioned can be captured by LTL planning under assumptions since can be written in LTL (see Example 3).

6 Translating between planning and synthesis

In this section we ask the question if there is a fundamental difference between synthesis and planning in our setting (i.e., assumptions and goals given as linear-time properties). We answer by observing that there are translations between them. The next two results follow immediately from the definitions:

Theorem 10 (Synthesis to Planning).

Let be a synthesis under Assumptions problem, and let be the corresponding Planning under Assumptions problem where is the universal domain. Then, for every agent strategy we have that solves iff realizes assuming .

Theorem 11 (Planning to Synthesis).

Let be a domain and let be a Planning under Assumptions problem. Let be the corresponding Synthesis under Assumptions problem. Then, for every agent strategy we have that solves iff realizes assuming .

Thus, we can solve LTL planning under assumptions by reducing to LTL synthesis under assumptions, which itself can be solved by known results (i.e., Theorem 2):

Corollary 12.

Solving LTL planning under assumptions is exptime-complete.

However, this does not distinguish the complexity measured in the size of the domain from that in the size of the assumption and goal formulas. We take this up next.

7 Solving Planning under Assumptions

In this section we show how to solve Planning under Assumptions for concrete specification languages , i.e., = LTL and = DPW. We measure the complexity in two different ways: we fix the domain and measure the complexity with respect to the size of the formulas or automata for the environment assumption and the agent goal, this is called goal/assumption complexity; and we fix the formulas/automata and measure the complexity with respect to the size of the domain, this is called the domain complexity. 555Formally, if is a complexity class, we say that goal/assumption complexity is in if for every domain the complexity of deciding if there is an agent strategy solving , is in . A similar definition holds for domain complexity. Also, we say that the goal/assumption complexity is -hard if there exists a domain such that the problem of deciding if there is an agent strategy solving , is -hard.

We begin with = DPW and consider the following algorithm: Given in which is represented by a DPW and is represented by a DPW , perform the following steps:

 Alg 1. Solving DPW planning under assumptions Given domain D, assumption Mω, goal Mγ. 1: Form DPW MD equivalent to ωD. 2: Form DPW M for (MD∧Mω)⊃Mγ. 3: Solve the parity game on M.

The first step results in a DPW whose size is exponential in the size of and with a constant number of colors (Lemma 8). The second step results in a DPW whose size is polynomial in the number of states of the DPWs involved (i.e., and ), and exponential in the number of their colors (Lemma 1). For the third step, the think of the DPW as a parity game: play starts in the initial state, and at each step, if is the current state of , first the environment picks and then the agent picks an action , i.e., an evaluation of the action variables. The subsequent step starts in the state of resulting from taking the unique transition from labeled . This produces a run of the DPW which the agent is trying to ensure is successful (i.e., the largest color occurring infinitely often is even).

Formally, we say that an agent strategy is winning if for every environment strategy , the unique run of the DPW on input word is successful. Deciding if the a player has a winning strategy, and returning a finite-state strategy (it one exists), is called solving the game. Parity games can be solved in time polynomial in the size of and exponential in the number of colors of  [\citeauthoryearGrädel, Thomas, and Wilke2002].666Better algorithms are known, e.g. [\citeauthoryearCalude et al.2017], but are not helpful for this paper.

The analysis of the above algorithm shows the following.

Theorem 13.
1. The domain complexity of solving DPW planning under assumptions is in exptime.

2. The goal/assumption complexity of solving DPW planning under assumptions is in ptime in their sizes and exptime in the number of their colors.

Moreover, by converting LTL formulas to DPW with exponentially many colors and double-exponential many states [\citeauthoryearVardi1995, \citeauthoryearPiterman2007], we get the upper bounds in the following:

Theorem 14.
1. The domain complexity of solving LTL planning under assumptions is exptime-complete.

2. The goal/assumption complexity of solving LTL planning under assumptions is exptime-complete.

For the matching lower-bounds, we have that the domain complexity is exptime-hard follows from the fact that planning with reachability goals and no assumptions is exptime-hard [\citeauthoryearRintanen2004]; to see that the goal/assumption complexity is exptime-hard note that LTL synthesis, known to be exptime-hard [\citeauthoryearPnueli and Rosner1989, \citeauthoryearRosner1992], is a special case (take and to be the universal domain).

Similarly, one can apply this technique to solving Fair LTL planning under assumptions. The exact complexity, however, is open. See the conclusion for a discussion.

8 Focusing on finite traces

In this section we revisit the definitions and results in case that assumptions and goals are expressed as linear-time properties over finite traces. There are two reasons to do this. First, in AI and CS applications executions of interest are often finite [\citeauthoryearDe Giacomo and Vardi2013]. Second, the algorithms presented for the infinite-sequence case involve complex constructions on automata/games that are notoriously hard to optimize [\citeauthoryearFogarty et al.2013]. Thus, we will not simply reduce the finite-trace case to the infinite-trace case [\citeauthoryearDe Giacomo, Masellis, and Montali2014]. We begin by carefully defining the setting.

Synthesis and linear-time specifications over finite traces

We define synthesis over finite traces in a similar way to the infinite-trace case, cf. [\citeauthoryearDe Giacomo and Vardi2015, \citeauthoryearCamacho, Bienvenu, and McIlraith2018]. The main difference is that agent strategies can be partial. This represents the situation that the agent stops the play. Environment strategies are total (as before). Thus, the resulting play may be finite, if the agent chooses to stop, as well as infinite.777Formally, is redefined to be the longest trace (it may be finite or infinite) that complies with both strategies. Objectives may be expressed in general specification formalisms for finite traces, e.g., = LTLf (LTL over finite traces888All our results for LTLf also hold for linear-dynamic logic over finite traces (LDLf) [\citeauthoryearDe Giacomo and Vardi2013]. ), = DFA (deterministic finite word automata). For , we overload notation and write for the set of finite traces defines.

We now define realizability in the finite-trace case:

Definition 10.

Let .

1. We say that realizes (written ) if .

2. We say that realizes (written ) if .

The asymmetry in the definition results from the fact that stopping is controlled by the agent.

Duality still holds, and is easier to prove since it amounts to determinacy of reachability games [\citeauthoryearGrädel, Thomas, and Wilke2002]:

Lemma 15 (Duality).

For every we have that is not agent realizable iff is environment realizable.

Linear temporal logic on finite traces (LTLf)

The logic LTLf has the same syntax as LTL but is interpreted on finite traces . Formally, for (the length of ) we only reinterpret the temporal operators:

• iff and ;

• iff there exists with such that and for all , .

Let denote the dual of , i.e., . Semantically we have that

• iff implies .

Deterministic finite automata (DFA)

A DFA over is a tuple which is like a DPW except that is replaced by a set of final states. The run on a finite input trace is successful if it ends in a final state. We recall that DFA are closed under Boolean operations using classic algorithms (e.g., see [\citeauthoryearVardi1995]). Also, LTLf formulas (and also LDLf formulas) can be effectively translated into DFA. This is done in three classic simple steps that highlight the power of the automata-theoretic approach: convert to an alternating automaton (poly), then into a nondeterministic finite automaton (exp), and then into a DFA (exp). These steps are outlined in detail in, e.g., [\citeauthoryearDe Giacomo and Vardi2013].

Solving Synthesis over finite traces

agent synthesis is the problem, given , of deciding if the agent can realize . Now, solving DFA agent synthesis is ptime-complete: it amounts to solving a reachability game on the given DFA , which can be done with an algorithm that captures how close the agent is to a final state, i.e., a least-fixpoint of the operation. Finally, to solve LTLf agent synthesis first translate the LTLf formula to a DFA and then run the fixpoint algorithm (also, LTLf agent synthesis is exptime-complete) [\citeauthoryearDe Giacomo and Vardi2015].

Note that, by Duality, solving LTLf environment realizability and solving LTLf agent realizability are inter-reducible (and thus the former is also exptime-complete). Thus, to decide if is environment realizable we simply negate the answer to whether is agent realizable. However, to extract an environment strategy, one solves the dual safety game.

Synthesis under assumptions

We say that is an environment assumption if is environment realizable. Solving synthesis under assumptions means to decide if there is an agent strategy such that

We now consider the case that . Checking if is an environment assumption is, by definition, the problem of deciding if is environment realizable, as just discussed. Hence we can state the following:

Theorem 16.
1. Deciding if an LTLf formula is an environment assumption is exptime-complete.

2. Deciding if a DFA is an environment assumption is ptime-compete (cf. [\citeauthoryearGrädel, Thomas, and Wilke2002]).

Turning to LTLf synthesis under assumptions we have that synthesis under assumptions and synthesis of the implication are equivalent. Indeed, as before, the key point is the duality which we have in Lemma 15:

Theorem 17.

Suppose is an environment assumption. The following are equivalent:

1. There is an agent strategy realizing .

2. There is an agent strategy realizing assuming .

Hence to solve synthesis under assumptions we simply solve agent synthesis for the implication. Hence we have:

Theorem 18.
1. Solving LTLf synthesis under assumptions is exptime-complete.

2. Solving DFA synthesis under assumptions is ptime-complete.

Planning under assumptions

Planning and fair planning have recently been studied for LTLf goals [\citeauthoryearDe Giacomo and Rubin2018, \citeauthoryearCamacho, Bienvenu, and McIlraith2018, \citeauthoryearCamacho et al.2018]. Here we define and study how to add environment assumptions.

Recall that we represent a planning domain by the linear-time property (Definition 6) which itself was defined as those infinite traces satisfying two conditions. The exact same conditions determine a set of finite traces, also denoted . Moreover, this is equivalent to an LTLf formula of size linear in and a DFA of size at most exponential in . To see this, replace by in the LTL formula from Lemma 7. That is, let be the LTLf formula formed from by replacing every term of the form by . Note that if then iff , and if then iff . The promised formula is . Also, similar to the DPW before there is a DFA of size at most exponential in the size of equivalent to . To see this, take the DPW from Lemma 8 and instead of define the set of final states to be the set .

As before, say that is an environment assumption for the domain if is environment realizable. Define an planning under assumptions problem to be a tuple with such that is an environment assumption for . To decide if is an environment assumption for we use the next algorithm: Alg 2. Deciding if is an environment assumption for Given domain , and DFA . 1: Convert into a DFA equivalent to . 2: Form the DFA for . 3: Decide if is environment realizable. Further, if is given as an LTLf formula, first convert it to a DFA and then run the algorithm. We then have:

Theorem 19.
1. Deciding if LTLf formula is an environment assumption for the domain is exptime-complete. Moreover, it can be solved in exptime in the size of and exptime in the size of .

2. Deciding if DFA is an environment assumption for the domain is in exptime. Moreover, it can be solved in exptime in the size of and ptime in the size of .

Solving Planning under Assumptions

As before, there are simple translations between planning under assumptions and synthesis under assumptions. And again, solving LTLf planning under assumptions via such a translation is not fine enough to analyze the complexity in the domain vs the goal/assumption. To solve DFA/LTLf planning under assumptions use the following simple algorithm:

 Alg 3. Solving DFA planning under assumptions Given domain D, assumption Mω, goal Mγ. 1: Convert D into a DFA MD equivalent to ωD. 2: Form the DFA M for (MD∧Mω)⊃Mγ. 3: Solve the reachability game on DFA M.

Further, if is given as an LTLf formula, first convert it to a DFA and then run the algorithm. This gives the upper bounds in the following:

Theorem 20.
1. The domain complexity of solving DFA (resp. LTLf) planning under assumptions is exptime-complete.

2. The goal/assumption complexity of solving DFA (resp. LTLf) planning under assumptions is ptime-complete (resp. exptime-complete).

For the lower bounds, setting results in FOND with reachability goals, known to be exptime-hard [\citeauthoryearRintanen2004]; and additionally taking the domain to be the universal domain results in DFA (resp. LTLf) synthesis, known to be ptime-hard [\citeauthoryearGrädel, Thomas, and Wilke2002] (resp. exptime-hard [\citeauthoryearDe Giacomo and Vardi2015]).

Finally, if is an LTLf planning under assumptions problem, say that fairly solves if for every we have that if then is finite and satisfies (here from Example 3 is defined so that it now also includes all finite traces). We remark that Alg applies unchanged. However, to solve the fair LTLf planning problem, we do not know a better way, in general, than translating the problem into one over infinite traces and applying the techniques from the previous section.

9 Conclusion and Outlook

While we illustrate synthesis and planning under assumptions expressed in linear-time specifications, our definitions immediately apply to assumptions expressed in branching-time specifications, e.g., , -calculus, and tree automata. As future work, it is of great interest to study synthesis under assumptions in the branching time setting so as to devise restrictions on possible agent behaviors with certain guarantees, e.g., remain in an area from where the agent can enforce the ability to reach the recharging doc, whenever it needs to, in the spirit of [\citeauthoryearDal Lago, Pistore, and Traverso2002].

Although our work is in the context of reasoning about actions and planning, we expect it can also provide insights to verification and to multi-agent systems. In particular, the undesirable drawback of the agent being able to falsify an assumption when synthesizing is well known, and it has been observed that it can be overcome when the is environment realizable [\citeauthoryearD’Ippolito et al.2013, \citeauthoryearBrenguier, Raskin, and Sankur2017]. Our Theorem 5 provides the principle for such a solution. Interestingly, various degrees of cooperation to fulfill assumptions among adversarial agents has been considered, e.g., [\citeauthoryearChatterjee and Henzinger2007, \citeauthoryearBloem, Ehlers, and Könighofer2015, \citeauthoryearBrenguier, Raskin, and Sankur2017] and we believe that a work like present one is needed to establish similar principled foundations.

Turning to the multi-agent setting, there, agents in a common environment interact with each other and may have their own objectives. Thus, it makes sense to model agents not as hostile to each other, but as rational, i.e., agents that act to achieve their own objectives. Rational synthesis [\citeauthoryearKupferman, Perelli, and Vardi2014] (as compared to classic synthesis) further requires that the strategy profile chosen by the agents is in equilibrium (various notions of equilibrium may be used). It would be interesting to investigate rational synthesis under environment assumptions, in the sense that all agents also make use of their own assumptions about their common environment. We believe that considering assumptions as sets of strategies rather than sets of traces will serve as a clarifying framework also for the multi-agent setting.

Finally, there are a number of open questions regarding the computational complexity of solving synthesis/planning under assumptions, i.e., what is the exact complexity of Fair LTL/LTLf planning under assumptions? what is the assumption complexity of LTL/LTLf synthesis under assumptions? Here, the assumption complexity is the complexity of the problem assuming the domain and goal are fixed, and the only input to the problem is the assumption formula/automaton.

References

• [\citeauthoryearBacchus and Kabanza2000] Bacchus, F., and Kabanza, F. 2000. Using temporal logics to express search control knowledge for planning. Artif. Intell. 116(1-2):123–191.
• [\citeauthoryearBloem, Ehlers, and Könighofer2015] Bloem, R.; Ehlers, R.; and Könighofer, R. 2015. Cooperative reactive synthesis. In Proc. of ATVA 2015. 394–410.
• [\citeauthoryearBonet and Geffner2015] Bonet, B., and Geffner, H. 2015. Policies that generalize: Solving many planning problems with the same policy. In IJCAI, 2798–2804. AAAI Press.
• [\citeauthoryearBonet et al.2017] Bonet, B.; De Giacomo, G.; Geffner, H.; and Rubin, S. 2017. Generalized planning: Non-deterministic abstractions and trajectory constraints. In IJCAI, 873–879.
• [\citeauthoryearBrenguier, Raskin, and Sankur2017] Brenguier, R.; Raskin, J.; and Sankur, O. 2017. Assume-admissible synthesis. Acta Inf. 54(1):41–83.
• [\citeauthoryearCalude et al.2017] Calude, C. S.; Jain, S.; Khoussainov, B.; Li, W.; and Stephan, F. 2017. Deciding parity games in quasipolynomial time. In STOC, 252–263. ACM.
• [\citeauthoryearCamacho et al.2017] Camacho, A.; Triantafillou, E.; Muise, C.; Baier, J. A.; and McIlraith, S. 2017. Non-deterministic planning with temporally extended goals: LTL over finite and infinite traces. In AAAI.
• [\citeauthoryearCamacho et al.2018] Camacho, A.; Baier, J. A.; Muise, C. J.; and McIlraith, S. A. 2018. Finite LTL synthesis as planning. In ICAPS, 29–38. AAAI Press.
• [\citeauthoryearCamacho, Bienvenu, and McIlraith2018] Camacho, A.; Bienvenu, M.; and McIlraith, S. A. 2018. Finite LTL synthesis with environment assumptions and quality measures. In KR, 454–463. AAAI Press.
• [\citeauthoryearChatterjee and Henzinger2007] Chatterjee, K., and Henzinger, T. A. 2007. Assume-guarantee synthesis. In Tools and Algorithms for the Construction and Analysis of Systems. 261–275.
• [\citeauthoryearCimatti et al.2003] Cimatti, A.; Pistore, M.; Roveri, M.; and Traverso, P. 2003. Weak, strong, and strong cyclic planning via symbolic model checking. Artif. Intell. 1–2(147).
• [\citeauthoryearDal Lago, Pistore, and Traverso2002] Dal Lago, U.; Pistore, M.; and Traverso, P. 2002. Planning with a language for extended goals. In AAAI, 447–454.
• [\citeauthoryearDaniele, Traverso, and Vardi1999] Daniele, M.; Traverso, P.; and Vardi, M. Y. 1999. Strong cyclic planning revisited. In ECP, 35–48.
• [\citeauthoryearDe Giacomo and Rubin2018] De Giacomo, G., and Rubin, S. 2018. Automata-theoretic foundations of fond planning for LTL/LDL goals. In IJCAI.
• [\citeauthoryearDe Giacomo and Vardi2013] De Giacomo, G., and Vardi, M. Y. 2013. Linear temporal logic and linear dynamic logic on finite traces. In IJCAI.
• [\citeauthoryearDe Giacomo and Vardi2015] De Giacomo, G., and Vardi, M. Y. 2015. Synthesis for LTL and LDL on finite traces. In IJCAI.
• [\citeauthoryearDe Giacomo et al.2016] De Giacomo, G.; Murano, A.; Rubin, S.; and Stasio, A. D. 2016. Imperfect-information games and generalized planning. In IJCAI, 1037–1043.
• [\citeauthoryearDe Giacomo, Masellis, and Montali2014] De Giacomo, G.; Masellis, R. D.; and Montali, M. 2014. Reasoning on ltl on finite traces: Insensitivity to infiniteness. In AAAI.
• [\citeauthoryearD’Ippolito et al.2013] D’Ippolito, N.; Braberman, V. A.; Piterman, N.; and Uchitel, S. 2013. Synthesizing nonanomalous event-based controllers for liveness goals. ACM Trans. Softw. Eng. Methodol. 22(1):9:1–9:36.
• [\citeauthoryearD’Ippolito, Rodríguez, and Sardiña2018] D’Ippolito, N.; Rodríguez, N.; and Sardiña, S. 2018. Fully observable non-deterministic planning as assumption-based reactive synthesis. J. Artif. Intell. Res. 61:593–621.
• [\citeauthoryearEisner and Fisman2006] Eisner, C., and Fisman, D. 2006. A practical introduction to PSL. Springer.
• [\citeauthoryearFinkbeiner2016] Finkbeiner, B. 2016. Synthesis of reactive systems. Dependable Software Systems Eng. 45:72–98.
• [\citeauthoryearFogarty et al.2013] Fogarty, S.; Kupferman, O.; Vardi, M. Y.; and Wilke, T. 2013. Profile trees for Büchi word automata, with application to determinization. In GandALF.
• [\citeauthoryearGeffner and Bonet2013] Geffner, H., and Bonet, B. 2013. A Coincise Introduction to Models and Methods for Automated Planning. Morgan & Claypool.
• [\citeauthoryearGerevini et al.2009] Gerevini, A.; Haslum, P.; Long, D.; Saetti, A.; and Dimopoulos, Y. 2009. Deterministic planning in the fifth international planning competition: PDDL3 and experimental evaluation of the planners. Artif. Intell. 173(5-6):619–668.
• [\citeauthoryearGhallab, Nau, and Traverso2004] Ghallab, M.; Nau, D. S.; and Traverso, P. 2004. Automated planning – Theory and Practice. Elsevier.
• [\citeauthoryearGrädel, Thomas, and Wilke2002] Grädel, E.; Thomas, W.; and Wilke, T., eds. 2002. Automata, Logics, and Infinite Games: A Guide to Current Research, LNCS 2500.
• [\citeauthoryearKupferman, Perelli, and Vardi2014] Kupferman, O.; Perelli, G.; and Vardi, M. Y. 2014. Synthesis with rational environments. In Multi-Agent Systems. 219–235.
• [\citeauthoryearMartin1975] Martin, D. A. 1975. Borel determinacy. Annals of Mathematics 363–371.
• [\citeauthoryearPatrizi, Lipovetzky, and Geffner2013] Patrizi, F.; Lipovetzky, N.; and Geffner, H. 2013. Fair LTL synthesis for non-deterministic systems using strong cyclic planners. In IJCAI, 2343–2349.
• [\citeauthoryearPistore and Traverso2001] Pistore, M., and Traverso, P. 2001. Planning as model checking for extended goals in non-deterministic domains. In IJCAI.
• [\citeauthoryearPiterman2007] Piterman, N. 2007. From nondeterministic büchi and streett automata to deterministic parity automata. Logical Methods in Computer Science 3(3).
• [\citeauthoryearPnueli and Rosner1989] Pnueli, A., and Rosner, R. 1989. On the synthesis of a reactive module. In POPL.
• [\citeauthoryearPnueli and Zuck1993] Pnueli, A., and Zuck, L. D. 1993. Probabilistic verification. Inf. Comput. 103(1):1–29.
• [\citeauthoryearPnueli1983] Pnueli, A. 1983. On the extremely fair treatment of probabilistic algorithms. In STOC.
• [\citeauthoryearReiter2001] Reiter, R. 2001. Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems. The MIT Press.
• [\citeauthoryearRintanen2004] Rintanen, J. 2004. Complexity of planning with partial observability. In ICAPS.
• [\citeauthoryearRosner1992] Rosner, R. 1992. Modular synthesis of reactive systems. Ph.D. Dissertation, PhD thesis, Weizmann.
• [\citeauthoryearVardi1995] Vardi, M. Y. 1995. An automata-theoretic approach to linear temporal logic. In Moller, F., and Birtwistle, G. M., eds., Logics for Concurrency - Structure versus Automata, volume 1043 of LNCS, 238–266. Springer.
• [\citeauthoryearVardi2011] Vardi, M. Y. 2011. The rise and fall of linear time logic. In GandALF.
You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters