Synthesis from Knowledge-Based Specifications†

†An extended abstract of this paper appeared in CONCUR’98. Work begun while both authors were visitors at the DIMACS Special Year on Logic and Algorithms. Work of the first author supported by an Australian Research Council Large Grant. Work of the second author supported in part by NSF grants CCR-9628400 and CCR-9700061, and by a grant from the Intel Corporation. Thanks to Kai Engelhardt, Yoram Moses and Nikolay Shilov for their comments on earlier versions of this paper.
Abstract
In program synthesis, we transform a specification into a program that is guaranteed to satisfy the specification. In synthesis of reactive systems, the environment in which the program operates may behave nondeterministically, e.g., by generating different sequences of inputs in different runs of the system. To satisfy the specification, the program needs to act so that the specification holds in every computation generated by its interaction with the environment. Often, the program cannot observe all attributes of its environment. In this case, we should transform a specification into a program whose behavior depends only on the observable history of the computation. This is called synthesis with incomplete information. In such a setting, it is desirable to have a knowledge-based specification, which can refer to the uncertainty the program has about the environment’s behavior. In this work we solve the problem of synthesis with incomplete information with respect to specifications in the logic of knowledge and time. We show that the problem has the same worst-case complexity as synthesis with complete information.
1 Introduction
One of the most significant developments in the area of design verification is the development of algorithmic methods for verifying temporal specifications of finite-state designs [11, 29, 45, 53]. The significance of this follows from the fact that a considerable number of the communication and synchronization protocols studied in the literature are in essence finite-state programs or can be abstracted as finite-state programs. A frequent criticism against this approach, however, is that verification is done after substantial resources have already been invested in the development of the design. Since designs invariably contain errors, verification simply becomes part of the debugging process. The critics argue that the desired goal is to use the specification in the design development process in order to guarantee the design of correct programs. This is called program synthesis.
The classical approach to program synthesis is to extract a program from a proof that the specification is satisfiable. For reactive programs, the specification is typically a temporal formula describing the allowable behaviors of the program [30]. Emerson and Clarke [14] and Manna and Wolper [31] showed how to extract programs from (finite representations of) models of the formula. In the late 1980s, several researchers realized that the classical approach is well suited to closed systems, but not to open systems [12, 42, 1]. In open systems the program interacts with the environment. A correct program should be able to handle arbitrary actions of the environment. If one applies the techniques of [14, 31] to open systems, one obtains programs that can handle only some actions of the environment.
Pnueli and Rosner [42], Abadi, Lamport and Wolper [1], and Dill [12] argued that the right way to approach synthesis of open systems is to consider the situation as a (possibly infinite) game between the environment and the program. A correct program can then be viewed as a winning strategy in this game. It turns out that satisfiability of the specification is not sufficient to guarantee the existence of such a strategy. Abadi et al. called specifications for which winning strategies exist realizable. A winning strategy can be viewed as an infinite tree. In those papers it is shown how the specification can be transformed into a tree automaton such that the specification is realizable precisely when this tree automaton is nonempty, i.e., it accepts some infinite tree. This yields a decision procedure for realizability. (This is closely related to the approach taken by Büchi and Landweber [8] and Rabin [46] to solve Church’s solvability problem [10].)
The works discussed so far deal with situations in which the program has complete information about the actions taken by the environment. This is called synthesis with complete information. Often, the program does not have complete information about its environment. Thus, the actions of the program can depend only on the “visible” part of the computation. Synthesizing such programs is called synthesis with incomplete information. The difficulty of synthesis with incomplete information follows from the fact that while in the complete-information case the strategy tree and the computation tree coincide, this is no longer the case when we have incomplete information. Algorithms for synthesis were extended to handle incomplete information in [43, 54, 3, 24, 51, 25].
It is important to note that temporal logic specifications cannot refer to the uncertainty of the program about the environment, since the logic has no construct for referring to such uncertainty. It has been observed, however, that designers of open systems often reason explicitly in terms of uncertainty [19]. A typical example is a rule of the form “send an acknowledgement as soon as you know that the message has been received”. For this reason, it has been proposed in [21] to use epistemic logic as a specification language for open systems with incomplete information. When dealing with ongoing behavior in systems with incomplete information, a combination of temporal and epistemic logic can refer to both behavior and uncertainty [28, 27]. In such a logic the above rule can be formalized by the formula $\Box(K\,\mathit{received} \rightarrow \mathit{ack})$, where $\Box$ is the temporal connective “always”, $K$ is the epistemic modality indicating knowledge, and $\mathit{received}$ and $\mathit{ack}$ are atomic propositions.
Reasoning about open systems at the knowledge level allows us to abstract away from many concrete details of the systems we are considering. It is often more intuitive to think in terms of the high-level concepts when we design a protocol, and then translate these intuitions into a concrete program, based on the particular properties of the setting we are considering. This style of program development will generally allow us to modify the program more easily when considering a setting with different properties, such as different communication topologies, different guarantees about the reliability of various components of the system, and the like. See [2, 6, 9, 13, 18, 22, 23, 36, 39, 40, 47] for examples of knowledge-level analysis of open systems with incomplete information. To translate these high-level intuitions into a concrete program, however, one has to be able to check that the given specification is realizable in the sense described above.
Our goal in this paper is to extend the program synthesis framework to temporal-epistemic specification. The difficulty that we face is that all previous program-synthesis algorithms attempt to construct strategy trees that realize the given specification. Such trees, however, refer to temporal behavior only and they do not contain enough information to interpret the epistemic constructs. (We note that this difficulty is different from the difficulty faced when one attempts to extend synthesis with incomplete information to branching-time specification [25], and the solution described there cannot be applied to knowledge-based specifications.) Our key technical tool is the definition of finitely labelled trees that contain information about both temporal behavior and epistemic uncertainty. Our main result is that we can extend the program synthesis framework to handle knowledge-based specification with no increase in worst-case computational complexity.
In an earlier, extended abstract of the present work [32] we stated this result for specifications in the logic of knowledge and linear time, and required the protocols synthesized to be deterministic. The present paper differs from the earlier work in giving full proofs of all results, as well as in the fact that we generalize the specification language to encompass branching as well as linear time logical operators. We also liberalize the class of solutions to encompass nondeterministic protocols. (Our previous result on deterministic protocols is easily recovered, by noting that the branching time specification language can express determinism of the solutions.) These generalizations allow us to give an application of the results to the synthesis of implementations of knowledge-based programs [16], a type of program in which an agent’s actions may depend on its knowledge.
The structure of the paper is as follows. Section 2 defines the syntax and semantics of the temporal-epistemic specification language and defines the synthesis problem for this language. In Section 3 we give a characterization of realizability that forms the basis for our synthesis result. Section 4 describes an automaton-theoretic algorithm for deciding whether a specification is realizable, and for extracting a solution in case it is. Section 5 discusses two aspects of this result: a subtlety concerning the knowledge encoded in the states of the solutions, and an application of our result to knowledge-based program implementation. The paper concludes in Section 6 with a discussion of extensions and open problems.
2 Definitions
In this section we define the formal framework within which we will study the problem of synthesis from knowledge-based specifications, provide semantics for the logic of knowledge and time in this framework, and define the realizability problem.
Systems will be decomposed in our framework into two components: the program, or protocol being run, and the remainder of the system, which we call the environment within which this protocol operates. We begin by presenting a model, from [34], for the environment. This model is an adaptation of the notion of context of Fagin et al. [16]. Our main result in this paper is restricted to the case of a single agent, but as we will state a result in Section 5.1 that applies in a more general setting, we define the model assuming a finite number of agents.
Intuitively, we model the environment as a finite-state transition system, with the transitions labelled by the agents’ actions. For each agent $i = 1, \ldots, n$ let $ACT_i$ be a set of actions associated with agent $i$. We will also consider the environment as able to perform actions, so assume additionally a set $ACT_e$ of actions for the environment. A joint action will consist of an action for each agent and an action for the environment, i.e., the set of joint actions is the cartesian product $ACT = ACT_e \times ACT_1 \times \cdots \times ACT_n$.
Suppose we are given such a set of actions, together with a set $Prop$ of atomic propositions. Define a finite interpreted environment for $n$ agents to be a tuple $E$ of the form $\langle S_e, I_e, P_e, \tau, O_1, \ldots, O_n, \pi_e \rangle$ where the components are as follows:

$S_e$ is a finite set of states of the environment. Intuitively, states of the environment may encode such information as messages in transit, failure of components, etc. and possibly the values of certain local variables maintained by the agents.

$I_e$ is a subset of $S_e$, representing the possible initial states of the environment.

$P_e : S_e \rightarrow \mathcal{P}(ACT_e)$ is a function, called the protocol of the environment, mapping states to subsets of the set $ACT_e$ of actions performable by the environment. Intuitively, $P_e(s)$ represents the set of actions that may be performed by the environment when the system is in state $s$. We assume that this set is nonempty for all $s \in S_e$.

$\tau$ is a function mapping joint actions $\mathbf{a} \in ACT$ to state transition functions $\tau(\mathbf{a}) : S_e \rightarrow S_e$. Intuitively, when the joint action $\mathbf{a}$ is performed in the state $s$, the resulting state of the environment is $\tau(\mathbf{a})(s)$.

For each $i = 1, \ldots, n$, the component $O_i$ is a function, called the observation function of agent $i$, mapping the set of states $S_e$ to some set $\mathcal{O}$. If $s$ is a global state then $O_i(s)$ will be called the observation of agent $i$ in the state $s$.

$\pi_e$ is an interpretation, mapping each state $s \in S_e$ to an assignment of truth values to the atomic propositions in $Prop$.
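A run $r$ of an environment $E$ is an infinite sequence $s_0, s_1, \ldots$ of states such that $s_0 \in I_e$ and for all $m \geq 0$ there exists a joint action $\mathbf{a} = \langle a_e, a_1, \ldots, a_n \rangle$ such that $s_{m+1} = \tau(\mathbf{a})(s_m)$ and $a_e \in P_e(s_m)$. For $m \geq 0$ we write $r(m)$ for $s_m$. For $k \leq m$ we also write $r[k..m]$ for the sequence $s_k \ldots s_m$ and $r[m..]$ for $s_m s_{m+1} \ldots$.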
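As a rough illustration of these definitions, the following Python sketch stores a finite interpreted environment as a record and checks whether a finite sequence of states is a prefix of some run. The class name `Environment`, its field names, and the one-bit `toy` instance are assumptions made for illustration; none of them come from the paper.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, FrozenSet, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable


@dataclass(frozen=True)
class Environment:
    """Hypothetical record for a finite interpreted environment (names are illustrative)."""
    states: FrozenSet[State]                                   # set of environment states
    initial: FrozenSet[State]                                  # initial states, a subset of states
    env_protocol: Callable[[State], FrozenSet[Action]]         # environment protocol, nonempty per state
    agent_actions: Tuple[FrozenSet[Action], ...]               # one action set per agent
    transition: Callable[[Tuple[Action, ...], State], State]   # joint action (a_e, a_1, ..., a_n) and state
    observations: Tuple[Callable[[State], Hashable], ...]      # one observation function per agent
    valuation: Callable[[State], Dict[str, bool]]              # truth values of atomic propositions

    def successors(self, s: State) -> set:
        """States reachable from s by one joint action whose environment part is allowed in s."""
        return {
            self.transition((a_e,) + agent_part, s)
            for a_e in self.env_protocol(s)
            for agent_part in product(*self.agent_actions)
        }

    def is_run_prefix(self, seq: Sequence[State]) -> bool:
        """True if seq starts in an initial state and every step is a legal transition."""
        if not seq or seq[0] not in self.initial:
            return False
        return all(nxt in self.successors(cur) for cur, nxt in zip(seq, seq[1:]))


# Toy instance (an assumption): one bit of state, one agent that can flip or keep it.
toy = Environment(
    states=frozenset({0, 1}),
    initial=frozenset({0}),
    env_protocol=lambda s: frozenset({"idle"}),
    agent_actions=(frozenset({"flip", "keep"}),),
    transition=lambda a, s: 1 - s if a[1] == "flip" else s,
    observations=(lambda s: s,),
    valuation=lambda s: {"high": s == 1},
)
```

Because the environment protocol and every agent's action set are nonempty, each state has at least one successor, so any sequence accepted by `is_run_prefix` can be extended to an infinite run.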
A point is a tuple $(r, m)$, where $r$ is a run and $m$ a natural number. Intuitively, a point identifies a particular instant of time along the history described by the run. A run $r'$ will be said to be a run through a point $(r, m)$ if $r[0..m] = r'[0..m]$. Intuitively, this is the case when the two runs $r$ and $r'$ describe the same sequence of events up to time $m$.
Runs of an environment provide sufficient structure for the interpretation of formulae of linear temporal logic. To interpret formulae involving knowledge, we need additional structure. Knowledge arises not from a single run, but from the position a run occupies within the collection of all possible runs of the system under study. Following [16], define a system to be a set $\mathcal{R}$ of runs and an interpreted system to be a tuple $\mathcal{I} = (\mathcal{R}, \pi)$ consisting of a system $\mathcal{R}$ together with an interpretation function $\pi$ mapping the points of runs in $\mathcal{R}$ to assignments of truth value to the propositions in $Prop$. As we will show below, interpreted systems also provide enough structure to interpret branching time logics [11], by means of a slight modification of the usual semantics for such logics.
All the interpreted systems we deal with in this paper will have all runs drawn from the same environment, and the interpretation $\pi$ derived from the interpretation $\pi_e$ of the environment by means of the equation $\pi(r, m)(p) = \pi_e(r(m))(p)$, where $(r, m)$ is a point and $p$ an atomic proposition. That is, the value of a proposition at a point of a run is determined from the state of the environment at that point, as described by the environment generating the run.
The definition of run presented above is a slight modification of the definitions of Fagin et al. [16]. Roughly corresponding to our notion of state of the environment is their notion of a global state, which has additional structure. Specifically, a global state identifies a local state for each agent, which plays a crucial role in the semantics of knowledge. We have avoided the use of such extra structure in our states because we focus on just one particular definition of local states that may be represented in the general framework of [16].
In particular, we will work with respect to a synchronous perfect-recall semantics of knowledge. Given a run $r$ of an environment with observation functions $O_i$, we define the local state of agent $i$ at time $m \geq 0$ to be the sequence $r_i(m) = O_i(r(0)) \ldots O_i(r(m))$. That is, the local state of an agent at a point in a run consists of a complete record of the observations the agent has made up to that point.
These local states may be used to define for each agent $i$ a relation $\sim_i$ of indistinguishability on points, by $(r, m) \sim_i (r', m')$ if $r_i(m) = r'_i(m')$. Intuitively, when $(r, m) \sim_i (r', m')$, agent $i$ has failed to receive enough information to time $m$ in run $r$ and time $m'$ in run $r'$ to determine whether it is in one situation or the other. Clearly, each $\sim_i$ is an equivalence relation. The use of the term “synchronous” above is due to the fact that an agent is able to determine the time simply by counting the number of observations in its local state. This is reflected in the fact that if $(r, m) \sim_i (r', m')$, we must have $m = m'$. (There also exists an asynchronous version of perfect recall [16], which will not concern us in the present paper.)
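The perfect-recall local state and the induced indistinguishability relation can be sketched on finite run prefixes as follows. This is a minimal illustration: the function names and the pair-shaped example states are assumptions, and a prefix is taken to be the list of states from time 0 up to the time of interest.

```python
from typing import Callable, Hashable, Sequence, Tuple


def local_state(prefix: Sequence[Hashable],
                observe: Callable[[Hashable], Hashable]) -> Tuple[Hashable, ...]:
    """Sequence of observations made along the prefix, from time 0 to its last state."""
    return tuple(observe(s) for s in prefix)


def indistinguishable(prefix_a: Sequence[Hashable],
                      prefix_b: Sequence[Hashable],
                      observe: Callable[[Hashable], Hashable]) -> bool:
    """Two points are indistinguishable when the agent's local states coincide."""
    return local_state(prefix_a, observe) == local_state(prefix_b, observe)


# Hypothetical states are pairs (hidden, visible); the agent sees only the second part.
see_second = lambda s: s[1]
same_view = indistinguishable([(1, 0), (0, 0)], [(0, 0), (1, 0)], see_second)
different_time = indistinguishable([(1, 0)], [(1, 0), (0, 0)], see_second)
```

Here `same_view` is true because both prefixes yield the observation sequence `(0, 0)`, while `different_time` is false: local states of different lengths can never be equal, which is the synchrony property noted above.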
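To specify systems, we will use a propositional multimodal language for knowledge and time based on a set $Prop$ of atomic propositions, with formulae generated by the modalities $\bigcirc$ (next time), $\mathcal{U}$ (until), and a knowledge operator $K_i$ for each agent $i = 1, \ldots, n$. Time may be either branching or linear, so we also consider the branching time quantifier $\exists$. More precisely, the set of formulae of the language is defined as follows: each atomic proposition $p \in Prop$ is a formula, and if $\varphi$ and $\psi$ are formulae, then so are $\neg\varphi$, $\varphi \wedge \psi$, $\bigcirc\varphi$, $\varphi\,\mathcal{U}\,\psi$, $\exists\varphi$ and $K_i\varphi$ for each $i = 1, \ldots, n$. As usual, we use the abbreviations $\Diamond\varphi$ for $\mathit{true}\,\mathcal{U}\,\varphi$, and $\Box\varphi$ for $\neg\Diamond\neg\varphi$.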
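The semantics of this language is defined as follows. Suppose we are given an interpreted system $\mathcal{I} = (\mathcal{R}, \pi)$, where $\mathcal{R}$ is a set of runs of environment $E$ and $\pi$ is determined from the environment as described above. We define satisfaction of a formula $\varphi$ at a point $(r, m)$ of a run in $\mathcal{R}$, denoted $\mathcal{I}, (r, m) \models \varphi$, inductively on the structure of $\varphi$. The cases for the temporal fragment of the language are standard:

$\mathcal{I}, (r, m) \models p$, where $p$ is an atomic proposition, if $\pi(r, m)(p) = 1$,

$\mathcal{I}, (r, m) \models \varphi_1 \wedge \varphi_2$, if $\mathcal{I}, (r, m) \models \varphi_1$ and $\mathcal{I}, (r, m) \models \varphi_2$,

$\mathcal{I}, (r, m) \models \neg\varphi$, if not $\mathcal{I}, (r, m) \models \varphi$,

$\mathcal{I}, (r, m) \models \bigcirc\varphi$, if $\mathcal{I}, (r, m+1) \models \varphi$,

$\mathcal{I}, (r, m) \models \varphi_1\,\mathcal{U}\,\varphi_2$, if there exists $k \geq m$ such that $\mathcal{I}, (r, k) \models \varphi_2$ and $\mathcal{I}, (r, l) \models \varphi_1$ for all $l$ with $m \leq l < k$.

$\mathcal{I}, (r, m) \models \exists\varphi$ if there exists a run $r'$ in $\mathcal{R}$ through $(r, m)$ such that $\mathcal{I}, (r', m) \models \varphi$.

The semantics of the knowledge operators is defined by

$\mathcal{I}, (r, m) \models K_i\varphi$, if $\mathcal{I}, (r', m') \models \varphi$ for all points $(r', m')$ of $\mathcal{I}$ satisfying $(r', m') \sim_i (r, m)$.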
That is, an agent knows a formula to be true if this formula holds at all points that it is unable to distinguish from the actual point. This definition follows the general framework for the semantics of knowledge proposed by Halpern and Moses [21]. We use the particular equivalence relations obtained from the assumption of synchronous perfect recall, but the same semantics for knowledge applies for other ways of defining local states, and hence the relations $\sim_i$. We refer the reader to [21, 16] for further background on this topic.
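For reference, unfolding the abbreviations for “eventually” and “always” through the clauses for until and negation gives the expected derived conditions. The derivation below is a routine consequence of the definitions, written with $\mathcal{I}, (r, m) \models \varphi$ for satisfaction at the point $(r, m)$, $\mathcal{U}$ for until, $\Diamond$ for eventually and $\Box$ for always; it assumes that $\mathit{true}$ abbreviates a formula satisfied at every point (for instance $\neg(p \wedge \neg p)$).

```latex
\begin{align*}
\mathcal{I},(r,m) \models \Diamond\varphi
  &\iff \mathcal{I},(r,m) \models \mathit{true}\;\mathcal{U}\;\varphi \\
  &\iff \exists k \geq m:\ \mathcal{I},(r,k) \models \varphi, \\
\mathcal{I},(r,m) \models \Box\varphi
  &\iff \mathcal{I},(r,m) \models \neg\Diamond\neg\varphi \\
  &\iff \forall k \geq m:\ \mathcal{I},(r,k) \models \varphi .
\end{align*}
```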
The systems we will be interested in will not have completely arbitrary sets of runs, but rather will have sets of runs that arise from the agents running some program, or protocol, within a given environment. Intuitively, an agent’s choice of actions in such a program should depend on the information it has been able to obtain about the environment, but no more. We have used observations to model the agent’s source of information about the environment. The maximum information that an agent $i$ has about the environment at a point $(r, m)$ is given by the local state $r_i(m)$. Thus, it is natural to model an agent’s program as assigning to each local state of the agent a nonempty set of actions for that agent. We define a protocol for agent $i$ to be a function $P_i : \mathcal{O}^+ \rightarrow \mathcal{P}(ACT_i) \setminus \{\emptyset\}$. A joint protocol $\mathbf{P}$ is a tuple $\langle P_1, \ldots, P_n \rangle$, where each $P_i$ is a protocol for agent $i$. We say that $\mathbf{P}$ is deterministic if $P_i(\sigma)$ is a singleton for all agents $i$ and local states $\sigma$.
The systems we consider will consist of all the runs in which at each point of time each agent behaves as required by its protocol. As usual, we also require that the environment follows its own protocol. Formally, the system generated by a joint protocol $\mathbf{P}$ in environment $E$ is the set $\mathcal{R}(\mathbf{P}, E)$ of all runs $r$ of $E$ such that for all $m \geq 0$ we have $r(m+1) = \tau(\mathbf{a})(r(m))$, where $\mathbf{a}$ is a joint action in $P_e(r(m)) \times P_1(r_1(m)) \times \cdots \times P_n(r_n(m))$. The interpreted system generated by a joint protocol $\mathbf{P}$ in environment $E$ is the interpreted system $\mathcal{I}(\mathbf{P}, E) = (\mathcal{R}(\mathbf{P}, E), \pi)$, where $\pi$ is the interpretation derived from the environment as described above.
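The construction of the generated system can be sketched, for a single agent and for finite prefixes only, as a breadth-first unfolding. The function name `generated_prefixes`, its parameters, and the one-bit example are assumptions made for illustration; the definition above concerns infinite runs and joint protocols.

```python
def generated_prefixes(initial, env_protocol, protocol, transition, observe, length):
    """All state sequences of the given length (>= 1) that a single-agent protocol can generate.

    protocol maps the agent's local state (a tuple of observations) to a nonempty set of actions;
    env_protocol maps a state to the nonempty set of environment actions allowed there.
    """
    prefixes = {(s,) for s in initial}
    for _ in range(length - 1):
        extended = set()
        for prefix in prefixes:
            local = tuple(observe(s) for s in prefix)   # perfect-recall local state
            last = prefix[-1]
            for a_e in env_protocol(last):
                for a in protocol(local):
                    extended.add(prefix + (transition(a_e, a, last),))
        prefixes = extended
    return prefixes


# Hypothetical example: one bit of state, fully observed; the agent must flip a 0
# but may flip or keep a 1.
example = generated_prefixes(
    initial={0},
    env_protocol=lambda s: {"idle"},
    protocol=lambda local: {"flip"} if local[-1] == 0 else {"flip", "keep"},
    transition=lambda a_e, a, s: 1 - s if a == "flip" else s,
    observe=lambda s: s,
    length=3,
)
```

In this example the first step is forced (the bit is flipped to 1) and the second step is a free choice, so `example` contains exactly the prefixes `(0, 1, 0)` and `(0, 1, 1)`.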
Finally, we may define the relation between specifications and implementations that is our main topic of study. We say that a joint protocol $\mathbf{P}$ realizes a specification $\varphi$ in an environment $E$ if for all runs $r$ of $\mathcal{I}(\mathbf{P}, E)$ we have $\mathcal{I}(\mathbf{P}, E), (r, 0) \models \varphi$. A specification $\varphi$ is realizable in environment $E$ if there exists a joint protocol $\mathbf{P}$ that realizes $\varphi$ in $E$. The following example illustrates the framework and provides examples of realizable and unrealizable formulae.
Example 1
Consider a timed toggle switch with two positions (on, off), with a light intended to indicate the position. If the light is on, then the switch must be in the on position. However, the light is faulty, so it might be off when the switch is on. Suppose that there is a single agent that has two actions: “toggle” and “do nothing”. If the agent toggles, the switch changes position. If the agent does nothing, the toggle either stays in the same position or, if it is on, may timeout and switch to off automatically. The timer is unreliable, so the timeout may happen any time the switch is on, or never, even if the switch remains on forever. The agent observes only the light, not the toggle position.
This system may be represented as an environment with states consisting of pairs $(t, l)$, where $t$ is a boolean variable indicating the toggle position and $l$ is a boolean variable representing the light, subject to the constraint that $t = 1$ if $l = 1$. The agent’s observation function $O_1$ is given by $O_1((t, l)) = l$. To represent the effect of the agent’s actions on the state, write $T$ for the toggle action and $\Lambda$ for the agent’s null action. The environment’s actions may be taken to be pairs $(t', l')$ where $t'$ and $l'$ are boolean variables indicating, respectively, that the environment times out the toggle, and that it switches the light on (provided the switch is on). Thus, the transition function is given by $\tau(\langle (t', l'), a \rangle)((t, l)) = (t'', l'')$ where (i) $t'' = \neg t$ if either $a = T$ or $t = t' = 1$, else $t'' = t$, and (ii) $l'' = 1$ iff $t'' = 1$ and $l' = 1$.
If “toggle-on” is the proposition true in states $(t, l)$ where $t = 1$, then the formula $\Box(K_1\,\text{toggle-on} \vee K_1 \neg\text{toggle-on})$ expresses that the agent knows at all times whether or not the toggle is on. This formula is realizable when the initial states of the environment are those in which the toggle is on (and the light is either on or off). The protocol by which the agent realizes this formula is that in which it performs $T$ at all steps. Since it has perfect recall it can determine whether the toggle is on or off by checking if it has made (respectively) an odd or an even number of observations.
However, the same formula is not realizable if all states are initial. In this case, if the light is off at time 0, the agent cannot know whether the switch is on. As it has had at time 0 no opportunity to influence the state of the environment through its actions, this is the case whatever the agent’s protocol.
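The two claims of this example can be checked mechanically for protocols that repeat a single action. The sketch below tracks, for each observation history, the set of states the agent considers possible, and tests that every such set agrees on the toggle position. It rests on assumptions: the transition rule in `step` is read off the informal description of the switch (toggling flips it, a timeout turns an “on” switch off, the light can only be lit when the switch is on), every environment action is taken to be available in every state, and all identifiers are hypothetical.

```python
from itertools import product

TOGGLE, NOOP = "T", "Lambda"
ENV_ACTIONS = list(product((0, 1), repeat=2))   # pairs (timeout, light_on)
# States are pairs (toggle, light); the light can be on only if the toggle is on.
ALL_STATES = [(t, l) for t, l in product((0, 1), repeat=2) if t == 1 or l == 0]
ON_STATES = [s for s in ALL_STATES if s[0] == 1]


def observe(state):
    """The agent sees only the light."""
    return state[1]


def step(env_action, agent_action, state):
    """Assumed transition: toggling flips the switch; a timeout turns an 'on' switch off."""
    timeout, light_on = env_action
    t, _ = state
    t_next = 1 - t if agent_action == TOGGLE or (t == 1 and timeout == 1) else t
    l_next = 1 if t_next == 1 and light_on == 1 else 0
    return (t_next, l_next)


def split_by_observation(states):
    """Group states into the sets the agent cannot tell apart by its current observation."""
    groups = {}
    for s in states:
        groups.setdefault(observe(s), set()).add(s)
    return [frozenset(g) for g in groups.values()]


def always_knows_toggle(initial_states, agent_action):
    """For the protocol performing agent_action at every step, check that each reachable
    set of states consistent with some observation history agrees on the toggle position."""
    pending = split_by_observation(initial_states)
    visited = set()
    while pending:
        possible = pending.pop()
        if possible in visited:
            continue
        visited.add(possible)
        if len({t for t, _ in possible}) > 1:
            return False          # indistinguishable states disagree on the toggle
        successors = {step(e, agent_action, s) for s in possible for e in ENV_ACTIONS}
        pending.extend(split_by_observation(successors))
    return True
```

Under these assumptions, `always_knows_toggle(ON_STATES, TOGGLE)` holds, while `always_knows_toggle(ALL_STATES, TOGGLE)` fails already at time 0, where the states $(0,0)$ and $(1,0)$ give the same observation. The check covers only constant protocols; it is not a decision procedure for realizability.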
3 A Characterization of Realizability
In this section we characterize realizability in environments for a single agent in terms of the existence of a certain type of labelled tree. Intuitively, the nodes of this tree correspond to the local states of the agent, and the label at a node is intended to express (i) the relevant knowledge of the agent and (ii) the action the agent performs when in the corresponding local state.
Consider $\mathcal{O}^*$, the set of all finite sequences of observations of agent 1, including the empty sequence. This set may be viewed as an infinite tree, where the root is the null sequence and the successors of a vertex $v$ are the vertices $v \cdot o$, where $o \in \mathcal{O}$ is an observation. A labelling of $\mathcal{O}^*$ is a function $T : \mathcal{O}^* \rightarrow L$ for some set $L$. We call $T$ a labelled tree. We will work with trees in which the labels are constructed from the states of the environment, a formula $\varphi$ and the actions of the agent. Define an atom for a formula $\varphi$ to be a mapping $X$ from the set of all subformulae of $\varphi$ to $\{0, 1\}$. A knowledge set for $\varphi$ in $E$ is a set of pairs of the form $(X, s)$, where $X$ is an atom of $\varphi$ and $s$ is a state of $E$. Take $L_{\varphi, E}$ to be the set of all pairs of the form $(K, B)$ where $K$ is a knowledge set for $\varphi$ in $E$ and $B$ is a nonempty set of actions of agent 1. We will consider trees that are labellings of $\mathcal{O}^*$ by $L_{\varphi, E}$. We will call such a tree a labelled tree for $\varphi$ and $E$.
Given such a labelled tree $T$, we may define the functions $K$, mapping $\mathcal{O}^*$ to knowledge sets, and $P$, mapping $\mathcal{O}^*$ to nonempty sets of actions of agent 1, such that for all $v \in \mathcal{O}^*$ we have $T(v) = (K(v), P(v))$. Note that $P$ is a protocol for agent 1. This protocol generates an interpreted system $\mathcal{I}(P, E)$ in the given environment $E$. Intuitively, we are interested in trees in which the $K(v)$ describe the states of knowledge of the agent in this system. We now set about stating some constraints on the labels in the tree that are intended to ensure this is the case.
Suppose we are given a sequence of states and a vertex of with for some . Then we obtain a branch of , where and for . We say that is a run of from if there exists an atom such that , and for each there exists a joint action such that . That is, the actions of agent 1 labelling the branch corresponding to , together with some choice of the environment’s actions, generate the sequence of states in the run.
We now define a relation on points of the runs from vertices of . This relation interprets subformulae of by treating the linear temporal operators as usual, but referring to the knowledge sets to interpret formulae involving knowledge or the branching time operator. Intuitively, asserts that the formula “holds” at the th vertex reached from along , as described above. More formally, this relation is defined by means of the following recursion:

if

if not .

if and .

if

if there exists such that for and .

if for all , where is determined as above.

if for some , with , where is determined as above.
We use the abbreviation for . (The choice of the vertex here is not really significant: it is not difficult to show that for all we have iff .)
Define a labelled tree for and to be acceptable if it satisfies the following conditions:
 (Real)

For all observations , and for all , we have and .
 (Init)

For all initial states , there exists an atom for such that is in .
 (Obs)

For all observations and all vertices of , we have for all .
 (Pred)

For all observations , for all vertices other than the root, and for all , there exists and a joint action such that .
 (Succ)

For all vertices other than the root, for all and for all , if then there exists an atom such that .
 (sound)

For all vertices , and , if then there exists such that and .
 (comp)

For all vertices , and , if then .
 (Ksound)

For all vertices (other than the root) and all , there exists a run from such that and for all subformulae of we have iff .
 (Kcomp)

For all vertices and all runs from there exists such that and for all subformulae of we have iff .
The following theorem provides the characterization of realizability of knowledge-based specifications that forms the basis for our synthesis procedure.
Theorem 3.1
A specification $\varphi$ for a single agent is realizable in the environment $E$ iff there exists an acceptable labelled tree for $\varphi$ in $E$.
Proof: We first show that if there exists an acceptable tree then the specification is realizable. Suppose is an acceptable tree for in . We show that the protocol for agent 1 derived from this tree realizes . Let be the system generated by in .
We claim that for all points of and all subformulae of we have iff . It follows from this that realizes in . For, let be a run of . Take to be the vertex . By Init, there exists a pair with . Thus, is a run of from the vertex . By Kcomp, there exists an atom such that is in and for all subformulae of , we have iff . By the claim, we obtain in particular that iff . But by Real, we have that , so also holds. This shows that realizes in .
The proof of the claim is by induction on the complexity of . The base case, when is an atomic proposition, is straightforward, as are the cases where is built using boolean or temporal operators from subformulae satisfying the claim. We establish the cases where is of the form or .
We first assume that , and show . That is, for all we show . By Ksound, for each there exists a run of from with and iff . Applying Pred and Init, this run may be extended backwards to a run of with and . By the assumption, we have that . It follows using the induction hypothesis that . Note that this implies , hence .
Conversely, we suppose that and show that . Suppose that is a run of with . We need to prove that . Using Init, is a run of from . By Succ and induction, is a run of from . Thus, by Kcomp, there exists such that and for all subformulae of we have iff . By the assumption that , we have that for all . Thus, . By the induction hypothesis it follows that , which is what we set out to establish. This completes the proof of the claim, and also the proof that the existence of an acceptable tree implies the existence of a realization.
For the case where , we argue as follows. First, we assume that and show that . From the assumption, there exists such that and . By Ksound, there exists a run from in such that and . Let . This is a run of , and we have . By the induction hypothesis, it follows that . Since , it follows that .
Conversely, assume that . We show that . From assumption, there exists a run such that and . By induction, we have . This is equivalent to . By Kcomp, there exists such that and . It follows that . This completes the argument from the existence of an acceptable tree for in to realizability of in .
Next, we show that if is realizable in then there exists an acceptable tree for and . Suppose that the protocol for agent 1 realizes in . We construct a labelled tree as follows. Let be the system generated by in . If is a point of , define the atom by iff . Define the function to map the point of to the pair . For all in , define to be the set of all , where is a point of with . Define by for each . (The label of the root can be chosen arbitrarily.) We claim that is an acceptable tree for and .
For Real, let for an observation , and suppose that . Then there exists a run of such that and . Since realizes , we have that and it is immediate that .
For Init, let be an initial state. Take to be any run of with and let . Then .
For Obs, Let be an observation and a vertex of . If , then there exists a run such that , and , where . This implies that , as required.
For Pred, let be an observation, a vertex of other than the root, and . Then there exists a run of such that and and . Let and . Since we have . It follows that . Moreover, since is a run of , there exists an action such that . This gives the conditions required for the consequent of Pred.
For Succ, let be a vertex other than the root, and consider and . Let . By construction, there exists a run such that and and , where . Let be any run extending . Take . Then and , so , as required for Succ.
For sound, suppose that and . We need to show that there exists such that and . By construction, there exists a run of such that and , where . Moreover, we have . Thus, there exists a run of such that and . Let ; plainly, this satisfies . Note that . It follows from that . Thus, , and suffices for the required conclusion.
For comp, suppose and . We show that . By construction, there exist runs such that and and and . Since we have . Consider the sequence . This is a run of with . Thus, . Since satisfaction of formulas depends only on the future and , we obtain that . This yields that , as required.
We next prove Ksound and Kcomp. For this, we first prove that for all points of we have iff . The proof is by induction on the complexity of . As above, the cases not involving knowledge are straightforward, so we focus on the case where is of the form . By definition, iff for all . By definition of and the induction hypothesis, this holds just when for all . This latter condition is equivalent to , so we are done.
For Ksound, suppose that is a vertex not equal to the root and that is in . Then there exists a point of such that and . The sequence is a run of from with initial state . To establish Ksound, we need to show that for all subformulae of we have iff . This holds because iff .
We now prove that satisfies Kcomp. Let be a vertex of not equal to the root and let be a run of from . We need to show that there exists a pair in such that and, for all subformulae of , iff . By definition of a run from , there exists with . By construction of , there exists a point of such that . Clearly, the sequence is a run of . Thus, we have that is in . By definition we have iff . As shown above, the latter holds just when . But this last condition is equivalent to , since . This shows that is the required pair .
In the next section, we show how this result can be used to yield an automata-theoretic procedure for constructing a realization of a specification.
4 An Algorithm for Realizability
We first recall the definitions of the two types of automata we require. Section 4.1 deals with automata on infinite words, and Section 4.2 deals with alternating automata on infinite trees. We apply these to our realizability problem in Section 4.3.
4.1 Automata on Infinite Words
For an introduction to the theory of automata on infinite words and trees see [48].
The types of finite automata on infinite words we consider are those defined by Büchi [7]. A (nondeterministic) automaton on words is a tuple $A = \langle \Sigma, S, S_0, \rho, F \rangle$, where $\Sigma$ is a finite alphabet, $S$ is a finite set of states, $S_0 \subseteq S$ is a set of starting states, $\rho : S \times \Sigma \rightarrow 2^S$ is a (nondeterministic) transition function, and $F$ is an acceptance condition. A Büchi acceptance condition is a set $F \subseteq S$.
A run of $A$ over an infinite word $w = a_0 a_1 \ldots$ is a sequence $r = s_0 s_1 \ldots$, where $s_0 \in S_0$ and $s_{i+1} \in \rho(s_i, a_i)$, for all $i \geq 0$. Let $\lim(r)$ denote the set of states in $S$ that appear in $r$ infinitely often. The run $r$ satisfies a Büchi condition $F$ if there is some state in $F$ that repeats infinitely often in $r$, i.e., $F \cap \lim(r) \neq \emptyset$. The run $r$ is accepting if it satisfies the acceptance condition, and the infinite word $w$ is accepted by $A$ if there is an accepting run of $A$ over $w$. The set of infinite words accepted by $A$ is denoted $L(A)$.
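Büchi acceptance can be tested directly on ultimately periodic words, that is, a finite prefix followed by a nonempty finite block repeated forever. The sketch below is an illustration only: the function name, the dictionary encoding of the transition function, and the example automaton are assumptions. It searches the finite graph of pairs (state, position in the repeated block) for a reachable accepting state that lies on a cycle, which is equivalent to the existence of a run visiting an accepting state infinitely often.

```python
def accepts_lasso(initial, delta, accepting, prefix, cycle):
    """Does the Buchi automaton accept the word prefix followed by cycle repeated forever?

    delta maps (state, letter) to an iterable of successor states; cycle must be nonempty.
    """
    if len(cycle) == 0:
        raise ValueError("cycle must be nonempty")

    # States reachable after reading the finite prefix.
    current = set(initial)
    for letter in prefix:
        current = {t for s in current for t in delta.get((s, letter), ())}

    n = len(cycle)

    def succ(node):
        state, pos = node
        return {(t, (pos + 1) % n) for t in delta.get((state, cycle[pos]), ())}

    def reachable(starts):
        seen, stack = set(), list(starts)
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(succ(node))
        return seen

    # Accept iff some reachable accepting node can return to itself in one or more steps.
    for node in reachable({(s, 0) for s in current}):
        if node[0] in accepting and node in reachable(succ(node)):
            return True
    return False


# Hypothetical automaton over {a, b} accepting the words with infinitely many a's.
delta = {
    ("q0", "a"): {"q1"}, ("q0", "b"): {"q0"},
    ("q1", "a"): {"q1"}, ("q1", "b"): {"q0"},
}
```

With this automaton, initial state `q0` and accepting set `{"q1"}`, the word `b(ab)^ω` is accepted and the word `a b^ω` is rejected.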
The following proposition establishes the correspondence between temporal formulae and Büchi automata.
Proposition 1
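[49] Given a temporal formula $\varphi$ over a set $Prop$ of propositions, one can build a Büchi automaton $A_\varphi = \langle 2^{Prop}, S, S_0, \rho, F \rangle$, where $|S| \leq 2^{O(|\varphi|)}$, such that $L(A_\varphi)$ is exactly the set of computations satisfying the formula $\varphi$.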
4.2 Alternating Automata on Infinite Trees
Alternating tree automata generalize nondeterministic tree automata and were first introduced in [37]. They have recently found usage in computer-aided verification [5, 50, 52]. An alternating tree automaton $\mathcal{A} = \langle \Sigma, Q, q_0, \delta, \alpha \rangle$ runs on $\Sigma$-labelled $\Upsilon$-trees (i.e., mappings from $\Upsilon^*$ to $\Sigma$). It consists of a finite set $Q$ of states, an initial state $q_0 \in Q$, a transition function $\delta$, and an acceptance condition $\alpha$ (a condition that defines a subset of $Q^\omega$).
For a set $D$, let $\mathcal{B}^+(D)$ be the set of positive Boolean formulae over $D$; i.e., Boolean formulae built from elements in $D$ using $\wedge$ and $\vee$, where we also allow the formulae true and false. For a set $C \subseteq D$ and a formula $\theta \in \mathcal{B}^+(D)$, we say that $C$ satisfies $\theta$ iff assigning true to elements in $C$ and assigning false to elements in $D \setminus C$ makes $\theta$ true.
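Satisfaction of a positive Boolean formula by a set of elements can be sketched with a small recursive evaluator. The class names `Atom`, `And`, `Or`, the function `satisfies`, and the example formula over (direction, state) pairs are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Hashable, Union


@dataclass(frozen=True)
class Atom:
    name: Hashable          # an element of the underlying set


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


Formula = Union[bool, Atom, And, Or]


def satisfies(chosen, formula) -> bool:
    """True iff the formula holds when exactly the elements of `chosen` are assigned true."""
    if isinstance(formula, bool):
        return formula
    if isinstance(formula, Atom):
        return formula.name in chosen
    if isinstance(formula, And):
        return satisfies(chosen, formula.left) and satisfies(chosen, formula.right)
    if isinstance(formula, Or):
        return satisfies(chosen, formula.left) or satisfies(chosen, formula.right)
    raise TypeError(f"not a positive Boolean formula: {formula!r}")


# Illustrative formula over pairs (direction, state).
theta = And(
    Or(Atom((0, "q1")), Atom((0, "q2"))),
    Or(Atom((0, "q3")), Atom((1, "q2"))),
)
```

For instance, the set `{(0, "q2"), (1, "q2")}` satisfies `theta`, whereas `{(0, "q1")}` does not, since the second conjunct is left unsatisfied. Because the formulae contain no negation, satisfaction is monotone: any superset of a satisfying set also satisfies the formula.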
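The transition function $\delta : Q \times \Sigma \rightarrow \mathcal{B}^+(\Upsilon \times Q)$ maps a state and an input letter to a formula that suggests a new configuration for the automaton. A run of an alternating automaton $\mathcal{A}$ on an input $\Sigma$-labelled $\Upsilon$-tree $\langle T, V \rangle$ is a tree $\langle T_r, r \rangle$ in which the root is labelled by $q_0$ and every other node is labelled by an element of $\Upsilon^* \times Q$. Here $T_r$ is a prefix-closed subset of $\mathbb{N}^*$ and $r$ is the labeling function. Each node of $T_r$ corresponds to a node of $T$. A node in $T_r$, labelled by $(x, q)$, describes a copy of the automaton that reads the node $x$ of $T$ and visits the state $q$. Formally, $\langle T_r, r \rangle$ is a $\Sigma_r$-labeled tree where $\Sigma_r = \Upsilon^* \times Q$ and $\langle T_r, r \rangle$ satisfies the following:

$\varepsilon \in T_r$ and $r(\varepsilon) = (\varepsilon, q_0)$.

Let $y \in T_r$ with $r(y) = (x, q)$ and $\delta(q, V(x)) = \theta$. Then there is a (possibly empty) set $S = \{(c_0, q_0), \ldots, (c_n, q_n)\} \subseteq \Upsilon \times Q$, such that the following hold:

$S$ satisfies $\theta$, and

for all $0 \leq i \leq n$, we have $y \cdot i \in T_r$ and $r(y \cdot i) = (x \cdot c_i, q_i)$.

For example, if $\langle T, V \rangle$ is a $\{0, 1\}$-tree with $V(\varepsilon) = a$ and $\delta(q_0, a) = ((0, q_1) \vee (0, q_2)) \wedge ((0, q_3) \vee (1, q_2))$, then the nodes of $\langle T_r, r \rangle$ at level $1$ include the label $(0, q_1)$ or $(0, q_2)$, and include the label $(0, q_3)$ or $(1, q_2)$.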
Each infinite path in $\langle T_r, r \rangle$ is labelled by a word in $Q^\omega$. A run is accepting iff all its infinite paths satisfy the acceptance condition. Let