XPL: An extended probabilistic logic for probabilistic transition systems

This work was partially supported by NSF grant IIS-1447549.

Abstract

Generalized Probabilistic Logic (GPL) is a temporal logic, based on the modal mu-calculus, for specifying properties of reactive probabilistic systems. We explore XPL, an extension of GPL that captures the nondeterminism present in Markov decision processes (MDPs). XPL is expressive enough that a number of independently studied problems, such as termination of Recursive MDPs (RMDPs), PCTL* model checking of MDPs, and reachability for Branching MDPs, can all be cast as model checking over XPL. Termination of multi-exit RMDPs is undecidable; thus, model checking in XPL is undecidable in general. We define a subclass, called separable XPL, for which model checking is decidable. Decidable problems such as termination of 1-exit RMDPs, PCTL* model checking of MDPs, and reachability for Branching MDPs can be reduced to model checking separable XPL. Thus, XPL forms a uniform framework for studying problems involving systems with non-deterministic and probabilistic behaviors, while separable XPL provides a way to solve decidable fragments of these problems.

1 Introduction

For finite-state systems, model checking a temporal property can be cast in terms of model checking in the modal μ-calculus, the so-called “assembly language” of temporal logics. A number of temporal logics have been proposed and used for specifying properties of finite-state probabilistic systems. Two of the notable logics for probabilistic systems based on the μ-calculus are GPL [6] and pLμ [22].

GPL is defined over Reactive Probabilistic Labeled Transition Systems (RPLTSs). In an RPLTS, each state has a set of outgoing transitions with distinct labels; each transition, in turn, specifies a (probabilistic) distribution of target states. The branching-time probabilistic logic GPL is expressive enough to serve as an “assembly language” of a large number of probabilistic temporal logics. For instance, model checking PCTL* properties over Markov Chains, as well as termination and reachability of Recursive Markov Chains (RMCs) can be cast in terms of GPL model checking [6, 16].

In this paper, we propose an extension to GPL, which we call Extended Probabilistic Logic (XPL), to express properties of probabilistic systems with internal nondeterministic choice, under linear-time semantics. Syntactically, XPL is very close to GPL: whereas GPL has probabilistic quantifiers and over fuzzy formulae , XPL admits quantifiers and as well. XPL’s semantics, however, is given with respect to maximizing schedulers that resolve internal non-deterministic choices. Properties involving minimizing schedulers can be analyzed by considering their duals (with respect to negation) over maximizing schedulers. The semantics of XPL is defined over Probabilistic Labeled Transition Systems (PLTSs). In a PLTS, each state has a set of outgoing transitions, possibly with common labels; and each transition specifies a distribution of target states. PLTSs, as interpreted with XPL, thus exhibit probabilistic choice and, under both linear- and branching-time semantics, nondeterministic choice.

Contributions and Significance: XPL is expressive enough that a wide variety of independently-studied verification problems can be cast as model checking PLTSs with XPL. In fact, undecidable problems such as termination of multi-exit Recursive Markov Decision Processes (Recursive MDPs or RMDPs) can be reduced in linear time to model checking PLTSs with XPL. We introduce a syntactically-defined subclass, called separable XPL, for which model checking is decidable. We describe a procedure for model checking XPL that always terminates, either successfully with the model checking result or with failure, and that always terminates successfully for separable XPL (see Sect. 4).

A number of distinct model checking algorithms have been developed independently for decidable verification problems involving systems that have probabilistic and internal non-deterministic choice. Examples of such problems include PCTL* model checking of MDPs [2], reachability in branching MDPs [11], and termination of 1-exit RMDPs [13]. These problems can all be reduced, in linear time, to model checking separable XPL formulae over PLTSs (see Sect. 5). To the best of our knowledge, the idea that branching and recursive systems could be interpreted as having nondeterminism under the branching-time semantics, and the question of its compatibility with nondeterminism under the linear-time semantics, have not been recognized in the literature.

Termination of multi-exit RMDPs, cast as a model checking problem over XPL along the same lines as our treatment of 1-exit RMDPs, yields an XPL formula that is not separable. Thus separability can be seen as a characteristic of the verification problems that are known to be decidable, when cast in terms of model checking in XPL. Consequently, XPL in general, and separable XPL in particular, form a useful formalism to study the relationships between verification problems over systems involving probabilistic and both linear- and branching-time non-deterministic choice. We discuss these issues in greater detail in Sect. 6.

2 Preliminaries

In this section, we formally define PLTSs, which are used to define the semantics of XPL. We also summarize the syntax and semantics of GPL, using the notations from [6].

2.1 Probabilistic Labeled Transition Systems

We define a probabilistic labeled transition system (PLTS) as an extension of [6]’s RPLTS.

Definition 1 (PLTS).

With respect to fixed sets and of actions and propositions, respectively, a PLTS is a quadruple , where

  • is a countable set of states;

  • is the transition relation;

  • is the transition probability distribution satisfying:

    • , and

    • ;

  • is the interpretation, recording the set of propositions true at a state.

A reactive PLTS does not have internal nondeterminism, i.e., its transition probability distribution is a function of . This definition is in line with the most general definition of a PLTS [22, 26], in which, given an action, a probability distribution is chosen nondeterministically (we assume that there are finitely many nondeterministic choices). Other equally expressive models include alternating automata, in which labeled nondeterministic choices are followed by silent probabilistic choices. The differences between such models have been analyzed with respect to bisimulation [27].
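As a rough illustration of this definition, the following minimal sketch encodes a PLTS in Python. The encoding is an assumption made here for illustration and is not taken from the paper: each (state, action) pair maps to a list of distributions, more than one list entry models internal nondeterminism, and every distribution is assumed to sum to 1. The names `PLTS`, `check`, `is_reactive`, and `example` are hypothetical.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class PLTS:
    # Hypothetical encoding: (state, action) -> list of distributions,
    # where a distribution is a dict {target state: probability}.
    states: set
    transitions: dict
    labels: dict = field(default_factory=dict)  # state -> set of propositions

    def check(self):
        """Assumed well-formedness: known states, positive weights, total mass 1."""
        for (source, _action), distributions in self.transitions.items():
            assert source in self.states
            for dist in distributions:
                assert set(dist) <= self.states
                assert all(p > 0 for p in dist.values())
                assert sum(dist.values()) == 1

    def is_reactive(self):
        """Reactive: no (state, action) pair offers more than one distribution."""
        return all(len(d) <= 1 for d in self.transitions.values())


half = Fraction(1, 2)
example = PLTS(
    states={"s0", "s1", "s2"},
    transitions={
        # Two distributions for action "a" at s0: internal nondeterminism.
        ("s0", "a"): [{"s1": half, "s2": half}, {"s2": Fraction(1)}],
        ("s1", "b"): [{"s1": Fraction(1)}],
    },
    labels={"s2": {"done"}},
)
example.check()
```

In this encoding, `example.is_reactive()` is false because state `s0` offers two distributions for the same action; dropping either one would give a reactive system.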

Given , a partial computation is a sequence , where for all , . Also, and denote, respectively, the first and last states in . Each transition of a partial computation is labeled with an action . The set of all partial computations of is denoted by , and . Composition of partial computations, , represents if . A partial computation is a prefix of if for some .

From a set of partial computations, we can build deterministic trees (d-trees). We often denote a d-tree by the set of paths in the tree. Every d-tree is prefix-closed and deterministic. is prefix-closed if, for every and a prefix of , . is deterministic if for every with and , either or , i.e., if a pair of computations share a prefix, the first difference cannot involve transitions labeled by the same action. A d-tree has a starting state, denoted ; if then . We also let .
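The two d-tree conditions can be checked mechanically on a finite set of partial computations. The sketch below is an illustration under stated assumptions, not the paper's formalization: a partial computation is encoded as a tuple `(s0, a1, s1, ..., an, sn)`, and the function names are hypothetical.

```python
def prefixes(path):
    """All proper prefixes of a partial computation (s0, a1, s1, ..., an, sn)."""
    return [path[:k] for k in range(1, len(path), 2)]


def is_prefix_closed(tree):
    """Every prefix of every computation in the set is also in the set."""
    return all(p in tree for path in tree for p in prefixes(path))


def is_deterministic(tree):
    """After a shared prefix, one action may lead to only one target state."""
    target = {}
    for path in tree:
        for k in range(1, len(path), 2):
            key = (path[:k], path[k])  # (prefix, action)
            if target.setdefault(key, path[k + 1]) != path[k + 1]:
                return False
    return True


def is_d_tree(tree):
    """Non-empty, a single starting state, prefix-closed, and deterministic."""
    return (
        bool(tree)
        and len({path[0] for path in tree}) == 1
        and is_prefix_closed(tree)
        and is_deterministic(tree)
    )


tree = {("s0",), ("s0", "a", "s1"), ("s0", "b", "s2")}
```

Here `tree` passes the check, since the two branches from `s0` use different actions. Adding `("s0", "a", "s2")` would violate determinism, because the action `a` would then lead to two different states after the same prefix.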

refers to all the d-trees of , and . is a prefix of if . means . is finite if , and maximal if there exists no d-tree with . and are analogous to and , but for maximal d-trees. An outcome is a maximal d-tree.

Figure 1: Example PLTS and selected outcomes. (a) An example PLTS. (b) Example outcomes.

An example PLTS and two of its outcomes are shown in Fig. 1. In the figure, transitions are usually annotated with their action label and probability; the probability is omitted when it is . Note that there are two transitions labeled from state reflecting internal nondeterminism. If we label the transition from to only with (omitting ) and that from to only with (omitting ), we get an RPLTS with only probabilistic and external choices.

Note that, with d-trees, we have the distinction between linear- and branching-time semantics for the nondeterministic choices which are internal and external, respectively. Since a d-tree is defined to be deterministic, all of the internal choices (both probabilistic and nondeterministic) are resolved, but the external choices remain. Meanwhile, a property of a PLTS will hold for some subset of its maximal d-trees. In order to give the property a probability, we need a measure of this set. This is straightforward for an RPLTS, as all internal choices are probabilistic; but we will need to do more for PLTSs with internal nondeterministic choice.

Thus, the subsequent concepts apply only to RPLTSs, and we will extend them to PLTSs in Sect. 3. A finite RPLTS d-tree has finite measure, which can be computed from the values of the probabilistic choices in the tree, i.e., its edges. An infinite d-tree will typically have zero measure, but an infinite set of these may have positive measure. For an infinite d-tree we therefore consider, intuitively, the probability of some finite prefix, which again is the product of the probabilities of all of its edges. Formally, a basic cylindrical subset of contains all trees sharing a given prefix. Letting , and to be finite, . The measure of is:

(1)

From here, a probability measure on the smallest field of sets is generated from subsets with  [6, Definition 8].
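The prose description of the measure of a basic cylindrical subset, the product of the probabilities of all edges of the finite prefix tree, can be sketched as follows. This follows the wording of the text rather than a formula reproduced here; the tuple encoding of paths, the dictionary `prob`, and the function name are assumptions for illustration.

```python
from fractions import Fraction


def cylinder_measure(prefix_tree, prob):
    """Multiply the probabilities of all edges of a finite d-tree.

    prefix_tree: prefix-closed set of paths (s0, a1, s1, ..., an, sn);
    prob: dict mapping an edge (s, a, s') to its probability in the RPLTS.
    """
    measure = Fraction(1)
    for path in prefix_tree:
        if len(path) >= 3:
            # In a prefix-closed set, each non-trivial path ends in exactly
            # one edge of the tree, so every edge is counted once.
            measure *= prob[path[-3:]]
    return measure


prob = {
    ("s0", "a", "s1"): Fraction(1, 2),
    ("s0", "b", "s2"): Fraction(1),
    ("s1", "c", "s1"): Fraction(1, 3),
}
tree = {
    ("s0",),
    ("s0", "a", "s1"),
    ("s0", "b", "s2"),
    ("s0", "a", "s1", "c", "s1"),
}
```

For this tree the three edges contribute 1/2, 1, and 1/3. A tree consisting only of the root has no edges and therefore measure 1.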

2.2 GPL Syntax

GPL has two different kinds of formulae. State formulae depend directly only on the given state. Fuzzy formulae depend on outcomes. We give the syntax of GPL, with , , , and , for state formulae, , and fuzzy formulae, , as:

Note that only atomic propositions may be negated, but every operator has its dual given in the syntax. The propositional connectives, and , can be used on both state and fuzzy formulae. Operators and are least and greatest fixed point operators for the “equation” . Additionally, fuzzy formulae must be alternation-free, which prohibits a kind of mixing of least and greatest fixed points, and a formula used to construct state formulae and may not have any free variables. These operators check the probability for a fuzzy formula ( and are duals). The semantics of GPL is given in terms of RPLTS d-trees. In that interpretation, diamond implies box: means that there is an -transition and it satisfies ; means that if there is an -transition, it satisfies . We also use a set for the modalities, reading as and as . When we write “” for , that represents .

, where is a closed formula,
,
,
,
,
,
, where and ,
, where .
Table 1: GPL/XPL semantics: fuzzy formulae

2.3 GPL Semantics

iff ,
iff ,
iff and ,
iff or ,
iff ,
iff .
Table 2: GPL semantics: state formulae

We define the semantics of GPL with respect to a fixed RPLTS , where and are the sets of all state and fuzzy formulae, respectively. A function , augmented with an extra environment parameter , returns the set of outcomes satisfying a given fuzzy formula, defined inductively in Table 1.

For a given , . The relation indicates when a state satisfies a state formula, and it is defined inductively in Table 2. Note that the definitions for and are mutually recursive.

There are two properties of GPL fuzzy formulae that are important for the completeness of the GPL model checking algorithm. First, we have distributivity on box and diamond [6, Lemma 1]:

Lemma 2 (Distributivity on modal operators).

Letting :

(2)

Second, we can relate the probability of a conjunction with that of a disjunction and compute the effect of taking a step [6, Lemma ]:

(3)
(4)

Additionally, although there is no negation operator in the syntax, we can write the negation of a fuzzy formula , , and of a state formula , , such that, for any RPLTS and state ([6, Lemma ]):

The proof involves switching all the operators to their duals.

3 XPL

To resolve the nondeterministic transitions in a PLTS, we additionally require a scheduler. Recall, from Sect. 2.1, that is the set of all partial computations of .

Definition 3 (Scheduler).

A scheduler for a PLTS is a function , such that if an action is present at , then implies that .

Note that we have defined deterministic schedulers, which are also aware of their relevant histories. Given a scheduler for a PLTS , we have a (countable) RPLTS , where and so . We define a probability distribution:

Definition 4 (Combined probability).

The probability distribution of a PLTS with scheduler is a function, , where:

(5)

We also let when .

Recall, from Sect. 2.1, that the basic cylindrical subset contains all maximal d-trees sharing the prefix tree . For these subsets, we define the probability measure:

Definition 5 (Probability measure).

For a PLTS with scheduler , the probability measure of a basic cylindrical subset is defined by a partial function , where:

(6)

Since may be considered as defined for an RPLTS, we can extend it to a measure as in Sect. 2.1.

3.1 XPL Syntax

Now we give the XPL syntax, with :

The fuzzy formulae remain the same as in GPL. assumes maximizing schedulers, i.e., we compare against the supremum probabilities over all schedulers. Note that is no longer the dual of , which is why we allow the “less than” comparisons, as well; moreover, analyzing a fuzzy formula over minimizing schedulers is essentially equivalent to considering over maximizing schedulers.

3.2 XPL Semantics

iff ,
iff ,
iff and ,
iff or ,
iff ,
Table 3: XPL semantics: state formulae

The semantics of XPL changes from GPL only due to the measure of the PLTS outcomes. In particular, we retain the same semantics on diamond and box. The semantics is defined with respect to a fixed PLTS . The function remains the same, while differs for the probabilistic operators.

Definition 6 (XPL semantics).

The semantics for the state formulae is given in Table 3. For the fuzzy formulae, the semantics are as in Table 1.

Note the use of and in Table 3. We refer to the value as a probabilistic value and write it as ([7] calls this a capacity). Unlike in GPL, we may not always be able to compute it with a model checking algorithm.

3.3 Separability of Fuzzy Formulae

With internal nondeterminism, we lose the general relation between conjunctions and disjunctions, as in (3). However, since we are maximizing (or minimizing) over schedulers, we would want the relation in (7).

(7)

This requires that the optimal strategy be the same for , , , and ; in general, these may all be distinct. Instead, we will seek to delay the application of all conjunctions and disjunctions until the two sides are independent, primarily through repeated application of Lemma 2, which holds for XPL as well because it deals with sets of d-trees, but not their measure. For example, we can rewrite as . We generalize this to a syntactic notion of separability, defined below. It will be useful to view a fuzzy formula as a kind of an and-or tree.

Definition 7 (And-or tree).

The and-or tree of a fuzzy formula , is a node labeled by , where , with children and when , and a leaf otherwise.

We can flatten this tree with the straightforward flattening operator, where, e.g., the tree may be flattened to . Note that flattened trees have alternating and-nodes and or-nodes. A (conjunctive) set of formulae corresponds to a flattened and-or tree with the root node labeled by and having the elements of as leaves. We will assume refers to the flattened tree.
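A minimal sketch of such a flattening operator is given below. The tree encoding is an assumption made for illustration: an inner node is a pair `("and" | "or", [children])` and anything else is treated as a leaf.

```python
def flatten(tree):
    """Merge nested nodes that carry the same connective as their parent."""
    if not isinstance(tree, tuple):
        return tree  # leaf
    op, children = tree
    flat = []
    for child in map(flatten, children):
        if isinstance(child, tuple) and child[0] == op:
            flat.extend(child[1])  # absorb a same-connective child
        else:
            flat.append(child)
    return (op, flat)


nested = ("and", ["p", ("and", ["q", ("or", ["r", ("or", ["s", "t"])])])])
flat = flatten(nested)
```

After flattening, no node has a child with its own connective, which is the alternation property noted above.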

A subformula of of the form or is called a modal subformula of . We say that is an unguarded subformula of if it is a leaf in . The GPL model checking algorithm requires bound variables to be guarded by actions (i.e., is fine, but is not) [6], and we adopt this requirement as well.

Definition 8 (Formula Transformations).
  • The fixed-point expansion of , denoted by , is a formula obtained by expanding any unguarded subformula of the form to where .

  • We say that a formula is non-probabilistic if it is a state formula, or of the form and for and . The purely probabilistic abstraction of a fuzzy formula , denoted by , is a formula obtained by removing unguarded non-probabilistic subformulae (i.e., , where is non-probabilistic, becomes , etc.).

  • A grouping of a formula , denoted by , groups modalities in a formula using distributivity. Formally, GRP maps to a that is equivalent to based on the equivalences in Lemma 2, applied left-to-right as much as possible on the top level.

At a high level, a necessary condition of separability is that the actions guarding distinct conjuncts and disjuncts of a formula are distinct as well.

Definition 9 (Action set).

The action set of a formula , denoted by is the set of actions appearing at unguarded modal subformulae of :

  • ;

  • ;

  • ;

  • .

We can now define separability based on action sets of formulae as follows.

Definition 10 (Separability).

The set of all separable formulae is the largest set such that , if , then

  1. every subformula of is in , and

  2. if where , then .

A formula is separable if .
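The following sketch illustrates the idea behind Definitions 9 and 10 on a simplified formula representation. It is an approximation under stated assumptions, not the paper's definition: formulae are encoded as `("dia", a, sub)`, `("box", a, sub)`, `("and", [...])`, `("or", [...])`, or a leaf; the action set is taken to be the set of actions of modalities that are not nested under another modality; and the check only tests pairwise disjointness at the top-level node. The transformations of Definition 8 (fixed-point expansion, abstraction, grouping) and the recursion over all subformulae are not implemented.

```python
from itertools import combinations


def action_set(f):
    """Actions of the unguarded modal subformulae of f (assumed reading)."""
    if isinstance(f, tuple):
        if f[0] in ("dia", "box"):
            return {f[1]}  # do not look below the modality
        if f[0] in ("and", "or"):
            return set().union(*map(action_set, f[1]))
    return set()  # propositions, variables, constants


def children_disjoint(f):
    """True if the children of a top-level and/or node have disjoint action sets."""
    if not (isinstance(f, tuple) and f[0] in ("and", "or")):
        return True
    sets = [action_set(child) for child in f[1]]
    return all(x.isdisjoint(y) for x, y in combinations(sets, 2))


# Conjuncts guarded by distinct actions.
ok = ("and", [("dia", "a", "p"), ("or", [("box", "b", "q"), ("dia", "c", "r")])])
# A DNF-style formula whose disjuncts both use actions a and b.
dnf = ("or", [
    ("and", [("dia", "a", "p"), ("dia", "b", "q")]),
    ("and", [("dia", "a", "r"), ("dia", "b", "s")]),
])
```

The formula `ok` passes the top-level check, while `dnf` fails it because the action sets of its two disjuncts overlap.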

Below we illustrate separability of formulae. Let - be all separable and distinct, and also let and be separable.

Note that GRP uses only distributivity of the modal operators over “” and “”, and not the distributivity of the boolean operators themselves. Consequently, a separable formula may be equivalent to a non-separable formula.

Example 11 (Separable formula with equivalent non-separable formula).

The formula is separable.

(8)

The DNF version of , , is not separable since action sets of disjuncts overlap.

(9)

This is important because we need the subformulae of a separable formula to also be separable.

Example 12 (Non-separable formula).

The formula is a subformula of (9), is not separable, and has no equivalent separable formula:

(10)

With , we need to satisfy or following an action, and likewise for or following a action. An equivalent separable formula would thus have to include and , but this would also be satisfied by, e.g., outcomes satisfying only .

We say that a formula is entangled at a state if it is not (equivalent to) a separable formula even after considering that state’s specific characteristics. For instance, is entangled only at states with both and actions present. Even when considering only states where the actions relevant to entanglement are present, a formula may be entangled at some states and not at others.

Example 13 (Entanglement on and depends on ).

The formula reduces to (8) at states that have a -transition, and to (10) otherwise.

(11)

There are also non-separable formulae that nonetheless would not be entangled at any state of an arbitrary PLTS.

Example 14 (Never-entangled non-separable formula).

For the formula , , but at any state it is equivalent either to or to .

(12)

Since GRP combines modal subformulae with a common action, we have the following important consequence.

Remark.

All conjunctive formulae and disjunctive formulae are separable.

4 Model Checking XPL Formulae

We outline a model checking procedure for XPL formulae for a fixed PLTS , along similar lines to the GPL model checking algorithm in [6, Sect. ]. The model checking procedure succeeds whenever the given formula is separable.

Definition 15 (Fisher-Ladner closure).

Given a formula , its Fisher-Ladner closure, , is the smallest set such that the following hold:

  • .

  • If , then:

    • if or , then ;

    • if or for some , then ;

    • if , then , with either or .

Also, we let represent the set of and-or trees with elements of a set as leaves. The core of the model checking algorithm is the construction of a dependency graph , to compute , such that all the formulae appearing in the graph will be in the set . When constructing a dependency graph, in order to divide a formula by actions, we transform it into a factored form, in a similar manner to checking separability. If we are unable to transform a formula into a factored form, as can happen when a formula is non-separable, the graph construction terminates with failure.

Definition 16 (Factored form).

A factored formula can be trivial, when . Otherwise, every leaf of is in the action form, , and no action may guard more than one leaf.

Given a state , a formula can be transformed into a semantically equivalent one that is in factored form2 as: . partially evaluates , by evaluating unguarded non-probabilistic subformulae of as well as all unguarded modal subformulae with actions absent at state , yielding or for each, and simplifying the result.3 Then .

Definition 17 (Dependency graph).

The dependency graph for model checking a formula with respect to a state in PLTS , denoted by , is a directed graph , where node set , and edge set ; i.e., the edges are labeled from . The sets and are the smallest such that:

  • .

  • If , is not in factored form: if equivalent in factored form exists, then and .

  • If , then for . Moreover, for , and .

  • If , then for each such that . Moreover, .

If and has no factored form, then the dependency graph construction fails.

When we transform to the factored form , the semantics does not change, i.e., . For the factored formulae, standard XPL semantics applies (Table 1). Note that we can assume action nodes to be of the form , as the action must then be present at state . From this semantics, we also get the relationships for the probabilistic values. Here, is the standard product operator, while .

Lemma 18 (Probabilistic values).

Fix . The probabilistic value for a node is as follows:

  • and .

  • If is an and-node, then:
    .

  • If is an or-node, then:
    .

  • If is an action node, i.e., , then:

  • The remaining nodes have a unique successor with .
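The node-wise computations of Lemma 18 can be sketched as small Python functions. Two points are assumptions of this sketch rather than statements recovered from the text: the coproduct is taken to be the usual 1 minus the product of complements, and the action node is taken to be the maximum, over the nondeterministic choices for the action, of the expected value of the successor. All function and parameter names are hypothetical.

```python
from fractions import Fraction
from math import prod


def and_value(values):
    """And-node: product of the children's values."""
    return prod(values)


def or_value(values):
    """Or-node: coproduct of the children's values, 1 - prod(1 - v)."""
    return 1 - prod(1 - v for v in values)


def action_value(distributions, value_of):
    """Action node: best expected successor value over the available choices.

    distributions: list of dicts {target state: probability};
    value_of: function returning the value of the continuation at a target.
    """
    return max(
        sum(p * value_of(target) for target, p in dist.items())
        for dist in distributions
    )


values = {"s1": Fraction(1), "s2": Fraction(1, 4)}
choices = [
    {"s1": Fraction(1, 2), "s2": Fraction(1, 2)},  # expected value 5/8
    {"s2": Fraction(1)},                           # expected value 1/4
]
best = action_value(choices, values.get)
```

The product and coproduct treat the children as independent events, which is the role separability plays in the lemma.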

Proof.

Most of the cases are straightforward and similar to the GPL model checking algorithm [6, Lemma 8] and a result for two-player stochastic parity games [22, Theorem 4.22]. The and-node and or-node cases have the product and coproduct, respectively, due to independence. We explain the action node case in more detail.

The sum over the probabilistic distribution is as in GPL and (4); we explain the nondeterministic choice. A PLTS scheduler makes a choice for an action given the partial computation . Here, this choice is made based on a formula, , to be satisfied. When the initial formula is separable, this is well-defined: given , , and , the scheduler can deduce from by traversing the dependency graph. ∎

We note that, although a particular choice may maximize , a scheduler that makes this choice every time is not necessarily optimal. Indeed, no optimal scheduler may exist, in which case we would only have ε-optimal schedulers for any ε > 0 [11, 22]. The probabilistic value may be predicated on making a different choice eventually. The formulation in Lemma 18 is consistent with this possibility, and the existence of (ε-)optimal schedulers may be justified through a common method, called strategy improvement or strategy stealing [13, 22]. The intuition is that, in case of a loop, we can add a choice to succeed immediately with the maximum probability for the state. This cannot increase the probability, and the maximizing scheduler can otherwise be the same, if this choice does not arise.

Theorem 19 (Model checking termination).

The graph construction of terminates for any XPL formula and PLTS . Moreover, if is separable, the XPL model checking algorithm will complete the construction of the dependency graph.

Proof.

is finite, so (for DNF versions used for equivalence checking) is finite. The number of actions in and is finite, so the number of factored formulae is finite. This is sufficient to guarantee termination, as we fail when we cannot construct a factored formula. Meanwhile, separability of implies that we can construct a factored formula from any . ∎

Our primary contribution is the completed dependency graph for a separable formula . For model checking separable XPL formulae, we show how, given the graph, to compare the probabilistic value of at a state against a threshold . We do this by first constructing a system of polynomial max fixed point equations from the graph. Each node in the dependency graph is associated with a real-valued variable . Given a set of variables , each equation in the system is of the form where is

  • a polynomial over such that the sum of coefficients is ; or

  • of the form where .

Furthermore, the equations form a stratified system, where each variable can be assigned a stratum such that is defined in terms of only variables of the form such that (cf. [21, Def. 9]); and variables in the same stratum fall under the same fixed point.

Theorem 20.

Given a real value , a system of polynomial max fixed point equations and a distinguished variable defined in the system, whether or not in its solution is decidable.

Proof.

We write the max polynomial system, , as a sentence in the first-order theory of real closed fields, similar to [21]. The additional comparison will be . Along with the equation system, we need to encode fixed points and .

We can encode as (13) (cf. [13, Section 5]):

(13)

Meanwhile, letting be the set of all variables and a subset belonging to some stratum with least fixed point, we can encode the fixed point itself as (14):

(14)

The stratification of fixed points in the equation system precludes a cyclical dependency between a least and a greatest fixed point; a greatest fixed point can be encoded similarly.

The original fixed point equation system, along with the query , (13)-(14), and the counterpart encoding greatest fixed point, are sentences in a first order theory of real closed fields, which is decidable [29]. Hence the decidability of in the solution to the fixed point equations follows. ∎

We use the above result to determine whether or not for a separable XPL formula . The polynomial fixed point system is derived similarly to [6, Section 4.1.2], with a variable for each node in the dependency graph , and equations based on Lemma 18.

  • If is not in factored form, then has a unique edge labeled by to a node , and .

  • and .

  • If is an and-node, then .

  • If is an or-node, then .

  • If is an action node and , then
    .
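To give a feel for such equation systems, the sketch below numerically approximates the least solution of a tiny polynomial max system by iterating from zero. This is only an illustration and a different technique from the decision procedure of Theorem 20, which encodes the system in the first-order theory of real closed fields. The sketch handles a single least-fixed-point stratum, uses floating point, and the example system and names are hypothetical.

```python
def kleene_iterate(equations, rounds=500):
    """Iterate x <- f(x) from the all-zero valuation.

    equations: dict mapping a variable name to a function of the current
    valuation (a dict from variable names to floats). For monotone equations
    the iterates increase towards the least fixed point.
    """
    values = {name: 0.0 for name in equations}
    for _ in range(rounds):
        values = {name: f(values) for name, f in equations.items()}
    return values


system = {
    "x": lambda v: 0.6 * v["x"] ** 2 + 0.4,  # polynomial equation
    "y": lambda v: max(v["x"], 0.5),         # max equation
}
solution = kleene_iterate(system)
```

The equation for `x` has the two fixed points 2/3 and 1. Starting from zero, the iteration approaches the smaller one, so an approximation alone cannot settle a threshold comparison exactly.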

Theorem 21 (Correctness).

The construction of the dependency graph , when is separable, yields a polynomial max fixed point equation system, such that the value of in its solution is .

Proof.

The correctness result follows from Lemma 18 and the semantics of fixed points given by Equation 14 (and its counterpart). ∎

Consequently, we have:

Corollary 22 (Decidability).

Given a state formula with separable subformulae, a PLTS and a state in , whether or not is decidable.
