# Weak bisimulations for labelled transition systems weighted over semirings

## Abstract

Weighted labelled transition systems (WLTSs) are LTSs whose transitions are given weights drawn from a commutative monoid. WLTSs subsume a wide range of LTSs, providing a general notion of strong (weighted) bisimulation. In this paper we extend this framework towards other behavioural equivalences, by considering semirings of weights. Taking advantage of this extra structure, we introduce a general notion of weak weighted bisimulation. We show that weak weighted bisimulation coincides with the usual weak bisimulations in the cases of non-deterministic and fully-probabilistic systems; moreover, it naturally provides a definition of weak bisimulation also for kinds of LTSs where this notion is currently missing (such as stochastic systems). Finally, we provide a categorical account of the coalgebraic construction of weak weighted bisimulation; this construction points out how to port our approach to other equivalences based on different notions of observability.

## 1 Introduction

Many extensions of labelled transition systems have been proposed for dealing with quantitative notions such as execution times, transition probabilities and stochastic rates; see e.g. [8, 7, 19, 20, 25, 34] among others. This ever-increasing plethora of variants has naturally pointed out the need for general mathematical frameworks, covering uniformly a wide range of cases, and offering general results and tools. As examples of these theories we mention ULTraSs [7] and weighted labelled transition systems (WLTSs) [39, 24, 26]. In particular, in a WLTS every transition is associated with a weight drawn from a commutative monoid $W$; the monoid structure defines how weights of alternative transitions combine. As we will recall in Section 2, by suitably choosing this monoid we can recover ordinary non-deterministic LTSs, probabilistic transition systems, and stochastic transition systems, among others. WLTSs offer a notion of (strong) $W$-weighted bisimulation, which can be readily instantiated to particular cases obtaining precisely the well-known Milner’s strong bisimulation [31], Larsen-Skou’s strong probabilistic bisimulation [27], strong stochastic bisimulation [20], etc.

However, in many situations strong bisimulations are too fine, and many coarser relations have been introduced since then. Basically, these observational equivalences do not distinguish systems differing only in unobservable or irrelevant transitions. Probably the most widely known of these observational equivalences is Milner’s weak bisimulation for non-deterministic LTSs [31] (but see [40, 41] for many variations). Weak bisimulations focus on systems’ interactions (communications, synchronizations, etc.), ignoring transitions associated with systems’ internal operations, hence called silent (and denoted by $\tau$).

Unfortunately, weak bisimulations become considerably more problematic in models for stochastic systems, probabilistic systems, etc. The conundrum is that we do not want to observe $\tau$-transitions, but at the same time their quantitative effects (delays, probability distributions) are still observable and hence cannot be ignored. In fact, for quantitative systems there is no general agreement on what a weak bisimulation should be. As an example, consider the stochastic system $S_1$ executing an action $a$ at rate $r$, and a system $S_2$ executing $\tau$ at rate $r_1$, followed by an $a$ at rate $r_2$: should these two systems be considered weakly bisimilar?

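[Figure: the system $S_1$ performs $a$ at rate $r$; the system $S_2$ performs $\tau$ at rate $r_1$ and then $a$ at rate $r_2$.]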

Some approaches restrict to instantaneous $\tau$-actions (and hence $r = r_2$) [6]; others require that the average times of $a$’s executions are the same in the two systems, but these can still be distinguished by looking at the variances [5]. Therefore, it is not surprising that many definitions proposed in the literature are rather ad hoc, and that a general mathematical theory is still missing.

This is the problem we aim to address in this paper. More precisely, in Section 3 we introduce a uniform notion of weak weighted bisimulation which applies to labelled transition systems weighted over a semiring. The multiplication operation of semirings allows us to compositionally extend weights to multi-step transitions and traces. In Section 4 we show that our notion of weak bisimulation coincides with the known ones in the cases of non-deterministic and fully probabilistic systems, just by changing the underlying semiring. Moreover, it naturally applies to stochastic systems, providing an effective notion of weak stochastic bisimulation. As a side result we introduce a new semiring of stochastic variables which generalizes that of rated transition systems [25].

Then, in Section 5 we present a general algorithm for computing weak weighted bisimulation equivalence classes, parametric in the underlying semiring. This algorithm is a variant of Kanellakis-Smolka’s algorithm for deciding strong non-deterministic bisimulation [22]. Our solution builds on the refinement technique used for the coarsest stable partition, but instead of “strong” transitions in the original system we consider “weakened” ones. We prove that this algorithm is correct, provided the semiring satisfies some mild conditions, i.e. it is $\omega$-complete. Finally, we also discuss its complexity, which is comparable with that of Kanellakis-Smolka’s algorithm. Thus, this algorithm can be used in the verification of many kinds of systems, just by replacing the underlying semiring (boolean, probabilistic, stochastic, tropical, arctic, …) and taking advantage of existing software packages for linear algebra over semirings.

In Section 6 we give a brief categorical account of weak weighted bisimulations. These will be characterized as cocongruences between suitably saturated systems, akin to the elegant construction of -elimination given in [35].

In Section 7 we give some final remarks and directions for further work.

## 2 Weighted labelled transition systems

In this section we recall the notion of labelled transition systems weighted over a commutative monoid, showing how these subsume non-deterministic, stochastic and probabilistic systems, among many others. Weighted LTSs were originally introduced by Klin in [24] as a continuation and generalization of the work on stochastic SOS presented in [25] with Sassone, and were further developed in [26].

In the following let $W$ denote a generic commutative (aka abelian) monoid $(W, +, 0)$, i.e. a set equipped with a distinguished element $0$ and a binary operation $+$ which is associative, commutative and has $0$ as left and right unit.

###### Definition 1 (W-Lts [24, Def. 1]).

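Given a commutative monoid $W$, a $W$-weighted labelled transition system is a triple $(X, A, \rho)$ where:

• $X$ is a set of states (processes);

• $A$ is an at most countable set of labels;

• $\rho : X \times A \times X \to W$ is a weight function, mapping each triple of $X \times A \times X$ to a weight.

$(X, A, \rho)$ is said to be image finite (resp. countable) iff for each $x \in X$ and $a \in A$, the set $\{y \in X \mid \rho(x, a, y) \neq 0\}$ is finite (resp. countable). A state $x$ is said to be terminal iff for every $a \in A$ and $y \in X$: $\rho(x, a, y) = 0$.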

For adherence to the notation used in [24] and to support the intuitions based on classical labelled transition systems we shall often write $\rho(x \xrightarrow{a} y)$ for $\rho(x, a, y)$; moreover, following a common notation for stochastic and probabilistic systems, we will also write $x \xrightarrow{a, w} y$ to denote $\rho(x, a, y) = w$.

The monoidal structure was not used in Definition 1 but for the existence of a distinguished element required by the image finiteness (resp. countability) property. The commutative monoidal structure of weights comes into play in the notion of bisimulation, where weights of transitions with the same labels have to be “summed”. This operation is commonplace for stochastic LTSs, but at first it may appear confusing with respect to the notion of bisimulation of non-deterministic LTSs; we will explain it in Section 2.1.

###### Definition 2 (Strong W-bisimulation [24, Def. 3]).

Given a $W$-LTS $(X, A, \rho)$, a (strong) $W$-bisimulation is an equivalence relation $R$ on $X$ such that for each pair $(x, x')$ of elements of $X$, $(x, x') \in R$ implies that for each label $a \in A$ and each equivalence class $C$ of $R$:

$$\sum_{y \in C} \rho(x \xrightarrow{a} y) = \sum_{y \in C} \rho(x' \xrightarrow{a} y).$$

Processes $x$ and $x'$ are said to be $W$-bisimilar (or just bisimilar when $W$ is understood) if there exists a $W$-bisimulation $R$ such that $(x, x') \in R$.

Clearly, $W$-bisimulations are closed under arbitrary unions, which ensures that $W$-bisimilarity on any $W$-LTS is the largest $W$-bisimulation over it.1
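The condition of Definition 2 can be checked mechanically on a finite system. The following is a minimal sketch, not an algorithm from the cited works: the function name `is_strong_bisimulation`, the dictionary encoding of the weight function, and the small example system are all assumptions made for illustration. The monoid is passed explicitly as `add` and `zero`.

```python
from functools import reduce
from itertools import combinations


def is_strong_bisimulation(labels, rho, partition, add, zero):
    """Check the condition of Definition 2 for a partition into classes.

    rho maps (x, a, y) to a weight; missing triples weigh `zero`.
    add and zero are the operation and unit of the commutative monoid.
    """
    def weight_into(x, a, block):
        # Monoid sum of the weights of all a-transitions from x into block.
        return reduce(add, (rho.get((x, a, y), zero) for y in block), zero)

    return all(
        weight_into(x1, a, target) == weight_into(x2, a, target)
        for block in partition
        for x1, x2 in combinations(sorted(block), 2)
        for a in labels
        for target in partition
    )


# Hypothetical example: weights are non-negative numbers combined by addition.
rho = {("p", "a", "r"): 2, ("p", "a", "s"): 1, ("q", "a", "r"): 3}
add = lambda u, v: u + v

coarse = [{"p", "q"}, {"r", "s"}]   # p and q both reach {r, s} with weight 3
fine = [{"p", "q"}, {"r"}, {"s"}]   # p reaches {r} with 2, q with 3

print(is_strong_bisimulation({"a"}, rho, coarse, add, 0))  # True
print(is_strong_bisimulation({"a"}, rho, fine, add, 0))    # False
```

Replacing `add` and `zero` by logical disjunction and `False` turns the same check into the non-deterministic case discussed in Section 2.1.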

###### Remark 1.

In order for the above definition to be well-posed, summations need to be well-defined. Intuitively this means that the $W$-LTS does not exceed the expressiveness of its underlying monoid of weights $W$. In other words, the system has to be image finite if the monoid admits only finite summations; image countable if the monoid admits countable summations, and so on.

In [24, 26], for the sake of simplicity the authors restrict themselves to image finite systems (which is not unusual in the coalgebraic setting). In the present paper we extend their definitions to the case of countable images. This generalization allows us to capture a wider range of systems and is crucial for the definition of weak and delay bisimulations.

In practice, Remark 1 is not a severe restriction, since the commutative monoids relevant for most systems of interest admit summations over countable sets. To support this claim, in the rest of this section we illustrate how non-deterministic, stochastic and probabilistic labelled transition systems can be recovered as systems weighted over commutative monoids with countable sums. These kinds of commutative monoids are often called commutative $\omega$-monoids2.

### 2.1 Non-deterministic systems are WLTS

This section illustrates how non-deterministic labelled transition systems [31] can be recovered as systems weighted over the commutative -monoid of logical values equipped with logical disjunction .

###### Definition 3 (Non-deterministic LTS).

A non-deterministic labelled transition system is a triple where:

• is a set of states (processes);

• is an at most countable set of labels (actions);

• is the transition relation.

As usual, we shall denote an -labelled transition from to i.e.  by . A state is called successor of a given state iff . If has no successors then it is said to be terminal. If every state has a finite set of successors then the system is said to be image finite. Likewise it is said to be image countable if each state has at most countably many successors.

Every -valued weight function is a predicate defining a subset of its domain, turning equivalent to the classical definition of the transition relation .

###### Definition 4 (Strong non-deterministic bisimulation).

Let be an LTS. An equivalence relation is a (strong non-deterministic) bisimulation on iff for each pair of states , for any label and each equivalence class :

$$\exists y \in C.\ x \xrightarrow{a} y \iff \exists y' \in C.\ x' \xrightarrow{a} y'.$$

Two states and are said bisimilar iff there exists a bisimulation relation such that . The greatest bisimulation for uniquely exists and is called (strong) bisimilarity.

Strong -bisimulation and strong non-deterministic bisimulation coincide, since logical disjunction over the states in a given class encodes the ability to reach making an -labelled transition.

### 2.2 Stochastic systems are WLTS

Stochastic systems have important applications, especially in the field of quantitative analysis, and several tools and formalisms to describe and study them have been proposed (e.g. PEPA [20], EMPA [8] and the stochastic $\pi$-calculus [34]). Recently, rated transition systems [25, 26, 32, 7] emerged as a convenient presentation of this kind of system.

###### Definition 5 (Rated LTS [25, Sec. 2.2]).

A rated labelled transition system is a triple where:

• is a set of states (processes);

• is a countable set of labels (actions);

• is the rate function.

Semantics of stochastic processes is usually given by means of labelled continuous time Markov chains (CTMC). The real number is interpreted as the parameter of an exponential probability distribution governing the duration of the transition from state to by means of an -labelled action and hence encodes the underlying CTMC (for more information about CTMCs and their presentation by transition rates see e.g. [19, 20, 33, 34]).

###### Definition 6 (Strong stochastic bisimulation).

Given a rated system an equivalence relation is a (strong stochastic) bisimulation on (or strong equivalence [20]) iff for each pair of states , for any label and each equivalence class :

$$\sum_{y \in C} \rho(x, a, y) = \sum_{y \in C} \rho(x', a, y).$$

Two states and are said bisimilar iff there exists a bisimulation relation such that . The greatest bisimulation for uniquely exists and is called (strong) bisimilarity.

Rated transition systems (hence stochastic systems) are precisely WLTSs weighted over the commutative monoid of non-negative real numbers (extended with infinity) under addition, and stochastic bisimulations correspond to the strong bisimulations weighted over this monoid, as shown in [24]. Moreover, this monoid is an $\omega$-monoid, since non-negative real numbers admit sums over countable families. In particular, the sum of a given countable family $\{x_i\}_{i \in I}$ is defined as the supremum of the set of sums over its finite subfamilies:

$$\sum_{i \in I} x_i \triangleq \sup\Big\{ \sum_{i \in J} x_i \;\Big|\; J \subseteq I,\ |J| < \omega \Big\}.$$
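As a quick sanity check of this definition (a worked example added here for illustration, not taken from the cited works), consider the families $x_i = 2^{-i}$ and $x_i = 1$ for $i \geq 1$. Since all terms are non-negative, every finite index set is contained in an initial segment $\{1, \dots, n\}$, so the supremum can be computed over initial segments only:

```latex
\[
\sum_{i \geq 1} 2^{-i}
  = \sup_{n \geq 1} \sum_{i=1}^{n} 2^{-i}
  = \sup_{n \geq 1} \bigl(1 - 2^{-n}\bigr)
  = 1,
\qquad
\sum_{i \geq 1} 1
  = \sup_{n \geq 1} n
  = \infty .
\]
```

The second sum shows why the monoid has to be closed with infinity for countable sums to be total.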

### 2.3 Probabilistic systems are (Constrained) WLTS

This section illustrates how probabilistic LTSs are captured by weighted ones. We focus on fully probabilistic systems (also known as generative systems) [13, 27, 4] but in the end we provide some hints on other types of probabilistic systems.

Fully probabilistic systems can be regarded as a specialization of non-deterministic transition systems where probabilities are used to resolve nondeterminism. From a slightly different point of view, they can also be interpreted as labelled Markov chains with discrete parameter set [23].

###### Definition 7 (Fully probabilistic LTS).

A fully probabilistic labelled transition system is a triple where:

1. is a set of states (processes);

2. is a countable set of labels (actions);

3. is a function such that for any is either a discrete probability measure for or the constantly $0$ function.

In “reactive” probabilistic systems, in contrast to fully probabilistic systems, transition probability distributions depend on the occurrences of actions, i.e. for any and is either a discrete probability measure for or the constantly $0$ function.

Strong probabilistic bisimulation has been originally introduced by Larsen and Skou [27] for reactive systems and has been reformulated by van Glabbeek et al. [13] for fully probabilistic systems.

###### Definition 8 (Strong probabilistic bisimilarity).

Let be a fully probabilistic system. An equivalence relation is a (strong probabilistic) bisimulation on iff for each pair of states , for any label and any equivalence class :

$$P(x, a, C) = P(x', a, C)$$

where $P(x, a, C) \triangleq \sum_{y \in C} P(x, a, y)$.

Two states and are said bisimilar iff there exists a bisimulation relation such that . The greatest bisimulation for uniquely exists and is called bisimilarity.

It would be tempting to recover fully probabilistic systems as LTSs weighted over the probability interval $[0,1]$, but unfortunately addition does not define a monoid on $[0,1]$, since it is not a total operation when restricted to this interval. There exist various commutative monoids over the probability interval, leading to different interpretations of probabilistic systems (as will be shown in Section 4.4), but since in Definition 8 we sum probabilities of outgoing transitions (e.g. to compute the probability of reaching a certain set of states), the real number addition has to be used.

###### Remark 2 (On partial commutative monoids).

The theory of weighted labelled transition systems can be extended to consider partial commutative monoids (i.e. $a + b$ may be undefined, but when it is defined then so is $b + a$ and commutativity holds) or commutative -monoids to handle sums over suitable countable families (thus relaxing the requirement of weights forming -monoids). However, every -semiring can be turned into an -complete one by adding a distinguished element and resolving partiality accordingly.

Klin [24] suggested to consider probabilistic systems as systems weighted over but subject to suitable constraints ensuring that the weight function is a state-indexed probability distribution and thus satisfies Definition 7. These constrained WLTSs were proposed to deal with reactive probabilistic systems.

###### Definition 9 (constrained W-Lts).

Let $W$ be a commutative monoid and let a family of constraints on weight functions be given. A constrained $W$-weighted labelled transition system is a $W$-LTS whose weight function satisfies all the constraints of the family.

Then, fully probabilistic labelled transition systems are precisely constrained -LTSs subject to the constraint family:

$$\sum_{a \in A,\, y \in X} \rho(x, a, y) \in \{0, 1\} \quad \text{for } x \in X.$$

Likewise, reactive probabilistic systems are -LTSs subject to the constraint family:

$$\sum_{y \in X} \rho(x, a, y) \in \{0, 1\} \quad \text{for } x \in X \text{ and } a \in A.$$

Therefore strong bisimulations for these kinds of systems are exactly strong -bisimulations.
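The two constraint families differ only in whether the sum ranges over all labels or is taken label by label. A minimal sketch for finite systems, assuming a dictionary encoding of the weight function and exact rational weights (the function names and the example are hypothetical):

```python
from fractions import Fraction


def is_fully_probabilistic(states, labels, rho):
    """For every state, the weights over all labels and targets sum to 0 or 1."""
    return all(
        sum(rho.get((x, a, y), 0) for a in labels for y in states) in (0, 1)
        for x in states
    )


def is_reactive(states, labels, rho):
    """For every state and label, the weights over all targets sum to 0 or 1."""
    return all(
        sum(rho.get((x, a, y), 0) for y in states) in (0, 1)
        for x in states
        for a in labels
    )


# Hypothetical example: x chooses between a and b with probability 1/2 each.
states = {"x", "y"}
labels = {"a", "b"}
half = Fraction(1, 2)
rho = {("x", "a", "y"): half, ("x", "b", "y"): half}

print(is_fully_probabilistic(states, labels, rho))  # True
print(is_reactive(states, labels, rho))             # False
```

The example satisfies the first family but not the second, because the weights of the `a`-transitions alone sum to 1/2.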

## 3 Weak bisimulations for WLTS over semirings

In the previous section we illustrated how weighted labelled transition systems can uniformly express several kinds of systems such as non-deterministic, stochastic and probabilistic systems. Remarkably, bisimulations for these systems were proved to be instances of weighted bisimulations.

In this section we show how other observational equivalences can be stated at the general level of weighted transition systems, offering a treatment of these notions that is uniform across the wide range of systems captured by weighted ones. Due to space constraints we focus on weak bisimulation, but at the end we briefly discuss how the proposed results can cover other notions of observational equivalence.

### 3.1 From transitions to execution paths

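Let $(X, A, \rho)$ be a $W$-LTS. A finite execution path for this system is a sequence of transitions, i.e. an alternating sequence of states and labels like

$$\pi = x_0 \xrightarrow{a_1} x_1 \xrightarrow{a_2} x_2 \dots x_{n-1} \xrightarrow{a_n} x_n$$

such that for each transition $x_{i-1} \xrightarrow{a_i} x_i$ in the path:

$$\rho(x_{i-1} \xrightarrow{a_i} x_i) \neq 0.$$

Let $\pi$ denote the above path, then set:

$$\mathrm{length}(\pi) = n \qquad \mathrm{first}(\pi) = x_0 \qquad \mathrm{last}(\pi) = x_n \qquad \mathrm{trace}(\pi) = a_1 a_2 \dots a_n$$

to denote the length, starting state, ending state and trace of $\pi$ respectively.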

In order to extend the definition of the weight function to executions we need some additional structure on the domain of weights, allowing us to capture concatenation of transitions. To this end, we require weights to be drawn from a semiring, akin to the theory of weighted automata. Recall that a semiring is a set $W$ equipped with two binary operations $+$ and $\cdot$, called addition and multiplication respectively, and such that:

• $(W, +, 0)$ is a commutative monoid and $(W, \cdot, 1)$ is a monoid;

• multiplication left and right distributes over addition:

$$a \cdot (b + c) = (a \cdot b) + (a \cdot c) \qquad (a + b) \cdot c = (a \cdot c) + (b \cdot c)$$

• multiplication by $0$ annihilates $W$:

$$0 \cdot a = 0 = a \cdot 0.$$
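The axioms above can be tested on sample values, which is a cheap way to catch a wrongly chosen unit. The sketch below is illustrative only: the `Semiring` record, the helper `satisfies_axioms` and the chosen instances are assumptions, and passing the check on a few samples does not prove that a structure is a semiring.

```python
import math
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    zero: Any                       # unit of addition
    one: Any                        # unit of multiplication
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


def satisfies_axioms(s, samples):
    """Test units, annihilation, commutativity of + and distributivity on samples."""
    for a in samples:
        if s.add(a, s.zero) != a or s.mul(a, s.one) != a or s.mul(s.one, a) != a:
            return False
        if s.mul(a, s.zero) != s.zero or s.mul(s.zero, a) != s.zero:
            return False
    for a, b, c in product(samples, repeat=3):
        if s.add(a, b) != s.add(b, a):
            return False
        if s.mul(a, s.add(b, c)) != s.add(s.mul(a, b), s.mul(a, c)):
            return False
        if s.mul(s.add(a, b), c) != s.add(s.mul(a, c), s.mul(b, c)):
            return False
    return True


BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
ARITHMETIC = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
TROPICAL = Semiring(math.inf, 0, min, lambda a, b: a + b)
# Not a semiring: 0 is the unit of max on naturals but does not annihilate +.
BROKEN = Semiring(0, 0, max, lambda a, b: a + b)

print(satisfies_axioms(BOOLEAN, [False, True]))        # True
print(satisfies_axioms(ARITHMETIC, [0, 1, 2, 5]))      # True
print(satisfies_axioms(TROPICAL, [0, 2, 5, math.inf])) # True
print(satisfies_axioms(BROKEN, [0, 1, 2]))             # False
```

In the tropical instance the additive unit is infinity, which is why annihilation holds there but fails for `BROKEN`.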

Basically, the idea is to express parallel and subsequent transitions (i.e. branching and composition) by means of addition and multiplication respectively. Therefore, multiplication is not required to be commutative (cf. the semiring of formal languages). Distributivity ensures that execution paths are independent from the alternative branching i.e. given two executions sharing some sub-path, we are not interested in which is the origin of the sharing; as the following diagram illustrates:

[Diagram (1): branching structures over the labels $a$, $b$ and $c$, related by $\equiv$.]

Finally, since weights of (proper) transitions are always different from $0$, the annihilation property means that no proper execution can contain improper transitions.

Then, the weight function extends to finite paths by semiring multiplication (therefore we shall use the same symbol):

$$\rho(x_0 \xrightarrow{a_1} x_1 \cdots \xrightarrow{a_n} x_n) \triangleq \prod_{i=1}^{n} \rho(x_{i-1}, a_i, x_i)$$
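The extension of the weight function to finite paths is a fold of the semiring multiplication over the transitions of the path. A minimal sketch, assuming paths are encoded as alternating tuples of states and labels (the name `path_weight` and the example weights are hypothetical):

```python
from fractions import Fraction
from functools import reduce


def path_weight(rho, path, mul, one):
    """Weight of a finite path encoded as (x0, a1, x1, ..., an, xn).

    rho maps (x, a, y) to a weight; every transition of a path must be
    present in rho, since paths only use transitions with non-zero weight.
    """
    steps = zip(path[0::2], path[1::2], path[2::2])   # (x_{i-1}, a_i, x_i)
    return reduce(mul, (rho[step] for step in steps), one)


# Hypothetical example with probabilities as weights.
rho = {("x", "b", "x1"): Fraction(1, 2), ("x1", "b", "x2"): Fraction(1, 3)}
times = lambda u, v: u * v

print(path_weight(rho, ("x", "b", "x1", "b", "x2"), times, Fraction(1)))  # 1/6
print(path_weight(rho, ("x",), times, Fraction(1)))                        # 1
```

The path of length zero gets the multiplicative unit, as expected of an empty product.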

In the following let $W$ be a semiring $(W, +, \cdot, 0, 1)$.

Semirings offer enough structure to extend the weight function to finite execution paths compositionally, but executions can also be (countably) infinite. As with countable branching (cf. Remark 1), paths of countable length can be treated by requiring multiplication to be defined also over (suitable) countable families of weights and, obviously, to respect the semiring structure. However, this additional requirement can be avoided by dealing with suitable sets of paths, as long as these convey enough information for the notion of weak bisimulation (and observational equivalence in general). In particular, a finite path $\pi$ determines a set of paths (possibly infinite) starting with $\pi$, thus $\pi$ can be seen as a representative for the set. Moreover, the behavior of a system can be reduced to its complete executions: a path is called complete (or “full” [4]) if it is either infinite or ends in a terminal state.

Intuitively, we distinguish complete paths only up to the chosen representatives: a longer representative may generate a smaller set of paths, and this can be thought of as “observing more” of the system. If two complete paths are distinguishable, we have to be able to distinguish them in a finite way, i.e. there must be two representatives with enough information to tell one set from the other. Otherwise, if no such representatives exist, then the given complete paths are indeed equivalent. Therefore, it is enough to be able to compositionally weight (finite) representatives in order to distinguish any complete path.

The remainder of the subsection elaborates the above intuition, defining a $\sigma$-algebra over complete paths (for each state). The method presented is a generalization to semirings of the one used in [2]. This structure allows us to deal with sets of finite paths avoiding redundancies (cf. Example 3) and to define weights compositionally.

Let , and denote the sets of all, complete and finite paths starting in the state respectively. Likewise, we shall denote the corresponding sets of paths w.r.t. any starting state as , and respectively (e.g. ). Paths naturally organize into a preorder by the prefix relation. In particular, given define if and only if one of the following holds:

1. and (both finite), and for ;

2. and (one finite and the other infinite), and for ;

3. (both infinite).

For each finite path define the cone of complete paths generated by as follows:

$$\pi{\uparrow} \triangleq \{\pi' \in \mathrm{CPaths}(x) \mid \pi \preceq \pi'\}.$$

Cones are precisely the sets we were sketching in the intuition above and form a subset of the parts of :

$$\Gamma \triangleq \{\pi{\uparrow} \mid \pi \in \mathrm{FPaths}(x)\}.$$

This set is at most countable since the set $\mathrm{FPaths}(x)$ is so, and every two of its elements are either disjoint or one is a subset of the other, as the following lemmas state.

###### Lemma 1.

For any state , the set of finite paths of an image countable -LTS is at most countable.

###### Proof.

We proceed by induction on the length of paths. For length $0$ there is exactly one path and, assuming that the set of paths of length $k$ is at most countable, the set of those with length $k + 1$ is at most countable because the system is assumed to be image countable and the set of labels is at most countable. Then the set of finite paths is at most countable since it is the disjoint union of

$$\{\pi \in \mathrm{Paths}(x) \mid \mathrm{length}(\pi) = k\}$$

for $k \in \mathbb{N}$. ∎

###### Lemma 2.

Two cones $\pi_1{\uparrow}$ and $\pi_2{\uparrow}$ are either disjoint or one is a subset of the other.

###### Proof.

For any , we have by definition:

$$\pi \in \pi_1{\uparrow} \iff \pi_1 \preceq \pi \quad \text{and} \quad \pi \in \pi_2{\uparrow} \iff \pi_2 \preceq \pi.$$

Then, if then (likewise for ). For the other case, since there is no such that . ∎

Given , the set of all cones generated by its elements is denoted by and defined as the (at most countable) union of the cones generated by each . If this union is over disjoint cones then is said to be minimal.

Minimality is not preserved by set union even if operands are disjoint and both minimal. As a counter example consider the sets and for ; both are minimal and disjoint, but their union is not minimal since . However, always has at least a subset being minimal and such that

$$\Pi{\uparrow} = \Pi'{\uparrow}. \qquad (2)$$

and among these there exists exactly one which is also minimal in the sense of prefixes:

###### Lemma 3.

For , there exists a minimal subset which satisfies (2), i.e. for any satisfying (2) we have: We denote such by .

###### Proof.

Clearly iff since there are no infinite prefix descending chains. Then since is minimal. For every there exists such that and by Lemma 2 i.e.  . Therefore . Consider as in the enunciate, then, for every there exists such that and in particular if . Uniqueness follows straightforwardly. ∎

The set $\Pi{\downarrow}$ is called the minimal support of $\Pi$ and intuitively corresponds to the “minimal” set of finite executions needed to completely characterize the behavior captured by $\Pi$ and the complete paths it induces. Any other path of $\Pi$ is therefore redundant (cf. Example 3).
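For a finite set of finite paths, the minimal support can be computed by discarding every path that has a proper prefix in the set; the discarded paths generate cones already contained in the cone of that prefix. The following sketch assumes paths are encoded as alternating tuples of states and labels; the names `is_prefix` and `minimal_support` and the example are hypothetical.

```python
def is_prefix(p, q):
    """True iff the finite path p is a (not necessarily proper) prefix of q."""
    return len(p) <= len(q) and q[: len(p)] == p


def minimal_support(paths):
    """Keep only the paths that have no proper prefix in the given finite set."""
    paths = set(paths)
    return {
        p for p in paths
        if not any(q != p and is_prefix(q, p) for q in paths)
    }


# Hypothetical example: `longer` extends `short`, so it is redundant.
short = ("x", "a", "y")
longer = ("x", "a", "y", "b", "z")
other = ("x", "c", "w")

print(minimal_support({short, longer, other}) == {short, other})  # True
```

No two paths of the result are prefixes of one another, so by Lemma 2 the cones they generate are pairwise disjoint.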

The idea of which complete paths are distinguishable and then “measurable” (i.e. that can be given weight) is captured precisely by the notion of $\sigma$-algebra. In fact, the set of all cones (together with the empty set) induces a $\sigma$-algebra, as the cones form a semiring of sets (in the sense of [43]).

###### Lemma 4.

The set is a semiring of sets and uniquely induces a -algebra over .

###### Proof (Sketch).

The set is closed under finite intersections since cones are always either disjoint or one is a subset of the other. Set difference follows from the existence of minimal supports. ∎

As discussed before, in general the weight of a set $\Pi$ of finite paths cannot be defined as the sum of the weights of its elements, due to redundancies. However, what we are really interested in is the unique set of behaviors described by $\Pi$, i.e. the complete paths it subsumes. Therefore we first extend $\rho$ to minimal $\Pi$, as follows:

$$\rho(\Pi) \triangleq \sum_{\pi \in \Pi} \rho(\pi) \quad \text{for } \Pi \text{ minimal}.$$

Then, for all $\Pi$, we simply take

$$\rho(\Pi) \triangleq \rho(\Pi{\downarrow}).$$

Because $\Pi$ can be countably infinite, semiring addition has to support countable additions over these sets (cf. Remark 1).

### 3.2 Well-behaved semirings

###### Definition 10.

Let the semiring be endowed with a preorder . We call the semiring well-behaved if, and only if, for any two and the following holds:

$$\Pi_1 \subseteq \Pi_2 \Rightarrow \rho(\Pi_1) \sqsubseteq \rho(\Pi_2).$$

If the semiring is well-behaved then addition unit is necessarily the bottom of the preorder because . Moreover, the semiring operations have to respect the preorder e.g.:

$$a \sqsubseteq b \Rightarrow a + c \sqsubseteq b + c.$$

As a direct consequence, annihilation of parallel transitions is avoided by the zerosumfree property of the semiring, i.e. the sum of the weights of proper transitions always yields the weight of a proper transition, where proper means different from the addition unit.

Well-behaved semirings are precisely positively (partially) ordered semirings and it is well known that these admit the natural preorder:

$$a \trianglelefteq b \;\overset{\triangle}{\iff}\; \exists c.\ a + c = b$$

which is respected by the semiring operations and has $0$ as bottom. The natural preorder is the weakest preorder rendering a semiring positively ordered (hence well-behaved), where weakest means that for any such preorder $\sqsubseteq$ and elements $a$, $b$:

$$a \trianglelefteq b \implies a \sqsubseteq b.$$

The converse holds only when the other order is also natural.

###### Lemma 5.

The natural preorder is the weakest preorder rendering the semiring well-behaved.

Note that any idempotent semiring bears a natural preorder and hence is well-behaved, and the same holds for every semiring considered in the examples illustrated in this paper (cf. Section 4). By contrast, some arithmetic semirings like are not positively ordered because of negatives; moreover they are not -semirings (there is no limit for ).

### 3.3 Weak W-bisimulation

Weak bisimulation weakens the notion of strong bisimulation by allowing sequences of silent actions before and after any observable one. Hence, we are now dealing with (suitable) paths instead of single transitions, and states are compared on the basis of how suitable classes of states are reached from them by means of the paths allowed (i.e. making some silent actions, before and after an observable one, if any). Therefore, the notion of how a class of states is reached, and of which paths can be used in doing so, is crucial in the definition of weak bisimulation.

For instance, for non-deterministic LTSs, the questions of how and whether a class is reached coincide, and then it suffices to find a (suitable) path leading to the class. This allows weak bisimulation for non-deterministic LTSs to rely on the reflexive and transitive closure of the $\tau$-labelled transitions of a system (cf. Definition 12) to blur the distinction between sequences of silent actions, which can then be “skipped”. In fact, the $\tau$-closure at the base of (3) defines a new LTS over the same state space as the previous one and such that every weak bisimulation for this new system is a weak bisimulation for the given one and vice versa.

In [11] Buchholz and Kemper extend this notion to a class of automata weighted over suitable semirings, i.e. those having operations commutative and idempotent (e.g. ). This class includes interesting examples such as the boolean and bottleneck semirings (cf. Section 4.4) but not the semiring of non-negative real numbers, and therefore does not cover the case of fully probabilistic systems. Modulo some technicalities connected to initial and accepting states, their results can be extended to labelled transition systems and hold also for LTSs weighted over suitable semirings.

Their interesting construction relies on the -closure of a system and it is known that this closure does not cover the general case. For instance, it can not be applied to recover weak bisimulation for generative systems as demonstrated by Baier and Hermanns (cf. [2]). The following example gives an intuition of the issue.

###### Example 3.

Consider the -LTS below.

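[Figure: a weighted LTS with states $x, x_1, \dots, x_6$, transition labels $b,w_1$; $b,w_2$; $b,w_3$; $a,w_7$; $a,w_4$; $b,w_5$; $a,w_6$, and a highlighted class of states $C$.]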

There are four finite paths going from the state $x$ to the class $C$. Their weights are:

$$\begin{aligned}
\rho(x \xrightarrow{b} x_1 \xrightarrow{b} x_2) &= w_1 \cdot w_2 \\
\rho(x \xrightarrow{b} x_1 \xrightarrow{b} x_2 \xrightarrow{b} x_3 \xrightarrow{a} x_5) &= w_1 \cdot w_2 \cdot w_3 \cdot w_7 \\
\rho(x \xrightarrow{a} x_4) &= w_4 \\
\rho(x \xrightarrow{a} x_4 \xrightarrow{b} x_5) &= w_4 \cdot w_5
\end{aligned}$$

Suppose we define the weight of the set of these paths as the sum of its elements’ weights, and suppose that the system is generative; then the probability of reaching $C$ from $x$ would exceed $1$. Likewise, in the case of a stochastic system, the rate of reaching $C$ cannot consider paths passing through $C$ before ending in it. If we are interested in how $C$ is reached from $x$ with actions yielding a trace in the set , paths and are ruled out because the first has a different trace and the second reaches $C$ before it ends.

Then, given a set of traces $T$, a state $x$ and a class of states $C$, the set of finite paths of the given transition system reaching $C$ from $x$ with trace in $T$ that should be considered is:

$$\Lbag x, T, C \Rbag \triangleq \{\pi \mid \pi \in \mathrm{FPaths}(x),\ \mathrm{last}(\pi) \in C,\ \mathrm{trace}(\pi) \in T,\ \forall \pi' \prec \pi : \mathrm{trace}(\pi') \in T \Rightarrow \mathrm{last}(\pi') \notin C\}$$

(where $\pi' \prec \pi$ means that $\pi'$ is a proper prefix of $\pi$), since these are all and only the finite executions of the system going from $x$ to $C$ with trace in $T$ and never passing through $C$ except for their last state. Redundancies highlighted in the example above are ruled out since no execution path in this set is the prefix of another in the same set. In particular, $\Lbag x, T, C \Rbag$ is the minimal support of the set of all finite paths reaching $C$ from $x$ with trace in $T$:

$$\Lbag x, T, C \Rbag = \{\pi \mid \pi \in \mathrm{FPaths}(x),\ \mathrm{last}(\pi) \in C,\ \mathrm{trace}(\pi) \in T\}{\downarrow}.$$

Therefore, weight functions can be consistently extended to these sets by point-wise sums:

$$\rho(\Lbag x, T, C \Rbag) = \sum_{\pi \in \Lbag x, T, C \Rbag} \rho(\pi).$$

The sum is at most countable since the set of finite paths starting in $x$ is so (Lemma 1). Hence, the addition operation of the semiring must support countable sums, as discussed in Remark 1.
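On a finite system the weight of such a set can be approximated by a depth-first exploration that stops a path at the first state in the class reached with an accepted trace, so that no path in the sum is a prefix of another. The sketch below is an illustration under stated assumptions, not the algorithm of Section 5: the name `reach_weight`, the bound `max_len` on path length (needed only to terminate on cyclic systems), the encoding of the trace set as a predicate, and the example system are all hypothetical.

```python
from fractions import Fraction

TAU = "tau"


def reach_weight(rho, x, in_T, C, add, mul, zero, one, max_len):
    """Sum the weights of the paths from x that hit C for the first time
    with a trace accepted by in_T, ignoring paths longer than max_len."""
    succ = {}
    for (s, a, t), w in rho.items():
        if w != zero:
            succ.setdefault(s, []).append((a, t, w))

    def explore(state, trace, weight):
        if state in C and in_T(trace):
            return weight      # first hit: count this path, do not extend it
        if len(trace) >= max_len:
            return zero
        total = zero
        for a, t, w in succ.get(state, []):
            total = add(total, explore(t, trace + (a,), mul(weight, w)))
        return total

    return explore(x, (), one)


def tau_a_tau(trace):
    """Traces made of silent steps and exactly one occurrence of a."""
    return trace.count("a") == 1 and all(l in ("a", TAU) for l in trace)


# Hypothetical generative system; c1 can silently move on to c2.
half = Fraction(1, 2)
rho = {
    ("x", TAU, "y"): half,
    ("x", "a", "c1"): half,
    ("y", "a", "c2"): Fraction(1),
    ("c1", TAU, "c2"): Fraction(1),
}
plus = lambda u, v: u + v
times = lambda u, v: u * v

w = reach_weight(rho, "x", tau_a_tau, {"c1", "c2"}, plus, times,
                 Fraction(0), Fraction(1), max_len=10)
print(w)  # 1
```

Summing over all three paths that end in the class, including the redundant one passing through `c1`, would instead give 3/2, which is not a probability.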

When clear from the context, we may omit the bag brackets from .

We are now ready to state the notion of weak bisimulation of a labelled transition system weighted over any semiring admitting sums over (not necessarily every) countable family of weights. The notion we propose relies on the weights of the paths reaching every class in the relation while performing at most one observable action; hence the importance of defining sets of paths reaching a class consistently.

###### Definition 11 (Weak W-bisimulation).

Let $(X, A, \rho)$ be a LTS weighted over the semiring $W$. A weak $W$-bisimulation is an equivalence relation $R$ on $X$ such that for all $x, x' \in X$, $(x, x') \in R$ implies that for each label $a$ and each equivalence class $C$ of $R$:

$$\rho(x, \tau^* a \tau^*, C) = \rho(x', \tau^* a \tau^*, C) \qquad \rho(x, \tau^*, C) = \rho(x', \tau^*, C).$$

States $x$ and $x'$ are said to be weak $W$-bisimilar (or just weak bisimilar), written , if there exists a weak $W$-bisimulation $R$ such that $(x, x') \in R$.
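The two equations of Definition 11 can be checked directly on small finite systems. The sketch below fixes the weights to exact probabilities, bounds the length of explored paths by `max_len`, and applies the first equation to observable labels only; these choices, together with the names `reach`, `silent`, `one_visible` and `is_weak_bisimulation`, are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

TAU = "tau"


def reach(rho, x, accepts, C, max_len):
    """Probability of hitting C from x for the first time with an accepted trace."""
    def go(state, trace, w):
        if state in C and accepts(trace):
            return w
        if len(trace) >= max_len:
            return Fraction(0)
        return sum(
            (go(t, trace + (a,), w * p)
             for (s, a, t), p in rho.items() if s == state),
            Fraction(0),
        )
    return go(x, (), Fraction(1))


def silent(trace):
    """Traces of the form tau*."""
    return all(l == TAU for l in trace)


def one_visible(a):
    """Traces of the form tau* a tau*."""
    return lambda trace: trace.count(a) == 1 and all(l in (a, TAU) for l in trace)


def is_weak_bisimulation(rho, visible_labels, partition, max_len):
    patterns = [silent] + [one_visible(a) for a in sorted(visible_labels)]
    return all(
        reach(rho, x1, accepts, C, max_len) == reach(rho, x2, accepts, C, max_len)
        for block in partition
        for x1, x2 in combinations(sorted(block), 2)
        for C in partition
        for accepts in patterns
    )


# Hypothetical example: p silently becomes q, which then performs a.
rho = {("p", TAU, "q"): Fraction(1), ("q", "a", "r"): Fraction(1)}

print(is_weak_bisimulation(rho, {"a"}, [{"p", "q"}, {"r"}], max_len=5))  # True
print(is_weak_bisimulation(rho, {"a"}, [{"p", "r"}, {"q"}], max_len=5))  # False
```

In the first partition `p` and `q` are identified because the silent step from `p` to `q` stays inside their class; in the second, `p` can perform `a` after a silent step while `r` cannot.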

The approach we propose applies to other behavioural equivalences. For instance, delay bisimulation can be recovered for WLTSs by simply considering in the above definition of weak bisimulations sets of paths of the sort of and . The notion of branching bisimulation relies on paths with the same traces of those considered for defining weak bisimulation but with some additional constraint on the intermediate states. In particular, the states right before the observable have to be in the same equivalence class and likewise the states right after it. Definition 11 is readily adapted to branching bisimulation by considering these particular subsets of .

## 4 Examples of weak W-bisimulation

In this Section we instantiate Definition 11 to the systems introduced in Section 2 as instances of LTSs weighted over commutative -monoids.

### 4.1 Non-deterministic systems

Let us recall the usual definition of weak bisimulation for LTSs [31].

###### Definition 12 (Weak non-deterministic bisimulation).

An equivalence relation is a weak (non-deterministic) bisimulation on iff for each , label and equivalence class :

$$\exists y \in C.\ x \overset{\alpha}{\Longrightarrow} y \iff \exists y' \in C.\ x' \overset{\alpha}{\Longrightarrow} y' \tag{3}$$

where is the well-known -reflexive and -transitive closure of the transition relation . Two states and are said to be weak bisimilar iff there exists a weak non-deterministic bisimulation relation such that .

Clearly, a weak bisimulation is a relation on states induced by a strong bisimulation of a suitable LTS with the same states and actions. In particular, weak bisimulations for are strong bisimulations for and vice versa. The transition system is sometimes referred to as saturated or weak (e.g. in [21]). This observation is at the base of some algorithmic and coalgebraic approaches to weak non-deterministic bisimulations (cf. Section 5 and Section 6 respectively).
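To make the saturation step concrete, here is a minimal Python sketch under stated assumptions: transitions are encoded as `(source, label, target)` triples, the silent action is the string `"tau"`, and the function names `tau_closure` and `saturate` are hypothetical. It computes the reflexive and transitive closure of the silent steps and then the weak transitions of the saturated system.

```python
def tau_closure(states, trans, tau="tau"):
    """reach[x] = states reachable from x by zero or more tau-steps."""
    reach = {x: {x} for x in states}
    changed = True
    while changed:
        changed = False
        for x in states:
            for (s, label, t) in trans:
                if label == tau and s in reach[x] and t not in reach[x]:
                    reach[x].add(t)
                    changed = True
    return reach

def saturate(states, trans, tau="tau"):
    """Weak transitions: tau* for the silent label, tau* a tau* otherwise."""
    reach = tau_closure(states, trans, tau)
    weak = set()
    for x in states:
        for y in reach[x]:
            weak.add((x, tau, y))                 # x ==tau==> y
        for (s, label, t) in trans:
            if label != tau and s in reach[x]:
                for y in reach[t]:
                    weak.add((x, label, y))       # x ==a==> y
    return weak

# Hypothetical example: x --tau--> y --a--> z
states = {"x", "y", "z"}
trans = {("x", "tau", "y"), ("y", "a", "z")}
weak = saturate(states, trans)
```

In this example `weak` contains `("x", "a", "z")`, so `x` and `y` offer the same weak `a`-step, and any algorithm for strong bisimulation can then be run on `weak`.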

Section 2.1 illustrated that non-deterministic LTSs are -WLTSs. The commutative monoid is part of the boolean semiring of logical values under disjunction and conjunction which we shall also denote as . Then, by straightforward application of the definitions, the notions of weak non-deterministic bisimulation and weak -bisimulation coincide.

###### Proposition 6.

Definition 12 is equivalent to Definition 11 with .

It is easy to check that a similar correspondence holds for branching and delay bisimulations.

### 4.2 Probabilistic systems

In the definition of weak bisimulation for fully probabilistic systems we are interested in the probability of reaching a class of states. This aspect is present also in the case of strong bisimulation, but things become more complex for weak equivalences due to silent actions and multi-step executions. Moreover, -additivity is no longer available since the probability of reaching a class of states is not the sum of the probabilities of reaching every single state in that class. (On the contrary, a class is reachable if any of its states is, which is the property we are interested in when dealing with non-deterministic systems.)

Weak bisimulation for fully probabilistic systems was introduced by Baier and Hermanns in [4, 2]. Here we recall briefly their definition; we refer the reader to loc. cit. for a detailed presentation.

###### Definition 13 (Weak probabilistic bisimilarity [4, 2]).

Given a fully probabilistic system , an equivalence relation on is a weak (probabilistic) bisimulation iff for , for any and any equivalence class :

$$\mathrm{Prob}(x,\tau^*a\tau^*,C)=\mathrm{Prob}(x',\tau^*a\tau^*,C) \qquad \mathrm{Prob}(x,\tau^*,C)=\mathrm{Prob}(x',\tau^*,C).$$

Two states and are said to be weak bisimilar iff there exists a weak probabilistic bisimulation relation such that .

The function is the extension over finite execution paths of the unique probability measure induced by over the -field of the basic cylinders of complete paths.

###### Proposition 7.

Definition 13 is equivalent to Definition 11 with .

The function is a weight function such that is a probability measure (or the constantly measure) which extends to the unique -algebra on (Lemma 4). This defines precisely . In particular, for any and where is seen as the weight function of a -LTS.
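As a small worked instance (a hypothetical system, not taken from the cited works), let $x$ have a silent self-loop $x \xrightarrow{\tau} x$ with probability $\tfrac12$ and an observable step $x \xrightarrow{a} c$ with probability $\tfrac12$, where $c \in C$ has no outgoing transitions. The prefix-free set of paths from $x$ to $C$ with trace in $\tau^*a\tau^*$ contains exactly one path for each number $n \ge 0$ of loop iterations, so the weights add up to a geometric series:

$$\mathrm{Prob}(x,\tau^*a\tau^*,C)=\sum_{n\ge 0}\left(\tfrac12\right)^{n}\cdot\tfrac12=\tfrac12\cdot\frac{1}{1-\tfrac12}=1, \qquad \mathrm{Prob}(x,\tau^*,C)=0.$$

The second value is $0$ because no path made only of silent steps reaches $C$ from $x$.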

### 4.3 Stochastic systems

As we have seen in Section 2.2, stochastic transition systems can be captured as WLTSs over by describing the exponential time distributions of a CTMC by their rates [25]. Unfortunately, this does not extend to paths because the sequential composition of two exponential distributions does not yield an exponential distribution, and hence it cannot be represented by an element of . Moreover, there are stochastic systems (e.g. TIPP [14], SPADES [12]) whose transition times follow generic probability distributions.

To overcome this shortcoming, in this Section we introduce a semiring of weights called stochastic variables which allows us to express stochastic transition systems with generic distributions as WLTSs. Then the results of this theory can be readily applied to define various behavioural equivalences, ranging from strong bisimulation to trace equivalence, for all these kinds of systems. In particular, we define weak stochastic bisimulation by instantiating Definition 11 on the semiring of stochastic variables.

The carrier of the semiring structure we are defining is the set of transition-time random variables, i.e. random variables on the nonnegative real numbers (closed with infinity), which describe the nonnegative part of the time line.

Given two (possibly dependent) random variables and from , let be the minimum random variable yielding the minimum between and . If the variables and characterize the time required by two transitions then their combined effect is defined by the stochastic race between the two transitions; a race that is “won” by the transition completed earlier and hence the minimum. For instance, given two stochastic transitions and the transition time for their “combination” going from to is characterized by the random variable i.e. the overall time is given by the first transition to be completed on the specific run.

The minimum of random variables defines the operation over with a constantly continuous random variable (its density is the Dirac delta function ) as the unit. Random variables of the sort of are self-independent and, since they always yield , we shall make no distinction between them and refer to the random variable. In general, transition-time variables do not have to be self-independent since the events they describe usually depend on themselves. Intuitively, it is like racing against oneself, i.e. there is only one racer and therefore . Formally:

$$P(\min(X,X)>t)=P(X>t \cap X>t)=P(X>t)\cdot P(X>t \mid X>t)=P(X>t).$$

Let and be two continuous random variables from with probability density functions and respectively. The density describing is:

$$f_{\min(X,Y)}(z)=f_X(z)+f_Y(z)-f_{X,Y}(z,z).$$

When and are independent (but not necessarily i.i.d.) can be simplified as:

$$f_{\min(X,Y)}(z)=f_X(z)\cdot\int_z^{+\infty} f_Y(y)\,dy+f_Y(z)\cdot\int_z^{+\infty} f_X(x)\,dx.$$

Intuitively, the likelihood that one variable is the minimum must be “weighted” by the probability that the other one is not. In particular, for independent exponentially distributed variables and , is exponentially distributed and its rate is the sum of the rates of the negative exponentials characterizing and . Therefore, the commutative monoid faithfully generalizes the monoid used in Section 2.2 to capture CTMCs as WLTSs.
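The exponential case can be checked directly. Writing $\lambda$ and $\mu$ for the two rates (symbols introduced here only for illustration), so that $f_X(z)=\lambda e^{-\lambda z}$ and $f_Y(z)=\mu e^{-\mu z}$ with $X$ and $Y$ independent, the formula for independent variables gives:

$$f_{\min(X,Y)}(z)=\lambda e^{-\lambda z}\cdot e^{-\mu z}+\mu e^{-\mu z}\cdot e^{-\lambda z}=(\lambda+\mu)\,e^{-(\lambda+\mu)z},$$

which is the density of an exponential distribution with rate $\lambda+\mu$. Equivalently, $P(\min(X,Y)>t)=P(X>t)\cdot P(Y>t)=e^{-(\lambda+\mu)t}$.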

During the execution of a given path, the time of every transition in the sequence sums to the overall time. Therefore, the transition time for e.g.  is characterized by the random variable sum of the variables characterizing the single transitions composing the path. Sum and the constantly 0 continuous variable define a commutative monoid over . The operation has to be commutative because the order a path imposes on its steps does not change the total time of execution.

Let and be two continuous random variables from with probability density functions and respectively. The probability density function is:

$$f_{X+Y}(t)=\int_0^t f_{X,Y}(s,t-s)\,ds$$

and, if and are independent (but not necessarily i.i.d.), is the convolution:

$$f_{X+Y}(t)=\int_0^t f_X(s)\cdot f_Y(t-s)\,ds.$$
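For a worked instance of the convolution, take independent exponential variables with distinct rates $\lambda \neq \mu$ (symbols introduced here only for illustration), i.e. $f_X(s)=\lambda e^{-\lambda s}$ and $f_Y(s)=\mu e^{-\mu s}$:

$$f_{X+Y}(t)=\int_0^t \lambda e^{-\lambda s}\cdot\mu e^{-\mu(t-s)}\,ds=\lambda\mu\,e^{-\mu t}\int_0^t e^{-(\lambda-\mu)s}\,ds=\frac{\lambda\mu}{\lambda-\mu}\left(e^{-\mu t}-e^{-\lambda t}\right).$$

This density is not of the form $\nu e^{-\nu t}$ for any rate $\nu$, which is why a single rate cannot represent the time of a two-step path.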

It is easy to check that sum distributes over minimum:

$$X+\min(Y,Z)=\min(X+Y,\,X+Z)$$

by taking advantage of the latter operation being idempotent. Then, because sum is commutative, left distributivity implies the right one (and vice versa). Thus is a (commutative and idempotent) semiring and stochastic systems can be read as -LTSs. This immediately induces a strong bisimulation (by instantiating Definition 2) which corresponds to strong stochastic bisimulation on rated LTSs (Definition 6). Moreover, following Definition 11, we can readily define weak stochastic bisimulation as the weak -bisimulation.
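The semiring laws can be sanity-checked outcome by outcome. The sketch below is only an illustration under simplifying assumptions: a random variable is modelled as a list of integer time ticks indexed by the outcomes of one shared finite sample space (so dependent variables are allowed), and the helper names `rv_min` and `rv_sum` are hypothetical.

```python
import random

def rv_min(x, y):
    """Stochastic race: outcome-wise minimum."""
    return [min(a, b) for a, b in zip(x, y)]

def rv_sum(x, y):
    """Sequential composition: outcome-wise sum."""
    return [a + b for a, b in zip(x, y)]

random.seed(0)
n = 1000                                       # size of the sample space
X = [random.randint(0, 50) for _ in range(n)]
Y = [random.randint(0, 50) for _ in range(n)]
Z = [x + random.randint(0, 5) for x in X]      # deliberately dependent on X

lhs = rv_sum(X, rv_min(Y, Z))
rhs = rv_min(rv_sum(X, Y), rv_sum(X, Z))
print(lhs == rhs)           # True: sum distributes over minimum
print(rv_min(X, X) == X)    # True: minimum is idempotent
```

The check does not rely on independence: both laws hold on every single outcome, hence for the variables themselves.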

In the literature there are some (specific and ad hoc) notions of weak bisimilarity for stochastic systems. The closest to ours is the one given by Bernardo et al. for CTMCs extended with passive rates and instantaneous actions [6, 5]. Their definition is finer than our weak -bisimulation since they allow silent actions to be merged only when these are instantaneous and hence unobservable also w.r.t. time. Instead, in our definition sequences of actions are equivalent as long as their overall “rates” are the same (note that, in general, the convolution of exponentially distributed random variables is no longer exponentially distributed but a hyper-exponential). In [5], Bernardo et al. relaxed the definition given in [6] to account also for non-instantaneous -transitions. However, to retain exponentially distributed variables, they approximate hyper-exponentials with exponentials having the same average. This approach allows them to obtain a saturated system that is still a CTMC, but at the cost of losing precision since, in general, the average is the only moment preserved by the operation. In contrast, our approach does not introduce any approximation.

In [29] López and Núñez proposed a definition of weak bisimulation for stochastic transition systems with generic distributions. Their (rather involved) definition is a refinement of the notion they previously proposed in [30] and relies on the reflexive and transitive closure of silent transitions. However, their definition of strong bisimulation does not correspond to the one arising from the theory of WLTSs, and hence neither does the weak one.

### 4.4 Other examples

The definition of weak -bisimulation applies to many other situations. In the following we briefly illustrate some interesting cases.

Tropical and arctic semirings. These semirings are used very often in optimization problems, especially for task scheduling and routing problems. Some examples are: ; ; .

In these contexts, weak bisimulation would allow one to abstract from “unobservable” tasks (e.g. internal tasks) and to treat a cluster of machines as a single one, reducing the complexity of the problem.

Truncation semiring . It is a variant of the above ones, and it is used to reason “up-to” a threshold . A weak bisimulation for this semiring allows us to abstract from how the threshold is violated, but only if this happens.

Probabilistic semiring. Another semiring used for reasoning about probabilistic events is . This is used to model the maximum likelihood of events, e.g. for troubleshooting, diagnosis, failure forecasts, worst cases, etc. A weak bisimulation on this semiring allows one to abstract from “unlikely” events, focusing on the most likely ones.

Formal languages. A well-known semiring is that of formal languages over a given alphabet . Here, a weak bisimulation is a kind of determinization w.r.t. the words assigned to transitions.
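The following Python sketch shows how one piece of code can be instantiated on several of these semirings. The concrete carriers and operations (`(min, +)`, `(max, +)`, `(max, *)` and the Booleans) are the standard textbook ones and are assumed here, since they are not spelled out above; the class `Semiring` and the helpers `path_weight` and `set_weight` are hypothetical names. The weight of a path is taken to be the product of its step weights, and the weight of a set of paths is the sum of the path weights.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    zero: Any                        # unit of add
    one: Any                         # unit of mul
    add: Callable[[Any, Any], Any]   # combines alternative paths
    mul: Callable[[Any, Any], Any]   # combines consecutive steps

BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
TROPICAL = Semiring(float("inf"), 0.0, min, lambda a, b: a + b)   # (min, +)
ARCTIC = Semiring(float("-inf"), 0.0, max, lambda a, b: a + b)    # (max, +)
MAX_TIMES = Semiring(0.0, 1.0, max, lambda a, b: a * b)           # (max, *) on [0, 1]

def path_weight(sr, steps):
    """Product (in sr) of the weights of the steps of one path."""
    w = sr.one
    for s in steps:
        w = sr.mul(w, s)
    return w

def set_weight(sr, paths):
    """Sum (in sr) of the weights of a finite, prefix-free set of paths."""
    total = sr.zero
    for p in paths:
        total = sr.add(total, path_weight(sr, p))
    return total

# Two alternative paths, given by the weights of their steps.
print(set_weight(TROPICAL, [[1, 2], [4]]))         # 3.0: cheapest path
print(set_weight(ARCTIC, [[1, 2], [4]]))           # 4.0: costliest path
print(set_weight(MAX_TIMES, [[0.5, 0.5], [0.2]]))  # 0.25: most likely path
```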

## 5 A parametric algorithm for computing weak W-bisimulations

In this section we present an algorithm for computing weak -bisimulation equivalence classes which is parametric in the semiring structure . Being parametrized, the same algorithm can be used in the mechanized verification and analysis of many kinds of systems. Algorithms of this kind are often called universal since they do not depend on any particular numerical domain or on its machine representation. In particular, algorithms parametric over a semiring structure have been successfully applied to other problems of computer science, especially in the field of system analysis and optimization (cf. [28]).

The algorithm we present is a variation of the well-known Kanellakis-Smolka algorithm for deciding strong non-deterministic bisimulation [22]. Our solution is based on the same refinement technique used for the coarsest stable partition, but instead of “strong” transitions in the original system we consider “weakened” or saturated ones. The idea of deciding weak bisimulation by computing the strong bisimulation equivalence classes for the saturated version of the system has been previously and successfully used e.g. for non-deterministic or probabilistic weak bisimulations [2]. The resulting complexity is basically that of the coarsest stable partition problem plus that introduced by the construction of the saturated transitions. The latter factor depends on the properties and kind of the system and, in our case, on the properties of the semiring (the algorithm and its complexity will be discussed in more detail in Section 5.2).

Before outlining the general idea of the algorithm let us introduce some notation. For a finite set we denote by a partition of it i.e. a set of pairwise disjoint sets covering :

$$X=\biguplus \mathcal{X}=\biguplus\{B_0,\dots,B_n\}.$$

We shall refer to the elements of the partition as blocks or classes since every partition induces an equivalence relation on and vice versa.

Given a finite -LTS , the general idea for deciding weak -bisimulation by partition refinement is to start with a partition of the states coarser than the weak bisimilarity relation, e.g. , and then successively refine the partition with the help of a splitter (i.e. a witness that the partition is not stable w.r.t. the transitions). This process eventually yields a partition that is the set of equivalence classes of the weak bisimilarity. A splitter of a partition is a pair made of an action and a class of that violates the condition for to be a weak bisimulation. In other words, a pair is a splitter for if, and only if, there exist and such that:

$$\rho(x,\hat{\alpha},C)\neq\rho(y,\hat{\alpha},C) \tag{4}$$

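where $\hat{\alpha}$ is a shorthand for the sets of traces $\tau^*$ and $\tau^*a\tau^*$ when $\alpha=\tau$ and $\alpha=a$ respectively. Then $\mathcal{X}_{i+1}$ is obtained from $\mathcal{X}_i$ by splitting every $B\in\mathcal{X}_i$³ according to the selected splitter $(\alpha,C)$: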

$$\mathcal{X}_{i+1}\triangleq\bigcup\{B/{\approx_{\alpha,C}}\mid B\in\mathcal{X}_i\} \tag{5}$$

where is the equivalence relation on states induced by the splitter and such that:

$$x\approx_{\alpha,C}y \;\stackrel{\triangle}{\iff}\; \rho(x,\hat{\alpha},C)=\rho(y,\hat{\alpha},C).$$

Note that a block can be split into more than two parts (exactly two being the case for non-deterministic systems), since splitting depends on the weights of the outgoing weak transitions.
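The refinement loop described above can be sketched in Python as follows. This is an illustrative sketch rather than the paper's optimized algorithm: the function name `refine` and the oracle `rho_hat(x, alpha, C)`, which is assumed to return a hashable weight for the weak transitions from `x` to the class `C`, are hypothetical, and no attempt is made to choose splitters efficiently.

```python
def refine(states, actions, rho_hat):
    """Refine {states} until no pair (alpha, C) is a splitter."""
    partition = [frozenset(states)]
    changed = True
    while changed:
        changed = False
        for alpha in actions:
            for C in list(partition):
                new_partition = []
                for B in partition:
                    # Group the states of B by the weight of reaching C.
                    groups = {}
                    for x in B:
                        groups.setdefault(rho_hat(x, alpha, C), set()).add(x)
                    new_partition.extend(frozenset(g) for g in groups.values())
                if len(new_partition) > len(partition):
                    partition = new_partition     # (alpha, C) was a splitter
                    changed = True
                    break
            if changed:
                break
    return set(partition)

# Hypothetical non-deterministic example with precomputed weak transitions:
# y --tau--> x --a--> z, and w is a deadlocked state.
weak = {("x", "tau", "x"), ("y", "tau", "y"), ("y", "tau", "x"),
        ("z", "tau", "z"), ("w", "tau", "w"), ("x", "a", "z"), ("y", "a", "z")}
rho_hat = lambda x, alpha, C: any((x, alpha, y) in weak for y in C)
classes = refine({"x", "y", "z", "w"}, ["tau", "a"], rho_hat)
```

Here `classes` consists of the two blocks `{x, y}` and `{z, w}`. With Boolean weights each block splits into at most two groups; with richer semirings the `groups` dictionary may hold one entry per distinct weight.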

### 5.1 Computing weak transitions

The algorithm outlined above follows the classical approach to the coarsest stable partition problem where stability is given in terms of weak weighted transitions like (and in general weighted sets of paths e.g. ) but nothing is assumed on how these values are computed. In this section, we show how weights of weak transitions can be obtained as solutions of systems of linear equations over the semiring . Clearly, for some specific cases and sets of paths, there may be more efficient ad hoc techniques (e.g. saturated transitions can be precomputed for non-deterministic LTSs); however, the linear system at the core of our algorithm is a general and flexible solution which can be readily adapted to other observational equivalences (cf. Example 4).

Let be a class. For every and let be a variable with domain the semiring carrier. Intuitively, once solved, these will represent:

$$x_\tau=\rho(\Lbag x,\tau^*,C\Rbag) \qquad x_a=\rho(\Lbag x,\tau^*a\tau^*,C\Rbag)$$

The linear system is given by the equation families (6), (7) and (8) which capture exactly the finite paths yielding the cones covering weak transitions.

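$$\begin{aligned}
x_\tau &= 1 && \text{for } x\in C && (6)\\
x_\tau &= \sum_{y\in X}\rho(x\xrightarrow{\tau}y)\cdot y_\tau && \text{for } x\notin C && (7)\\
x_a &= \sum_{y\in X}\rho(x\xrightarrow{a}y)\cdot y_\tau+\sum_{y\in X}\rho(x\xrightarrow{\tau}y)\cdot y_a && && (8)
\end{aligned}$$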

The system is given as a whole but it can be split into smaller sub-systems, which improves the efficiency of the resolution process. In fact, unknowns like depend only on those indexed by or and unknowns like depend only on those indexed by . Hence instead of a system of equations and unknowns, we obtain systems of equations and unknowns by first solving the sub-system for and then a separate sub-system for each action (where are now constant).
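The split into sub-systems can be sketched in Python as below. The solving method used here, a fixed number of iterations starting from the zero vector, is an assumption of this sketch and yields the exact solution only when the iteration stabilizes within `rounds` steps (otherwise it is an approximation from below); the function name `weak_weights` and its parameters are hypothetical. The semiring is passed explicitly as `zero`, `one`, `add` and `mul`.

```python
from fractions import Fraction
import operator

def weak_weights(states, actions, C, rho, zero, one, add, mul,
                 rounds=100, tau="tau"):
    """Iterate equations (6)-(8); rho(x, label, y) is a one-step weight."""
    def ssum(values):
        total = zero
        for v in values:
            total = add(total, v)
        return total

    # Sub-system (6)-(7): the unknowns x_tau depend only on each other.
    x_tau = {x: zero for x in states}
    for _ in range(rounds):
        x_tau = {x: one if x in C
                 else ssum(mul(rho(x, tau, y), x_tau[y]) for y in states)
                 for x in states}

    # Sub-systems (8): one per action, with the x_tau now constant.
    x_act = {}
    for a in actions:
        xa = {x: zero for x in states}
        for _ in range(rounds):
            xa = {x: add(ssum(mul(rho(x, a, y), x_tau[y]) for y in states),
                         ssum(mul(rho(x, tau, y), xa[y]) for y in states))
                  for x in states}
        x_act[a] = xa
    return x_tau, x_act

# Hypothetical probabilistic example with C = {c}:
# x --tau,1/2--> y, x --a,1/2--> c, y --a,1--> c.
W = {("x", "tau", "y"): Fraction(1, 2),
     ("x", "a", "c"): Fraction(1, 2),
     ("y", "a", "c"): Fraction(1)}
rho = lambda s, label, t: W.get((s, label, t), Fraction(0))
x_tau, x_act = weak_weights(["x", "y", "c"], ["a"], {"c"}, rho,
                            Fraction(0), Fraction(1),
                            operator.add, operator.mul, rounds=5)
print(x_act["a"]["x"])   # 1
```

In this acyclic example the iteration stabilizes after two rounds: `x` reaches `C` with an observable `a` either directly (weight 1/2) or through `y` (weight 1/2 times 1).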

###### Example 4 (Delay bisimulation).

Delay bisimulation is defined at the general level of WLTSs simply by replacing with in Definition 11. Then, delay bisimulation equivalence classes can be computed with the same algorithm simply by changing the saturation part at its core. Weights of sets like are computed as the solution to the linear equation system:

$$x_a=\sum_{y\in X}\rho(x\xrightarrow{\tau}y)\cdot y_a+\sum_{y\in C}\rho(x\xrightarrow{a}y).$$

#### Solvability

Decidability of the algorithm depends on the solvability of the equation system at its core and, in particular, on the existence and uniqueness of its solution. In this section we prove that this holds for every positively ordered -semiring. The results can be extended to -semirings provided that their -algebra covers the countable families used by Theorem 10.

The linear equation systems under consideration have a special form: they have exactly the same number of equations and unknowns (say ) and every unknown appears alone on the left side of exactly one equation. Therefore, these systems define an operator

$$F(x)=M\times x+b \tag{9}$$

over the space of -dimensional vectors where and are a -dimensional matrix and vector respectively defined by the equations of the system. Then, the solutions of the system are precisely the fix-points of the operator and since the number of equations and unknowns is the same, if has a fix-point, it is unique.

Let the semiring be positively ordered. These semirings admit a natural preorder which subsumes any preorder respecting the structure of the semiring; hence we restrict ourselves to the former. The point-wise extension of to -dimensional vectors defines the partial order with bottom ; suprema are lifted point-wise from where are sum-defined. Therefore, suprema of -chains exist only under the assumption of addition over at most countable families, and vice versa.

###### Lemma 8.

is -complete iff admits countable sums.

The operator manipulates its arguments only by additions and constant multiplications which respect the natural order. Thus is monotone with respect to . Moreover, preserves -chains suprema (and in general -families) because suprema for are defined by means of additions and the order is lifted point-wise.

###### Lemma 9.

The operator over is Scott-continuous.

Finally, we can state the main result of this Section from which decidability follows as a corollary.

###### Theorem 10.

Systems in the form of (9) have unique solutions if the underlying semiring is well-behaved and -complete.

###### Proof.

By Lemma 8, Lemma 9 and the Kleene Fix-point Theorem, has a least fix-point. Because the linear equation system has the same number of equations and unknowns, this solution is unique. ∎

The linear equation systems defined by the equation families (6), (7) and (8) have exactly one solution and hence the algorithm proposed is decidable. Moreover this holds also for any behavioural equivalence whose saturation can be expressed in a similar way e.g. delay bisimulation (cf. Example 4).
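As a worked instance of the fix-point construction, consider a hypothetical one-unknown system over the nonnegative reals with the usual sum and product, where $M=\left(\tfrac12\right)$ and $b=\left(\tfrac12\right)$, that is $F(x)=\tfrac12 x+\tfrac12$. Iterating $F$ from the bottom element $0$ yields the chain

$$0 \;\le\; \tfrac12 \;\le\; \tfrac34 \;\le\; \dots \;\le\; F^{n}(0)=1-2^{-n} \;\le\; \dots \qquad \sup_{n} F^{n}(0)=1,$$

and indeed $F(1)=\tfrac12+\tfrac12=1$. The supremum is reached only in the limit, which is why the existence of suprema of such chains is required of the semiring.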