# A finite state projection algorithm for the stationary solution of the chemical master equation

Ankit Gupta, Jan Mikelson and Mustafa Khammash
Department of Biosystems Science and Engineering
ETH Zurich
Mattenstrasse 26
4058 Basel, Switzerland.
###### Abstract

The chemical master equation (CME) is frequently used in systems biology to quantify the effects of stochastic fluctuations that arise due to biomolecular species with low copy numbers. The CME is a system of ordinary differential equations that describes the evolution of probability density for each population vector in the state-space of the stochastic reaction dynamics. For many examples of interest, this state-space is infinite, making it difficult to obtain exact solutions of the CME. To deal with this problem, the Finite State Projection (FSP) algorithm was developed by Munsky and Khammash (Jour. Chem. Phys. 2006), to provide approximate solutions to the CME by truncating the state-space. The FSP works well for finite time-periods but it cannot be used for estimating the stationary solutions of CMEs, which are often of interest in systems biology. The aim of this paper is to develop a version of FSP which we refer to as the stationary FSP (sFSP) that allows one to obtain accurate approximations of the stationary solutions of a CME by solving a finite linear-algebraic system that yields the stationary distribution of a continuous-time Markov chain over the truncated state-space. We derive bounds for the approximation error incurred by sFSP and we establish that under certain stability conditions, these errors can be made arbitrarily small by appropriately expanding the truncated state-space. We provide several examples to illustrate our sFSP method and demonstrate its efficiency in estimating the stationary distributions. In particular, we show that using a quantized tensor train (QTT) implementation of our sFSP method, problems admitting more than 100 million states can be efficiently solved.

Keywords: stochastic reaction networks; the Chemical Master Equation; Finite State Projection; stationary distribution; ergodicity; irreducibility; tensor trains

Mathematical Subject Classification (2010): 60J22; 60J27; 60H35; 65C40; 92E20

## 1 Introduction

Many intracellular reaction networks consist of biomolecular species that are typically present in low copy-numbers. The reactions involving these species fire intermittently at random times, rather than continuously. Hence deterministic descriptions of the reaction dynamics, based on Ordinary Differential Equations (ODEs), become highly inaccurate [1]. It is now well-known that macroscopic properties of the system can be heavily influenced by the intrinsic noise or randomness that arises due to the random timing of reactions [2]. Consequently stochastic formulations of the reaction dynamics, based on continuous-time Markov chains (CTMCs), have become a popular approach for studying the effects of intrinsic noise [3]. In this paper we provide a tool for the analysis of such models.

In the CTMC model of a reaction network, the state at any time is the vector of copy-number counts of all the species. When the number of network species is $d$, the dynamics evolves on a discrete state-space $E$ which is a subset of the $d$-dimensional nonnegative integer lattice $\mathbb{N}^d_0$, and this subset must be large enough to include all the states that are accessible by the random dynamics. The effects of intrinsic noise on the reaction network are generally studied using the probability distribution of the random state-vector at time $t$. It is known that the time-evolution of this probability distribution is given by a system of coupled ODEs, known as the Chemical Master Equation (CME) in the literature (see (2.7)). For each state in $E$ there is an ODE in the CME that captures the inflow and outflow of probability at that state. If this state-space is finite, then the CME is a finite system of linear ODEs which can in principle be solved to yield the probability distribution $p(t)$. However in many examples of biological interest, the state-space is infinite, making the CME impossible to solve exactly. A common approach in such cases is to estimate the CME solution by computing the empirical distribution of the samples obtained by simulating the CTMC using Monte Carlo methods such as Gillespie's Stochastic Simulation Algorithm (SSA) [4]. This simulation-based approach can be very time-consuming, and the estimates suffer from statistical errors because only finitely many samples are used. In particular, low-probability events are sparsely sampled by Monte Carlo simulations, which can lead to incorrect representations of the CME solution. Such problems can be avoided by using the Finite State Projection (FSP) method developed by Munsky and Khammash [5], which directly solves the CME by truncating the state-space to a manageable size (see Section 2.2).
The solution obtained is approximate but FSP provides an iterative way to ensure that the approximation error is within some pre-specified tolerance level.

The truncated state-space needed by FSP to solve the CME accurately is still exorbitantly large for many problems of interest. For example, consider a simple gene-expression network where ten protein species are interacting with each other. Typically each protein in a cell has copy-numbers of the order of several thousands. So even if we have a conservative upper-bound of 1000 on the copy-number of each protein, the size of the state-space required for FSP is of the order of $1000^{10} = 10^{30}$, which is beyond the computational and storage capacity of modern-day computers. This combinatorial explosion in the state-space size is often called the "curse of dimensionality" and it presents a major challenge in making the CME practically solvable. Several advanced numerical techniques have been developed that address this challenge by adapting the FSP. These techniques include Krylov subspace approximations [6], tensor-train representations [7], and sparse grids and aggregation methods [8]. Unlike these methods, which attempt to solve the exact version of the CME, there also exists a body of methods that aim to solve simplified versions of the CME, derived by approximating the CTMC dynamics by a Stochastic Differential Equation (SDE) or a Piecewise-deterministic Markov Process (PDMP) (see [9, 10, 11]). Such dynamical approximations only hold for finite time-periods, and the assumptions on species copy-numbers and reaction propensities that they require are not always satisfied by networks encountered in systems biology.

For many biological applications, one is interested in the steady-state behavior, which is captured by the stationary probability distribution $\pi$ to which the solution $p(t)$ of the CME converges as $t \to \infty$. For CTMCs whose state-space is finite and not too large, estimation of the stationary distribution is a simple linear-algebraic problem (see (1.1)). However in situations where the state-space is very large or infinite, this linear-algebraic problem cannot be practically solved, and we need to estimate $\pi$ by other means. The methods mentioned above for estimating $p(t)$ only work over finite time-intervals and they would generally fail to provide an accurate estimate of the stationary distribution $\pi$. The reason for this failure depends on the method being used. The dynamical approximations based on PDMPs or SDEs introduce an error that can become unbounded in the limit $t \to \infty$, and the Monte Carlo simulation based approach for estimating $\pi$ is highly undesirable due to statistical errors and the computational costs associated with these simulations over large time-intervals. The FSP algorithm also cannot be used for estimating $\pi$ because this method introduces an absorbing state to catch all the transitions that leave the truncated state-space (see Figure 1B). However in the limit $t \to \infty$, all the probability mass flows into this absorbing state, and so the obtained probability distributions are unable to capture the true stationary distribution $\pi$. We revisit this point later in this section and also explain it in detail in Section 2.2.

The aim of this paper is to present a FSP-like method for accurately estimating the stationary distribution $\pi$. This method also involves truncating the state-space, but rather than solving a linear system of ODEs for probabilities over the truncated state-space (as in FSP), our method estimates the true stationary distribution $\pi$ by computing the stationary distribution of a suitably defined CTMC over the truncated state-space. As the latter step can be accomplished by solving a linear-algebraic system, rather than a system of ODEs, the computational complexity of our method is much lower than that of FSP. Consequently it can be successfully applied to a larger class of networks. We call our method the stationary Finite State Projection (or sFSP) algorithm and we provide theoretical arguments to establish its accuracy under certain stability conditions which are usually satisfied by networks in systems and synthetic biology. Even though sFSP can be applied to larger systems than FSP, the combinatorial explosion of state-space sizes still severely limits the range of applicability of sFSP. As was the case with FSP, this issue can be somewhat resolved by adapting sFSP to work with quantized tensor-train (QTT) representations [7], sparse grids and aggregation methods [8]. We illustrate this point with a computational example where sFSP is applied to the QTT representation of the CME (see Section 5). We remark here that QTT representations have already been successfully employed for obtaining approximations of the stationary distribution for reaction networks satisfying certain graph-theoretic criteria [12, 13]. However these criteria are highly restrictive, and it will become evident that the sFSP based approach is more versatile.

We now describe the problem of estimating stationary distributions in more detail. Henceforth let $|A|$ denote the size of any set $A$, and let $\mathbf{0}$ and $\mathbf{1}$ denote the vector of all zeros and all ones respectively (the dimension of these vectors will be clear from the context). The stochastic model of a reaction network (see Section 2.1) represents the dynamics as a CTMC over a discrete state-space $E$. Such a CTMC can be described by its transition rate matrix $Q$ (see [14]), whose diagonal entries are non-positive, off-diagonal entries are non-negative and all the rows sum to zero. The stationary distribution for this CTMC can be described by a non-negative vector $\pi$ (throughout the paper we assume that vector and matrix indices start from $0$ rather than $1$), which is in the left null-space of the transition rate matrix $Q$, i.e.

$$Q^T \pi = 0, \tag{1.1}$$

and its components sum to $1$ (i.e. $\mathbf{1}^T \pi = 1$). Such a stationary distribution may not be unique, and if the state-space $E$ is infinite then it may not even exist (see [14]). In our recent work, we have dealt extensively with the issue of computationally verifying the existence and uniqueness of the stationary distribution corresponding to the CTMC models for a large class of biomolecular reaction networks (see [15] and [16]). Assuming that the existence and uniqueness of the stationary distribution has been ascertained for the network, our aim here is to estimate $\pi$ numerically. We are primarily interested in situations where $E$ is infinite, and so the direct computation of $\pi$ using (1.1) is computationally impossible.
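To make (1.1) concrete: when the state-space is finite and small, the stationary distribution can be obtained by replacing one redundant equation of the singular system $Q^T\pi = 0$ with the normalization constraint. The following Python sketch uses an arbitrary illustrative 3-state rate matrix, not a model from this paper:

```python
import numpy as np

# Illustrative 3-state CTMC transition rate matrix (each row sums to zero).
Q = np.array([[-2.0,  1.0,  1.0],
              [ 1.0, -3.0,  2.0],
              [ 2.0,  2.0, -4.0]])

n = Q.shape[0]
# Solve Q^T pi = 0 together with 1^T pi = 1. Since the rows of Q sum to
# zero, the equations in Q^T pi = 0 are linearly dependent, so one of them
# can be replaced by the normalization constraint.
A = Q.T.copy()
A[-1, :] = 1.0
b = np.zeros(n)
b[-1] = 1.0
pi = np.linalg.solve(A, b)
```

For an irreducible chain the modified system is non-singular and `pi` is the unique stationary distribution.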

It is natural to try to estimate $\pi$ by solving a finite, truncated version of the linear-algebraic system (1.1). This truncated version can be obtained by first identifying a truncated state-space $E_n \subset E$ and then projecting the CTMC dynamics on this truncated state-space. Thereafter the stationary distribution for the projected CTMC, found by solving the corresponding linear-algebraic system of the form (1.1), serves as an estimate of the true stationary distribution $\pi$. An important issue that arises here is how to handle the outgoing transitions from the truncated state-space, so that the obtained estimate of $\pi$ is accurate. In the FSP approach [5], these outgoing transitions are preserved but their target states are collapsed into a single absorbing state (see Figure 1B). This leads to the "probability leakage" problem which can be managed over finite time-intervals but not in the asymptotic regime. This problem manifests itself in the fact that the only stationary distribution for the projected CTMC would be the one that puts all the probability-mass at the absorbing state. Obviously this does not capture the true stationary distribution $\pi$, and hence modifications to the FSP approach are necessary to circumvent the probability-leakage problem. One such modification that has been tried is motivated by the use of "reflected" boundary conditions in the study of Fokker-Planck equations [12]. In this approach all the outgoing transitions from the truncated state-space are simply eliminated by setting their propensities to zero. It has been shown that this reflected version of FSP yields accurate estimates of the stationary distribution for some reaction network examples [17, 12]. However there is no theoretical guarantee that this approach will work well in general.

The method sFSP that we present in this paper modifies the FSP in another way. It preserves the outgoing transitions from the truncated state-space, but rather than channeling them to an absorbing state (as in FSP), it redirects them to a designated state within the truncated state-space (see Figure 1C). This modification is simple to implement, and its appealing feature is that for a wide range of biomolecular reaction networks, we can theoretically guarantee that the stationary distribution of the projected CTMC converges to the actual stationary distribution $\pi$ as the truncated state-space expands to the full state-space $E$. Moreover we derive bounds for the approximation error incurred by this approach, in terms of the outflow rate of all the outgoing transitions evaluated at the estimated stationary distribution (see Theorem 3.1). These results provide the theoretical basis for our method, which expands the truncated state-space iteratively to recover a "good" approximation of $\pi$. Note that our approach for estimating the stationary distribution is very different from the stochastic complementation approach proposed in [18]. This complementation approach is generally difficult to implement for infinite state-space CTMCs, and it only yields the conditional stationary distribution, which can then be used to derive upper and lower bounds for the true stationary probabilities. However such bounds are not guaranteed to be close to each other. In contrast, our method allows one to estimate the true stationary probabilities directly.

For our method sFSP to work we require that the original CTMC representing the reaction network satisfies a couple of stability conditions. The first condition is that the state-space $E$ needs to be irreducible, i.e. all the states in $E$ must be accessible from each other via a sequence of positive-propensity reactions. The second condition is a Foster-Lyapunov criterion (see [19]) which ensures that the original CTMC is exponentially ergodic, i.e. the solution of the CME converges to the stationary distribution exponentially fast. We elaborate on these stability conditions in Section 3.1, where we also explain how these conditions can be easily checked for a wide range of networks arising in systems and synthetic biology, using the computational procedures developed in our recent papers [15] and [16]. This makes the proposed sFSP method broadly applicable and of interest to the growing community of researchers working with stochastic models of biomolecular reaction networks.

This paper is organized as follows. In Section 2 we describe the stochastic model and the original FSP method [5]. In Section 3 we present and mathematically analyze our stationary Finite State Projection (or sFSP) algorithm. A simple implementation of sFSP is presented in Section 4 while its QTT implementation is presented in Section 5. These sections also include the computational examples that illustrate the respective implementations. Finally in Section 6 we conclude and discuss directions for future research.

## 2 Preliminaries

### 2.1 The stochastic model

We now formally describe the CTMC model of a reaction network. Suppose this network has $d$ species, called $\mathbf{X}_1, \dots, \mathbf{X}_d$, and $K$ reactions of the form

$$\sum_{i=1}^{d} \nu_{ik} \mathbf{X}_i \longrightarrow \sum_{i=1}^{d} \nu'_{ik} \mathbf{X}_i. \tag{2.2}$$

Here $\nu_{ik}$ and $\nu'_{ik}$ are nonnegative integers denoting the number of molecules of species $\mathbf{X}_i$ that are consumed and produced by the $k$-th reaction. The state of the system at any time is the vector of molecular counts of all the species. When the $k$-th reaction fires, the state is displaced by the stoichiometric vector $\zeta_k$ whose $i$-th component is $(\nu'_{ik} - \nu_{ik})$. At any state $x$, the rate of the $k$-th reaction is $\lambda_k(x)$, where $\lambda_k$ is the propensity function for this reaction. Commonly mass action kinetics (see [3]) is assumed, where each $\lambda_k$ is given by

$$\lambda_k(x_1,\dots,x_d) = \theta_k \prod_{i=1}^{d} \frac{x_i(x_i-1)\cdots(x_i-\nu_{ik}+1)}{\nu_{ik}!}, \tag{2.3}$$

with the positive parameter $\theta_k$ being the associated rate constant. We model the reaction dynamics as a CTMC which jumps from state $x$ after a random waiting time which is exponentially distributed with rate $\lambda_0(x) = \sum_{k=1}^{K}\lambda_k(x)$, and this jump is in direction $\zeta_k$ with probability $\lambda_k(x)/\lambda_0(x)$. Formally this CTMC can be specified by its generator (the generator of a Markov process is an operator which specifies the rate of change of the probability distribution of the process; see Chapter 4 in [20] for details), defined as

$$Qf(x) = \sum_{k=1}^{K} \lambda_k(x)\left(f(x+\zeta_k) - f(x)\right), \tag{2.4}$$

where $f$ is any bounded real-valued function on $\mathbb{N}^d_0$.
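For illustration, the mass-action propensity (2.3) is just $\theta_k$ times a product of binomial coefficients, $\lambda_k(x) = \theta_k \prod_{i} \binom{x_i}{\nu_{ik}}$, and can be evaluated with a short helper (hypothetical code, not from the paper):

```python
from math import comb

# Mass-action propensity (2.3): lambda_k(x) = theta_k * prod_i C(x_i, nu_ik),
# where C(x_i, nu_ik) = x_i(x_i - 1)...(x_i - nu_ik + 1) / nu_ik!.
def propensity(theta, x, nu):
    lam = theta
    for xi, nik in zip(x, nu):
        lam *= comb(xi, nik)  # comb returns 0 when xi < nik (too few molecules)
    return lam
```

For example, a reaction consuming two molecules of $\mathbf{X}_1$ and one of $\mathbf{X}_2$ with rate constant $2$ has propensity $2\binom{3}{2}\binom{1}{1} = 6$ at the state $(3, 1)$.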

From now on we suppose that there is a nonempty state-space $E \subset \mathbb{N}^d_0$ on which the CTMC evolves, i.e.

$$\text{for each } x \in E \text{ and } k = 1,\dots,K, \quad \text{if } \lambda_k(x) > 0 \text{ then } (x + \zeta_k) \in E. \tag{2.5}$$

In other words, if at state $x$ reaction $k$ has a positive probability of firing, then the resulting state $(x + \zeta_k)$ must also be in $E$. As $E$ is at most countable, it can be enumerated. This means that we can find a one-to-one and onto map from $E$ to the set $\{0, 1, 2, \dots\}$. Once such an enumeration is fixed, the set $E$ can be expressed as $E = \{x_0, x_1, x_2, \dots\}$, where $x_i$ is the state with index $i$. Then the CTMC generator can be expressed as the transition rate matrix $Q$ given by (here we assume for convenience that all stoichiometry vectors $\zeta_k$ are distinct)

$$Q_{ij} = \begin{cases} -\sum_{k=1}^{K} \lambda_k(x_i) & \text{if } i = j, \\ \lambda_k(x_i) & \text{if } x_j = x_i + \zeta_k \text{ for some } k, \\ 0 & \text{otherwise.} \end{cases}$$
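As a sketch of how these entries are assembled in practice, consider a hypothetical birth-death network $\emptyset \to \mathbf{X}$ (rate $b$) and $\mathbf{X} \to \emptyset$ (propensity $x$), with the state-space truncated at $N$ copies purely for illustration (the birth transition at $x = N$ is simply dropped):

```python
import numpy as np

# Illustrative parameter choices, not from the paper.
b, N = 5.0, 50
Q = np.zeros((N + 1, N + 1))
for x in range(N + 1):
    if x < N:                  # birth reaction, stoichiometry +1
        Q[x, x + 1] = b
    if x > 0:                  # degradation reaction, stoichiometry -1
        Q[x, x - 1] = float(x)
    Q[x, x] = -Q[x].sum()      # diagonal entry makes the row sum to zero
```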

Let $(X(t))_{t \geq 0}$ be the CTMC with this transition rate matrix and some initial state $X(0) \in E$. For any state $x \in E$, let

$$p(t,x) = \mathbb{P}(X(t) = x) \tag{2.6}$$

be the probability that the CTMC is in state $x$ at time $t$. These probabilities evolve in time according to the Chemical Master Equation (CME) given by

$$\frac{dp(t,x)}{dt} = \sum_{k=1}^{K} p(t, x - \zeta_k)\lambda_k(x - \zeta_k) - p(t,x)\sum_{k=1}^{K}\lambda_k(x), \tag{2.7}$$

for each $x \in E$. Note that this system has as many ODEs as the number of elements in the state-space $E$, which is generally infinite or very large.

Let $p(t, \cdot)$ be the probability distribution defined by

$$p(t, A) = \sum_{x \in A} p(t,x) \tag{2.8}$$

for any $A \subset E$. The vectorized form of $p(t, \cdot)$ w.r.t. the enumeration is simply given by

$$p(t) = (p(t,x_0),\, p(t,x_1),\, p(t,x_2),\, \dots)$$

and using this form we can express the CME as

$$\frac{dp}{dt} = Q^T p(t). \tag{2.9}$$

If the number of states in $E$ is finite, then this first-order system can in principle be solved by exponentiating the matrix $Q^T$, i.e. the solution is given by

$$p(t) = \exp(Q^T t)\, p(0) \quad \text{for any } t \geq 0,$$

where $p(0)$ is the vectorized form of the probability distribution of the initial state $X(0)$. However this approach is infeasible for large state-spaces, and in such cases the Finite State Projection (FSP) method [5] can be used to approximately solve the CME (see Section 2.2).
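A minimal sketch of this matrix-exponential solution for a hypothetical two-state switch (the rates $k_{\text{on}} = 1$, $k_{\text{off}} = 2$ are illustrative choices, not from the paper):

```python
import numpy as np
from scipy.linalg import expm

# Two-state CTMC: "off" -> "on" at rate 1, "on" -> "off" at rate 2.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
p0 = np.array([1.0, 0.0])      # start in the "off" state
p = expm(Q.T * 10.0) @ p0      # p(t) = exp(Q^T t) p(0) at t = 10

# For this chain the stationary distribution is (2/3, 1/3), and p(t)
# approaches it exponentially fast.
```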

In many biological applications, rather than the finite-time behavior, one is interested in the properties of the system after it has settled down; in other words, the CTMC has reached a steady-state which is characterized by a stationary distribution $\pi$ satisfying (1.1), that is essentially a fixed-point for the CME (2.9). We say that the CTMC is ergodic if this fixed-point is unique and globally attracting, in the sense that for any initial probability distribution $p(0)$, the solution $p(t)$ of (2.9) satisfies

$$\lim_{t \to \infty} \|p(t) - \pi\|_{\ell_1} = 0, \tag{2.10}$$

where $\|\cdot\|_{\ell_1}$ denotes the $\ell_1$-distance between probability measures (generally ergodicity is defined using the total variation distance between probability distributions; however for a discrete state-space the total variation distance is exactly half of the $\ell_1$ distance, so we work with the $\ell_1$ norm in this paper). Furthermore the CTMC is called exponentially ergodic if the convergence in (2.10) is exponentially fast, i.e. there exist positive constants $C$ and $\rho$ such that for any $t \geq 0$

$$\|p(t) - \pi\|_{\ell_1} \leq C e^{-\rho t}.$$

Here the constant $C$ may depend on the initial distribution $p(0)$ but the constant $\rho$ does not (see [19] for example).

### 2.2 The Finite State Projection Algorithm

In the FSP method, approximate solutions of the CME (2.9) are obtained by restricting it to a truncated state-space. Suppose this truncated subset is given by a finite set $E_n \subset E$ of size $n$. Using the same enumeration as in Section 2.1, we can express this set as $E_n = \{x_{j_1}, \dots, x_{j_n}\}$. Letting $Q_n$ be the $n \times n$ matrix formed by the rows and columns of matrix $Q$ corresponding to the states in $E_n$, we approximate (2.9) by the $n$-dimensional linear system

$$\frac{dp_n}{dt} = Q_n^T p_n(t). \tag{2.11}$$

The solution of this system is simply $p_n(t) = \exp(Q_n^T t)\, p_n(0)$, where $p_n(0)$ is the $n$-dimensional vector containing the components of vector $p(0)$ corresponding to the states in $E_n$.

Let $\mathbf{1}$ be the $n$-dimensional vector of all ones. We assume that the initial state can only take values in $E_n$, and so $\mathbf{1}^T p_n(0) = 1$. It is easy to check that all the rows of matrix $Q_n$ have a non-positive sum, which implies that for any $t \geq 0$

$$\epsilon_n(t) := 1 - \mathbf{1}^T p_n(t) = 1 - \mathbf{1}^T \exp(Q_n^T t)\, p_n(0) \geq 0.$$

Results in [5] show that $\epsilon_n(t)$ quantifies the "error" between the actual solution $p(t)$ of the CME and its approximation $p_n(t)$. For any fixed $t$, this error decreases monotonically with increasing values of $n$. Moreover as $n \to \infty$ and the truncated state-space $E_n$ approaches the full state-space $E$, we have $\epsilon_n(t) \to 0$. In the FSP algorithm of [5], the final time $t_f$ is fixed and the system (2.11) is solved in the time-interval $[0, t_f]$ with some truncated state-space $E_n$. Thereafter the error $\epsilon_n(t_f)$ is evaluated, and if this value is below some pre-specified tolerance level $\epsilon$, then the algorithm is terminated. Otherwise the truncated state-space is expanded to include more states and the same process is repeated. After finitely many such iterations, the truncated state-space becomes large enough to ensure that the tolerance criterion is met.
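The FSP error check can be sketched for a hypothetical birth-death model (birth rate $b = 5$, unit per-molecule degradation; all values illustrative): truncate the state-space to $\{0, \dots, n\}$, compute $\epsilon_n(t)$, and expand $n$ until the error is acceptable:

```python
import numpy as np
from scipy.linalg import expm

def fsp_error(n, b=5.0, t=10.0):
    """FSP error eps_n(t) = 1 - 1^T exp(Q_n^T t) p_n(0) on states {0,...,n}."""
    Qn = np.zeros((n + 1, n + 1))
    for x in range(n + 1):
        if x < n:
            Qn[x, x + 1] = b         # birth; leaks out of the truncation at x = n
        if x > 0:
            Qn[x, x - 1] = float(x)  # degradation
        Qn[x, x] = -b - x            # total outgoing rate, including the leak
    p0 = np.zeros(n + 1)
    p0[0] = 1.0
    pn = expm(Qn.T * t) @ p0
    return 1.0 - pn.sum()

# Expanding the truncated state-space shrinks the error.
errs = [fsp_error(n) for n in (10, 20, 40)]
```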

Another way to formulate the FSP method is to consider the projected CTMC over the state-space $E_n$ augmented with a single absorbing state, whose transitions among the states in $E_n$ are the same as in the original CTMC, but any outgoing transitions from the set $E_n$ are absorbed in this absorbing state (see Figure 1B), which serves as a proxy for all the states in the set $E \setminus E_n$. Enumerating the elements of $E_n$ as $\{x_{j_1}, \dots, x_{j_n}\}$, the transition rate matrix for this projected CTMC is given by

$$\tilde{Q}_n = \begin{bmatrix} Q_n & c_n \\ 0 & 0 \end{bmatrix}, \tag{2.12}$$

where $c_n$ is the $n$-dimensional column vector whose $i$-th component is

$$c_{n,i} = \sum_{k=1,\ (x_{j_i} + \zeta_k) \notin E_n}^{K} \lambda_k(x_{j_i}). \tag{2.13}$$

This choice of $c_n$ ensures that all the rows of matrix $\tilde{Q}_n$ sum to $0$, and hence $\tilde{Q}_n$ is a valid transition rate matrix. Let $\tilde{p}_n(t)$ be the solution of the CME (2.9) corresponding to rate matrix $\tilde{Q}_n$ and with initial value $\tilde{p}_n(0) = (p_n(0), 0)$. Then it can be shown that for any $t \geq 0$ we can express $\tilde{p}_n(t)$ as

$$\tilde{p}_n(t) = (p_n(t),\, \epsilon_n(t)),$$

which proves that the FSP approximation error at time $t$ is exactly the amount of probability-mass that has been absorbed by the absorbing state in the time-interval $[0, t]$.

One can show that typically for any fixed truncated state-space $E_n$, we would have $\epsilon_n(t) \to 1$ as $t \to \infty$, which says that eventually all the probability mass gets absorbed by the absorbing state. Therefore even if the original CTMC is ergodic and the solution of the CME (2.7) converges to $\pi$ as $t \to \infty$, the approximate solution $p_n(t)$ obtained by solving the FSP system (2.11) will not be close to $\pi$ for large times, and in fact converges to a vector of all zeros as $t \to \infty$. This is also evident from the stationary distribution that can be computed by finding a non-zero solution to the linear-algebraic system (1.1) corresponding to the matrix $\tilde{Q}_n$ (see (2.12)). This would yield a stationary distribution of the form $(\mathbf{0}, 1)$, which assigns all the mass to the absorbing state and hence cannot be close to $\pi$. This shows that the FSP approach is not conducive for the estimation of the stationary distribution $\pi$.

## 3 The stationary FSP method

In this section we present our method sFSP for estimating the stationary distribution $\pi$ for the CTMC model of a reaction network. This is accomplished by constructing a projected CTMC over the truncated state-space and computing the stationary distribution of this new CTMC. Keeping the same notation as in Section 2.2, this projected CTMC over the truncated state-space $E_n$ is constructed by redirecting the transitions that leave $E_n$ to some designated state in $E_n$ (see Figure 1C). Let the matrix $Q_n$ and the vector $c_n$ be as in Section 2.2. Then the transition rate matrix for this CTMC is simply given by

$$\bar{Q}_n = Q_n + c_n b_\ell, \tag{3.14}$$

where $\ell$ corresponds to the address of the designated state within the enumeration of $E_n$ (i.e. the designated state is $x_\ell$) and $b_\ell$ is the $n$-dimensional row vector whose $\ell$-th component is $1$ and the rest are all zeros. Essentially, $\bar{Q}_n$ is formed by adding the non-negative vector $c_n$ to the $\ell$-th column of matrix $Q_n$. All the rows of matrix $\bar{Q}_n$ sum to $0$, and hence $\bar{Q}_n$ is a valid transition rate matrix and our projected CTMC is well-defined. Our method sFSP estimates the stationary distribution $\pi$ by computing the finite-dimensional stationary distribution $\bar{\pi}_n$ for the projected CTMC with transition rate matrix $\bar{Q}_n$. Using $\bar{\pi}_n$, we can also compute the overall outflow rate at the estimated stationary distribution by

$$r^{(n)}_{\text{out}} = c_n^T \bar{\pi}_n. \tag{3.15}$$

This quantity will play a key role in bounding the sFSP approximation error whose direct computation is impossible.
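A minimal sketch of the sFSP construction (3.14)-(3.15) for the same kind of hypothetical birth-death model used above (birth rate $b = 5$, unit per-molecule degradation, truncation at $n = 30$; all values illustrative), with the leaked birth transition at the boundary redirected to the designated state $x_\ell = 0$:

```python
import numpy as np

b, n, l = 5.0, 30, 0
Qn = np.zeros((n + 1, n + 1))
cn = np.zeros(n + 1)             # outflow rates (2.13)
for x in range(n + 1):
    if x < n:
        Qn[x, x + 1] = b
    else:
        cn[x] = b                # birth at x = n would leave the truncation
    if x > 0:
        Qn[x, x - 1] = float(x)
    Qn[x, x] = -b - x
Qbar = Qn.copy()
Qbar[:, l] += cn                 # (3.14): redirect the outflow to state x_l

# Stationary distribution of the projected chain: Qbar^T pi = 0, 1^T pi = 1.
A = Qbar.T.copy()
A[-1, :] = 1.0
rhs = np.zeros(n + 1)
rhs[-1] = 1.0
pi_bar = np.linalg.solve(A, rhs)
r_out = cn @ pi_bar              # outflow rate (3.15) at the estimate
```

For this toy model the untruncated chain has the Poisson($5$) stationary distribution; `pi_bar` is extremely close to it and `r_out` is tiny, signaling that the truncation is adequate.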

### 3.1 Analysis of sFSP

The aim of this section is to demonstrate that under certain conditions, which are commonly satisfied by biological reaction networks, the sFSP approximation error can be made arbitrarily small by picking a truncated state-space $E_n$ that is large enough. Moreover it is possible to check if $E_n$ is large enough by computing a convergence factor $\gamma^{(n)}_V$, which is defined by suitably scaling the outflow rate $r^{(n)}_{\text{out}}$. The main results of this section are collected in Theorem 3.1 and they provide the theoretical basis for our sFSP method.

Before we present our result we need to discuss some preliminary concepts. The state-space $E$ of the original CTMC is called irreducible if this CTMC has a positive probability of reaching any state in $E$ from any other state in $E$ in a finite time. More formally, the state-space $E$ is irreducible if for any $x, y \in E$ we have $\mathbb{P}(X(t) = y \mid X(0) = x) > 0$ for some $t > 0$. In our setting of reaction networks, this is equivalent to saying that between any two states $x, y \in E$ there exists a sequence of positive-propensity reactions that takes the dynamics from $x$ to $y$. For this to hold there must exist a sequence of reactions $k_1, \dots, k_m$ such that $y = x + \zeta_{k_1} + \dots + \zeta_{k_m}$, and at each intermediate state the next reaction in the sequence has a positive propensity of firing. When only finitely many states are accessible by the reaction dynamics, irreducible state-spaces can be easily found by manipulating the transition rate matrix $Q$ (see [21]). However when infinitely many states are accessible, finding irreducible state-spaces within the infinite lattice $\mathbb{N}^d_0$ becomes a complicated task. In a recent work [15] we address this challenge and develop a computational procedure that can find all the irreducible state-spaces for a large class of biological reaction networks. In particular, for most networks of interest each irreducible state-space has the form (to obtain this form, relabeling of species may be required)

$$E = E_b \times \mathbb{N}^{d_f}_0, \tag{3.16}$$

where $E_b$ is a finite set in $\mathbb{N}^{d_b}_0$, and $d_b$ and $d_f$ are non-negative integers summing up to the total number of species $d$. Here $E_b$ contains the dynamics of bounded species whose copy-numbers are required to satisfy a positive mass-conservation relation. A typical example is a gene-expression network where the gene of interest has many (say $m$) activity modes. To represent the dynamics we need to represent each such mode by a different network species, but all these species will be bounded and their copy-numbers will evolve in a finite set $E_b$, because the gene of interest has a fixed copy-number (see the Pap-Switch example in Section 4.3 for instance). The species that are not bounded are free to have any copy-number, and hence the state-space for their dynamics is taken to be the full non-negative integer orthant $\mathbb{N}^{d_f}_0$. (Apart from free and bounded species, there may also exist another type of species, called restricted species, whose dynamics essentially mimics the dynamics of free species according to some affine map. However these restricted species can be easily eliminated to obtain a dynamically equivalent network, and hence we ignore such species here; see [15] for more details.)

Note that the property of ergodicity (see Section 2.1) will obviously fail if there does not exist any stationary distribution, or if there exists more than one stationary distribution, for the CTMC. If the state-space $E$ is finite, then its irreducibility is sufficient to guarantee that the stationary distribution exists uniquely and the CTMC is exponentially ergodic (see [21]). However when $E$ is infinite, its irreducibility can only guarantee the uniqueness of a stationary distribution, but the existence of this distribution must be checked by other means, for example, using the results in [22] and [19]. In particular, Theorem 7.1 in [19] guarantees the existence of a stationary distribution along with exponential ergodicity, if we can construct a function $V : E \to \mathbb{R}_+$ which is norm-like (i.e. $V(x) \to \infty$ as $\|x\| \to \infty$) such that for some constants $C_1, C_2 > 0$, the following holds for all $x \in E$:

$$QV(x) \leq C_1 - C_2 V(x), \tag{3.17}$$

where $Q$ is the CTMC generator given by (2.4). This condition is called the Foster-Lyapunov criterion in the literature, and it describes the tendency of the CTMC to experience a drift towards some finite set in the state-space with a force that is proportional to the distance from this finite set, measured according to $V$. In [16] it is shown that for many biomolecular reaction networks, a linear Foster-Lyapunov function

$$V(x) = 1 + \langle v, x \rangle, \tag{3.18}$$

satisfying (3.17) can be constructed. Here $v \in \mathbb{R}^d$ is a positive vector which is chosen using simple Linear Programming, and $\langle \cdot, \cdot \rangle$ denotes the standard inner product in $\mathbb{R}^d$. Observe that for the linear function (3.18), the drift condition (3.17) is simply

$$\sum_{k=1}^{K} \lambda_k(x)\langle v, \zeta_k \rangle \leq C_1 - C_2(1 + \langle v, x \rangle) \quad \text{for all } x \in E. \tag{3.19}$$

As demonstrated in [16], often for biological reaction networks the vector $v$ can be chosen in such a way that along with this drift condition, the following diffusivity condition is also satisfied: for some constants $C_3, C_4 > 0$,

$$\sum_{k=1}^{K} \lambda_k(x)\langle v, \zeta_k \rangle^2 \leq C_3 + C_4(1 + \langle v, x \rangle) \quad \text{for all } x \in E. \tag{3.20}$$

When (3.19) and (3.20) hold simultaneously, then in addition to exponential ergodicity, one can also guarantee other desirable properties like finiteness of all statistical moments of the stationary distribution and convergence of all the moments of the CTMC to their steady-state values as time approaches infinity (see Theorem 5 in [16]).
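As a toy illustration (our own example, not from the paper): for the birth-death network $\emptyset \to \mathbf{X}$ (rate $b$), $\mathbf{X} \to \emptyset$ (propensity $x$) with $d = 1$ and $v = 1$, so that $V(x) = 1 + x$, the drift condition (3.19) reads $b - x \leq C_1 - C_2(1 + x)$ and the diffusivity condition (3.20) reads $b + x \leq C_3 + C_4(1 + x)$. Candidate constants are easy to verify numerically:

```python
# Drift and diffusivity check for the birth-death toy model with b = 5.
# The constants below are our own choices: C1 = b + 1, C2 = 1 make the
# drift bound an equality, and C3 = b, C4 = 1 dominate the diffusivity term.
b = 5.0
C1, C2 = b + 1.0, 1.0
C3, C4 = b, 1.0

drift_ok = all(b - x <= C1 - C2 * (1 + x) for x in range(1000))
diff_ok = all(b + x <= C3 + C4 * (1 + x) for x in range(1000))
```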

To study the sFSP approximation error we need to work with the norm prescribed by the Foster-Lyapunov function $V$. For any signed measure $\mu$ on $E$, this norm is given by

$$\|\mu\|_V = \sum_{x \in E} |\mu(x)|\, V(x).$$

Note that this norm is tighter than the $\ell_1$ norm because $V(x) \to \infty$ as $\|x\| \to \infty$. Let $B(E_n)$ denote the boundary of the truncated state-space $E_n$, which includes all those states in $E_n$ for which there exists a positive-propensity reaction that takes the dynamics outside $E_n$, i.e.

$$B(E_n) = \{x \in E_n : \lambda_k(x) > 0 \text{ and } (x + \zeta_k) \notin E_n \text{ for some } k = 1,\dots,K\}. \tag{3.21}$$

Based on the outflow rate $r^{(n)}_{\text{out}}$ given by (3.15), we define the convergence factor $\gamma^{(n)}_V$ as

$$\gamma^{(n)}_V = r^{(n)}_{\text{out}}\, \|E_n\|_V, \tag{3.22}$$

where

$$\|E_n\|_V = V(x_\ell) + \max_{x \in B(E_n)} V(x) \tag{3.23}$$

and $x_\ell$ is the designated state. Our next result will show that the convergence factor $\gamma^{(n)}_V$ is a useful diagnostic tool to assess the approximation error of sFSP. Note that unlike the approximation error, $\gamma^{(n)}_V$ can be explicitly computed from the sFSP output if the Foster-Lyapunov function $V$ is known. In situations where $V$ is unknown, the definition of $\gamma^{(n)}_V$ can often be suitably modified to preserve its diagnostic purpose (see Remark 3.3).
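For the birth-death illustration used earlier, the boundary is the single state $B(E_n) = \{n\}$, so with $V(x) = 1 + x$ and designated state $x_\ell = 0$ we get $\|E_n\|_V = V(0) + V(n) = n + 2$. A sketch of the resulting convergence factor (3.22), where the outflow value plugged in is purely illustrative:

```python
# Convergence factor (3.22) for the 1-d birth-death example with
# V(x) = 1 + x, designated state x_l and boundary B(E_n) = {n}.
def convergence_factor(r_out, n, l=0):
    V = lambda x: 1.0 + x
    return r_out * (V(l) + V(n))   # r_out * ||E_n||_V, cf. (3.23)

gamma = convergence_factor(1e-13, 30)
```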

We now come to the main result of our paper.

###### Theorem 3.1

Suppose that the state-space $E$ is irreducible for the original CTMC with transition rate matrix $Q$, and that there exists a Foster-Lyapunov function $V$ satisfying (3.17). Also assume that $(E_n)_{n \geq 1}$ is a sequence of finite sets that is increasing (i.e. $E_n \subset E_m$ if $n \leq m$) and that covers the full state-space $E$ in the limit $n \to \infty$. Fix a designated state $x_\ell$ and let $\bar{Q}_n$ be the transition rate matrix of our projected CTMC with state-space $E_n$, defined according to (3.14). Then we have the following:

• (A) The stationary distribution $\bar{\pi}_n$ for the projected CTMC exists uniquely.

• (B) As $n\to\infty$, $\bar{\pi}_n$ converges to the stationary distribution $\pi$ for the original CTMC, in the $\ell_1$ metric, i.e.

 $$\lim_{n\to\infty} \|\pi - \bar{\pi}_n\|_{\ell_1} = 0. \tag{3.24}$$
• (C) There exists a positive constant $M$ such that for any $n$

 $$\|\pi - \bar{\pi}_n\|_V \le M\,\gamma^{(n)}_V, \tag{3.25}$$

where $\gamma^{(n)}_V$ is the convergence factor defined by (3.22).

• (D) Suppose that the Foster-Lyapunov function $V$ has the linear form (3.18) and the positive vector $v$ is such that both (3.19) and (3.20) are satisfied. Furthermore assume that the sequence of sets $\{E_n\}$ grows uniformly w.r.t. the function $V$, which means that for some constant $\theta > 0$ we have

 $$\min_{x\in B(E_n)} V(x) \ge \theta \max_{x\in B(E_n)} V(x) \quad\text{for all } n=1,2,\dots, \tag{3.26}$$

where $B(E_n)$ is the boundary of $E_n$ defined by (3.21). Then there exists a constant $M' > 0$ for which the converse of (3.25) also holds, i.e. for each $n$

 $$\|\pi - \bar{\pi}_n\|_V \ge M'\,\gamma^{(n)}_V. \tag{3.27}$$

Furthermore, the convergence factor $\gamma^{(n)}_V$ converges to $0$ as $n\to\infty$.

###### Remark 3.2

It will become evident from the proof that if the Foster-Lyapunov function $V$ and the constants $C_1, C_2$ in (3.17) are known, then a constant $M$ satisfying part (C) can be explicitly computed using the results in Meyn and Tweedie [23]. Hence part (C) provides a computable upper-bound for the approximation error $\|\pi - \bar{\pi}_n\|_V$. Similarly the constant $M'$ satisfying part (D) may be explicitly computed from the constants in (3.19) and (3.20), and the constant $\theta$ that appears in (3.26). The tightness of the error bounds obtained from these explicitly computable constants remains to be investigated. Nevertheless parts (C) and (D) are useful in demonstrating that, up to a constant, the magnitude of the uncomputable approximation error can be assessed by computing the convergence factor $\gamma^{(n)}_V$. In other words, if $\gamma^{(n)}_V \le \epsilon$ then $\|\pi - \bar{\pi}_n\|_V \le M\epsilon$, and similarly if $\gamma^{(n)}_V \ge \epsilon$ then $\|\pi - \bar{\pi}_n\|_V \ge M'\epsilon$, where $M$ and $M'$ are the optimal constants for which parts (C) and (D) hold.

###### Remark 3.3

Note that computation of the convergence factor (3.22) requires knowledge of the Foster-Lyapunov function $V$, which is undesirable from the point of view of applications. However it is possible to circumvent this problem if one has information about the form of $V$ and the shape of the finite sets $E_n$. For this one needs to pick a sequence $\{\beta_n\}$ such that for some constants $\alpha \ge \alpha' > 0$

 $$\frac{1}{\alpha}\|E_n\|_V \le \beta_n \le \frac{1}{\alpha'}\|E_n\|_V$$

holds for each $n$, with $\|E_n\|_V$ defined by (3.23). Then one can define the convergence factor $\gamma_n$ as

 $$\gamma_n = r^{(n)}_{\mathrm{out}}\,\beta_n, \tag{3.28}$$

with the outflow rate $r^{(n)}_{\mathrm{out}}$ given by (3.15), and parts (C) and (D) will hold with the substitutions $\gamma^{(n)}_V \to \gamma_n$, $M \to \alpha M$ and $M' \to \alpha' M'$. For example, if $V$ has the linear form (3.18), then one can define $\beta_n$ in the same way as $\|E_n\|_V$ but with $V$ replaced by any norm on $\mathbb{R}^d$.

Proof. We start by proving part (A). The stationary distribution for the projected CTMC certainly exists because the transition rate matrix $\bar{Q}_n$ is finite (see [21]). This stationary distribution can be found by solving the linear-algebraic system (1.1) with transition-rate matrix $\bar{Q}_n$. We now prove by contradiction the uniqueness of this stationary distribution. Suppose that this uniqueness does not hold. Then there would exist at least two disjoint non-empty irreducible state-spaces (say $S_1$ and $S_2$) for the projected CTMC within the state-space $E_n$. This implies that if the projected CTMC starts in set $S_1$ then it remains in this set for all times, and there is a positive probability for this CTMC to reach any state in $S_1$ from any other state in $S_1$ in a finite time. The same holds true for set $S_2$. Certainly one of these sets, say $S_1$, will not contain the designated state $x_\ell$, but this leads to a contradiction due to the following reasons. Since the state-space $E$ is irreducible for the original CTMC, there exists a sequence of reactions that takes the original CTMC from any state $x \in S_1$ to the designated state $x_\ell$ with a positive probability. If all the intermediate states that arise in this reaction path (the states obtained by successively adding the stoichiometric vectors $\zeta_k$) lie within the set $E_n$, then the same sequence of reactions will also take the projected CTMC from state $x$ to state $x_\ell$, which is a contradiction because $S_1$ is an irreducible state-space not containing $x_\ell$. On the other hand, if one of the intermediate states lies outside $E_n$, then the last reaction in the sequence that takes the dynamics outside $E_n$ will be redirected to the designated state $x_\ell$ in the projected CTMC, and hence again we have a contradiction because we obtain a positive-probability sequence of reactions that takes the projected CTMC from state $x$ to state $x_\ell$. Therefore the stationary distribution for the projected CTMC is unique, and this completes the proof of part (A).
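The redirection construction used in this proof is easy to implement. The following sketch (an assumed single-species birth-death model, not an example from the paper) builds the projected generator by redirecting outgoing transitions to the designated state, solves the finite linear-algebraic system for $\bar{\pi}_n$, and compares against the exact stationary law, which for this particular model is Poisson:

```python
import math
import numpy as np

# Assumed model: 0 -> S at rate k0, S -> 0 at rate g*x, truncated to
# E_n = {0,...,n-1}, with outgoing transitions redirected to x_ell = 0.
k0, g, n = 10.0, 1.0, 60
x_ell = 0

Q = np.zeros((n, n))              # projected generator, as in (3.14)
for x in range(n):
    if x + 1 < n:
        Q[x, x + 1] += k0         # birth stays inside E_n
    else:
        Q[x, x_ell] += k0         # outflow redirected to the designated state
    if x >= 1:
        Q[x, x - 1] += g * x      # death
    Q[x, x] -= k0 + g * x         # diagonal: minus the total exit rate

# Solve pi_bar^T Q = 0 with sum(pi_bar) = 1 by replacing one balance
# equation with the normalization constraint.
A = Q.T.copy()
A[-1, :] = 1.0
b = np.zeros(n)
b[-1] = 1.0
pi_bar = np.linalg.solve(A, b)

# The exact stationary law of this model is Poisson(k0/g); compare.
lam = k0 / g
exact = np.zeros(n)
exact[0] = math.exp(-lam)
for x in range(1, n):
    exact[x] = exact[x - 1] * lam / x
err = np.abs(pi_bar - exact).sum()
```

Since the truncation comfortably contains the bulk of the Poisson$(10)$ mass, the redirected outflow is negligible and $\bar{\pi}_n$ matches the exact law to high accuracy.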

We now prove part (B). Clearly the assertion of part (B) is trivial when the full state-space $E$ is finite, and so we assume that $E$ is infinite from now on. Let $\{E_n\}$ be a sequence of sets as stated in the theorem and let $\phi: E \to \mathbb{N}_0$ be an enumeration of $E$ satisfying

 $$\phi(x) \in \{0,1,\dots,|E_n|-1\} \quad\text{for each } x\in E_n \text{ and } n=1,2,\dots. \tag{3.29}$$

Such an enumeration exists because $\{E_n\}$ is an increasing sequence of sets that covers the set $E$ in the limit $n\to\infty$. Note that as each $E_n$ is a finite set, condition (3.29) ensures that

 $$\lim_{\|x\|\to\infty} \phi(x) = \infty \quad\text{and}\quad \lim_{i\to\infty} \|\phi^{-1}(i)\| = \infty. \tag{3.30}$$

Now consider the $\mathbb{N}_0$-valued, one-dimensional process $(\hat{X}(t))_{t\ge 0}$ given by $\hat{X}(t) = \phi(X(t))$ for each $t\ge 0$, where $(X(t))_{t\ge 0}$ is the original CTMC with transition rate matrix $Q$ and the associated generator (see (2.4)). As $\phi$ is a one-to-one and onto map, the process $(\hat{X}(t))_{t\ge 0}$ is also a CTMC and its generator $\hat{Q}$ is given by

 $$\hat{Q}g(i) = Qf(\phi^{-1}(i)),$$

where $g$ is a bounded real-valued function on $\mathbb{N}_0$ and $f$ is the bounded real-valued function on $E$ defined by $f(x) = g(\phi(x))$.

Irreducibility of the state-space $E$ for $(X(t))_{t\ge 0}$ implies the irreducibility of the state-space $\mathbb{N}_0$ for $(\hat{X}(t))_{t\ge 0}$. Let $V$ be the norm-like Foster-Lyapunov function satisfying (3.17) and define the function $\hat{V}: \mathbb{N}_0 \to \mathbb{R}_+$ by $\hat{V}(i) = V(\phi^{-1}(i))$. Then using (3.30) and (3.17) we can deduce that $\hat{V}$ is a norm-like function satisfying

 $$\hat{Q}\hat{V}(i) = QV(\phi^{-1}(i)) \le C_1 - C_2 V(\phi^{-1}(i)) = C_1 - C_2\hat{V}(i).$$

Therefore $\hat{V}$ is a Foster-Lyapunov function for the CTMC with generator $\hat{Q}$, and hence this CTMC is exponentially ergodic due to Theorem 7.1 in [19]. Let $\hat{\pi}$ and $\hat{\pi}_n$ be the probability distributions on $\mathbb{N}_0$ and $\{0,1,\dots,|E_n|-1\}$ defined by

 $$\hat{\pi}(i) = \pi(\phi^{-1}(i)) \quad\text{and}\quad \hat{\pi}_n(i) = \bar{\pi}_n(\phi^{-1}(i)).$$

Then $\hat{\pi}$ is the stationary distribution for the CTMC $(\hat{X}(t))_{t\ge 0}$ and $\hat{\pi}_n$ is the stationary distribution of this CTMC projected onto the finite state-space $\{0,1,\dots,|E_n|-1\}$ by redirecting all the outgoing transitions to the designated state $\phi(x_\ell)$. Theorem 3.3 in [24] proves

 $$\lim_{n\to\infty} \|\hat{\pi} - \hat{\pi}_n\|_{\ell_1} = 0,$$

using resolvent forms (see (3.34)). This limit is equivalent to (3.24) and this proves part (B).

We will now prove part (C). Without loss of generality (identifying $E$ with $\mathbb{N}_0$ through the enumeration $\phi$) we can assume that $E_n = \{0,1,\dots,|E_n|-1\}$. Define an infinite vector

 $$\vartheta_n = Q^T \begin{bmatrix} \bar{\pi}_n \\ 0 \end{bmatrix} = \begin{bmatrix} \vartheta_1 \\ \vartheta_2 \end{bmatrix}, \tag{3.31}$$

whose first $|E_n|$ elements are $\vartheta_1 = Q_n^T\bar{\pi}_n$, where $Q_n$ denotes the $|E_n|\times|E_n|$ northwest sub-matrix of $Q$. Recall that the matrix $\bar{Q}_n$ is given by (3.14) and the outflow rate $r^{(n)}_{\mathrm{out}}$ is defined by (3.15). As $\bar{Q}_n^T\bar{\pi}_n = 0$, we can write $\vartheta_1$ as

 $$\vartheta_1 = Q_n^T\bar{\pi}_n - \bar{Q}_n^T\bar{\pi}_n = (Q_n - \bar{Q}_n)^T\bar{\pi}_n = -b_\ell\, c_n^T\bar{\pi}_n = -b_\ell\, r^{(n)}_{\mathrm{out}},$$

which shows that the vector $\vartheta_1$ has only one non-zero entry, which is equal to $-r^{(n)}_{\mathrm{out}}$ and lies at the position corresponding to the designated state $x_\ell$. Since $\mathbf{1}^T\vartheta_n = 0$ we have $\mathbf{1}^T\vartheta_1 + \mathbf{1}^T\vartheta_2 = 0$, which implies that

 $$\mathbf{1}^T\vartheta_2 = -\mathbf{1}^T\vartheta_1 = c_n^T\bar{\pi}_n = r^{(n)}_{\mathrm{out}}.$$

One can check that all the entries of the infinite vector $\vartheta_2$ are non-negative, and only those entries are non-zero that correspond to states that can be reached by a single reaction from the boundary set $B(E_n)$ (see (3.21)) of $E_n$. Therefore, viewing $\vartheta_n$ as a signed measure over $E$, we can express it as

 $$\vartheta_n = r^{(n)}_{\mathrm{out}}(\mu_2 - \mu_1), \tag{3.32}$$

where $\mu_1$ and $\mu_2$ are probability measures on $E$, supported on $\{x_\ell\}$ and on the states corresponding to the non-zero entries of $\vartheta_2$, respectively. With a slight abuse of notation, we will denote the vector-versions of $\mu_1$ and $\mu_2$ also as $\mu_1$ and $\mu_2$.

Define the sFSP approximation error in vector form as

 $$\epsilon_n = \pi - \begin{bmatrix} \bar{\pi}_n \\ 0 \end{bmatrix},$$

and since $Q^T\pi = 0$, we get the following equation from (3.31):

 $$Q^T\epsilon_n = -\vartheta_n. \tag{3.33}$$

One can verify that $\epsilon_n$ is the unique solution of this linear system with the constraint $\mathbf{1}^T\epsilon_n = 0$. For any $\beta > 0$, let $R_\beta$ denote the $\beta$-resolvent matrix corresponding to the transition rate matrix $Q$. It is defined by

 $$R_\beta = \beta(\beta I - Q)^{-1}, \tag{3.34}$$

where $I$ denotes the identity matrix. It is known (see [24]) that $R_\beta$ is a positive matrix satisfying $R_\beta\mathbf{1} = \mathbf{1}$, and

 $$R_\beta = I + \beta^{-1}QR_\beta = I + \beta^{-1}R_\beta Q. \tag{3.35}$$

One can regard $R_\beta$ as the transition matrix of a discrete-time Markov chain over $E$ whose unique stationary distribution is $\pi$.
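These resolvent properties can be verified numerically on a small finite generator. In the sketch below, the birth-death matrix and its parameters are illustrative assumptions, not the paper's example:

```python
import numpy as np

# Assumed small finite birth-death generator: birth rate k0, death rate g*x,
# reflected at the top state so that Q is a proper (finite) rate matrix.
k0, g, n, beta = 2.0, 1.0, 12, 1.0
Q = np.zeros((n, n))
for x in range(n):
    if x + 1 < n:
        Q[x, x + 1] = k0
    if x >= 1:
        Q[x, x - 1] = g * x
    Q[x, x] = -Q[x].sum()          # diagonal makes every row sum to zero

R = beta * np.linalg.inv(beta * np.eye(n) - Q)     # resolvent (3.34)

assert np.all(R > 0)                               # R_beta is a positive matrix
assert np.allclose(R @ np.ones(n), np.ones(n))     # R_beta 1 = 1
assert np.allclose(R, np.eye(n) + (Q @ R) / beta)  # identity (3.35)
assert np.allclose(R, np.eye(n) + (R @ Q) / beta)

# pi is also stationary for the discrete-time chain with transition
# matrix R_beta, i.e. pi^T R_beta = pi^T.
A = Q.T.copy()
A[-1, :] = 1.0
b = np.zeros(n)
b[-1] = 1.0
pi = np.linalg.solve(A, b)
assert np.allclose(pi @ R, pi)
```

The last assertion follows from $\pi^T(\beta I - Q) = \beta\pi^T$, so that $\pi^T R_\beta = \pi^T$.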

Expressing the Foster-Lyapunov function as the vector $V = (V(x))_{x\in E}$, we can write the drift condition (3.17) as

 $$QV \le C_1\mathbf{1} - C_2 V.$$

This relation along with (3.35) and the positivity of $R_\beta$ implies

 $$R_\beta V = (I + \beta^{-1}R_\beta Q)V = V + \beta^{-1}R_\beta QV \le V + \beta^{-1}R_\beta(C_1\mathbf{1} - C_2 V) = V + C_1\beta^{-1}\mathbf{1} - C_2\beta^{-1}R_\beta V.$$

Rearranging, and letting $\lambda = \beta/(\beta + C_2)$ and $\bar{C}_1 = C_1/(\beta + C_2)$, we obtain

 $$R_\beta V \le \lambda V + \bar{C}_1\mathbf{1}.$$

Note that $\lambda \in (0,1)$. Theorem 6.1 in [23] shows that we can explicitly compute constants $C' > 0$ and $\rho \in (0,1)$, such that for any probability distribution $\mu$ over $E$ we have

 $$\|\mu^T R_\beta^m - \pi^T\|_V \le C'\|\mu\|_V\,\rho^m, \tag{3.36}$$

where $R_\beta^m$ denotes the $m$-th power of the matrix $R_\beta$. Transposing (3.33), multiplying both sides by $R_\beta$ and using (3.35) and (3.32) we get

 $$\beta\epsilon_n^T(R_\beta - I) = \epsilon_n^T QR_\beta = -\vartheta_n^T R_\beta = r^{(n)}_{\mathrm{out}}(\mu_1^T - \mu_2^T)R_\beta.$$

One can write $\epsilon_n$ as

 $$\epsilon_n = \frac{r^{(n)}_{\mathrm{out}}}{\beta}(\epsilon_1 - \epsilon_2), \tag{3.37}$$

where $\epsilon_j$ is the solution to

 $$\epsilon_j^T(R_\beta - I) = (\mu_j - \pi)^T R_\beta = \mu_j^T R_\beta - \pi^T,$$

for $j = 1, 2$. This solution can be expressed as

 $$\epsilon_j^T = -(\mu_j - \pi)^T\sum_{m=1}^{\infty} R_\beta^m = -\sum_{m=1}^{\infty}\bigl(\mu_j^T R_\beta^m - \pi^T\bigr),$$

and using (3.36) we get

 $$\|\epsilon_j\|_V \le \sum_{m=1}^{\infty}\|\mu_j^T R_\beta^m - \pi^T\|_V \le C'\|\mu_j\|_V\sum_{m=1}^{\infty}\rho^m = \frac{C'\rho}{1-\rho}\|\mu_j\|_V.$$

Therefore

 $$\|\epsilon_n\|_V \le \frac{r^{(n)}_{\mathrm{out}}}{\beta}\bigl(\|\epsilon_1\|_V + \|\epsilon_2\|_V\bigr) \le M r^{(n)}_{\mathrm{out}}\bigl(\|\mu_1\|_V + \|\mu_2\|_V\bigr),$$

where $M = C'\rho/(\beta(1-\rho))$. As $\mu_1$ and $\mu_2$ are probability distributions supported on $\{x_\ell\}$ and on the states adjacent to the boundary set $B(E_n)$, we have $\|\mu_1\|_V + \|\mu_2\|_V \le \|E_n\|_V$ (see (3.23)). This proves part (C) of the theorem.

We now prove part (D). Here we assume that the Foster-Lyapunov function $V$ has the linear form (3.18) and that both (3.19) and (3.20) are satisfied. Note that by rescaling the positive vector $v$ in (3.18) if necessary, we can assume that

 $$|\langle v,\zeta_k\rangle| \le \langle v,\zeta_k\rangle^2 \quad\text{for each } k=1,\dots,K.$$

As $QV(x) = \sum_{k=1}^{K}\lambda_k(x)\langle v,\zeta_k\rangle$, from condition (3.20) we obtain

 $$|QV(x)| \le \sum_{k=1}^{K}\lambda_k(x)|\langle v,\zeta_k\rangle| \le \sum_{k=1}^{K}\lambda_k(x)\langle v,\zeta_k\rangle^2 \le C_3 + C_4 V(x), \tag{3.38}$$

for each $x\in E$. Transposing (3.33), multiplying both sides by the vector $V$ and taking absolute values we get

 $$|\epsilon_n^T QV| = |\vartheta_n^T V|. \tag{3.39}$$

Using (3.38) we can upper-bound the l.h.s. as

 $$|\epsilon_n^T QV| = |\langle\epsilon_n, QV\rangle| \le \langle|\epsilon_n|, |QV|\rangle \le C_3\|\epsilon_n\|_{\ell_1} + C_4\|\epsilon_n\|_V. \tag{3.40}$$

Since $\vartheta_n$ is given by (3.32), with $\mu_1$ and $\mu_2$ being probability distributions supported on $\{x_\ell\}$ and on the states adjacent to $B(E_n)$ respectively, we can lower-bound the r.h.s. of (3.39) as

 $$|\vartheta_n^T V| = r^{(n)}_{\mathrm{out}}\bigl(\mu_2^T V - \mu_1^T V\bigr) \ge r^{(n)}_{\mathrm{out}}\Bigl(\min_{x\in B(E_n)} V(x) - V(x_\ell)\Bigr).$$