A finite state projection algorithm for the stationary solution of the chemical master equation
Abstract
The chemical master equation (CME) is frequently used in systems biology to quantify the effects of stochastic fluctuations that arise due to biomolecular species with low copy numbers. The CME is a system of ordinary differential equations that describes the evolution of probability density for each population vector in the statespace of the stochastic reaction dynamics. For many examples of interest, this statespace is infinite, making it difficult to obtain exact solutions of the CME. To deal with this problem, the Finite State Projection (FSP) algorithm was developed by Munsky and Khammash (Jour. Chem. Phys. 2006), to provide approximate solutions to the CME by truncating the statespace. The FSP works well for finite timeperiods but it cannot be used for estimating the stationary solutions of CMEs, which are often of interest in systems biology. The aim of this paper is to develop a version of FSP which we refer to as the stationary FSP (sFSP) that allows one to obtain accurate approximations of the stationary solutions of a CME by solving a finite linearalgebraic system that yields the stationary distribution of a continuoustime Markov chain over the truncated statespace. We derive bounds for the approximation error incurred by sFSP and we establish that under certain stability conditions, these errors can be made arbitrarily small by appropriately expanding the truncated statespace. We provide several examples to illustrate our sFSP method and demonstrate its efficiency in estimating the stationary distributions. In particular, we show that using a quantized tensor train (QTT) implementation of our sFSP method, problems admitting more than 100 million states can be efficiently solved.
Keywords: stochastic reaction networks; the Chemical Master Equation; Finite State Projection; stationary distribution; ergodicity; irreducibility; tensor trains
Mathematical Subject Classification (2010): 60J22; 60J27; 60H35; 65C40; 92E20
1 Introduction
Many intracellular reaction networks consist of biomolecular species that are typically present in low copynumbers. The reactions involving these species fire intermittently at random times, rather than continuously. Hence deterministic descriptions of the reaction dynamics, based on Ordinary Differential Equations (ODEs), become highly inaccurate [1]. It is now wellknown that macroscopic properties of the system can be heavily influenced by the intrinsic noise or randomness that arises due to the random timing of reactions [2]. Consequently stochastic formulations of the reaction dynamics, based on continuoustime Markov chains (CTMCs), has become a popular approach for studying the effects of intrinsic noise [3]. In this paper we provide a tool for the analysis of such models.
In the CTMC model of a reaction network, the state at any time is the vector of copynumber counts of all the species. When the number of network species is , the dynamics evolves on a discrete statespace which is a subset of the dimensional nonnegative integer lattice and this subset must be large enough to include all the states that are accessible by the random dynamics. The effects of intrinsic noise on the reaction network are generally studied using the probability distribution of the random statevector at time . It is known that the timeevolution of this probability distribution is given by a system of coupled ODEs, known as the Chemical Master Equation (CME) in the literature (see (2.7)). For each state in there is an ODE in the CME that captures the inflow and outflow of probability at that state. If this statespace is finite, then the CME is a finite system of linear ODEs which can in principle be solved to yield the probability distribution . However in many examples of biological interest, the statespace is infinite, making the CME impossible to solve. A common approach in such cases is to estimate the CME solution by computing the empirical distribution of the samples obtained by simulating the CTMC using Monte Carlo methods such as Gillespie’s Stochastic Simulation Algorithm (SSA) [4]. This simulationbased approach can be very timeconsuming and the estimates suffer from statistical errors due to finitely many samples being used. In particular the lowprobability events are sparsely sampled by Monte Carlo simulations, which can lead to incorrect representations of the CME solution. Such problems can be avoided by using the Finite State Projection (FSP) method developed by Munsky and Khammash [5], that directly solves the CME by truncating the statespace to a manageable size (see Section 2.2). The solution obtained is approximate but FSP provides an iterative way to ensure that the approximation error is within some prespecified tolerance level.
The truncated statespace needed by FSP to solve the CME accurately is still exorbitantly large for many problems of interest. For example, consider a simple geneexpression network where ten protein species are interacting with each other. Typically each protein in a cell has copynumbers of the order of several thousands. So even if we have a conservative upperbound of 1000 on the copynumber of each protein, the size of the statespace required for FSP is of the order of , which is beyond the computational and storage capacity of modern day computers. This combinatorial explosion in the statespace size is often called the “curse of dimensionality” and it presents a major challenge in making the CME practically solvable. Several advanced numerical techniques have been developed that address this challenge by adapting the FSP. These techniques include Krylov Subspace approximations [6], TensorTrain representations [7], and using sparse grids and aggregation methods [8]. Unlike these methods which attempt to solve the exact version of CME, there also exist a body of methods that aim to solve simplified versions of CME, which are derived by approximating the CTMC dynamics by a Stochastic Differential Equation (SDE) or a Piecewisedeterministic Markov Process (PDMP) (see [9, 10, 11]). Such dynamical approximations only hold for finite timeperiods, and the assumptions on species copynumbers and reaction propensities they require, are not always satisfied by networks encountered in systems biology.
For many biological applications, one is interested in the steadystate behavior which is captured by the stationary probability distribution to which the solution of the CME converges to as . For CTMCs whose statespace is finite and not too large, estimation of the stationary distribution is a simple linearalgebraic problem (see (1.1)). However in situations where the statespace is very large or infinite, this linearalgebraic problem cannot be practically solved, and we need to estimate by other means. The methods mentioned above for estimating only work over finite timeintervals and they would generally fail to provide an accurate estimate of the stationary distribution . The reason for this failure depends on the method being used. The dynamical approximations based on PDMPs or SDEs introduce an error that can become unbounded in the limit , and the Monte Carlo simulation based approach for estimating is highly undesirable due to statistical errors and the computational costs associated with these simulations over large timeintervals. The FSP algorithm also cannot be used for estimating because this method introduces an absorbing state to catch all the transitions that leave the truncated statespace (see Figure 1B). However in the limit , all the probability mass flows into this absorbing state, and so the obtained probability distributions are unable to capture the true stationary distribution . We revisit this point later in this section and also explain it in detail in Section 2.2.
The aim of this paper is to present a FSPlike method for accurately estimating the stationary distribution . This method also involves truncating the statespace but rather than solving a linear system of ODEs for probabilities over the truncated statespace (as in FSP), our method estimates the true stationary distribution by computing the stationary distribution of a suitably defined CTMC over the truncated statespace. As the latter step can be accomplished by solving a linearalgebraic system, rather than a system of ODEs, the computational complexity of our method is much lower than that of FSP. Consequently it can be successfully applied on a larger class of networks. We call our method the stationary Finite State Projection (or sFSP) algorithm and we provide theoretical arguments to establish its accuracy under certain stability conditions which are usually satisfied by networks in systems and synthetic biology. Even though sFSP can be applied on larger systems than FSP, the combinatorial explosion of statespace sizes still limits the range of applicability of sFSP severely. As was the case with FSP, this issue can be somewhat resolved by adapting sFSP to work with quantized tensortrain (QTT) representations [7], sparse grids and aggregation methods [8]. We illustrate this point with a computational example where sFSP is applied to the QTT representation of the CME (see Section 5). We remark here that QTT representations have already been successfully employed for obtaining approximations of the stationary distribution for reaction networks satisfying certain graphtheoretic criteria [12, 13]. However these criteria are highlyrestrictive and it will become evident that the sFSP based approach is more versatile.
We now describe the problem of estimating stationary distributions in more detail. Henceforth let denote the size of any set , and let and denote the vector of all zeros and all ones respectively^{1}^{1}1The dimension of these vectors will be clear from the context.. The stochastic model of a reaction network (see Section 2.1) represents the dynamics as a CTMC over a discrete statespace . Such a CTMC can be described by its transition rate matrix (see [14]), whose diagonal entries are nonpositive, offdiagonal entries are nonnegative and all the rows sum to zero. The stationary distribution for this CTMC can be described by a nonnegative vector^{2}^{2}2Throughout the paper we assume that vector and matrix indices start from rather than . , which is in the left nullspace of transition rate matrix , i.e.
(1.1) 
and its components sum to (i.e. ). Such a stationary distribution may not be unique and if then it may not even exist (see [14]). In our recent work, we have dealt extensively with the issue of computationally verifying the existence and uniqueness of the stationary distribution corresponding to the CTMC models for a large class of biomolecular reaction networks (see [15] and [16]). Assuming that the existence and uniqueness of the stationary distribution has been ascertained for the network, our aim here is to estimate numerically. We are primarily interested in situations where is infinite, and so the direct computation of using (1.1) is computationally impossible.
It is natural to try to estimate by solving a finite, truncated version of the linearalgebraic system (1.1). This truncated version can be obtained by first identifying a truncated statespace and then projecting the CTMC dynamics on this truncated statespace. Thereafter the stationary distribution for the projected CTMC, found by solving the corresponding linearalgebraic system of the form (1.1), serves as an estimate of the true stationary distribution . An important issue that arises here is how to handle the outgoing transitions from the truncated statespace, so that the obtained estimate of is accurate. In the FSP approach [5], these outgoing transitions are preserved but their target states are collapsed into a single absorbing state (see Figure 1B). This leads to the “probability leakage” problem which can be managed over finite timeintervals but not in the asymptotic regime. This problem manifests itself in the fact that the only stationary distribution for the projected CTMC would be the one that puts all the probabilitymass at the absorbing state. Obviously this does not capture the true stationary distribution and hence modifications to the FSP approach are necessary to circumvent the probabilityleakage problem. One such modification that has been tried is motivated by the use of “reflected” boundary conditions in the study of FokkerPlanck equations [12]. In this approach all the outgoing transitions from the truncated statespace are simply eliminated by setting their propensities to zero. It has been shown that this reflected version of FSP yields accurate estimates of the stationary distribution for some reaction network examples [17, 12]. However there is no theoretical guarantee that this approach will work well in general.
The method sFSP that we present in this paper modifies the FSP in another way. It preserves the outgoing transitions from the truncated statespace, but rather than channeling them to an absorbing state (as in FSP), it redirects them to a designated state within the truncated statespace (see Figure 1C). This modification is simple to implement and its appealing feature is that for a wide range of biomolecular reaction networks, we can theoretically guarantee that the stationary distribution of the projected CTMC converges to the actual stationary distribution as the truncated statespace expands to the full statespace . Moreover we derive bounds for the approximation error incurred by this approach, in terms of the outflow rate of all the outgoing transitions evaluated at the estimated stationary distribution (see Theorem 3.1). These results provide the theoretical basis for our method which expands the truncated statespace iteratively to recover a “good” approximation of . Note that our approach for estimating the stationary distribution is very different from the stochastic complementation approach proposed in [18]. This complementation approach is generally difficult to implement for infinite statespace CTMCs and it only yields the conditional stationary distribution which can then be used to derive upper and lower bounds for the true stationary probabilities. However such bounds are not guaranteed to be close to each other. In contrast, our method allows one to estimate the true stationary probabilities directly.
For our method sFSP to work we require that the original CTMC representing the reaction network satisfies a couple of stability conditions. The first condition is that the statespace needs to be irreducible i.e. all the states in must be accessible from each other via a sequence of positivepropensity reactions. The second condition is a FosterLyapunov criterion (see [19]) which ensures that the original CTMC is exponentially ergodic i.e. the solution of the CME converges to the stationary distribution exponentially fast. We elaborate these stability conditions in Section 3.1 and there we also explain how these conditions can be easily checked for a wide range of networks arising in systems and synthetic biology, using the computational procedures developed in our recent papers [15] and [16]. This makes the proposed sFSP method broadly applicable and of interest to the growing community of researchers working with stochastic models of biomolecular reaction networks.
This paper is organized as follows. In Section 2 we describe the stochastic model and the original FSP method [5]. In Section 3 we present and mathematically analyze our stationary Finite State Projection (or sFSP) algorithm. A simple implementation of sFSP is presented in Section 4 while its QTT implementation is presented in Section 5. These sections also include the computational examples that illustrate the respective implementations. Finally in Section 6 we conclude and discuss directions for future research.
2 Preliminaries
2.1 The stochastic model
We now formally describe the CTMC model of a reaction network. Suppose this network has species, called , and reactions of the form
(2.2) 
Here and are nonnegative integers denoting the number of molecules of species that are consumed and produced by the th reaction. The state of the system at any time is the vector of molecular counts of all the species. When the th reaction fires, the state is displaced by the stoichiometric vector whose th component is . At any state , the rate of the th reaction is , where is the propensity function for this reaction. Commonly mass action kinetics (see [3]) is assumed, where each is given by
(2.3) 
with the positive parameter being the associated rate constant. We model the reaction dynamics as a CTMC which jumps from state after a random waiting time which is exponentially distributed with rate , and this jump is in direction with probability . Formally this CTMC can be specified by its generator^{3}^{3}3The generator of a Markov process is an operator which specifies the rate of change of the probability distribution of the process (see Chapter 4 in [20] for details). defined as
(2.4) 
where is any bounded realvalued function on .
From now on we suppose that there is a nonempty statespace on which the CTMC evolves i.e.
(2.5) 
In other words, if at state , reaction has a positive probability of firing then the resulting state must also be in . As is at most countable, it can be enumerated. This means that we can find a onetoone and onto map from to the set . Once such an enumeration is fixed, the set can be expressed as , where . Then the CTMC generator can be expressed as the transition rate matrix given by^{4}^{4}4Here we assume for convenience that all stoichiometry vectors (s) are distinct.
Let be the CTMC with this transition rate matrix and some initial state . For any state , let
(2.6) 
be the probability that the CTMC is in state at time . These probabilities evolve in time according to the Chemical Master Equation (CME) given by
(2.7) 
for each . Note that this system has as many ODEs as the number of elements in the statespace , which is generally infinite or very large.
Let be the probability distribution defined by
(2.8) 
for any . The vectorized form of w.r.t. enumeration is simply given by
and using this form we can express the CME as
(2.9) 
If the number of states in is finite, then this firstorder system can in principle be solved by exponentiating the matrix , i.e. the solution is given by
where is the vectorized form of the probability distribution of the initial state . However this approach is infeasible for large statespaces and in such cases, the Finite State Projection (FSP) method [5] can be used to approximately solve the CME (see Section 2.2).
In many biological applications, rather than the finitetime behavior, one is interested in the properties of the system after it has settled down, or in other words, the CTMC has reached a steadystate which is characterized by a stationary distribution satisfying (1.1), that is essentially a fixedpoint for the CME (2.9). We say that the CTMC is ergodic if this fixedpoint is unique and globally attracting in the sense that for any initial probability distribution , the solution of (2.9) satisfies
(2.10) 
where denotes the distance^{5}^{5}5Generally ergodicity is defined using the total variation distance between probability distributions. However for a discrete statespace the total variation distance among probability distributions is exactly half of the distance computed using the norm. So we work with the norm in this paper. between probability measures and . Furthermore the CTMC is called exponentially ergodic if the convergence in (2.10) is exponentially fast, i.e. there exist positive constants and such that for any
Here the constant may depend on the initial distribution but the constant does not (see [19] for example).
2.2 The Finite State Projection Algorithm
In the FSP method, approximate solutions of the CME (2.9) are obtained by restricting it to a truncated statespace. Suppose this truncated subset is given by a finite set of size . Using the same enumeration as in Section 2.1, we can express the set as . Letting to be the matrix formed by the rows and columns of matrix in the set , we approximate (2.9) by the dimensional linear system
(2.11) 
The solution of this system is simply , where is the containing the components of vector in the set .
Let be the dimensional vector of all ones. We assume that the initial state can only take values in and so . It is easy to check that all the rows of matrix have a nonpositive sum, which implies that for any
Results in [5] show that quantifies the “error” between the actual solution of CME and its approximation . For any fixed , this error decreases monotonically with increasing values of . Moreover as and the truncated statespace approaches the full statespace , we have . In the FSP algorithm of [5], the final time is fixed and the system (2.11) is solved in the timeinterval with some truncated statespace . Thereafter the error is evaluated and if this value is below some prespecified tolerance level , then the algorithm is terminated. Otherwise the truncated statespace is expanded to include more states and the same is process is repeated. After finitely many such iterations, the truncated statespace becomes large enough to ensure that the tolerance criterion is met.
Another way to formulate the FSP method is to consider the projected CTMC over the statespace , whose transitions among the states in are same as the original CTMC, but any outgoing transitions from the set are absorbed in the state (see Figure 1B), which serves as a proxy for all states in the set . Enumerating the elements of as , the transition rate matrix for this projected CTMC is given by
(2.12) 
where is the dimensional column vector whose th component is
(2.13) 
This choice of ensures that all the rows of matrix sum to and hence is a valid transition rate matrix. Let be the solution of the CME (2.9) corresponding to rate matrix and with initial value . Then it can be shown that for any we can express as
which proves that the FSP approximation error at time is exactly the amount of probabilitymass that has been absorbed by the state in the timeinterval .
One can show that typically for any fixed truncated statespace , we would have as , which says that eventually all the probability mass gets absorbed by the state . Therefore even if the original CTMC is ergodic and the solution of the CME (2.7) converges to as , the approximate solution obtained by solving the FSP system (2.11), will not be close to for large times and in fact converges to a vector of all zeros at . This is also evident from the stationary distribution that can be computed by finding a nonzero solution to the linearalgebraic system (1.1) corresponding to the matrix (see (2.12)). This would yield a stationary distribution of the form , which assigns all the mass to the absorbing state and hence cannot be close to . This shows that the FSP approach is not conducive for the estimation of stationary distribution .
3 The stationary FSP method
In this section we present our method sFSP for estimating the stationary distribution for the CTMC model of a reaction network. This is accomplished by constructing a projected CTMC over the truncated statespace and computing the stationary distribution of this new CTMC. Keeping the same notation as in Section 2.2, this projected CTMC over the truncated statespace is constructed by redirecting the transitions that leave to some designated state (see Figure 1C). Let the matrix and the vector be as in Section 2.2. Then the transition rate matrix for this CTMC is simply given by
(3.14) 
where corresponds to the address of the designated state (i.e. ) and is the vector whose th component is and the rest are all zeros. Essentially, is formed by adding the nonnegative vector to the th column of matrix . All the rows of matrix sum to and hence is a valid transition rate matrix and so our projected CTMC is welldefined. Our method sFSP estimates the stationary distribution by computing the finitedimensional stationary distribution for the projected CTMC with transition rate matrix . Using , we can also compute the overall outflow rate at the estimated stationary distribution by
(3.15) 
This quantity will play a key role in bounding the sFSP approximation error whose direct computation is impossible.
3.1 Analysis of sFSP
The aim of this section is to demonstrate that under certain conditions, that are commonly satisfied by biological reaction networks, the sFSP approximation error can be made arbitrarily small by picking a truncated statespace , that is large enough. Moreover it is possible to check if is large enough by computing a convergence factor which is defined by suitably scaling the outflow rate . The main results of this section are collected in Theorem 3.1 and they provide the theoretical basis for our sFSP method.
Before we present our result we need to discuss some preliminary concepts. The statespace of the original CTMC is called irreducible if this CTMC has a positive probability of reaching any state in from any other state in , in a finite time. More formally, the statespace is irreducible, if for any we have for some . In our setting of reaction networks, this is equivalent to saying that between any two states there exists a sequence of positivepropensity reactions that takes the dynamics from to . For this to hold we must have and at each intermediate state the next reaction in the sequence () has a positive propensity of firing (). When only finitely many states are accessible by the reaction dynamics, irreducible statespaces can be easily found by manipulating the transition rate matrix (see [21]). However when infinitely many states are accessible, finding irreducible statespaces within the infinite lattice becomes a complicated task. In a recent work [15] we address this challenge and develop a computational procedure that can find all the irreducible statespaces for a large class of biological reaction networks. In particular, for most networks of interest each irreducible statespace has the form^{6}^{6}6To obtain this form relabeling of species may be required.
(3.16) 
where is a finite set in , and are nonnegative integers summing up to the total number of species . Here contains the dynamics of bounded species whose copynumbers are required to satisfy a positive massconversation relation. A typical example is a geneexpression network where the gene of interest has many (say ) activity modes. To represent the dynamics we need to represent each such mode by a different network species, but all these species will be bounded and their copynumbers will evolve in a finite set , because the gene of interest has a fixed copynumber (see the PapSwitch example in Section 4.3 for instance). The species that are not bounded are free^{7}^{7}7Apart from free and bounded species, there may also exist another type of species, called restricted species, whose dynamics essentially mimics the dynamics of free species according to some affine map. However these restricted species can be easily eliminated to obtain a dynamically equivalent network and hence we ignore such species here (see [15] for more details). to have any copynumber and hence the statespace for their dynamics is taken to be the full nonnegative integer orthant .
Note that the property of ergodicity (see Section 2.1) will obviously fail if there do not exist any stationary distributions or there exist more than one stationary distributions for the CTMC . If the statespace is finite, then its irreducibility is sufficient to guarantee that the stationary distribution exists uniquely and the CTMC is exponentially ergodic (see [21]). However when is infinite, its irreducibility can only guarantee the uniqueness of a stationary distribution but the existence of this distribution must be checked by other means, for example, using the results in [22] and [19]. In particular Theorem 7.1 in [19] guarantees the existence of a stationary distribution along with exponential ergodicity, if we can construct a function which is normlike (i.e. as ) and for some , the following holds for all :
(3.17) 
where is the CTMC generator given by (2.4). This condition is called the FosterLyapunov criterion in the literature and it describes the tendency of the CTMC to experience a drift towards some finite set in the statespace with a force that is proportional to the distance from this finite set, measured according to . In [16] it is shown that for many biomolecular reaction networks, a linear FosterLyapunov function
(3.18) 
satisfying (3.17) can be constructed. Here is a positive vector which is chosen using simple Linear Programming and denotes the standard inner product in . Observe that for the linear function (3.18), the drift condition (3.17) is simply
(3.19) 
As demonstrated in [16], often for biological reaction networks the vector can be chosen in such a way that along with this drift condition, the following diffusivity condition is also satisfied  for some
(3.20) 
When (3.19) and (3.20) hold simultaneously, then in addition to exponential ergodicity, one can also guarantee other desirable properties like finiteness of all statistical moments of the stationary distribution and convergence of all the moments of the CTMC to their steadystate values as time approaches infinity (see Theorem 5 in [16]).
To study the sFSP approximation error we need to work with the norm prescribed by the FosterLyapunov function . For any signed measure on , this norm is given by
Note that this norm is tighter than the norm because as . Let denote the boundary of the truncated statespace , which includes all those states in for which there exists a positivepropensity reaction that takes the dynamics outside , i.e.
(3.21) 
Based on the outflow rate given by (3.15), we define the convergence factor as
(3.22) 
where
(3.23) 
and is the designated state. Our next result will show that the convergence factor is a useful diagnostic tool to assess the approximation error of sFSP. Note that unlike the approximation error, can be explicitly computed from the sFSP output if the FosterLyapunov function is known. In situations where is unknown, the definition of can often be suitably modified to preserve its diagnostic purpose (see Remark 3.3).
We now come to the main result of our paper.
Theorem 3.1
Suppose that statespace is irreducible for the original CTMC with transition rate matrix , and there exists a FosterLyapunov function satisfying (3.17). Also assume that is a sequence of finite sets that is increasing (i.e. if ) and that covers the full statespace in the limit . Fix a designated state and let be the transition rate matrix of our projected CTMC with statespace , defined according to (3.14). Then we have the following:

The stationary distribution for the projected CTMC exists uniquely.

As , converges to the stationary distribution for the original CTMC, in the metric, i.e.
(3.24) 
There exists a positive constant such that for any
(3.25) where is the convergence factor defined by (3.22).

Suppose that the FosterLyapunov function has the linear form (3.18) and the positive vector is such that both (3.19) and (3.20) are satisfied. Furthermore assume that the sequence of sets grows uniformly w.r.t. function which means that for some constant we have
(3.26) where is the boundary of defined by (3.21). Then there exists a constant for which the converse of (3.25) also holds, i.e. for each
(3.27) Furthermore, the convergence factor converges to as .
Remark 3.2
It will become evident from the proof that if the FosterLyapunov function and constants in (3.17) are known, then a constant satisfying part (C) can be explicitly computed using the results in Meyn and Tweedie [23]. Hence part (C) provides a computable upperbound for the approximation error . Similarly the constant satisfying part (D) may be explicitly computed from constants in (3.19) and (3.20), and the constant that appears in (3.26). The tightness of the error bounds obtained from these explicitly computable constants remains to be investigated. Nevertheless parts (C) and (D) are useful in demonstrating that up to a constant, the magnitude of the uncomputable approximation error can be assessed by computing the convergence factor . In other words, if then , and similarly if then , where and are the optimal constants for which parts (C) and (D) hold.
Remark 3.3
Note that computation of the convergence factor (3.22) requires knowledge of the FosterLyapunov function which is undesirable from the point of view of applications. However it is possible to circumvent this problem, if one has information about the form of and the shape of finite sets . For this one needs to pick a sequence such that for some constants
holds for each , with defined by (3.23). Then one can define the convergence factor as
(3.28) 
with the outflow rate given by (3.15), and parts (C) and (D) will hold with the substitutions, , and . For example, if has the linear form (3.18), then one can define in the same way as but with replaced by any norm on .
Proof. We start by proving part (A). The stationary distribution for the projected CTMC certainly exists because the transition rate matrix is finite (see [21]). This stationary distribution can be found by solving the linearalgebraic system (1.1) with transitionrate matrix . We now prove by contradiction the uniqueness of this stationary distribution. Suppose that this uniqueness does not hold. Then there would exist at least two disjoint nonempty irreducible statespaces (say and ) for the projected CTMC within the statespace . This implies that if the projected CTMC starts in set then it remains in this set for all times, and there is a positive probability for this CTMC to reach any state in from any other state in in a finite time. The same holds true for set . Certainly one of these sets, say , will not contain the designated state but this leads to a contradiction due to the following reasons. Since the statespace is irreducible for the original CTMC, there exists a sequence of reactions that takes the original CTMC from any state to the designated state with a positive probability. If all the intermediate states that arise in this reaction path (recall s from above) lie within the set , then the same sequence of reactions will also take the projected CTMC from state to state , which is a contradiction because is an irreducible statespace not containing . On the other hand if one of the intermediate states lies outside , then the last reaction, say , in the sequence that takes the dynamics outside will be redirected to the designated state in the projected CTMC and hence again we have a contradiction because is a positiveprobability sequence of reactions that takes the projected CTMC from state to state . Therefore the stationary distribution for the projected CTMC is unique, and this completes the proof of part (A).
We now prove part (B). Clearly the assertion of part (B) is trivial when the full statespace is finite and so we assume that is infinite from now on. Let be a sequence of sets as stated in the proposition and let be an enumeration of satisfying
(3.29) 
Such an enumeration exists because is an increasing sequence of sets that cover the set in the limit . Note that as each is a finite set, condition (3.29) ensures that
(3.30) 
Now consider the valued, onedimensional process given by for each , where is the original CTMC with transition rate matrix and generator (see (2.4)). As is a onetoone and onto map, the process is also a CTMC and its generator is given by
where is a bounded realvalued function on and is the bounded realvalued function on defined by .
Irreducibility of statespace for implies the irreducibility of statespace for . Let be the normlike FosterLyapunov function satisfying (3.17) and define the function by . Then using (3.30) and (3.17) we can deduce that is a normlike function satisfying
Therefore is a FosterLyapunov function for CTMC with generator and hence this CTMC is exponentially ergodic due to Theorem 7.1 in [19]. Let and be the probability distributions on and defined by
Then is the stationary distribution for the CTMC and is the stationary distribution of this CTMC projected onto the finite statespace by redirecting all the outgoing transitions to the designated state . Theorem 3.3 in [24] proves
using resolvent forms (see (3.34)). This limit is equivalent to (3.24) and this proves part (B).
We will now prove part (C). Without loss of generality we can assume that . Define an infinite vector
(3.31) 
whose first elements are , where denotes the northwest submatrix of . Recall that matrix is given by (3.14) and the outflow rate is defined by (3.15). As we can write as
which shows that the vector has only one nonzero entry which is equal to and it is at the position corresponding to the designated state . Since we have which implies that
One can check that all entries of the infinite vector are nonnegative and only those entries are nonzero that correspond to the states in the boundary set (see (3.21)) of . Therefore, viewing as a signed measure over , we can express it as
(3.32) 
where and are probability measures on , supported on and respectively. With a slight abuse of notation, we will denote the vectorversion of also as .
Define the sFSP approximation error in vector form as
and since we get the following equation from (3.31)
(3.33) 
One can verify that is the unique solution of this linear system with the constraint . For any , let denote the resolvent matrix corresponding to the transition rate matrix . It is defined by
(3.34) 
where denotes the identity matrix. It is known (see [24]) that is a positive matrix satisfying , and
(3.35) 
One can regard as the transition matrix of a discretetime Markov chain over whose unique stationary distribution is .
Expressing the FosterLyapunov function as the vector we can write the drift condition (3.17) as
This relation along with (3.35) and the positivity of implies
Letting and we obtain
Note that . Theorem 6.1 in [23] shows that we can explicitly compute constants and , such that for any probability distribution over we have
(3.36) 
where denotes the th power of the matrix . Transposing (3.33), multiplying both sides by and using (3.35) and (3.32) we get
One can write as
(3.37) 
where is the solution to
for . This solution can be expressed as
and using (3.36) we get
Therefore
where . As and are probability distributions supported on and , we have (see (3.23)). This proves part (C) of the theorem.
We now prove part (D). Here we assume that the FosterLyapunov function has the linear form (3.18) and both (3.19) and (3.20) are satisfied. Note that by rescaling the positive vector in (3.18) if necessary, we can assume that
As from condition (3.19) we obtain
(3.38) 
for each . Transposing (3.33), multiplying both sides by vector and taking absolute values we get
(3.39) 
Using (3.38) we can upperbound the l.h.s. as
(3.40) 
Since is given by (3.32), with and being probability distributions supported on and respectively, we can lowerbound the r.h.s. of (3.39) as