Dimensional Reduction of Markov State Models from Renormalization Group Theory
Abstract
Renormalization Group (RG) theory provides the theoretical framework to define Effective Theories (ETs), i.e. systematic low-resolution approximations of arbitrary microscopic models. Markov State Models (MSMs) are shown to be rigorous ETs for Molecular Dynamics (MD). Based on this fact, we use Real-Space RG to vary the resolution of a MSM and define an algorithm for clustering microstates into macrostates. The result is a lower-dimensional stochastic model which, by construction, provides the optimal coarse-grained Markovian representation of the system's relaxation kinetics. To illustrate and validate our theory, we analyze a number of test systems of increasing complexity, ranging from synthetic toy models to two realistic applications, built from all-atom MD simulations. The computational cost of computing the low-dimensional model remains affordable on a desktop computer even for thousands of microstates.
I Introduction
MSMs – for recent reviews see Pande0 (); Noe0 (); Prinz0 (); Noe1 (), and for a discussion of mathematical aspects see MSMfund1 (); MSMfund2 (); MSMfund3 (); MSMfund4 () – provide a coarse-grained description of the conformational dynamics of macromolecules, in which the probability for the system to be found in a discrete set of metastable states (hereafter termed microstates) evolves in time according to a master equation. The microstates and all the parameters of the master equation are obtained by reducing the data generated by MD simulations.
Coarse-graining the dynamics at the MSM level makes it possible to extract the relevant kinetic and thermodynamical information from many short independent MD trajectories. The larger the number of microstates in the model, the shorter the MD trajectories that need to be run in order to specify the master equation. By choosing a sufficiently large number of microstates, the computational effort can be massively distributed and it becomes possible to investigate the dynamics over time intervals inaccessible to straightforward MD simulations. For example, in applications to protein dynamics Pande1 (); Bowman0 (), a very large number of microstates is usually required to achieve an efficient sampling of the configuration space.
Unfortunately, with such a large number of states it is hard to gain insight into the conformational dynamics by directly inspecting the transition pathways in the space of microstates. A way around this problem consists in further coarse-graining the representation of the dynamics, by clustering (lumping) the microstates into supergroups that are usually referred to as macrostates. Physically, macrostates are to be interpreted as larger metastable regions of the configuration space, which comprise many smaller and fast-interconverting microstates.
Several algorithms have been developed to perform the dimensional reduction of MSMs (see e.g. Huang0 (); Schutte0 (); Bowman0 ()), and some of these have been compared and assessed on a few benchmark systems in Ref. Bowman1 (). It was found that the structure and the optimal number of macrostates are quite sensitive to the specific algorithm adopted to lump the microstates. Furthermore, all the considered algorithms were found to fail in reproducing the kinetics of the original MSM, underestimating the relaxation timescales by a large factor. Alternative lumping schemes which provide relaxation timescales in much better agreement with the original model have been developed using Hidden Markov Models hidden () or by means of projection techniques pcca (); HummerSzabo (); DeLosRios1 (); DeLosRios2 ().
In this work, we frame the theoretical foundation of MSMs and of their dimensional reduction within the ET formalism. ETs – for an excellent pedagogical introduction see Lepage () – are systematic low-resolution approximations of more microscopic theories. Unlike coarse-graining approaches inspired by heuristic or phenomenological arguments, ETs can be rigorously derived starting from the underlying microscopic theory using RG theory. Consequently, under certain well-defined conditions to be discussed below, ETs are guaranteed to approximate the long-distance (or long-time) dynamics within a degree of accuracy which can be estimated a priori and systematically improved.
The physical picture behind the ET formalism is very familiar: any experimental probe with wavelength $\lambda$ (or frequency $\nu$) is insensitive to the details of the physics at length scales much smaller than $\lambda$ (or timescales much shorter than $1/\nu$). As a consequence, as long as one is interested in the infrared (IR) physics, i.e. in the behaviour of observables at large distances or long times, all the so-called ultraviolet (UV) details of a microscopic theory are irrelevant, and can be accurately mimicked by a set of effective parameters. These parameters have to be computed from the underlying microscopic theory, or extracted from experimental data. Clearly, ETs are only applicable to physical systems which display a gap in characteristic scales, i.e. for which it is possible to separate the IR from the UV scales.
A familiar example of an ET is the multipole expansion of classical electrodynamics: the electric field generated by an arbitrary localized charge distribution of size $a$ is systematically approximated at distances $r \gg a$ by the series of multipoles. In the terms of this series, the fine-grained structure of the charge distribution at the scale $a$ is mimicked by effective constants such as the total charge, the dipole moment, etc. This example illustrates that ETs are usually much simpler than the corresponding underlying microscopic theories. On the other hand, the accuracy of any ET breaks down at scales comparable with the UV cutoff.
The identification of MSMs with ETs paves the way to using RG methods in order to systematically lower their time and space resolution. In fact, one of the main results of this work is the derivation of a physically sound and mathematically rigorous dimensional reduction scheme for MSMs based on the so-called Real-Space RG formalism. This approach was inspired by the work of Degenhard and Rodriguez-Laguna RG1 (); RG2 (), who used the RG to lower the computational cost of integrating non-linear partial differential equations.
In the following sections, we shall first review how MSMs emerge as rigorous ETs for MD. Then, in section III we develop our RG scheme for dimensional reduction and describe the practical implementation of the corresponding algorithm. In sections IV and V we present a number of illustrative applications of increasing complexity and assess the accuracy of our scheme. Finally, our main results are summarized in section VI.
II Theoretical Framework of Markov State Models
MSMs can be defined without explicitly referring to a particular type of MD Sarich (). However, from a mathematical standpoint, the connection between MSMs and rigorous ETs is particularly manifest for systems which obey the overdamped Langevin equation. In turn, in view of the Zwanzig-Mori projection formalism Zwanzig0 (); Mori (), such a stochastic differential equation can be regarded as the low-energy approximation of an underlying Hamiltonian dynamics.
Let us therefore consider a system composed of interacting atoms, whose configuration $X$ obeys the equation
$$\dot X(t) = -\beta D\,\nabla U(X) + \eta(t), \qquad (1)$$
where $X$ is a point in configuration space, $U(X)$ is the potential energy, $D$ is the diffusion coefficient (assumed to be the same for all atoms, for the sake of notational simplicity) and $\beta = 1/k_B T$. $\eta(t)$ is a white Gaussian noise obeying the fluctuation-dissipation relationship,
$$\langle \eta_i(t)\,\eta_j(t')\rangle = 2D\,\delta_{ij}\,\delta(t-t'). \qquad (2)$$
The probability distribution $P(X,t)$ sampled by the stochastic differential Eq. (1) then satisfies the Fokker-Planck (FP) equation
$$\frac{\partial}{\partial t}\,P(X,t) = -\hat H_{FP}\,P(X,t), \qquad (3)$$
where $\hat H_{FP}$ is the non-hermitian operator
$$\hat H_{FP} = -D\,\nabla\cdot\left(\nabla + \beta\,\nabla U(X)\right). \qquad (4)$$
From $\hat H_{FP}$ it is possible to define a hermitian operator by performing the following non-unitary transformation:
$$P(X,t) = e^{-\frac{\beta}{2}U(X)}\,\psi(X,t). \qquad (5)$$
This transformation turns the FP equation into a Schrödinger equation in imaginary time:
$$-\frac{\partial}{\partial t}\,\psi(X,t) = \hat H_{eff}\,\psi(X,t), \qquad \hat H_{eff} \equiv e^{\frac{\beta}{2}U}\,\hat H_{FP}\,e^{-\frac{\beta}{2}U} = D\left(-\nabla^2 + \frac{\beta^2}{4}\left(\nabla U\right)^2 - \frac{\beta}{2}\,\nabla^2 U\right). \qquad (6)$$
In the following, the function $\psi(X,t)$ will be referred to as the hermitian component of the probability density $P(X,t)$.
It is straightforward to show that $\hat H_{FP}$ and $\hat H_{eff}$ have the same spectrum, which is non-negative and contains a null eigenvalue, $\lambda_0 = 0$. The corresponding right zero-mode is the Gibbs distribution:
$$P_0(X) = \frac{1}{Z}\,e^{-\beta U(X)}, \qquad (7)$$
where $Z = \int dX\, e^{-\beta U(X)}$ is the system's canonical partition function. The left and right eigenstates of $\hat H_{FP}$ are related to the eigenstates $\psi_n(X)$ of the corresponding hermitian operator $\hat H_{eff}$:
$$R_n(X) = e^{-\frac{\beta}{2}U(X)}\,\psi_n(X), \qquad L_n(X) = e^{+\frac{\beta}{2}U(X)}\,\psi_n(X). \qquad (8)$$
Finally, we note that the probability density entering the FP equation (3) can be expanded in a series of right eigenfunctions of $\hat H_{FP}$ or, equivalently, eigenfunctions of $\hat H_{eff}$:
$$P(X,t) = \sum_{n} c_n\, e^{-\lambda_n t}\, R_n(X), \qquad c_n = \int dX\, L_n(X)\, P(X,0). \qquad (9)$$
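The spectral structure just described can be made concrete numerically. The sketch below uses a hypothetical one-dimensional double-well potential (an illustration, not a system from this paper): it discretizes a nearest-neighbour random walk obeying detailed balance, whose continuum limit is the FP equation, and verifies that the similarity transformation of Eq. (5) yields a symmetric operator with the same non-negative, gapped spectrum and a null eigenvalue:

```python
import numpy as np

# Illustrative sketch (assumed 1D double-well potential, not from the paper):
# discretize a detailed-balance random walk whose continuum limit is the
# Fokker-Planck equation, and check that w^{-1/2} (-K) w^{1/2} is symmetric
# and shares the non-negative, gapped spectrum with a null eigenvalue.
beta, D = 4.0, 1.0
x = np.linspace(-2.0, 2.0, 400)
dx = x[1] - x[0]
U = (x**2 - 1.0) ** 2                  # double-well potential
w = np.exp(-beta * U)                  # (unnormalized) Boltzmann weights

K = np.zeros((x.size, x.size))
for i in range(x.size - 1):
    K[i + 1, i] = D / dx**2 * np.sqrt(w[i + 1] / w[i])   # rate i -> i+1
    K[i, i + 1] = D / dx**2 * np.sqrt(w[i] / w[i + 1])   # rate i+1 -> i
K -= np.diag(K.sum(axis=0))            # columns sum to zero: probability conserved

# Symmetrized ("hermitian") generator, the discrete analogue of H_eff
H = np.diag(w**-0.5) @ (-K) @ np.diag(w**0.5)

assert np.allclose(H, H.T, atol=1e-8 * np.abs(H).max())
ev = np.linalg.eigvalsh(H)
assert abs(ev[0]) < 1e-8 * ev[-1]      # null eigenvalue (Gibbs zero-mode)
assert ev[2] > 10 * ev[1]              # gapped: one slow barrier-crossing mode
```

Here the gap below the third eigenvalue separates the single slow barrier-crossing mode from the fast intra-well relaxations, which is precisely the regime in which a two-state MSM becomes a valid ET.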
II.1 Definition of Microstates
Let us now specialize to molecular systems in which the typical relaxation times required to attain local thermal equilibrium within all metastable states are much smaller than the timescales associated with transitions between such metastable states. This condition is realized if the thermal energy is much lower than the energy barriers between local minima, and implies that the spectrum of the operator $\hat H_{eff}$ is gapped: the lowest $N$ eigenvalues $\lambda_0,\dots,\lambda_{N-1}$ are well separated from the rest of the spectrum. In the following, we shall always assume such a low-temperature regime. Consequently, a MSM can only describe the dynamics over time intervals much larger than the inverse of the lowest eigenvalue above the gap, $\lambda_N$. Thus, $\lambda_N$ represents a typical value for the UV cutoff scale of the ET, i.e. $\Lambda \sim \lambda_N$.
To explicitly construct such an ET, starting from the microscopic Fokker-Planck dynamics, we observe (see e.g. the discussion in Ref. MSMfund4 ()) that there exist exactly $N$ linear combinations of the right eigenstates below the gap,
$$P_i(X) = \sum_{n=0}^{N-1} c_{in}\, R_n(X), \qquad i = 1,\dots,N, \qquad (10)$$
which simultaneously satisfy the following three properties: (i) non-negativity, $P_i(X) \ge 0$; (ii) disjointness, $P_i(X)\,P_j(X) \simeq 0$ for $i \ne j$; and (iii) local Gibbseanity,
$$P_i(X) \simeq \frac{1}{Z_i}\,e^{-\beta U(X)}\,\chi_i(X), \qquad Z_i = \int dX\; e^{-\beta U(X)}\,\chi_i(X), \qquad (11)$$
where $\chi_i(X)$ is negligible everywhere except in the vicinity of one and only one of the local metastable states, where it is equal to 1. $P_i(X)$ is then interpreted as the probability distribution associated with the $i$-th so-called Markov state and $\chi_i(X)$ denotes the corresponding characteristic function (see Fig. 1).
It is also relevant to consider the following linear combinations of the left eigenstates $L_n(X)$, with $n = 0,\dots,N-1$:
$$L_i(X) = \sum_{n=0}^{N-1} c_{in}\, L_n(X). \qquad (12)$$
The interest in these linear combinations resides in the fact that the functions $L_i(X)$ are proportional to the characteristic functions of the microstates: they are approximately constant where $\chi_i(X)$ is non-negligible, and are negligible elsewhere. To see this, it is sufficient to isolate the hermitian component of the distributions:
$$L_i(X) = e^{\beta U(X)}\,P_i(X) \simeq \frac{1}{Z_i}\,\chi_i(X). \qquad (13)$$
In the hermitian formalism, the left and right state distributions are replaced by a single distribution $\psi_i(X)$:
$$\psi_i(X) = e^{\frac{\beta}{2}U(X)}\,P_i(X) \simeq \frac{1}{Z_i}\,e^{-\frac{\beta}{2}U(X)}\,\chi_i(X). \qquad (14)$$
In practical applications, MSMs are never defined by directly diagonalizing $\hat H_{eff}$, as in this purely theoretical discussion. Instead, they are built by analyzing an ensemble of MD trajectories by means of dimensional reduction methods such as Time-lagged Independent Component Analysis (TICA) TICA1 (); TICA2 (); TICA3 (); TICA4 () or Principal Component Analysis (PCA) TICA1 (); PCA (), followed by geometric clustering of the projected configurations. In this case, the cutoff scale is identified with the frequency of the slowest spectral component which is projected out, e.g. the inverse of the TICA timescale associated with the first excluded independent component.
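To make the TICA step concrete, here is a minimal sketch on synthetic two-dimensional data; the AR(1) relaxation factors and the mixing angle are invented for illustration, and are not related to any system discussed in this paper:

```python
import numpy as np

# Minimal TICA sketch on synthetic data (assumed AR(1) factors and mixing
# angle, purely illustrative): solve C(tau) v = lam C(0) v; the largest
# eigenvalue recovers the lag-1 autocorrelation of the hidden slow coordinate.
rng = np.random.default_rng(2)
T, a_slow, a_fast = 50_000, 0.99, 0.50
y = np.zeros((T, 2))
for t in range(1, T):                       # two independent AR(1) processes
    y[t] = (a_slow * y[t - 1, 0] + rng.normal(),
            a_fast * y[t - 1, 1] + rng.normal())
c, s = np.cos(0.7), np.sin(0.7)
x = y @ np.array([[c, s], [-s, c]])         # hide the slow mode by mixing coordinates

tau = 1
x0, xt = x[:-tau], x[tau:]
C0 = x0.T @ x0 / len(x0)                    # instantaneous covariance
Ct = 0.5 * (x0.T @ xt + xt.T @ x0) / len(x0)  # symmetrized time-lagged covariance
lam = np.sort(np.linalg.eigvals(np.linalg.solve(C0, Ct)).real)[::-1]

assert abs(lam[0] - a_slow) < 0.02          # slow mode recovered despite the mixing
```

The implied relaxation timescale of each independent component is $-\tau/\ln\lambda_k$; the slowest component excluded from the subsequent clustering sets the cutoff scale mentioned above.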
Once the dynamics has been coarse-grained at the level of microstates, the explicit dependence on the high-dimensional configuration point $X$ becomes redundant. It is then convenient to introduce a formalism in which such a dependence is removed altogether. Hence, from this point on we shall use the hermitian formulation of the Fokker-Planck dynamics (6) and adopt Dirac's "bra-ket" formalism, in which the normalized microstates are identified with points $|i\rangle$ in a Hilbert space.
To construct the microstates $|i\rangle$, we begin by introducing position eigenstates $|X\rangle$ and the corresponding hermitian operator $\hat X$:
$$\hat X\,|X\rangle = X\,|X\rangle, \qquad (15)$$
where the eigenvalues of $\hat X$ are points in the system's configuration space. Notice that the position eigenstates obey the normalization condition
$$\langle X|X'\rangle = \delta\!\left(X - X'\right). \qquad (16)$$
The ket- and bra-microstates, $|i\rangle$ and $\langle i|$, are defined from the hermitian components $\psi_i(X)$ of the state distributions:
$$|i\rangle = \frac{1}{\sqrt{Z_i}}\int dX\; e^{-\frac{\beta}{2}U(X)}\,\chi_i(X)\,|X\rangle, \qquad (17)$$
$$\langle i| = \frac{1}{\sqrt{Z_i}}\int dX\; e^{-\frac{\beta}{2}U(X)}\,\chi_i(X)\,\langle X|. \qquad (18)$$
Note that they approximately form an orthonormal set:
$$\langle i|j\rangle \simeq \delta_{ij}. \qquad (19)$$
At finite temperature, orthogonality is weakly violated by the exponentially small overlaps between the microstates.
The instantaneous configuration of a system is described by a time-dependent state $|\psi(t)\rangle$, defined by
$$|\psi(t)\rangle = \int dX\;\psi(X,t)\,|X\rangle, \qquad (20)$$
where $\psi(X,t)$ is the solution of the imaginary-time Schrödinger equation (6). In appendix A, we show that if the distribution $P(X,t)$ can be expressed through a linear combination of eigenstates below the gap – as in Eq. (9) – then the state $|\psi(t)\rangle$ can be expressed as a time-dependent linear combination of the ket-microstates:
$$|\psi(t)\rangle \simeq \sum_{i=1}^{N}\frac{p_i(t)}{\sqrt{Z_i}}\;|i\rangle. \qquad (21)$$
The time-dependent coefficients
$$p_i(t) = \sqrt{Z_i}\;\langle i|\psi(t)\rangle \qquad (22)$$
are interpreted as the probabilities of observing the molecule in the different microstates at time $t$, and will be called the instantaneous microstate populations. In the low-temperature limit, the time spent by the system in crossing high-energy regions is exponentially small compared to the time spent within the microstates. Consequently, the following sum rule holds:
$$\sum_{i=1}^{N} p_i(t) \simeq 1. \qquad (23)$$
Note that the normalization factor $\sqrt{Z_i}$ in Eq. (22) ensures that the correct equilibrium populations are attained in the long-time limit:
$$\lim_{t\to\infty} p_i(t) = \frac{Z_i}{Z}, \qquad (24)$$
where $Z$ is the system partition function, with $Z \simeq \sum_{i=1}^{N} Z_i$ in the low-temperature limit.
The action of an operator $\hat O$ on an arbitrary state $|\psi\rangle$ is defined by its action on the microstates, i.e.
$$\hat O\,|i\rangle = \sum_{j=1}^{N} O_{ji}\,|j\rangle, \qquad (25)$$
where $O_{ji} = \langle j|\hat O|i\rangle$ is an $N\times N$ matrix.
II.2 Dynamics in the Space of Microstates
To determine the dynamics of the state $|\psi(t)\rangle$ it is convenient to introduce a time-evolution operator,
$$|\psi(t+\tau)\rangle = \hat U(\tau)\,|\psi(t)\rangle. \qquad (26)$$
The "wave function" $\psi(X,t)$ evolves according to
$$\psi(X,t+\tau) = \int dX'\;\langle X|\hat U(\tau)|X'\rangle\;\psi(X',t), \qquad (27)$$
where $\hat U(\tau)$ is the imaginary-time propagator
$$\hat U(\tau) = e^{-\tau\,\hat H} \qquad (28)$$
and $\hat H$ will be called the (effective) Hamiltonian operator.
We are now in a position to derive the time evolution of the populations of the microstates:
$$p_i(t+\tau) = \sum_{j=1}^{N} T_{ij}(\tau)\;p_j(t). \qquad (29)$$
The so-called transition probability matrix
$$T_{ij}(\tau) = \sqrt{\frac{Z_i}{Z_j}}\;\langle i|\,e^{-\tau\hat H}\,|j\rangle \qquad (30)$$
expresses the probability that a system prepared in the microstate $|j\rangle$ is found in the microstate $|i\rangle$ after a time interval $\tau$. In the MSM literature, the time interval $\tau$ is often referred to as the lag time. We emphasize that the matrix (30) manifestly satisfies the detailed balance condition, $T_{ij}(\tau)\,Z_j = T_{ji}(\tau)\,Z_i$; in addition, conservation of probability implies that $T(\tau)$ is a left stochastic matrix:
$$\sum_{i=1}^{N} T_{ij}(\tau) = 1. \qquad (31)$$
If the lag time $\tau$ is finite, Eq. (29) defines a so-called discrete-time master equation. Conversely, if $\tau$ represents an infinitesimal time interval $\Delta t$, Eq. (29) defines a so-called continuous-time master equation (see e.g. Sriraman () for related discussions),
$$\frac{d}{dt}\,p_i(t) = \sum_{j=1}^{N} K_{ij}\;p_j(t). \qquad (32)$$
The matrix $\hat K$ is called the transition rate matrix (sometimes also termed the kinetic matrix or Markov generator). Its relationship with the transition probability matrix is simply given by
$$\hat T(\tau) = e^{\tau\,\hat K}. \qquad (33)$$
We emphasize again that, in the RG perspective, the symbol $dt$ in Eq. (32) does not represent a true infinitesimal, but rather a finite increment $\Delta t$. Indeed, $\Delta t$ cannot be set smaller than the inverse frequency cutoff scale $1/\Lambda$. From this discussion it follows that the lag time of discrete-time MSMs must obey $\tau \gtrsim 1/\Lambda$. Models which violate this inequality cannot be interpreted as rigorous ETs of MD. The hierarchy of frequency scales which must be obeyed for a MSM to represent an ET is schematically illustrated in Fig. 2.
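The relationship between the rate matrix and the transition probability matrix can be checked numerically. The sketch below uses a toy three-state rate matrix (all rates and populations are invented for illustration): it verifies that the matrix exponential of a detailed-balance generator is left stochastic and leaves the Gibbs populations invariant:

```python
import numpy as np

# Toy example (assumed rates, for illustration only): a 3-state rate matrix K
# with detailed balance K_ij p_j = K_ji p_i; T(tau) = exp(tau*K) is then a left
# stochastic matrix (columns sum to one) with the equilibrium vector as fixed point.
p_eq = np.array([0.6, 0.3, 0.1])              # assumed equilibrium populations
N = len(p_eq)
K = np.zeros((N, N))
for i in range(N):
    for j in range(N):
        if i != j:
            K[i, j] = 0.5 * np.sqrt(p_eq[i] / p_eq[j])   # satisfies detailed balance
K -= np.diag(K.sum(axis=0))                   # columns sum to zero

tau = 0.8
lam, V = np.linalg.eig(K)
T_tau = (V @ np.diag(np.exp(tau * lam)) @ np.linalg.inv(V)).real  # T(tau) = e^{tau K}

assert np.allclose(T_tau.sum(axis=0), 1.0)    # left stochastic
assert np.allclose(T_tau @ p_eq, p_eq)        # Gibbs populations are stationary
assert np.all(T_tau >= 0)                     # bona fide transition probabilities
```

The exponential is evaluated here through the eigendecomposition of the generator, which is always diagonalizable for detailed-balance rate matrices since it is similar to a symmetric matrix.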
Finally, the equation expressing the evolution of the states in the Hilbert space is
$$|\psi(t+\tau)\rangle = e^{-\tau\,\hat H}\,|\psi(t)\rangle. \qquad (34)$$
Notice that this equation provides the discrete formulation of the effective Schrödinger equation (SE) and that the symmetry of the following matrices (as introduced in Sriraman ())
$$H_{ij} = -\sqrt{\frac{Z_j}{Z_i}}\;K_{ij}, \qquad (35)$$
$$\tilde T_{ij}(\tau) = \sqrt{\frac{Z_j}{Z_i}}\;T_{ij}(\tau) \qquad (36)$$
expresses the detailed balance condition. It is straightforward to show that $\hat H$ and $\hat K$ share the same spectrum, up to an overall sign. Indeed,
$$\hat H = -\,\hat\Omega\,\hat K\,\hat\Omega^{-1}, \qquad \hat\Omega = \mathrm{diag}\!\left(Z_1^{-1/2},\dots,Z_N^{-1/2}\right). \qquad (37)$$
By defining $\psi^{(m)} = \hat\Omega\,p^{(m)}$, where $p^{(m)}$ is an eigenvector of $\hat K$ with eigenvalue $-k_m$, one has $\hat H\,\psi^{(m)} = k_m\,\psi^{(m)}$. The same argument applies also to the relationship between $\hat{\tilde T}(\tau)$ and $\hat T(\tau)$.
III Even Coarser-Grained ETs: Dimensional Reduction of MSMs
In the previous sections we have seen how it is possible to derive a MSM which provides a rigorous ET for the system's slow relaxation dynamics. Now, we further assume that the spectrum of the associated master equation is also gapped – see Fig. 2b – enabling us to separate its slow and fast relaxation processes. Under such circumstances it is possible to use RG theory to further coarse-grain the dynamics, i.e. to build a lower-dimensional MSM, in which the number $n$ of macrostates is set by the number of eigenvalues below the gap.
To introduce such a dimensional reduction, let us begin by considering the case of a continuous-time MSM. The dynamics of the populations is determined by the master equation (32), which can be rewritten in the following form:
$$\vec p\,(t+\Delta t) = \left(\mathbb{1} + \Delta t\,\hat K\right)\vec p\,(t). \qquad (38)$$
We stress once again that $\Delta t$ plays the role of the UV cutoff of this ET, hence it must be chosen longer than the inverse UV cutoff frequency, yet shorter than all the relaxation times predicted by the rate matrix.
Geometrically, the set of microstate populations $p_1(t),\dots,p_N(t)$ can be identified with the coordinates of a unit-norm vector belonging to the positive quadrant of an $N$-dimensional vector space $V_N$ – see Fig. 3. Let us now introduce the so-called truncation matrix $\hat{\mathcal T}$, which maps vectors belonging to $V_N$ onto some thus far unspecified $n$-dimensional linear subspace $V_n$. In particular, the time-dependent population vector is transformed by $\hat{\mathcal T}$ as follows:
$$q_A(t) = \sum_{i=1}^{N} \mathcal{T}_{Ai}\;p_i(t). \qquad (39)$$
From here on, we shall denote with uppercase letters all the indices ranging from $1$ to $n$ and with lowercase letters those ranging from $1$ to $N$. Since $n < N$, the truncation matrix is obviously not invertible. However, it is possible to define its so-called pseudo-inverse matrix $\hat{\mathcal E}$, which coincides with the ordinary inverse for $n = N$ and satisfies the so-called Moore-Penrose relationships (see Appendix B). In practice, the pseudo-inverse of the truncation matrix (which we shall refer to as the embedding matrix) can be explicitly constructed from the singular value decomposition of the truncation matrix $\hat{\mathcal T}$:
$$\hat{\mathcal T} = \hat U\,\hat\Sigma\,\hat V^{T}, \qquad \hat{\mathcal E} \equiv \hat{\mathcal T}^{+} = \hat V\,\hat\Sigma^{+}\,\hat U^{T}, \qquad (40)$$
where $\hat U$ and $\hat V$ are the orthogonal matrices of the left and right singular vectors of $\hat{\mathcal T}$, $\hat\Sigma$ is the rectangular matrix with the singular values on the diagonal, and $\hat\Sigma^{+}$ is its transpose with the non-zero singular values inverted.
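The Moore-Penrose construction can be sketched directly; the truncation matrix below is a random full-rank matrix, assumed purely for illustration, with shapes following the $n < N$ convention of the text:

```python
import numpy as np

# Sketch: build the embedding matrix E as the Moore-Penrose pseudoinverse of a
# (randomly chosen, full-rank) n x N truncation matrix via its SVD, and verify
# the properties used in the text: T E = identity on the reduced space and
# P = E T is a projector.
rng = np.random.default_rng(0)
n, N = 3, 8
T_mat = rng.random((n, N))                 # truncation: V_N -> V_n

U, s, Vt = np.linalg.svd(T_mat, full_matrices=True)
Sigma_pinv = np.zeros((N, n))
Sigma_pinv[:n, :n] = np.diag(1.0 / s)      # invert the non-zero singular values
E = Vt.T @ Sigma_pinv @ U.T                # embedding: V_n -> V_N

assert np.allclose(E, np.linalg.pinv(T_mat))   # matches NumPy's pseudoinverse
assert np.allclose(T_mat @ E, np.eye(n))       # T E = identity (Moore-Penrose)
P = E @ T_mat                                  # reduction matrix
assert np.allclose(P @ P, P)                   # projector onto the relevant subspace
```

The projector property of the product of embedding and truncation matrices is exactly what is exploited below to define the reduction matrix.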
Using the embedding and truncation matrices, we can define a new lower-dimensional master equation:
$$\vec q\,(t+\Delta t) = \left(\mathbb{1} + \Delta t\,\hat{\tilde K}\right)\vec q\,(t), \qquad (41)$$
where
$$\hat{\tilde K} = \hat{\mathcal T}\,\hat K\,\hat{\mathcal E} \qquad (42)$$
is an effective transition rate matrix.
Our goal is to identify the truncation and embedding matrices for which the lower-dimensional master equation (41) provides the best possible Markovian low-dimensional approximation of the dynamics defined by the original $N$-dimensional master equation (38). In this case, the components of the $n$-dimensional vector $\vec q\,(t)$ are interpreted as the probabilities of observing the system in each of the different macrostates, and the matrix $\hat{\tilde K}$ contains the interconversion rates between macrostates.
Our general strategy is schematically represented in Fig. 3: we first introduce a norm which quantifies the difference between the dynamics described by the original master equation (38) and the reduced master equation (41), and thus defines the error introduced by the dimensional reduction. Next, we vary the choice of the subspace $V_n$ until such a difference is reduced to a minimum. The vector space which corresponds to the least difference will be called the relevant subspace.
To implement this scheme in practice, we begin by embedding the effective dynamics given by Eq. (41) in the larger vector space $V_N$, where the original dynamics given by (32) is defined. This can be done by applying the embedding matrix to Eq. (41):
$$\hat{\mathcal E}\;\vec q\,(t+\Delta t) = \hat P\left(\mathbb 1 + \Delta t\,\hat K\right)\hat P\;\vec p\,(t), \qquad (43)$$
where the projector
$$P_{ij} = \left(\hat{\mathcal E}\,\hat{\mathcal T}\right)_{ij} = \sum_{A=1}^{n} v^{(A)}_i\, v^{(A)}_j \qquad (44)$$
is called the reduction matrix. In this equation, $v^{(A)}_i$ is the $i$-th component of the $A$-th right-singular vector of the truncation matrix $\hat{\mathcal T}$.
It is now convenient to switch to the quantum-mechanical notation introduced in the previous section. We define the reduction operator acting on the Hilbert space:
$$\hat P = \mathbb 1 - \sum_{B=n+1}^{N} |e_B\rangle\langle e_B|, \qquad (45)$$
where the states $|e_B\rangle$ are called the target states and are related to the components of the right singular vectors of the truncation matrix, i.e.
$$\langle i|e_B\rangle = v^{(B)}_i. \qquad (46)$$
Notice that, with this definition, the reduction operator and the reduction matrix are related by $P_{ij} = \langle i|\hat P|j\rangle$. Thus, finding the optimal dimensional reduction is equivalent to identifying the set of target states whose elimination from the Hilbert space minimally affects the time evolution of the probability state.
It is most convenient to perform this elimination by projecting out one target state at a time. Let us therefore discuss the elimination of the first one, $|e_1\rangle$:
$$\hat P_1 = \mathbb 1 - |e_1\rangle\langle e_1|. \qquad (47)$$
The difference between the evolution of the population vectors in the original and in the effective dynamics is written component-wise as follows:
$$p_i(t+\Delta t) - p^{eff}_i(t+\Delta t) = \sum_{j=1}^{N}\left[\left(\mathbb 1 + \Delta t\,\hat K\right) - \hat P_1\left(\mathbb 1 + \Delta t\,\hat K\right)\hat P_1\right]_{ij}\,p_j(t). \qquad (48)$$
Hence, the term in square brackets on the right-hand side expresses the error which is introduced in the dynamics by projecting out the target state $|e_1\rangle$, and is thus called the error operator. Recalling definition (47), it can be written as follows:
$$\hat\epsilon_1 = \left\{\hat Q_1\,,\,\mathbb 1 + \Delta t\,\hat K\right\} - \hat Q_1\left(\mathbb 1 + \Delta t\,\hat K\right)\hat Q_1\,, \qquad \hat Q_1 \equiv |e_1\rangle\langle e_1|\,, \qquad (49)$$
where $\{\cdot\,,\cdot\}$ denotes the anticommutator. Our goal is then to minimize the Frobenius norm of the error operator with respect to the choice of the target state $|e_1\rangle$ (among the many possible norms one could a priori choose from, we follow the choice of Refs. RG1 (); RG2 ()):
$$\min_{|e_1\rangle}\;\left\|\hat\epsilon_1\right\|_F^2\,, \qquad \left\|\hat\epsilon_1\right\|_F^2 = \mathrm{Tr}\left[\hat\epsilon_1^{\dagger}\,\hat\epsilon_1\right]. \qquad (50)$$
Expanding the square and retaining only the leading-order terms in $\Delta t$ we find:
$$\left\|\hat\epsilon_1\right\|_F^2 \simeq 1 + 2\,\Delta t\;\langle e_1|\hat K|e_1\rangle + O(\Delta t^2). \qquad (51)$$
The target vector components are found by extremizing the bilinear form $\langle e_1|\hat K|e_1\rangle$ under the normalization constraint $\langle e_1|e_1\rangle = 1$. Introducing a Lagrange multiplier $\mu$ and imposing stationarity with respect to variations of $\langle e_1|$ we obtain:
$$\frac{\delta}{\delta\langle e_1|}\Big[\langle e_1|\hat K|e_1\rangle - \mu\left(\langle e_1|e_1\rangle - 1\right)\Big] = 0, \qquad (52)$$
which yields
$$\hat K\,|e_1\rangle = \mu\,|e_1\rangle. \qquad (53)$$
Thus, we have found that all eigenvectors of the rate matrix locally extremize the target function inside the square brackets of Eq. (52). In particular, if $|e_1\rangle$ coincides with the $m$-th eigenvector, with eigenvalue $-k_m$, then $\mu = -k_m$ and the error introduced by projecting out this vector is
$$\left\|\hat\epsilon_1\right\|_F^2 \simeq 1 - 2\,\Delta t\;k_m + O(\Delta t^2). \qquad (54)$$
The global minimum of our error estimator is realized by projecting out the eigenvector with the largest relaxation frequency. This procedure should be repeated to project out all eigenstates with eigenvalues above the gap (i.e. all of the system's eigenmodes with fast relaxation frequencies). Indeed, from Eq. (51) it follows that the elimination of all these target vectors generates errors which are small and comparable. The elimination of all these states results in the lowering of the cutoff: the new UV scale is set by the frequency of the slowest mode which was projected out – see Fig. 2a.
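The variational argument can be verified numerically. In the sketch below a generic random symmetric "Hamiltonian" is assumed, and the error operator is modeled as the anticommutator-type expression discussed above, applied to the leading-order short-time propagator; both choices are illustrative stand-ins, not taken verbatim from the paper:

```python
import numpy as np

# Numerical check (sketch, assumed generic symmetric Hamiltonian): removing a
# candidate state |e> from the short-time propagator A = 1 - dt*H produces the
# error operator {Q, A} - Q A Q with Q = |e><e|. Its squared Frobenius norm is
# stationary at eigenvectors of H and globally minimized by the eigenvector
# with the largest (fastest) eigenvalue.
rng = np.random.default_rng(1)
M = rng.random((5, 5))
H = 0.5 * (M + M.T) + 5.0 * np.eye(5)      # positive-definite symmetric generator
dt = 1e-3

def err_norm2(e):
    Q = np.outer(e, e)
    A = np.eye(5) - dt * H                 # leading-order propagator
    eps = Q @ A + A @ Q - Q @ A @ Q        # error operator for projecting out |e>
    return np.sum(eps * eps)               # squared Frobenius norm

lam, V = np.linalg.eigh(H)                 # eigenvalues in ascending order
vals = [err_norm2(V[:, k]) for k in range(5)]
assert np.argmin(vals) == 4                # fastest mode is the cheapest to remove

for _ in range(200):                       # no random state beats the fastest mode
    e = rng.normal(size=5)
    e /= np.linalg.norm(e)
    assert err_norm2(e) >= vals[4] - 1e-12
```

For an eigenvector with eigenvalue $\lambda$ the squared norm reduces to $(1 - dt\,\lambda)^2$, so the fastest-decaying mode indeed yields the smallest error, consistent with the RG prescription of projecting out the modes above the gap.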
The renormalization of a discrete-time MSM is completely analogous and is reported in Appendix C. We recall that, in the discrete-time formulation, one needs to specify the lag time $\tau$. For the dimensionally reduced MSM to represent a rigorous ET, such a lag time must be chosen so as to remain larger than the new UV time cutoff scale. In other words, the RG transformation of a discrete-time MSM is intrinsically consistent only if the lag time of the original MSM is much longer than the inverse of the smallest relaxation frequency projected out during the dimensional reduction – see Fig. 2a.
III.1 Identifying the Macrostates and Computing the Reduced Kinetic Matrix
So far we have developed a procedure to identify the subspace of the original vector space where the relevant dynamics takes place. Let us now address the problem of identifying the macrostates. We recall that we are working under the hypothesis that the spectrum of the effective Hamiltonian operator is also gapped, namely that the first $n$ right eigenstates,
$$\hat H\,|\psi_A\rangle = \lambda_A\,|\psi_A\rangle, \qquad A = 1,\dots,n, \qquad (55)$$
have eigenvalues well separated from all the other eigenvalues $\lambda_B$, with $B > n$. In the previous section, we discussed how the existence of a gap in the spectrum of the Fokker-Planck operator leads to the definition of continuous state distributions associated with the system's microstates. The same arguments can be repeated at the discrete level and lead to the definition of macrostates. We expect to find $n$ linear combinations of the lowest right eigenvectors of the hermitian Hamiltonian operator,
$$|A\rangle = \sum_{B=1}^{n} c_{AB}\,|\psi_B\rangle, \qquad (56)$$
which obey the following properties:

(i) The coefficients expressing the ket-macrostates as linear combinations of the microstates,
$$|A\rangle = \sum_{i=1}^{N} \chi_{Ai}\,|i\rangle, \qquad (57)$$
are non-negative, $\chi_{Ai} \ge 0$. These coefficients are the analog of the characteristic functions entering Eq. (11).

(ii) The states are disjoint, that is $\chi_{Ai}\,\chi_{Bi} \simeq 0$ for all $A \ne B$. Disjointness implies that each microstate belongs to one and only one of the macrostates.
One can immediately notice that Eq. (56) guarantees that the projector operates as the identity in the relevant space:
$$\hat P\,|A\rangle = |A\rangle. \qquad (58)$$
This means that the projector operator can be rewritten as a projector on the macrostates:
$$\hat P = \sum_{A=1}^{n} |A\rangle\langle A|. \qquad (59)$$
Given Eq. (59), let us compute the projector matrix elements in the microstate basis:
$$P_{ij} = \langle i|\hat P|j\rangle = \sum_{A=1}^{n} \langle i|A\rangle\,\langle A|j\rangle. \qquad (60)$$
From this we deduce that
(61) 
Furthermore, to make sure that the relation is satisfied, one necessarily has to define the bramacrostates as
(62) 
Thus, the expansion coefficients of the macrostates in the microstate basis naturally provide the truncation and embedding matrices. We recall that the condition of minimum Frobenius norm of the error operator poses a constraint only on the reduction matrix, thus leaving some arbitrariness in the choice of the truncation and embedding matrices. Within the manifold of truncation and embedding matrices which reproduce the same reduction matrix, only the choice given by (61) and (62) leads to the correct probabilistic interpretation, i.e. ensures that the expansion coefficients are related to the probability of observing the microstate inside the macrostate.
Let us finally tackle the problem of how to construct the macrostates starting from the set of microstates of the system. To this goal, we consider a new set of states defined by projecting each microstate onto the relevant subspace:
$$|\phi_i\rangle \equiv \hat P\,|i\rangle. \qquad (63)$$
Then,
(64) 
where the second line in Eq. (64) follows from the fact that $\hat P$ is a projector, so $\hat P^2 = \hat P$. Let us now suppose that some microstate $|i\rangle$ is contained in the macrostate $|A\rangle$. Then, it is immediate to prove that the state $|\phi_i\rangle$ has non-zero overlap with $|A\rangle$:
(65) 
where we used the fact that $|A\rangle$ belongs to the relevant subspace. Conversely, if the microstate is not contained in the macrostate, then $|\phi_i\rangle$ and $|A\rangle$ have no overlap (up to exponentially small corrections):
(66) 
Hence, if the state $|\phi_i\rangle$ overlaps with a given macrostate, then it has no relevant overlap with any other macrostate. This implies that each of the states $|\phi_i\rangle$ is either null or parallel to one and only one macrostate.
Given this consideration, we can compute the coefficients of the generic macrostate by imposing the same normalization adopted above:
(67) 
This means that the coefficients are simply provided by
(68) 
where the rows entering this expression are the $n$ linearly independent rows of the projector, corresponding to an arbitrary choice of $n$ linearly independent states $|\phi_i\rangle$. In Appendix D we show that this normalization implies
(69) 
The last ingredient of our Renormalization Group procedure concerns computing the effective kinetic matrix. We start from the effective Hamiltonian operator represented in the macrostate basis,
(70)
where $Z_A$ is the partition function of the $A$-th macrostate. On the other hand,
(71)
Plugging Eq. (70) into Eq. (71) one finds that the effective kinetic matrix is the solution of the following equation
(72) 
We emphasize that this is an implicit relationship: the partition functions on the right-hand side are obtained from the components of the lowest right eigenvector (the zero-mode) of the effective kinetic matrix itself.
All the proofs given in this section also hold for discrete-time MSMs, since the rate matrix and the transition probability matrix share the same eigenvectors. Thus, the macrostates of a system described by a transition probability matrix are simply provided by Eqs. (61) and (69), and the corresponding effective transition probability matrix is readily obtained as
(73) 
where the symmetrized transition matrix is defined in Eq. (36).
To summarize, Eqs. (68), (72) and (73) are the most important results of this paper. Indeed, Eq. (61) provides the definition of the macrostates in terms of the microstates, while Eqs. (72) and (73) provide the new continuous- and discrete-time master equations, which approximate the MD in the reduced space of macrostates.
III.2 The Renormalization Group Clustering Algorithm
We now describe our Renormalization Group Clustering (RGC) algorithm, which implements the RG theory developed so far. Here we illustrate it for a continuous-time MSM. The formulation of the same algorithm for a discrete-time MSM is analogous and is reported in Appendix C.

1. Compute the spectrum of relaxation frequencies from the kinetic matrix $\hat K$. Retain only the $n$ frequencies below the gap, $k_1,\dots,k_n$, and the corresponding set of right eigenvectors. Denoting by $k_{n+1}$ the smallest relaxation frequency above the gap, define the gap ratio
$$\epsilon = \frac{k_n}{k_{n+1}}, \qquad (74)$$
which quantifies the extent of the spectral gap;

2. Compute the Hamiltonian matrix defined in Eq. (35);

3. Build the projector $P_{ij} = \sum_{A=1}^{n}\psi^{(A)}_i\,\psi^{(A)}_j$, where $\psi^{(A)}_i$ is the $i$-th component of the $A$-th lowest eigenstate of the Hamiltonian matrix;

4. Extract the $n$ linearly independent rows of the projector to build the truncation matrix;

5. The matrix may contain entries of order $\epsilon$, some of which may be negative. To preserve only the relevant terms and the probabilistic interpretation, filter out all such terms using the following rule: a matrix element is set to zero if there exists another element in the same row such that
(75)
Once the irrelevant terms have been deleted from the matrix, normalize its rows (which is equivalent to normalizing to 1 the sum of the elements in each of the rows of the matrix);

6. Solve the self-consistent equation (72) to obtain the reduced kinetic matrix. To this goal, one may use the fixed-point algorithm described in Appendix E;

7. Compute the spectrum of the reduced kinetic matrix and isolate its smallest relaxation mode $\tilde k_0$. In general, $\tilde k_0 \ne 0$. To ensure the existence of a stationary state, redefine the spectrum as
$$\tilde k_A \;\to\; \tilde k_A - \tilde k_0 \qquad (76)$$
and recompute the effective kinetic matrix as
(77)
where $\hat O$ is the orthogonal matrix which diagonalizes the corresponding Hamiltonian matrix.
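The clustering stages of the procedure (building the symmetric Hamiltonian, projecting onto the slow eigenstates, filtering and normalizing) can be sketched end to end on a toy metastable rate matrix. Every rate, population, and tolerance below is an invented toy value, and the lumping criterion — comparing filtered, row-normalized projector rows — is a simplified stand-in for the extraction of the linearly independent rows:

```python
import numpy as np

# End-to-end sketch of the clustering part of the RGC algorithm on an assumed
# toy chain of 6 microstates forming two metastable blocks (states 0-2 and
# 3-5), weakly connected through a slow 2<->3 link. Rates are built as
# K_ij = a_ij * sqrt(p_i/p_j) with symmetric a_ij, so detailed balance holds.
p = np.array([0.30, 0.20, 0.10, 0.10, 0.15, 0.15])   # assumed populations
a = np.zeros((6, 6))
for i, j, rate in [(0, 1, 1.0), (1, 2, 1.0), (3, 4, 1.0), (4, 5, 1.0), (2, 3, 1e-3)]:
    a[i, j] = a[j, i] = rate
K = a * np.sqrt(np.outer(p, 1.0 / p))
K -= np.diag(K.sum(axis=0))                # columns sum to zero

# Symmetric Hamiltonian matrix, sharing its spectrum with -K
Hm = -np.diag(p**-0.5) @ K @ np.diag(p**0.5)
lam, psi = np.linalg.eigh(Hm)
n = 2                                      # eigenvalues below the gap -> 2 macrostates

# Projector onto the two lowest eigenstates
P = psi[:, :n] @ psi[:, :n].T
# Filter irrelevant entries, then row-normalize (simplified version of step 5)
P[np.abs(P) < 1e-2 * np.abs(P).max()] = 0.0
rows = P / P.sum(axis=1, keepdims=True)

# Microstates with (numerically) identical rows belong to the same macrostate
labels, reps = [], []
for r in rows:
    for m, ref in enumerate(reps):
        if np.allclose(r, ref, atol=1e-2):
            labels.append(m)
            break
    else:
        reps.append(r)
        labels.append(len(reps) - 1)

assert labels == [0, 0, 0, 1, 1, 1]        # the two metastable blocks are recovered
```

On this toy chain the two slow eigenvalues are separated from the fast ones roughly by the ratio of the inter- to intra-block rates, so the filtered projector rows collapse onto two distinct patterns, one per macrostate.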
A Python code implementing this algorithm and the equivalent version for discretetime MSMs can be made available by the authors upon request.
We emphasize that the effective transition rate matrix (or, equivalently, the transition probability matrix) is expected to satisfy the microscopic reversibility condition only up to corrections of order $\epsilon$. On the other hand, to solve the self-consistent equation and to determine the equilibrium populations it is important to consider effective models which obey such a condition exactly. In the RGC algorithm this is done at step 7, by shifting the spectrum of the effective matrix so that its lowest eigenvalue is null. We also stress the fact that the results of the renormalized MSM are only expected to be accurate up to relative corrections which scale according to the gap ratio $\epsilon$.
IV Illustrative Examples
In this section we provide two simple illustrative applications of the RGC algorithm, based on toy models.
IV.1 MSM with Multiple Gaps in the Spectrum of Relaxation Frequencies
As a first example of application of the RGC algorithm, we consider the MSM represented in Fig. 4a. The arrows represent the connectivity between the different states and the numbers near the arrows express the corresponding transition rates (measured in units of some reference rate). In particular, we notice that most transition rates are of order one in these units, a few of them are much smaller, and two are smaller still. Physically, this model represents a system in which the relaxation to thermal equilibrium is slowed down by two energy barriers playing the role of kinetic bottlenecks. The corresponding rate and Hamiltonian matrices are respectively shown in Eqs. (109) and (110) of Appendix F.
Fig. 4b shows that the spectrum of the rate matrix displays two well-separated gaps. The eigenvalues above the first gap, those between the two gaps, and the two below the second gap differ from one another by orders of magnitude. Consequently, the original continuous-time MSM must be defined using a cutoff smaller than the inverse of the highest frequency shown in Fig. 4b.
In the presence of two gaps it is possible to define two ETs, characterized by a different level of resolution, i.e. a different degree of dimensional reduction. In the finer-grained effective MSM, only the fast relaxation modes above the first gap are projected out. The number of macrostates of this effective model is set by the number of eigenvalues below the first gap, and the new cutoff is smaller than the smallest relaxation frequency above that gap. In the even coarser-grained effective MSM, all the relaxation modes with frequency above the second gap are projected out and a new cutoff is defined correspondingly. In this ET, there are only two macrostates.
From the block-diagonal structure of the kinetic matrix in Eq. (109) it is immediate to guess the structure of the macrostates in the finer- and coarser-grained effective MSMs, which are graphically highlighted by the ellipses in Fig. 4c and Fig. 4d, respectively. In the following we show how these structures emerge from the RGC analysis.
The first step consists in computing the reduction matrix, i.e. the projector onto the lowest 6 eigenstates of the Hamiltonian matrix. The result is shown in Eq. (111) of Appendix F. We note that some rows of this matrix differ only by relative corrections of the order of the gap ratio, which represents the estimate of the intrinsic relative uncertainty of the RGC approach. To this level of accuracy, it is legitimate to set to zero all entries of that relative order and to require that two vectors are equal if their relative difference is smaller than the gap ratio. The new matrix is reported in Eq. (112) of Appendix F. It is worth noticing that such a filtering procedure preserves the symmetry of the original projector. Furthermore, in this form it is clear that only 6 rows are linearly independent. The truncation operator is obtained by normalizing such rows and arranging them in a matrix, and is provided in Eq. (113) of Appendix F.
Finally, the corresponding effective kinetic matrix is calculated by solving the self-consistent equation (72). The result obtained using the fixed-point algorithm described in Appendix E is reported in Eq. (114) of Appendix F. We note that the rows of this matrix do not sum exactly to zero, but to a number smaller than the intrinsic error of the method. Within such an accuracy it is legitimate to shift the diagonal elements by an amount given by the lowest eigenvalue of the matrix, so that the new matrix satisfies the detailed balance condition. This spectral-shift procedure yields the matrix reported in Eq. (115) of Appendix F.
The same procedure can be repeated to build the coarser-grained effective MSM. In this case, we need to build a new projector matrix which maps onto the space spanned by the 2 lowest eigenvectors of the Hamiltonian matrix. To build the truncation operator we discard all relative differences which are of the order of the new gap ratio or lower, and obtain the truncation operator shown in Eq. (116) of Appendix F. Through this procedure one finds the two macrostates shown in Fig. 4d. The kinetic matrix obtained after enforcing detailed balance is reported in Eq. (117) of Appendix F.
Let us now assess the accuracy of these two effective descriptions of the kinetics. To this goal, we compare the relaxation rates of the two effective models with the slowest rates calculated from the eigenvalues of the kinetic matrix of the original MSM. We expect the finer-grained MSM to reproduce well the lowest 6 relaxation frequencies of the original model, and the coarser-grained model to reproduce only the thermal relaxation rate (we recall that the first eigenvalue is set to 0 by construction).
Original relaxation rates  6.93  6.39  8.74 
