Exact mean field inference in asymmetric kinetic Ising systems

# Exact mean field inference in asymmetric kinetic Ising systems

M. Mézard and J. Sakellariou Laboratoire de Physique Théorique et Modèles Statistiques, CNRS and Université Paris-Sud, Bât 100, 91405 Orsay Cedex, France
July 3, 2019
###### Abstract

We develop an elementary mean field approach for fully asymmetric kinetic Ising models, which can be applied to a single instance of the problem. In the case of the asymmetric SK model this method gives the exact values of the local magnetizations and the exact relation between equal-time and time-delayed correlations. It can also be used to solve efficiently the inverse problem, i.e. determine the couplings and local fields from a set of patterns, also in cases where the fields and couplings are time-dependent. This approach generalizes some recent attempts to solve this dynamical inference problem, which were valid in the limit of weak coupling. It provides the exact solution to the problem also in strongly coupled problems. This mean field inference can also be used as an efficient approximate method to infer the couplings and fields in problems which are not infinite range, for instance in diluted asymmetric spin glasses.

Inference problems are as old as scientific modelling: given data, how can we reconstruct a model which accounts for it, and find the parameters of the model? This is particularly difficult when data is obtained from networks of many interacting components. The fast development of high-throughput technologies in various fields of biology, ranging from gene regulation to protein interaction and neural activity, is generating a lot of data, which is challenging our ability to infer the structure and parameters of the underlying networks.

This ‘network reconstruction’ problem is typically an inverse problem which has motivated a lot of activity in machine learning and in statistical physicsAckleyHinton; Hinton; KappenSpanjers; KappenRodriguez; Tanaka; SessakMonasson; MezardMora; Marinari; HuangSuscProp; HuangMF; Wainwright; AurellOR; HertzRTTAZ; RoudiHertzPRL; RoudiHertzLong; RoudiAH; ZengAAM; WeigtWhite; BraunsteinPWZ; CoccoLeibMon. Until recently the main efforts have been dedicated to reconstructing equilibrium Boltzmann-Gibbs distributions. In the so-called inverse Ising model, one typically assumes to have data in the form of some configurations, which we shall call ‘patterns’, of a -spin Ising system drawn from the Boltzmann-Gibbs distribution with an energy function including one-body (local magnetic fields) and two-body (exchange couplings) terms. The problem is to reconstruct the local fields and the exchange couplings (collectively denoted below as ‘couplings’) from the data. This problem has been actively studied in recent years, in particular in the context of neural network inference based on multielectrode recordings in retinas SchneidBSB; ShlensFGG; CoccoLeibMon. The standard solution of this problem, known as the Boltzmann machine, computes, for some given couplings, the local magnetizations and the two-spin correlations, and compares them to the empirical estimates of magnetizations and correlations found from the patternsAckleyHinton; Hinton. The couplings are then iteratively adjusted in order to decrease the distance between the empirical magnetizations/correlations and the ones computed from the model. A Bayesian formulation shows that the problem of finding the couplings is actually convex, so that this iterative procedure is guaranteed to converge to the correct couplings, provided that the number of patterns is large enough to allow for a good estimate of correlations. The drawback of this method is that the reliable computation of the magnetizations/correlations, given the couplings, which is done using a Monte Carlo procedure, is extremely time consuming. Therefore this exact approach is limited to systems with a small number of spins. Most of the recent works on this issue have developed approximate methods to infer the couplings. Among the most studied approaches are the naive mean field method KappenRodriguez; Tanaka; HertzRTTAZ, the TAP approach KappenSpanjers; RoudiHertzPRL; ZengAAM, a method based on a small magnetization expansion SessakMonasson, and a message-passing method called susceptibility propagationMezardMora; Marinari; HuangSuscProp. Another approach which has been developed is that of linear relaxation of the inference problemWainwright. The inverse-Potts problem is a version of this same problem, with variables taking possible states. The case is relevant for inferring interaction in protein pairs from data on co-evolution of these pairs, and its solution by susceptibility propagation has given an accurate prediction of inter-protein residue contactsWeigtWhite. Another case which has received some attention is the problem of reconstruction in Boolean networks (see e.g.BraunsteinPWZ and references therein).

However, in many applications to biological systems, in particular the ones concerning neural activity and gene expression network, the assumption that the patterns are generated by an underlying equilibrium Boltzmann-Gibbs measure is not well founded. Couplings are typically asymmetric, and they may vary in time, so there is no equilibrium measure. This has prompted the recent study of inference in purely kinetic models without an equilibrium measure CoccoLeibMon; HertzRTTAZ; RoudiHertzPRL; ZengAAM. A benchmark on this dynamic inference problem is the inverse asymmetric kinetic Ising model. The framework is the same as the equilibrium one: one tries to infer the parameters of the dynamical evolution equation of an Ising spin systems, given a set of patterns generated by this evolution. The recent works HertzRTTAZ; RoudiHertzPRL; RoudiHertzLong; ZengAAM have studied the performance of two mean-field methods on this problem, the naive mean field (nMF) and a weak-coupling expansion which they denote as TAP method. They have shown that, in the case of the fully asymmetric infinite range spin glass problem, the inference problem can be solved by these methods in the case where the spins are weakly coupled. In the present work we present a (non-naive!) mean field approach which solves the problem at all values of the couplings (and reduces to their TAP approach at weak coupling).

The kinetic Ising model which we shall study is the same as the one of RoudiHertzPRL. Ising spins evolve in discrete time, with a synchronous parallel dynamics. Given the configuration of spins at time , , the spins are independent random variables drawn from the distribution:

 P(s(t)|s(t−1))=N∏i=112cosh(βhi(t))eβsi(t)hi(t) (1)

where

 hi(t)=Hi(t−1)+∑jJij(t−1)sj(t−1) (2)

Note that both the local external fields and the exchange couplings may depend on time. Here we are interested in a fully asymmetric model. We generate the by an asymmetric version HertzGS; Parisi; Derrida of the infinite -range Sherrington-Kirkpatrick spin glass model SK, in which for each directed edge the coupling is a gaussian random variable with variance . Notice that and are independent random variables. We do not include self-interactions (we take ), although this could be done without changing the results. As initial conditions we take with probability . Our method also applies to the case of asynchronous dynamics, studied in ZengAAM with the TAP approach, but to keep the presentation simple we shall study only the case of the synchronous parallel dynamics in this letter.

We first derive the mean-field equations for the magnetizations . Because the couplings are asymmetric, , therefore the Onsager reaction term is not present in this problem. This makes the derivation of our equations, which we shall denote just ’mean-field’ equations, particularly easy. The approximate equations used in RoudiHertzPRL; ZengAAM, originally derived in KappenRodriguez, have been obtained by a second order expansion in the couplings. When this expansion is applied to the symmetric problem it gives back the TAP equations TAP with their Onsager reaction term. In the present case of asymmetric coupling, it keeps the correction of order . We shall keep for these second-order-expanded equations the name ‘TAP’-equations, as used by KappenRodriguez; RoudiHertzPRL; ZengAAM.

The local field on spin due to the other spins, , is the sum of a large number of terms. Therefore it has a gaussian distribution with mean

 gi(t−1)=∑jJijmj(t−1) (3)

and variance

 Δi(t−1)=∑jJ2ij(1−mj(t−1)2) (4)

(in order to derive this last formula, one must use the fact that the typical connected correlation is typically of order ; this will be checked self-consistently below). Using this property and the definition of the dynamics (1), one obtains the magnetization of spin at time :

 mi(t)=∫Dxtanh[β(Hi(t−1)+gi(t−1)+x√Δi(t−1))] , (5)

where is the probability density for a Gaussian variable with zero mean and variance unity.

Equations (3,4,5) are our mean field (MF) equations for this problem, valid on a given instance. Similar dynamical equations have been obtained in the study of the sample-averaged order parameter in asymmetric neural networksDerGarZip; GutMez and spin glassesDerrida. They can be iterated starting from some initial condition (in our case ) in order to get all the magnetizations at any time. They rely only on the central limit theorem and they are exact in the large limit, for any set of couplings and external fields, even if they are time-dependent. These differ from the ‘TAP’ equations of RoudiHertzPRL; ZengAAM; KappenRodriguez; KappenSpanjers which can be written in our notation:

 mi(t)=tanh[βHi(t−1)+βgi(t−1)−mi(t)β2Δi(t−1)] , (6)

and from the naive mean field (nMF) equations:

 mi(t)=tanh[β(Hi(t−1)+gi(t−1))] . (7)

The nMF equations and the ‘TAP’ equations actually give the same result as our exact MF equations, when expanded in powers of , respectively to order and , but they differ at order . The fact that ‘TAP’ equations agree with the exact MF to second order in a weak coupling expansion is consistent with their derivation through second order Plefka-type expansionKappenSpanjers. The correctness of the MF equations (5,3,4) can be easily checked numerically as shown in the left panels of Fig.1.

We now turn to the computation of correlations. We shall establish the mean field relation between the time-delayed and the equal-time correlation matrices:

 Dij(t) ≡ ⟨δsi(t+1)δsj(t)⟩ (8) Cij(t) ≡ ⟨δsi(t)δsj(t)⟩ , (9)

where we define as the fluctuation of the magnetization:

We start by writing , where is gaussian distributed with mean and variance . Now, by definition of we have

 (10)

Hereafter in order to keep notations simple in the derivation of the relation between and we work at a fixed time and we thus drop the explicit time indices: all time indices in this derivation are equal to (e.g. , , etc.) We get:

 ∑kJjkDik=⟨(gj+δgj)tanh[β(Hi+gi+δgi)]⟩−gj⟨tanh[β(Hi+gi+δgi)]⟩=⟨δgjtanh[β(Hi+gi+δgi)]⟩ (11)

In order to evaluate the average we need the joint distribution of and . The crucial point to keep in mind is that, as the couplings are of order , each matrix element of and is also small, of order . Their covariance is therefore small:

 ⟨δgiδgj⟩ = ⟨∑kJik(sk−⟨sk⟩)∑lJjl(sl−⟨sl⟩)⟩ (12) = ∑k,lJikJjlCkl=(JCJT)ij≡ε , (13)

where is typically of order . So the joint distribution of and takes the form, in the large limit (omitting terms of order ):

 P(x,y)=12π√ΔiΔjexp(−x22Δi−y22Δj+εxyΔiΔj) (14)

Using the small expansion of eq. (14) we can rewrite eq. (11) as

 ∑kJjkDik = εΔiΔj∫dx√2πΔidy√2πΔje−x22Δi−y22Δjxy2tanh[β(Hi+gi+x)] (15) = εβ∫dx√2πΔiexp−x22Δi(1−tanh2[β(Hi+gi+x)]) . (16)

Combining eq. (12) and eq. (15) we get:

 (DJT)ij=(JCJT)ijβ∫dx√2πΔie−x22Δi(1−tanh2β(Hi+gi+x)) , (17)

which gives the explicit mean-field relation between and . Putting back the time indices, we obtain the final result in matrix form:

 D(t)=A(t)J(t)C(t) , (18)

where is a diagonal matrix: , with:

 ai(t)=β∫Dx[1−tanh2β(Hi(t)+gi(t)+x√Δi(t))] . (19)

The final result (18) takes exactly the same form as the one found with the naive mean field equation and with the ‘TAP’ approach. The predictions of all three methods, nMF, ‘TAP’ and our MF method is always , with a diagonal matrix which differs in each case. As shown in RoudiHertzPRL, the nMF approximation gives:

 anMFi(t)=β[1−mi(t+1)2] , (20)

the ‘TAP’ approximation gives:

 aTAPi(t+1)=β[1−mi(t+1)2][1−(1−mi(t+1)2)β2∑kJ2ik(1−mk(t)2)] (21)

and our mean field prediction is the one given in (19).

We claim that, as in the case of the magnetizations, our mean field equations connecting to are exact in the asymmetric SK model, in the large limit. This statement can be checked numerically by comparing with the experimental values of found by monte carlo simulations, as shown in Fig.1.

These results on the mean field relation between and can be used for the inverse problem. Given ‘patterns’, which are time sequences of length generated from the distribution (1), one can estimate for each , the magnetizations , the equal time correlations and the time-delayed correlations . The problem is to infer from these data the values of the couplings and of the local fields . Without loss of generality, we can use as it is absorbed in the strength of couplings and fields that we want to infer. We shall solve this problem using the mean field equations.

The problems corresponding to different times and sites decouple. So let us consider a fixed value of and , and infer the for , and . To lighten notation we drop the explicit indices and , and we denote , , , , , . Following RoudiHertzPRL, one can obtain by inverting the relation (18). The first step is to invert the empirical matrix and compute:

 bj=∑kDik(τ)C−1kj(τ) . (22)

If one knows the number one can then infer the couplings from (18):

 Jij(τ)=bj/a . (23)

Let us now see how can be computed. The mean field equation (5) for the magnetization reads:

 m=∫Dxtanh[H+g+x√Δ] . (24)

The equation (19) for is

 a=∫Dx(1−tanh2[H+g+x√Δ]) . (25)

 Δ=1a2∑jb2j(1−m2j)=γa2 . (26)

To solve this system of equations, we propose the following iterative procedure. Using the empirical correlations and magnetizations estimated from the patterns, we first compute from (22) the , , and .

Then we use the following mapping to find .

• Start from a given value of .

• Using the empirical value of and the value of , compute by inverting (24). The right-hand side of this equation is an increasing function of so this inversion is easy.

• Using and , compute using (25)

• Compute the new value of , called , using (26).

It is worth pointing out that in the thermodynamic limit, , the value of becomes independent of . So, if the system under consideration is large enough, the above iteration could be perfomed only once in order to reduce computation time.

This procedure defines a mapping from to , and we want to find a fixed point of this mapping. It turns out that a simple iterative procedure, starting from an arbitrary (for instance ) and using , usuallly converges. More precisely, it can be shown that and that the asymptotic form for the slope of for is , where is such that . We have found numerically that when the number of patterns is large enough the slope verifies: . Therefore the mapping converges exponentially fast to the unique fixed point. This method therefore works when the number of patterns per spin is large enough. In the double limit and large enough the above procedure thus allows to get the exact result for ; and therefore to find the couplings . Once the couplings have been found, one can easily compute , and therefore get the local field . The number of operations needed for the full inference of the couplings and fields is dominated by the inversion of the correlation matrix , a time which is typically at most of order . If the number of patterns is too small, it may happen that there is no solution to the fixed point equation . Then one can decide to use , which is nothing but the nMF estimate for .

We have tested our mean field inference method on the asymmetric SK problem, where the couplings are time-independent, gaussian distributed with variance and the fields are time independent, uniformly distributed on . Fig.(2) shows a scatter plot of the result on one given instance at and , and compares it to the inference method of RoudiHertzPRL using nMF and ‘TAP’ (the ‘TAP’ inference is limited to small values of : at large it fails). Figs. 3 and 4 show a statistical analysis of the performance of MF inference. It accurately infers the couplings and fields even in the strong coupling regime.

The method that we propose is exact and allows for a very precise inference of the couplings when applied to the fully asymmetric SK spin glass, at any temperature, if the number of patterns is large enough. At the same time, it is an easy and versatile method which can be used as an approximate inference method when the number of patterns is not very large (although one should at least have in order for to be invertible), or when the underlying model is not of the SK type. As an example showing the possible use of the method, we have applied it to a sample where the matrix is sparse, generated as follows. We first generate a regular graph with vertices and degree on each vertex. For each edge of this graph we choose randomly with probability an orientation, say . Then we take and is drawn randomly from the probability density . All the other couplings corresponding to pairs of sites which are not in the graph are set to . One then iterates the dynamics (1) 100000 times at , and uses this data to reconstruct the couplings. Fig. 5 shows the resulting couplings as a scatter plot. The topology of the underlying interaction graph can be reconstructed basically exactly, both by nMF and MF by using a threshold, deciding that all reconstructed couplings with vanish. The non-zero couplings are found accurately by the MF inference method.

To summarize, we have introduced a simple mean field method which can be applied on a single instance of a dynamical fully asymmetric Ising model. In the case of the asymmetric SK model this MF method gives the exact values of the local magnetizations and the exact relation between equal-time and time-delayed correlations. This method can be used to solve efficiently the inverse problem, i.e. determine the couplings and local fields from a set of patterns. Again this inference method is exact in the limit of large sizes and large number of patterns, in the asymmetric SK case. It can also be used in cases where the underlying model is different, for instance for diluted models. This could be quite useful for many applications.

We thank Lenka Zdeborová for useful discussions.This work has been supported in part by the EC grant ’STAMINA’, No 265496.

## References

You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters