Central limit theorem for quasilocal statistics of spin models on Cayley graphs
Abstract
Central limit theorems for linear statistics of lattice random fields (including spin models) are usually proven under suitable mixing conditions or quasiassociativity. Many interesting examples of spin models do not satisfy mixing conditions, and on the other hand, it does not seem easy to show a central limit theorem for local statistics via quasiassociativity. In this work, we prove general central limit theorems for local statistics and quasilocal statistics of spin models on discrete Cayley graphs with polynomial growth. Further, we complement these results by proving similar central limit theorems for random fields defined on discrete Cayley graphs but under the stronger assumptions of mixing (for local statistics) and exponential mixing (for quasilocal statistics). All our central limit theorems assume a suitable variance lower bound, like many others in the literature. We illustrate our general central limit theorem with specific examples of lattice spin models and statistics arising in computational topology, statistical physics and random networks. Examples of clustering spin models include quasiassociated spin models with fast decaying covariances like the off-critical Ising model, level sets of Gaussian random fields with fast decaying covariances like the massive Gaussian free field, and determinantal point processes with fast decaying kernels. Examples of local statistics include intrinsic volumes, face counts and component counts of random cubical complexes, while quasilocal statistics include nearest neighbour distances in spin models and Betti numbers of random cubical complexes.
Keywords: Clustering spin models, mixing random fields, central limit theorem, Cayley graphs, fast decaying covariance, Gaussian random field, determinantal point process, quasilocal statistics, cubical complexes, nearest-neighbour graphs
MSC: 82B20 Lattice systems (Ising, dimer, Potts, etc.) and systems on graphs; 60G60 Random fields; 60F05 Central limit theorems and other weak theorems; 60D05 Geometric probability and stochastic geometry
1 Introduction:
Our results in full generality are on finitely generated, countably infinite Cayley graphs, but in the introduction we shall restrict ourselves to the simple case of integer lattices, i.e., $\mathbb{Z}^d$, to motivate and illustrate our results. A simple model of random lattice networks is to consider a random subset of $\mathbb{Z}^d$ (the $d$-dimensional integer lattice) as nodes and define edges or other features depending on the geometry of the nodes. Under such a setup, various performance measures/functionals/statistics of the network reduce to geometric functionals/statistics of the underlying set of random nodes. Mathematically, the set of random nodes is a random element in $\{0,1\}^{\mathbb{Z}^d}$, with $1$ denoting the presence of a node at the corresponding location in $\mathbb{Z}^d$ and $0$ denoting the absence of a node. Denoting the random node set as $\mathcal{P}$, various geometric and topological features of the random network are encapsulated in the random subset $\bigcup_{x \in \mathcal{P}} (x \oplus Q)$ (called a cubical complex), with $Q = [-\tfrac{1}{2}, \tfrac{1}{2}]^d$ being the centered unit cube and $\oplus$ denoting the Minkowski sum. A common way to understand this random set is to investigate the asymptotics of its restriction to a growing sequence of finite windows. Though the mathematical model described above is possibly one of the simpler ones, it appears in various avatars in diverse areas such as statistical physics, digital geometry, cosmology and stereology.
One of the oft-used models for lattice random networks (cf. Haenggi2010 (); Franceschetti2008 ()) is the percolation model arising from statistical physics (Grimmett2010 ()). We note here that percolation models encode only pairwise interactions between nodes and indeed, it is very common for complex networks to be modeled as graphs using pairwise interactions between the nodes (see (Boccaletti2006, , Section 6)). However, in many of these models the interactions are not just pairwise, but happen between subsets of nodes, referred to as higher-order interactions. Hypergraphs represent a natural choice for modeling such higher-order interactions, and so do cubical complexes. The additional topological structure of cubical complexes also makes them suitable for modeling surfaces or other topological structures. A hypergraph can be constructed easily from a cubical complex by making a hyperedge if . Due to the choice of , we shall have at most hyperedges on the constructed from . However, it is possible to consider any other compact subset of instead of to build such hypergraphs, and obtain more general hypergraphs. But for ease of illustrating our main theorems, we shall restrict ourselves to this simple model of cubical complex or the corresponding hypergraph. Both cubical complexes and hypergraphs are very useful models of complex networks. For example, Atkin74 (); Atkin76 () proposed simplicial complexes to model social networks, and we refer the reader to Kraetzl01 () for a survey on this line of research using simplicial complexes for network analysis. Though simplicial complexes are different from cubical complexes, our methods are applicable to simplicial complexes built on spin models as well. In Klamt2009 (), hypergraphs have also been proposed as models for cellular networks such as protein interaction, reaction and metabolic networks.
Apart from these examples, further examples, including food webs and collaboration networks, are discussed in Estrada2005 (); Michoel2012 (). Analysis of such networks involves understanding the corresponding cubical complex or its generalization (see Section 2.2). Though we shall not explicitly state applications of our results to such performance metrics, we hope it will be clear to the reader that our applications in Section 2.2 extend naturally to other hypergraph statistics as well.
In statistical physics, such a random set of nodes is termed a spin model, and there is considerable interest in the connectivity properties of spin models (Copin2015 (); Funaki05 (); Grimmett2010 (); Friedli2017 (); Bovier2006 (); Simon2014 ()). This is all the more important when one considers random cubical complexes as models of discrete random surfaces as in Funaki05 (); Giacomin2001 (). With this perspective, it is natural to study connectivity properties of surfaces. In Sections 2.2.2 and 2.2.3, we shall mention asymptotics of nearest neighbour distances and of Betti numbers of the cubical complex, the latter being a measure of high-dimensional connectivity of surfaces.
Besides being considered as models of complex networks or discrete surfaces, cubical or cell complexes have often been used in digital image analysis. A simple digital image is nothing but black and white values (i.e., $\{0,1\}$ values) assigned to lattice points. The geometry and topology of digital images are of interest in image processing, morphology, cosmology and stereology (see Gray71 (); Kong1989 (); Schladitz06 (); Goring2013 (); Hilfer2000 (); Svane2017 (); Klette2004 (); Saha2015 ()). We shall explicitly mention some of the statistics (for example, subgraph counts and component counts) in Section 2.2.1; these are motivated by those that arise in the above referenced literature. For example, Minkowski tensors of random cubical complexes have recently received attention in astronomy (see Goring2013 (); Klatt2016 ()). Minkowski tensors are a generalization of intrinsic volumes, and asymptotics of the latter shall concern us in Section 2.2.1.
It is pertinent to mention here that in (Bulinski12Bernoulli, , Section 6), it was said about random fields on $\mathbb{R}^d$ that “It would be desirable to prove limit theorems for joint distributions of various surface characteristics of different classes of random fields …it might be of interest to prove limit theorems involving other Minkowski functionals for level sets such as the boundary length or the Euler characteristics.” Such a question remains unanswered even for random fields parametrized by $\mathbb{Z}^d$. As level sets of random fields are an example of spin models (see below), we can safely remark that our general central limit theorems have reduced the task of proving a central limit theorem for many statistics (including those mentioned above) of various spin models on many Cayley graphs to that of proving variance lower bounds. Indeed, this represents one of the major contributions of this article.
We remark here that other statistics of interest arising naturally in signal-to-interference-plus-noise networks (see Haenggi2010 (); Franceschetti2008 ()), nearest neighbour edges, local topological numbers as in (Saha2015, , Section VI), discrete Morse critical points as in Forman02 () and local porosity as in Hilfer2000 (), also fall within our framework.
Even though the applications cited above are important contributions of our article as well as crucial motivations for our work, our main contribution can be said to lie within the realm of limit theorems for asymptotically independent random fields as in Malyshev75 (); Neaderhouser78 (); Martin80 (); Sun91 (); Doukhan (); Bradley05 (); Heinrich13 (); Bulinski07 (); Bulinski12 (); Bjorklund2017 (). We shall survey this literature in detail to place our results in context, and we shall make more specific remarks on the relation between this literature and our results (see Remark 4) after stating our results.
A lattice random field is a collection of random variables indexed by the lattice points. Various statistics of the random field can be expressed as sums of score functions which encode the interaction of with . Setting , one is interested in the asymptotics for
(1.1) 
Not surprisingly, a large section of the literature on random fields is devoted to the simplest of score functions, i.e., . In what follows, we shall call the corresponding a linear statistic. In contrast to linear statistics, two other general classes of statistics, local statistics and quasilocal statistics, have received very little attention. The two are defined rigorously in Section 1.2. Briefly, is a local statistic if there exists a (deterministic) such that for all . In vague terms, is quasilocal if is allowed to be random with suitably decaying tail probability. It is to be understood that we refer to the central limit theorem (often abbreviated as CLT) whenever we use the words asymptotics or limit theorems below.
A spin model can be naturally viewed as a random field by setting . Under such a consideration, all the aforementioned statistics are either local or quasilocal statistics. Spin models also arise as level sets of random fields, i.e., set for some . Such level sets are also of interest in the applications mentioned above and represent yet another use for our results.
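To make the notion of a local statistic concrete, the following is a minimal sketch of a sum of local scores for a toy spin model. Everything in it is an illustrative assumption and not taken from the article: the spins are i.i.d. Bernoulli, the lattice is replaced by an n x n discrete torus to avoid boundary effects, and the names `sample_spins`, `isolated_score` and `total_mass` are hypothetical. The score of a site is 1 if the site is occupied and its four lattice neighbours are empty, so it depends only on the configuration within graph distance 1 of the site.

```python
import random

def sample_spins(n, p, rng):
    """i.i.d. Bernoulli(p) spins on the n x n discrete torus (assumed toy model)."""
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(n)]

def isolated_score(spins, i, j):
    """Local score: 1 if (i, j) is occupied and its 4 neighbours are empty.

    Only the configuration within graph distance 1 of (i, j) is read,
    so the score is local with radius 1.
    """
    n = len(spins)
    if spins[i][j] == 0:
        return 0
    neighbours = [((i + 1) % n, j), ((i - 1) % n, j),
                  (i, (j + 1) % n), (i, (j - 1) % n)]
    return 1 if all(spins[a][b] == 0 for a, b in neighbours) else 0

def total_mass(spins, score):
    """Sum of the score over all sites of the window (here: the whole torus)."""
    n = len(spins)
    return sum(score(spins, i, j) for i in range(n) for j in range(n))

rng = random.Random(0)          # fixed seed for reproducibility
spins = sample_spins(50, 0.3, rng)
h = total_mass(spins, isolated_score)   # number of isolated occupied sites
```

Other local statistics, such as face counts of the cubical complex, fit the same pattern by swapping the score function.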
While asymptotics for linear statistics of follow naturally when is an i.i.d. random field, asymptotics of local or quasilocal statistics can be deduced from the powerful martingale-difference based central limit theorem of Penrose01 (). We refer the reader to a recent application of this general central limit theorem to random topology in Hiraoka16 (). Once we drop the assumption of independence, the classical methods, and even those in Penrose01 (), fail. A natural heuristic is that stationary ‘asymptotically’ independent random fields shall display the same asymptotic behaviour as i.i.d. random fields. However, it is a challenge to formulate a notion of ‘asymptotic independence’ that is powerful enough to prove various limit theorems and yet accommodating enough to include many interesting examples. To the best of our knowledge, two successful approaches to capture the notion of ‘asymptotic independence’ in random fields are mixing and dependence.
In short, mixing conditions require that the ‘distance’ between sigma-algebras of and for decays suitably fast as a function of the distance between and . We shall mostly focus on mixing in this article and refer to Section 1.3 for more details.
On the other hand, dependence requires ‘suitable decay’ of covariance of and for far apart and bounded Lipschitz functions . The ‘suitable decay’ allows for dependence on cardinalities of but in a specific manner, and this indeed is the obstacle in relating mixing to dependence. Refer to Section 4.1 for details.
However, we also shall use the weaker notion of ‘clustering’ arising in statistical physics (cf. Malyshev75 (); Martin80 ()) which roughly states that for disjoint sets
decays exponentially fast as a function of the distance between and with the rate of decay allowed to depend on cardinalities of with a lot more flexibility (see Definition 3). These different notions of ‘asymptotic independence’ and their relations are summarized in the following figure.
While asymptotics for linear statistics are proven under all three conditions, namely mixing (Bradley93 ()), dependence (Bulinski12 ()) and clustering (Malyshev75 (); Martin80 ()), the nondegeneracy of the limit requires additional assumptions.
However, the limit theorems as stated in any of these papers do not apply to local or quasilocal statistics.
Thus, in order to use mixing or dependence or clustering to prove a CLT for , one needs to show that mixing or dependence or clustering holds for the random field under appropriate assumptions on the random field and the score functions . If we consider generating local statistics, such an approach can be carried out successfully for mixing random fields (see Theorem 1.6). The dependence argument does not apply to local statistics, i.e., the random field need not be dependent even if is so. If we consider generating quasilocal statistics, assuming exponential mixing of and using ideas involving clustering, we prove a central limit theorem (see Theorem 1.7).
Further, even when these two notions have been used to prove central limit theorems, the attention has been restricted to either random fields on or but rarely on more general spaces. The notions of asymptotic independence and quasilocality can be naturally defined on any metric space and so it is natural to wonder whether such extensions to more general spaces and their consequences have been considered. While the ergodic theorem (or strong law in our language) has been studied extensively in the ergodic theory or dynamical systems literature, the extension of the central limit theorem to more general spaces (mainly to ‘nice’ groups or their Cayley graphs) has received some attention but many questions still persist. For example, we refer the reader to the survey of Derriennic2006 () and to the recent articles (Cohen2016, , Theorems 3.4 and 3.5), (Bjorklund2017, , Theorems 1.1 and 1.5). Though the language is vastly different, the crux of these results is again that the CLT holds for under suitable weak mixing conditions (which are closer to our clustering condition) on the random fields where is now a ball of radius centered at the identity element in a group or its Cayley graph.
While all the aforementioned results are important precursors to our article, to the best of our knowledge, there is no CLT with easy-to-use geometric conditions on , and simple mixing conditions on . Such conditions are more common in the literature on point processes in continuous settings such as Euclidean spaces or some nice compact manifolds (see Yukich2013 (); Penrose2013 (); Yogesh16 (); Lachieze2017 ()). In this article, we prove similar generic central limit theorems for ‘nice’ geometric statistics of the form (1.1) for suitably mixing or clustering random fields on Cayley graphs of finitely generated infinite groups. We make extensive remarks following our main theorems (see Remark 4) comparing our results with those in the literature, and also indicating that the “problem of establishing sufficient conditions on a function defined on an ergodic dynamical system in order that the central limit theorem holds” (see Derriennic2006 ()) is far from being answered satisfactorily.
Given the above context, we remark that while many of the interesting models (e.g., the massive Gaussian free field, the off-critical Ising model, determinantal random fields, etc.) satisfy the dependence condition, not many satisfy mixing. In other words, there is a trade-off between dependence and mixing in that the former admits more examples but the latter is powerful enough to prove asymptotics. To our advantage, the ‘clustering’ condition manages to retain the benefits of both mixing and dependence. We summarize our central limit theorems in relation to those in the literature in the following table.
Condition   | Linear statistics | Local statistics | Quasilocal statistics
Clustering  | Malyshev75 () *+  | Theorem 1.4 *+   | Theorem 1.4 *+
Dependence  | Bulinski12 () *   | Theorem 1.4 *+   | Theorem 1.4 *+
Mixing      | Bradley15 () *    | Theorem 1.6 *    | Theorem 1.7 *
The direct precursor to this article is Yogesh16 (), where asymptotics for geometric statistics of clustering point processes in $\mathbb{R}^d$ were proven. While the multivariate central limit theorem (Theorem 1.5) and mixing central limit theorems (Theorems 1.6 and 1.7) are new, analogues of our Theorems 1.1 to 1.4 are present in Yogesh16 (). As with many other papers using the ‘clustering’ condition (see, for instance, Malyshev75 (); Martin80 (); Baryshnikov2005 (); Nazarov12 (); Bjorklund2017 ()), we also derive suitable bounds on mixed moments of the random field and then use the cumulant method to prove the central limit theorem. The idea of bounding mixed moments via the factorial moment expansion of Bartek95 (); Bartek97 () is borrowed from Yogesh16 (). However, the theorem statements and proofs here are simpler, exploiting the structure of discrete Cayley graphs, and also lead to applications with minimal assumptions (see Section 2.2). One distinct advantage of lattice point processes over their continuum versions is that usage of Palm theory can be avoided completely. Secondly, we are able to compare clustering with mixing conditions more directly on discrete Cayley graphs as well as furnish central limit theorems for quasilocal statistics of exponentially mixing random fields. Further, this paper generalizes Yogesh16 () in a different direction and, as mentioned in Remark 4, also indicates a way to unify the two frameworks.
Organization of the paper :
We shall first introduce all the preliminaries (Cayley graphs, random fields, quasilocal statistics and clustering spin models, mixing coefficients) in Section 1.1. After introducing all the necessary notions, we shall state all our results in Section 1.4 with a detailed discussion of our results in Remark 4. In Section 2, we shall give examples and applications of our results on clustering spin models. Examples of spin models satisfying our assumptions are mentioned in Section 2.1 and applications of our general results to random cubical complexes are described in detail in Section 2.2. All our proofs are in Section 3, including the crucial factorial moment expansion in Section 3.1. For the sake of completeness, we shall define quasiassociation and dependence, and describe the connections between these different notions of ‘asymptotic independence’ in Sections 4.1 and 4.2, respectively. Finally, we conclude with a void probability bound needed in our applications in Section 4.3.
1.1 Preliminary notions and notations
Cayley graphs :
We shall briefly define the necessary notions related to Cayley graphs here and we point the reader to the rich literature delaharpe2000 (); Benjamini2013 (); Roe2003 (); Meier2008 (); Lyons16 (); Pete2017 () for more details. Let be a countably infinite group with a finite symmetric set of generators i.e., iff , where denotes the inverse of in . By calling as generators, we assume that is the smallest subgroup containing . Further, denoting the identity by , we shall assume that The Cayley graph on is the graph whose vertex set is and is an edge if . In other words, for all and , By symmetry of , it is easy to see that iff and so is an undirected graph. Since and is the generator of , is a simple, connected, regular graph with vertex degree . For any , the group operation is called as translation of . We emphasize that we have not assumed abelianness of the group, thus need not be same as for all . We shall now make the blanket assumption that all our Cayley graphs are countably infinite but finitely generated. We shall call such Cayley graphs as discrete Cayley graphs in the rest of the article.
We shall use to denote the usual graph distance. Observe that which we shall also denote as . For a set , let denote its cardinality, denote the ball of radius centered at and Because the underlying graph is a Cayley graph, for all where denotes the graph isomorphism. Setting , there are two trivial bounds for growth of for discrete Cayley graphs. In particular, there exists and a constant such that for all ,
(1.2) 
The lower bound is a result of being a strictly increasing valued function (See (delaharpe2000, , Section VI.A.3)), and the upper bound follows by comparing with a regular tree of vertex degree . For a subset of vertices, , we define its inner vertex boundary as
Definition 1 (b-amenable Cayley graphs)
Let be a discrete Cayley graph as defined above. We say that a Cayley graph is b-amenable (amenable with respect to the sequence of balls) if
We say that a Cayley graph has polynomial growth if for some , we have that (here, we have used the Big O notation of Bachmann–Landau).
Polynomial growth is stronger than b-amenability, which in turn is stronger than amenability (see (delaharpe2000, , VII.34)). b-amenability is equivalent to the condition that . On a discrete Cayley graph, it is easy to note that and hence by Fekete’s subadditive lemma exists. Further, if (in particular, this includes discrete Cayley graphs with exponential growth) then is not amenable (see (delaharpe2000, , VII.34)). We discuss the use of the amenability assumption in Remark 4(8).
A large class of examples of discrete Cayley graphs with polynomial growth can be constructed via Gromov’s famous characterization of groups with polynomial growth (Gromov1981 () and see also (Pete2017, , Theorem 10.1)). We shall quickly mention this result. For two subgroups , define to be the commutator subgroup generated by all for . A group is said to be nilpotent if the descending series of subgroups (called the lower central series) defined by terminates in the trivial subgroup . With this quick definition, Gromov’s theorem can be stated as follows: A finitely generated group has polynomial growth if and only if it is virtually nilpotent (i.e., contains a nilpotent subgroup of finite index). From the definitions, it is easy to see that abelian groups are nilpotent (since ) and hence any group containing an abelian subgroup of finite index is virtually nilpotent. The class of abelian groups includes not only our motivating example of integer lattices but other lattices on the Euclidean plane as well. Further, for a virtually nilpotent group, it was shown in (Pansu1983, , Theorem 51) that there exists a (called the degree of polynomial growth) such that
(1.3) 
Without defining the group, we mention that discrete Heisenberg groups of all dimensions (i.e., over the ring , see (Pete2017, , Section 4.1)) is also an example of a nilpotent group and its degree of polynomial growth is . For more examples refer to the previous references, especially (Roe2003, , Chapter 3), (delaharpe2000, , Chapters VI and VII).
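The growth of balls in a Cayley graph can be explored numerically by breadth-first search from the identity. The sketch below is only an illustration under stated assumptions: the function `ball_sizes` and the chosen generating sets are hypothetical, edges join g to g·s (right multiplication is an assumed convention), and the Heisenberg group is taken to be the three-dimensional integer one, with elements encoded as triples (a, b, c) standing for upper unitriangular integer matrices.

```python
from collections import deque

def ball_sizes(identity, generators, multiply, max_radius):
    """Return [|B_0|, |B_1|, ..., |B_max_radius|] for balls around the identity.

    Vertices are group elements; g is joined to multiply(g, s) for every
    generator s in the (symmetric) generating set.
    """
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        if dist[g] == max_radius:
            continue  # do not explore beyond the requested radius
        for s in generators:
            h = multiply(g, s)
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    sizes = [0] * (max_radius + 1)
    for d in dist.values():
        sizes[d] += 1               # sphere sizes
    for r in range(1, max_radius + 1):
        sizes[r] += sizes[r - 1]    # cumulative sums give ball sizes
    return sizes

# Integer lattice Z^2 with generators +-e1, +-e2.
z2_gens = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def z2_mul(g, s):
    return (g[0] + s[0], g[1] + s[1])

# Integer Heisenberg group: (a,b,c)(a',b',c') = (a+a', b+b', c+c'+a*b').
heis_gens = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]

def heis_mul(g, s):
    return (g[0] + s[0], g[1] + s[1], g[2] + s[2] + g[0] * s[1])

z2 = ball_sizes((0, 0), z2_gens, z2_mul, 6)
heis = ball_sizes((0, 0, 0), heis_gens, heis_mul, 6)
```

For the lattice, the ball of radius n has 2n^2 + 2n + 1 points, which the search reproduces. The Heisenberg balls grow faster, in line with the higher degree of polynomial growth of that group, while still being far from the exponential growth of a regular tree.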
1.2 Random fields and score functions
Let , be a valued random field, where is a Polish space.
We say that the random field is stationary if for all . The configuration space of such a random field is .
We shall now introduce score functions which are defined on the configuration space. Specifically, a score function is a function . We shall assume all our score functions to be translation-invariant, i.e., for all , , where is translation by in the configuration space defined by for all . Next, for , set for which is another random field but indexed by . Also, for any , we set . We say that a score function is local if there exists such that for any it holds that
(1.4) 
for a score function . In statistical physics, the aforementioned local statistics are referred to as local observables (see (Schonmann1994, , Section 2)) or local function (see (Friedli2017, , Definition 3.11)).
More generally, for a fixed score function , not necessarily local, we define its radius of stabilization as follows: for any fixed and , define , then
(1.5) 
and . We shall adopt the convention that and . Trivially, .
Definition 2 (Stabilizing score function)
We say that is a stabilizing score function for if there is a constant and a function with as such that
We say that is a quasilocal score function on if is a stabilizing score function as in Definition 2 and
(1.6) 
for some . If is a local score function, then a.s., i.e., for all and hence it is trivially quasilocal. Our definition of quasilocal is a little more restrictive than that of (Friedli2017, , Lemma 6.21), where there is no assumption on the rate of decay.
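As an illustration of a score function whose range of dependence is random, consider the nearest neighbour distance of an occupied site. The sketch below rests on assumptions that are not part of the article: the function names are hypothetical, the lattice is replaced by an n x n discrete torus with its graph distance, and the nearest neighbour distance itself is used as a natural candidate for a radius of stabilization, since the configuration inside the ball of that radius already determines the score. For i.i.d. Bernoulli(p) spins on the infinite square lattice, the probability that no other occupied site lies within graph distance t of a given site is (1 - p) raised to the number of other sites in the ball, which is what `iid_tail` evaluates.

```python
from collections import deque

def nearest_neighbour_distance(spins, i, j):
    """Graph distance on the n x n torus from (i, j) to the nearest other
    occupied site, or None if there is no other occupied site."""
    n = len(spins)
    seen = {(i, j)}
    queue = deque([(i, j, 0)])
    while queue:
        a, b, d = queue.popleft()
        # Breadth-first search pops sites in nondecreasing distance, so the
        # first occupied site found (other than the start) is a nearest one.
        if d > 0 and spins[a][b] == 1:
            return d
        for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            c = ((a + da) % n, (b + db) % n)
            if c not in seen:
                seen.add(c)
                queue.append((c[0], c[1], d + 1))
    return None

def iid_tail(p, t):
    """P(no other occupied site within distance t) for i.i.d. Bernoulli(p)
    spins on the infinite square lattice; the ball has 2t^2 + 2t + 1 sites."""
    return (1.0 - p) ** (2 * t * t + 2 * t)
```

In this toy case the tail decays faster than exponentially in t, which is the kind of decay the quasilocal condition asks for. For dependent spin models, such a tail bound has to be established separately.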
As is to be expected, we shall need a suitable moment condition as well on the pair . We say that satisfies the moment condition if
(1.7) 
where we assume without loss of generality that is nondecreasing in .
Spin models:
A specific class of random fields when are called spin models. Let us denote as the space of spin configurations on . (Usually, lattice spin configurations are defined as elements of but this is trivially equivalent to our definition. Further, these spin configurations are also referred to more specifically as two-state spin configurations to emphasize that spins here can take only two values instead of multiple values.) Alternatively, we can think of as the space of simple point measures in by setting We can also identify with its support. We shall use either the measure-theoretic notation or the set-theoretic notation .
Definition 3 (‘Clustering’ Spin Models)
By a spin model or point process on , we refer to a valued random variable. Further, let be stationary (i.e., for all ) and nondegenerate (i.e., ). We say that a nondegenerate and stationary is a clustering spin model, if for all there exist constants and a fast decreasing function (i.e. , decreasing and for all ) such that for all distinct , we have that
(1.8) 
where Without loss of generality, we assume that is nondecreasing in and is nonincreasing in .
We shall say that is summable in if for all ,
(1.9) 
Trivially, compactly supported are summable on any discrete Cayley graph and fast decreasing are summable on any discrete Cayley graph with polynomial growth. If for some , then is clearly summable on any discrete Cayley graph because of the upper bound on in (1.2).
Remark 1
Though seemingly simple, clustering has various implications for spin models (see Lemma 1, for instance). Another important consequence is that if clusters, so does where we have identified with (see (4.3) for a proof). If are the clustering constants and the clustering function for , then are the clustering constants and the clustering function for .
Next, setting the joint density or correlation functions of as
where are assumed to be distinct and else .
A spin model is said to be exponentially clustering if is a clustering spin model as in Definition 3 with and for some , and the clustering function satisfies the growth condition (for convenience, we refer to stretched exponential or superexponential decay also as exponential)
(1.10) 
for some .
Since spin models are also random fields, we can define score functions as we did for random fields and all the related definitions carry forward for spin models as well with a few changes. For a spin model, we assume if . We say that satisfies the exponential growth condition if for some and for all and (with ), we have that
(1.11) 
Remark 2
Local score functions (say with a.s.) satisfy the growth condition (1.13) and moment condition (1.7). To see this, define
(1.12) 
where again denotes the restriction of the spin model to the ball and denotes the space of spin configurations on . Since is finite for all , we trivially have that for all . Thus by ‘locality’ of , we have that for any and implying the moment condition and the exponential growth condition trivially.
Remark 3
Let be a quasilocal statistic as in (1.6) and satisfy polynomial growth condition i.e., there exists such that for all and , we have that
(1.13) 
Then, using the exponential tail decay of the radius of stabilization along with the polynomial growth condition, we have that for all ,
Thus, we have shown that satisfies the moment condition (1.7) for all if it is quasilocal and satisfies the polynomial growth condition. As we shall see, our examples of quasilocal statistics satisfy this weaker condition of polynomial growth, thereby removing the need to prove the moment condition separately.
1.3 Mixing random fields :
While the notion of clustering suffices for capturing ‘asymptotic independence’ of spin models, we shall need stronger notions to do so for general random fields. Let be a stationary random field as in Section 1.2, and let for a subset . We shall often omit the reference to , when the underlying random field is clear. We define the following two mixing coefficients for subsets
where denotes all square-integrable, measurable functions. Further we define
A random field is said to be $\alpha$-mixing (resp. $\rho$-mixing) if (resp. ) as . It is known that the two mixing conditions are equivalent and in fact for any stationary random field, we have that
(1.14) 
The first inequality can be obtained easily by choosing appropriate indicator functions and for the second inequality, we refer the reader to (Bradley93, , Theorem 1) for the complete statement and proof. However, it is not difficult to observe that the arguments set forth in proving (1.14) can be used verbatim to conclude the same result even when the underlying parameter space is assumed to be a discrete Cayley graph as in this paper, with the understanding that generators play the same role as eigen bases of or .
Standard examples of mixing random fields include stationary Gaussian random fields on with spectral density bounded away from zero (see Section 2.1 (Doukhan, , Theorem 2)). In fact, one can obtain precise decay rates of the mixing coefficients of stationary Gaussian random field on if in addition to the condition on the spectral density stated above, the covariance of the random field decays appropriately as a function of the distance between random variables (see Section 2.1 (Doukhan, , Corollary 2)). We refer the reader to Bradley05 () for related discussion and many more examples of processes satisfying various mixing conditions.
1.4 Our results
As mentioned in the introduction, our main statistic of interest is the sum of score functions of a spin model, i.e., our interest lies in the asymptotics of
as . We shall now state our abstract theorems for and mention examples of scores and spin models in the next section. For distinct and , define the mixed moments of for as
(1.15) 
These play a crucial role in the proof analogous to that of moments for a usual random variable. The following theorem is the key step in our proofs of the weak law and the central limit theorem.
Theorem 1.1 (Clustering of mixed moments)
Let be a discrete Cayley graph and together with satisfy one of the following two conditions :
Then, the statistic for satisfies clustering in terms of mixed moments, i.e., there exist constants such that for all with and for , we have that
(1.16) 
for a fast decreasing function . Further, under Assumption 1 if is summable as in (1.9), so is .
We shall first state the weak law and then the central limit theorem, for which we recall that , where is a ball of radius in , defined earlier in Section 1.1.
Theorem 1.2 (Weak law for quasilocal statistics of ‘clustering’ spin models)
Let be a discrete Cayley graph and together with satisfy one of the following two conditions :
Then we have that as ,
and further
The following abstract central limit theorem is a straightforward adaptation of (Malyshev75, , Theorem 1) to general random fields (see also (Malyshev75, , Remark 1)). Since we are unable to find a general statement of the form below in the literature, we shall sketch its proof later.
Theorem 1.3 (CLT for ‘clustering’ random fields.)
Let be a sequence of random fields such that for all . Further, we assume that satisfy clustering of mixed moments, i.e., there exist constants such that for all with and for , we have that
(1.17) 
where is a fast-decreasing function and satisfies the summability condition as in (1.9). Set . Further, if for some , it holds that
then we have that as ,
Combining Theorems 1.1 and 1.3, we easily obtain the following result, which is very convenient in applications, as we shall see in Section 2.2.
Theorem 1.4 (CLT for quasilocal statistics of ‘clustering’ spin models)
Let be a discrete Cayley graph and together with satisfy one of the following two conditions :
Further if for some it holds that
then we have that as ,
In applications, one is also interested in a joint distributional limit of the vector where is the total mass corresponding to the score function . We shall combine Theorems 1.2 and 1.4 along with the Cramér-Wold theorem to derive the following multivariate central limit theorem.
Theorem 1.5 (Multivariate CLT for quasilocal statistics of ‘clustering’ spin models)
Let be a discrete Cayley graph and together with satisfy one of the following two conditions :
Set We have that as ,
where is the covariance matrix defined by
Though there is no variance lower bound assumption in our multivariate CLT, it is implicit in the result. Note that the limiting Gaussian vector is nondegenerate iff for some .
We shall now move to the next class of results, which are based on a different set of assumptions involving mixing coefficients. Broadly, the results are the same as those stated above in the case of clustering random fields, but with a little leeway allowing more general random fields . The interest again lies in the asymptotic distribution of appropriately normalized and scaled sums of
We note here that when the context is clear, we often omit reference to the underlying random field.
Theorem 1.6 (CLT for local statistics of mixing random fields)
Let be a discrete Cayley graph, and be a stationary, mixing random field defined on . Then, writing for a local statistic as defined in (1.4), we have as ,
where
Next, we shall state the analogous limit theorem for the sums of quasilocal score functions evaluated on a random field satisfying certain assumptions on the rate of decay of the mixing coefficients.
Theorem 1.7 (CLT for quasilocal statistics of exponential mixing random fields)
A multivariate version of the above theorem can also be concluded using precisely the same set of arguments as put forth in the proof of Theorem 1.5.
Remark 4 (Remarks on our results and future directions)

Before comparing our results specifically with CLTs in the literature on mixing, dependence or ergodic theory, we wish to comment on the general similarities and differences between our results and those. Firstly, our CLT does not require volume-order variance growth, unlike the CLTs available in the aforementioned literature. Secondly, as mentioned in the introduction, we provide what we believe are easy-to-use geometric conditions on and mixing/clustering conditions on for CLTs to hold. All the CLTs, including ours, require a nontrivial variance lower bound assumption for the limit to be nondegenerate.

Comparison with CLTs under mixing conditions : CLTs under mixing conditions are known for linear statistics (Peligrad98 ()), and here, with suitable additional assumptions, we have extended them to local and quasilocal statistics in Theorems 1.6 and 1.7. For spin models, however, clustering is a simpler condition to check than mixing, as attested by the many spin models (see Section 2) that satisfy the clustering condition.

Comparison with CLTs under dependence : Again, CLTs under dependence are proven for linear statistics (Bulinski12 ()) and many spin models do satisfy this condition. But as mentioned before, it is far from clear whether local or quasilocal statistics of dependent random fields are dependent. This problem arises mainly due to the specific structure of the covariance decay required in the condition.

Comparison with CLTs on general spaces : Though the underlying spaces considered in Derriennic2006 (); Bjorklund2017 () are far more general than ours, the exponential mixing condition assumed on the random field (see (Bjorklund2017, , Definition 2.1)) is similar to our clustering of mixed moments as in (3.32), and the condition (Bjorklund2017, , (1.7)) plays the role of our summability condition (1.9). Further, for (strictly) exponential clustering (i.e., in (1.10)) and Cayley graphs with subexponential growth, our summability condition (1.9) holds and so does (Bjorklund2017, , (1.7)) (see (Bjorklund2017, , Section 3)). However, we also allow for subexponential clustering (see (1.10)), provided it is suitably fast-decreasing depending on the score function and the growth of the Cayley graph.

Normal approximation : While our focus has only been on central limit theorems, it is not uncommon to ask for rates of convergence in central limit theorems. The well-known Stein’s method has often been used to derive such rates. For example, rates of normal convergence for linear statistics (and also some local and global statistics) for the Ising model and some other specific particle systems have been derived recently in Goldstein2016 (). For some models, the clustering property of spin models plays a crucial role. Goldstein2016 () exploits the positive association property of spin models and hence applies to increasing statistics of spin models, but it is not clear if it applies to clustering spin models or nonlinear statistics like in our CLTs. Another way to obtain rates of normal convergence is by obtaining suitable bounds on the growth of cumulants (see (Grote16, , Lemma 4.2) and Saulis1991 ()). This method of normal approximation necessitates a more precise quantification of our cumulant bounds. Further, (Grote16, , Lemma 4.2) also gives cumulant bounds needed for moderate deviations. This would be a worthwhile direction to pursue in the future.

Cumulant Bounds : Another use of cumulant bounds to derive CLT is in Feray2016a (); Feray2016 () where such bounds are crucially used in the weighted dependency graph method to prove CLTs. In Feray2016a (), CLTs for local and some global statistics of the Ising model (see Section 2.1.2) are proved using bounds on cumulants and a generalization of the dependency graph method. Again, it requires restrictions on regimes for the Ising model and variance lower bounds but it is not obvious if the methods can be adapted to other similar spin models.

Scaling limits : Suppose that , the integer lattice with the generators of the group being with denoting the distance. The corresponding Cayley graph on is nothing but
(1.18) 
Then it is possible to consider a suitably scaled version of the random field and study its scaling limit. Two possible choices for scaling are either to consider the random field
or the random field
where is the coordinatewise ordering of points in . The former scaling is more in the spirit of the scaling considered in (Yogesh16, , (1.3)) and the latter is in the spirit of the scaling considered in (Malyshev75, , Theorem 2). In both cases, the limit is expected to be a Gaussian random field with a white-noise-like structure, i.e., the covariance matrix of all finite dimensional marginals converging to a diagonal matrix (see (Yogesh16, , (1.21)) and (Malyshev75, , Theorem 2)).
We are not aware of any examples of amenable Cayley graphs exhibiting non-polynomial growth. However, our proof methods for amenable graphs should allow our results to be proven under the assumption that grows subexponentially, i.e., . For such groups, there exists a subsequence as such that as . Thus, if we take sums over for the subsequence chosen as above instead of taking sums over in (1.1), one would expect our results to hold under such asymptotics as well.

We have restricted ourselves to a class of amenable Cayley graphs but it is natural to ask whether one can consider a more general class of graphs. Deriving the motivation from various probabilistic studies (see Aldous2007 (); Benjamini2013 (); Lyons16 (); Pete2017 ()), two possible classes of graphs that are suitable to such a study are unimodular random graphs and vertex transitive graphs. To further emphasize the need for such a study, even graphs on stationary point processes as studied in Yogesh16 () can be considered as unimodular random graphs (see (Baccelli2016, , Section 5)). Thus a study on unimodular random graphs can unify the framework in this article and that of Yogesh16 () apart from considerably extending the scope of applications.

Variance lower bounds : Another persistent but unavoidable assumption not only in our CLTs but in various such generic CLTs in the literature (including those cited here) is the variance lower bound condition. Such lower bounds are usually shown by ad hoc methods. Primarily, for sums of stationary sequences, variance lower bounds can be obtained under conditions involving the spectral density of the random variables (see Theorem 2 in Chapter 1.5 of Doukhan ()). Alternatively, under stationarity and the strong mixing condition, together with an appropriate summability condition on the covariance, the necessary and sufficient condition for meaningful variance lower bounds of partial sums is that the variance must grow to infinity (see (Bradley97, , Lemma 1), or (Peligrad98, , Theorem 2.1)). However, for sums of nonstationary sequences of random variables, the condition provides suitable variance lower bounds (see (Bradley15, , Theorem 2.2)). In this context, a very simple and natural question follows: for an mixing random field satisfying , writing as the mixing coefficient of the field , is it possible to conclude , even for a local statistic ?
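The growth behaviour of balls referred to in the remark above can be checked numerically in the simplest case. The following is a minimal sketch, assuming the lattice Z^2 with its four standard generators and balls taken in the word metric; the names `ball_sizes` and `shell_ratios` are hypothetical and do not come from the paper. It computes ball sizes by breadth-first search on the Cayley graph, together with the ratio of the outer shell to the ball, which tends to zero under polynomial growth.

```python
from collections import deque

# Assumption for illustration: Z^2 with the standard generators (+-e1, +-e2).
GENERATORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def ball_sizes(max_radius, generators=GENERATORS):
    """Return the sizes of the word-metric balls of radius 0..max_radius around the identity."""
    origin = (0, 0)
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        x = queue.popleft()
        if dist[x] == max_radius:
            continue  # do not expand beyond the largest radius
        for g in generators:
            y = (x[0] + g[0], x[1] + g[1])
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    # Count the points on each sphere, then accumulate to get ball sizes.
    sphere = [0] * (max_radius + 1)
    for d in dist.values():
        sphere[d] += 1
    sizes, total = [], 0
    for s in sphere:
        total += s
        sizes.append(total)
    return sizes


def shell_ratios(sizes):
    """Size of the shell between radius n and n+1, divided by the size of the ball of radius n."""
    return [(sizes[n + 1] - sizes[n]) / sizes[n] for n in range(len(sizes) - 1)]


sizes = ball_sizes(50)
ratios = shell_ratios(sizes)
print(sizes[:4])   # first few ball sizes
print(ratios[-1])  # shell-to-ball ratio at the largest radius
```

For this choice of generators the ball of radius n has 2n^2 + 2n + 1 points, so the ratio decays like 2/n. Other generating sets or groups can be tried by changing `GENERATORS` and the group operation.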
2 Examples and applications
In this section, we illustrate our main theorem (Theorem 1.4) by providing examples of quasilocal statistics and clustering spin models. Though we shall mainly focus on a variety of applications to random cubical complexes, we shall also hint at others. Also, we shall not mention applications of our mixing CLTs (Theorems 1.6 and 1.7) but we hope the discussion in the introduction and our applications for clustering spin models will convince the reader that such results are feasible as well.
2.1 Examples of clustering spin models
The simplest example of a clustering spin model is one with i.i.d. spins. The clustering property, which captures asymptotic independence in a strong way, is a natural condition that is expected to hold in models from statistical physics with weak dependence. Without delving into the details, we shall mention a few illustrative examples in this part of the section, and specifically restrict our attention to spin models on the lattice , unless mentioned otherwise.
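For the i.i.d. case, asymptotic normality of a local statistic can be illustrated by simulation. The following is a minimal sketch under several assumptions not taken from the paper: spins take values in {0, 1}, the window is a square box in Z^2 rather than a ball, the local score at a site is the indicator that the site is an isolated occupied site, and the centring and scaling use the empirical mean and standard deviation. All function names are hypothetical.

```python
import math
import random


def isolated_count(spins):
    """Sum of local scores: sites with spin 1 whose nearest neighbours inside the box all have spin 0."""
    rows, cols = len(spins), len(spins[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            if spins[i][j] != 1:
                continue
            isolated = True
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                a, b = i + di, j + dj
                if 0 <= a < rows and 0 <= b < cols and spins[a][b] == 1:
                    isolated = False
                    break
            total += int(isolated)
    return total


def sample_statistic(side, p, rng):
    """Draw i.i.d. Bernoulli(p) spins on a side x side box and evaluate the statistic."""
    spins = [[1 if rng.random() < p else 0 for _ in range(side)] for _ in range(side)]
    return isolated_count(spins)


def standardized_samples(side=30, p=0.5, reps=300, seed=0):
    """Independent copies of the statistic, centred and scaled by their empirical mean and std."""
    rng = random.Random(seed)
    values = [sample_statistic(side, p, rng) for _ in range(reps)]
    mean = sum(values) / reps
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (reps - 1))
    return [(v - mean) / std for v in values]


z = standardized_samples()
within_one = sum(abs(v) <= 1.0 for v in z) / len(z)
print(f"fraction within one standard deviation: {within_one:.3f} (standard normal: 0.683)")
```

The printed fraction should be roughly comparable to the standard normal value, up to sampling error and the discreteness of the statistic. This is only a sanity check and not a substitute for the variance lower bound required in Theorem 1.4.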
2.1.1 Level sets of Gaussian fields
Let be a stationary Gaussian random field whose covariance kernel is exponentially decaying, i.e., is such that . Further, for simplicity, assume that is a function of alone. The superlevel sets at level of this Gaussian field, defined as , form a spin model. To show clustering of , we shall use the following total variation distance bound between Gaussian random vectors from Beffara16 ().
We recall that the total variation distance between two probability measures and on a sigma-algebra is given by
Theorem 2.1 (Theorem 4.3 in Beffara16 ())
Let and be random Gaussian vectors (not necessarily centered) with covariances and . Let and be the laws of the vectors whose entries are or , according to whether the corresponding entries in the vectors and are positive or not; moreover, and has on the diagonal. Then
Remark 5
Theorem 2.1 of Beffara and Gayet was originally stated for centered Gaussian vectors in Beffara16 (). By using the trivial bound for any Gaussian random variable with arbitrary mean, wherever necessary, in the proof of Theorem 2.1, it trivially extends to any noncentered Gaussian vector. Hence, Theorem 2.1 is true for arbitrary level sets of a Gaussian vector.
Thus trivially from Theorem 2.1, and the corresponding definitions, we have the following corollary.
Corollary 1
Let be a stationary Gaussian field, on a discrete Cayley graph , with having unit variance. The level sets satisfy exponential clustering with clustering constants and clustering function , where .
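The decay of correlations of level sets can be illustrated numerically in one dimension. The following is a minimal sketch, assuming a stationary Gaussian AR(1) sequence on the integers with unit variance and covariance rho^|i-j| (an exponentially decaying kernel that depends on the distance alone), thresholded at level 0; the model choice and all function names are hypothetical illustrations, not the paper's construction. It checks only two-point covariances of the spins, which is a special case of clustering, and compares them with the classical orthant formula arcsin(r)/(2*pi) for a standard bivariate Gaussian with correlation r.

```python
import math
import random


def ar1_field(length, rho, rng):
    """Stationary Gaussian AR(1) sequence with unit variance and Cov(X_i, X_j) = rho**|i-j|."""
    x = rng.gauss(0.0, 1.0)
    field = [x]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(length - 1):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        field.append(x)
    return field


def level_set_spins(field, level=0.0):
    """Spin is 1 where the field is at or above the level, 0 otherwise."""
    return [1 if x >= level else 0 for x in field]


def spin_covariance(spins, lag):
    """Empirical covariance between spins at sites i and i + lag."""
    n = len(spins) - lag
    mean_a = sum(spins[:n]) / n
    mean_b = sum(spins[lag:]) / n
    mean_ab = sum(spins[i] * spins[i + lag] for i in range(n)) / n
    return mean_ab - mean_a * mean_b


rng = random.Random(1)
rho = 0.5
spins = level_set_spins(ar1_field(200_000, rho, rng))
for lag in (1, 2, 4, 8):
    exact = math.asin(rho ** lag) / (2 * math.pi)  # valid for level 0 only
    print(lag, round(spin_covariance(spins, lag), 4), round(exact, 4))
```

The empirical covariances should track the exact values and decay geometrically in the lag, in line with the exponential decay of the underlying Gaussian covariance.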
Massive Gaussian free field :
The massive Gaussian free field on the lattice is defined as the Gaussian field whose covariance kernel is