Meta-universality classes at criticality

# Meta-universality classes at criticality

Duncan A.J. Blythe111Berlin Institute of Technology, Berlin, Germany, Bernstein Centre for Computational Neuroscience, Berlin, Germany, African Institute of Mathematical Sciences, Bagamoyo, Tanzania, Vadim V. Nikulin222Charité Medical University, Berlin, Germany, Bernstein Centre for Computational Neuroscience, Berlin, Germany, National Research University, Higher School of Economics, Moscow, Russia
###### Abstract

We show theoretically that the hypothesis of criticality as a theory of long-range fluctuation in the human brain may be distinguished from competing explanations on the basis of macroscopic neuronal signals such as the EEG, using novel theory of narrowband amplitude time-series at criticality. Our theory predicts the division of critical activity into meta-universality classes. As a consequence we provide strong evidence for criticality in the human brain on the basis of the EEG.

Testing for the presence of critical power-law avalanche dynamics (CPLAD) is pivotal in establishing that a dynamical system is poised close to a critical point. Such an analysis assumes that periods of activity are separated from periods of inactivity. For the intact brain of awake human subjects, which has been hypothesized to be critical [1], the assumption of separation between activity and inactivity is never met, since the system is continuously active. This lack of separation makes unclear whether the results of existing experiments confirming criticality in animals in vitro [2, 3] and in vivo [4, 5] generalize to the intact brain of awake human subjects; in particular it has been argued that the scale-free form of the power-spectra of LFP, EEG and MEG data is due not to CPLAD but to passive filtering (PF) through the extracellular media [6, 7]. While several papers have claimed to test the hypothesis of criticality from large scale-brain signals of awake human subjects [8] the authors only find CPLAD over short time-scales ( second) since the continuous nature of the data prevented testing of criticality as a theory of neuronal fluctuations over longer time-scales. Quantifying potential CPLAD from such non-invasive human recordings is challenging since a superposition of overlapping avalanches may lead to incorrect estimation of size and duration (critical) exponents (this is related to the under sampling problem [5]) or indeed of whether CPLAD are present at all. Interestingly, in this letter we show that in order to quantify and distinguish CPLAD from PF, it is not necessary to define individual avalanches; instead we show that is is sufficient to analyse the properties of continuous neuronal recordings. This approach allows us to test the hypothesis of criticality as a theory of neuronal variability over large time-scales.

We use the facts that periods of activity of local neural networks are separated by periods of inactivity, and macroscopic brain signals, such as the LFP, EEG and MEG, measure the linear superposition of activity of numerous local neural networks [9]; this means that the macroscopic brain signal is continuous, although the activity of local networks is not. Thus:

 X′(t)=T∑s=1qs∑i=1hs,ias,i(t−sLs,i) (1)

denotes activity at time of the local neural network which begins at time , after a period of inactivity and lasts for time steps. We adopt the conventions that is normalized to unit standard deviation and is for . Thus, denotes the average height of the time-course of activity of the local neural network which begins at time .

PF and CPLAD differ on how and are distributed. According to the PF hypothesis their distributions decay faster than power-laws. For simplicity, therefore, we summarize the PF as claiming exponential distributions: and . Moreover the PF hypothesis states power-spectra of macroscopic brain signals are scale-free because they reflect a filtered version of :

 X(t)=∞∑u=0F(u)X′(t−u) (2)

is a linear filter due to the extracellular media which yields a signal with power-law power spectrum from white noise input [6, 7].

On the other hand, the CPLAD hypothesis states that macroscopic signals reflect directly so that and that we have power law distributions for and : and . These power-law distributions explain the power-law form of the power-spectrum [10]. In addition the CPLAD hypothesis states that each is an independent and identical sample from a single stochastic process, which we call [11].

We now present theory which makes predictions for the CPLAD hypothesis which provably do not hold for the PF hypothesis. As well as considering the raw signal X(t) we also consider the amplitude of a narrowband filtered version of , which we denote . Thus let be a linear narrowband filter with pass band , and the Hilbert transform then:

 gω(X(t))=|fω(X(t))+iH(fω(X(t)))|

The theory depends on two measures of long-term variation evaluated on and : the Hurst exponent and the detrended cross correlation coefficient. The Hurst exponent , of a stationary process may be defined by the scaling of the auto covariance function. For , is said to be long-range temporally correlated (LRTC):

 E(Y(t+s)Y(t))−E(Y(t))2∼s2H−2

For auto covariances decaying faster than , one defines and is not LRTC. From now on we distinguish the Hurst exponents of and as and . Typically in neuroscientific applications Hurst exponents are measured with Detrended Fluctuation Analysis (DFA) [12].

The detrended cross correlation coefficient [13] is a measure of correlation between two time-series and at a time-scale , which is invariant to non-stationary trends of a fixed polynomial degree. Let be the detrended variance of where each window of size is detrended, as computed for DFA and , by analogy, the detrended covariance, as computed for DCCA [14]; then in analogy to the Pearson correlation coefficient, one defines:

 ρDCCA(n,Y1,Y2)=F2Y1,Y2(n)√F2Y1,Y1(n)F2Y2,Y2(n)

Our first result is that for the of the PF hypothesis, Equation (5), and, when , as . This can be seen by splitting Equation (B) into activity which lasts longer than some value and activity which has duration shorter than or equal to :

 X′(t)=∑Ls,i≤L′hs,ias,i(t−sLs,i)+∑Ls,i>L′hs,ias,i(t−sLs,i) (3)

Since the distribution of L decays exponentially, the vari- ance of the left hand term dominates. Therefore:

 fω(X′(t))∼∑Ls,i≤L′hs,ifω(as,i(t−sLs,i))

Since time-points in this expression spaced more than points apart are independent we also have that time-points of spaced more than points apart are independent, so that and as (same asymptotic properties as white noise); the reweighting of frequency induced by the passive filtering does not change the narrowband properties. See Proposition 2.1 of the supplement.

We now consider the CPLAD hypothesis. The behaviours we derive for and depend on the exponents and . We find that there are four regions of parameters with qualitatively differing behaviours which we term meta-universality classes. We assume that all power-law distributions are cut off at a lifetime which is proportional to the size of the system and, for simplicity, that at each time-point a fixed number of avalanches begin.

(MU1)

For the number of avalanches active at time is:

 Lc∑s=0#{Lt−s,i|Lt−s,i>s} ∼∫Lc0q(∫LcsL−αdL) ∼L2−αc

This implies the number of avalanches active at any given time is unbounded in the system size. Applying the Central Limit Theorem, this implies that is a Gaussian process with power-law autocorrelation (see supplement for details). Since Gaussian processes are uniquely defined by their second order properties and may be generated by filtering white noise [15], then in analogy to the results for the PF hypothesis and as . See Proposition 2.4.

(MU2) and

For , the probability that an avalanche is active with duration greater than is:

 1T∑Ls,I>L′Ls,i∼L′2−α

Thus the probability that large avalanches occur simultaneously is negligible. Moreover we have, for an exponent , a scaling relation for large :

 fω(Lβa(t/L))∼Lβ′fω(a)(t/L)

Here is understood as position of filtered in the narrowband around . We are able to derive this exponent theoretically (Proposition 2.7) and find that . This implies that for (but not otherwise) the variance of the large narrowband filtered avalanches dominates so that:

 fω(X(t))∼∑Ls,i>L′Lβ/2s,ifω(a)(t−sLs,i)

Since the avalanches on the right hand side of this relation do not overlap then we have that:

 gω(X(t))∼∑Ls,i>L′Lβ/2s,igω(a)(t−sLs,i)

Given this representation of the amplitude as a simple sum of amplitudes of individual avalanches, standard techniques may then be applied to approximating the Hurst exponent [10], and we find:

 Hωamp =β/2−α/2+2 >1/2

Similarly we find:

 Hωraw =β−α/2+2 >1/2

Moreover, assuming that the integrals exist, then the separation of large avalanches make it simple to derive that as . See Propositions 2.8 and 2.9.

(MU3) and

For frequencies with and :

 fω(Lβ)∼Lβfω(a)(t/L)

This is because relative to the time-scale of the filter, may be treated as a delta function, which weights all frequencies equally. Therefore applying a similar argument as for MU2 we find that for , , as . Moreover, applying the results for MU2 we have:

 Hωamp =1/2 Hωraw =β−α/2+2 >1/2

See Proposition 2.10.

(MU4) and

Since these universality classes have the shortest tails in their critical distributions, we may apply identical methods as applied to the PF hypothesis which show that and as when . See Proposition 2.11.

The results of the theory for the CPLAD model are summarized in Figure 1.

We now present tests of these predictions in simulations. We first tested the predictions for the PF hypothesis by generating a LRTC process by linear filtering of white noise. The results are displayed in the bottom panel of Figure 2 and Figure 2 of the supplement. The results confirm that and .

We then tested the predictions for the CPLAD hypothesis, modeling the activity of local networks by:

 a(t)=b(t)+c(t)ϵ(t)

is the average avalanche shape, the shape in the variance profile and a colored noise with the spectrum known for a critical system [16] (see supplement for details). Two examples in MU2 and MU1 are displayed in the top two rows of Figure 2.

The results of a simulation for all critical parameters are displayed in Figure 3. We find good agreement be- tween the meta-universality classes derived and the simulation results. (See supplement sec. 3.3 for details.)

Finally we tested the PF and CPLAD hypotheses by estimating , and on EEG time-series (see supplement for preprocessing). Important is that we analyse 3 frequency ranges without oscillations (no local maximum in power-spectrum); the aim was to restrict analysis to activity corresponding to the shape of the power-spectra. Given that the data were sampled at 200Hz, and that lower frequencies require far larger window sizes for analysis, we chose 3 frequencies above the beta range, taking care to exclude the 50Hz line noise.

The results of our analysis over all 7 subjects are displayed in Figure 4. For 244 of 261 signals, the DFA estimate of was higher than , clearly indicating the presence of LRTC (median = 0.61, 5th and 95th percentiles 0.49 and 0.85, p ¡ 0.0001). Likewise, 238 of 261 DCCA correlation coefficients at the highest scale between frequency ranges were positive (). We found moreover that the values at the highest scale and measured from the same neural data were highly correlated () and values in distinct frequency ranges were highly correlated () (likewise for ). Finally we found that the and values were not significantly correlated with (). These results confirm the CPLAD hypothesis, in particular in agreement with predictions for MU2.

Thus these findings provide strong evidence for criticality in the human brain by confirming the hypothesis of CPLAD. These results cannot be explained by the PF hypothesis; passive filtering of neural signals may generate a power-law spectrum but will not induce LRTC of narrowband amplitudes or DCCA correlations between narrowband amplitudes. Moreover, we note that the failure to detect avalanches in the experiments of [6, 5] may be explained by criticality in the first meta-universality class where we have shown that even the largest avalanches are not discernible. Thus it is possible in these experiments that the systems under study were critical despite failure to demonstrate CPLAD directly.

We thank Daniel Bartz and Klaus-Robert Müller for critical readings of the manuscript. Financial support; DAJB: DFG, GRK 1589/1 ÓSensory Computation in Neural SystemsÓ; VVN partial support: BCCN, Berlin (BCCN 01GQ 1001C) and the National Research University, Higher School of Economics.

## References

• [1] J. M. Beggs, “The criticality hypothesis: how local cortical networks might optimize information processing,” Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 366, no. 1864, pp. 329–343, 2008.
• [2] J. M. Beggs and D. Plenz, “Neuronal avalanches in neocortical circuits,” The Journal of neuroscience, vol. 23, no. 35, pp. 11167–11177, 2003.
• [3] N. Friedman, S. Ito, B. A. Brinkman, M. Shimono, R. L. DeVille, K. A. Dahmen, J. M. Beggs, and T. C. Butler, “Universal critical dynamics in high resolution neuronal avalanche data,” Physical Review Letters, vol. 108, no. 20, p. 208102, 2012.
• [4] T. Petermann, T. C. Thiagarajan, M. A. Lebedev, M. A. Nicolelis, D. R. Chialvo, and D. Plenz, “Spontaneous cortical activity in awake monkeys composed of neuronal avalanches,” Proceedings of the National Academy of Sciences, vol. 106, no. 37, pp. 15921–15926, 2009.
• [5] T. L. Ribeiro, M. Copelli, F. Caixeta, H. Belchior, D. R. Chialvo, M. A. Nicolelis, and S. Ribeiro, “Spike avalanches exhibit universal dynamics across the sleep-wake cycle,” PloS one, vol. 5, no. 11, p. e14129, 2010.
• [6] C. Bedard, H. Kroeger, and A. Destexhe, “Does the frequency scaling of brain signals reflect self-organized critical states?,” Physical Review Letters, vol. 97, no. 11, p. 118102, 2006.
• [7] C. Bédard and A. Destexhe, “Macroscopic models of local field potentials and the apparent noise in brain activity,” Biophysical Journal, vol. 96, no. 7, pp. 2589–2603, 2009.
• [8] O. Shriki, J. Alstott, F. Carver, T. Holroyd, R. N. Henson, M. L. Smith, R. Coppola, E. Bullmore, and D. Plenz, “Neuronal avalanches in the resting meg of the human brain,” The journal of Neuroscience, vol. 33, no. 16, pp. 7079–7090, 2013.
• [9] D. J. Griffiths and R. College, Introduction to electrodynamics, vol. 3. Prentice Hall Upper Saddle River, NJ, 1999.
• [10] H. J. Jensen, K. Christensen, and H. C. Fogedby, “1/f noise, distribution of lifetimes, and a pile of sand,” Physical Review B, vol. 40, no. 10, p. 7425, 1989.
• [11] A. P. Mehta, A. C. Mills, K. A. Dahmen, and J. P. Sethna, “Universal pulse shape scaling function and exponents: Critical test for avalanche models applied to barkhausen noise,” Physical Review E, vol. 65, no. 4, p. 046139, 2002.
• [12] C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, and A. L. Goldberger, “Mosaic organization of dna nucleotides,” Physical Review E, vol. 49, no. 2, p. 1685, 1994.
• [13] G. Zebende, P. Da Silva, and A. Machado Filho, “Study of cross-correlation in a self-affine time series of taxi accidents,” Physica A: Statistical Mechanics and its Applications, vol. 390, no. 9, pp. 1677–1683, 2011.
• [14] B. Podobnik and H. E. Stanley, “Detrended cross-correlation analysis: a new method for analyzing two nonstationary time series,” Physical Review Letters, vol. 100, no. 8, p. 084102, 2008.
• [15] P.-O. Amblard, J.-F. Coeurjolly, F. Lavancier, and A. Philippe, “Basic properties of the multivariate fractional brownian motion,” in Séminaires et Congres, vol. 28, pp. 65–87, 2013.
• [16] M. C. Kuntz and J. P. Sethna, “Noise in disordered systems: The power spectrum and dynamic exponents in avalanche models,” Physical Review B, vol. 62, no. 17, p. 11699, 2000.
• [17] R. Weron, “Estimating long-range dependence: finite sample properties and confidence intervals,” Physica A: Statistical Mechanics and its Applications, vol. 312, no. 1, pp. 285–299, 2002.
• [18] P. Abry, D. Veitch, and P. Flandrin, “Long-range dependence: Revisiting aggregation with wavelets,” Journal of Time Series Analysis, vol. 19, no. 3, pp. 253–266, 1998.
• [19] L. Giraitis, P. M. Robinson, and A. Samarov, “Adaptive semiparametric estimation of the memory parameter,” Journal of Multivariate Analysis, vol. 72, no. 2, pp. 183–207, 2000.
• [20] P. M. Robinson, “Gaussian semiparametric estimation of long range dependence,” The Annals of statistics, pp. 1630–1661, 1995.
• [21] W. Rea, L. Oxley, M. Reale, and J. Brown, “Estimators for long range dependence: an empirical study,” arXiv preprint arXiv:0901.0762, 2009.
• [22] P. Billingsley, Probability and Measure. Wiley Series in Probability and Mathematical Statistics. Wiley New York, 1995.
• [23] M. S. Taqqu, “Weak convergence to fractional brownian motion and to the rosenblatt process,” Probability Theory and Related Fields, vol. 31, no. 4, pp. 287–302, 1975.
• [24] J. W. Kantelhardt, E. Koscielny-Bunde, H. H. Rego, S. Havlin, and A. Bunde, “Detecting long-range correlations with detrended fluctuation analysis,” Physica A: Statistical Mechanics and its Applications, vol. 295, no. 3, pp. 441–454, 2001.
• [25] J. P. Sethna, K. A. Dahmen, and C. R. Myers, “Crackling noise,” Nature, vol. 410, no. 6825, pp. 242–250, 2001.
• [26] N. J. Kasdin, “Discrete simulation of colored noise and stochastic processes and 1/ power law noise generation,” Proceedings of the IEEE, vol. 83, no. 5, pp. 802–827, 1995.
• [27] V. V. Nikulin, G. Nolte, and G. Curio, “A novel method for reliable and fast extraction of neuronal eeg/meg oscillations on the basis of spatio-spectral decomposition,” NeuroImage, vol. 55, no. 4, pp. 1528–1535, 2011.
• [28] R. Srinivasan, P. Nunez, D. Tucker, R. Silberstein, and P. Cadusch, “Spatial sampling and filtering of EEG with spline laplacians to estimate cortical potentials,” Brain topography, vol. 8, no. 4, pp. 355–366, 1996.

## Appendix A Estimators

### a.1 Detrended Fluctuation Analysis

Detrended Fluctuation Analysis (DFA) [12] is a methodology for the estimation of the Hurst exponent of a (possibly non-stationary) time-series. Its advantage over covariance analysis or analysis of the power-spectrum are its robustness to trends contaminating the empirical time-series and its desirable convergence properties [17].

The steps involved in DFA are as follows. First one forms the aggregate sum of the empirical time-series :

 x(t)=t∑i=1X(i)

(From now on whenever we refer to we mean the time-series obtained from by way of this operation.) Analysis of the fluctuations in may then be performed by measuring the variance of in windows of varying size after detrending, i.e., is split into windows of length , and the average variance after detrending the data of in these windows is formed; i.e. let be the operator which generates the mean-squares estimate of the polynomial fit of degree , then the DFA coefficients or detrended variances of degree are:

 F2DFA(n)=1⌊N/n⌋⋅n∑j(x(j)n−Pd(x(j)n))⊤(x(j)n−Pd(x(j)n))

Crucially, in the limit of data the slope of against converges to . Thus is power-law correlated LRTC if and only if the estimate of , , converges to a number greater than in the limit of data. We note here that there are numerous methods for the Hurst exponent; these include wavelet estimators [18], log-periodogram based methods [19], among others [20]. We chose DFA since it is standard in the physics and neuroimaging literature, and yields competitive estimates [21].

### a.2 Detrended Cross-Correlation Analysis

In precise analogy to DFA, Podobnik and Stanley propose Detrended Cross-Correlation Analysis (DCCA) [14], an extension of DFA to two time-series, by considering the detrended covariance:

Thus DCCA generalizes DFA in the sense that if then . To simplify the fact that we will need to consider the DFA coefficients of and simultaneously, we will also refer to the DCCA coefficients as:

 F2X1,X2(n)=F2DCCA(n)

Thus, DCCA quantifies the behaviour of the covariance between and over a range of time-scales given by . In analogy to the Pearson correlation coefficient, we may consider the detrended cross correlation coeffcient [13]:

 ρDCCA(n,X1,X2)=F2X1,X2(n)√F2X1,X1(n)√F2X2,X2(n)

quantifies the correlation between and over a range of time-scales. Note that while the machinery involved in the estimation of are more complex than for Pearson correlation, both coefficients estimate the same quantity for stationary time-series, not contaminated by trends. Thus generalizes the Pearson correlation coefficient. Applicability to non-stationary time-series is particularly important for the neural data analysis. Whenever the context allows for no ambiguity we abbreviate to .

## Appendix B Theory of the Avalanche Process

We derive here properties which hold for the CPLAD model but not for the PF model. The theory focuses on the Hurst exponent of which we denote , the Hurst exponent of , which we denote and the DCCA correlation coefficient . Throughout we assume that all power-law distributions are cutoff at a lifetime , that is fixed and deterministic for all and that is greater than to ensure stationarity.

First we consider the PF model.

###### Proposition B.1.

For the PF model for and .

###### Proof.

For the passive filtering model, the right hand term of Equation (3) of the main paper dominates. That is:

 Z(t)=∑Ls,i≤L′hs,ias,i(t−sLs,i)

Time points of greater than apart are independent so .
Moreover since the numerator of the DCCA correlation coefficient is given by the DCCA coefficients . These behave like the coefficients of white noise for large window sizes and therefore have expectation 0 in the limit. ∎

We now turn to the CPLAD model, to which all of the following propositions apply. We first aim at understanding the scaling of the number of active large avalanches:

###### Proposition B.2.

The number of active avalanches is asymptotically for large and and of constant order for .

###### Proof.

The number of active avalanches at time is equal to:

 Lc∑t′=0#{Lt−t′,i|Lt−t′,i>t′} ∼∫Lc0q(∫Lct′L−αdL)dt′ ∼(1−α)(Lc)1−α−(1−α)∫Lc0(t′)1−αdt′ =(1−α)(Lc)1−α−(1−α)(2−α)(L2−αc−1) {∼L2−αc for α<2=O(1) for α>2 (4)

###### Proposition B.3.

is approximately Gaussian for large and for .

###### Proof.

At any time point we have a number (Proposition B.2) of avalanches which commenced each at times with lengths . Then we may write:

 X(t)=k∑i=1Lβia(t−siLi)

For large, which may be assumed by Proposition B.2, after normalization and centering we apply the Lyapunov central limit condition [22] to:

 ¯¯¯¯¯X(t)=1√kLβ+(α+1)/2ck∑j=1Lβi(a(t−siLi)−μ)

where is the mean value of over the interval . Let , then for the Lyapunov condition to apply we need to show, for some , that:

 1s2+δkk∑j=1E∣∣∣Lβi(a(t−siLi)−μ)∣∣∣2+δ→0

Thus:

 1s2+δkr∑j=1E∣∣∣Lβi(a(t−siLi)−μ)∣∣∣2+δ ∼1k1+δ/2L(β+α/2+1/2)(2+δ)ck∑i=1L(2+δ)βi ∼1k1+δ/2L(β+α/2+1/2)(2+δ)ckLβ(2+δ)+α+1c ∼Lα+1ckδ/2L(α/2+1/2)(2+δ)c ≤1kδ/2Lδ/2c⟶0

This completes the proof for univariate Gaussianity. The proof that convergence is to a multivariate Gaussian process is a simple extension via the Cramer-Wold device [22].

For , for and .

###### Proof.

By Proposition B.3, is a Gaussian process, which is uniquely defined by its first two moments. Over long time-scales, therefore, behaves like fractional Brownian noise [23]. One approach to generating fractional Brownian noise is by long-range filtering of a Gaussian white noise [15]. This brings us exactly into the situation of Proposition B.1. ∎

###### Proposition B.5.

The probability that avalanches of length occur simultaneously vanishes for large when .

###### Proof.

In analogy to Proposition B.2 equal to is the time spent in large avalanches divided by the total recorded time:

 1T∑Li>L′Li ∼qE(\mathbbm1(L′,∞)(L)) ∼∫LcL′L1−αdL ∼L′2−α

Thus the probability that two avalanches of length greater than occur . ∎

###### Proposition B.6.

The Hurst exponent of satisfies the approximate relation:

 Hraw ={β−α2+2 if τ<31/2 if τ>3 (5)
###### Proof.

Following [10] we approximate avalanches by a box function:

 a(t)={1 for t∈[0,1]0 otherwise

The authors show that in this case the autocorrelation function satisfies:

 r(t) ∝∫∞|t|(L−|t|)∫∞0L−αL2βdL ∼|t|2β−α+2

Using the fact that if the autocorrelation function scales as , then the Hurst exponent and are related as [24], so that:

 Hraw =β−α2+2 (6)

We adopt the convention that by we refer to the time-point of . At criticality we have the relationship:

 gω(Lβia(t/Li))∼Lβ′igω(a)(t/Li)

This scaling relationship is illustrated in Figure 5. The left hand panel illustrates avalanches whose heights scale according to , with . The right hand panel displays . One sees that the heights of these filtered avalanches scale more slowly than the original avalanches.

We require an additional known relation between critical exponents. Let be the size of an avalanche, i.e. the total activity occurring over the course of the avalanche: . Then, at criticality:

 S∼S−τ

Then we have the following relation [25] between critical exponents:

 α−1τ−1=β+1 (7)

If we define then we may also define by Equation (7) so that:

 α′−1τ′−1=β′+1 (8)

The size distribution is important for our subsequent analyses because whether the asymptotic properties of the process are dominated by the large avalanches depends on whether the size distributions have infinite variance or not.

.

###### Proof.

We have that . This implies that the standard deviation of scales according to . This is because:

 var(fω,Δω(a(t))) ∼∫ω+Δωω−ΔωP[ω′]dω′ ∼1ωβ+1

And therefore:

 var(fLω,LΔω(a(t))) ∼∫Lω+LΔωLω−LΔωP[ω′]dω′ ∼∫Lω+LΔωLω−LΔω1ω′β+1dω′ ∼∫ω+Δωω−Δω1(Lω′)β+1Ldω′ ∼1Lβvar(fω,Δω(a(t)))

Therefore:

 std(fLω,LΔω(a(t))))∼L−β/2

and:

 fω,Δω(Lβiai(t/Li)) ∼LβifLiω,LiΔω(a)(t/Li) ∼Lβ/2ifω,Δω(a)(t/L)

Therefore by linearity of the Hilbert transform:

 gω(Lβiai(t/Li))∼Lβ/2igω(a)(t/Li)

###### Proposition B.8.

The Hurst exponent of satisfies the approximate relation:

 Hamp ={β2−α2+2 if α≤β+3, α>21/2 otherwise (9)
###### Proof.

may be divided into a sum of long avalanches which do not overlap and short avalanches.

 X(t)=∑Ls,i>L′Lβs,ias,i(t−sLs,i)+∑Ls,i≤L′Lβs,ias,i(t−sLs,i) (10)

Therefore by Proposition B.7:

 fω(X(t))=∑Ls,i>L′Lβ/2s,ifω(as,i(t−sLs,i))+∑Ls,i≤L′Lβ/2s,ifω(as,i(t−sLs,i)) (11)

The left hand term dominates whenever its variance is unbounded for large . This happens if which translates to by Equation (8). Therefore the right hand term may be neglected and since the avalanches of the left hand term are separated in time, the envelope operator may be pulled under the sum so that:

 gω(X(t))∼∑Ls,i>LLβ/2s,igω(as,i(t−sLs,i)) (12)

The same proof as for Proposition B.6 may then be applied to deriving . On the other hand if then the right hand term dominates, for large and has time-scale bounded by —thus in this case . ∎

We now study the DCCA correlation coefficients between amplitude envelopes.

###### Proposition B.9.

Assuming that converges, for and , for .

###### Proof.

For large , each DCCA window is longer than the largest avalanche. For the variance in the left hand term of Equation (11) may be assumed to be non-overlapping since we assume (Propostion B.5) we may neglect small avalanches and assume that a window contains only one avalanche at with length . Then:

 t∑i=1gωj(X)(i)={0 if tt′+L

But up to a constant factor for large avalanches and assuming the integral converges. Thus the correlation tends to 1. ∎

###### Proposition B.10.

Assuming that converges, for and ,
for and .

###### Proof.

For then we have that . This is because, may be approximated by a delta function relative to these frequencies, which has white spectrum. Then the same proof techniques of Propositions B.8 and B.9 may be applied but with Equation (11) replaced by:

 fω(X(t))=∑Ls,i>LLβs,ifω(as,i(t−sLs,i))+∑Ls,i≤LLβs,ifω(as,i(t−sLs,i)) (13)

For , the left hand term dominates, since it has unbounded variance in , which completes the proof. ∎

###### Proposition B.11.

For , for , and .

###### Proof.

The proof is identical to the proofs of Propositions B.1 and B.4, using the results of Proposition B.6. ∎

### b.1 Summary of Results

We term each region of parameters a meta-universality class which gives qualitatively constant behaviour as quantified by Hurst exponents and DCCA correlation coefficients. There are four meta-universality classes MU1–MU4:

• :

 Hraw>0.5 Hωamp=0.5 ρDCCA(n,gω1(X(t)),gω2(X(t)))→0
• and

 Hraw>0.5 Hωamp>0.5 ρDCCA(n,gω1(X(t)),gω2(X(t)))→1
• and

 Hraw>0.5 Hωamp=0.5 ρDCCA(n,gω1(X(t)),gω2(X(t)))→1 as ωi→0
• and

 Hraw=0.5 Hωamp=0.5 ρDCCA(n,gω1(X(t)),gω2(X(t)))→0

## Appendix C Simulations

In all simulations we model the average avalanche shape as a quadratic function: . Moreover we set the variance swell identically so that . For the noise component we take the implementation of [26]. For the power-law cutoff sampling, we perform a density transformation of the uniform distribution. (In MATLAB x = rand(1,T).*(1-)+; x = 1./(x.^(1./(-1))))

### c.1 Simulation 1 (from main text)

We describe here how we obtain the results of Figure 2 of the main text.

Lower panel: we filter white Noise with (using the method of [26]) to yield a process with spectrum scaling according to . We then measure DCCA correlations between amplitudes in the frequency ranges [0.68,0.72] and [0.78,0.82] of half the sampling frequency. with log-spaced between 20 and 9000 and Hurst exponents of the same amplitudes using DFA with window sizes between 1000 and 9000.

Top two panels: we generate two processes assuming CPLAD applied to Equation (1) of the main paper, generating avalanches as described above. We set , and (top, middle resp.). We then apply the same analysis as in the lower panel.

### c.2 Simulation 2

The aim of this simulation is to verify that long-range dependent Gaussian processes satisfy and , where the time series are generated as filtered Gaussian white noise processes. Thus this simulation tests our predictions for the PF model.

To this end we simulate 100 long-range dependent Gaussian processes () (using the method of [26]) of length 15000 time points. In each case the data are then filtered forward and backwards in two separate frequency bands (between 0.39 and 0.41 of the sampling frequency and 0.29 and 0.31 of the sampling frequency) with butterworth filters of order . We then measure DCCA correlations between the Hilbert transforms of these signals and measure their Hurst exponents with DFA, in both cases using window lengths between and .

We repeat this setup for and .

The results are displayed in Figure 6 and show that, up to small sample effects, we expect and zero cross correlations between frequency bands.

### c.3 Simulation 3 (from main text)

In the first simulation, for each pair of exponents in the ranges and , we generate a sample path of length , with a cutoff at , and number of superpositions . The first time points are discarded, to ensure stationarity. We then design Butterworth filters of order 2 between and and between and of the sampling frequency. The data from are then filtered forwards and backwards (yielding effective filter order of 4) and the amplitude envelopes are calculated to yield and . We measure the Hurst exponent of , using DFA, and the DCCA correlation coefficients between and , setting to log spaced values between and . This setup is repeated 100 times, and the results of the simulations are averaged.

### c.4 Simulation 4

In the second simulation we check the prediction of and , setting set , , , with a burn in time of time points (less time is required for convergence for this value of ), setting to log spaced values between and for the estimation of and between and for the estimation of (according to where scaling regions were observed). The results are displayed in Figure 7,

The quality of the estimate is greater for small , whereas the quality of the estimate is greater for larger . This discrepancy may be explained as follows: since has longer tails than , the convergence of its moments is slow for large , thus the quality of the estimate decreases for large . On the other hand, the estimate of requires a linear approximation to the non-linear transform given by the amplitude of the analytic signal: this approximation increases in quality for larger .

### c.5 Simulation 5

The aim of this simulation is to verify the theory of Proposition B.7 checking the heights of the filtered avalanches . We set ,