On the Statistics of Baryon Correlation Functions in Lattice QCD
Abstract
A systematic analysis of the structure of singlebaryon correlation functions calculated with lattice QCD is performed, with a particular focus on characterizing the structure of the noise associated with quantum fluctuations. The signaltonoise problem in these correlation functions is shown, as long suspected, to result from a sign problem. The logmagnitude and complex phase are found to be approximately described by normal and wrapped normal distributions respectively. Properties of circular statistics are used to understand the emergence of a large time noise region where standard energy measurements are unreliable. Powerlaw tails in the distribution of baryon correlation functions, associated with stable distributions and “Lévy flights”, are found to play a central role in their time evolution. A new method of analyzing correlation functions is considered for which the signaltonoise ratio of energy measurements is constant, rather than exponentially degrading, with increasing sourcesink separation time. This new method includes an additional systematic uncertainty that can be removed by performing an extrapolation, and the signaltonoise problem reemerges in the statistics of this extrapolation. It is demonstrated that this new method allows accurate results for the nucleon mass to be extracted from the largetime noise region inaccessible to standard methods. The observations presented here are expected to apply to quantum Monte Carlo calculations more generally. Similar methods to those introduced here may lead to practical improvements in analysis of noisier systems.
pacs:
11.15.Ha, 12.38.Gc,00A0
NPLQCD Collaboration
I Introduction
Modern nuclear physics research relies upon largescale highperformance computing (HPC) to predict the properties of a diverse array of manybody systems, ranging from the properties of hadrons computed from the dynamics of quarks and gluons, through to the form of gravitational waves emitted from inspiraling binary neutron star systems. In many cases, the entangled quantum nature of these systems and the nonlinear dynamics that define them, preclude analytic calculation of their properties. In these cases, precise numerical evaluations of highdimensional integrations that systematically approach the quantum path integral are required. Typically, it is average quantities that are determined by Monte Carlo (MC) path integral evaluations. These average values are to be used subsequently in direct comparison with experiment, as input to analytic frameworks with outputs that can then be compared with experiment, or as predictions for critical components of systems that are inaccessible to experiment such as the equation of state of dense matter in explosive astrophysical environments. Enormous amounts of HPC resources are used in such MC calculations to determine average values of quantities and their uncertainties. The central limit theorem, and in particular the scaling anticipated for the uncertainties associated with average values, are used to make estimates of projected resource requirements. When a system has a “sign problem”, for which the average value of a quantity of interest results from cancellations of (relatively) large contributions, such as found when averaging , the HPC resources required for accurate numerical estimates of the average(s) are prohibitively large. This is the case for numerical evaluations of the path integrals describing strongly interacting systems with even a modest nonzero net baryon number.
While the quantum fluctuations (noise) of manybody systems contain a wealth of information beyond average values, only a relatively small amount of attention has been paid to refining calculations based upon the structure of the noise. This statement, of course, does not do justice to the fact that all observables (Smatrix elements) in quantum field theory calculations can be determined from vacuum expectation values of products of quantum fields. However, in numerical calculations, it is generally the case that noise is treated as a nuisance, something to reduce as much as needed, as opposed to a feature that may reveal aspects of systems that are obscured through distribution among many expectation values. In the area of Lattice Quantum Chromodynamics (LQCD), which is the numerical technique used to evaluate the quantum path integral associated with Quantum Chromodynamics (QCD) that defines the dynamics of quarks and gluons, limited progress has been made toward understanding the structure of the noise in correlation functions and the physics that it contains.
Strongly interacting quantum systems can be described through path integral representations of correlation functions. In principle, MC evaluation of lattice regularized path integrals can solve QCD as well as many strongly interacting atomic and condensed matter theories. In practice, conceptual obstacles remain and large nuclei and nuclear matter are presently inaccessible to LQCD. In the grand canonical formulation, LQCD calculations with nonzero chemical potential face a sign problem where MC sampling weights are complex and cannot be interpreted as probabilities. In the canonical formulation, calculations with nonzero baryon number face a SignaltoNoise (StN) problem where statistical uncertainties in MC results grow exponentially at large times. Like the sign problem, the StN problem arises when the sign of a correlation function can fluctuate, at which point cancellations allow for a mean correlation function of much smaller magnitude than a typical MC contribution.
The nucleon provides a relatively simple and wellstudied example of a complex correlation function with a StN problem. The zeromomentum Euclidean nucleon correlation function is guaranteed to be real by existence of a Hermitian, bounded transfer matrix and the spectral representation
(1) 
where denotes an individual nucleon correlation function calculated from quark propagators in the presence of the th member of a statistical ensemble of gauge field configurations, denotes an average over gauge field ensembles in and an average over quark and gluon fields in the middle term, and are nucleon creation and annihilation interpolating operators, and represent the overlap of these interpolating operators onto the th QCD eigenstates with quantum numbers of the nucleon, is the energy of the corresponding eigenstate, is Euclidean time, is the nucleon mass, and denotes proportionality in the limit . A phase convention for creation and annihilation operators is assumed so that is real for all correlation functions in a statistical ensemble. At small times is approximately real, but at large times it must be treated as a complex quantity. The equilibrium probability distribution for can be formally defined as
(2) 
where is a gauge field, is the nucleon correlation function in the presence of a background gauge field , is the Haar measure for the gauge group, and is the gauge action arising after all dynamical matter fields have been integrated out. For convenient comparison with LQCD results, a lattice regulator with a lattice spacing equal to unity will be assumed throughout. Unless specified, results will not depend on details of the ultraviolet regularization of .
MC integration of the path integral representation of a partition function, as performed in LQCD calculations, provides a statistical ensemble of background quantum fields. Calculation of in an ensemble of QCDvacuumdistributed gauge fields provides a statistical ensemble of correlation functions distributed according to . Understanding the statistical properties of this ensemble is essential for efficient MC calculations, and significant progress has been achieved in this direction since the early days of lattice field theory. Following Parisi Parisi (1984), Lepage Lepage (1989) argued that has a StN problem where the noise, or square root of the variance of , becomes exponentially larger than the signal, or average of , at large times. It is helpful to review the pertinent details of ParisiLepage scaling of the StN ratio.
Higher moments of are themselves field theory correlation functions with welldefined spectral
representations.
(3) 
At large times, the nucleon StN ratio is determined by the slowestdecaying moments at large times, taking the form,
(4) 
and is therefore exponentially small.
Beyond moments, the general form of correlation function distributions has also been investigated. Endres, Kaplan, Lee, and Nicholson Endres et al. (2011a) found that correlation functions in the nonrelativistic quantum field theory describing unitary fermions possess approximately lognormal distributions. They presented general arguments, that are discussed below, suggesting that this behavior might be a generic feature of quantum field theories. Knowledge of the approximate form of the correlation function distribution was exploited to construct an improved estimator, the cumulant expansion, that was productively applied to subsequent studies of unitary fermions Endres et al. (2011b, c); Lee et al. (2011); Endres et al. (2013). Correlation function distributions have been studied analytically in the NambuJonaLasinio model Grabowska et al. (2013); Nicholson et al. (2012), where it was found that real correlation functions were approximately lognormal but complex correlation functions in a physically equivalent formulation of the theory were broad and symmetric at large times with qualitative similarities to the QCD nucleon distribution. DeGrand DeGrand (2012) observed that meson, baryon, and gaugefield correlation functions in gauge theories with several choices of are also approximately lognormal at small times where imaginary parts of correlation functions can be neglected. These various observations provide strong empirical evidence that the distributions of real correlation functions in generic quantum field theories are approximately lognormal.
A generalization of the lognormal distribution for complex random variables that approximately describes the QCD nucleon correlation function at large times is presented in this work. To study the logarithm of a complex correlation function, it is useful to introduce the magnitudephase decomposition
(5) 
At small times where the imaginary part of is negligible, previous observations of lognormal correlation
functions DeGrand (2012) demonstrate that is approximately normally distributed. It is shown below that
is approximately normal at all times, and that is approximately normal at small times.
Statistical analysis of is complicated by the fact that it is defined modulo .
In particular, the sample mean of a phase defined on does not necessarily provide a faithful
description of the intuitive average phase (consider a symmetric distribution peaked around
with a sample mean close to zero).
Suitable statistical tools for analyzing are found in the theory of circular statistics and as will
be seen below that is described by an approximately wrapped normal distribution.
Sec. II discusses standard statistical analysis methods in LQCD that introduce concepts used below. In Section III, the magnitudephase decomposition of the nucleon correlation function and connections to the StN problem are discussed. Section III.1 describes the distributions of the logmagnitude and its time derivative in more detail, while Section III.2 describes the distribution of the complex phase and its time derivative and explains how their features lead to systematic bias in standard estimators during a largetime region that is dominated by noise. Section IV draws on these observations to propose an estimator for the nucleon mass in which accurate results can be extracted from the largetime noise region with a precision that is constant in sourcesink separation time but exponentially degrading in an independent time parameter . Section V conjectures about applications to the spectra of generic complex correlation functions and concludes.
Ii Relevant Aspects of Standard Analysis Methods of Correlation Functions
Typically, in calculations of meson and baryon masses and their interactions, correlation functions are generated from combinations of quark and gluonlevel sources and sinks with the appropriate hadronlevel quantum numbers. Linear combinations of these correlation functions are formed, either using VariationalMethod type techniques Luscher and Wolff (1990), the MatrixProny technique Beane et al. (2009a), or other less automated methods, in order to optimize overlap onto the lowest lying states in the spectrum and establish extended plateaus in relevant effective mass plots (EMPs). In the limit of an infinite number of independent measurements, the expectation value of the correlation function is a real number at all times, and the imaginary part can be discarded as it is known to average to zero. The largetime behavior of such correlation functions becomes a single exponential (for an infinite timedirection) with an argument determined by the groundstate energy associated with the particular quantum numbers, or more generally the energy of the lowestlying state with nonnegligible overlap.
The structure of the source and sink play a crucial role in determining the utility of sets of correlation functions. For many observables of interest, it is desirable to optimize the overlap onto the ground state of the system, and to minimize the overlap onto the correlation function dictating the variance of the ground state. In the case of the single nucleon, the sources and sinks, , are tuned in an effort to have maximal overlap onto the groundstate nucleon, while minimizing overlap of onto the threepion ground state Detmold and Endres (2015). NPLQCD uses momentum projected hadronic blocks Beane et al. (2006) generated from quark propagators originating from localized smeared sources to suppress the overlap into the threepion ground state by a factor of where is the lattice volume, e.g. Ref. Beane et al. (2009a). For such constructions, the variance of the average scales as at large times, where is the number of statistically independent correlation functions, while the nucleon correlation function scales as . For this set up, the StN ratio scales as , from which it is clear that exponentially large numbers of correlation functions or volumes are required to overcome the StN problem at large times. The situation is quite different at small and intermediate times in which the variance correlation function is dominated, not by the threepion ground state, but by the “connected” nucleonantinucleon excited state, which provides a variance contribution that scales as .
This time interval where the nucleon correlation function is in its ground state and the variance correlation function is in a nucleonantinucleon excited state has been called the “golden window” Beane et al. (2009a) (GW). The variance in the GW is generated, in part, by the distribution of overlaps of the source and sink onto the ground state, that differs at each lattice site due to variations in the gluon fields. In the work of NPLQCD, correlation functions arising from Gaussiansmeared quarkpropagator sources and pointlike or Gaussiansmeared sinks that have been used to form singlebaryon hadronic blocks. Linear combinations of these blocks are combined with coefficients (determined using the MatrixProny technique of Ref. Beane et al. (2009a) or simply by minimizing the /dof in fitting a constant to an extended plateau region) that extend the singlebaryon plateau region to earlier times, eliminating the contribution from the first excited state of the baryon and providing access to smaller timeslices of the correlation functions where StN degradation is less severe. Highstatistics analyses of these optimized correlation functions have shown that GW results are exponentially more precise and have a StN ratio that degrades exponentially more slowly than larger time results Beane et al. (2009a, b, 2010) (for a review, see Ref. Beane et al. (2011)). In particular StN growth in the GW has been shown to be consistent with an energy scale close to zero, as is expected from a variance correlation function dominated by baryon, as opposed to meson, states. Despite the ongoing successes of GW analyses of fewbaryon correlation functions, the GW shrinks with increasing baryon number Beane et al. (2009a, b, 2010) and calculations of larger nuclei may require different analysis strategies suitable for correlation function without a GW.
EMPs, such as that associated with the baryon shown in Fig. 1, are formed from ratios of correlation functions, which become constant when only a single exponential is contributing to the correlation function,
(6) 
where is the ground state energy in the channel with appropriate quantum numbers.
The average over gauge field configurations is typically over correlation functions derived from multiple source points on multiple gaugefield configurations.
This is wellknown technology and is a “workhorse” in the analysis of LQCD calculations.
Typically, corresponds to one temporal lattice spacing, and the jackknife and bootstrap resampling techniques are used to generate covariance matrices in the plateau interval used to extract the
groundstate energy from a correlated minimization DeGrand and Toussaint (1990); Beane et al. (2011, 2015).
It is known that baryon correlation functions contain strong correlations over time scales, and that these correlations are sensitive the presence of outliers. Fig. 2 shows the distribution of the real part of smalltime nucleon correlation functions, which resembles a heavytailed lognormal distribution DeGrand (2012). Lognormal distributions are associated with a larger number of “outliers” than arise when sampling a Gaussian distribution, and the sample mean of these smalltime correlation function will be strongly affected by the presence of these outliers. The distribution of baryon correlation functions at very large sourcesink separations is also heavytailed; David Kaplan has analyzed the real parts of NPLQCD baryon correlation functions and found that they resemble a stable distribution Kaplan (2014). Cancellations between positive and negative outliers occur in determinations of the sample mean of this largetime distribution, leading to different statistical issues that are explored in detail in Sec. III.
To analyze temporal correlations in baryon correlation functions in more detail, results for inverse covariance matrices generated through bootstrap resampling of the baryon effective mass are shown in Fig. 3. The size of offdiagonal elements in the inverse covariance matrix directly sets the size of contributions to the leastsquares fit result from temporal correlations in the effective mass, and so it is appropriate to use their magnitude to describe the strength of temporal correlations. The inverse covariance matrix is seen to possess large offdiagonal elements associated with small time separations that appear to decrease exponentially with increasing time separation at a rate somewhat faster than . Mild variation in the inverse covariance matrix is seen when is varied. Since correlations between and are seen in Fig. 3 to decrease rapidly as becomes large compared to hadronic correlation lengths, is expected that small distance correlations in the covariance matrix decrease when and are separated by and Fig. 2, though such an effect is not clearly visible in the inverse covariance matrix on the logarithmic scale shown.
The role of outliers in temporal correlations on timescales is highlighted in Fig. 4, where inverse covariance matrices determined with the HodgesLehmann estimator are shown. The utility of robust estimators, such as the median and the HodgesLehmann estimator, with reduced sensitivity to outliers, has been explored in Ref. Beane et al. (2015). When the median and average of a function are known to coincide, there are advantages to using the median or HodgesLehmann estimator to determine the average of a distribution. The associated uncertainty can be estimated with the “median absolute deviation” (MAD), and be related to the standard deviation with a wellknown scaling factor. Offdiagonal elements in the inverse covariance matrix associated with timescales are visibly smaller on a logarithmic scale when the covariance matrix is determined with the HodgesLehmann estimator instead of the sample mean. This decrease in smalltime correlations when a robust estimator is employed strongly suggests that shorttime correlations on scales are associated with outliers.
Iii A MagnitudePhase Decomposition
In terms of the logmagnitude and phase, the mean nucleon correlation functions is
(7) 
In principle, could be included in the MC probability distribution used for importance sampling. With this approach, would contribute as an additional term in a new effective action. The presence of nonzero demonstrates that this effective action would have an imaginary part. The resulting weight therefore could not be interpreted as a probability and importance sampling could not proceed; importance sampling of faces a sign problem. In either the canonical or grand canonical approach, onebaryon correlation functions are described by complex correlation functions that cannot be directly importance sampled without a sign problem, but it is formally permissible to importance sample according to the vacuum probability distribution, calculate the phase resulting from the imaginary effective action on each background field configuration produced in this way, and average the results on an ensemble of background fields. This approach, known as reweighting, has a long history in grand canonical ensemble calculations but has been generically unsuccessful because statistical averaging is impeded by large fluctuations in the complex phase that grow exponentially with increasing spacetime volume Gibbs (1986); Splittorff and Verbaarschot (2007a, b). Canonical ensemble nucleon calculations averaging over background fields importance sampled with respect to the vacuum probability distribution are in effect solving the sign problem associated with nonzero by reweighting. As emphasized by Ref. Grabowska et al. (2013), similar chiral physics is responsible for the exponentially hard StN problem appearing in canonical calculations and exponentially large fluctuations of the complex phase in grand canonical calculations.
Reweighting a pure phase causing a sign problem generically produces a StN problem in theories with a mass gap. Suppose for some . Then because by construction, has the StN ratio
(8) 
which is necessarily exponentially small at large times. Nonzero guarantees that statistical sampling of has a StN problem. Strictly, this argument applies to a pure phase but not to a generic complex observable such as which might receive zero effective mass contribution from and could have important correlations between and . MC LQCD studies are needed to understand whether the pure phase StN problem of Eq. (8) captures some or all of the nucleon StN problem of Eq. (4).
To determine the largetime behavior of correlation functions, it is useful to consider the effectivemass estimator commonly used in LQCD spectroscopy, a special case of eq. (6),
(9) 
As , the average correlation function can be described by a single exponential whose decay rate is set by the ground state energy, and therefore . The uncertainties associated with can be estimated by resampling methods such as bootstrap. The variance of is generically smaller than that of due to cancellations arising from correlations between and across bootstrap ensembles. Assuming that these correlations do not affect the asymptotic scaling of the variance of , propagation of uncertainties for bootstrap estimates of the variance of shows that the variance of scales as
(10) 
An analogous effectivemass estimator for the largetime exponential decay of the magnitude is
(11) 
and an effectivemass estimator for the phase is
(12) 
where the reality of the average correlation function has been used.
Figure 5 shows EMPs for , , and calculated from the LQCD ensemble described previously. The mass of the nucleon, determined from a constant fit in the shaded plateau region indicated in Fig. 5, is found to be , in agreement with the mass obtained from the golden window in previous studies Orginos et al. (2015) of . and do not visually plateau until much larger times. For the magnitude, a constant fit in the shaded region gives an effective mass which is close to the value Orginos et al. (2015) indicated by the red line. For the phase, a constant fit to the shaded region gives an effective mass , which is consistent with the value Orginos et al. (2015) indicated by the red line. It is unlikely that the phase has reached its asymptotic value by this time, but a signal cannot be established at larger times. For , large fluctuations lead to complex effective mass estimates for and and unreliable estimates and uncertainties. agrees with up to corrections at all times, demonstrating that the magnitude and cosine of the complex phase are approximately uncorrelated at the few percent level. This suggests the asymptotic scaling of the nucleon correlation function can be approximately decomposed as
(13) 
For small times , the means and variances of and agree up to a small contribution from . This indicates that the real part of the correlation function is nearly equal to its magnitude at small times. At intermediate times , the contribution of grows relative to , and for the variance of the full effective mass is nearly saturated by the variance of , as shown in Fig. 6. At intermediate times a linear fit normalized to with slope provides an excellent fit to bootstrap estimates of , in agreement with the scaling of Eq. (10). is indistinguishable from in this region, and has an identical StN problem. has much more mild time variation, and can be reliably estimated at all times without almost no StN problem. At intermediate times, the presence of nonzero signaling a sign problem in importance sampling of appears responsible for the entire nucleon StN problem.
approaches its asymptotic value much sooner than or . This indicates that the overlap of onto the threepion ground state in the variance correlation function is greatly suppressed compared to the overlap of onto the onenucleon signal ground state. Optimization of the interpolating operators for high signal overlap contributes to this. Another contribution arises from momentum projection, which suppresses the variance overlap factor by Beane et al. (2009b). A large hierarchy between the signal and noise overlap factors provides a GW visible at intermediate times . In the GW, approaches it’s asymptotic value but begins to grow exponentially and is suppressed compared to . Reliable extractions of are possible in the GW.
The effects of blocking, that is averaging subsets of correlation functions and analyzing the distribution of the averages, are shown in Fig. 7. is suppressed compared to for larger times in the blocked ensemble, and the logmagnitude saturates the average and variance of through intermediate times . Blocking does not actually reduce the total variance of . Variance in is merely shifted from the phase to the logmagnitude at intermediate times. This is reasonable, since the imaginary part of vanishes on average and so blocked correlation functions will have smaller imaginary parts. Still, blocking does not affect and only affects bootstrap estimates of at the level of correlations between correlation functions in the ensemble. Blocking also does not delay the onset of a largetime noise region where and cannot be reliably estimated.
Eventually the scaling of begins to deviate from Eq. (10), and in the noise region the observed variance remains approximately constant (up to large fluctuations). This is inconsistent with ParisiLepage scaling. While the onset of the noise region is close to the midpoint of the time direction , a qualitatively similar onset occurs at earlier times in smaller statistical ensembles. Standard statistical estimators therefore do not reproduce the scaling required by basic principles of quantum field theory in the noise region. This suggests systematic deficiencies leading to unreliable results for standard statistical estimation of correlation functions in the noise region. The emergence of a noise region where standard statistical tools are unreliable can be understood in terms of the circular statistics describing and is explained in Sec. III.2. A more straightforward analysis of the distribution of is first presented below.
iii.1 The Magnitude
Histograms of the nucleon logmagnitude are shown in Fig. 9. Particularly at large times, the distribution of is approximately described by a normal distribution. Fits to a normal distribution are qualitatively good but not exact, and deviations between normal distribution fits and results are visible in Fig. 9.
Cumulants of can be used to quantify these deviations, which can be recursively calculated from its moments by
(14) 
The first four cumulants of a probability distribution characterize its mean, variance, skewness, and kurtosis respectively. If were exactly lognormal, the first and second cumulants of , its mean and variance, would fully describe the distribution. Third and higher cumulants of would all vanish for exactly lognormal . Fig. 10 shows the first four cumulants of . Estimates of higher cumulants of become successively noisier.
The cumulant expansion of Ref. Endres et al. (2011a) relates the effective mass of a correlation function to the cumulants of the logarithm of the correlation function. The derivation of Ref. Endres et al. (2011a) is directly applicable to . The characteristic function , defined as the Fourier transform of the probability distribution function of , can be described by a Taylor series for whose coefficients are precisely the cumulants of ,
(15) 
The average magnitude of is given in terms of this characteristic function by
(16) 
This allows application of the cumulant expansion in Ref. Endres et al. (2011a) to the effective mass in Eq. (11) to give,
(17) 
Since with vanishes for normally distributed , the cumulant expansion provides a rapidly convergent series for correlation functions that are close to, but not exactly, lognormally distributed. Note that the righthandside of Eq. (17) is simply a discrete approximation suitable for a lattice regularized theory of the time derivative of the cumulants.
Results for the effective mass contributions of the first few terms in the cumulant expansion of Eq. (17) are shown in Fig. 11. The contribution , representing the time derivative of the mean, provides an excellent approximation to after small times. provides a very small negative contribution to , and contributions from and are statistically consistent with zero. As approaches its asymptotic value, the logmagnitude distribution can be described to highaccuracy by a nearly normal distribution with very slowly increasing variance and small, approximately constant . The slow increase of the variance of is consistent with observations above that has no severe StN problem. It is also consistent with expectations that describes a (partiallyquenched) threepion correlation function with a very mild StN problem, with a scale set by the attractive isoscalar pion interaction energy.
As Eq. (17) relates to time derivatives of moments of , it is interesting to consider the distribution of the time derivative . Defining generic finite differences,
(18) 
the time derivative of lattice regularized results can be defined as the finite difference,
(19) 
If and were statistically independent, it would be straightforward to extract the time derivatives of the moments of from the moments of . The presence of correlations in time, arising from nontrivial QCD dynamics, obstructs a naive extraction of from moments of . For instance, without knowledge of it is impossible to extract the time derivative of the variance of from the variance of . While the time derivative of the mean of is simply the mean of , time derivatives of the higher cumulants of cannot be extracted from the cumulants of without knowledge of dynamical correlations.
The cumulants of are shown in Fig. 12. As expected, the mean of approaches at large times. The variance of is tending to a plateau which is approximately onethird of the variance of . This implies there are correlations between and that are on the same order of the individual variances of and . This is not surprising, given that the QCD correlation length is larger than the lattice spacing. No statistically significant is seen for at large times, but a statistically significant positive is found. Normal distribution fits to are found to be poor, as shown in Fig. 13, as they underestimate both the peak probability and the probability of finding “outliers” in the tails of the distribution. Interestingly, Fig. 12, and histograms of shown in Fig. 13, suggest that the distribution of becomes approximately timeindependent at large times.
Stable distributions are found to provide a much better description of , and are consistent with the heuristic arguments for lognormal correlation functions given in Ref. Endres et al. (2011a). Generic correlation functions can be viewed as products of creation and annihilation operators with many transfer matrix factors describing Euclidean time evolution. It is difficult to understand the distribution of products of transfer matrices in quantum field theories, but following Ref. Endres et al. (2011a) insight can be gained by considering products of random positive numbers. As a further simplification, one can consider a product of independent, identically distributed positive numbers, each schematically representing a product of many transfer matrices describing time evolution over a period much larger than all temporal correlation lengths. Application of the central limit theorem to the logarithm of a product of many independent, identically distributed random numbers shows that the logarithm of the product tends to become normally distributed as the number of factors becomes large. The central limit theorem in particular assumes that the random variables in question originate from distributions that have a finite variance. A generalized central limit theorem proves that sums of heavytailed random variables tend to become distributed according to stable distributions (that include the normal distribution as a special case), suggesting that stable distributions arise naturally in the logs of products of random variables.
Stable distributions are named as such because their shape is stable under averaging of independent copies of a random variable.
Formally, stable distributions form a manifold of fixed points in a Wilsonian space of probability distributions where averaging
independent random variables from the distribution plays the role of renormalization group evolution.
A parameter , called the index of stability, dictates the shape of a stable distribution and remains fixed under averaging transformations.
All probability distributions with finite variance evolve under averaging towards the normal distribution,
a special case of the stable distribution with .
Heavytailed distributions with illdefined variance evolve towards generic stable distributions with .
In particular, stable distributions with have powerlaw tails;
for a stable random variable the tails decay as .
The heavytailed Cauchy, Levy, and Holtsmark distributions are special cases of stable distributions with and respectively,
that arise in physical applications.
Stable distributions for a real random variable are defined via Fourier transform,
(20) 
of their characteristic functions
(21) 
where is the index of stability, determines the skewness of the distribution, is the location of peak probability, sets the width. For , the above parametrization does not hold and should be replaced by . For the mean is , and for the mean is illdefined. For the variance is and Eq. (21) implies the distribution is independent of , while for the variance is illdefined.
The distributions of obtained from the LQCD calculations can be fit to stable distributions through maximum likelihood estimation of the stable parameters and , obtaining the results that are shown in Fig. 14. Estimates of are consistent with , corresponding to a normal distribution. This is not surprising, because higher moments of would be illdefined and diverge in the infinite statistics limit if were literally described by a heavytailed distribution. is strictly illdefined when , but results consistent with indicate negative skewness in agreement with observations above. Estimates of and are consistent with the cumulant results above if a normal distribution () is assumed. Fits of to generic stable distributions are shown in Fig. 9, and are roughly consistent with fits to a normal distribution, though some skewness is captured by the stable fits.
Stable distribution fits to indicate statistically significant deviations from a normal distribution (), as seen in Fig. 15. The largetime distribution of appears time independent, and fitting in the largetime plateau region gives an estimate of the largetime index of stability. Recalling describes a finite difference over a physical time interval of one lattice spacing, the estimated index of stability is
(22) 
Maximum likelihood estimates for are consistent with the sample mean, and is consistent with zero in agreement with observations of vanishing skewness. Therefore, the distribution of is symmetric, as observed in Fig. 13, with powerlaw tails scaling as over this time interval of .
The value of depends on the physical time separation used in the finite difference definition Eq. (18), and stable distribution fits can be performed for generic finite differences . For all , the distribution of becomes time independent at large times. Histograms of the largetime distributions for are shown in Fig. 16, and the best fit largetime values for and are shown in Fig. 17. Since QCD has a finite correlation length, can be described as the difference of approximately normally distributed variables at large . In the large limit, is therefore necessarily almost normally distributed, and correspondingly, , shown in Fig. 17, increases with and begins to approach the normal distribution value for large . A large plateau in is observed that demonstrates small but statistically significant departures from . This deviation is consistent with the appearance of small but statistically significant measures of nonGaussianity in seen in Fig. 10. Heavytailed distributions are found to be needed only to describe the distribution of when is small enough such that and are physically correlated. In some sense, the deviations from normally distributed differences, i.e. , are a measure the strength of dynamical QCD correlations on the scale .
The heavytailed distributions of for dynamically correlated time separations correspond to time evolution that is quite different to that of diffusive Brownian motion describing the quantum mechanical motion of free point particles. Rather than Brownian motion, heavytailed jumps in correspond to a superdiffusive random walk or Lévy flight. Powerlaw, rather than exponentially suppressed, large jumps give Lévy flights a qualitatively different character than diffusive random walks, including fractal selfsimilarity, as can be seen in Fig. 18.
The dynamical features of QCD that give rise to superdiffusive time evolution are presently unknown, however,
we conjecture that instantons play a role.
Instantons are associated with large, localized fluctuations in gauge fields,
and we expect that instantons may also be responsible for infrequent, large fluctuations
in hadronic correlation functions generating the tails of the distribution.
It would be interesting to understand if
can be simply related to observable properties of the nucleon.
It is also not possible to say from this single study whether has a welldefined
continuum limit for infinitesimal .
Further LQCD studies are required to investigate the
continuum limit of .
Lattice field theory studies of other systems and
calculations of in perturbation theory, effective field theory,
and models of QCD could provide important insights into the dynamical origin of superdiffusive time evolution.
One feature of LQCD results is not well described by a stable distribution. The variance of heavytailed distributions is illdefined, and were truly described by a heavytailed distribution then the variance and higher cumulants of would increase without bound as the size of the statistical ensemble is increased. This behavior is not observed. While the distribution of is welldescribed by a stable distribution near its peak, the extreme tails of the distribution of decay sufficiently quickly that the variance and higher cumulants of shown in Fig. 12 give statistically consistent results as the statistical ensemble size is varied. This suggests that is better described by a truncated stable distribution, a popular model for, for example, financial markets exhibiting high volatility but with a natural cutoff on trading prices, in which some form of sharp cutoff is added to the tails of a stable distribution Voit (2005). Note that the tails of the distribution describe extremely rapid changes in the correlation function and are sensitive to ultraviolet properties of the theory. One possibility is that describes a stable distribution in the presence of a (perhaps smooth) cutoff arising from ultraviolet regulator effects that damps the stable distribution’s powerlaw decay at very large . Further studies at different lattice spacings will be needed to understand the form of the truncation and whether the truncation scale is indeed set by the lattice scale. It is also possible that there is a strong interaction length scale providing a modification to the distribution at large , and it is further possible that stable distributions only provide an approximate description at all . For now we simply observe that a truncated stable distribution with an unspecified highscale modification provides a good empirical description of .
Before turning to the complex phase of , we summarize the main findings about the logmagnitude:

The logmagnitude of the nucleon correlation function in LQCD is approximately normally distributed with small but statistically significant negative skewness and positive kurtosis.

The magnitude effective mass approaches at large times, consistent with expectations from ParisiLepage scaling for the nucleon variance . The plateau of marks the start of the golden window where excited state systematics are negligible and statistical uncertainties are increasing slowly. The much largertime plateau of roughly coincides with the plateau of to and occurs after variance growth of reaches the ParisiLepage expectation . Soon after, a noise region begins where the variance of stops increasing and the effective mass cannot be reliably estimated.

The logmagnitude does not have a severe StN problem, and can be measured accurately across all 48 timesteps of the present LQCD calculations. The variance of the logmagnitude distribution only increases by a few percent in 20 timesteps after visibly plateauing.

The cumulant expansion describes as a sum of the time derivatives of the cumulants of the log of the correlation function. At large times, the time derivative of the mean of is constant and approximately equal to . Contributions to from the variance and higher cumulants of are barely resolved in the sample of correlation functions.

Finite differences in , , are described by time independent distributions at large times. For large compared to the QCD correlation length, describes a difference of approximately independent normal random variables and is therefore approximately normally distributed. For small , describes a difference of dynamically correlated variables. The mean of is equal to the time derivative of the mean of and therefore provides a good approximation to . The time derivatives of higher cumulants of cannot be readily extracted from cumulants of without knowledge of dynamical correlations.

At large times, is well described by a symmetric, heavytailed, truncated stable distribution. The presence of heavy tails in indicates that is not described by free particle Brownian motion but rather by a superdiffusive Lévy flight. Deviations of the index of stability of from a normal distribution quantify the amount of dynamical correlations present in the nucleon system, the physics of which is yet to be understood. Further studies are required to determine the continuum limit value of the index of stability associated with and the dynamical origin and generality of superdiffusive Lévy flights in quantum field theory correlation functions.
iii.2 The Phase
The reality of average correlation functions requires that the distribution of be symmetric under . Cumulants of calculated from sample moments in analogy to Eq. (14) are shown in Fig. 19.
The mean and are noisy but statistically consistent with zero as expected. The variance and are small at small times since every sample of is defined to vanish at , and grow linearly at intermediate times around the golden window. After , this linear growth slows and they become constant at large times, and are consistent with results from a uniform distribution.
Histograms of shown in Fig. 20 qualitatively suggest that is described by a narrow, approximately normal distribution at small times and an increasingly broad, approximately uniform distribution at large times. is only defined modulo and can be described as a circular variable defined on the interval . The distribution of can therefore be described with angular histograms, as shown in Fig. 21. Again, resembles a uniform circular random variable at large times.
A cumulant expansion can be readily constructed for . The mean phase is given in terms of the characteristic function and cumulants of by
(23) 
and the appropriate cumulant expansion for is therefore, using Eq. (12),
(24) 
Factors of dictate that a linearly increasing variance of makes a positive contribution to , in contradistinction to the slight negative contribution to made by linearly increasing variance of . Since the mean of necessarily vanishes, the variance of makes the dominant contribution to Eq. (24) for approximately normally distributed . For this contribution to be positive, the variance of must increase, indicating that has a StN problem. For the case of approximately normally distributed , nonzero requires a StN problem for the phase.
Contributions to Eq. (24) from the first four cumulants of are shown in Fig. 22. Contributions from odd cumulants are consistent with zero, as expected by symmetry. The variance provides the dominant contribution to at small and intermediate times, and is indistinguishable from the total calculated using the standard effective mass estimator for . Towards to end of the golden window , the variance contribution to the effective mass begins to decrease. At very large times contributions to from the variance are consistent with zero. The fourth cumulant makes smaller but statistically significant contributions to at intermediate times. Contributions from the fourth cumulant also decrease and are consistent with zero at large times. The vanishing of these contributions results from the distribution becoming uniform at large times, and time independent as a consequence. These observations signal a breakdown in the cumulant expansion at large times where contributions from the variance do not approximate standard estimates of . Notably, the breakdown of the cumulant expansion at coincides with plateaus to uniform distribution cumulants in Fig. 19 and with the onset of the noise region discussed in Sec. III.
Observations of these unexpected behaviors of in the noise region hint at more fundamental issues with the statistical description of used above. A sufficiently localized probability distribution of a circular random variable peaked far from the boundaries of can be reliably approximated as a standard probability distribution of a linear random variable defined on the real line. For broad distributions of a circular variable, the effects of a finite domain with periodic boundary conditions cannot be ignored. While circular random variables are not commonly encountered in quantum field theory, they arise in many scientific contexts, most notably in astronomy, biology, geography, geology, meteorology and oceanography. Familiarity with circular statistics is not assumed here, and a few basic results relevant for understanding the statistical properties of will be reviewed without proof. Further details can be found in Refs. Fisher (1995); Mardia and Jupp (2009); Borradaile (2003) and references therein.
A generic circular random variable can be described by two linear random variables and with support on the line interval where periodic boundary conditions are not imposed. It is the periodic identification of that makes sample moments poor estimators of the distribution of and, in particular, allows the sample mean of a distribution symmetrically peaked about to be opposite the actual location of peak probability. Parameter estimation for circular distributions can be straightforwardly performed using trigonometric moments of and . For an ensemble of random angles , the first trigonometric moments are defined by the sample averages,
(25) 
Higher trigonometric moments can be defined analogously but will not be needed here. The average angle can be defined in terms of the mean twodimensional vector as
(26) 
A standard measure of a circular distribution’s width is given in terms of trigonometric moments as
(27) 
where should be viewed as a measure of the concentration of a circular distribution. Smaller corresponds to a broader, more uniform distribution, while larger corresponds to a more localized distribution.
One way of defining statistical distributions of circular random variables is by “wrapping” distributions for linear random variables around the unit circle. The probability of a circular random variable equaling some value in is equal to the sum of the probabilities of the linear random variable equaling any value that is equivalent to modulo . Applying this prescription to a normally distributed linear random variable gives the wrapped normal distribution
(28) 
where the second form follows from the Poisson summation formula. Wrapped distributions share the same characteristic functions as their unwrapped counterparts, and the second expression above can be derived as a discrete Fourier transform of a normal characteristic function. The second sum above can also be compactly represented in terms of elliptic functions. For the wrapped normal distribution qualitatively resembles a normal distribution, but for the effects of wrapping obscure the localized peak. As , the wrapped normal distribution becomes a uniform distribution on . Arbitrary trigonometric moments and therefore the characteristic function of the wrapped normal distribution are given by
(29) 
Parameter estimation in fitting a wrapped normal distribution to LQCD results for can be readily performed by relating and above to these trigonometric moments as
(30) 
Note that Eq. (30) holds only in the limit of infinite statistics. Estimates for the average of a wrapped normal distribution are consistent with zero at all times, as expected. Wrapped normal probability distribution functions with determined from Eq. (30) are shown with the histograms of Fig. 20 and provide a good fit to the data at all times.
The appearance of a uniform distribution at large times is consistent with the heuristic argument that the logarithm of a correlation function should be described by a stable distribution. The uniform distribution is a stable distribution for circular random variables, and in fact is the only stable circular distribution Mardia and Jupp (2009). The distribution describing a sum of many linear random variables broadens as the number of summands is increased, and the same is true of circular random variables. A theorem of Poincaré proves that as the width of any circular distribution is increased without bound, the distribution will approach a uniform distribution. One therefore expects that the sum of many welllocalized circular random variables might initially tend towards a narrow wrapped normal distribution while boundary effects are negligible. Eventually as more terms are added to the sum this wrapped normal distribution will broaden and approach a uniform distribution. This intuitive picture appears consistent with the time evolution of shown in Figs. 20, 21.