On the Statistics of Baryon Correlation Functions in Lattice QCD

Abstract

A systematic analysis of the structure of single-baryon correlation functions calculated with lattice QCD is performed, with a particular focus on characterizing the structure of the noise associated with quantum fluctuations. The signal-to-noise problem in these correlation functions is shown, as long suspected, to result from a sign problem. The log-magnitude and complex phase are found to be approximately described by normal and wrapped normal distributions respectively. Properties of circular statistics are used to understand the emergence of a large time noise region where standard energy measurements are unreliable. Power-law tails in the distribution of baryon correlation functions, associated with stable distributions and “Lévy flights”, are found to play a central role in their time evolution. A new method of analyzing correlation functions is considered for which the signal-to-noise ratio of energy measurements is constant, rather than exponentially degrading, with increasing source-sink separation time. This new method includes an additional systematic uncertainty that can be removed by performing an extrapolation, and the signal-to-noise problem re-emerges in the statistics of this extrapolation. It is demonstrated that this new method allows accurate results for the nucleon mass to be extracted from the large-time noise region inaccessible to standard methods. The observations presented here are expected to apply to quantum Monte Carlo calculations more generally. Similar methods to those introduced here may lead to practical improvements in analysis of noisier systems.

pacs:
11.15.Ha, 12.38.Gc

NPLQCD Collaboration

I Introduction

Modern nuclear physics research relies upon large-scale high-performance computing (HPC) to predict the properties of a diverse array of many-body systems, ranging from the properties of hadrons computed from the dynamics of quarks and gluons, through to the form of gravitational waves emitted from inspiraling binary neutron star systems. In many cases, the entangled quantum nature of these systems and the nonlinear dynamics that define them preclude analytic calculation of their properties. In these cases, precise numerical evaluations of high-dimensional integrations that systematically approach the quantum path integral are required. Typically, it is average quantities that are determined by Monte Carlo (MC) path integral evaluations. These average values are subsequently used in direct comparison with experiment, as input to analytic frameworks with outputs that can then be compared with experiment, or as predictions for critical components of systems that are inaccessible to experiment such as the equation of state of dense matter in explosive astrophysical environments. Enormous amounts of HPC resources are used in such MC calculations to determine average values of quantities and their uncertainties. The central limit theorem, and in particular the scaling anticipated for the uncertainties associated with average values, are used to make estimates of projected resource requirements. When a system has a “sign problem”, for which the average value of a quantity of interest results from cancellations of (relatively) large contributions, such as found when averaging , the HPC resources required for accurate numerical estimates of the average(s) are prohibitively large. This is the case for numerical evaluations of the path integrals describing strongly interacting systems with even a modest non-zero net baryon number.

While the quantum fluctuations (noise) of many-body systems contain a wealth of information beyond average values, only a relatively small amount of attention has been paid to refining calculations based upon the structure of the noise. This statement, of course, does not do justice to the fact that all observables (S-matrix elements) in quantum field theory calculations can be determined from vacuum expectation values of products of quantum fields. However, in numerical calculations, it is generally the case that noise is treated as a nuisance, something to reduce as much as needed, as opposed to a feature that may reveal aspects of systems that are obscured through distribution among many expectation values. In the area of Lattice Quantum Chromodynamics (LQCD), which is the numerical technique used to evaluate the quantum path integral associated with Quantum Chromodynamics (QCD) that defines the dynamics of quarks and gluons, limited progress has been made toward understanding the structure of the noise in correlation functions and the physics that it contains.

Strongly interacting quantum systems can be described through path integral representations of correlation functions. In principle, MC evaluation of lattice regularized path integrals can solve QCD as well as many strongly interacting atomic and condensed matter theories. In practice, conceptual obstacles remain and large nuclei and nuclear matter are presently inaccessible to LQCD. In the grand canonical formulation, LQCD calculations with non-zero chemical potential face a sign problem where MC sampling weights are complex and cannot be interpreted as probabilities. In the canonical formulation, calculations with non-zero baryon number face a Signal-to-Noise (StN) problem where statistical uncertainties in MC results grow exponentially at large times. Like the sign problem, the StN problem arises when the sign of a correlation function can fluctuate, at which point cancellations allow for a mean correlation function of much smaller magnitude than a typical MC contribution.

The nucleon provides a relatively simple and well-studied example of a complex correlation function with a StN problem. The zero-momentum Euclidean nucleon correlation function is guaranteed to be real by the existence of a Hermitian, bounded transfer matrix and the spectral representation

(1)

where denotes an individual nucleon correlation function calculated from quark propagators in the presence of the -th member of a statistical ensemble of gauge field configurations, denotes an average over gauge field ensembles in and an average over quark and gluon fields in the middle term, and are nucleon creation and annihilation interpolating operators, and represent the overlap of these interpolating operators onto the -th QCD eigenstates with quantum numbers of the nucleon, is the energy of the corresponding eigenstate, is Euclidean time, is the nucleon mass, and denotes proportionality in the limit . A phase convention for creation and annihilation operators is assumed so that is real for all correlation functions in a statistical ensemble. At small times is approximately real, but at large times it must be treated as a complex quantity. The equilibrium probability distribution for can be formally defined as

(2)

where is a gauge field, is the nucleon correlation function in the presence of a background gauge field , is the Haar measure for the gauge group, and is the gauge action arising after all dynamical matter fields have been integrated out. For convenient comparison with LQCD results, a lattice regulator with a lattice spacing equal to unity will be assumed throughout. Unless specified, results will not depend on details of the ultraviolet regularization of .

MC integration of the path integral representation of a partition function, as performed in LQCD calculations, provides a statistical ensemble of background quantum fields. Calculation of in an ensemble of QCD-vacuum-distributed gauge fields provides a statistical ensemble of correlation functions distributed according to . Understanding the statistical properties of this ensemble is essential for efficient MC calculations, and significant progress has been achieved in this direction since the early days of lattice field theory. Following Parisi Parisi (1984), Lepage Lepage (1989) argued that has a StN problem where the noise, or square root of the variance of , becomes exponentially larger than the signal, or average of , at large times. It is helpful to review the pertinent details of Parisi-Lepage scaling of the StN ratio.

Higher moments of are themselves field theory correlation functions with well-defined spectral representations. 2 Their large-time behavior is a single decaying exponential whose scale is set by the lowest energy state with appropriate quantum numbers. Assuming that matter fields have been integrated out exactly rather than stochastically, will contain three valence quarks and three valence antiquarks whose net quark numbers are separately conserved. This does not imply that will only contain nucleon-antinucleon states, as nothing prevents these distinct valence quarks and antiquarks from forming lower energy configurations such as three pions. Quadratic moments of the correlation function, therefore, have the asymptotic behavior

(3)

At large times, the nucleon StN ratio is determined by the slowest-decaying moments, taking the form,

(4)

and is therefore exponentially small. 3 The quantitative behavior of the variance of baryon correlation functions in LQCD calculations was investigated in high-statistics studies by the NPLQCD collaboration Beane et al. (2009a, 2011, 2015) and more recently by Detmold and Endres Detmold and Endres (2015, 2014), and was found to be roughly consistent with Parisi-Lepage scaling. One of us Savage (2010) extended Parisi-Lepage scaling to higher moments of the correlation function and showed that all odd moments are exponentially suppressed compared to even moments at large times; see Refs. Grabowska et al. (2013); Beane et al. (2015) for further discussion. Nucleon correlation function distributions are increasingly broad and symmetric with exponentially small StN ratios at large times, as seen, for example, in histograms of the real parts of LQCD correlation functions in Ref. Beane et al. (2015).

Beyond moments, the general form of correlation function distributions has also been investigated. Endres, Kaplan, Lee, and Nicholson Endres et al. (2011a) found that correlation functions in the nonrelativistic quantum field theory describing unitary fermions possess approximately log-normal distributions. They presented general arguments, which are discussed below, suggesting that this behavior might be a generic feature of quantum field theories. Knowledge of the approximate form of the correlation function distribution was exploited to construct an improved estimator, the cumulant expansion, that was productively applied to subsequent studies of unitary fermions Endres et al. (2011b, c); Lee et al. (2011); Endres et al. (2013). Correlation function distributions have been studied analytically in the Nambu-Jona-Lasinio model Grabowska et al. (2013); Nicholson et al. (2012), where it was found that real correlation functions were approximately log-normal but complex correlation functions in a physically equivalent formulation of the theory were broad and symmetric at large times with qualitative similarities to the QCD nucleon distribution. DeGrand DeGrand (2012) observed that meson, baryon, and gauge-field correlation functions in gauge theories with several choices of are also approximately log-normal at small times where imaginary parts of correlation functions can be neglected. These various observations provide strong empirical evidence that the distributions of real correlation functions in generic quantum field theories are approximately log-normal.

A generalization of the log-normal distribution for complex random variables that approximately describes the QCD nucleon correlation function at large times is presented in this work. To study the logarithm of a complex correlation function, it is useful to introduce the magnitude-phase decomposition

$C_i(t) = e^{R_i(t) + i\theta_i(t)} \,,$   (5)

where R_i(t) = ln|C_i(t)| is the log-magnitude and θ_i(t) = arg C_i(t) is the complex phase of the i-th correlation function in the ensemble.

At small times where the imaginary part of C_i(t) is negligible, previous observations of log-normal correlation functions DeGrand (2012) demonstrate that R_i(t) is approximately normally distributed. It is shown below that R_i(t) is approximately normal at all times, and that θ_i(t) is approximately normal at small times. Statistical analysis of θ_i(t) is complicated by the fact that it is defined modulo 2π. In particular, the sample mean of a phase defined on (−π, π] does not necessarily provide a faithful description of the intuitive average phase (consider a symmetric distribution peaked around ±π with a sample mean close to zero). Suitable statistical tools for analyzing θ_i(t) are found in the theory of circular statistics and, as will be seen below, θ_i(t) is described by an approximately wrapped normal distribution. 4 This work is based on a high-statistics analysis of 500,000 nucleon correlation functions generated with LQCD on a single ensemble of gauge-field configurations by the NPLQCD collaboration Orginos et al. (2015). This ensemble has a pion mass of , physical strange quark mass, lattice spacing fm, and spacetime volume . The Lüscher-Weisz gauge action Luscher and Weisz (1985) and clover-improved Wilson quark actions Sheikholeslami and Wohlert (1985) were used to generate this ensemble, details of which can be found in Ref. Orginos et al. (2015). Exploratory data analysis of this high-statistics ensemble plays a central role below.

Sec. II discusses standard statistical analysis methods in LQCD that introduce concepts used below. In Section III, the magnitude-phase decomposition of the nucleon correlation function and connections to the StN problem are discussed. Section III.1 describes the distributions of the log-magnitude and its time derivative in more detail, while Section III.2 describes the distribution of the complex phase and its time derivative and explains how their features lead to systematic bias in standard estimators during a large-time region that is dominated by noise. Section IV draws on these observations to propose an estimator for the nucleon mass in which accurate results can be extracted from the large-time noise region with a precision that is constant in source-sink separation time but exponentially degrading in an independent time parameter . Section V conjectures about applications to the spectra of generic complex correlation functions and concludes.

II Relevant Aspects of Standard Analysis Methods of Correlation Functions

Typically, in calculations of meson and baryon masses and their interactions, correlation functions are generated from combinations of quark- and gluon-level sources and sinks with the appropriate hadron-level quantum numbers. Linear combinations of these correlation functions are formed, either using Variational-Method type techniques Luscher and Wolff (1990), the Matrix-Prony technique Beane et al. (2009a), or other less automated methods, in order to optimize overlap onto the lowest lying states in the spectrum and establish extended plateaus in relevant effective mass plots (EMPs). In the limit of an infinite number of independent measurements, the expectation value of the correlation function is a real number at all times, and the imaginary part can be discarded as it is known to average to zero. The large-time behavior of such correlation functions becomes a single exponential (for an infinite time-direction) with an argument determined by the ground-state energy associated with the particular quantum numbers, or more generally the energy of the lowest-lying state with non-negligible overlap.

The structure of the source and sink plays a crucial role in determining the utility of sets of correlation functions. For many observables of interest, it is desirable to optimize the overlap onto the ground state of the system, and to minimize the overlap onto the correlation function dictating the variance of the ground state. In the case of the single nucleon, the sources and sinks, , are tuned in an effort to have maximal overlap onto the ground-state nucleon, while minimizing overlap of onto the three-pion ground state Detmold and Endres (2015). NPLQCD uses momentum-projected hadronic blocks Beane et al. (2006) generated from quark propagators originating from localized smeared sources to suppress the overlap onto the three-pion ground state by a factor of where is the lattice volume, see, e.g., Ref. Beane et al. (2009a). For such constructions, the variance of the average scales as at large times, where is the number of statistically independent correlation functions, while the nucleon correlation function scales as . For this setup, the StN ratio scales as , from which it is clear that exponentially large numbers of correlation functions or volumes are required to overcome the StN problem at large times. The situation is quite different at small and intermediate times in which the variance correlation function is dominated, not by the three-pion ground state, but by the “connected” nucleon-antinucleon excited state, which provides a variance contribution that scales as .

This time interval where the nucleon correlation function is in its ground state and the variance correlation function is in a nucleon-antinucleon excited state has been called the “golden window” Beane et al. (2009a) (GW). The variance in the GW is generated, in part, by the distribution of overlaps of the source and sink onto the ground state, which differs at each lattice site due to variations in the gluon fields. In the work of NPLQCD, correlation functions arising from Gaussian-smeared quark-propagator sources and point-like or Gaussian-smeared sinks have been used to form single-baryon hadronic blocks. Linear combinations of these blocks are combined with coefficients (determined using the Matrix-Prony technique of Ref. Beane et al. (2009a) or simply by minimizing the χ²/dof in fitting a constant to an extended plateau region) that extend the single-baryon plateau region to earlier times, eliminating the contribution from the first excited state of the baryon and providing access to smaller time-slices of the correlation functions where StN degradation is less severe. High-statistics analyses of these optimized correlation functions have shown that GW results are exponentially more precise and have a StN ratio that degrades exponentially more slowly than larger-time results Beane et al. (2009a, b, 2010) (for a review, see Ref. Beane et al. (2011)). In particular, StN growth in the GW has been shown to be consistent with an energy scale close to zero, as is expected from a variance correlation function dominated by baryon, as opposed to meson, states. Despite the ongoing successes of GW analyses of few-baryon correlation functions, the GW shrinks with increasing baryon number Beane et al. (2009a, b, 2010) and calculations of larger nuclei may require different analysis strategies suitable for correlation functions without a GW.

Figure 1: The EMP associated with the -baryon correlation function with (left panel) and the energy scale associated with the standard deviation of the ground state energy (right panel). This correlation function is a tuned linear combination of those resulting from localized smeared and point sinks and from a localized smeared source at a pion mass of calculated from 96 sources per configuration on 3538 statistically independent isotropic clover gauge-field configurations Orginos et al. (2015). They have been blocked together to form 100 independent samplings of the combined correlation function. The red dashed line in the right panel corresponds to the lowest energy contributing to the StN ratio that is expected to dominate at large times.

EMPs, such as that associated with the -baryon shown in Fig. 1, are formed from ratios of correlation functions, which become constant when only a single exponential is contributing to the correlation function,

$M(t) = \frac{1}{\Delta t}\,\ln\!\left[\frac{\langle C(t)\rangle}{\langle C(t+\Delta t)\rangle}\right] \;\to\; E_0 \,,$   (6)

where E_0 is the ground state energy in the channel with appropriate quantum numbers. The average over gauge field configurations is typically over correlation functions derived from multiple source points on multiple gauge-field configurations. This is well-known technology and is a “workhorse” in the analysis of LQCD calculations. Typically, Δt corresponds to one temporal lattice spacing, and the jackknife and bootstrap resampling techniques are used to generate covariance matrices in the plateau interval used to extract the ground-state energy from a correlated χ²-minimization DeGrand and Toussaint (1990); Beane et al. (2011, 2015). The energy can be extracted from an exponential fit to the correlation function or by a direct fit to the effective mass itself. Because correlation functions generated from the same, and nearby, gauge-field configurations are correlated, they are typically blocked to form one average correlation function per configuration, and blocked further over multiple configurations, to create a smaller ensemble containing (approximately) statistically independent samplings of the correlation function.
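A minimal Python sketch of this workflow is shown below: blocking correlation functions, bootstrapping the effective mass, and performing a correlated constant fit to a plateau. The synthetic single-exponential data, function names, and fit range are illustrative assumptions rather than any part of the NPLQCD analysis code.

```python
import numpy as np

rng = np.random.default_rng(0)

def block(corrs, block_size):
    """Average consecutive correlation functions to reduce autocorrelations."""
    n_blocks = corrs.shape[0] // block_size
    return corrs[:n_blocks * block_size].reshape(n_blocks, block_size, -1).mean(axis=1)

def effective_mass(mean_corr, dt=1):
    """Effective mass ln[C(t)/C(t+dt)]/dt from an ensemble-averaged correlator."""
    return np.log(mean_corr[:-dt] / mean_corr[dt:]) / dt

def bootstrap_effective_mass(corrs, n_boot=200, dt=1):
    """Bootstrap mean and covariance matrix of the effective mass."""
    n_cfg = corrs.shape[0]
    boots = np.empty((n_boot, corrs.shape[1] - dt))
    for b in range(n_boot):
        resampled = corrs[rng.integers(0, n_cfg, n_cfg)]
        boots[b] = effective_mass(resampled.mean(axis=0), dt)
    return boots.mean(axis=0), np.cov(boots, rowvar=False)

# toy real-valued correlators: single exponential with multiplicative noise
n_cfg, n_t, mass = 1000, 48, 0.7
t = np.arange(n_t)
corrs = np.exp(-mass * t) * (1.0 + 0.1 * rng.standard_normal((n_cfg, n_t)))

m_eff, cov = bootstrap_effective_mass(block(corrs, block_size=10))

# correlated constant fit to an assumed plateau region [lo, hi)
lo, hi = 10, 20
w = np.linalg.inv(cov[lo:hi, lo:hi])          # inverse covariance matrix
m_fit = (w @ m_eff[lo:hi]).sum() / w.sum()    # minimizes the correlated chi^2
print(f"fitted ground-state mass ~ {m_fit:.3f}")
```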

It is known that baryon correlation functions contain strong correlations over time scales, and that these correlations are sensitive to the presence of outliers. Fig. 2 shows the distribution of the real part of small-time nucleon correlation functions, which resembles a heavy-tailed log-normal distribution DeGrand (2012). Log-normal distributions are associated with a larger number of “outliers” than arise when sampling a Gaussian distribution, and the sample mean of these small-time correlation functions will be strongly affected by the presence of these outliers. The distribution of baryon correlation functions at very large source-sink separations is also heavy-tailed; David Kaplan has analyzed the real parts of NPLQCD baryon correlation functions and found that they resemble a stable distribution Kaplan (2014). Cancellations between positive and negative outliers occur in determinations of the sample mean of this large-time distribution, leading to different statistical issues that are explored in detail in Sec. III.

Figure 2: The distribution of the real part of nucleon correlation functions at time slices (left panel), (middle panel) and (right panel).

To analyze temporal correlations in baryon correlation functions in more detail, results for inverse covariance matrices generated through bootstrap resampling of the baryon effective mass are shown in Fig. 3. The size of off-diagonal elements in the inverse covariance matrix directly sets the size of contributions to the least-squares fit result from temporal correlations in the effective mass, and so it is appropriate to use their magnitude to describe the strength of temporal correlations. The inverse covariance matrix is seen to possess large off-diagonal elements associated with small time separations that appear to decrease exponentially with increasing time separation at a rate somewhat faster than . Mild variation in the inverse covariance matrix is seen when is varied. Since correlations between and are seen in Fig. 3 to decrease rapidly as becomes large compared to hadronic correlation lengths, it is expected that small distance correlations in the covariance matrix decrease when and are separated by and Fig. 2, though such an effect is not clearly visible in the inverse covariance matrix on the logarithmic scale shown.

Figure 3: The logarithm of the inverse covariance matrix determined using bootstrap resampling of the sample mean. Lines with and are shown to demonstrate expected hadronic correlation lengths. The correlation function is the same as that described in the caption of Fig. 1. The normalization of the color scale is identical for all .

The role of outliers in temporal correlations on timescales is highlighted in Fig. 4, where inverse covariance matrices determined with the Hodges-Lehmann estimator are shown. The utility of robust estimators, such as the median and the Hodges-Lehmann estimator, with reduced sensitivity to outliers, has been explored in Ref. Beane et al. (2015). When the median and average of a function are known to coincide, there are advantages to using the median or Hodges-Lehmann estimator to determine the average of a distribution. The associated uncertainty can be estimated with the “median absolute deviation” (MAD), and be related to the standard deviation with a well-known scaling factor. Off-diagonal elements in the inverse covariance matrix associated with timescales are visibly smaller on a logarithmic scale when the covariance matrix is determined with the Hodges-Lehmann estimator instead of the sample mean. This decrease in small-time correlations when a robust estimator is employed strongly suggests that short-time correlations on scales are associated with outliers.
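The robust estimators referred to here are straightforward to implement. A short sketch, using illustrative toy data rather than the LQCD correlation functions, is:

```python
import numpy as np
from itertools import combinations

def hodges_lehmann(x):
    """Median of all pairwise averages (and the points themselves);
    a location estimator with reduced sensitivity to outliers."""
    x = np.asarray(x, dtype=float)
    walsh = [(a + b) / 2.0 for a, b in combinations(x, 2)]
    return np.median(np.concatenate([x, walsh]))

def mad_sigma(x):
    """Median absolute deviation rescaled by 1.4826 so that it estimates the
    standard deviation of a normal distribution."""
    x = np.asarray(x, dtype=float)
    return 1.4826 * np.median(np.abs(x - np.median(x)))

rng = np.random.default_rng(1)
data = rng.normal(0.0, 1.0, 500)
data[:5] += 50.0                                  # inject a few large outliers
print(hodges_lehmann(data), np.mean(data))        # robust vs. outlier-dragged mean
print(mad_sigma(data), np.std(data))              # robust vs. inflated width
```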

Figure 4: The logarithm of the inverse of the baryon effective mass covariance matrix for determined using bootstrap resampling of the Hodges-Lehmann estimator. Lines with and are shown to demonstrate expected hadronic correlation lengths. The normalization of the color scale is identical for all and further is identical to the normalization of Fig. 3.

III A Magnitude-Phase Decomposition

In terms of the log-magnitude and phase, the mean nucleon correlation function is

(7)

In principle, could be included in the MC probability distribution used for importance sampling. With this approach, would contribute as an additional term in a new effective action. The presence of non-zero demonstrates that this effective action would have an imaginary part. The resulting weight therefore could not be interpreted as a probability and importance sampling could not proceed; importance sampling of faces a sign problem. In either the canonical or grand canonical approach, one-baryon correlation functions are described by complex correlation functions that cannot be directly importance sampled without a sign problem, but it is formally permissible to importance sample according to the vacuum probability distribution, calculate the phase resulting from the imaginary effective action on each background field configuration produced in this way, and average the results on an ensemble of background fields. This approach, known as reweighting, has a long history in grand canonical ensemble calculations but has been generically unsuccessful because statistical averaging is impeded by large fluctuations in the complex phase that grow exponentially with increasing spacetime volume Gibbs (1986); Splittorff and Verbaarschot (2007a, b). Canonical ensemble nucleon calculations averaging over background fields importance sampled with respect to the vacuum probability distribution are in effect solving the sign problem associated with non-zero by reweighting. As emphasized by Ref. Grabowska et al. (2013), similar chiral physics is responsible for the exponentially hard StN problem appearing in canonical calculations and exponentially large fluctuations of the complex phase in grand canonical calculations.

Reweighting a pure phase causing a sign problem generically produces a StN problem in theories with a mass gap. Suppose for some . Then because by construction, has the StN ratio

(8)

which is necessarily exponentially small at large times. Non-zero guarantees that statistical sampling of has a StN problem. Strictly, this argument applies to a pure phase but not to a generic complex observable such as which might receive zero effective mass contribution from and could have important correlations between and . MC LQCD studies are needed to understand whether the pure phase StN problem of Eq. (8) captures some or all of the nucleon StN problem of Eq. (4).

To determine the large-time behavior of correlation functions, it is useful to consider the effective-mass estimator commonly used in LQCD spectroscopy, a special case of Eq. (6),

$M(t) \equiv \ln\!\left[\frac{\langle C(t)\rangle}{\langle C(t+1)\rangle}\right] \,.$   (9)

As , the average correlation function can be described by a single exponential whose decay rate is set by the ground state energy, and therefore . The uncertainties associated with can be estimated by resampling methods such as bootstrap. The variance of is generically smaller than that of due to cancellations arising from correlations between and across bootstrap ensembles. Assuming that these correlations do not affect the asymptotic scaling of the variance of , propagation of uncertainties for bootstrap estimates of the variance of shows that the variance of scales as

(10)

An analogous effective-mass estimator for the large-time exponential decay of the magnitude is

$M_R(t) \equiv \ln\!\left[\frac{\langle e^{R(t)}\rangle}{\langle e^{R(t+1)}\rangle}\right] \,,$   (11)

and an effective-mass estimator for the phase is

$M_\theta(t) \equiv \ln\!\left[\frac{\langle \cos\theta(t)\rangle}{\langle \cos\theta(t+1)\rangle}\right] \,,$   (12)

where the reality of the average correlation function has been used.
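To make the three estimators concrete, a Python sketch that evaluates them on an ensemble of complex correlation functions is given below; the synthetic data and function names are illustrative assumptions, not the analysis used for the results reported here.

```python
import numpy as np

def magnitude_phase_effective_masses(corrs):
    """Effective masses of the full correlator, its magnitude, and its phase,
    using the decomposition C_i(t) = exp[R_i(t) + i theta_i(t)] of Eq. (5).
    corrs: complex array of shape (n_measurements, n_t)."""
    R = np.log(np.abs(corrs))                 # log-magnitude R_i(t)
    theta = np.angle(corrs)                   # complex phase theta_i(t)
    c_mean = corrs.mean(axis=0)
    mag_mean = np.exp(R).mean(axis=0)
    cos_mean = np.cos(theta).mean(axis=0)     # reality of <C(t)> used, cf. Eq. (12)
    m_full = np.real(np.log(c_mean[:-1] / c_mean[1:]))    # cf. Eq. (9)
    m_mag = np.log(mag_mean[:-1] / mag_mean[1:])           # cf. Eq. (11)
    m_phase = np.log(cos_mean[:-1] / cos_mean[1:])         # cf. Eq. (12)
    return m_full, m_mag, m_phase

# toy ensemble: approximately log-normal magnitude and a slowly broadening phase
rng = np.random.default_rng(2)
n_meas, n_t = 5000, 32
t = np.arange(n_t)
R = -1.2 * t + 0.3 * np.sqrt(t + 1.0) * rng.standard_normal((n_meas, n_t))
theta = 0.15 * np.sqrt(t + 1.0) * rng.standard_normal((n_meas, n_t))
m_full, m_mag, m_phase = magnitude_phase_effective_masses(np.exp(R + 1j * theta))
```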

Figure 5: The left panel shows the nucleon effective mass as a function of Euclidean time in lattice units. The middle and right panels show the effective masses and of the magnitude and phase respectively. The asymptotic values of and are close to and respectively, whose values are indicated for comparison with horizontal red lines. The uncertainties are calculated using bootstrap methods. Past the imaginary parts of and are not negligible compared to the real part. Here and below we display the real part of the complex log in Eq. (9)-(12); taking the real part of the average correlation functions before taking the log or some other prescription would modify the results after in the left and right panels. All definitions are equivalent in the infinite statistics limit where is real.
Figure 6: Variances of the effective mass estimates shown in Fig. 5. The blue points common to all panels show the variance of . The red line in the left plot shows a fit to variance growth, where the normalization has been fixed to reproduce the observed variance at . The orange points in the middle panel show the variance associated with . The green points in the right panel show the variance associated with .

Figure 5 shows EMPs for , , and calculated from the LQCD ensemble described previously. The mass of the nucleon, determined from a constant fit in the shaded plateau region indicated in Fig. 5, is found to be , in agreement with the mass obtained from the golden window in previous studies Orginos et al. (2015) of . and do not visually plateau until much larger times. For the magnitude, a constant fit in the shaded region gives an effective mass which is close to the value  Orginos et al. (2015) indicated by the red line. For the phase, a constant fit to the shaded region gives an effective mass , which is consistent with the value  Orginos et al. (2015) indicated by the red line. It is unlikely that the phase has reached its asymptotic value by this time, but a signal cannot be established at larger times. For , large fluctuations lead to complex effective mass estimates for and and unreliable estimates and uncertainties. agrees with up to corrections at all times, demonstrating that the magnitude and cosine of the complex phase are approximately uncorrelated at the few percent level. This suggests the asymptotic scaling of the nucleon correlation function can be approximately decomposed as

(13)

For small times , the means and variances of and agree up to a small contribution from . This indicates that the real part of the correlation function is nearly equal to its magnitude at small times. At intermediate times , the contribution of grows relative to , and for the variance of the full effective mass is nearly saturated by the variance of , as shown in Fig. 6. At intermediate times a linear fit normalized to with slope provides an excellent fit to bootstrap estimates of , in agreement with the scaling of Eq. (10). is indistinguishable from in this region, and has an identical StN problem. has much milder time variation, and can be reliably estimated at all times with almost no StN problem. At intermediate times, the presence of non-zero signaling a sign problem in importance sampling of appears responsible for the entire nucleon StN problem.

approaches its asymptotic value much sooner than or . This indicates that the overlap of onto the three-pion ground state in the variance correlation function is greatly suppressed compared to the overlap of onto the one-nucleon signal ground state. Optimization of the interpolating operators for high signal overlap contributes to this. Another contribution arises from momentum projection, which suppresses the variance overlap factor by  Beane et al. (2009b). A large hierarchy between the signal and noise overlap factors provides a GW visible at intermediate times . In the GW, approaches its asymptotic value but begins to grow exponentially and is suppressed compared to . Reliable extractions of are possible in the GW.

Figure 7: EMPs from an ensemble of 500 blocked correlation functions, each of which is equal to the sample mean of 1000 nucleon correlation functions. The left panel shows the effective mass of the blocked correlation functions. The middle panel shows the magnitude contribution and, for reference, a red line at and a blue line at are shown. The right panel shows the phase mass of the blocked correlation functions along with a red line at .
Figure 8: Bootstrap estimates of the variance of the effective mass using blocked correlation functions. The left panel shows the variance of for blocked data in blue and the almost indistinguishable variance of for unblocked data in gray. The middle panel shows the variance of blocked estimates of in orange and the right panel shows the variance of blocked estimates of in green.

The effects of blocking, that is averaging subsets of correlation functions and analyzing the distribution of the averages, are shown in Fig. 7. is suppressed compared to for larger times in the blocked ensemble, and the log-magnitude saturates the average and variance of through intermediate times . Blocking does not actually reduce the total variance of . Variance in is merely shifted from the phase to the log-magnitude at intermediate times. This is reasonable, since the imaginary part of vanishes on average and so blocked correlation functions will have smaller imaginary parts. Still, blocking does not affect and only affects bootstrap estimates of at the level of correlations between correlation functions in the ensemble. Blocking also does not delay the onset of a large-time noise region where and cannot be reliably estimated.

Eventually the scaling of begins to deviate from Eq. (10), and in the noise region the observed variance remains approximately constant (up to large fluctuations). This is inconsistent with Parisi-Lepage scaling. While the onset of the noise region is close to the mid-point of the time direction , a qualitatively similar onset occurs at earlier times in smaller statistical ensembles. Standard statistical estimators therefore do not reproduce the scaling required by basic principles of quantum field theory in the noise region. This suggests systematic deficiencies leading to unreliable results for standard statistical estimation of correlation functions in the noise region. The emergence of a noise region where standard statistical tools are unreliable can be understood in terms of the circular statistics describing and is explained in Sec. III.2. A more straightforward analysis of the distribution of is first presented below.

III.1 The Magnitude

Histograms of the nucleon log-magnitude are shown in Fig. 9. Particularly at large times, the distribution of is approximately described by a normal distribution. Fits to a normal distribution are qualitatively good but not exact, and deviations between normal distribution fits and results are visible in Fig. 9.

Figure 9: Normalized histograms of derived from the LQCD results. The blue curves correspond to best fit normal distributions determined from the sample mean and variance, while the purple curves correspond to maximum likelihood fits to generic stable distributions. See the main text for more details.

Cumulants of R(t) can be used to quantify these deviations; they can be calculated recursively from the moments $m_n = \langle R(t)^n \rangle$ by

$\kappa_n = m_n - \sum_{k=1}^{n-1} \binom{n-1}{k-1}\, \kappa_k\, m_{n-k} \,.$   (14)

The first four cumulants of a probability distribution characterize its mean, variance, skewness, and kurtosis respectively. If were exactly log-normal, the first and second cumulants of , its mean and variance, would fully describe the distribution. Third and higher cumulants of would all vanish for exactly log-normal . Fig. 10 shows the first four cumulants of . Estimates of higher cumulants of become successively noisier.
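A direct implementation of the recursion in Eq. (14), applied here to synthetic normally distributed data for illustration, is:

```python
import numpy as np
from math import comb

def cumulants_from_moments(samples, n_max=4):
    """First n_max cumulants computed recursively from raw sample moments,
    kappa_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) kappa_k m_{n-k}  (Eq. (14))."""
    x = np.asarray(samples, dtype=float)
    m = [np.mean(x**n) for n in range(n_max + 1)]   # raw moments m_0 ... m_{n_max}
    kappa = [0.0]                                   # dummy entry so kappa[n] is the n-th cumulant
    for n in range(1, n_max + 1):
        kappa.append(m[n] - sum(comb(n - 1, k - 1) * kappa[k] * m[n - k]
                                for k in range(1, n)))
    return kappa[1:]

rng = np.random.default_rng(3)
print(cumulants_from_moments(rng.normal(1.0, 0.5, 100_000)))
# a normal distribution gives approximately [mean, variance, 0, 0]
```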

Figure 10: The first four cumulants of as functions of . Cumulants are calculated from sample moments using Eq. (14) and the associated uncertainties are estimated by bootstrap methods. From left to right, the panels show the cumulants (mean), (variance), (characterizing skewness) and (characterizing kurtosis).

The cumulant expansion of Ref. Endres et al. (2011a) relates the effective mass of a correlation function to the cumulants of the logarithm of the correlation function. The derivation of Ref. Endres et al. (2011a) is directly applicable to R(t). The characteristic function of R(t), defined as the Fourier transform of its probability distribution function, can be described by a Taylor series for its logarithm whose coefficients are precisely the cumulants of R(t),

$\ln \Phi_R(k) = \sum_{n=1}^{\infty} \kappa_n(t)\, \frac{(ik)^n}{n!} \,.$   (15)

The average magnitude of the correlation function is given in terms of this characteristic function by

$\left\langle e^{R(t)} \right\rangle = \Phi_R(-i) = \exp\!\left[ \sum_{n=1}^{\infty} \frac{\kappa_n(t)}{n!} \right] \,.$   (16)

This allows application of the cumulant expansion in Ref. Endres et al. (2011a) to the effective mass in Eq. (11) to give,

$M_R(t) = \ln\!\left[\frac{\langle e^{R(t)}\rangle}{\langle e^{R(t+1)}\rangle}\right] = \sum_{n=1}^{\infty} \frac{\kappa_n(t) - \kappa_n(t+1)}{n!} \,.$   (17)

Since κ_n with n ≥ 3 vanishes for a normally distributed R(t), the cumulant expansion provides a rapidly convergent series for correlation functions that are close to, but not exactly, log-normally distributed. Note that the right-hand side of Eq. (17) is simply a discrete approximation, suitable for a lattice-regularized theory, of the time derivative of the cumulants.
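The truncated expansion of Eq. (17) can be evaluated term by term from an ensemble of log-magnitudes. A sketch using scipy's unbiased cumulant estimators (k-statistics) and synthetic data, intended only as an illustration of the technique, is:

```python
import numpy as np
from math import factorial
from scipy import stats

def cumulant_expansion_terms(R, n_max=4):
    """Term-by-term contributions to the magnitude effective mass of Eq. (17):
    column n-1 holds [kappa_n(t) - kappa_n(t+1)] / n! for each time slice t.
    R: array of log-magnitudes with shape (n_measurements, n_t)."""
    kappas = np.array([[stats.kstat(R[:, t], n) for n in range(1, n_max + 1)]
                       for t in range(R.shape[1])])             # shape (n_t, n_max)
    facts = np.array([factorial(n) for n in range(1, n_max + 1)])
    return (kappas[:-1] - kappas[1:]) / facts                    # shape (n_t - 1, n_max)

# toy log-magnitudes with a linearly falling mean and slowly growing variance
rng = np.random.default_rng(4)
t = np.arange(24)
R = -0.8 * t + 0.05 * t * rng.standard_normal((2000, 24))
terms = cumulant_expansion_terms(R)
print(terms.sum(axis=1)[:5])   # truncated estimate of the magnitude effective mass
```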

Figure 11: Contributions to from the first four terms in the cumulant expansion of Ref. Endres et al. (2011a) given in Eq. (17). In the leftmost panel, the gray points correspond to the unapproximated estimate for (that are also shown in Fig. 5), while the orange points show the contribution from the mean . The other panels show the contributions to Eq. (17) associated with the higher cumulants , , and , respectively.

Results for the effective mass contributions of the first few terms in the cumulant expansion of Eq. (17) are shown in Fig. 11. The contribution , representing the time derivative of the mean, provides an excellent approximation to after small times. provides a very small negative contribution to , and contributions from and are statistically consistent with zero. As approaches its asymptotic value, the log-magnitude distribution can be described to high-accuracy by a nearly normal distribution with very slowly increasing variance and small, approximately constant . The slow increase of the variance of is consistent with observations above that has no severe StN problem. It is also consistent with expectations that describes a (partially-quenched) three-pion correlation function with a very mild StN problem, with a scale set by the attractive isoscalar pion interaction energy.

As Eq. (17) relates to time derivatives of moments of , it is interesting to consider the distribution of the time derivative . Defining generic finite differences,

(18)

the time derivative of lattice regularized results can be defined as the finite difference,

(19)

If and were statistically independent, it would be straightforward to extract the time derivatives of the moments of from the moments of . The presence of correlations in time, arising from non-trivial QCD dynamics, obstructs a naive extraction of from moments of . For instance, without knowledge of it is impossible to extract the time derivative of the variance of from the variance of . While the time derivative of the mean of is simply the mean of , time derivatives of the higher cumulants of cannot be extracted from the cumulants of without knowledge of dynamical correlations.

Figure 12: The first four cumulants of , determined analogously to the cumulants in Fig. 10.

The cumulants of are shown in Fig. 12. As expected, the mean of approaches at large times. The variance of is tending to a plateau which is approximately one-third of the variance of . This implies there are correlations between and that are of the same order as the individual variances of and . This is not surprising, given that the QCD correlation length is larger than the lattice spacing. No statistically significant is seen for at large times, but a statistically significant positive is found. Normal distribution fits to are found to be poor, as shown in Fig. 13, as they underestimate both the peak probability and the probability of finding “outliers” in the tails of the distribution. Interestingly, Fig. 12, and histograms of shown in Fig. 13, suggest that the distribution of becomes approximately time-independent at large times.

Figure 13: Histograms of , defined as the finite difference given in Eq. (18). The blue curves in each panel correspond to the best-fit normal distribution, while the purple curves correspond to the best-fit stable distribution.

Stable distributions are found to provide a much better description of , and are consistent with the heuristic arguments for log-normal correlation functions given in Ref. Endres et al. (2011a). Generic correlation functions can be viewed as products of creation and annihilation operators with many transfer matrix factors describing Euclidean time evolution. It is difficult to understand the distribution of products of transfer matrices in quantum field theories, but following Ref. Endres et al. (2011a) insight can be gained by considering products of random positive numbers. As a further simplification, one can consider a product of independent, identically distributed positive numbers, each schematically representing a product of many transfer matrices describing time evolution over a period much larger than all temporal correlation lengths. Application of the central limit theorem to the logarithm of a product of many independent, identically distributed random numbers shows that the logarithm of the product tends to become normally distributed as the number of factors becomes large. The central limit theorem in particular assumes that the random variables in question originate from distributions that have a finite variance. A generalized central limit theorem proves that sums of heavy-tailed random variables tend to become distributed according to stable distributions (that include the normal distribution as a special case), suggesting that stable distributions arise naturally in the logs of products of random variables.
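This argument is easy to illustrate numerically: the log of a product of well-behaved positive factors approaches a normal distribution, while a sum of heavy-tailed variables does not. The distributions used below are illustrative stand-ins for products of transfer-matrix factors, not derived from QCD.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_samples, n_factors = 5000, 200

# log of a product of iid positive factors = sum of iid logs -> approximately normal
factors = rng.uniform(0.5, 1.5, size=(n_samples, n_factors))
log_product = np.log(factors).sum(axis=1)
print(stats.skewtest(log_product))        # typically consistent with zero skewness

# if the summands are heavy-tailed (Cauchy, alpha = 1), the sum remains Cauchy
# rather than approaching a normal distribution (generalized central limit theorem)
cauchy_sum = rng.standard_cauchy((n_samples, n_factors)).sum(axis=1)
print(stats.kurtosis(cauchy_sum))         # enormous, unlike the normal value of 0
```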

Stable distributions are named as such because their shape is stable under averaging of independent copies of a random variable. Formally, stable distributions form a manifold of fixed points in a Wilsonian space of probability distributions where averaging independent random variables from the distribution plays the role of renormalization group evolution. A parameter α, called the index of stability, dictates the shape of a stable distribution and remains fixed under averaging transformations. All probability distributions with finite variance evolve under averaging towards the normal distribution, a special case of the stable distribution with α = 2. Heavy-tailed distributions with ill-defined variance evolve towards generic stable distributions with α < 2. In particular, stable distributions with α < 2 have power-law tails; for a stable random variable x the tails decay as |x|^{−(1+α)}. The heavy-tailed Cauchy, Lévy, and Holtsmark distributions are special cases of stable distributions with α = 1, 1/2, and 3/2, respectively, that arise in physical applications. 6

Stable distributions for a real random variable x are defined via the Fourier transform,

$P(x; \alpha, \beta, \mu, \gamma) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\; e^{-ikx}\, \varphi(k; \alpha, \beta, \mu, \gamma) \,,$   (20)

of their characteristic functions

$\varphi(k; \alpha, \beta, \mu, \gamma) = \exp\!\left[\, i k \mu - |\gamma k|^{\alpha} \left( 1 - i \beta\, \mathrm{sgn}(k) \tan\tfrac{\pi\alpha}{2} \right) \right] \,,$   (21)

where α is the index of stability, β determines the skewness of the distribution, μ is the location of peak probability, and γ sets the width. For α = 1 the above parametrization does not hold, and tan(πα/2) should be replaced by −(2/π) ln|k|. For α > 1 the mean is μ, and for α ≤ 1 the mean is ill-defined. For α = 2 the variance is 2γ², and Eq. (21) implies the distribution is independent of β, while for α < 2 the variance is ill-defined.

Figure 14: Maximum likelihood estimates for stable distribution fits of R(t) in terms of the parameters of Eqs. (20)-(21). α = 2 corresponds to a normal distribution. The associated uncertainties are estimated by bootstrap methods. Changes in β do not affect the likelihood when α = 2, and reliable estimates of β are not obtained at all times.

The distributions of R(t) obtained from the LQCD calculations can be fit to stable distributions through maximum likelihood estimation of the stable parameters, obtaining the results that are shown in Fig. 14. Estimates of α are consistent with α = 2, corresponding to a normal distribution. This is not surprising, because higher moments of R(t) would be ill-defined and diverge in the infinite statistics limit if R(t) were literally described by a heavy-tailed distribution. β is strictly ill-defined when α = 2, but results consistent with β < 0 indicate negative skewness, in agreement with observations above. Estimates of μ and γ are consistent with the cumulant results above if a normal distribution (α = 2) is assumed. Fits of R(t) to generic stable distributions are shown in Fig. 9, and are roughly consistent with fits to a normal distribution, though some skewness is captured by the stable fits.
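A sketch of such a maximum likelihood fit, built from scipy's stable distribution (whose default parametrization is close to Eq. (21)) and synthetic heavy-tailed samples standing in for the measured distributions, is shown below; it is slow because each likelihood evaluation requires a numerical stable density, and the sample size and starting values are illustrative choices.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(6)

# synthetic symmetric heavy-tailed samples (alpha = 1.6) standing in for LQCD data
samples = stats.levy_stable.rvs(1.6, 0.0, loc=0.0, scale=0.1,
                                size=500, random_state=rng)

def neg_log_like(params, data):
    """Negative log-likelihood of a symmetric (beta = 0) stable distribution."""
    alpha, mu, gamma = params
    if not (0.5 < alpha <= 2.0) or gamma <= 0.0:
        return np.inf
    return -np.sum(stats.levy_stable.logpdf(data, alpha, 0.0, loc=mu, scale=gamma))

# maximum likelihood estimates of (alpha, mu, gamma); Nelder-Mead avoids gradients
fit = optimize.minimize(neg_log_like, x0=[1.8, 0.0, 0.2], args=(samples,),
                        method="Nelder-Mead")
alpha_fit, mu_fit, gamma_fit = fit.x
print(f"alpha = {alpha_fit:.2f}, mu = {mu_fit:.3f}, gamma = {gamma_fit:.3f}")
```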

Figure 15: Maximum likelihood estimates for stable distribution fits of similar to Fig. 14. The associated uncertainties are estimated by bootstrap methods.

Stable distribution fits to indicate statistically significant deviations from a normal distribution (), as seen in Fig. 15. The large-time distribution of appears time independent, and fitting in the large-time plateau region gives an estimate of the large-time index of stability. Recalling describes a finite difference over a physical time interval of one lattice spacing, the estimated index of stability is

(22)

Maximum likelihood estimates for μ are consistent with the sample mean, and β is consistent with zero, in agreement with observations of vanishing skewness. Therefore, the distribution of is symmetric, as observed in Fig. 13, with power-law tails scaling as over this time interval of .

Figure 16: Histograms of for selected large-time values of . The top row shows results for , the bottom row shows results for , and Fig. 13 shows the results for . The blue curves represent fits to a normal distribution, while the purple curves represent fits to a stable distribution.

The value of depends on the physical time separation used in the finite difference definition Eq. (18), and stable distribution fits can be performed for generic finite differences . For all , the distribution of becomes time independent at large times. Histograms of the large-time distributions for are shown in Fig. 16, and the best fit large-time values for and are shown in Fig. 17. Since QCD has a finite correlation length, can be described as the difference of approximately normally distributed variables at large . In the large limit, is therefore necessarily almost normally distributed, and correspondingly, , shown in Fig. 17, increases with and begins to approach the normal distribution value for large . A large plateau in is observed that demonstrates small but statistically significant departures from . This deviation is consistent with the appearance of small but statistically significant measures of non-Gaussianity in seen in Fig. 10. Heavy-tailed distributions are found to be needed only to describe the distribution of when is small enough such that and are physically correlated. In some sense, the deviations from normally distributed differences, i.e. , are a measure of the strength of dynamical QCD correlations on the scale .

Figure 17: Maximum likelihood estimates for the index of stability, and width , in the large-time plateau region as a function of . Associated uncertainties are estimated with bootstrap methods.

The heavy-tailed distributions of for dynamically correlated time separations correspond to time evolution that is quite different to that of diffusive Brownian motion describing the quantum mechanical motion of free point particles. Rather than Brownian motion, heavy-tailed jumps in correspond to a superdiffusive random walk or Lévy flight. Power-law, rather than exponentially suppressed, large jumps give Lévy flights a qualitatively different character than diffusive random walks, including fractal self-similarity, as can be seen in Fig. 18.

Figure 18: The two-dimensional motion of test particles with their random motion taken from symmetric Stable Distributions. At each time step, the angle of the outgoing velocity is chosen randomly with respect to the incident velocity while the magnitude of the velocity is chosen from a symmetric Stable Distribution with corresponding to Brownian motion (left panel), and corresponding to a Holtsmark distribution (right panel). In the right panel, the large separations between clusters achieved during one time interval correspond to Lévy flights.
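A sketch of the construction described in the caption of Fig. 18 is given below; the step counts and indices of stability are illustrative choices.

```python
import numpy as np
from scipy import stats

def random_walk_2d(n_steps, alpha, rng):
    """Two-dimensional walk with isotropic random directions and step lengths
    drawn from a symmetric stable distribution: alpha = 2 gives Brownian motion,
    alpha < 2 gives a superdiffusive Levy flight."""
    angles = rng.uniform(0.0, 2.0 * np.pi, n_steps)
    lengths = np.abs(stats.levy_stable.rvs(alpha, 0.0, size=n_steps, random_state=rng))
    steps = lengths[:, None] * np.column_stack([np.cos(angles), np.sin(angles)])
    return np.cumsum(steps, axis=0)     # walker positions after each step

rng = np.random.default_rng(7)
brownian = random_walk_2d(2000, 2.0, rng)      # diffusive
levy_flight = random_walk_2d(2000, 1.5, rng)   # Holtsmark-like steps, rare large jumps
```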

The dynamical features of QCD that give rise to superdiffusive time evolution are presently unknown; however, we conjecture that instantons play a role. Instantons are associated with large, localized fluctuations in gauge fields, and we expect that instantons may also be responsible for infrequent, large fluctuations in hadronic correlation functions generating the tails of the distribution. It would be interesting to understand if can be simply related to observable properties of the nucleon. It is also not possible to say from this single study whether has a well-defined continuum limit for infinitesimal . Further LQCD studies are required to investigate the continuum limit of . Lattice field theory studies of other systems and calculations of in perturbation theory, effective field theory, and models of QCD could provide important insights into the dynamical origin of superdiffusive time evolution. 7

One feature of LQCD results is not well described by a stable distribution. The variance of heavy-tailed distributions is ill-defined, and were truly described by a heavy-tailed distribution then the variance and higher cumulants of would increase without bound as the size of the statistical ensemble is increased. This behavior is not observed. While the distribution of is well-described by a stable distribution near its peak, the extreme tails of the distribution of decay sufficiently quickly that the variance and higher cumulants of shown in Fig. 12 give statistically consistent results as the statistical ensemble size is varied. This suggests that is better described by a truncated stable distribution, a popular model for, for example, financial markets exhibiting high volatility but with a natural cutoff on trading prices, in which some form of sharp cutoff is added to the tails of a stable distribution Voit (2005). Note that the tails of the distribution describe extremely rapid changes in the correlation function and are sensitive to ultraviolet properties of the theory. One possibility is that describes a stable distribution in the presence of a (perhaps smooth) cutoff arising from ultraviolet regulator effects that damps the stable distribution’s power-law decay at very large . Further studies at different lattice spacings will be needed to understand the form of the truncation and whether the truncation scale is indeed set by the lattice scale. It is also possible that there is a strong interaction length scale providing a modification to the distribution at large , and it is further possible that stable distributions only provide an approximate description at all . For now we simply observe that a truncated stable distribution with an unspecified high-scale modification provides a good empirical description of .

Before turning to the complex phase of , we summarize the main findings about the log-magnitude:

  • The log-magnitude of the nucleon correlation function in LQCD is approximately normally distributed with small but statistically significant negative skewness and positive kurtosis.

  • The magnitude effective mass approaches at large times, consistent with expectations from Parisi-Lepage scaling for the nucleon variance . The plateau of marks the start of the golden window where excited state systematics are negligible and statistical uncertainties are increasing slowly. The much larger-time plateau of roughly coincides with the plateau of to and occurs after variance growth of reaches the Parisi-Lepage expectation . Soon after, a noise region begins where the variance of stops increasing and the effective mass cannot be reliably estimated.

  • The log-magnitude does not have a severe StN problem, and can be measured accurately across all 48 timesteps of the present LQCD calculations. The variance of the log-magnitude distribution only increases by a few percent in 20 timesteps after visibly plateauing.

  • The cumulant expansion describes as a sum of the time derivatives of the cumulants of the log of the correlation function. At large times, the time derivative of the mean of is constant and approximately equal to . Contributions to from the variance and higher cumulants of are barely resolved in the sample of correlation functions.

  • Finite differences in , , are described by time independent distributions at large times. For large compared to the QCD correlation length, describes a difference of approximately independent normal random variables and is therefore approximately normally distributed. For small , describes a difference of dynamically correlated variables. The mean of is equal to the time derivative of the mean of and therefore provides a good approximation to . The time derivatives of higher cumulants of cannot be readily extracted from cumulants of without knowledge of dynamical correlations.

  • At large times, is well described by a symmetric, heavy-tailed, truncated stable distribution. The presence of heavy tails in indicates that is not described by free particle Brownian motion but rather by a superdiffusive Lévy flight. Deviations of the index of stability of from a normal distribution quantify the amount of dynamical correlations present in the nucleon system, the physics of which is yet to be understood. Further studies are required to determine the continuum limit value of the index of stability associated with and the dynamical origin and generality of superdiffusive Lévy flights in quantum field theory correlation functions.

III.2 The Phase

The reality of average correlation functions requires that the distribution of θ(t) be symmetric under θ(t) → −θ(t). Cumulants of θ(t) calculated from sample moments in analogy to Eq. (14) are shown in Fig. 19.

Figure 19: The first four cumulants of . In these fits, no special care is given to the fact that is a phase defined on and standard sample moments are used to determine these cumulants in analogy to Eq. (14). Uniform distribution results of variance and fourth cumulant are shown as green lines for reference.

The mean and are noisy but statistically consistent with zero as expected. The variance and are small at small times since every sample of is defined to vanish at , and grow linearly at intermediate times around the golden window. After , this linear growth slows and they become constant at large times, and are consistent with results from a uniform distribution.

Figure 20: Histograms of with fits to wrapped normal distributions using Eq. (27) shown in blue and fits to wrapped stable distributions using maximum likelihood estimation of the parameters of Eq. (37) shown in purple. See the main text for details.

Histograms of θ(t) shown in Fig. 20 qualitatively suggest that θ(t) is described by a narrow, approximately normal distribution at small times and an increasingly broad, approximately uniform distribution at large times. θ(t) is only defined modulo 2π and can be described as a circular variable defined on the interval (−π, π]. The distribution of θ(t) can therefore be described with angular histograms, as shown in Fig. 21. Again, θ(t) resembles a uniform circular random variable at large times.

Figure 21: Angular histograms of . The unit circle is split into a uniform sequence of bins, and the number of samples falling in each bin sets the radial length of a bar at that angle. Colors ranging from orange to blue also denote angle, and are included to indicate the location of the branch cut in .

A cumulant expansion can be readily constructed for . The mean phase is given in terms of the characteristic function and cumulants of by

(23)

and the appropriate cumulant expansion for is therefore, using Eq. (12),

(24)

Factors of dictate that a linearly increasing variance of makes a positive contribution to , in contradistinction to the slight negative contribution to made by linearly increasing variance of . Since the mean of necessarily vanishes, the variance of makes the dominant contribution to Eq. (24) for approximately normally distributed . For this contribution to be positive, the variance of must increase, indicating that has a StN problem. For the case of approximately normally distributed , non-zero requires a StN problem for the phase.

Contributions to Eq. (24) from the first four cumulants of are shown in Fig. 22. Contributions from odd cumulants are consistent with zero, as expected by symmetry. The variance provides the dominant contribution to at small and intermediate times, and is indistinguishable from the total calculated using the standard effective mass estimator for . Towards the end of the golden window , the variance contribution to the effective mass begins to decrease. At very large times, contributions to from the variance are consistent with zero. The fourth cumulant makes smaller but statistically significant contributions to at intermediate times. Contributions from the fourth cumulant also decrease and are consistent with zero at large times. The vanishing of these contributions results from the distribution becoming uniform, and therefore time independent, at large times. These observations signal a breakdown of the cumulant expansion at large times, where contributions from the variance no longer approximate standard estimates of . Notably, the breakdown of the cumulant expansion at coincides with the plateaus to uniform-distribution cumulants in Fig. 19 and with the onset of the noise region discussed in Sec. III.

Figure 22: Contributions from the first four terms in the cumulant expansion of Eq. (17). The variance, shown second from left, is expected to provide the dominant contribution if a truncation of Eq. (17) is reliable. Standard estimates of from Eq. (12) are shown as the gray points, alongside the cumulant contribution (green points) in the second from left panel. Other panels only show cumulant contributions (green points).

Observations of these unexpected behaviors of in the noise region hint at more fundamental issues with the statistical description of used above. A sufficiently localized probability distribution of a circular random variable peaked far from the boundaries of can be reliably approximated as a standard probability distribution of a linear random variable defined on the real line. For broad distributions of a circular variable, the effects of a finite domain with periodic boundary conditions cannot be ignored. While circular random variables are not commonly encountered in quantum field theory, they arise in many scientific contexts, most notably in astronomy, biology, geography, geology, meteorology and oceanography. Familiarity with circular statistics is not assumed here, and a few basic results relevant for understanding the statistical properties of will be reviewed without proof. Further details can be found in Refs. Fisher (1995); Mardia and Jupp (2009); Borradaile (2003) and references therein.

A generic circular random variable can be described by two linear random variables and with support on the line interval where periodic boundary conditions are not imposed. It is the periodic identification of that makes sample moments poor estimators of the distribution of and, in particular, allows the sample mean of a distribution symmetrically peaked about to be opposite the actual location of peak probability. Parameter estimation for circular distributions can be straightforwardly performed using trigonometric moments of and . For an ensemble of random angles , the first trigonometric moments are defined by the sample averages,

(25)

Higher trigonometric moments can be defined analogously but will not be needed here. The average angle can be defined in terms of the mean two-dimensional vector as

(26)

A standard measure of a circular distribution’s width is given in terms of trigonometric moments as

(27)

where should be viewed as a measure of the concentration of a circular distribution. Smaller corresponds to a broader, more uniform distribution, while larger corresponds to a more localized distribution.
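A minimal numerical sketch of these definitions, with hypothetical names and with the concentration taken to be the resultant length of the first trigonometric moment (the paper's normalization may differ), is:

    import numpy as np

    def trig_moments(theta):
        """First trigonometric moments, circular mean angle, and concentration."""
        C = np.mean(np.cos(theta))        # first cosine moment
        S = np.mean(np.sin(theta))        # first sine moment
        mean_angle = np.arctan2(S, C)     # average angle from the mean 2d vector (C, S)
        R = np.hypot(C, S)                # resultant length: 0 = uniform, 1 = fully localized
        return C, S, mean_angle, R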

One way of defining statistical distributions of circular random variables is by “wrapping” distributions for linear random variables around the unit circle. The probability of a circular random variable equaling some value in is equal to the sum of the probabilities of the linear random variable equaling any value that is equivalent to modulo . Applying this prescription to a normally distributed linear random variable gives the wrapped normal distribution

(28)

where the second form follows from the Poisson summation formula. Wrapped distributions share the same characteristic functions as their unwrapped counterparts, and the second expression above can be derived as a discrete Fourier transform of a normal characteristic function. The second sum above can also be compactly represented in terms of elliptic-ϑ functions. For the wrapped normal distribution qualitatively resembles a normal distribution, but for the effects of wrapping obscure the localized peak. As , the wrapped normal distribution becomes a uniform distribution on . Arbitrary trigonometric moments and therefore the characteristic function of the wrapped normal distribution are given by

(29)
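For concreteness, the standard wrapped normal density, its Poisson-resummed Fourier form, and its trigonometric moments can be written in generic notation (the symbols μ and σ here denote the mean and standard deviation of the underlying unwrapped normal distribution, and may differ from the conventions of Eqs. (28) and (29)) as

    f_{WN}(\theta;\mu,\sigma)
      = \frac{1}{\sigma\sqrt{2\pi}} \sum_{k=-\infty}^{\infty}
        \exp\left[ -\frac{(\theta-\mu+2\pi k)^{2}}{2\sigma^{2}} \right]
      = \frac{1}{2\pi}\left[ 1 + 2\sum_{n=1}^{\infty} e^{-n^{2}\sigma^{2}/2}
        \cos\left( n(\theta-\mu) \right) \right] ,
    \qquad
    \langle e^{in\theta} \rangle = e^{in\mu - n^{2}\sigma^{2}/2} .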

Parameter estimation in fitting a wrapped normal distribution to LQCD results for can be readily performed by relating and above to these trigonometric moments as

(30)

Note that Eq. (30) holds only in the limit of infinite statistics. Estimates for the average of a wrapped normal distribution are consistent with zero at all times, as expected. Wrapped normal probability distribution functions with determined from Eq. (30) are shown with the histograms of Fig. 20 and provide a good fit to the data at all times.
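A minimal sketch of this moment-based estimation, following from the relation ⟨e^{iθ}⟩ = e^{iμ − σ²/2} for a wrapped normal variable (whether this matches the conventions of Eq. (30) exactly is an assumption of the sketch), is:

    import numpy as np

    def fit_wrapped_normal(theta):
        """Trigonometric-moment estimates of wrapped normal parameters."""
        C = np.mean(np.cos(theta))
        S = np.mean(np.sin(theta))
        R = np.hypot(C, S)            # resultant length of the first trigonometric moment
        mu = np.arctan2(S, C)         # wrapped normal mean estimate
        sigma_sq = -2.0 * np.log(R)   # wrapped normal variance; diverges as R -> 0 (uniform limit)
        return mu, sigma_sq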

The appearance of a uniform distribution at large times is consistent with the heuristic argument that the logarithm of a correlation function should be described by a stable distribution. The uniform distribution is a stable distribution for circular random variables, and in fact is the only stable circular distribution Mardia and Jupp (2009). The distribution describing a sum of many linear random variables broadens as the number of summands is increased, and the same is true of circular random variables. A theorem of Poincaré proves that as the width of any circular distribution is increased without bound, the distribution will approach a uniform distribution. One therefore expects that the sum of many well-localized circular random variables might initially tend towards a narrow wrapped normal distribution while boundary effects are negligible. Eventually, as more terms are added to the sum, this wrapped normal distribution will broaden and approach a uniform distribution. This intuitive picture appears consistent with the time evolution of shown in Figs. 20 and 21.
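The following toy Monte Carlo (with arbitrary step width and ensemble size, not taken from the lattice data) illustrates this picture: wrapped cumulative sums of small i.i.d. phase increments have a resultant length that decays towards zero, i.e. the wrapped distribution spreads from a narrow peak towards the uniform distribution.

    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_steps, step_width = 10_000, 200, 0.25   # arbitrary illustration parameters
    increments = rng.normal(0.0, step_width, size=(n_samples, n_steps))
    theta = np.cumsum(increments, axis=1)                # random walks in the phase
    # exp(i*theta) wraps the walks onto the unit circle automatically
    resultant = np.abs(np.mean(np.exp(1j * theta), axis=0))   # concentration at each step
    print(resultant[[0, 49, 99, 199]])   # decays towards the statistical noise floor (uniform limit)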

Figure 23: The left panel shows estimates of the wrapped normal mean calculated from Eq. (30) as a function of time. The center panel shows analogous estimates of the wrapped normal variance, . The right panel shows the wrapped normal effective mass, , defined in Eq. (31) (green points) along with the standard complex phase effective mass defined in Eq. (24) (gray points).

The wrapped normal variance estimates for that are shown in Fig. 23 require further discussion. At intermediate times, the wrapped normal variance calculated from Eq. (30) rises linearly with a slope consistent with . This is not surprising because assuming an exactly wrapped normal , becomes