Methods of Reverberation Mapping: I. Time-lag Determination by Measures of Randomness
Abstract
A class of methods for measuring time delays between astronomical time series is introduced in the context of quasar reverberation mapping, which is based on measures of randomness or complexity of the data. Several distinct statistical estimators are considered that do not rely on polynomial interpolations of the light curves nor on their stochastic modeling, and do not require binning in correlation space. Methods based on von Neumann's mean-square successive-difference estimator are found to be superior to those using other estimators. An optimized von Neumann scheme is formulated, which better handles sparsely sampled data and outperforms current implementations of discrete correlation function methods. This scheme is applied to existing reverberation data of varying quality, and consistency with previously reported time delays is found. In particular, the size–luminosity relation of the broad-line region in quasars is recovered with a scatter comparable to that obtained by other works, yet with fewer assumptions made concerning the process underlying the variability. The proposed method for time-lag determination is particularly relevant for irregularly sampled time series, and in cases where the process underlying the variability cannot be adequately modeled.
Subject headings:
galaxies: active — methods: data analysis — methods: statistical — quasars: general

1. Introduction
Reverberation mapping (RM) is a widely used technique for probing spatially unresolved regions in variable astronomical sources. Perhaps its best known application is for the study of the broad-line region (BLR) in quasars, where line emission from photoionized gas lags behind continuum variations, whence its spatial extent may be deduced (Bahcall et al., 1972; Cherepashchuk & Lyutyi, 1973). In this context, RM is at the core of studies to estimate the mass of supermassive black holes (e.g., Kaspi et al., 2000; Peterson & Wandel, 2000; Peterson et al., 2004) with implications for the formation and co-evolution of black holes and galaxies (e.g., Shields et al., 2003; Volonteri et al., 2003; Schawinski et al., 2006; Netzer et al., 2007; Trakhtenbrot & Netzer, 2012).
In its simplest form, RM seeks to determine a time delay between time series, which reflects the size of the source that emits the lagging signal (be it in quasars or in X-ray binaries; Uttley et al. 2014). Nevertheless, RM is not restricted to time-delay measurements, and additional information may be obtained on the geometry of the reverberating region by studying higher moments of the transfer function, provided the data are good enough (e.g., Blandford & McKee, 1982; Krolik et al., 1991; Pancoast et al., 2012; Grier et al., 2013; Pozo Nuñez et al., 2013). In what follows we restrict our discussion to time-delay measurement, which is a commonly encountered problem in astronomy.
Astronomical techniques for RM are inherently different from radar techniques, for example, because the driving signal cannot be controlled and the sampling is rarely regular, which undermines many common algorithms for signal processing. Various approaches have been put forward to deal with irregular sampling. For example, linear interpolation of the data is commonly applied by various algorithms for time-delay determination that are based on cross-correlation techniques (generally referred to in the literature as interpolated cross-correlation functions (ICCF); see, e.g., Gaskell & Peterson, 1987; Welsh, 1999; Chelouche & Daniel, 2012; Chelouche & Zucker, 2013). Other approaches that rely on more elaborate models include joint spline fitting of the quasar light curves with relative scaling and time-shifting, where the lag is obtained using a best-fit criterion (Tewes et al., 2013; Liao et al., 2015, and references therein). Although these approaches to RM are quite effective, which has to do with the fact that quasar variability is characterized by soft power density spectra (White & Peterson, 1994), they are not strictly justified because quasar light curves are stochastic in nature (Kelly et al., 2009).
With the advance in computing power, statistical models for quasar light curves (e.g., Scargle, 1981; Rybicki & Press, 1992) have been introduced to RM (e.g., Reynolds, 2000; Suganuma et al., 2006; Zu et al., 2011). These rely on the notion of quasar ergodicity, and often assume that the underlying process responsible for quasar variability is of the damped random-walk type. Whether this assumption is justified for all sources, at all times, and over the full range of variability timescales, is unclear (Mushotzky et al., 2011; Hernitschek et al., 2015, but see also Fausnaugh et al. 2016).
The aforementioned RM techniques, although widely used, require educated guessing as to how astronomical sources behave when not observed, e.g., due to seasonal gaps, which might result in erroneous lags if not properly applied (Hernitschek et al., 2015). An alternative approach to RM relies on (largely) model-independent time-delay measurement techniques, which fall into two main categories: discrete cross-correlation function (DCF) schemes (Edelson & Krolik, 1988, with its transformed variant, ZDCF, by Alexander 1997), and serial-dependence schemes (Pelt et al., 1994, 1996; note that some of those have evolved beyond the notion of serial-dependence schemes and may rely on tunable parameters, e.g., Gil-Merino et al. 2002 and references therein). While the use of discrete correlation schemes is widespread in reverberation studies (e.g., Collier et al., 1998; Kaspi et al., 2000; Pozo Nuñez et al., 2012), they are often less stable when sparse sampling is concerned (White & Peterson, 1994; Kaspi et al., 2000), and their results are sensitive to binning in correlation space (Kovačević et al., 2014). Serial-dependence schemes, on the other hand, have largely been restricted to lensing studies (Pelt et al., 1996; Gil-Merino et al., 2002; Tewes et al., 2013; Liao et al., 2015, and references therein), and their relevance to RM problems with nontrivial transfer function effects has not been adequately demonstrated, nor has their performance been benchmarked with respect to other commonly used RM schemes.
With the above limitations of current RM techniques in mind, searching for additional approaches to time-lag measurement is of the essence. Specifically, more reliable model-independent means for RM would help to refine current BLR size–luminosity relations (e.g., by decreasing their scatter, Haas et al. 2011), fully exploit upcoming survey data (e.g., Shen et al., 2015), and assess the viability of quasars as standard cosmological candles (Elvis & Karovska, 2002; Cackett et al., 2007; Watson et al., 2011; Czerny et al., 2013; Hönig et al., 2017).
In this work we introduce a new class of techniques to quasar RM, which is associated with measures of randomness and includes measures of data complexity and serial dependence. Such techniques are extensively used in various non-RM contexts in other fields of science (e.g., in cryptography and econometrics) but are not as widespread in astronomy. Common to all methods employed here is the minimal set of assumptions made when extracting time delays. In particular, the methods outlined here do not rely on data interpolation, nor on binning in correlation space, and do not invoke arguments regarding quasar ergodicity.
The paper is organized as follows: the method is described in §2, where various statistical estimators are defined. Section 3 benchmarks several variants of the method using an extensive suite of numerical simulations. The methods are applied to RM datasets of varying quality in §4, and their performance is benchmarked against other methods of RM. The summary follows in §5.
2. Methods
Let us represent two discrete light curves, $F_1$ and $F_2$, as two finite sets of ordered pairs:

$$F_{1,2} \equiv \{(t_i, f_i)\}_{i=1}^{N_{1,2}}, \qquad (1)$$

where $f_i$ are the fluxes measured at times $t_i$, and $N_{1,2}$ is the number of data points in $F_{1,2}$. We assign the indices of the ordered pairs so that they are in an increasing order of time stamps: $t_i < t_j$ when $i < j$ for $F_1$, and similarly for $F_2$. We further define the time-shifted version of $F_2$,

$$F_2^t \equiv \{(t_i - t, f_i)\}_{i=1}^{N_2}, \qquad (2)$$

where $t$ is a parameter.
Let us assume that $F_1$ and $F_2$ are normalized to zero mean and unit standard deviation (the degree to which this assumption is effective is discussed in detail in §3.1), and that all time stamps are unique, i.e., $t_i \neq t_j$ for $i \neq j$ (in the event that two data points have the same time stamp – this is generally unlikely, especially for $t \neq 0$ – we introduce arbitrarily small shifts, which do not affect the end result, so that time stamps are unique). We then define the combined light curve as

$$F(t) \equiv F_1 \cup F_2^t = \{(t_i, f_i)\}_{i=1}^{N}, \qquad (3)$$

where $N = N_1 + N_2$. Here also we assign the indices in an increasing order of the time stamps, so that $F(t)$ is an interlaced combination of $F_1$ and $F_2^t$.

Clearly, $F(t)$ is a discrete function of time, which depends, by construction, on $t$. In particular, if $F_2$ is a version of $F_1$ that lags by some time interval $t_0$ – as would be the case if $F_1$ ($F_2$) is the continuum (broad-line) light curve of a quasar – then $F(t)$ will appear more regular when constructed from equation 3 using $F_2^t$ with $t \simeq t_0$ (Fig. 1). By the term "regular" we mean that longer trends in $F(t)$ become more pronounced and short-term fluctuations are suppressed; for example, when $F_1$ and $F_2^t$ are, timewise, better aligned, the power spectrum of $F(t)$ becomes more reminiscent of red noise than of white noise. The problem of estimating the lag between $F_1$ and $F_2$ therefore translates to finding a reliable estimator for the regularity, or conversely the randomness, of $F(t)$, and defining a suitable extremum condition for it, from which the recovered time delay, $t_0^{\rm rec}$, may be deduced. Note that many of the statistical properties of $F(t)$, such as its value-distribution moments, are independent of $t$, hence algorithms that are sensitive to the ordering of the data are required.
2.1. Definitions of Estimators
There is a vast literature on quantifying the serial dependence, level of randomness, and complexity of datasets, which relates to coding theory (e.g., cryptography and data-compression problems), and some algorithms work better than others, depending on the type of data and the purpose for which they are employed (a useful compendium is by Bassham et al. 2010, which can be found at http://csrc.nist.gov/publications/nistpubs/800-22rev1a/SP800-22rev1a.pdf). Below, we outline a few representative algorithms/statistics that operate on $F(t)$, but we emphasize that additional relevant estimators exist, which are not included here and may be worth exploring.
2.1.1 Von Neumann
The mean-square successive difference after von Neumann (1941) is

$$\mathcal{V}(t) \equiv \frac{1}{N-1}\sum_{i=1}^{N-1}\left[f_i - f_{i+1}\right]^2 \qquad (4)$$

(recall the dependence of $f_i$ on $t$ via Equation 3; the explicit dependence is henceforth dropped for brevity). In particular, if $F$ were to be drawn from a normal distribution then $\mathcal{V}$ would converge to twice the variance for large $N$. If, however, positive correlations exist between successive points on the light curve (e.g., the long trends seen in quasar light curves) then its value would be smaller than twice the variance of the light curve, which is insensitive to the ordering of points. For the case in question, constructing $F(t)$ with $t \simeq t_0$, which corresponds to the lag between $F_1$ and $F_2$, is expected to result in pronounced trends in $F(t)$ and hence in a minimum of $\mathcal{V}(t)$ (Fig. 2). The recovered lag, $t_0^{\rm rec}$, is defined here by the location of the minimum of $\mathcal{V}(t)$ over the range probed. A version of this test, which operates on the ranked light curve, $F^R(t)$, and leads to similar results, is due to Bartels (1982). Similar results are also obtained when considering the sum of absolute differences rather than their squares (not shown).
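As an illustration, the full scan can be sketched in a few lines of Python. This is a hedged sketch, not the authors' implementation: function names, the trial-lag grid, and the sine-wave test signal below are illustrative.

```python
import numpy as np

def von_neumann(f):
    """Mean-square successive difference (Eq. 4) of a time-ordered flux series."""
    f = np.asarray(f, dtype=float)
    return np.mean(np.diff(f) ** 2)

def vn_lag(t1, f1, t2, f2, trial_lags):
    """Return the trial lag that minimizes the von Neumann estimator."""
    scores = []
    for lag in trial_lags:
        t = np.concatenate([t1, t2 - lag])   # Eq. (2): shift F2 back by the trial lag
        f = np.concatenate([f1, f2])
        order = np.argsort(t)                # Eq. (3): interlace by time stamp
        scores.append(von_neumann(f[order]))
    return trial_lags[int(np.argmin(scores))]
```

For a smooth signal echoed with a 10-day delay and sampled on offset grids, the scan recovers the input delay to within the grid spacing, since only at the correct shift do the interlaced points trace a single smooth curve.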
A version of the von Neumann (VN) estimator due to Pelt et al. (1994, sometimes referred to as a "dispersion spectrum" or "Pelt statistics"), which is reminiscent of the $\chi^2$ definition, multiplies each of the terms in the above sum by a weighting factor, $w_i$, where

$$w_i \equiv \left[\sigma_i^2 + \sigma_{i+1}^2\right]^{-1} \qquad (5)$$

($\sigma_i$ is the flux measurement uncertainty of the $i$'th point), and leads to a weighted version of the VN estimator. Like Pelt et al. (1994), we introduce a flag, $g_i$, which multiplies each of the terms in the VN sum, yet our definition expands on theirs, and reads

$$g_i(\Delta) \equiv \begin{cases} 1 & {\rm if}~a_i \neq a_{i+1}~{\rm and}~\vert t_{i+1}^{\rm obs} - t_i^{\rm obs} \vert > \Delta \\ 0 & {\rm otherwise,} \end{cases} \qquad (6)$$

where $a_i = 1$ if the $i$'th data point originates from $F_1$ and $a_i = 0$ otherwise, and $t_i^{\rm obs}$ is the observed (unshifted) time stamp (see also Appendix A). This definition of $g_i$ serves a dual purpose: it allows us to include in the sum only those cross terms of the light curve, where each point in the pair originates in a different light curve, thereby ignoring redundant information for time-delay measurements (Pelt et al., 1994). Our definition of $g_i$ further allows us to exclude pairs of points whose observed difference in time stamps is smaller than some timescale, $\Delta$, over which correlated errors may be important and could bias the results. For example, the effect of correlated errors between continuum and line flux measurements that originate from the same spectrum can be mitigated by using the flag with a suitable $\Delta > 0$; for simplicity, $\Delta = 0$ is assumed henceforth (see Alexander 1997, 2013). The refined VN expression then takes the form

$$\mathcal{V}(t) \equiv \frac{\sum_{i=1}^{N-1} w_i\, g_i \left[f_i - f_{i+1}\right]^2}{\sum_{i=1}^{N-1} w_i\, g_i}. \qquad (7)$$
2.1.2 Kendall
This is, essentially, a non-parametric correlation estimator (Kendall, 1938) that involves counting the number of concordant and discordant pairs ($N_c$ and $N_d$, respectively) in $F(t)$ or in its ranked version $F^R(t)$. Concordant pairs are those characterized by similar trends, i.e., the first point in both pairs having a smaller (larger) rank than the second point; discordant pairs are those whose trends are opposite (see also Zucker 2015). It is defined as

$$\tau(t) \equiv \frac{N_c - N_d}{N(N-1)/2} \qquad (8)$$

and is larger for light curves that are more regular. In particular, when combining the light curves with the correct time shift (of order the lag), the estimator reaches a maximum at $t \simeq t_0$ (Fig. 2). In addition to the standard implementation, we have also implemented a version of the test with the $g$ flag, as defined by equation 6, with the appropriate normalization (the term in the denominator of Eq. 8 is then computed from the flagged pairs only).
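A minimal serial version of this counting, treating successive samples $(f_i, f_{i+1})$ as bivariate pairs, might look like the following. This is an $O(N^2)$ sketch; the pairing convention and normalization are our assumptions, made for illustration.

```python
from itertools import combinations

def serial_kendall(f):
    """Serial Kendall-type statistic: count concordant (Nc) vs discordant (Nd)
    comparisons among successive pairs (f_i, f_{i+1}), cf. Eq. (8)."""
    pairs = list(zip(f[:-1], f[1:]))
    nc = nd = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            nc += 1          # concordant: both coordinates ordered the same way
        elif s < 0:
            nd += 1          # discordant: opposite ordering
    n = len(pairs)
    return (nc - nd) / (n * (n - 1) / 2)
```

A monotone series yields the maximal value of 1, while a rapidly alternating (sawtooth-like) series is driven negative, matching the qualitative behavior described above.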
2.1.3 Wald–Wolfowitz
Each point in the light curve is assigned a property, $h_i$, which is unity if its value is above the median and zero otherwise. The following sum is then calculated (Wald & Wolfowitz, 1940):

$$R(t) \equiv \sum_{i=1}^{N-1}\left[h_i - h_{i+1}\right]^2. \qquad (9)$$

Light curves that show longer trends over many visits, and are hence less random, will result in smaller values of $R$. Conversely, more random light curves, or those that are non-random but rather exhibit frequent – with respect to the cadence – sawtooth-like behavior (not a characteristic of quasar light curves), will attain higher values. We therefore identify the lag with the $t$ that minimizes $R(t)$ (Fig. 2). In addition to the standard implementation, we have also implemented a version of the test with each term in the sum multiplied by the flag $g_i$ (Equation 6), with proper normalization maintained. Weighting by measurement uncertainties (Eq. 5) may also be formally included with proper normalization, as in §2.1.1.
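The statistic amounts to counting transitions across the median, which can be sketched as follows (a hedged illustration; the function name is ours):

```python
import numpy as np

def wald_wolfowitz(f):
    """Runs-type statistic of Eq. (9): h_i = 1 above the median, 0 otherwise;
    the sum counts the transitions between the two states."""
    f = np.asarray(f, dtype=float)
    h = (f > np.median(f)).astype(int)
    return int(np.sum((h[:-1] - h[1:]) ** 2))
```

A single long rise crosses the median once, whereas a sawtooth of the same length crosses it at every step, illustrating the trend sensitivity described above.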
2.1.4 Cumulative sums
This test assigns each point in $F(t)$ a property, $s_i$, so that $s_i = +1$ if $f_i > 0$ and $s_i = -1$ otherwise (recall that the light curves are normalized to zero mean). If $F(t)$ resembles a random walk then, by summing over segments, one expects to observe relatively small deviations from zero (of the $\sqrt{N}$ type). On the other hand, larger deviations from a random-walk behavior are expected to occur in light curves that include correlations and trends extending over many successive points or "steps" (Bassham et al., 2010, for additional information). In particular, we calculate the maximum of partial sums,

$$C(t) \equiv \max_{1 \le j \le N} \left\vert \sum_{i=1}^{j} s_i \right\vert. \qquad (10)$$

A maximal $C$ is expected to arise when $t$ is comparable to the time delay between $F_1$ and $F_2$ (Fig. 2). As there is no clear dependence of the above sums on the measurement uncertainties (quasar light curves are not of a random-walk type), we do not consider weight/flag implementations here.
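The partial-sum statistic can be sketched in a few lines; note that the $\pm 1$ sign convention relative to the (zero) mean is an assumption on our part, chosen for illustration.

```python
import numpy as np

def cusum_stat(f):
    """Maximum absolute partial sum of signs (sketch of Eq. 10).

    Assumes f is normalized to zero mean: s_i = +1 if f_i > 0, else -1.
    """
    f = np.asarray(f, dtype=float)
    s = np.where(f > 0, 1, -1)
    return int(np.max(np.abs(np.cumsum(s))))
```

A coherent excursion above the mean drives the partial sums far from zero, whereas a rapidly alternating series keeps them bounded, which is the excess-over-random-walk behavior the test looks for.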
2.1.5 Blum–Kiefer–Rosenblatt
This test has evolved from the work of Hoeffding (1948), and has been shown to have superior qualities in the astronomical context (Zucker, 2016). The measure is defined using the ranked light curve, with successive points treated as bivariate pairs whose marginal ranks are $R_i$ and $S_i$, as in Blum et al. (1961):

$$B(t) \equiv \frac{1}{n^5}\sum_{i=1}^{n}\left[n C_i - R_i S_i\right]^2, \qquad n \equiv N-1, \qquad (11)$$

where $C_i$ is the "bivariate rank" (Zucker, 2016), which is defined as the number of pairs of points, $j$, for which $R_j \le R_i$ and $S_j \le S_i$. For light curves with longer trends (i.e., a lower level of randomness), the bivariate-rank term in the above expression would be larger. We therefore seek the $t$ that maximizes $B$ and identify it with the lag (Fig. 2). A flagged version has also been formally implemented with the appropriate normalization (with $n$ replaced by the number of flagged pairs).
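The bivariate-rank construction can be sketched directly; this is a schematic $O(n^2)$ illustration, with ranking and normalization conventions that are our assumptions.

```python
import numpy as np

def ranks(a):
    """1-based ranks of a sequence of distinct values."""
    return np.argsort(np.argsort(a)) + 1

def bkr_stat(f):
    """BKR-type statistic on successive pairs (sketch of Eq. 11)."""
    x = np.asarray(f[:-1], dtype=float)
    y = np.asarray(f[1:], dtype=float)
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    stat = 0.0
    for i in range(n):
        c = np.sum((rx <= rx[i]) & (ry <= ry[i]))  # bivariate rank C_i
        stat += (n * c - rx[i] * ry[i]) ** 2
    return stat / n ** 5
```

For a strictly monotone series the bivariate ranks take their extremal values, so the statistic can be evaluated by hand as a sanity check.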
2.1.6 Lempel–Ziv
Lempel & Ziv (1976) introduced a complexity measure of a data stream, which is at the heart of a well-known lossless data-compression algorithm. A lower complexity measure means that the data show more significant trends and hence are easier to compress (for example, points in the plane that trace a linear trend may be characterized by their raw values or, instead, by their residuals about the linear-trend parameterization; choosing the latter representation leads to a compression of the data volume without any information loss). We therefore expect $F(t)$ to have minimal complexity at around $t \simeq t_0$ (Fig. 2). To limit ourselves to a finite eigen-vocabulary (Lempel & Ziv, 1976) we consider two binary-sequence options: either we operate on $h_i$, as defined in §2.1.3, or instead define $b_i = 1$ if $f_{i+1} > f_i$ and $b_i = 0$ otherwise ($b$ is a sequence of $N-1$ points using this definition). It turns out that the outcome, as far as the determination of the time lag is concerned, depends little on the particular implementation used. Due to the binary nature of this estimator, and the uncertainties associated with its normalization, we do not consider error-weighting or flag implementations here.
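A simple incremental dictionary parsing, in the spirit of Lempel & Ziv (1976) though not their exact algorithm, conveys the idea: more regular binary sequences parse into fewer new phrases.

```python
def lz_complexity(bits):
    """Count the number of new phrases in a greedy dictionary parsing of a
    binary sequence (a simplified complexity proxy, not LZ76 verbatim)."""
    s = ''.join(str(b) for b in bits)
    phrases = set()
    count, i = 0, 0
    while i < len(s):
        j = i + 1
        while j <= len(s) and s[i:j] in phrases:
            j += 1               # extend until the substring is new
        phrases.add(s[i:j])
        count += 1
        i = j
    return count
```

The constant (trend-like) sequence below parses into fewer phrases than the alternating one, in line with the expectation that smoother data are more compressible.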
2.2. Definitions of Optimized Estimators
Consider each of the aforementioned estimators, which operate on a time series $F(t)$ defined in equation 3. Clearly, an improper relative normalization of $F_1$ and $F_2$ (or $F_2^t$) could lead to erroneous time delays (see more below). Nevertheless, due to finite sampling, the true relative normalization of $F_1$ and $F_2$ is unknown. In this work we improve upon the method described in Pelt et al. (1994), and consider the following scheme when assembling an optimal version of $F(t)$. Specifically, in analogy to equation 3 let us define

$$F(t; a, b) \equiv F_1 \cup \left\{(t_i - t,\; a f_i + b)\right\}_{i=1}^{N_2}, \qquad (12)$$

where $a$ and $b$ are parameters with $a > 0$ (unphysical normalizations of the light curves would be obtained for $a \le 0$, whereby $F_2$ is either nulled or its flux inverted). We note that for light curves typical of RM, and from symmetry considerations, we may instead choose to normalize $F_1$ instead of $F_2$, so that an expression analogous to equations 3 and 12 would read

$$\tilde{F}(t; a, b) \equiv \left\{(t_i,\; a f_i + b)\right\}_{i=1}^{N_1} \cup F_2^t. \qquad (13)$$

Clearly, the evaluation of the time delay should not depend on whether $F$ or $\tilde{F}$ is used as input, hence we define the optimized versions of the estimators as

$$\mathcal{E}^{\rm opt}(t; a, b) \equiv \frac{1}{2}\left[\mathcal{E}(t; a, b) + \tilde{\mathcal{E}}(t; a, b)\right], \qquad (14)$$

where $\mathcal{E}$ and $\tilde{\mathcal{E}}$ are calculated for $F$ and $\tilde{F}$, respectively. An extremum can then be sought either numerically or analytically in the three-dimensional phase space spanned by $(t, a, b)$ (see Appendix A for the VN scheme). The aforementioned symmetrization is robust to situations where little overlap exists between the shifted light curves, e.g., due to seasonal gaps. In particular, an asymmetric implementation of the normalization scheme may lead to degenerate solutions (in the case of a minimum) or runaway normalizations (in the case of a maximum), and thence to erroneous time-delay measurements.
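For a minimum-type estimator such as the VN, the symmetrized optimization of Eqs. (12)-(14) can be illustrated with a brute-force grid scan; the grid ranges below are illustrative, and a real implementation would optimize continuously (see Appendix A).

```python
import numpy as np
from itertools import product

def von_neumann(t, f):
    """Mean-square successive difference of fluxes ordered by time."""
    order = np.argsort(t)
    return np.mean(np.diff(np.asarray(f, dtype=float)[order]) ** 2)

def optimized_vn(t1, f1, t2, f2, lag, a_grid, b_grid):
    """Symmetrized estimator of Eq. (14), minimized over an (a, b) grid."""
    t1, f1, t2, f2 = map(np.asarray, (t1, f1, t2, f2))
    t = np.concatenate([t1, t2 - lag])
    best = np.inf
    for a, b in product(a_grid, b_grid):
        if a <= 0:                 # a > 0: avoid nulled or flux-inverted curves
            continue
        e = von_neumann(t, np.concatenate([f1, a * f2 + b]))        # Eq. (12)
        e_tilde = von_neumann(t, np.concatenate([a * f1 + b, f2]))  # Eq. (13)
        best = min(best, 0.5 * (e + e_tilde))                       # Eq. (14)
    return best
```

By construction, enlarging the grid around $(a, b) = (1, 0)$ can only lower the optimized value relative to the plain symmetrized estimator.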
3. Simulations
To test the efficiency of measures of randomness for time-lag determination, and to identify potential biases with respect to the true lag, we resort to simulations. We model the driving continuum light curve, $f_1$, using the method of Timmer & Koenig (1995) with a Fourier transform amplitude $\vert \tilde{f}(\nu) \vert \propto \nu^{-\alpha}$, where $\alpha = 1$, which is typical of quasars over days-to-years timescales (Giveon et al., 1999; Mushotzky et al., 2011; Caplar et al., 2016; note that some works quote the power-law index of the power spectral density, $\propto \nu^{-2\alpha}$, instead).
The light echo is obtained from the convolution of the driving light curve with a transfer function, $\psi$, so that $f_2 = \psi \otimes f_1$, where $\psi$ is centered on the input lag, $t_0$, and $\otimes$ denotes convolution. In this work we consider rectangular transfer functions over the time range $t_0 \pm \Delta t/2$, where $\Delta t \le 2t_0$, thus preserving causality while allowing us to control the level of BLR isotropy around the ionizing source by the ratio $\Delta t/2t_0$ (e.g., Pozo Nuñez et al., 2013, and references therein).
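On a regular grid, the echo construction reduces to a discrete convolution with a normalized box kernel. The sketch below assumes unit-normalized throughput and grid-aligned kernel edges; names are illustrative.

```python
import numpy as np

def echo(f_drive, dt_samp, t0, width):
    """Convolve a regularly sampled driver with a rectangular transfer
    function spanning [t0 - width/2, t0 + width/2] (requires width <= 2*t0,
    which keeps the kernel causal)."""
    lo = int(round((t0 - width / 2) / dt_samp))
    hi = int(round((t0 + width / 2) / dt_samp))
    kernel = np.zeros(hi + 1)
    kernel[lo:hi + 1] = 1.0
    kernel /= kernel.sum()                         # unit total response
    return np.convolve(f_drive, kernel)[: len(f_drive)]
```

Feeding a unit impulse through the kernel recovers the transfer function itself, which is a convenient check that the lag and width land on the intended grid cells.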
To simulate light curves that are reminiscent of astronomical data, $f_1$ and $f_2$ are independently and randomly sampled with $N_{1,2}$ visits. For comparison, we also experimented with simultaneous uniform sampling of both time series and found the relative benchmarks of the various estimators to be qualitatively similar, hence they are not shown. Lastly, we add uncorrelated (Gaussian) noise to the sampled light curves to mimic observational uncertainties.
Having quasar RM campaigns in mind, our basic set of light-curve realizations, around which parameterized excursions are later explored (§3.1), is defined by $\alpha = 1$ (Kelly et al., 2009), $t_0 = 100$ days (Kaspi et al., 2000), and $T/t_0 = 10$, where $T$ is the total span of the time series (see Fig. 8 in Welsh 1999 for motivation). In addition, each light curve is randomly sampled with 100 visits ($N_{1,2} = 100$), which corresponds to an average sampling timescale for each light curve of order 10 days (Pozo Nuñez et al., 2012; Barth et al., 2015; Du et al., 2016). We further take a small $\Delta t/2t_0$, which is consistent with a relatively face-on disk geometry (Pancoast et al., 2012; Pozo Nuñez et al., 2013). A signal-to-noise ratio S/N = 10 is assumed, which corresponds to measurement uncertainties at the 10% level (see Kaspi et al., 2000), and may be compared to the assumed fractional rms variability amplitude of the quasar, $F_{\rm var}$ (Rodríguez-Pascual et al., 1997; Kaspi et al., 2000).
The time delays between pairs of mock light curves are deduced in a fully automated way by searching for the extrema associated with each of the aforementioned measures of randomness. Unless otherwise specified, we do not use an error-weighting scheme, nor flag implementations.
The recovered distributions of the time lag, $t_0^{\rm rec}$, are based on numerous light-curve realizations, and are shown for the non-optimized estimator versions in Figure 3. We find that the Bartels and VN estimators lead to similar results, and only the latter, which leads to somewhat improved lag statistics, is shown. Further, the VN estimator is the best performer in our simulations: it exhibits the narrowest distribution of $t_0^{\rm rec}$, and is relatively unbiased, with $\langle t_0^{\rm rec} \rangle \simeq t_0$. Next in terms of estimator performance come the Kendall and Blum–Kiefer–Rosenblatt estimators, which exhibit larger dispersions around the mean. The Wald–Wolfowitz, the Lempel–Ziv – the particular implementation of which appears to be largely immaterial – and the cumulative-sums estimators are, in order of appearance, the worst performers: they lead to the broadest distributions around the central value, and their means, like their median and centroid values, are biased toward shorter lags – by up to 30% for the cumulative-sums estimator – for our set of simulations.
3.1. Benchmarking the VN Estimator
With the above results in mind, we focus in this section on benchmarking and improving the von Neumann scheme for RM purposes, and compare its deduced lags to those obtained by the ZDCF method of Alexander (1997, 2013; see the complete code package at http://www.weizmann.ac.il/weizsites/tal/research/software/), which is a widely used discrete correlation scheme in the field of RM. Unless otherwise specified, the ZDCF algorithm is used here in its default optimal-binning mode and with zero-lag points omitted (Alexander, 2013).
A detailed comparison of the VN and ZDCF deduced lags for mock light curves defined by our basic parameterization is shown in Figure 4. We find that the von Neumann estimator provides a distribution of lags with a narrower core than the ZDCF – the ratio of their full widths at half-maximum is below unity for the simulated case; see Fig. 4 – but with a more pronounced long-tail wing extending to short and even negative time delays (the latter are obtained in a small fraction of the realizations for this set of simulations). Detrending of the light curves appears to be counterproductive for the VN approach and hence is not considered in the remainder of the paper. We attribute the emergence of the wing toward short time delays to relative inconsistencies in normalization in a fraction of the light-curve realizations, and demonstrate the improved statistics obtained for the optimized scheme of §2.2 in Figure 4. In particular, the distribution of $t_0^{\rm rec}$ obtained by the optimized scheme becomes more symmetric and narrow; it is significantly narrower than the ZDCF, with 68% of the $t_0^{\rm rec}$ values lying within a narrow range of days around the peak for our basic set of simulations.
The optimized scheme appears to be considerably better than the ZDCF in suppressing spurious time delays at $\vert t \vert \to T$ (see the inset of Fig. 4). Nevertheless, in what follows we restrict our time-delay search window to the range $\vert t \vert < T/2$, within which more than half of the light-curve features covered by the time series are in the timewise overlapping region between $F_1$ and $F_2^t$ (this statement is true in the statistical sense but may not hold for individual light-curve realizations).
We further note the overall poor correspondence between the time-delay measurements of the VN and ZDCF methods, leading to a low Pearson correlation coefficient (right-hand plot of Fig. 4). Recovered time delays that deviate appreciably from the central value in either direction, for both methods, are somewhat better correlated (note the somewhat tilted outer contours in Fig. 4). Interestingly, the poor correlation between the time delays obtained by the two methods implies that they may be viewed as independent measurements of the lag, and under certain circumstances might even be used in tandem to improve time-delay estimates; a proper treatment of this possibility is beyond the scope of the present work.
We next benchmark the performance of the optimized VN method against the ZDCF for various parameterized excursions from our basic set of light-curve realizations. To this end we define the rms deviation of the measured lag from the true lag per estimator, ${\rm rms} \equiv [\sum (t_0^{\rm rec} - t_0)^2 / N_{\rm real}]^{1/2}$, where $N_{\rm real}$ is the number of realizations (to avoid having our statistics influenced by outliers – see Fig. 4 – the sum includes all but the extreme 2.5% of measurements on either side of the distribution, for either of the estimators), and calculate the ratio ${\rm rms_{VN}/rms_{ZDCF}}$ (a value below unity implies an advantage to the VN approach). The results are summarized in Figure 5, where each point is based on the analysis of a large ensemble of mock light curves.
We find the performance of the optimized VN scheme to be consistently better than that of the ZDCF method over much of the parameter space explored here. Specifically, it performs best for slopes, $\alpha$, of the order of the observed values (but under-performs for slopes well outside the observed range; not shown) and for narrow transfer functions, such as those that characterize relatively face-on disks or rings (Pancoast et al., 2012). A further advantage of the optimized VN scheme is attained for more extended campaigns (with a fixed number of points), mainly because of the increased scatter in the ZDCF results with poorer sampling. Higher cadence and S/N further improve the relative performance of the optimized VN scheme, as may be expected from a robust statistical estimator given the rising quality of the data. Interestingly, considering a model with higher S/N, denser sampling, and a longer baseline, and repeating the simulations, results in the distribution of $t_0^{\rm rec}$ of the optimized VN scheme being several times narrower than that of the ZDCF (not shown).
As data quality deteriorates (e.g., sparser light curves), the deduced distributions of $t_0^{\rm rec}$ broaden and may blend with the spurious lag population at $\vert t \vert \to T$, which is shown in the inset of Figure 4. Nevertheless, our calculations indicate a consistent advantage of the VN scheme over the ZDCF approach also in this limit. This is most notable for sparse sampling and when the total number of points per light curve is low (enforcing a different ZDCF binning scheme than the default does not appreciably change the results and hence is not shown). A significant relative decline in the ZDCF's performance is seen also for short simulated campaigns with $T/t_0$ of order a few, which rarely characterize RM campaigns because of well-known biases (Welsh, 1999). In the example considered here, the ZDCF time-lag distributions develop a non-negligible wing extending to short/negative delays while the optimized VN is not substantially affected and hence is less biased. A more in-depth exploration of the relative performance of the two estimators under challenging observing conditions, which are not conducive to reliable RM (Welsh, 1999), is beyond the scope of this paper.
Finally, we note that a similar behavior also characterizes the Bartels estimator, whose results are not shown here. Specifically, its rms ratios relative to the ZDCF are generally smaller than unity, hence its performance is superior to the ZDCF, but they are larger than the VN's over the parameter range considered here.
3.1.1 The Weighted VN Scheme
Here we consider the effect of spikes in the light curve (due to larger measurement uncertainties in a fraction of the data points) and of correlated noise between the light curves on the deduced time delays, and the means to mitigate them using the weighted VN scheme of §2.1.1. To this end, more realistic sets of light curves have been simulated, which mimic typical light curves used for (a) BLR RM and (b) continuum time-delay measurements (perhaps associated with the accretion disk; Collier et al. 1998).
To better mimic BLR RM data we assume our basic set of simulations but include seasonal gaps 150 days wide, and assume that 10% of the data points suffer from an enhanced noise level, with their measurement uncertainties being larger by a factor of 3; an example is shown in Figure 6. Results for the recovered time delays from mock light curve simulations using the weighted and unweighted VN schemes are shown in Figure 6, and demonstrate the advantage of using the former. We find marginal advantage in using the flag scheme.
To study the effect of correlated noise we turn to the case of continuum RM, where the relevant light curves are often characterized by high, nearly uniform cadence, and rely also on spectroscopic data for measuring fluxes at different wavelengths, hence leading to correlated errors. In our set of simulations we assume a time delay of order a day with a narrow transfer function. The sampling cadence is fixed at 1 day, hence the transfer function is barely resolved. We further consider correlated noise at the 5% level, which exceeds the 1% formal measurement uncertainties assumed here, and depict one such realization in Figure 6. Results for the weighted VN give a distribution of $t_0^{\rm rec}$ centered around 0.5 days (the fact that time delays on sub-sampling timescales are obtained should not come as a surprise, as those reflect the averaging inherent in the FR/RSS scheme; see §4), which does not extend to values as large as the input delay because of the correlated errors. The bias is alleviated using our flag implementation, and the average delay is consistent with the input lag.
4. Data Applications
In this section we apply the schemes to measure randomness (§2) to quasar data of varying quality, and benchmark their performance. We aim to provide lag measurements for individual sources and epochs, so time-lag uncertainties need to be quantified. Here we resort to the standard techniques in RM for estimating uncertainties in lags (see Pelt et al., 1994, for a different implementation), which involve the FR/RSS algorithm of Peterson et al. (1998, 2004), and quote upper/lower uncertainties that correspond to the 16/84 percentiles obtained from the lag statistics of numerous FR/RSS realizations for each set of light curves (see Appendix B). Unless otherwise stated, optimized and weighted versions of the estimators (§2.2), including nontrivial flag implementations, are considered where applicable.
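One FR/RSS realization, in the spirit of Peterson et al. (1998), can be sketched as below: random subset selection (unique draws with replacement) followed by flux randomization within the quoted errors. For brevity, the sketch omits the rescaling of uncertainties for multiply selected points; names are illustrative.

```python
import numpy as np

def frrss(t, f, sig, rng):
    """Return one FR/RSS resampling of a light curve (simplified sketch)."""
    n = len(t)
    idx = np.unique(rng.integers(0, n, n))            # RSS: keep unique draws
    t_s, f_s, s_s = t[idx], f[idx], sig[idx]
    f_s = f_s + s_s * rng.standard_normal(len(idx))   # FR: jitter within errors
    return t_s, f_s, s_s
```

Applying the lag estimator to many such realizations and taking the 16/84 percentiles of the resulting lag distribution yields the quoted uncertainties.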
object      z       (days)   (days)   Ref.
---------------------------------------------
Mrk 335     0.026   ...      ...      1
PG0026      0.142   ...      ...      2
PG0052      0.155   ...      ...      2
Fairall 9   0.047   ...      ...      3
Mrk 590     0.026   ...      ...      4
3C 120      0.033   ...      ...      1
Ark 120     0.033   ...      ...      4
Mrk 79      0.022   ...      ...      4
PG0804      0.100   ...      ...      2
Mrk 110     0.035   ...      ...      4
PG0953      0.234   ...      ...      2
NGC 3227    0.004   ...      ...      5
Mrk 142     0.045   ...      ...      6
NGC 3516    0.009   ...      ...      5
SBS 1116    0.028   ...      ...      6
Arp 151     0.021   ...      ...      6
NGC 3783    0.010   ...      ...      7
Mrk 1310    0.020   ...      ...      6
NGC 4051    0.002   ...      ...      8
NGC 4151    0.003   ...      ...      9
Mrk 202     0.021   ...      ...      6
NGC 4253    0.013   ...      ...      6
PG1226      0.158   ...      ...      2
PG1229      0.063   ...      ...      2
NGC 4593    0.009   ...      ...      10
NGC 4748    0.015   ...      ...      6
PG1307      0.155   ...      ...      2
Mrk 279     0.031   ...      ...      11
PG1411      0.090   ...      ...      2
NGC 5548    0.017   ...      ...      5
  (13 yearly epochs of NGC 5548)      12
PG1426      0.087   ...      ...      2
Mrk 817     0.032   ...      ...      8
Mrk 290     0.030   ...      ...      8
PG1613      0.129   ...      ...      2
PG1617      0.112   ...      ...      2
PG1700      0.292   ...      ...      2
3C 390.3    0.056   ...      ...      13
NGC 6814    0.005   ...      ...      6
Mrk 509     0.034   ...      ...      4
PG2130      0.063   ...      ...      1
NGC 7469    0.016   ...      ...      14
Values are in the rest frame of the source.
Data references: (1) Grier et al. 2012; (2) Kaspi et al. 2000; (3) Santos-Lleó et al. 1997; (4) Peterson et al. 1998; (5) Denney et al. 2009; (6) Bentz et al. 2009; (7) Stirpe et al. 1994; (8) Denney et al. 2009b; (9) Bentz et al. 2006; (10) Denney et al. 2006; (11) Santos-Lleó et al. 2001; (12) Peterson et al. 2002; (13) Dietrich et al. 1998; (14) Peterson et al. 2014.
4.1. Multi-epoch Data for NGC 5548
NGC 5548 has one of the best sets of light curves suitable for RM, covering more than a decade of high-cadence observations (Peterson et al., 2002). Based on these data, several works have shown that NGC 5548 follows an intrinsic BLR size–luminosity relation (Peterson et al., 2002; Bentz et al., 2007, 2013), which is consistent with the extrinsic (i.e., multi-object) one. Here we analyze 13 consecutive sets of continuum and Hβ light curves of NGC 5548, each spanning a year (yearly definitions follow those of Peterson et al. 2002 and Bentz et al. 2013), and benchmark the performance of the optimized estimators defined in §2 by quantifying the scatter around the BLR size–luminosity relation. Host-subtracted luminosity estimates for each epoch were taken from Bentz et al. (2013, and see §4.3).
Figure 7 shows versions of the intrinsic BLR size–luminosity relation as obtained by each of the estimators defined in §2. The relative performance of the estimators is insensitive to whether weighted or unweighted versions are used. We find that the VN estimator outperforms the other measures of randomness, and leads to a tighter correlation between time delay and luminosity (time delays are quoted in Table 1). Conversely, the cumulative-sums estimator is the worst performer, leading in some cases to negative lags. These findings confirm our simulations (Fig. 3).
Given the advantages of the VN scheme over the other estimators explored here, we henceforth focus on its performance, and consider its weighted version with non-trivial weight and flag implementations unless otherwise specified.
4.2. The Palomar–Green Quasar Sample of Kaspi et al. (2000)
Here we benchmark the performance of the VN schemes using the spectrophotometric data of Kaspi et al. (2000) for 17 Palomar–Green (PG) quasars, and compare it to the default implementation of the ZDCF (see also Kaspi et al. 2000). Time delays reported for the latter are obtained using the maximum-likelihood approach of Alexander (2013); the corresponding correlation functions are identical to those reported in Kaspi et al. (2000, see their Fig. 4) and hence are not shown.
Time-delay measurements for the Hβ line are reported here for two implementations of the VN algorithm: one including non-trivial weights and flags, and an unweighted version operating over the full extent of the light curves (both lead to corresponding columns in Table 2). Below we comment on the statistical properties of the lags obtained, with further details provided in Appendix C. That said, the reader is advised not to over-interpret the statistics given the sample size and data quality.
We find the results of the weighted and unweighted VN schemes to be similar, and that their time delays are both comparably correlated with the quasar luminosity, with correlation coefficients similar to that of the ICCF scheme (Kaspi et al., 2000, and our Table 2). In contrast, the ZDCF time delays lead to a significantly weaker correlation with luminosity. Further, while the VN scheme leads to positive delays in all cases, the ZDCF reports negative delays in a fraction of the cases. This hints at the superior stability of the VN scheme, as also indicated by our simulations, and perhaps also reflects the overestimated lag uncertainties obtained by the FR/RSS scheme (see Appendix B). We also find that the VN results are very well correlated with the ICCF time delays (see Table 2 and Kaspi et al., 2000). Taken together, our findings suggest that the VN scheme has two promising features: (1) it has the major advantage of being largely model- and binning-independent, while (2) producing more stable results than default implementations of discrete cross-correlation function schemes.
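The correlation statistics quoted above are computed on logarithmic lag values with negative delays ignored (see the notes to Table 2). A helper of the following kind, with names of our choosing, sketches that computation:

```python
import numpy as np

def log_lag_correlation(lag_a, lag_b):
    """Pearson correlation between two sets of lags, computed on
    logarithmic values; negative (unphysical) delays are ignored."""
    lag_a = np.asarray(lag_a, dtype=float)
    lag_b = np.asarray(lag_b, dtype=float)
    m = (lag_a > 0) & (lag_b > 0)                # drop negative delays
    return np.corrcoef(np.log10(lag_a[m]), np.log10(lag_b[m]))[0, 1]
```

Correcting the lags for cosmological time dilation (dividing by 1 + z) before calling such a helper only shifts the logarithms by a constant per object and is omitted here for brevity.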
Table 2
object    z
PG0026+129    0.142
PG0052+251    0.155
PG0804+761    0.100
PG0844+349    0.064
PG0953+414    0.239
PG1211+143    0.085
PG1226+023    0.158
PG1229+204    0.064
PG1307+085    0.155
PG1351+640    0.087
PG1411+442    0.089
PG1426+015    0.086
PG1613+658    0.129
PG1617+175    0.114
PG1700+518    0.292
PG1704+608    0.371
PG2130+099    0.061
Pearson's correlation coefficients: 0.96
Values are in the observed frame.
Hβ lag measurements were considered, as in Kaspi et al. (2000).
After correcting for cosmological time dilation and using logarithmic values (negative time delays were ignored).
Correlation coefficients should not be overinterpreted because of the limited sample size and the uncertainties associated with individual measurements. Uncertainties on the correlation coefficients are not provided for similar reasons.
4.3. The Sample of Active Galactic Nuclei of Bentz et al. (2013)
Here we revisit the sample of active galactic nuclei (AGN) of Bentz et al. (2013), for which the nuclear luminosity has been estimated by subtracting the host's contribution to the aperture flux, resulting in a tighter BLR size–luminosity relation. In particular, we reanalyze the continuum and Hβ data for all sources listed in Table 1 using the optimized and weighted VN scheme with non-trivial factor implementations. More information about individual targets may be found in Appendix C.
Time-lag measurements are in most cases robust, with the delays being highly significant, as expected given the overall data quality of the sample. That said, the values reported at the short-lag end of the distribution appear to be somewhat less significant than the values reported by Bentz et al. (2013), which may relate to the overall overestimation of the FR/RSS uncertainties when applied to the VN scheme (see below and Appendix B). Overall, a statistical comparison of the time delays obtained here to those of Bentz et al. (2013, and references therein) shows good agreement (Fig. 8).
The expected correspondence between our results and the lags reported in Bentz et al. (2013) can be used to estimate the degree to which the quoted lag uncertainties are reasonable. To this end, we take a simplified approach: we symmetrize the lower and upper measurement uncertainties reported in Table 1, and consider the ratio of our lags to the literature values, with its uncertainty obtained by adding errors in quadrature. With the expectation of a ratio of unity, we calculate the reduced χ². Reducing the lag uncertainties brings the reduced χ² closer to unity, in agreement with our calculations in Appendix B (note that this is merely a global estimate and may not reflect the reliability of the measurement uncertainties for individual sources, nor the reliability of the quoted uncertainties for the ICCF scheme).
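The global consistency check described above can be sketched as follows, assuming arrays of VN lags and reference lags with their uncertainties. The symmetrization and quadrature propagation follow the text, while the function name and the choice dof = N (the ratio is compared with a fixed expectation of unity, so no parameters are fitted) are ours:

```python
import numpy as np

def lag_ratio_chi2(lag_vn, err_lo, err_hi, lag_ref, err_ref):
    """Reduced chi-square of the VN-to-reference lag ratio against unity,
    with symmetrized measurement uncertainties added in quadrature."""
    sig_vn = 0.5 * (err_lo + err_hi)            # symmetrize asymmetric errors
    ratio = lag_vn / lag_ref
    # propagate fractional errors in quadrature to the error on the ratio
    sig = ratio * np.sqrt((sig_vn / lag_vn) ** 2 + (err_ref / lag_ref) ** 2)
    chi2 = np.sum(((ratio - 1.0) / sig) ** 2)
    return chi2 / len(ratio)                    # expectation fixed: dof = N
```

Rescaling all lag uncertainties by a common factor s divides this statistic by s², which is how a global correction factor bringing the reduced χ² to unity can be read off.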
Our version of the BLR size–luminosity relation is shown in Figure 8, together with the multi-epoch results for NGC 5548 of §4.1. The results are consistent with the best-fit relation of Bentz et al. (2013): an unweighted fit to the von Neumann results gives a power-law slope and an intercept that are consistent with the values reported in Bentz et al. (2013).
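An unweighted power-law fit of this kind reduces to a straight-line fit in log–log space. The sketch below assumes luminosities expressed in some fixed unit (the choice of unit only shifts the intercept); the function name is ours:

```python
import numpy as np

def fit_size_luminosity(lum, lag_days):
    """Unweighted power-law fit R_BLR ∝ L^alpha, performed as a linear
    fit in log–log space: log10(lag) = alpha * log10(L) + k."""
    alpha, k = np.polyfit(np.log10(lum), np.log10(lag_days), 1)
    return alpha, k
```

A weighted fit (or one accounting for intrinsic scatter) would require the per-object lag uncertainties, which is why the simpler unweighted variant is quoted in the text.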
We find the dispersion in our lag measurements relative to the (theoretical) BLR size–luminosity relation^11 to be comparable to the values found in Zu et al. (2011) and Bentz et al. (2013); note, however, the differences in the samples (objects and epochs), as well as in the estimates of source luminosity considered in each work.
^11 After excluding Mrk 142 and NGC 3227 for reasons given in Bentz et al. (2013). Quoted uncertainties on the scatter were obtained by applying a random subset selection (RSS) scheme, akin to that discussed in Peterson et al. (1998), to the data shown in the right panel of Figure 8 and quoted in Table 1. These do not include the contribution of lag uncertainties to the scatter, for reasons discussed in Appendix B.
5. Summary
In this paper we seek to expand the range of methods available for time-lag determination in the field of quasar reverberation mapping. To this end we introduce a class of statistical estimators, originally used to measure the degree of randomness or complexity of a dataset (e.g., in cryptographic applications), and adapt it to RM.
Six different randomness/complexity measures are considered, among them the von Neumann and Lempel–Ziv estimators (the latter is at the core of many lossless data-compression algorithms), and appropriate extremum criteria for time-lag measurement are defined for each. At least two distinct implementations are considered per estimator (e.g., error-weighted and unweighted schemes), with the feature common to all being the minimal set of assumptions used to analyze the data. Specifically, none of the estimators considered here requires polynomial interpolation schemes to be defined, nor do they employ binning in correlation space. Further, they do not rely on ergodicity arguments or on (tunable) stochastic models for quasar variability, hence time-delay measurements are restricted to the information embedded in the light curves being processed.
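As a minimal illustration of the central idea, an unweighted, non-optimized version of the VN time-lag estimator can be sketched as follows. It assumes the two light curves are already on a comparable flux scale (the optimized scheme of Appendix A fits a linear renormalization instead), and the function name is ours:

```python
import numpy as np

def von_neumann(t1, f1, t2, f2, lags):
    """Unweighted von Neumann time-lag estimator: for each trial lag,
    merge the shifted series, order it in time, and compute the
    mean-square successive difference; the minimizing lag is the
    lag estimate."""
    vn = np.empty(len(lags))
    for k, tau in enumerate(lags):
        t = np.concatenate([t1, t2 - tau])     # shift series 2 back by tau
        f = np.concatenate([f1, f2])
        f = f[np.argsort(t, kind="stable")]    # order combined series in time
        vn[k] = np.mean(np.diff(f) ** 2)       # von Neumann statistic
    return lags[np.argmin(vn)], vn
```

At the correct lag the merged series traces a single smooth curve, so successive differences (and hence the statistic) are minimized; no interpolation or correlation-space binning is involved.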
Among the various measures considered in this work, we find that an optimized implementation of the VN estimator (see Pelt et al., 1994, for a similar but not identical approach) performs best, with the added benefit of being mathematically simple and computationally efficient to evaluate. We benchmark its performance using a large set of simulated light curves, and study the effects of data sampling rate, time-series duration, S/N, light-curve spikes, and correlated errors on the deduced lags. The effects of the properties of the quasar – the power density spectrum of its light curve and the BLR transfer function – on the deduced time-delay statistics are also investigated. We find that the optimized VN outperforms current implementations of the discrete correlation scheme over much of the parameter space relevant to RM, and leads to better agreement with the true lags.
Applying the optimized VN estimator to existing data of varying quality, we find its performance to be on par with that of other (more model-dependent) time-lag measurement schemes. In particular, the parameterization of the BLR size–luminosity relation of Bentz et al. (2013) is recovered, and the scatter around this theoretical relation is found to be comparable to that reported in other studies.
It is found that the optimized VN estimator can, under favorable conditions, be used in conjunction with existing methods, such as the discrete correlation function scheme, to obtain refined lags and to identify biases or the breakdown of the assumptions underlying other methods of RM. Such refinement and verification of time delays is crucial for alleviating systematics and for reducing the scatter in the size–luminosity relation, as well as for identifying departures from it (e.g., Du et al., 2016), with implications for quasar physics and for the use of quasars as standard cosmological candles.
To conclude, we find that measures of randomness significantly add to our arsenal of tools for RM, and that further exploration of the vast literature in the field is warranted. In particular, some of the schemes outlined here may be further refined, and additional measures of randomness/complexity not explored here may achieve superior performance as far as quasar RM is concerned.
References
 Alexander (1997) Alexander, T. 1997, Astronomical Time Series, 218, 163
 Alexander (2013) Alexander, T. 2013, arXiv:1302.1508
 Bahcall et al. (1972) Bahcall, J. N., Kozlovsky, B.Z., & Salpeter, E. E. 1972, ApJ, 171, 467
 Bartels (1982) Bartels R. 1982, J. Am. Stat. Assoc., 77, 40
 Barth et al. (2015) Barth, A. J., Bennert, V. N., Canalizo, G., et al. 2015, ApJS, 217, 26
 Bassham et al. (2010) Bassham, L. E., Rukhin, A., et al. 2010, National Institute of Standards and Technology, Special Publication (NIST SP) 800-22
 Bentz et al. (2006) Bentz, M. C., Denney, K. D., Cackett, E. M., et al. 2006, ApJ, 651, 775
 Bentz et al. (2007) Bentz, M. C., Denney, K. D., Cackett, E. M., et al. 2007, ApJ, 662, 205
 Bentz et al. (2013) Bentz, M. C., Denney, K. D., Grier, C. J., et al. 2013, ApJ, 767, 149
 Bentz et al. (2009) Bentz, M. C., Walsh, J. L., Barth, A. J., et al. 2009, ApJ, 705, 199
 Blandford & McKee (1982) Blandford, R. D., & McKee, C. F. 1982, ApJ, 255, 419
 Blum et al. (1961) Blum J. R., Kiefer J., & Rosenblatt M. 1961, Ann. Math. Stat., 32, 485
 Cackett et al. (2007) Cackett, E. M., Horne, K., & Winkler, H. 2007, MNRAS, 380, 669
 Caplar et al. (2016) Caplar, N., Lilly, S. J., & Trakhtenbrot, B. 2016, arXiv:1611.03082
 Chelouche & Daniel (2012) Chelouche, D., & Daniel E. 2012, ApJ, 747, 62
 Chelouche & Zucker (2013) Chelouche, D., & Zucker S., 2013, ApJ, 769, 124
 Cherepashchuk & Lyutyi (1973) Cherepashchuk, A. M., & Lyutyi, V. M. 1973, Astrophys. Lett., 13, 165
 Collier et al. (1998) Collier, S. J., Horne, K., Kaspi, S., et al. 1998, ApJ, 500, 162
 Czerny et al. (2013) Czerny, B., Hryniewicz, K., Maity, I., et al. 2013, A&A, 556, A97
 Denney et al. (2006) Denney, K. D., Bentz, M. C., Peterson, B. M., et al. 2006, ApJ, 653, 152
 Denney et al. (2009) Denney, K. D., Peterson, B. M., Pogge, R. W., et al. 2009, ApJ, 704, L80
 Denney et al. (2009b) Denney, K. D., Watson, L. C., Peterson, B. M., et al. 2009, ApJ, 702, 1353
 Dietrich et al. (1998) Dietrich, M., Peterson, B. M., Albrecht, P., et al. 1998, ApJS, 115, 185
 Du et al. (2016) Du, P., Lu, K.X., Zhang, Z.X., et al. 2016, ApJ, 825, 126
 Edelson & Krolik (1988) Edelson, R. A., & Krolik, J. H. 1988, ApJ, 333, 646
 Elvis & Karovska (2002) Elvis, M., & Karovska, M. 2002, ApJ, 581, L67
 Fausnaugh et al. (2016) Fausnaugh, M. M., Denney, K. D., Barth, A. J., et al. 2016, ApJ, 821, 56
 Gaskell & Peterson (1987) Gaskell, C. M., & Peterson, B. M. 1987, ApJS, 65, 1
 Gil-Merino et al. (2002) Gil-Merino, R., Wisotzki, L., & Wambsganss, J. 2002, A&A, 381, 428
 Giveon et al. (1999) Giveon, U., Maoz, D., Kaspi, S., Netzer, H., & Smith, P. S. 1999, MNRAS, 306, 637
 Grier et al. (2008) Grier, C. J., Peterson, B. M., Bentz, M. C., et al. 2008, ApJ, 688, 837-843
 Grier et al. (2012) Grier, C. J., Peterson, B. M., Pogge, R. W., et al. 2012, ApJ, 755, 60
 Grier et al. (2013) Grier, C. J., Peterson, B. M., Horne, K., et al. 2013, ApJ, 764, 47
 Haas et al. (2011) Haas, M., Chini, R., Ramolla, M., et al. 2011, A&A, 535, A73
 Hernitschek et al. (2015) Hernitschek, N., Rix, H.W., Bovy, J., & Morganson, E. 2015, ApJ, 801, 45
 Hoeffding (1948) Hoeffding W. 1948, Ann. Math. Stat., 19, 293
 Hönig et al. (2017) Hönig, S. F., Watson, D., Kishimoto, M., et al. 2017, MNRAS, 464, 1693
 Kaspi et al. (2000) Kaspi, S., Smith, P. S., Netzer, H., et al. 2000, ApJ, 533, 631
 Kelly et al. (2009) Kelly, B. C., Bechtold, J., & Siemiginowska, A. 2009, ApJ, 698, 895-910
 Kendall (1938) Kendall M. G., 1938, Biometrika, 30, 81
 Kovačević et al. (2014) Kovačević, A., Popović, L. Č., Shapovalova, A. I., et al. 2014, Advances in Space Research, 54, 1414
 Krolik et al. (1991) Krolik, J. H., Horne, K., Kallman, T. R., et al. 1991, ApJ, 371, 541
 Lempel & Ziv (1976) Lempel, A., & Ziv, J. 1976, IEEE Trans. Inf. Theory, IT-22, 75-81
 Liao et al. (2015) Liao, K., Treu, T., Marshall, P., et al. 2015, ApJ, 800, 11
 Mushotzky et al. (2011) Mushotzky, R. F., Edelson, R., Baumgartner, W., & Gandhi, P. 2011, ApJ, 743, L12
 Netzer et al. (2007) Netzer, H., Lira, P., Trakhtenbrot, B., Shemmer, O., & Cury, I. 2007, ApJ, 671, 1256
 Pancoast et al. (2012) Pancoast, A., Brewer, B. J., Treu, T., et al. 2012, ApJ, 754, 49
 Pelt et al. (1994) Pelt, J., Hoff, W., Kayser, R., Refsdal, S., & Schramm, T. 1994, A&A, 286, 775
 Pelt et al. (1996) Pelt, J., Kayser, R., Refsdal, S., & Schramm, T. 1996, A&A, 305, 97
 Peterson et al. (2002) Peterson, B. M., Berlind, P., Bertram, R., et al. 2002, ApJ, 581, 197
 Peterson et al. (2004) Peterson, B. M., Ferrarese, L., Gilbert, K. M., et al. 2004, ApJ, 613, 682
 Peterson & Wandel (2000) Peterson, B. M., & Wandel, A. 2000, ApJ, 540, L13
 Peterson et al. (1998) Peterson, B. M., Wanders, I., Horne, K., et al. 1998, PASP, 110, 660
 Peterson et al. (2014) Peterson, B. M., Grier, C. J., Horne, K., et al. 2014, ApJ, 795, 149
 Pozo Nuñez et al. (2012) Pozo Nuñez, F., Ramolla, M., Westhues, C., et al. 2012, A&A, 545, A84
 Pozo Nuñez et al. (2013) Pozo Nuñez, F., Westhues, C., Ramolla, M., et al. 2013, A&A, 552, A1
 Reynolds (2000) Reynolds, C. S. 2000, ApJ, 533, 811
 Rybicki & Press (1992) Rybicki, G. B., & Press, W. H. 1992, ApJ, 398, 169
 Rodríguez-Pascual et al. (1997) Rodríguez-Pascual, P. M., Alloin, D., Clavel, J., et al. 1997, ApJS, 110, 9
 Santos-Lleó et al. (1997) Santos-Lleó, M., Chatzichristou, E., de Oliveira, C. M., et al. 1997, ApJS, 112, 271
 Santos-Lleó et al. (2001) Santos-Lleó, M., Clavel, J., Schulz, B., et al. 2001, A&A, 369, 57
 Scargle (1981) Scargle, J. D. 1981, ApJS, 45, 1
 Schawinski et al. (2006) Schawinski, K., Khochfar, S., Kaviraj, S., et al. 2006, Nature, 442, 888
 Shen et al. (2015) Shen, Y., Brandt, W. N., Dawson, K. S., et al. 2015, ApJS, 216, 4
 Shields et al. (2003) Shields, G. A., Gebhardt, K., Salviander, S., et al. 2003, ApJ, 583, 124
 Stirpe et al. (1994) Stirpe, G. M., Winge, C., Altieri, B., et al. 1994, ApJ, 425, 609
 Suganuma et al. (2006) Suganuma, M., Yoshii, Y., Kobayashi, Y., et al. 2006, ApJ, 639, 46
 Tewes et al. (2013) Tewes, M., Courbin, F., & Meylan, G. 2013, A&A, 553, A120
 Timmer & Koenig (1995) Timmer, J., & Koenig, M. 1995, A&A, 300, 707
 Trakhtenbrot & Netzer (2012) Trakhtenbrot, B., & Netzer, H. 2012, MNRAS, 427, 3081
 Uttley et al. (2014) Uttley, P., Cackett, E. M., Fabian, A. C., Kara, E., & Wilkins, D. R. 2014, A&A Rev., 22, 72
 von Neumann (1941) von Neumann, J. 1941, Ann. Math. Statist. 12, 367
 Volonteri et al. (2003) Volonteri, M., Haardt, F., & Madau, P. 2003, ApJ, 582, 559
 Wald & Wolfowitz (1940) Wald A., Wolfowitz J. 1940, Ann. Math. Stat., 11, 147
 Watson et al. (2011) Watson, D., Denney, K. D., Vestergaard, M., & Davis, T. M. 2011, ApJ, 740, L49
 Welsh (1999) Welsh, W. F. 1999, PASP, 111, 1347
 White & Peterson (1994) White, R. J., & Peterson, B. M. 1994, PASP, 106, 879
 Zu et al. (2011) Zu, Y., Kochanek, C. S., & Peterson, B. M. 2011, ApJ, 735, 80
 Zucker (2015) Zucker, S. 2015, MNRAS, 449, 2723
 Zucker (2016) Zucker, S. 2016, MNRAS, 457, L118
Appendix A: A scheme for computing the optimized von Neumann estimator
Here we outline an efficient computing scheme^12 for determining the global minimum of the VN estimator given the renormalization defined in equations 12 and 13. We first note that we can write
^12 See http://chelouche.haifa.ac.il/codes/VonNeumann.m for a Matlab version of the full code, and http://www.pozonunez.de/astro_codes/python/vnrm.py or http://www.pozonunez.de/astro_codes/idl/vnrm.pro for partial Python/IDL implementations.
(A1) 
where a binary flag equals one if the i-th data point originates from the first light curve, and zero otherwise. Using these, we can write equation 14 as
(A2) 
Requiring a minimum for the estimator, at any given time delay, we obtain the following set of two linear equations, where
(A3) 
and
(A4) 
The solutions for the two linear coefficients, inserted back into equation A2, yield the minimal estimator value (see, e.g., Figs. 9, 10), from which the time delay may be deduced at a modest computational cost. A similar, but not identical, approach may reduce the computational load of the optimization scheme for the other estimators discussed in this paper, but is not explored further here.
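Since equations A1-A4 are not reproduced here in full, the following is only a sketch of such a per-lag minimization: each successive difference of the merged series is linear in the renormalization parameters (a, b) applied to the second light curve, so the quadratic objective can be minimized with a generic least-squares solve (np.linalg.lstsq standing in for the explicit two-equation system); all names are ours.

```python
import numpy as np

def optimized_vn_at_lag(t1, f1, t2, f2, tau):
    """At a trial lag tau, linearly renormalize the second light curve
    (f2 -> a*f2 + b) so as to minimize the von Neumann statistic of the
    merged series; the objective is quadratic in (a, b), so the minimum
    follows from a linear least-squares problem."""
    t = np.concatenate([t1, t2 - tau])
    f = np.concatenate([f1, f2])
    from2 = np.concatenate([np.zeros(len(t1), bool), np.ones(len(t2), bool)])
    order = np.argsort(t, kind="stable")
    f, from2 = f[order], from2[order]
    # each successive difference decomposes as D_i = c_i + a*u_i + b*v_i,
    # depending on which series the two neighboring points come from
    m_lo, m_hi = from2[:-1], from2[1:]
    f_lo, f_hi = f[:-1], f[1:]
    u = np.where(m_hi, f_hi, 0.0) - np.where(m_lo, f_lo, 0.0)
    v = m_hi.astype(float) - m_lo.astype(float)
    c = np.where(~m_hi, f_hi, 0.0) - np.where(~m_lo, f_lo, 0.0)
    design = np.column_stack([u, v])
    (a, b), *_ = np.linalg.lstsq(design, -c, rcond=None)
    resid = c + design @ np.array([a, b])
    return np.mean(resid ** 2), a, b
```

Scanning this function over a grid of trial lags and taking the minimum recovers the optimized estimate; the actual scheme fixes the normalization via equations 12 and 13 rather than leaving (a, b) fully free as done here.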
Appendix B: The FR/RSS scheme and the von Neumann estimator
The FR/RSS method of Peterson et al. (2004, which includes refinements of the earlier scheme of Peterson et al. 1998) is used here to estimate the uncertainties on the time delay between the light curves of individual sources. While this method has been shown to provide reasonable, albeit somewhat conservative, lag uncertainties for ICCF schemes, it is yet to be shown to provide reasonable values also for the optimized VN scheme. To this end, we simulated light curves for a set of mock sources (§3), and carried out lag-uncertainty calculations for each using the FR/RSS scheme. We find that, when averaged over the entire mock population, the input lag falls outside the 16%-84% range obtained via the FR/RSS scheme in fewer than the nominal 32% of the cases (for the model parameterization explored here, and assuming random cadence), suggesting that the uncertainties are conservative. Requiring that the input delay fall within the designated uncertainty interval in 68% of the cases (within one standard deviation around the peak for normal distributions) requires that the quoted FR/RSS uncertainties be reduced on average. A comparative analysis of the results of Bentz et al. (2013) and the time delays obtained here supports this conclusion (§4.3). Nevertheless, these results may not hold for individual targets, or for all object parameterizations and light-curve cadences, and we therefore quote the standard FR/RSS uncertainties in Table 1. A more complete treatment of the estimation of the time-lag uncertainty for the VN RM scheme is beyond the scope of the present paper.
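The FR/RSS procedure tested above can be sketched as follows. This is a minimal illustration rather than the exact implementation of Peterson et al. (1998, 2004): `measure_lag` stands in for any lag estimator (e.g., the optimized VN scheme), and the function name and signature are ours.

```python
import numpy as np

def fr_rss_uncertainty(t1, f1, e1, t2, f2, e2, measure_lag,
                       n_real=1000, rng=None):
    """Estimate lag uncertainties via flux randomization (FR) plus
    random subset selection (RSS), quoting 16/84 percentiles of the
    lag distribution over many realizations."""
    rng = np.random.default_rng(rng)
    lags = np.empty(n_real)
    for k in range(n_real):
        # RSS: resample epochs with replacement, keeping each at most once
        i1 = np.unique(rng.integers(0, len(t1), len(t1)))
        i2 = np.unique(rng.integers(0, len(t2), len(t2)))
        # FR: perturb fluxes by their Gaussian measurement errors
        g1 = f1[i1] + rng.normal(0.0, e1[i1])
        g2 = f2[i2] + rng.normal(0.0, e2[i2])
        lags[k] = measure_lag(t1[i1], g1, t2[i2], g2)
    lo, med, hi = np.percentile(lags, [16, 50, 84])
    return med, med - lo, hi - med
```

The coverage test of this appendix then amounts to counting, over many mock light-curve pairs with a known input lag, how often that lag falls inside the returned interval.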
Appendix C: Notes on Individual Sources
Results for the optimized (weighted) VN estimator are provided for each of the sources in Tables 1 and 2, with the estimator values and uncertainty contours shown in Figures 9 and 10. Note that the superior quality of the light curves that characterize objects in the compilation of Bentz et al. (2013, and references therein) results in more regular VN estimators with well-defined troughs compared to the sample of Kaspi et al. (2000). This is also reflected in the width of the error "snakes" around the mean VN values (dotted black curves in Figs. 9, 10), and results in better-defined time delays. Note that the values of the VN estimator at different trial delays are correlated, as are their uncertainties. Also shown (in red) in Figures 9 and 10 are the numbers of pairs of points (Eq. 6) that determine the estimator. The effect of discarding zero-lag pairs of points is evident in all sources, as are the seasonal gaps that might affect some time-delay measurements for the PG quasars (see also Zu et al., 2011) but are less relevant for most AGNs in the compilation of Bentz et al. (2013).
Below we comment on anomalies encountered while analyzing the sources and epochs considered here.

Mrk 79: A second minimum at 200 days is ignored and the search for time delays is restricted to the range days. Using the default search window of we get days.

Mrk 110: The lag quoted in Table 1 was obtained by analyzing the full light curve for this source, while the values quoted from Bentz et al. (2013) are based on the mean values reported there and the uncertainty on their standard deviation (i.e., quoted lag uncertainties from Bentz et al. (2013) do not include the FR/RSS contribution).

PG 0953: The VN time delay is stable so long as the search window is days .

NGC 3227: The VN time delay is stable so long as the search window is narrower than days .

NGC 4151: The time-delay search interval was constrained to the interval days, since there exists a second VN minimum at 20 days whose inclusion leads to a solution of days.

Mrk 202: The time-delay search interval was restricted to the interval days.

PG 1307: We take an approach akin to Kaspi et al. (2000), and find a stable solution within a search interval with a lower bound of days (the upper bound matters little to the end result).

PG 1229: We restrict the search window to the range days, because a second VN minimum exists at days, which is reminiscent of the negative peak seen in the ZDCF for this source (see Fig. 4 in Kaspi et al. 2000).

Mrk 279: We search for lags within the range days, since there exists another VN minimum at days, resulting in days.

PG 1411: The timedelay search window is restricted to days since a second VN minimum exists outside this range.

PG 1426: The timedelay search window is restricted to days since a second VN minimum exists outside this range.

PG 1613: The timedelay search window is restricted to days since a second VN minimum exists outside this range.

3C 390.3: We find that different implementations of the VN algorithm (e.g., weighted vs. unweighted) change the structure of the VN trough in a non-negligible way, which can exceed the formal uncertainties provided by the FR/RSS scheme. Nevertheless, the delay is positive for all implementations. Better data are required to further constrain the lag in this source.

NGC 6814: A local VN minimum exists also at negative delays where the number of pairs contributing to the signal is reduced (see Fig. 10). We therefore restrict the search window to days .

PG 2130: Two very different delays are reported for this source: Table 2 and Figure 9 quote delays for the data set of Kaspi et al. (2000) while Table 1 and Figure 10 quote lags for the Grier et al. (2012) dataset. For more information about potential sampling issues for this source see Grier et al. (2008, 2012).