Continuous testing for Poisson process intensities: A new perspective on scanning statistics


Franck Picard
Univ Lyon, Université Lyon 1, CNRS, LBBE UMR5558,
F-69622 Villeurbanne, France
franck.picard@univ-lyon1.fr

Patricia Reynaud-Bouret
Université Côte d’Azur, CNRS, LJAD, France,
Patricia.Reynaud-Bouret@unice.fr

Etienne Roquain
LPMA, UMR 7599, Sorbonne Universités, UPMC, Univ. Paris 6,
F-75005, Paris, France.
etienne.roquain@upmc.fr
Abstract

We propose a novel continuous testing framework to test the intensities of Poisson processes. This framework allows a rigorous definition of the complete testing procedure, from an infinite number of hypotheses to joint error rates. Our work extends traditional procedures based on scanning windows by controlling the family-wise error rate and the false discovery rate in a non-asymptotic and continuous manner. The decision rule is based on a p-value process that can be estimated by a Monte-Carlo procedure. We also propose new test statistics based on kernels. Our method is applied in Neurosciences and Genomics through the standard test of homogeneity and the two-sample test.

1 Introduction

Continuous testing has recently emerged as a suitable theoretical framework to test a possibly infinite number of hypotheses. It is especially adapted to situations where observed data come from an underlying continuously-indexed random process, as in the classical white noise model (Dümbgen and Spokoiny, 2001), or when data can be modeled by a point process (Blanchard et al., 2014). Our work being motivated by data coming from Neurosciences and Genomics, we focus on point processes, which have been very effective to model occurrences of events through time or space in many fields. Beyond this broad range of applications, point processes have the advantage of being discrete by essence, which allows a drastic reduction of the continuum of hypotheses to a random finite number of tests. This property was exploited by Blanchard et al. (2014) in a particular context, and we elaborate upon this work to propose a continuous testing framework that allows the rigorous definition of the complete testing procedure, from multiple hypotheses to error rates. Our procedure is very general and fully operational thanks to the use of sliding windows. This allows us to revisit the old tale of scanning statistics, and we extend their traditional possibilities and performance by providing theoretical controls for the Family-Wise Error Rate (FWER) as well as for the False Discovery Rate (FDR). We focus on Poisson processes for their remarkable properties, but the main definitions, methodological developments and computational tricks are valid for general point processes.

Scanning windows probably constitute the most common method to perform tests on Poisson processes for the detection of unexpected clusters of observations (Kulldorff, 1997; Chan and Zhang, 2007; Perone Pacifico et al., 2007; Siegmund et al., 2011) or to compare samples (Walther, 2010). The method consists in computing a test statistic on overlapping windows of a given length that are shifted by all possible delays, which has the advantage of avoiding empirical discretization of the data (Grün, 1996; Tuleau-Malot et al., 2014). The decision rule is then based on the so-called scan statistic, which refers to the window with the most signal. Therefore these procedures rely on the distribution of the maximum of the test statistics under the full null, which can be either approximated (Naus, 1982; Robin et al., 2005; Fu et al., 2012; Rivera and Walther, 2013; Walther, 2010; Chan and Zhang, 2007) or learned by a randomization procedure (Kulldorff et al., 2005; Kulldorff et al., 2009; Arias-Castro et al., 2015), which is common in standard multiple testing procedures (Westfall and Young, 1993; Romano and Wolf, 2005a). In this work we adopt the conditional testing approach (Loader, 1991; Rivera and Walther, 2013), which has the advantage of getting rid of nuisance parameters. It consists in deriving tests that are conditional on the number of occurrences (for the homogeneity test) or on the positions of occurrences (for the two-sample test). This strategy is also consistent with the strong connections between continuous testing and non-parametric statistics (Blanchard et al., 2014), since the conditional test may depend on the unknown intensity functions of the observed processes. Our strategy is therefore also to go beyond standard parametric approaches, which are mainly based on observed counts or on a likelihood ratio statistic. As we will show, these parametric statistics do not fully exploit the spatial/temporal information that lies within the data, but only use integrated information. Inspired by recent results on non-parametric testing (Gretton et al., 2012; Fromont et al., 2013), we enrich the scan-statistics toolbox with kernel-based statistics to increase the power of detection of sharp alternatives.

Another limitation of current window-based strategies is that they correspond to global procedures that focus on a global null hypothesis. This amounts to asking whether the process is globally homogeneous, or whether the samples are globally identical, which inevitably results in a global yes/no answer. However, with the increasing availability of more and more complex data, there is a need for flexible and local approaches to improve sensitivity to local structures. As we will see, continuous testing allows us to revisit the scanning windows framework, by defining a (possibly) uncountably infinite set of local hypotheses, and most importantly by providing a rigorous framework for defining joint error rates.

Indeed, the well-known statistical challenge that arises when dealing with sliding windows is the appropriate control of Type-I errors, which is made difficult by the induced dependency structure. Current strategies mostly rely on the asymptotic approximation of the scan distribution (Kulldorff, 1997; Chan and Zhang, 2007), which has two advantages: it somewhat avoids the dependency issue by focusing on the scan statistic only, and it is computationally efficient. We rather adopt a non-asymptotic point of view, by considering the complete p-value process, which here becomes a stochastic process. For some particular statistics (the count statistic for instance), this process is easily computed. When the statistic distribution is unknown, we propose a Monte-Carlo procedure to estimate the p-value process. We then propose a min-p procedure in continuous time to control the FWER in a non-asymptotic manner, based on a new computationally efficient double Monte-Carlo scheme. Simulations show that this non-asymptotic procedure outperforms the one of Chan and Zhang (2007) in terms of FWER control.

The control of the False Discovery Rate (FDR), that is the expected proportion of errors among rejections, is more intricate, and attempts are more recent. Perone Pacifico et al. (2004) propose a continuous analogue of the FDR in the context of random fields, where the amount of null hypotheses is assessed by the Lebesgue measure. However, their (continuous) FDR controlling procedure is based upon a FWER guarantee. Their strategy has also been considered in a setting close to our homogeneity framework (Perone Pacifico et al., 2007). Siegmund et al. (2011) also investigated the FDR for scanning statistics, but only under two strong conditions (Poisson distribution for the number of false discoveries and independence between null and alternative statistics), neither of which is satisfied in standard models. Moreover, their procedure eventually reduces to discretized p-values, and is based on the tail approximation of the scan statistic, which is not valid for statistics other than the scan statistic. Thanks to the continuous testing framework, we propose a weighted Benjamini-Hochberg procedure in continuous time for Poisson process testing, which outperforms the procedure of Siegmund et al. (2011) in our simulations. This procedure corresponds to a continuous analogue of the well-known step-up procedure of Benjamini and Hochberg (1995), and extends the results of Blanchard et al. (2014) on continuous FDR to the context of local hypothesis testing.

The present work is inspired by two applications of Poisson processes: the analysis of spike train data in Neurosciences and the analysis of Genomic data coming from Next Generation Sequencing (NGS) technologies. In Neurosciences, Poisson processes are used to model spike trains, that is the succession of action potentials of a given neuron (Pipa et al., 2013). In particular, inhomogeneous Poisson processes constitute a powerful model to understand rapid changes in firing rates induced by some stimuli (Kass et al., 2003). While the estimation of the Poisson process intensity has already been studied in depth (Shimazaki and Shinomoto, 2010), the questions asked in practice are often somewhat different from a pure curve reconstruction matter: does the stimulus affect the spike occurrence process (global hypothesis)? When (locally)? Does it depend on the type of stimulus and, if so, when exactly do the differences occur? In Genomics, next generation sequencing technologies now allow the fine mapping of genomic features along chromosomes, and this spatial organization has become central to better understand genome architecture and regulation. Here we consider the spatial organization of human replication origins, which constitute the starting points of chromosome duplication. To maintain the stability and integrity of genomes, replication origins are under a very strict spatio-temporal control, and part of their positioning has been shown to be associated with cell differentiation (Picard et al., 2014). Their recent mapping on the human genome has allowed the comparative analysis of replication origin maps between cell lines. Unfortunately, there is a lack of statistical framework to compare genomic maps of spatially organized genomic features, and we show how our procedure can be applied to the identification of replication starting sites that are linked to cell differentiation.

This article is organized as follows. First, we define the continuous testing framework, and we show how it can be easily handled in the Poisson process set-up (Section 2), with a particular focus on the test of homogeneity and the two-sample test. We also define the infinite set of local null hypotheses, the continuum of scanning windows to test them and the associated p-value stochastic process. To proceed, and for the sake of simplicity, we first focus on the standard count statistic, which directly gives access to the p-values. Two types of control are then discussed in Section 3: the classical FWER and the more involved control of the FDR in continuous time. Both procedures result in an adjusted p-value process, which makes our method operational and easily implemented. To increase the power of our procedure, we propose in Section 4 local kernel-based statistics. These kernel statistics have non-explicit conditional distributions, so transforming them into p-values requires Monte-Carlo procedures, which we discuss in Section 5, together with a new computationally efficient double Monte-Carlo step to control the FWER. In Section 6 we show through simulations that the resulting procedures are more powerful than existing ones (in particular those based on unconditioned scanning statistics or on discretization). Section 7.2 shows that the procedure works extremely well on experimental data from Neurosciences and Genomics. Our method is available in the form of the contest R-package, and offers many perspectives in the field of continuous testing.

2 The continuous testing framework for Poisson Processes

In the sequel, a point process is a random countable set of points, usually denoted by N. The corresponding point measure is the sum of the Dirac masses at the points of N, and for any interval I, N(I) is the random variable that counts the number of points of N falling in I. We focus on processes on the real line (which can be thought of as the DNA strand in genomic applications and the time line in neuroscience applications), and we consider processes on [0,1] for the sake of simplicity.

2.1 Homogeneity and two-sample conditional tests

In order to propose a unified framework, we use the following notations: we observe a random set of points with a distribution of the form , where is a nuisance parameter, while represents the signal of interest. In both cases, the aim is to test whether is locally null.

Homogeneity test.

We observe , a point process modeled by a heterogeneous Poisson Process on with unknown intensity (non negative and in ) with respect to the Lebesgue measure. We would like to detect the regions that are particularly rich or poor in occurrences of the process, with respect to its mean/stationary behavior. To do so, we compare the intensity of the observed process to the mean behavior of the intensity

Then we rewrite the distribution of in terms of two different parameters and , with , defined by

The signal of interest is , whereas may be seen as a nuisance parameter because its value is unknown under the null hypothesis. The homogeneity null hypothesis is therefore reduced to ””: it is a composite hypothesis since, under the null hypothesis, the distribution can still vary according to . As an easier particular case, one can also assume that is known. The problem is then a classical goodness-of-fit test and the null hypothesis is not composite anymore.

Two-sample test.

We observe here , where and are two heterogeneous independent Poisson Processes on with intensities and (non negative and in ) with respect to the Lebesgue measure. In the following, for any interval , we denote , and by a slight abuse of notation, . Our aim here is to detect the regions where (or, alternatively, ).

To re-parametrize the distribution of , one can provide a one-to-one correspondence between and the couple , where is the joint process and where is a set of marks with values in such that if actually belongs to and if actually belongs to (Kingman, 1993). An important point here is to note that is actually a Poisson process with intensity and that, conditionally on , the ’s are independent variables with distribution

with

Once again, stands for the amount of signal in the data and is a nuisance parameter. The two-sample null hypothesis is then once again summarized by ””. Since is an unknown function, this hypothesis is composite. This case is thus much more complex than the homogeneity case, and calls for a non-parametric procedure.

Conditional distributions.

To avoid, in both cases, the use of specific unknown values of to derive the distribution under the null hypothesis, our strategy is based on conditional testing (Loader, 1991; Rivera and Walther, 2013). Indeed, for the homogeneity test, conditionally on the total number of points , the observed process is actually an i.i.d. sample of a density that does not depend on (Loader, 1991; Rivera and Walther, 2013). Similarly, in the two-sample case, given the positions of the joint process , the remaining variability under the null hypothesis lies in the distribution of the marks, which only depends on (and not on anymore). Thus we introduce the variable , such that in the homogeneity case and in the two-sample case, and our testing procedure relies on the distribution of conditionally on . Since the conditional distribution of given does not depend on , it is denoted by in the sequel, whereas refers to the expectation w.r.t. .

2.2 An infinite set of local null hypotheses

The continuous testing procedure developed here aims at determining where the signal is non-zero. The classical strategy consists in testing the full null hypothesis, which corresponds to the single question “Is the signal null over [0,1]?”. Therefore a unique hypothesis is tested and the test produces a binary answer. Here, we rather consider local null hypotheses such as

In practice, one-sided tests of homogeneity are usually considered to detect regions that are too rich in occurrences, whereas two-sided tests of the two-sample problem seem more relevant when there is a complete equivalence between both processes and from a modeling point of view.

Since information at the single point level seems difficult to catch in general, we focus on finding intervals or unions of intervals on which the null hypothesis holds. We denote , the null hypothesis corresponding to . As a particular case, is the full null hypothesis and corresponds to , where is the null function on .

2.3 A continuum of scanning windows

Our procedure is based on scanning windows with a given resolution that is fixed. We consider here the whole continuum of windows of this given length, meaning that we perform a whole continuum of tests, one for each window. To avoid any confusion in the sequel, always denotes a window center, whereas always denotes a point, that is, a possible value for the observations . Thus we have (while the set of possible values for is ). The scanning windows are denoted by for any . Therefore, a typical relationship between and is . In the sequel, all the proposed multiple testing procedures are actually based on single tests of the null hypothesis , for all possible window centers , which is different from the null hypotheses given by for in . The continuous testing procedure aims at accepting the set of ”true windows”, indexed by their centers:

formally depends on , but is denoted by in the sequel, for short.

2.4 A first simple construction of the p-value process

Homogeneity test.

Consider as an individual test statistic , that is, the number of points in the window of center . Then, classically, a one-sided test based on this statistic rejects if is large enough. Under , it is easy to see that the conditional distribution of given (or here) is a Binomial variable with parameters and . Hence, if we denote by the function such that with , then

can be seen as a p-value associated with the classical one-sided single test of . Since is explicit and since the number of points, as a function of , can be computed very efficiently, one has access to a p-value process at a very low computational cost.

If is known, one does not need to use a conditional distribution. The unconditional distribution of is a Poisson distribution with mean , under . If with , then

(1)

also defines a p-value process. Note that this set-up is quite close to the one of Chan and Zhang (2007).
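As an illustration, the following minimal Python sketch evaluates the count-based p-value process of the homogeneity test on a grid of window centers. The function name, the evaluation grid, and the convention that windows of half-length eta are fully included in [0,1] are illustrative assumptions; the conditional Binomial tail matches the construction above, and the known-intensity variant would simply replace the Binomial survival function by a Poisson one.

```python
import numpy as np
from scipy import stats

def pvalue_process_homogeneity(points, eta, n_grid=2000):
    """Count-based p-value process for the homogeneity test on [0, 1] (a sketch).

    Conditionally on the total number of points n, the count in a window of
    length 2*eta is Binomial(n, 2*eta) under the local null, so the one-sided
    p-value is the binomial survival probability at the observed count.
    """
    points = np.sort(np.asarray(points, dtype=float))
    n = len(points)
    centers = np.linspace(eta, 1.0 - eta, n_grid)            # window centers x
    lo, hi = centers - eta, centers + eta                     # windows [x - eta, x + eta]
    counts = np.searchsorted(points, hi, side="right") - \
             np.searchsorted(points, lo, side="left")         # number of points per window
    pvals = stats.binom.sf(counts - 1, n, 2.0 * eta)          # P(Bin(n, 2*eta) >= observed count)
    return centers, counts, pvals
```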

Two-sample test.

Consider now as individual test statistic . A classical one-sided test based on this statistic also rejects if is large enough. Under , it is easy to see that the conditional distribution of given (the joint process here) is the same as the conditional distribution of given . It is a Binomial variable with parameters and . The corresponding p-value associated with the classical one-sided single test of can then be defined by:

(2)

This process depends on and , and once again can be computed very easily.
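A similar sketch, under the same hypothetical conventions as above, gives the count-based p-value process of the two-sample test, using the conditional Binomial(total count in the window, 1/2) distribution of the first-sample count described above.

```python
import numpy as np
from scipy import stats

def pvalue_process_two_sample(points1, points2, eta, n_grid=2000):
    """Count-based p-value process for the one-sided two-sample test (a sketch).

    Given the joint process, the number of first-sample points in a window is
    Binomial(total count in the window, 1/2) under the local null.
    """
    points1 = np.sort(np.asarray(points1, dtype=float))
    points2 = np.sort(np.asarray(points2, dtype=float))
    centers = np.linspace(eta, 1.0 - eta, n_grid)
    lo, hi = centers - eta, centers + eta
    n1 = np.searchsorted(points1, hi, side="right") - np.searchsorted(points1, lo, side="left")
    n2 = np.searchsorted(points2, hi, side="right") - np.searchsorted(points2, lo, side="left")
    pvals = stats.binom.sf(n1 - 1, n1 + n2, 0.5)   # P(Bin(n1 + n2, 1/2) >= observed n1)
    return centers, pvals
```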

2.5 Making continuous testing computationally tractable in general

Even if the method is based on a continuum of tests, the two previous examples are computationally tractable because they are based on , the number of points within windows , which can be computed very efficiently and which is piece-wise constant as a function of . In full generality, the number of points may not be informative enough, and to gain power, one can consider more general single test statistics (see Section 4) that only depend on the composition of each window, that is on the function

which records the random positions of points in that lie in . Since the composition is piece-wise constant, with changes only each time a point leaves or enters the scanning window, the single test statistic that only depends on the composition is a piece-wise constant function on a random finite partition . Therefore, the continuum of tests can be reduced to a random finite number of tests, one for each segment defined by . An example of such a partition is provided in Fig. 1.
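The following small sketch illustrates how such a partition can be obtained in practice: the composition changes only at centers where a point enters or leaves the window, so the breakpoints are the observed points shifted by plus or minus the window half-length. Restricting the centers to [eta, 1 - eta] is an assumption here, since the exact window convention is not reproduced above.

```python
import numpy as np

def window_partition(points, eta):
    """Breakpoints of the random partition of window centers (a sketch).

    The composition of the window centered at x changes only when a point
    enters or leaves it, i.e. at centers x = x_i - eta or x = x_i + eta.
    Consecutive breakpoints delimit segments with constant composition.
    """
    points = np.asarray(points, dtype=float)
    breaks = np.concatenate(([eta], points - eta, points + eta, [1.0 - eta]))
    breaks = breaks[(breaks >= eta) & (breaks <= 1.0 - eta)]   # keep admissible centers only
    return np.unique(breaks)
```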

Figure 1: Construction of the p-value process for the two-sample test. Occurrences from and are merged to form the joint process , and Rademacher marks are introduced as labels. Each occurrence has a span that is used to create the partition whose elements constitute the centers of the testing windows , . This construction of the windows ensures that the composition of the observed point process is constant between two consecutive centers, in terms of both the number and the relative positions of points. The p-value process is then a càdlàg process with jumps defined by partition .

2.6 General definition of the p-value process

In the sequel, we assume that a test statistic is given for each window , which only depends on the composition of the window. Because Poisson processes enjoy nice independence properties between disjoint sets, it is easy to show the following property, usually referred to as subset pivotality (Westfall and Young, 1993; Romano and Wolf, 2005a).

For all , the conditional distribution of given is the same whether or .

This roughly says that the conditional distribution of given is not modified by the truth or falsehood of the remaining hypotheses. In particular, the conditional distribution of given under is the same as the one under the full null .

Note that in full generality the conditional distribution of given depends on but also on the composition of the window centered on . Therefore we define:

and the associated p-value process by

(3)

It satisfies for all , for all ,

(4)

Moreover, since is a deterministic function of (conditionally on ), property also holds for , and not only for .

3 False positive control in continuous time

Except when specifically mentioned, the results presented here hold for any test statistic that only depends on the composition of the window (as defined in Section 2.5). The associated p-value processes are as described previously.

A rejection set is a subset of , which is a function of the p-value process . Classically, it corresponds to small values of this process, and a typical example is given by

for a given threshold potentially depending on the data. Once the rejection set is given, the set of accepted windows is defined by its complement

3.1 Definition of the multiplicity error rates

If the procedure behaves properly, the set should be a good approximation of , or similarly should be a good approximation of . To evaluate the quality of a procedure , we focus on false positive windows that correspond to elements of . The size of can be gauged in many ways. In this work, we focus on the most famous criteria, namely the Family-Wise Error Rate (FWER) and the False Discovery Rate (FDR).

From a family-wise point of view, one wants to avoid any false positive with a large probability, which corresponds to a control of the quantity

(5)

While a FWER control provides a strong assessment of the amount of false positives, this approach generally leads to conservative procedures. A more balanced view is to allow some false positives, up to a pre-specified fraction of the total amount of positives. This can be achieved by controlling

(6)

with . Importantly, since the setting is continuous here, the “number” of (false) positives is quantified above by the Lebesgue measure, , on .

3.2 A min-p procedure to control the FWER in continuous time

We would like to find a threshold such that:

For all ,

with defined in (3). The set being unknown, one can use some rough upper-bound. We have, thanks to subset pivotality , that for any (possibly depending on the conditioning variable ),

Since , we obtain

The distribution of under and given is in both cases (homogeneity and two-sample) known/observable once the variable is fixed, so it is possible to calibrate an adequate threshold . However, due to the classical subtleties of inverting a non-continuous c.d.f., it is in fact more powerful to use adjusted p-values, defined by

(7)

with being the conditional c.d.f. of under the full null hypothesis, that is .

Theorem 3.1.

Let and given by (7). Then the procedure defined by

(8)

has a FWER controlled by .

Note that , which is a continuous version of the classical min-p procedure, is consistent with the case where there is only one value in the p-value process because the resolution is equal to 1. In this extreme case, is reduced to one point, the p-value is exactly the adjusted p-value, and the control of is then an obvious consequence of property (4).

Remark 3.2.

The previous procedure is usually referred to as a single-step procedure. We also propose refinements by using a step-down procedure in the Supplementary Material.

Remark 3.3.

Unfortunately, even for the classical homogeneity or two-sample tests, the c.d.f. is not explicit and needs to be estimated. This is the purpose of Section 5, in which we also derive the theoretical results, needed when this distribution is approximated by Monte-Carlo, that make the method completely operational with strong theoretical guarantees.

3.3 Link with the scan statistic framework.

In the case of the homogeneity test, the p-value process does not depend on and, because the function is non-increasing, we have

(9)

Hence, rejecting for small values of the infimum of the p-value process (similar to standard minP procedures), or rejecting for large values of the supremum of the test statistic (i.e. the scan statistic, often called the maxT procedure), are strictly equivalent (Dudoit et al., 2003; Daley and Vere-Jones, 2008). Compared with the work of Chan and Zhang (2007), we mainly add two ingredients: () the distribution we consider is conditional and exact, whereas theirs is asymptotic and still depends on the unknown nuisance parameter (even if they provide a plug-in procedure in practice to avoid this issue); () thanks to Property , we are able to prove the FWER control, whereas their control stands under the full null only.

This minP-maxT equivalence is valid for the homogeneity test only; it no longer holds as soon as the c.d.f. of the infimum of the p-value process under the full null depends on the composition of the window. This is typically the case for the two-sample test, where depends on , so that Eq. (9) cannot hold anymore. In this case, the minP framework offers the advantage of putting all tests on the same scale (Dudoit et al., 2003; Daley and Vere-Jones, 2008).

3.4 A weighted BH procedure to control the FDR in continuous time

To increase the detection capability, we propose a second procedure based on the control of the FDR criterion (6).

Heuristic.

We would like to find a threshold (see (3)) such that:

Following Blanchard et al. (2014), let us first explain heuristically how one can reach this goal in a continuous framework. By Fubini’s theorem, ,

Then, choosing such that

and treating as if it were deterministic leads to

which leads to

by using the p-value property (4). The above heuristic suggests choosing by using

Weighted-BH procedure.

Quantile is random (even conditionally on ) because it depends on the observed p-value process. Nevertheless, since the p-value process is piece-wise constant in our framework, finding threshold is very simple. We propose a step-up algorithm inspired by the famous weighted BH threshold (Benjamini and Hochberg, 1997), with the essential difference that our algorithm is based on random weights. Recall that the processes and therefore are piece-wise constant on the random partition (, ). Therefore

This implies that the threshold defined in (3.4) can be derived as follows:

  • Compute the weights,

  • Denote , ,

and order these p-values in increasing order for an appropriate permutation of ;

  • Consider (with if the set is empty);

  • Compute as

  • Define

By contrast with the classical BH step-up procedure, each p-value has its own weight, which is not uniform but adaptive to the data. This weighting means that the quantity of interest is not the number of tests, i.e. the cardinality of the partition , but the proportion of ”time” that the p-value process spends at a particular value.
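A minimal sketch of this weighted step-up, assuming that the weight of each segment of the partition is its Lebesgue measure normalized by the total measure of the center set (the exact normalization used in the displayed formulas above is not reproduced here):

```python
import numpy as np

def weighted_bh_threshold(pvals, lengths, alpha):
    """Continuous-time weighted BH step-up on a piece-wise constant p-value process (a sketch).

    pvals[k]   : p-value on the k-th segment of the partition
    lengths[k] : Lebesgue measure of the k-th segment (its random weight)
    Returns the rejection threshold and a boolean mask of rejected segments.
    """
    pvals = np.asarray(pvals, dtype=float)
    w = np.asarray(lengths, dtype=float) / np.sum(lengths)   # normalized segment weights
    order = np.argsort(pvals)
    p_sorted = pvals[order]
    cum_w = np.cumsum(w[order])                  # cumulative weight of the k smallest p-values
    ok = p_sorted <= alpha * cum_w               # step-up condition
    if not ok.any():
        return 0.0, np.zeros_like(pvals, dtype=bool)
    k_star = np.max(np.nonzero(ok)[0])           # largest index satisfying the condition
    threshold = p_sorted[k_star]
    return threshold, pvals <= threshold
```

With uniform weights (all segments of equal length), this reduces to the classical BH step-up.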

Remark 3.4.

It is worth noting that , where stands for the adjusted p-value

Theoretical result.

Our procedure is based on a heuristic that has been made fully theoretically valid by Blanchard et al. (2014) under suitable assumptions of measurability and positive dependence (namely, a finite-dimensional positively regression dependent p-value process). The difficulty comes from the interplay between the numerator and the denominator within the expectation in (3.4) via , because is a random variable. While the measurability issue is not critical here (by the piece-wise constant property), the positive dependence condition seems difficult to check in general. For instance, this condition is known to exclude the two-sided test statistic (Yekutieli, 2008). We prove here that this positive dependence condition holds in the one-sided testing setting (both for the homogeneity test with a known and the two-sample test), by using the peculiar properties of Poisson processes.

Theorem 3.5.

For all , for a p-value process given either by Eq. 1 or Eq. 2, and for one-sided null hypotheses, the corresponding procedure satisfies

This result is proved in the Supplementary File. Theorem 3.5 hence theoretically justifies the use of a weighted step-up procedure to control the continuous FDR. However, it is only valid for simple versions of our test statistic, which ensure the positive dependence structure. For more general cases, Blanchard et al. (2014) proposed a modification of the step-up algorithm. This modification consists in introducing a function of a specific form, such that the quantity is replaced by the smaller quantity . However, this modification is reported to be quite conservative (Blanchard and Roquain, 2008). Furthermore, our simulation results show that the FDR control (3.4) is achieved in most of the situations we studied, even for more complex p-value processes than the ones considered here (homogeneity test with known intensity, and two-sample test).

Note that the homogeneity test with known intensity in Theorem 3.5 is similar to Corollary 4.5 in Blanchard et al. (2014). However, a subtle distinction concerns the way the null hypotheses are defined (windows versus points). A consequence is that our approach does not rely on any knowledge regarding the regularity of the underlying process intensity.

4 Kernel-based statistics for sliding windows

4.1 On the limitations of the count-based statistic

Let us focus on the two-sample test to illustrate our motivations. In this context, our objective is to detect local differences between the unknown functions and based on individual tests defined on a set of windows. As a first approximation, this can be achieved by comparing with , the counts of occurrences on window , which are distributed as a couple of independent Poisson variables. However, given that (resp. ) is an approximation of (resp. ), the count-based strategy can be seen as testing the equality of the integrals of the functions rather than the equality of the functions themselves. Hence the induced testing procedure may lack accuracy.

Therefore we developed a kernel-based strategy to propose single test statistics that do not depend solely on the number of occurrences in a window, but consider its whole composition as defined in Section 2.5. The interest of kernel-based statistics is that they account for the physical distances between occurrences, which explains the gain in performance.

4.2 The kernel vision in the two-sample case

The kernel framework for testing proposed by Gretton et al. (2012) and Fromont et al. (2013) has been especially designed to separate functions, and not just their integrals. It is based on the estimation of a distance between the functions and (in a RKHS, or in ). In particular, Fromont et al. (2013) show that these test statistics are minimax with respect to various smoothness classes, and in this sense, powerful against some alternatives that cannot be distinguished by count-based tests.

In the following we consider translation-invariant kernels, denoted by , and in practice we restrict ourselves (see Sections 6 and 7) to the Gaussian kernel, which has been shown to be the most powerful in simulations (Fromont et al., 2013). The shape of the statistic is then given by

where is the joint process and the ’s are the marks (see Section 2.1). In particular, it only depends on the composition of the window. To understand why this statistic is meaningful, let us compute its expectation. By classical computations for Poisson processes,

If denotes the convolution product, we end up with

In particular if is a convolution kernel with bandwidth we obtain, under classical assumptions, that

This short derivation illustrates the motivation for using such a kernel-based statistic, as a good estimate of a distance between intensity functions. The one-sided version of this statistic, as well as the homogeneity case, are presented in the Supplementary Material. Note that, from a computational point of view, the complexity for computing is in . However, thanks to the piece-wise constant property of on partition , the continuum can be computed using the in/out points of each scanning window, which drastically reduces the global computational burden.
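For illustration, here is a sketch of a Gaussian-kernel two-sample statistic for a single window, of the form a sum over pairs i != j of eps_i eps_j k(x_i - x_j); this matches the spirit of the statistics of Fromont et al. (2013), but the exact normalization used in the paper is an assumption here, and the function name is hypothetical.

```python
import numpy as np

def kernel_two_sample_stat(positions, marks, bandwidth):
    """Gaussian-kernel two-sample statistic for one window (a sketch).

    positions : points of the joint process falling in the window
    marks     : +1 / -1 labels telling which sample each point comes from
    Computes sum_{i != j} eps_i * eps_j * k(x_i - x_j) with a Gaussian kernel.
    """
    x = np.asarray(positions, dtype=float)
    eps = np.asarray(marks, dtype=float)
    diff = x[:, None] - x[None, :]
    gram = np.exp(-diff ** 2 / (2.0 * bandwidth ** 2))   # Gaussian kernel k(x_i - x_j)
    np.fill_diagonal(gram, 0.0)                          # exclude the i == j terms
    return eps @ gram @ eps
```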

5 Monte-Carlo procedures for p-value estimation and adjustment

Procedure   Test statistic   p-value process   Full procedure
min-p       count            exact             1-step MC
min-p       kernel           1-step MC         2-step MC
wBH         count            exact             exact
wBH         kernel           1-step MC         1-step MC

Table 1: Summary of the required computations to implement the continuous tests in practice for the homogeneity and two-sample tests.

5.1 When the distribution of the single test statistic is explicit

For tests based on the count statistic, the p-value process is explicitly computable, so there is no need for Monte-Carlo in this case. Moreover, since our weighted BH procedure (Section 3.4) is completely explicit, can be easily computed in practice. In this situation we have already proved the FDR control for some particular cases, and we will show that it holds in more general cases by using simulations (Section 6).

However, the min-p procedure is not explicit since it requires the conditional distribution of given under the full null hypothesis (Section 3.2). Therefore we need a Monte-Carlo procedure to approximate this distribution. This can be achieved while maintaining a valid FWER control by following Romano and Wolf (2005a), as detailed in Section 5.4 below.

5.2 When the distribution of the single test statistic is not explicit

This concerns kernel-based tests (Section 4), for which the p-value process is not even computable. One can again use the estimated p-values developed by Romano and Wolf (2005b), since their result shows that they satisfy (4) and can be easily computed in practice. However, there is no theoretical FDR guarantee, and we will only show the FDR control by simulations in Section 6. Consequently, to compute , the weighted-BH procedure only requires one Monte-Carlo step in this situation.

As for the min-p procedure (Section 3.2), it requires an additional Monte-Carlo step. To this end, a classical and sufficient method is to perform an approximation using two independent Monte-Carlo samples. We detail this approach for the homogeneity case in the Supplementary Material. For the two-sample case, the Monte-Carlo procedure is quite demanding and is close in spirit to bootstrap methods. In this case, we developed a double Monte-Carlo procedure, which only requires one Monte-Carlo sample and still guarantees a controlled FWER (Section 5.4).

5.3 Estimation of the p-value process by Monte-Carlo

Recall that this estimation is required when using test statistics different from the statistics based on counts. The full procedures are summarized in Table 1.

Homogeneity test.

Here, the conditioning variable is the same for all scanning windows and the distribution of under does not depend on . Let us focus on the first window and simulate points inside this window under . Hence we consider random samples of the number of points such that

and then we draw

to sample positions of a homogeneous Poisson process inside window . For each , one can compute, thanks to the positions of the points inside , the value of the test statistic for the first window, which we denote . Since all the statistics are translation invariant in distribution under the null, this sample has the same distribution as under for every window center .

Therefore, whatever , one easily has access to , i.i.d. variables with the same distribution as under for every . Then, following Romano and Wolf (2005b), we define:

which amounts to using the empirical version of the function defined in Eq. 3 over a sample , where is the observed statistic.
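A sketch of this first Monte-Carlo step for the homogeneity case, assuming windows of length 2*eta and a generic single-test statistic stat_fn (a hypothetical name) that only depends on the positions inside the window:

```python
import numpy as np

def mc_null_stats_homogeneity(stat_fn, n_total, eta, n_mc=1000, rng=None):
    """Monte-Carlo sample of the single-test statistic under the null (homogeneity case, a sketch).

    Conditionally on the total number of points n_total, the number of points in a
    window of length 2*eta is Binomial(n_total, 2*eta) and their positions are
    uniform inside the window; translation invariance lets one sample in a single
    reference window and reuse the draws for every window center.
    """
    rng = np.random.default_rng(rng)
    stats_null = np.empty(n_mc)
    for b in range(n_mc):
        k = rng.binomial(n_total, 2.0 * eta)              # number of points falling in the window
        positions = rng.uniform(0.0, 2.0 * eta, size=k)   # uniform positions inside it
        stats_null[b] = stat_fn(positions)
    return stats_null
```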

Two-sample test.

We label the observed marks so that they constitute the first term of a -sample of marks, completed by independent draws of i.i.d. Rademacher sets, , for . The conditional distribution given of these Rademacher processes is:

that is the distribution that the observed marks ’s should have under the full null hypothesis. As previously, the Monte-Carlo approximated conditional -valueof window is defined by:

(10)

where is the observed statistic and where is the value of the test statistic on , computed with the resampled marks and the fixed joint process . Let us underline that, because in the two-sample case the partition only depends on and does not vary with the resampling scheme , the processes , for , and therefore the process , are piece-wise constant on .
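A sketch of the corresponding one-step Monte-Carlo p-value for a single window of the two-sample test, with the joint process fixed and the marks redrawn as i.i.d. Rademacher signs; stat_fn is a hypothetical statistic (for instance the kernel statistic sketched in Section 4), and the observed realization is included in the count as in Romano and Wolf (2005b).

```python
import numpy as np

def mc_pvalue_two_sample(stat_fn, positions, marks, n_mc=1000, rng=None):
    """Monte-Carlo p-value for one window of the two-sample test (a sketch)."""
    rng = np.random.default_rng(rng)
    t_obs = stat_fn(positions, marks)
    t_sim = np.empty(n_mc)
    for b in range(n_mc):
        resampled = rng.choice([-1.0, 1.0], size=len(positions))  # Rademacher marks under the null
        t_sim[b] = stat_fn(positions, resampled)
    return (1.0 + np.sum(t_sim >= t_obs)) / (n_mc + 1.0)
```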

5.4 Two-step Monte-Carlo method for the min-p procedure.

Recall that when using the min-p procedure with a p-value process that is not explicit, we need two Monte-Carlo approximations.

Homogeneity test.

In that case, the distribution of the test statistics under the null is independent of the window centers, while the distribution of the minimum of the p-values relies on the observations over all windows. Hence, it is quite natural to perform two separate Monte-Carlo schemes for these two steps.

Two-sample test.

In the two-sample case, the trick is that one can use the same sample of ’s as the one used in (10) for both approximations. More precisely, the p-values corresponding to the simulated process can be computed as

that is, we do the same computation as in Eq. 10 except that we replace the observed statistic by the simulated one , thereby computing a ”resampled” p-value process corresponding to . Then, let us define, for all

The corresponding adjusted Monte-Carlo p-values are then given by

Finally the rejection set is given by

The previous double Monte-Carlo procedure is based on the use of the same random signs , and not on two sets of different simulations. Indeed the ’s are used twice: first to estimate the p-values of the observed process, and a second time to compute the p-values of the simulated realizations, leading to the adjusted p-values. This procedure has the double advantage of sparing computational resources while enjoying the same control property, as shown hereafter.
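A sketch of this double Monte-Carlo adjustment, assuming the statistics have already been evaluated on each segment of the partition, both for the observed marks and for B resampled mark vectors; placing the observed realization as the first term of the (B+1)-sample follows the description above, while the vectorized ranking is an implementation choice.

```python
import numpy as np

def minp_adjusted_pvalues(t_obs, t_sim):
    """Double Monte-Carlo min-p adjustment (a sketch of the idea in Section 5.4).

    t_obs : shape (n_windows,), observed statistics on each window center
    t_sim : shape (B, n_windows), statistics recomputed with B resampled mark
            vectors on the same (fixed) joint process and windows
    Returns the estimated p-value process and the adjusted p-values per window.
    """
    t_all = np.vstack([t_obs[None, :], t_sim])     # (B+1, n_windows), observed realization first
    B1 = t_all.shape[0]                            # B + 1
    # p-value processes: rank of each statistic within the (B+1)-sample, window by window
    p_all = (t_all[None, :, :] >= t_all[:, None, :]).sum(axis=1) / B1
    p_obs = p_all[0]                               # estimated p-value process of the observed data
    min_p_sim = p_all[1:].min(axis=1)              # minimum over windows, for each resampled draw
    # adjusted p-value: how often a resampled minimum is at least as small as the observed p-value
    p_adj = (1 + (min_p_sim[:, None] <= p_obs[None, :]).sum(axis=0)) / B1
    return p_obs, p_adj
```

The rejection set at level alpha is then obtained by thresholding p_adj, mirroring the definition of the rejection set above.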

Theorem 5.1.

Let . For the two-sample case, the continuous testing procedure defined by (5.4) satisfies

This result holds for any choice of single test statistic that only depends on the composition of the window.

6 Simulations

6.1 Homogeneity test

For a fixed parameter , we simulate a Poisson process corresponding to the following piecewise constant signal (illustrated in Figure 9, Supp. Material):

with

and , , the positions on which is not null (or null, respectively). We consider this particular shape of to ensure that .

For a fixed background intensity parameter , we therefore simulate 1000 times a Poisson process of intensity . Parameter is used to tune the distance to homogeneity (which reflects the difficulty of the test), with standing for a situation where almost every position is under the homogeneous regime, and standing for a situation with high separability. Parameter controls the number of occurrences and varies in . To compare with other scanning procedures, we opt for one-sided tests (the alternative corresponding to too large ), with fixed window size . More precisely, we compare our count-based min-p and count-based weighted BH procedures in continuous time with asymptotic approaches based on the scan statistic that control the FWER (Chan and Zhang, 2007) and the FDR (Siegmund et al., 2011).
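For reproducibility of this kind of experiment, a generic thinning sketch for simulating an inhomogeneous Poisson process on [0,1] is given below; the intensity function and its upper bound are placeholders, since the exact piecewise constant signal is defined by the (non-reproduced) formulas above.

```python
import numpy as np

def simulate_inhomogeneous_poisson(intensity, lam_max, rng=None):
    """Simulate an inhomogeneous Poisson process on [0, 1] by thinning (a sketch).

    intensity : vectorized function t -> lambda(t), assumed bounded by lam_max on [0, 1]
    """
    rng = np.random.default_rng(rng)
    n = rng.poisson(lam_max)                              # candidate points of a homogeneous process
    candidates = rng.uniform(0.0, 1.0, size=n)
    keep = rng.uniform(0.0, 1.0, size=n) < intensity(candidates) / lam_max   # thinning step
    return np.sort(candidates[keep])
```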

Figure 2 clearly shows that our min-p procedure in continuous time controls the FWER even for moderate sample sizes, whereas the scan statistic procedure requires a strong signal (, or a higher number of occurrences (, not shown). This trend is even clearer for the FDR control, which is not ensured by the method of Siegmund et al. (2011). Note that, contrary to continuous testing, this method requires a discretization step on which its performance depends. In this case we choose fixed-length non-overlapping windows to compute their procedure.

Figure 2: Comparison of asymptotic vs. non-asymptotic methods for the control of Type-I error rates. ChanZhang: asymptotic approximation of the scan statistic distribution proposed by Chan and Zhang (2007); SZY: approximation of the FDR proposed by Siegmund et al. (2011) with 50 (SZY.50) and 500 (SZY.500) windows of identical length. FWER and FDR refer to Eqs. 5 and 6.

6.2 Two-sample test

To test the differences between intensities and , we sample a joint point process over with constant intensity . Then we sample the labels given the occurrences of the joint process , in an independent manner, such that

Thus positions with (resp. ) stand for points of process (resp. ) with intensity (resp.