Continuous testing for Poisson process intensities:
A new perspective on scanning statistics
Abstract
We propose a novel continuous testing framework to test the intensities of Poisson processes. This framework allows a rigorous definition of the complete testing procedure, from an infinite number of hypotheses to joint error rates. Our work extends traditional procedures based on scanning windows by controlling the family-wise error rate and the false discovery rate in a non-asymptotic and continuous manner. The decision rule is based on a p-value process that can be estimated by a Monte-Carlo procedure. We also propose new test statistics based on kernels. Our method is applied in Neurosciences and Genomics through the standard test of homogeneity and the two-sample test.
1 Introduction
Continuous testing has recently emerged as a suitable theoretical framework to test a possibly infinite number of hypotheses. It is especially adapted to situations where the observed data come from an underlying continuously-indexed random process, as in the classical white noise model (Dümbgen and Spokoiny, 2001), or when data can be modeled by a point process (Blanchard et al., 2014). Our work being motivated by data coming from Neurosciences and Genomics, we focus on point processes, which have been very effective to model occurrences of events through time or space in many fields. Beyond their broad range of applications, point processes have the advantage of being intrinsically discrete, which allows a drastic reduction of the continuum of hypotheses to a random finite number of tests. This property was exploited by Blanchard et al. (2014) in a particular context, and we elaborate upon this work to propose a continuous testing framework that allows the rigorous definition of the complete testing procedure, from multiple hypotheses to error rates. Our procedure is very general and fully operational thanks to the use of sliding windows. This allows us to revisit the old tale of scanning statistics, and we extend their traditional possibilities and performance by providing theoretical controls for the Family-Wise Error Rate (FWER) as well as for the False Discovery Rate (FDR). We focus on Poisson processes for their remarkable properties, but the main definitions, methodological developments and computational tricks are valid for general point processes.
Scanning windows probably constitute the most common method to perform tests on Poisson processes, either to detect unexpected clusters of observations (Kulldorff, 1997; Chan and Zhang, 2007; Perone Pacifico et al., 2007; Siegmund et al., 2011) or to compare samples (Walther, 2010). The method consists in computing a test statistic on overlapping windows of a given length shifted by all possible delays, which has the advantage of avoiding an empirical discretization of the data (Grün, 1996; Tuleau-Malot et al., 2014). The decision rule is then based on the so-called scan statistic, which refers to the window with the most signal. Therefore these procedures rely on the distribution of the maximum of the test statistics under the full null, which can be either approximated (Naus, 1982; Robin et al., 2005; Fu et al., 2012; Rivera and Walther, 2013; Walther, 2010; Chan and Zhang, 2007) or learned by a randomization procedure (Kulldorff et al., 2005; Kulldorff et al., 2009; Arias-Castro et al., 2015), as is common in standard multiple testing procedures (Westfall and Young, 1993; Romano and Wolf, 2005a). In this work we adopt the conditional testing approach (Loader, 1991; Rivera and Walther, 2013), which has the advantage of getting rid of nuisance parameters. It consists in deriving tests that are conditional on the number of occurrences (for the homogeneity test) or on the positions of occurrences (for the two-sample test). This strategy is also consistent with the strong connections between continuous testing and nonparametric statistics (Blanchard et al., 2014), since the conditional test may depend on the unknown intensity functions of the observed processes. Our strategy is therefore also to go beyond standard parametric approaches, which are mainly based on observed counts or on a likelihood ratio statistic. As we will show, these parametric statistics do not fully exploit the spatial/temporal information that lies within the data, but only use integrated information.
Inspired by recent results on nonparametric testing (Gretton et al., 2012; Fromont et al., 2013), we enrich the scan-statistics toolbox with kernel-based statistics to increase the power of detection against sharp alternatives.
Another limitation of current window-based strategies is that they correspond to global procedures that focus on a global null hypothesis. This amounts to asking whether the process is globally homogeneous, or whether the samples are globally identical, which inevitably results in a global yes/no answer. However, with the increasing availability of more and more complex data, there is a need for flexible and local approaches to improve the sensitivity to local structures. As we will see, continuous testing allows us to revisit the scanning windows framework by defining a (possibly) uncountably infinite set of local hypotheses, and most importantly by providing a rigorous framework for defining joint error rates.
Indeed, the well-known statistical challenge that arises when dealing with sliding windows is the appropriate control of Type-I errors, which is made difficult by the induced dependency structure. Current strategies mostly rely on the asymptotic approximation of the scan distribution (Kulldorff, 1997; Chan and Zhang, 2007), which has two advantages: it sidesteps the dependency issue by focusing on the scan statistic only, and it is computationally efficient. We rather adopt a non-asymptotic point of view, by considering the complete p-value process, which becomes here a stochastic process. For some particular statistics (the count statistic for instance), this process is easily computed. When the distribution of the statistic is unknown, we propose a Monte-Carlo procedure to estimate the p-value process. Then we propose a procedure in continuous time to control the FWER in a non-asymptotic manner, based on a new computationally efficient double Monte-Carlo scheme. Simulations show that this non-asymptotic procedure outperforms that of Chan and Zhang (2007) in terms of FWER control.
The control of the False Discovery Rate (FDR), that is, the expected proportion of errors among rejections, is more intricate, and attempts are more recent. Perone Pacifico et al. (2004) propose a continuous analogue to the FDR in the context of random fields, where the amount of null hypotheses is assessed by the Lebesgue measure. However, their (continuous) FDR controlling procedure is based on a FWER guarantee. Their strategy has also been considered in a setting close to our homogeneity framework (Perone Pacifico et al., 2007). Siegmund et al. (2011) also investigated the FDR for scanning statistics, but only under two strong conditions (a Poisson distribution for the number of false discoveries and independence between null and alternative statistics), neither of which is satisfied in standard models. Moreover, their procedure eventually reduces to discretized p-values, and is based on the tail approximation of the scan statistic, which is not valid for statistics other than the scan statistic. Thanks to the continuous testing framework, we propose a weighted Benjamini-Hochberg procedure in continuous time for Poisson process testing, which outperforms that of Siegmund et al. (2011) in simulations. This procedure corresponds to a continuous analogue of the well-known step-up procedure of Benjamini and Hochberg (1995), and extends the results of Blanchard et al. (2014) for continuous FDR in the context of local hypothesis testing.
The present work is inspired by two applications of Poisson processes: the analysis of spike train data in Neurosciences and the analysis of Genomic data coming from Next Generation Sequencing (NGS) technologies. In Neurosciences, Poisson processes are used to model spike trains, that is, the succession of action potentials of a given neuron (Pipa et al., 2013). In particular, inhomogeneous Poisson processes constitute a powerful model to understand rapid changes in firing rates induced by some stimuli (Kass et al., 2003). While the estimation of the Poisson process intensity has already been studied in depth (Shimazaki and Shinomoto, 2010), the questions asked in practice are often somewhat different from a pure curve reconstruction matter: does the stimulus affect the spike occurrence process (global hypothesis)? When (local)? Does it depend on the type of stimulus and, if so, when exactly are the differences? In Genomics, next generation sequencing technologies now allow the fine mapping of genomic features along chromosomes, and this spatial organization has become central to better understand genome architecture and regulation. Here we consider the spatial organization of human replication origins, which constitute the starting points of chromosome duplication. To maintain the stability and integrity of genomes, replication origins are under a very strict spatio-temporal control, and part of their positioning has been shown to be associated with cell differentiation (Picard et al., 2014). Their recent mapping on the human genome has allowed the comparative analysis of replication origin maps between cell lines. Unfortunately, there is a lack of statistical frameworks to compare genomic maps of spatially organized genomic features, and we show how our procedure can be applied to the identification of replication starting sites that are linked to cell differentiation.
This article is organized as follows. First, we define the continuous testing framework and show how it can be easily handled in the Poisson process setup (Section 2), with a particular focus on the test of homogeneity and the two-sample test. We also define the infinite set of local test hypotheses, the continuum of scanning windows to test them, and the associated p-value stochastic process. To proceed, and for the sake of simplicity, we first focus on the standard count statistic that directly gives access to the p-values. Then two types of control are discussed in Section 3: the classical FWER and the more involved control of the FDR in continuous time. Both procedures result in an adjusted p-value process, which makes our method operational and easily implemented. To increase the power of our procedure, we propose local kernel-based statistics in Section 4. The conditional distributions of these kernel statistics are not explicit, and we discuss Monte-Carlo procedures to estimate the corresponding p-values, with a new computationally efficient double Monte-Carlo step to control the FWER in Section 5. In Section 6 we show through simulations that the resulting procedures are more powerful than existing ones (in particular those based on unconditioned scanning statistics or on discretization). Section 7.2 shows that the procedure works extremely well on experimental data from Neurosciences and Genomics. Our method is available in the form of the contest R package, and offers many perspectives in the field of continuous testing.
2 The continuous testing framework for Poisson Processes
In the sequel a point process is a random countable set of points usually denoted by . The corresponding point measure is then denoted by , that is, with denoting the Dirac mass. For any interval , is a random variable that corresponds to the number of points of in . We focus on processes in (which can be thought of as the DNA strand for genomic applications and the time line for neuroscience applications) and we consider processes in [0,1] for the sake of simplicity.
2.1 Homogeneity and twosample conditional tests
In order to propose a unified framework, we use the following notation: we observe a random set of points with a distribution of the form , where is a nuisance parameter, while represents the signal of interest. In both cases, the aim is to test whether is locally null.
Homogeneity test.
We observe , a point process modeled by an inhomogeneous Poisson process on with unknown intensity (nonnegative and in ) with respect to the Lebesgue measure. We would like to detect the regions that are particularly rich or poor in occurrences of the process, with respect to its mean/stationary behavior. To do so, we compare the intensity of the observed process to the mean behavior of the intensity
Then we rewrite the distribution of in terms of two different parameters and , with , defined by
The signal of interest is , whereas may be seen as a nuisance parameter because its value is unknown under the null hypothesis. The homogeneity null hypothesis therefore reduces to “”: it is a composite hypothesis since, under the null hypothesis, the distribution can still vary according to . As an easier particular problem, one can also assume that is known. The problem is then a classical goodness-of-fit test and the null hypothesis is no longer composite.
Two-sample test.
We observe here , where and are two independent inhomogeneous Poisson processes on with intensities and (nonnegative and in ) with respect to the Lebesgue measure. In the following, for any interval , we denote , and, by a slight abuse of notation, . Our aim here is to detect the regions where (or, alternatively, ).
To reparametrize the distribution of , one can provide a one-to-one correspondence between and the couple , where is the joint process and where is a set of marks with values in such that if actually belongs to and if actually belongs to (Kingman, 1993). An important point here is that is actually a Poisson process with intensity and that, conditionally on , the ’s are independent variables with distribution
with
Once again, stands for the amount of signal in the data and is a nuisance parameter. The two-sample null hypothesis is then once again summarized by “”. Since is an unknown function, this hypothesis is composite. This case is thus much more complex than the homogeneity case, and calls for a nonparametric procedure.
Conditional distributions.
To avoid, in both cases, the use of specific unknown values of to derive the distribution under the null hypothesis, our strategy is based on conditional testing (Loader, 1991; Rivera and Walther, 2013). Indeed, for the homogeneity test, conditionally on the total number of points, , the observed process is actually an i.i.d. sample of density that does not depend on (Loader, 1991; Rivera and Walther, 2013). Similarly, in the two-sample case, given the positions of the joint process , the remaining variability under the null hypothesis lies in the distribution of the marks , which only depends on (and no longer on ). Thus we introduce the variable , such that in the homogeneity case and in the two-sample case, and our testing procedure relies on the distribution of conditionally on . Since the conditional distribution of given does not depend on , it is denoted by in the sequel, whereas refers to the expectation w.r.t. .
2.2 An infinite set of local null hypotheses
The continuous testing procedure developed here aims at determining where the signal is nonzero. The classical strategy consists in testing the full null hypothesis, which corresponds to the single question “Is the signal null over [0,1]?”. Therefore a unique hypothesis is tested and the test produces a binary answer. Here, we rather consider local null hypotheses such as,
In practice, one-sided tests of homogeneity are usually considered to detect regions that are too rich in occurrences, whereas two-sided tests of the two-sample problem seem more relevant when there is a complete equivalence between both processes and from a modeling point of view.
Since information at the single point level seems difficult to catch in general, we focus on finding intervals or unions of intervals on which the null hypothesis holds. We denote by the null hypothesis corresponding to . As a particular case, is the full null hypothesis and corresponds to , where is the null function on .
2.3 A continuum of scanning windows
Our procedure is based on scanning windows with a given resolution that is fixed. We consider here the whole continuum of windows of this given length, meaning that we perform a whole continuum of tests, one for each window. To avoid any confusion in the sequel, always denotes a window center whereas always denotes a point, that is, a possible value for the observations . Thus we have (while the set of possible values for is ). The scanning windows are denoted by for any . Therefore, a typical relationship between and is . In the sequel, all the proposed multiple testing procedures are actually based on single tests of the null hypothesis , for all possible window centers , which is different from the null hypotheses given by for in . The continuous testing procedure aims at accepting the set of “true windows”, indexed by their centers:
formally depends on , but is denoted by in the sequel, for short.
2.4 A first simple construction of the p-value process
Homogeneity test.
Consider as an individual test statistic , that is, the number of points in the window of center . Then classically, a one-sided test based on this statistic rejects if is large enough. Under , it is easy to see that the conditional distribution of given (or here) is a Binomial variable with parameters and . Hence, if we denote by the function such that with , then
can be seen as the p-value associated with the classical one-sided single test of . Since is explicit and since the number of points, as a function of , can be computed very efficiently, one has access to a p-value process at a very low computational cost.
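Since the conditional Binomial distribution is explicit, the count-based p-value process can be sketched in a few lines. The following Python snippet is only an illustrative sketch (not the contest R package, and the function names are ours): it assumes observations in [0,1] and a window length ell such that the evaluated windows lie inside [0,1], and it returns the one-sided p-value P(Bin(n, ell) >= count) at each given center.

```python
from math import comb

def binom_sf(k, n, p):
    """Upper tail P(Bin(n, p) >= k) of a Binomial distribution."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def pvalue_process(points, ell, centers):
    """One-sided conditional p-values for the homogeneity test.

    Conditionally on the total number of points n, the count in a
    window of length ell is Binomial(n, ell) under the null (windows
    are assumed to lie inside [0, 1]; boundary effects are ignored).
    """
    n = len(points)
    pvals = []
    for x in centers:
        count = sum(x - ell / 2 <= u < x + ell / 2 for u in points)
        pvals.append(binom_sf(count, n, ell))
    return pvals
```

In practice the process only needs to be evaluated once per segment of the random partition on which the count is constant.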
If is known, one does not need to use a conditional distribution. Under , the unconditional distribution of is a Poisson distribution with mean . If with , then
(1) 
also defines a p-value process. Note that this setup is quite close to that of Chan and Zhang (2007).
Two-sample test.
Consider now as an individual test statistic . A classical one-sided test based on this statistic also rejects if is large enough. Under , it is easy to see that the conditional distribution of given (the joint process here) is the same as the conditional distribution of given : it is a Binomial variable with parameters and . The corresponding p-value associated with the classical one-sided single test of can then be defined by:
(2) 
This process depends on and , and once again can be computed very easily.
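In the same spirit, the two-sample conditional p-value can be sketched as follows. This hedged Python illustration (our own naming) assumes the simplest null of equal intensities, under which each of the m joint points falling in the window carries an independent fair-coin mark, so that the count from the first process is Binomial(m, 1/2) given m.

```python
from math import comb

def twosample_pvalue(n1_in_window, total_in_window):
    """One-sided p-value P(Bin(m, 1/2) >= n1): is the first process
    over-represented in the window?  Under the null of equal
    intensities, each of the m joint points carries an independent
    fair-coin mark, so n1 is Binomial(m, 1/2) given m."""
    m, k = total_in_window, n1_in_window
    return sum(comb(m, j) for j in range(k, m + 1)) / 2**m
```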
2.5 Making continuous testing computationally tractable in general
Even if the method is based on a continuum of tests, the two previous examples are computationally tractable because they are based on , the number of points within windows , which can be computed very efficiently and which is piecewise constant as a function of . In full generality, the number of points may not be informative enough, and to gain power, one can consider more general single test statistics (see Section 4) that only depend on the composition of each window, that is on the function
which records the random positions of the points of that lie in . Since the composition is piecewise constant, with changes only each time a point leaves or enters the scanning window, any single test statistic that only depends on the composition is a piecewise constant function on a random finite partition . Therefore, the continuum of tests can be reduced to a random finite number of tests, one for each segment defined by . An example of such a partition is provided in Fig. 1.
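The reduction of the continuum of tests to finitely many can be made concrete: the composition of the sliding window changes only at the finitely many centers where a point enters or leaves the window. The following Python sketch (an illustration under our own conventions, with half-open windows) computes these breakpoints for windows of length ell whose centers range over a given domain.

```python
def partition_breakpoints(points, ell, domain=(0.0, 1.0)):
    """Breakpoints of the random partition: the window composition
    changes only when a point enters or leaves the sliding window,
    i.e. at window centers u - ell/2 and u + ell/2 for each point u."""
    lo, hi = domain
    cuts = {lo, hi}
    for u in points:
        for c in (u - ell / 2, u + ell / 2):
            if lo < c < hi:
                cuts.add(c)
    return sorted(cuts)
```

With n observed points this yields at most 2n + 2 breakpoints, hence at most 2n + 1 segments on which any composition-based statistic is constant.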
2.6 General definition of the p-value process
In the sequel, we assume that a test statistic is given for each window , which only depends on the composition of the window. Because Poisson processes have nice independence properties between disjoint sets, it is easy to show the following property, usually referred to as subset pivotality (Westfall and Young, 1993; Romano and Wolf, 2005a).
For all , the conditional distribution of given is the same whether or . 
This roughly says that the conditional distribution of given is not modified by the truth or falsehood of the remaining hypotheses. In particular, the conditional distribution of given under is the same as the one under the full null .
Note that in full generality the conditional distribution of given depends not only on but also on the composition of the window centered on . Therefore we define:
and the associated p-value process by
(3) 
It satisfies for all , for all ,
(4) 
Moreover, since is a deterministic function of (conditionally on ), property also holds for , and not only for .
3 False positive control in continuous time
Except when specifically mentioned, the results presented here hold for any test statistic that only depends on the composition of the window (as defined in Section 2.5). The associated p-value processes are as described previously.
A rejection set is a subset of , which is a function of the p-value process . Classically, it corresponds to small values of this process, and a typical example is given by
for a given threshold , potentially depending on the data. Once the rejection set is given, the set of accepted windows is defined by its complement
3.1 Definition of the multiplicity error rates
If the procedure behaves properly, the set should be a good approximation of , or similarly should be a good approximation of . To evaluate the quality of a procedure , we focus on false positive windows, which correspond to elements of . The size of can be gauged in many ways. In this work, we focus on the most popular criteria, namely the Family-Wise Error Rate (FWER) and the False Discovery Rate (FDR).
From a family-wise point of view, one wants to avoid any false positive with large probability, which corresponds to a control of the quantity
(5) 
While a FWER control provides a strong assessment of the amount of false positives, this approach generally leads to conservative procedures. A more balanced view is to allow some false positives, within a prespecified fraction of the total amount of positives. This can be achieved by controlling
(6) 
with . Importantly, since the setting is continuous here, the “number” of (false) positives is quantified above by the Lebesgue measure, , on .
3.2 A procedure to control the FWER in continuous time
We would like to find a threshold such that:
For all ,
with defined in (3). The set being unknown, one can use a rough upper bound. We have, thanks to subset pivotality , that for any (possibly depending on the conditioning variable ),
Since , we obtain
The distribution of under and given is, in both cases (homogeneity and two-sample), known/observable once the variable is fixed, and it is possible to calibrate an adequate threshold . However, due to the classical subtleties of inverting non-continuous c.d.f.s, it is in fact more powerful to use adjusted p-values, defined by
(7) 
with being the conditional c.d.f. of under the full null hypothesis, that is .
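For the count statistic, this single-step min-P adjustment can be approximated by Monte-Carlo: simulate the conditional full null (n i.i.d. uniform points given N = n), record the infimum of the simulated p-value process, and compare it to each observed p-value. The following Python sketch is a simplified illustration on a finite grid of centers, ignoring boundary effects; all function names are ours and the actual procedure in the paper is formulated on the exact random partition rather than a grid.

```python
import random
from math import comb

def binom_sf(k, n, p):
    """Upper tail P(Bin(n, p) >= k)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def pvalue(points, x, ell):
    """Observed one-sided count p-value at window center x."""
    n = len(points)
    c = sum(x - ell / 2 <= u < x + ell / 2 for u in points)
    return binom_sf(c, n, ell)

def adjusted_pvalues(points, ell, grid, B=200, seed=0):
    """Single-step min-P adjustment by Monte-Carlo: the adjusted
    p-value at x estimates P(inf p <= p_obs(x)) under the conditional
    full null (n i.i.d. uniform points given N = n)."""
    rng = random.Random(seed)
    n = len(points)
    null_minima = []
    for _ in range(B):
        sample = [rng.random() for _ in range(n)]  # conditional full null
        null_minima.append(min(pvalue(sample, x, ell) for x in grid))
    return [(1 + sum(m <= pvalue(points, x, ell) for m in null_minima)) / (B + 1)
            for x in grid]
```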
Theorem 3.1.
Note that , which is a continuous version of the classical procedure, is coherent with the case where there is only one p-value in the p-value process because the resolution is equal to 1. In this extreme case, is reduced to one point and the p-value is exactly the adjusted p-value, the control of then being an obvious consequence of (4).
Remark 3.2.
The previous procedure is usually referred to as a single-step procedure. We also propose refinements by using a step-down procedure in the Supplementary Material.
Remark 3.3.
Unfortunately, even for the classical homogeneity or two-sample tests, the c.d.f. is not explicit and needs to be estimated. This is the purpose of Section 5, in which we also derive meaningful theoretical results when this distribution is approximated by Monte-Carlo, which are necessary to provide a completely operational method with strong theoretical guarantees.
3.3 Link with the scan statistic framework.
In the case of the homogeneity test, the p-value process does not depend on , and because the function is non-increasing, we have
(9) 
Hence, rejecting for small values of the infimum of the p-value process (similar to standard minP procedures), or rejecting for large values of the supremum of the test statistic (i.e. the scan statistic, often called the maxT procedure), is strictly equivalent (Dudoit et al., 2003; Daley and Vere-Jones, 2008). Compared with the work of Chan and Zhang (2007), we mainly added two ingredients: () the distribution we are considering is conditional and exact, whereas theirs is asymptotic and still depends on the unknown nuisance parameter (even if they provide a plug-in procedure in practice to avoid this issue); () thanks to Property , we are able to prove the FWER control, whereas their control stands under the full null only.
This minP-maxT equivalence is valid in the case of the homogeneity test only; it does not hold as soon as the c.d.f. of the infimum of the p-value process under the full null depends on the composition of the window. This is typically the case for the two-sample test, where depends on , so that Eq. (9) cannot hold anymore. In this case, the minP framework offers the advantage of scaling all tests to the same level (Dudoit et al., 2003; Daley and Vere-Jones, 2008).
3.4 A weighted BH procedure to control the FDR in continuous time
To increase the detection capability, we propose a second procedure based on the control of the FDR criterion (6).
Heuristic.
We would like to find a threshold (see (3)) such that:
Following Blanchard et al. (2014), let us first explain heuristically how one can reach this goal in a continuous framework. By Fubini’s theorem, ,
Then, choosing such that
and proceeding as if were deterministic leads to
which leads to
by using the p-value property (4). The above heuristic suggests choosing by using
Weighted-BH procedure.
The quantile is random (even conditionally on ) because it depends on the observed p-value process. Nevertheless, since the p-value process is piecewise constant in our framework, finding the threshold is very simple. We propose a step-up algorithm inspired from the famous weighted BH threshold (Benjamini and Hochberg, 1997), with the essential difference that our algorithm is based on random weights. Recall that the processes and therefore are piecewise constant on the random partition (, ). Therefore
This implies that the threshold defined in (3.4) can be derived as follows:
1. Compute the weights ;
2. Denote , , and order these p-values in increasing order for an appropriate permutation of ;
3. Consider (with if the set is empty);
4. Compute as ;
5. Define .
By contrast with the classical BH step-up procedure, each p-value has its own weight , which is not uniform but adaptive to the data. This weight means that the quantity of interest is not the number of tests, i.e. the cardinality of the partition , but the proportion of “time” that the p-value process spends at a particular value.
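The step-up rule with Lebesgue-measure weights can be written down directly for a piecewise constant p-value process. The sketch below (our own illustrative Python, not the contest package) takes the p-values and the lengths of the segments on which they are attained, and returns the step-up threshold: the largest p-value whose value is below alpha times its cumulated weight divided by the total length.

```python
def weighted_bh_threshold(pvals, lengths, alpha):
    """Continuous weighted BH step-up for a piecewise constant p-value
    process: pvals[k] is the value taken on a segment of Lebesgue
    measure lengths[k].  Each p-value is weighted by the fraction of
    'time' the process spends at it.  Returns the rejection threshold
    (0.0 if nothing is rejected); reject {x : p(x) <= threshold}."""
    total = sum(lengths)
    u_hat, cum = 0.0, 0.0
    for k in sorted(range(len(pvals)), key=lambda k: pvals[k]):
        cum += lengths[k]
        if pvals[k] <= alpha * cum / total:
            u_hat = pvals[k]  # step-up: keep the largest admissible p-value
    return u_hat
```

With uniform segment lengths this reduces to the classical Benjamini-Hochberg step-up rule.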
Remark 3.4.
It is worth noting that , where stands for the adjusted p-value
Theoretical result.
Our procedure is based on a heuristic that has been made fully theoretically valid by Blanchard et al. (2014) under suitable assumptions of measurability and positive dependence (namely, finite-dimensional positive regression dependence of the p-value process). The difficulty comes from the interplay between the numerator and the denominator within the expectation in (3.4) via , because is a random variable. While the measurability issue is not critical here (by the piecewise constant property), the positive dependence condition seems difficult to check in general. For instance, this condition is known to exclude the two-sided test statistic (Yekutieli, 2008). We prove here that this positive dependence condition holds in the one-sided testing setting (both for the homogeneity test with known and the two-sample test), by using the peculiar properties of Poisson processes.
Theorem 3.5.
This result is proved in the Supplementary File. Theorem 3.5 hence theoretically justifies the use of a weighted step-up procedure to control the continuous FDR. However, it is only valid for simple versions of our test statistic, which ensure the positive dependence structure. For more general cases, Blanchard et al. (2014) proposed a modification of the step-up algorithm. This modification consists in introducing a function of a specific form, such that the quantity is replaced by the smaller quantity . However, this modification is reported to be quite conservative (Blanchard and Roquain, 2008). Furthermore, our simulation results show that the FDR control (3.4) is achieved in most of the situations we studied, even for more complex p-value processes than the ones considered here (homogeneity test with known intensity, and two-sample test).
Note that the homogeneity test with known intensity in Theorem 3.5 is similar to Corollary 4.5 in Blanchard et al. (2014). However, a subtle distinction concerns the way the null hypotheses are defined (windows versus points). A consequence is that our approach does not rely on any knowledge regarding the regularity of the underlying process intensity.
4 Kernel-based statistics for sliding windows
4.1 On the limitations of the count-based statistic
Let us focus on the two-sample test to illustrate our motivations. In this context, our objective is to detect local differences between the unknown functions and based on individual tests defined on a set of windows. As a first approximation, this can be achieved by comparing with , the counts of occurrences on window , which are distributed as a couple of independent Poisson variables. However, given that (resp. ) is an approximation of (resp. ), the count-based strategy can be seen as testing the equality of the integrals of the functions rather than the equality of the functions themselves. Hence the induced testing procedure may lack accuracy.
Therefore we developed a kernel-based strategy to propose single test statistics that do not depend solely on the number of occurrences in a window, but consider its whole composition as defined in Section 2.5. The interest of kernel-based statistics is that they account for the physical distances between occurrences, which explains the gain in performance.
4.2 The kernel vision in the two-sample case
The kernel framework for testing proposed by Gretton et al. (2012) and Fromont et al. (2013) has been especially designed to separate functions, and not just their integrals. It is based on the estimation of a distance between the functions and (in a RKHS, or in ). In particular, Fromont et al. (2013) show that these test statistics are minimax with respect to various smoothness classes, and in this sense powerful against some alternatives that cannot be distinguished by count-based tests.
In the following we consider translation-invariant kernels denoted by , and in practice we restrict ourselves (see Sections 6 and 7) to the Gaussian kernel, which has been shown to be the most powerful in simulations (Fromont et al., 2013). The shape of the statistic is then given by
where is the joint process and the ’s are the marks (see Section 2.1). In particular, it only depends on the composition of the window. To understand why this statistic is meaningful, let us compute its expectation. By classical computations for Poisson processes,
If denotes the convolution product, we end up with
In particular if is a convolution kernel with bandwidth we obtain, under classical assumptions, that
This short derivation illustrates the motivation for using such a kernel-based statistic, as a good estimate of a distance between the intensity functions. The one-sided version of this statistic, as well as the homogeneity case, are presented in the Supplementary Material. Note that from a computational point of view, the complexity of computing is in . However, thanks to the piecewise constant property of on the partition , the continuum can be computed using the in/out points of each scanning window, which drastically reduces the global computational burden.
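As an illustration of the shape of the statistic, the following Python sketch evaluates a signed Gaussian-kernel sum over pairs of distinct joint points falling in the window, with marks coded as +1/-1 for the two samples. The exact weighting and normalization used in the paper may differ; this is only meant to convey the structure of a sum of e_i e_j k(u_i - u_j) over pairs i != j.

```python
from math import exp

def kernel_stat(points, marks, x, ell, h):
    """Signed Gaussian-kernel sum over pairs of distinct points of the
    joint process falling in the window centered at x.  Marks are
    +1 / -1 for the two samples: close same-sample pairs push the
    statistic up, close cross-sample pairs push it down."""
    inside = [(u, e) for u, e in zip(points, marks)
              if x - ell / 2 <= u < x + ell / 2]
    t = 0.0
    for i, (ui, ei) in enumerate(inside):
        for j, (uj, ej) in enumerate(inside):
            if i != j:
                t += ei * ej * exp(-((ui - uj) ** 2) / (2 * h ** 2))
    return t
```

The double loop makes the quadratic cost in the number of window points explicit; the in/out breakpoints of the sliding window allow incremental updates in practice.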
5 Monte-Carlo procedures for p-value estimation and adjustment
Table 1: summary of the full procedures.

Procedure | test statistic | computations   | homogeneity / two-sample test
--------- | -------------- | -------------- | -----------------------------
          | count          | p-values       | exact
          |                | full procedure | 1-step MC
          | kernel         | p-values       | 1-step MC
          |                | full procedure | 2-step MC
wBH       | count          | p-values       | exact
          |                | full procedure | exact
          | kernel         | p-values       | 1-step MC
          |                | full procedure | 1-step MC
5.1 The distribution of the single test statistic is explicit
For tests based on the count statistic, the p-value process is explicitly computable, so no Monte-Carlo approximation is needed in this case. Moreover, since our weighted BH procedure (Section 3.2) is completely explicit, can be easily computed in practice. In this situation we have already proved the FDR control in some particular cases, and we will show by simulations that it holds more generally (Section 6).
However, the procedure is not explicit, since it requires the conditional distribution of given under the full null hypothesis (Section 3.2). Therefore we need a Monte-Carlo procedure to approximate this distribution. This can be achieved while maintaining a valid FWER control by following Romano and Wolf (2005a), as detailed in Section 5.4 below.
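To make the explicit case concrete, the one-sided p-value of the count statistic in a window reduces to a Poisson tail probability. A minimal sketch, assuming the null mean is the homogeneous intensity times the window length (the paper's conditional construction is richer than this):

```python
import math

def poisson_upper_pvalue(n_obs, mean):
    """Exact one-sided p-value P(N >= n_obs) for N ~ Poisson(mean), the kind
    of explicit computation available for the count statistic in a window.
    Here `mean` stands for the null intensity times the window length."""
    # P(N >= n) = 1 - P(N <= n - 1), summed term by term
    cdf = sum(math.exp(-mean) * mean ** k / math.factorial(k) for k in range(n_obs))
    return max(0.0, 1.0 - cdf)
```

Because this tail probability is available in closed form, no resampling is needed for count-based single tests.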
5.2 The distribution of the single test statistic is not explicit
This concerns kernel-based tests (Section 4), for which the p-value process is not even explicitly computable. One can again use the estimated p-values developed by Romano and Wolf (2005b), since their result shows that they satisfy (4) and can be easily computed in practice. However, there is no theoretical FDR guarantee, and we will only show the FDR control by simulations in Section 6. Consequently, to compute , the weighted-BH procedure only requires one Monte-Carlo step in this situation.
As for the procedure (Section 3.2), it requires an additional Monte-Carlo step. To this end, a classical and sufficient method is to perform an approximation using two independent Monte-Carlo samples. We detail this approach for the homogeneity case in the Supplementary Material. For the two-sample case, this Monte-Carlo procedure is quite demanding and is close in spirit to bootstrap methods. In this case, we developed a double Monte-Carlo procedure, which only requires one Monte-Carlo sample and still guarantees a controlled FWER (Section 5.4).
5.3 Estimation of the p-value process by Monte-Carlo
Recall that this estimation is required when using test statistics different from the statistics based on counts. The full procedures are summarized in Table 1.
Homogeneity test.
Here, the conditioning variable is the same for all scanning windows and the distribution of under does not depend on . Let us focus on the first window and simulate points inside this window under . Hence we consider random samples of the number of points such that
and then we draw
to sample positions of a homogeneous Poisson process inside window . For each , one can compute, thanks to the positions of the points inside , the value of the test statistic for the first window, which we denote . But since all the statistics are translation-invariant in distribution under the null, one can also say that this sample has the same distribution as under for every window center .
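The sampling scheme above can be sketched as follows. This is an illustration under simplifying assumptions: the null is taken as a homogeneous Poisson process with known rate, and the function and variable names are ours, not the paper's.

```python
import math
import random

def poisson_draw(mean, rng):
    """Poisson random variate via Knuth's inversion method."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_null_statistics(rate, window_length, n_mc, statistic, seed=0):
    """Monte-Carlo sample of a single test statistic under homogeneity: draw a
    Poisson number of points for the window, place them uniformly, and evaluate
    the statistic.  By translation invariance under the null, the same sample
    serves every window center."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_mc):
        n_points = poisson_draw(rate * window_length, rng)
        points = sorted(rng.uniform(0.0, window_length) for _ in range(n_points))
        samples.append(statistic(points))
    return samples
```

One Monte-Carlo sample of size B thus calibrates the test statistic for all windows at once.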
Two-sample test.
We label the observed marks so that they constitute the first term of a sample of marks, completed by independent draws of i.i.d. Rademacher sets , for . The conditional distribution given of these Rademacher processes is:
that is, the distribution that the observed marks ’s should have under the full null hypothesis. As previously, the Monte-Carlo approximated conditional p-value of window is defined by:
(10) 
where is the observed statistic and where is the value of the test statistic computed on with the resampled marks and the fixed joint process . Let us underline that, because in the two-sample case the partition only depends on and does not vary with the resampling scheme , the processes , for , and therefore the process , are piecewise constant on .
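The conditional p-value of Eq. (10) can be sketched as follows; the (B + 1) denominator convention and the names are our assumptions, and the statistic is passed in as an arbitrary function of the composition of the window.

```python
import random

def mc_pvalue(points, marks, statistic, n_mc, seed=0):
    """Monte-Carlo conditional p-value in the spirit of Eq. (10): the joint
    process (the points) is held fixed and only the Rademacher marks are
    resampled.  Uses the (B + 1) convention so the p-value is never zero."""
    rng = random.Random(seed)
    t_obs = statistic(points, marks)
    count = 1  # the observed configuration counts as one draw
    for _ in range(n_mc):
        resampled = [rng.choice((-1, 1)) for _ in marks]
        if statistic(points, resampled) >= t_obs:
            count += 1
    return count / (n_mc + 1)
```

Since the points never move, the resulting p-value process inherits the piecewise constant structure on the partition noted above.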
5.4 Two-step Monte-Carlo method for the procedure
Recall that when using the procedure with a p-value process that is not explicit, we need two Monte-Carlo approximations.
Homogeneity test.
In that case, the distribution of the test statistics under the null is independent of the window centers, while the distribution of the minimum of the p-values depends on the observations over all windows. Hence, it is quite natural to perform two separate Monte-Carlo schemes for these two steps.
Two-sample test.
In the two-sample case, the trick is that one can use the same sample of ’s as the one used in (10) for both approximations. More precisely, the p-values corresponding to the simulated process can be computed as
that is, we do the same computation as in Eq. (10), except that we replace the observed statistic by the simulated one , thereby computing a “resampled” p-value process corresponding to . Then, let us define, for all ,
The corresponding adjusted Monte-Carlo p-values are then given by
Finally the rejection set is given by
The previous double Monte-Carlo procedure is based on the use of the same random signs , and not on two different sets of simulations. Indeed, the ’s are used twice: a first time to estimate the p-values of the observed process, and a second time to compute the p-values on the simulated realizations, leading to the adjusted p-values. This procedure has the double advantage of sparing computational resources while enjoying the same control property, as shown hereafter.
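The adjustment step can be sketched as follows. This is a min-p style adjustment in the spirit of Romano and Wolf, written under our own naming conventions and the (B + 1) convention; it is a sketch, not the paper's exact procedure.

```python
def adjusted_pvalues(p_obs, p_sim):
    """min-p adjustment: p_obs lists the observed per-window p-values;
    p_sim[b] is the whole p-value process computed on the b-th resampled
    realization.  Each observed p-value is compared with the minimum of every
    simulated p-value process, with the (B + 1) convention."""
    mins = [min(process) for process in p_sim]
    n_mc = len(p_sim)
    return [(1 + sum(1 for m in mins if m <= p)) / (n_mc + 1) for p in p_obs]
```

Windows whose adjusted p-value falls below the target level are rejected, which accounts for the multiplicity over all windows through the minimum.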
Theorem 5.1.
Let . For the two-sample case, the continuous testing procedure defined by (5.4) satisfies

This result holds for any choice of single test statistic that depends only on the composition of the window.
6 Simulations
6.1 Homogeneity test
For a fixed parameter , we simulate a homogeneous Poisson process corresponding to the following piecewise constant signal (illustrated in Figure 9, Supp. Material):
with
and , , the positions on which is not null (respectively, null). We consider this particular shape of to ensure that .
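A piecewise constant intensity of this kind can be simulated piece by piece; a minimal sketch, where the function names and the breakpoint/rate encoding are illustrative rather than the paper's code:

```python
import math
import random

def poisson_draw(mean, rng):
    """Poisson random variate via Knuth's inversion method."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_piecewise_poisson(breakpoints, rates, seed=1):
    """Sample a Poisson process whose intensity equals rates[i] on
    [breakpoints[i], breakpoints[i+1]): draw a Poisson count per piece,
    then place the points uniformly within the piece."""
    rng = random.Random(seed)
    points = []
    for left, right, rate in zip(breakpoints, breakpoints[1:], rates):
        for _ in range(poisson_draw(rate * (right - left), rng)):
            points.append(rng.uniform(left, right))
    return sorted(points)
```

Raising the rate on selected pieces moves the process away from homogeneity, which is exactly how the difficulty of the test is tuned in the simulations.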
For a fixed background intensity parameter , we therefore simulate 1000 times a Poisson process with intensity . The parameter is used to tune the distance to homogeneity (which reflects the difficulty of the test), with standing for a situation where almost every position is under the homogeneous regime, and standing for a situation with high separability. The parameter controls the number of occurrences and varies in . To compare with other scanning procedures, we opt for one-sided tests (the alternative corresponding to too large ), with fixed window size . More precisely, we compare our count-based and count-based weighted BH procedures in continuous time with asymptotic approaches based on the scan statistic that control the FWER (Chan and Zhang, 2007) and the FDR (Siegmund et al., 2011).
Figure 2 clearly shows that our procedure in continuous time controls the FWER even for moderate sample sizes, whereas the scan statistic procedure requires a strong signal or a higher number of occurrences (not shown). This trend is even clearer for the FDR control, which is not ensured by the method of Siegmund et al. (2011). Note that, contrary to continuous testing, this method requires a discretization step on which its performance depends. In this case we chose fixed-length nonoverlapping windows to apply their procedure.
6.2 Two-sample test
To test the differences between the intensities and , we sample a joint point process over with constant intensity . Then we sample the labels given the occurrences of the joint process , in an independent manner, such that
Thus positions with (resp. ) stand for points of process (resp. ) with intensity (resp.