Change Point Detection in the Mean of High-Dimensional Time Series Data under Dependence

Change Point Detection in the Mean of High-Dimensional Time Series Data under Dependence

Jun Li Department of Mathematical Sciences, Kent State University, Kent, OH 44242 Minya Xu Guanghua School of Management, Peking University, Beijing 100871, China Ping-Shou Zhong Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824 Lingjun Li Department of Mathematical Sciences, Kent State University, Kent, OH 44242


High-dimensional time series are characterized by a large number of measurements and complex dependence, and often involve abrupt change points. We propose a new procedure to detect change points in the mean of high-dimensional time series data. The proposed procedure incorporates spatial and temporal dependence of data and is able to test and estimate the change point occurred on the boundary of time series. We study its asymptotic properties under mild conditions. Simulation studies demonstrate its robust performance through the comparison with other existing methods. Our procedure is applied to an fMRI dataset.

Keywords: Change point analysis; Spatial-temporal data; Large small .

1. Introduction

Many dynamic processes involve abrupt changes and change point analysis is to identify the locations of change points in time series data. There exists abundant research on change point analysis for univariate time series data. Examples include Sen and Srivastava (1975), Inclán and Tiao(1994), Chen and Gupta (1997), Kokoszka and Leipus (2000), Lavielle and Moulines (2000), Ombao et al. (2001), Davis et al. (2006), Davis et al. (2008), and Shao and Zhang (2010). Change point analysis for classical multivariate time series data has also been extensively studied. Examples include Srivastava and Worsley (1986), James et al. (1992), Desobry et al. (2005), Harchaoui et al. (2009), Zhang et al. (2010), Siegmund et al. (2011) and Matteson and James (2014).

With explosive development of high-throughput technologies, high-dimensional time series data are commonly observed in many fields including medical, environmental, financial, engineering and geographical studies. Change point analysis for high-dimensional data has received a lot of attention in recent years. For instance, Bai (2010) considered estimating the location of a change point in high-dimensional panel data under the assumption that the change has occurred a priori. Chen and Zhang (2015) proposed a graph-based approach to test and estimate change points under the assumption that a sequence of observations are independent.

In this paper, we propose a new nonparametric procedure to detect change points in the mean of high-dimensional time series data. Let be a sequence of -dimensional observations and be the mean of for , where the dimension can be much larger than the sample size . We first test


where are some unknown change points. When is rejected, we further estimate the locations of change points. Different from Chen and Zhang (2015) which assumed a sequence of observations to be independent, our procedure incorporates both spatial and temporal dependence, namely spatial dependence among the -components of at each and temporal dependence between any and for . Different from Bai (2010) which imposed growth rate of the dimension with respect to the sample size , our procedure allows the dimension to be much larger than the number of observations . Most importantly, our procedure is able to detect a change point on the boundary when data dependence is present. This feature distinguishes our procedure from other existing methods. The implementation of the proposed procedure is provided in the R package HDcpDetect (Okamoto et al., 2018).

2. Main Results

2.1 Test statistic

For any , we consider a bias-corrected statistic


Here, and . With defined in Condition 1 of Section 2.2, is an -dimensional vector with and for ,


The element at th row and th column of the matrix is


The th component of the -dimensional random vector is


Imposing in (2.1) leads to (2.6) in Proposition 1 that excludes the interference of data dependence in testing the hypothesis and estimating locations of change points in (1.1). The proposed depends on which separates dominant dependence from the remainder. How to choose a proper in practice will be addressed in Section 3. From here to the end of Section 2, we simply assume to be known in order to present theoretical results of our methods.

For the two-sample testing problem of means, can be reduced to the test statistic in Bai and Saranadasa (1996) with temporally independent sequence, and the test statistic in Ayyala et al. (2017) with -dependent Gaussian process. Their asymptotic testing procedures require and thus cannot test the hypothesis in (1.1) if a change point occurs near the boundary, specially at or . Unlike Bai and Saranadasa (1996) and Ayyala et al. (2017), we establish the asymptotic normality of at any and the testing procedure can be applied regardless of locations of change points. Moreover, there is no need to estimate a change point in the two-sample testing problem as two samples have been pre-specified before testing. In addition to hypothesis testing, we establish an estimating procedure based on for the locations of change points regardless of locations of change points.

2.2 Hypothesis testing

To study asymptotic properties of , we model the sequence of -dimensional random vectors by


where is the -dimensional population mean, is a matrix with , and so that are mutually independent and satisfy , and for some finite constant .

By allowing to depend on , each has its own covariance described by , and each pair of and has its own temporal dependence described by for . Model (2.5) is thus flexible for many applications. We require to guarantee the positive definite of . It also ensures the existence of ’s under special structural assumptions. For example, if all ’s are temporally independent, the condition guarantees the existence of ’s so that if . Another advantage of (2.5) is that it does not assume Gaussian distribution of beyond the existence of fourth moment.

Let , and define a weight function . Moreover, for any matrix , we let .

Condition1 (Spatial and temporal dependence assumption). We assume that for . Moreover, as , there exists such that

Condition2 (Covariance assumption). For , , , with ,

Condition 1 assumes the stationary on which can be relaxed to the locally stationary. Condition 1 is trivially true for temporally independent or -dependent sequence, but general as the sequence needs not be -dependent. Moreover, Condition 1 does not impose any structural assumption on dependence within a critical value , but only requires that the spatial dependence beyond the critical value is not too strong, so that the two equations are satisfied. At last, comparing to the usually assumed mixing condition, it is advantageous as mixing condition is hard to verify for the real data and usually requires additional smoothness or restrictive moment assumptions (Carrasco and Chen, 2002).

Condition 2 is imposed on the covariance matrix of the entire sequence of . To see this, let and from (2.5). The covariance matrix of is , where each block diagonal matrix of describes the spatial dependence among components of each , and each block off-diagonal matrix measures the spatio-temporal dependence of and for . To impose a condition on , we may consider , which is satisfied if all the eigenvalues of are bounded or the dependence of is not too strong. However, it is more desirable to impose the condition on the spatial and temporal dependence through . By the relationship that , it can be shown that Condition 2 is a sufficient condition for . Another advantage of Condition 2 is that we do not require any explicit relationship between dimension and the number of observations .

The mean and variance of are given by the following proposition.

Proposition 1

Under (2.5) and Condition 1, and for ,


where , and with .


where the set and the matrix satisfies

Now we are ready to present the asymptotic normality of .

Theorem 1

Assume (2.5) and Conditions 1–2. As and for any , converges in distribution to the standard normal , where is defined by (2.7) in Proposition 1.

To implement a testing procedure based on Theorem 1, we need to estimate

which is under the null hypothesis. The only unknown terms are for and from . Similar to Li and Chen (2012), we estimate them by


where represents the sum of indices that are at least apart, and with are the corresponding number of indices. As a result, the estimator of is

Figure 1: Histogram of versus -curve. The upper row chooses , at different and ; The lower row chooses , and different and .
Theorem 2

Assume the same conditions in Theorem 1 and of (1.1). As and for any , converges in distribution to the standard normal .

One of the contributions in this paper is establishing the asymptotic normality of for any . This enables us to test the hypothesis of (1.1) even when a change point is on the boundary of a sequence. We conduct some simulations for a visual inspection. Figure 1 illustrates histograms of based on iterations for and , respectively. The data were generated based on the setups in Section 4.1. Clearly, as and increase, the histograms converge to the standard normal curve even when equals 1.

From Theorem 2, our testing procedure rejects of (1.1) if with a nominal significance level , where is the upper- quantile of . The testing procedure relies on and may lose power if the chosen is very different from the location of a change point. For example, there exists only one change point located near the boundary. If we choose a near the middle to break the entire sequence into two subsequences, the small piece with mean change falls into a long subsequence and its contribution to the change point detection is diluted by averaging all observations in the subsequence. In order to circumvent the difficulty of choosing and most importantly retain the power of the test, we accumulate all the marginal and consider


Let where is specified in Proposition 1, be obtained by replacing with in (2.7), and be the estimator of under the null hypothesis obtained by replacing with in (2.9).

Theorem 3

Assume the same conditions in Theorem 1. As , converges to the standard normal in distribution. Especially under of (1.1), converges to the standard normal in distribution.

Based on Theorem 3, we reject of (1.1) with a nominal significance level if . Free of the tuning parameter and retaining the power, the testing procedure based on is thus chosen for the existence of any change point.

Remark 1. Alternatively, one may consider the max-norm statistic . If there are only few time points where the difference of and is large, the max-norm based test is expected to be more powerful than our proposed test. If the small differences occur in many time points, our test can dominate the max-norm based test by aggregating all small differences. Furthermore, it requires stringent conditions to establish the extreme value distribution of and its rate convergence is known to be slow (Liu and Shao, 2013).

2.2 Estimating one and multiple change points

If the null hypothesis is rejected, we further estimate the change points. We first consider the case of one change point. The location of a change point is estimated by


where is given by (2.1). The rationale of proposing is demonstrated by Lemma 1.

Lemma 1

Under (2.5), Condition 1 and of (1.1), always attains its maximum at the change point .

Let and where is given in Proposition 1. Here and measure signal strength and maximal noise, respectively. The following theorem establishes the convergence rate of .

Theorem 4

Assume that the change-point satisfies with . Under the same conditions in Theorem 1, as ,

Remark 2. Under cross-sectional dependence but temporal independence, we can derive

Moreover, under the local alternative that the change in tends to zero, the leading order if all the eigenvalues of are bounded, and thus

Remark 3. In the change point literature, it commonly assumes that the change point is of the form with , that is with in terms of our notation. The corresponding convergence rate is . This excludes the case that the change point is near or on the boundary. Theorem 4 is general as can vary within [0, 1]. Especially, when the change point (near or on the boundary), the convergence rate is which is times slower than the convergence rate when .

To estimate the locations of multiple change points , we can iteratively apply a binary segmentation method similar to that in Vostrikova (1981). Suppose that we have already estimated change points as , which partition the original data into segments. Define and . Let , and be the statistics calculated based on data from the th segment . For each of segments, we first conduct hypothesis testing by checking if where is a chosen nominal significance level. If yes, no change point is estimated from . Otherwise, one change point is estimated as , which further partitions into and . Repeat the above procedure iteratively until no more change point can be estimated from any segment.

Let be the set of all change points and be the set of estimated change points, respectively. Letting and , we define

to be the minimal signal-to-noise ratio from all segments, each of which has starting point for and ending point for . We establish the consistency of under the following Condition 3 plus Conditions 1–2.

Condition 3 (Minimal signal-to-noise ratio assumption). As , and diverges such that . Furthermore, in Theorem 4, for all that contains at least one change point.

Theorem 5

Assume (2.5) and Conditions 1–3. As , converges to in probability.

Remark 4. The binary segmentation can control the family-wise error rate (FWER) as we set in Condition 3. Especially, one can choose so that the FWER is controlled even if other conditions in Theorem 5 are not satisfied.

Remark 5. The defined provides a quantitative measure for efficiency of the binary segmentation. To appreciate this, we consider a configuration of two change points and . Let , . The piecewise constant signals are zero in and and positive in . Then is the smallest signal-to-noise ratio from , , . Especially, if is short and buried in the middle of the large segment (Olshen and Venkatraman, 2004), is close to zero. The binary segmentation is well known to be inefficient under this configuration. To improve its performance, we may consider , where is the test statistic (2.1) defined in a randomly generated interval with . The rationale is that when happens to be or , the change-point detection will be more powerful than that based on the entire sequence. Based on , the circular binary segmentation or wild binary segmentation can be implemented accordingly.

3. Elbow Method for Dependence

The proposed procedure relies on the choice of , which is unknown in practice. From Condition 1, separates dominant temporal dependence from the remainder. As demonstrated in simulation studies of Section 4, if data are dependent (), wrongly applying the procedure based on the assumption that can cause severe type I error and thus produce a lot of false positives when estimating locations of change points. On the other hand, choosing a value that is larger than the actual will reduce the power of the test and thus generate more false negatives. Here we propose a quite simple way to determine .

Figure 2: The elbow method for choosing under both null and alternative hypotheses. The results were obtained based on replications.

Condition 1 states that is relatively small if , or equivalently, is small if . The unknown can be consistently estimated by (2.8) with under the null hypothesis according to the proof of Theorem 2. Even under the alternative hypothesis, the effect of heterogeneity of means on the estimation is of small order as long as the heterogeneity is not too strong. We thus determine by calculating (2.8) for each integer starting from , and terminate the process once a small value appears. Visually, we can plot (2.8) versus , and use the elbow in the plot to determine .

To demonstrate the idea above, we generated the random sample for using (4.1) in Section 4 with and . We considered and , respectively. Figure 2 illustrates (2.8) versus based on iterations. When the actual , the elbow happened at under both null and alternative hypotheses. We thus estimated by . Similarly, when , the elbow happened at which suggested us to estimate by .

4. Simulation Studies

4.1 Empirical performance of the testing procedure

Table 1: Empirical sizes and powers of the CQ, the E-div and the proposed tests based on replications with Gaussian in (4.1).

The first part of simulation studies is to investigate the empirical performance of the test statistic with asymptotic normality established in Theorem 3. The random sample for , were generated from the following multivariate linear process


where is the -dimensional population mean vector at point , is a matrix for , and is a -variate random vector with mean and identity covariance . In the simulation, we set for , and . For and , we considered two different scenarios. If , we simply chose so that became an independent sequence. If , we chose and each row of them had only non-zero elements that were randomly chosen from with magnitude generated by Unif . By doing so, the dependence was dominated by for plus perturbations contributed by and .

Without loss of generality, we chose for under of (1.1). Under the alternative hypothesis, we considered one change-point such as for and for . The non-zero mean vector had non-zero components which were uniformly and randomly drawn from coordinates . Here, denotes the integer part of . The magnitude of non-zero entry of was controlled by a constant multiplied by a random sign. The nominal significance level was chosen to be . All the simulation results were obtained based on 1000 replications.

We also considered two competitors. One is the E-div test proposed by Matteson and James (2014) and the other is the CQ test proposed by Chen and Qin (2010). Both testing procedures assume independence of . The CQ test was originally designed for the two-sample problem, requiring the change point to be known. To implement the CQ test, we used the true change point to divide the sequence into two samples under the alternative hypothesis. Under the null hypothesis, we obtained the two samples by adopting the same time point used for the alternative.

Table 1 demonstrates empirical sizes and powers of three tests with Gaussian in (4.1). To obtain power, we chose the location of the change point and magnitude . Under temporal independence (), sizes of all three tests were well controlled around the nominal significance level . Under dependenc (), the CQ and E-div tests suffered severe size distortion. Unlike those two tests, the proposed test still had sizes well controlled around the nominal significance level . Due to severe size distortion of the CQ and E-div tests, it is not relevant to compare the power of three tests under dependence. We thus only conducted power comparison when . Empirical powers of three tests increased as and increased. The reason the CQ test had the best power among three is that it utilized the information of location of the change point. In real application, such information is unavailable. The proposed test always enjoyed greater powers than the E-div test with respect to different and .

Figure 3: The probability of detecting a change point when (trangles), (squares) and (circles). Upper panel: the change point is at . Lower panel: the change point is at .

Under spatial and temporal dependence, we also studied the power of the proposed test subject to different combinations of sample size , dimension and location of the change point. The results are included in the supplementary material. The supplementary material also includes the results of the proposed test when follows -distribution. The patterns of sizes and powers were quite similar to those when follows Gaussian distribution, showing the nonparametric property of the proposed test.

4.2 Empirical performance of the estimating procedure

The second part of the simulation studies aims to investigate the empirical performance of the change point estimator in (2.11). We first considered the situation with one change-point such as for and for . The non-zero mean vector had non-zero components, which were uniformly and randomly drawn from coordinates . The magnitude of non-zero entry of was controlled by a constant multiplied by a random sign. Figure 3 demonstrates the proportion of the iterations detecting the change point that was located at the time point and , respectively. First, the probability of detecting the change point increased as dimension increased. Second, comparing the right panel with the left panel, the probability of detecting the change point became lower as dependence increased from to . Finally, comparing the lower panel with the upper panel, the stronger signal strength was needed when the change point was at , in order to retain the similar detection probability when the change point was located at . Our empirical results are consistent with the theoretical results in Theorem 4.

We also compared the proposed change-point estimator with the one proposed in Bai (2010). Since the estimating procedure in Bai (2010) assumes that the change point exists a priori, we implemented both methods without conducting hypothesis testing. Figure 4 illustrates that the change-point estimator in Bai (2010) failed to identify the change point at under temporal dependence (). Unlike the change-point estimator in Bai (2010), the proposed change-point estimator performed well under both temporal independence and dependence.

Figure 4: The probability of detecting a change point at 2 by the proposed estimator (circles) and the estimator of Bai (2010) (squares). Left panel: data are temporally independent with . Right panel: data are temporally dependent with .
New E-div New E-div New E-div
0 FN
2 FN
Table 2: The performance of the proposed binary segmentation and E-div method for estimating multiple change points with Gaussian in (4.1). The average FP, FN, TP and corresponding standard deviations were obtained based on replications.

The last part of the simulation studies is to demonstrate the performance of the proposed binary segmentation method for multiple change-point detection. We chose and considered three change points at , and , respectively. In particular, for , . For , the non-zero entry of was controlled by a constant . For , and for , the non-zero entry of was controlled by another constant . We compared our method with E-div method in terms of false positives (FP), false negatives (FN), and true positives (TP). The FP is the number of time points that are wrongly estimated as change points. The FN is the number of change points that are wrongly treated as time points without change. And TP is the total number of identified change points. A procedure is better if it has smaller FP and FN, but TP is close to which is the total number of change points based on our design. Table 2 demonstrates the performance of two methods based on iterations when estimating the three change points. Under temporal independence (), the two methods had similar performance with both FP and FN decreased but TP increased as and/or increased. On the other hand, under temporal dependence (), the E-div procedure suffered severe FP for all cases although it had larger TP. Different from the E-div procedure, the proposed method always had the FP and FN under control. Most importantly, similar to the case of , it enjoyed smaller FP and FN but larger TP as and/or increased.

5. Application

Southwest University, China conducted an fMRI experiment to exam the differences in brain activation between overweight and normal weight subjects when performing a body image self-reflection task. In the task, participants were instructed to view several fat and thin body images closely, and vividly imagine that someone was comparing her body to the body in the picture. The experiment comprised six blocks of the fat body condition and six blocks of the thin body condition, and each block consisted of seven images. During the experiment, the brain of each participant was scanned every 2 seconds and total 280 images were taken, and each of image consisted of 131,072 voxels. Hence, for each subject, the high-dimensional time course data have and . The recorded fMRI data are publicly available at

(a) Normal weight, thin-body image (b) Normal weight, fat-body image (c) Overweight, thin-body image (d) Overweight, fat-body image
Figure 5: Activation map for ROIs. In panel (a): left insula (yellow); right EBA (cyan); left ACC (darkblue); right ACC (blue); right MPFC (darkred); left IPL (orange); right IPL (red); right DLPFC (deepskyblue). In panel (b): right amygdala (darkblue); left MPFC (lightgreen); right MPFC (darkred). In panel (c): right FBA (darkblue); left insula (lightgreen); right insula (darkred). In panel (d): right insula (lightgreen); right MPFC (darkred); left DLPFC (darkblue); right DLPFC (deepskyblue); right IPL (orange).

We randomly picked normal weight subject 7 and overweight subject 1 from the fMRI data. Based on Gao et al.(2016), we used the MNI coordinates to partition all voxels into 16 distinct regions of interest (ROIs) which play different functions. For example, previous studies found that inferior parietal lobule (IPL), extrastriate body area (EBA, lateral occipitotemporal cortex) and fusiformbody area (FBA) were related to perceptive processing of body image; Dorsolateral prefrontal cortex (DLPFC) and amygdala were related to affective processing of body image and can be activated when viewing body pictures with negative emotional valence; Medial prefrontal cortex (MPFC) was related with self-reflection; and ACC and insula were related to body dissatisfaction (Wagner et al., 2003; Uher et al., 2005; Kurosaki et al., 2006; Friederich et al., 2007; Miyake et al., 2010; Friederich et al., 2010; Yang et al., 2014).

Although the time points when the different types of images were applied to each subject are known, we treated the data as if such information were not available in advance. We applied the proposed change-point detection method to 16 ROIs with the nominal significance level and the dependence estimated by the elbow method. If an ROI was tested to encounter one or more change points, it was activated. Moreover, the type of image induced the change is known at each identified change point. Figure 5 illustrates the physical locations of the ROIs activated (changes detected) by the thin-body images and fat-body images for the normal weight subject and the overweight subject. More precisely, 5 ROIs were activated for the overweight subject when viewing fat-body images, while only 3 ROIs were activated when viewing thin-body images. On the other hand, for the normal weight subject, 8 ROIs were activated when viewing thin-body images, while only 3 ROIs were activated when viewing fat-body images.

Our results indicate that the overweight subject showed a stronger visual processing of fat body images than thin body images, whereas the normal weight subject showed a stronger visual processing of thin body images than fat body images. Interestingly, we found that ACC was only activated for the normal weight women when viewing thin body images. Such a result was consistent with the findings in Friederich et al. (2007), that healthy women body dissatisfaction and self-ideal discrepancies can be greatly induced by exposure to attractive slim bodies of other women. This may be one reason that normal weight women are more motivated to watch their weight and keep in shape than the overweight women.


Ayyala, D., Park, J. and Roy, A. (2017), “Mean vector testing for high-dimensional dependent observations,” Journal of Multivariate Analysis, 153, 136-155.

Bai, J. (2010), “Common breaks in means and variances for panel data,ÕÕ Journal of Econometrics, 157, 78–92.

Bai, Z. D. and Saranadasa, H. (1996), “Effect of high dimension: By an example of a two sample problem,” Statistica Sinica, 6, 311-329.

Carrasco, M. and Chen, X. (2002), “Mixing and moment properties of various GARCH and stochastic volatility models,” Econometric Theory, 18, 17-39.

Chen, J. and Gupta, A. (1997), “Testing and locating variance change-points with application to stock prices,” Journal of the American Statistical Association, 92, 739-747.

Chen, S. X. and Qin, Y. (2010),“ A two-sample test for high-dimensional data with applications to gene-set testing,” The Annals of Statistics, 38, 808-835.

Chen, H. and Zhang, N. (2015), “Graph-based change-point detection,” The Annals of Statistics, 43, 139-176.

Davis, R. A., Lee, T. and Rodriguez- Yam, G (2006), “Structural break estimation for non-stationary time series,” Journal of the American Statistical Association, 101, 223-239.

——–(2008), “Break detection for a class of nonlinear time series models,” Journal of Time Series Analysis, 29, 834-867.

Desobry, F., Davy, M. and Doncarli, C. (2005), “An online kernel change detection algorithm,” Signal Processing, IEEE Transaction on, 53, 2961-2974.

Friederich, H. C., Brooks, S., Uher, R., Campbell, I. C., Giampietro, V., Brammer, M., Williams, S.C.R., Herzog, W., and Treasure, J. (2010), “Neural correlates of body dissatisfaction in anorexia nervosa,” Neuropsychologia, 48, 2878-2885.

Friederich, H. C., Uher, R., Brooks, S., Giampietro, V., Brammer, M., Williams, S. C., Herzog, W.,Treasure, J., and Campbell, I. (2007), “I’m not as slim as that girl: neural bases of body shape self-comparison to media images,” Neuroimage, 37, 674-681.

Gao, X., Deng, X., Wen, X., She, Y., Vinke, P. and Chen, H. (2016), “My body looks like that girl’s: body mass index modulates brain activity during body images self-reflection among young women,” PLoS ONE, 11, e0164450.

Harchaoui, Z., Moulines, E. and Bach, F. (2009), “Kernel change-point analysis,” Advances in Neural Information Processing Systems, 609-616.

Incln, C. and Tiao, G. (1994), “Use of sums of squares for retrospective detection of changes of variance,” Journal of the American Statistical Association, 89, 913-923.

James, B., James, K. L. and Siegmund, D. (1992), “Asymptotic approximations for likelihood ratio tests and confidence regions for a change-point in the mean of a multivariate normal distribution,” Statistica Sinica, 2, 69-90.

Kokoszka, P. and Leipus, R. (2000), “Change-point estimation in ARCH models,” Bernoulli, 6, 513-539.

Kurosaki, M., Shirao, N., Yamashita, H., Okamoto, Y. and Yamawaki, S. (2006), “Distorted images of one’s own body activates the prefrontal cortex and limbic/paralimbic system in young women: a functional magnetic resonance imaging study,” Biological Psychiatry, 59, 380-386.

Lavielle, M. and Moulines, E. (2000), “Least-squares estimation of an unknown number of shifts in a time series,” Journal of Time Series Analysis, 21, 33-59.

Li, J. and Chen, S. X. (2012), “Two sample tests for high dimensional covariance matrices,ÕÕ The Annals of Statistics, 40, 908–940.

Liu, W. and Shao, Q. (2013), “A Cramer moderate deviation theorem for Hotelling -statistic with applications to global tests,ÕÕ The Annals of Statistics, 41, 296–322.

Matteson, D. and James, N. A. (2014), “A nonparametric approach for multiple change point analysis of multivariate data,” Journal of the American Statistical Association, 109, 334-345.

Miyake, Y., Okamoto, Y., Onoda, K., Kurosaki, M., Shirao, N. and Yamawaki, S. (2010), “Brain activation during the perception of distorted body images in eating disorders,” Psychiatry Research: Neuroimaging, 181, 183-192.

Okamoto, J., Stewart, N. and Li, J. (2018). HDcpDetect: detect change points in means of high dimensional data. R Package Version 0.1.0. (Available from

Olshen, A. and Venkatraman, E. (2004), “Circular binary segmentation for the analysis of array-based DNA copy number data,ÕÕ Biostatistics, 5, 557–572.

Ombao, H. C., Raz, J. A., von Sachs, R. and Molow, B. A. (2001), “Automatic statistical analysis of bivariate nonstationary time series,” Journal of The American Statistical Association, 96, 543-560.

Sen, A. K. and Srivastava, M. S. (1975), “On tests for detecting change in mean,” The Annals of Statistics, 3, 98-108.

Shao, X. and Zhang, X. (2010), “Testing for change points in time series,” Journal of the American Statistical Association, 105, 1228-1240.

Siegmund, D., Yakir, B. and Zhang, N. R. (2011). “Detecting simultaneous variant intervals in aligned sequences,” The Annals of Applied Statistics, 5, 645-668.

Srivastava, M.S. and Worsley, K. J. (1986). “Likelihood ratio tests for a change in the multivariate normal mean,” Journal of the American Statistical Association, 81, 199-204.

Uher, R., Murphy, T., Friederich, H. C., Dalgleish, T., Brammer, M. J. and Giampietro, V.(2005), “Functional neuroanatomy of body shape perception in healthy and eating-disordered women,” Biological Psychiatry, 58, 990-997.

Vostrikova, L. (1981), “Detection of disorder in multidimensional random processes,” Soviet Mathematics Doklady, 24, 55-59.

Wagner, A., Ruf, M., Braus, D. F. and Schmidt, M. H. (2003), “Neuronal activity changes and body image distortion in anorexia nervosa,” Neuroreport, 14, 2193-2197.

Yang, J., Dedovic, K., Guan, L., Chen, Y. and Qi, M. (2014), “Self-esteem modulates dorsal medial prefrontal cortical response to self-positivity bias in implicit self-relevant processing,” Social Cognitive and Affective Neuroscience, 9, 1814-1818.

Zhang, N. R., Siegmund, D. O., Ji, H. and Li, J. Z. (2010), “Detecting simultaneous changepoints in multiple sequences,” Biometrika, 97, 631-645.