Non-Oscillatory Pattern Learning for Non-Stationary Signals

# Non-Oscillatory Pattern Learning for Non-Stationary Signals

Jieren Xu, Yitong Li, David Dunson,
Ingrid Daubechies, Haizhao Yang
Duke University, National University of Singapore
jieren.xu,yitong.li,dunson,ingrid@duke.edu,matyh@nus.edu.sg
###### Abstract

This paper proposes a novel non-oscillatory pattern (NOP) learning scheme for several oscillatory data analysis problems including signal decomposition, super-resolution, and signal sub-sampling. To the best of our knowledge, the proposed NOP is the first algorithm for these problems with fully non-stationary oscillatory data with close and crossover frequencies, and general oscillatory patterns. Even in stationary cases with trigonometric patterns, numerical examples show that NOP admits competitive or better performance in terms of accuracy and robustness than several state-of-the-art algorithms.

Non-Oscillatory Pattern Learning
for Non-Stationary Signals

Jieren Xu, Yitong Li, David Dunson, Ingrid Daubechies, Haizhao Yang Duke University, National University of Singapore jieren.xu,yitong.li,dunson,ingrid@duke.edu,matyh@nus.edu.sg

\@float

noticebox[b]Preprint. Work in progress.\end@float

## 1 Introduction

This paper concerns oscillatory data defined on a time domain of the form:

 f(t)=K∑k=1fk(t)=K∑k=1ak(t)sk(ϕk(t))=K∑k=1∑nˆsk(n)ak(t)einϕk(t)).\vspace0mm (1)

Here and are the latent instantaneous amplitude and phase functions of the th component, which are assumed to be smooth over time. The derivative of a phase function is called an instantaneous frequency function and denoted as . is a periodic shape function with periodicity one satisfying that and a unit -norm on . Figure 1: An example of Model (1) with shape functions sk=segk. (a) Shapes segk. (b) instantaneous frequencies nϕ′1(t) (green) and nϕ′2(t) (red) for all positive n.

Oscillatory data in Model (1) arises in numerous applications HauBio2 (); Pinheiro2012175 (); 7042968 (); Crystal (); LuWirthYang:2016 (); Eng2 (); ME (); 0957-0233-28-3-035102 (); Canvas (); Canvas2 (); GeoReview (); SSCT (); 977903 (); 4685952 (); MUSIC (); SuperResolution:Candes (); gruber1997statistical (); burg1972relationship (); roy1989esprit () and data analysis of this kind has been an active research field for decades. Usually only (and sometimes ) is available and the goal is to estimate , (or ), and from . Hence, this is a general problem including and generalizing sub-problems like adaptive time-frequency analysis Auger1995 (); Daubechies1996 (); YANG2017 (), empirical mode decomposition Huang1998 (); Wu2009 (); Wu2009EEMD (), super-resolution MUSIC (); SuperResolution:Candes (); gruber1997statistical (); burg1972relationship (); roy1989esprit (), pattern recognition zhu2013locally (), etc. In spite of many successful algorithms for solving these sub-problems, to the best of our knowledge, there is no existing algorithm fulfilling the ultimate goal of estimating , (or ), and when is fully non-stationary with close and crossover frequencies, and general shape functions. Many existing algorithms require high sampling rate for better accuracy and robustness, which is not practical due to the limit of battery capacity of mobile devices that collect oscillatory data, e.g. portable health monitors.

Fig. 1 shows a synthetic example of Model (1) when , , , , , and . is in fact a superposition of infinitely many deformed planewaves due to the Fourier series expansion of shape functions. Hence, close and crossover frequencies are unavoidable. In terms of frequencies, due to Heisenberg uncertainty principle, time-frequency analysis methods Auger1995 (); Daubechies1996 (); YANG2017 () are not able to estimate these instantaneous frequencies; due to the fully non-stationary nature of , existing super-resolution methods MUSIC (); SuperResolution:Candes (); gruber1997statistical (); burg1972relationship (); roy1989esprit () are not able to estimate frequencies . Moreover, existing GP based methods wilson2013gaussian (); pan2017prediction (); parra2017spectral (); ulrich2015gp (); quia2010sparse () cannot produce reliable source separation in Model (1). Non-compact support of in the Fourier domain (Fig. 2(b)) creates particular challenges. Some other regression methods have also been designed to estimate shape functions xu2018recursive (); MMD (); FMMD (), but they assume that phase functions are known.

In order to solve the challenge inverse problem implicit to Model (1), it is natural to choose priors within a Bayesian model for the latent components. While there are many possible stochastic process choices, Guassian Processes (GPs) are appealing in enforcing smoothness providing easy inclusion of prior information and leading to computational tractability. This paper proposes a non-oscillatory pattern (NOP) learning scheme, the first framework that can estimate , (or ), and simultaneously from based on GP. NOP repeatedly applies a two-stage iteration until convergence. In the first stage, fixing rough estimations of shapes, a novel non-stationary GP regression is proposed to estimate amplitude and phase functions. A novel set of auxiliary points, referred to as the pattern inducing points are introduced for this purpose. As oppose to traditional stationary kernels, non-stationary kernel functions in our GP model approximately transform a non-stationary data analysis problem into a stationary one, greatly reduce the regression difficulty. Furthermore, we propose to embed the information of rough shape estimation into the GP regression model, reducing a deep GP regression problem zhu2013locally (); damianou2013deep (); dai2015variational () with a composition of two latent variables (e.g. the composition of a shape function and a phase function) into a simpler problem with only one latent variable, the phase functions. This significantly reduce the computational cost and difficulty in the regression.

In the second stage, fixing rough estimations of amplitudes and phases, a non-stationary GP regression is proposed or other interative regression methods in FMMD (); xu2018recursive (); 1DSSWPT (); MMD () is applied to estimate shapes. The main difference of the proposed non-stationary GP regression in the second stage to existing methods is that, the variance of the point estimate of shape functions is simple to derive.

The first and second stages of the algorithm are introduced in Sections 2 and 3, respectively. Section 4 summarizes NOP. As we shall see in the numerical examples in Section 5, NOP works for a wide range of signals in Model (1) and the performance is not sensitive to initialization.

## 2 Estimation of the instantaneous information

In this section, we assume the shape functions are known and aim at estimating the phase and amplitude functions and , respectively.

### 2.1 Main ideas

We start with a simple case when the signal . Assume a non-stationary GP , and a stationary GP , where is the automatic relevance determination (ARD) rasmussen2006gaussian () squared exponential (SE) kernel:

 k(x,x′)=βSEexp(−12α% SE(x−x′)2),\vspace0mm (2)

with kernel parameters and . Now we consider the phase function as the latent variable of the GP , and claim the resulting GP is periodic and stationary in the domain . This is because , by plugging in the new input , the corresponding output is , i.e., any point lies on the curve . seems to be a deep/hierarchical GP model dai2015variational (), but in fact this ‘unwarping’ process directly removes the first GP layer while keeping the second layer stationary. This is a crucial idea for success of our NOP. And we write the one layer GP with input as for . Figure 2: The pattern driven model. (a) the data points are driven and finally match to the underlying pattern s. (b) KNMK−1MMu.

Next, we introduce a point estimate of the phase function as a latent variable. Suppose 111We will use bold font for vectors. is the observations (contaminated by white-noise with standard deviation ) of at time locations . Denote and . As can be seen, a direct inference of the latent , even when is given, is not trivial. This motivates us to introduce a new set of auxiliary points for the GP to retrieve the latent input variable , and to reveal the underlying patterns for Model 1 when is not known. We refer to this novel set of auxiliary points as the pattern inducing points, denoted as , where is the input variable and is the respective output evaluated at the inducing locations . Usually we uniformly discretize and fix at phase locations .

As for sparse inducing points snelson2006sparse (); titsias2009variational (), the pattern inducing points are treated as variational parameters, and can be updated. They possess the benefits of both sparse inducing points and training points rasmussen2006gaussian (), but aim to reveal the true underlying pattern of Model 1 and are endowed with specific meanings. The intuition behind the pattern inducing points is that they serve as certain landmarks, shown as the green dots (input as the green triangles) in Fig. 2, of the true underlying pattern (shaded green line). The latent input (red right triangle in Fig. 2) can drive the dataset (red dots in Fig. 2) across the phase domain (in Fig. 2(a) from (top) the initial wrong to (middle) to (bottom) the correct ), to match the underlying pattern (in green) by these landmarks. The that matches the dataset with the landmark points is the correct since for . So this process is also referred to as a pattern driven method for Model 1. Note that setting of with the SE kernel, is much more stable than with the periodic kernel rasmussen2006gaussian (), which is highly non-convex and can be trapped at many local minima.

Let be the variational distribution to approximate the posterior distribution of the inducing variable . For the purpose of computational efficiency, we adopt , where is the mean of and the variance matrix of is diagonal. In this section since has been observed, and . Given , the latent input that is encoded in the GP kernel can start the data-pattern matching process by the following standard GP setting as when auxiliary/training points are set. Conditioning on and , the likelihood of becomes

 (3)

By further marginalizing out and out, we have

 p(y|ϕ)=N(KNMK−1MMαu,KNN−KNMK−1MMKMN+Σu+σ2IN).\vspace0mm (4)

Here, is an identity matrix of size , is an covariance matrix, 222Denote as the th entry of a vector, and as the th entry of a matrix. for , and similarly for , . Hence, a point estimate of the latent variable conditioning on can be computed by Bayesian approach using Eqn. (4).

In the case of multi-components as in Model (1), we denote , , and for the th component, and . Assuming the independence among variables of the same type, e.g., among , among , etc., Eqn. (4) can be generalized to

 p(y|Φ)=N(K∑k=1KNM,kK−1MM,kαuk,K∑k=1(KNN,k−KNM,kK−1MM,kKMN,k+Σuk)+σ2IN),\vspace0mm (5)

where consists of latent variables as the th column, , , , and are defined for covariance matrices of the th component similarly to the case of one component.

When we have time-varying smooth amplitude functions as in Model (1), let , then are a new set of latent variables of . In this case, Eqn. (5) takes the form,

 p(y|A,Φ)=N(K∑k=1ak⊙KNM,kK−1MM,kαuk,K∑k=1akaTk⊙(KNN,k−KNM,kK−1MM,kKMN,k+Σuk)+σ2IN), (6)

where has colums .

To accelerate the computation, instead of applying the Bayesian approach with Eqn. (6), we directly fit the mean value of the Gaussian process with the observations via a least square (LS) problem:

 (7)

where has been encoded in the kernel matrices in the minimization objective function.

The optimization problem in (7) is non-convex and there is no strong prior of and to provide a good initialization. Note that in most applications, and usually smoothly vary in time. This motivates us to generate signal patches and locally parametrize the phase and amplitude functions with low-degree polynomials in Eqn. (7). Within each patch, the non-stationary signal becomes much more stationary. Hence, our numerical experiments show that degree is sufficiently good.

For each observation path (also denoted as ), let and be the matrices consisting of the coefficients of the -degree polynomials representing the amplitude and phase functions, respectively. Then we can specify the amplitude and phase functions with and from the following LS problem:

 (8)

where is the entry-wise dot product, and has been absorbed in the covariance matrices.

Once the amplitude and phase for the signal patch have been specified, they provide a local point estimate of the amplitude and phase functions, which can be used to obtain a global point estimate via a robust curve fitting algorithm garcia2010robust (); garcia2011fast (). A standard moving average can be applied to estimate the variance of the point estimate.

## 3 Estimation of shape functions

In this section, we assume the phase and amplitude functions and are known and estimate the shape functions . We do not aim at closed formulas for . As introduced in Section 2, it is sufficient to estimate the pattern inducing variables and to represent .

When amplitude and phase functions are given, shape function estimation methods have been studied previously in FMMD (); xu2018recursive (); 1DSSWPT (); MMD (). These methods achieves high accuracy when amplitude and phase function estimation is close to the ground truth. There is no quantitative criteria to measure how well the shape function estimation performs when the amplitude and phase function estimation is not very good. This motivates us to apply variational inference, following titsias2009variational (), that explicitly expresses the distribution of the pattern inducing variables in a joint form. Please refer to Supplemental Material (SM) Section A and B for details of these three well-established approaches.

## 4 Overview of NOP

NOP repeatedly applies a two-stage iteration until convergence. We have introduced the first stage in Section 2: given rough estimations of shapes, a non-stationary GP regression is applied to estimate amplitude and phase functions. The second stage has been introduced in Section 3: given rough estimations of amplitudes and phases, several regression methods can be adopted to estimate shapes. Hence, a complete algorithm description can be summarized in Algorithm 1 below. The respective prediction formulation of Algorithm 1 is derived in SM Section C.

In many applications, e.g. ECG and photoplethysmogram data analysis, heuristic properties of the physical system are available and we know the rough range of instantaneous frequencies . Hence, we can apply a band-pass filter to , and then estimate and of in a certain frequency band using traditional time-frequency analysis methods Auger1995 (); Daubechies1996 (); YANG2017 () to initialize and following the method in 1DSSWPT ().

Another simple initialization is to let . Since we adopt local patch analysis in Section 2, components after Fourier series expansion , , become approximately orthogonal to each other in a short time domain. Hence, the LS in (7) is able to recover the amplitude and phase functions corresponding to since they usually have the largest magnitude.

The LS problems are non-convex and hence we cannot guarantee convergence to the global minimizer. However, the iterative scheme seems to provide very good results and the algorithm generally converges after only a few iterations. The good performance of NOP might come from the fact that once the amplitude and phase function estimates are roughly good (might not be the global minimizer of LS problems), the shape estimation step can quickly provide very good estimation of shape functions, which can be guaranteed if we adopt methods in FMMD (); xu2018recursive (); MMD (). The global convergence analysis would be an interesting future work.

## 5 Experiments

In this section, we provide numerical examples to demonstrate the performance of NOP, especially in the case of super-resolution and adaptive time-frequency analysis. LS problems in all examples are solved by Adam goodfellow2016deep () aiming at better local minimizers. We choose degree-(or degree- when specified) polynomials to approximate local amplitude and phase functions in these LS problems. The hyperparameters of NOP are set as follows: noise level , , and . In the local patch analysis, we generate signal patches such that each patch contains approximately to periods. In the tests for super-resolution, we repeat the same test with noise realizations for the purpose of using the expectation and variance of estimation error to measure the performance of different algorithms.

### 5.1 Super-resolution spectral estimation

There has been substantial research for the super-resolution problem that aims at estimating time-invariant amplitudes and frequencies in a signal with , , and are very close. Among many possible choices, the baseline models might be MUSIC schmidt1986multiple (), ME burg1972relationship (); georgiou2001spectral (); georgiou2002spectral (), and ESPRIT roy1989esprit (). Hence, we will compare NOP with these methods to show the advantages of NOP. Although the Fourier transform usually fails schmidt1986multiple () to identify and , we use its results as the initialization for NOP. Figure 3: Frequency estimation (absolute) error of f{1}(t)=cos(2πω1t)+sin(2πω2t), where ω1=\nicefrac38.81024 and ω2=\nicefrac(38.8+δ0)1024 with different δ0 and white noise N(0,σ2). Figure 4: Left: results of f{2} with different number of samples N=64, 128, 256, and 1024 from top to bottom, and by different methods in an order of NOP, ME, ESPRIT, and MUSIC from left to right. The ground truth frequencies are (2πω1,2πω2)=(0.1,0.15). 100 tests with different noise realization were performed and the estimated frequencies are visualized in a 2D domain centered at the ground truth. Right: the expectation and variance of estimation errors for different methods and numbers of samples.

#### Accuracy and robustness with different spectral gaps

In this experiment, we use , where the two instantiations of / are / (red lines in Fig. 3) and / as in Fig. 1 (pink lines in Fig. 3). Here and , varies from to , and different noise variance are . The sampling rate is Hz and the number of samples is in this example. Fig. 3 shows the frequency estimation accuracy of NOP, MUSIC, ESPRIT, and ME. As we can see, NOP achieves machine accuracy in the noiseless case and is much more accurate than other methods in all noisy cases. It also reaches an approximate same accuracy for the trigonometric (red) and shaped(pink) instantiations in and .

#### Accuracy and robustness with different sampling rates

In this experiment, we set with , , and . The sampling rate of this signal is still Hz and the numbers of samples are , , , and to generate four sets of test data. There are two different kinds of noise to generate noisy test data: 1) white Gaussian noise is directly added to ; 2) a stochastic process in with i.i.d. uniform distribution in is added to phase functions . Fig. 4 summarizes the results of frequency estimation in this experiment. ESPRIT and MUSIC lose accuracy in all tests. NOP and ME achieve high accuracy when the number of samples is large and NOP is slightly better than ME in terms of accuracy and estimation bias.

### 5.2 Estimation of time-variant frequencies

In this section, we show the capacity of NOP for estimating close and crossover time-varying instantaneous frequencies. An adaptive time-frequency analysis algorithm, ConceFT Daubechies20150193 ()), is used as a comparison. And local approximation degree is set to in this section. Figure 5: Short signal f{3}(N=100) with linear frequencies. (a) is the time-frequency representation of the ten realizations with δ0 varies from −\nicefrac510.24 to \nicefrac510.24. (b) and (c) is the frequency (top) and phase (bottom) estimation error for f{3} with sk=sin and sk=segk respecitvely.

#### Close frequencies and phase estimation error

We use , where the two instantiations of / are / (Fig. 5(b)) and / as in Fig. 1 (Fig. 5(c)). varies from to . The white noise has standard deviation . The sampling rate is Hz and samples are involved. The initialization of is set to a little random perturbation of an STFT output. Result is summarized in Fig. 5.

Fig. 5(a) is the ground truth time-frequency representation of ten tested signals with different value of on . The difference between (green line) and (red line) are pretty difficult to be detected by existing time-frequency methods. The log error of the point-wise averaged frequency estimation is shown in the first row of of Fig. 5 (b) and (c) on different noise levels . The log error of point-wise averaged phase estimation (bottom row) is consistently small as changes. Under large noise case with , NOP controls the phase error approximate or below the level of . Compared to the existing time-frequency methods, NOP has no accumulated phase error.

#### Close and crossover frequencies

In this experiment, we generate a signal consisting of two components with close instantaneous frequencies and a signal with two crossover instantaneous frequencies. Fig. 6 visualizes the ground truth instantaneous frequencies, the time-frequency distribution by ConceFT, the initialization and the estimation results of NOP. ConceFT cannot visualize the instantaneous frequencies even if in the noiseless case. We average out the energy distribution of ConceFT to obtain the initialization of NOP. Although the initialization is very poor, NOP is still able to estimate the instantaneous frequencies with a reasonably good accuracy no matter in clean or noisy cases. Remark that a non-distinguishable issue can arise if two component are initialized with a same frequency. However, this problem can be solved by an easy ad hoc trick and is detailed in SM Section D. Figure 6: Instantaneous frequency estimation for signals with close and crossover frequencies. (a) ground truth instantaneous frequencies and initialization of NOP. (b) estimated instantaneous frequencies for clean signals. (c) estimated instantaneous frequencies for noisy signals. (d) time-frequency distribution by ConceFT. Figure 7: NOP is applied to estimate the amplitude, phase, and shapes of a synthetic signal f{6}(t) consisting of two components. (a) the time-frequency distribution of f{6}(t) by ConceFT in two different frequency ranges. ConceFT cannot reveal the ground truth instantaneous frequencies (in red and green). But we can initialize NOP by averaging out the distribution (see the dash pink line). (b) and (c) the ground truth shape functions and their estimates. (d) the noisy signal f{6}(t) and the reconstructed components by NOP.

### 5.3 Estimation of amplitudes, phases, and shapes simultaneously

Finally, we apply NOP to estimate amplitudes, phases, and shapes simultaneously from a single record. First, we generate a synthetic example , where , , and the shapes are visualized in Fig. 7. The sampling rate for this signal is Hz and we sample it at locations. The shape estimates are initialized as and for the first and second components, respectively (see Fig. 7 (b) and (c)). The frequency estimates are initialized as one constant centered in the peak spectrogram by ConceFT (see Fig. 7 (a)). As we can see in Fig. 7 (b) and (c), NOP is able to estimate shape functions with a reasonably good accuracy and the reconstructed components match the ground truth components very well.

In the second example, we apply NOP to a real signal from photoplethysmogram (PPG) (see Fig. 8 (b)). The shape estimates are still initialized as and for the two components, and samples are involved. The PPG signal contains two components corresponding to the health condition of the heart and lung in a human body. Figure 8: (a) PPG signal. (b) reconstructed components of the PPG signal. These two components were reconstructed from only a small portion of the samples of the original PPG raw data as visuallized in the bottom figure.

## 6 Conclusion

This paper proposes a novel non-oscillatory pattern (NOP) learning scheme for several oscillatory data analysis problems including signal decomposition, super-resolution, and signal sub-sampling. To the best of our knowledge, the proposed NOP is the first algorithm for these problems with fully non-stationary oscillatory data with close and crossover frequencies, and general oscillatory patterns. Numerical examples have shown the advantage of NOP over several state-of-the-art algorithms and NOP is able to handle complicated examples for which existing algorithms fail. NOP could be a very useful tool for pattern analysis for oscillatory data. Although we cannot prove the global convergence of NOP, NOP seems to provide very good results in all of our tests. It is interesting to study the global convergence of NOP in the future.

## References

• (1) E. Alonso, E. Aramendi, D. González-Otero, U. Ayala, M. Daya, and J. K. Russell. Empirical mode decomposition for chest compression and ventilation detection in cardiac arrest. In Computing in Cardiology 2014 2014.
• (2) F. Auger and P. Flandrin. Improving the readability of time-frequency and time-scale representations by the reassignment method. Signal Processing, IEEE Transactions on, 1995.
• (3) X. Bai, M. Xing, F. Zhou, G. Lu, and Z. Bao. Imaging of micromotion targets with rotating parts based on empirical-mode decomposition. IEEE Transactions on Geoscience and Remote Sensing 2008.
• (4) E. Bodin, N. D. Campbell, and C. H. Ek. Latent Gaussian Process regression. arXiv preprint arXiv:1707.05534, 2017.
• (5) J. P. Burg. The relationship between maximum entropy spectra and maximum likelihood spectra. Geophysics, 1972.
• (6) B. Cornelis, H. Yang, A. Goodfriend, N. Ocon, J. Lu, and I. Daubechies. Removal of canvas patterns in digital acquisitions of paintings. IEEE Transactions on Image Processing 2017.
• (7) Z. Dai, A. Damianou, J. González, and N. Lawrence. Variational auto-encoded deep Gaussian Processes. arXiv preprint arXiv:1511.06455, 2015.
• (8) A. Damianou and N. Lawrence. Deep Gaussian Processes. In Artificial Intelligence and Statistics, 2013.
• (9) I. Daubechies and S. Maes. A nonlinear squeezing of the continuous wavelet transform based on auditory nerve models. In Wavelets in Medicine and Biology. CRC Press, 1996.
• (10) I. Daubechies, Y. G. Wang, and H.-t. Wu. Conceft: concentration of frequency and time via a multitapered synchrosqueezed transform. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 2016.
• (11) D. Garcia. Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics & Data Analysis, 2010.
• (12) D. Garcia. A fast all-in-one method for automated post-processing of piv data. Experiments in fluids, 2011.
• (13) T. T. Georgiou. Spectral estimation via selective harmonic amplification. IEEE Transactions on Automatic Control, 2001.
• (14) T. T. Georgiou. Spectral analysis based on the state covariance: the maximum entropy spectrum and linear fractional parametrization. IEEE transactions on Automatic Control, 2002.
• (15) I. Goodfellow, Y. Bengio, A. Courville, and Y. Bengio. Deep learning, volume 1. MIT press Cambridge, 2016.
• (16) T. L. Griffiths, M. I. Jordan, J. B. Tenenbaum, and D. M. Blei. Hierarchical topic models and the nested chinese restaurant process. In Advances in neural information processing systems, 2004.
• (17) M. H. Gruber. Statistical digital signal processing and modeling, 1997.
• (18) N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N.-C. Yen, C. C. Tung, and H. H. Liu. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci., 1998.
• (19) W. Huang, Z. Shen, N. E. Huang, and Y. C. Fung. Engineering analysis of biological variables: An example of blood pressure over 1 day. Proc. Natl. Acad. Sci., 1998.
• (20) Y. Huanyin, G. Huadong, H. Chunming, L. Xinwu, and W. Changlin. A sar interferogram filter based on the empirical mode decomposition method. In IGARSS 2001. Scanning the Present and Resolving the Future. Proceedings. IEEE 2001 International Geoscience and Remote Sensing Symposium (Cat. No.01CH37217), volume 5, 2001.
• (21) C. E. J. and F.-G. Carlos. Towards a mathematical theory of super-resolution. Communications on Pure and Applied Mathematics.
• (22) N. Lawrence. Probabilistic non-linear principal component analysis with Gaussian Process latent variable models. Journal of machine learning research, 2005.
• (23) W. Liao and A. Fannjiang. Music for single-snapshot spectral estimation: Stability and super-resolution. Applied and Computational Harmonic Analysis, 2016.
• (24) J. Lu, B. Wirth, and H. Yang. Combining 2D synchrosqueezed wave packet transform with optimization for crystal image analysis. Journal of the Mechanics and Physics of Solids, 2016.
• (25) Y. Pan, X. Yan, E. A. Theodorou, and B. Boots. Prediction under uncertainty in sparse spectrum Gaussian Processes with applications to filtering and control. In ICML, 2017.
• (26) G. Parra and F. Tobar. Spectral mixture kernels for multi-output Gaussian Processes, 2017.
• (27) E. Pinheiro, O. Postolache, and P. Girão. Empirical mode decomposition and principal component analysis implementation in processing non-invasive cardiovascular signals. Measurement, 2012. Special issue on Electrical Instruments.
• (28) J. QuiÃonero-Candela, C. E. Rasmussen, A. R. Figueiras-Vidal, et al. Sparse spectrum Gaussian Process regression. Journal of Machine Learning Research, 2010.
• (29) C. E. Rasmussen and C. K. Williams. Gaussian Processes for machine learning, volume 1. MIT press Cambridge, 2006.
• (30) R. Roy and T. Kailath. Esprit-estimation of signal parameters via rotational invariance techniques. IEEE Transactions on Acoustics, Speech, and Signal Processing 1989.
• (31) Y. Saatçi. Scalable inference for structured Gaussian Process models. PhD thesis, University of Cambridge, 2012.
• (32) R. Schmidt. Multiple emitter location and signal parameter estimation. IEEE transactions on antennas and propagation, 1986.
• (33) E. Snelson and Z. Ghahramani. Sparse Gaussian Processes using pseudo-inputs. In Advances in Neural Information Processing Systems, 2006.
• (34) G. Tang and H. Yang. A fast algorithm for multiresolution mode decomposition. arXiv:1709.06880, 2017.
• (35) J. B. Tary, R. H. Herrera, J. Han, and M. van der Baan. Spectral estimation-What is new? What is next? Rev. Geophys. 2014.
• (36) M. K. Titsias. Variational learning of inducing variables in sparse Gaussian Processes. In International Conference on Artificial Intelligence and Statistics, 2009.
• (37) M. K. Titsias and N. D. Lawrence. Bayesian Gaussian Process latent variable model. In International Conference on Artificial Intelligence and Statistics, 2010.
• (38) K. R. Ulrich, D. E. Carlson, K. Dzirasa, and L. Carin. Gp kernels for cross-spectrum analysis. In NIPS, 2015.
• (39) A. Wilson and R. Adams. Gaussian Process kernels for pattern discovery and extrapolation. In ICML, 2013.
• (40) H.-T. Wu, Y.-H. Chan, Y.-T. Lin, and Y.-H. Yeh. Using synchrosqueezing transform to discover breathing dynamics from ECG signals. Applied and Computational Harmonic Analysis, 2014.
• (41) Z. Wu and N. E. Huang. Ensemble empirical mode decomposition: a noise-assisted data analysis method. Advances in Adaptive Data Analysis, 2009.
• (42) Z. Wu, N. E. Huang, and X. Chen. The multi-dimensional ensemble empirical mode decomposition method. Adv. Adapt. Data Anal., 2009.
• (43) J. Xu, H. Yang, and I. Daubechies. Recursive diffeomorphism-based regression for shape functions. SIAM Journal on Mathematical Analysis, 2018.
• (44) H. Yang. Synchrosqueezed wave packet transforms and diffeomorphism based spectral analysis for 1D general mode decompositions. Applied and Computational Harmonic Analysis, 2015.
• (45) H. Yang. Multiresolution mode decomposition for adaptive time series analysis. arXiv:1709.06880, 2017.
• (46) H. Yang. Statistical analysis of synchrosqueezed transforms. Applied and Computational Harmonic Analysis, 2017.
• (47) H. Yang, J. Lu, W. Brown, I. Daubechies, and L. Ying. Quantitative canvas weave analysis using 2-D synchrosqueezed transforms: Application of time-frequency analysis to art investigation. Signal Processing Magazine, IEEE 2015.
• (48) H. Yang, J. Lu, and L. Ying. Crystal image analysis using 2D synchrosqueezed transforms. Multiscale Modeling & Simulation, 2015.
• (49) H. Yang and L. Ying. Synchrosqueezed curvelet transform for two-dimensional mode decomposition. SIAM Journal on Mathematical Analysis, 2014.
• (50) C. Zhang, Z. Li, C. Hu, S. Chen, J. Wang, and X. Zhang. An optimized ensemble local mean decomposition method for fault detection of mechanical components. Measurement Science and Technology, 2017.
• (51) C. Zhang, Z. Li, C. Hu, S. Chen, J. Wang, and X. Zhang. An optimized ensemble local mean decomposition method for fault detection of mechanical components. Measurement Science and Technology, 2017.
• (52) B. Zhu and D. B. Dunson. Locally adaptive Bayes nonparametric regression via nested Gaussian Processes. Journal of the American Statistical Association, 2013.
You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters   