Statistical Analysis of Synchrosqueezed Transforms

# Statistical Analysis of Synchrosqueezed Transforms

Haizhao Yang

Department of Mathematics, Duke University
October 2014, Revised: August 2016
###### Abstract

Synchrosqueezed transforms are non-linear processes for a sharpened time-frequency representation of wave-like components. They are efficient tools for identifying and analyzing wave-like components from their superposition. This paper is concerned with the statistical properties of compactly supported synchrosqueezed transforms for wave-like components embedded in a generalized Gaussian random process in multidimensional spaces. Guided by the theoretical analysis of these properties, new numerical implementations are proposed to reduce the noise fluctuations of these transforms on noisy data. A MATLAB package SynLab together with several heavily noisy examples is provided to support these theoretical claims.

Keywords. Wave-like components, instantaneous (local) properties, synchrosqueezed transforms, noise robustness, generalized Gaussian random process.

AMS subject classifications: 42A99 and 65T99.

## 1 Introduction

Non-linear and non-stationary wave-like signals (also termed as chirp signals in D) are ubiquitous in science and engineering, e.g., clinical data [43, 44], seismic data [19, 22, 34], climate data [36, 45], astronomical data [7, 11], materials science [26, 33, 50], and art investigation [49]. Analyzing instantaneous properties (e.g., instantaneous frequencies, instantaneous amplitudes and instantaneous phases [6, 32]) or local properties (concepts for D signals similar to “instantaneous” in D) of signals has been an important topic for over two decades. In many applications [15, 23, 41, 42, 51, 52], these signals can be modeled as a superposition of several wave-like components with slowly varying amplitudes, frequencies or wave vectors, contaminated by noise. For example, a complex signal

 f(x)=K∑k=1αk(x)e2πiNkϕk(x)+e(x), (1)

where is the instantaneous (local) amplitude, is the instantaneous (local) phase, is the instantaneous frequency (or as the local wave vector), and is a noisy perturbation term.

A powerful tool for analyzing signal (1) is the synchrosqueezed transform (SST) consisting of a linear time-frequency analysis tool and a synchrosqueezing technique. It belongs to more generally time-frequency reassignment techniques [3, 8, 9, 16] (see also [4] for a recent review). The SST was initialized in [16] and further analyzed in [15] for the D wavelet transform. Suppose is the wavelet transform of a D wave-like component . The wavelet time-frequency representation has a wide support spreading around the instantaneous frequency curve . It was proved that the instantaneous frequency information function is able to approximate . Hence, the synchrosqueezing technique shifts the value of from to , generating a sharpened time-frequency representation with a support concentrating around the curve . The localization of the new representation not only improves the resolution of the original spectral analysis due to the uncertainty principle but also make it easier to decompose the superposition in (1) into individual components.

A variety of SSTs have been proposed after the D synchrosqueezed wavelet transform (SSWT) in [16], e.g., the synchrosqueezed short time Fourier transform (SSSTFT) in [37], the synchrosqueezed wave packet transform (SSWPT) in [48, 51], the synchrosqueezed curvelet transform (SSCT) in [52] and the D monogenic synchrosqueezed wavelet transform in [14]. Rigorous analysis has proved that these transforms can accurately decompose a class of superpositions of wave-like components and estimate their instantaneous (local) properties if the given signal is noiseless. To improve the synchrosqueezing operator in the presence of strongly non-linear instantaneous frequencies, some further methods have been proposed in [4, 25, 28] based on an extra investigation of the higher order derivatives of the phase of a wave-like component. All these SSTs are compactly supported in the frequency domain to ensure accurate estimations. To better analyze signals with a trend or big data sets that need to be handled in real-time computation, a recent paper [13] proposes a new synchrosqueezing method based on carefully designed wavelets with sufficiently many vanishing moments and a minimum support in the time domain. Previous synchrosqueezed transforms need to access every data sample of to compute the synchrosqueezed transform at a specific time or location . However, compactly supported synchrosqueezed transforms only need a small portion of the data samples in a neighborhood of . Hence, compactly supported synchrosqueezed transforms are computationally more efficient and are better tools for modern big data analysis. However, mathematical analysis on the accuracy of this compactly supported SSWT is still under development. This paper addresses this problem in the framework of the SSWPT that includes the SSWT as a special case.

Another important topic in the study of SSTs is the statistical analysis of the synchrosqueezing operator, since noise is ubiquitous in real applications. A pioneer paper [10] in this direction studied the statistical properties of the D spectrogram reassignment method by calculating the probability density function of white Gaussian noise after reassignment. A recent paper [36] focused on the statistical analysis of the D SSWT for white Gaussian noise. It estimated the probability of a good estimation of the instantaneous frequency provided by the instantaneous frequency information function . A following paper [12] generalized its results to generalized stationary Gaussian process. To support the application of SSTs to real-time problems and multidimensional problems, this paper analyzes the statistical properties of multidimensional SSTs that can be compactly supported in the time domain.

Turning to the robustness issue in a numerical sense, it is of interest to design an efficient implementation of a sharpened time-frequency representation with reduced noise fluctuations. The idea of multitapering, first proposed in [38] for stationary signals and further extended in [5, 20] for non-stationary signals, attempted to improve the statistical stability of spectral analysis by generating multiple windowed trials of the noisy signal and averaging the spectral analysis of these trials. By combining the multitapering and time-frequency reassignment techniques, [46] proposed the D multitapering time-frequency reassignment for a sharpened time-frequency representation with reduced noise fluctuations. Since its implementation is based on Hermite functions, its efficient generalization in multidimensional spaces is not straightforward. Guided by the theoretical analysis of the statistical properties of SSTs, this paper proposes efficient numerical implementations of multidimensional SSTs based on highly redundant frames. Since the SST of a noiseless signal is frame-independent, averaging the SSTs from multiple time-frequency frames reduces the noise fluctuation while keeping the localization of the synchrosqueezed time-frequency representation.

The rest of this paper is organized as follows. In Section 2, the main theorems for the compactly supported SSTs in multidimensional spaces and their statistical properties are presented. In Section 3, a few algorithms and their implementations are introduced in detail to improve the statistical stability of SSTs. Several numerical examples with heavy noise are provided to demonstrate the proposed properties. We conclude this paper in Section 4.

## 2 Theory for synchrosqueezed transforms (SSTs)

Let us briefly introduce the basics and assumptions of compactly supported SSTs in Section 2.1 before discussing their statistical properties in Section 2.2. While the synchrosqueezing technique can be applied to a wide range of time-frequency transforms, the discussion here is restricted to the framework of multidimensional wave packet transforms to save space. It is easy to extend these results to other transforms (see [47] for the example of the D synchrosqueezed curvelet transform). Due to the space limitation, only the main ideas of the proofs are presented. Readers are referred to [47] for more details.

### 2.1 Compactly supported SSTs

Previously, the synchrosqueezed wave packet transform (SSWPT) was built using mother wave packets compactly supported in the frequency domain. This paper studies a wider class of mother wave packets defined below.

###### Definition 2.1.

An -dimensional mother wave packet is of type for some , and some non-negative integer , if is a real-valued smooth function with a support that covers the ball centered at the origin with a radius satisfying that:

 |ˆw(ξ)|≤ϵ(1+|ξ|)m,

for and , for .

Since , the above decaying requirement is easy to satisfy. Actually, we can further assume is essentially supported in a ball with to adapt signals with close instantaneous frequencies, i.e., is approximately zero outside this support up to an truncation error. However, is just a constant in later asymptotic analysis. Hence, we omit this discussion and consider it as in the analysis but implement it in the numerical tool. Similarly to the discussion in [48, 51], we can use this mother wave packet to define a family of -dimensional compactly supported wave packets

 wab(x)=|a|ns/2w(|a|s(x−b))e2πi(x−b)⋅a,

through scaling, modulation, and translation, controlled by a geometric parameter , for , . With this family of wave packets ready, we define the wave packet transform via

 Wf(a,b) =⟨f,wab⟩=∫f(x)¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯wab(x)dx.

Previously in [48, 51], the synchrosqueezed wave packet transform (SSWPT) was proposed to analyze a class of intrinsic mode type functions as defined below.

###### Definition 2.2.

A function in is an intrinsic mode type function (IMT) of type if and satisfy

 α(x)∈C∞,|∇α(x)|≤M,1/M≤α(x)≤M, ϕ(x)∈C∞,1/M≤|∇ϕ(x)|≤M,|∇2ϕ(x)|≤M.

It has been proved that the instantaneous frequency (or local wave vector when ) information function of an IMT can approximate if the mother wave packet is of type and is sufficiently large. A careful inspection of previous proofs shows that the approximation is still valid up to an relative error if the mother wave packet is of type for any positive integer. See Theorem 2.2.7 in [47] for a detailed proof. Hence, if we squeeze the coefficients together based upon the same information function , then we would obtain a sharpened time-frequency representation of . This motivates the definition of the synchrosqueezed energy distribution

 Tf(v,b)=∫Rn|Wf(a,b)|2δ(Rvf(a,b)−v)da

for . Here denotes the Dirac delta function and means the real part of .

For a multi-component signal , the synchrosqueezed energy of each component will also concentrate around each if these components satisfy the well-separation condition defined below.

###### Definition 2.3.

A function is a well-separated superposition of type if

 f(x)=K∑k=1fk(x),

where each is an IMT of type with and the phase functions satisfy the separation condition: for any , there exists at most one satisfying that

 |a|−s|a−Nk∇ϕk(b)|≤1.

We denote by the set of all such functions.

In real applications, this well-separation condition might not be valid for a multi-component signal at every . However, the SST will work wherever the well-separation condition is satisfied locally.

The key analysis of the SSWPT is how well the information function approximates the instantaneous frequencies or local wave vectors. If the approximation is accurate enough, the synchrosqueezed energy distribution gives a sharpened time-frequency representation of . We close this section with the following theorem that summarizes the main analysis of the -dimensional SSWPT for a superposition of IMTs without noise or perturbation. In what follows, when we write , , or , the implicit constants may depend on , and . Readers are referred to Theorem 2.2.7 in [47] for the proof of Theorem 2.4 here.

###### Theorem 2.4.

Suppose the n-dimensional mother wave packet is of type , for any fixed and any fixed integer . For a function , we define

 Rϵ={(a,b):|Wf(a,b)|≥|a|−ns/2√ϵ},
 Sϵ={(a,b):|Wf(a,b)|≥√ϵ},

and

 Zk={(a,b):|a−Nk∇ϕk(b)|≤|a|s}

for . For fixed , , , , and , there exists a constant such that for any and the following statements hold.

1. are disjoint and ;

2. For any ,

 |vf(a,b)−Nk∇ϕk(b)||Nk∇ϕk(b)|≲√ϵ;
3. For any ,

 |vf(a,b)−Nk∇ϕk(b)||Nk∇ϕk(b)|≲N−ns/2k√ϵ.

### 2.2 Statistical Properties of SSTs

Similarly to the noiseless case, we will analyze how well the information function approximates instantaneous frequencies or local wave vectors in the case when a superposition of IMTs is contaminated by random noise. To simplify the discussion, we will sketch out the proofs in the one-dimensional case and refer the readers to [47] for higher dimensional cases.

Let us start with a simple case in which the superposition is perturbed slightly by a contaminant, Theorem 2.5 below shows that the information function can approximate instantaneous frequencies with a reasonable error determined by the magnitude of the perturbation.

###### Theorem 2.5.

Suppose the mother wave packet is of type , for any fixed and any fixed integer . Suppose , where is a small error term that satisfies for some . For any , let . Define

 Rδ={(a,b):|Wg(a,b)|≥|a|−s/2δ},
 Sδ={(a,b):|Wg(a,b)|≥δ},

and

 Zk={(a,b):|a−Nkϕ′k(b)|≤|a|s}

for . For fixed , , , , and , there exists a constant such that for any and the following statements hold.

1. are disjoint and ;

2. For any ,

 |vg(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|≲√ϵ+ϵp1;
3. For any ,

 |vg(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|≲√ϵ+ϵp1Ns/2k.

We introduce the parameter to clarify the relation among the perturbation, the threshold and the accuracy for better understanding the influence of perturbation or noise. For the same purpose, a parameter will be introduced in the coming theorems. If the threshold is larger, e.g., , the relative estimate errors in and are bounded by and , respectively. Similarly, one can show that the information function computed from the wave packet coefficient with a larger magnitude can better approximate the instantaneous frequency.

Below is a sketch of the proof of Theorem 2.5. See the proof of Theorem 3.2.1 in [47] for a detailed proof.

###### Proof.

We only need to discuss the case when . By the definition of the wave packet transform, we have

 |We(a,b)|≲√ϵ1a−s/2 and |∂bWe(a,b)|≲√ϵ1(a1−s/2+as/2). (2)

If , then . Together with Equation (2), it holds that

 |Wf(a,b)|≥|Wg(a,b)|−|We(a,b)|≥a−s/2(δ−√ϵ1)≥a−s/2√ϵ. (3)

Hence, , where is defined in Theorem 2.4 and is a subset of . So, is true by Theorem 2.4.

Since , implies . Hence, by Theorem 2.4, it holds that

 |vf(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|≲√ϵ, (4)

when is larger than a constant . Notice that implies . Hence, by Equation (2) to (4),

 |vg(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|≤|vf(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|+|∂bWf(a,b)2πiWf(a,b)−∂bWg(a,b)2πiWg(a,b)||Nkϕ′k(b)|≤√ϵ+ϵp1,

when . Hence, is proved. The proof of is similar. ∎

Next, we will discuss the case when the contamination is a random perturbation. [21, 24, 27, 35, 39] are referred to for basic facts about generalized random fields and complex Gaussian processes. To warm up, we start with additive white Gaussian process in Theorem 2.6 and extend it to a generalized zero mean stationary Gaussian process in Theorem 2.7. We assume that has an explicit power spectral function denoted by . represents the norm and is the standard inner product.

###### Theorem 2.6.

Suppose the mother wave packet is of type , for any fixed and any fixed integer . Suppose , where is zero mean white Gaussian process with a variance for some and some . For any , let . Define

 Rδ={(a,b):|Wg(a,b)|≥a−s/2δ},
 Sδ={(a,b):|Wg(a,b)|≥δ},

and

 Zk={(a,b):|a−Nkϕ′k(b)|≤as}

for . For fixed , , , , and , there exists a constant such that for any and the following statements hold.

1. are disjoint.

2. If , then with a probability at least

 1−e−O(N−skϵ−q1)+O⎛⎝ϵNm(1−s)k⎞⎠.
3. If , then with a probability at least

 1−e−ϵ−q1∥w∥−2+O⎛⎝ϵNm(1−s)k⎞⎠.
4. If for some , then

 |vg(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|≲√ϵ+ϵp1

is true with a probability at least

 (1−e−O(N2−3skϵ−q1))(1−e−O(N−s−2kϵ−q1))+O⎛⎝ϵN(m−4)(1−s)−2k⎞⎠.
5. If , then

 |vg(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|≲√ϵ+ϵp1Ns/2k

is true with a probability at least

 (1−e−O(N2−2skϵ−q1))(1−e−O(N−2kϵ−q1))+O⎛⎝ϵN(m−4)(1−s)−2k⎞⎠.

We only sketch the proof of the above theorem. See the proof of Theorem 3.2.2 in [47] for a long proof.

###### Proof.

Step : we prove this theorem when the mother wave packet is of type first, i.e., compactly supported in the frequency domain.

Since and are in , and are Gaussian variables. Hence, and can be understood as Gaussian variables. Furthermore, and are circularly symmetric Gaussian variables by checking that their pseudo-covariance matrices are zero. Therefore, the distribution of is determined by its variance

 e−ϵ−(1+q)1|z1|2∥w∥−2πϵ1+q1∥w∥2.

If we define

 V=(∥ˆw∥2⟨ˆwab,2πiξˆwab⟩⟨2πiξˆwab,ˆwab⟩⟨2πiξˆwab,2πiξˆwab⟩),

then is the covariance matrix of and its distribution is described by the joint probability density

 e−ϵ−(1+q)1z∗V−1zπ2ϵ2(1+q)1detV,

where , and denote the transpose operator and conjugate transpose operator. is an invertible and self-adjoint matrix, since and are linearly independent. Hence, there exist a diagonal matrix and a unitary matrix such that .

Part is true by previous theorems. Define the following events

 G1={|We(a,b)|
 G2={|We(a,b)|<√ϵ1},
 G3={|∂bWe(a,b)|<√ϵ1(a1−s/2+as/2)},
 Hk={|vg(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|≲√ϵ+ϵp1},

and

 Jk=⎧⎨⎩|vg(a,b)−Nkϕ′k(b)||Nkϕ′k(b)|≲√ϵ+ϵp1Ns/2k⎫⎬⎭,

for . To conclude Part to , we need to estimate the probability , , , , and . Algebraic calculations show that

 P(G1)=1−e−a−sϵ−q1∥w∥−2 and P(G2)=1−e−ϵ−q1∥w∥−2.

We are ready to summarize and conclude and . If , then

 |We(a,b)+Wf(a,b)|≥a−s/2(ϵ1/2−p1+√ϵ). (5)

If , then

 |Wf(a,b)|≤a−s/2ϵ. (6)

Equation (5) and (6) lead to . Hence,

 P((a,b)∉⋃1≤k≤KZk)≤P(|We(a,b)|≥a−s/2√ϵ1)=1−P(G1).

This means that if , then with a probability at least , since if . So, is true. A similar argument applied to shows that with a probability at least . Hence, is proved.

Recall that . By the change of variables , we can show that

 P(G1∩G3)≥⎛⎜ ⎜⎝1−e−D11d21ϵ1+q1⎞⎟ ⎟⎠⎛⎜ ⎜⎝1−e−D22d21ϵ1+q1⎞⎟ ⎟⎠,

and

 P(G2∩G3)≥⎛⎜ ⎜⎝1−e−D11d22ϵ1+q1⎞⎟ ⎟⎠⎛⎜ ⎜⎝1−e−D22d22ϵ1+q1⎞⎟ ⎟⎠.

We can further estimate that and . Therefore,

 P(G1∩G3)≥(1−e−O(a2−3sϵ−q1))(1−e−O(a−s−2ϵ−q1)),

and

 P(G2∩G3)≥(1−e−O(a2−2sϵ−q1))(1−e−O(a−2ϵ−q1)).

By Theorem 2.5, if for some , then

 P(Hk)≥P(Hk|G1∩G3)P(G1∩G3)=P(G1∩G3)≥(1−e−O(a2−3sϵ−q1))(1−e−O(a−s−2ϵ−q1)).

Note that when , then

 P(Hk)≥(1−e−O(N2−3skϵ−q1))(1−e−O(N−s−2kϵ−q1)).

Similarly, if for some , then

 P(Jk)≥P(Jk|G2∩G3)P(G2∩G3)=P(G2∩G3)≥(1−e−O(N2−2skϵ−q1))(1−e−O(N−2kϵ−q1)).

These arguments prove and .

Step : we go on to prove this theorem when the mother wave packet is of type with . We would like to emphasize that the requirement is crucial to the following asymptotic analysis and it keeps the error caused by the non-compact support of reasonably small.

The sketch of the proof is similar to the first step. and are still Gaussian variables but in general not circularly symmetric, because they would not have zero pseudo-covariance matrices. Suppose they have covariance matrices and , pseudo-covariance matrices and , respectively. We can still check that they have zero mean, and , where is defined in the first step. By the definition of the mother wave packet of type , the magnitude of every entry in and is bounded by . Notice that the covariance matrix of is

 V1=(C1P1P∗1C∗1).

By Equation in [27] and the Taylor expansion, the distribution of is described by the following distribution

 e−12(z∗1,z1)V−11(z1,z∗1)Tπ√detV1=e−ϵ−(1+q)1|z1|2∥w∥−2πϵ1+q1∥w∥2(1+O(ϵ|z1|2ϵ1+q1am(1−s))).

By the same argument, the covariance matrix of is

 V2=(C2P2P∗2C∗2).

Let . Then the distribution of is described by the joint probability density

 e−12(z∗1,z∗2,z1,z2)V−12(z1,z2,z∗1,z∗2)Tπ2√detV2. (7)

Notice that and has eigenvalues of order and determined by estimating the diagonal entries of the matrix in the diagonalization . Hence, has eigenvalues of order and . Recall that the magnitude of every entry in is bounded by . This means that is nearly dominated by diagonal blocks and . Basic spectral theory for linear transforms shows that

 V−12=(C−12(C∗2)−1)+Pϵ,

where is a matrix with -norm bounded by . is crucial to the above spectral analysis. Since every entry of is bounded by ,

 detV2=(detC2)2+O⎛⎝ϵ4(1+q)1ϵam−2−(m+2)s⎞⎠,

where the residual comes from the entry bound and the eigenvalues of . Hence (7) is actually

 e−ϵ−(1+q)1z∗V−1ze−12(z∗1,z∗2,z1,z2)Pϵ(z1,z2,z∗1,z∗2)Tπ2ϵ2(1+q)1√(detV)2+O(ϵam−2−(m+2)s)).

By the same argument as in the first step, we can show that there exist a diagonal matrix and a unitary matrix such that . Part is still true by previous theorems. To conclude Part to , we still need to estimate the probability of those events defined in the first step, i.e., , , , , and . By the estimations above, one can show that

 P(G1)=1−e−a−sϵ−q1∥w∥−2+O(ϵam(1−s)),

and

 P(G