A Introduction

Intrinsic wavelet regression for curves of Hermitian positive definite matrices

Abstract

In multivariate time series analysis, non-degenerate autocovariance and spectral density matrices are necessarily Hermitian and positive definite and it is important to preserve these properties in any estimation procedure. Our main contribution is the development of intrinsic wavelet transforms and nonparametric wavelet regression for curves in the non-Euclidean space of Hermitian positive definite matrices. The primary focus is on the construction of intrinsic average-interpolation wavelet transforms in the space equipped with a natural invariant Riemannian metric. In addition, we derive the wavelet coefficient decay and linear wavelet thresholding convergence rates of intrinsically smooth curves of Hermitian positive definite matrices. The intrinsic wavelet transforms are computationally fast and nonlinear wavelet shrinkage or thresholding captures localized features, such as cups or kinks, in the matrix-valued curves. In the context of nonparametric spectral estimation, the intrinsic linear or nonlinear wavelet spectral estimator satisfies the important property that it is equivariant under a change of basis of the time series, in contrast to most existing approaches. The finite-sample performance of the intrinsic wavelet spectral estimator based on nonlinear tree-structured trace thresholding is benchmarked against several state-of-the-art nonparametric curve regression procedures in the Riemannian manifold by means of simulated time series data.

\patchcmd

Keywords: Riemannian manifold, Hermitian positive definite matrices, Intrinsic wavelet transform, Nonlinear wavelet thresholding, Spectral matrix estimation, Multivariate time series.

Appendix A Introduction

In multivariate time series analysis, the second-order behavior of a multivariate time series is studied by means of its autocovariance matrices in the time domain, or its spectral density matrices in the frequency domain. Non-degenerate spectral density matrices are necessarily curves of Hermitian positive definite (HPD) matrices, and one generally constrains a spectral curve estimator to preserve these properties. This is important for several reasons: i) interpretation of the spectral estimator as the Fourier transform of symmetric positive definite (SPD) autocovariance matrices in the time domain or as HPD covariance matrices across frequency in the Fourier domain; ii) well-defined transfer functions in the Cramér representation of the time series for the purpose of e.g. simulation of time series and bootstrapping; iii) sufficient regularity to avoid computational problems in subsequent inference procedures (requiring e.g. the inverse of the estimated spectrum). Our main contribution is the development of intrinsic wavelet transforms and nonparametric wavelet regression for curves in the non-Euclidean space of HPD matrices, exploiting the geometric structure of the space as a Riemannian manifold. In this work, the primary focus is on nonparametric spectral density matrix estimation of stationary multivariate time series, but we emphasize that the methodology applies equally to general matrix-valued curve estimation or denoising problems, where the target is a curve of symmetric or Hermitian PD matrices. Examples include; curve denoising of SPD diffusion covariance matrices in diffusion tensor imaging as in e.g. Yuan et al. (2012), or estimation of time-varying autocovariance matrices of a locally stationary time series as in e.g. Dahlhaus (2012).
A first important consideration to perform estimation in the space of HPD matrices is the associated metric in the space. The metric gives the space its curvature and induces a distance between HPD matrices. Traditional nonparametric spectral matrix estimation relies on smoothing the periodogram via e.g. kernel regression as in (Brillinger, 1981, Chapter 5), (Brockwell and Davis, 2006, Chapter 11), or multitaper spectral estimation as in e.g Walden (2000). These approaches equip the space of HPD matrices with the Euclidean (i.e. Frobenius) metric and view it as a flat space. An important disadvantage is that this metric space is incomplete, as the boundary of singular matrices lies at a finite distance. For this reason, flexible nonparametric (e.g. wavelet- or spline-) periodogram smoothing embedded in a Euclidean space cannot guarantee a PD spectral estimate. Exceptions to this rule include inflexible kernel or multitaper periodogram smoothing, which rely on a sufficiently large equivalent smoothing bandwidth for each matrix component. To avoid this issue, Dai and Guo (2004), Rosen and Stoffer (2007) and Krafty and Collinge (2013) among others construct a HPD spectral estimate as the square of an estimated curve of Cholesky square root matrices. This allows for more flexible estimation of the spectrum, such as individual smoothing of Cholesky matrix components, while at the same time guaranteeing a HPD spectral estimate. In this context, the space of HPD matrices is equipped with the Cholesky metric, where the distance between two matrices is given by the Euclidean distance between their Cholesky square roots. Unfortunately, the Cholesky metric and Cholesky-based smoothing are not equivariant to permutations of the components of the underlying time series. That is, if one observes a reordering of the time series components, the spectral estimate is not necessarily a permuted version of the spectral estimate obtained under the original ordering of the time series components.
In this work, we exploit the geometric structure of the space of HPD matrices as a Riemannian manifold equipped with the natural invariant (Smith (2000)) –also affine-invariant (Pennec et al. (2006)), canonical (Holbrook et al. (2016)), trace (Yuan et al. (2012)), Rao-Fisher (Said et al. (2015))– Riemannian metric, or simply the Riemannian metric (Bhatia, 2009, Chapter 6), Dryden et al. (2009)). The natural invariant Riemannian metric plays an important role in estimation problems in the space of symmetric or Hermitian PD matrices for several reasons: (i) the space of HPD matrices equipped with the Riemannian metric is a complete metric space, (ii) there is no swelling effect as with the Euclidean metric, where interpolating two HPD matrices may yield a matrix with a determinant larger than either of the original matrices (e.g. Pasternak et al. (2010)), and (iii) the induced Riemannian distance is invariant under congruence transformation by any invertible matrix, see Section B. The first property guarantees a HPD spectral estimate, while allowing for flexible spectral matrix estimation as with Cholesky-based smoothing. The third property ensures that the spectral estimator is –not only– permutation or unitary congruence equivariant, but also general linear congruence equivariant, which essentially implies that the estimator does not nontrivially depend on the chosen coordinate system of the time series. In Dryden et al. (2009), the authors list several additional suitable metrics to perform estimation in the space of HPD matrices, such as the Log-Euclidean metric also discussed in e.g. Yuan et al. (2012) or Boumal and Absil (2011b). The Log-Euclidean metric transforms the space of HPD matrices in a complete metric space and is unitary congruence invariant, but not general linear congruence invariant.
Several recent works on nonparametric curve regression in the space of SPD matrices equipped with the Riemannian metric include: intrinsic geodesic and linear regression in Pennec et al. (2006) and Zhu et al. (2009) among others, intrinsic local polynomial regression in Yuan et al. (2012) and intrinsic penalized spline-like regression in Boumal and Absil (2011b). In the context of frequency-specific spectral matrix estimation Holbrook et al. (2016) recently introduced a Bayesian geodesic Lagrangian Monte Carlo (gLMC) approach based on the Riemannian metric. The latter may not be best-suited to estimation of the entire spectral curve, as this requires application of the gLMC to each individual Fourier frequency, which is computationally quite challenging. In this work, we develop fast intrinsic wavelet regression in the manifold of HPD matrices equipped with the Riemannian metric. Wavelet-based estimation of spectral matrices allows us to capture potentially very localized features, such as local peaks or troughs in the spectral matrix at pointwise frequencies or frequency bands, in contrast to the approaches mentioned above, which rely on globally homogeneous smoothness behavior. The intrinsic wavelet transform is a generalization of the average-interpolation (AI) wavelet transform on the real line in Donoho (1993). In this sense it is related to the midpoint-interpolation (MI) approach in Rahman et al. (2005) for general symmetric Riemannian manifolds with tractable exponential and logarithmic maps. The fundamental difference lies in the fact that we do not apply a Euclidean refinement scheme to the data projected a set of tangent spaces as in Rahman et al. (2005). Such an approach introduces a certain degree of ambiguity as the base points of the tangent spaces to which the midpoints (i.e. scaling coefficients) are projected need to be specified by the user and different base points lead to different wavelet coefficients. In the introduced intrinsic wavelet transform there is no such ambiguity as we construct an intrinsic AI refinement scheme directly in the manifold equipped with the Riemannian metric. In fact, the construction of the wavelet transform is valid for any of the above-mentioned metrics, and one can substitute the Riemannian metric by e.g. the Log-Euclidean or Cholesky metric. To further illustrate, the intrinsic wavelet transform based on the Euclidean metric simplifies to the matrix version of the AI wavelet transform in Donoho (1993). The second advantage of a truly intrinsic approach is that in contrast to the MI approach in Rahman et al. (2005), the wavelet refinement scheme of order reproduces intrinsic polynomial curves up to order as defined in Hinkle et al. (2014), see Section B. This allows us to derive wavelet coefficient decay and nonparametric estimation rates for smooth curves of HPD matrices.
The structure of the paper is as follows. In Section B, we introduce the necessary geometric notions and tools and in Section C we describe the intrinsic AI refinement scheme and forward and backward wavelet transform in the Riemannian manifold. In Section D, we derive wavelet coefficient decay rates of intrinsically smooth curves of HPD matrices and convergence rates of linear wavelet thresholding. In Section E, we consider intrinsic wavelet regression in the context of spectral matrix estimation. In particular, we examine nonlinear tree-structured thresholding of matrix-valued wavelet coefficients based on their trace as it is shown that, asymptotically, the traces of the wavelet coefficients decompose into a scalar additive signal plus noise model with homogeneous variances across wavelet scales. In Section 6, we compare the finite-sample performance of wavelet-based spectral matrix estimation to several benchmark procedures. The technical proofs are deferred to the appendix and a toolbox for data analaysis and estimation problems in the space of HPD matrices is publicly available in the R-package pdSpecEst on CRAN, Chau (2017).

Appendix B Geometry of HPD matrices

In order to develop an intrinsic wavelet transform for curves in the space of HPD (Hermitian positive definite) matrices, we consider the space as a Riemannian manifold equipped with a specific invariant Riemannian metric, see e.g. Pennec et al. (2006), (Bhatia, 2009, Chapter 6), or Smith (2000) for more details. Below, we introduce the necessary tools and notions for the construction of a proper intrinsic manifold wavelet transform.

Riemannian metric For notational simplicity, let us denote for the space of -dimensional HPD matrices, a -dimensional smooth Riemannian manifold. The tangent space at a point, i.e. a matrix, can be identified by the real vector space of -dimensional Hermitian matrices, and as detailed in Pennec et al. (2006), the Frobenius inner product on induces a natural invariant Riemannian metric on the manifold given by the smooth family of inner products:

 ⟨h1,h2⟩p = Tr((p−1/2∗h1)(p−1/2∗h2)),∀p∈M, (B.1)

with . Here and throughout this paper, always denotes the Hermitian square root matrix of , and we write for matrix congruence transformation, where denotes the conjugate transpose of a matrix. The Riemannian distance on derived from the Riemannian metric is given by:

 δ(p1,p2) = ∥Log(p−1/21∗p2)∥F, (B.2)

where denotes the matrix Frobenius norm and is the matrix logarithm. The mapping is an isometry for each invertible matrix , i.e. it is distance-preserving:

 δ(p1,p2) = δ(a∗p1,a∗p2),∀a∈GL(d,C),

and by (Bhatia, 2009, Theorem 6.1.6 and Prop. 6.2.2), the shortest curve with respect to the Riemannian distance, i.e. the geodesic segment, joining any two points is unique and can be parametrized as,

 γ(p1,p2,t) = p1/21∗(p−1/21∗p2)t,0≤t≤1. (B.3)

Since is a complete separable metric space, the Hopf-Rinow Theorem implies that every geodesic curve can be extended indefinitely.

Exp- and Log-maps The characterization of the intrinsic manifold wavelet coefficients requires the notions of exponential and logarithmic maps, relating the manifold to its tangent spaces in a one-to-one fashion. By (Pennec et al. (2006)) the exponential maps are (locally) diffeomorphic maps from the tangent space at a point to the manifold given by,

 Expp(h) = p1/2∗Exp(p−1/2∗h),∀h∈Tp(M),

where denotes the matrix exponential. Since is a geodesically complete manifold and minimizing geodesics are always unique, it follows by (do Carmo, 1992, Chapter 13) that for each the image of the exponential map is the entire manifold , and the exponential maps are in fact global diffeomorphisms. In the other direction, the logarithmic maps are global diffeomorphisms from the manifold to the tangent space at the point , given by the inverse exponential maps:

 Logp(~p) = p1/2∗Log(p−1/2∗~p).

The Riemannian distance can now also be expressed in terms of the logarithmic map as:

 δ(p1,p2)  =  ∥Logp1(p2)∥p1  =  ∥Logp2(p1)∥p2,∀p1,p2∈M, (B.4)

where throughout this paper denotes the norm of induced by the Riemannian metric.

Parallel transport The parallel transport is required to define intrinsic polynomials and intrinsic Taylor expansions of smooth curves as in (Lang, 1995, Chapter 9), which allow for the derivation of the wavelet coefficient decay for smooth curves in the manifold.
Let for some interval be a smooth curve and be a vector field along , such that for each . The parallel transport transports a vector in the tangent space at to the tangent space at along the curve , such that the inner product between vectors across tangent spaces is preserved, i.e.

 ⟨Γ(γ)t1t0(h1),Γ(γ)t1t0(h2)⟩γ(t1) = ⟨h1,h2⟩γ(t0),∀h1,h2∈Tγ(t0)(M).

The covariant derivative (or affine connection) induced by the Riemannian metric can be retrieved from the parallel transport by differentiation as,

 ∇˙γP(t) = limΔt→0Γ(γ)tt+Δt(P(t+Δt))−P(t)Δt.

Here, the time derivative is a tangent vector in and can be expressed using the logarithmic map (i.e. in terms of normal coordinates) as,

 ˙γ(t) := ddtγ(t) = lim△t→0Logγ(t)(γ(t+△t))△t∈Tγ(t)(M). (B.5)

A vector field is parallel transported along the curve if for each . In particular, geodesic curves –previously defined as shortest line segments on – can also be characterized as the curves for which is parallel transported along the curve itself, i.e. for each .
In , parallel transport of a vector from a point along a geodesic curve in the direction of for time is given explicitly by:

 T(p,Δtv,w) = Expp(Δtv/2)∗p−1∗w.

Substituting , we obtain the whitening transport as a specific case transporting to the tangent space at the identity along a geodesic curve,

 T(p,Logp(Id),w) = p−1/2∗w ∈TId(M). (B.6)

Intrinsic polynomials Intrinsic polynomials as defined in Hinkle et al. (2014) play a key role in the construction of the intrinsic AI refinement scheme. Essentially, polynomial curves of degree on the manifold are defined as the curves with vanishing -th and higher order covariant derivatives. Let be a smooth curve on the manifold, with existing covariant derivatives of all orders, then it is said to be a polynomial curve of degree if,

 ∇ℓ˙γ˙γ(t) = 0,∀ℓ≥k and t∈I,

where . A zero degree polynomial is a curve for which , i.e. a constant curve. A first-degree polynomial is a curve for which corresponding to a geodesic curve, i.e. a straight line on the manifold. In general, higher degree polynomials are difficult to represent in closed form, but discretized polynomial curves are straightforward to generate via numerical integration as outlined in Hinkle et al. (2014).

Intrinsic means A random variable is a measurable function from a probability space to the measurable space , where is the Borel algebra in the complete separable metric space . By , we denote the set of all probability measures on and denotes the subset of probability measures in that have finite moments of order with respect to the Riemannian distance,

 Pp(M) := {ν∈P(M):∃y0∈M, s.t.∫Mδ(y0,x)pν(dx)<∞}

Note that if for some and , this is true for any . This follows by the triangle inequality and the fact that for any , as .
In the intrinsic AI refinement scheme, the center of a random variable is charactarized by its intrinsic (also Karcher or Fréchet) mean, see e.g. Pennec (2006). The set of intrinsic means is given by the points that minimize the second moment with respect to the Riemannian distance,

 μ = Eν[X] := argminy∈supp(ν)∫Mδ(y,x)2 ν(dx).

If , then at least one intrinsic mean exists and since the manifold is a space of non-positive curvature with no cut-locus (see Pennec et al. (2006)), by (Le, 1995, Proposition 1) the intrinsic mean is also unique.

Appendix C Intrinsic AI wavelet transforms

The intrinsic average-interpolation (AI) wavelet transform for curves in the space of HPD matrices is a natural unambiguous generalization of the scalar AI wavelet transform on the real line in Donoho (1993). Although closely related to the midpoint-interpolation (MI) wavelet transform in Rahman et al. (2005), an important difference is that the intrinsic AI wavelet transform reproduces higher-order intrinsic polynomial curves in the manifold, whereas the MI wavelet transform only reproduces first-order polynomials, i.e. geodesic curves. As a consequence, this intrinsic polynomial reproduction property allows us to derive wavelet coefficient decay and nonparametric convergence rates for smooth curves in Section D. We emphasize that the construction of the wavelet transform is based on the idea of lifting transforms, see e.g. Jansen and Oonincx (2005) or Klees and Haagmans (2000) for a general overview of first- and second-generation wavelet transforms using the lifting scheme.

c.1 Intrinsic AI refinement scheme

Midpoint pyramid With in mind the application to spectral matrices, we consider a discretized curve at equidistant points for , with . Here, without loss of generality we set and it is assumed that is dyadic in order to allow for straightforward construction of the wavelet transforms. The latter is not an absolute limitation of the approach, as the lifting wavelet transforms can also be adapted to non-dyadic values of . Given the sequence , we build a redundant midpoint or scaling coefficient pyramid analogous to Rahman et al. (2005). At the finest scale , set for . At the next coarser scale set,

 Mj,k = γ(Mj+1,2k,Mj+1,2k+1,1/2),for k=0,…,2j−1, (C.1)

and continue this coarsening operation up to scale , such that each scale contains a total of midpoints. Here, is the halfway point or midpoint on the geodesic segment connecting , which coincides with the intrinsic sample mean of and .

Intrinsic polynomial interpolation At scale , the intrinsic AI refinement scheme takes as input coarse-scale midpoints and outputs imputed or predicted finer-scale midpoints . The predicted midpoints are computed as the -scale midpoints of the unique intrinsic polynomial with -scale midpoints . In order to reconstruct intrinsic polynomials from a discrete set of points on the manifold, we consider a generalized intrinsic version of Neville’s algorithm as in (Ma and Fu, 2012, Chapter 9.2), replacing ordinary linear interpolation by geodesic interpolation.
Given and , set for all and . The are zero-th order polynomials, since . Iteratively define,

 pi,j(x) := Exppi,j−1(x)(x−xixj−xiLogpi,j−1(x)(pi+1,j(x))),0≤i

where and are the intrinsic polynomials of degree passing through at and through at respectively. Then is the intrinsic polynomial of degree passing through at . Continuing the above iterative reconstruction, at the final iteration we obtain the intrinsic polynomial of order passing through at .
To illustrate, is the geodesic, i.e. first-order intrinsic polynomial, passing through and at and . In general, since geodesically interpolates two polynomials of degree , is itself a polynomial of degree introducing one additional higher-degree non-vanishing covariant derivative. This is exactly analogous to the Euclidean setting, where linear interpolation of two polynomials of degree results in a polynomial of degree .

Midpoint prediction via intrinsic average interpolation

We emphasize that reconstruction of the intrinsic polynomial with -scale midpoints is not the same as reconstructing the intrinsic polynomial passing through the -scale midpoints. To compute predicted midpoints in the intrinsic AI refinement scheme, instead we consider the cumulative instrinsic mean of , , given by:

 My0(y) = ExpMy0(y)(∫yy0% LogMy0(y)(~f(u))du). (C.2)

If is the intrinsic polynomial with -scale midpoints , then equals the cumulative intrinsic average of . The key consideration is that the cumulative intrinsic mean of an intrinsic polynomial of order is again an intrinsic polynomial of order . For instance, given a geodesic segment, i.e. a first-order polynomial, its cumulative intrinsic mean is a geodesic segment moving at half the original speed. Again, this is analogous to the Euclidean setting, where an integrated polynomial remains a polynomial.

Prediction away from the boundary Fix a location on scale for some . Given the neighboring -scale midpoints , the goal is to predict the finer-scale midpoints . The number of collected neighbors is referred to as the order or degree of the refinement scheme. First, to predict the midpoint , fit an intrinsic polynomial of order through the known points by means of Neville’s algorithm, where denotes the cumulative intrinsic average:

 \macc@depth\char1\frozen@everymath\macc@group\macc@set@skewchar\macc@nested@a111Mj−1,ℓ := M(k−L)2−(j−1)((k−L+ℓ)2−(j−1)) = Exp\macc@depth\char1\frozen@everymath\macc@group\macc@set@skewchar\macc@nested@a111Mj−1,ℓ(1ℓ+1k−L+ℓ∑i=k−LLog\macc@depth\char1\frozen@everymath\macc@group\macc@set@skewchar\macc@nested@a111Mj−1,ℓ(Mj−1,i)). (C.3)

By construction of the cumulative intrinsic mean curve, lies on the geodesic segment connecting the known cumulative average and the midpoint . The predicted midpoint is then given by:

 ˜Mj,2k+1 = γ(\macc@depth\char1\frozen@everymath\macc@group\macc@set@skewchar\macc@nested@a111Mj−1,L,ˆM(k−L)2−(j−1)((2k+1)2−j),−2L),

where is replaced by its estimate . We refer to the proof of Proposition D.2 for the derivation of the above expression. The value of directly follows from the midpoint relation as:

 ˜Mj,2k = Mj−1,k∗˜M−1j,2k+1.

An important observation is that if the coarse-scale midpoints are generated from an intrinsic polynomial of degree , then the midpoints are reproduced withour error. This is analogous to the scalar AI refinement scheme in Donoho (1993) and is referred to as the intrinsic polynomial reproduction property.

Prediction at the boundary If is located near or at the boundary, not all symmetric neighbors around are available for prediction of . Instead, collect the closest neighbors of either to the left or right and predict the -scale midpoints as above through -th order intrinsic polynomial interpolation based on the non-symmetric neighbors . This boundary modification preserves the intrinsic polynomial reproduction property. In Figures 1 and 2, we demonstrate successive applications of the intrinsic AI refinement scheme for an interior and a boundary midpoint starting from a sequence of dummy HPD matrix-valued observations at .

Faster midpoint prediction in practice

In the scalar AI refinement scheme on the real line in Donoho (1993) or (Klees and Haagmans, 2000, pg. 95), the predicted -scale scaling coefficients obtained via polynomial average-interpolation of the -scale scaling coefficients are equivalent to weighted linear combinations of the input scaling coefficients, with weights depending on the order of the AI refinement scheme . In the intrinsic version of Neville’s algorithm, the sole change with respect to its Euclidean counterpart is the nature of the interpolation, i.e. linear interpolation is substituted by geodesic interpolation. It turns out that remain weighted averages of the inputs , with the same weights as in the Euclidean case. The difference being that –instead of weighted Euclidean averages– the weighted averages are intrinsic weighted averages in the Riemannian manifold. That is,

 ˜Mj,2k = Exp˜Mj,2k(L∑ℓ=−LCN,2ℓ+N−1Log˜Mj,2k(Mj−1,k+ℓ)), ˜Mj,2k+1 = Exp˜Mj,2k+1(L∑ℓ=−LCN,2ℓ+NLog˜Mj,2k+1(Mj−1,k+ℓ)), (C.4)

where the weights depend on the refinement order and sum up to 2. For instance, away from the boundary; for , ; for , ; for , ; and for , . In the pdSpecEst package, these prediction weights are pre-determined up to order at locations both away of and at the boundary, allowing for faster computation of the predicted midpoints in practice. For higher refinement orders, (i.e. ), the midpoints are predicted via Neville’s algorithm.

c.2 Intrinsic forward and backward AI wavelet transform

Forward wavelet transform The intrinsic AI refinement scheme leads to an intrinsic AI wavelet transform passing from -scale midpoints to -scale midpoints plus -scale wavelet coefficients as follows:

1. Coarsen/Predict: given the -scale midpoints , compute the -scale midpoints through the midpoint relation in eq.(C.1). Select a refinement order and generate the predicted midpoints based on .

2. Difference: given the true and predicted -scale midpoints , define the wavelet coefficients as an intrinsic difference according to,

 Dj,k = 2−j/2Log˜Mj,2k+1(Mj,2k+1)∈T˜Mj,2k+1(M). (C.5)

Note that by definition of the Riemannian distance, giving the wavelet coefficients the interpretation of a (scaled) difference between and . In the remainder, we also keep track of the whitened wavelet coefficients,

 Dj,k = ˜M−1/2j,2k+1∗Dj,k∈TId(M). (C.6)

The whitened coefficients correspond to the coefficients in eq.(C.5) transported to the same tangent space (at the identity) via the whitening transport in eq.(B.6). This allows for straightforward comparison of coefficients across scales and locations in Section D and E, since .

• In terms of memory usage, the forward wavelet transform is executed in place, i.e. one substitutes the old fine-scale midpoints by new coarse-scale midpoints and wavelet coefficients. In particular, the -dimensional midpoints are replaced by the -dimensional midpoint and the -dimensional wavelet coefficient . The coefficients at the even locations do not need to be stored, as they are uniquely determined by and . In addition, if , then also .

Backward wavelet transform
The backward wavelet transform passing from coarse -scale midpoints plus -scale wavelet coefficients to finer -scale midpoints follows from reverting the above operations: Predict/Refine: given -scale midpoints and a refinement order , generate the predicted midpoints and compute the -scale midpoints at the odd locations for through: Complete: the -scale midpoints at the even locations for are retrieved from and through the midpoint relation in eq.(C.1) as, Given the coarsest midpoint at scale and the pyramid of wavelet coefficients , for and , repeating the reconstruction procedure above up to scale , we retrieve the original discretized curve for .

Appendix D Wavelet regression for smooth HPD curves

In this section, we derive the wavelet coefficient decay and linear wavelet thresholding convergence rates in the context of intrinsically smooth curves of HPD matrices. It turns out that the derived rates coincide with the usual scalar wavelet coefficient decay and linear thresholding convergence rates on the real line. Nonlinear thresholding will not improve the convergence rates in the case of a homogeneous smoothness space. However, nonlinear wavelet thresholding is expected to improve the convergence rates in the case of a non-homogeneous smoothness space. This requires a well-defined intrinsic generalization of e.g. the family of Besov spaces to the Riemannian manifold, which is outside the scope of this work.
Repeated midpoint operator
The repeated midpoint operator in eq.(C.1) in the constrution of the midpoint pyramid is a valid intrinsic averaging operator in the sense that it converges to the intrinsic mean in the metric space at a rate conjectured in Rahman et al. (2005) for general Riemannian manifolds. As in Rahman et al. (2005), recursively define, (D.1)
Proposition D.1.
(Law of large numbers) Let , such that with intrinsic mean , and for some . Then, with smaller or equal up to a constant. In particular, as , where the convergence holds with respect to the Riemannian distance, i.e. for every , .
Wavelet coefficient decay of smooth curves
The derivation of the wavelet coefficient decay of intrinsically smooth curves in the Riemannian manifold relies on the fact that the derivative of a smooth curve with can be Taylor expanded in terms of the parallel transport and covariant derivatives according to (Lang, 1995, Chapter 9, Proposition 5.1) as, (D.2) If is an intrinsic polynomial curve of order . Then, since , all terms of order higher or equal to vanish and simplifies to, In the specific case of a first-order polynomial, the above expression reduces to , i.e. is parallel transported along the curve itself, or in other words is a geodesic curve.
Proposition D.2.
Suppose that is a smooth curve with existing covariant derivatives up to order . Then, for each scale sufficiently large and location , where denotes the whitened wavelet coefficient at scale-location as in eq.(C.6), and is the intrinsic AI refinement order. Here, the finest-scale midpoints are given by the local intrinsic averages with for .
Note that the above decay rates correspond to the usual wavelet coefficient decay rates of smooth real-valued curves in a Euclidean space based on wavelets with vanishing moments, see e.g. (Walnut, 2002, Theorem 9.5).
Consistency and convergence rates
The following results detail the convergence rates of linear wavelet regression of intrinsically smooth curves corrupted by noise. In particular, suppose that is an independent sample, such that with and for each . The first proposition gives the estimation error of the empirical wavelet coefficients based on with respect to the true wavelet coefficients of the discretized curve . The proof relies on the rate in Proposition D.1 above.
Proposition D.3.
(Estimation error) Let for some . For each scale sufficiently small and location , it holds that, where is the empirical whitened wavelet coefficient at scale-location , with the estimated repeated midpoint at scale-location based on and the predicted midpoint based on the estimated midpoints .
Combining Proposition D.2 and D.3, the main theorem below gives integrated mean squared error in terms of the Riemannian distance of a linear wavelet estimator of a smooth curve , with existing covariant derivatives up to the intrinsic refinement order , based on the sample of observations . Again, the convergence rates correspond to the usual convergence rates of linear wavelet estimators of smooth real-valued curves in Euclidean space based on wavelets with vanishing moments, see Antoniadis (1997).
Theorem D.4.
(Convergence rates linear thresholding) Let for some , and consider a linear wavelet estimator that thresholds all wavelet coefficients at scales , such that , with the intrinsic AI refinement order. The total mean squared error in terms of the wavelet coefficients satisfies: (D.3) where the sum ranges over and . Moreover, denote by the linear wavelet thresholded curve on the manifold. Then, (D.4)

Appendix E Wavelet-based spectral matrix estimation

In the context of multivariate spectral matrix estimation, consider data observations from a -dimensional strictly stationary time series with spectral matrix and raw periodogram matrix at the Fourier frequencies for . We apply the intrinsic wavelet transform to estimate by denoising the inconsistent spectral estimator