A Second Order Non-Smooth Variational Model
for Restoring
Manifold-Valued Images
Abstract
We introduce a new non-smooth variational model for the restoration of manifold-valued data which includes second order differences in the regularization term. While such models were successfully applied to real-valued images, we introduce the second order difference and the corresponding variational models for manifold data, which up to now existed only for cyclic data. The approach requires a combination of techniques from numerical analysis, convex optimization and differential geometry. First, we establish a suitable definition of absolute second order differences for signals and images with values in a manifold. Employing this definition, we introduce a variational denoising model based on first and second order differences in the manifold setup. In order to minimize the corresponding functional, we develop an algorithm using an inexact cyclic proximal point algorithm. We propose an efficient strategy for the computation of the corresponding proximal mappings in symmetric spaces utilizing the machinery of Jacobi fields. For the sphere and the manifold of symmetric positive definite matrices, we demonstrate the performance of our algorithm in practice. We prove the convergence of the proposed exact and inexact variants of the cyclic proximal point algorithm in Hadamard spaces. These results, which are of interest in their own right, cover, e.g., the manifold of symmetric positive definite matrices.
Keywords.
manifold-valued data, second order differences, TV-like methods on manifolds, non-smooth variational methods, Jacobi fields, Hadamard spaces, proximal mappings, DT-MRI
1 Introduction
In this paper, we introduce a non-smooth variational model for the restoration of manifold-valued images using first and second order differences. The model can be seen as a second order generalization of the Rudin-Osher-Fatemi (ROF) functional [60] for images taking their values in a Riemannian manifold. For scalar-valued images, the ROF functional in its discrete, anisotropic, penalized form is given by
$\operatorname*{argmin}_{u} \; \tfrac{1}{2} \| f - u \|_2^2 + \alpha \| \nabla u \|_1, \qquad \alpha > 0,$
where $f$ is a given noisy image and the symbol $\nabla$ is used to denote the discrete first order difference operator, which usually contains the forward differences in vertical and horizontal directions. The frequently used ROF denoising model preserves important image structures such as edges, but tends to produce staircasing: instead of reconstructing smooth areas as such, the reconstruction consists of constant plateaus with small jumps. An approach for avoiding this effect incorporates higher order differences, respectively derivatives in a continuous setting. The pioneering work [15] couples the TV term with higher order terms by infimal convolution. Since then, various techniques with higher order differences/derivatives were proposed in the literature, among them [13, 17, 21, 22, 36, 43, 45, 46, 50, 61, 62, 63]. We further note that the second order total generalized variation was extended to tensor fields in [72].
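In the Euclidean setting, the discrete anisotropic ROF energy above is straightforward to evaluate. The following minimal Python sketch (the function name `rof_energy` and the test image are our own illustration, not code from the paper) computes the data term plus the $\ell_1$ norm of the forward differences:

```python
import numpy as np

def rof_energy(u, f, alpha):
    """Discrete anisotropic ROF energy: 0.5*||f - u||_2^2 + alpha*||grad u||_1,
    using forward differences in vertical and horizontal direction."""
    data = 0.5 * np.sum((f - u) ** 2)
    dv = np.abs(np.diff(u, axis=0)).sum()   # vertical forward differences
    dh = np.abs(np.diff(u, axis=1)).sum()   # horizontal forward differences
    return data + alpha * (dv + dh)

f = np.array([[0.0, 1.0], [0.0, 1.0]])
print(rof_energy(f, f, 0.1))  # data term vanishes; only the TV of f remains
```

Minimizing this energy over $u$ is what produces the staircasing effect discussed above.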
In various applications in image processing and computer vision the functions of interest take values in a Riemannian manifold. One example is diffusion tensor imaging where the data lives in the Riemannian manifold of positive definite matrices; see, e.g., [7, 14, 53, 65, 75, 78]. Other examples are color images based on non-flat color models [16, 40, 41, 73] where the data lives on spheres. Motion group and rotation group valued data play a role in tracking, robotics and (scene) motion analysis and were considered, e.g., in [28, 52, 55, 58, 70]. Because of the natural appearance of such nonlinear data spaces, processing manifold-valued data has gained a lot of interest in applied mathematics in recent years. As examples, we mention wavelet-type multiscale transforms [35, 54, 76], robust principal component pursuit on manifolds [37], and partial differential equations [19, 32, 69] for manifold-valued functions. Although statistics on Riemannian manifolds is not the focus of this work, we want to mention that many papers on this topic have appeared in recent years.
In [30, 31], the notion of total variation of functions having their values on a manifold was investigated based on the theory of Cartesian currents. These papers extend the previous work [29] where circle-valued functions were considered. The first works which apply a TV approach to circle-valued data for image processing tasks are [66, 67]. An algorithm for TV regularized minimization problems on Riemannian manifolds was proposed in [44]. There, the problem is reformulated as a multilabel optimization problem which is approached using convex relaxation techniques. Another approach to TV minimization for manifold-valued data, which employs cyclic and parallel proximal point algorithms and does not require labeling and relaxation techniques, was given in [77]. In a recent approach [34], the restoration of manifold-valued images was done using a smoothed TV model and an iteratively reweighted least squares technique. A method which circumvents the direct work with manifold-valued data, by embedding the matrix manifold in the appropriate Euclidean space and applying a back projection to the manifold, was suggested in [59]. This can also be extended to higher order derivatives since the derivatives (or differences) are computed in the Euclidean space.
Our paper is the next step in a program already consisting of a considerable body of work of the authors: in [77], variational models using first order differences for general manifold-valued data were developed. In [9], variational models using first and second order differences for circle-valued data were introduced. Using a suitable definition of second order differences on the circle, the authors incorporate higher order differences into the energy functionals to improve the denoising results for circle-valued data. Furthermore, convergence for locally nearby data is shown. Our paper [11] extends this approach to product spaces of arbitrarily many circles and a vector space, and [10] to inpainting problems. Product spaces are important, for example, when dealing with nonlinear color spaces such as HSV.
This paper continues our recent work considerably by generalizing the combined first and second order variational models to general symmetric Riemannian manifolds. Besides cyclic data, this includes general spheres, hyperbolic spaces, and symmetric positive definite matrices, as well as compact Lie groups and Grassmannians. First, we provide a novel definition of absolute second order differences for data with values in a manifold. The definition is geometric and particularly appealing since it avoids using the tangent bundle. As a result, it is computationally accessible by the machinery of Jacobi fields which, in particular in symmetric spaces, yields rather explicit descriptions even in this generality. Employing this definition, we introduce a variational model for denoising based on first and second order differences in the Riemannian manifold setup. In order to minimize the corresponding functional, we follow [9, 11, 77] and use a cyclic proximal point algorithm (PPA). In contrast to the aforementioned references, in our general setup no closed form expressions are available for some of the proximal mappings involved. Therefore, we use, as an approximation strategy, a subgradient descent to compute them, and we derive an efficient scheme for this purpose. We show the convergence of the proposed exact and inexact variants of the cyclic PPA in a Hadamard space. This extends a result from [4], where the exact cyclic PPA in Hadamard spaces was proved to converge under more restrictive assumptions. Note that the basic (batch) version of the PPA in Hadamard spaces was introduced in [3]. Another related result is due to S. Banert [6], who developed both exact and inexact PPAs for a regularized sum of two functions on a product of Hadamard spaces. In the context of Hadamard manifolds, the convergence of an inexact proximal point method for multivalued vector fields was studied in [74].
In this paper, we prove the convergence of the (inexact) cyclic PPA under the general assumptions required by our model, which differ from those of the cited papers.
Our convergence statements apply in particular to the manifold of symmetric positive definite matrices. Finally, we demonstrate the performance of our algorithm in numerical experiments for denoising of images with values in spheres as well as in the space of symmetric positive definite matrices.
Our main application examples, namely spheres and manifolds of symmetric positive definite matrices, are, with respect to the sectional curvature, two extreme instances of symmetric spaces. The spheres have constant positive curvature, whereas the symmetric positive definite matrices are nonpositively curved. Their geometry is totally different: e.g., in the manifolds of symmetric positive definite matrices the triangles are slim and there is no cut locus, which means that geodesics are always shortest paths. In spheres, however, every geodesic meets a cut point, and triangles are always fat, meaning that the sum of the interior angles is always bigger than $\pi$. In our setup, however, it turns out that the sign of the sectional curvature is not important; what matters is the structure provided by symmetric spaces.
The outline of the paper is as follows: We start by introducing our variational restoration model in Section 2. In Section 3 we show how the (sub)gradients of the second order difference operators can be computed. Interestingly, this can be done by solving appropriate Jacobi equations. We describe the computation for general symmetric spaces. Then we focus on spheres and the space of symmetric positive definite matrices. The (sub)gradients are needed within our inexact cyclic PPA which is proposed in Section 4. A convergence analysis of the exact and inexact cyclic PPA is given for Hadamard manifolds. In Section 5 we validate our model and illustrate the good performance of our algorithms by numerical examples. The appendix provides some useful formulas for the computations. Further, Appendix C gives a brief introduction to the concept of parallel transport on manifolds in order to make our results more accessible for non-experts in differential geometry.
2 Variational Model
Let $\mathcal{M}$ be a complete $n$-dimensional Riemannian manifold with Riemannian metric $\langle \cdot, \cdot \rangle_x$, $x \in \mathcal{M}$, induced norm $\| \cdot \|_x$, and geodesic distance $d$. Let $\nabla_{\mathcal{M}} F(x)$ denote the Riemannian gradient of $F \colon \mathcal{M} \to \mathbb{R}$, which is characterized for all $\xi \in T_x \mathcal{M}$ by
(1) $\langle \nabla_{\mathcal{M}} F(x), \xi \rangle_x = D F(x)[\xi],$
where $D F(x)$ denotes the differential of $F$ at $x$, see Appendix C.
Let $\gamma_{x,\xi}$ be the unique geodesic with $\gamma_{x,\xi}(0) = x$ and $\dot{\gamma}_{x,\xi}(0) = \xi$, i.e., the geodesic starting from $x$ in direction $\xi \in T_x \mathcal{M}$. Further, let $\gamma_{\widetilde{x,y}}$ denote a unit speed geodesic connecting $x, y \in \mathcal{M}$. Then it fulfills $\gamma_{\widetilde{x,y}}(0) = x$ and $\gamma_{\widetilde{x,y}}(L) = y$, where $L$ denotes the length of the geodesic. We further denote by a minimizing geodesic a geodesic having minimal length $L = d(x, y)$. If it is clear from the context, we write geodesic instead of minimizing geodesic, but keep the notation $\gamma$ when referring to all geodesics including the non-minimizing ones. We use the exponential map $\exp_x \colon T_x \mathcal{M} \to \mathcal{M}$ given by $\exp_x \xi := \gamma_{x,\xi}(1)$ and the inverse exponential map denoted by $\log_x := \exp_x^{-1}$.
The core of our restoration model are absolute second order differences of points lying in a manifold. In the following we define such differences in a sound way. The basic idea is based on rewriting the Euclidean norm of componentwise second order differences in $\mathbb{R}^n$ as $\| x - 2y + z \|_2 = 2 \, \| \tfrac{1}{2}(x + z) - y \|_2$, see Fig. 0 (a). We define the set of midpoints between $x, z \in \mathcal{M}$ as
$\mathcal{C}_{x,z} := \bigl\{ c \in \mathcal{M} : c \text{ is a midpoint of a minimizing geodesic joining } x \text{ and } z \bigr\},$
and the absolute second order difference operator by
(2) $d_2(x, y, z) := \min_{c \in \mathcal{C}_{x,z}} d(c, y).$
The definition is illustrated in Fig. 0 (b). For the manifold $\mathbb{S}^1$, definition (2) coincides, up to the factor 2, with that of the absolute second order differences in [9]. Similarly, we define the second order mixed differences based on $\| x - y - z + w \|_2 = 2 \, \| \tfrac{1}{2}(x + w) - \tfrac{1}{2}(y + z) \|_2$ as
(3) $d_{1,1}(x, y, z, w) := \min_{c_1 \in \mathcal{C}_{x,w},\, c_2 \in \mathcal{C}_{y,z}} d(c_1, c_2).$
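To make definition (2) concrete, the following Python sketch evaluates the absolute second order difference on the unit sphere $\mathbb{S}^2$, where for non-antipodal $x, z$ the midpoint set contains exactly one point, namely the normalized Euclidean mean. The helper names are our own; this is an illustration, not the paper's implementation:

```python
import numpy as np

def sphere_dist(x, y):
    """Geodesic (arc-length) distance on the unit sphere."""
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

def sphere_midpoint(x, z):
    """Midpoint of the minimizing geodesic (unique for non-antipodal x, z)."""
    m = x + z
    return m / np.linalg.norm(m)

def d2_sphere(x, y, z):
    """Absolute second order difference d2(x, y, z) = d(c(x, z), y)."""
    return sphere_dist(sphere_midpoint(x, z), y)

x = np.array([1.0, 0.0, 0.0])
z = np.array([0.0, 1.0, 0.0])
y = sphere_midpoint(x, z)
print(d2_sphere(x, y, z))  # ~0: y is exactly the geodesic midpoint
```

If $y$ is moved off the geodesic, $d_2$ measures its geodesic distance to the midpoint, mirroring the Euclidean identity above.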
Let $f$ be a given noisy image with values in $\mathcal{M}$. We want to denoise manifold-valued images by minimizing functionals of the form
(4) $J(u) := F(u; f) + \alpha \, \mathrm{TV}_1(u) + \beta \, \mathrm{TV}_2(u) + \gamma \, \mathrm{TV}_{1,1}(u), \qquad \alpha, \beta, \gamma \geq 0,$
where
(5) $F(u; f) := \frac{1}{2} \sum_{i,j} d(u_{i,j}, f_{i,j})^2,$
(6) $\mathrm{TV}_1(u) := \sum_{i,j} \bigl( d(u_{i,j}, u_{i+1,j}) + d(u_{i,j}, u_{i,j+1}) \bigr),$
(7) $\mathrm{TV}_2(u) := \sum_{i,j} \bigl( d_2(u_{i-1,j}, u_{i,j}, u_{i+1,j}) + d_2(u_{i,j-1}, u_{i,j}, u_{i,j+1}) \bigr),$
(8) $\mathrm{TV}_{1,1}(u) := \sum_{i,j} d_{1,1}(u_{i,j}, u_{i+1,j}, u_{i,j+1}, u_{i+1,j+1}),$
with the usual conventions at the image boundary.
For minimizing the functional we want to apply a cyclic PPA [4, 12]. This algorithm sequentially computes the proximal mappings of the summands involved in the functional. While the proximal mappings of the summands in the data term and in the first regularization term are known analytically, see [27] and [77], respectively, the proximal mappings of $d_2$ and $d_{1,1}$ are only known analytically in the special case $\mathcal{M} = \mathbb{S}^1$, see [9]. In the following section we deal with the computation of the proximal mapping of $d_2$. The mixed difference $d_{1,1}$ can be treated in a similar way.
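As an illustration of the analytically known proximal mapping of the data term (see [27]), the following Python sketch realizes the closed form on the unit sphere: the proximal point of $x \mapsto \frac{\lambda}{2} d(x, f)^2$ lies on the geodesic from $x$ to $f$ at relative parameter $\lambda / (1 + \lambda)$. The function names are our own choices for this sketch:

```python
import numpy as np

def geodesic_point(x, f, t):
    """Point on the minimizing great-circle geodesic from x to f at relative
    parameter t in [0, 1] (spherical linear interpolation)."""
    theta = np.arccos(np.clip(np.dot(x, f), -1.0, 1.0))
    if theta < 1e-12:
        return x.copy()
    return (np.sin((1 - t) * theta) * x + np.sin(t * theta) * f) / np.sin(theta)

def prox_data(x, f, lam):
    """Proximal mapping of x -> lam/2 * d(x, f)^2 on the sphere:
    move from x towards f along the geodesic by the fraction lam/(1+lam)."""
    return geodesic_point(x, f, lam / (1.0 + lam))

x = np.array([1.0, 0.0, 0.0])
f = np.array([0.0, 1.0, 0.0])
print(prox_data(x, f, 1.0))   # geodesic midpoint of x and f
```

For $\lambda \to 0$ the proximal point stays at $x$; for $\lambda \to \infty$ it approaches the data point $f$.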
3 Subgradients of Second Order Differences on $\mathcal{M}$
Since we work in a Riemannian manifold, it is necessary to impose an assumption that guarantees that the involved points do not take pathological (practically irrelevant) constellations, in order to make the following derivations meaningful. In particular, we assume in this section that there is exactly one shortest geodesic joining $x$ and $z$, i.e., $z$ is not a cut point of $x$; cf. [23]. We note that this is no severe restriction since such points form a set of measure zero. Moreover, we restrict our attention to the case where the minimum in (2) is attained at the corresponding geodesic midpoint, which we denote by $c(x, z)$.
We want to compute the proximal mapping of $d_2$ by a (sub)gradient descent algorithm. This requires the computation of the (sub)gradient of $d_2$, which is done in the following subsections. For $d_2(x, y, z) \neq 0$ the subgradient of $d_2$ coincides with its gradient
(9) $\nabla d_2(x, y, z) = \bigl( \nabla_x d_2, \nabla_y d_2, \nabla_z d_2 \bigr).$
If $d_2(x, y, z) = 0$, then $d_2$ is not differentiable. However, we will characterize the subgradients in Remark 3.4. In particular, the zero vector is a subgradient, which is used in our subgradient descent algorithm.
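The idea of computing a proximal mapping by Riemannian (sub)gradient descent can be sketched as follows. For illustration we use the unit sphere and, instead of the nonsmooth $d_2$, the smooth function $g(x) = \frac{1}{2} d(x, f)^2$, whose gradient $-\log_x f$ is available in closed form; the helper functions, step size rule, and iteration count are our own choices for this sketch, not the paper's scheme:

```python
import numpy as np

def sexp(x, xi):
    """Exponential map on the unit sphere."""
    t = np.linalg.norm(xi)
    return x if t < 1e-12 else np.cos(t) * x + np.sin(t) * xi / t

def slog(x, y):
    """Inverse exponential map on the unit sphere."""
    c = np.clip(np.dot(x, y), -1.0, 1.0)
    theta = np.arccos(c)
    v = y - c * x
    n = np.linalg.norm(v)
    return theta / n * v if n > 1e-12 else np.zeros_like(x)

def prox_descent(grad_g, x0, lam, steps=50):
    """Approximate prox of lam*g at x0 by Riemannian gradient descent on
    h(x) = 0.5*d(x, x0)^2 + lam*g(x); the gradient of the first term is -log_x(x0)."""
    x = x0
    for _ in range(steps):
        v = -slog(x, x0) + lam * grad_g(x)
        x = sexp(x, -v / (1.0 + lam))   # step size matched to the strong convexity of h
    return x

# Illustration: g(x) = 0.5*d(x, f)^2 with Riemannian gradient -log_x(f).
f = np.array([0.0, 1.0, 0.0])
x0 = np.array([1.0, 0.0, 0.0])
print(prox_descent(lambda p: -slog(p, f), x0, 1.0))  # approximately the geodesic midpoint
```

For the nonsmooth $d_2$, the gradient callback is replaced by the (sub)gradient formulas derived below, with diminishing step sizes at non-differentiable points.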
3.1 Gradients of the Components of Second Order Differences
We start with the computation of the second component of the gradient (9). In general we have for $x, y \in \mathcal{M}$ with $x \neq y$, see [68], that
(10) $\nabla_x d(x, y) = - \frac{\log_x y}{d(x, y)}.$
Lemma 3.1.
The second component of $\nabla d_2$ in (9) is given for $d_2(x, y, z) \neq 0$ by
(11) $\nabla_y d_2(x, y, z) = - \frac{\log_y c(x, z)}{d\bigl( c(x, z), y \bigr)},$ where $c(x, z)$ denotes the geodesic midpoint of $x$ and $z$.
Proof.
Applying (10) to the function $y \mapsto d\bigl( c(x, z), y \bigr)$ we obtain the assertion. ∎
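Identity (10) can be checked numerically. The sketch below, on the unit sphere $\mathbb{S}^2$, compares the claimed Riemannian gradient of $x \mapsto d(x, y)$ with a finite-difference directional derivative along a tangent vector; the helpers `sdist`, `slog`, `sexp` are illustrative implementations of the standard sphere maps, not code from the paper's appendix:

```python
import numpy as np

def sdist(x, y):
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

def slog(x, y):
    """Inverse exponential map log_x(y) on the unit sphere."""
    theta = sdist(x, y)
    v = y - np.cos(theta) * x
    n = np.linalg.norm(v)
    return theta / n * v if n > 1e-12 else np.zeros_like(x)

def sexp(x, xi):
    """Exponential map exp_x(xi) on the unit sphere."""
    t = np.linalg.norm(xi)
    return x if t < 1e-12 else np.cos(t) * x + np.sin(t) * xi / t

x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 0.0, 1.0])
grad = -slog(x, y) / sdist(x, y)    # claimed Riemannian gradient of d(., y) at x
xi = np.array([0.0, 0.0, 1.0])      # tangent direction at x
h = 1e-6
fd = (sdist(sexp(x, h * xi), y) - sdist(x, y)) / h  # finite-difference derivative
print(abs(np.dot(grad, xi) - fd))   # agreement up to O(h)
```

The inner product of the gradient with the tangent direction matches the finite-difference slope, as required by the characterization (1).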
By the symmetry of $d_2$ in its first and third argument, both gradients $\nabla_x d_2$ and $\nabla_z d_2$ can be treated in the same way, so that we can restrict our attention to the first one. For fixed $y, z$ we will use the notation $c(x)$ instead of $c(x, z)$. Let $\gamma$ denote the unit speed geodesic joining $x$ and $z$, i.e.,
$\gamma(0) = x, \qquad \gamma(L) = z, \qquad L := d(x, z).$
We will further need the notion of parallel transport. For readers who are not familiar with this concept we give a brief introduction in Appendix C. We denote a parallel transported orthonormal frame along $\gamma$ by
(12) $\{ \xi_1(t), \ldots, \xi_n(t) \}, \qquad t \in [0, L].$
For $t = 0$ we use the special notation $\xi_\ell := \xi_\ell(0)$, $\ell = 1, \ldots, n$.
Lemma 3.2.
The first component of $\nabla d_2$ in (9) is given for $d_2(x, y, z) \neq 0$ by
(13) 
Proof.
For the computation of we can exploit Jacobi fields which are defined as follows: for , let , , denote the unit speed geodesic with and . Let denote the tangential vector in of the geodesic joining and . Then
with small are variations of the geodesic and
are the corresponding Jacobi field along . For an illustration see Fig. 2. Since and for all we have
(17) 
Since we conclude by definition (68) of the differential
Any Jacobi field $J$ of a variation through $\gamma$ fulfills a linear system of ordinary differential equations (ODE) [42, Theorem 10.2]
(18) $\frac{D^2}{\mathrm{d} t^2} J + R(J, \dot{\gamma}) \dot{\gamma} = 0,$
where $R$ denotes the Riemannian curvature tensor defined by $R(X, Y) Z := \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z$. Here $[X, Y]$ is the Lie bracket, see Appendix C. Our special Jacobi fields have to meet the boundary conditions (17). We summarize:
Lemma 3.3.
Finally, we give a representation of the subgradients of $d_2$ in the case $d_2(x, y, z) = 0$.
Remark 3.4.
Let with and
(20) 
where is the Jacobi field along the unit speed geodesic determined by and . Then the subdifferential at reads
(21) 
where denotes the set of normalized vectors fulfilling
for all , and the interval endpoints , are given by
(22)  
(23) 
This can be seen as follows: For arbitrary tangent vectors , sitting in , , respectively, we consider the uniquely determined geodesic variation given by the side conditions as well as We note that which implies for that
(24) 
In view of the definition of a subgradient, see [26, 33], it is required, for a candidate , that
(25) 
for any sufficiently small . Setting equation (24) tells us that the left hand side above equals up to , and thus, for a candidate ,
for all of small magnitude. Since these are actually linear equations for in the tangent space, we get
Then, if , the numerators in (22) are nonzero and the limits exist. Finally, we conclude (21) from (25).
In the following subsection we recall how Jacobi fields can be computed for general symmetric spaces and have a look at two special examples, namely spheres and manifolds of symmetric positive definite matrices.
3.2 Jacobi Equation for Symmetric Spaces
Due to their rich structure symmetric spaces have been the object of differential geometric studies for a long time, and we refer to [24, 25] or the books [8, 18] for more information.
A Riemannian manifold $\mathcal{M}$ is called locally symmetric if the geodesic reflection at each point $p$, given by mapping $\gamma(t) \mapsto \gamma(-t)$ for all geodesics $\gamma$ through $p = \gamma(0)$, is a local isometry, i.e., an isometry at least locally near $p$. If this property holds globally, $\mathcal{M}$ is called a (Riemannian globally) symmetric space. More formally, $\mathcal{M}$ is a symmetric space if for any $p \in \mathcal{M}$ and all $\xi \in T_p \mathcal{M}$ there is an isometry $\sigma$ on $\mathcal{M}$ such that $\sigma(p) = p$ and $D\sigma(p)[\xi] = -\xi$. A Riemannian manifold is locally symmetric if and only if there exists a symmetric space which is locally isometric to it. As a consequence of the Cartan–Ambrose–Hicks theorem [18, Theorem 1.36], every simply connected, complete, locally symmetric space is symmetric. Symmetric spaces are precisely the homogeneous spaces with a symmetry at some point. Beyond spheres and the manifold of symmetric positive definite matrices, hyperbolic spaces, Grassmannians as well as compact Lie groups are examples of symmetric spaces. The crucial property we need is that a Riemannian manifold is locally symmetric if and only if the covariant derivative of the Riemannian curvature tensor along curves is zero, i.e.,
(26) $\nabla R = 0.$
Proposition 3.5.
Let $\mathcal{M}$ be a symmetric space. Let $\gamma$ be a unit speed geodesic and $\{\xi_\ell(t)\}_{\ell=1}^{n}$ a parallel transported orthonormal frame along $\gamma$. Let $J$ be a Jacobi field of a variation through $\gamma$. Set $J(t) = \sum_{\ell=1}^{n} f_\ell(t) \, \xi_\ell(t)$. Then the following relations hold true:

Let $\{\xi_\ell(0)\}_{\ell=1}^{n}$ be chosen as the initial orthonormal basis which diagonalizes the operator
(28) $\xi \mapsto R\bigl( \xi, \dot{\gamma}(0) \bigr) \dot{\gamma}(0)$
at $\gamma(0)$ with corresponding eigenvalues $\kappa_\ell$, $\ell = 1, \ldots, n$, and let $\{\xi_\ell(t)\}_{\ell=1}^{n}$ be the corresponding parallel transported frame along $\gamma$. Then the matrix in (27) becomes diagonal and (27) decomposes into the ordinary linear differential equations
(29) $\ddot{f}_\ell(t) + \kappa_\ell f_\ell(t) = 0, \qquad \ell = 1, \ldots, n.$
The Jacobi fields
(30) $J_\ell(t) := f_\ell(t) \, \xi_\ell(t), \qquad \ell = 1, \ldots, n,$
with $f_\ell$ solving (29) form a basis of the $n$-dimensional linear space of Jacobi fields of a variation through $\gamma$ fulfilling the given initial condition.
Part i) of Proposition 3.5 is also stated as Property A in Rauch’s paper [56] in the particularly nice form “The curvature of a 2-section propagated parallel along a geodesic is constant.” For convenience we add the proof.
Proof.
i) Using the frame representation of , (69) and the linearity of in the first argument, the Jacobi equation (18) becomes
(31) 
and by taking inner products with further
(32) 
Now (26) implies for
which is, by the linear independence of the $\xi_\ell$, $\ell = 1, \ldots, n$, only possible if all of these coefficients are constants, and the claim follows.
Parts ii) and iii) follow directly from i). For iii) we also refer to [18, p. 77]. ∎
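For constant eigenvalues $\kappa_\ell$ of the curvature operator, the decoupled equations (29) are solved by elementary functions: trigonometric for $\kappa_\ell > 0$, linear for $\kappa_\ell = 0$, and hyperbolic for $\kappa_\ell < 0$. The following small Python helper (our own illustration, not the paper's code) returns the solution with $f_\ell(0) = 0$ and $\dot{f}_\ell(0) = 1$:

```python
import numpy as np

def jacobi_basis(kappa, t):
    """Solution f of f'' + kappa*f = 0 with f(0) = 0, f'(0) = 1:
    sin/sinh scaling for positive/negative curvature eigenvalue, linear if zero."""
    if kappa > 0:
        s = np.sqrt(kappa)
        return np.sin(s * t) / s
    if kappa < 0:
        s = np.sqrt(-kappa)
        return np.sinh(s * t) / s
    return t

print(jacobi_basis(1.0, np.pi / 2))   # sphere-like eigenvalue: sin(pi/2) = 1
print(jacobi_basis(0.0, 0.5))         # flat case: linear growth
```

These scaling functions are exactly what makes the Jacobi field computations in symmetric spaces explicit, in both the positively curved (sphere) and nonpositively curved (symmetric positive definite matrices) examples below.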
With respect to our special Jacobi fields in Lemma 3.3 we obtain the following corollary.
Corollary 3.6.
The Jacobi fields , of a variation through with boundary conditions and fulfill
(33) 
Proof.
Let us now apply our findings to the sphere and the manifold of symmetric positive definite matrices.
The Sphere $\mathbb{S}^n$.
We consider the sphere $\mathbb{S}^n := \{ x \in \mathbb{R}^{n+1} : \| x \|_2 = 1 \}$. The Riemannian metric is just the Euclidean inner product of $\mathbb{R}^{n+1}$ restricted to the tangent spaces. For the definitions of the geodesic distance, the exponential map and parallel transport see Appendix A. Let $\gamma$ be a unit speed geodesic. We choose $\xi_1(0) := \dot{\gamma}(0)$ and complete this to an orthonormal basis $\{ \xi_1(0), \ldots, \xi_n(0) \}$ of the tangent space with corresponding parallel frame along $\gamma$. Then diagonalizing the operator (28) is especially simple. Since $\mathbb{S}^n$ has constant curvature $1$, the Riemannian curvature tensor fulfills [42, Lemma 8.10]
$R(X, Y) Z = \langle Y, Z \rangle X - \langle X, Z \rangle Y.$
Consequently,
$R(\xi, \dot{\gamma}) \dot{\gamma} = \langle \dot{\gamma}, \dot{\gamma} \rangle \xi - \langle \xi, \dot{\gamma} \rangle \dot{\gamma} = \xi - \langle \xi, \dot{\gamma} \rangle \dot{\gamma},$
so that at $\gamma(0)$ the vector $\xi_1(0) = \dot{\gamma}(0)$ is an eigenvector with eigenvalue $\kappa_1 = 0$ and $\xi_\ell(0)$, $\ell = 2, \ldots, n$, are eigenvectors with eigenvalues $\kappa_\ell = 1$. Consequently, we obtain by Lemma 3.3 and Corollary 3.6 the following corollary.
Corollary 3.7.
For the sphere $\mathbb{S}^n$ and the above choice of the orthonormal frame system, the following relations hold true:
Symmetric Positive Definite Matrices.
Let $\mathrm{Sym}(n)$ denote the space of symmetric $n \times n$ matrices with (Frobenius) inner product and norm
(34) $\langle A, B \rangle := \operatorname{tr}(A^{\mathrm{T}} B), \qquad \| A \| := \langle A, A \rangle^{1/2}.$
Let $\mathcal{P}(n)$ be the manifold of symmetric positive definite $n \times n$ matrices. It has dimension $n(n+1)/2$. The tangent space of $\mathcal{P}(n)$ at $X$ can be identified with $\mathrm{Sym}(n)$; in particular $T_I \mathcal{P}(n) \cong \mathrm{Sym}(n)$, where $I$ denotes the identity matrix. The Riemannian metric on $T_X \mathcal{P}(n)$ reads
(35) $\langle A, B \rangle_X := \bigl\langle X^{-\frac{1}{2}} A X^{-\frac{1}{2}}, \, X^{-\frac{1}{2}} B X^{-\frac{1}{2}} \bigr\rangle,$
where $\langle \cdot, \cdot \rangle$ denotes the matrix inner product (34). For the definitions of the geodesic distance, exponential map and parallel transport see Appendix B.
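For completeness, the affine-invariant geodesic distance on $\mathcal{P}(n)$ can be computed via an eigendecomposition, as in the following Python sketch (an illustration of the standard formula $d(X, Y) = \| \log( X^{-1/2} Y X^{-1/2} ) \|$, with helper names of our own choosing):

```python
import numpy as np

def spd_fun(X, fun):
    """Apply a scalar function to a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(X)
    return (V * fun(w)) @ V.T   # V diag(fun(w)) V^T

def spd_dist(X, Y):
    """Affine-invariant geodesic distance ||log(X^{-1/2} Y X^{-1/2})||_F on P(n)."""
    Xis = spd_fun(X, lambda w: 1.0 / np.sqrt(w))   # X^{-1/2}
    M = Xis @ Y @ Xis
    return np.linalg.norm(spd_fun(M, np.log), 'fro')

X = np.eye(2)
Y = np.diag([np.e, 1.0])
print(spd_dist(X, Y))  # log-eigenvalues (1, 0) give distance 1
```

Note that the distance is symmetric in its arguments and invariant under congruence transformations $X \mapsto A X A^{\mathrm{T}}$ with invertible $A$.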
Let $X \in \mathcal{P}(n)$ and let the matrix $X$ have the eigenvalues $\lambda_i$, $i = 1, \ldots, n$, with a corresponding orthonormal basis of eigenvectors $\{ v_i \}_{i=1}^{n}$ in $\mathbb{R}^n$, i.e.,
(36) $X v_i = \lambda_i v_i, \qquad i = 1, \ldots, n.$
We will use a more appropriate index system for the frame (12), namely double indices $(i, j)$ with $1 \leq i \leq j \leq n$.
Then the matrices
(37) 
form an orthonormal basis of $T_X \mathcal{P}(n)$.
In other words, we will deal with the parallel transported frame of (37), indexed by pairs $(i, j)$, instead of the single-index frame $\xi_\ell$, $\ell = 1, \ldots, n(n+1)/2$. To diagonalize the operator (28) at $X$ we use that the Riemannian curvature tensor for $\mathcal{P}(n)$ has the form
(38) $R(A, B) C = - \tfrac{1}{4} \bigl[ [A, B], C \bigr],$
with $[A, B] := A B - B A$ the Lie bracket of matrices. Then
and for with the right-hand side becomes
(39)  
(40) 
where and . Expanding