Second Order Differences of Cyclic Data and Applications in Variational Denoising
Abstract
In many image and signal processing applications, as interferometric synthetic aperture radar (SAR) or color image restoration in HSV or LCh spaces the data has its range on the onedimensional sphere . Although the minimization of total variation (TV) regularized functionals is among the most popular methods for edgepreserving image restoration such methods were only very recently applied to cyclic structures. However, as for Euclidean data, TV regularized variational methods suffer from the so called staircasing effect. This effect can be avoided by involving higher order derivatives into the functional.
This is the first paper which uses higher order differences of cyclic data in regularization terms of energy functionals for image restoration. We introduce absolute higher order differences for valued data in a sound way which is independent of the chosen representation system on the circle. Our absolute cyclic first order difference is just the geodesic distance between points. Similar to the geodesic distances the absolute cyclic second order differences have only values in . We update the cyclic variational TV approach by our new cyclic second order differences. To minimize the corresponding functional we apply a cyclic proximal point method which was recently successfully proposed for Hadamard manifolds. Choosing appropriate cycles this algorithm can be implemented in an efficient way. The main steps require the evaluation of proximal mappings of our cyclic differences for which we provide analytical expressions. Under certain conditions we prove the convergence of our algorithm. Various numerical examples with artificial as well as realworld data demonstrate the advantageous performance of our algorithm.
periodspace. \captionsetupformat=hang,labelsep=periodspace,indention=2em,labelfont=bf,width=.95skip=.5 \captionsetup[subfigure]aboveskip=.5ex,belowskip=1.5ex,indention=0cm,labelformat=simple,labelsep=space, labelfont=normal, hypcap=true,width=.975skip=.5 \setkomafontsectioning \setkomafonttitle \setlistleftmargin=0pt,labelwidth=*, ,align=left,itemsep=0pt
1 Introduction
A frequently used method for edgepreserving image denoising is the variational approach which minimizes the RudinOsherFatemi (ROF) functional [40]. In a discrete (penalized) form the ROF functional can be written as
where is the given corrupted image and denotes the discrete gradient operator which contains usually first order forward differences in vertical and horizontal directions. The regularizing term can be considered as discrete version of the total variation (TV) functional. Since the gradient does not penalize constant areas the minimizer of the ROF functional tends to have such regions, an effect known as staircasing. An approach to avoid this effect consists in the employment of higher order differences/derivatives. Since the pioneering work [10] which couples the TV term with higher order terms by infimal convolution various techniques with higher order differences/derivatives were proposed in the literature, among them [8, 11, 12, 15, 16, 27, 29, 31, 32, 41, 42, 43].
In various applications in image processing and computer vision the functions of interest take values on the circle or another manifold. Processing manifoldvalued data has gained a lot of interest in recent years. Examples are wavelettype multiscale transforms for manifold data [25, 37, 49] and manifoldvalued partial differential equations [13, 24]. Finally we like to mention statistical issues on Riemannian manifolds [19, 20, 36] and in particular the statistics of circular data [18, 28]. The TV notation for functions with values on a manifold has been studied in [22, 23] using the theory of Cartesian currents. These papers were an extension of the previous work [21] were the authors focus on valued functions and show in particular the existence of minimizers of certain energies in the space of functions with bounded total cyclic variation. The first work which applies a cyclic TV approach among other models for imaging tasks was recently published by Cremers and Strekalovskiy in [44, 45]. The authors unwrapped the function values to the real axis and proposed an algorithmic solution to account for the periodicity. An algorithm which solves TV regularized minimization problems on Riemannian manifolds was proposed by Lellmann et al. in [30]. They reformulate the problem as a multilabel optimization problem with an infinite number of labels and approximate the resulting hard optimization problem using convex relaxation techniques. The algorithm was applied for chromaticitybrightness denoising, denoising of rotation data and processing of normal fields for visualization. Another approach to TV minimization for manifoldvalued data via cyclic and parallel proximal point algorithms was proposed by one of the authors and his colleagues in [50]. It does not require any labeling or relaxation techniques. The authors apply their algorithm in particular for diffusion tensor imaging and interferometric SAR imaging. For CartanHadamard manifolds convergence of the algorithm was shown based on a recent result of Bačák [1]. Unfortunately, one of the simplest manifolds that is not of CartanHadamard type is the circle .
In this paper we deal with the incorporation of higher order differences into the energy functionals to improve denoising results for valued data. Note that the (secondorder) total generalized variation was generalized for tensor fields in [46]. However, to the best of our knowledge this is the first paper which defines second order differences of cyclic data and uses them in regularization terms of energy functionals for image restoration. We focus on a discrete setting. First we provide a meaningful definition of higher order differences for cyclic data which we call absolute cyclic differences. In particular our absolute cyclic first order differences resemble the geodesic distance (arc length distance) on the circle. As the geodesics the absolute cyclic second order differences take only values in . This is not necessary the case for differences of order larger than two. Following the idea in [50] we suggest a cyclic proximal point algorithm to minimize the resulting functionals. This algorithm requires the evaluation of certain proximal mappings. We provide analytical expression for these mappings. Further, we suggest an appropriate choice of the cycles such that the whole algorithm becomes very efficient. We apply our algorithm to artificial data as well as to realworld interferometric SAR data.
The paper is organized as follows: in Section 2 we propose a definition of differences on . Then, in Section 3, we provide analytical expressions for the proximal mappings required in our cyclic proximal point algorithm. The approach is based on unwrapping the circle to and considering the corresponding proximal mappings on the Euclidean space. The cyclic proximal point algorithm is presented in Section 4. In particular we describe a vectorization strategy which makes the Matlab implementation efficient and provides parallelizability, and prove its convergence under certain assumptions. Section 5 demonstrates the advantageous performance of our algorithm by numerical examples. Finally, conclusions and directions of future work are given in Section 6.
2 Differences of –valued data
Let be the unit circle in the plane
endowed with the geodesic distance (arc length distance)
Given a base point , the exponential map from the tangent space of at onto is defined by
This map is periodic, i.e., for any , where denotes the unique point in such that , . Some useful properties of the mapping (which can also be considered as mapping from onto ) are collected in the following remark.
Remark 2.1.
The following relations hold true:

for all .

If then for all .
While i) follows by straightforward computation relation ii) can be seen as follows: For there exists such that
Hence it follows and since further
To guarantee the injectivity of the exponential map, we restrict its domain of definition from to . Thus, for , there is now a unique satisfying . In particular we have . Given such representation system of , centered at an arbitrary point on the geodesic distance becomes
(1) 
Actually we need only in the minimum. Clearly, this definition does not depend on the chosen center point .
We want to determine general finite differences of valued data. Let with
(2) 
where denotes the vector with components one. We define the finite difference operator by
By (2), we see that vanishes for constant vectors and is therefore translation invariant, i.e.,
(3) 
Example 2.2.
For the binomial coefficients with alternating signs
we obtain the (forward) differences of order :
Note that does not only fulfill (2), but vanishes exactly for all ‘discrete polynomials of order ’, i.e., for all vectors from . Here we are interested in first and second order differences
Moreover, we will apply the ‘mixed second order’ difference with and use the notation
We want to define differences for points using their representation with respect to an arbitrary fixed center point. As the geodesic distance (1) these differences should be independent of the choice of the center point. This can be achieved if and only if the differences are shift invariant modulo . Let . We define the absolute cyclic difference of (resp. ) with respect to by
(4) 
where denotes the componentbycomponent application of if , and , . The definition allows that points having the same value are treated separately, cf. Figure 5. This ensures that is a continuous map. For example we have . Figures 4 and 5 illustrate definition (4). For the absolute cyclic differences related to the differences in Example 2.2 we will use the simpler notation
(5) 
The following equivalent definition of absolute cyclic differences appears to be useful.
Lemma 2.3.
Let be sorted in ascending order as and set . Let denote the corresponding permutation matrix, i.e., and Consider the shifted versions of given by
where denotes the th unit vector. Then it holds
(6) 
Proof.
For the geodesic distance we obtain by (1) that . In general the relation
(8) 
does not hold true as the following example shows.
Example 2.4.
In general the th order absolute cyclic difference cannot be written as Consider for example the absolute cyclic third order difference for given by (6) as
We obtain
so that .
For relation (8) holds true by the next lemma.
Proposition 2.5.
For the following relation holds true:
(9) 
Proof.
Since for , we see that and .
First we consider . By Lemma 2.3 we obtain
(11) 
where we can assume by the cyclic shift invariance of that .
If , then the corresponding permutation matrix in Lemma 2.3 is the identity matrix. Further we obtain that and by (11) we get
If , then and . In this case we get
This proves the first assertion.
For we can again assume that . Exploiting that
we have to consider the following three cases:
If , then is the identity matrix, and
If , then and . By (7) we have
If , then and . Here we obtain
This finishes the proof. ∎
3 Proximal mapping of absolute cyclic differences
For a proper, closed, convex function and the proximal mapping is defined by
(12) 
see [34]. The above minimizer exits and is uniquely determined. Many algorithms which were recently used in variational image processing reduce to the iterative computation of values of proximal mappings. An overview of applications of proximal mappings is given in [35].
In this section, we are interested in proximal mappings of absolute cyclic differences , i.e., for . More precisely, we will determine for valued vectors represented by the values
for and first and second order absolute cyclic differences , . Here means that we are looking for the representative of in . In particular, we will see that these proximal mapping are singlevalued for with and have two values for .
We start by considering the proximal mappings of the appropriate differences in . Then we use the results to find the proximal functions of the absolute cyclic differences.
3.1 Proximity of differences on
First we give analytical expressions for , where and , . Since we could not find a corresponding reference in the literature, the computation of the minimizer of
(13) 
is described in the following lemmas. We start with .
Lemma 3.1.
For given and , set
Then the minimizer of
(14) 
is given by
(15) 
and the minimum by
(16) 
Proof.
Since , there exists a component and we rewrite
Substituting , and , we see that , where is the minimizer of
The (Fenchel) dual problem of reads
(17) 
and the relation between the minimizers of the primal and dual problems is given by
(18) 
Rewriting (17) we see that is the minimizer of
where . Hence we obtain
and by (18) further
Substituting back results in (15) and plugging into we get (16). ∎
Example 3.2.
Let , , and .

For and we get and so that the minimizer of follows by soft shrinkage of with threshold :
(19) 
For and we obtain and . Consequently, the minimizer of is given by
(20) 
For and we obtain and , so that the minimizer of is given by
(21)
We will apply the following corollary.
Corollary 3.3.
Let . Further, let and be given such that . Then
(22) 
Proof.
Next we consider the case .
Lemma 3.4.
Let .

Then, for and , the minimizer of
(23) is given by
and the minimum by
(24) 
If for some and , then
(25)
3.2 Proximity of absolute cyclic differences of first and second order
Now we turn to valued data represented by . We are interested in the minimizers of
(26) 
on for and . We start with the case .
Theorem 3.5.
For set . Let and , where is adapted to the respective length of .

If , then the unique minimizer of is given by
(27) 
If , then has the two minimizers
(28)
Note that for case ii) appears exactly if and are antipodal points.