Second Order Differences of Cyclic Dataand Applications in Variational Denoising

Second Order Differences of Cyclic Data and Applications in Variational Denoising

Abstract

In many image and signal processing applications, as interferometric synthetic aperture radar (SAR) or color image restoration in HSV or LCh spaces the data has its range on the one-dimensional sphere . Although the minimization of total variation (TV) regularized functionals is among the most popular methods for edge-preserving image restoration such methods were only very recently applied to cyclic structures. However, as for Euclidean data, TV regularized variational methods suffer from the so called staircasing effect. This effect can be avoided by involving higher order derivatives into the functional.

This is the first paper which uses higher order differences of cyclic data in regularization terms of energy functionals for image restoration. We introduce absolute higher order differences for -valued data in a sound way which is independent of the chosen representation system on the circle. Our absolute cyclic first order difference is just the geodesic distance between points. Similar to the geodesic distances the absolute cyclic second order differences have only values in . We update the cyclic variational TV approach by our new cyclic second order differences. To minimize the corresponding functional we apply a cyclic proximal point method which was recently successfully proposed for Hadamard manifolds. Choosing appropriate cycles this algorithm can be implemented in an efficient way. The main steps require the evaluation of proximal mappings of our cyclic differences for which we provide analytical expressions. Under certain conditions we prove the convergence of our algorithm. Various numerical examples with artificial as well as real-world data demonstrate the advantageous performance of our algorithm.

\DeclareCaptionLabelSeparator

periodspace.  \captionsetupformat=hang,labelsep=periodspace,indention=-2em,labelfont=bf,width=.95skip=.5 \captionsetup[subfigure]aboveskip=.5ex,belowskip=1.5ex,indention=0cm,labelformat=simple,labelsep=space, labelfont=normal, hypcap=true,width=.975skip=.5 \setkomafontsectioning \setkomafonttitle \setlistleftmargin=0pt,labelwidth=*, ,align=left,itemsep=0pt

1 Introduction

A frequently used method for edge-preserving image denoising is the variational approach which minimizes the Rudin-Osher-Fatemi (ROF) functional [40]. In a discrete (penalized) form the ROF functional can be written as

where is the given corrupted image and denotes the discrete gradient operator which contains usually first order forward differences in vertical and horizontal directions. The regularizing term can be considered as discrete version of the total variation (TV) functional. Since the gradient does not penalize constant areas the minimizer of the ROF functional tends to have such regions, an effect known as staircasing. An approach to avoid this effect consists in the employment of higher order differences/derivatives. Since the pioneering work [10] which couples the TV term with higher order terms by infimal convolution various techniques with higher order differences/derivatives were proposed in the literature, among them [8, 11, 12, 15, 16, 27, 29, 31, 32, 41, 42, 43].

In various applications in image processing and computer vision the functions of interest take values on the circle or another manifold. Processing manifold-valued data has gained a lot of interest in recent years. Examples are wavelet-type multiscale transforms for manifold data  [25, 37, 49] and manifold-valued partial differential equations  [13, 24]. Finally we like to mention statistical issues on Riemannian manifolds [19, 20, 36] and in particular the statistics of circular data [18, 28]. The TV notation for functions with values on a manifold has been studied in [22, 23] using the theory of Cartesian currents. These papers were an extension of the previous work [21] were the authors focus on -valued functions and show in particular the existence of minimizers of certain energies in the space of functions with bounded total cyclic variation. The first work which applies a cyclic TV approach among other models for imaging tasks was recently published by Cremers and Strekalovskiy in [44, 45]. The authors unwrapped the function values to the real axis and proposed an algorithmic solution to account for the periodicity. An algorithm which solves TV regularized minimization problems on Riemannian manifolds was proposed by Lellmann et al. in [30]. They reformulate the problem as a multilabel optimization problem with an infinite number of labels and approximate the resulting hard optimization problem using convex relaxation techniques. The algorithm was applied for chromaticity-brightness denoising, denoising of rotation data and processing of normal fields for visualization. Another approach to TV minimization for manifold-valued data via cyclic and parallel proximal point algorithms was proposed by one of the authors and his colleagues in [50]. It does not require any labeling or relaxation techniques. The authors apply their algorithm in particular for diffusion tensor imaging and interferometric SAR imaging. For Cartan-Hadamard manifolds convergence of the algorithm was shown based on a recent result of Bačák [1]. Unfortunately, one of the simplest manifolds that is not of Cartan-Hadamard type is the circle .

In this paper we deal with the incorporation of higher order differences into the energy functionals to improve denoising results for -valued data. Note that the (second-order) total generalized variation was generalized for tensor fields in [46]. However, to the best of our knowledge this is the first paper which defines second order differences of cyclic data and uses them in regularization terms of energy functionals for image restoration. We focus on a discrete setting. First we provide a meaningful definition of higher order differences for cyclic data which we call absolute cyclic differences. In particular our absolute cyclic first order differences resemble the geodesic distance (arc length distance) on the circle. As the geodesics the absolute cyclic second order differences take only values in . This is not necessary the case for differences of order larger than two. Following the idea in [50] we suggest a cyclic proximal point algorithm to minimize the resulting functionals. This algorithm requires the evaluation of certain proximal mappings. We provide analytical expression for these mappings. Further, we suggest an appropriate choice of the cycles such that the whole algorithm becomes very efficient. We apply our algorithm to artificial data as well as to real-world interferometric SAR data.

The paper is organized as follows: in Section 2 we propose a definition of differences on . Then, in Section 3, we provide analytical expressions for the proximal mappings required in our cyclic proximal point algorithm. The approach is based on unwrapping the circle to  and considering the corresponding proximal mappings on the Euclidean space. The cyclic proximal point algorithm is presented in Section 4. In particular we describe a vectorization strategy which makes the Matlab implementation efficient and provides parallelizability, and prove its convergence under certain assumptions. Section 5 demonstrates the advantageous performance of our algorithm by numerical examples. Finally, conclusions and directions of future work are given in Section 6.

2 Differences of –valued data

Let be the unit circle in the plane

endowed with the geodesic distance (arc length distance)

Given a base point , the exponential map  from the tangent space  of at onto is defined by

This map is -periodic, i.e., for any , where denotes the unique point in such that , . Some useful properties of the mapping  (which can also be considered as mapping from onto ) are collected in the following remark.

Remark 2.1.

The following relations hold true:

  1. for all .

  2. If then for all .

While i) follows by straightforward computation relation ii) can be seen as follows: For there exists such that

Hence it follows and since further

To guarantee the injectivity of the exponential map, we restrict its domain of definition from to . Thus, for , there is now a unique satisfying . In particular we have . Given such representation system  of , centered at an arbitrary point  on  the geodesic distance becomes

(1)

Actually we need only in the minimum. Clearly, this definition does not depend on the chosen center point .

We want to determine general finite differences of -valued data. Let with

(2)

where denotes the vector with components one. We define the finite difference operator by

By (2), we see that vanishes for constant vectors and is therefore translation invariant, i.e.,

(3)
Example 2.2.

For the binomial coefficients with alternating signs

we obtain the (forward) differences of order :

Note that does not only fulfill (2), but vanishes exactly for all ‘discrete polynomials of order ’, i.e., for all vectors from . Here we are interested in first and second order differences

Moreover, we will apply the ‘mixed second order’ difference with and use the notation

We want to define differences for points using their representation with respect to an arbitrary fixed center point. As the geodesic distance (1) these differences should be independent of the choice of the center point. This can be achieved if and only if the differences are shift invariant modulo . Let . We define the absolute cyclic difference of (resp. ) with respect to by

(4)

where denotes the component-by-component application of if , and , . The definition allows that points having the same value are treated separately, cf. Figure 5. This ensures that is a continuous map. For example we have . Figures 4 and 5 illustrate definition (4). For the absolute cyclic differences related to the differences in Example 2.2 we will use the simpler notation

(5)
{subfigure}

.8 {subfigure}.8 {subfigure}.8

Figure 1: (top) and (bottom).
Figure 2: (top) and (bottom).
Figure 3: Settings from the tangential maps of , , on using the representation system according to .
Figure 4: Three points , , on the circle (blue) and their inverse exponential maps at , , (dark blue), where  denotes the antipodal point of . In other words, we cut the circle at the point and unwind it with respect to the tangent line at the antipodal point . The absolute cyclic differences take the three pairwise different positions of the points , to each other into account. These are shown in\subrefsubfig:forpartsonR with respect to the representation system from the arbitrary point  in\subrefsubfig:p3.
Figure 5: Three points , on the circle, where and , . Though denote the same point on the circle they are treated separately in the definition of the absolute cyclic differences.

The following equivalent definition of absolute cyclic differences appears to be useful.

Lemma 2.3.

Let be sorted in ascending order as and set . Let denote the corresponding permutation matrix, i.e., and Consider the shifted versions of given by

where denotes the -th unit vector. Then it holds

(6)
Proof.

The first equality in (6) follows directly by definition (4). To see the second one, note that by linearity of the inner product we have

(7)

For the geodesic distance we obtain by (1) that . In general the relation

(8)

does not hold true as the following example shows.

Example 2.4.

In general the -th order absolute cyclic difference cannot be written as Consider for example the absolute cyclic third order difference for given by (6) as

We obtain

so that .

For relation  (8) holds true by the next lemma.

Proposition 2.5.

For the following relation holds true:

(9)

Note that we need only the minimum over in Proposition 2.5 and more precisely

(10)

where and

Proof.

Since for , we see that and .

First we consider . By Lemma 2.3 we obtain

(11)

where we can assume by the cyclic shift invariance of that .

If , then the corresponding permutation matrix in Lemma 2.3 is the identity matrix. Further we obtain that and by (11) we get

If , then and . In this case we get

This proves the first assertion.

For we can again assume that . Exploiting that

we have to consider the following three cases:

If , then is the identity matrix, and

If , then  and . By  (7) we have

If , then and . Here we obtain

This finishes the proof. ∎

3 Proximal mapping of absolute cyclic differences

For a proper, closed, convex function and the proximal mapping is defined by

(12)

see [34]. The above minimizer exits and is uniquely determined. Many algorithms which were recently used in variational image processing reduce to the iterative computation of values of proximal mappings. An overview of applications of proximal mappings is given in [35].

In this section, we are interested in proximal mappings of absolute cyclic differences , i.e., for . More precisely, we will determine for -valued vectors represented by the values

for and first and second order absolute cyclic differences , . Here means that we are looking for the representative of in . In particular, we will see that these proximal mapping are single-valued for with and have two values for .

We start by considering the proximal mappings of the appropriate differences in . Then we use the results to find the proximal functions of the absolute cyclic differences.

3.1 Proximity of differences on

First we give analytical expressions for , where and , . Since we could not find a corresponding reference in the literature, the computation of the minimizer of

(13)

is described in the following lemmas. We start with .

Lemma 3.1.

For given and , set

Then the minimizer of

(14)

is given by

(15)

and the minimum by

(16)
Proof.

Since , there exists a component and we rewrite

Substituting , and , we see that , where  is the minimizer of

The (Fenchel) dual problem of reads

(17)

and the relation between the minimizers of the primal and dual problems is given by

(18)

Rewriting (17) we see that is the minimizer of

where . Hence we obtain

and by (18) further

Substituting back results in (15) and plugging into we get (16). ∎

Example 3.2.

Let , , and .

  1. For and we get and so that the minimizer of follows by soft shrinkage of with threshold :

    (19)
  2. For and we obtain and . Consequently, the minimizer of is given by

    (20)
  3. For and we obtain and , so that the minimizer of is given by

    (21)

We will apply the following corollary.

Corollary 3.3.

Let . Further, let and be given such that . Then

(22)
Proof.

Set and . By assumption and according to (16) we have to consider three cases.

    [label=0.]
  1. Let . Then by assumption also  and we conclude by (16) that

  2. Let and . By (16) this implies

    Since and we obtain .

  3. Let and . By (16) this implies

    and we are done.∎

Next we consider the case .

Lemma 3.4.

Let .

  1. Then, for and , the minimizer of

    (23)

    is given by

    and the minimum by

    (24)
  2. If for some and , then

    (25)
Proof.
  1. Setting the gradient of (23) to zero results in

    Using the Sherman-Morrison formula [7, p. 129] it follows

    For the corresponding energy we obtain by straightforward computation

  2. follows directly from (24).∎

3.2 Proximity of absolute cyclic differences of first and second order

Now we turn to -valued data represented by . We are interested in the minimizers of

(26)

on for and . We start with the case .

Theorem 3.5.

For set . Let and , where is adapted to the respective length of .

  1. If , then the unique minimizer of  is given by

    (27)
  2. If , then has the two minimizers

    (28)

Note that for case ii) appears exactly if and are antipodal points.

Proof.

By (1) and Lemma 2.5 we can rewrite in (26) as

(29)
(30)

where . Let

We are looking for