# Nonlinear Spectral Decompositions by Gradient Flows of One-Homogeneous Functionals

Leon Bungert, Martin Burger, Antonin Chambolle, Matteo Novaga
###### Abstract

The aim of this paper is to establish a theory of nonlinear spectral decompositions in an infinite dimensional setting by considering the eigenvalue problem related to an absolutely one-homogeneous functional in a Hilbert space. This approach is motivated by works for the total variation and related functionals in $L^2$, where some interesting results on the eigenvalue problem and the relation to the total variation flow have been proven previously, and by recent results on finite-dimensional polyhedral semi-norms, where gradient flows can yield exact decompositions into eigenvectors.

We provide a geometric characterization of eigenvectors via a dual unit ball, which applies to the infinite-dimensional setting and allows applications to several relevant examples. In particular, our analysis highlights the role of subgradients of minimal norm and thus connects directly to gradient flows, whose time evolution can be interpreted as a decomposition of the initial condition into subgradients of minimal norm. We can show that if the gradient flow yields a spectral decomposition, it is equivalent in an appropriate sense to other schemes such as variational methods with regularization parameter equal to time, and that the decomposition has an interesting orthogonality relation. Indeed, we verify that all approaches where these equivalences were known before by other arguments – such as one-dimensional TV flow and multidimensional generalizations to vector fields or the gradient flow of certain polyhedral semi-norms – yield exact spectral decompositions, and we provide further examples. We also investigate extinction times and extinction profiles, which we characterize as eigenvectors in a very general setting, generalizing several results from the literature.

Keywords: Nonlinear spectral decomposition, gradient flows, nonlinear eigenvalue problems, one-homogeneous functionals, extinction profiles.

AMS Subject Classification: 35P10, 35P30, 47J10.

## 1 Introduction

Spectral properties and spectral decompositions are at the heart of many arguments in mathematics and physics; let us just mention the spectral decomposition of self-adjoint linear operators (cf. [30, 27]) or the eigenvalue problems for high-dimensional or nonlinear Schrödinger equations (cf. [33]) as two prominent examples. In signal and image processing a variety of successful approaches were based on Fourier transforms and Laplacian eigenfunctions; in image reconstruction and inverse problems the singular value decomposition is the fundamental tool for linear problems. Over the last two decades variational approaches and other techniques such as sparsity in anisotropic Banach spaces became popular and spectral techniques lost their dominant roles (cf. [7, 17, 18, 21, 28]).

Standard examples considered in the nonlinear setting are gradient flows of the form

$$\partial_t u(t)=-p(t),\quad p(t)\in\partial J(u(t)),\quad u(0)=f,$$

in some Hilbert space $\mathcal{H}$ with $\partial J$ denoting the subdifferential of a semi-norm $J$ (without Hilbertian structure in general) on a dense subspace, and variational problems of the form

$$\frac{1}{2}\|u-f\|^2+tJ(u)\to\min_u$$

with the norm in the first term being the one in the Hilbert space. As some recent results demonstrate, the role of eigenvalue problems and even spectral decompositions may be underestimated in such settings. First of all, data satisfying the nonlinear eigenvector relation

$$\lambda f\in\partial J(f)$$

for a scalar $\lambda$ give rise to analytically computable solutions for such problems, which was made precise for the TV flow (cf. [3, 5]) and is hidden in early examples for the variational problem (cf. [29] and [6] for a detailed discussion). Secondly, for a general datum $f$ the solution of the gradient flow satisfies

$$f=\int_0^\infty p(t)\,\mathrm{d}t,$$

i.e., the datum is decomposed into subgradients of the functional $J$. In the case that these subgradients are even eigenvectors, this is called a nonlinear spectral decomposition.
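To make the decomposition of the datum into subgradients concrete, the following minimal sketch works in a simple finite-dimensional model that is an assumption of this illustration and not taken from the text: the Hilbert space is $\mathbb{R}^n$ with the Euclidean inner product and $J$ is the $\ell^1$-norm. In that model the gradient flow is componentwise soft shrinkage, and the subgradient of minimal norm has the entries $\mathrm{sign}(f_i)$ on the components that have not yet vanished. All function names are hypothetical.

```python
# Illustrative assumption: H = R^n with the Euclidean inner product, J = l1-norm.

def sign(x):
    """Sign of a real number as an int in {-1, 0, 1}."""
    return (x > 0) - (x < 0)

def flow_solution(f, t):
    """u(t) for the l1 gradient flow: componentwise soft shrinkage by t."""
    return [sign(fi) * max(abs(fi) - t, 0.0) for fi in f]

def min_norm_subgradient(f, t):
    """p(t) = -du/dt: sign(f_i) while component i is nonzero, 0 afterwards."""
    return [float(sign(fi)) if abs(fi) > t else 0.0 for fi in f]

def integrate_subgradients(f, t_end, steps=20000):
    """Midpoint-rule approximation of the integral of p(t) over [0, t_end]."""
    dt = t_end / steps
    total = [0.0] * len(f)
    for k in range(steps):
        p = min_norm_subgradient(f, (k + 0.5) * dt)
        total = [a + dt * b for a, b in zip(total, p)]
    return total

f = [3.0, -1.5, 0.5, 0.0]
extinction_time = max(abs(fi) for fi in f)  # the flow is identically zero afterwards
reconstruction = integrate_subgradients(f, extinction_time)
print(reconstruction)  # should be close to f
```

In this model every $p(t)$ has entries in $\{-1,0,1\}$, so the subgradients are themselves eigenvectors of the $\ell^1$-norm and the integral is a spectral decomposition in the sense described above.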

In [6] some further interesting properties of nonlinear eigenvectors (in a more general setting) such as their use for scale estimates and several relevant examples have been provided. A rather surprising conjecture was made by Gilboa (cf. [25]), suggesting that TV flow and similar schemes can provide a spectral decomposition, i.e., time derivatives of the solution are related to eigenfunctions of the total variation. This was made precise in [15] in a certain finite-dimensional setting; furthermore, scenarios where a decomposition into eigenvectors can be computed were investigated. In more detail, functionals of the form $J(u)=\|Au\|_1$ with a matrix $A$ such that $AA^T$ is diagonally dominant lead to such spectral decompositions.

In an infinite-dimensional setting a detailed theory is widely open and will be the subject of this paper. We will consider an absolutely one-homogeneous functional $J$ on a Hilbert space and the corresponding eigenvalue problem $\lambda u\in\partial J(u)$. Effectively this means we look at a semi-norm defined on a subspace (being dense in many typical examples) of the Hilbert space and investigate the associated eigenvalue problem and gradient flow. The basic theory does not assume any relation between $J$ and the norm of the Hilbert space, but we shall see that many favourable properties of the gradient flow – such as finite time extinction, for instance – rely on a Poincaré-type inequality. That is, after factorizing out the null-space of the functional we have a continuous embedding of the Banach space with norm $J$ into the ambient Hilbert space. It is thus natural to think in terms of a Gelfand triple, with subgradients of $J$ existing a priori only in a dual space which is larger than the Hilbert space. The eigenvalue problem and the gradient flow naturally lead to considering only cases with a subgradient in the Hilbert space, which is an abstract regularity condition known as source condition in inverse problems (cf. [16, 7]). We shall see that a key role is played by the subgradient of minimal norm (known as minimal norm certificate in compressed sensing and related problems, cf. [23, 22, 19]). A first key contribution of this paper is a geometric characterization of eigenvectors in such a setting, which is based on a dual characterization of absolutely one-homogeneous convex functionals. Roughly speaking, we can interpret all possible subgradients as elements lying on the boundary of a dual unit ball (the subdifferential of $J$ at $0$) and single out eigenvectors as those elements which are a normal vector to an orthogonal supporting hyperplane of the ball. Thus, the eigenvalue problem becomes a geometric problem for a Banach space relative to a Hilbert space structure.

We also show that being a subgradient of minimal norm is a necessary condition for eigenvectors. This establishes an interesting connection to gradient flows, which automatically select subgradients of minimal norm as the time derivative of the primal variable. We hence study gradient flows in further detail and conclude that – if the above-noted geometric condition is satisfied – they yield a spectral decomposition, i.e., a representation of the initial data as an integral of eigenvectors with decreasing frequency (decreasing Hilbert space norm at fixed dual norm). Moreover, we show that if the gradient flow yields a spectral decomposition, this is already sufficient to obtain equivalence to the variational method as well as an inverse scale space method proposed as an alternative to obtain spectral decompositions (cf. [14]). With an appropriate reparameterization from the time in the gradient flow to a spectral dimension we rigorously obtain a spectral decomposition akin to the spectral decomposition of self-adjoint linear operators in Hilbert space as discussed in [15]. We apply our theory to several examples: in particular, it matches the finite-dimensional theory for polyhedral regularizations in [15], and it can also be used for the one-dimensional total variation flow, a flow of a divergence functional for vector fields, as well as for a combination of divergence and rotation sparsity. Moreover, we visit the simple case of a flow of the $L^1$-norm, which gives further intuition and limitations in a case where no Poincaré-type estimate between the convex functional and the Hilbert space norm is valid.

Finally, we also discuss the extinction times and extinction profiles of gradient flows, a problem that was studied for TV flow in detail before (cf. [3, 24, 8]). Our theory is general enough to allow for a direct generalization of the results to flows of absolutely one-homogeneous convex functionals and simplifies many proofs. In particular, we can show that under generic conditions the gradient flows have finite extinction time and there is an extinction profile, i.e., a left limit of the time derivative at the extinction, which is an eigenvector. Furthermore, we give sharp upper and lower bounds of the extinction time. For flows that yield a spectral decomposition, we obtain a simple relation between the initial datum, the extinction time, and the extinction profile. In the case of the one-dimensional total variation flow we get the results in [8] as special cases.

The remainder of the paper is organized as follows: in Section 2 we discuss some basic properties of absolutely one-homogeneous functionals and the related nonlinear eigenvalue problem. Section 3 is devoted to obtaining further geometric characterizations of eigenvectors and to working out connections to subgradients of minimal norm. In Section 4 we discuss the potential to obtain spectral decompositions by gradient flows. To this end we give an overview of the classical theory by Brezis, which shows that gradient flows generate subgradients of minimal norm, and we also provide equivalence results to other methods in the case of spectral decompositions, for which we give a geometric condition. Moreover, we discuss the appropriate scaling of the spectrum from time to eigenvalues in order to obtain a more suitable decomposition. In Section 5 we show that the geometric condition for obtaining a spectral decomposition is indeed satisfied for relevant examples such as 1D TV flow and multidimensional flows of vector fields with functionals based on divergence and rotation. Finally, in Section 6 we investigate the extinction profiles of the gradient flows, which we show to be eigenvectors also when the flow itself does not produce a spectral decomposition.

## 2 Absolutely homogeneous convex functionals and eigenvalue problems

In the following we collect some results about the basic class of convex functionals we consider in this paper; moreover, we provide basic definitions and results about the nonlinear eigenvalue problem related to such functionals.

### 2.1 Notation

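Let $(\mathcal{H},\langle\cdot,\cdot\rangle)$ be a real Hilbert space with induced norm $\|\cdot\|:=\sqrt{\langle\cdot,\cdot\rangle}$ and $J:\mathcal{H}\to\mathbb{R}\cup\{+\infty\}$ be a functional in the class $\mathcal{C}$, defined as follows:

###### Definition 2.1.

The class $\mathcal{C}$ consists of all maps $J:\mathcal{H}\to\mathbb{R}\cup\{+\infty\}$ such that

• $J$ is convex,

• $J$ is lower semicontinuous with respect to the strong topology on $\mathcal{H}$,

• $J$ is absolutely one-homogeneous, i.e.,

$$J(cu)=|c|J(u),\quad\forall u\in\mathcal{H},\ c\in\mathbb{R}\setminus\{0\}, \tag{2.1}$$

$$J(0)=0. \tag{2.2}$$

The effective domain and null-space of $J$ are given by

$$\mathrm{dom}(J):=\{u\in\mathcal{H}:J(u)<\infty\}, \tag{2.3}$$

$$\mathcal{N}(J):=\{u\in\mathcal{H}:J(u)=0\}. \tag{2.4}$$

Note that $\mathrm{dom}(J)$ and $\mathcal{N}(J)$ are not empty due to (2.2) and that $\mathcal{N}(J)$ is a (strongly and weakly) closed linear space [13] whose orthogonal complement we denote by $\mathcal{N}(J)^\perp$. The effective domain is also a linear space but not closed, in general. Given any $f\in\mathcal{H}$, the orthogonal projection of $f$ onto $\mathcal{N}(J)$ is

$$\bar{f}:=\operatorname*{arg\,min}_{u\in\mathcal{N}(J)}\|u-f\|, \tag{2.5}$$

and the “dual norm” of $f$ with respect to $J$ is defined as

$$\|f\|_*:=\sup_{u\in\mathcal{N}(J)^\perp}\frac{\langle f,u\rangle}{J(u)}. \tag{2.6}$$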

Note that this defines a norm on a suitable subspace of . Considering as a norm on the Banach space , the above dual norm is indeed defined on the dual space of (respectively a predual if it exists). Hence, we naturally obtain a Gelfand triple structure and the eigenvalue problem can also be understood as a relation between the geometries of the Hilbert space and the Banach spaces and .

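By $\partial J(u)$ we denote the subdifferential of $J$ in $u\in\mathcal{H}$, given by

$$\partial J(u):=\{p\in\mathcal{H}:\langle p,v\rangle\le J(v)\ \forall v\in\mathcal{H},\ \langle p,u\rangle=J(u)\}, \tag{2.7}$$

and define

$$\mathrm{dom}(\partial J):=\{u\in\mathcal{H}:\partial J(u)\neq\emptyset\}. \tag{2.8}$$

Of particular importance will be the subdifferential in zero

$$\partial J(0)=\{p\in\mathcal{H}:\langle p,v\rangle\le J(v)\ \forall v\in\mathcal{H}\}, \tag{2.9}$$

which uniquely determines $J$, as we will see. Using the concept of the dual norm (2.6), it can be easily seen that

$$\partial J(0)=\{p\in\mathcal{N}(J)^\perp:\|p\|_*\le 1\}, \tag{2.10}$$

i.e., roughly speaking the subdifferential of $J$ in $0$ coincides with the dual unit ball.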
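As an illustration of how (2.7) and (2.10) interact, consider the following sketch in a finite-dimensional model chosen here as an assumption: $\mathbb{R}^n$ with the Euclidean inner product and $J$ the $\ell^1$-norm, whose null-space is trivial and whose dual norm is the $\ell^\infty$-norm. Membership in $\partial J(u)$ then reduces to two checks: $p$ lies in the dual unit ball, and $\langle p,u\rangle=J(u)$. The helper names are hypothetical.

```python
# Illustrative assumption: H = R^n, J = l1-norm, so the dual norm is the l-infinity norm.

def l1(u):
    return sum(abs(x) for x in u)

def linf(p):
    return max(abs(x) for x in p)

def inner(p, u):
    return sum(a * b for a, b in zip(p, u))

def in_dual_unit_ball(p, tol=1e-12):
    """p lies in the subdifferential at 0, i.e. its dual norm is at most 1."""
    return linf(p) <= 1.0 + tol

def is_subgradient(p, u, tol=1e-12):
    """p is a subgradient at u: p is in the dual unit ball and <p, u> = J(u)."""
    return in_dual_unit_ball(p, tol) and abs(inner(p, u) - l1(u)) <= tol

u = [2.0, -5.0, 0.0]
print(is_subgradient([1.0, -1.0, 0.3], u))  # free entry where u vanishes
print(is_subgradient([1.0, -1.0, 1.2], u))  # leaves the dual unit ball
print(is_subgradient([0.5, -1.0, 0.0], u))  # in the ball, but <p, u> < J(u)
```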

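Lastly, we recall the definition of the Fenchel-conjugate of a general function $\Phi:\mathcal{H}\to\mathbb{R}\cup\{+\infty\}$ as

$$\Phi^*(u):=\sup_{v\in\mathcal{H}}\,\langle v,u\rangle-\Phi(v)$$

and of the indicator function of a subset $K\subset\mathcal{H}$ as

$$\chi_K(u):=\begin{cases}0,&u\in K,\\+\infty,&u\notin K.\end{cases}$$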

We refer to [4] for fundamental properties.

### 2.2 Absolutely one-homogeneous functionals

In the following, we collect some elementary properties of functionals in $\mathcal{C}$ and their subdifferentials whose proofs are either trivial or can be found in [15].

###### Proposition 2.2.

Let , , and . It holds

1. ,

2. and ,

3. for all ,

4. ,

5. is convex and weakly closed and it holds ,

6. ,

7. , i.e., for all .

As already indicated, the knowledge of the set $\partial J(0)$ suffices to uniquely determine a functional $J\in\mathcal{C}$. Furthermore, any such functional has a canonical dual representation, similar to the concept of dual norms. This is no surprise since the elements of $\mathcal{C}$ are semi-norms on subspaces of $\mathcal{H}$ and norms if and only if they have trivial null-space.

###### Theorem 2.3.

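$J$ belongs to $\mathcal{C}$ if and only if

$$J(u)=\sup_{p\in K}\langle p,u\rangle=\chi_K^*(u)$$

for a set $K\subset\mathcal{H}$ that meets $K=-K$. In this case, it holds

$$\partial J(0)=\overline{\mathrm{conv}}(K)$$

where $\overline{\mathrm{conv}}$ denotes the closed convex hull of a set.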

###### Proof.

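It is well-known that the Fenchel-conjugate of an absolutely one-homogeneous functional $J$ is given by $J^*=\chi_{\partial J(0)}$. Hence, for the first implication we observe that by choosing $K:=\partial J(0)$ and using that $J$ – being lower semi-continuous and convex – equals its double Fenchel-conjugate it holds

$$J(u)=J^{**}(u)=\chi_K^*(u)=\sup_{p\in K}\langle p,u\rangle.$$

Furthermore, $K$ meets $K=-K$, which can be seen from $\langle -p,u\rangle=\langle p,-u\rangle\le J(-u)=J(u)$ for $p\in K$.

Conversely, any $J$ given by $J(u)=\sup_{p\in K}\langle p,u\rangle$ where $K=-K$ trivially belongs to $\mathcal{C}$ and it holds $J^*=\chi_{\overline{\mathrm{conv}}(K)}$. Hence, by standard subdifferential calculus $\partial J(0)=\overline{\mathrm{conv}}(K)$ holds. ∎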
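The dual representation can be checked numerically in a small case. The sketch below assumes, purely for illustration, that the generating set consists of the eight vertices of the cube $[-1,1]^3$, which is symmetric about the origin; the supremum of the pairing over this finite set then reproduces the $\ell^1$-norm. The names `support_function` and `l1_norm` are hypothetical.

```python
from itertools import product

def support_function(K, u):
    """Supremum of <p, u> over a finite generating set K."""
    return max(sum(pi * ui for pi, ui in zip(p, u)) for p in K)

def l1_norm(u):
    return sum(abs(ui) for ui in u)

# Hypothetical generating set: vertices of the cube [-1, 1]^3 (symmetric about 0).
K = list(product((-1.0, 1.0), repeat=3))

u = [2.0, -0.5, 0.0]
scaled = [-2.0 * ui for ui in u]

print(support_function(K, u), l1_norm(u))
# Absolute one-homogeneity: scaling u by c = -2 scales the value by |c| = 2.
print(support_function(K, scaled), 2.0 * support_function(K, u))
```

Replacing the vertex set by its closed convex hull, the full cube, does not change the supremum, which is consistent with the closed convex hull appearing in the theorem.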

###### Remark 2.4.

Using the convexity and homogeneity of $J$, Jensen’s inequality immediately implies that the generalized triangle inequality

$$J\left(\int_a^b u(t)\,\mathrm{d}t\right)\le\int_a^b J(u(t))\,\mathrm{d}t \tag{2.11}$$

for a function $u:[a,b]\to\mathcal{H}$ holds, whenever these expressions make sense.

Due to Theorem 2.3, we will from now on assume that

$$J(u)=\sup_{p\in K}\langle p,u\rangle,\quad u\in\mathcal{H},$$

where $K=\partial J(0)$ (after possibly replacing $K$ by the closure of its convex hull).

### 2.3 Subgradients and eigenvectors

In this section, we will define the non-linear eigenvalue problem under consideration and provide first insights into the geometric connection of eigenvectors and the dual unit ball $K$. We start with general subgradients of a functional before we turn to the special case of subgradients which are eigenvectors.

###### Definition 2.5 (Subgradients).

Let $J\in\mathcal{C}$ and $u\in\mathrm{dom}(\partial J)$. Then the elements of the set $\partial J(u)$ are called subgradients of $J$ in $u$.

###### Proposition 2.6.

Let $p$ be a subgradient of $J$ in $u$. Then $p$ lies in the relative boundary of $K$, i.e., $p\in K\setminus\mathrm{relint}(K)$ where $\mathrm{relint}(K)$ denotes the relative interior of $K$.

###### Proof.

We observe that for all $v\in\mathcal{H}$ and $c\in[0,1]$ it holds

$$\langle cp,v\rangle=c\langle p,v\rangle\le cJ(v)\le J(v),$$

i.e., $cp\in K$. Let us assume that there is $c>1$ such that $cp\in K$. Then, using that $\langle p,u\rangle=J(u)$ yields

$$c\langle p,u\rangle=\langle cp,u\rangle\le J(u)=\langle p,u\rangle$$

which, for $J(u)>0$, clearly contradicts $c>1$. Hence, we have established the claim. ∎

Interestingly, due to the fact that subdifferentials are convex sets and lie in the (relative) boundary of the convex set $K$, they are either singletons or lie in a “flat” region of the boundary of $K$.

###### Definition 2.7 (Eigenvectors).

Let $J\in\mathcal{C}$. We say that $u\in\mathcal{H}$ is an eigenvector of $\partial J$ with eigenvalue $\lambda\in\mathbb{R}$ if

$$\lambda u\in\partial J(u). \tag{2.12}$$

###### Remark 2.8.

Due to the positive zero-homogeneity of $\partial J$, any multiple $cu$ of $u$ with $c>0$ is also an eigenvector of $\partial J$ with eigenvalue $\lambda/c$. To avoid this ambiguity one sometimes additionally demands $\|u\|=1$ from an eigenvector. In this work, however, we do not adopt this normalization since it does not match the flows that we are considering. As a consequence, all occurring eigenvalues should be multiplied by the norm of the eigenvector to become interpretable. E.g., let $\lambda u\in\partial J(u)$ with $u\neq 0$. Then $v:=u/\|u\|$ has unit norm and is an eigenvector of $\partial J$ with eigenvalue $\lambda\|u\|$ since $\lambda\|u\|v=\lambda u\in\partial J(u)=\partial J(v)$. The last equality follows from 6. in Proposition 2.2.
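The eigenvector relation and the effect of normalization can be tried out in a small model. The sketch assumes, as an illustration only, $\mathbb{R}^n$ with the Euclidean inner product and the $\ell^1$-norm as functional. Pairing $\lambda u\in\partial J(u)$ with $u$ forces $\lambda\|u\|^2=J(u)$, so the only candidate eigenvalue is $J(u)/\|u\|^2$, and it remains to test whether $\lambda u$ is a subgradient. All names are hypothetical.

```python
import math

# Illustrative assumption: H = R^n, J = l1-norm.

def l1(u):
    return sum(abs(x) for x in u)

def inner(p, u):
    return sum(a * b for a, b in zip(p, u))

def is_subgradient(p, u, tol=1e-12):
    """p in dJ(u) for the l1-norm: sup-norm at most 1 and <p, u> = J(u)."""
    return max(abs(x) for x in p) <= 1.0 + tol and abs(inner(p, u) - l1(u)) <= tol

def is_eigenvector(u, tol=1e-12):
    """Return (flag, eigenvalue candidate J(u)/||u||^2)."""
    norm_sq = inner(u, u)
    if norm_sq == 0.0:
        return True, 0.0  # null-space elements are eigenvectors with eigenvalue 0
    lam = l1(u) / norm_sq
    return is_subgradient([lam * x for x in u], u, tol), lam

u = [2.0, -2.0, 0.0]
print(is_eigenvector(u))                 # nonzero entries of equal magnitude
print(is_eigenvector([2.0, -1.0, 0.0]))  # unequal magnitudes: not an eigenvector

# Normalizing u multiplies the eigenvalue by ||u||.
norm_u = math.sqrt(inner(u, u))
v = [x / norm_u for x in u]
print(is_eigenvector(v)[1], is_eigenvector(u)[1] * norm_u)
```

In this model the eigenvectors are exactly the vectors whose nonzero entries share a common magnitude, which is a property of the $\ell^1$ example and not a general statement.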

Now we collect some basic properties of eigenvectors and, in particular, we show that eigenvectors are the elements of minimal norm in their respective subdifferential. Hence, one can restrict the search for eigenvectors to the subgradients of minimal norm; this shows a first connection to gradient flows which select the subgradients of minimal norm, as already indicated.

###### Proposition 2.9 (Properties of eigenvectors).

Let be an eigenvector of with eigenvalue . Then it holds

1. is eigenvector with eigenvalue ,

2. and if and only if ,

3. .

###### Proof.

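Ad 3.: It holds for all $p\in\partial J(u)$

$$\lambda\|u\|^2=\langle\lambda u,u\rangle=J(u)=\langle p,u\rangle\le\|p\|\|u\|$$

and, since $\lambda\ge 0$, we obtain $\|\lambda u\|\le\|p\|$. The convexity of $\partial J(u)$ shows that $\lambda u$ is the unique element of minimal norm. ∎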

###### Remark 2.10.

In the following, we will simply talk about eigenvectors whilst suppressing the dependency upon $J$, for brevity.

It is trivial that all elements in the null-space of $J$ are eigenvectors with eigenvalue $0$. However, these eigenvectors are only of minor interest as the example of total variation shows, where the null-space consists of all constant functions. Hence, in [6] so-called ground states were considered. These are eigenvectors in the orthogonal complement of the null-space with the lowest possible positive eigenvalue and, hence, the second smallest eigenvalue of the operator $\partial J$. In our setting, these ground states correspond to vectors with minimal norm in the boundary of $K$.

###### Proposition 2.11.

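Let $u_0$ be a ground state of $J$, defined as

$$u_0\in\operatorname*{arg\,min}_{\substack{u\in\mathcal{N}(J)^\perp\\ \|u\|=1}}J(u), \tag{2.13}$$

and let $\lambda_0:=J(u_0)$. Then $p_0:=\lambda_0u_0$ is an element of minimal norm in the relative boundary of $K$ and an eigenvector.

###### Proof.

It has been shown in [6] that ground states in (2.13) exist under relatively weak assumptions (e.g., if $J$ meets (6.3)) and are eigenvectors with eigenvalue $\lambda_0$. Furthermore, $\lambda_0$ is by definition the smallest eigenvalue that a normalized eigenvector in $\mathcal{N}(J)^\perp$ can have. Hence, $p_0=\lambda_0u_0\in\partial J(u_0)=\partial J(p_0)$, which shows that $p_0$ is an eigenvector. Let us assume there is $q$ in the relative boundary of $K$ such that $\|q\|<\lambda_0$. This implies with the Cauchy-Schwarz inequality and the definition of $\lambda_0$

$$\langle q,u\rangle\le\|q\|\|u\|<\lambda_0\|u\|\le J(u),\quad\forall u\in\mathcal{N}(J)^\perp\setminus\{0\}.$$

Since this inequality is strict, we can conclude that $q$ lies in the relative interior of $K$, a contradiction. Therefore, $\|p_0\|=\lambda_0$ is minimal, as claimed. ∎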

## 3 Geometric characterization of eigenvectors

In this section, we give a novel geometric characterization of eigenvectors. Simply speaking, eigenvectors (with eigenvalue 1) are exactly the vectors $p$ on the relative boundary of $K$ for which there exists a supporting hyperplane of $K$ through $p$ that is orthogonal to $p$. In other words, there is a zero-centered ball with respect to the Hilbert norm which is tangential to $K$ at $p$. All other eigenvectors are multiples of these. Since this geometric interpretation is not very handy in the case of infinite dimensional Hilbert spaces (e.g. function spaces), we will only work with an algebraic characterization in the following. We start with a lemma that allows us to limit ourselves to the study of eigenvectors with eigenvalue 1 without loss of generality.

###### Lemma 3.1.

$u$ is an eigenvector with eigenvalue $\lambda>0$ if and only if $\lambda u$ is an eigenvector with eigenvalue 1.

###### Proof.

The statement follows directly from 6. in Proposition 2.2. ∎

A key geometric characterization is provided by the following result:

###### Proposition 3.2.

$p\in K$ is an eigenvector with eigenvalue 1 if and only if

$$\langle p,p-q\rangle\ge 0\quad\forall q\in K. \tag{3.1}$$

###### Proof.

It holds that $p\in K$ is an eigenvector with eigenvalue 1 if and only if

$$\langle p,p\rangle=J(p)=\sup_{q\in K}\langle p,q\rangle.$$

This is equivalent to (3.1) which concludes the proof. ∎
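Condition (3.1) is easy to test when the dual unit ball is a polytope, because a linear functional attains its supremum over a polytope at a vertex. The sketch below assumes, for illustration only, that the dual unit ball is the cube $[-1,1]^n$, which corresponds to the $\ell^1$-norm on $\mathbb{R}^n$; the helper names are hypothetical.

```python
from itertools import product

def inner(p, q):
    return sum(a * b for a, b in zip(p, q))

def in_cube(p, tol=1e-12):
    """Membership in the assumed dual unit ball K = [-1, 1]^n."""
    return max(abs(x) for x in p) <= 1.0 + tol

def satisfies_condition_3_1(p, tol=1e-12):
    """Check <p, p - q> >= 0 for all q in K by testing the vertices of the cube."""
    if not in_cube(p, tol):
        return False
    pp = inner(p, p)
    vertices = product((-1.0, 1.0), repeat=len(p))
    return all(pp - inner(p, q) >= -tol for q in vertices)

print(satisfies_condition_3_1([1.0, 0.0, -1.0]))  # center of an edge of the cube
print(satisfies_condition_3_1([1.0, 0.5, 0.0]))   # boundary point, not an eigenvector
print(satisfies_condition_3_1([0.5, 0.5, 0.0]))   # interior point
```

The first vector passes because $\|p\|^2=\|p\|_1=2$, while the other two have $\|p\|^2<\|p\|_1$ and therefore violate (3.1) at some vertex.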

Proposition 3.2 has nice geometric interpretations: first of all, it means that $K$ lies on one side of the hyperplane orthogonal to $p$, the subgradient of minimal norm. Since this hyperplane is always a tangent plane to a ball in the Hilbert space $\mathcal{H}$, it provides a relation between the unit ball in $\mathcal{H}$ and the dual unit ball $K$: we obtain an eigenvector if a multiple of the unit ball in $\mathcal{H}$ touches the dual unit ball tangentially.

This geometric insight can be used to obtain the following results:

###### Corollary 3.3.

Let $p\in K$ be a point of maximal norm in $K$, i.e., $\|p\|\ge\|q\|$ for all $q\in K$. Then every positive multiple of $p$ is an eigenvector.

###### Example 3.4.

Consider the linear eigenvalue problem for a symmetric and positively semi-definite matrix . The corresponding functional is given by and is an ellipsoid. Along the main axes of the ellipsoid the hypersurface is tangential, hence the main axes define the eigenvectors.

###### Remark 3.5 (Existence of nonlinear spectral decompositions).

Note that unlike in the above-noted linear cases, where there are exactly $n$ different eigendirections, nonlinear eigenvectors in our setting may constitute an overcomplete generating set of the ambient space, as can be seen in Fig. 1. It shows four different sets $K$ together with an eigenvector from each eigendirection. All other eigenvectors are multiples of these. The leftmost set corresponds to the linear case and has two different eigendirections. In contrast, in the non-linear cases one can have more different eigendirections.

###### Proposition 3.6.

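Let $u\in\mathrm{dom}(\partial J)$ and $\hat{p}\in\partial J(u)$ be an eigenvector with eigenvalue 1. Then it holds

$$\langle\hat{p},\hat{p}-p\rangle=0,\quad\forall p\in\partial J(u), \tag{3.2}$$

which can be reformulated as

$$\langle\hat{p},p\rangle=\|\hat{p}\|^2,\quad\forall p\in\partial J(u).$$

###### Proof.

The nonnegativity of the left-hand side follows directly from the assumption that $\hat{p}$ is an eigenvector and thus fulfills (3.1). For the other inequality, we let $p\in\partial J(u)$ be arbitrary and consider $q:=\lambda p+(1-\lambda)\hat{p}$ for $\lambda\in(0,1)$, which is in $\partial J(u)$ as well, due to convexity. Using (3.1) and the minimality of $\|\hat{p}\|$ yields

$$\lambda\langle\hat{p},p\rangle+(1-\lambda)\|\hat{p}\|^2=\langle\hat{p},q\rangle\le\|\hat{p}\|^2\le\|q\|^2=\lambda^2\|p\|^2+(1-\lambda)^2\|\hat{p}\|^2+2\lambda(1-\lambda)\langle\hat{p},p\rangle.$$

Dividing this by $\lambda(1-\lambda)$ yields

$$\frac{1}{1-\lambda}\langle\hat{p},p\rangle+\frac{1}{\lambda}\|\hat{p}\|^2\le\frac{\lambda}{1-\lambda}\|p\|^2+\frac{1-\lambda}{\lambda}\|\hat{p}\|^2+2\langle\hat{p},p\rangle$$

which can be simplified to

$$\frac{1}{1-\lambda}\langle\hat{p},p\rangle\le\frac{\lambda}{1-\lambda}\|p\|^2-\|\hat{p}\|^2+2\langle\hat{p},p\rangle.$$

Letting $\lambda$ tend to zero and reordering shows

$$\langle\hat{p},\hat{p}-p\rangle\le 0,$$

hence equality holds. ∎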

The converse statement of Proposition 3.6 is false in general. This can be seen by choosing $K$ to be an ellipsoid. In this case all subdifferentials are singletons since the boundary of an ellipsoid does not contain convex sets consisting of two or more points. Hence, (3.2) is always met but not every boundary point has an orthogonal tangent hyperplane. However, the converse is true in finite dimensions if $K$ is a polyhedron.

###### Proposition 3.7.

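Let $K\subset\mathbb{R}^n$ be a convex polyhedron such that for all $u\in\mathrm{dom}(\partial J)$ the element $\hat{p}$ of minimal norm in $\partial J(u)$ satisfies condition

$$\langle\hat{p},\hat{p}-p\rangle=0,\quad\forall p\in\partial J(u) \tag{MINSUB}$$

from [15]. Then for every facet of $K$ the element of minimal norm is an eigenvector with eigenvalue 1; all other eigenvectors are multiples of those.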

###### Proof.

Let us fix and let be the element of minimal norm in . By the definition of the subdifferential and by Proposition 2.6 we infer that – being the intersection of and the hypersurface – must coincide with a facet of the polyhedron. Due to (MINSUB), the set defines a hypersurface through such that and is orthogonal to . Since is convex, all other points in lie on one side of which implies that is supporting and hence for all . With Proposition 3.2 we conclude that is an eigenvector with eigenvalue 1. ∎

###### Remark 3.8.

Notably, the minimality of $\hat{p}$ does not play a role in the proof of Proposition 3.7. However, from the Cauchy-Schwarz inequality it follows that only subgradients of minimal norm can satisfy (MINSUB).

## 4 Spectral decompositions by gradient flows

The fact that eigenvectors are subgradients of minimal norm motivates a further study of processes that generate such subgradients as a basis for spectral decompositions. Indeed, the theory of maximal monotone evolution equations shows that gradient flows have this desirable property.

### 4.1 Gradient flow theory from maximal monotone evolution equations

In this section we give a concise recap of the theory of non-linear evolution equations due to Brezis [9]. It deals with the solution of the differential inclusion

$$\begin{cases}\partial_tu(t)+Au(t)\ni 0,\\u(0)=f,\end{cases} \tag{4.1}$$

for times $t>0$. Here $A$ denotes a potentially non-linear and multi-valued operator defined on a subset $\mathrm{dom}(A)\subset\mathcal{H}$ and is assumed to be maximally monotone. That is,

$$\langle p-q,u-v\rangle\ge 0,\quad\forall p\in Au,\ q\in Av, \tag{4.2}$$

for all $u,v\in\mathrm{dom}(A)$, and $A$ cannot be extended to a monotone operator with larger domain (see [9] for a precise definition). Furthermore, one defines

$$A^0u:=\operatorname*{arg\,min}\{\|p\|:p\in Au\},\quad u\in\mathrm{dom}(A), \tag{4.3}$$

which is the norm-minimal element in $Au$, which is a convex set.

For this class of operators one has the following result:

###### Theorem 4.1 (Brezis, 1973).

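For all $f\in\mathrm{dom}(A)$ there exists a unique function $u:[0,\infty)\to\mathcal{H}$ such that

1. $u(t)\in\mathrm{dom}(A)$ for all $t>0$

2. $u$ is Lipschitz continuous on $[0,\infty)$, i.e., $\partial_tu\in L^\infty(0,\infty;\mathcal{H})$ (as distributional derivative) and it holds

$$\|\partial_tu\|_{L^\infty(0,\infty;\mathcal{H})}\le\|A^0f\| \tag{4.4}$$

3. (4.1) holds for almost every $t\in(0,\infty)$

4. $u$ is right-differentiable for all $t\in(0,\infty)$ and it holds

$$\partial_t^+u(t)+A^0u(t)=0,\quad\forall t\in(0,\infty) \tag{4.5}$$

5. The function $t\mapsto A^0u(t)$ is right-continuous and the function $t\mapsto\|A^0u(t)\|$ is non-increasing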

###### Proof.

For the proof see [9, Theorem 3.1]. ∎

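An important instance of maximally monotone operators are subdifferentials of lower semi-continuous convex functionals $J$. For these one can relax the assumption $f\in\mathrm{dom}(A)$ to $f\in\overline{\mathrm{dom}(A)}$ and obtains

###### Theorem 4.2 (Brezis, 1973).

Let $A=\partial J$ where $J$ is lower semi-continuous, convex, and proper, and let $f\in\overline{\mathrm{dom}(A)}$. Then there exists a unique function $u:[0,\infty)\to\mathcal{H}$ with $u(0)=f$ such that

1. $u(t)\in\mathrm{dom}(A)$ for all $t>0$

2. $u$ is Lipschitz continuous on $[\delta,\infty)$ for all $\delta>0$ and it holds

$$\|\partial_tu\|_{L^\infty(\delta,\infty;\mathcal{H})}\le\|A^0v\|+\frac{1}{\delta}\|f-v\|,\quad\forall v\in\mathrm{dom}(A),\ \forall\delta>0 \tag{4.6}$$

3. $u$ is right-differentiable for all $t\in(0,\infty)$ and it holds

$$\partial_t^+u(t)+A^0u(t)=0,\quad\forall t\in(0,\infty) \tag{4.7}$$

4. The function $t\mapsto A^0u(t)$ is right-continuous for all $t>0$ and the function $t\mapsto\|A^0u(t)\|$ is non-increasing

5. The function $t\mapsto J(u(t))$ is convex, non-increasing, Lipschitz continuous on $[\delta,\infty)$ for all $\delta>0$ and it holds

$$\frac{\mathrm{d}^+}{\mathrm{d}t}J(u(t))=-\|\partial_t^+u(t)\|^2,\quad\forall t>0 \tag{4.8}$$
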
###### Proof.

For the proof see [9, Theorem 3.2], where it should be noted that right-differentiability of the map $t\mapsto J(u(t))$ follows since it is Lipschitz continuous and non-increasing. ∎

Applying Theorem 4.2 to the so-called gradient flow of the functional $J$

$$\begin{cases}\partial_tu(t)=-p(t),\\p(t)\in\partial J(u(t)),\\u(0)=f,\end{cases} \tag{GF}$$

yields the existence of a unique solution $u$ with associated subgradients $p(t)$ which have minimal norm in $\partial J(u(t))$ for all $t>0$.
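Two consequences of Theorem 4.2 for the gradient flow, the dissipation identity (4.8) and the monotonicity of the subgradient norm, can be observed numerically. The sketch assumes, for illustration only, the $\ell^1$-norm on $\mathbb{R}^n$, for which the flow is explicit soft shrinkage; the function names are hypothetical and the right-derivative is approximated by a forward difference.

```python
# Illustrative assumption: H = R^n, J = l1-norm, explicit gradient flow solution.

def sign(x):
    return (x > 0) - (x < 0)

def u_of_t(f, t):
    """Flow solution: componentwise soft shrinkage of f by t."""
    return [sign(fi) * max(abs(fi) - t, 0.0) for fi in f]

def p_of_t(f, t):
    """Subgradient of minimal norm at u(t)."""
    return [float(sign(fi)) if abs(fi) > t else 0.0 for fi in f]

def J(u):
    return sum(abs(x) for x in u)

def norm_sq(p):
    return sum(x * x for x in p)

f = [3.0, -1.5, 0.5, 0.0]
t, h = 1.0, 1e-6

# Forward difference as a stand-in for the right-derivative of t -> J(u(t)).
right_derivative = (J(u_of_t(f, t + h)) - J(u_of_t(f, t))) / h
dissipation = -norm_sq(p_of_t(f, t))
print(right_derivative, dissipation)

# Squared norms of p(t) at increasing times; they should not increase.
norms = [norm_sq(p_of_t(f, s)) for s in (0.1, 0.6, 1.6, 3.1)]
print(norms)
```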

###### Remark 4.3.

From now on, we will denote all occurring right-derivatives with the usual derivative symbols $\partial_t$ and $\frac{\mathrm{d}}{\mathrm{d}t}$ to simplify our notation.

Having the geometric characterization of eigenvectors and the existence theory of gradient flows at hand, we are now interested in the scenario that the gradient flow yields a sequence of subgradients $p(t)$ which are eigenvectors, i.e., $p(t)\in\partial J(p(t))$. Reformulating this using the eigenvector characterization from Proposition 3.2 we obtain

###### Theorem 4.4.

The gradient flow (GF) yields a sequence of eigenvectors $p(t)$ for $t>0$ if and only if

$$\langle p(t),p(t)-q\rangle\ge 0,\quad\forall q\in K,\ \forall t>0.$$

Before giving examples of functionals that guarantee this to happen, we investigate the consequences of such a behavior of the flow. We prove that, in this case, the gradient flow is equivalent to a variational regularization method and an inverse scale space flow. Furthermore, disjoint increments of eigenvectors will turn out to be mutually orthogonal. Finally, we will use the subgradients of an arbitrary gradient flow to define a measure that acts as a generalization of the spectral measure corresponding to a self-adjoint / compact linear operator in the case that the $p(t)$ are eigenvectors.

### 4.2 Equivalence of gradient flow, variational method, and inverse scale space flow

First we show that if the gradient flow (GF) generates eigenvectors, this implies the equivalence with a variational regularization problem (VP), and the inverse scale space flow (ISS), given by

$$\begin{cases}v(t)=\operatorname*{arg\,min}_{v\in\mathcal{H}}E_t(v),\\E_t(v)=\frac{1}{2}\|v-f\|^2+tJ(v),\quad v\in\mathcal{H},\end{cases} \tag{VP}$$

$$\begin{cases}\partial_\tau r(\tau)=f-w(\tau),\\r(\tau)\in\partial J(w(\tau)),\\r(0)=0,\ w(0)=\bar{f}.\end{cases} \tag{ISS}$$

Here $t$ denotes a time / regularization parameter, whereas $\tau$ will turn out to correspond to the “inverse time” $1/t$. Furthermore, the initial condition $\bar{f}$ of the inverse scale space flow denotes the orthogonal projection of $f$ onto the null-space of $J$ defined in (2.5) (cf. [13] for more details). Note that all time derivatives ought to be understood in the weak sense, existent for almost all times, or in the sense of right-derivatives that exist for all times.

###### Theorem 4.5 (Equivalence of GF and VP).

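Let $u$ be a solution of the gradient flow (GF) and assume that for all $t>0$ it holds $p(t)\in\partial J(p(t))$. Then for $s\le t$ it holds $p(s)\in\partial J(u(t))$. Moreover, $u(t)=v(t)$ where $v(t)$ solves (VP).

###### Proof.

From (3.1) we see that in particular

$$\langle p(t),p(t)-p(s)\rangle\ge 0,\quad\forall\,0<s\le t$$

holds and hence, using $\partial_tu(t)=-p(t)$ together with 5. in Theorem 4.2,

$$0\ge\langle -p(t),p(t)-p(s)\rangle=\frac{\mathrm{d}}{\mathrm{d}t}J(u(t))-\langle\partial_tu(t),p(s)\rangle.$$

Integrating from $s$ to $t$ and using $\langle u(s),p(s)\rangle=J(u(s))$ yields

$$J(u(t))-J(u(s))-\langle u(t),p(s)\rangle+J(u(s))\le 0,$$

i.e., $J(u(t))\le\langle p(s),u(t)\rangle$, hence $p(s)\in\partial J(u(t))$. That $u(t)$ solves (VP) follows by observing that the Fejér mean $q(t):=\frac{1}{t}\int_0^tp(s)\,\mathrm{d}s$ is the appropriate subgradient for the optimality condition of $v(t)=u(t)$ being a minimizer of $E_t$, i.e.,

$$v(t)-f+tq(t)=0,\quad q(t)\in\partial J(v(t)).$$

The fact that (GF) and (VP) possess unique solutions concludes the proof. ∎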
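The equivalence of the flow and the variational problem can be observed in a simple model. The sketch assumes, as an illustration, the $\ell^1$-norm on $\mathbb{R}^n$, for which the minimizer of $E_t$ is the well-known componentwise soft thresholding with threshold $t$. The gradient flow is approximated by implicit Euler steps, each of which is itself a variational problem with a small regularization parameter. The names are hypothetical.

```python
# Illustrative assumption: H = R^n, J = l1-norm.

def soft_threshold(f, t):
    """Minimizer of 0.5 * ||v - f||^2 + t * ||v||_1 (componentwise shrinkage)."""
    return [(1.0 if x > 0 else -1.0) * max(abs(x) - t, 0.0) for x in f]

def implicit_euler_flow(f, t, steps):
    """Approximate u(t) by `steps` implicit Euler steps of size t / steps."""
    u = list(f)
    dt = t / steps
    for _ in range(steps):
        u = soft_threshold(u, dt)  # one implicit step = one small variational problem
    return u

f = [3.0, -1.5, 0.5, 0.0]
t = 1.0
u_flow = implicit_euler_flow(f, t, steps=1000)
v_variational = soft_threshold(f, t)
gap = max(abs(a - b) for a, b in zip(u_flow, v_variational))
print(u_flow, v_variational, gap)
```

For this functional the many small steps and the single large step agree up to rounding. For functionals whose flow does not generate eigenvectors the two methods can differ, so the agreement seen here should not be read as a general fact.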

Now we prove the equivalence of the variational problems and the inverse scale space flow. To avoid confusion due to the time reparametrization connecting $t$ and $\tau$, we denote the derivatives of $v$ and $q$ with respect to the regularization parameter in (VP) by $v'$ and $q'$, respectively. For instance, $v'(1/\tau)$ simply means $\partial_tv(t)|_{t=1/\tau}$ and this expression exists since $v=u$ by the previous theorem and $u$ is right-differentiable for all $t>0$.

###### Theorem 4.6 (Equivalence of VP and ISS).

Let the gradient flow (GF) generate eigenvectors $p(t)$. Let, furthermore, $v(t)$ be the solutions of the variational problem (VP) with subgradients $q(t)$. Then for $\tau=1/t$ the pair $(w,r)$, given by

$$w(\tau):=v(1/\tau)-\frac{1}{\tau}v'(1/\tau), \tag{4.9}$$

$$r(\tau):=q(1/\tau), \tag{4.10}$$

is a solution of the inverse scale space flow (ISS).

###### Proof.

From the optimality conditions of (VP) for $t>0$ we deduce

$$q(t)=\frac{f-v(t)}{t}.$$

By the quotient rule we obtain

$$q'(t)=\frac{v(t)-f-tv'(t)}{t^2}.$$

Inserting $t=1/\tau$ yields

$$q'(1/\tau)=\tau^2\left[v(1/\tau)-f-\frac{1}{\tau}v'(1/\tau)\right].$$

Using this we find with the chain rule

$$\partial_\tau^+r(\tau)=-\frac{1}{\tau^2}q'(1/\tau)=f-v(1/\tau)+\frac{1}{\tau}v'(1/\tau)=f-w(\tau),$$

hence, $(w,r)$ fulfills the inverse scale space equality. It remains to check that $r(\tau)\in\partial J(w(\tau))$. We use the fact that according to Theorem 4.5 the solutions of the gradient flow and the variational problem coincide, i.e., $u=v$ and $v'=-p$, to obtain

$$J(w(\tau))=J\left(v(1/\tau)-\frac{1}{\tau}v'(1/\tau)\right)=J\left(v(1/\tau)+\frac{1}{\tau}p(1/\tau)\right)\le J(v(1/\tau))+\frac{1}{\tau}J(p(1/\tau)).$$

Using that the subgradients $p(t)$ are eigenvectors with minimal norm in $\partial J(u(t))$ and invoking Proposition 3.6 we infer

$$J(p(1/\tau))=\|p(1/\tau)\|^2=\langle q(1/\tau),p(1/\tau)\rangle.$$

By inserting this and using that $q(1/\tau)\in\partial J(v(1/\tau))$ we obtain

$$J(w(\tau))\le\langle q(1/\tau),v(1/\tau)\rangle+\frac{1}{\tau}\langle q(1/\tau),p(1/\tau)\rangle=\langle r(\tau),w(\tau)\rangle.$$

The fact that $r(\tau)\in K$ finally shows that $r(\tau)\in\partial J(w(\tau))$. Once again, the uniqueness of solutions of (VP) and (ISS) concludes the proof. ∎
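The construction (4.9), (4.10) can be traced through in a small model. The sketch assumes, for illustration only, the $\ell^1$-norm on $\mathbb{R}^n$, where the variational solution is soft thresholding and its right-derivative in $t$ is minus the subgradient of minimal norm. It builds $w$ and $r$ from $v$ and $q$, then checks the subgradient inclusion and, by a forward difference, the evolution equation of (ISS). All names are hypothetical.

```python
# Illustrative assumption: H = R^n, J = l1-norm.

def sign(x):
    return (x > 0) - (x < 0)

def v(f, t):
    """Solution of (VP) for the l1-norm: soft thresholding."""
    return [sign(x) * max(abs(x) - t, 0.0) for x in f]

def v_prime(f, t):
    """Right-derivative of v in t: minus the minimal-norm subgradient."""
    return [-float(sign(x)) if abs(x) > t else 0.0 for x in f]

def q(f, t):
    """Subgradient from the optimality condition: (f - v(t)) / t."""
    return [(x - y) / t for x, y in zip(f, v(f, t))]

def w(f, tau):
    """Formula (4.9) with t = 1 / tau."""
    t = 1.0 / tau
    return [a - t * b for a, b in zip(v(f, t), v_prime(f, t))]

def r(f, tau):
    """Formula (4.10)."""
    return q(f, 1.0 / tau)

def is_subgradient(p, u, tol=1e-12):
    pairing = sum(a * b for a, b in zip(p, u))
    return max(abs(x) for x in p) <= 1.0 + tol and abs(pairing - sum(abs(x) for x in u)) <= tol

f = [3.0, -1.5, 0.5, 0.0]
tau, h = 1.0, 1e-6
w_tau, r_tau = w(f, tau), r(f, tau)

# Forward difference of r in tau, to be compared with f - w(tau).
dr = [(a - b) / h for a, b in zip(r(f, tau + h), r_tau)]
residual = [fi - wi for fi, wi in zip(f, w_tau)]
print(w_tau, r_tau, dr, residual)
```

In this model $w(\tau)$ keeps exactly the components of $f$ whose magnitude exceeds $1/\tau$, which matches the usual picture of an inverse scale space method recovering large-scale features first.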

### 4.3 Orthogonality of the decomposition

Simple examples of flows with piecewise constant subgradients show that it is false that two subgradients of a gradient flow corresponding to different time points are orthogonal. However, we are able to show that the differences of subgradients are orthogonal if the subgradients themselves are eigenvectors.

###### Theorem 4.7.

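If the gradient flow (GF) generates eigenvectors, it holds for $0<r\le s\le t$ that

$$\langle p(t),p(s)-p(r)\rangle=0. \tag{4.11}$$

###### Proof.

As stated in Theorem 4.5, it holds $p(s),p(r)\in\partial J(u(t))$ with $p(t)$ minimal in $\partial J(u(t))$. Hence, the assertion follows directly by employing Proposition 3.6 with $\hat{p}=p(t)$, $p=p(s)$, and $p=p(r)$ to obtain

$$\langle p(t),p(s)\rangle=\|p(t)\|^2=\langle p(t),p(r)\rangle.$$

∎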

###### Corollary 4.8.

If one defines the spectral increments

$$\phi(s,t)=p(t)-p(s),\quad s,t>0,$$

then Theorem 4.7 implies

$$\langle\phi(s_1,t_1),\phi(s_2,t_2)\rangle=0$$

for all $0<s_1\le t_1\le s_2\le t_2$.
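The orthogonality of increments over non-overlapping time intervals, and the lack of orthogonality of the subgradients themselves, can be seen in a small model. The sketch assumes, for illustration only, the $\ell^1$-norm on $\mathbb{R}^n$ and increments taken over the two intervals $[0.2,0.7]$ and $[1.6,2.5]$; the names are hypothetical.

```python
# Illustrative assumption: H = R^n, J = l1-norm, whose flow generates eigenvectors.

def sign(x):
    return (x > 0) - (x < 0)

def p(f, t):
    """Subgradient of minimal norm of the l1 gradient flow at time t."""
    return [float(sign(x)) if abs(x) > t else 0.0 for x in f]

def phi(f, s, t):
    """Spectral increment p(t) - p(s)."""
    return [a - b for a, b in zip(p(f, t), p(f, s))]

def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

f = [3.0, -1.5, 0.5, 2.0]

# Increments over the non-overlapping intervals [0.2, 0.7] and [1.6, 2.5].
increment_product = inner(phi(f, 0.2, 0.7), phi(f, 1.6, 2.5))

# Relation (4.11) with r = 0.7, s = 1.6, t = 2.5.
relation_4_11 = inner(p(f, 2.5), phi(f, 0.7, 1.6))

# The subgradients themselves are not orthogonal.
subgradient_product = inner(p(f, 0.2), p(f, 2.5))

print(increment_product, relation_4_11, subgradient_product)
```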

Ideally, one would like to obtain this orthogonality relation for the time derivative of $p$ which, however, only exists in a distributional sense. Formally one would define

$$\phi(t)=-t\,\partial_tp(t)$$

to obtain an orthogonal spectral representation of the data in the sense of [15], i.e.,

$$f=\int_0^\infty\phi(t)\,\mathrm{d}t+\bar{f},\qquad\langle\phi(s),\phi(t)\rangle=0,\quad s\neq t.$$

Here, $\phi$ formally acts as a spectral measure. Since, however, this approach fails due to the lacking regularity of $p$, we will present a rigorous definition of a spectral measure in the next section.

###### Remark 4.9.

It remains to be mentioned that all results in Sections 4.2 and 4.3 have already been proved in a finite dimensional setting [15], assuming that $K$ is a polyhedron that meets (MINSUB) or the seemingly stronger condition that $J(u)=\|Au\|_1$ with a matrix $A$ such that $AA^T$ is diagonally dominant. However, these results are not stronger than ours because using the results in Section 3 it is trivial to show that $K$ satisfying (5.13) implies that the gradient flow yields eigenvectors (see also Theorem 5.1). Hence, our results are a proper generalization to infinite dimensions and, furthermore, do not require a special structure of the functional but only that the gradient flow produces eigenvectors.

### 4.4 Large time behavior and the spectral measure

In this section we aim at defining a measure that corresponds to the spectral measure of linear operator theory in the case that the gradient flow (GF) generates eigenvectors. However, all statements in this section are true without this assumption. In order to define the measure rigorously, we have to investigate the large time behavior of the gradient flow first.

###### Theorem 4.10 (Large time behavior).

Let $u$ solve (GF). Then $u(t)$ strongly converges to $\bar f$ as $t\to\infty$.

###### Proof.

The proof for strong convergence of to some as is given in [12, Theorem 5] for even and hence, in particular, for absolutely one-homogeneous . To see that we observe

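$$\langle u(t)-f,v\rangle=\int_0^t\langle\partial_t u(s),v\rangle\,\mathrm{d}s=-\int_0^t\langle p(s),v\rangle\,\mathrm{d}s=0,\quad\forall v\in\mathcal{N}(J),\ t>0,$$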

by using 7. in Proposition 2.2. Hence and by using the strong closedness of the orthogonal complement in Hilbert spaces we infer and . This is equivalent to . ∎

###### Corollary 4.11.

Let $u$ solve (GF). Then it holds

$$\lim_{T\to\infty}\left\|\int_0^T p(t)\,\mathrm{d}t-(f-\bar f)\right\|=0. \tag{4.12}$$

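To make Theorem 4.10 and Corollary 4.11 tangible, the following Python sketch uses a toy functional that is not taken from the paper: $J(u)=|u_1-u_2|$ on $\mathbb{R}^2$, whose null space is spanned by $(1,1)$. It assumes that $\bar f$ denotes the orthogonal projection of $f$ onto $\mathcal{N}(J)$ (here the mean of the two entries), which is how the proof of Theorem 4.10 uses it. The closed-form flow and the helper names `flow` and `integral_p` are hypothetical and specific to this example.

```python
def flow(f, t):
    """Gradient flow of J(u) = |u_1 - u_2| on R^2 started at f.

    Returns (u(t), p(t)), where p(t) is the minimal-norm subgradient:
    sign(u_1 - u_2) * (1, -1) before extinction and (0, 0) afterwards.
    """
    d = f[0] - f[1]
    s = (d > 0) - (d < 0)
    t_ext = abs(d) / 2.0                 # time at which both entries meet
    if t < t_ext:
        return (f[0] - t * s, f[1] + t * s), (s, -s)
    mean = 0.5 * (f[0] + f[1])
    return (mean, mean), (0, 0)

def integral_p(f, T):
    """Exact integral of p over [0, T]; p is constant until extinction."""
    d = f[0] - f[1]
    s = (d > 0) - (d < 0)
    length = min(T, abs(d) / 2.0)
    return (length * s, -length * s)

f = (5.0, 1.0)                           # illustrative data
f_bar = (0.5 * (f[0] + f[1]),) * 2       # projection onto span{(1, 1)}

for T in (0.5, 1.0, 2.0, 10.0):
    u_T, _ = flow(f, T)
    I = integral_p(f, T)
    # residual of (4.12) at the finite horizon T
    residual = max(abs(I[i] - (f[i] - f_bar[i])) for i in range(2))
    print(T, u_T, residual)
```

In this example the residual vanishes after a finite extinction time. The subgradient $(s,-s)$ is orthogonal to $(1,1)$ at all times, which is the mechanism behind the identity in the proof of Theorem 4.10.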
###### Lemma 4.12.

The solution of with is given by where solves with .

###### Proof.

The proof follows directly from items 3 and 7 of Proposition 2.2. ∎

Now we consider solving with and, without loss of generality, we can assume . This can always be achieved by replacing with and using Lemma 4.12. Note that if and only if . Then, by Theorem 4.10 together with Corollary 4.11, we infer that

$$f=\int_0^\infty p(t)\,\mathrm{d}t. \tag{4.13}$$

Let us compare this to the statement of the spectral theorem for a self-adjoint linear operator $T$. For such an operator one has

$$\Phi(T)f=\int_{\sigma(T)}\Phi(\lambda)\,\mathrm{d}E_\lambda f,$$

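where $\sigma(T)$ denotes the spectrum of $T$, $E_\lambda$ is the spectral measure, $\Phi$ is a function, and $f$ is an element of the underlying Hilbert space. By choosing $\Phi(\lambda)=\lambda$ one obtains the spectral decomposition of the operator itself, and by choosing $\Phi\equiv 1$ one obtains the decomposition of the identity instead: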

$$f=\int_{\sigma(T)}\mathrm{d}E_\lambda f. \tag{4.14}$$

If $T$ is, in addition, compact, the spectral measure becomes atomic and is given by

$$E_\lambda=\sum_{k=1}^\infty\delta_{\lambda_k}(\lambda)\,\langle\cdot,e_k\rangle e_k \tag{4.15}$$

where $(e_k)_{k\in\mathbb{N}}$ denotes a set of orthonormal eigenvectors of $T$ with a null sequence of eigenvalues $(\lambda_k)_{k\in\mathbb{N}}$. Plugging this into (4.14), one can express any $f$ as a linear combination of eigenvectors:

$$f=\sum_{k=1}^\infty\langle f,e_k\rangle e_k. \tag{4.16}$$
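For readers who prefer a numerical picture of the linear case, the following Python sketch checks the finite-dimensional analogue of (4.14)-(4.16) for a hypothetical symmetric $2\times 2$ matrix, where the sums are finite. The matrix `T`, its hand-computed eigenpairs, and the vector `f` are illustrative choices and not part of the paper.

```python
import math

# Symmetric matrix with eigenvalues 3 and 1 and orthonormal eigenvectors
# (1, 1)/sqrt(2) and (1, -1)/sqrt(2); all values are illustrative.
T = [[2.0, 1.0], [1.0, 2.0]]
c = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, [c, c]), (1.0, [c, -c])]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(A, x):
    return [dot(row, x) for row in A]

f = [4.0, -2.0]
coeffs = [dot(f, e) for _, e in eigenpairs]          # <f, e_k>

# Phi = 1: decomposition of the identity, f = sum_k <f, e_k> e_k
recon = [sum(a * e[i] for a, (_, e) in zip(coeffs, eigenpairs))
         for i in range(2)]

# Phi(lambda) = lambda: the operator itself, T f = sum_k lambda_k <f, e_k> e_k
Tf_spectral = [sum(lam * a * e[i] for a, (lam, e) in zip(coeffs, eigenpairs))
               for i in range(2)]

print(recon, Tf_spectral, matvec(T, f))
```

Both sums agree with the direct computations up to rounding, which is the property the nonlinear spectral measure is meant to mimic.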

Consequently, our aim is to manipulate the measure in (4.13) in such a way that it becomes a non-linear generalization of (4.14)-(4.16) for the case of the maximal monotone operator . In particular, it should become atomic if is a sequence of countably / finitely many distinct eigenvectors with eigenvalue 1. To achieve this we condense all with the same value of into one atomic point which can be considered as the corresponding eigenvalue of . Since the function is non-increasing and converges to zero according to Theorem 4.2 and [10, Theorem 7], this yields a perfect analogy to the linear situation where only orthogonality has to be replaced by orthogonality of differences of eigenvectors.

###### Definition 4.13 (Spectral measure).

Let solve with and and let be the measure

$$\tilde\mu(A)=\int_A p(t)\,\mathrm{d}t, \tag{4.17}$$

for Borel sets . If we set , then the spectral measure of with respect to is defined as the pushforward of through , i.e.,

$$\mu(B):=\tilde\mu(\lambda^{-1}(B)), \tag{4.18}$$

for Borel sets . The spectrum of is given by

$$\sigma(\mu):=\{\lambda(t)\,:\,t>0\}. \tag{4.19}$$

###### Remark 4.14.

Note that $\lambda$ is indeed a measurable map since it is non-increasing according to item 4 of Theorem 4.2, which makes $\mu$ well-defined.

By definition of $\mu$ it holds

$$\int_{\sigma(\mu)}\mathrm{d}\mu=\int_{(0,\infty)}\mathrm{d}\tilde\mu=f,$$

i.e., $\mu$ has a reconstruction property like (4.14). Furthermore, if $\{p(t)\,:\,t>0\}$ is a collection of countably many distinct eigenvectors, the map $\lambda$ only has a countable range. Consequently, the measure $\mu$, which is supported on the range of $\lambda$, becomes concentrated in countably many points and, hence, atomic. This is the analogy to the linear case (4.15).
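The atomic structure can be made explicit in a small example that is not taken from the paper: for $J(u)=\|u\|_1$ on $\mathbb{R}^n$ the minimal-norm subgradient $p_i(t)=\operatorname{sign}(f_i)$ for $|f_i|>t$ (and $0$ otherwise) is piecewise constant in time. The Python sketch below collects one atom per time interval on which $p$ is constant and checks the reconstruction property. The label `lam` uses $\|p(t)\|$ as an assumed stand-in for $\lambda(t)$, whose definition is not reproduced in the text above; the helper names and the data are hypothetical.

```python
import math

def sign(x):
    return (x > 0) - (x < 0)

def p(f, t):
    """Minimal-norm subgradient of the l1 norm along its flow (assumed closed form)."""
    return [sign(fi) if abs(fi) > t else 0 for fi in f]

f = [3.0, -1.5, 0.5, 2.0]                 # illustrative data, N(J) = {0}

# p(t) only changes when t crosses one of the values |f_i|
breaks = [0.0] + sorted(set(abs(fi) for fi in f if fi != 0))

atoms = []
for a, b in zip(breaks, breaks[1:]):
    direction = p(f, 0.5 * (a + b))       # constant value of p on (a, b)
    lam = math.sqrt(sum(x * x for x in direction))   # assumed label ||p(t)||
    mass = [(b - a) * x for x in direction]          # integral of p over (a, b)
    atoms.append((lam, mass))

# Reconstruction: the atoms sum up to the data
recon = [sum(m[i] for _, m in atoms) for i in range(len(f))]

for lam, mass in atoms:
    print(round(lam, 4), mass)
print(recon)
```

Under these assumptions the measure consists of four atoms with strictly decreasing labels, and their sum returns `f`.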

### 4.5 A sufficient condition for spectral decompositions

Before moving to examples of specific functionals whose gradient flows yield spectral decompositions of any data, we first give a sufficient condition on the data such that the gradient flow of any functional computes a spectral decomposition of the data. Although quite specific, our condition can still be shown to be weaker than the (SUB0) + orthogonality condition defined in [31], which appears to be the only sufficient condition of this kind.

###### Theorem 4.15.

Let and assume that where for all . Then solves the gradient flow (GF), i.e., for .

###### Proof.

Obviously, it holds and thus it remains to be checked that for . We compute

$$J(u(t))=J\left(\int_t^T p(s)\,\mathrm{d}s\right)\leq\int_t^T J(p(s))\,\mathrm{d}s=\int_t^T\langle p(t),p(s)\rangle\,\mathrm{d}s=\langle p(t),u(t)\rangle.$$

Together with for all this concludes the proof. ∎
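The chain of (in)equalities in the proof can be checked numerically in a toy setting that is not taken from the paper. The Python sketch below again uses $J(u)=\|u\|_1$ with the assumed closed forms $u_i(t)=\operatorname{sign}(f_i)\max(|f_i|-t,0)$ and $p_i(t)=\operatorname{sign}(f_i)$ for $|f_i|>t$, and verifies the identity $\langle p(t),p(s)\rangle=J(p(s))$ for $s\geq t$ used in the computation above, as well as the resulting equality $J(u(t))=\langle p(t),u(t)\rangle$. All names and values are hypothetical.

```python
def sign(x):
    return (x > 0) - (x < 0)

def J(u):
    """l1 norm, an absolutely one-homogeneous functional on R^n."""
    return sum(abs(x) for x in u)

def u(f, t):
    """Soft shrinkage: assumed closed form of the l1 gradient flow."""
    return [sign(fi) * max(abs(fi) - t, 0.0) for fi in f]

def p(f, t):
    """Assumed minimal-norm subgradient along the flow."""
    return [sign(fi) if abs(fi) > t else 0 for fi in f]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

f = [3.0, -1.5, 0.5, 2.0]     # illustrative data
t = 1.0

# Identity used in the proof: <p(t), p(s)> = J(p(s)) for s >= t
pairing_ok = all(dot(p(f, t), p(f, s)) == J(p(f, s)) for s in (1.0, 1.8, 2.5))

# Conclusion of the computation: J(u(t)) = <p(t), u(t)>
energy = J(u(f, t))
pairing = dot(p(f, t), u(f, t))

print(pairing_ok, energy, pairing)
```

Since every entry of `p(f, t)` has modulus at most one, $\langle p(t),v\rangle\leq J(v)$ holds for all $v$ in this example, so the equality confirms that $p(t)$ is a subgradient of $J$ at $u(t)$.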

###### Remark 4.16.

To see the connection to the (SUB0) + orthogonality condition, note that any finite representation of the data as

$$f=\sum_{i=1}^N\gamma_i u_i$$

with numbers and eigenvectors meeting can be rewritten as

$$f=t_1\sum_{i=1}^N\lambda_i u_i+(t_2-t_1)\sum_{i=2}^N\lambda_i u_i+\dots+(t_N-t_{N-1})\sum_{i=N}^N\lambda_i u_i$$

where and we can assume the ordering