Shape deformation analysis from the optimal control viewpoint

Shape deformation analysis from the optimal control viewpoint

Sylvain Arguillère111Sorbonne Universités, UPMC Univ Paris 06, CNRS UMR 7598, Laboratoire Jacques-Louis Lions, F-75005, Paris, France (sylvain.arguillere@upmc.fr).    Emmanuel Trélat222Sorbonne Universités, UPMC Univ Paris 06, CNRS UMR 7598, Laboratoire Jacques-Louis Lions, Institut Universitaire de France and Team GECO Inria Saclay, F-75005, Paris, France (emmanuel.trelat@upmc.fr).    Alain Trouvé333Ecole Normale Supérieure de Cachan, Centre de Mathématiques et Leurs Applications, CMLA, 61 av. du Pdt Wilson, F-94235 Cachan Cedex, France (trouve@cmla.ens-cachan.fr).    Laurent Younes444Johns Hopkins University, Center for Imaging Science, Department of Applied Mathematics and Statistics, Clark 324C, 3400 N. Charles st. Baltimore, MD 21218, USA (laurent.younes@jhu.edu).
Abstract

A crucial problem in shape deformation analysis is to determine a deformation of a given shape into another one, which is optimal for a certain cost. It has a number of applications in particular in medical imaging.

In this article we provide a new general approach to shape deformation analysis, within the framework of optimal control theory, in which a deformation is represented as the flow of diffeomorphisms generated by time-dependent vector fields. Using reproducing kernel Hilbert spaces of vector fields, the general shape deformation analysis problem is specified as an infinite-dimensional optimal control problem with state and control constraints. In this problem, the states are diffeomorphisms and the controls are vector fields, both of them being subject to some constraints. The functional to be minimized is the sum of a first term defined as geometric norm of the control (kinetic energy of the deformation) and of a data attachment term providing a geometric distance to the target shape.

This point of view has several advantages. First, it allows one to model general constrained shape analysis problems, which opens new issues in this field. Second, using an extension of the Pontryagin maximum principle, one can characterize the optimal solutions of the shape deformation problem in a very general way as the solutions of constrained geodesic equations. Finally, recasting general algorithms of optimal control into shape analysis yields new efficient numerical methods in shape deformation analysis. Overall, the optimal control point of view unifies and generalizes different theoretical and numerical approaches to shape deformation problems, and also allows us to design new approaches.

The optimal control problems that result from this construction are infinite dimensional and involve some constraints, and thus are nonstandard. In this article we also provide a rigorous and complete analysis of the infinite-dimensional shape space problem with constraints and of its finite-dimensional approximations.

Keywords: shape deformation analysis, optimal control, reproducing kernel Hilbert spaces, Pontryagin maximum principle, geodesic equations.

AMS classification: 58E99 49Q10 46E22 49J15 62H35 53C22 58D05

1 Introduction

The mathematical analysis of shapes has become a subject of growing interest in the past few decades, and has motivated the development of efficient image acquisition and segmentation methods, with applications to many domains, including computational anatomy and object recognition.

The general purpose of shape analysis is to compare two (or more) shapes in a way that takes into account their geometric properties. Two shapes can be very similar from a human’s point of view, like a circle and an ellipse, but very different from a computer’s automated perspective. In Shape Deformation Analysis, one optimizes a deformation mapping one shape onto the other and bases the analysis on its properties. This of course implies that a cost has been assigned to every possible deformation of a shape, the design of this cost function being a crucial step in the method. This approach has been used extensively in the analysis of anatomical organs from medical images (see [15]).

In this framework, a powerful and convenient approach represents deformations as flows of diffeomorphisms generated by time-dependent vector fields [12, 28, 29]. Indeed, when considering the studied shapes as embedded in a real vector space , deformations of the whole space, like diffeomorphisms, induce deformations of the shape itself. The set of all possible deformations is then defined as the set of flows of time-dependent vector fields of a Hilbert space , called space of ”infinitesimal transformations”, which is a subset of the space of all smooth bounded vector fields on .

This point of view has several interesting features, not the least of which being that the space of possible deformations is a well-defined subgroup of the group of diffeomorphisms, equipped with a structure similar to the one of a right-invariant sub-Riemannian metric [8, 24]. This framework has led to the development of a family of registration algorithms called Large Deformation Diffeomorphic Metric Mapping (LDDMM), in which the correspondence between two shapes comes from the minimization of an objective functional defined as a sum of two terms [6, 7, 19, 22, 23]. The first term takes into account the cost of the deformation, defined as the integral of the squared norm of the time-dependent vector field from which it arises. In a way, it is the total kinetic energy of the deformation. The second term is a data attachment penalizing the difference between the deformed shape and a target.

An appropriate class of Hilbert spaces of vector fields for is the one of reproducing kernel Hilbert spaces (in short, RKHS) [5], because they provide very simple solutions to the spline interpolation problem when the shape is given by a set of landmarks [32, 34], which is an important special case since it includes most practical situations after discretization. This framework allows one to use tools from Riemannian geometry [32], along with classical results from the theory of Lie groups equipped with right-invariant metrics [3, 4, 17, 21, 34]. These existing approaches provide an account for some of the geometric information in the shape, like singularities for example. However, they do not consider other intrinsic properties of the studied shape, which can also depend on the nature of the object represented by the shape. For example, for landmarks representing articulations of a robotic arm, the deformation can be searched so as to preserve the distance between certain landmarks. For cardiac motions, it may be relevant to consider deformations of the shape assuming that the movement only comes from a force applied only along the fiber structure of the muscle. In other words, it may be interesting to constrain the possible deformations (by considering non-holonomic constraints) in order to better fit the model.

In order to take into account such constraints in shape deformation problems, we propose to model these problems within the framework of optimal control theory, where the control system would model the evolution of the deformation and the control would be the time-dependent vector field (see preliminary ideas in [32]).

The purpose of this paper is to develop the point of view of optimal control for shape deformation analysis as comprehensively as possible. We will show the relevance of this framework, in particular because it can be used to model constrained shapes among many other applications.

Indeed, a lot of tools have been developed in control theory for solving optimal control problems with or without constraints. The well-known Pontryagin maximum principle (in short PMP, see [26]) provides first-order conditions for optimality in the form of Hamiltonian extremal equations with a maximization condition permitting the computation of the optimal control. It has been generalized in many ways, and a large number of variants or improvements have been made over the past decades, with particular efforts in order to be able to address optimal control problems involving general state/control constraints (see the survey article [16] and the many references therein). The analysis is, however, mainly done in finite dimension. Since shape analysis has a natural setting in infinite dimension (indeed, in 2D, the shape space is typically a space of smooth curves in ), we need to derive an appropriate infinite-dimensional variant of the PMP for constrained problems. Such a variant is nontrivial and nonstandard, given that our constrained shape analysis problems generally involve an infinite number of equality constraints.

Such a PMP will allow us to derive in a rigorous geometric setting the (constrained) geodesic equations that must be satisfied by the optimal deformations.

Moreover, modeling shape deformation problems within the framework of optimal control theory can inherit from the many numerical methods in optimal control and thus lead to new algorithms in shape analysis.

The paper is organized as follows.

Section 2 is devoted to modeling shape deformation problems with optimal control. We first briefly describe, in Section 2.1, the framework of diffeomorphic deformations arising from the integration of time-dependent vector fields belonging to a given RKHS, and recall some properties of RKHS’s of vector fields. In Section 2.2 we introduce the action of diffeomorphisms on a shape space, and we model and define the optimal control problem on diffeomorphisms which is at the heart of the present study, where the control system stands for the evolving deformation and the minimization runs over all possible time-dependent vector fields attached to a given RKHS and satisfying some constraints. We prove that, under weak assumptions, this problem is well posed and has at least one solution (Theorem 1). Since the RKHS is in general only known through its kernel, we then provide a kernel formulation of the optimal control problem and we analyze the equivalence between both problems. In Section 2.3 we investigate in our framework two important variants of shape spaces, which are lifted shapes and multi-shapes. Section 2.4 is devoted to the study of finite-dimensional approximations of the optimal control problem. Section 2.5 contains a proof of Theorem 1.

Section 3 is dedicated to the derivation of the constrained geodesic equations in shape spaces, that must be satisfied by optimal deformations. We first establish in Section 3.1 an infinite dimensional variant of the PMP which is adapted to our setting (Theorem 2). As an application, we derive in Section 3.2 the geodesic equations in shape spaces (Theorem 3), in a geometric setting, and show that they can be written as a Hamiltonian system.

In Section 4, we design some algorithms in order to solve the optimal control problem modeling the shape deformation problem. Problems without constraints are first analyzed in Section 4.1, and we recover some already known algorithms used in unconstrained shape spaces, however with a more general point of view. We are thus able to extend and generalize existing methods. Problems with constraints are investigated in Section 4.2 in view of solving constrained matching problems. We analyze in particular the augmented Lagrangian algorithm, and we also design a method based on shooting.

In Section 5 we provide numerical examples, investigating first a matching problem with constant total volume, and then a multishape matching problem.

2 Modelling shape deformation problems with optimal control

The following notations will be used throughout the paper. Let fixed. A vector can be as well viewed as a column matrix of length . The Euclidean norm of is denoted by . The inner product between two vectors can as well be written, with matrix notations, as , where is the transpose of . In particular one has .

Let be a Banach space. The norm on is denoted by , and the inner product by whenever is a Hilbert space. The topological dual of is defined as the set of all linear continuous mappings . Endowed with the usual dual norm , it is a Banach space. For , the natural pairing between and is , with the duality bracket. If then can be identified with a column vector through the equality .

Let be an open subset of , and let be another Banach space. The Fréchet derivative of a map at a point is written as . When it is applied to a vector , it is denoted by or . When , we may also write .

We denote by (resp. ) the usual Sobolev space of elements of , with (resp., with ) having a weak derivative in . For we denote by (resp., by ) the space of all (resp., ) such that .

For every , a mapping is called a diffeomorphism if it is a bijective mapping of class with an inverse of class . The space of all such diffeomorphisms is denoted by . Note that is the space of all homeomorphisms of .

For every mapping of class with compact support, we define the usual semi-norm

We define the Banach space (endowed with the norm ) as the completion of the space of vector fields of class with compact support on with respect to the norm . In other words, is the space of vector fields of class on whose derivatives of order less than or equal to converge to zero at infinity.

We define as the set of all diffeomorphisms of class that converge to identity at infinity. Clearly, is the set of all such that . It is a group for the composition law .

Note that, if , then is an open subset of the affine Banach space . This allows one to develop a differential calculus on .

2.1 Preliminaries: deformations and RKHS of vector fields

Our approach to shape analysis is based on optimizing evolving deformations. A deformation is a one-parameter family of flows in generated by time-dependent vector fields on . Let us define this concept more rigorously.

Diffeomorphic deformations.

Let . Let

be a time-dependent vector field such that the real-valued function is integrable. In other words, we consider an element of the space .

According to the Cauchy-Lipshitz theorem, generates a (unique) flow (see, e.g., [1] or [27, Chapter 11]), that is a one-parameter family of diffeomorphisms such that

for almost every and every . In other words, considering as a curve in the space , the flow is the unique solution of

(1)

Such a flow is called a deformation of of class .

Proposition 1.

The set of deformations of of class coincides with the set

In other words, the deformations of of class are exactly the curves on that are integrable on as well as their derivative, such that .

Proof.

Let us first prove that there exists a sequence of positive real numbers such that for every deformation of of class , with , induced by the time-dependent vector field , one has

(2)

for every .

The case is an immediate consequence of the integral formulation of (1). Combining the formula for computing derivatives of a composition of mappings with an induction argument shows that the derivatives of order of are polynomials in the derivatives of and of order less than or equal to . Moreover, these polynomials are of degree one with respect to the derivatives of , and also of degree one with respect to the derivatives of of order . Therefore we can write

(3)

where is a polynomial independent of and , and the norms of the derivatives of the are computed in the space of -valued multilinear maps. The result then follows from Gronwall estimates and from an induction argument on .

That any deformation of of class is a curve of class in is then a direct consequence of (2) and (3).

Conversely, for every curve on of class , we set for every . We have for almost every , and hence it suffices to prove that is integrable. The curve is continuous on and therefore is bounded. This implies that is bounded as well. The formula for computing derivatives of compositions of maps then shows that is integrable whenever is integrable, which completes the proof since is of class . ∎

Reproducing Kernel Hilbert Spaces of vector fields.

Let us briefly recall the definition and a few properties of RKHS’s (see [5, 32] for more details). Let be fixed.

Given a Hilbert space , according to the Riesz representation theorem, the mapping is a bijective isometry between and , whose inverse is denoted by . Then for every and every one has and .

Definition 1.

A Reproducing Kernel Vector Space (RKHS) of vector fields of class is a Hilbert space of vector fields on such that with continuous inclusion.

Let be an RKHS of vector fields of class . Then, for all , by definition the linear form on , defined by for every , is continuous (actually this continuity property holds as well for every compactly supported vector-valued distribution of order at most on ). By definition of , there holds . The reproducing kernel of is then the mapping defined on , with values in the set of real square matrices of size , defined by

(4)

for all . In other words, there holds , for all and every , and is a vector field of class in , element of .

It is easy to see that , for all , and hence that and that is positive semi-definite under the assumption that no nontrivial linear combination , with given distinct ’s can vanish for every . Finally, writing , we have

(5)

for every compactly supported vector-valued distribution on of order less than or equal to .555Indeed, it suffices to note that

As explained in [5, 34], one of the interests of such a structure is that we can define the kernel itself instead of defining the space . Indeed a given kernel yields a unique associated RKHS. It is usual to consider kernels of the form with . Such a kernel yields a metric that is invariant under rotation and translation. The most common model is when is a Gaussian function but other families of kernels can be used as well [31, 34].

2.2 From shape space problems to optimal control

We define a shape space in as an open subset of a Banach space on which the group of diffeomorphisms of acts in a certain way. The elements of , called states of the shape, are denoted by . They are usually subsets or immersed submanifolds of , with a typical definition of the shape space as the set of all embeddings of class of a given Riemannian manifold into . For example, if is the unit circle then is the set of all parametrized simple closed curves in . In practical applications or in numerical implementations, one has to consider finite-dimensional approximations, so that usually just consists of a finite set of points, and then is a space of landmarks (see [31, 34] and see examples further).

Let us first explain how the group of diffeomorphisms acts on the shape space , and then in which sense this action induces a control system on .

The group structure of .

Let . The set is an open subspace of the affine Banach space and also a group for the composition law. However, we can be more precise.

First of all, the mappings and are continuous (this follows from the formula for the computation of the derivatives of compositions of mappings).

Moreover, for every , the right-multiplication mapping is Lipschitz and of class , as the restriction of the continuous affine map . Its derivative at is then given by . Moreover, is easily seen to be continuous.

Finally, the mapping

is of class . Indeed we have , for every . Then, using the uniform continuity of any derivative of order , it follows that the mapping is continuous. These properties are useful for the study of the Fréchet Lie group structure of [25].

Group action on the shape space.

In the sequel, we fix , and we assume that the space acts continuously on (recall that is an open subset of a Banach space ) according to a mapping

(6)

such that and for every and all .

Definition 2.

is a shape space of order if the action (6) is compatible with the properties of the group structure of described above, that is:

  • For every fixed, the mapping is Lipschitz with respect to the (weaker when ) norm , i.e., there exists such that

    (7)

    for all .

  • The mapping is differentiable at . This differential is denoted by and is called the infinitesimal action of . From (7) one has

    for every , and if then has a unique continuous extension to the whole space .

  • The mapping

    (8)

    is continuous, and its restriction to is of class . In particular the mapping is of class , for every bounded vector field of class .

Example 1.

For , the action of on itself by left composition makes it a shape space of order in .

Example 2.

Let and let be a smooth compact Riemannian manifold. Consider the space equipped with its usual Banach norm. Then is a shape space of order , where the action of is given by the composition . Indeed, it is continuous thanks to the rule for computing derivatives of a composition, and we also have

Moreover, given and , is the vector field along given by . Finally, the formula for computing derivatives of a composition yields

for every , and the last part of the definition follows. This framework describes most of shape spaces.

An interesting particular case of this general example is when is a finite set (zero-dimensional manifold), and

is a (so-called) space of landmarks in . For , the smooth action of order is . For , the infinitesimal action of at is given by .

Remark 1.

In most cases, and in all examples given throughout this paper, the mapping restricted to is of class , for every .

Proposition 2.

For every , the mapping is of class , and its differential at is given by . In particular, given and given a deformation of of class , which is the flow of the time-dependent vector field , the curve is of class and one has

(9)

for almost every .

Proof.

Let , fix and take . Then for small enough. We define . We have

and therefore the mapping is differentiable at , with continuous differential . The result follows. ∎

The result of this proposition shows that the shape is evolving in time according to the differential equation (9), where is the time-dependent vector field associated with the deformation .

At this step we make a crucial connection between shape space analysis and control theory, by adopting another point of view. The differential equation (9) can be seen as a control system on , where the time-dependent vector field is seen as a control. In conclusion, the group of diffeomorphisms acts on the shape space , and this action induces a control system on .

As said in the introduction, in shape analysis problems, the shapes are usually assumed to evolve in time according to the minimization of some objective functional [32]. With the control theory viewpoint developed above, this leads us to model the shape evolution as an optimal control problem settled on , that we define hereafter.

Induced optimal control problem on the shape space.

We assume that the action of on is smooth of order . Let be an RKHS of vector fields of class on . Let denote its reproducing kernel (as defined in Section 2.1). Let be another Banach space. Most problems of shape analysis can be recast as follows.

Problem 1.

Let , and let be a mapping such that is linear for every . Let be a function. We consider the problem of minimizing the functional

(10)

over all such that and for almost every .

In the problem above, stands for an initial shape, and stands for continuous constraints. Recall that the infinitesimal action can be extended to the whole space .

Note that if is square-integrable then is square-integrable as well. Indeed this follows from the differential equation and from Gronwall estimates. Therefore the minimization runs over the set of all .

Problem 1 is an infinite-dimensional optimal control problem settled on , where the state is a shape and the control is a time-dependent vector field. The constraints can be of different kinds, as illustrated further. A particular but important case of constraints consists of kinetic constraints, i.e., constraints on the speed of the state, which are of the form . Pure state constraints, of the form with a differentiable map , are in particular equivalent to the kinetic constraints .

To the best of our knowledge, except very few studies (such as [35]), only unconstrained problems have been studied so far (i.e., with ). In contrast, the framework that we provide here is very general and permits to model and solve far more general constrained shape deformation problems.

Remark 2.

Assume is an RKHS of class , and let . Then induces a unique deformation on , and the curve satisfies and for almost every . As above, it follows from the Gronwall lemma that . Moreover, according to the Cauchy-Lipshitz theorem, if then is the unique such element of . Therefore, if then Problem 1 is equivalent to the problem of minimizing the functional over all such that for almost every .

Concerning the existence of an optimal solution of Problem 1, we need the following definition.

Definition 3.

A state of a shape space of order is said to have compact support if for some compact subset of , for some and for all , we have

where denotes the restriction of to .

Except for itself, every state of every shape space given so far in examples had compact support.

Theorem 1.

Assume that is an RKHS of vector fields of class on , that is continuous, and that is bounded below and lower semi-continuous. If has compact support, then Problem 1 has at least one solution.

In practice one does not usually have available a convenient, functional definition of the space of vector fields. The RKHS is in general only known through its kernel , as already mentioned in Section 2.1 (and the kernel is often a Gaussian one). Hence Problem 1, formulated as such, is not easily tractable since one might not have a good knowledge (say, a parametrization) of the space .

One can however derive, under a slight additional assumption, a different formulation of Problem 1 that may be more convenient and appropriate in view of practical issues. This is done in the next section, in which our aim is to obtain an optimal control problem only depending on the knowledge of the reproducing kernel of the space (and not directly on itself), the solutions of which can be lifted back to the group of diffeomorphisms.

Kernel formulation of the optimal control problem.

For a given , consider the transpose of the continuous linear mapping . This means that for every the element (sometimes called pullback) is defined by , for every . Besides, by definition of , there holds . The mapping is often called the momentum map in control theory [21].

We start our discussion with the following remark. As seen in Example 2, we observe that, in general, given the mapping is far from being injective (i.e., one-to-one). Its null space can indeed be quite large, with many possible time-dependent vector fields generating the same solution of and for almost every .

A usual way to address this overdetermination consists of selecting, at every time , a that has minimal norm subject to (resulting in a least-squares problem). This is the object of the following lemma.

Lemma 1.

Let . Assume that is closed. Then, for every there exists such that . Moreover, the element is the one with minimal norm over all elements such that .

Proof.

Let denote the orthogonal projection of 0 on the space , i.e., the element of with minimal norm. Then is characterized by and . Using the Banach closed-range theorem, we have , so that there exists such that , and hence . ∎

Remark 3.

Note that the latter assertion in the proof does not require to be closed, since we always have .

Whether is closed or not, this lemma and the previous discussion suggest replacing the control in Problem 1 by such that . Plugging this expression into the system leads to the new control system , where

(11)

for every . The operator is continuous and symmetric (i.e., for all ), satisfies for every and thus is positive semi-definite, and is as regular as . Note that whenever is closed.

This change of variable appears to be particularly relevant since the operator is usually easy to compute from the reproducing kernel of , as shown in the following examples.

Example 3.

Let be the set of continuous mappings from a Riemannian manifold to . The action of is smooth of order , with (see Example 2). Let be an RKHS of vector fields of class on , with reproducing kernel . Every can be identified with a vector-valued Radon measure on . Then

for every and for every . In other words, one has , and therefore, by definition of the kernel, we have . We finally infer that

Example 4.

Let and (as in Example 2). Then , and every is identified with a vector of by . Therefore, we get