Filter–Trust-Region Method for contact problems

A globally convergent filter–trust-region method for large deformation contact problems


We present a globally convergent method for the solution of frictionless large deformation contact problems for hyperelastic materials. The discretisation uses the mortar method which is known to be more stable than node-to-segment approaches. The resulting non-convex constrained minimisation problems are solved using a filter–trust-region scheme, and we prove global convergence towards first-order optimal points. The constrained Newton problems are solved robustly and efficiently using a Truncated Non-smooth Newton Multigrid (TNNMG) method with a Monotone Multigrid (MMG) linear correction step. For this we introduce a cheap basis transformation that decouples the contact constraints. Numerical experiments confirm the stability and efficiency of our approach.

This work has been done within the DFG Matheon project CH1 funded by ECMath

1. Introduction

Although large deformation contact problems arise in many important applications, only very few methods today can solve them fast and robustly. All of these methods have their advantages and disadvantages.

Discretisation of such problems leads to constrained non-convex minimisation problems. The prevailing methods for these problems are primal–dual active set strategies [10, 11, 17] and penalty methods [18]. For both methods only local convergence can be expected [12]. Furthermore, the resulting linearised Newton problems can be indefinite due to the non-convexity of the strain energy.

In this work we construct a filter–trust-region method [6] for the constrained non-convex minimisation problem. The filter technique ensures asymptotic fulfilment of the non-linear non-penetration constraints by rejecting iterates that are neither improving the energy nor the infeasibility compared to all previous iterates. The trust-region method provides a natural way to handle indefiniteness of the linearised problems. We show that the modifications we need to make to the method to apply it to contact problems stay within the realm of the general filter–trust-region convergence theory, and hence we obtain global convergence of the method to first-order stationary points.

A priori, the Newton problems of a filter–trust-region method are quadratic minimisation problems with convex inequality constraints. Such problems are generally expensive to solve. We extend an efficient multigrid strategy originally introduced for contact problems in small strain elasticity [13] to the case of large strains. This requires rewriting the inequality constraints as sets of bound constraints. The inequality constraints consist of two parts: the trust-region constraint and the linearised contact condition. We define the trust-region in terms of the max-norm. With this choice, the trust-region constraints form a set of bound constraints by construction. To decouple the contact constraints we extend the technique used in [13] to the finite strain case. The idea there is to construct a basis transformation that replaces the nodal basis at the contact boundaries by a system of relative movements. The construction of this transformation requires the solution of a linear system involving a contact surface mass matrix (the non-mortar matrix) during each Newton-type iteration. The additional computational cost of this is negligible because the size of the mass matrix is much smaller than the overall problem size, and grows with a lower order. The transformation also leads to a slight modification of the Newton matrix, but we show that this modification does not influence the convergence behaviour of the overall method.

In previous work, the Truncated Non-smooth Newton Multigrid (TNNMG) method has been shown to be very fast and effective for quadratic minimisation problems with bound constraints such as small-strain contact problems and obstacle problems [8, 9]. Since we have found a way to uncouple the contact constraints for finite-strain contact problems, we can also harness the performance of TNNMG for the Newton problems of a finite-strain contact problem. Unfortunately, this only works if the quadratic models are convex. For the non-convex case, we extend the TNNMG method by combining it with a Monotone Multigrid (MMG) method for the linear correction step. The MMG method will handle the trust-region constraints (which have a comparatively simple structure) while the more complicated contact constraints will be left to the TNNMG step. The resulting scheme is globally convergent even for indefinite trust-region problems. At the same time, we observe multigrid-type convergence rates in numerical experiments.

This paper is organised as follows: In Section 2 the static large deformation contact problem is described and its weak formulation is derived. In Section 3 we summarise the mortar discretisation of the problem, which we use because it avoids most instabilities and unphysical oscillations of the node-to-segment approaches [18]. As a stepping stone, we then construct a locally convergent, efficient solver based on sequential quadratic programming in Section 4. We introduce the TNNMG multigrid algorithm and the constraints decoupling strategy needed to solve the quadratic constrained Newton problems. To globalise the local SQP solver, in Section 5 we then describe the filter–trust-region algorithm and the combined TNNMG/MMG scheme for the solution of the linearised problems. We show global convergence of both methods. The final Section 6 is dedicated to a numerical example.

2. Static large deformation contact problems

In this section we will briefly summarise the equations of equilibrium of two non-linear hyperelastic bodies subject to mutual contact. A more detailed introduction can be found, e.g., in [14].

2.1. Strong formulation

Let , denote the disjoint reference configurations of two deformable objects. Assume that the boundaries of the are such that the outer unit normal fields exist everywhere. Let the boundaries be decomposed into disjoint relatively open sets corresponding to Dirichlet, Neumann, and contact boundaries. We assume that has positive -dimensional measure for , and that is compactly embedded in .

In the following, unindexed variables are used to denote quantities defined over both objects. For example denotes the reference configuration of both bodies together. Neglecting the inertia terms, the balance of linear momentum yields the following system of partial differential equations in reference coordinates for the deformation function


Here, is the first Piola–Kirchhoff stress field, and is the set of matrices with positive determinant. The functions and are prescribed external volume and traction force densities, which are assumed to be independent of the deformation. The function specifies the Dirichlet boundary conditions. We will only consider hyperelastic continua, i.e., materials for which there exists a stored energy functional , , that links the stresses to the deformation via


We assume that the hyperelastic energy is penalising any violation of the orientation-preserving condition


in the sense that

As a consequence, we will not explicitly enforce (3) as a hard constraint.

The subsets denote the parts of the boundaries where contact may occur. Contact constraints are naturally formulated on the deformed domain. For let denote the outer unit normal field on the deformed contact boundary . Modelling of non-penetration can be done in several ways, depending on which projection is chosen to identify the contact surfaces with each other. Earlier papers used the closest-point projection from to  [14, 15, 25]. Recently, using the projection along has become more popular [11, 18, 17, 20]. In the following we only consider the closest-point projection approach, but others can be used equally well. The deformed contact boundaries are identified with each other through the projection

The resulting distance function or signed gap function is given by


where we have used the fact that is orthogonal to at . With these definitions, non-penetration of the bodies is enforced by requiring


cf. Figure 1.

Figure 1. Reference and deformed configuration of the two bodies

So far the non-penetration constraint was derived only from a kinematical point of view. To investigate the effect of these constraints on the elastic system we examine the resulting contact forces. Consider the Cauchy stress tensor , which expresses the stress relative to the deformed configuration . The Cauchy boundary traction

then represents the contact forces on . It can be decomposed into normal and tangential parts and with respect to . We consider frictionless contact only so the tangential traction at the contact boundary vanishes. The contact normal stresses fulfil the Karush–Kuhn–Tucker (KKT) conditions

where the first one states that traction is a pressure, the second one is (5), and the last one is the complementary condition [14].

2.2. Weak formulation

The equilibrium configurations of hyperelastic continua are characterised as stationary points of the energy functional

where is the hyperelastic energy density (2), and and are potentials of the external forces. Stable configurations are the local minimisers of this energy [3, Theorem 4.1-2]. Existence of minimisers has been shown for the case of a poly-convex and coercive strain energies [3, Theorem 7.7-1]. The corresponding first-order optimality condition is the weak form of the elasticity problem (2.1)

We now add the contact constraints. In a Sobolev space setting, the non-penetration constraint (5) takes the form


and similarly for the other two KKT conditions. In anticipation of the mortar discretisation we rewrite this condition in a variationally consistent form. Let denote the Sobolev space of -valued weakly differentiable functions fulfilling the Dirichlet boundary conditions in the sense of traces. We assume that the gap function is smooth enough such that

maps every function to a function in . We denote the dual trace space by

and the cones of positive functions and dual functionals by

where denotes the dual paring of and . Now, the resulting weak formulation of the non-penetration constraint (6) is given by

The equivalence of this to (6) is shown in [24]. We denote by

the closed non-convex set of feasible deformations. The weak formulation of the large deformation contact problem now reads:


To our knowledge the question of existence of solutions is still open.

3. Discretisation

In this section we will describe the discretisation of the minimisation problem (7) using first-order Lagrangian finite elements, and mortar elements for the contact constraints. Let be a shape-regular grid of the bodies , and the set of vertices. The space of -valued first-order finite elements is , and for each node the scalar nodal basis function corresponding to is denoted by . We discretise the hyperelasticity problem (7) by replacing the solution space by the finite dimensional subspace .

3.1. Dual mortar discretisation of the contact constraints

We use dual mortar functions [23] to discretise the mortar cone , but Lagrange functions can be used equally well. For a given discrete deformation , let be the grid of the deformed contact boundary obtained by restricting to the reference contact boundary , and then deforming this restriction using . We denote the basis of the Lagrange multiplier space by


The discrete mortar cone is then given by

This leads to the weak non-penetration constraint

which, considering the definition (4) of the gap function , is


As the normal field of a piecewise polynomial surface, is not continuous on . We therefore replace it by a smoothed normal field . Define vertex normals by averaging the adjacent face normals, i.e., for each vertex with neighbouring faces on the contact boundary we set

where is the face normal of at the corner . The discretised normal field is then defined as the finite element function


and we replace in (9) with . This continuous approximation yields a smoother behaviour when sliding occurs compared to using discontinuous element normals, cf. [18]. The resulting discrete non-penetration constraint with




We denote the corresponding discrete feasible set by

Summarising, the discrete problem is given by:


As for the non-discrete case (7), the existence of solutions of (13) appears to be an open question.

3.2. Algebraic contact problem

For the rest of this paper we will denote the -th component of a (block-)vector by , the -th row of a matrix by , and the -th entry of a (block-)matrix by . The algebraic representation of the finite-strain contact problem is derived using the canonical isomorphism (where ) that identifies finite element functions with their coefficient (block-)vectors. The algebraic energy is then given by

with given component-wise by

where denotes the -th Euclidean basis vector. The non-penetration constraint (12) is represented algebraically by a function , with , defined by testing the weak constraint (12) with the mortar basis functions (8)


Details on how to assemble the algebraic constraints can be found in [17]. Summarising, the non-convex algebraic contact problem reads:



4. Inexact SQP multigrid methods for contact problems

In this section we show how (15) can be solved locally using sequential quadratic programming (SQP). We propose a basis transformation that decouples the linearised constraints for each quadratic sub-problem. The transformed problems can then be solved robustly and efficiently using a Truncated Non-smooth Newton Multigrid (TNNMG) method. The transformation involves minor modifications to the tangent stiffness matrices that do not harm the overall convergence properties.

4.1. Sequential Quadratic Programming

Consider the constrained optimisation problem (15). The first-order optimality conditions are given by the following theorem.

Theorem 4.1 ([16], Theorem 12.1).

Let be a local minimiser of (15). If the rows of the active constraint Jacobian, i.e., those rows of for which , are linearly independent, then there exists a Lagrange multiplier such that



The SQP method is derived by applying Newton’s method to the first-order optimality system (16) and eliminating the Lagrange multiplier. In the following let upper indices be the iteration number of the Newton method, and introduce a quadratic model energy by


with symmetric. The SQP constraints are derived by replacing the constraint with its linearisation at


where is the argument of . Then, the Newton problems can be reformulated as quadratic minimisation problems for the correction


Local linear convergence of this scheme can be proven if is a symmetric positive definite approximation of the Hessian of the Lagrangian


see [16, Theorem 18.7].

4.2. Multigrid methods for bound-constrained quadratic minimisation problems

In an SQP method, solving the constraint quadratic problems (QP) is by far the most costly part. We will solve these problems with multigrid efficiency using the Truncated Non-smooth Newton Multigrid (TNNMG) method [9]. To illustrate the method we assume for the rest of this section that the local models are strictly convex.

Consider the quadratic functional (17) with symmetric and positive definite. For simplicity we drop the superscript for this section. Also, for this section only, we assume that the linear constraints decouple into bound constraints. Hence, we want to find the unique minimiser of subject to


where the may be and the may be . One iteration step of TNNMG for this problem can be separated into the following four sub-steps: Let be a given iterate.

1. Projected Gauss-Seidel step
Set ; then for , set


where is the -th Euclidean basis vector.
Denote by the resulting pre-smoothed iterate.

2. Truncated linear correction
To accelerate the convergence, the smoothing is followed by a linear correction step for the defect problem

where the residual is given by

For this step the active components

are truncated [8], i.e., temporarily frozen. This is achieved by multiplying the defect problem with the truncation matrix


The linear truncated defect problem therefore reads


Note that the defect problem (23) is unconstrained. For the approximate solution of this problem on the space spanned by the inactive components one (or a few) geometric or algebraic linear multigrid step(s) is used.

3. Projection
The resulting correction may violate the defect constraints. To ensure feasibility it is projected back onto the defect obstacles in the -sense, i.e., we define by

4. Line search
The projection in Step 3 can lead to an increase of model energy. To ensure monotonicity of the algorithm a line search is performed


This one-dimensional constrained quadratic problem can be solved analytically. As a result we obtain with

Global convergence of the algorithm follows immediately from the convergence of the pre-smoothing Gauss–Seidel step and the monotonicity.

Theorem 4.2 ([8], Theorem. 6.4).

Suppose that is symmetric positive definite, and the constraints have the form (20). Then the TNNMG method converges globally to a minimiser of (17) subject to (20).

4.3. Decoupling the constraints

In this section we will construct a basis transformation of that decouples the linearised contact constraints (18). This generalises an idea from [13], which did the same in the infinitesimal strain framework. We start by considering in more detail: The linearisation

of the -th component of the algebraic contact constraint (14) in the direction of can be divided into three parts


The first part is the linearisation of the nodally averaged normal field (10). In the continuous case this term vanishes due to the colinearity of the normal with the closest point projection , see [14]. The second part is the linearisation of the discretised gap function (11) and the mortar basis function, and the third summand is the linearisation of the deformation dependent integral domain, which we denote by .

Let be the number of vertices on the contact boundary , and as before . In the following we assume for simplicity that the coefficient vectors are ordered such that , where and are the degrees of freedom on the contact boundaries and respectively, and denotes all other degrees of freedom. Then, the algebraic form (25) of the constraint Jacobian can be split into a non-mortar and a mortar part, corresponding to the linearisations with respect to and , respectively

where are sparse block-matrices given by

and denotes a zero matrix. The algebraic linearised constraints (18) then take the form


In our aim to decouple these constraints we first separate the normal from the tangential components. Let be the block-diagonal matrix consisting of Householder transformations such that rotates the first Euclidean basis vector onto the normal at the projected vertex , for all . We use to transform the non-mortar matrix by


In the normal part , the first component of each -block of is collected. Analogously, the tangential components are collected in . The crucial insight of [13] was to see that the contact constraints can be decoupled by inverting . For small-strain contact problems this could be trivially achieved, because the biorthogonality of the dual mortar basis lead to a diagonal matrix . In the finite-strain setting, is sparse but no longer diagonal. We suppose that the matrix remains invertible, for all relevant configurations . For the sake of the argument we use its inverse now and comment later on how to compute it efficiently.

Consider the following deformation-dependent transformation


where the -block-matrices are given component-wise by

The inverse of is sparse and has the form


Lemma 4.3.

In the transformed coordinates

the linearised contact constraints (26) take the form


where is the vector that contains the first of each block of degrees of freedom on .


We omit the dependencies on for simplicity. The linearised constraints (26) transform according to

The first column of this is an -matrix. It can be simplified by noting that for any

The second column vanishes since for ,  we have

where we have used the relationship (27). Therefore, if is a vector such that (18) holds, we obtain that (29), and vice versa. ∎

In transformed coordinates, sub-problem (QP) turns into


with transformed quadratic energy

If the transformed model energy is strictly convex, this quadratic minimisation problem with bound constraints can be solved by the TNNMG method of the previous section. To avoid having to assemble (28) on all grid levels, only the pre-smoothing Gauss–Seidel step is applied to the decoupled formulation (TQP). The truncated defect problem and coarse grid correction are computed in Euclidean coordinates.

4.4. Avoiding the inverse non-mortar matrix

The decoupling strategy of the previous section uses the explicit inverse of the sparse matrix , whose size corresponds to the number of degrees of freedom on the non-mortar contact boundary . While the inverse itself can be computed in reasonable time using a direct sparse solver, it leads to a considerable increase of density of the tangent stiffness matrix compared to the untransformed matrices . This severely slows down the multigrid solver. In the following we show how the matrix inversion and the resulting density increase can be avoided while conserving the convergence of the SQP method and the filter method presented in the next section. To this end, we first consider a lumped approximation of the non-mortar matrix

We then define a new transformation by formula (28), but using the diagonal matrix instead of . Then, we apply this new transformation to the tangent stiffness matrix only, but we keep the exact transformation for the gradient . The resulting approximate SQP problem reads



In other words, we still compute the sub-problem in the transformed coordinates of Section 4.3, but we have replaced the tangent matrix by a sparser approximation. Note that we retain the first-order consistency of the SQP model (QP), because the linear term is still transformed according to the exact mapping . This guarantees the convergence of the SQP method, and of the filter–trust-region method presented in the next section. Further, this transformation can be done without explicitly computing by solving the small linear system

where consists of the first components of the entries of that correspond to degrees of freedom on the non-mortar contact boundary . The transformed gradient can then be directly computed from by multiplication with and resp. , cf. (28). Similarly, the transformation back to Euclidean coordinates

can be computed without the explicit inverse by solving the small linear system

and rotating the block vector

where denotes the block-vector corresponding to the tangential non-mortar degrees of freedom.

Remark 4.4.

Approximating the algebraic problem is often done in large deformation contact problems to simplify the unknown contact forces that show up explicitly in the weak formulation when applying an active-set method [10, 17]. This allows to eliminate the Lagrange multipliers at the cost of losing angular momentum conservation. In contrast, by conserving the first-order consistency of the sub-problems (QP), the approximation of the Hessian suggested here is only affecting the convergence rate of the SQP method and preserves the angular momentum.

5. Globalisation by Filter–Trust-Region Methods

We globalise the SQP method of the previous chapter by extending it to a filter–trust-region method. In contrast to the active-set strategies widely used in contact mechanics [11, 17, 10], this method can be shown to converge globally even for rather general non-convex strain energy functionals.

5.1. Filter–trust-region methods

The SQP method of the previous chapter converges only locally. Furthermore, away from local minimisers of , the exact Hessian (19) does not have to be positive definite. Hence, approximating it by a positive definite matrix may result in poor performance of the SQP method [16]. In the following we will use the popular approximation of (19) by the Hessian of the energy

which avoids the need to compute the Lagrange multipliers during the SQP iteration. To handle the possible unboundedness from below of the local problems (QP) with this definition of , the trust-region globalisation adds a norm constraint on the correction


We choose the infinity norm as then (30) is equivalent to a set of bound constraints, which fits naturally with the non-smooth multigrid solver of Section 4.2.

The constraint is adjusted dynamically according to how well the local model approximates the non-linear functional. We measure the approximation quality by the scalar quantity


where is the solution of the SQP sub-problem (IQP) in transformed coordinates and .

Incorporating (30) into (IQP) yields the constrained quadratic optimisation problems



These problems always have at least one solution, even if is non-convex.

To arrive at a globally convergent scheme one also has to control the possible infeasibility of the intermediate iterates , which results from replacing the non-linear contact constraint from (15) by a linearised one. We measure the infeasibility of an iterate using the non-smooth function

A filter method creates tentative new iterates by solving (TRQP), and accepting or rejecting them based on a set of criteria. In the following we use the abbreviations and to denote the energy and infeasibility of the -th iterate. Let be a potential new iterate, i.e., , with an approximate solution of (TRQP). If and for all previous iterates , then the step can be accepted. If there is a previous iterate , such that

then the candidate should be rejected. The critical question is what to do if

or vice versa, for all previous iterates. To overcome this difficulty Fletcher and Leyffer introduced the notion of a filter [6].

Definition 5.1.

Let . A pair -dominates if

For a fixed constant , a set of tuples is called a filter , if no tuple -dominates any other tuple in (Figure 2).

A filter defines a region of acceptable new iterates.

Definition 5.2.

An iterate is acceptable to the filter , if