A Subgradient Method for Free Material Design
Abstract
A small improvement in the structure of a material can save the manufacturer a great deal of money. Free material design can be formulated as an optimization problem. However, due to its large scale, second-order methods cannot solve the free material design problem of reasonable size. We formulate the free material optimization (FMO) problem in a saddle-point form in which the inverse of the stiffness matrix in the constraint is eliminated. The size of is generally large, denoted as . This is the first formulation of FMO without . We apply the primal-dual subgradient method [17] to solve the restricted saddle-point formulation. This is the first gradient-type method for FMO. Each iteration of our algorithm takes a total of floating-point operations and an auxiliary vector storage of size , compared with formulations involving the inverse of , which require arithmetic operations and an auxiliary vector storage of size . To solve the problem, we develop a closed-form solution to a semidefinite least squares problem and an efficient parameter update scheme for the gradient method, which are included in the appendix. We also approximate a solution to the bounded Lagrangian dual problem. The problem is decomposed into small problems, each having only one unknown ( or ) matrix, and these can be solved in parallel. The iteration bound of our algorithm is optimal for general subgradient schemes. Finally, we present promising numerical results.
keywords fast gradient method, Nesterov's primal-dual subgradient method, free material optimization, large-scale problems, first-order method, saddle-point, Lagrangian, complexity, duality, constrained least squares.
AMS 90C90, 90C06, 90C25, 90C30, 90C47, 90-08
1 Introduction
The approach of Free Material Optimization (FMO) optimizes the material structure while the distribution of material and the material itself can be freely varied. FMO has been used to improve the overall material arrangement in airframe design (www.platon.org). The fundamentals of FMO were introduced in [3, 19], and the model was further developed in [2, 24], among others. In the model, the elastic body of the material under consideration is represented as a bounded domain with a Lipschitzian boundary in a two- or three-dimensional Euclidean space, depending on the design requirement. For computational purposes, the domain is discretized into finite elements, so that all the points in the same element are considered to have the same property.
Let denote the displacement vector of the body at point under load. Denote the (small)strain tensor as:
Let () denote the stress tensor. The system is assumed to follow Hooke's law—the stress is a linear function of the strain:
where is the (plane-stress) elasticity tensor of order , which maps the strain to the stress tensor. The matrix measures the degree of deformation of a material under external loads and is a symmetric positive semidefinite matrix of order for the two-dimensional and of order for the three-dimensional material design problem. The diagonal elements of measure the stiffness of the material at in the coordinate directions. Hence the trace of is used to measure the cost (resource used) of a material in the model.
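As a standard illustration (the symbols below are generic and not tied to the particular discretization used later), the plane-stress form of Hooke's law writes the stress as a linear image of the strain under the symmetric positive semidefinite elasticity matrix:

```latex
\sigma = E\,\varepsilon, \qquad
\begin{pmatrix}\sigma_{11}\\ \sigma_{22}\\ \sigma_{12}\end{pmatrix}
=
\begin{pmatrix}
e_{11} & e_{12} & e_{13}\\
e_{12} & e_{22} & e_{23}\\
e_{13} & e_{23} & e_{33}
\end{pmatrix}
\begin{pmatrix}\varepsilon_{11}\\ \varepsilon_{22}\\ 2\varepsilon_{12}\end{pmatrix},
\qquad E \succeq 0,
```

with $\operatorname{tr}(E) = e_{11} + e_{22} + e_{33}$ serving as the cost measure described above.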
Denote by the identity matrix of order and the direct product of cones of symmetric positive semidefinite matrices:
For a symmetric matrix , let denote .
Let denote the elasticity tensor of order for the th element : The ’s are considered to be constant on each but can be different for different ’s and are the design variables of the FMO model:
The design problem is to find a structure that has low "cost" (the tensor having small trace) and is stable under given multiple independent loads (forces). There are several different formulations of the FMO problem depending on the design needs. This paper focuses on the minimum-cost FMO problem, which is to design a material structure that can withstand a whole given set of loads in the worst-case scenario while the trace of is minimal. Below we describe the model based on [11].
The "cost" (stiffness of the material) is measured by the trace of : . For each , is bounded below to avoid singularity in the FMO model. The constraints for the pointwise stiffness upper and lower bounds are:
According to the engineering literature, the dynamic stiffness of a structure can be improved by raising its fundamental eigenfrequency. Thus we impose a lower bound on its eigenvalues:
Let be the number of nodes (vertices of the elements). Let denote the number of Gauss integration points in each element. In every element, the displacement vector is approximated as a continuous function which is linear in every coordinate:
where is the value of at the th node, and is the basis function associated with the th node. For , define matrices
For , let be the block matrix whose th block is evaluated at the th integration point and zero otherwise. The full dimension of is for the two-dimensional case and for the three-dimensional case.
Let denote the stiffness matrix relating the forces to the displacements; let denote the element stiffness matrices:
Since the material obeys Hooke’s law, forces (loads) on each element, denoted as , are linearly related to the displacement vector:
(1) 
The system is in equilibrium for if outer and inner forces balance each other. The equilibrium is measured by the compliances of the system: the smaller the compliance, the more rigid the structure with respect to the loads. The compliance can be represented as:
In the minimum-cost FMO model, an upper bound is imposed on the compliances. Further, in view of equation (1), we have
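In generic notation (the symbols $A(E)$ for the global stiffness matrix, $u_j$ and $f_j$ for the displacement and load of the $j$-th load case, and $\gamma$ for the compliance bound are illustrative assumptions, not the paper's fixed notation), the compliance constraints obtained this way read:

```latex
A(E)\,u_j = f_j
\quad\Longrightarrow\quad
c_j(E) \;=\; f_j^{\top} u_j \;=\; f_j^{\top} A(E)^{-1} f_j \;\le\; \gamma,
\qquad j = 1, \dots, L,
```

and it is exactly the inverse $A(E)^{-1}$ in this form that the saddle-point reformulation avoids computing.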
In summary, with given loads , imposed upper and lower bounds and , , and compliance upper bound , the minimumcost multipleload material design problem is the following:
(2) 
Some optimization approaches have been applied to FMO; for instance, Zowe et al. [24] formulate the multiple-load FMO as a convex program. They propose penalty/barrier multiplier methods and interior-point methods for the problem. Ben-Tal et al. [2] consider the bounded-trace minimum compliance multiple-load FMO problem. They formulate the problem as a semidefinite program and solve it by an interior-point method. Stingl et al. [21] solve the minimum compliance multiple-load FMO problem by a sequential semidefinite programming algorithm. Weldeyesus and Stolpe [22] apply a primal-dual interior-point method to several equivalent FMO formulations. Stingl et al. [20] study the minimum compliance single-load FMO problem with a vibration constraint and propose an approach based on a nonlinear semidefinite low-rank approximation of the semidefinite dual. Haslinger et al. [8] extend the original problem statement by a class of generic constraints. Czarnecki and Lewiński [5] deal with minimization of the weighted sum of compliances related to non-simultaneously applied load cases. All of these are second-order methods. To our knowledge, no first-order methods have been employed for FMO.
Second-order methods exploit the information of Hessians in addition to gradients and function values. Thus, compared with first-order methods, second-order methods generally converge faster and are more accurate; on the other hand, first-order methods do not require forming, storing, or inverting Hessians and thus can be applied to large-scale problems. For certain structured problems with bounded simple feasible sets, Nesterov [13] showed that the complexity of fast gradient methods is one order of magnitude lower than the theoretical lower complexity bound of gradient-type methods for the black-box oracle model. Following that work, many papers on fast gradient-type methods have appeared, such as [12, 17, 16, 18, 14, 6, 4, 15, 23].
However, not every real-world problem is suitable for second-order methods or fast gradient-type methods; for instance, when the structure of the problem is too complex to apply the interior-point method or the smoothing technique. The minimum-weight FMO model (2) is such a case. For this model, although the matrices are sparse, is generally not. The number is at least in the thousands, and is smaller than only by a constant factor. To roughly measure the amount of work per iteration, we use flops, i.e., floating-point operations such as arithmetic operations, comparisons, and exchanges. It takes a vector of length to store the matrix or its Cholesky factor in memory, and about flops to evaluate . Hence, it is difficult to handle model (2) of reasonable size by second-order methods, since second-order methods work on the Hessian of the problem, whose size is at least the square of the number of variables; and the variables of model (2) are matrices of size . In addition, the constraints of model (2) are not simple, which prevents us from applying the usual gradient-projection type methods, because it is not easy to project onto the feasible set.
In this paper, we reformulate model (2) into a saddle-point problem and apply the primal-dual subgradient method [17] to it. The advantage of our formulation is that neither the inverse nor the Cholesky factorization of needs to be calculated, thus reducing the computational cost of each iteration to just .
The traditional subgradient method for minimizing a nonsmooth convex function over the Euclidean space employs a pre-chosen sequence of step sizes satisfying the divergent-series rule:
The iterates are generated as follows:
In the traditional subgradient method, new subgradients enter the model with decreasing weights, which contradicts the general principle of iterative schemes: new information is more important than old information. But the vanishing step sizes are necessary for the convergence of the iterates .
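As a minimal sketch (illustrative only, not the FMO algorithm of this paper), the traditional scheme with the divergent-series rule $h_k = h_0/\sqrt{k+1}$ applied to a simple nonsmooth function looks as follows:

```python
import numpy as np

def subgradient_method(f, subgrad, x0, num_iters=2000, h0=1.0):
    """Traditional subgradient scheme x_{k+1} = x_k - h_k g_k with the
    divergent-series rule h_k = h0 / sqrt(k + 1): the steps vanish while
    their sum diverges, so convergence is forced, but new subgradients
    receive smaller and smaller weights."""
    x = np.asarray(x0, dtype=float)
    x_best, f_best = x.copy(), f(x)
    for k in range(num_iters):
        x = x - (h0 / np.sqrt(k + 1)) * subgrad(x)
        fx = f(x)
        if fx < f_best:                      # iterates are not monotone in f,
            x_best, f_best = x.copy(), fx    # so record the best point seen
    return x_best, f_best

# Example: minimize the nonsmooth f(x) = ||x||_1; a subgradient is sign(x).
x_best, f_best = subgradient_method(
    f=lambda x: np.abs(x).sum(), subgrad=np.sign, x0=[3.0, -2.0])
```

Note the need to track the best iterate: unlike gradient descent on a smooth function, the objective values along a subgradient trajectory oscillate.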
The primal-dual subgradient method [17] associates the primal minimization sequence with a master process in the dual space; it does not have the drawback of diminishing step sizes in the dual space, and the method is proven to be optimal for saddle-point problems, nonsmooth convex minimization, minimax problems, variational inequalities, and stochastic optimization. Let be a finite-dimensional real vector space equipped with a norm . Let be its dual. Let be a closed convex set. Let be a prox-function of with convexity parameter : , :
Let be a function mapping to . For instance, for a convex minimization problem, the function can be a subgradient of the objective function. The generic scheme of dual averaging (DA-scheme) [17] works as follows:
Initialization: Set . Choose .
Iteration ():
1. Compute .
2. Choose . Set .
3. Choose . Set .
Output: Let
The scheme has two main variants: simple averages, where and with a constant , and weighted averages, where and with a constant .
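A minimal sketch of the simple-averages variant on a toy nonsmooth problem (the prox-function $d(x) = \|x - x_0\|^2/2$, the ball constraint, and the parameter choice $\beta_k = \gamma\sqrt{k+1}$ are illustrative assumptions in the spirit of [17], not the FMO instantiation of Section 4):

```python
import numpy as np

def dual_averaging(subgrad, x0, num_iters=3000, gamma=1.0, radius=10.0):
    """Method of simple dual averages: accumulate subgradients in the
    dual vector z with unit weights (lambda_k = 1), and map back to the
    primal ball {x : ||x - x0|| <= radius} through the prox-function
    d(x) = ||x - x0||^2 / 2 with beta_k = gamma * sqrt(k + 1)."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    z = np.zeros_like(x0)
    x_sum = np.zeros_like(x0)
    for k in range(num_iters):
        z += subgrad(x)                      # z_k = z_{k-1} + g(x_k)
        beta = gamma * np.sqrt(k + 1)
        x = x0 - z / beta                    # argmin_x <z, x> + beta * d(x)
        step = x - x0
        nrm = np.linalg.norm(step)
        if nrm > radius:                     # keep the iterate in the ball
            x = x0 + (radius / nrm) * step
        x_sum += x
    return x_sum / num_iters                 # averaged primal sequence

# Example: minimize the nonsmooth f(x) = ||x||_1; a subgradient is sign(x).
x_avg = dual_averaging(np.sign, x0=[3.0, -2.0])
```

In contrast to the traditional scheme above, every subgradient enters the dual accumulator z with the same weight; only the scaling beta_k by which z is mapped back to the primal space grows.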
There are some other gradient methods for saddle-point problems. In [4], Chambolle and Pock study a first-order primal-dual algorithm for a class of saddle-point problems in two finite-dimensional real vector spaces and :
where is a linear operator, and and are proper convex, lower semicontinuous functions. That algorithm, as well as the classical Arrow-Hurwicz method [1] and its variants for saddle-point problems, is not applicable to our FMO formulation, because in our formulation the coupling function between the two spaces is nonlinear. Nemirovski's prox-method [12] reduces the problem of approximating a saddle-point of a function to that of solving the associated variational inequality. This approach is also not applicable to our FMO formulation, because the structure of our formulation is not simple enough and its objective function is not in .
In our approach, the inverse of in model (2) does not need to be calculated, which decreases the computational cost per iteration by one order of magnitude. Solutions of the primal and dual subproblems at each iteration can be written in closed form. Each iteration takes roughly flops, and the auxiliary storage space is linear in . Furthermore, the primal subproblem decouples into small problems that can be solved in parallel, each in approximately flops. Thus, it is possible to work on large-scale problems, in contrast to second-order methods dealing with the Hessian of or variables plus additional constraints on the matrices. To establish the efficiency of the algorithm, we give iteration complexity bounds for both the simple dual averaging and the weighted dual averaging schemes. The complexity bounds are optimal for general subgradient methods. Numerical experiments are described at the end of the paper.
The remainder of the paper is organized as follows. In Section 2, we give our saddle-point form of the problem. In Section 3, we show that a solution to our bounded Lagrangian form either solves the original problem or gives an approximate solution of the original problem. In Section 4, we present our algorithm. In Section 5, we give closed-form solutions to the subproblems. In Section 6, we derive complexity bounds of our algorithm. In Section 7, we describe and analyze a penalized Lagrangian approach. In Section 8, we present some computational examples of our algorithm. In the Appendices, we give a closed-form solution of a related matrix projection problem and an update scheme for the parameters of the algorithm.
2 Saddle-Point Formulation
We first rewrite problem (2) in a saddlepoint form. Denote
(3) 
The second group of constraints in (2) can be represented in max form:
Assume that problem (2) satisfies some constraint qualification, such as the Slater condition: there exists such that for . Then a Lagrange multiplier exists, and we can solve the Lagrangian of problem (2) instead. Thus, problem (2) can be written as follows:
The dimension of the matrix is large; the first transformation eliminates the need to calculate its inverse, but results in an objective function that is not concave in and . The second transformation makes the function concave in and . In the last step, the variable is eliminated to simplify the formulation.
Denote
Thus, to solve problem (2), we only need to solve
(4) 
Note that is convex in and concave in .
3 Bounded Lagrangian
We apply the primal-dual subgradient method [17] to the saddle-point formulation (4). The convergence of the algorithm requires the iterates to be uniformly bounded [17]. We therefore impose a bound on :
(5) 
Next we show that the primal solution of the saddle-point problem (5) is either a solution to the original problem (2) or an approximate solution in the sense that its constraint violation is bounded by and its objective value is no larger than the optimal value of (2).
Let be a solution to the saddle-point problem (4); then for any , is also a solution. We can choose small enough, for instance , so that is also a solution to the bounded saddle-point form (5).
For any , denote the index set of its violated constraints as
Denote . We have the following results regarding our material design problem.
Lemma 1
Proof: Item 1 is obvious as the constraints are nonbinding.
Next, we prove item 2.
Since
we also have
Therefore,
Note that as , the set of saddlepoints of (5) approaches that of the original problem.
4 The Algorithm
In this section, we describe how to apply the primal-dual subgradient method [17] to the saddle-point reformulation of model (2). We have developed a parameter update scheme for the algorithm, which is included in the Appendix.
For a matrix , let vector denote the eigenvalues of ; let be the smallest eigenvalue of . The subgradients of at are: for , ,
For the primal space, we choose the standard Frobenius norm:
For the dual space, we choose the standard Euclidean norm:
Their dual norms are denoted as: , .
The set for is defined in (3); and the set for is
Note that is nonsmooth. The primal-dual subgradient method [17] for the saddle-point problem (5) works as follows.
Initialization: Set , . Choose , .
Iteration:
1. Compute , , for .
2. Choose , set
3. Choose , set
Output: .
Details of a parameter update scheme for are given in the Appendix.
We take
Based on different choices of , there are two variants of the algorithm:

Method of Simple Dual Averages. We let
Method of Weighted Dual Averages. We let
5 Solution to the Subproblem
In this section, we give closed-form solutions to the subproblems at each iteration of our algorithm.
Solution of . The closed-form solution for in Step 3 of the algorithm is derived as follows.
By the Cauchy-Schwarz-Bunyakovsky inequality, for :
with equality if and only if for some . Therefore,
(7) 
Solution of . For a set , let denote its cardinality, i.e., the number of elements in . In Step 3 of the algorithm, can be seen as the projection
By Theorem 4 in the Appendix (Matrix Projection), we can represent as follows.
For each , let be the eigenvalue decomposition of , and be its eigenvalues. Define the sets
Then
(8) 
where is determined according to the following three cases.

Case 1: . Let
Case 2: . Then there is a partition :
Let
Case 3: . Then there is a partition :
Let
The eigenvalues in case 2 can be obtained by the following algorithm:
Algorithm projSyml
Step 1 (Initialization): Let be the negative eigenvalues of . Let
Step 2: While , do
Step 3: Let
Similarly, the eigenvalues in case 3 can be obtained by the following algorithm:
Algorithm projSymg
Step 1 (Initialization): Let be the positive eigenvalues of .
If , let
If , let
Step 2: While , do
Step 3: Let
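The simplest instance of such spectral projections, in which the eigenvalues are only clipped into a box and no trace-induced shift is needed, can be sketched as follows (an illustration of case 1 only: the trace-constrained cases 2 and 3 additionally shift subsets of eigenvalues, as algorithms projSyml and projSymg do):

```python
import numpy as np

def project_eigenvalue_box(S, lo=0.0, hi=np.inf):
    """Frobenius-norm projection of a symmetric matrix S onto
    {X = X^T : lo * I <= X <= hi * I}: keep the eigenvectors of S
    and clip each eigenvalue into the interval [lo, hi]."""
    w, V = np.linalg.eigh(S)                 # S = V diag(w) V^T
    return (V * np.clip(w, lo, hi)) @ V.T    # V diag(clip(w)) V^T

S = np.array([[1.0, 2.0],
              [2.0, -3.0]])
X = project_eigenvalue_box(S, lo=0.0)        # nearest PSD matrix to S
```

Since the feasible set is invariant under the orthogonal conjugation that diagonalizes S, the matrix problem reduces to a projection of the eigenvalue vector; this is the structure that makes the per-element subproblems cheap and parallelizable.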
6 Complexity of the Algorithm
To understand the complexity of the algorithm for model (2), in this section we study the duality gap and the computational cost of each iteration. By [17], it takes iterations to solve a general convex-concave saddle-point problem to absolute accuracy , which matches the exact lower complexity bound for this class of schemes. To give insight into how the data of the FMO model, such as , , and , affect the convergence time, we give upper bounds on the duality gap of the iterates generated by our algorithm in terms of the number of iterations and the input data in Section 6.1. In Section 6.2, we derive the computational cost per iteration. From the duality gap and the cost per iteration given in this section, we can estimate from the given data how much computational effort is needed at most to approximate a solution of an instance of model (2) by the proposed method.
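In symbols, the iteration bound takes the form familiar for subgradient schemes (with $\epsilon$ denoting the absolute accuracy and the hidden constant depending on the diameters of the feasible sets and the subgradient norms):

```latex
N(\epsilon) \;=\; O\!\left(\frac{1}{\epsilon^{2}}\right),
```

which matches the lower complexity bound for black-box nonsmooth convex optimization mentioned above.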
6.1 Iteration Bounds
By [7, Chapter 6, Proposition 2.1], for a function , assume that
(i) the sets and are convex, closed, nonempty, and bounded;
(ii) for any fixed , is concave and upper semicontinuous;
(iii) for any fixed , is convex and lower semicontinuous.
Then the function has at least one saddle-point.
Since and are bounded, and is continuous and finite, by the above result we conclude that has a saddle-point and a finite saddle-value. An upper bound on the duality gap is given in [17, Theorem 6]. We next express the duality gap in terms of the input data.
Define