An Inner Convex Approximation Algorithm for BMI Optimization and Applications in Control


Quoc Tran Dinh†‡, Wim Michiels∗ and Moritz Diehl†
† Department of Electrical Engineering (ESAT/SCD) and Optimization in Engineering Center (OPTEC), Katholieke Universiteit Leuven, Belgium. Email: {quoc.trandinh, moritz.diehl}@esat.kuleuven.be
∗ Department of Computer Science and Optimization in Engineering Center (OPTEC), KU Leuven, Belgium. Email: wim.michiels@cs.kuleuven.be
‡ Department of Mathematics-Mechanics-Informatics, Hanoi University of Science, Hanoi, Vietnam.
Abstract

In this work, we propose a new local optimization method to solve a class of nonconvex semidefinite programming (SDP) problems. The basic idea is to approximate the feasible set of the nonconvex SDP problem by inner positive semidefinite convex approximations via a parameterization technique. This leads to an iterative procedure to search for a local optimum of the nonconvex problem. The convergence of the algorithm is analyzed under mild assumptions. Applications to static output feedback control are benchmarked, and numerical tests are implemented on data from the COMPleib library.

1 Introduction

We are interested in the following nonconvex semidefinite programming problem:

\[
\min_{x \in \mathbb{R}^n} \; f(x) \quad \text{s.t.} \quad F_i(x) \preceq 0, \;\; i = 1, \dots, m, \quad x \in \Omega, \tag{P}
\]

where $f : \mathbb{R}^n \to \mathbb{R}$ is convex, $\Omega$ is a nonempty, closed convex set in $\mathbb{R}^n$ and $F_i : \mathbb{R}^n \to \mathcal{S}^{p_i}$ ($i = 1, \dots, m$) are nonconvex, smooth matrix-valued mappings. The notation $F \preceq 0$ means that $F$ is a symmetric negative semidefinite matrix. Optimization problems involving matrix-valued mapping inequality constraints have a large number of applications in static output feedback controller design and topology optimization, see, e.g., [4, 10, 13, 18]. In particular, optimization problems with bilinear matrix inequality (BMI) constraints are known to be nonconvex and NP-hard [3]. Many attempts have been made to solve these problems by employing convex semidefinite programming techniques (in particular, optimization with linear matrix inequality (LMI) constraints) [6, 7, 10, 11, 21]. The methods developed in those papers are based on augmented Lagrangian functions, generalized sequential semidefinite programming and alternating directions. Recently, we proposed a new method based on a convex-concave decomposition of the BMI constraints and a linearization technique [20]. That method exploits the convex substructure of the problems. It was shown that it can be applied to solve many problems arising in static output feedback control, including spectral abscissa, $H_2$, $H_\infty$ and mixed $H_2/H_\infty$ synthesis problems.

In this paper, we follow the same line of work as in [2, 15, 20] to develop a new local optimization method for solving the nonconvex semidefinite programming problem (P). The main idea is to approximate the feasible set of the nonconvex problem by a sequence of inner positive semidefinite convex approximation sets. This method can be considered as a generalization of those in [2, 15, 20].

Contribution. The contribution of this paper can be summarized as follows:

  • We generalize the inner convex approximation method in [2, 15] from scalar optimization to nonlinear semidefinite programming. Moreover, the algorithm is modified by using a regularization technique to ensure strict descent. The advantages of this algorithm are that it is very simple to implement using available standard semidefinite programming software tools, and that no globalization strategy such as a line-search procedure is needed.

  • We prove the convergence of the algorithm to a stationary point under mild conditions.

  • We provide two particular ways to form a psd-convex overestimate of bilinear matrix-valued mappings, and then show many applications in static output feedback control.

Outline. The next section recalls some definitions, notation and properties of matrix operators and defines an inner convex approximation of a BMI constraint. Section 3 proposes the main algorithm and investigates its convergence properties. Section 4 shows the applications in static output feedback control and numerical tests. Some concluding remarks are given in the last section.

2 Inner convex approximations

In this section, after giving an overview of concepts and definitions related to matrix operators, we provide a definition of inner positive semidefinite convex approximation of a nonconvex set.

2.1 Preliminaries

Let $\mathcal{S}^p$ be the set of symmetric matrices of size $p \times p$, and $\mathcal{S}^p_+$, resp., $\mathcal{S}^p_{++}$, be the set of symmetric positive semidefinite, resp., positive definite, matrices. For given matrices $X$ and $Y$ in $\mathcal{S}^p$, the relation $X \succeq Y$ (resp., $X \succ Y$) means that $X - Y \in \mathcal{S}^p_+$ (resp., $X - Y \in \mathcal{S}^p_{++}$), and $X \preceq Y$ (resp., $X \prec Y$) means that $Y - X \in \mathcal{S}^p_+$ (resp., $Y - X \in \mathcal{S}^p_{++}$). The quantity $\langle X, Y \rangle := \operatorname{trace}(X^TY)$ is an inner product of two matrices $X$ and $Y$ defined on $\mathcal{S}^p$, where $\operatorname{trace}(Z)$ is the trace of the matrix $Z$. For a given symmetric matrix $X$, $\lambda_{\min}(X)$ denotes the smallest eigenvalue of $X$.

Definition 2.1

[17] A matrix-valued mapping $F : \mathbb{R}^n \to \mathcal{S}^p$ is said to be positive semidefinite convex (psd-convex) on a convex subset $C \subseteq \mathbb{R}^n$ if for all $t \in [0, 1]$ and $x, y \in C$, one has

\[
F(tx + (1 - t)y) \preceq tF(x) + (1 - t)F(y). \tag{1}
\]

If (1) holds with $\prec$ instead of $\preceq$ for $t \in (0, 1)$ and $x \neq y$, then $F$ is said to be strictly psd-convex on $C$. In the opposite case, $F$ is said to be psd-nonconvex. Alternatively, if we replace $\preceq$ in (1) by $\succeq$, then $F$ is said to be psd-concave on $C$. It is obvious that any convex function $f : \mathbb{R}^n \to \mathbb{R}$ is psd-convex with $p = 1$.

A function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be strongly convex with parameter $\rho > 0$ if $f(\cdot) - \frac{\rho}{2}\|\cdot\|^2$ is convex. The notation $\partial f$ denotes the subdifferential of a convex function $f$. For a given convex set $C$, the normal cone of $C$ at $x$ is $\mathcal{N}_C(x) := \{ w \in \mathbb{R}^n : \langle w, x - y \rangle \ge 0, \ \forall y \in C \}$ if $x \in C$ and $\mathcal{N}_C(x) := \emptyset$ if $x \notin C$.

The derivative of a matrix-valued mapping $F$ at $x$ is a linear mapping $DF(x)$ from $\mathbb{R}^n$ to $\mathcal{S}^p$ which is defined by

\[
DF(x)h := \sum_{i=1}^{n} h_i \frac{\partial F(x)}{\partial x_i}, \quad h \in \mathbb{R}^n.
\]

For a given convex set $X \subseteq \mathbb{R}^n$, the matrix-valued mapping $F$ is said to be differentiable on the subset $X$ if its derivative $DF(x)$ exists at every $x \in X$. The definitions of the second order derivatives of matrix-valued mappings can be found, e.g., in [17]. Let $\mathcal{A} : \mathbb{R}^n \to \mathcal{S}^p$ be a linear mapping defined as $\mathcal{A}x := \sum_{i=1}^n x_i A_i$, where $A_i \in \mathcal{S}^p$ for $i = 1, \dots, n$. The adjoint operator of $\mathcal{A}$, denoted $\mathcal{A}^*$, is defined as $\mathcal{A}^*Z := (\langle A_1, Z \rangle, \dots, \langle A_n, Z \rangle)^T$ for any $Z \in \mathcal{S}^p$.

Finally, for simplicity of discussion, throughout this paper, we assume that all the functions and matrix-valued mappings are twice differentiable on their domain.

2.2 Psd-convex overestimate of a matrix operator

Let us first describe the idea of the inner convex approximation for the scalar case. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous nonconvex function. A convex function $g(\cdot; \psi)$ depending on a parameter $\psi$ is called a convex overestimate of $f$ w.r.t. the parameterization $\psi := \psi(\bar{x})$ if $g(\bar{x}; \psi(\bar{x})) = f(\bar{x})$ and $f(x) \le g(x; \psi(\bar{x}))$ for all $x$. Let us consider two examples.

Example 1. Let $f$ be a continuously differentiable function whose gradient is Lipschitz continuous with a Lipschitz constant $L_f > 0$, i.e. $\|\nabla f(x) - \nabla f(y)\| \le L_f\|x - y\|$ for all $x, y$. Then, it is well known that $f(x) \le f(\bar{x}) + \nabla f(\bar{x})^T(x - \bar{x}) + \frac{L_f}{2}\|x - \bar{x}\|^2$. Therefore, for any $\bar{x}$ we have $f(x) \le g(x; \bar{x}) := f(\bar{x}) + \nabla f(\bar{x})^T(x - \bar{x}) + \frac{L_f}{2}\|x - \bar{x}\|^2$ with $g(\bar{x}; \bar{x}) = f(\bar{x})$. Moreover, $g(\cdot; \bar{x})$ is convex for any fixed $\bar{x}$. We conclude that $g$ is a convex overestimate of $f$ w.r.t. the parameterization $\psi(\bar{x}) = \bar{x}$. Now, if we fix $\bar{x}$ and find a point $x$ such that $g(x; \bar{x}) \le 0$, then $f(x) \le 0$. Consequently, if the set $\{x : g(x; \bar{x}) \le 0\}$ is nonempty, we can find a point $x$ such that $f(x) \le 0$. The convex set $\{x : g(x; \bar{x}) \le 0\}$ is called an inner convex approximation of $\{x : f(x) \le 0\}$.
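To make this construction concrete, the following Matlab sketch (our own illustration; the choice $f(x) = \cos x$ with $L_f = 1$ is an assumption, not taken from the paper) verifies the overestimate property numerically and performs one minimization step over the overestimate:

    % Illustration of Example 1 (a sketch; f(x) = cos(x) and L = 1 are our
    % own choices). g(y; x) = f(x) + f'(x)(y - x) + (L/2)(y - x)^2 is a
    % convex overestimate of f w.r.t. the parameterization psi(x) = x.
    f  = @(y) cos(y);   df = @(y) -sin(y);   L = 1;   % |f''| <= 1, so L = 1
    x  = 2.0;                                         % current point x-bar
    g  = @(y) f(x) + df(x).*(y - x) + (L/2)*(y - x).^2;
    ys = linspace(x - 3, x + 3, 201);
    assert(all(g(ys) - f(ys) >= -1e-12));             % g >= f everywhere
    assert(abs(g(x) - f(x)) < 1e-12);                 % g touches f at x-bar
    x_next = x - df(x)/L;                             % minimizer of g(.; x)
    fprintf('f(x) = %.4f >= f(x_next) = %.4f\n', f(x), f(x_next));

Minimizing the overestimate instead of $f$ itself yields a feasible descent step, which is exactly the mechanism the algorithm below generalizes to matrix-valued constraints.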

Example 2. [2] We consider the function $f(x) = x_1x_2$ in $\mathbb{R}^2$. The function $g(x; a) := \frac{a}{2}x_1^2 + \frac{1}{2a}x_2^2$ is a convex overestimate of $f$ w.r.t. the parameterization $a := \psi(\bar{x}) = \bar{x}_2/\bar{x}_1$ provided that $\bar{x}_1\bar{x}_2 > 0$ (so that $a > 0$). This example shows that the mapping $\psi$ is not always the identity.
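A quick numerical check of this overestimate (again our own sketch; the test point $\bar{x}$ is arbitrary, subject only to $\bar{x}_1\bar{x}_2 > 0$):

    % Check Example 2 (a sketch; xb is arbitrary with xb(1)*xb(2) > 0).
    f  = @(x) x(1)*x(2);
    xb = [0.5; 2.0];   a = xb(2)/xb(1);        % parameterization a = psi(xb)
    g  = @(x) (a*x(1)^2 + x(2)^2/a)/2;         % convex since a > 0
    assert(abs(g(xb) - f(xb)) < 1e-12);        % equality at x = xb
    rng(0);
    for trial = 1:1000                         % overestimate at random points
        xt = 4*randn(2, 1);
        assert(g(xt) - f(xt) >= -1e-12);       % g(x) - f(x) = (a*x1 - x2)^2/(2a)
    end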

Let us generalize the convex overestimate concept to matrix-valued mappings.

Definition 2.2

Let us consider a psd-nonconvex matrix-valued mapping $F : \mathbb{R}^n \to \mathcal{S}^p$. A psd-convex matrix-valued mapping $G(\cdot; \psi)$ is said to be a psd-convex overestimate of $F$ w.r.t. the parameterization $\psi := \psi(\bar{x})$ if $G(\bar{x}; \psi(\bar{x})) = F(\bar{x})$ and $F(x) \preceq G(x; \psi(\bar{x}))$ for all $x$ and $\bar{x}$ in $\mathbb{R}^n$.

Let us provide two important examples that satisfy Definition 2.2.

Example 3. Let $\mathcal{B}(X, Y) := X^TY + Y^TX$ be a bilinear form, where $X$ and $Y$ are two matrices in $\mathbb{R}^{p \times n}$. For arbitrary $\bar{X}, \bar{Y} \in \mathbb{R}^{p \times n}$, we consider the parametric quadratic form:

\[
\mathcal{G}(X, Y; \bar{X}, \bar{Y}) := \mathcal{B}(X, \bar{Y}) + \mathcal{B}(\bar{X}, Y) - \mathcal{B}(\bar{X}, \bar{Y}) + (X - \bar{X})^T(X - \bar{X}) + (Y - \bar{Y})^T(Y - \bar{Y}). \tag{2}
\]

One can show that $\mathcal{G}(\cdot, \cdot; \bar{X}, \bar{Y})$ is a psd-convex overestimate of $\mathcal{B}$ w.r.t. the parameterization $\psi(\bar{X}, \bar{Y}) := (\bar{X}, \bar{Y})$.

Indeed, it is obvious that $\mathcal{G}(\bar{X}, \bar{Y}; \bar{X}, \bar{Y}) = \mathcal{B}(\bar{X}, \bar{Y})$. We only prove the second condition in Definition 2.2. We consider the expression $\mathcal{G}(X, Y; \bar{X}, \bar{Y}) - \mathcal{B}(X, Y)$. By rearranging this expression, we can easily show that it equals $U^TU + V^TV - U^TV - V^TU$, where $U := X - \bar{X}$ and $V := Y - \bar{Y}$. Now, since $U^TV + V^TU \preceq U^TU + V^TV$ by [1], we can write:

\[
\mathcal{G}(X, Y; \bar{X}, \bar{Y}) - \mathcal{B}(X, Y) = (U - V)^T(U - V).
\]

Note that $(U - V)^T(U - V) \succeq 0$. Therefore, we have $\mathcal{B}(X, Y) \preceq \mathcal{G}(X, Y; \bar{X}, \bar{Y})$ for all $X, Y$ and $\bar{X}, \bar{Y}$.
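The identity used in this proof is easy to verify numerically; the following sketch (our own, with random data) checks both the identity and the psd property:

    % Check Example 3 (a sketch with random data): with U = X - Xb and
    % V = Y - Yb, G(X,Y;Xb,Yb) - B(X,Y) = (U - V)'*(U - V), which is psd.
    rng(0);  p = 5;  n = 4;
    X  = randn(p, n);  Y  = randn(p, n);
    Xb = randn(p, n);  Yb = randn(p, n);
    B  = @(X, Y) X'*Y + Y'*X;                       % bilinear form
    G  = B(X, Yb) + B(Xb, Y) - B(Xb, Yb) ...
         + (X - Xb)'*(X - Xb) + (Y - Yb)'*(Y - Yb); % overestimate (2)
    U  = X - Xb;  V = Y - Yb;
    assert(norm(G - B(X, Y) - (U - V)'*(U - V), 'fro') < 1e-10);
    assert(min(eig(G - B(X, Y))) >= -1e-10);        % G - B is psd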

Example 4. Let us consider a psd-nonconvex matrix-valued mapping $F(x) := G(x) - H(x)$, where $G$ and $H$ are two psd-convex matrix-valued mappings [20]. Now, let $H$ be differentiable and let $\hat{H}(x; \bar{x}) := H(\bar{x}) + DH(\bar{x})(x - \bar{x})$ be the linearization of $H$ at $\bar{x}$. We define $\hat{F}(x; \bar{x}) := G(x) - \hat{H}(x; \bar{x})$. It is not difficult to show that $\hat{F}(\cdot; \bar{x})$ is a psd-convex overestimate of $F$ w.r.t. the parameterization $\psi(\bar{x}) := \bar{x}$.

Remark 2.3

Example 3 shows that the “Lipschitz coefficient” of the approximating function (2) does not depend on the parameterization point $(\bar{X}, \bar{Y})$. Moreover, as indicated by Examples 3 and 4, the psd-convex overestimate of a bilinear form is not unique. In practice, it is important to find appropriate psd-convex overestimates for bilinear forms in order to make the algorithm perform efficiently. Note that the psd-convex overestimate of $\mathcal{B}$ in Example 3 may be less conservative than the convex-concave decomposition in [20], since all the quadratic terms in (2) involve $X - \bar{X}$ and $Y - \bar{Y}$ rather than $X$ and $Y$.

3 The algorithm and its convergence

Let us recall the nonconvex semidefinite programming problem (P). We denote by

\[
\mathcal{D} := \{ x \in \Omega \mid F_i(x) \preceq 0, \ i = 1, \dots, m \} \tag{4}
\]

the feasible set of (P) and by

\[
\mathcal{D}^\circ := \{ x \in \operatorname{ri}(\Omega) \mid F_i(x) \prec 0, \ i = 1, \dots, m \} \tag{5}
\]

the relative interior of $\mathcal{D}$, where $\operatorname{ri}(\Omega)$ is the relative interior of the convex set $\Omega$. First, we need the following fundamental assumption.

Assumption A.1

The set of interior points $\mathcal{D}^\circ$ of $\mathcal{D}$ is nonempty.

Then, we can write the generalized KKT system of (P) as follows:

\[
\begin{cases}
0 \in \partial f(x^*) + \sum_{i=1}^{m} DF_i(x^*)^* W_i^* + \mathcal{N}_\Omega(x^*), \\[2pt]
F_i(x^*) \preceq 0, \quad W_i^* \succeq 0, \quad \langle W_i^*, F_i(x^*) \rangle = 0, \quad i = 1, \dots, m.
\end{cases} \tag{6}
\]

Any point $(x^*, W^*)$ with $W^* := (W_1^*, \dots, W_m^*)$ satisfying (6) is called a KKT point of (P), where $x^*$ is called a stationary point and $W^*$ is called the corresponding Lagrange multiplier.

3.1 Convex semidefinite programming subproblem

The main step of the algorithm is to solve a convex semidefinite programming problem formed at iteration $k$ by using inner psd-convex approximations. This subproblem is defined as follows:

\[
\min_{x \in \Omega} \ f(x) + \frac{1}{2}(x - x^k)^T Q_k (x - x^k) \quad \text{s.t.} \quad G_i(x; \psi_i(x^k)) \preceq 0, \ i = 1, \dots, m. \tag{$P(x^k)$}
\]

Here, $Q_k \in \mathcal{S}^n_+$ is given and the second term in the objective function is referred to as a regularization term; $\psi_i^k := \psi_i(x^k)$ is the parameterization of the psd-convex overestimate $G_i$ of $F_i$.

Let us denote by $S(x^k; Q_k)$ the solution mapping of $(P(x^k))$ depending on the parameters $(x^k, Q_k)$. Note that since the problem $(P(x^k))$ is convex, $S(x^k; Q_k)$ is multivalued and convex. The feasible set of $(P(x^k))$ is written as:

\[
\mathcal{D}(x^k) := \{ x \in \Omega \mid G_i(x; \psi_i(x^k)) \preceq 0, \ i = 1, \dots, m \}. \tag{7}
\]

3.2 The algorithm

The algorithm for solving (P) starts from an initial point $x^0 \in \mathcal{D}^\circ$ and generates a sequence $\{x^k\}$ by solving a sequence of convex semidefinite programming subproblems $(P(x^k))$ approximated at $x^k$. More precisely, it is presented in detail as follows.

Algorithm 1 (Inner Convex Approximation)

Initialization. Determine an initial point $x^0 \in \mathcal{D}^\circ$. Compute $\psi_i^0 := \psi_i(x^0)$ for $i = 1, \dots, m$. Choose a regularization matrix $Q_0 \in \mathcal{S}^n_+$. Set $k := 0$.

Iteration $k$ ($k = 0, 1, 2, \dots$). Perform the following steps:

  • Step 1. For a given $x^k$, if a given stopping criterion is satisfied, then terminate.

  • Step 2. Solve the convex semidefinite program $(P(x^k))$ to obtain a solution $x^{k+1}$ and the corresponding Lagrange multiplier $W^{k+1}$.

  • Step 3. Update $\psi_i^{k+1} := \psi_i(x^{k+1})$ and the regularization matrix $Q_{k+1}$ (if necessary). Increase $k$ by $1$ and go back to Step 1.

End.

The core step of Algorithm 1 is Step 2, where a general convex semidefinite program needs to be solved. In practice, this can be done either by implementing a particular method that exploits problem structure or by relying on standard semidefinite programming software tools. Note that the regularization matrix can be fixed at $Q_k := \rho I$ for all $k$, where $\rho > 0$ is sufficiently small and $I$ is the identity matrix. Since Algorithm 1 generates a sequence $\{x^k\}$ that is feasible for the original problem (P) and strictly descent w.r.t. the objective function $f$, no globalization strategy such as line-search or trust-region is needed.
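In Matlab-like form, the outer loop of Algorithm 1 can be sketched as follows. This is our own sketch: solve_subproblem stands for a hypothetical routine that solves the convex SDP $(P(x^k))$ (for instance, the YALMIP-based code sketched in Section 4.1), and the stopping rule, iteration cap and tolerances are our own choices, not the paper's exact settings.

    % Outer loop of Algorithm 1 (a sketch; solve_subproblem is a hypothetical
    % solver of the convex SDP P(x^k); tolerances are our own choices).
    x   = x0;          % x0 must be strictly feasible (x0 in the interior of D)
    rho = 1e-3;        % fixed regularization weight, Q_k = rho*I
    for k = 0:199
        x_new = solve_subproblem(x, rho);            % Step 2: solve P(x^k)
        if norm(x_new - x) <= 1e-4*(1 + norm(x))     % Step 1: stopping test
            break;
        end
        x = x_new;                                   % Step 3: update iterate
    end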

3.3 Convergence analysis

We first show some properties of the feasible set $\mathcal{D}(x^k)$ defined by (7). For notational simplicity, we use the shorthand $\mathcal{D}_k := \mathcal{D}(x^k)$.

Lemma 3.1

Let $\{x^k\}$ be a sequence generated by Algorithm 1. Then:

  • The feasible set $\mathcal{D}_k \subseteq \mathcal{D}$ for all $k \ge 0$.

  • It is a feasible sequence, i.e. $\{x^k\} \subset \mathcal{D}$.

  • $x^{k+1} \in \mathcal{D}_k \cap \mathcal{D}_{k+1}$.

  • For any $k \ge 0$, it holds that:

    $f(x^{k+1}) \le f(x^k) - \frac{\rho_k}{2}\|x^{k+1} - x^k\|^2,$

    where $\rho_k > 0$ is the strong convexity parameter of the objective function of $(P(x^k))$.

Proof.

For a given $x^k$, we have $F_i(x) \preceq G_i(x; \psi_i(x^k))$ for all $x \in \Omega$ and $i = 1, \dots, m$. Thus if $x \in \mathcal{D}_k$ then $x \in \mathcal{D}$, so statement a) holds. Consequently, the sequence $\{x^k\}$ is feasible for (P), which is indeed statement b). Since $x^{k+1}$ is a solution of $(P(x^k))$, it follows that $x^{k+1} \in \mathcal{D}_k$. Now, we have to show that it also belongs to $\mathcal{D}_{k+1}$. Indeed, since $G_i(x^{k+1}; \psi_i(x^{k+1})) = F_i(x^{k+1}) \preceq 0$ by Definition 2.2 for all $i = 1, \dots, m$, we conclude that $x^{k+1} \in \mathcal{D}_{k+1}$. Statement c) is proved. Finally, we prove d). Since $x^{k+1}$ is the optimal solution of the strongly convex problem $(P(x^k))$, we have $f_k(x^{k+1}) \le f_k(x) - \frac{\rho_k}{2}\|x - x^{k+1}\|^2$ for all $x \in \mathcal{D}_k$, where $f_k$ denotes the objective function of $(P(x^k))$. However, we have $x^k \in \mathcal{D}_k$ due to c). By substituting $x = x^k$ into the previous inequality, we obtain the estimate d).

Now, we denote by $\mathcal{L}_f(\alpha) := \{ x \in \mathcal{D} : f(x) \le \alpha \}$ the lower level set of the objective function. Let us assume that $G_i(\cdot; \psi_i(\bar{x}))$ is continuously differentiable in $\mathcal{D}(\bar{x})$ for any $\bar{x} \in \mathcal{D}$. We say that the Robinson qualification condition for $(P(\bar{x}))$ holds at $\bar{x}$ if $0 \in \operatorname{int}\big( G_i(\bar{x}; \psi_i(\bar{x})) + DG_i(\bar{x}; \psi_i(\bar{x}))(\Omega - \bar{x}) + \mathcal{S}^{p_i}_+ \big)$ for $i = 1, \dots, m$. In order to prove the convergence of Algorithm 1, we require the following assumption.

Assumption A.2

The set of KKT points of (P) is nonempty. For a given $\bar{x} \in \mathcal{D}$, the matrix-valued mappings $G_i(\cdot; \psi_i(\bar{x}))$ ($i = 1, \dots, m$) are continuously differentiable on $\mathcal{D}(\bar{x})$. The convex problem $(P(\bar{x}))$ is solvable and the Robinson qualification condition holds at its solutions.

We note that if Algorithm 1 terminates at iteration $k$ with $x^{k+1} = x^k$, then $x^k$ is a stationary point of (P).

Theorem 3.2

Suppose that Assumptions A.1 and A.2 are satisfied. Suppose further that the lower level set $\mathcal{L}_f(f(x^0))$ is bounded. Let $\{x^k\}$ be an infinite sequence generated by Algorithm 1 starting from $x^0 \in \mathcal{D}^\circ$. Assume, in addition, that the sequence of regularization matrices $\{Q_k\}$ is bounded. Then, if either $f$ is strongly convex or $\lambda_{\min}(Q_k) > 0$ for all $k \ge 0$, every accumulation point of $\{x^k\}$ is a KKT point of (P). Moreover, if the set of KKT points of (P) is finite, then the whole sequence $\{x^k\}$ converges to a KKT point of (P).

Proof.

First, we show that the solution mapping $S(\cdot)$ is closed. Indeed, by Assumption A.2, $(P(x^k))$ is feasible. Moreover, its objective function is strongly convex. Hence, $S(x^k; Q_k)$ is a singleton, which is obviously closed. The remaining conclusions of the theorem can be proved similarly to [20, Theorem 3.2] by using Zangwill’s convergence theorem [22, p. 91]; we omit the details here.

Remark 3.3

Note that the assumptions used in the proof of the closedness of the solution mapping in Theorem 3.2 are weaker than those used in [20, Theorem 3.2].

4 Applications to robust controller design

In this section, we present some applications of Algorithm 1 to solving several classes of optimization problems arising in static output feedback controller design. Typically, these problems are related to a linear, time-invariant (LTI) system of the form:

\[
\dot{x} = Ax + B_1w + Bu, \quad
z = C_1x + D_{11}w + D_{12}u, \quad
y = Cx + D_{21}w, \tag{8}
\]

where $x \in \mathbb{R}^n$ is the state vector, $w \in \mathbb{R}^{n_w}$ is the performance input, $u \in \mathbb{R}^{n_u}$ is the input vector, $z \in \mathbb{R}^{n_z}$ is the performance output, $y \in \mathbb{R}^{n_y}$ is the physical output vector, $A$ is the state matrix, $B$ is the input matrix and $C$ is the output matrix. By using a static feedback controller of the form $u = Fy$ with $F \in \mathbb{R}^{n_u \times n_y}$, we can write the closed-loop system as follows:

\[
\dot{x} = A_Fx + B_Fw, \quad z = C_Fx + D_Fw, \tag{9}
\]

where $A_F := A + BFC$, $B_F := B_1 + BFD_{21}$, $C_F := C_1 + D_{12}FC$ and $D_F := D_{11} + D_{12}FD_{21}$.

The stabilization, $H_2$, $H_\infty$ optimization and other control problems for the LTI system (8) can be formulated as optimization problems with BMI constraints. We use only the psd-convex overestimate of a bilinear form given in Example 3 to show that Algorithm 1 can be applied to many problems in static state/output feedback controller design, such as:

  • Sparse linear static output feedback controller design;

  • Spectral abscissa and pseudospectral abscissa optimization;

  • $H_2$ optimization;

  • $H_\infty$ optimization;

  • and mixed $H_2/H_\infty$ synthesis.

These problems possess at least one BMI constraint of the form $\mathcal{B}(X, Y) + \mathcal{A}(Z) \preceq 0$, where $\mathcal{B}(X, Y) := X^TY + Y^TX$, $X$ and $Y$ are matrix variables and $\mathcal{A}$ is an affine operator of a matrix variable $Z$. By means of Example 3, we can approximate the bilinear term $\mathcal{B}(X, Y)$ by its psd-convex overestimate and then use Schur's complement to transform the constraint of the subproblem $(P(x^k))$ into an LMI constraint [20]. Note that Algorithm 1 requires an interior starting point $x^0 \in \mathcal{D}^\circ$. In this work, we apply the procedures proposed in [20] to find such a point. Now, we summarize the whole procedure for solving optimization problems with BMI constraints as follows:

Scheme A. 1

Step 1. Find a psd-convex overestimate $G_i(x; \psi_i(\bar{x}))$ of $F_i(x)$ w.r.t. the parameterization $\psi_i(\bar{x})$ for $i = 1, \dots, m$ (see Example 1).
Step 2. Find a starting point $x^0 \in \mathcal{D}^\circ$ (see [20]).
Step 3. For a given $x^k$, form the convex semidefinite programming problem $(P(x^k))$ and reformulate it as an optimization problem with LMI constraints.
Step 4. Apply Algorithm 1 with an SDP solver to solve the resulting problem.

Now, we test Algorithm 1 on three problems via numerical examples using data from the COMPleib library [12]. All implementations are done in Matlab 7.8.0 (R2009a) running on a laptop with an Intel(R) Core(TM) i7 Q740 1.73GHz processor and 4GB of RAM. We use the YALMIP package [14] as a modeling language and SeDuMi 1.1 [19] as an SDP solver for the LMI optimization problems arising in Algorithm 1, both in the initial phase (Phase 1) and in the subproblem $(P(x^k))$. The code is available at http://www.kuleuven.be/optec/software/BMIsolver. We also compare the performance of Algorithm 1 with that of the convex-concave decomposition method (CCDM) proposed in [20] on the first example, the spectral abscissa optimization problem. In the second example, we compare the $H_2$-norm computed by Algorithm 1 with the values provided by HIFOO [8] and PENBMI [9]. The last example is the mixed $H_2/H_\infty$ synthesis problem, where we compare the results for two values of the $H_\infty$-norm level.
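As a minimal illustration of this toolchain (our own sanity-check snippet, not part of the benchmark code; the Hurwitz test matrix is arbitrary), the following YALMIP/SeDuMi script solves a single Lyapunov LMI:

    % Minimal YALMIP + SeDuMi example (a sketch; the stable test matrix A is
    % arbitrary): find P > 0 with A'*P + P*A < 0, an LMI feasibility problem.
    A    = [-1 2; 0 -3];                        % eigenvalues -1 and -3
    P    = sdpvar(2);                           % 2x2 symmetric variable
    Con  = [P >= 1e-6*eye(2), A'*P + P*A <= -1e-6*eye(2)];
    opts = sdpsettings('solver', 'sedumi', 'verbose', 0);
    sol  = optimize(Con, trace(P), opts);       % minimize trace(P) to normalize
    assert(sol.problem == 0);                   % 0 means solved successfully
    disp(value(P));                             % a Lyapunov certificate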

4.1 Spectral abscissa optimization

We consider the following optimization problem with a BMI constraint, obtained by optimizing the spectral abscissa of the closed-loop system $\dot{x} = (A + BFC)x$ [5, 13]:

\[
\max_{P, F, \beta} \ \beta \quad \text{s.t.} \quad (A + BFC)^TP + P(A + BFC) + 2\beta P \preceq 0, \quad P = P^T \succ 0. \tag{10}
\]

Here, the matrices $A$, $B$ and $C$ are given. The matrices $P$ and $F$ and the scalar $\beta$ are considered as variables. If the optimal value of (10) is strictly positive, then the closed-loop feedback controller $u = Fy$ stabilizes the linear system.

By introducing an intermediate variable $M := A + BFC + \beta I$, the BMI constraint in (10) can be written as $\mathcal{B}(P, M) = PM + M^TP \preceq 0$. Now, by applying Scheme A.1, one can solve problem (10) by exploiting the SeDuMi SDP solver [19]; a YALMIP sketch of the resulting convexified subproblem is given after the list below. In order to obtain a strictly descent direction, we regularize the subproblem by adding the quadratic terms $\frac{\rho}{2}\big(\|P - P^k\|_F^2 + \|F - F^k\|_F^2 + (\beta - \beta^k)^2\big)$ to the objective, where $\rho > 0$. Algorithm 1 is terminated if one of the following conditions is satisfied:

  • the subproblem $(P(x^k))$ encounters a numerical problem;

  • the relative step $\|x^{k+1} - x^k\| / (1 + \|x^k\|)$ falls below a given tolerance;

  • the maximum number of iterations, $k_{\max}$, is reached;

  • or the objective function of (P) is not significantly improved after two successive iterations, i.e. $|f(x^{k+1}) - f(x^k)| \le \varepsilon(1 + |f(x^k)|)$ for some $k$, where $\varepsilon > 0$ is a given tolerance.
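Based on the reconstruction above, one convexified subproblem of (10) can be sketched in YALMIP as follows. This is our own illustration: the intermediate variable $M = A + BFC + \beta I$, the normalization $P \succeq 10^{-4}I$ and the regularization weight rho are assumptions rather than the paper's exact settings; the quadratic terms of the overestimate (2) are handled through a Schur complement.

    % One convexified subproblem of (10) (a sketch; the normalization and
    % regularization choices are ours). (Pk, Fk, betak) is the current iterate.
    function [P, F, beta] = bmi_subproblem(A, B, C, Pk, Fk, betak, rho)
      n    = size(A, 1);
      P    = sdpvar(n);                          % symmetric Lyapunov variable
      F    = sdpvar(size(B, 2), size(C, 1), 'full');
      beta = sdpvar(1);
      M    = A + B*F*C + beta*eye(n);            % intermediate variable M
      Mk   = A + B*Fk*C + betak*eye(n);
      % Affine part of the overestimate (2) of B(P, M) = P*M + M'*P:
      L = P*Mk + Mk'*P + Pk*M + M'*Pk - (Pk*Mk + Mk'*Pk);
      % G(P, M; Pk, Mk) <= 0 via a Schur complement of the quadratic terms:
      lmi = [L,        (P - Pk),  (M - Mk)';
             (P - Pk), -eye(n),   zeros(n);
             (M - Mk), zeros(n),  -eye(n)];
      Con = [lmi <= 0, P >= 1e-4*eye(n)];
      obj = -beta + (rho/2)*(sum(sum((P - Pk).^2)) ...
            + sum(sum((F - Fk).^2)) + (beta - betak)^2);
      sol = optimize(Con, obj, sdpsettings('solver', 'sedumi', 'verbose', 0));
      assert(sol.problem == 0);
      P = value(P);  F = value(F);  beta = value(beta);
    end

The 3x3 block LMI encodes $L + U^TU + V^TV \preceq 0$ with $U = P - P^k$ and $V = M - M^k$, which is exactly the constraint $\mathcal{G}(P, M; P^k, M^k) \preceq 0$ from Example 3.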

We test Algorithm 1 on several problems from COMPleib and compare our results with those reported by the convex-concave decomposition method (CCDM) in [20].

Problem | Convex-Concave Decomp. (CCDM) | Inner Convex App. (Algorithm 1)
Name  α(A) | α(A_F)  iter  time[s] | α(A_F)  iter  time[s]
AC1 0.000 -0.8644 62 23.580 -0.7814 55 19.510
AC4 2.579 -0.0500 14 6.060 -0.0500 14 4.380
AC5 0.999 -0.7389 28 10.200 -0.7389 37 12.030
AC7 0.172 -0.0766 200 95.830 -0.0502 90 80.710
AC8 0.012 -0.0755 24 12.110 -0.0640 40 32.340
AC9 0.012 -0.4053 100 55.460 -0.3926 200 217.230
AC11 5.451 -5.5960 200 81.230 -3.1573 181 73.660
AC12 0.580 -0.5890 200 61.920 -0.2948 200 71.200
HE1 0.276 -0.2241 200 56.890 -0.2134 200 58.580
HE3 0.087 -0.9936 200 98.730 -0.8380 57 54.720
HE4 0.234 -0.8647 63 27.620 -0.8375 88 70.770
HE5 0.234 -0.1115 200 86.550 -0.0609 200 181.470
HE6 0.234 -0.0050 12 29.580 -0.0050 18 106.840
REA1 1.991 -4.2792 200 70.370 -2.8932 200 74.560
REA2 2.011 -2.1778 40 13.360 -1.9514 43 13.120
REA3 0.000 -0.0207 200 267.160 -0.0207 161 311.490
DIS2 1.675 -8.4540 28 9.430 -8.3419 44 12.600
DIS4 1.442 -8.2729 95 40.200 -5.4467 89 40.120
WEC1 0.008 -0.8972 200 121.300 -0.8568 68 76.000
IH 0.000 -0.5000 7 23.670 -0.5000 11 82.730
CSE1 0.000 -0.3093 81 219.910 -0.2949 200 1815.400
TF1 0.000 -0.1598 87 34.960 -0.0704 200 154.430
TF2 0.000 -0.0000 8 4.220 -0.0000 12 10.130
TF3 0.000 -0.0031 93 35.000 -0.0032 95 70.980
NN1 3.606 -1.5574 200 57.370 0.1769 200 59.230
NN5 0.420 -0.0722 200 79.210 -0.0490 200 154.160
NN9 3.281 -0.0279 33 11.880 0.0991 44 13.860
NN13 1.945 -3.4412 181 64.500 -0.2783 32 12.430
NN15 0.000 -1.0424 200 58.440 -1.0409 200 60.930
NN17 1.170 -0.6008 99 27.190 -0.5991 132 34.820
TABLE I: Computational results for (10) in COMPleib

The numerical results and the performances of the two algorithms are reported in Table I. Here, we initialize both algorithms with the same initial guess.

The notation in Table I consists of: Name is the problem name; $\alpha(A)$ and $\alpha(A_F)$ are the maximum real parts of the eigenvalues of the open-loop and closed-loop matrices $A$ and $A_F = A + BFC$, respectively; iter is the number of iterations; and time[s] is the CPU time in seconds. Both Algorithm 1 and CCDM fail or make only slow progress towards a local solution on the problems AC18, DIS5, PAS, NN6, NN7 and NN12 in COMPleib. Problems AC5 and NN5 are initialized with a different matrix to avoid numerical problems. The numerical results show that the performance of both methods is quite similar for the majority of problems.

Note that both Algorithm 1 and the algorithm in [20] are local optimization methods, which only find local minimizers, and these solutions may not be the same.

4.2 $H_2$ optimization: BMI formulation

Next, we apply Algorithm 1 to solve the optimization problem with BMI constraints arising in $H_2$ optimization of the linear system (8). In this example, we assume that $D_{11} = 0$; the problem is then reformulated as the following optimization problem with BMI constraints [12]:

(11)

Here, as before, we define $A_F := A + BFC$ and $C_F := C_1 + D_{12}FC$. The bilinear matrix term in the top corner of the first constraint can be approximated by the overestimate $\mathcal{G}$ defined in (2). Therefore, we can use this psd-convex overestimate to approximate problem (11) by a sequence of convex subproblems of the form $(P(x^k))$. Each subproblem is then transformed into a standard SDP problem that can be solved by a standard SDP solver thanks to Schur's complement [1, 20].

To determine a starting point, we perform the heuristic procedure called Phase 1 proposed in [20], which terminates after a finite number of iterations. In this example, we also test Algorithm 1 on several problems from COMPleib using the same parameters and stopping criterion as in the previous subsection. The computational results are shown in Table II, which also includes the results computed by HIFOO and PENBMI.

Here, the last three columns give the results and performances of our method, while the columns HIFOO and PENBMI indicate the $H_2$-norm of the closed-loop system for the static output feedback controller computed by HIFOO and PENBMI, respectively. We can see from Table II that the optimal values reported by Algorithm 1 and HIFOO are similar for many problems, whereas PENBMI in general has difficulties in finding a feasible solution.

Problem information | Other results | Results and performances (Algorithm 1)
Name  n_x  n_y  n_u  n_z  n_w | HIFOO  PENBMI | H2-norm  iter  time[s]
AC2 5 3 3 5 3 0.1115 - 0.1174 120 91.560
AC3 5 4 2 5 5 4.7021 - 3.5053 267 193.940
AC6 7 4 2 7 7 4.1140 - 4.1954 167 138.570
AC7 9 2 1 1 4 0.0651 0.3810 0.0339 300 276.310
AC8 9 5 1 2 10 2.0050 - 4.5463 224 230.990
AC11 5 4 2 5 5 3.5603 - 3.4924 300 255.620
AC15 4 3 2 6 4 15.2074 427.4106 15.2036 153 130.660
AC16 4 4 2 6 4 15.4969 - 15.0433 267 201.360
AC17 4 2 1 4 4 6.6124 - 6.6571 192 64.880
HE1 4 1 2 2 2 0.1540 1.5258 0.2188 300 97.760
HE3 8 6 4 10 1 0.8545 1.6843 0.8640 15 16.320
HE5 8 2 4 4 3 8.8952 - 36.3330 154 208.680
REA1 4 3 2 4 4 0.8975 - 0.8815 183 67.790
REA2 4 2 2 4 4 1.1881 - 1.4444 300 109.430
REA3 12 3 1 12 12 74.2513 74.4460 75.0634 2 137.120
DIS1 8 4 4 8 1 4.1716 - 4.2041 129 110.330
DIS2 3 2 2 3 3 1.0548 1.7423 1.1570 78 28.330
DIS3 5 3 3 2 3 1.0816 - 1.1701 219 160.680
DIS4 6 6 4 6 6 0.7465 - 0.7532 171 126.940
TG1 10 2 2 10 10 12.8462 - 12.9461 64 264.050
AGS 12 2 2 12 12 8.1732 188.0315 8.1733 41 160.880
WEC2 10 4 3 10 10 4.2726 32.9935 8.8809 300 1341.760
WEC3 10 4 3 10 10 4.4497 200.1467 7.8215 225 875.100
BDT1 11 3 3 6 1 0.2664 - 0.8544 3 5.290
MFP 4 2 3 4 4 31.5899 - 31.6388 300 100.660
IH 21 10 11 11 21 1.9797 - 1.1861 210 2782.880
CSE1 20 10 2 12 1 0.0201 - 0.0219 3 39.330
PSM 7 3 2 5 2 0.9202 - 0.9266 153 104.170
EB1 10 1 1 2 2 3.1225 39.9526 2.0532 300 299.380
EB2 10 1 1 2 2 2.0201 39.9547 0.8150 120 103.400
EB3 10 1 1 2 2 2.0575 3995311.0743 0.8157 117 116.390
NN2 2 1 1 2 2 2.2216 - 2.2216 15 7.070
NN4 4 3 2 4 4 1.3627 - 1.3884 204 70.200
NN8 3 2 2 3 3 2.8871 78281181.1490 2.9522 240 84.510
NN11 16 5 3 3 3 0.1037 - 0.1596 15 86.770
NN15 3 2 2 4 1 0.1039 - 0.1201 6 4.000
NN16 8 4 4 4 8 0.9557 - 0.9699 36 32.200
NN17 3 1 2 2 1 11.2182 - 11.2538 270 81.480
TABLE II: $H_2$ synthesis benchmarks on COMPleib plants

4.3 $H_2/H_\infty$ optimization: BMI formulation

Motivated by the $H_2$ optimization problem, in this example we consider the mixed $H_2/H_\infty$ synthesis problem. Let us assume that the performance output $z$ is divided into two components, $z_1$ and $z_2$. Then the linear system (8) becomes: