# Inexact indefinite proximal ADMMs for 2-block separable convex programs and applications to 4-block DNNSDPs

Li Shen (Department of Mathematics, South China University of Technology, Guangzhou, 510641, China; shen.li@mail.scut.edu.cn) and Shaohua Pan (corresponding author; Department of Mathematics, South China University of Technology, Tianhe District of Guangzhou City, China; shhpan@scut.edu.cn)
July 8, 2015
###### Abstract

This paper is concerned with two-block separable convex minimization problems with linear constraints, for which it is either impossible or too expensive to obtain the exact solutions of the subproblems involved in the proximal ADMM (alternating direction method of multipliers). Such structured convex minimization problems often arise from the two-block regrouping of three- or four-block separable convex optimization problems with linear constraints, or from the constrained total-variation superresolution image reconstruction problems in image processing. For them, we propose an inexact indefinite proximal ADMM with a step-size $\tau$ and two easily implementable inexactness criteria to control the solution accuracy of the subproblems, and establish its convergence under a mild assumption on the indefinite proximal terms. We apply the proposed inexact indefinite proximal ADMMs to the three- or four-block separable convex minimization problems with linear constraints that come from the important class of doubly nonnegative semidefinite programming (DNNSDP) problems with many linear equality and/or inequality constraints. Numerical results indicate that the inexact indefinite proximal ADMM with the absolute error criterion performs comparably to the directly extended multi-block ADMM, which has no convergence guarantee, in terms of both the number of iterations and the computation time.

Keywords: Separable convex optimization, inexact proximal ADMM, DNNSDPs

## 1 Introduction

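Let $\mathbb{X}$, $\mathbb{Y}$ and $\mathbb{Z}$ be finite-dimensional vector spaces endowed with the inner product $\langle\cdot,\cdot\rangle$ and its induced norm $\|\cdot\|$. Given closed proper convex functions $f\colon\mathbb{X}\to(-\infty,+\infty]$ and $g\colon\mathbb{Y}\to(-\infty,+\infty]$, we are concerned with the separable convex optimization problem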

$$\min_{x\in\mathbb{X},\,y\in\mathbb{Y}}\ f(x)+g(y)\quad\text{s.t.}\quad A^*x+B^*y=c, \tag{1}$$

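where $A\colon\mathbb{Z}\to\mathbb{X}$ and $B\colon\mathbb{Z}\to\mathbb{Y}$ are the given linear operators, $A^*$ and $B^*$ denote the adjoint operators of $A$ and $B$, respectively, and $c\in\mathbb{Z}$ is a given vector.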

As is well known, many important problems take the form of (1). They include the covariance selection problems and semidefinite least squares problems in statistics [1, 30, 39], the sparse plus low-rank recovery problem arising from the so-called robust PCA (principal component analysis) with noisy and incomplete data [34, 32], the constrained total-variation image restoration and reconstruction problems [22, 29], the simultaneous minimization of the nuclear norm and the $\ell_1$-norm of a matrix arising from the low-rank and sparse representation for image classification and subspace clustering [40, 36], and so on.

For the structured convex minimization problem (1), the alternating direction method of multipliers (ADMM for short), first proposed by Glowinski and Marrocco [11] and Gabay and Mercier [12], is one of the most popular methods. For any given $\sigma>0$, let $L_\sigma$ denote the augmented Lagrangian function of problem (1):

$$L_\sigma(x,y,z):=f(x)+g(y)+\langle z,\,A^*x+B^*y-c\rangle+\frac{\sigma}{2}\|A^*x+B^*y-c\|^2.$$

The ADMM, from an initial point $(x^0,y^0,z^0)$, consists of the steps

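$$\begin{aligned}
x^{k+1}&\in\operatorname*{arg\,min}_{x\in\mathbb{X}}\ L_\sigma(x,y^k,z^k), &&\text{(2a)}\\
y^{k+1}&\in\operatorname*{arg\,min}_{y\in\mathbb{Y}}\ L_\sigma(x^{k+1},y,z^k), &&\text{(2b)}\\
z^{k+1}&=z^k+\tau\sigma\,(A^*x^{k+1}+B^*y^{k+1}-c), &&\text{(2c)}
\end{aligned}$$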

where $\tau>0$ is a constant to control the step-size in (2c). The iterative scheme of the ADMM actually embeds a Gauss-Seidel decomposition into each iteration of the classical augmented Lagrangian method of Hestenes-Powell-Rockafellar [14, 25, 28], so that the challenging task (i.e., the exact solution, or the approximate solution with a high precision, of the Lagrangian minimization problem) is relaxed to several easy ones.
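To make steps (2a)-(2c) concrete, here is a minimal sketch on a toy instance that is not taken from the paper: $f(x)=\frac12(x-a)^2$, $g(y)=\frac12(y-b)^2$, with $A$ and $B$ both the identity on $\mathbb{R}$, so the constraint is $x+y=c$. The function name `admm_toy` and all parameter values are illustrative assumptions; both subproblems happen to have closed-form solutions here, which is exactly the situation the paper does not assume in general.

```python
def admm_toy(a, b, c, sigma=1.0, tau=1.0, iters=100):
    """ADMM steps (2a)-(2c) for min 0.5*(x-a)^2 + 0.5*(y-b)^2  s.t.  x + y = c.

    Toy scalar instance (an assumption for illustration, not from the paper).
    """
    y, z = 0.0, 0.0
    x = 0.0
    for _ in range(iters):
        # (2a): stationarity of L_sigma in x gives
        #       (x - a) + z + sigma*(x + y - c) = 0.
        x = (a - z - sigma * (y - c)) / (1.0 + sigma)
        # (2b): the same computation in y, using the new x.
        y = (b - z - sigma * (x - c)) / (1.0 + sigma)
        # (2c): multiplier update with step-size tau*sigma.
        z = z + tau * sigma * (x + y - c)
    return x, y, z


x, y, z = admm_toy(a=1.0, b=2.0, c=5.0)
print(x, y, z)  # the iterates approach the primal-dual solution (2, 3, -1)
```

For this instance the optimality conditions give $x^*=(a-b+c)/2$, $y^*=(b-a+c)/2$ and $z^*=(a+b-c)/2$, which is what the printed values should approximate.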

Notice that the subproblems (2a) and (2b) in the ADMM may have no closed-form solutions or even be difficult to solve. When the functions $f$ and $g$ enjoy a closed-form Moreau envelope, one usually introduces the proximal terms $\frac12\|x-x^k\|_{P_f}^2$ and $\frac12\|y-y^k\|_{P_g}^2$ respectively into the subproblems (2a) and (2b) to cancel the operators $AA^*$ and $BB^*$, so as to get the exact solutions of the proximal subproblems. This is the so-called proximal ADMM which, for a chosen initial point $(x^0,y^0,z^0)$, consists of

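$$\begin{aligned}
x^{k+1}&=\operatorname*{arg\,min}_{x\in\mathbb{X}}\ L_\sigma(x,y^k,z^k)+\tfrac12\|x-x^k\|_{P_f}^2, &&\text{(3a)}\\
y^{k+1}&=\operatorname*{arg\,min}_{y\in\mathbb{Y}}\ L_\sigma(x^{k+1},y,z^k)+\tfrac12\|y-y^k\|_{P_g}^2, &&\text{(3b)}\\
z^{k+1}&=z^k+\tau\sigma\,(A^*x^{k+1}+B^*y^{k+1}-c). &&\text{(3c)}
\end{aligned}$$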

The existing works on the proximal ADMM mostly focus on positive definite proximal terms (see, e.g., [15, 35, 41]). It is easy to see that the proximal subproblems with positive definite proximal terms can differ greatly from the original subproblems of the ADMM. In fact, as pointed out in the concluding remarks of [15], “large and positive definite proximal terms will lead to easy solution of subproblems, but the number of iterations will increase. Therefore, for subproblems which are not extremely ill-posed, the proximal parameters should be small.” In view of this, some researchers have recently developed the semi-proximal or indefinite proximal ADMM [38, 9, 21] by using positive semidefinite or even indefinite proximal terms. The numerical experiments in [9] show that such tighter proximal terms display better numerical performance. In addition, it is worthwhile to emphasize that the ADMM itself is a semi-proximal (and of course an indefinite proximal) ADMM, but is not in the family of positive definite proximal ADMMs.
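The trade-off quoted from [15] can be observed on a toy problem. The sketch below is an illustration under assumptions, not an experiment from the paper: it runs the proximal ADMM on $\min \frac12(x-a)^2+\frac12(y-b)^2$ s.t. $x+y=c$ with scalar proximal terms $P_f=P_g=\rho\ge 0$ and counts the iterations needed to reach a fixed accuracy. The function name `prox_admm_iterations`, the stopping rule based on the known solution, and the chosen values of `rho` are all hypothetical.

```python
def prox_admm_iterations(rho, a=1.0, b=2.0, c=5.0, sigma=1.0, tau=1.0,
                         tol=1e-8, max_iter=100000):
    """Count proximal-ADMM iterations on a toy scalar problem.

    Proximal terms are 0.5*rho*(x - x^k)^2 and 0.5*rho*(y - y^k)^2 (rho >= 0).
    The stopping test compares against the known solution of this toy problem.
    """
    x_star = (a - b + c) / 2.0
    y_star = (b - a + c) / 2.0
    z_star = (a + b - c) / 2.0
    x, y, z = 0.0, 0.0, 0.0
    for k in range(1, max_iter + 1):
        # x-step: (x - a) + z + sigma*(x + y - c) + rho*(x - x_old) = 0.
        x = (a - z - sigma * (y - c) + rho * x) / (1.0 + sigma + rho)
        # y-step: same form, with the updated x.
        y = (b - z - sigma * (x - c) + rho * y) / (1.0 + sigma + rho)
        # multiplier step.
        z = z + tau * sigma * (x + y - c)
        if abs(x - x_star) + abs(y - y_star) + abs(z - z_star) <= tol:
            return k
    return max_iter


counts = {rho: prox_admm_iterations(rho) for rho in (0.0, 1.0, 10.0)}
print(counts)  # the count for rho = 10 is noticeably larger than for rho = 0
```

With `rho = 0` the scheme is the plain ADMM; a large `rho` makes each step shorter, so more iterations are needed on this instance. This is only a one-dimensional illustration and says nothing about running time on realistic problems.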

In this paper we are interested in problem (1) in which the functions $f$ and/or $g$ may not have a closed-form Moreau envelope, or the linear operators $A$ and/or $B$ have a large spectral norm (in which case the proximal subproblems with a positive definite proximal term are bad surrogates for those of the ADMM), so that it is impossible or too expensive to obtain the exact solutions of the proximal subproblems even though they are unique. Such separable convex optimization problems arise directly from the constrained total-variation superresolution image reconstruction problems [4, 24] in image processing, and from the two-block regrouping of three- or four-block separable convex minimization problems. Indeed, consider the following four-block separable convex minimization problem

$$\min_{x_i\in\mathbb{X}_i}\ \sum_{i=1}^{4}f_i(x_i)\quad\text{s.t.}\quad\sum_{i=1}^{4}A_i^*x_i=c, \tag{4}$$

where $f_i$ for $i=1,2,3,4$ are closed proper convex functions, and $A_i$ for $i=1,2,3,4$ are linear operators. Since the directly extended multi-block ADMM does not have a convergence guarantee (see the counterexamples in [3]), one may rearrange (4) into the form of (1) by merging any two groups of variables into one group, and then apply the classical ADMM to the two-block regrouped problem. Clearly, the exact solution of each subproblem of the ADMM for the two-block regrouped problem is difficult to obtain due to the coupling of two classes of variables. The solution of multi-block separable convex optimization problems via two-block regrouping is also of independent interest.
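As an illustration of the regrouping (the particular pairing of blocks is an assumption made here for concreteness; any other pairing works the same way), problem (4) takes the form of (1) with

$$x:=(x_1,x_2),\quad y:=(x_3,x_4),\quad f(x):=f_1(x_1)+f_2(x_2),\quad g(y):=f_3(x_3)+f_4(x_4),$$

$$A^*x:=A_1^*x_1+A_2^*x_2,\qquad B^*y:=A_3^*x_3+A_4^*x_4.$$

The $x$-subproblem of the ADMM then contains the quadratic term $\frac{\sigma}{2}\|A_1^*x_1+A_2^*x_2+B^*y^k-c\|^2$, whose cross term $\sigma\langle A_1^*x_1,A_2^*x_2\rangle$ couples $x_1$ and $x_2$, so the subproblem generally does not split into independent minimizations over $x_1$ and $x_2$.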

To resolve this class of difficult two-block separable convex minimization problems, we propose an inexact indefinite proximal ADMM with a step-size $\tau$, in which the proximal subproblems are solved to a certain accuracy, with two easily implementable inexactness criteria to control the accuracy. Here, an indefinite proximal term, instead of a positive definite proximal term, is introduced into each subproblem of the ADMM to guarantee that each proximal subproblem has a unique solution and is a good surrogate for the original subproblem of the ADMM. For the proposed inexact indefinite proximal ADMM, we establish its convergence under a mild assumption on the indefinite proximal terms. To the best of our knowledge, this is the first convergent inexact proximal ADMM in which the step-size $\tau$ may take values in an interval rather than only the unit value. We notice that the few existing research papers on inexact versions of the ADMM all focus on the unit step-size; see [8, 15, 24, 13, 5]. Moreover, only the papers [24, 13, 5] develop truly implementable inexactness criteria in which the exact solutions are not required. Our inexact indefinite proximal ADMM uses the same absolute error criterion as [24] and a slightly different relative error criterion from the one used in [24]. It is well known that the ADMM with a step-size larger than $1$ requires fewer iterations than the one with the unit step-size, especially for difficult SDP problems [33]. Thus, the proposed inexact indefinite proximal ADMMs with a large step-size are expected to have better performance.

In this work, we apply the inexact indefinite proximal ADMMs to the three- and four-block separable convex minimization problems with linear constraints that arise as the dual problems of doubly nonnegative semidefinite programming (DNNSDP) problems with many linear equality and/or inequality constraints. Specifically, we solve the two-block regrouping of the dual problems of DNNSDPs with the inexact indefinite proximal ADMM. Observe that the iterates generated by solving each subproblem in an alternating way can satisfy the optimality condition approximately. Hence, in the implementation of the inexact indefinite proximal ADMMs, we obtain the inexact solution of each subproblem by minimizing over the two groups of variables alternately. Numerical results indicate that the inexact indefinite proximal ADMM with the absolute error criterion is comparable with the directly extended multi-block ADMM in terms of both the number of iterations and the computation time, while the one with the relative error criterion requires fewer outer iterations but more computation time, since that error criterion is more restrictive and requires more inner iterations. Thus, the inexact indefinite proximal ADMM with the absolute error criterion provides an efficient tool for handling the three- and four-block separable convex minimization problems.

We observe that several recent works [37, 18, 19, 10] regroup the multi-block separable convex minimization problems into two blocks or several subblocks, and then solve the subproblems within each subblock simultaneously by introducing a positive definite proximal term related to the number of subproblems. Such procedures lead to easily solvable subproblems, but their performance becomes worse due to the larger proximal terms.

The rest of this paper is organized as follows. Section 2 gives some notations and the main assumption. Section 3 describes the inexact indefinite proximal ADMMs and analyzes the properties of the sequence generated. The convergence of the inexact indefinite proximal ADMMs is established in Section 4. Section 5 applies the inexact indefinite proximal ADMMs to the dual problems of DNNSDPs with many linear equality and/or inequality constraints. Some concluding remarks are given in Section 6.

## 2 Notations and assumption

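Notice that the functions $f$ and $g$ are closed proper convex, and the subdifferential mappings of closed proper convex functions are maximal monotone [26, Theorem 12.17]. Hence, there exist self-adjoint positive semidefinite operators $\Sigma_f$ and $\Sigma_g$ such that for all $x,\tilde{x}\in\operatorname{dom}f$, $u\in\partial f(x)$ and $\tilde{u}\in\partial f(\tilde{x})$,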

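$$f(x)\ge f(\tilde{x})+\langle\tilde{u},x-\tilde{x}\rangle+\tfrac12\|x-\tilde{x}\|_{\Sigma_f}^2\quad\text{and}\quad\langle u-\tilde{u},x-\tilde{x}\rangle\ge\|x-\tilde{x}\|_{\Sigma_f}^2; \tag{5}$$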

and for all $y,\tilde{y}\in\operatorname{dom}g$, $v\in\partial g(y)$ and $\tilde{v}\in\partial g(\tilde{y})$,

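$$g(y)\ge g(\tilde{y})+\langle\tilde{v},y-\tilde{y}\rangle+\tfrac12\|y-\tilde{y}\|_{\Sigma_g}^2\quad\text{and}\quad\langle v-\tilde{v},y-\tilde{y}\rangle\ge\|y-\tilde{y}\|_{\Sigma_g}^2. \tag{6}$$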

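For a self-adjoint linear operator $T\colon\mathbb{X}\to\mathbb{X}$, the notation $T\succeq0$ (respectively, $T\succ0$) means that $T$ is positive semidefinite (respectively, positive definite), that is, $\langle x,Tx\rangle\ge0$ for all $x\in\mathbb{X}$ (respectively, $\langle x,Tx\rangle>0$ for all $x\ne0$). Given a self-adjoint positive semidefinite linear operator $T$, we denote by $\|\cdot\|_T$ the norm induced by $T$, i.e.,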

$$\|x\|_T:=\sqrt{\langle x,Tx\rangle}\quad\forall x\in\mathbb{X}.$$

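Given a self-adjoint positive definite linear operator $T$, we denote by $\lambda_{\max}(T)$ and $\lambda_{\min}(T)$ the largest eigenvalue and the smallest eigenvalue of $T$, respectively, and by $\operatorname{dist}_T(x,C)$ the distance induced by $T$ from $x$ to a closed set $C$, that is, $\operatorname{dist}_T(x,C):=\inf_{z\in C}\|z-x\|_T$. When $T$ is the identity operator, we suppress the subscript $T$ in $\operatorname{dist}_T(x,C)$ and write simply $\operatorname{dist}(x,C)$. Clearly, for any self-adjoint positive definite linear operator $T$ and any $\gamma>0$,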

$$2|\langle u,v\rangle|\le\gamma^{-1}\|u\|_T^2+\gamma\|v\|_{T^{-1}}^2\quad\forall u,v\in\mathbb{X}. \tag{7}$$
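For completeness, one standard way to see (7) (a sketch, using only that $T$ is self-adjoint and positive definite, so that $T^{1/2}$ and $T^{-1/2}$ exist) combines the Cauchy-Schwarz inequality with the elementary bound $2ab\le\gamma^{-1}a^2+\gamma b^2$ for $a,b\ge0$:

$$2|\langle u,v\rangle|=2\big|\langle T^{1/2}u,\,T^{-1/2}v\rangle\big|\le2\|u\|_T\,\|v\|_{T^{-1}}\le\gamma^{-1}\|u\|_T^2+\gamma\|v\|_{T^{-1}}^2.$$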

In addition, for any $u,v\in\mathbb{X}$ and any self-adjoint linear operator $T$, the following two identities will be frequently used in the subsequent analysis:

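$$\begin{aligned}
2\langle u,Tv\rangle&=\langle u,Tu\rangle+\langle v,Tv\rangle-\langle u-v,T(u-v)\rangle\\
&=\langle u+v,T(u+v)\rangle-\langle u,Tu\rangle-\langle v,Tv\rangle.
\end{aligned}\tag{8}$$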

Throughout this paper, we make the following assumption for problem (1):

###### Assumption 2.1

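Problem (1) has an optimal solution, say $(x^*,y^*)$, and there exists a point $(\hat{x},\hat{y})\in\operatorname{ri}(\operatorname{dom}f\times\operatorname{dom}g)$ such that $A^*\hat{x}+B^*\hat{y}=c$.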

Under Assumption 2.1, from [27, Corollary 28.2.2 & 28.3.1] and [27, Theorem 6.5 & 23.8], it follows that there exists a Lagrange multiplier $z^*\in\mathbb{Z}$ such that

$$-Az^*\in\partial f(x^*),\quad-Bz^*\in\partial g(y^*)\quad\text{and}\quad A^*x^*+B^*y^*-c=0, \tag{9}$$

where $\partial f$ and $\partial g$ are the subdifferential mappings of $f$ and $g$, respectively. Moreover, any $z^*$ satisfying (9) is an optimal solution to the dual problem of (1). In the sequel, we call $(x^*,y^*,z^*)$ a primal-dual solution pair of problem (1).
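As a small worked instance of (9) (a toy example chosen here for illustration, not one from the paper), take $\mathbb{X}=\mathbb{Y}=\mathbb{Z}=\mathbb{R}$, $f(x)=\frac12(x-a)^2$, $g(y)=\frac12(y-b)^2$, and let $A$ and $B$ be the identity. Then $\partial f(x)=\{x-a\}$ and $\partial g(y)=\{y-b\}$, so (9) reads

$$-z^*=x^*-a,\qquad-z^*=y^*-b,\qquad x^*+y^*=c,$$

which gives

$$z^*=\frac{a+b-c}{2},\qquad x^*=\frac{a-b+c}{2},\qquad y^*=\frac{b-a+c}{2}.$$

For example, $a=1$, $b=2$, $c=5$ yields $(x^*,y^*,z^*)=(2,3,-1)$.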

## 3 Inexact indefinite proximal ADMMs

In this section, we describe the iteration steps of the inexact indefinite proximal ADMMs for solving problem (1), and then analyze the properties of the sequence generated.

The iteration steps of our inexact indefinite proximal ADMMs are stated as follows.

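IEIDP-ADMM (Inexact indefinite proximal ADMM for (1))

(S.0) Let $\sigma>0$ and the step-size $\tau>0$ be given. Choose self-adjoint linear operators $P_f$ and $P_g$ such that the proximal subproblems in (S.1) and (S.2) have unique solutions. Choose an initial point $(x^0,y^0,z^0)$. Set $k:=0$.

(S.1) Find $x^{k+1}\approx\operatorname*{arg\,min}_{x\in\mathbb{X}}\ \phi_k(x):=L_\sigma(x,y^k,z^k)+\frac12\|x-x^k\|_{P_f}^2$.

(S.2) Find $y^{k+1}\approx\operatorname*{arg\,min}_{y\in\mathbb{Y}}\ \psi_k(y):=L_\sigma(x^{k+1},y,z^k)+\frac12\|y-y^k\|_{P_g}^2$.

(S.3) Update the Lagrange multiplier via the formula $z^{k+1}=z^k+\tau\sigma\,(A^*x^{k+1}+B^*y^{k+1}-c)$.

(S.4) Let $k\leftarrow k+1$, and go to Step (S.1).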

The approximate optimality in (S.1) and (S.2) is measured by the following criteria:

• and ;

• , and , where and are self-adjoint positive definite linear operators with and ;

• , and , where and are same as the one in (C2).

Notice that (C1) is an absolute error criterion, while (C2) and (C2’) are relative error criteria. Clearly, when the approximate optimality of $x^{k+1}$ and $y^{k+1}$ is measured by (C1), (S.1) and (S.2) are equivalent to finding $(x^{k+1},\xi^{k+1})$ and $(y^{k+1},\eta^{k+1})$ such that

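$$\begin{cases}
\xi^{k+1}\in\partial\phi_k(x^{k+1}),\ \ \|\xi^{k+1}\|\le\mu_{k+1}\ \ \text{with}\ \ \sum_{k=0}^{\infty}\mu_{k+1}<\infty,\\[2pt]
\eta^{k+1}\in\partial\psi_k(y^{k+1}),\ \ \|\eta^{k+1}\|\le\nu_{k+1}\ \ \text{with}\ \ \sum_{k=0}^{\infty}\nu_{k+1}<\infty.
\end{cases}\tag{10}$$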
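The following sketch shows how steps (S.1)-(S.4) with the absolute error test (10) might look on a toy scalar problem. Everything specific in it is an assumption made for illustration and is not the authors' implementation: the problem $\min\frac12(x-a)^2+\frac12(y-b)^2$ s.t. $x+y=c$, the damped gradient inner solver `inexact_min`, the tolerances $\mu_{k+1}=\nu_{k+1}=1/(k+1)^2$, and the negative scalar proximal weights `p_f` and `p_g`, which are merely chosen small enough that each subproblem stays strongly convex.

```python
def inexact_min(grad, curvature, start, tol):
    """Damped gradient steps on a 1-D quadratic until |grad(x)| <= tol.

    Each step halves the gradient, so the loop stops as soon as the
    absolute error test is met instead of solving the subproblem exactly.
    """
    x = start
    while abs(grad(x)) > tol:
        x -= 0.5 * grad(x) / curvature
    return x


def ieidp_admm_toy(a=1.0, b=2.0, c=5.0, sigma=1.0, tau=1.0,
                   p_f=-0.25, p_g=-0.25, iters=500):
    """Inexact indefinite proximal ADMM sketch on a toy scalar problem."""
    x, y, z = 0.0, 0.0, 0.0
    for k in range(iters):
        tol = 1.0 / (k + 1) ** 2          # summable tolerances, as in (10)
        xk, yk = x, y
        # (S.1): approximate minimizer of phi_k, warm-started at x^k.
        grad_phi = lambda t: (t - a) + z + sigma * (t + yk - c) + p_f * (t - xk)
        x = inexact_min(grad_phi, 1.0 + sigma + p_f, xk, tol)
        # (S.2): approximate minimizer of psi_k, using the new x.
        grad_psi = lambda t: (t - b) + z + sigma * (x + t - c) + p_g * (t - yk)
        y = inexact_min(grad_psi, 1.0 + sigma + p_g, yk, tol)
        # (S.3): multiplier update.
        z = z + tau * sigma * (x + y - c)
    return x, y, z


x, y, z = ieidp_admm_toy()
print(x, y, z)  # close to (2, 3, -1), up to the inner tolerance
```

In this sketch the residual of each inner solve plays the role of $\xi^{k+1}$ or $\eta^{k+1}$, and the outer loop never needs an exact subproblem solution. A relative test in the spirit of (C2) would instead compare the residual with the size of the step $x^{k+1}-x^k$.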

If the approximate optimality of $x^{k+1}$ and $y^{k+1}$ is measured by (C2) or (C2’), (S.1) and (S.2) are equivalent to finding $(x^{k+1},\xi^{k+1})$ and $(y^{k+1},\eta^{k+1})$ such that, with $p=1$ or $p=2$,

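$$\begin{cases}
\xi^{k+1}\in\partial\phi_k(x^{k+1}),\ \ \|\xi^{k+1}\|_F\le\mu_{k+1}\|x^{k+1}-x^k\|_{T_f}\ \ \text{with}\ \ \sum_{k=0}^{\infty}\mu_{k+1}^p<\infty,\\[2pt]
\eta^{k+1}\in\partial\psi_k(y^{k+1}),\ \ \|\eta^{k+1}\|_G\le\nu_{k+1}\|y^{k+1}-y^k\|_{T_g}\ \ \text{with}\ \ \sum_{k=0}^{\infty}\nu_{k+1}^p<\infty.
\end{cases}\tag{11}$$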
###### Remark 3.1

(a) When the proximal operators and are chosen as for a constant and the step-size is set to be , the IEIDP-ADMM with (C1) reduces to the IADM1 in [24]. If, in addition, taking , the IEIDP-ADMM with (C2’) requires

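$$\begin{cases}
\xi^{k+1}\in\partial\phi_k(x^{k+1}),\ \ \|\xi^{k+1}\|\le\mu_{k+1}\sqrt{\beta}\,\|x^{k+1}-x^k\|_{\sigma AA^*+\beta I}\ \ \text{with}\ \ \sum_{k=0}^{\infty}\mu_{k+1}^2<\infty,\\[2pt]
\eta^{k+1}\in\partial\psi_k(y^{k+1}),\ \ \|\eta^{k+1}\|\le\nu_{k+1}\sqrt{\beta}\,\|y^{k+1}-y^k\|_{\sigma BB^*+\beta I}\ \ \text{with}\ \ \sum_{k=0}^{\infty}\nu_{k+1}^2<\infty,
\end{cases}$$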

whereas the LADM2 in [24] is actually requiring that and satisfy

Since and , the above inexact criterion (C2’) is looser than Criterion 2 used in [24].

(b) When and are chosen to be self-adjoint positive semidefinite operators, the IEIDP-ADMMs with reduce to the semi-proximal ADMM in [38, 9].

(c) For the self-adjoint positive definite linear operators and in (C2) and (C2’), an immediate choice is and . Since and are easy to estimate, such a choice is convenient for the numerical implementation.

Next we study the properties of the sequence generated by the IEIDP-ADMMs. For convenience, we let $h(x,y):=A^*x+B^*y-c$ for $(x,y)\in\mathbb{X}\times\mathbb{Y}$, and for each $k$ write

$$x_e^k:=x^k-x^*,\quad y_e^k:=y^k-y^*,\quad z_e^k:=z^k-z^*;\qquad\Delta y^k:=y^k-y^{k-1},\quad\Delta x^k:=x^k-x^{k-1},\quad\Delta z^k:=z^k-z^{k-1}.$$

Using these notations and noting that $A^*x^*+B^*y^*-c=0$, we can rewrite Step (S.3) as

$$z^k=z^{k+1}-\tau\sigma\,h(x^{k+1},y^{k+1})=z^{k+1}-\tau\sigma\,(A^*x_e^{k+1}+B^*y_e^{k+1}). \tag{12}$$
###### Lemma 3.1

Let $\{(x^k,y^k,z^k)\}$ be the sequence generated by the IEIDP-ADMMs with $\xi^{k+1}$ and $\eta^{k+1}$ satisfying equation (10) or (11). Suppose that Assumption 2.1 holds and the operator $P_g$ also satisfies $P_g\succeq-\frac38\Sigma_g$. Then, for all $k\ge1$ we have

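$$\begin{aligned}
&(2-\tau)\sigma\|h(x^{k+1},y^{k+1})\|^2+(\tau\sigma)^{-1}\big(\|z_e^{k+1}\|^2-\|z_e^k\|^2\big)+\|y_e^{k+1}\|_{T_g}^2-\|y_e^k\|_{T_g}^2\\
&\quad+\|x_e^{k+1}\|_{P_f+\Sigma_f}^2-\|x_e^k\|_{P_f+\Sigma_f}^2+\|\Delta y^{k+1}\|_{P_g+\frac34\Sigma_g}^2-\|\Delta y^k\|_{P_g+\frac34\Sigma_g}^2\\
&\le2(1-\tau)\sigma\langle h(x^k,y^k),B^*\Delta y^{k+1}\rangle+r_{k+1}-\|\Delta x^{k+1}\|_{P_f+\frac12\Sigma_f}^2-\|\Delta y^{k+1}\|_{T_g}^2
\end{aligned}$$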

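where $r_{k+1}:=2\langle x_e^{k+1},\xi^{k+1}\rangle+2\langle y_e^{k+1},\eta^{k+1}\rangle+2\langle\eta^{k+1}-\eta^k,\Delta y^{k+1}\rangle$.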

Proof: From the expressions of $\phi_k$ and $\psi_k$ and equations (10) and (11), it follows that

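$$\begin{aligned}
\xi^{k+1}-Az^k-\sigma A(A^*x^{k+1}+B^*y^k-c)-P_f\Delta x^{k+1}&\in\partial f(x^{k+1}), &&\text{(13)}\\
\eta^{k+1}-Bz^k-\sigma B(A^*x^{k+1}+B^*y^{k+1}-c)-P_g\Delta y^{k+1}&\in\partial g(y^{k+1}). &&\text{(14)}
\end{aligned}$$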

Substituting the first identity in (12) into equations (13) and (14) respectively yields

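$$\begin{aligned}
(\tau-1)\sigma Ah(x^{k+1},y^{k+1})-Az^{k+1}+\sigma AB^*\Delta y^{k+1}-P_f\Delta x^{k+1}+\xi^{k+1}&\in\partial f(x^{k+1}),\\
(\tau-1)\sigma Bh(x^{k+1},y^{k+1})-Bz^{k+1}-P_g\Delta y^{k+1}+\eta^{k+1}&\in\partial g(y^{k+1}).
\end{aligned}$$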

In view of inequalities (5) and (6), from the last two inclusions and equation (9) we have

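$$\begin{aligned}
\big\langle x_e^{k+1},\,(\tau-1)\sigma Ah(x^{k+1},y^{k+1})-Az_e^{k+1}+\sigma AB^*\Delta y^{k+1}-P_f\Delta x^{k+1}+\xi^{k+1}\big\rangle&\ge\|x_e^{k+1}\|_{\Sigma_f}^2,\\
\big\langle y_e^{k+1},\,(\tau-1)\sigma Bh(x^{k+1},y^{k+1})-Bz_e^{k+1}-P_g\Delta y^{k+1}+\eta^{k+1}\big\rangle&\ge\|y_e^{k+1}\|_{\Sigma_g}^2.
\end{aligned}$$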

Adding the last two inequalities together and using equation (12) yields that

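$$\begin{aligned}
&(\tau-1)\sigma\|h(x^{k+1},y^{k+1})\|^2-(\tau\sigma)^{-1}\langle\Delta z^{k+1},z_e^{k+1}\rangle+\sigma\langle h(x^{k+1},y^{k+1}),B^*\Delta y^{k+1}\rangle\\
&\quad-\langle x_e^{k+1},P_f\Delta x^{k+1}-\xi^{k+1}\rangle-\langle y_e^{k+1},(P_g+\sigma BB^*)\Delta y^{k+1}-\eta^{k+1}\rangle\ \ge\ \|x_e^{k+1}\|_{\Sigma_f}^2+\|y_e^{k+1}\|_{\Sigma_g}^2.
\end{aligned}\tag{15}$$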

Next we deal with the term $\sigma\langle h(x^{k+1},y^{k+1}),B^*\Delta y^{k+1}\rangle$ in inequality (15). Notice that

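$$\begin{aligned}
\sigma\langle h(x^{k+1},y^{k+1}),B^*\Delta y^{k+1}\rangle&=(1-\tau)\sigma\langle h(x^{k+1},y^{k+1})-h(x^k,y^k),B^*\Delta y^{k+1}\rangle\\
&\quad+\langle\Delta z^{k+1},B^*\Delta y^{k+1}\rangle+(1-\tau)\sigma\langle h(x^k,y^k),B^*\Delta y^{k+1}\rangle.
\end{aligned}\tag{16}$$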

We first bound the first two terms on the right-hand side of (16). From equations (14) and (12), it follows that

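$$\begin{aligned}
-Bz^{k+1}+(\tau-1)\sigma Bh(x^{k+1},y^{k+1})-P_g\Delta y^{k+1}+\eta^{k+1}&\in\partial g(y^{k+1}),\\
-Bz^k+(\tau-1)\sigma Bh(x^k,y^k)-P_g\Delta y^k+\eta^k&\in\partial g(y^k).
\end{aligned}$$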

Combining the last two inclusions with the second inequality in (6) yields that

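$$\begin{aligned}
&(\tau-1)\sigma\langle h(x^{k+1},y^{k+1})-h(x^k,y^k),B^*\Delta y^{k+1}\rangle-\langle\Delta z^{k+1},B^*\Delta y^{k+1}\rangle\\
&\quad-\langle\Delta y^{k+1}-\Delta y^k,P_g\Delta y^{k+1}\rangle+\langle\eta^{k+1}-\eta^k,\Delta y^{k+1}\rangle\ \ge\ \|\Delta y^{k+1}\|_{\Sigma_g}^2.
\end{aligned}\tag{17}$$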

Using equation (8) and the given assumption $P_g\succeq-\frac38\Sigma_g$, we have that

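$$\begin{aligned}
\langle\Delta y^k-\Delta y^{k+1},P_g\Delta y^{k+1}\rangle&=\tfrac12\|y^{k+1}-y^{k-1}\|_{P_g}^2-\tfrac12\|\Delta y^k\|_{P_g}^2-\tfrac12\|\Delta y^{k+1}\|_{P_g}^2-\|\Delta y^{k+1}\|_{P_g}^2\\
&\le\tfrac12\|y^{k+1}-y^{k-1}\|_{P_g+\frac38\Sigma_g}^2-\tfrac12\|\Delta y^k\|_{P_g}^2-\tfrac32\|\Delta y^{k+1}\|_{P_g}^2\\
&\le\tfrac12\|\Delta y^k\|_{P_g+\frac34\Sigma_g}^2-\tfrac12\|\Delta y^{k+1}\|_{P_g+\frac34\Sigma_g}^2+\tfrac34\|\Delta y^{k+1}\|_{\Sigma_g}^2
\end{aligned}\tag{18}$$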

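where the last inequality uses $\frac12\|y^{k+1}-y^{k-1}\|_{P_g+\frac38\Sigma_g}^2\le\|\Delta y^k\|_{P_g+\frac38\Sigma_g}^2+\|\Delta y^{k+1}\|_{P_g+\frac38\Sigma_g}^2$, which holds because $P_g+\frac38\Sigma_g\succeq0$ and $y^{k+1}-y^{k-1}=\Delta y^k+\Delta y^{k+1}$. Combining inequalities (17) and (18) with equation (16), we immediately obtain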

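$$\begin{aligned}
\sigma\langle h(x^{k+1},y^{k+1}),B^*\Delta y^{k+1}\rangle&\le(1-\tau)\sigma\langle h(x^k,y^k),B^*\Delta y^{k+1}\rangle+\langle\eta^{k+1}-\eta^k,\Delta y^{k+1}\rangle\\
&\quad+\tfrac12\|\Delta y^k\|_{P_g+\frac34\Sigma_g}^2-\tfrac12\|\Delta y^{k+1}\|_{P_g+\frac34\Sigma_g}^2-\tfrac14\|\Delta y^{k+1}\|_{\Sigma_g}^2.
\end{aligned}\tag{19}$$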

Now substituting inequality (19) into inequality (15), we immediately obtain that

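$$\begin{aligned}
&(\tau-1)\sigma\|h(x^{k+1},y^{k+1})\|^2-(\tau\sigma)^{-1}\langle\Delta z^{k+1},z_e^{k+1}\rangle-\langle x_e^{k+1},P_f\Delta x^{k+1}\rangle\\
&\quad+(1-\tau)\sigma\langle h(x^k,y^k),B^*\Delta y^{k+1}\rangle+\tfrac12\|\Delta y^k\|_{P_g+\frac34\Sigma_g}^2-\tfrac12\|\Delta y^{k+1}\|_{P_g+\frac34\Sigma_g}^2\\
&\quad-\langle y_e^{k+1},(P_g+\sigma BB^*)\Delta y^{k+1}\rangle+\tfrac12r_{k+1}\ \ge\ \|x_e^{k+1}\|_{\Sigma_f}^2+\|y_e^{k+1}\|_{\Sigma_g}^2+\tfrac14\|\Delta y^{k+1}\|_{\Sigma_g}^2.
\end{aligned}\tag{20}$$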

By the first equality of (8) and equation (12), the term $\langle\Delta z^{k+1},z_e^{k+1}\rangle$ can be written as

$$\langle\Delta z^{k+1},z_e^{k+1}\rangle=\tfrac12\|z_e^{k+1}\|^2-\tfrac12\|z_e^k\|^2+\tfrac{(\tau\sigma)^2}{2}\|h(x^{k+1},y^{k+1})\|^2.$$

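Applying equation (8) to $\langle x_e^{k+1},P_f\Delta x^{k+1}\rangle$ and $\langle y_e^{k+1},(P_g+\sigma BB^*)\Delta y^{k+1}\rangle$ yields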

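$$\begin{aligned}
\langle x_e^{k+1},P_f\Delta x^{k+1}\rangle&=\tfrac12\|x_e^{k+1}\|_{P_f}^2-\tfrac12\|x_e^k\|_{P_f}^2+\tfrac12\|\Delta x^{k+1}\|_{P_f}^2,\\
\langle y_e^{k+1},(P_g+\sigma BB^*)\Delta y^{k+1}\rangle&=\tfrac12\|y_e^{k+1}\|_{P_g+\sigma BB^*}^2-\tfrac12\|y_e^k\|_{P_g+\sigma BB^*}^2+\tfrac12\|\Delta y^{k+1}\|_{P_g+\sigma BB^*}^2.
\end{aligned}$$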

Substituting the last three equalities into inequality (20), we have that

 (τ−2)σ∥h(xk+1,yk+1)∥2+(τσ)−1(∥zke∥2−∥zk+1e∥2)+(∥yke∥2Tg−∥yk+1e∥2Tg) +(∥xke∥2Pf+