Implementation of Interior-point Methods for LP based on Krylov Subspace Iterative Solvers with Inner-iteration Preconditioning

Yiran Cui, Department of Computer Science, University College London, Gower Street, London WC1E 6BT, United Kingdom. y.cui.12@ucl.ac.uk

Keiichi Morikuni, Division of Information Engineering, University of Tsukuba, Tenoudai 1-1-1, Tsukuba, Ibaraki 305-8573, Japan. The author was supported in part by JSPS KAKENHI Grant Number 16K17639. morikuni@cs.tsukuba.ac.jp

Takashi Tsuchiya, National Graduate Institute for Policy Studies, 7-22-1 Roppongi, Minato, Tokyo 106-8677, Japan. The author was supported in part by JSPS KAKENHI Grant Number 15H02968. tsuchiya@grips.ac.jp

Ken Hayami, National Institute of Informatics, SOKENDAI (The Graduate University for Advanced Studies), 2-1-2 Hitotsubashi, Chiyoda, Tokyo 101-0003, Japan. The author was supported in part by JSPS KAKENHI Grant Number 15K04768. hayami@nii.ac.jp
Abstract

We apply novel inner-iteration preconditioned Krylov subspace methods to the interior-point algorithm for linear programming (LP). Inner-iteration preconditioners recently proposed by Morikuni and Hayami enable us to overcome the severe ill-conditioning of linear equations solved in the final phase of interior-point iterations. The Krylov subspace methods do not suffer from rank-deficiency and therefore no preprocessing is necessary even if rows of the constraint matrix are not linearly independent. Extensive numerical experiments are conducted over diverse instances of 125 LP problems including the Netlib, QAPLIB, and Mittelmann collections. The largest problem has 434,580 variables. It turns out that our implementation is more robust than the standard public domain solvers SeDuMi (Self-Dual Minimization) and SDPT3 (Semidefinite Programming Toh-Todd-Tütüncü) without increasing CPU time. As far as we know, this is the first time an interior-point method based on iterative solvers succeeds in solving a fairly large number of LP instances from benchmark libraries under the standard stopping criteria.

1 Introduction

Consider the linear programming (LP) problem in the standard primal-dual formulation

(1.1a)   $\min_{x} \; c^{\mathsf T} x \quad \text{subject to} \quad Ax = b, \; x \ge 0,$
(1.1b)   $\max_{y} \; b^{\mathsf T} y \quad \text{subject to} \quad A^{\mathsf T} y + s = c, \; s \ge 0,$

where $A \in \mathbb{R}^{m \times n}$, $b, y \in \mathbb{R}^{m}$, $c, x, s \in \mathbb{R}^{n}$, and we assume the existence of an optimal solution. In this paper, we describe an implementation of the interior-point method for LP based on iterative solvers. The main computational task in one iteration of the interior-point method is the solution of a system of linear equations to compute the search direction. Although there are two known approaches for this, i.e., direct and iterative methods, direct methods have been the primary choice so far, and there are very few implementations relying solely on an iterative method, e.g., [7]. This is because the linear system becomes notoriously ill-conditioned toward the end of the interior-point iterations, and no iterative solver has managed to resolve this difficulty.

Here, we apply novel inner-iteration preconditioned Krylov subspace methods for least squares problems. The inner-iteration preconditioners recently proposed by Morikuni and Hayami [48, 49] enable us to deal with the severe ill-conditioning of the system of linear equations. Furthermore, the proposed Krylov subspace methods do not suffer from singularity and therefore no preprocessing is necessary even if $A$ is rank-deficient. This is another advantage of our approach over the direct solvers.

Extensive numerical experiments were conducted over diverse instances of 125 LP problems taken from the Netlib, QAPLIB, and Mittelmann benchmark collections. The largest problem has 434,580 variables. Our implementation proves to be more robust than the public domain solvers SeDuMi (Self-Dual Minimization) [57] and SDPT3 (Semidefinite Programming Toh-Todd-Tütüncü) [59, 60] without increasing CPU time. As far as the authors know, this is the first time an interior-point method entirely based on iterative solvers succeeds in solving a fairly large number of standard LP instances from the benchmark libraries with standard stopping criteria. Our implementation is considerably slower than the interior-point solver of MOSEK [50], one of the state-of-the-art commercial solvers, though it is competitive in robustness. On the other hand, we observed that our implementation is able to solve ill-conditioned dense problems with severe rank-deficiency which the MOSEK solver cannot solve.

We emphasize that there are many interesting topics for further work building on this paper. There is still room for improvement in the iterative solvers as well as in using more sophisticated methods for the interior-point iterations.

In the following, we introduce the interior-point method and review the iterative solvers previously used. We employ an infeasible primal-dual predictor-corrector interior-point method, one of the methods that evolved from the original primal-dual interior-point method [58, 35, 43, 61] incorporating several innovative ideas, e.g., [63, 39].

The optimal solution to problem (1.1) must satisfy the Karush-Kuhn-Tucker (KKT) conditions

(1.2a)   $A^{\mathsf T} y + s = c,$
(1.2b)   $Ax = b,$
(1.2c)   $XSe = 0,$
(1.2d)   $x \ge 0, \quad s \ge 0,$

where $X = \operatorname{diag}(x_1, x_2, \dots, x_n)$, $S = \operatorname{diag}(s_1, s_2, \dots, s_n)$, and $e = [1, 1, \dots, 1]^{\mathsf T}$. The complementarity condition (1.2c) implies that at the optimal point, one of the elements $x_i$ or $s_i$ must be zero for $i = 1, 2, \dots, n$.

The following system is obtained by relaxing (1.2c) to $XSe = \mu e$ with $\mu > 0$:

(1.3)   $A^{\mathsf T} y + s = c, \quad Ax = b, \quad XSe = \mu e, \quad x > 0, \quad s > 0.$

The interior-point method solves the problem (1.1) by generating approximate solutions to (1.3), with $\mu$ decreasing toward zero, so that (1.2) is satisfied within some tolerance level at the solution point. The search direction at each infeasible interior-point step is obtained by solving the Newton equations

(1.4)   $\begin{bmatrix} 0 & A^{\mathsf T} & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \\ \Delta s \end{bmatrix} = \begin{bmatrix} r_{\mathrm d} \\ r_{\mathrm p} \\ -XSe + \sigma\mu e \end{bmatrix},$

where $r_{\mathrm d} = c - A^{\mathsf T} y - s$ is the residual of the dual problem, $r_{\mathrm p} = b - Ax$ is the residual of the primal problem, $\mu = x^{\mathsf T} s / n$ is the duality measure, and $\sigma \in [0, 1]$ is the centering parameter, which is dynamically chosen to govern the progress of the interior-point method. Once the $k$th iterate $(x^{(k)}, y^{(k)}, s^{(k)})$ is given and (1.4) is solved, we define the next iterate as $(x^{(k+1)}, y^{(k+1)}, s^{(k+1)}) = (x^{(k)}, y^{(k)}, s^{(k)}) + \alpha (\Delta x, \Delta y, \Delta s)$, where $\alpha$ is a step length to ensure the positivity of $x^{(k+1)}$ and $s^{(k+1)}$, and then reduce $\mu$ before solving (1.4) again.

At each iteration, the solution of (1.4) dominates the total CPU time. The choice of linear solvers depends on the way of arranging the matrix of (1.4). Aside from solving the system (1.4), one can solve its reduced equivalent form of size $(m+n) \times (m+n)$

(1.5)   $\begin{bmatrix} -X^{-1}S & A^{\mathsf T} \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = \begin{bmatrix} r_{\mathrm d} - X^{-1}(\sigma\mu e - XSe) \\ r_{\mathrm p} \end{bmatrix},$

or a more condensed equivalent form of size $m \times m$

(1.6)   $A S^{-1} X A^{\mathsf T} \Delta y = r_{\mathrm p} + A S^{-1} X \left( r_{\mathrm d} - X^{-1}(\sigma\mu e - XSe) \right),$

both of which are obtained by performing block Gaussian eliminations on (1.4). We are concerned in this paper with solving the third equivalent form (1.6).
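To make the block elimination concrete, the following MATLAB sketch (illustrative only, not the implementation described in this paper) builds the Newton system (1.4) for a tiny random instance and checks that the $\Delta y$ obtained from the condensed form (1.6), with the matrix formed as $A\,\operatorname{diag}(x./s)\,A^{\mathsf T}$, agrees with the one obtained from (1.4). All names and sizes are made up for the example, and dense backslash solves stand in for the solvers discussed later.

% Illustrative MATLAB sketch: verify that block Gaussian elimination of the
% Newton system (1.4) leads to the condensed normal equations (1.6).
rng(1);
m = 4; n = 7;
A  = randn(m, n);                 % constraint matrix (full row rank here)
x  = rand(n, 1) + 0.5;            % current primal iterate, x > 0
s  = rand(n, 1) + 0.5;            % current dual slack, s > 0
y  = randn(m, 1);
b  = randn(m, 1);  c = randn(n, 1);
rp = b - A*x;                     % primal residual
rd = c - A'*y - s;                % dual residual
mu = (x'*s)/n;  sigma = 0.1;      % duality measure and centering parameter
e  = ones(n, 1);

% Full Newton system (1.4), unknowns ordered as [dx; dy; ds]
K   = [zeros(n)  A'          eye(n);
       A         zeros(m)    zeros(m, n);
       diag(s)   zeros(n, m) diag(x)];
rhs = [rd; rp; -x.*s + sigma*mu*e];
d   = K \ rhs;
dy_full = d(n+1 : n+m);

% Condensed normal equations (1.6): A*diag(x./s)*A' * dy = g
D2 = x ./ s;                      % diagonal of S^{-1}X
M  = A * (D2 .* A');              % A*diag(D2)*A' without forming diag(D2)
g  = rp + A * (D2 .* (rd - (sigma*mu*e - x.*s)./x));
dy_cond = M \ g;

fprintf('difference between the two Delta y: %.2e\n', norm(dy_full - dy_cond));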

It is known that the matrix of (1.6) can become singular, or only positive semidefinite, when any of the following cases is encountered. First, when $A$ is rank-deficient, system (1.6) is singular. There exist presolving techniques that can detect and remove the dependent rows in $A$, see, e.g., [2, 25]. Second, in late interior-point iterations, the diagonal matrix $S^{-1}X$ has very tiny and very large diagonal values as a result of convergence. Thus, the matrix of (1.6) may become numerically positive semidefinite. In particular, the situation becomes severe when primal degeneracy occurs at the optimal solution. One can refer to [28, 64] for more detailed explanations.

Thus, when direct methods such as Cholesky decomposition are applied to (1.6), some diagonal pivots encountered during decomposition can be zero or negative, causing the algorithm to break down. Many direct methods adopt a strategy of replacing the problematic pivot with a very large number. See, e.g., [64] for the Cholesky-Infinity factorization, which is specially designed to solve (1.6) when it is positive semidefinite but not definite. Numerical experience [1, 37, 20, 38, 3, 62, 12] indicates that direct methods provide sufficiently accurate solutions for interior-point methods to converge regardless of the ill-conditioning of the matrix. However, as the LP problems become larger, the significant fill-in in the decompositions makes direct methods prohibitively expensive. It is stated in [26] that fill-in is observed even for very sparse matrices. Moreover, the matrix $A$ can be dense, as in quadratic programs in support vector machine training [19] or linear programs in basis pursuit [7], and even when $A$ is sparse, the matrix of (1.6) can be dense or have a pattern of nonzero elements that renders the system difficult for direct methods. The expensive solution of the KKT systems is a usual disadvantage of second-order methods, including interior-point methods.

These drawbacks of direct methods and the progress in preconditioning techniques motivate researchers to develop stable iterative methods for solving (1.6) or, alternatively, (1.5). The major problem is that as the interior-point iterations proceed, the condition number of the term $S^{-1}X$ increases, making the system of linear equations intractable. One way to deal with this is to employ suitable preconditioners. Since our main focus is on solving (1.6), we explain preconditioners for (1.6) in detail in the following. We mention [8, 21, 22, 4, 51] as literature related to preconditioners for (1.5).

For the iterative solution of (1.6), the conjugate gradient (CG) method [32] has been applied with diagonal scaling preconditioners [6, 53, 36] or incomplete Cholesky preconditioners [39, 34, 8, 42]. LSQR with a preconditioner was used in [23]. A matrix-free method using CG for least squares (CGLS) preconditioned by a partial Cholesky decomposition was proposed in [27]. In [10], a preconditioner based on Greville’s method [11] for the generalized minimal residual (GMRES) method was applied. Suitable preconditioners were also introduced for particular fields such as the minimum-cost network flow problem in [54, 33, 44, 45]. One may refer to [13] for a review on the application of numerical linear algebra algorithms to the solution of KKT systems in the optimization context.

In this paper, we propose to solve (1.6) using Krylov subspace methods preconditioned by stationary inner-iterations recently proposed for least squares problems in [31, 48, 49]. In Section 2, we briefly describe the framework of Mehrotra’s predictor-corrector interior-point algorithm we implemented and the normal equations arising from this algorithm. In Section 3, we specify the application of our method to the normal equations. In Section 4, we present numerical results in comparison with a modified sparse Cholesky method and three direct solvers in CVX, a major public package for specifying and solving convex programs [30, 29]. In Section 5, we conclude the paper.

Throughout, we use bold lower case letters for column vectors. We denote quantities related to the $k$th interior-point iteration by using a superscript with round brackets, e.g., $x^{(k)}$, the $k$th iteration of Krylov subspace methods by using a subscript without brackets, e.g., $x_k$, and the $\ell$th inner iteration by using a superscript with angle brackets, e.g., $x^{\langle \ell \rangle}$. $\mathcal{R}(A)$ denotes the range space of a matrix $A$. $\kappa(A)$ denotes the condition number $\kappa(A) = \sigma_{\max}(A) / \sigma_{\min}(A)$, where $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$ denote the maximum and minimum nonzero singular values of $A$, respectively. $\mathcal{K}_k(A, b) = \operatorname{span}\{ b, Ab, \dots, A^{k-1} b \}$ denotes the Krylov subspace of order $k$.

2 Interior-point algorithm and the normal equations

We implement an infeasible version of Mehrotra’s predictor-corrector method [40], which has been established as a standard in this area [37, 38, 61, 41]. Note that our method can be applied to other interior-point methods (see, e.g., [61] for more interior-point methods) whose directions are computed via the normal equations (1.6).

2.1 Mehrotra’s predictor-corrector algorithm

In this method, the centering parameter $\sigma$ is determined by dividing each step into two stages.

In the first stage, we solve for the affine direction

(2.1)   $\begin{bmatrix} 0 & A^{\mathsf T} & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix} \begin{bmatrix} \Delta x_{\mathrm{aff}} \\ \Delta y_{\mathrm{aff}} \\ \Delta s_{\mathrm{aff}} \end{bmatrix} = \begin{bmatrix} r_{\mathrm d} \\ r_{\mathrm p} \\ -XSe \end{bmatrix},$

and measure its progress in reducing $\mu$. If the affine direction makes large enough progress without violating the nonnegativity constraint (1.2d), then $\sigma$ is assigned a small value. Otherwise, $\sigma$ is assigned a larger value to steer the iterate to be more centered in the strictly positive region.

In the second stage, we solve for the corrector direction

(2.2)   $\begin{bmatrix} 0 & A^{\mathsf T} & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix} \begin{bmatrix} \Delta x_{\mathrm{cc}} \\ \Delta y_{\mathrm{cc}} \\ \Delta s_{\mathrm{cc}} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \sigma\mu e - \Delta X_{\mathrm{aff}} \Delta S_{\mathrm{aff}} e \end{bmatrix},$

where $\Delta X_{\mathrm{aff}} = \operatorname{diag}(\Delta x_{\mathrm{aff}})$, $\Delta S_{\mathrm{aff}} = \operatorname{diag}(\Delta s_{\mathrm{aff}})$, and $\sigma$ is determined according to the solution in the first stage. Finally, we update the current iterate along the linear combination of the two directions.

In our implementation of the interior-point method, we adopt Mehrotra’s predictor-corrector algorithm as follows.

1:  Given $(x^{(0)}, y^{(0)}, s^{(0)})$ with $(x^{(0)}, s^{(0)}) > 0$.
2:  for $k = 0, 1, 2, \dots$ until convergence, do
3:      {the predictor stage}
4:     Solve (2.1) for the affine direction $(\Delta x_{\mathrm{aff}}, \Delta y_{\mathrm{aff}}, \Delta s_{\mathrm{aff}})$.
5:     Compute .
6:     if  then
7:        ,
8:     else
9:        Set and . {the corrector stage}
10:        Solve (2.2) for the corrector direction $(\Delta x_{\mathrm{cc}}, \Delta y_{\mathrm{cc}}, \Delta s_{\mathrm{cc}})$.
11:        
12:     end if
13:     Compute .
14:     
15:  end for
Algorithm 2.1 Mehrotra’s predictor-corrector algorithm.

In line 5 in Algorithm 2.1, the step lengths are computed by

(2.3)   $\alpha_{\mathrm p} = \min\left(1, \; \eta \min_{\Delta x_i < 0} \left( -\frac{x_i}{\Delta x_i} \right) \right), \qquad \alpha_{\mathrm d} = \min\left(1, \; \eta \min_{\Delta s_i < 0} \left( -\frac{s_i}{\Delta s_i} \right) \right),$

where $\eta \in (0, 1]$ is a damping factor.

In line 9, the quantity is computed by

In the same line, the parameter is chosen as in the early phase of the interior-point iterations. The value and the range for are adopted by the LIPSOL package [64]. In the late phase of the interior-point iterations, is chosen as approximately 10 times the error measure defined in (4.1). Here the distinction between early and late phases is when is more or less than .

In line 13, we first compute trial step lengths using equations (2.3) with . Then, we gradually reduce to find the largest step lengths that can ensure the centrality of the updated iterates, i.e., to find the maximum that satisfy

where is typically chosen as .
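For illustration, the fraction-to-the-boundary ratio test behind the step-length computation can be sketched in MATLAB as follows. This is a generic rule; the exact damping constant and the centrality safeguard of line 13 used in the implementation are not reproduced here, and the value in the usage comment is illustrative.

% Generic MATLAB sketch of the ratio test behind (2.3): the largest step alpha
% such that x + alpha*dx stays positive, damped by a factor eta in (0,1].
function alpha = step_length(x, dx, eta)
  idx = dx < 0;                              % only negative components limit the step
  if any(idx)
      alpha = min(1, eta * min(-x(idx) ./ dx(idx)));
  else
      alpha = 1;                             % the direction never hits the boundary
  end
end
% Example: alpha_p = step_length(x, dx_aff, 0.9995);   (the value 0.9995 is illustrative)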

2.2 The normal equations in the interior-point algorithm

We consider modifying Algorithm 2.1 so that some of its updates become unnecessary. Since we assume the existence of an optimal solution to problem (1.1), we have $b \in \mathcal{R}(A)$. Let $D = S^{-1/2} X^{1/2}$ and $\bar{A} = AD$. Problem (1.6) can then be written as the normal equations of the second kind

(2.4)   $\bar{A} \bar{A}^{\mathsf T} \Delta y = g,$

where $g$ denotes the corresponding right-hand side of (1.6).

In the predictor stage, the problem (2.1) is equivalent to first solving (2.4) for with , and then updating the remaining unknowns by

(2.5a)
(2.5b)

In the corrector stage, the problem (2.2) is equivalent to first solving (2.4) for with , and then updating the remaining unknowns by

(2.6a)
(2.6b)

By solving (2.4) instead of solving (1.6) directly for $\Delta y$, we can compute $\Delta y$, $\Delta s$, and $\Delta x$, and we can save 1 MV (here "MV" denotes the computational cost required for one matrix-vector multiplication) in (2.5a) and another in (2.6a) per interior-point iteration when the corrector step is performed.

Remark 2.1.

For solving an interior-point step from the condensed step equation (1.6) using a suitable Krylov subspace method, updating the quantities in this way rather than recomputing them can save 1 MV at each interior-point iteration.

Note that in the predictor and corrector stages, problem (2.4) has the same matrix but different right-hand sides. We introduce methods for solving it in the next section.

3 Application of inner-iteration preconditioned Krylov subspace methods

In lines 4 and 10 of Algorithm 2.1, the linear system (2.4) needs to be solved, with its matrix becoming increasingly ill-conditioned as the interior-point iterations proceed. In this section, we focus on applying inner-iteration preconditioned Krylov subspace methods to (2.4) because they are advantageous in dealing with ill-conditioned sparse matrices. The methods to be discussed are the preconditioned CG and MINRES methods [32, 52] applied to the normal equations of the second kind ((P)CGNE and (P)MRNE, respectively) [9, 49], and the right-preconditioned generalized minimal residual method (AB-GMRES) [31, 49].

First, the conjugate gradient (CG) method [32] is an iterative method for solving linear systems $Cu = d$, where $C \in \mathbb{R}^{N \times N}$ is a symmetric and positive (semi)definite matrix and $d \in \mathbb{R}^{N}$. CG starts with an initial approximate solution $u_0$ and determines the $k$th iterate $u_k$ by minimizing $\| u_k - u^* \|_C$ over the space $u_0 + \mathcal{K}_k(C, r_0)$, where $r_0 = d - C u_0$, $u^*$ is a solution of $Cu = d$, and $\| v \|_C = (v^{\mathsf T} C v)^{1/2}$.

Second, MINRES [52] is another iterative method for solving linear systems $Cu = d$, where $C$ is symmetric. MINRES with initial approximate solution $u_0$ determines the $k$th iterate $u_k$ by minimizing $\| d - C u_k \|_2$ over the same space as CG.

Third, the generalized minimal residual (GMRES) method [56] is an iterative method for solving linear systems $Cu = d$, where $C \in \mathbb{R}^{N \times N}$. GMRES with initial approximate solution $u_0$ determines the $k$th iterate $u_k$ by minimizing $\| d - C u_k \|_2$ over $u_0 + \mathcal{K}_k(C, r_0)$.

3.1 Application of inner-iteration preconditioned CGNE and MRNE methods

We first introduce CGNE and MRNE. In both the predictor and the corrector stages, the system to be solved is of the form (2.4). CG and MINRES applied to this system are CGNE and MRNE, respectively. With these settings, let the initial solution in both stages be zero, and denote the initial residual by $r_0$. CGNE and MRNE can solve (2.4) without forming $\bar{A} \bar{A}^{\mathsf T}$ explicitly.

Concretely, CGNE gives the th iterate such that , where is the minimum-norm solution of for and . MRNE gives the th iterate such that .

We use inner-iteration preconditioning for CGNE and MRNE methods. The following is a brief summary of the part of [49] where the inner-outer iteration method is analyzed. We give the expressions for the inner-iteration preconditioning and preconditioned matrices to state the conditions under which the former is SPD. Let be a symmetric nonsingular splitting matrix of such that . Denote the inner-iteration matrix by . The inner-iteration preconditioning and preconditioned matrices are and , respectively. If is nonsingular, then , is equivalent to for all . For odd, is symmetric and positive definite (SPD) if and only if the inner-iteration splitting matrix is SPD [46, Theorem 2.8]. For even, is SPD if and only if the inner-iteration splitting matrix is SPD [46, Theorem 2.8]. We give Algorithms 3.1, 3.2 for CGNE and MRNE preconditioned by inner iterations [49, Algorithms E.3, E.4].

1:  Let be the initial approximate solution, and .
2:  Apply steps of a stationary iterative method to to obtain and .
3:  
4:  for  until convergence, do
5:     
6:     Apply steps of a stationary iterative method to to obtain and .
7:     
8:  end for
Algorithm 3.1 CGNE method preconditioned by inner iterations.
1:  Let be the initial approximate solution, and .
2:  Apply steps of a stationary iterative method to to obtain .
3:  
4:  for  until convergence, do
5:     
6:     Apply steps of a stationary iterative method to to obtain .
7:     
8:     
9:  end for
Algorithm 3.2 MRNE method preconditioned by inner iterations.
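As an illustration of how such a preconditioner is used, the following MATLAB sketch implements a generic preconditioned CG iteration on the normal equations of the second kind. It is not a line-by-line transcription of Algorithm 3.1; the preconditioner is passed as an arbitrary function handle (for example, a few NE-SSOR sweeps as in Section 3.3), and the function and variable names are illustrative.

% Generic MATLAB sketch of preconditioned CG applied to A*A'*u = b; prec is a
% function handle applying the (SPD) preconditioner.  Products with A*A' are
% computed as A*(A'*p), so the matrix A*A' is never formed.
function u = pcg_ne(A, b, prec, maxit, tol)
  u = zeros(size(A, 1), 1);
  r = b;                          % residual of A*A'*u = b with u = 0
  z = prec(r);
  p = z;
  rz = r' * z;
  for k = 1:maxit
      q = A * (A' * p);
      alpha = rz / (p' * q);
      u = u + alpha * p;
      r = r - alpha * q;
      if norm(r) <= tol * norm(b), break; end
      z = prec(r);
      rz_new = r' * z;
      p = z + (rz_new / rz) * p;
      rz = rz_new;
  end
end
% Example with the identity "preconditioner" (plain CGNE):
%   u = pcg_ne(A, b, @(r) r, 100, 1e-10);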

3.2 Application of inner-iteration preconditioned AB-GMRES method

Next, we introduce AB-GMRES. GMRES can solve a square linear system transformed from the rectangular system in the predictor stage and in the corrector stage by using a rectangular right-preconditioning matrix that does not necessarily have to be . Let be a preconditioning matrix for . Then, AB-GMRES corresponds to GMRES [56] applied to

which is equivalent to the minimum-norm solution to the problem (2.4), for all if [49, Theorem 5.2], where or , or , respectively. AB-GMRES gives the th iterate such that , where is the initial iterate and .

Specifically, we apply AB-GMRES preconditioned by inner iterations [48, 49] to (2.4). This method was shown to outperform previous methods on ill-conditioned and rank-deficient problems. We give expressions for the inner-iteration preconditioning and preconditioned matrices. Let be a nonsingular splitting matrix such that . Denote the inner-iteration matrix by . With , the inner-iteration preconditioning and preconditioned matrices are and , respectively. If the inner-iteration matrix is semiconvergent, i.e., exists, then AB-GMRES preconditioned by the inner-iterations determines the minimum-norm solution of without breakdown for all and for all [49, Theorem 5.5]. The inner-iteration preconditioning matrix works on in AB-GMRES as in Algorithm 3.3 [49, Algorithm 5.1].

1:  Let be the initial approximate solution, and .
2:  
3:  for  until convergence, do
4:     Apply steps of a stationary iterative method to to obtain .
5:     
6:     for do
7:        
8:     end for
9:     
10:  end for
11:  
12:  Apply steps of a stationary iterative method to to obtain .
13:  
Algorithm 3.3 AB-GMRES method preconditioned by inner iterations.

Here, are orthonormal, is the first column of the identity matrix, and .
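The following MATLAB sketch shows the generic structure of AB-GMRES with a right preconditioner supplied as a function handle Bfun mapping an $m$-vector to an $n$-vector, for instance a fixed number of NE-SOR sweeps started from zero, which is a linear map. It follows the standard AB-GMRES recipe rather than the exact listing of Algorithm 3.3, and it omits refinements such as restarting and Givens-based residual updates; all names are illustrative.

% Generic MATLAB sketch of AB-GMRES with a right preconditioner Bfun, using
% modified Gram-Schmidt and a small dense least squares problem per step.
function [x, relres] = ab_gmres(A, b, Bfun, maxit, tol)
  m = size(A, 1);
  beta = norm(b);
  V = zeros(m, maxit + 1);
  H = zeros(maxit + 1, maxit);
  V(:, 1) = b / beta;                             % zero initial guess, r0 = b
  for k = 1:maxit
      w = A * Bfun(V(:, k));                      % apply A*B to the latest basis vector
      for i = 1:k                                 % modified Gram-Schmidt
          H(i, k) = V(:, i)' * w;
          w = w - H(i, k) * V(:, i);
      end
      H(k+1, k) = norm(w);
      y = H(1:k+1, 1:k) \ (beta * eye(k+1, 1));   % min || beta*e1 - H*y ||
      relres = norm(beta * eye(k+1, 1) - H(1:k+1, 1:k) * y) / beta;
      if relres <= tol || H(k+1, k) == 0
          break
      end
      V(:, k+1) = w / H(k+1, k);
  end
  x = Bfun(V(:, 1:k) * y);                        % valid because Bfun is linear
end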

Note that the left-preconditioned generalized minimal residual method (BA-GMRES) [31, 48, 49] can be applied to solve the corrector stage problem, which can be written as the normal equations of the first kind

or equivalently

(3.1)

In fact, this formulation was adopted in [26] and solved by the CGLS method preconditioned by partial Cholesky decomposition that works in -dimensional space. The BA-GMRES also works in -dimensional space.

The advantage of the inner-iteration preconditioning methods is that we can avoid explicitly computing and storing the preconditioning matrices for solving (2.4). We present efficient algorithms for specific inner iterations in the next section.

3.3 SSOR inner iterations for preconditioning the CGNE and MRNE methods

The inner-iteration preconditioned CGNE and MRNE methods require a symmetric preconditioning matrix. This is achieved by the SSOR inner-iteration preconditioning, which works on the normal equations of the second kind, and whose preconditioning matrix is SPD for an odd number of inner iterations and a relaxation parameter in $(0, 2)$ [46, Theorem 2.8]. This method exploits a symmetric splitting matrix by combining the forward updates of Algorithm 3.5 with the corresponding reverse updates, and can be efficiently implemented as the NE-SSOR method [55], [49, Algorithm D.8]. See [5], where SSOR preconditioning for CGNE is proposed. Let $a_i^{\mathsf T}$ be the $i$th row vector of the coefficient matrix. Algorithm 3.4 shows the NE-SSOR method.

1:  Let and .
2:  for do
3:     for do
4:        
5:        
6:     end for
7:     for do
8:        
9:        
10:     end for
11:     
12:  end for
Algorithm 3.4 NE-SSOR method.

When Algorithm 3.4 is applied to lines 2 and 6 of Algorithm 3.1 and lines 2 and 6 of Algorithm 3.2, the normal equations of the second kind are solved approximately.
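A compact MATLAB sketch of one such inner-iteration routine is given below: it applies a fixed number of SSOR passes (a forward sweep followed by a backward sweep) to $\bar{A}\bar{A}^{\mathsf T} z = r$, updating $u = \bar{A}^{\mathsf T} z$ on the fly so that the product $\bar{A}\bar{A}^{\mathsf T}$ is never formed. This is the standard NE-SSOR update written out for illustration, not the authors' code; stopping rules are omitted and rows of the matrix are assumed nonzero.

% Illustrative MATLAB sketch of the NE-SSOR update for A*A'*z = r.
function [u, z] = ne_ssor(A, r, omega, nin)
  [m, n] = size(A);
  u  = zeros(n, 1);                    % u = A'*z
  z  = zeros(m, 1);
  d2 = sum(A.^2, 2);                   % squared row norms ||a_i||^2
  for sweep = 1:nin
      for i = 1:m                      % forward sweep
          d = omega * (r(i) - A(i, :) * u) / d2(i);
          u = u + d * A(i, :)';
          z(i) = z(i) + d;
      end
      for i = m:-1:1                   % backward sweep
          d = omega * (r(i) - A(i, :) * u) / d2(i);
          u = u + d * A(i, :)';
          z(i) = z(i) + d;
      end
  end
end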

3.4 SOR inner iterations for preconditioning the AB-GMRES method

Next, we introduce the successive overrelaxation (SOR) method applied to the normal equations of the second kind, as used in Algorithm 3.3. If the relaxation parameter $\omega$ satisfies $0 < \omega < 2$, then the iteration matrix of this method is semiconvergent, i.e., the limit of its powers exists [16]. An efficient algorithm for this method is called NE-SOR and is given as follows [55], [49, Algorithm D.7].

1:  Let .
2:  for do
3:     for do
4:        
5:     end for
6:     
7:  end for
Algorithm 3.5 NE-SOR method.

When Algorithm 3.5 is applied to lines 4 and 12 of Algorithm 3.3, the normal equations of the second kind are solved approximately.
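For illustration, the forward-sweep analogue can be written in MATLAB as below; again this is the standard NE-SOR update, not the authors' code, and rows of the matrix are assumed nonzero. The returned vector $u = \bar{A}^{\mathsf T} z$ is what the AB-GMRES sketch given after Algorithm 3.3 expects from its preconditioner Bfun; the values of omega and nin in the usage comment are illustrative.

% Illustrative MATLAB sketch of the NE-SOR update: nin forward SOR sweeps on
% A*A'*z = r carried out row by row, maintaining u = A'*z.
function u = ne_sor(A, r, omega, nin)
  [m, n] = size(A);
  u  = zeros(n, 1);                    % u = A'*z with z = 0 initially
  d2 = sum(A.^2, 2);                   % squared row norms ||a_i||^2
  for sweep = 1:nin
      for i = 1:m
          d = omega * (r(i) - A(i, :) * u) / d2(i);
          u = u + d * A(i, :)';
      end
  end
end
% Example: Bfun = @(r) ne_sor(A, r, 1.0, 5);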

Since the rows of the coefficient matrix are required in the NE-(S)SOR iterations, it is more efficient if the matrix is stored row-wise.

3.5 Row-scaling of $\bar{A}$

Let be a diagonal matrix whose diagonal elements are positive. Then, problem (2.4) is equivalent to

(3.2)

Denote and . Then, the scaled problem (3.2) is

(3.3)

If satisfies , then (3.3) is equivalent to

(3.4)

for all . The methods discussed earlier can be applied to (3.4). In the NE-(S)SOR inner iterations, one has to compute the norm of each row of the matrix. However, this computation can be omitted if the $i$th diagonal element of the scaling matrix is chosen as the norm of the $i$th row of $\bar{A}$, so that the scaled matrix has unit row norms. Hence, we do not have to compute the norms inside the NE-(S)SOR inner iterations if we compute them once when constructing the scaling matrix. The row-scaling scheme thus does not incur extra CPU time. We observe in the numerical results that this scheme improves the convergence of the Krylov subspace methods.
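A minimal MATLAB sketch of this scaling follows; Abar stands for the matrix of (2.4), g for its right-hand side, the sizes are made up, and the dense backslash solve stands in for the Krylov subspace solvers.

% Illustrative MATLAB sketch of the row-scaling scheme.
Abar = randn(5, 9);
g = randn(5, 1);
rownorm = sqrt(sum(Abar.^2, 2));         % row norms, computed once
W = spdiags(1 ./ rownorm, 0, 5, 5);
Ahat = W * Abar;                         % every row of Ahat has unit 2-norm
ghat = W * g;
dyhat = (Ahat * Ahat') \ ghat;           % solve the scaled system
dy = W * dyhat;                          % recover the solution of the unscaled system,
                                         % since Ahat*Ahat' = W*(Abar*Abar')*W
fprintf('residual of the unscaled system: %.2e\n', norm(Abar*Abar'*dy - g));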

CGNE and MRNE preconditioned by inner iterations applied to a scaled linear system are equivalent to CG and MINRES applied to , , respectively, and hence determine the minimum-norm solution of for all and for all if is SPD. Now we give conditions under which AB-GMRES preconditioned by inner iterations applied to a scaled linear system determines the minimum-norm solution of the unscaled one .

Lemma 3.1.

If and is nonsingular, then AB-GMRES applied to determines the solution of , subject to without breakdown for all and for all if and only if .

Proof.

Since gives , the equality holds for all [31, Theorem 3.1]. AB-GMRES applied to determines the th iterate by minimizing over the space , and thus determines the solution of , subject to without breakdown for all and for all if and only if [49, Theorem 5.2], which reduces to from . ∎

Theorem 3.2.

If is nonsingular and the inner-iteration matrix is semiconvergent, then AB-GMRES preconditioned by the inner iterations applied to determines the solution of , subject to without breakdown for all and for all .

Proof.

From Lemma 3.1, it is sufficient to show that and . Since is the splitting matrix of for the inner iterations, the inner-iteration matrix is . Hence, the inner-iteration preconditioning matrix satisfies [49, Lemma 4.5]. On the other hand, satisfies [49, Lemmas 4.3, 4.4]. ∎

4 Numerical experiments

In this section, we compare the performance of the interior-point method based on the iterative solvers with standard interior-point software. We also developed an efficient direct solver coded in C to compare with the iterative solvers. For the sake of completeness, we briefly describe our direct solver first.

4.1 Direct solver for the normal equations

To deal with the rank-deficiency, we used a strategy similar to the Cholesky-Infinity modification scheme introduced in the LIPSOL solver [64]. However, instead of penalizing the pivots that are close to zero, we removed them and solved the reduced system. We implemented this modification by an $LDL^{\mathsf T}$ decomposition. We used the Matlab built-in function chol to detect whether the matrix is symmetric positive definite. We used ldlchol from the CSparse package version 3.1.0 [14] when the matrix was symmetric positive definite, and we turned to the Matlab built-in solver ldl, which uses MA57 [18], for the semidefinite cases.

We explain the implementation by an example. For the matrix of (1.6), the $LDL^{\mathsf T}$ decomposition gives

Correspondingly, we partition and . Assuming that the diagonal element is close to zero, we let , , , , and solve

using forward and backward substitutions. The solution is then given by .
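A hedged MATLAB sketch of this pivot-removal idea, standing in for the implementation described above: it factors the matrix with the built-in ldl, drops pivots judged numerically zero relative to a hypothetical tolerance droptol, solves the reduced triangular systems, and sets the corresponding solution components to zero. It assumes the factor D comes out diagonal, which is typical for (nearly) semidefinite matrices but is not guaranteed.

% Illustrative MATLAB sketch of solving the reduced system after removing
% near-zero pivots from an LDL' factorization.
function dy = reduced_ldl_solve(M, g, droptol)
  [L, D, P] = ldl(full(M));              % P'*M*P = L*D*L'
  d = diag(D);
  keep = abs(d) > droptol * max(abs(d)); % pivots kept as numerically nonzero
  gp = P' * g;
  z = zeros(size(g));
  Lk = L(keep, keep);                    % still unit lower triangular
  z(keep) = Lk' \ (D(keep, keep) \ (Lk \ gp(keep)));
  dy = P * z;                            % dropped components remain zero
end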

4.2 Implementation specifications

In this section, we describe our numerical experiments.

The initial solution for the interior-point method was set using the method described in LIPSOL solver [64]. The initial solution for the Krylov subspace iterations and the inner iterations was set to zero.

We set the maximum number of the interior-point iterations as and the stopping criterion regarding the error measure as