A Quantum Interior Point Method for LPs and SDPs


Iordanis Kerenidis CNRS, IRIF, Université Paris Diderot, Paris, France and Centre for Quantum Technologies, National University of Singapore, Singapore. Email: jkeren@irif.fr.    Anupam Prakash CNRS, IRIF, Université Paris Diderot, Paris, France. Email: anupamprakash1@gmail.com.
Abstract

We present a quantum interior point method with worst case running time $\widetilde{O}(\frac{n^{2.5}}{\xi^{2}} \mu \kappa^{3} \log(1/\epsilon))$ for SDPs and $\widetilde{O}(\frac{n^{1.5}}{\xi^{2}} \mu \kappa^{3} \log(1/\epsilon))$ for LPs, where the output of our algorithm is a pair of matrices that are $\epsilon$-optimal $\xi$-approximate SDP solutions. The factor $\mu$ is at most $\sqrt{2}n$ for SDPs and $\sqrt{2n}$ for LPs, and $\kappa$ is an upper bound on the condition number of the intermediate solution matrices. For the case where the intermediate matrices for the interior point method are well conditioned, our method provides a polynomial speedup over the best known classical SDP solvers and interior point based LP solvers, which have worst case running times of $O(n^{6})$ and $O(n^{3.5})$ respectively. Our results build upon recently developed techniques for quantum linear algebra and pave the way for the development of quantum algorithms for a variety of applications in optimization and machine learning.


1 Introduction

Semidefinite programming is an extremely useful and powerful technique that is used both in theory and in practice for convex optimization problems. It has also been widely used for providing approximate solutions to NP-hard problems, starting from the work of Goemans and Williamson [12] for approximating the MAXCUT problem.

A Semidefinite Program (SDP) is an optimization problem with inputs a vector $c \in \mathbb{R}^{m}$ and matrices $A^{(1)}, \ldots, A^{(m)}$ and $B$ in $\mathbb{R}^{n \times n}$, and the goal is to find a vector $x \in \mathbb{R}^{m}$ that minimizes the inner product $c \cdot x$, while at the same time the constraint $\sum_{j \in [m]} x_{j} A^{(j)} \succeq B$ is satisfied. In other words, an SDP is defined as

$$\mathrm{OPT}=\min_{x \in \mathbb{R}^{m}} c \cdot x \quad \text{s.t.} \quad \sum_{j \in [m]} x_{j} A^{(j)} \succeq B.$$

One can also define the dual SDP as the following optimization problem,

$$\mathrm{OPT}'=\max_{Y \succeq 0} \operatorname{Tr}(B Y) \quad \text{s.t.} \quad \operatorname{Tr}(A^{(j)} Y)=c_{j} \;\; \forall j \in [m].$$
For most cases of SDPs, strong duality implies that the two optimal values, $\mathrm{OPT}$ for the primal SDP and $\mathrm{OPT}'$ for the dual SDP, are actually equal. Semidefinite programs are a generalization of Linear Programs, which are the special case where all matrices $A^{(j)}$ and $B$ are diagonal. The main advantage of both linear and semidefinite programs is that they encapsulate a large number of optimization problems, and there are polynomial time algorithms for solving them.
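To make the primal form above concrete, here is a minimal sketch using the CVXPY modeling library; the random instance is our own illustration (it is not from the paper, and it assumes the primal form reconstructed above). The choice $c_{j}=\operatorname{Tr}(A^{(j)})$ is only a convenient way to keep the random instance bounded below, and `>>` denotes the Loewner (PSD) order in CVXPY.

```python
import cvxpy as cp
import numpy as np

# Random symmetric constraint matrices A^(1), ..., A^(m) and a feasible B.
n, m = 4, 3
rng = np.random.default_rng(0)
A = [rng.standard_normal((n, n)) for _ in range(m)]
A = [(M + M.T) / 2 for M in A]              # symmetrize the constraint matrices
B = -np.eye(n)                              # x = 0 satisfies sum_j x_j A^(j) >= B
c = np.array([np.trace(M) for M in A])      # keeps the objective bounded below

x = cp.Variable(m)
constraint = sum(x[j] * A[j] for j in range(m)) >> B   # sum_j x_j A^(j) >= B (PSD order)
prob = cp.Problem(cp.Minimize(c @ x), [constraint])
prob.solve()
print(prob.value, x.value)
```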

The ellipsoid algorithm [19] was the first provably polynomial-time algorithm for LPs. In a seminal paper, Karmarkar [14] introduced an interior point algorithm that improved upon the running time of the ellipsoid method. The first provably polynomial time algorithm for SDPs was given by Nesterov and Nemirovskii [21] using interior point methods with self-concordant barrier functions.

The running time of the best known general method for solving SDPs [20] is $O(m(m^{2}+n^{\omega}+mns)\operatorname{polylog}(m, n, 1/\epsilon))$, where $s$ is the sparsity, that is the maximum number of non-zero entries in a row of the input matrices, and the SDPs are solved to additive error $\epsilon$. The running time in the worst case is $O(n^{6})$ for dense SDPs with $m=O(n^{2})$. For the case of a linear number of constraint matrices ($m=O(n)$) the worst-case running time is $O(n^{4})$, or slightly better in case the sparsity of the constraint matrices is small.

For our results we will in fact use a version of the classical interior point method for SDPs based on [6].

For dense SDPs with $m=O(n^{2})$ the running time of this method is $\widetilde{O}(n^{6.5})$, while for instances with $m=O(n)$, the running time is $\widetilde{O}(n^{4.5})$. This method can also be used for Linear Programming in $n$ dimensions with $m$ constraints, where it runs in time $\widetilde{O}(n^{3.5})$ for $m=O(n)$.

The fastest known algorithm for linear programs is due to Vaidya [23]; it has time complexity $O((n+m)^{1.5} n L)$, where $L$ is the logarithm of the maximal sub-determinant of the constraint matrix. The main bottleneck for interior point style algorithms for both LPs and SDPs is maintaining suitable approximations of the inverses of the Hessian matrices at each step of the computation; Vaidya's algorithm uses a combination of pre-computation and low-rank updates to accelerate this step for linear programs. The interior point method was originally proposed for solving linear programs [14], and the best known LP algorithms using this approach [24] have complexity $O(n^{3.5})$ if all the arithmetic operations are performed up to a constant number of bits of precision.

A different approach to SDP solving is the framework of Arora and Kale [5, 4]. The Arora-Kale algorithm uses a variant of the multiplicative weights update method to iteratively find better solutions to the primal and the dual SDPs until a solution close to the optimal is found. The running time of the algorithm depends on the dimensions of the problem, the approximation guarantee $\epsilon$ between the solution and the optimal value, and upper bounds $R$ and $r$ on the "size" of the optimal primal and dual solutions (in some appropriate norm). More precisely, an upper bound on the running time of this algorithm given in [3] is,

$$\widetilde{O}\left(n m s\left(\frac{R r}{\epsilon}\right)^{4}+n s\left(\frac{R r}{\epsilon}\right)^{7}\right).$$

While the Arora-Kale framework has found many applications in complexity theory [5], for the case of solving general SDPs it remains an algorithm of mostly theoretical interest. It is known that for many combinatorial problems (like MAXCUT or scheduling problems), the width parameter $Rr/\epsilon$ grows at least linearly in the dimensions (Theorem 24, [3]), making this algorithm infeasible in practice, with a worse running time than the interior point method.

Recently, quantum algorithms for solving semidefinite programs (SDPs) in the Arora-Kale framework were first proposed by Brandao and Svore [8] and subsequently improved by van Apeldoorn, Gilyén, Gribling and de Wolf [3]. These algorithms had a better dependence on the dimension $n$ but a worse dependence on the other parameters compared to the classical algorithm. Very recently, [7] and independently [2] (subsequent to a previous version of [7]) provided an even better dependence on these parameters and improved the dependence on the error $\epsilon$. In order to discuss these results and compare them to ours, we first need to describe the quantum input models for SDPs.

The most basic input model is the sparse oracle model of [8], where the input matrices are assumed to be $s$-sparse and one has access to an oracle that computes the entries of the matrices $A^{(j)}$. The quantum state model was introduced in [7]; in this model each $A^{(j)}$ is split as a difference of positive semidefinite matrices ($A^{(j)}=A^{(j)}_{+}-A^{(j)}_{-}$) and we assume that we have access to purifications of the density matrices corresponding to $A^{(j)}_{\pm}$ for all $j \in [m]$.

The input model most relevant for our work is the quantum operator model of [2], where one assumes access to unitary block encodings of the input matrices $A^{(j)}$; that is, we assume that unitaries whose top-left block is proportional to $A^{(j)}$ can be implemented efficiently. It is shown in [2] that both the sparse oracle model and the quantum state model can be viewed as instances of the operator model. The best upper bound for the quantum algorithm from [2] in the operator model is,

$$\widetilde{O}\left(\left(\sqrt{m}+\sqrt{n} \frac{R r}{\epsilon}\right) \alpha\left(\frac{R r}{\epsilon}\right)^{4}\right), \tag{1}$$

where $\alpha$ is the normalization parameter of the block encodings.

We note that the parameter $\alpha$ can be as large as $\sqrt{n}$ in the worst case, but it can in principle be smaller than that. The various input models for quantum SDP solvers assume oracle access to the input matrices in the form described above and do not address further the question of implementing these oracles. In this paper, we work in the quantum data structure model introduced in [17, 16] that explicitly provides a method for implementing block encodings for arbitrary matrices.

In the quantum data structure model the algorithms have quantum access to a data structure that stores the matrix in a QRAM (Quantum Random Access Memory). The data structure is built in a single pass over a stream of matrix entries, and the time required to process a single entry is poly-logarithmic in the matrix dimensions. Thus, the construction of the data structure does not incur additional overhead over that required for storing the matrices sequentially in classical memory or in the QRAM. More generally, we can define the quantum data structure model for a general data structure as follows.

Definition 1.1.

A data structure for storing a dataset of size $N$ in the QRAM is said to be efficient if it can be constructed in a single pass over the entries of the dataset and the insertion and update time per entry is $O(\operatorname{polylog}(N))$.
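For intuition, the vector data structure of [17, 16] can be pictured classically as a complete binary tree over the squared entries, with the signs kept at the leaves. The following Python sketch (class and method names are ours, not from the paper) has single-pass construction with $O(\log n)$ time per streamed entry, and sampling from it mimics a standard-basis measurement of the vector state.

```python
import math

class VectorStateTree:
    """Classical sketch of a KP-style vector data structure: leaves hold
    (x_i^2, sign(x_i)), internal nodes hold subtree sums of squares.
    Single-pass construction, O(log n) per inserted or updated entry."""
    def __init__(self, n):
        self.n = 1 << math.ceil(math.log2(max(n, 2)))   # pad to a power of two
        self.tree = [0.0] * (2 * self.n)                # tree[1] = ||x||^2
        self.sign = [1.0] * self.n

    def update(self, i, xi):
        """Process one streamed entry (i, x_i)."""
        node = self.n + i
        delta = xi * xi - self.tree[node]
        self.sign[i] = 1.0 if xi >= 0 else -1.0
        while node >= 1:                                # propagate to the root
            self.tree[node] += delta
            node //= 2

    def sample(self, rng):
        """Descend from the root, branching proportionally to subtree sums;
        leaf i is returned with probability x_i^2 / ||x||^2."""
        node = 1
        while node < self.n:
            left = self.tree[2 * node]
            node = 2 * node if rng.random() * self.tree[node] < left else 2 * node + 1
        return node - self.n
```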

In [17, 16] we had given quantum algorithms for matrix multiplication and inversion for an arbitrary matrix $A \in \mathbb{R}^{n \times n}$ with $\|A\| \leq 1$ in the QRAM data structure model. The running time for these algorithms was $\widetilde{O}(\kappa^{2} \mu(A)/\epsilon)$, where $\epsilon$ is the error, $\kappa$ is the condition number of $A$, and $\mu(A)=\min_{p \in[0,1]}\left(\|A\|_{F}, \sqrt{s_{2 p}(A) s_{2(1-p)}(A^{T})}\right)$ for $s_{q}(A)=\max_{i} \sum_{j}\left|A_{i j}\right|^{q}$. Note that $\mu(A) \leq\|A\|_{F} \leq \sqrt{n}$ as we have assumed that $\|A\| \leq 1$.

The different terms in the minimum in the definition of $\mu(A)$ correspond to different choices of the data structure for storing $A$. Subsequently, [10] pointed out that these results are equivalent to there being an efficient block encoding for $A$ with parameter $\alpha=\mu(A)$. Moreover, in [10], the dependence on the precision for the quantum linear system solvers in the QRAM data structure model was improved to $\operatorname{polylog}(1/\epsilon)$ and that on $\kappa$ to linear. The QRAM data structure model thus explicitly provides efficient block encodings with $\alpha=\mu(A)$ for an arbitrary matrix $A$, and can be used to implement the operator model for SDPs as described above.

The quantum interior point algorithm is developed in the QRAM data structure model and can therefore be compared to the best quantum SDP algorithm in the Arora-Kale framework [2], with running time given by equation (1). We note that both $\mu$ and $\alpha$ are at most $s$ when the input SDP matrices are $s$-sparse, so the algorithms remain comparable for the case of sparse SDPs as well. There is a linear lower bound on $Rr/\epsilon$ (Theorem 24, [3]), and hence even in the best case, i.e. when $Rr/\epsilon = \Theta(n)$, the running time in equation (1) is at least of the order $n^{5.5}$, which is comparable to the running time of the classical interior point method.

1.1 Our techniques and results

In this paper we develop the first quantum algorithms for SDPs and LPs using interior point methods and obtain a significant polynomial speedup over the corresponding classical algorithms if the intermediate matrices arising in the interior point method are well-conditioned.

The classical interior point method starts with a pair of feasible solutions to the SDP and iteratively updates the solutions, improving the duality gap in each iteration. Each iteration of the interior point method involves constructing and solving the Newton linear system whose solutions give the updates to be applied to the SDP solution.

The main bottleneck for our quantum interior point method, as for the classical interior point method, is that the matrix for the Newton linear system is not given directly and is expensive to compute from the data. One of our main contributions is a technique for constructing block encodings for the Newton linear system matrix which allows us to solve this linear system with low cost in the quantum setting. We utilize the quantum techniques for linear algebra developed in [17, 16] and the improvements in precision for these linear algebra techniques in [10, 11].

The quantum solution of the Newton linear system corresponds to a single step of the interior point method. We need a classical description of the solution of the linear system in order to define the linear system for the next step. We therefore perform tomography on the quantum state corresponding to the solution of the Newton linear system at each step of the method. The dimension of the state on which we perform tomography is $O(n^{2})$, so a super-linear tomography algorithm [22, 13] would be prohibitively expensive. We provide a tomography algorithm that, given a unitary for preparing a vector state, reconstructs a vector $\delta$-close to the target vector in the $\ell_{2}$ norm with complexity $\widetilde{O}(d/\delta^{2})$, which is linear in the dimension $d$.

We also prove a convergence theorem in order to upper bound the number of iterations of the method, taking into account the various errors introduced by the quantum algorithms. The algorithm outputs a pair of matrices that are $\epsilon$-optimal $\xi$-approximate SDP solutions, where by $\epsilon$-optimal we mean that the duality gap is at most $\epsilon$, and by $\xi$-approximate we mean that the SDP constraints are satisfied to error $\xi$. The classical analysis also implies that the algorithm converges if the precision for the tomography algorithm is $O(\xi/\kappa)$, where $\kappa$ is the condition number of the matrix being updated.

We will provide the exact running time of our quantum interior point method in a later section after defining precisely the matrix of the Newton linear system, but we describe it here without making explicit the matrices involved and suppressing logarithmic factors. Our interior point algorithm for SDPs has worst case running time,

$$\widetilde{O}\left(\frac{n^{2.5}}{\xi^{2}} \mu \kappa^{3} \log (1/\epsilon)\right).$$

The number of iterations for convergence is $O(\sqrt{n} \log (1/\epsilon))$, the same as in the classical case. The $\frac{n^{2}}{\xi^{2}}$ term comes from the tomography algorithm that reconstructs the solution of the Newton linear system in each step to precision $O(\xi/\kappa)$; this precision requirement also contributes a $\kappa^{2}$ factor, while the remaining $\kappa$ and $\mu$ factors arise from the quantum linear system solver as described earlier. Note that since the Newton linear system matrix has dimension at most $2n^{2}$, the factor $\mu$ is at most $\sqrt{2} n$.

If the intermediate matrices arising in the method are well-conditioned, the worst case running time of our algorithm is $\widetilde{O}(\frac{n^{3.5}}{\xi^{2}} \log(1/\epsilon))$, which is significantly better than the corresponding classical interior point method in [6], whose running time is $\widetilde{O}(n^{6.5})$, and the best known general SDP solver [20], whose running time is $O(n^{6})$.

While we have a worst case bound of $\sqrt{2} n$ on $\mu$, this parameter can be much smaller in practice than the worst case scenario. For example, in [15], a quantum linear system solver was used for dimensionality reduction and classification of the MNIST data set; there the value of $\mu$, when taken to be the maximum norm of any row, was found to be less than 10 although the dimension of the corresponding matrix was much larger. Another remark is that the true condition number $\kappa$ can be replaced by a smaller condition threshold, if one only inverts the part of the matrix that is well conditioned with respect to this threshold.

In this paper we focus on the case of dense SDPs where $m=O(n^{2})$; for the sparse case, as in the classical setting, one can develop variants of the interior point method with faster running time. It is easy to see that these variants can also be quantized; we do not address the sparse case in this work.

For the special case of Linear Programs our algorithm has running time

$$\widetilde{O}\left(\frac{n^{1.5}}{\xi^{2}} \mu \kappa^{3} \log (1/\epsilon)\right).$$

The condition number $\kappa$ here is a bit different from the SDP case: it is the ratio of the maximum to the minimum entry of the intermediate solution vectors, while $\mu$ is in the worst case at most $\sqrt{2n}$. The running time of our algorithm is $\widetilde{O}(\frac{n^{2}}{\xi^{2}} \log(1/\epsilon))$ if the intermediate Newton matrices and solution vectors are well conditioned, compared to $O(n^{3.5})$ for the corresponding classical algorithms in [6] and [24]. We note that there is a better specialized classical algorithm for linear programs [23] with time complexity $O((n+m)^{1.5} n L)$, where $L$ is the logarithm of the maximal sub-determinant of the constraint matrix.

Our results provide the first quantum SDP and LP solvers based on the interior point method, thus paving the way for a vast variety of applications. An interesting feature of our algorithm is that the quantum part is no more difficult than a quantum linear system solver: a circuit whose depth depends on $\kappa$ and $\mu$ but only logarithmically on the dimension and the error parameter. This quantum circuit is repeated independently at each step of the iterative method, and within each step as many times as required by the tomography. Therefore, building a quantum circuit for solving linear systems implies a quantum SDP solver.

This paper is organized as follows. In Section 2 we collect some linear algebra preliminaries and useful quantum procedures. The results on quantum linear system solvers and quantum data structures that we need are introduced in Section 3. In Section 4 we provide an algorithm for vector state tomography. In Section 5 we describe the classical interior point method, and in Section 6 we provide a convergence analysis for the interior point method when the linear systems are solved approximately. We present the quantum interior point method for SDPs and LPs in Section 7 and bound its running time.

2 Preliminaries

We introduce some notation that is used throughout the paper. The entry-wise or Hadamard product of two matrices is denoted by $A \circ B$. The concatenation of two vectors is denoted by $(x, y)$. The vector state for $x \in \mathbb{R}^{d}$ is the quantum state $|x\rangle=\frac{1}{\|x\|} \sum_{i \in[d]} x_{i}|i\rangle$. Given a vector $v$, the vectors obtained by taking the entry-wise square root and squares of $v$ are denoted as $\sqrt{v}$ and $v^{2}$ respectively. If $A, B$ are square matrices, then $A \oplus B$ denotes the block diagonal matrix $\left(\begin{smallmatrix}A & 0 \\ 0 & B\end{smallmatrix}\right)$.

The singular value decomposition (SVD) for a matrix $A$ is denoted as $A=\sum_{i} \sigma_{i} u_{i} v_{i}^{T}$, and the spectral decomposition for a symmetric matrix as $A=\sum_{i} \lambda_{i} v_{i} v_{i}^{T}$. The condition number $\kappa(A)$ of an invertible matrix $A$ is the ratio between the largest and smallest singular values of $A$. The Frobenius norm $\|A\|_{F}=(\sum_{i j} A_{i j}^{2})^{1/2}$ and the spectral norm $\|A\|=\max_{i} \sigma_{i}$ are functions of the singular values. The $\ell_{p}$ norm of a vector is denoted as $\|v\|_{p}$, with $\|v\|$ denoting the $\ell_{2}$ norm. The $i$-th column of $A$ is denoted by $a_{i}$ and the $i$-th row is denoted as $a^{(i)}$.

2.1 Linear Algebra

We collect in this section linear algebra facts and lemmas that are used in later sections.

Fact 2.1.

For all $A, B \in \mathbb{R}^{n \times n}$, $\operatorname{Tr}(A^{T} B) \leq\|A\|_{F}\|B\|_{F}$.

Fact 2.2.

$\operatorname{Tr}(A B) \leq\|A\| \operatorname{Tr}(B)$ for positive definite $A$ and $B$.

A simple lemma that will be used later is stated below.

Lemma 2.3.

Let $A, B \in \mathbb{R}^{n \times n}$ be such that $\|A\| \leq c$, then $\|A B\|_{F} \leq c\|B\|_{F}$.

Proof.

The squared Frobenius norm of $A B$ is the sum of the squared norms of the columns of $A B$. Denoting the columns of $B$ by $b_{i}$, we have,

$$\|A B\|_{F}^{2}=\sum_{i}\left\|A b_{i}\right\|^{2} \leq\|A\|^{2} \sum_{i}\left\|b_{i}\right\|^{2}=\|A\|^{2}\|B\|_{F}^{2}, \tag{2}$$

where the inequality follows as $\left\|A b_{i}\right\| \leq\|A\|\left\|b_{i}\right\|$. ∎
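A quick numerical sanity check of the bound in Lemma 2.3 (assuming the standard reading $\|A B\|_{F} \leq\|A\|\,\|B\|_{F}$ reconstructed above, taking $c=\|A\|$):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.standard_normal((5, 5)), rng.standard_normal((5, 5))
lhs = np.linalg.norm(A @ B, "fro")
rhs = np.linalg.norm(A, 2) * np.linalg.norm(B, "fro")   # spectral norm times Frobenius norm
assert lhs <= rhs + 1e-12
```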

2.2 Quantum procedures

We will require the following auxiliary lemma that gives a procedure for preparing $|x, y\rangle$ given procedures for preparing $|x\rangle$ and $|y\rangle$. This lemma will be useful for preparing the block encodings required for the quantum interior point method.

Lemma 2.4.

Let $x \in \mathbb{R}^{2^{k}}$ and $y \in \mathbb{R}^{2^{l}}$ for integers $k, l$, and let $U_{x}$ and $U_{y}$ be unitary operators with $U_{x}|0\rangle=|x\rangle$ and $U_{y}|0\rangle=|y\rangle$ that can be implemented in time $T_{x}$ and $T_{y}$ respectively. Then, given $\|x\|$ and $\|y\|$, the state $|x, y\rangle$ can be prepared in time $O(T_{x}+T_{y})$.

Proof.

We start with an auxiliary qubit in the state $\frac{1}{Z}(\|x\||0\rangle+\|y\||1\rangle)$ and an output register initialized to $|0\rangle$, where $Z=(\|x\|^{2}+\|y\|^{2})^{1/2}$. The initial state is,

$$\frac{1}{Z}(\|x\||0\rangle+\|y\||1\rangle)|0\rangle.$$

Conditioned on the control qubit being $|0\rangle$ apply $U_{x}$ to obtain the state $|x\rangle$ in the output register. Conditioned on the control qubit being $|1\rangle$ apply $U_{y}$ to obtain $|y\rangle$. After this step we obtain the state,

$$\frac{1}{Z}(\|x\||0\rangle|x\rangle+\|y\||1\rangle|y\rangle).$$

Let $D$ be the dimension of the output register. Define the unitary $W$ which acts as follows: $W|0\rangle|j\rangle=|0\rangle|j\rangle$ for $j<2^{k}$ and $W|1\rangle|j\rangle=|1\rangle|j+2^{k}\rangle$ for $j<2^{l}$. Note that $W$ extends to a unitary as it permutes orthogonal basis states. Applying $W$ to the above state we obtain,

$$\frac{1}{Z}\left(\sum_{i \in[2^{k}]} x_{i}|0\rangle|i\rangle+\sum_{j \in[2^{l}]} y_{j}|1\rangle|j+2^{k}\rangle\right).$$

Apply a bit-flip on the control qubit conditioned on the output register being in a state $|s\rangle$ such that $s \geq 2^{k}$; this operation erases the control qubit as it maps $|1\rangle|j+2^{k}\rangle \rightarrow|0\rangle|j+2^{k}\rangle$. Discarding the auxiliary qubit, after this operation we have the desired state,

$$|x, y\rangle=\frac{1}{Z}\left(\sum_{i \in[2^{k}]} x_{i}|i\rangle+\sum_{j \in[2^{l}]} y_{j}|j+2^{k}\rangle\right).$$

The time required for the procedure is $O(T_{x}+T_{y})$ for the conditional operations; the remaining steps have negligible cost. ∎
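The net effect of the procedure is that the prepared state is exactly the vector state of the concatenated vector. A small numpy check of this claim (a classical verification of the target amplitudes, not a simulation of the unitaries):

```python
import numpy as np

def vector_state(v):
    """Amplitudes of the vector state |v>."""
    return v / np.linalg.norm(v)

def combine(x, y):
    """Lemma 2.4 target: (||x|| |0>|x> + ||y|| |1>|y>)/Z, with the index
    relabeling done by W, equals the vector state of the concatenation (x, y)."""
    ax = np.linalg.norm(x) * vector_state(x)
    ay = np.linalg.norm(y) * vector_state(y)
    z = np.concatenate([ax, ay])
    return z / np.linalg.norm(z)

x, y = np.array([3.0, 4.0]), np.array([1.0, 2.0, 2.0, 0.0])
assert np.allclose(combine(x, y), vector_state(np.concatenate([x, y])))
```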

We also state an approximate version of the above lemma where the norms are estimated within multiplicative error $\epsilon$.

Corollary 2.5.

Given estimates $\overline{\|x\|}$ and $\overline{\|y\|}$ with $|\overline{\|x\|}-\|x\|| \leq \epsilon\|x\|$ and $|\overline{\|y\|}-\|y\|| \leq \epsilon\|y\|$, and unitaries $U_{x}, U_{y}$ as in Lemma 2.4, a state $|\overline{x, y}\rangle$ such that $\||\overline{x, y}\rangle-|x, y\rangle\| \leq 2\epsilon$ can be prepared in time $O(T_{x}+T_{y})$.

The proof is straightforward and follows by replacing the norms $\|x\|$ and $\|y\|$ in the proof of Lemma 2.4 by their respective estimates.

3 Quantum linear system solvers and QRAM data structures

In this section we collect the results on quantum linear algebra primitives and QRAM data structures that are required for the interior point method.

3.1 Quantum linear system solvers

We assume without loss of generality that the matrices are symmetric. If the matrix $A$ is rectangular or not symmetric, then it is well known that we can work instead with the symmetrized matrix $\left(\begin{smallmatrix}0 & A \\ A^{T} & 0\end{smallmatrix}\right)$.

In [17, 16] we constructed efficient data structures (Definition 1.1) for storing matrices that were used to obtain algorithms for quantum linear algebra operations, including matrix inversion and multiplication, with running time $\widetilde{O}(\kappa^{2} \mu(A)/\epsilon)$, where $\mu(A)$ is a factor that depends on the matrix and $\epsilon$ is the accuracy in the $\ell_{2}$ norm. These algorithms used phase estimation and amplitude amplification and therefore require time inverse polynomial in the precision $\epsilon$ and quadratic in the condition number $\kappa$. In recent work [10], the dependence on the precision for the quantum linear system solvers in the QRAM data structure model was improved to $\operatorname{polylog}(1/\epsilon)$ and that on $\kappa$ to linear. In order to state the improved results, we recall the notion of a block encoding of a matrix introduced in [10].

Definition 3.1.

Let $A$ be a $w$-qubit operator. A $(w+a)$-qubit unitary $U$ is an $(\alpha, a, \epsilon)$ block encoding of $A$ if $\left\|A-\alpha\left(\left\langle 0^{a}\right| \otimes I\right) U\left(\left|0^{a}\right\rangle \otimes I\right)\right\| \leq \epsilon$. A block encoding of $A$ is said to be efficient if it can be implemented in time polylogarithmic in the dimension of $A$.
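As a concrete illustration, for a symmetric matrix $A$ with $\|A\| \leq \alpha$, the standard unitary dilation gives an exact $(\alpha, 1, 0)$ block encoding. The numpy sketch below is our own generic construction for illustration; it is not the QRAM-based encoding used in the paper.

```python
import numpy as np

def block_encoding(A, alpha):
    """Unitary dilation U = [[S, R], [R, -S]] with S = A/alpha and
    R = sqrt(I - S^2); the top-left block of U equals A/alpha exactly."""
    S = A / alpha
    w, V = np.linalg.eigh(np.eye(A.shape[0]) - S @ S)
    R = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T   # matrix square root
    return np.block([[S, R], [R, -S]])

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2
alpha = np.linalg.norm(A, 2)                # any alpha >= ||A|| works
U = block_encoding(A, alpha)
assert np.allclose(U @ U.T, np.eye(8), atol=1e-8)   # U is orthogonal (real unitary)
assert np.allclose(alpha * U[:4, :4], A)            # top-left block recovers A
```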

We next state the results on improved linear system solvers and matrix multiplication from [10, 11]. In the theorem stated below, the first part is given as Lemma 27, the second part of the result follows from Lemma 22 and the third part follows from Corollary 29 and Theorem 21 in [10]. The singular value transformation approach [11] can also be used to obtain these results.

Theorem 3.2.

[10, 11] Let $A \in \mathbb{R}^{N \times N}$ be a matrix with non-zero eigenvalues in the interval $[-1,-1/\kappa] \cup[1/\kappa, 1]$. Given an implementation of an $(\alpha, a, \delta)$ block encoding for $A$ in time $T_{U}$ and a procedure for preparing a state $|b\rangle$ in time $T_{b}$,

  1. If $\delta=o\left(\epsilon /\left(\kappa^{2} \operatorname{polylog}(\kappa/\epsilon)\right)\right)$, then a state $\epsilon$-close to $\left|A^{-1} b\right\rangle$ can be generated in time $O\left(\left(T_{U} \kappa \alpha+T_{b} \kappa\right) \operatorname{polylog}(\kappa \alpha/\epsilon)\right)$.

  2. If $\delta=o\left(\epsilon /\left(\kappa \operatorname{polylog}(\kappa/\epsilon)\right)\right)$, then a state $\epsilon$-close to $|A b\rangle$ can be generated in time $O\left(\left(T_{U} \kappa \alpha+T_{b} \kappa\right) \operatorname{polylog}(\kappa \alpha/\epsilon)\right)$.

  3. For $\mathcal{A} \in\left\{A, A^{-1}\right\}$ and $\epsilon, \delta$ as in parts 1 and 2, an estimate $\Lambda$ such that $\Lambda \in(1 \pm \epsilon)\|\mathcal{A} b\|$ with probability at least $1-1/\operatorname{poly}(N)$ can be generated in time $O\left(\left(T_{U}+T_{b}\right) \frac{\kappa \alpha}{\epsilon} \operatorname{polylog}(\kappa \alpha/\epsilon)\right)$.

We require the following simple result on composing such operations.

Theorem 3.3.

[10, 11] Given $(\alpha_{i}, a_{i}, \delta)$ block encodings for matrices $M_{i}$ implemented in time $T_{i}$ for $i \in\{1,2\}$, with $\kappa(M_{i}) \leq \kappa_{i}$, and a procedure for preparing a state $|b\rangle$ in time $T_{b}$, a state $\epsilon$-close to $\left|M_{2} M_{1} b\right\rangle$ can be generated in time $\widetilde{O}\left(\kappa_{1} \kappa_{2}\left(\alpha_{1} T_{1}+\alpha_{2} T_{2}+T_{b}\right)\right)$, and an $\epsilon$-multiplicative estimate of the norm $\left\|M_{2} M_{1} b\right\|$ in time $\widetilde{O}\left(\kappa_{1} \kappa_{2}\left(\alpha_{1} T_{1}+\alpha_{2} T_{2}+T_{b}\right)/\epsilon\right)$.

The running time has a factor $\kappa_{1} \kappa_{2}$ instead of a factor $\kappa(M_{2} M_{1})$ since we do not perform amplitude amplification after every matrix operation but only at the end.

It is standard to assume for quantum linear system solvers that the eigenvalues of $A$ belong to the interval $[-1,-1/\kappa] \cup[1/\kappa, 1]$; this can be ensured by rescaling the matrix. The quantum data structures constructed in [17, 16] implement a unitary which is a $(\mu(A), O(\log n), \epsilon)$ block encoding for a matrix $A \in \mathbb{R}^{n \times n}$ in time $\widetilde{O}(\operatorname{polylog}(n/\epsilon))$.

Theorem 3.4.

[16, 17] There are efficient QRAM data structures for storing vectors $v \in \mathbb{R}^{n}$ and matrices $A \in \mathbb{R}^{n \times n}$ such that with access to these data structures there is,

  1. An $\widetilde{O}(1)$ time state preparation procedure for $|v\rangle$.

  2. An efficient $(\mu(A), O(\log n), \epsilon)$ block encoding for $A$, for
    $\mu(A)=\min_{p \in[0,1]}\left(\|A\|_{F}, \sqrt{s_{2 p}(A) s_{2(1-p)}(A^{T})}\right)$.
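The normalization $\mu(A)$ is classically computable from the row norms of $A$. A short sketch follows, assuming the definition of $\mu$ reconstructed above with $s_{q}(A)=\max_{i} \sum_{j}|A_{ij}|^{q}$; the minimization over $p$ is done on a finite grid here, whereas the definition ranges over all $p \in [0, 1]$.

```python
import numpy as np

def s(q, A):
    """s_q(A) = max_i sum_j |A_ij|^q, a maximum over rows."""
    return np.max(np.sum(np.abs(A) ** q, axis=1))

def mu(A, grid=101):
    """mu(A) = min(||A||_F, min_p sqrt(s_2p(A) s_2(1-p)(A^T))).
    The grid excludes the endpoints p = 0, 1, where 0**0 = 1 in floating
    point would miscount zero entries of a sparse matrix."""
    fro = np.linalg.norm(A, "fro")
    ps = np.linspace(0.0, 1.0, grid)[1:-1]
    best = min(np.sqrt(s(2 * p, A) * s(2 * (1 - p), A.T)) for p in ps)
    return min(fro, best)
```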

We also require an auxiliary result which provides error bounds for implementing a block encoding of a matrix whose rows and columns can be prepared within error $\delta$. It follows from the second part of Lemma 23 in [10], which provides an analysis of the QRAM data structure in [16] for the approximate case.

Lemma 3.5.

[10, 16] Let $A \in \mathbb{R}^{n \times n}$ be a matrix such that the unitaries preparing the vector states for its rows and columns can be implemented to error $\delta$ in time $T$; then a $(\mu(A), O(\log n), O(\delta))$ block encoding for $A$ can be implemented in time $O(T)$.

3.2 Quantum data structures

For the quantum interior point method we require that the inputs for the SDP, namely the constraint matrices $A^{(j)}$ and the matrix $B$, are stored in appropriate QRAM data structures so that we can perform multiplication or inversion with these matrices with the guarantees in Theorem 3.2.

We also need to prepare vector states whose entries are drawn from the input matrices across the different constraints. The SDP input matrices are therefore stored as a tensor $\mathcal{A} \in \mathbb{R}^{n \times n \times m}$ whose slices are the matrices $A^{(j)}$. We provide efficient QRAM data structures that enable the above operations to be carried out in time $\widetilde{O}(1)$. We show next that this can be done using the same algorithms as in [16, 17] and with the same running time guarantees.

Let $\mathcal{A} \in \mathbb{R}^{n \times n \times m}$ be a 3-dimensional tensor. Let $\mathcal{M}(\mathcal{A})$ be the set of matrices that can be obtained by fixing the third index of $\mathcal{A}$, and let $\mathcal{V}(\mathcal{A})$ be the set of vectors that can be obtained by fixing the first two indices. We have the following corollary extending Theorem 3.4 to the storage of tensors.

Corollary 3.6.

There are efficient QRAM data structures for storing the tensor $\mathcal{A}$ such that, given quantum access to these data structures, there are efficient block encodings for all matrices $M \in \mathcal{M}(\mathcal{A})$ and $\widetilde{O}(1)$ time preparation procedures for all vector states $|v\rangle$ with $v \in \mathcal{V}(\mathcal{A})$.

Proof.

The algorithm maintains separate data structures for all matrices in $\mathcal{M}(\mathcal{A})$ and vectors in $\mathcal{V}(\mathcal{A})$. A single entry $\mathcal{A}_{i j k}$ belongs to exactly one matrix in $\mathcal{M}(\mathcal{A})$ and one vector in $\mathcal{V}(\mathcal{A})$. On receiving the entry $\mathcal{A}_{i j k}$, the algorithm updates the two corresponding data structures. This incurs a constant overhead over Theorem 3.4 and thus achieves the same running time and guarantees. ∎
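A hypothetical streaming wrapper illustrating this proof, reusing the VectorStateTree sketch from Section 1: each streamed tensor entry triggers exactly two logarithmic-time updates.

```python
class TensorStore:
    """Sketch for Corollary 3.6: one data structure per matrix slice A^(k)
    and one per entry position (i, j) across the slices."""
    def __init__(self, n, m):
        self.n = n
        self.mats = [VectorStateTree(n * n) for _ in range(m)]   # matrices in M(A)
        self.vecs = [VectorStateTree(m) for _ in range(n * n)]   # vectors in V(A)

    def insert(self, i, j, k, value):
        # Entry A_ijk belongs to exactly one matrix and one vector.
        self.mats[k].update(i * self.n + j, value)
        self.vecs[i * self.n + j].update(k, value)
```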

4 Tomography for efficient vector states

We present in this section an algorithm for tomography of pure states with real amplitudes, for the setting where we have an efficient unitary to prepare the states. The tomography algorithm will be used to recover classical information from the $O(n^{2})$ dimensional states corresponding to the solutions of the Newton linear system in each step of the interior point method.

Let $U$ be a unitary operator that creates copies of the vector state $|x\rangle$ for a unit vector $x \in \mathbb{R}^{d}$ in time $T_{U}$. We also assume that we can apply the controlled version of $U$ in the same time. Our tomography algorithm outputs a unit vector $\widetilde{x}$ such that $\|\widetilde{x}-x\| \leq \sqrt{7} \delta$ with probability at least $1-1/\operatorname{poly}(d)$. The algorithm uses $N=O(d \log d/\delta^{2})$ calls to $U$ and runs in time $O(T_{U} d \log d/\delta^{2})$; thus the sample and time complexity for our algorithm are both $\widetilde{O}(d/\delta^{2})$ when $T_{U}$ is polylogarithmic. Such a linear time tomography algorithm for real pure states does not follow from existing results.

A sample-efficient tomography algorithm for mixed states, requiring $O(d \cdot \operatorname{rank}(\rho)/\delta^{2})$ copies of $\rho$ to obtain an estimate $\widetilde{\rho}$ with the guarantee $\|\widetilde{\rho}-\rho\|_{tr} \leq \delta$, was given in O'Donnell and Wright [22]. The time complexity of this approach is high as it uses a computationally expensive state estimation procedure [18]. Another approach to tomography is through compressed sensing [13]. While this is sample-efficient for low-rank states (including pure states), it is also computationally expensive as it solves an SDP to perform the reconstruction.

We note that our tomography problem is easier than the more general problems addressed in [22, 13], since we assume we can apply the unitary that prepares the state and its controlled version. Note also that for our purposes we only need to deal with real amplitudes, though complex amplitudes could be dealt with in a similar way. We next state our tomography algorithm and establish its correctness.

Require:  Access to a unitary $U$ such that $U|0\rangle=|x\rangle=\sum_{i \in[d]} x_{i}|i\rangle$ for a unit vector $x \in \mathbb{R}^{d}$, and to its controlled version.
  1. Amplitude estimation

    1. Measure $N=\frac{36 d \ln d}{\delta^{2}}$ copies of $|x\rangle$ in the standard basis and obtain estimates $p_{i}=n_{i}/N$, where $n_{i}$ is the number of times outcome $i$ is observed.

    2. Store $\sqrt{p}=(\sqrt{p_{1}}, \ldots, \sqrt{p_{d}})$ in a QRAM data structure so that $|\sqrt{p}\rangle=\sum_{i \in[d]} \sqrt{p_{i}}|i\rangle$ can be prepared efficiently.

  2. Sign estimation

    1. Create $N$ copies of the state $\frac{1}{\sqrt{2}}(|0\rangle|x\rangle+|1\rangle|\sqrt{p}\rangle)$ using a control qubit.

    2. Apply a Hadamard gate on the first qubit of each copy of the state to obtain $\frac{1}{2}[|0\rangle(|x\rangle+|\sqrt{p}\rangle)+|1\rangle(|x\rangle-|\sqrt{p}\rangle)]$.

    3. Measure each copy in the standard basis and maintain counts $n(0, i)$ of the number of times outcome $|0, i\rangle$ is observed for $i \in[d]$.

    4. Set $\sigma_{i}=1$ if $n(0, i)>0.4\, p_{i} N$ and $\sigma_{i}=-1$ otherwise.

  3. Output the unit vector $\widetilde{x}$ with $\widetilde{x}_{i}=\sigma_{i} \sqrt{p_{i}}$.

Algorithm 1 Vector state tomography algorithm.
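For concreteness, the following Python sketch simulates Algorithm 1 classically by sampling from the exact outcome distributions of the two measurement steps; the constants ($N=36 d \ln d/\delta^{2}$ and the $0.4\, p_{i} N$ threshold) follow our reconstruction of the algorithm above.

```python
import numpy as np

def tomography(x, delta, rng=np.random.default_rng(3)):
    """Classical simulation of Algorithm 1 for a real unit vector x."""
    d = len(x)
    N = int(36 * d * np.log(d) / delta**2)
    # Step 1 (amplitude estimation): measuring |x> gives outcome i w.p. x_i^2.
    p = rng.multinomial(N, x**2) / N
    # Step 2 (sign estimation): after the Hadamard, outcome (0, i) has
    # probability (x_i + sqrt(p_i))^2 / 4 and (1, i) has (x_i - sqrt(p_i))^2 / 4.
    probs = np.concatenate([(x + np.sqrt(p))**2, (x - np.sqrt(p))**2]) / 4
    n0 = rng.multinomial(N, probs / probs.sum())[:d]
    sigma = np.where(n0 > 0.4 * p * N, 1.0, -1.0)
    return sigma * np.sqrt(p)           # a unit vector, since sum_i p_i = 1

rng = np.random.default_rng(4)
x = rng.standard_normal(16); x /= np.linalg.norm(x)
print(np.linalg.norm(tomography(x, delta=0.1) - x))   # error of order delta
```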

We will use the following version of the multiplicative Chernoff bounds for the analysis.

Fact 4.1.

[1] Let $X_{j}$ for $j \in[N]$ be independent random variables such that $X_{j} \in[0,1]$, and let $X=\sum_{j \in[N]} X_{j}$. Then,

  1. For $0<\delta<1$: $\operatorname{Pr}[X \leq(1-\delta) \mathbb{E}[X]] \leq e^{-\frac{\delta^{2}}{2} \mathbb{E}[X]}$.

  2. For $\delta>0$: $\operatorname{Pr}[X \geq(1+\delta) \mathbb{E}[X]] \leq e^{-\frac{\delta^{2}}{2+\delta} \mathbb{E}[X]}$.

Combining the two bounds, for $0<\delta<1$ we have $\operatorname{Pr}[|X-\mathbb{E}[X]| \geq \delta \mathbb{E}[X]] \leq 2 e^{-\frac{\delta^{2}}{3} \mathbb{E}[X]}$.

We first prove an auxiliary lemma showing that the sign estimation procedure in Step 2 of Algorithm 1 succeeds with high probability for all $i$ such that $|x_{i}|$ is sufficiently large.

Lemma 4.2.

Let $N=\frac{36 d \ln d}{\delta^{2}}$. Then, with probability at least $1-1/\operatorname{poly}(d)$, we have $\sigma_{i}=\operatorname{sgn}(x_{i})$ for all $i$ such that $x_{i}^{2} \geq \frac{\delta^{2}}{d}$.

Proof.

Let $X_{i j}$ be the indicator random variable for the event that the $j$-th measurement outcome in step 1 of Algorithm 1 is $i$, and let $X_{i}=\sum_{j \in[N]} X_{i j}$. Note that $\mathbb{E}[X_{i}]=N x_{i}^{2}$ and $p_{i}=X_{i}/N$. Applying the multiplicative Chernoff bound (Fact 4.1) with $\delta=1/2$, we have that $\operatorname{Pr}[|p_{i}-x_{i}^{2}| \geq x_{i}^{2}/2] \leq 2 e^{-N x_{i}^{2}/12}$ for all $i \in [d]$.

Using the fact that $N x_{i}^{2} \geq 36 \ln d$ for all $i$ with $x_{i}^{2} \geq \delta^{2}/d$ and choosing such an $i$, we have,

$$\operatorname{Pr}\left[|p_{i}-x_{i}^{2}| \geq x_{i}^{2}/2\right] \leq 2 e^{-3 \ln d}=\frac{2}{d^{3}}$$

for a fixed $i$. By the union bound, the event that $|p_{i}-x_{i}^{2}| \leq x_{i}^{2}/2$, or equivalently $p_{i} \in[x_{i}^{2}/2, 3 x_{i}^{2}/2]$, holds for all such $i$ with probability at least $1-\frac{2}{d^{2}}$. Let us condition on this event.

We next show that the algorithm obtains the correct signs for all such $i$ with high probability. We provide the argument for the case when the sign is positive (i.e. when we need to output $\sigma_{i}=1$); the other case is similar with $|x\rangle+|\sqrt{p}\rangle$ replaced by $|x\rangle-|\sqrt{p}\rangle$. We have,

$$\operatorname{Pr}[\text{outcome } (0, i)]=\frac{(x_{i}+\sqrt{p_{i}})^{2}}{4}.$$

Further, as $x_{i}>0$ and $p_{i} \geq x_{i}^{2}/2$, we have that $\operatorname{Pr}[(0, i)] \geq \frac{(1+1/\sqrt{2})^{2}}{4} x_{i}^{2} \geq 0.48\, p_{i}$, using $p_{i} \leq 3 x_{i}^{2}/2$. Using the multiplicative Chernoff bound, we conclude that with probability at least $1-1/\operatorname{poly}(d)$ (for a suitable choice of the constant in $N$) we have $n(0, i)>0.4\, p_{i} N$, and in this case the algorithm correctly determines the sign of $x_{i}$. By the union bound, the signs are determined correctly for all such $i$ with probability at least $1-1/\operatorname{poly}(d)$; the claim follows. ∎

The following theorem establishes the correctness of Algorithm 1.

Theorem 4.3.

Algorithm 1 produces an estimate $\widetilde{x} \in \mathbb{R}^{d}$ with $\|\widetilde{x}\|=1$ such that $\|\widetilde{x}-x\| \leq \sqrt{7} \delta$ with probability at least $1-1/\operatorname{poly}(d)$.

Proof.

As shown in the proof of Lemma 4.2, the multiplicative Chernoff bound (Fact 4.1) implies that $\operatorname{Pr}[|p_{i}-x_{i}^{2}| \geq \epsilon x_{i}^{2}] \leq 2 e^{-\frac{\epsilon^{2}}{3} N x_{i}^{2}}$ for all $i \in [d]$ and all $0<\epsilon<1$. Using the factorization $|p_{i}-x_{i}^{2}|=(\sqrt{p_{i}}+|x_{i}|)\,|\sqrt{p_{i}}-|x_{i}||$, the Chernoff bound can be rewritten as

$$\operatorname{Pr}\left[|\sqrt{p_{i}}-|x_{i}||\,(\sqrt{p_{i}}+|x_{i}|) \geq \epsilon x_{i}^{2}\right] \leq 2 e^{-\frac{\epsilon^{2}}{3} N x_{i}^{2}}.$$

As $\sqrt{p_{i}}+|x_{i}| \geq|x_{i}|$ for all $i$, it follows that $\operatorname{Pr}[|\sqrt{p_{i}}-|x_{i}|| \geq \epsilon|x_{i}|] \leq 2 e^{-\frac{\epsilon^{2}}{3} N x_{i}^{2}}$ for all $i$ and for all $0<\epsilon<1$. For a fixed $i$, choosing $\epsilon|x_{i}|=\frac{\delta}{\sqrt{2 d}}$ we obtain,

$$\operatorname{Pr}\left[|\sqrt{p_{i}}-|x_{i}|| \geq \frac{\delta}{\sqrt{2 d}}\right] \leq 2 e^{-\frac{N \delta^{2}}{6 d}} \leq \frac{2}{d^{6}}.$$

By the union bound, the event $\mathcal{E}$ that $|\sqrt{p_{i}}-|x_{i}|| \leq \frac{\delta}{\sqrt{2 d}}$ for all $i \in[d]$ occurs with probability at least $1-\frac{2}{d^{5}}$. Conditioning on $\mathcal{E}$, we have the bound $\sum_{i}(\sqrt{p_{i}}-|x_{i}|)^{2} \leq \frac{\delta^{2}}{2}$.

We can now bound the error for the algorithm conditioned on the event that the signs are determined correctly for all $i$ with $x_{i}^{2} \geq \delta^{2}/d$ (Lemma 4.2) and on $\mathcal{E}$, and have

$$\|\widetilde{x}-x\|^{2}=\sum_{i}(\sigma_{i} \sqrt{p_{i}}-x_{i})^{2}=\sum_{i}(\sqrt{p_{i}}-|x_{i}|)^{2}+4 \sum_{i: \sigma_{i} \neq \operatorname{sgn}(x_{i})}|x_{i}| \sqrt{p_{i}} \leq \frac{\delta^{2}}{2}+4 \sum_{i: x_{i}^{2}<\delta^{2}/d}|x_{i}| \sqrt{p_{i}}. \tag{3}$$

For the second inequality, we used that the signs are correct for all $i$ with $x_{i}^{2} \geq \delta^{2}/d$. It therefore suffices to show that $\sum_{i: x_{i}^{2}<\delta^{2}/d}|x_{i}| \sqrt{p_{i}}=O(\delta^{2})$ with high probability.

Part 2 of the multiplicative Chernoff bound yields that $\operatorname{Pr}[p_{i} \geq(1+\eta) x_{i}^{2}] \leq e^{-\frac{\eta^{2}}{2+\eta} N x_{i}^{2}}$ for all $\eta>0$. Choosing $\eta$ such that $(1+\eta) x_{i}^{2}=x_{i}^{2}+\frac{\delta^{2}}{d}$, we have $\operatorname{Pr}[p_{i} \geq x_{i}^{2}+\frac{\delta^{2}}{d}] \leq e^{-\frac{N \delta^{2}}{3 d}}=\frac{1}{d^{12}}$ for the indices with $x_{i}^{2}<\delta^{2}/d$ (so that $\eta \geq 1$). Thus, with overwhelming probability, $p_{i} \leq x_{i}^{2}+\frac{\delta^{2}}{d} \leq \frac{2 \delta^{2}}{d}$ and hence $|x_{i}| \sqrt{p_{i}} \leq \frac{\sqrt{2} \delta^{2}}{d}$ for all such $i$. Substituting in equation (3) we obtain that with probability at least $1-O(1/d^{2})$ (ignoring lower order terms), we have $\|\widetilde{x}-x\|^{2} \leq(\frac{1}{2}+4 \sqrt{2}) \delta^{2} \leq 7 \delta^{2}$; the theorem follows. ∎

The success probability for our vector state tomography algorithm can be boosted to $1-1/\operatorname{poly}(n)$ by increasing the number of samples to $N=O(d \ln n/\delta^{2})$. In the interior point method we perform tomography for $\widetilde{O}(\sqrt{n})$ iterations, so Theorem 4.3 ensures that all the tomography results will be correct with high probability. In order to extend this approach to all pure states, instead of the sign one would need to estimate the phase of each amplitude to sufficient accuracy.

The quantum tomography algorithm is used for learning the output of a quantum linear system solver, that is, the unitary $U$ in Algorithm 1 is not perfect but produces a state $|\widetilde{z}\rangle$ such that $\||\widetilde{z}\rangle-|z\rangle\| \leq \epsilon$; equivalently, it produces a density matrix $\widetilde{\rho}$ such that the trace distance between $\widetilde{\rho}$ and $|z\rangle\langle z|$ is $O(\epsilon)$. As long as the error $\epsilon$ is $O(1/\operatorname{poly}(d))$, the trace distance between the outputs of the algorithm run on $\widetilde{\rho}$ and on the ideal states remains close to $0$, and hence any algorithm with input $\widetilde{\rho}$ will have the same guarantees as the error-free Algorithm 1.

The complexity of the linear system solver scales as $\operatorname{polylog}(1/\epsilon)$ in the error parameter by Theorem 3.2, hence the precision can be boosted so that the error is $1/\operatorname{poly}(d)$ at the cost of a poly-logarithmic overhead in the dimension $d$. We can therefore assume that the guarantees in Theorem 4.3 hold when the tomography algorithm is used for reconstructing the solutions of the Newton linear system.

Note that if we want to estimate a non-unit vector $x$, and we have an estimate $\bar{\alpha}$ for $\|x\|$ with $|\bar{\alpha}-\|x\|| \leq \epsilon\|x\|$ and a unitary that outputs the vector state corresponding to the unit vector $x/\|x\|$, then we can first use tomography to get $\widetilde{x}$ with $\|\widetilde{x}-\frac{x}{\|x\|}\| \leq \delta$, and then we have that

$$\|\bar{\alpha} \widetilde{x}-x\| \leq \bar{\alpha}\left\|\widetilde{x}-\frac{x}{\|x\|}\right\|+|\bar{\alpha}-\|x\|| \leq(1+\epsilon) \delta\|x\|+\epsilon\|x\| \leq(\delta+2 \epsilon)\|x\|$$

for $\delta, \epsilon \leq 1$.

5 The classical interior point method

We start by providing the details of the classical interior point method for SDPs and elements of its analysis based on [6]. We assume the bit complexity is constant, so we hide some logarithmic factors. This method has the following complexity: (i) For Linear Programming in $n$ dimensions with $m=O(n)$ constraints, the algorithm has running time $\widetilde{O}(n^{3.5})$. (ii) For Semi-Definite Programming over