Distributed Semidefinite Programming with Application to Large-scale System Analysis

Distributed Semidefinite Programming with Application to Large-scale System Analysis

Sina Khoshfetrat Pakazad,  Anders Hansson,  Martin S. Andersen,  and Anders Rantzer,  S. Khoshfetrat Pakazad and A. Hansson are with the Division of Automatic Control, Department of Electrical Engineering, Linköping University, Sweden. Email: {sina.kh.pa, hansson}@isy.liu.se.M. S. Andersen is with Department of Applied Mathematics and Computer Science, Technical University of Denmark, Denmark. Email: mskan@dtu.dkA. Rantzer is with Department of Automatic Control, Lund University, Sweden. Email: anders.rantzer@control.lth.se
Abstract

Distributed algorithms for solving coupled semidefinite programs (SDPs) commonly require many iterations to converge. They also put high computational demand on the computational agents. In this paper we show that in case the coupled problem has an inherent tree structure, it is possible to devise an efficient distributed algorithm for solving such problems. This algorithm can potentially enjoy the same efficiency as centralized solvers that exploit sparsity. The proposed algorithm relies on predictor-corrector primal-dual interior-point methods, where we use a message-passing algorithm to compute the search directions distributedly. Message-passing here is closely related to dynamic programming over trees. This allows us to compute the exact search directions in a finite number of steps. Furthermore this number can be computed a priori and only depends on the coupling structure of the problem. We use the proposed algorithm for analyzing robustness of large-scale uncertain systems distributedly. We test the performance of this algorithm using numerical examples.

SDPs, distributed algorithms, primal-dual methods, robustness analysis, interconnected uncertain systems.

I Introduction

Semidefinite programs are convex optimization problems that include linear matrix inequalities (LMIs) or semidefinite constraints. The computational complexity of solving such problems commonly scales badly with the number of optimization variables and/or the dimension of the semidefinite constraints in the problem. This limits our ability to solve large SDPs. Despite this, large SDPs are appearing more and more in different engineering fields, e.g., in problems related to sensor networks, smart grids and analysis of uncertain systems, e.g., see [7, 4, 29, 2, 31]. This has been the driving force for devising efficient and tailored centralized solvers for such problems. These solvers exploit the structure in the problem to reduce the computational burden of solving the problem in a centralized manner, see e.g., [25, 2, 40, 19, 41]. Despite the success of such approaches for solving medium to large-scale problems, there are still problems that cannot be solved using centralized solvers, see e.g., [36, 1, 8, 14]. This can be due to limited available computational power and/or memory that prohibits us from solving the problem. Also it can be due to certain structural constraints, e.g., privacy requirements, that obstructs us from even forming the centralized problem.

For such instances, distributed algorithms may be used for solving the problem. These algorithms facilitate solving the problem using a network of computational agents, without the need for a centralized unit. Due to this, the computational complexity of these algorithms scales better, and they potentially enable us to address structural constraints in the problem. The main approach for designing distributed algorithms consists of two major phases. First the structure in the problem is exploited to decompose the problem or reformulate it as a coupled problem. Then first-order splitting methods are used for solving the resulting problem distributedly, see e.g., [37, 27]. This approach has been used in many applications, e.g., see [36, 23, 14]. In [36] the authors consider a sensor localization problem and use a so-called edge-based decomposition for reformulating the underlying SDP as a coupled one. They then employ alternating direction method of multipliers (ADMM) to solve the problem distributedly. An optimal power flow problem has been considered in [14], where the authors reformulate the problem as a coupled SDP using semidefinite relaxation techniques. They then use ADMM to solve the coupled problem distributedly. In [23] the authors consider robustness analysis of large-scale interconnected uncertain systems. They exploit the sparsity in the interconnections to decompose the underlying SDP and reformulate it as a coupled problem. This problem is then solved distributedly using algorithms that rely on proximal splitting methods.

The algorithms designed using the aforementioned approach, although effective, suffer from some issues. For instance, since these algorithms rely on first-order splitting methods, with convergence rates or where is the number of iterations, they require many iterations to converge to an accurate enough solution. Furthermore, exploiting structure and decomposing problems is commonly done through introduction of consensus constraints, which describe the coupling structure in the problem. The number of such constraints is commonly large for SDPs, which can in turn adversely affect the computational and/or convergence properties. Moreover the agents involved in these distributed algorithms need to solve an SDP at every iteration of the algorithm, which can potentially put a considerable computational burden on the agents.

In this paper we propose a distributed algorithm for solving coupled SDPs with a tree structure. These SDPs are defined in Section IV. This algorithm does not suffer from any of the aforementioned issues. We achieve this by avoiding the use of first-order splitting methods and instead rely on primal-dual interior-point methods, which have superior convergence properties. The proposed algorithm is produced by distributing the computations conducted at each iteration of the primal-dual method. Particularly, we use a message-passing algorithm for computing the search directions. Message passing, here, is closely related to non-serial dynamic programming, [22, 26, 5]. We also present a similar approach for distributing the remaining computations at every iteration. As a consequence, at each iteration of the primal-dual method, the computational burden on each agent is very low. In fact during each iteration, an agent is required to factorize a relatively small matrix once and is required to communicate with its neighbors twelve times.

The proposed algorithm in this paper is closely related to that of [22]. In fact, the authors in [22] use the same approach for devising a distributed algorithm for solving coupled non-conic problems. However, the computation of search directions for SDPs is not as straightforward as for non-conic problems. This is due to introduction of scaling matrices and their inverses in the KKT system, which destroys the structure in the problem. In order to circumvent this issue, we here put forth a novel way for computing the search directions at each iteration. This in turn enables us to use the message-passing algorithm for computing the search directions.

Notice that by using this approach for computing the search directions, we implicitly solve the so-called augmented system. This is done by computing a block factorization of its coefficient matrix using a fixed pivoting ordering, where the ordering is enforced by the coupling structure in the problem, [22]. This is in contrast to existing methods that commonly solve the so-called Schur complement system or normal equations. As a result, the proposed algorithm provides us with more stable and accurate implementation, [42, 12]. Solving the augmented system is also considered in [30], where the authors also compute the search directions through solving the augmented system by computing an factorization using fixed pivoting ordering. This is particularly done by using regularization and iterative refinement. In this paper, however, a block factorization is computed using a fixed pivoting ordering without the use of regularization. Hence, the augmented system is solved without the need for iterative refinement.

We then use the proposed algorithm for analyzing large-scale interconnected uncertain systems, distributedly. This is made possible by exploiting the sparsity in the interconnections, as outlined in [2]. A similar approach was also used in [23]. There, the authors utilized the so-called range-space decomposition for reformulating the analysis problem as a coupled feasibility problem. They then used algorithms that rely on proximal splitting methods for solving it distributedly. We here instead use the so-called domain-space decomposition to reformulate the analysis problem as a coupled SDP. The coupling structure of this coupled problem is less complicated than that of in [23], and has a tree structure. This then enables us to use the presented distributed algorithm for solving the problem efficiently and distributedly. We illustrate the performance of the algorithm using numerical examples.

Outline

Next we first define some notations that are used throughout the paper. In Section II we put forth a definition of coupled and loosely coupled SDPs. We review a predictor-corrector primal-dual interior-point method in Section III and briefly discuss how the structure in coupled problems is reflected in the computations conducted at every iteration of this method. Section IV expresses coupled problems with a tree structure and discusses the use of message-passing algorithm for solving coupled problems with a tree structure. This is then used in Section V where we present the proposed distributed algorithm for solving coupled SDPs with tree structure. In Section VI we discuss a decomposition approach for sparse SDPs. This approach is used in Section VII for reformulating the problem of robustness analysis of large-scale interconnected uncertain systems as coupled SDPs with a tree structure. We test the performance of the proposed distributed algorithm when applied to this problem using numerical experiments in Section VIII. Finally we finish the paper with some concluding remarks in Section IX.

Notation

We denote the set of real and complex numbers with and , and the set of real and complex matrices with and , respectively. The transpose and conjugate transpose of a matrix is denoted by and , respectively. The null space of a matrix is denoted by . With and we denote the set of symmetric and Hermitian matrices. The set of integer numbers is denoted by . Given a set of positive integers , the matrix is a 0–1 matrix obtained from an identity matrix with rows indexed by removed, where denotes the number of elements in . This means that is a -dimensional vector that contains the elements of indexed by . We denote this vector by . By and we denote the th element of vector and the element at row and column of matrix at the th iteration, respectively. Given matrices for , denotes a block-diagonal matrix with blocks specified by the given matrices. Similarly is a diagonal matrix with diagonal elements . Given vectors for , the column vector is all of the given vectors stacked. The generalized matrix inequality () means that is negative (semi)definite. Given a matrix , is an -dimensional vector that is obtained by stacking all columns of on top of each other. Given two matrices , . For a symmetric matrix

Operators and are defined as inverses of and , respectively. Given two matrices and by we denote the standard Kronecker product. Given , define as an matrix such that . Then for two matrices , denotes the symmetrized Kronecker product that is defined as

For properties of the symmetrized Kronecker product refer to [38]. Given two sets and , denotes the standard cartesian product and by we denote the symmetrized cartesian product defined as

For these two sets denotes the standard set minus. By we denote the minimum value and with we denote the minimizing argument of a function. By we denote the set of -dimensional square integrable signals, and represents the set of real, rational transfer function matrices with no poles in the closed right half plane. A graph is denoted by where is its set of vertices or nodes and denotes its set of edges. An induced graph by on , is a graph where .

Ii Coupled and Loosely Coupled SDPs

Let us consider a coupled SDP given as

(1a)
(1b)
(1c)

where such that has full column rank for all , with the ordered sets such that , and with such that if and . This problem can be seen as a combination of coupled subproblems, each of which defined by the objective function and constraints for and . Let us now define , which denotes the set of subproblems that are coupled in that they all depend on the variable . Notice that agents and are members of if and only if . It is possible to provide a more explicit description of the coupling among the subproblems by decomposing (1) as

(2a)
(2b)
(2c)
(2d)

Notice that in (2), the objective function terms and constraints in (2a)–(2c) are decoupled and the coupling in the problem is described using the consensus constraints in (2d). It is also possible to provide a graphical representation of the coupling using undirected graphs. Particularly let be a graph with vertex set as defined above and edge set . We refer to this graph as the sparsity graph of the problem. Let us now illustrate the definitions above using an example given as

(3a)
(3b)

Notice that the constraint in (3b) can be rewritten as

with , and . Then the optimal objective value of

(4a)
(4b)
(4c)
(4d)

defines an upperbound for the optimal objective value of (3). The problem in (4) is a coupled SDP with smaller semidefinite constraints. This method for reformulating the problem is commonly used for cases when the original problem is either impossible or very difficult to solve. Notice that this problem is in the same format as (1). The sparsity graph of this problem is illustrated in Figure 1, where for instance there is an edge between the nodes and since the intersection between the sets and is nonempty.

Fig. 1: Sparsity graph for the coupled SDP in (4).

In case for a coupled problem

  • for all ;

  • for all ,

then we call this problem loosely coupled. As we will see later, it is possible to devise efficient distributed solvers based on primal-dual interior-point methods for solving coupled and loosely coupled SDPs. To this end, let us first briefly review primal-dual interior-point methods for solving SDPs.

Iii Primal-Dual Interior-point Methods for Solving SDPs

It is possible to iteratively solve a standard-form SDP, given as

(5)

where and such that has full column rank, using primal-dual interior-point methods. Particularly, given the iterates , a primal-dual interior-point method generates the next iterates by taking a single Newton step applied to the perturbed KKT conditions

(6a)
(6b)
(6c)

together with and where . Specifically this Newton step can be computed by solving the following linear system of equations

(7a)
(7b)
(7c)

where , is the perturbation parameter with denoting the surrogate duality gap and , and where (7c) is a modified linearization of (6c) that ensures that the computed directions and are symmetric. There are different choices for the scaling matrix in (7c), e.g., see [38] and references therein. For the sake of brevity, we limit our discussion to the choices presented in [33, 32], that is we choose with where

(8)

This scaling is referred to as the Nesterov-Todd or NT scaling. In order to make the notation less complicated, from now on we drop the iteration index , and we use lowercase notation for denoting vectorized variables or residuals, e.g., we use as or as . Using symmetrized Kronecker product we can then rewrite (7) more compactly as

(9)

where , , and

(10)

see [38]. One way of solving (9), is to first solve for as in

(11)

and then solve

(12)

for and , where . Notice that since is positive definite, [38, Thm. 3.2], (12) also describes the optimality condition for the following convex optimization problem

(13)

So it is possible to compute and by either solving the system of equations in (12) or the problem in (13). In this paper we focus on predictor-corrector primal-dual methods that rely on modified Newton directions. In order to compute these directions, at each iteration, we need to solve (12) or (13) twice with different choices of . We lay out a predictor-corrector primal-dual interior-point method in Algorithm 1, based on the work in [38].

Remark 1

Algorithm 1 can detect infeasibility of the problem if either the primal or dual iterates diverge. This means that this algorithm is unable to detect weak infeasibility, [15, 28], and generally in such cases converges to a near-feasible solution, [39].

The major computational burden of each iteration of this primal-dual method concerns the computation of the predictor and corrector directions. Next we will investigate how the structure in coupled problems is reflected in (13) and how this structure can be used to our advantage.

1:Given , , , , , initial iterates such that and and .
2:repeat
3:     Compute .
4:     Predictor step: Set and compute the search directions , by either solving (12) or (13) and using (11).
5:     Compute primal and dual step sizes as
6:     Set .
7:     Corrector step: Having computed compute the search directions , by either solving (12) or (13) with
and using
8:     Compute primal and dual step sizes as above, though using and .
9:     Update
10:     Set .
11:     .
12:until  and .
Algorithm 1 Predictor-corrector Primal-dual Interior-point Method, [38]

Let us apply the primal-dual method in Algorithm 1 to the coupled SDP in (2). The perturbed KKT optimality conditions for this problem can be written as

(14a)
(14b)
(14c)
(14d)

for , together with

(15)

and for . Define . Similar to (7), given iterates and such that they satisfy (2d), , and such that for all , the Newton step corresponding to the above system of equations can be computed by solving

(16a)
(16b)
(16c)
(16d)

for , together with where the scaling matrices are computed as discussed above and in (8), though based on the given local iterates and . This system of equations can be rewritten in a more compact manner as

(17)

where , and are block-diagonal with diagonal blocks and

and . Also here , , and denote all the corresponding variables stacked, e.g., . Similarly , and denote all the primal, dual and centering residuals stacked, where each of the stacked terms in the residual vectors are based on

(18a)
(18b)
(18c)

Similar to before, we compute the primal-dual directions by first solving for as

(19)

or equivalently as

(20)

for . Then we solve

(21)

where with . Notice that the system of equations in (21) also describes the optimality conditions for

(22a)
(22b)
(22c)

where for . So the predictor and corrector directions can also be computed by solving (22). To be more precise, for the predictor directions, we solve (22), with , for and , and compute using (19). For the corrector directions, using the updated , we compute the directions and by solving (22) with

(23)

and compute as

(24)

for . As a result having computed predictor or corrector versions of the directions and , computing and can be done independently by computing agents in parallel. Also notice that the coupling structure in (22) is the same as in (2). This allows us to employ distributed computational algorithms to distributedly solve for the search directions using collaborating agents. To illustrate this, note that the problem in (22) can be written as

(25)

with and . This problem can be solved distributedly using proximal splitting methods, e.g., ADMM, [10, 34, 6]. The use of proximal splitting methods for computing the primal-dual directions has been considered in [3, 21]. Devising distributed algorithms for solving coupled SDPs that also rely on this approach can be seen as an extension of the use of the algorithm proposed in [3] to SDPs. Even though distributed algorithms based on proximal splitting are effective for non-conic problems, they suffer from certain issues when used for solving SDPs. Particularly, notice that the computed search directions using this approach are inexact and first-order splitting methods generally require many iterations to compute accurate enough search directions. Furthermore, the number of consensus constraints in (22c) are generally large for coupled SDPs which can in turn adversely affect the performance and numerical properties of such splitting methods. Also notice that for a predictor-corrector primal-dual method the search directions are computed through solving a system of the form (22) twice. This means that the iterative scheme for solving (22) needs to be run twice at each iteration of the primal-dual method. Hence, distributed algorithms that rely on proximal or first-order splitting for computing the search directions, potentially, require many iterations to converge to the solution. Despite all such issues, in many cases such splitting methods are among the only resorts for distributedly solving coupled or loosely coupled SDPs. However for coupled problems that have an inherent tree structure, which is common in loosely coupled SDPs, we can devise an efficient algorithm for solving coupled SDPs. This is the focus of the upcoming sections. But first we express what we mean by the tree structure.

Iv Tree Structure in Coupled Problems and Message Passing

Fig. 2: Clustered sparsity graph.
Fig. 3: Tree representation of the sparsity graph .

Let us reconsider the coupled SDP in (4). Notice that for this problem it is possible to cluster the variables or the nodes in its sparsity graph as shown in Figure 2. As can be seen from the figure, each of the clusters induce a complete subgraph on the sparsity graph. We can then provide a more compact representation of the sparsity graph using the tree in Figure 3. Each node in this tree corresponds to each of the clusters of variables denoted by . Furthermore, for this problem, the tree is such that for every two nodes and in the tree, is contained in all the clusters in the path connecting the two nodes in the tree. We refer to problems that enjoy this inherent structure as coupled with a tree structure. Next we lay out an approach for exploiting this structure in coupled problems.

Let us start by describing some definitions relating to graphs. Consider a graph . A clique of this graph is a maximal subset of that induces a complete subgraph on , i.e., no clique is properly contained in another clique, [9]. Assume that all cycles of length at least four of have a chord, where a chord is an edge between two non-consecutive vertices in a cycle. This graph is then called chordal [17, Ch. 4]. It is possible to make a non-chordal graph chordal by adding edges to the graph. The resulting graph is then referred to as a chordal embedding. Let denote the set of its cliques, where is the number of cliques of the graph. Then there exists a tree defined on such that for every where , is contained in all the cliques in the path connecting the two cliques in the tree. This property is called the clique intersection property, [9], and trees with this property are referred to as clique trees. As a result it is possible to represent chordal graphs using clique trees. This means that in case the sparsity graph is chordal, it is possible to use algorithms for generating clique trees for chordal graphs, to extract the aforementioned tree structure in the problem. In fact this has been used for the coupled example in (4). Notice that the sparsity graph for this example is chordal, and the clusters marked in Figure 2 are its cliques. Their corresponding clique tree is depicted in Figure 3. Also notice that in case the sparsity graph is not chordal, the same procedure can be used on its chordal embedding for extracting the tree structure. Coupled problems with a tree structure can be solved using a message-passing algorithm. Consider the following coupled convex optimization problem

(26)

where for . This problem can be seen as a combination of subproblems, each of which is defined by a term in the cost function and depends only on a few elements of . Let us describe the coupling structure in this problem in a similar manner as we did for the coupled SDP in (1). That is we denote the ordered set of indices of that each subproblem depends on by , and we denote the ordered set of indices of functions that depend on by . We can equivalently rewrite this problem as

(27)

where the functions are lower dimensional descriptions of s such that for all and . Let us assume that the sparsity graph of this problem, , has an inherent tree structure with a set of cliques and a clique tree, . This problem can be solved distributedly using the message-passing algorithm that utilizes the clique tree as its computational graph. This means that the nodes act as computational agents that communicate or collaborate with their neighbors defined by the edge set . In order to describe the message-passing algorithm, we first need to assign each subproblem in (27) to each of the agents. We can assign a subproblem or function to an agent if . Let us denote the set of indices of the subproblems assigned to agent by . Then we can rewrite (26) as

(28)

where . The message-passing algorithm, much the same as dynamic programming, solves (28) by performing an upward-downward pass through the clique tree, see e.g., [22, Sec. 4], [26] and references therein. Next we show how the message-passing algorithm can be used for devising distributed solvers for coupled SDPs with a tree structure.

V Distributed Primal-dual Interior-point Methods for Coupled SDPs

Let us reconsider the coupled SDP in (1), and assume that the sparsity graph of this problem, , has an inherent tree structure with clique set and clique tree . Here we propose a method that allows us to solve this problem distributedly over the clique tree. To this end, we first need to assign the constituent subproblems of (1) to each of the agents in the tree. Firstly define such that