A Globally Convergent Gauss-Newton Algorithm for AC Optimal Power Flow

A Globally Convergent Gauss-Newton Algorithm for AC Optimal Power Flow


We propose a globally convergent and robust Gauss-Newton algorithm for finding a (local) optimal solution of a non-convex and possibly non-smooth optimization problem arising from AC optimal power flow on meshed networks. The algorithm that we present is based on a Gauss-Newton-type iteration for an exact penalty reformulation of the power flow problem. We establish a global convergence rate for such a scheme from any initial point to a stationary point of the problem while using an exact penalty formulation to gain numerical robustness against ill-conditioning. We compare our algorithm with a well-established solver, IPOPT, on several representative problem instances in MATPOWER. We demonstrate the comparable, but more robust, performance of our method for a variety of the MATPOWER test cases.

AC optimal power flow, non-convex optimization, penalty reformulation, Gauss-Newton method.

I Introduction

The optimal power flow (OPF) problem [1] consists in finding an optimal operating point of a power system while minimizing a certain objective (typically power generation cost), subject to the Kirchhoff’s power flow equations and various network and control operating limits. We focus in this paper on the alternating current optimal power flow (AC-OPF) problem [2], which lies at the heart of short-term power system operations [3]. The increasing integration of distributed resources that are connected to medium and low-voltage networks has increased the relevance of AC-OPF as an appropriate framework of modeling operational constraints that affect the coordination of transmission and distribution system operations [4].

Related work In recent years, there has been a great body of literature that has focused on convex relaxations of the AC-OPF problem, including semidefinite programming relaxations [5, 6], conic relaxations [7, 8, 9], and quadratic relaxations [10]. These works have established conditions under which these relaxations are exact, and understanding cases in which this is not so [11]. Instead, our interest in the present paper is to tackle directly this problem as a non-convex optimization problem with non-linear equality constraints. Such a formulation is sufficiently general to produce physically implementable solutions in the context of realistic system operations.

Our interest in this application is driven by recent research on the integration of transmission and distribution system operations [12] in the context of the SmartNet EU project [13]. The standard linear approximations of power flow are inadequate for medium and low-voltage distribution networks, where real power losses are significant, reactive power flows are non-negligible, and voltage constraints are relevant. Moreover, the complexity of the network (meshed transmission networks and almost-radial distribution networks [14]) renders the existing relaxations inexact, and the resulting dispatch possibly non-implementable. This necessitates empirical operator interventions, and undermines our ability to actively engage large numbers of distributed resources in short-term power system operations.

The AC-OPF problem is usually formulated as a non-convex optimization problem. It is well-known that optimization problems with non-convex constraints are difficult to solve. Classical techniques such as interior-point, augmented Lagrangian, penalty, Gauss-Newton, and sequential quadratic programming methods can only aim at finding a stationary point, which is a candidate for a local minimum [15, 16]. For an iterative method to identify a stationary point that is a local minimum, but not a saddle-point, more sophisticated techniques are required, such as cubic regularization [17] or random noise gradient [18]. However, these methods are often very difficult to implement and inefficient in large-scale problems with non-convex constraints. One of the most efficient and well-established nonlinear solvers for finding stationary points is IPOPT [19], which relies on a primal-dual interior-point method combined with other advanced techniques. We emphasize that this classical method is only guaranteed to converge to a stationary point, and often requires a strategy such as line-search, filter, or trust-region to achieve global convergence under certain restrictive assumptions. Moreover, each iteration of IPOPT requires solving a non-convex subproblem via linearization combined with a line-search or filter strategy.

Our approach and contributions In this paper, we consider an AC optimal power flow problem over large-scale networks. We are interested in an approach that can tackle general meshed networks. We show that this problem can be posed in the framework of non-convex optimization with a particular structure on the constraints. Based on this structure we devise a provable convergent Gauss-Newton (GN)-type algorithm for solving this non-convex problem. Our algorithm converges globally to a stationary point of the problem from any starting point. In addition, it is also different from standard GN methods in the literature due to the use of a non-smooth penalty instead of a classical quadratic penalty term. This allows our algorithm to converge globally and also to be more robust to ill-conditioning [20]. Hence, we refer to this algorithm as a global and robust GN scheme. The main idea of our method is to keep the convex sub-structure of the original problem unchanged and to convexify the non-convex part by exploiting penalty theory and the GN framework. Hence, in contrast to IPOPT, each iteration of our algorithm requires solving a convex subproblem, which can efficiently be solved by many existing convex solvers such as CPLEX or Gurobi.

The main contributions of the paper are the following:

  • We consider a quadratic reformulation of the AC-OPF problem, as in [7, 21], and propose an exact penalty reformulation of the problem in order to handle the non-convex equality constraints and a novel global and robust GN algorithm for solving the corresponding problem.

  • For our optimization algorithm, we prove that its iterate sequence converges globally (i.e. from any starting point) to a stationary point of the underlying problem. We also estimate its best-known global sublinear convergence rate.

  • We show that the newly developed algorithm can be implemented efficiently on AC-OPF problems and test it on several numerical examples from the well-known MATPOWER test cases [22], [23]. We observe competitive, and often superior and more robust, performance to the well-established and widely-used IPOPT solver.

Our algorithm is simple to implement and can be incorporated flexibly with any available convex sub-solver that supports a warm start strategy in order to gain efficiency.

Content The paper is organized as follows. In Section II, we present the AC-OPF problem and its quadratic reformulation. In Section III, we introduce our Gauss-Newton algorithm and analyze its convergence properties. Finally, in Section IV, we adapt the algorithm to the AC-OPF problem and test it on several representative MATPOWER test cases.

Ii Mathematical Modeling

Ii-a Problem settings

Consider a directed power network with a set of nodes and a set of branches . The network consists of a set of generators, with denoting the set of generators at bus . Denote as the system admittance matrix.

The decision variables of AC-OPF are the voltage magnitudes , phase angles , and the real and reactive power outputs of generators, which are denoted as and . We will consider a fixed real and reactive power demand at every node , which we denote as and , respectively.

The constraints of the AC-OPF problem can be described by equations (1)-(5), see [9]. Constraints (1) and (2) correspond to the real and reactive power balance equations of node . Constraints (3) and (4) impose complex power flow limits on each line, which we indicate by a parameter matrix . Constraints (5) impose bounds on voltage magnitudes (indicated by parameter vectors and ), bounds on real power magnitudes (indicated by parameter vectors and ), bounds on reactive power magnitudes (indicated by parameter vectors and ), and bounds on voltage phase angles (indicated by parameter vectors and ). Note that the power balance equality constraints (1), (2) as well as the inequality constraints (3), (4) are non-convex with respect to and .


We will consider an objective of minimizing real power generation costs. Hence, we consider a convex quadratic objective function :

where and are given coefficients of the cost function.

The AC OPF problem then reads as the follwoing non-convex optimization problem:

Ii-B Quadratic reformulation

The starting point of our proposed GN method for solving problem is the quadratic reformulation of AC-OPF [21]. In this reformulation, we use a new set of variables and for replacing the voltage phasors . These new variables are defined for all and as:


where we will denote the vectors and as the collection of the and variables, respectively.

For , the mapping from to defined by (6), (7) can be inverted as follows:

thereby defining a bijection in and .

The set of and that define this bijection is further equivalent to the following set of non-linear non-convex constraints in , see [9]:


Now, we will substitute the voltage magnitude variables into the problem , and consider the problem on the variables . This reformulation has been commonly employed in the literature in order to arrive at an SOCP relaxation of the problem [21, 7]. Through numerical experiments, we demonstrate that this reformulation results in highly effective starting points for our algorithm based on the SOCP relaxation of the AC-OPF. Moreover, the reformulation preserves the power balance constraints in linear form. Thus, the power balance constraints are not penalized in our scheme, which implies that they are respected at every iteration of the algorithm. For all these reasons, we pursue the quadratic reformulation of the present section, despite the fact that it requires the introduction of the new variables and .

Concretely, the AC power balance constraints (1) and (2) are linear in and :


Similarly, the power flow limit constraints (3) and (4) are convex quadratic in and :


The box constraints (5) are reformulated as follows:


As a result, an equivalent formulation for the AC-OPF model is:


With this reformulation, the decisions are . Constraints define a convex set.We also have two non-convex equality constraints:

  • Constraints (8): . We will refer to them as quadratic constraints.

  • Constraints (9): . We will refer to them as trigonometric constraints.

Iii Optimization Algorithm and Its Convergence Guarantees

In this section, we cast the quadratic formulation (15) of the AC-OPF model as a generic non-convex optimization problem with nonlinear equality constraints, propose an exact penalty reformulation, and solve it using a Gauss-Newton-type algorithm. We further characterize global convergence rate of our algorithm. The proofs of all theoretical results are provided in Appendices A and B.

Iii-a Constrained NLP formulation of Optimal Power Flow

Problem can be written in the following generic form:


where is a convex function; is a linear operator defining the balance equations (10) and (11); collects all the quadratic equality constraints from (8) and the trigonometric equality constraints (9); and are given lower bounds and upper bounds on , respectively; and are positive semi-definite matrices, with the associated constraints representing the line flow limits (12) and (13).

To provide a unified theoretical study for (16), we rewrite it in the following compact form:


where, in the context of AC-OPF, we have , , , and


Hence, in the sequel we devise an algorithm for solving the non-convex optimization problem (17). For this problem we assume throughout this section that the objective function is convex and differentiable, is a compact convex set, and the non-convexity enters in (17) through the non-linear equality constraints . We further assume that is differentiable and its Jacobian is Lipschitz continuous, i.e. there exists such that:

is the -norm. Note that all these assumptions hold for the AC-OPF problem (16). Indeed, is quadratic, and since is linear, are boxes, and is convex, is a closed, convex, and bounded set. Moreover, since is the collection of (8) and (9), we show in the next lemma that it is differentiable and that its Jacobian is Lipschitz continuous.

Lemma III.1.

For the AC-OPF problem, defined by (8)–(9), is smooth, and its Jacobian is Lipschitz continuous with a Lipschitz constant , i.e. for all , where


Further, let defines the normal cone of as

Since problem (17) is nonconvex, our goal is to search for a stationary point of (17) that is a candidate of local optima in the following sense.

Definition III.1.

A point is said to be a KKT point of (17) if it satisfies the following conditions:


Here, is called a stationary point of (17), and is the corresponding multiplier. Let denote the set of these stationary points.

Since is compact, and and are continuous, by the well-known Weierstrass theorem, we have:

Proposition III.1.

If , then (17) has global optimal solutions.

Iii-B Exact penalized formulation

Associated with (17), we consider its exact penalty form:


where is a penalty parameter, and is the -norm.

Two reasons for choosing an exact (non-smooth) penalty are as follows. First, for a certain finite choice of the parameter , a single minimization in of (21) can yield an exact solution of the original problem (17). Second, it does not square the condition number of as in the case of quadratic penalty methods, thus making our algorithm presented below more robust to ill-conditioning of the non-convex constraints. Now, we summarize the relationship between stationary points of (17) and of its penalty form (21). For this, let us define:


where is one subgradient of at , and denotes the subdifferential of , see [20]. Recall that the necessary optimality condition of (21) is

Then, this condition can be expressed equivalently as


where is the set of feasible directions to at :


Any point satisfying (23) is called a stationary point of the penalized problem (21). If, in addition, is feasible to (17), then we say that is a feasible stationary point. Otherwise, we say that is an infeasible stationary point.

Proposition III.2 shows the relation between (17) and (21).

Proposition III.2 ([15], (Theorem 17.4.)).

Suppose that is a feasible stationary point of (21) for sufficiently large. Then, is also stationary point of the original problem (17).

In general, needs to be chosen such that , where is any optimal Lagrange multiplier of (17). We will discuss in detail the choice of in Section IV.

Iii-C Global and robust Gauss-Newton method

We first develop our GN algorithm. Then, we investigate its global convergence rate.

The derivation of the Gauss-Newton scheme and the full algorithm

Our GN method aims at solving the penalized problem (21) using the following convex subproblem:


where is a given point in for linearization, is the Jacobian of , and is a regularization parameter. Note that our subproblem (25) differs from those used in classical penalty methods [15], since we linearize the constraints and we also add a regularization term. Thus, the objective function of (25) is strongly convex. Hence, if is nonempty, this problem admits a unique optimal solution, and can be solved efficiently by several convex methods and solvers. Let us denote


The necessary and sufficient optimality condition for subproblem (25) becomes


where .

Given , we define the following quantities:


Then, can be considered as a gradient mapping of in (21) [20], and is a search direction for Algorithm 1. As we will see later, should be chosen such that .

Now, using the subproblem (25) as a main component, we can describe our GN scheme in Algorithm 1.

1:Initialization: Choose and a penalty parameter sufficiently large (ideally, ).
2:Choose a lower bound .
3:For to perform
4:    Find such that (see Lemma III.3).
5:    Update .
6:    Update if necessary.
7:End for
Algorithm 1 (The Basic Global Gauss-Newton Algorithm)

The main step of Algorithm 1 is the solution of (25) at Step 4. As mentioned, this problem is strongly convex, and can be solved by several methods that converge linearly. If we choose , then we do not need to perform a line-search on at Step 4, and only need to solve (25) once. However, the global upper bound may be too conservative, i.e. it does not take into account the local structures of nonlinear functions in (21). Therefore, we propose to perform a line-search in order to find an appropriate . If we perform a bi-section, then the number of line-search iterations is at most , see [20]. The penalty parameter can be fixed or can be updated gradually.

Global convergence analysis

We first summarize some properties of Algorithm 1 without updating as follows.

Lemma III.2.

Let be defined by (26), and , , and be defined by (III-C1). Then the following statements hold:

  • If , then is a stationary point of (21).

  • The norm is nondecreasing in , and is nonincreasing in . Moreover, we have

  • If is Lipschitz continuous with the Lipschitz constant , then, for any , we have


Statement shows that if we can find such that , then is an approximate stationary point of (21) within the accuracy . From statement , we can see that if the line-search condition at Step 4 holds, then . That is, the objective value decreases at least by after the -th iteration. We first claim that Algorithm 1 is well-defined.

Lemma III.3.

Algorithm 1 is well-defined, i.e. step 4 terminates after a finite number of iterations. That is, if , then .

Let be the level set of at . Now, we are ready to state the following theorem on global convergence of Algorithm 1.

Theorem III.1.

Suppose that the feasible set is sufficiently large such that for . Then, the sequence generated by Algorithm 1 satisfies


where . Moreover, we also obtain


and the set of limit points of the sequence is connected. If this sequence is bounded (in particular, if is bounded) then every limit point is a stationary point of (21). Moreover, if the set of limit points is finite, then the sequence converges to a stationary point of (21). If, in addition, is feasible to (17) and is sufficiently large, then is also a stationary point of (17).

Theorem III.1 provides a global convergence result for Algorithm 1. Moreover, our algorithm requires solving convex subproblems at each iteration, thus offering a great advantage over classical penalty-type schemes. However, we only show that, under the stated conditions, the iterate sequence converges to a stationary point of (21). Since , if , then is also a stationary point of (17). We can guarantee this by combining the algorithm with a run-and-inspect procedure, whereby if violates , then we restart the algorithm at a new starting point. However, we leave this extension for our future work.

Iv Implementation & Simulation

In this section, we benchmark the GN algorithm against MATPOWER instances that are solved using IPOPT, which is a state-of-the-art interior-point solver. We consider transmission networks with sizes ranging from 1,354 buses to 25,000 buses. Our first goal is to optimize the settings of the GN method. We will see that the choice of and is crucial. This will allow us to derive a practical version of the GN algorithm, which we compare to IPOPT.

Iv-a A Practical Implementation of Algorithm 1 for AC-OPF

Subproblem (26) in the context of AC-OPF

We first recall how the AC-OPF formulation can be cast in our GN framework by specifying the following components:

  • , .

  • .

  • .

We define the following functions in order to keep our notation compact:

Note that we have two different type of constraints (quadratic and trigonometric) and that they may attain different relative scales for different instances, as it is the case in our numerical experiments. We will define different penalty terms depending on the type of constraint, which will increase the flexibility of our implementation, whereas it does not affect our theory. To this end, we denoted by (resp. ) the penalty parameter associated with the quadratic (resp. trigonometric) constraints. The trigonometric constraints depend on , and , whereas the quadratic ones only depend on and . Therefore, we define two separate regularization parameters: and (we drop the subindex from now on for notational simplicity).

The GN subproblem at iteration , corresponding to (25), is convex and has the form:

The optimal solution of , which we denote by , provides the next iterate .

Stopping criteria

We terminate Algorithm 1 in three occasions, which, in our experiments, work quite reasonably:

  • If the maximum number of iterations has been reached, we terminate our experiments.

  • If the difference , then Algorithm 1 has reached an approximate stationary point of the exact penalized formulation (21) (see Lemma III.2(a)), and we also terminate Algorithm 1. However, in this case it is possible that the last iterate might not be feasible for . We numerically check the feasibility of by directly computing the feasibility violation. In the experiments section, we employ a value of .

  • If the quadratic and trigonometric constraints are satisfied with tolerance , where , we also terminate. Concretely, we stop Algorithm 1 if .

Parameter tuning strategies

Tuning the parameters is quite crucial to improve practical performance of constrained nonconvex algorithms, including Algorithm 1. We demonstrate the impact of a careful choice for parameters in Fig. 1. This figure presents the performance of Algorithm 1 on the MATPOWER instance (case1888rte), with a constant value for and (i.e. and ). Concretely, we fix and set to the theoretical upper bound as described in our theory. As shown in Fig. 1, the algorithm reaches the maximum number of iterations (one hundred), but shows some slow progress on the quadratic feasibility. Although the maximum violation of the trigonometric constraints remains within the acceptable tolerance of within 6 iterations, the maximum violation of the quadratic constraints decreases very slowly without reaching the desired tolerance. This behavior reveals that a careful balance between the different parameters needs to be achieved. The cause of the poor performance in Fig. 1 is the fact that the trigonometric constraints are quickly satisfied, which ‘locks in’ the values of , , and , and limits the ability of the algorithm to decrease the maximum violations of the quadratic constraints. Note that this is consistent with the fact that the objective value is not evolving over iterations.

Fig. 1: Evolution of maximum violation of the quadratic and tangent constraints, and the objective function along the iterations for case1888rte, for and a fixed value of along the iterations.

Different parameter choices for two types of constraints

Following a detailed experimental investigation which is presented in Appendix C, we fix the parameters of Algorithm 1 as follows. We set equal to the number of lines in the network, , and we set . The intuition behind these choices is that (i) according to Proposition III.2, large values of ensure the equivalence between (17) and (21); (ii) quadratic constraints and trigonometric constraints scale up differently. Concerning the values for and , we perform a geometric line search at line 4 of Algorithm 1 and choose , and the geometric update coefficient . Since we need to solve each time and/or are updated, keeping the number of updates of low is crucial for the performance of the algorithm. The detailed justification of our approach towards updating the values of the coefficient is developed in Appendix D.

Iv-B Numerical experiments

We used the MATPOWER library in order to compare our approach to IPOPT for a wide range of test systems that have been investigated in the literature. MATPOWER provides a large set of instances of AC-OPF. We tested our approach on instances whose size ranges between 1,354 and 25,000 nodes. MATPOWER also provides several possibilities for solving AC-OPF. We benchmark our approach against IPOPT, a non-linear solver based on the interior-point method. Since we initialize the GN algorithm with the solution of the SOCP relaxation of AC-OPF, we also initialize IPOPT with the solution of the SOCP relaxation, in order to establish a fair comparison.

We compare the following three methods:

  1. GN: GN algorithm initialized at the solution of the SOCP relaxation. The parameters of the algorithm are chosen according to section IV-A4.

  2. MP-IPOPT: The MATPOWER implementation of IPOPT. MATPOWER uses the midpoints of the box constraints of the problem as an initial solution for the interior-point solver.

  3. SOCP-IPOPT: IPOPT initialized at the solution of the SOCP relaxation.

The results of our analysis are presented in Table I. The MATPOWER implementation of IPOPT fails in a number of instances1. Note that the instances 1888rte, 2868rte, 6468rte, 6515rte present exceeding run times or significant constraint violations. Other test cases perform extremely well, most notably the last four instances which are also the largest ones. When we initialize IPOPT from the solution of the SOCP relaxation, the violation of the constraints is less sensitive across instances, however the variation in execution time remains significant.

Test Case # It+ Objective Time MV Objective Time MV Objective Time MV
1354pegase 4+0 7.21 1.59 81
1888rte 27+28 132 3,675 20.6
1951rte 3+0 10.0 26.8 30.9
2383wp 16+1 82.9 3.86 18.4
2848rte 6+0 32.4 2,604 76.3
2868rte 5+0 28.5 3,868 62.5
2869pegase 3+0 20.6 3.11 194
6468rte 14+8 607 4,350 372
6470rte 11+0 424 68.4 406
6495rte 4+0 157 135 428
6515rte 4+0 163 5,761 466
9241pegase 43+41 6,304