Power System State Estimation via Feasible Point Pursuit: Algorithms and Cramér-Rao Bound
Accurately monitoring the system’s operating point is central to the reliable and economic operation of an electric power grid. Power system state estimation (PSSE) aims to obtain complete voltage magnitude and angle information at each bus given a number of system variables at selected buses and lines. Power flow analysis is a special case of PSSE, and amounts to solving a set of noise-free power flow equations. Physical laws dictate quadratic relationships between available quantities and unknown voltages, rendering general instances of power flow and PSSE nonconvex and NP-hard. Past approaches are largely based on gradient-type iterative procedures or semidefinite relaxation (SDR). Due to nonconvexity, the solution obtained via gradient-type schemes depends on initialization, while SDR methods do not perform as desired in challenging scenarios. This paper puts forth novel feasible point pursuit (FPP)-based solvers for power flow and PSSE, which iteratively seek feasible solutions for a nonconvex quadratically constrained quadratic programming (QCQP) reformulation of the weighted least-squares (WLS) problem. Relative to the prior art, the developed solvers offer superior performance at the cost of higher complexity. Furthermore, they converge to a stationary point of the WLS problem. As a baseline for comparing different estimators, the Cramér-Rao lower bound (CRLB) is derived for the fundamental PSSE problem in this paper. Judicious numerical tests on several IEEE benchmark systems showcase markedly improved performance of our FPP-based solvers for both power flow and PSSE tasks over popular WLS-based Gauss-Newton iterations and SDR approaches.
Index terms— Power flow analysis, state estimation, nonconvex QCQP, feasible point pursuit, CRLB.
Recognized as the greatest engineering achievement of the st century , the electric power grid is a complex cyber-physical system comprising multiple subsystems, each with a transmission infrastructure to deliver electricity from power generators to distribution networks to customers. Accurately monitoring the operational condition of a power grid is crucial to various system control and optimization tasks, which include unit commitment, optimal power flow (OPF), and economic dispatch [2, 3]. To enable such an accurate monitoring, a set of system variables are specified (and enforced) or measured at selected buses and lines for determining or estimating the system’s operating point, namely complex voltages at all buses of the grid. These two tasks correspond to the so-termed power flow analysis and power system state estimation (PSSE), respectively. Both are central to monitoring, control, and future planning of electricity networks.
In power engineering, power flow analysis is a numerical analysis of the normal steady-state flow of electric power over the grid, that is crucial for planning future power system expansions (e.g., designing components such as generators, lines, transformers, and capacitors), as well as in determining the best operation of the existing systems . The goal of power flow analysis is to obtain complete voltage magnitude and angle information at each bus for specified or enforced load and generator active power and voltage conditions . Once this information is available, other system variables including active and reactive power flows as well as generator reactive power outputs can be analytically obtained.
Power flow analysis amounts to solving a set of quadratic equations given by the nonlinear AC power flow model obeying Ohm’s and Kirchhoff’s laws. Solving power flow equations for both transmission and distribution systems is known to be NP-hard . Due to the nonlinear nature, several numerical solvers have been developed to obtain a solution that is within an acceptable tolerance [3, 5]. Past solvers include the Gauss-Seidel and Newton-Raphson iterative algorithms , and the semidefinite relaxation (SDR) [5, 6]. The Gauss-Seidel method is reported as the earliest devised power flow solver . On the other hand, the Newton-Raphson algorithm iteratively seeks improved approximations to the zeros of real-valued functions, featuring quadratic convergence whenever the initial point lands within a small neighborhood of the zeros . As convergence of both algorithms relies heavily on the initial point, they may diverge if the initialization is not reliable . With a carefully designed objective function and sufficiently small angle differences across lines, the SDR approaches have been shown capable of recovering the true power flow solution provided that the set of available specifications includes all voltage magnitudes, and the active power flows over a spanning tree of the network [5, 6].
The task of PSSE can be described as estimating the voltage magnitudes and angles at all buses across the network from a subset of supervisory control and data acquisition (SCADA) measurements including active and reactive power injections and flows (at both the sending and receiving ends), as well as squared voltage magnitudes . Since its appearance in the s , PSSE has become a prerequisite for supervisory control, system planning, and economic dispatch . Nonlinear SCADA measurements however, render the PSSE problem nonconvex and NP-hard in general .
PSSE solvers so far are largely based on Gauss-Newton iterations and SDR heuristics. The “workhorse” Gauss-Newton method for nonconvex optimization has two limitations [10, Sec. 1.5], i.e., sensitivity to the initial guess, and lack of convergence guarantees. SDR-based approaches on the other hand solve first for a matrix variable that can be computationally expensive [9, 11, 12, 6, 5, 13]. SDR’s performance degrades when the data-size is relatively small, or when the data do not include all voltage magnitudes [6, 5]. For PSSE of large-scale networks, distributed Gauss-Newton and SDR implementations have been reported in , , .
Solving power flow equations and the PSSE can be shown equivalent to solving nonconvex QCQPs, which is generally NP-hard . Many heuristics have been put forward. A feasible point pursuit (FPP) algorithm developed in  was shown to enjoy improved performance over the SDR-based methods. The FPP heuristic has been employed for solving the OPF problem , where the resulting solver was empirically shown more effective for multi-phase transmission networks than popular SDR- and moment relaxation-based ones .
Building on our precursors [17, 19] and inspired by the inherent nonconvex challenge, the goal of this work is to develop power flow and PSSE solvers capable of attaining or approximating the global optimum at manageable computational complexity. Starting with the WLS formulation, the power flow and PSSE tasks are equivalently reformulated as a nonconvex QCQP, which can be readily tackled by FPP. We further show that our FPP-based solvers converge to a stationary point of the WLS problem. As a baseline for comparing different SE approaches, the Cramér-Rao lower bound (CRLB) is derived for the basic PSSE problem under additive white Gaussian noise (AWGN). This is achieved by means of Wirtinger’s calculus for functional analysis over complex domains. Finally, numerical experiments using several IEEE benchmark systems corroborate the superior performance of our proposed solvers over existing methods for both power flow and PSSE tasks.
Regarding notation, matrices (vectors) are denoted by upper- (lower-) case boldface letters, and , , and stand for complex conjugate, transpose, and conjugate-transpose, respectively. Calligraphic letters are reserved for sets, e.g., . Symbol () takes the real (imaginary) part of a complex-valued object, and is a diagonal matrix holding in order entries of on its diagonal.
Ii System Modeling and Problem Statement
An electric transmission network having nodes (buses) and edges (lines) can be represented by a graph , whose nodes correspond to buses, and whose edges correspond to transmission lines. For every bus , let be the nodal complex voltage, whose magnitude and phase are given by and , respectively; likewise for the complex current injection . Let also be the corresponding complex power injection, in which and are the active and reactive power injection, respectively. For every line , let denote the complex current flowing from bus to , and the complex power flow from bus to seen at the sending end, where and are the active and reactive power flow, respectively; and likewise for the receiving-end (active and reactive) power flow and .
The AC power flow model dictates that system variables , , , , , , and are quadratic functions of the state vector . Clearly, this holds true for the squared voltage magnitude understood as . To specify the relationship between power quantities and , introduce to represent the bus admittance matrix, which is in general symmetric. Ohm’s law in conjunction with Kirchhoff’s law reads as
It is worth mentioning that is sparse, thus enabling efficient computations in large-size power networks, and its -th entry is given by
where denotes the admittance of line , the admittance to the ground at bus , and the set of neighboring buses directly connected to bus . For , let be the shunt admittance at bus associated with line . Recall from Ohm’s and Kirchhoff’s laws that the current flowing from bus to can be expressed as
whereby the reverse-direction current can be given symmetrically. Due to in general, it holds .
The AC model also asserts . Appealing again to (1) leads to the next matrix-vector form
where both active and reactive power injections are quadratically related to . Likewise, the sending-end active and reactive power flow over line can be written as
where the second equality is obtained by substituting in (3) into the first. Hence, and can also be expressible as quadratic functions of . By symmetry, this quadratic relationship also holds for and .
To perform either power flow analysis or PSSE, a total of system variables are specified or measured by the system operator. The nonlinear AC networks have available the next seven types of quantities: , , , , , , and . If , , , (), and () denote the selected sets of buses/lines over which actual quantities of the corresponding type are available, the elaborated quadratic relationships prompt us to define the data vector , whose entries can be succinctly given by
where are some coefficient matrices to be specified. For this purpose, let be the canonical basis of , and introduce also the admittance-dependent matrices
For , it is clear that the corresponding in (6) is
and with sending-end and receiving-end active and reactive power flow at all lines
Ii-a Power flow analysis
Power flow analysis deals with specified power quantities, which are enforced for optimally operating an electric power grid. Specifically, given perfectly known specifications and valid network parameters as in (6), the goal of power flow analysis is to decide the state vector that satisfies all specifications, namely,
Recall that each bus in a power system is classified as a PQ, PV, or slack (reference) bus based on the constraints imposed per bus. PQ buses, which often correspond to loads, specify and enforce only active and reactive power injection and on bus . On the other hand, the PV buses, which are typically associated with generators, enforce active power injection and voltage magnitude . For the slack bus, its voltage phase is fixed at , by convention. With , the power flow problem in (10) is equivalent to solving for real-valued unknowns from quadratic equations. The classical power flow problem considers the particular case where the specifications are enforced only at the PV, PQ, and slack buses as opposed to a combination of buses and lines .
Ii-B Power system state estimation
PSSE on the other hand deals with noisy observations acquired by the SCADA system adhering to
where accounts for the zero-mean distributed measurement error with known variance , henceforth assumed independent across meters. The goal of PSSE is, given SCADA measurements and also parameters , estimate the state vector .
Adopting the WLS criterion, the SE task can be cast as that of solving the following nonlinear LS problem
where entries of the weight vector are often taken as for known values. The WLS estimate coincides with the maximum likelihood one when the error vector obeys the multivariate Gaussian distribution with . Unfortunately, due to the quadratic terms inside the squares, the WLS SE problem is nonconvex. Minimizing nonconvex objectives, which typically exhibit many stationary points, is NP-hard in general . Hence, solving the problem in (12) is indeed challenging.
PSSE approaches so far can be grouped as convex and nonconvex ones. The latter includes the “workhorse” Gauss-Newton method, which is also typically employed in practice: Upon linearizing the error function in the LS cost around a given estimate, the minimizer of the norm of the resulting linearized approximation is used to initialize the next iteration [10, Sec. 1.5]. Minimizing nonconvex functions, Gauss-Newton iterations can be problematic due to: i) its sensitivity to the initial point; and, ii) lack of convergence guarantee to even a stationary point . Convex approaches via SDR [9, 6] express all data as linear functions of the outer-product . Problem (12) is then convexified by dropping the nonconvex constraint . SDR-based methods seldom yield solutions of rank- in the noisy case. Further eigen-decomposition or randomization procedures are required to recover the estimator from the SDR solution . Performance of SDR solutions degrades when the data size is small, or when the set of measurements does not include the voltage magnitude at all buses, as will be demonstrated by our numerical results in Sec. V.
Iii Feasible Point Pursuit based Solvers
In this section, the FPP-based power flow and PSSE solvers will be developed based on procedures distinct from existing iterative optimization and SDR-based SE approaches. To this end, some basics of FPP are first reviewed. For nonconvex QCQPs, FPP iteratively solves a series of convexified QCQPs obtained with successive convex inner-restrictions of the original nonconvex feasibility set, and with additive slacks to approximate the feasible solutions of the original nonconvex QCQP . Specifically, starting with an initial guess, FPP first decomposes the quadratic terms in all nonconvex constraints into their convex and nonconvex parts by means of eigen-decomposition, which can be efficiently carried out offline; then it linearizes the nonconvex parts around the current iterate to obtain a restricted convex QCQP. Due to restriction of the feasibility set, the convexified QCQP may be infeasible. To sustain feasibility, a slack variable is introduced for each relaxed constraint, with a convex penalty on the slack variables added to the cost function, which can enforce sparing use of slacks to produce solutions of minimal constraint violation. The minimizer of the regularized convex QCQP subproblem is taken as the next iterate, which will be used as the linearization point of the nonconvex components at the next iteration. This successive convex approximation and feasibility-restoring procedure is repeated until a certain stopping criterion is met. Further details of FPP can be found in , .
Note that the power flow problem (10) consists of quadratic equality constraints, which are not in the standard QCQP form. To apply FPP, equalities are relaxed to inequalities, while penalizing the slack variables , yielding
where other choices of the convex penalty function include the (weighted) or norm. Problem (13) is equivalent to the original power flow formulation (10) when the latter is feasible. To see this, assume that the set of power flow equations in (10b) admits (possibly more than one) feasible solutions. Clearly at the optimum of (13), the objective reduces to zero, the slack variables take zero values, and all equalities in (13b) are achieved, thus yielding a feasible solution to the set of power flow equations in (10).
where the slack variables in this case relate to the deviations between noisy measurements and the actual quantities . Problem (14) can be similarly shown equivalent to (12). Other convex penalty functions in (14a) can also be selected. In particular, if the error vector follows the multivariate Laplace distribution, i.e., with collecting all scaling parameters, minimizing the -based function with in (14) produces the maximum likelihood estimate [2, 6].
Apparently, the reformulated power flow and PSSE problems are of the same form [cf. (13) and (14)], except for a minor difference in the cost functions. Setting unit weights in (14) reduces problem (14) to (13). Without loss of generality, we will hereafter focus on the PSSE formulation (14), and develop the novel FPP solver. The power flow problem can be readily handled with all weights being .
Problem (15) is nonconvex even for (semi)definite coefficient matrices . Next we demonstrate how to take advantage of FFP to solve the problem at hand in detail.
As discussed in Sec. II, there are two types of matrices, one corresponding to the squared voltage magnitude, and the other to power quantities. Type-I are positive semidefinite [cf. (7)], while Type-II are non-definite [cf. (8) and (9)]. For ease of exposition, let us introduce the FPP constraint convexification procedure using one nonconvex quadratic constraint in (15). Along the lines of FPP, consider the term in (15b) for some in (8), which can be decomposed into its convex and nonconvex components as
where and represent the positive semidefinite (convex) and negative semidefinite (nonconvex) parts of in (16), respectively. For the nonconvex source in (16), an inner linear restriction will be derived next.
The following inequality holds for any due to the negative semidefiniteness of
Upon expanding the left-hand-side and rearranging terms, one arrives at
The strategy in selecting the linearization point will be discussed shortly. In the same fashion, the nonconvex quadratic constraints in (15c) can be replaced by
Heed that the flexibility introduced by the slacks always restores the feasibility of the relaxed constraints, which contributes to improved performance of FPP over other convexification approaches . In the presence of noise, the minimum values required for to satisfy (18) and (19) depend on the measurement error contained in .
The FPP method replaces all nonconvex constraints in (15b) by their convex restriction (18), and those in (15c) by (19) to derive a convexified QCQP regularized with slack variables to ensure feasibility. Minimizing some convex penalty function of the slacks not only minimizes the fitting error between and , but also enforces sparing use of slacks and promotes solutions of minimal constraint violation.
In a nutshell, the developed FPP-based PSSE solver can be understood as follows. Starting with an initial point (typically the flat voltage profile point, i.e., all-ones vector), our FPP-based solver successively tackles a sequence of convexified QCQPs with the linearization point being the current iterate , which is the -minimizer obtained by solving a convexified QCQP at the previous iteration. Hence, assuming available the -minimizer at the -st iteration, our FPP-based solver boils down to solving the following convexified QCQP subproblem
The FPP-based PSSE solver is summarized in Alg. 1. The following three properties of our FPP-based solver are worth highlighting.
Remark 1 (Power flow analysis).
Cast as a special instance of PSSE, the power flow problem in (10) can be solved by our developed FPP-based PSSE solver with unit weights .
Remark 2 (Bad data removal).
Remark 3 (Synchrophasors).
Synchrophasors, if available, can be easily incorporated into the developed PSSE formulation (20). To see this, letting collect the noisy PMU data at bus , hybrid estimation exploiting both nonlinear SCADA measurements and linear PMU ones can be achieved  with an additional data-fitting term for the PMU data in (20a), namely, , where denotes the subset of the PMU-instrumented buses.
On the theoretical side, the following result establishes convergence of our developed FPP-based solvers to a stationary point of the WLS formulation.
Proposition 1 (Global convergence of FPP-based solvers).
As elaborated in Sec. III, solving problem (15) is equivalent to solving problem (12). The nonconvex QCQP of complex-valued vector in (15) can be equivalently posed as a QCQP of the expanded real-valued vector , where the associated quadratic matrices are given as
where and are the positive and negative semidefinite parts of , hence rendering terms and both convex. Alg. 1 is tantamount to an application of the convex-concave procedure [23, 24] to the reformulated QCQP in the real domain. Hence, the sequence generated by Alg. 1 converges to a stationary point of (12) by appealing to the results in [25, Thm. 10]. ∎
Iv Cramér-Rao Bound for PSSE
According to standard results from estimation theory , the variance of any unbiased estimator is lower bounded by the Cramér-Rao lower bound (CRLB). Appreciating its key role as a performance benchmark across different estimators, this section establishes the CRLB for the fundamental PSSE problem. The CRLB analysis of PSSE however, entails finding derivatives (gradient and Hessian) of a real-valued function with respect to multiple complex-valued variables. To address this challenge, we call for advanced complex analysis tools based on the so-termed Wirtinger derivative and Wirtinger’s calculus, which are detailed in the Appendix. The following result provides a closed-form CRLB for any unbiased PSSE solver under the AWGN model in (11), which can be directly used to assess the performance of other PSSE solvers.
Consider estimating the unknown state vector from noisy data obeying the model in (11), where the noise is assumed Gaussian distributed with mean zero and variance , and is also independent across meters. Then the covariance matrix of any unbiased estimator obeys
where the Fisher information matrix is given by
Furthermore, has at least rank- deficiency even when all possible SCADA measurements are available.
The proof of Prop. 2 is deferred to the Appendix. Even though the Fisher information matrix (FIM) in (23) is rank deficient, the pseudo-inverse of FIM qualifies itself as a valid yet generally looser lower bound on the mean-square error (MSE) of any unbiased estimator . This lower bound is often attainable in practice, and is predictive of optimal estimator performance , as will be demonstrated by our numerical tests in Sec. V. The derived CRLB in (22) will be employed to benchmark and compare performance of different PSSE solvers next.
V Simulated Tests
In the section, we compare the proposed FPP-based solvers in Alg. 1 with existing alternatives including the WLS via Gauss-Newton iterations (GN-based), and the SDR-based solver (SDR-based)  for both power flow and PSSE tasks on several IEEE benchmark systems . Throughout, all reported numerical results were obtained by averaging over independent Monte Carlo realizations. The three PSSE solvers from noisy measurements are compared in terms of the mean-square error , where is the returned estimate at the -th realization, and the actual voltage profile. In the absence of noise, performance of the power flow solvers is assessed through the empirical success rate over trials. A success is declared for a trial if the returned power flow solution incurs a relative violation on the given set of power flow equations, given by less than . (The reason why is not used is due to existence of possibly multiple solutions satisfying the set of power flow equations.)
Different system quantities and voltage profiles were generated via the MATLAB-based toolbox MATPOWER . The Gauss-Newton method was implemented using the SE function ‘doSE.m’ in MATPOWER, which was modified to terminate either upon convergence, or, when the condition number of the approximate linearization exceeds flagging explosion of the iterates . The SDR- and FPP-based solvers were realized via the optimization modeling package YALMIP , as well as the interior-point solver SeDuMi . Furthermore, the flat-voltage profile point was used as the initial guess for the Gauss-Newton and FPP approaches. In order to fix the phase ambiguity, the phase generated at the reference bus is set to in all tests. The FPP solver stops either when a maximum number of iterations are reached, or when the objective value improvement between two consecutive iterations becomes smaller . All experiments were conducted on an Intel CPU @ GHz ( GB RAM) computer.
To evaluate the performance of the FPP-based solver for power flow analysis, the first experiment simulates noiseless data corresponding to the classical power flow problem. That is, a total of system variables were specified at the PV, PQ, and slack buses to solve for real-valued unknowns in with the reference bus’s phase fixed at . The actual voltage magnitude of each bus was uniformly distributed over , and its angle over with and . Empirical success rate results on several IEEE benchmark systems were reported in Tables I and II for and , respectively. Apparently, our developed FPP-based power flow solver solves exactly the classical power flow problem in all simulated tests, while the SDR-based one fails with high probability. The Gauss-Newton method performs well when the initial point lies close to the actual solution due to small in Table I, while it diverges frequently for large values in Table II.
The second experiment compares the MSE performance of various approaches relative to the analytical Cramér-Rao bound in (22) on the IEEE -bus test system . The actual voltage magnitude and angle of each bus were generated uniformly over , and , respectively. Initially, all voltage magnitudes as well as all sending-end and receiving-end active power flow were taken, which corresponds to the base case in the -axis of Fig. 1. To demonstrate the SE performance evolution of various approaches with respect to the increasing number of measurements, additional types of measurements were included in a deterministic manner detailed next. All seven types of SCADA measurements in (7)-(9) were ordered as . Each -axis value in Fig. 1 implies that the number of ordered types of measurements was used in the experiment to obtain the mean-square errors. For instance, on the -axis corresponds to the case where the first types of measurements (i.e., all ) were used; and likewise for all other -axis values. Measurement noise was randomly and independently generated from Gaussian distribution having zero-mean and standard deviation . The SDR estimator was recovered from the SDR solution by picking the minimum-cost vector over the eigenvector and zero-mean Gaussian randomizations with covariance matrix being the SDR solution. The MSE as well as the CRLB versus the types of measurements available are shown in Fig. 1, corroborating the near-optimal performance relative to the CRLB and robustness of our developed FPP-based PSSE solver.
The last experiment on the IEEE -bus benchmark system simulates a high signal-to-noise ratio and complete-data scenario, where all voltage magnitude as well as all active power flow at both sending- and receiving-ends were measured to be advantageous to the SDR-based method . Independent zero-mean Gaussian noise was assumed to have standard deviations for power measurements and for voltage measurements. The actual voltage magnitude and angle of each bus were generated uniformly at random over , and , respectively. Fig. 2 depicts the average magnitude and angle estimation errors of three PSSE schemes across buses. The curves in Fig. 2 demonstrate the merits of the FPP-based PSSE solver in this scenario.
Motivated by the inherent nonconvexity of the power flow and PSSE tasks and leveraging recent advances in handling nonconvex QCQPs, this work first reformulated power flow and PSSE as a nonconvex QCQP. The resulting nonconvex QCQP was subsequently solved by the FPP algorithm. The novel FPP-based solvers were shown to converge to a stationary point of the WLS formulation. To fairly compare different PSSE solvers from noisy data, the CRLB for PSSE assuming an AWGN model was derived based on Wirtinger’s calculus for functions over complex domains. Extensive numerical tests showed markedly improved performance of our FPP-based solver for both power flow and PSSE tasks at the price of increased runtime over competing Gauss-Newton- and SDR-based alternatives on a variety of IEEE test systems.
Pertinent future research directions include developing distributed implementations for large-scale power networks by exploiting the natural low-rank and sparsity structure present in the coefficient matrices . Another possibility consists of leveraging state-of-the-art approaches for tackling random quadratic systems of equations to solve the power flow and PSSE problems . Generalizing feasible point pursuit algorithms to other nonconvex power grid control tasks such as stochastic energy management , and distribution system-level power flow and PSSE  constitute meaningful directions for future research as well.
and the negative log-likelihood is
The Fisher information matrix is defined as the Hessian of the objective function with respect to the variable vector . So the task of deriving the Cramér-Rao bound amounts to finding the Hessian of a real-valued function with respect to a complex-valued vector. Recall from Wirtinger’s calculus that can be equivalently rewritten as . Upon introducing the conjugate coordinates , the so-called Wirtinger derivative is 
Our definitions here follow the convention in multivariate calculus that derivatives are denoted by row vectors, and gradients by column vectors. For brevity, let . Accordingly, the derivatives of in (24) can be obtained as
where the partial derivatives of can be found as
In the conjugate coordinate system, the complex Hessian is defined as
whose blocks are given by
The other blocks can be derived in a similar fashion. Upon omitting algebraic details, the remaining three blocks can be obtained as follows
Evaluating the Hessian in (28) [and its blocks in (29)-(32)] at the true value of , and taking the expectation with respect to the noise vector , it is easy to verify that . Hence, the -related terms disappear, so the FIM can be expressed as 
where is introduced to show the rank-deficiency of , whose -th column is given as
To demonstrate the rank- deficiency of , it suffices to find a nonzero vector such that . To this end, consider the vector . It is straightforward to check that for all
therefore giving rise to . That is, for any nonzero , there always exists a nonzero vector