# Monitoring and Optimization for Power Grids:

A Signal Processing Perspective

\bstctlcite
IEEEexample:BSTcontrol

## I Introduction

Albeit the North American power grid has been recognized as the most important engineering achievement of the 20th century, the modern power grid faces major challenges [87]. Increasingly complex interconnections even at the continent size render prevention of the rare yet catastrophic cascade failures a strenuous concern. Environmental incentives require carefully revisiting how electrical power is generated, transmitted, and consumed, with particular emphasis on the integration of renewable energy resources. Pervasive use of digital technology in grid operation demands resiliency against physical and cyber attacks on the power infrastructure. Enhancing grid efficiency without compromising stability and quality in the face of deregulation is imperative. Soliciting consumer participation and exploring new business opportunities facilitated by the intelligent grid infrastructure hold a great economic potential.

The smart grid vision aspires to address such challenges by capitalizing on state-of-the-art information technologies in sensing, control, communication, and machine learning [2, 24]. The resultant grid is envisioned to have an unprecedented level of situational awareness and controllability over its services and infrastructure to provide fast and accurate diagnosis/prognosis, operation resiliency upon contingencies and malicious attacks, as well as seamless integration of distributed energy resources.

### I-a Basic Elements of the Smart Grid

A cornerstone of the smart grid is the advanced monitorability on its assets and operations. Increasingly pervasive installation of the phasor measurement units (PMUs) allows the so-termed synchrophasor measurements to be taken roughly 100 times faster than the legacy supervisory control and data acquisition (SCADA) measurements, time-stamped using the global positioning system (GPS) signals to capture the grid dynamics. In addition, the availability of low-latency two-way communication networks will pave the way to high-precision real-time grid state estimation and detection, remedial actions upon network instability, and accurate risk analysis and post-event assessment for failure prevention.

The provision of such enhanced monitoring and communication capabilities lays the foundation for various grid control and optimization components. Demand response (DR) aims to adapt the end-user power usage in response to energy pricing, which is advantageously controlled by utility companies via smart meters [29]. Renewable sources such as solar, wind, and tidal, and electric vehicles are important pieces of the future grid landscape. Microgrids will become widespread based on distributed energy sources that include distributed generation and storage systems. Bidirectional power flow to/from the grid due to such distributed sources has potentials to improve the grid economy and robustness. New services and businesses will be generated through open grid architectures and markets.

### I-B SP for the Grid in a Nutshell: Past, Present, and Future

Power engineers in the 60’s were facing the problem of computing voltages at critical points of the transmission grid, based on power flow readings taken at current and voltage transformers. Local personnel manually collected these readings and forwarded them by phone to a control center, where a set of equations dictated by Kirchoff’s and Ohm’s laws were solved for the electric circuit model of the grid. However, due to timing misalignment, instrumentation inaccuracy, and modeling uncertainties present in these measurements, the equations were always infeasible. Schweppe and others offered a statistical signal processing (SP) problem formulation, and advocated a least-squares approach for solving it [69]—what enabled the power grid monitoring infrastructure used pretty much invariant till now [57], [1].

This is a simple but striking example of how SP expertise can have a strong impact in power grid operation. Moving from the early 70’s to nowadays, the environment of the power system operation has become considerably more complex. New opportunities have emerged in the smart grid context, necessitating a fresh look. As will be surveyed in this article, modern grid challenges urge for innovative solutions that tap into diverse SP techniques from estimation, machine learning, and network science.

Avenues where significant contribution can be made include power system state estimation (PSSE) in various renditions, as well as “bad data” detection and removal. As costly large-scale blackouts can be caused by rather minor outages in distant parts of the network, wide-area monitoring of the grid turns out to be a challenging yet essential goal [78]. Opportunities abound in synchrophasor technology, ranging from judicious placement of PMUs to their role in enhancing observability, estimation accuracy, and bad data diagnosis. Unveiling topological changes given a limited set of power meter readings is a critical yet demanding task. Applications of machine learning to the power grid for clustering, topology inference, and Big Data processing for e.g., load/price forecasting constitute additional promising directions.

Power grid operations that can benefit from the SP expertise include also traditional operations such as economic dispatch, power flow, and unit commitment [84], [70], [25], as well as contemporary ones related to demand scheduling, control of plug-in electric vehicles, and integration of renewables. Consideration of distributed coordination of the partaking entities along with the associated signaling practices and architectures require careful studies by the SP, control, and optimization experts.

Without any doubt, computationally intelligent approaches based on SP methodologies will play a crucial role in this exciting endeavor. From grid informatics to inference for monitoring and optimization tools, energy-related issues offer a fertile ground for SP growth whose time has come.

The rest of the article is organized as follows. Modeling preliminaries for power system analysis are provided in Sec. II. Sec. III deals with the monitoring aspect, delineating various SP-intensive topics including state estimation and PMUs, as well as the inference, learning and cyber-security tasks. Section IV is devoted to grid optimization issues, touching upon both traditional problems in economic power system operations, as well as more contemporary topics such as demand response, electric vehicles, and renewables. The article is wrapped up with a few open research directions in Sec. V.

## Ii Modeling Preliminaries

Power systems can be thought of as electric circuits of even continent-wide dimensions. They obey multivariate versions of Kirchoff’s and Ohm’s laws, which in this section are overviewed using a matrix-vector notation. As the focus is laid on alternating current (AC) circuits, all electrical quantities involved (voltage, current, impedance, power) are complex-valued. Further, quantities are measured in the per unit (p.u.) system, which means that they are assumed properly normalized. For example, if the “base voltage” is kV, then a bus voltage of kV is 1.01 p.u. The p.u. system enables uniform single- and three-phase system analysis, bounds the dynamic range of calculations, and allows for uniform treatment over the different voltage levels present in the power grid [84], [25].

Consider first a power system module of two nodes, and , connected through a line. A node, also referred to as a bus in the power engineering nomenclature, can represent, e.g., a generator or a load substation. A line (a.k.a. branch) can stand for a transmission or distribution line (overhead/underground), or even a transformer. Two-node connections can be represented by the equivalent model depicted in Fig. 1 [99, 57], which entails the line series impedance and the total charging susceptance . The former comprises a resistive part and a reactive (actually inductive) one , that is . The line series admittance is often used in place of the impedance. Its real and imaginary parts are called conductance and susceptance, respectively. Letting denote the complex voltage at node , the current flowing from node to , and invoking Ohm’s and Kirchoff’s laws on the circuit of Fig. 1, yields

(1) |

The reverse-direction current is expressed symmetrically. Unless is zero, it holds that . A small shunt susceptance is typically assumed between every node and the ground (neutral), yielding the current .

Building on the two-node module, consider next a power system consisting of a set of buses along with a set of transmission lines. By Kirchoff’s current law, the complex current at bus denoted by must equal the sum of currents on the lines incident to bus ; that is,

(2) |

where is the set of buses directly connected to bus , and . Collecting node voltages (currents) in the vector (), leads to the multivariate Ohm’s law

(3) |

where is the so-termed bus admittance matrix with -th diagonal entry and -th off-diagonal entry if , and zero otherwise (cf. (2)). Matrix is symmetric and more importantly sparse, thus facilitating efficient storage and computations. On the contrary, the bus impedance matrix , defined as the inverse of (and not as the matrix of bus pair impedances), is full and therefore it is seldom used.

A major implication of (3) is control of power flows. Let be the complex power injected at bus whose real and imaginary parts are the active (reactive) power (). Physically, represents the power generated and/or consumed by plants and loads residing at bus . For bus and with denoting conjugation, it holds that , or after collecting all power injections in ( denotes a diagonal matrix holding on its diagonal) one arrives at (cf. (3))

(4) |

Complex power flowing from bus to a neighboring bus is similarly given by

(5) |

The ensuing analysis pertains mainly to nodal quantities. However, line quantities such as line currents and power flows over lines can be modeled accordingly using (1) and (5).

Typically, the complex bus admittance matrix is written in rectangular coordinates as . Two options become available from (4), depending on whether the complex nodal voltages are expressed in polar or rectangular forms. The polar representation yields [cf. (2)]

(6a) | ||||

(6b) |

where . Since and depend on phase differences , power injections are invariant to phase shifts of bus voltages. This explains why a selected bus called the reference, slack, or swing bus is conventionally assumed to have zero voltage phase without loss of generality.

If is known, the equations in (6) involve the variables . Among the nodal variables, (i) the reference bus has fixed ; (ii) pairs are controlled at generator buses (and are thus termed PV buses); while, (iii) power demands are predicted for load buses (also called PQ buses). Fixing these variables and solving the non-linear equations (6) for the remaining ones constitutes the standard power flow problem [84, Ch. 4]. Algorithms for controlling PV buses and predicting load at PQ buses are presented in Sec. IV-A and Sec. III-D3, respectively.

Pairs satisfying (approximately) power flow
equations paralleling (6) can be found in
[25, Ch. 3]. Among the approximations of the latter
as well as (6), the so called DC
model is reviewed next due to its importance in grid monitoring and
optimization. The DC model hinges on three assumptions:

(A1) The power network is purely inductive, which means
that is negligible. In high-voltage transmission lines, the
ratio is large enough so that
resistances can be ignored and the conductance part of
can be approximated by zero;

(A2) In
regular power system conditions, the voltage phase differences
across directly connected buses are small; thus, for every pair of neighboring buses , and the
trigonometric functions in (6) are
approximated as and
; and

(A3) Due to typical
operating conditions, the magnitude of nodal voltages is
approximated by one p.u.

Under (A1)-(A3) and upon exploiting the structure of (cf. (3)), the model in (6) boils down to

(7a) | ||||

(7b) |

where is the susceptance of the branch, and in deriving (7), approximation of nodal voltage magnitudes to unity implies , yet .

The DC model (7) entails linear equations that are neatly decoupled: active powers depend only on voltage phases, whereas reactive powers are solely expressible via voltage magnitudes. Furthermore, the linear dependence is on voltage differences. In fact, since and , active power flows across lines from the larger- to the smaller-voltage phase buses.

Consider now the active subproblem described by (7a). Stacking the nodal real power injections in and the nodal voltage phases in , leads to

(8) |

where the symmetric is defined similar to by only accounting for reactances. Specifically, for all , and , if line exists, and zero otherwise.

An alternative representation of is presented next. Define matrix , and the branch-bus incidence matrix , such that if its -th row corresponds to the branch, then , , and zero elsewhere. Based on these definitions, can be viewed as a weighted Laplacian of the graph describing the power network. This in turn implies that is positive semidefinite, and the all-ones vector lies in its null space. Further, its rank is if and only if the power network is connected. Since , it follows that ; stated differently, the total active power generated equals the active power consumed by all loads, since resistive elements and incurred thermal losses are ignored.

As a trivia, the terminology DC model stems from the fact that (8) models the AC power system as a purely resistive DC circuit by identifying the active powers, reactances, and the voltage phases of the former to the currents, the resistances, and the voltages of the latter.

Coming back to the exact power flow model of (4), consider now expressing nodal voltages in rectangular coordinates. If for all buses, it follows that

(9a) | ||||

(9b) |

Based on (9a) and (9b), it is clear that (re)active power flows depend quadratically on the rectangular coordinates of nodal voltages. Because (9) is not amenable to approximations invoked in deriving (6), the polar representation has been traditionally preferred over the rectangular one.

Before closing this section, a few words are due on modeling transformers that were not explicitly accounted so far. Upon adding the circuit surrounded by the yellow square to the model of Fig. 1, the possibility of having a transformer on a branch is considered in its most general setting [25], [99]. An ideal transformer residing on the line at the -th bus side yields and , where is its turn ratio. Hence, (1) readily generalizes to

(10) |

Using (10) in lieu of (1), a similar analysis can be followed with the exception that in the presence of phase shifters, the corresponding bus admittance matrix will not be symmetric. Note though that the DC model of (8) holds as is, since it ignores the effects of transformers anyway.

## Iii Grid Monitoring

In this section, SP tools and their roles in various grid monitoring tasks are highlighted, encompassing state estimation with associated observability and cyber-attack issues, synchrophasor measurements, as well as intriguing inference and learning topics.

### Iii-a Power System State Estimation

Simple inspection of the equations in Section II confirms that all nodal and line quantities become available if one knows the grid parameters , and all nodal voltages that constitute the system state. Power system state estimation (PSSE) is an important module in the supervisory control and data acquisition (SCADA) system for power grid operation. Apart from situational awareness, PSSE is essential in additional tasks, namely load forecasting, reliability analysis, the grid economic operations detailed in Sec. IV, network planning, and billing [25, Ch. 4]. Building on Sec. II, this section reviews conventional solutions and recent advances, as well as pertinent smart grid challenges and opportunities for PSSE.

#### Iii-A1 Static State Estimation

Meters installed across the grid continuously measure electric quantities, and forward them every few seconds via remote terminal units (RTUs) to the control center for grid monitoring. Due to imprecise time signaling and the SCADA scanning process, conventional metering cannot utilize phase information of the AC waveforms. Hence, legacy measurements involve (active/reactive) power injections and flows, as well as voltage and current magnitudes on specific grid points. Given the SCADA measurements and assuming stationarity over a scanning cycle, the PSSE module estimates the state, namely all complex nodal voltages collected in . Recall that according to the power flow models presented in Sec. II, all grid quantities can be expressed in terms of . Thus, the vector of SCADA measurements can be modeled as , where is a properly defined vector-valued function, and captures measurement noise and modeling uncertainties. Upon prewhitening, can be assumed standard Gaussian. The maximum-likelihood estimate (MLE) of can be then simply expressed as the nonlinear least-squares (LS) estimate

(11) |

Prior information, such as zero-injection buses () and feasible ranges (of and ), can be included as constraints in (11). In any case, the optimization problem is nonconvex. For example, when states are expressed in rectangular coordinates, the functions in are quadratic; cf. (9). In general, PSSE falls under the class of nonlinear LS problems, for which Gauss-Newton iterations are known to offer the “workhorse” solution [1, Ch. 2]. Specifically, upon expressing in polar coordinates, the quadratic can be linearized using Taylor’s expansion around a starting point. The Gauss-Newton method hence approximates the cost in (11) with a linear LS one, and relies on its minimizer to initialize the subsequent iteration. This iterative procedure is closely related to gradient descent algorithms for solving nonconvex problems, which are known to encounter two issues: i) sensitivity to the initial guess; and ii) convergence concerns. Without guaranteed convergence to the global optimum, existing variants improve numerical stability of the matrix inversions per iteration [1]. In a nutshell, the grand challenge so far remains to develop a solver attaining or approximating the global optimum at polynomial-time.

Recently, a semidefinite relaxation (SDR) approach has been recognized to develop polynomial-time PSSE algorithms with the potential to find a globally optimal solution [95], [96]. Challenged by the nonconvexity of (11), the measurement model is reformulated as a linear function of the outer-product matrix , where the state is now expressed in rectangular coordinates. This allows reformulating (11) to a semidefinite program (SDP) with the additional constraint . Dropping the nonconvex rank constraint to acquire a convex SDP has been well-appreciated in signal processing and communications; see e.g., [52]. The SDR-based PSSE has been shown to approximate well the global optimum, while it is possible to further improve computational efficiency by exploiting the SDP problem structure [95].

#### Iii-A2 Dynamic State Estimation

As power systems evolve in time, dynamic PSSE is well motivated thanks to its predictive ability emerging when additional temporal information is available. In practice, it is challenged by both the unknown dynamics and the requirement of real-time implementation. While the latter could become tractable with (extended) Kalman filtering (KF) techniques, it is more difficult to develop simple state-space models to capture the power system dynamics.

There have been various proposals for state transition models in order to perform the prediction step, mostly relying on a quasi-steady state behavior; see [67] for a review of the main developments. One simplified and widely used model poses a “random-walk” behavior expressing the state in polar coordinates per time slot as , where is zero-mean white Gaussian with a diagonal covariance matrix estimated online [57]. A more sophisticated dynamical model reads , where is a diagonal transition matrix and captures the process mismatch. Recently, a quasi-static state model has been introduced to determine by approximating first-order effects of load data [7].

For the correction step, the extended KF (EKF) is commonly used via linearizing the measurement model around the state predictor [57, 67]. To overcome the reduced accuracy of EKF linearization, unscented KF (UKF) of higher complexity has been reported in [81]. Particle filtering may also be of interest if its computational efficiency can be tolerated by the real-time requirements of power systems.

#### Iii-A3 Distributed State Estimation

Parallel and distributed solvers were investigated early on [69]. The motivation was primarily computational, even though additional merits of coordination across adjacent control areas were also recognized. In vertically integrated electricity markets, each local utility estimated its own state and modeled the rest of the system at boundary points using only local measurements. Adjacent power systems were connected via tie lines, which were basically used in emergency situations, and PSSE was performed locally with limited interaction among control centers.

Currently, the deregulation of energy markets has led to continent-wide interconnections that are split into subnetworks monitored by independent system operators (ISOs). Increasing amount of power is transferred over multiple control areas, and tie lines must be accurately monitored for reliability and accounting [27]. The ongoing penetration of renewables further intensifies long-distance power transfers, while their intermittent nature calls for frequent monitoring. Interconnection-level PSSE is therefore a key factor for modernizing power grids. Even though advanced instrumentation can provide precise and timely measurements (cf. Sec. III-C), an interconnection could consist of thousands of buses. The latter together with privacy policies deem decentralized PSSE a pertinent solution.

To understand the specifications of distributed PSSE, consider the toy example of Fig. 2. Area 2 consists of buses , but it also collects current measurements on tie lines . Its control center has two options regarding these measurements: either to ignore them and focus on the internal state, or to consider them and augment its state by the external buses . The first option is statistically suboptimal; let alone it may incur observability loss (check for example Area 3). For the second option, neighboring areas should consent on shared variables. This way, agreement is achieved over tie line charges and the global PSSE problem is optimally solved.

It was early realized that for a chain of serially interconnected areas, KF-type updates can be implemented incrementally in space [69, Pt. III]. For arbitrarily connected areas though, a two-level approach with a global coordinator is required [69]: Local measurements involving only local states are processed to estimate the latter. Local estimates of shared states, their associated covariance matrices, and tie line measurements are forwarded to a global coordinator. The coordinator then updates the shared states and their statistics. Several recent renditions of this hierarchical approach are available under the assumption of local observability [27, 28]. A central coordinator becomes a single point of failure, while the sought algorithms may be infeasible due to computational, communication, or policy limitations. Decentralized solutions include block Jacobi iterations [16], and the auxiliary problem principle [19]. Local observability is waived in [88], where a copy of the entire high-dimensional state vector is maintained per area, and linear convergence of the proposed first-order algorithm scales unfavorably with the interconnection size. A systematic framework based on the alternating direction method of multipliers is put forth in [34]. Depending solely on existing PSSE software, it respects privacy policies, exhibits low communication load, and its convergence is guaranteed even in the absence of local observability. Finally, for a survey on multi-area PSSE, refer to [28].

#### Iii-A4 Generalized State Estimation (G-SE)

PSSE presumes that grid connectivity and the electrical parameters involved (e.g., line admittances) are known. Since these are oftentimes unavailable, generalized state estimation (G-SE) extends the PSSE task to jointly recovering them too [1, Ch. 8], [25, Sec. 4.10]. PSSE operates on the bus/branch grid model; cf. Fig. 3(a). A more meticulous view of this grid is offered by the corresponding bus section/switch model depicted in Fig. 3(b). This shows how a bus is partitioned by circuit breakers into sections (e.g., bus 1 to sections ), or how a substation can appear as two different buses (e.g., sections and mapped to buses and , respectively). Circuit breakers are zero-impedance switching components and are used for seasonal, maintenance, or emergency reconfiguration of substations. For some of them, the status and/or the power they carry may be reported to the control center. A topology processing unit collects this information and validates network connectivity prior to PSSE [57].

Even though topology malfunctions can be detected by large PSSE residual errors, they are not easily identifiable [1]. Hence, joint PSSE with topology processing under the G-SE task has been a well-appreciated solution [57]. G-SE essentially performs state estimation using the bus section/switch model. Due to the zero impedances though, breaker flows are appended to the system state. For regular transmission lines of unknown status or parameters, G-SE augments the system state by their flows likewise. In any case, to tackle the increased state dimensionality, breakers of known status are treated as constraints: open (closed) breakers correspond to zero flows (voltage drops). Practically, not all circuit breakers are monitored; and even for those monitored, the reported status may be erroneous [1]. Nowadays, G-SE is further challenged: the penetration of renewables and DR programs will cause frequent substation reconfigurations. Yet, G-SE can be aided by advanced substation automation and contemporary intelligent electronic devices (IEDs).

Identifying substation configuration errors has been traditionally treated by extending robust PSSE methods (cf. Sec. III-B2) to the G-SE framework. Examples include the largest normalized residual test, and the least-absolute value and the Huber’s estimators [1, Ch. 8]. To reduce the dimensionality of G-SE, an equivalent smaller-size model has been developed in [26]. The method in [37] leverages advances in compressive sampling and instrumentation technology. Upon regularizing the G-SE cost by -norms of selected vectors, it promotes block sparsity on real and imaginary pairs of suspected breakers.

### Iii-B Observability, Bad Data, and Cyber-attacks

The PSSE module presumes that meters are sufficiently many and well distributed across the grid so that the power system is observable. Since this may not always be the case, observability analysis is the prerequisite of PSSE. Even when the set of measurements guarantees system state observability, resilience to erroneous readings should be solicited by robust PSSE methods. Nonetheless, specific readings (un)intentionally corrupted can harm PSSE results. This section studies these intertwined topics.

#### Iii-B1 Observability Analysis

Given the network model and measurements, observability amounts to the ability of uniquely identifying the state . Even when the overall system is unobservable, power system operators are interested in observable islands. An observable island is a maximally connected sub-grid, whose states become observable upon selecting one of its buses as a reference. Identifying observable islands is important because it determines which line flows and nodal injections can be uniquely recovered. Identifying unobservable islands further provides candidate locations for additional (pseudo-)measurements needed to restore global observability. Pseudo-measurements are prior state information about e.g., scheduled generations, forecasted loads, or predicted values (based on historical data) to aid PSSE in the form of measurements with high-variance additive noise (estimation error).

Due to instrument failures, communication delays, and network reconfigurations, observability must be checked online. The analysis typically resorts to the DC model (7), and hence, it can be performed separately per active and reactive subproblems thanks to the - and - decoupling. Since power measurements oftentimes come in (re)active pairs, the observability results obtained for the active subproblem (8) carry over to the reactive one, assuming additionally that at least one nodal voltage magnitude is available per observable island (the reactive analogue of the reference bus).

Commonly used observability checks include topological as well as numerical ones; see [1, Ch. 4] for a review. Topological observability testing follows a graph-theoretic approach [14]. Given the graph of the grid and the available set of measurements, this test builds a maximal spanning tree. Its branches are either lines directly metered or lines incident to a metered bus, while every branch should correspond to a different measurement. If such a tree exists, the grid is deemed observable; otherwise, the so-derived maximal spanning forest defines the observable islands.

On the other hand, numerical observability considers the identifiability of the noiseless approximate DC model [58]. Linear system theory asserts that the state is observable if is full column rank. Recall however that active power measurements introduce a voltage phase shift ambiguity (cf. (6)–(7)). That is why a power system with branch-bus incidence matrix is deemed observable simply if for every satisfying , i.e., . Observe now that the entries of are proportional to line power flows. Hence, intuitively, whenever there is a non-zero power flow in the power grid, at least one of its measurements should be non-zero for it to be fully observable. When this condition does not hold, observable islands can be identified via the iterative process developed in [58].

#### Iii-B2 Robust State Estimation by Cleansing Bad Data

Observability analysis treats all measurements received as reliable and trustworthy. Nonetheless, time skews, communication failures, parameter uncertainty, and infrequent instrument calibration can yield corrupted power system readings, also known as “bad data” in the power engineering parlance. If bad data pass through simple screening tests, e.g., polarity or range checks, they can severely deteriorate PSSE performance. Coping with them draws methods from robust statistical SP to identify outlying measurements, or at least detect their presence in the measurement set.

Two statistical tests, namely the -test and the largest normalized residual test (LNRT), were proposed in [69, Part II], and are traditionally used for bad data detection and identification, respectively [57], [1, Ch. 5]. Both tests rely on the model , assuming a full column rank matrix and a zero voltage phase at the reference bus. The two tests check the residual error of the LS estimator which can be expressed as , where satisfying . Apparently, when is standardized Gaussian, is Gaussian too with covariance ; hence, follows a distribution with degrees of freedom. The -test then declares an LS-based PSSE possibly affected by outliers whenever exceeds a predefined threshold.

LNRT exploits further the Gaussianity of . Indeed, as should be standard Gaussian for all when bad data are absent, LNRT finds the maximum absolute value among these ratios and compares it against a threshold to identify a single bad datum [1, Sec. 5.7]. Practically, if a bad datum is detected, it is removed from the measurement set, and the LS estimator is re-computed. The process is repeated till no bad data are identified. Successive LS estimates can be efficiently computed using recursive least-squares (RLS). The LNRT is essentially the leave-one-out approach, a classical technique for identifying single outliers. Interesting links between outlier identification and -(pseudo)-norm minimization are presented in [42] and [34] under the Bayesian and the frequentist frameworks, respectively.

Apart from the two tests treating bad data a posteriori, outlier-robust estimators, such as the least-absolute deviation, the least median of squares, or Huber’s estimator have been considered too; see [1]. Recently, -norm based methods have been devised; see e.g., [42], [90], [34].

Unfortunately, all bad data cleansing techniques are vulnerable to the so called “critical measurements” [1]. A measurement is critical if once removed from the measurement set, the power system becomes unobservable. If for example one removes the current measurement on line from the grid of Fig. 2, then bus voltage cannot be recovered. Actually, it can be shown that the -th measurement is critical if the -th column of is zero, which translates to being always zero too. Due to the latter, the LNRT is undefined for critical measurements.

Intuitively, a critical measurement is the only observation related to some state. Thus, this measurement cannot be cross-validated or questioned as an outlier, but it should be blindly trusted. The existence of critical measurements in PSSE reveals the connection between bad data and observability analysis. Apparently, the notion of critical measurements can be generalized to multiple simultaneously corrupted readings. Even though such events are naturally rare, their study becomes timely nowadays under the threat of targeted cyber-attacks as explained next.

#### Iii-B3 Cyber-attacks

As a complex cyber-physical system spanning a large geographical area, the power grid inevitably faces challenges in terms of cyber-security. With more data acquisition and two-way communication required for the future grid, enhancing cyber-security is of paramount importance. From working experience in dealing with the Internet and telecommunication networks, there is potential for malicious and well-motivated adversaries to either physically attack the grid infrastructure, or remotely intrude the SCADA system. Among all targeted power grid monitoring and control operations, the PSSE task in Sec. III-A appears to be of extreme interest as adversaries can readily mislead operators and manipulate electric markets by altering the system state [42], [89].

Most works analyzing cyber-attacks consider the linear measurement model modified as , where the attack vector has non-zero entries corresponding to compromised meters. It was initially pointed out in [50] that if the adversary knows , the attack can be constructed to lie in the range space of so that the system operator can be arbitrarily misled. Under such a scenario, the attack cannot be detected. Such attacks are related to the observability and bad data analysis described earlier, since by deleting the rows of corresponding to the nonzero entries of , the resultant system becomes unobservable [42]. Various strategies to construct have been derived in [50], constrained by the number of counterfeit meters; see also [42] for the minimum number of such meters. Cyber-attacks under linear state-space models are considered in [63].

A major limitation of existing works lies in the linear measurement model assumption, not to mention the practicality of requiring attackers to know the full system configuration. Attacks in nonlinear measurement models for AC systems are studied in [97]. Granted that a nonlinear PSSE model can be approximated around a given state point, it is not obvious how the attacker can acquire such dynamically varying information in real time in order to construct the approximation. This requires a per-adversary PSSE and assessment of a significant portion of meter measurements. On the defender’s side, robustifying PSSE against bad data is a first countermeasure. Since cyber-attacks can be judiciously designed by adversaries, they may be more challenging to identify, thus requiring further prior information e.g., on the state vector statistics [42].

### Iii-C Phasor Measurement Units

#### Iii-C1 Phasor Estimation

PMUs are contemporary devices complementing legacy (SCADA) meters in advancing power system applications via their high-accuracy and time-synchronized measurements [65]. Different from SCADA meters which provide amplitude (power) related information, PMUs offer also phase information. At the implementation level, current and voltage transformers residing at substations provide the analog input waveforms to a PMU. After anti-alias filtering, each one of these analog signals is sampled at a rate several times the nominal power system frequency (50/60 Hz). If the signal of interest has frequency , its phasor information (magnitude and phase) can be obtained simply by correlating a window of its samples with the sampled cosine and sine functions, or equivalently by keeping the first (non-DC) discrete Fourier transform component. Such correlations can be implemented also recursively. Since power system components operate in the frequency range Hz, acquiring phasor information for off-nominal frequency signals has been also considered [65, Ch. 3].

The critical contribution of PMU technology to grid instrumentation is time-tagging. Using precise GPS timing (the one pulse-per-second signal), synchrophasors are time-stamped at the universal time coordinated (UTC). PMU data can thus be consistently aggregated across large geographic areas. Apart from phasors, PMUs acquire the signal frequency and its frequency derivative too. Data from several PMUs are collected by a phasor data concentrator (PDC) which performs time-aligning, local cleansing of bad data, and potentially data compression before forwarding data flows to the control center. The IEEE standards C37.118.1/2-2011 determine PMU functional requirements.

#### Iii-C2 PMU Placement

Although PMU technology is sufficiently mature, PMU penetration has been limited so far, mainly due to the installation and networking costs involved [78]. Being the key technology towards wide area monitoring though guarantees their wide deployment. During this instrumentation stage, prioritizing PMU locations is currently an important issue for utilities and reliability operators worldwide. Many PMU placement methods are based on the notion of topological observability; cf. Sec. III-B1. A search algorithm for placing a limited number of PMUs on a maximal spanning forest is developed in [61]. Even though topological observability in general does not imply numerical observability, for practical measurement matrices it does [57]. In any case, a full column rank yet ill-conditioned linear regression matrix can yield numerically unstable estimators. Estimation accuracy rather than observability is probably a more meaningful criterion. Towards that end, PMU placement is formulated as a variation of the optimal experimental design problem in [48], [35]. The approach in [48] considers estimating voltage phases only, ignores PMU current measurements, and proposes a greedy algorithm. In [35], the state is expressed in rectangular coordinates, all PMU measurements are considered, and the SDP relaxation of the problem is solved via a projected gradient algorithm. For a detailed review of PMU placements, the reader is referred to [53].

#### Iii-C3 State Estimation with PMUs

As explained in Sec. III-A1, PSSE is conventionally performed using SCADA measurements [84, Ch. 12]. PMU-based PSSE improves estimation accuracy when conventional and PMU measurements are jointly used [66, 65]. However, aggregating conventional and synchrophasor readings involves several issues. First, SCADA measurements are available every 4 secs, whereas 30-60 synchrophasors can be reported per sec. Second, explicitly including conventional measurements reduces the linear PMU-based PSSE problem into a non-linear one. Third, compatibility to existing PSSE software and phase alignment should be also considered. An approach to address these challenges is treating SCADA-based estimates as pseudo-measurements during PMU-driven state estimation [65]. Essentially, the slower rate SCADA-based state estimates, expressed in rectangular coordinates, together with their associated covariance matrix can be used as a Gaussian prior for the faster rate linear PSSE problem based on PMU measurements [65], [35]. Regarding phase alignment, as already explained SCADA-based estimates assume the phase of the reference bus to be zero, whereas PMUs record phases with respect to GPS timing. Aligning the phases of the two estimates can be accomplished by PMU-instrumenting the reference bus, and then simply adding its phase to all SCADA-based state estimates [65].

Synchrophasor measurements do not contribute only to PSSE. Several other monitoring, protection, and control tasks, ranging from local to interconnection-wide scope can benefit from PMU technology. Voltage stability, line parameter estimation, dynamic line rating, oscillation and angular separation monitoring, small signal analysis are just a few entries from the list of targeted applications [78], [65].

### Iii-D Additional Inference and Learning Issues

PSSE offers a prototype class of problems that SP tools can be readily employed to advance grid monitoring performance, especially after leveraging recent PMU technology to complement SCADA measurements. However, additional areas can benefit from SP algorithms applied to change detection, estimation, classification, prediction, and clustering aspects of the grid.

#### Iii-D1 Line Outage Identification

Unexpected events, such as a breaker failure, a tree fall, or a lightning strike, can make transmission lines inoperative. Unless the control center becomes aware of the outage promptly, power generation and consumption will remain almost unchanged across the grid. Due to flow conservation though, electric currents will be automatically altered in the outaged transmission network. Hence, shortly after, a few operating lines may exceed their ratings and successively fail. A cascading failure can spread over interconnected systems in a few minutes and eventually lead to a costly grid-wide blackout in less than an hour. Timely identifying line outages, or more generally abrupt changes in line parameters, is thus critical for wide-area monitoring.

One could resort to the generalized PSSE module to identify line outages (cf. Sec. III-A4). Yet most existing topology processors rely on data of the local control area (a.k.a. internal) system; see also Fig. 4. On the other hand, flow conservation can potentially reveal line changes even in external systems. This would be a non-issue if inter-system data were available at a sufficiently high rate. Unfortunately, the system data exchange (SDX) module of the North-American Electric Reliability Corporation (NERC) can provide the grid-wide basecase topology only on an hourly basis [76], while the desideratum here is near-real-time line monitoring. In a nutshell, each internal system needs to timely identify line changes even in the external systems, relying only on local data and the infrequently updated basecase topology.

To concretely lay out the problem, consider the pre- and post-event states, and let denote the subset of lines in outage. Suppose that the interconnected grid has reached a stable post-event state, and it remains connected [76]. With reference to the linear DC model in (8), its post-event counterpart reads , where captures small zero-mean power injection perturbations. Recalling from Sec. II that , the difference can be expressed as . With , the “difference model” can be written as , where , . Based on the latter, to identify of a given cardinality , one can enumerate all possible topologies in outage, and select the one offering the minimum LS fit. Such an approach incurs combinatorial complexity, and has thus limited the existing exhaustive search methods to identifying single [76], or at most double line outages [77]. A mixed-integer programming approach was proposed in [20], which again deals with single line outages.

To bypass this combinatorial complexity, [98] considers an overcomplete representation capturing all possible line outages. By constructing an vector , whose -th entry equals , if , and otherwise, it is possible to reduce the previous model to a sparse linear regression one given by

(12) |

Since the control center only has estimates of the internal bus phases, it is necessary to solve (12) for , and extract the rows corresponding to the internal buses. This leads to a linear model slightly different from (12); but thanks to the overcomplete representation, identifying amounts to recovering . The key point here is the small number of line outages that makes the sought vector sparse. Building on compressive sampling approaches, sparse signal recovery algorithms have been tested in [98] using IEEE benchmark systems, and near-optimal performance was obtained at computational complexity growing only linearly in the number of outages.

#### Iii-D2 Mode Estimation

Oscillations emerge in power systems when generators are interconnected for enhanced capacity and reliability. Generator rotor oscillations are due to lack of damping torque, and give rise to oscillations of bus voltages, frequency, and (re)active power flows. Oscillations are characterized by the so-termed electromechanical modes, whose properties include frequency, damping, and shape [44]. Depending on the size of the power system, modal frequencies are often in the range of Hz. While a single generator usually leads to local oscillations at the higher range ( Hz), inter-area oscillations among groups of generators lie in the lower range ( Hz). Typically, the latter ones are more troublesome, and without sufficient damping they grow in magnitude and may finally result in even grid breakups. Hence, estimating electromechanical modes, especially the low-frequency ones, is truly important, and known as the small-signal stability problem in power system analysis [44].

Albeit near-and-dear to SP expertise on retrieving harmonics, modal estimation is challenging primarily due to the nonlinear and time varying properties of power systems, as well as the co-existence of several oscillation modes at nearby frequencies. Fortunately, the system behaves relatively linearly when operating at steady state, and can thus be approximated by the continuous-time vector differential equations , where the eigenvalues of characterize the oscillation modes, and and correspond to the exogenous input and the random perturbing noise, respectively. Assuming linear dynamic state models, mode estimation approaches are either model- or measurement-based. The former construct the exact nonlinear differential equations from system configurations, and then linearize them at the steady-state to obtain for estimating electromechanical modes [55]. In measurement-based methods, oscillation modes are acquired directly by peak-picking the spectral estimates obtained using linear measurements [79]. Since the complexity of model-based methods grows with the network size, scalability issues arise for larger systems. With PMUs, modes can be estimated directly from synchrophasors, and even updated in real time.

Depending on the input , the measurements are either ambient, or ring-down (a.k.a. transient), or probing; see e.g., Fig. 5. With only random noise attributed to load perturbations, the system operates under an equilibrium condition and the ambient measurements look like pseudo-noise. A ring-down response occurs after some major disturbance, such as line tripping or a pulse input , and results in observable oscillations. Probing measurements are obtained after intentionally injecting known pseudo-random inputs (probing signals), and can be considered as a special case of ring-down data. Missing entries and outliers are also expected in meter measurements, hence robust schemes are of interest for mode estimation [94]. Measurement-based algorithms can be either batch or recursive. In batch modal analysis, off-line ring-down data are modeled as a sum of damped sinusoids and solved using e.g., Prony’s method to obtain linear transfer functions. Ambient data are handled by either parametric or nonparametric spectral analysis methods [79]. To recursively incorporate incoming data, several adaptive SP methods have been successfully applied, including least-mean squares (LMS) and RLS [94]. Apart from utilizing powerful statistical SP tools for mode estimation, it is also imperative to judiciously design efficient probing signals for improved accuracy with minimal impact to power system operations [79].

#### Iii-D3 Load and Electricity Price Forecasting

Smooth operation of the grid depends heavily on load forecasts. Different applications require load predictions of varying time scales. Minute- and hour-ahead load estimates are fed to the unit commitment and economic dispatch modules as described in Sec. IV-A. Predictions at the week scale are used for reliability purposes and hydro-thermal coordination; while forecasts for years ahead facilitate strategic generation and transmission planning. The granularity of load forecasts varies spatially too, ranging from a substation, utility, to an interconnection level. Load forecasting tools are essential for electricity market participants and system operators. Even though such tools are widely used in vertically organized utilities, balancing supply and demand at a deregulated electricity market makes load forecasting even more important. At the same time, the introduction of electric vehicles and DR programs further complicates the problem.

Load prediction can be simply stated as the problem of inferring future power demand given past observations. Oftentimes, historical and predicted values of weather data (e.g., temperature and humidity) are included as prediction variables too. The particular characteristics of power consumption render it an intriguing inference task. On top of a slowly increasing trend, load exhibits hourly, weekly, and seasonal periodicities. Holidays, extreme weather conditions, big events, or a factory interruption create outlying data. Moreover, residential, commercial, and industrial consumers exhibit different power profiles. Apart from the predicted load, uncertainty descriptors such as confidence intervals are important. Actually, for certain reliability and security applications, daily, weekly, or seasonal peak values are critically needed.

Several statistical inference methods have been applied for load forecasting: ordinary linear regression; kernel-based regression and support vector machines; time series analysis using auto-regressive (integrated) moving average (with exogenous variables) models (ARMA, ARIMA, ARIMAX); state-space models with Kalman and particle filtering; neural networks, expert systems, and artificial intelligence approaches. Recent academic works and current industry practices are variations and combinations of these themes reviewed in [70, Ch. 2]. Low-rank models for load imputation have been pursued in [54].

Load forecasting is not the only prediction task in modern power systems. Under a deregulated power industry, market participants can also leverage estimates of future electricity prices. To appreciate the value of such estimates, consider a day-ahead market: an ISO determines the prices of electric power scheduled for generation and consumption at the transmission level during the 24 hours of the following day. The ISO collects the hourly supply and demand bids submitted by generator owners and utilities. Using the optimization methods described later in Sec. IV-A, the grid is dispatched in the most economical way while complying with network and reliability constraints. The output of this dispatch are the power schedules for generators and utilities, along with associated costs. Modern electricity markets are complex. Trading and hedging strategies, weather and life patterns, fuel prices, government policies, scheduled and random outages, reliability rules, all these factors influence electricity prices. Even though prices are harder to predict than loads, the task is truly critical in financial decision making [3]. The solutions proposed so far include econometric methods, physical system modeling, time series and statistical methods, artificial intelligence approaches, and kernel-based approaches; see e.g., [3], [86], [36] and references therein.

#### Iii-D4 Grid Clustering

Modularizing power networks is instrumental for grid operation as it facilitates decentralized and parallel computation. Partitioning the grid into control regions can also be beneficial for implementing “self-healing” features, including islanding under contingencies [47]. For example, after catastrophic events, such as earthquakes, alternative power supplies from different management regions may be necessary due to power shortage and system instability. Furthermore, grid partitioning is essential for the zonal analysis of power systems, to aid load reliability assessment, and operational market analysis [8]. In general, it is imperative to partition the grid judiciously in order to cope with issues involving connected or disconnected “subgrids.” Regional partitioning of the North American grid is illustrated in Fig. 6, where each interconnection is further divided into several zones for various planning and operation purposes. However, the static and manual grid partitioning currently in operation may soon become obsolete with the growing incorporation of renewables and the overall system scaling.

The clustering criterion must be in accordance with grid partitioning goals. In islanding applications, sub-groups of generators are traditionally formed by minimizing the real generator-load imbalance to regulate the system frequency within each island. Recently, reactive power balance has been incorporated in a multi-objective grid partitioning problem to support voltage stability in islanding [47]. For these methods, it is necessary to reflect the real-time operating conditions that depend on the slow-coherency among generators, and the flow density along transmission lines.

Different from the islanding methods that deal with real-time contingencies, zonal analysis intends to address the long-term planning of transmission systems. Therefore, it is critical to define appropriate distance metrics between buses. Most existing works on long-term reliability have focused on the knowledge of network topology, including the seminal work of [83], which pointed out the “small-world” effects in power networks. To account for the structure imposed by Kirchhoff’s laws, it was proposed in [8] to define “electrical distances” between buses using the inverse admittance matrix.

## Iv Optimal Grid Operation

Leveraging the extensive monitoring and learning modalities outlined in the previous section, the next-generation grid will be operated with significantly improved efficiency and reduced margins. After reviewing classical results on optimal grid dispatch, this section outlines challenges and opportunities related to demand-response programs, electric vehicle charging, and the integration of renewable energy sources with particular emphasis on the common optimization tools engaged.

### Iv-a Economic Operation of Power Systems

#### Iv-A1 Economic Dispatch

Economic dispatch (ED) amounts to optimally setting the generation output in an electric power network so that the load is served and the cost of generation is minimized. ED pertains to generators which consume some sort of non-renewable fuel in order to produce electric energy, the most typical fuel types being oil, coal, natural gas, or uranium. In what follows, a prototype ED problem is described, with focus placed on a specific time span, e.g. 10 minutes or one hour, over which the generation output is supposed to be roughly constant.

Specifically, consider a network with generators. Let be the output of the th generator in MWh. The cost of the th generator is determined by a function , which represents the cost in $ for producing energy of MWh (i.e., maintaining power output MW for one hour). The cost is modeled as strictly increasing and convex, with typical choices including piecewise linear or smooth quadratic functions. The output of each generator is an optimization variable in ED, constrained within minimum and maximum bounds, and , determined by the generator’s physical characteristics [84, Ch. 2]. Since once a power plant is on, it has substantial power output, is commonly around 25% of .

With denoting the load forecasted as described in Section III-D3, the prototype ED problem is to minimize the total generation cost so that there is supply-demand balance within the generators’ physical limits:

(13a) | ||||

subj. to | (13b) | |||

(13c) |

Problem (13) is convex, so long as the functions are convex. In this case, it can be solved very efficiently. Convex choices of offer a model approximating the true generation cost quite well and are used widely in the literature. Nevertheless, the true cost in practice may not be strictly increasing or convex, while the power output may be constrained to lie in a collection of disjoint subintervals . These specifications make ED nonconvex, and hence hard to solve. A gamut of approaches for solving the ED problem can be found in [84, Ch. 3].

Following a duality approach, suppose that Lagrange multiplier corresponds to constraint (13b). The multiplier has units $/MWh, which has the meaning of price. Then, the KKT optimality condition implies that for the optimal generation output and the optimal multiplier , it holds that

(14) |

Due to (14), is the -th generator’s cost in dollars. Moreover, if is the price at which each generator is getting paid to produce electricity, then is the profit for the -th generator. Hence, the minimum in (14) is the net cost, i.e., the cost minus the profit, for generator . The latter reveals that the optimal generation dispatch is the one minimizing the net cost for each generator. If an electricity market is in place, ED is solved by the ISO, with representing the supply bids.

There are two take-home messages here. First, a very important operational feature of an electrical power network is to balance supply and demand in the most economical manner, and this can be cast as an optimization problem. Second, the Lagrange multiplier corresponding to the supply-demand balance equation can be readily interpreted as a price. However, the formulation in (13) entails two simplifying assumptions: (i) it does not account for the transmission network; and (ii) it only pertains to a specific time interval, e.g., one hour. In practice, the power output across consecutive time intervals is limited by the generator physical characteristics. Even though the more complex formulations presented next alleviate these simplifications, the two take-home messages are still largely valid.

#### Iv-A2 Optimal Power Flow

The first generalization is to include the transmission network, using the DC load flow model of Sec. II; cf. (8). The resultant formulation constitutes the DC optimal power flow (DC OPF) problem [12]. Specifically, it is postulated that at each bus there exist a generator and a load with output , and demand , respectively. The cases of no or multiple generators/loads on a bus can be readily accommodated.

Recall from (7a) that the real power flow from bus to is approximated by . The bus angles are also variables in the DC OPF problem that reads

(15a) | |||

subj. to | |||

(15b) | |||

(15c) | |||

(15d) |

The objective in (15a) is the total generation cost. Constraint (15b) is the per bus balance. Specifically, the left-hand side of (15b) amounts to the net power injected to bus from the generator and the load situated at the bus, while the right-hand side is the total power that flows towards all neighboring buses. Upon defining vectors for the generator and the load powers, (15b) could be written in vector form as [cf. (8)]. Finally, constraint (15d) enforces power flow limits for line protection.

For convex generation costs , the DC-OPF problem is convex too, and hence, efficiently solvable. A major consequence of considering per bus balance equations is that every bus may have a different Lagrange multiplier. The pricing interpretation of Lagrange multipliers implies that a different price, called locational marginal price, corresponds to each bus. The ED problem (13) can be thought of as a special case of DC OPF, where the entire network consists of a single bus on which all generators and loads reside.

Due to the DC load flow approximation, the accuracy of the DC OPF greatly depends on how well assumptions (A1)-(A3) hold for the actual power system. For better consistency with (A2), it is further suggested to penalize the cost (15a) with the sum of squared voltage angle differences , which retains convexity. Even if the DC OPF is a rather simplified model for actual power systems, it is worth stressing that it is used for the day-to-day operation in several North American ISOs.

Consider next replacing the DC with the AC load flow model (cf. Sec. II) in the OPF context. Generators and loads are now characterized not only by their real powers, but also the reactive ones, denoted as and . The AC OPF takes the form

(16a) | |||

subj. to | |||

(16b) | |||

(16c) | |||

(16d) |

Constraint (16b) reveals that now both the real and reactive powers must be balanced per bus. Recall further that represents the complex power flowing over line . Therefore, the first constraint in (16d) refers to the real power flowing over line [cf. (15d)], while the second to the apparent power. The last constraint in (16d) calls for voltage amplitude limits.

Due to the nonlinear (quadratic equality) couplings between the power quantities and the complex voltage phasors, the AC OPF in (16) is highly nonconvex. Various nonlinear programming algorithms have been applied for solving it, including the gradient method, Newton-Raphson, linear programming, and interior-point algorithms; see e.g., [84, Ch. 13]. These algorithms are based on the KKT necessary conditions for optimality, and can only guarantee convergence to a stationary point at best. Taking advantage of the quadratic relations from voltage phasors to all power quantities as in SE, the SDR technique has been successfully applied, while a zero duality gap has been observed for many practical instances of the AC OPF, and theoretically established for tree networks; see [46, 45], and references therein. SDR-based solvers for three-phase OPF in distribution networks is considered in [17].

The AC OPF offers the most detailed and accurate model of the transmission network. Two main advantages over its DC counterpart are: i) the ability to capture ohmic losses; and ii) its flexibility to incorporate voltage constraints. The former is possible because the resistive part of the line -model is included in the formulation. Recall in contrast that assumption (A1) in the DC model sets . But it is exactly the resistive nature of the line that causes the losses. In view of (16), the total ohmic losses can be expressed as . Such losses in the transmission network may be as high as 5% of the total load so that they cannot be neglected [25, Sec. 5.2].

The discussion on OPF—with DC or AC power flow—so far has focused on economic operation objectives. System reliability is another important consideration, and the OPF can be modified in order to incorporate security constraints too, leading to the security-constrained OPF (SCOPF). Security constraints aim to ensure that if a system component fails—e.g., if a line outage occurs—then the remaining system remains operational. Such failures are called contingencies. Specifically, the SCOPF aims to find an operating point such that even if a line outage occurs, all post-contingency system variables (powers, line flows, bus voltages, etc.) are within limits. The primary concern is to avoid cascading failures that are the main reasons for system blackouts. As explained in Sec. III-D1, if a line is in outage, the power flows on all other lines are adjusted automatically to carry the generated power.

SCOPF is a challenging problem due to the large number of possible contingencies. For the case of the DC OPF, power flows after a line outage are linearly related to the flows before the outage through the line outage distribution factors (LODFs) [12], [84, Ch. 11]. The LODFs can be efficiently calculated based on the bus admittance matrix and are instrumental in the security-constrained DC OPF. The case of AC OPF is much more challenging, and a possible approach is enumeration of all possible contingency cases; see e.g., [84, Sec. 13.5] for different approaches.

#### Iv-A3 Unit Commitment

Here, the scope of DC OPF is broadened to incorporate the scheduling of generators across multiple time periods, leading to the so-termed unit commitment (UC) problem. It is postulated that the scheduling horizon consists of periods labeled as (e.g., a day consisting of 24 1-hour periods). Let be the output of the -th generator at period , and the respective demand. The generation cost is allowed to be time-varying, and is denoted by . A binary variable per generator and period is introduced, so that if generator is on at , and otherwise. Moreover, the th bus angle at is denoted by .

Consideration of multiple time periods allows inclusion of practical generator constraints into the scheduling problem. These are the ramp-up/down and minimum up/down time constraints. The former indicate that the difference in power generation between two successive periods is bounded. The latter mean that if a unit is turned on, it must stay on for a minimum number of hours; similarly, if it is turned off, it cannot be turned back on before a number of periods. The UC problem is formulated as follows.

(17a) | |||

subj. to | |||

(17b) | |||

(17c) | |||

(17d) | |||

(17e) | |||

(17f) | |||

(17g) | |||

(17h) |

The term in the cost (17a) captures generator start-up or shut-down costs. Such costs are generally dependent on the previous on/off activity. For instance, the more time a generator has been off, the more expensive it may be to bring it on again. The initial condition is known. It is also assumed that . The balance equation is given next by (17b). Generation limits are captured by (17c). Constraint (17d) represents the ramp-up/down limits, where the bounds and and the initial condition are given. Constraint (17e) means that if generator is turned on at period , it must remain on for the next periods; and similarly for the minimum down time constraint in (17f), where both and are given [75]. The line flow constraints are given by (17g), while the binary feasible set for the scheduling variables is shown in (17h).

It is clear that problem (17) is a mixed integer program. What makes it particularly hard to solve is the coupling across the binary variables expressed by (17e) and (17f). Note that the DC OPF in (15) is a special case of the UC (17) with the on/off scheduling fixed and the time horizon limited to a single period. It is noted in passing that a multi-period version of the DC OPF can also be considered, by adding the ramp constraints to (15) while keeping the on/off scheduling fixed in (17), therefore obtaining a convex program. Most importantly, note that the UC dimension can be brought into the remaining two problems described here, that is, the ED and the AC OPF. In the latter, the problem has two mathematical reasons for being hard, namely, the integer variables and the nonconvexity due to the AC load flow. The problems discussed here are illustrated in Fig. 7.

A traditional approach to solving the UC is to apply Langrangian relaxation with respect to the balance equations [84, Ch. 5], [5], [75]. The dual problem can be solved by a non-differentiable optimization method (e.g., a subgradient or bundle method), while the Lagrangian minimization step is solved via dynamic programming. An interesting result within the Lagrangian duality framework is that the duality gap of the UC problem without a transmission network diminishes as the number of generators increases [5]. One of the state-of-the-art methods for UC is Benders decomposition, which decomposes the problem into a master problem and tractable subproblems [70, Ch. 8].

### Iv-B Demand Response

Demand response (DR) or load response is the adaptation of end-user power consumption to time-varying (or time-based) energy pricing, which is judiciously controlled by the utility companies to elicit desirable energy usage [24], [29]. The smart grid vision entails engaging residential end-users in DR programs. Residential loads have the potential to offer considerable gains in terms of flexible load response, because their consumption can be adjusted—e.g., an air conditioning unit (A/C)—or deferred for later or shifted to an earlier time. Examples of flexible loads include pool pumps or plug-in (hybrid) electric vehicles. The advent of smart grid technologies have also made available at the residential level energy storage devices (batteries), which can be charged and discharged according to residential needs, and thus constitute an additional device for control.

Widespread adoption of DR programs can bring significant benefits to the future grid. First, the peak demand is reduced as a result of the load shifting capability, which can have major economical benefits. Without DR, the peak demand must be satisfied by generation units such as gas turbines that can turn on and be brought in very fast during those peaks. Such units are very costly to operate, and markedly increase the electricity wholesale prices. This can be explained in a simple manner by recalling the ED problem and specifically (14). Considering a gas turbine that is brought in and does not operate at its limits, (14) implies that . Expensive units have exactly very high derivative , that is, increasing their power output requires a lot of fuel.

A second benefit of DR is that it has the potential to reduce the end-user bills. This is due to the time-based pricing schemes, which encourage consumption during reduced-price hours, but also because the wholesale prices become less volatile as explained earlier, which means that the electricity retailers can procure cheaper sources. A third benefit is that DR can strengthen the adoption of renewable energy. The reason is that the random and intermittent nature of renewable energy can be compensated by the ability of the load to follow such effects. More light into the latter concept will be shed in Sec. IV-D.

DR is facilitated by deployment of the advanced metering infrastructure (AMI), which comprises a two-way communication network between utility companies and end-users (see Fig. 8) [24, 29]. Smart meters installed at end-users’ premises are the AMI terminals at the end-users’ side. These measure not just the total power consumption, but also the power consumption profile throughout the day, and report it to the utility company at regular time intervals—e.g., every ten minutes or every hour. The utility company sends pricing signals to the smart meters through the AMI, for the smart meters to adjust the power consumption profile of the various residential electric devices, in order to minimize the electricity bill and maximize the end-user satisfaction. Energy consumption is thus scheduled through the smart meter. The communication network at the customer’s premises between the smart meter and the smart appliances’ controllers is part of the so-called home area network (HAN).

Time-varying pricing has been a classical research topic [10]. The innovation DR brings is that the end-users’ power consumption becomes controllable, and therefore, part of the system optimization. Novel formulations addressing the various research issues are therefore called for. DR-related research issues can be classified in two groups. The first group deals with joint optimization of DR for a set of end-users, which will be termed hereafter multi-user DR. The second group focuses on optimal algorithm design for a single smart meter with the aim of minimizing the electricity bill and the user discomfort in response to real-time pricing signals. Each approach has unique characteristics, as explained next.

Multi-user DR sets a system-wide performance objective accounting for the cost of the energy provider and the user satisfaction. Joint scheduling must be performed in a distributed fashion, and much of the effort is to come up with pricing schemes that achieve this goal. Privacy of the customers must be protected, in the sense that they do not reveal their individual power consumption preferences to the utility, but the desired power consumption profile is elicited by the pricing signals. One of the chief advantages of joint DR scheduling for multiple users is that the peak power consumption is reduced as compared to a baseline non-DR approach. The reason is that joint scheduling opens up the possibility of loads being arranged across time so that valleys are filled and peaks are shaved.

On the other hand, energy consumption scheduling formulations for a single user can model in great detail the various smart appliance characteristics, often leading to difficult nonconvex optimization problems. This is in contrast with the vast majority of multi-user algorithms, which tend to adopt a more abstract and less refined description of the end-users’ scheduling capabilities. More details on the two groups of problems are given next.

#### Iv-B1 Multi-user DR

Consider residential end-users, connected to a single load-serving entity (LSE), as illustrated in Fig 9. The LSE can be an electricity retailer or an aggregator, whose role is to coordinate the users’ consumption and present it as a larger flexible load to the main grid. The time horizon consists of periods, which can be a bunch of 1-hour or ten-minute intervals. User has a set of smart appliances . Let be the power consumption of appliance of user at time period (typically in kWh), and a vector collecting the corresponding power consumptions across slots.

The LSE incurs cost for providing energy to the users. This cost is essentially the cost of energy procurement from the wholesale market or through direct contracts with energy generation units, and may also include other operation and maintenance costs. Each user also adopts a utility function , which represents user willingness to consume power.

The prototype multi-user DR problem takes the following form

(18a) | ||||

subj. to | (18b) | |||

(18c) | ||||

(18d) |

Clearly, the objective is optimizing the system’s social welfare. Constraint (18b) amounts to a balance equation for each period. Moreover, the set in (18c) represents the scheduling constraints for every appliance, while constraint (18d) bounds the power provided by the LSE.

Problem (18) is convex as long as is convex, is concave, and sets are convex. This is typically the case, and different works in the literature address DR using versions of the previous formulation [11, 56, 68, 23]. Various examples of appliance models—including batteries—together with their utility functions and constraint sets can also be found in the aforementioned works.

Problem (18) as described so far amounts to energy consumption scheduling. Another instance of DR that can be described by the previous formulation is load curtailment. In this context, there is an energy deficit in the main grid for a particular time period, and the LSE must regulate the power consumption to cover for this deficit. The situation can be captured in (18) by setting (single time period), and the power deficit as . The cost does not affect the optimization, while the negative of represents the discomfort of the end-user due to the power curtailment, so the total discomfort is minimized. This problem is addressed in [62, 39] and the references therein.

One of the main research objectives regarding (18) is to solve the scheduling problem in a distributed fashion, without having the functions and sets communicated to the LSE in order to respect customer privacy. Algorithmic approaches typically entail message exchanges between the LSE and the users or among the users, and lead to different pricing interpretations and models. Specific approaches include gradient projection [11]; block coordinate descent [56]; dual decomposition and subgradient method [23], [62]; the Vickrey-Clarke-Groves (VCG) mechanism [68]; Lagrangian relaxation and Newton method [62]; and dual decomposition with the bisection and Illinois methods [39].

Formulation (18) refers to ahead-of-time scheduling. Real-time scheduling is also important. A real-time load response approach operating on a second-to-second scale is developed in [41] and references thereof. The aim is to have the aggregate power consumption of a set of thermostatically controlled loads (TCLs), such as A/C units, follow a desired signal. Model predictive control is employed to this end. Moreover, in order to come up with a simple description of the state space model pertaining to the set of TCLs, system identification ideas are brought to bear.

#### Iv-B2 Single-user DR

The problems here focus on minimizing the total cost due to energy consumption or the peak instantaneous cost over a billing interval (or possibly a combination thereof). User comfort levels and preferences must also be taken into account.

Detailed modeling of appliance characteristics and scheduling capabilities typically introduces integer variables into the formulation, which is somewhat reminiscent of the unit commitment problem [cf. (17)]; see e.g., [64, 40, 74] and references therein. Solution approaches include standard mixed-integer programming techniques—e.g., branch-and-bound, Lagrangian relaxation, dynamic programming—as well as random search methods such as genetic algorithms and particle swarm optimization. An interesting result is that when the problem is formulated over a continuous time horizon and accounts for the fact that appliances can be turned on or off anytime within the horizon, then it has zero duality gap [22].

Real-time approaches have also been pursued. A linear programming DR model with robustness against price uncertainty and time-series-based price prediction from period to period is developed in [15]. Moreover, [60] focuses on TCLs, and specifically, on a building with multiple zones, with each zone having its own heater. The aim is to minimize the peak instantaneous cost due to the power consumption of all heaters, while keeping each zone at a specified temperature interval. The problem is tackled through a decomposition into a master mixed-integer program and per zone heater control subproblems.

### Iv-C Plug-in (Hybrid) Electric Vehicles

As an important component of the future smart grid vision, electric vehicles (EVs) including plug-in (hybrid) EVs (P(H)EVs) are receiving a lot of attention. A global driving factor behind the research and development efforts on EVs is the environmental concern of the greenhouse gases emitted by the conventional fossil fuel-based transportation. As the future grids accommodate the renewable energy resources in an increasing scale, the carbon footprint is expected to be markedly curbed by high EV penetration. Electric driving also bears strategic relevance in the context of growing international tension over key natural resources including crude oil. From the simple perspective of improving overall energy efficiency, electrification of transportation offers an excellent potential.

PEVs interact directly with the power grid through plug-in charging of built-in batteries. As such, judicious control and optimization of PEV charging pose paramount challenges and opportunities for the grid economy and efficiency. Since PEV charging constitutes an elastic energy load that can be time-shifted and warped, the benefits of DR are to be magnified when PEV charging is included in DR programs. In fact, as the scale of PEV adoption grows, it is clear that smart coordination of the charging task will become crucial to mitigate overloading of current distribution networks [13, 85, 18]. Without proper coordination, PEV charging can potentially create new peaks in the load curves with detrimental effects on generation cost. On the other hand, it is possible for the PEV aggregators that have control over a fleet of PEVs to provide ancillary services by modulating the charging rate in the vehicle-to-grid (V2G) concept [72]. This in turn allows the utilities to depend less on conventional generators with costly reserve capacities, and facilitates mitigation of the volatility of renewable energy resources integrated to the grid [38]. The aforementioned topics are discussed in more detail next.

#### Iv-C1 Coordination of PEV Charging

It is widely recognized that uncoordinated PEV charging can pose serious issues on the economy of power generation and the quality of power delivered through the distribution networks. PEVs are equipped with batteries with sizable capacities, and it is not difficult to imagine that most people would opt to start charging their vehicles immediately after their evening commute, which is the time of the day that already exhibits a significant peak in power demand [18]. Fortunately, the smart grid AMI reviewed in Sec. IV-B provides the groundwork for effective scheduling and control of PEV charging to meet the challenges and sustain mass adoption.

A variety of approaches have been proposed for PEV charging coordination. The power losses in the distribution network were minimized by optimizing the day-ahead charging rate schedules for given PEV charging demands in [13]. Real-time coordination was considered in [18], where the cost due to time-varying electricity price as well as the distribution losses were minimized by performing a simple sensitivity analysis of the cost and accommodating the charging priorities. Extending recent results on globally optimal solution of the OPF problem via its Lagrangian dual [46], the optimality of similar approaches for PEV coordination problems was investigated in [71].

Interestingly, PEV charging can be also pursued in a distributed fashion. Further, optimizing feeder losses of distribution networks, load factor, and load variance are oftentimes equivalent problems [73]. Leveraging the latter, minimization of load variance was investigated in [21]. Specifically, the optimal day-ahead charging profiles for vehicle over a -slot horizon, are obtained by solving

(19a) | ||||

subj. to | (19b) | |||

(19c) |

where is the given base demand, and specify the limits on charging rates, and represents the total energy expended for charging PEV to the desired state-of-charge (SoC). The formulation is referred to as “valley-filling” in [21], as it schedules PEV loads in the valleys of the base load curve.

An optimal solution to (19) can be obtained iteratively [21]. Supposing that the initial pricing signal , , and the initial charging profiles are identically zero for iteration , each PEV updates charging profiles via

(20a) | ||||

subj. to | (20b) |

A central entity such as the utility or a PEV aggregator then collects the profiles from all PEVs, and updates the pricing signal as

(21) |

The new pricing signals are then fed back to the PEVs and the procedure iterates until convergence. It is clear from (21) that the per-vehicle objective in (20a) corresponds to a first-order estimate of the overall objective in (19a), augmented with a proximal term. The overall procedure turns out to be a projected gradient search.

#### Iv-C2 Integration with Renewables and V2G

It is only when the wide adoption of PEVs is coupled with large-scale integration of renewable energy sources that the emission problem can be alleviated, as the conventional generation itself contributes heavily to the emission. However, renewable energy sources are by nature intermittent, and often hard to predict accurately. By allowing the PEV batteries or fuel cells to supply their stored power to the grid based on the V2G concept, it was observed in [38] that photovoltaic (PV) resources harnessed by the EVs could competitively provide peak power (since the PV power becomes highest few hours earlier than the daily load peak quite predictably), and large-scale wind power could be stabilized for providing base power, via intelligent control. For specific control strategies to accomplish such benefits, formulations that maximize the profit for providing ancillary services were considered in [72] and references therein.

#### Iv-C3 Charging Demand Prediction

An important prerequisite task to support optimal coordination of PEVs is modeling and prediction of the PEV charging demand. The probability distributions of the charging demand were characterized in [51] and references therein. Spatio-temporal PEV charging demand was analyzed for highway traffic scenarios using a fluid traffic model and a queuing model in [4]. However, there are many interesting issues remaining that deserve further research in this forecasting task.

### Iv-D Renewables

The theme of Sec. IV-A has been economic scheduling of generators, which consume non-renewable fuels. The subject of the present section is on including generation from renewable energy sources (RESs), with the two prime examples being wind and solar energy. RESs are random and intermittent, which makes them nondispatchable. That is, RESs are not only hard to predict, but their intermittency gives rise to high variability even within time periods as short as 10 minutes. Therefore, they cannot be readily treated as conventional generators, and be included in the formulations of Sec. IV-A. In this context, methods for integrating generation from RESs to the smart grid operations are outlined next.

#### Iv-D1 Forecast-Based Methods

To illustrate the forecast-based methods, recall the ED problem [cf. (13)], and suppose that there is also a wind power generator that can serve the load. The output of the wind power generator for the next time period is a random variable denoted by . It is assumed that a forecast is available, and that the wind power generator has no cost (as it does not consume fuel). Then, the balance constraint is replaced by [cf. (13b)]

(22) |

while the remainder of the ED problem remains the same. Since the load is actually forecasted (cf. Sec. III-D3), constraint (22) essentially treats the uncertain RES no different than a negative load.

In order for the forecast to be accurate, the time period of ED is recommended to be short, such as 10 minutes. Building on this, a multi-period ED is advocated in [32], where the main feature is a model-predictive control approach with a moving horizon. Specifically, the ED over multiple periods and accompanying forecasts is solved for e.g., 6 ten-minute periods representing an hour. The generation is dispatched during the first period according to the obtained solution. Then, the horizon is moved, and a new multi-period ED with updated forecasts is solved, whose results are applied only to the next period, and so on. Such a method can accommodate the ramping constraints, and is computationally efficient.

#### Iv-D2 Chance-Constrained Methods

To account for the random nature of RES in ED, the probability distribution of comes handy. Specifically, the constraint is now that the supply-demand balance holds with high probability , say 99%. Hence, (22) is substituted by the chance constraint

(23) |

Note that the equality of the balance equation has been replaced by an inequality in (23), because excess power from RESs can in principle be curtailed.

To solve the chance-constrained ED, the distribution of must be known. For wind power, this is derived from the wind speed distribution, and the speed-power output mapping of the generator [49]. The most typical speed distribution is Weibull, while the speed-power output mapping is nonlinear. Evidently, this approach poses formidable modeling and computing challenges when multiple RESs and their spatio-temporal correlation are considered. The probability that the load is not served (immediately obtained from the one in (23)) is often called loss of load probability. Related sophisticated methods which account for chance constraints are also described in [82]. An alternative approach not requiring the joint spatio-temporal wind distribution is presented in [92].

#### Iv-D3 Robust (Minmax) Optimization

This approach postulates that the power generation from all RESs across space and time belongs to a deterministic uncertainty set. The aim is to minimize the worst-case operational costs, while setting the dispatchable generation and other optimization variables to such levels so that the balance is satisfied for any possible RES output within the uncertainty set. The main attractive feature here is that no detailed probabilistic models are needed. Only the uncertainty set must be obtained, e.g., from historical data, or, meteorological factors.

A robust version of UC [cf. (17)] is presented next. Following the notation of Sec. IV-A, it is postulated that there are RESs with power output per bus and time period. Let , and denote the uncertainty set for . The optimization variables are set in two stages. The on/off variables are chosen during the first stage. The power generation variables and bus angles are set after the RES power output is realized—which constitutes the second stage. Therefore, the power outputs and bus angles are functions of the commitments as well as the RES power outputs, and are denoted as and . The robust two-stage UC problem takes the form

(24a) | |||

subj. to | |||

(24b) | |||