Learning Topology of the Power Distribution Grid with and without Missing Data
Distribution grids refer to the part of the power grid that delivers electricity from substations to the loads. Structurally a distribution grid is operated in one of several radial/tree-like topologies that are derived from an original loopy grid graph by opening switches on some lines. Due to limited presence of real-time switch monitoring devices, the operating structure needs to be estimated indirectly. This paper presents a new learning algorithm that uses only nodal voltage measurements to determine the operational radial structure. The algorithm is based on the key result stating that the correct operating structure is the optimal solution of the minimum-weight spanning tree problem over the original loopy graph where weights on all permissible edges/lines (open or closed) is the variance of nodal voltage difference at the edge ends. Compared to existing work, this spanning tree based approach has significantly lower complexity as it does not require information on line parameters. Further, a modified learning algorithm is developed for cases when the input voltage measurements are limited to only a subset of the total grid nodes. Performance of the algorithms (with and without missing data) is demonstrated by experiments on test cases.
Power Distribution Networks, Power Flows, Spanning Tree, Graphical Models, Load estimation, Voltage measurements, Missing data, Computational Complexity
Distribution grids constitute the low voltage segment of the power system delivering electricity from substations to end-users. Both structurally and operationally the distribution grids are distinct from the transmission (high voltage) portion of the power system. A typical distribution grid is operated as a collection of disjoint tree graphs, each growing from substations at the root to customers. However, the complete layout of the distribution system is loopy to allow multiple alternatives for the trees to energize operationally. Switching from one layout to another, implemented through switch on/off devices placed on many segments of the distribution grid , can take place rather often, in some cases few times an hour. (See Fig. 1 for the illustration.) More frequent reconfiguration of the distribution is also promoted by recent in-mass integration of smart meters, PMUs  and smart devices, such as deferrable loads and energy storage devices. Mixed operational responsibilities in monitoring and operations, as well as the growing role of the new smart devices and controls, make fast and reliable estimation of the operational configuration of the distribution grid an important practical task, complicated by the lack of real-time, line-based measurements. In such a scenario, to estimate the distribution grid operational topology one ought to rely only on nodal measurements of voltage and end-user consumption. Notice, that brute force (combinatorial) check of topologies for the nodal measurement consistency is prohibitively expensive with the complexity growing exponentially with the number of loops in the grid layout.
In this work we focus on beating the naive exponential complexity of the operational topology learning task by exploring power flow specific correlations between available nodal measurements. In particular, we develop a spanning tree algorithm that reconstructs the radial operational topology from the original loopy layout by using functions of nodal voltage magnitudes as edge weights. Computational complexity of this algorithm is order in the size of the loopy graph’s edge set. Moreover, the algorithm is generalized to the case when some nodes are hidden.
1.1 Prior Work
Several approaches in the past have been made to learn the topology of power grids under different operating conditions and available measurements.  uses a Markov random field model for bus phase angles to build a dependency graph to identify faults in the grids.  presents a topology identification algorithm for distribution grids that uses the signs of elements in the inverse covariance matrix of voltage measurements.  compares available time-series observations from smart meters with a database of permissible signatures to identify topology changes. This is similar to envelope comparison schemes used in parameter estimation [6, 7]. For available line flow measurements, topology estimation using maximum likelihood tests was analyzed in . In our own prior work [1, 9], we analyzed an iterative greedy structure learning algorithm using trends in second order moments of voltages.  also presented the first attempt at topology learning from incomplete voltage data where nodes with missing voltages are separated by greater than two hops. The aforementioned approaches are specific to power grid graphs and typically not linked to research in probabilistic Graphical Models (GM)  used to study statistics of images, languages, social networks, and communication schemes. Learning generic (loopy) structures from pair-wise correlations in a GM is a difficult task, normally based on the maximal likelihood  with regularization for sparsity  and greedy schemes utilizing conditional mutual information [12, 13]. However, the GM-based learning simplifies dramatically when used, following the famous Chow-Liu approach , to reconstruct the spanning tree maximizing edge-factorized mutual information.  generalizes this technique to learn tree structured GMs with latent variables (missing data) using information distances as edge weights.
1.2 Contribution of This Work
Following [1, 9], we consider linear lossless AC power flow models (also called, following [16, 17] Lin-Dist-Flow) and assume that fluctuations of consumption at the nodes are uncorrelated. In this setting, our main result states that reconstruction of the operating grid topology is equivalent to solving the minimum weight spanning tree problem defined over the loopy graph of the grid layout where edge weights are given by variances in voltage magnitude differences across the edges. We use this result to formulate the operating topology as a spanning tree reconstruction problem that needs only empirical voltage magnitude measurements as input. As spanning trees can be efficiently reconstructed, our learning algorithm has much lower average and worst-case computational complexity compared to existing techniques [4, 9]. While our algorithm does not require knowing line impedances, these can be used to estimate additionally statics of power consumption. Further, we extend the topology learning algorithm to the case with missing voltage data. The extension works provided nodes with missing data are separated by at least two hops from each other and covariances of nodal power consumption are available. Compared to our prior work  on learning with missing data, the spanning tree approach has lower complexity. It also allows extension to cases with lesser restrictions on missing data. Our algorithm shows some commonality with the GM based spanning-tree learning of . However the key difference is that our approach relies principally on the Kirchoff’s laws of physical network flows contrary to the measure of conditional independence utilized in [14, 15]. Thus, voltage magnitude based edge weights used in our work are not restricted to satisfy graph additivity unlike information distances in GM. Further, in the case with missing data, we use power flow relations between nodal voltages and injections that, to the best of our knowledge, do not have an analog in GM learning literature. We highlight the performance of our algorithm through experiments on test distribution grids for both cases, with or without missing data.
The rest of the manuscript is organized as follows. Section 2 introduces notations, nomenclature and power flow relations in the distribution grids. Section 3 describes important features of the nodal voltage magnitudes. This Section also contains the proof of our main – spanning tree learning/reconstruction – theorem. Algorithm reconstructing operational spanning tree in the case of complete visibility (voltage magnitudes are observed at all nodes) is discussed in Section 4. Modification of the algorithm which allows for some missing data (at the nodes separated by at least two hopes) is described in Section 5. This Section also contains a brief discussion of some other extensions/applications of our approach. Simulation results of our learning algorithm on a test radial network are presented in Section 6. Finally, Section 7 contains conclusions and discussion of future work.
2 Distribution Grid: Structure and Power Flows
Radial Structure: The original distribution grid is denoted by the graph , where is the set of buses/nodes of the graph and is the set of all undirected lines/edges (open or operational). We denote nodes by alphabets (, ,…) and the edge connecting nodes and by . The operational grid has a ‘radial’ structure as shown in Fig. 1. In general, the operational grid is a collection of disjoint trees, where each tree’s root node has degree one (connected by one edge) and represents a substation.
In this paper, we will mainly focus on grids where the operational structure consists of only one tree with nodes and operational edge set . Generalization to the case with multiple disjoint trees will be discussed along side major results.
Power Flow (PF) Models: Let denote the complex impedances of a line (). Here and are line resistance and reactance respectively. Kirchhoff’s laws express the complex valued power injection at a node in tree as
where the real valued scalars, , , and denote the voltage magnitude, voltage phase, active and reactive power injection respectively at node . and denote the nodal complex voltage and injection respectively. One node (substation/root node in our case) is considered as reference and the voltage magnitude and phase at every non-substation node are measured relative to the reference values. As the complex power injection at the reference bus is given by negation of the sum of injections at other buses, without a loss of generality the analysis can be limited to a reduced system, where one ignores reference substation bus voltages and power injections. Under realistic assumption that losses of both active and reactive power in lines of a distribution system are small, Eq. (1) can be linearized as follows.
Linear Coupled (LC) model [1, 9]: In this model, phase difference between neighboring nodes and magnitude deviations () from the reference voltage are assumed to be small. The PF Eqs. (1) are linearized jointly over both voltage magnitude and phase to give:
Here, and are the vectors of real power, reactive power, voltage magnitude deviation and phase angle respectively at the non-substation nodes of the reduced system. and denote the reduced weighted Laplacian matrices for where reciprocal of resistances and reactances are used respectively as edge weights. The reduced Laplacian matrices are of full rank and constructed by removing the row and column corresponding to the reference bus from the true Laplacian matrix.
 shows that the LC-PF model is equivalent to the LinDistFlow model [16, 17, 18], if deviations in voltage magnitude are assumed to be small and thus ignored. (Notice, that if line resistances are equated to zero, the LC-PF model reduces to the DC PF model  used for transmission grids.) We can express means and covariance matrices of voltage magnitude deviations and phase angles in terms of corresponding statistics of power injections using Eq. (2) as shown below. Other quantities can be similarly determined.
In the next Section, we derive key results for functions of nodal voltages in a radial distribution grid that will subsequently be used in the topology learning algorithm.
3 Properties of Voltage Magnitudes in Radial Grids
Consider grid tree with operational edge set . Let denote the set of edges in the unique path from node to the root node (reference bus) in tree . A node is termed as a descendant of node if includes some edge connected to node . We use to denote the set of descendants of . By definition, . If is an immediate descendant of (), we term as parent and as its child. These definitions are illustrated in Fig 2.
Due to the radial topology of , the inverse of the reduced weighted graph Laplacian matrix has the following structure (see Section in  for details).
Thus, the entry in is given by the sum of line resistances of edges that are included in the path to the root from either node as shown in Fig. 2. For nodes and its parent in tree (see Fig. 2), it follows from Eq. (5) that
Assumption : Power Injection at different nodes are not correlated, while active and reactive injections at the same node are positively correlated. Mathematically, non-substation nodes
Note that this is a valid assumption for many distribution grids due to independence between different nodal load fluctuations and alignment/correlations between same node’s active and reactive power usage.
Under Assumption , we state the following result without proof. (See  for details.)
[9, Theorem 1] If node is a descendant of node on tree then .
Next, we define the term , which is the variance of the difference in voltage magnitudes between nodes and .
The next result identifies trends in along the radial grid. Note that the first two cases in Lemma 1 are proven in . The additional final case is opposite of the first case and helps develop our new learning scheme presented later in this paper.
For three nodes in grid tree , holds for the following cases:
We give the proof for Case depicted in Fig. 3. In this case, , where is the set of edges in the unique path from node to the root node of . Further, the sets of descendants of and satisfy . From Fig. 3, it is clear that any node belongs to either , , or . When , using Eq. (5), we have,
For node , one derives
For , one derives
Finally for . Such inequalities also hold for matrix. Using the inequalities in Eqs. (10, 12,14) for and with Eq. (8) results in for Case . The proofs for the other cases ( and ) can be done in a similar way and they are thus skipped. ∎
Further, the following results hold for operational edges in .
It is worth mentioning that all three statements in Lemma 2 involve line impedances corresponding to edges and only. In the following sections, we use these results to design our topology learning algorithm.
4 Structure Learning with Full Observation
Our main result for topology learning using voltage magnitude measurements is formulated using Lemma 1.
Let the weight of each permissible edge of the original loopy graph be . Then operational edge set in radial grid forms the minimum weight spanning tree of the original graph.
From Lemma 1, it is clear that for each node , the minimum value of along any path in (towards or away from the root node) is attained at its immediate neighbor on that path, connected by edge . The minimum spanning tree for the original loopy graph with ’s as edge weights is thus given by the operational edges in the radial tree. ∎
Note that if node is taken as the substation/root node (), the weight of any edge is given by . As mentioned in Section 2, the substation has one child. In the spanning tree construction, the root is thus connected to the node with lowest variance of voltage magnitude. This is in agreement with Theorem 1.
Algorithm : The input consists of voltage magnitude readings for all non-substation buses in the system. An observer computes for all permissible edges (including those with the root node) and identifies edges in the minimum spanning tree as the set of operational edges . The root node is restricted to have a single edge. Note that Algorithm does not need any information on line parameters (resistances and reactances) or on statistics of active and reactive nodal power consumption. If impedances of lines in and phase angle measurements at all nodes are known, Eqs. (2), (3) and (4) can be used to estimate means and covariances of each node’s power injection.
Algorithm Complexity: Using Kruskal’s Algorithm [20, 21], the minimum spanning tree from edges can be computed in operations. This is a great improvement over previous iterative or matrix inversion based techniques which scaled as , where is the number of nodes in the grid. If is not known or corresponds to the complete graph, Algorithm ’s complexity is , i.e. it still compares favorably with the prior scheme.
Extension to Multiple Trees: If multiple trees exist in the grid, voltage magnitudes at nodes and belonging to disjoint trees will be independent. Thus, . This result can be used to separate nodes into disjoint groups before running Algorithm to generate the operational tree in each group.
In the next Section, we extend our spanning tree based algorithm to consider cases where information is missing at some fraction of nodes.
5 Structure Learning with Missing Data
In a realistic power grid, communication packet drops or random noise events may erase voltage magnitude measurements for node set in . Following , we consider arbitrary placement of unobserved nodes with the following restriction.
Assumption : Missing nodes are separated by greater than two hops in the grid tree .
Note that under assumption , an observable node cannot be connected to two or more unobserved nodes. (We plan to analyze extensions beyond Assumption in future work.) Additionally, we assume that the adversary estimates or has access to historical information for the values of and covariance matrices for all nodes and impedances of all possible lines in .
To reconstruct operational topology in the presence of missing data, we first construct the minimum weight observable spanning tree using as edge weights between observable nodes. We then analyze edges in tree and detect unobserved node locations. Consider the situation shown in Fig. 4 where information from the leaf node is missing. By Assumption , information from its parent () and grandparent () are observed in . Note that satisfies Statement in Lemma 2. If all other descendants of are known, statement of the Lemma can be used to identify the existence of unobserved node .
We now discuss the identification of a non-leaf node with missing information. Assume that information is missing at the node in Fig. 4. ’s parent and children node set comprise its one-hop neighborhood, and are observable under Assumption . Using Cases and in Lemma 1, and . Thus, descendants of are connected to the rest of through edges between its one-hop neighbors (set and ). The following theorem gives the edge configurations possible in for and nodes in .
Let . No edge between children nodes exists in . All nodes in set are connected to node , while all nodes in are connected to .
Theorem 3 does not specify if edge exists in . In fact node will be connected to a node instead of if holds. There are thus two permissible configurations and (see Figs. 4, 4) in for connections between one hop neighbors of non-leaf unobservable node . Note that one of sets or may be empty as well.
Any two nodes in are children of node and thus satisfy Statement in Lemma 2. Observe that for both configurations and , this result holds for and any of its children in that belong to . The result also holds for and its parent in configuration . On the other hand, any node in and are actually separated by node and thus it satisfies Statement in Lemma 2. This result thus holds for node and any of its children from . Statements and in Lemma 2 can hence be used to identify unobservable node in Algorithm .
Algorithm : Assume that information is missing at the set , thus leaving only observable. Covariance matrices for power injection at all nodes of the observed set are assumed known to the observer along with impedances of all lines in . Algorithm , first, constructs spanning tree for observed nodes using edge weights for all node combinations given by . Observed nodes in are then arranged in reverse topological order (decreasing depth from root node). This is done as unobserved node locations are iteratively searched from leaf sites inward towards the root (see Step 5). For each leaf with parent , Steps 7 to 12 checks if edge with or without some unobserved leaf node connected to . For undecided nodes in , the Algorithm first checks for configuration or described in the preceding discussion. Step 15 determines if nodes in and are separated by a unobserved node using Statement in Lemma 2. If such a node doesn’t exist, Step 18 search for a unobserved node that is parent of both nodes in and node using using Statement in Lemma 2. Nodes and set are removed from the observed tree in each iteration and discovered edges are added to . Further, injection covariances at the recently identified descendants are added for use in later checks involving results from Lemma 2. Note that only in the final case (Step 18), the unobserved node is not removed from set as its parent node has not been determined yet. This process is iterated by picking a new node with all children as leaf nodes until no nodes with missing information remain to be discovered.
Complexity: Computing the spanning tree for observed nodes has complexity . Sorting observed nodes in topological order is done in linear time () . Finally, checking (Steps 5, 7, 12, 15) for all iterations has complexity as total observed nodes and edges number and searching over unobserved nodes takes at most steps. The overall complexity of Algorithm is thus which is in the worst case. Note that this is also the worst-case complexity of Algorithm .
Relation to Learning Probabilistic Graphical Model: It is worth noting that in the tree-structured GM learning , edge always exists due to the graph-additivity of edge weights and configuration in Fig. 4 is not realized. The inequality in Eq. (15) of Lemma 2 shows that may be strictly increasing with the number of graph hops and thus it does not satisfy graph additivity in general. Non-additivity of edge weights makes our topology learning approach a generalization of the additive model in  .
Extensions: We briefly mention two extensions of Algorithm , planning to analyze these in details in the future. First, Algorithm can be used for structure learning when injection covariances at unobserved nodes are not known. Here each unobserved node must have at least two children for unique identification. Second, Algorithm will be extended to operate when unobserved nodes are separated by hops. In this case, permissible configurations in addition to and (see Fig. 4) need to be checked. A modification of Statement in Lemma 2 will be used to detect unobserved nodes. In the following Section, we discuss the performance of our designed algorithms through experiments on test networks.
Here we demonstrate performance of Algorithm in determining the operational edge set of the radial grid . We consider a radial network [22, 23] with load nodes and one substation as shown in Fig. 5. In each of our simulation runs, we first collect complex power injection samples at the non-substation nodes from a multivariate Gaussian distribution that is uncorrelated between different nodes as per Assumption . We use LC-PF model to generate nodal voltage magnitude measurements. Finally, we introduce additional edges (at random) forming the loopy edge set . The additional edges are given random impedances comparable to those of operational lines. We, first, test performance of the Algorithm for the case where locations of edges in the set and voltage magnitude measurements at all non-substation nodes are available. We show results for topology learning for this case in Fig. 6. Note that the estimation is extremely accurate and average errors expressed relative to the size of the operational edge set) decay to zero at the sample sizes less than . We also estimate covariance matrices of complex nodal power injections using the just reconstructed radial operating topology and plot results in Fig. 6. For covariance estimation, line impedances of the set and samples of phase angle measurements are used along with voltage magnitude samples as input. The relative errors in this case decay exponentially with increase in the number of the measurement samples.
Next, we present simulations for Algorithm where the operational grid structure is reconstructed in the presence of unobserved nodes. We consider three cases with information at the nodes , and missing. The location of the unobserved nodes are selected at random in accordance with Assumption . Voltage magnitudes at the unobserved nodes are removed from the input data. Covariance of power injections at all the load nodes and impedances of all the lines within the loopy edge set are provided as input to the observer. The average number of errors shown in Fig. LABEL:fig:plotmissing decreases steadily with increase in the number of samples. This tendency is seen clearly for all the cases of the unobserved node sets. Further, the average errors increase with increase in the number of unobserved nodes for a fixed number of measurement samples. The average errors produced by Algorithm are significantly lower in comparison with the respective algorithm from , however (and as expected) the Algorithm is significantly less accurately than Algorithm where all nodes are observed.
Identifying the operational edges in the distribution grids is critical for real-time control and reliable management of different grid operations. In this paper, we study the problem of learning the radial operating structure from a dense loopy grid graph. Under an LC (linear coupled) power flow model, we show that if edge weights between load nodes are defined as the variance of the difference of their voltage magnitudes, the minimum weight spanning tree optimization over the loopy physical layout outputs operational radial structure. Using this spanning tree property, we design a fast structure learning algorithm that uses only nodal voltage magnitude measurements for the input. We then extend the spanning tree based framework to learn the operational structure when available voltage measurements are limited to a subset of the grid nodes. For unobserved nodes separated by greater than three hops, the learning algorithm is able to identify locations of the missing measurements by verifying properties of our voltage magnitude based edge weights. In this case, statistics of nodal injections and line impedances are used as a part of the input. We demonstrate good performance of the learning algorithm through experiments on distribution grid test cases. Finally, we discuss how voltage magnitude based edge weights in our algorithm generalizes edge metrics used in learning schemes of probabilistic GMs. In future we plan to generalize our approach reducing restrictions, e.g. allowing unobserved nodes to be separated by less than two hops and utilizing less information about nodal consumption.
- D. Deka, S. Backhaus, and M. Chertkov, “Structure learning and statistical estimation in distribution networks - part i,” arXiv preprint arXiv:1501.04131, 2015.
- A. Phadke, “Synchronized phasor measurements in power systems,” Computer, 1993.
- M. He and J. Zhang, “A dependency graph approach for fault detection and localization towards secure smart grid,” Smart Grid, IEEE Transactions on, vol. 2, no. 2, pp. 342–351, 2011.
- S. Bolognani, N. Bof, D. Michelotti, R. Muraro, and L. Schenato, “Identification of power distribution network topology via voltage correlation analysis,” in Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on. IEEE, 2013, pp. 1659–1664.
- G. Cavraro, R. Arghandeh, A. von Meier, and K. Poolla, “Data-driven approach for distribution network topology detection,” arXiv preprint arXiv:1504.00724, 2015.
- J. Peppanen, J. Grimaldo, M. J. Reno, S. Grijalva, and R. G. Harley, “Increasing distribution system model accuracy with extensive deployment of smart meters,” in PES General Meeting— Conference & Exposition, 2014 IEEE. IEEE, 2014, pp. 1–5.
- J. Peppanen, M. J. Reno, M. Thakkar, S. Grijalva, and R. G. Harley, “Leveraging ami data for distribution system model calibration and situational awareness,” 2015.
- R. Sevlian and R. Rajagopal, “Feeder topology identification,” arXiv preprint arXiv:1503.07224, 2015.
- D. Deka, S. Backhaus, and M. Chertkov, “Structure learning and statistical estimation in distribution networks - part ii,” arXiv preprint arXiv:1502.07820, 2015.
- M. J. Wainwright and M. I. Jordan, “Graphical models, exponential families, and variational inference,” Foundations and Trends® in Machine Learning, vol. 1, no. 1-2, pp. 1–305, 2008.
- P. Ravikumar, M. J. Wainwright, J. D. Lafferty et al., “High-dimensional ising model selection using â1-regularized logistic regression,” The Annals of Statistics, vol. 38, no. 3, pp. 1287–1319, 2010.
- A. Anandkumar, V. Tan, and A. S. Willsky, “High-dimensional graphical model selection: tractable graph families and necessary conditions,” in Advances in Neural Information Processing Systems, 2011, pp. 1863–1871.
- P. Netrapalli, S. Banerjee, S. Sanghavi, and S. Shakkottai, “Greedy learning of markov network structure,” in Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on. IEEE, 2010, pp. 1295–1302.
- C. Chow and C. Liu, “Approximating discrete probability distributions with dependence trees,” Information Theory, IEEE Transactions on, vol. 14, no. 3, pp. 462–467, 1968.
- M. J. Choi, V. Y. Tan, A. Anandkumar, and A. S. Willsky, “Learning latent tree graphical models,” The Journal of Machine Learning Research, vol. 12, pp. 1771–1812, 2011.
- M. Baran and F. Wu, “Optimal sizing of capacitors placed on a radial distribution system,” Power Delivery, IEEE Transactions on, vol. 4, no. 1, pp. 735–743, Jan 1989.
- ——, “Optimal capacitor placement on radial distribution systems,” Power Delivery, IEEE Transactions on, vol. 4, no. 1, pp. 725–734, Jan 1989.
- ——, “Network reconfiguration in distribution systems for loss reduction and load balancing,” Power Delivery, IEEE Transactions on, vol. 4, no. 2, pp. 1401–1407, Apr 1989.
- A. Abur and A. G. Exposito, Power system state estimation: theory and implementation. CRC Press, 2004.
- J. B. Kruskal, “On the shortest spanning subtree of a graph and the traveling salesman problem,” Proceedings of the American Mathematical society, vol. 7, no. 1, pp. 48–50, 1956.
- T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms. The MIT Press, 2001.
- U. Eminoglu and M. H. Hocaoglu, “A new power flow method for radial distribution systems including voltage dependent load models,” Electric Power Systems Research, vol. 76, no. 1â3, pp. 106 – 114, 2005.
- [Online]. Available: http://www.dejazzer.com/reds.html