Exact Topology and Parameter Estimation in Distribution Grids with Minimal Observability
Abstract
Limited presence of nodal and line meters in distribution grids hinders their optimal operation and participation in real-time markets. In particular, the lack of real-time information on the grid topology and infrequently calibrated line parameters (impedances) adversely affect the accuracy of any operational power flow control. This paper suggests a novel algorithm for learning the topology of a distribution grid and estimating the impedances of the operational lines with minimal observational requirements: it provably reconstructs topology and impedances using voltage and injection measured only at the terminal (end-user) nodes of the distribution grid. All other (intermediate) nodes in the network may be unobserved/hidden. Furthermore, no additional input (e.g., number of grid nodes, historical information on injections at hidden nodes) is needed for the learning to succeed. Performance of the algorithm is illustrated in numerical experiments on the IEEE and custom power distribution models.
This work is supported by the US DOE/OE’s Grid Modernization Laboratory Consortium (GMLC) program.
I Introduction
Distribution grids enable the supply of power to end-users/loads. With the advent of smart grids, new resources like controllable loads, small-scale/household renewable generators (e.g., solar panels) and storage units (e.g., batteries, electric vehicles) have been introduced in the distribution grid. This paradigm-shifting change is turning traditionally passive and largely static distribution grids, which acted solely as power sinks, into dynamic, reconfigurable and active resources for novel controls. Emerging smart grid control technologies, relying on these dynamic and reconfigurable features, include participation in demand response, use of batteries in frequency regulation and inter-household energy settlements/transactions. These new technologies also require much more accurate estimation of the grid characteristics. The radial topology of the current operational lines and their impedances are the most significant of these frequently changing characteristics of smart distribution grids. However, reliable and real-time estimation of the distribution grid topology and line impedances is impeded by limited observability. Even though Phasor Measurement Unit (PMU) technology has become available recently, it is still largely limited to power transmission systems [1]. It is not obvious whether a widespread (full coverage) use of PMU technology will ever be economically justified at the distribution level. Moreover, access to underground distribution grids in urban areas (e.g., New York City) is technically challenging. These constraints/limitations justify the importance of developing techniques capable of estimating operational topology and line impedances under infrequent calibration and sparse access.
In spite of the access limitations, PMUs, micro-PMUs [2], and FNETs [3] have been placed into many distribution grids. In addition, and most importantly for the setting discussed in this manuscript, smart end-user measurement units, such as the ones associated with smart household devices and EVs, have been installed. These modern end-user devices have the ability to record and communicate nodal voltages and injections. Inspired by these novel capabilities, we analyze in this manuscript the joint problem of topology and impedance estimation using measurements collected from smart meters located only at the end/terminal nodes of smart distribution grids.
I-A Prior Work
Topology estimation in the power grid is an active area of research. Researchers have proposed different methods depending on the availability and type of measurements. [4] uses cycle basis and maximum likelihood tests to reconstruct topology from line measurements. For measurements of nodal voltages collected at all nodes, [5, 6] use the signs of the inverse covariance of complex voltages to identify the operational lines. In a similar regime, [7, 8] present greedy schemes based on trends in second moments of voltage magnitudes to identify grid topology. [9, 10] utilize statistical independence tests, in the setting of a graphical model associated with nodal voltages, to reconstruct topology. A number of data-driven and model-free schemes, using signature- and regression-based methods, were developed to reconstruct topology and line parameters [11, 12, 13]. In the work [14], most closely related to this manuscript, an iterative scheme to estimate topology and line impedances from voltage measurements is proposed.
A common feature of the aforementioned papers is that they all rely on the availability of nodal measurements (voltage and/or injection) at all nodes of the grid. Reconstruction under limited voltage observability, as discussed in [7, 8, 15], requires full information about all line impedances in the network. However, these assumptions can easily be broken due to the non-ubiquitous presence of nodal meters and the lack of historical and state-estimation information.
In this manuscript, we overcome these impediments and suggest an efficient algorithm which jointly estimates the operational topology and impedances of operationally radial distribution grids in the setting where voltage and injection/consumption measurements at all but the end-user nodes are missing/hidden.
I-B Technical Contribution
We develop a provable method/algorithm for topology estimation in radial distribution grids from samples of nodal voltage and injection measurements collected from smart meters at the end-user locations. The remaining nodes (including all intermediate nodes) in the grid may be unobserved, i.e., their states are unmeasured, and even their existence and number may be unknown. This represents a realistic scenario where the majority of measurements are limited to residential devices installed at the end-user locations. Our reconstruction is model-based. We consider a linearized power flow model [16, 17, 18, 7]. Under this model, we develop an algorithm which can provably learn the impedance distance, i.e., the sum of line impedances along the unique operational path connecting any two observed nodes in the radial grid. Once the impedance distance is reconstructed, we utilize the recursive grouping algorithm [19] to learn topology and impedances, including impedances between missing nodes. Our algorithm is characterized by computational complexity and sample complexity. We demonstrate the performance of the algorithm on some useful toy examples and then present results of experiments based on AC power flows on realistic IEEE test cases. To the best of our knowledge, this is the first work which provides guaranteed topology and impedance reconstruction in a distribution grid where only terminal nodes are observed.
The rest of the manuscript is organized as follows. Section II introduces nomenclature and power flow relations in the distribution grid. Our main algorithm is described in Section III. Analysis of the computational/sample complexity of the algorithm is also presented in Section III. Numerical experiments on IEEE test cases are presented in Section IV. Finally, Section V is reserved for conclusions and discussion of future work.
II Distribution Grid and Power Flow Model
Radial Structure: The distribution grid is defined over a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where the set of buses/nodes is denoted by $\mathcal{V}$ and the set of undirected operational lines/edges is denoted by $\mathcal{E}$. The operational grid is assumed to have a 'radial' operational structure, that is, it forms a forest consisting of disjoint trees with roots corresponding to substations. Fig. 1 shows one tree of a radial grid, where the red node represents the substation, blue nodes represent leaf nodes and dotted nodes represent internal nodes. Notationally, we use letters $a, b, \ldots$ to represent buses/nodes, and $(ab)$ stands for the line/edge between nodes $a$ and $b$. We denote the root node (reference bus) by $\rho$. We denote by $\mathcal{P}_a^b$ the unique path from node $a$ to node $b$ within the same (operational) tree.
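Power Flow Models: Given a radial distribution grid on a tree-structured graph $\mathcal{G}$, the grid satisfies Kirchhoff's laws of power flow, which express the complex power injection at a node $a$ via node voltages and line impedances as follows:
$$P_a = p_a + \hat{i}\, q_a = \sum_{b:(ab)\in\mathcal{E}} \frac{v_a^2 - v_a v_b \exp(\hat{i}\theta_a - \hat{i}\theta_b)}{z_{ab}^*} \tag{1}$$
In this equation, $z_{ab}$ denotes the impedance of line $(ab)$, and $v_a$, $\theta_a$, $p_a$, $q_a$ denote the voltage magnitude, voltage phase, active and reactive power at node $a$, respectively; $\hat{i}$ is the imaginary unit and $z^*$ is the complex conjugate of $z$. The substation/root/reference nodes, maintained at unit/nominal voltage, are assumed known/fixed. Since (1) is nonconvex, we simplify the model by making the realistic assumption that the second-order terms in (1) are negligible. Under this assumption, we introduce the linearized lossless power flow equations describing the linear coupled power flow (LCPF) model [14, 7]:
$$p_a = \sum_{b:(ab)\in\mathcal{E}} \frac{r_{ab}(v_a - v_b) + x_{ab}(\theta_a - \theta_b)}{r_{ab}^2 + x_{ab}^2}, \qquad q_a = \sum_{b:(ab)\in\mathcal{E}} \frac{x_{ab}(v_a - v_b) - r_{ab}(\theta_a - \theta_b)}{r_{ab}^2 + x_{ab}^2} \tag{2}$$
where $r_{ab}$ and $x_{ab}$ are the resistance and reactance of line $(ab)$, respectively, i.e., $z_{ab} = r_{ab} + \hat{i}\, x_{ab}$. By considering only deviations from the respective steady-state reference values, $v_a, \theta_a, p_a, q_a$ are modeled as random variables with zero mean (counted from the known reference values).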
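Linear Coupled Power Flow Models: Note that, under the assumption that voltage deviations from nominal and power injection deviations from the base case are both small, the LCPF model (2) is equivalent to the LinDistFlow equations introduced in [20, 21, 22]. The LCPF model (2) can also be stated in the following matrix form [7]:
$$v = H^{-1}_{1/r}\, p + H^{-1}_{1/x}\, q, \qquad \theta = H^{-1}_{1/x}\, p - H^{-1}_{1/r}\, q \tag{3}$$
where $v, \theta, p, q$ are, respectively, the vectors of voltage magnitude, voltage phase, active and reactive power at the non-substation buses of the grid.
$H_{1/r}$ and $H_{1/x}$ represent the reduced weighted Laplacian matrices of $\mathcal{G}$ (the weighted graph Laplacian with the rows and columns of the substation buses removed), where $1/r_{ab}$ and $1/x_{ab}$ are used as edge weights, respectively.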
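The matrix form can be checked numerically on a toy feeder. The following is a minimal sketch, not the authors' code: the 6-bus tree, its impedance values and the helper `reduced_laplacian` are hypothetical choices made for illustration. It computes voltage magnitudes and phases from random injections through the inverse reduced Laplacians with edge weights 1/r and 1/x, and then verifies that the per-line LCPF relations (2) return the same injections.

```python
import numpy as np

# Hypothetical 6-bus radial feeder; bus 0 is the substation (reference bus).
edges = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]
r = np.array([0.10, 0.20, 0.15, 0.05, 0.25])  # line resistances (illustrative)
x = np.array([0.05, 0.10, 0.30, 0.20, 0.15])  # line reactances (illustrative)
n_bus = 6

def reduced_laplacian(weights):
    """Weighted graph Laplacian with the substation row and column removed."""
    lap = np.zeros((n_bus, n_bus))
    for (a, b), w in zip(edges, weights):
        lap[a, a] += w
        lap[b, b] += w
        lap[a, b] -= w
        lap[b, a] -= w
    return lap[1:, 1:]

H_r_inv = np.linalg.inv(reduced_laplacian(1.0 / r))
H_x_inv = np.linalg.inv(reduced_laplacian(1.0 / x))

# Zero-mean injection deviations at the non-substation buses.
rng = np.random.default_rng(0)
p = rng.normal(size=n_bus - 1)
q = rng.normal(size=n_bus - 1)

# Matrix form of the LCPF model.
v = H_r_inv @ p + H_x_inv @ q
theta = H_x_inv @ p - H_r_inv @ q

# Recompute injections line by line; the substation has zero deviation.
v_full = np.concatenate(([0.0], v))
th_full = np.concatenate(([0.0], theta))
p_check = np.zeros(n_bus)
q_check = np.zeros(n_bus)
for (a, b), r_ab, x_ab in zip(edges, r, x):
    z2 = r_ab**2 + x_ab**2
    dv, dth = v_full[a] - v_full[b], th_full[a] - th_full[b]
    flow_p = (r_ab * dv + x_ab * dth) / z2
    flow_q = (x_ab * dv - r_ab * dth) / z2
    p_check[a] += flow_p
    p_check[b] -= flow_p
    q_check[a] += flow_q
    q_check[b] -= flow_q
```

Because the linearized model is lossless, the substation entry `p_check[0]` balances the sum of all other injections.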
III Topology and Impedance Learning Algorithm
In this section, we introduce our main algorithm for learning topology and impedances. We assume that time-stamped observations of voltage magnitudes, voltage phases, and active and reactive injections at the end nodes are available to the observer. Our algorithm is built on the notion of an additive 'distance', defined as a distance over the graph which satisfies the weighted metric property: the distance between two nodes equals the sum of the edge distances along the unique path connecting them, $d(a,b) = \sum_{(cd)\in\mathcal{P}_a^b} d(c,d)$. We first estimate the distance, and then utilize the recursive grouping algorithm [19] to learn the operational topology of the grid. Before introducing our algorithm, let us make the following assumption about the missing intermediate nodes.
Assumption 1.
All missing intermediate nodes have a degree at least 3.
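We note that Assumption 1 is necessary to recover the true topology of the grid: a hidden node of degree 2 cannot be distinguished from a single line whose impedance is the sum of the impedances of its two incident lines. See Assumption 2 in [14] for the details. In addition, we assume that the complex power injections at different nodes are uncorrelated.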
Assumption 2.
For all nodes $a \neq b$: $\mathbb{E}[p_a p_b] = \mathbb{E}[p_a q_b] = \mathbb{E}[q_a q_b] = 0$.
As considered in prior studies [5, 7], Assumption 2 is well justified over sufficiently short time intervals when considering deviations of injections at end-users. Further, at intermediate nodes that only split power into downstream lines and have no major nodal usage, the net power injection is caused by leakage or device losses, and hence may be considered independent of the rest. Note also that Assumption 2 does not specify the class of distributions that can model an individual node's power injection. It is applicable when nodal injections are negative (loads), positive (due to local generation) or a mixture of both. In future work, we will relax this assumption and discuss learning in the presence of correlated end-user injection profiles that are only uncorrelated with injections at intermediate nodes.
Now, we recall the following key property of the inverse of a reduced weighted Laplacian matrix, which is necessary to define the grid-based distance metric [7]:
$$H^{-1}_{1/r}(a,b) = \sum_{(cd)\in\mathcal{P}_a^{\rho}\cap\mathcal{P}_b^{\rho}} r_{cd} \tag{4}$$
See Section 4 in [7] for details. In (4), $\mathcal{P}_a^{\rho}$ is the unique path from $a$ to $\rho$, $r_{cd}$ is the resistance of line $(cd)$ and $\rho$ is the root (reference bus) of the grid. One can also derive a similar formulation for line reactances and $H^{-1}_{1/x}$.
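Property (4) states that an entry of the inverse reduced Laplacian (edge weights equal to inverse resistances) is the total resistance of the lines shared by the two root-bound paths. A minimal numerical check on a hypothetical 6-bus tree is sketched below; the tree, the resistance values and the helper `path_to_root` are illustrative assumptions, not data from the manuscript.

```python
import numpy as np

# Hypothetical radial feeder stored as child -> parent; bus 0 is the root.
parent = {1: 0, 2: 1, 3: 1, 4: 3, 5: 3}
# Resistance of the line joining each bus to its parent (illustrative values).
r = {1: 0.10, 2: 0.20, 3: 0.15, 4: 0.05, 5: 0.25}

def path_to_root(a):
    """Lines on the path from bus a to the root, each labelled by its child end."""
    lines = set()
    while a != 0:
        lines.add(a)
        a = parent[a]
    return lines

buses = sorted(parent)                      # non-root buses
idx = {a: i for i, a in enumerate(buses)}

# Reduced weighted Laplacian with edge weights 1/r (root row/column dropped).
H = np.zeros((len(buses), len(buses)))
for child, par in parent.items():
    w = 1.0 / r[child]
    H[idx[child], idx[child]] += w
    if par != 0:
        H[idx[par], idx[par]] += w
        H[idx[child], idx[par]] -= w
        H[idx[par], idx[child]] -= w
H_inv = np.linalg.inv(H)

# Right-hand side of (4): resistance summed over the common part of both paths.
common = np.array([[sum(r[e] for e in path_to_root(a) & path_to_root(b))
                    for b in buses] for a in buses])
```

For instance, buses 4 and 5 share the lines (0,1) and (1,3) on their way to the root, so the corresponding entry equals 0.10 + 0.15.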
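Under Assumption 2 and using (3) in the case of observed nodes $a, b$, one derives the following identity
$$\mathbb{E}[v_a p_b] = H^{-1}_{1/r}(a,b)\,\mathbb{E}[p_b^2] + H^{-1}_{1/x}(a,b)\,\mathbb{E}[p_b q_b], \qquad \mathbb{E}[v_a q_b] = H^{-1}_{1/r}(a,b)\,\mathbb{E}[p_b q_b] + H^{-1}_{1/x}(a,b)\,\mathbb{E}[q_b^2] \tag{5}$$
where $\mathbb{E}[v_a p_b]$, $\mathbb{E}[v_a q_b]$, $\mathbb{E}[p_b^2]$, $\mathbb{E}[q_b^2]$ and $\mathbb{E}[p_b q_b]$ are quantities that can be computed from measurements at the observed nodes $a$ and $b$. Notice that the ability to estimate the expectations in (5) implies that one can also estimate the values of $H^{-1}_{1/r}(a,b)$ and $H^{-1}_{1/x}(a,b)$ for any observed $a, b$, by solving the $2\times 2$ linear system (5), unless $\mathbb{E}[p_b^2]\,\mathbb{E}[q_b^2] = \mathbb{E}[p_b q_b]^2$. To avoid such a pathological situation, we make the following assumption.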
Assumption 3.
There exists a constant $c > 0$ such that for all nodes $a$, $\mathbb{E}[p_a^2]\,\mathbb{E}[q_a^2] - \mathbb{E}[p_a q_a]^2 \ge c$.
Using the estimated $H^{-1}_{1/r}$, we can now estimate the resistance distance (effective resistance) between observed nodes $a, b$ as
$$d_r(a,b) = H^{-1}_{1/r}(a,a) + H^{-1}_{1/r}(b,b) - 2H^{-1}_{1/r}(a,b) = \sum_{(cd)\in\mathcal{P}_a^b} r_{cd} \tag{6}$$
Note that the effective resistance $d_r(a,b)$ is an additive distance metric between nodes $a$ and $b$ in the grid. Similarly, one can also estimate the additive reactance distance $d_x(a,b)$ between observed nodes $a, b$. Once we estimate $d_r(a,b)$ and $d_x(a,b)$ for all pairs of observed nodes, we can utilize the recursive grouping algorithm (RG) [19] which directly leads us, under Assumption 1, to consistent topology and impedance estimation of the power grid.
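The estimation step can be illustrated with a short simulation. The sketch below is an illustration under stated assumptions rather than the authors' implementation. It assumes the second-moment relations take the form E[v_a p_b] = H_r(a,b) E[p_b^2] + H_x(a,b) E[p_b q_b] and E[v_a q_b] = H_r(a,b) E[p_b q_b] + H_x(a,b) E[q_b^2], which follows from the matrix form (3) when injections at distinct nodes are uncorrelated. The 6-bus feeder, the Gaussian injections and the function names are hypothetical.

```python
import numpy as np

# Hypothetical 6-bus feeder; bus 0 is the substation, buses 1 and 3 are hidden.
edges = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]
r = np.array([0.10, 0.20, 0.15, 0.05, 0.25])
x = np.array([0.05, 0.10, 0.30, 0.20, 0.15])

def inv_reduced_laplacian(weights):
    lap = np.zeros((6, 6))
    for (a, b), w in zip(edges, weights):
        lap[a, a] += w
        lap[b, b] += w
        lap[a, b] -= w
        lap[b, a] -= w
    return np.linalg.inv(lap[1:, 1:])

H_r_inv = inv_reduced_laplacian(1.0 / r)
H_x_inv = inv_reduced_laplacian(1.0 / x)

# Samples: injections are uncorrelated across buses, but p and q at the
# same bus are correlated (correlation 0.3, an arbitrary choice).
rng = np.random.default_rng(1)
n_samples = 400_000
p = rng.normal(size=(n_samples, 5))
q = 0.3 * p + np.sqrt(1 - 0.3**2) * rng.normal(size=(n_samples, 5))
v = p @ H_r_inv + q @ H_x_inv     # LCPF voltages (the matrices are symmetric)

def estimate_entries(v_a, p_b, q_b):
    """Solve the 2x2 moment system for [H_r_inv(a,b), H_x_inv(a,b)]."""
    moments = np.array([[np.mean(p_b * p_b), np.mean(p_b * q_b)],
                        [np.mean(p_b * q_b), np.mean(q_b * q_b)]])
    rhs = np.array([np.mean(v_a * p_b), np.mean(v_a * q_b)])
    return np.linalg.solve(moments, rhs)

def estimate_distances(a, b):
    """Estimated [resistance distance, reactance distance] between buses a, b."""
    h = {}
    for s, t in [(a, a), (b, b), (a, b)]:
        h[s, t] = estimate_entries(v[:, s - 1], p[:, t - 1], q[:, t - 1])
    return h[a, a] + h[b, b] - 2 * h[a, b]

# Only measurements at the two observed leaves 2 and 4 are used here.
d_r_hat, d_x_hat = estimate_distances(2, 4)
# True path sums between buses 2 and 4: 0.20 + 0.15 + 0.05 and 0.10 + 0.30 + 0.20.
```

Note that the estimator never touches the samples of the hidden buses 1 and 3; they only enter through the simulated voltages.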
III-A Recursive Grouping Algorithm with Exact Distance
The recursive grouping algorithm (RG) recovers the true radial topology from any additive distance between observed nodes on the tree, provided that every leaf node is observed. We first consider the ideal setting in which the exact values of the distance are known for every pair of observed nodes. We need the following lemma [19].
Lemma 1.
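For distinct nodes $a, b, c$, define $\Phi_{abc} = d(a,c) - d(b,c)$. Then the following relations hold:
(i) $\Phi_{abc} = d(a,b)$ for all $c \neq a, b$ if and only if $a$ is a leaf node and $b$ is its parent.
(ii) $-d(a,b) < \Phi_{abc} = \Phi_{abc'} < d(a,b)$ for all $c, c' \neq a, b$ if and only if $a, b$ are leaf nodes with a common parent, i.e., they belong to the same group of siblings.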
Using Lemma 1 (i), one can identify the parent-child relationships within a set of observed nodes. In addition, Lemma 1 (ii) enables us to find the groups/sets of siblings among them.
Now we are ready to describe how RG works. The RG steps are illustrated in Figure 2. The input of RG is a set of observed nodes $\mathcal{O}$ and the additive distance $d(a,b)$ for all $a, b \in \mathcal{O}$. For example, in Figure 2(a), green nodes represent $\mathcal{O}$. First, RG finds groups of siblings and parents of nodes using Lemma 1, as illustrated in Figure 2(b). After recovering the parent-child and sibling relationships, it adds edges for all identified parent-child pairs. For siblings without an observed parent, RG adds a new node as their parent and adds edges between the newly added parent and its children. The procedure is illustrated in Figure 2(c). Once the node/edge update is done, RG computes the distances involving the newly added parents. For siblings $a, b$ and their newly added parent $h$, the distance $d(a,h)$ is calculated by
$$d(a,h) = \frac{1}{2}\left[d(a,b) + d(a,c) - d(b,c)\right] \tag{7}$$
for any $c \in \mathcal{O}\setminus\{a,b\}$. Also, for any other node $c$, RG computes $d(c,h)$ by
$$d(c,h) = d(a,c) - d(a,h) \tag{8}$$
Finally, RG updates the set $\mathcal{O}$ so that it contains the parents (including newly added ones) and the nodes with no established relations, whose parent-child and sibling relationships are determined at the next iteration of RG. For example, the green nodes in Figure 2(c) form the updated $\mathcal{O}$. After updating $\mathcal{O}$, RG repeats the whole procedure unless $|\mathcal{O}| \le 2$, in which case either an edge is added between the two remaining vertices or only a single vertex is left. Figures 2(d)-2(f) illustrate subsequent iterations of RG (following the first one). A formal description of RG is given in Algorithm 1.
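The iteration described above can be sketched in a few dozen lines for the exact-distance case. This is a minimal illustration written for this text, not Algorithm 1 itself: the function name, the hidden-node labels `h0, h1, ...` and the numerical tolerance `tol` are assumptions. It applies the two tests of Lemma 1 with Phi = d(a,c) - d(b,c), creates a new parent for each sibling group without an observed parent, computes the new distances as d(a,h) = [d(a,b) + d(a,c) - d(b,c)]/2 and d(c,h) = d(a,c) - d(a,h), and repeats until at most two nodes remain.

```python
import itertools

def recursive_grouping(nodes, dist, tol=1e-9):
    """Return tree edges (u, v, length) from exact additive distances.

    nodes: observed node labels; dist: {(a, b): distance} for every pair.
    New hidden nodes are labelled 'h0', 'h1', ... (labels are an assumption).
    """
    d = {}
    for (a, b), w in dist.items():
        d[a, b] = d[b, a] = w
    active, edges, n_hidden = list(nodes), [], 0

    def rest(a, b):
        return [c for c in active if c != a and c != b]

    def phis(a, b):
        return [d[a, c] - d[b, c] for c in rest(a, b)]

    def siblings(a, b):
        f = phis(a, b)
        return max(f) - min(f) <= tol and abs(f[0]) < d[a, b] - tol

    while len(active) > 2:
        # Leaf a with an already known parent b: Phi equals d(a, b) for all c.
        parent = {}
        for a, b in itertools.permutations(active, 2):
            if all(abs(f - d[a, b]) <= tol for f in phis(a, b)):
                parent[a] = b
        # Groups of leaves sharing a parent that is not yet in the node set.
        free = [a for a in active if a not in parent]
        groups, used = [], set()
        for a in free:
            if a in used:
                continue
            grp = [a] + [b for b in free
                         if b != a and b not in used and siblings(a, b)]
            if len(grp) > 1:
                groups.append(grp)
                used.update(grp)
        if not parent and not groups:
            raise ValueError("distances are not consistent with a tree")
        # Add a hidden parent per group and compute its distances.
        new_hidden = []
        for grp in groups:
            h = f"h{n_hidden}"
            n_hidden += 1
            a, b = grp[0], grp[1]
            c = rest(a, b)[0]
            d[a, h] = d[h, a] = 0.5 * (d[a, b] + d[a, c] - d[b, c])
            for k in active:
                if k != a:
                    d[k, h] = d[h, k] = d[a, k] - d[a, h]
            new_hidden.append((h, a))
            parent.update({k: h for k in grp})
        # Distance between two new parents, via a child of the second one.
        for (h, _), (l, child_l) in itertools.combinations(new_hidden, 2):
            d[h, l] = d[l, h] = d[child_l, h] - d[child_l, l]
        edges += [(k, par, d[k, par]) for k, par in parent.items()]
        active = ([k for k in active if k not in parent]
                  + [h for h, _ in new_hidden])

    if len(active) == 2:
        u, w = active
        edges.append((u, w, d[u, w]))
    return edges

# Toy feeder: leaves 0, 2, 4, 5 observed; two hidden internal buses.
# True lines: (0,h)=0.10, (2,h)=0.20, (h,h')=0.15, (4,h')=0.05, (5,h')=0.25.
leaf_dist = {(0, 2): 0.30, (0, 4): 0.30, (0, 5): 0.50,
             (2, 4): 0.40, (2, 5): 0.60, (4, 5): 0.30}
edges_found = recursive_grouping([0, 2, 4, 5], leaf_dist)

# Star with an observed centre 'p': only parent-child links are found.
star_dist = {("a", "p"): 1.0, ("b", "p"): 2.0, ("c", "p"): 3.0,
             ("a", "b"): 3.0, ("a", "c"): 4.0, ("b", "c"): 5.0}
star_edges = recursive_grouping(["a", "b", "c", "p"], star_dist)
```

In the first example the two hidden buses and all five line lengths are recovered from the six leaf-to-leaf distances alone.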
Overall, we propose the following two-stage algorithm for learning the topology and impedances of grids with missing nodes:
1. Estimate the resistance and reactance distances between all pairs of observed nodes using (5) and (6).
2. Recover missing nodes and lines using the recursive grouping algorithm.
The formal statement of the algorithm is presented in Algorithm 2.
III-B Recursive Grouping Algorithm with Samples
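In the practical scenario, we can only observe an approximate value $\hat{d}(a,b)$ calculated from samples rather than the exact value $d(a,b)$. Given a finite number of samples, the variance of the estimated distance is nonzero. To account for the variance, we allow some tolerance $\epsilon$ when finding parent-child and sibling relationships. In addition, we test the relationship of $a, b$ using only nodes which are close enough to both $a$ and $b$, i.e., nodes in $\mathcal{K}_{ab}$, where every $c \in \mathcal{K}_{ab}$ satisfies
$\max\{\hat{d}(a,c), \hat{d}(b,c)\} < \tau$ for some constant $\tau$. Let us now present the rules which determine the relationships of nodes using samples, with $\hat{\Phi}_{abc} = \hat{d}(a,c) - \hat{d}(b,c)$.
(i) Set $b$ as a parent of $a$ if $|\hat{d}(a,b) - \hat{\Phi}_{abc}| \le \epsilon$ for all $c \in \mathcal{K}_{ab}$.
(ii) Set $a, b$ as siblings if $\max_{c\in\mathcal{K}_{ab}} \hat{\Phi}_{abc} - \min_{c\in\mathcal{K}_{ab}} \hat{\Phi}_{abc} \le \epsilon$.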
One can observe that the newly introduced rules are equivalent to the RG rules with exact except for a tolerance . Update of the distance is done in a similar manner. For and its newly added parent , one sets
where denotes the children set of . Likewise, update of the distance for ,
III-C Sample and Computational Complexity
We observe that our algorithm terminates in steps where is the depth of the grid and follows from the calculation of . In addition, our algorithm requires (under some mild assumptions) only samples to correctly recover the operational topology and line impedances. The following theorem makes a formal statement on the sample complexity of RG.
Theorem 1.
Suppose that a radial graph/grid has a constant depth. Under Assumptions 1-3 and assuming the LCPF model, if line impedances are bounded from below by a nonzero value, nodal power injections are sub-Gaussian with a constantly bounded sub-Gaussian parameter and the number of samples is greater than for some constant , then Algorithm 2 recovers the true topology and impedances with probability .
The proof of the theorem is omitted as it is analogous to the proof of Theorem 11 in [19]. Note that Algorithm 2 has an extra term in sample complexity compared to Theorem 11 in [19]. This is because the voltage magnitudes and phases are expressed by summations of independent complex power injections in (3). In Assumption 2, we assume the independence of complex power injections at different nodes and obtain (5). However, the variance should be considered in (5) as
which increases the sample complexity.
IV Experiments
In this section, we present experimental results of our algorithm on custom and IEEE models.
Custom Examples:
In each of our simulation runs we construct a random radial grid and random complex power injections.
For the topology of the grid, we generate a random tree with maximum degree 5.
The line resistance and reactance are independently sampled from the distribution .
Under this setting, we run experiments varying the number of vertices from 10 to 100, the number of samples from 1000 to 10000, and the tolerance . To quantify the performance of our algorithm, we record the rate of correct recovery and the errors in the recovered topology and impedances. For each number of vertices and number of samples, we generate 100 random radial grids to measure the performance. Figure 3 shows the correct recovery ratio and the average error in estimating line impedances. The average error is defined for the grids with correct recovery according to . One observes that our algorithm recovers line impedances with small error even in the demanding case of 1000 samples. We also observe that a larger tolerance results in a higher accuracy for a small number of samples but becomes less accurate for a large number of samples (compare to ). However, if the threshold is too small (), the algorithm performance decreases for all sample sizes. Note that similar results (thus not shown) are obtained when changing the variance of the complex power injections/consumptions.
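A random radial test grid of the kind used in such experiments could be generated as sketched below. This is only an illustration: the attachment rule, the function name `random_radial_grid` and the sampling interval `[low, high]` for resistances and reactances are assumptions, since the text specifies only a random tree with maximum degree 5 and uniformly distributed line parameters. The sketch also does not enforce Assumption 1 on the generated tree.

```python
import numpy as np

def random_radial_grid(n_nodes, max_degree=5, low=0.1, high=1.0, seed=0):
    """Grow a random tree by attaching each new node to a random earlier
    node whose degree is still below max_degree; draw r and x uniformly."""
    rng = np.random.default_rng(seed)
    degree = np.zeros(n_nodes, dtype=int)
    edges = []
    for new in range(1, n_nodes):
        candidates = np.flatnonzero(degree[:new] < max_degree)
        old = int(rng.choice(candidates))
        edges.append((old, new))
        degree[old] += 1
        degree[new] += 1
    r = rng.uniform(low, high, size=n_nodes - 1)   # line resistances
    x = rng.uniform(low, high, size=n_nodes - 1)   # line reactances
    return edges, r, x, degree

edges, r, x, degree = random_radial_grid(50)
```

Every new node attaches to an earlier one, so the result is connected and has exactly one line fewer than the number of nodes, i.e., it is a tree.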
IEEE models: For the more realistic experiments we use an IEEE model with 56 nodes [5] where the topology was modified (to be radial) and the internal nodes all to be of degree . The modified grid is illustrated in Figure 4. We generate the complex power injections from independent normal distributions, as in the case of the custom models. From the complex power injections, we obtain the corresponding voltage magnitudes and phases by using the AC power flow equations in MATPOWER [23]. The input of the algorithm consists of the complex power injections, voltage magnitudes and phases of the leaf/end-user nodes.
Under this setting, we measure the performance of our algorithm by varying the input number of samples, the variance of the complex power injection and the threshold value, . In particular, as a way to quantify errors, we count the number of edges that differ between the recovered topology and the true topology. We also compare the performance of our algorithm on MATPOWER samples and on LCPF samples generated with the same complex power injections. Figure 5 and Figure 6 show our IEEE model experimental results. In Figure 5, one observes that the algorithm works similarly for both the MATPOWER samples and the LCPF samples. In Figure 6, similarly to what we saw in the custom model experiments, the algorithm performance decreases as the threshold increases.
V Conclusion
Topology learning of distribution grids in real time from sparse data is critically important for a number of operational/control applications. In this manuscript, we propose a novel algorithm which recovers topology and line impedances using only measurements at the end-user nodes. In this approach we utilize the LCPF model to approximate the resistance distance (also called effective resistance) between any two observed nodes and apply the recursive grouping algorithm to recover the topology. Computational complexity of the algorithm is and the algorithm guarantees to output the correct topology with only samples (under some mild technical conditions regarding statistics of loads/generation). Furthermore, our experimental results, derived for custom (randomized) and IEEE models, show that the algorithm performs remarkably well. In the future, we plan to extend our algorithm to the case of correlated injections/consumptions and also attempt to generalize it to the case of sparse but loopy operational grids/graphs.
Footnotes
 denotes subgraph of induced by .
 is a coarsest partition if for any and for any , there exists such that . The coarsest partition in Algorithm 1 represents a collection of sets of siblings and their parent.
 is a uniform distribution on an interval .
References
[1] R. Hoffman, “Practical state estimation for electric distribution networks,” in IEEE PES Power Systems Conference and Exposition. IEEE, 2006, pp. 510–517.
[2] A. von Meier, D. Culler, A. McEachern, and R. Arghandeh, “Micro-synchrophasors for distribution systems,” Innovative Smart Grid Technologies Conference (ISGT), 2014 IEEE PES, pp. 1–5, 2014.
[3] Z. Zhong, C. Xu, B. J. Billian, L. Zhang, S.-J. S. Tsai, R. W. Conners, V. A. Centeno, A. G. Phadke, and Y. Liu, “Power system frequency monitoring network (FNET) implementation,” Power Systems, IEEE Transactions on, vol. 20, no. 4, pp. 1914–1921, 2005.
[4] R. Sevlian and R. Rajagopal, “Feeder topology identification,” arXiv preprint arXiv:1503.07224, 2015.
[5] S. Bolognani, N. Bof, D. Michelotti, R. Muraro, and L. Schenato, “Identification of power distribution network topology via voltage correlation analysis,” in Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on. IEEE, 2013, pp. 1659–1664.
[6] D. Deka, M. Chertkov, S. Talukdar, and M. V. Salapaka, “Topology estimation in bulk power grids: Theoretical guarantees and limits,” accepted in the Bulk Power Systems Dynamics and Control Symposium (IREP), 2017.
[7] D. Deka, M. Chertkov, and S. Backhaus, “Structure learning in power distribution networks,” IEEE Transactions on Control of Network Systems, 2017.
[8] D. Deka, S. Backhaus, and M. Chertkov, “Learning topology of the power distribution grid with and without missing data,” in Control Conference (ECC), 2016 European. IEEE, 2016, pp. 313–320.
[9] D. Deka, S. Backhaus, and M. Chertkov, “Estimating distribution grid topologies: A graphical learning based approach,” in Power Systems Computation Conference (PSCC), 2016. IEEE, 2016, pp. 1–7.
[10] Y. Liao, Y. Weng, G. Liu, and R. Rajagopal, “Urban distribution grid topology estimation via group lasso,” arXiv preprint arXiv:1611.01845, 2016.
[11] G. Cavraro, R. Arghandeh, A. von Meier, and K. Poolla, “Data-driven approach for distribution network topology detection,” arXiv preprint arXiv:1504.00724, 2015.
[12] V. Arya, T. Jayram, S. Pal, and S. Kalyanaraman, “Inferring connectivity model from meter measurements in distribution networks,” in Proceedings of the fourth international conference on Future energy systems. ACM, 2013, pp. 173–182.
[13] J. Peppanen, J. Grimaldo, M. J. Reno, S. Grijalva, and R. G. Harley, “Increasing distribution system model accuracy with extensive deployment of smart meters,” in PES General Meeting | Conference & Exposition, 2014 IEEE. IEEE, 2014, pp. 1–5.
[14] D. Deka, S. Backhaus, and M. Chertkov, “Learning topology of distribution grids using only terminal node measurements,” in Smart Grid Communications (SmartGridComm), 2016 IEEE International Conference on. IEEE, 2016, pp. 205–211.
[15] D. Deka, S. Backhaus, and M. Chertkov, “Learning topology of distribution grids using only terminal node measurements,” in IEEE SmartGridComm, 2016.
[16] M. Baran and F. Wu, “Optimal sizing of capacitors placed on a radial distribution system,” Power Delivery, IEEE Transactions on, vol. 4, no. 1, pp. 735–743, Jan 1989.
[17] M. Baran and F. Wu, “Optimal capacitor placement on radial distribution systems,” Power Delivery, IEEE Transactions on, vol. 4, no. 1, pp. 725–734, Jan 1989.
[18] S. Bolognani and S. Zampieri, “On the existence and linear approximation of the power flow solution in power distribution networks,” Power Systems, IEEE Transactions on, vol. 31, no. 1, pp. 163–172, 2016.
[19] M. J. Choi, V. Y. Tan, A. Anandkumar, and A. S. Willsky, “Learning latent tree graphical models,” The Journal of Machine Learning Research, vol. 12, pp. 1771–1812, 2011.
[20] M. E. Baran and F. F. Wu, “Optimal sizing of capacitors placed on a radial distribution system,” IEEE Transactions on Power Delivery, vol. 4, no. 1, pp. 735–743, 1989.
[21] M. E. Baran and F. F. Wu, “Optimal capacitor placement on radial distribution systems,” IEEE Transactions on Power Delivery, vol. 4, no. 1, pp. 725–734, 1989.
[22] M. E. Baran and F. F. Wu, “Network reconfiguration in distribution systems for loss reduction and load balancing,” IEEE Transactions on Power Delivery, vol. 4, no. 2, pp. 1401–1407, 1989.
[23] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas, “MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Transactions on Power Systems, vol. 26, no. 1, pp. 12–19, 2011.