Risk of Cascading Blackouts Given Correlated Component Outages

Risk of Cascading Blackouts Given Correlated Component Outages

Laurence A. Clarfeld, Paul D.H. Hines, Eric M. Hernandez, and Margaret J. Eppstein. L.A. Clarfeld and M.J. Eppstein are with the Department of Computer Science, University of Vermont, Burlington, VT, 05405 (e-mail: Laurence.Clarfeld@uvm.edu; Maggie.Eppstein@uvm.edu). P.D.H. Hines is with the Department of Electrical and Biomedical Engineering, University of Vermont, Burlington, VT, 05405 (e-mail: paul.hines@uvm.edu). E.M. Hernandez is with the Department of Civil and Environmental Engineering, University of Vermont, Burlington, VT, 05405 (e-mail: eric.hernandez@uvm.edu).

Cascading blackouts typically occur when nearly simultaneous outages occur in out of components in a power system, triggering subsequent failures that propagate through the network and cause significant load shedding. While large cascades are rare, their impact can be catastrophic, so quantifying their risk is important for grid planning and operation. A common assumption in previous approaches to quantifying such risk is that the initiating component outages are statistically independent events. However, when triggered by a common exogenous cause, initiating outages may actually be correlated. Here, copula analysis is used to quantify the impact of correlation of initiating outages on the risk of cascading failure. The method is demonstrated on two test cases; a 2383-bus model of the Polish grid under varying load conditions and a synthetic 10,000-bus model based on the geography of the Western US. The large size of the Western US test case required development of new approaches for bounding an estimate of the total number of blackout-causing contingencies. The results suggest that both risk of cascading failure, and the relative contribution of higher order contingencies, increase as a function of spatial correlation in component failures.

blackout risk, cascading failure, cascading outage, correlated outages, Random Chemistry.

1 Introduction

Cascading power failures are typically initiated when a small number of components in a power system of components stop working nearly simultaneously, and the subsequent rerouting of power flow triggers additional component outages. This process continues until the system reaches a state of equilibrium. While most cascades do not propagate extensively throughout the network, the rare cases when they do can cause massive blackouts affecting millions of people. Due to their vast size and substantial social and economic costs, the risk they pose to power systems is significant [1, 2, 3].

To mitigate the risk posed by cascading failure, power systems are operated such that no single component outage will cause a cascade (so-called -1 security). While grid planners and operators are now also required to consider the risk of cascading failure due to multiple contingencies [4], it is not yet clear how to estimate this risk. For brevity, minimal contingencies that result in a cascading blackout are referred to as “malignancies”, while contingencies that do not cause a blackout are referred to as “benign” [5]. By “minimal”, we mean that no smaller subset of outages results in a blackout. Analysis of high order malignancies is challenging due to the nonlinear ways in which cascades propagate, the vast number of malignancies, and the combinatorial search space of possible contingencies, which is computationally intractable for realistically-sized grids.

Many previous approaches to cascading failure risk analysis (including our own) assumed initiating component outages to be independent events [1, 6, 7, 8, 9]. However, malignancies triggered by the same exogenous event, or “common cause”, represent a significant source of risk to power systems [10]. For example, extreme weather events can result in spatially correlated damage [11], protection system failures can sometimes cause multiple outages within a small geographic region [12], and terrorist attacks may be spatially localized. Non-spatial attributes, such as component age, may also induce correlations in component failures [13, 14].

There is a dramatically increasing computational burden to assessing risk for malignancies as increases, so it is important to understand the degree to which higher-order () malignancies contribute to risk, and thus how important it is to consider them in risk estimation. Even though there are many more than malignancies for any given system, when there is no correlation in initiating outages the probability of malignancies occurring is so much lower than that of malignancies that the impact of malignancies on risk is negligible [7]. However, as correlation in component outages increases, the impact of higher order malignancies on risk will increase. To what degree should risk analysis take into account the conditional probability of component failure, given a common cause?

There has been some amount of prior work on ways to incorporate correlation into risk analysis. In [15], correlation was incorporated by assuming 100% correlation of outages within a fixed radius. In [11], spatial correlation was achieved by determining outage rates of lines adjacent to initial failures probabilistically, according to a Poisson process. In [16], a random field with spatial autocorrelation was used in a cascade model to assess risk from common-cause events. Others [17] have simulated the impact of hidden relay failures on cascading failure risk by allowing proximate lines to trip probabilistically.

Another approach to incorporating correlation into risk estimation is via copula analysis. Popularized in the field of finance [18], copulas have been used in a wide variety of disciplines to model the co-dependence of multiple variables [19, 20]. Within the realm of power systems, copulas are a popular tool for uncertainty analysis. They have been used in the modelling of stochastic generation, such as wind [21, 22, 23]. The impacts of variable infeeds on security assessment have also been considered using copulas [24]. Li [25] suggests copulas as a useful way to incorporate correlation between random variables in power systems risk analysis. However, to our knowledge, copula analysis has not yet been applied to cascading failure risk analysis, outside of the preliminary study [26] that we build upon here.

Here, a flexible and generalizable methodology to incorporate correlated component outages into blackout risk estimation using copula analysis is presented. The method is demonstrated on two test cases, where it is shown that even modest assumptions of correlation can dramatically increase the risk associated with cascading power failures, and that the relative risk due to higher-order malignancies increases with increasing correlation.

This paper is organized as follows: methods for risk estimation using samples of malignancies, and the computationally efficient “Random Chemistry” (RC) sampling method used in this work, are reviewed in Sections 2.1 and 2.2, respectively. In Section 2.3 a method using copula analysis to incorporate initiating outage correlations into risk estimation is presented, and in Section 2.4 an approach to quantifying distance between transmission lines, when considering spatial correlation, is described. The two test cases used to demonstrate the method are described in Section 2.5; the large size of one of the test cases required new methods for estimating the total number of malignancies, described in section 2.6. Results and discussion are presented in sections 3 and 4, respectively.

2 Methods

2.1 Estimating Risk of Cascading Failure

This study uses the method for estimating risk of cascading failure from sampled malignancies presented in [9, 7], briefly reviewed below.

The risk due to a set of branches (transmission lines or transformers) can be calculated as  [27]:


where is the joint probability of the branches in failing and is the size of the resultant blackout. Note that is itself a function of , the independent outage probability for each branch , as well as any effect of correlation among branch outage probabilities (as further defined in Section 2.3).

Blackout size is quantified as the total power (MW) unserved due to load shedding. In this work, a cascading blackout is considered to have occurred when 5% or more of the total load is shed in DCSIMSEP, a simulator of cascading outages in power systems [28]. The risk posed to the system by all malignancies comprising branches , for a given , is then:


where is the complete set of all malignancies for the specified . For realistically-sized power systems it is not computationally tractable to find the entire set for . However, if is a large and representative subset of size , comprising all unique malignancies found by many iterations of some sampling strategy, and if the size of the complete set of malignancies can be estimated, then risk associated with malignancies, for a given , can be estimated as follows:


Estimating for on large systems is itself a very challenging problem, as discussed in Section 2.6.

Considering only malignancies with and assuming that non-minimal supersets of malignancies do not substantially change the amount of load shed, as justified in  [7], the risk of cascading failure can be approximated as:


2.2 Random Chemistry Sampling

For each , there are possible contingencies, only a small proportion of which are malignancies. While exhaustive search may be feasible (albeit time consuming) for , it is computationally intractable for in large power systems. Thus, many iterations of the Random Chemistry (RC) sampling method were used to efficiently identify large sets of malignancies in each test case, for . RC is a stochastic set size reduction algorithm for identifying a small minimal set of initiating events that trigger some outcome of interest [29], and was first applied for identifying malignancies in power systems in [5]. For the reader’s convenience, the RC algorithm is briefly reviewed below.

The RC algorithm uses a subset reduction scheme . Subsets of size are randomly sampled from a universal set of system components until one such set is found that causes a blackout; if is relatively large, this typically requires few tries. A set of size is then randomly sub-sampled from the preceding set of size until a set is found that causes a blackout (or some maximum number of iterations is reached, in which case the algorithm aborts), and so on for each subsequent set size in the scheme. If , for some constant , then the algorithm requires only time to identify a subset of size , after which a bottom-up brute force search of all subsets of a given size is conducted (in randomized order, starting from ), exiting when the first minimal malignancy of size is identified (or it has been determined that none exist).

Repeated sampling with independent RC trials is performed to compile large subsets of malignancies (), for . Risk due to each is then calculated using in (3) for estimating system risk with (4).

A comparison of risk estimation using RC sampling vs. Monte Carlo (MC) sampling on a model of the Polish power system at peak winter load [30] showed that the RC approach was at least two orders of magnitude faster than MC on this system, and did not introduce measurable bias into the estimate [9, 7]. However, these previous studies assumed branch outages were uncorrelated. Under that assumption, malignancies contribute relatively little to the risk, despite the fact that , since their probability of occurrence is so much smaller than that of the malignancies [7].

In this study, the universal set is assumed to comprise the set of branches in each test case, , and ; the specific set reduction scheme used for each test case is given in Section 2.5.

2.3 Copula Analysis for Correlation

In [26] a method was introduced using copula analysis to generalize the risk estimation approach to systems with correlated outages, and applied it to the 2383-bus Polish system, with . Here, this method is extended to and a study was conducted to asses how the relative risk of malignancies changes as a function of correlation in outage probabilities, both in the Polish system and a larger 10,000-bus test case, described in Section 2.5.

Copula functions couple multivariate distributions to the marginal distributions of individual variables [31]. Given a set of random variables, , where is the marginal probability that branch fails, for some threshold , then


where , represents the joint probability that branches fail together. Without loss of generality, it is assumed that for all .

There are numerous classes of copula functions in popular use. For this demonstration of the method, a Gaussian copula was assumed, but alternative distributions may be assumed where appropriate. Here, it is assumed that the (inverse) stress on a transmission line is a univariate Gaussian random variable , with mean and standard deviation , with the cumulative distribution function:


Given the independent probability of branch going out, and are chosen such that when , the branch goes out. In other words, and are chosen such that for each branch . Without loss of generality, it is assumed that , and each is then solved for as follows:


A multivariate normal distribution with mean and covariance matrix is then used as the copula function to couple these univariate marginal distributions (Fig. 1).

Fig. 1: A visual depiction of the copula method for two components and with hypothetical Gaussian distributions of some performance attributes, and , which impact whether each component is operational or in a failure state. The curves on the vertical planes represent the marginal distributions of each component’s attribute, with the shaded regions of these curves, () and (), representing the failure state for each component. The shaded gradient on the horizontal plane represents the density of the joint distribution (copula) of the two variables, with darker shading representing higher probability density. The probability mass within the red hatched area represents the region of joint failure (), with the red dotted line depicting the boundaries of this region.

In this study, it is assumed that the correlation between outages in branches and decays exponentially with the distance between them , according to:


where represents the maximum possible correlation coefficient (at distance zero) and represents the characteristic length, which controls the decay rate of the correlation; can be interpreted as the distance at which reaches (i.e., of ) (Fig. 2).

Eq. (8) can be adjusted to represent a wide range of exogenous common cause events individually, or in combination, by adjusting the parameters and to align with data for a particular set of threats. The exponential decay form captures the spatially decaying nature of earthquakes [32], and can approximately capture the impact of other threats that are likely to be geographically correlated, such as tornados [33] and hurricanes [34].

Fig. 2: Change in the correlation between two branches as a function of the distance between them, assuming (8) with characteristic length units of distance and correlation for branches that are 0 units of distance apart.

The resulting correlation coefficient calculated by (8), and the standard deviations and calculated by (7), are used to calculate the pairwise covariance between branches and as:


Using (9) to find each element of the covariance matrix , the probability density function of the multivariate normal distribution (10) is used to form the copula.


Integrating (10) over the region in the joint distribution that represents outages of all system components gives:


where represents the joint outage probability of components . The multiple-integral in (11) represents the generalized solution for arbitrary and is equivalent to the cumulative distribution function of the multivariate normal distribution, which can be solved efficiently in MATLAB using methods described in [35]. In this work, .

2.4 Defining Inter-Branch “Distance”

The definition of “distance” will vary based on the type of common cause threatening the system. Assuming spatial correlation in branch outages here, without loss of generality a modified version of the inter-branch distance metric defined in [26] is used. Branches are assumed to be straight lines between the buses that form their endpoints. Given branch with endpoints () and branch with endpoints (), let the distance from to be defined as


where is the minimum Euclidean distance from the point to the line segment , calculated as:


where and , as illustrated in Figure 3.

Fig. 3: Visual example for calculating the distance between branches and with endpoints () and (,), respectively.

This formulation defines a semi-metric since the triangle inequality does not hold in some cases. However, all other formal requirements of a metric are met. Specifically:

The third identity implies branches and share the same endpoints, thus are parallel. This measure is consistent with what would be intuitively expected when considering spatially correlated damage. For example, in Fig. 4, and .

Using the measure, it is apparent that branch pairs that form malignancies are much closer together than those of benign contingency pairs in both test cases (Fig. 5). This property will exacerbate the effects of spatial correlation on risk of cascading failure.

Fig. 4: Branch pairs used for pairwise distance examples described in the text.
Fig. 5: Distance between the 540 and 564 branch pairs that form malignancies in the Polish and Western US test cases vs. a random sample of 1,000,000 benign branch pairs from each test case.

2.5 Case Studies

This risk estimation approach is demonstrated on two publicly available test cases, modeling the Polish and Western United States (US) transmission systems.

The Polish test case, examined in previous work on risk estimation [6, 7, 26], contains 2383 buses and 2896 branches at a peak winter load and is distributed with the MATPOWER simulation package [30]. The true spatial locations of branches and buses are not publicly available for this test case, so hypothetical locations were inferred based on a graph layout of the grid topology, assuming branches are straight lines between buses (Fig. 6). This layout was then scaled to km, the approximate width/height of Poland, to simulate geographic distances. Some of the transmission lines were overloaded in the Polish test case provided by [30], so the adjusted base case described in [6] was used. Unless otherwise stated, references to the “Polish test case” refer to this adjusted base case. As in [7], different load levels were modeled in the Polish test case from 80% to 115% of the adjusted base case by multiplying all line loads by a scalar factor and then re-running the security constrained optimal power flow, to ensure the pre-contingency system at each load level is secure.

The Western US test case is a synthetic network based on the footprint of the western Unites States and comes via the Electric Grid Test Case Repository [36]. This test case is much larger than the Polish test case, with 10,000 buses and 12,706 branches, and has a more realistic geographic layout (Fig. 7). As with the Polish test case, some transmission lines were overloaded for the Western US test case provided by [36], and so adjustments were made as described in [6]. Since the case did not include short and long-term emergency flow limits (“RateB” and “RateC”), they were synthesized to be 110% and 150% of normal (“RateA”) limits, respectively.

Independent branch outage rates were not available for either the Polish or Western US test cases. For the results presented here, all independent outage rates were set equal to the mean outage rate of 0.9158 hours per year provided by the RTS-96 test case [37]. These independent outage rates were deliberately assumed identical for all branches in order to more clearly elucidate the impact of spatial correlations in outage rates, as assessed using (4), for all combinations of km and .

Fig. 6: Synthetic geographic layout of the Polish test case. Positionally, this layout is arbitrary and has been centered at (0,0), however, units were scaled so that the diameter of the geographic layout is roughly equal to that of Poland (in km).
Fig. 7: Geographic layout of the synthetic 10,000 bus Western US test case.

For the results shown here, the RC algorithm was used to identify large sets of and malignancies using the subset reduction scheme for 1,000,000 RC trials in the Polish model and for 704,400 RC trials in the Western US model. Fewer RC trials were used for the larger Western US test case because computation time was much higher than for the Polish model (averaging 9.5 seconds per RC trial for the Western US model vs. 2.35 seconds for the Polish model, on an Intel Core i5-3470 CPU @ 3.2GHz with 8 GB of RAM).

2.6 Estimating

As described in Section 2.1, this approach to risk estimation requires an estimate of the total number of malignancies , for . There was no need to estimate , since RC sampling identified the complete set of malignancies in both test cases, as evidenced by the flattening in the accumulation curves (Fig. 8(top)), and later verified through brute force search for the Polish test case. The set of unique malignancies was complete after only 5,090 non-unique malignancies had been found by RC sampling in the Polish test case (of 4,191,960 possible contingencies) and after only 9,364 non-unique malignancies had been found by RC sampling in the Western US test case (of 80,714,865 possible contingencies).

Fig. 8: (top) Accumulation curves for RC sampling of malignancies in the Polish and Western US test cases. In both cases, ; (bottom) Accumulation curves for malignancies found by RC sampling in the Polish and Western US test cases. In both cases, .

However, obtaining the entire set of N-3 malignancies is not computationally tractable in either test case, due to the sheer size of these sets. It was initially argued (incorrectly) in [5] that, if one has already identified of the malignancies using independent RC trials, then the probability that the next identified malignancy has not previously been found is , so one could infer from the observed frequency with which unique malignancies were found (assuming sufficient curvature in the accumulation curve). However, the assumption that independent RC trials uniformly sample from the sets has since proven false. In subsequent studies it was discovered that the accumulation curves were not exponential (as they would be if the sampling were uniform), but could be better fit with a 4-parameter exponential Weibull curve to estimate [7]. While this non-linear curve-fitting approach works for estimating in the Polish test case, the Western US test case is so much larger that there is insufficient curvature in the accumulation curve (Fig. 8(bottom)) to reliably fit a curve.

It has previously been noted that the frequency of occurrence of individual branches in malignancies is heavy-tailed [5, 38]. A similarly heavy-tailed distribution is apparent in the distribution of occurrences of specific branch pairs in malignancies, in both the Polish and Western US (Fig. 9).

Fig. 9: Histograms of the number of occurrences of distinct branch pairs in unique malignancies found via RC in (top) the Polish test case and (bottom) the Western US test case.

Further examination reveals that the set reduction scheme used in RC does, indeed, introduce a sampling bias when sampling from such heavy-tailed distributions. Specifically, the branch pairs that appear in disproportionately large numbers of malignancies are systematically under-sampled by the RC algorithm. To illustrate this, the 20 most frequently occurring branch pairs found in for the Western US test case were selected, and a brute force search of all possible contingencies that included each of these top 20 branch pairs (requiring computation time for each branch pair) was performed. In Fig. 10, the proportion of malignancies found by RC that contain each of these branch pairs is plotted as a function of the observed number of occurrences of the branch pairs in . A clear negative trend is present, with malignancies containing the most frequently occurring branch pairs severely under-sampled relative to malignancies containing less frequently occurring branch pairs. While a thorough explanation of the causes of this sampling bias are beyond the scope of this paper, here the bias is exploited to estimate both lower and upper bounds on .

Fig. 10: The relation between number of occurrences of specific branch pairs in malignancies found by RC (-axis) and the proportion RC has found of all malignancies that include those pairs (-axis), for the Western US test case. Only the 20 most frequently occurring branch pairs are shown, with the star indicating the branch pair that occurred most frequently in (.

Given that sampling probabilities are unequal, this problem is analogous to the common conservation biology task of estimating population sizes via mark-and-recapture surveys in closed populations with heterogeneous sampling probabilities. There are numerous techniques that have been developed for this kind of problem [39]. Here, Chao’s method [40] is used, because it is known to be particularly robust to heterogeneous sampling probabilities. In the power system context, the “population” under consideration is , the set of all malignancies. Chao’s estimate is calculated as , where is the number of malignancies found exactly once by RC sampling and is the number of malignancies found exactly twice by RC sampling. Chao’s method produces a lower-bound on the population size within a fixed confidence interval [40], so it is assumed that .

An upper-bound on can be estimated by taking advantage of the two observations demonstrated above: (i) certain branch pairs appear disproportionately often in malignancies (Fig. 9), and (ii) the most frequently occurring branch pairs are under-sampled, relative to less frequent branch pairs (Fig. 10). Based on these observations, the Random Chemistry Proportional (RCP) method is proposed as a way to estimate an upper bound on , as follows: (i) apply RC sampling for a sufficient number of trials such that the identity of the most frequently occurring branch-pair () in the malignancies of the growing set becomes stable (for the Western US test case, , indicated by the star in Fig. 10, became stable after about 7000 RC trials), (ii) perform a brute force search of all possible contingencies that include (this requires only simulations) to determine the true number of malignancies that include , (iii) compute what proportion of all malignancies that include were found by RC sampling, and finally (iv) we estimate the total number of malignancies (referred to as ) by assuming that all other less-frequently occurring branch-pairs have found this same proportion of the total number of malignancies in which they occur. Assuming that is under-sampled, this method provides an overestimate, and hence an upper-bound, on ; i.e., it is expected that .

As the number of malignancies found by RC sampling increases, and can be seen to be converging (Fig. 11), thus increasing the confidence in these as lower and upper bounds on the true value of . Risk estimates are calculated for the rightmost values of and shown in Fig. 11, to obtain approximate bounds on risk due to malignancies for the Western US test case.

In principle, these approaches to estimating lower and upper bounds can also be applied for estimating for , as long as is sufficiently large such that the -tuple that is most frequent in malignancies found by RC sampling has stabilized.

Fig. 11: The accumulation curve of malignancies found by RC is displayed below the lower-bound (Chao’s method) and upper-bound (RCP method) estimates of for the Western US test case.

3 Results

3.1 Set sizes of and

Brute force search was used to verify that RC sampling found all malignancies in the Polish base test case, with . It is assumed that RC sampling also found all malignancies in the Western US test case with , since the accumulation curve became flat (Fig. 8(top)). Using the non-linear curve-fitting method of [7], is estimated to be in the Polish test case at the base load. Using the Chao lower-bounding method [41] and the RCP upper-bounding method (described in Section 2.6), it is estimated that in the Western US test case.

3.2 Impact of Correlation and Load Level on Risk

As shown in prior work [6, 7], the load levels on the Polish grid can greatly affect the vulnerability of the network to cascading power failure due to malignancies. As noted in [7], risk varies non-linearly and non-monotonically with load, in part due to variations in the proximity of generation to demand that result from optimal power flow dispatch at different load levels. For a direct comparison to the results presented in [7], the impact of spatial correlation in malignancies on risk was assessed as a function of load in the Polish test case.

Changes in the system risk as a function of load at km (the longest characteristic correlation length tested) for 3 values of are illustrated in Fig. 12. Risk increased faster than linearly as a function of linearly increasing , at each of the given load levels. The largest percentage increase in system risk occurred at load level 114%, while the greatest absolute increases in risk occur between load levels of 97%-112%, where there are the most malignancies. In general, while introducing correlation in initiating outages magnifies the risk of large cascading blackouts, it does not fundamentally alter the overall shape of the risk curve as a function of load at km.

Fig. 12: Risk of large cascading blackouts posed by spatially-correlated malignancies with a fixed characteristic length ( km) and varying values of maximum correlation for load levels from 80%-115% of the Polish base test case.

When (the largest tested) and was varied from 0 to 300 km, results were superficially similar to those in Fig. 12, in that higher correlation increases risk without changing the overall shape of the risk curve as a function of load. As was the case when was fixed, the largest percentage increase in system risk was found to occur at load level 114%, and greatest absolute increases in risk occured between load levels of 97%-112%. However, in this case risk increases slower than linearly with linear increases in , with the largest increases occurring for intermediate values of (Fig. 13). This occurs because increasing beyond a certain point has diminishing impact on correlation, as approaches the radius of the network.

Fig. 13: Risk of large cascading blackouts posed by spatially-correlated malignancies with a fixed maximum correlation () and varying values of characteristic length (in km) for load levels from 80%-115% of the Polish base test case.

The super-linear increases in risk as a function of and sub-linear increases in risk as a function of at the base load are clearly illustrated in Fig. 14.

Fig. 14: Comparing change in risk of large cascading blackouts for varying (in km) with fixed at 0.15 (top -axis) vs. varying with fixed at 300 km (bottom -axis) for the Polish base test case.

3.3 Risk from and Malignancies

Risk of large cascading blackouts posed by and malignancies was computed for both the Polish base test case and the Western US test case, over all values of and tested, using the set size estimates given in Sec. 3.1.

For the Polish test case (Table I), the increase in estimated risk due to spatial correlation ranges from 149% for the most modest level of correlation tested ( km, ) to 582% in the most extreme case tested ( km, ), relative to the uncorrelated case.

0 100 200 300
0.00 0.0394 - - -
0.05 - 0.0586 0.0675 0.0720
0.10 - 0.0876 0.1142 0.1290
0.15 - 0.1314 0.1918 0.2293
TABLE I: Risk Attributable to and Malignancies in the Polish Test Case for Varying Levels of Spatial Correlation.

For the Western US test case, (Table II), the increase in lower (upper) bounds on risk estimates varied from 129% (130%) for the most modest level of correlation tested ( km, ) to 428% (456)% in the most extreme case tested ( km, ), relative to the uncorrelated case.

0 100 200 300
- - -
TABLE II: Estimated Lower Bounds () and Upper Bounds () on Risk Attributable to and Malignancies in the Western US Test Case for Varying Levels of Correlation.

For both test cases, the general effect of and on risk that is described in section 3.2 also holds in these results. That is, risk tends to grow faster than linearly with respect to and slower than linearly with respect to . The larger proportionate increases in the Polish test case, relative to the Western US test case, occur because the average distance between branches in malignancies are shorter than in the Western US test case (Fig. 5), thus magnifying the impacts of spatial correlation.

3.4 Relative Risk of vs. Malignancies

It is expected that malignancies will contribute more to risk when there is spatial correlation in initiating outages, but it is not clear to what degree. There are several factors that could potentially disproportionately affect the impact of malignancies on risk when there is spatial correlation, relative to that of malignancies, including: (i) size of blackouts caused by vs. malignancies; (ii) the independent probability of branch outages in vs. malignancies; (iii) the distance between branches in vs. malignancies. These factors are each discussed in more detail below.

3.4.1 Blackout Sizes

If the sizes of blackouts resulting from malignancies were larger than malignancies, this could disproportionately increase the relative contribution of malignancies to risk when spatial correlation is present. Distributions of blackout sizes caused by and malignancies, as estimated by DCSIMSEP, are shown for both the Polish and Western US test cases in Fig. 15. In both test cases, the median blackout size for malignancies was actually lower than those caused by malignancies. Specifically, for the Polish test case, the median blackout size caused by malignancies was 7,624 MW vs. 3,372 MW for those caused by malignancies. In the Western US test case, the median blackout size from malignancies was 10,473 MW whereas from the malignancies it was 10,382 MW

Fig. 15: Distributions of blackout sizes (in total MW load shed) caused by all vs. malignancies found by RC sampling, for the Polish and Western US test cases.

3.4.2 Independent Branch Outage Rates

In this study, independent outage rates were assumed to be homogeneous for all branches. However, in a real system the distribution of independent outage rates will be heterogeneous. If branches that are typically involved in malignancies are independently more likely to fail than those involved in malignancies, this could inflate the relative risk of malignancies when spatial correlation is present. While there is no obvious rationale for why this might be true, the observation that branches that occur frequently in malignancies also appear frequently in malignancies is indirect evidence against this. For example, in the Polish network, 8 of the 10 most frequently occurring branches in and malignancies are shared, accounting for 44% and 24% of all and malignancies found, respectively. Likewise, for the Western US test case, 9 of the 10 most frequently occurring branches in malignancies are also in the top 10 most frequently occurring malignancies, accounting for 49% and 29% of all and malignancies, respectively.

3.4.3 Distance Between Branches

The distance between branches in vs. malignancies will obviously impact the degree to which spatial correlation will increase their relative contributions to risk. In both the Polish and Western US test cases, distances between all pairs of branches occurring in malignancies tend to be greater than in malignancies (Fig. 16). Specifically, in the Polish test case the median distances between pairs of branches were 38.3 km and 61.2 km, in and malignancies, respectively; in the Western US test case the medians were 140.1 km and 451.4 km in the and malignancies, respectively. This helps to mitigate the increase in relative risk from vs. malignancies that occurs as a result of spatial correlation.

Fig. 16: Distributions of pairwise distances among branches in all vs. malignancies identified by RC sampling, for the Polish and Western US test cases.

3.4.4 Comparing Relative Risk with Correlation

For the Polish test case, of risk can be attributed to malignancies when there is no correlation whereas under the highest level of correlation considered ( km, ), the share of risk associated with malignancies rises to around 9% (Fig. 17). Similarly, for the Western US test case, malignancies account for 3%-5% of risk when there is no correlation, but between 16%-24% under the maximal correlation ( km, ) considered (Fig. 18).

Fig. 17: Estimated percentage of risk attributable to malignancies vs. malignancies for the Polish Test Case under varying levels of correlation, including all combinations of km and .
Fig. 18: Shaded regions represent bounded estimates on the percentage of risk attributable to malignancies vs. malignancies for the Western US Test Case under varying levels of correlation, including all combinations of km and .

4 Discussion

Previous work had demonstrated that malignancies constitute a relatively low proportion of risk compared to malignancies, assuming initiating branch outage independence [7]. This suggests that, if initiating outages are generally caused by independent events, limiting risk analysis to the more computationally tractable malignancies may be sufficient to capture the majority of risk. However, in reality, common causes such as relay failures, weather disturbances, earthquakes, fire, or terrorism may trigger multiple near-simultaneous outages that could potentially result in cascading blackouts. In this case, an assumption of independence will under-estimate cascading blackout risk.

This paper presents a method using copula analysis as a flexible, customizable approach for incorporating correlation into risk calculations, building on preliminary work presented in [26]. The impact of spatial correlation in and initiating outages on risk of cascading blackouts is assessed in the Polish test case as well as the much larger, and geographically more realistic, Western US test case.

The Western US test case has over four times as many branches as the Polish test case, and the increased computational cost of performing cascade simulations combined with the substantially larger search space of contingencies rendered previously developed methods [5, 7] for estimating the set size of malignancies ineffective. Thus, extending the approach to include risk from malignancies in the Western US test case required new methodologies to estimate lower- and upper-bounds on the total number of malignancies, for .

The results indicate that when spatial correlation is present in initiating outages, the relative contribution of malignancies to risk of cascading blackouts increases, although the increase is partially mitigated by the fact that pairwise distances between branches in malignancies are greater than in malignancies. It is expected that the impact of higher-order malignancies will similarly increase with increasing spatial correlation. Thus, ignoring malignancies for will likely underestimate the magnitude of the risk of cascading failure.

The lack of accurate data regarding independent transmission line outage rates and the impacts of common cause events on those rates are an important limitation to applying these methods in practice. Some such data is available to industry through systems such as the NERC TADS database, but these data are typically unavailable for research purposes. Increasingly, efforts are being made to better predict how common-cause events such as weather-related events will impact the grid [42]. Such knowledge will inform specific applications of the generalized framework introduced herein.

Future work will apply a more sophisticated AC cascading failure simulator to derive risk estimates that take into account the non-linear dynamics of power flow in transmission systems. Additionally, the impact of parametric choices such as distance metrics, correlation functions, and the underlying branch outage distributions which affect the copula function will be studied. Lastly, given the potential importance of incorporating higher-order contingencies in assessing risk under correlation, tractable methods for bounding estimates of the true set sizes of malignancies (for ) will be further explored and validated.


This work was supported in part by the National Science Foundation, award numbers ECCS-1254549 and DGE-1144388. The computational resources provided by the University of Vermont Advanced Computing Core (VACC) are gratefully acknowledged.


  • [1] Q. Chen, C. Jiang, W. Qiu, and J. D. McCalley, “Probability models for estimating the probabilities of cascading outages in high-voltage transmission network,” IEEE TRANSACTIONS ON POWER SYSTEMS PWRS, vol. 21, no. 3, p. 1423, 2006.
  • [2] I. Dobson, B. A. Carreras, V. E. Lynch, and D. E. Newman, “Complex systems analysis of series of blackouts: Cascading failure, critical points, and self-organization,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 17, no. 2, p. 026103, 2007.
  • [3] D. Newman, B. Carreras, V. Lynch, and I. Dobson, “Exploring complex systems aspects of blackout risk and mitigation,” IEEE Transactions on Reliability, vol. 60, no. 1, pp. 134 –143, 2011.
  • [4] S. NERC, “Top-004–2: Transmission operations,” North American Electric Reliability Corporation, 2007.
  • [5] M. J. Eppstein and P. D. Hines, “A “random chemistry” algorithm for identifying collections of multiple contingencies that initiate cascading failure,” IEEE Transactions on Power Systems, vol. 27, no. 3, pp. 1698–1705, 2012.
  • [6] P. Rezaei, M. J. Eppstein, and P. D. Hines, “Rapid assessment, visualization, and mitigation of cascading failure risk in power systems,” in System Sciences (HICSS), 2015 48th Hawaii International Conference on.   IEEE, 2015, pp. 2748–2758.
  • [7] P. Rezaei, P. D. Hines, and M. J. Eppstein, “Estimating cascading failure risk with random chemistry,” IEEE Transactions on Power Systems, vol. 30, no. 5, pp. 2726–2735, 2015.
  • [8] K. Köck, H. Renner, and J. Stadler, “Probabilistic cascading event risk assessment,” in Power Systems Computation Conference (PSCC), 2014.   IEEE, 2014, pp. 1–7.
  • [9] P. Rezaei, P. D. Hines, and M. Eppstein, “Estimating cascading failure risk: Comparing monte carlo sampling and random chemistry,” in PES General Meeting— Conference & Exposition, 2014 IEEE.   IEEE, 2014, pp. 1–5.
  • [10] M. Papic, S. Agarwal, R. N. Allan, R. Billinton, C. J. Dent, S. Ekisheva, D. Gent, K. Jiang, W. Li, J. Mitra et al., “Research on common-mode and dependent (cmd) outage events in power systems: A review,” 2017.
  • [11] I. Dobson, N. K. Carrington, K. Zhou, Z. Wang, B. A. Carreras, and J. M. Reynolds-Barredo, “Exploring cascading outages and weather via processing historic data,” in Hawaii International Conference on System Sciences (HICSS-51), Jan 2017.
  • [12] K. Jiang and C. Singh, “New models and concepts for power system reliability evaluation including protection system failures,” IEEE Transactions on Power Systems, vol. 26, no. 4, pp. 1845–1855, 2011.
  • [13] W. Li, “Incorporating aging failures in power system reliability evaluation,” IEEE Transactions on Power systems, vol. 17, no. 3, pp. 918–923, 2002.
  • [14] A. M. Salman, “Age-dependent fragility and life-cycle cost analysis of timber and steel distribution poles subjected to hurricanes,” 2014.
  • [15] A. Bernstein, D. Bienstock, D. Hay, M. Uzunoglu, and G. Zussman, “Power grid vulnerability to geographically correlated failures—analysis and control implications,” in INFOCOM, 2014 Proceedings IEEE.   IEEE, 2014, pp. 2634–2642.
  • [16] A. Scherb, L. Garrè, and D. Straub, “Reliability and component importance in networks subject to spatially distributed hazards followed by cascading failures,” ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part B: Mechanical Engineering, vol. 3, no. 2, p. 021007, 2017.
  • [17] J. Chen, J. S. Thorp, and I. Dobson, “Cascading dynamics and mitigation assessment in power system disturbances via a hidden failure model,” International Journal of Electrical Power & Energy Systems, vol. 27, no. 4, pp. 318–326, 2005.
  • [18] U. Cherubini, E. Luciano, and W. Vecchiato, Copula methods in finance.   John Wiley & Sons, 2004.
  • [19] A. Onken, S. Grünewälder, M. H. Munk, and K. Obermayer, “Analyzing short-term noise dependencies of spike-counts in macaque prefrontal cortex using copulas and the flashlight transformation,” PLoS computational biology, vol. 5, no. 11, p. e1000577, 2009.
  • [20] C. Schoelzel and P. Friederichs, “Multivariate non-normally distributed random variables in climate research–introduction to the copula approach,” Nonlinear Processes in Geophysics, vol. 15, no. 5, pp. 761–772, 2008.
  • [21] S. Hagspiel, A. Papaemannouil, M. Schmid, and G. Andersson, “Copula-based modeling of stochastic wind power in europe and implications for the swiss power grid,” Applied energy, vol. 96, pp. 33–44, 2012.
  • [22] G. Papaefthymiou and P. Pinson, “Modeling of spatial dependence in wind power forecast uncertainty,” in Probabilistic Methods Applied to Power Systems, 2008. PMAPS’08. Proceedings of the 10th International Conference on.   IEEE, 2008, pp. 1–9.
  • [23] G. Papaefthymiou and D. Kurowicka, “Using copulas for modeling stochastic dependence in power system uncertainty analysis,” IEEE Transactions on Power Systems, vol. 24, no. 1, pp. 40–49, 2009.
  • [24] M. de Jong, G. Papaefthymiou, and P. Palensky, “A framework for incorporation of infeed uncertainty in power system risk-based security assessment,” IEEE Transactions on Power Systems, vol. 33, no. 1, pp. 613–621, 2018.
  • [25] W. Li, Risk assessment of power systems: models, methods, and applications.   John Wiley & Sons, 2014.
  • [26] L. A. Clarfeld, M. J. Eppstein, P. D. Hines, and E. M. Hernandez, “Assessing risk from cascading blackouts given correlated component failures,” in 2018 Power Systems Computation Conference (PSCC).   IEEE, 2018, pp. 1–7.
  • [27] M. Vaiman, K. Bell, Y. Chen, B. Chowdhury, I. Dobson, P. Hines, M. Papic, S. Miller, and P. Zhang, “Risk assessment of cascading outages: Methodologies and challenges,” IEEE Transactions on Power Systems, vol. 27, no. 2, p. 631, 2012.
  • [28] P. Hines and P. Rezaei, Smart Grid Handbook.   John Wiley & Sons, 2016, ch. Cascading Failures in Power Systems.
  • [29] M. J. Eppstein, J. L. Payne, B. C. White, and J. H. Moore, “Genomic mining for complex disease traits with “random chemistry”,” Genetic Programming and Evolvable Machines, vol. 8, no. 4, pp. 395–411, 2007.
  • [30] R. D. Zimmerman, C. E. Murillo-Sánchez, R. J. Thomas et al., “Matpower: Steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Transactions on power systems, vol. 26, no. 1, pp. 12–19, 2011.
  • [31] R. B. Nelsen, An introduction to copulas.   New York: Springer, 2010.
  • [32] N. Jayaram and J. W. Baker, “Correlation model for spatially distributed ground-motion intensities,” Earthquake Engineering & Structural Dynamics, vol. 38, no. 15, pp. 1687–1708, 2009.
  • [33] J. Wurman and C. R. Alexander, “The 30 may 1998 spencer, south dakota, storm. part ii: Comparison of observed damage and radar-derived winds in the tornadoes,” Monthly weather review, vol. 133, no. 1, pp. 97–119, 2005.
  • [34] H. Willoughby, R. Darling, and M. Rahn, “Parametric representation of the primary hurricane vortex. part ii: A new family of sectionally continuous profiles,” Monthly weather review, vol. 134, no. 4, pp. 1102–1120, 2006.
  • [35] A. Genz, “Numerical computation of rectangular bivariate and trivariate normal and t probabilities,” Statistics and Computing, vol. 14, no. 3, pp. 251–260, 2004.
  • [36] A. B. Birchfield, T. Xu, K. M. Gegner, K. S. Shetye, and T. J. Overbye, “Grid structural characteristics as validation criteria for synthetic networks,” IEEE Transactions on power systems, vol. 32, no. 4, pp. 3258–3265, 2017.
  • [37] R. T. Force, “The ieee reliability test system-1996,” IEEE Trans. Power Syst, vol. 14, no. 3, pp. 1010–1020, 1999.
  • [38] Y. Yang, T. Nishikawa, and A. E. Motter, “Small vulnerable sets determine large network cascades in power grids,” Science, vol. 358, no. 6365, p. eaan3184, 2017.
  • [39] S. C. Amstrup, T. L. McDonald, and B. F. Manly, Handbook of capture-recapture analysis.   Princeton University Press, 2010.
  • [40] A. Chao, “Estimating the population size for capture-recapture data with unequal catchability,” Biometrics, pp. 783–791, 1987.
  • [41] A. Chao and S.-M. Lee, “Estimating the number of classes via sample coverage,” Journal of the American statistical Association, vol. 87, no. 417, pp. 210–217, 1992.
  • [42] Y. Liu and J. Zhong, “Risk assessment of power systems under extreme weather conditions—a review,” in PowerTech, 2017 IEEE Manchester.   IEEE, 2017, pp. 1–6.

Laurence A. Clarfeld received his B.S. in Mathematics from the University of Vermont in 2007 and his M.S. in Environmental Studies, focusing on Conservation Biology, from Antioch University of New England in 2013. He returned to University of Vermont in 2016 to pursue a Ph.D. in Computer Science as a graduate research fellow in the NSF IGERT Smart Grid program, where his research interests include risk analysis of cascading power failures.

Paul D.H. Hines (S‘96,M‘07,SM‘14) received the Ph.D. in Engineering and Public Policy from Carnegie Mellon University in 2007 and M.S. (2001) and B.S. (1997) degrees in Electrical Engineering from the University of Washington and Seattle Pacific University, respectively. He is currently Associate Professor and the L. Richard Fisher chair in the Electrical and Biomedical Engineering department, with a secondary appointment in Computer Science, at the University of Vermont. He is also a member of the external faculty of the Santa Fe Institute and a co-founder of Packetized Energy, a cleantech startup. Formerly he worked at the U.S. National Energy Technology Laboratory, the US Federal Energy Regulatory Commission, Alstom ESCA, and for Black and Veatch. He currently serves as the vice-chair of the IEEE PES Working Group on Cascading Failure.

Eric M. Hernandez received a B.S. degree in Civil Engineering from Universidad Nacional Pedro Henriquez Urena, Santo Domingo, Dominican Republic, in 1999, and M.S. and Ph.D. degrees in Civil Engineering from Northeastern University, in 2004 and 2007, respectively. In 2011, he joined the Department of Civil and Environmental Engineering, at the University of Vermont (UVM) as an Assistant Professor and in 2017 was promoted to Associate Professor. In 2018, he was awarded the inaugural Gregory N. Sweeny Green and Gold Professorship in Civil Engineering at UVM. His current research interests include structural health monitoring, stochastic modeling, inverse problems and Bayesian reliability.

Margaret J. Eppstein received the B.S. degree in zoology from Michigan State University in 1978 and the M.S. degree in computer science and the Ph.D. degree in civil & environmental engineering from the University of Vermont (UVM), Burlington, VT, in 1983 and 1997, respectively. She is Research Professor and Professor of Computer Science Emerita at UVM, where she has been on the faculty since 1983. She was founding director of the Vermont Complex Systems Center (2006-2010) and Chair of the UVM Department of Computer Science (2012-2018). Her research interests comprise interesting computational challenges in modeling and analysis of complex systems in a variety of application domains.

Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description