Operational risk of a wind farm energy production by Extreme Value Theory and Copulas
Abstract
In this paper we use risk management techniques to evaluate the potential effects of those operational risks that affect the energy production of a wind farm. We concentrate our attention on three major risk factors: wind speed uncertainty, wind turbine reliability and interactions of wind turbines due mainly to their placement.
As a first contribution, we show that the Weibull distribution, commonly used to fit recorded wind speed data, underestimates rare events. Therefore, in order to achieve a better estimation of the tail of the wind speed distribution, we propose a Generalized Pareto distribution. Wind turbine reliability is considered by modeling failure events as a compound Poisson process. Finally, the use of copulas enables us to consider the correlation between the wind turbines that compose the wind farm. Once this procedure is set up, we present a sensitivity analysis and compare the results of the proposed procedure with those obtained by ignoring the aforementioned risk factors.
keywords:
Wind speed, Weibull distribution, Generalized Pareto distribution, compound Poisson process
1 Introduction
Wind power now accounts for a high proportion of generation capacity in many regions. For example, wind accounted for 17 percent of Germany’s 167.8 GW of installed generation capacity in 2011 (Hau and Von Renouard (2013)). As wind power’s share of electricity generation has increased, so have the financial consequences of risks associated with its inherently high variability. Wind speed variability has many financial consequences. Low wind speeds reduce generation and revenues for wind power generators and may adversely affect the ability to meet debt payments, creating credit risks for investors. Conversely, excessively high wind speeds may temporarily halt generation or delay wind farm construction. When wind has priority access to the grid, thermal power plants have to balance generation regardless of whether wind is above or below forecasted levels. Wind speed variability may also compound price risk for other market players through its influence on wholesale electricity market clearing prices in competitive day-ahead and intraday markets. The uncertainty in wind power production needs to be hedged through risk management techniques.
In this work we focus on the sources of operational risk that are present in a wind farm, namely: wind speed uncertainty, wind turbine reliability and interactions of wind turbines due mainly to their placement.
Although the Weibull distribution (WD) is often used by practitioners and researchers alike (see Weisser (2003); Akdağ and Dinler (2009); Chang (2011)), it does not fit the right tail of the wind speed distribution well and underestimates the probability of strong winds. The WD accurately models the body of the wind speed distribution, but the same cannot be said for its tail. By applying extreme value theory we show that the number of strong wind events can be estimated better if a Generalized Pareto distribution (GPD) is used to fit the right tail of the wind speed distribution. A similar approach has already been used in wind speed modeling (see Morgan et al. (2011); Holmes and Moriarty (1999); Van de Vyver and Delcloo (2011); Zachary et al. (1998)). Here we are interested in highlighting the importance of extreme value theory as a means of controlling the operational risk arising from wind speed uncertainty, as applied to a real case of energy production.
Another source of operational risk is wind turbine reliability and the time necessary for repair. We model the failure events by means of a compound Poisson process. The compound Poisson model is widely adopted by insurance modelers for measuring aggregate risks (see e.g. Tse (2009)), and we show that it can also be used in the management of a wind farm to account for periods without energy production due to wind turbine failures and to the time needed for repair.
The third source of operational risk we consider is the correlation between the energy production of the wind turbines. Indeed, since many turbines act together in a wind farm, it is better to consider the multivariate distribution of their energy production than to treat the turbines as independent and with identical energy production. This risk factor is considered through copulas, which permit the construction of a multivariate model with fixed univariate marginal distributions.
In the present work we consider a wind speed database for a specific site in Alaska and we assume that a wind farm composed of 10 commercial wind turbines is installed there. We then propose a procedure to estimate correctly the energy production of this wind farm by taking into account the three sources of uncertainty.
The paper is organized as follows: in section 2 we describe the wind speed database and the commercial wind turbines considered in the application. In section 3 we present the models at the basis of the proposed procedure. Section 4 shows the application of the procedure to our database, a sensitivity analysis and the comparison with the energy estimation of a wind farm obtained without considering the aforementioned risk factors. Finally, in section 5 we give some concluding remarks.
2 Material
2.1 Database
The database of wind speed used in our analysis was collected by the National Data Buoy Center (www.ndbc.noaa.gov). In particular, we downloaded the data from the inshore station RDDA2, which is situated at 67.577 N 164.065 W in Alaska. The data are available for six years ranging from 2006 to 2012 with a sampling period of six minutes. The instruments are located 10 m above the ground, and the mean and maximum values of wind speed in the database are 4.5 m/s and 34.8 m/s, respectively.
This database is used to analyze the production of energy from commercial wind turbines whose hub is at a given altitude. Since the altitude above the ground influences the wind speed, we have to transform the velocities recorded at 10 m into the corresponding data at the required altitude. It is well known in the literature that wind speed depends on altitude as follows (see e.g. D’Amico et al. (2013b)):
$$v_h = v_0 \, \frac{\ln(h/z_0)}{\ln(h_0/z_0)} \qquad (1)$$
where $v_h$ is the wind speed at the height of the wind turbine hub, $v_0$ is the value of the wind speed at the height of the instrument, and $h$ and $h_0$ are the heights of the wind turbine hub and of the instrument ($h = 50$ m and $h_0 = 10$ m), respectively. The parameter $z_0$ is a factor that takes into account the morphology of the area near the wind turbine. For a region without buildings or trees, this parameter varies between 0.01 and 0.001, whereas for offshore applications it is equal to 0.0001. In our analysis we consider a mean value for an onshore application and fix $z_0 = 0.005$. This transformation increases both the mean and the maximum value of the wind speed, which become 5.4 m/s and 42 m/s, respectively. Figure 1 shows the main characteristics of the database: the probability density function (PDF) of the wind speed, a portion of one year of the time series, and the box plot, from which we can see that the median wind speed is below the mean value and that all wind speeds greater than 15 m/s lie in the fourth quartile.
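As a quick numerical check of the height transformation, the following minimal sketch assumes that equation (1) is the logarithmic wind profile with a roughness parameter of 0.005, an instrument height of 10 m and a hub height of 50 m. The function name `scale_to_hub_height` and its argument names are illustrative and do not come from the original text.

```python
import math

def scale_to_hub_height(v_ref, h_ref=10.0, h_hub=50.0, z0=0.005):
    """Rescale a wind speed measured at h_ref to the hub height h_hub.

    Assumes a logarithmic wind profile; z0 is the roughness parameter
    and all heights are in metres.
    """
    return v_ref * math.log(h_hub / z0) / math.log(h_ref / z0)

# Multiplicative factor applied to every recorded speed.
factor = scale_to_hub_height(1.0)
# Mean and maximum of the database (4.5 and 34.8 m/s) moved to hub height.
mean_hub = scale_to_hub_height(4.5)
max_hub = scale_to_hub_height(34.8)
print(factor, mean_hub, max_hub)
```

Under these assumptions the factor is about 1.21, and the rescaled mean and maximum agree with the reported 5.4 m/s and 42 m/s up to the rounding of the input values.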
2.2 Commercial wind turbine
Wind turbines convert the kinetic energy of wind into electrical power. The quantity of converted energy depends, ceteris paribus, on the installed wind turbines. In this application we chose a commercial wind turbine, the 330 kW Enercon E33. This turbine has a hub height of 50 m above the ground. The most important property of a wind turbine is its power curve, which characterizes its performance by giving the power produced as a function of wind speed. The power curve of the 330 kW Enercon E33 is represented in figure 2 and its numerical values are reported in table 1. Given the continuous nature of wind speed, in the present application each wind speed is converted into energy by linear interpolation between the discrete points of the power curve.
Wind speed [m/s]  0  1  2  3  4     5   6   7   8    9    10   11   12   13-28
Power [kW]        0  0  0  5  13.7  30  55  92  138  196  250  293  320  335
The wind turbine starts to produce energy at a given wind speed (cut-in); below this speed the kinetic energy of the wind is too low to move the blades. Conversely, there exists a value of wind speed (cut-off) above which the blades cannot operate and are set parallel to the wind flow to avoid structural damage. As can be seen in table 1, the first non-zero power of the Enercon E33 is reported at 3 m/s, and its cut-off wind speed is 28 m/s.
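The linear interpolation of the power curve can be sketched as follows. The values are those of table 1; reading the last column as the range 13 to 28 m/s, holding the power at 335 kW over that range, and returning zero above 28 m/s are assumptions of this sketch, and the name `power_kw` is illustrative.

```python
import bisect

# Power curve of table 1: wind speed in m/s, power in kW.
SPEEDS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
POWER = [0, 0, 0, 5, 13.7, 30, 55, 92, 138, 196, 250, 293, 320, 335]
CUT_OFF = 28.0  # m/s, above which the turbine is stopped

def power_kw(v):
    """Power in kW at wind speed v, by piecewise linear interpolation."""
    if v < 0 or v > CUT_OFF:
        return 0.0
    if v >= SPEEDS[-1]:
        # Flat part of the curve between 13 m/s and the cut-off speed.
        return float(POWER[-1])
    i = bisect.bisect_right(SPEEDS, v) - 1
    frac = (v - SPEEDS[i]) / (SPEEDS[i + 1] - SPEEDS[i])
    return POWER[i] + frac * (POWER[i + 1] - POWER[i])

print(power_kw(3.5), power_kw(20.0), power_kw(30.0))
```

Because the table is linear between integer speeds, a speed such as 2.5 m/s already yields a small positive power in this sketch; a stricter treatment of the cut-in speed would need the manufacturer's value.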
3 Wind speed model
In this section we describe the mathematical tools used to simulate the energy produced by a wind farm considering the three sources of operational risk. We concentrate on the estimation of the wind speed distribution over a given time interval and not on the adoption of a stochastic process approach to model wind speed, as for example in D’Amico et al. (2013a, e); D’Amico et al. (2014a); D’Amico et al. (2013d, c); D’Amico et al. (2014b).
3.1 Step 1: fitting the wind speed distribution
In order to fit the experimental distribution of wind speed many authors suggest the use of a Weibull distribution, see e.g. Weisser (2003); Akdağ and Dinler (2009); Chang (2011). A random variable $V$ has a two-parameter Weibull distribution if its probability density function is
$$f_W(v) = \frac{k}{c} \left(\frac{v}{c}\right)^{k-1} \exp\left[-\left(\frac{v}{c}\right)^{k}\right], \qquad v \ge 0,$$
where $k$ is the shape parameter and $c$ is the scale parameter.
We are going to show that this distribution is not able to take into account the extreme wind speed values and that this has important consequences when dealing with energy production. A good way of considering these extreme events is to use appropriate fat-tailed distributions. In this paper we adopt the generalized Pareto distribution (GPD) as a model for extreme wind speed values. A random variable $V$ has a GPD if its probability density function is
$$f_{GPD}(v) = \frac{1}{\sigma} \left(1 + \xi \, \frac{v - \mu}{\sigma}\right)^{-\left(\frac{1}{\xi} + 1\right)}, \qquad v \ge \mu,$$
where $\mu$ is the location parameter, $\sigma$ is the scale parameter and $\xi$ is the shape parameter.
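Following extreme value theory, we use the WD to model the body of the experimental distribution (ED) and the GPD to model the right tail of the ED. In summary, we end up with a threshold density that uses the WD for non-exceeding wind speeds and the GPD for wind speed values that fall above a fixed threshold $u$.
Therefore, the considered wind speed distribution is the following:
$$f(v) = \begin{cases} f_W(v), & v \le u \\ f_{GPD}(v), & v > u \end{cases} \qquad (2)$$
where $f_W$ and $f_{GPD}$ denote the Weibull and GPD densities. The function $f$, as defined, is not a probability density function; in fact, in general
$$\int_0^{\infty} f(v) \, dv \ne 1 \qquad (3)$$
so $f$ is normalized as follows:
$$\tilde{f}(v) = \frac{f(v)}{\int_0^{\infty} f(s) \, ds} \qquad (4)$$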
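The piecewise construction and its normalization can be illustrated with the sketch below. All parameter values are hypothetical and chosen only for illustration; they are not the fitted values of the paper. The sketch also assumes a GPD location parameter at or below the threshold, so that the normalizing constant has the closed form "Weibull mass below the threshold plus GPD mass above it".

```python
import math

# Hypothetical parameters, for illustration only (not fitted values).
K, C = 2.0, 6.0                # Weibull shape and scale (m/s)
MU, SIGMA, XI = 0.0, 5.0, 0.1  # GPD location, scale and shape
U = 20.0                       # threshold (m/s)

def weibull_pdf(v):
    return (K / C) * (v / C) ** (K - 1) * math.exp(-((v / C) ** K))

def weibull_cdf(v):
    return 1.0 - math.exp(-((v / C) ** K))

def gpd_pdf(v):
    return (1.0 / SIGMA) * (1.0 + XI * (v - MU) / SIGMA) ** (-1.0 / XI - 1.0)

def gpd_sf(v):
    """GPD survival function, i.e. the probability of exceeding v."""
    return (1.0 + XI * (v - MU) / SIGMA) ** (-1.0 / XI)

# Integral of the unnormalized piecewise density over [0, infinity).
Z = weibull_cdf(U) + gpd_sf(U)

def wgpd_pdf(v):
    """Normalized Weibull-GPD density for v >= 0."""
    raw = weibull_pdf(v) if v <= U else gpd_pdf(v)
    return raw / Z

def wgpd_cdf(v):
    """Normalized Weibull-GPD cumulative distribution for v >= 0."""
    if v <= U:
        return weibull_cdf(v) / Z
    return (weibull_cdf(U) + gpd_sf(U) - gpd_sf(v)) / Z

print(Z, wgpd_cdf(U), wgpd_cdf(60.0))
```

With these illustrative numbers the unnormalized integral exceeds one, because the GPD places more mass above the threshold than the Weibull tail it replaces, which is why a normalization step is needed.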
3.2 Step 2: modeling the multivariate wind speed distribution
The second step of the procedure has the objective of recovering the joint distribution of wind speed for the different installed wind turbines. Let us assume that the wind farm consists of $n$ wind turbines. We denote by $F_i$ the cumulative distribution of wind speed at the $i$th turbine, where $i = 1, \ldots, n$ and $F_i$ is obtained from the normalized density of Step 1. The distributions of wind speed of different turbines installed at a given location are strongly dependent because numerous common factors affect all of them, such as the underlying wind, geomorphological factors, shear effect and others. Since we have determined a suitable marginal distribution (for each single turbine) at the previous step, we wish to maintain it while extending the analysis to the multivariate distribution. As is well known, copula functions provide a solution to this problem.
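An $n$-dimensional copula is a mapping $C$ from $[0,1]^n$ to the set $[0,1]$. It is grounded and $n$-increasing and satisfies the following condition:
The margins $C_i$ satisfy $C_i(x) = C(1, \ldots, 1, x, 1, \ldots, 1) = x$ for all $x \in [0,1]$.
If we denote by $F$ the joint distribution of wind speed at the $n$ different turbines, with marginal distributions $F_1, \ldots, F_n$, then by Sklar's theorem there exists a copula $C$, unique when the marginals are continuous, such that
$$F(v_1, \ldots, v_n) = C(F_1(v_1), \ldots, F_n(v_n)),$$
and conversely, if we specify a copula function $C$, then $C(F_1(v_1), \ldots, F_n(v_n))$ is a multivariate distribution function with marginal distribution function $F_i$ for all $i$.
The problem of selecting an appropriate copula function is discussed later in the application section.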
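To make the role of the copula concrete, the following sketch draws vectors of dependent uniform numbers from a Gaussian copula. It assumes a single common correlation between all pairs of turbines and uses a one-factor construction, which is only valid for non-negative correlations. The function name and the chosen correlation of 0.8 are illustrative.

```python
import math
import random
from statistics import NormalDist

def gaussian_copula_draw(n_turbines, rho, rng):
    """Return one vector of n_turbines uniforms linked by a Gaussian copula.

    One-factor construction, valid for a common correlation 0 <= rho <= 1:
    z_i = sqrt(rho) * m + sqrt(1 - rho) * e_i, with m and e_i standard normal,
    followed by the standard normal CDF to obtain uniform marginals.
    """
    phi = NormalDist().cdf
    common = rng.gauss(0.0, 1.0)
    return [
        phi(math.sqrt(rho) * common + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
        for _ in range(n_turbines)
    ]

rng = random.Random(42)
# 2000 draws for a farm of 10 turbines with common correlation 0.8.
draws = [gaussian_copula_draw(10, 0.8, rng) for _ in range(2000)]
print(draws[0])
```

Each component is uniform on the unit interval, so any marginal distribution can be imposed afterwards by applying its inverse cumulative distribution function, which is the content of the converse part of Sklar's theorem.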
3.3 Step 3: simulation of vectors of wind speed
The third step consists in the simulation of vectors of correlated wind speed. The vector $\mathbf{v}_i$ represents the values of wind speed acting on the $i$th turbine. This vector has dimension $T$, where $T$ is the time horizon of interest. In this study we fix $T = 87600$, which represents a six-minute frequency for one year. The Monte Carlo simulation starts by generating $n$ time series of length $T$ composed of random numbers, ranging from 0 to 1, correlated through the given copula. These random numbers are transformed into wind speeds (following a specific PDF) through the inverse of the cumulative distribution function of the combined Weibull-GPD model (WGPD) described above and shown in Figure 4. At this point the procedure has generated $n$ time series of length $T$, correlated through the copula and following the WGPD.
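The inverse transformation from uniform numbers to wind speeds can be sketched as below, for a piecewise Weibull-GPD density normalized by its total integral. The parameter values are hypothetical, the GPD location is assumed to lie at or below the threshold, and `wgpd_quantile` is an illustrative name; the uniform inputs would in practice come from the copula.

```python
import math

# Hypothetical parameters, for illustration only (not fitted values).
K, C = 2.0, 6.0                # Weibull shape and scale (m/s)
MU, SIGMA, XI = 0.0, 5.0, 0.1  # GPD location, scale and shape
U = 20.0                       # threshold (m/s)

# Weibull mass below the threshold and GPD mass above it.
F_W_U = 1.0 - math.exp(-((U / C) ** K))
S_G_U = (1.0 + XI * (U - MU) / SIGMA) ** (-1.0 / XI)
Z = F_W_U + S_G_U              # normalizing constant

def wgpd_quantile(p):
    """Inverse of the normalized Weibull-GPD cumulative distribution."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    if p <= F_W_U / Z:
        # Body: invert the rescaled Weibull distribution function.
        return C * (-math.log(1.0 - p * Z)) ** (1.0 / K)
    # Tail: solve gpd_survival(v) = Z * (1 - p) for v.
    return MU + SIGMA / XI * ((Z * (1.0 - p)) ** (-XI) - 1.0)

# Uniform numbers (here fixed by hand) mapped to wind speeds in m/s.
uniforms = [0.1, 0.5, 0.9, 0.999]
speeds = [wgpd_quantile(p) for p in uniforms]
print(speeds)
```

Since the quantile function is monotone, the dependence structure imposed by the copula on the uniform numbers carries over unchanged to the simulated wind speeds.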
3.4 Step 4: modelling wind turbine failures
Like all mechanical devices, wind turbines are subject to failures that can interrupt the production of wind energy. In this step we specify a simple model of failure, namely the compound Poisson model, which we will use later to compute the total time of inactivity of the turbines and the consequent loss of energy production (see also Tavner et al. (2007)).
Let us denote by $N$ the number of failures in the block of wind turbines, and by $X_j$ the total time of inactivity of the wind turbine that suffered the $j$th failure. Then the total time of inactivity is given by
$$S = X_1 + X_2 + \cdots + X_N.$$
The key simplifying hypotheses that are commonly made are:
(i) The random variables $N$ and $X_j$ are assumed to be independent for all $j$;
(ii) The random variables $X_j$ are independent and have the same distribution as a random variable denoted by $X$.
The consequence of these hypotheses is that the random variable $S$ follows a compound distribution, with the distribution of $N$ being the primary distribution and that of $X$ the secondary one, and the probability generating function of $S$ is given by
$$P_S(t) = P_N(P_X(t)),$$
where $P_N$ and $P_X$ are the probability generating functions of $N$ and $X$.
To specify the model completely, we assume that the distribution of $N$ is Poisson with parameter $\lambda_N$ and that the distribution of the repair times $X$ is also Poisson with parameter $\lambda_X$.
The compound Poisson model is a very popular option adopted by insurance modelers for measuring aggregate risks, see e.g. Tse (2009). Many of the advantages of this stochastic model translate naturally to the analysis of wind farms. One of the main advantages is that, if the wind farm is expanded by increasing the number of installed turbines, this choice affects only the frequency of failures (the primary distribution) and not the time necessary to repair them (the secondary distribution). By contrast, innovation in repair management may affect only the distribution of the repair time of the turbine and not the frequency distribution. Furthermore, general innovation in technology and quality of materials may affect the primary and secondary distributions differently; therefore, modeling them separately allows us to understand and measure the effects of such events on the aggregate time of non-production.
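A minimal simulation of the compound Poisson model is sketched below. The two Poisson parameters (2 failures and a mean repair time of 5 time units) are hypothetical values chosen for illustration, and the helper names are not from the original text. The Poisson sampler uses Knuth's multiplication method, which is adequate for small parameters.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw a Poisson variate with mean lam (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def total_downtime(lam_failures, lam_repair, rng):
    """Total inactivity time: sum of the repair times over all failures."""
    n_failures = sample_poisson(lam_failures, rng)
    return sum(sample_poisson(lam_repair, rng) for _ in range(n_failures))

rng = random.Random(0)
# Hypothetical parameters: 2 failures on average, mean repair time 5 units.
samples = [total_downtime(2.0, 5.0, rng) for _ in range(20000)]
print(sum(samples) / len(samples))
```

For a compound distribution the mean of the total is the product of the means of the primary and secondary distributions, so the sample mean should be close to 10 with these illustrative values.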
3.5 Step 5: simulation of failures and repair times for the wind farm
For each wind turbine we simulate the number of failures and the repair times within the time horizon $T$. Moreover, for each failure we simulate its position on the time interval and then modify the vectors of correlated wind speed simulated at Step 3 by replacing the velocities with zeros at the times when a failure is being repaired. This step can be formally described by the following algorithm, where the comment following each instruction is given in parentheses.
1. For $i = 1, \ldots, n$ (for each wind turbine):
2. Sample $N_i$ from the Poisson distribution of the number of failures (simulate the number of failures for each turbine);
3. For $j = 1, \ldots, N_i$, sample $X_{i,j}$ from the Poisson distribution of the repair times (simulate the repair time for the $j$th failure of the $i$th turbine);
4. Set $t_{i,0} = 0$ and sample the failure time $t_{i,j}$ (simulate the time of the $j$th failure of the $i$th turbine; notice that the next failure can only occur at a time that lies between the end of the repair of the $j$th failure and the time horizon $T$);
5. Set $N_i^*$ to the number of simulated failures that occur before $T$ ($N_i^*$ is the effective number of failures suffered by the $i$th turbine within time $T$);
6. For all $j = 1, \ldots, N_i^*$, set $v_i(t) = 0$ for all $t \in [t_{i,j}, t_{i,j} + X_{i,j}]$ (this instruction replaces the velocities simulated at Step 3 with zeros at all times when the turbine is not working because of failures and repairs).
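The algorithm above can be sketched for a single turbine as follows. Several choices here are assumptions rather than details taken from the text: repair times are measured in six-minute time steps, each failure time is drawn uniformly between the end of the previous repair and the horizon, and the parameter values in the example are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw a Poisson variate with mean lam (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def apply_failures(speeds, lam_failures, lam_repair, rng):
    """Return a copy of speeds with zeros during repairs, and the number
    of failures that effectively occurred within the horizon."""
    horizon = len(speeds)
    out = list(speeds)
    earliest = 0  # first time step at which the next failure may start
    effective = 0
    for _ in range(sample_poisson(lam_failures, rng)):
        if earliest >= horizon:
            break  # no room left for further failures before the horizon
        start = rng.randrange(earliest, horizon)  # assumed uniform position
        repair = sample_poisson(lam_repair, rng)  # repair time in time steps
        end = min(start + repair, horizon)
        for t in range(start, end):
            out[t] = 0.0
        earliest = end
        effective += 1
    return out, effective

# Hypothetical example: constant 5 m/s wind over 1000 time steps.
modified, n_effective = apply_failures([5.0] * 1000, 3.0, 20.0, random.Random(7))
print(n_effective, modified.count(0.0))
```

Applying the function independently to each of the simulated wind speed vectors gives the modified vectors used in the computation of the energy production.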
3.6 Step 6: computation of the energy production
As the last step we compute the total energy produced by the wind farm using the vectors of velocities modified in Step 5. Each vector of velocities (one for each wind turbine) is converted into energy production by using the power curve that characterizes the wind turbine. In this way we obtain the energy produced in each period by each turbine; summing over all periods and turbines then gives the energy produced by the wind farm. It should be remarked that the total energy is computed from dependent vectors of wind speed whose marginal distribution appropriately accounts for extreme events, together with a model of failure of the wind turbines. In Section 4.4 we describe in more detail the main steps of the conversion of wind speed into energy produced by the wind turbine.
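In symbols, the computation described above can be written as follows. The notation is an assumption of this note: $P(\cdot)$ stands for the interpolated power curve in kW, $v_i(t)$ for the modified wind speed of turbine $i$ at period $t$, and $\Delta t$ for the length of one sampling period.

```latex
E_{\text{farm}} = \sum_{i=1}^{n} \sum_{t=1}^{T} P\big(v_i(t)\big)\, \Delta t,
\qquad \Delta t = 6\ \text{min} = 0.1\ \text{h}
% Upper bound per period for 10 turbines at the Table 1 maximum of 335 kW:
% 10 \times 335\ \text{kW} \times 0.1\ \text{h} = 335\ \text{kWh}
```

With power in kW and $\Delta t$ in hours, the result is expressed in kWh.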
4 Results
In this section we will apply the methodology previously explained to the specific wind speed dataset introduced in section 2.
4.1 Parameter estimation of the PDF
The proposed procedure rests on the possibility of modeling the PDF of wind speed with two different distributions. While the WD is used to model the first part of the empirical PDF, a GPD is used for the right tail. The first thing to do in order to apply this methodology is therefore to choose a wind speed threshold below which we use the WD and above which we use the GPD. We fit a WD to our experimental distribution by maximum likelihood estimation and use a QQ plot (see Figure 3) to check where there is a statistically significant departure from the straight line. We recall that a QQ plot shows the quantiles of one distribution as a function of the quantiles of another distribution. If the plot gives a straight line with a slope of 45°, the two distributions are practically the same, meaning that the theoretical distribution fits the experimental distribution well. A departure from the straight line instead indicates imperfect agreement between the two distributions. In our case (Figure 3, panel a) the WD is not suited to representing wind speeds greater than 20 m/s, so above this threshold we fit the empirical distribution with a GPD. From the other panel of the same figure, where the WD is used to model wind speed up to the threshold and the GPD above it, one can notice the improvement in the fit of the extreme values of the wind speed distribution with respect to the simple WD.
Another comparison between the two distributions can be made by plotting their cumulative distribution functions. In Figure 4, upper panel, we show the empirical cumulative distribution function of the real data (blue line), the Weibull (dashed black line) and the WGPD (red dotted line) cumulative distribution functions. As can be seen in the lower panel of Figure 4, the WGPD is closer to the empirical cumulative distribution function than the simple Weibull. This figure highlights that modeling the wind speed with a WD underestimates the probabilities of events over a given threshold. As we will show later, this implies that a model based on the WD underestimates the energy produced by a wind farm because it underestimates high values of the wind speed.
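The points of a QQ plot against a fitted Weibull distribution can be computed as in the sketch below. The plotting positions $(i + 0.5)/m$ for a sample of size $m$, the function names and the parameter values are assumptions made for illustration; the fitted Weibull parameters of the paper are not used here.

```python
import math

def weibull_quantile(p, k, c):
    """Quantile of a Weibull distribution with shape k and scale c."""
    return c * (-math.log(1.0 - p)) ** (1.0 / k)

def qq_points(sample, k, c):
    """Pairs (theoretical quantile, empirical quantile) for a QQ plot."""
    ordered = sorted(sample)
    m = len(ordered)
    return [
        (weibull_quantile((i + 0.5) / m, k, c), x)
        for i, x in enumerate(ordered)
    ]

# A sample lying exactly on the Weibull quantiles falls on the 45 degree line.
ideal = [weibull_quantile((i + 0.5) / 50, 2.0, 6.0) for i in range(50)]
# Doubling the largest value mimics a tail heavier than the Weibull one.
heavy = ideal[:-1] + [2.0 * ideal[-1]]
print(qq_points(heavy, 2.0, 6.0)[-1])
```

Points whose empirical quantile lies above the theoretical one in the upper range indicate a tail heavier than the fitted Weibull, which is the kind of departure used here to select the threshold.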
4.2 Copulas application and Monte Carlo simulation
In the application tested here we consider a wind farm composed of 10 turbines. The correlations between the wind turbines that compose the wind farm are taken into account through copulas. We consider two kinds of copulas, Gaussian and Student’s t, the latter with different values of the degrees of freedom (DOF). We consider constant values of the correlation matrix between the turbines and we study how the production of energy changes as the correlation coefficients vary. We also tested the sensitivity of the results to the chosen copula and to the DOF in the case of the Student’s t copula. The algorithm used for the Monte Carlo simulation is described in section 3.
4.3 Wind turbine failures application
Wind turbine failures can interrupt energy production in a wind farm for many reasons. In figure 5 we summarize these kinds of failures and for each of them we show the annual frequency and the time necessary for repair. Data are taken from Tavner et al. (2010) and refer to a wind farm composed of 69 wind turbines, located in Ormont (Germany), of the same type as the one chosen for our analysis (see section 2).
As mentioned in section 3, we assume that the number of wind turbine failures per year follows a Poisson distribution whose parameter is the average number of wind turbine failures per year. Moreover, we also assume that the repair time follows a Poisson distribution whose parameter is the average of the data of figure 5, upper panel. In this way we can simulate the total time of no production by using the compound Poisson model.
4.4 Energy conversion and sensitivity analysis
At this point of the simulation we have 10 synthetic wind speed time series, one for each wind turbine, with a length equal to the number of years that we want to simulate (here 10), correlated through the copula and containing randomly placed periods of no production. Wind speed is converted into energy by using the power curve described in Section 2.2. In figure 6 we show the probability of producing a given energy every six minutes as a function of the correlation coefficient, using a Gaussian copula. As can be seen, for low values of the correlation coefficient the probability distribution approaches a WD. For high values of the correlation coefficient it instead assumes the common form of the energy production of a single wind turbine, where the most likely values correspond to the rated power and to the absence of wind.
In Figure 7 we show the probability of producing energy below a certain percentage of the rated power of the wind farm as a function of the correlation coefficient and of the DOF of the copula. The rated power of a wind farm is equal to the rated power of the wind turbine multiplied by the number of wind turbines that compose the wind farm; in this case it is equal to 3300 kW. In particular, the six panels show the probability of producing below 1%, 5%, 10%, 25%, 50% and 100% of the rated power, respectively. As is evident from the panels of figure 7, the probability varies greatly with the correlation coefficient and depends only slightly on the chosen copula. Another important result is that the probability of producing less than a given threshold (at least when this threshold is below 50% of the rated power) is much higher for correlated turbines than for uncorrelated ones. The results are opposite when the threshold is above 50% of the rated power.
In order to highlight this phenomenon we fix the threshold at 25% of the rated power and, for different values of the correlation coefficient, we plot the probability as a function of the DOF of the copula. Results are shown in figure 8, where it is possible to see that the probability varies only slightly as the DOF of the copula increase.
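The probabilities discussed in this sensitivity analysis can be estimated from a simulated series of farm power as in the sketch below. The function name, the use of a strict inequality and the toy input series are assumptions made for illustration; the rated power of 3300 kW is the value given above.

```python
def prob_below(farm_power_kw, fraction, rated_kw=3300.0):
    """Empirical probability that the farm power is strictly below
    the given fraction of the rated power."""
    limit = fraction * rated_kw
    below = sum(1 for p in farm_power_kw if p < limit)
    return below / len(farm_power_kw)

# Toy series of farm power values in kW, one per six-minute period.
toy_series = [0.0, 100.0, 500.0, 3300.0]
print(prob_below(toy_series, 0.25))
```

Evaluating this function on series simulated with different correlation coefficients and copulas would give one point of each curve in the sensitivity analysis.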
4.5 Energy production: comparison with simple Weibull based model
Lastly, we compared the energy production estimated with the proposed method and the energy production estimated with a simple Weibull model for wind speed. As was clear from the discussion of figure 4, the WD underestimates the probability of wind speed exceeding a given threshold. This implies an underestimation of the energy produced by the entire wind farm. We simulate 10 distinct time series of wind speed (corresponding to the 10 wind turbines composing the wind farm) with a length of 10 years. We transform the wind speed into energy produced and then into Euro at a price of 0.3 per unit of energy. Finally, we consider the difference between the energy estimated with the model presented in this paper and with the WD. This simulation was repeated 1000 times and the results are plotted in figure 9. The figure shows that a simple WD model underestimates production by about 1.29 million Euro on average over 10 years. This result is all the more interesting considering that the new procedure also takes into account the loss of energy due to wind turbine failures.
5 Discussion and conclusion
The goal of this paper is to propose a procedure that accurately estimates the possible losses generated by periods of no production caused by wind turbine failures and, at the same time, fits well the high values of wind speed, which can lead either to no production or to high production of energy, depending on whether or not the cut-off wind speed of the wind turbine is exceeded. Moreover, the copula appears to be a good instrument for taking into account the relationship between the wind turbines of the farm. The sensitivity analysis showed that the energy production of the wind farm is influenced by the DOF of the copula but, above all, by the correlation coefficients.
Future work will focus on using the proposed procedure to evaluate new wind farm sites, on applying it to real cases of investment in wind energy production, and on extending the analysis to the case of a stochastic evolution of the wind speed process.
References
 Akdağ, S. A., Dinler, A., 2009. A new method to estimate Weibull parameters for wind energy applications. Energy Conversion and Management 50 (7), 1761–1766.
 Chang, T. P., 2011. Performance comparison of six numerical methods in estimating Weibull parameters for wind energy application. Applied Energy 88 (1), 272–282.
 D’Amico, G., Petroni, F., Prattico, F., 2013a. First and second order semi-Markov chains for wind speed modeling. Physica A: Statistical Mechanics and its Applications 392, 1194–1201.
 D’Amico, G., Petroni, F., Prattico, F., 2013b. Forecasting wind speed financial return. arXiv preprint arXiv:1312.3895.
 D’Amico, G., Petroni, F., Prattico, F., 2013c. Reliability measures for indexed semi-Markov chains applied to wind energy production. arXiv preprint arXiv:1311.6585.
 D’Amico, G., Petroni, F., Prattico, F., 2013d. Reliability measures of second-order semi-Markov chain applied to wind energy production. Journal of Renewable Energy, Article ID 368940.
 D’Amico, G., Petroni, F., Prattico, F., 2013e. Wind speed modeled as an indexed semi-Markov process. Environmetrics 24, 367–376.
 D’Amico, G., Petroni, F., Prattico, F., 2014a. Performance analysis of second order semi-Markov chains: An application to wind energy production. Methodology and Computing in Applied Probability, 1–14.
 D’Amico, G., Petroni, F., Prattico, F., 2014b. Wind speed and energy forecasting at different time scales: A nonparametric approach. Physica A: Statistical Mechanics and its Applications 406, 59–66.
 Hau, E., Von Renouard, H., 2013. Wind turbines: fundamentals, technologies, application, economics. Springer.
 Holmes, J., Moriarty, W., 1999. Application of the generalized Pareto distribution to extreme value analysis in wind engineering. Journal of Wind Engineering and Industrial Aerodynamics 83 (1), 1–10.
 Morgan, E. C., Lackner, M., Vogel, R. M., Baise, L. G., 2011. Probability distributions for offshore wind speeds. Energy Conversion and Management 52 (1), 15–26.
 Tavner, P., Gindele, R., Faulstich, S., Hahn, B., Whittle, M., Greenwood, D., 2010. Study of effects of weather & location on wind turbine failure rates. In: Proceedings of the European Wind Energy Conference EWEC. Vol. 20.
 Tavner, P., Xiang, J., Spinato, F., 2007. Reliability analysis for wind turbines. Wind Energy 10 (1), 1–18.
 Tse, Y.-K., 2009. Nonlife actuarial models: theory, methods and evaluation. Cambridge University Press.
 Van de Vyver, H., Delcloo, A., 2011. Stable estimations for extreme wind speeds. An application to Belgium. Theoretical and Applied Climatology 105 (3–4), 417–429.
 Weisser, D., 2003. A wind energy analysis of Grenada: an estimation using the ‘Weibull’ density function. Renewable Energy 28 (11), 1803–1812.
 Zachary, S., Feld, G., Ward, G., Wolfram, J., 1998. Multivariate extrapolation in the offshore environment. Applied Ocean Research 20 (5), 273–295.