Forecasting Spatio-Temporal Renewable Scenarios: a Deep Generative Approach
Abstract
The operation and planning of large-scale power systems are becoming more challenging with the increasing penetration of stochastic renewable generation. To minimize decision risks in power systems with a large amount of renewable resources, there is a growing need to model short-term generation uncertainty. By producing a group of possible future realizations for a given set of renewable generation plants, the scenario approach has become a popular way to model renewable uncertainty. However, due to the complex spatial and temporal correlations underlying renewable generation, traditional model-based approaches for forecasting future scenarios often require extensive knowledge, and the fitted models are often hard to scale. To address these modeling burdens, we propose a learning-based, data-driven scenario forecasting method based on generative adversarial networks (GANs), a class of deep-learning generative algorithms used for modeling unknown distributions. We first use an improved GAN with convergence guarantees to learn the intrinsic patterns and model the unknown distributions of (multiple-site) renewable generation time series. Then, by solving an optimization problem, we generate forecasted scenarios without restrictions on the number of scenarios or the forecasting horizon. Our method is entirely model-free and can forecast scenarios under different levels of forecast uncertainty. Extensive numerical simulations using real-world data from the NREL wind and solar integration datasets validate the performance of the proposed method in forecasting both wind and solar power scenarios.
I Introduction
Decision-making under uncertainty has long been a challenging problem for power system engineers. There are no perfect forecasts, and thus consideration of generation uncertainties is a necessity for reliable, efficient power system operation planning [1]. To accommodate the higher penetration of renewable energy such as wind, solar and hydro power, there is an urgent need for more accurate, scalable and efficient approaches to model short-term renewable generation uncertainties [2]. Handling the uncertainty of renewable generation is key to increasing economic benefits and enforcing reliability criteria in decision-making. One widely used approach to capture the uncertainties in renewable resources is to model the future by generating a set of possible future time series called scenarios. Compared to other uncertainty modeling techniques such as probabilistic forecasts and quantile forecasts, scenarios reflect the joint distributions of renewable generation at different locations and varying lead times [3]. The scenario approach has played an important role in a series of stochastic and robust optimization problems such as unit commitment, energy trading strategy, and storage sizing [4, 5, 6].
Despite such promise and wide application, generating scenarios that accurately inform system operators about power generation uncertainties remains a challenging problem. One of the biggest challenges of scenario generation is the difficulty of modeling or learning the unknown stochastic processes that drive renewable power generation. Researchers have conducted extensive research on probabilistic and statistical models for scenario generation. In [3, 7, 8], Gaussian copula or pair-copula models are used to generate statistical scenarios that account for both the interdependence structure of prediction errors and the predictive distributions from wind power probabilistic forecasting. An autoregressive moving average (ARMA) model along with Monte Carlo simulation is used to generate wind power scenarios in [9]. For generating spatially correlated scenarios, time series models [10, 11] are used to produce a set of plausible scenarios characterizing the uncertainty associated with wind speed at different geographic sites.
However, most of the approaches mentioned above require a large amount of site-specific modeling knowledge on renewable resources. Moreover, the model assumptions on future power generation, e.g., multidimensional Gaussian assumptions over forecast horizons, normally do not hold and vary by location and time. The intermittent and time-varying nature of renewables, together with the complex spatial and temporal interactions, makes most of these methods difficult to apply and hard to scale in practice. The generated scenarios may therefore fail to represent the intrinsic patterns and realistic time series of real historical observations of renewable energy resources.
Instead of modeling the stochastic processes explicitly using statistical models, data-driven methods have also been considered to represent the complex dynamics of renewable generation processes [12, 13]. In [14], a scenario generation methodology based on artificial neural networks (ANNs) is proposed that first obtains accurate point forecasts, yet such a supervised approach relies heavily on the Gaussian assumption and on statistical information about forecast errors.
On the other hand, generative models have been used to directly learn the underlying distribution of renewable generation processes. Generative models are trained to learn the true data distribution during the training phase. The learning process is unsupervised, and once the model is trained, it can efficiently generate data that has a similar distribution to the original data [15]. In [16], a variational autoencoder (VAE) is adopted to generate scenarios for wind and PV power. In [17, 18], the authors first proposed using generative adversarial networks (GANs) [19], a model-free, data-driven and scalable approach, to generate renewable scenarios with deep generative models. However, that method cannot incorporate forecast information and cannot be applied to future scenario generation. A follow-up work [20] proposed to generate a group of future realizations based on trained GANs, yet generating scenarios for multiple spatially correlated sites was not discussed.
Due to the correlations of meteorological conditions, renewable generation outputs at different locations exhibit certain unknown, hard-to-model correlations. The spatial dependence and the temporal correlations are both essential for joint uncertainty modeling, especially for power flow optimization and transmission risk assessment. In this paper, we follow the line of deep generative models and propose a novel method based on GANs to directly generate future scenarios for multiple-site renewable power generation. GANs are composed of two deep neural networks: a generator network that tries to generate realistic samples, and a discriminator network that tries to discriminate between input samples. In the training process, the generator network tries to "fool" the discriminator network by generating realistic samples, while the discriminator network tries to distinguish the real training data from the output of the generator network. The joint training of these two networks forms a minimax game. GANs belong to the category of unsupervised learning models, and trained generators can generate realistic scenarios resembling renewable power generation samples. We observe the training instability and low generation quality problems, and propose techniques to improve GAN training.
Our method for scenario forecasting contains two steps. In the first step, we train GANs on historical observations of renewable generation. Once training is completed, we formulate an optimization problem based on given point forecasts and optimize over the noise vectors to find future scenarios among the trained generator's outputs. The only information we need is a historical dataset comprising the power generation of the target sites, along with any available point forecasts. The proposed approach is free of any statistical assumptions and can forecast scenarios without relying on any sampling techniques. Fig. 1 illustrates the algorithmic framework of our proposed method.
Specifically, we make the following contributions:

Based on any provided point forecasts, our method is able to generate short-term forecasting scenarios with a high level of flexibility in the number of renewable generation sites and scenarios as well as the length of the forecast horizon. To our knowledge, this is the first work that utilizes deep generative models for forecasting spatio-temporal scenarios.

The improved generative model avoids the problems of exploding or vanishing gradients during training and also achieves faster convergence, which reduces the training time of the generative models.

Scenarios forecasted by our method not only represent a group of future realizations, but also reflect the intrinsic, complex spatial and temporal patterns of the renewable generation processes. Extensive numerical simulations validate the performance of the proposed method on varying scenario forecasting tasks.
The rest of this paper is organized as follows. Section II describes the model setup for GANs along with improved training techniques. In Section III we describe our proposed method for using GANs to forecast spatio-temporal scenarios. Section IV provides the model structure and training algorithms of GANs for renewable scenario generation. In Section V, numerical simulations are conducted to validate the proposed technique for forecasting wind and solar scenarios through a comprehensive analysis covering both a single site and spatially correlated multiple sites. Concluding remarks are made in Section VI.
II Data-Driven Generative Model
In this section, we describe the formulation and training techniques for GANs, and illustrate how to utilize GANs as an efficient module for scenario generation. Improved training techniques for GANs, including using the Wasserstein distance as the loss metric and enforcing Lipschitz constraints and continuity conditions, are discussed.
II-A Wasserstein GAN (WGAN)
GANs use deep learning to learn an unknown target distribution. A GAN can be regarded as a two-player zero-sum game between two interconnected neural networks, the generator $G$ and the discriminator $D$, under the adversarial learning framework. Given a training set of historical renewable generation data, the generator's goal is to find a function that transforms a sample from a known noise distribution into a sample following the same distribution as the historical observations. The discriminator's goal is to distinguish whether its input comes from the generator or from real historical samples. When the adversarial networks are trained to an equilibrium, the discriminator can no longer distinguish between generated and historical data, meaning that the generator produces realistic samples as if they came from the true distribution.
Suppose the distribution of the historical data is represented by the probability density function $\mathbb{P}_X$, and a noise vector $Z$ is sampled from a given Gaussian distribution $\mathbb{P}_Z$ with known mean and variance. (In training implementations, any known, easy-to-sample distribution can be used for GAN training.) As in any training procedure for neural network models, we need to define loss functions to guide the parameter updates.
We first formulate the loss function $L_G$ used to update the parameters of the generator $G$. During training, a batch of noise samples drawn from the distribution $\mathbb{P}_Z$ is fed to $G$, which outputs newly generated data following the generated samples' distribution $\mathbb{P}_G$. A small $L_G$ can be achieved by maximizing $\mathbb{E}_{Z}[D(G(Z))]$, which indicates that the generated samples from $\mathbb{P}_G$ look like real samples from the discriminator's perspective. Following this guideline, the loss function is defined as
(1) $L_G = -\mathbb{E}_{Z \sim \mathbb{P}_Z}\left[D(G(Z))\right]$
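The discriminator $D$ takes input samples coming either from the generator $G$ or from real historical data. It is trained alternately with the generator. During training, the discriminator's goal is to distinguish between the historical data distribution $\mathbb{P}_X$ and the generated distribution $\mathbb{P}_G$, in other words, to maximize the difference between $\mathbb{E}_{X}[D(X)]$ and $\mathbb{E}_{Z}[D(G(Z))]$. To update the parameters of $D$, the loss function can be similarly defined by
(2) $L_D = -\mathbb{E}_{X \sim \mathbb{P}_X}\left[D(X)\right] + \mathbb{E}_{Z \sim \mathbb{P}_Z}\left[D(G(Z))\right]$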
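As for the adversarial training of the two interconnected neural networks, the discriminator outputs a continuous value to score its input samples. For a given discriminator $D$, the generator $G$ maximizes the output $D(G(Z))$, which minimizes $L_G$, while the discriminator wants to minimize $D(G(Z))$ for generated samples and maximize $D(X)$ for real samples. With the two loss functions $L_G$ and $L_D$ defined, we can then formulate the two-player game with a value function $V(G, D)$:
(3) $\min_{G} \max_{D} V(G, D) = \mathbb{E}_{X \sim \mathbb{P}_X}\left[D(X)\right] - \mathbb{E}_{Z \sim \mathbb{P}_Z}\left[D(G(Z))\right]$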
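As a minimal numerical sketch of (1)-(3), assuming the standard WGAN forms $L_G = -\mathbb{E}[D(G(Z))]$, $L_D = -\mathbb{E}[D(X)] + \mathbb{E}[D(G(Z))]$ and $V = \mathbb{E}[D(X)] - \mathbb{E}[D(G(Z))]$, the losses can be evaluated from batches of discriminator scores. The function names and the toy scores are illustrative assumptions, not values from the experiments.

```python
from statistics import fmean

def generator_loss(fake_scores):
    """L_G = -E_Z[D(G(Z))]: the generator wants high discriminator scores."""
    return -fmean(fake_scores)

def discriminator_loss(real_scores, fake_scores):
    """L_D = -E_X[D(X)] + E_Z[D(G(Z))]."""
    return -fmean(real_scores) + fmean(fake_scores)

def value_function(real_scores, fake_scores):
    """V(G, D) = E_X[D(X)] - E_Z[D(G(Z))], the payoff of the minimax game."""
    return fmean(real_scores) - fmean(fake_scores)

# Hypothetical discriminator outputs for one batch.
real_scores = [0.9, 0.7, 0.8]   # D(x) on historical samples
fake_scores = [0.2, 0.4, 0.3]   # D(G(z)) on generated samples

print(generator_loss(fake_scores))
print(discriminator_loss(real_scores, fake_scores))
print(value_function(real_scores, fake_scores))
```

Under these forms, minimizing $L_D$ is the same as maximizing $V$, which is why the discriminator loss can double as a training progress indicator.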
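The minimax objective (3) can be theoretically interpreted as the dual of the Wasserstein distance between the historical data distribution $\mathbb{P}_X$ and the generated distribution $\mathbb{P}_G$. The original form of the Wasserstein distance is defined as follows:
(4) $W(\mathbb{P}_X, \mathbb{P}_G) = \inf_{\gamma \in \Pi(\mathbb{P}_X, \mathbb{P}_G)} \mathbb{E}_{(x, y) \sim \gamma}\left[\lVert x - y \rVert\right]$
where $\Pi(\mathbb{P}_X, \mathbb{P}_G)$ denotes the set of all joint distributions whose marginals are $\mathbb{P}_X$ and $\mathbb{P}_G$, and the infimum requires finding the joint distribution with the smallest expected distance between $x$ and $y$. Yet directly minimizing (4) is impractical for generated samples parameterized by the neural network $G$. By borrowing the Kantorovich-Rubinstein duality [21], we can alternatively optimize over the dual form of the Wasserstein distance:
(5) $W(\mathbb{P}_X, \mathbb{P}_G) = \sup_{\lVert f \rVert_L \le 1} \mathbb{E}_{x \sim \mathbb{P}_X}\left[f(x)\right] - \mathbb{E}_{z \sim \mathbb{P}_Z}\left[f(G(z))\right]$
where the supremum is taken over all 1-Lipschitz functions $f$; in (3), the discriminator $D$ plays the role of $f$.
The Wasserstein distance is also known as the Earth-Mover (EM) distance [21], which indicates how much "mass" needs to be transported from one distribution to the target distribution. In terms of model training, this distance is a better indicator of the distance between generated samples and real samples than other standard measures (e.g., Jensen-Shannon divergence, Kullback-Leibler divergence). We show training convergence results and high-quality generated renewable scenarios in later sections.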
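To make the "mass transport" reading concrete, the following sketch computes the empirical Wasserstein-1 distance in the special case of two equally sized one-dimensional samples, where the optimal transport plan simply pairs the sorted values. This is only an illustration of the primal definition (4); the function name and the sample values are assumptions, and the GAN itself works with the dual form on high-dimensional data.

```python
def wasserstein_1d(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.

    In one dimension the optimal coupling matches the i-th smallest value of
    xs with the i-th smallest value of ys, so the infimum over couplings
    reduces to a mean absolute difference of sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)

historical = [0.10, 0.35, 0.60, 0.80]   # hypothetical normalized power values
generated = [0.90, 0.20, 0.70, 0.45]    # the same values shifted by 0.1, shuffled

print(wasserstein_1d(historical, generated))
```

Every unit of mass has to move by 0.1 here, so the distance is approximately 0.1 regardless of the order in which the generated samples are listed.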
II-B Improved Training Techniques
WGAN uses the Wasserstein distance to measure the difference between the distribution of generated scenarios and the historical data distribution, which theoretically addresses the problem of training convergence. The following techniques improve the constraints imposed on the model weights, leading to quicker convergence and better scenario generation performance.
II-B1 Enforcing Lipschitz Constraints
We follow the improved strategy proposed in [22] for imposing the Lipschitz constraint. Inspired by the fact that the optimal discriminator has unit gradient norm almost everywhere under $\mathbb{P}_X$ and $\mathbb{P}_G$, the gradient penalty is given by
(6) $GP = \mathbb{E}_{\hat{x}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]$
where $\hat{x} = \epsilon x + (1 - \epsilon) G(z)$ for $\epsilon \sim U[0, 1]$. Given that enforcing the unit gradient norm constraint everywhere is intractable, penalizing the gradient norm at these interpolated points is an effective alternative for model training.
The gradient penalty term GP performs better than standard weight clipping for enforcing the Lipschitz constraint. The modified loss function stabilizes GAN training over a wide range of architectures with almost no hyperparameter tuning and can generate higher-quality samples on different datasets [22].
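The gradient penalty can be sketched numerically as follows, assuming the form $GP = \mathbb{E}[(\lVert \nabla D(\hat{x}) \rVert_2 - 1)^2]$ evaluated at random interpolations between real and generated samples. The two linear critics, the sample batches and the finite-difference gradient are illustrative assumptions; an actual implementation would differentiate the discriminator network automatically.

```python
import math
import random

def numerical_gradient(f, x, h=1e-5):
    """Central-difference gradient of a scalar function f at the point x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def gradient_penalty(critic, real_batch, fake_batch, rng):
    """Monte Carlo estimate of E[(||grad D(x_hat)||_2 - 1)^2]."""
    total = 0.0
    for x, g in zip(real_batch, fake_batch):
        eps = rng.random()                                   # eps ~ U[0, 1]
        x_hat = [eps * a + (1 - eps) * b for a, b in zip(x, g)]
        grad = numerical_gradient(critic, x_hat)
        norm = math.sqrt(sum(d * d for d in grad))
        total += (norm - 1.0) ** 2
    return total / len(real_batch)

# Hypothetical linear critics: their gradient is the weight vector everywhere.
def unit_critic(x):
    return 0.6 * x[0] + 0.8 * x[1]      # gradient norm 1, so no penalty

def steep_critic(x):
    return 3.0 * x[0] + 4.0 * x[1]      # gradient norm 5, so penalty (5 - 1)^2

rng = random.Random(0)
real = [[0.2, 0.5], [0.7, 0.1]]          # hypothetical historical samples
fake = [[0.4, 0.3], [0.6, 0.9]]          # hypothetical generated samples

print(gradient_penalty(unit_critic, real, fake, rng))
print(gradient_penalty(steep_critic, real, fake, rng))
```

The penalty is two-sided: it grows both when the critic is too steep and when it is too flat along the lines connecting real and generated data.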
II-B2 Enforcing Continuity Conditions
Since the gradient term can only be penalized at sampled data points during training, a large part of the data space is never sampled at all. In addition, the output of the generator is significantly different from the actual data at the start of training, so the 1-Lipschitz constraint is not enforced around the real data until the distributions $\mathbb{P}_X$ and $\mathbb{P}_G$ are close enough to each other. To overcome these issues, an additional consistency term (CT) is proposed to improve the training [23]. Instead of focusing on particular sampled data points, a region around the real data manifold is considered. In particular, two perturbed data points $x'$ and $x''$ near an observed real data point $x$ are used to check the continuity condition. The discriminator is Lipschitz continuous if there exists a constant $M \ge 0$ such that for any inputs $x'$, $x''$, we have
(7) $d\left(D(x'), D(x'')\right) \le M \cdot d(x', x'')$
where the perturbations $x'$ and $x''$ are realized by applying stochastic dropout to the hidden layers of the discriminator. GAN performance can be further improved by also controlling the output of the second-to-last layer of the discriminator, denoted $D_{-}$. The final consistency regularization takes the following form to penalize violations of (7):
(8) $CT = \mathbb{E}_{x \sim \mathbb{P}_X}\left[\max\left(0,\; d\left(D(x'), D(x'')\right) + 0.1 \cdot d\left(D_{-}(x'), D_{-}(x'')\right) - M'\right)\right]$
where $M'$ is a bounded constant.
In summary, the gradient penalty term GP (6) enforces continuity over the points sampled between the real and generated points, while the consistency term CT (8) complements it by focusing on the region around the real data manifold. These two terms can therefore be used together to improve the training of GANs. Putting everything together, the improved objective function for WGAN training can be expressed as
(9) $L = \mathbb{E}_{z \sim \mathbb{P}_Z}\left[D(G(z))\right] - \mathbb{E}_{x \sim \mathbb{P}_X}\left[D(x)\right] + \lambda_1 GP + \lambda_2 CT$
where $\lambda_1$ and $\lambda_2$ are used to balance the weights of the loss terms.
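The following sketch shows how the consistency term and the combined objective could be evaluated once the discriminator outputs are available. It assumes the hinge form of CT with an $\ell_2$ distance and a 0.1 weight on the second-to-last-layer features, as in [23]; the default weights `lam1=10.0` and `lam2=2.0`, the function names and all numbers are illustrative assumptions rather than settings reported in this paper.

```python
import math
from statistics import fmean

def l2(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def consistency_term(out1, out2, feat1, feat2, m_prime=0.0, feat_weight=0.1):
    """Per-sample CT for two dropout passes over the same real sample.

    out1, out2   : scalar discriminator outputs D(x'), D(x'')
    feat1, feat2 : second-to-last layer activations for the two passes
    """
    gap = abs(out1 - out2) + feat_weight * l2(feat1, feat2)
    return max(0.0, gap - m_prime)

def improved_objective(real_scores, fake_scores, gp, ct, lam1=10.0, lam2=2.0):
    """L = E[D(G(z))] - E[D(x)] + lam1 * GP + lam2 * CT."""
    return fmean(fake_scores) - fmean(real_scores) + lam1 * gp + lam2 * ct

# Two hypothetical dropout passes that disagree noticeably.
ct = consistency_term(0.50, 0.20, [1.0, 0.0], [0.0, 0.0], m_prime=0.1)
loss = improved_objective([0.8, 0.8], [0.3, 0.3], gp=0.02, ct=ct)
print(ct, loss)
```

When the two passes agree to within the margin `m_prime`, the hinge is inactive and only the Wasserstein and gradient penalty terms contribute.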
GANs provide a strong model-free, data-driven model for generating realistic scenarios. Based on the required forecast horizon and number of renewable generation plants, we can feed GANs with historical renewable generation data of the corresponding dimensions, and then train the generator and discriminator simultaneously to capture the data distribution of historical observations using (9). Once trained, we can sample from the generator to obtain realistic samples resembling those drawn from the joint distribution of renewable generation time series with predefined timing and locational information.
An important benefit of using WGAN is that the training evolution can be continuously evaluated through the output of the discriminator, which provides a useful training and evaluation signal for system operators in real applications. To show that our method preserves this property, we train WGAN on the NREL renewable integration dataset and plot the convergence curves of the discriminator's value functions in Fig. 2. The red curves are evaluated on the training set and the orange ones on the test set. We observe that the results on the test set consistently follow almost the same trend as those on the training set for both wind and solar power, demonstrating that the generative models are well trained. Once training is completed, we obtain an optimal generator that can capture the underlying spatio-temporal correlations in renewable generation data. In the next section, we formally formulate the scenario forecasting problem, where we use the pretrained GANs to generate a set of future renewable scenarios based on point forecasts.
III Scenario Forecasts via GANs
By training GANs to convergence, we obtain a powerful generator that models the complex spatial and temporal relations existing in multiple-site renewable generation. However, a more challenging problem is to generate multiple feasible future scenarios. We will show that by formulating scenario forecasting as an optimization problem, we can combine the fitted GANs with any given point forecast method to generate scenarios without limits on dimension or number.
Since single-site scenario forecasting is a special case of the multiple-site problem, we give the general problem formulation for multiple sites here. For a group of renewable power generation sites, assume that at time step $t$ we have some forecasting method that provides point forecasts $\hat{p}^{(i)}_{t+k}$ for each power generation site $i = 1, \dots, N$ and each look-ahead time $k = 1, \dots, K$. The point forecasts are given as
$\hat{P} = \left[\hat{p}^{(i)}_{t+k}\right]_{i = 1, \dots, N;\; k = 1, \dots, K}$
where $N$ denotes the number of sites and $K$ denotes the forecasting horizon. In this paper, we focus on the scenario forecasting problem, so the method is flexible enough to work with any point forecast method, e.g., an ARIMA model or information from numerical weather prediction (NWP). Given some input noise $z$, the pretrained generator $G$ produces a possible realization $G(z)$ without regard to the forecast information $\hat{P}$. Based on the generator and the point forecasts, we are interested in forecasting a group of scenarios that represents the uncertainty of renewable generation. Meanwhile, these scenarios shall accurately reflect the temporal and spatial dynamics of future generation.
Since the point forecasts only provide deterministic information about future renewable generation, we use the notion of a prediction interval to indicate the region around $\hat{P}$ in which the generated scenarios should lie [20, 24]. We describe this interval with an upper bound and a lower bound:
(10) 
where the hyperparameter can be interpreted as the prediction confidence or prediction interval.
Since the forecasted scenarios should reflect the forecast information around the point forecast $\hat{P}$, we can first obtain a good starting point for the noise vector $z$ by solving the following problem:
(11)  
(11a)  
(11b) 
where is randomly sampled from an initial fluctuation interval . Note that our goal is to forecast scenarios that not only represent the uncertainty of future time steps, but also form realistic time series that capture the intrinsic patterns of renewable energy sources at different prediction horizons. According to the loss defined in (3), a larger discriminator output indicates more realistic samples. To ensure that the scenarios generated with the pretrained generator are realistic, we use the discriminator output $D(G(z))$ as the objective. Meanwhile, we want to constrain the generated scenarios to a predetermined confidence interval according to the actual needs of risk management. Using the objectives above and the pretrained model $G$, the scenario forecasting problem can be formulated as a constrained optimization problem over the noise vector $z$:
(12) $\max_{z} \; D(G(z))$
(12a) $\text{s.t.} \;\; G(z) \le \overline{P}$
(12b) $\phantom{\text{s.t.} \;\;} G(z) \ge \underline{P}$
where $\overline{P}$ and $\underline{P}$ denote the upper and lower bounds of the prediction interval in (10).
To ensure that we always obtain a good initial point, we set the prediction interval in (11) to be slightly smaller than the one in the main problem (12). Since both the objective and the constraints in (12) are nonconvex, we handle the inequality constraints by moving them into the main objective as two log-barrier terms. Meanwhile, because deep neural networks have multiple local optima, we can find different initial points by solving (11) multiple times, and obtain distinct forecasted scenarios by solving (12) with log barriers from each of them. As the training loss defined in (9) encourages the generator to produce diverse scenarios for different noise vectors $z$, we are able to obtain a group of distinct yet realistic scenarios that not only reflect the point forecast information, but also represent different uncertainty levels according to the actual needs of risk management.
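The log-barrier idea can be sketched on a toy problem as follows. Everything here is an assumption made for illustration: the `generator` and `critic` functions are simple stand-ins for trained networks, the symmetric interval of width 0.15 around the point forecast is an assumed shape for the prediction interval, the barrier weight `mu` is arbitrary, and finite-difference gradient ascent with step halving replaces backpropagation through the networks.

```python
import math

def generator(z):
    """Hypothetical stand-in for a trained generator: maps noise into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def critic(x):
    """Hypothetical stand-in for a trained critic: prefers smooth trajectories."""
    return -sum((b - a) ** 2 for a, b in zip(x, x[1:]))

def barrier_objective(z, lower, upper, mu):
    """D(G(z)) plus two log barriers; -inf outside the prediction interval."""
    x = generator(z)
    if any(not (lo < v < up) for v, lo, up in zip(x, lower, upper)):
        return -math.inf
    slack = sum(math.log(up - v) + math.log(v - lo)
                for v, lo, up in zip(x, lower, upper))
    return critic(x) + mu * slack

def ascend(z, lower, upper, mu=0.01, step=0.5, iters=200, h=1e-6):
    """Gradient ascent over z; a step is accepted only if the objective improves."""
    best = barrier_objective(z, lower, upper, mu)
    for _ in range(iters):
        grad = []
        for i in range(len(z)):
            zp = z[:i] + [z[i] + h] + z[i + 1:]
            zm = z[:i] + [z[i] - h] + z[i + 1:]
            grad.append((barrier_objective(zp, lower, upper, mu)
                         - barrier_objective(zm, lower, upper, mu)) / (2 * h))
        s = step
        while s > 1e-8:
            cand = [v + s * g for v, g in zip(z, grad)]
            val = barrier_objective(cand, lower, upper, mu)
            if val > best:
                z, best = cand, val
                break
            s /= 2          # halve the step until the objective improves
    return z

point_forecast = [0.30, 0.50, 0.40, 0.60]                # hypothetical forecast
lower = [p - 0.15 for p in point_forecast]               # assumed interval shape
upper = [p + 0.15 for p in point_forecast]
z0 = [math.log(p / (1 - p)) for p in point_forecast]     # G(z0) equals the forecast
scenario = generator(ascend(z0, lower, upper))
print(scenario)
```

Starting from a noise vector whose output matches the point forecast mirrors the role of the initialization problem (11), and restarting from different feasible initial points would yield different scenarios.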
IV Network Structure and Training Details
In this section, we describe the network structure used for generating multiple-site short-term scenario forecasts. Since our structure is flexible in generating scenarios for varying time horizons and numbers of power plants, it can serve as a plug-in module for generating realistic future scenarios.
Both networks in our model use deep learning to represent the temporal and geographical dependencies of scenarios. The generator network starts with a fully connected multilayer perceptron followed by 3 deconvolutional layers that upsample the input to generate renewable time series. The discriminator network has a similar but reversed structure and distinguishes historical samples from generated samples with a scalar output. A sigmoid function is used as the output activation to limit the discriminator output to the interval $(0, 1)$. ReLU and LeakyReLU activations are used in the hidden layers of the generator and the discriminator, respectively. Dropout is applied only to the output of each hidden layer of the discriminator. Batch normalization can help stabilize training in both the generator and the discriminator, but it changes the form of the discriminator's problem from mapping a single input to a single output to mapping an entire batch of inputs to a batch of outputs. Since the improved loss objective in (9) is no longer valid in this setting, we omit batch normalization in the discriminator or replace it with layer normalization in our model structure [25]. All our experiments for scenario forecasts are implemented in Python using the open-source machine learning package TensorFlow [26].
V Computational Experiments
In this section, we describe our experiments and results on two renewable datasets, for wind and solar power generation respectively. After first training the proposed GAN model on historical datasets, we fix the model weights and run our scenario forecasting algorithm. We show that the proposed method can forecast short-term scenarios for both a single site and spatially correlated multiple sites. We validate the forecasted scenarios by examining their statistical properties.
V-A Simulation Setup
To test the performance of our proposed framework for scenario forecasts, we build training and validation datasets using power generation data from the National Renewable Energy Laboratory (NREL) Wind and Solar Integration Datasets [27]. Historical power measurements have a resolution of minutes. We choose wind farms and solar power plants located in the State of Washington as the datasets for our data-driven method. Historical samples are split into a training set and a standalone validation set. In general, we randomly select of the data as training set. We also collect the corresponding 24-hour-ahead forecast data, which serve as the point forecasts later used for forecasting scenarios with the pretrained GANs. All renewable power measurements and forecasts are rescaled to .
Our scenario generation model is repeatedly fed with the historical samples until the discriminator loss converges. We keep the training until about iterations to demonstrate the training procedure is stable. The training convergence curves on the solar and wind datasets for GANs are shown in Fig. 2. The discriminator output grows initially, since the discriminator learns to distinguish samples generated by the generator from the real historical samples. The generator then gradually learns the various patterns in historical renewable data. After iterations of training, the loss function already converges to near . As the training converges, the generator is able to generate plausible power trajectories with a small and the discriminator can hardly distinguish between generated time series and real ones. Eventually, the output power scenarios of the generator can represent the stochastic processes of renewable power generation.
V-B Scenario Forecasts
The same GAN model can be used for different scenario forecasting tasks. The framework for using GANs to forecast scenarios is illustrated in Fig. 1. The proposed method contains two steps. The first step models the uncertainty and captures the data distribution of renewable resources, while in the second step, by solving the optimization problem, we can generate a large number of forecasted scenarios.
In this subsection, we validate that the proposed method can forecast scenarios for a single renewable generation site. Historical data from sites in geographical proximity are collected as input samples to represent the stochastic generation dynamics. We first show that the proposed method can generate scenarios with different levels of uncertainty for solar and wind power. Each training sample consists of two days of data. The forecasted trajectories with prediction intervals (PIs) of 1.5, 2 and 3 are shown in Fig. 3 for both the solar and wind cases, with 3 randomly selected 2-day samples. By visual inspection, we observe that the samples generated by the proposed method correctly capture the hallmark features (e.g., large peak values, daily variations, and ramp events with large fluctuations) of the solar and wind power profiles from the predicted data. By selecting different prediction intervals, the forecasted scenarios can represent different degrees of uncertainty in renewable power generation. When the interval is narrow, the generated scenarios are close to the point forecasts, yet fail to cover the realizations; when the interval is wide, the generated trajectories cover the actual power production values, but are less concentrated. We note that in the first example of wind scenario forecasts, even when the point forecasts are not accurate, a larger prediction interval helps generate diversified scenarios that cover the forecast uncertainties. The prediction interval can be tuned according to the real-world application. Meanwhile, we can further adjust the hyperparameters of the upper and lower boundaries of the log barrier in the optimization problem to generate trajectories that are more in line with actual needs.
To further verify the temporal statistics of the generated scenarios, we calculate and compare the samples' autocorrelation. The autocorrelation measures the degree of correlation of a time series between two different periods, and thus represents the temporal correlation at a renewable resource. The autocorrelation coefficient for a given time series $S$ is calculated as
(13) $R(\tau) = \dfrac{\mathbb{E}\left[(S_t - \mu)(S_{t+\tau} - \mu)\right]}{\sigma^2}$
where $\tau$ is the look-ahead time and $S$ represents generated samples or realizations with mean $\mu$ and variance $\sigma^2$.
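A sample estimate of the autocorrelation coefficient can be sketched as below, assuming the normalized form $R(\tau) = \mathbb{E}[(S_t - \mu)(S_{t+\tau} - \mu)] / \sigma^2$. The function name and the toy series are illustrative; the estimator divides by the full series length for every lag, which is one common (biased) convention.

```python
def autocorrelation(series, lag):
    """Sample estimate of R(lag) = E[(S_t - mu)(S_{t+lag} - mu)] / sigma^2."""
    n = len(series)
    mu = sum(series) / n
    var = sum((s - mu) ** 2 for s in series) / n
    # Biased estimator: the lagged products are averaged over n, not n - lag.
    cov = sum((series[t] - mu) * (series[t + lag] - mu)
              for t in range(n - lag)) / n
    return cov / var

# Hypothetical profile that alternates with period 2.
series = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

for lag in range(3):
    print(lag, autocorrelation(series, lag))
```

Applying the same function to each forecasted scenario and to the measured series gives the curves that are compared in the autocorrelation plots.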
The temporal correlations of the first two examples shown in Fig. 3 are calculated by (13), and the results are shown in Fig. 4. The scenarios' autocorrelation curves cover the autocorrelation of the measurements, indicating that the generated scenarios are able to represent the temporal dependence of real time series. The discussion of prediction intervals also carries over to the autocorrelation plots, as larger PIs lead to more diversified scenarios with a wider range of autocorrelations.
V-C Spatial Correlation
For the scenario forecasts of multiple sites, instead of feeding historical data for a single site to GANs, we feed the model with a real data matrix of size $N \times T$, where $N$ denotes the total number of generation sites and $T$ denotes the total number of time steps in each scenario. Here we choose with a resolution of hour. To further examine the correlations between individual locations, we calculate the correlations of the simulated time series and compare the values with those of the measured time series. Each row of the correlation matrix shows the correlation between that site (e.g., row 1 represents wind farm 1) and the other sites, so that the diagonal is composed of ones (the correlation of each site with itself) and the other terms are the cross-correlations between sites. A sample of the real power generation of 24 wind farms and the forecasted scenarios, along with the correlation matrices between different sites, is plotted in Fig. 5. By visual inspection we find that their dynamic behaviors are similar to each other. The spatial and temporal correlations in the real data (again, not seen in the training stage) are correctly preserved by our forecasted scenarios. From the spatial correlation coefficient colormaps of these two groups of data, we can see that the generated scenarios preserve the relative correlation between any pair of wind farms. Such results indicate that our proposed method preserves both the spatial and temporal correlations, and can be used as a plug-and-play module for multi-site planning and operation problems.
For purposes of illustration, some elements of the correlation matrix for a few sites are also shown in Fig. 6 for both the measured time series and the forecasted scenarios. Each pair of curves maintains a broadly consistent trend with respect to every other site. The results show that the scenarios generated by the proposed method agree with the ground truth, indicating that the spatial correlations between different sites are correctly retained.
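The site-by-site correlation matrix described above can be sketched with a plain Pearson coefficient. The function names and the three short toy series are assumptions for illustration; in the experiments each row would be a full time series from one wind farm.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (std_a * std_b)

def correlation_matrix(sites):
    """sites: list of N equal-length time series, one per generation site."""
    return [[pearson(a, b) for b in sites] for a in sites]

sites = [
    [0.1, 0.4, 0.8, 0.5],   # hypothetical wind farm 1
    [0.2, 0.5, 0.9, 0.6],   # farm 2: farm 1 shifted up by 0.1
    [0.8, 0.5, 0.1, 0.4],   # farm 3: farm 1 mirrored around 0.45
]
corr = correlation_matrix(sites)
for row in corr:
    print([round(c, 3) for c in row])
```

Computing this matrix once for the measured data and once for a forecasted scenario, then comparing the two entry by entry, corresponds to the colormap comparison in Fig. 5.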
We further verify that the generated time series have the same statistical properties as the measured data. We generate 50 scenarios for a group of 24 wind farms and compare the cumulative distribution functions (CDFs) of the true realizations and the generated scenarios in Fig. 7. For simplicity, we display only some of the sites. The method is able to generate samples for different sites whose marginal distributions closely match those of the measured data. The power generation and fluctuations at these sites are consistent with the measured data, and the different generation capacity levels and fluctuation magnitudes are correctly captured.
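A comparison of marginal distributions can be sketched with empirical CDFs as follows. The largest vertical gap between the two CDFs (the two-sample Kolmogorov-Smirnov statistic) is added here only as a convenient one-number summary and is an assumption of this sketch, not a metric reported in the paper; the sample values are also hypothetical.

```python
from bisect import bisect_right

def empirical_cdf(samples):
    """Return a function F with F(x) = fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many sorted samples are <= x
        return bisect_right(ordered, x) / n

    return cdf

def max_cdf_gap(a, b):
    """Largest vertical gap between the empirical CDFs of a and b."""
    cdf_a, cdf_b = empirical_cdf(a), empirical_cdf(b)
    # Both CDFs are step functions, so the gap peaks at a sample point.
    return max(abs(cdf_a(x) - cdf_b(x)) for x in set(a) | set(b))

measured = [0.1, 0.2, 0.3, 0.4]      # hypothetical measured power values
close = [0.1, 0.2, 0.3, 0.5]         # hypothetical scenario values, similar
far = [0.5, 0.6, 0.7, 0.8]           # hypothetical scenario values, disjoint

print(max_cdf_gap(measured, close))
print(max_cdf_gap(measured, far))
```

A gap near zero indicates that the generated scenarios reproduce the marginal distribution of the measurements at that site.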
VI Conclusions
This paper proposes a novel method to forecast scenarios for renewable power generation processes based on deep generative models. Capable of working with any off-the-shelf point forecast method, the proposed algorithm not only characterizes the uncertainty associated with renewable energy resources for both single-site and spatially correlated multiple-site cases, but also generates realistic short-term scenarios without restrictions on the number of scenarios or the forecast horizon. Comprehensive simulations carried out for different case studies show the effectiveness of the proposed methodology. With high reliability and high flexibility, the proposed approach can be used to directly generate a large number of scenarios. Several statistical checks support the proposed approach: the marginal distribution associated with each renewable power stochastic process is retained by the generated scenarios; the temporal correlations are characterized by the autocorrelation of each renewable stochastic process; and the spatial correlations are verified by cross-correlations among different sites. Our method can serve as a promising pipeline for generating high-quality scenarios that reflect the intrinsic patterns and data distribution of renewable generation.
References
 [1] D. Bertsimas, E. Litvinov, X. A. Sun, J. Zhao, and T. Zheng, “Adaptive robust optimization for the security constrained unit commitment problem,” IEEE Transactions on Power Systems, vol. 28, no. 1, pp. 52–63, 2013.
 [2] J. Zhu, Optimization of power system operation. John Wiley & Sons, 2015, vol. 47.
 [3] P. Pinson and R. Girard, “Evaluating the quality of scenarios of short-term wind power generation,” Applied Energy, vol. 96, pp. 12–20, 2012.
 [4] J. Wang, M. Shahidehpour, and Z. Li, “Security-constrained unit commitment with volatile wind power generation,” IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 1319–1327, 2008.
 [5] X.-Y. Ma, Y.-Z. Sun, H.-L. Fang, and Y. Tian, “Scenario-based multiobjective decision-making of optimal access point for wind power transmission corridor in the load centers,” IEEE Transactions on Sustainable Energy, vol. 4, no. 1, pp. 229–239, 2013.
 [6] Y. Wang, Y. Dvorkin, R. Fernandez-Blanco, B. Xu, T. Qiu, and D. S. Kirschen, “Look-ahead bidding strategy for energy storage,” IEEE Transactions on Sustainable Energy, vol. 8, no. 3, pp. 1106–1117, 2017.
 [7] P. Pinson, H. Madsen, H. A. Nielsen, G. Papaefthymiou, and B. Klöckl, “From probabilistic forecasts to statistical scenarios of short-term wind power production,” Wind Energy: An International Journal for Progress and Applications in Wind Power Conversion Technology, vol. 12, no. 1, pp. 51–62, 2009.
 [8] C. Tang, Y. Wang, J. Xu, Y. Sun, and B. Zhang, “Efficient scenario generation of multiple renewable power plants considering spatial and temporal correlations,” Applied Energy, vol. 221, pp. 348–357, 2018.
 [9] P. Meibom, R. Barth, B. Hasche, H. Brand, C. Weber, and M. O’Malley, “Stochastic optimization model to study the operational impacts of high wind penetrations in Ireland,” IEEE Transactions on Power Systems, vol. 26, no. 3, pp. 1367–1379, 2011.
 [10] J. M. Morales, R. Minguez, and A. J. Conejo, “A methodology to generate statistically dependent wind speed scenarios,” Applied Energy, vol. 87, no. 3, pp. 843–855, 2010.
 [11] D. D. Le, G. Gross, and A. Berizzi, “Probabilistic modeling of multisite wind farm production for scenario-based applications,” IEEE Transactions on Sustainable Energy, vol. 6, no. 3, pp. 748–758, 2015.
 [12] J. Liu, F. Qu, X. Hong, and H. Zhang, “A small-sample wind turbine fault detection method with synthetic fault data using generative adversarial nets,” IEEE Transactions on Industrial Informatics, 2018.
 [13] Y. Qin, X. Wang, and J. Zou, “The optimized deep belief networks with improved logistic sigmoid units and their application in fault diagnosis for planetary gearboxes of wind turbines,” IEEE Transactions on Industrial Electronics, vol. 66, no. 5, pp. 3814–3824, 2019.
 [14] S. I. Vagropoulos, E. G. Kardakos, C. K. Simoglou, A. G. Bakirtzis, and J. P. Catalao, “ANN-based scenario generation methodology for stochastic variables of electric power systems,” Electric Power Systems Research, vol. 134, pp. 9–18, 2016.
 [15] J. Li, S. Liu, H. He, and L. Li, “A novel framework for gear safety factor prediction,” IEEE Transactions on Industrial Informatics, 2018.
 [16] H. Zhang, W. Hu, R. Yu, M. Tang, and L. Ding, “Optimized operation of cascade reservoirs considering complementary characteristics between wind and photovoltaic based on variational autoencoder,” in MATEC Web of Conferences, vol. 246. EDP Sciences, 2018, p. 01077.
 [17] Y. Chen, Y. Wang, D. Kirschen, and B. Zhang, “Model-free renewable scenario generation using generative adversarial networks,” IEEE Transactions on Power Systems, vol. 33, no. 3, pp. 3265–3275, 2018.
 [18] C. Jiang, Y. Mao, Y. Chai, M. Yu, and S. Tao, “Scenario generation for wind power using improved generative adversarial networks,” IEEE Access, vol. 6, pp. 62193–62203, 2018.
 [19] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
 [20] Y. Chen, X. Wang, and B. Zhang, “An unsupervised deep learning approach for scenario forecasts,” in 2018 Power Systems Computation Conference (PSCC). IEEE, 2018, pp. 1–7.
 [21] C. Villani, Optimal transport: old and new. Springer Science & Business Media, 2008, vol. 338.
 [22] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville, “Improved training of Wasserstein GANs,” in Advances in Neural Information Processing Systems, 2017, pp. 5767–5777.
 [23] X. Wei, B. Gong, Z. Liu, W. Lu, and L. Wang, “Improving the improved training of Wasserstein GANs: A consistency term and its dual effect,” arXiv preprint arXiv:1803.01541, 2018.
 [24] C. Wan, Z. Xu, P. Pinson, Z. Y. Dong, and K. P. Wong, “Probabilistic forecasting of wind power generation using extreme learning machine,” IEEE Transactions on Power Systems, vol. 29, no. 3, pp. 1033–1044, 2014.
 [25] J. L. Ba, J. R. Kiros, and G. E. Hinton, “Layer normalization,” arXiv preprint arXiv:1607.06450, 2016.
 [26] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard et al., “TensorFlow: A system for large-scale machine learning,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265–283.
 [27] C. Draxl, A. Clifton, B.-M. Hodge, and J. McCaa, “The Wind Integration National Dataset (WIND) Toolkit,” Applied Energy, vol. 151, pp. 355–366, 2015.