# Probabilistic Solar Power Forecasting: Long Short-Term Memory Network vs Simpler Approaches

## Abstract

The high penetration of volatile renewable energy sources such as solar makes methods for coping with their associated uncertainty of paramount importance. Probabilistic forecasts are one such method, as they assist energy planners in their decision-making by providing information about the uncertainty of future power generation. Currently, there is a trend towards the use of deep learning probabilistic forecasting methods. However, the point at which the more complex deep learning methods should be preferred over simpler approaches is not yet clear. Therefore, the current article presents a simple comparison between a long short-term memory neural network and other simpler approaches. The comparison consists of training and comparing models able to provide one-day-ahead probabilistic forecasts for a solar power system. Moreover, the current paper makes use of an open source dataset provided during the Global Energy Forecasting Competition of 2014 (GEFCom14).

###### Keywords:

GEFCom14, Neural Networks, Quantile Regressions, LSTM, Probabilistic Forecasting

## 1 Introduction

Over the past couple of years, solar power has become one of the most popular renewable energy sources (RES). Unfortunately, the generation of solar power depends completely on the Sun Bak et al. (2002). This dependency on weather adds uncertainty and variability to the generation of solar power. To deal with this uncertainty, solar forecasts are made in order to predict the future power generation.

Solar power forecasts can be categorized into deterministic and probabilistic forecasts Antonanzas et al. (2016). Some examples of deterministic forecasting methods present in the literature can be found in Abuella et al. (2017); Diagne et al. (2013); González Ordiano et al. (2018); Sharma (2018); Sharma et al. (2018). While deterministic forecasts predict only the expected future generation, probabilistic forecasts offer a description of the forecast uncertainty. This additional information helps in managing resources as well as in calculating risks associated with future decisions Appino et al. (2017); Gneiting and Katzfuss (2014). Furthermore, economic benefits can also be gained from using probabilistic forecasts, since they improve the decision-making capabilities within electricity markets Roulston et al. (2003).

Various methodologies to generate probabilistic solar power forecasts have been discussed in the literature. Examples include nearest neighbor approaches Zhang and Wang (2015), vector auto-regressive (VAR) models Bessa et al. (2015), methods for estimating volatility Fliess et al. (2018), and ensemble models Alessandrini et al. (2015). Additionally, examples of solar power probabilistic forecasting using deep learning techniques can also be found in the literature, e.g., in Gensler et al. (2016). However, even though deep learning methodologies have gained popularity in the past couple of years, they have often under-performed in terms of accuracy when compared to other statistical forecasting techniques Makridakis et al. (2018).

For this reason, the current article presents a small experiment with the goal of defining a starting point for understanding the limitations of deep learning probabilistic forecasting methodologies. To be more specific, the experiment consists of training, evaluating, and comparing solar power probabilistic forecasts based on quantile regressions Fahrmeir et al. (2013) obtained using a long short-term memory (LSTM) neural network (i.e. a deep learning approach) and simpler techniques (i.e. polynomials and a fully connected artificial neural network). Furthermore, the open source dataset of the Global Energy Forecasting Competition of 2014 is used for the experiment.

The remainder of the current paper is organized as follows. Section 2 presents a brief description of the various methods tested. Thereafter, Section 3 describes the conducted experiment in more detail. Afterwards, Section 4 presents the obtained results and, finally, Section 5 offers the conclusion and outlook of this work.

## 2 Methods

Quantile regressions are useful for estimating the uncertainty of a time series’ future. A finite time series $\{y[k];\ k = 1, \dots, K\}$ is defined as a sequence of observations measured at different points in time, with the timestep $k$ defining the order of the observation in the sequence and $K$ representing the time series’ length.

In turn, a quantile regression can be viewed as a model able to estimate a quantile $\hat{y}_q[k+H]$ with a probability $q \in (0, 1)$ of a future value $y[k+H]$ at a forecast horizon $H$. For instance, a quantile regression that takes auto-regressive and exogenous values as input can be defined as:

$$\hat{y}_q[k+H] = f\left(y[k], \dots, y[k-H_1], \mathbf{u}^T[k], \dots, \mathbf{u}^T[k-H_1]; \hat{\boldsymbol{\theta}}_q\right), \quad k > H_1 \tag{1}$$

where $\hat{y}_q[k+H]$ is the quantile estimate, $H_1$ is the number of used lags, and $\mathbf{u}[k]$ represents a vector containing observations of exogenous time series at timestep $k$. Moreover, $\hat{\boldsymbol{\theta}}_q$ is a vector containing the estimated regression parameters, which are traditionally obtained through the minimization of the sum of pinball-losses Fahrmeir et al. (2013). Furthermore, one of the most important properties of quantile regressions is the fact that pairs of them can be combined to form intervals with a certain probability of containing a future time series’ value (i.e. probabilistic forecasts). Finally, more detailed information on the models used in the present article can be found in the following sections.
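As a minimal sketch of how pairs of quantile estimates can form an interval forecast, the following Python snippet picks the two quantiles that bound a central interval with a given nominal coverage. The function name `central_interval` and the dictionary layout (probabilities stored as integer percentages) are illustrative assumptions and are not taken from the original work.

```python
def central_interval(quantiles, coverage):
    """Return the (lower, upper) bounds of a central interval forecast.

    quantiles: dict mapping a probability in percent (1..99) to the
               quantile estimate of one future timestep (assumed layout).
    coverage:  nominal coverage in percent, e.g. 90 uses the 5% and 95%
               quantile estimates.
    """
    tail = (100 - coverage) // 2
    usable = (100 - coverage) % 2 == 0 and tail in quantiles and (100 - tail) in quantiles
    if not usable:
        raise ValueError("coverage must map onto two available quantiles")
    lower, upper = quantiles[tail], quantiles[100 - tail]
    # Guard against crossing quantile estimates by ordering the bounds.
    return min(lower, upper), max(lower, upper)


# Toy quantile estimates for a single timestep (hypothetical values).
toy_quantiles = {p: p / 100 for p in range(1, 100)}
interval_90 = central_interval(toy_quantiles, 90)
```

With 99 estimated quantiles, any central interval whose tails are whole percentages (98%, 96%, ..., 2%) can be built this way.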

### 2.1 Simple Models

#### Polynomials

Quantile regressions trained using a polynomial model are multiple linear quantile regressions, whose features can be raised to a maximal allowed degree. Some examples of this type of model can be found in González Ordiano et al. (2017).
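The idea of a polynomial quantile regression can be sketched in a few lines of plain Python. The snippet below raises each feature to the powers 1 up to the maximal degree (without cross terms, which is an assumption) and fits the linear parameters by subgradient descent on the pinball-loss. The solver, the learning rate, and all function names are illustrative choices; the toolbox used in the article may work differently.

```python
def expand_features(x, degree):
    """Prepend a bias term and raise every feature to the powers 1..degree."""
    return [1.0] + [value ** p for value in x for p in range(1, degree + 1)]


def pinball(y, y_hat, q):
    """Pinball-loss of a single estimate y_hat for probability q."""
    diff = y - y_hat
    return q * diff if diff >= 0 else (q - 1) * diff


def predict(theta, x, degree):
    """Evaluate the polynomial model for one input vector x."""
    return sum(t * f for t, f in zip(theta, expand_features(x, degree)))


def fit_quantile_polynomial(X, y, q, degree, lr=0.05, epochs=500):
    """Fit the parameters by subgradient descent (illustrative solver)."""
    Phi = [expand_features(x, degree) for x in X]
    theta = [0.0] * len(Phi[0])
    n = len(y)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for phi, target in zip(Phi, y):
            y_hat = sum(t * f for t, f in zip(theta, phi))
            # Subgradient of the pinball-loss with respect to the estimate.
            slope = -q if target >= y_hat else (1 - q)
            for j, f in enumerate(phi):
                grad[j] += slope * f / n
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta
```

Because the model is linear in its parameters, the same fit could also be posed as a linear program; the descent loop is used here only to keep the example dependency-free.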

#### Fully Connected Artificial Neural Network

The fully connected artificial neural network (FCANN) used in the present article is a simple multilayer perceptron Hastie (2016) with only one hidden layer. The advantage of this model over the polynomials is that it can more easily describe non-linear relations between its inputs and its desired output (i.e. the solar power time series’ future values). Note that the FCANN quantile regressions are trained using the method described in González Ordiano et al. (2019).
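To illustrate the structure of such a network, the following sketch computes the forward pass of a perceptron with a single hidden layer. The tanh hidden units and the linear output neuron are assumptions made for this example, and the function and argument names are hypothetical.

```python
import math


def fcann_forward(x, W_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer perceptron.

    x:        list of input features
    W_hidden: one weight row per hidden neuron
    b_hidden: one bias per hidden neuron
    w_out:    output weights, one per hidden neuron
    b_out:    output bias
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W_hidden, b_hidden)
    ]
    # Linear output neuron (assumed), returning one quantile estimate.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

One such network would be needed per estimated quantile, or the output layer could be widened to return several quantiles at once; the sketch leaves that design choice open.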

### 2.2 Long Short-Term Memory Neural Network

A Long Short-Term Memory (LSTM) neural network model Lipton (2015) is part of the Recurrent Neural Network (RNN) family. An RNN is a neural network able to learn temporal dependencies in data. In other words, RNNs can establish a correlation between the previous data points and the current data point in the training sequence Kong et al. (2019). This property makes them well suited for solar power forecasting. However, in cases where long-term relationships need to be learned, traditional RNNs face the problem of vanishing gradients. LSTMs address this issue by using an additional unit called a memory cell Lipton (2015) that helps them learn long-term relationships Gensler et al. (2016). LSTM quantile regressions can be obtained by using the pinball-loss as error function during training.
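The role of the memory cell can be made concrete with a scalar LSTM step written in plain Python. This is a textbook formulation with input, forget, and output gates, not the implementation used in the article; the parameter layout of `params` is an assumption made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One timestep of a scalar LSTM cell.

    params maps each gate name ("i", "f", "o", "g") to a tuple
    (w_x, w_h, b): input weight, recurrent weight, and bias.
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("i"))    # input gate: how much new information enters
    f = sigmoid(pre("f"))    # forget gate: how much old memory is kept
    o = sigmoid(pre("o"))    # output gate: how much memory is exposed
    g = math.tanh(pre("g"))  # candidate update of the memory cell
    c = f * c_prev + i * g   # new memory cell state
    h = o * math.tanh(c)     # new hidden state
    return h, c
```

When the forget gate saturates near one and the input gate near zero, the cell state is carried over almost unchanged, which is the additive path that lets information and gradients survive over many timesteps.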

## 3 Experiment

### 3.1 Data

The dataset used comes from the solar track of the Global Energy Forecasting Competition of 2014 (i.e. GEFCom14) Hong et al. (2016). It contains three different sets of time series with hourly power measurements of three solar power systems in Australia (normalized to values between 0 and 1), as well as a number of corresponding weather forecast time series for the period of April 2012 to July 2014. In the present work, only the forecast weather time series containing forecasts of the solar surface radiation, solar thermal radiation, and top net solar radiation from the 1st zoneID are used. Additionally, the data of only one of the solar power systems is utilized, with 70% of the data used for training and 30% for testing.
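A 70/30 split of a time series can be sketched as follows. The text does not state how the split is drawn, so the chronological cut used here (first 70% for training, last 30% for testing) is an assumption, and `chronological_split` is a hypothetical helper name.

```python
def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered sequence into a leading training part and a
    trailing test part (assumed split strategy)."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Toy example with ten hourly samples.
train, test = chronological_split(list(range(10)))
```

Keeping the test samples strictly after the training samples avoids leaking future information into the training set.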

### 3.2 Experiment Description

The experimental setup, to compare the performance of the LSTM to that of the other models, consists of forecasting daily 99 quantiles (i.e. $q = 0.01, 0.02, \dots, 0.99$) of the next 24 hours of solar power generation (i.e. a forecast horizon of $H = 24$, due to the time series’ hourly resolution). Furthermore, the same input data is used for all quantile regressions; i.e. the solar power measured over the past 24 hours and the forecast radiation values for the next day.
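The setup described above can be sketched in Python as a list of quantile probabilities and a helper that assembles one input vector. How the article arranges the inputs internally is not specified, so the flat layout below (24 past power values followed by 24 forecast values per radiation variable) and the name `build_input` are assumptions.

```python
QUANTILES = [i / 100 for i in range(1, 100)]  # 0.01, 0.02, ..., 0.99
HORIZON = 24                                  # hours ahead at hourly resolution


def build_input(power, radiation_forecasts, k):
    """Assemble the regression input at timestep k.

    power:               list of hourly normalized power measurements
    radiation_forecasts: list of hourly series, one per radiation variable
    k:                   0-based index of the last measured hour (k >= 23)
    """
    past_power = power[k - 23:k + 1]  # the last 24 measured hours
    next_day = [series[k + 1:k + 1 + HORIZON] for series in radiation_forecasts]
    return past_power + [value for series in next_day for value in series]
```

With three radiation variables, this layout yields 24 + 3 x 24 = 96 candidate features per forecast.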

The polynomial models used have maximal allowed degrees of one up to three, hence they are referred to as Poly1, Poly2, and Poly3. In turn, the simple FCANN models are multilayer perceptrons with one hidden layer and 10 hidden neurons with a tanh activation function. Additionally, a forward feature selection is applied to select the four most relevant features with which both the polynomial and FCANN models are later trained. Moreover, to improve the forecast accuracy, the night values are removed during training and automatically set to zero during testing. Note that all polynomial and FCANN models are trained using the MATLAB open-source toolbox SciXMiner Mikut et al. (2017).
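A forward feature selection can be sketched as a greedy loop that repeatedly adds the candidate giving the lowest loss. The scoring function is left abstract here; which criterion the toolbox actually uses is not stated in the text, so `score` and `forward_selection` are hypothetical names for illustration.

```python
def forward_selection(candidates, score, n_select=4):
    """Greedy forward feature selection.

    candidates: iterable of feature identifiers
    score:      callable that takes a list of features and returns a loss
                (lower is better), e.g. a validation pinball-loss
    n_select:   number of features to keep (four in the setup above)
    """
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < n_select:
        # Try each remaining feature together with the ones already chosen.
        best = min(remaining, key=lambda feat: score(selected + [feat]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

The procedure evaluates at most `n_select` times the number of candidates in feature subsets, far fewer than an exhaustive search over all subsets of four features.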

Finally, an LSTM model with one input layer, one hidden layer, and one output layer is developed using the Keras API.

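The value used to evaluate the results on a test set of size $N$ is the pinball-loss averaged over all the estimated quantiles (as in Hong et al. (2016)), i.e.:

$$Q_{PL} = \frac{1}{99} \sum_{q \in \{0.01, \dots, 0.99\}} Q_{PL,q}, \qquad Q_{PL,q} = \frac{1}{N} \sum_{k=1}^{N} \begin{cases} (q - 1)\left(y[k+H] - \hat{y}_q[k+H]\right) & \text{if } y[k+H] < \hat{y}_q[k+H] \\ q\left(y[k+H] - \hat{y}_q[k+H]\right) & \text{otherwise} \end{cases} \tag{2}$$

In the previous equation, $Q_{PL,q}$ is the pinball-loss obtained by a quantile regression with probability $q$, while $Q_{PL}$ is the average of the pinball-losses obtained by all estimated regressions. Here, $y[k+H]$ denotes a measured value, $\hat{y}_q[k+H]$ its quantile estimate, and $k$ indexes the $N$ test samples. Please note that a comparison based on computation time is excluded from the present article, as some models are created with MATLAB and others with Python. Nevertheless, due to its relevance, such a comparison is to be done in future related works.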
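The evaluation measure can be sketched in Python as follows: the pinball-loss of each quantile regression is averaged over the test samples, and the results are then averaged over all quantiles. The function names and the dictionary layout of `forecasts` are assumptions made for this example.

```python
def pinball_loss(y_true, y_quantile, q):
    """Mean pinball-loss of one quantile regression over a test set."""
    total = 0.0
    for y, y_hat in zip(y_true, y_quantile):
        diff = y - y_hat
        # Under-estimates are weighted by q, over-estimates by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def average_pinball_loss(y_true, forecasts):
    """Average over all quantiles.

    forecasts: dict mapping a probability q to the list of quantile
               estimates for the test set (assumed layout).
    """
    losses = [pinball_loss(y_true, y_hat, q) for q, y_hat in forecasts.items()]
    return sum(losses) / len(losses)
```

Since the power values are normalized to the range 0 to 1, multiplying the result by 100 expresses the loss in percent.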

## 4 Results

The results on the test set from the above described experiments are presented in Table 1.

| Model | Avg. Pinball-loss [%] |
|---|---|
| Poly1 | 1.70 |
| Poly2 | 1.59 |
| Poly3 | 1.66 |
| FCANN | 1.43 |
| LSTM | 1.43 |

As the contents of Table 1 show, the LSTM outperforms all of the polynomial models. Nonetheless, the difference in pinball-loss between the LSTM model and the best performing polynomial model (Poly2) is not large, as it amounts to just 0.16%. Additionally, the FCANN model has on average the same performance as the more complex LSTM model. The underwhelming performance of the LSTM regressions may have several causes. One is their need for a large training dataset, as deep learning methodologies are known to require large amounts of data to accurately learn the relationship between the dependent and the independent variables Najafabadi et al. (2015). Furthermore, the manually selected hyper-parameters may also be behind the LSTM’s underwhelming performance, as this manual selection does not ensure that the optimal set of parameters is found. Another explanation could be that the existing real-world non-linearities can be covered by the FCANN as well as by the LSTM.
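The gap between the models can be recomputed directly from the values in Table 1. The short snippet below is only a worked arithmetic example; the variable names are illustrative.

```python
# Average pinball-losses in percent, copied from Table 1.
results = {"Poly1": 1.70, "Poly2": 1.59, "Poly3": 1.66, "FCANN": 1.43, "LSTM": 1.43}

# Best polynomial model, i.e. the one with the lowest loss.
best_poly = min((m for m in results if m.startswith("Poly")), key=results.get)

# Absolute gap to the LSTM and the same gap relative to the best polynomial.
gap = round(results[best_poly] - results["LSTM"], 2)
relative_gap = round(100 * gap / results[best_poly], 1)
```

The absolute gap of 0.16 corresponds to a relative reduction of roughly 10% with respect to Poly2, while the gap between the FCANN and the LSTM is zero at the reported precision.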

For the sake of illustration, Figure 1 depicts the interval forecasts obtained by the FCANN and LSTM models.

As can be seen in Figure 1, the LSTM intervals seem to be wider than the ones obtained by the FCANN regressions. Therefore, it can be argued that the LSTM may be overestimating the uncertainty to some degree. This aspect needs to be considered in future related works if the accuracy of the LSTM-based probabilistic forecasts presented herein is to be improved.
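The visual impression of wider intervals could be quantified with two simple measures: the mean interval width (sharpness) and the share of measurements that fall inside the intervals (empirical coverage). The sketch below is a suggestion for such a check and is not part of the evaluation reported in the article; the function names are hypothetical.

```python
def mean_interval_width(lower, upper):
    """Average width of the interval forecasts (smaller means sharper)."""
    return sum(u - l for l, u in zip(lower, upper)) / len(lower)


def empirical_coverage(y_true, lower, upper):
    """Share of measurements that fall inside their interval."""
    hits = sum(1 for y, l, u in zip(y_true, lower, upper) if l <= y <= u)
    return hits / len(y_true)
```

Intervals that are wide and whose empirical coverage clearly exceeds the nominal one would support the suspicion that the uncertainty is being overestimated.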

## 5 Conclusion and Outlook

The main contribution of the current article is to present a comparison between a long short-term memory (LSTM) model and other simpler approaches; specifically some polynomial models and a simple fully connected artificial neural network (FCANN). The comparison consists of obtaining and evaluating 24-hour-ahead probabilistic solar forecasts. The experiment shows that the LSTM model performs slightly better than the polynomials and obtains the same results as the FCANN. Therefore, it can be argued that the complex LSTM may not always provide the best solution, at least not for the dataset evaluated in this paper. Hence, the current article recommends the use of simpler/classical forecasting methodologies as a preliminary benchmarking step before exploring more complex deep learning methods.

Also, since the underwhelming performance of the LSTM may be caused by a sub-optimal selection of hyper-parameters, hyper-parameter selection via automated machine learning (AutoML) techniques has to be studied in future related works. Moreover, aspects like multiple runs of the neural networks and computation time also need to be taken into consideration in future experiments. At the same time, comparisons such as the one presented herein also need to be carried out for probabilistic wind and/or load forecasts.

## Acknowledgement

The present contribution is supported by the Helmholtz Association under the Joint Initiative “Energy System 2050 – A Contribution of the Research Field Energy”.

## References

- Abuella et al. (2017). Solar power forecasting using support vector regression. arXiv preprint arXiv:1703.09851.
- Alessandrini et al. (2015). An analog ensemble for short-term probabilistic solar power forecast. Applied Energy 157, pp. 95–110. ISSN 0306-2619.
- Antonanzas et al. (2016). Review of photovoltaic power forecasting. Solar Energy 136, pp. 78–111.
- Appino et al. (2017). On the use of probabilistic forecasts in scheduling of renewable energy sources coupled to storages. Applied Energy 210, pp. 1207–1218.
- Bak et al. (2002). Photo-electrochemical hydrogen generation from water using solar energy. Materials-related aspects. International Journal of Hydrogen Energy 27 (10), pp. 991–1022. ISSN 0360-3199.
- Bessa et al. (2015). Probabilistic solar power forecasting in smart grids using distributed information. International Journal of Electrical Power & Energy Systems 72, pp. 16–23. Special issue for the 18th Power Systems Computation Conference. ISSN 0142-0615.
- Diagne et al. (2013). Review of solar irradiance forecasting methods and a proposition for small-scale insular grids. Renewable and Sustainable Energy Reviews 27, pp. 65–76. ISSN 1364-0321.
- Fahrmeir et al. (2013). Regression: models, methods and applications. Springer, Berlin, Germany.
- Fliess et al. (2018). Prediction bands for solar energy: New short-term time series forecasting techniques. Solar Energy 166, pp. 519–528.
- Gensler et al. (2016). Deep learning for solar power forecasting - an approach using autoencoder and LSTM neural networks. In 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2858–2865.
- Gneiting and Katzfuss (2014). Probabilistic forecasting. Annual Review of Statistics and Its Application 1 (1), pp. 125–151. https://doi.org/10.1146/annurev-statistics-062713-085831
- González Ordiano et al. (2019). Probabilistic energy forecasting using the nearest neighbors quantile filter and quantile regression [in press]. International Journal of Forecasting.
- González Ordiano et al. (2018). Energy forecasting tools and services. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 8 (2), pp. e1235. https://onlinelibrary.wiley.com/doi/pdf/10.1002/widm.1235
- González Ordiano et al. (2017). Photovoltaic power forecasting using simple data-driven models without weather data. Computer Science - Research and Development 32, pp. 237–246. ISSN 1865-2042.
- Hastie (2016). The elements of statistical learning: data mining, inference, and prediction. Second edition, corrected at 11th printing 2016, Springer Series in Statistics, Springer, New York, NY, USA. ISBN 978-0-387-84857-0.
- Hong et al. (2016). Probabilistic energy forecasting: Global Energy Forecasting Competition 2014 and beyond. International Journal of Forecasting 32 (3), pp. 896–913. ISSN 0169-2070.
- Kong et al. (2019). Short-term residential load forecasting based on LSTM recurrent neural network. IEEE Transactions on Smart Grid 10 (1), pp. 841–851. ISSN 1949-3053.
- Lipton (2015). A critical review of recurrent neural networks for sequence learning. arXiv preprint arXiv:1506.00019.
- Makridakis et al. (2018). Statistical and machine learning forecasting methods: concerns and ways forward. PLOS ONE 13 (3), pp. 1–26.
- Mikut et al. (2017). The MATLAB toolbox SciXMiner: user’s manual and programmer’s guide. Technical report arXiv:1704.03298.
- Najafabadi et al. (2015). Deep learning applications and challenges in big data analytics. Journal of Big Data 2 (1), pp. 1. ISSN 2196-1115.
- Roulston et al. (2003). Using medium-range weather forecasts to improve the value of wind energy production. Renewable Energy 28 (4), pp. 585–602. ISSN 0960-1481.
- Sharma et al. (2018). Numerical weather prediction data free solar power forecasting with neural networks. In Proceedings of the Ninth International Conference on Future Energy Systems, e-Energy ’18, New York, NY, USA, pp. 604–609. ISBN 978-1-4503-5767-8.
- Sharma (2018). Deterministic and probabilistic forecasting for wind and solar power using advance data analytics and machine learning techniques. Ph.D. Thesis. ISBN 9780438254725.
- Zhang and Wang (2015). GEFCom2014 probabilistic solar power forecasting based on k-nearest neighbor and kernel density estimator. In 2015 IEEE Power Energy Society General Meeting, pp. 1–5. ISSN 1932-5517.