# A Regressive Convolution Neural network and Support Vector Regression Model for Electricity Consumption Forecasting

## Abstract

Electricity consumption forecasting has important implications for mineral companies in guiding quarterly work, normal power system operation, and management. However, predicting the electricity consumption of a mineral company is difficult because consumption is affected by various factors. The problem is non-trivial because traditional methods face three major challenges: insufficient training data, high computational cost, and low prediction accuracy. To tackle these challenges, we first propose a Regressive Convolution Neural Network (RCNN) model; however, RCNN still suffers from high computational overhead. We therefore use RCNN to extract features from the data and train a Support Vector Regression (SVR) model on these features to predict the electricity consumption. The experimental results show that the RCNN-SVR model achieves higher accuracy than the RCNN or SVR model alone. The MSE, MAPE, and CV-RMSE of the RCNN-SVR model are 0.8564, 1.975%, and 0.0687%, respectively, which illustrates the low prediction error of the proposed model.

Youshan Zhang, Qi Li

## 1 Introduction

The electricity consumption of large enterprises is a major factor in cost control and operational efficiency. In particular, mineral companies consume large quantities of electricity every day in the coal production process. Electricity consumption forecasting has important implications for mineral companies in guiding quarterly work, normal power system operation, and power management. Moreover, the prediction accuracy of electricity consumption directly determines power construction, network planning, and the planning of electricity marketing strategies [1, 2, 3, 4]. Therefore, accurate prediction of electricity consumption is crucial to mineral companies.

Because of the complicated dynamics of the electrical power system, it is difficult to establish an explicit model. Many traditional methods have been applied to predict electricity consumption, such as gray prediction, regression analysis, time series analysis, artificial neural networks (ANN), and support vector machines (SVM) [3, 5, 6, 7, 8, 9]. However, these methods have their respective disadvantages. For example, traditional ANN training is mostly gradient-based and may easily fall into a local minimum [4]. One common limitation is that these methods depend strongly on the amount of training data used to discover the relationship between the predicted value and the model. Statistical analysis models such as Kalman filters and the Autoregressive Integrated Moving Average (ARIMA) model [10, 11, 12] have also been applied to electricity consumption prediction, but they are likewise constrained by insufficient data. In [13], Hu presented a neural-network-based gray prediction method (NNGM(1,1)), which overcomes the limitation of the traditional gray prediction method: it can easily determine the developing coefficient and control variables in the gray prediction model and can therefore improve load forecasting accuracy. Similarly, in [2], Ding et al. modified the gray prediction method and proposed a rolling gray prediction model (NOGM(1,1)), which overcame the fixed structure and poor adaptability of the original gray prediction model. Their empirical results showed that the NOGM(1,1) model has higher prediction accuracy than the original gray prediction model. However, the prediction accuracy of these methods is still not satisfactory.

The major challenge is that electricity consumption prediction for a mineral company differs from traditional electricity load prediction, since the electricity consumption of a mineral company is affected by various factors (e.g., ore grade, processing quantity of the crude ore, ball milling fill rate). Conventional methods consider only the electricity values and ignore these influential factors. Therefore, it is necessary to build a new model that considers both the electricity values and the influential factors when predicting the monthly electricity consumption of a mineral company. In this paper, our proposed electricity consumption prediction model addresses three issues: (1) reducing the computational cost; (2) training the model with limited data; and (3) improving the prediction accuracy. The Convolution Neural Network (CNN) [14, 15, 16] has recently become a popular method for image classification, segmentation, and regression problems. However, no Regressive CNN (RCNN, a CNN ending with a regression layer) architecture has yet been proposed for predicting the electricity consumption of a mineral company.

In this study, we present a new electricity consumption forecasting model based on a regressive convolution neural network and support vector regression (RCNN-SVR). Compared with traditional methods, the RCNN model can extract more representative features from historical electricity consumption data, while the SVR model reduces the computational overhead. The forecasting accuracy of the proposed model is higher than that of several baseline models such as the BP neural network and SVM [3]. This paper makes two major contributions: (1) we build the RCNN-SVR architecture to predict the electricity consumption of a mineral company; (2) we compare the prediction performance of our model with several baseline forecasting methods. Section 2 describes the RCNN and SVR models and introduces the model architecture. Section 3 presents the experiments that verify our model and the comparisons with previous methods. Based on the results in section 3, we discuss the experimental results in section 4, and draw conclusions and explore future work in section 5.

## 2 Methodologies

In this section, we first describe the data preprocessing, then introduce the regressive convolution neural network (RCNN) and the support vector regression (SVR) model separately. Finally, we present our RCNN-SVR architecture for predicting electricity consumption.

### 2.1 Data preprocessing

The electricity consumption data were collected from a mineral company in Liaoning province, China. They contain the monthly electricity consumption from 2012 to 2017 (only two months of data are available for 2017), 62 months in total. We split the data into training data and testing data; the testing data are not used during training. The training data contain the influential factors (IFs), an $8 \times 50$ array where 8 is the number of IFs per month and 50 is the number of months, together with the true electricity consumption values (EVs) of those months. The testing data likewise contain the IFs and the true EVs of the remaining 12 months. As input to the RCNN and RCNN-SVR models, we reshape the influential factors of the training and testing datasets into 4-D arrays, in which three dimensions represent length, height, and depth and the remaining dimension indexes the months.
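The split and reshape described above can be sketched as follows. This is a minimal illustration, not the authors' code: the placeholder values, the chronological 50/12 split, the helper names `reshape_to_4d` and `shape`, and the `8 x 1 x 1 x months` layout (length, height, depth, month) are all assumptions made for the example.

```python
def reshape_to_4d(ifs):
    """Turn a factors-by-months table into a nested list indexed as
    [length][height][depth][month], with height = depth = 1 (assumed layout)."""
    return [[[list(row)]] for row in ifs]


def shape(arr):
    """Return the dimensions of a regular nested list."""
    dims = []
    while isinstance(arr, list):
        dims.append(len(arr))
        arr = arr[0]
    return tuple(dims)


# Placeholder data: 8 influential factors for 62 months (not the real measurements).
n_factors, n_months, n_train = 8, 62, 50
ifs = [[float(f * 100 + m) for m in range(n_months)] for f in range(n_factors)]

# Assumed chronological split: first 50 months for training, last 12 for testing.
train_ifs = [row[:n_train] for row in ifs]
test_ifs = [row[n_train:] for row in ifs]

train_4d = reshape_to_4d(train_ifs)
test_4d = reshape_to_4d(test_ifs)
print(shape(train_4d), shape(test_4d))  # (8, 1, 1, 50) (8, 1, 1, 12)
```

Keeping the month index as the last dimension means one month can be fed to the network as a single `8 x 1 x 1` sample.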

### 2.2 RCNN Architecture

We first propose a regressive convolution neural network model (RCNN, shown in Fig. 1), which is similar to DeepEnergy in [17]. However, our RCNN model has fewer layers because of the limited data; its input is the influential factors (IFs), and its last layer is a regression layer that outputs the electricity consumption values (EVs). The network has only eight layers and performs two main steps: feature extraction and prediction. Feature extraction is performed by two convolution layers (Conv1, Conv2), two max-pooling layers (Maxpool1, Maxpool2), one rectified linear unit (ReLU) layer, and one normalization (Norm) layer. The prediction step consists of a fully-connected (FC) layer and a regression layer. The input layer takes the influential factors of one month. Conv1 and Conv2 use the same filter size, with 25 filters each and a padding size of 0; Maxpool1 and Maxpool2 use the same stride size, so that each max-pooling layer divides the dimension of the feature map by 2. The ReLU layer reduces the number of epochs needed to reach a given training error rate compared with traditional tanh units. The normalization layer improves generalization and reduces the error rate. Neither the ReLU layer nor the normalization layer changes the size of the feature map. The pooling layers summarize the outputs of adjacent pooling units.

One of the most obvious merits of RCNN is that more features can be extracted from different layers. With more features, we can more easily build the relationship between the model and the predicted value. For an input of size $W$, a Conv layer with filter size $F$, padding size $P$, and stride $S$ produces a feature map of size $(W - F + 2P)/S + 1$, while a max-pooling layer with stride $S$ produces a feature map of size $W/S$. Applying these rules in turn gives the feature map sizes in the Conv1, Maxpool1, Conv2, and Maxpool2 layers. The total number of features increases (50 in the Maxpool2 layer vs. 8 in the input layer), which is one reason why RCNN can generate better predictions than other neural networks that use only the input data as the feature map.
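The feature-map bookkeeping can be traced with a few lines of Python. This is a sketch under stated assumptions: the standard convolution output-size rule, a filter length of 1 along the factor axis, a convolution stride of 1, and a pooling stride of 2 are chosen here because they reproduce the 50 features reported for the Maxpool2 layer; the paper's exact filter and stride values are not recoverable from the text. The function names are illustrative.

```python
def conv_out(size, filt, pad=0, stride=1):
    """Standard convolution output size: (W - F + 2P) / S + 1."""
    return (size - filt + 2 * pad) // stride + 1


def pool_out(size, stride):
    """Max-pooling output size when the pooling window equals the stride."""
    return size // stride


N_FILTERS = 25    # stated in the text
FILTER = 1        # assumption: filter length along the factor axis
POOL_STRIDE = 2   # assumption: consistent with "divided by 2"

size = 8          # eight influential factors for one month
trace = [("Input", size, 1)]
for name in ("Conv1", "Maxpool1", "Conv2", "Maxpool2"):
    if name.startswith("Conv"):
        size = conv_out(size, FILTER)
    else:
        size = pool_out(size, POOL_STRIDE)
    trace.append((name, size, N_FILTERS))

for name, length, depth in trace:
    print(f"{name:9s} length={length} depth={depth} features={length * depth}")
```

Under these assumptions the length goes 8, 8, 4, 4, 2, so the Maxpool2 layer holds 2 x 25 = 50 features, matching the count given above.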

#### Electricity prediction using RCNN

As shown in Fig. 1, the features extracted in the Maxpool2 layer are connected to the FC layer, which flattens all features into one dimension. In the training stage, the input is the reshaped training IFs. The FC layer has the same size as the regression layer, which is why the points in the FC layer are connected to only one point in the regression layer. During training, if the desired Mean Square Error (MSE) is not reached in the current epoch, training continues until either the desired MSE or the maximal number of epochs is reached; once the maximal number of epochs is reached, training stops regardless of the MSE value. The final performance is evaluated to demonstrate the feasibility and practicability of the proposed method. During the test stage, we feed the test dataset to the trained RCNN model to predict the electricity consumption of each month.
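The stopping rule (stop at the desired MSE or at the maximal number of epochs, whichever comes first) can be illustrated with a toy model. The sketch below fits a one-variable linear model by gradient descent; the model, the data, the learning rate, and the names `train` and `mean_squared_error` are assumptions for illustration and stand in for the RCNN training loop, which is not given in the paper.

```python
def mean_squared_error(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, target_mse=1e-3, max_epochs=5000, lr=0.1):
    """Gradient descent on y = w*x + b with the two stopping criteria."""
    w, b, n = 0.0, 0.0, len(xs)
    mse = mean_squared_error(w, b, xs, ys)
    for epoch in range(1, max_epochs + 1):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        w -= lr * 2.0 / n * sum(e * x for e, x in zip(errs, xs))
        b -= lr * 2.0 / n * sum(errs)
        mse = mean_squared_error(w, b, xs, ys)
        if mse <= target_mse:            # desired MSE reached: stop early
            return w, b, mse, epoch, "target MSE reached"
    # Epoch budget exhausted: stop regardless of the MSE value.
    return w, b, mse, max_epochs, "max epochs reached"


xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * x + 1.0 for x in xs]         # toy targets

print(train(xs, ys)[3:])                 # stops on the MSE criterion
print(train(xs, ys, max_epochs=3)[3:])   # stops on the epoch criterion
```

With a generous epoch budget the run ends on the MSE criterion; with a budget of three epochs it ends on the epoch criterion even though the error is still large.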

### 2.3 SVR

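The original linear support vector machine (SVM) was proposed for binary classification problems. Given data and labels $(x_i, y_i)$, $i = 1, \dots, n$, with $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, +1\}$, it aims to solve the following optimization problem:

$$\min_{w, b, \xi} \; \frac{\lambda}{2}\|w\|^2 + \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0 \tag{1}$$

where $\lambda$ controls the width of the margin (smaller margin with smaller $\lambda$); $\xi_i$ is a non-negative slack variable that penalizes data points violating the margin; $w$ is the weight vector; and $b$ is the bias.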

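Linear SVM can also be used as a regression method (called SVR), with a few minor differences from SVM for classification. First, the output of SVR is a continuous number rather than a class label. Second, SVR has a margin of tolerance. However, the main idea is the same: minimize the error and maximize the margin. Fig. 2 describes the one-dimensional SVR, which aims to solve the following constrained optimization problem:

$$\min_{w, b, \xi, \xi^*} \; \frac{\lambda}{2}\|w\|^2 + \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - w^\top x_i - b \le \varepsilon + \xi_i, \;\; w^\top x_i + b - y_i \le \varepsilon + \xi_i^*, \;\; \xi_i, \xi_i^* \ge 0 \tag{2}$$

where $y_i$ is now a continuous target, $\varepsilon$ is the margin of tolerance, $\xi_i$ and $\xi_i^*$ are non-negative slack variables for points above and below the tolerance band, $\lambda$ weights the margin term, and $w$ and $b$ are the weight vector and the bias.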
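To make the margin of tolerance concrete, the sketch below evaluates a linear SVR objective using the standard epsilon-insensitive loss from the SVR literature [18]: residuals inside the tolerance band cost nothing, and residuals outside it cost their excess. The regularization weight `lam`, the tolerance `eps`, and the toy data are assumptions for the example, not values from the paper.

```python
def eps_insensitive(residual, eps):
    """Loss is zero inside the tolerance band and linear outside it."""
    return max(0.0, abs(residual) - eps)


def svr_objective(w, b, xs, ys, eps=0.5, lam=1.0):
    """Margin term plus total slack for a linear model f(x) = w.x + b."""
    margin_term = 0.5 * lam * sum(wj * wj for wj in w)
    slack = 0.0
    for x, y in zip(xs, ys):
        pred = sum(wj * xj for wj, xj in zip(w, x)) + b
        slack += eps_insensitive(y - pred, eps)
    return margin_term + slack


# Toy one-dimensional data: the third point lies outside the band.
xs = [[1.0], [2.0], [3.0]]
ys = [1.0, 2.0, 4.0]

print(eps_insensitive(0.3, 0.5))             # 0.0: inside the band
print(svr_objective([1.0], 0.0, xs, ys))     # 1.0 = 0.5 (margin) + 0.5 (slack)
```

For each sample the slack equals the smallest value that satisfies the tolerance constraints, so summing the epsilon-insensitive losses gives the slack term of the constrained problem.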

#### Electricity prediction using SVR

To apply the SVR method to predicting the electricity consumption of the mineral company, we train an SVR model on the eight factors and predict the electricity consumption with the trained model. The SVR structure is shown in Fig. 3. In the training stage, we train the SVR model using the IFs, compare the predicted electricity values with the true EVs, and check whether the model has converged; if not, the training stage is executed again. During the test stage, we use the IFs from the test dataset to predict the electricity values.

### 2.4 RCNN-SVR

Inspired by RCNN and SVR, we combine the deep neural network with SVR and design an RCNN-SVR model. Specifically, we train an SVR model on features extracted by RCNN and then predict the electricity consumption with the trained SVR model. The RCNN-SVR architecture is shown in Fig. 4. Unlike the standalone RCNN and SVR models, RCNN-SVR combines the advantages of both methods: it can extract more features and needs less computational time to train. The model contains two steps: the feature extraction step comes from the RCNN model, and the prediction step comes from the SVR model. To extract the features, we fine-tuned the network. Different from the RCNN architecture above, we add more layers to the RCNN part to obtain more useful features, namely an additional Conv3 and Maxpool3 layer. To reduce the error and prevent overfitting, we use dropout, adding a dropout layer after the Maxpool3 layer. The three Conv layers share one filter size, with the number of filters set for each layer, and the three Maxpool layers share one stride size. In addition, we removed the last two layers (the FC and regression layers), since we could not extract significant features from them. The feature size of the last dropout layer is the same as the feature map size in the Maxpool2 layer of RCNN, but the feature map itself is different: the feature map of the RCNN-SVR model contains more information. As shown in Fig. 5 and Fig. 6, the feature map of the RCNN-SVR model (for both training and testing data) has more features than that of the eight-layer RCNN model in section 2.2. The additional features extracted by the RCNN part provide enough information to train the SVR model, so we can build a better relationship between the features and the actual electricity consumption values.

#### Electricity prediction using RCNN-SVR

To apply the RCNN-SVR model to predicting the electricity consumption of the mineral company, we use RCNN to extract features from the eight IFs and predict the electricity consumption with the trained SVR model. The RCNN-SVR structure is shown in Fig. 4. As shown in Alg. 1, in the training stage we train the SVR model using the features from the RCNN model, compare the predicted electricity values with the true EVs, and check whether the model has converged; if not, the training stage is executed again. During the test stage, we use the IFs from the test dataset to predict the electricity consumption values.
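The two-stage flow (extract features, then fit an SVR on them) can be sketched in plain Python. Everything specific here is an assumption for illustration: `extract_features` is a hypothetical stand-in with fixed random filters rather than the trained RCNN, `train_linear_svr` is a simple subgradient-descent solver for a linear epsilon-insensitive SVR, and the data are synthetic.

```python
import random


def extract_features(factors, filters):
    """Hypothetical stand-in for the RCNN feature extractor:
    1x1 convolution + ReLU per filter, then max-pooling with stride 2."""
    feats = []
    for weight, bias in filters:
        activ = [max(0.0, weight * x + bias) for x in factors]
        feats.extend(max(activ[i], activ[i + 1])
                     for i in range(0, len(activ) - 1, 2))
    return feats


def train_linear_svr(X, y, eps=0.1, lam=1e-3, lr=0.01, epochs=200):
    """Subgradient descent on lam/2*||w||^2 + epsilon-insensitive loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            r = sum(wi * xi for wi, xi in zip(w, x)) + b - t
            g = 0.0 if abs(r) <= eps else (1.0 if r > 0 else -1.0)
            w = [wi - lr * (lam * wi + g * xi) for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


random.seed(0)
filters = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]
months = [[random.random() for _ in range(8)] for _ in range(62)]  # synthetic IFs
evs = [sum(m) for m in months]                                     # synthetic EVs

feats = [extract_features(m, filters) for m in months]   # stage 1: features
w, b = train_linear_svr(feats[:50], evs[:50])            # stage 2: fit SVR
preds = [sum(wi * xi for wi, xi in zip(w, f)) + b for f in feats[50:]]
print(len(feats[0]), len(preds))  # 20 12
```

The point of the split is that the expensive feature extractor is run once, while the regressor fitted on its output is cheap to train.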

## 3 Results

In the experiment, we use data provided by a mineral company. The training data are the electricity consumption values of the past 50 months, and the test data are the electricity consumption values of 12 months. The data were processed as described in section 2.1. Fig. 7 and Fig. 8 compare the prediction results of the RCNN-SVR model with those of RCNN, SVR, MPSO-BP, and DeepEnergy. In Fig. 7, the vertical axis represents the electricity consumption (kWh) and the horizontal axis denotes the test months. According to the results in Fig. 8, the RCNN-SVR model has the highest accuracy among all models.

### 3.1 Evaluation of model accuracy

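To evaluate the prediction results, we employ three evaluation functions: Mean Squared Error (MSE), Mean Absolute Percentage Error (MAPE), and Cumulative Variation of Root Mean Square Error (CV-RMSE) [17]. These evaluation functions are defined in equation (3), where $y_i$ is the true electricity value, $\hat{y}_i$ is the predicted value, and $N$ is the data size.

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2, \qquad \mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N} \left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%, \qquad \text{CV-RMSE} = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(\frac{y_i - \hat{y}_i}{y_i}\right)^2}}{\frac{1}{N}\sum_{i=1}^{N} y_i} \times 100\% \tag{3}$$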
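The three metrics can be computed with a short helper. This sketch assumes the usual definitions of MSE and MAPE, and for CV-RMSE it assumes the form used in [17], where the root mean square of the relative errors is divided by the mean of the true values; the toy numbers are illustrative, not data from the paper.

```python
from math import sqrt


def mse(y_true, y_pred):
    """Mean of squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def cv_rmse(y_true, y_pred):
    """Assumed form from [17]: RMS of relative errors over the mean true value, in percent."""
    n = len(y_true)
    rel_rms = sqrt(sum(((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return 100.0 * rel_rms / (sum(y_true) / n)


y_true = [10.0, 20.0]   # toy values
y_pred = [9.0, 22.0]
print(mse(y_true, y_pred))
print(round(mape(y_true, y_pred), 4))
print(round(cv_rmse(y_true, y_pred), 4))
```

As a rough plausibility check of the assumed CV-RMSE form: if the test EVs are of similar magnitude to the 2018 forecasts (roughly 36 to 39 kWh/t), a relative error of about 2% divided by a mean near 37 gives a value on the order of 0.05%, which is the same order as the CV-RMSE values reported in the results.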

The comparison results of the five models are shown in table 1. The MSE, MAPE, and CV-RMSE of the RCNN-SVR model are the smallest among all models, at 0.8564, 1.975%, and 0.0687%, respectively. The MAPE of the SVR model is the largest among all models, with an average error of about 2.3341%. Likewise, the CV-RMSE of the SVR model is the largest among all models, at about 0.0809%. According to the average MAPE, the forecasting accuracy of the tested models in descending order is: RCNN-SVR, RCNN, MPSO-BP, DeepEnergy, and SVR; by MSE and CV-RMSE, MPSO-BP ranks ahead of RCNN, while the best and worst models are unchanged. SVR uses the least time of all models (1.82 s); among the remaining models, RCNN-SVR (4.35 s) uses less time than RCNN, MPSO-BP, and DeepEnergy.

Table 1: Comparison results of the five models.

Metric | RCNN-SVR | RCNN | SVR | MPSO-BP [4] | DeepEnergy [17]
---|---|---|---|---|---
MSE | 0.8564 | 1.0690 | 1.1639 | 0.9236 | 1.0720
MAPE | 1.975% | 2.1239% | 2.3341% | 2.2665% | 2.330%
CV-RMSE | 0.0687% | 0.0755% | 0.0809% | 0.0745% | 0.0760%
Time (s) | 4.35 | 221.74 | 1.82 | 27.21 | 3758

From table 1, we find that our RCNN-SVR model has the smallest MSE, MAPE, and CV-RMSE, which means that it has the highest accuracy among the tested methods. Therefore, among the tested models, RCNN-SVR is the most suitable method for electricity consumption prediction, and we recommend using it to predict the electricity consumption of the mineral company.

### 3.2 Forecasting of electricity consumption of each month in 2018

Using the trained RCNN-SVR model, we predict the electricity consumption values of each month in 2018, as shown in table 2. The electricity consumption is predicted to increase in November and December, which may be caused by heavy pressure on the operation, maintenance, and heating supply of the power system.

Table 2: Predicted electricity consumption values (EVs) of each month in 2018.

Month | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
---|---|---|---|---|---|---|---|---|---|---|---|---
EVs (kWh/t) | 38.78 | 38.56 | 37.01 | 36.83 | 35.76 | 35.94 | 37.33 | 36.84 | 36.76 | 36.06 | 37.78 | 38.28

## 4 Discussion

Traditional methods such as SVR and the BP neural network have been applied to electricity consumption prediction. In this paper, these methods also provide reasonable results (as shown in table 1). The results of SVR are the worst among these methods; one reason is that, because of the limited data, there are not enough features to train on. According to table 1, RCNN has a relatively long computational time, which is caused by the feature extraction and training steps. Our RCNN-SVR model has the lowest MSE, MAPE, and CV-RMSE compared with the other methods. Furthermore, the choice of the layer from which features are extracted in the RCNN-SVR model is important: as shown in Fig. 9, the MSE is reduced overall when later layers are selected. This implies that the most useful features appear in the last layers of the RCNN-SVR model, so we may get better results if we use the features from the last layer.

## 5 Conclusions

In this paper, we propose a regressive convolution neural network and support vector regression (RCNN-SVR) model for electricity consumption forecasting. The proposed model is validated experimentally with electricity consumption data from the past five years. In the experiment, data from a mineral company were used, and historical electricity demands were considered. According to the experimental results, the RCNN-SVR model can accurately predict the electricity consumption of the following months. The proposed model is also compared with four models that have been used in electricity consumption forecasting. The comparison shows that our RCNN-SVR model performs best among all tested algorithms, with the lowest MSE, MAPE, and CV-RMSE. The proposed method also reduces computation time relative to RCNN (4.35 s versus 221.74 s). The proposed RCNN-SVR method thus addresses the three issues stated in the introduction: (1) reducing the computational cost; (2) training the model with limited data; and (3) improving the prediction accuracy. Therefore, the RCNN-SVR model can be used to predict the electricity consumption of a mineral company.

However, our study is limited by the data size. In future work, we will first test our model with more data and then explore other neural networks, such as DenseNet and adversarial neural networks, to extract features from the data. Moreover, the model proposed in this paper can be used to predict electricity values in other fields, such as wind power generation and agricultural electricity consumption.

### References

1. Abdollah Kavousi-Fard, Haidar Samet, and Fatemeh Marzbani. A new hybrid modified firefly algorithm and support vector regression model for accurate short term load forecasting. Expert Systems with Applications, 41(13):6047-6056, 2014.
2. Song Ding, Keith W Hipel, and Yao-guo Dang. Forecasting China's electricity consumption using a new grey prediction model. Energy, 149:314-328, 2018.
3. Fazil Kaytez, M Cengiz Taplamacioglu, Ertugrul Cam, and Firat Hardalac. Forecasting electricity consumption: A comparison of regression analysis, neural networks and least squares support vector machines. International Journal of Electrical Power & Energy Systems, 67:431-438, 2015.
4. Zhang Youshan, Guo Liangdong, Li Qi, and Li Junhui. Electricity consumption forecasting method based on MPSO-BP neural network model. Proceedings of the 2016 4th International Conference on Electrical Electronics Engineering and Computer Science (ICEEECS 2016), 50:674-678, 2016.
5. Diyar Akay and Mehmet Atak. Grey prediction with rolling mechanism for electricity demand forecasting of Turkey. Energy, 32(9):1670-1675, 2007.
6. Vincenzo Bianco, Oronzio Manca, and Sergio Nardini. Electricity consumption forecasting in Italy using linear regression models. Energy, 34(9):1413-1421, 2009.
7. RE Abdel-Aal and AZ Al-Garni. Forecasting monthly electric energy consumption in eastern Saudi Arabia using univariate time-series analysis. Energy, 22(11):1059-1069, 1997.
8. L Ekonomou. Greek long-term energy consumption prediction using artificial neural networks. Energy, 35(2):512-517, 2010.
9. Shuai Wang, Lean Yu, Ling Tang, and Shouyang Wang. A novel seasonal decomposition based least squares support vector regression ensemble learning approach for hydropower consumption forecasting in China. Energy, 36(11):6542-6554, 2011.
10. Chaoqing Yuan, Sifeng Liu, and Zhigeng Fang. Comparison of China's primary energy consumption forecasting by using ARIMA (the autoregressive integrated moving average) model and GM (1, 1) model. Energy, 100:384-390, 2016.
11. Ted Soubdhan, Joseph Ndong, Hanany Ould-Baba, and Minh-Thang Do. A robust forecasting framework based on the Kalman filtering approach with a twofold parameter tuning procedure: Application to solar and photovoltaic prediction. Solar Energy, 131:246-259, 2016.
12. HM Al-Hamadi and SA Soliman. Short-term electric load forecasting based on Kalman filtering algorithm with moving window weather and load model. Electric Power Systems Research, 68(1):47-59, 2004.
13. Yi-Chung Hu. Electricity consumption prediction using a neural-network-based grey forecasting approach. Journal of the Operational Research Society, 68(10):1259-1264, 2017.
14. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097-1105, 2012.
15. Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431-3440, 2015.
16. Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188, 2014.
17. Ping-Huan Kuo and Chiou-Jye Huang. A high precision artificial neural networks model for short-term energy load forecasting. Energies, 11(1):213, 2018.
18. Alex J Smola and Bernhard Schölkopf. A tutorial on support vector regression. Statistics and Computing, 14(3):199-222, 2004.
19. Debasish Basak, Srimanta Pal, and Dipak Chandra Patranabis. Support vector regression. Neural Information Processing-Letters and Reviews, 11(10):203-224, 2007.
20. Yichuan Tang. Deep learning using linear support vector machines. arXiv preprint arXiv:1306.0239, 2013.