Exploiting Vulnerabilities of Load Forecasting Through Adversarial Attacks
Abstract.
Load forecasting plays a critical role in the operation and planning of power systems. Using input features such as historical loads and weather forecasts, system operators and utilities build forecast models to guide decision making in commitment and dispatch. As forecasting techniques become more sophisticated, however, they also become more vulnerable to cybersecurity threats. In this paper, we study the vulnerability of a class of load forecasting algorithms and analyze the potential impacts on power system operations, such as load shedding and increased dispatch costs. Specifically, we propose data injection attack algorithms that require minimal assumptions on the ability of the adversary. The attacker does not need any knowledge about the load forecasting model or the underlying power system. Surprisingly, our results indicate that standard load forecasting algorithms are quite vulnerable to the designed black-box attacks. By injecting malicious data only into the temperatures obtained from online weather forecast APIs, an attacker can manipulate load forecasts in arbitrary directions and cause significant, targeted damage to system operations.
1. Introduction
Load forecasting is a fundamental step in power system planning and operations. It informs system operators of future load profiles, and serves as the basis of decision-making problems such as unit commitment, reserve management, economic dispatch and maintenance scheduling (Gross and Galiana, 1987). Consequently, the accuracy of forecasted loads directly impacts the cost and reliability of system operations (Hobbs et al., 1999). With a growing penetration of new technologies on the demand side, utilities and system operators need forecasts that are both accurate and robust.
For years, the holy grail in short-term load forecasting has been improving forecast accuracy, a goal vigorously pursued by the research community. The variations in load are driven by many different factors, including temperature, weather, temporal and seasonal effects (e.g., weekday vs. weekend) and other socioeconomic factors. All of these factors influence the load in nonlinear and complex ways. Over the past decades, a myriad of load forecasting algorithms have been proposed and adopted in practice. See, for example, (Gross and Galiana, 1987; De Gooijer and Hyndman, 2006; Charytoniuk et al., 1998) and the references therein.
For simplicity, in this paper, we restrict the inputs of the algorithms to historical load data, time indicators and temperature information. These algorithms can be thought of as finding a mapping from the (high-dimensional) input features to the forecasted time series of load values. Statistical and machine learning techniques, such as support vector regression (Ceperic et al., 2013), ARIMA (Contreras et al., 2003) and neural networks (Hippert et al., 2001; Chen et al., 2018a), have been applied to short-term load forecasting and are well adopted in practice. Recent advances in deep learning have opened the door to using more input features and deeper model architectures to further improve load forecasting accuracy, providing some of the best performances to date (Kong et al., 2017; Quilumba et al., 2015; Wang et al., 2016).
As forecasting methods become more complex and accurate, they also become more susceptible to cybersecurity threats. In this paper, we look into the data vulnerabilities of such methods, where an attacker adversarially injects false data into the input features of forecasting algorithms. Specifically, we investigate false data injection attacks on the temperature data. Temperature is an important input to load forecasting algorithms and is mostly obtained from external services/APIs, providing an easier avenue for data perturbations and attack injections. The potential damage of these types of attacks can be significant, leading to increases in system operation costs and possibly even more catastrophic events such as load shedding. In Figure 1, we show the schematic of threats and proposed attacks.
In this paper, we take the perspective of an attacker, develop attack strategies on load forecasting algorithms, and conduct damage analysis of the proposed attacks. We take a restrictive setting of both the attacker's "knowledge" and "capabilities": the attacker does not know any parameters of the targeted load forecasting algorithm, and can only inject perturbations into the input temperatures under constraints designed to avoid detection.
Under this setup, we develop two simple data-driven attack strategies for finding the perturbations injected into the original temperature data. Surprisingly, we find the proposed attacks significantly degrade the performance of a class of (accurate) load forecasting algorithms. With only a few degrees of perturbation injected into the input temperatures, the load forecasting algorithm's output deviates drastically from its original values. We also assess the damage caused by such model vulnerabilities in power system operations. Simulations based on real-world load datasets show that by changing only a few degrees of temperature, adversarial forecasts not only increase the operation cost of power systems, but can also lead to load shedding and infeasible generator schedules.
This study illustrates the need to look at other properties of load forecasting techniques in addition to forecast accuracy. We demonstrate that accuracy may not mean robustness, and since a wrong forecast of load potentially leads to costly operation decisions or system damages, we call for a more comprehensive analysis when developing and applying load forecasting techniques. Specifically, we make the following contributions in this work:

To the best of our knowledge, this is the first work to evaluate the security of load forecasting procedures in power system operations. Data vulnerabilities of current forecasting methods are discussed and formulated.

Two data-driven, black-box attack algorithms, namely learn and attack and gradient estimation, are proposed to generate hard-to-detect, adversarial input data for load forecasting algorithms.

Case studies on power system operations demonstrate the potential damage of the proposed attacks. We show that strategically designed adversarial injections can lead to either increased system operating costs or load shedding.
We open-source our code for load forecasting model development, attack implementation and market operation evaluation, and package it for evaluating load forecasting robustness and security (https://github.com/chennnnnyize/load_forecasts_attack).
The rest of the paper is organized as follows. A literature review is presented in Section 2. We then briefly summarize a general load forecasting model and formulate the objective and constraints of attackers in Section 3. In Section 4, we detail the algorithms for implementing the attack. To illustrate the attack's threats to power system operations, we describe the market setup and a toy example in Section 5. Through simulations based on real-world load data in Section 6, we demonstrate the threats posed by the proposed attacks. Further discussion on model/data security and conclusions are given in Section 7.
2. Related Work
In this section, we give a brief literature review of both load forecasting methods and the cybersecurity of power systems. Our work differs from most related work in two aspects: most studies on forecasting do not consider security and robustness, while most studies on power system security assume attackers with extensive knowledge of the targeted system and strong capabilities, whereas we evaluate attacks requiring almost no system knowledge and only constrained capabilities.
Our work is related to the large body of work on forecasting in power networks, such as renewables forecasting (Pinson et al., 2007) and load forecasting (Gross and Galiana, 1987; Park et al., 1991). Since the costs of erroneous forecasts are so high, even reducing forecast error by a few percentage points is important (De Gooijer and Hyndman, 2006). Various methods have been applied and evaluated on load forecasting problems, including nonparametric regression (Charytoniuk et al., 1998), support vector regression (Ceperic et al., 2013), ARIMA (Contreras et al., 2003) and neural networks (Hippert et al., 2001; Chen et al., 2010). Among these forecast models, neural networks have become increasingly popular, as they provide highly accurate results thanks to their ability to represent the complex relations between high-dimensional features and outputs.
The recent progress in deep learning and data science also promotes the use of deep neural networks and more complicated feature representations in forecast models (Chen et al., 2018a; Kong et al., 2017; Chen et al., 2017). Many works focus on feature selection and feature engineering by considering the uncertainties coming from both electrical loads (Wang et al., 2018) and exogenous variables such as weather (Hong et al., 2010; Wang et al., 2016) and customer behaviors (Quilumba et al., 2015). However, most research does not look into robustness issues, and model performance under adversarial environments is rarely discussed (Luo et al., 2018; Chen et al., 2018b).
Our work also falls under the scope of cyber-physical system security, especially the cybersecurity of power systems (McDaniel and McLaughlin, 2009). Many studies have focused on compromising the communication, sensing or monitoring processes in modern smart grids (Sridhar et al., 2012; Mo and Sinopoli, 2009). For instance, denial-of-service attacks and deception attacks aim at compromising either the communication channel or communication packets (Amin et al., 2009); false data injections on state estimation have been widely discussed (Kosut et al., 2010; Liu et al., 2011), where the attacker introduces estimation errors on state variables, e.g., phase angles and voltage magnitudes. Such attacks strategically manipulate meter measurements to bypass conventional bad data detection. In (Xie et al., 2010; Tan et al., 2018), the authors analyzed how maliciously changed system states could affect market operations during the dispatch process. Most previous attacks assume full knowledge of the system configuration. It is also assumed that attackers possess strong capabilities to implement attacks, e.g., to compromise the communication channel or to modify meter data arbitrarily.
In this paper, we focus on previously overlooked vulnerabilities in the load forecasting process. For instance, forecasting model inputs can be exposed to adversarial modification, and model performance may be impacted by such malicious changes. Recently, there has been considerable debate on the security of machine learning models (Szegedy et al., 2013) following deep learning's state-of-the-art achievements on a range of benchmark tasks. In computer vision, researchers found that small, adversarially designed perturbations injected into a clean image can deceive a well-trained image classifier (Papernot et al., 2016b; Bhagoji et al., 2018). We are interested in whether such attacks can also impact the performance of load forecasting models and, if so, to what extent. The proposed class of data injection attacks does not assume the forecasting model itself is known to the attacker. In addition, successful distortion of load forecasts impacts the reliable operation of power systems, so it is important to investigate the data vulnerabilities of existing load forecasting methods.
3. Formulation: Forecasters and Attackers
In this section, we formally describe the forecasting and attacking models. To set up realistic vulnerability analyses, we also describe the set of restrictions on the knowledge and capability of the attacker.
3.1. Load Forecasting Formulation
The schematic of a general load forecasting model is depicted in Figure 1. We consider the setup for a family of load forecasting algorithms with different architectures. The input features of these algorithms include historical records of load, weather forecasts including temperature, weather indicators (e.g., sunny, rainy or cloudy) and seasonal indicator variables such as weekday/weekend and hour of the day. Mathematically, the system operator collects a training dataset $\mathcal{D} = \{(X_t, Y_t)\}_{t=1}^{N}$ based on available historical data. Here $Y_t \in \mathbb{R}^{H}$ are vectors of scalars representing scaled load values (or response variables) (Gross and Galiana, 1987). $H$ is the model's forecast horizon, typically ranging from one hour to one day in short-term load forecasting. $X_t \in \mathbb{R}^{d}$ are scaled, $d$-dimensional input feature vectors (or numerical predictor variables). Let us denote $X_t = (L_t, T_t, S_t)$, where $L_t$ are the load history records over the past $w$ steps; $T_t$ are the temperature value vectors, which can be acquired from either system historical records or a weather forecast API; and $S_t$ is a collection of indicators describing weather characteristics, seasonal factors and time factors. The window length $w$ determines how much training data history the operator takes into consideration for forecasting. A longer history provides more information to the forecast model, yet makes model training and fitting more difficult.
In the task of load forecasting, one is interested in finding a function $f_{\theta}$ parameterized by $\theta$, $f_{\theta}: X_t \mapsto Y_t$, which learns the mapping from input features $X_t$ to future loads $Y_t$. The mean absolute error (MAE), defined as the average $\ell_1$ norm of the difference between forecasted and actual loads, is widely used to measure the performance of a forecasting algorithm. An estimate of $\theta$ is obtained by minimizing the $\ell_1$ norm of the difference between model predictions and ground truth values:
(1a) $\hat{\theta} = \arg\min_{\theta} \; \frac{1}{N} \sum_{t=1}^{N} \left\| \hat{Y}_t - Y_t \right\|_1$
(1b) $\text{s.t.} \quad \hat{Y}_t = f_{\theta}(X_t)$
During training, ground truth historical records of $T_t$ and $Y_t$ are used; during testing and real-world system implementation, we use temperature values $T_t$ coming from weather forecasts to forecast the future loads. Once the model is learned, it can be applied in a rolling-horizon fashion, making use of the forecasted $\hat{Y}_t$ along with updated $T_t$ and $S_t$ to forecast further into the future.
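As a concrete illustration of the training objective in (1), the sketch below fits a simple linear forecaster by subgradient descent on the MAE loss. The linear model family, learning rate and synthetic features are illustrative assumptions only, not the paper's actual implementation:

```python
import numpy as np

def train_mae_forecaster(X, Y, lr=0.01, epochs=500):
    """Fit a linear model f_theta(x) = W x + b by subgradient descent
    on the mean absolute error, i.e. the l1 objective in (1a)."""
    n, d = X.shape
    h = Y.shape[1]                      # forecast horizon H
    W = np.zeros((d, h))
    b = np.zeros(h)
    for _ in range(epochs):
        resid = X @ W + b - Y           # forecast residuals, shape (n, h)
        g = np.sign(resid)              # subgradient of |resid|
        W -= lr * X.T @ g / n
        b -= lr * g.mean(axis=0)
    return W, b

# synthetic data: load depends linearly on (scaled) temperature + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))           # e.g. [temperature, hour, day-type]
true_W = np.array([[2.0], [0.5], [0.1]])
Y = X @ true_W + 0.01 * rng.normal(size=(200, 1))

W, b = train_mae_forecaster(X, Y)
mae = np.abs(X @ W + b - Y).mean()
```

Any differentiable regressor trained on the same loss would serve the same role; the linear choice just keeps the example transparent.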
3.2. Specific Forecasting Models
We describe the model setup for several representative load forecasting algorithms which have achieved good performance and have been widely adopted (Hippert et al., 2001). In Appendix A we detail the model parameter settings and training approaches. We note that the vulnerability analysis conducted in this paper is not constrained to the following forecasting algorithms. As long as the model output is sensitive with respect to the input features, our proposed attack methods are able to alter the load patterns maliciously.
3.2.1. FeedForward Neural Networks
Multi-layered, feed-forward neural networks (NNs) have been widely used to represent the nonlinearities between input features and output forecasts (Hippert et al., 2001). In the input layer of the neural network, each neuron represents one feature of the training input, and the features of all past steps are stacked together as the inputs. Each intermediate layer can have a tunable number of hidden units, which represent combinations of the input features. Recent advances in deep learning also allow for deeper and more complicated network designs (Chen et al., 2018a).
3.2.2. Recurrent Neural Networks
A recurrent neural network (RNN) is a class of neural networks specially designed for sequential modeling (Vermaak and Botha, 1998). Instead of stacking all time steps' features together as in the feed-forward neural network, an RNN feeds each step's input sequentially and outputs a hidden unit representing the combination of the current input and historical features. The last neuron outputs the forecasted load values.
3.2.3. Long ShortTerm Memory
The Long Short-Term Memory network (LSTM) is designed to deal with the vanishing gradient problem that arises in RNNs with long-term dependencies (Kong et al., 2017). The major improvement over the RNN is the design of "forget" gates, which model temporal dependencies and capture long-term dependencies in load patterns more accurately.
3.3. Objective of Attacker
The attacker's goal is to distort the forecasted load as much as possible in a certain direction, e.g., to either increase or decrease the forecasted values. In order to distort the output forecast values, the attacker has two choices of where to insert attacks: the historical loads $L_t$ or the temperature inputs $T_t$. While the trained model itself is often safely kept by the operators, it has to use external data such as weather forecasts as input features. The attacker's goal is thus to inject perturbations into the weather forecasts coming from external services to generate adversarial input data $X_{adv}$ for $f_{\theta}$, so that the predictions are modified. We use $c \in \{+1, -1\}$ to denote the attack direction chosen by the attacker. If $c = +1$, the attacker tries to find $X_{adv}$ that decreases the load forecast values; when $c = -1$, the attacker tries to find $X_{adv}$ that increases the load forecast values. Since load values are always positive, the attacker's goal is to find the $X_{adv}$ that minimizes the value of $c \cdot f_{\theta}(X_{adv})$.
3.4. Attacker’s Knowledge
We consider two attack scenarios: white-box and black-box attacks. In the white-box setting, the attacker is assumed to know the model parameters $\theta$ exactly. This is a strong assumption in the sense that the load forecast model is fully exposed to the attacker. On the contrary, in the black-box setting, the attacker only knows which family of load forecasting model has been applied (e.g., NN or RNN), but is blind to the forecasting algorithm and has no knowledge of any parameters of $f_{\theta}$. We consider two possible avenues of attack. In the first case, the attacker possesses a substitute training dataset $\mathcal{D}'$, which may or may not be the same as $\mathcal{D}$. Such a dataset also represents the ground truth of historical loads and features. In the second case, the attacker cannot acquire such a dataset due to lack of access to the historical load records. We instead assume the attacker has query access to the load forecasting model (such query access is plausible in many forecast-as-a-service businesses, e.g., SAS energy forecasting and Itron forecasting). That is, the attacker can query the implemented load forecasting model with different values of input features a limited number of times, and then try to gain insight into how $f_{\theta}$ works.
3.5. Attacker’s Capability
As an attacker, it is important to avoid being detected by the bad data detection algorithms used by system operators. The attacker's capability could be upper bounded by the allowed number of perturbed entries in the input data, by the average deviation over all features, or by the largest deviation from the original values. Mathematically, the attacker wants to keep $\|X_{adv} - X\|_p$ bounded, where $p$ can take different values, such as $0$, $1$, $2$ or $\infty$, to express the norm constraints corresponding to different detection algorithms.
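Such norm constraints can be enforced by projecting a candidate attack back into the allowed perturbation set. Below is a minimal sketch for the $\ell_\infty$ and $\ell_2$ cases; the function name and interface are our own, not taken from the paper's code:

```python
import numpy as np

def project_onto_ball(x_adv, x0, eps, p="inf"):
    """Project an attack vector back into the allowed set
    {x : ||x - x0||_p <= eps} around the clean input x0."""
    delta = x_adv - x0
    if p == "inf":
        # l_inf ball: clip every entry of the perturbation to [-eps, eps]
        delta = np.clip(delta, -eps, eps)
    elif p == 2:
        # l2 ball: rescale the perturbation onto the sphere if outside
        norm = np.linalg.norm(delta)
        if norm > eps:
            delta = delta * (eps / norm)
    else:
        raise ValueError("unsupported norm")
    return x0 + delta
```

For temperature attacks, an $\ell_\infty$ radius of e.g. 2 degrees would mean no single injected temperature deviates from the forecast by more than 2 degrees.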
In summary, we formulate the model of attackers as the following optimization problem:
(2a) $\min_{X_{adv}} \; c \cdot f_{\theta}(X_{adv})$
(2b) $\text{s.t.} \quad \left\| X_{adv} - X \right\|_p \leq \epsilon$
Note that there is a parallel between the forecasting problem (1) and the attack problem (2): the optimization directions and optimization variables are exactly opposite. The forecasting model optimizes over the model parameters to minimize forecast errors, while the attacker optimizes over the model inputs to maximize the errors in targeted directions. However, due to the lack of model knowledge in the black-box setting, it is challenging for the attacker to find an efficient attack input via (2). In the next section, we describe two attack methods that rely only on a substitute training dataset and on query access, respectively.
4. Blind Attack on Load Forecasting
In this section, we first describe attacks under the white-box setting, where the attacker possesses full knowledge of the load forecasting model parameters. This serves as a benchmark for evaluating the attacker's success. We then focus on two more realistic settings where the attacker does not know the model parameters. We describe how data injection attacks can be implemented when either historical data is available or the attacker has limited query access to the load forecasting model.
4.1. WhiteBox Attack
Under the white-box setting, since the model parameters are known to the attacker, it is possible to find the attack input via solving (2). For notational convenience, we omit the superscript on $X$ in some of the following paragraphs, and introduce attack methods that generalize beyond temperature forecasts to injecting false data into other features.
Since most state-of-the-art load forecasting algorithms use complex models such as neural networks, the attacker's problem (2) is nonconvex and, furthermore, there is no closed-form solution for $X_{adv}$. Nevertheless, an attacker can still find attack vectors iteratively by taking gradients with respect to each time step's temperature values. Even though this may not find the optimal solution to (2), because of the highly nonconvex nature of the forecasting model, even a slight (suboptimal) perturbation of the input features can drastically change the forecast output.
Based on (2), we define a loss function $\mathcal{L}$ with respect to each time step's feature $X_t$. The attacker then iteratively takes gradients of $\mathcal{L}$ to find the adversarial input $X_{adv}$. The constraint (2b) is included in the loss function using a log-barrier:
(3) $\mathcal{L}(X) = c \cdot f_{\theta}(X) - \lambda \log\left( \epsilon - \left\| X - X_0 \right\|_p \right)$
where $\lambda$ is the weight of the barrier term and $X_0$ denotes the clean input. Since there are a large number of parameters and input features in many load forecasting algorithms, it can be computationally expensive to compute the exact gradient values for each input feature. We follow the simpler method of (Szegedy et al., 2013) and only update the feature values based on the sign of the gradient at each iteration $k$:
(4) $X^{(k+1)} = X^{(k)} - \alpha \cdot \mathrm{sign}\left( \nabla_X \mathcal{L}(X^{(k)}) \right)$
where $\alpha$ controls the step size for updating the adversarial temperature values. The resulting adversarial temperature vector is obtained by applying (4) a number of times.
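The iterative sign-gradient update (4) can be sketched as follows. For simplicity, this sketch replaces the log-barrier of (3) with a hard projection onto an $\ell_\infty$ ball after each step, and uses a linear "forecaster" (whose gradient is its constant weight vector) as a stand-in for a trained model:

```python
import numpy as np

def whitebox_sign_attack(grad_fn, x0, c=1.0, eps=2.0, alpha=0.25, iters=20):
    """Iterative sign-gradient attack in the spirit of (4).
    grad_fn(x) returns the gradient of the forecast w.r.t. the input;
    c = +1 pushes forecasts down, c = -1 pushes them up.  A hard
    projection onto the l_inf ball of radius eps around the clean
    input x0 replaces the log-barrier for simplicity."""
    x = x0.copy()
    for _ in range(iters):
        x = x - alpha * np.sign(c * grad_fn(x))
        x = x0 + np.clip(x - x0, -eps, eps)   # stay within allowed perturbation
    return x

# white-box example: a linear "forecaster" f(x) = w @ x with known weights
w = np.array([1.5, -0.5, 2.0])
grad_fn = lambda x: w                         # gradient of w @ x is just w
x_clean = np.array([20.0, 18.0, 22.0])        # e.g. temperature inputs

x_adv = whitebox_sign_attack(grad_fn, x_clean, c=1.0, eps=2.0)
```

With a neural forecaster, `grad_fn` would instead be obtained by backpropagation through the model; the outer loop is unchanged.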
4.2. Learn and Attack
In the learn and attack setting, we assume the attacker has no access to the model parameters and no query access to the model. The only knowledge the attacker has is a historical dataset $\mathcal{D}'$ that includes the same features as the dataset $\mathcal{D}$ used to train the load forecasting model (in this setting, we assume the attacker knows the family of the targeted load forecasting model, e.g., a feed-forward or a recurrent neural network). The proposed attack algorithm consists of a training phase and an attack phase, as shown in Fig. 2(a). In the training phase, the attacker trains a substitute model $f_{\theta'}$ on $\mathcal{D}'$ to minimize the training loss. In the attack phase, the attacker pretends that the substitute model is the true load forecast model and performs white-box attacks on it to find the attack vectors. This strategy is based on the assumption that the substitute model behaves similarly to the true model, not only on the training data $X$ but also on the attack vector $X_{adv}$. Then, by injecting $X_{adv}$ into the true load forecasting model, the forecast values move in the attacker's desired direction.
It is useful to evaluate the transferability of the proposed attacks across different models $f_{\theta}$ and $f_{\theta'}$. The phenomenon of transferability in adversarial attacks on machine learning models has been discussed in (Papernot et al., 2016a; Hosseini et al., 2017): an adversarial instance generated using one model is, with high probability, also adversarial for another model. The theoretical understanding of why attacks transfer remains an open question and is out of scope for this paper. In Fig. 3 we show that such transferability also exists for load forecasting models. The temperature inputs are generated by applying the iterative gradient update (4) to a substitute model under norm constraints on the attack perturbations, yet these adversarial temperature values also mislead the (unknown) true load forecasting model into being wildly inaccurate.
4.3. Gradient Estimation Attack
To implement learn and attack on load forecasting algorithms, the attacker needs to obtain a version of the training data to learn a substitute load forecasting model. In the case where no historical data records are available, if the attacker is able to query the load forecasting algorithm a limited number of times, it is still possible to construct adversarial temperature inputs by using the queries to estimate gradients. In Figure 2(b) we show the schematic of generating adversarial temperature instances via querying.
For the $i$-th dimension of the input feature at time stamp $t$, $x_{t,i}$, the attacker queries the load forecasting system twice per feature dimension to calculate the two-sided estimate of the gradient of $\mathcal{L}$:
(5) $\widehat{\nabla}_{x_{t,i}} \mathcal{L} = \frac{\mathcal{L}(X + \delta e_{t,i}) - \mathcal{L}(X - \delta e_{t,i})}{2\delta}$
where $e_{t,i}$ is a $d$-dimensional vector of all zeros except at the $(t,i)$-th component, and $\delta$ takes a small value for gradient estimation. Once the gradient is estimated for each dimension of the temperature features, we follow the same method as in (4) to iteratively build the adversarial features using the estimated gradient vectors:
(6) $X^{(k+1)} = X^{(k)} - \alpha \cdot \mathrm{sign}\left( \widehat{\nabla}_X \mathcal{L}(X^{(k)}) \right)$
To satisfy the norm constraints on the allowed perturbation of $X$, the attacker projects the adversarial data back onto the predefined norm ball after finishing the iterative attack construction. In (Bhagoji et al., 2018), techniques for reducing the number of queries when attacking an image classifier are discussed, which could also improve the query efficiency of load forecasting attacks.
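The two-sided finite-difference estimate (5) can be sketched as below. The hidden linear model stands in for the deployed forecaster; the attacker sees only its query interface, never its weights:

```python
import numpy as np

def estimate_gradient(query_fn, x, delta=1e-4):
    """Two-sided finite-difference gradient estimate as in (5),
    using one pair of queries per input dimension."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = delta                          # unit perturbation e_{t,i}
        grad[i] = (query_fn(x + e) - query_fn(x - e)) / (2 * delta)
    return grad

# the attacker only has query access to this (hypothetical) model
hidden_w = np.array([1.5, -0.5, 2.0])
query_fn = lambda x: float(hidden_w @ x)      # returns a scalar forecast

g = estimate_gradient(query_fn, np.array([20.0, 18.0, 22.0]))
```

For a linear model the estimate recovers the hidden weights exactly (up to floating-point error); for a nonlinear forecaster it yields a local gradient that feeds directly into the update (6).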
5. Attacks on System Operations
In this section, we first describe a power system operation model consisting of a day-ahead planning stage and a real-time operational stage, which is simple yet close to real-world market operations. We then describe two simple temporal attack strategies that pose threats to such system operations by injecting perturbations into the load forecasting inputs.
5.1. Power System Operations Model

A commitment schedule is created by a unit commitment (UC) model based on the day-ahead load forecasts:
(7a) $\min_{\mathbf{u}, \mathbf{p}} \; \sum_{t} \sum_{i} \left[ C_i(p_{i,t}) + C^{SU}_i z_{i,t} + C^{SD}_i w_{i,t} \right]$
s.t.
(7b) $\sum_{i} p_{i,t} = \hat{D}_t, \quad \forall t$
(7c) $u_{i,t} P^{\min}_i \leq p_{i,t} \leq u_{i,t} P^{\max}_i, \quad \forall i, t$
(7d) $z_{i,t} - w_{i,t} = u_{i,t} - u_{i,t-1}, \quad \forall i, t$
(7e) $\sum_{\tau = t}^{t + UT_i - 1} u_{i,\tau} \geq UT_i \, z_{i,t}, \quad \forall i, t$
(7f) $\sum_{\tau = t}^{t + DT_i - 1} \left( 1 - u_{i,\tau} \right) \geq DT_i \, w_{i,t}, \quad \forall i, t$
(7g) $-R^{\mathrm{dn}}_i \leq p_{i,t} - p_{i,t-1} \leq R^{\mathrm{up}}_i, \quad \forall i, t$
(7h) $u_{i,t}, z_{i,t}, w_{i,t} \in \{0, 1\}$
where $u_{i,t}$ is the binary decision variable for the commitment status of generator $i$ at time $t$, with $u_{i,t} = 1$ indicating generator $i$ is online; $p_{i,t}$ is the real power output of generator $i$ at time $t$; $z_{i,t}$ and $w_{i,t}$ indicate startup and shutdown events; all the $u_{i,t}$'s and $p_{i,t}$'s are collected into vectors $\mathbf{u}$ and $\mathbf{p}$; $C_i(\cdot)$ and $C^{SU}_i, C^{SD}_i$ represent the dispatch costs and the startup and shutdown costs, respectively, of all the generators in all periods; and $\hat{D}_t$ is the day-ahead load forecast. The constraints are the system-wide power balance constraint (7b), generation limit constraints (7c), the generator logical constraint (7d), the minimum up time constraint (7e), the minimum down time constraint (7f) and ramping constraints (7g). Once solved, the operator obtains the schedule $\mathbf{u}^*$ of online generators at each time $t$.
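To make the UC formulation concrete, the toy sketch below solves a tiny instance by brute-force enumeration of on/off schedules. It keeps only the power balance, capacity limits and startup costs, omitting minimum up/down time and ramping constraints for brevity; all generator data are made up for illustration:

```python
from itertools import product

def toy_unit_commitment(gens, demand):
    """Brute-force a tiny unit commitment: enumerate all on/off
    schedules, dispatch online units in merit order, and keep the
    cheapest schedule that serves all demand.
    gens: list of (p_max, marginal_cost, startup_cost) tuples."""
    best_cost, best_sched = float("inf"), None
    n, T = len(gens), len(demand)
    for sched in product([0, 1], repeat=n * T):
        u = [sched[i * T:(i + 1) * T] for i in range(n)]   # u[i][t]
        cost, feasible = 0.0, True
        for t in range(T):
            online = sorted((g for i, g in enumerate(gens) if u[i][t]),
                            key=lambda g: g[1])            # merit order
            remaining = demand[t]
            for p_max, c, _ in online:
                p = min(p_max, remaining)
                cost += c * p
                remaining -= p
            if remaining > 1e-9:                           # demand unserved
                feasible = False
                break
        if not feasible:
            continue
        for i, (_, _, c_su) in enumerate(gens):            # startup costs
            for t in range(T):
                prev = u[i][t - 1] if t > 0 else 0
                if u[i][t] and not prev:
                    cost += c_su
        if cost < best_cost:
            best_cost, best_sched = cost, u
    return best_cost, best_sched

gens = [(100, 10, 50), (80, 30, 20)]    # (capacity, marginal cost, startup)
cost, sched = toy_unit_commitment(gens, demand=[90, 120, 70])
```

In practice the MILP (7) is solved by a commercial solver (here, via PyPSA); brute force only works at this toy scale but makes the cost trade-off between committing and starting units explicit.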

For each time stage of each day, the dispatch of the scheduled units and the actual dispatch cost are calculated according to a basic economic dispatch (ED) model (Kirschen and Strbac, 2018) based on the actual load $D_t$ and the generation schedule $\mathbf{u}^*$:
(8a) $\min_{\mathbf{p}_t} \; \sum_{i} C_i(p_{i,t})$
s.t.
(8b) $\sum_{i} p_{i,t} = D_t$
(8c) $u^*_{i,t} P^{\min}_i \leq p_{i,t} \leq u^*_{i,t} P^{\max}_i, \quad \forall i$
The ED model finds the real power dispatch $\mathbf{p}_t$ at time $t$ that minimizes the dispatch cost, subject to the system-wide power balance constraint (8b) and the generation limit constraints (8c). The daily operation cost is obtained by summing the 24-hour dispatch costs and the startup and shutdown costs. When the ED based on the day-ahead commitment has no feasible solution, load is shed to maintain the balance between supply and demand.
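With linear marginal costs, a single-period ED reduces to merit-order dispatch over the committed units, with any uncovered demand shed. The sketch below illustrates this (generator data are hypothetical; minimum outputs are taken as zero for simplicity):

```python
def economic_dispatch(online_gens, actual_load):
    """Merit-order dispatch of the day-ahead committed units for one
    period.  Returns (dispatch, cost, shed): if online capacity cannot
    cover the actual load, the remainder is shed.
    online_gens: list of (p_max, marginal_cost) tuples."""
    order = sorted(range(len(online_gens)), key=lambda i: online_gens[i][1])
    dispatch = [0.0] * len(online_gens)
    remaining = actual_load
    for i in order:                       # cheapest committed units first
        p_max, _ = online_gens[i]
        dispatch[i] = min(p_max, remaining)
        remaining -= dispatch[i]
    cost = sum(p * online_gens[i][1] for i, p in enumerate(dispatch))
    return dispatch, cost, max(remaining, 0.0)

# an adversarially lowered forecast led to only one 100 MW unit being
# committed; an actual load of 130 MW then forces 30 MW of load shedding
dispatch, cost, shed = economic_dispatch([(100, 10)], actual_load=130)
```

This directly mirrors the load-shedding outcome described above: shedding occurs precisely when the committed capacity falls short of the realized load.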
5.2. Attack Strategies
Under normal operating conditions, the load forecasting algorithms provide accurate day-ahead load forecasts for system operators to solve (7). During an attack, adversarial temperature forecasts are injected into the day-ahead planning stage to cause deviations from normal operations, e.g., increased system costs, load shedding, infeasible generation dispatch or violation of ramping constraints. We assume the attacker does not know the parameters of the underlying system, such as each generator's capacity and ramp constraints.
We propose two intuitive attack strategies that move the load forecasts as far as possible in a given direction to stress the system. Simple as they are, the toy example in Section 5.3 and the case studies on real-world load data in Section 6 reveal the potential vulnerabilities brought about by these types of load forecasting attacks.
5.2.1. Load Maximization
Under this strategy, the attacker increases the load forecasts as much as possible. With an overestimate of the system load at each time step in the day-ahead stage, the operator tends to turn on more generation units than necessary, which increases system operation costs.
5.2.2. Load Minimization
Under this strategy, the attacker decreases the load forecasts as much as possible. In the day-ahead planning stage, the system operator then underestimates the future load, and fewer generators are scheduled than needed. If the real load is not too much higher than the adversarial forecast, the system can still use spinning reserve to cover the underestimated load, but this can force an expensive dispatch. If the real load exceeds the available capacity, load shedding takes place.
5.3. Toy Example
To illustrate why such simple attack strategies cause increased system costs and operational anomalies, we show a toy example with generators of the same capacity serving an aggregate load. We demonstrate four possible unexpected cases in Figure 4. In this simplified case, we consider a short-horizon forecast and unit commitment. For ease of illustration, we retain the ramp constraints and capacity constraints in the toy example, but drop the minimum up and down time constraints. In Figures 4(a) and 4(d), the attacker drives the forecasts lower than the real loads, and we observe that either the actual load exceeds the scheduled generators' capacity, or the actual ramp exceeds the scheduled generators' ramping capability. In Figures 4(b) and 4(c), the attacker either increases the peak load forecasts or keeps the forecast larger than the actual load. Both cases force the system to keep one more generator online for some time and to dispatch the load uneconomically.
There are other possible attack strategies, such as moving forecasts in random directions, shifting the peak load, cutting the forecasted peak or decreasing the forecasted ramp magnitudes. All of these attacks could bring economic and operational damage to the system, but they would need more specific designs based on the particular load profile and temporal patterns, and may require more knowledge of the system.
5.4. Key Insights
For general power system operations consisting of a planning stage (unit commitment) and a real-time operational stage (economic dispatch), we observe the following characteristics of the impacts of adversarial load forecasts:

Maliciously increasing the load forecast normally incurs extra system costs, such as operating redundant generators or using more expensive generation combinations;

By maliciously decreasing the peak load forecast, system operators overlook the real peaks of future loads and schedule fewer generators. This can cause load shedding or a failure to follow severe ramps in the actual load patterns;

We assume an attacker with constrained capability to modify the input features of load forecast models, and with no knowledge of system parameters such as the generator schedule or the load forecasting model parameters. The proposed attacks could be even more detrimental if the attacker possessed extra knowledge of the system and implemented targeted attacks during certain time periods.
6. Case Studies
In this section, we present detailed simulations on real-world Swiss load data and show the threats posed by our data injection attacks in several ways. In particular, we first illustrate that the proposed attacks can dramatically degrade a set of accurate load forecasting algorithms; we then quantitatively evaluate the damage brought to system operations and compare the results with the case of using clean data for load forecasting. We demonstrate that attackers with little effort and knowledge are able to cause load shedding or infeasible dispatch.
6.1. Experimental Setup Description
Dataset Description: We queried hourly actual load data from the European Network of Transmission System Operators for Electricity (ENTSO-E) API (https://transparency.entsoe.eu/) ranging from Jan 1st, 2015 to May 16th, 2017, and followed (Marino et al., 2016) to collect day-ahead historical weather forecasts for major cities in Switzerland such as Zurich, Basel and Lucerne. All the weather data were queried from the Dark Sky API (https://darksky.net/forecast/47.3769,8.5414/us12/en). We also collect other indicator features, such as one-hot vectors for hour of day, day of week (weekend or weekday), and season of year. We split the data into a training set and use the remainder for validating and evaluating load forecasting accuracy, attack performance, and the case studies on market operations. Note that even though we collected offline data to train and validate both our load forecasting and attack models, the same data collection procedures could be applied in an online fashion, so that an attacker could inject real-time adversarial perturbations into a deployed load forecasting model.
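As a minimal sketch of how the indicator features described above can be assembled, the snippet below concatenates a temperature forecast with one-hot calendar encodings. The helper names and the exact feature layout are our illustration, not the paper's implementation.

```python
import numpy as np

def one_hot(index, size):
    """Return a one-hot vector of the given size."""
    v = np.zeros(size)
    v[index] = 1.0
    return v

def build_features(temperature, hour, weekday, season):
    """Concatenate a temperature forecast with one-hot calendar
    indicators: hour of day, weekday-vs-weekend flag, season of year."""
    hour_vec = one_hot(hour, 24)
    weekend_vec = one_hot(1 if weekday >= 5 else 0, 2)
    season_vec = one_hot(season, 4)
    return np.concatenate([[temperature], hour_vec, weekend_vec, season_vec])

x = build_features(temperature=21.5, hour=14, weekday=6, season=2)
# Feature dimension: 1 + 24 + 2 + 4 = 31
```

In practice one such vector is built per forecast hour and the vectors are stacked into the model's input window.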
Power Systems Setup: The system has one aggregated load for Switzerland based on the ENTSO-E data. The nominal load values are in the range of . We take a simplified power system model using generators with a total capacity of , and omit the network constraints. We adopt generator parameter settings for ramp capacity, generation costs and minimum on/off times based on (Kirschen and Strbac, 2018), and set the spinning reserve requirement to 3% of the total forecasted demand following (Rebours and Kirschen, 2005). During the day-ahead unit commitment run, either normal day-ahead forecasts or adversarial forecasts are used for generation scheduling; during the economic dispatch run, the real loads are used for generation dispatch. The UC and ED models are implemented in Python using PyPSA (Brown et al., 2018), and these two modules are directly interfaced with the load forecasting and attack algorithms. Note that even though our simulated system ignores line constraints, and the attacker knows nothing about the system operation, we already observe a range of damages posed by load forecasting attacks. We expect more severe effects with either more generation constraints or fewer attack constraints.
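To make the dispatch logic concrete, here is a toy merit-order economic dispatch without network constraints, in the spirit of the simplified model above. This is our own sketch, not the paper's PyPSA implementation; all parameter values are illustrative.

```python
def economic_dispatch(load, capacities, costs):
    """Greedy merit-order dispatch: serve the load with the cheapest
    committed generators first. Returns (dispatch, total_cost);
    raises ValueError when committed capacity cannot cover the load,
    i.e. the load-shedding case discussed in the case studies."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    remaining = load
    dispatch = [0.0] * len(costs)
    total_cost = 0.0
    for i in order:
        p = min(capacities[i], remaining)
        dispatch[i] = p
        total_cost += p * costs[i]
        remaining -= p
        if remaining <= 0:
            break
    if remaining > 1e-9:
        raise ValueError("insufficient committed capacity: load shedding")
    return dispatch, total_cost

# Dispatch a 700 MW load over three committed units (illustrative numbers)
dispatch, cost = economic_dispatch(700, [500, 300, 200], [20, 35, 50])
# Cheapest unit runs at 500 MW, next at 200 MW -> cost = 500*20 + 200*35 = 17000
```

The full model in the paper additionally enforces ramp limits, minimum on/off times and spinning reserve, which this sketch omits.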
Model Training and Attack Implementation: We set up three load forecasting models, an NN, an RNN and an LSTM, and use standard stochastic gradient descent methods for model training (Bottou, 2010). For the detailed model setup and training, we refer to Appendix A. All three forecasting methods converge to similar validation errors, and as shown in the first column of Table 1, the errors in mean absolute percentage error (MAPE) are comparable to those reported in several recent studies on load forecasting (Chen et al., 2018a; Kong et al., 2017). We save the model parameters and keep them hidden from the black-box attackers. For the substitute model training in the learn and attack method, we keep the training set the same as the load forecasting training set. Decreasing the size of the training set or using a different substitute dataset would degrade the performance of learn and attack. We impose constraints on the attacker's capability (2b), such that the attacker is limited to a maximum deviation of the perturbed temperature values. We validate the trained models' performance under attacks with varying constraint values. For details of the training techniques, training accuracies, and training and attack implementation times, we refer to Appendix A and Appendix B.
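The attacker's capability constraint (2b) bounds how far each perturbed temperature may deviate from the clean forecast. A minimal sketch of enforcing this budget is a projection onto the corresponding box (the function name is ours):

```python
import numpy as np

def project_perturbation(temps, adv_temps, max_dev):
    """Project perturbed temperatures back onto the attacker's feasible
    set |adv - clean| <= max_dev, i.e. the budget in constraint (2b)."""
    return np.clip(adv_temps, temps - max_dev, temps + max_dev)

clean = np.array([20.0, 22.0, 25.0])
proposed = np.array([28.0, 21.0, 23.5])
feasible = project_perturbation(clean, proposed, max_dev=2.0)
# -> [22.0, 21.0, 23.5]: only the first entry exceeded the +/-2 degree budget
```

Both attack algorithms apply such a projection after every update step, which is why the attacked temperature profiles remain visually close to the clean ones.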
Forecasts Error (MAPE)  Clean Data  Learn and Attack  Gradient Estimation 

NN  
RNN  
LSTM 
6.2. Load Forecasting Performance
We calibrate and compare load forecasting performance with and without adversarial attacks on the test datasets. Though all three models perform well on clean test data, we inject different levels of perturbations generated by the learn and attack and gradient estimation methods respectively, and find that forecasting performance degrades drastically as the adversarial perturbations grow larger (Table 1). In Figure 3 we show the RNN's load forecasting results for hours using the learn and attack algorithm with maximum temperature perturbations of and . The attacker tries to increase the load in the first hours, and decrease the load in the latter hours. We observe that the algorithm finds the correct attack direction to either increase or decrease the load. Moreover, with only a deviation in temperatures, the load forecasts change by over MW at some time steps. When the attacker increases the perturbation to , forecast errors of over MW are observed. The temperature profile before and after the attack still looks similar, which could evade the system operators' security inspection (Figure 5). Table 1 compares all three load forecasting models' performance using clean and adversarial data. Both the learn and attack and gradient estimation algorithms distort all three models' outputs and increase their forecast errors. The gradient estimation attack generally works better for all three models, because estimating the gradients by directly querying the target model is more accurate than calculating them from the substitute model and transferring the perturbations.
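The error metric used throughout Table 1 can be computed as follows; this small helper is our own sketch of the standard MAPE definition.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

print(mape([1000, 2000], [900, 2200]))  # (10% + 10%) / 2 = 10.0
```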
In Figure 6, we evaluate the RNN's load forecasting performance under two attack strategies: load maximization and load minimization. We observe that the gradient estimation attack causes MAPE similar to the white-box attack. The load-decreasing attack is normally more successful than the load-increasing attack in terms of MAPE. The load minimization attack is also more harmful, since increased forecasts only cause system operators to start up more generators, while adversarially decreased forecasts lead to wrong commitment decisions that fail to meet the larger real load.
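The core of the gradient estimation attack can be sketched as a finite-difference loop over queries to the target model. This is our own simplified rendering under stated assumptions: the model is queried through a scalar objective (here, the summed forecast), the step size, iteration count and difference width are illustrative, and the paper's actual algorithm may differ in its query scheme.

```python
import numpy as np

def gradient_estimation_attack(query, x, max_dev, step=0.5, iters=4, h=0.1,
                               direction=+1):
    """Black-box attack sketch: estimate the gradient of the queried
    forecast objective w.r.t. each temperature input by forward
    finite differences, then take signed steps while staying inside
    the +/- max_dev budget. direction=+1 pushes forecasts up, -1 down."""
    x = np.asarray(x, dtype=float)
    adv = x.copy()
    for _ in range(iters):
        grad = np.zeros_like(adv)
        for i in range(len(adv)):  # one extra query per input feature
            e = np.zeros_like(adv)
            e[i] = h
            grad[i] = (query(adv + e) - query(adv)) / h
        adv = adv + direction * step * np.sign(grad)
        adv = np.clip(adv, x - max_dev, x + max_dev)  # stay in budget
    return adv

# Toy surrogate model: forecasted load rises linearly with temperature
adv = gradient_estimation_attack(lambda t: float(np.sum(2.0 * t)),
                                 [20.0, 22.0], max_dev=1.0)
# Each temperature is pushed to its upper bound: [21.0, 23.0]
```

Because only queries are needed, this loop requires no knowledge of the forecasting model's parameters, matching the black-box threat model.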
6.3. Impact of Attacks on Operation Costs
As mentioned earlier, we are interested in the possible consequences of wrong forecasts. We first analyze the increased costs caused by adversarial forecasts. We implement the learn and attack algorithm on 3 weeks of randomly selected test data to increase the forecasted load at each time step. Under such circumstances, the system operator sets the day-ahead generator schedule based on adversarial loads that are larger than the actual loads. In Figure 7 we show a bar plot of the increased costs versus varying perturbations of the temperature forecasts. When the temperature perturbations are small, the cost increases are limited, and mostly due to extra start-up costs. When the perturbation becomes larger, system operators sometimes derive a totally different unit commitment schedule to accommodate higher loads and larger ramps, so on some days we observe larger increases in system costs, whose values are times the nominal hourly operating costs.
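The mechanism behind the extra start-up costs can be illustrated with a toy commitment rule: an inflated forecast forces an additional unit on. This sketch and its numbers are our own illustration, not the paper's unit commitment model.

```python
def commitment_cost(forecast_load, capacities, costs, startup_costs):
    """Toy day-ahead commitment: switch units on in merit order until
    the committed capacity covers the forecast, summing start-up costs."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    capacity, startup = 0.0, 0.0
    for i in order:
        if capacity >= forecast_load:
            break
        capacity += capacities[i]
        startup += startup_costs[i]
    return startup

caps, costs, su = [500, 300, 200], [20, 35, 50], [1000, 800, 600]
clean_startup = commitment_cost(700, caps, costs, su)  # commits two units
adv_startup = commitment_cost(900, caps, costs, su)    # inflated forecast commits all three
increase = adv_startup - clean_startup  # extra 600 in start-up cost
```

With larger perturbations the operator may also swap in a more expensive generation mix, which is why the observed cost increases are not limited to start-up costs alone.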
6.4. Impacts of Attacks on Feasibility
In addition to increasing system costs, adversarial attacks on load forecasting can lead to even worse situations. We illustrate a load minimization strategy that leads to infeasible solutions (e.g., load shedding, ramp constraint violations) of the economic dispatch problem. We implement both the learn and attack and gradient estimation algorithms with a maximum perturbation of , and test the results on weeks of load data. In Table 2, we record the occurrence frequencies of both load shedding and ramp constraint violations. Since a change in temperature forecasts can decrease the load forecasts by over MW, system operators tend to keep fewer generators on. On many days this leaves the committed generation capacity short of the load, and the scheduled generators cannot follow the large ramps in the real load profiles. In Figure 8 we show one example of each of these two kinds of failures. In Figure 8(a), during peak hours, the adversarial load forecasts let the system operator schedule one fewer generator compared to the case of correct forecasts. Even accounting for the spinning reserve procured during day-ahead unit commitment, the actual load in the middle of the day exceeds the adversarial load by over , and the total load exceeds the committed generator capacity. In Figure 8(b), the actual loads increase rapidly at hours 5 and 6, yet the adversarial load profile flattens these ramps, leaving the scheduled generators incapable of meeting the large ramp. We expect even more frequent ramp constraint violations if the attacker specifically designs attack strategies based on the load patterns.
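The two failure modes counted in Table 2 can be screened for with a simple post-hoc check on the realized load against the committed schedule. The function and its numbers below are a hypothetical sketch for illustration.

```python
def check_feasibility(actual_load, committed_capacity, ramp_limit):
    """Flag the two failure modes from the case study: load shedding
    (actual load above committed capacity) and ramp violations
    (hour-to-hour load change beyond the committed units' ramp limit).
    Returns the offending hour indices for each."""
    shedding = [t for t, load in enumerate(actual_load)
                if load > committed_capacity]
    ramp_violations = [t for t in range(1, len(actual_load))
                       if abs(actual_load[t] - actual_load[t - 1]) > ramp_limit]
    return shedding, ramp_violations

load = [600, 650, 900, 1050, 700]  # MW, illustrative hourly profile
shed, ramps = check_feasibility(load, committed_capacity=1000, ramp_limit=200)
# Shedding at hour 3 (1050 > 1000); ramp violations at hours 2 and 4
```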
Occurrences (Number of Days)  Learn and Attack  Gradient Estimation 

Load Shedding  
Ramp Constraints Violation 
7. Discussion and Conclusion
In this paper, we studied potential vulnerabilities that exist in many load forecasting algorithms. Such vulnerabilities have been overlooked during the development of many forecasting techniques. We designed two attack algorithms that require little knowledge of the forecasting algorithms, yet lead to large increases in forecast errors through adversarial data injections into the load forecasting input features. The proposed attacks adversarially manipulate the load forecasts, either upward or downward, and thus provide system operators with wrong information about future demand. Experiments on real-world load datasets demonstrate such threats to power system operations. This threat model, along with the damage analysis, indicates that more security evaluation is needed in the design and implementation of load forecasting algorithms. To mitigate the damage of such false data injection attacks, countermeasures that build robust load forecasting algorithms are strongly recommended, which may include anomaly detection techniques that consider the input data distribution, as well as other robust statistics.
References
 Abadi et al. (2016) Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: a system for largescale machine learning.. In OSDI, Vol. 16. 265–283.
 Amin et al. (2009) Saurabh Amin, Alvaro A Cárdenas, and S Shankar Sastry. 2009. Safe and secure networked control systems under denialofservice attacks. In International Workshop on Hybrid Systems: Computation and Control. Springer, 31–45.
 Bhagoji et al. (2018) Arjun Nitin Bhagoji, Warren He, Bo Li, and Dawn Song. 2018. Practical Blackbox Attacks on Deep Neural Networks using Efficient Query Mechanisms. In European Conference on Computer Vision. Springer, Cham, 158–174.
 Bottou (2010) Léon Bottou. 2010. Largescale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT’2010. Springer, 177–186.
 Brown et al. (2018) T. Brown, J. Hörsch, and D. Schlachtberger. 2018. PyPSA: Python for Power System Analysis. Journal of Open Research Software 6, 4 (2018). Issue 1. https://doi.org/10.5334/jors.188 arXiv:1707.09913
 Ceperic et al. (2013) Ervin Ceperic, Vladimir Ceperic, Adrijan Baric, et al. 2013. A strategy for shortterm load forecasting by support vector regression machines. IEEE Transactions on Power Systems 28, 4 (2013), 4356–4364.
 Charytoniuk et al. (1998) W Charytoniuk, MS Chen, and P Van Olinda. 1998. Nonparametric regression based shortterm load forecasting. IEEE transactions on Power Systems 13, 3 (1998), 725–730.
 Chen et al. (2018a) Kunjin Chen, Kunlong Chen, Qin Wang, Ziyu He, Jun Hu, and Jinliang He. 2018a. Shortterm Load Forecasting with Deep Residual Networks. IEEE Transactions on Smart Grid (2018).
 Chen et al. (2010) Ying Chen, Peter B Luh, Che Guan, Yige Zhao, Laurent D Michel, Matthew A Coolbeth, Peter B Friedland, and Stephen J Rourke. 2010. Shortterm load forecasting: similar daybased wavelet neural networks. IEEE Transactions on Power Systems 25, 1 (2010), 322–330.
 Chen et al. (2017) Yize Chen, Yuanyuan Shi, and Baosen Zhang. 2017. Modeling and optimization of complex building energy systems with deep neural networks. In 2017 51st Asilomar Conference on Signals, Systems, and Computers. IEEE, 1368–1373.
 Chen et al. (2018b) Yize Chen, Yushi Tan, and Deepjyoti Deka. 2018b. Is Machine Learning in Power Systems Vulnerable?. In 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). IEEE, 1–6.
 Contreras et al. (2003) Javier Contreras, Rosario Espinola, Francisco J Nogales, and Antonio J Conejo. 2003. ARIMA models to predict nextday electricity prices. IEEE transactions on power systems 18, 3 (2003), 1014–1020.
 De Gooijer and Hyndman (2006) Jan G De Gooijer and Rob J Hyndman. 2006. 25 years of time series forecasting. International journal of forecasting 22, 3 (2006), 443–473.
 Gross and Galiana (1987) George Gross and Francisco D Galiana. 1987. Shortterm load forecasting. Proc. IEEE 75, 12 (1987), 1558–1573.
 Hippert et al. (2001) Henrique Steinherz Hippert, Carlos Eduardo Pedreira, and Reinaldo Castro Souza. 2001. Neural networks for shortterm load forecasting: A review and evaluation. IEEE Transactions on power systems 16, 1 (2001), 44–55.
 Hobbs et al. (1999) Benjamin F Hobbs, Suradet Jitprapaikulsarn, Sreenivas Konda, Vira Chankong, Kenneth A Loparo, and Dominic J Maratukulam. 1999. Analysis of the value for unit commitment of improved load forecasts. IEEE Transactions on Power Systems 14, 4 (1999), 1342–1348.
 Hong et al. (2010) Tao Hong, Pu Wang, Anil Pahwa, Min Gui, and Simon M Hsiang. 2010. Cost of temperature history data uncertainties in short term electric load forecasting. In Probabilistic Methods Applied to Power Systems (PMAPS), 2010 IEEE 11th International Conference on. IEEE, 212–217.
 Hosseini et al. (2017) Hossein Hosseini, Yize Chen, Sreeram Kannan, Baosen Zhang, and Radha Poovendran. 2017. Blocking transferability of adversarial examples in blackbox learning systems. arXiv preprint arXiv:1703.04318 (2017).
 Kirschen and Strbac (2018) Daniel S Kirschen and Goran Strbac. 2018. Fundamentals of power system economics. John Wiley & Sons.
 Kong et al. (2017) Weicong Kong, Zhao Yang Dong, Youwei Jia, David J Hill, Yan Xu, and Yuan Zhang. 2017. Shortterm residential load forecasting based on LSTM recurrent neural network. IEEE Transactions on Smart Grid (2017).
 Kosut et al. (2010) Oliver Kosut, Liyan Jia, Robert J Thomas, and Lang Tong. 2010. Malicious data attacks on smart grid state estimation: Attack strategies and countermeasures. In Smart Grid Communications (SmartGridComm), 2010 First IEEE International Conference on. IEEE, 220–225.
 Liu et al. (2011) Yao Liu, Peng Ning, and Michael K Reiter. 2011. False data injection attacks against state estimation in electric power grids. ACM Transactions on Information and System Security (TISSEC) 14, 1 (2011), 13.
 Luo et al. (2018) Jian Luo, Tao Hong, and ShuCherng Fang. 2018. Benchmarking robustness of load forecasting models under data integrity attacks. International Journal of Forecasting 34, 1 (2018), 89–104.
 Marino et al. (2016) Daniel L Marino, Kasun Amarasinghe, and Milos Manic. 2016. Building energy load forecasting using deep neural networks. In IECON 2016 - 42nd Annual Conference of the IEEE Industrial Electronics Society. IEEE, 7046–7051.
 McDaniel and McLaughlin (2009) Patrick McDaniel and Stephen McLaughlin. 2009. Security and privacy challenges in the smart grid. IEEE Security & Privacy 3 (2009), 75–77.
 Mo and Sinopoli (2009) Yilin Mo and Bruno Sinopoli. 2009. Secure control against replay attacks. In Communication, Control, and Computing, 2009. Allerton 2009. 47th Annual Allerton Conference on. IEEE, 911–918.
 Nair and Hinton (2010) Vinod Nair and Geoffrey E Hinton. 2010. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML10). 807–814.
 Papernot et al. (2016a) Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow. 2016a. Transferability in machine learning: from phenomena to blackbox attacks using adversarial samples. arXiv preprint arXiv:1605.07277 (2016).
 Papernot et al. (2016b) Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z Berkay Celik, and Ananthram Swami. 2016b. The limitations of deep learning in adversarial settings. In Security and Privacy (EuroS&P), 2016 IEEE European Symposium on. IEEE, 372–387.
 Park et al. (1991) Dong C Park, MA ElSharkawi, RJ Marks, LE Atlas, and MJ Damborg. 1991. Electric load forecasting using an artificial neural network. IEEE transactions on Power Systems 6, 2 (1991), 442–449.
 Pinson et al. (2007) Pierre Pinson, Christophe Chevallier, and George N Kariniotakis. 2007. Trading wind generation from shortterm probabilistic forecasts of wind power. IEEE Transactions on Power Systems 22, 3 (2007), 1148–1156.
 Quilumba et al. (2015) Franklin L Quilumba, WeiJen Lee, Heng Huang, David Yanshi Wang, and Robert L Szabados. 2015. Using Smart Meter Data to Improve the Accuracy of Intraday Load Forecasting Considering Customer Behavior Similarities. IEEE Trans. Smart Grid 6, 2 (2015), 911–918.
 Rebours and Kirschen (2005) Yann Rebours and Daniel Kirschen. 2005. What is spinning reserve. The University of Manchester 174 (2005), 175.
 Sridhar et al. (2012) Siddharth Sridhar, Adam Hahn, Manimaran Govindarasu, et al. 2012. CyberPhysical System Security for the Electric Power Grid. Proc. IEEE 100, 1 (2012), 210–224.
 Szegedy et al. (2013) Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013).
 Tan et al. (2018) Song Tan, WenZhan Song, Michael Stewart, Junjie Yang, and Lang Tong. 2018. Online data integrity attacks against realtime electrical market in smart grid. IEEE Transactions on Smart Grid 9, 1 (2018), 313–322.
 Vermaak and Botha (1998) J Vermaak and EC Botha. 1998. Recurrent neural networks for shortterm load forecasting. IEEE Transactions on Power Systems 13, 1 (1998), 126–132.
 Wang et al. (2016) Pu Wang, Bidong Liu, and Tao Hong. 2016. Electric load forecasting with recency effect: A big data approach. International Journal of Forecasting 32, 3 (2016), 585–597.
 Wang et al. (2018) Yi Wang, Ning Zhang, Qixin Chen, Daniel S Kirschen, Pan Li, and Qing Xia. 2018. Datadriven probabilistic net load forecasting with high penetration of behindthemeter PV. IEEE Transactions on Power Systems 33, 3 (2018), 3255–3264.
 Xie et al. (2010) Le Xie, Yilin Mo, and Bruno Sinopoli. 2010. False data injection attacks in electricity markets. In Smart Grid Communications (SmartGridComm), 2010 First IEEE International Conference on. IEEE, 226–231.
Appendix A Details on Load Forecasting Algorithms
We set up all load forecasting models using the Tensorflow (Abadi et al., 2016) package in Python. Standard architectural components such as Dropout layers and nonlinear activation functions (e.g., ReLU or Sigmoid) are adopted in the deep learning models (Nair and Hinton, 2010). Since all three networks are set up to solve the load forecasting regression problem, we give the first layer the most neurons and decrease the number of units in subsequent layers.
Forecasts Models  NN  RNN  LSTM 

Number of Layers  4  3  3 
Training Epochs  20  30  30 
Hidden Units in First Layer  512  64  64 
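A minimal forward pass matching the NN architecture in the table (4 layers, 512 units in the first hidden layer, shrinking thereafter) can be sketched in plain NumPy. The input dimension and the intermediate layer widths beyond the first are our assumptions; the actual models use TensorFlow with Dropout.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Layer widths: 512 units in the first hidden layer, decreasing afterwards,
# with a single linear output for the regression target (forecasted load).
layer_sizes = [31, 512, 128, 32, 1]  # input dim 31 is illustrative
weights = [rng.standard_normal((m, n)) * 0.01
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    h = np.asarray(x, dtype=float)
    for w in weights[:-1]:
        h = relu(h @ w)
    return float(h @ weights[-1])

y = forward(np.ones(31))  # scalar load forecast for one feature vector
```

The RNN and LSTM replace the first hidden layer with a recurrent cell of 64 units, per the table.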
As shown in Figure 9, all three models' losses converge during training, and we use the trained models in the subsequent planning and operation problems as well as the testbed for the attack algorithms. The plots show the mean and variance over 3 runs with different random seeds.
Appendix B Computation Time
We recorded the computation time for neural network training and the running time of the two proposed attack algorithms. All times were recorded on a laptop with an Intel 2.3 GHz Core i5-8259U 4-core CPU and 8 GB RAM. The training times for the NN, RNN and LSTM are measured over 20, 30 and 30 epochs respectively. The attack running times are averaged over all test instances. We observe that the learn and attack approach takes longer than gradient estimation, due to the time needed to compute gradient signs through the whole substitute network; and as the LSTM has a more complicated architecture, it takes longer to find an adversarial instance. Yet compared to the long model training time, and given the day-ahead forecasting application scenario, the attacker is still efficient enough to find the adversarial perturbations.
Forecasts Models  NN  RNN  LSTM 

Training Time  12.988  47.998  143.830 
Learn and Attack  0.133  0.157  0.579 
Gradient Estimation Attack  0.082  0.119  0.253 