A Deep Learning-based Framework for Conducting Stealthy Attacks in Industrial Control Systems

Cheng Feng¹, Tingting Li¹, Zhanxing Zhu² and Deeph Chana¹
¹Institute for Security Science and Technology, Imperial College London, London, United Kingdom
Emails: {c.feng, tingting.li, d.chana}@imperial.ac.uk
²Peking University and BIBDR, Beijing, China
Email: zhanxing.zhu@pku.edu.cn
Abstract

Industrial control systems (ICS), which in many cases are components of critical national infrastructure, are increasingly being connected to other networks and the wider internet, motivated by factors such as enhanced operational functionality and improved efficiency. However, set in this context, it is easy to see that the cyber attack surface of these systems is expanding, making it more important than ever that innovative solutions for securing ICS be developed and that the limitations of these solutions are well understood. The development of anomaly-based intrusion detection techniques has provided a capability for protecting ICS from the serious physical damage that cyber breaches are capable of delivering, by monitoring sensor and control signals for abnormal activity. Recently, the use of so-called stealthy attacks has been demonstrated, where the injection of false sensor measurements can mimic normal control system signals, thereby defeating anomaly detectors whilst still delivering attack objectives. To date, such attacks have been considered extremely challenging to achieve and, as a result, they have received limited attention.

In this paper we define a deep learning-based framework which allows an attacker to conduct stealthy attacks with minimal a priori knowledge of the target ICS. Specifically, we show that by intercepting the sensor and/or control signals in an ICS for a period of time, a malicious program is able to automatically learn to generate high-quality stealthy attacks which can achieve specific attack goals whilst bypassing a black box anomaly detector. Furthermore, we demonstrate the effectiveness of our framework for conducting stealthy attacks using two real-world ICS case studies. We contend that our results motivate greater attention on this area by the security community, as we demonstrate that the currently assumed barriers to the successful execution of such attacks are relaxed. Such attention is likely to spur the development of innovative security measures by providing an understanding of the limitations of current detection implementations, and will inform security-by-design considerations for those planning future ICS.

I Introduction

Industrial Control Systems (ICS) generally consist of a set of supervisory control and data acquisition (SCADA) subsystems for controlling field devices via the monitoring and processing of data related to industrial processes. In response to information received from remote sensors, control commands are issued, either automatically or manually, to remotely located control devices, which are able to make physical changes to the states of one or more industrial processes. In the pursuit of increased communication efficiency and higher throughputs, modern information and communication technologies (ICT) have been widely integrated into ICS. For instance, cloud computing has recently emerged as a new computing paradigm for ICS, proposing the conversion of localised data historians and control units into cloud-based services [1]. Such an evolution satisfies the growing needs of operators, suppliers and third-party partners for remote data access and command execution from multiple platforms; a trend which, consequently, exposes modern ICS to an increased risk of cyber attacks. Unlike breaches of conventional ICT systems, ICS security breaches have the potential to result in significant physical damage to national infrastructure, with impacts that can include large-scale power outages, disruption to health-service operations, compromised public transport safety, damage to the environment and direct loss of life.

The first well-known ICS-targeted cyber virus, Stuxnet [2], popularised the cyber security vulnerabilities of ICS and demonstrated how external attackers might feasibly penetrate multiple layers of security to manipulate the computer programs that control field devices. Since then, according to sources such as ICS-CERT, the number of attacks recorded against ICS targets has grown steadily: 245 incidents were reported to ICS-CERT by trusted industrial partners in 2014 [3], this figure increased to 295 in 2015 [4], and 290 incidents were reported in 2016 [5]. More recently than Stuxnet, a major security breach at a German steel mill was widely reported in 2014. This attack was initiated by credential theft via spear-phishing emails, leading to massive damage to a blast furnace at the plant [6] and illustrating clearly the cyber-physical nature of ICS security. Even more recently, in December 2015, Ukraine's capital city Kiev reportedly lost approximately one-fifth of its required power capacity as a result of a utility-targeted cyber attack that caused a massive blackout, affecting 225,000 citizens [7].

As an approach to implementing effective intrusion detection systems (IDS), anomaly detection has become recognised as part of the standard toolkit for protecting ICS from cyber attacks. In recent years many ICS-specific anomaly detection models have been proposed [8, 9], mainly incorporating blacklist-based or whitelist-based methods. These IDS are designed to detect intrusions by consideration of common ICS communication protocols (e.g. Modbus/DNP3) [10], ICS operating standards [11, 12] and the identification of patterns and structures of data transmitted across ICS networks. Amongst the various underpinning technologies used in existing IDS, machine learning (ML) techniques have enabled detection modalities that exploit automated learning from past experience and the construction of self-evolving models that can adapt to evolving or future classification problems. To date, ML approaches that have shown promising capabilities for securing ICS include Support Vector Data Description (SVDD) [13], Bloom filters [14], statistical Bayesian networks [15] and deep neural networks [16, 17].

Furthermore, anomaly detection techniques employed at the field network layer provide protection at the point within ICS where security compromise has the greatest potential to cause significant physical damage to the system. At present, most anomaly detection implementations designed for this specific purpose rely on a predictive model, in which predictions of future sensor measurements are generated from historical signals and compared with the real measurements using a thresholded residual error. Comparisons exceeding the defined threshold constitute a detection event and generate an alert [18, 19, 20, 16]. However, recent studies have shown that cyber attacks may be generated by the injection of false sensor measurements that avoid detection by such systems [21, 22]. Needless to say, such stealthy attacks are amongst the most harmful classes of attack against ICS, as malicious sensor measurements can directly violate the operational safety bounds and conditions set for a given ICS. Although efforts to understand the risks and mitigations related to stealthy attacks on ICS have been made [23, 24], it is still generally believed that this class of attacks has a low likelihood of occurrence. This is due, for the most part, to the level and type of operational information about the system that an attacker is thought to need a priori; information that is challenging to obtain.

In this work we challenge the above strong conditions imposed on a stealthy attacker by formulating and demonstrating an attack framework that uses a deep learning-based methodology. The proposed deep learning method allows the attacker to effectively conduct stealthy attacks which lead to a specific amount of deviation in sensor measurements with minimal knowledge of the target ICS. Specifically, our framework consists of a set of deep neural network models which are trained by a recently proposed adversarial training technique, the Wasserstein GAN [25], a special variant of the Generative Adversarial Net (GAN) [26], to maximize the likelihood that generated malicious sensor measurements will successfully bypass an assumed black box anomaly detector. This technique significantly lowers the bar for conducting stealthy attacks, as generating high-quality attacks can be achieved automatically by intercepting the sensor and/or control signals in an ICS for a period of time using a specially designed real-time learning method. In addition, the explicit relaxation of conditions relating to specific operational knowledge within the attack design demonstrates the potential for its general applicability across ICS instances and industrial sectors. To show this, we demonstrate the effectiveness of our framework through two case studies in which we conduct stealthy attacks on a gas pipeline and a water treatment system, respectively. Overall, our results indicate that more attention to stealthy attacks is urgently required within the ICS community.

II Background

By way of providing relevant background for the work undertaken here, this section provides a brief overview of the ICS control network in Section II-A. The generation and detection of stealthy attacks is briefly discussed in Section II-B. Following this, Section II-C provides information on the key underpinning deep neural network techniques employed here.

II-A ICS Control Network


Fig. 1: ICS control network

We summarize the main control loops of a typical ICS in Figure 1, showing the main communication transactions within it. Specifically, at the field network level, programmable logic controllers (PLCs) or Remote Terminal Units (RTUs) receive real-time measurements from sensors and forward them to the Human Machine Interface (HMI) at the control network level. The HMI-processed data is then forwarded on to the Master Control Station at the supervision level. In response to the received inputs, the Master Control Station issues the necessary control commands to change the states of its monitored industrial processes. Commands eventually make their way to the relevant actuators to make the required physical changes. Intrusion detection systems (IDS) are often deployed on the master control stations, where the sensor measurements and control signals are monitored to secure the physical processes under control. We refer to Section III for a detailed description of the mechanisms for securing physical processes in ICS.

II-B Generating and Revealing Stealthy Attacks

Stealthy attacks, or more generally mimicry attacks, have been studied in the conventional IT security community for decades. Such attacks can achieve specific attack goals without introducing illegal control flows into computer programs. Existing IDS for computer programs mainly relied on short call-sequence verification and were unable to detect such attacks. The work in [27] proposed an IDS implemented by two-stage machine learning algorithms to construct normal call-correlation patterns and detect stealthy intrusions. In the domain of monitoring the behaviours of applications, system calls have been used extensively to construct the normal profiles for intrusion detection [28, 29]. However, it has been shown that such IDS are inadequate for detecting mimicry attacks [29] (i.e. attacks that interleave malicious code with legitimate code) and impossible path attacks [30] (i.e. attacks relying on a legal yet never-executed sequence of system calls). The work presented in [29] was one of the earliest studies on mimicry attacks against host-based IDS, where a theoretical framework was proposed to evaluate the effectiveness of an IDS combating mimicry attacks. A waypoint-based IDS was introduced in [30] to detect both mimicry attacks and impossible path attacks by considering the trustworthy execution contexts of programs and restricting system call permissions. Besides, prior work has also been conducted on generating mimicry attacks using genetic programming [31] or static binary analysis [28].

In many successful ICS attacks, physical damage is achieved through an exploitation phase that utilises methods of injecting false sensor measurements into the control network. As discussed briefly above, such attacks have also been described as stealthy attacks. An early practical description of such an attack is provided in [22], which shows the ability of an attacker to insert arbitrary errors into the normal operation of a power system without being detected by a deployed state-estimator-based IDS. It was pointed out that launching such attacks does place strong requirements on the attacker, as the configuration of the targeted power system must be known. Further aspects of stealthy attacks have been studied in [21, 32]. Specifically, two security indices were introduced in [32] to quantify the difficulty of launching successful stealthy attacks against particular targets, and the work in [21] proposed an efficient way to compute such security indices for sparse stealthy attacks. A stealthy deception attack against a canal system is shown in [33], where the effect of stealthy attacks on both the regulatory layer and the supervisory layer of an ICS is discussed. In [23], the authors proposed revealing stealthy attacks in control systems by modifying the system's structure periodically. Recently, the authors in [24] showed that the impact of stealthy attacks can be mitigated by the proper combination and configuration of different off-the-shelf detection schemes.

II-C Deep Neural Networks

Recently, deep learning [34] has achieved remarkable success and improved the state of the art in various artificial intelligence applications, including visual object classification, speech recognition and natural language understanding. It is a powerful modeling framework in machine learning, in which multiple processing layers are used to extract features from data at multiple levels of abstraction. In this work, several neural network architectures are used to conduct feature extraction, anomaly detection and stealthy attack generation. We describe these architectures below.

II-C1 Feedforward Neural Network (FNN)

The FNN, also called the multi-layer perceptron (MLP), is the most classic form of neural network (NN), in which multiple processing nodes are arranged in layers such that information flows in one direction only, from input to output. The architecture of a typical FNN is illustrated in Figure 2. In this architecture, the $j$-th node in the $l$-th layer computes a linear combination of its inputs (i.e. the outputs of the previous layer) followed by a simple non-linear transformation:

$$a_j^{(l)} = \sigma\Big( \sum_i w_{ji}^{(l)}\, a_i^{(l-1)} + b_j^{(l)} \Big),$$

where the rectified linear unit (ReLU), $\sigma(z) = \max(0, z)$, is often used as the non-linear function $\sigma$, and the input layer is $a^{(0)} = x$. The model parameters $\theta = \{w, b\}$ need to be learned by minimizing a certain loss function

$$L(\theta) = \frac{1}{N} \sum_{n=1}^{N} \ell\big(\hat{y}_n, y_n\big)$$

given the training data $\{(x_n, y_n)\}_{n=1}^{N}$, where $\ell$ is some chosen criterion to measure the difference between the prediction $\hat{y}_n$ and the ground truth $y_n$, such as the squared loss for regression and the cross entropy for classification problems. Stochastic gradient descent (SGD) is often used to minimize the loss $L(\theta)$. Concretely, in each iteration a mini-batch of training samples is selected randomly to estimate the true gradient, $\nabla_\theta L \approx \frac{1}{M} \sum_{m=1}^{M} \nabla_\theta\, \ell(\hat{y}_m, y_m)$, where $M$ is the number of samples inside each mini-batch. The parameters are then updated by $\theta \leftarrow \theta - \eta\, \nabla_\theta L$ with some given learning rate $\eta$.
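To make the training loop concrete, the following is a minimal sketch of a two-layer FNN trained with mini-batch SGD on a toy regression task; all dimensions, data and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))             # toy inputs
y = np.sin(X.sum(axis=1, keepdims=True))   # toy regression target

W1 = rng.normal(scale=0.1, size=(8, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, 1)); b2 = np.zeros(1)
eta, batch = 0.01, 32                      # learning rate, mini-batch size

for step in range(2000):
    idx = rng.choice(len(X), batch, replace=False)  # random mini-batch
    xb, yb = X[idx], y[idx]
    h = relu(xb @ W1 + b1)                 # hidden layer: linear + ReLU
    pred = h @ W2 + b2                     # output layer (linear)
    err = pred - yb                        # residual of the squared loss
    # Backpropagation: gradients of the mean squared loss w.r.t. parameters.
    gW2 = h.T @ err / batch; gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (h > 0)            # gradient through the ReLU
    gW1 = xb.T @ dh / batch; gb1 = dh.mean(axis=0)
    W1 -= eta * gW1; b1 -= eta * gb1       # SGD parameter updates
    W2 -= eta * gW2; b2 -= eta * gb2
```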

Fig. 2: The architecture of a typical FNN

II-C2 Recurrent Neural Network (RNN)

In contrast to FNNs, RNNs permit cyclical connections between nodes, allowing such neural networks to exhibit dynamic temporal behaviour. RNNs [35] are suitable for dealing with sequential data, such as speech, language and structured time series. An input sequence is processed by an RNN one element at a time, and the information is encoded into a hidden unit (i.e. a state vector) at each time step that describes the history of all the past elements of the sequence. The outputs of the hidden units at different time steps can be compared with the ground truth of the sequence so that the network can be trained. The most commonly implemented RNNs fall into the class of long short-term memory (LSTM) [36] neural networks. As the name suggests, such NNs exhibit remarkable empirical performance in extracting and preserving long-term dependencies whilst also maintaining short-term signals. LSTM networks involve three gates in the computation of each hidden cell to determine what to forget, what to output and what to provide to the next hidden cell, respectively, as shown in Figure 3. The information flow of the LSTM cell is as follows:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f), \qquad (1)$$
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i), \qquad (2)$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o), \qquad (3)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c), \qquad (4)$$
$$h_t = o_t \odot \tanh(c_t), \qquad (5)$$

where $\sigma$ and $\tanh$ represent the sigmoid and hyperbolic tangent functions, respectively, and $\odot$ denotes the element-wise product.
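As a concrete reading of Equations (1) to (5), the sketch below implements a single LSTM step in plain numpy; the stacked parameter layout and the names are our own convention.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step following Equations (1)-(5).
    W: (4d, input_dim), U: (4d, d), b: (4d,) hold the stacked gate parameters."""
    z = W @ x_t + U @ h_prev + b          # joint pre-activation of all gates
    d = h_prev.shape[0]
    f = sigmoid(z[0:d])                   # forget gate (Eq. 1)
    i = sigmoid(z[d:2*d])                 # input gate (Eq. 2)
    o = sigmoid(z[2*d:3*d])               # output gate (Eq. 3)
    c_tilde = np.tanh(z[3*d:4*d])         # candidate cell state
    c_t = f * c_prev + i * c_tilde        # cell state update (Eq. 4)
    h_t = o * np.tanh(c_t)                # hidden state output (Eq. 5)
    return h_t, c_t
```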

Fig. 3: LSTM cell

II-C3 Generative Adversarial Net (GAN)

GANs [26] are an example of generative models that aim to learn an estimate $p_{\mathrm{model}}$ of the data distribution $p_{\mathrm{data}}$, given training samples drawn from $p_{\mathrm{data}}$, such that we can generate new samples from $p_{\mathrm{model}}$.

The general idea of GANs is to construct a game between two players. One of them is called the generator, which intends to create samples following the same distribution as the training data. The other player is the discriminator, which tries to distinguish whether the samples (obtained from the generator) are real or fake. When the discriminator cannot tell apart generated and training samples, we have learned the data distribution, i.e. $p_{\mathrm{model}} = p_{\mathrm{data}}$.

The generator is simply a differentiable function $G$ with model parameters $\theta$, whose input $z$ follows a simple prior distribution, such as a uniform or Gaussian distribution. Due to the high capacity of multi-layer neural networks, they are often used as the function $G$, and its output is $x = G(z; \theta)$. The discriminator $D$ is a two-class classifier, often designed as a neural network that outputs a class probability between 0 and 1. GAN aims to solve the following min-max optimization problem:

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big].$$

Alternating updates of $D$ and $G$ by SGD can be adopted to solve this problem. We can then use the optimized $G$ to generate new samples through the random input $z$.

Unfortunately, the training of the original GAN is very unstable in practice, and users have to balance both the capacity and the training steps between the generator and the discriminator. To overcome this issue, the authors of [25] proposed applying the Wasserstein distance to measure the discrepancy between the two probability distributions, hence the name Wasserstein GAN (WGAN). The min-max optimization problem of WGAN can be formulated as:

$$\min_G \max_{w:\, \|w\|_\infty \le c} \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[f_w(x)\big] - \mathbb{E}_{z \sim p_z}\big[f_w(G(z))\big]. \qquad (6)$$

Compared with the original GAN, the discriminator $f_w$ of WGAN, also called the “critic”, can output any real value, not just probabilities. The infinity norm of the critic's weights $w$ is constrained to be less than a predefined positive constant $c$; practically, this can be achieved by “clipping” the elements larger than $c$ to $c$ (and the elements smaller than $-c$ to $-c$) in each iteration. The expectation terms in Eq. (6) can be approximated using random samples from the training data and the prior $p_z$.

We summarize the training procedure of WGAN in Alg. 1, where a variant of stochastic gradient descent, RMSProp [37], is used for updating the parameters.

0:  the learning rate $\eta$, the clipping parameter $c$, the number of iterations $n_{\mathrm{critic}}$ of the critic per generator iteration, the size of the mini-batch $M$.
0:  the initial critic parameters $w$ and generator parameters $\theta$.
1:  while $\theta$ has not converged do
2:     for $i = 1, \ldots, n_{\mathrm{critic}}$ do
3:        Sample a mini-batch $\{x_m\}_{m=1}^{M}$ from the training data.
4:        Sample a mini-batch $\{z_m\}_{m=1}^{M}$ of prior samples.
5:        Evaluate the stochastic gradient of $w$:
6:        $g_w \leftarrow \nabla_w \big[ \frac{1}{M} \sum_{m=1}^{M} f_w(x_m) - \frac{1}{M} \sum_{m=1}^{M} f_w(G(z_m)) \big]$
7:        $w \leftarrow \mathrm{clip}\big(w + \eta \cdot \mathrm{RMSProp}(w, g_w),\, -c,\, c\big)$
8:     end for
9:     Fix $w$
10:    Sample a mini-batch $\{z_m\}_{m=1}^{M}$ of prior samples.
11:    Evaluate the stochastic gradient of $\theta$:
12:    $g_\theta \leftarrow -\nabla_\theta \frac{1}{M} \sum_{m=1}^{M} f_w(G(z_m))$; $\quad \theta \leftarrow \theta - \eta \cdot \mathrm{RMSProp}(\theta, g_\theta)$
13:  end while
Algorithm 1 Training Procedure of WGAN [25]
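For illustration, a compact TensorFlow/Keras rendering of Algorithm 1 on toy data is sketched below; the network sizes, learning rate and clipping constant are assumptions made for the example, not values from the paper.

```python
import numpy as np
import tensorflow as tf

z_dim, x_dim, c, n_critic, batch = 16, 8, 0.01, 5, 64
G = tf.keras.Sequential([tf.keras.layers.Dense(32, activation="relu", input_shape=(z_dim,)),
                         tf.keras.layers.Dense(x_dim)])        # generator
D = tf.keras.Sequential([tf.keras.layers.Dense(32, activation="relu", input_shape=(x_dim,)),
                         tf.keras.layers.Dense(1)])            # critic: unbounded real output
opt_g, opt_d = tf.keras.optimizers.RMSprop(5e-5), tf.keras.optimizers.RMSprop(5e-5)
real_data = np.random.normal(2.0, 1.0, size=(4096, x_dim)).astype("float32")

for it in range(1000):
    for _ in range(n_critic):                                  # critic steps per generator step
        x = real_data[np.random.choice(len(real_data), batch)]
        z = tf.random.normal((batch, z_dim))
        with tf.GradientTape() as tape:
            loss_d = tf.reduce_mean(D(G(z))) - tf.reduce_mean(D(x))
        opt_d.apply_gradients(zip(tape.gradient(loss_d, D.trainable_variables),
                                  D.trainable_variables))
        for w in D.trainable_variables:                        # clip critic weights to [-c, c]
            w.assign(tf.clip_by_value(w, -c, c))
    z = tf.random.normal((batch, z_dim))
    with tf.GradientTape() as tape:
        loss_g = -tf.reduce_mean(D(G(z)))                      # generator pushes critic scores up on fakes
    opt_g.apply_gradients(zip(tape.gradient(loss_g, G.trainable_variables),
                              G.trainable_variables))
```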

III Anomaly Detection Mechanism for Securing Physical Processes

As discussed in Section II-A, the physical process in an ICS is directly controlled by the field network, which consists of a number of field devices: sensors, actuators and PLCs. Specifically, sensors are devices which convert physical parameters into electronic measurements; actuators are devices which convert control commands into physical state changes (e.g., turning a pump on or off); and, based on the measurements received from sensors through PLC-sensor channels, PLCs send control commands to actuators through PLC-actuator channels.

To protect the control system from physical faults and cyber attacks, anomaly detection mechanisms are often deployed to monitor the sensor measurements and control commands in the system at discrete time steps. Concretely, let $\mathbf{x}_{1:t} = \{x_1, \ldots, x_t\}$ be a time series in which each signal $x_i$ is a $(p+q)$-dimensional vector

$$x_i = \big(s_i^{(1)}, \ldots, s_i^{(p)}, u_i^{(1)}, \ldots, u_i^{(q)}\big),$$

whose elements correspond to the $p$ values representing all the sensor measurements and the $q$ values capturing all the control commands in the system at time $i$. Currently, most anomaly detection mechanisms for the control system rely on a predictive model which predicts the sensor measurements based on previous signals; an alarm is triggered if the residual error between the predicted measurements and the true measurements exceeds a specific threshold.

III-A Predictive Models

The underlying predictive model can take many different forms, among which the Auto-Regressive (AR) model [20, 24] and the Linear Dynamic State-space (LDS) model [23, 24] (often called a state estimator in power systems [19]) are the most commonly used. Specifically, the AR model predicts by fitting a linear regression model for each sensor measurement based on its previous $k$ values:

$$\hat{s}_{t+1} = \sum_{i=1}^{k} \alpha_i\, s_{t-k+i} + \alpha_0,$$

where $\alpha_1, \ldots, \alpha_k$ are coefficients representing the weights of the measurements at times $t-k+1$ to $t$ for predicting $\hat{s}_{t+1}$, and $\alpha_0$ is a constant.
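As an illustration of the AR predictor above, the following sketch fits the coefficients by ordinary least squares and makes a one-step-ahead prediction; the toy data and function names are our own illustrative choices.

```python
import numpy as np

def fit_ar(series, k):
    """Fit an order-k AR model by least squares; returns [alpha_1..alpha_k, alpha_0]."""
    # Each row of X holds k consecutive lagged values plus a constant term.
    X = np.column_stack([series[i:len(series) - k + i] for i in range(k)]
                        + [np.ones(len(series) - k)])
    y = series[k:]                           # targets: the value following each window
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_ar(history, coef):
    """One-step-ahead prediction from the k most recent values."""
    k = len(coef) - 1
    return history[-k:] @ coef[:k] + coef[-1]

s = np.sin(0.1 * np.arange(500)) + 0.01 * np.random.randn(500)  # toy sensor trace
coef = fit_ar(s, k=5)
print(predict_ar(s, coef))                   # predicted next measurement
```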

The LDS model assumes a vector $z_t$ that denotes the physical state of the system at time $t$; the state and measurements can then be related by the following equations:

$$z_{t+1} = A z_t + B u_t + \epsilon_t, \qquad s_t = C z_t + e_t,$$

where $A$, $B$ and $C$ are matrices capturing the dynamics of the physical system, and $\epsilon_t$ and $e_t$ are noise vectors for the state variables and sensor measurements following a random process with zero mean. In general, the measurements depend only on the current physical state in most systems, so no direct term from $u_t$ appears in the measurement equation. Then, to predict $\hat{s}_{t+1}$, one can use the previous signals and the model to obtain an estimate $\hat{z}_{t+1}$ of the system state, and predict $\hat{s}_{t+1} = C \hat{z}_{t+1}$.

Since the system dynamics in many ICS are highly nonlinear, it has recently been found that deep learning models can achieve better prediction accuracy than the linear models. For example, the authors in [16] show that an LSTM model can be employed for prediction:

$$h_t = f_{\mathrm{LSTM}}(h_{t-1}, x_t), \qquad \hat{s}_{t+1} = W h_t + b,$$

where $h_t$ is a hidden vector computed iteratively by the first equation, which encodes the previous time series signals to provide the context for predicting the sensor measurements at time $t+1$; $f_{\mathrm{LSTM}}$ is a complex function serving as a short form of Equations (1) to (5); $W$ and $b$ represent the weight matrix and the bias vector, respectively, for decoding the hidden vector to the predicted sensor measurements.

III-B Detection Methods

Based on the prediction of the predictive model, an anomalous signal can be detected when the Euclidean distance between the predicted sensor measurements and their observations exceeds a specific threshold: $\|\hat{s}_t - s_t\|_2 > \tau$, where $r_t = \|\hat{s}_t - s_t\|_2$ is often called the residual error at time point $t$.

Instead of relying solely on the residual error at a single time point, we can also take account of the history of residual errors. For example, we can apply the Cumulative Sum (CUSUM) method [38] for detecting collective anomalies. Specifically, let $r_t$ denote the residual error at time $t$ and assume that residual errors at different time points are independent and identically distributed with mean $\mu$ and variance $\sigma^2$; the CUSUM method then detects anomalies based on an accumulated statistic such that:

$$S_t = \max(0,\; S_{t-1} + r_t - \beta),$$

with initial value $S_0 = 0$; $\beta$ is often set to a reference value such as $\mu + \sigma$. Then, if the cumulative sum $S_t$ reaches a predefined threshold $\tau_c$, an alarm is triggered. After an alarm is triggered, $S_t$ is set to $0$ again and a new round of detection is initiated.

Since in many cases anomalies are not present in the training phase of the detection models, the threshold value for the residual errors is often decided by tuning the expected false alarm rate. For the CUSUM method, the expected time between false alarms can also be tuned to decide its threshold value.
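The following minimal sketch implements this residual-plus-CUSUM detection over a stream of residual errors; the reference value and threshold below are illustrative choices, not values from the paper.

```python
import numpy as np

def cusum_detect(residuals, beta, threshold):
    """Return the time points at which the CUSUM statistic raises an alarm."""
    alarms, S = [], 0.0
    for t, r in enumerate(residuals):
        S = max(0.0, S + r - beta)   # accumulate deviations above the reference value
        if S > threshold:
            alarms.append(t)         # raise an alarm ...
            S = 0.0                  # ... and start a new round of detection
    return alarms

r = np.abs(np.random.randn(1000))    # stand-in residual error stream
r[500:520] += 3.0                    # a sustained anomaly the detector should catch
print(cusum_detect(r, beta=r.mean() + r.std(), threshold=8.0))
```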

IV Stealthy Attack Model

In this section, we formally define the stealthy attacks which will be generated using our deep learning framework. Specifically, we consider an ICS with $p$ PLC-sensor channels and $q$ PLC-actuator channels, in which an anomaly detector monitors the sensor measurements and control commands delivered via the channels. Without loss of generality, we denote the anomaly detector as a function:

$$f(\mathbf{x}_{1:t}) \in \{0, 1\},$$

where $\mathbf{x}_{1:t}$ represents the monitored time series of the whole system until time $t$, and $f(\mathbf{x}_{1:t}) = 1$ indicates that a bad sensor measurement is detected at time $t$.

Furthermore, we consider that the attacker has the ability to intercept $p'$ PLC-sensor channels and $q'$ PLC-actuator channels, where $p' \le p$ and $q' \le q$. Therefore, the attacker has partial knowledge of the system dynamics, which can be denoted as a time series $\mathbf{x}'_{1:t}$ in which each signal $x'_i$ is a subvector of $x_i$, consisting of the sensor measurements and control commands delivered via the compromised channels. Unlike previous works which assume the anomaly detection function is known, or at least partially known, to the stealthy attacker [21, 22], in this paper we treat the anomaly detector as a black box to the attacker.

The stealthy attacker's target is to inject malicious sensor measurements which deviate from their real values by a specific amount, whilst bypassing the black box anomaly detector $f$. Specifically, let $s^a_{i,t}$ denote an injected malicious sensor measurement at time $t$ and $s_{i,t}$ denote its corresponding real value; we then define a set of attack goals for the stealthy attacker. Formally, each attack goal is defined as a target function of the form:

$$s^a_{i,t} \le s_{i,t} - \delta_i \quad \text{or} \quad s^a_{i,t} \ge s_{i,t} + \delta_i,$$

where $\delta_i$ is the target compromising value set by the attacker. As an illustration, the target function $s^a_{i,t} \le s_{i,t} - \delta_i$ denotes an attack goal to fool the PLC with a fake sensor measurement more than $\delta_i$ units smaller than its real value $s_{i,t}$. Clearly, such stealthy attacks are very dangerous, as they can potentially sabotage the ICS by implicitly putting it in a critical condition.

V Deep Learning Framework for Conducting Stealthy Attacks

In this section, we present the methodology for automatically conducting stealthy attacks from the attacker's perspective. Specifically, conducting stealthy attacks consists of two phases: the reconnaissance phase and the attacking phase. In the reconnaissance phase, a deep learning model for generating stealthy attacks is initialized and trained in real time by reconnoitering the compromised channels without launching any attacks. In the attacking phase, the malicious sensor measurements generated by the trained model are injected to replace the real measurements. Assuming the attacker conducts stealthy attacks by injecting a malicious program, we outline two key parts of our framework for implementing stealthy attacks: a powerful model for generating stealthy attacks, and an effective method to train the model in real time. Therefore, in the remainder of this section, we propose a GAN for generating stealthy attacks as well as a real-time learning method to train this stealthy attack GAN.

V-A Stealthy Attack GAN

The stealthy attack GAN is composed of two deep learning models: a generator model for creating malicious sensor measurements, and a discriminator model acting as a substitute anomaly detector to provide information for training the generator model.

V-A1 Malicious Sensor Measurement Generator

The objective of the generator model is to generate malicious sensor measurements which can achieve the predefined attack goals whilst bypassing the black box anomaly detector. Since the attacker does not have full information about the physical process, the best strategy for the attacker is to maximize the information he/she can utilize, namely the time series signals in the compromised channels $\mathbf{x}'_{1:t}$, to generate malicious sensor measurements.

Concretely, to utilize the information from the compromised channels, we define a sliding window:

$$W_t = \{x'_{t-l+1}, \ldots, x'_t\},$$

which contains all the time series signals obtained from the compromised channels from time $t-l+1$ to $t$, where $l$ is the length of the sliding window. Moreover, we also maintain another sliding window $W^a_t$, which differs from $W_t$ only in that the previously generated malicious sensor measurements are injected and their real values are replaced. Our generator generates the next malicious sensor measurements $s^a_{t+1}$ at time $t+1$ based on both $W_t$ and $W^a_t$. Intuitively, $W_t$ captures the real system dynamics and provides the information relevant for achieving the attack goals, whilst $W^a_t$ provides the related context for bypassing the black box anomaly detector, taking the previously generated malicious measurements into consideration.

We approach malicious sensor measurement generation as a sequence learning problem; hence, we propose an LSTM-FNN model, as illustrated in Figure 4, for our generator. Specifically, the two LSTMs read in the signals in the sliding windows $W_t$ and $W^a_t$ separately, learn their temporal features, and respectively encode them to hidden vectors $h_t$ and $h^a_t$, which provide the context for the generation of malicious sensor measurements. The FNN is then used to learn their high-dimensional joint features and output the malicious sensor measurements $s^a_{t+1}$. Concretely, the model can be represented by the following equations:

$$h_i = f_{\mathrm{LSTM}}(h_{i-1}, x'_i), \qquad h^a_i = f_{\mathrm{LSTM}}(h^a_{i-1}, x^a_i), \qquad i = t-l+1, \ldots, t,$$
$$v_t = \mathrm{ReLU}(V h_t + V^a h^a_t + b_v),$$
$$s^a_{t+1} = W_s v_t + b_s,$$

where the equations in the first line iteratively encode $W_t$ and $W^a_t$ to the hidden vectors $h_t$ and $h^a_t$, in which $f_{\mathrm{LSTM}}$ is a complex function serving as a short form of Equations (1) to (5); $V$, $V^a$ and $b_v$ are the weight matrices and the bias vector used to further encode the hidden vectors into a higher-dimensional feature representation $v_t$; $W_s$ and $b_s$ are the weight matrix and the bias vector used to decode $v_t$ to the output malicious sensor measurements. For convenience, we represent the generator model as an overall function $G$:

$$s^a_{t+1} = G(W_t, W^a_t;\, \theta_G), \qquad (7)$$

where $\theta_G$ denotes the parameters of the model, and $W_t$, $W^a_t$ and $s^a_{t+1}$ are the inputs and output of the model, respectively.

Fig. 4: The generator model
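A possible Keras realisation of this generator is sketched below. The layer sizing follows the choices reported in Section VI-B4 (LSTM units four times, FNN units two times the signal dimension); the input names and dimensions are our own illustrative assumptions.

```python
import tensorflow as tf

l, d, m = 10, 12, 1                        # window length, signal dim, #compromised sensors (illustrative)
w_real = tf.keras.Input(shape=(l, d), name="window_real")      # W_t
w_attk = tf.keras.Input(shape=(l, d), name="window_attacked")  # W^a_t
h1 = tf.keras.layers.LSTM(4 * d)(w_real)   # encode the real system dynamics
h2 = tf.keras.layers.LSTM(4 * d)(w_attk)   # encode the attacked context
v = tf.keras.layers.Concatenate()([h1, h2])
v = tf.keras.layers.Dense(2 * d, activation="relu")(v)  # higher-level joint features
s_fake = tf.keras.layers.Dense(m)(v)       # malicious sensor measurements s^a_{t+1}
generator = tf.keras.Model([w_real, w_attk], s_fake)
```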

Then, let $T^a$ be the set of all moments for generating malicious sensor measurements and $N^a$ be the size of $T^a$; making the generated malicious measurements bypass the black box anomaly detector whilst achieving the attacker's goals is then equivalent to optimizing the generator model as follows:

$$\min_{\theta_G} \; \frac{1}{N^a} \sum_{t \in T^a} f(\mathbf{x}^a_{1:t+1}) \qquad (8)$$
$$\text{s.t.} \quad s^a_{i,t+1} \le s_{i,t+1} - \delta_i, \quad \forall i,\; \forall t \in T^a, \qquad (9)$$

where $\mathbf{x}^a_{1:t+1}$ denotes the time series with the generated malicious measurements injected, and the constraints in Equation (9) encode the attack goals (shown here for goals in the smaller-than-real direction).

From the above we can clearly see that, in order to optimize the generator model, we first have to use a substitute anomaly detection model to approximate the black box anomaly detection function $f$, so as to provide information for the training of the generator.

V-A2 Substitute Anomaly Detector

In order to provide information to optimize the generator of malicious sensor measurements, we propose another neural network model to approximate the black box anomaly detector. Again, without the ability to access the entire time series $\mathbf{x}_{1:t}$, our strategy for defining the substitute anomaly detector is to utilize a sliding window $W^*_t$ (which can be either $W_t$ or $W^a_t$, depending on whether malicious sensor measurements were injected at the previous time steps within the sliding window) to classify whether the sensor measurements at time $t+1$ are malicious or not.

Fig. 5: The substitute anomaly detector model

Concretely, we employ an LSTM-FNN discriminator model, whose architecture is illustrated in Figure 5, as the substitute anomaly detector. The model consists of two parts: an LSTM which takes the sliding window $W^*_t$ as input, learns its temporal features, and encodes them to a hidden vector $h^*_t$; and an FNN which takes $h^*_t$ and the sensor measurements $s^*_{t+1}$ as input ($s^*_{t+1}$ can be either malicious measurements $s^a_{t+1}$ or real measurements $s_{t+1}$), encodes them to a hidden vector capturing nonlinear features, and then outputs a scalar $d_{t+1}$ for classification. The larger the value of $d_{t+1}$, the more likely $s^*_{t+1}$ is to be malicious. Note that, in general, for a binary classification problem the sigmoid activation function would be used to transform the output to a probability in $(0,1)$. However, since we use a training method inspired by WGAN, the sigmoid activation function is not used in our substitute anomaly detector.

With the discriminator model, the whole procedure for computing $d_{t+1}$ can be outlined by the following equations:

$$h^*_i = f_{\mathrm{LSTM}}(h^*_{i-1}, x^*_i), \qquad i = t-l+1, \ldots, t,$$
$$v^*_t = \mathrm{ReLU}(U h^*_t + U_s s^*_{t+1} + b_u),$$
$$d_{t+1} = u^\top v^*_t + b_d,$$

where the first equation iteratively encodes $W^*_t$ to a hidden vector $h^*_t$, similarly to the generator model; the second equation encodes $h^*_t$ and $s^*_{t+1}$ to a hidden vector $v^*_t$, in which $U$ and $U_s$ are their weight matrices and $b_u$ is a bias vector; the last equation decodes $v^*_t$ to the output scalar $d_{t+1}$, in which $u$ is the weight vector and $b_d$ is a bias scalar.

Again, for convenience, we represent the substitute anomaly detector as a function $D$:

$$d_{t+1} = D(W^*_t, s^*_{t+1};\, \theta_D),$$

where $\theta_D$ denotes the parameters of the detector, $W^*_t$ and $s^*_{t+1}$ are the inputs, and $d_{t+1}$ is the output. Then, the learning goal of $D$ is to output small values for real sensor measurements and large values for malicious sensor measurements in order to classify them. Hence, the optimization problem can be formulated as follows:

$$\min_{\theta_D} \; \frac{1}{N} \sum_{t \in T} D(W_t, s_{t+1};\, \theta_D) \;-\; \frac{1}{N^a} \sum_{t \in T^a} D(W^a_t, s^a_{t+1};\, \theta_D), \qquad (10)$$

where $T$ and $T^a$ denote the sets of moments for sampling real measurements and generated malicious measurements for training $D$, respectively, with sizes $N$ and $N^a$; $s^a_{t+1}$ and $s_{t+1}$ represent a generated malicious measurement sample and a real measurement sample, respectively.
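A matching Keras sketch of the substitute anomaly detector is given below; note that the final layer has no sigmoid activation, in line with the WGAN-style training, and all sizes and names are illustrative assumptions.

```python
import tensorflow as tf

l, d, m = 10, 12, 1
window = tf.keras.Input(shape=(l, d), name="window")              # W^*_t
meas = tf.keras.Input(shape=(m,), name="candidate_measurements")  # s^*_{t+1}
h = tf.keras.layers.LSTM(4 * d)(window)                  # temporal features of the window
h = tf.keras.layers.Concatenate()([h, meas])
h = tf.keras.layers.Dense(2 * d, activation="relu")(h)   # nonlinear joint features
score = tf.keras.layers.Dense(1)(h)                      # unbounded score: larger => more likely malicious
critic = tf.keras.Model([window, meas], score)
```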

V-A3 The GAN

With the substitute anomaly detector, the optimization problem of the generator becomes equivalent to generating malicious sensor measurements which make the substitute anomaly detector output as small a value as possible whilst achieving the attack goals. Specifically, we can replace the black box function $f$ in Equation (8) by the function $D$, and the optimization problem can be reformulated as follows:

$$\min_{\theta_G} \; \frac{1}{N^a} \sum_{t \in T^a} D\big(W^a_t,\, G(W_t, W^a_t;\, \theta_G);\, \theta_D\big) \quad \text{s.t.} \quad s^a_{i,t+1} \le s_{i,t+1} - \delta_i, \quad \forall i,\; \forall t \in T^a,$$

where we assume $\theta_D$ is fixed for the time being. Note that the above formulation requires that all the generated malicious sensor measurements achieve the attack goals, which is, however, generally not feasible in practice (at some time points, if the attack goals are too ambitious, it is impossible to generate measurements which both bypass the anomaly detector and achieve the attack goals). As a result, we relax the optimization problem to allow the attack goals to fail at some time points, but we pay a cost for each failed case. To implement this, we introduce slack variables $\xi_{i,t} \ge 0$. Specifically, a non-zero value of $\xi_{i,t}$ allows a generated sensor measurement to not satisfy an attack goal at a cost proportional to the value of $\xi_{i,t}$.

With slack variables, the formulation of the optimization problem becomes:

$$\min_{\theta_G} \; \frac{1}{N^a} \sum_{t \in T^a} \Big[ D\big(W^a_t,\, G(W_t, W^a_t;\, \theta_G);\, \theta_D\big) + \lambda \sum_i \xi_{i,t} \Big] \qquad (11)$$
$$\text{s.t.} \quad s^a_{i,t+1} \le s_{i,t+1} - \delta_i + \xi_{i,t}, \quad \xi_{i,t} \ge 0, \quad \forall i,\; \forall t \in T^a, \qquad (12)$$

where the second equation means we achieve the attack goals up to an added (or, for goals in the opposite direction, subtracted) slack variable $\xi_{i,t}$; $\lambda$ in the first equation is a hyperparameter which controls the trade-off between the probability of bypassing the substitute anomaly detector and the distance to achieving the attack goals: as $\lambda$ becomes larger, the generator model is more willing to generate sensor measurements which achieve the attack goals; when $\lambda$ is small, the model is more likely to generate sensor measurements which can bypass the anomaly detector, but the attack goals are not strictly satisfied. In our case, $\lambda$ is always set to a small value, as we should always prioritize the generator's ability to bypass the anomaly detector.

More importantly, we can always find slack variables satisfying the constraints in Equation (12), and their values can be obtained by the following equation:

$$\xi_{i,t} = \max\big(0,\; s^a_{i,t+1} - (s_{i,t+1} - \delta_i)\big).$$

Substituting the above equation into Equation (11), we can finally remove the constraints in Equation (12); the constrained optimization problem for the generator is thus converted to an unconstrained optimization problem over $\theta_G$.
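In code, the resulting unconstrained generator objective is simply the critic score plus $\lambda$ times a hinge penalty on the attack-goal shortfall. The sketch below assumes the smaller-than-real goal and illustrative names; other goal directions only change the hinge term.

```python
import tensorflow as tf

def generator_loss(critic_score, s_fake, s_real, delta, lam):
    """Unconstrained generator objective after eliminating the slack variables."""
    # Slack value: how far the generated measurement misses the goal
    # s_fake <= s_real - delta; zero when the goal is met.
    xi = tf.nn.relu(delta - (s_real - s_fake))
    return tf.reduce_mean(critic_score) + lam * tf.reduce_mean(xi)
```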

Having completed the definition of the generator and the substitute anomaly detector, we illustrate the architecture of the stealthy attack GAN in Figure 6, in which dashed arrows indicate optional data flow.

Fig. 6: The architecture of the stealthy attack GAN

V-B Real-time Learning Method

Here we present the learning method used to train the stealthy attack GAN. The basic principle of our learning method follows the training principle of WGAN, which is to train the generator and the substitute anomaly detector iteratively in an adversarial game until the generated malicious sensor measurements cannot be distinguished from real measurements by the substitute anomaly detector. However, since it is generally not feasible for the attacker to collect the intercepted time series signals in a repository and then train the stealthy attack GAN in an offline mode, we propose a real-time learning method instead. The whole procedure of the learning method is illustrated in Algorithm 2.

0:  the learning rate $\eta$, the clipping parameter $c$, the number of time steps $n_d$ for training the substitute anomaly detector per time step for training the generator.
0:  a sliding window $W_t$ with real sensor measurements, and another sliding window $W^a_t$ with generated malicious sensor measurements.
0:  the length of the sliding windows $l$, the total number of time steps for learning $T$, the trade-off hyperparameter $\lambda$, the probability $\rho$ for resetting $W^a_t$ with $W_t$.
0:  the initial parameters $\theta_D$ of the substitute detector, the initial parameters $\theta_G$ of the generator, the initial sliding windows $W_t$, $W^a_t$.
1:  for $t = 1, \ldots, T$ do
2:     Generate a random number $\epsilon$ uniformly distributed in $[0, 1]$
3:     if $\epsilon < \rho$ then
4:        Set $W^a_t \leftarrow W_t$
5:     end if
6:     if $t \bmod (n_d + 1) \ne 0$ then
7:        Generate malicious measurements: $s^a_{t+1} \leftarrow G(W_t, W^a_t;\, \theta_G)$
8:        $g_{\theta_D} \leftarrow \nabla_{\theta_D} \big[ D(W_t, s_{t+1};\, \theta_D) - D(W^a_t, s^a_{t+1};\, \theta_D) \big]$
9:        $\theta_D \leftarrow \theta_D - \eta \cdot \mathrm{RMSProp}(\theta_D, g_{\theta_D})$
10:       $\theta_D \leftarrow \mathrm{clip}(\theta_D, -c, c)$
11:       Reset $W_t$ and $W^a_t$
12:    end if
13:    if $t \bmod (n_d + 1) = 0$ then
14:       Fix $\theta_D$
15:       $g_{\theta_G} \leftarrow \nabla_{\theta_G} \big[ D\big(W^a_t,\, G(W_t, W^a_t;\, \theta_G);\, \theta_D\big) + \lambda \sum_i \xi_{i,t} \big]$
16:       $\theta_G \leftarrow \theta_G - \eta \cdot \mathrm{RMSProp}(\theta_G, g_{\theta_G})$
17:       Reset $W_t$ and $W^a_t$
18:    end if
19:  end for
Algorithm 2 The real-time learning method of the stealthy attack GAN

Specifically, our learning method only requires maintaining two sliding windows $W_t$ and $W^a_t$ for the time series signals in the compromised channels, holding the real sensor measurements and the corresponding generated malicious sensor measurements, respectively. At each time step, either the substitute anomaly detector is trained to minimize the objective function in Equation (10) using the gradient at the current sample, which contains the real as well as the generated malicious measurements for the time step (Steps 7 to 9), or the generator is trained to minimize the objective function in Equation (11) by the gradient at the current sample, in which the two sliding windows are used to generate malicious measurements for the time step so as to fool the substitute anomaly detector whilst achieving the attack goals (Steps 14 to 16). At the end of each time step, the two sliding windows are reset such that the new signals at time $t+1$ are appended and the expired signals at time $t-l+1$ are removed (Steps 11 and 17). Moreover, we also periodically reset $W^a_t$ with $W_t$ at a rate specified by $\rho$ to start a new cycle of continuously injecting malicious measurements (Steps 2 to 5). Note that resetting $W^a_t$ periodically is rather important, since this gives the stealthy attack GAN the chance to properly learn how to initiate attacks.
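Putting the pieces together, the following self-contained skeleton mirrors Algorithm 2, reusing the generator, critic and generator_loss sketches above; the signal stream, window helpers and all hyperparameter values are stand-ins we introduce for illustration, not the paper's implementation.

```python
import numpy as np
import tensorflow as tf

l, d, m = 10, 12, 1
def signal_stream(T=2000):                     # stand-in for intercepted traffic
    for _ in range(T):
        yield np.random.randn(1, d).astype("float32")

def slide(win, sig):                           # append the new signal, drop the expired one
    return np.concatenate([win[:, 1:], sig[:, None]], axis=1)

opt_d = tf.keras.optimizers.RMSprop(5e-5)
opt_g = tf.keras.optimizers.RMSprop(5e-5)
clip_c, n_d, p_reset, delta, lam = 0.01, 4, 0.05, 4.0, 0.1
W_real = W_attk = np.zeros((1, l, d), dtype="float32")

for t, x_t in enumerate(signal_stream()):
    if np.random.rand() < p_reset:
        W_attk = W_real                        # restart a malicious-injection cycle
    s_fake = generator([W_real, W_attk])       # next malicious measurements
    s_real = x_t[:, :m]                        # real measurements at this step
    if t % (n_d + 1) != 0:                     # train the substitute detector
        with tf.GradientTape() as tape:
            # Push scores down on real measurements, up on malicious ones (Eq. 10).
            loss_d = (tf.reduce_mean(critic([W_real, s_real]))
                      - tf.reduce_mean(critic([W_attk, s_fake])))
        opt_d.apply_gradients(zip(tape.gradient(loss_d, critic.trainable_variables),
                                  critic.trainable_variables))
        for w in critic.trainable_variables:   # WGAN-style weight clipping
            w.assign(tf.clip_by_value(w, -clip_c, clip_c))
    else:                                      # train the generator with the critic fixed
        with tf.GradientTape() as tape:
            s_fake = generator([W_real, W_attk])
            loss_g = generator_loss(critic([W_attk, s_fake]),
                                    s_fake, s_real, delta, lam)
        opt_g.apply_gradients(zip(tape.gradient(loss_g, generator.trainable_variables),
                                  generator.trainable_variables))
    x_attk = np.concatenate([s_fake.numpy(), x_t[:, m:]], axis=1)  # inject fake measurements
    W_real = slide(W_real, x_t)                # reset the two sliding windows
    W_attk = slide(W_attk, x_attk)
```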

After the training of the stealthy attack GAN has finished, the attacker can generate the malicious sensor measurements for each time step of the attacking phase using Equation (7), and then inject them into the corresponding compromised channels to replace the real measurements and achieve the attack goals.

VI Gas Pipeline Case Study

In this section, we show a case study in which we conduct stealthy attacks on a laboratory-scale gas pipeline system using our framework. Specifically, the system in our case study consists of a small airtight pipeline connected to a pressure meter, a pump, and a solenoid-controlled relief valve. A SCADA system is deployed to control the air pressure in the pipeline; it contains a PLC, a sensor for pressure measurement, and several actuators. Our attack goal is to constantly fool the PLC with a malicious pressure measurement which is smaller than its real value by varying amounts, which can potentially lead to the explosion of the gas pipeline. All the experiments in this section are conducted by exploiting a public dataset [39] which records the network traffic data log captured from the gas pipeline SCADA system.

VI-A Gas Pipeline Dataset

The gas pipeline dataset for training and testing our stealthy attack GAN consists of the sensor measurements and control commands extracted every two seconds from normal network packets in a gas pipeline network traffic data log originally described in [39]. In total, 68,803 time series signals are collected. A detailed description of the extracted features for each signal is listed in Table I. Specifically, the gas pipeline includes a system control mode with three states: off, manual control, or automatic control. In off mode, the pump is forced to the off state to allow it to cool down. In manual mode, an HMI can be used to manually change the pump state and control the relief valve state by opening or closing the solenoid. In automatic mode, the control scheme determines which mechanism is used to regulate the pressure set point: either turning a pump on or off, or opening and closing a relief valve using a solenoid. Moreover, a proportional integral derivative (PID) controller is used to control the pump or solenoid, depending upon the control scheme chosen. Six PID control parameters can be set: pressure set point, gain, reset rate, rate, dead band, and cycle time.

Feature                Description
setpoint               The pressure set point
gain                   PID gain
reset rate             PID reset rate
deadband               PID dead band
cycle time             PID cycle time
rate                   PID rate
system mode            Automatic (2), manual (1) or off (0)
control scheme         Either pump (0) or solenoid (1)
pump                   Pump control – on (1) or off (0); only for manual mode
solenoid               Valve control – open (1) or closed (0); only for manual mode
pressure measurement   Pressure measurement

TABLE I: Extracted sensor measurements (pressure measurement) and control commands from the gas pipeline dataset [39]

VI-B Experiment Setup

In our experiments, we split the time-series dataset into two slices. The first slice is used for the reconnaissance phase to train our stealthy attack GAN, and the other slice is used for the attacking phase. Since the gas pipeline dataset is relatively small, we go through the first time-series slice for 50 passes to properly train the stealthy attack GAN using our real-time learning method.

VI-B1 Baseline Anomaly Detector

For the baseline anomaly detector, three predictive models, namely the AR model, the LDS model, and the LSTM model as described in Section III-A, are fitted to minimize the mean square error between the predicted pressure measurements and their real values, using cross validation on the first time-series slice. We pick the model with the best prediction accuracy, which is the LSTM model (with the previous 10 signals used as the model inputs), as our baseline predictive model. Both the residual errors and the CUSUM statistics described in Section III-B are used to detect anomalies.

VI-B2 Attack Scenarios

The normal air pressure range for the gas pipeline system is approximately [0, 40]. Let $s_t$ be the real pressure measurement at time $t$; we investigate two attack goals, which are to inject malicious measurements $s^a_t$ such that $s^a_t \le s_t - 4$ and $s^a_t \le s_t - 8$, respectively. Specifically, the above attack goals require the malicious measurements to be 4 or 8 units smaller than their real values (but not beyond the normal pressure range). Furthermore, we also consider two cases: one in which the attacker can only compromise the PLC-sensor channel, and one in which the attacker can compromise all the PLC-sensor and PLC-actuator channels in the gas pipeline system. To summarize, we consider four attack scenarios with different assumptions about the attacker's goals and abilities, as illustrated in Table II.

                                     Attack Goal
Attacker's Abilities                 $s^a_t \le s_t - 4$   $s^a_t \le s_t - 8$
PLC-sensor channel compromised       Attack Scenario 1     Attack Scenario 2
All channels compromised             Attack Scenario 3     Attack Scenario 4

TABLE II: Attack scenarios of the gas pipeline case study

VI-B3 Feature Processing

We normalize all continuous features into the range $[0, 1]$ by min-max scaling. Specifically, let $v_t$ be the value of a continuous feature, e.g. the setpoint, at time $t$, and let $v_{\min}$ and $v_{\max}$ respectively be the minimal and maximal values of the feature; we then convert $v_t \leftarrow (v_t - v_{\min}) / (v_{\max} - v_{\min})$. All the categorical features (system mode, control scheme, pump, solenoid) are one-hot encoded, e.g. the automatic, manual and off system modes are encoded as [1,0,0], [0,1,0] and [0,0,1], respectively.
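The feature processing amounts to the following two helpers; the printed examples use the pressure range and system-mode encoding described above, and the function names are our own.

```python
import numpy as np

def min_max_scale(v, v_min, v_max):
    return (v - v_min) / (v_max - v_min)   # maps v into [0, 1]

def one_hot(value, n_states):
    enc = np.zeros(n_states)
    enc[value] = 1.0
    return enc

print(min_max_scale(20.0, 0.0, 40.0))      # a pressure of 20 in range [0, 40] -> 0.5
print(one_hot(1, 3))                       # manual mode (index 1) -> [0, 1, 0]
```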

VI-B4 Model Parameters and Memory Cost

We explicitly set the number of units in the LSTM layers and the FNN layers of the stealthy attack GAN to four and two times the dimension of the processed signals, respectively. The length of the sliding window $l$ is set to 10, which is equal to the number of previous signals used in the baseline LSTM predictive model. The trade-off hyperparameter $\lambda$ is set to a small value, as we find this achieves the best balance between bypassing anomaly detectors and achieving attack goals, and the probability $\rho$ for resetting $W^a_t$ with $W_t$ is also set to a small value. The memory costs of the stealthy attack GANs, implemented using the Keras library [40], are about 40 kB for the first two attack scenarios and 160 kB for the other two.

Fig. 7: The generated malicious pressure measurement trace in the attacking phase compared with their real values. (Scenario 1 to 4 from left to right)
Fig. 8: The deviation between the generated malicious pressure measurements and their real values at each time point in the attacking phase compared with the target deviation as specified by the attack goals (Scenario 1 to 4 from left to right)
Fig. 9: The ratio of time points at which the attack goal is achieved in the attacking phase with different value of

VI-C Results and Evaluation

For each attack scenario, we train the corresponding stealthy attack GAN using our real-time learning method during the reconnaissance phase, after which we start to generate m