Towards a CAN IDS based on a neural-network data field predictor
Modern vehicles contain a few controller area networks (CANs), which allow scores of on-board electronic control units (ECUs) to communicate messages critical to vehicle functions and driver safety. CAN provides a lightweight and reliable broadcast protocol but is bereft of security features. As evidenced by many recent research works, CAN exploits are possible both remotely and with direct access, fueling a growing CAN intrusion detection system (IDS) body of research. A challenge for pioneering vehicle-agnostic IDSs is that passenger vehicles’ CAN message encodings are proprietary, defined and held secret by original equipment manufacturers (OEMs). Targeting detection of next-generation attacks, in which messages are sent from the expected ECU at the expected time but with malicious content, researchers are now seeking to leverage “CAN data models”, which predict future CAN message contents and use prediction error to identify anomalous, hopefully malicious CAN messages. Yet, current works model CAN signals post-translation, i.e., after applying OEM-donated or reverse-engineered translations from raw data. In this work, we present initial IDS results testing deep neural networks used to predict CAN data at the bit level, thereby providing IDS capabilities while avoiding reverse engineering of proprietary encodings. Our results suggest the method is promising for continuous signals in CAN data but struggles for discrete, e.g., binary, signals.
1. Introduction & Background
Modern vehicles are increasingly “drive-by-wire,” meaning once-mechanical interfaces of subsystems have been replaced by communication of electronic control units (ECUs), small computers orchestrating the subsystems. Rather than using dedicated connections for each ECU pair, a few controller area networks (CANs) allow broadcast communications of all ECUs. In particular, we focus on the high-speed (250–500 Kbps) CAN bus, as it is used for much of modern vehicle communications.
CAN 2.0 provides a standard protocol defining the physical and data link layers (Bosch GmbH, 1991). See Figure 1 for the automotive CAN frame format. Each packet’s information is contained in two fields, the Arbitration ID (AID) used for indexing and prioritizing frames and the data field containing up to 64 bits of message contents. The mapping of the data field’s bits to the signals it encodes is a proprietary secret, defined by the original equipment manufacturers (OEMs, e.g., Ford, GM), and the encodings change depending on make, model, year, and even vehicle specifications. This poses an obstacle for producing vehicle-agnostic solutions for automotive CANs, in particular, defensive and offensive cyber security. See recent work of Verma et al. (Verma et al., 2018), and Nolan et al. (Nolan et al., 2018) on discovering the syntax and semantics of automotive CAN data.
CAN is a reliable and lightweight protocol, but it has few security features, e.g., no encryption nor authentication, and has been proven to be exploitable with direct access (Hoppe et al., 2008; Checkoway et al., 2011; Moore et al., 2017; Miller and Valasek, 2013) or even remotely (Miller and Valasek, 2015; Woo et al., 2015). The attack surface for in-vehicle CANs is growing as cars become increasingly exposed, e.g., via USB, cellular, and Bluetooth interfaces, and with the advent of vehicle-to-vehicle and vehicle-to-infrastructure networking. Providing effective intrusion detection for automotive CANs is a burgeoning research topic (Tomlinson et al., 2018).
1.1. Related CAN IDS Works
Initial automotive CAN IDS research has been rule-based (Müter et al., 2010; Hoppe et al., 2008), which pushes security to OEMs, as rules are dependent on CAN encodings (model-specific) and may require knowledge of specific attacks. Multiple works (Moore et al., 2017; Gmiden et al., 2016; Song et al., 2016) exploit message frequency anomalies for vehicle-agnostic detection of message injection attacks. In response to the infamous Miller and Valasek remote Jeep hack (Miller and Valasek, 2015) (which used a masquerade attack in which one ECU sent malicious braking signals while the brake ECU was silenced), multiple efforts have proposed data-driven approaches to ECU identification to detect AIDs originating from the wrong transmitter (Cho and Shin, 2016; Lee et al., 2017; Choi et al., 2018).
The logical next-generation attack involves a reprogrammed ECU sending appropriate AIDs with appropriate timing, but with augmented, potentially malicious, data field contents. After-market “chipping” kits exhibit this capability by reprogramming ECUs, although in practice these are used for performance-tuning, not malicious purposes. Works are emerging that test supervised deep learners trained on specific attacks with labeled data (Kang and Kang, 2016; Loukas et al., 2018). We seek anomaly detection to avoid training towards a specific attack.
Unsupervised CAN IDS research for detecting malicious message contents has begun modeling correlations inherent to the CAN data that may be broken by such attacks, admitting detection. Tyree et al. (Tyree et al., 2018) propose a manifold learning technique to identify relationships in CAN data that are broken during attacks that do not coordinate related signals. Their technique requires at least the ability to tokenize (partition) the up-to 64-bit CAN data fields into signal-sized messages but not fully translate the CAN data. The other three works seeking to exploit CAN data correlations require complete knowledge of the CAN signals: Ganesan et al. (Ganesan et al., 2017) learn correlation of value pairs (e.g., speed, accelerator pedal position) using both CAN and sensor data to detect injection attacks. IDS research of Li (Li, 2016) and of Testud (Testud, 2017) propose a three-step process to model CAN packets and detect unexpected packets: (1) reverse engineer or partner with an OEM to obtain many signals in the CAN data, (2) train deep learning, neural network regressor(s) to predict the next signal value(s) from the history of observations, (3) use the error in predicted values from observed as an online anomaly detector.
We present initial results for a CAN prediction model without step (1). That is, previous work translated the 64-bit data field into the signals it encodes (requiring OEM knowledge or tedious reverse engineering) and built models of the signals. Rather, our approach models an AID’s 64-bit data field. Hence, we commence prediction and detection (steps (2) and (3)) without requiring any translation of the CAN message bits to signals.
Our long-term goal is to provide an after-market IDS for ideally all passenger vehicles. This means we cannot rely on OEM-defined CAN mappings. En route to this goal we adopt the neural network CAN prediction model; specifically, from a history of CAN data our regressors predict the next CAN data field, and we too use prediction error to detect anomalous messages. Unlike the previous two similar works (Li, 2016; Testud, 2017), we do not translate CAN data fields to signals, as we do not have the OEM’s proprietary mappings. Instead, we train a deep neural network for each AID to predict its next 64-bit data field.
The primary contribution of this work is the presentation of initial results showing the efficacy of bit-level CAN models for attack detection. The benefit of this approach is straightforward—it extends the general CAN modeling framework for anomaly detection (which relies critically on OEM-proprietary CAN mappings) to a vehicle-agnostic detector, as no CAN mappings are assumed. Although our focus is CAN IDS, CAN models can be used for other applications, e.g., CAN simulators.
2. CAN Prediction Model
The essential hypothesis of CAN prediction models is that future messages depend on recently past or other concurrent messages. While our overall attack detector is unsupervised—that is, we do not require labeled attack and non-attack data—we exploit supervised learning to build a CAN prediction model. Specifically, we create labeled data by taking a fixed AID’s ten most recently observed data fields and trying to predict the next (11th) one. Hence, we model each AID independently.
Let X be the set of training examples and Y be the set of labels. Our training data is a set of tuples (xᵢ, yᵢ) ∈ X × Y, where xᵢ = (dᵢ, …, dᵢ₊₉) is a sequence of ten consecutive 64-bit data fields of a fixed AID and yᵢ = dᵢ₊₁₀ is the next data field, as shown in Figure 2.
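The sliding-window construction described above (ten past data fields predicting the eleventh) can be sketched as follows; the function name and the use of NumPy are our own illustrative choices, not part of the original implementation:

```python
import numpy as np

def make_training_pairs(data_fields, window=10):
    """Build (x_i, y_i) pairs from a sequence of 64-bit data fields.

    data_fields: array of shape (T, 64), each row the bits of one observed
    data field for a fixed AID, in time order.
    Returns X of shape (T-window, window, 64) and Y of shape (T-window, 64).
    """
    X = np.stack([data_fields[i:i + window]
                  for i in range(len(data_fields) - window)])
    Y = data_fields[window:]
    return X, Y

# Example: 100 observed frames for one AID (random bits for illustration).
frames = np.random.randint(0, 2, size=(100, 64))
X, Y = make_training_pairs(frames)
# X.shape == (90, 10, 64); Y.shape == (90, 64)
```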
Recurrent neural networks (RNNs) model temporal/sequential dependence by including the previous prediction’s hidden state as well as given inputs into the current prediction (Rumelhart et al., 1986). Long Short-Term Memory (LSTM) layers provide a particular architecture for portions of an RNN that seek to leverage dependence in modeling better than “vanilla” RNNs, as they are crafted to avoid vanishing gradient problems common in RNN training (Hochreiter and Schmidhuber, 1997). Hence, this statistical machinery is a natural choice for our model.
We build the model using Keras (www.keras.io), a Python deep learning module. The model consists of three LSTM layers, a dropout layer, and two dense layers. The last layer has 64 nodes (one per predicted bit of the next data field) with softmax as the activation function. Between the two dense layers, we include a dropout layer to prevent overfitting of our model; we set its drop rate to 0.2 (i.e., 20% of the neurons in the first dense layer are dropped during training). To train the model we used a batch size of 32. Out of several tested architectures, in which we varied the number of layers and the size of the hidden layers, this one showed the best performance. See Figure 3.
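A minimal Keras sketch of the described architecture follows. The hidden-layer width (128) and the optimizer/loss choices are our own assumptions for illustration, as the text reports only the layer types, the 64-node softmax output, and the 0.2 drop rate:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

def build_model(window=10, bits=64, hidden=128):
    """Three LSTM layers, two dense layers with dropout in between, and a
    64-node output layer, following the description in the text.
    `hidden` is an assumed width; the paper does not report layer sizes."""
    model = Sequential([
        LSTM(hidden, return_sequences=True, input_shape=(window, bits)),
        LSTM(hidden, return_sequences=True),
        LSTM(hidden),                       # final LSTM emits its last state only
        Dense(hidden, activation='relu'),   # first dense layer
        Dropout(0.2),                       # drop 20% of the dense layer's units
        Dense(bits, activation='softmax'),  # one output node per predicted bit
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy')
    return model
```

The softmax output follows the text; for an independent-bit prediction task a per-bit sigmoid would be a common alternative.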
For each desired AID, we train the described LSTM on ambient CAN data collected during normal driving conditions. We denote such a model f: X → Y, where X is the set of training examples and Y is the set of labels for each example. For a given input vector x (the previous ten observed data fields), let ŷ = f(x) denote the predicted next 64-bit data field.
To build an anomaly score from the AID’s trained prediction model, we consider the error of each prediction, e = ‖ŷ − y‖. To account for model inaccuracies, we compute the mean μ and variance σ² of the observed prediction errors by applying the model to the training set. Finally, we compute the Gaussian z-score of a newly observed error e, z = (e − μ)/σ, and use the one-sided p-value as our anomaly score, p-value = 1 − CDF(z), where CDF is the standard normal cumulative distribution function. Note that if the error is less than expected (z < 0), the p-value is greater than 1/2 and tends to 1 as z → −∞. Similarly, if the error is greater than expected (z > 0), the p-value is less than 1/2 and tends to 0 as z → ∞. Hence, a small p-value occurs if and only if the error is large relative to observations in training.
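A minimal sketch of this anomaly score, using only the Python standard library; here μ and σ are the mean and standard deviation of the prediction errors observed on the training set, as described above:

```python
import math

def anomaly_pvalue(error, mu, sigma):
    """One-sided Gaussian p-value of a prediction error.

    z = (error - mu) / sigma; p = 1 - CDF(z), where CDF is the standard
    normal cumulative distribution function.
    A small p-value corresponds to an unusually large error.
    """
    z = (error - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf

# An error equal to the training mean gives p = 0.5; an error far above
# the mean gives p near 0, flagging the message as anomalous.
```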
3. Experiments
To respect space constraints, we present two indicative experiments: one model of an AID that seems to communicate continuous signals, another of an AID that seems to communicate discrete signals. Specifically, we believe the first AID communicates four two-byte messages giving the wheels’ respective speeds, and the second AID a binary indicator of whether the vehicle is in reverse. For data collection we used the Vehicle Spy software, produced by Intrepid Control Systems, Inc. (www.intrepidcs.com/products/software/vehicle-spy), allowing passive monitoring of CAN data via the OBD-II port.
For training, we used a portion of CAN data recorded during ambient driving lasting 141 seconds. Figure 4 visualizes a snippet of the training data for each AID. Once the prediction model is trained for each AID, we fit a Gaussian to the observed prediction errors using only the training examples. Hence, we apply the trained model to the training set, observe the prediction errors eᵢ, and compute their mean μ and variance σ².
To test the detector, we inject CAN frames with each AID, separately, to emulate attacks on the CAN. It is important to stress that the anomaly detector does not consider the frequency nor timestamps of CAN frames, only the sequence of data fields; hence, the high-frequency injections emulate an ECU that is sending messages with false content. For each emulated attack (one per AID), we used an Arduino board for injecting CAN frames as well as the Vehicle Spy for recording CAN data, both connected to the vehicle via an OBD-II port.
3.1. Wheel Speed AID
The actual attack happened from 14s to 29s of the trip. During that time the “attacker” repeatedly injected the same AID with the same message in the 64-bit data field. As can be seen in Figure 5, the p-value of the observed signals occurring between 14s to 29s is extremely low.
3.2. Reverse Lights AID
The actual attack happened from 14.5s until 29s of the capture. During that time the “attacker” repeatedly injected the same AID with the same message in the 64-bit data field. Referring to Figure 6, it is important to note that the p-value of the observed signals is extremely low throughout the test set. However, it hits exactly 0 during the attack period.
3.3. Results Discussion
Overall, we see a very strong difference in our anomaly score between attack and non-attack periods, but finding an a priori threshold seems problematic. We conjecture that the current architecture is a better model for nearly continuous signals with many distinct 64-bit messages (as in Figure 5) that move in a clear pattern (e.g., as speed increases, the lowest-order place bit flips from 0 to 1, then the next place bit, and so on). The second AID, communicating seemingly binary signals, is, unsurprisingly, harder for the model to predict. Perhaps taking inputs from a variety of other AIDs may enhance prediction accuracy.
4. Conclusions & Future Work
Recent approaches to build CAN IDSs train a “CAN language model”, that is, a machine learning model that can accurately predict the next CAN message from previous or concurrent messages. Previous works have trained models on reverse engineered signals, requiring OEM-proprietary (secret) knowledge. In this paper we build a CAN model at the bit level, eliminating the need for CAN data translation, and present initial results in use for an IDS.
To build the CAN model, we assumed a dependency between previous and future data fields within an AID of an automotive CAN, and trained an LSTM recurrent neural network on ambient data for two AIDs. From both AID CAN models, we built an anomaly detector based on the relative prediction error of each CAN message. A very important feature of our method is that our neural network takes raw 64-bit messages as input, and hence does not require extensive preprocessing, e.g., to reverse engineer proprietary CAN encodings. The technique works very well with AIDs that carry many distinct messages (roughly continuous messages) which change often over time. On the other hand, applying the same neural network architecture to an AID with seemingly binary signals (and therefore few distinct messages) does not yield as convincing results. In particular, prediction error during the non-attack period during testing was very large relative to expectations from training (Fig. 6).
For future work, we would like to refine the architecture of the neural network to more accurately predict non-malicious messages. Although outside of scope for this paper, we note that preliminary testing with alternate neural network configurations yielded less accurate results, but lends credence to future work aimed at optimizing the architecture for CAN modeling. Additionally, construction of a model that handles more than just one AID at a time will presumably increase accuracy as CANs communicate states of many different but physically related subsystems. Finally, work is emerging to automatically discover encoded signals in the CAN data fields (e.g, (Verma et al., 2018; Nolan et al., 2018)); hence, the logical next step is to train the CAN models conditioned on information from these works.
The authors thank S. Hollifield, J. Laska, and M. Verma for fruitful discussions. Research sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy, and the National Science Foundation Math Science Graduate Internship (NSF-MSGI).
- Bosch GmbH (1991) Robert Bosch GmbH. 1991. CAN Specification Version 2.0. (1991).
- Checkoway et al. (2011) Stephen Checkoway and others. 2011. Comprehensive Experimental Analyses of Automotive Attack Surfaces.. In USENIX Security Symposium. San Francisco.
- Cho and Shin (2016) Kyong-Tak Cho and Kang G Shin. 2016. Fingerprinting Electronic Control Units for Vehicle Intrusion Detection.. In USENIX Security Symp. 911–927.
- Choi et al. (2018) Wonsuk Choi and others. 2018. Identifying ECUs through Inimitable Characteristics of Signals in Controller Area Networks. IEEE Trans. Vehic. Tech. (2018).
- Ganesan et al. (2017) Arun Ganesan, Jayanthi Rao, and Kang Shin. 2017. Exploiting Consistency Among Heterogeneous Sensors for Vehicle Anomaly Detection, In SAE Technical Paper. (03 2017). https://doi.org/10.4271/2017-01-1654
- Gmiden et al. (2016) Mabrouka Gmiden, Mohamed Hedi Gmiden, and Hafedh Trabelsi. 2016. An intrusion detection method for securing in-vehicle CAN bus. In Proc. of Sciences and Techniques of Automatic Control and Computer Engineering. IEEE.
- Hochreiter and Schmidhuber (1997) Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput. 9, 8 (Nov. 1997), 1735–1780. DOI:http://dx.doi.org/10.1162/neco.1997.9.8.1735
- Hoppe et al. (2008) Tobias Hoppe and others. 2008. Security threats to automotive CAN networks–practical examples and selected short-term countermeasures. In Inter. Conf. Comp. Safety, Reliability & Security. Springer.
- Kang and Kang (2016) Min-Joo Kang and Je-Won Kang. 2016. Intrusion detection system using deep neural network for in-vehicle network security. PloS one 11, 6 (2016), e0155781.
- Lee et al. (2017) Hyunsung Lee, Seong Hoon Jeong, and Huy Kang Kim. 2017. OTIDS: A Novel Intrusion Detection System for In-vehicle Network by using Remote Frame. In PST (Privacy, Security and Trust). (accepted).
- Li (2016) Jun Li. 2016. Deep Learning on CAN Bus. (2016). https://youtu.be/1QSo5sOfXtI
- Loukas et al. (2018) George Loukas and others. 2018. Cloud-Based Cyber-Physical Intrusion Detection for Vehicles Using Deep Learning. IEEE Access 6 (2018). DOI:http://dx.doi.org/10.1109/ACCESS.2017.2782159
- Miller and Valasek (2013) Charlie Miller and Chris Valasek. 2013. Adventures in automotive networks and control units. Def Con 21 (2013), 260–264.
- Miller and Valasek (2015) Charlie Miller and Chris Valasek. 2015. Remote exploitation of an unaltered passenger vehicle. Black Hat USA 2015 (2015), 91.
- Moore et al. (2017) Michael R Moore, Robert A Bridges, Frank L Combs, Michael S Starr, and Stacy J Prowell. 2017. Modeling inter-signal arrival times for accurate detection of CAN bus signal injection attacks: a data-driven approach to in-vehicle intrusion detection. In Proc. CISRC. ACM, 11.
- Müter et al. (2010) Michael Müter and others. 2010. A structured approach to anomaly detection for in-vehicle networks. In IAS. IEEE.
- Nolan et al. (2018) Brent Nolan and others. 2018. Unsupervised Time Series Extraction from Controller Area Network Payloads. In Proc. IEEE CAVS. (to appear).
- Rumelhart et al. (1986) David E. Rumelhart and others. 1986. Learning representations by back-propagating errors. Nature 323 (Oct. 1986). dx.doi.org/10.1038/323533a0
- Song et al. (2016) Hyun Min Song, Ha Rang Kim, and Huy Kang Kim. 2016. Intrusion detection system based on the analysis of time intervals of CAN messages for in-vehicle network. In ICOIN. IEEE.
- Testud (2017) Jean-Christophe Testud. 2017. Detecting ICS Attacks Through Process Variable Analysis. (2017). https://youtu.be/b4lut5uWs2w
- Tomlinson et al. (2018) Andrew Tomlinson and others. 2018. Towards Viable Intrusion Detection Methods For The Automotive Controller Area Network. In 2nd CSCS 2018.
- Tyree et al. (2018) Zachariah Tyree, Robert A Bridges, Frank L Combs, and Michael R Moore. 2018. Exploiting the Shape of CAN Data for In-Vehicle Intrusion Detection. In IEEE CAVS. (to appear), arXiv:1808.10840.
- Verma et al. (2018) Miki E Verma, Robert A Bridges, and Samuel C Hollifield. 2018. ACTT: Automotive CAN Tokenization and Translation. In Proc. IEEE CSCI. (to appear) arXiv:1811.07897.
- Woo et al. (2015) Samuel Woo and others. 2015. A practical wireless attack on the connected car and security protocol for in-vehicle CAN. Tran. Intel. Trans. Sys. 16, 2 (2015).