Towards Robust Neural Networks via Random Self-ensemble
Abstract
Recent studies have revealed the vulnerability of deep neural networks: a small adversarial perturbation that is imperceptible to humans can easily make a well-trained deep neural network misclassify. This makes it unsafe to apply neural networks in security-critical applications. In this paper, we propose a new defensive algorithm called Random Self-Ensemble (RSE) that combines two important concepts: randomness and ensemble. To protect a targeted model, RSE adds random noise layers to the neural network to prevent state-of-the-art gradient-based attacks, and ensembles the predictions over random noises to stabilize the performance. We show that our algorithm is equivalent to ensembling an infinite number of noisy models without any additional memory overhead, and that the proposed training procedure based on noisy stochastic gradient descent ensures the ensemble model has good predictive capability. Our algorithm significantly outperforms previous defense techniques on real datasets. For instance, on CIFAR-10 with a VGG network (which has 92% accuracy without any attack), under the state-of-the-art C&W attack within a certain distortion tolerance, the accuracy of the unprotected model drops to less than 10% and the best previous defense technique has 48% accuracy, while our method still has 86% prediction accuracy under the same level of attack. Finally, our method is simple and easy to integrate into any neural network.
1 Introduction
Deep neural networks have demonstrated their success in many machine learning and computer vision applications, including image classification [9, 5], object recognition [21] and image captioning [25]. Despite their near-perfect prediction performance, recent studies have revealed the vulnerability of deep neural networks to adversarial examples: given a correctly classified image, a carefully designed perturbation to the image can make a well-trained neural network misclassify. Algorithms crafting these adversarial images, called attack algorithms, are designed to minimize the perturbation, thus making adversarial images hard to distinguish from natural images. This leads to security concerns, especially when applying deep neural networks to security-sensitive systems such as self-driving cars and medical imaging.
To make deep neural networks more robust to adversarial attacks, several defensive algorithms have been proposed recently [16, 27, 12, 11, 26]. However, several recent studies showed that these defensive algorithms can only marginally improve the accuracy under adversarial attacks [1, 2].
In this paper, we propose a new defensive algorithm: Random Self-Ensemble (RSE). More specifically, we introduce a new “noise layer” that fuses the input vector with randomly generated noise, and then add this layer before each convolution layer of a deep network. In the training phase, the gradient is still computed by back-propagation but is perturbed by random noise when passing through the noise layer. In the inference phase, we perform several forward propagations, each time obtaining different prediction scores because of the noise layers, and then ensemble the results. We show that RSE makes the network resistant to adversarial attacks, while the proposed training and testing scheme only slightly affects test accuracy. The algorithm is easy to implement and can be applied to any deep neural network to improve its robustness against adversarial attacks.
Intuitively, RSE works well because of two important concepts: ensemble and randomness. It is known that an ensemble of several trained models can improve robustness [20], but it also multiplies the model size. In comparison, without any additional memory requirement, RSE can construct an infinite number of models $f_\epsilon$, where $\epsilon$ is generated randomly, and ensemble the results to improve robustness. But how do we guarantee that the ensemble of these models achieves good accuracy? Indeed, if we train the original model without noise and only add noise layers in the inference phase, the algorithm performs poorly. This suggests that adding random noise to an existing well-trained network significantly degrades the performance. Instead, we show that if the noise layer is taken into account in the training phase, the training procedure can be viewed as minimizing an upper bound of the loss of the ensemble model, and thus our algorithm can achieve good prediction accuracy.
The contributions of this paper are summarized below:

We propose a Random Self-Ensemble (RSE) approach for improving the robustness of deep neural networks. The main idea is to add a “noise layer” before each convolution layer in both the training and prediction phases. The algorithm is equivalent to ensembling an infinite number of random models to defend against attackers.

We explain why RSE can significantly improve the robustness toward adversarial attacks and show that adding noise layers is equivalent to training the original network with an extra regularization of the Lipschitz constant.

RSE significantly outperforms existing defensive algorithms in all our experiments. For example, on CIFAR-10 with a VGG network (which has 92% accuracy without any attack), under the C&W attack the accuracy of the unprotected model drops to less than 10%; the best previous defense technique has 48% accuracy; while RSE still has 86% prediction accuracy under the same level of attack. Moreover, RSE is easy to implement and can be combined with any neural network.
2 Related Work
Security of deep neural networks has been studied recently. Let us denote the neural network as $f(w, x)$, where $w$ is the model parameters (weights) and $x$ is the input image. Given a correctly-classified image $x_0$ (with $f(w, x_0) = y_0$), an attacking algorithm will try to find a slightly perturbed image $x'$ such that (1) the neural network misclassifies this perturbed image; and (2) the distortion $\|x' - x_0\|$ is small, so that the perturbation is hard for humans to detect. A defensive algorithm is designed to improve the robustness of neural networks against attackers, usually by slightly changing the loss function or training procedure. In the following, we summarize recent works along this line.
2.1 White-box attack
In the white-box setting, attackers have full information about the targeted neural network, including the network structure and network weights (denoted by $w$). Using this information, attackers can compute the gradient with respect to the input data by back-propagation. Note that the gradient is very informative for attackers since it characterizes the sensitivity of the prediction with respect to the input image.
To craft an adversarial example, [7] proposed the fast gradient sign method (FGSM), where the adversarial example is constructed by
$$x' = x_0 + \epsilon \cdot \mathrm{sign}\big(\nabla_{x_0} \ell(f(w, x_0), y_0)\big)$$
with some small $\epsilon > 0$, where $\ell$ denotes the classification loss. In fact, FGSM can be viewed as a single gradient step, and several works have tried to improve over it, including rand FGSM [23] and I-FGSM [12]. Recently, Carlini & Wagner [2] showed that constructing an adversarial example can be formulated as solving the following optimization problem:
$$\arg\min_{x'} \; c \cdot g(x') + \|x' - x_0\|_2^2, \tag{1}$$
where the first term is the loss function that characterizes the success of the attack and the second term enforces a small distortion. The parameter $c > 0$ is used to balance these two terms. Several variants have been proposed recently [3, 14], but most of them follow a similar framework. The C&W attack has been recognized as the most successful attacking algorithm.
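As a concrete illustration of the FGSM update described earlier in this section, the following is a minimal sketch on a toy binary linear classifier with logistic loss. The model, the helper names (`score`, `loss_grad_x`, `fgsm`) and the numbers are assumptions made for this example only; they are not the networks used in the paper.

```python
import math

def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def score(w, x):
    """Linear score <w, x>; the toy classifier predicts +1 if it is positive."""
    return sum(wi * xi for wi, xi in zip(w, x))

def loss_grad_x(w, x, y):
    """Gradient w.r.t. x of the logistic loss log(1 + exp(-y * <w, x>)), y in {-1, +1}."""
    margin = y * score(w, x)
    coeff = -y / (1.0 + math.exp(margin))  # derivative of the loss w.r.t. <w, x>
    return [coeff * wi for wi in w]

def fgsm(w, x0, y0, eps):
    """One FGSM step: move every coordinate by eps in the direction that increases the loss."""
    grad = loss_grad_x(w, x0, y0)
    return [xi + eps * sign(gi) for xi, gi in zip(x0, grad)]

# Hypothetical weights and input, chosen so that the clean input is classified as +1.
w = [2.0, -1.0, 0.5]
x0 = [0.5, -0.5, 1.0]
x_adv = fgsm(w, x0, 1, 0.8)
print(score(w, x0), score(w, x_adv))
```

For a linear model the step lowers the score by exactly `eps` times the L1 norm of `w`, which is why a perturbation bounded by `eps` per coordinate is enough to flip the prediction here.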
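For untargeted attack, where the goal is to find an adversarial example that is close to the original example but yields a different class prediction, the loss function in (1) can be defined as
$$g(x') = \max\big\{ Z(x')_t - \max_{i \neq t} Z(x')_i,\; -\kappa \big\},$$
where $t$ is the label predicted for the original image, $Z(x')$ is the network’s output before the softmax layer (the logits), and $\kappa \geq 0$ is a confidence margin.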
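The objective (1) with an untargeted loss can be sketched in a few lines. This sketch assumes the standard C&W hinge form $g(x') = \max\{Z(x')_t - \max_{i \neq t} Z(x')_i, -\kappa\}$ with a confidence margin $\kappa$; the function names and the way logits are passed in are choices made for this example.

```python
def cw_untargeted_loss(logits, t, kappa=0.0):
    """Hinge loss that becomes non-positive once some class other than t has the largest logit."""
    best_other = max(z for i, z in enumerate(logits) if i != t)
    return max(logits[t] - best_other, -kappa)

def cw_objective(logits, x_adv, x0, t, c, kappa=0.0):
    """c * g(x') plus the squared L2 distortion between x' and x0."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x_adv, x0))
    return c * cw_untargeted_loss(logits, t, kappa) + dist2

# The logits would come from the attacked network evaluated at x_adv (hypothetical values here).
print(cw_untargeted_loss([3.0, 1.0, 0.5], t=0))  # class 0 still wins: loss is positive
print(cw_untargeted_loss([1.0, 3.0, 0.5], t=0))  # class 1 wins: attack objective satisfied
```

A larger `c` puts more weight on fooling the classifier and less on keeping the distortion small, which is why `c` serves as the attack-strength knob in the experiments.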
For targeted attack, the loss function can be defined to force the classifier to predict the target label. For attackers, targeted attack is strictly harder than untargeted attack (since success of a targeted attack implies success of an untargeted attack). On the other hand, for defenders, untargeted attacks are strictly harder to defend against than targeted attacks. Therefore, we focus on defending against untargeted attacks in our experiments.
2.2 Black-box attack
The white-box setting is often impractical because real-world systems usually do not release their internal states. Therefore several recent papers focus on the black-box setting [17, 4]. In the black-box setting, the only thing attackers can do is to make queries to the targeted neural network and get the corresponding output. In this setting, a common approach is to train a “substitute model” [16] based on many input/output pairs and then attack this substitute model instead of the real one. This is based on the idea of transferability of adversarial examples [13]. However, this approach has a very high failure rate since the substitute model can be totally different from the targeted network.
Recently, [4] proposed a black-box attack algorithm called ZOO. The main idea is to solve the same objective function (1) as the C&W attack using zeroth-order optimization. To solve (1), the C&W attack needs to compute the gradient and apply gradient descent. However, in the black-box setting the gradient cannot be computed by back-propagation, so ZOO [4] estimates the gradient of the objective $h(x') = c \cdot g(x') + \|x' - x_0\|_2^2$ by
$$\frac{\partial h(x')}{\partial x'_i} \approx \frac{h(x' + \delta e_i) - h(x' - \delta e_i)}{2\delta} \tag{2}$$
for all $i$, where $\delta$ is a small number that controls the estimation accuracy and $e_i$ is the $i$-th indicator vector. In fact, for a sufficiently small $\delta$, ZOO can find a solution of similar quality to C&W’s white-box attack (see [4]).
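The coordinate-wise estimate in (2) is a symmetric finite difference. A minimal sketch follows, in which `objective` stands for any black-box function the attacker can only query; the name `zoo_gradient` and the quadratic test function are assumptions for illustration.

```python
def zoo_gradient(objective, x, delta=1e-4):
    """Estimate the gradient of a black-box objective with two queries per coordinate."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_plus[i] += delta   # x + delta * e_i
        x_minus = list(x)
        x_minus[i] -= delta  # x - delta * e_i
        grad.append((objective(x_plus) - objective(x_minus)) / (2 * delta))
    return grad

# Deterministic toy objective with known gradient 2x + (3, 0, 0).
f = lambda x: sum(v * v for v in x) + 3.0 * x[0]
print(zoo_gradient(f, [1.0, -2.0, 0.5]))
```

The estimate relies on both queries being evaluated by the same deterministic function; each coordinate costs two queries, so a full gradient needs twice as many queries as there are input dimensions.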
Note that our proposed method can perfectly prevent the ZOO attack. Since our algorithm adds randomness into the neural network, it never evaluates the objective function with the same realization of the network twice, which makes it infeasible for ZOO to use (2) to estimate the gradient. The details will be discussed in Section 3.
2.3 Defensive Algorithms
Because of the vulnerability to adversarial examples [22], several methods have been proposed to strengthen the network’s ability to defend against adversarial examples. [18] proposed defensive distillation, which uses a softmax layer modified by a temperature to train a teacher network, and then uses the prediction probabilities (soft labels) from the teacher network to train a student network that has the same structure as the teacher network. However, as stated in [2], this method does not work against the C&W attack. [27] showed that by using a modified ReLU activation layer (BReLU) and adding noise to the original images to augment the training dataset, the learned network gains some ability to defend against adversarial attacks. Another popular defense approach is adversarial training [12, 11]. It generates adversarial examples with a certain attack algorithm and adds them to the training set, which helps the network learn how to distinguish adversarial examples. Combining adversarial training with enlarged model capacity, [14] is able to create an MNIST model that is robust to first-order attacks, but this approach does not work well on larger datasets like CIFAR-10. In addition to changing the network structure, there are other methods [26, 15, 6, 8] that “detect” the adversarial examples, which are not in the scope of our paper.
3 Proposed Algorithm: Random Self-Ensemble
In this section, we propose our self-ensemble algorithm to improve the robustness of neural networks. We will first motivate and introduce our algorithm and then discuss several theoretical reasons behind it.
It is known that an ensemble of several different models can improve robustness. However, an ensemble of a finite number of models is not very practical because it multiplies the model size. For example, an AlexNet model on ImageNet requires 240MB of storage, and storing 100 of them requires 24GB of memory. Moreover, it is hard to find many heterogeneous models with similar accuracy. To improve the robustness of practical systems, we propose the following self-ensemble algorithm that can generate an infinite number of models on-the-fly without any additional memory cost.
Our main idea is to add randomness into the network structure. More specifically, we introduce a new “noise layer” that fuses the input vector with randomly generated noise, i.e. $x \to x + \epsilon$ when passing through the noise layer. Then we add this layer before each convolution layer as shown in Figure 1. Since most attacks require computing or estimating the gradient, the noise level in our model controls the success rate of those attacking algorithms. In fact, we can integrate this layer into any other neural network.
If we denote the original neural network as $f(w, x)$, where $w$ is the weights and $x$ is the input image, then with the random noise layers the network can be denoted as $f_\epsilon(w, x)$ with random $\epsilon$. Therefore we have an infinite number of models at hand (with different $\epsilon$) without any memory overhead. However, adding randomness also affects the prediction accuracy of the model. How can we make sure that the ensemble of these random models has enough accuracy?
A critical observation is that we need to add this random layer in both the training and testing phases. The training and testing algorithms are listed in Algorithm 1. In the training phase, the gradient with respect to $w$ is computed on $f_\epsilon(w, x_i)$, which includes the noise layers, and the noise is generated randomly for each stochastic gradient descent update. In the testing phase, we construct $n$ random noises and ensemble their probability outputs by
$$p = \sum_{j=1}^{n} f_{\epsilon_j}(w, x), \qquad \hat{y} = \arg\max_k p_k.$$
If we do not care about the prediction time, $n$ can be very large, but in practice we found the performance is quite stable once $n$ exceeds 10 (see Figure 5).
This approach is different from the Gaussian data augmentation in [27]: they only add Gaussian noise to images at training time, while we add noise before each convolution layer at both training and inference time. During training, the noise helps the optimization algorithm find stable convolution filters that are robust to perturbed input. During testing, the role of the noise is twofold: first, it perturbs the gradient to fool gradient-based attacks; second, it gives different outputs over multiple forward passes, so that a simple ensemble method can improve the testing accuracy.
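The testing procedure described above can be sketched as follows. This is a toy stand-in under stated assumptions: the "network" is a single linear layer followed by softmax with one noise layer on its input, whereas RSE places a noise layer before each convolution layer; the names `noisy_forward` and `rse_predict` are hypothetical.

```python
import math
import random

def softmax(z):
    """Numerically stable softmax of a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def noisy_forward(W, x, sigma, rng):
    """One forward pass f_eps(w, x): noise layer (x + eps), then linear layer and softmax."""
    x_noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
    logits = [sum(wij * xj for wij, xj in zip(row, x_noisy)) for row in W]
    return softmax(logits)

def rse_predict(W, x, sigma, n, rng):
    """Sum the probability outputs of n noisy passes; return the argmax and the summed vector."""
    p = [0.0] * len(W)
    for _ in range(n):
        p = [a + b for a, b in zip(p, noisy_forward(W, x, sigma, rng))]
    return max(range(len(p)), key=p.__getitem__), p

rng = random.Random(0)
W = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 2-class weights
label, p = rse_predict(W, [1.0, 0.0], sigma=0.1, n=10, rng=rng)
print(label, p)
```

Only one set of weights `W` is stored; every call draws fresh noise, so each pass behaves like a different member of the ensemble at no extra memory cost.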
3.1 Mathematical explanations
Training and testing of RSE
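Here we explain our training and testing procedure. In the training phase, our algorithm solves the following optimization problem:
$$w^* = \arg\min_w \; \mathbb{E}_{(x, y) \sim \mathcal{D}_{\text{data}}} \, \mathbb{E}_{\epsilon \sim \mathcal{N}(0, \sigma^2)} \, \ell\big(f_\epsilon(w, x), y\big), \tag{3}$$
where $\ell(\cdot, \cdot)$ is the loss function and $\mathcal{D}_{\text{data}}$ is the data distribution. Note that for simplicity we assume $\epsilon$ follows a zero-mean Gaussian $\mathcal{N}(0, \sigma^2)$, but in general our algorithm can work for any noise distribution.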
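Problem (3) is solved with stochastic gradient descent in which a fresh noise sample is drawn for every update. The sketch below illustrates this on a toy binary logistic model with a single noise layer on the input; the model, the data and the names `noisy_sgd_step` and `train` are assumptions for this example, not the paper's training code.

```python
import math
import random

def noisy_sgd_step(w, x, y, sigma, lr, rng):
    """One SGD update on the logistic loss of the noisy model, with y in {-1, +1}."""
    x_noisy = [xi + rng.gauss(0.0, sigma) for xi in x]   # noise layer: fresh eps per update
    margin = y * sum(wi * xi for wi, xi in zip(w, x_noisy))
    coeff = -y / (1.0 + math.exp(margin))                # d loss / d <w, x_noisy>
    return [wi - lr * coeff * xi for wi, xi in zip(w, x_noisy)]

def train(data, sigma, lr, steps, seed=0):
    """Cycle through the data, resampling the noise at every step."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    for step in range(steps):
        x, y = data[step % len(data)]
        w = noisy_sgd_step(w, x, y, sigma, lr, rng)
    return w

# Hypothetical, linearly separable toy data.
data = [([1.0, 0.2], 1), ([-1.0, -0.1], -1), ([0.8, -0.3], 1), ([-0.9, 0.4], -1)]
w = train(data, sigma=0.1, lr=0.5, steps=200)
print(w)
```

Because the gradient is taken through the noisy input, each update is an unbiased sample of the gradient of the expected loss in (3), so the weights are fitted to perform well on average over the noise.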
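At testing time, we ensemble the outputs through several forward propagations, specifically:
$$\hat{y} = \arg\max \, \mathbb{E}_{\epsilon \sim \mathcal{N}(0, \sigma^2)} \, f_\epsilon(w, x), \tag{4}$$
where $\arg\max$ means the index of the maximum element in a vector. The reason that our RSE algorithm achieves prediction accuracy similar to the original network is that (3) minimizes an upper bound of the loss of (4). If we choose the negative log-likelihood loss, then $\ell(f_\epsilon(w, x), y) = -\log f_\epsilon(w, x)[y]$, where $f_\epsilon(w, x)[y]$ is the probability assigned to class $y$, and, with expectations taken over the data $(x, y)$ and the noise $\epsilon$:
$$\mathbb{E}_{(x,y)} \mathbb{E}_{\epsilon} \big[ -\log f_\epsilon(w, x)[y] \big] \overset{(a)}{\geq} \mathbb{E}_{(x,y)} \big[ -\log \mathbb{E}_{\epsilon} f_\epsilon(w, x)[y] \big] \overset{(b)}{\geq} \mathbb{E}_{(x,y)} \big[ -\log \mathbb{E}_{\epsilon} f_\epsilon(w, x)[\hat{y}] \big]. \tag{5}$$
Here (a) comes from Jensen’s inequality and (b) is by the inference rule (4), since $\hat{y}$ maximizes $\mathbb{E}_\epsilon f_\epsilon(w, x)$. So by minimizing (3) we are actually minimizing an upper bound of the inference loss $\mathbb{E}_{(x,y)} \big[ -\log \mathbb{E}_\epsilon f_\epsilon(w, x)[\hat{y}] \big]$, which validates our ensemble inference procedure.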
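The two inequalities can be checked numerically for a single sample. In this sketch, each inner list is the probability vector produced by one noisy forward pass; the function names and the numbers are made up for illustration.

```python
import math

def ensemble_mean(prob_vectors):
    """Average the probability vectors of several noisy passes (Monte Carlo estimate of E_eps f_eps)."""
    n = len(prob_vectors)
    return [sum(col) / n for col in zip(*prob_vectors)]

def bound_chain(prob_vectors, y):
    """Return the three quantities compared in the bound for one sample with true label y."""
    mean = ensemble_mean(prob_vectors)
    y_hat = max(range(len(mean)), key=mean.__getitem__)      # ensemble prediction
    training_loss = sum(-math.log(p[y]) for p in prob_vectors) / len(prob_vectors)
    loss_true = -math.log(mean[y])       # after Jensen's inequality, step (a)
    loss_pred = -math.log(mean[y_hat])   # after using the argmax rule, step (b)
    return training_loss, loss_true, loss_pred

probs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]  # hypothetical outputs of three noisy passes
print(bound_chain(probs, y=0))
print(bound_chain(probs, y=1))
```

Step (a) holds because the negative logarithm is convex, and step (b) holds because the ensemble prediction has, by construction, the largest averaged probability.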
RSE is equivalent to Lipschitz regularization
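Another point of view is that perturbed training is equivalent to Lipschitz regularization, which further helps defend against gradient-based attacks. If we fix the output label $y$, then the loss function $\ell(f_\epsilon(w, x), y)$ can be simply denoted as $\ell \circ f_\epsilon(w, x)$. The Lipschitz constant of the function $\ell \circ f_\epsilon$ is a constant $L_{\ell \circ f_\epsilon}$ such that
$$|\ell \circ f_\epsilon(w, x) - \ell \circ f_\epsilon(w, \tilde{x})| \leq L_{\ell \circ f_\epsilon} \|x - \tilde{x}\| \tag{6}$$
for all $x, \tilde{x}$. In fact, it has been proved recently that the Lipschitz constant can be used to measure the robustness of a machine learning model [10]. If $L_{\ell \circ f_\epsilon}$ is large enough, even a tiny change of the input can significantly change the loss and eventually lead to an incorrect prediction. On the contrary, by controlling $L_{\ell \circ f_\epsilon}$ to be small, we obtain a more robust network.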
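Next we show that our noisy network indeed controls the Lipschitz constant. Following the notation of (3), we can see that
$$\mathbb{E}_\epsilon \, \ell \circ f_\epsilon(w, x) \overset{(a)}{\approx} \mathbb{E}_\epsilon \Big[ \ell \circ f_0(w, x) + \epsilon^\top \nabla_\epsilon \ell \circ f_0(w, x) + \tfrac{1}{2} \epsilon^\top \nabla_\epsilon^2 \ell \circ f_0(w, x) \, \epsilon \Big] \overset{(b)}{=} \ell \circ f_0(w, x) + \frac{\sigma^2}{2} \mathrm{Tr}\big\{ \nabla_\epsilon^2 \ell \circ f_0(w, x) \big\}. \tag{7}$$
For (a), we do a Taylor expansion at $\epsilon = 0$, where $f_0$ denotes the network without noise. Since we set the variance of the noise $\sigma^2$ very small, we only keep terms up to second order. For (b), we notice that the entries of the Gaussian vector $\epsilon$ are i.i.d. with zero mean, so the linear term in $\epsilon$ has zero expectation, and the quadratic term depends directly on the variance of the noise and the trace of the Hessian. Bounding this Hessian term by a norm inequality in terms of the Lipschitz constant, we can rewrite (7) as
$$\mathbb{E}_\epsilon \, \ell \circ f_\epsilon(w, x) \simeq \ell \circ f_0(w, x) + \frac{\sigma^2}{2} L_{\ell \circ f_0}, \tag{8}$$
which means the training of noisy networks is equivalent to training the original model with an extra regularization of the Lipschitz constant, and by controlling the variance of the noise we can balance the robustness of the network against the training loss.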
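The expectation of the linear and quadratic Taylor terms can be spelled out explicitly. In the short derivation below, $g$ and $H$ are shorthand introduced here for the gradient and Hessian of the loss with respect to the noise at $\epsilon = 0$, and the entries of $\epsilon$ are assumed i.i.d. $\mathcal{N}(0, \sigma^2)$, so that $\mathbb{E}[\epsilon_i \epsilon_j] = \sigma^2$ if $i = j$ and $0$ otherwise.

```latex
\begin{aligned}
\mathbb{E}_{\epsilon}\big[\epsilon^\top g\big]
  &= g^\top \mathbb{E}[\epsilon] = 0, \\
\mathbb{E}_{\epsilon}\big[\epsilon^\top H \epsilon\big]
  &= \sum_{i,j} H_{ij}\, \mathbb{E}[\epsilon_i \epsilon_j]
   = \sigma^2 \sum_{i} H_{ii}
   = \sigma^2\, \mathrm{Tr}(H).
\end{aligned}
```

Halving the second line gives the $\frac{\sigma^2}{2}$ factor in front of the Hessian term, so a larger noise variance places more weight on the curvature penalty.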
3.2 Discussions
Here we show that both randomness and ensemble are important in our algorithm. Indeed, if we remove either component, the performance drops significantly, and some naive ways of adding random noise and ensembling do not work.
First, as mentioned before, the main idea of our model is to have an infinite number of models $f_\epsilon$, each with a different $\epsilon$ value, and then ensemble the results. A naive way to achieve this goal is to fix a pre-trained model $f_0$ and then generate many $f_\epsilon$ in the testing phase by adding different small noise to $f_0$. However, Figure 2 shows this approach (denoted as “Test noise only”) results in much worse performance (20% without any attack). Therefore it is nontrivial to guarantee that the model remains good after adding small random noise. In our random self-ensemble algorithm, in addition to adding noise in the testing phase, we also add the noise layers in the training phase, and this is important for getting good performance.
Second, we found that adding noise in the testing phase and then ensembling the predictions is important. In Figure 2, we compare the performance of RSE with the version that only adds the noise layer in the training phase but not in the testing phase (so the prediction is $f_0(w, x)$ instead of $\mathbb{E}_\epsilon f_\epsilon(w, x)$). The results clearly show that the performance drops under smaller attacks. This shows that the ensemble in the testing phase is important.
3.3 Resistance against black-box attack (ZOO)
As discussed in Section 2.2, ZOO [4] is the most successful attack algorithm in the black-box setting and outperforms transfer attacks by a significant amount (see [4]). Interestingly, the accuracy of the ZOO attack is theoretically controlled by the noise added in our noise layer. Recall that ZOO crafts the adversarial example by solving an optimization problem similar to (1) using zeroth-order optimization, where the gradient is estimated by finite differences. However, with RSE, the gradient estimator computed by ZOO becomes
$$\hat{g}_i = \frac{h_{\epsilon}(x' + \delta e_i) - h_{\epsilon'}(x' - \delta e_i)}{2\delta},$$
where $h_\epsilon$ denotes the attack objective evaluated on the noisy network $f_\epsilon$, and $\epsilon$, $\epsilon'$ are the independent noise draws of the two queries. This is no longer an estimator of the gradient, even when $\delta \to 0$, because $\epsilon \neq \epsilon'$. Therefore, ZOO will not even converge.
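The breakdown of the finite-difference estimator under randomness can be reproduced on a one-dimensional toy problem. In this sketch the randomized objective `(x + eps) ** 2` is a made-up stand-in for a noisy network, and all names are hypothetical; the point is only that two queries with independent noise do not cancel.

```python
import random

def noisy_objective(x, sigma, rng):
    """Toy randomized objective: a fresh eps is drawn on every query."""
    eps = rng.gauss(0.0, sigma)
    return (x + eps) ** 2

def zoo_estimate(x, delta, sigma, rng):
    """Symmetric difference built from two independent queries, as a black-box attacker would do."""
    up = noisy_objective(x + delta, sigma, rng)
    down = noisy_objective(x - delta, sigma, rng)
    return (up - down) / (2 * delta)

def mean_abs_error(x, delta, sigma, trials, seed=0):
    """Average absolute error of the estimate against the noise-free gradient 2x."""
    rng = random.Random(seed)
    true_grad = 2 * x
    return sum(abs(zoo_estimate(x, delta, sigma, rng) - true_grad) for _ in range(trials)) / trials

print(mean_abs_error(1.0, 1e-4, 0.0, 100))   # no noise: essentially exact
print(mean_abs_error(1.0, 1e-2, 0.1, 200))   # with noise: large error
print(mean_abs_error(1.0, 1e-4, 0.1, 200))   # smaller delta: error grows further
```

With noise, the numerator contains the difference of two independent random evaluations, which does not vanish as `delta` shrinks, so dividing by `2 * delta` amplifies the error instead of reducing it.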
4 Experiments
Datasets and network structure
We test our method on two datasets: CIFAR-10 and STL-10. We do not compare the results on MNIST since it is a much easier dataset and existing defense methods such as [16, 27, 12, 11] can effectively increase image distortion under adversarial attacks. On CIFAR-10, we evaluate the performance on both VGG16 [19] and ResNeXt [24]; on STL-10 we copy and slightly modify a simple model (publicly available at https://github.com/aaron-xichen/pytorch-playground), which we call “Model A”.
Defensive algorithms
We include the following defensive algorithms in the comparison (their parameter settings can be found in Table 1):

Random Self-Ensemble (RSE): our proposed method.

Defensive distillation [18]: first train a teacher network at temperature $T$, then use the teacher network to train a student network of the same architecture at the same temperature. The student network is called the distilled network.

Robust optimization combined with BReLU activation [27]: first we replace all ReLU activations with BReLU activations. Then, in the training phase, we randomly perturb the training data by Gaussian noise with the standard deviation suggested in [27].
Attack models
Although many attacking methods have been discussed in Section 2, they differ greatly in attack power. White-box attacks have more information about the targeted model and therefore a higher success rate, which makes them the more demanding test for defense models. We thus choose the C&W attack [2] as a representative, since it is a powerful white-box attack, although it requires more computation. Moreover, we test our algorithm under untargeted attack, since untargeted attacks are strictly harder to defend against than targeted attacks. In fact, the C&W untargeted attack is the most challenging attack for a defensive algorithm, and the experiments in [2] suggest that the C&W attack should be the benchmark for defensive methods.
Measure
Unlike attacking models that only need to operate on correctly classified images, a competitive defense model must not only protect the model when attackers exist, but also keep good performance on clean datasets. With this in mind, we compare the accuracy of guarded models under different strengths of C&W attack. The strength is measured by the $\ell_2$-norm of the image distortion and is further controlled by the parameter $c$ in (1). Note that an adversarial image is correctly predicted under C&W attack if and only if the original image is correctly classified and the C&W attack cannot find an adversarial example within a certain distortion level.
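The accuracy criterion in the last sentence can be made concrete with a small helper. The inputs below are hypothetical bookkeeping assumed for this sketch: one flag per image saying whether the clean image is classified correctly, and the smallest distortion at which the attack succeeded (or `None` if it never succeeded).

```python
def accuracy_under_attack(clean_correct, min_adv_distortion, budget):
    """Fraction of images that are correct on clean input AND not attackable within the budget."""
    survived = 0
    for correct, dist in zip(clean_correct, min_adv_distortion):
        # The attack fails on this image if it found nothing, or only beyond the allowed distortion.
        if correct and (dist is None or dist > budget):
            survived += 1
    return survived / len(clean_correct)

clean_correct = [True, True, False, True]
min_adv_distortion = [0.5, 2.0, 0.1, None]
print(accuracy_under_attack(clean_correct, min_adv_distortion, budget=1.0))  # 0.5
```

Misclassified clean images count against the defense regardless of the attack, so a defense that sacrifices clean accuracy is penalized by this measure as well.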
Table 1: Parameter settings of the defensive algorithms.

Methods                       | Settings
------------------------------|--------------------------------------------------
No defense                    | Baseline model
RSE (for CIFAR-10 + VGG16)    | Initial noise: 0.4, inner noise: 0.1, 50-ensemble
RSE (for CIFAR-10 + ResNeXt)  | Initial noise: 0.1, inner noise: 0.1, 50-ensemble
RSE (for STL-10 + Model A)    | Initial noise: 0.4, inner noise: 0.1, 50-ensemble
Defensive distill             | Temperature = 40
Adversarial retraining        | FGSM adversarial examples
Robust Opt. + BReLU           | Following [27]
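
Table 2: Prediction accuracy of the defensive algorithms under C&W attack.

RSE (ours)         | 90.00% | 86.06% | 79.44% | 67.19% | 34.75%
Adv retraining     | 27.00% |  9.81% |  4.13% |  3.69% |  1.44%
Robust Opt+BReLU   | 75.06% | 47.93% | 30.94% | 20.69% | 13.50%
Distill            | 49.88% | 17.69% |  4.56% |  3.13% |  1.44%
No defense         | 30.38% |  8.93% |  5.06% |  3.56% |  2.19%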
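
Table 3: Image distortion under targeted attack, by target class.

Method               | bird  | car  | cat   | deer  | dog   | frog  | horse | plane | truck
---------------------|-------|------|-------|-------|-------|-------|-------|-------|------
No defense           |  1.94 | 0.31 |  0.74 |  4.72 |  7.99 |  3.66 |  9.22 |  0.75 |  1.32
Defensive distill    |  6.55 | 0.70 | 13.78 |  2.54 | 13.90 |  2.56 | 11.36 |  0.66 |  3.54
Adv. retraining      |  2.58 | 0.31 |  0.75 |  6.08 |  0.75 |  9.01 |  6.06 |  0.31 |  4.08
Robust Opt. + BReLU  | 17.11 | 1.02 |  4.07 | 13.50 |  7.09 | 15.34 |  7.15 |  2.08 | 17.57
RSE (ours)           | 12.87 | 2.61 | 12.47 | 21.47 | 31.90 | 19.09 |  9.45 | 10.21 | 22.15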
4.1 The effect of noise level
We first test the performance of RSE under different noise levels. We use Gaussian noise for all the noise layers in our network, and the standard deviation of the Gaussian controls the noise level. Note that we call the noise layer before the first convolution layer the “init-noise”, and all other noise layers the “inner-noise”.
In this experiment, we apply different noise levels in both the training and testing phases to see how different variances change the robustness as well as the generalization ability of networks. As an example, we use VGG16+CIFAR-10. The result is shown in Figure 3.
As we can see, both “init-noise” and “inner-noise” are beneficial to the robustness of the neural network, but at the same time, higher noise reduces the accuracy under weak attacks. Based on Figure 3, we choose noise levels suited to the range of the input images, and fix them for all the experiments.
4.2 Self-ensemble
Next we show that self-ensemble helps to improve the test accuracy of our noisy model. As an example, we choose the VGG16+CIFAR-10 combination with fixed standard deviations for the initial noise layer and for the other noise layers. We compare a 50-ensemble with a 1-ensemble (i.e. a single model), and the result can be found in Figure 4.
We find that the 50-ensemble method outperforms the 1-ensemble method in accuracy when the attack is weak. This is because when the attack is weak enough, the majority choice of networks has lower variance and higher accuracy. On the other hand, when the attack is strong, or equivalently when the average distortion is large, the ensemble model is worse. We conjecture that this is because when the attack is strong enough, the majority of random sub-models make wrong predictions, whereas for any individual model the random effect may be superior to the group decision. In this situation, self-ensemble may have a negative effect on accuracy.
Practically, if running time is the primary concern, it is not necessary to evaluate many ensemble models. In fact, we find that the accuracy saturates quickly with respect to the number of models; moreover, if we inject smaller noise, the ensemble effect is weaker and the accuracy saturates earlier. Therefore, we find a 10-ensemble is good enough for testing accuracy (see Figure 5).
4.3 Comparing defense methods
Finally, we compare our RSE method with other existing defensive algorithms. Note that we test all of them using the C&W untargeted attack, which is the most difficult setting for defenders.
The comparison across different datasets and networks can be found in Table 2 and Figure 6. As we can see, previous defense methods have little effect against C&W attacks. For example, Robust Opt+BReLU [27] is useful for CIFAR-10+ResNeXt, but its accuracy is even worse than the no-defense model for STL-10+Model A. In contrast, our RSE method acts as a good defense across all cases. Specifically, RSE forces the attacker to find much more distorted adversarial images in order to mount a successful attack. As shown in Figure 6, at the same average distortion on CIFAR-10+VGG16, the C&W attack achieves a much higher untargeted success rate against the unprotected network than against the network defended by RSE.
Apart from the accuracy under C&W attack, we find that the distortion of adversarial images also increases significantly. This can be seen in Figure 2 (2nd row): when $c$ is large enough (so that all defensive algorithms no longer work), our RSE method achieves the largest distortion.
Although all the above experiments concern untargeted attacks, this does not mean targeted attacks are not covered: as noted, a targeted attack is harder for the attacker and easier to defend against. As an example, we test all the defensive algorithms on the CIFAR-10 dataset under targeted attack. We randomly pick an image from CIFAR-10 and plot the perturbation in Figure 7 (the exact numbers are in Table 3); to make it easier to print, we subtract the RGB channels from 255. One can easily see that RSE makes the adversarial images more distorted.
5 Conclusion
In this paper, we propose a new defensive algorithm called Random Self-Ensemble (RSE) to improve the robustness of deep neural networks against adversarial attacks. We show that our algorithm is equivalent to ensembling a huge number of noisy models together, and that our proposed training process ensures that the ensemble model generalizes well. We further show that the algorithm is equivalent to adding a Lipschitz regularization and thus can improve the robustness of neural networks. Experimental results demonstrate that our method is very robust against state-of-the-art white-box attacks. Moreover, our method is simple, easy to implement, and can be easily embedded into an existing network.
References
 [1] N. Carlini and D. Wagner. Adversarial examples are not easily detected: Bypassing ten detection methods. arXiv preprint arXiv:1705.07263, 2017.
 [2] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In Security and Privacy (SP), 2017 IEEE Symposium on, pages 39–57. IEEE, 2017.
 [3] P.-Y. Chen, Y. Sharma, H. Zhang, J. Yi, and C.-J. Hsieh. EAD: Elastic-net attacks to deep neural networks via adversarial examples. arXiv preprint arXiv:1709.04114, 2017.
 [4] P.-Y. Chen, H. Zhang, Y. Sharma, J. Yi, and C.-J. Hsieh. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. arXiv preprint arXiv:1708.03999, 2017.
 [5] J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, A. Senior, P. Tucker, K. Yang, Q. V. Le, et al. Large scale distributed deep networks. In Advances in neural information processing systems, pages 1223–1231, 2012.
 [6] R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner. Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410, 2017.
 [7] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
 [8] K. Grosse, P. Manoharan, N. Papernot, M. Backes, and P. McDaniel. On the (statistical) detection of adversarial examples. arXiv preprint arXiv:1702.06280, 2017.
 [9] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
 [10] M. Hein and M. Andriushchenko. Formal guarantees on the robustness of a classifier against adversarial manipulation. arXiv preprint arXiv:1705.08475, 2017.
 [11] R. Huang, B. Xu, D. Schuurmans, and C. Szepesvári. Learning with a strong adversary. arXiv preprint arXiv:1511.03034, 2015.
 [12] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial machine learning at scale. arXiv preprint arXiv:1611.01236, 2016.
 [13] Y. Liu, X. Chen, C. Liu, and D. Song. Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770, 2016.
 [14] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
 [15] D. Meng and H. Chen. MagNet: a two-pronged defense against adversarial examples. arXiv preprint arXiv:1705.09064, 2017.
 [16] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami. Practical black-box attacks against deep learning systems using adversarial examples. arXiv preprint arXiv:1602.02697, 2016.
 [17] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pages 506–519. ACM, 2017.
 [18] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In Security and Privacy (SP), 2016 IEEE Symposium on, pages 582–597. IEEE, 2016.
 [19] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
 [20] T. Strauss, M. Hanselmann, A. Junginger, and H. Ulmer. Ensemble methods as a defense to adversarial perturbations against deep neural networks. arXiv preprint arXiv:1709.03423, 2017.
 [21] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1–9, 2015.
 [22] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
 [23] F. Tramèr, A. Kurakin, N. Papernot, D. Boneh, and P. McDaniel. Ensemble adversarial training: Attacks and defenses. arXiv preprint arXiv:1705.07204, 2017.
 [24] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. arXiv preprint arXiv:1611.05431, 2016.
 [25] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhudinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning, pages 2048–2057, 2015.
 [26] W. Xu, D. Evans, and Y. Qi. Feature squeezing: Detecting adversarial examples in deep neural networks. arXiv preprint arXiv:1704.01155, 2017.
 [27] V. Zantedeschi, M.-I. Nicolae, and A. Rawat. Efficient defenses against adversarial attacks. arXiv preprint arXiv:1707.06728, 2017.