Mobile Security Enhancement against Adversarial Deception

DoPa: A Fast and Comprehensive CNN Defense Methodology against Physical Adversarial Attacks

Zirui Xu, Fuxun Yu, Xiang Chen George Mason University, Fairfax, Virginia
zxu21, fyu2, xchen26@gmu.edu
Abstract.

Recently, Convolutional Neural Networks (CNNs) have demonstrated considerable vulnerability to adversarial attacks: they can be easily misled by adversarial perturbations. With more aggressive methods proposed, adversarial attacks can also be applied in the physical world, causing practical issues for various CNN powered applications. Most existing defense works against physical adversarial attacks only focus on eliminating explicit perturbation patterns from the input, without interpreting or addressing CNN's intrinsic vulnerability. As a result, most of them incur considerable data processing costs and lack versatility against different attacks. In this paper, we propose DoPa – a fast and comprehensive CNN defense methodology against physical adversarial attacks. By interpreting the CNN's vulnerability, we find that non-semantic adversarial perturbations can activate the CNN with significantly abnormal activations and even overwhelm the activations of other, semantic input patterns. We improve the CNN recognition process by adding a self-verification stage that analyzes the semantics of distinguished activation patterns with only one CNN inference involved. Based on the detection result, we further propose a data recovery methodology to defend against physical adversarial attacks. We apply this detection and data recovery methodology to both image and audio CNN recognition processes. Experiments show that our methodology achieves an average 90% attack detection success rate and 81% accuracy recovery for image physical adversarial attacks. The proposed defense method also achieves a 92% detection success rate and 77.5% accuracy recovery for audio recognition applications. Moreover, the proposed defense methods are at most 2.3× faster than state-of-the-art defense methods, making them feasible for resource-constrained platforms, such as mobile devices.

copyright: none

1. Introduction

In the past few years, Convolutional Neural Networks (CNNs) have been widely applied in various cognitive applications, such as image classification (Wang and et al., 2017; Zoph and et al., 2018) and speech recognition (Chorowski and et al., 2015; Chiu and et al., 2018). Although effective and popular, CNN powered applications face a critical challenge – adversarial attacks. By injecting particular perturbations into the input data, adversarial attacks can mislead CNN recognition results. The perturbations generated by traditional adversarial attacks are fragile and can only be added to digital data; therefore, they can hardly threaten recognition systems that obtain input data from the real world. However, with more advanced methods, adversarial perturbations can be concentrated into a small area and easily attached to actual objects, so these enhanced adversarial attacks can be applied in the physical world. Fig. 1 shows a physical adversarial example for traffic sign detection. When a well-crafted adversarial patch is attached to the original stop sign, the traffic sign detection system is misled into recognizing it as a speed limit sign. The problem of such physical adversarial attacks has become more severe as CNN based applications proliferate (James, 2018).

Many works have been proposed to defend against physical adversarial attacks (Hayes, 2018; Naseer and et al., 2019; Yang and et al., 2018; Osadchy and et al., 2017). However, most of them neglect to interpret CNN's intrinsic vulnerability. Instead, they either merely focus on eliminating explicit perturbation patterns from the input (Naseer and et al., 2019; Osadchy and et al., 2017) or simply adopt multiple CNNs to conduct cross-verification (Zeng and et al., 2018; Yang and et al., 2018). All these methods have certain drawbacks: first, they introduce a considerable data processing cost during perturbation elimination; second, they can hardly defend against physical adversarial attacks with model transferability, and thus lack versatility against different physical adversarial attacks.

Figure 1. Physical Adversarial Attack for Traffic Sign

In this paper, we propose DoPa, a fast and comprehensive defense methodology against physical adversarial attacks. By interpreting CNN's vulnerability, we reveal that the CNN decision-making process lacks the necessary ability to qualitatively distinguish semantics: non-semantic input patterns can significantly activate the CNN and overwhelm other, semantic input patterns. We improve the CNN recognition process by adding a self-verification stage that analyzes the semantics of distinguished activation patterns with only one CNN inference involved. Fig. 1 illustrates the self-verification stage for a traffic sign adversarial attack. For each input image, after one forward pass, the verification stage locates the significant activation sources (shown in the green circle) and calculates the input semantic inconsistency with the expected semantic patterns (shown in the right circle) according to the prediction result. Once the inconsistency exceeds a pre-defined threshold, the CNN applies a data recovery method to recover the input image. Our defense methodology depends on only one CNN inference with minimal additional computation, and can be extended to both CNN based image and audio recognition applications.

Specifically, we have the following contributions in this work:

  • By interpreting CNN’s vulnerability, we find that the non-semantic input patterns can significantly activate CNN and overwhelm other semantic input patterns.

  • We propose a self-verification stage to analyze and detect the semantics of abnormal activation patterns. Specifically, we measure the inconsistency between the local input patterns that cause the distinguished activations and synthesized patterns with the expected semantics.

  • We adopt two data recovery methods in our defense methodology to recover input data that has been attacked.

  • Based on the detection and data recovery methodology, we propose two defense cases for image and audio applications.

Experiments show that our method achieves an average 90% detection success rate and an average 81% accuracy recovery for image physical adversarial attacks. Our method also achieves a 92% detection success rate and 77.5% accuracy recovery for audio adversarial attacks. Moreover, our methods are at most 2.3× faster than state-of-the-art defense methods, making them feasible for various resource-constrained platforms, such as mobile devices.

2. Background and Related Works

2.1. Physical Adversarial Attacks

Adversarial attacks first aroused researchers' general concern with adversarial examples, which were introduced by (Goodfellow and et al., 2014). Adversarial examples are designed by projecting prediction errors into the input space to generate noise that perturbs digital input data (e.g., images and audio clips) and manipulates prediction results. Since then, various adversarial attacks have been proposed, such as L-BFGS (Szegedy and et al., 2013), FGSM (Goodfellow and et al., 2014), and CW (Carlini and et al., 2017). Most of these attack methods share a similar mechanism: they try to cause the largest error increment in the model activations while regulating the noise within the input space.
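For concreteness, the sketch below shows what this shared mechanism looks like in code for the cited FGSM attack. It is a minimal PyTorch sketch under illustrative assumptions (the model, inputs, labels, and epsilon value are placeholders), not the implementation used by any of the cited works.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, label, epsilon=0.03):
    """One-step FGSM sketch: move the input along the sign of the loss gradient.

    `model`, `x`, and `label` are placeholders for any image classifier,
    input batch, and ground-truth labels; `epsilon` bounds the noise.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)   # prediction error to be increased
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()       # project the error into the input space
    return x_adv.clamp(0.0, 1.0).detach()     # keep the result a valid image
```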

Recently, this attack approach was also brought from the algorithm domain into the physical world, which we refer to as the physical adversarial attack. (Eykholt and et al., 2017) first leveraged a masking method to concentrate the adversarial perturbations into a small area and implemented the attack on real traffic signs with taped graffiti. (Brown and et al., 2017) then extended the scope of physical attacks with adversarial patches. With more aggressive image patterns than taped graffiti, these patches can be attached to physical objects arbitrarily and have a certain degree of model transferability.

Beyond the aforementioned image cases, physical adversarial attacks have also been proposed for audio. Yakura et al. (Yakura and et al., 2018) proposed an audio physical adversarial attack that remains effective after playback and recording in the physical world, generating an adversarial command embedded in a normal song that can be played over the air.

Compared to noise-based adversarial attacks, these physical adversarial attacks reduce the attack difficulty and further impair the practicality and reliability of deep learning technologies.

2.2. Image Physical Adversarial Attack Defense

Several works have been proposed to detect and defend against such physical adversarial attacks in the image recognition process. Naseer et al. proposed a local gradients smoothing scheme against physical adversarial attacks (Naseer and et al., 2019). By regularizing gradients in the estimated noisy region before feeding images into the CNN for inference, their method eliminates the potential impact of adversarial attacks. Hayes et al. proposed an image physical adversarial attack defense method based on image inpainting (Hayes, 2018). Using traditional image processing methods, they detect the location of adversarial noise in the input image and then leverage image inpainting technology to remove it.

Although these methods are effective for defending against image physical adversarial attacks, they still have certain disadvantages regarding computation and versatility. For example, local gradients smoothing requires manipulating every pixel of the input image, which introduces a large computation workload. Moreover, these methods are designed for specific adversarial attacks and are not integrated to handle different physical adversarial attack situations. Therefore, we develop a fast and comprehensive defense methodology to address the above issues.

2.3. Audio Physical Adversarial Attack Defense

Figure 2. Audio Recognition and Physical
Adversarial Attack Process

Compared with images, audio data requires more processing effort for recognition. Fig. 2 shows a typical audio recognition process and the corresponding physical adversarial attack. The audio waveform is first converted into Mel-frequency Cepstral Coefficient (MFCC) features. Then a CNN performs acoustic feature recognition and produces the candidate phonemes. Finally, a lexicon and a language model are applied to obtain the recognition result "open". When adversarial noise is injected into the original input waveform, the final recognition result is misled to "close".
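As a reference for this front-end step, a minimal MFCC extraction could look as follows; the file name, sampling rate, and number of coefficients are illustrative assumptions rather than the exact configuration behind Fig. 2.

```python
import librosa

# Illustrative file name; any short voice-command recording would do.
waveform, sr = librosa.load("command.wav", sr=16000)

# 40 MFCC coefficients per frame, a typical front-end for an acoustic CNN.
mfcc = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=40)
print(mfcc.shape)  # (40, num_frames)
```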

Several works have been proposed to detect and defend against such adversarial attacks (Zeng and et al., 2018; Yang and et al., 2018; Rajaratnam and et al., 2018). Zeng et al. leveraged multiple Automatic Speech Recognition (ASR) systems to detect audio physical adversarial attacks based on a cross-verification methodology (Zeng and et al., 2018). However, their method lacks versatility and cannot detect adversarial attacks with model transferability. Yang et al. proposed an audio adversarial attack detection and defense method that exploits the temporal dependency in audio adversarial attacks (Yang and et al., 2018). However, their method requires multiple CNN recognition inferences, which is time-consuming. Rajaratnam et al. leveraged random noise flooding to defend against audio physical adversarial attacks (Rajaratnam and et al., 2018). Since ASR systems are relatively robust to natural noise while adversarial noise is not, injecting random noise can destroy the functionality of the adversarial noise. However, this method cannot achieve high practical defense performance.

3. CNN Vulnerability Analysis for Physical Adversarial Attacks

In this section, we first interpret the CNN vulnerability by analyzing the input patterns' semantics with activation maximization visualization (Erhan and et al., 2009). Based on the semantics analysis, we identify adversarial attack patches as non-semantic input patterns with abnormally distinguished activations. Specifically, to evaluate the semantics, we propose metrics that measure the inconsistencies between the local input patterns that cause the distinguished activations and synthesized patterns with the expected semantics. Based on the inconsistency analysis, we further propose a defense methodology consisting of self-verification and data recovery.

Figure 3. Visualized Neuron’s Input Pattern by Activation Maximization Visualization

3.1. CNN Vulnerability Interpretation

In a typical image or audio recognition process, a CNN extracts features from the original input data and gradually derives a prediction result. However, when physical adversarial perturbations are injected into the original data, the CNN is misled to a wrong prediction result. To better interpret this vulnerability, we first analyze the CNN using a typical image physical adversarial attack – the adversarial patch attack – as an example. Compared with the original input, an adversarial patch usually has no constraints on color, shape, etc. Therefore, such patches usually sacrifice semantic structure in order to cause significant abnormal activations and overwhelm the other input patterns' activations. We hypothesize that the CNN lacks the ability to qualitatively distinguish semantics and can therefore be activated by the non-semantic adversarial patch during inference.

To verify this assumption, we investigate the semantics of each neuron in the CNN. We adopt a CNN activation visualization method – Activation Maximization Visualization (AM) (Erhan and et al., 2009). AM generates a pattern that visualizes each neuron's most-activated semantic input. The generation process can be considered as synthesizing an input image $X$ for the CNN model that maximizes the activation $a_{i,l}(X)$ of neuron $i$ in layer $l$. Mathematically, this process can be formulated as:

$X \leftarrow X + \eta \cdot \dfrac{\partial a_{i,l}(X)}{\partial X}$    (1)

where $a_{i,l}(X)$ is the activation of neuron $i$ in layer $l$ for an input image $X$, and $\eta$ is the gradient ascent step size.
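A minimal sketch of the gradient-ascent loop behind Eq. (1) is given below, assuming a PyTorch model; the layer handle, neuron index, and step count are placeholders, and no semantic regularization is applied, which corresponds to the unconstrained AM variant discussed next.

```python
import torch

def activation_maximization(model, layer, neuron_idx, steps=200, eta=0.1,
                            input_shape=(1, 3, 224, 224)):
    """Synthesize an input that maximizes one neuron's activation (Eq. 1).

    `layer` is any module inside `model`; `neuron_idx` selects one channel of
    its output. No natural-image regularization is applied here, matching the
    unconstrained AM variant discussed in the text.
    """
    acts = {}
    handle = layer.register_forward_hook(
        lambda m, inp, out: acts.update(value=out))

    x = torch.randn(input_shape, requires_grad=True)
    for _ in range(steps):
        model(x)
        a = acts["value"][0, neuron_idx].mean()   # a_{i,l}(X)
        model.zero_grad()
        if x.grad is not None:
            x.grad.zero_()
        a.backward()
        with torch.no_grad():
            x += eta * x.grad                     # gradient ascent step of Eq. (1)
    handle.remove()
    return x.detach()
```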

Fig. 3 shows neurons' semantic input patterns visualized by AM. As the traditional AM method is designed for semantics interpretation, many feature regularizations and hand-engineered natural image references are involved to generate interpretable visualization patterns. In this way we obtain three AM patterns with an average activation magnitude of 3.5, shown in Fig. 3 (a). The objects in the three patterns indicate that they have clear semantics. However, when we remove these semantic regularizations from the AM process, we obtain three different visualized patterns, shown in Fig. 3 (b). These three patterns are non-semantic, yet they produce significantly abnormal activations with an average magnitude of 110. This phenomenon supports our assumption that CNN neurons lack the ability to distinguish semantics and can be significantly activated by non-semantic input patterns.

Figure 4. Image Adversarial Patch Attack

3.2. Metrics for Input Semantic Inconsistency and Prediction Activation Inconsistency

To identify non-semantic input patterns for attack detection, we compare the natural image recognition process with that under physical adversarial attacks.

Fig. 4 shows a typical adversarial patch based physical attack. The patterns in the left circles are the primary activation sources in the input images, and the bars on the right are the neurons' activations in the last convolutional layer. At the input pattern level, we identify a significant difference between the adversarial patch and the primary activation source in the original image, which can be used to detect the adversarial patch. At the prediction activation level, we observe another difference between the adversarial input and the original input: their activation magnitudes. Therefore, we formulate two inconsistencies at these two levels:

Input Semantic Inconsistency Metric. This metric measures the input semantic inconsistency between the non-semantic adversarial patches and the semantic local input patterns from the natural image. It can be defined as follows:

$IC_{input} = 1 - Sim\big(\mathcal{M}(A_{adv}),\ \mathcal{M}(A_{org})\big)$    (2)

where $\mathcal{M}(A_{adv})$ and $\mathcal{M}(A_{org})$ represent the local input patterns located from the adversarial input and the original input, $A_{adv}$ and $A_{org}$ represent the sets of neuron activations produced by the adversarial patch and the original input, respectively, $\mathcal{M}(\cdot)$ maps neuron activations to their primary local input patterns, and $Sim(\cdot,\cdot)$ represents a similarity metric.

Prediction Activation Inconsistency Metric. The second inconsistency is at the activation level, which reveals the inconsistency of the activation magnitude distributions in the last convolutional layer between the adversarial input and the original input. We use a similar metric to measure it:

$IC_{act} = 1 - Sim\big(D_{adv},\ D_{org}\big)$    (3)

where $D_{adv}$ and $D_{org}$ represent the magnitude distributions of activations in the last convolutional layer generated by the adversarial input and the original input, respectively.

For the above two inconsistency metrics, the terms derived from the practical input ($\mathcal{M}(A_{adv})$ and $D_{adv}$) are easy to obtain since they come directly from the input data. However, the reference terms ($\mathcal{M}(A_{org})$ and $D_{org}$) are not easy to obtain because of the variety of natural input data. Therefore, we synthesize standard input data that provides the expected semantic input patterns and activation magnitude distribution. The synthesized input data for each prediction class can be obtained from a standard dataset: by feeding the CNN a certain number of inputs from the standard dataset, we record the average activation magnitude distribution in the last convolutional layer and locate the primary semantic input patterns for each prediction class.
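One way to build such synthesized per-class references is sketched below; it assumes a PyTorch model whose last convolutional layer is exposed as `model.last_conv` (a hypothetical attribute) and a loader over a clean standard dataset.

```python
import torch
from collections import defaultdict

def build_class_references(model, loader, device="cpu"):
    """Record the average last-conv activation magnitude per predicted class.

    `model.last_conv` is assumed to be the last convolutional layer; `loader`
    yields (input, label) pairs from a standard (clean) dataset.
    """
    acts = {}
    model.last_conv.register_forward_hook(
        lambda m, i, o: acts.update(value=o.detach()))

    sums, counts = defaultdict(float), defaultdict(int)
    model.eval()
    with torch.no_grad():
        for x, _ in loader:
            pred = model(x.to(device)).argmax(dim=1)
            # Per-channel mean magnitude of the last-conv activations.
            mag = acts["value"].abs().mean(dim=(2, 3))   # (batch, channels)
            for m, c in zip(mag, pred):
                sums[int(c)] = sums[int(c)] + m.cpu()
                counts[int(c)] += 1
    # class -> averaged activation magnitude distribution (D_org reference)
    return {c: sums[c] / counts[c] for c in sums}
```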

3.3. Physical Adversarial Attack Defense based
on CNN Self-verification and Data Recovery

Based on the above analysis, we propose a defense methodology that adds a self-verification stage and data recovery to the CNN decision-making process. More specifically, the entire flow can be described as follows: (1) We first feed the input into the CNN inference and obtain the prediction class. (2) Next, the CNN locates the primary activation sources in the practical input and obtains the activations in the last convolutional layer. (3) Then the CNN uses the input semantic inconsistency metric and the prediction activation inconsistency metric to measure the two inconsistencies between the practical input and the synthesized data of the predicted class. (4) Once an inconsistency exceeds the given threshold, the CNN considers the input adversarial. (5) After a physical adversarial attack has been detected by the self-verification stage, the data recovery methodology is used to recover the attacked input data. Specifically, we leverage image inpainting and activation denoising to recover adversarial input images and audio, respectively. Our proposed methodology can defend against physical adversarial attacks with only one CNN inference involved.
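The five steps could be wired together roughly as in the sketch below; the model wrapper is assumed to return both the predicted class and the last-layer activations in a single forward pass, and `locate_primary_source`, `inconsistency`, and `recover` are placeholders for the concrete image and audio components detailed in Sections 4 and 5.

```python
def defend(model, x, references, threshold,
           locate_primary_source, inconsistency, recover):
    """Single-inference self-verification followed by optional data recovery.

    `references` maps each class to its synthesized semantic pattern /
    activation distribution; the three callables are placeholders for the
    concrete image or audio implementations.
    """
    pred, activations = model(x)                      # (1) one forward pass
    source = locate_primary_source(x, activations)    # (2) primary activation source
    ic = inconsistency(source, activations, references[pred])  # (3) Eq. (2) / Eq. (3)
    if ic <= threshold:                               # (4) consistent -> accept prediction
        return pred, x
    x_rec = recover(x, source, pred)                  # (5) recover the attacked input
    pred_rec, _ = model(x_rec)
    return pred_rec, x_rec
```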

We derive two concrete defense methods from this methodology for image and audio applications in Section 4 and Section 5, respectively.

4. Defense Against Image Physical Adversarial Attack

In the last section, we proposed a defense methodology consisting of a self-verification stage and data recovery. In this section, we describe the proposed defense methodology against image physical adversarial attacks in detail.

For image physical adversarial attack defense, we mainly rely on the input semantic inconsistency at the input pattern level. Therefore, we need to locate the primary activation source in the input image, for which we adopt a CNN activation visualization method – Class Activation Mapping (CAM) (Zhou and et al., 2016). Let $f_k(x, y)$ denote the value of the $k$-th activation map in the last convolutional layer at spatial location $(x, y)$, and let $w_k$ denote its weight for the predicted class. We compute a weighted sum of all the activations at spatial location $(x, y)$ in the last convolutional layer as:

$S(x, y) = \sum_{k=1}^{K} w_k \cdot f_k(x, y)$    (4)

where $K$ is the total number of activation maps in the last convolutional layer. A larger value of $S(x, y)$ indicates that the activation source at the corresponding spatial location of the input image plays a more important role during CNN inference.
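A minimal sketch of this localization step is shown below; it assumes the last-conv feature maps and the predicted class's weights have already been extracted as NumPy arrays, and it leaves out how the returned location is upsampled and cropped from the input image.

```python
import numpy as np

def cam_heatmap(feature_maps, class_weights):
    """Weighted sum of last-conv activations (Eq. 4).

    feature_maps: array of shape (K, H, W) from the last convolutional layer.
    class_weights: array of shape (K,) -- weights of the predicted class.
    Returns the (H, W) map S(x, y) and the location of its maximum, which
    marks the primary activation source.
    """
    heatmap = np.tensordot(class_weights, feature_maps, axes=1)  # sum_k w_k * f_k(x, y)
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return heatmap, (y, x)
```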

Figure 5. The Results after 2D Fast Fourier Transform

To achieve CNN self-verification for image attack detection, we further build a specific input semantic inconsistency metric. According to our preliminary analysis, an input adversarial patch contains much more high-frequency information than natural semantic input patterns. Therefore, we first leverage the 2D Fast Fourier Transform (2D-FFT) (Ayres and et al., 2008) to transfer the patterns from the spatial domain to the frequency domain, which concentrates the low-frequency components together. Then we convert the frequency-domain pattern into a binary pattern with an adaptive threshold. Fig. 5 shows converted examples, including adversarial patterns, expected synthesized patterns with the same prediction result, and natural input patterns. From the binary patterns, we can observe a significant difference between the adversarial input and the semantic synthesized input. Therefore, based on the above analysis, we instantiate $Sim(\cdot,\cdot)$ as the Jaccard Similarity Coefficient (JSC) (Niwattanakul and et al., 2013) and propose our image adversarial attack inconsistency metric as:

$IC_{input} = 1 - \dfrac{|B_{adv} \cap B_{syn}|}{|B_{adv}| + |B_{syn}| - |B_{adv} \cap B_{syn}|}$    (5)

where $B_{adv}$ is the binary frequency-domain pattern of the located input region, $B_{syn}$ is the binary pattern of the synthesized semantic pattern of the predicted class, and $|B_{adv} \cap B_{syn}|$ denotes the number of pixels where the pixel values of $B_{adv}$ and $B_{syn}$ both equal 1.
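The test of Eq. (5) could be realized roughly as below; the adaptive threshold is simplified to the spectrum mean and both patches are assumed to be grayscale arrays of the same size, which are illustrative choices rather than the exact settings of the implementation.

```python
import numpy as np

def binary_spectrum(patch):
    """2D-FFT a grayscale patch and binarize its centered magnitude spectrum."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(patch)))
    return (spectrum > spectrum.mean()).astype(np.uint8)  # illustrative adaptive threshold

def input_inconsistency(patch_adv, patch_syn):
    """1 - Jaccard similarity between the two binary spectra (Eq. 5)."""
    b1, b2 = binary_spectrum(patch_adv), binary_spectrum(patch_syn)
    inter = np.logical_and(b1, b2).sum()
    union = np.logical_or(b1, b2).sum()
    return 1.0 - inter / max(union, 1)
```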

With the above inconsistency metric, we propose our specific defense methodology, which contains self-verification and image recovery. The entire process is described in Fig. 6.

Self-verification for Detection. For each input image, we apply CAM to locate the source of the largest model activations. Then we crop the image to obtain the pattern with maximum activations, $B_{adv}$. In the semantic test step, we calculate the inconsistency between $B_{adv}$ and the synthesized pattern $B_{syn}$ using Eq. (5). Once the inconsistency is higher than a predefined threshold, we consider an adversarial input detected.

Data Recovery for Image. After the patch is detected, we recover the image data by directly removing the patch from the original input. To eliminate the attack effects, we further leverage image inpainting technology, such as image interpolation (Bertalmio and et al., 2000), to repair the image. Finally, we feed the recovered image back into the CNN to perform the prediction again.
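A rough sketch of this recovery step with OpenCV inpainting follows; the fixed-size square mask around the CAM location is an illustrative simplification, not the exact masking rule of our implementation.

```python
import cv2
import numpy as np

def recover_image(image_bgr, patch_center, patch_size=50):
    """Remove the located patch region and inpaint it.

    `image_bgr` is an 8-bit BGR image, `patch_center` is the (y, x) location
    returned by the CAM step; the square mask of side `patch_size` is an
    illustrative simplification.
    """
    mask = np.zeros(image_bgr.shape[:2], dtype=np.uint8)
    y, x = patch_center
    h = patch_size // 2
    mask[max(0, y - h): y + h, max(0, x - h): x + h] = 255  # region to repair
    # Telea inpainting with a 3-pixel radius fills in the removed region.
    return cv2.inpaint(image_bgr, mask, 3, cv2.INPAINT_TELEA)
```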

With the above steps, we can detect and defend against an image physical adversarial attack during the CNN inference process.

5. Defense Against Audio Physical Adversarial Attack

In this section, we will introduce the detailed defense design flow for the audio physical adversarial attacks.

Different from images, audio data requires more processing effort. As Fig. 2 shows, during audio recognition the input waveform passes through Mel-frequency Cepstral Coefficient (MFCC) conversion to be transferred from the time domain into the time-frequency domain. Consequently, the original input audio loses its semantics after the MFCC conversion. Therefore, we leverage the prediction activation inconsistency to detect audio physical adversarial attacks.

More specifically, we measure the activation magnitude distribution inconsistency between the practical input and the synthesized data of the same prediction class. We adopt a popular similarity evaluation method – the Pearson Correlation Coefficient (PCC) (Benesty and et al., 2009) – and define the inconsistency metric as:

$IC_{act} = 1 - \dfrac{E\big[(D_{adv} - \mu_{adv})(D_{syn} - \mu_{syn})\big]}{\sigma_{adv}\,\sigma_{syn}}$    (6)

where $D_{adv}$ and $D_{syn}$ represent the activations in the last convolutional layer for the practical input and the synthesized input, $\mu_{adv}$ and $\mu_{syn}$ denote the mean values of $D_{adv}$ and $D_{syn}$, $\sigma_{adv}$ and $\sigma_{syn}$ are the standard deviations, and $E[\cdot]$ denotes the expectation.
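A minimal NumPy sketch of Eq. (6), together with the threshold test used in the self-verification stage, is given below; the activation vectors are assumed to be flattened 1-D arrays, and the default 0.11 threshold mirrors the value reported in Section 6.

```python
import numpy as np

def activation_inconsistency(d_adv, d_syn):
    """1 - Pearson correlation between two activation distributions (Eq. 6)."""
    pcc = np.corrcoef(d_adv, d_syn)[0, 1]
    return 1.0 - pcc

def is_adversarial(d_adv, d_syn, threshold=0.11):
    """Flag the input when the inconsistency exceeds the detection threshold."""
    return activation_inconsistency(d_adv, d_syn) > threshold
```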

Self-verification for Detection. With the established inconsistency metric, we apply the self-verification stage to the CNN for audio physical adversarial attacks. The detection flow is as follows: we first obtain the activations in the last convolutional layer for every possible input word by testing the CNN with a standard dataset. Then we calculate the inconsistency value $IC_{act}$ for the practical input. If the model is attacked by an audio adversarial attack, $IC_{act}$ will exceed a pre-defined threshold. According to our preliminary experiments with various attacks, there exists a large margin for the threshold to distinguish natural from adversarial audio, which benefits accurate detection.

Figure 6. Adversarial Patch Attack Defense

Data Recovery for Audio. After identifying an adversarial input audio, simply rejecting it can cause undesired consequences. Therefore, recovering the attacked audio is considered one of the most acceptable solutions. We propose a new solution – "activation denoising" – as our defense method, which ablates the adversarial effects at the activation level. Activation denoising takes advantage of the aforementioned last-layer activation patterns, which have stable correlations with their prediction labels. When a wrong label is detected, we can determine the correlated activation patterns; by suppressing these patterns in the hidden layer, the original input emerges. Therefore, we propose the adversarial audio recovery method shown in Fig. 7: based on the detection result, we identify the wrong prediction label and obtain the standard activation pattern of that wrong class in the last layer. (For the best performance, we locate the top-k activation indices.) Then we find the activations of the practical input with the same indices. These activations are most likely caused by the adversarial noise and supersede the original activations. Therefore, we suppress these activations to restore the original ones. Such an adversarial activation suppression scheme inherits the defense methodology we proposed in the image domain.
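A sketch of this suppression rule is given below, assuming direct access to the input's last-conv activation magnitudes and the stored reference pattern of the wrongly predicted class; the interface and the zero-out rule are illustrative simplifications of the activation denoising described above.

```python
import numpy as np

def activation_denoise(activations, wrong_class_reference, k=6):
    """Suppress the top-k activation channels associated with the wrong class.

    activations: 1-D vector of last-conv activation magnitudes for the input.
    wrong_class_reference: stored standard activation pattern of the class
    that the detector flagged as the (wrong) prediction. k=6 follows the
    setting reported in Section 6.
    """
    top_k_idx = np.argsort(wrong_class_reference)[-k:]  # channels driving the wrong label
    cleaned = activations.copy()
    cleaned[top_k_idx] = 0.0                            # suppress the adversarial activations
    return cleaned
```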

6. Experiment and Evaluation

In this section, we evaluate our method in terms of effectiveness and efficiency in two application scenarios: image and audio physical adversarial attacks. The CNN models and datasets used in our experiments are listed in Table 1. For image physical adversarial attacks, we test our defense method's performance on Inception-V3 (Szegedy and et al., 2015), VGG-16 (Simonyan and et al., 2014), and ResNet-18 (He and et al., 2015) using the ImageNet dataset (Deng and et al., 2009). For audio, we use a Command Classification Model (Morgan and et al., 2001) on the Google Voice Command dataset (Morgan and et al., 2001).

6.1. CNN Image Physical Adversarial
Attack Defense Evaluation

In this part, we evaluate our proposed defense method in the image physical adversarial attack scenario. The adversarial patches are generated using Inception-V3 as the base model. The generated patches, which have high transferability, are used to attack three models: Inception-V3 itself and two other models, VGG-16 and ResNet-18. We then apply our defense method to all the models and test their detection success rates. Meanwhile, we also record the time cost of the defense methods to demonstrate efficiency. The baseline method is Blind (Hayes, 2018), a state-of-the-art defense method, and the threshold for the inconsistency is set to 0.46.

Table 2 shows the overall detection and image recovery performance. On all three models, our method consistently shows a higher detection success rate than (Hayes, 2018). The proposed image recovery further helps correct predictions, resulting in an 80.3%–82% accuracy recovery improvement across the models, while Blind only achieves a 78.2%–79.5% accuracy recovery improvement. In terms of efficiency, the processing time of our detection method for one physical adversarial attack ranges from 67ms to 71ms, while Blind ranges from 132ms to 153ms.

This comparison shows that our defense method outperforms Blind with respect to both effectiveness and efficiency.

Figure 7. Audio Adversarial Attack Defense

6.2. CNN Audio Physical Adversarial
Attack Defense Evaluation

In this part, we evaluate the effectiveness and efficiency of the proposed defense method in audio physical adversarial attack scenarios. The inconsistency threshold for adversarial detection is obtained by grid search and set to 0.11 in this experiment. For comparison, we re-implement two other state-of-the-art defense methods: Dependency Detection (Yang and et al., 2018) and Multiversion (Zeng and et al., 2018). Four attack methods (Goodfellow and et al., 2014; Kurakin and et al., 2016; Carlini and et al., 2017; Alzantot and et al., 2018) are used to demonstrate the generality of our defense method. Fig. 8 shows the overall performance comparison.

Our method always achieves more than a 92% detection success rate for all types of audio physical adversarial attacks. By contrast, Dependency Detection achieves an 89% detection success rate on average, while Multiversion only achieves an average of 74%. Therefore, our method demonstrates the best detection accuracy.

Then we evaluate our method’s recovery performance. The value in the top-k index we mentioned above is set as 6. Since Multiversion (Zeng and et al., 2018) cannot be used to recovery, we re-implement another method, Noise Flooding (Rajaratnam and et al., 2018) as comparison. And we use the original vulnerable model without data recovery as the baseline.

Table 3 shows the overall audio recovery performance evaluation. After applying our recovery method, the prediction accuracy increases significantly from an average of 8% to an average of 85.8%, i.e., a 77.8% accuracy recovery. Both Dependency Detection and Noise Flooding have lower accuracy recovery rates of 74% and 54%, respectively.

Regarding defense efficiency, since our method is based on activation patterns and numerical similarity (which are easy to compute), the detection can be performed efficiently during the CNN forward process. As a result, the time cost of our method is 521ms, while the other two methods usually cost more than 1540ms for each single physical adversarial attack. Therefore, our defense method is 2–3× faster than the other two methods.

Attack                               Model                              Dataset
Image Physical Adversarial Attack    Inception-V3, VGG-16, ResNet-18    ImageNet-10
Audio Physical Adversarial Attack    Command Classification             Speech Commands
Table 1. CNN Models and Datasets
Stage       Metric                 Inception-V3        VGG-16              ResNet-18
                                   Blind*     Ours     Blind*     Ours     Blind*     Ours
Detection   Detection Succ. Rate   88%        91%      89%        90%      85%        89%
            Time Cost              132ms      68ms     144ms      67ms     153ms      71ms
Recovery    Original Acc.          9.8%       9.8%     9.5%       9.8%     10.8%      9.8%
            Recovery Acc.          88%        90%      89.3%      91.5%    90%        91%
*: Blind (Hayes, 2018)

Table 2. Image Adversarial Patch Attack Defense Evaluation
Method                                          FGSM (Goodfellow and et al., 2014)   BIM (Kurakin and et al., 2016)   CW (Carlini and et al., 2017)   Genetic (Alzantot and et al., 2018)   Time Cost
No Recovery                                     10%                                  5%                               4%                              13%                                   NA
Dependency Detection (Yang and et al., 2018)    85%                                  83%                              80%                             80%                                   1813ms
Noise Flooding (Rajaratnam and et al., 2018)    62%                                  65%                              62%                             59%                                   1246ms
Ours                                            87%                                  88%                              85%                             83%                                   521ms
Table 3. Audio Adversarial Attack Data Recovery Evaluation

7. Conclusion

In this paper, we propose a CNN defense methodology against physical adversarial attacks for both image and audio recognition applications. Leveraging a comprehensive CNN vulnerability analysis and two novel CNN inconsistency metrics, our method can effectively and efficiently detect and eliminate image and audio physical adversarial attacks. Experiments show that our methodology achieves an average 90% attack detection success rate and 81% accuracy recovery for image physical adversarial attacks. The proposed defense method also achieves a 92% detection success rate and 77.5% accuracy recovery for audio recognition applications. Moreover, the proposed defense methods are at most 2.3× faster than state-of-the-art defense methods, making them feasible for resource-constrained platforms, such as mobile devices.

Figure 8. Audio Adversarial Attack Detection Performance

References

  • Alzantot and et al. (2018) M. Alzantot and et al. 2018. Did You Hear that? Adversarial Examples Against Automatic Speech Recognition. arXiv preprint arXiv:1801.00554 (2018).
  • Ayres and et al. (2008) C. Ayres and et al. 2008. Measuring Fiber Alignment in Electrospun Scaffolds: A User’s Guide to the 2D Fast Fourier Transform Approach. Journal of Biomaterials Science, Polymer Edition 19, 5 (2008), 603–621.
  • Benesty and et al. (2009) J. Benesty and et al. 2009. Pearson Correlation Coefficient. In Noise reduction in speech processing. Springer, 1–4.
  • Bertalmio and et al. (2000) M. Bertalmio and et al. 2000. Image Inpainting. In Proc. of SIGGRAPH. ACM, 417–424.
  • Brown and et al. (2017) T. Brown and et al. 2017. Adversarial Patch. arXiv preprint arXiv:1712.09665 (2017).
  • Carlini and et al. (2017) N. Carlini and et al. 2017. Towards Evaluating the Robustness of Neural Networks. In Proc. of SP. 39–57.
  • Chiu and et al. (2018) C. Chiu and et al. 2018. State-of-the-art Speech Recognition with Sequence-to-sequence Models. In Proc. of ICASSP. 4774–4778.
  • Chorowski and et al. (2015) J. Chorowski and et al. 2015. Attention-based Models for Speech Recognition. In Proc. of NIPS. 577–585.
  • Deng and et al. (2009) J. Deng and et al. 2009. Imagenet: A large-scale Hierarchical Image Database. In Proc. of CVPR. 248–255.
  • Erhan and et al. (2009) D. Erhan and et al. 2009. Visualizing Higher-layer Features of A Deep Network. University of Montreal 1341, 3 (2009), 1.
  • Eykholt and et al. (2017) K. Eykholt and et al. 2017. Robust Physical-world Attacks on Deep Learning Models. arXiv preprint arXiv:1707.08945 (2017).
  • Goodfellow and et al. (2014) I. Goodfellow and et al. 2014. Explaining and Harnessing Adversarial Examples. arXiv preprint arXiv:1412.6572 (2014).
  • Hayes (2018) J. Hayes. 2018. On Visible Adversarial Perturbations & Digital Watermarking. In Proc. of CVPR Workshops. 1597–1604.
  • He and et al. (2015) K. He and et al. 2015. Deep Residual Learning for Image Recognition. In Proc. of CVPR. 770–778.
  • James (2018) V. James. 2018. Google is making it easier than ever to give any app the power of object recognition. https://www.theverge.com/2017/6/15/15807096/google-mobile-ai-mobilenets-neural-networks
  • Kurakin and et al. (2016) A. Kurakin and et al. 2016. Adversarial Examples in the Physical World. arXiv preprint arXiv:1607.02533 (2016).
  • Morgan and et al. (2001) S. Morgan and et al. 2001. Speech Command Input Recognition System for Interactive Computer Display with Term Weighting Means Used in Interpreting Potential Commands from Relevant Speech Terms. US Patent 6,192,343.
  • Naseer and et al. (2019) M. Naseer and et al. 2019. Local Gradients Smoothing: Defense against localized adversarial attacks. In Proc. of WACV. 1300–1307.
  • Niwattanakul and et al. (2013) S. Niwattanakul and et al. 2013. Using of Jaccard Coefficient for Keywords similarity. In Proc. of IMECS, Vol. 1. 380–384.
  • Osadchy and et al. (2017) M. Osadchy and et al. 2017. No Bot Expects the DeepCAPTCHA! Introducing Immutable Adversarial Examples, With Applications to CAPTCHA Generation. IEEE Transactions on Information Forensics and Security 12, 11 (2017), 2640–2653.
  • Rajaratnam and et al. (2018) K. Rajaratnam and et al. 2018. Noise Flooding for Detecting Audio Adversarial Examples Against Automatic Speech Recognition. In Proc. of ISSPIT. 197–201.
  • Simonyan and et al. (2014) K. Simonyan and et al. 2014. Very Deep Convolutional Networks for Large-scale Image Recognition. arXiv preprint arXiv:1409.1556 (2014).
  • Szegedy and et al. (2013) C. Szegedy and et al. 2013. Intriguing Properties of Neural Networks. arXiv preprint arXiv:1312.6199 (2013).
  • Szegedy and et al. (2015) C. Szegedy and et al. 2015. Going Deeper with Convolutions. In Proc. of CVPR. 1–9.
  • Wang and et al. (2017) F. Wang and et al. 2017. Residual Attention Network for Image Classification. In Proc. of CVPR. 3156–3164.
  • Yakura and et al. (2018) H. Yakura and et al. 2018. Robust Audio Adversarial Example for A Physical Attack. arXiv preprint arXiv:1810.11793 (2018).
  • Yang and et al. (2018) Z. Yang and et al. 2018. Characterizing Audio Adversarial Examples Using Temporal Dependency. arXiv preprint arXiv:1809.10875 (2018).
  • Zeng and et al. (2018) Q. Zeng and et al. 2018. A Multiversion Programming Inspired Approach to Detecting Audio Adversarial Examples. arXiv preprint arXiv:1812.10199 (2018).
  • Zhou and et al. (2016) B. Zhou and et al. 2016. Learning Deep Features for Discriminative Localization. In Proc. of CVPR. 2921–2929.
  • Zoph and et al. (2018) B. Zoph and et al. 2018. Learning Transferable Architectures for Scalable Image Recognition. In Proc. of CVPR. 8697–8710.