Unrestricted Adversarial Attacks for Semantic Segmentation
Semantic segmentation is one of the most impactful applications of machine learning; however, the robustness of segmentation models under adversarial attack is not well studied. In this paper, we focus on generating unrestricted adversarial examples for semantic segmentation models. We demonstrate a simple yet effective method to generate unrestricted adversarial examples using a conditional generative adversarial network (CGAN) without any hand-crafted metric. A naïve implementation of a CGAN, however, yields inferior image quality and a low attack success rate. Instead, we leverage the SPADE (Spatially-Adaptive DEnormalization) structure with an additional loss term, which is able to generate effective adversarial attacks in a single step. We validate our approach on the well-studied Cityscapes and ADE20K datasets, and demonstrate that our synthetic adversarial examples are not only realistic but also improve the attack success rate by up to 41.0% compared with state-of-the-art adversarial attack methods, including the PGD attack.
Despite their impressive accuracy and wide adoption, machine learning models remain fragile to adversarial attacks (Szegedy et al. (2013); Carlini and Wagner (2017); Papernot et al. (2016)), which raises serious concerns for deploying them in real-world applications, especially in safety- and security-critical systems. Extensive efforts have been made to combat these adversarial attacks: robust models are trained so that they are not easily evaded by adversarial examples (Goodfellow et al. (2015); Papernot et al. (2016); Madry et al. (2018)). Although these defense methods improve the models' robustness, they are mostly limited to addressing norm-bounded attacks such as PGD (Madry et al. (2018)). Realistic adversarial attacks beyond a norm bound thus remain a major concern for those robust models, which spurs extensive efforts to explore stronger and more realistic adversarial attacks, e.g., using the Wasserstein distance (Wong et al. (2019)) or realistic image transformations (Engstrom et al. (2017)). In particular, Song et al. (2018) propose unrestricted adversarial attacks using a conditional GAN for image classification models, a big step toward realistic attacks beyond hand-crafted constraints. However, due to their model design, they are mostly restricted to low-resolution images; at high resolution, the generated images are not very realistic.
The problem of achieving realistic adversarial attacks and defenses is further aggravated for more difficult visual recognition tasks such as semantic segmentation, where one needs to attack orders of magnitude more pixels while preserving a consistent perception by humans. It is essential to make segmentation models robust against adversarial attacks, especially given their applicability in autonomous driving (Ess et al. (2009)), medical imaging (Ronneberger et al. (2015); Shen et al. (2018)), and computer-aided diagnosis systems (Milletari et al. (2016)). Unfortunately, we show that existing attack methods, which are primarily designed for simple classification tasks, do not generalize well to semantic segmentation. For instance, following the work of Arnab et al. (2018), we show that norm-bounded perturbations become visible to humans because larger bounds are required to launch a successful attack. Unrestricted adversarial attacks, on the other hand, are not constrained by a norm-bounded budget and can thus expose more vulnerabilities of a given machine learning model. However, the quality and resolution of the unrestricted adversarial images generated by existing methods are low and limited to simple images such as handwritten digits.
In this paper, we present the first realistic unrestricted adversarial attack, AdvSPADE, for semantic segmentation models. Figure 1 illustrates the effectiveness of our proposed method. To generate realistic images for the semantic segmentation models, we use SPADE (Park et al. (2019)), a state-of-the-art conditional generative model capable of synthesizing high-resolution images (up to 1 million pixels). We add an additional adversarial loss term to the original SPADE architecture to fool the target model. This simple yet effective method, AdvSPADE, creates a wide variety of adversarial examples from a single image in a single step. Empirical results show that successful adversarial attacks vary in style (e.g., different lighting conditions, textures, etc.), suggesting that a large volume of vulnerabilities exists for semantic segmentation models beyond hand-crafted perturbations. We further show that augmenting the training data with such realistic adversarial examples has the potential to improve the models' robustness.
To this end, our main contributions are: (1) We propose a new realistic attack for semantic segmentation that defeats existing state-of-the-art robust models, demonstrating the existence of a rich variety of unrestricted adversarial examples beyond the previously known ones. (2) We demonstrate that augmenting the training dataset with our new adversarial examples has the potential to improve the robustness of existing models. (3) We present an empirical evaluation of our approach on two popular semantic segmentation datasets. First, we evaluate the quality of our generated adversarial examples using Amazon Mechanical Turk and demonstrate that our samples are indistinguishable from natural images for humans. Using these adversarial images, we further show that the attack success rate can be improved by up to 41.0%.
2 Related Work
Semantic Segmentation. Semantic segmentation is one of the most critical tasks in computer vision; it can be considered a multi-output classification task that provides finer-grained information in the prediction (Barrow and Tenenbaum (1981)). Many network architectures have been proposed to address the semantic segmentation task efficiently (Ronneberger et al. (2015); Long et al. (2015); Chen et al. (2017a); Badrinarayanan et al. (2017)). Generally, a segmentation network contains two parts: an encoder for feature extraction and a decoder for dimension restoration. Although the structural design of segmentation networks has been well studied, very few studies (Arnab et al. (2018); Xie et al. (2017)) look into the robustness of this class of networks against adversarial examples.
Adversarial Attacks. Adversarial examples are inputs carefully crafted to mislead the predictions of machine learning models while still being perceived the same by humans. In recent years, researchers have proposed multiple methods for generating adversarial examples for the image classification task (Goodfellow et al. (2015); Kurakin et al. (2016); Carlini and Wagner (2017); Madry et al. (2018)), where the target model is fooled by the adversarial images. Hand-crafted metrics, such as norm bounds (Madry et al. (2018); Feinman et al. (2017)) and the Wasserstein distance (Wong et al. (2019)), are applied during the generation process to preserve the semantic meaning of the adversarial examples for humans. Recently, Song et al. (2018) proposed using a generative adversarial network to generate unrestricted adversarial attacks for image classification. Leveraging the Auxiliary Classifier Generative Adversarial Network (AC-GAN) (Odena et al. (2016)), their model generates low-quality adversarial examples from scratch, beyond any norm bound. Wang et al. (2019) proposed a new generative model, AT-GAN, that learns a transformation between a pre-trained GAN and an adversarial GAN which can generate adversarial examples for the target classifier. These GAN-based works (Song et al. (2018); Wang et al. (2019)) adopt two-step procedures involving many steps of gradient descent in the second stage. In contrast, our adversarial examples are generated in a single step, which is more efficient.
A few studies have focused on adversarial attacks against modern semantic segmentation networks. Arnab et al. (2018) conducted the first systematic analysis of the effect of multiple adversarial attack methods on different modern semantic segmentation architectures across two large-scale datasets. Xie et al. (2017) propose a new attack method called Dense Adversary Generation, which generates a group of adversarial examples for a set of state-of-the-art segmentation and detection networks. However, all of these attack methods rely on norm-bounded perturbations, which cover only a small fraction of all feasible adversarial examples.
Defense Methods. Adversarial training is the state-of-the-art method for training robust classifiers (Goodfellow et al. (2015); Lyu et al. (2015); Shaham et al. (2018); Szegedy et al. (2013); Hinton et al. (2015); Papernot and McDaniel (2017); Xu et al. (2018); Madry et al. (2018)). Besides, other defense methods based on input transformations, including rescaling, JPEG compression, Gaussian blur, HSV jitter, and grayscale conversion, are evaluated against adversarial attacks on semantic segmentation networks by Arnab et al. (2018). These input transformation methods, however, were shown to rely on obfuscated gradients and give a false sense of robustness (Athalye et al. (2018)). On the other hand, Athalye et al. (2018) endorsed the robustness of models trained with adversarial training, which remains the state-of-the-art robust model available. In this paper, we demonstrate the strength of our attack method by bypassing adversarial training rather than evaluating against defense methods that rely on obfuscated gradients.
3 Generating Adversarial Examples
In this section, we introduce our methodology, AdvSPADE, for generating unrestricted adversarial examples for semantic segmentation. For this purpose, we leverage a conditional generative adversarial network, SPADE. The main goal of a standard conditional GAN is to synthesize realistic images that fool the discriminator. Generating adversarial attacks, however, also requires fooling the segmentation model under attack. AdvSPADE therefore adds an additional loss term to fool both the discriminator and the segmentation model. Figure 2 shows the overall workflow. The rest of this section describes the relevant terminology and the new loss function in detail.
Unrestricted Adversarial Examples. Let $\mathcal{I}$ be the set of images that look realistic to humans and let $\mathcal{Y}$ be the set of all possible categories for $\mathcal{I}$. Suppose $o$ is an oracle that maps any image in its domain $\mathcal{I}$ to its correct category in $\mathcal{Y}$. A classification model $f$ likewise provides a class prediction for any given image in $\mathcal{I}$. Under the assumption that $f$ and $o$ agree on natural images, an unrestricted adversarial example is any image $x$ that meets the following requirements (Song et al. (2018)): $x \in \mathcal{I}$, $f(x) \neq o(x)$.
Conditional Generative Adversarial Networks. A conditional generative adversarial network (Mirza and Osindero (2014)) consists of a generator $G$ and a discriminator $D$, both conditioned on auxiliary information $y$. Combining a random noise vector $z$ with the extra information $y$ as input, $G$ maps the pair to a realistic image, while $D$ aims to distinguish real images from the generator's synthetic ones. $G$ and $D$ play a two-player minimax game, formalized as

$\min_G \max_D \; \mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y) \mid y)\big)\big]$
Unrestricted Adversarial Loss. We design an adversarial loss term for unrestricted adversarial example generation. We focus on the untargeted attack in this paper, though our approach is general and can be directly applied to targeted attacks. Intuitively, the SPADE generator is trained to mislead the prediction of the target segmentation network: the synthetic images must not only fool the conditional GAN's discriminator but also be mis-segmented by the target network. To achieve this, we introduce the target segmentation network into the training phase and aim to maximize the loss of the segmentation model while preserving the quality and semantic meaning of the synthetic images. We denote the target segmentation network by $S$, the SPADE generator by $G$, the input semantic label by $m$, and the input random vector by $z$. We define the untargeted version of the Unrestricted Adversarial Loss as follows:

$\mathcal{L}_{adv} = -\,\mathbb{E}_{z}\,\ell\big(S(G(m, z)),\, m\big)$
We select Dice loss (Sudre et al. (2017)) as the objective function $\ell$. An image encoder processes a real image and produces a mean vector and a variance vector, from which the noise input $z$ is computed via the reparameterization trick (Kingma and Welling (2013)).
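As an illustration, the pieces above can be sketched in a few lines of NumPy. The soft Dice loss and the negated adversarial objective follow the definitions in this section; the function names (`dice_loss`, `unrestricted_adv_loss`, `reparameterize`) are ours for illustration, not part of any released implementation:

```python
import numpy as np

def dice_loss(probs, onehot, eps=1e-6):
    """Soft Dice loss, averaged over classes.
    probs: (C, H, W) predicted per-class probabilities.
    onehot: (C, H, W) one-hot ground-truth label map."""
    inter = (probs * onehot).sum(axis=(1, 2))
    denom = probs.sum(axis=(1, 2)) + onehot.sum(axis=(1, 2))
    dice = (2.0 * inter + eps) / (denom + eps)
    return 1.0 - dice.mean()

def unrestricted_adv_loss(probs, onehot):
    # The generator *maximizes* the segmentation loss,
    # i.e. it minimizes the negative Dice loss.
    return -dice_loss(probs, onehot)

def reparameterize(mu, logvar, rng):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I),
    which keeps the sample differentiable w.r.t. (mu, logvar)."""
    std = np.exp(0.5 * logvar)
    return mu + std * rng.standard_normal(mu.shape)
```

A perfect prediction gives a Dice loss near 0 (adversarial loss near 0), while a fully wrong prediction gives a Dice loss near 1, which the generator is rewarded for.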
The complete objective function of AdvSPADE can then be written as:

$\mathcal{L} = \mathcal{L}_{GAN} + \lambda_{FM}\,\mathcal{L}_{FM} + \lambda_{P}\,\mathcal{L}_{P} + \lambda_{KLD}\,\mathcal{L}_{KLD} + \lambda_{adv}\,\mathcal{L}_{adv} \qquad (3)$
More details about the remaining loss terms in Eq. 3 can be found in (Park et al. (2019); Wang et al. (2018)). To speed up the generation process as well as to improve the quality of the synthesized images, we adopt spatially-adaptive denormalization, as proposed by Park et al. (2019).
Spatially-adaptive denormalization. Our model uses the SPADE architecture (Park et al. (2019)) as the conditional GAN, where Batch Normalization (Ioffe and Szegedy (2015)) is replaced with spatially-adaptive denormalization. This method has been shown to preserve semantic information that would otherwise be lost during downsampling. Please refer to Appendix A for details.
4 Experimental Set-up
Datasets. We evaluate our method on two large-scale image segmentation datasets: Cityscapes (Cordts et al. (2016)) and ADE20K (Zhou et al. (2016)). Cityscapes contains street-view images from German cities annotated with 19 semantic classes, and it consists of 2,975 training and 500 validation images. ADE20K covers 150 semantic classes across multiple real-world scenes, with 20,210 training and 2,000 validation images.
Training Details. Following Park et al. (2019), we apply Spectral Norm (Miyato et al. (2018)) in all layers of both the generator and the discriminator. We train our model for the full number of epochs on Cityscapes; however, due to the large size of ADE20K and our computational limits, we run only half of the epochs reported in (Park et al. (2019)) on ADE20K. We set the learning rates of the generator and the discriminator to be equal and linearly decay the learning rate in the later epochs when training on ADE20K. We employ the ADAM optimizer (Kingma and Ba (2014)). In Equation 3, the loss weights are shared between Cityscapes and ADE20K except for one weight, which is set per dataset. All experiments are run on a single NVIDIA TITAN Xp GPU.
Baseline Models. We compare AdvSPADE-generated attacks with traditional norm-bounded attacks in two settings: real images with perturbation, and clean generated images with perturbation. For the second setting, we first use vanilla SPADE to generate clean images and then add norm-bounded perturbations to the synthetic images. For a fair comparison, we choose the same target segmentation networks for each dataset as mentioned in (Park et al. (2019)): DRN-D-105 (Yu et al. (2017)) for Cityscapes and Uppernet-101 (Xiao et al. (2018)) for ADE20K. In addition, we select several state-of-the-art open-source segmentation networks to evaluate the transferability of our unrestricted adversarial examples in a black-box setting: DRN-38 and DRN-22 (Yu et al. (2017); Yu and Koltun (2016)), DeepLab-V3 (Chen et al. (2017b)), and PSPNet-34-8s (Zhao et al. (2017)) for Cityscapes; PPM-18, MobilenetV2, Uppernet-50, and PPM-101 (Zhou et al. (2018)) for ADE20K.
Evaluation metric. Due to the dense output of the semantic segmentation task, evaluating the attack success rate differs from the classification case (Song et al. (2018)). Let $\mathcal{I}$ be a set of RGB images with height $H$, width $W$, and $C$ channels, and let $\mathcal{M}$ be the set of semantic labels for the corresponding images in $\mathcal{I}$. Suppose $o$ is an oracle that correctly maps any image from its domain, the set of all images that look realistic to humans, to $\mathcal{M}$. A segmentation model $S$ provides pixel-wise predictions for any given image in $\mathcal{I}$. With this setting, we evaluate the following two categories of adversarial examples: (1) Given a constant $\epsilon$ and a hand-crafted norm $\|\cdot\|$, a restricted adversarial example is an image $x + \delta$ that meets the following conditions: $x \in \mathcal{I}$, $\|\delta\| \leq \epsilon$, and $S(x+\delta)_{i,j} \neq o(x+\delta)_{i,j}$ for at least a fraction $\rho$ of the pixels $(i, j)$. (2) An unrestricted adversarial example is an image $x \in \mathcal{I}$ such that $S(x)_{i,j} \neq o(x)_{i,j}$ for at least a fraction $\rho$ of the pixels. Here, $o(x)_{i,j}$ and $S(x)_{i,j}$ stand for the predictions given by the oracle and the segmentation network at pixel $(i, j)$, respectively, and $\rho$ is a hyperparameter fixed throughout this paper.
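The unrestricted criterion above reduces to a pixel-disagreement test. A minimal sketch, in which the function name `is_unrestricted_adv` and the default threshold value are illustrative stand-ins for the paper's hyperparameter $\rho$:

```python
import numpy as np

def is_unrestricted_adv(pred, oracle, rho=0.9):
    """pred, oracle: (H, W) integer label maps.
    The image counts as adversarial when at least a fraction rho
    of its pixels are mis-segmented relative to the oracle labels."""
    return (pred != oracle).mean() >= rho
```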
Given the nature of the semantic segmentation task, misclassifying a single pixel does not make an image an adversarial example. A legitimate adversarial example should have the property that the majority of its pixels are misclassified (measured by the mIoU score) while the image still looks realistic to humans (measured by the FID score) and carries the same semantic meaning as the original image (verified via Amazon Mechanical Turk). In particular, we use the following three measures: 1. Mean Intersection-over-Union (mIoU). To measure the effect of different attack methods on the target networks, we measure the drop in recognition accuracy using the mIoU score, which is widely used in semantic segmentation (Cordts et al. (2016); Zhou et al. (2016)); a lower mIoU score means a better adversarial example. 2. Fréchet Inception Distance (FID). We use FID (Heusel et al. (2017)) to compute the distance between the distribution of our adversarial examples and that of the real images; a small FID indicates high quality of the generated images. 3. Amazon Mechanical Turk (AMT). AMT is used to verify the success of our unrestricted adversarial attack. Here, we randomly select generated adversarial images under two experimental settings from each dataset to create AMT assignments. Each assignment is answered by several workers, each of whom has a fixed number of minutes to decide. We use the result of a majority vote as each assignment's final answer.
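For concreteness, the mIoU score over integer label maps can be sketched as below; skipping classes absent from both prediction and ground truth is one common convention, assumed here rather than taken from the paper:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean Intersection-over-Union over integer label maps."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```

A perfect prediction yields an mIoU of 1.0; a prediction that disagrees everywhere yields 0.0, which is what a strong attack drives the score toward.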
5 Experimental Results
[Table 1 — mIoU per dataset and segmentation model; columns: Datasets | Seg Model | Real Images | SPADE | AdvSPADE (Ours)]
5.1 Evaluating Generated Adversarial Images
Here, we compare the adversarial images generated by AdvSPADE with the original real images and with the clean synthetic images created by vanilla SPADE, using mIoU and FID scores. Table 1 shows that, compared to vanilla SPADE, AdvSPADE-generated images under a whitebox attack lead to a steep decline in mIoU score for both DRN-105 and Uppernet-101. On different network architectures, our adversarial examples also decrease the mIoU considerably on both Cityscapes and ADE20K, showing the strong transferability of our examples across models.
Compared to vanilla SPADE, the FID of our adversarial examples increases only slightly on both Cityscapes and ADE20K (as shown in Table 2), which means our samples have comparable quality and variety. Note that we train AdvSPADE for only half the number of epochs reported in Park et al. (2019), yet the resulting FID on ADE20K is still smaller than that of other leading semantic image synthesis models such as Pix2PixHD (Wang et al. (2018)), SIMS (Qi et al. (2018)), and CRN (Chen and Koltun (2017)). Qualitative results are shown in Figure 3. Moreover, by introducing an image encoder and a KL-divergence loss, we can generate multi-modal stylized adversarial examples, which are shown in the appendix.
[Table 2 — FID per dataset; columns: Vanilla SPADE | Pix2PixHD | CRN | AdvSPADE (Ours)]
5.2 Norm-Bounded Adversarial Attacks
We compare the attack success rate of AdvSPADE with state-of-the-art norm-bounded adversarial attacks, FGSM and PGD (Goodfellow et al. (2015); Madry et al. (2018)), on both datasets. We use the same norm-bound size for FGSM and PGD. For PGD, we follow (Kurakin et al. (2016); Arnab et al. (2018)) in setting the number of attack iterations. We apply FGSM and PGD to both real images and synthetic images produced by vanilla SPADE, and compare their mIoU and FID scores with ours. Overall, AdvSPADE achieves higher attack success rates on both Cityscapes and ADE20K than traditional norm-bounded attack approaches, as shown in Table 3.
Table 4 further reveals that, for both FGSM and PGD, decreasing the mIoU to the same level as AdvSPADE requires perturbations so conspicuous that humans can easily distinguish the adversarial examples from clean images; the FID also reflects this decline in image quality. Conversely, adversarial examples generated by FGSM and PGD cannot drive the mIoU down to AdvSPADE's level if they are required to maintain sample quality: adversarial samples on Cityscapes generated by vanilla SPADE with a small added perturbation have an FID (64.455) comparable to our samples, but a much larger mIoU than ours. Figure 4 illustrates the difference between AdvSPADE samples and norm-bounded samples at the same mIoU level: the noise pattern is clearly visible in the norm-bounded samples but not in ours.
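For reference, the $L_\infty$ PGD baseline used above can be sketched as iterated signed-gradient steps projected back into the $\epsilon$-ball. Here `grad_fn` abstracts the gradient of the segmentation loss w.r.t. the input, and the concrete values in the example are illustrative, not the paper's settings:

```python
import numpy as np

def pgd_attack(x, grad_fn, eps, alpha, iters):
    """L-infinity PGD: repeatedly step in the sign of the loss gradient,
    projecting back into the eps-ball around the clean image x."""
    x_adv = x.copy()
    for _ in range(iters):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in the valid pixel range
    return x_adv
```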
[Tables 3 and 4 — mIoU and FID of norm-bounded attacks per dataset; columns: Bound Size | Real Images | Vanilla SPADE]
5.3 Human Evaluation
Using Amazon Mechanical Turk (AMT), we evaluate how humans perceive AdvSPADE-generated adversarial images. Detailed results are presented in the Appendix. The evaluation is done in two settings:
(1) Semantic Consistency Test: Here, we aim to validate that the semantic meanings of our adversarial examples are consistent with their respective ground-truth labels; if so, humans will segment our adversarial examples correctly. However, asking workers to segment every pixel in an image is time-consuming and inefficient. Instead, we give AMT workers a pair of images, a generated adversarial image and a semantic label (half of the image pairs are matched, and the rest are mismatched), and ask whether the semantic meaning of the given synthetic image is consistent with the given semantic label. We observe that workers identify the semantic meaning of our adversarial examples precisely on both Cityscapes and ADE20K. In other words, the segmentation network's reaction to our adversarial examples is inconsistent with that of humans, which proves the success of our attack.
(2) Fidelity AB Test: We compare the visual fidelity of AdvSPADE with vanilla SPADE. Here we give workers the ground-truth semantic label and two generated images, one by AdvSPADE and one by vanilla SPADE, and ask them to select the image that corresponds better to the label. The fractions of workers favoring our examples over vanilla SPADE on Cityscapes and ADE20K indicate competitive visual fidelity of our adversarial images.
5.4 Robustness Evaluation
In this section, we first show that robust training with norm-bounded adversarial images successfully defends against restricted adversarial attacks, whether the perturbation is added to real images or to synthesized images. We then show that our unrestricted adversarial examples can still successfully attack these robust models (Goodfellow et al. (2015)). Finally, we present the results of our experiment on building a more robust segmentation model using our unrestricted examples. We follow the adversarial training setting introduced by Madry et al. (2018), selecting PGD as the attack method for robust training on both Cityscapes and ADE20K. After the training phase, we use PGD with the same setting to generate norm-bounded perturbations and add them to both real images and images synthesized by vanilla SPADE.
It turns out that real and synthesized images with perturbations cause only a modest mIoU decrease on robust DRN-105 and robust Uppernet-101. In contrast, our adversarial examples drive the mIoU far lower on both robust models, showing that our unrestricted adversarial examples successfully bypass robust models trained with norm-bounded adversarial examples. Next, we train a model with our unrestricted adversarial examples on the Cityscapes dataset and then attack it with PGD. The PGD attack achieves only a low success rate on DRN-105. Since norm-bounded examples are unseen by the robust model defended with our samples, this low success rate suggests that models gain stronger robustness from adversarial training with AdvSPADE examples.
6 Conclusion
In this paper, we explore the existence of adversarial examples beyond norm-bounded metrics for state-of-the-art semantic segmentation networks. By modifying the loss function of the SPADE architecture, we are able to generate high-quality, realistic unrestricted adversarial examples beyond any norm bound that mislead segmentation networks' behavior. We demonstrate the effectiveness and robustness of our method by comparing it with traditional norm-bounded attack methods. We also show that our generated adversarial examples can easily bypass the state-of-the-art defense method, which raises new concerns about the security of segmentation neural networks.
- On the robustness of semantic segmentation models to adversarial attacks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Cited by: §1, §2, §5.2.
- Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. Cited by: §2.
- Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (12), pp. 2481–2495. Cited by: §2.
- Interpreting line drawings as three-dimensional surfaces. Artificial Intelligence 17 (1-3), pp. 75–116. Cited by: §2.
- Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, May 22–26, 2017, pp. 39–57. Cited by: §1, §2.
- Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence 40 (4), pp. 834–848. Cited by: §2.
- Rethinking atrous convolution for semantic image segmentation. Cited by: §4.
- Photographic image synthesis with cascaded refinement networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1511–1520. Cited by: §5.1.
- The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Cited by: §4.
- A rotation and a translation suffice: fooling CNNs with simple transformations. Cited by: §1.
- Segmentation-based urban traffic scene understanding. In BMVC, Vol. 1, pp. 2. Cited by: §1.
- Detecting adversarial samples from artifacts. arXiv abs/1703.00410. Cited by: §2.
- Explaining and harnessing adversarial examples. In International Conference on Learning Representations. Cited by: §A.1, §1, §2, §5.2, §5.4.
- GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Cited by: §4.
- Distilling the knowledge in a neural network. In NIPS Deep Learning and Representation Learning Workshop. Cited by: §2.
- Batch normalization: accelerating deep network training by reducing internal covariate shift. Cited by: §3.
- Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §4.
- Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114. Cited by: §3.
- Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1097–1105. Cited by: §A.1.
- Adversarial machine learning at scale. Cited by: §2, §5.2.
- Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440. Cited by: §2.
- A unified gradient regularization family for adversarial examples. 2015 IEEE International Conference on Data Mining. Cited by: §2.
- Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations. Cited by: §A.1, §1, §2, §5.2, §5.4.
- V-net: fully convolutional neural networks for volumetric medical image segmentation. In 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. Cited by: §1.
- Conditional generative adversarial nets. Cited by: §3.
- Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957. Cited by: §4.
- Conditional image synthesis with auxiliary classifier GANs. In ICML. Cited by: §2.
- Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. Cited by: §1.
- Distillation as a defense to adversarial perturbations against deep neural networks. 2016 IEEE Symposium on Security and Privacy (SP). Cited by: §1.
- Extending defensive distillation. Cited by: §2.
- Semantic image synthesis with spatially-adaptive normalization. Cited by: §A.1, §1, §3, §4, §5.1.
- Semi-parametric image synthesis. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Cited by: §5.1.
- U-net: convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pp. 234–241. Cited by: §1, §2.
- Understanding adversarial training: increasing local stability of supervised models through robust optimization. Neurocomputing 307, pp. 195–204. Cited by: §2.
- Brain tumor segmentation using concurrent fully convolutional networks and conditional random fields. In Proceedings of the 3rd International Conference on Multimedia and Image Processing, pp. 24–30. Cited by: §1.
- Constructing unrestricted adversarial examples with generative models. Cited by: §1, §2, §3, §4.
- Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. Lecture Notes in Computer Science, pp. 240–248. Cited by: §3.
- Intriguing properties of neural networks. Cited by: §1, §2.
- High-resolution image synthesis and semantic manipulation with conditional GANs. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Cited by: §A.1, §3, §5.1.
- AT-GAN: a generative attack model for adversarial transferring on generative adversarial nets. Cited by: §2.
- Wasserstein adversarial examples via projected sinkhorn iterations. Cited by: §1, §2.
- Unified perceptual parsing for scene understanding. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 418–434. Cited by: §4.
- Adversarial examples for semantic segmentation and object detection. 2017 IEEE International Conference on Computer Vision (ICCV). Cited by: §2.
- Feature squeezing: detecting adversarial examples in deep neural networks. Proceedings 2018 Network and Distributed System Security Symposium. Cited by: §2.
- Dilated residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 472–480. Cited by: §4.
- Multi-scale context aggregation by dilated convolutions. In International Conference on Learning Representations (ICLR). Cited by: §4.
- Pyramid scene parsing network. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Cited by: §4.
- Semantic understanding of scenes through the ade20k dataset. arXiv preprint arXiv:1608.05442. Cited by: §4.
- Semantic understanding of scenes through the ade20k dataset. International Journal on Computer Vision. Cited by: §4.
- Toward multimodal image-to-image translation. Cited by: §A.1.
Appendix A Appendix
A.1 Additional Experiment Details
Background on SPADE Model SPADE normalizes the activation of each layer in a neural network in a channel-wise manner and modulates it with a scale and bias that are learned dynamically from two simple two-layer CNNs, respectively. Let $m$ denote the input semantic label map and $\hat{h}^i_{n,c,y,x}$ the adjusted activation map, where $h^i_{n,c,y,x}$ stands for the $i$-th layer's original activation value of the $n$-th sample at location $(c, y, x)$ ($c$, $y$, $x$ index the channel, height, and width of the activation map, respectively):

$$\hat{h}^i_{n,c,y,x} = \gamma^i_{c,y,x}(m)\,\frac{h^i_{n,c,y,x} - \mu^i_c}{\sigma^i_c} + \beta^i_{c,y,x}(m),$$

where the channel-wise mean $\mu^i_c$ and standard deviation $\sigma^i_c$ are calculated by:

$$\mu^i_c = \frac{1}{N H^i W^i}\sum_{n,y,x} h^i_{n,c,y,x}, \qquad \sigma^i_c = \sqrt{\frac{1}{N H^i W^i}\sum_{n,y,x}\left(h^i_{n,c,y,x}\right)^2 - \left(\mu^i_c\right)^2}.$$

Here $N$ is the number of samples in a batch, and $H^i$, $W^i$ are the height and width of the activation map in the corresponding layer. The spatially-varying scale $\gamma^i_{c,y,x}(m)$ and bias $\beta^i_{c,y,x}(m)$ are predicted from the semantic label map $m$.
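As a concrete illustration, the channel-wise normalization and spatial modulation described above can be sketched in NumPy. This is a minimal sketch, not the authors' code: in the real SPADE, `gamma` and `beta` are predicted from the semantic label map by two small CNNs, which we replace here with fixed arrays.

```python
import numpy as np

def spade_denorm(h, gamma, beta, eps=1e-5):
    """SPADE-style denormalization of an activation map h of shape (N, C, H, W).

    gamma and beta have shape (C, H, W); in the real model they would be
    predicted from the semantic label map by two-layer CNNs.
    """
    # Channel-wise batch statistics, pooled over samples and spatial positions.
    mu = h.mean(axis=(0, 2, 3), keepdims=True)                    # mu_c
    sigma = np.sqrt(h.var(axis=(0, 2, 3), keepdims=True) + eps)   # sigma_c
    # Normalize, then modulate with the spatially-varying scale and bias.
    return gamma[None] * (h - mu) / sigma + beta[None]
```

With `gamma = 1` and `beta = 0` this reduces to plain batch normalization; the spatially-varying modulation is what lets the semantic layout steer each location of the activation map.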
SPADE Network Architectures In order to achieve comparable quality of synthetic images, we largely follow the generator and discriminator architecture settings in Park et al. (2019). Thanks to the SPADE modules, an encoder is unnecessary for the generator: the simplified, lightweight network takes a semantic label map and a random vector as input and, after passing through alternating SPADE modules and upsampling layers, generates high-quality realistic images. For the discriminator, we also follow the SPADE guideline, which uses the multi-scale discriminator of Wang et al. (2018) and a loss function with a hinge loss term (Park et al. (2019)). SPADE inherits the property of BicycleGAN (Zhu et al. (2017)) and provides an easier, more straightforward way to synthesize multi-modal realistic images. Our AdvSPADE is based on the multi-modal version of SPADE so that it generates varied adversarial examples and increases the coverage of real-world adversarial examples. The implementation details of the generator, discriminator, and image encoder are shown in Fig. 5 and Fig. 6.
AdvSPADE Training Re-normalization Normalization is an essential step in the pre-processing phase. We notice that the SPADE implementation applies z-score normalization with $\mu = (0.5, 0.5, 0.5)$ and $\sigma = (0.5, 0.5, 0.5)$ to map input RGB images into the range $[-1, 1]$:

$$x = \frac{x_{\mathrm{rgb}} - \mu}{\sigma}, \qquad x \in [-1, 1].$$

After passing through the SPADE generator, the generated image $\hat{x}$ has the same range as the normalized input image: $\hat{x} \in [-1, 1]$. In our AdvSPADE, we need to feed the generated image into the target segmentation network $f$. However, this causes a value-range shift, since mainstream semantic segmentation networks do not use the same normalization parameters $(\mu, \sigma)$ as SPADE: in practice, one either computes the corresponding mean and variance vectors for a given dataset, or directly uses the ImageNet (Krizhevsky et al. (2012)) statistics $\mu_f = (0.485, 0.456, 0.406)$, $\sigma_f = (0.229, 0.224, 0.225)$. We need to guarantee that the generated adversarial example fed into the target network has the same value range in the training phase and the testing phase. Let $\mu_f$, $\sigma_f$ be the mean and variance vectors used for the semantic segmentation task. Before feeding generated adversarial examples into the target segmentation network $f$, we perform a re-normalization:

$$\hat{x}' = \frac{(\hat{x}\,\sigma + \mu) - \mu_f}{\sigma_f}.$$
Otherwise, even though the generated unrestricted adversarial example can mislead the target network during training, it will still fail in the testing phase due to a large value shift. We provide a simple quantitative computation to support this claim.
Let $v = \hat{x}_{i,j} \in [-1, 1]$ be the value at the $(i, j)$-th position of the generated image $\hat{x}$. After $\hat{x}$ is saved into a JPEG file as an adversarial example, $v$ is mapped to the 8-bit value $\operatorname{round}\!\big(255 \cdot (v + 1)/2\big) \in [0, 255]$. In the attack phase, this stored value is normalized to $\big(\operatorname{round}(255 \cdot (v+1)/2)/255 - \mu_f\big)/\sigma_f$, which in general differs from $v$. For instance, a mid-gray pixel $v = 0$ is stored as $128$ and, using the ImageNet statistics for the red channel, re-normalized to $(128/255 - 0.485)/0.229 \approx 0.074$. A valid adversarial pixel is thus mapped to a completely different value while attacking, and there is no guarantee that it can still mislead the target network, which shows the necessity of re-normalization while training AdvSPADE.
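The value shift and its fix can be checked numerically. The following sketch uses the standard ImageNet statistics and SPADE's $[-1, 1]$ input range; the helper names are ours, not the paper's.

```python
import numpy as np

# SPADE maps inputs to [-1, 1] with mean 0.5, std 0.5; segmentation networks
# typically normalize with the ImageNet per-channel statistics instead.
MU_SPADE, SIGMA_SPADE = 0.5, 0.5
MU_F = np.array([0.485, 0.456, 0.406])     # ImageNet per-channel mean
SIGMA_F = np.array([0.229, 0.224, 0.225])  # ImageNet per-channel std

def renormalize(x_hat):
    """Re-normalization used during AdvSPADE training: map a generated image
    x_hat in [-1, 1] (shape (N, H, W, C)) to the range the target network expects."""
    rgb = x_hat * SIGMA_SPADE + MU_SPADE   # back to [0, 1]
    return (rgb - MU_F) / SIGMA_F

def jpeg_round_trip(x_hat):
    """Simulate the testing phase: save the generated image to an 8-bit file,
    reload it, and normalize with the segmentation network's statistics."""
    stored = np.round((x_hat * SIGMA_SPADE + MU_SPADE) * 255.0)  # quantize to [0, 255]
    return (stored / 255.0 - MU_F) / SIGMA_F
```

A mid-gray generated pixel ($v = 0$) is stored as 128 and re-normalized to roughly 0.074 in the red channel at test time; feeding the raw $v$ to the target network during training would therefore train against values the network never sees when attacked, while `renormalize` matches the test-time range up to quantization error.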
Robust training details We provide more detailed settings for the robust training with AdvSPADE examples described in Section 4. Note that the target network $f$ is fixed while training AdvSPADE: we use the gradient flow through the target segmentation network but do not update it during the whole training phase. We obtain effective unrestricted adversarial examples after training the model for dozens of epochs; however, the training time cost is non-negligible (around 48 hours on Cityscapes and 200 hours on ADE20K with a single NVIDIA TITAN Xp GPU). According to the definitions of Madry et al. (2018) and Goodfellow et al. (2015), in each adversarial training epoch one applies a certain attack method (PGD, iFGSM, etc.) to generate adversarial examples, augments the dataset with those examples, and then uses the augmented dataset to fine-tune the target network. If we directly replaced the previous attack methods with ours, we would need to train AdvSPADE for dozens of epochs inside every adversarial training epoch in which the target network is updated, so the time cost would grow linearly with the number of adversarial training epochs. To reduce this enormous time cost, we adopt a compromise adversarial training strategy with our unrestricted adversarial examples and achieve promising results. Instead of fixing $f$ during the training stage, we also optimize the parameters of $f$ in each epoch and encourage it to segment the generated adversarial examples correctly. Combining robust training into the AdvSPADE training stage reduces the time cost to an acceptable level (around 50 hours on Cityscapes with a single NVIDIA TITAN Xp GPU). The procedure is shown in Algorithm 1. In this paper, we set . As we report in Section 4, after adversarial training with AdvSPADE-generated adversarial examples, the PGD attack achieves only a limited attack success rate on the robust segmentation network.
Input: initialized generator, discriminator, and image encoder; pre-trained target segmentation network $f$; training dataset.
Output: Updated $f$.
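The combined schedule can be sketched as follows. This is our reading of the strategy described above, not the authors' Algorithm 1 verbatim: the real generator/discriminator and target-network updates are replaced by hypothetical callbacks so only the interleaving is shown.

```python
def combined_robust_training(num_epochs, batches, update_generator, update_target):
    """Train the AdvSPADE generator and fine-tune the target network f in one
    loop, instead of retraining the generator from scratch inside every
    adversarial-training epoch."""
    schedule = []
    for epoch in range(num_epochs):
        for batch in batches:
            # 1) Generator/discriminator step: make the synthesized image both
            #    realistic and misleading for the *current* target network.
            update_generator(batch)
            schedule.append(("G", epoch))
            # 2) Target-network step: encourage f to segment the freshly
            #    generated adversarial example correctly (robust training).
            update_target(batch)
            schedule.append(("f", epoch))
    return schedule
```

Because the target network is updated in the same pass that trains the generator, the per-epoch cost stays close to plain AdvSPADE training rather than growing linearly with the number of adversarial-training epochs.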
Table 5: AMT evaluation results (columns: Datasets, True Positive, True Negative, False Positive, False Negative, Accuracy, Ours vs. Vanilla SPADE).
A.2 Additional Qualitative Results
Table 5 shows the AMT evaluation results, and Figure 7 presents the user interfaces we designed for the two experiments. Figure 8 and Figure 9 show the variety of our unrestricted adversarial examples on the two datasets. Figure 10 compares norm-bounded samples and our unrestricted adversarial examples under the condition that both make the mIoU of the target network drop to the same level on Cityscapes. Figure 11 visualizes the relationship between bound size and mIoU decline on the two datasets; the conclusion is that norm-bounded adversarial examples can achieve the same attack effect as our examples only with a large bound size. Note that our examples do not obey any norm-bound restriction; drawing them on the same coordinate axes as the norm-bounded examples is only for the convenience of comparison. Figures 12 and 13 compare the quality and attack effect of norm-bounded examples with multiple bound sizes against our examples. Figures 14 and 15 show more unrestricted adversarial examples generated by AdvSPADE on the two datasets.