Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks

Woo-Jeoung Nam1, Shir Gur3, Jaesik Choi5, Lior Wolf3,4, Seong-Whan Lee1,2
1Department of Computer and Radio Communications Engineering, Korea University
2Department of Artificial Intelligence, Korea University
3The School of Computer Science, Tel Aviv University
4Facebook AI Research
5Graduate School of Artificial Intelligence, KAIST
Abstract

As Deep Neural Networks (DNNs) have demonstrated superhuman performance in a variety of fields, there is an increasing interest in understanding the complex internal mechanisms of DNNs. In this paper, we propose Relative Attributing Propagation (RAP), which decomposes the output predictions of DNNs with a new perspective of separating the relevant (positive) and irrelevant (negative) attributions according to the relative influence between the layers. The relevance of each neuron is identified with respect to its degree of contribution, separated into positive and negative, while preserving the conservation rule. Considering the relevance assigned to neurons in terms of relative priority, RAP allows each neuron to be assigned a bi-polar importance score concerning the output: from highly relevant to highly irrelevant. Therefore, our method makes it possible to interpret DNNs with much clearer and more focused visualizations of the separated attributions than the conventional explaining methods. To verify that the attributions propagated by RAP correctly account for each meaning, we utilize the following evaluation metrics: (i) Outside-inside relevance ratio, (ii) Segmentation mIOU and (iii) Region perturbation. In all experiments and metrics, we present a sizable gap in comparison to the existing literature. Our source code is available at https://github.com/wjNam/Relative_Attributing_Propagation.

Introduction

Despite the impressive performance, the adoption of Deep Neural Networks (DNNs) is sometimes hindered by a transparency issue that arises from the complex internal structure of DNNs. Many studies have recently attempted to resolve the lack of transparency in DNNs. The attributing methods [3, 19, 12, 25, 20, 15, 17, 28] reveal the significant factors of the input in making decisions by assigning a relevance score to the input layer.

Figure 1: Comparison of the conventional explaining methods and RAP applied to VGG-16. In the previous methods, the attributions are similarly distributed across the entire image. Our RAP clearly distinguishes relevant (red) and irrelevant pixels (blue), placing the relevant attributions on the object, and the irrelevant ones on the background.

To consider the positive and negative contributions of each image location to the output of a DNN, [3] introduced the layer-wise relevance propagation rule, which propagates the relevance from the prediction. However, propagating the positive and negative relevance without considering the amount and direction of the contribution may lead to a defective interpretation. It is necessary to clarify the actual influence of individual units on the output, since the components of the complex inner structure shift and redirect the values being conveyed. Furthermore, the relevance of each neuron is highly dependent on the absolute amount of its contribution, resulting in the positive and negative relevance being strongly correlated.

In this paper, we propose a new perspective for interpreting the relevance of each neuron, accounting for each neuron’s influence among its connected neighbors and allocating relevance according to this relative importance. The main idea of this paper is to change the perspective on relevance from the sign of the contribution to the influence among the neurons. Our method redistributes the relevance by changing the priority and rearranging it into positive and negative values while preserving the conservation rule. This way, the relevance is assigned to each neuron directionally, in line with its degree of importance to the output.

Fig. 1 illustrates the comparison between RAP and conventional methods. While previous work considers directionality based on the sign of the neuron’s contribution, which leads to a similar distribution of the positive and negative attributions, our method assigns the relevance according to the importance of each neuron, which is highly focused on the object. The main contributions of this work are as follows:

  • We propose relative attributing propagation (RAP), a method for attributing the positive and negative relevance to each neuron according to its relative influence among the neurons. We address the phenomenon of relevance dependency, in which the relevance is highly dependent on the amount of each neuron’s contribution, and present the necessity of a new perspective that approaches relevance in terms of priority. We also prevent the risk of degeneracy during propagation by setting the criterion of separation according to the actual contribution between the intermediate layers.

  • We apply the Intersection over Union, Outside-Inside relevance ratio [14] and region perturbation [24] to assess whether the propagated attributions are meaningful. The evaluation shows that the attributions from RAP provide a high objectness score with a clear separation of irrelevant regions, compared to the other explaining methods as well as to objectness methods from the literature.

Related Work

There have been many recent studies on understanding what a DNN model has learned. From the standpoint of interpreting a DNN model, the manner in which a DNN works can be visualized by maximizing the activation of hidden layers [7] or generating salient feature maps [6, 26, 18, 32, 30, 31]. [9] introduced the input switched affine network, which can decompose the contributions of previous characters to the current prediction, and [13] proposed the influence function to understand model behavior, debug models, detect dataset errors, and even create visually indistinguishable training-set attacks. [22] proposed LIME, an algorithm that explains the predictions of a classifier by learning an interpretable model locally around the prediction.

From the standpoint of explaining the decision of a DNN, the contributions of the input are propagated backward, resulting in a redistribution of relevance in the pixel space. Sensitivity analysis visualizes the sensitivities of input images classified by a DNN while explaining the factors that reduce/increase the evidence for the predicted results [4]. [30] proposed a deconvolution method to identify the patterns of a predicted input image from a DNN. Layer-wise relevance propagation (LRP) [3] was introduced to backpropagate the relevance, also called attribution, so that the network output becomes fully redistributed throughout the layers of a DNN. [24] showed that the LRP algorithm qualitatively and quantitatively provides a better explanation than either the sensitivity-based approach or the deconvolution method.

Guided BackProp [27] and Integrated Gradients [28] compute the single and the averaged partial derivatives of the output, respectively, to attribute the prediction of a DNN. Deep Taylor Decomposition [19] is an extension of LRP for interpreting the decision of a DNN by decomposing the activation of a neuron in terms of the contributions from its inputs. DeepLIFT [25] decomposes the output prediction by assigning to each neuron the difference between its activation and a reference activation. [2] approached the problem of the attribution value from a theoretical perspective and formally proved the conditions of equivalence and approximation between four attribution methods: Guided Backprop [27], Integrated Gradients [28], LRP and DeepLIFT. [17] proposed SHAP, which unifies the conventional explaining methods and approximates the Shapley value.

However, no prior study analyzes the problem of ambiguous visualization when dealing with negative relevance. We identify the fundamental causes of this problem and propose a solution that handles the priority of neurons, resulting in a clear separation of the relevant and irrelevant objects.

Background

Notations

Throughout this paper, $f(x)$ denotes the value of the network output before passing through the softmax layer for input $x$. $f^{c}(x)$ represents the value of $f(x)$ corresponding to the prediction class $c$, which constitutes the input relevance for the attributing procedure. A neuron $j$ in layer $l+1$ receives the value $z_{ij}$ from a neuron $i$ in layer $l$, which is obtained by multiplying the activation of the neuron in layer $l$, denoted $x_i$, and the weight $w_{ij}$. These contributions are summed over the relevant neurons to obtain $z_j$, which becomes the activation $x_j$ after adding the bias $b_j$ and applying the activation function $g$:

$$z_{ij} = x_i w_{ij}, \qquad z_j = \sum_i z_{ij} + b_j, \qquad x_j = g(z_j) \qquad (1)$$

We consider the positive and negative parts of the contributions: $z_j = z_j^{+} + z_j^{-}$, where $z_j^{+} = \sum_i z_{ij}^{+} + b_j^{+}$ and $z_j^{-} = \sum_i z_{ij}^{-} + b_j^{-}$, with $z_{ij}^{+} = \max(0, z_{ij})$, $z_{ij}^{-} = \min(0, z_{ij})$, and $b_j^{\pm}$ defined analogously.
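To make this notation concrete, the following minimal numpy sketch (ours, not from the paper) computes the contributions $z_{ij}$ and pre-activations $z_j$ of Eq. 1 for a single fully connected layer, and splits them into their positive and negative parts.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.maximum(rng.normal(size=4), 0.0)   # activations x_i of layer l (ReLU, non-negative)
W = rng.normal(size=(4, 3))               # weights w_ij from layer l to layer l+1
b = rng.normal(size=3)                    # biases b_j of layer l+1

z_ij = x[:, None] * W                     # individual contributions z_ij = x_i * w_ij
z_j = z_ij.sum(axis=0) + b                # pre-activations z_j (Eq. 1)
x_j = np.maximum(z_j, 0.0)                # activations x_j = g(z_j), with g = ReLU

z_ij_pos = np.clip(z_ij, 0.0, None)       # positive parts z_ij^+
z_ij_neg = np.clip(z_ij, None, 0.0)       # negative parts z_ij^-
assert np.allclose(z_ij_pos + z_ij_neg, z_ij)
```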

Layerwise Relevance Propagation (LRP)

LRP finds the parts with high relevance in the input by propagating the relevance from the output to the input. The algorithm is based on the conservation principle, which maintains the relevance in each layer. Let $R_j^{(l+1)}$ denote the relevance of a neuron $j$ in layer $l+1$ and $R_i^{(l)}$ the relevance associated with a neuron $i$ of layer $l$; this conservation takes the form:

$$\sum_i R_i^{(l)} = \sum_j R_j^{(l+1)} \qquad (2)$$

[3] introduced two relevance propagation rules that satisfy Eq. 2. The first rule, called LRP-$\epsilon$, is defined as

$$R_i^{(l)} = \sum_j \frac{z_{ij}}{z_j + \epsilon \cdot \mathrm{sign}(z_j)} \, R_j^{(l+1)} \qquad (3)$$

In this rule, a neuron in layer $l$ receives relevance according to its contribution to the activations of the neurons in layer $l+1$. The constant $\epsilon$ prevents numerical instability in the case in which the denominator becomes zero. The second rule, LRP-$\alpha\beta$, enforces the conservation principle while separating the positive and negative activations in the relevance propagation process:

$$R_i^{(l)} = \sum_j \left( \alpha \cdot \frac{z_{ij}^{+}}{z_j^{+}} - \beta \cdot \frac{z_{ij}^{-}}{z_j^{-}} \right) R_j^{(l+1)} \qquad (4)$$

Recall that $z_j = z_j^{+} + z_j^{-}$. To maintain the total relevance, the parameters are chosen such that $\alpha - \beta = 1$. We refer to the part of the relevance that is multiplied by $\alpha$ ($\beta$) and which is related to the positive (negative) activations as the positive (negative) relevance.
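For illustration, the sketch below implements both rules of Eqs. 3 and 4 for a single fully connected layer in numpy, under the $\alpha - \beta = 1$ convention above and with the bias omitted for brevity; it is a simplified reference, not the implementation used in the experiments.

```python
import numpy as np

def lrp_epsilon(x, W, R_next, eps=1e-9):
    """LRP-eps for one linear layer: redistribute R_next (layer l+1) to layer l (Eq. 3)."""
    z_ij = x[:, None] * W                          # contributions z_ij
    z_j = z_ij.sum(axis=0)                         # pre-activations (bias omitted for brevity)
    denom = z_j + eps * np.sign(z_j)               # stabilized denominator
    return (z_ij / denom).dot(R_next)              # R_i = sum_j z_ij / (z_j + eps*sign) * R_j

def lrp_alpha_beta(x, W, R_next, alpha=2.0, beta=1.0):
    """LRP-alpha-beta: separate positive and negative contributions (alpha - beta = 1, Eq. 4)."""
    z_ij = x[:, None] * W
    z_pos = np.clip(z_ij, 0.0, None)
    z_neg = np.clip(z_ij, None, 0.0)
    s_pos = z_pos.sum(axis=0) + 1e-12              # z_j^+ (small term avoids division by zero)
    s_neg = z_neg.sum(axis=0) - 1e-12              # z_j^-
    return (alpha * z_pos / s_pos - beta * z_neg / s_neg).dot(R_next)

# Toy check of the conservation principle (Eq. 2), up to the stabilizing terms:
rng = np.random.default_rng(0)
x, W = np.maximum(rng.normal(size=4), 0), rng.normal(size=(4, 3))
R_next = np.maximum(rng.normal(size=3), 0)
print(lrp_epsilon(x, W, R_next).sum(), lrp_alpha_beta(x, W, R_next).sum(), R_next.sum())
```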

The Shortcoming of Current Relevance Propagation Methods

Figure 2: An illustration of the way positive and negative contributions are handled in LRP. See text for details.

To motivate our method, we employ a toy example to understand how relevance is propagated in the current methods. While it is presented in the context of LRP-$\alpha\beta$, similar pathologies can be found in other methods, such as integrated gradients and pattern attribution. Fig. 2 presents an example of a forward pass and the backward relevance propagation between two intermediate layers. For illustration purposes, the forward process includes neither a bias term nor batch normalization, and all neurons in the lower layer are non-negative. For simplicity, the absolute values of all weights are identical. A darker color indicates a higher value of the neuron. The positive relevance of an upper-layer neuron is propagated back to the lower-layer neurons in proportion to their positive contributions, so the neuron with the dominant contribution receives the lion’s share; likewise, the negative relevance is concentrated on the same dominant neuron, leaving very low relevance for the others. When the high positive and negative contributions of this neuron are summed in Eq. 4, they cancel out. However, in terms of the amount of the contribution, it is clear that the relevance of this neuron should be high, since it plays a major role in the activations of the upper layer.
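A hypothetical numeric instance of this cancellation (the numbers are ours, chosen for illustration, not taken from Fig. 2): a neuron that dominates both the positive flow to one upper-layer unit and the negative flow to another ends up with less combined relevance under Eq. 4 than a far weaker neighbor.

```python
import numpy as np

# Hypothetical contributions z_ij of three lower-layer neurons to two upper-layer units A and B.
# Neuron 0 dominates both the positive flow to A and the negative flow to B.
z_ij = np.array([[ 4.0, -4.0],    # neuron 0: large positive and large negative contribution
                 [ 1.0,  0.5],    # neuron 1: small contributions
                 [-0.5, -1.0]])   # neuron 2: small negative contributions
R_next = np.array([1.0, 1.0])     # relevance of the two upper-layer units

z_pos, z_neg = np.clip(z_ij, 0, None), np.clip(z_ij, None, 0)
alpha, beta = 2.0, 1.0            # one choice satisfying alpha - beta = 1
R = (alpha * z_pos / z_pos.sum(0) - beta * z_neg / z_neg.sum(0)) @ R_next
print(R)  # -> [ 0.8  2.4 -1.2]: the dominant neuron 0 receives less relevance than neuron 1,
          # because its large positive and negative shares partially cancel in Eq. 4.
```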

The dog sample in Fig. 2 further illustrates this phenomenon. The relevance is propagated recursively using Eq. 4 from the output layer to the input, considering only the positive values, obtaining the positive propagation image. When doing the same process, but only for the negative values, we obtain the negative propagation image. It seems that the same locations of the object receive both high positive relevance and high negative relevance. When these are combined using Eq. 4, they tend to cancel each other. The combination of the positive and negative contributions is illustrated in the third image row, which demonstrates the result of the LRP-$\alpha\beta$ method (following previous work, this notation refers to the LRP method with fixed values of $\alpha$ and $\beta$). Many of the positive relevance values are canceled out by equally large negative values, except for specific locations in which one contribution dominates. However, these locations can be either positive or negative and appear in close proximity to each other, as the zoomed-in subfigure demonstrates.

In RAP, we consider the absolute contribution of each neuron and propagate the relevance according to a novel method we introduce in the next section. The result of our method is shown on the bottom right of Fig. 2. As the result shows, the positive and negative relevances appear in different parts of the image, corresponding to regions of high and low importance.

Relative Attributing Propagation

Motivated by the above issues, our goal is to separate the relatively (un)important neurons according to their influence across the layers. The method has three main steps: (i) absolute influence normalization, (ii) propagation of the relevance, and (iii) uniform shifting. Fig. 3 illustrates the three steps.

Figure 3: Overall structure of RAP algorithm.

Absolute Influence Normalization

We first propagate the relevance value $f^{c}(x)$ to each neuron $i$ of the penultimate layer, according to its actual contribution $z_{ij}$ to the prediction node $j$ in the final layer. Because the bias is a single value, it is possible to consider the relevance of the bias in the previous layer by increasing the contribution of each neuron.

(5)

After applying Eq. 5, the relevance values in the penultimate layer are composed of both positive and negative values.

Next, we normalize the entire relevance by the ratio of the absolute positive and negative values:

(6)

The output of Eq. 6 is the new input relevance for the next propagation, and all relevance values in the penultimate layer are distributed by their relative importance to the output layer, from most influential to barely influential. The greater the neuron’s influence on the contribution, the more positive relevance it is assigned. Eq. 6 is applied only in the first relevance propagation step.
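The following numpy sketch is only our schematic reading of this first step around Eqs. 5 and 6, not the authors’ exact formulas: relevance is first assigned in proportion to the signed contributions to the predicted class (with the single bias value spread evenly over the contributing neurons), and then mapped to absolute values rescaled so that the total relevance is preserved. The function and variable names are ours.

```python
import numpy as np

def absolute_influence_normalization(x, W, b, class_idx):
    """Schematic step (i): signed-contribution relevance, then absolute-value rescaling.

    x: activations of the penultimate layer; W, b: final-layer weights and biases.
    This is one reading of the text, not the exact Eqs. 5-6 of the paper.
    """
    # Signed contributions of each penultimate neuron to the predicted class,
    # with the (single) bias value spread evenly over the contributing neurons.
    z_ic = x * W[:, class_idx] + b[class_idx] / x.size
    # Map to absolute influence, rescaled so the total relevance (the class score)
    # is preserved -- afterwards every relevance value carries the same sign.
    R_abs = np.abs(z_ic) * (z_ic.sum() / np.abs(z_ic).sum())
    return R_abs

rng = np.random.default_rng(0)
x = np.maximum(rng.normal(size=8), 0)               # penultimate activations
W, b = rng.normal(size=(8, 5)), rng.normal(size=5)
R0 = absolute_influence_normalization(x, W, b, class_idx=2)
print(np.allclose(R0.sum(), x @ W[:, 2] + b[2]))    # total relevance equals the class score
```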

Criterion of Relevant Neuron

After changing the relevance to the amount of absolute contribution each neuron has, i.e., when all relevance scores are positive, it is possible to propagate the relevance while maintaining a degree of relative influence. We then apply a uniform shifting to all activated neurons, which causes the neurons with low influence to have negative relevance.

Next, the relevance propagation redistributes the relevance to the next layer through the positive contributions $z_{ij}^{+}$ and $z_j^{+}$. The relevance of layer $l+1$ can be propagated through the positive case of Eq. 7, so that each relevance value is redistributed according to the degree of positive influence. For the negative case, we apply the same procedure, considering the influence among the negatively contributing neurons. Here, the propagated relevance originally corresponds to the ratio of the whole negative contribution in the forward pass. We utilize this amount of relevance to uniformly shift all activated neurons, which converts neurons to negative relevance in the order in which they are close to zero. For each relevance value in layer $l+1$, it is possible to compute the ratio of the positive and negative contributions. However, when we compute the ratio with the original sign, a degeneracy problem occurs, because the absolute value of the numerator can be much larger than that of the denominator. Therefore, the contribution ratio is computed after normalizing with the absolute value of each contribution.

(7)

After propagating the relevance to the negatively contributing neurons, all neurons in layer $l$ receive a relevance value according to the inner ratio of each contribution. However, relevance is not conserved, and there is an over-allocation in layer $l$ in comparison to layer $l+1$. This over-allocated relevance, which originally corresponds to the negative contribution, is utilized for separating the important and unimportant neurons by uniformly shifting all activated neurons. Let $M$ be the number of activated neurons in layer $l$. We evenly subtract the mean over-allocation over these neurons, which shifts some of the relevance scores to the negative region. Specifically, the shift is given by

(8)

In this case, the relevance of the neurons in the two groups becomes

(9)
Figure 4: The visualization of the relevance map of the intermediate layers of the VGG-16 network.

From Eqs. 8 and 9, it is easy to verify that the relevance value is preserved as in Eq. 2. We emphasize that the criterion for separating the important and unimportant neurons is the amount of each contribution, not its direction. Since the activated parts differ between the feature maps of the intermediate layers, we evenly subtract the same value from all activated neurons so as not to lose the important parts of each feature map. The negative input relevance for the next propagation indicates a relatively lower priority for the prediction result. Therefore, Eq. 7 propagates the negative relevance to the connected neurons that contributed to the relatively unimportant neurons, in both positive and negative directions.
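The uniform shifting described here can be pictured with the following schematic numpy sketch (our reading of the text, not the exact Eqs. 8-9): the surplus relevance created by the negative flow is divided evenly over the activated neurons and subtracted, pushing the least influential neurons below zero while keeping the layer total conserved.

```python
import numpy as np

def uniform_shift(R, R_total_above, activated_mask):
    """Schematic uniform shifting (our reading of Eqs. 8-9, not the exact formulas).

    R:              relevance currently assigned to the neurons of layer l
    R_total_above:  total relevance of layer l+1 that must be conserved (Eq. 2)
    activated_mask: boolean mask of the activated neurons in layer l
    """
    over_allocation = R.sum() - R_total_above       # surplus created by the negative flow
    m = activated_mask.sum()                        # number of activated neurons
    shift = over_allocation / m                     # evenly divided over the activated neurons
    return R - shift * activated_mask               # least influential neurons turn negative

R = np.array([5.0, 3.0, 1.0, 0.5, 0.0])             # toy relevance; last neuron is not activated
mask = R > 0
R_new = uniform_shift(R, R_total_above=6.0, activated_mask=mask)
print(R_new, R_new.sum())                           # neuron closest to zero turns negative; sum = 6.0
```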

For the final relevance propagation to the input image layer, we utilize the $z^{\mathcal{B}}$ rule [3], which is commonly used for propagating to the input layer in methods derived from LRP:

$$R_i^{(0)} = \sum_j \frac{x_i w_{ij} - l_i w_{ij}^{+} - h_i w_{ij}^{-}}{\sum_{i'} \left( x_{i'} w_{i'j} - l_{i'} w_{i'j}^{+} - h_{i'} w_{i'j}^{-} \right)} \, R_j^{(1)} \qquad (10)$$

where $x$ is the input image and $l_i$ ($h_i$) denotes the minimum (maximum) value of $x$.
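As a reference for Eq. 10, a minimal numpy sketch of the $z^{\mathcal{B}}$ rule for a linear first layer is given below, assuming per-pixel lower and upper bounds $l_i$ and $h_i$ of the input domain; it is an illustration, not the authors’ implementation.

```python
import numpy as np

def zB_rule(x, W, R_next, low, high, eps=1e-9):
    """z^B rule for propagating relevance to a bounded input layer (Eq. 10)."""
    Wp, Wn = np.clip(W, 0, None), np.clip(W, None, 0)        # w_ij^+ and w_ij^-
    z = x[:, None] * W - low[:, None] * Wp - high[:, None] * Wn
    denom = z.sum(axis=0) + eps * np.sign(z.sum(axis=0))     # stabilized denominator
    return (z / denom) @ R_next

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=6)                               # input pixels
low, high = np.full(6, -1.0), np.full(6, 1.0)                # bounds of the input domain
W = rng.normal(size=(6, 3))
R_next = np.maximum(rng.normal(size=3), 0)
print(np.allclose(zB_rule(x, W, R_next, low, high).sum(), R_next.sum()))  # conservation holds
```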

We investigate the variation of the relevance maps during the propagation process. Fig. 4 shows the relevance maps of activated neurons in the intermediate layers of the VGG-16 network, where each pixel represents the sum of the relevance scores along the channel axis. Here, we note that the intensity of the represented relevance becomes lighter, since the size of the feature map increases when passing the max-pooling layers, so that the relevance is scattered over more neurons. As expected, the positive/negative maps change gradually from the classification layer (left) toward the input layer (right).

Figure 5: Comparison of the results of conventional methods and our RAP in VGG-16 network.

Experimental Evaluations

We extensively verify our method on large scale CNNs, including the popular VGG-16 and ResNet-50 architectures. We utilize the Large Scale Visual Recognition Challenge 2012 (ILSVRC 2012) dataset [23] and the Pascal VOC 2012 dataset [8], which are widely employed and easily accessible. We also use the ImageNet segmentation masks provided by [11]. We implement RAP with both PyTorch and Keras and visualize the explanation as a heatmap. For the evaluation, we utilize the Keras version in order to compare fairly with the other explaining methods, for which we use the implementations introduced in [1]. As is customary in the field (but perhaps unintuitive), the visualized heatmap is represented in seismic colors, where red and blue denote positive and negative values, respectively.

The results of our method are compared with those of existing attribution methods, including integrated gradients, gradient, input*gradient, Guided BackProp, pattern attribution and LRP-$\alpha\beta$. Since [2] proved that LRP-$\epsilon$ and DeepLIFT (Rescale version) are equivalent to input*gradient when the model uses the ReLU activation function, the result of input*gradient represents both of these methods. Since Pattern Attribution is developed for the VGG network, we do not report its results for experiments with ResNet or on the Pascal VOC dataset.

Qualitative Evaluation of Heatmap

To qualitatively evaluate the positive attributions generated by RAP, we examine how similar the areas in which positive attributions converge are to those of the other methods. As the existing methods propagate the positive relevance well, we can utilize them to assess whether our method is consistent in attributing positive relevance. Fig. 5 presents the heatmaps generated by the various methods for images predicted by the VGG-16 network. Fig. 6 illustrates the comparison between LRP-$\alpha\beta$ and RAP on ResNet-50. More qualitative comparisons are illustrated in the supplementary material.

To qualitatively evaluate the negative attributions, we regard the attributions allocated to parts that are not related to the prediction as negative relevance. While our results clearly distinguish between the object and the irrelevant parts as positive and negative attributions, the attributions from the other methods overlap each other and appear as purple, as shown in Fig. 5. The results shown are typical: we qualitatively assessed all images in the validation sets of ILSVRC 2012 and Pascal VOC 2012, and most of them show similarly satisfactory results.

Figure 6: Comparison of visualization results applied on ResNet-50.
Outside-Inside relevance ratio (lower is better):

Method                 VGG-16 ALL   VGG-16 POS   ResNet-50 ALL   ResNet-50 POS
RAP                    0.252        0.341        0.164           0.166
Gradient               -            0.474        -               0.429
Input*Gradient         0.616        0.524        0.302           0.299
Integrated Gradients   -            0.619        -               0.597
LRP (variant 1)        0.989        0.691        0.996           0.689
LRP (variant 2)        1.230        0.827        1.195           0.698
Pattern Attribution    -            0.415        -               -
Guided BackProp        1.069        0.427        1.035           0.296

Segmentation mask (pixel accuracy / mIOU, higher is better):

Method                 ImageNet PIX ACC   ImageNet mIOU   Pascal VOC PIX ACC   Pascal VOC mIOU
RAP                    79.23              62.23           73.91                55.60
Gradient               75.40              55.78           70.86                49.82
Input*Gradient         72.95              50.86           69.43                46.85
Integrated Gradients   70.01              49.30           68.14                46.07
LRP (variant 1)        66.38              44.01           50.01                31.69
LRP (variant 2)        66.52              45.90           52.38                34.39
Pattern Attribution    76.84              58.05           -                    -
Guided BackProp        71.98              49.87           66.92                43.63

Table 1: For the Outside-inside relevance ratio, ALL (POS) denotes the ratio when considering all (only positive) relevance on the ImageNet dataset. The segmentation mask results show the pixel accuracy and mIOU of the relevance heatmaps for the VGG-16 network.
Method                   mIOU
Guillaumin et al. [10]   57.30
DeepMask [21]            58.69
DeepSaliency [16]        62.12
Xiong et al. [29]        64.22
Ours                     62.23
Table 2: Quantitative mIOU results on the ImageNet segmentation task. Our method is highly competitive with the state of the art, despite not using additional supervision.

Quantitative Assessment of Attributions

It is not trivial to assess the quantitative performance of the methods designed for explaining DNN models, since each method poses a different assumption and is designed for slightly different objectives. In this work, we utilized three methods that are commonly used to evaluate objectness and relevance: (i) Outside-Inside ratio, (ii) Pixel accuracy and Intersection of Union, and (iii) Region perturbation.

[14] introduced a method for assessing how well the attributions are focused on the object, by computing the relevance inside and outside of the bounding box. We extend this method to consider the effect of correctly and wrongly distributed negative relevance. However, since the bounding box does not perfectly delineate the object corresponding to the prediction, we additionally utilize segmentation masks and metrics to assess how correctly the positive attributions are distributed on the target object. [24] introduced a method for quantitatively assessing explanation methods, which utilizes a region perturbation process that progressively distorts pixels according to the heatmap, formalized as the area over the perturbation curve (AOPC). To evaluate how the negative attributions are distributed over regions irrelevant to the prediction, we perturb the least relevant pixels first (LeRF) and investigate the degradation of the accuracy.

In our experiment, we extract 10,000 correctly classified images from the validation set of ImageNet and employ the specified bounding boxes (for the perturbation test). For the objectness scores, we utilize the 4,276 images from the ImageNet dataset with segmentation masks and 1,449 images from the Pascal VOC validation set.

Objectness of Positive Attributions

To verify how the attributions are distributed on the prediction object, we assess the outside-inside relevance ratio of attributions [14], utilizing the bounding box. We extend the original metric to evaluate the positive and negative relevance simultaneously.

$$\mu = \frac{\frac{1}{|P_{\mathrm{out}}|} \sum_{p \in P_{\mathrm{out}}} R_p}{\frac{1}{|P_{\mathrm{in}}|} \sum_{p \in P_{\mathrm{in}}} R_p} \qquad (11)$$

Here, $|\cdot|$ is the cardinality operator, and $P_{\mathrm{in}}$ and $P_{\mathrm{out}}$ denote the sets of pixels inside and outside of the bounding box, respectively. When positive (negative) relevance is attributed outside the bounding box, the value of $\mu$ is increased (decreased). By contrast, if positive (negative) relevance is distributed inside the bounding box, the ratio is decreased (increased). For the conventional methods which only consider positive relevance, this metric becomes identical to the original metric of [14].
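A minimal sketch of this metric, assuming a 2D relevance map and a rectangular bounding box given as index ranges (the box convention is ours):

```python
import numpy as np

def outside_inside_ratio(R, box):
    """Eq. 11: mean relevance outside the bounding box over mean relevance inside.

    R:   2D relevance map; box: (y0, y1, x0, x1) index ranges of the bounding box.
    """
    y0, y1, x0, x1 = box
    mask_in = np.zeros_like(R, dtype=bool)
    mask_in[y0:y1, x0:x1] = True
    mean_in = R[mask_in].mean()                  # (1/|P_in|)  * sum of relevance inside
    mean_out = R[~mask_in].mean()                # (1/|P_out|) * sum of relevance outside
    return mean_out / mean_in

R = np.random.default_rng(0).normal(size=(224, 224))
print(outside_inside_ratio(R, box=(60, 180, 50, 170)))   # lower is better
```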

We present the results with and without considering the negative relevance. In Tab. 1, this is denoted as ALL when considering both and as POS when discarding the negative relevance. For both cases in Tab. 1, RAP provides the best scores, indicating that the relevance attribution is better distributed inside and outside the bounding box than with the existing methods.

Figure 7: The results of the negative perturbation on VGG-16 and ResNet-50. At each step, the 100 least relevant pixels (LeRF) are perturbed to zero. RAP shows a uniquely robust behavior under this perturbation.

Furthermore, we utilize the segmentation masks provided for the ImageNet and Pascal VOC 2012 validation sets to evaluate our explainability performance in terms of objectness. The evaluation is done by first producing an explainability map, followed by thresholding, considering two cases: (i) only positive relevance, in which case the mean value is taken as a threshold, and (ii) positive and negative relevance, where the threshold is set to zero. By thresholding, we produce a segmentation mask, considering values above the threshold as 1, and 0 otherwise. Tab. 1 shows that our method greatly outperforms previous work in both Pixel Accuracy and mean-IOU. These results are stable for other choices of the threshold as well, such as various percentiles and using the median. As can be seen, the output relevance is highly correlated with the input image objectness.
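A sketch of this evaluation protocol as we read it, with the thresholding choice and the binary pixel accuracy / mIOU computation made explicit (details such as the exact averaging over images are assumptions):

```python
import numpy as np

def segmentation_scores(R, gt_mask, use_negative=True):
    """Threshold a relevance map into a binary mask and score it against gt_mask.

    use_negative=True thresholds at zero (positive vs. negative relevance);
    otherwise the mean of R is used, following the two cases described above.
    """
    thr = 0.0 if use_negative else R.mean()
    pred = R > thr                                        # 1 above the threshold, 0 otherwise
    pix_acc = (pred == gt_mask).mean()
    ious = []
    for cls in (0, 1):                                    # background and object classes
        inter = np.logical_and(pred == cls, gt_mask == cls).sum()
        union = np.logical_or(pred == cls, gt_mask == cls).sum()
        ious.append(inter / union if union > 0 else 1.0)
    return pix_acc, float(np.mean(ious))                  # pixel accuracy, mean IoU

R = np.random.default_rng(0).normal(size=(224, 224))
gt = np.zeros((224, 224), dtype=bool); gt[60:180, 50:170] = True
print(segmentation_scores(R, gt))
```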

Interestingly, as shown in Tab. 2, our method is extremely competitive with respect to objectness methods from the literature, which are trained with additional supervision, such as bounding boxes. This is demonstrated using the accepted metric of mIOU on ImageNet, which is commonly used for benchmarking generic object segmentation [5].

Interfering Negative Attributions

When a DNN makes a correct prediction, removing the contribution of pixels irrelevant to the prediction should not significantly change the prediction accuracy or relevance values. A small decrease in accuracy is inevitable, because the distortion of color and shape could affect the classification, resulting in unpredictable noise and adversarial effects. We discuss these issues in more detail in the supplementary material. Also, it is important to note that removing pixels corresponding to the negative attributions does not always increase the accuracy, because the negative relevance of an incorrect prediction does not denote positive relevance for the true label.

Fig. 7 shows the results of applying LeRF perturbation to VGG-16 and ResNet-50. At each step, we perturb 100 pixels corresponding to the negative relevance, for a total distortion of 4,000 pixels. As shown in the results, while other methods that consider negative relevance show a rapid decrease in accuracy, RAP is rarely affected by the removal of its negative attributions. Thus, RAP distinguishes between the relevant and unimportant parts of the input image without overlap between them.
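A schematic of the LeRF perturbation loop, assuming a generic model_fn callable that returns class scores (the interface and the dummy model below are hypothetical; the reported results measure accuracy over 10,000 images rather than a single confidence curve):

```python
import numpy as np

def lerf_perturbation_curve(model_fn, image, R, step=100, n_steps=40):
    """Least-relevant-first perturbation: zero out pixels in order of increasing
    relevance and record the model's score for the originally predicted class.

    model_fn: callable returning class scores for an image (assumed interface).
    image:    array of shape (C, H, W); R: relevance map of shape (H, W).
    """
    order = np.argsort(R.ravel())               # least relevant pixels first
    x = image.copy()
    target = int(np.argmax(model_fn(x)))        # originally predicted class
    scores = []
    for s in range(n_steps):
        idx = order[s * step:(s + 1) * step]    # next batch of least-relevant pixels
        ys, xs = np.unravel_index(idx, R.shape)
        x[..., ys, xs] = 0.0                    # perturb the pixels to zero
        scores.append(model_fn(x)[target])      # should stay nearly flat for RAP
    return np.array(scores)

# Stand-in usage with a dummy "model" (any callable with the same interface works):
rng = np.random.default_rng(0)
img, relevance = rng.uniform(size=(3, 64, 64)), rng.normal(size=(64, 64))
dummy_model = lambda x: np.array([x.mean(), 1.0 - x.mean()])
curve = lerf_perturbation_curve(dummy_model, img, relevance, step=100, n_steps=10)
```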

Conclusion

In this paper, we propose RAP, a new method for interpreting neurons in terms of their importance to the predictions of DNNs, by assigning relevance scores according to the influence of each neuron. By approaching the relevance in terms of influence among the neurons, it is possible to separate the regions that are relevant and irrelevant to the prediction. We evaluate our method quantitatively and qualitatively to verify that the attributions correctly account for their intended meaning. For the quantitative evaluation, we utilize the Outside-Inside ratio, mIOU and region perturbation metrics to confirm how the attributions focus on the (ir)relevant objects according to their assigned relevance scores. Overall, the experiments show that RAP has the desired characteristics: (i) it clearly distinguishes positive (relevant) from negative (irrelevant) attributions, and (ii) it is indicative of objectness and can separate the main image object from the other, irrelevant regions.

Acknowledgments

This work was supported by the Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-01779, A machine learning and statistical inference framework for explainable artificial intelligence & No. 2019-0-01371, Development of brain-inspired AI with human-like intelligence) and the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant ERC CoG 725974).

References

  • [1] M. Alber, S. Lapuschkin, P. Seegerer, M. Hägele, K. T. Schütt, G. Montavon, W. Samek, K. Müller, S. Dähne, and P. Kindermans (2019) iNNvestigate neural networks!. Journal of Machine Learning Research 20 (93), pp. 1–8.
  • [2] M. Ancona, E. Ceolini, C. Oztireli, and M. Gross (2018) Towards better understanding of gradient-based attribution methods for deep neural networks. In Proceedings of the International Conference on Learning Representations.
  • [3] S. Bach, A. Binder, G. Montavon, F. Klauschen, K. Müller, and W. Samek (2015) On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS ONE 10 (7), pp. e0130140.
  • [4] D. Baehrens, T. Schroeter, S. Harmeling, M. Kawanabe, K. Hansen, and K. Müller (2010) How to explain individual classification decisions. Journal of Machine Learning Research 11 (Jun), pp. 1803–1831.
  • [5] N. Cho, A. Yuille, and S. Lee (2017) A novel linelet-based representation for line segment detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 40 (5), pp. 1195–1208.
  • [6] P. Dabkowski and Y. Gal (2017) Real time image saliency for black box classifiers. In Advances in Neural Information Processing Systems, pp. 6970–6979.
  • [7] D. Erhan, Y. Bengio, A. Courville, and P. Vincent (2009) Visualizing higher-layer features of a deep network. University of Montreal 1341 (3), pp. 1.
  • [8] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman (2015) The pascal visual object classes challenge: a retrospective. International Journal of Computer Vision 111 (1), pp. 98–136.
  • [9] J. N. Foerster, J. Gilmer, J. Sohl-Dickstein, J. Chorowski, and D. Sussillo (2017) Input switched affine networks: an RNN architecture designed for interpretability. In Proceedings of the International Conference on Machine Learning, pp. 1136–1145.
  • [10] M. Guillaumin, D. Küttel, and V. Ferrari (2014) ImageNet auto-annotation with segmentation propagation. International Journal of Computer Vision 110 (3), pp. 328–348.
  • [11] M. Guillaumin, D. Küttel, and V. Ferrari (2014) ImageNet auto-annotation with segmentation propagation. International Journal of Computer Vision 110, pp. 328–348.
  • [12] P. Kindermans, K. T. Schütt, M. Alber, K. Müller, D. Erhan, B. Kim, and S. Dähne (2017) Learning how to explain neural networks: PatternNet and PatternAttribution. arXiv preprint arXiv:1705.05598.
  • [13] P. W. Koh and P. Liang (2017) Understanding black-box predictions via influence functions. In Proceedings of the 34th International Conference on Machine Learning, pp. 1885–1894.
  • [14] S. Lapuschkin, A. Binder, G. Montavon, K. Müller, and W. Samek (2016) Analyzing classifiers: fisher vectors and deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2912–2920.
  • [15] S. Lapuschkin, S. Wäldchen, A. Binder, G. Montavon, W. Samek, and K. Müller (2019) Unmasking clever hans predictors and assessing what machines really learn. Nature Communications 10, pp. 1096.
  • [16] X. Li, L. Zhao, L. Wei, M. Yang, F. Wu, Y. Zhuang, H. Ling, and J. Wang (2016) DeepSaliency: multi-task deep neural network model for salient object detection. IEEE Transactions on Image Processing 25 (8), pp. 3919–3930.
  • [17] S. M. Lundberg and S. Lee (2017) A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pp. 4765–4774.
  • [18] A. Mahendran and A. Vedaldi (2016) Visualizing deep convolutional neural networks using natural pre-images. International Journal of Computer Vision 120 (3), pp. 233–255.
  • [19] G. Montavon, S. Lapuschkin, A. Binder, W. Samek, and K. Müller (2017) Explaining nonlinear classification decisions with deep taylor decomposition. Pattern Recognition 65, pp. 211–222.
  • [20] G. Montavon, W. Samek, and K. Müller (2018) Methods for interpreting and understanding deep neural networks. Digital Signal Processing 73, pp. 1–15.
  • [21] P. O. Pinheiro, R. Collobert, and P. Dollár (2015) Learning to segment object candidates. In Advances in Neural Information Processing Systems, pp. 1990–1998.
  • [22] M. T. Ribeiro, S. Singh, and C. Guestrin (2016) "Why should I trust you?": explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144.
  • [23] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei (2015) ImageNet large scale visual recognition challenge. International Journal of Computer Vision 115 (3), pp. 211–252.
  • [24] W. Samek, A. Binder, G. Montavon, S. Lapuschkin, and K. Müller (2017) Evaluating the visualization of what a deep neural network has learned. IEEE Transactions on Neural Networks and Learning Systems 28 (11), pp. 2660–2673.
  • [25] A. Shrikumar, P. Greenside, and A. Kundaje (2017) Learning important features through propagating activation differences. In Proceedings of the 34th International Conference on Machine Learning, pp. 3145–3153.
  • [26] K. Simonyan, A. Vedaldi, and A. Zisserman (2013) Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034.
  • [27] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller (2014) Striving for simplicity: the all convolutional net. arXiv preprint arXiv:1412.6806.
  • [28] M. Sundararajan, A. Taly, and Q. Yan (2017) Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning, pp. 3319–3328.
  • [29] B. Xiong, S. D. Jain, and K. Grauman (2018) Pixel objectness: learning to segment generic objects automatically in images and videos. arXiv preprint arXiv:1808.04702.
  • [30] M. D. Zeiler and R. Fergus (2014) Visualizing and understanding convolutional networks. In European Conference on Computer Vision, pp. 818–833.
  • [31] B. Zhou, D. Bau, A. Oliva, and A. Torralba (2018) Interpreting deep visual representations via network dissection. IEEE Transactions on Pattern Analysis and Machine Intelligence.
  • [32] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba (2016) Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921–2929.