Detecting Invasive Ductal Carcinoma With Semi-Supervised Conditional GANs
Abstract
Invasive ductal carcinoma (IDC) comprises nearly 80% of all breast cancers. Detecting IDC is a necessary preprocessing step in determining the aggressiveness of the cancer, setting treatment protocols, and predicting patient outcomes, and is usually performed manually by an expert pathologist. Here, we describe a novel algorithm for automatically detecting IDC using semi–supervised conditional generative adversarial networks (cGANs). The framework is simple and effective at improving scores on a range of metrics over a baseline CNN.
Jeremiah W. Johnson (the author gratefully acknowledges NVIDIA Corp for a GPU donation supporting this research)
Applied Engineering & Sciences
University of New Hampshire
Manchester, NH 03101
Keywords: deep learning, histopathology, invasive ductal carcinoma, generative adversarial network, neural network
1 Introduction
Invasive ductal carcinoma (IDC) comprises nearly 80% of all breast cancers, making it the most common phenotypic subtype [1]. The aggressiveness of a sample is usually determined by performing a visual analysis of tissue slides from regions where the carcinoma has been detected. As such, the detection of invasive ductal carcinoma is a necessary preprocessing step for determining aggressiveness and treatment protocols and for predicting patient outcomes. Done manually, this is a time-consuming and challenging process, as the pathologist must scan large regions of mostly healthy tissue to identify and delineate the relatively small regions of IDC. Because precise delineation of the IDC is a critical factor in assessing the aggressiveness of the malignancy, there is a significant need for highly accurate automatic methods of IDC detection.

Many algorithms have had some success at automatic detection of IDC [2, 3]. Over the past few years, methods from deep learning, especially convolutional neural networks (CNNs), have been at the forefront of investigations into automatic detection of IDC in histopathology images [2, 4]. A convolutional neural network, in general, consists of a sequence of linear and nonlinear transformations that transforms the input data into a set of features (a ‘learned representation’) suitable for the task at hand [5]. Convolutional neural networks were designed for classifying images, and the performance of CNNs on a range of challenging tasks in computer vision is state-of-the-art, often meeting or exceeding human performance [6, 7, 8, 9, 10]. Moreover, CNNs require little in the way of manual feature engineering, typically the most time–consuming and difficult aspect of machine learning: aside from minor preprocessing steps, the model learns the features necessary for the task via the training process, which is typically a variant of stochastic gradient descent.
Generative Adversarial Networks (GANs) were introduced in 2014 [11]. A GAN consists of a pair of models, a generator and a discriminator, that compete in a minimax game: the generator attempts to generate synthetic data that is sufficiently similar to real data to fool the discriminator, while the discriminator tries to distinguish real data from synthetic data. By alternating training between the two networks, each improves until a Nash equilibrium is reached.
GANs are often thought of as generative models, but they can be used in other ways, including for classification tasks. For example, the discriminator in a GAN can be augmented with a second network head in order to predict not only whether input data is real or generated, but also to predict the class into which the input data falls. In this regime, the generator serves to augment the existing dataset by providing the discriminator with additional synthetic training data [12].
In this paper, we describe a novel algorithm for automatic detection of IDC in histopathology images. The proposed algorithm uses a GAN framework where the discriminator is trained both to identify IDC in real and synthetically generated data and to distinguish real data from synthetic data. The generator in the GAN framework is conditioned by class. The framework is simple and effective at improving scores on a range of metrics over a baseline CNN. The outline of this paper is as follows: Section 2 provides technical background on the model; Section 3 details the data, the methodology, and the experiments carried out; Section 4 presents conclusions and paths for future work.
2 Background
2.1 Generative Adversarial Networks
A generative adversarial network, or GAN, consists of two neural network models, a generator $G$ and a discriminator $D$, that compete in an adversarial game: the task of the generator is, given some random input $z$, to produce an output $G(z)$ such that the discriminator cannot distinguish $G(z)$ from a sample taken from the source domain. As $G$ and $D$ are trained in turn, $G$ learns to model the true distribution $p_{\mathrm{data}}$ of the source domain and $D$ learns to evaluate the divergence between $p_{\mathrm{data}}$ and the generative distribution $p_g$, resulting in a competition to reach a Nash equilibrium that can be expressed by the training procedure. The value function for this minimax game is given in Equation 1 below.
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{\tilde{x} \sim p_g(\tilde{x})}\left[\log\left(1 - D(\tilde{x})\right)\right] \tag{1}$$

or, equivalently,

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right] \tag{2}$$

where $z \sim p_z(z)$ is noise.
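As a concrete illustration, the value function above can be written as a pair of binary cross–entropy losses. The sketch below assumes hypothetical PyTorch modules `D` (mapping an image to a real/fake logit) and `G` (mapping noise to an image); it is a minimal sketch, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def gan_losses(D, G, real, z):
    # D and G are hypothetical modules: D -> single real/fake logit,
    # G -> synthetic image from noise z.
    fake = G(z)
    real_logits = D(real)
    fake_logits = D(fake.detach())

    # Discriminator ascends log D(x) + log(1 - D(G(z))) (Eq. 2),
    # written here as binary cross-entropy on logits.
    d_loss = (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
              + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))

    # Generator descends log(1 - D(G(z))); in practice the
    # non-saturating surrogate -log D(G(z)) below is the common choice.
    g_logits = D(fake)
    g_loss = F.binary_cross_entropy_with_logits(g_logits, torch.ones_like(g_logits))
    return d_loss, g_loss
```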
2.2 Conditional GANs
A conditional GAN, or cGAN, is a GAN designed to incorporate conditional information [15]. cGANs have been shown to be effective at tasks such as class–conditional image synthesis and image–to–image translation; in these cases, both the generator and the discriminator are provided the conditional information, usually via concatenation with the input data, though other methods have been proposed and shown to be more effective in specific contexts [16, 15, 17, 18]. The value function for a cGAN is given below, where $y$ represents the conditioning data.
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x \mid y)\right] + \mathbb{E}_{\tilde{x} \sim p_g(\tilde{x})}\left[\log\left(1 - D(\tilde{x} \mid y)\right)\right] \tag{3}$$

or, equivalently,

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x \mid y)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z \mid y) \mid y)\right)\right] \tag{4}$$
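In code, concatenation–style conditioning amounts to joining a label embedding to the generator's noise input. The minimal sketch below uses illustrative sizes; the embedding dimension and class count are assumptions, not the paper's values.

```python
import torch
import torch.nn as nn

class ConditionedInput(nn.Module):
    # Minimal sketch of concatenation-style conditioning: the class
    # label y is embedded and concatenated with the noise vector z
    # before the result enters the generator.
    def __init__(self, n_classes=2, embed_dim=16):
        super().__init__()
        self.embed = nn.Embedding(n_classes, embed_dim)

    def forward(self, z, y):
        # z: (batch, z_dim) noise; y: (batch,) integer class labels
        return torch.cat([z, self.embed(y)], dim=1)  # (batch, z_dim + embed_dim)
```

The discriminator can be conditioned analogously, for example by concatenating the embedding with its flattened features or broadcasting it as extra input channels.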
2.3 Semi–Supervised Training with GANs
GANs are most often used as generative models: after training, the discriminator is discarded, and the generator is used to generate synthetic samples that reflect the distribution of the source data; see Figure 2. However, it is possible to modify the discriminator in the GAN by augmenting it with a network head that predicts the classification of the data, as illustrated in Figure 1. After training, the generator is discarded, and the discriminator can be used to classify samples from the source data. It has been shown that this semi–supervised training regime can be particularly effective in situations where the amount of training data is small [12].
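To make the two–headed design concrete, a minimal PyTorch sketch is given below. The backbone depth and layer sizes are illustrative, not the architecture used in Section 3.

```python
import torch.nn as nn

class TwoHeadDiscriminator(nn.Module):
    # Sketch of the augmented discriminator of Figure 1: a shared
    # convolutional backbone with one head for the adversarial
    # real/fake prediction and a second head for the class label.
    def __init__(self, n_classes=2, width=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, width, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(width, 2 * width, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.adv_head = nn.Linear(2 * width, 1)          # real vs. synthetic
        self.cls_head = nn.Linear(2 * width, n_classes)  # class prediction

    def forward(self, x):
        h = self.backbone(x)
        return self.adv_head(h), self.cls_head(h)
```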
3 Methodology and Results
3.1 Data
The data used for the experiments described here are the publicly available data first introduced in [2] (https://andrewjanowczyk.com/wp-static/IDC_regular_ps50_idx5.zip). These data consist of digitized histopathology slides from 162 women diagnosed with IDC at the Hospital of the University of Pennsylvania and The Cancer Institute of New Jersey. The slides were digitized via a whole–slide scanner at 40x magnification (0.25 µm/pixel resolution), and each whole–slide image was downsampled by a factor of 16:1 to a resolution of 4 µm/pixel. The ground truth annotations were obtained manually by an expert pathologist. The data were publicly released not in their original format, but rather as RGB patches of 50×50 pixels; see Figure 3. In total, the dataset contains 277,524 patches, of which 78,786, or 28%, are IDC, while the remaining 198,738 patches, or 72%, are healthy tissue. Note that the annotations were performed at 2x magnification or less, resulting in relatively coarse annotations that occasionally include some stromal or non–invasive tissue. 20% of the data were held out for testing, and the model was trained on the remaining 80% of the dataset.
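For illustration, a hedged sketch of assembling the 80/20 split is given below. The directory layout and file format are assumptions about the released archive, not a documented API; adjust the glob patterns to the actual structure after extraction.

```python
import glob
import random

# Assumed layout: patches stored as PNG files, with the IDC label
# encoded as a 0/1 subdirectory. This is a guess about the archive.
negatives = glob.glob("IDC_regular_ps50_idx5/**/0/*.png", recursive=True)
positives = glob.glob("IDC_regular_ps50_idx5/**/1/*.png", recursive=True)

paths = [(p, 0) for p in negatives] + [(p, 1) for p in positives]
random.seed(0)
random.shuffle(paths)

# 80/20 train/test split, as described in the text.
split = int(0.8 * len(paths))
train, test = paths[:split], paths[split:]
print(f"{len(positives) / len(paths):.0%} IDC-positive")  # ~28% in the full set
```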
3.2 Architecture and Training
The model developed for these experiments is a conditional GAN based on the DCGAN framework [19], where the generator is conditioned on the class of the input data and the discriminator receives no conditioning, giving the modified value function

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{\tilde{x} \sim p_g(\tilde{x} \mid y)}\left[\log\left(1 - D(\tilde{x})\right)\right] \tag{5}$$

or, equivalently,

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z \mid y))\right)\right] \tag{6}$$
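A sketch of the corresponding semi–supervised discriminator objective is given below: an adversarial term plus classification terms on both real and generated patches, as described in Section 1. The equal weighting of the two terms is an assumption.

```python
import torch
import torch.nn.functional as F

def discriminator_losses(adv_logits_real, adv_logits_fake,
                         cls_logits_real, cls_logits_fake, labels):
    # Adversarial real/fake term, as in Eq. (5)/(6).
    adv = (F.binary_cross_entropy_with_logits(adv_logits_real,
                                              torch.ones_like(adv_logits_real))
           + F.binary_cross_entropy_with_logits(adv_logits_fake,
                                                torch.zeros_like(adv_logits_fake)))
    # Classification term on real patches and on generated patches,
    # which inherit the class label y the generator was conditioned on.
    cls = F.cross_entropy(cls_logits_real, labels) + F.cross_entropy(cls_logits_fake, labels)
    # Equal weighting of the two terms is an assumption, not the
    # paper's stated loss.
    return adv + cls
```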
The generator uses a sequence of transposed convolutions to upsample the input latent vector, an arrangement sometimes referred to as a fully convolutional neural network [20]. The discriminator is a convolutional neural network with two network heads, one that predicts the presence of IDC and one that predicts whether the observed data is real or synthetic. Both the generator and the discriminator use five transposed convolutional or convolutional layers. The number of filters in each convolutional layer of the discriminator scales with a width multiplier $w$ used to increase the capacity of the network; the number of filters in each transposed convolutional layer of the generator is calculated analogously, mutatis mutandis. The generator uses ReLU activations; the discriminator uses leaky ReLU activations. No pooling layers were used in the discriminator; downsampling was accomplished by adjusting the stride of the convolutional layers as needed. The discriminator network heads each consist of a single fully connected layer. Spectral normalization was applied to all convolutional and transposed convolutional layers except the first and the last layers in the discriminator, and a gradient penalty was used to mitigate mode collapse [13, 14].
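The sketch below illustrates a generator of this DCGAN–style form and the application of spectral normalization. Filter counts, kernel sizes, the batch normalization layers, the tanh output, and the 64×64 output size follow DCGAN conventions and are assumptions, not the paper's exact (extraction–damaged) values.

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

def make_generator(z_dim=100, base=64):
    # Five strided transposed convolutions upsample a (z_dim, 1, 1)
    # latent vector to a 3-channel image. Sizes are illustrative.
    return nn.Sequential(
        nn.ConvTranspose2d(z_dim, 8 * base, 4, 1, 0), nn.BatchNorm2d(8 * base), nn.ReLU(),
        nn.ConvTranspose2d(8 * base, 4 * base, 4, 2, 1), nn.BatchNorm2d(4 * base), nn.ReLU(),
        nn.ConvTranspose2d(4 * base, 2 * base, 4, 2, 1), nn.BatchNorm2d(2 * base), nn.ReLU(),
        nn.ConvTranspose2d(2 * base, base, 4, 2, 1), nn.BatchNorm2d(base), nn.ReLU(),
        nn.ConvTranspose2d(base, 3, 4, 2, 1), nn.Tanh(),
    )

# Spectral normalization, applied per the text to all but the first
# and last convolutional layers of the discriminator:
normalized_conv = spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1))
```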
The network was trained for 200 epochs with a minibatch size of 128 using the Adam optimizer [21]. The learning rate was held fixed for the first 100 epochs, then reduced linearly to 0 over the remaining 100 epochs. The training data was augmented with vertically and horizontally flipped images. Traditional training loss curves tend to be uninformative when training GANs, so in addition to monitoring the generator and discriminator losses during training, samples from the generator were periodically assessed qualitatively to ensure that the generator was learning throughout the training process; generated samples are provided in Figure 2. The model was implemented using the open–source machine learning framework PyTorch [22] and trained on a workstation running Ubuntu 18.04 using two Titan Xp GPUs. Results are presented in Table 1.
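The learning–rate schedule can be implemented with a simple multiplicative lambda. In the sketch below, the initial rate and the stand-in model parameters are placeholders, since the exact values did not survive extraction.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

base_lr = 2e-4  # placeholder: the fixed initial rate is not given in the text
params = [torch.nn.Parameter(torch.zeros(1))]  # stand-in for model parameters
optimizer = torch.optim.Adam(params, lr=base_lr)

def lr_lambda(epoch):
    # Fixed for epochs 0-99, then linear decay to 0 over epochs 100-199.
    return 1.0 if epoch < 100 else max(0.0, (200 - epoch) / 100)

scheduler = LambdaLR(optimizer, lr_lambda)
for epoch in range(200):
    # ... one epoch of alternating D/G updates on minibatches of 128 ...
    scheduler.step()
```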
Table 1: Test–set results for the baseline CNN [2] and the proposed cGAN models (the cGAN subscript denotes the width multiplier $w$).

| Metric | CNN ([2]) | cGAN_1 | cGAN_2 | cGAN_4 |
|---|---|---|---|---|
| Accuracy | NA | 86.68% | 87.45% | 88.33% |
| BAC (balanced accuracy) | 84.23% | 81.15% | 83.19% | 83.54% |
| Precision | 65.40% | 81.94% | 80.85% | 84.39% |
| Recall | 79.60% | 68.29% | 73.29% | 72.41% |
| Specificity | NA | 94.00% | 93.09% | 94.66% |
| F1 | 71.80% | 74.50% | 76.88% | 77.94% |
4 Conclusions and Future Work
In this paper we present the results of an investigation into the use of GANs and conditional GANs for automatic detection of IDC in breast histopathology images. The advantage of a GAN or cGAN framework is that the generator learns during training to generate data that follows the distribution of the training data, thus supplementing the training dataset with additional high–quality synthetic training data. These models achieve high accuracy, precision, specificity, and F1–scores, and competitive balanced accuracy scores, while being less sensitive than a conventional convolutional neural network model.
There are several avenues for future work in this vein. One advantage of semi–supervised GAN training is that in limited-data settings it is often possible to achieve performance superior to other methods trained on similarly sized datasets. As noted in [18], most GAN discriminators are rather shallow in comparison to modern classifier architectures. Semi–supervised training with a fixed dataset may allow one to increase the capacity of the discriminator over a baseline classifier CNN and thereby improve performance beyond the results described here.
The ability to condition the generator of a conditional GAN on some supplementary data, such as the class of the data to be generated, is a noteworthy aspect of this model. Future investigations will explore other conditioning approaches; one possibility, for example, is to condition the generator based on both the class to be generated as well as the location of the patch in the whole slide image.
The algorithm described here has relatively high precision, but lower recall/sensitivity than other automatic detection methods based on CNNs. Increasing the recall of the algorithm while maintaining the precision, perhaps by weighting the loss function, is another potentially fruitful avenue for future work. Finally, in a purely theoretical direction, there is still much work to be done to understand the complex interplay between adversarial and classification loss in semi–supervised GAN training.
References
- [1] Carol DeSantis, Rebecca Siegel, Priti Bandi, and Ahmedin Jemal, “Breast cancer statistics, 2011,” CA: A Cancer Journal for Clinicians, vol. 61, no. 6, pp. 408–418, 2011.
- [2] Angel Cruz-Roa, Ajay Basavanhally, Fabio González, Hannah Gilmore, Michael Feldman, Shridar Ganesan, Natalie Shih, John Tomaszewski, and Anant Madabhushi, “Automatic detection of invasive ductal carcinoma in whole slide images with convolutional neural networks,” in Medical Imaging 2014: Digital Pathology, Metin N. Gurcan and Anant Madabhushi, Eds. International Society for Optics and Photonics, 2014, vol. 9041, pp. 1 – 15, SPIE.
- [3] Teresa Araújo, Guilherme Aresta, Eduardo Castro, José Rouco, Paulo Aguiar, Catarina Eloy, António Polónia, and Aurélio Campilho, “Classification of breast cancer histology images using convolutional neural networks,” PloS one, vol. 12, no. 6, pp. e0177544, 2017.
- [4] Andrew Janowczyk and Anant Madabhushi, “Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases,” Journal of pathology informatics, vol. 7, 2016.
- [5] Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016, http://www.deeplearningbook.org.
- [6] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, USA, 2012, NIPS’12, pp. 1097–1105, Curran Associates Inc.
- [7] Karen Simonyan and Andrew Zisserman, “Very deep convolutional networks for large-scale image recognition,” CoRR, vol. abs/1409.1556, 2014.
- [8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, “Deep residual learning for image recognition,” CoRR, vol. abs/1512.03385, 2015.
- [9] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” CoRR, vol. abs/1502.01852, 2015.
- [10] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross B. Girshick, “Mask R-CNN,” CoRR, vol. abs/1703.06870, 2017.
- [11] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, “Generative adversarial nets,” Advances in Neural Information Processing Systems 27, pp. 2672–2680, 2014.
- [12] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen, “Improved techniques for training GANs,” in Advances in Neural Information Processing Systems, 2016, pp. 2234–2242.
- [13] Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida, “Spectral normalization for generative adversarial networks,” in International Conference on Learning Representations, 2018.
- [14] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville, “Improved training of Wasserstein GANs,” in Advances in Neural Information Processing Systems, 2017, pp. 5767–5777.
- [15] Mehdi Mirza and Simon Osindero, “Conditional generative adversarial nets,” arXiv preprint arXiv:1411.1784, 2014.
- [16] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros, “Image-to-image translation with conditional adversarial networks,” arXiv preprint, 2017.
- [17] Takeru Miyato and Masanori Koyama, “cGANs with projection discriminator,” in International Conference on Learning Representations, 2018.
- [18] Faisal Mahmood, Xu Wenhao, Jeremiah W. Johnson, Nicholas J. Durr, and Alan Yuille, “Structured prediction using cGANs with fusion discriminator,” Proceedings of the Workshop on Deep Generative Models for Highly Structured Data at ICLR, 2019.
- [19] Alec Radford, Luke Metz, and Soumith Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv preprint arXiv:1511.06434, 2015.
- [20] Jonathan Long, Evan Shelhamer, and Trevor Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3431–3440.
- [21] Diederik P Kingma and Jimmy Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
- [22] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer, “Automatic differentiation in PyTorch,” 2017.