Detecting Invasive Ductal Carcinoma With Semi-Supervised Conditional GANs


Invasive ductal carcinoma (IDC) comprises nearly 80% of all breast cancers. The detection of IDC is a necessary preprocessing step in determining the aggressiveness of the cancer, determining treatment protocols, and predicting patient outcomes, and is usually performed manually by an expert pathologist. Here, we describe a novel algorithm for automatically detecting IDC using semi–supervised conditional generative adversarial networks (cGANs). The framework is simple and effective at improving scores on a range of metrics over a baseline CNN.


Jeremiah W. Johnson
Applied Engineering & Sciences
University of New Hampshire
Manchester, NH 03101

Keywords: deep learning, histopathology, invasive ductal carcinoma, generative adversarial network, neural network

Acknowledgment: The author gratefully acknowledges NVIDIA Corp for GPU donation to support this research.

1 Introduction

Invasive ductal carcinoma (IDC) comprises nearly 80% of all breast cancers, making it the most common phenotypic subtype [1]. The aggressiveness of a sample is usually determined by performing a visual analysis of tissue slides from regions where the carcinoma has been detected. As such, the detection of invasive ductal carcinoma is a necessary preprocessing step for determining aggressiveness, selecting treatment protocols, and predicting patient outcomes. Done manually, this is a time-consuming and challenging process, as it involves the pathologist scanning large regions of mostly healthy tissue to identify and delineate the relatively smaller regions of IDC. Because precise delineation of the IDC is a critical factor in the assessment of the aggressiveness of the malignancy, there is a significant need for highly accurate automatic methods for detecting IDCs.

Figure 1: cGAN architecture for semi–supervised training. In this case, the conditioning data is the class of the data to be generated by the generator; either IDC or healthy. Note that in contrast to the classical cGAN framework, the discriminator is not provided access to the conditioning data.

There exist many algorithms that have been somewhat successful at automatic detection of IDCs [2, 3]. Over the past few years, methods from deep learning, especially convolutional neural networks (CNNs), have been at the forefront of investigations into automatic detection of IDC in histopathology images [2, 4]. A convolutional neural network, in general, consists of a sequence of linear and nonlinear transformations that transforms the input data into a set of features (a ‘learned representation’) suitable for the task at hand [5]. Convolutional neural networks were designed for classifying images, and the performance of CNNs on a range of challenging tasks in computer vision is state-of-the-art, often meeting or exceeding human performance [6, 7, 8, 9, 10]. Moreover, CNNs require little in the way of manual feature engineering, typically the most time–consuming and difficult aspect of machine learning: aside from minor preprocessing steps, the model learns the features necessary for the task at hand via the training process, which is typically a variant of stochastic gradient descent.

Generative Adversarial Networks (GANs) were introduced in 2014 [11]. A GAN consists of a pair of models, a generator and a discriminator, that compete in a minimax game: the generator attempts to generate synthetic data that is sufficiently similar to real data to fool the discriminator, and the discriminator tries to distinguish real data from synthetic data. By alternating training between the two networks, each improves until, ideally, a Nash equilibrium is reached.

GANs are often thought of as generative models, but they can be used in other ways, including for classification tasks. For example, the discriminator in a GAN can be augmented with a second network head in order to predict not only whether input data is real or generated, but also to predict the class into which the input data falls. In this regime, the generator serves to augment the existing dataset by providing the discriminator with additional synthetic training data [12].

In this paper, we describe a novel algorithm for automatic detection of IDC in histopathology images. The proposed algorithm uses a GAN framework where the discriminator is trained to identify IDC in both real and synthetic generated data and to distinguish real from synthetic data. The generator in the GAN framework is conditioned by class. The framework is simple and effective at improving scores on a range of metrics over a baseline CNN. The outline of this paper is as follows: Section 2 provides technical background on the model, while in Section 3 the data, the methodology, and the experiments carried out are detailed; in Section 4 we present conclusions and paths for future work.

2 Background

Figure 2: Examples of images generated by the conditional GAN during the training process. The images in the top row were generated by conditioning the generator on IDC, while those in the second row are images produced by the generator when conditioned on healthy tissue.

2.1 Generative Adversarial Networks

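A generative adversarial network, or GAN, consists of two neural network models, a generator $G$ and a discriminator $D$, that compete in an adversarial game: the task of the generator is, given some random input $z$, to produce an output $G(z)$ such that the discriminator cannot distinguish $G(z)$ from a sample taken from the source domain. As $G$ and $D$ are trained in turn, $G$ learns to model the true distribution $p_{\mathrm{data}}$ of the source domain and $D$ learns to evaluate the divergence between $p_{\mathrm{data}}$ and the generative distribution $p_G$, resulting in a competition to reach a Nash equilibrium that can be expressed by the training procedure. The value function for this minimax game is given in Equation 1 below.

\[ \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right] + \mathbb{E}_{\tilde{x} \sim p_G}\left[\log\left(1 - D(\tilde{x})\right)\right] \tag{1} \]

or, equivalently,

\[ \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right], \]

where $z$ is noise.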
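As a rough illustration of the value function above, the following sketch estimates it by Monte Carlo from lists of discriminator outputs. The function name `gan_value` and the sample values are hypothetical and chosen only for illustration; they do not come from the experiments in this paper.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of the GAN value function.

    d_real: discriminator outputs D(x) on samples from the data distribution.
    d_fake: discriminator outputs D(G(z)) on generated samples.
    All values are probabilities strictly between 0 and 1.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell real from generated outputs 0.5 everywhere.
confused = gan_value([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

# A discriminator that separates the two sets well scores closer to 0.
confident = gan_value([0.9, 0.95, 0.99], [0.1, 0.05, 0.01])
```

When the discriminator outputs 0.5 on every sample, the estimate equals $-\log 4$, the value of the game at the equilibrium described in [11]; a discriminator that separates real from generated samples pushes the value toward 0, which is what the generator tries to prevent.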

GAN training is known to often be unstable and prone to issues such as mode collapse, but in recent years several notable developments including spectral normalization and gradient penalty have significantly improved the stability of GAN training [13, 14].

2.2 Conditional GANs

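A conditional GAN, or cGAN, is a GAN designed to incorporate conditional information [15]. cGANs have been shown to be effective at tasks such as class–conditional image synthesis and image–to–image translation; in these cases, both the generator and the discriminator are provided the conditional information, usually via concatenation with the input data, though other methods have been proposed and shown to be more effective in specific contexts [16, 15, 17, 18]. The value function for a cGAN is given below, where $y$ represents the conditioning data.

\[ \min_G \max_D V(D, G) = \mathbb{E}_{(x, y) \sim p_{\mathrm{data}}}\left[\log D(x, y)\right] + \mathbb{E}_{y \sim p_y,\ \tilde{x} \sim p_G(\cdot \mid y)}\left[\log\left(1 - D(\tilde{x}, y)\right)\right] \]

or, equivalently,

\[ \min_G \max_D V(D, G) = \mathbb{E}_{(x, y) \sim p_{\mathrm{data}}}\left[\log D(x, y)\right] + \mathbb{E}_{z \sim p_z,\ y \sim p_y}\left[\log\left(1 - D(G(z, y), y)\right)\right]. \]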
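Conditioning by concatenation can be sketched in a few lines. The example below is a minimal illustration, assuming a one-hot encoding of the two classes and a Gaussian noise vector; the latent dimension of 8 and the helper names `one_hot` and `generator_input` are hypothetical and are not taken from the model described in this paper.

```python
import random

CLASSES = ("healthy", "IDC")  # the two conditioning classes used in this paper

def one_hot(label, classes=CLASSES):
    """Encode a class label as a one-hot list."""
    return [1.0 if c == label else 0.0 for c in classes]

def generator_input(label, latent_dim=8, rng=random):
    """Concatenate a Gaussian noise vector with the one-hot class encoding."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return z + one_hot(label)

rng = random.Random(0)  # seeded so the sketch is reproducible
vec = generator_input("IDC", latent_dim=8, rng=rng)
```

The resulting vector has `latent_dim` noise entries followed by the class encoding, so the generator sees both the random input and the class it is asked to produce.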
2.3 Semi–Supervised Training with GANs

GANs are most often used as generative models: after training, the discriminator is discarded, and the generator is used to generate synthetic samples that reflect the distribution of the source data; see Figure 2. However, it is possible to modify the discriminator in the GAN by augmenting it with a network head that predicts the classification of the data, as illustrated in Figure 1. After training, the generator is discarded, and the discriminator can be used to classify samples from the source data. It has been shown that the semi–supervised training regime can be particularly effective in situations where the amount of training data is small [12].
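One way to picture a discriminator with an added classification head is through its loss on a single sample. The sketch below assumes the two head losses are binary cross-entropies that are simply added; that combination rule, the `class_weight` parameter, and the function names are assumptions made for illustration and are not stated in the text.

```python
import math

def bce(p, target):
    """Binary cross-entropy for one predicted probability p and a 0/1 target."""
    eps = 1e-12  # guards against log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

def discriminator_loss(p_real, p_idc, is_real, is_idc, class_weight=1.0):
    """Sum of the real/synthetic loss and the IDC/healthy loss for one sample.

    p_real: output of the real-vs-synthetic head.
    p_idc:  output of the IDC-vs-healthy head.
    class_weight: hypothetical factor balancing the two terms.
    """
    return bce(p_real, is_real) + class_weight * bce(p_idc, is_idc)

# A real IDC patch scored correctly by both heads gives a small loss.
good = discriminator_loss(p_real=0.9, p_idc=0.8, is_real=1, is_idc=1)

# The same patch scored incorrectly by both heads gives a larger loss.
bad = discriminator_loss(p_real=0.2, p_idc=0.1, is_real=1, is_idc=1)
```

For generated samples, the class target would be the class on which the generator was conditioned, which is how synthetic data can act as additional labeled training data for the classification head.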

Figure 3: Examples of the crops used to train the model. The images in the top row are of IDC, while those in the second row are images of healthy tissue.

3 Methodology and Results

3.1 Data

The data used for the experiments described here are the publicly available data first introduced in [2]. These data consist of digitized histopathology slides from 162 women diagnosed with IDC at the Hospital of the University of Pennsylvania and The Cancer Institute of New Jersey. The slides were digitized via a whole–slide scanner at 40x magnification (0.25 μm/pixel resolution), and each whole–slide image was downsampled by a factor of 16:1 to a resolution of 4 μm/pixel. The ground truth annotations were obtained manually by an expert pathologist. The data were publicly released not in their original format, but rather as RGB patches of 50 × 50 pixels; see Figure 3. In total, the dataset contains 277,524 patches, of which 78,786, or 28%, are IDC, while the remaining 198,738 patches, or 72%, are healthy tissue. Note that the annotations were performed at 2x magnification or less, resulting in relatively coarse annotations that occasionally include some stromal or non–invasive tissue. 20% of the data were held out for testing, and the model was trained on the remaining 80% of the dataset.
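The figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the counts and resolutions stated in this section; the majority-class baseline is an added observation that follows from those counts, not a result reported by the paper.

```python
idc_patches = 78_786
healthy_patches = 198_738
total_patches = idc_patches + healthy_patches

idc_share = idc_patches / total_patches          # about 0.28
healthy_share = healthy_patches / total_patches  # about 0.72

# Scanning at 0.25 um/pixel and downsampling 16:1 gives the stated resolution.
downsampled_um_per_pixel = 0.25 * 16

# A classifier that always predicts "healthy" would already reach this accuracy,
# one reason to look at metrics beyond accuracy on this dataset.
majority_baseline_accuracy = max(idc_share, healthy_share)
```

Because roughly 72% of the patches are healthy tissue, accuracy alone says little about how well IDC is detected, which is consistent with reporting balanced accuracy, precision, recall, specificity, and F1 in Table 1.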

3.2 Architecture and Training

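The model developed for these experiments is a conditional GAN based on the DCGAN framework [19], where the generator is conditioned on the class of the input data, and the discriminator receives no conditioning, giving the modified value function

\[ \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right] + \mathbb{E}_{y \sim p_y,\ \tilde{x} \sim p_G(\cdot \mid y)}\left[\log\left(1 - D(\tilde{x})\right)\right] \]

or, equivalently,

\[ \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z,\ y \sim p_y}\left[\log\left(1 - D(G(z, y))\right)\right]. \]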
The generator uses a sequence of transposed convolutions to upsample the input latent vector, an architecture sometimes referred to as a fully convolutional neural network [20]. The discriminator is a convolutional neural network with two network heads, one that predicts the presence of IDC, and the other that predicts whether the observed data is real or synthetic. Both the generator and the discriminator use five layers, transposed convolutional in the generator and convolutional in the discriminator, all with the same kernel size. The number of filters in each convolutional layer of the discriminator is scaled by a width multiplier $k$ used to increase the capacity of the network; the number of filters in each transposed convolutional layer in the generator is calculated analogously, mutatis mutandis. The generator uses ReLU activations, and the discriminator uses leaky ReLU activations. No pooling layers were used in the discriminator; downsampling was accomplished by adjusting the stride of the convolutional layers as needed. Each discriminator network head consisted of a single fully connected layer. Spectral normalization was applied to all convolutional and transposed convolutional layers except the first and the last layers in the discriminator, and gradient penalty was used to mitigate mode collapse [13, 14].
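To see how strided convolutions can replace pooling, the sketch below traces the spatial size of a patch through five convolutional layers. The kernel size of 4, stride of 2, and padding of 1 are hypothetical values chosen for illustration (they are common DCGAN-style settings, not values confirmed by the text), and the 50-pixel input width is the patch size of the public release of the dataset.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def downsample_path(size, n_layers, kernel=4, stride=2, padding=1):
    """Sizes after each of n_layers identical strided convolutions.

    The default kernel, stride, and padding are assumptions for illustration.
    """
    sizes = [size]
    for _ in range(n_layers):
        sizes.append(conv_out(sizes[-1], kernel, stride, padding))
    return sizes

sizes = downsample_path(50, n_layers=5)
```

Under these assumed settings the feature map shrinks to a single spatial position after five layers, which is the kind of output a single fully connected head can consume directly. Different strides in individual layers would change the path.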

The network was trained for 200 epochs with a minibatch size of 128 using the Adam optimizer [21]. The learning rate was held fixed for the first 100 epochs, then reduced linearly to 0 over the remaining 100 epochs. The training data were augmented with vertically and horizontally flipped images. Traditional training loss curves tend to be uninformative when training GANs, so in addition to monitoring the generator and discriminator losses during training, samples from the generator outputs were periodically assessed qualitatively to ensure that the generator was learning throughout the training process; samples produced by the generator are provided in Figure 2. The model was implemented using the open–source machine learning framework PyTorch [22]. The model was trained on a workstation running Ubuntu 18.04 using two Titan Xp GPUs. Results are presented in Table 1.
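The learning-rate schedule described above can be written as a small function. This is a sketch under stated assumptions: epochs are 0-indexed, the decay reaches exactly 0 at epoch 200, and `BASE_LR` is a placeholder value that is not taken from the paper.

```python
def learning_rate(epoch, base_lr, fixed_epochs=100, decay_epochs=100):
    """Learning rate at a 0-indexed epoch: constant, then linear decay to 0."""
    if epoch < fixed_epochs:
        return base_lr
    remaining = fixed_epochs + decay_epochs - epoch
    return base_lr * (max(remaining, 0) / decay_epochs)

BASE_LR = 1e-4  # placeholder value, not taken from the paper

# Rate at every epoch boundary from 0 through 200.
schedule = [learning_rate(e, BASE_LR) for e in range(201)]
```

With this convention the rate stays at `BASE_LR` through epoch 100, is half of `BASE_LR` at epoch 150, and is 0 at epoch 200. In PyTorch, the same multiplier could be supplied to a scheduler such as `torch.optim.lr_scheduler.LambdaLR`.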

Metric        CNN ([2])   cGAN_1    cGAN_2    cGAN_4
Accuracy      NA          86.68%    87.45%    88.33%
BAC           84.23%      81.15%    83.19%    83.54%
Precision     65.40%      81.94%    80.85%    84.39%
Recall        79.60%      68.29%    73.29%    72.41%
Specificity   NA          94.00%    93.09%    94.66%
F1            71.80%      74.50%    76.88%    77.94%
Table 1: Results from semi–supervised experiments. BAC denotes balanced accuracy. In expressions of the form cGAN_k, the value k is the width multiplier described in Section 3.2.
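The balanced accuracy and F1 columns of Table 1 follow from the other columns, which gives a quick consistency check. The sketch below recomputes both from the reported precision, recall, and specificity of the three cGAN models; the function and variable names are illustrative.

```python
def balanced_accuracy(recall, specificity):
    """Mean of the true positive rate and the true negative rate."""
    return (recall + specificity) / 2.0

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# (precision, recall, specificity) in percent, copied from Table 1.
reported = {
    "cGAN_1": (81.94, 68.29, 94.00),
    "cGAN_2": (80.85, 73.29, 93.09),
    "cGAN_4": (84.39, 72.41, 94.66),
}

# Maps each model to its recomputed (balanced accuracy, F1) in percent.
derived = {
    name: (balanced_accuracy(r, s), f1_score(p, r))
    for name, (p, r, s) in reported.items()
}
```

The recomputed values agree with the BAC and F1 rows of the table to within rounding, for example about 81.15 and 74.49 for cGAN_1.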

4 Conclusions and Future Work

In this paper we present the results of an investigation into the use of GANs and conditional GANs for automatic detection of IDC in breast histopathology images. An advantage of a GAN or cGAN framework is that the generator learns during the training process to generate data that follows the distribution of the training data, thus supplementing the training dataset with additional high–quality synthetic training data. These models achieve high accuracy, precision, specificity, and F1–scores, and competitive balanced accuracy scores, while having lower sensitivity (recall) than a conventional convolutional neural network model.

There are several avenues for future work in this vein. One of the advantages of semi–supervised GAN training is that in situations with limited data, it is often possible to achieve superior performance over other methods on similarly sized data. As noted in [18], most GAN discriminators are rather shallow in comparison to modern classifier architectures. Semi–supervised training with a fixed dataset may allow one to increase the capacity of the discriminator over a base classifier CNN and thereby improve performance beyond the results described here.

The ability to condition the generator of a conditional GAN on some supplementary data, such as the class of the data to be generated, is a noteworthy aspect of this model. Future investigations will explore other conditioning approaches; one possibility, for example, is to condition the generator based on both the class to be generated as well as the location of the patch in the whole slide image.

The algorithm described here has relatively high precision, but lower recall/sensitivity than other automatic detection methods based on CNNs. Increasing the recall of the algorithm while maintaining the precision, perhaps by weighting the loss function, is another potentially fruitful avenue for future work. Finally, in a purely theoretical direction, there is still much work to be done to understand the complex interplay between adversarial and classification loss in semi–supervised GAN training.
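One simple form of loss weighting is to scale the positive-class term of the binary cross-entropy, so that a missed IDC patch costs more than a false alarm. The sketch below illustrates the idea only; the weight of 3.0 and the function name are hypothetical, and this is not a method evaluated in the paper.

```python
import math

def weighted_bce(p, target, pos_weight=1.0):
    """Binary cross-entropy with extra weight on the positive (IDC) class."""
    eps = 1e-12  # guards against log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(pos_weight * target * math.log(p)
             + (1 - target) * math.log(1.0 - p))

# A missed IDC patch: the target is 1 but the predicted probability is low.
missed_idc_plain = weighted_bce(0.2, 1, pos_weight=1.0)
missed_idc_weighted = weighted_bce(0.2, 1, pos_weight=3.0)

# A false alarm of the same severity is unaffected by the positive weight.
false_alarm = weighted_bce(0.8, 0, pos_weight=3.0)
```

With a positive weight above 1, gradients from missed IDC patches are amplified relative to those from false alarms, which would be expected to trade some precision for recall. PyTorch exposes a comparable option through the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`.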

