Deep Learning Assessment of Tumor Proliferation in Breast Cancer Histological Images
Current analysis of tumor proliferation, the most salient prognostic biomarker for invasive breast cancer, is limited to subjective mitosis counting by pathologists in localized regions of tissue images. This study presents the first data-driven integrative approach to characterize the severity of tumor growth and spread on a categorical and molecular level, utilizing multiple biologically salient deep learning classifiers to develop a comprehensive prognostic model. Our approach achieves pathologist-level performance on three-class categorical tumor severity prediction. It additionally pioneers prediction of molecular expression data from a tissue image, obtaining a Spearman’s rank correlation coefficient of 0.60 with ex vivo mean calculated RNA expression. Furthermore, our framework is applied to identify over two hundred unprecedented biomarkers critical to the accurate assessment of tumor proliferation, validating our proposed integrative pipeline as the first to holistically and objectively analyze histopathological images.
Breast cancer is the most common cancer in women worldwide, with over 1.2 million new cases diagnosed in 2012. Cancer assessment is influenced by environmental and clinical factors, but it is universally accepted that tumor proliferation speed (tumor growth) is an important biomarker representative of progression rate and outcomes [2, 3]. Specifically, high proliferation speed is associated with worse outcomes. The assessment of this biomarker critically influences patient treatment plans, allowing patients with more aggressive tumors to be treated with the corresponding therapy.
In a clinical setting, tumor proliferation is manually assessed by pathologists, who count mitotic figures in hematoxylin & eosin (H&E) stained histological slide preparations examined under a high-powered microscope. Although ubiquitous, mitosis counting has been reported to suffer from reproducibility problems that reflect the underlying subjectivity of the process. In addition, the simple methodology of pathologist mitosis counting and subsequent thresholding fails to account for pathological features including tumor extent, tissue density, and relative mitosis density. The mean expression of eleven prognostic RNA sequences, an objective measure of the proliferation score, requires extensive ex vivo molecular tests, relegating current practices to an inadequate and subjective pathologist diagnosis [3, 4].
This study introduces a comprehensive deep learning-based pipeline constructing and unifying models across several associated tasks of tumor localization, mitotic figure identification, and high-level feature extraction to classify categorical tumor grades (0–2) and predict RNA proliferation speed scores from histological whole slide images (WSIs). Furthermore, we aim to identify salient biomarkers related to tumor diagnosis to serve as the basis for future studies. The data-driven integrative approach presented here is generalizable and will be useful to analyze other cancerous tumors.
II-A Dataset Description
Three datasets were used in this study to train primary and auxiliary models. Our primary evaluation dataset consisted of 500 whole slide images with magnification levels from 10× to 40× that were annotated with a tumor score based on mitosis counting by pathologists and a molecular (RNA) proliferation score. We additionally used 73 2 magnified images annotated with mitotic figures and 148 WSIs partially annotated with tumor (high cellularity) regions for supplemental training.
II-B Data Mining, WSI Normalization, and Tissue Extraction
As our WSIs originated from three international pathology centers, they exhibited different staining methods. To ensure that such variations in the color and intensity of H&E staining would not hamper the effectiveness of subsequent quantitative image analysis, we employed the WSI Color Standardization (WSICS) procedure of Bejnordi et al. Normalizing each WSI and the images from auxiliary datasets ensured that our subsequent methods exhibited stain invariance between tissue preparation methods. Tissue regions were next extracted from each stain-standardized input WSI using Otsu’s method on an HSV representation of the original RGB image [7, 8]. We subsequently removed small artifacts and expanded the remaining regions via binary dilation to obtain a holistic tissue mask.
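The tissue extraction step can be illustrated with a minimal sketch. It assumes that Otsu’s threshold is applied to the saturation channel of the HSV image (the text does not say which channel was used) and it omits the small-artifact removal step. The function names `otsu_threshold`, `tissue_mask`, and `dilate` are hypothetical and chosen for illustration only.

```python
import colorsys


def otsu_threshold(values, bins=256):
    """Return the cut in [0, 1] that maximizes between-class variance."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(bins):
        w0 += hist[t]            # pixels in bins <= t (background class)
        sum0 += t * hist[t]
        w1 = total - w0          # pixels in bins > t (foreground class)
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, t
    return (best_bin + 1) / bins


def tissue_mask(rgb_rows):
    """Threshold the saturation channel (assumed choice) with Otsu's method."""
    sat = [[colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in row]
           for row in rgb_rows]
    cut = otsu_threshold([s for row in sat for s in row])
    return [[s >= cut for s in row] for row in sat]


def dilate(mask, radius=1):
    """Binary dilation with a square structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = True
    return out
```

On a real WSI this would be run on a heavily downsampled level and with an optimized image library; the pure-Python loops here are only meant to make the logic explicit.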
II-C Network Construction
Our pipeline (Fig. 1) next performed three tasks: metastatic tumor localization, mitotic figure identification, and WSI general feature extraction. We used a magnification level of 10× for the tumor localization process to identify high-level patterns, and we conducted mitosis identification and feature extraction at the 40× level for detailed analysis.
II-C1 Metastatic Tumor Localization
With tissue regions extracted and normalized from each input WSI, the next step was to identify candidate regions for mitotic activity indicative of tumor proliferation. Such regions are biologically characterized as high-cellularity areas, with proliferative activity often concentrated at the edges of the tissue abnormality.
Fig. 2 depicts the four employed network architectures, the first three of which are recognized as state-of-the-art convolutional neural networks (CNNs) for object recognition [9, 10, 11]. Each CNN generated output tumor probability heatmaps with a “sliding-window” approach, classifying overlapping tissue patches for tumor probability and assigning the resulting value to the center pixel of each patch. The fourth network, which we named LocNet, reduced the number of free parameters and operated on a fully convolutional paradigm. LocNet allowed for arbitrarily sized inputs (we use patches) and produced downsampled corresponding heatmaps for each patch as opposed to singular probability outputs. We resized and stitched these probability heatmaps over each WSI to rapidly generate a comprehensive heatmap.
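The sliding-window heatmap generation described above can be sketched as follows. The classifier is passed in as a plain function; `mean_intensity` is a toy stand-in for a trained CNN, and all names here are hypothetical rather than taken from the authors’ implementation.

```python
def sliding_window_heatmap(image, patch, stride, classify):
    """Classify overlapping patches and write each score at the patch center.

    `image` is a list of rows; `classify` maps a patch (list of rows) to a
    probability. Pixels that are never a patch center stay None.
    """
    h, w = len(image), len(image[0])
    heat = [[None] * w for _ in range(h)]
    half = patch // 2
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            window = [row[left:left + patch] for row in image[top:top + patch]]
            heat[top + half][left + half] = classify(window)
    return heat


def mean_intensity(window):
    """Toy classifier (assumption): mean pixel value stands in for P(tumor)."""
    return sum(map(sum, window)) / (len(window) * len(window[0]))
```

Each output pixel costs one full forward pass in this scheme, which is the overhead a fully convolutional network such as LocNet avoids by emitting a heatmap per patch.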
Network training was framed as an active learning problem, as each annotated image contained a non-exhaustive list of tumor regions. We therefore separated the process into two components, with the first stage defining annotated patches as positive and identifying a random sample of remaining tissue patches as negative. Heatmaps were subsequently produced using each model, and additional regions predicted as positive with over 95% confidence were appended to the initial positive training set. All models were retrained with the refined data; subsequent second stage results better eliminated regions misclassified in the first stage. The trained models were used both to identify tumor regions within which to perform mitotic figure identification and to provide informative features regarding tumor shape, density, area, extent, and location.
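A minimal sketch of the two-stage procedure is given below. It assumes that `train` returns a callable model producing a positive-class probability, and that patches promoted to the positive set are dropped from the negative sample before retraining (the text does not specify this detail). `toy_train` is a hypothetical stand-in that treats each patch as a single scalar score.

```python
import math
import random


def two_stage_training(annotated, remaining, train, n_negative,
                       threshold=0.95, seed=0):
    """Stage 1: annotated vs. random negatives. Stage 2: retrain on refined data."""
    rng = random.Random(seed)
    negatives = rng.sample(remaining, n_negative)
    model = train(annotated, negatives)
    # Promote unannotated patches predicted positive with high confidence.
    confident = [p for p in remaining if model(p) > threshold]
    positives = list(annotated) + confident
    # Assumption: promoted patches should no longer serve as negatives.
    negatives = [p for p in negatives if p not in confident]
    return train(positives, negatives), positives, negatives


def toy_train(positives, negatives):
    """Toy learner: a steep logistic curve centered between the class means."""
    midpoint = (sum(positives) / len(positives)
                + sum(negatives) / len(negatives)) / 2
    return lambda score: 1.0 / (1.0 + math.exp(-50.0 * (score - midpoint)))
```

With real networks, `train` would wrap a full training run and `model(p)` would read the heatmap value for patch `p`.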
|Network||# Layers||# Params||RF||Input||Output||Propagation Time|
|Metastatic Tumor Localization (II-C1)|
|GoogLeNet ||27||5.97 M||49||1 2||562.62 ms|
|ResNet-34 ||34||13.9 M||49||1 2||204.55 ms|
|VGG-13 ||39||134.3 M||9||1 2||208.85 ms|
|LocNet||12||4.55 M||9||1k 1k||77.47 ms|
|Mitotic Figure Identification (II-C2)|
|DenseNet ||118||1.02 B||9||1 2||500.09 ms|
|GoogLeNet ||27||5.97 M||49||1 2||562.62 ms|
|GoogLeNet FCN||27||5.97 M||49||1k 1k||340.53 ms|
|LocNet||12||4.55 M||9||1k 1k||77.47 ms|
|MitosNet||6||21.1 K||16||1k 1k||44.21 ms|
|General Feature Extraction (II-C3)|
|Tumor + 3C/P||18||5.16 M||64||1k 1k||492.4 ms|
|Mitosis + 3C/P||12||5.93 M||64||1k 1k||143.1 ms|
II-C2 Mitotic Figure Identification
We next constructed mitotic figure detectors to identify biologically salient features within tumor areas. Due to aberrant tumor chromosomal makeup, mitotic figure appearances may vary from typical examples of hyperchromatic objects with an absence of a clear nuclear membrane and hairy protrusions around edges.
Current state-of-the-art methods in the field of computational mitosis identification are trained and evaluated on high quality, standardized, and localized tumor regions with well-defined mitotic figures [16, 17, 18]. However, such methods fail to generalize to our WSI dataset as they often simply learn standardized color and texture filters from their homogeneous training dataset. Prior methods are additionally unable to rapidly generate mitosis identification results over an entire whole slide image. Here, we introduce and apply robust color, texture, and scale invariant mitosis detection networks that rapidly identify mitoses on individual patches and WSIs.
The five networks employed are depicted in the middle section of Fig. 2. DenseNet, requiring the most parameters, constructed repeated connections between network layers to develop a robust approach. Although the GoogLeNet architecture and its modified fully convolutional counterpart performed reasonably well, the additional complexity encoded within the network architecture excessively distilled the already small mitotic figures. To remedy this issue, we applied the LocNet model and developed a specialized architecture called MitosNet. LocNet and MitosNet exhibited fewer (yet more fine-tuned) layers, capturing the variance between mitoses without degrading effective inference.
To reduce the false positive rate for identified mitoses, we followed a two-stage training procedure similar to Section II-C1 using a dataset of pathologist annotated mitotic figures. We initially identified the locations and areas of all nuclei using morphological methods. We defined positive training patches as those nuclei annotated as mitotic, and we identified a random sample of other nuclei as negative. After the first stage of training and output heatmap generation, we subjected our mitosis identifications to further pathologist evaluation and subsequently retrained our models accounting for initially misidentified instances. The resulting robust trained models were used to characterize mitoses in terms of spatial distribution and shape-specific attributes; the process of extracting these features is detailed in Section II-D.
II-C3 WSI General Feature Extraction
In addition to developing methods for the identification of anatomical structures in tumor severity analysis, we created end-to-end networks that predict the categorical severity grade of the whole slide image from individual patches. These networks are defined in the last section of Fig. 2; each model utilizes outputs of the tumor and mitosis networks to extract detailed computational features. Patches are extracted from the original WSIs and input to the first network (with static weights), which computes coarse features corresponding to either mitosis identification or tumor localization. Convolutions from the first network’s feature volume are next mapped to the input of a second network (with dynamic weights). The second network performs categorical predictions and extracts WSI features. These 1,024 features, combined with 3 predicted class probabilities, were incorporated in the final predictive model.
II-D Tumor-Specific Feature Extraction
We applied our mitosis detection and tumor localization methods to identify biologically salient features in WSIs on both a patch-based and a whole slide level. Each approach allowed for extraction of features with varying granularities.
Specifically, we preferentially selected fifty patches from the fringes of the localized tumor regions with the largest area. Each patch, a 1k × 1k tissue sample at 40× magnification, was input to our mitosis detectors, which produced heatmaps of corresponding size identifying mitotic figure probability in the input image. We additionally represented each WSI with a comprehensive heatmap depicting mitotic figures in all tumor regions. Both the individual patches and the WSI heatmap were used to compute biological and data-driven mitosis features.
Biological Features. From each selected patch, we extracted fifty morphometric and intensity based features to characterize biologically salient structural mitosis components. These features describe compositional and formational patterns that pathologists might observe. In addition, we characterized the distribution of mitoses throughout the entire WSI with sixty architectural features. Particularly, we analyzed the sparsity of mitosis distribution and second-order attributes including kurtosis, entropy, and skewness, providing a high-level interpretation of proliferative activity.
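As an illustration of the distribution-level attributes named above, the following sketch computes sparsity, skewness, kurtosis, and entropy from a list of per-region mitosis counts. The input representation, the use of population moments, non-excess kurtosis, and entropy in bits are all assumptions; the sixty architectural features themselves are not enumerated in the text.

```python
import math


def distribution_features(counts):
    """Summarize how mitoses are spread over equally sized regions of a WSI."""
    n = len(counts)
    mean = sum(counts) / n
    m2 = sum((c - mean) ** 2 for c in counts) / n   # population variance
    m3 = sum((c - mean) ** 3 for c in counts) / n
    m4 = sum((c - mean) ** 4 for c in counts) / n
    total = sum(counts)
    # Shannon entropy of the normalized counts; empty regions contribute 0.
    probs = [c / total for c in counts if c > 0]
    return {
        "sparsity": sum(1 for c in counts if c == 0) / n,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "entropy_bits": -sum(p * math.log2(p) for p in probs),
    }
```

A slide whose mitoses are concentrated in a single region yields high sparsity and zero entropy, whereas evenly spread mitoses yield zero sparsity and maximal entropy.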
Data-Driven Features. Within each magnified patch, we additionally computed abstract deep learning-based features that represent attributes from learned filters. We segmented a tissue patch around each identified mitosis for input to our mitosis detection networks. Each mitotic figure was subsequently characterized by 4,096 attributes describing mitosis-specific structural minutiae. As each patch contained a varying number of mitotic figures and was thereby associated with a different number of attribute vectors, feature standardization was performed. We conducted post-processing k-means clustering on all individual mitosis feature vectors (of length 4,096) from every WSI patch in a 200-dimensional vector space. Each vector was associated with a cluster label identifying its most similar sub-space. Finally, each WSI was distilled into a 200-bin histogram with frequencies corresponding to the cluster labels of each mitotic region within the extracted patches, resulting in a fixed data-driven feature vector of length 200.
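The clustering-and-histogram step can be sketched in a few lines. This toy version uses plain Lloyd iterations with a naive initialization (the first `k` vectors), which is an assumption; the text does not describe the initialization or the library used. In the setting described above, `k` would be 200 and each vector would have length 4,096.

```python
def nearest(vector, centroids):
    """Index of the centroid closest to `vector` in squared Euclidean distance."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2
                                 for a, b in zip(vector, centroids[j])))


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def wsi_histogram(mitosis_vectors, centroids):
    """Fixed-length descriptor: count of a slide's mitoses per cluster."""
    hist = [0] * len(centroids)
    for v in mitosis_vectors:
        hist[nearest(v, centroids)] += 1
    return hist
```

Because the histogram length depends only on the number of clusters, slides with very different numbers of detected mitoses map to descriptors of the same size.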
III Experiments and Results
III-A Performance Evaluation
Categorical Tumor Severity. A receiver operating characteristic (ROC) curve, which plots the true positive rate against the false positive rate at varying thresholds, is depicted in Fig. 3. Each class was predicted in a one-vs-all manner, with mean values determined by five-fold cross-validation. The resulting micro-average AUROC of 0.78 is consistent with our overall F-measure of 0.62, indicating the model’s discriminative ability among the three classes. Our method additionally achieved an accuracy of 0.72 (95% CI: 0.67, 0.76) when compared to pathologist severity gradation, close to the inter-pathologist agreement of 0.79 (95% CI: 0.70, 0.85).
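For reference, a micro-average one-vs-all AUROC can be computed as sketched below. The sketch uses the rank interpretation of the area under the ROC curve (the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half) and pools the binary decisions of all classes before scoring. The function names are hypothetical, and cross-validation averaging is left out.

```python
def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_average_auroc(true_classes, class_probs, n_classes=3):
    """Pool every (sample, class) pair as one binary decision, then score."""
    labels, scores = [], []
    for y, probs in zip(true_classes, class_probs):
        for c in range(n_classes):
            labels.append(1 if y == c else 0)
            scores.append(probs[c])
    return auroc(labels, scores)
```

The pairwise comparison is quadratic in the number of samples, which is acceptable for a few hundred slides; a sort-based implementation would be preferable for larger sets.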
Molecular RNA Expression. Our best-performing regression model achieved a mean squared error of 0.119. Fig. 4 depicts the correlation between our regression predictions and the calculated mean expression of eleven prognostic RNA strands. Our model, the first to predict gene expression data from histopathological image slides, was evaluated with both Pearson’s correlation coefficient and Spearman’s rank correlation coefficient, the latter reaching 0.60.
Along with the low MSE, these results indicate the ability of our pipeline not only to match categorical pathologist diagnosis but also to provide significantly more salient information regarding the biological underpinnings defining the severity of tumor tissue. The overall performance metrics indicate the prognostic potential of our model in effectively evaluating histopathological slides without a molecular examination.
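The three regression metrics reported above can be reproduced with a short sketch. It uses the textbook definitions (Spearman’s coefficient as Pearson’s coefficient on average ranks) and hypothetical function names; it makes no claim about the statistical package actually used in the study.

```python
import math


def mse(predicted, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def pearson(x, y):
    """Pearson's linear correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the ranks."""
    return pearson(ranks(x), ranks(y))
```

Spearman’s coefficient rewards any monotonic relationship between predicted and measured expression, whereas Pearson’s coefficient rewards only a linear one, which is why the two are reported side by side.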
III-B Specific Biomarker Analysis
Our study further identified the biomarkers (regression features with statistically significant p-values) most important for proliferation evaluation. The number of mitoses, currently the sole feature used to manually diagnose tumor proliferation, was confirmed as relevant. Additionally, the standard deviation of the nucleus area of all identified mitoses was implicated. This attribute, known to characterize malignant neoplasms and dysplasia, was recently associated with breast cancer diagnosis.
In addition, the p-values and predictive significance of several new biomarkers suggest that they are significantly related to the progression of breast cancer. The mean mitotic eccentricity over each WSI, a feature characteristic of the development of a cleavage furrow in mitotic figures, was found to be relevant, suggesting the differential importance of cytokinetic figures in diagnosis. The importance of compositional features of tumor tissue structures suggested that prognostic information is embedded in specific forms of tissue structure across the entire WSI. In addition to these interpretable biological features, low-level configurational and formational patterns identified by MitosNet were deemed relevant, denoting differential mitotic stages as prognostically significant.
This study presented the first completely data-driven approach to develop and integrate numerous biologically salient classifiers into a single invasive breast cancer prognostic model. This model was used to predict tumor growth on a categorical and molecular scale and to discover novel image-related biomarkers critical to disease diagnosis. With our prediction framework performing on par with pathologist grading and capturing the underlying complexity present within tissue structure, earlier and less costly diagnoses of invasive breast cancer may allow for more effective and targeted treatments in clinical practice.
The authors would like to thank Babak Ehteshami Bejnordi, Ben Glass, and Francisco Beca for their invaluable input.
- J. Ma and A. Jemal, “Breast cancer statistics,” in Breast Cancer Metastasis and Drug Resistance. Springer, 2013, pp. 1–18.
- M. Veta, P. J. van Diest, M. Jiwa, S. Al-Janabi, and J. P. Pluim, “Mitosis counting in breast cancer: Object-level interobserver agreement and comparison to an automatic method,” PloS one, vol. 11, no. 8, p. e0161286, 2016.
- M. C. Cheang, D. O. Treaba, C. H. Speers, I. A. Olivotto, C. D. Bajdik, S. K. Chia, L. C. Goldstein, K. A. Gelmon, D. Huntsman, C. B. Gilks et al., “Immunohistochemical detection using the new rabbit monoclonal antibody sp1 of estrogen receptor in breast cancer is superior to mouse monoclonal antibody 1d5 in predicting survival,” Journal of clinical oncology, vol. 24, no. 36, pp. 5637–5644, 2006.
- T. O. Nielsen, J. S. Parker, S. Leung, D. Voduc, M. Ebbert, T. Vickery, S. R. Davies, J. Snider, I. J. Stijleman, J. Reed et al., “A comparison of pam50 intrinsic subtyping with immunohistochemistry and clinical prognostic factors in tamoxifen-treated estrogen receptor–positive breast cancer,” Clinical Cancer Research, vol. 16, no. 21, pp. 5222–5232, 2010.
- M. I. A. G. E. (IMAG/e). (2016) Reading whole-slide images. [Online]. Available: http://tupac.tue-image.nl/node/95
- B. E. Bejnordi, G. Litjens, N. Timofeeva, I. Otte-Höller, A. Homeyer, N. Karssemeijer, and J. A. van der Laak, “Stain specific standardization of whole-slide histopathological images,” IEEE transactions on medical imaging, vol. 35, no. 2, pp. 404–415, 2016.
- S. Sural, G. Qian, and S. Pramanik, “Segmentation and histogram generation using the hsv color space for image retrieval,” in Image Processing. 2002. Proceedings. 2002 International Conference on, vol. 2. IEEE, 2002, pp. II–589.
- N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9.
- K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” arXiv preprint arXiv:1512.03385, 2015.
- K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
- G. Huang, Z. Liu, and K. Q. Weinberger, “Densely connected convolutional networks,” arXiv preprint arXiv:1608.06993, 2016.
- C. Szegedy, A. Toshev, and D. Erhan, “Deep neural networks for object detection,” in Advances in Neural Information Processing Systems, 2013, pp. 2553–2561.
- J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440.
- D. A. Cohn, Z. Ghahramani, and M. I. Jordan, “Active learning with statistical models,” Journal of artificial intelligence research, 1996.
- M. Veta, P. J. Van Diest, S. M. Willems, H. Wang, A. Madabhushi, A. Cruz-Roa, F. Gonzalez, A. B. Larsen, J. S. Vestergaard, A. B. Dahl et al., “Assessment of algorithms for mitosis detection in breast cancer histopathology images,” Medical image analysis, vol. 20, no. 1, pp. 237–248, 2015.
- F. B. Tek et al., “Mitosis detection using generic features and an ensemble of cascade adaboosts,” Journal of pathology informatics, vol. 4, no. 1, p. 12, 2013.
- D. C. Cireşan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, “Mitosis detection in breast cancer histology images with deep neural networks,” in International Conference on Medical Image Computing and Computer-assisted Intervention. Springer, 2013, pp. 411–418.
- J. A. Hanley and B. J. McNeil, “The meaning and use of the area under a receiver operating characteristic (roc) curve.” Radiology, vol. 143, no. 1, pp. 29–36, 1982.
- S. Muhammadnejad, A. Muhammadnejad, M. Haddadi, M.-A. Oghabian, M.-A. Mohagheghi, F. Tirgari, F. Sadeghi-Fazel, and S. Amanpour, “Correlation of microvessel density with nuclear pleomorphism, mitotic count and vascular invasion in breast and prostate cancers at preclinical and clinical levels,” Asian Pacific Journal of Cancer Prevention, vol. 14, no. 1, pp. 63–68, 2013.
- R. Rappaport, “Cytokinesis: the effect of initial distance between mitotic apparatus and surface on the rate of subsequent cleavage furrow progress,” Journal of Experimental Zoology, vol. 221, no. 3, pp. 399–403, 1982.