Expectation-Maximization Regularized Deep Learning for Weakly Supervised Tumor Segmentation for Glioblastoma
We present an Expectation-Maximization (EM) Regularized Deep Learning (EMReDL) model for weakly supervised tumor segmentation. The proposed framework is tailored to glioblastoma, a type of malignant tumor characterized by diffuse infiltration into the surrounding brain tissue, which poses a significant challenge to treatment targeting and tumor burden estimation based on conventional structural MRI. Although physiological MRI can provide more specific information regarding tumor infiltration, its relatively low resolution hinders a precise full annotation. This motivated us to develop a weakly supervised deep learning solution that exploits partially labelled tumor regions.
EMReDL contains two components: a physiological prior prediction model and an EM-regularized segmentation model. The physiological prior prediction model exploits the physiological MRI by training a classifier to generate a physiological prior map. This map is passed to the segmentation model for regularization using the EM algorithm. We evaluated the model on a glioblastoma dataset with available pre-operative multiparametric MRI and recurrence MRI. EMReDL effectively segmented the infiltrated tumor from the partially labelled region of potential infiltration. The segmented core and infiltrated tumor showed high consistency with the tumor burden labelled by experts. In the performance comparison, EMReDL achieved higher accuracy than published state-of-the-art models. On MR spectroscopy, the segmented region showed more aggressive features than the rest of the partially labelled region. The proposed model can be generalized to other segmentation tasks with partial labels, and the CNN architecture within the framework is flexible.
Glioblastoma is the most common malignant primary brain tumor, characterized by poor outcomes (Wen et al., 2020). The first-line treatment includes maximal safe resection followed by chemoradiotherapy (Stupp et al., 2005), which requires accurate tumor delineation to enhance treatment efficacy and reduce the neurological deficits of patients (Mazzara et al., 2004; Stupp et al., 2005). As manual delineation is often subjective and laborious, an automated tumor segmentation model is crucial in aiding clinical practice. Currently, Magnetic Resonance Imaging (MRI) is the mainstay for diagnosis, treatment planning, and disease monitoring of glioblastoma (Weller et al., 2014, 2017; Wen et al., 2020). It remains a challenge, however, to accurately segment glioblastoma based on MRI (Wadhwa et al., 2019), for several reasons. Firstly, glioblastoma is characterized by diffuse infiltration into the surrounding brain, leading to a poorly demarcated tumor margin. Secondly, glioblastoma is highly heterogeneous with regard to tumor location, morphology and intensity values. Thirdly, glioblastoma may have a similar appearance to neurodegenerative or white matter pathologies. All of the above pose significant challenges to a robust segmentation model.
Incorporating multiple MRI modalities is considered beneficial for tumor segmentation (Ghaffari et al., 2020). Clinically, the most commonly used sequences include T1-weighted, T2-weighted, post-contrast T1-weighted (T1C), and fluid-attenuated inversion recovery (FLAIR) sequences. The multimodal brain tumor image segmentation (BraTS) challenge represents the collective effort to develop segmentation models using a large glioblastoma dataset with multiple MRI sequences available (Bakas et al., 2018). A wide spectrum of models has since been proposed, with dramatic improvements in performance (Ghaffari et al., 2020). Among these models, deep learning shows unique advantages in using multiple MRI sequences for tumor segmentation, compared to traditional methods that use hand-crafted features. However, the BraTS dataset only includes the most widely used structural sequences, which have been shown to have low specificity in targeting actual tumor infiltration (Verburg et al., 2020). In particular, for the non-enhancing lesion beyond the contrast-enhancing margin, it remains challenging to differentiate infiltrated tumor from edema, even when combining all the structural sequences (Verburg et al., 2020). An effective imaging model with higher specificity in segmenting the infiltrated tumor is of crucial value for clinical decision making.
An increasing amount of literature provides evidence that physiological MRI can facilitate the characterization of tumor infiltration (Li et al., 2019a; Yan et al., 2019). In particular, diffusion and perfusion MRI can identify the infiltrated tumor beyond the contrast enhancement by offering parametric measures describing tumor physiology, which may compensate for the non-specificity of the structural sequences. Specifically, diffusion MRI is the only imaging method that describes brain microstructure by measuring water molecule mobility (Jellison et al., 2004), and it can detect subtle infiltration (Li et al., 2019b), characterize tumor invasiveness (Li et al., 2019c) and predict tumor progression (Yan et al., 2020). On the other hand, as a widely used perfusion technique, dynamic susceptibility contrast (DSC) imaging can derive the relative cerebral blood volume (rCBV), mean transit time (MTT) and relative cerebral blood flow (rCBF), reflecting the aberrant tumor vascularization (Lupo et al., 2005). Therefore, integrating physiological MRI into the tumor segmentation model shows potential to identify tumor infiltration more accurately.
Here we propose a deep learning model to automatically segment the core and infiltrated tumor based on both structural and physiological multiparametric MRI. We hypothesized that the physiological MRI information of the core tumor could be used to guide the deep learning model to segment the infiltrated tumor beyond the core tumor. In the next section, we summarize the related work on tumor segmentation, including both supervised and weakly supervised models.
2 Related work
Tumor segmentation is an active research field with a growing number of models proposed. These models can be generally classified into generative or discriminative models (Ghaffari et al., 2020). Typically, generative models rely on prior knowledge of the voxel distributions of the brain tissue, which is derived from a probabilistic atlas (Prastawa et al., 2004), whereas discriminative models rely on extracted image features that can be mapped to the classification labels. In general, discriminative models outperform generative models. Most successful discriminative approaches in the BraTS challenge (Menze et al., 2015) are based on fully supervised convolutional neural networks (CNN).
In BraTS 2014, a CNN-based model was first introduced. The top-ranked algorithm employed a 3D CNN model trained on small image patches, which consisted of four convolutional layers with six filters in the last layer corresponding to six labels (Urban et al., 2014). In BraTS 2015, a 2D CNN model with a cascaded architecture was proposed. Two parallel CNNs were employed to extract local and global features, which were then concatenated and fed into a fully connected layer for classification (Dutil et al., 2015). In BraTS 2016, DeepMedic, a 3D CNN model of eleven layers with residual connections, was proposed. Two pathways were employed to process the inputs in parallel, to increase the receptive field of the classification layer (Kamnitsas et al., 2016). In BraTS 2017, the Ensembles of Multiple Models and Architectures (EMMA) separately trained several models (DeepMedic, 3D FCN, and 3D U-Net) using different optimization approaches, and the output was defined as their average to reduce bias from individual models (Kamnitsas et al., 2017). The top-ranked model in BraTS 2018 proposed an asymmetric U-Net architecture, where an additional variational auto-encoder branch was added to the shared encoder, providing additional regularization (Myronenko, 2018; Warrington et al., 2020). In BraTS 2019, the top-ranked model proposed a two-stage cascaded U-Net (Jiang et al., 2019). The first stage used a U-Net variant for preliminary prediction, whereas the second stage concatenated the preliminary prediction map with the original input images to refine the prediction.
In summary, the above top-ranked models from BraTS illustrate the advantages of CNN-based segmentation models, highlighting the feature extraction capacity of CNNs. Further, to enhance model performance or reduce computational cost, various techniques were employed to improve the backbone CNN, e.g., increasing network depth or width, optimizing the loss function, increasing receptive fields, or adopting an ensemble model. For more details of the BraTS models, please refer to (Bakas et al., 2018; Ghaffari et al., 2020). All these state-of-the-art models rely heavily on full classification labels to train a model that could approximate the accuracy of experts. The infiltrative nature of glioblastoma, however, poses significant challenges to accurate delineation of the interface between tumor and healthy tissue. Although the binary contrast enhancement provides a reference for the “core tumor”, the surrounding non-enhancing region, regarded as edema in the BraTS labels, has been established to be diffusely infiltrated by tumor.
As outlined in the previous section, multiparametric MRI allows more accurate identification of the non-enhancing infiltrated tumor. Nevertheless, the low resolution of physiological MRI hinders precise annotation based on these images. A full annotation based on physiological MRI is therefore prone to subjective errors, even when performed by experienced clinical experts. As a result, models that rely heavily on full labels may not be suitable for segmenting the infiltrated tumor.
Other studies investigated the feasibility of delineating tumor infiltration based on weak labels of cancerous and healthy tissues. Akbari et al. (2016) proposed a tumor infiltration inference model using physiological and structural MRI. Two types of weak labels were used, i.e., one scribble immediately adjacent to the enhancing tumor and another scribble near the distal margin of the edema. These two scribble regions, representing the tissue near and far from the core tumor respectively, were hypothesized to have correspondingly higher and lower tumor infiltration. A classifier was trained on the weak labels using a support vector machine (SVM), which yielded a voxelwise infiltration probability. The model achieved excellent performance and was subsequently validated on another cohort and against the tumor recurrence on follow-up scans.
Although based on a relatively small sample, this study underpinned the advantage of physiological MRI in identifying tumor infiltration and supported the feasibility of weakly supervised learning models to tackle the lack of precise full annotations. The proposed model, however, ignored the spatial continuity of tumor infiltration. A CNN could empower the weakly supervised learning model (Chan et al., 2020) by effectively extracting multiparametric MRI features with spatial information.
Training a weakly supervised CNN model using a partial cross-entropy loss may lead to poor boundary localization of saliency maps (Zhang et al., 2020). To mitigate this limitation, additional regularization is often employed. For instance, Tang et al. (2018) introduced a normalized cut loss as a regularizer alongside a partial cross-entropy loss. Kervadec et al. (2019) introduced a regularization term constraining the size of the target region, which was combined with a partial cross-entropy loss. Roth et al. (2019) used the random walker algorithm to generate pseudo full labels from the partial labels and then constructed the regularized loss by enforcing the CNN outputs to match the pseudo labels. The results of the above studies support the usefulness of additional regularizers in weakly supervised models. Given the advantages of physiological MRI in detecting tumor infiltration, we hypothesized that a regularizer derived from the physiological MRI could enhance the weakly supervised model for segmenting the infiltrated tumor by incorporating domain-specific information.
We sought to propose a CNN-based weakly supervised model, in which a regularization term is constructed by incorporating the prior information obtained from the physiological MRI by a prediction model through an expectation-maximization (EM) framework. We validated the model using tumor recurrence on follow-up scans and MR spectroscopy, which non-invasively measures metabolic alteration. The remainder of this paper is organized as follows: Section 3 describes the overall study design, the main components of the proposed framework and the performance evaluation of the model. Section 4 gives details of the dataset and the implementation of the experiments. Section 5 provides the results and discussion, followed by the conclusions in Section 6.
Consider the multiparametric MRI of the training samples (patients), comprising both structural sequences (T1-weighted, T2-weighted, T1C and FLAIR) and physiological sequences (diffusion and perfusion MRI). From a clinical perspective, three regions of interest (ROIs) can be delineated (Figure 1):
ROI1: core tumor, which is the contrast-enhancing tumor region on T1C images and the surgery target for clinical practice;
ROI2: potential infiltrated region, which is the hyperintense region on FLAIR images outside of ROI1. We are specifically interested in this region as it represents the clinically extendable treatment target;
ROI3: normal-appearing region on both T1C and FLAIR sequences.
All MRI sequences were co-registered. The voxel labels can be divided into observed labels and unobserved labels. Each voxel label takes the value 1 or 0. The observed labels are those of ROI1 and ROI3, where 1 indicates a confirmed tumor voxel and 0 represents a voxel from the normal-appearing brain region. The unobserved labels are those of ROI2. Given the MRI data and the observed labels, we aimed to simultaneously segment the core tumor (ROI1) and the peritumoral infiltrated tumor in ROI2.
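The partial labelling described above can be illustrated with a minimal sketch. It assumes flattened Boolean masks for the contrast-enhancing core and the FLAIR abnormality; the function name `build_partial_labels` and the use of `None` for unobserved voxels are illustrative choices, not part of the original method.

```python
from typing import List, Optional


def build_partial_labels(core: List[bool], flair: List[bool]) -> List[Optional[int]]:
    """Return 1 for ROI1, 0 for ROI3 and None (unobserved) for ROI2."""
    labels: List[Optional[int]] = []
    for is_core, is_flair in zip(core, flair):
        if is_core:
            labels.append(1)      # ROI1: contrast-enhancing core tumor
        elif is_flair:
            labels.append(None)   # ROI2: FLAIR abnormality outside the core
        else:
            labels.append(0)      # ROI3: normal-appearing on T1C and FLAIR
    return labels


# Toy example with five voxels (hypothetical masks).
core_mask = [True, True, False, False, False]
flair_mask = [True, True, True, True, False]
partial_labels = build_partial_labels(core_mask, flair_mask)
```

Only the voxels labelled 1 or 0 provide direct supervision; the `None` entries correspond to the region whose labels have to be estimated.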
3.2 Overview of the proposed method
Our goal was to segment the core and infiltrated tumor using a model trained on the existing MRI data and the corresponding observed labels. For standard supervised CNN models, full training labels are needed as the ‘ground-truth’ to train the weights of the CNN. In our application, however, it is not possible to obtain a full annotation for the unobserved labels, which renders supervised CNN training inappropriate. In this paper, we cast the underlying problem as a weakly supervised learning problem by leveraging the EM algorithm, which iteratively estimates both the unknown parameters (M-step) and the unobserved labels (E-step) in the proposed segmentation problem. The problem can then be treated as a CNN training task using partial labels.
As shown in Figure 1, the proposed method consists of two main components: a physiological prior prediction model (left panel) and an EM-regularized segmentation model (right panel). The left panel takes in physiological MRI information to train a classifier and generate a voxelwise estimate of the unobserved labels in ROI2. The estimated label information is then passed to the right panel to improve the prediction performance of the segmentation model. Specifically, the label information is used to initialize the ROI2 labels for CNN training in the M-step, and is also integrated into the E-step to iteratively update the estimate of the unobserved labels. The expected outcome of the right panel is a trained CNN segmentation model that can effectively distinguish the infiltrated tumor from non-cancerous abnormalities, e.g., edema.
The pipeline introduced in Figure 1 can be generalized to other similar segmentation problems with partially unobserved labels. Both the classifier in the left panel and the CNN segmentation model in the right panel can be replaced by other feed-forward deep learning models or by CNN models with architectures other than the ones used in this paper. For this reason, we do not describe the detailed architecture of the CNN models used in the proposed method.
3.3 Physiological prior prediction
As discussed above, physiological MRI is more specific for tumor infiltration but has lower resolution than structural MRI. Treating physiological MRI and structural MRI equally may not effectively leverage the specific information from physiological MRI. Therefore, a physiological prior map that incorporates only the information of physiological MRI is generated to describe the extracted knowledge of ROI2. In particular, we constructed this component to approximate the unobserved labels of ROI2, using a classifier trained on the physiological MRI and the observed labels.
Since the labels in ROI1 and ROI3 only contain the binary values 1 and 0, we used a binary classifier constructed as a fully connected neural network with two hidden layers. The number of hidden neurons is set equal to the number of input features from the physiological MRI. The model produces a probabilistic prediction for the distribution of the unobserved labels in ROI2, with predicted values between 0 and 1. The predicted physiological prior map was then used in the EM-regularized weakly supervised segmentation component.
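As a rough illustration of such a classifier, the forward pass of a fully connected network with two hidden layers, each as wide as the input, can be sketched as follows. The ReLU hidden activation, the random initialization and the choice of seven input features (one per diffusion or perfusion map) are assumptions made for this sketch; the source does not specify them.

```python
import math
import random
from typing import Callable, List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    return max(0.0, z)


def dense(x: List[float], layer: Layer, activation: Callable[[float], float]) -> List[float]:
    """Apply one fully connected layer followed by an activation."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def prior_probability(features: List[float], layers: List[Layer]) -> float:
    """Voxelwise prior: two hidden layers, then a single sigmoid output."""
    hidden = features
    for layer in layers[:-1]:
        hidden = dense(hidden, layer, relu)
    return dense(hidden, layers[-1], sigmoid)[0]


rng = random.Random(0)
n_features = 7  # assumed: one feature per physiological map
layers = [init_layer(n_features, n_features, rng),   # hidden layer 1
          init_layer(n_features, n_features, rng),   # hidden layer 2
          init_layer(n_features, 1, rng)]            # output layer
voxel = [rng.random() for _ in range(n_features)]
prior = prior_probability(voxel, layers)
```

Applying `prior_probability` to every voxel of ROI2 would yield a prior map with values strictly between 0 and 1; training of the weights is omitted here.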
3.4 Segmentation with EM-regularized weakly supervised learning
In this component, a segmentation model with a typical U-Net CNN architecture is trained for tumor segmentation. Unlike the physiological prior prediction model, the segmentation model is trained using both physiological MRI and structural MRI. The EM algorithm is leveraged in this component to estimate the unobserved labels and to iteratively improve both the model accuracy and the label accuracy in the partially labelled potential infiltrated region. To perform this weakly supervised segmentation task, we first define the likelihood function of the CNN weights given the MRI, the observed labels and the unobserved labels. The maximum likelihood estimate of the weights can in principle be computed by integrating out the unobserved labels and maximizing the resulting marginal distribution.
Nevertheless, the integral is often intractable, and exact integration over all possible values is challenging.
The EM algorithm solves the problem by iteratively estimating the unobserved labels in the expectation step (E-step) and the CNN weights in the maximization step (M-step). See McLachlan and Krishnan (2007) for details of the standard EM algorithm.
In this work, the E-step computes the expectation of the log-likelihood function with respect to the conditional distribution of the unobserved labels, given the MRI, the observed labels and the CNN weights estimated in the current iteration. This conditional distribution is defined by two terms: the former is the physiological prior map generated by the binary classifier, and the latter is the label prediction of the segmentation model in the current EM iteration. A voxelwise coefficient is used to integrate the physiological prior map with the prediction of the segmentation model.
The M-step maximizes the above expectation to derive a new estimate of the CNN weights.
The conditional distribution of the labels is obtained from the designed CNN model, with its weights given by the current estimate.
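For orientation, the E-step and M-step described above can be written in generic EM form. The notation below is assumed for this sketch only: X denotes the multiparametric MRI, Y_o and Y_u the observed and unobserved labels, θ the CNN weights, P_prior the physiological prior map and α the voxelwise coefficient. In particular, the convex combination in the fourth line is one plausible way to integrate the prior map with the CNN prediction and should be read as an assumption rather than as the exact form used in this work.

```latex
% Assumed notation; the fourth line is an assumed convex combination.
\begin{align*}
L(\theta; X, Y_o, Y_u) &= p(Y_o, Y_u \mid X, \theta) \\
\hat{\theta} &= \arg\max_{\theta} \int p(Y_o, Y_u \mid X, \theta)\, dY_u \\
Q(\theta \mid \theta^{(t)}) &= \mathbb{E}_{Y_u \mid X, Y_o, \theta^{(t)}}
    \left[ \log L(\theta; X, Y_o, Y_u) \right] \\
p(Y_u \mid X, Y_o, \theta^{(t)}) &\approx \alpha \odot P_{\mathrm{prior}}
    + (1 - \alpha) \odot p(Y_u \mid X, \theta^{(t)}) \\
\theta^{(t+1)} &= \arg\max_{\theta} Q(\theta \mid \theta^{(t)})
\end{align*}
```

The first two lines restate the likelihood and its marginal maximization, the third and fourth lines correspond to the E-step, and the last line to the M-step.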
From the perspective of the loss function in CNN training, the M-step objective can also be treated as a regularization term that is minimized as part of the training loss of the segmentation model. In practice, the training loss is the sum of the supervised loss from the fixed observed labels and the regularised loss from the pseudo labels calculated using the conditional distribution defined in the E-step.
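A minimal sketch of such a two-part loss is given below, assuming a voxelwise binary cross-entropy for both terms, averaging within each term, and a hypothetical weighting parameter `reg_weight`; none of these details are confirmed by the source. Setting `reg_weight=0.0` recovers a partial cross-entropy loss of the kind used by the baseline model.

```python
import math
from typing import List, Optional


def cross_entropy(target: float, prob: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one voxel; target may be a soft label."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def em_regularized_loss(pred: List[float],
                        observed: List[Optional[int]],
                        pseudo: List[float],
                        reg_weight: float = 1.0) -> float:
    """Supervised loss on observed voxels plus regularised loss on ROI2.

    observed[i] is 1 or 0 for ROI1/ROI3 voxels and None for ROI2 voxels;
    pseudo[i] is the soft pseudo label and is only read where observed[i] is None.
    """
    supervised = [cross_entropy(float(y), p)
                  for p, y in zip(pred, observed) if y is not None]
    regularised = [cross_entropy(q, p)
                   for p, y, q in zip(pred, observed, pseudo) if y is None]
    sup = sum(supervised) / len(supervised) if supervised else 0.0
    reg = sum(regularised) / len(regularised) if regularised else 0.0
    return sup + reg_weight * reg


# Toy example: two observed voxels and one ROI2 voxel (hypothetical values).
pred = [0.9, 0.1, 0.5]
observed = [1, 0, None]
pseudo = [0.0, 0.0, 0.5]
loss = em_regularized_loss(pred, observed, pseudo)
baseline_loss = em_regularized_loss(pred, observed, pseudo, reg_weight=0.0)
```

In this toy case the regularised term adds log 2 to the supervised loss, because the prediction of 0.5 exactly matches the soft pseudo label of 0.5.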
3.5 Model evaluation
We validated the proposed model using tumor burden, tumor recurrence and MRS. To examine the usefulness of the regularizer, we compared our model performance with the baseline model, which employed the U-Net with a partial cross-entropy loss without the additional regularizer from the physiological prior. We also compared our model with other methods (Akbari et al., 2016; Tang et al., 2018; Kervadec et al., 2019; Roth et al., 2019).
1) Tumor burden estimation
The final segmented tumor volume was calculated as the core tumor burden (the delineated tumor in ROI1) and the infiltrated tumor burden (the delineated tumor in ROI2). A linear regression was used to test the consistency of the segmented volumes from different models with the ground truth. For the core tumor (ROI1), the ground truth was taken as the volume of the manual label. For the infiltrated tumor, the ground truth was taken as the volume of the recurrence within the potential infiltrated region (ROI2).
2) Tumor burden and recurrence prediction
The final segmented tumor region was examined for the prediction of the complete tumor burden and the tumor recurrence region on the follow-up MRI of 68 patients who received complete resection, which is defined clinically as a complete resection of the contrast-enhancing tumor (ROI1). The potential infiltrated region (ROI2) on the pre-operative images was divided, according to the manual label, into a recurrence region and a non-recurrence region, the latter being the complement of the former within ROI2.
For each patient, the total tumor burden was defined as the union of the pre-operative contrast-enhancing core tumor (ROI1) on the T1C image and the recurrence region, whereas the normal-appearing area was defined as the region outside this total tumor burden. The segmented tumor area and normal-appearing area were derived automatically by thresholding the tumor infiltration probability produced by EMReDL. Finally, the sensitivity and specificity of predicting tumor burden were computed as the fractions of the true tumor area and of the true normal-appearing area, respectively, that were correctly identified by the segmentation.
After calculating the sensitivity and specificity, the optimum threshold T for binarizing the predicted infiltration probability was chosen by maximizing the Youden index of the ROC curve.
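The threshold selection can be sketched as follows, using the standard definitions of sensitivity, specificity and the Youden index (sensitivity + specificity - 1). The candidate thresholds are taken to be the observed probability values, which is an assumption of this sketch, and the toy data are hypothetical.

```python
from typing import List, Tuple


def sensitivity_specificity(prob: List[float], truth: List[int],
                            threshold: float) -> Tuple[float, float]:
    """Voxelwise sensitivity and specificity at a given probability threshold."""
    tp = fn = tn = fp = 0
    for p, t in zip(prob, truth):
        predicted = p >= threshold
        if t and predicted:
            tp += 1
        elif t:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def youden_threshold(prob: List[float], truth: List[int]) -> Tuple[float, float]:
    """Return the threshold maximizing J = sensitivity + specificity - 1."""
    best_t, best_j = 0.0, -1.0
    for t in sorted(set(prob)):
        sens, spec = sensitivity_specificity(prob, truth, t)
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical infiltration probabilities and ground-truth labels.
prob = [0.1, 0.2, 0.4, 0.5, 0.6, 0.7, 0.9]
truth = [0, 0, 1, 0, 1, 1, 1]
best_threshold, best_j = youden_threshold(prob, truth)
```

For these toy values the index peaks at a threshold of 0.6, where three of the four tumor voxels and all three normal voxels are classified correctly.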
3) Magnetic resonance spectroscopy validation
The metabolic signature was compared between the infiltrated region and the non-infiltrated region segmented by our model within the potential infiltrated region (ROI2). The metabolic measures, including choline (Cho), N-acetylaspartate (NAA) and Cho/NAA, were calculated for the infiltrated region and the non-infiltrated region, respectively. To account for the resolution difference between T2 and MRS space, all co-registered data were projected to MRS space according to their coordinates using MATLAB. The proportion of T2-space tumor pixels occupying each MRS voxel was calculated. A paired t-test was used to compare the metabolic measures of the infiltrated and non-infiltrated regions.
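The paired comparison can be illustrated with a short sketch that computes the paired t statistic from per-patient differences. The numbers below are made up purely for illustration and are not study data; the p-value, which requires the t distribution with n - 1 degrees of freedom, is left out.

```python
import math
import statistics
from typing import List


def paired_t_statistic(a: List[float], b: List[float]) -> float:
    """t = mean(d) / (sd(d) / sqrt(n)) for paired differences d = a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two paired samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Hypothetical Cho/NAA values for four patients (illustration only).
infiltrated = [1.2, 1.1, 1.5, 1.0]
non_infiltrated = [1.0, 1.0, 1.2, 0.8]
t_stat = paired_t_statistic(infiltrated, non_infiltrated)
```

A positive statistic indicates that the measure is higher in the infiltrated region on average across patients.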
4.1 Data description
This study was approved by the local institutional review board, and informed consent was obtained from all patients. A total of 115 glioblastoma patients were prospectively recruited for maximal safe resection. Each patient underwent pre-operative multiparametric MRI, using a 3-Tesla MRI system (Magnetom Trio; Siemens Healthcare, Erlangen, Germany) with a standard 12-channel receive-head coil. The sequences included T1, T1C, T2, T2-FLAIR, diffusion imaging, DSC and multivoxel 2D 1H-MRS chemical shift imaging.
4.2 Image pre-processing
1) Multiparametric MRI processing
Diffusion MRI was processed using the diffusion toolbox (FDT) in FSL v5.0.8 (FMRIB Software Library, Centre for Functional MRI of the Brain, Oxford, UK). After normalization and eddy current correction, parametric maps of fractional anisotropy (FA), mean diffusivity (MD), p (isotropy) and q (anisotropy) were calculated as previously described (Li et al., 2019e, d). DSC was processed using NordicICE (NordicNeuroLab, Bergen, Norway), with the arterial input function automatically defined and leakage corrected. The parametric maps of rCBV, MTT and rCBF were calculated. The MRS data were processed using LCModel (Provencher, Oakville, Ontario) as previously described. All metabolites were calculated as a ratio to creatine (Cr).
2) Image co-registration
All pre-operative parametric maps were co-registered to the T2 space using FSL linear image registration tool (FLIRT) with an affine transformation. For the co-registration of the recurrence image to the pre-operative images, the recurrence T1C images were non-linearly co-registered to the pre-operative T2 images using the Advanced Normalization Tools (ANTs), with the pre-operative lesion masked out.
3) Image normalization
MRI data from different patients were normalized using the histogram matching method. Specifically, for each sequence, the image histograms of all patients were calculated, and the image whose histogram was closest to the averaged histogram was taken as the reference and normalized to [0, 1]. Finally, the other images were matched to the reference histogram.
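The two steps of this normalization can be sketched as below. The sketch assumes intensities already scaled to [0, 1] for building the histograms, a squared Euclidean distance between histograms, and a simple quantile mapping for the matching step; the source does not specify these choices, and the helper names are hypothetical.

```python
from typing import List


def histogram(values: List[float], bins: int = 4) -> List[float]:
    """Normalized histogram of intensities assumed to lie in [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]


def pick_reference(images: List[List[float]], bins: int = 4) -> int:
    """Index of the image whose histogram is closest to the average histogram."""
    hists = [histogram(img, bins) for img in images]
    mean_hist = [sum(h[b] for h in hists) / len(hists) for b in range(bins)]
    dists = [sum((h[b] - mean_hist[b]) ** 2 for b in range(bins)) for h in hists]
    return dists.index(min(dists))


def match_to_reference(image: List[float], reference: List[float]) -> List[float]:
    """Quantile mapping: each voxel takes the reference value at its own quantile."""
    ref_sorted = sorted(reference)
    order = sorted(range(len(image)), key=lambda i: image[i])
    matched = [0.0] * len(image)
    for rank, i in enumerate(order):
        q = rank / (len(image) - 1) if len(image) > 1 else 0.0
        matched[i] = ref_sorted[round(q * (len(ref_sorted) - 1))]
    return matched


# Three toy "images" of one sequence, flattened to intensity lists.
images = [[0.1, 0.2, 0.3, 0.9],
          [0.1, 0.4, 0.6, 0.9],
          [0.6, 0.7, 0.8, 0.9]]
reference_index = pick_reference(images)
matched = match_to_reference(images[2], images[reference_index])
```

Here the second image has the most average histogram and becomes the reference; matching the third image to it preserves the voxel ordering while adopting the reference intensity distribution.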
4.3 Labelling of pre-operative and recurrence tumor
Pre-operative tumor and recurrence regions were manually delineated on the T1C and FLAIR images using 3D Slicer v4.6.2 (https://www.slicer.org/). The delineation was performed by a neurosurgeon (XX) and reviewed by a neuroradiologist (XX). Each rater used consistent criteria for each patient and was blinded to patient outcomes. The contrast-enhancing (CE) core tumor was defined as the region within the contrast-enhancing margin on T1C images. The FLAIR ROI was defined as the hyperintense region on FLAIR images. Finally, the peritumoral ROIs were defined as the non-enhancing regions outside the contrast-enhancing regions, obtained by a Boolean subtraction of the CE ROI from the FLAIR ROI in MATLAB.
Patients were treated and followed up by the multidisciplinary team (MDT) according to the clinical guidelines. The extent of resection was assessed on post-operative MRI within 72 hours. During the follow-up of patients, clinical and radiological data were incorporated according to the Response Assessment in Neuro-oncology criteria.
4.5 Implementation details
We divided the complete dataset into two sets randomly: 50% as the training set (images of 57 patients) and 50% as the testing set (images of 58 patients). For the training set, 75% of the data was used for model training and the remaining 25% was used for model validation.
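A patient-level version of this split can be sketched as follows. Whether the 75%/25% division was made per patient or per sample is not stated, so the patient-level split, the rounding and the fixed seed below are assumptions of this sketch.

```python
import random
from typing import List, Sequence, Tuple


def split_patients(patient_ids: Sequence[int],
                   seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Random 50/50 split into training and testing, then 75/25 into fit/validation."""
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_training = len(shuffled) // 2          # 57 of 115 patients
    training, testing = shuffled[:n_training], shuffled[n_training:]
    n_fit = round(0.75 * len(training))      # assumed patient-level rounding
    return training[:n_fit], training[n_fit:], testing


fit_ids, val_ids, test_ids = split_patients(range(115))
```

With 115 patients this gives 57 training patients (43 for fitting and 14 for validation under the assumed rounding) and 58 testing patients, with no patient shared between sets.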
For the training of the physiological prior prediction model, the multiparametric MRI feature vectors of the voxels in ROI1 and ROI3 were used as the input of the empirical fully connected network. The model was trained to minimize the loss function. The Adam optimizer was applied to train the model with initial learning rate set to , and the model was trained for 1000 epochs using mini-batches of size 5x. To tackle the class imbalance problem, equal numbers of majority- and minority-class samples were randomly selected for each mini-batch. Finally, the model with the smallest validation error was adopted.
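The class-balanced sampling can be sketched as below, assuming sampling without replacement within each mini-batch; the function name and the toy index lists are hypothetical.

```python
import random
from typing import List


def balanced_batch(tumor_idx: List[int], normal_idx: List[int],
                   batch_size: int, rng: random.Random) -> List[int]:
    """Draw equal numbers of tumor (ROI1) and normal (ROI3) voxel indices."""
    half = batch_size // 2
    batch = rng.sample(tumor_idx, half) + rng.sample(normal_idx, half)
    rng.shuffle(batch)  # avoid a fixed class order within the batch
    return batch


# Toy example: 10 tumor voxels against 900 normal-appearing voxels.
tumor_voxels = list(range(10))
normal_voxels = list(range(100, 1000))
batch = balanced_batch(tumor_voxels, normal_voxels, batch_size=8,
                       rng=random.Random(0))
```

Each batch then contains the two classes in equal proportion regardless of how imbalanced the full voxel set is.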
After the training of the physiological prior prediction model, a physiological prior map with the tumor infiltration probability was obtained. The EM-regularized weakly supervised segmentation model was trained for 200 epochs using the Adam optimizer with initial learning rate of , and mini-batch size of 8. For the first epoch, the prior infiltration probability was used as the probabilistic training labels in ROI2, the potential infiltration region. Afterwards, the probabilistic training labels were updated after each epoch. The model with the lowest validation error was finally chosen.
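The alternating schedule described above can be illustrated with a toy stand-in, in which a one-feature logistic model replaces the U-Net and a single scalar coefficient `alpha` replaces the voxelwise coefficient. Both substitutions, the convex blending rule and all numerical values are assumptions made to keep the sketch self-contained and runnable.

```python
import math
from typing import List, Optional, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_epoch(w: float, b: float, xs: List[float], targets: List[float],
                lr: float = 0.5) -> Tuple[float, float]:
    """One gradient step on soft-label cross-entropy (stand-in for the M-step)."""
    grad_w = grad_b = 0.0
    for x, y in zip(xs, targets):
        err = sigmoid(w * x + b) - y
        grad_w += err * x
        grad_b += err
    n = len(xs)
    return w - lr * grad_w / n, b - lr * grad_b / n


def run_em(xs: List[float], observed: List[Optional[int]], prior: List[float],
           alpha: float = 0.5, epochs: int = 200) -> Tuple[float, float, List[float]]:
    # First epoch: the prior serves as probabilistic labels in ROI2.
    targets = [prior[i] if y is None else float(y) for i, y in enumerate(observed)]
    w = b = 0.0
    for _ in range(epochs):
        w, b = train_epoch(w, b, xs, targets)        # M-step: update the weights
        for i, y in enumerate(observed):             # E-step: refresh ROI2 labels only
            if y is None:
                targets[i] = alpha * prior[i] + (1.0 - alpha) * sigmoid(w * xs[i] + b)
    return w, b, targets


# Four observed voxels (ROI1/ROI3) and two unobserved ROI2 voxels.
xs = [-2.0, -1.0, 1.0, 2.0, -0.5, 1.5]
observed = [0, 0, 1, 1, None, None]
prior = [0.0, 0.0, 0.0, 0.0, 0.2, 0.8]  # prior is only read for ROI2 voxels
w, b, targets = run_em(xs, observed, prior)
```

The observed labels stay fixed throughout, while the ROI2 targets move from the prior toward a blend of the prior and the current model prediction.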
5 Results and Discussion
The experimental results showed that the proposed weakly supervised model achieved high accuracy in segmenting the core and infiltrated tumor area, as validated by the tumor burden estimation, the tumor recurrence prediction and the identification of invasive areas on MRS. The results are presented below.
5.1 Tumor burden estimation
Tumor burden is crucial for patient risk stratification and treatment planning. We calculated the tumor burden estimated by the different models as the volume of the segmented regions (Table 1). For the core tumor, all CNN models achieved volumes comparable with the ground truth, highlighting the capacity of CNNs in core tumor segmentation. For the infiltrated tumor, EMReDL achieved the results most similar to the recurrence volume.
Unit: ; Comparison model 1: SVM. Comparison model 2: Normalized cut loss. Comparison model 3: Size-constrained loss; Comparison model 4: Random walker regularized loss
We also performed a regression analysis between the tumor burden estimated by the models and the ground truth (Table 2). For the core tumor, all tested models showed consistency in core tumor burden estimation. For the infiltrated tumor, however, EMReDL achieved better consistency than the other tested models.
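Such a consistency analysis can be sketched with an ordinary least-squares fit of estimated against ground-truth volumes. The volumes below are hypothetical and serve only to show the computation; a slope near 1, an intercept near 0 and a high coefficient of determination would indicate good agreement.

```python
from typing import List, Tuple


def linear_regression(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Ordinary least squares y = slope * x + intercept, with R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical ground-truth and model-estimated volumes (illustration only).
truth_volume = [1.0, 2.0, 3.0, 4.0]
estimated_volume = [2.1, 3.9, 6.2, 7.8]
slope, intercept, r_squared = linear_regression(truth_volume, estimated_volume)
```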
Comparison model 1: SVM. Comparison model 2: Normalized cut loss. Comparison model 3: Size-constrained loss; Comparison model 4: Random walker regularized loss
5.2 Recurrence prediction
Firstly, we compared the performance of the baseline model and EMReDL. The ablation experiment showed that EMReDL achieved superior accuracy in predicting tumor recurrence compared to the baseline model, which employed the U-Net with a partial cross-entropy loss. The results suggest the usefulness of incorporating the additional regularizer constructed from the physiological MRI. Of note, the baseline model achieved higher sensitivity but lower specificity than EMReDL, which is mainly due to the much smaller segmentation regions. The quantitative comparison of EMReDL and the baseline model is given in Table 3.
AUC: area under the curve. MCC: Matthews correlation coefficient
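For reference, both metrics can be computed with a few lines of code using their standard definitions: the MCC from the confusion matrix, and the AUC as the fraction of tumor/normal voxel pairs in which the tumor voxel receives the higher score (ties counted as one half). The counts and scores below are hypothetical.

```python
import math
from typing import List


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(scores: List[float], truth: List[int]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical confusion-matrix counts and voxel scores.
mcc_value = mcc(tp=40, fp=10, tn=45, fn=5)
auc_value = auc([0.1, 0.2, 0.4, 0.5, 0.6, 0.7, 0.9], [0, 0, 1, 0, 1, 1, 1])
```

Unlike sensitivity and specificity, the MCC summarizes all four confusion-matrix entries in one value between -1 and 1, which makes it informative when the tumor and normal classes differ greatly in size.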
Figure 2 presents two examples of the infiltration area predicted by EMReDL and the baseline model. The pre-operative structural MRI, including FLAIR and T1C (Figure 2A,B), the recurrence T1C (Figure 2C), and the physiological MRI, including DTI-q, DTI-p, FA, MD, MTT, rCBV and rCBF (Figure 2H-N), are shown together with the overlaid labels (red: contrast-enhancing core tumor, ROI1; blue: non-enhancing peritumoral region, ROI2). The predictions of the two models are overlaid on the pre-operative (Figure 2D: baseline, Figure 2E: EMReDL) and recurrence (Figure 2F: baseline, Figure 2G: EMReDL) T1C images. Note that the recurrence area extends well beyond the contrast-enhancing tumor core on the pre-operative MRI and shows high correspondence with the infiltrated area identified by EMReDL. This improvement could possibly be explained by the tumor invasion area revealed by the physiological MRI shown underneath. Note that the ground truth (the red region) of the complete tumor burden was taken as the combination of the core tumor and the recurrence tumor, under the assumption that the infiltrated tumor in the FLAIR abnormality is more responsible for the recurrence outside of the core tumor than other regions.
Next, we compared our results of the segmented infiltration area with other weakly-supervised models proposed in (Akbari et al., 2016; Kervadec et al., 2019; Roth et al., 2019; Tang et al., 2018). The results (Table 4) showed that all the models with additional loss achieved better accuracy than the SVM model, suggesting the usefulness of considering the spatial information through CNN in the prediction. Further, the EMReDL obtained higher accuracy than other weakly supervised models, which again supports the value of incorporating the physiological information through the separate physiological prior prediction model from the main segmentation model. As mentioned, physiological MRI has higher specificity in reflecting tumor biology but lower resolution than structural MRI. Benefiting from the separately designed model, the physiological information could be effectively employed and less affected by the structural MRI, which hence could improve the model performance. In comparison, the pseudo labels generated through the normalized cut loss in (Tang et al., 2018) and the random walker loss in (Roth et al., 2019) were obtained by treating the structural and physiological MRI equally, therefore may not effectively leverage the information from physiological MRI.
AUC: area under the curve. MCC: Matthews correlation coefficient. Comparison model 1: SVM. Comparison model 2: Normalized cut loss. Comparison model 3: Size-constrained loss. Comparison model 4: Random walker regularized loss.
Figure 3 presents an example comparing the different models. Figure 3a-d show the structural images, including T1C, FLAIR, T1 and T2. Figure 3e and 3f show the FLAIR abnormality and the contrast-enhancing tumor respectively, while Figure 3g indicates the recurrence regions on the follow-up scans. The physiological MRI, including DTI-q, DTI-p, FA, MD, MTT, rCBV and rCBF, are shown in Figure 2H-N. In this example, EMReDL shows the highest performance, whereas the SVM model shows lower accuracy than all other models.
Lastly, we compared the performance of the different models in segmenting the infiltrated area (Table 5). As expected, all models obtained lower performance than when segmenting the complete tumor burden including the core tumor, because we take only the recurrence region as the ground truth, while some non-recurrence areas may also display invasive imaging features on the pre-treatment MRI. In the model comparison, however, EMReDL achieved higher performance than the other models, which may imply the value of the additionally constructed regularizer.
AUC: area under the curve. MCC: Matthews correlation coefficient. Comparison model 1: SVM. Comparison model 2: Normalized cut loss. Comparison model 3: Size-constrained loss. Comparison model 4: Random walker regularized loss.
To summarize, the model comparisons may validate the performance of the proposed weakly supervised model. Also, our model showed comparable performance in both training and testing sets, which could suggest the robustness of the model.
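For readers who wish to reproduce the two summary metrics reported in the comparison tables, the following is a minimal NumPy sketch of the Matthews correlation coefficient (MCC) and the area under the ROC curve (AUC) on voxel-wise labels. The function names, the 0.5 threshold and the toy values are assumptions for illustration; the sketch does not reflect the evaluation code used in the study.

```python
import numpy as np

def matthews_corrcoef(y_true, y_pred):
    """MCC from binary ground-truth and binary prediction arrays."""
    y_true = np.asarray(y_true, dtype=bool).ravel()
    y_pred = np.asarray(y_pred, dtype=bool).ravel()
    tp = float(np.sum(y_true & y_pred))
    tn = float(np.sum(~y_true & ~y_pred))
    fp = float(np.sum(~y_true & y_pred))
    fn = float(np.sum(y_true & ~y_pred))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def roc_auc(y_true, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    y_true = np.asarray(y_true, dtype=bool).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative voxel")
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)                # 1-based rank of last tied item
    avg_rank = last_rank - (counts - 1) / 2.0    # average rank per unique score
    ranks = avg_rank[inverse]
    u_stat = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)

# Toy voxel labels and predicted probabilities (made up, not study data).
labels = np.array([0, 0, 1, 1])
probs = np.array([0.1, 0.4, 0.35, 0.8])
auc = roc_auc(labels, probs)
mcc = matthews_corrcoef(labels, probs >= 0.5)  # 0.5 threshold is an assumption
```

MCC is often preferred for segmentation with strongly imbalanced classes because it uses all four confusion-matrix counts, whereas AUC is threshold-free and summarizes the ranking quality of the predicted probabilities.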
5.3 MRS results
The MRS results showed that the predicted infiltrated region had a significantly more aggressive signature than the non-infiltrated region, which suggests that the infiltration prediction could be meaningful with regard to tumor-induced metabolic change. Specifically, choline is a marker of cellular turnover and membrane integrity, which is correlated with tumor proliferation. NAA is a marker of neuronal structure, which may be destroyed by tumor infiltration. In previous studies, the choline/NAA ratio was frequently used as an imaging marker to indicate tumor invasiveness and was shown to correlate with patient outcomes. The comparison of MRS data from the predicted infiltrated and non-infiltrated regions is detailed in Table 6.
IR: infiltration region; NAA: N-acetylaspartate
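As an illustration of the choline/NAA comparison between regions, the sketch below computes the mean ratio over MRS voxels inside a given region mask. The function name `mean_cho_naa_ratio`, the handling of near-zero NAA values and all numbers are hypothetical; they are not the values reported in Table 6, and the statistical test used in the study is not reproduced here.

```python
import numpy as np

def mean_cho_naa_ratio(choline, naa, region_mask, eps=1e-8):
    """Mean choline/NAA ratio over MRS voxels selected by region_mask.

    Voxels whose NAA value is not above eps are skipped to avoid division
    by zero (an assumption of this sketch, not a documented study step).
    """
    choline = np.asarray(choline, dtype=float)
    naa = np.asarray(naa, dtype=float)
    mask = np.asarray(region_mask, dtype=bool)
    valid = mask & (naa > eps)
    if not valid.any():
        return float("nan")
    return float(np.mean(choline[valid] / naa[valid]))

# Made-up metabolite values for four MRS voxels (illustration only).
choline = np.array([2.0, 3.0, 1.0, 1.0])
naa = np.array([1.0, 1.5, 2.0, 4.0])
infiltrated = np.array([1, 1, 0, 0], dtype=bool)

ratio_infiltrated = mean_cho_naa_ratio(choline, naa, infiltrated)
ratio_non_infiltrated = mean_cho_naa_ratio(choline, naa, ~infiltrated)
```

A higher ratio in the infiltrated mask than in its complement would be consistent with the more aggressive metabolic signature described above.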
Our study has limitations. Firstly, our manual labels were delineated by human experts. Therefore, unlike analyses of synthetic images, any analysis performed on this dataset may be biased and subjective. Secondly, the other weakly supervised models that we compared with our model were not developed for MRI, so their performance may be affected when applied to our images. Lastly, due to the nature of tumor infiltration and ethical issues, some infiltrated tumor may not be directly observed and measured, as some tumor regions are more sensitive to treatment. Therefore, incorporating longitudinal MRI into the model could yield a more accurate estimation of the infiltrated tumor, which we are addressing in our current study.
In this paper, we presented an expectation-maximization regularized weakly supervised tumor segmentation model based on deep convolutional neural networks. The proposed method was developed to segment both the core and the peritumoral infiltrated tumor based on multiparametric MRI. This weakly supervised model was developed to tackle the challenge of obtaining full, accurate labels for the infiltrated tumor. To effectively leverage the physiological MRI, which has higher specificity but lower resolution than structural MRI, we constructed a physiological prior map generated from a fully connected neural network for the iterative optimization of the CNN segmentation model. Evaluated using the tumor burden, tumor recurrence and MRS, our proposed model, with the regularizer constructed from physiological MRI, achieved higher accuracy than the published state-of-the-art weakly supervised methods.
- Imaging surrogates of infiltration obtained via multiparametric imaging pattern analysis predict subsequent location of recurrence of glioblastoma. Neurosurgery 78 (4), pp. 572–580. Cited by: §2, §3.5, §5.2.
- Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BraTS challenge. arXiv preprint arXiv:1811.02629. Cited by: §1, §2.
- A comprehensive analysis of weakly-supervised semantic segmentation in different image domains. International Journal of Computer Vision. Cited by: §2.
- A convolutional neural network approach to brain lesion segmentation. Ischemic Stroke Lesion Segment, pp. 51–6. Cited by: §2.
- Automated brain tumor segmentation using multimodal brain scans: a survey based on models submitted to the BraTS 2012-2018 challenges. IEEE Rev Biomed Eng 13, pp. 156–168. Cited by: §1, §2, §2.
- Diffusion tensor imaging of cerebral white matter: a pictorial review of physics, fiber tract anatomy, and tumor imaging patterns. American Journal of Neuroradiology 25 (3), pp. 356–369. Cited by: §1.
- Two-stage cascaded U-Net: 1st place solution to BraTS challenge 2019 segmentation task. In International MICCAI Brainlesion Workshop, pp. 231–241. Cited by: §2.
- Ensembles of multiple models and architectures for robust brain tumour segmentation. In International MICCAI Brainlesion Workshop, pp. 450–462. Cited by: §2.
- DeepMedic for brain tumor segmentation. In International workshop on Brainlesion: Glioma, multiple sclerosis, stroke and traumatic brain injuries, pp. 138–149. Cited by: §2.
- Constrained-CNN losses for weakly supervised segmentation. Med Image Anal 54, pp. 88–99. Cited by: §2, §3.5, §5.2.
- Multi-parametric and multi-regional histogram analysis of MRI: modality integration reveals imaging phenotypes of glioblastoma. Eur Radiol 29 (9), pp. 4718–4729. Cited by: §1.
- Intratumoral heterogeneity of glioblastoma infiltration revealed by joint histogram analysis of diffusion tensor imaging. Neurosurgery 85 (4), pp. 524–534. Cited by: §1.
- Characterizing tumor invasiveness of glioblastoma using multiparametric magnetic resonance imaging. J Neurosurg, pp. 1–8. Cited by: §1.
- Intratumoral heterogeneity of glioblastoma infiltration revealed by joint histogram analysis of diffusion tensor imaging. Neurosurgery 85 (4), pp. 524–534. Cited by: §4.2.
- Characterizing tumor invasiveness of glioblastoma using multiparametric magnetic resonance imaging. Journal of Neurosurgery 1 (aop), pp. 1–8. Cited by: §4.2.
- Dynamic susceptibility-weighted perfusion imaging of high-grade gliomas: characterization of spatial heterogeneity. American Journal of Neuroradiology 26 (6), pp. 1446–1454. Cited by: §1.
- Brain tumor target volume determination for radiation treatment planning through automated MRI segmentation. Int J Radiat Oncol Biol Phys 59 (1), pp. 300–12. Cited by: §1.
- The EM algorithm and extensions. Vol. 382, John Wiley & Sons. Cited by: §3.4.
- The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans Med Imaging 34 (10), pp. 1993–2024. Cited by: §2.
- 3D MRI brain tumor segmentation using autoencoder regularization. In International MICCAI Brainlesion Workshop, pp. 311–320. Cited by: §2.
- A brain tumor segmentation framework based on outlier detection. Med Image Anal 8 (3), pp. 275–83. Cited by: §2.
- Weakly supervised segmentation from extreme points. In Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention, pp. 42–50. Cited by: §2, §3.5, §5.2.
- Radiotherapy plus concomitant and adjuvant temozolomide for glioblastoma. N Engl J Med 352 (10), pp. 987–96. Cited by: §1.
- Normalized cut loss for weakly-supervised CNN segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1818–1827. Cited by: §2, §3.5, §5.2.
- Multi-modal brain tumor segmentation using deep convolutional neural networks. MICCAI BraTS (brain tumor segmentation) challenge. Proceedings, winning contribution, pp. 31–35. Cited by: §2.
- Improved detection of diffuse glioma infiltration with imaging combinations: a diagnostic accuracy study. Neuro Oncol 22 (3), pp. 412–422. Cited by: §1.
- A review on brain tumor segmentation of MRI images. Magn Reson Imaging 61, pp. 247–259. Cited by: §1.
- XTRACT-standardised protocols for automated tractography in the human and macaque brain. NeuroImage, pp. 116923. Cited by: §2.
- EANO guideline for the diagnosis and treatment of anaplastic gliomas and glioblastoma. Lancet Oncol 15 (9), pp. e395–403. Cited by: §1.
- European Association for Neuro-Oncology (EANO) guideline on the diagnosis and treatment of adult astrocytic and oligodendroglial gliomas. Lancet Oncol 18 (6), pp. e315–e329. Cited by: §1.
- Glioblastoma in adults: a Society for Neuro-Oncology (SNO) and European Society of Neuro-Oncology (EANO) consensus review on current management and future directions. Neuro Oncol 22 (8), pp. 1073–1113. Cited by: §1.
- Multimodal MRI characteristics of the glioblastoma infiltration beyond contrast enhancement. Ther Adv Neurol Disord 12, pp. 1756286419844664. Cited by: §1.
- A neural network approach to identify the peritumoral invasive areas in glioblastoma patients by using MR radiomics. Sci Rep 10 (1), pp. 9748. Cited by: §1.
- Weakly-supervised salient object detection via scribble annotations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12546–12555. Cited by: §2.