Improving 3D U-Net for Brain Tumor Segmentation by Utilizing Lesion Prior


Abstract

We propose a novel, simple and effective method to integrate a lesion prior into a 3D U-Net for improving brain tumor segmentation. First, we utilize the ground-truth brain tumor lesions from a group of patients to generate heatmaps of the different types of lesions. These heatmaps are used to create a volume-of-interest (VOI) map that contains prior information about brain tumor lesions. The VOI map is then integrated with the multimodal MR images and input to a 3D U-Net for segmentation. The proposed method is evaluated on a public benchmark dataset, and the experimental results show that the proposed feature fusion method achieves an improvement over the baseline methods. In addition, our proposed method achieves competitive performance compared to state-of-the-art methods.

Po-Yu Kao, Jefferson W. Chen, B. S. Manjunath
Vision Research Lab, University of California, Santa Barbara, California, United States
UCI Health, University of California, Irvine, California, United States
Acknowledgements: This work was partially supported by a National Institutes of Health (NIH) award # 5R01NS103774-03.
Keywords: Brain tumor segmentation, feature fusion, volume-of-interest, 3D U-Net, lesion prior

1 Introduction

Primary central nervous system (CNS) tumors refer to a heterogeneous group of tumors arising from cells within the CNS and can be benign or malignant. Malignant primary brain tumors remain among the most difficult cancers to treat, with a 5-year overall survival rate no greater than 35%. The most common malignant primary brain tumors in adults are gliomas. In a patient with a suspected brain tumor, magnetic resonance imaging (MRI) with gadolinium is the investigation of choice [lapointe2018primary]. Manual segmentation of brain tumors on MR images is a challenging and time-consuming task. Therefore, an automatic and accurate brain tumor segmentation tool benefits radiologists and physicians in both diagnosis and treatment planning.

Convolutional neural networks have achieved state-of-the-art performance in the recent Multimodal Brain Tumor Image Segmentation Benchmarks (BraTS) [isensee2017brain, isensee2018no, kamnitsas2017ensembles, myronenko20183d]. These works focus on designing new network architectures, loss functions, data augmentation strategies, and training and testing procedures in order to improve brain tumor segmentation performance. Another method, proposed by Kao et al. [kao2018brain], utilizes an existing brain parcellation to bring brain location information into patch-based neural networks, which improves their brain tumor segmentation performance. Inspired by their work, we directly integrate a lesion prior with the multimodal MR images and input the fused information to a 3D U-Net. The proposed lesion prior fusion method includes two steps: (i) we first create a volume-of-interest (VOI) map from the ground-truth brain tumor lesions, and (ii) this VOI map is then integrated with the multimodal MR images and input to a 3D U-Net for brain tumor segmentation. The main contribution of this paper is the integration of a lesion prior into a 3D U-Net architecture, which improves the segmentation performance of the 3D U-Net.

2 Materials and Methods

2.1 Dataset

The Multimodal Brain Tumor Image Segmentation Benchmark (BraTS) 2017 [bakas2017advancing, bakas2017gbm, bakas2017lgg, menze2015multimodal] provides 285 subjects in the training set and 46 subjects in the validation set. Multimodal MR images are provided for each subject, but ground-truth lesion masks are only available for the training subjects. These MR images include T1-weighted, contrast-enhanced T1-weighted, T2-weighted, and fluid-attenuated inversion recovery scans, and the ground-truth lesion mask comprises the enhancing tumor (ET), edema (ED), and necrotic & non-enhancing tumor (NCR/NET). Each image has a dimension of 240 × 240 × 155 voxels, and the voxel resolution is 1 mm × 1 mm × 1 mm. The provided data are intra-subject registered, interpolated to the same resolution, and skull-stripped.

2.2 Volume-of-interest Map

The volume-of-interest (VOI) map is built in the Montreal Neurological Institute (MNI) 1mm space [grabner2006symmetric], and each voxel of the VOI map has a label ranging from 0 to 9, which represents a different probability of observing the brain tumor lesions. First, we build the heatmaps of the different types of brain tumor lesions in the MNI space. We apply inter-subject registration, which registers the ground-truth lesions of each BraTS 2017 training subject from the subject space to the MNI space, using FLIRT [jenkinson2001global] from FSL [jenkinson2012fsl]. We then split the brain lesions of each subject into three binary masks, where each binary mask contains information about only one type of lesion. For each type of lesion, we apply element-wise summation over the binary masks of all 285 training subjects to create the heatmap of that lesion type. Fig. 1 shows the heatmaps of the different brain tumor lesions from the BraTS 2017 training subjects in the MNI space.
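As a concrete illustration, the heatmap construction can be sketched in Python as follows, assuming the ground-truth lesion masks have already been registered to the MNI 1mm space with FLIRT and saved as NIfTI files; the file paths are hypothetical, and the label values follow the BraTS convention (1 = NCR/NET, 2 = ED, 4 = ET).

# Sketch: build per-lesion heatmaps by element-wise summation of the
# MNI-registered ground-truth masks of all training subjects.
import glob
import nibabel as nib
import numpy as np

LESION_LABELS = {"NCR_NET": 1, "ED": 2, "ET": 4}   # BraTS label convention
mask_paths = sorted(glob.glob("registered_masks/*_seg_mni.nii.gz"))  # hypothetical paths

heatmaps = {name: None for name in LESION_LABELS}
affine = None
for path in mask_paths:
    img = nib.load(path)
    seg = img.get_fdata().astype(np.int16)
    affine = img.affine
    for name, label in LESION_LABELS.items():
        binary = (seg == label).astype(np.float32)          # one binary mask per lesion type
        heatmaps[name] = binary if heatmaps[name] is None else heatmaps[name] + binary

for name, heatmap in heatmaps.items():
    nib.save(nib.Nifti1Image(heatmap, affine), "heatmap_{}.nii.gz".format(name))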

Figure 1: The heatmaps of different brain tumor lesions (panels, left to right: ED, NCR/NET, and ET). The brighter (yellow) voxels represent higher intensity values. Best viewed in color.

The heatmaps of the different brain lesions are then used to create the VOI map. The VOI map construction accounts for the fact that the whole tumor is a superset of ET, NCR/NET, and ED, and that the tumor core includes ET and NCR/NET. In addition, ETs are usually observed in patients with high-grade gliomas, whose survival rate is considerably lower than that of patients with low-grade gliomas. Based on these observations, we create Algorithm 1 to generate the VOI map and prioritize the order of the VOI labels.

input : A heatmap H_ED of ED of size W × H × D
        A heatmap H_NCR of NCR/NET of size W × H × D
        A heatmap H_ET of ET of size W × H × D
output : The VOI map V of size W × H × D
T_ET^lo, T_ET^hi ← the chosen lower and upper percentiles of the non-zero voxels of H_ET;
T_NCR^lo, T_NCR^hi ← the chosen lower and upper percentiles of the non-zero voxels of H_NCR;
T_ED^lo, T_ED^hi ← the chosen lower and upper percentiles of the non-zero voxels of H_ED;
for i ← 1 to W do
       for j ← 1 to H do
             for k ← 1 to D do
                   if H_ET(i,j,k) ≥ T_ET^hi then
                        V(i,j,k) ← 9;
                   else if H_ET(i,j,k) ≥ T_ET^lo then
                        V(i,j,k) ← 8;
                   else if H_ET(i,j,k) > 0 then
                        V(i,j,k) ← 7;
                   else if H_NCR(i,j,k) ≥ T_NCR^hi then
                        V(i,j,k) ← 6;
                   else if H_NCR(i,j,k) ≥ T_NCR^lo then
                        V(i,j,k) ← 5;
                   else if H_NCR(i,j,k) > 0 then
                        V(i,j,k) ← 4;
                   else if H_ED(i,j,k) ≥ T_ED^hi then
                        V(i,j,k) ← 3;
                   else if H_ED(i,j,k) ≥ T_ED^lo then
                        V(i,j,k) ← 2;
                   else if H_ED(i,j,k) > 0 then
                        V(i,j,k) ← 1;
                   else
                        V(i,j,k) ← 0;
                   end if
             end for
       end for
end for
Algorithm 1 Build the VOI map from the heatmaps of lesions.
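A vectorized Python sketch of Algorithm 1 is given below. It assumes one lower and one upper percentile threshold per heatmap and assigns labels in the prioritized order ET > NCR/NET > ED described above; the function and variable names are ours, and the percentile values are left as parameters.

import numpy as np

def build_voi_map(h_ed, h_ncr, h_et, lo_pct, hi_pct):
    # Assign VOI labels 0-9 from the three lesion heatmaps; higher-priority
    # lesions (ET, then NCR/NET, then ED) claim a voxel first.
    voi = np.zeros(h_et.shape, dtype=np.uint8)
    for heatmap, labels in ((h_et, (9, 8, 7)),
                            (h_ncr, (6, 5, 4)),
                            (h_ed, (3, 2, 1))):
        t_lo, t_hi = np.percentile(heatmap[heatmap > 0], [lo_pct, hi_pct])
        unassigned = voi == 0                       # lower-priority labels never overwrite
        voi[unassigned & (heatmap >= t_hi)] = labels[0]
        voi[unassigned & (heatmap >= t_lo) & (heatmap < t_hi)] = labels[1]
        voi[unassigned & (heatmap > 0) & (heatmap < t_lo)] = labels[2]
    return voi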

Note that the VOI labels are based on thresholds chosen from the percentiles of the non-zero voxels of the heatmaps. For each lesion type, we sort the frequency counts of the non-zero voxels, where the heatmaps are used to generate these frequency counts. The percentile thresholds are selected from these sorted frequency counts, and we then use them to create the VOI label mapping. Any given voxel location in the VOI map thus has a prior probability of being each type of lesion. We examined different thresholds and selected the percentiles that yield the best overall segmentation performance. Fig. 2 shows the VOI map and the distribution of the brain tumor lesions occurring in the different labels of the VOI map. This distribution is computed by dividing the total voxel value of the lesions in the heatmaps by the total volume of the corresponding VOI label. The distribution shows that (i) the prior probabilities of the different lesions depend on their corresponding labels in the VOI map, and (ii) lesions are more likely to occur in voxels with larger VOI labels.
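The per-label distribution shown in Fig. 2 can be computed, for example, with the short sketch below, which divides the summed heatmap values of each lesion type inside every VOI label by the volume of that label (names are ours).

import numpy as np

def lesion_distribution(voi, heatmaps):
    # heatmaps: e.g. {"ED": h_ed, "NCR_NET": h_ncr, "ET": h_et}
    dist = {}
    for name, h in heatmaps.items():
        per_label = []
        for label in range(1, 10):
            region = voi == label
            volume = region.sum()
            per_label.append(h[region].sum() / volume if volume else 0.0)
        dist[name] = per_label
    return dist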

Figure 2: The VOI map (background-0, red-1, green-2, blue-3, yellow-4, orange-5, pink-6, purple-7, grey-8, and brown-9) and the distribution of brain tumor lesions (green-ED, blue-NCR/NET, and red-ET) observed in the different labels of the VOI map from the BraTS 2017 training subjects. Best viewed in color.

2.3 3D U-Net

2.3.1 Data pre-processing

Intensity normalization is the procedure of mapping the intensities of different MR images onto a standard scale, and it is an essential step to avoid initial biases and improve the performance of the network. For each MR image, we first clip the intensities at the [0.2 percentile, 99.8 percentile] of the non-zero voxels to remove outliers and subsequently normalize every voxel within the brain with respect to the mean and standard deviation of the brain voxels. That is, $\hat{x}_i = (x_i - \mu)/\sigma$, where $i$ is the index of a voxel inside the brain, $\hat{x}_i$ is the normalized voxel, $x_i$ is the corresponding raw voxel, and $\mu$ and $\sigma$ are the mean and standard deviation of the raw voxels inside the brain, respectively.
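A minimal sketch of this pre-processing step is given below; whether the mean and standard deviation are computed before or after clipping is our assumption.

import numpy as np

def normalize_mri(volume):
    # Clip outliers and z-score the intensities inside the brain (non-zero voxels).
    brain = volume > 0                                   # skull-stripped: zeros are background
    low, high = np.percentile(volume[brain], [0.2, 99.8])
    clipped = np.clip(volume, low, high)
    mean, std = clipped[brain].mean(), clipped[brain].std()
    normalized = np.zeros_like(clipped, dtype=np.float32)
    normalized[brain] = (clipped[brain] - mean) / std
    return normalized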

2.3.2 Network architecture

The proposed network architecture shown in Fig. 3 is based on 3D U-Nets [cciccek20163d, jiang2019accurate]. Different colors of blocks represent different types of layers. The number of convolutional kernels is indicated within the white box. Group normalization [wu2018group] is used, and the number of groups is set to 4. Trilinear interpolation is used in the upsampling layer.
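The exact layer arrangement is given in Fig. 3; as an illustration of the building blocks described above, a PyTorch sketch of one encoder block and the up/down-sampling layers might look as follows. The channel counts, activation function, and layer ordering are placeholder assumptions, not a reproduction of Fig. 3.

import torch.nn as nn

class EncoderBlock(nn.Module):
    # Sketch: conv(3) -> GN(4 groups) -> ReLU -> D(0.3) -> conv(3) -> GN -> ReLU
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.GroupNorm(num_groups=4, num_channels=out_ch),
            nn.ReLU(inplace=True),
            nn.Dropout3d(p=0.3),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.GroupNorm(num_groups=4, num_channels=out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

downsample = nn.MaxPool3d(kernel_size=2)                                        # maxpool(2)
upsample = nn.Upsample(scale_factor=2, mode="trilinear", align_corners=False)   # trilinear upsampling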

Figure 3: The proposed network architecture. conv(3): convolutional layer with 3×3×3 kernels, GN: group normalization, D(0.3): dropout layer with a 0.3 dropout rate, maxpool(2): max pooling layer with a 2×2×2 window, and conv(1): convolutional layer with 1×1×1 kernels. Best viewed in color.

2.3.3 Training and testing procedure

The proposed network is trained with randomly cropped patches and a batch size of 2. A larger input patch captures more contextual information of the brain. In every epoch, one cropped patch is randomly extracted from each subject, and the network is trained for a total of 300 epochs. The weights of the network are updated by the Adam algorithm [kingma2014adam] with AMSGrad [reddi2018convergence], a decaying learning-rate schedule, and L2 weight decay. For the loss function, the standard multi-class cross-entropy loss with hard negative mining is used to address the class imbalance of the dataset: we only back-propagate the positive (lesion) voxels and the negative (background) voxels with the largest losses (hard negatives). In our implementation, the number of selected negative voxels is at most three times the number of positive voxels. In addition, data augmentation is not used for either training or testing. At testing time, we input the entire image of each patient into the trained 3D U-Net to obtain the predicted lesion mask. Training takes approximately 12.5 hours, and testing takes approximately 1.5 seconds per subject on an Nvidia 1080 Ti GPU.
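A sketch of the hard-negative-mining cross-entropy loss and the optimizer configuration described above; the 3:1 negative-to-positive ratio follows the text, while the learning rate and weight decay values are placeholders since the exact schedule is not reproduced here.

import torch
import torch.nn.functional as F

def hard_negative_ce_loss(logits, target, neg_to_pos_ratio=3):
    # Cross-entropy over all lesion voxels plus only the hardest background voxels.
    per_voxel = F.cross_entropy(logits, target, reduction="none")   # shape (B, D, H, W)
    pos_mask = target > 0                                           # lesion voxels
    pos_loss = per_voxel[pos_mask]
    neg_loss = per_voxel[~pos_mask]
    k = min(neg_loss.numel(), max(1, neg_to_pos_ratio * pos_loss.numel()))
    hard_neg_loss, _ = torch.topk(neg_loss, k)                      # hardest negatives
    return torch.cat([pos_loss, hard_neg_loss]).mean()

# Adam with AMSGrad and L2 weight decay (values are placeholders):
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
#                              weight_decay=1e-5, amsgrad=True)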

Figure 4: The pipeline of integrating the VOI map and a 3D U-Net.

2.4 Integrating the VOI Map with a 3D U-Net

Fig. 4 shows the pipeline of integrating the VOI map with a 3D U-Net for brain tumor segmentation. First, we register the VOI map from the MNI 1mm space to the subject space using FLIRT [jenkinson2001global] from FSL [jenkinson2012fsl], and this registered VOI map is then split into 9 binary masks, where each binary mask contains information about only one VOI label. Afterward, these binary masks are concatenated with the multimodal MR images. Finally, we input this 13-channel (4 image channels + 9 VOI channels) image to a 3D U-Net for both training and testing.
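A sketch of this channel fusion step, assuming the VOI map has already been registered to the subject space with FLIRT; the file loading and function name are hypothetical.

import numpy as np
import nibabel as nib

def make_13_channel_input(mr_paths, voi_path):
    # Stack the 4 MR modalities with 9 binary VOI-label channels (labels 1-9).
    mr_channels = [nib.load(p).get_fdata(dtype=np.float32) for p in mr_paths]   # T1, T1c, T2, FLAIR
    voi = np.rint(nib.load(voi_path).get_fdata()).astype(np.uint8)
    voi_channels = [(voi == label).astype(np.float32) for label in range(1, 10)]
    return np.stack(mr_channels + voi_channels, axis=0)                         # shape: (13, X, Y, Z)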

2.5 Evaluation Metrics

The employed evaluation metrics are (i) the Dice similarity coefficient (DSC) and (ii) the 95th percentile of the Hausdorff distance (H95). The DSC is a quotient of similarity that ranges between 0 and 1 and is defined as

$\mathrm{DSC}(G, P) = \frac{2\,|G \cap P|}{|G| + |P|},$

where $|G|$ and $|P|$ are the number of voxels in the ground-truth label and the predicted label, respectively. The Hausdorff distance measures how far two subsets of a metric space are from each other and is defined as

$H(G, P) = \max\left\{ \sup_{g \in G} \inf_{p \in P} d(g, p),\; \sup_{p \in P} \inf_{g \in G} d(g, p) \right\},$

where $d(\cdot,\cdot)$ is the Euclidean distance, $\sup$ is the supremum, and $\inf$ is the infimum.
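For completeness, a NumPy/SciPy sketch of how these two metrics could be computed is given below; the official results in Table 1 are produced by the online evaluation platform, so this is only an illustrative implementation that measures surface distances via distance transforms.

import numpy as np
from scipy.ndimage import distance_transform_edt, binary_erosion

def dice(gt, pred):
    # Dice similarity coefficient between two binary masks.
    gt, pred = gt.astype(bool), pred.astype(bool)
    denom = gt.sum() + pred.sum()
    return 2.0 * np.logical_and(gt, pred).sum() / denom if denom else 1.0

def hausdorff95(gt, pred, spacing=(1.0, 1.0, 1.0)):
    # 95th-percentile symmetric Hausdorff distance between the mask surfaces.
    gt, pred = gt.astype(bool), pred.astype(bool)
    gt_surface = gt & ~binary_erosion(gt)
    pred_surface = pred & ~binary_erosion(pred)
    dt_gt = distance_transform_edt(~gt_surface, sampling=spacing)
    dt_pred = distance_transform_edt(~pred_surface, sampling=spacing)
    distances = np.concatenate([dt_pred[gt_surface], dt_gt[pred_surface]])
    return np.percentile(distances, 95)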

3 Experimental Results and Discussion

First, we examine whether the proposed lesion prior fusion method improves the brain tumor segmentation performance of the proposed 3D U-Net. To this end, we train two identical 3D U-Nets, with and without the additional VOI map, using the 285 subjects of the BraTS 2017 training set. The BraTS 2017 validation set is used to evaluate the performance of these networks. The quantitative results are shown in Table 1. From the first two rows of Table 1, our proposed lesion prior fusion method improves the performance of the 3D U-Net, particularly the DSC of ET (by 3.5%) and the H95 of ET (by 2.56) and the whole tumor (by 2.39).

Model Descriptions                          | DSC (ET / WT / TC)     | H95 (ET / WT / TC)
Single 3D U-Net (baseline)                  | 0.695 / 0.896 / 0.762  | 6.79 / 6.92 / 11.38
Single 3D U-Net + VOI (proposed)            | 0.730 / 0.899 / 0.764  | 4.23 / 4.53 / 10.93
Ensemble of five 3D U-Nets (baseline)       | 0.723 / 0.902 / 0.763  | 5.99 / 4.75 / 10.58
Ensemble of five 3D U-Nets + VOI (proposed) | 0.744 / 0.903 / 0.780  | 5.01 / 3.86 / 9.71
Isensee et al. [isensee2017brain]           | 0.732 / 0.896 / 0.797  | 4.55 / 6.97 / 9.48
Kamnitsas et al. [kamnitsas2017ensembles]   | 0.738 / 0.901 / 0.797  | 4.50 / 4.23 / 6.56
Table 1: Quantitative results of the different models on the BraTS 2017 validation set. Higher DSC and lower H95 indicate better segmentation performance. These results are given by the official online evaluation website and are reported as means. The tumor core (TC) is the union of necrosis & non-enhancing tumor and enhancing tumor (ET). The whole tumor (WT) is the union of edema, necrosis & non-enhancing tumor, and enhancing tumor.

Second, we examine whether the proposed lesion prior fusion method improves the performance of an ensemble of 3D U-Nets. To this end, we train two identical ensembles, with and without the additional VOI map, using the 285 subjects of the BraTS 2017 training set. Each ensemble has five identical networks with different seed initializations, and the output of the ensemble is the average of the five networks' outputs. The BraTS 2017 validation set is used to evaluate the performance of the ensembles, and the quantitative results are shown in Table 1. From the middle two rows of Table 1, our proposed lesion prior fusion method also improves the tumor segmentation performance of the ensemble of five 3D U-Nets, particularly the DSC of ET (by 2.1%) and the tumor core (by 1.7%). The reason the VOI map yields the greatest improvement on ET is that the percentiles of the ET heatmap have the highest priority when we create the VOI map. In addition, the proposed VOI map, directly built from the heatmaps of the brain lesions, has inhomogeneous labels within neighboring voxels, which carries more precise information about the brain tumor lesions to the 3D U-Net.
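The ensembling step itself is straightforward; a sketch is shown below, in which the softmax outputs of the five identically trained networks are averaged before taking the voxel-wise argmax (the function name is ours).

import torch

@torch.no_grad()
def ensemble_predict(models, image):
    # Average the class probabilities of several identically trained networks.
    probs = [torch.softmax(m(image), dim=1) for m in models]   # each: (B, C, D, H, W)
    mean_prob = torch.stack(probs, dim=0).mean(dim=0)
    return mean_prob.argmax(dim=1)                             # predicted label per voxel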

Finally, we compare the performance of our proposed method with the state-of-the-art methods [isensee2017brain, kamnitsas2017ensembles]. From Table 1, the baseline model performs worse than the state-of-the-art methods, but it achieves competitive performance once the proposed VOI map is integrated. It is noted that the ensemble of Kamnitsas et al. [kamnitsas2017ensembles] contains 7 different types of models, whereas our proposed ensemble only consists of five 3D U-Nets.

4 Conclusion

We have proposed a novel method to integrate prior information about lesion probabilities into a 3D U-Net for improving brain tumor segmentation. Our experimental results demonstrate that the proposed lesion prior fusion approach improves the segmentation performance of the baseline model. Moreover, the proposed lesion prior fusion method can easily be integrated with other network architectures to potentially enhance their segmentation performance further.

References
