An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation

Roger D. Soberanis-Mukul, Computer Aided Medical Procedures, Technical University of Munich, Germany
Shadi Albarqouni, Computer Aided Medical Procedures, Technical University of Munich, Germany
Nassir Navab, Computer Aided Medical Procedures, Technical University of Munich, Germany

Organ segmentation is an important pre-processing step in many computer assisted intervention and computer assisted diagnosis methods. In recent years, CNNs have dominated the state of the art in this task. Organ segmentation scenarios present a challenging environment for these methods due to high variability in shape, similarity with background, etc. This leads to the generation of false negative and false positive regions in the output segmentation. In this context, the uncertainty analysis of the model can provide us with useful information about potentially misclassified elements. In this work we propose a method based on uncertainty analysis and graph convolutional networks as a post-processing step for segmentation. For this, we employ the uncertainty levels of the CNN to formulate a semi-supervised graph learning problem that is solved by training a GCN on the low uncertainty elements. Finally, we evaluate the full graph on the trained GCN to get the refined segmentation. We compare our framework with CRF on a graph-like data representation as refinement strategy.

Keywords: Organ segmentation refinement, Uncertainty, GCN, CRF, Semi-supervised.

1 Introduction

Segmentation of anatomical structures is an important step in many computer-aided procedures, such as medical image navigation and detection algorithms. Many of these methods rely on inputs manually segmented by clinical experts. However, this is a time-consuming task due to the large amount of information (generally volumes) that is generated. Organ segmentation in CT or MRI slices has been a topic of research for many years. Recently, with the growth of deep learning models, many architectures have been proposed for dealing with this problem. Some of the challenges for these models are related to the similarity between organs and background. This leads to misclassifications, mainly in boundary regions of the organs, that generate false positive (FP) and false negative (FN) regions in the final results. Because of this, the results might not be sufficient for clinical integration, where higher precision is required. One way to improve model performance is to introduce a post-processing refinement step in the pipeline. Even though dense graph representations of three-dimensional data have been applied for refinement [1], the use of recent graph convolutional networks (GCN) with sparse graph representations of 3-D data has not been fully investigated. In this paper, we propose a two-step approach for the refinement of volumetric segmentations coming from a convolutional neural network (CNN). First, we perform an uncertainty analysis by applying Monte Carlo dropout (MCDO) [9] to the network to obtain the model’s uncertainty. This is used to divide the CNN output into high confidence background, high confidence foreground, and low confidence points (FP and FN candidates). The uncertainty is also used to define a 3-D shape-adapted region of interest (ROI) around the organ. With this information, we define a semi-labeled graph inside the ROI, which is then used to train a GCN in a semi-supervised way using the high confidence nodes.
The refined segmentation is obtained by evaluating the full graph in the trained GCN. Additionally, we compare our framework with the refinement resulting from a fully connected conditional random field (CRF) inference [2]. Our main contributions can be summarized as follows: (i) a methodology to define a semi-labeled graph representation for 3-D medical images; (ii) a refinement strategy that can be added to any CNN model (through MCDO). To the best of our knowledge, this is one of the first works employing a GCN-based refinement for CNN segmentation in medical data, and the first work combining uncertainty-based proposal of misclassified points with graph-like representations and GCN for the refinement of organ segmentation in volumetric data. We validate our method on pancreas segmentation. The results are compared with CRF.

Related Work

Recent segmentation models for medical structures are based on fully convolutional neural networks (FCN). These models can be composed of aggregations of multiple 2-D FCNs [3, 4] or of 3-D FCNs [5, 6]. Refinement strategies are typically added at the end of the process to improve the results. Refinement can also be used as an intermediate processing step, where more complex strategies use the refined results to improve the segmentation. For example, in [7], a set of scribbles is generated by defining a CRF problem that is solved with a graph cuts method. These results can be combined with user-defined scribbles to perform an image-specific fine-tuning of a CNN segmentor. In another context, given the limited availability of labeled medical data, semi-supervised learning methods define strategies to include the (more commonly) available unlabeled medical data. Such strategies include the generation of pseudo-labels for unlabeled data. Here, refinement methods such as densely connected CRF [11] are included in the semi-supervised steps to refine the pseudo-labels. Uncertainty has also proved to be useful as an attention mechanism in semi-supervised learning [12], and recent works in computer vision have started to explore the capabilities of uncertainty for finding potentially misclassified regions for segmentation refinement purposes [10]. In the medical context, uncertainty has been employed as a measure of quality for the segmented output [13], and its ability to reflect incorrect predictions has been recently studied [14]. This motivates the development and research of new refinement strategies based on uncertainty-driven misclassification proposal.

2 Methods

In this section we describe the process employed to refine the segmentation. First, we perform an uncertainty analysis on the CNN. Then, we use this information to define a semi-supervised GCN learning problem for segmentation refinement.

2.1 Uncertainty Analysis.

In order to define FP/FN candidates, we estimate the uncertainty of the CNN using the MCDO strategy presented in [9]. For this, we keep the dropout layers of the network active at inference time and perform $T$ stochastic passes on the network. Then, the model’s expectation is obtained using the following equation:

$$\mathbb{E}(x) \approx \frac{1}{T} \sum_{t=1}^{T} f(x; \theta_t)$$

with $T$ the number of MCDO passes, $f$ a CNN model (slice-wise or volumetric), $x$ the input data, and $\theta_t$ the model parameters after applying dropout in the pass $t$. The misclassified candidates are based on the entropy level of the CNN, computed as:

$$\mathbb{U}(x) = -\sum_{c=1}^{M} P^c(x) \log P^c(x)$$

with $P^c(x)$ the probability map of $x$ for class $c$, and $M = 2$ in our binary segmentation scenario. We use $\mathbb{E}(x)$ as an approximation of the probability $P(x)$ for computing the entropy. Since MCDO can be computationally expensive due to the multiple evaluations, we reduce the volume to the smallest cube containing the biggest 3-D connected component in the CNN prediction $Y$. Finding this connected component, by itself, can increase the overall dice score of $Y$. We perform all the uncertainty analysis considering this smaller area.
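As a rough illustration of the uncertainty analysis above, the following sketch averages several stochastic forward passes and computes the voxel-wise binary entropy of the resulting expectation. The function `toy_forward` is a hypothetical stand-in for a CNN evaluated with dropout active; its noise model, the helper name `mc_dropout_statistics`, and the `eps` clipping constant are assumptions made for this example and are not taken from our implementation.

```python
import numpy as np

def mc_dropout_statistics(stochastic_forward, volume, num_passes, eps=1e-8):
    """Return (expectation, entropy) maps from repeated stochastic passes.

    stochastic_forward: callable mapping a volume to foreground
    probabilities in [0, 1]; assumed to apply dropout internally.
    """
    # Average the foreground probability over the stochastic passes.
    samples = np.stack([stochastic_forward(volume) for _ in range(num_passes)])
    expectation = samples.mean(axis=0)

    # Binary entropy (two classes), using the expectation as the probability.
    p_fg = np.clip(expectation, eps, 1.0 - eps)
    p_bg = 1.0 - p_fg
    entropy = -(p_fg * np.log(p_fg) + p_bg * np.log(p_bg))
    return expectation, entropy


# Toy stand-in for a CNN with dropout (hypothetical): a fixed probability
# map perturbed by random noise on every call.
rng = np.random.default_rng(0)
base = np.zeros((4, 8, 8))
base[:, 2:6, 2:6] = 0.9          # confident foreground block
base[:, 2:6, 6] = 0.5            # ambiguous boundary column

def toy_forward(volume):
    noise = rng.normal(0.0, 0.02, size=volume.shape)
    return np.clip(base + noise, 0.0, 1.0)

expectation, entropy = mc_dropout_statistics(toy_forward, base, num_passes=20)
```

In this toy volume the entropy is highest on the ambiguous boundary column and lowest in the confident background, which is the behaviour the thresholding step relies on.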

2.2 Graph Definition.

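The graph is constructed considering the set of volumes $\{x, Y, \mathbb{E}, \mathbb{U}\}$, i.e., the input, the CNN prediction, the expectation, and the entropy (see Fig. 1). We will use the notation $V(v)$ to define the value of a particular volume $V$ at voxel $v$. We aim to obtain a refined segmentation $Y^*$ using a graph based approach:

$$Y^* = \Gamma(G(x, Y, \mathbb{E}, \mathbb{U}); \phi)$$

with $\Gamma$ a GCN, $G$ a semi-labeled graph, and $\phi$ a set of model parameters.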

Figure 1: a) The GCN refinement strategy. We construct a semi-labeled graph representation based on the uncertainty analysis of the CNN. Then, a GCN is trained to refine the segmentation. b) Connectivity. The black square is connected to its six perpendicular neighbors and to randomly selected voxels.

Since most of the voxels in the volume are irrelevant for the refinement process, we restrict the refinement to a shape-adapted ROI surrounding the uncertainty region. Given that graphs are not restricted to rectangular structured representations of data, we can use ROIs of arbitrary shape adjusted to our working area. The ROI is defined as $\mathrm{ROI} = \mathrm{dilation}(\mathbb{E}_b \cup \mathbb{U}_b)$, with $\mathbb{E}_b$ and $\mathbb{U}_b$ the binarized expectation and binarized entropy respectively. Note that the latter gives us the FN/FP voxel candidates. The expectation is thresholded at 0.5 and the entropy by a parameter $\tau$. The voxels inside the ROI are used to define the nodes of the graph $G$, and each node is represented by a feature vector containing intensity $x(v)$, expectation $\mathbb{E}(v)$, and prediction $Y(v)$. Edges are generated as follows: for a particular voxel (the black square in Fig. 1b) we create a connection to its six perpendicular neighbors and also to 16 randomly selected voxels inside the ROI (the blue squares in Fig. 1b). This allows the definition of a sparse graph representation, where efficient filtering operations are implemented as products of sparse matrices [16]. To define the weights for the edges, we tested an additive function that combines Gaussian kernels on the intensity and the 3-D position associated with the nodes, a diversity term $div$, and a balancing factor $\lambda$. The diversity between two nodes $v_i$ and $v_j$ [17] is defined as

$$div(v_i, v_j) = \sum_{c=1}^{M} \left(P^c(v_i) - P^c(v_j)\right) \log \frac{P^c(v_i)}{P^c(v_j)}$$

with $P^c$ the probability of class $c$, and $M = 2$ for our binary case. We opt for an additive weighting instead of a multiplicative one because the GCN can take advantage of connections with both similar and dissimilar nodes in the learning process, and a multiplicative weighting could cut dissimilar connections, whereas an additive weighting only assigns them a lower weight. Finally, we label each node in the graph according to its uncertainty level using the following rule:

$$l(v) = \begin{cases} 1 & \text{if } \mathbb{U}_b(v) = 0 \text{ and } v \text{ is foreground} \\ 0 & \text{if } \mathbb{U}_b(v) = 0 \text{ and } v \text{ is background} \\ \text{unlabeled} & \text{if } \mathbb{U}_b(v) = 1 \end{cases}$$

In this way, we have defined a semi-supervised node classification problem on the graph, which is solved with the methods presented in [16].
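The graph definition of this section can be sketched as follows. This is only a minimal NumPy illustration under several assumptions: the ROI is obtained with a one-voxel, 6-connected enlargement of the union of the two binary masks, the confident class of a node is read from the thresholded expectation, and the names `dilate`, `build_graph`, `margin`, and the toy volumes are hypothetical and introduced only for this example.

```python
import numpy as np

def dilate(mask, iterations=1):
    """6-connected binary dilation built from shifted copies."""
    out = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(out, 1)  # pads with False
        out = (p[1:-1, 1:-1, 1:-1]
               | p[2:, 1:-1, 1:-1] | p[:-2, 1:-1, 1:-1]
               | p[1:-1, 2:, 1:-1] | p[1:-1, :-2, 1:-1]
               | p[1:-1, 1:-1, 2:] | p[1:-1, 1:-1, :-2])
    return out

def build_graph(intensity, expectation, entropy, prediction, tau,
                num_random=16, margin=1, seed=0):
    """Return (coords, features, edges, labels) for the voxels in the ROI."""
    rng = np.random.default_rng(seed)
    exp_b = expectation >= 0.5            # binarized expectation
    unc_b = entropy >= tau                # binarized entropy: FP/FN candidates
    roi = dilate(exp_b | unc_b, margin)   # assumed enlargement of the union

    coords = np.argwhere(roi)             # one node per ROI voxel
    n = len(coords)
    index = np.full(roi.shape, -1)
    index[roi] = np.arange(n)             # voxel -> node id lookup

    # Node features: intensity, expectation, and CNN prediction.
    features = np.stack([intensity[roi], expectation[roi], prediction[roi]],
                        axis=1).astype(float)

    edges = set()
    offsets = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                        [0, -1, 0], [0, 0, 1], [0, 0, -1]])
    for i, c in enumerate(coords):
        # Six axis-aligned neighbours that also lie inside the ROI.
        for q in c + offsets:
            if np.all(q >= 0) and np.all(q < roi.shape) and index[tuple(q)] >= 0:
                j = int(index[tuple(q)])
                edges.add((min(i, j), max(i, j)))
        # Up to `num_random` long-range links to random ROI voxels.
        for j in rng.choice(n, size=min(num_random, n), replace=False):
            if j != i:
                edges.add((min(i, int(j)), max(i, int(j))))

    # Partial labels: 1/0 for confident nodes, -1 for uncertain (unlabeled).
    labels = np.where(unc_b[roi], -1, exp_b[roi].astype(int))
    return coords, features, sorted(edges), labels


# Toy volumes (hypothetical data): a confident cube with one uncertain face.
shape = (6, 6, 6)
expectation = np.zeros(shape)
expectation[2:4, 2:4, 2:4] = 0.95
entropy = np.zeros(shape)
entropy[2:4, 2:4, 4] = 0.6
intensity = np.random.default_rng(1).random(shape)
prediction = (expectation >= 0.5).astype(float)

coords, features, edges, labels = build_graph(
    intensity, expectation, entropy, prediction, tau=0.3)
```

The enlargement step matters in this sketch: without it the ROI would contain no confident background voxels, and the labeled set would hold a single class.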

3 Experiments and Results

3.1 Implementation Details

We evaluate our framework on a 2-D U-Net [15] trained for pancreas segmentation with the dice loss. In order to compute the uncertainty, we include a dropout layer after every convolutional layer and train for 100 epochs with the Adam optimizer and a learning rate of . We also include batch normalization after every convolutional layer to bring stability to the network. We used the publicly available NIH pancreas dataset [18, 19, 20] for training and testing. We use 53 volumes for training and 20 volumes for testing. Nine volumes were not included in the experiments, since they appear to come from a different distribution. The GCN is composed of two layers: a hidden layer with 32 feature maps and an output layer with two logits. The GCN is trained with Adam for 200 epochs with a learning rate of , at evaluation time and independently for each volume. The parameters for the weighting were set to the variance of their respective arguments and we use . For the CRF refinement we use an implementation of [2], with the CNN expectation as unary potential and the pairwise potential defined in terms of position and intensity (similar to the smoothness and appearance kernels defined in [2]). The number of MCDO samples was set to . We tried different uncertainty thresholds, ranging from to .
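To make the two-layer architecture and the edge weighting more concrete, the sketch below builds a small weighted adjacency matrix, runs one forward pass of a GCN with the propagation rule of [16], and evaluates a cross-entropy loss restricted to the labeled nodes. Several parts are assumptions made for illustration only: the way the diversity term and the two Gaussian kernels are added together in `edge_weight`, the value `lam=0.5`, taking the variances over node intensities and coordinates, the toy graph, and the random weights. Gradient computation and the Adam updates are omitted.

```python
import numpy as np

def edge_weight(e_i, e_j, x_i, x_j, p_i, p_j, sigma_int, sigma_pos, lam, eps=1e-6):
    """Assumed additive weight: lam * diversity + intensity and position kernels."""
    probs_i = np.clip([e_i, 1.0 - e_i], eps, 1.0)   # binary class probabilities
    probs_j = np.clip([e_j, 1.0 - e_j], eps, 1.0)
    diversity = np.sum((probs_i - probs_j) * np.log(probs_i / probs_j))
    k_int = np.exp(-(x_i - x_j) ** 2 / (2.0 * sigma_int))
    k_pos = np.exp(-np.sum((p_i - p_j) ** 2) / (2.0 * sigma_pos))
    return lam * diversity + k_int + k_pos

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 from [16]."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_norm, features, w_hidden, w_out):
    """Two-layer GCN: ReLU hidden layer followed by a linear logit layer."""
    hidden = np.maximum(a_norm @ features @ w_hidden, 0.0)
    return a_norm @ hidden @ w_out          # (num_nodes, 2) logits

def masked_cross_entropy(logits, labels):
    """Mean cross-entropy over labeled nodes only (label -1 = unlabeled)."""
    mask = labels >= 0
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[mask, labels[mask]].mean()


# Toy graph (hypothetical): 5 nodes with positions, intensities, expectations.
rng = np.random.default_rng(0)
positions = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 1],
                      [1, 1, 1], [2, 1, 1]], dtype=float)
intensity = np.array([0.80, 0.75, 0.50, 0.20, 0.15])
expectation = np.array([0.95, 0.90, 0.55, 0.10, 0.05])
prediction = (expectation >= 0.5).astype(float)
labels = np.array([1, 1, -1, 0, 0])        # middle node is uncertain
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]

# Kernel parameters set to variances (assumed: over node values).
sigma_int = intensity.var()
sigma_pos = positions.var()
adj = np.zeros((5, 5))
for i, j in edges:
    w = edge_weight(expectation[i], expectation[j], intensity[i], intensity[j],
                    positions[i], positions[j], sigma_int, sigma_pos, lam=0.5)
    adj[i, j] = adj[j, i] = w

features = np.stack([intensity, expectation, prediction], axis=1)
w_hidden = rng.normal(0.0, 0.1, size=(3, 32))   # 32 hidden feature maps
w_out = rng.normal(0.0, 0.1, size=(32, 2))      # two output logits

a_norm = normalize_adjacency(adj)
logits = gcn_forward(a_norm, features, w_hidden, w_out)
loss = masked_cross_entropy(logits, labels)
refined = logits.argmax(axis=1)            # class for every node, labeled or not
```

In a full pipeline the loss would be minimized over the weight matrices for each test volume, and `refined` would then be written back into the voxels of the ROI.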

3.2 Results

Model                   Average DSC   Std. Dev.   Max. DSC   Min. DSC
U-Net                   76.00 %       6.35 %      85.56 %    61.91 %
U-Net connected comp.   77.04 %       7.86 %      88.21 %    59.15 %
GCN ()                  78.04 %       7.29 %      88.05 %    61.88 %
GCN ()                  77.92 %       7.43 %      88.10 %    61.36 %
GCN ()                  77.93 %       7.41 %      87.90 %    61.32 %
GCN ()                  77.96 %       7.34 %      87.90 %    61.21 %
GCN ()                  77.82 %       7.38 %      88.04 %    60.85 %
CRF                     77.84 %       8.33 %      88.21 %    53.74 %
Table 1: CNN/GCN/CRF dice score (DSC) performance.

Figure 2: The yellow shaded region is the ground truth. A is a sample slice; B, C and D are the U-Net, CRF and GCN outputs (the bright regions), respectively. Rows 1 and 2 show cases where the refinement increases the accuracy by extending a region (1) or by adding a missing region (2). Row 3 shows a case where connectivity was lost in the center.

Table 1 shows the results for the U-Net segmentation before and after finding the largest connected component (U-Net connected comp.), together with the dice score for the GCN strategy using different uncertainty thresholds, and the CRF performance. The results show a larger improvement in the dice score when using the GCN-based refinement, especially when a threshold is used. These results were achieved with a sparse graph representation, showing that more efficient connectivity strategies can be applied instead of fully connected representations, and that GCNs can make use of these sparse representations. Visual results are presented in Fig. 2. Rows 1 and 2 show how graph-based methods can use connectivity relationships to recover missing regions. However, in row 3 we can see a loss of connectivity. This happens because both models include the CNN expectation in their definition, which causes disconnections when the expectation differs strongly from the real segmentation. In these cases, the GCN is more robust to this problem than the CRF method and keeps part of the voxels. In this context, different weighting methodologies can be investigated in order to avoid disconnection.
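For reference, the dice score reported in Table 1 can be computed per volume as in the short sketch below. The function name `dice_score`, the `eps` smoothing term, and the toy masks are assumptions made for this example.

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


# Toy masks (hypothetical): an 8-voxel cube with one FN and one FP voxel.
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:3] = True
pred = target.copy()
pred[1, 1, 1] = False      # false negative
pred[0, 0, 0] = True       # false positive

dsc = dice_score(pred, target)   # 2 * 7 / (8 + 8) = 0.875
```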

4 Conclusion

In this work we have presented a method to construct a sparse semi-labeled graph representation of volumetric medical data, based on the output and uncertainty analysis of a CNN segmentation. We have also shown that GCN learning strategies can be used on this graph to obtain a refined segmentation. Future research can be directed toward new definitions of connectivity, weighting, and node representation.


References

  • [1] Kamnitsas, K., Ledig, C., et al.: Efficient Multi-Scale 3D CNN with fully connected CRF for Accurate Brain Lesion Segmentation. Medical Image Analysis. (2016)
  • [2] Krähenbühl, P., and Koltun, V.: Efficient inference in fully connected CRFs with Gaussian edge potentials. In: Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS). (2011) 109-117
  • [3] Zhou, Y., Xie, L., et al.: A Fixed-Point Model for Pancreas Segmentation in Abdominal CT Scans. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2017) 693-701
  • [4] Roth, H., Lu, L., Lay, N., et al.: Spatial Aggregation of Holistically-Nested Convolutional Neural Networks for Automated Pancreas Localization and Segmentation. Medical Image Analysis. (2018) 94-107
  • [5] Zhu, Z., Xia, Y., et al.: A 3D Coarse-to-Fine Framework for Volumetric Medical Image Segmentation. In: 2018 International Conference on 3D Vision (3DV). (2018) 682-690
  • [6] Roth, H., Oda, M., et al.: Towards dense volumetric pancreas segmentation in CT using 3D fully convolutional networks. Proc. SPIE 10574, Medical Imaging 2018: Image Processing. (2018)
  • [7] Wang, G., Li, W., Zuluaga, M. A., et al.: Interactive Medical Image Segmentation Using Deep Learning With Image-Specific Fine Tuning. In: IEEE Transactions on Medical Imaging. (2018)
  • [8] Yu, L., Cheng, J-Z., et al.: Automatic 3D Cardiovascular MR Segmentation with Densely-Connected Volumetric ConvNets. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2017) 287-295
  • [9] Kendall, A., and Gal, Y.: What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS). (2017)
  • [10] Dias, P. A., and Medeiros, H.: Semantic Segmentation Refinement by Monte Carlo Region Growing of High Confidence Detections. In: Asian Conference on Computer Vision (ACCV). (2019)
  • [11] Bai, W., Oktay, O., et al.: Semi-supervised Learning for Network-Based Cardiac MR Image Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2017) 253-260
  • [12] Xia, Y., Liu, F., Yang, D., et al.: 3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training. (2018) arXiv:1811.12506
  • [13] Guha Roy, A., Conjeti, S., Navab, N., and Wachinger, C.: Inherent Brain Segmentation Quality Control from Fully ConvNet Monte Carlo Sampling. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2018)
  • [14] Nair, T., Precup, D., Arnold, D. L., and Arbel, T.: Exploring Uncertainty Measures in Deep Networks for Multiple Sclerosis Lesion Detection and Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2018)
  • [15] Ronneberger, O., Fischer, P., and Brox, T.: U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2015) 234-241
  • [16] Kipf, T. N., and Welling, M.: Semi-Supervised Classification with Graph Convolutional Networks. In: International Conference on Learning Representations (ICLR). (2017)
  • [17] Zhou, Z., Shin, J., et al.: Fine-tuning Convolutional Neural Networks for Biomedical Image Analysis: Actively and Incrementally. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2017)
  • [18] Roth, H. R., Farag, A., et al.: Data From Pancreas-CT. The Cancer Imaging Archive. (2016)
  • [19] Roth, H. R., Lu, L., et al.: DeepOrgan: Multi-level Deep Convolutional Networks for Automated Pancreas Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2015)
  • [20] Clark, K., Vendt, B., et al.: The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging 26(6). (2013) 1045-1057