Attention Guided Network for Retinal Image Segmentation

This work was done when S. Zhang was an intern at CVTE Research. M. Tan (mingkuitan@scut.edu.cn) and Y. Xu (ywxu@ieee.org) are the corresponding authors.

Shihao Zhang, Huazhu Fu, Yuguang Yan, Yubing Zhang, Qingyao Wu, Ming Yang, Mingkui Tan, Yanwu Xu

South China University of Technology, Guangzhou, China; Inception Institute of Artificial Intelligence, Abu Dhabi, UAE; Peng Cheng Laboratory, Shenzhen, China; CVTE Research, Guangzhou, China; Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China

Project page: https://github.com/HzFu/AGNet
Abstract

Learning structural information is critical for producing an ideal result in retinal image segmentation. Recently, convolutional neural networks have shown a powerful ability to extract effective representations. However, convolutional and pooling operations filter out some useful structural information. In this paper, we propose an Attention Guided Network (AG-Net) to preserve the structural information and guide the expanding operation. In our AG-Net, the guided filter is exploited as a structure-sensitive expanding path to transfer structural information from previous feature maps, and an attention block is introduced to exclude noise and further reduce the negative influence of the background. Extensive experiments on two retinal image segmentation tasks (i.e., blood vessel segmentation and optic disc/cup segmentation) demonstrate the effectiveness of the proposed method.

1 Introduction

Retinal image segmentation plays an important role in automatic disease diagnosis. Compared to general natural images, retinal images contain more contextual structures, e.g., retinal vessels, optic disc and cup, which often provide important clinical information for diagnosis. Since these structures are the main indicators for eye disease diagnosis, their segmentation accuracy is important. Recently, convolutional neural networks (CNNs) have shown a strong ability in retinal image segmentation with remarkable performance [3, 5, 4, 14]. Existing CNN-based models learn increasingly abstract representations through cascaded convolutions and pooling operations. However, these operations may discard useful structural information such as edge structures, which are important for retinal image analysis. To address this issue, one possible solution is to add extra expanding paths that merge features skipped from the corresponding resolution levels. For example, FCN [9] sums the upsampled feature maps with the feature maps skipped from the contracting path, while U-Net [10] concatenates them and then applies convolutions and non-linearities. However, these methods cannot effectively leverage the structural information, which may hamper segmentation performance. Therefore, it is desirable to design a better expanding path that preserves structural information.
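To make the two merging schemes concrete, below is a minimal PyTorch sketch (our illustration, not code from either paper) of how an expanding path can combine upsampled features with skipped features: FCN-style summation versus U-Net-style concatenation followed by convolution and a non-linearity.

```python
import torch
import torch.nn as nn

def fcn_style_merge(upsampled, skipped):
    # FCN [9]: element-wise sum of the upsampled feature maps and the
    # feature maps skipped from the contracting path.
    return upsampled + skipped

class UNetStyleMerge(nn.Module):
    # U-Net [10]: channel-wise concatenation followed by convolution
    # and a non-linearity.
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, upsampled, skipped):
        return self.conv(torch.cat([upsampled, skipped], dim=1))
```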

To address this, we introduce the guided filter [6] as a special expanding path to transfer structural information extracted from low-level feature maps to high-level ones. The guided filter [6] is an edge-preserving image filter and has been demonstrated to be effective for transferring structural information. Different from existing works that use the guided filter at the image level, we incorporate the guided filter into CNNs to learn better features for segmentation. We further design an attention mechanism within the guided filter, called the attention guided filter, to remove the noisy components that the original guided filter introduces from the complex background. Finally, we propose the Attention Guided Network (AG-Net) to preserve structural information and guide the expanding operation. Experiments on vessel segmentation and optic disc/cup segmentation demonstrate the effectiveness of the proposed method.
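Since the methodology section is withdrawn below, the authors' exact formulation is not available here. As a rough orientation, the following sketch implements the classical guided filter of He et al. [6] with differentiable tensor operations, plus a hypothetical attention gate that re-weights the guidance map, in the spirit of the attention guided filter described above; the function names and the way the attention map is applied are our assumptions, not the paper's released code.

```python
import torch.nn.functional as F

def box_mean(x, r):
    # Mean over a (2r+1) x (2r+1) window; borders are normalized by the
    # true window size via count_include_pad=False.
    return F.avg_pool2d(x, kernel_size=2 * r + 1, stride=1,
                        padding=r, count_include_pad=False)

def guided_filter(I, p, r, eps):
    # Classical guided filter [6]: I is the guidance map, p the filtering
    # input, both of shape (N, C, H, W); r is the window radius and eps
    # the regularization parameter.
    mean_I, mean_p = box_mean(I, r), box_mean(p, r)
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    var_I = box_mean(I * I, r) - mean_I * mean_I
    a = cov_Ip / (var_I + eps)      # per-window linear coefficients
    b = mean_p - a * mean_I
    mean_a, mean_b = box_mean(a, r), box_mean(b, r)
    return mean_a * I + mean_b      # output is locally linear in I

def attention_guided_filter(I, p, attention, r, eps):
    # Hypothetical attention gate (our assumption): an attention map in
    # [0, 1] re-weights the low-level guidance before filtering, so that
    # background responses contribute less structural information.
    return guided_filter(attention * I, p, r, eps)
```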

2 Methodology

For some reason, we temporarily withdraw this part; it will be made available later.

3 Experiments

In this paper, we evaluate our method on two major retinal fundus image segmentation tasks: vessel segmentation and optic disc/cup segmentation.

3.1 Vessel Segmentation on DRIVE Dataset

We conduct vessel segmentation experiments on DRIVE to evaluate the performance of our proposed AG-Net. The DRIVE [11] (Digital Retinal Images for Vessel Extraction) dataset contains 40 colored fundus images obtained from a diabetic retinopathy screening program in the Netherlands. The 40 images are divided into 20 training images and 20 testing images. All the images were captured by a 3CCD camera at a resolution of 565 x 584 pixels. We apply gamma correction to improve the image quality and resize the preprocessed images as inputs. In the experiments, we train our AG-Net from scratch using Adam with a learning rate of 0.0015 and a batch size of 2; the window radius and the regularization parameter of the attention guided filter are kept fixed. Following the previous work [16], we employ Specificity (Spe), Sensitivity (Sen), Accuracy (Acc), intersection-over-union (IOU) and Area Under the ROC curve (AUC) as measurements.
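For reference, these measurements can be computed from the binary ground truth and the predicted probabilities as in the sketch below; the helper name and the 0.5 thresholding convention are our assumptions, but the metric definitions are the standard pixel-wise ones.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def vessel_metrics(prob, gt, threshold=0.5):
    # prob: predicted vessel probabilities, gt: binary ground truth,
    # both flattened numpy arrays over the field of view.
    pred = (prob >= threshold).astype(np.uint8)
    tp = np.sum((pred == 1) & (gt == 1))
    tn = np.sum((pred == 0) & (gt == 0))
    fp = np.sum((pred == 1) & (gt == 0))
    fn = np.sum((pred == 0) & (gt == 1))
    return {
        "Acc": (tp + tn) / (tp + tn + fp + fn),
        "Sen": tp / (tp + fn),            # sensitivity (recall)
        "Spe": tn / (tn + fp),            # specificity
        "IOU": tp / (tp + fp + fn),       # intersection-over-union
        "AUC": roc_auc_score(gt, prob),   # area under the ROC curve
    }
```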

We compare our AG-Net with several state-of-the-art methods, including Li [7], Liskowski [8], MS-NFN [12], and Zhang [16]. Li [7] recast segmentation as a cross-modality data transformation from retinal image to vessel map, and output the label map of all pixels instead of a single label for the center pixel. Liskowski [8] trained a deep neural network on samples preprocessed with global contrast normalization and zero-phase whitening, and augmented with geometric transformations and gamma corrections. MS-NFN [12] generates multi-scale feature maps with an 'up-pool' submodel and a 'pool-up' submodel. To verify the efficacy of the attention mechanism in the guided filter for transferring structural information, we also replace the attention guided filter in AG-Net with the original guided filter, yielding a variant named GF-Net.

Method         | Acc    | AUC    | Sen    | Spe    | IOU
Li [7]         | 0.9527 | 0.9738 | 0.7569 | 0.9816 |
Liskowski [8]  | 0.9535 | 0.9790 | 0.7811 | 0.9807 |
MS-NFN [12]    |        |        |        |        |
U-Net [10]     |        |        |        |        |
M-Net [4]      |        |        |        |        |
GF-Net         |        |        |        |        |
AG-Net         |        |        |        |        |

Table 1: Quantitative comparison of segmentation results on DRIVE.

Table 1 shows the performance of different methods on DRIVE. From the results, we can make several interesting observations. Firstly, GF-Net performs better than the original M-Net, which demonstrates the superiority of the guided filter over the skip connection for transferring structural information. Secondly, AG-Net outperforms GF-Net by 0.0010, 0.0019, 0.0205 and 0.0126 in terms of Acc, AUC, Sen and IOU, respectively, which demonstrates the effectiveness of the attention strategy in the attention guided filter. Lastly, unlike other deep learning methods that crop images into patches, our method achieves the best performance with the 20 original preprocessed images. We draw similar observations from the results on the CHASE_DB1 dataset, which are shown in Table 2.

Figure 1: (a) A test image from the DRIVE dataset; (b) ground truth segmentation; (c) segmentation result by M-Net; (d) segmentation result by GF-Net; (e) segmentation result by AG-Net. As shown in (c), M-Net neglects some edge structures that are very similar to choroidal vessels. On the contrary, by exploiting the attention guided filter as a special expanding path, AG-Net gains better discrimination power and is able to distinguish objects from similar structures. Moreover, the guided filter helps to obtain clearer boundaries.

Fig. 1 shows a test example, including the ground truth vessel map and the segmentation results obtained by M-Net, GF-Net and the proposed AG-Net. GF-Net produces clearer boundaries than M-Net, which demonstrates that the guided filter better leverages structural information. Compared with GF-Net, our proposed AG-Net produces more precise segmentation boundaries, which verifies that the attention mechanism is able to highlight the foreground and reduce the effect of the background.

Method         | Acc    | AUC    | Sen    | Spe    | IOU
Li [7]         | 0.9581 | 0.9716 | 0.7507 | 0.9793 |
Liskowski [8]  | 0.9628 | 0.9823 | 0.7816 | 0.9836 |
MS-NFN [12]    |        |        |        |        |
U-Net [10]     |        |        |        |        |
M-Net [4]      |        |        |        |        |
GF-Net         |        |        |        |        |
AG-Net         |        |        |        |        |

Table 2: Quantitative comparison of segmentation results on CHASE_DB1.

In terms of time consumption, we compare our AG-Net with M-Net, which is the backbone of our method. In our experiment, both algorithms are implemented in PyTorch and tested on a single NVIDIA Titan X GPU (200 iterations on the DRIVE dataset). The running time is shown in Table 3, and a sketch of the timing protocol is given after the table.

Method | Running time
M-Net  |
AG-Net |

Table 3: Quantitative comparison of the time consumption.
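A measurement consistent with the description above might look like the following sketch (our approximation; the paper does not give the exact benchmarking code):

```python
import time
import torch

@torch.no_grad()
def time_model(model, batch, iters=200, device="cuda"):
    # Average forward-pass time over `iters` iterations on one GPU,
    # with a short warm-up and explicit synchronization.
    model = model.to(device).eval()
    batch = batch.to(device)
    for _ in range(10):                  # warm-up runs
        model(batch)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        model(batch)
    torch.cuda.synchronize()             # wait for queued GPU work
    return (time.time() - start) / iters
```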

3.2 Optic Disc/Cup Segmentation on ORIGA Dataset

Optic disc/cup segmentation is another important retinal segmentation task. In this experiment, we use the ORIGA dataset, which contains 650 fundus images with 168 glaucomatous eyes and 482 normal eyes. The 650 images are divided into 325 training images (including 73 glaucoma cases) and 325 testing images (including 95 glaucoma cases). We crop the OD area and resize it as the input. The training setting of our AG-Net is the same as in the vessel segmentation task. We compare AG-Net with several state-of-the-art methods for OD and/or OC segmentation, including ASM [15], Superpixel [2], LRR [13], U-Net [10], M-Net [4], and M-Net with polar transformation (M-Net+PT). ASM [15] employs circular Hough transform initialization for segmentation. The superpixel method in [2] utilizes superpixel classification to detect the OD and OC boundaries. LRR [13] obtains good results, but it focuses only on OC segmentation.
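For context, the polar transformation used by M-Net+PT [4] maps the OD-centered crop from Cartesian to polar coordinates, so that the roughly circular disc and cup boundaries become roughly horizontal layers. A minimal OpenCV sketch (our illustration; the center and radius choices here are assumptions) is:

```python
import cv2

def polar_transform(image):
    # Map an OD-centered crop to polar coordinates, assuming the disc
    # sits near the image center and fits within the crop.
    h, w = image.shape[:2]
    center = (w / 2.0, h / 2.0)
    max_radius = min(h, w) / 2.0
    return cv2.warpPolar(image, (w, h), center, max_radius,
                         cv2.WARP_POLAR_LINEAR + cv2.INTER_LINEAR)
```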

Following the setting in [4], we first localize the disc center and then crop a region around it to obtain the input images. Inspired by M-Net+PT [4], we provide the results of AG-Net with the polar transformation, called AG-Net+PT. Besides, to reduce the impact of changes in the size of the OD, we construct a further variant of AG-Net+PT that enlarges the bounding boxes by 50 pixels on each side (up, down, left and right), where the bounding boxes are obtained from a pretrained LinkNet [1]. We employ the overlapping error (OE) as the evaluation metric, defined as $OE = 1 - \frac{Area(S \cap G)}{Area(S \cup G)}$, where $G$ and $S$ denote the ground truth area and the segmented mask, respectively. In particular, $OE_{disc}$ and $OE_{cup}$ are the overlapping errors of the OD and OC, and $OE_{total}$ is the average of $OE_{disc}$ and $OE_{cup}$.
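Given binary masks, the overlapping error follows directly from this definition, as in the short sketch below (the function name is ours):

```python
import numpy as np

def overlapping_error(seg, gt):
    # OE = 1 - Area(S ∩ G) / Area(S ∪ G), computed on binary masks;
    # evaluated separately for the disc and the cup, then averaged.
    seg, gt = seg.astype(bool), gt.astype(bool)
    return 1.0 - np.logical_and(seg, gt).sum() / np.logical_or(seg, gt).sum()
```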

Method                                    | OE_disc | OE_cup | OE_total
ASM [15]                                  |         |        |
SP [2]                                    |         |        |
LRR [13]                                  |         |        |
U-Net [10]                                |         |        |
M-Net [4]                                 |         |        |
M-Net+PT [4]                              |         |        |
AG-Net (ours)                             |         |        |
AG-Net+PT (ours)                          |         |        |
AG-Net+PT (ours, enlarged bounding boxes) |         |        |

Table 4: Quantitative comparison of segmentation results on ORIGA.

Table 4 shows the segmentation results, where the overlapping errors of the other approaches come directly from their published results. Our method outperforms all the state-of-the-art OD and/or OC segmentation algorithms in terms of the aforementioned evaluation criteria, which demonstrates the effectiveness of our model. Besides, our AG-Net performs much better than the original M-Net under the same settings, which further demonstrates that our attention guided filter is beneficial to segmentation performance. More visualization results can be found in the Supplementary Material.

4 Conclusions

In this paper, we propose an attention guided filter as a structure-sensitive expanding path. Specifically, we employ M-Net as the main body and exploit our attention guided filter to replace the skip connections and upsampling, which brings better information fusion. In addition, by introducing the attention mechanism into the guided filter, the attention guided filter can highlight the foreground and reduce the effect of the background. Experiments on two tasks demonstrate the effectiveness of our method.

Acknowledgments. This work was supported by National Natural Science Foundation of China (NSFC) 61602185 and 61876208, Guangdong Introducing Innovative and Entrepreneurial Teams 2017ZT07X183, Guangdong Provincial Scientific and Technological Funds 2018B010107001, 2017B090901008 and 2018B010108002, Pearl River S&T Nova Program of Guangzhou 201806010081, and CCF-Tencent Open Research Fund RAGR20190103.

References

  • [1] A. Chaurasia and E. Culurciello (2017) LinkNet: exploiting encoder representations for efficient semantic segmentation. In VCIP.
  • [2] J. Cheng et al. (2013) Superpixel classification based optic disc and optic cup segmentation for glaucoma screening. IEEE TMI.
  • [3] H. Fu et al. (2016) DeepVessel: retinal vessel segmentation via deep learning and conditional random field. In MICCAI.
  • [4] H. Fu et al. (2018) Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE TMI.
  • [5] Z. Gu et al. (2019) CE-Net: context encoder network for 2D medical image segmentation. IEEE TMI.
  • [6] K. He, J. Sun, and X. Tang (2013) Guided image filtering. IEEE TPAMI.
  • [7] Q. Li et al. (2016) A cross-modality learning approach for vessel segmentation in retinal images. IEEE TMI.
  • [8] P. Liskowski and K. Krawiec (2016) Segmenting retinal blood vessels with deep neural networks. IEEE TMI.
  • [9] J. Long, E. Shelhamer, and T. Darrell (2015) Fully convolutional networks for semantic segmentation. In CVPR.
  • [10] O. Ronneberger, P. Fischer, and T. Brox (2015) U-Net: convolutional networks for biomedical image segmentation. In MICCAI.
  • [11] J. Staal et al. (2004) Ridge-based vessel segmentation in color images of the retina. IEEE TMI.
  • [12] Y. Wu et al. (2018) Multiscale network followed network model for retinal vessel segmentation. In MICCAI.
  • [13] Y. Xu et al. (2014) Optic cup segmentation for glaucoma detection using low-rank superpixel representation. In MICCAI.
  • [14] Z. Yan, X. Yang, and K. Cheng (2017) A skeletal similarity metric for quality evaluation of retinal vessel segmentation. IEEE TMI.
  • [15] F. Yin et al. (2011) Model-based optic nerve head segmentation on retinal fundus images. In EMBC.
  • [16] Y. Zhang and A. C. Chung (2018) Deep supervision with additional labels for retinal vessel segmentation task. In MICCAI.