Colorectal Polyp Detection in Real-world Scenario: Design and Experiment Study



Colorectal polyps are abnormal tissues growing on the intima of the colon or rectum with a high risk of developing into colorectal cancer, the third leading cause of cancer death worldwide. Early detection and removal of colon polyps via colonoscopy have proved to be an effective approach to prevent colorectal cancer. Recently, various CNN-based computer-aided systems have been developed to help physicians detect polyps. However, these systems do not perform well in real-world colonoscopy operations due to the significant difference between images in a real colonoscopy and those in the public datasets. Unlike the well-chosen clear images with obvious polyps in the public datasets, images from a colonoscopy are often blurry and contain various artifacts such as fluid, debris, bubbles, reflection, specularity, contrast, saturation, and medical instruments, with a wide variety of polyps of different sizes, shapes, and textures. All these factors pose a significant challenge to effective polyp detection in a colonoscopy. To this end, we collect a private dataset that contains 7,313 images from 224 complete colonoscopy procedures. This dataset represents realistic operation scenarios and thus can be used to better train the models and evaluate a system’s performance in practice. We propose an integrated system architecture to address the unique challenges of polyp detection. Extensive experimental results show that our system can effectively detect polyps in a colonoscopy with excellent performance in real time.

colonoscopy, colon polyp detection, convolutional neural network, colon polyp dataset

I Introduction

Colorectal cancer (CRC) was the fourth most commonly diagnosed cancer and the third leading cause of cancer death worldwide in 2018 [10]. In the United States, CRC is the third most common cancer, accounting for 9% of all cancer incidence [27]. Currently, colonoscopy is the primary and most effective method for screening for and preventing CRC. A colonoscopy is an outpatient procedure in which a tiny camera is navigated inside the large intestine (colon and rectum) to check for abnormalities and diseases. During the colonoscopy, abnormal growths, such as colorectal polyps, which are essential precursors of CRC, are identified and removed. Therefore, polyp detection during colonoscopy plays a crucial role in early CRC detection and treatment.

Fig. 1: Sample results of our polyp detection system. Green rectangles represent the prediction of the system and red rectangles represent the ground truth. Best viewed in color.

A colon polyp is a clump of cells that grow on the lining of the colon. Polyps grow through rapidly dividing cells, similar to how cancer cells grow. This is why they can become cancerous, even though most polyps are benign. Some colon polyps can develop into CRC over time, which is often fatal when found in later stages. However, according to Leufkens et al.’s study [16], 22%-28% of polyps and 20%-24% of adenomas are missed in colonoscopy due to various human factors. Therefore, there is a critical need for an efficient and accurate computer-aided colorectal polyp detection system that can assist physicians with localizing the polyps during a colonoscopy.

Fig. 2: Difference between public datasets and real colonoscopies. (a) images from public datasets; (b) to (e) are images from real colonoscopies; (b) images with medical instruments; (c) blurry images caused by different factors; (d) bubbles; (e) specularity.

Previous polyp detection methods rely mainly on hand-crafted features such as color, texture, or shape to extract specific patterns of polyps. However, due to the vast diversity in these features, traditional methods do not perform well for polyp detection. Recently, with the advances in deep learning, Convolutional Neural Network (CNN)-based approaches have been widely adopted for polyp detection and segmentation. For example, object detection methods such as Faster R-CNN [24] and YOLO [23] are used to find and indicate polyps with bounding boxes [20, 34]. Semantic image segmentation methods such as U-Net [25], Fully Convolutional Networks (FCN) [18], and SegNet [5] are employed to localize polyps on the pixel level [21, 17, 1, 9, 33].

While these CNN-based approaches have shown excellent performance in polyp detection and segmentation when trained and tested on publicly available datasets such as GIANA (Gastrointestinal Image ANAlysis) 2018 [8] and ETIS-Larib Polyp DB [28], there is a wide gap between images from real colonoscopy and those in public datasets. Specifically, public datasets contain carefully chosen clear images with reasonably sized polyps that stand out from the background, without the various artifacts that are often present in real operations. In contrast, a large portion of the images in colonoscopy operations have different degrees of blurring due to the movement of the camera and intima, an out-of-focus camera, or water flushes during an operation. Also, the images often contain various artifacts such as fluid, debris, bubbles, reflection, specularity, contrast, saturation, and medical instruments [2, 3]. Moreover, a wider variety of polyps with different sizes, shapes, or textures can appear in a colonoscopy than in the public datasets. Some representative images from the public datasets and from real colonoscopy are shown in Fig. 2 to illustrate the differences. All these factors make polyp detection much more challenging in a real colonoscopy than on public datasets. In fact, we find that models optimally trained on public datasets suffer a significant performance degradation in real colonoscopy videos (the F1-score drops from 95.77% on CVC-ClinicDB and 81.07% on ETIS-Larib Polyp DB to 70.45% in real colonoscopy), frequently missing true polyps and falsely labeling artifacts (e.g., bubbles, specularity, instruments, undigested debris, color blobs in blurry images, etc.) as polyps.

In this research, we collaborate with the Endoscopy Center of Xiangya Hospital of Central South University in China and collect a dataset that contains 7,313 images from 224 complete colonoscopy operations with pixel-level polyp and instrument annotations for each image. The large variation of polyps in terms of size, shape, and texture can substantially improve the sensitivity of the model trained on this dataset. In addition to the vast diversity of polyps, our dataset has a large fraction of images with various artifacts and blurry images caused by camera motion during the operations. Due to the presence of various artifacts in a colonoscopy, a single CNN-based polyp detector tends to generate a considerable number of false positives. While this problem may be alleviated by increasing the output threshold of the model, a threshold that is too high will inevitably lower the detection rate of true polyps. Meanwhile, based on the finding that two of our recently developed CNN models have high consistency in localizing true polyps and low consistency on false detections, we propose an integrated polyp detection architecture that consists of a blurry detection module to filter out blurry images and an ensemble module to combine the results from an Anchor Free Polyp (AFP-Net) detector [32] and a U-Net with Dilation Convolution detector [30]. By removing blurry images from subsequent model inference, we can significantly reduce the processing time of the pipeline. By combining the results from two independent detectors, we can dramatically improve the specificity of our system without losing too much sensitivity. Fig. 1 shows sample results of our integrated polyp detection system in real colonoscopy procedures.

In summary, the key contributions of this paper include: (i) We create a private dataset that contains 7,313 images from 224 complete colonoscopy procedures. Images in this dataset are highly representative of realistic operation scenarios compared to those well-chosen ones in the public datasets, and thus can be used to better train the models and evaluate the system’s performance in practice. (ii) We propose an integrated system architecture that addresses the unique challenges for polyp detection in a colonoscopy. The system consists of a blurry detector that filters out blurry images to reduce processing time and two independent polyp detectors that are combined to enhance the accuracy. (iii) We train our models and test our proposed system on complete colonoscopy videos. Results show that our system can effectively detect polyps in colonoscopy operations with excellent performance in real-time at a rate of 23 frames per second.

Fig. 3: The architecture of our polyp-detection system. It consists of a blurry detection module, two polyp detection modules, and an ensemble module.

II Related Work

Early polyp detection methods usually utilize hand-crafted features and a simple classifier to distinguish polyps from normal tissues. Gross et al. [13], Bernal et al. [7, 6], and Ameling et al. [4] used color, shape, or texture features as key factors to identify the locations of polyps. Ganz et al. [12] and Mamonov et al. [19] utilized shape or contours information for polyp segmentation. However, since most polyps and normal tissues have similar edge and color features, traditional hand-crafted-feature based methods can only detect polyps with several typical patterns, resulting in a poor performance in real applications.

With the success of deep learning in computer vision, CNN-based polyp detectors have been investigated in the last few years. Mo et al. [20] applied a fine-tuned Faster R-CNN with VGG-16 as the backbone for polyp detection with a running time of 5 fps on a K40 GPU. In addition, Shin et al. [26] proposed a post-learning scheme to enhance the Faster R-CNN detector. This post-learning scheme automatically collects hard negative samples and retrains the network with selected polyp-like false positives, which functions in a similar way to boosting. Zhang et al. proposed a two-step pipeline for polyp detection in colonoscopy videos. In the first step, a pretrained ResYOLO detects suspicious polyps, which are assumed to be stable between two consecutive frames. In the second step, a Discriminative Correlation Filter based tracking approach leverages the temporal information.

Polyp segmentation is another popular approach to localize polyps in colonoscopy videos on the pixel level. Inspired by U-Net, Mohammed et al. [22] proposed Y-Net, which fuses two fully convolutional encoders followed by a fully convolutional decoder. By using two encoders with and without pre-trained weights, the model addresses the performance loss due to the domain shift from the pre-trained network (natural images) to testing (polyp data) and due to limited training data, which are two common challenges when employing supervised learning methods in medical image analysis tasks. Wang et al. [33] built an automatic polyp-detection system with three threads for polyp segmentation, with each thread running an image segmentation model based on SegNet [5]. Each thread needs less than 100 ms to process one frame, so the three threads together can process up to 30 frames per second.

III System Design

The architecture of our computer-aided polyp detection system is illustrated in Fig. 3. It consists of a blurry detection module, two polyp detection modules, and an ensemble module. The system takes each single image frame in a colonoscopy as the input. The blurry detection module first filters out blurry frames caused by camera movement, out-of-focus or water flushes during a colonoscopy. The two polyp detection modules perform polyp detection simultaneously on the remaining frames and their results will be ensembled as the output shown on the colonoscopy image.
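The control flow described above can be sketched as a single per-frame function. This is a minimal illustration, not the authors' implementation: `process_frame` and its four callables (`is_blurry`, `detect_a`, `detect_b`, `ensemble`) are hypothetical names standing in for the blurry detection module, the two polyp detectors, and the ensemble module.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical box format


def process_frame(
    frame: object,
    is_blurry: Callable[[object], bool],
    detect_a: Callable[[object], List[Box]],
    detect_b: Callable[[object], List[Box]],
    ensemble: Callable[[List[Box], List[Box]], List[Box]],
) -> Optional[List[Box]]:
    """Run one frame through the pipeline; return None for a skipped blurry frame."""
    if is_blurry(frame):
        # Blurry frames are filtered out before any polyp detector runs.
        return None
    # Both detectors see the same clear frame; their outputs are then merged.
    return ensemble(detect_a(frame), detect_b(frame))
```

Returning `None` for a blurry frame, rather than an empty list, keeps "frame skipped" distinguishable from "frame processed, no polyp found"; this is a design choice of the sketch only.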

III-A Blurry Detection

Our blurry detection module is a CNN-based binary classifier. To train and test the model, we sample frames from videos of colonoscopy procedures and manually select 400 images (200 blurry and 200 clear images) for training and 225 (100 blurry and 125 clear images) for testing. We use the pre-trained SqueezeNet [15] as the backbone and fine-tune it on the train set. Our model achieves a sensitivity of 95.0% and a specificity of 96.0% on the test set. Moreover, by testing the model on videos, we observe that our model can filter out most of the blurry frames.

III-B AFP-Net

AFP-Net [32] is a fully convolutional network that classifies and localizes objects on each enhanced feature map. We use VGG16 [29] as the backbone and select feature maps from the backbone (conv4_3, conv5_3, conv6_2, conv7_2, conv8_2, and conv9_2) for detecting objects at different scales. To further increase the context information for small objects, we feed each feature map to a context module before forwarding them to the anchor-free detection heads. These detection heads are single-staged and have similar structures to the heads in SSD, where two parallel subnets are dedicated to classification and localization.

III-C U-Net with Dilated Convolution

Inspired by U-Net [25], we construct an end-to-end convolutional neural network that consists of a contracting part (encoder) and an expansive part (decoder). The backbone we use in the encoder is Resnet-50 [14]. We remove the down-sampling operation before the last stage of Resnet-50 to retain feature resolution and introduce dilated convolution in the last stage to enlarge the receptive field.
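The effect of trading a down-sampling step for dilation can be checked with the standard convolution output-size formula. The sketch below is illustrative arithmetic only: the 3×3 kernel, the 32-pixel feature map, and the function names are assumptions, not values taken from the Dilated U-Net implementation in [30].

```python
def effective_kernel_size(kernel: int, dilation: int) -> int:
    """Span covered by a dilated kernel along one axis: d * (k - 1) + 1."""
    return dilation * (kernel - 1) + 1


def conv_output_size(size: int, kernel: int, stride: int = 1,
                     padding: int = 0, dilation: int = 1) -> int:
    """Spatial output size of a convolution along one axis (standard formula)."""
    span = effective_kernel_size(kernel, dilation)
    return (size + 2 * padding - span) // stride + 1


# A strided 3x3 convolution halves a hypothetical 32-pixel feature map ...
strided = conv_output_size(32, kernel=3, stride=2, padding=1)
# ... while a stride-1, dilation-2 convolution keeps it at 32 pixels
# and still covers a 5-pixel span per axis.
dilated = conv_output_size(32, kernel=3, stride=1, padding=2, dilation=2)
```

With these assumed numbers, the strided layer yields 16 pixels per axis, whereas the dilated layer keeps 32 pixels while each kernel spans 5 pixels instead of 3.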

Fig. 4: Three causes of blurry images. (a) water flush; (b) out-of-focus; (c) movement of the camera and intima.

For the decoder, we use interpolation instead of transposed convolution to upsample each feature map back to the original size of the input, combine these four feature maps into one block using concatenation, and finally transform it into a single-channel segmentation map. The segmentation map is then transformed into bounding boxes with several post-processing procedures. The details of our blurry detection and two polyp detection modules can be found in [32, 30].
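The text does not spell out the post-processing that turns a segmentation map into bounding boxes. One common way to do it, shown here purely as an assumed sketch, is to threshold the map and take the bounding box of each connected foreground region. The 0.5 threshold, the 4-connectivity, the `min_area` filter, and the name `mask_to_boxes` are all hypothetical choices, not details from [30].

```python
from collections import deque
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), inclusive pixel coordinates


def mask_to_boxes(prob: Sequence[Sequence[float]], threshold: float = 0.5,
                  min_area: int = 1) -> List[Box]:
    """Bounding boxes of 4-connected regions whose probability >= threshold."""
    height = len(prob)
    width = len(prob[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or prob[y0][x0] < threshold:
                continue
            # Breadth-first flood fill over one connected foreground region.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            x1, y1, x2, y2, area = x0, y0, x0, y0, 0
            while queue:
                x, y = queue.popleft()
                area += 1
                x1, y1, x2, y2 = min(x1, x), min(y1, y), max(x2, x), max(y2, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and prob[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Drop tiny regions, which are likely noise (assumed filter).
            if area >= min_area:
                boxes.append((x1, y1, x2, y2))
    return boxes
```

Raising `min_area` discards small isolated blobs before they become candidate boxes, which is one plausible reason a pipeline would include such a step.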

III-D Model Ensemble

For every frame in a colonoscopy, the detection results from the two polyp detectors are ensembled as follows. If the IoU (Intersection over Union) of the two resulting bounding boxes is greater than 0.1, we keep one of the bounding boxes and remove the other by applying non-maximum suppression. Otherwise, we discard both results. With this “AND” logic, as will be shown in Section V, we are able to significantly reduce the false positive rate without trading off much sensitivity.
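The “AND” rule above can be sketched as follows. Only the IoU threshold of 0.1 comes from the text; the `(x1, y1, x2, y2, score)` detection format, the greedy one-to-one matching, and keeping the higher-scoring box of each matched pair are assumptions made for this illustration.

```python
from typing import List, Tuple

Det = Tuple[float, float, float, float, float]  # (x1, y1, x2, y2, score)


def iou(a: Det, b: Det) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def and_ensemble(dets_a: List[Det], dets_b: List[Det],
                 iou_threshold: float = 0.1) -> List[Det]:
    """Keep a detection only if both detectors agree on it (IoU > threshold)."""
    kept: List[Det] = []
    used_b = set()
    for a in dets_a:
        best_j, best_iou = -1, iou_threshold
        for j, b in enumerate(dets_b):
            if j in used_b:
                continue
            overlap = iou(a, b)
            if overlap > best_iou:  # strictly greater than the threshold
                best_j, best_iou = j, overlap
        if best_j >= 0:
            used_b.add(best_j)
            b = dets_b[best_j]
            # Of the agreeing pair, keep the more confident box (assumption).
            kept.append(a if a[4] >= b[4] else b)
        # Detections without a partner in the other detector are dropped.
    return kept
```

A low threshold such as 0.1 only requires the two detectors to point at roughly the same region, which tolerates the different box shapes produced by a detection head and by a segmentation map.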

IV Dataset

IV-A Blurry Detection Dataset

In order to train and test the blurry detector, we sample frames from videos of colonoscopy procedures and manually select 400 images (200 blurry and 200 clear images) for training and 225 for testing (100 blurry and 125 clear images). The main causes of the blurry images in our dataset are the movement of the camera and intima, out-of-focus, and water flushes, which are shown in Fig. 4. Additionally, the blurry detection dataset contains colonoscopy images both with and without polyps.

IV-B Public Dataset

The public dataset consists of the datasets from GIANA (Gastrointestinal Image ANAlysis) 2018 [8] and ETIS-Larib Polyp DB [28]. We split them into train and test set as follows:

Train set: The train set consists of the following 3 datasets:

(i) 18 short videos from CVC-ClinicVideoDB (train set from GIANA Polyp detection Challenge), which contain 10,025 images of size 384 × 288.

(ii) CVC-ColonDB: 300 images of size 574 × 500 from the train set of GIANA Polyp segmentation Challenge.

(iii) 56 high definition images from GIANA Polyp segmentation Challenge with a resolution of 1,920 × 1,080.

Test set: The test set consists of the following 2 datasets:

(i) CVC-ClinicDB: 612 images of size 384 × 288 from the test set of GIANA Polyp Segmentation Challenge.

(ii) ETIS-Larib Polyp DB: 196 images of size 1,225 × 966.

IV-C Private Dataset

To strengthen the robustness of our model in a real-world clinical setting, we collected 224 videos of complete colonoscopy procedures at the Endoscopy Center of Xiangya Hospital of Central South University from July to November 2019. These videos are split into a train set and a test set. We extract 96 and 239 short video clips that contain polyps from the training and testing videos, respectively. These videos contain a large variety of polyps in terms of shape, size, color, and texture, as well as visual angles. To build the private dataset at the image frame level, we manually annotated 4,825 images with 4,535 polyps and 2,344 instruments from the training videos to form the train set, and 2,488 images with 2,688 polyps and 593 instruments from the testing videos to form the test set. In total, our private dataset contains 7,313 images, including 7,223 polyps and 2,937 instruments. These images and annotations were verified by experienced gastroenterologists from Xiangya Hospital of Central South University.

Compared to the public datasets, our private dataset is highly representative of real colonoscopy operations, containing more artifacts and medical instruments. More importantly, some polyps are annotated in relatively blurry images caused by camera motion, out-of-focus, or water flushing in a colonoscopy. Additionally, these 224 videos of complete colonoscopies were obtained from different types of colonoscopy equipment, such as Olympus CV-290 and Olympus CSV-290SL. Therefore, the images may vary in color cast, chromatic aberration, and resolution. All these factors contribute to a more challenging and realistic scenario for polyp detection. For this reason, we believe that models trained on our dataset will perform better in a real colonoscopy, and that the performance evaluated on our private dataset will be a better measurement of the system’s effectiveness.

| Backbone | Sensitivity | Specificity | fps |
|---|---|---|---|
| Resnet-50 [14] | 95.0% | 98.4% | 160 |
| VGG16 [29] | 90.0% | 98.4% | 490 |
| SqueezeNet [15] | 95.0% | 96.0% | 310 |

TABLE I: Comparison between blurry detectors with different backbones.

V Experiments and Results

| Methods | CVC-ClinicDB Precision | CVC-ClinicDB Recall | CVC-ClinicDB F1-score | ETIS-Larib Polyp DB Precision | ETIS-Larib Polyp DB Recall | ETIS-Larib Polyp DB F1-score |
|---|---|---|---|---|---|---|
| CVC-Clinic [6] | 83.50 | 83.10 | 83.30 | 10.00 | 49.00 | 16.50 |
| ASU [31] | 97.20 | 85.20 | 90.80 | - | - | - |
| OUS [8] | 90.40 | 94.40 | 92.30 | 69.70 | 63.00 | 66.10 |
| CUMED [11] | 91.70 | 98.70 | 95.00 | 72.30 | 69.20 | 70.70 |
| Faster R-CNN [20] | 86.60 | 98.50 | 92.20 | - | - | - |
| FCN [17] | 89.99 | 77.32 | 83.01 | - | - | - |
| FCN-8S [1] | 91.80 | 97.10 | 94.38 | - | - | - |
| FCN-VGG [9] | - | - | - | 73.61 | 86.31 | 79.46 |
| Dilated U-Net [30] | 96.71 | 95.51 | 96.11 | 80.48 | 81.25 | 80.86 |
| AFP-Net [32] | 99.36 | 96.44 | 97.88 | 88.89 | 80.77 | 84.63 |
| Our system | 98.85 | 92.88 | 95.77 | 91.02 | 73.08 | 81.07 |

TABLE II: Comparison between our system and previous methods on public datasets. Precision, recall, and F1-score are given in %.

| Methods | TP | FP | FN | Precision | Recall | F1-score | F2-score |
|---|---|---|---|---|---|---|---|
| Dilated U-Net (public) | 1746 | 752 | 927 | 69.90 | 65.32 | 67.53 | 66.19 |
| Dilated U-Net | 1949 | 617 | 725 | 75.95 | 72.88 | 74.38 | 73.47 |
| AFP-Net (public) | 1718 | 418 | 956 | 80.43 | 64.25 | 71.43 | 66.94 |
| AFP-Net | 2117 | 319 | 557 | 86.90 | 79.17 | 82.86 | 80.60 |
| Our system (public) | 1572 | 217 | 1102 | 87.87 | 58.79 | 70.45 | 62.96 |
| Our system | 1885 | 132 | 789 | 93.46 | 70.49 | 80.37 | 74.14 |

TABLE III: Comparison of our models on the private dataset. Precision, recall, F1-score, and F2-score are given in %.

V-A Blurry Detection Backbone Comparison

We have experimented with building the blurry detection model with different backbones. Table I shows the sensitivity, specificity, and running speed of each backbone. From Table I, we can see that SqueezeNet [15] achieves at least 95% on both sensitivity and specificity with a per-frame processing time of only about 3 ms (310 fps), representing the best trade-off between accuracy and time consumption.
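The quantities in Table I follow from simple definitions, sketched below. Treating "blurry" as the positive class is an assumption, and the counts used in the example (95 of 100 blurry and 120 of 125 clear test images classified correctly) are not reported in the text; they are merely the counts implied by 95.0% and 96.0% on the 225-image test set.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of positive (here assumed: blurry) images that are flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of negative (here assumed: clear) images that pass through."""
    return true_neg / (true_neg + false_pos)


def ms_per_frame(fps: float) -> float:
    """Convert a throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Implied counts for the SqueezeNet row (100 blurry, 125 clear test images).
squeezenet_sensitivity = sensitivity(true_pos=95, false_neg=5)
squeezenet_specificity = specificity(true_neg=120, false_pos=5)
squeezenet_ms = ms_per_frame(310)
```

Under these assumptions, 310 fps corresponds to roughly 3.2 ms per frame, consistent with the "about 3 ms" quoted for SqueezeNet.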

V-B Results on Public Datasets

Table II compares our results with current state-of-the-art methods on CVC-ClinicDB and ETIS-Larib Polyp DB. In this part, the AFP-Net, Dilated U-Net, and our integrated system are trained only on the public train set. Experimental results show that both polyp detectors and our integrated system achieve a higher F1-score than all previous methods on both datasets, although some previous methods attain a higher precision or recall individually.

V-C Results on Private Dataset

Fig. 5: Comparison between models on videos of colonoscopy. (a) recall as a function of time for different models to find a polyp. (b) cumulative distribution function (CDF) of the number of false positives generated per minute in each video for different models. Best viewed in color.

In this part, we test our system at both image and video levels. Unless labeled “(public)”, all models are trained on a combination of the public and private train sets.

Results on images: Table III shows the results of our models on the private test set. Models labeled as “(public)” are trained only on the public datasets. From Table III, we can observe that when trained on the combination of public and private datasets, the models significantly outperform their counterparts trained only on the public datasets. This is mainly due to the limited variety of polyps in the public datasets. As a result, models trained on these public datasets are not capable of detecting the wide variety of polyps that may appear in a colonoscopy. This result clearly demonstrates the need for and importance of a dataset that is representative of real colonoscopy operations.
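The metric columns of Table III can be recomputed from the TP, FP, and FN counts using the standard definitions of precision, recall, and the F-beta score. The sketch below is a verification aid with hypothetical function names; the example counts are those of the "Our system" row.

```python
from typing import Dict


def f_beta(precision: float, recall: float, beta: float) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def detection_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Precision, recall, F1, and F2 in percent from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "precision": 100.0 * precision,
        "recall": 100.0 * recall,
        "f1": 100.0 * f_beta(precision, recall, beta=1.0),
        "f2": 100.0 * f_beta(precision, recall, beta=2.0),
    }


# "Our system" row of Table III: TP = 1885, FP = 132, FN = 789.
ours = detection_metrics(tp=1885, fp=132, fn=789)
```

Rounded to two decimals, these counts reproduce the reported 93.46, 70.49, 80.37, and 74.14. The F2-score is the relevant complement to F1 here because it penalizes missed polyps more than false alarms.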

Results of recall on videos: In real clinical practice, it is desirable to not only detect all polyps that appear in an operation but also to find each polyp as quickly as possible. Therefore, we design an experiment to measure how quickly our system can detect a polyp after its first appearance. All the experiments are carried out on 239 short video clips with polyps from testing videos. The results of each polyp detector and the integrated system are shown in Fig. 5. We can observe that our system can detect more than 60% of the polyps within the first two seconds after a polyp appears and more than 80% in ten seconds.
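The recall-versus-time curve in this experiment can be computed from the delay between a polyp's first appearance and its first detection. The following sketch assumes per-clip frame indices are available; the function names, the frame rate, and the sample delays are hypothetical and are not the measurements behind Fig. 5.

```python
from typing import List, Optional, Sequence


def first_detection_delay(appear_frame: int, detection_frames: Sequence[int],
                          fps: float) -> Optional[float]:
    """Seconds from first appearance to first detection; None if never detected."""
    later = [f for f in detection_frames if f >= appear_frame]
    return (min(later) - appear_frame) / fps if later else None


def recall_within(delays: List[Optional[float]], time_limit_s: float) -> float:
    """Fraction of polyps first detected within time_limit_s of appearing."""
    if not delays:
        return 0.0
    hits = sum(1 for d in delays if d is not None and d <= time_limit_s)
    return hits / len(delays)


# Hypothetical delays (seconds) for five clips; None marks a missed polyp.
sample_delays = [0.5, 1.5, 3.0, None, 12.0]
recall_2s = recall_within(sample_delays, 2.0)
recall_10s = recall_within(sample_delays, 10.0)
```

Evaluating `recall_within` over a range of time limits gives one point of the curve per limit; missed polyps stay in the denominator, so the curve saturates below 1.0 when some polyps are never found.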

Results of precision on videos: One of the most difficult challenges of applying a computer-aided system in real-world clinical practice is that models may generate a large number of false positives. In this part, we use video clips without polyps from testing videos of complete colonoscopy procedures to measure how many false positives our system generates. Since false positives within a short time span are usually highly correlated, we count false positives that fall within 6 frames (100 ms) of each other as a single incident. Fig. 5 shows the cumulative distribution function of the number of false positives generated per minute in each video for different models. From this figure, we can observe that the integrated system reduces the false positives from more than 40 per minute for the two individual detectors to fewer than 6 per minute. The number of false positives is smaller with blurry detection than without, as false positives caused by artifacts in blurry images (e.g., color blobs) are eliminated.
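The incident-counting rule can be sketched as below. The text only states that false positives within 6 frames count once; measuring the window from the frame that opened the current incident (rather than from the previous false positive) is an assumption of this sketch, as are the function names. The 60 fps figure follows from 6 frames spanning 100 ms.

```python
from typing import Iterable


def count_fp_incidents(fp_frames: Iterable[int], window: int = 6) -> int:
    """Count false-positive incidents, merging frames closer than `window`."""
    incidents = 0
    window_start = None
    for frame in sorted(fp_frames):
        # A frame opens a new incident once it is at least `window` frames
        # after the frame that opened the current one (assumed rule).
        if window_start is None or frame - window_start >= window:
            incidents += 1
            window_start = frame
    return incidents


def incidents_per_minute(fp_frames: Iterable[int], total_frames: int,
                         fps: float = 60.0, window: int = 6) -> float:
    """False-positive incidents per minute of video."""
    minutes = total_frames / fps / 60.0
    return count_fp_incidents(fp_frames, window) / minutes
```

For example, false positives at frames 10, 11, 12, 15, 16, 40, and 41 form three incidents under this rule (opened at frames 10, 16, and 40), which over a one-minute, 3,600-frame clip gives 3 incidents per minute.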

V-D Running Time

The inference times of our blurry detector, AFP-Net, and Dilated U-Net are around 3 ms, 20 ms, and 20 ms, respectively. Therefore, the running time is around 3 ms for a blurry frame and around 43 ms (23 fps) for a clear frame. All the tests are performed on a single NVIDIA GeForce RTX 2080 Ti GPU and an Intel i7-9700K CPU with PyTorch 1.4.0.
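The per-frame timing above is straightforward arithmetic, reproduced in the sketch below. The three module times come from the text; summing them for a clear frame mirrors the stated 43 ms. The blurry-frame fraction used for the average is a hypothetical parameter, since the text does not report how many frames are filtered out.

```python
BLUR_MS = 3.0   # blurry detector, from the text
AFP_MS = 20.0   # AFP-Net, from the text
UNET_MS = 20.0  # Dilated U-Net, from the text


def frame_time_ms(blurry: bool) -> float:
    """Running time of one frame: blurry frames skip both polyp detectors."""
    return BLUR_MS if blurry else BLUR_MS + AFP_MS + UNET_MS


def frames_per_second(ms: float) -> float:
    """Throughput implied by a per-frame time in milliseconds."""
    return 1000.0 / ms


def mean_frame_time_ms(blurry_fraction: float) -> float:
    """Average per-frame time for a video with the given share of blurry frames."""
    return (blurry_fraction * frame_time_ms(True)
            + (1.0 - blurry_fraction) * frame_time_ms(False))


clear_fps = frames_per_second(frame_time_ms(False))  # about 23.3 fps
```

If, say, a quarter of the frames were blurry, `mean_frame_time_ms(0.25)` would give 33 ms per frame on average, which illustrates how the blurry filter lowers the overall processing time.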

V-E Discussion

From Table II and Table III, we can observe that AFP-Net and Dilated U-Net, while well trained on the public datasets, suffer a significant performance degradation on our private dataset. The F1-score of AFP-Net drops from 97.88% on CVC-ClinicDB and 84.63% on ETIS-Larib Polyp DB to 71.43% on our private dataset. Similarly, the F1-score of Dilated U-Net declines from 96.11% on CVC-ClinicDB and 80.86% on ETIS-Larib Polyp DB to 67.53% on the private dataset. These results demonstrate the impact on model performance of the wide gap between colonoscopy images from real medical operations and those in the public datasets. After we re-train these two polyp detectors with our private train set, the F1-scores of AFP-Net and Dilated U-Net improve significantly by 11.43 and 6.85 percentage points, respectively. Meanwhile, the F1-score and F2-score of our integrated system improve by 9.92 and 11.18 percentage points. These results clearly show that images in our private dataset represent realistic operation scenarios compared to those in public datasets, and thus can help us better train the models and improve the system’s performance in practice.

Additionally, from Table III, we can see that, with a moderate decrease in recall, the precision of our system improves notably from 75.95% for Dilated U-Net and 86.90% for AFP-Net to 93.46%. This result shows that the ensemble of two models can effectively reduce the false positives in the detection. The reason to trade off recall for better precision is that we do not need to detect a polyp in every frame in real clinical practice. Instead, a polyp only needs to be detected in one of the frames to warn the physician during an operation. From Fig. 5, each of our polyp detectors finds more than 90% of polyps when running independently. However, based on the CDF plots shown in Fig. 5, they also produce a large number of false positives (approximately 40 to 60 per minute). After the ensemble module, the false positive rate drops significantly from 40 per minute to around 8 per minute. With the blurry detection module, the number of false positives further declines to around 5 per minute. Therefore, the blurry detector and ensemble module can effectively reduce the false positive rate without losing much sensitivity for polyp detection in real colonoscopy procedures.

V-F Ensemble According to Polyp Size

In real applications, we observe that the performance of each model depends on the size of the polyps. From Fig. 6, we find that AFP-Net achieves much better precision and recall than Dilated U-Net when the polyp size is small. Therefore, in this section, we introduce a more effective ensemble method and design a series of experiments to demonstrate its efficacy.

Fig. 6: Recall and precision as a function of the size of the predicted bounding-box for different models to find a polyp. All the experiments are processed on the private test dataset. Best viewed in color.

Based on the above observation, we redesign the ensemble module by introducing the length of the short edge of the predicted bounding box as an extra parameter. When the short edge is less than a particular threshold, we only retain the outputs of AFP-Net, based on the fact that AFP-Net performs better than Dilated U-Net when the polyp size is small, while the performance gap narrows as the polyp size increases. To find the best threshold, we carry out a series of experiments to test the performance of the improved ensemble method with different thresholds. Here, we use the ratio of the length of the short edge of the predicted bounding box to that of the input image as the parameter. The experiment results are shown in Fig. 7. By introducing this new parameter, physicians can adjust the sensitivity of the system, especially for small polyps. For example, we can observe that by setting the threshold to 0.1, the recall of our polyp detection system can be dramatically improved from 80% to 94% with 13 false positives per minute. Additionally, the running time for processing a single frame is reduced to 23 ms (the time consumed by the blurry detector and AFP-Net) when the size of the polyp is smaller than the configured threshold.
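A minimal sketch of this size-aware rule is given below. The short-edge ratio follows the definition in the text; how the two branches are merged (small AFP-Net boxes kept directly, all other AFP-Net boxes passed to the IoU-based “AND” ensemble) is an assumption. The names `short_edge_ratio` and `size_aware_ensemble`, the detection format, and the `and_ensemble` callable, which stands in for the ensemble of Section III-D, are hypothetical.

```python
from typing import Callable, List, Tuple

Det = Tuple[float, float, float, float, float]  # (x1, y1, x2, y2, score)


def short_edge_ratio(det: Det, image_width: int, image_height: int) -> float:
    """Short edge of the predicted box divided by the short edge of the image."""
    box_short_edge = min(det[2] - det[0], det[3] - det[1])
    return box_short_edge / min(image_width, image_height)


def size_aware_ensemble(
    afp_dets: List[Det],
    unet_dets: List[Det],
    image_size: Tuple[int, int],
    ratio_threshold: float,
    and_ensemble: Callable[[List[Det], List[Det]], List[Det]],
) -> List[Det]:
    """Trust AFP-Net alone for small boxes; require agreement for the rest."""
    width, height = image_size
    small: List[Det] = []
    rest: List[Det] = []
    for det in afp_dets:
        if short_edge_ratio(det, width, height) < ratio_threshold:
            small.append(det)  # below the threshold: keep the AFP-Net output
        else:
            rest.append(det)   # otherwise fall back to the "AND" ensemble
    return small + and_ensemble(rest, unet_dets)
```

With a threshold of 0.1 on a hypothetical 1,000 × 800 frame, any AFP-Net box whose short edge is below 80 pixels would be reported without confirmation from Dilated U-Net; a threshold of 0 recovers the plain “AND” ensemble.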

VI Conclusion

In this research, we collect a private dataset that contains 7,313 images from 224 complete colonoscopy operations with pixel-level polyp and instrument annotations. Images in our dataset represent realistic operation scenarios and thus can be used to better train a model and evaluate its performance. Additionally, we propose an integrated system that consists of a blurry detector and two polyp detectors. The results of the two polyp detectors are ensembled to enhance the polyp detection performance. Results show that our system can effectively detect polyps in real colonoscopy operations with high accuracy in real time.

Fig. 7: Performance of our system with the new ensemble method on test videos. Blue bars represent the recall of our system with different thresholds of edge length. Orange bars represent the number of false positives generated by our system per minute with different thresholds of edge length. Best viewed in color.


  1. M. Akbari, M. Mohrekesh, E. Nasr-Esfahani, S. R. Soroushmehr, N. Karimi, S. Samavi and K. Najarian (2018) Polyp segmentation in colonoscopy images using fully convolutional network. In 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 69–72. Cited by: §I, TABLE II.
  2. S. Ali, F. Zhou, A. Bailey, B. Braden, J. East, X. Lu and J. Rittscher (2019) A deep learning framework for quality assessment and restoration in video endoscopy. External Links: 1904.07073 Cited by: §I.
  3. S. Ali, F. Zhou, C. Daul, B. Braden, A. Bailey, S. Realdon, J. East, G. Wagnières, V. Loschenov, E. Grisan, W. Blondel and J. Rittscher (2019) Endoscopy artifact detection (ead 2019) challenge dataset. External Links: 1905.03209 Cited by: §I.
  4. S. Ameling, S. Wirth, D. Paulus, G. Lacey and F. Vilarino (2009) Texture-based polyp detection in colonoscopy. In Bildverarbeitung für die Medizin 2009, pp. 346–350. Cited by: §II.
  5. V. Badrinarayanan, A. Kendall and R. Cipolla (2017) Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE transactions on pattern analysis and machine intelligence 39 (12), pp. 2481–2495. Cited by: §I, §II.
  6. J. Bernal, F. Javier Śanchez, G. Ferńandez-Esparrach, D. Gil, C. Rodríguez de Miguel and F. Vilariño (2015-03) WM-dova maps for accurate polyp highlighting in colonoscopy: validation vs. saliency maps from physicians. Computerized Medical Imaging and Graphics 43, pp. . External Links: Document Cited by: §II, TABLE II.
  7. J. Bernal, J. Sánchez and F. Vilarino (2012) Towards automatic polyp detection with a polyp appearance model. Pattern Recognition 45 (9), pp. 3166–3182. Cited by: §II.
  8. J. Bernal, N. Tajkbaksh, F. J. Sánchez, B. J. Matuszewski, H. Chen, L. Yu, Q. Angermann, O. Romain, B. Rustad and I. Balasingham (2017) Comparative validation of polyp detection methods in video colonoscopy: results from the miccai 2015 endoscopic vision challenge. IEEE transactions on medical imaging 36 (6), pp. 1231–1249. Cited by: §I, §IV-B, TABLE II.
  9. P. Brandao, E. B. Mazomenos, G. Ciuti, R. Caliò, F. Bianchi, A. Menciassi, P. Dario, A. Koulaouzidis, A. Arezzo and D. Stoyanov (2017) Fully convolutional neural networks for polyp segmentation in colonoscopy. In Medical Imaging: Computer-Aided Diagnosis, Cited by: §I, TABLE II.
  10. F. Bray, J. Ferlay, I. Soerjomataram, R. L. Siegel, L. A. Torre and A. Jemal (2018) Global cancer statistics 2018: globocan estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: a cancer journal for clinicians 68 (6), pp. 394–424. Cited by: §I.
  11. H. Chen, X. Qi, J. Cheng and P. Heng (2016) Deep contextual networks for neuronal structure segmentation. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI’16, pp. 1167–1173. Cited by: TABLE II.
  12. M. Ganz, X. Yang and G. Slabaugh (2012) Automatic segmentation of polyps in colonoscopic narrow-band imaging data. IEEE Transactions on Biomedical Engineering 59 (8), pp. 2144–2151. Cited by: §II.
  13. S. Gross, M. Kennel, T. Stehle, J. Wulff, J. Tischendorf, C. Trautwein and T. Aach () Polyp segmentation in nbi colonoscopy. Cited by: §II.
  14. K. He, X. Zhang, S. Ren and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778. Cited by: §III-C, TABLE I.
  15. F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally and K. Keutzer (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360. Cited by: §III-A, TABLE I, §V-A.
  16. A. Leufkens, M. Van Oijen, F. Vleggaar and P. Siersema (2012) Factors influencing the miss rate of polyps in a back-to-back colonoscopy study. Endoscopy 44 (05), pp. 470–475. Cited by: §I.
  17. Q. Li, G. Yang, Z. Chen, B. Huang, L. Chen, D. Xu, X. Zhou, S. Zhong, H. Zhang and T. Wang (2017) Colorectal polyp segmentation using a fully convolutional neural network. 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5. Cited by: §I, TABLE II.
  18. J. Long, E. Shelhamer and T. Darrell (2015) Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3431–3440. Cited by: §I.
  19. A. V. Mamonov, I. N. Figueiredo, P. N. Figueiredo and Y. R. Tsai (2014) Automated polyp detection in colon capsule endoscopy. IEEE transactions on medical imaging 33 (7), pp. 1488–1502. Cited by: §II.
  20. X. Mo, K. Tao, Q. Wang and G. Wang (2018) An efficient approach for polyps detection in endoscopic videos based on Faster R-CNN. 2018 24th International Conference on Pattern Recognition (ICPR), pp. 3929–3934. Cited by: §I, §II, TABLE II.
  21. A. K. Mohammed, S. Yildirim, I. Farup, M. Pedersen and Ø. Hovde (2018) Y-Net: a deep convolutional neural network for polyp detection. CoRR abs/1806.01907. Cited by: §I.
  22. A. Mohammed, S. Yildirim, I. Farup, M. Pedersen and Ø. Hovde (2018) Y-Net: a deep convolutional neural network for polyp detection. arXiv preprint arXiv:1806.01907. Cited by: §II.
  23. J. Redmon, S. Divvala, R. Girshick and A. Farhadi (2016) You only look once: unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788. Cited by: §I.
  24. S. Ren, K. He, R. Girshick and J. Sun (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pp. 91–99. Cited by: §I.
  25. O. Ronneberger, P. Fischer and T. Brox (2015) U-Net: convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pp. 234–241. Cited by: §I, §III-C.
  26. Y. Shin, H. A. Qadir, L. Aabakken, J. Bergsland and I. Balasingham (2018) Automatic colon polyp detection using region based deep CNN and post learning approaches. IEEE Access 6, pp. 40950–40962. Cited by: §II.
  27. R. L. Siegel, K. D. Miller and A. Jemal (2019) Cancer statistics, 2019. CA: A Cancer Journal for Clinicians 69 (1), pp. 7–34. Cited by: §I.
  28. J. S. Silva, A. Histace, O. Romain, X. Dray and B. Granado (2014) Towards embedded detection of polyps in WCE images for early diagnosis of colorectal cancer. International Journal of Computer Assisted Radiology and Surgery 9 (2), pp. 283–293. Cited by: §I, §IV-B.
  29. K. Simonyan and A. Zisserman (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. Cited by: §III-B, TABLE I.
  30. X. Sun, P. Zhang, D. Wang, Y. Cao and B. Liu (2019) Colorectal polyp segmentation by U-Net with dilation convolution. arXiv preprint arXiv:1912.11947. Cited by: §I, §III-C, TABLE II.
  31. N. Tajbakhsh, S. R. Gurudu and J. Liang (2015) Automated polyp detection in colonoscopy videos using shape and context information. IEEE transactions on medical imaging 35 (2), pp. 630–644. Cited by: TABLE II.
  32. D. Wang, N. Zhang, X. Sun, P. Zhang, C. Zhang, Y. Cao and B. Liu (2019) AFP-Net: realtime anchor-free polyp detection in colonoscopy. arXiv preprint arXiv:1909.02477. Cited by: §I, §III-B, §III-C, TABLE II.
  33. P. Wang, X. Xiao, J. R. G. Brown, T. M. Berzin, M. Tu, F. Xiong, X. Hu, P. Liu, Y. Song and D. Zhang (2018) Development and validation of a deep-learning algorithm for the detection of polyps during colonoscopy. Nature biomedical engineering 2 (10), pp. 741–748. Cited by: §I, §II.
  34. R. Zhang, Y. Zheng, C. C. Poon, D. Shen and J. Y. Lau (2018) Polyp detection during colonoscopy using a regression-based convolutional neural network with a tracker. Pattern recognition 83, pp. 209–219. Cited by: §I.