Fast Bayesian Uncertainty Estimation of a Batch-Normalized Single Image Super-Resolution Network
Abstract
In recent years, deep convolutional neural networks (CNNs) have achieved unprecedented success in the image super-resolution (SR) task. But because of the black-box nature of neural networks and their lack of transparency, it is hard to trust their output. In this regard, we introduce a Bayesian approach for uncertainty estimation in a super-resolution network. We generate Monte Carlo (MC) samples from a posterior distribution by using the batch mean and variance as stochastic parameters in the batch-normalization layer at test time. These MC samples not only reconstruct the image from its low-resolution counterpart but also provide a confidence map of the reconstruction, which can be very impactful in practical use. We also introduce a faster approach to estimating the uncertainty that is useful for real-time applications. We validate our results on standard datasets for performance analysis and on different domain-specific super-resolution tasks. We estimate uncertainty quality using standard statistical metrics and also provide a qualitative evaluation of uncertainty for SR applications.
1 Introduction
Single image super-resolution (SISR) is an ill-posed low-level vision problem that generates a high-resolution (HR) image from a low-resolution (LR) image: an infinite number of HR images can be subsampled into the same LR image. Owing to advances in deep learning techniques, the community has developed state-of-the-art SISR networks using deep neural architectures [38]. SR has a wide range of applications, from surveillance and security [44] to medical imaging [30] and many more. Beyond improving the perceptual quality of images for human interpretation, SISR helps to boost the performance of various automated machine learning and computer vision tasks [29].
Uncertainty is a powerful tool for any prediction or reconstruction system. Knowing the confidence of the system's output helps improve the decision-making process, which is useful in any computer vision problem. We apply the concept of uncertainty to image super-resolution. Deep learning based SISR techniques learn features from a dataset, and those features depend on the images in that dataset. But real-world pictures can differ substantially from the training set and contain more complex textures, and textures unseen during training can produce inappropriate reconstructions. Because of the black-box nature of deep learning models, it is almost impossible to know the limitations of a model or the trustworthiness of a reconstructed SR image that is further processed by other computer vision tasks. We have witnessed that artifacts, blurriness, or distortions in an image can significantly degrade the performance of deep learning based models [22]. A deformation in a reconstructed LR facial image may lead to the wrong output in a recognition system, and a deformed reconstruction of a tumor image may lead to an incorrect estimate of tumor size. Uncertainty estimation makes deep learning (DL) models more transparent and robust in DL-based computer vision tasks.
Bayesian approaches to super-resolution provide not only the reconstructed HR image but also the posterior distribution of the super-resolution output. Recent progress in Bayesian deep learning uses Monte Carlo samples drawn from a posterior distribution via dropout [32] or batch-normalization [13]: dropout at test time, or stochastic batch mean and variance during testing, generates the MC samples [8, 33]. Monte Carlo methods for estimating deep learning model uncertainty have been successfully applied to classification, segmentation [16, 27], and camera relocalization [17] problems. The deep learning community generally does not use dropout in image reconstruction applications, but batch normalization is common in SISR, denoising, etc. Therefore, we use batch-normalization uncertainty to analyze SISR uncertainty.
In this article, we propose a Bayesian approach to measure the quality of HR images reconstructed from downsampled LR images. For this purpose, we add the widely used batch-normalization layer to the super-resolution network, which lets us generate Monte Carlo (MC) samples. These samples are different possible HR images for a single LR image. We use the mean of these HR images as the reconstruction, and the variation among them indicates the uncertainty of the reconstruction. We measure the quality of this uncertainty using standard statistical metrics and observe the relation between reconstruction quality and uncertainty. We also propose a faster approach for generating MC samples that can be extended to other computer vision applications: our method obtains the MC samples in a single feed-forward pass, which makes it useful for real-time applications.
Our contributions in this paper are as follows:

- We use the standard approach of uncertainty estimation for DL models using batch-normalization. Our work proposes a better and faster strategy for uncertainty estimation and overcomes the hurdle of variable image size. Our procedure generates any number of MC samples in a single shot.
- We demonstrate a Bayesian uncertainty estimation approach for SISR and, to the best of our knowledge, are the first to estimate uncertainty in deep learning based image reconstruction.
- We use Monte Carlo batch-normalization (MCBN) for uncertainty estimation in a super-resolution network.
- We discuss the advantages of uncertainty in SISR and its applications from medical to satellite image super-resolution, and analyze the uncertainty map and its significance.
2 Related Work
2.1 Single Image Super-Resolution
SISR has an extensive literature built up over the last few decades. Recent advances in deep learning (DL) methods have achieved significant improvements in this field [5, 18, 36] and in other computer vision tasks [42, 41, 43, 15]. SRCNN [5] first explored a convolutional neural network to establish a non-linear mapping between interpolated LR images and their HR counterparts, achieving superior performance over earlier example-based methods such as nearest neighbor [6], sparse representation [39], and neighborhood embedding [34]. VDSR [18] proposed a deeper architecture, showed that performance improves with network depth, and converges faster using residual learning. Since then, various DL based approaches [19, 23, 43, 24] have been proposed and have achieved state-of-the-art performance on the standard datasets. In our work, we use the VDSR [18] architecture for uncertainty analysis, as it was the first deep architecture for SISR.
2.2 Bayesian Uncertainty
Bayesian models are generally used to model uncertainty, and different approaches have been developed to adapt NNs to Bayesian reasoning, such as placing a prior distribution over the parameters. Because inference in Bayesian neural networks (BNNs) is difficult [7], several approaches [8, 33] have been proposed to approximate BNNs. Bayesian deep learning approaches utilize MC samples generated via dropout [32] or batch normalization [13] to approximate the posterior distribution. Dropout [32] can be treated as an approximate Bayesian model by drawing multiple predictions from the trained model with different dropout masks; in the case of batch-normalization [13], the stochastic batch mean and batch variance are used to generate multiple predictions. [33] shows that a batch-normalized neural network can be approximated as a Bayesian model. We use a batch-normalized neural network for SISR, as batch-normalization is widely used in image reconstruction applications.
3 Proposed Method
We propose a Bayesian approach to SISR that produces high-resolution images along with a confidence map of the reconstruction quality. In this section, we first give a short background on Bayesian inference. We then define our network architecture and its modification for Bayesian approximation. We also present a faster and better approach to overcome the difficulties of estimating uncertainty in SISR applications. Finally, we discuss metrics to measure the quality of uncertainty.
3.1 Bayesian Inference
We estimate a probabilistic function from a training set $\mathcal{D} = \{X, Y\}$, where $X$ is the LR image set and $Y$ its corresponding HR image set. This function is approximated to generate the most likely high-resolution image $y^*$ from a low-resolution test image $x^*$. So the probabilistic estimate of the HR test image is described as

$$p(y^* \mid x^*, \mathcal{D}) = \int p(y^* \mid x^*, \omega)\, p(\omega \mid \mathcal{D})\, d\omega \qquad (1)$$

where $\omega$ denotes the weight parameters of a function $f^{\omega}(x)$. We use variational inference to approximate Bayesian modeling. The most common approach is to learn an approximate distribution of weights $q(\omega)$ by minimizing the Kullback–Leibler divergence $\mathrm{KL}\!\left(q(\omega)\,\|\,p(\omega \mid \mathcal{D})\right)$. This yields the approximate predictive distribution

$$q(y^* \mid x^*) = \int p(y^* \mid x^*, \omega)\, q(\omega)\, d\omega \qquad (2)$$

In a batch-normalized neural network for Bayesian uncertainty estimation, the model parameters are $\omega = \{\theta, \mu, \sigma^2\}$. Here $\theta$ denotes the learnable model weights, and the stochastic parameters $\{\mu, \sigma^2\}$ are the mean and variance of each batch-normalization layer. $q(\omega)$ is a joint distribution of the weights $\theta$ and the stochastic parameters $\{\mu, \sigma^2\}$. $\{\mu_B, \sigma_B^2\}$ are the mean and variance of a mini-batch $B$'s samples, and these need to be independent and identically distributed.
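In practice the integral above is intractable and is approximated by Monte Carlo sampling. As a sketch (following the MCBN estimator of [33]; the notation here is ours), with $T$ stochastic forward passes, each using an independently sampled set of batch statistics $\{\mu_t, \sigma_t^2\}$:

```latex
% Predictive mean: average of T stochastic forward passes
\hat{y}^{*} \;\approx\; \frac{1}{T} \sum_{t=1}^{T} f^{\,\theta,\,\mu_t,\,\sigma_t^{2}}(x^{*})

% Predictive (per-pixel) variance: spread of the same T samples
\widehat{\operatorname{Var}}\!\left[y^{*}\right] \;\approx\;
  \frac{1}{T} \sum_{t=1}^{T} \Big( f^{\,\theta,\,\mu_t,\,\sigma_t^{2}}(x^{*}) \Big)^{2}
  \;-\; \big( \hat{y}^{*} \big)^{2}
```

The per-pixel variance is what is visualized as the uncertainty map.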
3.2 Network Architecture
In this paper, we use the very deep super-resolution (VDSR) network [18] as the base architecture for our experimental analysis of uncertainty. Our method is a general approach and can be extended to any other super-resolution network. We use batch-normalization to measure uncertainty, but the VDSR paper did not use batch-normalization (BN), so we introduce some changes to the original architecture. Our VDSR variant has a batch-normalization layer after each convolutional layer except the last, and no bias is used, since batch-normalization normalizes the output. Each BN layer is thus preceded by a convolution and followed by a ReLU non-linearity.
3.3 Bayesian VDSR for Uncertainty Estimation
We use batch-normalization [13] to estimate the uncertainty of the super-resolution network. Batch-normalization is commonly used in deep networks to overcome the problem known as internal covariate shift: random batch members are selected to estimate mini-batch statistics during training. We use this stochasticity to approximate Bayesian inference, which allows a meaningful estimate of uncertainty; this is termed Monte Carlo Batch Normalization (MCBN) [33]. Normally, running estimates of the batch mean and variance are maintained in each batch-normalization layer during training and used at test time, but we instead use the batch mean and variance during both training and testing. The learnable model parameters are optimized by gradient backpropagation during training, while the stochastic parameters, the batch means and variances, generate MC samples from the posterior distribution of the model. We feed a test image forward together with a different training batch multiple times, and due to the stochasticity of the batches, this produces various reconstructed HR images. We take the mean of these MC samples as the estimated reconstruction and their variance as the uncertainty map.
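As a minimal illustration of this sampling scheme (a toy stand-in, not the actual VDSR network), consider a single normalization layer applied to feature vectors: each MC sample normalizes the test input with the statistics of a different random training batch, and the sample mean and variance give the reconstruction and uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

def batchnorm(x, mu, var, eps=1e-5):
    # Normalize with the given (stochastic) batch statistics;
    # learnable scale/shift parameters are omitted for brevity.
    return (x - mu) / np.sqrt(var + eps)

train = rng.normal(loc=2.0, scale=1.5, size=(1000, 16))  # toy "training set"
x_test = rng.normal(loc=2.0, scale=1.5, size=16)         # toy test input

T, batch_size = 25, 32
samples = []
for _ in range(T):
    # Each forward pass uses the statistics of a different random training batch.
    batch = train[rng.choice(len(train), size=batch_size, replace=False)]
    samples.append(batchnorm(x_test, batch.mean(axis=0), batch.var(axis=0)))
samples = np.stack(samples)              # T Monte Carlo samples

reconstruction = samples.mean(axis=0)    # MC mean     -> estimated output
uncertainty = samples.var(axis=0)        # MC variance -> uncertainty map
```

In the full network the same idea applies per pixel of the reconstructed HR image, with the batch statistics drawn at every BN layer.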
3.4 Faster Approach
The main drawback of Bayesian uncertainty estimation in a batch-normalized neural network is that the test image must be processed with several different random batches to generate the MC samples, and the computation time grows quickly with the number of samples in a batch and with the spatial dimensions of the batch.
Another challenge is that in SISR the test image size varies from thousands to millions of pixels. We cannot use a large spatial batch size during training, as it would take much longer to compute, so we train with a small batch size due to this computational constraint. Consequently, larger test images would have to be broken into patches for batch processing, which can create a patchy effect in the output. We therefore propose a different approach that generates all MC samples in a single batch. After training, we estimate the stochastic parameters of each layer using different random training batches, as shown in Algorithm 1, keeping the same batch shape as during training. The parameters of every batch-normalization layer are estimated for one batch, and in this way we create several stochastic parameter sets from various batches. These stochastic parameter sets are then used during testing to generate the MC samples: one stochastic parameter set generates one MC sample. During testing, we concatenate copies of the same test image according to the required number of MC samples, and in each batch-normalization layer we normalize each copy separately using a different stochastic parameter set, as shown in Algorithm 2. This produces various HR images as MC samples that come from a posterior distribution learned from the training dataset.
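The two-stage procedure can be sketched as follows, again on a toy normalization layer (the names and data here are ours, purely illustrative): stochastic parameter sets are estimated once from random training batches, then a single vectorized pass normalizes T replicas of the test input, each with its own parameter set.

```python
import numpy as np

rng = np.random.default_rng(1)
train = rng.normal(2.0, 1.5, size=(1000, 16))  # toy "training set"
x_test = rng.normal(2.0, 1.5, size=16)         # toy test input
T, batch_size, eps = 25, 32, 1e-5

# Stage 1 (cf. Algorithm 1): estimate one stochastic parameter set per random batch.
param_sets = []
for _ in range(T):
    batch = train[rng.choice(len(train), size=batch_size, replace=False)]
    param_sets.append((batch.mean(axis=0), batch.var(axis=0)))
mus = np.stack([m for m, _ in param_sets])     # (T, 16)
vars_ = np.stack([v for _, v in param_sets])   # (T, 16)

# Stage 2 (cf. Algorithm 2): replicate the test input T times and normalize
# each replica with its own stored statistics in a single pass.
replicas = np.tile(x_test, (T, 1))             # (T, 16)
mc_samples = (replicas - mus) / np.sqrt(vars_ + eps)

reconstruction = mc_samples.mean(axis=0)
uncertainty = mc_samples.var(axis=0)

# Sanity check: identical to normalizing one replica at a time, but in one shot.
one_at_a_time = np.stack([(x_test - m) / np.sqrt(v + eps) for m, v in param_sets])
assert np.allclose(mc_samples, one_at_a_time)
```

The speed-up comes from amortizing the forward pass: the training batches are only needed once, offline, and test time no longer depends on re-processing them.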
3.5 Uncertainty Quality Metrics
We evaluate the quality of uncertainty using two standard statistical metrics, the Predictive Log Likelihood (PLL) and the Continuous Ranked Probability Score (CRPS).
Predictive Log Likelihood (PLL): The Predictive Log Likelihood is a widely accepted metric for measuring the quality of uncertainty [4, 11, 33, 8]. For a probabilistic model $f$, the PLL of an LR image $x_i$ with HR image $y_i$ is defined as:

$$\mathrm{PLL}\!\left(f, (x_i, y_i)\right) = \log\, p\!\left(y_i \mid f(x_i)\right) \qquad (3)$$

where $p\!\left(y_i \mid f(x_i)\right)$ is the predictive probability density of $y_i$ for the input $x_i$. PLL is unbounded and is maximized by a perfect prediction of the HR image with zero variance. The main property of this metric is that it makes no assumptions about the distribution, but it has been criticized for the effect of outliers on its score [28].
Continuous Ranked Probability Score (CRPS): The Continuous Ranked Probability Score [10] is generally used to compare the accuracy of two probabilistic models. It generalizes the mean absolute error to probabilistic estimates. CRPS is defined as

$$\mathrm{CRPS}(F, y) = \int_{-\infty}^{\infty} \left( F(x) - \mathbb{1}(x \geq y) \right)^{2} dx \qquad (4)$$

where $F$ is the predictive cumulative distribution function and $\mathbb{1}$ is the Heaviside step function: $\mathbb{1}(x \geq y)$ is $1$ if $x \geq y$ and $0$ otherwise. CRPS has no upper bound; a perfect prediction with zero variance receives a CRPS of $0$.
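Under a Gaussian assumption on the per-pixel predictive distribution (mean and variance taken from the MC samples), both metrics have simple forms. A sketch, where the closed-form Gaussian CRPS is the standard result from Gneiting and Raftery [10] rather than something stated in this paper:

```python
import math

def gaussian_pll(y, mu, var):
    # Log density of y under N(mu, var); higher is better.
    return -0.5 * (math.log(2 * math.pi * var) + (y - mu) ** 2 / var)

def gaussian_crps(y, mu, sigma):
    # Closed-form CRPS for a Gaussian forecast; lower is better, 0 is perfect.
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# A confident, correct prediction scores better on both metrics
# than an uncertain one centered at the same value.
print(gaussian_crps(1.0, 1.0, 0.1))   # ~0.0234
print(gaussian_crps(1.0, 1.0, 1.0))   # ~0.2337
```

In an image setting these would be averaged over pixels, with `mu` and `sigma` taken from the MC sample mean and standard deviation.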
4 Experimental Results & Discussions
In this section, we describe the datasets used in our experiments and the training methodology. We also address the effect of the number of MC samples on performance and compare our faster MC sample generation approach with standard BN uncertainty estimation. Finally, we present our understanding of model uncertainty for SISR applications.
4.1 Datasets
We use DIV2K [2, 35], a high-resolution, high-quality dataset, for training. The training images are used to optimize the trainable parameters of the SR network, and the validation images to select the best parameter settings for testing. We analyze network performance on five standard benchmark testing datasets, namely Set5 [3], Set14 [40], BSD100 [25], Urban100 [12], and Manga109 [26]. We have also experimented on satellite images downloaded from [1] and histopathology images from the MoNuSeg challenge dataset [21].
4.2 Training Details
We randomly extract patches from each HR and bicubic-interpolated LR image during training for a batch update. We augment the patches with horizontal flips, vertical flips, and rotations, choosing each augmentation at random with a fixed probability. We normalize each input before feeding it to the network, and train each model for a fixed number of iterations. To ensure better convergence, we use the trained model of one scale factor to initialize the models for the other scale factors, and Xavier initialization [9] for the first model. We train our models in the PyTorch framework and update the weights with the Adam optimizer [20]; the learning rate is reduced by half on a fixed iteration schedule. We use the mean-squared error to optimize the model parameters. We extract patches from high-variance regions of the validation images to choose the best model for testing. During testing, we clip the output and map it into 8-bit unsigned integer format. For a fair comparison, we remove boundary pixels of each test image according to the scaling factor before image quality evaluation, as described in the VDSR paper [18].
Table 1: Relative GPU time for MC sample generation; our method generating 5 samples for an image is the baseline (1.0). Column labels inferred from Section 4.3.2.

| MC samples | Standard BN [33] | Ours |
|---|---|---|
| 5 | 14.28 | 1.0 |
| 10 | 33.48 | 1.97 |
| 15 | 52.67 | 2.96 |
Table 2: Reconstruction quality (SSIM, PSNR) and uncertainty quality (PLL, CRPS) of Bayesian VDSR for 5 / 15 / 25 MC samples at scaling factors ×2, ×4, and ×8.

| Dataset | Metric | ×2 (5 / 15 / 25) | ×4 (5 / 15 / 25) | ×8 (5 / 15 / 25) |
|---|---|---|---|---|
| Set5 | SSIM | 0.957530 / 0.957559 / 0.957562 | 0.884546 / 0.884700 / 0.884685 | 0.732343 / 0.732507 / 0.732317 |
| | PSNR | 37.5404 / 37.5440 / 37.5472 | 31.4436 / 31.4456 / 31.4443 | 26.0340 / 26.0409 / 26.0391 |
| | PLL | 11.175 / 11.187 / 11.184 | 41.361 / 41.392 / 41.434 | 157.733 / 157.559 / 157.656 |
| | CRPS | 0.008191 / 0.008162 / 0.008157 | 0.015771 / 0.015711 / 0.015693 | 0.032864 / 0.032843 / 0.032823 |
| Set14 | SSIM | 0.912049 / 0.912211 / 0.912232 | 0.768863 / 0.768996 / 0.769027 | 0.615298 / 0.615382 / 0.615404 |
| | PSNR | 33.0858 / 33.1049 / 33.1096 | 28.0506 / 28.0637 / 28.0640 | 24.2733 / 24.2783 / 24.2816 |
| | PLL | 91.007 / 90.892 / 90.767 | 252.040 / 250.839 / 250.943 | 568.926 / 568.596 / 568.283 |
| | CRPS | 0.014960 / 0.014879 / 0.014857 | 0.026334 / 0.026215 / 0.026199 | 0.040444 / 0.040339 / 0.040343 |
| BSD100 | SSIM | 0.894965 / 0.895038 / 0.895048 | 0.725807 / 0.725903 / 0.725940 | 0.580028 / 0.580055 / 0.580051 |
| | PSNR | 31.9119 / 31.9164 / 31.9175 | 27.2973 / 27.2997 / 27.3012 | 24.4825 / 24.4842 / 24.4845 |
| | PLL | 74.233 / 74.226 / 74.217 | 195.734 / 195.736 / 195.730 | 346.074 / 345.979 / 345.981 |
| | CRPS | 0.016905 / 0.016859 / 0.016851 | 0.028345 / 0.028274 / 0.028260 | 0.039483 / 0.039417 / 0.039400 |
| Urban100 | SSIM | 0.916389 / 0.916472 / 0.916480 | 0.757138 / 0.757218 / 0.757180 | 0.576144 / 0.576194 / 0.576157 |
| | PSNR | 31.1339 / 31.1442 / 31.1457 | 25.3401 / 25.3423 / 25.3417 | 21.8184 / 21.8180 / 21.8176 |
| | PLL | 472.235 / 469.221 / 468.409 | 1553.83 / 1547.44 / 1545.33 | 3148.43 / 3140.58 / 3138.79 |
| | CRPS | 0.017461 / 0.017389 / 0.017373 | 0.033442 / 0.033353 / 0.033327 | 0.052008 / 0.051922 / 0.051906 |
| Manga109 | SSIM | 0.973192 / 0.973270 / 0.973291 | 0.887527 / 0.887976 / 0.888069 | 0.726837 / 0.727066 / 0.727109 |
| | PSNR | 37.5643 / 37.5966 / 37.6036 | 29.1394 / 29.1608 / 29.1634 | 23.3169 / 23.3217 / 23.3223 |
| | PLL | 127.31 / 127.00 / 126.71 | 796.85 / 791.92 / 790.55 | 2640.54 / 2634.91 / 2632.31 |
| | CRPS | 0.007538 / 0.007450 / 0.007427 | 0.017915 / 0.017773 / 0.017740 | 0.036155 / 0.036030 / 0.035989 |
Table 3: Average PSNR / SSIM on the benchmark datasets. The last three columns are our implementations; their labels (our VDSR, BN-VDSR, Bayesian VDSR) are inferred from Section 4.3.3.

| Dataset | Scale | Bicubic | A+ | SRCNN | VDSR | VDSR (ours) | BN-VDSR | Bayesian VDSR |
|---|---|---|---|---|---|---|---|---|
| Set5 | x2 | 33.65 / 0.930 | 36.54 / 0.954 | 36.65 / 0.954 | 37.53 / 0.958 | 37.49 / 0.957 | 37.54 / 0.957 | 37.55 / 0.958 |
| | x4 | 28.42 / 0.810 | 30.30 / 0.859 | 30.49 / 0.862 | 31.35 / 0.882 | 31.32 / 0.882 | 31.45 / 0.884 | 31.45 / 0.885 |
| | x8 | 24.39 / 0.657 | 25.52 / 0.692 | 25.33 / 0.689 | 25.72 / 0.711 | 26.00 / 0.729 | 26.07 / 0.732 | 26.04 / 0.733 |
| Set14 | x2 | 30.34 / 0.870 | 32.40 / 0.906 | 32.29 / 0.903 | 32.97 / 0.913 | 33.03 / 0.912 | 33.08 / 0.912 | 33.11 / 0.912 |
| | x4 | 26.10 / 0.704 | 27.43 / 0.752 | 27.61 / 0.754 | 28.03 / 0.770 | 28.02 / 0.767 | 28.07 / 0.768 | 28.06 / 0.769 |
| | x8 | 23.19 / 0.568 | 23.98 / 0.597 | 23.85 / 0.593 | 24.21 / 0.609 | 24.26 / 0.613 | 24.32 / 0.615 | 24.28 / 0.615 |
| BSD100 | x2 | 29.56 / 0.844 | 31.22 / 0.887 | 31.36 / 0.888 | 31.90 / 0.896 | 31.88 / 0.895 | 31.91 / 0.895 | 31.92 / 0.895 |
| | x4 | 25.96 / 0.669 | 26.82 / 0.710 | 26.91 / 0.712 | 27.29 / 0.726 | 27.27 / 0.724 | 27.30 / 0.725 | 27.30 / 0.726 |
| | x8 | 23.67 / 0.547 | 24.20 / 0.568 | 24.13 / 0.565 | 24.37 / 0.576 | 24.46 / 0.579 | 24.49 / 0.579 | 24.48 / 0.580 |
| Urban100 | x2 | 26.88 / 0.841 | 29.23 / 0.894 | 29.52 / 0.895 | 30.77 / 0.914 | 31.05 / 0.916 | 31.15 / 0.917 | 31.15 / 0.916 |
| | x4 | 23.15 / 0.659 | 24.34 / 0.720 | 24.53 / 0.724 | 25.18 / 0.753 | 25.27 / 0.754 | 25.35 / 0.756 | 25.34 / 0.757 |
| | x8 | 20.74 / 0.515 | 21.37 / 0.545 | 21.29 / 0.543 | 21.54 / 0.560 | 21.77 / 0.574 | 21.83 / 0.576 | 21.82 / 0.576 |
| Manga109 | x2 | 30.84 / 0.935 | 35.33 / 0.967 | 35.72 / 0.968 | 37.16 / 0.974 | 37.36 / 0.973 | 37.46 / 0.973 | 37.60 / 0.973 |
| | x4 | 24.92 / 0.789 | 27.02 / 0.850 | 27.66 / 0.858 | 28.82 / 0.886 | 28.98 / 0.885 | 29.20 / 0.888 | 29.16 / 0.888 |
| | x8 | 21.47 / 0.649 | 22.39 / 0.680 | 22.37 / 0.682 | 22.83 / 0.707 | 23.23 / 0.723 | 23.35 / 0.728 | 23.32 / 0.727 |
4.3 Performance Analysis
4.3.1 Number of MC Samples
The estimates of uncertainty and reconstruction improve as the number of MC samples increases, but so does the inference time, so the number of MC samples must be chosen for the task from this trade-off. The minimum number of MC samples should be enough to give a better reconstruction than batch-normalization without stochastic mean and variance, and should also provide a stable uncertainty map. In Figure 2, we observe how reconstruction and uncertainty quality change with the number of MC samples for the Set5 dataset. The plot shows that the SSIM and PSNR indices increase with the number of MC samples and later settle at stable values; PLL and CRPS converge to stable values after some initial instability. In our experiments, we use 5, 15, and 25 MC samples for testing.
4.3.2 Fast MC Sample Generation
We benchmark our faster approach against standard batch-normalized (BN) uncertainty estimation. The time required to generate MC samples in standard BN uncertainty estimation [33] depends mainly on the size of the images in the dataset and on the number of MC samples; our approach overcomes both difficulties and is much faster than the conventional one, as shown in Table 1. We take MC sample generation for an image with our method as the baseline, and the other values in the table show how many times more GPU time is required for inference. Our approach takes many times less execution time to generate the samples for an image.
4.3.3 Image Quality Analysis
We use the structural similarity (SSIM) [37] and peak signal-to-noise ratio (PSNR) metrics to measure reconstruction quality. Table 2 shows the performance of Bayesian VDSR for different numbers of Monte Carlo samples. We observe an improvement in image quality as the number of MC samples increases, and it gradually saturates. In Table 3, we also compare our Bayesian VDSR and batch-normalized (BN) VDSR with standard deep learning based approaches such as SRCNN [5] and VDSR [18]. Our training dataset, training procedure, and no-bias design differ from the VDSR paper, so we include our own VDSR implementation in the table for a fair comparison. Bayesian VDSR gives a minor improvement over BN-VDSR, but along with it the uncertainty map comes for free, and that contributes a significant boost to deep learning based super-resolution tasks.
4.4 Understanding Model Uncertainty
Table 2 reports the two standard uncertainty quality metrics, PLL and CRPS, for different numbers of MC samples. We observe that the PLL and CRPS values become better (more stable) as the number of MC samples increases, because larger sample sets yield better estimates of the mean and variance. Model uncertainty increases with the scaling factor, owing to a larger shift of the mean from the actual value and a higher variance among the MC samples.
Figure 3 shows different images from a standard testing dataset and their reconstructions from LR images at different scaling factors, along with the uncertainty of each reconstruction. In the first image, looking at the English letters, we see higher uncertainty at the border of each letter at the smallest scaling factor; as the scaling factor increases, it becomes difficult for a DL model to reconstruct the characters perfectly, and at the largest scaling factor the model is uncertain over the whole of each character. In the second image set of Figure 3, the reconstruction at the smallest scaling factor shows almost no uncertainty, but there is uncertainty at the boundary of the Japanese character, and it increases gradually with the scaling factor. Uncertainty is maximal at the largest scaling factor, where we can also see visually that the sharp boundary of the character has been deformed in the reconstruction. The dotted texture in that image, however, shows high uncertainty only at an intermediate scaling factor: at the smallest factor the texture is reconstructed perfectly, while at the largest factor the texture has been completely abolished in the LR image by heavy downsampling, the dotted region has become continuous, and the model simply upsampled that continuous texture. The last image set of Figure 3 shows uncertainty at the edges of the windows, increasing with the downsampling factor. Our overall observation is that if some texture present in the LR image is not reconstructed properly in the HR image, those regions show higher uncertainty; ambiguous regions, object boundaries, sharp regions, and deformed reconstructions generally receive higher uncertainty. This is very helpful for further processing of those images in other computer vision tasks: features coming from uncertain regions can be assigned lower importance, which may improve performance.
For a qualitative evaluation, we compare the average uncertainty with the quality of the reconstruction. In Figure 4, we use PSNR and perceptual loss [14] to measure reconstruction quality, computing the perceptual loss between the HR image and the reconstructed image from the features of several layers of the popular VGG16 [31] model. Perceptual loss increases with the amount of deformation in the reconstructed image, and PSNR decreases as the pixel-wise loss increases. We observe a strong relationship between uncertainty and image quality: PSNR decreases and perceptual loss increases as the uncertainty rises.
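Perceptual loss compares images in a feature space rather than pixel space. A minimal sketch with a stand-in feature extractor: the actual evaluation uses pretrained VGG16 activations, which we replace here with a fixed random projection purely to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a pretrained feature extractor (e.g., VGG16 ReLU layers);
# a fixed random projection is used here only for illustration.
W = rng.normal(size=(64, 256))

def features(img):
    # Flatten the image and apply a linear map followed by ReLU.
    return np.maximum(W @ img.reshape(-1), 0.0)

def perceptual_loss(hr, sr):
    # Mean-squared error between the feature representations.
    return float(np.mean((features(hr) - features(sr)) ** 2))

hr = rng.random((16, 16))
sr_good = hr + 0.01 * rng.standard_normal(hr.shape)  # mild deformation
sr_bad = hr + 0.50 * rng.standard_normal(hr.shape)   # strong deformation

# The loss grows with the amount of deformation in the reconstruction.
assert perceptual_loss(hr, sr_good) < perceptual_loss(hr, sr_bad)
```

With a real VGG16 extractor the same comparison is made per layer and the per-layer losses are averaged, but the monotone relation between deformation and loss is the property being exploited here.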
5 Conclusion
In this article, we introduced a Bayesian approach to estimating uncertainty in a batch-normalized super-resolution network. Stochastic batch-normalization is used at test time to generate multiple Monte Carlo samples, and those samples are used to estimate the uncertainty. We also proposed a faster approach to producing Monte Carlo samples and measured the uncertainty quality using standard statistical metrics. Our method is a general approach and can be applied to any image reconstruction technique. We showed that Bayesian uncertainty provides a reliable measure of model uncertainty in SISR. We believe that uncertainty in image super-resolution will improve the trustworthiness of the reconstructed output for deployment in high-risk tasks.
References
[1] Open Data Program. https://www.digitalglobe.com/ecosystem/opendata/. Accessed: 2019-03-01.
[2] E. Agustsson and R. Timofte. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 126–135, 2017.
[3] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. 2012.
[4] T. Bui, D. Hernández-Lobato, J. Hernández-Lobato, Y. Li, and R. Turner. Deep Gaussian processes for regression using approximate expectation propagation. In International Conference on Machine Learning, pages 1472–1481, 2016.
[5] C. Dong, C. C. Loy, K. He, and X. Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2):295–307, 2016.
[6] W. T. Freeman, T. R. Jones, and E. C. Pasztor. Example-based super-resolution. IEEE Computer Graphics and Applications, (2):56–65, 2002.
[7] Y. Gal. Uncertainty in deep learning. PhD thesis, University of Cambridge, 2016.
[8] Y. Gal and Z. Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In International Conference on Machine Learning, pages 1050–1059, 2016.
[9] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.
[10] T. Gneiting and A. E. Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007.
[11] J. M. Hernández-Lobato and R. Adams. Probabilistic backpropagation for scalable learning of Bayesian neural networks. In International Conference on Machine Learning, pages 1861–1869, 2015.
[12] J.-B. Huang, A. Singh, and N. Ahuja. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5197–5206, 2015.
[13] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456, 2015.
[14] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pages 694–711. Springer, 2016.
[15] A. Kar, S. Phani Krishna Karri, N. Ghosh, R. Sethuraman, and D. Sheet. Fully convolutional model for variable bit length and lossy high density compression of mammograms. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 2591–2594, 2018.
[16] A. Kendall, V. Badrinarayanan, and R. Cipolla. Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding. Proceedings of the British Machine Vision Conference (BMVC), 2017.
[17] A. Kendall and R. Cipolla. Modelling uncertainty in deep learning for camera relocalization. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 4762–4769. IEEE, 2016.
[18] J. Kim, J. Kwon Lee, and K. Mu Lee. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1646–1654, 2016.
[19] J. Kim, J. Kwon Lee, and K. Mu Lee. Deeply-recursive convolutional network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1637–1645, 2016.
[20] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. ICLR, 2014.
[21] N. Kumar, R. Verma, S. Sharma, S. Bhargava, A. Vahadane, and A. Sethi. A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE Transactions on Medical Imaging, 36(7):1550–1560, 2017.
[22] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial examples in the physical world. ICLR Workshop Track, 2017.
[23] W.-S. Lai, J.-B. Huang, N. Ahuja, and M.-H. Yang. Fast and accurate image super-resolution with deep Laplacian pyramid networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
[24] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4681–4690, 2017.
[25] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, volume 2, pages 416–423. IEEE, 2001.
[26] Y. Matsui, K. Ito, Y. Aramaki, A. Fujimoto, T. Ogawa, T. Yamasaki, and K. Aizawa. Sketch-based manga retrieval using Manga109 dataset. Multimedia Tools and Applications, 76(20):21811–21838, 2017.
[27] A. G. Roy, S. Conjeti, N. Navab, and C. Wachinger. Inherent brain segmentation quality control from fully convnet Monte Carlo sampling. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 664–672. Springer, 2018.
[28] R. Selten. Axiomatic characterization of the quadratic scoring rule. Experimental Economics, 1(1):43–61, 1998.
[29] A. Sharma, P. Kaur, A. Nigam, and A. Bhavsar. Learning to decode 7T-like MR image reconstruction from 3T MR images. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pages 245–253. Springer, 2018.
[30] W. Shi, J. Caballero, C. Ledig, X. Zhuang, W. Bai, K. Bhatia, A. M. S. M. de Marvao, T. Dawes, D. O'Regan, and D. Rueckert. Cardiac image super-resolution with global correspondence using multi-atlas PatchMatch. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 9–16. Springer, 2013.
[31] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.
[33] M. Teye, H. Azizpour, and K. Smith. Bayesian uncertainty estimation for batch normalized deep networks. In International Conference on Machine Learning, pages 4914–4923, 2018.
[34] R. Timofte, V. De Smet, and L. Van Gool. A+: Adjusted anchored neighborhood regression for fast super-resolution. In Asian Conference on Computer Vision, pages 111–126. Springer, 2014.
[35] R. Timofte, S. Gu, J. Wu, L. Van Gool, L. Zhang, M.-H. Yang, M. Haris, et al. NTIRE 2018 challenge on single image super-resolution: Methods and results. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2018.
[36] T. Tong, G. Li, X. Liu, and Q. Gao. Image super-resolution using dense skip connections. In The IEEE International Conference on Computer Vision (ICCV), Oct 2017.
[37] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, et al. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
[38] Z. Wang, J. Chen, and S. C. Hoi. Deep learning for image super-resolution: A survey. arXiv preprint arXiv:1902.06068, 2019.
[39] J. Yang, J. Wright, T. S. Huang, and Y. Ma. Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11):2861–2873, 2010.
[40] R. Zeyde, M. Elad, and M. Protter. On single image scale-up using sparse-representations. In International Conference on Curves and Surfaces, pages 711–730. Springer, 2010.
[41] H. Zhang and V. M. Patel. Densely connected pyramid dehazing network. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[42] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7):3142–3155, 2017.
[43] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu. Residual dense network for image super-resolution. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[44] W. W. Zou and P. C. Yuen. Very low resolution face recognition problem. IEEE Transactions on Image Processing, 21(1):327–340, 2012.