Super-Resolution based on Image-Adapted CNN Denoisers: Incorporating Generalization of Training Data and Internal Learning in Test Time

Tom Tirer
Tel Aviv University, Israel
tirer.tom@gmail.com
   Raja Giryes
Tel Aviv University, Israel
raja@tauex.tau.ac.il
Abstract

While deep neural networks exhibit state-of-the-art results in the task of image super-resolution (SR) with a fixed known acquisition process (e.g., a bicubic downscaling kernel), they experience a huge performance loss when the real observation model mismatches the one used in training. Recently, two different techniques have been suggested to mitigate this deficiency, i.e., to enjoy the advantages of deep learning without being restricted by the prior training. The first follows the plug-and-play (P&P) approach that solves general inverse problems (e.g., SR) by plugging Gaussian denoisers into model-based optimization schemes. The second builds on the internal recurrence of information inside a single image, and trains a super-resolver network at test time on examples synthesized from the low-resolution image. Our work incorporates these two strategies, enjoying the impressive generalization capabilities of deep learning, captured by the first, and further improving it through internal learning at test time. First, we apply a recent P&P strategy to SR. Then, we show how it may become image-adaptive at test time. Our technique outperforms the above two strategies on popular datasets and gives better results than other state-of-the-art methods on real images.

1 Introduction

The problem of image Super-Resolution (SR) has been the focus of many deep learning works in recent years, and its performance has steadily improved along with the developments in deep learning [5, 3, 11, 13, 14, 22, 31, 30, 1, 32]. In fact, when the acquisition process of the low-resolution (LR) image is known and fixed (e.g., a bicubic downscaling kernel), Convolutional Neural Network (CNN) methods trained using the exact observation model clearly outperform other SR techniques, e.g., model-based optimization methods [34, 7, 8, 10, 21].

However, when there is a mismatch in the observation model between the training and test data, CNN methods exhibit a significant performance loss [34, 23]. This behavior is certainly undesirable, because in real life the acquisition process is often inexact or unknown in advance. Therefore, several recent approaches have been proposed with the goal of enjoying the advantages of deep learning without being restricted by the assumptions made in training [34, 23, 35, 29, 27].

One line of works relies on the Plug-and-Play (P&P) approach, which solves general inverse problems (e.g., SR) by plugging Gaussian denoisers into model-based optimization schemes [34, 29, 20, 26, 15]. In this approach, the observation model is handled by an optimization method and does not rely on the training phase. Another recent approach trains a neural network for the imaging task directly on the test image [23, 27]. Such methods build on the internal recurrence of information inside a single image, and train a super-resolver CNN at test time on examples synthesized from the LR image using an input kernel [23] or from the whole LR image directly [27].


Figure 1: SR (x2) of old real images. Each row shows, from left to right: the LR image, EDSR+ [14], ZSSR [23], IDBP-CNN, and IDBP-CNN-IA. We introduce a super-resolution version of the IDBP framework [26] that uses CNN denoisers and performs SR for any given down-sampling operator without retraining. We show that by making the CNN denoisers image-adaptive (IA), we get a more accurate image reconstruction with fewer artifacts. More examples are presented in Figures 6-9.

Contribution. In this paper we incorporate the two independent strategies mentioned above, enjoying the impressive generalization capabilities of deep learning, captured by the first, and further improving it by internal learning at test time. We start with the recently proposed IDBP framework [26], which has so far been applied only to inpainting and deblurring using a fixed CNN denoiser. Here we apply it to SR using a set of CNN denoisers (the same as those used by IRCNN [34]) and obtain very good results. This IDBP-based SR method serves as a strong starting point. We propose to further improve its performance by fine-tuning its CNN denoisers at test time using the LR input and synthetic additive Gaussian noise.

Our image-adaptive approach improves over the IDBP method, which does not use any internal learning, as well as over a method that uses only internal learning [23], on widely-used datasets and experiments. On real images, which do not comply with a known model and may contain artifacts, it also gives better results than the state-of-the-art EDSR+ method [14] (see example results in Figure 1).

2 Related work

Many works have considered the problem of image super-resolution. Some have relied on specific prior image models, such as sparsity [17, 33, 16, 7, 6, 8, 10]. Yet, recently, many works have employed neural networks for this task, showing a great advance in performance with respect to both the reconstruction error and the perceptual quality (see a review of recent advancements in deep learning for SR in [32] and a comparison between methods that focus on perceptual quality and those that target reconstruction error in [1]). However, one main disadvantage of neural networks for the task of SR is their sensitivity to the LR image formation model. A network's performance may degrade significantly if it has been trained for one acquisition model and then tested on another [23].

Our work follows the P&P approach, introduced in [29], which suggests leveraging the excellent performance of denoising algorithms for solving other inverse imaging problems that can be formulated as a cost function composed of fidelity and prior terms. The P&P approach uses iterative optimization schemes, where the fidelity term is handled by relatively simple optimization methods and the prior term is handled by activations of Gaussian denoisers. Several P&P techniques have been suggested, for example: Plug-and-Play Priors [29] uses variable splitting and ADMM [2], IRCNN [34] uses variable splitting and the quadratic penalty method, RED [20] uses a modified prior term, and the recently proposed IDBP [26] modifies the fidelity term and uses alternating minimization. While the P&P approach is not directly connected to deep learning, IRCNN [34] presented impressive SR results using a set of CNN Gaussian denoisers, providing a way to enjoy the generalization capabilities of deep learning as a natural image prior without any restrictions on the observation model.

Our image-adaptive approach is influenced by the SR approach proposed in [23], which follows the idea of internal recurrence of information inside a single image, within and across scales [9, 36, 10]. As demonstrated in [9] (see Figure 5 there), on some occasions there is no alternative to internal learning for predicting tiny patterns that recur at various scales throughout the image. In the spirit of this phenomenon, the SR method in [23], termed ZSSR, completely avoids a prior training phase, and instead trains a super-resolver CNN at test time on examples synthesized from the LR image using an input kernel. This strategy relates to another deep learning solution for inverse imaging problems that optimizes the weights of a deep neural network only in the test phase [27].

3 Problem formulation and IDBP-based SR

Many image acquisition models, including super-resolution, can be formulated by

$y = Hx + e$,  (1)

where $x \in \mathbb{R}^n$ represents the unknown original image, $y \in \mathbb{R}^m$ represents the observations, $H$ is an $m \times n$ degradation matrix, and $e \in \mathbb{R}^m$ is a vector of i.i.d. Gaussian random variables $\mathcal{N}(0, \sigma_e^2)$. This model can be used for the denoising task when $H$ is the identity matrix $I_n$, for the inpainting task when $H$ is an $m \times n$ sampling matrix (i.e., a selection of $m$ rows of $I_n$), and for the deblurring task when $H$ is a blurring operator. Specifically, here we are interested in image super-resolution, where $H$ is a composite operator of blurring (e.g., anti-aliasing filtering) and down-sampling (hence $m < n$).
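To make the observation model concrete, here is a minimal sketch (our illustration, not code from the paper) that implements $H$ for SR as blurring followed by down-sampling; the function names and the use of scipy are our own choices.

```python
import numpy as np
from scipy.signal import fftconvolve

def H(x, kernel, s):
    """SR observation model: blur x with `kernel`, then subsample by factor `s`."""
    return fftconvolve(x, kernel, mode='same')[::s, ::s]

def observe(x, kernel, s, sigma_e=0.0):
    """y = Hx + e, with e ~ N(0, sigma_e^2) i.i.d. (Eq. (1))."""
    y = H(x, kernel, s)
    return y + sigma_e * np.random.randn(*y.shape)
```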

Most of the model-based approaches for recovering $x$ try to solve an optimization problem composed of fidelity and prior terms

$\min_{\tilde{x}} \; \frac{1}{2\sigma_e^2}\|y - H\tilde{x}\|_2^2 + s(\tilde{x})$,  (2)

where $\tilde{x}$ is the optimization variable, $\|\cdot\|_2$ stands for the Euclidean norm, and $s(\tilde{x})$ is a prior image model. Recently, the work in [26] has suggested to solve a different optimization problem

$\min_{\tilde{x},\tilde{y}} \; \frac{1}{2(\sigma_e+\delta)^2}\|\tilde{y} - \tilde{x}\|_2^2 + s(\tilde{x}) \;\; \mathrm{s.t.} \;\; H\tilde{y} = y$,  (3)

where $\delta$ is a design parameter that should be set according to a certain condition that keeps (3) as an approximation of (2) (see Section III in [26] for more details). The major advantage of (3) over (2) is the possibility to solve it using a simple alternating minimization scheme that possesses the plug-and-play property: the prior term is handled solely by a Gaussian denoising operation with noise level $\sigma_e + \delta$. Iteratively, $\tilde{x}_k$ is obtained by

$\tilde{x}_k = \arg\min_{\tilde{x}} \; \frac{1}{2(\sigma_e+\delta)^2}\|\tilde{y}_{k-1} - \tilde{x}\|_2^2 + s(\tilde{x})$,  (4)

and $\tilde{y}_k$ is obtained by projecting $\tilde{x}_k$ onto $\{\tilde{y} : H\tilde{y} = y\}$

$\tilde{y}_k = H^\dagger y + (I_n - H^\dagger H)\tilde{x}_k$,  (5)

where $H^\dagger \triangleq H^T(HH^T)^{-1}$ is the pseudoinverse of $H$ (recall that $m < n$). The two repeating operations lend the method its name: Iterative Denoising and Backward Projections (IDBP). After a stopping criterion is met, the last $\tilde{x}_k$ is taken as the estimate of the latent image $x$.

The IDBP method can be applied to SR in an efficient manner: the composite operators $H$ and $H^\dagger$ are easy to apply, and explicit matrix inversion can be avoided using the conjugate gradient method. We note that until now its performance has been demonstrated only for inpainting and deblurring tasks.
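To illustrate how steps (4) and (5) can be realized for SR without forming $H$ explicitly, here is a hedged sketch, reusing `H` from the snippet in Section 3; `denoise` stands for any plug-in Gaussian denoiser, and the conjugate-gradient pseudoinverse follows $H^\dagger = H^T(HH^T)^{-1}$. All function names are our own.

```python
import numpy as np
from scipy.ndimage import zoom
from scipy.signal import fftconvolve

def Ht(v, kernel, s, hr_shape):
    """Adjoint of H: zero-fill upsampling followed by correlation with the kernel."""
    up = np.zeros(hr_shape)
    up[::s, ::s] = v
    return fftconvolve(up, kernel[::-1, ::-1], mode='same')

def H_pinv(v, kernel, s, hr_shape, n_cg=20):
    """Apply H^† v = H^T (H H^T)^{-1} v; the inner inverse is computed with
    a few conjugate-gradient iterations, avoiding explicit matrix inversion."""
    z = np.zeros_like(v)
    r = v.copy()                      # residual of (H H^T) z = v, starting at z = 0
    p = r.copy()
    rs = np.sum(r * r)
    for _ in range(n_cg):
        Ap = H(Ht(p, kernel, s, hr_shape), kernel, s)   # H from the earlier sketch
        alpha = rs / np.sum(p * Ap)
        z += alpha * p
        r -= alpha * Ap
        rs_new = np.sum(r * r)
        p = r + (rs_new / rs) * p
        rs = rs_new
    return Ht(z, kernel, s, hr_shape)

def idbp_sr(y, kernel, s, denoise, sigmas):
    """Alternate between the denoising step (4) and the projection step (5).
    `sigmas` is the decreasing noise-level sequence; `denoise(img, sigma)` is
    a plug-in Gaussian denoiser."""
    hr_shape = (y.shape[0] * s, y.shape[1] * s)
    y_tilde = zoom(y, s, order=3)     # bicubic-like initialization
    for sigma in sigmas:
        x_tilde = denoise(y_tilde, sigma)                              # step (4)
        # Step (5): H† y + (I - H† H) x̃  ==  x̃ + H†(y - H x̃)
        y_tilde = x_tilde + H_pinv(y - H(x_tilde, kernel, s), kernel, s, hr_shape)
    return y_tilde                    # for sigma_e = 0 the last ỹ is the estimate
```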

Figure 2: Super-resolution results (PSNR averaged on Set5 vs. iteration number) for IDBP-CNN with and without our image-adapted CNN approach: (a) SR x2 with bicubic kernel; (b) SR x3 with Gaussian kernel.

A related SR method can be found in the IRCNN paper [34]. While IRCNN uses variable splitting and the quadratic penalty method for different tasks such as deblurring and inpainting, for SR it uses a heuristic algorithm, inspired by [8], where the back-projection operation (5) of IDBP is replaced by a bicubic upsampling (even if the acquisition kernel is not bicubic) multiplied by a manually tuned design parameter, and the resulting step is repeated five times before applying the denoising step. Despite the lack of theoretical reasoning, [34] obtained good SR results when plugging into the heuristic iterative scheme a set of CNN denoisers with a noise level that decays exponentially from $12 s_f$ to $s_f$, where $s_f$ denotes the desired SR scale factor.

Here, we adopt the strategy of changing CNN denoisers during the IDBP scheme, and denote the resulting method by IDBP-CNN. To be more precise, the parameter $\sigma_e + \delta$ in (3) starts from $12 s_f$ in the first iteration and decays exponentially to $s_f$ in the last one. In most of the experiments in this paper it is assumed that $\sigma_e = 0$. In this case, as discussed in [26] for noiseless inpainting, IDBP theory allows decreasing $\delta$ to any small positive value as the iterations increase. However, in experiments with $\sigma_e > 0$ we set a fixed lower bound on the value of $\delta$ (or equivalently $\sigma_e + \delta$) to ensure good performance of IDBP. Our experiments show that IDBP-CNN achieves better SR results than IRCNN, presumably due to the theoretical reasoning behind IDBP, especially for kernels other than bicubic.
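A minimal sketch of this noise-level schedule (the geometric spacing and the optional lower bound follow the text above; the function name is our own):

```python
import numpy as np

def sigma_schedule(s_f, n_iters=30, lower_bound=None):
    """Noise levels sigma_e + delta: exponential decay from 12*s_f to s_f.
    When sigma_e > 0 (Section 5.2), a fixed lower bound is imposed."""
    sig = np.geomspace(12 * s_f, s_f, n_iters)
    if lower_bound is not None:
        sig = np.maximum(sig, lower_bound)
    return sig
```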

The IDBP-CNN algorithm serves us as a strong starting point. In the following, we discuss our method for improving its SR capabilities using image-adapted CNN denoisers.

4 Image-adapted CNN

We propose to incorporate the two independent strategies mentioned above: the P&P approach and internal learning at test time. The P&P approach allows us to fully enjoy the impressive generalization capabilities of deep learning by training CNN Gaussian denoisers offline. The trained CNNs then handle only the prior term in the P&P scheme. Therefore, no assumptions on the observation model are made in the offline training phase. On the other hand, an internal learning step, where the CNN denoisers are fine-tuned at test time using the LR input $y$, leads to image-adapted CNN denoisers that can perform better on patterns that are specific to this image, and remove random artifacts.

Why and when can the input LR image be used for internal learning? When the observed LR image does not exhibit any degradation (e.g., additive noise, JPEG compression, blur), the phenomenon of recurrence of patterns within and across scales [9] implies that information inside the LR image can improve the prediction of the high-resolution image compared to using only the prior knowledge obtained in training. However, when the quality of the LR image is reduced, the achievable improvement is expected to decrease in proportion to the level of degradation.

For example, for blurriness-type degradation, exact patterns of the latent image may not be found in the LR image, and therefore prior training is necessary. As evidence, ZSSR, which completely avoids prior training, has demonstrated a significant performance loss for blur kernels wider than the bicubic kernel, even when it was given the exact ground-truth blur kernels as inputs and there was no noise (see the 2dB performance drop for SR x2 on the BSD100 dataset in Tables 1 and 2 in [23]). Similarly, the achievable improvement is also expected to decrease when the LR image contains random noise or artifacts. However, since random noise and artifacts do not recur in fixed patterns, it can be conjectured that a wise learning method can still capture some useful information from the LR image, in proportion to the degradation level. Indeed, in Section 5.2 we obtain only a small improvement for poor-quality LR images, while in Section 5.3 (whose results are shown in Figure 1) we obtain a clear improvement for old real images that suffer from a moderate degradation.

The discussion above emphasizes the importance of prior training when facing an ill-posed problem such as image super-resolution, which is the reason that we use internal learning as an additional ingredient of our P&P-based method.

We note that several recent works demonstrate a performance improvement of denoisers when they are learned or fine-tuned in the training phase using a set of images from the same class as the desired image [25, 19]. In contrast, here we fine-tune CNN denoisers at test time using a single LR observation.

4.1 Implementation

As mentioned in Section 3, we use for IDBP-CNN the same set of CNN denoisers that were proposed and trained in [34]. This set is composed of 25 CNNs, each of them trained for a different noise level, and together they span the noise level range of [0, 50]. Each CNN denoiser has 7 convolution layers of 3x3 filters and 64 channels (except for the last one, which has a single channel for grayscale images*). The dilation factors of the convolutions from the first layer to the last layer are 1, 2, 3, 4, 3, 2 and 1.

*We apply our method on the luminance channel, and use simple bicubic upsampling to obtain the color channels. However, the method can be extended by using color denoisers.
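For illustration, a hedged PyTorch sketch of a denoiser with this structure; batch normalization is omitted since, as noted in Section 6, it is merged into the convolutions after the offline training, and the residual output follows the footnote below. The class name is our own.

```python
import torch
import torch.nn as nn

class DilatedDenoiser(nn.Module):
    """Sketch of the 7-layer dilated CNN denoiser described above:
    3x3 filters, 64 channels, dilation factors 1, 2, 3, 4, 3, 2, 1."""
    def __init__(self, channels=1, features=64):
        super().__init__()
        dilations = [1, 2, 3, 4, 3, 2, 1]
        layers, in_ch = [], channels
        for i, d in enumerate(dilations):
            out_ch = channels if i == len(dilations) - 1 else features
            # `padding=d` keeps the spatial size fixed for a 3x3 dilated conv.
            layers.append(nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d))
            if i < len(dilations) - 1:
                layers.append(nn.ReLU(inplace=True))
            in_ch = out_ch
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        # Residual learning: the network estimates the noise and subtracts it.
        return x - self.body(x)
```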

The IDBP-CNN uses a fixed number of 30 iterations, in which it alternates between (4) and (5), where $\tilde{y}_0$ is initialized using bicubic upsampling. The value of $\sigma_e + \delta$ in (4) is reduced exponentially from $12 s_f$ to $s_f$ (recall that $s_f$ denotes the desired SR scale factor). Let us denote this monotonically decreasing sequence by $\{\sigma_k\}$. In each iteration, a suitable CNN denoiser (i.e., the one associated with $\sigma_k$) is used. After 30 iterations an estimate of the high-resolution image is obtained by the last $\tilde{x}_k$, except in the case of $\sigma_e = 0$, where we follow the noiseless inpainting experiments in [26] and use $\tilde{y}_k$ as the estimate, instead of $\tilde{x}_k$.

In most of the experiments we have $\sigma_e = 0$, so the noise level of the denoiser used in (4) is determined solely by $\delta$. It is important to note that, due to the exponential decay of $\{\sigma_k\}$, only a few early iterations use CNNs associated with high noise levels, while many (~5-10) of the last iterations use CNNs associated with noise levels near $s_f$. Also, as will be explained in Section 5.2, when $\sigma_e > 0$ a lower bound on $\sigma_k$ is set (to get good performance of IDBP-CNN). Therefore, in this case many of the last iterations use the same CNN denoiser as well.

We now turn to discuss the implementation of our image-adaptive CNN denoisers method. In order to examine the effect of this idea, we use the same IDBP-CNN algorithm with a single change: once the noise level in (4) becomes smaller than a predefined value $\bar{\sigma}$, a fixed CNN denoiser is used for the remaining iterations. This denoiser is obtained by fine-tuning the pre-trained denoiser associated with noise level $\bar{\sigma}$. In the case of $\sigma_e = 0$ we use a fixed predefined value of $\bar{\sigma}$, and in the case of $\sigma_e > 0$ we set $\bar{\sigma}$ to the lower bound on $\sigma_k$ minus 1. This approach allows us to fairly compare between the baseline IDBP-CNN and its image-adapted extension, which we denote by IDBP-CNN-IA.

Unless stated otherwise, the fine-tuning is done as follows. We extract patches, at locations chosen uniformly at random, from the LR image $y$, which serve as the ground truth. Their noisy versions are obtained by adding random Gaussian noise of level $\bar{\sigma}$. To enrich this "training set", data augmentation is done by downscaling $y$ to 0.9 of its size with probability 0.5, using mirror reflections in the vertical and horizontal directions with uniform probability, and using the 4 rotations of 0, 90, 180 and 270 degrees, again with uniform probability. The optimization process (which is done at test time) is kept fast and simple. We use an L2 loss**, a minibatch size of 32, and 320 iterations of the ADAM optimizer [12] with its default parameters and a learning rate of 3e-4. Note that the optimization time is independent of the image size and the desired SR scale factor. In Section 5 we show that it only moderately increases the inference run-time compared to the baseline IDBP-CNN.

**Note that we use residual learning, as done in the training phase [34].
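As a sketch of this fine-tuning procedure, assuming a PyTorch denoiser such as the one sketched earlier: the hyper-parameters follow the text above, while the patch size and the exact sampling details are our own simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def finetune_denoiser(net, lr_img, sigma, n_iters=320, batch=32, patch=32, lr=3e-4):
    """Test-time fine-tuning on patch pairs synthesized from the LR image.
    lr_img: (1, 1, H, W) luminance tensor in [0, 1]; sigma on the [0, 255] scale.
    patch=32 is an assumption (the exact size is not specified here)."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    # Augmentation pool: the LR image and a 0.9-downscaled copy (picked with
    # probability 0.5 each, mimicking the downscaling augmentation above).
    pool = [lr_img,
            F.interpolate(lr_img, scale_factor=0.9, mode='bicubic',
                          align_corners=False)]
    for _ in range(n_iters):
        clean = []
        for _ in range(batch):
            img = pool[torch.randint(len(pool), (1,)).item()]
            _, _, h, w = img.shape
            i = torch.randint(h - patch + 1, (1,)).item()
            j = torch.randint(w - patch + 1, (1,)).item()
            p = img[:, :, i:i + patch, j:j + patch]
            if torch.rand(1) < 0.5:            # horizontal mirror
                p = torch.flip(p, dims=[3])
            if torch.rand(1) < 0.5:            # vertical mirror
                p = torch.flip(p, dims=[2])
            # One of the 4 rotations, chosen uniformly.
            p = torch.rot90(p, k=torch.randint(4, (1,)).item(), dims=[2, 3])
            clean.append(p)
        clean = torch.cat(clean, dim=0)
        noisy = clean + (sigma / 255.0) * torch.randn_like(clean)
        # The net outputs the denoised image (residual learning is inside it),
        # so an L2 loss on its output matches the offline training setup.
        loss = F.mse_loss(net(noisy), clean)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net
```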

Lastly, we note that we fine-tune only a single final CNN denoiser, and not every denoiser used in (4), for three reasons: (a) in early iterations the denoisers have high noise levels and their goal is to improve only coarse details; (b) we have not experienced a more significant performance improvement by fine-tuning every CNN denoiser; (c) we aim for only a moderate increase in inference run-time compared to the baseline method.
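Putting the pieces together, a hypothetical glue sketch of IDBP-CNN-IA: `load_denoiser` and `apply_denoiser` are placeholders for loading a pre-trained CNN of a given noise level and running it (including any array/tensor conversion), `sigma_bar` is the threshold $\bar{\sigma}$, and `finetune_denoiser`, `H`, and `H_pinv` come from the earlier sketches.

```python
from scipy.ndimage import zoom

def idbp_cnn_ia(y, y_lum, kernel, s, sigmas, sigma_bar, hr_shape):
    """IDBP-CNN-IA sketch: identical to IDBP-CNN, except that once the noise
    schedule drops below sigma_bar, a single fine-tuned denoiser is reused."""
    y_tilde = zoom(y, s, order=3)                 # bicubic-like initialization
    adapted = None
    for sigma in sigmas:
        if sigma < sigma_bar and adapted is None:
            # One-time test-time fine-tuning on the LR image itself.
            adapted = finetune_denoiser(load_denoiser(sigma_bar), y_lum, sigma_bar)
        net = adapted if adapted is not None else load_denoiser(sigma)
        x_tilde = apply_denoiser(net, y_tilde)    # step (4)
        y_tilde = x_tilde + H_pinv(y - H(x_tilde, kernel, s),
                                   kernel, s, hr_shape)  # step (5)
    return y_tilde
```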

Figure 3: Fragments of SR x3 with bicubic kernel of the "ideal" LR monarch image from Set14. Panels: (a) original image, (b) ZSSR [23], (c) IDBP-CNN, (d) IDBP-CNN-IA.

5 Experiments

We implemented our method using the MatConvNet package [28]. Our code will be made available upon acceptance.

5.1 Ideal observation model

In this section we assume that the model (1) holds precisely without any noise, i.e., $\sigma_e = 0$. We examine three cases: a bicubic anti-aliasing kernel with down-scaling factors of 2 and 3, and a Gaussian kernel of size 7x7 with standard deviation 1.6 with a down-scaling factor of 3. We note that the latter scenario is used in many works [7, 20, 34].
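For completeness, a small sketch constructing such a Gaussian kernel (our illustration):

```python
import numpy as np

def gaussian_kernel(size=7, std=1.6):
    """Normalized 2D Gaussian blur kernel (7x7, std 1.6 in the x3 experiments)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * std ** 2))
    return k / k.sum()
```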

We compare IDBP-CNN, with and without our image-adapted CNN approach, to SRCNN [5], VDSR [11] and the recent state-of-the-art EDSR+ [14]. All three of these methods require extensive offline training to handle any different model (1), and their benchmarked versions are available for the bicubic kernel cases. The goal of examining them for the Gaussian kernel is to show their huge performance loss whenever their training phase does not use the right observation model. We also compare our results to IRCNN [34] and ZSSR [23], which are flexible to changes in the observation model like our approach (i.e., these methods get the blur kernel and the desired scale factor as inputs, and can handle different observation models without extensive retraining). The results are given in Table 1. The PSNR is computed on the Y channel, as done in all the previous benchmarks.
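For reference, the standard PSNR computation on the 8-bit Y channel (our sketch of the usual definition, not code from the paper):

```python
import numpy as np

def psnr(ref, est, peak=255.0):
    """PSNR in dB between a reference and an estimated Y channel."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```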

It can be seen that our image-adapted method outperforms all other model-flexible methods. In the bicubic kernel cases, it reduces the gap between IDBP-CNN and VDSR (which was a state-of-the-art method before EDSR+), and sometimes even performs slightly better than VDSR. Clearly, IDBP-CNN-IA also obtains the best results of all methods for the Gaussian kernel case. In Figure 2 we present the PSNR, averaged on the Set5 dataset, as a function of the iteration number for IDBP-CNN with and without our image-adapted CNN approach. Two observation models are presented: bicubic kernel with scale factor 2, and Gaussian kernel with scale factor 3. In both scenarios, a boost in performance is observed once the IDBP scheme starts using the fine-tuned CNN denoiser. A visual example is presented in Figure 3. The IDBP-CNN-based methods, which also make use of prior learning, capture more accurate patterns than ZSSR, which uses only internal learning.

Regarding the inference run-time, our experiments were performed on an Intel i7-7500U CPU and an Nvidia GeForce GTX 950M GPU with 4GB of dedicated memory. IDBP-CNN required 21s per image on the BSD100 dataset. Its image-adapted version required 105s, which is only a moderate increase and is significantly faster than ZSSR, which required 146s in its fastest version. We note that our implemented IDBP-based methods (baseline and image-adapted) are not optimized for fast run-time and toggle between CPU and GPU operations.

Dataset  Scale  Kernel    SRCNN [5]  VDSR [11]  EDSR+ [14]  IRCNN [34]  ZSSR [23]  IDBP-CNN  IDBP-CNN-IA
Set5     2      Bicubic   36.66      37.53      38.20       37.43       37.37      37.43     37.57
Set5     3      Bicubic   32.75      33.66      34.76       33.39       33.42      33.47     33.59
Set5     3      Gaussian  30.42      30.54      30.65       33.38       31.31      33.51     33.72
Set14    2      Bicubic   32.42      33.03      34.02       32.88       33.00      33.01     33.10
Set14    3      Bicubic   29.28      29.77      30.66       29.61       29.80      29.69     29.74
Set14    3      Gaussian  27.71      27.80      27.54       29.63       28.33      29.73     29.82
BSD100   2      Bicubic   31.36      31.90      32.37       31.68       31.65      31.73     31.80
BSD100   3      Bicubic   28.41      28.82      29.32       28.62       28.67      28.64     28.68
BSD100   3      Gaussian  27.32      27.43      27.46       28.64       27.76      28.68     28.74

Table 1: Super-resolution results (average PSNR in dB) for the ideal (noiseless) observation model with bicubic and Gaussian blur kernels. Bold black indicates the leading method, and bold blue indicates the leading method that does not require extensive training for different observation models.
Dataset  Degradation       EDSR+ [14]  ZSSR [23]  IDBP-CNN  IDBP-CNN-IA
Set5     AWGN              27.90       32.02      32.70     32.74
Set5     JPEG compression  32.07       33.09      33.24     33.36
Set14    AWGN              26.40       28.44      29.94     29.99
Set14    JPEG compression  29.17       29.48      30.28     30.35

Table 2: Super-resolution results (average PSNR in dB) for bicubic kernel and scale factor of 2, with AWGN and JPEG compression degradations.

5.2 Poor-quality LR images

In real life the acquisition process is often inexact, and the observed LR image can be affected by different degradations. In this section we examine two types of severe degradation, used also in [23]: (i) AWGN, and (ii) JPEG compression (performed by Matlab).

We compare the results of the IDBP-based methods with ZSSR [23] and EDSR+ [14]. As in the previous section, EDSR+ is restricted by the assumptions made in its offline training phase. For ZSSR we follow the details in [23] and cancel the post-processing backprojection step. We also give it a standard deviation value for the noise that it adds to the LR examples extracted from the test image. We set this value to 0.08 for the AWGN and 0.05 for the JPEG compression; these values are tuned for best performance. For the baseline IDBP-CNN we give the true $\sigma_e$ in the case of AWGN, and a tuned $\sigma_e$ for the JPEG compression. We also use a lower bound on the values of $\sigma_k$ used in (4), i.e., we stop switching the CNN denoisers once this lower bound is reached, with separate bounds set for the AWGN and JPEG compression cases. Again, our image-adapted approach uses the exact IDBP-CNN scheme, except that we fine-tune the CNN denoiser associated with the bound level, and this is the denoiser that is used for the remaining iterations. Note that this exact strategy is discussed briefly in Section 4.1. We also decrease the learning rate in the fine-tuning to 0.5e-4. The results for the bicubic kernel and scale factor of 2 are given in Table 2, and visual examples are presented in Figures 4 and 5.

Clearly, EDSR+ (the state-of-the-art method) demonstrates poor robustness to degradations. The IDBP-CNN-based methods obtain the best results (significantly better than ZSSR), presumably due to the good prior learning obtained in the offline training phase. Here, the improvement obtained by the image-adaptive approach is smaller than in the ideal case (previous section). This observation relates to the discussion in Section 4, which expects strong degradations to reduce the amount of additional useful information that can be extracted from the poor-quality LR image.


Figure 4: SR x2 with bicubic kernel of the LR head image from Set5 with AWGN (presented in the Y channel). Panels: (a) original image, (b) EDSR+ [14], (c) ZSSR [23], (d) IDBP-CNN, (e) IDBP-CNN-IA.

Figure 5: Fragments of SR x2 with bicubic kernel of the JPEG-compressed LR zebra image from Set14. Panels: (a) original image, (b) EDSR+ [14], (c) ZSSR [23], (d) IDBP-CNN, (e) IDBP-CNN-IA.

5.3 Real LR images

In this section we examine the performance of the IDBP-based methods, ZSSR [23] and EDSR+ [14] on real images (i.e., we do not have ground truth for their high-resolution versions). Specifically, we consider old images, whose acquisition model is unknown. Again, EDSR+ cannot handle such images differently, because it is restricted by the assumptions made in its training phase. For ZSSR we present the official results when available, or run the official code with its predefined configuration for handling real images. For IDBP-CNN we use a positive $\sigma_e$ and a fixed lower bound on the values of $\delta$. As before, our image-adapted approach uses the exact IDBP-CNN scheme, except that we fine-tune a single CNN denoiser. More precisely, in this experiment the bound makes IDBP-CNN stop switching denoisers once $\sigma_k$ reaches it, and the pre-trained CNN denoiser associated with this level is the one that is fine-tuned when IDBP-CNN-IA is applied. We also note that all the examined methods use the bicubic kernel (while the true kernel of each image is unknown).

Figure 1 shows the reconstruction results for real old images. In all the examples, our IDBP-CNN-IA technique clearly outperforms the other methods. On such images one may observe the great advantage of our proposed scheme. On the one hand, it enjoys the prior knowledge learned on natural images by the Gaussian denoiser. On the other hand, it is adaptive to the statistics of the provided image. Interestingly, similar artifacts appear in EDSR+ and in IDBP-CNN, which relies on a pre-trained denoiser; these artifacts are absent from our image-adaptive reconstruction result. Compared to ZSSR, our method has a visual advantage due to the fact that it has prior knowledge of natural images that ZSSR lacks. More examples are presented in Figures 6-9.

6 Conclusion

The task of image super-resolution has gained a lot from the developments in deep learning in recent years. Yet, leading deep learning techniques are sensitive to the acquisition process assumptions used in the training phase. The state-of-the-art results achieved by these networks, both quantitatively and qualitatively, degrade once there are inaccuracies in the assumed image formation model.

This work addressed this issue by combining two recent approaches, where the first solves general inverse problems using existing denoisers and the second relies on internal information in the given LR image. Our main contribution is using a fast internal learning step at test time to fine-tune the CNN denoisers of a method that we have adapted to the SR problem. Our image-adaptive strategy shows better results than the two mentioned independent strategies.

The advantage of our technique over methods that only learn from other data is very clear in the inexact case (acquisition process not fully known; see Figure 1 for an example). The superiority over schemes that rely only on internal information, such as ZSSR, is very clear in the exact setting, where the downsampling kernel is known. Thus, we may conclude that our proposed approach provides a desirable hybrid solution that combines the two methodologies.

One of the limitations of our proposed method is that it relies on a denoiser fine-tuned on the LR image. Therefore, it is less effective when there are very strong artifacts in this image due to strong noise or blur. Though in this case one may employ the deep image prior [27], which trains directly on corrupted images, it is possible that the best option is simply to rely on the denoising network trained on other clean images. Another possibility in this case is fine-tuning the denoising network on other images that resemble the currently processed image, which is likely to improve the performance as well [24, 25, 19].

Another interesting research direction for this work is training a larger number of denoisers based on the LR image. In this work, we fine-tune only the final denoiser, mainly for the sake of time efficiency, and show that it suffices to improve the results and give better visual quality on real old images. Yet, one may consider training just some components of a network, e.g., the convolution weights or the batch-normalization parameters as in [4], or a certain set of the network parameters [18], rather than all the weights as done in this work. This may require the use of another pre-trained denoising network, as in the current IRCNN network the batch-normalization components have been integrated into the convolution layers after the offline training phase.

Finally, it is important to mention that the proposed approach is generic and may be used with any plug-and-play strategy that relies on a Gaussian denoiser to solve general inverse problems. While we have adapted the IDBP framework [26] to SR (the original IDBP work did not consider this setup), one may apply the strategy proposed in this work to the schemes in [29, 34, 20] as well.

Acknowledgment

This work was supported by the European Research Council (ERC StG 757497 PI Giryes).

References

  • [1] Y. Blau, R. Mechrez, R. Timofte, T. Michaeli, and L. Zelnik-Manor. 2018 PIRM challenge on perceptual image super-resolution. In ECCV Workshops, 2018.
  • [2] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine learning, 3(1):1–122, 2011.
  • [3] J. Bruna, P. Sprechmann, and Y. LeCun. Super-resolution with deep convolutional sufficient statistics. In ICLR, 2016.
  • [4] F. M. Carlucci, L. Porzi, B. Caputo, E. Ricci, and S. Rota Bulò. Autodial: Automatic domain alignment layers. In International Conference on Computer Vision, 2017.
  • [5] C. Dong, C. C. Loy, K. He, and X. Tang. Learning a deep convolutional network for image super-resolution. In European conference on computer vision, pages 184–199. Springer, 2014.
  • [6] W. Dong, G. Shi, Y. Ma, and X. Li. Image restoration via simultaneous sparse coding: Where structured sparsity meets gaussian scale mixture. International Journal of Computer Vision (IJCV), 114(2):217–232, Sep. 2015.
  • [7] W. Dong, L. Zhang, G. Shi, and X. Li. Nonlocally centralized sparse representation for image restoration. IEEE Transactions on Image Processing, 22(4):1620–1630, 2013.
  • [8] K. Egiazarian and V. Katkovnik. Single image super-resolution via bm3d sparse coding. In Signal Processing Conference (EUSIPCO), 2015 23rd European, pages 2849–2853. IEEE, 2015.
  • [9] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In Computer Vision, 2009 IEEE 12th International Conference on, pages 349–356. IEEE, 2009.
  • [10] J.-B. Huang, A. Singh, and N. Ahuja. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5197–5206, 2015.
  • [11] J. Kim, J. Kwon Lee, and K. Mu Lee. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1646–1654, 2016.
  • [12] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [13] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. P. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In CVPR, volume 2, page 4, 2017.
  • [14] B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee. Enhanced deep residual networks for single image super-resolution. In The IEEE conference on computer vision and pattern recognition (CVPR) workshops, volume 1, page 4, 2017.
  • [15] T. Meinhardt, M. Moeller, C. Hazirbas, and D. Cremers. Learning proximal operators: Using denoising networks for regularizing inverse imaging problems. In ICCV, 2017.
  • [16] T. Peleg and M. Elad. A statistical prediction model based on sparse representations for single image super-resolution. IEEE Transactions on Image Processing, 23(6):2569–2582, June 2014.
  • [17] M. Protter, M. Elad, H. Takeda, and P. Milanfar. Generalizing the nonlocal-means to super-resolution reconstruction. IEEE Transactions on Image Processing, 18(1):36–51, Jan 2009.
  • [18] S.-A. Rebuffi, H. Bilen, and A. Vedaldi. Efficient parametrization of multi-domain deep neural networks. In CVPR, 2018.
  • [19] T. Remez, O. Litany, R. Giryes, and A. M. Bronstein. Class-aware fully convolutional gaussian and poisson denoising. IEEE Transactions on Image Processing, 27(11):5707–5722, 2018.
  • [20] Y. Romano, M. Elad, and P. Milanfar. The little engine that could: Regularization by denoising (red). SIAM Journal on Imaging Sciences, 10(4):1804–1844, 2017.
  • [21] Y. Romano, J. Isidoro, and P. Milanfar. RAISR: Rapid and accurate image super resolution. IEEE Trans. on Computational Imaging, 3(1):110–125, Mar. 2017.
  • [22] M. Sajjadi, B. Scholkopf, and M. Hirsch. Enhancenet: Single image super-resolution through automated texture synthesis. In ICCV, 2017.
  • [23] A. Shocher, N. Cohen, and M. Irani. ”zero-shot” super-resolution using deep internal learning. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
  • [24] L. Sun and J. Hays. Super-resolution from internet-scale scene matching. In Proceedings of the IEEE Conf. on International Conference on Computational Photography (ICCP), 2012.
  • [25] A. M. Teodoro, J. M. Bioucas-Dias, and M. A. Figueiredo. Image restoration and reconstruction using variable splitting and class-adapted image priors. In Image Processing (ICIP), 2016 IEEE International Conference on, pages 3518–3522. IEEE, 2016.
  • [26] T. Tirer and R. Giryes. Image restoration by iterative denoising and backward projections. IEEE Transactions on Image Processing, 2018.
  • [27] D. Ulyanov, A. Vedaldi, and V. Lempitsky. Deep image prior. In CVPR, 2018.
  • [28] A. Vedaldi and K. Lenc. Matconvnet: Convolutional neural networks for matlab. In Proceedings of the 23rd ACM international conference on Multimedia, pages 689–692. ACM, 2015.
  • [29] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg. Plug-and-play priors for model based reconstruction. In Global Conference on Signal and Information Processing (GlobalSIP), 2013 IEEE, pages 945–948. IEEE, 2013.
  • [30] Y. Wang, F. Perazzi, B. McWilliams, A. Sorkine-Hornung, O. Sorkine-Hornung, and C. Schroers. A fully progressive approach to single-image super-resolution. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2018.
  • [31] X. Wang, K. Yu, C. Dong, and C. C. Loy. Recovering realistic texture in image super-resolution by deep spatial feature transform. In CVPR, 2018.
  • [32] W. Yang, X. Zhang, Y. Tian, W. Wang, and J.-H. Xue. Deep learning for single image super-resolution: A brief review. arXiv preprint arXiv:1808.03344, 2018.
  • [33] R. Zeyde, M. Elad, and M. Protter. On single image scale-up using sparse-representations. In Proceedings of the 7th international conference on Curves and Surfaces, pages 711–730. Springer-Verlag, 2012.
  • [34] K. Zhang, W. Zuo, S. Gu, and L. Zhang. Learning deep cnn denoiser prior for image restoration. In IEEE Conference on Computer Vision and Pattern Recognition, pages 3929–3938, 2017.
  • [35] K. Zhang, W. Zuo, and L. Zhang. Learning a single convolutional super-resolution network for multiple degradations. In IEEE Conference on Computer Vision and Pattern Recognition, volume 6, 2018.
  • [36] M. Zontak and M. Irani. Internal statistics of a single natural image. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 977–984. IEEE, 2011.

Figure 6: SR (x2) of a real image. Panels: (a) LR image, (b) EDSR+ [14], (c) ZSSR [23], (d) IDBP-CNN, (e) IDBP-CNN-IA. It is recommended to zoom in on the images.

Figure 7: SR (x2) of a real image. Panels: (a) LR image, (b) EDSR+ [14], (c) ZSSR [23], (d) IDBP-CNN, (e) IDBP-CNN-IA. It is recommended to zoom in on the images.

Figure 8: SR (x2) of a real image. Panels: (a) LR image, (b) EDSR+ [14], (c) ZSSR [23], (d) IDBP-CNN, (e) IDBP-CNN-IA. It is recommended to zoom in on the images.

Figure 9: SR (x2) of a real image. Panels: (a) LR image, (b) EDSR+ [14], (c) ZSSR [23], (d) IDBP-CNN, (e) IDBP-CNN-IA. It is recommended to zoom in on the images.