When AWGN-based Denoiser Meets Real Noises
Abstract
Discriminative-learning-based image denoisers have achieved promising performance on synthetic noise such as Additive White Gaussian Noise (AWGN). The synthetic noise adopted in most previous work is pixel-independent, but real noise is mostly spatially/channel-correlated and spatially/channel-variant. This domain gap yields unsatisfactory performance on images with real noise if the model is trained only with AWGN. In this paper, we propose a novel approach to boost the performance of a real image denoiser that is trained only with synthetic, pixel-independent noise data dominated by AWGN. First, we train a deep model consisting of a noise estimator and a denoiser with mixed AWGN and Random Value Impulse Noise (RVIN). We then investigate a Pixel-shuffle Downsampling (PD) strategy to adapt the trained model to real noises. Extensive experiments demonstrate the effectiveness and generalization of the proposed approach. Notably, our method achieves state-of-the-art performance on real sRGB images in the DND benchmark among models trained with synthetic noises. Code is available at https://github.com/yzhouas/PDDenoisingpytorch.
Introduction
As a fundamental task in image processing and computer vision, image denoising has been extensively explored in the past several decades, and it also benefits downstream applications [Zhou, Liu, and Huang2018, Wang et al.2019]. Traditional methods, including those based on image filtering [Dabov et al.2008], low-rank approximation [Gu et al.2014, Xu et al.2017, Yair and Michaeli2018], sparse coding [Elad and Aharon2006], and image priors [Ulyanov, Vedaldi, and Lempitsky2017], have achieved satisfactory results on synthetic noise such as Additive White Gaussian Noise (AWGN). Recently, deep CNNs have been applied to this task, and discriminative-learning-based methods such as DnCNN [Zhang et al.2017a] outperform most traditional methods on AWGN denoising.
Unfortunately, while these learning-based methods work well on the type of synthetic noise they were trained on, their performance degrades rapidly on real images, showing poor generalization in real-world applications. This indicates that such data-driven denoising models are highly domain-specific and inflexible to transfer to noise types beyond AWGN. To improve model flexibility, the recently proposed FFDNet [Zhang, Zuo, and Zhang2018] trains a conditional non-blind denoiser with a manually adjusted noise-level map; however, feeding FFDNet high-valued uniform maps yields only over-smoothed results on real images. Blind denoising of real images therefore remains very challenging due to the lack of an accurate model of the real noise distribution. These unknown real-world noises are much more complex than pixel-independent AWGN: they can be spatially variant, spatially correlated, signal-dependent, and even device-dependent.
To better address real image denoising, current attempts can be roughly divided into the following categories: (1) realistic noise modeling [Shi Guo2018, Brooks et al.2019, Abdelhamed, Timofte, and Brown2019]; (2) noise profiling, such as multi-scale [Lebrun, Colom, and Morel2015a, Yair and Michaeli2018], multi-channel [Xu et al.2017], and region-based [Liu et al.2017] settings; and (3) data augmentation techniques, such as adversarial-learning-based ones [Chen et al.2018]. Among them, CBDNet [Shi Guo2018] achieves good performance by modeling realistic noise with the in-camera pipeline model proposed in [Liu et al.2008]. It also trains an explicit noise estimator and sets a larger penalty for under-estimated noise. The network is trained on both synthetic and real noise, but it still cannot fully characterize real noises. Brooks et al. [Brooks et al.2019] used prior statistics stored in the raw data of DND to augment synthetic RGB data, but this does not demonstrate the generalization of the model to other real noises.
In this work, we approach real image blind denoising from a novel viewpoint: adapting a learning-based denoiser trained on pixel-independent synthetic noises to unknown real noises. As shown in Figure 1, we assume that real noises differ from pixel-independent synthetic noises chiefly in spatial/channel variance and correlation [Stanford2015], a difference that results from the in-camera pipeline, e.g., demosaicing [Zhou et al.2019]. Based on this assumption, we first train a basis denoising network using mixed AWGN and RVIN. Our flexible basis network consists of an explicit noise estimator followed by a conditional denoiser, and we demonstrate that such fully convolutional networks are effective in coping with pixel-independent, spatially/channel-variant noises. Second, we propose a simple yet effective adaptation strategy, Pixel-shuffle Downsampling (PD), which employs a divide-and-conquer idea to handle real noises by breaking down their spatial correlation.
In summary, our main contributions include:

We propose a new flexible deep denoising model (trained with AWGN and RVIN) for both blind and non-blind image denoising. We also demonstrate that such fully convolutional models trained on spatially-invariant noises can handle spatially-variant noises.

We adapt the AWGN-RVIN-trained deep denoiser to real noises via a novel strategy called Pixel-shuffle Downsampling (PD), which breaks spatially-correlated noises down into pixel-wise independent ones. We examine and overcome the identified domain gap to boost real denoising performance.

The proposed method achieves state-of-the-art performance on the DND benchmark and other real noisy RGB images among models trained only with synthetic noises. Note that our model does not use any images or prior metadata from real noise datasets. We also show that the proposed PD strategy can boost the performance of other existing denoising models.
Related Work
Discriminative Learning based Denoiser. Denoising methods based on CNNs have achieved impressive performance on removing synthetic Gaussian noise. Burger et al. [Burger, Schuler, and Harmeling2012] proposed to apply a multi-layer perceptron (MLP) to the denoising task. In [Chen and Pock2017], Chen et al. proposed a trainable non-linear reaction diffusion (TNRD) model for Gaussian noise removal at different levels. DnCNN [Zhang et al.2017a] was the first blind Gaussian denoising network using deep CNNs, demonstrating the effectiveness of residual learning and batch normalization. More network structures, like dilated convolution [Zhang et al.2017b], auto-encoders with skip connections [Mao, Shen, and Yang2016], ResNet [Ren, ElKhamy, and Lee2018], and the recursively branched deconvolutional network (RBDN) [Santhanam, Morariu, and Davis2017], were proposed to either enlarge the receptive field or balance efficiency. Recently, interest has grown in combining image denoising with high-level vision tasks like classification and segmentation. Liu et al. [Liu et al.2017] applied segmentation to enhance the denoising performance on different regions, and similar class-aware work was developed in [Niknejad, BioucasDias, and Figueiredo2017]. Due to domain-specific training and deficient realistic noise data, these deep models are not robust enough on realistic noises. In the recently proposed FFDNet [Zhang, Zuo, and Zhang2018], the authors propose non-blind denoising by concatenating a noise level map with the noisy image. By manually adjusting the noise level to a higher value, FFDNet achieves spatially-invariant denoising on realistic noises at the cost of over-smoothed details.
Blind Denoising on Real Noisy Images. Real noise of CCD cameras is complicated, being related to the optical sensors and the in-camera processing. Specifically, multiple noise sources such as photon noise and read-out noise, and processing steps including demosaicing and color and gamma transformations, introduce the main characteristics of real noise: spatial/channel correlation, variance, and signal dependence. To approximate real noise, multiple types of synthetic noise have been explored in previous work, including Gaussian-Poisson [Foi et al.2008, Liu, Tanaka, and Okutomi2014], Gaussian Mixture Model (GMM) [Zhu, Chen, and Heng2016], in-camera process simulation [Liu et al.2008, Shi Guo2018], and GAN-generated noises [Chen et al.2018], to name a few. CBDNet [Shi Guo2018] first simulated real noise and trained a sub-network for noise estimation, in which spatially-variant noise is represented as spatial maps. Besides, multi-channel [Xu et al.2017, Shi Guo2018] and multi-scale [Lebrun, Colom, and Morel2015a, Yu and Koltun2015] strategies were also investigated for adaptation. Different from all the aforementioned works, which focus on directly synthesizing or simulating noise for training, we apply an AWGN-RVIN model and focus on a pixel-shuffle adaptation strategy to close the gap between pixel-independent synthetic noise and pixel-correlated real noise.
Methodology
Basis Noise Model
The basis noise model is mixed AWGN-RVIN. Noise in sRGB images can no longer be approximated as Gaussian-Poisson, as it can in raw sensor data, mainly due to the gamma transform, demosaicing, and other interpolations. In Figure 2, we follow the pipeline of [Liu et al.2008] to synthesize noisy images, and plot the Noise Level Functions (NLFs) (noise variance as a function of image intensity) before (first row) and after (second row) the gamma-correction transform and demosaicing. From left to right, the gamma factor increases. It shows that in RGB images, the clipping effect and other non-linear transforms greatly alter the originally linear noise variance-intensity relationship of raw sensor data, and even change the noise mean. Though complicated, as a model more general than Gaussian-Poisson noise that accommodates different non-linear transforms, real noise in RGB images can still be locally approximated as AWGN [Zhang, Zuo, and Zhang2018, Lee1980, Xu, Zhang, and Zhang2018]. In this paper, we thus assume RGB noise to be approximately spatially-variant and spatially-correlated AWGN.
Adding RVIN to the training data aims at explicitly resolving defective pixels caused by dead pixels of the camera hardware or by long exposure, which frequently appear in night-shot images. We generate AWGN, RVIN, and mixed AWGN-RVIN following PGB [Xu et al.2016].
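A minimal NumPy sketch of the mixed AWGN-RVIN synthesis described above; the function name and default parameters are illustrative, and the actual generator follows PGB [Xu et al.2016]:

```python
import numpy as np

def add_awgn_rvin(img, sigma=25.0, ratio=0.1, rng=None):
    """Corrupt a [0, 255] float image with AWGN (std `sigma`), then
    replace a `ratio` fraction of pixels per channel with uniformly
    random values (Random Value Impulse Noise)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img + rng.normal(0.0, sigma, img.shape)      # AWGN component
    mask = rng.random(img.shape) < ratio                 # impulse locations
    impulses = rng.uniform(0.0, 255.0, img.shape)        # random replacement values
    noisy = np.where(mask, impulses, noisy)
    return np.clip(noisy, 0.0, 255.0)
```

Setting `ratio=0` yields pure AWGN and `sigma=0` yields pure RVIN, so the same routine covers the single-type and mixed training cases.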
Basis Model Structure
The architecture of the proposed basis model is illustrated in Figure 3.
The proposed blind denoising model consists of a noise estimator E and a follow-up non-blind denoiser R. Given a noisy observation y = n(x), where n is the synthetic noising process and x is the noise-free image, the model aims to jointly learn the residual y - x, and it is trained on paired synthetic data (y, x). Specifically, the noise estimator outputs the estimated noise level maps \hat{M} = E(y), consisting of six pixel-wise maps that correspond to the two noise types, i.e., AWGN and RVIN, across the three channels (R, G, B). Then y is concatenated with the estimated noise level maps \hat{M} and fed into the non-blind denoiser R, which outputs the noise residual R(y, \hat{M}). Three objectives are proposed to supervise the network training: the noise estimation (L_E), blind (L_B), and non-blind (L_{NB}) image denoising objectives, defined as

L_E = \| E(y; \theta_E) - M \|_2^2,  (1)

L_B = \| R(y, E(y); \theta_R) - (y - x) \|_2^2,  (2)

L_{NB} = \| R(y, M; \theta_R) - (y - x) \|_2^2,  (3)

where \theta_E and \theta_R are the trainable parameters of E and R, and M is the ground-truth noise level map for y, consisting of M_{AWGN} and M_{RVIN}. For AWGN, M_{AWGN} is represented as even maps filled with the same standard deviation value, ranging from 0 to 75, across the R, G, B channels. For RVIN, M_{RVIN} is represented as maps valued with the corrupted-pixel ratio, with the upper bound set to 0.3. We further normalize M to the range [0, 1]. The full objective can then be represented as a weighted sum of the above three losses,

L = \lambda_E L_E + \lambda_B L_B + \lambda_{NB} L_{NB},  (4)

in which \lambda_E, \lambda_B, and \lambda_{NB} are hyper-parameters balancing the three losses; we set them to be equal for simplicity.
The proposed model can perform both blind and non-blind denoising simultaneously, and is thus more flexible for interactive denoising and result adjustment. Explicit noise estimation also benefits noise modeling and disentanglement.
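As an illustration of the supervision targets, here is a minimal NumPy sketch of the normalized ground-truth level maps and the weighted three-term L2 objective; the function names and the per-pixel mean reduction are assumptions for illustration, not the exact training code:

```python
import numpy as np

def make_noise_maps(h, w, sigmas, ratios):
    """Ground-truth level maps M: six H x W maps, three AWGN maps
    (sigma normalized by 75) and three RVIN maps (ratio normalized
    by 0.3), one per RGB channel, all in [0, 1]."""
    awgn = np.stack([np.full((h, w), s / 75.0) for s in sigmas])
    rvin = np.stack([np.full((h, w), r / 0.3) for r in ratios])
    return np.concatenate([awgn, rvin])          # shape (6, H, W)

def total_loss(est_maps, gt_maps, blind_res, nonblind_res, true_res,
               w_e=1.0, w_b=1.0, w_nb=1.0):
    """Weighted sum of the estimation, blind, and non-blind L2 objectives."""
    l_e  = np.mean((est_maps - gt_maps) ** 2)      # noise estimation loss
    l_b  = np.mean((blind_res - true_res) ** 2)    # blind residual loss
    l_nb = np.mean((nonblind_res - true_res) ** 2) # non-blind residual loss
    return w_e * l_e + w_b * l_b + w_nb * l_nb
```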
Pixel-shuffle Downsampling (PD) Adaptation
Pixel-shuffle Downsampling.
Pixel-shuffle [Shi et al.2016] downsampling creates a mosaic by sampling the image with stride s. Compared to other downsampling methods such as linear interpolation, bicubic interpolation, and pixel-area relation, pixel-shuffle and nearest-neighbour downsampling of a noisy image do not alter the real noise distribution. In addition, pixel-shuffle benefits image recovery by preserving the original pixels of the image. These two advantages yield the two stages of the PD strategy: adaptation and refinement.
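A minimal NumPy sketch of pixel-shuffle downsampling and its inverse (names are illustrative; the mosaic layout simply tiles the stride**2 sub-images):

```python
import numpy as np

def pd_down(img, stride):
    """Pixel-shuffle downsample: split an H x W x C image into
    stride**2 sub-images by taking every `stride`-th pixel, then
    tile them into a mosaic of the (cropped) original size."""
    h, w, c = img.shape
    h, w = h - h % stride, w - w % stride          # crop to a multiple of stride
    img = img[:h, :w]
    subs = [img[i::stride, j::stride]
            for i in range(stride) for j in range(stride)]
    rows = [np.concatenate(subs[r * stride:(r + 1) * stride], axis=1)
            for r in range(stride)]
    return np.concatenate(rows, axis=0), subs

def pd_up(subs, stride):
    """Pixel-shuffle upsample: re-interleave the sub-images back
    onto the original pixel grid."""
    h, w, c = subs[0].shape
    out = np.empty((h * stride, w * stride, c), dtype=subs[0].dtype)
    for i in range(stride):
        for j in range(stride):
            out[i::stride, j::stride] = subs[i * stride + j]
    return out
```

Because the operation only rearranges existing pixels, `pd_up(pd_down(img, s)[1], s)` recovers the input exactly, which is the property the refinement stage relies on.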
Adaptation.
A learning-based denoiser trained on AWGN is not robust enough to real noises due to the domain difference. To adapt the trained model to real noise, we briefly analyze and justify our assumption on the difference between real noises and Gaussian noise: spatial/channel variance and correlation.
Suppose a noise estimator is robust, meaning it can accurately estimate the exact noise level of a single AWGN-corrupted image. Then pixel-shuffle downsampling will influence neither the AWGN variance nor the estimated values, as long as the sampling stride is small enough to preserve the textural structures. Extending this to the real-noise case, we make an interesting hypothesis: as we increase the sampling stride of pixel-shuffle, the estimates of a given noise estimator will first fluctuate and then stay steady over a couple of stride increments. This hypothesis is plausible because pixel-shuffle breaks spatially-correlated noise patterns down into pixel-independent ones, which can be approximated as spatially-variant AWGN and handled by those estimators.
We justify this hypothesis with both the estimator of [Liu, Tanaka, and Okutomi2013] and our proposed pixel-wise estimator. As shown in Figure 1, we randomly crop a patch from a random noisy image in SIDD [Abdelhamed, Lin, and Brown2018], and add AWGN to its noise-free ground truth. After pixel-shuffling both the AWGN-corrupted patch and the real noisy patch, the noise pattern of the latter demonstrates the expected pixel independence once the stride is large enough. Using [Liu, Tanaka, and Okutomi2013], the estimated level for the AWGN-corrupted patch is unchanged in Figure 4 (a) (left), but the one for the real noisy patch in Figure 4 (a) (right) first increases and then stays steady beyond a certain stride. This is consistent with the visual pattern and our hypothesis.
One assumption of [Liu, Tanaka, and Okutomi2013] is that the noise is additive and evenly distributed across the image. For spatially-variant, signal-dependent real noises, our pixel-wise estimator has its superiority. To collect statistics of the spatially-variant noise estimates, we extract the three AWGN channels of the noise map \hat{M} of size W x H x 3, where W and H are the width and height of the input image, and compute the normalized 10-bin histogram H_s^c of each channel c when the stride is s. We introduce the changing factor \beta_s to monitor how the noise map distribution changes as the stride increases,

\beta_s = \frac{1}{3} \sum_{c=1}^{3} \| H_s^c - H_{s-1}^c \|_1,  (5)

where c is the channel index. We then investigate the difference of the \beta_s sequence between AWGN and realistic noises. Specifically, we randomly select 50 images from CBSD68 [Roth and Black2009] and add random-level AWGN to them. For comparison, we randomly pick 50 image patches from the DND benchmark. In Figure 4 (b), the \beta_s sequence remains close to zero for all AWGN-corrupted images (left), while \beta_s for real noises exhibits an abrupt drop at s = 2, indicating that the spatial correlation has been broken starting from s = 2.
The above analysis inspires the proposed adaptation strategy based on pixel-shuffle. Intuitively, we aim to find the smallest stride that makes the downsampled spatially-correlated noises match pixel-independent AWGN. Thus we keep increasing the stride until the changing factor drops below a threshold \tau. We run the above experiment on CBSD68 for 100 iterations to select a properly generalized threshold; after averaging the maximum changing factor of each iteration, we set \tau empirically.
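The adaptation procedure above can be sketched in NumPy. The normalized 10-bin histograms follow the text; the specific histogram distance (mean per-channel L1) and the threshold value `tau` are assumptions for illustration:

```python
import numpy as np

def channel_hists(noise_map, bins=10):
    """Normalized 10-bin histogram of each AWGN channel of an
    estimated noise map (values assumed normalized to [0, 1])."""
    return np.stack([np.histogram(noise_map[c], bins=bins,
                                  range=(0.0, 1.0))[0] / noise_map[c].size
                     for c in range(noise_map.shape[0])])

def changing_factor(h_prev, h_curr):
    """Histogram change between consecutive strides, averaged over
    the three channels (L1 distance here is an assumption)."""
    return np.mean(np.abs(h_curr - h_prev).sum(axis=1))

def select_stride(maps_per_stride, tau=0.1):
    """Smallest stride whose changing factor drops below `tau`.
    `maps_per_stride[i]` is the noise map estimated at stride i+1."""
    hists = [channel_hists(m) for m in maps_per_stride]
    for s in range(1, len(hists)):
        if changing_factor(hists[s - 1], hists[s]) < tau:
            return s + 1
    return len(hists)  # fall back to the largest stride tried
```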
PD Refinement.
Figure 5 shows the proposed Pixel-shuffle Downsampling (PD) refinement strategy: (1) compute the smallest stride (2 in this example and in most digital-camera cases) needed to match AWGN following the adaptation process, and pixel-shuffle the image into a mosaic; (2) denoise the mosaic using the basis denoiser; (3) refill each sub-image with noisy blocks separately and pixel-shuffle upsample them; (4) denoise each refilled image again with the basis denoiser and average the outputs to obtain the 'texture details' x_{detail}; (5) combine with the over-smoothed 'flat regions' x_{flat} to refine the final result.
As summarized in [Liu et al.2008], the goals of noise removal include preserving texture details and boundaries, smoothing flat regions, and avoiding artifacts. Therefore, in step (5) above, we propose to further refine the denoised image by combining the 'texture details' x_{detail} with the 'flat regions' x_{flat}. 'Flat regions' can be obtained from over-smoothed denoising results generated by lifting the noise estimation levels. In this work, given a noisy observation y, the refined noise maps are defined as

\hat{M}'_c = \max_{i, j} \hat{M}_c(i, j),  (6)

i.e., each estimated map is lifted to a uniform map filled with its maximum value. Consequently, the 'flat region' is defined as x_{flat} = PU(D(PD(y), \hat{M}')), where D denotes denoising with the lifted maps, and PD and PU are pixel-shuffle downsampling and upsampling. The final result is obtained by blending, \hat{x} = (1 - k) x_{detail} + k x_{flat}.
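The refinement steps can be sketched as follows, with the denoiser left pluggable. This is a simplification under stated assumptions: the refill-and-average of steps (3)-(4) is collapsed into a single denoising pass per sub-image, and the blending form (1 - k) * detail + k * flat is our reading of the final combination:

```python
import numpy as np

def pd_refine(y, denoise, stride=2, k=0.0):
    """Simplified PD refinement: pixel-shuffle downsample `y` into
    stride**2 sub-images, denoise each, re-interleave them to get a
    texture-preserving estimate, then blend with an over-smoothed
    'flat' estimate using factor k (k = 0 keeps the most texture).
    `denoise` is any image -> image denoiser; H and W must be
    multiples of `stride`."""
    subs = [y[i::stride, j::stride]
            for i in range(stride) for j in range(stride)]
    den = [denoise(s) for s in subs]                  # step (2) per sub-image
    detail = np.empty_like(y)
    for i in range(stride):
        for j in range(stride):
            detail[i::stride, j::stride] = den[i * stride + j]  # steps (3)-(4), simplified
    flat = denoise(y)                                 # over-smoothed 'flat' estimate
    return (1.0 - k) * detail + k * flat              # step (5): blend the two
```

With an identity denoiser the routine returns the input unchanged, which is a quick sanity check that the rearrangement itself is lossless.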
Experiments
Implementation Details
In this work, the structures of the noise estimator and the non-blind denoiser follow DnCNN [Zhang et al.2017a], with 5 layers and 20 layers respectively. For the grayscale model, we also follow DnCNN and crop patches from the same 400 training images. For the color model, we crop patches with stride 10 from 432 color images in the Berkeley segmentation dataset (BSD) [Roth and Black2009]. The training data ratio of single-type noises (either AWGN or RVIN) to mixed noises (AWGN and RVIN) is 1:1. During training, the Adam optimizer is used with a batch size of 128; the learning rate is decayed once after 30 epochs, and training stops at epoch 50.
To evaluate the algorithm on synthetic noise (AWGN, mixed AWGN-RVIN, and spatially-variant Gaussian), we use benchmark data from BSD68, Set20 [Xu et al.2016], and CBSD68 [Roth and Black2009]. For realistic noise, we test on RNI15 [Online2015a], the DND benchmark [Plötz and Roth2017], and self-captured night photos. We evaluate performance in terms of PSNR and SSIM, and also present qualitative comparisons with other state-of-the-art methods.
Evaluation with Synthetic Noise
Mixed AWGN and RVIN.
Table 1: PSNR (dB) results of mixed AWGN and RVIN removal on Set20.

(σ, r)      BM3D   WNNM   PGB    DnCNN-B  Ours-NB  Ours-B
(10, 0.15)  25.18  25.41  27.17  32.09    32.43    32.37
(10, 0.30)  21.80  21.40  22.17  29.97    30.47    30.32
(20, 0.15)  25.13  23.57  26.12  29.52    29.82    29.76
(20, 0.30)  21.73  21.40  21.89  27.90    28.41    28.16
Our model follows a structure similar to DnCNN and FFDNet [Zhang, Zuo, and Zhang2018], so its performance on single-type AWGN removal is also similar to theirs. We thus evaluate our model on eliminating mixed AWGN and RVIN on Set20, as in [Xu et al.2016]. We compare our method with baselines including BM3D [Dabov et al.2006] and WNNM [Gu et al.2014], which are non-blind Gaussian denoisers anchored at a specific noise level estimated by the approach of [Liu, Tanaka, and Okutomi2013]. We also include the PGB [Xu et al.2016] denoiser, designed for mixed AWGN and RVIN, and the blind DnCNN-B, trained with the same strategy as our model, for reference. The comparison results are shown in Table 1, from which we can see that the proposed method achieves the best performance. Compared with DnCNN-B, our model explicitly disentangles the complicated mixed noises, which helps the conditional denoiser differentiate the noise types.
Signal-dependent Spatially-variant Noise.
Table 2: PSNR (dB) results on signal-dependent spatially-variant Gaussian noise.

(σ_s, σ_c)  BM3D   FFDNet  DnCNN-B  CBDNet  Ours-B
(20, 10)    29.09  28.54   34.38    33.04   34.75
(20, 20)    29.08  28.70   31.72    29.77   31.32
(40, 10)    23.21  28.67   32.08    30.89   32.12
(40, 20)    23.21  28.80   30.32    28.76   30.33
We conduct experiments to examine the generalization ability of fully convolutional models on the signal-dependent noise model [Shi Guo2018, Foi et al.2008, Liu, Tanaka, and Okutomi2014]. Given a clean image x, the noise in the noisy observation contains both a signal-dependent component with variance x σ_s^2 and an independent component with variance σ_c^2. Table 2 shows that non-blind models like BM3D and FFDNet, which are paired with only a scalar noise estimator [Liu, Tanaka, and Okutomi2013], cannot cope well with the spatially-variant cases. In this experiment, DnCNN-B is the original blind model trained on AWGN with σ ranging from 0 to 55. The results show that spatially-variant Gaussian noise can still be handled by a fully convolutional model trained with spatially-invariant AWGN [Zhang, Zuo, and Zhang2018]. Compared to DnCNN-B, the proposed network explicitly estimates the pixel-wise noise map, making the model more flexible and amenable to real-noise adaptation.
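A sketch of the signal-dependent noise synthesis under the common heteroscedastic Gaussian assumption (per-pixel variance x σ_s^2 + σ_c^2 on a [0, 1] image, with the sigmas given on the 255 scale as in Table 2); the exact parameterization used in the experiments may differ:

```python
import numpy as np

def add_signal_dependent_noise(x, sigma_s=20.0, sigma_c=10.0, rng=None):
    """Heteroscedastic Gaussian noise: the per-pixel std map combines a
    signal-dependent term (variance proportional to intensity via
    sigma_s) and an intensity-independent term (sigma_c)."""
    rng = np.random.default_rng() if rng is None else rng
    s, c = sigma_s / 255.0, sigma_c / 255.0          # rescale to [0, 1] range
    std = np.sqrt(s ** 2 * x + c ** 2)               # spatially-variant std map
    return np.clip(x + rng.normal(0.0, 1.0, x.shape) * std, 0.0, 1.0)
```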
Evaluation with Real RGB Noise
Qualitative Comparisons.
Some qualitative denoising results on DND are shown in Figure 6. The compared DND results are all obtained directly online from the original submissions of the authors. The methods included in the comparison cover blind real denoisers (CBDNet, NI [Online2015b], and NC [Lebrun, Colom, and Morel2015b]), a blind Gaussian denoiser (CDnCNN-B), and non-blind Gaussian denoisers (CBM3D, WNNM [Gu et al.2014], and FFDNet). From these examples, we observe that some results are either still noisy (as in DnCNN and WNNM) or uniformly over-smoothed (as in FFDNet). CBDNet performs better than the others, but it still suffers from blurred edges and residual noise in the background. Our proposed method (PD) achieves better spatially-variant denoising, smoothing the background while preserving the textural details, in a fully blind setting.
Quantitative Results on DND Benchmark.
The images in the DND benchmark are captured by digital cameras and demosaiced from raw sensor data, so we simply set the stride number to 2. We follow the submission guideline of the DND dataset to evaluate our algorithm. Recently, many learning-based methods like Path-Restore [Yu et al.2019], RIDNet [Anwar and Barnes2019], WDnCNN [Zhao, Lam, and Lun2019], and CBDNet have achieved promising performance on DND, but they are all fine-tuned on real noisy images or use prior knowledge from the metadata of DND [Brooks et al.2019]. For a fair comparison, we select representative conventional methods (MC-WNNM, EPLL, TWSC, CBM3D) and learning-based methods trained only with synthetic noises. The results are shown in Table 3. Models trained on AWGN (DnCNN, TNRD, MLP) perform poorly on real RGB noises, mainly due to the large gap between AWGN and real noise. CBDNet improves the results significantly by training deep networks with an artificial realistic noise model. Our AWGN-RVIN-trained model with PD refinement achieves much better results (+0.83 dB) than CBDNet trained only with synthetic noises, and PD also boosts the performance of other AWGN-based methods (+PD rows). Compared to the base model, the proposed adaptation improves performance on real noises by 5.8 dB. Note that our model is trained only on synthetic noises and does not utilize any prior data of DND.
Table 3: Quantitative results (PSNR/SSIM) on the DND benchmark.

Method  PSNR  SSIM

MC-WNNM [Xu et al.2017]  37.38  0.929
EPLL [Zoran and Weiss2011]  33.51  0.824
TWSC [Xu, Zhang, and Zhang2018]  37.93  0.940
MLP [Burger, Schuler, and Harmeling2012]  34.23  0.833
TNRD [Chen and Pock2017]  33.65  0.830
CBDNet(Syn) [Shi Guo2018]  37.57  0.936
CBM3D [Dabov et al.2008]  34.51  0.850
CBM3D(+PD)  35.02  0.873
CDnCNN-B [Zhang et al.2017a]  32.43  0.790
CDnCNN-B(+PD)  35.44  0.876
FFDNet [Zhang, Zuo, and Zhang2018]  34.40  0.847
FFDNet(+PD)  37.56  0.931
Our Base Model (No PD)  32.60  0.788
Ours (Full Pipeline)  38.40  0.945
Ablation Study on Real RGB Noise
Adding RVIN.
Training the model with mixed AWGN and RVIN benefits the removal of dead or over-exposed pixels in real images. For comparison, we train another model only with AWGN and test it on real noisy night photos. An example using the full pipeline is shown in Figure 7, demonstrating the benefit of including RVIN in the training data: although the AWGN-only model also achieves promising denoising performance, it is not effective on dead pixels.
Stride Selection.
We apply different stride numbers while refining the denoised results and compare the visual quality in Figure 8 (a)(b). For an arbitrary sRGB image, the stride number can be computed by our adaptation algorithm with the assistance of the noise estimator; in our experiments, the selected stride is the smallest one whose changing factor drops below the threshold. A small stride treats large noise patterns as textures to be preserved, as shown in Figure 8 (b), while a large stride tends to break textural structures and details. Interestingly, as shown in Figure 8 (b), the texture of the fabric becomes invisible with larger strides.
Table 4: Ablation of stride and refinement steps (PSNR/SSIM).

Model  (s=1)  (s=3, Full)  (s=2, I)  (s=2, DI)  (s=2, Full)

PSNR  32.60  37.90  37.00  37.20  38.40
SSIM  0.7882  0.9349  0.9339  0.9361  0.9452
Image Refinement Process.
The ablation on the refinement steps is shown in Figure 8 (c)(d) and Table 4, in which we compare the denoised results of I (directly pixel-shuffle upsampling after step (2)), DI (denoising I with the basis denoiser), and Full (the whole pipeline). Both I and DI introduce additional visible artifacts, while the whole pipeline smooths them out and achieves the best visual quality.
Blending Factor k.
Due to the ambiguous nature of fine textures and mid-frequency noises, human intervention in the denoising level is inevitable. The factor k is introduced as a 'linear' adjustment of the denoising level for more flexible and interactive user control. Using the blending factor is more stable and safer for preserving spatially-variant details than directly adjusting the estimated noise level as in CBDNet. In Figure 9, as k increases, the denoised results tend to be over-smoothed, which suits images dominated by background patterns, while a smaller k preserves more fine details, which suits images with more foreground objects. In most cases, users can simply set k to 0 to obtain the most detailed texture recovery and visually plausible results.
Conclusions
In this paper, we revisit real image blind denoising from a new viewpoint. We assume realistic noises are spatially/channel-variant and correlated, and address the adaptation from AWGN-RVIN noises to real noises. Specifically, we propose an image blind and non-blind denoising network trained on an AWGN-RVIN noise model, consisting of an explicit multi-type multi-channel noise estimator and an adaptive conditional denoiser. To generalize the network to real noises, we investigate the Pixel-shuffle Downsampling (PD) refinement strategy. We show qualitatively that PD performs better in both spatially-variant denoising and detail preservation. Results on the DND benchmark and other realistic noisy images demonstrate that the proposed model with this strategy is efficient in handling the spatial/channel variance and correlation of real noises without explicit modeling.
References
 [Abdelhamed, Lin, and Brown2018] Abdelhamed, A.; Lin, S.; and Brown, M. S. 2018. A highquality denoising dataset for smartphone cameras. In CVPR.
 [Abdelhamed, Timofte, and Brown2019] Abdelhamed, A.; Timofte, R.; and Brown, M. S. 2019. NTIRE 2019 challenge on real image denoising: Methods and results. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops.
 [Anwar and Barnes2019] Anwar, S., and Barnes, N. 2019. Real image denoising with feature attention. arXiv preprint arXiv:1904.07396.
 [Brooks et al.2019] Brooks, T.; Mildenhall, B.; Xue, T.; Chen, J.; Sharlet, D.; and Barron, J. T. 2019. Unprocessing images for learned raw denoising. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 11036–11045.
 [Burger, Schuler, and Harmeling2012] Burger, H. C.; Schuler, C. J.; and Harmeling, S. 2012. Image denoising: Can plain neural networks compete with bm3d? In CVPR.
 [Chen and Pock2017] Chen, Y., and Pock, T. 2017. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE transactions on pattern analysis and machine intelligence 39(6):1256–1272.
 [Chen et al.2018] Chen, J.; Chen, J.; Chao, H.; and Yang, M. 2018. Image blind denoising with generative adversarial network based noise modeling. In CVPR.
 [Dabov et al.2006] Dabov, K.; Foi, A.; Katkovnik, V.; and Egiazarian, K. 2006. Image denoising with blockmatching and 3d filtering. In Image Processing: Algorithms and Systems, Neural Networks, and Machine Learning, volume 6064, 606414. International Society for Optics and Photonics.
 [Dabov et al.2008] Dabov, K.; Foi, A.; Katkovnik, V.; and Egiazarian, K. 2008. Image restoration by sparse 3d transformdomain collaborative filtering. In Image Processing: Algorithms and Systems VI, volume 6812, 681207. International Society for Optics and Photonics.
 [Elad and Aharon2006] Elad, M., and Aharon, M. 2006. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image processing 15(12):3736–3745.
 [Foi et al.2008] Foi, A.; Trimeche, M.; Katkovnik, V.; and Egiazarian, K. 2008. Practical poissoniangaussian noise modeling and fitting for singleimage rawdata. IEEE Transactions on Image Processing 17(10):1737–1754.
 [Gu et al.2014] Gu, S.; Zhang, L.; Zuo, W.; and Feng, X. 2014. Weighted nuclear norm minimization with application to image denoising. In CVPR.
 [Lebrun, Colom, and Morel2015a] Lebrun, M.; Colom, M.; and Morel, J.M. 2015a. Multiscale image blind denoising. IEEE Transactions on Image Processing 24(10):3149–3161.
 [Lebrun, Colom, and Morel2015b] Lebrun, M.; Colom, M.; and Morel, J.M. 2015b. The noise clinic: a blind image denoising algorithm. Image Processing On Line 5:1–54.
 [Lee1980] Lee, J.S. 1980. Refined filtering of image noise using local statistics. Technical report, NAVAL RESEARCH LAB WASHINGTON DC.
 [Liu et al.2008] Liu, C.; Szeliski, R.; Kang, S. B.; Zitnick, C. L.; and Freeman, W. T. 2008. Automatic estimation and removal of noise from a single image. IEEE Trans. Pattern Anal. Mach. Intell. 30(2):299–314.
 [Liu et al.2017] Liu, D.; Wen, B.; Liu, X.; Wang, Z.; and Huang, T. S. 2017. When image denoising meets highlevel vision tasks: A deep learning approach. arXiv preprint arXiv:1706.04284.
 [Liu, Tanaka, and Okutomi2013] Liu, X.; Tanaka, M.; and Okutomi, M. 2013. Singleimage noise level estimation for blind denoising. IEEE transactions on image processing 22(12):5226–5237.
 [Liu, Tanaka, and Okutomi2014] Liu, X.; Tanaka, M.; and Okutomi, M. 2014. Practical signaldependent noise parameter estimation from a single noisy image. IEEE Transactions on Image Processing 23(10):4361–4371.
 [Mao, Shen, and Yang2016] Mao, X.; Shen, C.; and Yang, Y.B. 2016. Image restoration using very deep convolutional encoderdecoder networks with symmetric skip connections. In NeurIPS.
 [Niknejad, BioucasDias, and Figueiredo2017] Niknejad, M.; BioucasDias, J. M.; and Figueiredo, M. A. 2017. Classspecific poisson denoising by patchbased importance sampling. arXiv preprint arXiv:1706.02867.
 [Online2015a] Online. 2015a. [online] available:. https://ni.neatvideo.com/home.
 [Online2015b] Online. 2015b. [online] available:. https://ni.neatvideo.com/.
 [Plötz and Roth2017] Plötz, T., and Roth, S. 2017. Benchmarking denoising algorithms with real photographs. In CVPR.
 [Ren, ElKhamy, and Lee2018] Ren, H.; ElKhamy, M.; and Lee, J. 2018. Dnresnet: Efficient deep residual network for image denoising. arXiv preprint arXiv:1810.06766.
 [Roth and Black2009] Roth, S., and Black, M. J. 2009. Fields of experts. International Journal of Computer Vision 82(2):205.
 [Santhanam, Morariu, and Davis2017] Santhanam, V.; Morariu, V. I.; and Davis, L. S. 2017. Generalized deep image to image regression. In CVPR.
 [Shi et al.2016] Shi, W.; Caballero, J.; Huszár, F.; Totz, J.; Aitken, A. P.; Bishop, R.; Rueckert, D.; and Wang, Z. 2016. Realtime single image and video superresolution using an efficient subpixel convolutional neural network. In CVPR.
 [Shi Guo2018] Guo, S.; Yan, Z.; Zhang, K.; Zuo, W.; and Zhang, L. 2018. Toward convolutional blind denoising of real photographs. arXiv preprint arXiv:1807.04686.
 [Stanford2015] Stanford. 2015. Demosaicking and denoising. https://web.stanford.edu/group/vista/cgi-bin/wiki/index.php/Demosaicking_and_Denoising.
 [Ulyanov, Vedaldi, and Lempitsky2017] Ulyanov, D.; Vedaldi, A.; and Lempitsky, V. 2017. Deep image prior. arXiv preprint arXiv:1711.10925.
 [Wang et al.2019] Wang, C.; Huang, H.; Han, X.; and Wang, J. 2019. Video inpainting by jointly learning temporal structure and spatial details. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 5232–5239.
 [Xu et al.2016] Xu, J.; Ren, D.; Zhang, L.; and Zhang, D. 2016. Patch group based bayesian learning for blind image denoising. In ACCV.
 [Xu et al.2017] Xu, J.; Zhang, L.; Zhang, D.; and Feng, X. 2017. Multichannel weighted nuclear norm minimization for real color image denoising. In ICCV.
 [Xu, Zhang, and Zhang2018] Xu, J.; Zhang, L.; and Zhang, D. 2018. A trilateral weighted sparse coding scheme for realworld image denoising. arXiv preprint arXiv:1807.04364.
 [Yair and Michaeli2018] Yair, N., and Michaeli, T. 2018. Multiscale weighted nuclear norm image restoration. In CVPR.
 [Yu and Koltun2015] Yu, F., and Koltun, V. 2015. Multiscale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122.
 [Yu et al.2019] Yu, K.; Wang, X.; Dong, C.; Tang, X.; and Loy, C. C. 2019. Pathrestore: Learning network path selection for image restoration. arXiv preprint arXiv:1904.10343.
 [Zhang et al.2017a] Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; and Zhang, L. 2017a. Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising. IEEE Transactions on Image Processing 26(7):3142–3155.
 [Zhang et al.2017b] Zhang, K.; Zuo, W.; Gu, S.; and Zhang, L. 2017b. Learning deep cnn denoiser prior for image restoration. In CVPR.
 [Zhang, Zuo, and Zhang2018] Zhang, K.; Zuo, W.; and Zhang, L. 2018. Ffdnet: Toward a fast and flexible solution for cnn based image denoising. IEEE Transactions on Image Processing.
 [Zhao, Lam, and Lun2019] Zhao, R.; Lam, K.M.; and Lun, D. P. 2019. Enhancement of a cnnbased denoiser based on spatial and spectral analysis. In 2019 IEEE International Conference on Image Processing (ICIP), 1124–1128. IEEE.
 [Zhou et al.2019] Zhou, Y.; Jiao, J.; Huang, H.; Wang, J.; and Huang, T. 2019. Adaptation strategies for applying awgnbased denoiser to realistic noise. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 10085–10086.
 [Zhou, Liu, and Huang2018] Zhou, Y.; Liu, D.; and Huang, T. 2018. Survey of face detection on lowquality images. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 769–773. IEEE.
 [Zhu, Chen, and Heng2016] Zhu, F.; Chen, G.; and Heng, P.A. 2016. From noise modeling to blind image denoising. In CVPR.
 [Zoran and Weiss2011] Zoran, D., and Weiss, Y. 2011. From learning models of natural image patches to whole image restoration. In ICCV.