Conventional optimization-based methods have used forward models with image priors to solve inverse problems in image processing. Recently, deep neural networks (DNNs) have been investigated to significantly improve the image quality of solutions to inverse problems. Most DNN-based approaches to inverse problems have focused on using data-driven image priors learned from massive amounts of data. However, these methods often do not inherit desirable properties of conventional, theoretically well-grounded optimization algorithms, such as monotone and global convergence. Here we investigate another way of using DNNs for inverse problems in image processing. We propose methods that use DNNs to seamlessly speed up the convergence of conventional optimization-based methods. Our DNN-incorporated scaled gradient projection methods, without breaking these theoretical properties, significantly improved convergence speed in practice over state-of-the-art conventional optimization methods such as ISTA and FISTA for inverse problems including image inpainting, compressive image recovery with partial Fourier samples, image deblurring, and medical image reconstruction with sparse-view projections.
Speeding up scaled gradient projection methods using deep neural networks
for inverse problems in image processing
Byung Hyun Lee 1 Se Young Chun 1
Preprint. Work in progress.
Inverse problems have been widely investigated for image processing applications such as image denoising (Roth & Black, 2005; Mairal et al., 2009; Zoran & Weiss, 2011), image inpainting (Roth & Black, 2005), image deblurring (Zoran & Weiss, 2011), image recovery from incomplete Fourier samples (Patel et al., 2012), and image reconstruction from noisy Radon transformed measurements (Zhong et al., 2013). Before the advent of deep learning based approaches, most state-of-the-art methods for inverse problems in image processing used (iterative) convergent optimization algorithms with accurate forward modeling of image degradation processes and reasonable, efficient image priors, such as minimum total variation (TV), to yield the final estimated images. Iterative algorithms have achieved state-of-the-art performance for inverse problems in image processing, but they are often time-consuming. There have been numerous attempts to improve the convergence rate of optimization algorithms, such as the iterative shrinkage-thresholding algorithm (ISTA) (Figueiredo & Nowak, 2003), the fast iterative shrinkage-thresholding algorithm (FISTA) (Beck & Teboulle, 2009), approximate message passing (AMP) (Donoho et al., 2009), the alternating direction method of multipliers (ADMM) (Boyd et al., 2011), and optimized first-order methods (OGM) (Kim & Fessler, 2016), to name a few.
Deep neural network (DNN) based approaches have been widely investigated for various computer vision problems and yield unprecedented state-of-the-art results (LeCun et al., 2015). Recently, DNNs have also been investigated to significantly improve the image quality of solutions to inverse problems in image processing. Most DNN based approaches to inverse problems have focused on using data-driven image priors learned from massive amounts of data. There are largely three different approaches in DNN based methods for inverse problems in image processing. Firstly, the direct mapping approach trains DNNs to map a given low quality image to a high quality image and yields state-of-the-art performance in image inpainting (Xie et al., 2012), image denoising (Zhang et al., 2017), single image super resolution (Lim et al., 2017), and medical image reconstruction (Jin et al., 2017). Secondly, the learned iterative algorithm approach constructs DNNs that resemble unfolded algorithms with a finite number of iterations and trains those networks to map given low quality data to high quality data. Learned ISTA (Gregor & LeCun, 2010), ADMM-net (Sun et al., 2016), and Learned Denoising-AMP (Metzler et al., 2017) are in this category. Lastly, the hybrid approach uses a DNN as a proximal operator within conventional optimization based methods for inverse problems in image processing (Chang et al., 2017).
Most DNN based approaches to inverse problems have focused on using data-driven image priors learned from massive amounts of data and have succeeded in achieving state-of-the-art image quality. However, we still have limited understanding of the solutions to inverse problems obtained with DNN approaches compared to conventional methods. DNN based methods often do not inherit desirable properties of conventional, theoretically well-grounded optimization algorithms, such as monotone and global convergence. The direct mapping approach and learned iterative algorithms optimize DNNs for training datasets, rather than optimizing the solution for the given input as conventional methods do. Yet there are cases where conventional methods still achieve better image quality than DNN based methods, such as medical image reconstruction with toy examples (Jin et al., 2017) and image denoising with real data (Plötz & Roth, 2017).
Here we investigate another way of using DNNs for inverse problems in image processing. We propose methods that use DNNs to seamlessly speed up the convergence of conventional optimization based methods. Most conventional optimization theories specify stepsize ranges for iterative algorithms to converge, but there is no optimal procedure for choosing a particular number (or a diagonal matrix) at each iteration; the choice is heuristic. Often, a very small, safe value (no larger than the reciprocal of the Lipschitz constant) is selected to guarantee convergence. We propose to replace this procedure with a DNN so that large stepsizes may be chosen, especially for initial iterations, without compromising the convergence property. Our DNN-incorporated scaled gradient projection (SGP) methods, without breaking these theoretical properties, significantly improved convergence speed over state-of-the-art conventional optimization methods such as ISTA (Figueiredo & Nowak, 2003) and FISTA (Beck & Teboulle, 2009) in practice for inverse problems in image processing such as image inpainting, compressive image recovery with partial Fourier samples, image deblurring, and medical image reconstruction with sparse-view projections.
Consider an optimization problem of the form

$$\min_{x \in \mathbb{R}^n} F(x) = f(x) + g(x), \qquad (1)$$

where $f$ and $g$ are both convex, $g$ is subdifferentiable, and $f$ is differentiable on $\mathbb{R}^n$. Then the following iteration is called the proximal gradient method (PGM):

$$x^{(k+1)} = \mathrm{prox}_{\alpha_k g}\left(x^{(k)} - \alpha_k \nabla f(x^{(k)})\right),$$

where the proximal operator is

$$\mathrm{prox}_{\alpha g}(z) = \arg\min_{x} \frac{1}{2}\|x - z\|_2^2 + \alpha g(x),$$

$\|\cdot\|_2$ represents the $\ell_2$-norm, and $\alpha_k > 0$ is the stepsize. The PGM guarantees the convergence of $F(x^{(k)})$ to $F(x^*)$, the cost at the solution $x^*$, with a convergence rate of $O(1/k)$ when $\alpha_k$ is properly chosen.
One way to determine $\alpha_k$ is based on the majorization-minimization technique for the cost function and its quadratic approximation at each iteration. If the value of the cost function at the point generated by the proximal gradient step is smaller than or equal to that of its quadratic approximation, the cost function is guaranteed to decrease. To ensure this condition, $\nabla f$ is usually assumed to be Lipschitz continuous on the given domain and the reciprocal of its Lipschitz constant $L$ is used for $\alpha_k$. Another popular method is backtracking, in which the stepsize is initialized to a positive number and is shrunk by a constant factor until the inequality between the cost function and its quadratic approximation at the next point is satisfied. Note that neither method seeks the largest possible stepsize, since it is often more efficient to compute the next iterate with a sub-optimal stepsize than to perform time-consuming stepsize optimization.
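The backtracking rule described above can be written down in a few lines of NumPy. This is a minimal sketch, not the exact procedure in any cited paper; `f`, `grad_f`, and `prox` are placeholder callables for the smooth term, its gradient, and the proximal operator of the nonsmooth term.

```python
import numpy as np

def backtracking_prox_step(x, f, grad_f, prox, alpha0=1.0, eta=0.5, max_iter=50):
    """One proximal gradient step with backtracking line search.

    Shrinks the stepsize alpha until the quadratic upper bound
    f(x_next) <= f(x) + <grad, x_next - x> + ||x_next - x||^2 / (2 * alpha)
    holds, which guarantees monotone decrease of the cost.
    """
    g = grad_f(x)
    alpha = alpha0
    for _ in range(max_iter):
        x_next = prox(x - alpha * g, alpha)
        diff = x_next - x
        if f(x_next) <= f(x) + g @ diff + diff @ diff / (2 * alpha):
            break
        alpha *= eta  # stepsize too large: shrink and retry
    return x_next, alpha

# toy problem: f(x) = 0.5 * ||x - b||^2, g(x) = 0 (so prox is the identity)
b = np.array([1.0, -2.0, 3.0])
x = np.zeros(3)
x_next, alpha = backtracking_prox_step(
    x, f=lambda v: 0.5 * np.sum((v - b) ** 2),
    grad_f=lambda v: v - b, prox=lambda v, a: v)
```

For this toy quadratic the Lipschitz constant is 1, so the initial stepsize already satisfies the bound and no shrinking occurs.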
Recently, scaled gradient projection (SGP) methods have been proposed to improve convergence speed (Bonettini et al., 2008; Bonettini & Prato, 2015). The problem (1) can also be seen as a constrained optimization problem
where the feasible set $\Omega$, which is a convex set because of the convexity of $g$, is determined by $g$. Then the iterative algorithm can be formulated as the projected gradient method given by

$$x^{(k+1)} = x^{(k)} + \lambda_k d^{(k)}, \qquad d^{(k)} = P_{\Omega}\left(x^{(k)} - \alpha_k \nabla f(x^{(k)})\right) - x^{(k)}, \qquad (3)$$

where the projection operator onto $\Omega$ is

$$P_{\Omega}(z) = \arg\min_{x \in \Omega} \|x - z\|_2.$$

Note that whenever $d^{(k)} \neq 0$, $d^{(k)}$ is a descent direction at $x^{(k)}$ for the problem, which means that its inner product with $\nabla f(x^{(k)})$ is negative, and $d^{(k)} = 0$ implies that $x^{(k)}$ is a stationary point. Since $d^{(k)}$ is a descent direction for the problem, the Armijo line search can be applied to generate a convergent sequence $\{x^{(k)}\}$ satisfying the Armijo condition

$$f\left(x^{(k)} + \lambda_k d^{(k)}\right) \leq f\left(x^{(k)}\right) + \beta \lambda_k \nabla f\left(x^{(k)}\right)^{T} d^{(k)},$$

where $\beta \in (0, 1)$ and $\lambda_k \in (0, 1]$ for all $k$.
In (3), the SGP method additionally multiplies a symmetric positive definite matrix $D_k$ in front of $\nabla f(x^{(k)})$. Symmetry and positive definiteness are necessary conditions for the Hessian matrix in Newton's method and are also important conditions in quasi-Newton methods. Newton-type methods, or second-order optimization methods, usually converge in fewer iterations than first-order methods, but they are computationally demanding, especially for large input data. The SGP method only exploits symmetry and positive definiteness, aiming for little additional computational cost while still refining the direction vector to accelerate the convergence rate. The SGP method is based on the Armijo line search algorithm because $d^{(k)}$ remains a descent direction of the problem under these conditions on $D_k$. It is also applicable to proximal operators.
For the convergence of the SGP method, an additional condition on $D_k$ is required. Define $\mathcal{M}_{\mu}$ for $\mu \geq 1$ as the set of all symmetric positive definite matrices whose eigenvalues are in the interval $[1/\mu, \mu]$. Then, for a sequence $\{\mu_k\}$ such that $\sum_k (\mu_k^2 - 1) < \infty$, the condition $D_k \in \mathcal{M}_{\mu_k}$ should be satisfied. Note that $\mu_k \to 1$ follows from the condition, which implies that $D_k$ converges to an identity matrix and the iteration becomes similar to the PGM (a similar mechanism is also used in the Levenberg-Marquardt method). According to (Bonettini & Prato, 2015), its convergence rate is $O(1/k)$, but it can outperform fast proximal gradient methods with an appropriate choice of $D_k$. However, how to find a $D_k$ that both accelerates and guarantees convergence still remains an open problem. For the given proximal operator
the SGP method is summarized as follows:
$f(x) = \frac{1}{2}\|y - Ax\|_2^2$ is called the data-fidelity term, where $A$ represents the measurement matrix, $x$ is the signal to estimate, and $y$ is the measurement obtained from the true signal through the measurement matrix. If the number of measurements is smaller than the number of unknowns, minimizing $f(x)$ alone becomes ill-posed since there are infinitely many solutions. To regularize ill-posed inverse problems, a regularization term $g(x)$ is added so that one solves the optimization problem for $f(x) + g(x)$ instead of $f(x)$ alone.
In $\ell_1$-regularized least squares minimization problems,

$$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda \|x\|_1, \qquad (5)$$

where $\lambda$ is the regularization parameter and $\|\cdot\|_1$ represents the $\ell_1$-norm. The use of the $\ell_1$-norm is based on the concept of compressed sensing, in which the main idea is that the true signal has a sparse representation in a certain transform domain such as the wavelet domain.
Applying the PGM to solve the problem, the exact proximal operator can be expressed element-wise as follows:

$$x^{(k+1)}_i = \mathrm{sgn}(z_i)\max\left(|z_i| - \lambda \alpha_k, 0\right), \qquad z = x^{(k)} - \alpha_k A^{T}\left(Ax^{(k)} - y\right),$$

for each element $i$ of the vectors. This operator is called the soft threshold, and this optimization is called the iterative shrinkage-thresholding algorithm (ISTA) (Figueiredo & Nowak, 2003; Beck & Teboulle, 2009).
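In code, one ISTA iteration with the element-wise soft threshold reads as follows. This is a sketch under the setup above: `A`, `y`, and `lam` are the measurement matrix, measurement, and regularization parameter, and the stepsize is `1/L` with `L` the Lipschitz constant of the gradient of the data-fidelity term.

```python
import numpy as np

def ista_step(x, A, y, lam, L):
    """One ISTA iteration: a gradient step on the data-fidelity term,
    then element-wise soft thresholding with threshold lam / L."""
    z = x - (A.T @ (A @ x - y)) / L  # gradient step with stepsize 1/L
    return np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)

# toy example: A = I, so L = 1 and one step solves the problem exactly
y = np.array([0.3, -1.5, 0.05])
x1 = ista_step(np.zeros(3), np.eye(3), y, lam=0.1, L=1.0)
```

Here `x1` is simply the soft threshold of `y` at level 0.1, which is the exact minimizer when the measurement matrix is the identity.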
We conjecture that a DNN can be trained to generate an almost optimal stepsize, especially for early iterations, when given a current estimate and the gradient of the cost function at that estimate. However, since there is no ground truth for stepsize selection, we propose to train the stepsize DNN to minimize the distance between the estimated image at the next iteration and the ground truth image (the converged image). Our proposed methods still enjoy all the nice theoretical properties, such as monotone convergence (more iterations always improve the output) and interpretation of the optimal solution with respect to the given input data (e.g., the solution balances minimizing the distance from the given input and encouraging sparsity in the wavelet domain).
To learn the stepsize by DNNs, a set of solution images $\{x^{*}_{s}\}_{s=1}^{S}$ of an optimization problem, where $S$ is the number of images, was generated using existing iterative algorithms (FISTA was chosen in our case). These solutions are used as the ground truth. Suppose that the input training images $\{x^{(n)}_{s}\}_{s=1}^{S}$ are given, where the training set is generated after $n$ iterations. Let us denote the output of the DNN as a set of positive real numbers $\{\alpha^{(n)}_{s}\}_{s=1}^{S}$. Then we can define a set of signals $\{x^{(n+1)}_{s}\}_{s=1}^{S}$ as the estimated images at the next iteration such that

$$x^{(n+1)}_{s} = \mathrm{sgn}(z_s)\max\left(|z_s| - \lambda \alpha^{(n)}_{s}, 0\right), \qquad z_s = x^{(n)}_{s} - \alpha^{(n)}_{s}\nabla f\left(x^{(n)}_{s}\right), \qquad (6)$$

applied element-wise.
Thus, DNNs are trained to yield $\{\alpha^{(n)}_{s}\}$ that minimize the following loss function:

$$\ell^{(n)} = \sum_{s=1}^{S}\left\| x^{(n+1)}_{s} - x^{*}_{s} \right\|_2^2. \qquad (7)$$
Therefore, the desired stepsize $\alpha^{(n)}_{s}$ for training image $s$ at iteration $n$ can be obtained as

$$\alpha^{(n)}_{s} = \arg\min_{\alpha > 0}\left\| x^{(n+1)}_{s}(\alpha) - x^{*}_{s} \right\|_2^2,$$

which shows that the desired $\alpha^{(n)}_{s}$ depends not only on the estimated image $x^{(n)}_{s}$, but also on its gradient $\nabla f(x^{(n)}_{s})$. Thus, it is reasonable to constitute the input training set as two-channel inputs whose first channel is $x^{(n)}_{s}$ and whose second channel is $\nabla f(x^{(n)}_{s})$.
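Since the desired stepsize has no closed form, it can be approximated numerically. A minimal sketch, with a hypothetical helper `desired_stepsize` that searches over a list of candidate stepsizes for one training image:

```python
import numpy as np

def soft(v, t):
    """Element-wise soft thresholding with threshold t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def desired_stepsize(x, grad, x_star, lam, alphas):
    """Pick the candidate stepsize whose shrinkage update lands closest
    (in l2 distance) to the converged ground-truth image x_star."""
    best_alpha, best_err = None, np.inf
    for a in alphas:
        x_next = soft(x - a * grad, lam * a)
        err = np.sum((x_next - x_star) ** 2)
        if err < best_err:
            best_alpha, best_err = a, err
    return best_alpha
```

In training, the DNN regresses toward such targets implicitly by minimizing the distance in (7) rather than by supervising stepsizes directly.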
After $\{x^{(n+1)}_{s}\}$ is evaluated with the learned stepsizes, we generate another set of images for the next iteration in the traditional way, applying the shrinkage operator with the conventional stepsize $1/L$ based on the Lipschitz constant $L$ to the set $\{x^{(n+1)}_{s}\}$ as follows:

$$\tilde{x}^{(n+1)}_{s} = \mathrm{sgn}(z_s)\max\left(|z_s| - \lambda / L, 0\right), \qquad z_s = x^{(n+1)}_{s} - \frac{1}{L}\nabla f\left(x^{(n+1)}_{s}\right).$$

This additional step was necessary since $\{x^{(n+1)}_{s}\}$ often were not improved over $\{x^{(n)}_{s}\}$ when the DNN training was not yet done.
Therefore, one iteration of our method consists of two soft-thresholding operations, where the first operation moves the current images as close as possible to their solutions along the corresponding directions, and the second operation mitigates any increase of the cost function that the first operation could have caused. We will show that this method works well to reduce the loss value quickly.
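The two-step iteration described above can be sketched as follows; `dnn_stepsize` is a hypothetical callable standing in for the trained network that maps the current estimate and its gradient to a stepsize.

```python
import numpy as np

def soft(v, t):
    """Element-wise soft thresholding with threshold t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def learned_then_safe_step(x, A, y, lam, L, dnn_stepsize):
    """Step 1: shrinkage with the DNN-predicted (possibly large) stepsize.
    Step 2: a conventional shrinkage step with the safe stepsize 1/L that
    mitigates any cost increase caused by step 1."""
    grad = A.T @ (A @ x - y)
    alpha = dnn_stepsize(x, grad)            # learned stepsize
    z = soft(x - alpha * grad, lam * alpha)  # first (learned) operation
    grad_z = A.T @ (A @ z - y)
    return soft(z - grad_z / L, lam / L)     # second (safe) operation
```

For an identity measurement matrix and a unit learned stepsize, step 1 already reaches the fixed point, and step 2 leaves it unchanged.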
Furthermore, the same method can be applied with a diagonal matrix by replacing the stepsize in (6) with a diagonal matrix. The same training procedure can be used to train DNNs to learn diagonal matrices instead of stepsizes by slightly modifying the output of the network and its backpropagation form from the network that learns stepsizes.
Instead of training DNNs for a single iteration, we further train DNNs to generate stepsizes for multiple iterations. Inspired by the training strategy in (Gupta et al., 2018), we define the following cumulative loss function:

$$\mathcal{L}^{(N)} = \sum_{n=1}^{N} \ell^{(n)},$$

where $\ell^{(n)}$ is defined in (7). We also define a new input dataset for training as follows:
and the ground truth label for training as follows:
Suppose that the DNN is to learn stepsizes for the first $N$ iterations. Initially, the DNN is trained with the input dataset and the ground truth labels at the first iteration using the procedure described above. Then the DNN is re-trained with the input dataset and the ground truth labels including the next iteration. This training process is repeated $N$ times so that the DNN is trained cumulatively to yield good stepsizes for the first $N$ iterations. The training algorithm is summarized as follows:
We predict that our trained DNN should yield almost optimal stepsize values for the first $N$ iterations, but will not be able to yield good stepsizes in later iterations beyond $N$. We found this prediction consistent with our simulation results. In practice, many inverse problem applications in image processing have a trade-off between better image quality with many iterations and fast computation time with good-enough image quality for the application at hand. Thus, $N$ should be selected with this trade-off in mind for the given application.
Let us define the stepsize DNN as the function $\alpha = h_{\mathbf{w}}(x, \nabla f(x))$ with the two-channel input described above, where $\mathbf{w}$ is the set of weight parameters of the DNN and $\ell$ is the loss function defined in (7). Then, by the chain rule, the derivative of $\ell$ with respect to $\mathbf{w}$ is

$$\frac{\partial \ell}{\partial \mathbf{w}} = \frac{\partial \ell}{\partial \alpha}\,\frac{\partial h_{\mathbf{w}}}{\partial \mathbf{w}},$$

where $\partial h_{\mathbf{w}}/\partial \mathbf{w}$ is the derivative of the DNN output with respect to its weights for the given input image. The implementation of this derivative for a DNN can be done using any deep learning development package such as TensorFlow, PyTorch, or MatConvNet. For the inverse problem using $\ell_1$ regularization in (5), the derivative of $\ell$ with respect to $\alpha$ is

$$\frac{\partial \ell}{\partial \alpha} = 2\sum_{i}\left(x^{(n+1)}_{i} - x^{*}_{i}\right)\left(-\nabla f(x^{(n)})_{i} - \lambda\,\mathrm{sgn}(z_{i})\right)\mathbf{1}\left(|z_{i}| > \lambda\alpha\right),$$

where $z = x^{(n)} - \alpha\nabla f(x^{(n)})$ and $z_i$ is the $i$th element of $z$.
The SGP method is described in Algorithm 1. If $D_k$ is an identity matrix and $\lambda_k$ is equal to 1 for all $k$, the SGP method is equivalent to the PGM. In other words, the SGP method is a generalized version of the PGM, obtained by multiplying a symmetric positive definite matrix with the gradient of the cost function so that the resulting direction is guaranteed to be a descent direction, and by enforcing the Armijo condition for convergence. However, there is no established method to determine a $D_k$ that both accelerates and guarantees convergence. We propose to determine $D_k$ using the DNN trained with the learning procedure described above. However, as discussed above, our DNN is trained only for the first $N$ iterations, so there is no guarantee that it will work for later iterations. We propose to relax the SGP method to use the DNN based stepsize (or diagonal matrix) estimation for early iterations and the PGM with a conservative Lipschitz-constant-based stepsize for later iterations.
First of all, we propose the direction relaxation scheme (DRS) as described in Algorithm 3. Note that when $g$ is the $\ell_1$-norm and the scaling matrix is diagonal, only its diagonal needs to be stored. $d^{\mathrm{DNN}}$ represents the search direction generated by the trained DNN and $d^{\mathrm{PGM}}$ is the search direction from the PGM with the conventional Lipschitz-constant-based stepsize. Then, depending on the relationship between $d^{\mathrm{DNN}}$ and $d^{\mathrm{PGM}}$, the final search direction will be either a linear combination of the two or $d^{\mathrm{PGM}}$ alone. If the DNN is trained to generate a single number (stepsize), then the scaling matrix will be diagonal with all diagonal elements equal to that stepsize.
Secondly, we propose to incorporate the DNN-based DRS method into the SGP algorithm as described in Algorithm 4. $d_k$ is the search direction used to produce the estimated image for the next iteration. As in Algorithm 3, $d_k$ is either a weighted average of $d^{\mathrm{DNN}}_k$ and $d^{\mathrm{PGM}}_k$ or $d^{\mathrm{PGM}}_k$ itself. The relaxation weight is initially 1, and it either remains the same or decreases by a constant factor over iterations, depending on the Armijo condition at each iteration $k$. The ratio of the weight for $d^{\mathrm{DNN}}_k$ to the weight for $d^{\mathrm{PGM}}_k$ in $d_k$ is re-evaluated every few iterations. Initially, $d^{\mathrm{DNN}}_k$ from the trained DNN is dominant in $d_k$, but the relaxation weight vanishes for later iterations and the DNN is then no longer used. Thus, our proposed algorithm is initially the SGP with DNN-determined search directions and becomes the PGM for later iterations.
Note that the proposed DRS method with relaxation only determines the search direction for the next estimate, and this direction is always a descent direction for the inverse problem. $\lambda_k$ is the final stepsize parameter, starting from 1 and decreasing by a constant factor until it satisfies the Armijo condition. Therefore, our proposed method in Algorithms 3 and 4 using the trained DNN is theoretically convergent.
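The direction relaxation in Algorithm 3 can be sketched as follows. This is a minimal sketch under the description above, with hypothetical names: `theta` is the relaxation weight that decays toward 0 so the learned direction fades out over iterations.

```python
import numpy as np

def relaxed_direction(d_dnn, d_pgm, grad, theta):
    """Blend the DNN search direction with the conservative PGM direction;
    fall back to the PGM direction alone when the blend fails to be a
    descent direction (nonnegative inner product with the gradient)."""
    d = theta * d_dnn + (1.0 - theta) * d_pgm
    if grad @ d < 0.0:  # blended direction still descends
        return d
    return d_pgm        # safe fallback preserves convergence guarantees
```

The fallback branch is what keeps the overall scheme convergent even when the DNN produces a poor direction in later iterations.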
We modified FBPConvNet (Jin et al., 2017), which is based on the U-Net architecture (Ronneberger et al., 2015), to implement our DNN that yields a stepsize or a diagonal matrix. Convolution filters of the same size are used for all convolutional layers. Batch normalization and the rectified linear unit (ReLU) are applied after each convolution layer. Max pooling is applied in the first half of the DNN, and deconvolution layers and skip connections are used in the second half. We reduced the number of layers of FBPConvNet to reduce computation time. For stepsize learning, one fully connected layer was added at the end of the DNN to generate a single number.
An optimization problem for inverse problems in image processing has the following form:

$$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1,$$

where $A$ is a matrix describing the image degradation forward process, $y$ is the measurement vector, and $\lambda$ is a regularization parameter balancing data fidelity and the image prior.
The goal of these simulations is to demonstrate that our proposed methods can yield the same converged image estimate with the fastest convergence rate for various inverse problems in image processing, such as image inpainting, compressive image recovery with partial Fourier samples, image deblurring, and medical image reconstruction with sparse-view projections. The unknown image is represented in the wavelet domain (three-level symlet-4 wavelets) for image inpainting and compressive image recovery with partial Fourier samples, and in the augmented spatial average-difference domain for image deblurring and sparse-view image reconstruction, as introduced in (Kamilov, 2017). Thus, the linear operator $A$ is a usual measurement matrix composed with an inverse sparsifying transform. Note that normalized measurement matrices have Lipschitz constants no larger than 1. We found that normalized gradients helped to produce better training results.
The BSD500 dataset was used for all of our inverse problems except medical image reconstruction; CT images were used for sparse-view image reconstruction. Our DNN implementation was based on FBPConvNet using MatConvNet in MATLAB (Jin et al., 2017). Note that the input and output images for the DNN are in a sparsifying transform domain, which improved the overall performance on these inverse problems.
Our proposed methods were applied to the image inpainting problem with 50% missing pixels, using $A = SW^{-1}$, where $W$ is the symlet-4 wavelet transform and $S$ is a sampling matrix. The image with 50% missing pixels is shown in Figure 1 (a). The regularization parameter was 0.1. Different numbers of training iterations were used for the stepsize and diagonal matrix learning, respectively. At the first training stage, 10 and 40 epochs were run for the two cases, respectively, and 20 epochs were used for both cases in the remaining stages. For 50 test images that were not used for training, FISTA with backtracking did not yield a converged image at the 100th iteration, as shown in Figure 1 (b). However, our proposed methods yielded almost converged images with excellent convergence rates, as illustrated in Figure 1 (c), (d), and (e).
Similar simulations were performed for image recovery with partial Fourier samples (50% sampling). Note that the input image of the DNN has four channels, the first two of which are the real and imaginary parts of the estimated image. The regularization parameter was 0.1. The initial image in Figure 2 (a) was obtained using the inverse FFT with zero padding. For 50 test images, FISTA yielded blurred images at the 20th iteration, as in Figure 2 (b). However, our proposed methods yielded sharp images with fast convergence rates, as illustrated in Figure 2 (c), (d), and (e).
Our proposed methods were also applied to image deblurring problems. Images were blurred using a Gaussian kernel, and image deblurring was then performed with the regularization parameter 0.00001. Note that the initial data-fidelity term for the deblurring problem is usually much larger than for other inverse problems such as inpainting. Unlike the other inverse problems in image processing, learned diagonal matrix based relaxed SGP yielded the best image quality among all compared methods, as shown qualitatively and quantitatively in Figure 3. It seems that the large discrepancy in the data-fidelity term was quickly compensated when using the learned diagonal matrix in relaxed SGP.
Lastly, our proposed method was investigated for sparse-view CT image reconstruction. The initial image in Figure 4 (a) was obtained by filtered back-projection (FBP) from 144 views of projections and had streaking artifacts. With the regularization parameter 0.0005, we performed FISTA and our proposed diag-learned relaxed SGP. At the 10th iteration, our proposed method yielded visually better image than FISTA as illustrated in Figure 4 (b) and (c). Figure 4 (d) also shows that our proposed method achieved faster initial convergence rate than state-of-the-art FISTA.
We investigated the robustness of our trained DNNs for determining the stepsize or diagonal matrix when different forward models are used. Figure 5 (a) shows convergence graphs when image inpainting with 30% sampling is performed using the DNN trained for image inpainting with 50% sampling. The trained DNN still achieved much faster convergence than FISTA, and a similar tendency was observed for image inpainting with 70% sampling. Figure 5 (b) presents convergence graphs for image recovery from partial Fourier samples (30% sampling) when the DNN was trained for the same inverse problem with 50% sampling. The trained DNN also yielded robust performance for lower (30%) and higher (70%) sampling rates in partial Fourier image recovery. For image deblurring problems with different blur levels, however, the trained DNN yielded sub-optimal performance compared to FISTA; note that step-learned SGP was relatively robust compared to diag-learned SGP and performed better than FISTA in very early iterations. For sparse-view image reconstruction, the DNN trained with 144 views did not yield good performance when tested with 45 views. Thus, the robustness of the trained DNN to other forward models is application-dependent. It has also been reported that many DNN based algorithms for inverse problems are not robust to other measurement models (Jin et al., 2017).
We investigated the possibility of using DNNs to seamlessly speed up the convergence of optimization algorithms for inverse problems in image processing. Our proposed methods utilize the DNN to determine free parameters of optimization algorithms without breaking nice theoretical properties such as monotone convergence. While FISTA is theoretically faster than SGP methods, our trained DNN with the relaxation scheme enabled SGP methods to outperform FISTA in practice.
Acknowledgements This work was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF- 2017R1D1A1B05035810).
- Beck & Teboulle (2009) Beck, A. and Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM journal on imaging sciences, 2(1):183–202, 2009.
- Bonettini & Prato (2015) Bonettini, S. and Prato, M. New convergence results for the scaled gradient projection method. Inverse Problems, 31(9):095008, September 2015.
- Bonettini et al. (2008) Bonettini, S., Zanella, R., and Zanni, L. A scaled gradient projection method for constrained image deblurring. Inverse Problems, 25(1):015002, 2008.
- Boyd et al. (2011) Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J., et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine learning, 3(1):1–122, 2011.
- Chang et al. (2017) Chang, J. H. R., Li, C.-L., Poczos, B., and Kumar, B. V. K. V. One network to solve them all - solving linear inverse problems using deep projection models. In IEEE International Conference on Computer Vision (ICCV), pp. 5889–5898, 2017.
- Donoho et al. (2009) Donoho, D. L., Maleki, A., and Montanari, A. Message-passing algorithms for compressed sensing. Proceedings of the National Academy of Sciences, 106(45):18914–18919, 2009.
- Figueiredo & Nowak (2003) Figueiredo, M. A. and Nowak, R. D. An EM algorithm for wavelet-based image restoration. IEEE Transactions on Image Processing, 12(8):906–916, 2003.
- Gregor & LeCun (2010) Gregor, K. and LeCun, Y. Learning fast approximations of sparse coding. In International Conference on International Conference on Machine Learning, pp. 399–406. Omnipress, 2010.
- Gupta et al. (2018) Gupta, H., Jin, K. H., Nguyen, H. Q., McCann, M. T., and Unser, M. CNN-based projected gradient descent for consistent CT image reconstruction. IEEE Transactions on Medical Imaging, 37(6):1440–1453, 2018.
- Jin et al. (2017) Jin, K. H., McCann, M. T., Froustey, E., and Unser, M. Deep Convolutional Neural Network for Inverse Problems in Imaging. IEEE Transactions on Image Processing, 26(9):4509–4522, September 2017.
- Kamilov (2017) Kamilov, U. S. A parallel proximal algorithm for anisotropic total variation minimization. IEEE Transactions on Image Processing, 26(2):539–548, 2017.
- Kim & Fessler (2016) Kim, D. and Fessler, J. A. Optimized first-order methods for smooth convex minimization. Mathematical Programming, 159(1):81–107, Sep 2016.
- LeCun et al. (2015) LeCun, Y., Bengio, Y., and Hinton, G. Deep learning. Nature, 521(7553):436–444, 2015.
- Lim et al. (2017) Lim, B., Son, S., Kim, H., Nah, S., and Lee, K. M. Enhanced Deep Residual Networks for Single Image Super-Resolution. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1132–1140, 2017.
- Mairal et al. (2009) Mairal, J., Bach, F. R., Ponce, J., Sapiro, G., and Zisserman, A. Non-local sparse models for image restoration. In IEEE International Conference on Computer Vision (ICCV), pp. 2272–2279, 2009.
- Metzler et al. (2017) Metzler, C., Mousavi, A., and Baraniuk, R. Learned D-AMP: Principled neural network based compressive image recovery. In Advances in Neural Information Processing Systems, pp. 1772–1783, 2017.
- Patel et al. (2012) Patel, V. M., Maleh, R., Gilbert, A. C., and Chellappa, R. Gradient-based image recovery methods from incomplete Fourier measurements. IEEE Transactions on Image Processing, 21(1):94–105, January 2012.
- Plötz & Roth (2017) Plötz, T. and Roth, S. Benchmarking denoising algorithms with real photographs. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2750–2759, 2017.
- Ronneberger et al. (2015) Ronneberger, O., Fischer, P., and Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234–241, 2015.
- Roth & Black (2005) Roth, S. and Black, M. J. Fields of experts: A framework for learning image priors. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 860–867, 2005.
- Sun et al. (2016) Sun, J., Li, H., Xu, Z., et al. Deep ADMM-Net for compressive sensing MRI. In Advances in Neural Information Processing Systems, pp. 10–18, 2016.
- Xie et al. (2012) Xie, J., Xu, L., and Chen, E. Image denoising and inpainting with deep neural networks. In Advances in neural information processing systems, pp. 341–349, 2012.
- Zhang et al. (2017) Zhang, K., Zuo, W., Chen, Y., Meng, D., and Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7):3142–3155, 2017.
- Zhong et al. (2013) Zhong, L., Cho, S., Metaxas, D., Paris, S., and Wang, J. Handling noise in single image deblurring using directional filters. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 612–619, 2013.
- Zoran & Weiss (2011) Zoran, D. and Weiss, Y. From learning models of natural image patches to whole image restoration. In IEEE International Conference on Computer Vision (ICCV), pp. 479–486, 2011.