Nonlocal TV-Gaussian (NLTG) prior for Bayesian inverse problems with applications to Limited CT Reconstruction
Bayesian inference methods have been widely applied to inverse problems, largely due to their ability to characterize the uncertainty associated with the estimation results. In the Bayesian framework, the prior distribution of the unknown plays an essential role, and a good prior distribution can significantly improve the inference results. In this paper, we extend the total variation-Gaussian (TG) prior of [41] and propose a hybrid prior distribution that combines nonlocal total variation (NLTV) regularization with a Gaussian distribution; we call it the NLTG prior. The advantage of the new prior is two-fold. First, it models both texture and geometric structures present in images through the NLTV term. Second, the Gaussian reference measure provides the flexibility to incorporate structure information from a reference image. Some theoretical properties are established for the NLTG prior. The proposed prior is applied to the limited-angle tomography reconstruction problem, which is difficult because a large part of the data is missing. We compute both maximum a posteriori (MAP) and conditional mean (CM) estimates through two efficient methods, and the numerical experiments validate the advantages and feasibility of the proposed NLTG prior.
1 Introduction
Bayesian inference methods [12, 17] have become a popular tool for solving inverse problems. Such popularity is largely due to their ability to quantify the uncertainty in the solution. A typical Bayesian treatment consists of assigning a prior distribution to the unknown parameters and then updating the distribution based on the observed data, yielding the posterior distribution. Recently, considerable attention has been paid to infinite dimensional Bayesian inverse problems, where the unknowns are functions of space or time, for example, images. In particular, a rigorous function space Bayesian inference framework for inverse problems was developed in [35]. As most practical inverse problems are highly ill-posed, the performance of Bayesian inference depends critically on the choice of the prior distribution, so prior modeling plays an essential role. For infinite dimensional problems, Gaussian measures are arguably the most popular choice of prior distributions, as they have many theoretical and computational advantages in the infinite dimensional setting.
However, in many practical problems, such as medical image reconstruction, the functions or images that one wants to recover often have sharp jumps or discontinuities. Gaussian prior distributions are typically not suitable for modeling such functions. To this end, several non-Gaussian priors have been proposed to model such images. Since these prior distributions differ significantly from Gaussian ones, many sampling schemes based on the Gaussian prior cannot be used directly. To address the issue, a hybrid prior was proposed in [41]. The hybrid prior is motivated by the total variation (TV) regularization [31] in the deterministic setting; however, it has been proven in [20] that the TV based prior does not converge to a well-defined infinite-dimensional measure as the discretization dimension increases. The hybrid prior combines a TV term with a Gaussian distribution: it uses the TV term to capture the sharp jumps in the functions and a Gaussian distribution as a reference measure to make sure that the resulting prior converges to a well-defined probability measure on the function space in the infinite dimensional limit.
Nonlocal methods are another popular class of regularization methods for imaging inverse problems. They were originally proposed for natural image processing to restore repetitive patterns and textures: a heuristic copy-paste technique was first proposed for texture synthesis in [9], a more systematic nonlocal means filter was proposed in [2], and a nonlocal variational framework was established in [14]. The main idea of nonlocal methods is to use the similarities present in an image as weights for restoration, smoothing or regularization. As an extension of TV, nonlocal TV (NLTV) regularization is among the most popular variational regularization tools due to its flexibility in recovering both texture and geometric patterns in diverse imaging inverse problems; see [43, 44, 28, 22] for applications. It has been demonstrated that in many practical problems the NLTV method performs better than standard TV, especially in recovering textures and structures of images. Definitions and details are presented in Section 2.2.
Inspired by the success of nonlocal regularization, we propose to improve the hybrid TV-Gaussian (TG) prior proposed in [41] by incorporating nonlocal methods. The idea is rather straightforward: we replace the TV term in the hybrid prior with an NLTV term, and we prove that the resulting hybrid prior also leads to a well-defined posterior distribution in the infinite dimensional setting. Moreover, the new hybrid prior has the following advantages. First, the NLTV term can better recover textures and structures of images, especially for highly ill-posed inverse problems or those with severely missing data. Second, the state space is extended to a larger function space than the one considered for the TG prior, so that a larger class of images can be handled. Finally, the Gaussian measure provides some freedom to incorporate other prior information, such as structures from a reference image, through the covariance matrix. To demonstrate the effectiveness of the hybrid NLTV-Gaussian (NLTG) prior, we apply it to limited tomography problems, where only limited projection data are available. In particular, we consider the two common types of point estimates in the Bayesian framework: the maximum a posteriori (MAP) estimate and the conditional mean (CM), both with the NLTG prior. Computing the MAP estimate amounts to solving an optimization problem, while computing the CM involves evaluating a high-dimensional integral [17, 24].
The remainder of this paper is structured as follows. Section 2 describes the construction of the proposed NLTG prior on a separable Hilbert space and presents the related theoretical properties. In Section 3, we give a brief introduction to limited tomography and solve the reconstruction problem by applying the proposed NLTG prior. Numerical results for both MAP and CM estimates are shown in Section 4. Finally, we draw our conclusions in the last section.
2 The NLTG priors
2.1 The Bayesian framework and the TG prior
We first give a brief introduction to the basic setup of Bayesian inference methods for inverse problems. We consider a forward model of the following form:
$$ y = G(u) + \eta, \tag{1} $$
where $u$ is an unknown function (in this work we restrict ourselves to the situation where $u$ is a real-valued function defined on a two-dimensional domain, i.e., an image), $G$ is the forward operator, $y \in \mathbb{R}^m$ is the measured data and $\eta$ is an $m$-dimensional zero-mean Gaussian noise with covariance matrix $\Sigma$. Our goal here is to estimate the unknown function $u$ from the measured data $y$.
First we assume that the unknown $u$ lives in a Hilbert space of functions, say $X$. We then choose a probability measure defined on $X$, denoted by $\mu_{\mathrm{pr}}$, to be the prior measure of $u$. The posterior measure of $u$, denoted by $\mu^y$, is characterized by its Radon-Nikodym (R-N) derivative with respect to $\mu_{\mathrm{pr}}$:
$$ \frac{d\mu^y}{d\mu_{\mathrm{pr}}}(u) \propto \exp\left(-\Phi^y(u)\right), \tag{2} $$
where
$$ \Phi^y(u) = \frac{1}{2} \left\| \Sigma^{-1/2} \left( G(u) - y \right) \right\|_2^2 \tag{3} $$
is the data fidelity term in the deterministic inverse problem. In the Bayesian framework, the posterior distribution depends on the information from the data and on the prior knowledge represented by the prior distribution, so the choice of the prior distribution plays an essential role in the Bayesian method.
As discussed in Section 1, probably the most popular prior in the infinite dimensional setting is the Gaussian measure. That is, we choose a Gaussian measure $\mu_0$ defined on $X$ with zero mean and covariance operator $C_0$. Note that $C_0$ is a symmetric, positive operator of trace class. To better model functions with sharp jumps, Ref. [41] proposes the hybrid TG prior in the form
$$ \frac{d\mu_{\mathrm{pr}}}{d\mu_0}(u) \propto \exp\left(-R(u)\right), $$
where $R(u)$ represents additional prior information (or regularization) on $u$. In this case, one can write the R-N derivative of $\mu^y$ with respect to $\mu_0$:
$$ \frac{d\mu^y}{d\mu_0}(u) \propto \exp\left(-\Phi^y(u) - R(u)\right), \tag{4} $$
which in turn returns to the conventional formulation with Gaussian priors. For the TG prior, the state space $X$ is chosen to be a Sobolev space and the regularization term $R$ is the one introduced in [41, Section 2.3].
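In a finite dimensional discretization, the exponent of the posterior density with respect to the Gaussian reference measure is simply the sum of a data misfit and a regularization term. The following minimal sketch illustrates this, assuming a linear forward map and a one-dimensional discrete TV as the regularizer; the function names and the toy data are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def data_misfit(u, y, forward, noise_cov):
    """Gaussian data misfit 0.5 * (G(u) - y)^T Sigma^{-1} (G(u) - y)."""
    residual = forward(u) - y
    return 0.5 * residual @ np.linalg.solve(noise_cov, residual)

def log_density_wrt_reference(u, y, forward, noise_cov, regularizer):
    """Unnormalized log-density of the posterior w.r.t. the Gaussian reference:
    -(data misfit) - (regularization term)."""
    return -data_misfit(u, y, forward, noise_cov) - regularizer(u)

# Toy setup (assumed for illustration): 3 unknowns, 2 noise-free measurements.
A = np.array([[1.0, 0.0, 1.0], [0.0, 2.0, 0.0]])
forward = lambda v: A @ v
noise_cov = 0.25 * np.eye(2)
u = np.array([1.0, 2.0, 4.0])
y = forward(u)
tv = lambda v: np.sum(np.abs(np.diff(v)))  # discrete 1-D total variation

log_density = log_density_wrt_reference(u, y, forward, noise_cov, tv)
```

Because the Gaussian part of the prior is carried by the reference measure, it does not appear in this exponent; it enters only through the way samples are proposed.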
2.2 Nonlocal Total Variation
Here we provide the formulation of the NLTG prior, starting with a brief introduction to nonlocal regularization. For details, one may consult [14] for the variational framework based on nonlocal operators, [13, 23, 43, 44] for a short survey on the theory and applications of NLTV, and [2, 5, 10, 18, 27, 45] for further related surveys. The nonlocal methods can be described as follows. Let $\Omega \subset \mathbb{R}^2$ be a bounded set, and $x, y \in \Omega$. Given a reference image $u_0$, we define a nonnegative symmetric weight function $w$ as follows:
$$ w(x, y) = \exp\left( -\frac{\left\langle G_a, \, |u_0(x + \cdot) - u_0(y + \cdot)|^2 \right\rangle}{h^2} \right), \tag{5} $$
where $G_a$ is a Gaussian kernel with standard deviation $a$, $h$ is a filtering parameter and $\langle \cdot, \cdot \rangle$ is the standard inner product in $L^2(\mathbb{R}^2)$. Note that in general, $h$ corresponds to the noise level; conventionally we set it to be the standard deviation of the noise.
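A discrete version of such a patch-similarity weight can be sketched as follows. This is a minimal illustration in the spirit of the nonlocal means weight, assuming square patches, reflective padding at the image boundary and a normalized Gaussian patch kernel; the function name `nonlocal_weights` and all parameter defaults are hypothetical. It builds a dense weight matrix, so it is only practical for small images.

```python
import numpy as np

def nonlocal_weights(ref, patch_radius=1, h=0.1, a=1.0):
    """Dense weights w[i, j] = exp(-d(i, j) / h^2) between all pixel pairs,
    where d is a Gaussian-weighted squared distance between patches of `ref`."""
    r = patch_radius
    rows, cols = ref.shape
    padded = np.pad(ref, r, mode="reflect")
    # Gaussian kernel on the (2r+1) x (2r+1) patch window, normalized to sum to 1.
    t = np.arange(-r, r + 1)
    kernel = np.exp(-(t[:, None] ** 2 + t[None, :] ** 2) / (2.0 * a ** 2))
    k = (kernel / kernel.sum()).ravel()
    # One row per pixel, one column per patch offset (same order as `k`).
    patches = np.stack(
        [padded[i:i + rows, j:j + cols].ravel()
         for i in range(2 * r + 1) for j in range(2 * r + 1)],
        axis=1,
    )
    # Weighted squared distance: sum_k k * (p - q)^2 = |p|_k^2 + |q|_k^2 - 2 <p, q>_k.
    sq = (patches ** 2) @ k
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (patches * k) @ patches.T
    dist2 = np.maximum(dist2, 0.0)  # guard against tiny negative rounding errors
    return np.exp(-dist2 / h ** 2)
```

Pixels whose surrounding patches look alike receive a weight close to one even when they are far apart in the image, which is what lets the regularizer exploit repeated structures.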
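Let $u: \Omega \to \mathbb{R}$. Using the weight function $w$ in (5), we define the nonlocal (NL) gradient as
$$ \nabla_w u(x, y) = \left( u(y) - u(x) \right) \sqrt{w(x, y)}, \qquad x, y \in \Omega. $$
For a given $p: \Omega \times \Omega \to \mathbb{R}$, its NL divergence is defined by the standard adjoint relation with the NL gradient operator as follows:
$$ \langle \nabla_w u, p \rangle = -\langle u, \mathrm{div}_w \, p \rangle, $$
which leads to the following explicit formula:
$$ \mathrm{div}_w \, p(x) = \int_\Omega \left( p(x, y) - p(y, x) \right) \sqrt{w(x, y)} \, dy. $$
Now, we design the following functional based on the nonlocal operators:
$$ \| \nabla_w u \|_1 = \int_\Omega \left( \int_\Omega \left( u(y) - u(x) \right)^2 w(x, y) \, dy \right)^{1/2} dx. \tag{8} $$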
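The nonlocal operators have a direct discrete counterpart once the image is flattened to a vector and the weights are stored as a symmetric matrix. The sketch below is a minimal illustration under those assumptions (integrals replaced by plain sums, function names chosen here for illustration); it can be used to check the adjoint relation numerically.

```python
import numpy as np

def nl_gradient(u, w):
    """Nonlocal gradient: entry [i, j] is (u[j] - u[i]) * sqrt(w[i, j])."""
    return (u[None, :] - u[:, None]) * np.sqrt(w)

def nl_divergence(p, w):
    """Nonlocal divergence: entry [i] is sum_j (p[i, j] - p[j, i]) * sqrt(w[i, j])."""
    return np.sum((p - p.T) * np.sqrt(w), axis=1)

def nltv(u, w):
    """Discrete NLTV: sum_i sqrt(sum_j (u[j] - u[i])^2 * w[i, j])."""
    return np.sum(np.sqrt(np.sum(nl_gradient(u, w) ** 2, axis=1)))
```

For a symmetric weight matrix, `np.sum(nl_gradient(u, w) * p)` equals `-np.sum(u * nl_divergence(p, w))` up to rounding, and `nltv` vanishes on constant images, mirroring the behavior of the TV seminorm.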
Then we can see that the functional in (8) is analogous to the total variation (TV) seminorm, and the NLTG prior can be constructed similarly. To do this, we first need to specify the state space , which should be desirably a separable Hilbert space. Recall that is a bounded set in . As convention, we choose throughout this paper. (Note, however, that our analysis is valid for any bounded domain with boundary). For a given reference image and a weight function defined as (5), we introduce
Let be a given reference image, and let be a weight function defined as (5). For , we have
As a consequence, is a continuous linear operator.
Proof. First we note that, from the definition (5) of , we have
Hence, using Hölder’s inequality (e.g. ), we have
where we used the fact that . This concludes (10).
For (11), we first note that for each , the mapping
is in , by (10), where we specified the Lebesgue measure for clarity. Then applying the Hölder’s inequality again, we have
and this concludes Lemma 1.
2.3 Theoretical properties of the NLTG prior
In this section, we show that the NLTG prior leads to a well-behaved posterior distribution in , where the proofs follow the similar line as .
Following , we assume that the forward operator satisfies the following conditions:
For every , there exists such that
For every , there exists such that for all , with
We have the following proposition on in (3) .
Let be defined as (13). Then we have the followings:
For all , we have .
For every , there exists such that for all with , we have .
For every , there exists such that for all , with
For a given , we choose . Then whenever , we have
which concludes (ii). For (iii), first note that for , , we have
Theorem 2 states that in (4) is a well-defined probability measure on and it is Lipschitz continuous in the data . Since the theorem is a direct consequence of the fact that satisfies [36, Assumptions 2.6.], we omit the proof.
is a well-defined measure on .
is Lipschitz continuous in the data with respect to the Hellinger distance. More precisely, if and are two measures corresponding to data and respectively, then for every , there exists such that for all , with
As a result, the expectation of any polynomially bounded function is continuous in .
For practical concerns, it is important to consider the finite dimensional approximation of . In particular we consider the following finite dimensional approximation:
where is the dimensional approximation of with being the dimensional approximation of and is the dimensional approximation of . Theorem 3 provides the convergence property of .
Assume that and satisfies A1 with constants independent of and satisfies Proposition 1 (i)-(ii) with constants independent of . Assume further that for all , there exist two sequences and , both of which converge to , such that for all where
then we have
as , .
In particular, noting that is a separable Hilbert space, we can consider the finite dimensional approximation of , as presented in the following corollary.
Let be a complete orthonormal basis of . For , we define
If satisfies A1-A2, then we have
3 Application to Limited Tomography Reconstruction
In this section, we illustrate the application of the Bayesian inference method and the NLTG prior to limited tomography problem.
X-ray computed tomography (CT) plays an important role in the diagnosis of diseases in the human body. Let denote the image to be reconstructed. Throughout this paper, we assume is supported in a domain , and we only consider two dimensional parallel beam CT for simplicity. Then the sinogram (or the projection data) is obtained by the following Radon transform [30]:
From (21), we can easily see that the reconstruction of requires the so-called complete knowledge of on [19, 29]. However, the problem (20) becomes ill-posed whenever only limited data are available on a subset of , due to a reduced detector size [38, 39] and/or a reduced number of projections [26, 34, 4]. In particular, if the projection data is available on :
then there exists a nontrivial function called the ambiguity of , in . As this ambiguity is nonconstant in the region of interest (ROI) [38, 40], the reconstructed image via (21) using available only on will be deteriorated by .
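The effect of restricting the data can be illustrated on a generic linear operator. The sketch below is a toy linear-algebra example, not a Radon transform: a random square matrix stands in for a complete-data operator (an assumption made purely for illustration), and dropping some of its rows creates a nontrivial null space whose elements play the role of the ambiguity.

```python
import numpy as np

rng = np.random.default_rng(0)
full_op = rng.standard_normal((6, 6))   # stand-in for a complete-data operator
restricted_op = full_op[:4]             # keep only the "measured" rows

# Rows of vt beyond the rank of the restricted operator span its null space.
_, _, vt = np.linalg.svd(restricted_op)
ambiguity = vt[-1]

u = rng.standard_normal(6)
# The restricted data cannot distinguish u from u + ambiguity ...
same_restricted_data = np.allclose(restricted_op @ u, restricted_op @ (u + ambiguity))
# ... whereas the complete data can.
same_full_data = np.allclose(full_op @ u, full_op @ (u + ambiguity))
```

Any reconstruction method therefore needs additional information, such as a prior or a reference image, to select one candidate among all those that explain the restricted data equally well.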
In the literature, numerous approaches have been proposed to remove the ambiguity due to the restriction of on . These approaches can be classified into two categories: known-subregion based approaches, related to the restoration of a signal from its truncated Hilbert transform, and sparsity-model based approaches, which use a sparse approximation of tomographic images in the ROI. See [32, 8, 16] for detailed surveys. In this paper, we use an approach based on a reference image and NLTG regularization.
3.2 MAP and CM estimators
In this section, we discuss how to compute the two popular point estimators in the Bayesian setting. The first commonly used point estimator in the Bayesian framework is the MAP estimator, and following the same steps as in , we can show that the MAP estimator with the NLTG prior is the minimizer of
over , where in this problem with being the linear operator in the limited tomography problem.
To minimize (23), we adopt the widely used split Bregman method [15], which is equivalent to the alternating direction method of multipliers (ADMM). For the sake of clarity, we present the split Bregman method for (23) in Algorithm 1.
To solve (24), we use the conjugate gradient method to solve the following linear system:
The subproblem (25) has the following closed form solution:
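where $\mathrm{shrink}$ is the shrinkage formula defined as
$$ \mathrm{shrink}(v, \gamma) = \max\left( |v| - \gamma, 0 \right) \frac{v}{|v|}, $$
with the convention that $0 \cdot (0/0) = 0$.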
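The structure of the split Bregman iteration can be illustrated on a much simpler problem. The sketch below is not the algorithm for (23): it is a minimal example assuming a one-dimensional TV denoising problem with a forward-difference operator, and it solves the quadratic subproblem with a direct solver rather than with conjugate gradients. The names `shrink`, `tv_denoise_split_bregman`, `lam` and `mu` are illustrative.

```python
import numpy as np

def shrink(v, gamma):
    """Isotropic soft-thresholding along the last axis; zero vectors map to zero."""
    norms = np.linalg.norm(v, axis=-1, keepdims=True)
    safe = np.where(norms > 0, norms, 1.0)  # avoids 0/0 for zero vectors
    return v * np.maximum(norms - gamma, 0.0) / safe

def tv_denoise_split_bregman(f, lam, mu=1.0, n_iter=500):
    """Split Bregman for min_u 0.5 * ||u - f||^2 + lam * ||D u||_1 (1-D toy problem)."""
    n = f.size
    D = np.diff(np.eye(n), axis=0)       # forward differences, shape (n-1, n)
    system = np.eye(n) + mu * D.T @ D    # matrix of the quadratic u-subproblem
    d = np.zeros(n - 1)                  # auxiliary variable standing for D u
    b = np.zeros(n - 1)                  # Bregman variable
    u = f.copy()
    for _ in range(n_iter):
        # u-subproblem: a linear system (solved directly here).
        u = np.linalg.solve(system, f + mu * D.T @ (d - b))
        # d-subproblem: closed-form solution via shrinkage.
        d = shrink((D @ u + b)[:, None], lam / mu)[:, 0]
        # Bregman update.
        b = b + D @ u - d
    return u

denoised = tv_denoise_split_bregman(np.array([0.0, 1.0]), lam=0.2)
```

For this two-pixel signal the minimizer can be computed by hand as (0.2, 0.8), which gives a quick sanity check of the iteration. For large images the direct solve would typically be replaced by an iterative solver.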
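Another commonly used point estimator in the Bayesian setting is the conditional mean (CM), or posterior mean, which is usually evaluated using samples drawn from the posterior distribution, often with Markov chain Monte Carlo (MCMC) methods. In this work we use the preconditioned Crank-Nicolson (pCN) algorithm developed in [6], because its performance is independent of the discretization dimension. Simply speaking, the pCN algorithm proposes according to
$$ v = \sqrt{1 - \beta^2} \, u + \beta \xi, $$
where $u$ is the present position, $v$ is the proposed position, $\beta \in [0, 1]$ is the parameter controlling the step size, and $\xi \sim N(0, C_0)$, with $C_0$ the covariance operator of the Gaussian reference measure. The associated acceptance probability is
$$ a(u, v) = \min\left\{ 1, \exp\left[ \Phi^y(u) + R(u) - \Phi^y(v) - R(v) \right] \right\}, $$
where $\Phi^y$ is the data fidelity term and $R$ is the regularization term of the hybrid prior.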
We describe the complete pCN algorithm in Algorithm 2.
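A finite dimensional version of the pCN sampler can be sketched in a few lines. This is a minimal illustration, assuming that a square root of the prior covariance matrix is available and that the user supplies the sum of the data fidelity and regularization terms as a single function; the names `pcn_sample`, `neg_log_density` and `cov_sqrt` are hypothetical.

```python
import numpy as np

def pcn_sample(neg_log_density, cov_sqrt, u0, beta, n_samples, rng):
    """Preconditioned Crank-Nicolson sampler.

    neg_log_density(u) returns the data misfit plus the regularization term;
    cov_sqrt is a matrix L with L @ L.T equal to the Gaussian prior covariance.
    Returns the chain (one row per step) and the empirical acceptance rate.
    """
    u = np.array(u0, dtype=float)
    psi_u = neg_log_density(u)
    samples = np.empty((n_samples, u.size))
    accepted = 0
    for k in range(n_samples):
        xi = cov_sqrt @ rng.standard_normal(u.size)     # draw from N(0, C_0)
        v = np.sqrt(1.0 - beta ** 2) * u + beta * xi    # pCN proposal
        psi_v = neg_log_density(v)
        # Accept with probability min(1, exp(psi_u - psi_v)).
        if np.log(rng.uniform()) < psi_u - psi_v:
            u, psi_u = v, psi_v
            accepted += 1
        samples[k] = u
    return samples, accepted / n_samples
```

The acceptance rule involves only the non-Gaussian part of the posterior, because the proposal leaves the Gaussian reference measure invariant. In practice `beta` is tuned by monitoring the returned acceptance rate.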
4 Numerical Results
4.1 Experimental Setup
In this section, we present some experimental results to demonstrate the performance of the proposed NLTG prior. In particular, we compare the results (both MAP and CM estimates) of the proposed method with those of the filtered back projection (FBP) method, the NLTV regularization model and the TG method. We use the XCAT image [33] taking integer values in as the original image . The ground truth image is then generated by adding one round object, which stands for a tumor in the lung, and further adding a sinusoidal wave as an inhomogeneous background. Finally, we generate the reference image , which can be considered as a previous CT image of the same patient taken with the same CT modality, by using projections of corrupted by Gaussian noise of different levels. In all experiments, the reference images are reconstructions from projection data with a low noise level and a high noise level , which will be denoted as and respectively. Please refer to Figure 1 for these images.
Given a reference image , the covariance matrix for the Gaussian measure term is computed as
following the idea of the radial basis function kernel in machine learning [3] and the similarity weight in the nonlocal means filter [2]. We note that this choice of covariance matrix differs from the usual Gaussian measure, as the similarity between the pixel values of is used instead of the spatial distance. This choice aims to bring the structures and edge information of the reference image to the to-be-reconstructed image, which is especially important for reconstruction from severely incomplete data. In the comparison with the TG method, we adopt this covariance matrix as well for a fair comparison. As for the comparison with the NLTV method, in order to save computation and storage, we only use the 10 largest weights and the closest neighbors for each pixel, as adopted in . This will be further illustrated in the numerical results.
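One way such an intensity-based covariance could be assembled is sketched below. It assumes a plain radial basis function kernel applied to the pixel values of the reference image, with a small diagonal jitter added for numerical stability; the exact kernel, the role of the parameter `h` and the jitter are assumptions made for illustration and may differ from the formula used in the experiments.

```python
import numpy as np

def reference_covariance(ref, h, jitter=1e-8):
    """RBF-type covariance between pixels based on reference intensities:
    cov[i, j] = exp(-(ref_i - ref_j)^2 / h^2), plus a small diagonal jitter."""
    v = np.asarray(ref, dtype=float).ravel()
    diff2 = (v[:, None] - v[None, :]) ** 2
    cov = np.exp(-diff2 / h ** 2)
    # Pixels with identical intensities make the kernel matrix singular;
    # the jitter keeps it strictly positive definite.
    return cov + jitter * np.eye(v.size)

cov = reference_covariance(np.array([[0.0, 0.0], [1.0, 1.0]]), h=0.5)
```

With this construction, two pixels are strongly correlated whenever their reference intensities are close, regardless of how far apart they are in the image, so edges of the reference image are carried over into the prior.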
To synthesize the limited projection data , we generate the forward operator as the discrete Radon transform followed by the restriction onto the discretization of . Note that the size of the discrete Radon transform depends on both the size of the CT image and the number of projections. In these experiments, we use equally spaced projections. Then the projection data is generated by
with Gaussian noise of different noise levels. Throughout these experiments, we choose the noise level to be for the low level and for the high level, respectively. Finally, the reconstructed images are further improved by imposing the intensity constraint using
where denotes the output of the four methods in comparison.
4.2 MAP results
In solving (23), the filtering parameter in both (5) and (29) is chosen as in . In addition, since it is not hard to see that is positive definite, the model (23) is convex; we therefore set the maximum number of outer iterations to 80 to ensure that the algorithm converges to the global minimizer. Finally, the regularization parameter is manually chosen so that we obtain the best restoration results.
Tables 1 and 2 summarize the PSNR and SSIM values of each case. As we can see from the tables, our NLTG prior consistently outperforms the other reconstruction methods, namely the FBP, TV, TG and NLTV priors. We present the reconstruction results of the different methods in Figures 2 and 3. As can be expected, neither TV nor FBP can produce reasonable reconstructions due to the limited projections. The TG prior based MAP estimate shows better performance, since our proposed Gaussian covariance matrix (29) provides some structure information from the reference image as a prior. We can also see that, compared to the TG prior, the NLTG prior benefits from the NLTV term, which exploits the similarities in the image. In addition, compared to the NLTV prior, the NLTG prior obtains better recovery results, thanks to the Gaussian term, which extracts more structure information through the covariance matrix computed from the reference image. As we can see from the figures, the visual improvements are consistent with the improvements in the indices.
| Sinogram Noise Level | Reference Image | FBP | TV | TG | NLTV | NLTG |
| Sinogram Noise Level | Reference Image | FBP | TV | TG | NLTV | NLTG |
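For reference, the PSNR index used in the comparison can be computed as in the following minimal sketch. It assumes the standard definition based on the mean squared error and a user-supplied intensity range; the function name and the `data_range` argument are illustrative, and the exact conventions used to produce the tables may differ.

```python
import numpy as np

def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return np.inf  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

A higher PSNR means a smaller pixelwise error. SSIM, in contrast, compares local structure, which is why the two indices can rank reconstructions differently.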
4.3 CM Results
The hyperparameters in the CM model come directly from the MAP model. For both the TG prior and the NLTG prior, we run the pCN algorithm with samples, preceded by another samples as a pre-run. The step size has been chosen so that the acceptance probability is around . We show the CM results in Figure 4 and report the corresponding PSNR and SSIM values in Table 3.
One can see from the figures and Table 3 that the CM model with the NLTG prior consistently outperforms the CM model with the TG prior in all scenarios. We can also see that the PSNR of the CM model is lower than that of the MAP model at the low sinogram noise level, while it is higher than that of the MAP model at the high sinogram noise level. Nonetheless, the MAP model consistently outperforms the CM model in terms of SSIM. The reason for this behavior is that the MAP estimator better preserves structures and edges, while the CM provides smoother images that suppress the noise. We can also see that, for a fixed sinogram noise level, the PSNR values of the CM with different reference images are almost identical, while the SSIM values in general depend on the noise level of the reference image. This suggests that, in some sense, the SSIM index is more sensitive to the structures that can be brought in from reference images.
The main advantage of Bayesian techniques is that they can quantify the uncertainty in the estimates. Figure 5 summarizes the confidence interval (CI) gaps for each setting. Once again, the CM model with the TG prior performs worse than the CM model with the NLTG prior, as one would expect, since the similarity of structures extracted by the NLTG prior carries more information than the detection of local sharp edges by the TG prior.
For the sinogram noise level of 5 , we can see that the CI gap of the samples with reference noise level of 1 shows higher values, especially near the edges of the image, than that of the samples with reference noise level of 5. It is worth noting that edges have higher uncertainty than smooth regions. This may be due to the fact that the sampler frequently adjusts the values at edge pixels under the NLTG prior, which results in a high variance and a large confidence interval. Finally, we can draw a similar conclusion in the case of the high sinogram noise level.
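Given a chain of posterior samples, the CM estimate and a pixelwise interval width can be computed as in the following minimal sketch. It assumes that the samples are stored one per row, that an initial pre-run segment is discarded, and that the interval is taken between symmetric empirical quantiles; the 95% level and the function name are assumptions, since the level used for the figures is not specified here.

```python
import numpy as np

def posterior_summaries(samples, burn_in, level=0.95):
    """Return the sample mean and the width of the central `level` interval,
    both computed per component after discarding the first `burn_in` samples."""
    kept = np.asarray(samples, dtype=float)[burn_in:]
    mean = kept.mean(axis=0)
    lower, upper = np.quantile(kept, [(1 - level) / 2, (1 + level) / 2], axis=0)
    return mean, upper - lower

# Toy chain: the first component sweeps 0..100, the second is constant.
chain = np.column_stack([np.arange(101.0), np.full(101, 5.0)])
cm, ci_gap = posterior_summaries(chain, burn_in=0)
```

Components that the data and prior determine well show a narrow interval, while components such as edge pixels, where the samples fluctuate more, show a wide one.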
| Sinogram Noise Level | Reference Image | NLTG | TG | NLTG | TG |
5 Conclusion
In this paper, we consider Bayesian inference methods for infinite dimensional inverse problems, and in particular we propose a prior distribution that combines the nonlocal method, which extracts information from a reference image, with a standard Gaussian distribution. We show that the proposed prior distribution leads to a well-behaved posterior measure in the infinite dimensional setting. We then apply the proposed method to a limited tomography problem. The numerical experiments demonstrate that the performance of the proposed NLTG prior is competitive with existing and adapted methods, and we also provide a comparison of the MAP and CM estimates. We believe that the proposed NLTG prior distribution can be useful in a large class of image reconstruction problems where reference images are available, and we plan to investigate these applications in the future.
References
- [1] R. N. Bracewell and A. C. Riddle. Inversion of fan-beam scans in radio astronomy. The Astrophysical Journal, 150:427, 1967.
- [2] A. Buades, B. Coll, and J. M. Morel. A Review of Image Denoising Algorithms, with a New One. Multiscale Model. Simul., 4(2):490–530, 2005.
- [3] C. Chang and C. Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3):27, 2011.
- [4] K. Choi, J. Wang, L. Zhu, T. S. Suh, S. Boyd, and L. Xing. Compressed Sensing Based Cone-Beam Computed Tomography Reconstruction with a First-Order Method. Med. Phys., 37(9):5113–5125, 2010.
- [5] F. R. K. Chung. Spectral Graph Theory, volume 92 of CBMS Regional Conference Series in Mathematics. Published for the Conference Board of the Mathematical Sciences, Washington, DC; by the American Mathematical Society, Providence, RI, 1997.
- [6] S. L. Cotter, G. O. Roberts, A. M. Stuart, and D. White. MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science, 28(3):424–446, 2013.
- [7] M. Dashti, K. J. H. Law, A. M. Stuart, and J. Voss. MAP Estimators and Their Consistency in Bayesian Nonparametric Inverse Problems. Inverse Problems, 29(9):095017, 27, 2013.
- [8] M. Dashti, K. J. H. Law, A. M. Stuart, and J. Voss. MAP estimators and their consistency in Bayesian nonparametric inverse problems. Inverse Problems, 29(9):095017, 2013.
- [9] A. A. Efros and T. K. Leung. Texture synthesis by non-parametric sampling. In IEEE International Conference on Computer Vision, pages 1033–1038, Corfu, Greece, September 1999.
- [10] A. Elmoataz, O. Lezoray, and S. Bougleux. Nonlocal Discrete Regularization on Weighted Graphs: a Framework for Image and Manifold Processing. IEEE Trans. Image Process., 17(7):1047–1060, 2008.
- [11] G. B. Folland. Real Analysis: Modern Techniques and Their Applications. Pure and Appl. Math. John Wiley & Sons Inc., New York, 2nd edition, 1999.
- [12] A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian data analysis. CRC Press, Boca Raton, FL, 2014.
- [13] G. Gilboa and S. Osher. Nonlocal Linear Image Regularization and Supervised Segmentation. Multiscale Model. Simul., 6(2):595–630, 2007.
- [14] G. Gilboa and S. Osher. Nonlocal Operators with Applications to Image Processing. Multiscale Model. Simul., 7(3):1005–1028, 2008.
- [15] T. Goldstein and S. Osher. The split Bregman method for L1-regularized problems. SIAM Journal on Imaging Sciences, 2(2):323–343, 2009.
- [16] T. Heußer, M. Brehm, S. Marcus, S. Sawall, and M. Kachelrieß. CT data completion based on prior scans. In Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2012 IEEE, pages 2969–2976. IEEE, 2012.
- [17] J. Kaipio and E. Somersalo. Statistical and computational inverse problems, volume 160. Springer Science & Business Media, 2006.
- [18] S. Kindermann, S. Osher, and P. W. Jones. Deblurring and Denoising of Images by Nonlocal Functionals. Multiscale Model. Simul., 4(4):1091–1115, 2005.
- [19] E. Klann. A Mumford–Shah-like method for limited data tomography with an application to electron tomography. SIAM Journal on Imaging Sciences, 4(4):1029–1048, 2011.
- [20] M. Lassas and S. Siltanen. Can one use total variation prior for edge-preserving Bayesian inversion? Inverse Problems, 20(5):1537, 2004.
- [21] J. Li. A note on the Karhunen–Loève expansions for infinite-dimensional Bayesian inverse problems. Statistics & Probability Letters, 106:1–4, 2015.
- [22] J. Liu, H. Ding, S. Molloi, X. Zhang, and H. Gao. Nonlocal total variation based spectral CT image reconstruction. Medical Physics, 42(6):3570–3570, 2015.
- [23] Y. Lou, X. Zhang, S. Osher, and A. Bertozzi. Image Recovery via Nonlocal Operators. J. Sci. Comput., 42(2):185–197, 2010.
- [24] F. Lucka, S. Pursiainen, M. Burger, and C. H. Wolters. Hierarchical Bayesian inference for the EEG inverse problem using realistic FE head models: depth localization and source separation for focal primary currents. Neuroimage, 61(4):1364–1382, 2012.
- [25] F. Natterer. The Mathematics of Computerized Tomography. Springer, 1986.
- [26] X. Pan, E. Y. Sidky, and M. Vannier. Why do commercial CT scanners still employ traditional, filtered back-projection for image reconstruction? Inverse Problems, 25(12):123009, 2009.
- [27] G. Peyré. Image Processing with Nonlocal Spectral Bases. Multiscale Model. Simul., 7(2):703–730, 2008.
- [28] G. Peyré, S. Bougleux, and L. Cohen. Non-local regularization of inverse problems. In D. Forsyth, P. Torr, and A. Zisserman, editors, Computer Vision – ECCV 2008. Springer Berlin Heidelberg, 2008.
- [29] E. T. Quinto. Singularities of the X-ray transform and limited data tomography in R^2 and R^3. SIAM Journal on Mathematical Analysis, 24(5):1215–1225, 1993.
- [30] J. Radon. Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten. Berichte Saechsishe Acad. Wissenschaft. Math. Phys., Klass, 69:262, 1917.
- [31] L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1-4):259–268, 1992.
- [32] T. Schuster, B. Kaltenbacher, B. Hofmann, and K. S. Kazimierski. Regularization Methods in Banach Spaces, volume 10 of Radon Series on Computational and Applied Mathematics. Walter de Gruyter GmbH & Co. KG, Berlin, 2012.
- [33] W. P. Segars, G. Sturgeon, S. Mendonca, J. Grimes, and B. M. W. Tsui. 4D XCAT phantom for multimodality imaging research. Medical Physics, 37(9):4902–4915, 2010.
- [34] E. Y. Sidky and X. Pan. Image Reconstruction in Circular Cone-Beam Computed Tomography by Constrained, Total-Variation Minimization. Phys. Med. Biol., 53(17):4777, 2008.
- [35] A. M. Stuart. Inverse problems: a Bayesian perspective. Acta Numer., 19:451–559, 2010.
- [36] S. J. Vollmer. Dimension-Independent MCMC Sampling for Inverse Problems with Non-Gaussian Priors. SIAM/ASA J. Uncertain. Quantif., 3(1):535–561, 2015.
- [37] S. J. Vollmer. Dimension-independent MCMC sampling for inverse problems with non-Gaussian priors. SIAM/ASA Journal on Uncertainty Quantification, 3(1):535–561, 2015.
- [38] G. Wang and H. Yu. The meaning of interior tomography. Physics in Medicine and Biology, 58(16):R161, 2013.
- [39] J. P. Ward, M. Lee, J. C. Ye, and M. Unser. Interior tomography using 1D generalized total variation. Part I: Mathematical foundation. SIAM Journal on Imaging Sciences, 8(1):226–247, 2015.
- [40] J. Yang, H. Yu, M. Jiang, and G. Wang. High-order total variation minimization for interior tomography. Inverse Problems, 26(3):035013, 2010.
- [41] Z. Yao, Z. Hu, and J. Li. A TV-Gaussian Prior for Infinite-Dimensional Bayesian Inverse Problems and Its Numerical Implementations. Inverse Problems, 32(7):075006, 19, 2016.
- [42] Z. Yao, Z. Hu, and J. Li. A TV-Gaussian prior for infinite-dimensional Bayesian inverse problems and its numerical implementations. Inverse Problems, 32(7):075006, 2016.
- [43] X. Zhang, M. Burger, X. Bresson, and S. Osher. Bregmanized Nonlocal Regularization for Deconvolution and Sparse Reconstruction. SIAM Journal on Imaging Sciences, 3(3):253–276, 2010.
- [44] X. Zhang and T. F. Chan. Wavelet Inpainting by Nonlocal Total Variation. Inverse Problems and Imaging, 4(1):191–210, 2010.
- [45] D. Zhou and B. Schölkopf. Regularization on Discrete Spaces. Springer Berlin Heidelberg, Berlin, Heidelberg, 2005.