Infimal convolution of data discrepancies for mixed noise removal

Luca Calatroni
Centre de Mathématiques Appliquées (CMAP)
École Polytechnique CNRS,
Route de Saclay, 91128 Palaiseau Cedex, France
(luca.calatroni@polytechnique.edu)

Juan Carlos De Los Reyes
Research Center on Mathematical Modelling (MODEMAT), Escuela Politécnica Nacional
Ladron De Guevara E11253, 2507144, Quito, Ecuador
(juan.delosreyes@epn.edu.ec)

Carola-Bibiane Schönlieb
Department of Applied Mathematics and Theoretical Physics (DAMTP)
University of Cambridge, Wilberforce Road, CB3 0WA, Cambridge, UK
(cbs31@cam.ac.uk)

Abstract

We consider the problem of image denoising in the presence of noise whose statistical properties are a combination of two different distributions. We focus on noise distributions frequently considered in applications, in particular mixtures of salt & pepper and Gaussian noise, and of Gaussian and Poisson noise. We derive a variational image denoising model that features a total variation regularisation term and a data discrepancy encoding the mixed noise as an infimal convolution of the discrepancy terms of the single-noise distributions. We give a statistical derivation of this model by joint Maximum A Posteriori (MAP) estimation, and discuss in particular its interpretation as the MAP estimate of a so-called infinity convolution of the two noise distributions. Moreover, classical single-noise models are recovered asymptotically as the weighting parameters go to infinity. The numerical solution of the model is computed using second-order Newton-type methods. Numerical results show the decomposition of the noise into its constituting components. The paper is furnished with several numerical experiments, and comparisons with other existing methods dealing with the mixed noise case are shown.
1 Introduction
Let $\Omega \subset \mathbb{R}^2$ be open and bounded with Lipschitz boundary and let $f$ be a given noisy image. The image denoising problem can be formulated as the task of retrieving a denoised image $u$ from $f$. In its general expression, assuming no blurring effect on the observed image, the denoising inverse problem assumes the following form:

(1.1) $f = \mathcal{N}(u),$
where $\mathcal{N}$ stands for the degradation process generating random noise that follows some statistical distribution. The problem (1.1) is often regularised and reformulated as the following minimisation problem

(1.2) $\min_{u \in X}\; \mathcal{J}(u) := \mathcal{R}(u) + \lambda\, \Phi(u, f),$
where $u$ and $f$ are elements in suitable function spaces, and the desired reconstructed image is computed as a minimiser of the energy functional $\mathcal{J}$ in the Banach space $X$. The functional $\mathcal{J}$ is the sum of two different terms: the regularising energy $\mathcal{R}$, which drives the reconstruction process encoding a priori information about the desired image, and the data fidelity function $\Phi$, modelling the relation between the data $f$ and the reconstructed image $u$. The positive weighting parameter $\lambda$ balances the action of the regularisation against the trust in the data. In this paper we consider
(1.3) $\mathcal{R}(u) = |Du|(\Omega) =: \mathrm{TV}(u),$

the Total Variation (TV) of $u$, as image regulariser, which is a popular choice since the seminal works of Rudin, Osher and Fatemi [50], Chambolle and Lions [17] and Vese [58], due to its edge-preserving and smoothing properties. Under this choice, the minimisation problem (1.2) is formulated in a subspace of $BV(\Omega)$, the space of functions of bounded variation [1].
Having fixed the regularisation term, several choices for the fidelity term $\Phi$ and the balancing parameter $\lambda$ can be made, depending on the imaging task and the specific application considered. The correct mathematical modelling of the data fidelity term in (1.2) is crucial for the design of an image reconstruction model fitting the given data appropriately. Its choice is typically driven by physical and statistical considerations on the noise corrupting the data in (1.1), cf. [33, 56, 28]. In this spirit, the general model (1.2) is derived as the MAP estimate of the likelihood distribution. In the simplest scenario, the noise is assumed to be an additive random component $n$, Gaussian-distributed with zero mean and variance $\sigma^2$ determining the noise intensity. In this case (1.1) can be simply written as $f = u + n$.
Gaussian noise is often used to model the noise statistics in many applications such as in medical imaging, as a simple approximation of more complicated noise models. Other additive noise distributions, such as the more heavytailed Laplace distribution can alternatively be considered. Another possibility – which is appropriate for modelling transmission errors affecting only a percentage of the pixels in the image – is to consider a type of noise where the intensity value of only a fraction of pixels in the image is switched to either the maximum/minimum value of its dynamic range or to a random value within it with positive probability. This type of noise is called “salt & pepper” noise or “impulse” noise, respectively. In some other cases, a different, signaldependent property is assumed to conform to the actual physical application considered. For instance, a Poisson distribution of the noise is used for astronomical and microscopy imaging applications.
When Gaussian noise is assumed, an $L^2$-type data fidelity

(1.4) $\Phi(u, f) = \frac{1}{2}\, \|u - f\|^2_{L^2(\Omega)}$
can be derived for $\Phi$ as the MAP estimate of the Gaussian likelihood function [28, 50, 17]. Similarly, in the case of additive Laplace noise, the statistically consistent data fidelity term reads

(1.5) $\Phi(u, f) = \|u - f\|_{L^1(\Omega)};$
see, e.g., [42, 3]. Furthermore, the same data fidelity is considered in [44, 25] to model the sparse structure of the salt & pepper and impulse noise distributions. Variational models where a Poisson noise distribution is assumed are often approximated by weighted-Gaussian distributions through variance-stabilising techniques [55, 11]. In [53, 52] a statistically consistent analytical modelling has been derived: the resulting data fidelity term is a Kullback–Leibler-type functional of the form

(1.6) $\Phi(u, f) = \int_\Omega \big( u - f \log u \big)\,dx.$
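To make the three fidelities above concrete, the following minimal Python sketch evaluates discrete versions of the $L^2$, $L^1$ and Kullback–Leibler-type data terms on pixel arrays. The function names and the unit weights are our own illustrative choices, not notation from the paper.

```python
import numpy as np

# Illustrative discrete versions of the data fidelities (1.4)-(1.6).
# Function names are ours; fidelity weights are set to one for simplicity.

def l2_fidelity(u, f):
    # Gaussian noise, cf. (1.4): (1/2) * ||u - f||_2^2
    return 0.5 * np.sum((u - f) ** 2)

def l1_fidelity(u, f):
    # Laplace / impulse noise, cf. (1.5): ||u - f||_1
    return np.sum(np.abs(u - f))

def kl_fidelity(u, f):
    # Poisson noise, cf. (1.6): sum of (u - f * log u), assuming u > 0
    return np.sum(u - f * np.log(u))

u = np.array([1.0, 2.0])
f = np.array([1.0, 1.0])
print(l2_fidelity(u, f))  # 0.5
print(l1_fidelity(u, f))  # 1.0
```

Note how the three terms penalise the residual $u - f$ very differently: quadratically, linearly, and via a logarithmic divergence that is only defined for positive intensities.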
As a result of different image acquisition and transmission factors, very often in applications the observed image is corrupted by a mixture of noise statistics. For instance, mixed noise distributions can be observed when faults in the acquisition of the image are combined with transmission errors to the receiving sensors. In this case, the modelling of (1.2) shall encode a combination of salt & pepper and Gaussian noise in the choice of . In other applications, specific tools (such as fluorescence and/or highenergy beams) are used before the signal is actually acquired. This process is typical, for instance, in microscopy and astronomy and may result in a combination of a Poisson noise with an additive Gaussian noise component [51, 54].
From a modelling point of view, the presence of multiple noise distributions has been translated, in the relevant literature, into combinations of the data fidelity terms (1.4), (1.5) and (1.6) in different ways. In [30], for instance, a combined data fidelity with TV regularisation is considered for joint impulsive and Gaussian noise removal. There, the image is spatially decomposed into two disjoint domains in which the image is separately corrupted by pure salt & pepper and Gaussian noise, respectively. A two-phase approach is considered in [13], where two sequential steps with $L^1$ and $L^2$ data fidelities are performed to remove the impulsive and the Gaussian component of the mixed noise, respectively. Mixtures of Gaussian and Poisson noise have also been considered. In [36, 35], for instance, the exact log-likelihood estimator of the mixed noise model is derived and its numerical solution is computed via a primal-dual splitting. A similar model has been considered in [4], where a scaled gradient semi-convergent algorithm is used to solve the combined model. In [27] the discrete-continuous nature of the model (due to the Poisson and the Gaussian distribution being supported on the natural and the real numbers, respectively) is approximated by an additive model, using homomorphic variance-stabilising transformations and weighted approximations. In [41] a non-Bayesian framework differentiating the Poisson intensities from the Gaussian ones in the Haar wavelet domain is considered. In [39] a Gaussian–Poisson model similar to the one we consider in this paper is derived in a discrete setting. A more recent approach featuring a linear combination of different data fidelities of the type (1.4), (1.5) and (1.6) has been considered in [38] for a combination of Gaussian and impulse noise, and in [22, 15, 14] in the context of bilevel learning approaches for the design of optimal denoising models.
In this work, we present an alternative variational model for denoising images corrupted by mixed noise distributions, based on an infimal convolution of the data fidelity terms (1.4), (1.5) and (1.6). For the case of Gaussian–Poisson noise a similar model is discussed in [39] in a finite-dimensional setting. Our model is derived from statistical assumptions on the data and can be studied rigorously in function spaces using standard tools of calculus of variations and functional analysis. The simple infimal convolution nature of the derived model makes its numerical solution amenable to standard first- or second-order optimisation methods. In this paper we use a semismooth Newton (SSN) method for the numerical realisation of our model. By using the classical operation of infimal convolution, our variational model combines different data fidelities, each associated with the corresponding noise component in the data, allowing for a splitting of the noise into its constituting elements. In what follows, we fix the regularisation term in (1.2) to be the TV energy. This is of course just a toy example, and extensions to higher-order regularisations such as TVTV [46] or TGV [9] can be considered as well. We refer to our model as TVIC to highlight the infimal convolution combination of data fidelities. This should not be confused with the ICTV model [17, 32], where the same operation is used to combine TV regularisation with second-order regularisation terms.
1.1 The reference models
In the following, we consider two exemplar problems which extend (1.1) to the case of multiple noise distributions. We consider the following problem encoding noise mixtures

(1.7) $f = \mathcal{N}(u) + n,$

where $\mathcal{N}(u)$ models a general noising process possibly depending on $u$ in a nonlinear way, while $n$ is an additive, Gaussian-distributed noise component independent of $\mathcal{N}(u)$. We will focus in the following on two particular cases of (1.7).
Salt & pepper and Gaussian noise
For this case, we suppose that the observed noisy image is corrupted by a mixture of salt & pepper and Gaussian noise, i.e. we consider the following instance of the noising model (1.7),
(1.8) 
where, following [19, Section 1.2.2], the salt & pepper component is modelled by considering two independent random fields $r_1$ and $r_2$ defined, for every pixel, as
Following [49, Section 2.2] and [42], we will approximate the nonlinear model (1.8) by the following additive model:

(1.9) $f = u + n_1 + n,$

where $n_1$ is now the realisation of a Laplace-distributed random variable, independent of $n$. This approximation will be made clearer in Section 3.
Poisson and Gaussian noise
In this case, we consider the problem of a mixture of noise distributions where a signal-dependent component is combined with an additive Gaussian noise component. In particular, we will focus on a Poisson-distributed component having $u$ as Poisson parameter. In other words, we will consider the noising model (1.7) with

(1.10) $f = n_1 + n, \qquad n_1 \sim \mathcal{P}(u).$
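The mixed Poisson–Gaussian model (1.10) is straightforward to simulate. The sketch below, with variable names and parameter values of our own choosing, draws one realisation of a Poisson component with the clean image as its parameter plus independent additive Gaussian noise.

```python
import numpy as np

# Illustrative simulation of the mixed noise model (1.10): a Poisson
# component with the clean image u as its parameter, plus independent
# additive Gaussian noise. Names and values are ours.

rng = np.random.default_rng(0)

def mixed_poisson_gaussian(u, sigma):
    poisson_part = rng.poisson(u).astype(float)           # n_1 ~ P(u), pixelwise
    gaussian_part = rng.normal(0.0, sigma, size=u.shape)  # n ~ N(0, sigma^2)
    return poisson_part + gaussian_part

u = np.full(100_000, 50.0)  # constant "image" with intensity 50
f = mixed_poisson_gaussian(u, sigma=2.0)
# mean(f) is close to 50, var(f) is close to 50 + sigma^2 = 54
```

The sample statistics illustrate the signal dependence of the Poisson part: its variance equals its mean, so the total variance grows with the image intensity while the Gaussian contribution stays fixed.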
Organisation of the paper
In Section 2 we present the infimal-convolution data fidelity terms considered for the reference models above and show that they are well-defined. Motivations for the TVIC denoising model are given in Section 3, where some considerations based on the use of Bayes' formula are given to support our modelling. In particular, we make precise how the mixed data fidelity terms can be derived in two different ways: using a joint MAP estimation strategy as in [39], and by deriving a joint posterior distribution using a variant of the classical convolution operation of two probability distributions. In Section 4, well-posedness results for the TVIC model are given by means of classical tools of calculus of variations and functional analysis. In Section 5 we show how classical single-noise models can be recovered asymptotically from our model by letting the fidelity parameters go to infinity. In Section 6 we confirm the validity of our model with several different numerical experiments, using a semismooth Newton method for its efficient numerical solution. Finally, in Section 7.1 we report some preliminary results in the spirit of recent advances in noise model learning via bilevel optimisation à la [14].
2 The variational model
We consider an open and bounded image domain $\Omega \subset \mathbb{R}^2$ with Lipschitz boundary and denote by $f$ the given noisy image. Further assumptions on the function spaces where $f$ lies will be specified in the following.
Let $|Du|(\Omega)$ be the total variation seminorm defined in (1.3) and let $BV(\Omega)$ denote the space of functions of bounded variation [1]. For positive weights $\lambda_1, \lambda_2$ and admissible sets of functions $\mathcal{U}$ and $\mathcal{V}$, the general proposed TVIC denoising model for mixed noise distributions reads:
(TVICa) $\min_{u \in \mathcal{U}}\; |Du|(\Omega) + \Phi_{\lambda_1, \lambda_2}(u, f),$

where the data fidelity $\Phi_{\lambda_1, \lambda_2}$ has the following infimal convolution-type structure:

(TVICb) $\Phi_{\lambda_1, \lambda_2}(u, f) = \min_{v \in \mathcal{V}}\; \lambda_1\, \phi_1(u, f - v) + \lambda_2\, \phi_2(v),$

for two data fidelity functions $\phi_1$ and $\phi_2$ defined over the admissible sets. The size of the weighting parameters $\lambda_1$ and $\lambda_2$ in (TVICb) balances the trust in the data against the smoothing effect of the regularisation, as well as the fitting with respect to the intensity of each single noise distribution in $f$.
We now consider the two particular noise combinations introduced in Section 1.1 and demonstrate how these fit into the general model above.
2.1 Salt & pepperGaussian fidelity
For the case of mixed salt & pepper and Gaussian noise (1.9), we specify the assumptions for the model in (TVICa)–(TVICb) as follows. We consider the noisy image $f \in L^2(\Omega)$, the fidelity $\phi_1 = \|\cdot\|_{L^1(\Omega)}$ for the impulsive noise component and $\phi_2 = \frac{1}{2}\|\cdot\|^2_{L^2(\Omega)}$ for the Gaussian noise component. The admissible sets are simply $\mathcal{U} = BV(\Omega)$ and $\mathcal{V} = L^2(\Omega)$. In this case, the infimal convolution data fidelity in (TVICb) reads

(2.2) $\Phi_{\lambda_1, \lambda_2}(u, f) = \min_{v \in L^2(\Omega)}\; \lambda_1\, \|f - v - u\|_{L^1(\Omega)} + \frac{\lambda_2}{2}\, \|v\|^2_{L^2(\Omega)}.$

Since the set $\Omega$ is bounded, $L^2(\Omega) \subset L^1(\Omega)$ and both terms in (2.2) are well-defined.
Remark 2.1.
Note that (2.2) can be rewritten in terms of the Moreau envelope, or of a weighted proximal map, of the $L^1$ norm, with parameter given by the ratio $\gamma := \lambda_1/\lambda_2$ (see [2, Section 12.4]). Namely, denoting by $\mathrm{prox}_{\gamma g}$ the proximal map of a generic function $g$ and by $e_{\gamma g}$ the Moreau envelope of $g$ with parameter $\gamma$, we have:

$\Phi_{\lambda_1, \lambda_2}(u, f) = \lambda_2\, e_{\gamma \|\cdot\|_{L^1(\Omega)}}(f - u).$

Therefore, $\Phi_{\lambda_1, \lambda_2}$ inherits in this case several good properties of proximal maps, such as firm nonexpansiveness ([2, Proposition 12.27]), which will be useful in the following analysis.
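As a concrete illustration of Remark 2.1, for discrete images the inner minimisation in (2.2) decouples pixelwise and is solved in closed form by soft thresholding, so the fidelity value coincides with a Huber-type Moreau envelope evaluated at the residual. A minimal Python sketch, with parameter names of our choosing, written in the equivalent form $\min_v\; \lambda_1 \|v\|_1 + \frac{\lambda_2}{2} \|z - v\|_2^2$ with $z = f - u$:

```python
import numpy as np

# Pixelwise closed form of the inner problem in (2.2), written as
#   min_v lam1*||v||_1 + (lam2/2)*||z - v||_2^2,   z = f - u,
# whose minimiser is soft thresholding with threshold lam1/lam2; the
# optimal value is a Huber-type Moreau envelope of the l1 norm.
# Parameter names are ours.

def soft_threshold(z, tau):
    # prox of tau*||.||_1: componentwise shrinkage towards zero
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ic_fidelity(z, lam1, lam2):
    # value of the inner minimum in (2.2) at the residual z
    v = soft_threshold(z, lam1 / lam2)
    return lam1 * np.sum(np.abs(v)) + 0.5 * lam2 * np.sum((z - v) ** 2)

# with lam1 = lam2 = 1: residuals with |z| <= 1 are penalised
# quadratically (Gaussian regime), larger ones linearly (impulse regime)
z = np.array([2.0, 0.5])
```

This closed form is what makes the TVIC fidelity cheap to evaluate inside first- or second-order outer solvers.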
The following proposition asserts that the minimisation problem (2.2) is wellposed.
Proposition 2.2.
Let $f \in L^2(\Omega)$, $u \in BV(\Omega)$ and $\lambda_1, \lambda_2 > 0$. Then the minimum in the minimisation problem (2.2) is uniquely attained.
Proof.
The proof of this proposition is based on the use of standard tools of calculus of variations. We report it in Appendix A. ∎
2.2 GaussianPoisson fidelity
For the case of mixed Gaussian and Poisson noise (1.10), some additional assumptions need to be specified. The given image $f$ is assumed to be bounded, that is $f \in L^\infty(\Omega)$, and the data fidelities $\phi_2$ and $\phi_1$, encoding the Gaussian and the Poisson component of the noise, respectively, are then chosen as $\phi_2 = \frac{1}{2}\|\cdot\|^2_{L^2(\Omega)}$ and as the Kullback–Leibler (KL) divergence between the “residual” $f - v$ and $u$, that is

(2.3) $\mathrm{KL}(f - v, u) := \int_\Omega \Big( (f - v) \log \frac{f - v}{u} - (f - v) + u \Big)\,dx.$
We refer the reader to Appendix B where more properties of the KL functional are given. The variational fidelity in (TVICb) then reads

(2.4) $\Phi_{\lambda_1, \lambda_2}(u, f) = \min_{v \in \mathcal{V}}\; \lambda_1\, \mathrm{KL}(f - v, u) + \frac{\lambda_2}{2}\, \|v\|^2_{L^2(\Omega)},$
with the following choice of the admissible sets:

(2.5) $\mathcal{U} = \{ u \in BV(\Omega) : u > 0 \text{ a.e. in } \Omega \}, \qquad \mathcal{V} = \{ v \in L^2(\Omega) : f - v \ge 0 \text{ a.e. in } \Omega \}.$

We note that for $v \in \mathcal{V}$ we have $f - v \ge 0$ almost everywhere. Together with the definition of $\mathcal{U}$, this ensures that the functional in (2.4) is well defined (using the standard convention $0 \log 0 = 0$).
Remark 2.3.
Our choice for the Poisson data fidelity (2.3) differs from previous works on Poisson noise modelling, such as [52, 22], where a reduced KL-type data fidelity

$\int_\Omega \big( u - f \log u \big)\,dx$

is preferred. That choice is motivated through Bayesian derivation and MAP estimation (compare [52, 11, 3]), where the terms that do not depend on the denoised image $u$ are neglected since they are not part of the optimisation argument. In our combined modelling, however, those quantities have to be taken into account to incorporate the additional variable $v$ encoding the Gaussian noise component.
Similarly as in Proposition 2.2, we guarantee that the minimisation problem (2.4) is well posed in the following proposition.
Proposition 2.4.
3 Statistical derivation and interpretation
In this section we want to give a statistical motivation for the choice of the IC data fidelity term introduced in the previous section. We do this by switching from the continuous setting, with the image in an infinite-dimensional function space, to the discrete setting, where the image is a vector in $\mathbb{R}^n$ and the noise at each pixel is independent of the others. Correspondingly, we consider an appropriate discretisation of the continuous TVIC variational model (TVICa)–(TVICb). In this setting, we will show how the IC data fidelity can be derived in two ways: in terms of a joint MAP estimate for the image and the noise component in Section 3.1, and by considering a modified likelihood for the mixed noise distribution – which consists of an infinity convolution of the two noise distributions – and computing its MAP estimate with respect to the image only in Section 3.2.
3.1 Joint maximumaposterior estimation
A standard approach used to derive variational regularisation methods for inverse problems which reflect the statistical assumptions on the data is the Bayesian approach [19, 56]. In this framework, the problem is formulated in terms of the maximisation of the posterior probability density $\pi(u \mid f)$, i.e. the probability density of the desired denoised image $u$ given the noisy image $f$. This approach is commonly known as Maximum A Posteriori (MAP) estimation and relies on the simple application of Bayes' rule

(3.1) $\pi(u \mid f) = \frac{\pi(f \mid u)\, \pi(u)}{\pi(f)},$
by which maximising $\pi(u \mid f)$ over $u$ translates into maximising the ratio on the right-hand side of (3.1). Classically, the probability density $\pi(u)$ is called prior probability density since it encodes statistical a priori information on $u$. Frequently, a Gibbs model of the form

(3.2) $\pi(u) = C\, e^{-\mathcal{R}(u)},$
where $\mathcal{R}$ is a convex regularising energy functional and $C$ a normalisation constant, is assumed. In what follows, we take $\mathcal{R}$ to be a discretisation of the total variation (1.3): denoting by $(x_1, x_2)$ the pixel coordinates and reordering the image within a one-dimensional vector with index $i$, thus considering $u \in \mathbb{R}^n$, we have

$\mathcal{R}(u) = \mathrm{TV}(u) = \sum_{i=1}^n |(\nabla u)_i|,$

where $\nabla$ denotes the forward-difference approximation of the two-dimensional gradient. The conditional probability density $\pi(f \mid u)$, also called the likelihood function, encodes the statistical assumptions on the noise corrupting the data $f$. Its expression depends on the probability distribution of the noise assumed on the data. The denominator $\pi(f)$ of (3.1) plays the role of a normalisation constant for $\pi(u \mid f)$ to be a probability density.
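For illustration, the forward-difference discrete total variation described above can be sketched as follows; this is a minimal version, assuming replicated values at the image boundary, with names of our choosing.

```python
import numpy as np

# Minimal sketch of the isotropic discrete total variation with
# forward differences and replication at the image boundary.

def discrete_tv(u):
    # forward differences (zero in the last row/column by replication)
    dx = np.diff(u, axis=0, append=u[-1:, :])
    dy = np.diff(u, axis=1, append=u[:, -1:])
    # sum of pointwise gradient magnitudes
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))
```

Constant images have zero TV, while a single bright pixel contributes the magnitude of its two forward jumps, which is what gives the prior its edge-preserving character.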
Clearly, maximising (3.1) in our setting of a Gibbs prior is equivalent to minimising the negative logarithm of $\pi(u \mid f)$. Thus, the MAP estimation problem equivalently corresponds to the problem of finding

$u^* \in \operatorname*{arg\,min}_{u}\; -\log \pi(f \mid u) + \mathcal{R}(u),$

where the term $\pi(f)$ has been neglected since it does not affect the minimisation over $u$.
We now want to use MAP estimation to derive a discretisation of the TVIC variational model (TVICa)–(TVICb) for the mixed noise scenario. From a Bayesian point of view, the general model (1.7) can be written in terms of random variables $F$ and $U$ (for which the noisy image $f$ and the noise-free image $u$ are realisations, respectively) as follows

(3.3) $F = N_1 + N_2.$

Here, $F$ is the sum of the two random variables $N_1$ and $N_2$. The random variable $N_1$ is distributed according to a probability distribution which may in general depend on $U$ (for instance, $U$ could be one of its parameters), whereas $N_2$ is fixed to be a Gaussian random variable independent of $U$ and $N_1$, which is combined with $N_1$ through an additive modelling. Here, we denote by $\pi(u)$ the Gibbs model (3.2) on $U$, i.e. the prior distribution on $U$, the quantity of interest, so that $U \sim \pi(u)$.
For our derivation we consider a joint MAP estimation for a pair $(u, n_2)$ of realisations of the corresponding random variables $U$ and $N_2$. Using Bayes' rule and mutual independence between $U$ and $N_2$, we have:

(3.4) $\pi(u, n_2 \mid f) = \frac{\pi(f \mid u, n_2)\, \pi(u)\, \pi(n_2)}{\pi(f)},$
which holds for general noise distributions. However, in the particular case where $N_1 = U + \tilde{N}_1$, with $\tilde{N}_1$ being a random variable independent of $U$ (in particular, when $U$ is not a parameter of the probability distribution of $\tilde{N}_1$ and is combined additively with it), (3.3) becomes

(3.5) $F = U + \tilde{N}_1 + N_2.$

In this case $\pi(f \mid u, n_2) = \pi_{\tilde{N}_1}(f - u - n_2)$, and (3.4) reduces to:

(3.6) $\pi(u, n_2 \mid f) = \frac{\pi_{\tilde{N}_1}(f - u - n_2)\, \pi_{N_2}(n_2)\, \pi(u)}{\pi(f)}.$
After these general considerations, let us now carry on with the detailed MAP derivation, first for the additive mixture of the Laplace noise distribution (as an approximation of the salt & pepper distribution) and the Gaussian noise distribution (1.9), and then for the mixed Gaussian–Poisson case (1.10).
The additive case
Let us focus first on the additive model (3.5) in the case of a mixture of Laplace- and Gaussian-distributed noise. Following [49], we will use the Laplace distribution as an approximation of the one describing salt & pepper noise: the latter is heavy-tailed, and its behaviour can be well approximated by the heavy-tailed Laplace distribution [42, 59]. We start by formulating the problem (3.5) in a finite-dimensional statistical setting. For each image pixel $i = 1, \dots, n$, our problem is

(3.7) $f_i = u_i + (n_1)_i + (n_2)_i,$

where $(n_1)_i$ and $(n_2)_i$ are realisations of two mutually independent random variables distributed according to a Laplace and a Gaussian distribution, respectively. More precisely, the equation above describes the structure of the realisations of the components of the random vector $F$ in terms of the ones of the random vectors $N_1$ and $N_2$. The quantities $(n_1)_i$ and $(n_2)_i$ are, for every $i$, independent realisations of identically distributed Laplace and Gaussian random variables $(N_1)_i$ and $(N_2)_i$, respectively, i.e. $(N_1)_i \sim \mathrm{Lap}(0, \tau)$ and $(N_2)_i \sim \mathcal{N}(0, \sigma^2)$.
We recall that the Laplace probability density with parameter $\tau > 0$ is defined as

(3.8) $\pi_{\mathrm{Lap}}(t) = \frac{1}{2\tau}\, e^{-|t|/\tau},$

and the Gaussian probability density with zero mean and variance $\sigma^2$ is defined as

(3.9) $\pi_{\mathrm{Gauss}}(t) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-t^2/(2\sigma^2)}.$
Let us now take the negative logarithm in the joint MAP estimate (3.6) for all the realisations. By independence, we have that:

(3.10) $(u^*, n_2^*) \in \operatorname*{arg\,min}_{u,\, n_2}\; \frac{1}{\tau} \sum_{i=1}^n |f_i - u_i - (n_2)_i| + \frac{1}{2\sigma^2} \sum_{i=1}^n (n_2)_i^2 + \mathcal{R}(u),$

where the constant terms which do not affect the minimisation over $u$ and $n_2$ have been neglected.
To pass from a discrete to a continuous expression of the model we follow [52]. We interpret the elements of $\mathbb{R}^n$ as samples of functions defined on the whole image domain $\Omega$. For convenience, we use in the following the same notation to indicate the corresponding continuous quantities. Introducing the indicator function

$\chi_{\Omega_i}(x) = \begin{cases} 1, & x \in \Omega_i, \\ 0, & \text{otherwise}, \end{cases}$

where $\Omega_i \subset \Omega$ is the region in the image occupied by the $i$-th detector, we have that any discrete datum $f_i$ can be interpreted as the mean value of a function $f$ over the region $\Omega_i$ as follows

$f_i = \frac{1}{|\Omega_i|} \int_{\Omega_i} f(x)\,dx.$
In this way, we can then express (3.10) as the following continuous model

$\min_{u,\, n_2}\; \frac{1}{\tau} \int_\Omega |f - u - n_2|\,dx + \frac{1}{2\sigma^2} \int_\Omega n_2^2\,dx + |Du|(\Omega),$

where $|\Omega_i|$ above denotes the usual Lebesgue measure in $\mathbb{R}^2$ of the pixel region $\Omega_i$. We observe that at this level, the function spaces (i.e. the regularity) where the minimisation problem is posed still need to be specified. Defining $\lambda_1 := 1/\tau$ and $\lambda_2 := 1/\sigma^2$ and replacing $n_2$ by $v$, we derive from the model above the following variational model for the removal of a Laplace and Gaussian noise mixture

$\min_{u,\, v}\; |Du|(\Omega) + \lambda_1 \int_\Omega |f - u - v|\,dx + \frac{\lambda_2}{2} \int_\Omega v^2\,dx,$

where the function spaces for $u$ and $v$ have been chosen so that all the terms are well defined. By a simple change of variables, we can equivalently write the model above as

(3.11) $\min_{u,\, v}\; |Du|(\Omega) + \lambda_1\, \|v\|_{L^1(\Omega)} + \frac{\lambda_2}{2}\, \|f - u - v\|^2_{L^2(\Omega)},$

where the minimisation is taken over the denoised image $u$ and the salt & pepper noise component $v$.
The signaldependent case
Let us now consider (3.3) in the case when the non-Gaussian noise component in the data follows a Poisson probability distribution with parameter $u$. In this case, the model (3.3) can be formulated in a finite-dimensional setting as

(3.12) $f_i = (n_1)_i + (n_2)_i$

at every pixel $i = 1, \dots, n$. Similarly as before, the equation above describes the structure of the realisations of the random vector $F$ in terms of the sum of two mutually independent random vectors $N_1$ and $N_2$, having as components independent and identically distributed Poisson and Gaussian random variables $(N_1)_i \sim \mathcal{P}(u_i)$ and $(N_2)_i \sim \mathcal{N}(0, \sigma^2)$, respectively, for every $i$. Again, the values $u_i$ are random realisations of the random variables $U_i$, which are distributed according to the Gibbs model (3.2) introduced before.
The Poisson noise density with parameter $u_i > 0$ is defined as

(3.13) $\pi_{\mathrm{Pois}}(k) = \frac{u_i^k\, e^{-u_i}}{k!}, \qquad k \ge 0,$

where the factorial function is classically extended to the whole real line by using the Gamma function, $k! = \Gamma(k + 1)$. We note that (3.13) can be rewritten equivalently as:

$\pi_{\mathrm{Pois}}(k) = e^{\,k \log u_i \,-\, u_i \,-\, \log \Gamma(k + 1)}.$
Similarly as above, let us now take the negative logarithm in the general joint MAP estimate (3.6), for all the realisations. By independence, we have:

$(u^*, n_2^*) \in \operatorname*{arg\,min}_{u,\, n_2}\; \frac{1}{2\sigma^2} \sum_{i=1}^n (n_2)_i^2 + \sum_{i=1}^n \Big( u_i - \big(f_i - (n_2)_i\big) \log u_i + \log \Gamma\big(f_i - (n_2)_i + 1\big) \Big) + \mathcal{R}(u),$

where the constant terms which do not affect the minimisation over $u$ and $n_2$ have been neglected. Passing from a discrete to a continuous modelling similarly as described in the discussion for the additive case, we obtain the following continuous model

(3.14) $\min_{u,\, v}\; |Du|(\Omega) + \frac{\lambda_2}{2} \int_\Omega v^2\,dx + \lambda_1 \int_\Omega \Big( u - (f - v) \log u + \log \Gamma(f - v + 1) \Big)\,dx,$
where we have set $\lambda_2 := 1/\sigma^2$, renamed $n_2$ as $v$ and introduced a weight $\lambda_1 > 0$ in front of the Poisson term, and where we still have to specify the function spaces where the minimisation takes place. By a closer inspection of the third term in the model above, we have:

$\int_\Omega \Big( u - (f - v) \log u + \log \Gamma(f - v + 1) \Big)\,dx \;\approx\; \int_\Omega \Big( u - (f - v) + (f - v) \log \frac{f - v}{u} \Big)\,dx = \mathrm{KL}(f - v, u),$

where we have used the standard Stirling approximation of the logarithm of the factorial function, $\log \Gamma(s + 1) \approx s \log s - s$. The functional we end up with is the well-known Kullback–Leibler (KL) functional, which has been used for the design of imaging models for Poisson noise removal in previous works such as [53, 52, 40]. We refer the reader also to Appendix B where some properties of $\mathrm{KL}$ are recalled. The variational Poisson–Gaussian data fidelity (3.14) can then be written in a more compact form as
(3.15) $\min_{u \in \mathcal{U},\, v \in \mathcal{V}}\; |Du|(\Omega) + \lambda_1\, \mathrm{KL}(f - v, u) + \frac{\lambda_2}{2}\, \|v\|^2_{L^2(\Omega)},$

where the admissible sets $\mathcal{U}$ and $\mathcal{V}$ are chosen as specified in (2.5) to guarantee that all the terms in (3.15) are well-defined. We point out here again that in [39] a similar joint MAP estimation for a mixed Gaussian and Poisson modelling was considered.
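The Stirling approximation used above to replace the log-factorial of the Poisson likelihood by the Kullback–Leibler integrand can be checked numerically. The small sketch below, with names of our choosing, compares the exact log-factorial, computed via the Gamma function, with $n \log n - n$.

```python
import math

# Numeric check of the Stirling approximation log(n!) ~ n*log(n) - n,
# with the exact value computed via the Gamma function. Names are ours.

def log_factorial(n):
    # exact: log(n!) = log Gamma(n + 1)
    return math.lgamma(n + 1.0)

def stirling(n):
    return n * math.log(n) - n

rel_err = abs(log_factorial(1000.0) - stirling(1000.0)) / log_factorial(1000.0)
# the relative error is already below 0.1% at n = 1000
```

The accuracy improves as the intensity grows, which is consistent with the approximation being used for photon counts in the Poisson regime.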
In both the general model (3.3) and its simplified additive version (3.5) we have derived, through joint MAP estimation, the two variational models (3.11) and (3.15) which, consistently with our statistical assumptions, model the single noise components by the data terms classically used for the corresponding single-noise models, combined together for the mixed noise case through a joint minimisation procedure. The fidelity terms are weighted against each other by the parameters $\lambda_1$ and $\lambda_2$, whose size depends on the intensity of each noise component and which also weight the fidelity against the TV regularisation.
In the next section, we discuss a different interpretation of the TVIC model, as the MAP estimate for a posterior distribution in which the two noise distributions are combined by a so-called infinity convolution, instead of the classical convolution of probability distributions.
3.2 Infinity convolution
From a probabilistic point of view, convolution-type data fidelities (TVICb) can be alternatively derived via a modification of the expression of the probability density of a sum of random variables. Given two independent real-valued random variables $X_1$ and $X_2$ with associated probability densities $p_1$ and $p_2$, it is well known that the random variable $X_1 + X_2$ has probability density given by the convolution of $p_1$ and $p_2$, that is

(3.16) $(p_1 * p_2)(z) = \int_{\mathbb{R}} p_1(t)\, p_2(z - t)\,dt.$
Following [31, Remark 2.3.2], since $p_1$ and $p_2$ are non-negative, we can define for $1 \le p < \infty$ the convolution of order $p$ as follows:

(3.17) $(p_1 *_p p_2)(z) = \left( \int_{\mathbb{R}} \big( p_1(t)\, p_2(z - t) \big)^p\,dt \right)^{1/p},$

which clearly corresponds to the classical convolution (3.16) if $p = 1$. By letting $p \to \infty$, it is easy to show that (3.17) converges to the infinity convolution of $p_1$ and $p_2$

(3.18) $(p_1 *_\infty p_2)(z) = \sup_{t \in \mathbb{R}}\; p_1(t)\, p_2(z - t).$
We note that in (3.16) and (3.18) $p_1$ and $p_2$ have the same domain. Also, note that in order for $p_1 *_\infty p_2$ to define a probability density the following normalisation

$\tilde{p}(z) := \frac{(p_1 *_\infty p_2)(z)}{\int_{\mathbb{R}} (p_1 *_\infty p_2)(s)\,ds}$

is needed. Then, $\tilde{p}$ is a probability density, having the same domain as $p_1 * p_2$, but potentially featuring heavier tails.
In the special case when the probability densities $p_1$ and $p_2$ have an exponential form of the type $p_1(t) = c_1\, e^{-\phi_1(t)}$ and $p_2(t) = c_2\, e^{-\phi_2(t)}$, with $c_1, c_2$ positive constants and $\phi_1, \phi_2$ continuous and convex positive functions, (3.18) can be equivalently rewritten as:

(3.19) $(p_1 *_\infty p_2)(z) = c_1 c_2\, e^{-\inf_{t \in \mathbb{R}} \left( \phi_1(t) + \phi_2(z - t) \right)} = c_1 c_2\, e^{-(\phi_1 \,\square\, \phi_2)(z)}.$

Hence, the infinity convolution of $p_1$ and $p_2$ corresponds to the infimal convolution $\phi_1 \,\square\, \phi_2$ of $\phi_1$ and $\phi_2$ through a negative exponentiation and up to multiplication by positive constants.
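The identity (3.19) can be verified numerically in the Laplace–Gaussian case: a brute-force minimisation of $\phi_1(t) + \phi_2(z - t)$ on a fine grid should match the closed-form infimal convolution, which here is the Huber function. A sketch, with names and parameter values of our choosing:

```python
import numpy as np

# Numeric check of (3.19) for phi1(t) = lam*|t| (Laplace potential) and
# phi2(t) = t^2/2 (Gaussian potential, sigma = 1): their infimal
# convolution has the closed-form Huber expression. Names are ours.

def inf_conv_on_grid(z, lam, grid):
    # brute-force (phi1 [] phi2)(z) over a fine grid of t values
    return np.min(lam * np.abs(grid) + 0.5 * (z - grid) ** 2)

def huber(z, lam):
    # closed form of the infimal convolution
    return 0.5 * z ** 2 if abs(z) <= lam else lam * abs(z) - 0.5 * lam ** 2

grid = np.linspace(-10.0, 10.0, 200_001)  # spacing 1e-4
```

The quadratic-near-zero, linear-in-the-tails shape of the resulting potential is exactly the heavier-tailed behaviour claimed above for the infinity convolution.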
Let us now recall the additive model in the finite-dimensional setting (3.7). At every pixel, the random variable $F_i$ is the sum of the two independent random variables $(N_1)_i$ and $(N_2)_i$ having Laplace (3.8) and Gaussian (3.9) probability densities, respectively. In contrast to the previous Section 3.1, where we computed a joint MAP estimate for $u$ and $n_2$, here we want to compute the MAP estimate for the modified, infinity-convolution likelihood (3.19), optimising over $u$ only. We will see that in doing so we derive the same infimal-convolution models (3.11) and (3.15) as before. By independence of the realisations, we have that the probability density of the random vector $F$ in correspondence with the realisation $f$ reads

(3.20) $\pi(f \mid u) = \prod_{i=1}^n \pi(f_i \mid u_i).$

Each probability density $\pi(f_i \mid u_i)$ is formally given by the convolution between the Laplace and the Gaussian probability distribution. However, if we replace this convolution with the infinity convolution defined in (3.18) and we normalise appropriately in order to get a probability distribution, we get that the probability density of a single $F_i$ evaluated in correspondence of one realisation $f_i$ can be expressed as

$\pi_\infty(f_i \mid u_i) = C\, e^{-\inf_{t \in \mathbb{R}} \left( \frac{|t|}{\tau} + \frac{(f_i - u_i - t)^2}{2\sigma^2} \right)},$

with $C > 0$ a normalisation constant.
Plugging this expression into (3.20) and computing the negative log-likelihood of the posterior $\pi(u \mid f) \propto \pi(f \mid u)\, \pi(u)$, we get that the functional we intend to minimise is

$\min_u\; \sum_{i=1}^n \inf_{t_i \in \mathbb{R}} \left( \frac{|t_i|}{\tau} + \frac{(f_i - u_i - t_i)^2}{2\sigma^2} \right) + \mathcal{R}(u),$

where the constant terms which do not affect the minimisation over $u$ have been neglected.
Similar to before, passing from a discrete to a continuous representation we get

(3.21) $\min_u\; |Du|(\Omega) + \int_\Omega \inf_{t \in \mathbb{R}} \left( \frac{|t|}{\tau} + \frac{(f - u - t)^2}{2\sigma^2} \right) dx = \min_u\; |Du|(\Omega) + \inf_v \int_\Omega \left( \frac{|v|}{\tau} + \frac{(f - u - v)^2}{2\sigma^2} \right) dx,$

where the infimum and the integral operators commute provided the function space for $v$ satisfies the assumptions of [48, Theorem 3A] (see also [29] for further details). This also ensures that the integrand terms are both well defined. Similarly as above, we now define $\lambda_1 := 1/\tau$ and $\lambda_2 := 1/\sigma^2$ and derive from (3.21) the same data fidelity term as in (3.11).
3.3 Connection with existing approaches
Several variational models for image denoising in the presence of combined noise distributions have been considered in the literature.
In the case of a mixture of impulsive and Gaussian noise, these models can be roughly divided into two categories. The first category considers additive $L^1 + L^2$ data fidelities. In [30], for instance, the image domain is decomposed into two parts, with impulsive noise in one and Gaussian noise in the other, modelled by the sum of an $L^1$ and an $L^2$ data fidelity, each supported on the respective domain with impulse or Gaussian noise. For the numerical solution, an efficient domain decomposition approach is used. In [22, 15, 14] semismooth Newton methods are employed to solve a denoising model where $L^1$ and $L^2$ data fidelities are combined in an additive fashion. A second category of methods renders the removal of impulsive and Gaussian noise as a two-step procedure (see, e.g., [13]). In the first phase, the pixels damaged by the impulsive noise component are identified via an outlier-removal procedure and removed making use of a variational regularisation model with $L^1$ data fitting. Then a variational denoising model with an $L^2$ data fidelity is employed for removing the Gaussian noise in all other pixels. These methods are often presented as image inpainting strategies where the pixels corrupted by impulsive noise are first identified and then filled in using the nearby information with a variational regularisation approach.
When a mixture of Gaussian and Poisson noise is assumed, an exact log-likelihood model has been considered in [35, 36]. In the same discrete setting described above, still denoting by $n$ the total number of pixels, the expression of the negative log-likelihood reads

(3.22) $-\log \pi(f \mid u) = -\sum_{i=1}^n \log \left( \sum_{k=0}^{+\infty} \frac{u_i^k\, e^{-u_i}}{k!}\, \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(f_i - k)^2}{2\sigma^2}} \right).$
In comparison with our derivation presented in Section 3.2, here the log-likelihood is derived via MAP estimation for the logarithm of the convolution (3.16) of the Gaussian probability density (3.9) with the Poisson probability density (3.13). Instead, the model we propose is derived equivalently either by replacing the convolution with the infinity convolution or by a joint MAP estimation. The resulting variational model (3.15) is much simpler since, in particular, it avoids the infinite sum appearing in (3.22), which stems from the discrete Poisson density function. In order to design an efficient optimisation strategy, the authors of [35, 36] first split the expression above into the sum of two terms, the former being a convex Lipschitz-differentiable function, the latter being a proper, convex and lower semicontinuous function. They are then able to design a competitive first-order optimisation method based on primal-dual splitting algorithms, exploiting the closed form of the proximal operator associated with the proper, convex and lower semicontinuous term. The advantage of such algorithms is that they rely only on the iterative application of proximal mapping operations, without requiring any matrix inversion. Furthermore, the approximation errors accumulating throughout the iterations of the algorithm are shown to form absolutely summable sequences, a property which is essential in this framework due to the presence of infinite sums.
Another approach for mixed Gaussian and Poisson noise is considered in [39], where the authors design a data fidelity term similar to (3.15), combining it with TV regularisation for image denoising. Despite the analogies between the joint MAP estimation of their model and ours, in their work neither well-posedness results in function spaces nor properties of the noise decomposition are discussed. Nonetheless, in [39] the good performance of the combined model is observed in terms of improvements of the Peak Signal to Noise Ratio (PSNR) for several synthetic and microscopy images.
Differently from most of the aforementioned works, the simple structure of our models (3.11) and (3.15) allows for the design of efficient first- and second-order numerical schemes. In this paper we focus on the latter. Our approach is further able to ‘decompose’ the noise into its different statistical components, each corresponding to one particular noise distribution in the data. The authors are not aware of any existing method with such a feature.
4 Wellposedness of the TVIC model
Thanks to Proposition 2.2 and Proposition 2.4, we can conclude that in both cases (2.2) and (2.4) the function $\Phi_{\lambda_1, \lambda_2}$ is well-defined and that for every $u$ there exists a unique element $v(u)$ minimising the functional in (TVICb). Problem (TVICa) can then be rewritten as

(4.1) $\min_{u \in \mathcal{U}}\; |Du|(\Omega) + \lambda_1\, \phi_1\big(u, f - v(u)\big) + \lambda_2\, \phi_2\big(v(u)\big),$

where $v(u)$ is the unique solution of (TVICb) in one of the two cases (2.2) and (2.4). In particular, for every $u$, there is a positive finite constant $C(u)$ such that

(4.2) $\|v(u)\|_{L^2(\Omega)} \le C(u).$
Note that the constant in (4.2) may depend on $u$ and hence does not necessarily bound $v(u)$ uniformly. For the following existence proof, therefore, we restrict the admissible set of solutions for $v$ by intersecting it with the closed ball in $L^2(\Omega)$ with fixed radius $r$, centred at $f$. That is, in (TVICb) we consider a new admissible set

(4.3) $\tilde{\mathcal{V}} := \mathcal{V} \cap \overline{B_r(f)},$

where $\overline{B_r(f)} = \{ v \in L^2(\Omega) : \|v - f\|_{L^2(\Omega)} \le r \}$ for some $r > 0$ and $\mathcal{V}$ is defined as before in each case. Since $\overline{B_r(f)}$ is closed, bounded and convex (hence weakly compact), the well-posedness properties studied in Section 2 still hold true. In addition, one can now easily compute the constant of (4.2) by Young's inequality. Namely, for every $v \in \tilde{\mathcal{V}}$ one now has

(4.4) $\|v\|^2_{L^2(\Omega)} \le 2\big( r^2 + \|f\|^2_{L^2(\Omega)} \big) =: C,$

with $C$ independent of $u$.
with . This additional assumption is reasonable for our applications since we want the noise component to preserve some similarities with the given image in terms of its noise features.
After this modification, we can now state and prove the main well-posedness result for the salt & pepper–Gaussian noise combination described in Section 2.1 and for the Gaussian–Poisson model in Section 2.2, by means of standard tools of calculus of variations. We refer the reader also to [58] and [10], where similar results are proved for $L^1$ and Kullback–Leibler-type data fidelities, respectively.
Theorem 4.1.
Proof.
Let $(u_k)_k$ be a minimising sequence for the functional in (4.1). Such a sequence exists since the functional is non-negative. Neglecting the positive contribution given by the total variation term, we have:

(4.5) $\lambda_1\, \phi_1\big(u_k, f - v(u_k)\big) + \lambda_2\, \phi_2\big(v(u_k)\big) \le C$

for some finite constant $C$. To show the uniform boundedness of the sequence in $BV(\Omega)$ we first observe that, using the positivity of the data fidelity together with (4.5), we have

(4.6) $\sup_k\; |Du_k|(\Omega) < \infty.$
Next, in order to get appropriate bounds for $u_k$ in $L^1(\Omega)$, we need to differentiate between the two cases considered.

Gaussian–salt & pepper case: For $\Phi_{\lambda_1, \lambda_2}$ as in (2.2), we get from (4.5), by Young's inequality: