Regularization matrices determined
by matrix nearness problems
This paper is concerned with the solution of large-scale linear discrete ill-posed problems with error-contaminated data. Tikhonov regularization is a popular approach to determine meaningful approximate solutions of such problems. The choice of regularization matrix in Tikhonov regularization may significantly affect the quality of the computed approximate solution. This matrix should be chosen to promote the recovery of known important features of the desired solution, such as smoothness and monotonicity. We describe a novel approach to determine regularization matrices with desired properties by solving a matrix nearness problem. The constructed regularization matrix is the closest matrix in the Frobenius norm with a prescribed null space to a given matrix. Numerical examples illustrate the performance of the regularization matrices so obtained.
ikhonov regularization, regularization matrix, matrix nearness problem.
65R30, 65F22, 65F10.
We are concerned with the computation of an approximate solution of linear least-squares problems of the form
with a large matrix with many singular values of different orders of magnitude close to the origin. In particular, is severely ill-conditioned and may be singular. Linear least-squares problems with a matrix of this kind often are referred to as linear discrete ill-posed problems. They arise, for instance, from the discretization of linear ill-posed problems, such as Fredholm integral equations of the first kind with a smooth kernel. The vector of linear discrete ill-posed problems that arise in applications typically represents measured data that is contaminated by an unknown error .
Let denote the unknown error-free vector associated with , i.e.,
and let be the solution of the unavailable linear system of equations
which we assume to be consistent. If is singular, then denotes the solution of minimal Euclidean norm.
Let denote the Moore–Penrose pseudoinverse of . The solution of minimal Euclidean norm of (1), given by
typically is not a useful approximation of due to severe propagation of the error . This depends on the large norm of . Therefore, one generally replaces the least-squares problem (1) by a nearby problem, whose solution is less sensitive to the error . This replacement is known as regularization. One of the most popular regularization methods is due to Tikhonov. This method replaces (1) by a penalized least-squares problem of the form
where is referred to as a regularization matrix and the scalar as a regularization parameter; see, e.g., [1, 9, 11]. Throughout this paper denotes the Euclidean vector norm or the spectral matrix norm. We assume that the matrices and satisfy
where denotes the null space of the matrix . Then the minimization problem (4) has the unique solution
for any . The superscript denotes transposition. When is the identity matrix, the Tikhonov minimization problem (4) is said to be in standard form, otherwise it is in general form. We are interested in minimization problems (4) in general form.
The value of in (4) determines how sensitive is to the error , how close is to the desired solution , and how small the residual error is. A suitable value of generally is not explicitly known and has to be determined during the solution process.
Minimization problems (4) in general form with matrices and of small to moderate size can be conveniently solved with the aid of the Generalized Singular Value Decomposition (GSVD) of the matrix pair ; see, e.g., [7, 11]. We are interested in developing solution methods for large-scale minimization problems (4). These problems have to be solved by an iterative method. However, the regularization matrices derived also may be useful for problems of small size.
Common choices of regularization matrices in (4) when the least-squares problem (1) is obtained by discretizing a Fredholm integral equation of the first kind in one space-dimension are the identity matrix , and scaled finite difference approximations of the first derivative operator,
as well as of the second derivative operator,
The null spaces of these matrices are
The regularization matrices and damp fast oscillatory components of the solution of (4) more than slowly oscillatory components. This can be seen by comparing Fourier coefficients of the vectors , , and ; see, e.g., . These matrices therefore are referred to as smoothing regularization matrices. Here we think of the vector as a discretization of a continuous real-valued function. The use of a smoothing regularization matrix can be beneficial when the desired solution is a discretization of a smooth function.
The regularization matrix in (4) should be chosen so that known important features of the desired solution of (3) can be represented by vectors in , because these vectors are not damped by . For instance, if the solution is known to be the discretization at equidistant points of a smooth monotonically increasing function, then it may be appropriate to use the regularization matrix (7), because its null space contains the discretization of linear functions. Several approaches to construct regularization matrices with desirable properties are described in the literature; see, e.g., [4, 5, 6, 13, 16, 18, 21]. Many of these approaches are designed to yield square modifications of the matrices (6) and (7) that can be applied in conjunction with iterative solution methods based on the Arnoldi process. We will discuss the Arnoldi process more below.
The present paper describes a new approach to the construction of square regularization matrices. It is based on determining the closest matrix with a prescribed null space to a given square nonsingular matrix. For instance, the given matrix may be defined by appending a suitable row to the finite difference matrix (6) to make the matrix nonsingular, and then prescribing a null space, say, (8) or (9). The distance between matrices is measured with the Frobenius norm,
where the inner product between matrices is defined by
Our reason for using the Frobenius norm is that the solution of the matrix nearness problem considered in this paper can be determined with fairly little computations in this setting.
We remark that commonly used regularization matrices in the literature, such as (6) and (7), are rectangular. Our interest in square regularization matrices stems from the fact that they allow the solution of (4) by iterative methods that are based on the Arnoldi process. Application of the Arnoldi process to the solution of Tikhonov minimization problems (4) was first described in ; a recent survey is provided by Gazzola et al. . We are interested in being able to use iterative solution methods that are based on the Arnoldi process because they only require the computation of matrix-vector products with the matrix and, therefore, typically require fewer matrix-vector product evaluations than methods that demand the computation of matrix-vector products with both and , such as methods based on Golub–Kahan bidiagonalization; see, e.g.,  for an example.
This paper is organized as follows. Section 2 discusses matrix nearness problems of interest in the construction of the regularization matrices. The application of regularization matrices obtained by solving these nearness problems is discussed in Section 3. Krylov subspace methods for the computation of an approximate solution of (4), and therefore of (1), are reviewed in Section 4, and a few computed examples are presented in Section 5. Concluding remarks can be found in Section 6.
2 Matrix nearness problems
This section investigates the distance of a matrix to the closest matrix with a prescribed null space. For instance, we are interested in the distance of the invertible square bidiagonal matrix
Square regularization matrices have the advantage over rectangular ones that they can be used together with iterative methods based on the Arnoldi process for Tikhonov regularization [2, 10] as well as in GMRES-type methods . These applications have spurred the development of a variety of square regularization matrices. For instance, it has been proposed in  that a zero row be appended to the matrix (6) to obtain the square regularization matrix
with the same null space. Among the questions that we are interested in is whether there is a square regularization matrix that is closer to the matrix (10) than and has the same null space as the latter matrix. Throughout this paper denotes the range of the matrix .
Let the matrix have orthonormal columns and define the subspace . Let denote the subspace of matrices whose null space contains . Then and the matrix
satisfies the following properties:
if , then ;
if , then .
We have , which shows the first property. The second property implies that , from which it follows that
Finally, for any , we have
where the last equality follows from the cyclic property of the trace.
The following result is a consequence of Proposition 2.
The matrix closest to a given matrix with a prescribed null space also can be characterized in a different manner that does not require an orthonormal basis of the null space. It is sometimes convenient to use this characterization.
Let be the subspace of matrices whose null space contains , where is a rank- matrix. Then the closest matrix to in in the Frobenius norm is , where
Since the columns of are linearly independent, the matrix is positive definite and, hence, invertible. It follows that is an orthogonal projector with null space . The desired result now follows from Proposition 2.
Thus, is smaller than .
We turn to square tridiagonal regularization matrices. The matrix
with the null space (9). We also may apply Proposition 2 below, which is analogous to Proposition 2 in that that no orthonormal basis for the null space is required. The result can be shown by direct computations.
Given , the closest matrix to in the Frobenius norm with a null space containing the linearly independent vectors is given by , where
It follows easily from Proposition 2 that the closest matrix to with null space is , where is an orthogonal projector defined by
The regularization matrices constructed above are generally nonsymmetric. We are also interested in determining the distance between a given nonsingular symmetric matrix, such as , and the closest symmetric matrix with a prescribed null space, such as (9). The following results shed light on this.
Let the matrix be symmetric, let have orthonormal columns and define the subspace . Let denote the subspace of symmetric matrices whose null space contains . Then and the matrix
satisfies the following properties:
if , then ;
if , then .
We have and , which shows the first property. The second property implies that , from which it follows that
Finally, for any , it follows from the cyclic property of the trace that
Proposition 2 characterizes the closest matrix in to a given symmetric matrix . The following proposition provides another characterization that does not explicitly use an orthonormal basis for the prescribed null space. The result follows from Proposition 2 and Corollary 2 in a straightforward manner.
Let be the subspace of symmetric matrices whose null space contains , where is a rank- matrix. Then the closest matrix to the symmetric matrix in in the Frobenius norm is , where is defined by (12).
3 Application of the regularization matrices
In this section we discuss the use of regularization matrices of the form and in the Tikhonov minimization problem (4), where is an orthogonal projector and is nonsingular. We solve the problem (4) by transforming it to standard form in two steps. First, we let and then set . Following Eldén  or Morigi et al. , we express the Tikhonov minimization problem
in the form
Let the columns of form an orthonormal basis for the desired null space of . Then . Determine the QR factorization
where has orthonormal columns and is upper triangular. It follows from (5) that is nonsingular, and we obtain
We turn to the solution of (17). This minimization problem can be expressed in standard form
where . Let solve (20). Then the solution of (17) is given by . In actual computations, we evaluate by solving a linear system of equations with . We can similarly solve the problem (4) with by transforming it to standard form in three steps, where the first two steps are the same as above and the last step is similar to the first step of the case with .
It is desirable that the matrix not be very ill-conditioned to avoid severe error propagation when solving linear systems of equations with this matrix. For instance, the condition number of the regularization matrix , defined by (10), depends on the parameter . Clearly, the condition number of , defined as the ratio of the largest and smallest singular value of the matrix, is large for “tiny” and of moderate size for . In the computations reported in Section 5, we use the latter value.
4 Krylov subspace methods and the discrepancy principle
A variety of Krylov subspace iterative methods are available for the solution of the Tikhonov minimization problem (20); see, e.g., [2, 3, 10, 17] for discussions and references. The discrepancy principle is a popular approach to determining the regularization parameter when a bound for the norm of the error in is known, i.e., . It can be shown that the error in satisfies the same bound. The discrepancy principle prescribes that be chosen so that the solution of (20) satisfies
where is a constant independent of . This is a nonlinear equation of .
We can determine an approximation of by applying an iterative method to the linear system of equations
and terminating the iterations sufficiently early. This is simpler than solving (20), because it circumvents the need to solve the nonlinear equation (21) for . We therefore use this approach in the computed examples of Section 5. Specifically, we apply the Range Restricted GMRES (RRGMRES) iterative method described in . At the th step, this method computes an approximate solution of (22) as the solution of the minimization problem
where is a Krylov subspace. The discrepancy principle prescribes that the iterations with RRGMRES be terminated as soon as an iterate that satisfies
has been computed. The number of iterations required to satisfy this stopping criterion generally increases as is decreased. Using the transformation from to described in Section 3, we transform to an approximate solution of (1). Further details can be found in . Here we only note that can be computed without explicitly evaluating the matrix-vector product .
5 Numerical examples
We illustrate the performance of regularization matrices of the form and . The error vector has in all examples normally distributed pseudorandom entries with mean zero, and is normalized to correspond to a chosen noise level
where denotes the error-free right-hand side vector in (3). We let in (23) in all examples. Throughout this section and denote orthogonal projectors with null spaces (8) and (9), respectively. All computations are carried out on a computer with an Intel Core i5-3230M @ 2.60GHz processor and 8GB ram using MATLAB R2012a. The computations are done with about significant decimal digits.
|reg. mat.||# iterations||# mat.-vec. prod.|
Example 5.1. Consider the Fredholm integral equation of the first kind,
with kernel and solution given by
This equation is discussed by Phillips . We use the MATLAB code phillips from  to discretize (24) by a Galerkin method with orthonormal box functions as test and trial functions. The code produces the matrix and a scaled discrete approximation of . Adding to the latter yields the vector with which we compute the error free right-hand side . This provides an example of a problem for which it is undesirable to damp the -component in the computed approximation of .
Table 1 displays results obtained with RRGMRES for several regularization matrices and different noise levels, and Figure 2 shows three computed approximate solutions obtained for the noise level . The iterations are terminated by the discrepancy principle (23). From Table 1 and Figure 2, we can see that the regularization matrix yields the best approximation of . The worst approximation is obtained when no regularization matrix is used with RRGMRES. This situation is denoted by . Table 1 shows both the number of iterations and the number of matrix-vector product evaluations with the matrix . The fact that the latter number is larger depends on the matrix-vector product evaluations with required to evaluate the left-hand side of (18) and the matrix-vector product with needed for evaluating the product of with a vector; cf. (19).
|reg. mat.||# iterations||# mat.-vec. prod.|
Example 5.2. Regard the Fredholm integral equation of the first kind,
We discretize (25) by a Galerkin method with orthonormal box functions as test and trial functions using the MATLAB program deriv2 from . This program yields a symmetric indefinite matrix and a scaled discrete approximation of the solution of (25). Adding yields the vector with which we compute the error-free right-hand side . Error vectors are constructed similarly as in Example 5.1, and the data vector in (1) is obtained from (2).
Table 2 shows results obtained with RRGMRES for different regularization matrices. The performance for three noise levels is displayed. The iterations are terminated with the aid of the discrepancy principle (23). When , or , and the noise level is , as well as when , and the noise level is or , the initial residual satisfies the discrepancy principle and no iterations are carried out. Figure 3 shows computed approximate solutions obtained for the noise level with the regularization matrix and without regularization matrix. Table 2 and Figure 3 show the regularization matrix to give the most accurate approximations of the desired solution . We remark that addition of the vector to to the solution vector determined by the program deriv2 enhances the benefit of using a regularization matrix different from the identity. The benefit would be even larger, if a larger multiple of the vector were added to the solution.
The above examples illustrate the performance of regularization matrices suggested by the theory developed in Section 2. Other combinations of nonsingular regularization matrices and orthogonal projectors also can be applied. For instance, the regularization matrix performs as well as when applied to the solution of the problem of Example 5.1.
This paper presents a novel method to determine regularization matrices via the solution of a matrix nearness problem. Numerical examples illustrate the effectiveness of the regularization matrices so obtained. While all examples used the discrepancy principle to determine a suitable regularized approximate solution of (1), other parameter choice rules also can be applied; see, e.g., [14, 20] for discussions and references.
SN is grateful to Paolo Buttà for valuable discussions and comments on part of the present work. The authors would like to thank a referee for comments.
-  C. Brezinski, M. Redivo–Zaglia, G. Rodriguez, and S. Seatzu, Extrapolation techniques for ill-conditioned linear systems, Numer. Math., 81 (1998), pp. 1–29.
-  D. Calvetti, S. Morigi, L. Reichel, and F. Sgallari, Tikhonov regularization and the L-curve for large discrete ill-posed problems, J. Comput. Appl. Math., 123 (2000), pp. 423–446.
-  D. Calvetti and L. Reichel, Tikhonov regularization of large linear problems, BIT, 43 (2003), pp. 263–283.
-  D. Calvetti, L. Reichel, and A. Shuibi, Invertible smoothing preconditioners for linear discrete ill-posed problems, Appl. Numer. Math., 54 (2005), pp. 135–149.
-  M. Donatelli, A. Neuman, and L. Reichel, Square regularization matrices for large linear discrete ill-posed problems, Numer. Linear Algebra Appl., 19 (2012), pp. 896–913.
-  M. Donatelli and L. Reichel, Square smoothing regularization matrices with accurate boundary conditions, J. Comput. Appl. Math., 272 (2014), pp. 334–349.
-  L. Dykes, S. Noschese, L. Reichel, Rescaling the GSVD with application to ill-posed problems, Numer. Algorithms, 68 (2015), pp. 531–545.
-  L. Eldén, A weighted pseudoinverse, generalized singular values, and constrained least squares problems, BIT, 22 (1982), pp. 487–501.
-  H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Kluwer, Dordrecht, 1996.
-  S. Gazzola, P. Novati, and M. R. Russo, On Krylov projection methods and Tikhonov regularization, Electron. Trans. Numer. Anal., 44 (2015), pp. 83–123.
-  P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems, SIAM, Philadelphia, 1998.
-  P. C. Hansen, Regularization tools version 4.0 for MATLAB 7.3, Numer. Algorithms, 46 (2007), pp. 189–194.
-  P. C. Hansen and T. K. Jensen, Smoothing norm preconditioning for regularizing minimum residual methods, SIAM J. Matrix Anal. Appl., 29 (2006), pp. 1–14.
-  S. Kindermann, Discretization independent convergence rates for noise level-free parameter choice rules for the regularization of ill-conditioned problems, Electron. Trans. Numer. Anal., 40 (2013), pp. 58–81.
-  B. Lewis and L. Reichel, Arnoldi–Tikhonov regularization methods, J. Comput. Appl. Math., 226 (2009), pp. 92–102.
-  S. Morigi, L. Reichel, and F. Sgallari, Orthogonal projection regularization operators, Numer. Algorithms, 44 (2007), pp. 99–114.
-  A. Neuman, L. Reichel, and H. Sadok, Implementations of range restricted iterative methods for linear discrete ill-posed problems, Linear Algebra Appl., 436 (2012), pp. 3974–3990.
-  S. Noschese and L. Reichel, Inverse problems for regularization matrices, Numer. Algorithms, 60 (2012), pp. 531–544.
-  D. L. Phillips, A technique for the numerical solution of certain integral equations of the first kind, J. ACM, 9 (1962), pp. 84–97.
-  L. Reichel and G. Rodriguez, Old and new parameter choice rules for discrete ill-posed problems, Numer. Algorithms, 63 (2013), pp. 65–87.
-  L. Reichel and Q. Ye, Simple square smoothing regularization operators, Electron. Trans. Numer. Anal., 33 (2009), pp. 63–83.