Geometric scaling: a simple preconditioner for certain
linear systems with discontinuous coefficients
Abstract
Linear systems with large differences between coefficients (“discontinuous coefficients”) arise in many cases in which partial differential equations (PDEs) model physical phenomena involving heterogeneous media. The standard approach to solving such problems is to use domain decomposition (DD) techniques, with domain boundaries conforming to the boundaries between the different media. This approach can be difficult to implement when the geometry of the domain boundaries is complicated or the grid is unstructured. This work examines the simple preconditioning technique of scaling the equations by dividing each equation by the $L_p$-norm of its coefficients. This preconditioning is called geometric scaling (GS). It has long been known that diagonal scaling can be useful in improving convergence, but there is no study on the general usefulness of this approach for discontinuous coefficients. GS was tested on several nonsymmetric linear systems with discontinuous coefficients derived from convection-diffusion elliptic PDEs with small to moderate convection terms. It is shown that GS improved the convergence properties of restarted GMRES and BiCGSTAB, with and without the ILUT preconditioner. GS was also shown to improve the distribution of the eigenvalues by reducing their concentration around the origin very significantly.
Keywords. BiCGSTAB; diagonal scaling; discontinuous coefficients; domain decomposition; geometric scaling; GMRES; GS; $L_p$-norm; linear equations; nonsymmetric equations; parallel processing; partial differential equations.
1 Introduction
Many physical phenomena are modeled by partial differential equations (PDEs) which describe the relations between one or more scalar or vector fields and the surrounding media. When boundary conditions are prescribed, a common approach to achieving a numerical solution is to impose a grid and discretize the equations, thus getting a system of linear equations. In some cases, this approach yields a system of equations with very large differences between coefficients of the equations. Examples of such systems arise in modeling flow through heterogeneous media with widely-varying permeability, oil reservoir simulation, electromagnetics and semiconductor modeling. Such systems are commonly referred to as systems with “discontinuous coefficients”.
One of the most common methods for tackling such problems is the domain-decomposition (DD) approach, in which the domain is partitioned into subdomains, with the subdomain boundaries conforming to the boundaries between the different media. DD techniques typically operate as follows: some boundary conditions are assumed to exist on the interfaces between subdomains, and a solution of the equations in each subdomain is obtained, often with only low accuracy. The boundary conditions at the interfaces are then updated according to these solutions, and the process is repeated until convergence is achieved. There exists a vast literature on this subject, and a review of the area is beyond the scope of this work; see, for example, [Glowinski88, Smith96, Quarteroni99, Rice00]. DD techniques may be difficult to implement on unstructured grids or when the boundaries between domains have a complicated geometry.
We consider nonsymmetric linear systems with discontinuous coefficients derived from convection-diffusion elliptic PDEs with small to moderate convection terms. It is shown that a simple preconditioning technique, which we call geometric scaling (GS), can be applied to the system of equations in order to improve the convergence properties of algorithms applied to the system. GS($p$), with an integer parameter $p$, consists of dividing each equation by the $L_p$-norm of its vector of coefficients. GS, which is a particular form of diagonal scaling on the left, is geometric in the following sense: after applying GS($p$) to a linear system, all the rows of the system matrix have a unit $L_p$-norm.
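As a concrete illustration, GS($p$) can be sketched in a few lines of NumPy (a minimal sketch; the matrix below is an arbitrary stand-in with large row-to-row differences, not one of this paper's test systems):

```python
import numpy as np

def geometric_scaling(A, b, p=2):
    """Divide each equation (row of A and entry of b) by the L_p norm
    of that row's coefficients -- this is GS(p)."""
    norms = np.linalg.norm(A, ord=p, axis=1)   # L_p norm of each row
    return A / norms[:, None], b / norms

# A toy system with "discontinuous coefficients": rows differ by several
# orders of magnitude, as happens when a PDE coefficient jumps across media.
A = np.array([[1e6, 2e6, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1e-3, 2e-3]])
b = np.array([3e6, 5.0, 3e-3])

As, bs = geometric_scaling(A, b, p=2)
# After GS(2), every row of the scaled matrix has unit L_2 norm,
# while the solution of the system is unchanged.
print(np.linalg.norm(As, axis=1))                                   # -> [1. 1. 1.]
print(np.allclose(np.linalg.solve(A, b), np.linalg.solve(As, bs)))  # True
```

Note that GS only rescales each equation; the solution set of the system is untouched, which is why it can be combined freely with any solver.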
In order to examine the general usefulness of GS in conjunction with various solution methods, we tested it with the two leading Krylov subspace solvers for nonsymmetric systems: GMRES [Saad86] and BiCGSTAB [Vorst92]. Both algorithms were tested with and without the ILUT preconditioner [Saad94] on several nonsymmetric problems taken from (or based on) Saad [Saad03], van der Vorst [Vorst92], and Graham and Hagger [Graham99]. Four basic problems were considered, as well as several variations of the problems, such as modifications of the differences between the coefficients and various grid sizes. Results are provided for three different levels of accuracy.
Our experiments used the $L_2$ norm, but the $L_1$ norm yielded similar results. Our results can be characterized as follows:

- In most cases, when the tested method (algorithm/preconditioner combination) converges to the specified accuracy criterion, GS speeds up the convergence time.
- In many cases, when the tested method stagnates or diverges on the original system, it converges on the scaled system.
- GS was not particularly useful on problems without discontinuous coefficients.
- GS was not helpful on problems derived from PDEs with large convection terms. Such PDEs produce linear systems with very large off-diagonal elements, and these require other solution methods.
We also provide data on the effect of GS on the distribution of the eigenvalues. It is generally considered to be detrimental to the convergence of Krylov subspace methods to have a large accumulation of eigenvalues near the origin. In the cases examined here, there is a large accumulation of eigenvalues near the origin, and GS appears to “push” many eigenvalues away from the origin. Also, the eigenvalues of the scaled system appear to be distributed similarly to those obtained without scaling but with equal-sized coefficients.
Note that the above choice of algorithm/preconditioner combinations does not imply that these methods are optimal for the above problems. Finding an optimal method is problemdependent and a topic for further research.
Diagonal scaling is not new, and its usefulness for discontinuous coefficients has also been noted. Van der Sluis [Sluis69] deals with the effect of scaling on the condition number of matrices. Widlund [Widlund71, pp. 34–35] writes that well-scaled ADI methods give good rates of convergence when the coefficients of elliptic problems vary very much in magnitude (ADI stands for the alternating direction implicit method [Peaceman55]). Graham and Hagger [Graham99, pp. 2042–2043] write that diagonal scaling has been observed in practical computations to be very effective as a preconditioner for problems with discontinuous coefficients. Duff and van der Vorst [Duff98] write that on vector machines, diagonal scaling is often competitive with other approaches. Regarding diagonal scaling of symmetric matrices, see Meurant [Meurant99, Th. 8.1, 8.2], who also has some reservations about the usefulness of diagonal scaling for parallel processing [Meurant99, p. 401]. Gambolati et al. [Gambolati03] use the least square logarithm (LSL) scaling on the rows and the columns of the system matrix for a certain problem in geomechanics with discontinuous coefficients.
However, most previous works are concerned with symmetric systems and diagonal scaling refers to multiplying the system matrix on the left and on the right by the same diagonal matrix, in order to preserve symmetry. Also, there is no study on the general usefulness of geometric scaling for nonsymmetric problems with discontinuous coefficients. The idea of geometric scaling was found to be useful for certain problems in image reconstruction from projections; see [Gordon07].
DD methods are often mentioned in connection with parallel processing. The equations of different domains can, in principle, be solved in parallel by different processors. However, the different domains that arise in many practical situations do not necessarily lead to an optimal load-balancing assignment of equations to processors. With the GS approach, inherently parallel algorithms such as BiCGSTAB and GMRES can be implemented efficiently in parallel without regard to subdomain boundaries. Needless to say, GS is inherently parallel.
2 General background
Throughout the rest of the paper, we assume that all vectors are column vectors, and we use the following notation: $\langle u, v\rangle$ denotes the dot product of two vectors $u$ and $v$, which is also $u^T v$. Given a vector $v$, we denote its $L_p$-norm by $\|v\|_p$. For $p = 2$, we will omit the index and just write $\|v\|$. If $A$ is an $m \times n$ matrix, we denote by $a_i$ the $i$th row-vector of $A$; i.e., $a_i = (a_{i1}, \ldots, a_{in})^T$.
Consider a system of $m$ linear equations in $n$ variables:

(2.1)  $Ax = b$
We shall assume throughout that (2.1) is consistent and that $A$ does not contain a row of zeros. For $p \geq 1$, we define the diagonal matrix $D = \operatorname{diag}(1/\|a_1\|_p, \ldots, 1/\|a_m\|_p)$. The geometrically-scaled system is defined as
(2.2)  $DAx = Db$
In some algorithms, GS($p$) is an inherent step in the following sense: either the scaling is executed at the beginning as an intrinsic part of the algorithm, or executing the algorithm produces results identical to those obtained when GS($p$) is done at the beginning. As an example, it is easy to see that GS(2) is inherent in the Kaczmarz algorithm (KACZ) [Kaczmarz37]. KACZ can be described geometrically as follows: starting from an arbitrary point as the initial iterate, KACZ projects the current iterate orthogonally towards a hyperplane defined by one of the equations. The hyperplanes are chosen in cyclic order.
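This inherence is easy to check numerically: a KACZ projection step is invariant under rescaling an equation, so a step computed on the original system coincides with the step computed on the GS(2)-scaled system (a small sketch with arbitrary, hypothetical data):

```python
import numpy as np

def kacz_step(a, beta, x):
    """Orthogonal projection of x onto the hyperplane <a, x> = beta."""
    return x + (beta - a @ x) / (a @ a) * a

a = np.array([2.0, -1.0, 4.0])   # one equation's coefficients
beta = 3.0
x = np.array([1.0, 2.0, 0.5])    # current iterate

c = 1.0 / np.linalg.norm(a)      # GS(2) divides the row by its L_2 norm
step_plain = kacz_step(a, beta, x)
step_scaled = kacz_step(c * a, c * beta, x)
# The two projections coincide: GS(2) is inherent in KACZ.
print(np.allclose(step_plain, step_scaled))  # True
```

Algebraically, replacing $(a, \beta)$ by $(ca, c\beta)$ leaves the update $x + \frac{\beta - \langle a, x\rangle}{\langle a, a\rangle}\, a$ unchanged, since the factors of $c$ cancel.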
The tests were run using the AZTEC software library [Tuminaro99], which includes a wide range of algorithms and preconditioners suitable for sequential and parallel implementations. Geometric scaling with the $L_1$ norm is a built-in option in AZTEC (where it is called “row-sum scaling”). As mentioned, we used BiCGSTAB and GMRES, with and without ILUT. Restarted GMRES was used with a Krylov subspace size of 10. In AZTEC, GMRES is implemented with a double classical Gram-Schmidt orthogonalization step. ILUT was implemented with AZTEC’s default parameters of drop tolerance = 0 and fill-in = 1. These parameters are not necessarily optimal for the tested examples, but since our purpose was to demonstrate the general usefulness of GS, we stuck with a commonly-used restart value for GMRES and AZTEC’s default values for ILUT. No doubt, better convergence properties and runtimes would be obtained if these parameters were fine-tuned to each problem. The eigenvalue computations were done with the LAPACK linear algebra package [lapack].
We also experimented with the option of multiplying the system matrix by $D$ on the left and on the right, resulting in the system $DADy = Db$, with $x = Dy$. In one problem, the results were somewhat poorer than those obtained with GS, but otherwise, the results were similar. Note that this option is computationally more expensive. Gambolati et al. [Gambolati03] remark that when using ILU(0), the iteration matrix of BiCGSTAB on a diagonally scaled system (on the left and/or the right) is theoretically the same as the one with ILU(0) on the original matrix. In our experiments, the effect of ILUT was identical to that of ILU(0), so in theory, BiCGSTAB with ILUT on the original and the scaled systems should have behaved similarly. However, our experiments indicated that using GS produces much better numerical results, and this also holds for GMRES with ILUT. This is probably due to the fact that the scaled system does not have very large differences between coefficients, so it is much less prone to roundoff errors.
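The two-sided option mentioned above can be sketched as follows (an illustrative stand-in matrix, not one of the paper's systems): solve $DADy = Db$ and recover $x = Dy$; both variants leave the solution of the original system unchanged.

```python
import numpy as np

A = np.array([[1e5, 1.0, 0.0],
              [2.0, 1e-4, 1.0],
              [0.0, 3.0, 7.0]])
b = np.array([1.0, 2.0, 3.0])

d = 1.0 / np.linalg.norm(A, axis=1)       # D = diag(1/||a_i||_2)
D = np.diag(d)

# Left-only scaling (GS): solve D A x = D b -- same solution x.
x_gs = np.linalg.solve(D @ A, D @ b)

# Two-sided scaling: solve (D A D) y = D b, then recover x = D y.
y = np.linalg.solve(D @ A @ D, D @ b)
x_two = D @ y

print(np.allclose(x_gs, np.linalg.solve(A, b)))   # True
print(np.allclose(x_two, np.linalg.solve(A, b)))  # True
```

The extra cost of the two-sided variant comes from the additional matrix product and the back-substitution $x = Dy$ at the end.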
2.1 Setup of the numerical experiments
In two dimensions, the general form of the second-order differential equations in this study was

$$(a u_x)_x + (b u_y)_y + \cdots = F,$$

where $a$ and $b$ are given functions of $x$ and $y$, “$\cdots$” stands for lower-order derivatives, and $F$ is a prescribed RHS. In three dimensions, there are three given coefficient functions, and a second-order partial derivative w.r.t. $z$ was also included. Boundary conditions were either Dirichlet or mixed Dirichlet and Neumann. The regions were taken as the unit square or the unit cube. The discretization of the second-order derivatives at a given grid point was done using central differences; e.g., $(a u_x)_x$ was approximated as

$$(a u_x)_x \approx \frac{a_{i+1/2,j}\,(u_{i+1,j}-u_{i,j}) - a_{i-1/2,j}\,(u_{i,j}-u_{i-1,j})}{h^2},$$

where $h$ is the grid spacing and $a_{i \pm 1/2,j}$ denotes the value of $a$ at the midpoint between grid points.
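In one dimension, the same central-difference recipe gives the stencil $[\,a_{i-1/2},\; -(a_{i-1/2}+a_{i+1/2}),\; a_{i+1/2}\,]/h^2$ at grid point $i$. The following sketch (illustrative; not one of the paper's test problems) assembles such a matrix for a coefficient that jumps by three orders of magnitude, showing how the jump produces large differences between equations:

```python
import numpy as np

def diffusion_matrix_1d(a_mid, h):
    """Central-difference matrix for (a u_x)_x on an equally spaced grid.

    a_mid[i] is the coefficient a at midpoint i+1/2; Dirichlet boundary
    values are assumed to have been eliminated from the system.
    """
    n = len(a_mid) - 1                    # number of interior unknowns
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = -(a_mid[i] + a_mid[i + 1]) / h**2
        if i > 0:
            A[i, i - 1] = a_mid[i] / h**2
        if i < n - 1:
            A[i, i + 1] = a_mid[i + 1] / h**2
    return A

n, h = 15, 1.0 / 16
x_mid = (np.arange(n + 1) + 0.5) * h                         # midpoint abscissas
a_mid = np.where((x_mid > 0.25) & (x_mid < 0.75), 1e3, 1.0)  # jump by 10^3
A = diffusion_matrix_1d(a_mid, h)

row_norms = np.linalg.norm(A, axis=1)
print(row_norms.max() / row_norms.min())   # large spread between equations
```

Rows belonging to the high-coefficient region are roughly three orders of magnitude larger than the others, which is exactly the situation GS is meant to repair.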
All problems were discretized with equally-spaced grids, and the initial iterate was taken as $x^0 = 0$. The tests were run on a Pentium IV 2.8 GHz processor with 3GB memory, running Linux.
2.2 Stopping tests
There are several stopping criteria which one may apply to iterative methods. Our stopping criterion used the relative residual $\|b - Ax^k\| / \|b - Ax^0\| \leq \varepsilon$, for three prescribed values of $\varepsilon$. In some of the cases, this was not attainable. Since this stopping criterion depends on the scaling of the equations, we always made this test on the geometrically-scaled system using the $L_2$ norm. In order to limit the time taken by the methods implemented in AZTEC, the maximum number of iterations was set to 10,000. The AZTEC library has several other built-in stopping criteria: numerical breakdown, numerical loss of precision and numerical ill-conditioning. In the results, we denote the relative residual by relres, and nonconvergence by “no conv.”
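The test can be sketched as follows (a hypothetical helper, with an arbitrary stand-in system): the residual is measured on the GS-scaled system, so the outcome does not depend on how the original equations happened to be scaled.

```python
import numpy as np

def relres_on_scaled(A, b, x, p=2):
    """Relative residual of iterate x, measured on the GS(p)-scaled system."""
    norms = np.linalg.norm(A, ord=p, axis=1)
    As, bs = A / norms[:, None], b / norms
    return np.linalg.norm(bs - As @ x) / np.linalg.norm(bs)

A = np.array([[1e6, 1.0],
              [2.0, 3.0]])
b = A @ np.array([1.0, 1.0])                 # exact solution is (1, 1)

print(relres_on_scaled(A, b, np.zeros(2)))   # -> 1.0 at the zero iterate
print(relres_on_scaled(A, b, np.ones(2)))    # essentially 0 at the solution
```

Because both the residual and its normalization are computed from the same scaled system, multiplying any original equation by a constant leaves the reported relres unchanged.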
One should note that the test for numerical breakdown in AZTEC uses the machine precision DBL_EPSILON, and this may result in a premature notice of numerical breakdown in some cases. To get around this problem, we multiplied the variable brkdown_tol in the BiCGSTAB algorithm by a small factor. (brkdown_tol is normally set equal to DBL_EPSILON, which is approximately $2.2 \times 10^{-16}$ on our machine.)
3 Problem 1
Problem 1 is taken from Saad [Saad03, §3.7, problem F2DB]. It is a two-dimensional PDE of the general form of Section 2.1, whose diffusion coefficients $a(x,y)$ and $b(x,y)$ take a large constant value in an inner square of the domain and the value 1 elsewhere. The Dirichlet boundary condition is used on the boundary, and the RHS is immaterial since the vector $b$ of the linear system is chosen as $b = A\hat{x}$, where $A$ is the system matrix and $\hat{x}$ is a fixed solution vector.
Table 3.1 shows the number of iterations and runtimes of the various algorithm/preconditioner methods, with and without GS. In cases of stagnation, the table shows the relative residual that was achieved before stagnation. Note that GS enables the convergence of BiCGSTAB without ILUT; this is useful in a parallel setting, because BiCGSTAB is inherently parallel but ILUT is not an ideal parallel preconditioner. We can also see that GS is slightly helpful to BiCGSTAB with ILUT, and, for low accuracy, also to GMRES with and without ILUT. Another observation is that when GMRES (with and without ILUT) stagnates before reaching the prescribed convergence goal, GS postpones the stage at which stagnation sets in, and enables convergence to a level that is acceptable for most practical applications.
Table 3.1: No. of iterations and time (in sec.)
Method  relres (low)  relres (mid)  relres (high)
BiCGSTAB  no conv.  no conv.  no conv.  
with GS  91  (0.30)  299  (0.99)  361  (1.19) 
BiCGSTAB+ILUT  31  (0.23)  107  (0.67)  142  (0.88) 
with GS  30  (0.23)  90  (0.59)  130  (0.81) 
GMRES  converged to  
with GS  265  (0.85)  converged to  
GMRES+ILUT  converged to  
with GS  39  (0.23)  converged to 
Three additional experiments, based on Problem 1, were also done:

- The values of $a$ and $b$ were increased further in the inner square. The results were very similar to those shown in Table 3.1, but with slightly increased runtimes in the higher-accuracy cases.
- A “continuous” case: the values of $a$ and $b$ were taken as 1 throughout the unit square. Here, all the methods converged, and GS made very little difference.
- A second continuous case, with $a$ and $b$ taken as the large inner-square value throughout the unit square. Table 3.2 shows the results of this experiment. As can be seen, GS made very little difference.
Table 3.2: No. of iterations and time (in sec.), second continuous case
Method  relres (low)  relres (mid)  relres (high)
BiCGSTAB  121  (0.41)  224  (0.76)  286  (0.96) 
with GS  123  (0.42)  217  (0.74)  261  (0.88) 
BiCGSTAB+ILUT  39  (0.28)  61  (0.41)  85  (0.56) 
with GS  40  (0.29)  60  (0.40)  85  (0.56) 
GMRES  1365  (4.31)  3704  (11.69)  6045  (19.19) 
with GS  1356  (4.28)  3582  (11.45)  5722  (18.16) 
GMRES+ILUT  140  (0.69)  359  (1.69)  579  (2.67) 
with GS  140  (0.69)  359  (1.69)  579  (2.67) 
We turn now to studying the effect of GS on the distribution of the eigenvalues. For Problem 1, this was done with a reduced grid size for the purpose of a clear presentation. Table 3.3 shows the values of the minimum and maximum eigenvalues and the condition number, for the original, the scaled and the continuous cases of Problem 1. Also shown for each case is the number of eigenvalues in the first interval (out of 100) in a histogram of the eigenvalue distribution. We can see that while the condition number is not changed much by the scaling, the number of eigenvalues in the first interval is reduced very significantly.
Table 3.3: Eigenvalue data for Problem 1
Matrix  |λ|min  |λ|max  cond.  No. of eigenvalues in first interval
Original  1.87E-2  7.96E+3  4.25E+5  1126
With GS  7.07E-6  1.78E+0  2.52E+5  4
Cont. coef.  1.17E+1  8.00E+3  6.81E+2  8
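The first-interval counts in Table 3.3 can be reproduced in outline as follows: divide the range of eigenvalue magnitudes into 100 equal intervals and count the eigenvalues falling into the interval nearest the origin. The sketch below uses a small one-dimensional stand-in matrix with a $10^3$ coefficient jump, not the actual Problem 1 operator:

```python
import numpy as np

def first_interval_count(A, bins=100):
    """Number of eigenvalues of A whose magnitude falls in the first of
    `bins` equal intervals spanning [0, max |lambda|]."""
    mags = np.abs(np.linalg.eigvals(A))
    counts, _ = np.histogram(mags, bins=bins, range=(0.0, mags.max()))
    return counts[0]

# Stand-in: 1D diffusion matrix (cf. Section 2.1) with a 10^3 jump in the
# coefficient, before and after GS(2).
n = 50
a_mid = np.where(np.arange(n + 1) < n // 2, 1.0, 1e3)  # midpoint coefficients
A = (np.diag(-(a_mid[:-1] + a_mid[1:]))
     + np.diag(a_mid[1:-1], 1) + np.diag(a_mid[1:-1], -1))
As = A / np.linalg.norm(A, axis=1)[:, None]            # GS(2)

# Scaling thins out the cluster of eigenvalues nearest the origin.
print(first_interval_count(A), first_interval_count(As))
```

In the unscaled matrix, all eigenvalues contributed by the low-coefficient region are tiny relative to the largest ones, so they crowd into the first interval; after GS the spread of magnitudes is far smaller.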
Figure 3.1 shows the distribution of the eigenvalues of Problem 1 at the full scale (left) and with a zoom (right). We can see that there is a congestion of eigenvalues close to zero. Figure 3.2 shows the distribution of the eigenvalues of Problem 1 after geometric scaling with the $L_1$ norm and with the $L_2$ norm. Both scalings produce fairly similar distributions; however, we will present results throughout the paper for the $L_2$ scaling. As to the continuous version of the problem, its eigenvalues were all real and distributed evenly over their range.
Figure 3.3 shows a histogram of the eigenvalue distributions for the original Problem 1, for the geometrically scaled problem, and for the continuous case.
4 Problem 2
Problem 2 is based on Example 2 from van der Vorst [Vorst92], to which we added convection terms. This example demonstrates that GS also works with mixed Dirichlet and Neumann boundary conditions. The governing PDE is of the general form of Section 2.1, with the boundary conditions shown in Figure 4.4. In the original example, the diffusion coefficient takes a fixed value in an internal square, but we also tested two larger internal values. Also, the internal square in our case is somewhat smaller. The unit square was discretized with an equally-spaced grid. Together with the boundary equations, the system consisted of 22,952 equations. The resulting system is indefinite, with eigenvalues in the four quadrants of the complex plane.
Tables 4.4, 4.5 and 4.6 show the number of iterations and runtimes of the various algorithm and preconditioner combinations, with and without GS, for the three internal values of the diffusion coefficient, respectively. When there was no convergence to the prescribed goal, the table shows the relative residual that was achieved.
Table 4.4: No. of iterations and time (in sec.), first internal value
Method  relres (low)  relres (mid)  relres (high)
BiCGSTAB  no conv.  no conv.  no conv.  
with GS  591  (2.13)  1025  (3.69)  conv. to  
BiCGSTAB+ILUT  92  (0.75)  139  (1.09)  215  (1.68) 
with GS  88  (0.72)  125  (1.00)  324  (2.45) 
GMRES  converged to 6.15  
with GS  converged to  
GMRES+ILUT  converged to 0.927  
with GS  3361  (19.86)  converged to 
Table 4.5: No. of iterations and time (in sec.), second internal value
Method  relres (low)  relres (mid)  relres (high)
BiCGSTAB  no conv.  no conv.  no conv.  
with GS  589  (2.01)  707  (2.41)  1493  (5.37) 
BiCGSTAB+ILUT  96  (0.76)  192  (1.44)  255  (1.86) 
with GS  79  (0.64)  121  (0.92)  163  (1.19) 
GMRES  converged to 1.84  
with GS  converged to  
GMRES+ILUT  converged to 0.924  
with GS  converged to 
Table 4.6: No. of iterations and time (in sec.), third internal value
Method  relres (low)  relres (mid)  relres (high)
BiCGSTAB  no conv.  no conv.  no conv.  
with GS  224  (0.82)  730  (2.65)  conv. to  
BiCGSTAB+ILUT  96  (1.15)  192  (1.56)  conv. to  
with GS  21  (0.23)  125  (0.98)  217  (1.68) 
GMRES  converged to 1.37  
with GS  357  (1.38)  converged to  
GMRES+ILUT  converged to 0.90  
with GS  40  (0.32)  converged to 
The tables show that GS was helpful in all cases, with the exception of BiCGSTAB with ILUT in Table 4.4, where it increased the number of iterations required in the highest-accuracy case. A close examination of this case revealed that starting from iteration 220, BiCGSTAB with ILUT (after GS) was extremely oscillatory, with fluctuations of up to three orders of magnitude in the relative residual; the required convergence goal was almost attained at 225 iterations.
The three tables show that the behavior of GS is not necessarily “smooth”: consider the results of BiCGSTAB with ILUT at the highest accuracy: we can see that the goal was not reached for two of the internal values, but it was reached for the third. The explanation for this is the large number of iterations required by BiCGSTAB in this case, as seen in Table 4.5, and the oscillatory nature of BiCGSTAB. The accumulated roundoff error is sometimes too large to reach the convergence goal, and while the updated error measured by the algorithm showed convergence, the true relative residual did not always decrease sufficiently.
Another interesting point is the behavior of GS with GMRES and ILUT in the low-accuracy case: for the smallest internal value, there was convergence after many iterations, for the intermediate value there was no convergence, and then for the largest value there was very fast convergence. In order to examine this behavior, we also tested this particular case with additional internal values, and we noticed that from a certain value onward, GS improved the convergence of GMRES by one order of magnitude for every increase of the internal value by one order of magnitude. No doubt, this phenomenon should be studied further. In any case, for the larger internal values, GS enabled the convergence of GMRES, with and without ILUT, to practical levels of accuracy.
For the eigenvalue data, we discretized the domain with a coarser grid. Table 4.7 provides the basic eigenvalue information for Problem 2, for the original and the scaled matrices. Also shown are the eigenvalues for a variation of Problem 2 with “continuous” coefficients, obtained by taking the internal value of the diffusion coefficient throughout the unit square. The last column of the table shows the number of eigenvalues whose real part lies in the same interval as the origin, when the range of the real parts of the eigenvalues is divided into 100 intervals. We can see that GS reduces the condition number by three orders of magnitude, and the reduction is even greater w.r.t. the continuous case. Furthermore, the number of eigenvalues around the origin is reduced very significantly, even below that of the continuous case.
Table 4.7: Eigenvalue data for Problem 2
Matrix  |λ|min  |λ|max  cond.  Eigenvalues around the origin
Original  3.33E-1  1.34E+7  4.02E+7  1058
With GS  9.15E-5  1.78E+0  1.95E+4  8
Continuous coef.  1.48E-2  1.34E+7  9.09E+8  130
Figure 4.5 shows the distribution of the eigenvalues for the original Problem 2 and for the geometrically scaled problem. It can be seen that the eigenvalues of the original matrix are very concentrated around the origin. In the scaled matrix, there are very few eigenvalues around the origin, but many eigenvalues have their real part around 0.6.
Figure 4.6 provides another view of the eigenvalue distribution with a histogram of the real part of the eigenvalues for the original and the scaled matrices.
5 Problem 3
Problem 3 is also taken from van der Vorst [Vorst92, Example 4]. It describes a certain groundwater flow problem which leads to a nonsymmetric system, with a complex geometry and several jumps in the coefficients of the equations. This problem is well-known for its difficulty. The basic equation is of the general form of Section 2.1, with the coefficients and the Dirichlet boundary conditions taken as shown in Figure 5.7. The unit square was discretized with an equally-spaced grid, resulting in 16,129 equations.
No method converged without ILUT, so we present in Table 5.8 only the results with ILUT, with and without GS. On this difficult problem, GS reduced the runtime of BiCGSTAB with ILUT by about 25%. GS also enabled the convergence of GMRES with ILUT, though the achieved runtimes were an order of magnitude larger than those of the scaled BiCGSTAB with ILUT in the higher-accuracy cases.
Table 5.8: No. of iterations and time (in sec.), Problem 3 (with ILUT)
Method  relres (low)  relres (mid)  relres (high)
BiCGSTAB+ILUT  93  (0.60)  124  (0.79)  152  (0.95) 
with GS  67  (0.44)  90  (0.59)  112  (0.72) 
GMRES+ILUT  no conv.  no conv.  no conv.  
with GS  338  (1.58)  1008  (4.61)  1683  (7.71) 
For the eigenvalue data, we discretized Problem 3 with a coarser grid. The data for the original and the scaled matrices is summarized in Table 5.9, which also shows the number of eigenvalues in the first interval (out of 100) of the eigenvalue histogram. We can see that the condition number is decreased very significantly, and so is the number of eigenvalues in the first interval.
Table 5.9: Eigenvalue data for Problem 3
Matrix  |λ|min  |λ|max  cond.  No. of eigenvalues in first interval
Original  6.97E-5  7.93E+4  1.14E+9  1284
With GS  1.21E-4  1.78E+0  1.47E+4  167
Figure 5.8 shows the distribution of the eigenvalues for the original (with a zoom) and the scaled matrices of Problem 3.
6 Problem 4
This problem is based on a three-dimensional problem from Graham and Hagger [Graham99], to which we added convection terms. The differential equation is of the general form of Section 2.1, where the domain is the unit cube and the discontinuous coefficient is defined piecewise, taking a large constant value in part of the domain and the value 1 elsewhere. Dirichlet boundary conditions are prescribed on the boundary planes. Two values of the coefficient jump were tested: the value from the original problem and a larger one. Two grids, one coarser and one finer, were tested. The resulting linear systems are indefinite, with eigenvalues in the four quadrants of the complex plane. This problem will also be used to test the limit of usefulness of GS as the convection increases.
Due to the many different cases, the data for this problem is presented differently. Table 6.10 shows the number of iterations required for convergence for one of the grids, with the two tested values of the coefficient jump. We can see that GS enabled the convergence of BiCGSTAB without ILUT, and it speeded up BiCGSTAB with ILUT quite significantly. GS also enabled the convergence of GMRES, with and without ILUT, at the low-accuracy level for both coefficient values, and at the mid-accuracy level for one of them.
Table 6.10: No. of iterations (two coefficient values per accuracy level)
Method  relres (low)  relres (mid)  relres (high)
BiCGSTAB  —  —  —  —  —  — 
with GS  112  111  262  160  406  374 
BiCGSTAB+ILUT  77  52  112  139  123  169 
with GS  13  13  52  19  84  86 
GMRES  —  —  —  —  —  — 
with GS  208  207  —  286  —  — 
GMRES+ILUT  —  —  —  —  —  — 
with GS  28  28  —  46  —  — 
Note: “—” denotes no convergence to the prescribed accuracy. 
Figures 6.10 and 6.11 show bar plots of the runtimes for the two grid sizes and the two coefficient values, for the three levels of accuracy. In cases of stagnation, the figures show (in parentheses) the relative residual achieved before stagnation.
The results can be summarized as follows.

- BiCGSTAB (without ILUT) and GMRES (with and without ILUT) did not converge in any of the cases.
- GS enabled the convergence of BiCGSTAB in all cases, and the convergence of GMRES (with and without ILUT) in the low-accuracy cases and in one mid-accuracy case.
- GS significantly improved the accuracy of GMRES (with and without ILUT) by postponing the stage at which stagnation set in, enabling convergence to much higher levels of accuracy.
- For the finer grid, GS speeded up the convergence of BiCGSTAB with ILUT quite significantly. This was also true for the coarser grid, but only in the low-accuracy case.
- In the above cases, GS (by itself) was generally a better preconditioner than ILUT for BiCGSTAB.
- In all cases with GS, the results for the larger coefficient value were either similar to or better than those for the smaller one.

In summary, GS was beneficial in all cases to all the algorithm/ILUT combinations, and in many cases, GS provided a very significant advantage.
Table 6.11 provides the basic eigenvalue information for Problem 4, for the original and the scaled matrices, and also for a continuous version with the large coefficient value taken everywhere. The grid size for this data was coarser than in the runtime experiments. The last column shows the number of eigenvalues whose real part lies in the same interval as the origin (out of 100 intervals). We can see that although GS doubled the condition number, it reduced the number of eigenvalues around the origin very significantly.
Table 6.11: Eigenvalue data for Problem 4
Matrix  |λ|min  |λ|max  cond.  Eigenvalues around the origin
Original  1.45E-2  2.93E+3  2.02E+5  1131
With GS  2.41E-6  1.00E+0  4.16E+5  25
Cont. coef.  3.61E+0  6.18E+4  1.71E+4  42
Figure 6.12 shows the distribution of the eigenvalues for the original Problem 4, and a zoom on the region near the origin. We can see that the eigenvalues are very concentrated around the origin. Figure 6.13 shows the distribution of the eigenvalues for the scaled and the continuous cases. We can see that these distributions look quite similar, with eigenvalues concentrated around the perimeter.
Figure 6.14 provides a histogram of the real part of the eigenvalues of Problem 4, for the original, the scaled, and the continuous cases. The histogram provides an additional view of how the eigenvalues are distributed in the three cases.
GS is proposed as being useful for discontinuous coefficients when the convection terms are small to moderate. In order to demonstrate this, we considered Problem 4 with a fixed coefficient jump and grid size, and varying convection terms. The previous results were based on convection terms of 100, so we also tested this case with convection terms of 200, 500, and 1000. The results, which are summarized in Table 6.12 below, show how the usefulness of GS degrades as the convection terms are increased.
Table 6.12: Effect of increasing convection (Problem 4)
Method / Convection:  100  200  500  1000

BiCGSTAB  —  —  —  — 
with GS  —  
BiCGSTAB+ILUT  —  —  
with GS  —  —  
GMRES  —  —  —  — 
with GS  —  —  
GMRES+ILUT  —  —  —  — 
with GS  —  —  —  
Note: ‘—’ means no convergence. The numbers  
indicate which relative error goal was obtained. 
Note in particular the difference between BiCGSTAB without ILUT and BiCGSTAB with ILUT: it turns out that without ILUT, GS enables convergence even with convection = 500, but with ILUT, degradation already appears at convection = 200; this degradation is independent of the use of GS.
7 Conclusions and further research
In this paper we have reported on a very simple technique for improving the convergence properties of algorithms on certain linear systems with discontinuous coefficients. The technique, called geometric scaling (GS), consists of simply dividing each equation by the $L_p$-norm of its vector of coefficients. GS was tested on four nonsymmetric problems derived from convection-diffusion elliptic PDEs, with small to moderate convection terms. BiCGSTAB and restarted GMRES, with and without ILUT, were tested on the four problems with and without GS. Normally, such problems are solved by domain decomposition (DD) techniques, but these can be difficult to implement if the boundaries between the subdomains have a complicated geometry or the grid is unstructured. The problems were taken from (or based on) well-known examples from the literature, and included two- and three-dimensional cases, with Dirichlet and mixed Dirichlet/Neumann boundary conditions. One problem had a complicated geometry. Three different convergence goals were prescribed for the relative residual.
Table 7.13 summarizes the convergence behavior of the four different algorithm/preconditioner combinations, with and without GS, on the four different problems, for the three convergence goals. The results indicate that GS was very useful in improving the convergence status of the different solution methods on the tested problems. With one exception, when the solution method converged on a problem with discontinuous coefficients, GS speeded up the convergence. Eigenvalue distribution maps show that when there is a strong concentration of values around the origin, GS reduces this concentration very significantly and “pushes” the values away from the origin.
Table 7.13: Summary of convergence behavior
Method  Problem 1  Problem 2  Problem 3  Problem 4

BiCGSTAB  
with GS  
BiCGSTAB+ILUT  
with GS  
GMRES  1.37  0.27  
with GS  
GMRES+ILUT  0.90  0.29  
with GS  
Note: three symbols denote no convergence, convergence, and better convergence, respectively. The numbers indicate the best relative error obtained.
This study is only a first report on this subject, and much work remains to be done. On the theoretical side, we have seen that GS improves the eigenvalue distribution, but further analysis is still needed. On the practical side, GS needs to be tested in conjunction with other algorithms and preconditioners, on various other difficult problems.
A natural question that arises is how does GS compare with DD techniques? Over the years, these methods have achieved a high level of efficiency, and a head-on runtime comparison is very much problem-dependent and a topic for further research. However, an even more interesting topic is the combination of the two methods: apply GS to the equations and then apply some standard DD method; hopefully, on some problems, GS could also improve the convergence properties of some DD approaches.
GS can also help the parallelization of solution methods in several ways. Firstly, GS itself is a parallel computational step. Secondly, it enables the partitioning of a domain into subdomains along boundaries that do not necessarily follow the physical boundaries of the problem; this way, better load balancing can be achieved. Thirdly, as seen in some of the cases, GS enabled the convergence of algorithms that are inherently parallel without the need for a preconditioner (such as ILUT) that may not be ideal for parallelism.
Another topic for further research is the application of the (sequential) CGMN algorithm [Bjorck79, Gordon08] and its block-parallel version CARP-CG [Gordon08a] on problems that have discontinuous coefficients and are also strongly convection-dominated. These algorithms have already been shown to be very effective on convection-dominated problems, and initial experiments with discontinuous coefficients have yielded good results. The reason for this may be that these two methods are essentially conjugate-gradient accelerations of the Kaczmarz algorithm, and as such, GS (with the $L_2$ norm) is already inherent in them.