A composite spectral scheme for variable coefficient Helmholtz problems
P.G. Martinsson, Department of Applied Mathematics, University of Colorado at Boulder
May 31, 2012
Abstract: A discretization scheme for variable coefficient Helmholtz problems on twodimensional domains is presented. The scheme is based on highorder spectral approximations and is designed for problems with smooth solutions. The resulting system of linear equations is solved using a direct solver with complexity for the precomputation and complexity for the solve. The fact that the solver is direct is a principal feature of the scheme, since iterative methods tend to struggle with the Helmholtz equation. Numerical examples demonstrate that the scheme is fast and highly accurate. For instance, using a discretization with 12 points per wavelength, a Helmholtz problem on a domain of size wavelengths was solved to ten correct digits. The computation was executed on an office desktop; it involved 1.6M degrees of freedom and required 100 seconds for the precomputation, and 0.3 seconds for the actual solve.
1. Introduction
The paper describes a technique for constructing an approximate solution to the variable coefficient Helmholtz equation
(1.1) 
where is a rectangular domain with boundary , where the Helmholtz parameter is real, and where is a given smooth scattering potential. The scheme can straightforwardly be adapted to handle other variable coefficient elliptic problems, as well as free space scattering problems in . The primary limitation of the method is that it requires the solution to be smooth in .
The equation (1.1) is discretized via a composite spectral scheme. The domain is split into small square (or rectangular) patches. On each patch, the solution is represented via tabulation on a tensor product grid of Chebyshev points, see Figure 1. The Laplace operator is approximated via a spectral differentiation matrix acting on each local grid, and then equation (1.1) is enforced strongly at all tabulation nodes in the interior of each patch. To glue patches together, continuity of both the potential and its normal derivative are enforced at the spectral interpolation nodes on the boundaries between patches.
The discretization scheme is combined with a direct solver for the resulting linear system. The fact that the solver is direct rather than iterative is a principal feature of the scheme, since iterative solvers tend to struggle for Helmholtz problems of the kind considered here [2]. The direct solver organizes the patches in the discretization into a binary tree of successively larger patches. The solver then involves two stages, one that involves an upwards pass, and one that involves a downwards pass:

A precomputation stage where an approximation to the solution operator for (1.1) is computed. This is done via a single sweep of the hierarchical tree, going from smaller patches to larger. For each leaf in the tree, a local solution operator, and an approximation to the DirichlettoNeumann (DtN) map for the patch are constructed. For a parent node in the tree, a local solution operator and a local DtN operator are computed from an equilibrium equation formed using the DtN operators of the children of the patch. The precomputation stage has asymptotic complexity .

A solve stage that takes as input a vector of Dirichlet data tabulated on , and constructs tabulated values of at all internal grid points. The solve stage involves a single downwards sweep through the hierarchical tree of patches, going from larger patches to smaller. The solve stage has asymptotic complexity .
Numerical experiments indicate that the spectral convergence of the method makes it both highly accurate and computationally efficient. For instance, the equation (1.1) was solved for a box whose size exceeded wavelengths in less than 2 minutes on a standard office laptop. A th order spectral scheme with 12 points per wavelength was used in the local approximation on the patches. The resulting solution was accurate to between 7 and 10 digits, depending on the nature of the scattering potential in (1.1). The discretization used a total of degrees of freedom. The computational time was dominated by the precomputation stage; the actual solve stage took only seconds. This makes the scheme particularly powerful in situations where an equation such as (1.1) needs to be solved for a sequence of different boundary functions .
The scheme proposed is conceptually related to a direct solver for the LippmanSchwinger equation proposed in 2002 by Yu Chen [1]. The schemes are different in that the method proposed here is not based on a LippmanSchwinger formulation, and uses spectral approximations on the smallest patches in the hierarchical tree. Comparing the efficiencies of the two schemes is difficult since the paper [1] does not report numerical results, and we have been unable to find reports of implementations of the scheme. The scheme proposed here is also conceptually related to the classical nested dissection algorithm for finite element and finite difference matrices [3], and to recently proposed direct solvers for BIEs on surfaces in 3D [4].
For clarity, the current paper focusses on the simple boundary value problem (1.1) involving the Helmholtz elliptic operator and Dirichlet boundary data. The scheme can with trivial modifications be applied to more general elliptic operators
coupled with Dirichlet, Neumann, or mixed boundary data. It has for instance been successfully tested on convectiondiffusion problems that are strongly dominated (by a factor of ) by the convection term, see Section 6.4. Moreover, the scheme can with minor modifications be applied to a free space scattering problem such as
(1.2) 
coupled with appropriate radiation conditions at infinity. A standard assumption is that is supported outside of some (bounded) square region while the smooth function is supported inside . The scheme described in this note computes the DtN operator for (1.2) on . The DtN operator for the exterior domain can be computed via Boundary Integral Equation techniques; and by combining the two, one can solve the free space scattering problem, see Section 7.3.
The method proposed has a vulnerability in that it crucially relies on the existence of DtN operators for all patches in the hierarchical tree. This can be problematic due to resonances: For certain wavenumbers , there exist nontrivial solutions that have zero Dirichlet boundary data. We have found that in practice, this problem almost never arises when processing domains that are a couple of hundred wavelengths or less in size. Moreover, if a resonant patch should be encountered, this will be detected and countermeasures can be taken, see Sections 5.3 and 7.5.
The asymptotic complexity of the proposed method is . For the case where the wavenumber is increased as grows to keep a constant number of discretization points per wavelength (i.e. ), we do not know how to improve the complexity. However, for the case where the wavenumber is kept constant as increases, complexity can very likely be attained by exploiting internal structure in the DtN operators. The resulting scheme would be a spectral version of recently published accelerated nested dissection schemes such as [8, 10, 12].
An early version of the work reported was published on arXiv as [9].
The paper is organized as follows: Section 2 introduces notation and lists some classical material on spectral interpolation and differentiation. Section 3 describes how to compute the solution operator and the DtN operator for a leaf in tree (which is discretized via a single tensorproduct grid of Chebyshev nodes). Section 4 describes how the DtN operator for a larger patch consisting of two small patches can be computed if the DtN operators for the smaller patches are given. Section 5 describes the full hierarchical scheme. Section 6 reports the results of some numerical experiments. Section 7 describes how the scheme can be extended to more general situations.
2. Preliminaries — spectral differentiation
This section introduces notation for spectral differentiation on tensor product grids of Chebyshev nodes on the square domain . This material is classical, see, e.g., Trefethen [11]. (While we restrict attention to square boxes here, all techniques generalize trivially to rectangular boxes of moderate aspect ratios.)
Let denote a positive integer. The Chebyshev nodes on are the points
Let denote the set of points of the form for . Let denote the linear space of sums of tensor products of polynomials of degree or less. has dimension . Given a vector , there is a unique function such that for . (A reason Chebyshev nodes are of interest is that for any fixed , the map is stable.) Now define , , and as the unique matrices such that
(2.1)  
(2.2)  
(2.3) 
3. Leaf computation
This section describes the construction of a discrete approximation to the DirichlettoNeumann operator associated with the boundary value problem (1.1) for a square patch . We discretize (1.1) via a spectral method on a tensor product grid of Chebyshev nodes on . In addition to the DtN operator, we also construct a solution operator to (1.1) that maps the Dirichlet data on the nodes on the boundary of to the value of at all internal interpolation nodes.
3.1. Notation
Let denote a square patch. Let denote the nodes in a tensor product grid of Chebyshev nodes. Partition the index set
in such a way that contains all nodes on the boundary of , and denotes the set of interior nodes, see Figure 2(a). Let be a function that satisfies (1.1) on and let
denote the vectors of samples of and its partial derivatives. We define the shorthands
Let , , and denote spectral differentiation matrices corresponding to the operators , , and , respectively (see Section 2). We use the shorthand
to denote the part of the differentiation matrix that maps exterior nodes to interior nodes, etc.
(a)  (b) 
3.2. Equilibrium condition
3.3. Constructing the DtN operator
Let and denote the matrices that map boundary values of the potential to boundary values of and . These are constructed as follows: Given the potential on the boundary, we reconstruct the potential in the interior via (3.2). Then, since the potential is known on all Chebyshev nodes in , we can determine the gradient on the boundary via spectral differentiation on the entire domain. To formalize, we find
where
(3.4) 
An analogous computation for yields
(3.5) 
4. Merge operation
Let denote a rectangular domain consisting of the union of the two smaller rectangular domains,
as shown in Figure 3. Moreover, suppose that approximations to the DtN operators for and are available. (Represented as matrices that map boundary values of to boundary values of and .) This section describes how to compute a solution operator that maps the value of a function that is tabulated on the boundary of to the values of on interpolation nodes on the internal boundary, as well as operators and that map boundary values of on the boundary of to values of the and tabulated on the boundary.
4.1. Notation
Let denote a box with children and . For concreteness, let us assume that and share a vertical edge. We partition the points on and into four sets: Boundary nodes of that are not boundary nodes of . Boundary nodes of that are not boundary nodes of . The two nodes that are boundary nodes of , of , and also of the union box . Boundary nodes of both and that are not boundary nodes of the union box . Figure 3 illustrates the definitions of the ’s. Let denote a function such that
and let , , denote the values of , , and , restricted to the nodes in the set “”. Moreover, set
(4.1) 
Finally, let , , , denote the operators that map potential values on the boundary to values of and on the boundary for the boxes and . We partition these matrices according to the numbering of nodes in Figure 3,
(4.2) 
and
(4.3) 
4.2. Equilibrium condition
Suppose that we are given a tabulation of boundary values of a function that satisfies (1.1) on . In other words, we are given the vectors , , and . We can then reconstruct the values of the potential on the interior boundary (tabulated in the vector ) by using information in (4.2) and (4.3). Simply observe that there are two equations specifying the normal derivative across the internal boundary (tabulated in ), and combine these equations:
Solving for we get
(4.4) 
Now set
(4.5) 
to find that (4.4) is (in view of (4.1)) precisely the desired formula
(4.6) 
The net effect of the merge operation is to eliminate the interior tabulation nodes in and so that only boundary nodes in the union box are kept, as illustrated in Figure 4.
(a)  (b) 
4.3. Constructing the DtN operators for the union box
We will next build a matrix that constructs the derivative on given values of on . In other words
To this end, observe from (4.2) and (4.3) that
(4.7)  
(4.8) 
Equations (4.2) and (4.3) provide two different formulas for , either of which could be used. For numerical stability, we use the average of the two:
(4.9)  
(4.10) 
Combining (4.7) – (4.10) we obtain
In other words,
(4.11) 
An analogous computation for yields
(4.12) 
5. The full hierarchical scheme
5.1. The algorithm
Now that we know how to construct the DtN operator for a leaf (Section 3), and how to merge the DtN operators of two neighboring patches to form the DtN operator of their union (Section 4), we are ready to describe the full hierarchical scheme for solving (1.1).
First we partition the domain into a collection of square (or possibly rectangular) boxes, called leaf boxes. These should be small enough that a small spectral mesh with nodes (for, say, ) accurately interpolates both any potential solution of (1.1) and its partial derivatives , , and . Let denote the points in this mesh. (Observe that nodes on internal boundaries are shared between two or four local meshes.) Next construct a binary tree on the collection of boxes by hierarchically merging them, making sure that all boxes on the same level are roughly of the same size, cf. Figure 5. The boxes should be ordered so that if is a parent of a box , then . We also assume that the root of the tree (i.e. the full box ) has index .
With each box , we define two index vectors and as follows:
A list of all indices of the nodes on the boundary of .  
For a leaf , is a list of all interior nodes of .  
For a parent , is a list of all its interior nodes that are not interior nodes of its children. 
Next we execute a precomputation in which for every box , we construct the following matrices:
The matrix that maps the values of on the boundary of a box to the values of  
on the interior nodes of the box. In other words, .  
The matrix that maps the values of on the boundary of a box to the values of  
(tabulating ) on the boundary of a box. In other words, .  
The matrix that maps the values of on the boundary of a box to the values of  
(tabulating ) on the boundary of a box. In other words, . 
To this end, we scan all boxes in the tree, going from smaller to larger. For any leaf box , a dense matrix of size that locally approximates the differential operator in (1.1) is formed. Then the matrices , , and are constructed via formulas (3.3), (3.4), and (3.5). For a parent box with children and , the matrices , , and are formed from the DtN operators encoded in the matrices , , , using the formulas (4.5), (4.11), and (4.12). The full algorithm is summarized in Figure 6. An illustrated cartoon of the merge process is provided in Appendix A.
Once all the matrices have been formed, it is a simple matter to construct a vector holding approximations to the solution of (1.1). The nodes are scanned starting with the root, and then proceeding down in the tree towards smaller boxes. When a box is processed, the value of is known for all nodes on its boundary (i.e. those listed in ). The matrix directly maps these values to the values of on the nodes in the interior of (i.e. those listed in ). When all nodes have been processed, all entries of have been computed. Figure 7 summarizes the solve stage.
Remark 5.1.
Every interior meshpoint belongs to the index vector for precisely one node . In other words forms a disjoint union of the interior mesh points.
Remark 5.2.
The way the algorithms are described, we compute for each node matrices and that allow the computation of both the normal and the tangential derivative at any boundary node, given the Dirichlet data on the boundary. This is done for notational convenience only. In practice, any rows of and that correspond to evaluation of tangential derivatives need never be evaluated since tangential derivatives do not enter into consideration at all.
Remark 5.3.
The merge stage is exact when performed in exact arithmetic. The only approximation involved is the approximation of the solution on a leaf by its interpolating polynomial.
5.2. Complexity analysis
The analysis of the asymptotic cost of the algorithm in Section 5.1 closely mimics the analysis of the classical nested dissection algorithm [7, 3]. For simplicity, we analyze the simplest situation in which a square domain is divided into leaf boxes, each holding a spectral cartesian mesh with points. The total number of unknowns in the system is then roughly (to be precise, ).
Cost of leaf computation: Evaluating the formulas (3.3), (3.4), and (3.5) requires dense matrix algebra on matrices of size roughly . Since there are about leaves, the total cost is
Cost of the merge operations: For an integer , we refer to “level ” as the collection of boxes whose side length is times the side of the full box (so that corresponds to the root and corresponds to the set of leaf boxes). To form the matrices , , and for a box on level , we need to evaluate each of the formulas (4.5), (4.11), and (4.12) three times, with each computation involving matrices of size roughly . Since there are boxes on level , we find that the cost of processing level is
Adding the costs at all levels, we get
Cost of solve stage: The cost of processing a nonleaf node on level is simply the cost of a matrixvector multiply involving the dense matrix of size . For a leaf, is of size roughly . Therefore
5.3. Problem of resonances
The scheme presented in Section 5.1 will fail if one of the patches in the hierarchical partitioning is resonant in the sense that there exist nontrivial solutions to the Helmholtz equation that have zero Dirichlet data at the boundary of the patch. In this case, the Neumann data for the patch is not uniquely determined by the Dirichlet data, and the DtN operator cannot exist. In practice, this problem will of course arise even if we are merely close to a resonance, and will be detected by the discovery that the inverse matrix in the formulas (3.3) and (4.5) for the solution operator is illconditioned.
It is our experience from working with domains of size one hundred wavelengths or less that resonances are very rare; one almost never encounters the problem. Nevertheless, it is important to monitor the conditioning of the formulas (3.3) and (4.5) to ensure the accuracy of the final answer. Should a problem be detected, the easiest solution would be to simply start the computation over with a different tessellation of the domain . Very likely, this will resolve the problem.
Current efforts to formulate variations of the scheme that are inherently not vulnerable to resonances are described in Section 7.5.
6. Numerical experiments
This section reports the results of some numerical experiments with the method described in Section 5.1. The method was implemented in Matlab and the experiments executed on a Lenovo W510 laptop with a quad core Intel i7 Q720 processor with 1.6GHz clockspeed, and 16GB of RAM.
The speed and memory requirements of the algorithm were investigated by solving the special case where in (1.1), and the Dirichlet data is simply set to equal a known analytic solution. The results are reported in Section 6.1. We also report the errors incurred in the special case, but it should be noted that these represent only a best case estimate of the errors since the equation solved is particularly benign. To get a more realistic estimate of the errors in the method, we also applied it to three situation in which exact solutions are not known: A problem with variable coefficients in Section 6.2, a problem on an Lshaped domain in Section 6.3, and a convectiondiffusion problem in Section 6.4.
6.1. Constant coefficient Helmholtz
We solved the basic Helmholtz equation
(6.1)  
(6.2) 
where and . The boundary data were in this first set of experiments chosen as the restriction to of an exact solution
(6.3) 
where , and where is the ’th Bessel function of the second kind. This experiment serves two purposes. The first is to systematically measure the speed and memory requirements of the method described in Section 5.1. The second is to get a sense of what errors can be expected in a “best case” scenario with a very smooth solution. Observe however that the situation is by no means artificial since the smoothness of this case is exactly what one encounters when the solver is applied to a free space scattering problem as described in Section 7.3.
The domain was discretized into patches, and on each patch a Cartesian mesh of Chebyshev nodes was placed. The total number of degrees of freedom is then
We tested the method for . For each fixed , the method was executed for several different mesh sizes . The wavenumber was chosen to keep a constant of 12 points per wavelength, or .
Since the exact solution is known in this case, we computed the direct error measure
where is the set of all mesh points. We also computed the maximum error in the gradient of on the boundary as computed via the and operators on the root box,
Table 1 reports the following variables:
The number of wavelengths along one side of .
The time in seconds required to execute the precomputation in Figure 6.
The time in seconds required to execute the solve in Figure 7.
The amount of RAM used in the precomputation in MB.
The table also reports the memory requirements in terms of number of the number of double precision reals
that need to be stored per degree of freedom in the discretization.
The highorder version of the method () was also capable of performing a high accuracy solve with only six points per wavelength. The results are reported in Table 2.
We see that increasing the spectral order is very beneficial for improving accuracy. However, the speed deteriorates and the memory requirements increase as grows. Choosing appears to be a good compromise.
Remark 6.1.
In the course of executing the numerical examples, the instability problem described in Section 5.3 was detected precisely once (for and 12 points per wave length).
(sec)  (sec)  (MB)  (reals/DOF)  
6  6561  6.7  0.28  0.0047  8.02105e03  3.06565e01  2.8  56.5 
6  25921  13.3  0.96  0.0184  1.67443e02  1.33562e+00  12.7  64.2 
6  103041  26.7  4.42  0.0677  3.60825e02  5.46387e+00  56.2  71.5 
6  410881  53.3  20.23  0.2397  3.39011e02  1.05000e+01  246.9  78.8 
6  1640961  106.7  88.73  0.9267  7.48385e01  4.92943e+02  1075.0  85.9 
11  6561  6.7  0.16  0.0019  2.67089e05  1.08301e03  2.9  58.0 
11  25921  13.3  0.68  0.0073  5.30924e05  4.34070e03  13.0  65.7 
11  103041  26.7  3.07  0.0293  1.01934e04  1.60067e02  57.4  73.0 
11  410881  53.3  14.68  0.1107  1.07747e04  3.49637e02  251.6  80.2 
11  1640961  106.7  68.02  0.3714  2.17614e04  1.37638e01  1093.7  87.4 
21  6561  6.7  0.23  0.0011  2.56528e10  1.01490e08  4.4  87.1 
21  25921  13.3  0.92  0.0044  5.24706e10  4.44184e08  18.8  95.2 
21  103041  26.7  4.68  0.0173  9.49460e10  1.56699e07  80.8  102.7 
21  410881  53.3  22.29  0.0727  1.21769e09  3.99051e07  344.9  110.0 
21  1640961  106.7  99.20  0.2965  1.90502e09  1.24859e06  1467.2  117.2 
21  6558721  213.3  551.32  20.9551  2.84554e09  3.74616e06  6218.7  124.3 
41  6561  6.7  1.50  0.0025  9.88931e14  3.46762e12  7.9  157.5 
41  25921  13.3  4.81  0.0041  1.58873e13  1.12883e11  32.9  166.4 
41  103041  26.7  18.34  0.0162  3.95531e13  5.51141e11  137.1  174.4 
41  410881  53.3  75.78  0.0672  3.89079e13  1.03546e10  570.2  181.9 
41  1640961  106.7  332.12  0.2796  1.27317e12  7.08201e10  2368.3  189.2 
(sec)  (sec)  (MB)  (reals/DOF)  

41  6561  13.3  1.30  0.0027  1.54407e09  1.78814e07  7.9  157.5 
41  25921  26.7  4.40  0.0043  1.42312e08  2.35695e06  32.9  166.4 
41  103041  53.3  17.54  0.0199  1.73682e08  5.84193e06  137.1  174.4 
41  410881  106.7  72.90  0.0717  2.28475e08  1.51575e05  570.2  181.9 
41  1640961  213.3  307.37  0.3033  4.12809e08  5.51276e05  2368.3  189.2 
6.2. Variable coefficient Helmholtz
We solved the equation
(6.4)  
(6.5) 
where , where , and where
The Helmholtz parameter was kept fixed at , corresponding to a domain size of wave lengths. The boundary data was given by
The equation (6.4) was discretized and solved as described in Section 6.1. The speed and memory requirements for this computation are exactly the same as for the example in Section 6.1 (they do not depend on what equation is being solved), so we now focus on the accuracy of the method. We do not know of an exact solution, and therefore report pointwise convergence. Letting denote the value of computed using degrees of freedom, we used