1 Introduction

A composite spectral scheme for variable coefficient Helmholtz problems


P.G. Martinsson, Department of Applied Mathematics, University of Colorado at Boulder

May 31, 2012

Abstract: A discretization scheme for variable coefficient Helmholtz problems on two-dimensional domains is presented. The scheme is based on high-order spectral approximations and is designed for problems with smooth solutions. The resulting system of linear equations is solved using a direct solver with complexity for the pre-computation and complexity for the solve. The fact that the solver is direct is a principal feature of the scheme, since iterative methods tend to struggle with the Helmholtz equation. Numerical examples demonstrate that the scheme is fast and highly accurate. For instance, using a discretization with 12 points per wave-length, a Helmholtz problem on a domain of size wavelengths was solved to ten correct digits. The computation was executed on an office desktop; it involved 1.6M degrees of freedom and required 100 seconds for the pre-computation, and 0.3 seconds for the actual solve.

1. Introduction

The paper describes a technique for constructing an approximate solution to the variable coefficient Helmholtz equation

(1.1)

where is a rectangular domain with boundary , where the Helmholtz parameter is real, and where is a given smooth scattering potential. The scheme can straight-forwardly be adapted to handle other variable coefficient elliptic problems, as well as free space scattering problems in . The primary limitation of the method is that it requires the solution to be smooth in .

The equation (1.1) is discretized via a composite spectral scheme. The domain is split into small square (or rectangular) patches. On each patch, the solution is represented via tabulation on a tensor product grid of Chebyshev points, see Figure 1. The Laplace operator is approximated via a spectral differentiation matrix acting on each local grid, and then equation (1.1) is enforced strongly at all tabulation nodes in the interior of each patch. To glue patches together, continuity of both the potential and its normal derivative are enforced at the spectral interpolation nodes on the boundaries between patches.

The discretization scheme is combined with a direct solver for the resulting linear system. The fact that the solver is direct rather than iterative is a principal feature of the scheme, since iterative solvers tend to struggle for Helmholtz problems of the kind considered here [2]. The direct solver organizes the patches in the discretization into a binary tree of successively larger patches. The solver then involves two stages, one that involves an upwards pass, and one that involves a downwards pass:

  1. A pre-computation stage where an approximation to the solution operator for (1.1) is computed. This is done via a single sweep of the hierarchical tree, going from smaller patches to larger. For each leaf in the tree, a local solution operator, and an approximation to the Dirichlet-to-Neumann (DtN) map for the patch are constructed. For a parent node in the tree, a local solution operator and a local DtN operator are computed from an equilibrium equation formed using the DtN operators of the children of the patch. The pre-computation stage has asymptotic complexity .

  2. A solve stage that takes as input a vector of Dirichlet data tabulated on , and constructs tabulated values of at all internal grid points. The solve stage involves a single downwards sweep through the hierarchical tree of patches, going from larger patches to smaller. The solve stage has asymptotic complexity .

Numerical experiments indicate that the spectral convergence of the method makes it both highly accurate and computationally efficient. For instance, the equation (1.1) was solved for a box whose size exceeded wave-lengths in less than 2 minutes on a standard office laptop. A -th order spectral scheme with 12 points per wave-length was used in the local approximation on the patches. The resulting solution was accurate to between 7 and 10 digits, depending on the nature of the scattering potential in (1.1). The discretization used a total of degrees of freedom. The computational time was dominated by the pre-computation stage; the actual solve stage took only seconds. This makes the scheme particularly powerful in situations where an equation such as (1.1) needs to be solved for a sequence of different boundary functions .

The scheme proposed is conceptually related to a direct solver for the Lippman-Schwinger equation proposed in 2002 by Yu Chen [1]. The schemes are different in that the method proposed here is not based on a Lippman-Schwinger formulation, and uses spectral approximations on the smallest patches in the hierarchical tree. Comparing the efficiencies of the two schemes is difficult since the paper [1] does not report numerical results, and we have been unable to find reports of implementations of the scheme. The scheme proposed here is also conceptually related to the classical nested dissection algorithm for finite element and finite difference matrices [3], and to recently proposed direct solvers for BIEs on surfaces in 3D [4].

For clarity, the current paper focusses on the simple boundary value problem (1.1) involving the Helmholtz elliptic operator and Dirichlet boundary data. The scheme can with trivial modifications be applied to more general elliptic operators

coupled with Dirichlet, Neumann, or mixed boundary data. It has for instance been successfully tested on convection-diffusion problems that are strongly dominated (by a factor of ) by the convection term, see Section 6.4. Moreover, the scheme can with minor modifications be applied to a free space scattering problem such as

(1.2)

coupled with appropriate radiation conditions at infinity. A standard assumption is that is supported outside of some (bounded) square region while the smooth function is supported inside . The scheme described in this note computes the DtN operator for (1.2) on . The DtN operator for the exterior domain can be computed via Boundary Integral Equation techniques; and by combining the two, one can solve the free space scattering problem, see Section 7.3.

The method proposed has a vulnerability in that it crucially relies on the existence of DtN operators for all patches in the hierarchical tree. This can be problematic due to resonances: For certain wave-numbers , there exist non-trivial solutions that have zero Dirichlet boundary data. We have found that in practice, this problem almost never arises when processing domains that are a couple of hundred wave-lengths or less in size. Moreover, if a resonant patch should be encountered, this will be detected and counter-measures can be taken, see Sections 5.3 and 7.5.

The asymptotic complexity of the proposed method is . For the case where the wave-number is increased as grows to keep a constant number of discretization points per wave-length (i.e. ), we do not know how to improve the complexity. However, for the case where the wave-number is kept constant as increases, complexity can very likely be attained by exploiting internal structure in the DtN operators. The resulting scheme would be a spectral version of recently published accelerated nested dissection schemes such as [8, 10, 12].

An early version of the work reported was published on arXiv as [9].

The paper is organized as follows: Section 2 introduces notation and lists some classical material on spectral interpolation and differentiation. Section 3 describes how to compute the solution operator and the DtN operator for a leaf in tree (which is discretized via a single tensor-product grid of Chebyshev nodes). Section 4 describes how the DtN operator for a larger patch consisting of two small patches can be computed if the DtN operators for the smaller patches are given. Section 5 describes the full hierarchical scheme. Section 6 reports the results of some numerical experiments. Section 7 describes how the scheme can be extended to more general situations.

Figure 1. The box is split into leaf boxes, and a Cartesian grid of Chebyshev nodes is placed on each leaf box. The figure shows local grids of size for clarity; in actual computations, local grids of size were typically used.

2. Preliminaries — spectral differentiation

This section introduces notation for spectral differentiation on tensor product grids of Chebyshev nodes on the square domain . This material is classical, see, e.g., Trefethen [11]. (While we restrict attention to square boxes here, all techniques generalize trivially to rectangular boxes of moderate aspect ratios.)

Let denote a positive integer. The Chebyshev nodes on are the points

Let denote the set of points of the form for . Let denote the linear space of sums of tensor products of polynomials of degree or less. has dimension . Given a vector , there is a unique function such that for . (A reason Chebyshev nodes are of interest is that for any fixed , the map is stable.) Now define , , and as the unique matrices such that

(2.1)
(2.2)
(2.3)

3. Leaf computation

This section describes the construction of a discrete approximation to the Dirichlet-to-Neumann operator associated with the boundary value problem (1.1) for a square patch . We discretize (1.1) via a spectral method on a tensor product grid of Chebyshev nodes on . In addition to the DtN operator, we also construct a solution operator to (1.1) that maps the Dirichlet data on the nodes on the boundary of to the value of at all internal interpolation nodes.

3.1. Notation

Let denote a square patch. Let denote the nodes in a tensor product grid of Chebyshev nodes. Partition the index set

in such a way that contains all nodes on the boundary of , and denotes the set of interior nodes, see Figure 2(a). Let be a function that satisfies (1.1) on and let

denote the vectors of samples of and its partial derivatives. We define the short-hands

Let , , and denote spectral differentiation matrices corresponding to the operators , , and , respectively (see Section 2). We use the short-hand

to denote the part of the differentiation matrix that maps exterior nodes to interior nodes, etc.

                      
(a) (b)
Figure 2. Notation for the leaf computation in Section 3. (a) A leaf before elimination of interior (white) nodes. (b) A leaf after elimination of interior nodes.

3.2. Equilibrium condition

The operator (1.1) is approximated via the matrix

where denotes the vector of pointwise values of ,

The equation we enforce on is that the vector should evaluate to zero at all internal nodes,

(3.1)

where

Solving (3.1) for , we obtain

(3.2)

where

(3.3)

3.3. Constructing the DtN operator

Let and denote the matrices that map boundary values of the potential to boundary values of and . These are constructed as follows: Given the potential on the boundary, we reconstruct the potential in the interior via (3.2). Then, since the potential is known on all Chebyshev nodes in , we can determine the gradient on the boundary via spectral differentiation on the entire domain. To formalize, we find

where

(3.4)

An analogous computation for yields

(3.5)

4. Merge operation

Let denote a rectangular domain consisting of the union of the two smaller rectangular domains,

as shown in Figure 3. Moreover, suppose that approximations to the DtN operators for and are available. (Represented as matrices that map boundary values of to boundary values of and .) This section describes how to compute a solution operator that maps the value of a function that is tabulated on the boundary of to the values of on interpolation nodes on the internal boundary, as well as operators and that map boundary values of on the boundary of to values of the and tabulated on the boundary.

Figure 3. Notation for the merge operation described in Section 4. The rectangular domain is formed by two squares and . The sets , , and form the exterior nodes (black), while consists of the interior nodes (white).

4.1. Notation

Let denote a box with children and . For concreteness, let us assume that and share a vertical edge. We partition the points on and into four sets: Boundary nodes of that are not boundary nodes of . Boundary nodes of that are not boundary nodes of . The two nodes that are boundary nodes of , of , and also of the union box . Boundary nodes of both and that are not boundary nodes of the union box . Figure 3 illustrates the definitions of the ’s. Let denote a function such that

and let , , denote the values of , , and , restricted to the nodes in the set “”. Moreover, set

(4.1)

Finally, let , , , denote the operators that map potential values on the boundary to values of and on the boundary for the boxes and . We partition these matrices according to the numbering of nodes in Figure 3,

(4.2)

and

(4.3)

4.2. Equilibrium condition

Suppose that we are given a tabulation of boundary values of a function that satisfies (1.1) on . In other words, we are given the vectors , , and . We can then reconstruct the values of the potential on the interior boundary (tabulated in the vector ) by using information in (4.2) and (4.3). Simply observe that there are two equations specifying the normal derivative across the internal boundary (tabulated in ), and combine these equations:

Solving for we get

(4.4)

Now set

(4.5)

to find that (4.4) is (in view of (4.1)) precisely the desired formula

(4.6)

The net effect of the merge operation is to eliminate the interior tabulation nodes in and so that only boundary nodes in the union box are kept, as illustrated in Figure 4.

     
(a) (b)
Figure 4. Merge operation for two small boxes to form a new large box. (a) Before elimination of interior (white) nodes. (b) After elimination of interior nodes.

4.3. Constructing the DtN operators for the union box

We will next build a matrix that constructs the derivative on given values of on . In other words

To this end, observe from (4.2) and (4.3) that

(4.7)
(4.8)

Equations (4.2) and (4.3) provide two different formulas for , either of which could be used. For numerical stability, we use the average of the two:

(4.9)
(4.10)

Combining (4.7) – (4.10) we obtain

In other words,

(4.11)

An analogous computation for yields

(4.12)

5. The full hierarchical scheme

5.1. The algorithm

Now that we know how to construct the DtN operator for a leaf (Section 3), and how to merge the DtN operators of two neighboring patches to form the DtN operator of their union (Section 4), we are ready to describe the full hierarchical scheme for solving (1.1).

First we partition the domain into a collection of square (or possibly rectangular) boxes, called leaf boxes. These should be small enough that a small spectral mesh with nodes (for, say, ) accurately interpolates both any potential solution of (1.1) and its partial derivatives , , and . Let denote the points in this mesh. (Observe that nodes on internal boundaries are shared between two or four local meshes.) Next construct a binary tree on the collection of boxes by hierarchically merging them, making sure that all boxes on the same level are roughly of the same size, cf. Figure 5. The boxes should be ordered so that if is a parent of a box , then . We also assume that the root of the tree (i.e. the full box ) has index .

Figure 5. The square domain is split into leaf boxes. These are then gathered into a binary tree of successively larger boxes as described in Section 5.1. One possible enumeration of the boxes in the tree is shown, but note that the only restriction is that if box is the parent of box , then .

With each box , we define two index vectors and as follows:

A list of all indices of the nodes on the boundary of .
For a leaf , is a list of all interior nodes of .
For a parent , is a list of all its interior nodes that are not interior nodes of its children.

Next we execute a pre-computation in which for every box , we construct the following matrices:

The matrix that maps the values of on the boundary of a box to the values of
on the interior nodes of the box. In other words, .
The matrix that maps the values of on the boundary of a box to the values of
(tabulating ) on the boundary of a box. In other words, .
The matrix that maps the values of on the boundary of a box to the values of
(tabulating ) on the boundary of a box. In other words, .

To this end, we scan all boxes in the tree, going from smaller to larger. For any leaf box , a dense matrix of size that locally approximates the differential operator in (1.1) is formed. Then the matrices , , and are constructed via formulas (3.3), (3.4), and (3.5). For a parent box with children and , the matrices , , and are formed from the DtN operators encoded in the matrices , , , using the formulas (4.5), (4.11), and (4.12). The full algorithm is summarized in Figure 6. An illustrated cartoon of the merge process is provided in Appendix A.

Once all the matrices have been formed, it is a simple matter to construct a vector holding approximations to the solution of (1.1). The nodes are scanned starting with the root, and then proceeding down in the tree towards smaller boxes. When a box is processed, the value of is known for all nodes on its boundary (i.e. those listed in ). The matrix directly maps these values to the values of on the nodes in the interior of (i.e. those listed in ). When all nodes have been processed, all entries of have been computed. Figure 7 summarizes the solve stage.

Remark 5.1.

Every interior meshpoint belongs to the index vector for precisely one node . In other words forms a disjoint union of the interior mesh points.

Remark 5.2.

The way the algorithms are described, we compute for each node matrices and that allow the computation of both the normal and the tangential derivative at any boundary node, given the Dirichlet data on the boundary. This is done for notational convenience only. In practice, any rows of and that correspond to evaluation of tangential derivatives need never be evaluated since tangential derivatives do not enter into consideration at all.

Pre-computation This program constructs the global Dirichlet-to-Neumann operator for (1.1). It also constructs all matrices required for constructing at all interior nodes. It is assumed that if node is a parent of node , then .   for if ( is a leaf) else Let and be the children of . Partition and into vectors , , , and as shown in Figure 3. if ( and are side-by-side) else end if . . Delete , , , . end if end for

Figure 6. Pre-computation

Solver This program constructs an approximation to the solution of (1.1). It assumes that all matrices have already been constructed in a pre-computation. It is assumed that if node is a parent of node , then .   for all . for . end for

Figure 7. Solve stage.
Remark 5.3.

The merge stage is exact when performed in exact arithmetic. The only approximation involved is the approximation of the solution on a leaf by its interpolating polynomial.

5.2. Complexity analysis

The analysis of the asymptotic cost of the algorithm in Section 5.1 closely mimics the analysis of the classical nested dissection algorithm [7, 3]. For simplicity, we analyze the simplest situation in which a square domain is divided into leaf boxes, each holding a spectral cartesian mesh with points. The total number of unknowns in the system is then roughly (to be precise, ).

Cost of leaf computation: Evaluating the formulas (3.3), (3.4), and (3.5) requires dense matrix algebra on matrices of size roughly . Since there are about leaves, the total cost is

Cost of the merge operations: For an integer , we refer to “level ” as the collection of boxes whose side length is times the side of the full box (so that corresponds to the root and corresponds to the set of leaf boxes). To form the matrices , , and for a box on level , we need to evaluate each of the formulas (4.5), (4.11), and (4.12) three times, with each computation involving matrices of size roughly . Since there are boxes on level , we find that the cost of processing level is

Adding the costs at all levels, we get

Cost of solve stage: The cost of processing a non-leaf node on level is simply the cost of a matrix-vector multiply involving the dense matrix of size . For a leaf, is of size roughly . Therefore

5.3. Problem of resonances

The scheme presented in Section 5.1 will fail if one of the patches in the hierarchical partitioning is resonant in the sense that there exist non-trivial solutions to the Helmholtz equation that have zero Dirichlet data at the boundary of the patch. In this case, the Neumann data for the patch is not uniquely determined by the Dirichlet data, and the DtN operator cannot exist. In practice, this problem will of course arise even if we are merely close to a resonance, and will be detected by the discovery that the inverse matrix in the formulas (3.3) and (4.5) for the solution operator is ill-conditioned.

It is our experience from working with domains of size one hundred wave-lengths or less that resonances are very rare; one almost never encounters the problem. Nevertheless, it is important to monitor the conditioning of the formulas (3.3) and (4.5) to ensure the accuracy of the final answer. Should a problem be detected, the easiest solution would be to simply start the computation over with a different tessellation of the domain . Very likely, this will resolve the problem.

Current efforts to formulate variations of the scheme that are inherently not vulnerable to resonances are described in Section 7.5.

6. Numerical experiments

This section reports the results of some numerical experiments with the method described in Section 5.1. The method was implemented in Matlab and the experiments executed on a Lenovo W510 laptop with a quad core Intel i7 Q720 processor with 1.6GHz clockspeed, and 16GB of RAM.

The speed and memory requirements of the algorithm were investigated by solving the special case where in (1.1), and the Dirichlet data is simply set to equal a known analytic solution. The results are reported in Section 6.1. We also report the errors incurred in the special case, but it should be noted that these represent only a best case estimate of the errors since the equation solved is particularly benign. To get a more realistic estimate of the errors in the method, we also applied it to three situation in which exact solutions are not known: A problem with variable coefficients in Section 6.2, a problem on an L-shaped domain in Section 6.3, and a convection-diffusion problem in Section 6.4.

6.1. Constant coefficient Helmholtz

We solved the basic Helmholtz equation

(6.1)
(6.2)

where and . The boundary data were in this first set of experiments chosen as the restriction to of an exact solution

(6.3)

where , and where is the ’th Bessel function of the second kind. This experiment serves two purposes. The first is to systematically measure the speed and memory requirements of the method described in Section 5.1. The second is to get a sense of what errors can be expected in a “best case” scenario with a very smooth solution. Observe however that the situation is by no means artificial since the smoothness of this case is exactly what one encounters when the solver is applied to a free space scattering problem as described in Section 7.3.

The domain was discretized into patches, and on each patch a Cartesian mesh of Chebyshev nodes was placed. The total number of degrees of freedom is then

We tested the method for . For each fixed , the method was executed for several different mesh sizes . The wave-number was chosen to keep a constant of 12 points per wavelength, or .

Since the exact solution is known in this case, we computed the direct error measure

where is the set of all mesh points. We also computed the maximum error in the gradient of on the boundary as computed via the and operators on the root box,

Table 1 reports the following variables:
The number of wave-lengths along one side of . The time in seconds required to execute the pre-computation in Figure 6. The time in seconds required to execute the solve in Figure 7. The amount of RAM used in the pre-computation in MB.
The table also reports the memory requirements in terms of number of the number of double precision reals that need to be stored per degree of freedom in the discretization.

The high-order version of the method () was also capable of performing a high accuracy solve with only six points per wave-length. The results are reported in Table 2.

We see that increasing the spectral order is very beneficial for improving accuracy. However, the speed deteriorates and the memory requirements increase as grows. Choosing appears to be a good compromise.

Remark 6.1.

In the course of executing the numerical examples, the instability problem described in Section 5.3 was detected precisely once (for and 12 points per wave length).

(sec) (sec) (MB) (reals/DOF)
6 6561 6.7 0.28 0.0047 8.02105e-03 3.06565e-01 2.8 56.5
6 25921 13.3 0.96 0.0184 1.67443e-02 1.33562e+00 12.7 64.2
6 103041 26.7 4.42 0.0677 3.60825e-02 5.46387e+00 56.2 71.5
6 410881 53.3 20.23 0.2397 3.39011e-02 1.05000e+01 246.9 78.8
6 1640961 106.7 88.73 0.9267 7.48385e-01 4.92943e+02 1075.0 85.9
11 6561 6.7 0.16 0.0019 2.67089e-05 1.08301e-03 2.9 58.0
11 25921 13.3 0.68 0.0073 5.30924e-05 4.34070e-03 13.0 65.7
11 103041 26.7 3.07 0.0293 1.01934e-04 1.60067e-02 57.4 73.0
11 410881 53.3 14.68 0.1107 1.07747e-04 3.49637e-02 251.6 80.2
11 1640961 106.7 68.02 0.3714 2.17614e-04 1.37638e-01 1093.7 87.4
21 6561 6.7 0.23 0.0011 2.56528e-10 1.01490e-08 4.4 87.1
21 25921 13.3 0.92 0.0044 5.24706e-10 4.44184e-08 18.8 95.2
21 103041 26.7 4.68 0.0173 9.49460e-10 1.56699e-07 80.8 102.7
21 410881 53.3 22.29 0.0727 1.21769e-09 3.99051e-07 344.9 110.0
21 1640961 106.7 99.20 0.2965 1.90502e-09 1.24859e-06 1467.2 117.2
21 6558721 213.3 551.32 20.9551 2.84554e-09 3.74616e-06 6218.7 124.3
41 6561 6.7 1.50 0.0025 9.88931e-14 3.46762e-12 7.9 157.5
41 25921 13.3 4.81 0.0041 1.58873e-13 1.12883e-11 32.9 166.4
41 103041 26.7 18.34 0.0162 3.95531e-13 5.51141e-11 137.1 174.4
41 410881 53.3 75.78 0.0672 3.89079e-13 1.03546e-10 570.2 181.9
41 1640961 106.7 332.12 0.2796 1.27317e-12 7.08201e-10 2368.3 189.2
Table 1. Results from an experiment with a constant coefficient Helmholtz problem on a square. The boundary data were picked so that the analytic solution was known; as a consequence, the solution is smooth, and can be smoothly extended across the boundary. The wave-number was chosen to keep a constant of 12 discretization points per wave-length.
(sec) (sec) (MB) (reals/DOF)
41 6561 13.3 1.30 0.0027 1.54407e-09 1.78814e-07 7.9 157.5
41 25921 26.7 4.40 0.0043 1.42312e-08 2.35695e-06 32.9 166.4
41 103041 53.3 17.54 0.0199 1.73682e-08 5.84193e-06 137.1 174.4
41 410881 106.7 72.90 0.0717 2.28475e-08 1.51575e-05 570.2 181.9
41 1640961 213.3 307.37 0.3033 4.12809e-08 5.51276e-05 2368.3 189.2
Table 2. This table illustrates the same situation as Table 1, but now is increased twice as fast (so that we keep only 6 points per wave-length).

6.2. Variable coefficient Helmholtz

We solved the equation

(6.4)
(6.5)

where , where , and where

The Helmholtz parameter was kept fixed at , corresponding to a domain size of wave lengths. The boundary data was given by

The equation (6.4) was discretized and solved as described in Section 6.1. The speed and memory requirements for this computation are exactly the same as for the example in Section 6.1 (they do not depend on what equation is being solved), so we now focus on the accuracy of the method. We do not know of an exact solution, and therefore report pointwise convergence. Letting denote the value of computed using degrees of freedom, we used