Fast discrete convolution in \mathbb{R}^2 using Sparse Bessel Decomposition

Introduction

The Boundary Element Method requires solving fully populated linear systems. As the number of unknowns gets large, the storage of the matrix and the computational cost of solving the system through direct methods (e.g. LU factorization) become prohibitive. Instead, iterative methods can be used, which require very fast evaluations of matrix-vector products. In the context of boundary integral formulations, these take the form of discrete convolutions:

Here, is the Green’s kernel of the partial differential equation under consideration, is a set of points in , with diameter , is the Euclidean norm and is a vector (typically the values of a function at the points ). For example, solving the Laplace equation with Dirichlet boundary conditions leads to with (the kernel of the single layer potential).

In principle, the effective computation of the using requires operations. However, several more efficient algorithms have emerged to compute an approximation of with only quasilinear complexity in . Among those are the celebrated Fast Multipole Method (see for example [13] and references therein), Hierarchical Matrices [6], and more recently, the Sparse Cardinal Sine Decomposition (SCSD) [2].

One of the key ingredients in all those methods consists in writing the following local variable separation:

which needs to be valid for and arbitrarily distant from each other, and up to a controlled accuracy. This eventually results in compressed matrix representations and accelerated matrix-vector products. Notice that, to be fully effective, this separation is usually performed locally, with the help of a geometrical splitting of the cloud of points using a hierarchical octree.

Here we present an alternative compression and acceleration technique, which we call the Sparse Bessel Decomposition (SBD). It is an extension of the SCSD adapted to 2-dimensional problems. The SBD and SCSD methods achieve performance comparable to the aforementioned algorithms, are flexible with respect to the kernel , and do not rely on the construction of octrees, which makes them easier to implement. In addition, they express in an elegant way the intuition that a discrete convolution is nothing but a product of Fourier spectra.

The method heavily relies on the Non-Uniform Fast Fourier Transform (NUFFT) of type III (see the seminal paper [12], and also [14] and references therein for numerical aspects and open-source codes). The NUFFT is a fast algorithm, which we denote by for later use, that returns, for arbitrary sets of points and points in and a complex vector , the vector defined by:

This algorithm generalizes the classical Fast Fourier Transform (FFT) [10], to nonequispaced data, preserving the quasi-linear complexity in . The SBD method first produces a discrete and sparse approximation of the spectrum of ,

This approximation is replaced in to yield

where denotes the elementwise product between vectors. The decomposition is obtained “off-line” and depends on the value of , but is independent of the choice of the vector and can thus be reused for any evaluation of the matrix-vector product . The approximation reduces the complexity from to , stemming from the NUFFT complexity.
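For illustration, the type-III transform above can be evaluated naively in quadratic time with NumPy; a fast NUFFT library achieves the same result with quasi-linear complexity. The sign and normalization conventions below are assumptions of this sketch, not those of a specific NUFFT implementation.

```python
import numpy as np

def nudft3(points, freqs, f, sign=-1):
    """Naive type-III NUDFT: v_k = sum_j f_j exp(sign * i * <x_j, xi_k>)."""
    return np.exp(sign * 1j * freqs @ points.T) @ f

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(50, 2))     # nonuniform points
xi = rng.uniform(-10.0, 10.0, size=(40, 2))  # nonuniform frequencies
f = rng.standard_normal(50) + 1j * rng.standard_normal(50)

v = nudft3(x, xi, f)
# reference: explicit double loop over points and frequencies
v_loop = np.array([sum(f[j] * np.exp(-1j * xi[k] @ x[j]) for j in range(50))
                   for k in range(40)])
```

The vectorized evaluation agrees with the explicit double sum; only the quasi-linear complexity of a true NUFFT is missing from this reference version.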

The NUFFT has already been used in the literature for the fast evaluation of quantities of the form . In particular, our algorithm shares many ideas with the approach of [18]. The method presented therein also relies on an approximation of the form . However, we choose the set of frequencies differently, leading to a sparser representation of (see ).

The kernel is usually singular near the origin, in which case can only be accurate for above some threshold . Part of the SBD algorithm is thus dedicated to computing a local correction using a sparse matrix (which we call the close correction matrix, denoted by in the sequel) to account for the closer interactions. This threshold must be chosen so as to balance the time spent computing the far and close contributions. As a matter of fact, we shall prove the following:

We prove this in several steps. We first introduce the Fourier-Bessel series (). In , we present the Sparse Bessel Decomposition and analyze its numerical stability. In , we apply the SBD method to the Laplace kernel and give estimates for the number of terms needed to reach a fixed accuracy. We also explain how to adapt this method to other kernels. In , we show how to convert an SBD decomposition into an approximation of the form through what we call “circular quadratures” and provide error estimates. In , we summarize the complexities of each step and prove . We conclude with some numerical examples.

Before this, let us start with a brief overview of our algorithm.

2 Summary of the algorithm

The SBD algorithm can be summarized as follows:

Off-line part
  • Inputs:

    A radially symmetric kernel , a set of points in of diameter , a value for the parameter , a tolerance .

  • Sparse spectrum sampling:

    Compute a set of complex weights and frequencies so that is valid for up to tolerance .

  • Correction Matrix:

    Determine the set of all pairs such that (fixed-radius neighbor search). Assemble the close correction sparse matrix:

    Notice that the second term is a non-uniform discrete Fourier transform. Indeed, if we introduce given by , and , the non-zero entries of are given by

  • Outputs

The set of weights , the frequencies , and the sparse matrix .
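The fixed-radius neighbor search and the assembly of the close correction matrix can be sketched with SciPy's k-d tree. The kernel, the weights `w` and the frequencies `xi` below are illustrative placeholders (not an actual sparse spectrum), and self-interactions are omitted because of the kernel singularity at the origin.

```python
import numpy as np
from scipy import sparse
from scipy.spatial import cKDTree

def close_correction(x, G, w, xi, delta):
    """Sparse matrix of close interactions: true kernel minus far approximation."""
    pairs = cKDTree(x).query_pairs(delta, output_type="ndarray")  # pairs k < l
    rows = np.concatenate([pairs[:, 0], pairs[:, 1]])
    cols = np.concatenate([pairs[:, 1], pairs[:, 0]])
    d = x[rows] - x[cols]
    far = (np.exp(1j * d @ xi.T) * w).sum(axis=1)  # sum_m w_m e^{i (x_k - x_l).xi_m}
    vals = G(np.linalg.norm(d, axis=1)) - far
    n = len(x)
    return sparse.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()

rng = np.random.default_rng(1)
x = rng.uniform(-0.5, 0.5, size=(200, 2))
t = 2 * np.pi * np.arange(32) / 32
xi = 3.0 * np.column_stack([np.cos(t), np.sin(t)])  # placeholder frequencies
w = np.full(32, 1.0 / 32)                           # placeholder weights
C = close_correction(x, np.log, w, xi, delta=0.1)
```

With a symmetric set of frequencies the resulting matrix is symmetric; the k-d tree makes the neighbor search quasi-linear in the number of points.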

On-line part
  • Input:

    All outputs of the off-line part, and a complex vector of size .

  • Far approximation:

    Compute, for all ,

For this, follow these three steps:

    • Space Fourier:

      Compute

    • Fourier multiply:

      Perform elementwise multiplication by :

    • Fourier Space:

      Compute

  • Close correction:

    Compute the sparse matrix product:

  • Output:

    The vector , with, for any
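As a toy end-to-end run of the on-line part, consider the smooth kernel G(r) = J0(r), whose spectrum is exactly the uniform measure on the unit circle, so that no close correction is needed. The transforms are evaluated naively below; a real implementation would call a type-III NUFFT for both of them. Point counts and seeds are arbitrary.

```python
import numpy as np
from scipy.special import j0

rng = np.random.default_rng(2)
y = rng.uniform(-0.5, 0.5, size=(300, 2))   # source points
x = rng.uniform(-0.5, 0.5, size=(250, 2))   # target points
f = rng.standard_normal(300)

M = 64
t = 2 * np.pi * np.arange(M) / M
xi = np.column_stack([np.cos(t), np.sin(t)])  # frequencies on the unit circle
w = np.full(M, 1.0 / M)                       # quadrature weights

fhat = np.exp(-1j * xi @ y.T) @ f        # space -> Fourier
ghat = w * fhat                          # Fourier multiply
q = (np.exp(1j * x @ xi.T) @ ghat).real  # Fourier -> space

# reference: direct O(N^2) summation with G(r) = J0(r)
r = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
q_direct = j0(r) @ f
```

For this smooth kernel the three-step far approximation already matches the direct summation to near machine precision.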

The sparse spectrum sampling step:

The main novelty in our algorithm is the method for producing an approximation of the form . We proceed in two steps, uncoupling the radial and circular coordinates.

Sparse Bessel Decomposition: First, we compute a Bessel series approximating on , which we call a Sparse Bessel Decomposition. It takes the form

where is the Bessel function of the first kind and order zero and is the sequence of its roots (see for more details). In order to compute the coefficients , we first choose a starting value for and compute the weights that minimize the least-squares error

This amounts to solving an explicit linear system. We keep increasing until the residual error falls below the required tolerance. For the stopping criterion, we suggest monitoring the error near , where it is usually largest. The successive choices of are made using a bisection search.
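As an illustration, the least-squares fit above can be reproduced for the kernel G(r) = log(r), which vanishes at r = 1 together with all its iterated Laplacians. The grid, the radial weight and the tolerances are assumptions of this sketch.

```python
import numpy as np
from scipy.special import j0, jn_zeros

def sbd_fit(G, a, P, n_grid=2000):
    """Least-squares Bessel coefficients of G on [a, 1] (weight r dr)."""
    r = np.linspace(a, 1.0, n_grid)
    basis = j0(np.outer(r, jn_zeros(0, P)))   # (n_grid, P), columns J0(rho_p r)
    w = np.sqrt(r)                            # radial weight of the 2-D L2 norm
    alpha, *_ = np.linalg.lstsq(basis * w[:, None], G(r) * w, rcond=None)
    return alpha, np.max(np.abs(basis @ alpha - G(r)))

a = 0.1
_, err5 = sbd_fit(np.log, a, 5)
_, err30 = sbd_fit(np.log, a, 30)
```

Increasing the number of terms drives the residual down quickly; in an actual implementation the loop over P would stop once the monitored error drops below the prescribed tolerance.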

Circular quadrature: In a second step, we use approximations of the form:

which are discrete versions of the identity:

We sum them to eventually obtain the formula .
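The discrete identity can be checked numerically: the uniform M-point rule on the circle is the trapezoid rule applied to a smooth periodic integrand, which explains its very fast convergence in M. The test points below are arbitrary.

```python
import numpy as np
from scipy.special import j0

def circular_rule(x, M):
    """Uniform M-point quadrature of the circular integral representation of J0."""
    t = 2 * np.pi * np.arange(M) / M
    e = np.column_stack([np.cos(t), np.sin(t)])  # unit vectors on the circle
    return np.exp(1j * x @ e.T).mean(axis=-1).real

pts = np.array([[0.3, -0.2], [1.0, 2.0], [-2.5, 0.7]])
approx = circular_rule(pts, 32)
exact = j0(np.linalg.norm(pts, axis=1))
```

Already with 32 nodes the rule reproduces J0 at moderate arguments to near machine precision.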

3 Series of Bessel functions and error estimates

In this section, we give a short introduction to Fourier-Bessel series. A possible reference on this topic is [24], chapter XVIII. The main result needed for our purpose is ; an equivalent statement can be found in Theorem of [22], chapter 8, section 20.

In , we quickly recall some classical facts about Laplacian eigenfunctions with Dirichlet boundary conditions on the unit ball and formulate a conjecture () for the normalization constant , supported by strong numerical evidence (). It is worth noting that the use of Laplace eigenfunctions as the decomposition basis and the results we obtain hereafter do not rely on the space dimension. For example, in the radial eigenvalues of the Laplacian are proportional to

Therefore, our approach generalizes [2] to any dimension.

3.1 Radial eigenvalues of the Laplace operator with Dirichlet conditions

In the following, denotes the unit ball in , and its boundary. We say that a function is radial if there exists such that for any ,

In this case, we use the notation where . We denote by the closed subspace of consisting of radial functions, and . Similarly,

which is a Hilbert space for the norm

Finally, we denote by the closure of in , with the norm

We now briefly recall some facts on Bessel functions. All the results that we use on this topic can be found in the comprehensive book [1], most of which can also be consulted conveniently in the digital library [16].

is a solution of Bessel’s differential equation

The roots of behave, for large , as

More precisely, , and
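The classical asymptotics of the roots of J0, namely ρ_p = (p − 1/4)π + O(1/p), can be checked numerically with SciPy:

```python
import numpy as np
from scipy.special import jn_zeros

p = np.arange(1, 101)
rho = jn_zeros(0, 100)                   # first 100 positive roots of J0
gap = np.abs(rho - (p - 0.25) * np.pi)   # deviation from the leading term
```

The deviation is already below 0.05 at the first root and decays like 1/p, consistent with the expansion above.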

For any , we introduce:

where the normalization constant is chosen such that , that is

For any , satisfies:

The following result is well-known.

One can check, using asymptotic expansions of Bessel functions, that

We will actually need more precise knowledge of the constant . We formulate the next conjecture, which seems hard to establish and for which we could not find any element of proof in the literature. Numerical evidence presented in strongly suggests it is true.

Figure 1: Illustration of  with numerical values of the first 1000 terms of the sequence {v_p = \sqrt{2\pi p}C_p - 1} (blue circles) and {w_p = 1 - \sqrt{2\pi(p-1/4)}C_p} (red crosses) in log-log scale

3.2 Truncation error for Fourier-Bessel series of smooth functions

We now introduce the Fourier-Bessel series and prove a bound for the norm of the remainder. In , we have shown that any function can be expanded through its so-called Fourier-Bessel series as

The generalized Fourier coefficients are obtained by orthogonal projection:

Most references on this topic focus on proving pointwise convergence of the series even for functions of low regularity (e.g. piecewise continuous, square integrable, etc.) [21]. In such cases, the Fourier-Bessel series may exhibit a Gibbs phenomenon [26]. Here, on the contrary, we need to establish that the Fourier-Bessel series of very smooth functions converges exponentially fast. To this end, we first introduce the following terminology:

If satisfies the multi-Dirichlet condition of order , then by integration by parts:

since vanishes on . Assume the result is true for some and let satisfy the multi-Dirichlet condition of order . Then, using the fact that is an eigenvector of associated with the eigenvalue , we obtain:

The result follows from integration by parts where we successively use that and vanish on .

Notice that this is similar to the fact that the Fourier coefficients of smooth functions decay fast.

We apply the result of the previous proposition and remark that since is an eigenfunction of the Laplace operator with unit norm in , . To conclude, recall for large .

Parseval’s identity implies

According to the previous results, we find that:

The announced result follows from for .

3.3 Other boundary conditions

When we replace the Dirichlet boundary condition by the following Robin boundary conditions

for some constant , the same analysis can be conducted, leading to Dini series (also covered in [24]). This time, we construct a Hilbert basis of with respect to the bilinear form

The following result holds.

It can be checked that the truncation error estimates in extend to this case, for functions satisfying multi-Robin conditions of order , that is, for all , satisfies .

4 Sparse Bessel Decomposition

4.1 Definition of the SBD

Consider the kernel involved in . We can assume, up to rescaling, that the diameter of is bounded by , and therefore we only need to approximate on the unit ball . If we wish to approximate by a series of Bessel functions, two kinds of complications arise:

  • is usually singular near the origin, therefore not in (even for ).

  • The multi-Dirichlet conditions may not be fulfilled up to a sufficient order.

Point (ii) is crucial in order to apply the error estimates of the previous section. The first two kernels that we will study (Laplace and Helmholtz) satisfy the favorable property:

for some , which will be helpful to ensure (ii) at any order. For more general kernels, we propose in a simple trick to enforce multi-Dirichlet conditions up to a given order.

As for point (i), we will use the following method: for the approximation

we know by that the minimal error on is reached for . However, if the closest interactions in are computed explicitly, it can be sufficient to approximate on a domain of the form for some . For this reason, we propose to define the coefficients as the minimizers of the quadratic form

where is the annulus . In the sequel, those coefficients will be called the SBD coefficients of of order . Obviously, for any radial function defined on that coincides with on , one has

In particular, when is smooth up to the origin, this gives an error estimate via . Choosing a sufficiently high value of ensures that smooth enough extensions exist, and hence fast decay of the coefficients.

The next result shows that the norm on controls the norm, thus ruling out any risk of a Gibbs phenomenon.

It is sufficient to show the inequality for smooth ; the general result follows by density. Let ; we have, since :

4.2 Numerical computation of the SBD

For a given kernel , the SBD coefficients are obtained numerically by solving the following linear system:

where is the Bessel function of the first kind and order (in fact, ). We solve this system for increasing values of until the required tolerance is reached. It turns out that the matrix whose entries are given by

is explicit: for , the non-diagonal entries of are

where

while the diagonal entries are

where

These formulas are obtained using Green’s formulas together with the fact that the are eigenfunctions of the Laplace operator. They are valid for any value of (not just the roots of ).
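As a sanity check, the matrix can also be assembled by numerical quadrature instead of the closed formulas. Here we take the entries to be the plain radial inner products on the annulus (a normalization assumption of this sketch) and verify that the resulting Gram matrix is symmetric positive definite.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0, jn_zeros

a, P = 0.05, 8
rho = jn_zeros(0, P)

# Gram matrix A_pq = int_a^1 J0(rho_p r) J0(rho_q r) r dr, by quadrature
A = np.empty((P, P))
for p in range(P):
    for q in range(p, P):
        val, _ = quad(lambda r: j0(rho[p] * r) * j0(rho[q] * r) * r,
                      a, 1.0, epsabs=1e-12, epsrel=1e-12, limit=200)
        A[p, q] = A[q, p] = val

eigs = np.linalg.eigvalsh(A)
```

For this small value of the parameter the matrix is comfortably positive definite; as discussed below, its smallest eigenvalue degrades as the parameter grows.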

4.3 Conditioning of the linear system

The conditioning of seems to depend almost exclusively on the parameter . We were only able to derive an accurate estimate of the conditioning of when is small enough. For large , we show numerical evidence for a conjectured bound on the conditioning of .

Conditioning of for small

This estimate is only useful when , that is where is the first positive root of . One has

In particular, for , the matrix is well conditioned, the ratio of its largest to its smallest eigenvalue being of order . A plot of is provided in , together with numerical approximations of the minimal eigenvalue of , shown as a function of for several values of .

Figure 2: Graph of F and numerical values of the minimal eigenvalue of A, \lambda_{\min}(A) in function of \gamma, in the case b=1, for P=50 (blue circles) and P=500 (red crosses)

Let and let denote its coordinates in this basis. Then

showing that the eigenvalues of are bounded by . Thus is a positive symmetric matrix and the smallest eigenvalue of is bounded by

which yields:

We now use , which implies

combined with , to get

For the first term, we write

while for the second, we use the classical inequality to deduce

We use again to obtain:

which obviously implies the claimed result.

Conditioning of for large

The behavior of is more difficult to study for large . Nevertheless, we observed the following exponential decay: for any , and , the minimal eigenvalue of is bounded below by

Lower bounds from  and , and estimated minimal eigenvalue of A in function of \gamma for P=50 (blue circles), P=150 (red crosses), P=500 (yellow squares) and P=1500 (purple diamonds). For P=50, the computed minimal eigenvalue is negative due to numerical errors from \gamma \approx 7.5 so corresponding data cannot be displayed. The horizontal dashed line shows machine precision

5 Application to Laplace and Helmholtz kernels

5.1 Laplace kernel

When solving PDEs involving the Laplace operator (for example in heat conduction or electrostatics problems), one is led to with the Laplace kernel (we have dropped the constant for simplicity). Here we show that its SBD converges exponentially fast:

We show this by exhibiting an extension for which we are able to estimate the remainder of the Fourier-Bessel series. For any , let us define extensions of as

where the coefficients are chosen so that has continuous derivatives up to the order :

Notice that the term ensures the boundedness of near the origin. Also observe that for all :

since in a neighborhood of . We now go through some rather tedious computations to provide a crude bound for in terms of the coefficients .

For , we have

This result is obtained by expanding the sum in the definition of and using the fact that . Hence, using the triangle inequality,

For , we apply the following (crude) inequality:

to obtain:

Since , the last sum is bounded by , while

follows from Stirling’s formula.

We are now able to prove the following, which implies .

Let ; by Leibniz’s formula,

This leads to

where we have used the identity

Observe that

and thus,

Combining (Equation 14) with estimate ( ?), we find that there exists a constant such that, for

Therefore, integrating on , we get

and since

for , the same bound applies to . We now plug this estimate into the inequality of corollary ?, to get

The previous inequality holds true for any integer such that and any . Without loss of generality, one can assume that . In this case, let , and . Using the fact that is bounded on , we get

Figure 3

5.2 Helmholtz kernel

Let denote the classical Bessel function of the second kind and order . For any , the Helmholtz kernel , where , is the fundamental Green’s kernel associated with the harmonic wave operator that satisfies a Sommerfeld radiation condition at infinity (see for example [25]). This kernel arises in various physical problems, such as the scattering of sound waves. To approximate as a sum of dilated functions, it is sufficient to produce an SBD decomposition of . We can obtain good approximations of in series of dilated functions in the following way:

  • When is a root of :

    In this case the multi-Dirichlet condition is satisfied at any order. Indeed, for any ,

    We thus produce an SBD decomposition of on an interval . Just as for the Laplace kernel, we observed that the approximation error converges exponentially fast to zero as soon as is greater than .

  • When is close to, or greater than, the first root of :

    We find a Sparse Bessel Decomposition of on , where is the first root of larger than . This provides a decomposition of valid on .

  • When is much smaller than the first root of :

    The previous idea might lead to unnecessary effort. Indeed, to ensure that is small enough, one would have to choose a very small value of , leading to a very long Bessel series. Instead, one can use the Fourier-Bessel series associated with the Robin condition (see ):

    noticing that in this region.

Figure 4: SBD of Y_0(kr) with k a root of Y_0 approximately equal to 7.086, with P=10 terms and a=0.05

5.3 General kernels: enforcing the multi-Dirichlet condition

For general kernels , the multi-Dirichlet conditions may not be fulfilled, even after rescaling. Applying the SBD method without any changes then leads to the situation observed in the top panel of . In this figure, the SBD is applied to the kernel (note that ), and we plot the error as a function of for several values of and . The error curve stagnates near . In the bottom panel, we apply the SBD to , where the last term enforces the Dirichlet condition, and the error decay is improved.

G_1(r) = \log(r) + \sin(r)
G_2(r) = \log(r) + \sin(r) - \frac{1}{4}\Delta G_1(1)r^2

More generally, for any radial , we can apply the SBD to a modified function , where is chosen to enforce the multi-Dirichlet condition. Since we eventually want a decomposition of as a sum of Bessel functions, we propose to choose of the form

for some that are not roots of . It is not the aim of this paragraph to describe a systematic way of choosing , as this work is still in progress. However, when the first few iterates of the Laplace operator on are known, we suggest the following choice: let be the square root of the average ratio between two successive iterates, and choose as the root of that is closest to . Then assign , successively to the closest roots of . The coefficients are finally found by inverting a small linear system

where is the vector given by

and with

In , we show the efficiency of this method by applying the SBD with terms to a highly oscillatory function (), for , and computing the maximal error of the decomposition as a function of . The frequencies are the roots of closest to .
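A minimal sketch of the correction term, with an illustrative kernel and illustrative (non-root) frequencies: since the radial Laplacian maps J0(μr) to −μ²J0(μr), enforcing the multi-Dirichlet condition of order 2 amounts to a small linear system for the coefficients β.

```python
import numpy as np
from scipy.special import j0

# illustrative kernel and its radial Laplacian (Delta f = f'' + f'/r);
# note that Delta log = 0 in dimension 2
G = lambda r: np.log(r) + np.sin(r)
lap_G = lambda r: -np.sin(r) + np.cos(r) / r

mu = np.array([1.0, 4.0])     # illustrative, not roots of J0 (2.405, 5.520, ...)
# enforce sum_j beta_j (-mu_j^2)^n J0(mu_j) = (Delta^n G)(1) for n = 0, 1
Amat = np.array([j0(mu), -mu**2 * j0(mu)])
beta = np.linalg.solve(Amat, np.array([G(1.0), lap_G(1.0)]))

H = lambda r: j0(np.multiply.outer(r, mu)) @ beta
G_mod = lambda r: G(r) - H(r)   # satisfies G_mod(1) = (Delta G_mod)(1) = 0

# finite-difference check of the radial Laplacian at r = 1
h = 1e-5
lap = lambda F, r: (F(r + h) - 2 * F(r) + F(r - h)) / h**2 \
                   + (F(r + h) - F(r - h)) / (2 * h * r)
```

The subtracted term is itself a sum of Bessel functions, so the final decomposition of the modified kernel remains of the desired form.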

Figure 5: Maximal error between the kernel G_1(r) = \log(r) + \sin(250r) and its 100-term SBD as a function of \gamma, using the method described in this paragraph, for several values of n.

6 Circular quadrature

In this section, we study approximations of the form

for some integer and some quadrature points .

6.1 Theoretical bound

In order to prove this proposition, we first prove a result on Fourier series.

Since is , it is equal to its Fourier series, which converges normally:

Using this expression, we obtain