A unified integral equation scheme for doubly-periodic Laplace and Stokes boundary value problems in two dimensions

# A unified integral equation scheme for doubly-periodic Laplace and Stokes boundary value problems in two dimensions

Alex Barnett Department of Mathematics, Dartmouth College, and Center for Computational Biology, Flatiron Institute    Gary Marple Department of Mathematics, University of Michigan    Shravan Veerapaneni22footnotemark: 2    Lin Zhao INTECH, Princeton, NJ
July 4, 2019
###### Abstract

We present a spectrally-accurate scheme to turn a boundary integral formulation for an elliptic PDE on a single unit cell geometry into one for the fully periodic problem. Applications include computing the effective permeability of composite media (homogenization), and microfluidic chip design. Our basic idea is to exploit a small least squares solve to apply periodicity without ever handling periodic Green’s functions. We exhibit fast solvers for the two-dimensional (2D) doubly-periodic Neumann Laplace problem (flow around insulators), and Stokes non-slip fluid flow problem, that for inclusions with smooth boundaries achieve 12-digit accuracy, and can handle thousands of inclusions per unit cell. We split the infinite sum over the lattice of images into a directly-summed “near” part plus a small number of auxiliary sources which represent the (smooth) remaining “far” contribution. Applying physical boundary conditions on the unit cell walls gives an expanded linear system, which, after a rank-1 or rank-3 correction and a Schur complement, leaves a well-conditioned square system which can be solved iteratively using fast multipole acceleration plus a low-rank term. We are rather explicit about the consistency and nullspaces of both the continuous and discretized problems. The scheme is simple (no lattice sums, Ewald methods, nor particle meshes are required), allows adaptivity, and is essentially dimension- and PDE-independent, so would generalize without fuss to 3D and to other non-oscillatory elliptic problems such as elastostatics. We incorporate recently developed spectral quadratures that accurately handle close-to-touching geometries. We include many numerical examples, and provide a software implementation.

## 1 Introduction

Periodic boundary value problems (BVPs) arise frequently in engineering and the sciences, either when modeling the behavior of solid or fluid media with true periodic geometry, or when applying artificial periodic boundary conditions to simulate a representative domain of a random medium or particulate flow (often called a super-cell or representative volume element simulation). The macroscopic response of a given microscopic periodic composite medium can often be summarized by an effective material property (e.g. a conductivity or permeability tensor), a fact placed on a rigorous footing by the field of homogenization (for a review see [14]). However, with the exception of layered media that vary in only one dimension, finding this tensor requires the numerical solution of “cell problems” [60, 64], namely BVPs in which the solution is periodic up to some additive constant expressing the macroscopic driving. Application areas span all of the major elliptic PDEs, including the Laplace equation (thermal/electrical conductivity, electrostatics and magnetostatics of composites [35, 38, 27, 12]); the Stokes equations (porous flow in periodic solids [18, 49, 73, 26], sedimentation [1], mobility [67], transport by cilia carpets [16], vesicle dynamics in microfluidic flows [57]); elastostatics (microstructured periodic or random composites [29, 36, 59, 62]); and the Helmholtz and Maxwell equations (phononic and photonic crystals, bandgap materials [41, 63]). In this work we focus on the first two (non-oscillatory) PDEs above, noting that the methods that we present also apply with minor changes to the oscillatory Helmholtz and Maxwell cases, at least up to moderate frequencies [5, 13, 55].

The accurate solution of periodic BVPs has remained a challenging problem at the forefront of analytical and computational progress for well over a century [9]. Modern simulations may demand large numbers of objects per unit cell, with arbitrary geometries, that at high volume fractions may approach arbitrarily close to each other [70]. In such regimes asymptotic methods based upon expansion in the inclusion size do not apply [18, 26]. Polydisperse suspensions or nonsmooth geometries may require spatially adaptive discretizations. In microfluidics, high aspect ratio and/or skew unit cells are needed, for instance in the optimal design of particle sorters. Furthermore, to obtain accurate average material properties for random media or suspensions via the Monte Carlo method, thousands of simulation runs may be needed [38]. A variety of numerical methods are used (see [60, Sec. 2.8]), including particular solutions (starting in 1892 with Rayleigh’s method for cylinders and spheres [68], and, more recently, in Stokes flow [69]), eigenfunction expansions [73], lattice Boltzmann and finite differencing [47], and finite element methods [29, 59]. However, for general geometries, it becomes very hard to achieve high-order accuracy with any of the above apart from finite elements, and the cost of meshing renders the latter unattractive when moving geometries are involved. Integral equation methods [48, 65, 44, 54] are natural, since the PDE has piecewise-constant coefficients, and are very popular [49, 36, 27, 26, 62, 12, 1, 67]. By using potential theory to represent the solution field in terms of the convolution of a free-space Green’s function with an unknown “density” function living only on material boundaries, one vastly reduces the number of discretized unknowns. The linear system resulting by applying boundary conditions takes the form

 Aτ=f , \hb@xt@.01(1.1)

where the matrix is the discretization of an integral operator. The latter is often of Fredholm second-kind, hence remains well-conditioned, independent of , so that iterative methods converge rapidly. Further advantages include their high-order or spectral accuracy (e.g. via Nyström quadratures), the ease of adaptivity on the boundary, and the existence of fast algorithms to apply with optimal cost, such as the fast multipole method (FMM) [28].

The traditional integral equation approach to periodic problems replaces the free-space Green’s function by a periodic Green’s function, so that the density is solved on only the geometry lying in a single unit cell [36, 27, 26, 62, 12]. There is an extensive literature on the evaluation of such periodic Green’s functions (e.g. in the Stokes case see [66, 72]); yet any such pointwise evaluation for each source-target pair leads to complexity, which is unacceptable for large problems. There are two popular approaches to addressing this in a way compatible with fast algorithms:

• Lattice sums. Noticing that the difference between the periodic and free-space Green’s function is a smooth PDE solution in the unit cell, one expands this in a particular solution basis (a cylindrical or spherical expansion); the resulting coefficients, which need to be computed only once for a given unit cell, are called lattice sums [7, 35, 27, 26]. They originate in the work of Rayleigh [68] and in the study of ionic crystals (both reviewed in [9, Ch. 2–3]). The FMM may then be periodized by combining the top-level multipole expansion coefficients with the lattice sums to give a correction to the local expansion coefficients (in 2D this is a discrete convolution) which may be applied fast [28, Sec. 4.1] [62]. This method has been used for the doubly-periodic Laplace BVP by Greengard–Moura [27], and Stokes BVP by Greengard–Kropinski [26].

• Particle-mesh Ewald (PME) methods. These methods exploit Ewald’s realization [20] that, although both the spatial and the spectral (Fourier) sums for the periodic Green’s function in general converge slowly, there is an analytic splitting into spatial and spectral parts such that both terms converge superalgebraically. Numerically, the spatial part is now local, while the spectral part may be applied by “smearing” onto a uniform grid, using a pair of FFTs, and (as in the non-uniform FFT [19]) correcting for the smearing. The result is a fast algorithm; for a review see [15]. Recently this has been improved by Lindbo–Tornberg to achieve overall spectral accuracy in the Laplace [52] and Stokes [50] settings, with applications to 3D fluid suspensions [1].

The scheme that we present is based purely on free-space Green’s functions, and has distinct advantages over the above two periodization approaches. 1) Only physically-meaningful boundary conditions and compatibility conditions are used. Aside from conceptual and algebraic simplification, this also removes the need for (often mysterious and subtle [9]) choices of the values of various conditionally or even non-convergent lattice sums based on physical arguments [18, 28, 36, 27] (e.g., six such choices are needed in [26]). When it comes to skewed unit cells combined with applied pressure driving, such hand-picked choices in lattice sum methods become far from obvious. 2) Free-space FMM codes may be inserted without modification, and fast direct solvers (based on hierarchical compression of matrix inverses) [31, 23] require only slight modification [22, 57]. 3) When multi-scale spatial features are present in the problem—commonly observed in practical applications such as polydisperse suspensions and complex microfludic chip geometries—PME codes become inefficient because of their need for uniform grids. In contrast, our algorithm retains the fast nature of the adaptive FMM in such cases. 4) In contrast to lattice sums and Ewald methods, which intrinsically rely on radially-symmetric expansions and kernels, our scheme can handle reasonably high aspect ratio unit cells with little penalty. 5) Since our Stokes formulation does not rely on complex variables (e.g. Sherman–Lauricella [26]), the whole scheme generalizes to 3D with no conceptual changes [56]. A disadvantage of our scheme is that the prefactor may be slightly worse than Ewald methods, due to the direct summation of neighboring images.

The basic idea is simple, and applies to integral representations for a variety of elliptic PDEs. Let be a (generally trapezoidal) unit cell “box”, and let be the material boundary which is to be periodized ( may even intersect the walls of ). Applying periodic boundary conditions is equivalent to summing the free-space representation on over the infinite lattice, as in Fig. LABEL:f:geom(a). Our scheme sums this potential representation over only the nearest neighbor images of (see Fig. LABEL:f:geom(b)), but adds a small auxiliary basis of smooth PDE solutions in , with coefficient vector , that efficiently represents the (distant) contribution of the rest of the infinite image lattice. For this auxiliary basis we use point sources (Fig. LABEL:f:geom(c)); around are needed, a small number which, crucially, is independent of and the boundary complexity. We apply the usual (homogeneous) boundary condition on , which forms the first block row of a linear system. We impose the desired physical periodicity as auxiliary conditions on the Cauchy data differences between wall pairs and (see Fig. LABEL:f:geom(a)), giving a second block row. The result is a block “extended linear system” (ELS),

 [ABCQ][τξ]=[0g] . \hb@xt@.01(1.2)

Here accounts for the applied macroscopic thermal or pressure gradient in the form of prescribed jumps across one unit cell. Fig. LABEL:f:geom(b)–(c) sketches the interactions described by the four operators , , , and . This linear system is generally rectangular and highly ill-conditioned, but when solved in a backward-stable least-squares sense can result in accuracies close to machine precision.

Three main routes to a solution of (LABEL:ELS) are clear. (a) The simplest is to use a dense direct least-squares solve (e.g., via QR), but the cost becomes impractical for large problems with . Instead, to create fast algorithms one exploits the fact that the numbers of auxiliary rows and columns in (LABEL:ELS) are both small, as follows. (b) One may attempt to eliminate to get the Schur complement square linear system involving , the periodized version of , of the form

 A\tiny perτ:=(A−BQ+C)τ=−BQ+g , \hb@xt@.01(1.3)

where is a pseudoinverse of (see Section LABEL:s:schur). As we will see, (LABEL:Aper) can be well-conditioned when (LABEL:A) is, and can be solved iteratively, by using the FMM to apply while applying the second term in its low-rank factored form . (c) One may instead eliminate by forming, then applying, a compressed representation of via a fast direct solver; this has proven useful for the case of fixed periodic geometry with multiple boundary data vectors [22, 57]. Both methods (b) and (c) can achieve complexity.

This paper explores route (b). A key contribution is overcoming the stumbling block that, for many standard PDEs and integral representations, as written in (LABEL:Aper) does not in fact exist. Intuitively, this is due to divergence in the sum of the Green’s function over the lattice. For instance, the sum of clearly diverges, and thus the single-layer representation for Laplace’s equation, which we use in Section LABEL:s:lap, cannot be periodized unless the integral of density (total charge) vanishes. This manifests itself in the 2nd block row of (LABEL:ELS): the range of contains vectors that are not in the range of , thus numerically blows up. After studying in Sections LABEL:s:lapempty and LABEL:s:stoempty the consistency conditions for the linear system involving (the “empty unit cell” problem), we propose rank-1 (for Neumann Laplace) and rank-3 (for no-slip Stokes) corrections to the ELS that allow the Schur complement to be taken, and, moreover that remove the nullspace associated with the physical BVP. We justify these schemes rigorously, and test them numerically.

Although the idea of imposing periodicity via extra linear conditions has aided certain electromagnetics solvers for some decades [32, 75], we believe that this idea was first combined with integral equations by the first author and Greengard in [5], where controlled a Helmholtz local expansion. Since then it has become popular for a variety of PDE [5, 22, 30, 13, 12, 57, 55]. From [5] we also inherit the split into near and far images, and cancellations in that allow a rapidly convergent scheme even when intersects unit cell walls. The use of point sources as a particular solution basis that is efficient for smooth solutions is known as the “method of fundamental solutions” (MFS) [8, 21], “method of auxiliary sources” [45], “charge simulation method” [43], or, in fast solvers, “equivalent source” [11] or “proxy” [58] representations. This is also used in the recent 3D periodization scheme of Gumerov–Duraiswami [30]. Finally, the low-rank perturbations which enlarge the range of are inspired by the low-rank perturbation methods for singular square systems of Sifuentes et al. [71].

Here is a guide to the rest of this paper. In Section LABEL:s:lap we present the periodic Neumann Laplace BVP, possibly the simplest application of the scheme. This first requires understanding and numerically solving the “empty unit cell” subproblem, which we do in Section LABEL:s:lapempty. The integral formulation, discretization, and direct numerical solution of the ELS (LABEL:ELS) follows in Section LABEL:s:lapbie. A general scheme to stably take the Schur complement is given in Section LABEL:s:schur, along with a scheme to remove the physical nullspace specific to our representation. The latter is tested numerically. Section LABEL:s:multi shows how the FMM and close-evaluation quadratures are incorporated, and applies them to large problems with thousands of inclusions. Section LABEL:s:kappa defines the effective conductivity tensor and shows how to evaluate it efficiently using a pair of BVP solutions. In Section LABEL:s:sto we move to periodic Dirichlet (no-slip) Stokes flow, and show how its periodizing scheme closely parallels the Laplace version. In fact, the only differences are new consistency conditions for the empty subproblem (Section LABEL:s:stoempty), the use of a combined-field formulation (Section LABEL:s:stobie), and the need for a rank-three perturbation for a stable Schur complement (Section LABEL:s:stosch). Experiments on the drag of a square array of discs, and on large-scale flow problems, are performed in Section LABEL:s:stonum. We discuss generalizations and conclude in Section LABEL:s:conc.

Our aim is to illustrate, with two example BVPs, a unified route to a well-conditioned periodization compatible with fast algorithms. We take care to state the consistency conditions and nullspaces of the full and empty problems, this being the main area of problem-specific variation. We believe that the approach adapts simply to other boundary conditions and PDEs, once the corresponding consistency conditions and nullspaces are laid out. Although general aspect ratios and skew unit cells are amongst our motivations, for clarity we stick to the unit square; the generalization is straightforward.

###### Remark 1.1 (Software)

We maintain documented MATLAB codes implementing the basic methods in this paper at http://github.com/ahbarnett/BIE2D

## 2 The Neumann Laplace case

We now present the heat/electrical conduction problem in the exterior of a periodic lattice of insulating inclusions (corresponding to in [27]). For simplicity we first assume a single inclusion per unit cell. Let and be vectors defining the lattice in ; we work with the unit square so that and . Let represent the infinite lattice of inclusions. The scalar , representing electric potential or temperature, solves the BVP

 Δu = 0 in R2∖¯¯¯¯¯¯¯ΩΛ \hb@xt@.01(2.1) un = 0 on ∂ΩΛ \hb@xt@.01(2.2) u(x+e1)−u(x) = p1for all x∈R2∖¯¯¯¯¯¯¯ΩΛ \hb@xt@.01(2.3) u(x+e2)−u(x) = p2for all x∈R2∖¯¯¯¯¯¯¯ΩΛ , \hb@xt@.01(2.4)

ie, is harmonic, has zero flux on inclusion boundaries, and is periodic up to a given pair of constants which encode the external (macroscopic) driving. We use the abbreviation , where is the unit normal on the relevant evaluation curve.

###### Proposition 2.1

For each the solution to (LABEL:lap)–(LABEL:p2) is unique up to an additive constant.

Proof. As usual, one considers the homogeneous BVP arising when is the difference of two solutions. Let be any unit cell (tiling domain) containing , then using Green’s first identity,

 0=∫\@fontswitchB∖¯¯¯ΩuΔu=−∫\@fontswitchB∖¯¯¯Ω|∇u|2+∫∂\@fontswitchBuun−∫∂Ωuun .

The first boundary term cancels by periodicity, the second by (LABEL:bc), hence .

To solve the periodic BVP we first re-express it as a BVP on a single unit cell with coupled boundary values on the four walls comprising its boundary ; see Fig. LABEL:f:geom(a). For simplicity we assume for now that the square can be chosen such that does not intersect any of the walls; this restriction will later be lifted (see Remark LABEL:r:intersect). We use the notation to mean the restriction of to the wall , and for its normal derivative using the normal on (note that this points to the right, as shown in the figure.) Consider the following reformulated BVP:

 Δu = 0 in \@fontswitchB∖¯¯¯¯Ω \hb@xt@.01(2.5) un = 0 on ∂Ω \hb@xt@.01(2.6) uR−uL = p1 \hb@xt@.01(2.7) unR−unL = 0 \hb@xt@.01(2.8) uU−uD = p2 \hb@xt@.01(2.9) unU−unD = 0 . \hb@xt@.01(2.10)

Clearly any solution to (LABEL:lap)–(LABEL:p2) also satisfies (LABEL:lap2)–(LABEL:j2). Because of the unique continuation of Cauchy data as a solution to the 2nd-order PDE, the converse holds, thus the two BVPs are equivalent. We define the discrepancy [5] of a solution as the stack of the four functions on the left-hand side of (LABEL:u1)–(LABEL:j2).

### 2.1 The empty unit cell discrepancy BVP and its numerical solution

We first analyze, then solve numerically, an important subproblem which we call the “empty unit cell BVP.” We seek a harmonic function matching a given discrepancy (i.e., a stack of four functions defined on the walls , , , , respectively). That is,

 Δv = 0 in \@fontswitchB \hb@xt@.01(2.11) vR−vL = g1 \hb@xt@.01(2.12) vnR−vnL = g2 \hb@xt@.01(2.13) vU−vD = g3 \hb@xt@.01(2.14) vnU−vnD = g4 . \hb@xt@.01(2.15)

We now give the consistency condition and nullspace for this BVP, which shows that it behaves essentially like a square linear system with nullity 1.

###### Proposition 2.2

A solution to (LABEL:lapv)–(LABEL:g4) exists if and only if , and is then unique up to a constant.

Proof. The zero-flux condition holds for harmonic functions. Writing as the union of the four walls, with their normal senses, gives the sum of the integrals of (LABEL:g2) and (LABEL:g4). Uniqueness follows from the method of proof of Prop. LABEL:p:lapnull.

We now describe a numerical solution method for this BVP that is accurate for a certain class of data , namely those for which the solution may be continued as a regular solution to Laplace’s equation into a large neighborhood of , and hence each of is analytic. It will turn out that this class is sufficient for our periodic scheme (essentially because will only have to represent distant image contributions, as shown in Fig. LABEL:f:geom(c)). Let be centered at the origin. Recalling the fundamental solution to Laplace’s equation,

 G(x,y)=12πlog1r ,r:=∥x−y∥ , \hb@xt@.01(2.16)

we approximate the solution in by a linear combination of such solutions with source points lying uniformly on a circle of radius , where is the maximum radius of the unit cell. That is, for ,

 v(x)≈M∑j=1ξjϕj(x) ,ϕj(x):=G(x,yj) ,yj:=(Rpcos2πj/M,Rpsin2πj/M) , \hb@xt@.01(2.17)

with unknown coefficient vector . As discussed in the introduction, this is known as the MFS. Each basis function is a particular solution to (LABEL:lapv) in , hence only the boundary conditions need enforcing (as in Rayleigh’s original method [68]).

This MFS representation is complete for harmonic functions. More precisely, for in a suitable class, it is capable of exponential accuracy uniformly in . In particular, since is contained within the ball , we may apply known convergence results to get the following.

###### Theorem 2.3

Let extend as a regular harmonic function throughout the closed ball of radius . Let the fixed proxy radius satisfy . For each let be a proxy basis set as in (LABEL:mfs). Then there is a sequence of coefficient vectors , one vector for each , and a constant dependent only on , such that

 ∥∥v−M∑j=1ξ(M)jϕ(M)j∥∥L∞(\@fontswitchB)≤C(ρr\@fontswitchB)−M/2 ,M=1,2,… .

In addition, the vectors may be chosen so that the sequence , , is bounded.

This exponential convergence—with rate controlled by the distance to the nearest singularity in —was derived by Katsurada [42, Rmk. 2.2] in the context of collocation (point matching) for a Dirichlet BVP on the boundary of the disc of radius , which is enough to guarantee the existence of such a coefficient vector sequence. The boundedness of the coefficient norms has been known in the context of Helmholtz scattering in the Russian literature for some time (see [46, Sec. 3.4] and references within, and [17, Thm. 2.4]). We do not know of a reference stating this for the Laplace case, but note that the proof is identical to that in [4, Thm. 6], by using equation (13) from that paper. The restriction is crucial for high accuracy, since when , although convergence occurs, blows up, causing accuracy loss due to catastrophic cancellation [4, Thm. 7].

###### Remark 2.4

Intuitively, the restriction arises because the proxy basis may be viewed as a periodic trapezoid rule quadrature approximation to a single-layer potential on the circle , a representation which is incomplete when : it cannot represent the constant function [74, Rmk. 1].

To enforce boundary conditions, let , , , be two sets of collocation points, on the left and bottom wall respectively; we use Gauss–Legendre nodes. Enforcing (LABEL:g1) between collocation points on the left and right walls then gives

 M∑j=1[ϕj(xiL+e1)−ϕj(xiL)]ξj=g1(xiL) , for all i=1,…,m . \hb@xt@.01(2.18)

Continuing in this way, the full set of discrepancy conditions (LABEL:g1)–(LABEL:g4) gives the linear system

 Qξ=g , \hb@xt@.01(2.19)

where stacks the four vectors of discrepancies sampled at the collocation points, while the matrix consists of four block rows, each of size . By reading from (LABEL:Q1), one sees that has elements . Analogously, , , and finally .

We solve the (generally rectangular) system (LABEL:Qsys) in the least squares sense, which corresponds to minimizing a weighted -norm of the discrepancy error (we find in practice that there is no advantage to incorporating the square-root of the weights associated with the collocation nodes). As is well known in the MFS community [42, 21], becomes exponentially ill-conditioned as grows; intuitively this follows from the exponential decay of the singular values of the single-layer operator from the proxy radius to the circle radius containing . Thus, a least squares solver is needed which can handle rank-deficiency stably, i.e. return a small-norm solution when one exists. In this case the ill-conditioning causes no loss of accuracy, and, even though there is instability in the vector , for the evaluation of errors close to machine precision () are achieved [4].

###### Remark 2.5

For this and subsequent direct solves we use MATLAB’s linsolve (with the option RECT=true to prevent the less accurate LU from being used in the square case), which uses column-pivoted QR to find the so-called “basic” solution [61, Sec. 5.7] having at most nonzero entries, where is the numerical rank; this is close to having minimum norm [25, Sec. 5.5].

We illustrate this with a simple numerical test, in which is the discrepancy of a known harmonic function of typical magnitude and with sufficiently distant singularity. Once is filled and (LABEL:Qsys) solved, the numerical solution is evaluated via (LABEL:mfs), and the maximum error at 100 random target points in is taken (after removing an overall constant error, expected from Prop. LABEL:p:lapemptynull). Fig. LABEL:f:Qconv(a) shows exponential convergence in this error vs the number of wall nodes . Fig. LABEL:f:Qconv(b) shows exponential convergence with respect to , the number of proxy points, with a rate slightly exceeding the predicted rate. It is clear that whenever and the norm remains , and around 15 digits of accuracy result.

The decaying lowest few singular values of are also shown in panel (b): in particular there is one singular value decaying faster than all others. It is easy to verify that this corresponds to the null-space of the BVP (Prop. LABEL:p:lapemptynull); indeed we have tested that its right singular vector is approximately constant, and generates via (LABEL:mfs) the constant function in to within relative error. Likewise, the consistency condition in Prop. LABEL:p:lapemptynull manifests itself in : let be the vector that applies the discretization of this consistency condition to a vector , namely

 \hb@xt@.01(2.20)

where semicolons indicate vertical stacking, are the vectors of weights corresponding to the collocation nodes on , , and is the zero vector. Then we expect that

 w\tiny TQ≈0\tiny TM , \hb@xt@.01(2.21)

and indeed observe numerically that once . In summary, although the matrix is generally rectangular and ill-conditioned, it also inherits both aspects of the unit nullity of the empty BVP that it discretizes.

###### Remark 2.6

A different scheme is possible in which would be square, and (modulo a nullity of one, as above) well-conditioned, based on a “tic-tac-toe” set of layer potentials (see [5, Sec. 4.2] in the Helmholtz case). However, we recommend the above proxy point version, since i) the matrix is so small that handling its ill-conditioning is very cheap, ii) the tic-tac-toe scheme demands close-evaluation quadratures for its layer potentials, and iii) the tic-tac-toe scheme is more complicated.

### 2.2 Extended linear system for the conduction problem

We now treat the above empty BVP solution scheme as a component in a scheme for the periodic BVP (LABEL:lap2)–(LABEL:j2). Simply put, we take standard potential theory for the Laplace equation [44, Ch. 6], and augment this by enforcing periodic boundary conditions. Given a density function on the inclusion boundary , recall that the single-layer potential, evaluated at a target point , is defined by

 v=(\@fontswitchS∂Ωτ)(x):=∫∂ΩG(x,y)τ(y)dsy=12π∫∂Ωlog1∥x−y∥τ(y)dsy ,x∈R2 . \hb@xt@.01(2.22)

Using to indicate the outward normal at , this potential obeys the jump relation [44, Thm. 6.18]

 \hb@xt@.01(2.23)

where denotes the usual transposed double-layer operator from a general source curve to a target curve , defined by

 (D\tiny TΓ′,Γτ)(x)=∫Γ∂G(x,y)∂nxτ(y)dsy=12π∫Γ−(x−y)⋅nx∥x−y∥2τ(y)dsy ,x∈Γ′ . \hb@xt@.01(2.24)

The integral implied by the self-interaction operator is to be interpreted in the principal value sense.

Our representation for the solution sums the single-layer potential over the copies closest to the origin, and adds an auxiliary basis as in (LABEL:mfs),

 u=\@fontswitchS\tiny near∂Ωτ+M∑j=1ξjϕj ,where (\@fontswitchS\tiny near∂Ωτ)(x):=∑m,n∈{−1,0,1}∫∂ΩG(x,y+me1+ne2)τ(y)dsy . \hb@xt@.01(2.25)

Our unknowns are the density and auxiliary vector . Substituting (LABEL:urep) into the Neumann boundary condition (LABEL:bc2), and using the exterior jump relation on the central copy () only, gives our first block row,

 (−\small12+D\tiny near,T∂Ω,∂Ω)τ+M∑j=1ξjϕj|∂Ω=0 , \hb@xt@.01(2.26)

where, as before, the “near” superscript denotes summation over source images as in (LABEL:urep).

The second block row arises as follows. Consider the substitution of (LABEL:urep) into the first discrepancy equation (LABEL:u1): there are nine source copies, each of which interacts with the and walls, giving 18 terms. However, the effect of the rightmost six sources on is cancelled by the effect of the leftmost six sources on , leaving only six terms, as in [5, Fig. 4(a)–(b)]. All of these surviving terms involve distant interactions (the distances exceed one period if is contained in ). Similar cancellations occur in the remaining three equations (LABEL:j1)–(LABEL:j2). The resulting four subblocks are

 ∑n∈{−1,0,1}(SR,∂Ω−e1+ne2−SL,∂Ω+e1+ne2)τ+M∑j=1(ϕj|R−ϕj|L)ξj = p1 \hb@xt@.01(2.27) ∑n∈{−1,0,1}(D\tiny TR,∂Ω−e1+ne2−D\tiny TL,∂Ω+e1+ne2)τ+M∑j=1(∂ϕj∂n∣∣R−∂ϕj∂n∣∣L)ξj = 0 \hb@xt@.01(2.28) ∑m∈{−1,0,1}(SU,∂Ω+me1−e2−SD,∂Ω+me1+e2)τ+M∑j=1(ϕj|U−ϕj|D)ξj = p2 \hb@xt@.01(2.29) ∑m∈{−1,0,1}(D\tiny TU,∂Ω+me1−e2−D\tiny TD,∂Ω+me1+e2)τ+M∑j=1(∂ϕj∂n∣∣U−∂ϕj∂n∣∣D)ξj = 0 . \hb@xt@.01(2.30)
###### Remark 2.7 (Wall intersection)

The cancellation of all near interactions in (LABEL:row2a)–(LABEL:row2d) is due to the neighbor summation in the representation (LABEL:urep). Furthermore this cancellation allows an accurate solution even when intersects , without the need for specialized quadratures, as long as remains far from the boundary of the enclosing unit cell block. Informally, the unit cell walls are “invisible” to the inclusions. With an elongated for which the last condition does not hold, with a little bookkeeping one could split into pieces that, after lattice translations, lie far from the boundary of the enclosing unit cell block; we leave this last case for future work.

(LABEL:bie)–(LABEL:row2d) form a set of coupled integral-algebraic equations, where the only discrete aspect is that of the proxy points.111It would also be possible to write a purely continuous version by replacing the proxy circle by a single-layer potential; however, sometimes an intrinsically discrete basis is useful [5, 13]. It is natural to ask how the unit nullity of the BVP manifests itself in the solution space for the pair . Are there pairs with no effect on , enlarging the nullspace? It turns out that the answer is no, and that the one-dimensional nullspace (constant functions) is spanned purely by , as a little potential theory now shows.

###### Lemma 2.8

In the solution to (LABEL:bie)–(LABEL:row2d), is unique.

Proof. Let be the difference between any two solutions to (LABEL:bie)–(LABEL:row2d). Let be the representation (LABEL:urep) using this , both in but also inside . Then by construction is a solution to the homogeneous BVP (LABEL:lap2)–(LABEL:j2), i.e. with , thus by Prop. LABEL:p:lapnull, is constant in . Thus by the continuity of the single-layer potential [44, Theorem 6.18], the interior limit of on is constant. However, is harmonic in , so is constant in . By (LABEL:JR), , but we have just shown that both and vanish, so .

#### 2.2.1 Discretization of the ELS

We discretize (LABEL:bie), the first row of the system, using a set of quadrature nodes on and weights such that

 ∫∂Ωf(y)dsy≈N∑i=1wif(xi)

holds to high accuracy for smooth functions . In practice, when is parametrized by a -periodic function , , then using the periodic trapezoid rule in gives and . Since has a smooth kernel, we apply Nyström discretization [44, Sec. 12.2] to the integral equation (LABEL:bie) using these nodes, to get our first block row

 Aτ+Bξ=0 , \hb@xt@.01(2.31)

where is a vector of density values at the nodes, and has entries

 Aij=−\small12δij+∑m,n∈{−1,0,1}∂G(xi,xj+me1+ne2)∂nxiwj , \hb@xt@.01(2.32)

where is the Kronecker delta. Here, for entries the standard diagonal limit of the kernel is needed, where is the signed curvature at . The matrix has entries .

For the second block row (LABEL:row2a)–(LABEL:row2d) we use the above quadrature for the source locations of the operators, and enforce the four equations on the wall collocation nodes , , , , , to get

 Cτ+Qξ=g , \hb@xt@.01(2.33)

with the macroscopic driving encoded by the right-hand side vector

 g=[p11m;0m;p21m;0m]∈R4m , \hb@xt@.01(2.34)

where is the vector of ones. The matrix is precisely as in (LABEL:Qsys). Rather than list formulae for all four blocks in , we have , with the others filled analogously. Stacking the two block rows (LABEL:row1) and (LABEL:row2) gives the -by- extended linear system (LABEL:ELS), which we emphasize is just a standard discretization of the BVP conditions (LABEL:bc2)–(LABEL:j2).

For small problems ( less than a few thousand), the ELS is most simply solved by standard dense direct methods; for larger problems a Schur complement must be taken in order to solve iteratively, as presented in Section LABEL:s:schur. First we perform a numerical test of the direct method.

### 2.3 Numerical tests using direct solution of the ELS

Example 1. We define a smooth “worm” shaped inclusion which crosses the unit cell walls and by . The solution of the periodic Neumann Laplace BVP with external driving is shown in Fig. LABEL:f:ELS(a). Here is evaluated via (LABEL:urep) both inside (where it is accurate), and, to show the nature of the representation, out to the proxy circle (where it is certainly inaccurate). Recall that the proxy sources must represent the layer potentials due to the infinite lattice of copies which excludes the central block. Note that two tips of copies in this set slightly penetrate the proxy circle, violating the condition in Thm. LABEL:t:Qconv that the function the proxy sources represent be analytic in the closed ball of radius . We find in practice that such slight geometric violations do not lead to problematic growth in , but that larger violations can induce a large which limits achievable accuracy. Panel (c) shows the convergence of errors in (at the pair of points shown) to their numerically-converged value, fixing converged values for and . Convergence to around 13 digits (for , ) is apparent, and the solution time is 0.03 seconds.

As an independent verification of the method, we construct a known solution to a slightly generalized version of (LABEL:lap2)–(LABEL:j2), where (LABEL:bc2) is replaced by inhomogeneous data and a general discrepancy is allowed. We choose the known solution

 u\tiny ex(x)=∑m,n∈{−2,−1,…,2}n0⋅∇G(x,y0+me1+ne2) ,

where the central dipole has direction , and location chosen inside and far from its boundary (so that the induced data is smooth). The grid size of (some of which is shown by symbols in Fig. LABEL:f:ELS(a)) is chosen so that is sufficiently smooth (which requires at least ), and so that the periodizing part is nontrivial. From the right-hand side functions and are then evaluated at nodes, the ELS solved directly, the numerical solution (LABEL:urep) evaluated, and the difference at two points compared to its known exact value. The resulting -convergence to 13 digits is shown in Fig. LABEL:f:ELS(c). The convergence rate is slower than before, due to the unavoidable closeness of the dipole source to .

Finally, the grey lines in panel (e) show that, as with , there is one singular value of which is much smaller than the others, reflecting the unit nullity of the underlying BVP (Prop. LABEL:p:lapnull). Also apparent is the fact that, despite the exponential ill-conditioning, the solution norm remains bounded once -convergence has occurred.

### 2.4 Schur complement system and its iterative solution

When is large, solving the ill-conditioned rectangular ELS (LABEL:ELS) is impractical. We would like to use a Schur complement in the style of (LABEL:Aper) to create an equivalent system, which we do in Sec. LABEL:s:equiv. Furthermore, in order to use Krylov subspace iterative methods with known convergence rates, we would like to remove the nullspace to make this well-conditioned, which we do in Sec. LABEL:s:wellcond. We will do both these tasks via low-rank perturbation of certain blocks of (LABEL:Aper) before applying the Schur complement.

In what follows, is the pseudoinverse [25, Sec. 5.5] of , i.e. the linear map that recovers a small-norm solution (if one exists) to via . The obstacle to using (LABEL:Aper) as written is that inherits a consistency condition from the empty BVP so that has one smooth vector in its left null-space (; see (LABEL:wQ)). However the range of does not respect this condition, thus has a huge 2-norm (which we find numerically is at least ). We first need to show that the range of a rank-deficient matrix may be enlarged by a rank- perturbation, a rectangular version of results about singular square matrices in [71].

###### Lemma 2.9

Let have a -dimensional nullspace. Let have full-rank projection onto , i.e. if has columns forming a basis for then is invertible. Let be arbitrary. Then , i.e. the range now includes that of .

Proof. We need to check that has a solution for all given pairs , . Recalling that is invertible, by substitution one may check that is an explicit such solution.

###### Remark 2.10

With additional conditions , and that has full-rank projection onto a part of (i.e. if has columns forming a basis for a -dimensional subspace of , then is invertible), we get incidentally that has trivial nullspace (generalizing [71, Sec. 3]). The proof is as follows. Let solve the homogeneous equation . Then , but is invertible, so . Thus the homogeneous equation becomes , which means there is an such that . Thus , but is invertible, so that , so .

We also need the fact that a block-column operation allows to be perturbed as above while changing the ELS solution space in a known way. The proof is simple to check.

###### Proposition 2.11

Let , , and be matrices, and let be a matrix with as many rows as has columns and as many columns as has rows. Then the pair solves the block system

 [AB+APCQ+CP][~τξ]=[0g] \hb@xt@.01(2.35)

if and only if the pair , with , solves the original ELS (LABEL:ELS).

#### 2.4.1 An equivalent square system preserving the nullspace

Armed with the above, a method to “fold” the Neumann Laplace ELS into an equivalent square system is as follows:

1. Set the proxy coefficient vector , and the discrete density vector . Create low-rank perturbed matrix blocks and .

2. Solve for the vector in the Schur complement linear system

 (A−~B~Q+C)~τ=−~B~Q+g . \hb@xt@.01(2.36)

More precisely, since (due to the numerical ill-conditioning in ) multiplying by would lose accuracy due to rounding error, instead solve the small systems for , and for , heeding Remark LABEL:r:linsolve. From them, build and , which are respectively the system matrix and right-hand side for (LABEL:Gschur). This large square system may then be solved iteratively (see section LABEL:s:multi).

3. Recover the proxy coefficients via .

4. Recover the density via .

Note that in Step 1 the prefactor leads to correct quadrature scaling, so that has similar 2-norm to the other matrices (also recommended in [71]).

###### Theorem 2.12

Let encode the driving as in (LABEL:gg). Then for all sufficiently large , , and , any pair produced by the above procedure performed in exact arithmetic solves the original ELS (LABEL:ELS).

Proof. By rotational invariance, a constant single-layer density on a circle generates a constant potential inside, and inserting into (LABEL:mfs) gives the periodic trapezoid quadrature approximation to such a potential, thus generates discrepancy near zero (in fact exponentially small in ). Thus for all sufficiently large , is not orthogonal to . Applying Lemma LABEL:l:onesmatrix with gives that the range of includes the discrepancy vector produced by the constant density . Thus, to show that the range of includes all smooth vectors, i.e. that it does not have the consistency condition (LABEL:wQ), one needs to check that which is done in Lemma LABEL:l:onesdens below. Thus the systems and are consistent for any and , so that the Schur complement is well defined. Finally, Prop. LABEL:p:gary, using the rank-1 matrix , insures that step 4 recovers a solution to (LABEL:ELS).

For the missing technical lemma, we first need a form of Gauss’ law, stating that for any density on a curve the single-layer potential generates flux equal to the total charge, i.e.

 ∫∂\@fontswitchKvn=∫∂Ωτds \hb@xt@.01(2.37)

where is the boundary of some open domain containing . This may be proved by combining the jump relation (from (LABEL:JR)) with the fact that, since is a Laplace solution in , taken over the boundaries of and of .

###### Lemma 2.13

Let the matrix be defined as in Section LABEL:s:lapbie, and be as in (LABEL:w). Then, for all and sufficiently large, .

Proof. Let be the discrepancy of the potential for density . As used in the proof of Prop. LABEL:p:lapemptynull, the flux (left hand side of (LABEL:Gauss)) out of the unit cell equals , which by (LABEL:Gauss) (and noting that only one of the nine source terms in lies within ) equals the perimeter . What we seek is the discretization of this statement about the PDE. If is the discrepancy of sampled at the wall nodes, then its quadrature approximation is , and so (as discussed above (LABEL:w)) the quadrature approximation to is . Thus, as and tend to infinity, the latter converges to .

Theorem LABEL:t:gary justifies rigorously one procedure to create a square system equivalent to the ELS (LABEL:ELS). However, by the equivalence in Prop. LABEL:p:gary, the system matrix at the heart of the procedure is singular, as it inherits the unit nullity of the ELS, which itself derives from the unit nullity of the Laplace Neumann BVP. Since the convergence of iterative solvers for singular matrices is a subtle matter [10], this motivates the following improved variant which removes the nullspace.

#### 2.4.2 A well-conditioned square system

The following simpler variant creates a non-singular square system from the ELS; its proof is more subtle. It is what we recommend for the iterative solution of the periodic Neumann Laplace problem, and test numerically:

1. Set the constant proxy coefficient vector , the constant discrete density vector , and .

2. Solve for the vector in the Schur complement linear system

 ~A\tiny perτ:=(A−B~Q+C)τ=−B~Q+g , \hb@xt@.01(2.38)

where, as before, one solves the small systems and , to build the large system matrix and right-hand side for (LABEL:pschur), which may then be solved iteratively (see section LABEL:s:multi).

3. Recover the proxy coefficients via