A Lowrank Schwarz Method for Radiative Transport Equation with Heterogeneous Scattering Coefficient^{†}^{†}thanks: The work of JL is supported in part by the National Science Foundation via grant DMS1454939. The work of KC, QL, and SW is supported in part by the National Science Foundation via grant 1740707. The work of SW is further supported in part by National Science Foundation grants 1447449, 1628384, and 1634597; Subcontract 8F30039 from Argonne National Laboratory; and Award N660011824020 from the DARPA Lagrange Program. The work of KC and QL is further supported in part by Wisconsin Data Science Initiative and National Science Foundation via grant DMS1750488, and DMS1107291: RNMS KINet.
Abstract
Random sampling has been used to find lowrank structure and to build fast direct solvers for multiscale partial differential equations of various types. In this work, we design an accelerated Schwarz method for radiative transfer equations that makes use of approximate local solution maps constructed offline via a random sampling strategy. Numerical examples demonstrate the accuracy, robustness, and efficiency of the proposed approach.
andom sampling, Schwarz method, heterogeneous media, radiative transfer equation
65N
1 Introduction
The radiative transfer equation (RTE) is a standard model that describes propagation of light through such turbid media as biological tissues or planetary atmospheres. The equation is used in situations in which energy is transported by light, as in the study of the greenhouse effect [3], optical tomography [21], and the radiation field for atmosphereocean system [27]. Light is injected from a source, and RTE models the absorption and scattering of the photons in the ambient material.
The model equation for the steady state is
(1) 
where describes the light intensity at location oriented in velocity direction . The lefthand side describes free propagation of the photons along direction with velocity , while the righthand side characterizes interaction between photons and media (via absorption and scattering). The media information is encoded in , which is strictly positive for all . The operator , typically an integral operator, characterizes how photons are scattered and change directions. We denote the physical domain by . Since photons always move with the same speed, the velocity term is determined purely by the direction, so that , the unit sphere in dimensions.
In the large spaceregime, with scaling (where is a small parameter discussed below), the equation (1) becomes
(2) 
where is the rescaled media function, with capturing the smallest scale of the variation in the media. This function is rough when . In the equation (2), is the Knudsen number that represents the ratio of the mean free path to the typical domain length.
With appropriate boundary conditions, wellposedness of the equation is straightforward, and is independent of the scales (that is, the smallness of or ). In this paper, we tackle the numerical challenge of designing a numerical solver that is efficient and accurate across all regimes, including regimes in which the parameters and are small.
1.1 Asymptotic preserving
Small parameters in PDEs can make computations challenging. In the case described above, we have , so a classical numerical solver can be expected to attain good accuracy only when the mesh size in the discretization satisfies
A grid in dimensions with this discretization parameter will have at grid points, so computation will be infeasible when and are small.
A natural question is whether it is possible to design a numerical method for which the computational cost of obtaining a stable, accurate solution is independent of the parameters and , and whether the numerical solution can capture the right asymptotic limit of the solution as and approach zero. If a numerical solver for a multiscale problem has its discretization independent of the smallest scale in the equation, but still preserves the asymptotic limits, then the solver is called asymptoticpreserving (AP). This term was coined in [19] for a class of kinetic equations, although some algorithms for simpler settings had been designed previously [22]. Extensive progress has been made during the past decade, with AP solvers being designed for the BGK equation, the Boltzmann equation, the VlasovPoissonBoltzmann (VPB) equation, and many others [10, 9, 18]. See also the reviews [20].
A standard approach for designing AP solvers is based on analysis of the asymptotic limits. In some cases, asymptotic limits for the equations can be derived: the Euler limit for the Boltzmann, the coupled diffusionPoisson system for the VPB system. One strategy for AP is then to work with two sets of solvers, one for the original equation and one for the asymptotic limit, the latter being encoded in the former via a weight that can be tuned. In the limit as , this weight is adjusted so that the limiting equation dominates, driving the numerical solution to that of the asymptotic limiting system.
The analysisbased approach is straightforward and mathematically sound, and has made some previously impossible computations feasible. It depends, however, on analytical understanding of the asymptotic limit, which is not always straightforward. We are led to ask whether it is possible to design an AP solver that does not require detailed knowledge of the asymptotic limit. This paper addresses this question in the specific case of RTE. This equation is complicated in that different patterns of convergence of the pair to lead to different limiting systems, not all of which are well understood. Can we design AP numerical solvers in the absence of this analytical understanding? We outline an answer to this question in the next section.
1.2 Random Sampling and PDE Compression
We design AP solvers without analytical knowledge by using compression techniques. Even when asymptotic limits of equations with small parameters are difficult to derive analytically, we can sometimes show the existence of such limits. In the discrete setting, a basic discretization scheme with grid points will suffice to attain accuracy level . When a limiting equation exists, this accuracy level may be attainable with as few as grid points. Since the limiting equation is asymptotically close to the original equation, these degrees of freedom are asymptotically sufficient to represent the original PDE solution that naively would require grid points to compute. This observation implies that the dimensional solution space is compressible, and can be well approximated by a dimensional space when and are small.
Knowing that the space is “compressible”, can one find the compressed space quickly? We answer this question affirmatively, in the case of the RTE, by making use of random sampling. Random sampling is not a new strategy. It has been used in data science to sample sparse vectors (as in compressed sensing [6]) and lowrank matrices [16], with the goal of reconstructing these objects from a relatively small number of samples. Generally, the number of samples is tied more closely to the intrinsic dimension of the object (for example, the number of nonzeros in a sparse vector) than to the dimension of the ambient space, which is typically much larger.
Application of random sampling techniques to PDEs has been limited previously to the discrete algebraic system obtain by discretizing the PDE. A direct link to the original PDEs needs to be explored further. Some important questions have not been fully resolved, for example, whether the PDE solution space or the solution operator is “compressible”. The two views correspond to regarding a matrix as defining a column space or as a linear operator, respectively. Other questions involve the sense in which these objects are “compressible”, and whether the spectral norm used for matrices is the appropriate norm in the case of PDEs.
Previously, several algorithms that make use of random sampling ideas have been proposed, mostly in the context of elliptic PDEs [28, 17, 8, 25, 26, 5, 11]. A more systematic investigation of PDE compression appears recently in our previous work [7], where compressed PDE solution spaces are related to low rank structure of the matrix formed by the Green’s functions. Such concepts in multiscale PDE computation as asymptoticpreserving (see above) and numerical homogenization are unified under this framework.
1.3 Contribution
This paper follows the line of research started in [7]. For RTE (2), we know only that the equation has asymptotic limits with small parameters, but the actual forms of the limiting equations are unknown. We aim to design an accurate numerical scheme whose runtime is independent of the smallness of the coefficients in the equation.
We apply the Schwarz iteration under the domain decomposition framework. The domain is divided into overlapping subdomains (patches). The PDEs in these patches can be solved in parallel. Solution of the PDE on each patch with partial boundary conditions yields an output in the form of boundary conditions that are passed to neighboring patches. The PDE on each patch is solved again with the modified boundary conditions supplied by its neighbors, the whole process repeating until the solutions are consistent in the overlapping regions. The boundarytoboundary map, in which the inputs are the partial boundary conditions on the patch PDEs and the output are the missing boundary conditions obtained by solving the PDEs, is a compressible map. We will develop an algorithm based on random sampling that computes an adequate approximation to this map quickly. The overall scheme is a composition of an offline component, in which lowrank approximations to the boundarytoboundary maps are obtained using random sampling; and an online step, in which Schwarz iteration, accelerated by the lowrank boundarytoboundary map, is executed until a solution consistent across the whole domain is found.
Our work contrasts with the approach in [7], where the local solution space is compressed in an offline step. In the online step, a solution for particular boundary conditions or source term is found as a linear combination of basis vectors for the compressed space, with the coefficients chosen to match the given conditions. The problem of finding these coefficients is typically overdetermined, the number of coefficients being fewer than the constraints arising from the boundary conditions or source term. Some accuracy is sacrificed, and the error is difficult to quantify. The current work compresses the boundarytoboundary map in the offline stage, rather than the local solution space, and uses the compressed map to update local boundary conditions in the online stage, until a preset error tolerance is achieved.
The remainder of the paper is organized as follows. We introduce the concept of “lowrankness” in the context of the RTE in Section 2. In Section 3, we review the Schwarz iteration under the domain decomposition framework, and present the new lowrank Schwarz iteration method based on random sampling. Numerical experience is described in Section 4.
2 Lowrankness of RTE in Various Regimes
As discussed above, current AP schemes rely heavily on good understanding of the analytical form of the asymptotic limits, although in some situations, this limiting form is hard to specify, even when we know that it exists. The radiative transfer equation with small Knudsen number and small media oscillation period is a good example of the latter phenomenon. As and converge to in different ways, the limiting equations are different, and only some of the limiting forms can be made explicit. We show two different homogenization effects in the following two subsections, and unify them using the concept of the lowrankness in Section 2.3.
Consider the RTE in infinite domain (2), which we restate here:
(3) 
where and . We define the scattering operator to have the following form:
where is the normalized measure on the velocity domain. The scattering coefficient encodes the media information, with denoting the smallest spatial scale. The operator has a nontrivial null space which consists of functions that are constant in the velocity domain. We use this fact later to formally derive the diffusion limit of RTE.
We now consider different limits for different regimes of the parameters .
2.1 Diffusion Regime
In the diffusion regime, we have while is fixed at a positive value. From (3), we see that in the leading order, meaning that belongs to the null space of , and loses its velocity dependence. By matching orders in the classical asymptotic expansion
we obtain
By substituting the equation into the equation, one obtains a diffusion equation, as follows. {theorem}[[4]] In the zero limit of , the solution to (3) converges to the solution to the diffusion equation:
(4) 
in the sense that
Remark \thetheorem
Note that we did not account for boundary conditions in deriving the limiting equation. In physical space, the derivation is valid when the boundary conditions are periodic. Otherwise, one has to be careful with the boundary influences and curvature effects. The diffusion limit still holds outside the boundary layers, but the convergence deteriorates when curvature corrections need to be taken into account. These results can be found in [15, 24] for the case when domain is convex.
2.2 Homogenization Regime
When is fixed at a positive value while , homogenization limits are achieved; see [12]. We assume a twoscale media, having dependence on a fast variable and a slow variable :
where is assumed to be periodic (with respect to the fast variable) for each . Accordingly, we write the solution as
In this notation, the operator is replaced by from chain rule. By substituting into the equation, we have
By substituting the asymptotic expansion
into the equation above, and matching terms, we obtain
(5a)  
(5b) 
By applying the Fourier transform for the first equation (5a) with respect to the periodic variable , we obtain
We note that for almost all fixed , the multiplier is nonvanishing for all , because otherwise there exists some such that for a positive measure set of , which is impossible. Therefore, by dividing the multiplier, we have for any that
That is, all Fourier modes are vanishing except for the one with . This implies that is independent of the periodic variable , so we redefine the notation to omit this dependence:
For the next order equation (5b), when we take the integral over , the RHS vanishes due to periodicity, and we obtain the homogenized equation
The derivation of homogenization limit presented above is validated in the following theorem. See [12, Theorem 3.1] for a rigorous proof for the timedependent case. {theorem} Let be a bounded family of such that
then the solution to the RTE (3) (with fixed) converges in weak to , the solution to the following homogenized RTE:
(6) 
Because of the oscillations of scale in the media , the solution is rough. However, such oscillations are homogenized in the limit, and the solution becomes close to the solution to (6), which has no oscillation.
In general, the limiting regime can be taken through different routes. One may fix and send to zero to reach the diffusion limit (4) and then send , shown as solid arrow in Figure 1, Alternatively, one could fix and send to zero to reach the homogenization limit (6), then send . This path is shown as the dashed arrow in Figure 1. Additionally, one could send both and simultaneously to zero at different rates, shown as dotted arrows in Figure 1. All these routes, though considering the same regime , do not necessarily end up at the same limit. In fact, Goudon and Mellet [13, 14] showed that by following the route , RTE (2) ends up as an effective drift diffusion equation, while Abdallah, Puel and Vogelius [1] followed the route to obtain an effective diffusion equation.
2.3 Low Rank of the PDE Solution Map
An AP scheme has been proposed in [23] to deal with the regime , while numerical schemes for other regimes remain open. In practice, given a particular pair that is close to zero, it would not be clear where they are situated in Figure 1. It is therefore impossible to determine which limiting equation is the most appropriate one to use as an approximation to the solution of (2). The analysisbased approach of designing AP schemes is therefore not feasible. We seek to develop instead a universal numerical approach that is valid in different limiting regimes.
We start by considering the diagram in Figure 2.
Assume we are given an equation where denotes the smallest parameters in the equation operator , together with a boundary operator such that for some given boundary data . The solution can be represented as a convolution of with all Green’s functions , so it lies in the space spanned by . To find an accurate numerical approximation to this solution, the operator is translated to , a matrix whose number of columns is . Thus, the numerical solution is a vector of length .
On the other hand, assume the equation is “homogenizable” and there exists an asymptotic limit, an operator so that the solution to equation with boundary condition is asymptotically close to . The computation of is expected to be significantly cheaper, since the limiting equation no longer has small parameters and is expected to be smooth. Thus, the numerical solution requires merely degrees of freedom to represent accurately.
Since is close to for regular , the numerical solution spaces captured by the range of matrices and are expected to be almost the same. On the other hand, although contains many more degrees of freedom (columns) than , the former is “compressible” and the latter is a good lowrank approximation to it. This argument can be made rigorous with the definition of “numerical rank of an operator,” a concept that is equivalent to the “Kolmogorov width”. More details can be found in [7].
2.4 Random Sampling for LowRank Structure
Knowing that an operator is approximately of low rank does not mean that it is easy to find a lowrank approximation quickly. In the linear algebra setting, finding the lowrank structure is equivalent to finding the singular vectors of a matrix that correspond to the largest singular values. For an matrix, the singular value decomposition costs operations, making this approach prohibitive for large . When the approximate rank is known to be , a randomized SVD solver whose cost depends on is available. This approach is based on the following result. {theorem}[Corollary 10.9 of [16]] Suppose that the matrix has singular values ordered as follows: . Assume that the target rank and oversampling parameter are positive integers such that . Then with probability at least , we have
where is a matrix with orthonormal columns whose column space matches that of , where is a random matrix of dimension with entries drew from independent identical distributed (i.i.d.) normal distribution.
This theorem suggests that if an operator has approximate low rank, random sampling can find its range accurately, with overwhelming probability. A simplified estimate shows that for oversampling rate to be as small as , with at least confidence, the error can be controlled by . Algorithm 1, proposed in [16], finds the rank approximation to , which we denote by .
There are two crucial features of the algorithm. First, the amount of computation depends crucially on the rank , and is generally much less expensive than a full SVD. Second, it can be implemented without explicit knowledge of the matrix . Rather, we need only to be able to compute the products of with the random matrix . These properties make the algorithm well suited for use in the numerical homogenization of PDEs.
Translation of the randomization idea to the PDE setting is not straightforward. First, a PDE solution map is a continuous operator, not a matrix. Discretization of the space and the choice of norm is not always obvious. Redesigning the scheme to deal with an operator may also be difficult. While the adjoint of a matrix is easy to define, the adjoint of an operator may not be so easy. Second, a numerical homogenization scheme needs to find a solution quickly for an arbitrary boundary term or source, and in pursuit of that goal, the PDE solution map may not be the right operator to “compress”. In fact, in our approach, to be discussed in Section 3, we compress a different operator: the boundarytoboundary map, which is needed in the Schwarz iteration scheme.
For the present, we assume the adjoint operator of is known (denoting it by ), and translate Algorithm 1 to the operator setting, equipping it to find the corresponding Kolmogorov width operator, . (Note that for any , we have .)
3 Lowrank Schwarz Domain Decomposition Method
We consider the boundary value problem for RTE (3) in the following form:
(7a)  
(7b) 
where the partial boundary is defined by
(8) 
Here, is the outer normal vector at location , and is expected to be negative for all incoming velocities. Similarly, the outflow coordinates are collected in the complementary partial boundary . The problem (7) is well posed, as we show in the Appendix.
We start in Section 3.1 by introducing the domain decomposition and the classical Schwarz method in Algorithm 3. Section 3.2 identifies the operator that needs to be compressed for efficient implementation of this method, while Section 3.3 derives the adjoint operator. These elements together make it possible to design the lowrank Schwarz method, Algorithm 5, presented in Section 3.4.
3.1 Schwarz Domain Decomposition Method
For computing (7), one first considers an overlapping domain decomposition of the physical space ,
where forms an open cover of . We decompose the domain accordingly as:
(9) 
We denote by the outflow and inflow boundaries for . The domains are overlapping, and we define
to be the interior of that does not overlap with its neighbors. Since the inflow boundary of neighboring domain is partially inside , we further define
(10) 
so that the outflow boundary of is the union of the domains above, that is,
One can see that the restriction on of the local solution in the domain , would partially provide the inflow boundary condition for its neighboring domain . This fact will be used later to update local solutions in each iteration of Schwarz method. An illustration is shown in Figure 3.
The Schwarz method is an iterative algorithm that updates solutions confined to different subdomains by exchanging information between iterations. In this setting, we update the values on using the newly computed solutions in each subdomain, repeating the process until the solution converges. To be more specific, we denote by the restriction of the solution at the th step of the Schwarz process on patch , confined to the partial boundary , that is,
The th iteration of the Schwarz method can be expressed as a mapping from the to , obtained by exchanging the boundary conditions between adjacent patches and solving the RTE. We denote this mapping by , as follows:
We terminate at an iteration for which the difference between and falls below a given tolerance.
The evaluation of map amounts to evaluation and assembly of , and is defined by the following procedure, applied to all subdomains , .

Define the following solution map
(11) by solving RTE on domain :
(12) Obtain the solution in each subdomain with boundary conditions . Here is a functional space where the trace of over the boundary is welldefined (details appear in the appendix).

Confine the solution on the boundaries:
(13) where and are the boundary values transmitted to the next iteration of the Schwarz procedure.
We denote by the boundarytoboundary map that maps the boundary condition on to the boundary values on the adjacent subdomains ():
(14) 
This boundarytoboundary map is a welldefined operator, as we show in Theorem 34 in the appendix. It can be regarded as a composition of and a trace operator, as follows:
(15) 
Note that provides the boundary condition for the adjacent subdomains in the next Schwarz iteration, seen as in equation (13). The full map is obtained by collecting the action of for all subdomains .
As initial conditions for the Schwarz process, we set
(16) 
except at the physical boundary, where we impose given boundary conditions:
(17) 
When convergence to a given tolerance is achieved, the latest solutions may not perfectly match at the overlapping areas. To assemble the global solution, we define a suitable set of partitionofunity functions for the subdomains , whose properties are as follows:
We construct the global solution by setting
(18) 
The method is summarized in Algorithm 3.
The Schwarz approach has several advantages. First, it is easy to implement in parallel, since the main computations (12) and (13) can be solved simultaneously for the subdomains . In fact, one could even use different solvers in different subdomains, when appropriate (for example, when there is prior information about inhomogeneity of the medium). Second, computing solutions in each subdomain is significantly cheaper than for the full domain. It saves storage cost and computation time, especially when stoge and computation scale superlinearly with the size of the domain.
The disadvantage of the Schwarz approach is that it requires multiple iterations for convergence. Since needs to be reevaluated at each iteration for each subdomain , and it calls for the computation of , finding the local solutions with the given boundary condition quickly is the key to the success of the entire algorithm. In the following sections we identify the operator that can be efficiently compressed, aiming at improving the efficiency of evaluating , or .
3.2 Identifying the Operator to be Compressed
As discussed in Section 2.3, one should be able to reveal and exploit the lowrankness in homogenizable equations. In our setting, the local equation (12) has a homogenization limit, so we expect the map from boundary conditions to local solutions to be of low rank. This does not seem to be the case, however. In Figure 4 we plot all normalized singular values of the discrete representation of on the domain with discretization parameters and . The decay of the singular values is not as strong as we expected. These values saturate at a value of about , meaning that even when we capture the top modes, we can still expect errors of around . As stated in Remark 2.1, the homogenization limit concerns mainly the behavior of the solution in the interior, while the behavior of the solution in the boundary layer is usually still far away from “equilibrium”. As shown in Figure 4, a solution with an inhomogeneous boundary condition can have strong boundary layer effect.
These boundary layer effects are included in , an operator that maps the boundary condition to the solution in the entire region (including the boundary layer), destroying the desired lowrank structure. However, the operator looks only at the solution confined in a small interior set . As shown in Figure 4, the singular values of decays significantly faster than those for .
Therefore, is a better object for compression, and we will seek a fast solver to approximate for any input , by exploiting the Randomized SVD algorithm, Algorithm 2. This method requires us to apply the operator to random inputs, which amounts to finding the local solution in with randomly constructed boundary conditions, and then confining it to . Following (15), the evaluation is defined as
Finding , the adjoint of , however, is much more complicated. We summarize our findings in the following theorem.
The adjoint operator is defined by
(19)  
where , supported on , satisfies:
(20) 
in which is the solution to:
(21) 
is the adjoint for in the sense that:
(22) 
where and are weighted inner products on and , respectively, defined by
The proof is postponed to the appendix. We remark here that the computation involved in finding the adjoint operator is complicated. It requires the computation of two adjoint RTEs over and respectively, that are coupled in a nontrivial fashion through the boundary conditions, as seen in (20) and (21). We further note that the measure is not the standard Lebesgue measure, but rather weighted by .
3.3 Design of the Adjoint Map for
Although the operator is of approximate lowrank, its adjoint , which is needed to compute the lowrank approximation, is complicated. The operator , on the other hand, is not compressible, but its adjoint is relatively easy to find. In this section, we show that we can approximate by an approximately lowrank operator based on whose adjoint is easy to find.
Since has slow singular decay mainly because it contains too much information from the boundary layer, we consider a restriction of this operator from to , which we call . The restriction to eliminates most of the effects of the boundary layer. This operator is defined as follows.
(23) 
where and . The advantages of using this operator are threefold:

can be defined easily in terms of . Nothing is lost by comparison with (15); we have
(24) and once again serves as the new boundary condition , as in equation (13). Note that the trace in (24) is well defined, as maps boundary conditions to , so the image of its restriction to has a trace on the boundary of .

Because effects from boundary layers are excluded in , it can be expected to have approximate low rank. Figure 5 shows that the decay rate of (upon discretization) is almost the same as for .

Its adjoint is easy to compute, as we show in Theorem 3.3.
The adjoint of is defined as follows:
(25)  
where solves the adjoint RTE over , which is
(26) 
and the source is the trivial extension of over , that is,
We need to show that
for all and . Denoting by the solution to (12) with boundary condition , then the definition of implies that the left hand side of the expression above is
Denoting by the solution to (26) with source term , we have
yielding the required result.
3.4 Lowrank Schwarz Iteration Method
We can use to implement the RSVD method to find the lowrank approximation to the operator . Given target rank , and denoting the reduced operator by , we look for functions and and nonnegative scalars such that:
(27) 
where and are obtained in Algorithm 4.