[

[

[

FGA for Elastic Wave Equation ] Frozen Gaussian Approximation for 3-D Elastic Wave Equation and Seismic Tomography J. C. Hateley, L. Chai, P. Tong, X. Yang] J. C. Hateley, L. Chai, P. Tong, X. Yang
Department of Mathematics, University of California, Santa Barbara, CA 93106, USA
School of Mathematics, Sun Yat-sen University, Guangzhou, 510275, China
School of Physical and Mathematical Sciences & Asian School of the Environment,
Nanyang Technological University, 637371, Singapore

[


\@close@row



\@next\@currbox\@freelist
SUMMARY
The purpose of this work is to generalize the frozen Gaussian approximation (FGA) theory to solve the 3-D elastic wave equation and use it as the forward modeling tool for seismic tomography with high-frequency data. FGA has been previously developed and verified as an efficient solver for high-frequency acoustic wave propagation (P-wave). The main contribution of this paper consists of three aspects: 1. We derive the FGA formulation for the 3-D elastic wave equation. Rather than standard ray-based methods (e.g. geometric optics and Gaussian beam method), the derivation requires to do asymptotic expansion in the week sense (integral form) so that one is able to perform integration by parts. Compared to the FGA theory for acoustic wave equation, the calculations in the derivation are much more technically involved due to the existence of both P- and S-waves, and the coupling of the polarized directions for SH- and SV-waves. In particular, we obtain the diabatic coupling terms for SH- and SV-waves, with the form closely connecting to the concept of Berry phase that is intensively studied in quantum mechanics and topology (Chern number). The accuracy and parallelizability of the FGA algorithm is illustrated by comparing to the spectral element method for 3-D elastic wave equation in homogeneous media; 2. We derive the interface conditions of FGA for 3-D elastic wave equation based on an Eulerian formulation and the Snell’s law. We verify these conditions by simulating high-frequency elastic wave propagation in a 1-D layered Earth model. In this example, we also show that it is natural to apply the FGA algorithm to geometries with non-Cartesian coordinates; 3. We apply the developed FGA algorithm for 3-D seismic wave-equation-based traveltime tomography and full waveform inversion, respectively.
\@currbox

Θ\@addtodblcol \@next\@currbox\@freelist Key words: Frozen Gaussian approximation; elastic wave equation; seismic tomography; Wasserstein distance; high-frequency wavefield; weak asymptotic analysis.

\@currbox

Θ\@addtodblcol

\@xsect

Images computed by seismic tomography can provide crucial information for the subsurface structures of Earth at different scales, and the understanding of tectonics, volcanism, and geodynamics \citep[e.g.][]Aki1976,Romanowicz1991,Rawlinson2010,Zhao2012a. Wave-equation-based seismic tomography solves the nonlinear optimization problem iteratively for velocity models by computing seismograms and sensitivity kernels in 3-D complex models \citepTromp2005,Liu2008,Liu2012,Tong2014a. This leads to successful applications including imaging the velocity models of the southern California crust \citepTape2009,Tape2010, the European upper mantle \citepZhu2012, the North Atlantic region \citepRickers2013, and the Japan islands \citepSimute2016. The performance of seismic tomography is restricted by how accurate one can solve the 3-D wave equation for synthetic seismograms and sensitivity kernels. The dominant frequency of a low-frequency earthquake is around 1 Hz, while the one of a typical earthquake is around 5 Hz \citep[e.g.][]nakamichi2003source, which leads to demanding and even unaffordable computational cost. Real applications usually chose significantly low-frequency data in order to be computationally feasible \citep[e.g.][]Tape2009,Zhu2012,Simute2016, since simulating high-frequency seismic waves requires much more powerful computational resources on both memories and CPU time than the low-frequency waves. This imposes a demand on improving the efficiency and accuracy of numerical methods for computing high-frequency waves in order to make use of real seismic data around their dominant frequencies.

In previous works \citepchai2017frozen,chai2018tomo, the authors have developed and verified frozen Gaussian approximation (FGA) as an efficient solver for computing high-frequency acoustic wave propagation (P-wave), wave-equation-based traveltime tomography and full waveform inversion (FWI). The core idea of FGA is to approximate seismic wavefields by fixed-width Gaussian wave packets, whose dynamics follow ray paths with the prefactor amplitude equation derived delicately from an asymptotic expansion on phase plane. With a multicore processors computer station, the property that the FGA algorithm is embarrassingly parallel makes possible the application of FGA to compute 3-D high frequency sensitivity kernels, and further used for 3-D traveltime tomography and FWI. Compared to other ray-based methods including WKBJ \citep[e.g.][]Chapman:76,ChDr:82, WKM \citep[e.g.][]NiDiHe:00,HeNi:05, generalized ray theory \citep[e.g.][]He:68,ViHe:88, seismic traveltime tomography \citep[e.g.][]Aki1977, tong2017time, Kirchhoff migration [[, e.g.]]Gr:86,KeBe:88 and Gaussian beam migration \citep[e.g.][]Hi:90,Hi:01,SEG-2003-11141117,Gr:05,GrBe:09,PoSePoVe:10, FGA does not need to solve ray paths by shooting to reach the receivers, and provides accurate solutions at the presence of caustics and multipathing, with no requirement on tuning beam width parameters to achieve a good resolution \citep[]CePoPs:82,Hi:90,FoTa:09,QiYi:10,LuYa:11.

In this paper, we first generalize the FGA theory to solve the 3-D elastic wave propagation. Different from the WKBJ theory or Gaussian beam method which can be derived by direct asymptotic expansion, the derivation of FGA formulation requires to do the asymptotic expansion in an integral form (weak sense) so that one is able to perform integration by parts to eliminate the extra constraints yielded by direct asymptotic expansion. Compared to the previous works on FGA [Lu & Yang(2012a), Chai et al.(2017)Chai, Tong, & Yang, Chai et al.(2018)Chai, Tong, & Yang], the calculations for the derivation are much more technically involved due to the existence of both P- and S-waves. Particularly, the diabatic coupling of the polarized directions for SH- and SV-waves leads to a term closely connected to the concept of Berry phase which is intensively studied in quantum mechanics and topology (Chern number) [[, e.g.]]Be:84,Si:83. We prove the accuracy and parallelizability of the FGA algorithm by comparing to the spectral element method \citep[e.g.][]Komatitsch1999,Komatitsch2005,Tromp2008 for the 3-D elastic wave equation in homogeneous media, where one has the analytical solution as a benchmark. We derive the interface conditions of FGA for the 3-D elastic wave equation based on an Eulerian formulation and the Snell’s law. These interface conditions are verified by simulating high-frequency elastic wave propagation in a 1-D layered Earth model. We also show how natural to apply the FGA algorithm to geometries with non-Cartesian coordinates in the simulation of this model.

The second part of this paper contributes to the study of seismic tomography with high-frequency data. Wave-equation-based traveltime tomography and FWI are able to generate higher-resolution images of the Earth’s interior than the conventional ray-based tomography method \citep[][]Virieux2009. Especially, some recent novel studies on FWI using optimal transport distance (e.g. Wasserstein metric) are able to capture traveltime differences between seismic signals, and thus overcomes the cycling effect in standard FWI method \citep[e.g.][]EnFr:14,MeBrMeOuVi:16,YaEnSuHa:18,ChChWuYa:arXiv. However, seismic tomography using high-frequency data increases computational cost drastically, which restricts the application of these wave-equation-based inversion methods, and only low-frequency data are modeled and inverted in some real applications \citep[e.g.][]Tape2007, Zhu2012, Simute2016. To address the computation challenge, we use the developed FGA algorithm to compute the 3-D high-frequency sensitivity kernels for wave-equation-based traveltime tomography and FWI, respectively. We apply FGA to both wave-equation-based traveltime tomography and FWI on synthetic crosswell seismic data with dominant frequencies as high as those of real crosswell data, and use a hierarchical approach which first uses traveltime tomography to create a macro-scale model and then adopts FWI to generate a high-resolution micro-scale model.

\@xsect

We derive the frozen Gaussian approximation (FGA) formulation for the following inhomogeneous, isotropic elastic wave equation:

(\theequation)

where is the wavefield, is the density, and are the Lamé parameters, and is the body force. We shall assume in this section as a sake of simplicity for presenting the FGA formula. Remark that eq. ([) can be rewritten as,

(\theequation)

with the P-wave, and S-wave speeds defined as,

(\theequation)

We consider the elastic wave eq. ([), with the following initial conditions

(\theequation)

Note that all the bolded variables and parameters will be referred to vectors in without any further clarification.

\@xsect

The FGA approximates the wavefield in eq. ([) by a summation of dynamic frozen Gaussian wave packets,

(\theequation)

with the weight functions

(\theequation)
(\theequation)

In eq. ([), is the imaginary unit, and we use unbolded subscripts/superscripts “” and “” to indicate P- and S-waves respectively, and superscripts to indicate quantities depending on wave number . The quantities that have both and as subscripts/superscripts mean that they can be referred to both P- and S-waves. Here refers to the initial sets of Gaussian center and propagation vector for P- and S-waves respectively, and indicates the two-way wave propagation directions correspondingly. are unit vectors indicating the polarized directions of P- and S-waves, i.e. if , then ; and if , then . In ([), are the initial directions of P- and S-waves, i.e. , and the “” on the right-hand-side of the equation indicate that the correspond to . Remark that can be understood as complex localized wave-packet centered at with propagation vector , and is the projection of the initial wavefield onto each wave-packet computed by eq. ([) with serving as the dummy variable in the integration. Associated with each frozen Gaussian wave-packet, the time-dependent quantities are the position center , momentum center , amplitude and unit direction vectors . Note that all the S-waves discussed in the formulation will include both SH- and SV-waves.

\@xsect

The derivation of the FGA formulation involves with the asymptotic expansion in the integral form (weak sense), with proper integration by parts performed to convert powers of distance to the Gaussian center to orders of the wavelength . It is quite lengthy and technically involved, and thus we leave it to Appendix [ for the readers who are interested in the mathematical details, and only present the FGA formulation as below.

For the “+” wave propagation, i.e., , the Gaussian center and propagation vector follow the ray dynamics

(\theequation)

with initial conditions

(\theequation)

For the “-” wave propagation, i.e., , the Gaussian center and propagation vector follow the ray dynamics

(\theequation)

with initial conditions

(\theequation)

Remark that, the equations for “-” wave propagation ([) have opposite signs of the right-hand side to the equations for “+” wave propagation ([), and other than that, they are the same. Actually they can be both viewed as the Hamiltonian system with as the Hamiltonian function for “+” and “-” wave propagation, respectively.

The prefactor amplitudes satisfy the following equations, where S-waves have been decomposed into SH- and SV-waves,

(\theequation)
(\theequation)
(\theequation)

with the initial conditions , and and are the two unit directions perpendicular to , referring to the polarized directions of SV- and SH-waves, respectively. Here “” corresponds to the two-way wave propagation directions, and we have used the short-hand notations,

(\theequation)

Note that, the prefactor equation ([) is consistent with the one for acoustic wave equation with \citepchai2017frozen , and the last terms on the right-hand-side of ([)-([) indicate the diabatic coupling of the polarized directions for SH- and SV-waves, which are closely connected to the concept of Berry phase intensively studied in quantum mechanics and topology (Chern number) [[, e.g.]]Be:84,Si:83.

Algorithm. A flowchart is shown in Fig. 2 to describe the FGA algorithm, with the technique details discussed in the figure caption.

Figure 2: This cartoon figure illustrates the main three steps of the FGA algorithm: Step 1, Initial decomposition, i.e., to choose the proper sets by the fast FBI transform algorithm \citep[Page 2]YaLuFo:13 to decompose the initial wavefields into Gaussian wave packets, and calculate the corresponding weight function defined in eq. ([); Step 2, Time propagation, i.e., to solve by some numerical solver eqs. ([), ([), ([), ([) and ([) for the dynamics of Gassian center, propagation vector and prefactor amplitude. Transmission and reflection conditions are needed at the presence of interfaces as given in Section [; Step 3, Wave reconstruction, i.e., to compute eq. ([) for the elastic wavefield at time . The fast Gaussian summation algorithm proposed in \citet[Section 3.3]chai2018tomo is highly recommended if a large number of reconstruction grid points are needed.
\@xsect

We check the accuracy and parallelizability of the FGA method by simulating the elastic wave propagation at a homogeneous media with absorbing boundary conditions, for which one can solve the analytical solution as a benchmark. Specifically, we consider

(\theequation)

with as the source location, as the source time function at , and , and as constants. Eq. ([) has the following analytical solution

(\theequation)

where we have used the Einstein’s index summation convention, , and is the Kronecker delta function. Here we use for a simple formulation of the analytical solution, which is actually in standard notations.

In the numerical tests, we simulate the elastic wave propagation in a domain of the size km km km, and take the source time function as

with as the frequency, s and s. We choose P- and S-wave speeds as km/s, km/s, density , final propagation time s and the source location km. We use a fourth order Runge-Kutta (RK4) with fixed time step as the numerical solver for eqs. ([), ([), ([), ([) and ([). To compare the performance of the FGA method to the spectral element software package SPECFEM3Dhttps://geodynamics.org/cig/software/specfem3d/, simluations are done on a cluster of each node equipped with an Intel(R) Xeon(R) E5-2670 (2.60GHz) and a total of 64GB RAM. The accuracy and parallelizability of FGA are tested for Hz, and the comparison of computational time to SPECFEM3D is done for a range of from Hz to Hz.

The dynamics of the elastic wave propagation is shown by the FGA method in Fig. 4, which is consistent with the analytical solution ([) as an expanding ball. The component-wise wavefields, including both P- and S-wave components, are shown in Fig. 6 for s. With the comparable accuracy of FGA to SPECFEM3D as illustrated in Fig. 8 for s and the source frequency Hz. The relative error for Fig. 8(d), FGA versus the analytical solution is computed to be 3.84%; while for Fig. 8(e), the relative error is computed to be 3.57%. With a comparable accuracy, the FGA shows a much faster computational speed and better parallelizablity than SPECFEM3D for high-frequency ( Hz) elastic wave propagation, with details described in Figs. 10 and 12. Particularly, one can see in Fig. 10 that the computational time of SPECFEM3D has nearly cubic growth in the frequency of elastic waves, while FGA has about -times growth in the frequency. Fig. 12 shows that the speed-up ratio for FGA is almost 2 when one doubles the number of processors, which indicates that the FGA algorithm is embarrassingly parallel. Remark that, in the simulation for the source frequency Hz where FGA and SPECFEM3D produce comparable accuracy, we use Gaussians for computing P-wave of both “” propagation directions, and Gaussians for computing S-wave of both “” propagation directions, which needs to roughly compute a total number of millions of variables in the simulation. Note that, the equations for both P- and S-waves have the same number of variables, and the prefactor comes by counting the number of variables needed in eqs. ([) and ([) where and are 3-D real vectors and the prefactor amplitude is a complex number; while in SPECFEM3D, we use elements in each direction with nodes in each element. One needs to roughly to compute a total number of millions of variables in the simulation, where the prefactor is due to is a 3-D real vector in eq. ([). In addition, the stability conditions on time step for solving the ODE systems ([) and ([) by RK4 is better than solving the elastic wave equation ([) by SPECFEM, since the CFL condition is restricted by small wavelength in solving eq. ([).

(b) Initial wavefield
(d) s
(f) s
Figure 4: Modulus of the elastic wavefield computed by the FGA method in homogeneous media, with the analytical solution given by eq. ([). The subfigures from left to right show the slices of at km for s. The frequency of the source time function is Hz.
(b)
(d) : P-wave component
(f) : S-wave component
(h)
(j) : P-wave component
(l) : S-wave component
(n)
(p) : P-wave component
(r) : S-wave component
Figure 6: Modulus of the componentwise elastic wavefield computed by the FGA method in homogeneous media, with the analytical solution given by eq. ([). The subfigures show the components of for s and the source frequency Hz. The second and third columns show the computed P- and S-wavefields by the FGA method, respectively.
(b) FGA
(d) SEM
(f) Solution
(h) FGA vs Solution
(j) SEM vs Solution
Figure 8: Comparison of accuracy for FGA and SPECFEM3D to the analytical solution ([) at s with the source frequency Hz. The top subfigures are the modulus of wavefields given by FGA, SPECFEM and analytical solution. The bottom subfigures are the modulus differences between the FGA and analytical solutions (bottom left) and between the SPECFEM and analytical solutions (bottom right). This shows that FGA and SPECFEM3 produce comparable accuracy in this case.
Figure 10: Dependence of one-step computational time on frequency for both FGA and SPECFEM3D in homogeneous media. The horizontal axis is the frequency (Hz) and the vertical axis is the one-step computational time (s) of the solvers. The triangle line stands for the FGA simulations and the square line stands for the SPECFEM3D simulations. Due to the limitation of memory, SPEFEM3D can not run for Hz, and the dashed square line is obtained by extrapolation.
Figure 12: Dependence of time on the number of processors for the source frequency Hz in homogeneous media. The ideal speed-up ratio is , while one can see that the speed-up ratio for FGA is approximately , which is slightly smaller than indicating an almost perfectly parallel efficiency. On the other hand, as a comparison, SPECFEM3D is used with 128 elements in each spatial direction to achieve a comparable accuracy to FGA. The speed-up ratio for the SPECFEM3D solver is around , which is smaller that those of FGA. This is because SPECFEM3D solves eq. ([) on a parallel computer with processors by partitioning the whole domain into slabs with each processor solving the equation in each slab. Therefore, for each time step, each processor needs to communicate with its neighbors to get necessary boundary information, which decreases the speed-up ratio.
\@xsect

In this section, we derive the interface conditions of FGA for 3-D elastic wave propagation in heterogeneous media with strong discontinuities, e.g., the Moho surface and Core-mantle boundary. We verify these interface conditions by simulating a 1-D layered Earth model, which actually requires to revise the FGA solution in spherical coordinates (non-Cartesian coordinates). This can be easily done by an observation that the only step of the FGA algorithm depending on the -coordinate is the reconstruction part given by eq. ([) and all the other steps including the initial wavefield decomposition by eqs. ([) and ([) and propagation of ray equations eqs. ([), ([), ([), ([) and ([) are -coordinate-free. Therefore, to generalize the FGA method in spherical coordinates, one simply needs to change in eq. ([) to with as the distance to earth center, and as the longitude and latitude degrees.

\@xsect

For a sake of clarity and simplicity, we only give the interface conditions for a flat interface located at , and for a reflecting geometry of general shape, one needs to apply the formulation in the local tangent-normal coordinates by treating the tangential direction as the local flat horizontal interface. The wave speeds of the two layers are assumed to be,

(\theequation)

In Fig. 14, we only consider an incident Gaussian wave packet for P-wave hitting the interface at , and then reflected and transmitted as Gaussian wave packets for P- and SV-waves, respectively. The other cases including an incident Gaussian wave packet for SV- and SH-waves can be handled similarly, although there is no interaction between P- and S-waves for the case of SH-wave.

Associated with each Gaussian wave packet, one needs to provide the reflection and transmission conditions for , , and , which follow the Snell’s Law and the Zoeppritz equations \citepYi:01. However, for FGA, what is different from standard ray theory on the flat interface is that, one also needs to derive the interface conditions for which will change after the Gaussian wave packet hits the interface and affect the dynamics of given by eqs. ([)-([). Eq. ([) implies that to derive the interface conditions of will be equivalent to derive the interface conditions for and , which requires to use the conservation of level set functions designed in the Eulerian frozen Gaussian approximation formula \citepLuYa:MMS,WeYa:12. The mathematical details of the derivation are lengthy and technical, and thus we leave them to Appendix [ for the interested readers, and only present the results here:

(\theequation)

where corresponds to the center and propagation vector of incident, reflected and transmitted Gaussian wave packet for either P- or S-waves, respectively, and . Here is a row vector, and are two matrices, , and

Layer 2 ()Layer 1 ()
Figure 14: Cartoon illustration of an incident Gaussian wave packet for P-wave hitting the interface at , and then reflected and transmitted as Gaussian wave packets for P- and SV-waves. Here the stands for the Gaussian wave packet for the incident, reflected and transmitted P- and SV-waves, respectively. We denote to be the incident, reflection and transmission angles of P-waves, and to be the reflection and transmission angles of SV-waves, respectively.

Let us explain the formulation of reflection and transmission interface conditions with more detials for the incident P-wave as illustrated in Fig. 14. In this case, if one denotes to be the center, propagation vector and prefactor amplitude of the corresponding incident, reflected and transmitted Gaussian wave packets for P- and S-waves respectively, and , then for and ,

(\theequation)

where , and with and .

Moreover, if one denotes to be the P-wave incident, reflection and transmission angles, and to be the SV-wave reflection and transmission angles, respectively, then the Zoeppritz equations \citepYi:01 read as

(\theequation)

with the matrix M as

(\theequation)

where are the densities for the layers 1 and 2, respectively.

More explicitly in eq. (\theequation) corresponding to the case in Fig. 14, , , , , and .

\@xsect

We verify the interface conditions (\theequation) and (\theequation) by simulating a waveguide example in a 1-D layered Earth model, with the layered P-wave velocity given in Fig. 16(a) following the data in the IASP91 model \citepKe:91,DMC:10. We are particularly interested in choosing the -km discontinuity for the numerical proof of the conditions (\theequation) and (\theequation), which presents to a increase on P-wave velocity and calibrates the mantle transition zone. We consider a radially symmetric surface source as shown in Fig. 16(b), so that the elastic wave equation ([) has a solution of P-waves in the form of where is the radially symmetric solution to the scalar wave equation, i.e.,

(\theequation)

We choose the initial condition for the elastic wave equation ([) as

where , km, and

with km and km. The dominant frequency is around Hz. We solve eq. (\theequation) using 1-D finite difference method with fine enough grid points as the reference solution, to which we compare the full elastic wave solution of ([) computed by the FGA algorithm. Fig. 18 shows the seismic signals of P-waves received at stations of depth km, km and km, respectively, where one can see a good agreement of FGA simulation with the reference solution for the P-waves.

Figure 16: (a): The layered P-wave velocity follows the data in the IASP91 model \citepKe:91,DMC:10. (b): Cartoon plot of the -km discontinuity, and distributions of source and stations.
(b) P-wave at depth km
(d) P-wave at depth km
(f) P-wave at depth km
Figure 18: The seismic signals of P-waves received at stations of depth km, km and km, respectively, simulated by the finite difference (FD) and FGA methods, with the source locating at the depth of km. FGA shows a good agreement with the reference signals computed by the FD method for the P-waves.
\@xsect

In this section, we apply the developed FGA algorithm for 3-D wave-equation-based traveltime tomography \citep[e.g.][]Tromp2005,Liu2012 and FWI \citep[e.g.][]Pratt1999,Virieux2009, respectively. In particular, we consider the following crosswell seismic tomography, where the three-layered velocity model is set up as follows, with a low-velocity region located at the second layer and homogeneity in the -direction,

(\theequation)

where the layered velocities are m/s, m/s, m/s, and the interfaces locate at m, m, m. The center of the low-velocity region is set at m, m. We choose and to indicate the largest magnitude of the low-velocity perturbation from the background velocity and the area of low velocity region; see Fig. 20 for an illustration of the P-wave velocity. In Fig. 20, we also show the positions of 16 seismic resources (as stars) with an equal spacing of m in one well and 32 seismic receivers (as dots) with an equal spacing of m in the other well.

(b) Three-layered crosswell model
(d) Relative velocity
Figure 20: (a): The three-layered crosswell P-wave velocity model with homogeneity in horizontal -direction. The background velocities from top to bottom are m/s, m/s, m/s, and the interfaces locate at m, m, m. The low-velocity region has a Gaussian shape centered at m, m, with standard deviation equal to m. stars indicate the locations of seismic sources, and dots indicate the locations of seismic receivers. (b): we use the relative velocity to indicate the largest magnitude of the low-velocity perturbation from the background velocity and the area of low velocity region.

We start with the following initial velocity model

(\theequation)

then use FGA to compute the forward and adjoint wavefields and construct 3-D kernels of different phases by the methods of wave-equation-based traveltime tomography \citepTromp2005,Liu2012 and FWI \citepPratt1999,Virieux2009. Since FWI requires a more sophisticated initial velocity model for the convergence than wave-equation-based traveltime tomography, we use a hierarchical strategy as in our previous work \citepchai2018tomo, which first uses wave-equation-based traveltime tomography to create a macro-scale model and then adopts FWI to generate a high-resolution micro-scale model. For the signals received at the top station (Fig. 22), we show the corresponding kernels of different phases in Fig. 24. The inversion results are shown in Fig. 26. The damping and smoothing parameters are chosen empirically by forcing the variation in each iteration less than 3%. Note that artifacts are visible in the final images, which are unavoidable and mainly caused by the uneven data coverages.

Figure 22: The signals of direct P, pP, S2P, pS, direct S, and S2pS received at the top station with the source as the top ninth source in Fig. 20. Here means the second interface locates at m, e.g. S2P means S-wave gets reflected at the interface at m and arrives at the station as P-wave.
Figure 24: Seismic tomography kernels computed by FGA for the initial velocity model (\theequation), with the thick dashed lines as the actual ray paths of direct P, pP, S2P and pS, direct S, and S2pS signals in Fig. 22, respectively. (a): Kernel computed from the direct P signal – 3D slices view; (b): Kernel computed from the direct P signal; (c): Kernel computed from the pP signal; (d): Kernel computed from the S2P and pS signals; (e): Kernel computed from the direct S and S2pS signals.
Figure 26: For and in eq. (\theequation), the subfigures (a) and (b) are the first two iterations using travel-time tomography, and the subfigure (c) is the third iteration using FWI which removes the artifacts in travel-time tomography and is close to the true velocity profile given in Fig. 20(b).
\@xsect

This is the last part of our theoretical study on the application of the frozen Gaussian approximation (FGA) method to seismic tomography using high-frequency seismic data. Together with \citetchai2017frozen,chai2018tomo, we systematically derive the FGA formulation for acoustic and elastic wave equations, derive the transmission and reflection conditions of FGA for sharp interfaces (e.g., Moho surface and Core-mantle boundary), reformulate the equations of the forward and adjoint wavefields for a convenient application of the FGA algorithm, propose a fast Gaussian summation algorithm for the reconstruction of FGA, and conduct 3-D high-frequency seismic inversion by the methods of wave-equation-based traveltime tomography and FWI. As a proof of methodology, we compare FGA to SPECFEM3D for acoustic and elastic wave propagation in homogeneous media where an analytical benchmark solution is available. We also test the performance of FGA in a 1-D layered Earth model for checking the accuracy of the interface conditions, and apply FGA to the cross-well example for the 3-D seismic tomography using high-frequency data. We remark that, in the original mathematical work on FGA for strictly linear hyperbolic systems \citepLuYa:CPAM, the FGA formulation is rather complicated, and done for a purpose of rigorously proving its accuracy in the high-frequency regime. In this paper, we weakly expand the solution ansatz of FGA with a consideration of decomposition into P- and S-waves, and reach simple prefactor equations ([), ([) and ([) after lengthy and tricky simplifications. This also generalizes the results in \citetLuYa:CPAM to some extent in the sense that the elastic wave equation is actually not strictly hyperbolic, which has an eigenfunction space of multiplicity-2 (corresponding to S-waves). By these efforts, we hope to bridge a gap between the semiclassical approximation theory in mathematics and the application of 3-D seismic tomography in geophysics. We also remark that, it will be interesting to study the performance of the FGA method in the optimal transport theory-based seismic tomography, with the normalization strategies proposed in, e.g., \citetEnFr:14,MeBrMeOuVi:16,qiu2017full,YaEnSuHa:18,ChChWuYa:arXiv. Our future plans include: 1. to apply 3-D seismic tomography with the FGA method into real data around their dominant frequencies; 2. to use the fast computation performance of FGA to train neural networks for seismic inversion; 3. to develop the FGA algorithm for the optimal transport theory-based seismic tomography using high-frequency data.

\@xsect

We are grateful to the support from the Center for Scientific Computing from the CNSI, MRL: an NSF MRSEC (DMR-1121053) and NSF CNS-0960316. JH, LC and XY were partially supported by the NSF grants DMS-1818592, DMS-1418936 and DMS-1107291. JH was also partially supported by the graduate summer fellowship of Department of Mathematics at University of California, Santa Barbara. PT was partially supported by the MOE AcRF Tier-1 grant (M4011899.110). XY also thanks Professor Qinya Liu for useful discussions on the WKM and GRT methods. We also thank the anonymous referee for bringing us the interesting questions on the optimal transport theory-based seismic tomography.

\@newctr

equation[section]

\@xsect

This appendix is devoted to the mathematical details of deriving the FGA formulation. It is lengthy and technically involved, making use of a few mathematical concepts including weak asymptotic expansion, Schwartz class functions, canonical transform and symplecticity. We shall do our best to make the details as straightforward as possible so that they are accessible for interested readers with different background.

For a sake of clarity and simplicity, we only consider the case of and being constants in eq. ([) with only “” wave propagation direction, and the derivation of FGA for general and with two-way wave propagation directions will be essentially the same but just with more calculations. We plug eq. ([) into eq. ([), combine the phase functions, and define the total phase as

(A.1)

We call two functions and are weakly equal to each other, denoted by , if

(A.2)

Note that both and can be scalar-, vector-, matrix- or tensor-valued functions.

A map: is called canonical transformation if the associated Jacobian matrix

is symplectic, i.e., for any ,

(A.3)

where is a identity matrix. It is easy to verify that the Hamiltonian flow given by either eq. ([) or ([) is a canonical transform by observing that

with the Hamiltonian function , thus satisfies eq. (A.3). This property guarantees that the matrix defined in eq. ([) is always invertible for all by the following argument, with the subscripts omitted for convenience,

which implies

where is the conjugate transpose of , and in the last equality, we have used eq. (A.3). Therefore, the used in eqs. ([), ([) and ([) is a well behaved quantity.

With the theoretical preparation above, we are ready to derive the formulation of FGA. We only consider the case in eq. ([), and refer to \citet[Section 3.1]chai2018tomo for the treatment of a general source time function. We start with the following form of Gaussian wave packet as the solution ansatz,

(A.4)

with given by eq. (A.1). Remark that, for the readers who are familiar with Gaussian beam method \citep[e.g.][]Hi:90,Hi:01,SEG-2003-11141117,Gr:05,GrBe:09,PoSePoVe:10, it is easy to see that, a direct plugin of eq. (A.4) into eq. ([) will yield more equations than the number of variables due to the missing of a time-dependent Hessian function in the total phase . Therefore, one can only derive the FGA formulation using the weak asymptotic expansion with a sense of equality defined in eq. (A.2), where the following integration by parts lemma will help to eliminate the extra constraints by converting the powers of to the powers of .

Lemma of integration by parts: For any vector and matrix in Schwartz class, i.e., with decay properties at infinity so that the integrals in eq. (A.2) are well defined, one has the following integration by parts formula in the componentwise form, with ,

(A.5)

where the Einstein’s index summation convention is used, the matrix is given in eq. ([), and we refer to \citetLuYa:11,LuYa:CPAM for the detailed proofs.

Note that eq. ([) with can be rewritten as the following “curl” form,

(A.6)

Notice that eq. (A.6) is linear, and thus one can derive the prefactor equations for P- and S-waves individually by assuming or in eq. (A.4), respectively. Without loss of generality, we first consider the prefactor equation for the P-wave, with the governing equations for and given by eq. ([). For an ease of notations, we shall omit the subscript in calculating the prefactor equation for the P-wave.

Plugging eq. (A.4) into eq. (A.6) and expanding the asymptotics in the weak sense of (A.2) yield

(A.7)

The spatial and temporal derivatives of are given by

(A.8)

Notice that the terms containing will be of by the lemma of integration by parts, and for P-waves, and . Plugging the derivatives of in eq. (A.8) into eq. (A.7) produces, after neglecting the and lower order terms,

(A.9)

where means the tensor product, e.g., .

Expanding around and truncating at order third order,

(A.10)

and noticing that is constant, one has

(A.11)

Taking the second derivative for and substituting eq. (A.11) bring

(A.12)

Plugging eqs, ([) and (A.12) into eq. (A.9), and dividing by yield, in componentwise form,

(A.13)

where and are given as follows,