FGA for Elastic Wave Equation ]
Frozen Gaussian Approximation for 3-D Elastic Wave Equation and Seismic Tomography
J. C. Hateley, L. Chai, P. Tong, X. Yang]
J. C. Hateley, L. Chai, P. Tong, X. Yang
Department of Mathematics, University of California, Santa Barbara, CA 93106, USA
School of Mathematics, Sun Yat-sen University, Guangzhou, 510275, China
School of Physical and Mathematical Sciences & Asian School of the Environment,
Nanyang Technological University, 637371, Singapore
Images computed by seismic tomography can provide crucial information for the subsurface structures of Earth at different scales, and the understanding of tectonics, volcanism, and geodynamics \citep[e.g.]Aki1976,Romanowicz1991,Rawlinson2010,Zhao2012a. Wave-equation-based seismic tomography solves the nonlinear optimization problem iteratively for velocity models by computing seismograms and sensitivity kernels in 3-D complex models \citepTromp2005,Liu2008,Liu2012,Tong2014a. This leads to successful applications including imaging the velocity models of the southern California crust \citepTape2009,Tape2010, the European upper mantle \citepZhu2012, the North Atlantic region \citepRickers2013, and the Japan islands \citepSimute2016. The performance of seismic tomography is restricted by how accurate one can solve the 3-D wave equation for synthetic seismograms and sensitivity kernels. The dominant frequency of a low-frequency earthquake is around 1 Hz, while the one of a typical earthquake is around 5 Hz \citep[e.g.]nakamichi2003source, which leads to demanding and even unaffordable computational cost. Real applications usually chose significantly low-frequency data in order to be computationally feasible \citep[e.g.]Tape2009,Zhu2012,Simute2016, since simulating high-frequency seismic waves requires much more powerful computational resources on both memories and CPU time than the low-frequency waves. This imposes a demand on improving the efficiency and accuracy of numerical methods for computing high-frequency waves in order to make use of real seismic data around their dominant frequencies.
In previous works \citepchai2017frozen,chai2018tomo, the authors have developed and verified frozen Gaussian approximation (FGA) as an efficient solver for computing high-frequency acoustic wave propagation (P-wave), wave-equation-based traveltime tomography and full waveform inversion (FWI). The core idea of FGA is to approximate seismic wavefields by fixed-width Gaussian wave packets, whose dynamics follow ray paths with the prefactor amplitude equation derived delicately from an asymptotic expansion on phase plane. With a multicore processors computer station, the property that the FGA algorithm is embarrassingly parallel makes possible the application of FGA to compute 3-D high frequency sensitivity kernels, and further used for 3-D traveltime tomography and FWI. Compared to other ray-based methods including WKBJ \citep[e.g.]Chapman:76,ChDr:82, WKM \citep[e.g.]NiDiHe:00,HeNi:05, generalized ray theory \citep[e.g.]He:68,ViHe:88, seismic traveltime tomography \citep[e.g.]Aki1977, tong2017time, Kirchhoff migration [[, e.g.]]Gr:86,KeBe:88 and Gaussian beam migration \citep[e.g.]Hi:90,Hi:01,SEG-2003-11141117,Gr:05,GrBe:09,PoSePoVe:10, FGA does not need to solve ray paths by shooting to reach the receivers, and provides accurate solutions at the presence of caustics and multipathing, with no requirement on tuning beam width parameters to achieve a good resolution \citepCePoPs:82,Hi:90,FoTa:09,QiYi:10,LuYa:11.
In this paper, we first generalize the FGA theory to solve the 3-D elastic wave propagation. Different from the WKBJ theory or Gaussian beam method which can be derived by direct asymptotic expansion, the derivation of FGA formulation requires to do the asymptotic expansion in an integral form (weak sense) so that one is able to perform integration by parts to eliminate the extra constraints yielded by direct asymptotic expansion. Compared to the previous works on FGA [Lu & Yang(2012a), Chai et al.(2017)Chai, Tong, & Yang, Chai et al.(2018)Chai, Tong, & Yang], the calculations for the derivation are much more technically involved due to the existence of both P- and S-waves. Particularly, the diabatic coupling of the polarized directions for SH- and SV-waves leads to a term closely connected to the concept of Berry phase which is intensively studied in quantum mechanics and topology (Chern number) [[, e.g.]]Be:84,Si:83. We prove the accuracy and parallelizability of the FGA algorithm by comparing to the spectral element method \citep[e.g.]Komatitsch1999,Komatitsch2005,Tromp2008 for the 3-D elastic wave equation in homogeneous media, where one has the analytical solution as a benchmark. We derive the interface conditions of FGA for the 3-D elastic wave equation based on an Eulerian formulation and the Snell’s law. These interface conditions are verified by simulating high-frequency elastic wave propagation in a 1-D layered Earth model. We also show how natural to apply the FGA algorithm to geometries with non-Cartesian coordinates in the simulation of this model.
The second part of this paper contributes to the study of seismic tomography with high-frequency data. Wave-equation-based traveltime tomography and FWI are able to generate higher-resolution images of the Earth’s interior than the conventional ray-based tomography method \citepVirieux2009. Especially, some recent novel studies on FWI using optimal transport distance (e.g. Wasserstein metric) are able to capture traveltime differences between seismic signals, and thus overcomes the cycling effect in standard FWI method \citep[e.g.]EnFr:14,MeBrMeOuVi:16,YaEnSuHa:18,ChChWuYa:arXiv. However, seismic tomography using high-frequency data increases computational cost drastically, which restricts the application of these wave-equation-based inversion methods, and only low-frequency data are modeled and inverted in some real applications \citep[e.g.]Tape2007, Zhu2012, Simute2016. To address the computation challenge, we use the developed FGA algorithm to compute the 3-D high-frequency sensitivity kernels for wave-equation-based traveltime tomography and FWI, respectively. We apply FGA to both wave-equation-based traveltime tomography and FWI on synthetic crosswell seismic data with dominant frequencies as high as those of real crosswell data, and use a hierarchical approach which first uses traveltime tomography to create a macro-scale model and then adopts FWI to generate a high-resolution micro-scale model.
We derive the frozen Gaussian approximation (FGA) formulation for the following inhomogeneous, isotropic elastic wave equation:
where is the wavefield, is the density, and are the Lamé parameters, and is the body force. We shall assume in this section as a sake of simplicity for presenting the FGA formula. Remark that eq. ([) can be rewritten as,
with the P-wave, and S-wave speeds defined as,
We consider the elastic wave eq. ([), with the following initial conditions
Note that all the bolded variables and parameters will be referred to vectors in without any further clarification.
The FGA approximates the wavefield in eq. ([) by a summation of dynamic frozen Gaussian wave packets,
with the weight functions
In eq. ([), is the imaginary unit, and we use unbolded subscripts/superscripts “” and “” to indicate P- and S-waves respectively, and superscripts to indicate quantities depending on wave number . The quantities that have both and as subscripts/superscripts mean that they can be referred to both P- and S-waves. Here refers to the initial sets of Gaussian center and propagation vector for P- and S-waves respectively, and indicates the two-way wave propagation directions correspondingly. are unit vectors indicating the polarized directions of P- and S-waves, i.e. if , then ; and if , then . In ([), are the initial directions of P- and S-waves, i.e. , and the “” on the right-hand-side of the equation indicate that the correspond to . Remark that can be understood as complex localized wave-packet centered at with propagation vector , and is the projection of the initial wavefield onto each wave-packet computed by eq. ([) with serving as the dummy variable in the integration. Associated with each frozen Gaussian wave-packet, the time-dependent quantities are the position center , momentum center , amplitude and unit direction vectors . Note that all the S-waves discussed in the formulation will include both SH- and SV-waves.
The derivation of the FGA formulation involves with the asymptotic expansion in the integral form (weak sense), with proper integration by parts performed to convert powers of distance to the Gaussian center to orders of the wavelength . It is quite lengthy and technically involved, and thus we leave it to Appendix [ for the readers who are interested in the mathematical details, and only present the FGA formulation as below.
For the “+” wave propagation, i.e., , the Gaussian center and propagation vector follow the ray dynamics
with initial conditions
For the “-” wave propagation, i.e., , the Gaussian center and propagation vector follow the ray dynamics
with initial conditions
Remark that, the equations for “-” wave propagation ([) have opposite signs of the right-hand side to the equations for “+” wave propagation ([), and other than that, they are the same. Actually they can be both viewed as the Hamiltonian system with as the Hamiltonian function for “+” and “-” wave propagation, respectively.
The prefactor amplitudes satisfy the following equations, where S-waves have been decomposed into SH- and SV-waves,
with the initial conditions , and and are the two unit directions perpendicular to , referring to the polarized directions of SV- and SH-waves, respectively. Here “” corresponds to the two-way wave propagation directions, and we have used the short-hand notations,
Note that, the prefactor equation ([) is consistent with the one for acoustic wave equation with \citepchai2017frozen , and the last terms on the right-hand-side of ([)-([) indicate the diabatic coupling of the polarized directions for SH- and SV-waves, which are closely connected to the concept of Berry phase intensively studied in quantum mechanics and topology (Chern number) [[, e.g.]]Be:84,Si:83.
Algorithm. A flowchart is shown in Fig. 2 to describe the FGA algorithm, with the technique details discussed in the figure caption.
We check the accuracy and parallelizability of the FGA method by simulating the elastic wave propagation at a homogeneous media with absorbing boundary conditions, for which one can solve the analytical solution as a benchmark. Specifically, we consider
with as the source location, as the source time function at , and , and as constants. Eq. ([) has the following analytical solution
where we have used the Einstein’s index summation convention, , and is the Kronecker delta function. Here we use for a simple formulation of the analytical solution, which is actually in standard notations.
In the numerical tests, we simulate the elastic wave propagation in a domain of the size km km km, and take the source time function as
with as the frequency, s and s. We choose P- and S-wave speeds as km/s, km/s, density , final propagation time s and the source location km. We use a fourth order Runge-Kutta (RK4) with fixed time step as the numerical solver for eqs. ([), ([), ([), ([) and ([). To compare the performance of the FGA method to the spectral element software package SPECFEM3Dhttps://geodynamics.org/cig/software/specfem3d/, simluations are done on a cluster of each node equipped with an Intel(R) Xeon(R) E5-2670 (2.60GHz) and a total of 64GB RAM. The accuracy and parallelizability of FGA are tested for Hz, and the comparison of computational time to SPECFEM3D is done for a range of from Hz to Hz.
The dynamics of the elastic wave propagation is shown by the FGA method in Fig. 4, which is consistent with the analytical solution ([) as an expanding ball. The component-wise wavefields, including both P- and S-wave components, are shown in Fig. 6 for s. With the comparable accuracy of FGA to SPECFEM3D as illustrated in Fig. 8 for s and the source frequency Hz. The relative error for Fig. 8(d), FGA versus the analytical solution is computed to be 3.84%; while for Fig. 8(e), the relative error is computed to be 3.57%. With a comparable accuracy, the FGA shows a much faster computational speed and better parallelizablity than SPECFEM3D for high-frequency ( Hz) elastic wave propagation, with details described in Figs. 10 and 12. Particularly, one can see in Fig. 10 that the computational time of SPECFEM3D has nearly cubic growth in the frequency of elastic waves, while FGA has about -times growth in the frequency. Fig. 12 shows that the speed-up ratio for FGA is almost 2 when one doubles the number of processors, which indicates that the FGA algorithm is embarrassingly parallel. Remark that, in the simulation for the source frequency Hz where FGA and SPECFEM3D produce comparable accuracy, we use Gaussians for computing P-wave of both “” propagation directions, and Gaussians for computing S-wave of both “” propagation directions, which needs to roughly compute a total number of millions of variables in the simulation. Note that, the equations for both P- and S-waves have the same number of variables, and the prefactor comes by counting the number of variables needed in eqs. ([) and ([) where and are 3-D real vectors and the prefactor amplitude is a complex number; while in SPECFEM3D, we use elements in each direction with nodes in each element. One needs to roughly to compute a total number of millions of variables in the simulation, where the prefactor is due to is a 3-D real vector in eq. ([). In addition, the stability conditions on time step for solving the ODE systems ([) and ([) by RK4 is better than solving the elastic wave equation ([) by SPECFEM, since the CFL condition is restricted by small wavelength in solving eq. ([).
In this section, we derive the interface conditions of FGA for 3-D elastic wave propagation in heterogeneous media with strong discontinuities, e.g., the Moho surface and Core-mantle boundary. We verify these interface conditions by simulating a 1-D layered Earth model, which actually requires to revise the FGA solution in spherical coordinates (non-Cartesian coordinates). This can be easily done by an observation that the only step of the FGA algorithm depending on the -coordinate is the reconstruction part given by eq. ([) and all the other steps including the initial wavefield decomposition by eqs. ([) and ([) and propagation of ray equations eqs. ([), ([), ([), ([) and ([) are -coordinate-free. Therefore, to generalize the FGA method in spherical coordinates, one simply needs to change in eq. ([) to with as the distance to earth center, and as the longitude and latitude degrees.
For a sake of clarity and simplicity, we only give the interface conditions for a flat interface located at , and for a reflecting geometry of general shape, one needs to apply the formulation in the local tangent-normal coordinates by treating the tangential direction as the local flat horizontal interface. The wave speeds of the two layers are assumed to be,
In Fig. 14, we only consider an incident Gaussian wave packet for P-wave hitting the interface at , and then reflected and transmitted as Gaussian wave packets for P- and SV-waves, respectively. The other cases including an incident Gaussian wave packet for SV- and SH-waves can be handled similarly, although there is no interaction between P- and S-waves for the case of SH-wave.
Associated with each Gaussian wave packet, one needs to provide the reflection and transmission conditions for , , and , which follow the Snell’s Law and the Zoeppritz equations \citepYi:01. However, for FGA, what is different from standard ray theory on the flat interface is that, one also needs to derive the interface conditions for which will change after the Gaussian wave packet hits the interface and affect the dynamics of given by eqs. ([)-([). Eq. ([) implies that to derive the interface conditions of will be equivalent to derive the interface conditions for and , which requires to use the conservation of level set functions designed in the Eulerian frozen Gaussian approximation formula \citepLuYa:MMS,WeYa:12. The mathematical details of the derivation are lengthy and technical, and thus we leave them to Appendix [ for the interested readers, and only present the results here:
where corresponds to the center and propagation vector of incident, reflected and transmitted Gaussian wave packet for either P- or S-waves, respectively, and . Here is a row vector, and are two matrices, , and
Let us explain the formulation of reflection and transmission interface conditions with more detials for the incident P-wave as illustrated in Fig. 14. In this case, if one denotes to be the center, propagation vector and prefactor amplitude of the corresponding incident, reflected and transmitted Gaussian wave packets for P- and S-waves respectively, and , then for and ,
where , and with and .
Moreover, if one denotes to be the P-wave incident, reflection and transmission angles, and to be the SV-wave reflection and transmission angles, respectively, then the Zoeppritz equations \citepYi:01 read as
with the matrix M as
where are the densities for the layers 1 and 2, respectively.
We verify the interface conditions (\theequation) and (\theequation) by simulating a waveguide example in a 1-D layered Earth model, with the layered P-wave velocity given in Fig. 16(a) following the data in the IASP91 model \citepKe:91,DMC:10. We are particularly interested in choosing the -km discontinuity for the numerical proof of the conditions (\theequation) and (\theequation), which presents to a increase on P-wave velocity and calibrates the mantle transition zone. We consider a radially symmetric surface source as shown in Fig. 16(b), so that the elastic wave equation ([) has a solution of P-waves in the form of where is the radially symmetric solution to the scalar wave equation, i.e.,
We choose the initial condition for the elastic wave equation ([) as
where , km, and
with km and km. The dominant frequency is around Hz. We solve eq. (\theequation) using 1-D finite difference method with fine enough grid points as the reference solution, to which we compare the full elastic wave solution of ([) computed by the FGA algorithm. Fig. 18 shows the seismic signals of P-waves received at stations of depth km, km and km, respectively, where one can see a good agreement of FGA simulation with the reference solution for the P-waves.
In this section, we apply the developed FGA algorithm for 3-D wave-equation-based traveltime tomography \citep[e.g.]Tromp2005,Liu2012 and FWI \citep[e.g.]Pratt1999,Virieux2009, respectively. In particular, we consider the following crosswell seismic tomography, where the three-layered velocity model is set up as follows, with a low-velocity region located at the second layer and homogeneity in the -direction,
where the layered velocities are m/s, m/s, m/s, and the interfaces locate at m, m, m. The center of the low-velocity region is set at m, m. We choose and to indicate the largest magnitude of the low-velocity perturbation from the background velocity and the area of low velocity region; see Fig. 20 for an illustration of the P-wave velocity. In Fig. 20, we also show the positions of 16 seismic resources (as stars) with an equal spacing of m in one well and 32 seismic receivers (as dots) with an equal spacing of m in the other well.
We start with the following initial velocity model
then use FGA to compute the forward and adjoint wavefields and construct 3-D kernels of different phases by the methods of wave-equation-based traveltime tomography \citepTromp2005,Liu2012 and FWI \citepPratt1999,Virieux2009. Since FWI requires a more sophisticated initial velocity model for the convergence than wave-equation-based traveltime tomography, we use a hierarchical strategy as in our previous work \citepchai2018tomo, which first uses wave-equation-based traveltime tomography to create a macro-scale model and then adopts FWI to generate a high-resolution micro-scale model. For the signals received at the top station (Fig. 22), we show the corresponding kernels of different phases in Fig. 24. The inversion results are shown in Fig. 26. The damping and smoothing parameters are chosen empirically by forcing the variation in each iteration less than 3%. Note that artifacts are visible in the final images, which are unavoidable and mainly caused by the uneven data coverages.
This is the last part of our theoretical study on the application of the frozen Gaussian approximation (FGA) method to seismic tomography using high-frequency seismic data. Together with \citetchai2017frozen,chai2018tomo, we systematically derive the FGA formulation for acoustic and elastic wave equations, derive the transmission and reflection conditions of FGA for sharp interfaces (e.g., Moho surface and Core-mantle boundary), reformulate the equations of the forward and adjoint wavefields for a convenient application of the FGA algorithm, propose a fast Gaussian summation algorithm for the reconstruction of FGA, and conduct 3-D high-frequency seismic inversion by the methods of wave-equation-based traveltime tomography and FWI. As a proof of methodology, we compare FGA to SPECFEM3D for acoustic and elastic wave propagation in homogeneous media where an analytical benchmark solution is available. We also test the performance of FGA in a 1-D layered Earth model for checking the accuracy of the interface conditions, and apply FGA to the cross-well example for the 3-D seismic tomography using high-frequency data. We remark that, in the original mathematical work on FGA for strictly linear hyperbolic systems \citepLuYa:CPAM, the FGA formulation is rather complicated, and done for a purpose of rigorously proving its accuracy in the high-frequency regime. In this paper, we weakly expand the solution ansatz of FGA with a consideration of decomposition into P- and S-waves, and reach simple prefactor equations ([), ([) and ([) after lengthy and tricky simplifications. This also generalizes the results in \citetLuYa:CPAM to some extent in the sense that the elastic wave equation is actually not strictly hyperbolic, which has an eigenfunction space of multiplicity-2 (corresponding to S-waves). By these efforts, we hope to bridge a gap between the semiclassical approximation theory in mathematics and the application of 3-D seismic tomography in geophysics. We also remark that, it will be interesting to study the performance of the FGA method in the optimal transport theory-based seismic tomography, with the normalization strategies proposed in, e.g., \citetEnFr:14,MeBrMeOuVi:16,qiu2017full,YaEnSuHa:18,ChChWuYa:arXiv. Our future plans include: 1. to apply 3-D seismic tomography with the FGA method into real data around their dominant frequencies; 2. to use the fast computation performance of FGA to train neural networks for seismic inversion; 3. to develop the FGA algorithm for the optimal transport theory-based seismic tomography using high-frequency data.
We are grateful to the support from the Center for Scientific Computing from the CNSI, MRL: an NSF MRSEC (DMR-1121053) and NSF CNS-0960316. JH, LC and XY were partially supported by the NSF grants DMS-1818592, DMS-1418936 and DMS-1107291. JH was also partially supported by the graduate summer fellowship of Department of Mathematics at University of California, Santa Barbara. PT was partially supported by the MOE AcRF Tier-1 grant (M4011899.110). XY also thanks Professor Qinya Liu for useful discussions on the WKM and GRT methods. We also thank the anonymous referee for bringing us the interesting questions on the optimal transport theory-based seismic tomography.
This appendix is devoted to the mathematical details of deriving the FGA formulation. It is lengthy and technically involved, making use of a few mathematical concepts including weak asymptotic expansion, Schwartz class functions, canonical transform and symplecticity. We shall do our best to make the details as straightforward as possible so that they are accessible for interested readers with different background.
For a sake of clarity and simplicity, we only consider the case of and being constants in eq. ([) with only “” wave propagation direction, and the derivation of FGA for general and with two-way wave propagation directions will be essentially the same but just with more calculations. We plug eq. ([) into eq. ([), combine the phase functions, and define the total phase as
We call two functions and are weakly equal to each other, denoted by , if
Note that both and can be scalar-, vector-, matrix- or tensor-valued functions.
A map: is called canonical transformation if the associated Jacobian matrix
is symplectic, i.e., for any ,
where is a identity matrix. It is easy to verify that the Hamiltonian flow given by either eq. ([) or ([) is a canonical transform by observing that
with the Hamiltonian function , thus satisfies eq. (A.3). This property guarantees that the matrix defined in eq. ([) is always invertible for all by the following argument, with the subscripts omitted for convenience,
With the theoretical preparation above, we are ready to derive the formulation of FGA. We only consider the case in eq. ([), and refer to \citet[Section 3.1]chai2018tomo for the treatment of a general source time function. We start with the following form of Gaussian wave packet as the solution ansatz,
with given by eq. (A.1). Remark that, for the readers who are familiar with Gaussian beam method \citep[e.g.]Hi:90,Hi:01,SEG-2003-11141117,Gr:05,GrBe:09,PoSePoVe:10, it is easy to see that, a direct plugin of eq. (A.4) into eq. ([) will yield more equations than the number of variables due to the missing of a time-dependent Hessian function in the total phase . Therefore, one can only derive the FGA formulation using the weak asymptotic expansion with a sense of equality defined in eq. (A.2), where the following integration by parts lemma will help to eliminate the extra constraints by converting the powers of to the powers of .
Lemma of integration by parts: For any vector and matrix in Schwartz class, i.e., with decay properties at infinity so that the integrals in eq. (A.2) are well defined, one has the following integration by parts formula in the componentwise form, with ,
where the Einstein’s index summation convention is used, the matrix is given in eq. ([), and we refer to \citetLuYa:11,LuYa:CPAM for the detailed proofs.
Note that eq. ([) with can be rewritten as the following “curl” form,
Notice that eq. (A.6) is linear, and thus one can derive the prefactor equations for P- and S-waves individually by assuming or in eq. (A.4), respectively. Without loss of generality, we first consider the prefactor equation for the P-wave, with the governing equations for and given by eq. ([). For an ease of notations, we shall omit the subscript in calculating the prefactor equation for the P-wave.
The spatial and temporal derivatives of are given by
Notice that the terms containing will be of by the lemma of integration by parts, and for P-waves, and . Plugging the derivatives of in eq. (A.8) into eq. (A.7) produces, after neglecting the and lower order terms,
where means the tensor product, e.g., .
Expanding around and truncating at order third order,
and noticing that is constant, one has
Taking the second derivative for and substituting eq. (A.11) bring
where and are given as follows,