
# Data assimilation for massive autonomous systems based on second-order adjoint method

Shin-ichi Ito Earthquake Research Institute, The University of Tokyo, Japan    Hiromichi Nagao Earthquake Research Institute, The University of Tokyo, Japan Graduate School of Information Science and Technology, The University of Tokyo, Japan    Akinori Yamanaka Division of Advanced Mechanical Systems Engineering, Institute of Engineering, Tokyo University of Agriculture and Technology, Japan    Yuhki Tsukada Graduate School of Engineering, Nagoya University, Japan    Toshiyuki Koyama Graduate School of Engineering, Nagoya University, Japan    Masayuki Kano Earthquake Research Institute, The University of Tokyo, Japan    Junya Inoue School of Engineering, The University of Tokyo, Japan
June 30, 2019
###### Abstract

Data assimilation (DA) is a fundamental computational technique that integrates numerical simulation models and observation data on the basis of Bayesian statistics. Originally developed for meteorology, especially weather forecasting, DA is now an accepted technique in various scientific fields. One key issue that remains controversial is the implementation of DA in massive simulation models under limited computation time and resources. In this paper, we propose an adjoint-based DA method for massive autonomous models that produces optimum estimates and their uncertainties within practical computation time and resource constraints. The uncertainties are given as several diagonal components of an inverse Hessian matrix, which is the covariance matrix of a normal distribution that approximates the target posterior probability density function in the neighborhood of the optimum. Conventional algorithms for deriving the inverse Hessian matrix require $\mathcal{O}(CN^2+N^3)$ computations and $\mathcal{O}(N^2)$ memory, where $N$ is the number of degrees of freedom of a given autonomous system and $C$ is the number of computations needed to simulate time series of suitable length. The proposed method based on a second-order adjoint method allows us to directly evaluate the diagonal components of the inverse Hessian matrix without computing all of its components. This drastically reduces the number of computations to $\mathcal{O}(C)$ and the amount of memory to $\mathcal{O}(N)$ for each diagonal component. The proposed method is validated through numerical tests using a massive two-dimensional Kobayashi's phase-field model. We confirm that the proposed method correctly reproduces the parameter and initial state assumed in advance, and successfully evaluates the uncertainty of the parameter. Such information regarding uncertainty is valuable, as it can be used to optimize the design of experiments.


## I Introduction

Determining the model parameters and initial states of simulation models is an important task in various scientific fields, as it enables the temporal evolution of the target system to be predicted. However, in many practical cases, this procedure is somewhat complex, because it is often impossible to observe the parameters and initial states experimentally. In materials engineering, for example, phase-field (PF) models are often used to simulate the evolution of microstructures during the processes of solidification and phase transformation Kobayashi (1993); Boettinger et al. (2002); Chen (2002); Tsukada et al. (2008); Shimokawabe et al. (2011); Ohno et al. (2013); Takaki (2014). PF models phenomenologically describe the dynamics of phases using field variables that evolve in time depending on the gradient of the total free energy. Since a PF model usually requires a huge number of grid points to discretize the field variables, the computational cost tends to be prohibitive. Nonetheless, PF models are accepted beyond the field of materials engineering, such as in hydrodynamics Jacqmin (1999); De Menech (2006); Lee et al. (2011), as they can be employed to model phases and their dynamics using mathematical expressions that are easy to manipulate. In PF models, the parameters and initial states perfectly determine the dynamical properties of the phases. However, various limitations in practical experiments sometimes prevent such parameters and initial states from being estimated.

Data assimilation (DA) is a computational technique that integrates numerical simulation models and observational data on the basis of Bayesian statistics. Thus, DA enables the parameters and initial states of PF models to be estimated by systematically extracting as much information as possible from the given observational/experimental data. The process of DA evaluates a probability density function (PDF) (or the “posterior PDF,” to be precise) of the unknown parameters and unobservable states that is conditional on the given observation data Reich and Cotter (2015). DA was originally developed in the fields of meteorology and oceanography Kalnay (2003); Tuyuki and Miyoshi (2007); Ghil and Malanotte-Rizzoli (1991), but is now applied in areas such as seismology, marketing science, and industrial science Kano et al. (2015); Maeda et al. (2015); Motohashi et al. (2012). Several sequential Bayesian filters and other non-sequential estimation methods have been used in DA. Common sequential Bayesian filters such as the ensemble Kalman filter Evensen (2003); Houtekamer and Mitchell (1998); Ueno et al. (2007) and particle filter Kitagawa (2010); Doucet et al. (2000); Nagao et al. (2013) estimate the target posterior PDF using Bayes’ theorem. This approximation is formed using an ensemble of realizations, meaning that the computational cost is proportional to the number of realizations. The implementation of a sequential Bayesian filter on a given simulation model is not especially complex, and a sufficiently accurate estimate of the posterior PDF can be achieved when the number of degrees of freedom of the simulation model is sufficiently small. However, Bayesian filters become inefficient when applied to massive simulation models, as the number of realizations required to obtain a converged posterior PDF grows rapidly with the number of degrees of freedom.
Unlike sequential Bayesian filters, adjoint methods Lewis and Derber (1985); Le Dimet and Talagrand (1986); Iri and Kubota (1991) directly determine the optimum solution using a gradient method to maximize the target posterior PDF. Although this achieves a drastic reduction in the computational cost, the ordinary adjoint method cannot evaluate the uncertainty in its estimations, something which sequential Bayesian filters obtain in a straightforward manner. Such uncertainties provide valuable information related to both the estimations and the optimum solution. For example, the uncertainties provide feedback for the experimental design that helps to identify the parameters of interest with the required accuracy. The quantification of uncertainty is currently a very important issue in the application of DA to massive simulation models.

This paper describes an adjoint-based DA methodology that simultaneously estimates the optimum solution and its uncertainty, and is capable of being applied to simulation models with a huge number of degrees of freedom. We first construct a method to estimate the parameters and initial states involved in an autonomous system, and then validate our approach using a PF model as a testbed. Section II introduces the formulation of an adjoint-based DA method to simultaneously obtain an estimation and its uncertainty using second-order information of the posterior PDF. Section III describes the formulation of an estimation test using synthetic data generated from the time series given by Kobayashi’s PF model Kobayashi (1993). Section IV presents and discusses the results of estimation tests, and Section V concludes this paper by summarizing the results of this study.

## II Method

### II.1 State-space model and cost function

DA based on Bayesian statistics always starts by defining a state-space model, which consists of a system model and an observation model. The system model describes how a state vector evolves over time in accordance with a given simulation model. The state vector contains all time-dependent variables used in the simulation model and sometimes the model parameters.

Suppose an autonomous simulation model is given by $\partial\boldsymbol{x}/\partial t = \boldsymbol{g}(\boldsymbol{x}, \boldsymbol{a})$, where $\boldsymbol{x}$ denotes a time-dependent variable and $\boldsymbol{g}$ is a function of $\boldsymbol{x}$ and a time-invariant parameter vector $\boldsymbol{a}$. The system model describes the time evolution of a state vector consisting of $\boldsymbol{x}$ and $\boldsymbol{a}$. Let $\boldsymbol{X} = (\boldsymbol{x}^\top, \boldsymbol{a}^\top)^\top$ be the state vector, where $\top$ denotes the transpose, and let $N$ denote the dimension of $\boldsymbol{X}$. Since $\boldsymbol{a}$ is time-invariant, i.e., $\partial\boldsymbol{a}/\partial t = \boldsymbol{0}$, the system model can be represented by

$$\frac{\partial \boldsymbol{X}}{\partial t} = \boldsymbol{f}(\boldsymbol{X}), \tag{1}$$

where $\boldsymbol{f}$ is defined as $\boldsymbol{f} = \boldsymbol{g}$ for the components corresponding to $\boldsymbol{x}$ and $\boldsymbol{f} = \boldsymbol{0}$ for those corresponding to $\boldsymbol{a}$.
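As a concrete illustration of this augmented formulation, the following minimal sketch appends a time-invariant parameter to the state vector so that the combined system is autonomous. The logistic-growth model used here is a hypothetical stand-in, not a model from this paper:

```python
import numpy as np

def f_augmented(X):
    """Right-hand side of the augmented autonomous system dX/dt = f(X).

    Hypothetical example: logistic growth dx/dt = a*x*(1 - x) with the
    unknown rate a appended to the state, so da/dt = 0 makes the
    parameter part of the state vector.
    """
    x, a = X
    return np.array([a * x * (1.0 - x), 0.0])

def euler_step(X, dt):
    """One explicit Euler step of the augmented system."""
    return X + dt * f_augmented(X)

# Integrating leaves the parameter component unchanged.
X = np.array([0.1, 2.0])
for _ in range(100):
    X = euler_step(X, 0.01)
```

Because the parameter's time derivative is exactly zero, its value survives the integration unchanged, which is what allows parameters and initial states to be estimated on the same footing.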

The observation model describes how $\boldsymbol{X}$ relates to a time series of observation data $\boldsymbol{D}(t) \in \mathbb{R}^K$, where $K$ denotes the dimension of the observations. Considering that the data include noise, the observation model can be described as

$$\boldsymbol{D} = \boldsymbol{H}(\boldsymbol{X}) + \boldsymbol{\omega}, \tag{2}$$

where $\boldsymbol{H}$ is an observation operator that outputs quantities from $\boldsymbol{X}$ comparable with the data, and $\boldsymbol{\omega}$ denotes observation noise. In this paper, we assume that $\boldsymbol{f}$ and $\boldsymbol{H}$ are nonlinear functions, and that $\boldsymbol{\omega}$ is white noise that follows a normal distribution with a diagonal covariance matrix. Our purpose is to obtain the optimum initial state $\boldsymbol{X}(0)$ together with the uncertainties of the variables of interest.

In consideration of PF models, we also assume that each $X_i$ is constrained by

$$X_i^{\mathrm{Lower}} < X_i(t) < X_i^{\mathrm{Upper}} \quad (i = 1, \cdots, N), \tag{3}$$

where $X_i^{\mathrm{Lower}}$ and $X_i^{\mathrm{Upper}}$ denote the lower and upper bounds of $X_i$, respectively.

To simplify our formulation, we normalize $X_i$ as

$$\theta_i(t) = \frac{X_i(t) - X_i^{\mathrm{Lower}}}{X_i^{\mathrm{Upper}} - X_i^{\mathrm{Lower}}} \quad (i = 1, \cdots, N). \tag{4}$$

This leads to the following $\boldsymbol{\theta}$-dependent forms of Eqs. (1)-(3):

$$\frac{\partial \boldsymbol{\theta}}{\partial t} = \boldsymbol{F}(\boldsymbol{\theta}), \tag{5}$$
$$\boldsymbol{D} = \boldsymbol{h}(\boldsymbol{\theta}) + \boldsymbol{\Omega}, \tag{6}$$
$$0 < \Theta_i < 1 \quad (i = 1, \cdots, N), \tag{7}$$

where $\boldsymbol{F}$ is the normalized form of $\boldsymbol{f}$, $\boldsymbol{h}$ is the observation operator after transforming $\boldsymbol{X}$ to $\boldsymbol{\theta}$, $\boldsymbol{\Omega}$ is the correspondingly transformed observation noise, and $\boldsymbol{\Theta} = \boldsymbol{\theta}(0)$ denotes the initial state.

Bayes’ theorem states that the conditional PDF $p(\boldsymbol{\Theta}|\boldsymbol{D})$, which is called the posterior PDF, can be described as

$$p(\boldsymbol{\Theta}|\boldsymbol{D}) = \frac{p(\boldsymbol{\Theta})\, p(\boldsymbol{D}|\boldsymbol{\Theta})}{p(\boldsymbol{D})}, \tag{8}$$

where $p(\boldsymbol{\Theta})$ and $p(\boldsymbol{D}|\boldsymbol{\Theta})$ are called the prior PDF and likelihood, respectively. Note that $p(\boldsymbol{D})$ is constant, since $\boldsymbol{D}$ is a definite vector. Thus, Eq. (8) implies that $p(\boldsymbol{\Theta}|\boldsymbol{D})$ is proportional to the product of the prior PDF and the likelihood.

The prior PDF contains prior information provided by experience and intuition. If we suppose that this prior information is the constraint condition given by Eq. (7) and that $\Theta_i$ is independent for each $i$, then $p(\boldsymbol{\Theta})$ is given by a product of the prior PDFs of $\Theta_i$,

$$p(\boldsymbol{\Theta}) = \prod_{i=1}^{N} p(\Theta_i), \tag{9}$$

where

$$p(\Theta_i) = \begin{cases} 1 & \text{for } 0 < \Theta_i < 1, \\ 0 & \text{otherwise}. \end{cases} \tag{10}$$

When observation data are obtained at $t = t_1, \cdots, t_n$, the likelihood $p(\boldsymbol{D}|\boldsymbol{\Theta})$ can be written as

$$p(\boldsymbol{D}|\boldsymbol{\Theta}) = \prod_{s=1}^{n} \prod_{k=1}^{K} p(\Omega_k(t_s)), \tag{11}$$

where

$$p(\Omega_k(t_s)) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\left[ -\frac{\left\{ D_k(t_s) - h_k(\boldsymbol{\theta}(t_s)) \right\}^2}{2\sigma_k^2} \right], \tag{12}$$

and $\sigma_k$ is the standard deviation of $\Omega_k$ $(k = 1, \cdots, K)$. We consider $\sigma_k$ to be a hyper-parameter.

This paper defines the optimum solution to be the $\boldsymbol{\Theta}$ that maximizes the posterior PDF $p(\boldsymbol{\Theta}|\boldsymbol{D})$. For the convenience of numerical computation, we aim to minimize the cost function

$$J = \sum_{s=1}^{n} \sum_{k=1}^{K} \left[ \frac{\log(2\pi\sigma_k^2)}{2} + \frac{\left\{ D_k(t_s) - h_k(\boldsymbol{\theta}(t_s)) \right\}^2}{2\sigma_k^2} \right] \quad \text{subject to } 0 < \Theta_i < 1, \tag{13}$$

which comes from the negative logarithmic posterior PDF, i.e., $J = -\log p(\boldsymbol{\Theta}|\boldsymbol{D}) + \text{const.}$, so that minimizing $J$ is equivalent to maximizing $p(\boldsymbol{\Theta}|\boldsymbol{D})$. The constraint in Eq. (13) arises from the term $-\log p(\boldsymbol{\Theta})$, which appears when calculating $-\log p(\boldsymbol{\Theta}|\boldsymbol{D})$.

An optimum solution for $\sigma_k$ can be determined analytically. By letting $\partial J / \partial \sigma_k = 0$, we obtain

 σk= ⎷1nn∑s=1[Dk(ts)−hk(% \boldmathθ(ts))]2. (14)

Then, $\hat{\sigma}_k$ is obtained by substituting $\boldsymbol{\theta}(t_s)$, which is calculated through Eq. (5) from $\hat{\boldsymbol{\Theta}}$, into Eq. (14).
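The closed-form optimum of Eq. (14) is simply the root-mean-square residual between the data and the model output. A small numerical sketch illustrates this; the observation channel and noise level below are hypothetical, chosen only for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar observation channel: D_k(t_s) = h_k(theta(t_s)) + noise.
n = 10000
h = np.sin(np.linspace(0.0, 6.0, n))      # stand-in for h_k(theta(t_s))
D = h + rng.normal(0.0, 0.3, size=n)      # synthetic data with sigma = 0.3

# Eq. (14): the sigma_k that zeroes dJ/dsigma_k is the RMS residual.
sigma_hat = np.sqrt(np.mean((D - h) ** 2))
```

With many observations, the estimate converges to the noise level actually used to generate the data, which is why $\sigma_k$ can be treated as a hyper-parameter updated alongside the state.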

### II.2 Optimization via an adjoint method

Typically, $J$ is optimized using a gradient method such as steepest gradient descent, the nonlinear conjugate gradient method, or the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) method Nocedal (1980). Gradient methods require the gradient $\partial J / \partial \boldsymbol{\Theta}$ to update $\boldsymbol{\Theta}$, but it is difficult to calculate this quantity because $J$ does not explicitly include $\boldsymbol{\Theta}$, as seen in Eq. (13). Generally, $\boldsymbol{\theta}(t)$ is fully determined by setting $\boldsymbol{\Theta}$ through Eq. (5), so that $J$ can be written as a function of $\boldsymbol{\Theta}$, i.e., $J(\boldsymbol{\Theta})$. According to Eq. (13), $J$ is also a function of $\boldsymbol{\theta}$ that satisfies Eq. (5). Summarizing these two expressions for $J$:

$$J(\boldsymbol{\Theta}) = \int_0^{t_f} dt\, J(\boldsymbol{\theta}) + \int_0^{t_f} dt\, \boldsymbol{\lambda}^\top \left( \boldsymbol{F} - \frac{\partial \boldsymbol{\theta}}{\partial t} \right), \tag{15}$$

where $t_f$ is an arbitrary time later than $t_n$, and $\boldsymbol{\lambda}$ denotes a vector of the Lagrange multipliers that impose Eq. (5) as the constraint condition of the minimization. $J(\boldsymbol{\theta})$ is the time-dependent function

$$J(\boldsymbol{\theta}) = \sum_{s=1}^{n} \delta(t - t_s) \sum_{k=1}^{K} \left[ \frac{\log(2\pi\sigma_k^2)}{2} + \frac{\left\{ D_k(t) - h_k(\boldsymbol{\theta}(t)) \right\}^2}{2\sigma_k^2} \right], \tag{16}$$

which satisfies $\int_0^{t_f} dt\, J(\boldsymbol{\theta}) = J$, where $\delta$ denotes the Dirac delta function. Taking a variation of Eq. (15), we obtain a time evolution equation for $\boldsymbol{\lambda}$:

$$\frac{\partial \boldsymbol{\lambda}}{\partial t} + \left( \frac{\partial \boldsymbol{F}}{\partial \boldsymbol{\theta}} \right)^{\!\top} \boldsymbol{\lambda} + \frac{\partial J}{\partial \boldsymbol{\theta}} = \boldsymbol{0}, \tag{17}$$

where

$$\boldsymbol{\lambda}(0) = \frac{\partial J}{\partial \boldsymbol{\Theta}}, \tag{18}$$
$$\boldsymbol{\lambda}(t_f) = \boldsymbol{0}. \tag{19}$$

Details of the derivation can be found in Le Dimet and Talagrand (1986); Wang et al. (1992). Solving Eq. (17) backwardly in time with the condition in Eq. (19), we obtain the objective gradient as $\boldsymbol{\lambda}(0) = \partial J / \partial \boldsymbol{\Theta}$. This procedure for obtaining the gradient of the cost function using the adjoint equation (Eq. (17)) is called the adjoint method.
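The forward/backward structure of the adjoint method can be sketched for a small discrete toy model: a linear ODE integrated by explicit Euler with a quadratic misfit at the final step. This is an illustrative stand-in, not the paper's PF model; the discrete adjoint gradient it produces agrees with a finite-difference check:

```python
import numpy as np

# Forward model: theta_{s+1} = theta_s + dt * A @ theta_s.
# Cost: J = 0.5 * || theta_n - d ||^2, observed at the final step only.
A = np.array([[0.0, 1.0], [-1.0, -0.1]])
dt, n_steps = 0.01, 200
d = np.array([0.3, -0.2])                 # synthetic "data" at t_f

def forward(theta0):
    traj = [theta0]
    for _ in range(n_steps):
        traj.append(traj[-1] + dt * A @ traj[-1])
    return traj

def cost(theta0):
    return 0.5 * np.sum((forward(theta0)[-1] - d) ** 2)

def adjoint_gradient(theta0):
    """Backward sweep: lambda_s = (I + dt*A)^T lambda_{s+1}, seeded by the misfit."""
    lam = forward(theta0)[-1] - d
    for _ in range(n_steps):
        lam = lam + dt * A.T @ lam
    return lam                            # equals dJ/dtheta_0

theta0 = np.array([1.0, 0.0])
g_adj = adjoint_gradient(theta0)

# Finite-difference check of the first gradient component.
eps = 1e-6
e0 = np.array([1.0, 0.0])
g_fd = (cost(theta0 + eps * e0) - cost(theta0 - eps * e0)) / (2 * eps)
```

The key point is the cost: one forward run and one backward run deliver the full gradient, independent of the dimension of $\boldsymbol{\Theta}$, whereas finite differences would require one forward run per component.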

When we apply the adjoint method to our problem, a variable transformation is needed in the process of updating $\boldsymbol{\Theta}$ based on a gradient method, since $\boldsymbol{\Theta}$ is subject to the constraint shown in Eq. (7). The variable transformation

$$\Psi_i = \log \Theta_i - \log(1 - \Theta_i) \quad (i = 1, \cdots, N) \tag{20}$$

converts the constrained optimization problem with respect to $\boldsymbol{\Theta}$ into an unconstrained one with respect to $\boldsymbol{\Psi}$. The update procedure is as follows. After obtaining $\partial J / \partial \boldsymbol{\Theta}$ by the adjoint method based on Eqs. (17)-(19), we convert $\boldsymbol{\Theta}$ to $\boldsymbol{\Psi}$ using Eq. (20) and $\partial J / \partial \boldsymbol{\Theta}$ to $\partial J / \partial \boldsymbol{\Psi}$ as

$$\frac{\partial J}{\partial \Psi_i} = \Theta_i (1 - \Theta_i) \frac{\partial J}{\partial \Theta_i} \quad (i = 1, \cdots, N). \tag{21}$$

Using this gradient to update $\boldsymbol{\Psi}$, we can obtain an updated $\boldsymbol{\Theta}$ from the inverse transformation of Eq. (20):

$$\Theta_i = \frac{1}{1 + \exp(-\Psi_i)} \quad (i = 1, \cdots, N). \tag{22}$$

Although this update procedure does not allow $\Theta_i$ to be exactly $0$ or $1$, owing to the definition of Eq. (22), $\Theta_i$ can be sufficiently close to $0$ or $1$ to pose no problem in practical cases.
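The transformation pair of Eqs. (20) and (22) and the chain rule of Eq. (21) can be sketched directly:

```python
import numpy as np

def to_unconstrained(Theta):
    """Eq. (20): logit transform mapping (0, 1) onto the whole real line."""
    return np.log(Theta) - np.log(1.0 - Theta)

def to_constrained(Psi):
    """Eq. (22): inverse (sigmoid) transform mapping back into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-Psi))

def grad_wrt_Psi(Theta, grad_wrt_Theta):
    """Eq. (21): chain rule dJ/dPsi_i = Theta_i * (1 - Theta_i) * dJ/dTheta_i."""
    return Theta * (1.0 - Theta) * grad_wrt_Theta

Theta = np.array([0.2, 0.5, 0.9])
Psi = to_unconstrained(Theta)
back = to_constrained(Psi)        # round trip recovers Theta
```

Because the sigmoid saturates, gradient steps in $\boldsymbol{\Psi}$ can move $\Theta_i$ arbitrarily close to the bounds without ever violating them, which is exactly the property needed for the bounded PF variables.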

The adjoint method calculates the gradient for fixed $\sigma_k$, so the optimization of $\sigma_k$ is carried out at the same time as that of $\boldsymbol{\Theta}$ by substituting Eq. (14) into Eq. (16) every time $\boldsymbol{\Theta}$ is updated.

The advantage of the adjoint method over sequential Bayesian filters is that only $\mathcal{O}(C)$ computations and $\mathcal{O}(N)$ memory are required to find $\hat{\boldsymbol{\Theta}}$, where $C$ is the number of computations needed to run the given simulation model from $t = 0$ to $t = t_f$.

### II.3 Evaluation of uncertainty via a second-order adjoint method

The adjoint method described in Section II.2 gives the optimum solution $\hat{\boldsymbol{\Theta}}$ that maximizes $p(\boldsymbol{\Theta}|\boldsymbol{D})$. However, it does not provide information about the behavior of $p(\boldsymbol{\Theta}|\boldsymbol{D})$ in the neighborhood of $\hat{\boldsymbol{\Theta}}$, which reflects the uncertainty in the estimation of $\hat{\boldsymbol{\Theta}}$. To extract such information, another procedure must be implemented on top of the adjoint method. Considering that $\partial J / \partial \boldsymbol{\Theta}$ vanishes at $\hat{\boldsymbol{\Theta}}$, the Taylor expansion of $J$ with respect to $\boldsymbol{\Theta}$ is

$$J(\boldsymbol{\Theta}) \sim J(\hat{\boldsymbol{\Theta}}) + \frac{1}{2} (\boldsymbol{\Theta} - \hat{\boldsymbol{\Theta}})^\top \mathsf{H} (\boldsymbol{\Theta} - \hat{\boldsymbol{\Theta}}), \tag{23}$$

where terms of third and higher order have been neglected, and $\mathsf{H}$ is the Hessian matrix given by

$$H_{ij} = \left. \frac{\partial^2 J}{\partial \Theta_i \partial \Theta_j} \right|_{\boldsymbol{\Theta} = \hat{\boldsymbol{\Theta}}}. \tag{24}$$

Substituting Eq. (23) into $p(\boldsymbol{\Theta}|\boldsymbol{D}) \propto \exp(-J)$ and normalizing, we obtain

$$p(\boldsymbol{\Theta}|\boldsymbol{D}) \sim \frac{\exp\left[ -\frac{1}{2} (\boldsymbol{\Theta} - \hat{\boldsymbol{\Theta}})^\top \mathsf{H} (\boldsymbol{\Theta} - \hat{\boldsymbol{\Theta}}) \right]}{(2\pi)^{N/2} \left| \mathsf{H}^{-1} \right|^{1/2}}, \tag{25}$$

where $\mathsf{H}^{-1}$ is the inverse of $\mathsf{H}$ and $|\mathsf{H}^{-1}|$ denotes its determinant. Equation (25) indicates that, in the neighborhood of $\hat{\boldsymbol{\Theta}}$, $p(\boldsymbol{\Theta}|\boldsymbol{D})$ can be approximated by a multivariate normal distribution with mean vector $\hat{\boldsymbol{\Theta}}$ and covariance matrix $\mathsf{H}^{-1}$. Let $\Theta_i$ ($1 \le i \le N$) be a component of interest in $\boldsymbol{\Theta}$. Integrating Eq. (25) over all variables except $\Theta_i$, the marginal distribution with respect to $\Theta_i$ is the normal distribution with mean $\hat{\Theta}_i$ and variance $(\mathsf{H}^{-1})_{ii}$, which is the $i$-th diagonal element of $\mathsf{H}^{-1}$. This means that the uncertainty of $\hat{\Theta}_i$ is given by $\sqrt{(\mathsf{H}^{-1})_{ii}}$. When $N$ is large, it is unrealistic to obtain $\mathsf{H}^{-1}$ directly by numerically differentiating and inverting $\mathsf{H}$, which is generally dense, as this would require $\mathcal{O}(CN^2 + N^3)$ computations and $\mathcal{O}(N^2)$ memory. In practical cases, it is not necessary to evaluate all elements of $\mathsf{H}^{-1}$, since the number of elements of interest is usually much smaller than $N$. Therefore, we propose to use a second-order adjoint method Wang et al. (1998); Le Dimet et al. (2002) to efficiently obtain such uncertainties in massive autonomous systems. The following procedure requires only $\mathcal{O}(C)$ computations and $\mathcal{O}(N)$ memory for each uncertainty. When evaluating the uncertainty of $\hat{\Theta}_i$, we consider a linear equation for $\boldsymbol{r}$:

$$\mathsf{H} \boldsymbol{r} = \boldsymbol{q}, \tag{26}$$

where $\boldsymbol{q}$ is a vector with elements $q_i = 1$ and $q_j = 0$ $(j \neq i)$. The solution obviously includes the objective variance as $r_i = (\mathsf{H}^{-1})_{ii}$. Note that $\mathsf{H}$ is a constant matrix whose explicit construction would require prohibitive computations because of its large dimension. We therefore obtain $\boldsymbol{r}$ from an initial guess via an iterative technique such as the conjugate gradient method or the conjugate residual method. In the course of the iterations, such an iterative method needs to compute Hessian-vector products $\mathsf{H}\boldsymbol{\gamma}$ for given vectors $\boldsymbol{\gamma}$. The second-order adjoint method enables us to compute such Hessian-vector products.
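A minimal matrix-free sketch of this idea, with a small symmetric positive-definite stand-in for $\mathsf{H}$ and plain conjugate gradients in place of the conjugate residual method, recovers a single diagonal element of $\mathsf{H}^{-1}$ from Hessian-vector products alone; in the paper's setting, `hvp()` would instead be supplied by the second-order adjoint method:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(6, 6))
H = B @ B.T + 6.0 * np.eye(6)             # SPD stand-in test Hessian

def hvp(v):
    """Stand-in Hessian-vector product; never forms H^{-1}."""
    return H @ v

def solve_cg(q, tol=1e-12, max_iter=200):
    """Conjugate gradients for H r = q using only hvp()."""
    r_sol = np.zeros_like(q)
    resid = q - hvp(r_sol)
    p = resid.copy()
    rs = resid @ resid
    for _ in range(max_iter):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        r_sol += alpha * p
        resid -= alpha * Hp
        rs_new = resid @ resid
        if np.sqrt(rs_new) < tol:
            break
        p = resid + (rs_new / rs) * p
        rs = rs_new
    return r_sol

i = 3
q = np.zeros(6)
q[i] = 1.0                                 # unit vector selecting column i
r = solve_cg(q)
diag_i = r[i]                              # equals (H^{-1})_{ii}
```

Only matrix-vector products with $\mathsf{H}$ are ever evaluated, so the memory footprint stays at a few vectors of length $N$ regardless of how dense $\mathsf{H}$ is.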

Let $\boldsymbol{\xi}$ and $\boldsymbol{\zeta}$ be perturbations of $\boldsymbol{\theta}$ and $\boldsymbol{\lambda}$, which respectively correspond to the responses of $\boldsymbol{\theta}$ and $\boldsymbol{\lambda}$ when $\boldsymbol{\Theta}$ is perturbed by $\boldsymbol{\gamma}$. Their time evolutions are given by

$$\frac{\partial \boldsymbol{\xi}}{\partial t} = \left. \frac{\partial \boldsymbol{F}}{\partial \boldsymbol{\theta}} \right|_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}} \boldsymbol{\xi}, \tag{27}$$
$$\frac{\partial \boldsymbol{\zeta}}{\partial t} + \left. \left( \frac{\partial \boldsymbol{F}}{\partial \boldsymbol{\theta}} \right)^{\!\top} \right|_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}} \boldsymbol{\zeta} + \left. \left( \frac{\partial^2 \boldsymbol{F}}{\partial \boldsymbol{\theta}^2} \boldsymbol{\xi} \right)^{\!\top} \right|_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}} \hat{\boldsymbol{\lambda}} + \left. \frac{\partial^2 J}{\partial \boldsymbol{\theta}^2} \right|_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}} \boldsymbol{\xi} = \boldsymbol{0}. \tag{28}$$

The combination of Eqs. (27) and (28), which are called the tangent linear model and the second-order adjoint model, respectively, gives the Hessian-vector product for an arbitrary vector. Solving Eq. (27) forwardly in time from the initial condition $\boldsymbol{\xi}(0) = \boldsymbol{\gamma}$, we obtain the time series of $\boldsymbol{\xi}$. Then, solving Eq. (28) backwardly with the condition $\boldsymbol{\zeta}(t_f) = \boldsymbol{0}$ and the time series of $\boldsymbol{\xi}$, we obtain the objective Hessian-vector product as $\mathsf{H}\boldsymbol{\gamma} = \boldsymbol{\zeta}(0)$. The detailed derivation is given in Wang et al. (1998).
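For a linear toy model the second-derivative term of $\boldsymbol{F}$ in Eq. (28) vanishes, and the two sweeps reduce to propagating a perturbation forward and its adjoint backward. The following sketch (an illustrative $2 \times 2$ discrete system, not the paper's model) verifies the resulting Hessian-vector product against a direct computation:

```python
import numpy as np

# Forward model: theta_{s+1} = (I + dt*A) theta_s, quadratic misfit at the
# final step, so H = (P^n)^T (P^n) with P the one-step propagator.
A = np.array([[0.0, 1.0], [-1.0, -0.1]])
dt, n_steps = 0.01, 150
P = np.eye(2) + dt * A                      # one-step propagator

def hessian_vector_product(v):
    # Tangent-linear sweep (cf. Eq. (27)): propagate the perturbation forward.
    xi = v.copy()
    for _ in range(n_steps):
        xi = P @ xi
    # Second-order adjoint sweep (cf. Eq. (28)): seed with d2J/dtheta2 * xi
    # (the identity, for this quadratic misfit) and propagate backward.
    zeta = xi.copy()
    for _ in range(n_steps):
        zeta = P.T @ zeta
    return zeta                             # equals H @ v

v = np.array([1.0, -2.0])
Hv = hessian_vector_product(v)

# Reference computed directly from the propagator.
M = np.linalg.matrix_power(P, n_steps)
Hv_ref = M.T @ (M @ v)
```

One forward sweep and one backward sweep per product: this is the $\mathcal{O}(C)$-per-vector cost that makes the iterative solution of Eq. (26) affordable for massive systems.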

## III Twin experiment

### III.1 Kobayashi’s phase-field model

The accuracy of the proposed method is verified through numerical simulations termed “twin experiments,” details of which are given in Section III.2. We choose a two-dimensional Kobayashi’s PF model as the testbed in the twin experiments. Kobayashi’s PF model describes the fundamental growth dynamics of two phases, such as in solidification or a phase transformation. The time evolution of one of the phases is described by

$$\tau \frac{\partial \phi}{\partial t} = \epsilon^2 \triangle \phi + \phi (1 - \phi) \left( \phi + m - \frac{1}{2} \right), \quad -\frac{1}{2} < m < \frac{1}{2}, \tag{29}$$

where the PF variable $\phi$ denotes the existence probability of the relevant phase, e.g., solid or liquid. The parameters $\tau$ and $\epsilon$ non-dimensionalize time and space, respectively, and $m$ characterizes the velocity of the interface between the two phases. We assume that these parameters are time-invariant constants. Since $\phi$ describes a probability, it should be constrained in $[0, 1]$. This condition is automatically satisfied by setting the initial phase to $0 \le \phi \le 1$, since $\phi = 0$ and $\phi = 1$ are fixed points of Eq. (29).
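A minimal explicit-Euler discretization of Eq. (29) on a periodic grid can be sketched as follows; the grid size, step sizes, and parameter values are illustrative choices, not those used in the paper:

```python
import numpy as np

# Illustrative (not the paper's) parameters and discretization.
tau, eps, m = 1.0, 0.01, 0.2
dx, dt = 0.02, 1e-4
M_side = 64

def laplacian(phi):
    """Five-point Laplacian with periodic boundaries."""
    return (np.roll(phi, 1, 0) + np.roll(phi, -1, 0) +
            np.roll(phi, 1, 1) + np.roll(phi, -1, 1) - 4.0 * phi) / dx**2

def step(phi):
    """One explicit Euler step of Eq. (29)."""
    rhs = eps**2 * laplacian(phi) + phi * (1.0 - phi) * (phi + m - 0.5)
    return phi + (dt / tau) * rhs

# Seed a nucleus of the phi = 1 phase and evolve it.
phi = np.zeros((M_side, M_side))
phi[24:40, 24:40] = 1.0
for _ in range(200):
    phi = step(phi)
```

Because $\phi = 0$ and $\phi = 1$ are fixed points of the reaction term and the discrete Laplacian obeys a maximum principle at this small diffusion number, the field stays within $[0, 1]$ throughout the integration, mirroring the probabilistic constraint discussed above.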

Kobayashi’s PF model underlies various PF models that describe physical phenomena such as dendrite growth Kobayashi (1993); Shimokawabe et al. (2011), crack propagation Aranson et al. (2000); Spatschek et al. (2006), and interface-driven pattern formation Komura and Yamazaki (2007). Therefore, Kobayashi’s PF model is a good choice for verifying whether the proposed DA method works well, and is a first step towards future applications in more complex PF models.

### III.2 Synthetic data

Twin experiments are often conducted in the field of DA to verify a newly developed method on the basis of synthetic data. The synthetic data are usually generated using the given simulation model, in which the true parameters and initial state are pre-determined. Verification then proceeds by checking whether the DA method applied to the synthetic data reproduces the true parameters and initial state. In our case, the synthetic dataset is a time series of $\phi$ that is numerically calculated by Kobayashi’s PF model with a pre-determined true initial state and parameter $m$. The synthetic data are then contaminated by observation noise that follows a normal distribution with mean zero and variance $\sigma^2$. The twin experiments are intended to confirm that the proposed method estimates the initial state and parameter with the associated uncertainty.

Let $M_x$ and $M_y$ be the numbers of grid points in the $x$- and $y$-directions, respectively, $M$ be the total number of grid points, i.e., $M = M_x M_y$, and $\Delta x$ be the grid spacing. A periodic boundary condition is imposed on the boundary of the computational domain. Letting $\phi_i$ be the phase at the $i$-th grid point, Eq. (29) can be rewritten as

$$\tau \frac{\partial \phi_i}{\partial t} = \epsilon^2 \triangle_i \phi_i + \phi_i (1 - \phi_i) \left( \phi_i + m - \frac{1}{2} \right), \tag{30}$$

where $\triangle_i$ denotes the second-order difference operator acting on the four nearest neighbors of the $i$-th grid point, i.e., $\triangle_i \phi_i = (\phi_{i_{x+}} + \phi_{i_{x-}} + \phi_{i_{y+}} + \phi_{i_{y-}} - 4\phi_i)/\Delta x^2$, where $i_{x\pm}$ and $i_{y\pm}$ label the neighboring grid points in the $x$- and $y$-directions.

Figure 1 shows the time evolution of $\phi$ in two-dimensional space computed by the explicit Euler method. The assumed initial state is shown in Fig. 1(a), and the true value of the parameter $m$ is fixed in advance. Figures 1(b)-(f) show snapshots indicating that the interface between the phases $\phi = 0$ and $\phi = 1$ migrates, expanding the area of $\phi = 1$. Motivated by the fact that such snapshots are sometimes obtained as observation data in practical experiments, we use snapshots such as in Figs. 1(b)-(f) with added observation noise as the synthetic data for the twin experiments. The synthetic data are given by

$$\phi_i^{\mathrm{obs}}(t) = \phi_i(t) + \omega_i(t) \quad (i = 1, \cdots, M), \tag{31}$$

where $\omega_i$ is normally-distributed observation noise with mean zero and variance $\sigma^2$.

When DA is applied to Kobayashi’s PF model, a time evolution equation with respect to the parameter $m$ is needed to construct a system model within the state-space model (Section II.1). The time evolution equation can be written as

$$\tau \frac{\partial b}{\partial t} = 0, \tag{32}$$

where $b$ denotes the normalization of $m$, i.e., $b = m + 1/2$, so that $0 < b < 1$.

### III.3 Cost function

We consider the synthetic observation data to be the snapshots of $\phi_i^{\mathrm{obs}}$ obtained from the first observation time to $t = T$ at a fixed time interval. Let $\mathcal{T}$ be the set of observation times and $n$ be the number of observations. Combining Eqs. (13) and (31), the cost function $J$ can be rewritten as

$$J = \frac{nM \log(2\pi\sigma^2)}{2} + \frac{1}{2\sigma^2} \sum_{t_s \in \mathcal{T}} \sum_{i=1}^{M} \left( \phi_i^{\mathrm{obs}}(t_s) - \phi_i(t_s) \right)^2. \tag{33}$$

The values of $b$ and $\phi_i(0)$ that minimize Eq. (33) also minimize

$$J' = \int_0^{T} dt\, J', \tag{34}$$

where

$$J' = \frac{1}{2} \sum_{t_s \in \mathcal{T}} \delta(t - t_s) \sum_{i=1}^{M} \left( \phi_i^{\mathrm{obs}}(t) - \phi_i(t) \right)^2, \tag{35}$$

since the first term of Eq. (33) is independent of $b$ and $\phi_i$. When the optimum $\hat{b}$ for $b$ and $\hat{\phi}_i$ for $\phi_i$ are obtained by minimizing $J'$, the optimum $\hat{\sigma}$ can be obtained as

 ^σ= ⎷1nM∑ts∈TM∑i=1(ϕobsi(ts)−^ϕi(ts))2, (36)

where $\hat{\phi}_i$ denotes $\phi_i$ simulated using $\hat{b}$ and the estimated initial state.

### III.4 Procedures

Prior to applying the proposed method to the PF model, the constraint $0 \le \phi_i \le 1$ for the initial state is to be changed to $0 < \phi_i < 1$ to satisfy the domain of the variable transformation Eq. (20). The state variables $\boldsymbol{\theta}$ and $\boldsymbol{\Theta}$ can be defined as

$$\boldsymbol{\theta} = (\phi_1, \cdots, \phi_M, b)^\top, \qquad \boldsymbol{\Theta} = (\phi_1(0), \cdots, \phi_M(0), b(0))^\top. \tag{37}$$

The constraint for $\boldsymbol{\Theta}$ becomes $0 < \Theta_i < 1$ $(i = 1, \cdots, M+1)$. The system models of Eqs. (30) and (32) are rewritten in terms of $\boldsymbol{\theta}$ as

$$\tau \frac{\partial \theta_i}{\partial t} = \begin{cases} \epsilon^2 \triangle_i \theta_i + \theta_i (1 - \theta_i)(\theta_i + \theta_{M+1} - 1) & \text{for } i = 1, \cdots, M, \\ 0 & \text{otherwise}. \end{cases} \tag{38}$$

Substituting the right-hand side of this equation for $\boldsymbol{F}$ in Eq. (17), replacing $J$ in Eq. (17) with $J'$ of Eq. (35), and replacing $J$ in Eq. (18) with $J'$ of Eq. (34), the adjoint method described by Eqs. (17)-(19) is rewritten as

$$-\tau \frac{\partial \lambda_i}{\partial t} = \begin{cases} \epsilon^2 \triangle_i \lambda_i + \left\{ -3\theta_i^2 + (4 - 2\theta_{M+1})\theta_i + \theta_{M+1} - 1 \right\} \lambda_i + \dfrac{\partial J'}{\partial \theta_i} & \text{for } i = 1, \cdots, M, \\ \displaystyle\sum_{j=1}^{M} \theta_j (1 - \theta_j) \lambda_j & \text{otherwise}, \end{cases} \tag{39}$$
$$\boldsymbol{\lambda}(0) = \frac{\partial J'}{\partial \boldsymbol{\Theta}}, \tag{40}$$
$$\boldsymbol{\lambda}(t_f) = \boldsymbol{0}. \tag{41}$$

We adopt the LBFGS technique Nocedal (1980) as the gradient method for optimizing $\boldsymbol{\Theta}$. Starting from an initial guess, the LBFGS method updates $\boldsymbol{\Theta}$ while satisfying $0 < \Theta_i < 1$ for all $i$, owing to the variable transformation mentioned in Section II.2. To tune the LBFGS method, we set a convergence tolerance and determine the step length by Armijo’s rule Armijo (1966). Once the optimum $\hat{\boldsymbol{\Theta}}$ has been obtained, the optimum standard deviation $\hat{\sigma}$ can be estimated by Eq. (36), and the optimum for $m$ is given by $\hat{m} = \hat{\Theta}_{M+1} - 1/2$.
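The update loop of Section II.2 combined with an LBFGS-type optimizer can be sketched as follows, with a hypothetical quadratic cost standing in for the adjoint-computed gradient (here via SciPy's L-BFGS-B routine rather than the paper's own LBFGS/Armijo implementation):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical optimum in (0, 1); a stand-in for the true Theta.
target = np.array([0.25, 0.7, 0.9])

def cost_and_grad(Psi):
    """Cost and gradient in the unconstrained Psi variables."""
    Theta = 1.0 / (1.0 + np.exp(-Psi))     # Eq. (22)
    g_Theta = Theta - target               # stand-in for dJ/dTheta from the adjoint
    J = 0.5 * np.sum((Theta - target) ** 2)
    g_Psi = Theta * (1.0 - Theta) * g_Theta  # Eq. (21)
    return J, g_Psi

res = minimize(cost_and_grad, x0=np.zeros(3), jac=True, method="L-BFGS-B")
Theta_hat = 1.0 / (1.0 + np.exp(-res.x))   # map back into (0, 1)
```

Every iterate of the optimizer lives in the unconstrained $\boldsymbol{\Psi}$ space, so the recovered $\hat{\boldsymbol{\Theta}}$ automatically respects the bounds without any projection step.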

One of the most remarkable features of the proposed method is its evaluation of the uncertainties. These uncertainties can provide important information that is beneficial to updating the experimental design. In accordance with the procedure described in Section II.3, we consider a linear equation $\mathsf{H}\boldsymbol{r} = \boldsymbol{q}$, where $\mathsf{H}$ is the Hessian matrix of $J'$, $\boldsymbol{r}$ is a vector to be determined, and $\boldsymbol{q}$ is a vector whose elements are $q_{M+1} = 1$ and $q_i = 0$ $(i \neq M+1)$. The uncertainty $\delta\hat{m}$ can be computed from the solution $\hat{\boldsymbol{r}}$ as

$$\delta \hat{m} = \hat{\sigma} \sqrt{\hat{r}_{M+1}}. \tag{42}$$

The conjugate residual method is adopted to solve the linear equation. The second-order adjoint method computes each of the Hessian-vector products that appear in the iterations of the conjugate residual method. Substituting the right-hand side of Eq. (38) for $\boldsymbol{F}$ in Eq. (27), the tangent linear model can be rewritten as

$$\tau \frac{\partial \xi_i}{\partial t} = \begin{cases} \epsilon^2 \triangle_i \xi_i + \hat{\theta}_i (1 - \hat{\theta}_i) \xi_{M+1} + \left\{ -3\hat{\theta}_i^2 + (4 - 2\hat{\theta}_{M+1})\hat{\theta}_i + \hat{\theta}_{M+1} - 1 \right\} \xi_i & \text{for } i = 1, \cdots, M, \\ 0 & \text{otherwise}, \end{cases} \tag{43}$$

with the initial condition $\boldsymbol{\xi}(0) = \boldsymbol{\gamma}$, where $\hat{\boldsymbol{\theta}}$ denotes the state vector corresponding to $\hat{\boldsymbol{\Theta}}$. Substituting the right-hand side of Eq. (38) for $\boldsymbol{F}$ in Eq. (28) and replacing $J$ in Eq. (28) with $J'$ of Eq. (35), the second-order adjoint model in Eq. (28) becomes

$$-\tau \frac{\partial \zeta_i}{\partial t} = \begin{cases} \epsilon^2 \triangle_i \zeta_i + \left\{ -3\hat{\theta}_i^2 + (4 - 2\hat{\theta}_{M+1})\hat{\theta}_i + \hat{\theta}_{M+1} - 1 \right\} \zeta_i + \left\{ (4 - 6\hat{\theta}_i - 2\hat{\theta}_{M+1}) \xi_i + (1 - 2\hat{\theta}_i) \xi_{M+1} \right\} \hat{\lambda}_i + \dfrac{\partial^2 J'}{\partial \theta_i^2} \xi_i & \text{for } i = 1, \cdots, M, \\ \displaystyle\sum_{j=1}^{M} \left[ \hat{\theta}_j (1 - \hat{\theta}_j) \zeta_j + (1 - 2\hat{\theta}_j) \xi_j \hat{\lambda}_j \right] & \text{otherwise}, \end{cases} \tag{44}$$

where $\boldsymbol{\zeta}$ is the perturbation of $\boldsymbol{\lambda}$ and $\hat{\boldsymbol{\lambda}}$ is the adjoint variable evaluated at $\hat{\boldsymbol{\Theta}}$. Solving Eq. (44) with the condition $\boldsymbol{\zeta}(t_f) = \boldsymbol{0}$, we obtain the objective Hessian-vector product as $\mathsf{H}\boldsymbol{\gamma} = \boldsymbol{\zeta}(0)$.

## IV Results and discussion

The proposed method is verified through three twin experiments: (I) estimation of the parameter conditional on the true initial state (Section IV.1), (II) simultaneous estimation of the parameter and the initial state (Section IV.2), and (III) estimation of the initial state conditional on the true parameter (Section IV.3). Twin experiment I investigates how the estimation depends on the observation data, twin experiment II verifies whether the proposed method outputs correct estimations even for massive simulation models, and twin experiment III examines the unexpected phenomena that appear in the results of experiment II.

### IV.1 Twin experiment I: Parameter estimation

Twin experiment I investigates the influences of three parameters related to the observation data: (i) the length of the observation time $T$, (ii) the time interval of the observations, and (iii) the standard deviation $\sigma$ of the observation noise. Table 1 summarizes the parameter values used in the experiments. The first observation is assumed to occur at a fixed time, and the true value of $m$ and the initial guess used in the LBFGS method are fixed in advance. In this experiment, the true phase field at $t = 0$ (see Fig. 1(a)) is given as the initial state. The results reported here are the average values of $\hat{m}$ and $\delta\hat{m}$ over twenty trials with different random seeds.

Figure 2 shows the results of Test I-(i). Figure 2(a) indicates that the parameter estimation is successful, because the true parameter is included in the range $\hat{m} \pm \delta\hat{m}$. The estimation of the uncertainty fails when $T$ is at its minimum, because insufficient data cause $\hat{r}_{M+1}$ to become negative. Figure 2(b) indicates that $\delta\hat{m}$ decreases according to a power law in $T$ in this range. The reason the decrease is more rapid than the $T^{-1/2}$ decay suggested by the law of large numbers is related to the nonlinearity of Kobayashi’s PF model. A theoretical evaluation indicates that this power law holds only for moderate $T$, and that $\delta\hat{m}$ converges to a constant value as $T$ increases further. This is because no additional information is included in the observation data after $\phi$ becomes almost uniform across the entire computational domain.

Figure 3 shows the results of Test I-(ii). Figure 3(a) indicates that the proposed method successfully reproduces the true parameter, and Fig. 3(b) shows that $\delta\hat{m}$ is proportional to the square root of the observation interval when the interval is small, which follows the law of large numbers, since the number of observations is inversely proportional to the interval.

Figure 4 shows the results of Test I-(iii). Figure 4(a) indicates that the parameter estimation is again successful, and Fig. 4(b) shows that $\delta\hat{m}$ is proportional to $\sigma$. In summary, the results of Test I demonstrate that the proposed method is capable of estimating the true parameter and the associated uncertainty.

### IV.2 Twin experiment II: Simultaneous estimation

Twin experiment II investigates the influence of the observation noise in two cases: (i) when the noise has a small standard deviation and (ii) when the noise has a large standard deviation. The true parameter $m$ is assumed in advance, and the true initial state is assumed to be the phase field shown in Fig. 1(a); initial guesses are given for both $m$ and the initial state.

Figure 5 shows the results of Test II. Figure 5(a) indicates how each iteration of the LBFGS method updates the estimation of $m$. It is clear that each estimation converges to the true value. Figures 5(b) and (c) show the estimated initial states in Tests II-(i) and II-(ii), respectively. These results are almost consistent with the true initial states, although a “spot-like” pattern appears in Fig. 5(c). This spot-like pattern becomes conspicuous when the observation noise is large or when the time of the first observation is far from $t = 0$. Additionally, the spot-like pattern does not disappear under lower tolerance levels.

### IV.3 Twin experiment III: Estimation of initial state

Twin experiment III confirms whether the estimation of $m$ affects the generation of the spot-like pattern found in twin experiment II-(ii). Therefore, twin experiment III is set up to estimate only the initial state, with the parameter $m$ fixed at its true value. The true initial state is assumed to be the phase field shown in Fig. 1(a).

Figures 6(a) and (b) show the estimated initial states after an early iteration and after the final iteration, respectively, and Fig. 6(c) shows how the cost function varies with the iterations. A spot-like pattern again appears in the estimated initial state (Fig. 6(b)) as the number of iterations increases. Note that the cost function is almost the same after the early step and after the final step, although Figs. 6(a) and (b) are quite different.

This is caused by a feature inherent in the two-dimensional Kobayashi PF model. When a spot of radius $R$ evolves in time based on the PF model, whether it grows or decays depends on the relation between $R$ and $m$.

Figure 7 shows the phase diagram obtained by the two-dimensional Kobayashi PF model under the assumption of axial symmetry. The destiny of a given spot depends on whether its radius is above or below the critical line, and the critical radius is approximately inversely proportional to $m$ Castro (2003). The radius of each spot in Fig. 6(b) is actually smaller than the critical radius for the given $m$.

Time evolutions starting from the three different initial states shown in Fig. 6 indicate that the phase fields at the time of the first observation are completely coincident (Fig. 8).

## V Conclusions

This paper has described an adjoint-based DA method for massive autonomous models that not only determines the optimum estimates but also gives their uncertainties within a practical computation time and reasonable resource requirements. The uncertainties can be obtained as several diagonal components of the inverse Hessian matrix, which is the covariance matrix of the normal distribution that approximates the posterior PDF in the neighborhood of the optimum estimates. This proposed approach provides a new methodology for evaluating uncertainties using a second-order adjoint method that obtains Hessian-vector products in the process of the computation. Twin experiments using a two-dimensional Kobayashi PF model demonstrated the validity of the proposed method.

The uncertainties associated with physical quantities of interest depend on the quality and amount of data. Thus, conducting twin experiments prior to practical experiments allows us to determine how many observations are required to obtain the physical quantities of interest to the desired accuracy. Such feedback to practical experiments is already possible in systems with only a few degrees of freedom, but the proposed method makes this possible for massive simulation models.

The proposed method is not only applicable to PF models, but to various models described by autonomous systems, e.g., shallow water equations, Navier equations for elastic materials, and Boltzmann equations. The proposed method is of great utility for evaluating the uncertainties of model parameters through DA, which is important in various fields of science, even when using massive models.

## Acknowledgments

This study is supported by the Cross-ministerial Strategic Innovation Promotion Program (SIP). The authors are grateful to Prof. Munekazu Ohno, Prof. Peter XK Song and Dr. Jonggyu Baek for useful discussions.

## References
