Model error moment estimation via data assimilation
Using a dynamical model to make predictions about a system involves many sources of error. These include errors in how the model was initialised, but also errors in the dynamics of the model itself. For many applications in data assimilation, probabilistic forecasting, or model improvement, these model errors need to be known over a single timestep of the model, not over a time-averaged period. Using a forecast from a state that combines observational information with prior information, we can obtain an approximation to the statistics of the model errors on the required model timescale. Here we give bounds on the errors in the estimation of the mean and covariance of the errors in the model equations in terms of the errors made in the state estimation. This is the first time that such a result has been derived. The result shows to what extent the state estimation must constrain the analysis in order to obtain a specified error on the mean or covariance of the model errors. This is particularly useful for experimental design, as it indicates the information content required in observations of the dynamical system.
Keywords: model error estimation, data assimilation, analysis forecast, model error covariance estimation
Suppose we have a numerical model, $\mathcal{M}$, for a dynamical process:
$$x_{n+1} = \mathcal{M}(x_n),$$
where $x_n$ is the state at time $t_n$, indexed by $n$. It is almost certain that such a numerical model will contain errors. A large amount of uncertainty quantification has revolved around providing bounds on the accuracy of the numerical scheme used to approximate the underlying mathematical or statistical problem (e.g. Owhadi et al., 2013; Teckentrup et al., 2013). This ignores one large issue: the underlying mathematical problem does not represent all of the features of the real-world system in which one is interested. For example, when modelling wind around a wind farm, the drag induced by individual trees and plants will likely not be included; instead, some approximate homogenised quantity will be used. As another example, when modelling carbon fibre composites, specific manufacturing defects will not be included in the model until measurements of the material's response under various loading conditions are incorporated.
The ubiquitous quote on this issue is from Box and Draper (1987): “All models are wrong but some are useful”. The problem we address is how to estimate the distribution of the errors made by the mathematical model in simulating a physical system.
To get a measure of how far a mathematical model is deviating from the real world, it is necessary to use observations from the system as an independent source of information. Typically we shall never be able to fully observe a system (in time and/or space), and so one has to be careful in comparing a small set of observations with the larger model of the system. The theoretical framework to do this rigorously is well established: using data assimilation to numerically implement Bayes’ theorem (Jazwinski, 1970; Law et al., 2015).
A physical system that occurs in nature has only one realisation. We refer to that realisation as the truth and denote it $x^t$. If we use such a truth with the numerical model of the system we have the following evolution equation,
$$x^t_{n+1} = \mathcal{M}(x^t_n) + \eta_n.$$
Here $\eta_n$ is the model error term: a realisation of a stochastic process occurring at time $t_n$. We wish to estimate, through the use of data assimilation, the properties of the distribution that $\eta$ follows. The first two moments of this unknown distribution, its mean and covariance, are denoted $\bar\eta$ and $\mathbf{Q}$ respectively. Knowledge of $\bar\eta$ and $\mathbf{Q}$ can be fed back into the dynamical model as a way to systematically improve the model. For example, directly subtracting $\bar\eta$ from the model would be a bias correction method (Dee and Uppala, 2009), whereas if the sources of these errors can be attributed to components of the model equations (Lang et al., 2016), those components can be corrected to improve the mathematical model.
We use any or all prior information we have, along with observations of the system, to obtain a “best estimate” of the truth. This is the data assimilation stage and results in what is referred to as the “analysis”, $x^a$. There are many different techniques for data assimilation, each of which makes various assumptions or simplifications to Bayes’ theorem. See Evensen (2007) or Law et al. (2015) for an overview of the field.
Once we have $x^a$ we define our approximation to the model error at timestep $n$, $\hat\eta_n$, with the equation
$$\hat\eta_n = x^a_{n+1} - \mathcal{M}(x^a_n).$$
This makes no assumptions on how $x^a$ is calculated. However, we can immediately see that one traditional method of data assimilation is inappropriate for estimating the $\hat\eta_n$ term.
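Given an analysis trajectory and the model operator, these model-error samples can be computed directly; a minimal sketch in NumPy (the function and variable names are illustrative, not from the paper):

```python
import numpy as np

def model_error_samples(analyses, model):
    """Approximate model errors from an analysis trajectory:
    eta_hat[n] = x_a[n+1] - M(x_a[n]).

    analyses : array of shape (N+1, d), the analysis states x_a[0..N]
    model    : callable implementing one step of the numerical model M
    Returns an array of shape (N, d) of model-error samples."""
    analyses = np.asarray(analyses, dtype=float)
    forecasts = np.array([model(x) for x in analyses[:-1]])
    return analyses[1:] - forecasts

# Hypothetical toy model M(x) = 0.5 x and a hand-picked analysis trajectory:
eta_hat = model_error_samples([[2.0], [1.5], [1.0]], lambda x: 0.5 * x)
# eta_hat[0] = 1.5 - 0.5*2.0 = 0.5 ;  eta_hat[1] = 1.0 - 0.5*1.5 = 0.25
```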
In strong-constraint 4DVar (SC-4DVar) the analysis at time $t_0$, $x^a_0$, is given as the minimiser of
$$J(x_0) = \frac{1}{2}(x_0 - x^b)^T \mathbf{B}^{-1}(x_0 - x^b) + \frac{1}{2}\sum_{k=0}^{K}\bigl(y_k - \mathcal{H}_k(x_k)\bigr)^T \mathbf{R}_k^{-1}\bigl(y_k - \mathcal{H}_k(x_k)\bigr).$$
Here, $x^b$ is a background estimate for the state at time $t_0$, assumed to be Gaussian with background error covariance matrix $\mathbf{B}$. Observations $y_k$ of the system are taken at separate times $t_k$, $k = 0, \dots, K$. The observation is related to the system through the observation equation
$$y_k = \mathcal{H}_k(x^t_k) + \epsilon_k,$$
where the observation errors $\epsilon_k$ are assumed to be Gaussian with an observation error covariance matrix $\mathbf{R}_k$. Then at subsequent times in the assimilation window the analysis trajectory satisfies
$$x^a_{k+1} = \mathcal{M}(x^a_k),$$
and thus $\hat\eta_k = x^a_{k+1} - \mathcal{M}(x^a_k) = 0$: the strong constraint assumes a perfect model within the window, so it can never produce a nonzero model error estimate.
1.1 Usage of the model error covariance matrix
Having an estimate for $\mathbf{Q}$ is important as it plays a vital role in many modern data assimilation methods. In the variational method of data assimilation known as weak-constraint 4DVar (WC-4DVar), in addition to the background and observation terms of SC-4DVar, a third term is introduced to penalise the addition of model error (Trémolet, 2006), which is implicitly assumed to follow a Gaussian distribution:
$$J(x_0, \eta_0, \dots, \eta_{K-1}) = \frac{1}{2}(x_0 - x^b)^T \mathbf{B}^{-1}(x_0 - x^b) + \frac{1}{2}\sum_{k=0}^{K}\bigl(y_k - \mathcal{H}_k(x_k)\bigr)^T \mathbf{R}_k^{-1}\bigl(y_k - \mathcal{H}_k(x_k)\bigr) + \frac{1}{2}\sum_{k=0}^{K-1}\eta_k^T \mathbf{Q}^{-1}\eta_k.$$
In this formulation, $x_0$ is the initial state which the method seeks to find, along with the model error terms $\eta_k$. The first term penalises the difference between the initial state and an initial guess known as the background term, $x^b$, weighted by the inverse of the background error covariance matrix $\mathbf{B}$. The second term penalises discrepancies between observations of the system, $y_k$, and the model equivalent of the observations at the corresponding time, $\mathcal{H}_k(x_k)$, all weighted by the appropriate observation error covariance matrix $\mathbf{R}_k$. This notation hides the fact that $x_k = \mathcal{M}(x_{k-1}) + \eta_{k-1}$, etc. The final term penalises the introduction of model error $\eta_k$ at each model timestep, weighted by the model error covariance matrix $\mathbf{Q}$.
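The structure of this cost function is straightforward to transcribe; the sketch below evaluates it for given increments under stated assumptions (a single static $\mathbf{Q}$ and $\mathbf{R}$, observations at every step, and illustrative names throughout):

```python
import numpy as np

def wc4dvar_cost(x0, etas, model, obs, H, B_inv, R_inv, Q_inv, xb):
    """Evaluate the weak-constraint 4DVar cost for an initial state x0
    and model-error increments etas[0..K-1]; the trajectory obeys
    x[k+1] = M(x[k]) + etas[k]."""
    db = x0 - xb
    J = 0.5 * db @ B_inv @ db                 # background term
    traj = [x0]
    for eta in etas:                          # build trajectory with model error
        traj.append(model(traj[-1]) + eta)
    for xk, yk in zip(traj, obs):             # observation terms
        d = yk - H @ xk
        J += 0.5 * d @ R_inv @ d
    for eta in etas:                          # model-error penalty
        J += 0.5 * eta @ Q_inv @ eta
    return J

# One-dimensional illustration: identity model, perfect observations,
# so only the background and model-error terms contribute.
J = wc4dvar_cost(np.array([1.0]), [np.array([1.0])], lambda x: x,
                 [np.array([1.0]), np.array([2.0])],
                 np.eye(1), np.eye(1), np.eye(1), np.eye(1), np.array([0.0]))
# background 0.5 + observations 0 + model-error penalty 0.5 = 1.0
```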
In particle filters, such as the equivalent weights particle filter (van Leeuwen, 2010), $\mathbf{Q}$ is used as part of the proposal density to bring the ensemble closer to the observations. It also appears in the computation of the weights associated with each particle. Browne and van Leeuwen (2015) showed that the lack of information regarding $\mathbf{Q}$ is the limiting factor in applying particle filters to large-scale geophysical systems.
1.2 Previous work on model error estimation
Much of the work to estimate $\mathbf{Q}$ has been undertaken at the European Centre for Medium-Range Weather Forecasts (ECMWF) in the context of implementing a form of WC-4DVar for operational numerical weather forecasting. The lack of knowledge of $\mathbf{Q}$ is one of the main reasons that such weak-constraint methods are not operational. Trémolet (2007) notes: “Weak-constraint 4D-Var has never been implemented fully with a realistic forecast model because of the computational cost and because of the lack of information to define the model-error covariance matrix required to solve the problem”.
It has been shown (Trémolet, 2007) that for the atmospheric case the background error covariance matrix ($\mathbf{B}$) is not successful as an approximation to the model error covariance matrix ($\mathbf{Q}$). As an ad hoc approximation to $\mathbf{Q}$, Trémolet (2007) suggests the following. At time $t_0$, obtain an ensemble of analysis states. As each ensemble member should be indistinguishable from the truth, each forecast could be considered a “possible evolution of the atmosphere from the true state”. From this, Trémolet concluded that the differences in these forecast increments could be interpreted as “an ensemble of possible realisations of model error”. The real world, however, has only one realisation. Hence one needs to compare these forecasts not with themselves, but with the observations. In this way, one discovers not just where the model diverges based on its initial conditions, but where it has failed to capture the realisation of the underlying stochastic dynamics.
A smoother is a sequential method of data assimilation that gives an estimate of $x^a_n$ for all $n$, not simply at timesteps where observations are present. Recently, the concept of using a forecast from a lag-1 smoother estimate to approximate the model error covariances has been investigated (Todling, 2015). That work only considers the problem in observation space, and does not use the full information available in state space. This is physically motivated: in the case where only the observed variables of the dynamical system are well constrained, only the projection of $\hat\eta$ into the observation space will be accurate.
Such derivations from a smoother trajectory have been investigated by Lang et al. (2016) in the context of parameter and parameterisation estimation. When the model error can be attributed to incorrect coefficients in the formulation of the model equations, it has been shown that the first moment of the derived model error can be used to determine the necessary corrections to the parameters used in the numerical model. This can be extended to the estimation of parameterisations when the first moment of the model error is regressed onto different functions of the model state vector. However, higher-order moments have not been investigated.
2 Error bounds on the estimation of the first two moments of model error
We now set out on the main path of this paper: to find a bound on the error in the approximation of $\bar\eta$ and $\mathbf{Q}$ in terms of the accuracy of $x^a$ in approximating $x^t$. The main assumption we shall make is that the dynamical model $\mathcal{M}$ is Lipschitz continuous. If the model were not continuous at $x^t_n$, then regardless of how closely the analysis approximates the truth, the resulting model trajectory could be wildly different. This would lead to a poor approximation of the model error.
Assumption: $\mathcal{M}$ is Lipschitz continuous, i.e. $\exists L > 0$ s.t. $\|\mathcal{M}(x) - \mathcal{M}(y)\| \le L\|x - y\|$ for all states $x, y$.
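The Lipschitz constant of a numerical model is rarely available in closed form, but a lower bound can be probed empirically by sampling pairs of states; a sketch (function names are illustrative), shown on a linear model where the true constant is the spectral norm:

```python
import numpy as np

def empirical_lipschitz(model, rng, dim, n_pairs=1000, scale=1.0):
    """Probe a lower bound on the Lipschitz constant of `model` by
    sampling random pairs of states and recording the largest ratio
    ||M(x) - M(y)|| / ||x - y||."""
    best = 0.0
    for _ in range(n_pairs):
        x = scale * rng.standard_normal(dim)
        y = scale * rng.standard_normal(dim)
        gap = np.linalg.norm(x - y)
        if gap > 0.0:
            best = max(best, np.linalg.norm(model(x) - model(y)) / gap)
    return best

# For a linear model M(x) = A x the true Lipschitz constant is the
# spectral norm of A, so the estimate approaches it from below.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
L_est = empirical_lipschitz(lambda x: A @ x, np.random.default_rng(0), 2)
```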
The sample mean of the model error, $\bar\eta$, is calculated simply as
$$\bar\eta = \frac{1}{N}\sum_{n=1}^{N}\eta_n.$$
The approximated sample mean of the model error, $\hat{\bar\eta}$, is similarly calculated as
$$\hat{\bar\eta} = \frac{1}{N}\sum_{n=1}^{N}\hat\eta_n.$$
Using these definitions and Theorem ?, we can show the following bound on the estimation of the mean of the model error distribution: if $\|x^a_n - x^t_n\| \le \epsilon$ for all $n$, then $\|\hat{\bar\eta} - \bar\eta\| \le (1+L)\epsilon$.
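The mechanism behind this bound is a one-step decomposition; a sketch, writing $\hat\eta_n = x^a_{n+1} - \mathcal{M}(x^a_n)$ and $\eta_n = x^t_{n+1} - \mathcal{M}(x^t_n)$, and assuming $\|x^a_n - x^t_n\| \le \epsilon$ for all $n$:

```latex
\hat\eta_n - \eta_n
  = \left(x^a_{n+1} - x^t_{n+1}\right)
    - \left(\mathcal{M}(x^a_n) - \mathcal{M}(x^t_n)\right),
\qquad\text{so}\qquad
\|\hat\eta_n - \eta_n\|
  \le \|x^a_{n+1} - x^t_{n+1}\| + L\,\|x^a_n - x^t_n\|
  \le (1 + L)\,\epsilon .
```

Averaging over $n$ and applying the triangle inequality to the sums then transfers the same $(1+L)\epsilon$ bound to the difference of the sample means.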
We now consider the second moment of the model error distribution. The sample covariance of the model error is defined, and approximated, respectively as
$$\mathbf{Q}_s = \frac{1}{N-1}\sum_{n=1}^{N}(\eta_n - \bar\eta)(\eta_n - \bar\eta)^T, \qquad \hat{\mathbf{Q}} = \frac{1}{N-1}\sum_{n=1}^{N}(\hat\eta_n - \hat{\bar\eta})(\hat\eta_n - \hat{\bar\eta})^T.$$
By the triangle inequality,
$$\bigl\| \hat{\mathbf{Q}} - \mathbf{Q}_s \bigr\| \le \frac{1}{N-1}\sum_{n=1}^{N}\bigl\| (\hat\eta_n - \hat{\bar\eta})(\hat\eta_n - \hat{\bar\eta})^T - (\eta_n - \bar\eta)(\eta_n - \bar\eta)^T \bigr\|.$$
Introducing a subscript notation where $v_i$ refers to the $i$th component of a vector $v$ and $A_{ij}$ refers to the $(i,j)$th component of a matrix $A$, we show the following preliminary result.
By Lemma ?, if the analysis error in each component is bounded, then so are the componentwise errors $|\hat\eta_{n,i} - \eta_{n,i}|$ and $|\hat{\bar\eta}_i - \bar\eta_i|$. Thus we can combine these to bound each term in the sum above.
It is now possible to obtain a bound on the error in the covariance of the model error distribution in terms of the analysis error. For clarity, the subscript notation $(\cdot)_i$ also refers to the $i$th component of a vector that already carries a timestep subscript, e.g. $\eta_{n,i}$.
By Theorem ?, the componentwise model error estimates $\hat\eta_{n,i}$ satisfy the required bound; by Corollary ?, so does the estimated mean $\hat{\bar\eta}_i$. Hence we can use Lemma ?, and the stated bound on $|\hat{\mathbf{Q}}_{ij} - \mathbf{Q}_{s,ij}|$ follows.
Corollary ? shows that if one desires to know the mean of the model error to an accuracy of $\epsilon$, then the analysis of that variable must be within $\epsilon/(1+L)$ of the truth at every timestep. Contrast this with the result from Theorem ?: to estimate the variance of a variable to within $\epsilon$, the analysis of that variable must be accurate to a quantity that again scales with $\epsilon$, but which carries additional terms, including the mean of that variable, in the denominator. With this larger denominator, approximately an order of magnitude more accuracy is needed in the analysis to achieve the same accuracy in the approximation. Moreover, the accuracy in the variance depends on the mean value of that variable: the larger the mean value, the more accuracy required in the analysis to give the same error in the variance. This may be more easily understood conversely: the larger the mean value of a variable, the larger the error in the estimation of its variance.
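In practice, once the model-error samples $\hat\eta_n$ are available, the two sample moments are one-liners in NumPy; a minimal sketch on synthetic samples (the prescribed mean and covariance here are illustrative, not the paper's choices):

```python
import numpy as np

# Synthetic model-error samples eta_hat, shape (N, d); in a real
# experiment these come from analysis-minus-forecast differences.
rng = np.random.default_rng(1)
true_mean = np.array([0.5, -0.2])
true_cov = np.array([[0.04, 0.01],
                     [0.01, 0.09]])
eta_hat = rng.multivariate_normal(true_mean, true_cov, size=5000)

mean_est = eta_hat.mean(axis=0)                  # sample mean of the model error
cov_est = np.cov(eta_hat, rowvar=False, ddof=1)  # sample covariance, 1/(N-1) scaling
```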
In this section we apply the model error moment estimation method detailed in the previous sections to the Lorenz 96 system (Lorenz, 1996). This is a very simple example to show numerically that the errors in the approximations of $\bar\eta$ and $\mathbf{Q}$ depend on the quality of the analysis $x^a$. The Lorenz 96 system is given by the following ODE:
$$\frac{\mathrm{d}X_j}{\mathrm{d}t} = (X_{j+1} - X_{j-2})X_{j-1} - X_j + F,$$
where $j = 1, \dots, N$ with cyclic boundary conditions. We choose standard values for the dimension $N$ and the forcing $F$. We prescribe the truth to evolve as $x^t_{n+1} = \mathcal{M}(x^t_n) + \eta_n$ with $\eta_n \sim \mathcal{N}(\bar\eta, \mathbf{Q})$, where
$\bar\eta$ and $\mathbf{Q}$ have been chosen to be interesting and, in particular, for implementation purposes, $\mathbf{Q}$ has a rather simple symmetric square root.
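A Lorenz 96 step and the noisy truth evolution can be sketched as follows (the integrator choice, timestep, and default forcing are assumptions, as the paper does not fix an integration scheme here):

```python
import numpy as np

def lorenz96_tendency(x, F=8.0):
    """Lorenz 96 tendency dX_j/dt = (X_{j+1} - X_{j-2}) X_{j-1} - X_j + F,
    with cyclic boundary conditions handled by np.roll."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt, F=8.0):
    """One fourth-order Runge-Kutta step of the Lorenz 96 model."""
    k1 = lorenz96_tendency(x, F)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, F)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, F)
    k4 = lorenz96_tendency(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def truth_step(x, dt, mean, sqrt_Q, rng, F=8.0):
    """Evolve the prescribed truth one step: model forecast plus additive
    Gaussian model error eta ~ N(mean, Q), sampled via a symmetric square
    root of Q:  eta = mean + sqrt_Q @ z,  z ~ N(0, I)."""
    eta = mean + sqrt_Q @ rng.standard_normal(x.size)
    return rk4_step(x, dt, F) + eta
```

The tendency vanishes at the uniform state $X_j = F$, which gives a quick sanity check on the index conventions.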
We take observations $y_n$ of the true state at every timestep. The observations are related to the state via the equation
$$y_n = x^t_n + \epsilon_n, \qquad \epsilon_n \sim \mathcal{N}(0, \mathbf{R}).$$
Here we consider $\mathbf{R} = \sigma^2\mathbf{I}$ and will show two different experiments with observations of varying accuracy. The first experiment has a small observation error variance $\sigma^2$ and represents very informative observations. The second experiment has a larger $\sigma^2$ and represents observations containing less information.
To get an analysis $x^a_n$, we perform a simple 3DVar at each timestep. That is,
$$x^a_n = \arg\min_x \; \frac{1}{2}(x - x^f_n)^T\mathbf{B}^{-1}(x - x^f_n) + \frac{1}{2}(y_n - x)^T\mathbf{R}^{-1}(y_n - x),$$
where $x^f_n = \mathcal{M}(x^a_{n-1})$ is the background forecast. We use a very broad $\mathbf{B}$ as a particularly uninformative prior. The model is evolved for 3000 timesteps, giving us 2999 samples of $\hat\eta_n$. Figure ? shows Hovmöller diagrams of the analysis error as a result of the data assimilation process. The higher precision observations lead to significantly lower estimation errors.
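With every variable observed directly, the 3DVar minimiser has a closed form; a sketch (matrix sizes are toy-scale, and the broad $\mathbf{B}$ is an illustrative stand-in for the uninformative prior):

```python
import numpy as np

def threedvar_analysis(xf, y, B, R, H=None):
    """Closed-form 3DVar analysis:
    x_a = x_f + B H^T (H B H^T + R)^{-1} (y - H x_f).
    H defaults to the identity (every variable observed directly)."""
    if H is None:
        H = np.eye(xf.size)
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain matrix
    return xf + K @ (y - H @ xf)

# A very broad B (uninformative prior) and precise observations pull the
# analysis almost entirely onto the observation.
xa = threedvar_analysis(np.zeros(3), np.ones(3),
                        B=100.0 * np.eye(3), R=0.01 * np.eye(3))
```

With $\mathbf{B} = 100\,\mathbf{I}$ and $\mathbf{R} = 0.01\,\mathbf{I}$ the gain is $100/100.01 \approx 1$, so the analysis sits essentially on the observation.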
Using these analyses, estimates for $\bar\eta$ and $\mathbf{Q}$ are calculated. Results for $\bar\eta$ are shown in Figures ? to ?, and results for $\mathbf{Q}$ are shown in Figures ? to ?. These results show that the second moment of the model error, $\mathbf{Q}$, is much more sensitive to the quality of the analysis than the first moment, $\bar\eta$. This is consistent with the theory set out in the previous section.
This paper has considered the problem of estimating the moments of the model error of a dynamical model. The model error is approximated by the difference between the analysis and the model forecast of the analysis at the previous timestep. Bounds on the errors in both the first and second moments of the approximated model error, compared with the sample model error, are derived in terms of the error of the analysis from the truth. It is shown that to achieve the same error in the estimation of the second moment as in the first, the analysis must be an order of magnitude closer to the truth.
Numerical experiments were conducted to elucidate how the quality of the analysis affects the estimation of both the mean and the covariance of the model error. It is shown numerically with the Lorenz 96 system that the estimation of the covariance of the model error is much more sensitive to the quality of the data assimilation than the estimation of the mean of the model error.
- Empirical model-building and response surfaces.
Box, G. E. and Draper, N. R. (1987). John Wiley & Sons.
- Twin experiments with the equivalent weights particle filter and HadCM3.
Browne, P. and van Leeuwen, P. (2015). Quarterly Journal of the Royal Meteorological Society, 141(693):3399–3414.
- Variational bias correction of satellite radiance data in the ERA-Interim reanalysis.
Dee, D. P. and Uppala, S. (2009). Quarterly Journal of the Royal Meteorological Society, 135:1830–1841.
- Data assimilation.
Evensen, G. (2007). Springer.
- Stochastic Processes and Filtering Theory.
Jazwinski, A. H. (1970). Academic Press.
- A systematic method of parameterisation estimation using data assimilation.
Lang, M., van Leeuwen, P. J., and Browne, P. (2016). Tellus A, 68:1–10.
- Data Assimilation: A Mathematical Introduction.
Law, K., Stuart, A., and Zygalakis, K. (2015). Texts in Applied Mathematics. Springer International Publishing.
- Predictability: A problem partly solved.
Lorenz, E. (1996). Proc. Seminar on predictability, 1(1):40–58.
- Optimal Uncertainty Quantification.
Owhadi, H., Scovel, C., Sullivan, T. J., McKerns, M., and Ortiz, M. (2013). SIAM Review, 55(2):271–345.
- Further analysis of multilevel Monte Carlo methods for elliptic PDEs with random coefficients.
Teckentrup, A. L., Scheichl, R., Giles, M. B., and Ullmann, E. (2013). Numerische Mathematik, 125(3):569–600.
- A lag-1 smoother approach to system-error estimation: sequential method.
Todling, R. (2015). Quarterly Journal of the Royal Meteorological Society, 141(690):1502–1513.
- Accounting for an imperfect model in 4D-Var.
Trémolet, Y. (2006). Quarterly Journal of the Royal Meteorological Society, 132(621):2483–2504.
- Model-error estimation in 4D-Var.
Trémolet, Y. (2007). Quarterly Journal of the Royal Meteorological Society, 133:1267–1280.
- Nonlinear data assimilation in geosciences: an extremely efficient particle filter.
van Leeuwen, P. (2010). Quarterly Journal of the Royal Meteorological Society, 136(653):1991–1999.