Deterministic treatment of model error in geophysical data assimilation
This chapter describes a novel approach for the treatment of model error in geophysical data assimilation. In this method, model error is treated as a deterministic process fully correlated in time. This allows for the derivation of the evolution equations for the relevant moments of the model error statistics required in data assimilation procedures, along with an approximation suitable for application to large numerical models typical of environmental science. In this contribution we first derive the equations for the model error dynamics in the general case, and then for the particular situation of parametric error. We show how this deterministic description of the model error can be incorporated in sequential and variational data assimilation procedures. A numerical comparison with standard methods is given using low-order dynamical systems, prototypes of atmospheric circulation, and a realistic soil model. The deterministic approach proves to be very competitive with only minor additional computational cost. Most importantly, it offers a new way to address the problem of accounting for model error in data assimilation that can easily be implemented in systems of increasing complexity and in the context of modern ensemble-based procedures.
The prediction problem in geophysical fluid dynamics typically relies on two complementary elements: the model and the data. The mathematical model, and its discretized version, embodies our knowledge about the laws governing the system evolution, while the data are samples of the system’s state. They give complementary information about the same object. The sequence of operations that merges model and data to obtain a possibly improved estimate of the flowâs state is usually known, in environmental science, as data assimilation Daley1991 (); Kalnay2002 (). The physical and dynamical complexity of geophysical systems makes the data assimilation problem particularly involved.
The different information entering the data assimilation procedure, usually the model, the data and a background field representing the state estimate prior to the assimilation of new observations, are weighted according to their respective accuracy. Data assimilation in geophysics, particularly in numerical weather prediction (NWP) has experienced a long and fruitful stream of research in recent decades which has led to a number of advanced methods able to take full advantage of the increasing amount of available observations and to efficiently track and reduce the dynamical instabilities Evensen (). As a result the overall accuracy of the Earthâs system estimate and prediction, particularly the atmosphere, has improved dramatically.
Despite this trend of improvement, the treatment of model error in data assimilation procedures is still, in most instances, done following simple assumptions such as the absence of time correlation Jazwinski (). The lack of attention on model error is in part justified by the fact that on the time scale of NWP, where most of the geophysical data assimilation advancements have been originally concentrated, its influence is reasonably considered small as compared to the initial condition error that grows in view of the chaotic nature of the dynamics. Nevertheless, the improvement in data assimilation techniques and observational networks on the one hand, and the recent growth of interest in seasonal-to-decadal prediction on the other DobRey-et-al-13 (); Weber_et_al_2015 (), has placed model error, and its treatment in data assimilation, as a main concern and a key priority. A number of studies reflecting this concern have appeared, in the context of sequential and variational schemes DeeDaSilva1998 (); Tremolet2006 (); Tremolet2007 (); Kondrashov2008 ().
Two main obstacles toward the development of techniques taking into account model error sources are the huge size of the geophysical models and the wide range of possible model error sources. The former problem implies the need to estimate large error covariance matrices on the basis of the limited number of available observations. The second important issue is related to the multiple sources of modeling error, such as incorrect parametrisation, numerical discretization, and the lack of description of some relevant scale of motion. This latter problem has until recently limited the development of a general formulation for the model error dynamics. Model error is commonly as an additive, stationary, zero-centered, Gaussian white noise process. This choice could be legitimate by the multitude of unknown error sources and the central limit theorem. However, despite this simplification, the size of geoscientific models still makes detailed estimation of the stochastic model error covariance impractical.
In the present contribution we describe an alternative approach in which the evolution of the model error is described based on a deterministic short-time approximation. The approximation is suitable for realistic applications and is used to estimate the model error contribution in the state estimate. The method is based on the theory of deterministic dynamics of the model error that was introduced recently by Nicolis2003 (); Nicolis2004 (); Nicolis2009 (). Using this approach it is possible to derive evolution equations for the moments of the model error statistics required in data assimilation procedures, and has been applied in the context of both sequential and variational data assimilation schemes, and for errors originated from uncertain parameters and from unresolved scales.
We give here a review of the recent developments of the deterministic treatment of model error in data assimilation. To this end, we start by first formalizing the deterministic model error dynamics in Sect. 2. We show how general equations for the mean and covariance error can be obtained and discuss the parametric error as a special case. In Sections 3 and 4 the incorporation of the short-time model error evolution laws is described in the context of the Extended Kalman filter and variational scheme respectively. These two types of assimilation procedures are significantly different and are summarized in the respective Sections along with the discussion on the consequences of the implementation of the model error treatment. We provide some numerical illustrations of the proposed approach together with comparisons with other methods, for two prototypical low order chaotic systems widely used in theoretical studies in geosciences Lorenz1963 (); Lorenz1996 () and a quasi-operational soil model Mahfouf2007 ().
New potential applications of the use of the deterministic model error treatment are currently under way and are summarized, along with a synopsis of the method, in the final discussion Section 5. These include soil data assimilation with the use of new observations and ensemble based procedures Evensen ().
Let the model at our disposal be represented as:
where is typically a nonlinear function, defined in and is a -dimensional vector of parameters.
Model (1) is used to describe the evolution of a (unknown) true dynamics, i.e. nature, whose evolution is assumed to be given by the following coupled equations:
where is a vector in , and is defined in and may represent scales that are present in the real world, but are neglected in model (1); the unknown parameters have dimension . The true state is thus a vector of dimension . The model state vector and the variable of the true dynamics span the same phase space although, given the difference in the functions and , they do not have the same attractor in general. The function can have an explicit dependence on time but it is dropped here to simplify the notation.
When using model (1) to describe the evolution of , estimation error can arise from the uncertainty in the initial conditions at the resolved scale () and from the approximate description of the nature afforded by (1) which is referred as model error. A number of different sources of model errors are present in environmental modeling. Typical examples are those arising from the inadequate description of some physical processes, numerical discretization and/or the presence of scales in the actual dynamics that are unresolved by the model. The latter are typically parametrised in terms of the resolved variables (for instance the Reynolds stress of the turbulent transport).
2.1 General description of model error dynamics
Following the approach outlined in Nicolis2004 (), we derive the evolution equations of the dominant moments, mean and covariance, of the estimation error in the resolved scale (i.e. in ). The formal solutions of (1) and (2) read respectively:
with . Two types of averaging could be performed, one over a set of initial conditions sampled on the attractor of the system, and/or a set of perturbations around one specific initial state selected on the system’s attractor. In data assimilation, the second is more relevant since one is interested in the local evaluation of the uncertainty. However, in many situations the first one is used to get statistical information on covariances quantities, as will be illustrated in this Chapter. For clarity, we will refer to as the local averaging, and to for an averaging over a set of initial conditions sampled over the attractor of the system. In this section, we will only use for clarity, but it also extends to the other averaging. We will use the other notation when necessary.
In the hypothesis that the initial condition is unbiased, , Eq. (5) gives the evolution equation of the bias due to the model error, usually refers to as drift in climate prediction context. The important factor driving the drift is the difference between the true and modeled tendency fields, . Expanding (5) in Taylor series around up to the first non-trivial order, and using unbiased initial conditions, it reads:
Equation (6) gives the evolution of the bias, , the drift, in the short-time approximation and the subscript stands for model error-related bias. It is important to remark that in the case of stochastic model error treatment, and in the hypothesis of unbiased initial condition error, .
Similarly, by taking the expectation of the external product of the error anomalies by themselves, we have:
Equation (7) describes the time evolution of the estimation error covariance in the resolved scale. The first term, that does not depend on time, represents the covariance of the initial error. The two following terms account for the correlation between the error in the initial condition and the model error, while the last term combines the effect of both errors on the evolution of the estimation error covariance.
Let us focus on the last term of Eq. (7) denoted as,
The amplitude and structure of this covariance depends on the dynamical properties of the difference of the nature and model tendency fields. Assuming that these differences are correlated in time, we can expand (8) in a time series up to the first nontrivial order around the arbitrary initial time , and gets:
where is the model error covariance matrix at initial time. Note again that, if the terms are represented as white-noise process, the short-time evolution of is bound to be linear instead of quadratic. This distinctive feature is relevant in data assimilation applications where model error is often assumed to be uncorrelated in time, a choice allowing for a reduction of the computational cost associated with certain types of algorithms Tremolet2006 (); CV10 ().
2.2 Model error due to parameter uncertainties
We assume for simplicity that the model resolves all scales present in the reference system. Under the aforementioned hypothesis that the model and the true trajectories span the same phase space, nature dynamics, (2), can be rewritten as:
The function , which has the same order of magnitude of and is scaled by the dimensionless parameter , accounts for all other extra terms not included in the model and depends on the resolved variable and on a set of additional parameters . In a more formal description, this would correspond to a function relating the variables and under an adiabatic elimination Nicolis2004 (). We are interested here in a situation in which the main component of the nature dynamics is well captured by the model so that , and the extra terms described by are neglected. We concentrate in a situation in which model error is due only to uncertainties in the specification of the parameters appearing in the evolution law . This formulation accounts, for instance, for errors in the description of some physical processes (dissipation, external forcing, etc.) represented by the parameters.
An equation for the evolution of the state estimation error can be obtained by taking the difference between the first rhs term in (10) and (1). The evolution of depends on the error estimate at the initial time (initial condition error ) and on the model error. If is ”small”, the linearized dynamics provides a reliable approximation of the actual error evolution. The linearization is made along a model trajectory, solution of (1), by expanding, to first order in and , the difference between Eqs. (10) and (1):
The first partial derivative on the rhs of (11) is the Jacobian of the model dynamics evaluated along its trajectory. The second term, which corresponds to the model error, will be denoted hereafter to simplify the notation;
The solution of (11), with initial condition at , reads:
with being the fundamental matrix (the propagator) relative to the linearized dynamics along the trajectory between and . We point out that and in (12) depend on (the integration variable) through the state variable . Equation (12) states that, in the linear approximation, the error in the state estimate is given by the sum of two terms, the evolution of initial condition error, , and the model error, . The presence of the fundamental matrix in the expression for suggests that the instabilities of the flow plays a role in the dynamics of model error.
Let us now apply the expectation operator to (12) defined locally around the reference trajectory, by sampling over an ensemble of initial conditions and model errors, and the equation for the mean estimation error along a reference trajectory reads:
In a perfect model scenario an unbiased state estimate at time () will evolve, under the linearized dynamics, into an unbiased estimate at time . In the presence of model error and, depending on its properties, an initially unbiased estimate can evolve into a biased one with being the key factor.
The dynamics of the state estimation error covariance matrix can be obtained by taking the expectation of the outer product of with itself. Assuming that the estimation error bias is known and removed from the background error, we get:
The four terms of the r.h.s. of (14) give the contribution to the estimation error covariance at time due to the initial condition, model error and their cross correlation, respectively. These integral equations are of little practical use for any realistic nonlinear systems, let alone the big models used in environmental prediction. A suitable expression can be obtained by considering their short-time approximations through a Taylor expansion around . We proceed by expanding (12) in Taylor series, up to the first non trivial order, only for the model error term while keeping the initial condition term, , unchanged. In this case, the model error evolves linearly with time according to:
By adding the initial condition error term, , we get a short time approximation of (12):
For the mean error we get:
Therefore, as long as is different from zero, the bias due to parametric error evolves linearly for short-time, otherwise the evolution is conditioned by higher orders of the Taylor expansion. Note that the two terms in the short time error evolution (19) and (20), are not on equal footing since, in contrast to the model error term, which has been expanded up to the first nontrivial order in time, the initial condition error evolution contains all the orders of times . The point is that, as explained below, we intend to use these equations to model the error evolution in conjunction with the technique of data assimilation for which the full matrix , or an amended ensemble based approximation, is already available.
Taking the expectation value of the external product of (19) by itself and averaging, we get:
Equation (21) is the short time evolution equation, in this linearized setting, for the error covariance matrix in the presence of both initial condition and parametric model errors.
3 Deterministic model error treatment in the extended Kalman filter
We describe here two formulations of the extended Kalman filter (EKF) incorporating a model error treatment. The Short-Time-Extended-Kalman-Filter, ST-EKF CVN08 () accounts for model error through an estimate of its contribution to the assumed forecast error statistics. In the second formulation, the Short-Time-Augmented-Extended-Kalman-Filter, ST-AEKF CV11a (), the state estimation in the EKF is accompanied with the estimation of the uncertain parameters. This is done in the context of a general framework known as state augmentation Jazwinski (). In both cases model error is treated as a deterministic process implying that the dynamical laws described in the previous section are incorporated, at different stages, in the filter formulations.
The EKF extends, to nonlinear dynamics, the classical Kalman filter (KF) for linear dynamics Kalman1960 (). The algorithm is sequential in the sense that a prediction of the system’s state is updated at discrete times, when observations are present. The state update, the analysis, is then taken as the initial condition for the subsequent prediction up to the next observation time. The EKF, as well as the standard KF for linear dynamics, is derived in the hypothesis of Gaussian errors whose distributions can thus be fully described using only the first two moments, the mean and the covariance. Although this can represent a very crude approximation, especially for nonlinear systems, it allows for a dramatic reduction of the cost and difficulties involved in the time propagation of the full error distribution.
The model equations can conveniently be written in terms of a discrete mapping from time to :
where and are the forecast and analysis states respectively and is the nonlinear model forward operator (the resolvent of (1)).
Let us assume that a set of noisy observations of the true system (2), stored as the components of an -dimensional observation vector , is available at the regularly spaced discrete times , , with being the assimilation interval, so that:
where is the observation error, assumed here to be Gaussian with known covariance matrix and uncorrelated in time. is the (possibly nonlinear) observation operator which maps from model to observation space (i.e. from model to observed variables) and may involve spatial interpolations as well as transformations based on physical laws for indirect measurements JanicCohn06 ().
For the EKF, as well as for most least-square based assimilation schemes, the analysis state update equation at an arbitrary analysis time , reads Jazwinski ():
where the time indexes are dropped to simplify the notation. The analysis error covariance, , is updated through:
The gain matrix is given by:
where is the forecast error covariance matrix and the linearized observation operator (a real matrix). The analysis update is thus based on two complementary sources of information, the observations, , and the forecast . The errors associated to each of them are assumed to be uncorrelated and fully described by the covariance matrices and respectively.
In the EKF, the forecast error covariance matrix, , is obtained by linearizing the model around its trajectory between two successive analysis times and . In the standard formulation of the EKF model error is assumed to be a random uncorrelated noise whose effect is modeled by adding a model error covariance matrix, , at the forecast step so that Jazwinski ():
In practice the matrix should be considered as a measure of the variability of the noise sequence. This approach has been particularly attractive in the past in view of its simplicity and because of the lack of more refined model for the model error dynamics. Note that while is propagated in time and is therefore flow dependent, is defined once for all and it is then kept constant.
3.1 Short Time Extended Kalman Filter - ST-EKF
We study here the possibility of estimating the model error covariance, , on a deterministic basis CVN08 (). The approach uses the formalism on model error dynamics outlined in Sect. 2.
Model error is regarded as a time-correlated process and the short-time evolution laws (6) and (9) are used to estimate the bias, , and the model error covariance matrix, , respectively. The adoption of the short-time approximation is also legitimated by the sequential nature of the EKF, and an important practical concern is the ratio between the duration of the short-time regime and the length of the assimilation interval over which the approximation is used Nicolis2004 ().
A key issue is the estimation of the two first statistical moments of the tendency mismatch, , required in (6) and in (9) respectively. The problem is addressed assuming that a reanalysis dataset of relevant geophysical fields is available and is used as a proxy of the nature evolution. Reanalysis programs constitute the best-possible estimate of the Earth system over an extended period of time, using an homogeneous model and data assimilation procedure, and are of paramount importance in climate diagnosis (see e.g. Dee2011 ()).
Let us suppose to have access to such a reanalysis which includes the analysis, , and the forecast field, , so that , and is the assimilation interval of the data assimilation scheme used to produce the reanalysis; the suffix stands for reanalysis. Under this assumption the following approximation is made:
The difference between the analysis and the forecast, , is usually referred, in data assimilation literature, to as the analysis increment. From (28) we see that the vector of analysis increments can be used to estimate the difference between the model and the true tendencies. A similar approach was originally introduced by Leith (1978) Leith1978 (), and it has been used recently to account for model error in data assimilation Li2009 ().
Note that the estimate (28) neglects the analysis error, so that its accuracy is connected to that of the data assimilation algorithm used to produce the reanalysis, which is in turn related to the characteristics of the observational network such as number, distribution and frequency of the observations. However this error is present and acts as an initial condition error, a contribution which is already accounted for in the EKF update by the forecast error covariance, . As a consequence when (9) is used to estimate only the model error component, an overestimation is expected that can be overcome by an optimal tuning of the amplitude of and .
The bias is then removed from the forecast field before the latter enters the EKF analysis update, (24). The scalar term is a tunable coefficient aimed at optimizing the bias size to account for the expected overestimation connected with the use of (28). In a similar way the model error contribution to the forecast error covariance can be estimated taking the external product of (28) after removing the mean and reads:
We consider now the particular case of parametric error. The forecast error covariance , is estimated using the short-time evolution (21) where the correlation terms are neglected and the model error covariance, is evolved quadratically in the intervals between observations. An additional advantage is that can be straightforwardly adapted to different assimilation intervals and for the assimilation of asynchronous observations. At analysis times the forecast error bias due to the model error, , can be estimated on the basis of the short-time approximation (20):
By neglecting the correlation terms and dropping the time dependence for convenience, Eq. (21) can be rewritten as:
where is the analysis error covariance matrix, as estimated at the last analysis time, and
An essential ingredient of the ST-EKF in the case of parametric error is the matrix : it embeds the information on the model error through the unknown parametric error and the parametric functional dependence of the dynamics. In CVN08 () it was supposed that some a-priori information on the model error was at disposal and could be used to prescribe and then used to compute and required by the ST-EKF. The availability of information on the model error, which may come in practice from the experience of modelers, is simulated by estimating and averaging over a large sample of states on the system’s attractor as,
The same assumption is adopted here in the numerical applications with the ST-EKF described in Sect. 3.2.1.
In summary, in the ST-EKF, either in general or in the parameteric error case, once and are estimated (with (29)-(30) or (31)-(33) respectively) they are then kept constant along the entire assimilation cycle. Model error is thus repeatedly corrected in the subspace spanned by the range of where it is supposed to be confined. This choice reflects the assumption that the impact of model uncertainty on the forecast error does not fluctuate too much along the analysis cycle. Finally, in the ST-EKF, the forecast field and error covariance are transformed according to:
Numerical Results with ST-EKF. Error due to unresolved scale
We show here numerical results of the ST-EKF for the case of model error arising from the lack of description of a scale of motion. The case of parametric error is numerically tested in Sect. 3.2.1. A standard approach, known in geosciences as observation system simulation experiments (OSSE), is adopted here Bengt81 (). This experimental setup is based on a twin model configuration in which a trajectory, solution of the system taken to represent the actual dynamics, is sampled to produce synthetic observations. A second model provides the trajectory that assimilates the observations.
As a prototype of two-scales chaotic dynamics we consider the model introduced by Lorenz1996 (), whose equations read:
The model possesses two distinct scales of motion evolving according to (38) and (39), respectively. The large/slow scale variable, , represents a generic meteorological variable over a circle at fixed latitude. In both set of equations, the quadratic term simulates the advection, the second rhs term the internal dissipation, while the constant term in (38) plays the role of the external forcing. The two scales are coupled in such a way that the small/fast scale variables inhibit the larger ones, while the opposite occurs for the effect of the variables on . According to Lorenz1996 () the variables can be taken to represent some convective-scale quantity, while the variables favor this convective activity. The model parameters are set as in Lorenz1996 (): , which makes the variables to vary ten times slower than , with amplitudes ten times larger, while and . With this choice, the dynamics is chaotic. The numerical integration have been performed using a fourth-order Runge-Kutta scheme with a time step of 0.0083 units, corresponding to 1 hour of simulated time.
In the experiments the full equations (38) - (39), are taken to represent the truth, while the model sees only the slow scale and its equations are given by (38) without the last term. A network of regularly spaced noisy observations of is simulated by sampling the reference true trajectory and adding a Gaussian random observation error. We first generate a long record of analysis for the state vector, , which constitutes the reanalysis dataset. The EKF algorithm is run for years with assimilation interval hours, and observation variance set to of the system’s climate variance. From this long integration we extract the record of analysis increments required in (29) and (30).
An illustration of the impact of the proposed treatment of the model error is given in Fig. 1, which shows a days long assimilation cycle. The upper panel displays the true large scale variable (blue line), the corresponding estimates obtained with the ST-EKF and the EKF without the model error treatment (red and yellow lines respectively) and the observations (green marks). The error variance of the EKF estimates are shown in the bottom panel. From the top panel we see the improvement in the tracking of the true trajectory obtained by implementing the proposed model error treatment; this is particularly evident in the proximity of the maxima and minima of the true signal. The benefit is further evident by looking at the estimated error variance which undergoes a rapid convergence to values close or below the observation error.
A common practical procedure used to account for model error in KF-like and ensemble-based schemes, is the multiplicative covariance inflation Anderson1999 (). The forecast error covariance matrix is multiplied by a scalar factor and thus inflated while keeping its spatial structure unchanged, so that before its use in the analysis update, (24). We have followed the same procedure here and have optimized the EKF by tuning the inflation factor ; the results are reported in Fig. 2(a) which shows the normalized estimation error variance as a function of . The experiments last for days, and the results are averaged in time, after an initial transient of days, and over a sample of random initial conditions. The best performance is obtained by inflating by of its original amplitude and the estimation error variance is about of the system’s climate variance, slightly above the observation error variance. Note that when filter divergence occurs in some of the experiments.
We now test the sensitivity of the ST-EKF to the multiplicative coefficient in (29) and (30). The results are reported in Fig. 2(b), which shows the estimation error variance as a function of . As above the averages are taken over days and over the same ensemble of random initial conditions. The important feature is the existence of a range of values of , for which the estimation error is below the observation error level. Note that for , the estimation error is about of the climate’s variance, below the observational accuracy. This result highlights the accuracy of the estimate of despite the simplifying assumptions such as the one associated with the correlation between model error and initial condition error and the use of the reanalysis field as a proxy of the actual true trajectory. Interestingly, the best performance is obtained with , in agreement with the expected overestimation connected with the use of (28).
In Fig. 2(c) we explicitly compare the EKF with the optimal inflation for , (, ), with the EKF implementing the model error treatment through the matrix estimated according to (30) and tuned with the optimal values of the scalar coefficient . The figure displays the estimation error variance as a function of time. Note in particular the ability of the filter, using , to keep estimation error small even in correspondence with the two large deviations experienced by the EKF employing the forecast error covariance inflation.
3.2 Short Time Augmented Extended Kalman Filter - ST-AEKF
Let us now turn to the state-augmentation approach. In this case we will assume that model errors arise from mis-specifications of some parameters, so that the theory depicted in Section 2.2 can be used. This view restricts us to parametric errors, but it also reflects our limited knowledge of the sub-grid scale processes that are only represented through parametrisation schemes for which only a set of parameters is accessible.
A straightforward theory exists for the estimation of the uncertain parameters along with the system’s state. The approach, commonly known as state-augmentation Jazwinski (), consists in defining an augmented dynamical system which allocates, along with the system’s state, the model parameters to be estimated. The analysis update is then applied to this new augmented dynamical system. Our aim here is to use the state-augmentation approach in conjunction with the deterministic formulation of the model error dynamics.
The dynamical system (22), the forecast model, is augmented with the model parameters, as follows:
where is the augmented state vector. The augmented dynamical system includes the dynamical model for the system’s state, , and a dynamical model for the parameters . In the absence of additional information, a persistence model for is usually assumed so that and . Recently, a temporally smoothed version of the persistence model has been used in the framework of a square root filter YangDelsole2009 (). The state-augmented formulation is also successfully applied in the context of the general class of ensemble-based data assimilation procedures Ruiz2013 ().
By proceeding formally as for Eq. (12) we can write the linearized error evolution for the augmented system, in an arbitrary assimilation interval, , with initial condition given by the augmented state analysis error, :
with . The parametric error is constant over the assimilation interval in virtue of the assumption . Equation (41) describes, in the linear approximation, the error evolution in the augmented dynamical system (40). The short-time approximation of the error dynamics (41) in the interval reads:
where the matrix is the error covariance of the state estimate , is the parametric error covariance and the error correlation matrix between the state vector, , and the vector of parameters . These correlations are essential for the estimation of the parameters. In general one does not have access to a direct measurement of the parameters, and information are only obtained through observations of the system’s state. As a consequence, at the analysis step, the estimate of the parameters will be updated only if they correlate with the system’s state, that is . The gain of information coming from the observations is thus spread out to the full augmented system phase space.
Let us define, in analogy with (43), the analysis error covariance matrix for the augmented system:
Note that (45) is equivalent to (32), except that now the correlations between the initial condition and the model error are maintained (last two terms on the r.h.s. of (45)), and and replace and . Nevertheless, in contrast to the ST-EKF where is estimated statistically and then kept fixed, in the ST-AEKF is estimated online using the observations.
The information on the uncertainty in the model parameters is embedded in the error covariance , a by-product of the assimilation. Using the definition of and (46), the matrix can be rewritten as:
Similarly, the correlation terms in (45) can be written according to:
The three terms in (50) represent the contribution to the forecast state error covariance coming from the analysis error covariance in the system’s state, in the parameters and in their correlation respectively.
By making use of the definition of the model error vector in (47), the forecast error correlation matrix becomes:
with being the ST-AEKF forward operator defined as:
where is the identity matrix.
The short-time bias equation (20) is used to estimate the bias in the state forecast, , due to parametric error, in analogy with the ST-EKF. This estimate is made online using the last innovation of the parameter vector. Assuming furthermore that the forecast of the parameter is unbiased, the bias in the state augmented forecast at time reads:
The bias is then removed from the forecast field before the latter is used in the analysis update, that is , where is the unbiased state augmented forecast.
As for the standard EKF, we need the observation operator linking the model to the observed variables. An augmented observation operator is introduced, with as in (23), and its linearization, is now a matrix in which the last columns contain zeros; the rank deficiency in reflects the lack of direct observations of the model parameters.
The augmented state and covariance update complete the algorithm:
where the vector of observations is the same as in the standard EKF while is now the identity matrix. The augmented gain matrix is defined accordingly:
but it is now a matrix.
Equations (40) - (52) for the forecast step, and (55) - (57) for the analysis update define the ST-AEKF. The algorithm is closed and self consistent meaning that, once it has been initialized, it does not need any external information (such as statistically estimated error covariances) and the state, the parameters and the associated error covariances are all estimated online using the observations.
The ST-AEKF is a short-time approximation of the classical augmented EKF, the AEKF Jazwinski (). In essence, the approximation consists of the use of an analytic expression for the evolution of the model error component of the forecast error covariance. This evolution law, quadratic for short-time, reflects a generic and intrinsic feature of the model error dynamics, connected to the model sensitivity, to perturbed parameters and to the degree of dynamical instability. It does not depend on the specific numerical integration scheme adopted for the evolution of the model state. The state error covariance, , in the ST-AEKF, is evolved as in the standard AEKF: the propagator is the product of the individual relative to each time-step within the assimilation interval. The difference between the two algorithms is in the time propagation of the forecast error covariance associated with the misspecification of the parameters, . In the ST-AEKF this is reduced to the evaluation of the off diagonal term in the operator . This term replaces the full linearization of the model equations with respect to the estimated parameters, required by the AEKF. In this latter case the model equations are linearized with respect to the augmented state, , giving rise to an augmented tangent linear model, . This linearization can be particularly involved Kondrashov2007 (), especially in the case of implicit or semi-implicit integration schemes such as those often used in NWP applications Kalnay2002 (). The propagator relative to the entire assimilation interval is then given by the product of the individual augmented tangent linear propagator over the single time-steps. As a consequence the cost of evolving the model error covariance in the AEKF grows with the assimilation interval. In the ST-AEKF, the use of the short-time approximation within the assimilation interval makes straightforward the implementation of the parameter estimation in the context of a pre-existing EKF, without the need to use an augmented tangent linear model during the data assimilation interval. It reduces the computational cost with respect to the AEKF, because the propagation of the model error component does not depend on the length of the assimilation interval. Nevertheless the simplifications in the setup and the reduction in the computational cost are obtained at the price of a decrease in the accuracy with respect to the AEKF. The degree of dynamical instabilities along with the length of the assimilation interval, are the key factors affecting the accuracy of the ST-AEKF.
Numerical Results with ST-EKF and ST-AEKF
Numerical experiments are carried out with two different models. OSSEs are performed first using the Lorenz ’96 Lorenz1996 () model used in Sect. 3.1.1, but in its one-scale version given by:
where the parameter associated with the advection, , linear dissipation, and the forcing , are written explicitly. As for the experiments described in Sect. 3.1.1, the numerical integration are performed using a fourth-order Runge-Kutta scheme with a time step of 0.0083 units, corresponding to 1 hour of simulated time.
The reference trajectory, representing the true evolution we intend to estimate, is given by a solution of (58) with parameters ; with this choice the model behaves chaotically. A network of regularly spaced noisy observations is simulated by sampling the reference true trajectory and adding a Gaussian random observation error whose variance is set to of the system’s climate variance. Model error is simulated by perturbing , and with respect to their reference true values. Gaussian samples of states and model parameters are used to initialize assimilation cycles lasting for year. In all the experiments the initial condition error variance is set to of the system’s climate variance. The model parameters are sampled from a Gaussian distribution with mean equal to and standard deviation of .
We compare four configurations of the EKF: (1) standard EKF without model error treatment, (2) standard EKF using a perfect model, (3) ST-EKF (Sect. 3.1), and (4) ST-AEKF (Sect. 3.2). Recall that in the ST-EKF, model error bias and covariance are estimated according to (31) and (33) with and evaluated on a statistical basis before the assimilation experiments. The expectation of is estimated through:
and is then used in (31) and (33). In (59) the averages are taken over the same Gaussian sample of initial conditions and parameters used to initialize the data assimilation experiments, using the actual value of the parameter, , as the reference. This idealized procedure has been chosen to give the ST-EKF the best-possible statistical estimate of the model error in view of its comparison with the more sophisticated ST-AEKF.
Figure 3 shows the analysis error variance as a function of time for the four experiments of one year long; the assimilation interval is hours. The errors are spatio-temporal average over the ensemble of experiments and over the model domain, and normalized with the system’s climate variance.
The figure clearly shows the advantage of incorporating a model error treatment: the average error of the ST-EKF is almost half of the corresponding to the standard EKF without model error treatment. However using the ST-AEKF the error is further reduced and attains a level very close to the perfect model case.
The benefit of incorporating a parameter estimation procedure in the ST-AEKF is displayed in Fig. 4 that shows the time mean analysis error variance for the EKF with perfect model and the ST-AEKF (top panel), along with the relative parametric errors as a function of time (bottom panel). The time series of the ST-AEKF analysis error variance is also superimposed to the time-averages in the top panel. Figure 4 reveals that the ST-AEKF is successful in recovering the true parameters. This reconstruction is very effective for the forcing, , and the advection, , and at a lesser extent for the dissipation, .
The ability of the ST-AEKF to efficiently exploit the observations of the system’s state to estimate an uncertain parameter, either multiplicative or additive, is evident. Given that the innovation in the parameter, obtained via Eq. (55), is proportional to the cross-covariance forecast error, , the accuracy of the parameter estimation revealed by Fig. 4 turns out to be an indication of the quality of the short-time approximation, (51), on which the estimate of is based.
Figure 5 focuses on the comparison between the ST-AEKF and the standard AEKF. The experiments are carried out for , and hours and with . As above, the results are averaged over an ensemble of experiments, and the observation error variance is of the system’s climate variance. The left panels display the quadratic estimation error, while the parametric error is given in the panels in the right column; note that the logarithm scale is used in the y-axis. The estimation error relative to the EKF with a perfect model is also displayed for reference.
We see that as expected the AEKF has a superior skill than the ST-AEKF but for or hours their performances are very similar. The AEKF shows a marked rapidity to reach convergence but the asymptotic error level attained by the two filters are practically indistinguishable. On the other hand for hours the ST-AEKF diverges whereas the AEKF is able to control error growth and maintain the estimation error to a low level. We first observe that in all but one cases the parametric error in the experiments with the AEKF is lower than for the ST-AEKF, in agreement with the observed lower state estimation error. Anyhow when or hours, the asymptotic parametric errors of the two filters are very similar, a remarkable result considering the approximate evolution law used in the ST-AEKF. An important difference is the extreme variability observed in the parametric error with the ST-AEKF as compared to the smoothness of the corresponding solutions with the AEKF. Note also that when hours the ST-AEKF reduces the error in the forcing more than the AEKF but the error curves are subject to very large fluctuations. The dissipation, , appears as the most difficult parameter to be estimated in agreement with what observed in Fig. 4. In summary, Fig. 5 suggests that the ST-AEKF may represent a suitable and efficient alternative to the full AEKF when the assimilation interval does not exceed the time range of validity of the approximation on which the ST-AEKF is based. The results indicate that this limit is between and hours given that the ST-AEKF diverges when hours. According to the theory outlined in Nicolis2003 (), the short-time regime is related to the inverse of twice the largest (in absolute value) Lyapunov exponent of the system. In the Lorenz system (58) the largest Lyapunov exponent turns out to be the most negative one, equal to day, so that the duration of the short-time regime is estimated to be about hours, in qualitative agreement with the performance of the ST-AEKF. Finally note that the slight deterioration in the filter accuracy is compensated by a reduction in both the computational and implementation costs with respect to the AEKF.
The second model under consideration is an offline version of the operational soil model, Interactions between Surface, Biosphere,and Atmosphere (ISBA) NoilhanMahfouf1996 (). In the experiments that follow, ST-AEKF has been implemented in the presence of parametric errors in the Leaf Area Index (LAI) and in the Albedo; more details, along with the case of other land surface parameters, can be found in C_ISBA12 (). OSSEs are performed using the two-layers version of ISBA which describes the evolution of soil temperature and moisture contents; the model is available within a surface externalized platform (SLDAS; Mahfouf2007 ()). The state vector, , contains the surface and deep soil temperatures and and the corresponding water contents and . The vector is taken to represent the set of model parameters. A detailed description of ISBA can be found in NoilhanMahfouf1996 ().
The forcing data are the same for the truth and the assimilation solutions. They consist of 1-hourly air temperature, specific humidity, atmospheric pressure, incoming global radiation, incoming long-wave radiation, precipitation rate and wind speed relative to the ten summers in the decade 1990-1999 extract from ECMWF Re-analysis ERA40 and then dynamically down-scaled to 10 km horizontal resolution over Belgium Hamdi2012 (). The fields are then temporally interpolated to get data consistent with the time resolution of the integration scheme of ISBA (300 s). In this study ISBA is run in one offline single column mode for a 90 day period, and the forcing parameters are those relative to the grid point closest to Brussels. An one-point soil model has been also used by Orescanin2009 (), for parameter estimation using an ensemble based assimilation algorithm.
The simulated observations are and , interpolated between the forcing level (20 m) and the surface with the Geleyn’s interpolation scheme (Geleyn1988 ()), at , , and UTC. The assimilation interval is hours, while the observational noise is drawn from a Gaussian, , with zero-mean and covariance given by the diagonal matrix with elements: . As explained in Mahfouf2009 (), the observation operator, relating the state vector to the observation includes the model integration. The initial and required by the EKF are set as diagonal with elements 111The values of and are expressed as soil wetness index where is the volumetric field capacity and is the wilting point. and Mahfouf2007 ().
Parametric errors is introduced by perturbing simultaneously the LAI and the albedo. These parameters strongly influence the surface energy balance budget and partitioning, which in turn regulate the circulation patterns and modify the hydrological processes. For each summer in the period 1990-1999, a reference trajectory is generated by integrating the model with LAI = 1 and albedo = 0.2. Around each of these trajectories, Gaussian samples of initial conditions and uncertain parameters are used to initialize the assimilation cycles. The initial conditions are sampled from a distribution with standard deviation , whereas LAI and the albedo are sampled with standard deviations, and respectively (Ghilain2011 ()). The initial in the ST-AEKF read