A Construction of the quasilinearized penalty matrix in a general system of nonlinear ODEs

# Estimation and approximation in nonlinear dynamic systems using quasilinearization

## Abstract

Nonlinear (systems of) ordinary differential equations (ODEs) are common tools in the analysis of complex one-dimensional dynamic systems. In this paper we propose a smoothing approach regularized by a quasilinearized ODE-based penalty in order to approximate the state functions and estimate the parameters defining nonlinear differential systems from noisy data. Within the quasilinearized spline based framework, the estimation process reduces to a conditionally linear problem for the optimization of the spline coefficients. Furthermore, standard ODE compliance parameter(s) selection criteria are easily applicable and conditions on the state function(s) can be eventually imposed using soft or hard constraints. The approach is illustrated on real and simulated data.

Keywords: Differential conditions; Nonlinear differential equations; Penalized splines; Quasilinearization

## 1 Introduction

The analysis of one dimensional dynamic systems represents a crucial task in many scientific fields. This kind of problem is usually tackled invoking differential calculus tools by defining (a system of) ordinary differential equation(s) which (analytic or numerical) solution(s) appropriately describes the observed dynamics.
In many circumstances, the parameters involved in the differential model synthesizing the observed phenomenon are not known. In these cases statistical procedures are applied to estimate them from noisy data. In general it is assumed that changes in the states of a dynamic system are governed by a set of differential equations:

 dxdt(t)=f(t,x,θ),t∈[0;T], (1)

where is a known nonlinear function in the states and an unknown vector of parameters. It is assumed that only a subset of the state functions are observed at time point , , with additive measurement error . We denote by the corresponding measurement. For simplicity, we assume that the error terms are distributed according to a standardized Gaussian distribution. The goal is to estimate the ODE parameters , the precision of measurements and to approximate the solution of the ODE model given in Eq. (1) from the (noisy) observations.
Many approaches have been proposed in the literature to estimate these sets of unknowns. The first reference in this field is probably Hotelling (1927) that tackled this task from a regression point of view. Following this idea, Biegler et al. (1986) proposed a nonlinear least squares approach for the estimation of the ODE parameters using suitable numerical schemes to approximate the state function(s). This approach suffers of some drawbacks. It can be computationally demanding to deal with complex differential systems and the quality of the final estimates can be severely influenced by possible error propagation effects (related to the adopted numerical scheme). Finally these techniques become unpractical for inferential purposes in complex dynamics.
For these reasons the penalized smoothing collocation approach of Ramsay et al. (2007) represents an attractive alternative. This estimation procedure consists in a compromise between a flexible (regression based) description of the recorded measurements and the (numerical) solution of the hypothesized differential system obtained through a collocation scheme. In particular, this framework exploits a B-spline approximation of the state function. The flexibility induced by the B-spline approximation is then counterbalanced by a penalty related to the differential model. The ODE-penalty term is defined as the integral of the ODE operator evaluated at the B-spline approximation of the state function. It measures the fidelity of the B-spline approximation to the ODE model. An ODE-compliance parameter balances between the fidelity to the data and the compliance of the B-spline approximation to the ODE model. This approach can be adopted both in case of linear and nonlinear differential equations.
In contrast to the linear cases, estimation in nonlinear systems using the approach by Ramsay et al. (2007) requires a nonlinear least squares step and the application of the implicit function theorem both for point and interval estimates of the ODE parameters. Furthermore, even if the compliance parameter still tunes the balance between data fitting and ODE model fidelity, the application of standard selection procedure can be unpractical to deal with nonlinear ODE systems.
In this paper, we overcome these challenges by introducing a quasilinearization scheme (Bellman and Kalaba, 1965) in the estimation procedure. Our quasilinearized ODE-P-spline (QL-ODE-P-spline) approach can be viewed as a convenient generalization of the P-spline collocation procedure of Ramsay et al. (2007) as it permits an explicit link between the spline coefficients and the unknown differential parameters. Furthermore, standard approaches for selecting/optimizing the ODE-compliance parameters become directly applicable.
In this paper we consider two differential problems as working examples to illustrate our proposal and evaluate its performances and robustness through simulations: 1) a simple first order ODE with initial value condition having a closed form solution; 2) the well known Van der Pol system of ODEs. The first order initial value problem

 ⎧⎨⎩dxdt(t)=(θ1−2θ2t)x2(t)x(0)=x0, (2)

with explicit solution

 x(t)=1θ2t2−θ1t+x−10, (3)

appears particularly attractive as it has a closed form solution, enabling an objective comparison of the merits of our approach with the nonlinear least squares one. The second working example is represented by the Van der Pol system:

 Missing or unrecognized delimiter for \right (4)

In this system, the state function describes a non-conservative oscillator with nonlinear damping. In particular, it has been proposed to describe the dynamics of current charge in a nonlinear electronic oscillator circuit (Van der Pol, 1926). No exact solution can be derived for non-zero parameter. This model also found applications in other fields such as biology and seismology.
This paper is organized as follows. In Section 2 the key features of the generalized profiling estimation procedure proposed by Ramsay et al. (2007) are briefly recalled. In Section 3, the introduction of a quasilinearization step in the profiling framework is illustrated to highlight the strengths of our QL-ODE-P-spline method. Its properties are studied using simulations in Section 4. We conclude the paper with the analysis of two challenging datasets in Section 5 followed by a discussion in Section 6.

## 2 Generalized profiling estimation in a nutshell

In this section, the key steps of the profiling procedure proposed by Ramsay et al. (2007) are briefly reminded. First of all, each state function involved in the system of (nonlinear) differential equations is approximated using a B-spline basis function expansion:

 ˜xj(t) = Kj∑k=1bjk(t)αjk, = [bj(t)]⊤αj

where is a -vector of spline coefficients and is the -vector of B-spline basis functions evaluated at time . The variation of the B-spline approximations are shrunk by introducing an ODE-based penalty. More precisely, for each differential equation of the system, a penalty term is introduced:

 PENj(α|θ)=∫(d˜xjdt(t)−fj(t,˜x,θ))2dt, (5)

where . The full fidelity measure to the ODE-model is then defined as:

 PEN(α|θ,γ)=d∑j=1γjPENj(α). (6)

The compromise between data fitting and model fidelity is tuned by the ODE-compliance parameters . For values of close to zero, the final estimates tend to overfit the data by satisfying nearly exclusively the goodness of fit criterion. For , the final approximation tends to mimic the solution of the ODE model defining the penalty.
The ODE parameters and the spline coefficients are estimated by profiling the likelihood. This approach requires to optimize the choice of the ODE parameters by considering the spline coefficients as nuisance parameters. For given values of the ODE parameters , of the precisions of measurement and of the ODE-compliance parameters , the optimal spline coefficients are found as the maximizer of:

 J(α|θ,γ,τ,y)=∑j∈J{nj2log(τj)−τj2∥∥yj−˜xj(t)∥∥2}−12PEN(α|θ,γ). (7)

In a second step, given the current value of the ODE compliance parameters, the vector is estimated together with the precisions of measurement optimizing the following data fitting criterion:

 H(θ,τ|y,ˆα)=∑j∈J{nj2log(τj)−τj2∥∥yj−Bjˆαj(θ,τ,γ,y)∥∥2}, (8)

with the B-spline matrix defined for the th state function.
As the definition of the penalty term is based on (a system of) nonlinear differential equation(s), the minimizer of cannot be found analytically. This makes the dependence of on and partially implicit. Therefore, the function can only be optimized combining a non linear least squares solver with the implicit function theorem. Similar arguments are involved in the approximation of the variance-covariance matrix of (see Ramsay et al. (2007) for details). The selection of the optimal ODE-compliance parameters is a non trivial task as common smoothing parameter selection criteria cannot be applied.

## 3 Quasilinearized ODE-P-spline approach

When nonlinearity becomes an issue for the estimation process, it is convenient to approximate nonlinear functions with linear surrogates (obtained for example through first order Taylor expansion). Quasilinearization (Bellman and Kalaba, 1965) generalizes this idea to approximate the solution of differential equations. In the generalized profiling estimation procedure, provided that differential equations are linear, the estimation of and can be made by iterating simple least squares procedures with ODE compliance parameters selected using standard procedures. From the discussion at the end of Section 2 it appears that dealing with nonlinear ODEs is more challenging. Our main suggestion is to amend the estimation procedure using quasilinearization.

### 3.1 Quasilinearization of the penalty

We propose to linearize the nonlinear part of the ODE model using an iterative approximation procedure. More precisely, given the set of spline coefficients at iteration and the induced estimates of the -th state function , we propose to approximate the -th component of the ODE penalty term at iteration by:

 PENj(α(i+1)|α(i),θ)≈∫⎛⎜⎝d˜x(i+1)jdt(t)−fj(t,˜x(i),θ)−d∑k=1(˜x(i+1)k(t)−˜x(i)k(t))∂fj∂xk(t,˜x(i),θ)⎞⎟⎠2dt. (9)

Then, the overall fidelity measure to the ODE in Eq. (6) can be approximated by:

 PEN(α(i+1)|α(i),θ,γ) = d∑j=1γjPENj(α(i+1)|α(i),θ) (10) = (α(i+1))⊤R(θ,γ,α(i))α(i+1)+(α(i+1))⊤r(θ,γ,α(i))+l(θ,γ,α(i)),

where plays the role of a penalty matrix, is the penalty vector and is a constant not depending on the current spline coefficients but only on the previous ones (see Appendix A for the explicit definition of and ). In contrast to the inner optimization proposed in Ramsay et al. (2007), the substitution of (10) in (7) makes the optimization of the criterion with respect to the spline coefficients a standard least squares problem that does not require any sophisticated mathematical analysis tool. Indeed, Eq. (7) becomes a quadratic form in which optimization does not require the implicit function theorem anymore.
Similarly to Ramsay et al. (2007) and Frasso et al. (2013), the estimation process for the ODE parameters and the spline coefficients implies the iteration of two profiling steps. First, given the last available estimates and for given vectors of ODE parameters , precisions of measurements and ODE-compliance parameters , the spline coefficients are updated by maximizing the approximated criterion:

 α(i+1) = argmaxαJ(α(i+1)|θ(i),τ(i),γ(i),α(i),y) (11) = (G(i)+R(θ(i),γ(i),α(i)))−1(g(i)−r(θ(i),γ(i),α(i))),

where is a block diagonal matrix with if the -th state function is observed and the null -matrix otherwise and with if the -th state function is observed or null (-vector) otherwise. Here, is the -vector of all observations for the -the state function and is a B-spline matrix of dimension . The values of and are updated by optimizing given . In the rest of our discussion the iteration indexes for the , and vectors will be omitted.
Thanks to the quasilinearization, the selection of the compliance parameters could be performed using standard methods such as AIC, BIC or (generalized) cross-validation. Instead, we advocate to use the EM-type algorithm proposed by Schall (1991) which avoids the computation of a selection criterion over a grid of possible values of each by estimating them iteratively as the ratio of the estimated error variances and the current variance of the penalty. The standard errors of the ODE parameters can be approximated from the Hessian matrix based on criterion at convergence. Alternatively, if a quantification including all the sources of uncertainty is desired, we advise to consider the estimation problem from a Bayesian perspective.
The following algorithm summarizes the proposed estimation strategy: {algorithm} \KwDataSet of (noisy) measurements driven by a dynamic system. \KwResultEstimation of optimal , , and via QL-ODE-P-splines. Initialization: Smooth the data using standard P-splines to estimate the initial spline coefficients . Set the initial , and parameters   \WhileConvergence is not achieved (a) Given the current and linearize the ODE based penalties (see Web Appendix C)

(b) Compute as Eq. (11) using the linearized penalties and compute

(c) Update (the effective dimension is computed as and following Hastie and Tibshirani (1990))

(d) Compute and ODE compliance parameters

(e) Given the current values of the other unknowns, update the ODE parameters by minimizing Eq. (8)

(f) Compute the convergence criterion in Eq. (12). Quasilinearized ODE-P-spline estimation algorithm. The following convergence criterion was used throughout:

 [max(∣∣∣α(i+1)−α(i)α(i)∣∣∣),max(∣∣∣γ(i+1)−γ(i)γ(i)∣∣∣),max(∣∣∣θ(i+1)−θ(i)θ(i)∣∣∣),max(∣∣∣τ(i+1)−τ(i)τ(i)∣∣∣)]<10−4. (12)

The initial values for the ODE parameters and the precision of measurements can be either chosen by the analyst on the basis of prior knowledge or estimated from the raw measurements. In the latter case different alternatives are possible. One possibility is to invert the ODE system using the spline coefficients estimated by P-splines. Alternatively, the estimates of a NLS analysis (based on the numerical solution of the ODE) can be used. The ODE compliance parameters can be set at large values (as in Ramsay et al. (2007)) if the interest is exclusively focused on ODE parameter estimation. Then, step (d) in Algorithm 3.1 can be skipped.
The proposed procedure requires an initial vector of spline coefficients: we recommend to use a standard P-spline smoother (Eilers and Marx, 1996) to start with reasonable initial coefficients . The ODE model combined with the initial ODE parameters () and the initial spline coefficients associated to the observed state functions can be used to deduce initial spline coefficients for the unobserved state functions. On the other hand, according to our experience, the quality of the final estimates is not dramatically influenced by the choice of .
This issue is illustrated in the analysis of the results (Table 1) of a simulation study. One hundred data sets have been generated by adding a zero mean Gaussian noise to the solutions of the ODE problems presented in Section 1. These data have been analyzed using QL-ODE-P-splines with initial ODE parameters set either to their true values , to or to . The initial spline coefficients have been obtained: by smoothing of the raw measurements with standard P-splines (Strategy 1), or by assuming constant state functions (Strategy 2). The smoothers have been built on cubic B-splines defined over 20 equidistant internal knots to approximate the state function in the first order ODE model (Example 1) and using fourth order spline functions computed on 150 equidistant internal knots for the simulation of the Van der Pol system (Example 2). The respective sample sizes are and .
Figure 1 shows possible estimation steps for Example 1 starting with a constant state function and wrong initial ODE parameters. These plots clearly show the impact of the bad initial guesses (upper left panel). At the second estimation step, the ODE-compliance parameter becomes small giving more weight to the goodness-of-fit term in Eq. (7). Then, the (fitted) red curve tends to overfit the data with estimated ODE parameters different from the true ones (). As the number of iterations increases, the ODE compliance parameter also increases, yielding more weight to the ODE-based penalty in Eq. (7) and ensuring estimated state functions and ODE parameters closer and closer to their true values. Figures 2 shows similar results for Example 2 (based on the Van der Pol system). It is remarkable that, even with initial spline coefficients chosen without taking into account the ODE model and/or the observations, we properly approximate both the observed () and unobserved () state functions. It suggests that results provided by Algorithm 3.1 are robust to the choice of initial conditions.
Table 1 shows the (average) root mean squared error w.r.t. the data () and the (average) RMSE w.r.t. the state function (): the influence of the initial guesses on the goodness of the final fit appears negligible. Analogously, the compliance of the approximated state functions to the simulated ones does not seem affected by the specification of the initial settings. On average, the choice of the initial values for the spline coefficients seems to have a limited impact on the number of iterations required to achieve converge ().

### 3.2 Including state conditions in the estimation process

A (system of) differential equation(s) can have a unique solution if condition(s) are imposed on the value(s) of the state function(s) and/or of one of its derivatives at a given time point(s). Initial and boundary value conditions are usually distinguished. In what follows, for the sake of brevity, we generically refer to them as “state conditions”.
Often state conditions arise as characteristic of the observed dynamics. Such information can easily be included in the estimation process with a linear differential model (Frasso et al., 2013). The same is true after quasilinearization of a nonlinear system. This is an extra advantage over exiting approaches in the literature (Ramsay et al., 2007; Cao and Zhao, 2008; Poyton et al., 2006).
Consider the general differential problem given by:

 ⎧⎪⎨⎪⎩dxdt(t)=f(t,x,θ),x(t0)=st0 % for t0∈T0,

where denotes the subset of observed times where is forced to be equal to . Using the B-spline approximation, these conditions become:

 Sα(i+1)=st0, (13)

where is a matrix where each of the rows contains the B-splines in the basis function evaluated at a given . These state conditions can be included in the estimation process using two strategies.
The first one consists in treating them as an extra least squares penalty. Then, the optimal spline coefficient can be obtained by maximizing the following modified fitting criterion:

 Missing or unrecognized delimiter for \right (14)

where is set at an arbitrary large value. Note that with such an extra penalty, the state condition is imposed only in a least square sense. With the fitting criterion in Eq. (14), the optimal spline coefficients at iteration become:

 α(i+1) = argmaxαJ(α(i+1)|θ,τ,γ,α(i)y) = (G+R(θ,γ,α(i))+κS⊤S)−1(g−r(θ,γ,α(i))+κS⊤st0).

If is equal to zero, then one gets back the estimation procedure discussed before, while for tending to infinity, the state functions are forced to meet the imposed conditions. This parameter can be fixed to a large value (, say) if the user is confident in the validity of the conditions or optimized using standard criteria (such AIC, BIC, cross-validation).
The second approach consists in forcing the smooth estimates of the state function to check the prescribed conditions by the introduction of Lagrange multipliers. Starting from Eq. (7) the Lagrange function for the constrained optimization problem is:

 Missing or unrecognized delimiter for \right

where is the vector of Lagrange multipliers. Following Currie (2013), the maximization of can be obtained by solving:

 (G+R(θ,γ,α(i))S⊤S0)(α(i+1)ω)=(g−r(θ,γ,α(i))st0).

Whatever the selected approach to force the state conditions, the estimates of the ODE parameters are updated using Eq. (8).

## 4 Practical implementation and simulation

In this section the results of simulation studies based on the two examples presented in Section 1 are reported to illustrate the properties of the proposed approach. It is first shown how these ODEs can be linearized to define the steps in the QL-ODE-P-spline estimation algorithm.
The ODE given in Eq. (2) can be linearized at iteration as follows:

 d˜x(i+1)dt(t)=(θ1−2θ2t)[˜x(i+1)(t)]2≈2(θ1−2θ2t)˜x(i)(t)˜x(i+1)(t)−(θ1−2θ2t)[˜x(i)(t)]2.

In the Van der Pol system, only the first differential equation has a nonlinear term. It can be linearized using:

 d˜x(i+1)1dt(t)=θ(˜x(i+1)1(t)−13[˜x(i+1)1(t)]3−˜x(i+1)2(t))≈θ(1−[˜x(i)1(t)]2)˜x(i+1)1(t)−θ˜x(i+1)2(t)+2θ3[˜x(i)1(t)]3.

These approximations are then used to build the penalty terms in Eqs. (5) and (6). The spline coefficients and the other parameters are jointly updated with the ODE-compliance parameters (using the EM-type approach by Schall (1991)). The QL-ODE-P-spline estimation steps are summarized in Algorithm 3.1.

### 4.1 Evaluation of estimation accuracy

An intensive simulation study is presented here to study the properties of the proposed methods. We have generated a large number of datasets using the ODE models in Eqs. (2) and (4) by perturbing the analytic or numerical solutions of the differential models by a zero mean Gaussian noise.
We aim to assess the quality of the estimates provided by the QL-ODE-P-spline approach for known or unknown initial values for the state functions. These constraints were imposed during QL-ODE-P-spline estimation using Lagrange multipliers (see Section 3.2).

#### First order ODE simulation results

In this section, the results of simulations based on Eq. (2) are discussed. The (analytic) state function has been computed with and . Gaussian noise with mean zero and variance has been added to values of the state function evaluated at time points uniformly sampled in . The data generation process has been repeated 500 times under four sample sizes .
Table 2 summarizes the estimation performances when is unknown and provides the bias (in percentage), the averaged standard errors, the relative mean squared error and standard deviation of and . In all the simulation settings we used cubic B-splines defined on 20 equidistant internal knots. The quality of the QL-ODE-P-spline estimates are compared with those obtained by estimating from Eq. (3) using nonlinear least squares.
The table clearly shows that the NLS and the quasilinearized ODE-P-spline approaches provide good estimates of the unknown ODE parameters. The biases for all the parameter estimates are small and similar for the two procedures whatever the sample size. Similar results are obtained with the two methods for the estimation of the standard deviations and the root mean squared errors of the parameter estimators.

From Table 3 one can evaluate the impact of the initial value conditions on the estimation performances. These constraints reduce the estimation biases of the two procedures whatever the sample size. Also in this case the standard deviations and the root mean squared errors of the parameter estimators appear really similar for the two alternative procedures.

#### Van der Pol system simulation results

A simulation study was also performed with the Van der Pol system in Eq. (4). Five hundreds datasets of size have been obtained adding a Gaussian noise component with null mean and variance to uniformly sampled values of the numerical solution to Eq. (4). The solution has been approximated using a Runge-Kutta scheme for values of in and considering initial value conditions for . Table 4 compares constrained and unconstrained QL-ODE-P-spline methods in terms of estimation performances. These results have been obtained using fourth order B-splines defined on 150 equidistant internal knots.

The biases and the root mean squared errors of the estimates of the ODE parameters and of the state function at are fairly small and tend to decrease with sample size. The estimation performance indicators for the constrained procedure are shown in the lower part of the table. It appears that the introduction of the initial conditions ensures, on average, more precise and robust estimates.

## 5 Two real(istic) examples

In this section the QL-ODE-P-spline procedure is illustrated by analyzing two real(isitc) extra examples. As first problem we consider a simulated experiment inspired by the heartbeat phenomenological model proposed by dos Santos et al. (2004). In the second example, (a subset of) the well known Canadian lynx-hare abundance data (Elton and Nicholson, 1942; Holling, 1959) are analyzed by estimating the parameter and the solution of a Lotka-Volterra system of equations via QL-ODE-P-splines.

### 5.1 A realistic model for the analysis of heartbeat signals

The normal cardiac rhythm depends on the aggregate of cells in the right atrium defining the sino-atrial (SA) node which constitutes the normal pacemaker. The SA node generates electrical impulses that spread to the ventricules through the atrial musculature and conducting tissues, the AV node. The idea of treating the heart system using differential models dates back to the work by Van der Pol and Van der Mark (1928). In their pioneering paper the authors proposed to model the SA/AV system by a coupled set of electronic systems exhibiting relaxation oscillations. The Van der Pol and Van der Mark (1928) model represents a convenient description of the hearth dynamics due to its parametric simplicity and ability to cover complex periodicity. Successively, other systems have been proposed generalizing this original framework in order to better synthesize the biological complexity of the phenomenon under consideration (see Formaggia et al. (2010) for more details).
Among all the alternatives proposed in the specialized literature, we adopt the following system of coupled oscillators (dos Santos et al., 2004):

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩dx1dt(t)=x2(t)dx2dt(t)=κ(x1(t)−w1)(x1(t)−w2)x2(t)−b1x1(t)+c1(x3(t)−x1(t))dx3dt(t)=x4(t)dx4dt(t)=κ(x3(t)−w1)(x3(t)−w2)x4(t)−b2x3(t)+c2(x1(t)−x3(t))

where and describe the electrical activity in the SA (sino-atrial) and AV (atrio-ventricular) nodes respectively. Parameters and determine the normal mode frequencies and characterize the non-linearity of the system. A central role is played by parameters determining the kind of diffusivity described by the coupled oscillators. For the SA node undergoes periodic regular oscillations, while for non-null the system describes a bidirectional asymmetric coupling.
The right panels in Figure 3 depict the simulated measurements (dots) and the numerical solution of the system (dashed lines) together with the state function approximated via the QL-ODE-P-spline method (black full lines). The data have been obtained by perturbing uniformly sampled values of the first derivatives of the numerical solutions of the system with a zero mean Gaussian noise. The system has been solved numerically for taking . Table 5 shows the estimated parameters and their standard errors. These results have been obtained using fourth order B-splines built on 250 equidistant internal knots. From Figure 3 and Table 5 one can appreciate the quality of the performances achieved by the QL-ODE-P-spline procedure.

### 5.2 Canadian lynx and snowshoe hare life cycles

In this section, we analyze the well known Canadian lynx and snowshoe hare data (Elton and Nicholson, 1942; Holling, 1959). The lynx and hare abundances have been collected from historical pelt-trading records of the Hudson Bay Company in a period between 1862 and 1930. Here, we consider a subset of the original sample analyzing the abundances observed between 1900 and 1920 (see figure 4). For these data the standard Lotka-Volterra system of ODEs can be estimated to obtain a meaningful description the life cycles of the two populations:

 ⎧⎪ ⎪ ⎪⎨⎪ ⎪ ⎪⎩dx1dt(t)=x1(t)[β−ζx2(t)]dx2dt(t)=−x2(t)[δ−ηx1(t)]

where indicate the prey (hare) and the predator (lynx) abundances respectively, and are the intrinsic growth rates of the prey and predator populations and are the predator-pray meeting and the predator death rates respectively.
The parameter estimates obtained with the quasilinearized ODE-P-splines are given in Table 6. These results were computed using cubic B-splines defined on a set of 200 internal knots. The initial values for the ODE and precision parameters, and , were obtained using nonlinear least squares with and approximated using standard P-splines (Eilers and Marx, 1996). Figure 4 shows the approximated state functions comparing them with the raw measurements and the numerical solutions of the ODE system (black dashed lines). The numerical solutions have been computed plugging the ODE parameters and initial values estimated via QL-ODE-P-spline in the system and solving the ODEs through a Runge-Kutta scheme.
The unknown parameters and state functions have been estimated taking two initial values for the ODE-compliance parameter: one indicating (relatively) low initial compliance to the ODE model () and one indicating (relatively) high initial compliance to the ODE system (). In the first case the final smoother overfits the data by giving a low weight to the differential model (first column of Figure 4). Then, the (numerical) solutions obtained by plugging the estimated in the system of ODEs are not in agreement with the observed data. On the contrary, starting with a large compliance parameter, a smoother consistent with the ODE model is estimated (second column of Figure 4): the approximated state functions (solid gray lines) and the numerical solutions (black dashed lines) overlap. In our opinion, these estimates guarantee a good description of the data even if the approximated state functions do not catch the trend observed in the years between 1905 and 1910. This miss-fitting effect could be a consequence of the poor flexibility of the Lotka-Volterra model. On the other hand we consider it a small price to be payed to preserve model parsimony and interpretability.
This fitting issue could also be related to the data collection process. Indeed, data before 1903 represent fur trading records while after 1903 they have been derived from questionnaires (Zhang et al., 2007). This might have influenced the observations in the years between the two peaks. Finally, according to our experience, we advise to start the estimation process by taking a reasonably large parameter if the research interest is focused on the estimation of the ODE parameters and the approximation of the associated state functions.

## 6 Discussion

In this paper we have introduced the quasilinearized (QL-) ODE-P-spline approach as a tool for the estimation of the parameters defining (systems of) nonlinear ODEs and for the approximation of their state functions starting from a set of (noisy) measurements. This methodology is inspired by the generalized profiling framework of Ramsay et al. (2007). The estimation procedure exploits a penalized likelihood formulation to build a compromise between a B-spline based description of the observations and the B-spline collocation solution of the ODE. The estimation process alternates two steps: 1) estimation of the optimal spline coefficients given the other unknowns; 2) estimation of the ODE parameters treating the spline coefficients as nuisance parameters. The compromise between data smoothing and compliance to the differential model is tuned by an ODE-compliance parameter to be selected in an upper optimization level. Dealing with nonlinear ODEs, the application of the original proposal of Ramsay et al. (2007) is involved requiring a nonlinear least squares step and the use of the implicit function theorem both for point and interval estimates. This is a consequence of the implicit relationship between the spline coefficients and the other unknowns.
The introduction of a quasilinearization step overcomes these hitches leading to a valuable simplification of the estimation procedure. The quasilinearization of the ODE-based penalty term (Bellman and Kalaba, 1965) permits to analytically link the spline coefficients to the differential parameters and makes possible the selection of the ODE-compliance parameters using standard approaches. Indeed, within our quasilinearized spline based framework, the estimation process reduces to a conditionally linear problem for the optimization of the spline coefficients. We suggest to view the ODE-compliance parameter as the ratio of the noise and of the penalty variances and to estimate it using the EM-type approach proposed by Schall (1991). The quasilinearized framework also facilitates the introduction of state (initial or boundary value) conditions. As discussed in Section 3.2, these constraints can be imposed using either extra penalties or Lagrange multipliers.
According to Algorithm 3.1, the quasilinearized estimation process requires initial values for the spline coefficients. Good initial spline coefficients can be obtained smoothing the raw data with standard P-splines (Eilers and Marx, 1996) even if, according to our experience, the choice of hardly influences the quality of the final estimates (see Section 3).
The performances of the proposed method have been evaluated through intensive simulations (Section 4). Two simulation studies have been conducted by generating data from a simple first order ODE in the first case and by considering a relaxation oscillator governed by the Van der Pol equation in the second case. The first ODE model, having an explicit solution, made possible a precise comparison between the estimates obtained via QL-ODE-P-splines and by a nonlinear least squares inversion of the analytic solution (Hotelling, 1927; Biegler et al., 1986). The simulation results confirm that our approach ensures performances comparable with those obtained via NLS for known or unknown state conditions. The high quality of the QL-ODE-P-spline estimates can also be appreciated by looking at the simulation results obtained in the framework of the more complex Van der Pol differential problem.
In Section 5, the quasilinearized spline based methodology has been applied in two real(istic) data analyzes. In the first example we have analyzed a set of 150 synthetic SA/AV voltage measurements generated perturbing with a Gaussian random noise the first derivatives of the (numerical) solutions of an unforced system of coupled Van der Pol equations (dos Santos et al., 2004). In the second example we have analyzed the (yearly) lynx-hare abundances (MacLulich, 1937; Elton and Nicholson, 1942) recorded in Canada between 1900 and 1920. In the latter case, we used a QL-ODE-P-spline approach with a penalty based on a Lotka-Volterra system. The estimation has been conducted by taking either a small or a large initial value for the ODE-compliance parameter. Starting with a large initial compliance parameter () ensured final estimates for the state functions and ODE parameters consistent with the hypothesized model, while a low initial compliance parameter () led to overfitting state functions corresponding to inappropriate estimates for the ODE parameters. This is probably due to the inability of the ODE model to describe the measurement trend in the period 1905-1910. A small value for the ODE compliance parameter provides a better fit but makes the interpretation of the final ODE parameter estimates awkward.
Our future research will focus on possible extensions of the QL-ODE-P-spline framework. First of all, we think that the linearized ODE penalty and the profiling estimation settings can be adapted to hierarchical settings with random ODE parameters. Second, the presented approach can be also adapted to analyze (noisy) realizations of dynamic systems evolving in space and time and described by nonlinear partial differential equations (PDEs). Finally, the introduction of a quasilinearization step should facilitate the generalization of the approach proposed by Jaeger and Lambert (2013) to analyze nonlinear differential systems in a Bayesian framework.

## Acknowledgements

The authors acknowledge financial support from IAP research network P7/06 of the Belgian Government (Belgian Science Policy), and from the contract ‘Projet d’Actions de Recherche Concertées’ (ARC) 11/16-039 of the ‘Communauté française de Belgique’, granted by the ‘Académie universitaire Louvain’.

## Appendix A Construction of the quasilinearized penalty matrix in a general system of nonlinear ODEs

In this appendix we define the matrix , the vector and the constant involved in the linearized ODE-penalty. Consider a general system of ODEs:

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩dx1dt(t)−f1(t,x,θ)=0 \ ⋮dxddt(t)−fd(t,x,θ)=0 (A.1)

where is a known function of the state variables and of the unknown ODE parameters that we consider fixed over time. With quasilinearization, given the estimate at step , Eq. (A.1) suggests to require that checks:

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩d˜x(i+1)1dt(t)−d∑k=1˜x(i+1)k∂f1∂xk(t,˜x(i),θ)−f1(t,˜x(i),θ)+d∑k=1˜x(i)k∂f1∂xk(t,˜x(i),θ)=0   \ ⋮d˜x(i+1)ddt(t)−d∑k=1˜x(i+1)k∂fd∂xk(t,˜x(i),θ)−fd(t,˜x(i),θ)+d∑k=1˜x(i)k∂fd∂xk(t,˜x(i),θ)=0

If denotes the vector of spline coefficients to be estimated in the th state function, the updated th penalty term at iteration is given by:

 PENj(α(i+1)j|α(i)j) ≈ ∫⎛⎜⎝d˜x(i+1)jdt(t)−fj(t,˜x(i),θ) − d∑k=1(˜x(i+1)k−˜x(i)k)∂fj∂xk(t,˜x(i),θ))2dt, ≈ ∫([α(i+1)j]⊤b(1)j(t) − d∑k=1[α(i+1)k]⊤b(0)j(t)∂fi∂xk(t,˜x(i),θ)−vj(t))2dt

where the integration is performed over , and , are, respectively, -dimensional vectors of B-spline functions and their derivative such that and . The th element of the penalty can be computed as:

 Unknown environment '% (A.2)

with

 cjkl=∫∂fj∂xk(t,˜x(i),θ)∂fj∂xl(t,˜x(i),θ)b(0)k(t)[b(0)l(t)]⊤dt.

The overall penalty term , at step is therefore equal to:

 PEN(i+1)=[α(i+1)]⊤R(θ,γ,α(i))α(i+1)+2[α(i+1)]⊤r(θ,γ,α(i))+l(θ,γ,α(i)).

The matrix has dimension , is symmetric and block-diagonal. The block has dimension and is computed as:

 Missing or unrecognized delimiter for \left −γl∫∂fj∂xl(t,˜x(i),θ)b(0)l(t)[b(1)k(t)]⊤dt+d∑i=1γicikl,

where and if and zero otherwise. The vector has length with th component given by . Finally the constant in the penalty is equal to .

### References

1. Bellman, R. and Kalaba, R. (1965). Quasilinearization and nonlinear boundary-value problems. Modern analytic and computational methods in science and mathematics. American Elsevier Pub. Co.
2. Biegler, L., Damiano, J., and Blau, G. (1986). Nonlinear parameter estimation: A case study comparison. AIChE Journal, 32(1):29–45.
3. Cao, J. and Zhao, H. (2008). Estimating dynamic models for gene regulation networks. Bioinformatics, 24:1619–1624.
4. Currie, I. D. (2013). Smoothing constrained generalized linear models with an application to the Lee-Carter model. Statistical Modelling, 13(1):69–93.
5. dos Santos, A. M., Lopes, S. R., and Viana, R. L. (2004). Rhythm synchronization and chaotic modulation of coupled van der pol oscillators in a model for the heartbeat. Physica A: Statistical Mechanics and its Applications, 338(3â4):335 – 355.
6. Eilers, P. and Marx, B. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11:89–121.
7. Elton, C. and Nicholson, M. (1942). The ten-year cycle in numbers of the lynx in canada. Journal of Animal Ecology, 11(2):pp. 215–244.
8. Formaggia, L., Quarteroni, A., and Veneziani, A. (2010). Cardiovascular Mathematics: Modeling and simulation of the circulatory system. MS&A. Springer.
9. Frasso, G., Jaeger, J., and Lambert, P. (2013). Estimation and approximation in multidimensional dynamics. Technical report, Université de Liège.
10. Hastie, T. J. and Tibshirani, R. J. (1990). Generalized additive models. London: Chapman & Hall.
11. Holling, C. S. (1959). The components of predation as revealed by a study of small mammal predation predation of the european pine sawfly. Canadian Entomologist, 91:293–320.
12. Hotelling, H. (1927). Differential equations subject to error, and population estimates. Journal of the American Statistical Association, 22(159):283–314.
13. Jaeger, J. and Lambert, P. (2013). Bayesian P-spline estimation in hierarchical models specified by systems of affine differential equations. Statistical Modelling, 13(1):3–40.
14. MacLulich, D. A. (1937). Fluctuations in the numbers of varying hare (Lepus americanus ). University of Toronto studies: Biological series. The University of Toronto press.
15. Poyton, A., Varziri, M., McAuley, K., McLellan, P., and Ramsay, J. (2006). Parameter estimation in continuous-time dynamic models using principal differential analysis. Computers & Chemical Engineering, 30(4):698–708.
16. Ramsay, J. O., Hooker, G., Campbell, D., and Cao, J. (2007). Parameter estimation for differential equations: a generalized smoothing approach. Journal of the Royal Statistical Society, Series B, 69:741–796.
17. Schall, R. (1991). Estimation in generalized linear models with random effects. Biometrika, 78(4):719–727.
18. Van der Pol, B. (1926). On relaxation-oscillations. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 7(2):978–992.
19. Van der Pol, B. and Van der Mark, J. (1928). The heartbeat considered as a relaxation oscillation, and an electrical model of the heart. Philosophical Magazine Series 7, 6(38):763–775.
20. Zhang, Z., Tao, Y., and Li, A. Z. (2007). Factors affecting hare-lynx dynamics in the classic time series of the Hudson Bay Company, Canada. Climate Research, 34:83–89.
You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minumum 40 characters