An asymptotically optimal indirect approach to continuoustime system identification
Abstract
The indirect approach to continuoustime system identification consists in estimating continuoustime models by first determining an appropriate discretetime model. For a zeroorder hold sampling mechanism, this approach usually leads to a transfer function estimate with relative degree 1, independent of the relative degree of the strictly proper real system. In this paper, a refinement of these methods is developed. Inspired by indirect PEM, we propose a method that enforces a fixed relative degree in the continuoustime transfer function estimate, and show that the resulting estimator is consistent and asymptotically efficient. Extensive numerical simulations are put forward to show the performance of this estimator when contrasted with other indirect and direct methods for continuoustime system identification.
Index terms—ystem identification; Continuoustime systems; Parameter estimation; Sampled data.
I Introduction
System identification deals with the problem of estimating adequate models of dynamical systems from inputoutput data. The methods developed over the years in this field have seen applications in many areas of science and engineering, and comprehensive literature has been written on the subject [1, 2, 3].
When postulating a mathematical model for describing a dynamical system based on sampled data, one must decide between obtaining a discretetime (DT), or a continuoustime (CT) model. In a predominantly digital era, DT system identification has been studied thoroughly (see, e.g., [1, 2]). Nevertheless, interest in CT models still persists due to its advantages over discretetime. For example, greybox modelling [4], which is commonly based on physical principles and conservation laws, is naturally suited for continuous time, as the parameters can usually be better interpreted in this domain. Also, CT models are known to have more intuitive dynamics, and they do not depend on a sampling period.
In CT system identification there are two main approaches, namely the direct and indirect approaches. For direct CT system identification, a CT model is obtained directly from the sampled data. The main difficulty present in the direct methods is the handling of derivatives, as they are not immediately available from discrete data points without amplifying noise [5]. To effectively deal with this issue, many well known methods have been proposed [6], with success in real applications [7]. On the other hand, indirect methods for CT modelling first determine a suitable DT model via DT system identification methods like the Prediction Error Methods (PEM) or Maximum Likelihood (ML) [2], and then transform this model into a CT equivalent model. Evidence has been shown regarding the advantages of direct over indirect CT model identification [8], although with a precise initialisation of PEM, the approaches seem comparable for certain sampling periods [9].
Even though the indirect approach seems easy to implement, as there is much theory and literature concerning DT system identification, there are reasons why this approach is not always recommended. First, it may suffer from numerical inaccuracies at fast sampling, and requires a precise initialisation. In addition, it is not possible to select the desired numerator order of the CT model, as the estimated DT model will generally lead to a CT model with relative degree 1 in the case of sampling by a zeroorder hold mechanism. Hence, an unnecessarily complex model structure is indirectly being estimated, which leads to a loss in accuracy according to the parsimony principle [1].
In this paper, we introduce a method that optimally imposes a desired relative degree in the indirect approach to continuoustime system identification. Based on Indirect PEM [10], we prove that the proposed estimator is a consistent and asymptotically efficient estimator of the system’s true parameter vector. Extensive numerical simulations show that the new method imposes the correct relative degree, while improving the statistical properties of the transfer function estimate, and achieves a performance that compares favourably against both standard direct and indirect approaches.
The remainder of this paper is organised as follows. In Section II the problem is formulated. Section III provides an introduction to the indirect approach for CT system identification. In Section IV we derive an estimator that optimally enforces the desired relative degree for the indirect approach, and determine its properties. Section V illustrates the method with extensive numerical examples. Finally, conclusions are drawn in Section VI.
Ii Problem formulation
Consider a linear timeinvariant, causal, stable, single input single output, CT system
(1) 
where is the Heavyside operator, i.e., , and is the relative degree of the system. In this paper, we denote as the true CT system parameter vector.
Suppose that the inputoutput signals are sampled with period and the resulting output is contaminated by an additive zeromean white noise sequence of variance . That is,
(2) 
The goal of CT system identification is to obtain a CT transfer function estimate for , given discrete inputoutput data measurements and knowledge about the physical characteristics of the system, or the intersample behaviour. In this paper, we assume that the input is a piecewise constant signal between samples (i.e., zeroorder hold behaviour [11]).
For obtaining a model of , a simple way to proceed is to identify the zeroorder hold equivalent model given the input and output data measurements by using standard PEM in the DT domain, and then return to the CT domain via zeroorder hold equivalences. Although this procedure has good statistical properties, it does not impose relative degree constraints in the CT domain. Our goal is to optimally impose this constraint, which should lead to a statistically improved estimate of .
Iii The indirect approach to continuoustime system identification
One approach to identifying a CT system is to first estimate the DT model given the input and output data samples, and then translate this model into continuous time. This is called the indirect approach, since it relies on DT system identification theory instead of obtaining immediately a CT model using CT system identification methods.
Much literature has been written regarding the first step of the indirect approach [1, 2]. The theoretically optimal solution is to apply the maximum likelihood method. This method is known to give consistent and asymptotically efficient estimates under very general conditions. Under the assumption that the additive white noise is Gaussian, the ML method is equivalent to PEM, which is one of the most celebrated parametric methods, and is available in the MATLAB System Identification Toolbox [12].
If a CT model for (1) is required, we should propose a DT model structure of the form
(3) 
If we define and denote as the model’s output, then the ML estimate is
The next step is to transform this DT transfer function into an adequate CT model. This can be done in several ways. For example, the wellknown Tustin transformation can be applied on the DT transfer function estimate by letting
as reported in [13]. If it is assumed that the CT input signal is piecewise constant, the most natural mapping (used in e.g. [14, 9]), is the zeroorder hold sampling equivalence
(4) 
where and denote the Z and Laplace transforms, respectively. This mapping is known to be a troublesome part of the indirect approach, as it can be ill conditioned and its uniqueness depends on a correct choice of the sampling period.
In both of these mappings, the resulting CT model can have numerator parameters that exceed the desired numerator orders. This is generally the case when the relative degree of is greater than 1. This problem contributes to poor accuracy and high standard deviations at high frequencies. One very simple way of treating this issue is setting to zero the numerator coefficients which should be zero, but this is not the best way of taking care of the information [9].
Iv Optimal enforcement of relative degree
In this section we develop an indirectapproach estimator for the CT parameter vector that renders a CT transfer function estimate of a desired relative degree . For this matter, we first focus on the PEM estimator of the zeroorder hold equivalent model of .
For simplicity, we assume that the correct model order has been found. A model with structure (3) is obtained by PEM, and the covariance matrix of is also estimated. We know that the CT zeroorder hold equivalent of this estimated model is in general given by
Define , and denote by the true DT parameter vector. The parameters in are related to by the zeroorder hold equivalence equations that can be derived by using (4) and comparing coefficients. This relation is a nonlinear mapping , which is differentiable almost everywhere. Hence, the following asymptotic relationship is valid for the covariance matrices of and :
(5) 
where is the Jacobian matrix of evaluated at the naive estimation of , that is, throwing away the high order coefficients of the numerator of which should be zero^{1}^{1}1Note that the standard PEM estimate could have also been used for this matter, and the asymptotic relation still holds..
Now, we propose to find an appropriate projection of into a proper subspace of the parameter space that yields the desired relative degree. This subspace is simply the one formed by all vectors with first elements set to zero. Hence, we decide to study the following problem:
(6)  
(7) 
where is the identity matrix of dimension , is the null matrix of appropriate dimensions, and . The optimisation problem in this context can be interpreted as an application of the Indirect PEM [10].
By Lagrange multiplier theory, the optimization problem in (6) is equivalent to calculating, for a suitable ,
Partitioning appropriately (dropping the subindex for simplicity), and differentiating with respect to we obtain
(8) 
Imposing (7) we obtain . If we denote by the Cholesky factorization matrix of [15] (i.e., a lower triangular matrix with positive diagonal entries such that ) we can write (8) as
(9) 
That is, the estimator (9) can be seen as an best approximation to the PEM CT estimate that imposes the desired relative degree.
Iva Properties
We briefly present the most important properties of estimator (9) in the following theorems.
Theorem IV.1
Consider the system described by (1) and (2), where is a Gaussian white noise sequence. Assume that the sampling frequency is larger than twice the largest imaginary part of the domain poles and there is no delay in the real system^{2}^{2}2These conditions can be relaxed, as long as the sampling frequency is such that the transformation is well defined.. Then, the estimator (9) is a consistent and asymptotically efficient estimator of the real vector parameter , provided the DT model set (with the chosen relative degree) contains the real system.
Under the Gaussian noise assumption, the DT PEM estimate can be interpreted as the ML estimate. Under the proposed sampling frequency, the transformation is unique for sufficiently large [16]. Hence, by the invariance principle of ML estimators [17], the CT equivalent of the system’s parameters is also an ML estimate.
To prove the theorem, we only require that the assumptions in [10] are satisfied in this scenario, and then directly apply the results obtained in the cited contribution. First, note that the model structure given by this procedure contains the models with the desired relative degree (and the contention is proper if ), with a linear mapping between parameter vectors given by the matrix
Furthermore, provided that the DT model set contains the real system, both structures give parameter identifiability. Also note that , obtained via (5), is a consistent estimate of the covariance matrix of . Hence, the results in [10] follow. Namely, the normalized estimation errors^{3}^{3}3For simplicity, we assume that the vector has the appropriate dimension, where zeros have been considered in the first terms if necessary. and are asymptotically normally distributed with zero means and their asymptotic covariance matrices satisfy the relation
(10) 
Moreover, following the steps in [10, Section 3], the improved PEM estimate is consistent and has the same asymptotic distribution as , thereby proving its asymptotic efficiency.
Remark IV.1
Note that by (9) and (10), the asymptotic covariances can be shown to satisfy the following properties:
Both of these claims follow by applying properties 10.5 and 10.6 from [18, Chapter 10] to this context. These properties imply that for sufficiently large , the proposed estimator can only decrease the covariance of the estimated parameters compared to standard PEM. The asymptotic covariances satisfy a Pythagorean relation, as the PEM estimate is projected orthogonally on to the proper subspace where lies.
Next, we establish that imposing a larger relative degree improves the accuracy of the estimates, provided that the highest relative degree model structure contains the real system.
Theorem IV.2
Given a plant of order and relative degree , consider CT candidate models of relative degree and and their improved PEM parameter vector estimates and respectively. If , then their asymptotic covariance matrices satisfy .
The proof follows by applying [19, Theorem 2]. Details are omitted due to length restrictions.
V Monte Carlo simulation studies
We will now study the performance of the proposed estimator under a series of experiments.
The system considered is the RaoGarnier system [8], which is a linear fourthorder nonminimum phase system with complex poles that has been tested in many publications (see e.g. [21, 22, 7]) of CT system identification:
This system is interesting since it has two damped oscillatory modes at 2 and 20 rad/sec with damping of and respectively, and has a nonminimum phase zero at . It is known that this is a particularly difficult system to estimate by PEM/ML, since these methods may converge to a local minimum if they are not well initialised [21].
Three methods have been compared: PEM, PEM with relative degree enforcement (labeled PEMrd (9)), and the simplified refined IV method for continuoustime systems (SRIVC), which is one of the most successful direct methods available, and has been suggested for general use in one of the most recent surveys on CT system identification [7]. Each method has been tested in different Monte Carlo simulations, and they have been evaluated according to the average normalized square error of the system estimate
the average normalized square error of the parameter vectors
and the fit measure
where is the noisefree output sequence (the simulated data without the additive measurement noise), is the simulated output sequence of the th estimated model, and is the average value of .
We have run PEM using the standard MATLAB System Identification Toolbox [12] with the oe command, and assumed that the correct order of the system is known. The search algorithm has been initialised with the estimate given by the Null Space Fitting method^{4}^{4}4PEM initialised with the estimate from SRIVC (as in [9]) has also been tested with similar results. [23]. We based PEMrd on the PEM estimate previously obtained. The required Jacobian matrix has been numerically calculated via finite differences, and the correct relative degree has been imposed for this estimator. We have used the command d2c of MATLAB in both cases. SRIVC has been implemented with the CONTSID toolbox for MATLAB [6, Chapter 9] with default initialisation, and has been set to estimate the model
Va Effect of the number of data points and sampling rate
In this study, we have designed the input as a pseudorandom binary sequence (PRBS) of amplitude switches between and . For the first input sequence, the number of stages of the shift register is and the data length of the shortest interval is . Hence, a sequence of data points has been obtained. The noise is a zeromean white Gaussian noise signal, where the variance has been set such that the signaltonoise ratio (SNR in dB) between the noiseless output sequence and the noise equals dB.
Three Monte Carlo studies have been performed with simulations of different noise realisations each one, for . The results are shown in Table I.
Method  Fit  

0.01  PEM  98.9742  
PEMrd  99.1219  
SRIVC  99.1217  
0.05  PEM  98.9645  
PEMrd  99.1141  
SRIVC  99.1146  
0.1  PEM  98.9436  
PEMrd  99.0884  
SRIVC  99.0870 
In order to analyse the performance of each estimator under less data, we have set the number of stages to and the data length of the shortest interval to , resulting in a input of data points. With the same SNR as the test above, the results for Monte Carlo simulations for each sampling period can be found in Table II.
Method  Fit  

0.01  PEM  97.7791  
PEMrd  98.0705  
SRIVC  98.0719  
0.05  PEM  97.7892  
PEMrd  98.1172  
SRIVC  98.1167  
0.1  PEM  97.7416  
PEMrd  98.0339  
SRIVC  97.9754 
Both Tables I and II show that the refined PEM estimator statistically improves the estimates given by PEM, and is a very competitive method against SRIVC, even for high frequency sampling. Note that under less data points, PEMrd still outperforms PEM for every sampling period, which indicates that the asymptotic properties studied in Section IV can be observed in practical finite data cases as well.
Remark V.1
In Tables I and II we have discarded cases where PEM has delivered estimates with one pole in the negative real axis. Fortunately this scenario is very uncommon (9 cases seen in 3000 simulations). A similar phenomenon was observed in Table II for in SRIVC, where 2 estimates gave negative fit values. These simulations were not considered either.
VB Multisine input
A different input signal has also been tested. We have taken 2000 data measurements of a multisine input given by the sum of sine waves of angular frequencies . The standard deviation of the additive noise has been set to 0.1, and the median of Monte Carlo simulations of the normalized model error, normalized parameter error, and fit have been obtained. The results are shown in Table III.
Method  Fit  

0.01  PEM  99.4319  
PEMrd  99.8173  
SRIVC  80.8446  
0.02  PEM  99.2684  
PEMrd  99.7325  
SRIVC  94.3144 
For this input and sampling method, SRIVC performs poorly, while PEM and PEMrd normally reach the global optimum. Even though the fit is very near the optimal, PEMrd is consistently better than standard PEM at all metrics.
VC Mean and covariance of the estimated parameters
As established in Theorem IV.1, the improved PEM estimate should reduce (at least asymptotically) the covariance of the parameter vector. To test this, we have obtained the mean and standard deviation of each parameter given by the Monte Carlo study under the setup of Section VA with and . The results are shown in Table IV. It can be observed that the mean values are similar in all methods (except standard PEM, which does not estimate the correct model structure), and PEMrd provides the lowest standard deviation for every parameter.
Method 










PEM 










PEMrd 










SRIVC 









VD Direct comparison with standard PEM
In this subsection we analyse the improvements of the novel method over the standard PEM estimates.
To show the impact of selecting and enforcing the correct relative degree, we have focused on the Bode Diagrams of the resulting models for PEM and PEMrd. In Figure 1 we have plotted the frequency response estimates of 100 Monte Carlo simulations under the setup in Section VA, for , .
It is clear by Fig. 1 that the improvement over PEM is mainly at high frequencies. This is intuitive, since the relative degree determines the asymptotic slope of the Bode diagram of magnitude. The proposed method enforces the true asymptotic slope, leading to an important gain in accuracy in both magnitude and phase.
Also, direct comparison plots between PEM and PEMrd are shown in Figure 2. These plots compare the normalized model and parameter error, and the fit of PEM and PEMrd for each Monte Carlo experiment (500 in total). PEMrd outperforms standard PEM in most Monte Carlo simulations, specially in the fit comparison, where 496 out of 500 experiments have lead to an increase in fit under PEM refinement.
VE Random systems
To obtain the average performance of each estimator, we have tested them on a data set created with 500 random systems of order 3 and relative degree 2, by using the rss command in MATLAB. The slowest pole has been set to have real part less than 0.1. The input was a unit variance Gaussian white noise, and the additive white noise was also Gaussian, of standard deviation equal to of the maximum value of the noiseless output. The sampling period has been chosen as 10 times faster than the fastest pole or zero of the real system.
We have computed the median of the metrics used above, as in Section VB, for the 500 random systems. We also have counted failures of the estimators, which are the cases when the estimates produced a negative fit, when the algorithm crashed, or when it was not possible to correctly initialise the PEM estimate having tried reducing by a factor of 2 the sampling rate for initialisation estimation.
The results can be seen in Table V. Although all estimators report failures, the PEMrd again shows promising results in all metrics. As pointed out in [9] and seen in these tests, initialisation aspects are in fact a major issue concerning the reliability of these algorithms.
Method  Fit  N Failures  

PEM  98.871  12  
PEMrd  98.997  16  
SRIVC  98.9723  30 
Vi Conclusions
We have proposed a refinement to the standard PEM estimator for indirect continuoustime system identification that achieves an asymptotically optimal desired relative degree enforcement. An explicit expression for this estimator has been found, and its statistical properties have been analysed. Extensive simulations using both standard benchmarks and random systems have been put forward with promising results. These have shown that the refinement to standard PEM leads to an important improvement in all the statistical metrics studied, and its performance is comparable, if not better, to SRIVC for all sampling periods in this study provided PEM is initialised correctly.
References
 [1] T. Söderström, P. Stoica, System Identification, PrenticeHall, 1988.
 [2] L. Ljung, System Identification: Theory for the User, 2nd edition, PrenticeHall, 1999.
 [3] R. Pintelon, J. Schoukens, System Identification: A Frequency Domain Approach, John Wiley & Sons, 2012.
 [4] T. P. Bohlin, Practical Greybox Process Identification: Theory and Applications, Springer, 2006.
 [5] G. Rao, H. Unbehauen, Identification of continuoustime systems, IEE ProceedingsControl theory and applications 153 (2) (2006) 185–220.
 [6] H. Garnier, L. Wang, Identification of Continuoustime Models from Sampled Data, Springer, 2008.
 [7] H. Garnier, Direct continuoustime approaches to system identification. Overview and benefits for practical applications, European Journal of control 24 (2015) 50–62.
 [8] G. Rao, H. Garnier, Numerical illustrations of the relevance of direct continuoustime model identification, IFAC Proceedings Volumes 35 (1) (2002) 133–138.
 [9] L. Ljung, Experiments with identification of continuous time models, IFAC Proceedings Volumes 42 (10) (2009) 1175–1180.
 [10] T. Söderström, P. Stoica, B. Friedlander, An Indirect Prediction Error Method for System Identification, Automatica 27 (1) (1991) 183–188.
 [11] K. J. Åström, B. Wittenmark, Computer Controlled Systems: Theory and Design, PrenticeHall, 1984.
 [12] L. Ljung, System Identification Toolbox 7: Getting Started Guide.
 [13] H. Unbehauen, G. Rao, Continuoustime Approaches to System Identification â A Survey, Automatica 26 (1) (1990) 23–35.
 [14] N. K. Sinha, Identification of continuoustime systems from samples of inputoutput data: An introduction, Sadhana 25 (2) (2000) 75–83.
 [15] R. A. Horn, C. R. Johnson, Matrix Analysis, 2nd Edition, Cambridge University Press, 2012.
 [16] I. Kollar, G. Franklin, R. Pintelon, On the Equivalence of zdomain and sdomain Models in System Identification, in: Instrumentation and Measurement Technology Conference, IMTC96, Vol. 1, IEEE, 1996, pp. 14–19.
 [17] P. W. Zehna, et al., Invariance of maximum likelihood estimators, Annals of Mathematical Statistics 37 (3) (1966) 744.
 [18] C. Gourieroux, A. Monfort, Statistics and econometric models, Vol. 1, Cambridge University Press, 1995.
 [19] R. A. González, P. E. Valenzuela, C. R. Rojas, R. A. Rojas, Optimal enforcement of causality in nonparametric transfer function estimation, IEEE Control Systems Letters 1 (2) (2017) 268–273.
 [20] P. Young, Recursive estimation, forecasting and adaptive control, Control and Dynamic systems 30 (Part 3) (1989) 119–165.
 [21] L. Ljung, Initialisation aspects for subspace and OutputError identification methods, in: European Control Conference (ECC), 2003, 2003, pp. 773–778.
 [22] J. S. Welsh, J. C. Agüero, M. Alamir, Continuoustime System Identification using Indirect Inference, IFAC Proceedings Volumes 42 (10) (2009) 1169–1174.
 [23] M. Galrinho, C. R. Rojas, H. Hjalmarsson, The weighted nullspace fitting method, arXiv preprint arXiv:1708.03946.