An asymptotically optimal indirect approach to continuous-time system identification

An asymptotically optimal indirect approach to continuous-time system identification

Rodrigo A. González, Cristian R. Rojas and James S. Welsh This work was supported by the Swedish Research Council under contract number 2016-06079 (NewLEADS).R.A. González and C.R. Rojas are with the Department of Automatic Control and ACCESS Linnaeus Centre, KTH Royal Institute of Technology, 10044 Stockholm, Sweden (e-mails: grodrigo@kth.se, crro@kth.se).James S. Welsh is with the School of Electrical Engineering and Computer Science, University of Newcastle, Australia (e-mail: james.welsh@newcastle.edu.au).
Abstract

The indirect approach to continuous-time system identification consists in estimating continuous-time models by first determining an appropriate discrete-time model. For a zero-order hold sampling mechanism, this approach usually leads to a transfer function estimate with relative degree 1, independent of the relative degree of the strictly proper real system. In this paper, a refinement of these methods is developed. Inspired by indirect PEM, we propose a method that enforces a fixed relative degree in the continuous-time transfer function estimate, and show that the resulting estimator is consistent and asymptotically efficient. Extensive numerical simulations are put forward to show the performance of this estimator when contrasted with other indirect and direct methods for continuous-time system identification.

Index terms—ystem identification; Continuous-time systems; Parameter estimation; Sampled data.

I Introduction

System identification deals with the problem of estimating adequate models of dynamical systems from input-output data. The methods developed over the years in this field have seen applications in many areas of science and engineering, and comprehensive literature has been written on the subject [1, 2, 3].

When postulating a mathematical model for describing a dynamical system based on sampled data, one must decide between obtaining a discrete-time (DT), or a continuous-time (CT) model. In a predominantly digital era, DT system identification has been studied thoroughly (see, e.g., [1, 2]). Nevertheless, interest in CT models still persists due to its advantages over discrete-time. For example, grey-box modelling [4], which is commonly based on physical principles and conservation laws, is naturally suited for continuous time, as the parameters can usually be better interpreted in this domain. Also, CT models are known to have more intuitive dynamics, and they do not depend on a sampling period.

In CT system identification there are two main approaches, namely the direct and indirect approaches. For direct CT system identification, a CT model is obtained directly from the sampled data. The main difficulty present in the direct methods is the handling of derivatives, as they are not immediately available from discrete data points without amplifying noise [5]. To effectively deal with this issue, many well known methods have been proposed [6], with success in real applications [7]. On the other hand, indirect methods for CT modelling first determine a suitable DT model via DT system identification methods like the Prediction Error Methods (PEM) or Maximum Likelihood (ML) [2], and then transform this model into a CT equivalent model. Evidence has been shown regarding the advantages of direct over indirect CT model identification [8], although with a precise initialisation of PEM, the approaches seem comparable for certain sampling periods [9].

Even though the indirect approach seems easy to implement, as there is much theory and literature concerning DT system identification, there are reasons why this approach is not always recommended. First, it may suffer from numerical inaccuracies at fast sampling, and requires a precise initialisation. In addition, it is not possible to select the desired numerator order of the CT model, as the estimated DT model will generally lead to a CT model with relative degree 1 in the case of sampling by a zero-order hold mechanism. Hence, an unnecessarily complex model structure is indirectly being estimated, which leads to a loss in accuracy according to the parsimony principle [1].

In this paper, we introduce a method that optimally imposes a desired relative degree in the indirect approach to continuous-time system identification. Based on Indirect PEM [10], we prove that the proposed estimator is a consistent and asymptotically efficient estimator of the system’s true parameter vector. Extensive numerical simulations show that the new method imposes the correct relative degree, while improving the statistical properties of the transfer function estimate, and achieves a performance that compares favourably against both standard direct and indirect approaches.

The remainder of this paper is organised as follows. In Section II the problem is formulated. Section III provides an introduction to the indirect approach for CT system identification. In Section IV we derive an estimator that optimally enforces the desired relative degree for the indirect approach, and determine its properties. Section V illustrates the method with extensive numerical examples. Finally, conclusions are drawn in Section VI.

Ii Problem formulation

Consider a linear time-invariant, causal, stable, single input single output, CT system

(1)

where is the Heavyside operator, i.e., , and is the relative degree of the system. In this paper, we denote as the true CT system parameter vector.

Suppose that the input-output signals are sampled with period and the resulting output is contaminated by an additive zero-mean white noise sequence of variance . That is,

(2)

The goal of CT system identification is to obtain a CT transfer function estimate for , given discrete input-output data measurements and knowledge about the physical characteristics of the system, or the intersample behaviour. In this paper, we assume that the input is a piecewise constant signal between samples (i.e., zero-order hold behaviour [11]).

For obtaining a model of , a simple way to proceed is to identify the zero-order hold equivalent model given the input and output data measurements by using standard PEM in the DT domain, and then return to the CT domain via zero-order hold equivalences. Although this procedure has good statistical properties, it does not impose relative degree constraints in the CT domain. Our goal is to optimally impose this constraint, which should lead to a statistically improved estimate of .

Iii The indirect approach to continuous-time system identification

One approach to identifying a CT system is to first estimate the DT model given the input and output data samples, and then translate this model into continuous time. This is called the indirect approach, since it relies on DT system identification theory instead of obtaining immediately a CT model using CT system identification methods.

Much literature has been written regarding the first step of the indirect approach [1, 2]. The theoretically optimal solution is to apply the maximum likelihood method. This method is known to give consistent and asymptotically efficient estimates under very general conditions. Under the assumption that the additive white noise is Gaussian, the ML method is equivalent to PEM, which is one of the most celebrated parametric methods, and is available in the MATLAB System Identification Toolbox [12].

If a CT model for (1) is required, we should propose a DT model structure of the form

(3)

If we define and denote as the model’s output, then the ML estimate is

The next step is to transform this DT transfer function into an adequate CT model. This can be done in several ways. For example, the well-known Tustin transformation can be applied on the DT transfer function estimate by letting

as reported in [13]. If it is assumed that the CT input signal is piecewise constant, the most natural mapping (used in e.g. [14, 9]), is the zero-order hold sampling equivalence

(4)

where and denote the Z and Laplace transforms, respectively. This mapping is known to be a troublesome part of the indirect approach, as it can be ill conditioned and its uniqueness depends on a correct choice of the sampling period.

In both of these mappings, the resulting CT model can have numerator parameters that exceed the desired numerator orders. This is generally the case when the relative degree of is greater than 1. This problem contributes to poor accuracy and high standard deviations at high frequencies. One very simple way of treating this issue is setting to zero the numerator coefficients which should be zero, but this is not the best way of taking care of the information [9].

Iv Optimal enforcement of relative degree

In this section we develop an indirect-approach estimator for the CT parameter vector that renders a CT transfer function estimate of a desired relative degree . For this matter, we first focus on the PEM estimator of the zero-order hold equivalent model of .

For simplicity, we assume that the correct model order has been found. A model with structure (3) is obtained by PEM, and the covariance matrix of is also estimated. We know that the CT zero-order hold equivalent of this estimated model is in general given by

Define , and denote by the true DT parameter vector. The parameters in are related to by the zero-order hold equivalence equations that can be derived by using (4) and comparing coefficients. This relation is a nonlinear mapping , which is differentiable almost everywhere. Hence, the following asymptotic relationship is valid for the covariance matrices of and :

(5)

where is the Jacobian matrix of evaluated at the naive estimation of , that is, throwing away the high order coefficients of the numerator of which should be zero111Note that the standard PEM estimate could have also been used for this matter, and the asymptotic relation still holds..

Now, we propose to find an appropriate projection of into a proper subspace of the parameter space that yields the desired relative degree. This subspace is simply the one formed by all vectors with first elements set to zero. Hence, we decide to study the following problem:

(6)
(7)

where is the identity matrix of dimension , is the null matrix of appropriate dimensions, and . The optimisation problem in this context can be interpreted as an application of the Indirect PEM [10].

By Lagrange multiplier theory, the optimization problem in (6) is equivalent to calculating, for a suitable ,

Partitioning appropriately (dropping the subindex for simplicity), and differentiating with respect to we obtain

(8)

Imposing (7) we obtain . If we denote by the Cholesky factorization matrix of [15] (i.e., a lower triangular matrix with positive diagonal entries such that ) we can write (8) as

(9)

That is, the estimator (9) can be seen as an best approximation to the PEM CT estimate that imposes the desired relative degree.

Iv-a Properties

We briefly present the most important properties of estimator (9) in the following theorems.

Theorem IV.1

Consider the system described by (1) and (2), where is a Gaussian white noise sequence. Assume that the sampling frequency is larger than twice the largest imaginary part of the -domain poles and there is no delay in the real system222These conditions can be relaxed, as long as the sampling frequency is such that the transformation is well defined.. Then, the estimator (9) is a consistent and asymptotically efficient estimator of the real vector parameter , provided the DT model set (with the chosen relative degree) contains the real system.

{proof}

Under the Gaussian noise assumption, the DT PEM estimate can be interpreted as the ML estimate. Under the proposed sampling frequency, the transformation is unique for sufficiently large [16]. Hence, by the invariance principle of ML estimators [17], the CT equivalent of the system’s parameters is also an ML estimate.

To prove the theorem, we only require that the assumptions in [10] are satisfied in this scenario, and then directly apply the results obtained in the cited contribution. First, note that the model structure given by this procedure contains the models with the desired relative degree (and the contention is proper if ), with a linear mapping between parameter vectors given by the matrix

Furthermore, provided that the DT model set contains the real system, both structures give parameter identifiability. Also note that , obtained via (5), is a consistent estimate of the covariance matrix of . Hence, the results in [10] follow. Namely, the normalized estimation errors333For simplicity, we assume that the vector has the appropriate dimension, where zeros have been considered in the first terms if necessary. and are asymptotically normally distributed with zero means and their asymptotic covariance matrices satisfy the relation

(10)

Moreover, following the steps in [10, Section 3], the improved PEM estimate is consistent and has the same asymptotic distribution as , thereby proving its asymptotic efficiency.

Remark IV.1

Note that by (9) and (10), the asymptotic covariances can be shown to satisfy the following properties:

Both of these claims follow by applying properties 10.5 and 10.6 from [18, Chapter 10] to this context. These properties imply that for sufficiently large , the proposed estimator can only decrease the covariance of the estimated parameters compared to standard PEM. The asymptotic covariances satisfy a Pythagorean relation, as the PEM estimate is projected orthogonally on to the proper subspace where lies.

Next, we establish that imposing a larger relative degree improves the accuracy of the estimates, provided that the highest relative degree model structure contains the real system.

Theorem IV.2

Given a plant of order and relative degree , consider CT candidate models of relative degree and and their improved PEM parameter vector estimates and respectively. If , then their asymptotic covariance matrices satisfy .

{proof}

The proof follows by applying [19, Theorem 2]. Details are omitted due to length restrictions.

Remark IV.2

The relative degree of the CT system is not always known. In some practical applications, physical knowledge about the system can give intuition. In addition, statistical measures can be used such as the coefficient of determination [7], or the Young Information Criterion [20].

V Monte Carlo simulation studies

We will now study the performance of the proposed estimator under a series of experiments.

The system considered is the Rao-Garnier system [8], which is a linear fourth-order non-minimum phase system with complex poles that has been tested in many publications (see e.g. [21, 22, 7]) of CT system identification:

This system is interesting since it has two damped oscillatory modes at 2 and 20 rad/sec with damping of and respectively, and has a non-minimum phase zero at . It is known that this is a particularly difficult system to estimate by PEM/ML, since these methods may converge to a local minimum if they are not well initialised [21].

Three methods have been compared: PEM, PEM with relative degree enforcement (labeled PEMrd (9)), and the simplified refined IV method for continuous-time systems (SRIVC), which is one of the most successful direct methods available, and has been suggested for general use in one of the most recent surveys on CT system identification [7]. Each method has been tested in different Monte Carlo simulations, and they have been evaluated according to the average normalized square error of the system estimate

the average normalized square error of the parameter vectors

and the fit measure

where is the noise-free output sequence (the simulated data without the additive measurement noise), is the simulated output sequence of the -th estimated model, and is the average value of .

We have run PEM using the standard MATLAB System Identification Toolbox [12] with the oe command, and assumed that the correct order of the system is known. The search algorithm has been initialised with the estimate given by the Null Space Fitting method444PEM initialised with the estimate from SRIVC (as in [9]) has also been tested with similar results. [23]. We based PEMrd on the PEM estimate previously obtained. The required Jacobian matrix has been numerically calculated via finite differences, and the correct relative degree has been imposed for this estimator. We have used the command d2c of MATLAB in both cases. SRIVC has been implemented with the CONTSID toolbox for MATLAB [6, Chapter 9] with default initialisation, and has been set to estimate the model

V-a Effect of the number of data points and sampling rate

In this study, we have designed the input as a pseudorandom binary sequence (PRBS) of amplitude switches between and . For the first input sequence, the number of stages of the shift register is and the data length of the shortest interval is . Hence, a sequence of data points has been obtained. The noise is a zero-mean white Gaussian noise signal, where the variance has been set such that the signal-to-noise ratio (SNR in dB) between the noiseless output sequence and the noise equals dB.

Three Monte Carlo studies have been performed with simulations of different noise realisations each one, for . The results are shown in Table I.

Method Fit
0.01 PEM 98.9742
PEMrd 99.1219
SRIVC 99.1217
0.05 PEM 98.9645
PEMrd 99.1141
SRIVC 99.1146
0.1 PEM 98.9436
PEMrd 99.0884
SRIVC 99.0870
TABLE I: Monte Carlo simulation results for PRBS input of total length .

In order to analyse the performance of each estimator under less data, we have set the number of stages to and the data length of the shortest interval to , resulting in a input of data points. With the same SNR as the test above, the results for Monte Carlo simulations for each sampling period can be found in Table II.

Method Fit
0.01 PEM 97.7791
PEMrd 98.0705
SRIVC 98.0719
0.05 PEM 97.7892
PEMrd 98.1172
SRIVC 98.1167
0.1 PEM 97.7416
PEMrd 98.0339
SRIVC 97.9754
TABLE II: Monte Carlo simulation results for PRBS input of total length .

Both Tables I and II show that the refined PEM estimator statistically improves the estimates given by PEM, and is a very competitive method against SRIVC, even for high frequency sampling. Note that under less data points, PEMrd still outperforms PEM for every sampling period, which indicates that the asymptotic properties studied in Section IV can be observed in practical finite data cases as well.

Remark V.1

In Tables I and II we have discarded cases where PEM has delivered estimates with one pole in the negative real axis. Fortunately this scenario is very uncommon (9 cases seen in 3000 simulations). A similar phenomenon was observed in Table II for in SRIVC, where 2 estimates gave negative fit values. These simulations were not considered either.

V-B Multisine input

A different input signal has also been tested. We have taken 2000 data measurements of a multisine input given by the sum of sine waves of angular frequencies . The standard deviation of the additive noise has been set to 0.1, and the median of Monte Carlo simulations of the normalized model error, normalized parameter error, and fit have been obtained. The results are shown in Table III.

Method Fit
0.01 PEM 99.4319
PEMrd 99.8173
SRIVC 80.8446
0.02 PEM 99.2684
PEMrd 99.7325
SRIVC 94.3144
TABLE III: Monte Carlo simulation results for multisine wave input of total length .

For this input and sampling method, SRIVC performs poorly, while PEM and PEMrd normally reach the global optimum. Even though the fit is very near the optimal, PEMrd is consistently better than standard PEM at all metrics.

V-C Mean and covariance of the estimated parameters

As established in Theorem IV.1, the improved PEM estimate should reduce (at least asymptotically) the covariance of the parameter vector. To test this, we have obtained the mean and standard deviation of each parameter given by the Monte Carlo study under the setup of Section V-A with and . The results are shown in Table IV. It can be observed that the mean values are similar in all methods (except standard PEM, which does not estimate the correct model structure), and PEMrd provides the lowest standard deviation for every parameter.

Method
Parameter
True value
0
0
-6400
1600
5
408
416
1600
PEM
Mean
Std. Dev
-0.044
0.963
0.301
11.414
-6403.01
147.13
1601.33
47.85
5.006
0.399
408.16
7.98
416.25
9.27
1600.49
33.29
PEMrd
Mean
Std. Dev.
0
0
0
0
-6397.25
122.39
1599.79
42.39
4.994
0.315
407.88
7.11
415.84
8.33
1599.27
28.59
SRIVC
Mean
Std. Dev.
0
0
0
0
-6399.19
132.05
1600.49
44.21
5.014
0.338
407.99
7.75
416.61
8.98
1599.71
31.1
TABLE IV: Estimated parameter value means and standard deviations for each method, considering .

V-D Direct comparison with standard PEM

In this subsection we analyse the improvements of the novel method over the standard PEM estimates.

To show the impact of selecting and enforcing the correct relative degree, we have focused on the Bode Diagrams of the resulting models for PEM and PEMrd. In Figure 1 we have plotted the frequency response estimates of 100 Monte Carlo simulations under the setup in Section V-A, for , .

It is clear by Fig. 1 that the improvement over PEM is mainly at high frequencies. This is intuitive, since the relative degree determines the asymptotic slope of the Bode diagram of magnitude. The proposed method enforces the true asymptotic slope, leading to an important gain in accuracy in both magnitude and phase.

Fig. 1: Bode Diagram of 100 estimates by PEM (green), 100 estimates by PEMrd (blue), and the real system (red).

Also, direct comparison plots between PEM and PEMrd are shown in Figure 2. These plots compare the normalized model and parameter error, and the fit of PEM and PEMrd for each Monte Carlo experiment (500 in total). PEMrd outperforms standard PEM in most Monte Carlo simulations, specially in the fit comparison, where 496 out of 500 experiments have lead to an increase in fit under PEM refinement.

Fig. 2: Direct comparison plots between PEM and PEMrd for 500 Monte Carlo simulations. The green dots correspond to Monte Carlo simulations where PEMrd outperforms PEM. Red dots represent the opposite, and the dashed blue line is the separatrix.

V-E Random systems

To obtain the average performance of each estimator, we have tested them on a data set created with 500 random systems of order 3 and relative degree 2, by using the rss command in MATLAB. The slowest pole has been set to have real part less than -0.1. The input was a unit variance Gaussian white noise, and the additive white noise was also Gaussian, of standard deviation equal to of the maximum value of the noiseless output. The sampling period has been chosen as 10 times faster than the fastest pole or zero of the real system.

We have computed the median of the metrics used above, as in Section V-B, for the 500 random systems. We also have counted failures of the estimators, which are the cases when the estimates produced a negative fit, when the algorithm crashed, or when it was not possible to correctly initialise the PEM estimate having tried reducing by a factor of 2 the sampling rate for initialisation estimation.

The results can be seen in Table V. Although all estimators report failures, the PEMrd again shows promising results in all metrics. As pointed out in [9] and seen in these tests, initialisation aspects are in fact a major issue concerning the reliability of these algorithms.

Method Fit N Failures
PEM 98.871 12
PEMrd 98.997 16
SRIVC 98.9723 30
TABLE V: Monte Carlo simulation results for random systems.

Vi Conclusions

We have proposed a refinement to the standard PEM estimator for indirect continuous-time system identification that achieves an asymptotically optimal desired relative degree enforcement. An explicit expression for this estimator has been found, and its statistical properties have been analysed. Extensive simulations using both standard benchmarks and random systems have been put forward with promising results. These have shown that the refinement to standard PEM leads to an important improvement in all the statistical metrics studied, and its performance is comparable, if not better, to SRIVC for all sampling periods in this study provided PEM is initialised correctly.

References

  • [1] T. Söderström, P. Stoica, System Identification, Prentice-Hall, 1988.
  • [2] L. Ljung, System Identification: Theory for the User, 2nd edition, Prentice-Hall, 1999.
  • [3] R. Pintelon, J. Schoukens, System Identification: A Frequency Domain Approach, John Wiley & Sons, 2012.
  • [4] T. P. Bohlin, Practical Grey-box Process Identification: Theory and Applications, Springer, 2006.
  • [5] G. Rao, H. Unbehauen, Identification of continuous-time systems, IEE Proceedings-Control theory and applications 153 (2) (2006) 185–220.
  • [6] H. Garnier, L. Wang, Identification of Continuous-time Models from Sampled Data, Springer, 2008.
  • [7] H. Garnier, Direct continuous-time approaches to system identification. Overview and benefits for practical applications, European Journal of control 24 (2015) 50–62.
  • [8] G. Rao, H. Garnier, Numerical illustrations of the relevance of direct continuous-time model identification, IFAC Proceedings Volumes 35 (1) (2002) 133–138.
  • [9] L. Ljung, Experiments with identification of continuous time models, IFAC Proceedings Volumes 42 (10) (2009) 1175–1180.
  • [10] T. Söderström, P. Stoica, B. Friedlander, An Indirect Prediction Error Method for System Identification, Automatica 27 (1) (1991) 183–188.
  • [11] K. J. Åström, B. Wittenmark, Computer Controlled Systems: Theory and Design, Prentice-Hall, 1984.
  • [12] L. Ljung, System Identification Toolbox 7: Getting Started Guide.
  • [13] H. Unbehauen, G. Rao, Continuous-time Approaches to System Identification —- A Survey, Automatica 26 (1) (1990) 23–35.
  • [14] N. K. Sinha, Identification of continuous-time systems from samples of input-output data: An introduction, Sadhana 25 (2) (2000) 75–83.
  • [15] R. A. Horn, C. R. Johnson, Matrix Analysis, 2nd Edition, Cambridge University Press, 2012.
  • [16] I. Kollar, G. Franklin, R. Pintelon, On the Equivalence of z-domain and s-domain Models in System Identification, in: Instrumentation and Measurement Technology Conference, IMTC-96, Vol. 1, IEEE, 1996, pp. 14–19.
  • [17] P. W. Zehna, et al., Invariance of maximum likelihood estimators, Annals of Mathematical Statistics 37 (3) (1966) 744.
  • [18] C. Gourieroux, A. Monfort, Statistics and econometric models, Vol. 1, Cambridge University Press, 1995.
  • [19] R. A. González, P. E. Valenzuela, C. R. Rojas, R. A. Rojas, Optimal enforcement of causality in non-parametric transfer function estimation, IEEE Control Systems Letters 1 (2) (2017) 268–273.
  • [20] P. Young, Recursive estimation, forecasting and adaptive control, Control and Dynamic systems 30 (Part 3) (1989) 119–165.
  • [21] L. Ljung, Initialisation aspects for subspace and Output-Error identification methods, in: European Control Conference (ECC), 2003, 2003, pp. 773–778.
  • [22] J. S. Welsh, J. C. Agüero, M. Alamir, Continuous-time System Identification using Indirect Inference, IFAC Proceedings Volumes 42 (10) (2009) 1169–1174.
  • [23] M. Galrinho, C. R. Rojas, H. Hjalmarsson, The weighted null-space fitting method, arXiv preprint arXiv:1708.03946.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
""
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
   
Add comment
Cancel
Loading ...
171924
This is a comment super asjknd jkasnjk adsnkj
Upvote
Downvote
""
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters
Submit
Cancel

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test
Test description