SVD-based Kalman Filter Derivative Computation

# SVD-based Kalman Filter Derivative Computation

J.V. Tsyganova and M.V. Kulikova Manuscript received ??; revised ??. The second author thanks the support of Portuguese National Fund (Fundação para a Ciência e a Tecnologia) within the scope of project UID/Multi/04621/2013.The first author is with Ulyanovsk State University, Str. L. Tolstoy 42, 432017 Ulyanovsk, Russian Federation. The second author is with CEMAT (Center for Computational and Stochastic Mathematics), Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 LISBOA, Portugal; Emails: TsyganovaJV@gmail.com; maria.kulikova@ist.utl.pt
###### Abstract

Recursive adaptive filtering methods are often used for solving the problem of simultaneous state and parameters estimation arising in many areas of research. The gradient-based schemes for adaptive Kalman filtering (KF) require the corresponding filter sensitivity computations. The standard approach is based on the direct differentiation of the KF equations. The shortcoming of this strategy is a numerical instability of the conventional KF (and its derivatives) with respect to roundoff errors. For decades, special attention has been paid in the KF community for designing efficient filter implementations that improve robustness of the estimator against roundoff. The most popular and beneficial techniques are found in the class of square-root (SR) or UD factorization-based methods. They imply the Cholesky decomposition of the corresponding error covariance matrix. Another important matrix factorization method is the singular value decomposition (SVD) and, hence, further encouraging KF algorithms might be found under this approach. Meanwhile, the filter sensitivity computation heavily relies on the use of matrix differential calculus. Previous works on the robust KF derivative computation have produced the SR- and UD-based methodologies. Alternatively, in this paper we design the SVD-based approach. The solution is expressed in terms of the SVD-based KF covariance quantities and their derivatives (with respect to unknown system parameters). The results of numerical experiments illustrate that although the newly-developed SVD-based method is algebraically equivalent to the conventional approach and the previously derived SR- and UD-based strategies, it outperforms the mentioned techniques for estimation accuracy in ill-conditioned situations.

{keywords}

Kalman filter, filter sensitivity equations, SVD factorization, array algorithms.

## I Introduction

The problem of filter sensitivities evaluation plays a key role in many areas of research; for instance, in state estimation and parameter identification realm [1, 2], in the field of optimal input design [3, 4], in information theory for computing the Fisher information matrix [5, 6, 7] etc. In this paper we explore linear discrete-time stochastic systems where the associated Kalman filter (KF) is used for estimating the unknown dynamic states. Therefore, the standard approach for computing the filter sensitivities (with respect to unknown system parameters) is a direct differentiation of the KF equations. This conventional methodology is comprehensively studied in [3, 8, 9]. The shortcoming of this strategy is a numerical instability of the conventional KF (and its derivatives) with respect to roundoff errors discussed in [10, 11]. Due to this fact, special attention has been paid in the KF community for designing robust KF implementation methods. The most popular techniques belong to the class of square-root (SR) or UD factorization-based methods; see [12, 13, 14, 15] and many others. These algorithms imply the Cholesky decomposition and its modification for the corresponding covariance matrix factorization [13, 16, 17]. We may note that the Cholesky decomposition exists and is unique when the symmetric matrix to be decomposed is positive definite [18]. If it is a positive semi-definite, then the Cholesky decomposition still exists, however, it is not unique [19]. Further encouraging KF implementation methods might be found with the use of singular value decomposition (SVD). Some evidences of better estimation quality obtained under the SVD-based approach exist in the field of nonlinear filtering; for instance, see discussion in [20, 21, 22] and others. For linear filtering problem examined in this paper, the first SVD-based KF was, to the best of our knowledge, designed in [23]. Our recent analysis exposes that the mentioned SVD-based filter can be further improved for enhancing its numerical robustness. This result is comprehensively studied in [24], where some new stable SVD-based KF implementations are designed.

Despite the existence of inherently more stable SR-, UD- and SVD-based KF variants, the problem of robust filter derivative computation is seldom addressed in practice because of its complicated matter. The solution to the mentioned problem heavily relies on the use of matrix differential calculus. The first SR-based information-type algorithm for the KF derivative computations belongs to Bierman et al. and was appeared in 1990; see [25]. Alternatively, the SR-based covariance-type method was proposed in [26] as well as the UD-based scheme designed in [27]. Later on, a general “differentiated” SR-based methodology was designed for both orthogonal and J-orthogonal transformations involved in the filtering equations (and their derivatives) in [28, 29, 30]. Alternatively, in this technical note we develop the SVD-based approach for the KF derivative computation. We show that the new technique is algebraically equivalent to the conventional “differentiated” KF, but it improves the robustness against roundoff errors as well as the existing “differentiated” SR- and UD-based methodologies. However, motivated by the results obtained in nonlinear filtering realm, we expect that the newly-designed SVD-based method outperforms the previously derived algorithms while solving the parameters estimation problem, especially when the error covariance is ill-conditioned.

## Ii Filter sensitivity equations: conventional approach

Consider the state-space equations

 xk =F(θ)xk−1+B(θ)uk−1+G(θ)wk−1,k≥1, (1) zk =H(θ)xk+vk,vk∼N(0,R(θ)),wk∼N(0,Ω(θ)) (2)

where , , and are, respectively, the vectors of available measurements, the known deterministic control input, the unknown dynamic state and the unknown system parameters that need to be estimated from the available experimental data, . The process and the measurement noises are independent Gaussian zero-mean white-noise processes that are also independent from the initial state . The covariances are assumed to be , and .

Equations (1), (2) represent a set of the state-space models (SSMs). Each of them corresponds to a particular system parameter value. This means that for any fixed value of , say , the system matrices are known, i.e. there is no uncertainty in model (1), (2). For simplicity, throughout the paper we write etc. instead of etc. when evaluating at the fixed point . The associated KF yields the linear minimum least-square estimate of the unknown dynamic state that can be recursively computed via the equations [16, Theorem 9.2.1]:

 ek =zk−H^xk|k−1,^x1|0=¯x0,k≥1, (3) Kp,k =FPk|k−1HTR−1e,k,Re,k=R+HPk|k−1HT, (4) ^xk+1|k =F^xk|k−1+Buk+Kp,kek (5)

where are innovations of the discrete-time KF. The important property of the KF for Gaussian SSMs is . The is the one-step ahead predicted error covariance matrix computed as follows:

 Pk+1|k=FPk|k−1FT+GΩGT−Kp,kRe,kKTp,k,P1|0=Π0. (6)

The conventional approach for deriving the related sensitivity model is based on differentiation of the corresponding filtering equations. Let , be matrices, which entries are differentiable functions of the parameter vector . The matrix implies the partial derivative of the with respect to the -th component of , . The matrix is the differential form of first-order derivatives of . Taking into account the matrix product rule of differentiation [31, p. 955]: , and the fact , we derive for any square and invertible matrix (it is also known as the Jacobi’s formula); see also [8, p. 546]. Using these differentiation rules, the necessary differentials of (3)-(6) can be written as follows [8, 9]:

 dek =−[(dH)^xk|k−1+H(d^xk|k−1)], (7) d^xk+1|k =(dF)^xk|k−1+F(d^xk|k−1)+(dB)uk +(dKp,k)ek+Kp,k(dek), (8) dKp,k =(dF)Pk|k−1HTR−1e,k+F(dPk|k−1)HTR−1e,k +FPk|k−1(dHT)R−1e,k −FPk|k−1HTR−1e,k(dRe,k)R−1e,k, (9) dRe,k =dR+(dH)Pk|k−1HT+H(dPk|k−1)HT +HPk|k−1(dHT), (10) dPk+1|k =(dF)Pk|k−1FT+F(dPk|k−1)FT +FPk|k−1(dFT)+(dG)ΩGT+G(dΩ)GT +GΩ(dGT)−(dKp,k)Re,kKTp,k −Kp,k(dRe,k)KTp,k−Kp,kRe,k(dKTp,k). (11)

In deriving the equations above we take into account that and , because the observations and the control input do not depend on the parameters (i.e. their realizations are independent of variations in ) and therefore have a differential equal to zero.

We may also note that except for the scalar factor , is a special case of , so that to obtain partial-derivative forms from differential forms, we only have to everywhere replace operator with for  [8, p. 546]. Hence, from (7) – (11) we obtain a set of vector equations, known as the filter sensitivity equations, for computing , , and a set of matrix equations, known as the Riccati-type sensitivity equations, for computing , . This approach for the KF sensitivity model derivation is called the “differentiated KF”. Its main drawback is a numerical instability of the conventional KF (3) – (6) and inherently its derivative (7) – (11) with respect to roundoff errors.

The goal of this paper is to design a robust methodology for updating the “differentiated” KF equations above in terms of SVD factors (and their derivatives) of the error covariance matrices instead of using the full matrices (and their derivatives).

## Iii SVD factorization-based Kalman filtering

To the best of our knowledge, the first SVD-based KF was by Wang et al. and appeared in 1992; see Eqs (17), (22), (23) in [23, pp. 1225-1226]. Our recent research shows that although that implementation is inherently more stable than the KF (3) – (6), it is still sensitive to roundoff and poorly treats ill-conditioned problems. The cited analysis exposes that the SVD-based filter can be further improved for enhancing its numerical robustness. This result is comprehensively studied in [24], where new stable SVD-based KF implementations are designed. The readers are referred to the cited paper for the detailed derivations, numerical stability discussion and proofs. Here, we briefly outline the principle steps for construction of the most advanced SVD-based KF variant. Next, we extend it to a stable filter sensitivities computation, which is the main purpose of this study.

Consider the SVD factorization [32, Theorem 2.8.1]: suppose , . There exist positive numbers and unitary matrices and such that

 A=WΣV∗,Σ=[S000]∈Cm×n,S=diag{σ1,…,σr}

where is the conjugate transpose of .

The diagonal entries of are known as the singular values of . The non-zero () are the square roots of the non-zero eigenvalues of both and .

If is a square matrix such that , then the can be diagonalized using a basis of eigenvectors according to the spectral theorem, i.e. it can be factorized as follows: where is a unitary matrix and is a diagonal matrix, respectively. If is also positive semi-definite, then the spectral decomposition above, , is also a SVD factorization, i.e. the diagonal matrix contains the singular values of . For the SSMs examined in this paper, the initial error covariance is a symmetric positive semi-definite matrix and, hence, the spectral decomposition implies where and are the orthogonal and diagonal matrices, respectively. It is also a SVD factorization, i.e. the factor contains the singular values of .

Now, we are ready to present the SVD-based KF implementation developed recently in [24]. Instead of conventional recursion (3)-(6) for , we update only their SVD factors, , at each iteration step of the filter as shown below.

Initial Step (). Apply the SVD factorization for the initial error covariance matrix and, additionally, for the process and measurement noise covariances: and , respectively. Set the initial values as follows: , and .

Measurement Update (). Build the pre-arrays from the filter quantities that are currently available and, then, apply the SVD factorizations in order to obtain the corresponding SVD factors of the updated filter quantities as follows:

 ⎡⎢⎣D1/2RQTRD1/2Pk|k−1QTPk|k−1HT⎤⎥⎦Pre−array=W(1)MU[D1/2Re,k0]QTRe,kPost−arraySVDfactors, (12) ¯Kk=(QPk|k−1DPk|k−1QTPk|k−1)HTQRe,k, (13) ⎡⎢⎣D1/2Pk|k−1QTPk|k−1(I−KkH)TD1/2RQTRKTk⎤⎥⎦Pre−array=W(2)MU[D1/2Pk|k0]QTPk|kPost−arraySVDfactors (14)

where we denote . The matrices , and , are the orthogonal matrices of the corresponding SVD factorizations in (12), (14). Next, and are diagonal matrices with square roots of the singular values of and , respectively.

It can be easily seen that the required SVD factors of the innovation covariance , i.e. , and a posteriori error covariance matrix , i.e. , are directly read-off from the post-array factors in (12) and (14), respectively. Finally, find a posteriori estimate through equations

 ^xk|k=^xk|k−1+¯KkD−1Re,k¯ek,¯ek=QTRe,k(zk−H^xk|k−1). (15)

Time Update (). Build the pre-array and apply the SVD factorization to obtain a priori error covariance SVD factors as follows:

 ⎡⎢⎣D1/2Pk|kQTPk|kFTD1/2ΩQTΩGT⎤⎥⎦Pre−array=WTU[D1/2Pk+1|k0]QTPk+1|kPost−arraySVDfactors (16)

and find a priori estimate as follows:

 ^xk+1|k =F^xk|k+Buk. (17)

The SVD-based KF implementation above is formulated in two-stage form. Meanwhile, following [15], the conventional KF (3) –(6) is expressed in the so-called “condensed” form. Nevertheless, these KF variants are algebraically equivalent. It is easy to prove if we take into account the SVD factorization and the properties of orthogonal matrices. Indeed, for each pre-array to be decomposed we have . Next, by comparing both sides of the obtained matrix equations, we come to the corresponding SVD-based KF formulas. The detailed derivation can be found in [24].

## Iv Filter sensitivity equations: SVD-based approach

To begin constructing the “differentiated” SVD-based method for computing the filter sensitivities, we pay attention to the underlying SVD-based filter and remark that it is formulated in the so-called array form. This makes the modern KF algorithms better suited to parallel implementation and to very large scale integration (VLSI) implementation as mentioned in [15]. Each iteration of the SVD-based filter examined has the following pattern: given a pre-array , compute the post-array SVD factors , and by means of the SVD factorization

 A (18)

where the matrix is of full column rank, i.e. ; the , are orthogonal matrices and is a diagonal matrix with singular values of the pre-array .

The goal of our study is to develop the method that naturally extends formula (18) on the post-array SVD factors’ derivative computation. More precisely, the computational procedure is expected to utilize the pre-array and its derivative for reproducing the SVD post-arrays together with their derivatives . To achieve our goal, we prove the result presented below. We also bear in mind that the SVD post-array factor is of no interest in the presented SVD-based KF for performing the next step of the filter recursion and, hence, the quantity is not required to be computed.

###### Lemma 1

Consider the SVD factorization in (18). Let entries of the pre-array be known differentiable functions of a scalar parameter . We assume that , , for all . Given the derivative of the pre-array, , the following formulas calculate the corresponding derivatives of the post-arrays:

 Σ′θ =[S′θ0], S′θ =diag[WTA′θV]s×s, (19) V′θ =V[¯LT2−¯L2] (20)

where denotes the main block of the matrix product , and is a strictly lower triangular matrix, which entries are computed as follows:

 (¯l2)ij =¯ujiσj+¯lijσiσ2i−σ2j, i =2,…,s,j=1,…,i−1. (21)

In equation above, the quantities and denote the entries of matrices and , respectively. The , are strictly lower and upper triangular parts of the matrix product , respectively.

{proof}

By differentiating (18) with respect to , we obtain

 A′θ=W′θΣVT+WΣ′θVT+WΣ(VT)′θ. (22)

Having applied a right-multiplier and a left-multiplier to equation (22), we have

 WTA′θV=[WTW′θ]Σ+Σ′θ+Σ[(VT)′θV]. (23)

In deriving the equation above we take into account the properties of any orthogonal matrix , i.e. .

It is also easy to show that for any orthogonal matrix the product is a skew symmetric matrix. Indeed, by differentiating both sides of with respect to , we get , or in the equivalent form . The latter implies that the matrix is skew symmetric.

For the sake of simplicity we introduce the following notations: and . As discussed above, the matrices and are skew symmetric, because and are orthogonal matrices. Hence, we have . Taking into account this fact, we obtain the following partitioning of the matrix form of equation (23):

From the equation above, we derive the formula for the main block of the matrix product

 [WTA′θV]s×s=[Υ]s×sS+S′θ−SΛ. (24)

Hence, the diagonal matrix obeys the equation

 S′θ=[WTA′θV]s×s−[Υ]s×sS+SΛ. (25)

Now, let us discuss formula (25) in details. Recall the matrices and are skew symmetric matrices and, hence, their diagonal entries are equal to zero. The multiplication of any skew symmetric matrix by a diagonal matrix does not change the matrix structure, i.e. the diagonal entries of the matrix products and are equal to zero as well. Meanwhile, the matrix is a full matrix and contains a diagonal part. Hence, from equation (25) we conclude that diagonal matrix is, in fact, a diagonal part of the main block of the matrix product . This completes the proof of formulas in equation (19).

Finally, we need to compute where . Since is an orthogonal matrix, we obtain . Next, any skew symmetric matrix can be presented as a difference of a strictly lower triangular matrix and its transpose. Hence, the skew symmetric matrices and can be represented as follows:

 [Υ]s×s =¯LT1−¯L1 Λ =¯LT2−¯L2 (26)

where and are strictly lower triangular matrices.

Next, we split the matrix product into strictly lower triangular, diagonal and strictly upper triangular parts, i.e. . It was proved above that . Taking into account this fact, the substitution of both formulas in (26) into (25) yields

 DS′θ=¯L+D+¯U[WTA′θV]s×s−[¯LT1−¯L1][Υ]s×sS+S[¯LT2−¯L2]Λ. (27)

Hence, we obtain

 ¯L+¯U=[¯LT1−¯L1]S−S[¯LT2−¯L2]. (28)

In (28), the , , are strictly lower triangular matrices, the is a strictly upper triangular matrix and is a diagonal. Hence, equation (28) implies

 {¯U=¯LT1S−S¯LT2,¯L=−¯L1S+S¯L2.

It can be solved with respect to entries of as follows:

 (¯l2)ij=¯ujisj+¯lijsis2i−s2j,i=2,…,s,j=1,…,i−1.

The formula above is exactly equation (21). Having computed the entries we can form the matrix in (26) and, then, compute the derivative . This completes the proof of (20) and Lemma 1.

###### Remark 1

The assumption of singular values of being distinct for all values of parameter is necessary for avoiding the division by zero in formula (21). In future, if possible, we will intend for relaxing this restriction, which reduces the practical applicability of the proposed method.

For readers’ convenience, Algorithm 1 provides a pseudocode for the computational scheme derived in Lemma 1.

The theoretical result presented in Lemma 1 can be further applied to the SVD factorization-based KF discussed in Section III. The obtained computational scheme is summarized in Algorithm 2 and shown in the form of a pseudocode. The new “differentiated” SVD-based KF extends the underlying SVD-based filter on the derivative computation (with respect to unknown system parameters) for updating the corresponding filter sensitivities equations. The method can be used for replacing the conventional “differentiated KF” approach discussed in Section II by inherently more stable approach, which is preferable for practical implementation. Finally, we would like to remark that any “differentiated” filtering technique consists of two parts: i) the underlying KF variant, and ii) its “differentiated” extension used for the filter sensitivities computation.

At the same manner, one can naturally augment any existing SVD-based KF variant (see, for instance, the algorithms in [23, 24]) or potentially new SVD-based KF implementation on the corresponding filter sensitivities computation.

Finally, taking into account the properties of orthogonal matrices, it is not difficult to show that the negative log likelihood function (LF) given as [33]:

 L(θ,ZN1)=c0+12N∑k=1{ln(detRe,k)+eTkR−1e,kek}

can be rewritten in terms of the SVD filter variables , and appeared in equations (12) – (17) as follows:

 L(θ,ZN1)=c0+12N∑k=1{ln(detDRe,k)+¯eTkD−1Re,k¯ek} (29)

where is -step measurement history and is a constant value where .

Taking into account that the matrix is diagonal and using the Jacobi’s formula, , from (29) we obtain the expression for the log LF gradient evaluation in terms of the SVD filter variables and their derivatives computed in the newly-developed Algorithm 2 (for each ):

 ∂iL(θ,ZN1) =12N∑k=1{tr[(∂iDRe,k)D−1Re,k]+2(∂i¯eTk)D−1Re,k¯ek −¯eTkD−1Re,k(∂iDRe,k)D−1Re,k¯ek}. (30)

## V Numerical experiments

By using simple test problem, we would like to demonstrate thoroughly each step of the method summarized in Algorithm 1.

###### Example 1

Given pre-array and its derivative

 A(θ)=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣−2θsin(θ)2θθ2sin2(θ)1/3θ3θ2θ2−1cos2(θ)θ3+θ2⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦,

compute the corresponding SVD post-arrays , and their derivative , at the point .

Table 1 illustrates each step of the computational scheme in Algorithm 1. To assess the accuracy of computations, we compute . This quantity should be small. Indeed, taking into account the properties of diagonal and orthogonal matrices, from (18) we have . Hence, the derivatives of both sides of the last formula should coincide as well. In our numerical experiment we obtain . This justifies the correctness of computations via Algorithm 1 and confirms the theoretical derivations in Lemma 1.

Next, we wish to demonstrate how the novel method for the filter sensitivities evaluation (Algorithm 2) works in practice. For that, we consider the parameter estimation problem where the gradient-based optimization method is applied for finding the optimal value of unknown system parameters. We test the conventional “differentiated” KF (Eqs (3) – (11) in Section II) and the previously derived SR- and UD-based “differentiated” KF variants from [26] and [27], respectively, against the new “differentiated” SVD-based KF (Algorithm 2). As discussed in Section IV, all “differentiated” methods consist of two parts and, hence, they compute the Log LF and its gradient simultaneously. These values are utilized by a gradient-based optimization method for maximizing the log LF with respect to system parameters. Our library of codes is implemented in MATLAB where we use the built-in optimization method fminunc.

###### Example 2

Consider a linearized version of the in-track motion dynamic when a satellite travels in a circular orbit [34]:

 xk =⎡⎢ ⎢ ⎢⎣110.50.5011100100000.606⎤⎥ ⎥ ⎥⎦xk−1+wk−1, Ω =⎡⎢ ⎢ ⎢⎣000000000000000q1⎤⎥ ⎥ ⎥⎦ zk =[11111111+δ]xk+vk, R =[θ2δ200θ2δ2]

where , and is the unknown system parameter that needs to be estimated. In contrast to [34], we wish to test both well-conditioned and ill-conditioned situations. For that, following [17], we simulate the roundoff by parameter . It is assumed to be , but where denotes the unit roundoff error111Computer roundoff for floating-point arithmetic is characterized by a single parameter , defined in different sources as the largest number such that either or in machine precision. . When , i.e. the machine precision limit, the problem above becomes ill-conditioned. By varying the ill-conditioning parameter , we are able to explore some numerical insights of each method assessed.

The numerical experiment is organized as follows. For each fixed value of ill-conditioning parameter , the SSM in Example 2 is simulated for to generate measurements. Next, the unknown system parameter is estimated from the available experimental data, , by using gradient-based adaptive KF techniques examined, i.e. by four “differentiated” KF methods mentioned earlier in this Section. For a fair comparison, each “differentiated” algorithm utilizes the same data and the same initial value for the optimization method, . Next, the obtained optimal estimate is compared with the “true” value of for assessing the estimation quality of each method. We repeat the experiment times and calculate a posterior mean of the estimate, the root mean squared error (RMSE) and the mean absolute percentage error (MAPE) over Monte Carlo runs.

Having carefully analyzed the obtained numerical results summarized in Table 2, we make a few important conclusions. First, all “differentiated” KF variants work equally well when is about and , i.e. when the problem is not ill-conditioned. This confirms that all “differentiated” techniques are algebraically equivalent. Second, among all methods examined, the conventional approach (“differentiated” KF) shows the worst performance. It degrades faster than any other algorithms when . Furthermore, the line in Table 2 means that MATLAB can not even run the algorithm. Third, we analyze the outcomes obtained by other methods tested and observe that the UD- and SVD-based “differentiated” techniques produce a better estimation quality than the SR-based counterpart. This conclusion is reasonable if we recall that in this paper we do not explore the filtering algorithms, but their differential form for the KF sensitivities computation. Any existing “differentiated” SR-based scheme requires the triangular matrix inversion that is a square-root factor of the innovation covariance ; see Eq (6) in [26]. In contrast, the UD- and SVD-based “differentiated” methods involve the inversion of only diagonal matrix ; see (30) and Eq (8) in [27]. Finally, we observe that the new SVD-based approach slightly outperforms the UD-based counterpart when .

In summary, the previously derived UD- and the new SVD-based techniques provide the best estimation quality when solving parameter estimation problem by the gradient-based adaptive filtering methodology. This creates a strong background for their practical use. In our ill-conditioned test example, the new SVD-based approach even slightly outperforms the UD-based counterpart.

## References

• [1] R. K. Mehra, “Approaches to adaptive filtering,” IEEE Trans. Automat. Contr., vol. 17, no. 5, pp. 693–698, Oct. 1972.
• [2] G. Bastin and M. R. Gevers, “Stable adaptive observers for nonlinear time-varying systems,” IEEE Trans. Automat. Contr., vol. 33, no. 7, pp. 650–658, Jul. 1988.
• [3] R. K. Mehra, “Optimal input signals for parameter estimation in dynamic systems – survey and new results,” IEEE Trans. Automat. Contr., vol. AC-19, no. 6, pp. 753–768, Dec. 1974.
• [4] N. K. Gupta and R. K. Mehra, “Computational aspects of maximum likelihood estimation and reduction in sensitivity function calculations,” IEEE Trans. Automat. Contr., vol. AC-19, no. 6, pp. 774–783, Dec. 1974.
• [5] P. A. Zadrozny and S. Mittnik, “Kalman-filtering methods for computing information matrices for time-invariant, periodic, and generally time-varying VARMA models and samples,” Computers & Mathematics with Applications, vol. 28, no. 4, pp. 107–119, 1994.
• [6] A. Klein and G. Mélard, “Computation of the Fisher information matrix for time series models,” Journal of computational and applied mathematics, vol. 64, no. 1, pp. 57–68, 1995.
• [7] A. Klein, G. Mélard, and T. Zahaf, “Construction of the exact Fisher information matrix of Gaussian time series models by means of matrix differential rules,” Linear Algebra and its Applications, vol. 321, no. 1, pp. 209–232, 2000.
• [8] P. A. Zadrozny, “Analytic derivatives for estimation of linear dynamic models,” Computers & Mathematics with Applications, vol. 18, no. 6, pp. 539–553, 1989.
• [9] A. Klein and H. Neudecker, “A direct derivation of the exact Fisher information matrix of Gaussian vector state space models,” Linear Algebra and its Applications, vol. 321, no. 1, pp. 233–238, 2000.
• [10] M. Verhaegen and P. Van Dooren, “Numerical aspects of different kalman filter implementations,” IEEE Trans. Automat. Contr., vol. AC-31, no. 10, pp. 907–917, Oct. 1986.
• [11] M. Verhaegen, “Round-off error propagation in four generally-applicable, recursive, least-squares estimation schemes,” Automatica, vol. 25, no. 3, pp. 437–444, 1989.
• [12] P. G. Kaminski, A. E. Bryson, and S. F. Schmidt, “Discrete square-root filtering: a survey of current techniques,” IEEE Trans. Automat. Contr., vol. AC-16, no. 6, pp. 727–735, Dec. 1971.
• [13] G. J. Bierman, Factorization Methods For Discrete Sequential Estimation.   New York: Academic Press, 1977.
• [14] A. H. Sayed and T. Kailath, “Extended handrasekhar recursion,” IEEE Trans. Automat. Contr., vol. AC-39, no. 3, pp. 619–622, Mar. 1994.
• [15] P. Park and T. Kailath, “New square-root algorithms for alman filtering,” IEEE Trans. Automat. Contr., vol. 40, no. 5, pp. 895–899, May 1995.
• [16] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation.   New Jersey: Prentice Hall, 2000.
• [17] M. Grewal and A. Andrews, Kalman filtering: theory and practice.   New Jersey: Prentice Hall, 2001.
• [18] G. H. Golub and C. F. Van Loan, Matrix computations.   Baltimore, Maryland: Johns Hopkins University Press, 1983.
• [19] N. J. Higham, “Analysis of the Cholesky decomposition of a semi-definite matrix,” University of Manchester, Tech. Rep. MIMS EPrint: 2008.56, 1990.
• [20] X. Zhang, W. Hu, Z. Zhao, Y. Wang, and Q. Wei, “SVD based Kalman particle filter for robust visual tracking,” in Proceedings of the 19th International Conference on Pattern Recognition.   Tampa, FL, USA: IEEE, 2008, pp. 1–4.
• [21] O. Straka, J. Dunik, M. Simandl, and J. Havlik, “Aspects and comparison of matrix decompositions in unscented Kalman filter,” in Proceedings of the IEEE American Control Conference (ACC), 2013, pp. 3075–3080.
• [22] Q. Zhang, X. Meng, S. Zhang, and Y. Wang, “Singular Value Decomposition-based Robust Cubature Kalman Filtering for an Integrated GPS/SINS Navigation System,” Journal of Navigation, vol. 68, no. 3, pp. 549–562, 2014.
• [23] L. Wang, G. Libert, and P. Manneback, “Kalman Filter Algorithm based on Singular Value Decomposition,” in Proceedings of the 31st Conference on Decision and Control.   Tuczon, AZ, USA: IEEE, 1992, pp. 1224–1229.
• [24] M. V. Kulikova and J. V. Tsyganova, “Improved discrete-time Kalman filtering within singular value decomposition,” IET Control Theory & Applications, 2016,  (in progress). [Online]. Available: https://arxiv.org/abs/1611.03686
• [25] G. J. Bierman, M. R. Belzer, J. S. Vandergraft, and D. W. Porter, “Maximum likelihood estimation using square root information filters,” IEEE Trans. Automat. Contr., vol. 35, no. 12, pp. 1293–1298, Dec. 1990.
• [26] M. V. Kulikova, “Likelihood gradient evaluation using square-root covariance filters,” IEEE Trans. Automat. Contr., vol. 54, no. 3, pp. 646–651, Mar. 2009.
• [27] J. V. Tsyganova and M. V. Kulikova, “State sensitivity evaluation within UD based array covariance filter,” IEEE Trans. Automat. Contr., vol. 58, no. 11, pp. 2944–2950, Nov. 2013.
• [28] M. V. Kulikova and A. Pacheco, “Kalman filter sensitivity evaluation with orthogonal and J-orthogonal transformations,” IEEE Trans. Automat. Contr., vol. 58, no. 7, pp. 1798–1804, Jul. 2013.
• [29] M. V. Kulikova and J. V. Tsyganova, “Constructing numerically stable Kalman filter-based algorithms for gradient-based adaptive filtering,” International Journal of Adaptive Control and Signal Processing, vol. 29, no. 11, pp. 1411–1426, 2015.
• [30] ——, “A unified square-root approach for the score and Fisher information matrix computation in linear dynamic systems,” Mathematics and Computers in Simulation, vol. 119, pp. 128–141, 2016.
• [31] H. Neudecker, “Some theorems on matrix differentiation with special reference to Kronecker matrix products,” Journal of the American Statistical Association, vol. 64, no. 327, pp. 953–963, 1969.
• [32] E. E. Tyrtyshnikov, A brief introduction to numerical analysis.   Springer Science & Business Media: Springer, 2012.
• [33] F. C. Schweppe, “Evaluation of likelihood functions for Gaussian signals,” IEEE Trans. Inf. Theory, vol. IT-11, pp. 61–70, Jan. 1965.
• [34] H. E. Rauch, C. T. Striebel, and F. Tung, “Maximum likelihood estimates of linear dynamic systems,” AIAA journal, vol. 3, no. 8, pp. 1445–1450, 1965.