# Particle Filter-Based Fault Diagnosis of Nonlinear Systems Using a Dual Particle Filter Scheme*

## Abstract

In this paper, a dual estimation methodology is developed for both the time-varying parameters and the states of a nonlinear stochastic system based on the particle filtering (PF) scheme. Our developed methodology is based on a concurrent implementation of state and parameter estimation filters, as opposed to using a single filter for simultaneously estimating the augmented states and parameters. The convergence and stability of our proposed dual estimation strategy are formally shown to be guaranteed under certain conditions. The ability of our developed dual estimation method to simultaneously and efficiently handle the states and time-varying parameters of a nonlinear system is demonstrated in a health monitoring context that employs a unified approach to fault detection, isolation, and identification in a single algorithm. The performance capabilities of our proposed fault diagnosis methodology are demonstrated and evaluated by its application to a gas turbine engine through state and parameter estimation under simultaneous and concurrent component fault scenarios. Extensive simulation results are provided to substantiate and justify the superiority of our proposed fault diagnosis methodology when compared with another well-known alternative diagnostic technique available in the literature.

## 1 Introduction

State estimation is a fundamental problem in the control, signal processing, and fault diagnosis fields [1]. Investigations of both linear and nonlinear state estimation and filtering in stochastic environments have been an active area of research during the past several decades. Linear state estimation methods use a simpler representation of an actual nonlinear system and can provide acceptable performance only locally around an operating point and in the steady-state operational condition of the system. However, as nonlinearities in the system dynamics become dominant, the performance of linear approaches deteriorates and linear algorithms will not necessarily converge to an accurate solution. Moreover, although an optimal state estimation solution exists for linear filtering problems, nonlinear filtering methods can in general provide only sub-optimal or near-optimal solutions. Consequently, the investigation of nonlinear estimation and filtering problems remains a challenging research area.

Numerous studies have been conducted in the literature to solve and analyze standard nonlinear filtering problems [2]. These methods can be broadly categorized into [7]: (a) linearization methods (extended Kalman filters (EKF)) [2], (b) approximation methods using finite-dimensional nonlinear filters [3], (c) particle filter (PF) methods as one of the most popular Bayesian recursive methods for state estimation [4], (d) classical partial differential equation (PDE) methods for approximating a solution to the Zakai equation [5], (e) Wiener chaos expansion methods [6], (f) moment methods [7], and (g) high dimensional nonlinear Kalman filter methods known as the Cubature Kalman filters [8].

One of the most important recent applications of nonlinear filtering methods is in the area of fault diagnosis of dynamical systems, which can include fault detection, isolation, and identification (FDII) modules. Diagnosis methods that are based on linearization techniques suffer from poor detection and high rates of false alarms. Therefore, a Monte Carlo filtering approach based on particle filters was first proposed in [9] to address the fault detection and isolation problem in nonlinear systems. In that work, the negative log-likelihood, calculated over a predefined time window, is considered as a measure for fault detection. Fault isolation was achieved by augmenting the fault parameter vector to the system states to perform the estimation task. However, the augmented state-space model tends to increase the dimensionality of the model and, as a result, increases the number of particles required for achieving a sufficiently accurate result. To decrease the computational burden of this method, the augmented model is used only after the fault detection stage, that is, only for the fault isolation stage. An external covariance adjustment loop was added to this augmented model in [10] to enable the estimation algorithm to track changes in the system parameters in case of fault occurrences.

The combination of a particle filtering algorithm and the log-likelihood ratio (LLR) test in a multiple-model environment has led to the development of a sensor/actuator FDI scheme in [11] for a general class of nonlinear non-Gaussian dynamical systems, although under the assumption of full state measurements. The fault detection problem has recently been addressed for a mobile robot in [12] based on the combination of the negative LLR test and a particle filtering approach. However, both of the methods in [11] and [12] suffer from a high computational burden for on-line implementation. Hence, the idea of parallelized particle filters for on-line fault diagnosis is introduced in [12] to improve the performance of the algorithm.

A PF-based robust navigation approach was proposed in [13] to address multiple and simultaneous faults occurrences in both actuators and sensors in an underwater robot where an anomaly is modeled by a switching-mode hidden Markov system. The component and actuator fault detection and isolation of a point mass satellite was tackled in [14] by introducing several particle filters that run in parallel and each rejects a different subset of the faults.

Generally, the main issues with applying standard particle filters to the fault diagnosis problem can be stated as follows [15]: (i) false diagnosis decisions due to low probabilities of transitions to fault states when there are few samples of those states, and (ii) the exponential growth of the number of samples required for accurately approximating the a posteriori distributions as the dimensionality of the estimation problem increases. The risk-sensitive PF is introduced to address the first problem, and the variable resolution PF is developed to overcome the second problem in [16]. Moreover, the Gaussian PF (GPF) is introduced in [17] as an efficient algorithm for performing fault diagnosis of hybrid systems faster than traditional PF-based methods. Finally, the sample impoverishment problem in particle filters due to fault occurrence in a hybrid system is addressed in [18]. The developed algorithm enables the PF method to be implemented with fewer particles even under faulty conditions.

In this work, our objective is to address the above problems by specifically utilizing particle filters. This is accomplished through development of a PF-based dual state/parameter estimation scheme for the system component fault diagnosis. The developed dual estimation scheme relies on an on-line estimation of the system states as well as the system health parameters. These parameters are not necessarily fixed and their time variations are governed and influenced by the fault vector that directly affects them.

The on-line estimation of the system's time-varying parameters by using particle filters is a challenging and active area of research. There are two main classes of PF-based parameter estimation algorithms (for on-line as well as off-line implementations) [19], known as Bayesian and maximum likelihood (ML) approaches. In the Bayesian approach, a prior distribution is considered for the unknown parameters and the posterior distribution of the parameters is approximated given the observations [20], whereas in the ML approach the estimated parameter is the maximizing argument of the likelihood function given the observations [22]. In the ML framework for parameter estimation, the maximization of any cost function can be performed based on gradient-based search methods [22]. On the other hand, expectation maximization (EM) methods are only applicable for maximization of likelihood functions [25]. However, EM methods are not suitable for on-line applications due to their high computational cost of implementation. The recursive maximum likelihood (RML) method is recognized as a promising method for on-line parameter estimation based on a stochastic gradient algorithm [23]. In order to avoid the direct computation of the likelihood function gradient, an alternative method known as gradient-free ML parameter estimation is proposed in [26]. Despite the above, from a practical point of view the on-line ML methods suffer from slow convergence rates and require a large number of particles to achieve accurate estimates [27].

In the Bayesian framework, on-line implementations of particle filter-based parameter estimation algorithms are computationally intensive [28]. A general method that is capable of simultaneously estimating the static (i.e., constant or fixed) parameters and the time-varying states of a system is developed in [29]. This work is based on the sequential Monte Carlo (SMC) method, in which an artificial dynamic evolution model is considered for the unknown model parameters. In order to overcome the degeneracy concerns arising from particle filtering, the kernel smoothing technique has been utilized in [20] as a method for smoothing the approximation of the parameters' conditional density. The estimation algorithm is further improved by a re-interpretation of the artificial evolution algorithm according to the shrinkage scaling concept. However, the method proposed in [20] is only applicable for estimating fixed parameters of the system, and it uses the augmented state/parameter vector for the estimation task.

Our main goal in this work is to extend the Bayesian framework that is proposed in [20] for both state and parameter estimation by invoking our proposed modified artificial evolution law, which is inspired by the prediction error (PE) concept. This modification enables the scheme to track and estimate the time-varying parameters of the system. The PE method is used for both off-line and on-line system identification by incorporating a quadratically convergent scheme [30]. The concept of kernel mean shrinkage (KS) is also used in our proposed scheme to avoid over-dispersion in the variance of the estimated parameters.

Component fault diagnosis of dynamical systems can be achieved through dual state/parameter estimation methods [32] by first detecting the faulty component and then determining its severity and isolating its location. In this work, our proposed dual state and parameter estimation strategy is applied to performing component fault diagnosis of a gas turbine engine [34]. This is accomplished by utilizing a multiple-model approach as developed in [35]. Although the multiple-model based approach can detect and isolate faults based on a set of predefined faulty models, the parameter estimation schemes embedded in our proposed dual state/parameter estimation methodology enable the diagnostic scheme to not only detect and isolate the faults but also accurately identify the type and severity of simultaneous fault scenarios in the gas turbine system. The simultaneous detection, isolation, and identification of faults is the main feature that distinguishes the current work from the works conducted in [35]. Moreover, the main methodology that is developed in those references is based on Kalman filters, as opposed to the particle filtering methodology that is developed in the current work.

Based on the above discussion, the contributions of this paper can be summarized as utilizing nonlinear Bayesian and sequential Monte Carlo (SMC) methods to develop, design, analyze, and implement a unified framework for both state and parameter estimation as well as fault diagnosis of nonlinear systems. Our methodology is based on solving the Bayesian recursive relations through SMC methods. An on-line parameter estimation scheme is developed based on a modified artificial evolution concept by using the particle filter (PF) approach. Specifically, by using the prediction error to correct for time-varying changes in the system parameters, a novel methodology is proposed for parameter estimation of nonlinear systems based on the PF. Moreover, in the implementation of our proposed scheme, a dual structure for both state and parameter estimation is developed within the PF framework. In other words, the hidden states and the variations of the system parameters are estimated by operating two concurrent filters. Convergence and stability properties of our proposed dual estimation strategy are formally shown to be guaranteed under certain conditions.

The remainder of this paper is organized as follows. In Section 2, the statement of the nonlinear filtering problem is presented. Our proposed dual state/parameter estimation scheme is developed in Section 3, in which state and parameter estimation methods are first developed concurrently and subsequently integrated together for simultaneously estimating the system states and parameters. The stability and convergence properties of the proposed schemes under certain conditions are also provided in Section 3. Our proposed fault diagnosis framework and formulation are also provided in Section 3. In Section 4, extensive simulation results and case studies are provided to demonstrate and justify the merits of our proposed method for fault diagnosis of a gas turbine engine under simultaneous and concurrent component faults. Finally, the paper is concluded in Section 5.

## 2 Problem Statement

The problem under consideration is to obtain an optimal estimate of states as well as time-varying parameters of a nonlinear system whose dynamics is governed by a discrete-time stochastic model,

where is the system state, , is a known nonlinear function, is an unknown and possibly time-varying parameter vector governed by an unknown dynamics. The function is a known nonlinear function representing the map between the states, parameters and the system measurements, and and are uncorrelated stochastic process and measurement noise sequences with covariance matrices and , respectively. The following assumption is made regarding the dynamical system and .

Assumption A1. The vector ranges over a compact set denoted by , for which the functions and are continuously differentiable with respect to the state as well as the parameter .
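As a concrete illustration of the class of systems considered, the following sketch simulates a toy discrete-time nonlinear stochastic model of the above form. The transition map `f`, the measurement map `h`, the noise covariances `Q` and `R`, and all numerical values are illustrative assumptions, not the gas turbine model studied later in the paper.

```python
import numpy as np

# Toy instance of the model class:
#   x_{k+1} = f(x_k, theta_k) + w_k,   w_k ~ N(0, Q)
#   y_k     = h(x_k, theta_k) + v_k,   v_k ~ N(0, R)

def f(x, theta):
    # hypothetical nonlinear state transition (pendulum-like dynamics)
    return np.array([x[0] + 0.1 * x[1],
                     x[1] - 0.1 * theta[0] * np.sin(x[0])])

def h(x, theta):
    # hypothetical measurement map: only the first state is observed
    return np.array([x[0]])

Q = 1e-4 * np.eye(2)   # process noise covariance
R = 1e-3 * np.eye(1)   # measurement noise covariance

def simulate(x0, theta, steps, rng):
    xs, ys = [x0], []
    for _ in range(steps):
        w = rng.multivariate_normal(np.zeros(2), Q)
        v = rng.multivariate_normal(np.zeros(1), R)
        xs.append(f(xs[-1], theta) + w)
        ys.append(h(xs[-1], theta) + v)
    return np.array(xs), np.array(ys)

rng = np.random.default_rng(0)
xs, ys = simulate(np.array([0.5, 0.0]), np.array([1.0]), 50, rng)
```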

The main objective of the dual state and parameter estimation problem is to approximate the following conditional expectations:

where denotes the available observations up to time , and are functions of states and parameters, respectively, that are to be estimated. The conditional probability functions and are to be approximated by the designed particle filters (PFs) through determining the filtering distributions according to

where the subscript in implies that the state/parameter conditional probability distributions are obtained from particles. Each state particle has a weight and each parameter particle has a weight , where denotes the Dirac-delta function mass that is positioned at or .
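The weighted-particle approximation described above can be sketched as follows; the Gaussian particle cloud, the weights, and the test functions are illustrative assumptions used only to show how a conditional expectation is estimated from a weighted Dirac-mass sum.

```python
import numpy as np

# E[phi(x_k) | y_{1:k}] is approximated by sum_i w_k^i * phi(x_k^i),
# with the weights w_k^i summing to one.
rng = np.random.default_rng(1)
N = 1000
particles = rng.normal(loc=2.0, scale=0.5, size=N)  # samples x_k^i
weights = np.full(N, 1.0 / N)                       # equal weights (post-resampling)

phi = lambda x: x            # choosing phi(x) = x gives the conditional mean
est_mean = np.sum(weights * phi(particles))

phi2 = lambda x: x ** 2      # second moment, yielding a variance estimate
est_var = np.sum(weights * phi2(particles)) - est_mean ** 2
```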

Based on the approximations used in equation , our goal is to address the convergence properties of the subsequently designed estimators to their true optimal estimates and also to develop and demonstrate under what conditions this convergence remains valid.

## 3 Proposed Dual State/Parameter Estimation and Fault Diagnosis Framework

In this section, the main theoretical framework for our proposed dual state/parameter filtering as well as the fault diagnosis methodology of the nonlinear system and are introduced and developed.

### 3.1 Dynamic Model in Presence of Time-Varying Parameters

Our first task is to represent the model and into another framework for our subsequent theoretical developments. Let denote the probability space on which the three real vector-valued stochastic processes , and are defined. The -dimensional process describes the evolution of the hidden states, the -dimensional process describes the evolution of the hidden system parameters that are conditionally independent of the states, and the -dimensional process denotes the observation process of the system.

The processes and are Markov processes with the associated initial state and parameter and , respectively. They are drawn from the initial distributions and , respectively. The dynamic evolution of states and parameters are modeled by the Markov transition kernels and , that also admit densities with respect to the Lebesgue measure

such that for all and , where and denote the Borel -algebra on and , respectively. The transition kernel is a probability distribution function (pdf) that follows the pdf of the stochastic process noise in the process equation . The probability density function for approximating the parameter transition kernel is provided in the subsequent subsections.

Given the states and parameters, the observations are conditionally independent and have the marginal distribution with a density with respect to the Lebesgue measure as given by,

where is a probability density function that follows the probability density function of the stochastic process in equation .

In the dual state/parameter estimation framework, first the state is estimated (which is denoted by ). The estimated value at time is then used to estimate the parameter at time (which is denoted by ). In the Bayesian framework for parameter estimation, the prior evolution of the parameters is not specified; therefore, it is necessary to consider a given evolution for the parameters in order to design an estimation filter. In our proposed dual structure, for the state estimation filter the parameters are first assumed to be constant at time at their estimated value , and then for the parameter estimation filter they are evolved to the next time instant by applying an update law that is based on the prediction error (PE) method. The details regarding our proposed methodology are presented in the subsequent subsections, in which the filtering of states and parameters is fully described and developed.

### 3.2 The Dual State/Parameter Estimation Framework

In our proposed dual state/parameter estimation framework, two filters are running concurrently. At every time step, the first PF-based state filter estimates the states by using the currently available estimate of the parameters, , whereas the second PF-based parameter filter estimates the unknown parameters by using the currently estimated states, . The developed scheme is shown schematically in Figure 1.

In our dual estimation framework, the well-known maximum a posteriori (MAP) solution corresponding to the marginal estimation methods based on the decoupled approach is used for solving the dual estimation problem [36]. In this method, the joint state/parameter marginal density is expressed as

where and denote the state and parameter marginal densities, respectively. Assuming that the variations of parameters are slow when compared to the system state time variations, one can use the approximation , so that the joint marginal density is approximated as
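In standard notation (the displayed equations were not recoverable in this copy of the text), the factorization and its slow-variation approximation used above read:

```latex
p(x_k, \theta_k \mid y_{1:k})
  \;=\; p(x_k \mid \theta_k, y_{1:k})\, p(\theta_k \mid y_{1:k})
  \;\approx\; p(x_k \mid \hat{\theta}_k, y_{1:k})\, p(\theta_k \mid y_{1:k}).
```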

Our ultimate goal is to maximize the two marginal distribution terms in expression separately according to the decoupled approach in [36] as follows

In the above decoupled methodology, one attribute is optimized at a time by keeping the other attribute fixed, and the roles are then alternated. Associated with the optimization of both marginal distributions, different cost functions can be chosen [36]. For developing a dual extended Kalman filter, various estimation methods corresponding to specific cost functions of the parameter marginal density have been proposed in the literature [36]. For example, the maximum likelihood (ML) and prediction error approaches are selected for marginal estimation. The main motivation for choosing these two approaches is that one seeks to maximize only the marginal density as opposed to the joint density . However, in order to maximize the parameter marginal density, it is also necessary to generate state estimates that are produced by maximizing the state marginal density .

It should be noted that in marginal estimation methods no explicit cost function is considered for maximization of the state marginal distribution, since the state estimation is only an implicit step in marginal approaches and the joint state/parameter cost is used, which may take a variety of forms in different filtering algorithms [36]. In our proposed dual particle filtering framework, is approximated by the state filtering distribution from equation . Next, the prediction error cost function is chosen for maximization of the parameter marginal density, where this cost function is implemented through a PE approach in order to attain a lower computational cost [30].

In the subsequent subsections, specific details regarding the concurrent state and parameter estimation filters design and development are provided.

### 3.3 The State Estimation Problem

For designing the state and parameter estimation filters, our main objectives are to approximate the integrals in equations and by invoking the particle filter (PF) scheme as well as to approximate the estimate of the conditional state and parameter distributions. Consider to denote the a priori state estimation distribution before the observation at time becomes available, and to denote the marginal distribution of the parameter at time . The a posteriori state distribution after the observation at the instant becomes available is obtained according to the following rule,

In the above it is assumed that is known for this filter. Therefore, the last distribution in the right hand side of equation is set to one.

The particle filter (PF) procedure for implementing the state estimation and for determining consists of two main steps, namely (a) the prediction step (time update), and (b) the measurement update step. Consider the state particles at time . The prediction step utilizes the knowledge of the previous distribution of the states as well as the previous parameter estimate; these are denoted by (corresponding to estimated state particles that follow the distribution ) and , respectively, as well as the process model given by equation . In other words, the prediction step is explicitly governed by the following equations for , namely

where denotes the process noise related to each particle and is drawn from the noise distribution with the probability distribution function , and denotes the independent samples generated from equation for particles. Moreover, denotes the independent samples of the predicted outputs that are evaluated at samples, and denotes the a priori state estimation covariance matrix.

For the first step, the one-step ahead prediction distribution known as the a priori state estimation distribution is now given by,

For the second step, the information on the present observation is used. This results in approximating , where is considered to be given from a parameter estimation filter and obtained from the distribution . Consequently, the particle weights are updated by the likelihood function (the importance function) according to , where denotes the probability distribution function of the additive noise of the output and is evaluated at .

In this work, since our ultimate goal is to develop a fault diagnosis algorithm that is practically stable, the structure of regularized particle filters (RPF) is chosen, which is quite common in many practical applications [38]. The characteristic of the RPF is that it transforms the discrete approximation of the a posteriori state estimation distribution into an absolutely continuous one. Consequently, the resampling step is modified in such a manner that the new resampled particles are obtained from an absolutely continuous distribution with different locations from that of [39]. Therefore, , where the probability of selecting the -th particle is , and for denotes the normalized particle weights. In other words, the particle selection in the resampling step is performed for particles that have higher probabilities of .

For the resampling step, two main choices can be considered that are known as (i) Bayesian bootstrap, and (ii) Sampling importance resampling (SIR) [4]. Although both approaches are applicable for this filter, the bootstrap method is chosen in this paper. Therefore, the a posteriori state estimation distribution is approximated by before one performs the resampling by using the RPF structure [39], and by after one performs the resampling that is provided below,

where denotes the regularized state vector that is evaluated at points that are obtained from the absolutely continuous distribution of the particles as given by

where denotes the first standard deviation of the particles from their mean. Hence, is obtained from the absolutely continuous distribution through the regularization kernel that is considered to be a symmetric density function on [39]. The matrix in equation is chosen to yield a unit covariance value in the new population and . The constant denotes the optimal bandwidth of the kernel, and denotes the a posteriori state estimation at time .
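The regularized resampling step described above can be sketched as follows. The Gaussian kernel, its standard optimal bandwidth formula, and the particle dimensions are common choices in the RPF literature and are assumptions here, not necessarily the exact kernel used in the paper.

```python
import numpy as np

def regularized_resample(particles, weights, rng):
    """Resample by weight, then jitter via a Gaussian regularization kernel."""
    N, d = particles.shape
    idx = rng.choice(N, size=N, p=weights)           # draw with replacement by weight
    resampled = particles[idx]
    # transform yielding unit covariance in the new population
    cov = np.cov(particles.T, aweights=weights).reshape(d, d)
    A = np.linalg.cholesky(cov + 1e-12 * np.eye(d))
    # standard optimal bandwidth for a Gaussian kernel
    h_opt = (4.0 / (N * (d + 2))) ** (1.0 / (d + 4))
    jitter = h_opt * rng.standard_normal((N, d)) @ A.T
    return resampled + jitter                        # equally weighted particles

rng = np.random.default_rng(2)
pts = rng.normal(0.0, 1.0, size=(500, 2))
w = np.full(500, 1.0 / 500)
new_pts = regularized_resample(pts, w, rng)
```

The jitter keeps the particle population diverse after resampling, which is the practical motivation for the RPF structure cited above.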

We are now in a position to introduce our overall particle filter (PF) scheme for implementing the state estimation filter. Our goal for proposing this algorithm is to ensure that an approximation to by takes , where denotes the a posteriori distribution of (after the resampling from ), that is given by . The estimated output from the state estimation filter is also given by .

1. Initialize the PF scheme with particles, and the parameters (the mean of the parameter initial distribution ).

2. Draw , where denotes a given distribution for the process noise in the filter, and then predict the state particles according to equation .

3. Compute from equation to obtain the importance weights as and normalize them to .

4. Resampling: Draw new particles with replacement for each , according to , from the regularized kernel where as given by equation .

5. Calculate from the conditional distribution that is given by equation ,
with equally weighted as .

6. Update the parameters from the parameter estimation filter (to be specified in the next subsection).

7. Set and go to Step .
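Steps 1 through 7 above can be sketched as follows for a toy scalar model. The transition and measurement maps, the fixed parameter estimate, and the plain multinomial resampling (used here in place of the regularized kernel step for brevity) are illustrative simplifications, not the paper's gas turbine model.

```python
import numpy as np

def state_filter_step(particles, theta_hat, y, q_std, r_std, rng):
    """One cycle of Steps 2-5 for the toy model x' = theta*x + w, y = x^2/2 + v."""
    N = particles.shape[0]
    # Step 2: draw process noise and predict the state particles
    pred = theta_hat * particles + q_std * rng.standard_normal(N)
    # Step 3: importance weights from the output likelihood, then normalize
    y_pred = pred ** 2 / 2.0
    logw = -0.5 * ((y - y_pred) / r_std) ** 2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    # Step 4: resample with replacement according to the weights
    resampled = pred[rng.choice(N, size=N, p=w)]
    # Step 5: equally weighted a posteriori state estimate
    return resampled, resampled.mean()

rng = np.random.default_rng(3)
particles = rng.normal(1.0, 0.1, size=2000)   # Step 1: initialization
x_true, theta_hat = 1.0, 0.9                  # theta_hat plays the role of Step 6
for _ in range(10):                           # Step 7: iterate over time
    x_true = theta_hat * x_true + 0.01 * rng.standard_normal()
    y = x_true ** 2 / 2.0 + 0.01 * rng.standard_normal()
    particles, x_hat = state_filter_step(particles, theta_hat, y, 0.01, 0.01, rng)
```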

Following the implementation of the above state estimation filter, the parameter estimation filter that is utilized for adjusting the parameters is now described in detail in the next subsection.

### 3.5 The Parameter Estimation Problem

One of the main contributions of this paper is to develop a novel PF-based parameter estimation filter within our proposed dual state/parameter estimation framework by utilizing the prediction error (PE) concept. In this methodology it is assumed that the a priori distribution of the time-varying parameters is not known. Moreover, the estimated states that are generated by the state estimation filter of the previous subsection will be used. Therefore, it is imperative to consider a dynamical model for the parameter evolution in order to estimate the density function .

The most common dynamical model that is considered for the parameter propagation (in the case of a system with constant parameters) is the conventional artificial evolution law. In this representation, small random disturbances are added to the parameter particles between each consecutive time step [29]. However, in our work, the conventional update law for the parameters is modified to include the output prediction error as a multiplicative term, which allows one to deal with time variations in the parameters that affect the system output.

In order to derive the parameter update law, an algorithm based on the prediction error (PE) method is proposed by minimizing the expectation of a quadratic performance index with respect to . This is due to the fact that our parameter estimation algorithm for obtaining the distribution of the a posteriori parameter estimate is based on the kernel smoothing that uses the shrinkage of the particle locations. This method attempts to force the particles towards their mean from the previous time step, i.e. the estimated value of , and is denoted by (before adding noise to the particles). This is also used in the state estimation filter for approximating . Therefore, our goal is to investigate the convergence properties of whose boundedness ensures the boundedness of . Towards this end, the performance index is now selected as where the integral is approximated in the PF by .

The term now represents a quadratic function of the output prediction error related to each particle . The prediction error is now defined according to , where denotes the particle related to the estimated value of the parameter whose true value is denoted by (this is clearly assumed to be unknown). Therefore, we define , in which the expectation is taken over the observation sequence of samples. Let us now select the quadratic criterion as

The following modified artificial evolution law is now proposed for the parameter update in the particle filters for generating parameter particles that correspondingly determine the distribution from which the a priori parameter estimate is considered to be the same as , and the a posteriori parameter estimate is obtained in two steps that are denoted by and , respectively.

In the first step (the second step is described in the next page) for one gets

where , which when evaluated at is denoted by , denotes the step size design parameter, denotes the zero-mean normal increment particles to the parameter update law at each time step with the covariance matrix through the use of the kernel smoothing concept, denotes the shrinkage matrix, and denotes the covariance of the parameter estimates in the previous time step . The kernel shrinkage algorithm attempts to force the parameter particles distribution towards its mean in the previous time instant that was denoted by , by applying the shrinkage coefficient matrix to the obtained .

The processes and are conditionally independent given the observations up to time . Moreover, where and denotes the -th element of the vector . The term denotes a time-varying coefficient that determines the updating direction, and is a positive scalar that ensures the criterion can be minimized by changing in the steepest descent direction. Therefore, the first-step estimate of the a posteriori parameter estimation particle is denoted by . The convergence of the update law - will be shown in Subsection 3.8.

The parameter update law according to - contains a term in addition to the independent normal increment . The estimated parameter from this update law is invoked in the PF-based parameter estimation filter to represent the distribution from which the parameter particle population for the next time step is chosen. Therefore, the above proposed prediction error based modified artificial evolution law enables the PF-based estimation algorithm to handle and cope with the time-varying parameter scenarios. The time-varying term acts as an adaptive step size in equations -, and therefore our algorithm can also be considered as an adaptive step size scheme.
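A minimal sketch of such a prediction-error-corrected artificial evolution update is given below. The shrinkage coefficient `a`, the step size `gamma`, the scalar output map, and its gradient are all illustrative assumptions rather than the paper's exact law; the structure shown is kernel shrinkage toward the previous mean, a steepest-descent correction on the squared prediction error, and a zero-mean normal increment.

```python
import numpy as np

def pe_parameter_update(theta_particles, y, y_hat_fn, grad_fn, gamma, a, rng):
    """One modified artificial-evolution step for a population of scalar particles."""
    N = theta_particles.shape[0]
    theta_bar = theta_particles.mean()
    # kernel shrinkage toward the previous mean (avoids over-dispersion)
    shrunk = a * theta_particles + (1.0 - a) * theta_bar
    # prediction-error gradient step: descend on (y - y_hat(theta))^2
    err = y - y_hat_fn(shrunk)
    corrected = shrunk + gamma * err * grad_fn(shrunk)
    # zero-mean normal increments with kernel-smoothing variance h^2 = 1 - a^2
    h2 = 1.0 - a ** 2
    var = np.var(theta_particles)
    return corrected + np.sqrt(h2 * var) * rng.standard_normal(N)

# toy example: output map y = 2*theta, true parameter value 1.0 (so y = 2.0)
rng = np.random.default_rng(4)
theta = rng.normal(0.5, 0.2, size=1000)
for _ in range(50):
    theta = pe_parameter_update(theta, 2.0, lambda t: 2.0 * t,
                                lambda t: 2.0 * np.ones_like(t),
                                gamma=0.05, a=0.95, rng=rng)
theta_hat = theta.mean()
```

Note how the prediction-error term pulls the particle population toward the parameter value that explains the observation, which is what allows the scheme to track time-varying parameters.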

In order to ensure that the obtained from the modified artificial evolution law given by equations - remains in (refer to Assumption A1), the following projection algorithm is utilized that forces to remain inside according to the following procedure [30],

1. Choose a factor ,

2. Compute ,

3. Construct ,

4. If go to Step , else go to Step ,

5. Set , and go to Step 3,

6. Stop.
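The projection procedure above can be sketched as follows, assuming for illustration that the compact set is a simple box and that the contraction factor `mu` plays the role of the factor chosen in Step 1; both are hypothetical choices.

```python
import numpy as np

def project(theta_new, theta_prev, lo, hi, mu=0.5, max_iter=100):
    """Contract a candidate parameter toward the previous feasible estimate
    until it lies inside the box [lo, hi] (a stand-in for the compact set)."""
    candidate = theta_new.copy()
    for _ in range(max_iter):
        if np.all(candidate >= lo) and np.all(candidate <= hi):
            return candidate                  # Step 6: inside the set, stop
        # Steps 3-5: contract toward the previous feasible point
        candidate = theta_prev + mu * (candidate - theta_prev)
    return theta_prev                         # fall back to the known feasible point

theta_prev = np.array([0.5])
theta_proj = project(np.array([3.0]), theta_prev, lo=0.0, hi=1.0)
```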

It should be noted that the main reason for considering the above mapping is related to the fact that the actual dynamics of the parameters are not known; therefore, such a mapping ensures that the dynamics assumed for the parameters based on the modified artificial evolution model do not cause instability of the entire system.

Consequently, the a priori distribution of the parameter is assumed to have the same distribution as in the previous time step. On the other hand, as the present observation becomes available in the measurement update step, the a posteriori distribution of the parameter is obtained through two steps that are denoted by and , respectively. In what follows, more details related to these distributions are presented.

Consider equations -. The a posteriori distribution of the parameters calculated from the distribution of the parameter particles is given by,

and the measurement equation is expressed as,

where denotes the evaluated output that is obtained by the parameter estimation filter, which is different from the one that is obtained by the state estimation filter, as provided in Subsection 3.3.

Now, in the second step for estimating the a posteriori parameter estimate distribution, consider the present observation , so that the particle weights are updated by the likelihood function according to . This can now be expressed by using the normalized weights as , where . Following the resampling/selection step, an equally weighted particle distribution is obtained as for approximating , and the resampled (selected) particles that are denoted by follow the distribution . Therefore, the a posteriori parameter estimation distribution is approximated by a weighted sum of the Dirac-delta masses as before one performs the resampling and with an equally weighted particle distribution approximation as according to

where denotes the normalized parameter particle weight, is obtained from the resampling/selection step of the scheme by duplicating the particles having large weights and discarding the ones with small values to emphasize the zones with higher a posteriori probabilities according to . In our proposed filter the residual resampling method is used to ensure that the variance reduction among the resampled particles is guaranteed [40].
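Residual resampling as referenced here can be sketched as follows; the particle count and the weight vector are illustrative. Each particle is copied a deterministic number of times proportional to its weight, and only the remaining slots are filled randomly, which is why the variance reduction relative to plain multinomial resampling is guaranteed.

```python
import numpy as np

def residual_resample(weights, rng):
    """Return indices of selected particles under residual resampling."""
    N = weights.shape[0]
    counts = np.floor(N * weights).astype(int)     # deterministic copies
    residual = N * weights - counts
    n_rest = N - counts.sum()
    if n_rest > 0:
        residual /= residual.sum()                 # multinomial draw on residuals
        extra = rng.choice(N, size=n_rest, p=residual)
        counts += np.bincount(extra, minlength=N)
    return np.repeat(np.arange(N), counts)

rng = np.random.default_rng(5)
w = np.array([0.5, 0.3, 0.15, 0.05])
idx = residual_resample(w, rng)
```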

Therefore, an approximation to by takes on the form , where denotes the a posteriori distribution of the parameter estimate (after performing the resampling from ). The resulting estimated output of this filter is obtained by . The explicit details for implementation of the parameter estimation filter are now provided below.

The particle filter for implementation of the parameter estimation is described as follows:

1. Initialize the particles for the parameters as , and use the initial values of the states as that represents the mean of the states initial distribution .

2. Draw .

3. Predict from equations - with the projection algorithm.

4. Compute the importance weights , and normalize them to .

5. Resampling: Draw new particles with replacement for each , , where