Probabilistic Control for Uncertain Systems
In this paper a new framework has been applied to the design of controllers which encompasses nonlinearity, hysteresis and arbitrary density functions of forward models and inverse controllers. Using mixture density networks, the probabilistic models of both the forward and inverse dynamics are estimated such that they are dependent on the state and the control input. The optimal control strategy is then derived which minimizes uncertainty of the closed loop system. In the absence of reliable plant models, the proposed control algorithm incorporates uncertainties in model parameters, observations, and latent processes. The local stability of the closed loop system has been established. The efficacy of the control algorithm is demonstrated on two nonlinear stochastic control examples with additive and multiplicative noise.
1 Introduction
The conventional theory of stochastic control is particularly suitable for taking into account randomly varying system parameters and designing probabilistic control strategies under uncertain working conditions. However, the functional equations describing its solution are mostly computationally infeasible and are solved approximately, for example by assuming the certainty equivalence principle and ignoring model uncertainty [1, 2, 3].
Machine learning has been proposed in  to model the conditional distribution of the system dynamics. The application of these methods to control systems is still in its infancy, and so far only the Gaussian distribution of the process noise dynamics has been estimated [4, 5]. Nevertheless, these methods have proven promising in addressing a number of key weaknesses in traditional control approaches that are based on deterministic models, ignore model uncertainty, or treat uncertainty as a nuisance parameter. A key objective of this paper is to extend these methods of estimating the conditional distributions of the system dynamics so that they can approximate the general distributions of dynamical stochastic and deterministic nonlinear systems and exploit them in control strategies. This will be accomplished by using the mixture density network (MDN) from the neural network field . In its original structure the MDN was proposed to model the probability density functions of stochastic static models. In previous work , we extended the idea of the MDN so that the probability density functions of stochastic dynamic models can be estimated as well. The developed theory was then used to estimate the conditional distribution of an ill-posed inverse controller for which a unique solution cannot be found. In this work the MDN will be used to estimate the general distributions of system models which are not constrained by the Gaussian assumption. Although the idea of a mixture density network is not new [8, 9, 6, 10], to our knowledge no results on using this neural network have been reported in the control literature.
Several attempts at deriving control strategies for stochastic uncertain systems have been discussed in the literature [11, 12, 13, 14, 15, 16]. However, most of the current methods [13, 14, 15, 16] are devoted to linear uncertain systems and largely do not consider functional uncertainties that result from unknown latent dynamics and hysteresis characteristics of the process. The closed loop entropy has been proposed in  to characterize the uncertainty of the tracking error for general nonlinear and non–Gaussian stochastic systems. However, the probability density function (pdf) of the tracking error in  is measured by assuming a known pdf of the random input that affects the dynamics of the system. In other words, the system output is assumed to be invertible with respect to the noise. This is a crude assumption for many general practical systems, since it is often difficult to measure the pdf of arbitrary random inputs. It also implicitly assumes the existence of an accurate model that describes the system dynamics, and consequently ignores model uncertainty.
In this paper a new framework is developed which does not assume the invertibility of the system output with respect to the noise, or even the invertibility of the control input with respect to the system output. It also does not assume the existence of accurate models that describe the forward and inverse dynamics of the system. The new framework is based on estimating the probabilistic models of the tracking error and the inverse controller from process data. These probabilistic models are not constrained by Gaussian assumptions. They are state and control input dependent and are estimated using an MDN. The control input is then designed to minimize the uncertainty of the closed loop system. Our framework provides a theoretical but practically implementable control mechanism that leads to a more robust and efficient control strategy under highly complex working conditions. The innovation of this paper stems from accepting uncertainty as fundamental to the understanding of the control problem, placing it at the heart of a probabilistic generative view of control. This is counter to the traditional view of control in which, when uncertainty is considered at all, it is simply modeled and taken to be a separable nuisance aspect of the more basic deterministic control strategy. Moreover, the method developed in this paper is suitable for controlling multi–modal control systems, where a unique solution does not exist.
2 Statement of the problem
The general stochastic control problem is considered for systems with input , a measurable state vector , and future values of the system output, , affected by a random force . The system behavior can be represented by the following nonlinear ARMAX stochastic model
where is the relative degree of the system, and is an unknown nonlinear function that represents the system dynamic. In general need not be invertible with respect to the random input, . Besides, there is no assumption made on whether has a known pdf or is an independent and identically distributed random process. Because of the existence of the random force, only the probability distribution of the future output values, can be specified from the state and control at each instant of time, . The aim of control is subject to some constraints such as a finite energy budget,
where is the stochastic model of the tracking error and is obtained by subtracting the desired output from the system stochastic function . Hence also need not be invertible with respect to . Moreover, the tracking error distribution is not necessarily Gaussian because is a general stochastic noise. The density of can be obtained from the density of as
In this formulation the value of the variable parameterizes and affects the distribution of the system output and consequently the distribution of the tracking error. For the system to perform well in practice, the controller should be designed such that the pdf of the tracking error is made as narrow as possible. A narrow distribution indicates that the uncertainty of the tracking error is small which also corresponds to a small variance. In addition, the mean value of the tracking error and control energy should also be minimized. For this purpose, the performance index is set to be,
where and are constant weights for the variance, mean, and control input respectively.
Remark 1: To reemphasize, the pdf of the tracking error need not be invertible with respect to the random force , but is invertible with respect to the system output as specified by ( ‣ 2). Therefore, the output pdf and the tracking error pdf will be used interchangeably in this article. They share the same parameters, except that the mean of the tracking error pdf differs from the mean of the output pdf by the desired output. This is discussed further in Section 2.1.
Since the pdf of the system output is not constrained by the Gaussian assumption, a sum of squares or cross entropy error function for estimating the system output is not expected to yield satisfactory results. For this purpose a new class of network models obtained by combining a conventional neural network with a mixture of Gaussians is proposed in this paper to estimate the conditional probability distribution of the system output from process data. It is called a mixture density network and can in principle represent arbitrary conditional probability distributions in the same way that a conventional neural network can represent arbitrary functions. The control objective is then to make the system output track the specified desired output, so that the tracking error stays within a small neighborhood of zero, while ensuring local stability of the system output. The problem therefore arises as to how inverse controllers are acquired from the general distribution of the system output. Such learning must be able to divide the control into appropriate regions which can be recombined to generate the system behavior. In this paper we propose a novel framework which can solve the learning of the general distributions of the system output and inverse controller in a computationally coherent manner from a single principle. The basic idea of the proposed framework is that a mixture of Gaussians exists to control the system, and each is augmented with a corresponding Gaussian of the system output. The framework consists of a model pool of couples of MDNs of inverse controllers and system models. Each couple evaluates a number of probable control signals, and the couple generating the most suitable control signal is used to control the system. This framework is especially useful for estimating the general distributions of systems from process data. As will be demonstrated shortly, it is efficient for controlling complex systems with large uncertainties which are not constrained by the Gaussian assumption. Besides, it gives superior results for controlling systems characterized by hysteresis and multi–modality.
2.1 Mixture density networks and tracking error distribution
Mixture density networks have been employed in many system identification and inverse applications [18, 8] and have also been shown to provide a general framework for approximating the conditional distribution of the inverse controller where multi–modality and hysteresis play critical roles . In a mixture density network, the probability density of the system output is given by a linear combination of kernel functions in the form
where is the number of kernels in the mixture. The parameters are the prior probabilities of having been generated from the component of the mixture. The functions represent the conditional density of for the kernel. Various choices of the kernel functions are possible , however, in this article Gaussian kernel functions are considered,
where , and represent the centre and the variance respectively of the kernel. Note that the prior probabilities, the centre, and the variances of the output pdf are taken to be continuous functions of the input variables . These functions are estimated as the outputs of a feed–forward neural network that takes as input. They represent the set of parameters which govern the system output distribution, and are denoted by in Equation ( ‣ 2.1). This combination of a density model and a feed–forward neural network is represented schematically in Figure 1, a. Gaussian kernels as specified by Equation ( ‣ 2.1) can approximate any given density function to arbitrary accuracy.
Using as a target, the mixture density network is then trained to minimize the negative logarithm of the probability density function of the system output by using back-propagation
where is the centre of the pdf of the tracking error which is obtained by subtracting the desired output value from the centre of the output pdf.
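The training criterion and the shift from output pdf to tracking error pdf can be sketched as follows. This is a minimal Python illustration, assuming scalar outputs and assuming that the mixture parameters for each sample have already been produced by the network; the function names and the data layout are hypothetical.

```python
import math


def mdn_negative_log_likelihood(targets, mixture_params):
    """Sum of -log p(y) over samples.

    Each entry of mixture_params is a tuple (priors, centres, variances)
    holding the mixture parameters predicted for the matching target.
    """
    total = 0.0
    for y, (priors, centres, variances) in zip(targets, mixture_params):
        density = sum(
            a * math.exp(-((y - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
            for a, m, v in zip(priors, centres, variances)
        )
        total -= math.log(density)
    return total


def tracking_error_centres(output_centres, desired_output):
    """Shift each output centre by the desired output.

    The priors and variances of the tracking error pdf are those of the output pdf.
    """
    return [m - desired_output for m in output_centres]
```

For targets far from every centre the density can underflow to zero, so a practical implementation would usually evaluate the logarithm with a log-sum-exp formulation.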
2.2 Mixture density networks and the distribution of the inverse controller
Based on the ability of multiple Gaussians to describe the density function of the system output, we suggest that for each behavior captured by a kernel function a control strategy should be learned; in other words, a paired inverse kernel function should be designed. As such, the probability density function of the inverse controller can be generated as the sum of the outputs of these inverse kernels, weighted by the prior probabilities of the output pdf,
where represent the conditional density of for the kernel.
Remark 2: Note that in Equation ( ‣ 2.2) the prior probabilities of the inverse controller are fixed to those obtained from the output pdf. The aim is that each inverse kernel learns to provide a suitable control signal under the context for which its paired kernel of the output pdf most likely produces the output value.
The kernel functions of the inverse controller are again taken to be Gaussian kernel functions,
where , and represent the centre and the variance respectively of the kernel of the inverse controller. Similar to the output pdf, the centre, and the variances of the inverse controller pdf are continuous functions of the input variables . The centre and the variances functions are estimated as the outputs of a feed–forward neural network that takes as input. This combination of a density model and a feed–forward neural network is represented schematically in Figure 1, b. The centre, variances and priors of the inverse controller MDN represent the set of parameters which govern the inverse controller distribution, and are denoted by in Equation ( ‣ 2.2). Here the target of the mixture density network is the optimal control input as calculated in Section 2.3, Equation ( ‣ 2.3) or its linearized form specified in Equation ( ‣ 2.3). The mixture density network is then trained to minimize the negative logarithm of the probability density function of the control input by using back-propagation.
To emphasize, the output pdf takes the control signal and the state values as inputs and estimates the conditional density function of the system output. The controller pdf takes the desired output of the system and the state values as inputs and estimates the conditional density function of the control input. The prior probabilities of the inverse controller are fixed to the corresponding probabilities from the output pdf. Conceptually, if the output of the system is most likely produced by one of the kernels, its corresponding inverse kernel receives the major part of the error signal and its output contributes significantly to the conditional density of the control input. Fixing the prior probabilities of the inverse controller to those obtained from the output pdf is realistic in practice: it ensures that the output pdf and its inverse controller counterpart are tightly coupled through both the training and control phases. One can still choose to have independent priors for the inverse controller.
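The coupling between the two networks can be made concrete with a short Python sketch. It assumes a scalar control input, and the function and argument names are hypothetical. The only point it illustrates is that the controller density reuses the priors of the output pdf, while the centres and variances come from the inverse controller network.

```python
import math


def controller_density(u, forward_priors, inverse_centres, inverse_variances):
    """Density of the control input u.

    The weights are the priors estimated by the output (forward) MDN; only the
    centres and variances belong to the inverse controller MDN.
    """
    return sum(
        a * math.exp(-((u - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
        for a, m, v in zip(forward_priors, inverse_centres, inverse_variances)
    )
```

With priors of 0.9 and 0.1, for example, the density near the first inverse kernel's centre dominates, which mirrors the statement that the kernel most likely to have produced the output contributes most to the control density.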
A key advantage of using the MDN is its ability to represent arbitrary conditional probability distributions in the same way a conventional neural network can represent arbitrary functions. Moreover, because the parameters of the MDNs are optimized so as to minimize the negative log-likelihood of the probability density functions ( ‣ 2.1) and ( ‣ 2.2), a complete description of the estimated output can be obtained. This is of particular interest in control problems in which the mapping to be learned is multi–valued  as often arises in the solution of inverse control problems and robotics applications [19, 20] and in hysteretic nonlinear systems [21, 22]. The above framework provides a new generic modeling and control method appropriate for active control of complex uncertain systems, seeking stochastically optimal control strategies in systems which exhibit nonlinearity, hysteresis, multimodality, randomness and uncertainty.
2.3 Control Algorithm Design
The development so far assumes no prior information about known pdfs of the system output or the inverse controller. All pdfs required are estimated using MDNs from process data. The parameters of those conditional distributions (means, variances and priors) are continuous functions of the input variables. This allows the development of a pragmatic method for estimating and incorporating functional uncertainties in deriving the optimal control law.
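Because the tracking error pdf is a Gaussian mixture, the variance and mean that the performance function weighs are available in closed form. The Python sketch below computes both moments and combines them with the control effort. The moment formulas are standard for a Gaussian mixture; the quadratic form of the index, the weight names `w_var`, `w_mean`, `w_u` and their default values are assumptions made for illustration and are not taken from the performance function defined in Section 2.

```python
def mixture_moments(priors, centres, variances):
    """Mean and variance of a one-dimensional Gaussian mixture."""
    mean = sum(a * m for a, m in zip(priors, centres))
    second_moment = sum(
        a * (v + m * m) for a, m, v in zip(priors, centres, variances)
    )
    return mean, second_moment - mean * mean


def performance_index(priors, centres, variances, u,
                      w_var=1.0, w_mean=1.0, w_u=0.1):
    """Assumed quadratic cost: error variance, squared error mean, control energy."""
    mean, var = mixture_moments(priors, centres, variances)
    return w_var * var + w_mean * mean ** 2 + w_u * u ** 2
```

Note that the mixture variance includes the spread between the kernel centres, so a bimodal error pdf is penalized even when each kernel is narrow.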
The probabilistic control problem is a nonlinear optimization problem that can be solved by setting the derivative of the performance function ( ‣ 2) with respect to the control signal equal to zero,
where . The solution to ( ‣ 2.3) cannot be analytically obtained due to the nonlinear nature of the parameters of the tracking error pdf. As such, this solution can only guarantee the search for local minima. An alternative approach for solving ( ‣ 2.3) analytically, would be to formulate a recursive algorithm for the control input .
Note that is continuous because the parameters of the tracking error pdf are continuous and first order differentiable with respect to . Applying the first backward difference operator, to ( ‣ 2.3) yields the following recursive formula for
The increment of can be approximated from the first order Taylor expansion of around the previous operating point :
Several methods for calculating the output of the mixture density network have been proposed in the literature. In multimodal control problems, for example, the distribution of the system output consists of a limited number of distinct branches. In this case one specific branch of the conditional density estimated by the MDN typically needs to be selected. Two examples of how to select a specific branch are the most likely and the most probable output values. In this paper the most probable output value, corresponding to the most probable branch, will be used. Since each component of the mixture model is normalized, the most probable branch is the one with the largest prior probability, and the centre of the corresponding kernel is taken as the output value.
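Selecting the most probable branch is a one-line operation once the mixture parameters are available. The Python sketch below assumes that the priors and centres have already been evaluated for the current input; the function name is hypothetical, and taking the centre of the selected kernel as the output value follows the usual MDN convention.

```python
def most_probable_value(priors, centres):
    """Return the centre of the kernel with the largest prior, and its index."""
    index = max(range(len(priors)), key=lambda i: priors[i])
    return centres[index], index
```

For priors `[0.2, 0.7, 0.1]` and centres `[-1.0, 0.5, 2.0]` the second kernel is selected, so the function returns `(0.5, 1)`.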
To summarize, the following algorithm can be readily obtained:
1. Update the parameters of the MDN that estimates the conditional distribution of the inverse controller.
2. Calculate the control input from the controller MDN.
3. Apply the control input to the stochastic system (2).
4. Update the parameters of the MDN that estimates the output probability density function.
5. Obtain the density function of the tracking error from ( ‣ 2.1).
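The ordering of these steps within one sampling instant can be sketched as a simple loop. The Python code below is a structural illustration only: the `update`, `control` and `error_centre` interfaces are hypothetical, and the two toy classes are deterministic single-gain stand-ins used so that the loop runs, not mixture density networks. The static plant `y = 2u` is likewise an assumption.

```python
class ToyForwardModel:
    """Stand-in for the output MDN: a single kernel whose centre is gain * u."""

    def __init__(self):
        self.gain = 0.5

    def update(self, x, u, y):
        if u != 0.0:
            self.gain += 0.5 * (y / u - self.gain)  # crude gain estimate

    def error_centre(self, x, u, y_des):
        return self.gain * u - y_des


class ToyController:
    """Stand-in for the controller MDN: inverts the current forward gain."""

    def __init__(self, forward_model):
        self.forward_model = forward_model
        self.inverse_gain = 1.0

    def update(self, x, y_des):
        self.inverse_gain = 1.0 / self.forward_model.gain

    def control(self, x, y_des):
        return self.inverse_gain * y_des


def run_control_loop(plant, controller, forward_model, reference, x0, steps):
    """Run the five steps once per sample and collect the error centres."""
    x, errors = x0, []
    for t in range(steps):
        y_des = reference(t)
        controller.update(x, y_des)        # step 1: adapt the controller model
        u = controller.control(x, y_des)   # step 2: compute the control input
        y = plant(x, u)                    # step 3: apply it to the system
        forward_model.update(x, u, y)      # step 4: adapt the output model
        errors.append(forward_model.error_centre(x, u, y_des))  # step 5
        x = y
    return errors


forward = ToyForwardModel()
errors = run_control_loop(
    plant=lambda x, u: 2.0 * u,
    controller=ToyController(forward),
    forward_model=forward,
    reference=lambda t: 1.0,
    x0=0.0,
    steps=20,
)
```

In this toy setting the estimated gain converges to the plant gain, so the recorded error centres shrink toward zero over the run.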
This control algorithm can be implemented on-line and still performs well, since knowledge of uncertainty is taken into consideration. The forward model of the plant to be controlled and the controller can both be adapted on-line, which makes the speed requirement stringent. However, since the on-line optimisation is kept local, fast convergence of the networks is expected. Moreover, the above control algorithm provides a general solution for stochastic systems subject to arbitrary random inputs with unknown pdfs. Indeed it is a general solution for stochastic and deterministic systems characterized by functional uncertainty. No assumptions are made about the invertibility of the system output with respect to the random inputs or the invertibility of the controller with respect to the system output. Thus, this control formulation can be applied to many general practical stochastic non–Gaussian systems characterized by hysteresis and high levels of complexity and uncertainty. The advantage of the proposed method over other recently developed stochastic control methods centres on the acceptance of ‘noise’ as ‘intrinsic uncertainty’, which often cannot be ignored and is key to solving the control problem.
3 Local Stability Analysis
There exists a set of parameters of the tracking error distribution such that . Here denotes the Kullback-Leibler divergence, denotes the true or ideal distribution of the tracking error, and is a positive constant which represents the estimation error of the tracking error pdf.
There exists a set of parameters of the controller distribution such that . Here denotes the true or ideal distribution of the control input, and is a positive number which represents the estimation error of the pdf of the controller.
Proof: To prove the local stability of the closed loop system, (2) is linearized to give
where , and . Note also that without loss of generality we assumed that in order to simplify notation in the following discussion. Applying the delay operator, to both sides of ( ‣ 3), yields
Given assumption and that is selected such that,
yields that the control input as calculated from ( ‣ 2.3) is bounded. This together with assumption allows generating the following stochastic model for the control input
where denotes the control input as estimated from the controller MDN. In other words, the error as a result of estimating the control signal from the controller MDN, is bounded. From ( ‣ 2.3) and ( ‣ 3),
where we have defined the following row vector
Then, can be expressed as
Solving ( ‣ 3) for , yields
and where and are two polynomials that are related to the system and control structure. Define , then the linearized closed loop system of ( ‣ 3) can be rewritten as,
Assumption and the fact that and are bounded imply that is bounded. This means that the closed loop system is stable if we can guarantee that is bounded. This can be checked by obtaining the state space representation of . For that purpose define,
where . Then using the direct programming method, the state space representation of is given by,
From ( ‣ 3), it can be seen that the condition for the local stability of the closed loop system is given by
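For a linearized discrete-time closed loop, boundedness is commonly checked by confirming that every root of the characteristic polynomial lies strictly inside the unit circle. The Python sketch below implements the Schur-Cohn step-down test for a real polynomial. It is a generic check offered for illustration, not a restatement of the specific condition derived above, and the polynomial coefficients in any call are hypothetical.

```python
def roots_inside_unit_circle(coefficients):
    """Schur-Cohn step-down test for a real polynomial.

    coefficients are ordered from the highest power down to the constant term,
    and the leading coefficient must be non-zero. Returns True only if every
    root has magnitude strictly below one.
    """
    a = [float(c) for c in coefficients]
    while len(a) > 1:
        k = a[-1] / a[0]  # reflection coefficient of the current polynomial
        if abs(k) >= 1.0:
            return False
        n = len(a) - 1
        # Reduce the degree by one; stability is preserved by this step.
        a = [(a[i] - k * a[n - i]) / (1.0 - k * k) for i in range(n)]
    return True
```

For example, `z^2 - 0.5z + 0.06` has roots 0.2 and 0.3 and passes the test, whereas `z^2 - 1.6z - 0.8` has a root at 2 and fails it.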
4 Numerical Simulations and Results
The advantages of the proposed probabilistic control approach are evaluated on two stochastic control test examples with additive and multiplicative non–Gaussian random noise. In this section we compare the proposed probabilistic control method with the conventional indirect adaptive control approach. These results also demonstrate that the inclusion of model uncertainty significantly enhances the performance of control systems for the two nonlinear stochastic control examples.
4.1 Example 1
A nonlinear stochastic control problem is considered here to test the effectiveness of the proposed probabilistic control approach. The nonlinear stochastic dynamical system is described by the following equation
where denotes a noise sequence sampled from a mixture of Gaussians with the following mean, and covariance matrix, ,
The following reference model with input output pairs represents the desired output behavior at time ,
For comparison purposes two experiments were conducted. In the first experiment the dynamical behavior of the system was estimated using a standard multi layer perceptron neural network (MLPNN) and the controller was estimated using the classical indirect adaptive control method such that the following error is minimized,
Here the inverse controller was a standard MLPNN as well. In the second experiment, the conditional distribution of the tracking error was estimated using the MDN, and the controller was derived from this distribution and estimated using another MDN as discussed in Section 2. In both experiments the scaled conjugate gradient method was used to update the network parameters. For a fair comparison, both the MDN and the standard neural networks were subjected to the same noise sequence, reference input, and weights and for the variance, mean, and control input respectively. A rough initialization for the parameters of the standard MLP and MDNs in both experiments was obtained using an off-line training method. The optimal control law from the standard MLP control model and the most probable control from the density network were calculated and forwarded to the system. The performance of the two controllers is shown in Figure 2. This figure shows the system output resulting from the MDN control superimposed on the desired model output over the whole control range. Clearly, the system output under MDN control tracks the desired output adequately, whereas the standard NN output struggles to track it. This shows that although standard NNs are normally suitable for deriving the optimal control law and achieving good tracking performance, they are inadequate for stochastic systems affected by general random inputs. Looking at the tracking error of the two methods, it can be seen that on average the tracking error of the MDN controller is zero, whereas the average tracking error of the standard NN drifts away from zero.
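The non-Gaussian noise sequences used in these examples are drawn from a mixture of Gaussians. A minimal Python sketch of such a sampler is given below, assuming scalar noise. The function name, the fixed seed and every numerical value in the call are placeholders chosen for illustration; they are not the mixture parameters used in the experiments.

```python
import random


def sample_mixture_noise(n, priors, means, std_devs, seed=0):
    """Draw n scalar samples from a Gaussian mixture.

    A component is chosen according to the priors, then a Gaussian sample is
    drawn from that component. A fixed seed makes the sequence repeatable, so
    that competing controllers can be driven by the same noise.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        i = rng.choices(range(len(priors)), weights=priors)[0]
        samples.append(rng.gauss(means[i], std_devs[i]))
    return samples


# Placeholder bimodal noise: two equally likely, well separated components.
noise = sample_mixture_noise(20000, [0.5, 0.5], [-1.0, 1.0], [0.1, 0.1])
```

Reusing the same seed reproduces the sequence exactly, which is one simple way to subject both controllers to identical noise as described above.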
4.2 Example 2
In this section we present a comparison of the proposed probabilistic control approach with the indirect adaptive control approach on a stochastic nonlinear control problem with multiplicative noise. The stochastic nonlinear dynamical system is described by the following difference equation:
where denotes a noise sequence sampled from a mixture of Gaussians with the following mean, and covariance matrix, :
The particular system considered here was used in , but with an additive Gaussian white noise sequence rather than multiplicative non–Gaussian noise. The following reference model with input output pairs represents the desired output behavior at time
Two experiments were conducted. In the first experiment two standard multi layer perceptron NNs were used to estimate the forward dynamics and the inverse controller of the system. In the second experiment, two MDNs were used to provide estimates of the conditional distributions of the tracking error and the inverse controller. Here also the scaled conjugate gradient method was used to update the network parameters. Both the MDN and the standard NNs were subjected to the same noise sequence, reference input, and weights and for the variance, mean, and control input respectively. A rough initialization for the parameters of the standard MLP and MDNs in both experiments was obtained using off-line training methods. The optimal control law from the standard MLP control model and the most probable control from the density network were calculated and forwarded to the system. The performance of the two controllers is shown in Figure 3. The result of this example is consistent with that obtained in the first example. The system output obtained from the MDN controller is superimposed on the desired model output over the whole control range, and the tracking error is on average equal to zero. However, the standard neural network output struggles to track the desired output and its average tracking error drifts away from zero.
5 Conclusion
In this paper a new framework has been applied to the design of controllers which encompasses uncertainty, multimodality, hysteresis, and arbitrary density functions of forward models and inverse controllers. It is for the general class of stochastic nonlinear control problems, where the dynamics are nonlinear functions of the control and the state. The proposed framework considers functional uncertainty by estimating the probabilistic models of the system dynamics and the controller that are dependent on the state and the control input. Using mixture density networks from the neural network field, a control input is formulated which minimizes uncertainty of the closed loop system. Furthermore, the local stability condition of the closed loop system has been established. Global stability and closed-loop performance are topics of future research.
In contrast to traditional control methods, the derived controller in this paper is not constrained by the probability density function of the random input that affects the dynamics of the system. No assumptions are made about the invertibility of the system output with respect to the random input or even the invertibility of the control input with respect to the system output. Simulation examples with additive and multiplicative random inputs are used to illustrate the proposed controller and encouraging results have been obtained.
Acknowledgment: This work has been carried out during sabbatical leave granted to the author Randa Herzallah from Al-Balqa’ Applied University (BAU) during the academic year
References
- [1] Åström, K. J., and Wittenmark, B., 1989. Adaptive Control. Addison-Wesley, Reading, MA, U.S.A.
- [2] Narendra, K. S., and Mukhopadhyay, S., 1994. “Adaptive control of nonlinear multivariable systems using neural networks”. Neural Networks, 7(5), pp. 737–752.
- [3] Yaz, E., 1986. “Certainty equivalent control of stochastic systems: Stability property”. IEEE Transactions on Automatic Control, 31, Feb, pp. 178–180.
- [4] Herzallah, R., and Lowe, D., 2008. “A Bayesian perspective on stochastic neuro control”. IEEE Transactions on Neural Networks, 19(5), May, pp. 914–924.
- [5] Herzallah, R., 2007. “Adaptive critic methods for stochastic systems with input-dependent noise”. Automatica, 43(8), pp. 1355–1362.
- [6] Bishop, C. M., 1995. Neural Networks for Pattern Recognition. Oxford University Press, New York, N.Y.
- [7] Herzallah, R., and Lowe, D., 2004. “A mixture density network approach to modelling and exploiting uncertainty in nonlinear control problems”. Engineering Applications of Artificial Intelligence, 17, pp. 145–158.
- [8] Evans, D. J., Nabney, I. T., and Cornford, D., 2000. “Structured neural network modelling of multi-valued functions for wind vector retrieval from satellite scatterometer measurements”. Neurocomputing, 30, pp. 23–30.
- [9] Richmond, K., King, S., and Taylor, P., 2003. “Modelling the uncertainty in recovering articulation from acoustics”. Computer Speech and Language, 17, pp. 153–172.
- [10] Herzallah, R., and Lowe, D., 2003. “Multi-valued control problems and mixture density network”. In IFAC International Conference on Intelligent Control Systems and Signal Processing, ICONS, Vol. 2, pp. 387–392.
- [11] Hayakawa, T., Ishii, H., and Tsumura, K., 2009. “Adaptive quantized control for nonlinear uncertain systems”. Systems and Control Letters, 58, Sept, pp. 625–632.
- [12] Pin, G., Raimondo, D. M., Magni, L., and Parisini, T., 2009. “Robust model predictive control of nonlinear systems with bounded and state-dependent uncertainties”. IEEE Transactions on Automatic Control, 54, July, pp. 1681–1687.
- [13] Primbs, J. A., and Sung, C. H., 2009. “Stochastic receding horizon control of constrained linear systems with state and control multiplicative noise”. IEEE Transactions on Automatic Control, 54, Feb, pp. 221–230.
- [14] Mirkin, B., and Gutman, P. O., 2008. “Robust output-feedback model reference adaptive control of SISO plants with multiple uncertain, time-varying state delays”. IEEE Transactions on Automatic Control, 53, Nov, pp. 2414–2419.
- [15] Petersen, I. R., 2008. “A Kalman decomposition for robustly unobservable uncertain linear systems”. Systems and Control Letters, 57, Oct, pp. 800–804.
- [16] Zhang, Z., and Serrani, A., 2009. “Adaptive robust output regulation of uncertain linear periodic systems”. IEEE Transactions on Automatic Control, 54, Feb, pp. 266–278.
- [17] Yue, H., and Wang, H., 2003. “Minimum entropy control of closed-loop tracking errors for dynamic stochastic systems”. IEEE Transactions on Automatic Control, 48, Jan, pp. 118–122.
- [18] Kravchenko, A. N., 2009. “Neural network method to solve inverse problems for canopy radiative transfer models”. Cybernetics and Systems Analysis, 45, pp. 477–503.
- [19] White, D. A., and Sofge, D., eds., 1992. Handbook of Intelligent Control. Multiscience Press, Inc, New York, N.Y.
- [20] Molina-Vilaplana, J., Pedreno-Molina, J. L., and López-Coronado, J., 2004. “Hyper RBF model for accurate reaching in redundant robotic systems”. Neurocomputing, 61, Oct, pp. 495–501.
- [21] Ren, B., Ge, S. S., Lee, T. H., and Su, C.-Y., 2009. “Adaptive neural control for a class of nonlinear systems with uncertain hysteresis inputs and time-varying state delays”. IEEE Transactions on Neural Networks, 19(7), July, pp. 1148–1164.
- [22] Ikhouane, F., and Gomis-Bellmunt, O., 2008. “A limit cycle approach for the parametric identification of hysteretic systems”. Systems and Control Letters, 57, August, pp. 663–669.
- [23] Singla, P., Subbarao, K., and Junkins, J. L., 2007. “Direction-dependent learning approach for radial basis function networks”. IEEE Transactions on Neural Networks, 18(1), pp. 203–222.