Safe Control under Uncertainty
Abstract
Controller synthesis for hybrid systems that satisfy temporal specifications expressing various system properties is a challenging problem that has drawn the attention of many researchers. However, making the assumption that such temporal properties are deterministic is far from the reality. For example, many of the properties the controller has to satisfy are learned through machine learning techniques based on sensor input data. In this paper, we propose a new logic, Probabilistic Signal Temporal Logic (PrSTL), as an expressive language to define the stochastic properties, and enforce probabilistic guarantees on them. We further show how to synthesize safe controllers using this logic for cyberphysical systems under the assumption that the stochastic properties are based on a set of Gaussian random variables. One of the key distinguishing features of PrSTL is that the encoded logic is adaptive and changes as the system encounters additional data and updates its beliefs about the latent random variables that define the safety properties. We demonstrate our approach by synthesizing safe controllers under the PrSTL specifications for multiple case studies including control of quadrotors and autonomous vehicles in dynamic environments.
Dorsa Sadigh 
UC Berkeley 
Berkeley, CA, USA 
dsadigh@berkeley.edu 
Ashish Kapoor 
Microsoft Research 
Redmond, WA, USA 
akapoor@microsoft.com 
Synthesizing safe controllers for cyberphysical systems (CPS) is a challenging problem, due to various factors that include uncertainty arising from the environment. For example, any safe control strategy for quadcopters needs to incorporate predictive information about wind gusts and any associated uncertainty in such predictions. Similarly, in the case of autonomous driving, the controller needs a probabilistic predictive model of the other vehicles on the road in order to avoid collisions. Without a model of uncertainty that characterizes all possible outcomes, there is no guarantee that the synthesized control will be safe.
The field of Machine Learning has a rich set of tools that can characterize uncertainties. Specifically, Bayesian graphical models [?] have been very popular in modeling uncertainties arising in scenarios common to CPS. For example, one of the common strategies in CPS is to build classifiers or predictors based on acquired sensor data. It is appealing to consider such predictive systems in synthesizing safe controllers for dynamical systems. However, it is well known that it is almost impossible to guarantee a prediction system that works perfectly all the time. Consequently, we need to devise control synthesis methodologies that are aware of such limitations imposed by the Machine Learning systems. Specifically, we need to build a framework that is capable of synthesis of safe controllers by being aware of when the prediction system would work and when it would fail.
In this paper, we propose a methodology for safe controller synthesis using the novel Probabilistic Signal Temporal Logic (PrSTL) that allows us to embed predictive models and their associated uncertainties. The key ingredient of the framework is a logic specification that allows embedding of uncertainties via probabilistic predicates that take random variables as parameters. These random variables allow incorporation of Bayesian graphical models in these predicates, thereby resulting in a powerful logic specification that can reason about safety under uncertainty. One of the main advantages of using Bayesian graphical models (or Bayesian methods in general) is the fact that the predictions provided are full distributions associated with the quantity of interest as opposed to a point estimate. For example, a classical Machine Learning method might just provide a value for wind speed, however under the Bayesian paradigm we would be recovering an entire probability distribution over all possible winds. Finally, another distinguishing aspect of our framework is that these probabilistic predicates are adaptive: as the system sees more and more data, the inferred distribution over the latent variables of interest can change leading to change in the predicates themselves.
Previous efforts for synthesizing safe controllers either operate under deterministic environments or model uncertainty only as part of the dynamics of the system. For example, Signal Temporal Logic (STL) [?] provides a framework for expressing real-valued, dense-time temporal properties for safety, but assumes that the signal provided from the trajectory of the system is deterministically defined by the system dynamics. Similarly, other approaches that model uncertainty as a variable added to the dynamics [?, ?, ?, ?, ?] lack clear connections to various sources of uncertainty present in the environment. Specifically, with prior approaches there is no clear understanding of how uncertainty arising due to sensing and classification could be incorporated while reasoning about safe control trajectories.
In this paper we aim to alleviate these issues by defining a probabilistic logical specification framework that has the capacity to reason about safe control strategies by embedding various predictions and their associated uncertainty. Specifically, our contributions in this paper are:

Formally define PrSTL, a logic for expressing probabilistic properties that can embed Bayesian graphical models.

Formalize a receding horizon control problem to satisfy PrSTL specifications.

Provide a novel solution for the controller synthesis problem using Mixed Integer SemiDefinite Programs.

Provide a toolbox implementing our algorithms and showcasing experiments in autonomous driving and control of quadrotors.
The rest of this paper is organized as follows. We first review related work in stochastic control and in controller synthesis under safety. We then discuss the preliminaries, and define the problem statement along with the formal definition of PrSTL. Finally, we illustrate our experimental results and conclude.
Over the years researchers have proposed different approaches for safe control of cyberphysical systems. For instance, designing controllers under reachability analysis is a well-studied method that allows specifying safety and reachability properties [?, ?]. More recently, safe learning approaches construct controllers that keep the system in the safe region, while the optimal strategy is learned online [?, ?, ?]. However, finding the reachable set is computationally expensive, which makes this approach impractical for most interesting cyberphysical systems. Controller synthesis under temporal specifications such as Linear Temporal Logic (LTL) allows expressing more interesting properties of the system and environment, e.g., safety, liveness, response, and stability, and has shown promising results in robotics applications [?, ?, ?, ?, ?]. However, synthesis for LTL requires time and space discretization, which again suffers from the curse of dimensionality. Although this approach is effective for high-level planning, it is unsuitable for synthesizing control inputs at the level of dynamical systems. More recently, Raman et al. have studied synthesis for Signal Temporal Logic (STL), which allows real-valued, dense-time properties in a receding horizon setting [?, ?]. Although this approach requires solving mixed integer linear programs, it has shown promising results in practice. One downside of specifying properties in STL or LTL is that the properties of the system and environment have to be expressed deterministically. Knowledge of the exact parameters and bounds of the specification is an unrealistic assumption for most CPS applications, where the system interacts with uncertain environments and has only partial knowledge of the world based on its sensors and classifiers.
The problem of controller synthesis under uncertainty is also a well-studied topic. One of the most effective approaches in robust control under uncertainty is modeling the environment uncertainty as part of the dynamics of the system, and finding the optimal strategy for the worst-case disturbance [?, ?, ?]. However, planning for the worst-case environment is often inapplicable in practice and overly conservative. More recently, researchers have proposed modeling the environment in a chance constrained framework, and there are some promising results in the area of urban autonomous driving [?, ?, ?, ?, ?]. In most previous work the uncertainty from the system or environment is modeled as part of the dynamics, and there is no intuitive connection between the properties and the sensing and classification capabilities of the system. In addition, there have been efforts in verification and synthesis of controllers for temporal properties given probabilistic transition systems [?, ?, ?, ?, ?, ?]. To the best of our knowledge, none of the previous studies consider scenarios where the uncertainty and confidence in the properties originate from classifiers rather than from the dynamics of the system, and are expressed as part of the specification. In this work, we propose a more natural framework for expressing temporal and Boolean properties over different sources of uncertainty and their interconnection, which allows synthesizing safe controllers while considering such probabilistic temporal specifications.
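We consider a continuous-time hybrid dynamical system:
$$\dot{x}_t = f(x_t, u_t), \qquad y_t = g(x_t, u_t). \tag{1}$$
Here, $x_t \in \mathcal{X}$ is a signal representing the continuous and discrete mode of the system at time $t$, $u_t \in \mathcal{U}$ is the control input and $y_t \in \mathcal{Y}$ is the output of the system at time $t$. This continuous system can be discretized using time intervals $dt > 0$, and every discrete time step is $k = \lfloor t/dt \rfloor$. The discrete-time hybrid dynamical system is formalized as:
$$x_{k+1} = f_d(x_k, u_k), \qquad y_k = g_d(x_k, u_k). \tag{2}$$
We let $x_0 \in \mathcal{X}$ denote the initial state of the dynamical system. We express an infinite run of the system as: $\xi = (x_0, u_0), (x_1, u_1), \dots$. Given the initial state $x_0$, and a finite-length input sequence $\mathbf{u}^H = u_0, u_1, \dots, u_{H-1}$, the finite horizon run or trajectory of the system following the dynamics in equation (2) is:
$$\xi^H(x_0, \mathbf{u}^H) = (x_0, u_0), (x_1, u_1), \dots, (x_{H-1}, u_{H-1}). \tag{3}$$
Furthermore, we let $\xi(t) = (x_t, u_t)$ be a signal consisting of the state and input of the system at time $t$; $\xi_x(t) = x_t$ is the state, and $\xi_u(t) = u_t$ is the input at time $t$.
The output of the system is also computed to be $y_k = g_d(x_k, u_k)$. A cost function is defined for the finite horizon trajectory, denoted by $J(\xi^H)$, and maps $\xi^H \in \Xi$, the set of all trajectories, to positive real-valued costs in $\mathbb{R}^+$.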
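The finite-horizon run and its cost can be illustrated with a short sketch. This is not the authors' toolbox code: the names `rollout`, `double_integrator` and `quadratic_cost` are hypothetical, and the double-integrator dynamics and the squared-input cost are assumptions chosen only to make the example concrete.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]
Input = Tuple[float, ...]


def rollout(f_d: Callable[[State, Input], State],
            x0: State,
            inputs: Sequence[Input]) -> List[Tuple[State, Input]]:
    """Return the finite-horizon run [(x_0, u_0), ..., (x_{H-1}, u_{H-1})]."""
    trajectory = []
    x = x0
    for u in inputs:
        trajectory.append((x, u))
        x = f_d(x, u)  # advance one discrete time step
    return trajectory


def double_integrator(x: State, u: Input, dt: float = 0.1) -> State:
    """Hypothetical discrete dynamics: position and velocity driven by acceleration."""
    pos, vel = x
    (acc,) = u
    return (pos + dt * vel, vel + dt * acc)


def quadratic_cost(trajectory: List[Tuple[State, Input]]) -> float:
    """Hypothetical cost: sum of squared inputs, non-negative by construction."""
    return sum(sum(ui * ui for ui in u) for _, u in trajectory)
```

Calling `rollout(double_integrator, (0.0, 0.0), [(1.0,), (1.0,), (0.0,)])` yields a three-step run whose pairs can be passed directly to `quadratic_cost`.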
Signal Temporal Logic (STL) is an expressive framework that allows reasoning about real-valued, dense-time functions, and has been largely used for defining robustness measures and monitoring properties of real-time signals of hybrid systems [?, ?, ?]. More recently there has been interest in synthesizing controllers that satisfy STL properties [?, ?].
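Formally, $(\xi, t) \models \varphi$ denotes that a signal $\xi$ satisfies the STL formula $\varphi$ at time $t$. An atomic predicate of an STL formula is represented by inequalities of the form $\mu(\xi(t)) > 0$, where $\mu$ is a function of the signal $\xi$ at time $t$. The truth value of the predicate $\mu$ is equivalent to $\mu(\xi(t)) > 0$. Any STL formula consists of Boolean and temporal operations on these predicates and the syntax of STL formulae is defined recursively as follows:
$$\varphi ::= \mu \mid \neg\mu \mid \varphi \wedge \psi \mid \mathbf{G}_{[a,b]}\,\psi \mid \varphi\,\mathbf{U}_{[a,b]}\,\psi \tag{4}$$
where $\psi$ and $\varphi$ are STL formulae, $\mathbf{G}$ denotes the globally operator and $\mathbf{U}$ is the until operator. For instance, $\xi \models \mathbf{G}_{[a,b]}\,\psi$ specifies that $\psi$ must hold at all times in the given interval, $t \in [a,b]$, of signal $\xi$. We can also define the eventually operator $\mathbf{F}$, with $\mathbf{F}_{[a,b]}\,\psi = \neg\mathbf{G}_{[a,b]}\,\neg\psi$. Satisfaction of an STL formula $\varphi$ for a signal $\xi$ at time $t$ is formally defined as follows:
$$\begin{aligned}
(\xi,t) \models \mu \quad &\Leftrightarrow \quad \mu(\xi(t)) > 0 \\
(\xi,t) \models \neg\mu \quad &\Leftrightarrow \quad \neg\big((\xi,t) \models \mu\big) \\
(\xi,t) \models \varphi \wedge \psi \quad &\Leftrightarrow \quad (\xi,t) \models \varphi \,\wedge\, (\xi,t) \models \psi \\
(\xi,t) \models \mathbf{G}_{[a,b]}\,\varphi \quad &\Leftrightarrow \quad \forall t' \in [t+a, t+b],\ (\xi,t') \models \varphi \\
(\xi,t) \models \mathbf{F}_{[a,b]}\,\varphi \quad &\Leftrightarrow \quad \exists t' \in [t+a, t+b],\ (\xi,t') \models \varphi \\
(\xi,t) \models \varphi\,\mathbf{U}_{[a,b]}\,\psi \quad &\Leftrightarrow \quad \exists t' \in [t+a, t+b] \text{ s.t. } (\xi,t') \models \psi \,\wedge\, \forall t'' \in [t, t'],\ (\xi,t'') \models \varphi
\end{aligned} \tag{5}$$
An STL formula is bounded-time if it contains no unbounded operators. The bound of a formula is defined as the maximum over the sum of all nested upper bounds on the STL formulae.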
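The Boolean semantics of bounded-time STL can be checked on a sampled signal with a few combinators. The sketch below is illustrative only: it assumes a scalar signal sampled at integer time steps and integer interval bounds, and all names (`predicate`, `globally`, `eventually`, `until`) are hypothetical. The caller is responsible for supplying a signal long enough to cover the bound of the formula.

```python
from typing import Callable, Sequence

Signal = Sequence[float]               # assumption: one scalar sample per time step
Formula = Callable[[Signal, int], bool]


def predicate(mu: Callable[[float], float]) -> Formula:
    """Atomic predicate: true at step t when mu(signal[t]) > 0."""
    return lambda sig, t: mu(sig[t]) > 0


def neg(phi: Formula) -> Formula:
    return lambda sig, t: not phi(sig, t)


def conj(phi: Formula, psi: Formula) -> Formula:
    return lambda sig, t: phi(sig, t) and psi(sig, t)


def globally(a: int, b: int, phi: Formula) -> Formula:
    """G_[a,b] phi: phi holds at every step in [t+a, t+b]."""
    return lambda sig, t: all(phi(sig, k) for k in range(t + a, t + b + 1))


def eventually(a: int, b: int, phi: Formula) -> Formula:
    """F_[a,b] phi, written as not G_[a,b] not phi."""
    return neg(globally(a, b, neg(phi)))


def until(a: int, b: int, phi: Formula, psi: Formula) -> Formula:
    """phi U_[a,b] psi: psi holds at some step k in [t+a, t+b] and phi holds on [t, k]."""
    def check(sig: Signal, t: int) -> bool:
        return any(
            psi(sig, k) and all(phi(sig, j) for j in range(t, k + 1))
            for k in range(t + a, t + b + 1)
        )
    return check
```

For example, with `sig = [1.0, 2.0, 3.0, -1.0]` and `positive = predicate(lambda v: v)`, the formula `globally(0, 2, positive)` holds at step 0 while `globally(0, 3, positive)` does not.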
Synthesizing controllers that satisfy STL properties is a nontrivial task. The most promising approaches are based on Receding Horizon Control or Model Predictive Control (MPC) [?], which iteratively optimize a cost function of interest. Specifically, starting with an initial state $x_0$, the MPC scheme aims to determine the optimal control strategy given the dynamics model of the system as in equation (2), while satisfying the STL formula $\varphi$. The constraints represented using STL allow expression of temporal specifications on the runs of the system and environment and limit the allowed behavior of the closed-loop system [?, ?].
Prior work shows that MPC optimization with STL constraints can be posed as a Mixed Integer Linear Program (MILP) [?, ?]. It is well known that the global optimality of this approach is not guaranteed; nonetheless, MPC is widely used in practice and has been shown to perform well.
Probability theory provides a natural way to represent uncertainty in the environment and recent advances in Machine Learning and Perception have heavily relied on Bayesian methods to infer distributions over latent phenomenon of interest [?, ?]. The two key ingredients include (a) Bayesian networks (equivalently graphical models) that allow expression of complex interactions between sets of latent variables and (b) the Bayesian inference procedure that numerically computes probability distributions over the variables of interest. One of the key distinguishing aspects of the Bayesian methodology is that, unlike other optimization based machine learning methods, the entire distributions over the variables of interest are available. Such distributions completely characterize the uncertainty present in the system and are crucial for our goal of synthesizing safe controllers.
While a thorough discussion of Bayesian networks and associated methods to model uncertainty is beyond the scope of this paper, we highlight these methods on the task of inferring classifiers from observed training data. Formally, given a set of training data points $X_L = \{x_1, \dots, x_n\}$, with observations $t_L = \{t_1, \dots, t_n\}$, where $t_i \in \{+1, -1\}$, we are interested in finding a hyperplane $w$ that separates the points belonging to the two classes according to $\mathrm{sgn}(w^\top x)$. Under the Bayesian paradigm, we are interested in the distribution:
$$\begin{aligned}
p(w \mid X_L, t_L) \;&\propto\; p(w)\, p(t_L \mid X_L, w) \\
&=\; p(w) \prod_{i} p(t_i \mid w, x_i) \\
&=\; p(w) \prod_{i} \mathbb{1}\big[\mathrm{sgn}(w^\top x_i) = t_i\big]
\end{aligned} \tag{6}$$
The first line in the above equation stems from the Bayes rule, and the second line simply exploits the fact that given the classifier $w$ the labels for each of the points in the data set are independent. The expression in the third line is an indicator function which evaluates to $1$ when the condition inside the brackets holds. Thus, equation (6) starts from a prior $p(w)$ over the classifiers and eventually, by incorporating the training data points, infers a posterior distribution over the set of all the classifiers that respect the observed labels and the points. While the above equation expresses the statistical dependencies among the various variables (i.e., the model), there are various Bayesian inference techniques [?, ?, ?] that allow numerical computation of the posterior distribution of interest. In the above case of a Bayesian classifier, the popular method of choice is to use Expectation Propagation [?] to infer $p(w \mid X_L, t_L)$ as a Gaussian distribution $\mathcal{N}(\bar{w}, \Sigma)$. Linear application of this classifier to a data point $x$ as $w^\top x$ results in a Gaussian distribution of the prediction with the mean $\bar{w}^\top x$ and the variance $x^\top \Sigma x$. Similarly, for the case of Bayesian linear regression the same procedure can be followed, albeit with continuous target variables $t \in \mathbb{R}$.
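The predictive distribution of a linear classifier with Gaussian weights follows directly from the mean and covariance of the weights. The sketch below assumes the posterior has already been inferred (the inference step itself, e.g. Expectation Propagation, is not shown) and that the resulting variance is strictly positive; `predictive_distribution` and `prob_positive_class` are hypothetical helper names.

```python
from statistics import NormalDist
from typing import Sequence


def predictive_distribution(mean_w: Sequence[float],
                            cov_w: Sequence[Sequence[float]],
                            x: Sequence[float]) -> NormalDist:
    """Distribution of w^T x when w ~ N(mean_w, cov_w); assumes x^T cov_w x > 0."""
    mean = sum(m * xi for m, xi in zip(mean_w, x))
    # Quadratic form x^T cov_w x, computed without external libraries.
    var = sum(x[i] * cov_w[i][j] * x[j]
              for i in range(len(x)) for j in range(len(x)))
    return NormalDist(mu=mean, sigma=var ** 0.5)


def prob_positive_class(pred: NormalDist) -> float:
    """P(w^T x > 0), i.e. the probability that sgn(w^T x) = +1."""
    return 1.0 - pred.cdf(0.0)
```

Unlike a point estimate, the returned `NormalDist` carries the full uncertainty of the prediction, which is what the probabilistic predicates introduced later consume.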
Note that these Bayesian linear classifiers and regressors are a fairly rich class of models and have similar or better representation capabilities as kernel machines [?]. In this work, we specifically aim to incorporate such rich family of classification models in safe controller synthesis.
We propose Probabilistic Signal Temporal Logic (PrSTL) that allows us to express uncertainty over the latent variables via probabilistic specifications. The key idea in our work is to first incorporate random variables in predicates, and then express temporal and Boolean operations on such predicates. The proposed logic provides an expressive framework for defining safety conditions under a wide variety of uncertainties, including the uncertainty that arises due to application of Machine Learning classifiers.
The core ingredient in this work is the insight that when the uncertainty over the random variable is reasoned out in a Bayesian framework, we can use the inferred probability distributions to efficiently derive constraints from the PrSTL specifications. We provide a novel solution for synthesizing controllers for dynamical systems given different PrSTL properties. An interesting aspect of this framework is that the PrSTL formulae can evolve at every step. For example, a classifier associated with the dynamical system can continue to learn with time, thereby changing the inferred probability distributions on the latent random variables.
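PrSTL supports probabilistic temporal properties on real-valued, dense-time signals. Specifically, $(\xi, t) \models \varphi$ denotes that the signal $\xi$ satisfies the PrSTL formula $\varphi$ at time $t$. We introduce the notion of a probabilistic atomic predicate $\lambda^{\epsilon_t}_{\alpha_t}$ of a PrSTL formula that is parameterized with a time-varying random variable $\alpha_t$ drawn from a distribution $p(\alpha_t)$ at every time step:
$$\lambda^{\epsilon_t}_{\alpha_t} \;=\; P\big(\lambda_{\alpha_t}(\xi(t)) < 0\big) > 1 - \epsilon_t \tag{7}$$
Here $P(\cdot)$ represents the probability of the event and $1 - \epsilon_t$ defines the tolerance level in satisfaction of the probabilistic properties. The parameter $\epsilon_t \in [0, 1]$ is a small time-varying positive number and $1 - \epsilon_t$ represents the threshold on the satisfaction probability of $\lambda_{\alpha_t}(\xi(t)) < 0$. A signal $\xi$ satisfies the PrSTL predicate $\lambda^{\epsilon_t}_{\alpha_t}$ with confidence $1 - \epsilon_t$ if and only if:
$$\int_{\alpha_t} \mathbb{1}\big[\lambda_{\alpha_t}(\xi(t)) < 0\big]\, p(\alpha_t)\, d\alpha_t \;>\; 1 - \epsilon_t \tag{8}$$
Here $\mathbb{1}[\cdot]$ is an indicator function, and the equation marginalizes out the random variable $\alpha_t$ with the probability density $p(\alpha_t)$. The truth value of the PrSTL predicate $\lambda^{\epsilon_t}_{\alpha_t}$ thus is equivalent to satisfaction of the probabilistic constraint in equation (8). We would like to point out that computing such integrals for general distributions is computationally difficult; however, there are many parameterized distributions (e.g., Gaussian and other members of the exponential family) for which there exists either a closed-form solution or efficient numerical procedures.
Note that this probabilistic atomic predicate is a stochastic function of the signal $\xi$ at time $t$ and expresses a model of the uncertainty in the environment based on the observed signals. As the system evolves and observes more data about the environment, the distribution over the random variable $\alpha_t$ changes over time, thereby leading to an adaptive PrSTL predicate. The PrSTL formula consists of Boolean and temporal operations over these predicates. We formulate PrSTL in negation normal form, and recursively define the syntax of the logic as:
$$\varphi ::= \lambda^{\epsilon_t}_{\alpha_t} \mid \neg\lambda^{\epsilon_t}_{\alpha_t} \mid \varphi \wedge \psi \mid \varphi \vee \psi \mid \mathbf{G}_{[a,b]}\,\psi \mid \varphi\,\mathbf{U}_{[a,b]}\,\psi \mid \mathbf{F}_{[a,b]}\,\psi \tag{9}$$
Here, $\varphi$ is a PrSTL formula, which is built upon predicates $\lambda^{\epsilon_t}_{\alpha_t}$ defined in equation (7), propositional formulae $\varphi$ composed of the predicates and Boolean operators such as $\wedge$ (and), $\neg$ (negation), and temporal operators on $\varphi$ such as $\mathbf{G}$ (globally), $\mathbf{F}$ (eventually) and $\mathbf{U}$ (until). Note that in these operations the PrSTL predicates can have different probabilistic parameters, i.e., $\alpha_t$ and $\epsilon_t$. In addition, satisfaction of the PrSTL formulae for each of the Boolean and temporal operations based on the predicates is defined as:
$$\begin{aligned}
(\xi,t) \models \lambda^{\epsilon_t}_{\alpha_t} \quad &\Leftrightarrow \quad P\big(\lambda_{\alpha_t}(\xi(t)) < 0\big) > 1 - \epsilon_t \\
(\xi,t) \models \neg\lambda^{\epsilon_t}_{\alpha_t} \quad &\Leftrightarrow \quad P\big(-\lambda_{\alpha_t}(\xi(t)) < 0\big) > 1 - \epsilon_t \\
(\xi,t) \models \varphi \wedge \psi \quad &\Leftrightarrow \quad (\xi,t) \models \varphi \,\wedge\, (\xi,t) \models \psi \\
(\xi,t) \models \varphi \vee \psi \quad &\Leftrightarrow \quad (\xi,t) \models \varphi \,\vee\, (\xi,t) \models \psi \\
(\xi,t) \models \mathbf{G}_{[a,b]}\,\varphi \quad &\Leftrightarrow \quad \forall t' \in [t+a, t+b],\ (\xi,t') \models \varphi \\
(\xi,t) \models \mathbf{F}_{[a,b]}\,\varphi \quad &\Leftrightarrow \quad \exists t' \in [t+a, t+b],\ (\xi,t') \models \varphi \\
(\xi,t) \models \varphi\,\mathbf{U}_{[a,b]}\,\psi \quad &\Leftrightarrow \quad \exists t' \in [t+a, t+b] \text{ s.t. } (\xi,t') \models \psi \,\wedge\, \forall t'' \in [t, t'],\ (\xi,t'') \models \varphi
\end{aligned} \tag{10}$$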
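When no closed form is available, the probability inside a probabilistic predicate can be approximated by sampling. The following sketch estimates the probability that a function of the signal and a random parameter is negative, and compares the estimate with a confidence threshold of one minus the tolerance. It is an illustration under stated assumptions, not the method used in the paper: the names `predicate_holds_mc`, `linear` and `sample_alpha` are hypothetical, and the independent Gaussian weights are chosen arbitrarily.

```python
import random
from typing import Callable, Sequence


def predicate_holds_mc(lam: Callable[[Sequence[float], Sequence[float]], float],
                       sample_alpha: Callable[[random.Random], Sequence[float]],
                       signal_t: Sequence[float],
                       eps: float,
                       n_samples: int = 20000,
                       seed: int = 0) -> bool:
    """Estimate P(lam(alpha, signal_t) < 0) by sampling and compare it with 1 - eps."""
    rng = random.Random(seed)
    hits = sum(lam(sample_alpha(rng), signal_t) < 0 for _ in range(n_samples))
    return hits / n_samples > 1.0 - eps


def linear(alpha: Sequence[float], signal_t: Sequence[float]) -> float:
    """Assumed predicate function: a weighted sum of the signal components."""
    return sum(a * s for a, s in zip(alpha, signal_t))


def sample_alpha(rng: random.Random) -> Sequence[float]:
    """Assumed belief: independent Gaussian weights, means -1, standard deviation 0.5."""
    return (rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5))
```

Re-running the check after the belief over the random parameter has been updated (a different `sample_alpha`) is what makes the predicate adaptive.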
Remark 1
Note that $\neg$ (negation) defined above does not follow the traditional logical complement properties, i.e., a formula and its negation can both be satisfied or both be violated under our definition of negation. Satisfaction of a complement of a PrSTL formula is equivalent to negating the formula's function $\lambda_{\alpha_t}(\xi(t))$.
Remark 2
The PrSTL framework reduces to STL when the distribution $p(\alpha_t)$ is a Dirac distribution. A Dirac or point distribution over $\alpha_t$ enforces $\lambda_{\alpha_t}(\xi(t)) < 0$ to be deterministic and equivalent to an STL predicate $\mu$ as defined in the STL preliminaries.
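A small numeric example makes both remarks concrete. It assumes the value of the predicate function at a fixed time is a scalar Gaussian, so the probabilities are available in closed form; the function names and the chosen numbers are illustrative only.

```python
from statistics import NormalDist


def holds(value: NormalDist, eps: float) -> bool:
    """Predicate: P(value < 0) > 1 - eps."""
    return value.cdf(0.0) > 1.0 - eps


def negation_holds(value: NormalDist, eps: float) -> bool:
    """Negated predicate: P(-value < 0) > 1 - eps, i.e. P(value > 0) > 1 - eps."""
    return (1.0 - value.cdf(0.0)) > 1.0 - eps


# A symmetric belief: P(value < 0) = P(value > 0) = 0.5.
uncertain = NormalDist(mu=0.0, sigma=1.0)
both_violated = not holds(uncertain, 0.1) and not negation_holds(uncertain, 0.1)
both_satisfied = holds(uncertain, 0.6) and negation_holds(uncertain, 0.6)

# A very narrow Gaussian stands in for a Dirac distribution at -1:
# the predicate behaves like a deterministic STL predicate.
point_like = NormalDist(mu=-1.0, sigma=1e-9)
deterministic_like = holds(point_like, 0.01) and not negation_holds(point_like, 0.01)
```

With a tolerance of 0.1 neither the predicate nor its negation reaches the 0.9 threshold, while with a tolerance of 0.6 both exceed the 0.4 threshold, which is exactly the non-complementary behavior described in Remark 1.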
We now formally define the controller synthesis problem in the MPC framework with PrSTL specifications.
Problem 1
Given a hybrid dynamical system as in equation (2), an initial state $x_0$, a PrSTL formula $\varphi$, an MPC horizon $H$, and a cost function $J(\xi^H)$ defined for a finite horizon trajectory $\xi^H$, find:
$$\operatorname*{argmin}_{\mathbf{u}^H} \; J\big(\xi^H(x_0, \mathbf{u}^H)\big) \tag{11}$$
subject to $\xi^H(x_0, \mathbf{u}^H) \models \varphi$.
Problem 1 formulates a framework for finding a control strategy that optimizes a given cost function and satisfies a PrSTL formula. Finding the best strategy for this optimization given only deterministic PrSTL formulae, where $\alpha_t$ is drawn from a Dirac distribution, is the same as solving a set of mixed integer linear constraints. In this section, we show how the optimization can be solved for the general case of PrSTL by translating the formula to a set of mixed integer constraints. Specifically, we provide a full solution for Gaussian distributions in Problem 1, where the optimization reduces to mixed integer semidefinite programs.
We first discuss how every PrSTL formula generates a set of integer constraints. Given a PrSTL formula $\varphi$, we introduce two integer variables for every time step $t$, $p^{\varphi}_t$ and $q^{\varphi}_t \in \{0, 1\}$, which correspond to the truth value of the PrSTL formula and its negation respectively. These variables enforce satisfaction of the PrSTL formula as follows:
$$(p^{\varphi}_t = 1) \;\rightarrow\; (\xi, t) \models \varphi, \qquad (q^{\varphi}_t = 1) \;\rightarrow\; (\xi, t) \models \neg\varphi \tag{12}$$
The formula $\varphi$ holds true if $p^{\varphi}_t = 1$, and its negation $\neg\varphi$ (as defined in the PrSTL semantics) holds true if $q^{\varphi}_t = 1$. Due to our definition of negation for probabilistic formulae, there exist signals for which $p^{\varphi}_t$ and $q^{\varphi}_t$ can both be set to 1, where both $\varphi$ and $\neg\varphi$ are satisfied by the signal. This explains the construction of two integer variables for every formula. Using both integer variables, we define the constraints required for logical and temporal operations of PrSTL on $p^{\varphi}_t$ and $q^{\varphi}_t$ for all times. These integer variables enforce the truth value of the formula $\varphi$, and we refer to them as truth value enforcers:

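Negation ($\varphi = \neg\psi$): $p^{\varphi}_t \le q^{\psi}_t$ and $q^{\varphi}_t \le p^{\psi}_t$

Conjunction ($\varphi = \wedge_{i=1}^{N} \psi_i$): $p^{\varphi}_t \le p^{\psi_i}_t$ for every $i$, and $q^{\varphi}_t \le \sum_{i=1}^{N} q^{\psi_i}_t$

Disjunction ($\varphi = \vee_{i=1}^{N} \psi_i$): $\varphi = \neg \wedge_{i=1}^{N} \neg\psi_i$

Globally ($\varphi = \mathbf{G}_{[a,b]}\,\psi$): $p^{\varphi}_t \le p^{\psi}_{t'}$ for all $t' \in [t+a, \min(t+b, H-1)]$, and $q^{\varphi}_t \le \sum_{t'=t+a}^{t+b} q^{\psi}_{t'}$ (only for $t < H - b$).

Eventually ($\varphi = \mathbf{F}_{[a,b]}\,\psi$): $\varphi = \neg\mathbf{G}_{[a,b]}\,\neg\psi$.

Unbounded Until ($\varphi = \psi_1\,\tilde{\mathbf{U}}\,\psi_2$): $\varphi = \bigvee_{t=0}^{H-1} \big( (\mathbf{G}_{[0,t]}\,\psi_1) \wedge (\mathbf{G}_{[t,t]}\,\psi_2) \big) \vee \mathbf{G}_{[0,H-1]}\,\psi_1$

Bounded Until ($\varphi = \psi_1\,\mathbf{U}_{[a,b]}\,\psi_2$): $\varphi = \mathbf{G}_{[0,a]}\,\psi_1 \wedge \mathbf{F}_{[a,b]}\,\psi_2 \wedge \mathbf{G}_{[a,a]}\,(\psi_1\,\tilde{\mathbf{U}}\,\psi_2)$
Here, we have shown how $p^{\varphi}_t$ and $q^{\varphi}_t$ are defined for every logical operation such as negation, conjunction, and disjunction, and every temporal operation such as globally, eventually, and until. We use $\tilde{\mathbf{U}}$ to refer to unbounded until, and $\mathbf{U}$ for bounded until.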
Note that while synthesizing controllers for PrSTL formulae in an MPC scheme, we are sometimes required to evaluate satisfaction of the formula outside of the horizon $H$. For instance, a property $\mathbf{G}_{[a,b]}\,\varphi$ might need to be evaluated beyond $H$ for some $t' \in [t+a, t+b]$. In such cases, our proposal is to act optimistically, which means that we assume the formula holds true for the time steps outside of the horizon of the globally operator, and similarly assume the formula does not hold true for the negation of the globally operator. This optimism is evident in the formulation of the truth value enforcers of the globally operator above, and it carries over to the other temporal operators, which are defined in terms of globally.
Based on the recursive definition of PrSTL and the above encoding, the truth value enforcers of every PrSTL formula are defined using a set of integer inequalities involving a composition of the truth value enforcers of the inner predicates.
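The recursive composition of truth value enforcers can be sketched as upper bounds on a pair of binary variables, one that forces the formula and one that forces its negation. The encoding below is one plausible reading of the description above (negation swaps the pair; a conjunction is forced only if every conjunct is forced, and its negation only if some conjunct's negation is forced; windows beyond the horizon are treated optimistically). It is an assumption for illustration, not necessarily the exact constraint set used in the toolbox, and all function names are hypothetical.

```python
from typing import List, Tuple

# (p, q): p = 1 forces the formula to hold, q = 1 forces its negation to hold.
Enforcer = Tuple[int, int]


def negation_bounds(child: Enforcer) -> Enforcer:
    """Upper bounds for 'not psi': p <= q_psi and q <= p_psi."""
    p_psi, q_psi = child
    return (q_psi, p_psi)


def conjunction_bounds(children: List[Enforcer]) -> Enforcer:
    """Upper bounds for a conjunction: p <= p_i for every i, q <= sum_i q_i."""
    p_max = min(p for p, _ in children)
    q_max = min(1, sum(q for _, q in children))
    return (p_max, q_max)


def disjunction_bounds(children: List[Enforcer]) -> Enforcer:
    """Disjunction rewritten as not(and_i not psi_i)."""
    return negation_bounds(conjunction_bounds([negation_bounds(c) for c in children]))


def globally_bounds(children: List[Enforcer], t: int, a: int, b: int,
                    horizon: int) -> Enforcer:
    """Upper bounds for G_[a,b] psi at step t; children[k] is the enforcer of psi at step k.

    Steps beyond the horizon are ignored for p (optimistic), and the negation
    is only allowed when the whole window [t+a, t+b] lies inside the horizon.
    """
    window = children[t + a:min(t + b, horizon - 1) + 1]
    p_max = min((p for p, _ in window), default=1)
    q_max = min(1, sum(q for _, q in window)) if t + b <= horizon - 1 else 0
    return (p_max, q_max)
```

In an actual mixed integer program these bounds would appear as linear inequalities over decision variables rather than being evaluated on fixed values; evaluating them on fixed values is only meant to show how the composition works.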
We have defined the PrSTL predicate $\lambda^{\epsilon_t}_{\alpha_t}$ for a general function $\lambda_{\alpha_t}(\xi(t))$ of the signal $\xi$ at time $t$. In general, the function allows a random variable $\alpha_t \sim p(\alpha_t)$ drawn from any distribution at every time step. The general problem of controller synthesis that would satisfy the PrSTL predicates is computationally difficult, because evaluation of the predicates boils down to computing the integral in equation (8). Consequently, in order to solve the control problem in Problem 1 we need to enforce a structure on the predicates of $\varphi$. In this section, we explore the linear-Gaussian structure of the predicates that appears in many real-world cases and show how such predicates translate into Mixed Integer SDPs.
Formally, if $\varphi = \lambda^{\epsilon_t}_{\alpha_t}$ is only a single predicate, the optimization in Problem 1 will reduce to:
$$\operatorname*{argmin}_{\mathbf{u}^H} \; J\big(\xi^H(x_0, \mathbf{u}^H)\big) \tag{13}$$
subject to $(\xi, t) \models \lambda^{\epsilon_t}_{\alpha_t} \quad \forall t \in \{0, \dots, H-1\}$.
This optimization translates to a chance constrained problem [?, ?, ?, ?, ?, ?] at every time step of the horizon, based on the definition of PrSTL predicates in equation (7):
$$\operatorname*{argmin}_{\mathbf{u}^H} \; J\big(\xi^H(x_0, \mathbf{u}^H)\big) \tag{14}$$
subject to $P\big(\lambda_{\alpha_t}(\xi(t)) < 0\big) > 1 - \epsilon_t \quad \forall t \in \{0, \dots, H-1\}$.
One of the big challenges with such chance constrained optimization is that there is no guarantee that the optimization in equation (14) is convex. The convexity of the problem depends on the structure of the function $\lambda_{\alpha_t}$ and the distribution $p(\alpha_t)$.
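It turns out that the problem takes a particularly simple convex form when the function $\lambda_{\alpha_t}$ takes a linear-Gaussian form, i.e., the random variable $\alpha_t$ comes from a Gaussian distribution and the function itself is linear in $\alpha_t$:
$$\lambda_{\alpha_t}(\xi(t)) = \alpha_t^\top \xi_x(t) = \alpha_t^\top x_t, \qquad \alpha_t \sim \mathcal{N}(\mu_t, \Sigma_t) \tag{15}$$
It is easy to show that for this structure of $\lambda_{\alpha_t}$, where $\lambda_{\alpha_t}$ is a weighted sum of the states with Gaussian weights $\alpha_t$, the chance constrained optimization in equation (14) is convex [?, ?]. Specifically, the optimization problem can be transformed to a second-order cone program (SOCP). To see this, we consider a normally distributed random variable $\nu \sim \mathcal{N}(0, 1)$ and its cumulative distribution function (CDF) $\Phi$:
$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-s^2/2}\, ds \tag{16}$$
Then, since $\alpha_t^\top x_t$ is Gaussian with mean $\mu_t^\top x_t$ and variance $x_t^\top \Sigma_t x_t = \|\Sigma_t^{1/2} x_t\|_2^2$, the chance constrained optimization reduces to an SOCP via the following derivation:
$$\begin{aligned}
P\big(\alpha_t^\top x_t < 0\big) > 1 - \epsilon_t \quad
&\Leftrightarrow \quad P\big(\alpha_t^\top x_t \ge 0\big) < \epsilon_t \\
&\Leftrightarrow \quad \Phi\!\left(\frac{\mu_t^\top x_t}{\|\Sigma_t^{1/2} x_t\|_2}\right) < \epsilon_t \\
&\Leftrightarrow \quad \frac{\mu_t^\top x_t}{\|\Sigma_t^{1/2} x_t\|_2} < \Phi^{-1}(\epsilon_t) \\
&\Leftrightarrow \quad \mu_t^\top x_t - \Phi^{-1}(\epsilon_t)\, \|\Sigma_t^{1/2} x_t\|_2 < 0
\end{aligned} \tag{17}$$
In this formulation, $\mu_t^\top x_t$ is the linear term, where $\mu_t$ is the mean of the random variable $\alpha_t$ at every time step, and $\|\Sigma_t^{1/2} x_t\|_2$ is the $\ell_2$-norm representing a quadratic term, where $\Sigma_t$ is the variance of $\alpha_t$. This quadratic term is scaled by $\Phi^{-1}(\epsilon_t)$, the inverse of the Normal CDF function, which is negative for small values of $\epsilon_t$ (namely $\epsilon_t < 0.5$), so the norm enters the constraint with a positive coefficient. Thus, every chance constraint can be reformulated as an SOCP, and as a result with a convex cost function $J(\xi^H)$, we can efficiently solve the following convex optimization for every predicate of PrSTL:
$$\operatorname*{argmin}_{\mathbf{u}^H} \; J\big(\xi^H(x_0, \mathbf{u}^H)\big) \tag{18}$$
subject to $\mu_t^\top x_t - \Phi^{-1}(\epsilon_t)\, \|\Sigma_t^{1/2} x_t\|_2 < 0 \quad \forall t \in \{0, \dots, H-1\}$.
Assuming a linear-Gaussian form of the function, we generate the SOCP above and easily translate it to a semidefinite program (SDP) by introducing auxiliary variables [?]. We can use this semidefinite program, which solves the problem in equation (13) with a single constraint, as a building block, and use it multiple times to handle complex PrSTL formulae. Specifically, any PrSTL formula can be decomposed into its predicates by recursively introducing integer variables that correspond to the truth value enforcers of the formula at every step, as discussed above.
We would like to point out that assuming a linear-Gaussian form of the function is not too restrictive. The linear-Gaussian form subsumes the case of Bayesian linear classifiers, and consequently the framework can be applied to a wide variety of scenarios where a classification or regression function needs to estimate quantities of interest that are critical for safety. Furthermore, the framework is applicable to all random variables whose distributions exhibit unimodal behavior in line with the law of large numbers. Finally, for the cases of non-Gaussian random variables, there are many approximate inference procedures that can effectively approximate the distributions as Gaussian distributions.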
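The equivalence between a linear-Gaussian chance constraint and its deterministic second-order cone form can be checked numerically. The sketch below assumes Gaussian weights with a given mean vector and covariance matrix, a tolerance strictly between 0 and 1, and a nonzero variance; it evaluates both forms for a fixed state rather than solving an optimization problem. The function names are hypothetical.

```python
from statistics import NormalDist
from typing import Sequence

STD_NORMAL = NormalDist()  # zero mean, unit variance


def quad_form(cov: Sequence[Sequence[float]], x: Sequence[float]) -> float:
    """x^T cov x, which equals the squared norm of cov^{1/2} x."""
    return sum(x[i] * cov[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))


def socp_constraint_holds(mu, cov, x, eps: float) -> bool:
    """Deterministic form: mu^T x - Phi^{-1}(eps) * ||cov^{1/2} x||_2 < 0."""
    mean = sum(m * xi for m, xi in zip(mu, x))
    norm = quad_form(cov, x) ** 0.5
    return mean - STD_NORMAL.inv_cdf(eps) * norm < 0


def chance_constraint_holds(mu, cov, x, eps: float) -> bool:
    """Direct form: P(alpha^T x < 0) > 1 - eps for alpha ~ N(mu, cov)."""
    mean = sum(m * xi for m, xi in zip(mu, x))
    std = quad_form(cov, x) ** 0.5
    return NormalDist(mean, std).cdf(0.0) > 1.0 - eps
```

For a tolerance below 0.5 the inverse CDF is negative, so the norm term is added with a positive weight: the larger the uncertainty in the weights, the further the mean has to be from the constraint boundary.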
As discussed in the previous section, at the predicate level of $\varphi$, we create a chance constrained problem for the predicates $\lambda^{\epsilon_t}_{\alpha_t}$. These predicates of the PrSTL formulae can be reformulated as a semidefinite program, where the predicates are over intersections of the cone of positive semidefinite matrices with affine spaces. Semidefinite programs are special cases of convex optimization; consequently, solving Problem 1 only for PrSTL predicates is a convex optimization problem. Note that we introduced integer variables for the temporal and Boolean operators of the PrSTL formula when defining the truth value enforcers. Construction of such integer variables increases the complexity of Problem 1, and results in a mixed integer semidefinite program (MISDP). However, we are not always required to create integer variables for all temporal and Boolean operators. Therefore, we define Convex PrSTL as a subset of PrSTL formulae that can be solved without constructing integer variables.
Definition 1
Convex PrSTL is a subset of PrSTL that is recursively defined over the predicates by applying Boolean conjunctions and the globally temporal operator. Satisfaction of a convex PrSTL formula is defined as:
$$\begin{aligned}
(\xi,t) \models \lambda^{\epsilon_t}_{\alpha_t} \quad &\Leftrightarrow \quad P\big(\lambda_{\alpha_t}(\xi(t)) < 0\big) > 1 - \epsilon_t \\
(\xi,t) \models \varphi \wedge \psi \quad &\Leftrightarrow \quad (\xi,t) \models \varphi \,\wedge\, (\xi,t) \models \psi \\
(\xi,t) \models \mathbf{G}_{[a,b]}\,\varphi \quad &\Leftrightarrow \quad \forall t' \in [t+a, t+b],\ (\xi,t') \models \varphi
\end{aligned} \tag{19}$$
Theorem 1
Given a convex PrSTL formula $\varphi$, a hybrid dynamical system as in equation (2), and its initial state $x_0$, the controller synthesis problem with convex PrSTL constraints defined in Problem 1 is a convex program.

We have shown that the predicates of $\varphi$, i.e., $\lambda^{\epsilon_t}_{\alpha_t}$, create a set of convex constraints. The Boolean conjunction of convex constraints is also convex, since the intersection of convex sets is convex; therefore, conjunctions of predicates result in convex constraints. In addition, the globally operator is defined as a finite conjunction over its time interval: $\mathbf{G}_{[a,b]}\,\varphi = \bigwedge_{t=a}^{b} \varphi_t$. Thus, the globally operator retains the convexity property of the constraints. Consequently, Problem 1 with a convex PrSTL constraint is a convex program.
Theorem 1 allows us to efficiently reduce the number of integer variables required for solving Problem 1. We only introduce integer variables when disjunctions, eventually, or until operators appear in the PrSTL constraints. Even when a formula is not completely part of Convex PrSTL, integer variables are introduced only for the nonconvex segments.
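Membership in the convex fragment is a purely syntactic property, so it can be decided by a recursive walk over the formula. The sketch below uses a small hypothetical abstract syntax (the class names `Pred`, `And`, `Or`, `Globally`, `Eventually` are assumptions for illustration) and returns whether a formula is built only from predicates, conjunctions and the globally operator.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Pred:
    name: str


@dataclass(frozen=True)
class And:
    args: Tuple["Formula", ...]


@dataclass(frozen=True)
class Or:
    args: Tuple["Formula", ...]


@dataclass(frozen=True)
class Globally:
    a: int
    b: int
    arg: "Formula"


@dataclass(frozen=True)
class Eventually:
    a: int
    b: int
    arg: "Formula"


Formula = Union[Pred, And, Or, Globally, Eventually]


def is_convex_prstl(phi: Formula) -> bool:
    """True when phi uses only predicates, conjunction and the globally operator."""
    if isinstance(phi, Pred):
        return True
    if isinstance(phi, And):
        return all(is_convex_prstl(arg) for arg in phi.args)
    if isinstance(phi, Globally):
        return is_convex_prstl(phi.arg)
    return False  # disjunction, eventually (and until) need integer variables
```

A synthesis routine could call such a check first and skip the construction of integer variables whenever it returns `True`.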
We show our complete method of controlling dynamical systems in uncertain environments in Algorithm 1. At the first time step, we run an open-loop control algorithm to populate past in line 3. We then run the closed-loop algorithm, finding the optimal strategy at every time step of the control interval. In the closed-loop algorithm, we linearize the dynamics at the current local state and time in line 5, and then update the distributions over the random variables in the PrSTL formula based on new sensor data in line 6. Then, we update the PrSTL formulae based on the updated distributions. If there are any other dynamic parameters that change at every time step, they can also be updated in line 7. In line 8, we generate the mixed integer constraints from the PrSTL formula, and then populate C with all the constraints, including the PrSTL constraints, the linearized dynamics, and the constraints enforcing the past trajectory. Note that we do not construct integer variables if the formula is in the subset of Convex PrSTL. Then, we call the finite horizon optimization algorithm under the cost function $J(\xi^H)$ and the constraints C in line 10, which provides a strategy $\mathbf{u}^H$ of length $H$. We advance the state with the first element of $\mathbf{u}^H$, and update the history of the trajectory in past. We continue running this loop and synthesizing controllers over all time steps in the interval.
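The closed-loop part of the procedure described above can be summarized as a receding horizon loop. The sketch below is a skeleton under stated assumptions rather than a reproduction of Algorithm 1: the open-loop initialization and the linearization are omitted, and the belief update, the constraint construction and the finite-horizon solver are passed in as hypothetical callables (`update_beliefs`, `build_constraints`, `solve`) instead of calling a real optimization engine.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]
Input = Tuple[float, ...]
Step = Tuple[State, Input]


def receding_horizon_control(
    x0: State,
    n_steps: int,
    horizon: int,
    step: Callable[[State, Input], State],
    update_beliefs: Callable[[State], object],
    build_constraints: Callable[[State, object, List[Step]], object],
    solve: Callable[[State, object, int], Sequence[Input]],
) -> List[Step]:
    """Run the closed loop for n_steps and return the executed (state, input) pairs."""
    past: List[Step] = []
    x = x0
    for _ in range(n_steps):
        beliefs = update_beliefs(x)                       # new sensor data -> new distributions
        constraints = build_constraints(x, beliefs, past)  # specification, dynamics, history
        plan = solve(x, constraints, horizon)              # finite-horizon strategy
        u = plan[0]                                        # apply only the first input
        past.append((x, u))
        x = step(x, u)
    return past
```

Because the beliefs are refreshed inside the loop, the constraints handed to the solver can differ at every step, which is how the adaptive nature of the specification shows up in the controller.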
We implemented our controller synthesis algorithm for PrSTL formulae as a Matlab toolbox, available at:
https://www.eecs.berkeley.edu/dsadigh/PrSTL.
Our toolbox uses Yalmip [?] and Gurobi [?] as its optimization engine.
For all the examples we tried, the optimization computed at every step completed in less than 2 seconds on a 2.3 GHz Intel Core i7 processor with 16 GB memory.
We show some of our results for controlling quadrotors and autonomous driving under uncertain environments.
Controlling quadrotors in dynamic uncertain environments is a challenging task. Different sources of uncertainty appear while controlling quadrotors, e. g., uncertainty about the position of the obstacles based on classification methods, distributions over wind profiles or battery profiles, etc. In this case study, we show how to express properties of different models of uncertainty over time, and we find an optimal strategy under such uncertain environments.
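We follow the derivation of the dynamics model of a quadrotor in [?]. We consider a 12-dimensional system, where the state consists of the position and velocity of the quadrotor, $(x, y, z)$ and $(\dot{x}, \dot{y}, \dot{z})$, as well as the Euler angles $(\phi, \theta, \psi)$, i.e., roll, pitch, yaw, and the angular velocities $(p, q, r)$. Let $\mathbf{x}$ be:
$$\mathbf{x} = \begin{bmatrix} x & y & z & \dot{x} & \dot{y} & \dot{z} & \phi & \theta & \psi & p & q & r \end{bmatrix}^\top \tag{20}$$
The system has a 4-dimensional control input:
$$\mathbf{u} = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^\top \tag{21}$$
where $u_1$, $u_2$ and $u_3$ are the control inputs about each axis for roll, pitch and yaw respectively. $u_4$ represents the thrust input to the quadrotor in the vertical direction ($z$-axis). The nonlinear dynamics of the system is: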
(22)  
where and are rotation matrices, relating body frame and inertial frame of the quadrotor, is a skewsymmetric matrix, and is the inertial matrix of the rigid body. Also and denote gravity and mass of the quadrotor. Then the dynamics equation is:
(23) 
Our first goal is for a quadrotor to reach a point in space while avoiding obstacles. This is shown in the accompanying figure, where the quadrotor is drawn as a green square at its starting position, the origin, and its objective is to reach the coordinates smoothly. If we let represent the ground level, the objective of the quadrotor is to take off, travel a distance, and then land on the ground again. Note that we use the convention where is above the ground level. We optimize the following objective:
(24) 
Here, we penalize the norm of the Euler angles by a factor of , since we look for a smooth trajectory. We chose in our examples. In addition to initializing the state and control input at zero, we need to satisfy the following deterministic PrSTL formulae:
Bounds on Roll Input  (25)  
Bounds on Pitch Input  
Bounds on Thrust 
In the accompanying figure, the purple surface is a ceiling that the quadrotor should not collide with as it takes off and lands at the final position. However, the quadrotor does not have full knowledge of where exactly the ceiling is located. We define a sensing mechanism for the quadrotor, which consists of a mesh grid of points around the body of the quadrotor. As the system moves through the space, a Bayesian binary classifier is updated by providing a single label (no obstacles present) or (obstacle present) for each of the sensed points.
The Bayesian classifier is the Gaussian Process based method described earlier in the paper and has the linear-Gaussian form. Applying this classifier results in a Gaussian distribution for every point in the 3D space. We define our classifier with confidence , as the stochastic function , which is evaluated at the coordinates of the sensing points in the space and whose Gaussian weight is inferred over time using the sensed data. We therefore define a time-varying probabilistic constraint that must hold at every time step, even as its value changes over time. Our constraint specifies that, given a classifier based on the sensing points parameterized by , the quadrotor must stay within a safe region (defined by the classifier) with probability , for at all times. Thus the probabilistic formula is:
(26)  
We enforce this probabilistic predicate at all times in , which verifies the property starting from a small time after the initial state, so that the quadrotor has gathered some sensor data. In the accompanying figure, the orange surface represents the second-order cone created based on , at every time step. This surface is characterized by:
(27) 
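For a Gaussian weight vector, a chance constraint on a linear predicate has a standard deterministic equivalent, which is what makes the second-order cone appear. The sketch below assumes the predicate has the form P(w^T x <= 0) >= 1 - eps with w ~ N(mu, Sigma); this sign convention, the function name `soc_margin`, and the numbers are assumptions for illustration and are not taken from the toolbox.

```python
import math
from statistics import NormalDist

import numpy as np


def soc_margin(mu, Sigma, x, eps):
    """Left-hand side of the deterministic form of P(w^T x <= 0) >= 1 - eps.

    For w ~ N(mu, Sigma) with x^T Sigma x > 0, the chance constraint holds
    exactly when mu^T x + Phi^{-1}(1 - eps) * sqrt(x^T Sigma x) <= 0.
    The left-hand side is convex in x (a second-order cone term) for eps <= 0.5.
    """
    mu, Sigma, x = (np.asarray(a, dtype=float) for a in (mu, Sigma, x))
    z = NormalDist().inv_cdf(1.0 - eps)        # standard normal quantile
    return float(mu @ x + z * math.sqrt(x @ Sigma @ x))


mu = np.array([1.0, -2.0])                     # hypothetical belief mean
x = np.array([1.0, 1.0])                       # hypothetical query point
tight = soc_margin(mu, 0.04 * np.eye(2), x, eps=0.05)   # confident belief
loose = soc_margin(mu, 1.00 * np.eye(2), x, eps=0.05)   # uncertain belief
print(tight <= 0, loose <= 0)
```

With the same mean, the point is accepted under the confident belief and rejected under the uncertain one, which is the mechanism by which a wider distribution shrinks the region treated as safe.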
Note that the surface shown in the figure at the initial time step is not an accurate estimate of where the ceiling is; it is based on a distribution learned from the initial values of the sensors. Thus, if the quadrotor were to follow this estimate without updating it, it would collide with the ceiling, since the orange surface showing the belief about the location of the ceiling is above the purple surface representing the real position of the ceiling. However, the Bayesian inference running at every step of the optimization updates the distribution over the classifier. As shown in the figure, the orange surface changes at every time step, since the parameters of the learned Gaussian random variable, its mean and covariance, are updated at every step. The blue path represents the trajectory the quadrotor has already taken, and the dotted green line represents the future planned trajectory based on the current state of the classifier. The dotted green trajectory at the initial state goes through the ceiling, since the belief about the location of the ceiling is incorrect; however, the trajectory is modified at every step as the classifier values are updated, and the quadrotor safely reaches the final position. We solve the optimization using our toolbox, with , and horizon length of . We emphasize that some of the constraints are time-varying, and we need to update them at every step of the optimization. We similarly update the dynamics at every time step, since we locally linearize the dynamics around the current position of the quadrotor at every step.
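The per-step belief update can be illustrated with the standard conjugate update for a linear-Gaussian model. The sketch below assumes a regression-style likelihood y = X w + noise on labels of +1 and -1, a feature vector of the form [x, y, z, 1], and a hypothetical function name `update_gaussian_weights`; the inference used in the paper may differ in its details.

```python
import numpy as np


def update_gaussian_weights(mu, Sigma, X, y, noise_var=1.0):
    """Posterior over w for y = X w + e, e ~ N(0, noise_var * I).

    The prior is w ~ N(mu, Sigma); the posterior is again Gaussian.
    """
    prior_precision = np.linalg.inv(Sigma)
    post_precision = prior_precision + X.T @ X / noise_var
    Sigma_post = np.linalg.inv(post_precision)
    mu_post = Sigma_post @ (prior_precision @ mu + X.T @ y / noise_var)
    return mu_post, Sigma_post


rng = np.random.default_rng(0)
points = rng.normal(size=(20, 3))                    # hypothetical sensed points
X = np.hstack([points, np.ones((20, 1))])            # features [x, y, z, 1]
y = np.where(points[:, 2] > 0.5, 1.0, -1.0)          # hypothetical labels

mu0, Sigma0 = np.zeros(4), np.eye(4)

# One batch update with all sensed points.
mu_batch, Sigma_batch = update_gaussian_weights(mu0, Sigma0, X, y)

# The same data processed one point at a time, as in a closed-loop run.
mu_seq, Sigma_seq = mu0, Sigma0
for xi, yi in zip(X, y):
    mu_seq, Sigma_seq = update_gaussian_weights(
        mu_seq, Sigma_seq, xi[None, :], np.array([yi]))

print(np.trace(Sigma0), np.trace(Sigma_batch))
```

Because the posterior after each step serves as the prior for the next, processing the data incrementally gives the same belief as a single batch update, and the covariance shrinks as more points are sensed.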
We consider another scenario for controlling quadrotors, where we add a battery state to the state space of the system discussed above. The quadrotor is then a 13-dimensional system, where the first 12 states follow the same order and dynamics as in equations (20) and (21). We let denote the battery state, and we initialize it at . The battery state evolves with the negative thrust value:
(28) 
So the dynamics is , and all the other states and inputs are initialized at zero. We enforce the same constraints as in equation (25), and the objective of the quadrotor is to reach the coordinates smoothly, which corresponds to flying from the origin to the top diagonal corner of the space. Furthermore, we impose that the quadrotor can fly above a specific height only if it is confident in its battery power. The formula encodes that eventually in the next s, the quadrotor will fly above a threshold level of . Therefore, the truth of this formula should imply that the system is confident in the battery level, and consequently can make it to the goal position safely. However, we assume that we do not have access to the exact value of the battery state, due to uncertain environmental factors that can affect the battery level, such as radio communication. We use a stochastic linear classifier on the battery state augmented with value , to estimate the belief on the battery level. We allow the battery state to vary with a variance scaled at every time step of the horizon. So the formula ensuring that the quadrotor flies above a threshold only if its battery level is high enough is:
(29)  
We let the confidence . The property can be reformulated as:
(30) 
Here , and ranges over the horizon time steps. So, illustrates that the quadrotor has to be confident that its battery state, perturbed by a time-varying variance, is above at all times. Therefore, specifies that if the battery state is below some threshold, the quadrotor has to fly close to the ground. We synthesize a controller for the specifications, and the resulting trajectories of the quadrotor are shown in the corresponding figure. The trajectory in Figure (a) corresponds to , i.e., the battery state changes deterministically, and Figure (b) corresponds to , when the quadrotor is more cautious about the state of the battery. So the trajectory does not pass the height level whenever the confidence in the battery level is below .
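A scalar sketch shows why a larger variance makes the controller more cautious. It assumes the battery reading at step t is the nominal level plus zero-mean Gaussian noise with standard deviation sigma_t, so that requiring the level to exceed a minimum with probability at least 1 - eps tightens the deterministic threshold. The function name, the minimum level of 2.0, and the linearly growing standard deviations are hypothetical and are not the values used in the case study.

```python
from statistics import NormalDist


def tightened_thresholds(b_min, sigmas, eps):
    """Nominal level needed at each step so that
    P(b_t + n_t >= b_min) >= 1 - eps, with n_t ~ N(0, sigma_t^2).

    This is equivalent to b_t >= b_min + Phi^{-1}(1 - eps) * sigma_t.
    """
    z = NormalDist().inv_cdf(1.0 - eps)
    return [b_min + z * s for s in sigmas]


horizon = 5
deterministic = tightened_thresholds(2.0, [0.0] * horizon, eps=0.05)
cautious = tightened_thresholds(2.0, [0.1 * t for t in range(horizon)], eps=0.05)
print(deterministic)
print(cautious)
```

With zero variance the requirement reduces to the plain threshold, while a variance that grows along the horizon demands progressively more battery margin, consistent with the more conservative trajectory described above.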
In our second case study, we consider an autonomous driving scenario. We use a simple point-mass model to define the dynamics of the vehicles on the road. We let the state of the system be , where , denote the coordinates of the vehicle, is the heading, and is the speed. We let be the control inputs, where is the steering input, and is the acceleration. Further, we write the dynamics of the vehicle as:
(31)  
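The local linearization step used at every iteration can be illustrated on a point-mass vehicle. Since the dynamics equation is not reproduced in this text, the sketch assumes a common form with state [x, y, theta, v] and inputs [u1, u2] (steering and acceleration): x' = v cos(theta), y' = v sin(theta), theta' = v u1 / m, v' = u2. The parameter `m` and the function names are hypothetical.

```python
import numpy as np


def f(s, u, m=1.0):
    """Assumed point-mass dynamics; s = [x, y, theta, v], u = [u1, u2]."""
    _, _, theta, v = s
    u1, u2 = u
    return np.array([v * np.cos(theta), v * np.sin(theta), v * u1 / m, u2])


def linearize(s0, u0, m=1.0):
    """Jacobians A = df/ds and B = df/du evaluated at (s0, u0)."""
    _, _, theta, v = s0
    u1, _ = u0
    A = np.zeros((4, 4))
    A[0, 2], A[0, 3] = -v * np.sin(theta), np.cos(theta)
    A[1, 2], A[1, 3] = v * np.cos(theta), np.sin(theta)
    A[2, 3] = u1 / m
    B = np.zeros((4, 2))
    B[2, 0] = v / m
    B[3, 1] = 1.0
    return A, B


def linearized_euler_step(s, u, s0, u0, dt, m=1.0):
    """One forward-Euler step of the dynamics linearized around (s0, u0)."""
    A, B = linearize(s0, u0, m)
    return s + dt * (f(s0, u0, m) + A @ (s - s0) + B @ (u - u0))


s0 = np.array([0.0, 0.0, 0.3, 2.0])
u0 = np.array([0.1, -0.2])
step_lin = linearized_euler_step(s0, u0, s0, u0, dt=0.03)
step_true = s0 + 0.03 * f(s0, u0)
print(np.allclose(step_lin, step_true))
```

At the linearization point the linearized and nonlinear Euler steps coincide; the approximation degrades as the planned states move away from that point, which is why the model is relinearized at every time step.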
The corresponding figure shows a scenario for an autonomous vehicle making a right turn at a signalized intersection. We refer to the red car as the ego vehicle, i.e., the vehicle we control autonomously, and the yellow car as the environment vehicle. We would like to find a strategy for the ego vehicle so that it makes a safe right turn when the traffic light is red, while yielding to the oncoming traffic. The yellow car in the figure represents the oncoming traffic at this intersection. In this example, the ego vehicle only has a probabilistic model of the velocity of the environment car. All vehicles in this example follow the same dynamics as in equation (31). We refer to the states of the ego vehicle as: , and the states of the environment vehicle as: . While synthesizing a strategy for the red car, we enforce a set of PrSTL specifications: (i) we enforce bounds on the control inputs and states of the two vehicles, (ii) we encode the states and transitions of the traffic light, and require the vehicles to obey the traffic rules, and (iii) we require that all vehicles remain within their road boundaries. In addition, we would like the two cars to avoid any collisions. We define collision avoidance as the following PrSTL property:
(32)  
Here, consists of a global operator at all times over the disjunction of four PrSTL predicates. Each probabilistic predicate encodes a possible crash between the two vehicles. In equation (32), represents the minimum distance between the corresponding coordinates of the two vehicles in either direction, which generates the four disjuncts over the predicates. The estimate of the distance between the coordinates of the two vehicles is encoded in each predicate by considering the difference between the coordinates of the ego vehicle and the propagated coordinates of the environment vehicle based on the value of its velocity. The velocity is a vector of Gaussian random variables computed based on the current heading of the environment vehicle , centered at the current speed , and perturbed by a variance . The predicates in define a linear classifier on the signal representing the coordinates of the ego vehicle, parameterized by a random variable characterizing the velocity of the environment vehicle. These predicates can easily be reformulated to the nominal structure of a PrSTL predicate . However, we leave them as in equation (32) for better illustration.
In the autonomous driving example, we use a sampling time of s, and a horizon of . In addition, we let . We successfully synthesize a strategy for the autonomous vehicle by solving Problem 1 and following the steps in Algorithm 1. The trajectory generated using this strategy is shown by the solid blue line in the corresponding figure. The dotted green line is the future trajectory computed by the MPC scheme. In Figure (a), the ego vehicle has a deterministic model of the environment vehicle as ; therefore, it performs the right turn before letting the environment vehicle pass. However, as shown in Figure (b), given , and , the ego vehicle is not confident enough in avoiding collisions, so it acts conservatively: it waits for the environment car to pass first, and then performs its right turn safely.
We have presented a framework for safe controller synthesis under uncertainty. The key contributions include defining PrSTL, a logic for expressing probabilistic properties that allows embedding Bayesian graphical models. We also show how to synthesize control in a receding horizon framework under PrSTL specifications that express Bayesian linear classifiers. Another distinguishing aspect of this work is that the resulting logic adapts as more data is observed with the evolution of the system. We demonstrate the effectiveness of the approach by synthesizing safe strategies for a quadrotor and an autonomous vehicle traveling in uncertain environments.
The presented approach extends easily to distributions other than Gaussians via Bayesian approximate inference techniques [?, ?] that can project distributions onto Gaussian densities. Future work includes extending controller synthesis to arbitrary distributions via sampling-based approaches; we are also exploring the use of the proposed framework as a building block for complex robotic tasks that need to invoke higher-level planning algorithms.
 [1] A. K. Akametalu, S. Kaynama, J. F. Fisac, M. N. Zeilinger, J. H. Gillula, and C. J. Tomlin. Reachability-based safe learning with Gaussian processes. In 2014 IEEE 53rd Annual Conference on Decision and Control.
 [2] C. Andrieu, N. De Freitas, A. Doucet, and M. I. Jordan. An introduction to MCMC for machine learning. Machine Learning, 50(1-2):5–43, 2003.
 [3] A. Aswani, H. Gonzalez, S. S. Sastry, and C. Tomlin. Provably safe and robust learning-based model predictive control. Automatica, 49(5):1216–1226, 2013.
 [4] M. J. Beal. Variational algorithms for approximate Bayesian inference. University of London, 2003.
 [5] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski. Robust optimization. Princeton University Press, 2009.
 [6] L. Blackmore, M. Ono, and B. C. Williams. Chance-constrained optimal path planning with obstacles. IEEE Transactions on Robotics, 27(6):1080–1094, 2011.
 [7] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge university press, 2004.
 [8] A. Carvalho, Y. Gao, S. Lefevre, and F. Borrelli. Stochastic predictive control of autonomous vehicles in uncertain environments. In 12th International Symposium on Advanced Vehicle Control, 2014.
 [9] A. Donzé, T. Ferrere, and O. Maler. Efficient robust monitoring for STL. In Computer Aided Verification. Springer, 2013.
 [10] A. Donzé and O. Maler. Robust Satisfaction of Temporal Logic over Real-Valued Signals. 2010.
 [11] J. Fu and U. Topcu. Integrating active sensing into reactive synthesis with temporal logic constraints under partial observations. arXiv:1410.0083, 2014.
 [12] J. Fu and U. Topcu. Probably approximately correct MDP learning and control with temporal logic constraints. arXiv:1404.7073, 2014.
 [13] J. Fu and U. Topcu. Computational methods for stochastic control with metric interval temporal logic specifications. arXiv:1503.07193, 2015.
 [14] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin. Bayesian data analysis, volume 2. Taylor & Francis, 2014.
 [15] J. H. Gillula and C. J. Tomlin. Guaranteed safe online learning via reachability: tracking a ground target using a quadrotor. In 2012 IEEE International Conference on Robotics and Automation.
 [16] E. A. Gol, M. Lazar, and C. Belta. Temporal logic model predictive control. Automatica, 56:78–85, 2015.
 [17] I. Griva, S. G. Nash, and A. Sofer. Linear and nonlinear optimization. Siam, 2009.
 [18] Gurobi Optimization, Inc. Gurobi optimizer reference manual, 2015.
 [19] H. Huang, G. M. Hoffmann, S. L. Waslander, and C. J. Tomlin. Aerodynamics and control of autonomous quadrotor helicopters in aggressive maneuvering. In 2009 IEEE International Conference on Robotics and Automation.
 [20] M. Jordan. Learning in graphical models (adaptive computation and machine learning). 1998.
 [21] S. Kataoka. A stochastic programming model. Econometrica: Journal of the Econometric Society, 1963.
 [22] M. V. Kothare, V. Balakrishnan, and M. Morari. Robust constrained model predictive control using linear matrix inequalities. Automatica, 32(10):1361–1379, 1996.
 [23] H. Kress-Gazit, G. E. Fainekos, and G. J. Pappas. Temporal-logic-based reactive mission and motion planning. IEEE Transactions on Robotics, 25(6):1370–1381, 2009.
 [24] D. Lenz, T. Kessler, and A. Knoll. Stochastic model predictive controller with chance constraints for comfortable and safe driving behavior of autonomous vehicles. In Intelligent Vehicles Symposium (IV), 2015 IEEE.
 [25] S. C. Livingston, R. M. Murray, and J. W. Burdick. Backtracking temporal logic synthesis for uncertain environments. In 2012 IEEE International Conference on Robotics and Automation.
 [26] J. Löfberg. YALMIP: A toolbox for modeling and optimization in MATLAB. In 2004 IEEE International Symposium on Computer Aided Control Systems Design.
 [27] O. Maler and D. Nickovic. Monitoring temporal properties of continuous signals. In Formal Techniques, Modelling and Analysis of Timed and Fault-Tolerant Systems. Springer.
 [28] T. P. Minka. A family of algorithms for approximate Bayesian inference. PhD thesis, Massachusetts Institute of Technology, 2001.
 [29] I. Mitchell and C. J. Tomlin. Level set methods for computation in hybrid systems. In Hybrid Systems: Computation and Control. Springer.
 [30] I. M. Mitchell, A. M. Bayen, and C. J. Tomlin. A time-dependent Hamilton-Jacobi formulation of reachable sets for continuous dynamic games. IEEE Transactions on Automatic Control, 50, 2005.
 [31] M. Morari, C. Garcia, J. Lee, and D. Prett. Model predictive control. Prentice Hall Englewood Cliffs, NJ, 1993.
 [32] N. Piterman, A. Pnueli, and Y. Saar. Synthesis of reactive(1) designs. In Verification, Model Checking, and Abstract Interpretation, pages 364–380. Springer, 2006.
 [33] A. Puggelli, W. Li, A. L. Sangiovanni-Vincentelli, and S. A. Seshia. Polynomial-time verification of PCTL properties of MDPs with convex uncertainties. In Computer Aided Verification. Springer, 2013.
 [34] V. Raman, A. Donzé, M. Maasoumy, R. M. Murray, A. Sangiovanni-Vincentelli, S. Seshia, et al. Model predictive control with signal temporal logic specifications. In 2014 IEEE 53rd Annual Conference on Decision and Control.
 [35] V. Raman, A. Donzé, D. Sadigh, R. M. Murray, and S. A. Seshia. Reactive synthesis from signal temporal logic specifications. In 18th International Conference on Hybrid Systems: Computation and Control, 2015.
 [36] D. Sadigh, E. S. Kim, S. Coogan, S. S. Sastry, S. Seshia, et al. A learning based approach to control synthesis of Markov decision processes for linear temporal logic specifications. In 2014 IEEE 53rd Annual Conference on Decision and Control.
 [37] M. Svoreňová, J. Křetínskỳ, M. Chmelík, K. Chatterjee, I. Černá, and C. Belta. Temporal logic control for stochastic linear systems using abstraction refinement of probabilistic games. In 18th International Conference on Hybrid Systems: Computation and Control, 2015.
 [38] C. Van de Panne and W. Popp. Minimum-cost cattle feed under probabilistic protein constraints. Management Science, 9(3):405–430, 1963.
 [39] M. P. Vitus. Stochastic Control Via Chance Constrained Optimization and its Application to Unmanned Aerial Vehicles. PhD thesis, Stanford University, 2012.
 [40] M. P. Vitus and C. J. Tomlin. A probabilistic approach to planning and control in autonomous urban driving. In 2013 IEEE 52nd Annual Conference on Decision and Control.
 [41] Y. Wang, L. Xie, and C. E. de Souza. Robust control of a class of uncertain nonlinear systems. Systems & Control Letters, 19(2):139–149, 1992.
 [42] C. K. Williams and C. E. Rasmussen. Gaussian processes for machine learning. the MIT Press, 2(3):4, 2006.
 [43] T. Wongpiromsarn, U. Topcu, and R. M. Murray. Receding horizon control for temporal logic specifications. In 13th ACM international conference on Hybrid systems: computation and control, 2010.
 [44] K. Zhou and J. C. Doyle. Essentials of robust control, volume 180. Prentice hall Upper Saddle River, NJ, 1998.