Satisfiability Modulo ODEs

Satisfiability Modulo ODEs

Sicun Gao    Soonho Kong    Edmund Clarke
Abstract

We study SMT problems over the reals containing ordinary differential equations. They are important for formal verification of realistic hybrid systems and embedded software. We develop -complete algorithms for SMT formulas that are purely existentially quantified, as well as -formulas whose universal quantification is restricted to the time variables. We demonstrate scalability of the algorithms, as implemented in our open-source solver dReal, on SMT benchmarks with several hundred nonlinear ODEs and variables.

1 Introduction

Hybrid systems tightly combine finite automata and continuous dynamics. In most cases, the continuous components are specified by ordinary differential equations (ODEs). Thus, formal verification of general hybrid systems requires reasoning about logic formulas over the reals that contain ODE constraints. This problem is considered very difficult and has not been investigated in the context of decision procedures until recently [7, 8, 16]. It is believed that current techniques are not powerful enough to handle formulas that arise from formal verification of realistic hybrid systems, which typically contain many nonlinear ODEs and other constraints.

Since the first-order theory over the reals with trigonometric functions is already undecidable, solving formulas with general ODEs seems inherently impossible. We have resolved much of this theoretical difficulty by proposing the study of -complete decision procedures for such formulas [10]. An algorithm is -complete for a set of SMT formulas, where is an arbitrary positive rational number, if it correctly decides whether a formula is unsatisfiable or -satisfiable. Here, a formula is -satisfiable if, under some -perturbations, a syntactic variant of the original formula is satisfiable [9]. We have shown that -complete decision procedures are suitable for various formal verification tasks [9, 10]. We have also proved that -complete decision procedures exist for SMT problems over the reals with Lipschitz-continuous ODEs. Such results serve as a theoretical foundation for developing practical decision procedures for the SMT problem.

In this paper we study practical -complete algorithms for SMT formulas over the reals with ODEs. We show that such algorithms can be made powerful enough to scale to realistic benchmark formulas with several hundred nonlinear ODEs.

We develop decision procedures for the problem following a standard DPLL(ICP) framework, which relies on constraint solving algorithms as studied in Interval Constraint Propagation (ICP) [2]. In this framework, for any ODE system we can consider its solution function as a constraint between the initial variables , time variable , and the final state variables . We define pruning operators that take interval assignments on , , and as inputs, and output refined interval assignments on these variables. We formally prove that the proposed algorithms are -complete. Beyond standard SMT problems where all variables are existentially quantified, we also study -formulas under the restriction that the universal quantifications are limited to the time variables (we call them -formulas). Such formulas have been an obstacle in SMT-based verification of hybrid systems [4, 5].

In brief, this paper makes the following contributions:

  • We formalize the SMT problem over the reals with general Lipschitz-contiunous ODEs, and illustrate its expressiveness by encoding various standard problems concerning ODEs: initial and boundary value problems, parameter synthesis problems, differential algebraic equations, and bounded model checking of hybrid systems. In some cases, -formulas are needed.

  • We propose algorithms for solving SMT with ODEs, using ODE constraints to design pruning operators in a branch-and-prune framework. We handle both standard SMT problems with only existentially quantified variables, as well as -formulas. We prove that the algorithms are -complete.

  • We demonstrate the scalability of the algorithms, as implemented in our open-source solver dReal [11], on realistic benchmarks encoding formal verification problems for several nonlinear hybrid systems.

The paper is organized as follows. In Section 2, we define the SMT problem with ODEs and show how it can encode various standard problems with ODEs. In Section 3, we propose algorithms in the DPLL(ICP) framework for solving fully existentially quantified formulas as well as formulas. In Section 4 we show experimental results.

Related Work.

Solving real constraints with ODEs has a wide range of applications, and much previous work exists for classes with special structures in different paradigms [6, 13, 18]. Recently [12] proposed a more general constraint solving framework, focusing on the formulation of the problem in the standard CP framework. On the SMT solving side, several authors have considered logical combinations of ODE constraints and proposed partial decision procedures [7, 8, 16]. We aim to extend and formalize existing algorithms for a general SMT theory with ODES, and formally prove that they can be made -complete. In terms of practical performance, the proposed algorithms are made scalable to various benchmarks that contain hundreds of nonlinear ODEs and variables.

2 SMT over the Reals with ODEs

2.1 Computable Real Functions

As studied in Computable Analysis [19, 17], we can encode real numbers as infinite strings, and develop a computability theory of real functions using Turing machines that perform operations using oracles encoding real numbers. We briefly review definitions and results of importance to us. Throughout the paper we use to denote the max norm over for various . First, a name of a real number is a sequence of rational numbers converging to it:

Definition 2.1 (Names).

A name of is any function that satisfies: for any , . For , . We write the set of all possible names for as .

Next, a real function is computable if there is a Turing machine that can use any argument of as an oracle, and compute the value of up to an arbitrary precision , where . Formally:

Definition 2.2 (Computable Functions).

We say is computable if there exists an oracle Turing machine such that for any , any name of , and any , the machine uses as an oracle and as an input to compute a rational number satisfying

The definition requires that for any , with access to an arbitrary oracle encoding the name of , outputs a -approximation of . In other words, the sequence is a name of . Intuitively, is computable if an arbitrarily good approximation of can be obtained using any good enough approximation to any . Most common continuous real functions are computable [19]. Addition, multiplication, absolute value, , , , . Compositions of computable functions are computable. In particular, solution functions of Lipschitz-continuous ordinary differential equations are computable, as we explain next.

2.2 Solution Functions of ODEs

We now show that the framework of computable functions allows us to consider solution functions of ODE systems.

Notation 2.3.

We use between -dimensional vectors to denote the system of equations for .

Let be compact and be Lipschitz-continuous functions, which means that for some constant (), for all ,

Let be a variable over . We consider the first-order autonomous ODE system

(1)

where . Here, each

(2)

is called the -th solution function of the ODE system (1). A key result in computable analysis is that these solution functions are computable, in the sense of Definition 2.2:

Proposition 2.4 ([17]).

The solution functions in the form of (2) of the ODE system (1) are computable over .

To see why this is true, recall that for any and , the value of the solution function follows the Picard-Lindelöf form:

Approximations of the right-hand side of the equation can be computed by finite sums, theoretically up to an arbitrary precision.

2.3 SMT Problems and -Complete Decision Procedures

We now let denote an arbitrary collection of computable real functions, which can naturally contain solution functions of ODE systems in the form of (2). Let denote the first-order signature , where constants are seen as 0-ary functions in . Let be the structure that interprets -formulas in the standard way. We focus on formulas whose variables take values from bounded domains, which can be defined using bounded quantifiers:

Definition 2.5 (Bounded Quantifiers).

The bounded quantifiers and are defined as

where and denote terms, whose variables only contain free variables in excluding . It is easy to check that .

The key definition in our framework is -variants of first-order formulas:

Definition 2.6 (-Variants).

Let , and a bounded -sentence of the standard form

where and . Note that negations are represented by sign changes on the terms. The -weakening of is defined as the result of replacing each atom by and by . That is,

The SMT problem is standardly defined as deciding satisfiability of quantifier-free formulas, which is equivalent to deciding the truth value of fully existentially quantified sentences. We will also consider formulas that are partially universally quantified. Thus, we consider both and formulas here.

Definition 2.7 (Bounded - and -SMT Problems).

A -SMT problem is a formula of the form

and a -SMT problem is of the form

In both cases is a quantifier-free -formula.

Definition 2.8 (-Completeness [9]).

Let be a set of formulas, and . We say a decision procedure is -complete for , if for any , correctly returns one of the following answers

is false;

is true.

If the two cases overlap, either one is correct.

We have proved in [10] that -complete decision procedures exists for arbitrary bounded -sentences. In particular, there exists -complete decision procedures for the bounded and SMT problems. This serves as the theoretical foundation as well as a correctness requirement for the practical algorithms that we will develop in the following sections.

2.4 SMT Encoding of Standard Problems with ODEs

In this section, we list several standard problems related to ODE systems and show that they can be easily encoded and generalized through SMT formulas. They motivate the development of decision procedures for the theory.

Remark 2.9.

In all the following cases, solutions to the standard problems are obtained from witnesses for the existentially quantified variables in the SMT formulas.

Remark 2.10.

In the definitions below, when the solution functions of ODE systems are written as part of a formula, no analytic forms are needed. They are functions included in the signature .

Generalized Initial Value Problems. Given an ODE system, the standard initial value problem asks for a solution of the variables at certain time, given a complete assignment to the initial conditions of the system. In the form of SMT formulas, we easily allow the initial conditions to be constrained by arbitrary quantifier-free -formulas:

Definition 2.11 (Generalized IVP).

Let be a compact domain, , and be the computable solution functions of an ODE system. Let be an arbitrary constant that represents a time point of interest. The generalized IVP problem is defined by formulas of the form:

where is a quantifier-free -formula constraining the initial states , and is the needed value for time point .

Generalized Boundary Value Problems. Given an ODE system, the standard boundary value problem is concerned with computing the computable solution function when part of the variables are assigned values at the beginning of flow, and part of the variables as assigned values at the end of the flow. A generalized version as encoded by SMT formulas is:

Definition 2.12 (Generalized BVP).

Let be a compact domain, , and be the solution functions of an ODE system. Let be two time points of interest. The generalized BVP problem is:

where is a quantifier-free -formula that specifies the boundary conditions. Note that is the value that we are interested in solving in the chosen time point .

Data-Fitting and Parameter Synthesis. The data fitting problem is the following. Suppose an ODE system has part of its parameters unspecified. Given a sequence of data , we need to find the values of the missing parameters of the original ODE system. More formally:

Definition 2.13 (Data-Fitting Problems).

Let and be compact domains, , and be the solution functions of an ODE system, where be a vector of parameters. Let be a sequence of pairs in . The data-fitting problem is defined by:

where a quantifier-free constraints the initial states .

Differential Algebraic Equations. DAE problems combine ODEs and algebraic constraints:

(3)
(4)

where , . To express the problem in , we need to use extra universal quantification to ensure that the algebraic relations hold throughout the time duration. Again, we can also generalize the equation in (4) to an arbitrary quantifier-free -formula. The problem is encoded as:

Definition 2.14 (DAE Problems).

Let be a compact domain, , and be the computable solution functions of the ODE system in (3) parameterized by . Let be defined by (4). Let be a time point of interest. A DAE problem is defined by the following formula:

where a quantifier-free specifies the initial conditions for , and is the needed value at time point .

Bounded Model Checking of Hybrid Systems. Bounded model checking problems for hybrid systems can be naturally encoded as SMT formulas with ODEs [7, 8, 16, 4, 5]. We consider a simple hybrid system to show an example. Let be an -dimensional 2-mode hybrid system. In mode 1, the flow of the system follows an ODE system whose solution function is , and in mode 2, it follows another solution function . The jump condition from mode 1 to mode 2 is specified by . The invariants are specified by and for mode . Let denote an unsafe region. Let the continuous variables be bounded in and time be bounded in . Now, if starts from mode 1 with initial states satisfying , it can reach the unsafe region after one discrete jump from mode 1 to mode 2, iff the following formula is true:

The encoding can be explained as follows. For each mode, we use two variable vectors and to represent the continous flows. denote the starting values of a flow, and denotes the final values. In mode , the flow starts with some values in the initial states, specified by . Then, we follow the continuous dynamics in mode , so that denotes the final value . Then the system follows the jumping condition and resets the variables from to as specified by . After that, the system follows the flow in mode 2. In the end, we check if the final state in mode 2 satisfies the unsafe predicate, .

3 Algorithms

3.1 The ICP framework

The method of Interval Constraint Propagation (ICP) [2] finds solutions of real constraints using a “branch-and-prune” method that performs constraint propagation of interval assignments on real variables. The intervals are represented by floating-point end-points. Only over-approximations of the function values are used, which are defined by interval extensions of real functions.

Definition 3.1 (Floating-Point Intervals and Hulls).

Let denote the finite set of all floating point numbers with symbols and under the conventional order . Let

denote the set of closed real intervals with floating-point endpoints, and the set of boxes with these intervals, respectively. When is a set of real numbers, the hull of is:

Definition 3.2 (Interval Extension [2]).

Suppose is a real function. An interval extension operator maps to a function , such that for any , it is always true that

1:
2:
3:while  do
4:     
5:     while  do
6:         
7:     end while
8:     if  then
9:         if  then
10:              
11:              
12:         else
13:              return sat
14:         end if
15:     end if
16:end while
17:return unsat
Algorithm 1 ICP()

ICP uses interval extensions of functions to “prune” out sets of points that are not in the solution set, and “branch” on intervals when such pruning can not be done, until a small enough box that may contain a solution is found. A high-level description of the decision version of ICP is given in Algorithm 1. In Algorithm 1, Branch is an operator that returns two smaller boxes and , where . The key component of the algorithm is the operation. Any operation that contracts the intervals on variables can be seen as pruning, but for correctness we need formal requirements on the pruning operator in ICP. Basically, we need to require that the interval extensions of the functions converge to the true values of the functions, and that the pruning operations are well-defined, as specified below.

Definition 3.3 (-Regular Interval Extensions).

We say an interval extension of is -regular, if for some constant , for any , .

Definition 3.4 (Well-defined Pruning Operators [9]).

Let be a collection of real functions, and be a -regular interval extension operator on . A well-defined (equality) pruning operator with respect to is a partial function , such that for any , ,

  1. ;

  2. If , then ;

  3. .

When is clear, we simply write . The rules can be explained as follows. (W1) ensures that the algorithm always makes progress. (W2) ensures that the result of a pruning is always a reasonable box that may contain a zero, and otherwise is pruned out. (W3) ensures that the real solutions are never discarded. We proved the following theorem in [9]:

Theorem 3.5.

Algorithm 1 is -complete if the pruning operators are well-defined.

3.2 ODE Pruning in an ICP Framework

We now study the algorithms for SMT formulas with ODEs. The key is to design the appropriate pruning operators for the solution functions of ODE systems. The pruning operations here strengthen and formalize the ones proposed in [7, 8, 12], such that -completeness can be proved.

We recall some notations first. Let be compact and be Lipschitz-continuous functions. Given the first-order autonomous ODE system

(5)

where , we write

to represent the -th solution function of the ODE system. The -regular interval extension of is an interval function

such that for a constant , for any time domain and any box of initial values , we have

and

We will also need the notion of the reverse of the ODE system (5), as defined by

(6)

Here, is defined as , the vector of functions consisting of the negation of each function in , which is equivalent to reversing time in the flow defined by the ODE system. That is, for , we always have

(7)

Naturally, we write to denote the -regular interval extension of the -th component of .

1:repeat
2:     
3:     
4:     
5:until 
6:return
Algorithm 2

The relation between the initial variables , the time duration , and the flow variables is specified by the constraint . Given the interval assignment on any two of , , and , we can use the constraint to obtain a refined interval assignment to the third variable vector. Thus, we can define three pruning operators as follows.

Remark 3.6.

The precise definitions of the pruning operators should map the interval assigments on all variables to new assignments on all variables. For notational simplicity, in the pruning operators below we only list the assignments that are actually changed between inputs and outputs. For instance, the forward pruning operator only changes the values on .

Forward Pruning. Given interval assignments on and , we compute a refinement of the interval assignments on . Figure 1 depicts the forward pruning operation. Formally, we define the following operator:

Definition 3.7 (Forward Pruning).

Let be the solution functions of an ODE system. Let , , and be interval assignments on the variables , , and . We define the forward-pruning operator as:

Figure 1: Forward Pruning. , , represents the current interval assignments, and is the refined interval assignment on after pruning.
1:
2:
3:while  do
4:     
5:     
6:end while
7:return
Algorithm 3

Backward Pruning. Given interval assignments on and , we can compute a refinement of the interval assignments on using the reverse of the solution function. Figure 2 depicts backward pruning. Formally, we define the following operator:

Definition 3.8 (Backward Pruning).

Let be the solution functions of an ODE system, and let be the reverse of . Let , , and be interval assignments on the variables , , and . We define the backward-pruning operator as:

Figure 2: Backward Pruning. , , represents the current interval assignments, and is the refined interval assignment on after pruning.

Time-Domain Pruning. Given interval assignments on and , we can also refine the interval assignment on by pruning out the time intervals that do not contain any that is consistent with the current interval assignments on . Figure 3 depicts time-domain pruning. Formally, we define the following operator:

Definition 3.9 (Time-Domain Pruning).

Let be the solution functions of an ODE system. Let , , be interval assignments on the variables , , and . We define the time-domain pruning operator as:

Figure 3: Time-Domain Pruning. , , represents the current interval assignments, and is the refined interval assignment on after pruning.

Overall, the pruning algorithm on based on ODE constraints iteratively applies the three pruning operators until a fixed point on the interval assignments is reached.

1:
2:
3:while  do
4:     
5:     
6:end while
7:return
Algorithm 4

We show the more detailed steps in the three pruning operations in Algorithm 2, 3, 4, and 5.

1:
2:
3:while  do
4:     
5:     if  then
6:         
7:     else
8:         
9:     end if
10:end while
11:return
Algorithm 5
Theorem 3.10.

The three pruning operators are well-defined.

Proof.

We prove that the forward pruning operator is well-defined, and the proofs for the other two operators are similar. Note that the definitions of well-defined pruning are formulated for equality constraints compared to 0. Here we use the function in the pruning operator. (Strictly speaking is a function vector that evaluates to on points satisfying the ODE flow. Here for notational simplicity we just write as a single-valued function and compare with the scalar .)

First, (W1) is satisfied because of the simple fact that for any boxes , we have .

Next, suppose . Then there does not exist any that satisfies both and . Since at the same time

this requires that . Consequently (W2) is satisfied.

Third, note that is an interval extension of . Thus, for any such that for some and , we have . Following the definition of the pruning operator, we have . Thus, and (W3) holds. ∎

3.3 -Formulas and Low-Order Approximations

For -formulas, if the universal quantification is only over the time variables, we can follow the trajectory and prune away the assignment on , , and that violates the constraints on the universally quantified time variable. In fact, although the extra quantification complicates the problem, the universal constraints improve the power of the pruning operations.

Here we focus on problems with one ODE system, which can be easily generalized. Let denote the solution functions of an ODE system, we consider an -formula of the form

(8)

Note that the problems encoded as -SMT formulas as listed in Section 2.4 are all of this form.

We consider as a special constraint on the and variables. Using this constraint, we can further refine the three pruning operators as follows.

Definition 3.11 (Pruning Refined by -Constraints).

Let be the solution functions of an ODE system. Let , , and be interval assignments on the variables , , and . Let be a constraint on the universally quantified time variable, as in (8). We first define

and define by replacing with in the definition above. The forward pruning operator with , written as , is defined as

Backward pruning is defined as

Time-domain pruning is defined as

In general, can be computed by a recursive call to DPLL(ICP), by solving the -SMT problem . In many practical applications, is of some simple form such as , in which case simple pruning is shown in Figure 4.

Figure 4: Pruning with -Constraints

Another useful heuristic in ODE pruning is to bound the range of the derivatives for a vector space specified by . Suppose for any time , the derivatives are bounded in . Then by the Picard-Lindelöf representation, we have

We can use this formula to perform preliminary pruning on , which is especially efficient when combined with -constraints. Figure 5 illustrates this pruning method.

Figure 5: Pruning with Low-Order Taylor Approximations

4 Experiments

Our tool dReal implements the procedures we studied for solving SMT formulas with ODEs. It is built on several existing packages, including opensmt [3] for the general DPLL(T) framework, realpaver [14] for ICP, and CAPD [1] for computing interval-enclosures of ODEs. The tool is open-source at http://dreal.cs.cmu.edu. All benchmarks and data shown here are also available on the tool website.

P #M #D #O #V delta R Time(s) Trace
AF 4 3 20 44 0.001 S 43.10 90K
AF 8 7 40 88 0.001 S 698.86 20M
AF 8 23 120 246 0.001 S 4528.13 59M
AF 8 31 160 352 0.001 S 8485.99 78M
AF 8 47 240 528 0.001 S 15740.41 117M
AF 8 55 280 616 0.001 S 19989.59 137M
CT 2 2 15 36 0.005 S 345.84 3.1M
CT 2 2 15 36 0.002 S 362.84 3.1M
EO 3 2 18 42 0.01 S 52.93 998K
EO 3 2 18 42 0.001 S 57.67 847K
EO 3 11 72 168 0.01 U 7.75
BB 2 10 22 66 0.01 S 0.25 123K
BB 2 20 42 126 0.01 S 0.57 171K
BB 2 20 42 126 0.001 S 2.21 168K
BB 2 40 82 246 0.01 U 0.27 —-
BB 2 40 82 246 0.001 U 0.26 —-
D1 3 2 9 24 0.1 S 30.84 72K
DU 3 2 6 16 0.1 U 0.04
Table 1: #M = Number of modes in the hybrid system, #D = Unrolling depth, #O = Number of ODEs in the unrolled formula, #V = Number of variables in the unrolled formula, R = Bounded Model Checking Result (delta-SAT/UNSAT) Time = CPU time (s), Trace = Size of the ODE trajectory, AF = Atrial Filbrillation, CT = Cancer Treatment, EO = Electronic Oscillator, BB = Bouncing Ball with Drag, D1,DU = Decay Model.

All experiments were conducted on a machine with a 3.4GHz octa-core Intel Core i7-2600 processor and 16GB RAM, running 64-bit Ubuntu 12.04LTS. Table 1 is a summary of the running time of the tool on various SMT formulas generated from bounded model checking hybrid systems. The formulas typically contain a large number of variables and nonlinear ODEs.

Figure 6: Above: Witness for the AF model at depth 23 and 1500 time units. Below: Experimental simulation data.

The AF model as we show in Table 1 is obtained from [15]. It is a precise model of atrial fibrillation, a serious cardiac disorder. The continuous dynamics in the model concerns four state variables and the ODEs are highly nonlinear, such as:

The exponential term on the right-hand side of the ODE is the sigmoid function, which often appears in modelling biological switches. On this model, our tool is able to perform a depth-55 unrolling, and solve the generated logic formula. Such a formula contains 280 nonlinear ODEs of the type shown here, with 616 variables. The computed trace from dReal suggests a witness of the reachability property that can be confirmed by experimental simulation. Figure 6 shows the comparison between the trace computed from bounded model checking and the actual experimental simulation trace.

Figure 7: Above: Witness computed for the CT model at depth 3 and 500 time units. Below: Experimental simulation data.

The CT model represents a prostate cancer treatment model that contains nonlinear ODEs such as following: