A Sum-of-Squares Problems

# Finding Polynomial Loop Invariants for Probabilistic Programs

## Abstract

Quantitative loop invariants are an essential element in the verification of probabilistic programs. Recently, multivariate Lagrange interpolation has been applied to synthesizing polynomial invariants. In this paper, we propose an alternative approach. First, we fix a polynomial template as a candidate of a loop invariant. Using Stengle’s Positivstellensatz and a transformation to a sum-of-squares problem, we find sufficient conditions on the coefficients. Then, we solve a semidefinite programming feasibility problem to synthesize the loop invariants. If the semidefinite program is unfeasible, we backtrack after increasing the degree of the template. Our approach is semi-complete in the sense that it will always lead us to a feasible solution if one exists and numerical errors are small. Experimental results show the efficiency of our approach.

## 1 Introduction

Probabilistic programs extend standard programs with probabilistic choices and are widely used in protocols, randomized algorithms, stochastic games, etc. In such situations, the program may report incorrect results with a certain probability, rendering classical program specification methods [11, 18] inadequate. As a result, formal reasoning about the correctness needs to be based on quantitative specifications. Typically, a probabilistic program consists of steps that choose probabilistically between several states. and the specification of a probabilistic program contains constraints on the probability distribution of final states, e. g. through the expected value of a random variable. Therefore the expected value is often the object of correctness verification [23, 21, 14].

To reason about correctness for probabilistic programs, quantitative annotations are needed. Most importantly, correctness of while loops can be proved by inferring special bounds on expectations, usually called quantitative loop invariants [23]. As in the classical setting, finding such invariants is the bottleneck of proving program correctness. For some restricted classes, such as linear loop invariants, some techniques have been established [25, 21, 3]. To use them to synthesize polynomial loop invariants, so-called linearization can be used [1], a technique widely applied in linear algebra. It views higher-degree monomials as new variables, establishes their relationship with existing variables, and then exploits linear loop invariant generation techniques. However, the number of monomials grows exponentially when the degree increases. Kapur et al. [29] introduce solvable mappings, which are a generalization of affine mappings, to avoid non-polynomial effects generated by polynomial programs. Recently, Chen et al. [6] applied multivariate Lagrange interpolation to synthesize polynomial loop invariants directly.

Another important problem for probabilistic programs is the almost-sure termination problem, answering whether the program terminates almost surely. In [13], Fioriti and Hermanns argued that Lyapunov ranking functions, used in non-probabilistic termination analysis, cannot be extended to probabilistic programs. Instead, they extended the ranking supermartingale approach [2] to the bounded probabilistic case, and provided a compositional and sound proof method for the almost-sure termination problem. In [20], Kaminski and Katoen investigated the computational complexity of computing expected outcomes (including lower bounds, upper bounds and exact expected outcomes) and of deciding almost-sure termination of probabilistic programs. In [5], Chatterjee et al. further investigated termination problems for affine probabilistic programs. Recently, they also presented a method [4] to efficiently synthesize ranking supermartingales through Putinar’s Positivstellensatz [28] and used it to prove the termination of probabilistic programs. Their method is sound and semi-complete over a large class of programs.

In this paper, we develop a technique exploiting semidefinite programming through another Positivstellensatz to synthesize the quantitative loop invariants. Positivstellensätze are essential theorems in real algebra to describe the structure of polynomials that are positive (or non-negative) on a semialgebraic set. While our approach shares some similarities with the one in [4], the difference to the termination problem requires a variation of the theorem. In detail, Putinar’s Positivstellensatz deals with the situation when the polynomial is strictly positive on a quadratic module, which is not enough for quantitative loop invariants. In the program correctness problem, equality constraints are taken into consideration as well as inequalities. Therefore in our method, Stengle’s Positivstellensatz [30] dealing with general real semi-algebraic sets is being used.

As previous results [21, 15, 6], our approach is constraint-based [9]. We fix a polynomial template for the invariants with a fixed degree and generate constraints from the program. The constraints can be transformed into an emptiness problem of a semialgebraic set. By Stengle’s Positivstellensatz [30], it suffices to solve a semidefinite programming feasibility problem, for which efficient solvers exist. From a feasible solution (which may not be the tightest one) we can then obtain the corresponding coefficients of the template. We verify the correctness of the template. If the solver does not provide a feasible solution or if the coefficients are not correct, we refine the analysis by adding constraints to block the undesired solutions and get a tighter invariant or increasing the degree of the template, which will always lead us to a feasible solution if one exists.

The method is applied to several case studies taken from [6]. The technique usually solves the problem within one second, which is about one tenth of the time taken by the tool described in [6]. Our tool supports real variables rather than discrete ones, and supports programs that require polynomial invariants. We illustrate these features by analyzing a non-linear perceptron program and a model for airplane delay with continuous distributions. Moreover, we conduct a sequence of trials on parameterized probabilistic programs to show that the main influence factor on the running time of our method is the degree of the invariant template. We compare our results on these examples with the Lagrange Interpolation Prototype (LIP) in [6], Prinsys [15] and the tool for super-martingales (TM) [2].

## 2 Preliminaries

In this section we introduce some notations. We use to denote an -tuple of variables . For a vector , denotes the monomial , and is its degree.

###### Definition 1

A polynomial in variables is a finite linear combination of monomials: where finitely many are non-zero.

The degree of a polynomial is the highest degree of its component monomials. Extending the notation, for a sequence of polynomials and a vector , we let denote . The polynomial ring with variables is denoted with , and the set of polynomials of degree at most is denoted with . For and , is the value of at .

A constraint is a quantifier-free formula constructed from (in)equalities of polynomials. It is linear if it contains only linear expressions. A semialgebraic set is a set described by a constraint:

###### Definition 2

A semialgebraic set in is a finite union of sets of the form , where is a polynomial and is a finite set of polynomials.

A polynomial is a sum of squares (or SOS, for short), if there exist polynomials such that . Chapters 2 and 3 of [?] introduce a way to transform the problem whether a given polynomial is an SOS into a semidefinite programming problem (or SDP, for short), which is a generalization of linear programming problem. We introduce the transformation and SDP problems briefly in Appendices A and B.

### 2.1 Probabilistic Programs

We use a simple probabilistic guarded-command language to construct probabilistic programs with the grammar:

 P::=skip∣abort∣x:=E∣P;P∣P [p] P∣% if (G) then {P} else {P}∣while(G){P}

where is a Boolean expression and is a real-valued expression defined by the grammar:

 E::=c∣xn∣r∣constant/% variable/random variable::=E+E∣E⋅E∣arithmeticG::=E

Random variable follows a given probability distribution, discrete or continuous. For , the probabilistic choice command executes with probability and with probability .

###### Example 1

The following probabilistic program describes a simple game:

 z:=0; while(0

The program models a game where a player has dollars at the beginning and keeps tossing a coin with probability . The player wins one dollar if he tosses a head and loses one dollar for a tail. The game ends when the player loses all his money, or he wins dollars for a predetermined . The variable records the number of tosses made by the player during this game.

We assume that the reader is familiar with the basic concepts of probability theory and in particular expectations, see e. g. [12] for details. Expectations are typically functions from program states (i. e. the real-valued program variables) to . An expectation is called a post-expectation when it is to be evaluated on the final distribution, and it is called a pre-expectation when it is to be evaluated on the initial distribution. Let , be expectations and a probabilistic program. We say that the sentence holds if the expected value of after executing is equal to or greater than the expected value of . When and are functions, the comparison is executed pointwise.

Classical programs can be viewed as special probabilistic programs in the following sense. For classical precondition and postcondition , let the characteristic function equal 1 if the precondition is true and 0 otherwise, and define similarly. If one considers a Hoare triple where is a classical program, then it holds if and only if holds in the probabilistic sense.

### 2.2 Probabilistic Predicate Transformers

Let , be probabilistic programs, an expression, a post-expectation, a pre-expectation, a Boolean expression, and . The probabilistic predicate transformer can be defined as follows [16]:

 wp(skip,post)=postwp(abort,post)=0wp(x:=E,post)=post[x/ES(E)]wp(P; Q,post)=wp(P,wp(Q,post))wp(if(G) then(P) else(Q),post)=χG⋅wp(P,post)+(1−χG)⋅wp(Q,post)wp(P [p] Q,post)=p⋅wp(P,post)+(1−p)⋅wp(Q,post)wp(while(G) {P},post)=μX.(χG⋅wp(P,X)+(1−χG)⋅post)

Here denotes the formula obtained by replacing free occurrences of in by the expectation of expression over the state space . The least fixed point operator is defined over the domain of expectations [25, 23], and it can be shown that holds if and only if . Thus, is the greatest lower bound of precondition expectation of with respect to , and we say is the weakest pre-expectation of P w. r. t. post.

### 2.3 Positivstellensatz

Hilbert’s Nullstellensatz is very important in algebra, and its real version, known as Positivstellensatz, is crucial to our method. First, some concepts are needed to introduce the theorem.

• The set is a positive cone if it satisfies: (i) If , then , and (ii) is closed under addition and multiplication.

• The set is a multiplicative monoid with 0 if it satisfies: (i) , and (ii) is closed under multiplication.

• The set is an ideal if it satisfies: (i) , (ii) is closed under addition, and (iii) If and , then .

We are interested in finitely generated positive cones, multiplicative monoids with 0, and ideals. Let be a finite set of polynomials. We recall that

• Any element in the positive cone generated by (i. e., the smallest positive cone containing ) is of the form

 ∑α∈{0,1}skαFαwhere kα is a sum of squares for all α∈{0,1}s

In the sum, denotes an -length vector with each element or .

• Any nonzero element in the multiplicative monoid with 0 generated by is of the form , where .

• Any element in the ideal generated by is of the form , where .

The Positivstellensatz due to Stengle states that for a system of real polynomial equalities and inequalities, either there exists a solution, or there exists a certain polynomial which guarantees that no solution exists.

###### Theorem 2.1 (Stengle’s Positivstellensatz [30])

Let be finite families of polynomials in . Denote by the positive cone generated by , by the multiplicative monoid with 0 generated by , and by the ideal generated by . Then the following are equivalent:

1. The set

 Misplaced & (1)

is empty.

2. There exist such that .

## 3 Problem formulation

The question that concerns us here is to verify whether the loop sentence

 ⟨preE⟩ while(G) {body} ⟨postE⟩

holds, when given the pre-expectation , post-expectation , a Boolean expression , and a loop-free probabilistic program . One way to solve this problem is to calculate the weakest pre-expectation , and to check whether it is not smaller than . However, the weakest pre-expectation of a while statement requires a fixed-point computation, which is not trivial. To avoid the fixed point, the problem can be solved through a quantitative loop invariant.

###### Theorem 3.1 ([15])

Let be a pre-expectation, a post-expectation, a Boolean expression, and body a loop-free probabilistic program. To show

 ⟨preE⟩ while(G) {body} ⟨postE⟩,

it suffices to find a loop invariant which is an expectation such that

1. (boundary) and ;

2. (invariant) ;

3. (soundness) the loop terminates with probability 1 from any state that satisfies , and

1. the number of iterations is finite, or

2. is bounded from above by some fixed constant, or

3. the expected value of tends to zero as the number of iterations tends to infinity.

Since soundness of a loop invariant is not related to pre- and postconditions and can be verified from its type before any specific invariants are found, we focus on the boundary and invariant conditions in Theorem 3.1. The soundness property is left to be verified manually in case studies.

For pre-expectation and post-expectation , the boundary and invariant conditions in Theorem 3.1 provide the following requirements for a loop invariant :

 preE ≤I (2) I⋅(1−χG) ≤postE I⋅χG ≤wp(body,I).

The inequalities induced by the boundary and invariant conditions contain indicator functions, which we find difficult to analyze if they appear on both sides. So first we rewrite the expectations to a standard form. For a Boolean expression , we use to represent its integer value, i. e. if is true, and otherwise. An expectation is in disjoint normal form (DNF) if it is of the form

 f=[F1]⋅f1+⋯+[Fk]⋅fk

where the are disjoint expressions, which means any two of the expressions cannot be true simultaneously, and the are polynomials.

###### Lemma 1 ([21])

Suppose and are expectations over in DNF. Then, if and only if (pointwise)

 k⋀i=1l⋀j=1[Fi∧Gj⇒fi≤gj]∧k⋀i=1[Fi∧(l⋀j=1¬Gj)⇒fi≤0]∧l⋀j=1[(k⋀i=1¬Fi)∧Gj⇒0≤gj]. (3)
###### Example 2

Consider the following loop sentence for our running example:

 ⟨xy−x2⟩ z:=0; while(0

For this case, the following must hold for any loop invariant .

 xy−x2 ≤I I⋅[x≤0∨y≤x] ≤z I⋅[0

By Lemma 1, these requirements can be written as

 ⇒xy−x2≤I∧ (4) x≤0∨y≤x ⇒I≤z∧ (5) 0

The program in this example originally served as a running example in [6]. There, after transforming the constraints into the form above, Lagrange interpolation is applied to synthesize the coefficients in the template. In our approach, we check the correctness of each conjunct in (48) by checking the nonnegativity of the polynomial on the right side over a semialgebraic set related to polynomials on the left side. In this way, we can use the Positivstellensatz to synthesize the coefficients.

## 4 Constraint Solving by Semidefinite Programming

Our aim is to synthesize coefficients for the fixed invariant template for simple (Subsection 4.1) and nested (Subsection 4.2) programs. Checking the validity of constraints can be transformed into checking the emptiness of a semialgebraic set. Then, we show that the emptiness problem can be turned into sum-of-squares constraints using Stengle’s Positivstellensatz.

Our Approach in a Nutshell. For a given polynomial template as a candidate quantitative loop invariant, it needs to satisfy boundary and invariant conditions. Our goal is to synthesize the coefficients in the template. These conditions describe a semialgebraic set, and the satisfiability of the constraints is equivalent to the non-emptiness of the corresponding semialgebraic set. Applying the Positivstellensatz (see Section 2.3), we will transform the problem to an equivalent semidefinite programming problem using Lemma 2. Existing efficient solvers can be used to solve the problem. A more efficient yet sufficient way is to transform the problem into a sum-of-squares problem (see Appendix A) using Lemma 3 and then to solve it by semidefinite programming. After having synthesized the coefficients of the template, we verify whether they are valid. In case of a negative answer, which may happen due to floating-point errors, some refinements can be made by adding further constraints, which is described in Section 4.3. If the problem is still unsolved, we try raising the maximum degree of the template and reiterate the procedure.

### 4.1 Synthesis Algorithm for Simple Loop Programs

Now we are ready for the transformation method. Each conjunct obtained in Lemma 1 is of the form , where is a quantifier-free formula constructed from (in)equalities between polynomials in , and is of the form , or , with . If contains negations, we use De Morgan’s laws to eliminate them. If there is a disjunction in , we split the constraints into sub-constraints as is equivalent to . After these simplifications, can be written in the form where . Observe that a constraint is satisfied if and only if the set is empty. In this way, we transform our constraint into the form required by Theorem 2.1.

Summarizing, Constraint (2) (the boundary and invariant conditions of Theorem 3.1) is satisfied if and only if all semialgebraic sets created using the procedure above are empty. Now we are ready to transform this constraint to an SDP problem.

###### Lemma 2 ([26, 10])

The emptiness of (1) is equivalent to the feasibility of an SDP problem.

See Appendix C.1 for a constructive proof. Although the transformation in Lemma 2 is effective, it is complicated in practice. In the following lemma we present a simpler yet sufficient procedure.

###### Lemma 3

The following statements hold (with ):

1. holds if is a sum of squares for some .

2. holds if is a sum of squares for some .

3. holds if is a sum of squares for some ; if is , it is additionally required that .

The proof is in Appendix C.2. Note that Item (3) is one of the possible sufficient relaxations; more general relaxations can be obtained by adding a cross product and squares of the .

###### Example 3

Applying the above procedure, Constraint (5) in Example 2 is split into and then normalized to . This holds if is a SOS for some and is a SOS for some . The other constraints can be handled in a similar way.

After applying the Positivstellensatz and Lemma 2, template coefficients for the loop invariant can be synthesized efficiently by semidefinite programming. The corresponding technique is introduced in Appendix B.

We summarize our approach in Algorithm 1. The aim is to synthesize the coefficients of template . The terms in are all terms with degree in the multiplicative monoid generated by . Algorithm 1 is semi-complete in the sense that it will generate an invariant if there exists one. Its termination is guaranteed in principle by Theorem 2.1 and the equivalence between SOS and SDP in lemma 2, though due to numerical errors, the algorithm may fail to find in practice.

In practice, Lemma 3 is often used instead of Lemma 2 for efficiency. Step 5 in Algorithm 1 is replaced by: “Let be the relaxation of to an SOS problem according to Lemma 3”; this can be translated to an equivalent SDP problem, which is simpler than the direct translation of Lemma 2, using the technique of Appendix A.

###### Example 4

We extend Example 2 using Lemma 3. To illustrate our solution method, we choose Constraints (4), (5), and (7). The initial condition is not included in these constraints, so (4) needs to be refined to .

First, we set a template for . Assume as a quadratic polynomial with three variables :

 I=c0+c1x+c2y+c3z+c11x2+c12xy+c13xz+c22y2+c23yz+c33z2

where are coefficients that remains to be determined.

For Constraint (4) with initial constraint , we get the following corresponding constraint:

 I−(xy−x2)−v⋅z≥0 \ref{equ:3}^{\prime}

For (5), the antecedens is a conjunction of two constraints. As in Example 3, (5) is split into two constraints and transformed into

 z−I+u1⋅x ≥0and z−I−u2⋅(x−y) ≥0 (\ref{equ:4}${}^{\prime}$)

For (7), the constraint needs to be split into two inequalities . Similarly to (5), we transform (7) to

 0.5⋅I(x+1,y,z+1)+0.5⋅I(x−1,y,z+1)−I−u3⋅x−u4⋅(y−x)≥0 \ref{equ:6}^{\prime}

In this way the example can be transformed into an SDP problem with constraints (4), (5), and (7), and positivity constraints on the multipliers . (For , an arbitrary real value is allowed.) Then the resulting SDP problem can be submitted to any SOS solver.

The result using solver SeDuMi [31] is shown below.

 I=−7.1097⋅10−10−3.8818⋅10−10x−0.4939⋅10−10y+z−x2+xy+2.7965⋅10−10xz+0.97208⋅10−10y2+4.4656⋅10−10yz−0.28694⋅10−10z2

If we ignore the amounts smaller than the order of magnitude of , we get . This satisfies all constraints including (6) and (8), so it is correct.

### 4.2 Synthesis Algorithm for Nested Loop Programs

We are now turning to programs containing nested loops. To simplify our discussion, we assume the program only contains a single, terminating inner loop, i. e. it can be written as

 P =while(G){body} =while(G){body1; while(Ginn){bodyinn}; body2}

where , , and are loop-free program fragments. (If the inner loop is placed within an if statement, one can transform it to the above form by strengthening .) For a given and , we need to verify whether there exists an invariant that satisfies Constraint (2) (the boundary and invariant conditions of Theorem 3.1). We denote the inner loop by .

For such a program, the main difficulty is how to deal with in Constraint (2). We propose a method here that takes the inner and outer iteration into consideration together and uses the verified pre-expectation of the inner loop to relax the constraint.

Fix templates for the polynomial invariants: for the outer loop and for the inner loop , both with degree . Since is loop-free, it is easy to obtain . We use as post-expectation for the inner loop. Note that (2) for the inner loop requires , so we can use the template also as template for . Then the constraints for loop invariant are

 preE ≤I (9) I⋅[1−χG] ≤postE I⋅χG ≤˜wp(body,I)=wp(body1,preEinn) preEinn =Iinn Iinn⋅[1−χGinn] ≤postEinn=wp(body2,I) Iinn⋅χGinn ≤wp(bodyinn,Iinn)

The first three equations are almost Constraint (2) for the outer loop, except that is the strengthening of the weakest pre-expectation using in the -calculation instead of . The last three equations are Constraint (2) for the inner loop, except that we require equality in .

Then we have the following lemma.

###### Lemma 4

An invariant that satisfies Constraint (9) also satisfies (2), therefore it is a loop invariant for program .

See Appendix C.3 for the proof.

### 4.3 Handling Numerical Error

In practice, it sometimes happens that numerical errors lead to wrong or trivial coefficients in the templates. We suggest several methods to refine the constraints and avoid these errors.

Due to the inaccuracy of floating-point calculations, it is hard for a software to check equations and inequalities like or . A common trick to avoid this problem is to turn the equality constraint into . As for inequalities, taking as an example, a way to solve the problem is adding a new variable to transform the constraint into , since implies for any value of . The new constraints are in the form required by Theorem 2.1.

Numerical errors may also lead to an unsound invariant: we may get some coefficients with a small magnitude, which often result from floating-point inaccuracies. A common solution for this problem is to ignore those small numbers, usually smaller than in practice. In Example 4, eliminating the terms with a small order of magnitude was successful, but we cannot be sure whether the resulting invariant is correct if the remaining coefficients are approximate. We propose to check the soundness of such solutions symbolically as follows. Checking whether the generated invariant satisfies Constraint (2) is a special case of quantifier elimination . Such problem can be solved efficiently using an improved Cylindrical Algebraic Decomposition (CAD) algorithm implemented in [17]. In our experiments in Section 5, the found solutions are obtained by ignoring small numbers, and we verified they are correct by running CAD in a separate tool.

If the invariant still violates some of the constraints, we can try to strengthen the constraint (e. g., change to ) and repeat our method.

## 5 Experimental Results

We have implemented a prototype in Python to test our technique. We call the MATLAB toolbox YALMIP [22] with the SeDuMi solver [31] to solve the SDP feasibility problem. We use the math software Maple to verify the correctness of the constraints through CAD. The experiments were done on a computer with Intel(R) Core(TM) i7-4710HQ CPU and 16 GiB of RAM. The operating system is Window 7 (32bit). Constraint refinement cannot be handled automatically in the current version, but we plan to add it together with projection for rounding solutions in a future version.

Our prototype and the detailed experimental results can be found at http://iscasmc.ios.ac.cn/ppsdp. For each probabilistic program, we give the description of the while loop with pre- and post-expectations in Table 1 and Appendix D. The annotated pre-expectation serves as an exact estimate of the annotated post-expectation at the entrance of the loop. We apply the method to several different types of examples. A summary of the results is shown in Table 1. The first eleven probabilistic programs are benchmarks taken from paper [6], thus we skip the detailed descriptions of them. We have further constructed three case studies to illustrate continuous distributions, polynomial probabilistic programs and nested loop programs. The details of these examples are included in Appendix D. We ran CAD in Maple manually to verify the feasability of the generated invariants.

As we can see from Table 1, the running time of our method is within one second. There are some notes when calculating the examples. We relax the loop condition in example geo2 into . Also in the fair coin example, we relax the loop condition into . Since variables in those two examples are integers, the relaxation is sound.

### 5.1 Evaluation

Other approaches to compute loop invariants in probabilistic programs are the Lagrange Interpolation Prototype (LIP) in [6], the tool for martingales (TM) in [2] and Prinsys in [15]. The tools are executed on the same computer, LIP and TM under Linux and the other two under Windows. In Table 2, we compare the features supported by the four tools.

We have tested the examples in Table 1 on these four tools. Prinsys takes the longest time and fails to verify any of non-linear examples presented. LIP fails to verify any examples that include a continuous variable or have a degree larger than 3; additionally it is always about 10 times slower than our tool. TM fails to verify examples ruin, bin3 and geo directly. We observe that it cannot treats constraints of the form or (where and might be variables or constants). However, by transforming into , TM can synthesize a supermartingale for the program. Also, it cannot verify the simple perceptron, as it is a non-linear program. Furthermore, TM cannot deal with nested loop programs.

We now consider the parametric linear program in Section D.3. Table 3 gives a comparison of time consumption of the main technical step in our prototype. The number of constraints grows with the number of variables in our approach, similarly with the running time. Some more experiments on the number of variables and maximum degree of polynomials are included in Appendix D.3.

## 6 Conclusion

In this paper, we propose a method to synthesize polynomial quantitative invariants for recursive probabilistic programs by semialgebraic programming via a Positivstellensatz. First, a polynomial template is fixed whose coefficients remain to be determined. The loop and its pre- and post-expectation can be transformed into a semialgebraic set, of which the emptiness can be decided by finding a counterexample satisfying the condition of the Positivstellensatz. Semidefinite programming provides an efficient way to synthesize such a counterexample. The method can be applied to polynomial programs containing continuous or discrete variables, including those with nested loops. When numerical errors prevent finding a loop invariant polynomial right away, we currently can correct them ad hoc (by deleting terms with very small coefficients and sometimes strengthening the constraints), but we would like to develop a more systematic treatment.

As future improvements, we are considering improvements in numerical error handling. A better approximation can be found by projecting onto a rational subspace defined by SDP constraints  [27, 19]. There are also acceleration methods for different types of probabilistic programs: For linear programs, Handelman’s Positivstellensatz describes a faster way to synthesize SOS constraints, and for Archimedean programs, [10] describes a faster way to apply Stengle’s Positivstellensatz.

The following appendices are added for the convenience of the reviewers. They will become part of a technical report accompanying our publication, once accepted. We appreciate the reviewers’ consideration.—The authors.

## Appendix A Sum-of-Squares Problems

The set of sum-of-squares polynomials is a proper subset of the nonnegative polynomials with good algebraic properties, allowing efficient calculations. The “Gram matrix” method [7] is a way to decompose a polynomial into a sum of squares using semidefinite programming.

Consider a polynomial with degree at most . is a sum of squares if and only if it has a representation , where the matrix is positive semidefinite. So checking whether is a sum of squares is equivalent to solving the following constraint problem:

 ⎧⎪ ⎪⎨⎪ ⎪⎩c0=a00,cγ=∑α+β=γaαβ for γ≠0and A=(aαβ)α,β∈Δ is positive % semidefinite (10)

The above problem can be solved using a semidefinite program; we refer to Appendix B for details. Semidefinite programs can be efficiently solved both in theory and in practice and have seen active research in recent years. Some of the tools being used are SeDuMi [31] and SDPT3 [32].

## Appendix B Semidefinite Programming

A semidefinite program can be seen as a generalization of a linear program where the constraints are described by a cone of positive semidefinite matrices.

We use to denote all real symmetric matrices. Then for a matrix , is positive semidefinite if all eigenvalues of are (one can find more about positive semidefinite matrix in any book about linear algebra such as [24]). For matrices and in , we write if and only if is positive semidefinite.

We use to denote the scalar product of two matrices or vectors, i. e. for and

 ⟨A,B⟩:=Tr(ATB)=n∑i,j=1aijbij
 ⟨x,y⟩:=xTy

A semidefinite program is defined as follows:

 minimize⟨C,X⟩subject to⟨Ai,X⟩=bi for i=1,…,kX⪰0 (11)

where is the decision variable, and , are given symmetric matrices.

Sum-of-squares problem (10) can be transformed into an SDP problem as follows. Let , (for ) be the symmetric matrix whose entry is 1 if and 0 otherwise, and let . In this way, (10) is transformed into the form (11) without objective.

## Appendix C Proofs of Lemmas

### c.1 Proof of Lemma 2

Lemma 2 ([26, 10]). The emptiness of (1) is equivalent to the feasibility of an SDP problem.

###### Proof

We use the notations from Theorem 2.1 in this proof. The emptiness of (1) is equivalent to the existence of a solution for equation with , , due to Theorem 2.1.

Note that , and can be represented as

• where is a sum of squares for all and , and

• with , and

• , where .

Fix a maximal degree . Then we only consider and with . We set templates with degree for and and treat their coefficients as parameters. Every polynomial can be presented as the difference of two SOS polynomials: . Then , where is some integer, is one of , , , or , and is a SOS. Then the equation can be transformed into a set of equations by merging coefficients of each with maximal degree . One can formulate additional constraints that needs to be a SOS and the coefficient matrix with to be positive semidefinite. The equation set with constraints can be transformed into an SDP problem of the form in Appendix B.

### c.2 Proof of Lemma 3

###### Proof

We only prove (1) and (2) here, (3) is a straightforward extension of them by analogy.

For (1), assume is an SOS for some and . Since , .

For (2), assume is an SOS for arbitrary and . Then .

By the method indicated above Lemma 3, one can easily find that a constraint of the form can be translated to two SOS problems.

### c.3 Proof of Lemma 4

Lemma 4. An invariant that satisfies Constraint (9) also satisfies (2), therefore it is a loop invariant for program .

###### Proof

The first two inequalities of (2) are literally the same as the first two of (9). To prove the remaining inequality of (2), we assume that the inner loop terminates. (The verification of soundness is not a part of our algorithm.) From Theorem 3.1 applied to the last three (in)equalities in (9), we immediately get that . From this we have:

 I⋅χG≤˜wp(body,I)=wp(body1,preEinn)≤wp(body1,wp(Pinn,postEinn))≤wp(body1,wp(Pinn,wp(body2,I)))=wp(body1;Pinn;body2,I)=wp(body,I).

## Appendix D Experiment Details

### d.1 Non-linear Probabilistic Programs

We use a non-linear probabilistic program to model a simple perceptron, which is an algorithm for supervised learning of binary classifiers in machine learning. It gives a linear classifier function to decide whether an input belongs to one class or another based on a set of given data. Assume the training data is the collection of pairs , where is the desired output value of . We have to learn the linear function that maps the data to a single binary value:

 f(xi)={1if w⋅xi+b>00otherwise

When the outcome does not match , the random perceptron updates its classifier by and where is a learning rate. When the input of the perceptron is one data pair , the algorithm to generate a simple perceptron can be described as:

real , ; real , ; int