# A Zonotopic Framework for Functional Abstractions

Eric Goubault and Sylvie Putot

CEA LIST, Laboratory for the Modelling and Analysis of Interacting Systems, Point courrier 94, Gif-sur-Yvette, F-91191 France. Email: Firstname.Lastname@cea.fr

###### Abstract

This article formalizes an abstraction of input/output relations, based on parameterized zonotopes, which we call affine sets. We describe the abstract transfer functions and prove their correctness, which allows the generation of accurate numerical invariants. Other applications range from compositional reasoning to proofs of user-defined complex invariants and test case generation.

## 1 Introduction

We present in this paper an abstract domain based on affine arithmetic [4] to bound the values of variables in numerical programs, with a real number semantics. Affine arithmetic can be conceived as describing particular polytopes, called zonotopes [19], which are bounded and center-symmetric. But it does so by explicitly parametrizing the points, as affine combinations of symbolic variables, called noise symbols. This parametrization keeps, in an implicit manner, the affine correlations between values of program variables, by sharing some of these noise symbols. It is tempting then to attribute a meaning to these noise symbols, so that the abstract elements we are considering are no longer merely polytopes, but have a functional interpretation, due to their particular parametrization: we define abstract elements as tuples of affine forms, which we call affine sets. They define a sound abstraction of relations that hold between the current values of the variables, for each control point, and the inputs of a program. The benefits of abstracting input/output relations are well known [6]; we mention but a few: more precise and scalable interprocedural abstractions, proofs of complex invariants (involving relations between inputs and outputs), sensitivity analysis and test case generation as exemplified in [7].

An abstract domain relying on such affine forms has been described in [8, 11, 13], but these descriptions lack a complete formalization, and over-approximate the input/output relations more than necessary. In this paper, we extend this preliminary work by presenting a natural framework for this domain, with a partial order relation that allows Kleene-like iteration for accurately solving fixed point equations. In particular, a partial order that is now global to the abstract state, and no longer defined independently on each variable, makes it possible to also use relations between the special noise symbols created by taking an upper bound of two affine forms. Our results are illustrated with sample computations and geometric interpretations.

A preliminary version of this abstract domain, extended to analyse the uncertainty due to floating-point computations, is used in practice in a real industrial-size static analyser - FLUCTUAT - whose applications have been described in [7, 14]. A preliminary version of this domain, dedicated to the analysis of computations in real numbers, is also implemented as an abstract domain - Taylor1+ [8] - of the open-source library APRON [17].

##### Related work

Apart from the work of the authors already mentioned, that uses zonotopes in static analysis, a large amount of work has been carried out mostly for reachability analysis in hybrid systems using zonotopes, see for instance [9]. One common feature with our work is the fact that zonotopic methods prove to be precise and fast. But in general, in hybrid systems analysis, no union operator is defined, whereas it is an essential feature of our work. Also, the methods used are purely geometrical: no information is kept concerning input/output relationships, e.g. as witnessed by the methods used for computing intersections [10]. Zonotopes have also been used in imaging, in collision detection for instance, see [16], where purely geometrical joins have been defined.

Recent work in static analysis by abstract interpretation for input/output relations abstraction and modular analyses can be found in [6], where an example is given in particular using polyhedra. In [5], it is shown that some classical analyses (e.g. Mycroft’s strictness analysis) are input/output relational analyses (also called dependence-sensitive analyses). Applications of abstractions of input/output relations have been developed, in particular for points-to alias analysis, using summary functions, see for instance [3].

##### Contents

In Section 2, we quickly introduce the principles of affine arithmetic, and show the interest of a domain with explicit parametrization of zonotopes, compared to its geometric counterpart, through simple examples. Then in Section 3, we state properties of affine sets. Introducing a matrix representation, we make the link between the affine sets and their zonotope concretisation. We then introduce perturbed affine sets, which will allow us to define a partially ordered structure. Starting with a thorough explanation of the intuition in Section 4.1, we then describe the partial order relation in Section 4.3, the monotonic abstract transfer functions in Section 4.4, and the join operator in Section 4.5. For intrinsic reasons, our abstract domain does not have least upper bounds, but minimal upper bounds. We show in Section 4.6 that a form of bounded-completeness holds that allows Kleene-like iteration for solving fixed point equations. For lack of space, we do not demonstrate here the behaviour of our abstract domain on fixed-point computations, but results on preliminary versions of our domain are described in [8, 13].

## 2 Abstracting input/output relations with affine arithmetic

##### Affine arithmetic

Affine arithmetic is an extension of interval arithmetic on affine forms, first introduced in [4], that takes into account affine correlations between variables. An affine form is a formal sum over a set of noise symbols $\varepsilon_i$

$$\hat{x} \stackrel{\text{def}}{=} \alpha^x_0 + \sum_{i=1}^{n} \alpha^x_i \varepsilon_i,$$

with $\alpha^x_i \in \mathbb{R}$ for all $i$. Each noise symbol $\varepsilon_i$ stands for an independent component of the total uncertainty on the quantity $\hat{x}$; its value is unknown but bounded in $[-1,1]$. The corresponding coefficient $\alpha^x_i$ is a known real value, which gives the magnitude of that component. The same noise symbol can be shared by several quantities, indicating correlations among them. These noise symbols can model not only uncertainty in data or parameters, but also uncertainty coming from computation.

The semantics of affine operations is straightforward, and non-affine operations are linearized: we refer the reader to [11, 13] for more details on the semantics for static analysis.
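To make the role of shared noise symbols concrete, here is a minimal sketch of affine forms restricted to affine operations. The class name `AffineForm`, its methods and the sample coefficients are illustrative assumptions, not the authors' implementation; non-affine operations are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class AffineForm:
    """center + sum_i coeffs[i] * eps_i, with every eps_i in [-1, 1]."""
    center: float
    coeffs: dict = field(default_factory=dict)  # noise symbol name -> coefficient

    def __add__(self, other):
        keys = set(self.coeffs) | set(other.coeffs)
        return AffineForm(
            self.center + other.center,
            {k: self.coeffs.get(k, 0.0) + other.coeffs.get(k, 0.0) for k in keys},
        )

    def scale(self, lam):
        return AffineForm(lam * self.center, {k: lam * v for k, v in self.coeffs.items()})

    def __sub__(self, other):
        return self + other.scale(-1.0)

    def interval(self):
        # Each noise symbol ranges over [-1, 1], so the radius is the l1 norm
        # of the coefficients.
        radius = sum(abs(v) for v in self.coeffs.values())
        return (self.center - radius, self.center + radius)


# Two quantities sharing the noise symbols eps1 and eps4 (illustrative values).
x = AffineForm(20.0, {"eps1": -4.0, "eps3": 2.0, "eps4": 3.0})
y = AffineForm(10.0, {"eps1": -2.0, "eps2": 1.0, "eps4": -1.0})

affine_result = (x - y.scale(2.0)).interval()

# Plain interval arithmetic forgets the correlation carried by eps1.
(xl, xh), (yl, yh) = x.interval(), y.interval()
interval_result = (xl - 2.0 * yh, xh - 2.0 * yl)

print("affine  :", affine_result)
print("interval:", interval_result)
```

In this sketch the shared symbol `eps1` cancels in `x - 2y`, so the affine result is tighter than the one obtained from the two intervals alone.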

##### Introductory examples

Consider the simple interprocedural program:

    float main() {
      float x ∈ [-1,1];
      return f(x)-x;
    }

    float f(float x) {
      float y;
      if (x >= 0) y = x + 1;
      else y = x - 1;
      return y;
    }


In order to analyse this program precisely, we need to infer the relation between the input and output of function f, since the main function subtracts the input of f from its output. We will show in Section 4.1 that our method gives an accurate representation of such input/output relations, at low cost, easily proving here that main returns a number between -1 and 1. We will also show that even tight geometric representations of the image of f on [a,b] may fail to prove this.

Another interest of our method is to allow compositional abstractions for interprocedural calls [6], making our domain very scalable. For instance, the abstract value for the output of f, as found in Section 4.1, represents the fact that its value is the value of the input plus an unknown value in [-1,1]. In fact a little more might be found out, which would lay the basis for efficient disjunctive analyses, where we would find that the output of f is its input plus an unknown value in $\{-1,1\}$. This is left for future work. This compact representation can be used as an abstract summary function (akin to the ones of [3] or of [5]) for f, which can then be reused without re-analysis for each call to f. The complete discussion of this aspect is nevertheless outside the scope of this paper.

Last but not least, input/output relations that are dealt with by our method allow proofs of complex invariants, and test case generation at low cost. Consider for instance the following program, where g computes an approximation of the square root of x using a Taylor expansion of degree 2, centered at point 1:

    float main() {
      float x ∈ [1,2], z, t;
      z = g(x);
      t = z*z-x;
      return t;
    }

    float g(float x) {
      float y;
      y = 3/8.0+3/4.0*x-1/8.0*x*x;
      return y;
    }


With our semantics, we will find the following abstract value for x, z and t: , and . This proves that z is within (real result is ), and that t is within (real result is ). This means that we get a rather precise estimate of the quality of the algorithm that approximates the square root. Finally, examining the dependency of on the noise symbol modelling the input, we see that , that is , is the most likely value for reaching the maximum of , in absolute value. This input value is thus a good test case to maximize the algorithmic error between the approximation of square root and the real square root. Here it does indeed correspond to the worst case. These applications are detailed in [7], and stronger statements about test case generation can be found in [12], where a generalized form for abstract values is used for under-approximations.
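The computation above can be replayed with a small affine-arithmetic sketch. It assumes a basic textbook product rule (linear part kept exactly, quadratic remainder bounded by a fresh noise symbol), which is not necessarily the multiplication used in the paper, so the enclosures it prints need not coincide with the values reported above. All helper names are hypothetical.

```python
import itertools

_fresh = itertools.count(1)  # numbering of noise symbols created by products


def const(c):
    return (c, {})


def add(a, b):
    keys = set(a[1]) | set(b[1])
    return (a[0] + b[0], {k: a[1].get(k, 0.0) + b[1].get(k, 0.0) for k in keys})


def scale(lam, a):
    return (lam * a[0], {k: lam * v for k, v in a[1].items()})


def mul(a, b):
    """Assumed basic product: (a0 + A)(b0 + B) = a0*b0 + a0*B + b0*A + A*B,
    where |A*B| <= rad(A) * rad(B) is carried by a fresh noise symbol."""
    keys = set(a[1]) | set(b[1])
    coeffs = {k: a[0] * b[1].get(k, 0.0) + b[0] * a[1].get(k, 0.0) for k in keys}
    remainder = sum(abs(v) for v in a[1].values()) * sum(abs(v) for v in b[1].values())
    if remainder > 0.0:
        coeffs[f"mul{next(_fresh)}"] = remainder
    return (a[0] * b[0], coeffs)


def interval(a):
    radius = sum(abs(v) for v in a[1].values())
    return (a[0] - radius, a[0] + radius)


x = (1.5, {"eps1": 0.5})  # x in [1, 2], eps1 models the input
z = add(add(const(3 / 8.0), scale(3 / 4.0, x)), scale(-1 / 8.0, mul(x, x)))
t = add(mul(z, z), scale(-1.0, x))

print("z in", interval(z))
print("t in", interval(t))
print("coefficient of eps1 in t:", t[1]["eps1"])
```

The sign and magnitude of the `eps1` coefficient of `t` are what the test-case argument relies on: they indicate towards which end of the input range the error grows.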

## 3 Affine sets and zonotopes: notations and properties

In what follows, we introduce matrix notations to handle tuples of affine forms, which we call affine sets, and characterize the geometric concretisation of sets of values taken by these affine sets.

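We note $\mathcal{M}(n,p)$ the space of matrices with $n$ lines and $p$ columns of real coefficients. An affine set expressing the set of values taken by $p$ variables over $n$ noise symbols $\varepsilon_i$, $1 \le i \le n$, can be represented by a matrix $A \in \mathcal{M}(n+1,p)$.

For example, consider the affine set

$$\hat{x} = 20 - 4\varepsilon_1 + 2\varepsilon_3 + 3\varepsilon_4 \tag{1}$$

$$\hat{y} = 10 - 2\varepsilon_1 + \varepsilon_2 - \varepsilon_4, \tag{2}$$

we have $n = 4$, $p = 2$ and:

$${}^tA = \begin{pmatrix} 20 & -4 & 0 & 2 & 3 \\ 10 & -2 & 1 & 0 & -1 \end{pmatrix}.$$

Two matrix multiplications will be of interest in what follows:

• $Au$, where $u \in \mathbb{R}^p$, represents a linear combination of our $p$ variables, expressed on the $\varepsilon_i$ basis,

• ${}^tAe$, where $e \in \mathbb{R}^{n+1}$, $e_0 = 1$ and $\|e\|_\infty = \max_{0 \le i \le n} |e_i| \le 1$, represents the vector of actual values that our $p$ variables take for the particular values $\varepsilon_i = e_i$ of our noise symbols. In this case, the additional symbol $\varepsilon_0$, which is equal to 1, accounts for constant terms, as done for instance in the zone abstract domain [18].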
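The two products can be checked on the affine set (1)-(2) with a few lines of plain Python. This is only a sketch: the helper names `mat_vec` and `transpose`, and the sample vectors `u` and `e`, are chosen here for illustration.

```python
# Matrix A of the affine set (1)-(2): one row per noise symbol
# (row 0 holds the constant terms), one column per variable (x, y).
A = [
    [20.0, 10.0],  # eps_0 = 1 (constants)
    [-4.0, -2.0],  # eps_1
    [0.0, 1.0],    # eps_2
    [2.0, 0.0],    # eps_3
    [3.0, -1.0],   # eps_4
]


def mat_vec(M, v):
    """Plain matrix-vector product on lists of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


# A u with u = (1, -2): coefficients of the combination x - 2y on the eps_i basis.
u = [1.0, -2.0]
combination = mat_vec(A, u)

# tA e with e_0 = 1: concrete values of (x, y) for eps = (1, -1, 0.5, 0).
e = [1.0, 1.0, -1.0, 0.5, 0.0]
values = mat_vec(transpose(A), e)

print("x - 2y on the eps basis:", combination)
print("(x, y) for this choice of eps:", values)
```

Storing one row per noise symbol mirrors the convention $A \in \mathcal{M}(n+1,p)$ used in the text, so that a linear combination of variables is a product on the right and an evaluation is a product with the transpose.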

We formally define the zonotopic concretisation of affine sets by :

###### Definition 1

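Let an affine set with $p$ variables over $n$ noise symbols be defined by a matrix $A \in \mathcal{M}(n+1,p)$. Its concretisation is the zonotope

$$\gamma(A) = \left\{ {}^tA\, {}^t(1 \mid e) \;\middle|\; e \in \mathbb{R}^n, \|e\|_\infty \le 1 \right\} \subseteq \mathbb{R}^p.$$

We call its linear concretisation the zonotope centered on 0

$$\gamma_{lin}(A) = \left\{ {}^tAe \;\middle|\; e \in \mathbb{R}^{n+1}, \|e\|_\infty \le 1 \right\} \subseteq \mathbb{R}^p.$$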

For example, Figure 1 represents the concretization of the affine set defined by (1) and (2). It is a zonotope with center given by the vector of constant coefficients of the affine forms.

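Zonotopes are particular bounded convex polyhedra [19]. A way to characterize convex shapes is to consider support functions. For any direction $t \in \mathbb{R}^p$, let $p_t : \mathbb{R}^p \to \mathbb{R}$ be the function which associates $\langle t, x \rangle$ to all $x \in \mathbb{R}^p$, where $\langle \cdot, \cdot \rangle$ is the standard scalar product in $\mathbb{R}^p$, meaning that $\langle t, x \rangle = \sum_{i=1}^{p} t_i x_i$. Level-sets of support functions, i.e. sets defined by bounds on such functions, characterize convex sets [1], and nicely characterize zonotopes centered on 0:

###### Lemma 1

Let $S$ be a convex shape in $\mathbb{R}^p$. Then $S$ can be characterized as the (possibly infinite) intersection of half-spaces of the form

$$B_t = \left\{ x \in \mathbb{R}^p \;\middle|\; p_t(x) \le \sup_{y \in S} p_t(y) \right\}.$$

In case $S$ is a zonotope centered around $0$, it has finitely many faces with normals $t_i$ ($i = 1, \ldots, k$), and this intersection is finite:

$$S = \bigcap_{1 \le i \le k} \left\{ x \in \mathbb{R}^p \;\middle|\; |p_{t_i}(x)| \le \sup_{y \in S} p_{t_i}(y) \right\}.$$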

Furthermore, there is an easy way to characterize the linear concretization (see also [15]):

###### Lemma 2

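Given a matrix $A \in \mathcal{M}(n+1,p)$, for all $t \in \mathbb{R}^p$, $\sup_{y \in \gamma_{lin}(A)} p_t(y) = \|At\|_1$, where $\|\cdot\|_1$ is the $\ell_1$ norm.

Proof. First of all, $\gamma_{lin}(A)$ is the image of the unit disc for the $\ell_\infty$ norm by ${}^tA$, as we noted in Definition 1. Therefore,

$$\sup_{y \in \gamma_{lin}(A)} p_t(y) = \sup_{e \in \mathbb{R}^{n+1},\, \|e\|_\infty \le 1} p_t({}^tAe).$$

We now have

$$p_t({}^tAe) = \langle t, {}^tAe \rangle = \langle At, e \rangle = \sum_{i=0}^{n} \Big( \sum_{j=1}^{p} a_{i,j} t_j \Big) e_i \le \sum_{i=0}^{n} \Big| \sum_{j=1}^{p} a_{i,j} t_j \Big| \, \|e\|_\infty = \|At\|_1 \|e\|_\infty.$$

This bound is reached for $e_i = \mathrm{sgn}\big(\sum_{j=1}^{p} a_{i,j} t_j\big)$, which is such that $\|e\|_\infty = 1$. ∎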
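Lemma 2 can be sanity-checked numerically on a small example: the sketch below compares $\|At\|_1$ with a brute-force maximum of $p_t$ over the images of the cube vertices $e \in \{-1,1\}^{n+1}$ (a linear function on a cube reaches its maximum at a vertex). The matrix is the linear part of the affine set (1)-(2); function names and test directions are illustrative choices.

```python
from itertools import product

# Linear part of the affine set (1)-(2): rows eps_1..eps_4, columns (x, y).
A = [
    [-4.0, -2.0],
    [0.0, 1.0],
    [2.0, 0.0],
    [3.0, -1.0],
]


def support_l1(A, t):
    """Closed form of Lemma 2: sup of <t, y> over gamma_lin(A) is ||A t||_1."""
    return sum(abs(sum(a * tj for a, tj in zip(row, t))) for row in A)


def support_brute_force(A, t):
    """Maximum of <t, tA e> over all sign vectors e (vertices of the cube)."""
    best = float("-inf")
    for e in product((-1.0, 1.0), repeat=len(A)):
        y = [sum(ei * row[j] for ei, row in zip(e, A)) for j in range(len(t))]
        best = max(best, sum(tj * yj for tj, yj in zip(t, y)))
    return best


directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -2.0), (0.3, -0.7)]
for t in directions:
    print(t, support_l1(A, t), support_brute_force(A, t))
```

The brute-force enumeration is exponential in the number of noise symbols, whereas the closed form costs one matrix-vector product, which is what makes support functions convenient here.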

We illustrate Lemma 2 in Figure 2. Consider the matrix associated to affine set {(1)-(2)} without its center. Its affine concretisation is the same zonotope as but centered on 0. For , , the -level set corresponds to points on the hyperplane defined by : for , . This hyperplane is orthogonal to the line going through 0, with direction . It intersects at a point such that . Given a direction in , the -level set that intersects with maximal value for realizes by Lemma 2. We now take three vectors such that . For , , we find the maximum of its concretisation on the -axis to be . For , , and , where is the region (or band) between the line orthogonal to depicted as a blue dashed line and its symmetric with respect to zero. For which is orthogonal to a face of the zonotope, and , which is the band between the two parallel faces in green.

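And indeed, for any matrix $A$, $\gamma_{lin}(A)$ is entirely described by providing the set of values $\|At\|_1$, where $t$ varies among all directions in $\mathbb{R}^p$:

###### Lemma 3

For matrices $X \in \mathcal{M}(n_x,p)$ and $Y \in \mathcal{M}(n_y,p)$, we have $\gamma_{lin}(X) \subseteq \gamma_{lin}(Y)$ if and only if $\|Xt\|_1 \le \|Yt\|_1$ for all $t \in \mathbb{R}^p$.

Proof. Suppose first that $\|Xt\|_1 \le \|Yt\|_1$ for all $t \in \mathbb{R}^p$. By the first part of Lemma 1,

$$\gamma_{lin}(X) = \bigcap_{t \in \mathbb{R}^p} \left\{ x \in \mathbb{R}^p \;\middle|\; p_t(x) \in \Big[ \inf_{y \in \gamma_{lin}(X)} p_t(y), \sup_{y \in \gamma_{lin}(X)} p_t(y) \Big] \right\}$$

with $\sup_{y \in \gamma_{lin}(X)} p_t(y) = \|Xt\|_1$ by Lemma 2 and, since $\gamma_{lin}(X)$ is symmetric with respect to 0, $\inf_{y \in \gamma_{lin}(X)} p_t(y) = -\|Xt\|_1$. Thus

$$\begin{aligned}
\gamma_{lin}(X) &= \bigcap_{t \in \mathbb{R}^p} \left\{ x \in \mathbb{R}^p \;\middle|\; |p_t(x)| \le \|Xt\|_1 \right\} \\
&\subseteq \bigcap_{t \in \mathbb{R}^p} \left\{ x \in \mathbb{R}^p \;\middle|\; |p_t(x)| \le \|Yt\|_1 \right\} = \gamma_{lin}(Y).
\end{aligned}$$

Conversely, suppose $\gamma_{lin}(X) \subseteq \gamma_{lin}(Y)$. Then, for all $t \in \mathbb{R}^p$,

$$\|Xt\|_1 = \sup_{x \in \gamma_{lin}(X)} p_t(x) \le \sup_{x \in \gamma_{lin}(Y)} p_t(x) = \|Yt\|_1. \qquad ∎$$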

## 4 Perturbed affine sets

### 4.1 Rationale

Let us get back to the program defining function f in Section 2. We introduce a noise symbol $\varepsilon_1$ to represent the range of values for x. Using for example the sub-optimal join operator described in Lemma 10 to come, the affine set $X$ for x and y at the end of the program will be $\hat{x} = \varepsilon_1$, $\hat{y} = \varepsilon_1 + \eta_1$, with a new (perturbation) noise symbol $\eta_1$. The corresponding zonotope $\gamma(X)$ is depicted in solid red in Figure 3.

Now, a better geometrical abstraction of the abstract value of (x,y) is the zonotope depicted in dashed blue in Figure 3. Since y=x+1 for positive x and y=x-1 for negative x, we only have to include the two segments in solid dark in the smallest zonotope as possible. This is realized easily by a zonotope defined by the faces and . Let us take a new symbol to represent , and to represent . This gives and . Although the corresponding blue zonotope is strictly included in the red zonotope , so it is geometrically more precise, we lose relations to the input values. Indeed, symbols express dependencies to inputs of the program, whereas symbols do not. Thus, computing y minus the input of f, as in the main function of the example, gives . This range is far less precise than using the representation , where we find that this difference is equal to .
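The loss of precision described above can be reproduced with a few lines of Python. The functional representation of y is the input symbol plus one perturbation symbol. For the purely geometric one, the sketch assumes one possible parametrization of the parallelogram with vertices (0,1), (1,2), (0,-1), (-1,-2), using two fresh symbols named `eta2` and `eta3` here; these names and the helper functions are illustrative assumptions.

```python
def sub(a, b):
    """Difference of two centered affine forms (dict: noise symbol -> coefficient)."""
    keys = set(a) | set(b)
    return {k: a.get(k, 0.0) - b.get(k, 0.0) for k in keys}


def interval(a):
    radius = sum(abs(v) for v in a.values())
    return (-radius, radius)


input_x = {"eps1": 1.0}  # input of f, x in [-1, 1]

# Functional representation: y = eps1 + eta1 keeps the link to the input.
functional_y = {"eps1": 1.0, "eta1": 1.0}

# Assumed geometric representation of the tighter parallelogram:
# x = (eta2 - eta3) / 2, y = (3 * eta2 - eta3) / 2, with no reference to eps1.
geometric_x = {"eta2": 0.5, "eta3": -0.5}
geometric_y = {"eta2": 1.5, "eta3": -0.5}

# Vertices of the parallelogram, obtained for eta2, eta3 in {-1, 1}.
vertices = sorted(
    (geometric_x["eta2"] * s2 + geometric_x["eta3"] * s3,
     geometric_y["eta2"] * s2 + geometric_y["eta3"] * s3)
    for s2 in (-1.0, 1.0) for s3 in (-1.0, 1.0)
)

print("vertices:", vertices)
print("f(x) - x, functional:", interval(sub(functional_y, input_x)))
print("f(x) - x, geometric :", interval(sub(geometric_y, input_x)))
```

Although the parallelogram is the smaller set, the difference with the input is bounded less tightly because `eta2` and `eta3` cannot cancel against `eps1`.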

If we were not interested in input/output relations, a classical abstraction based on affine sets would be using the geometrical ordering on zonotopes. We would say that affine set $X$ is less or equal than $Y$ iff $\gamma(X) \subseteq \gamma(Y)$. For the sake of simplicity in the present discussion, suppose that $\gamma(X)$ and $\gamma(Y)$ are centered on 0. By Lemma 3, we would then ask for $\|Xt\|_1 \le \|Yt\|_1$ for all $t \in \mathbb{R}^p$.

Now, being interested in input/output relations, we will keep the existing symbols used to express possible ranges of values of input variables (for instance, $\varepsilon_1$ defines the value of input variable x in the example above), and which should have a very strict interpretation, as well as the noise symbols due to (non linear) arithmetic operations. We call them the central noise symbols (such as $\varepsilon_1$). And, to express uncertainty on these relations due to possibly different execution paths, we will add additional noise symbols which we call perturbation noise symbols (such as $\eta_1$ in the example above).

We now define an ordered structure using these two sets of noise symbols.

### 4.2 Definition

We thus consider perturbed affine sets as Minkowski sums [1] of a central zonotope and of a perturbation zonotope (always centered on 0):

###### Definition 2

We define a perturbed affine set $X$ by the pair of matrices $(C^X, P^X) \in \mathcal{M}(n+1,p) \times \mathcal{M}(m,p)$. We call $C^X$ the central matrix, and $P^X$ the perturbation matrix.

The perturbed affine form $\pi_k(X) = c^X_{0,k} + \sum_{i=1}^{n} c^X_{i,k} \varepsilon_i + \sum_{j=1}^{m} p^X_{j,k} \eta_j$, where the $\varepsilon_i$ are the central noise symbols and the $\eta_j$ the perturbation or union noise symbols, describes the $k$-th variable of $X$. We call $\gamma(C^X)$ the central zonotope and $\gamma_{lin}(P^X)$ the perturbation zonotope.

For instance as defined in Section 4.1 is described by , (first column corresponds to variable x, second column, to y). is described by (the line corresponding to ) and (the first line corresponds to perturbation symbol , the second to ).

### 4.3 Ordered structure

Expressing $X$ less or equal than $Y$ on these perturbed affine sets with the geometrical order yields

$$\|C^X t\|_1 - \|C^Y t\|_1 \le \|P^Y t\|_1 - \|P^X t\|_1, \quad \forall t \in \mathbb{R}^p.$$

But many transformations that leave $\|C^X t\|_1$ and $\|C^Y t\|_1$ fixed for all $t$, and thus preserve that inequality, lose the intended meaning of the central noise symbols. We can fix this easily, by strengthening this preorder. Note that for all $t$, $\|C^X t\|_1 - \|C^Y t\|_1 \le \|(C^X - C^Y) t\|_1$, so defining

$$X \le Y \text{ iff } \|(C^X - C^Y) t\|_1 \le \|P^Y t\|_1 - \|P^X t\|_1$$

should imply the geometrical ordering at least (as we will prove in Lemma 5). The good point is that no transformation on the central noise symbols is allowed any longer using this preorder (as the characterization of the equivalence relation generated by this preorder will show, see Lemma 4), keeping a strict interpretation of the noise symbols describing the values of the input variables, hence the input/output relations.

We now formalize and study this stronger order:

###### Definition 3

Let $X = (C^X, P^X)$, $Y = (C^Y, P^Y)$ be two perturbed affine sets in $\mathcal{M}(n+1,p) \times \mathcal{M}(m,p)$. We say that $X \le Y$ iff

$$\sup_{u \in \mathbb{R}^p} \left( \|(C^Y - C^X) u\|_1 + \|P^X u\|_1 - \|P^Y u\|_1 \right) \le 0.$$
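The quantity inside the supremum of Definition 3 is positively homogeneous in $u$, so for two variables it can be probed on unit directions. The sketch below samples such directions: a strictly positive sample refutes $X \le Y$, while the absence of one is only evidence, not a proof. The three perturbed affine sets and all function names are illustrative assumptions.

```python
import math


def l1(M, u):
    """||M u||_1 for a matrix given as a list of rows."""
    return sum(abs(sum(a * b for a, b in zip(row, u))) for row in M)


def diff(M, N):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(M, N)]


def max_violation(X, Y, samples=720):
    """Largest sampled value of ||(CY - CX)u||_1 + ||PX u||_1 - ||PY u||_1
    over unit directions u of the plane (two variables only)."""
    (CX, PX), (CY, PY) = X, Y
    worst = -math.inf
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        u = (math.cos(angle), math.sin(angle))
        worst = max(worst, l1(diff(CY, CX), u) + l1(PX, u) - l1(PY, u))
    return worst


# Columns are (x, y); C has rows (constants, eps1); P has one row (eta1).
X = ([[0.0, 0.0], [1.0, 1.0]], [[0.0, 1.0]])  # x = eps1, y = eps1 + eta1
Y = ([[0.0, 0.0], [1.0, 1.0]], [[0.0, 2.0]])  # same central part, larger perturbation
W = ([[0.0, 0.0], [1.0, 0.5]], [[0.0, 1.0]])  # different central part, same perturbation

print("X vs Y:", max_violation(X, Y))  # no positive sample expected
print("X vs W:", max_violation(X, W))  # a positive sample refutes X <= W
```

Changing the central part, as in `W`, has to be paid for by a larger perturbation part; this is what distinguishes the order of Definition 3 from plain geometric inclusion.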

Coming back to our example of Section 4.1, but . Take for instance . Then .

###### Lemma 4

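The binary relation $\le$ of Definition 3 is a preorder. The equivalence relation generated by this preorder is $X \sim Y$ iff by definition $X \le Y$ and $Y \le X$. It can be characterized by $C^X = C^Y$ and $\gamma_{lin}(P^X) = \gamma_{lin}(P^Y)$ (geometrically speaking, as sets). We still denote $\le / \sim$ by $\le$ in the rest of the text.

Proof. Reflexivity of $\le$ is immediate. Suppose now $X \le Y$ and $Y \le Z$, then for all $u \in \mathbb{R}^p$:

$$\begin{aligned}
\|(C^Y - C^X) u\|_1 &\le \|P^Y u\|_1 - \|P^X u\|_1 \\
\|(C^Z - C^Y) u\|_1 &\le \|P^Z u\|_1 - \|P^Y u\|_1
\end{aligned}$$

Using the triangular inequality, we get

$$\begin{aligned}
\|(C^Z - C^X) u\|_1 &\le \|(C^Z - C^Y) u\|_1 + \|(C^Y - C^X) u\|_1 \\
&\le \|P^Z u\|_1 - \|P^Y u\|_1 + \|P^Y u\|_1 - \|P^X u\|_1 \\
&= \|P^Z u\|_1 - \|P^X u\|_1
\end{aligned}$$

implying $X \le Z$, hence transitivity of $\le$.

Finally, $X \le Y$ and $Y \le X$ imply that for all $u \in \mathbb{R}^p$, $\|(C^Y - C^X) u\|_1$ is less or equal than $\|P^Y u\|_1 - \|P^X u\|_1$ and is also less or equal than $\|P^X u\|_1 - \|P^Y u\|_1$. Hence $\|(C^Y - C^X) u\|_1 = 0$ for all $u$, meaning $C^X = C^Y$ and $\|P^X u\|_1 = \|P^Y u\|_1$ for all $u$. By Lemma 3 this exactly means that $\gamma_{lin}(P^X) = \gamma_{lin}(P^Y)$. ∎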

###### Lemma 5

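Take $X = (C^X, P^X)$ and $Y = (C^Y, P^Y)$. Then $X \le Y$ implies

$$\gamma \begin{pmatrix} C^X \\ P^X \end{pmatrix} \subseteq \gamma \begin{pmatrix} C^Y \\ P^Y \end{pmatrix}$$

or said in a different manner: $\gamma(C^X) \oplus \gamma_{lin}(P^X) \subseteq \gamma(C^Y) \oplus \gamma_{lin}(P^Y)$, where $\oplus$ denotes the Minkowski sum. Note that $X \le Y$ implies $\gamma_{lin}(P^X) \subseteq \gamma_{lin}(P^Y)$.

Proof. It is easy to prove that $\gamma_{lin} \begin{pmatrix} C^X \\ P^X \end{pmatrix} \subseteq \gamma_{lin} \begin{pmatrix} C^Y \\ P^Y \end{pmatrix}$ given that $X \le Y$, using Lemma 3 and the triangular inequality for $\|\cdot\|_1$.

However, what we want is a little stronger. In order to derive it, we define, for every matrix $A$ of dimension $(n+1) \times p$, a matrix $\tilde{A}$ of dimension $(n+1) \times (p+1)$ by

$$\tilde{A} = \left( \begin{array}{c|c} \begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array} & A \end{array} \right)$$

The interest of this transformation is that the zonotopic concretisation $\gamma(A)$ is a particular face (which is the intersection with a hyperplane) of the 0-centered zonotope $\gamma_{lin}(\tilde{A})$:

$$\gamma(A) = \gamma_{lin}(\tilde{A}) \cap \{ (1, x_1, \ldots, x_p) \mid (x_1, \ldots, x_p) \in \mathbb{R}^p \}. \tag{3}$$

We now prove $\gamma_{lin} \begin{pmatrix} \tilde{C}^X \\ P^X \end{pmatrix} \subseteq \gamma_{lin} \begin{pmatrix} \tilde{C}^Y \\ P^Y \end{pmatrix}$. For all $t = {}^t(t_0, t_1, \ldots, t_p) \in \mathbb{R}^{p+1}$ (matrices with $p$ columns act on the last $p$ components of $t$),

$$\begin{aligned}
&\|\tilde{C}^X t\|_1 - \|\tilde{C}^Y t\|_1 + \|P^X t\|_1 - \|P^Y t\|_1 \\
&= \Big| t_0 + \sum_{k=1}^{p} c^X_{0,k} t_k \Big| - \Big| t_0 + \sum_{k=1}^{p} c^Y_{0,k} t_k \Big| + \big\| (c^X_{i,k})_{1 \le i \le n, 1 \le k \le p} \, {}^t(t_1, \ldots, t_p) \big\|_1 - \big\| (c^Y_{i,k})_{1 \le i \le n, 1 \le k \le p} \, {}^t(t_1, \ldots, t_p) \big\|_1 + \|P^X t\|_1 - \|P^Y t\|_1 \\
&\le \Big| \sum_{k=1}^{p} c^X_{0,k} t_k - \sum_{k=1}^{p} c^Y_{0,k} t_k \Big| + \big\| (c^X_{i,k})_{1 \le i \le n, 1 \le k \le p} \, {}^t(t_1, \ldots, t_p) \big\|_1 - \big\| (c^Y_{i,k})_{1 \le i \le n, 1 \le k \le p} \, {}^t(t_1, \ldots, t_p) \big\|_1 + \|P^X t\|_1 - \|P^Y t\|_1 \\
&\le \|(C^Y - C^X) t\|_1 + \|P^X t\|_1 - \|P^Y t\|_1 \\
&\le 0
\end{aligned}$$

Hence $\gamma_{lin} \begin{pmatrix} \tilde{C}^X \\ P^X \end{pmatrix} \subseteq \gamma_{lin} \begin{pmatrix} \tilde{C}^Y \\ P^Y \end{pmatrix}$ by Lemma 3, which, by (3), implies the result. ∎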

The order we define is in fact essentially more complex than the inclusion ordering, while still being computable:

###### Lemma 6

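The partial order $\le$ is decidable, with a complexity bounded by a polynomial in $p$ and an exponential in $n + m$.

Proof. The problem can be solved using linear programs. Let $X = (C^X, P^X)$, $Y = (C^Y, P^Y)$ be two perturbed affine sets in $\mathcal{M}(n+1,p) \times \mathcal{M}(m,p)$. We want to decide algorithmically whether $X \le Y$, that is

$$\sup_{u \in \mathbb{R}^p} \left( \|(C^Y - C^X) u\|_1 + \|P^X u\|_1 - \|P^Y u\|_1 \right) \le 0.$$

Looking at the proof of Lemma 2, we see that

$$\|Au\|_1 = \sup_{e \in \mathbb{R}^{n+1},\, \|e\|_\infty \le 1} \sum_{i=0}^{n} \Big( \sum_{j=1}^{p} a_{i,j} u_j \Big) e_i$$

and that this bound is reached for $e$ such that for all $i$, $e_i = 1$ or $e_i = -1$.

We therefore produce, for each $e \in \mathbb{R}^{n+1}$, $f \in \mathbb{R}^m$ and $g \in \mathbb{R}^m$ with, for all $i$, $e_i = 1$ or $-1$, $f_i = 1$ or $-1$, and $g_i = 1$ or $-1$, the following linear program:

$$\sup_{u \in \mathbb{R}^p} \left( \sum_{i=0}^{n} \sum_{j=1}^{p} (c^Y_{i,j} - c^X_{i,j}) e_i u_j + \sum_{i=1}^{m} \sum_{j=1}^{p} p^X_{i,j} f_i u_j - \sum_{i=1}^{m} \sum_{j=1}^{p} p^Y_{i,j} g_i u_j \right)$$

subject to

$$\begin{aligned}
\Big( \sum_{j=1}^{p} (c^Y_{i,j} - c^X_{i,j}) u_j \Big) e_i &\ge 0, \quad \forall\, 0 \le i \le n \\
\Big( \sum_{j=1}^{p} p^X_{i,j} u_j \Big) f_i &\ge 0, \quad \forall\, 1 \le i \le m \\
\Big( \sum_{j=1}^{p} p^Y_{i,j} u_j \Big) g_i &\ge 0, \quad \forall\, 1 \le i \le m
\end{aligned}$$

that we solve using any linear program solver (with polynomial complexity). We then check for each problem that it is either not satisfiable or its supremum is negative or zero. ∎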

Fortunately, there is no need to use this general decision procedure in a static analyser by abstract interpretation. We refer the reader to the end of Section 4.6 for a discussion on this point.

### 4.4 Extension of affine arithmetic on perturbed affine forms

#### 4.4.1 Interpretation of assignments and correctness issues

We detail below the interpretation of arithmetic expressions, dealing first with affine assignments, which do not lose any precision. We use a very simple form for the multiplication. There are in fact more precise ways to compute assignments containing polynomial expressions. Firstly, the multiplication formula can be improved, see [8, 11]. Secondly, when interpreting a non-linear assignment, it is better in practice to introduce new noise symbols for the entire expression, and not for every non-linear elementary operation as we present here. But for the sake of simplicity, we do not describe this here. Note also that we would need formally to prove that projections onto a subset of variables (change of scope), and renumbering of variables are monotonic operations, but these are easy checks and we omit them here. Note finally that the proofs of monotonicity of our transfer functions are not only convenient for getting fixpoints for our abstract semantics functionals. They are also necessary for proving the correctness of our approach. As already stated in [11, 13], the correctness criterion we need relies on the property that whenever $X \le Y$ are two perturbed affine sets, all future evaluations using expressions $e$ give smaller concretisations starting with $X$ than starting with $Y$, i.e. $\gamma([\![e]\!]X) \subseteq \gamma([\![e]\!]Y)$. This is proven easily as follows: as $[\![e]\!]$ is a composite of monotonic functions, $[\![e]\!]X \le [\![e]\!]Y$. The conclusion holds because of Lemma 5.

#### 4.4.2 Affine assignments

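We first define the assignment of a possibly unknown constant within bounds to a (new) variable, $x_{p+1} := [a,b]$:

###### Definition 4

Let $X = (C^X, P^X)$ be a perturbed affine set in $\mathcal{M}(n+1,p) \times \mathcal{M}(m,p)$ and $a, b \in \mathbb{R}$. We define $Z = [\![x_{p+1} = [a,b]]\!]X \in \mathcal{M}(n+2,p+1) \times \mathcal{M}(m,p+1)$ with:

• $c^Z_{i,k} = c^X_{i,k}$ for all $i \in [0,n]$ and $k \in [1,p]$,

• $c^Z_{0,p+1} = \frac{a+b}{2}$, $c^Z_{i,p+1} = 0$ for all $i \in [1,n]$ and $c^Z_{n+1,p+1} = \frac{|a-b|}{2}$,

• $c^Z_{n+1,k} = 0$ for all $k \in [1,p]$,

• $p^Z_{j,k} = p^X_{j,k}$ and $p^Z_{j,p+1} = 0$ for all $j \in [1,m]$ and $k \in [1,p]$.

Or in block matrix form,

$$C^Z = \left( \begin{array}{c|c} C^X & \begin{array}{c} \frac{a+b}{2} \\ 0 \\ \vdots \\ 0 \end{array} \\ \hline 0 \; \cdots \; 0 & \frac{|a-b|}{2} \end{array} \right), \qquad P^Z = \left( \begin{array}{c|c} P^X & \begin{array}{c} 0 \\ \vdots \\ 0 \end{array} \end{array} \right).$$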

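We carry on by addition, or more precisely, the operation interpreting the assignment $x_{p+1} := x_i + x_j$ and adding the new variable $x_{p+1}$ to the affine set:

###### Definition 5

Let $X = (C^X, P^X)$ be a perturbed affine set in $\mathcal{M}(n+1,p) \times \mathcal{M}(m,p)$. We define $Z = [\![x_{p+1} = x_i + x_j]\!]X \in \mathcal{M}(n+1,p+1) \times \mathcal{M}(m,p+1)$ by

$$C^Z = \left( \begin{array}{c|c} C^X & \begin{array}{c} c^X_{0,i} + c^X_{0,j} \\ \vdots \\ c^X_{n,i} + c^X_{n,j} \end{array} \end{array} \right) \text{ and } P^Z = \left( \begin{array}{c|c} P^X & \begin{array}{c} p^X_{1,i} + p^X_{1,j} \\ \vdots \\ p^X_{m,i} + p^X_{m,j} \end{array} \end{array} \right).$$

Finally, we give a meaning to the interpretation of assignments of the form $x_{p+1} := \lambda x_i$, for $\lambda \in \mathbb{R}$:

###### Definition 6

Let $X = (C^X, P^X)$ be a perturbed affine set in $\mathcal{M}(n+1,p) \times \mathcal{M}(m,p)$. We define $Z = [\![x_{p+1} = \lambda x_i]\!]X \in \mathcal{M}(n+1,p+1) \times \mathcal{M}(m,p+1)$ by

$$C^Z = \left( \begin{array}{c|c} C^X & \begin{array}{c} \lambda c^X_{0,i} \\ \vdots \\ \lambda c^X_{n,i} \end{array} \end{array} \right) \text{ and } P^Z = \left( \begin{array}{c|c} P^X & \begin{array}{c} \lambda p^X_{1,i} \\ \vdots \\ \lambda p^X_{m,i} \end{array} \end{array} \right).$$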
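The three affine transfer functions amount to appending one column to each matrix (and one row to the central matrix for a fresh input symbol). The following sketch implements them on lists of rows; it assumes that the interval assignment introduces a fresh central noise symbol with radius $|b-a|/2$, and the function names are hypothetical, not taken from FLUCTUAT or APRON.

```python
def assign_interval(X, a, b):
    """x_{p+1} := [a, b], assumed to use a fresh central noise symbol."""
    C, P = X
    p = len(C[0])
    new_C = [row + [0.0] for row in C]
    new_C[0][p] = (a + b) / 2.0                    # constant term of the new variable
    new_C.append([0.0] * p + [abs(b - a) / 2.0])   # row of the fresh noise symbol
    new_P = [row + [0.0] for row in P]
    return new_C, new_P


def assign_sum(X, i, j):
    """x_{p+1} := x_i + x_j (variables are numbered from 1)."""
    C, P = X

    def extend(M):
        return [row + [row[i - 1] + row[j - 1]] for row in M]

    return extend(C), extend(P)


def assign_scale(X, lam, i):
    """x_{p+1} := lam * x_i."""
    C, P = X

    def extend(M):
        return [row + [lam * row[i - 1]] for row in M]

    return extend(C), extend(P)


def interval_of(X, k):
    """Range of the k-th variable: constant term plus or minus the l1 radius."""
    C, P = X
    radius = sum(abs(row[k - 1]) for row in C[1:]) + sum(abs(row[k - 1]) for row in P)
    return (C[0][k - 1] - radius, C[0][k - 1] + radius)


# Perturbed affine set of Section 4.1: x1 = eps1, x2 = eps1 + eta1.
X = ([[0.0, 0.0], [1.0, 1.0]], [[0.0, 1.0]])
X3 = assign_scale(X, -1.0, 1)        # x3 = -x1
X4 = assign_sum(X3, 2, 3)            # x4 = x2 - x1
X5 = assign_interval(X4, 2.0, 4.0)   # x5 = [2, 4]

print("x2 - x1 in", interval_of(X4, 4))
print("x5 in", interval_of(X5, 5))
```

On the example of Section 4.1, the central symbol cancels in `x2 - x1`, leaving only the perturbation symbol, which is the bound on `f(x) - x` announced in Section 2.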

We can prove the correctness of our abstract semantics:

###### Lemma 7

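Operations $[\![x_{p+1} = [a,b]]\!]$, $[\![x_{p+1} = x_i + x_j]\!]$ and $[\![x_{p+1} = \lambda x_i]\!]$ are increasing over perturbed affine sets. Moreover these three operations do not introduce over-approximations.

Proof. Suppose we are given two perturbed affine sets $X$ and $Y$ such that $X \le Y$.

First, for constant assignments, we have, for all $t \in \mathbb{R}^{p+1}$:

$$\begin{aligned}
\big\| \big( C^{[\![x_{p+1} = [a,b]]\!]X} - C^{[\![x_{p+1} = [a,b]]\!]Y} \big) t \big\|_1 &= \big\| (C^X - C^Y) \, {}^t(t_1, \ldots, t_p) \big\|_1 \\
&\le \big\| P^Y \, {}^t(t_1, \ldots, t_p) \big\|_1 - \big\| P^X \, {}^t(t_1, \ldots, t_p) \big\|_1 \\
&\le \big\| P^{[\![x_{p+1} = [a,b]]\!]Y} t \big\|_1 - \big\| P^{[\![x_{p+1} = [a,b]]\!]X} t \big\|_1
\end{aligned}$$

which shows monotonicity of $[\![x_{p+1} = [a,b]]\!]$. The concretisation of $[\![x_{p+1} = [a,b]]\!]X$ is obviously exact.

Now for addition of variables, we have, for all $t \in \mathbb{R}^{p+1}$:

$$\begin{aligned}
&\big\| \big( C^{[\![x_{p+1} = x_i + x_j]\!]X} - C^{[\![x_{p+1} = x_i + x_j]\!]Y} \big) t \big\|_1 \\
&= \sum_{l=0}^{n} \Big| \sum_{k=1}^{p+1} \big( c^{[\![x_{p+1} = x_i + x_j]\!]X}_{l,k} - c^{[\![x_{p+1} = x_i + x_j]\!]Y}_{l,k} \big) t_k \Big| \\
&= \sum_{l=0}^{n} \Big| \sum_{k=1}^{p} (c^X_{l,k} - c^Y_{l,k}) t_k + \big( c^X_{l,i} + c^X_{l,j} - c^Y_{l,i} - c^Y_{l,j} \big) t_{p+1} \Big| \\
&= \big\| (C^X - C^Y) \, {}^t(t_1, \ldots, t_i + t_{p+1}, \ldots, t_j + t_{p+1}, \ldots, t_p) \big\|_1 \\
&\le \big\| P^Y \, {}^t(t_1, \ldots, t_i + t_{p+1}, \ldots, t_j + t_{p+1}, \ldots, t_p) \big\|_1 - \big\| P^X \, {}^t(t_1, \ldots, t_i + t_{p+1}, \ldots, t_j + t_{p+1}, \ldots, t_p) \big\|_1 \\
&= \big\| P^{[\![x_{p+1} = x_i + x_j]\!]Y} t \big\|_1 - \big\| P^{[\![x_{p+1} = x_i + x_j]\!]X} t \big\|_1
\end{aligned}$$

which shows monotonicity of $[\![x_{p+1} = x_i + x_j]\!]$. The concretisation of $[\![x_{p+1} = x_i + x_j]\!]X$ is obviously exact.

And finally, we have, for all $t \in \mathbb{R}^{p+1}$: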