Compositional Verification of Procedural Programs using Horn Clauses over Integers and Arrays

Abstract

We present a compositional SMT-based algorithm for safety of procedural C programs that takes the heap into consideration as well. Existing SMT-based approaches are either largely restricted to handling linear arithmetic operations and properties, or are non-compositional. We use Constrained Horn Clauses (CHCs) to represent the verification conditions where the memory operations are modeled using the extensional theory of arrays (ARR). First, we describe an exponential time quantifier elimination (QE) algorithm for ARR which can introduce new quantifiers of the index and value sorts. Second, we adapt the QE algorithm to efficiently obtain under-approximations using models, resulting in a polynomial time Model Based Projection (MBP) algorithm. Third, we integrate the MBP algorithm into the framework of compositional reasoning of procedural programs using may and must summaries recently proposed by us. Our solutions to the CHCs are currently restricted to quantifier-free formulas. Finally, we describe our practical experience over SV-COMP’15 benchmarks using an implementation in the tool Spacer.

I Introduction

Under-approximating a projection (i.e., existential quantification), for example in computing an image, is a key aspect of many techniques of symbolic model checking. A typical (though not ubiquitous) approach to this is what we call Model-based Projection (MBP) [spacer_cav14]: we generalize a particular point in the space of the image (obtained using a model) to a subset of the image that contains it. In some cases, the purpose is to compute the exact image by a series of under-approximations [gupta]. In other cases, such as IC3 [ic3], the purpose of MBP is to produce a relevant proof sub-goal. When the number of possible generalizations is finite, we say that we have a finite MBP which allows us to compute the exact image by iterative sampling, or to guarantee that the branching in our proof search is finite.
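As a small illustration of the idea, consider propositional formulas in disjunctive normal form. The sketch below is not the MBP used in Spacer; the names `satisfies`, `mbp` and `exact_projection` and the dictionary encoding of cubes are assumptions made only for this example. A model selects one disjunct that it satisfies, and dropping the projected variables from that disjunct yields a generalization that contains the model. Since there are only finitely many disjuncts, this MBP is finite, and iterative sampling recovers the exact image.

```python
from itertools import product

# A cube is a dict {variable: bool}; a formula is a list of cubes (DNF).
# This encoding is hypothetical and only serves the illustration.

def satisfies(model, cube):
    """True if the model agrees with every literal of the cube."""
    return all(model[v] == val for v, val in cube.items())

def mbp(formula, project, model):
    """Generalize `model` to a cube that under-approximates
    (exists project . formula) and still contains the model."""
    for cube in formula:
        if satisfies(model, cube):
            # For a single cube, dropping the projected literals is exact.
            return {v: val for v, val in cube.items() if v not in project}
    raise ValueError("model does not satisfy the formula")

def exact_projection(formula, project, variables):
    """Compute the exact projection by sampling one model at a time."""
    result = []
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if any(satisfies(model, c) for c in formula):
            # Only sample models that are not yet covered by the result.
            if not any(satisfies(model, r) for r in result):
                result.append(mbp(formula, project, model))
    return result

# (x and y) or (not x and z), projecting away x
phi = [{"x": True, "y": True}, {"x": False, "z": True}]
image = exact_projection(phi, {"x"}, ["x", "y", "z"])
```

Here two samples suffice: the image is the disjunction of the cubes `z` and `y`.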

The feasibility of a finite MBP depends on the underlying logical theory. Finite MBPs exist for propositional logic [gupta, gpdr] and Linear Integer Arithmetic (LIA) with a divisibility predicate [spacer_cav14], and have been applied in both hardware and software model checking. LIA is often adequate for software verification, provided that heap and array accesses can be eliminated. This can be done by abstraction, or by inlining all procedures and performing compiler optimizations to lower memory into registers (e.g., [ufo, seahorn_svcomp15]). However, the inlining approach has many drawbacks. It can expand the program size exponentially, it cannot handle recursion, and it is not always feasible to eliminate heap and array accesses.

We address this issue here by considering the problem of MBP for the extensional theory of arrays (ARR). We find that a finite MBP exists that can be computed in polynomial time when only array-valued variables are projected. Projecting variables of index and value sorts is not always possible, since the quantifier-free fragments of the theory combinations are not guaranteed to be closed under projection. We therefore take a pragmatic approach to MBP that may not always converge to the exact projection. This allows us to handle, for example, the combination of ARR and LIA.

We test the effectiveness of this approach using the model checking framework of Spacer [spacer_cav14]. This SMT-based framework makes use of MBP to produce proof sub-goals for Hoare-style procedure-modular proofs of recursive programs. The ability to reason with ARR makes it possible to handle heap-allocating programs without inlining procedures, as the heap can be faithfully modeled using ARR [seahorn]. This leads to significant improvements in scalability, when compared to the use of LIA alone with inlining, as measured using benchmark programs from the 2015 Software Verification Competition (SVCOMP 2015) [svcomp15]. Not inlining the programs also has the advantage that we generate procedure-modular proofs (containing procedure summaries) that might be re-usable in various ways (e.g., [evolcheck]).

In summary, we (a) describe an exponential rewriting procedure for projecting array variables (Sec. LABEL:sec:qe), (b) adapt this procedure to obtain a polynomial-time (per model) finite MBP for projecting array variables (Sec. LABEL:sec:mbp), (c) integrate this with existing MBP procedures for Linear Arithmetic (Sec. LABEL:sec:arr_lia) in the Spacer framework obtaining a new compositional proof search algorithm (Sec. III), and (d) evaluate the algorithm experimentally using SVCOMP benchmarks (Sec. IV).

II Preliminaries

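We consider a first-order language with equality whose signature contains basic sorts (e.g., bool of Booleans, int of integers, etc.) and array sorts. An array sort $\mathit{Arr}_{I,V}$ is parameterized by a sort of indices $I$ and a sort of values $V$. We assume that $I$ is always a basic sort. For every array sort $\mathit{Arr}_{I,V}$, the language has the usual function symbols $\mathit{rd}$ and $\mathit{wr}$ for reading from and writing to the array. Intuitively, $\mathit{rd}(a, i)$ denotes the value stored in the array $a$ at the index $i$ and $\mathit{wr}(a, i, v)$ denotes the array obtained from $a$ by replacing the value at the index $i$ by $v$. We use the following axioms for the extensional theory of arrays (ARR):

Read-after-write

$$\forall a, i, j, v.\ \big(i = j \Rightarrow \mathit{rd}(\mathit{wr}(a, i, v), j) = v\big) \land \big(i \neq j \Rightarrow \mathit{rd}(\mathit{wr}(a, i, v), j) = \mathit{rd}(a, j)\big)$$

Extensionality

$$\forall a, b.\ \big(\forall i.\ \mathit{rd}(a, i) = \mathit{rd}(b, i)\big) \Rightarrow a = b$$

Intuitively, the first schema says that after modifying an array $a$ at index $i$, a read results in the new value at index $i$ and $\mathit{rd}(a, j)$ at every other index $j$. The second schema says that if two arrays agree on the values at every index location, the arrays are equal. We use an over-bar to denote a vector. We write $\bar{t} : S$ to denote that every term in vector $\bar{t}$ has sort $S$, $\bar{t}_k$ to denote the $k$th component of $\bar{t}$, and $t \in \bar{t}$ to denote that $t$ is equal to some component of $\bar{t}$, i.e., $\bigvee_k t = \bar{t}_k$. Let $\bar{i}$ and $\bar{v}$ be vectors of index and value terms of the same length $n$. We write $\mathit{wr}(a, \bar{i}, \bar{v})$ to denote $\mathit{wr}(\ldots \mathit{wr}(\mathit{wr}(a, \bar{i}_1, \bar{v}_1), \bar{i}_2, \bar{v}_2) \ldots, \bar{i}_n, \bar{v}_n)$. Unless specified otherwise, the signature contains no other symbols.

For arrays $a$ and $b$ of sort $\mathit{Arr}_{I,V}$, and a (possibly empty) vector of index terms $\bar{i}$, we write $a =_{\bar{i}} b$ to denote $\forall j.\ (j \notin \bar{i} \Rightarrow \mathit{rd}(a, j) = \mathit{rd}(b, j))$ and call such formulas partial equalities [stump]. Using extensionality, one can easily show the following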

(1)
(2)
(3)
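The array operations and partial equalities can be made concrete with a small executable model. The following is a minimal sketch under stated assumptions: the class `Arr`, the functions `rd`, `wr` and `partial_eq`, and the representation of an array as a default value plus finitely many exceptions are all choices made for this illustration. In particular, `partial_eq` inspects only a finite range of indices, so it is a bounded check rather than the universally quantified definition.

```python
# Hypothetical model: an array is a default value plus finitely many exceptions.
class Arr:
    def __init__(self, default, entries=None):
        self.default = default
        # Keep only entries that differ from the default, so that structural
        # equality coincides with extensional equality over integer indices.
        self.entries = {i: v for i, v in (entries or {}).items() if v != default}

    def __eq__(self, other):
        return self.default == other.default and self.entries == other.entries

def rd(a, i):
    """Read the value stored in array a at index i."""
    return a.entries.get(i, a.default)

def wr(a, i, v):
    """Return a new array that equals a except that index i holds v."""
    entries = dict(a.entries)
    entries[i] = v
    return Arr(a.default, entries)

def partial_eq(a, b, skip, indices):
    """a and b agree on every index in `indices` that is not in `skip`."""
    return all(rd(a, j) == rd(b, j) for j in indices if j not in skip)

a = Arr(0)
b = wr(a, 3, 7)
assert rd(b, 3) == 7           # read-after-write at the written index
assert rd(b, 5) == rd(a, 5)    # read-after-write at any other index
assert wr(b, 3, 0) == a        # equal contents imply equal arrays
assert partial_eq(a, b, skip=[3], indices=range(10))
```

With an empty `skip` vector the partial equality coincides with ordinary equality on the inspected indices, which is why `a` and `b` above are partially equal only when index 3 is excluded.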

We write $\varphi(\bar{x})$ for a formula $\varphi$ with free variables $\bar{x}$, and we treat $\varphi$ as a predicate over $\bar{x}$. We also write $\varphi[t]$ to indicate that a term or formula $t$ occurs in $\varphi$ at some syntactic position.

Given formulas and with and , a Craig Interpolant [craig], denoted , is a formula such that and .

III The Compositional Verification Framework

MBP plays a crucial role in enabling the search for compositional proofs. In this section, we will consider the role played by MBP in a model checking framework called Spacer [spacer_cav14]. In this framework, MBP is used to create succinct localized proof sub-goals that make it possible to reason about only one procedure at a time. The proof goals take the form of under-approximate summaries, either of the calling context of a procedure or of the procedure itself. Without some form of projection, Spacer would not be compositional, as it would build up formulas of exponential size, in effect inlining procedures to create bounded model checking formulas.

III-A Modeling programs with CHCs

Spacer checks safety of procedural programs by reducing the problem to SMT of a special kind of formula known as Constrained Horn Clauses (CHCs) [bmr12, spacer_cav14, seahorn]. We augment the signature with a set $\mathcal{P}$ of fresh predicate symbols. A Constrained Horn Clause (CHC) is a formula of the form

$$\forall \bar{x}.\ \underbrace{\bigwedge_{k=1}^{n} P_k(\bar{x}_k) \land \varphi(\bar{x})}_{\text{body}} \Rightarrow h(\bar{x})$$

where for each $k$, $P_k$ is a symbol in $\mathcal{P}$, and $|\bar{x}_k|$ is equal to the arity of $P_k$. The constraint $\varphi$ is a formula over the signature, and $h$ is either an application of a predicate in $\mathcal{P}$ or another formula over the signature. We use body to refer to the antecedent of the CHC, as shown above. A CHC is called a query if $h$ is a formula over the signature; otherwise, it is called a rule. If $n \le 1$ in the body, the CHC is linear; otherwise, it is non-linear. Following the convention of logic programming literature, we also write the above CHC as $h(\bar{x}) \leftarrow P_1(\bar{x}_1), \ldots, P_n(\bar{x}_n), \varphi(\bar{x})$.

Intuitively, each predicate symbol represents an unknown partial correctness specification of a procedure (that is, an over-approximate summary). A query defines a property to be proved, while each rule gives a modular verification condition for one procedure. A satisfying assignment to the symbols is thus a certificate that the program satisfies its specification and corresponds to the annotations in a Floyd/Hoare style proof. In this work, we are interested in finding annotations that can be expressed in the quantifier-free fragment of our first-order language, to avoid the difficulty of reasoning with quantifiers.
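The distinction between queries and rules, and between linear and non-linear clauses, can be captured by a very small data structure. The sketch below is only an illustration: the classes `App` and `CHC`, their field names, and the choice to keep constraints as plain strings are assumptions of this example and do not reflect how Spacer or SeaHorn represent clauses.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class App:
    """Application of an unknown predicate symbol to argument variables."""
    pred: str
    args: Tuple[str, ...]

@dataclass(frozen=True)
class CHC:
    body_preds: Tuple[App, ...]   # predicate applications in the body
    constraint: str               # background-theory constraint, kept as text
    head: Optional[App]           # None means the head is a plain formula
    head_formula: str = "false"   # only meaningful when head is None

    def is_query(self):
        """A query has a formula, not a predicate application, as its head."""
        return self.head is None

    def is_linear(self):
        """Linear clauses have at most one predicate application in the body."""
        return len(self.body_preds) <= 1

# Hypothetical clauses over a single predicate symbol P.
init = CHC((), "x = 0", App("P", ("x",)))
step = CHC((App("P", ("x",)), App("P", ("y",))), "z = x + y", App("P", ("z",)))
query = CHC((App("P", ("x",)),), "x < 0", None)
```

In this example `init` is a linear rule, `step` is a non-linear rule (two applications of `P` in its body, as arises from a procedure call), and `query` is a linear query.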

Any given set of CHCs encoding safety of procedural programs can be transformed to an equisatisfiable set of just three CHCs with a single predicate symbol (encoding the program location using a variable). These CHCs have the following form:

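$$
\begin{array}{l}
\mathit{Init}(\bar{x}) \Rightarrow P(\bar{x}) \\
P(\bar{x}) \land \mathit{Bad}(\bar{x}) \Rightarrow \bot \\
P(\bar{x}) \land P(\bar{x}^o) \land \mathit{Tr}(\bar{x}, \bar{x}^o, \bar{x}') \Rightarrow P(\bar{x}')
\end{array}
\tag{4}
$$

Intuitively, $P$ is the program invariant, $\bar{x}$ denotes the pre-state of a program transition, $\bar{x}'$ denotes the post-state, and $\bar{x}^o$ denotes the summary of a procedure call (if one is made). If there are no procedure calls, $\mathit{Tr}$ is independent of $\bar{x}^o$ and $P(\bar{x}^o)$ can be dropped: in this case $P$ denotes an inductive invariant of an ordinary transition system. In the sequel, we restrict to this normal form and consider only quantifier-free interpretations of the predicate $P$.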
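For the special case without procedure calls, the three clauses amount to the familiar conditions on a safe inductive invariant: it contains the initial states, it is closed under transitions, and it excludes the bad states. The following sketch checks these conditions by brute force on a toy system. The system itself (a counter stepping by 2 modulo 10) and the names `init`, `tr`, `bad` and `is_safe_inductive_invariant` are assumptions chosen for illustration only.

```python
# Hypothetical example system: a counter that starts at 0 and steps by 2 modulo 10.
STATES = range(10)

def init(x):
    return x == 0

def tr(x, x_next):
    return x_next == (x + 2) % 10

def bad(x):
    return x == 7

def is_safe_inductive_invariant(inv):
    """Check the three normal-form clauses by enumeration over STATES."""
    # Every initial state satisfies the invariant.
    initiation = all(inv(x) for x in STATES if init(x))
    # Every successor of an invariant state satisfies the invariant.
    consecution = all(
        inv(y) for x in STATES if inv(x) for y in STATES if tr(x, y)
    )
    # No invariant state is bad.
    safety = not any(inv(x) and bad(x) for x in STATES)
    return initiation and consecution and safety

assert is_safe_inductive_invariant(lambda x: x % 2 == 0)
assert not is_safe_inductive_invariant(lambda x: True)    # contains the bad state
assert not is_safe_inductive_invariant(lambda x: x == 0)  # not closed under tr
```

The predicate "x is even" is a solution for this toy instance, while the two rejected candidates each violate exactly one of the clauses.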

It is useful to rewrite the above rules using a function that substitutes given predicates and for the occurrences of in the rule bodies. That is, let

The rules are thus equivalent to . Abusing notation, we will also write for .

III-B The Spacer framework

Spacer is a general framework that can be instantiated for a given logical theory $\mathcal{T}$ by supplying three elements: (a) a model-generating SMT solver for $\mathcal{T}$, (b) an MBP procedure Mbp for $\mathcal{T}$, and (c) an interpolation procedure Itp for $\mathcal{T}$. Compared to other SMT-based algorithms (e.g., [whale, hsf, ultimate, duality]), the key distinguishing feature of Spacer is compositional reasoning. That is, instead of checking satisfiability of large formulas generated by program unwinding, Spacer iteratively creates and checks local reachability queries for individual procedures. In this way it is similar to IC3 [ic3, pdr], a SAT-based algorithm for safety of finite-state transition systems, and GPDR [gpdr], its extension to Linear Real Arithmetic. Like these methods, Spacer maintains a sequence of over-approximations of procedure behaviors, called may summaries, corresponding to program unwindings. However, unlike other approaches, Spacer also maintains under-approximations of procedure behaviors, called must summaries, to avoid redundant reachability queries. Another distinguishing feature of Spacer is the use of MBP for efficiently handling existentially quantified formulas to create a new query or a must summary. We note, however, that MBP is a general technique and can be exploited in IC3/PDR as well.1

Alg. 1 gives a simplified description of Spacer as a solver for CHCs in the form of (4) (though Spacer handles general CHCs). It is described using a set of rules that can be applied non-deterministically. Each rule is presented as a guarded command “[ grd ] cmd”, where cmd can be executed only if grd holds.

Input: Formulas
Output: Inductive invariant (FO interpretation of satisfying (4)) or Unsafe
if satisfiable then return Unsafe
// initialize data structures
  // set of pairs
  // max level, or recursion depth
,   // may summary sequence
  // must summary
forever non-deterministically do
    (Candidate) [ satisfiable ]
        , for some
    (DecideMust) [ , ]
    (DecideMay) [ , ]
    (Leaf) [ , , ]
    (Successor) [ , ]
    (Conflict) [ , ]
        ,
    (Induction) [ , ]
        ,
    (Unfold) [ ]
    (Safe) [ ] return invariant
    (Unsafe) [ satisfiable ] return Unsafe
Algorithm 1 Rule-based description of Spacer.

As shown in Alg. 1, Spacer maintains a set of reachability queries , a sequence of may summaries , and a must summary . Intuitively, a query corresponds to checking if is reachable for recursion depth , over-approximates the reachable states for recursion depth , and under-approximates the reachable states. denotes the current bound on recursion depth. The sequence of may summaries and correspond to the trace of approximations and the maximum level in IC3/PDR, respectively. For convenience, let be . , for a formula and model , denotes the result of some MBP function associated with for the model .

Alg. 1 initializes to 0 and, and to . Candidate initiates a backward search for a counterexample beginning with a set of states in . The potential counterexample is expanded using either DecideMust or DecideMay. DecideMust jumps over the call , in the last CHC of (4), utilizing the must summary . DecideMay, on the other hand, creates a query for the call using the may summary of its calling context. Successor updates when a query is known to be reachable. The other rules are similar to IC3 [ic3] and GPDR [gpdr] and we skip their explanation in the interest of space. Spacer is sound and if Mbp utilizes finite MBP functions, Spacer also terminates for a fixed  [spacer_cav14].

III-C Instantiation for ARR+LIA

In instantiating this framework for ARR+LIA, the key ingredient is the MBP procedure of the previous section. An interpolation procedure Itp can be trivially obtained by using a literal-dropping approach based on UNSAT cores, or a more sophisticated approach can be taken (e.g., see [gpdr, duality]).

Because we do not have a finite MBP, Spacer is not guaranteed to terminate even for a fixed bound on the recursion depth. That is, it can generate an infinite sequence of queries and must summaries. Note that MBP is used in three rules: DecideMay, DecideMust, and Successor. The elimination of quantifiers in Successor is only an optimization and can be avoided. In DecideMay and DecideMust, however, it cannot be avoided without changing the structure of the queries, the considerations of which are outside the scope of this paper. In the following, we identify restrictions on the CHCs under which termination is still guaranteed and, for the other cases, we propose some heuristic modifications to Mbp and Itp to help avoid divergence.

Equality resolution in Mbp

There are several cases where terms over combined signatures appear in conjunction with equality terms over the index quantifier, e.g., for a term independent of . In these cases, the quantifier can be eliminated using equality resolution, e.g., in the above example. Such cases seem to be natural in the case of a single procedure, i.e., when in (4) is independent of . Consider a disjunct in a DNF representation of . Now, represents a path in the procedure and typically, index terms (in reads and writes) in can be ordered such that every index term is a function of the previous index terms or the current-state variables . This makes it possible to eliminate any index variables in using equality resolution as mentioned above.
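Equality resolution itself is a simple syntactic step: if a conjunction contains an equality between the quantified variable and a term that does not mention it, the variable can be replaced by that term everywhere and the equality dropped. The sketch below illustrates this on a tuple-encoded term language. The encoding and the function names `occurs`, `substitute` and `resolve_equality` are assumptions of this example, not the data structures of the implementation.

```python
# Terms and literals are nested tuples, e.g. ("var", "i"), ("const", 1),
# ("+", t1, t2), ("rd", a, i), ("=", t1, t2). This encoding is hypothetical.

def occurs(var, term):
    """True if the variable named `var` occurs in the tuple-encoded term."""
    if term == ("var", var):
        return True
    return isinstance(term, tuple) and any(occurs(var, t) for t in term[1:])

def substitute(var, repl, term):
    """Replace every occurrence of the variable `var` in `term` by `repl`."""
    if term == ("var", var):
        return repl
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(var, repl, t) for t in term[1:])
    return term

def resolve_equality(var, literals):
    """Eliminate `var` from a conjunction using an equality var = t, if any."""
    for k, lit in enumerate(literals):
        if lit[0] != "=":
            continue
        for lhs, rhs in ((lit[1], lit[2]), (lit[2], lit[1])):
            if lhs == ("var", var) and not occurs(var, rhs):
                rest = literals[:k] + literals[k + 1:]
                return [substitute(var, rhs, l) for l in rest]
    return None  # no suitable equality; some other projection step is needed

# exists i . i = x + 1 and rd(a, i) > 0
x_plus_1 = ("+", ("var", "x"), ("const", 1))
conj = [("=", ("var", "i"), x_plus_1),
        (">", ("rd", ("var", "a"), ("var", "i")), ("const", 0))]
projected = resolve_equality("i", conj)
```

For the example conjunction, the index variable `i` disappears and the remaining literal reads the array at `x + 1`. When no such equality exists the function returns `None`, which is the situation where the heuristics discussed in this section become relevant.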

Privileging array equalities

Here is a simple example that exhibits non-termination:

Here, intuitively, denotes the summary of a procedure which takes an array as input and produces as output and we are interested in checking if there is sign change in the value at an index as a result of the procedure call. For this example, DecideMay creates queries of the form where is a specific integer constant. If Itp returns interpolants of the form , it is easy to see that Spacer would not terminate even for , even though there is a trivial solution: .

To alleviate this problem, we modify Mbp and Itp to promote the use of array equalities in interpolants. Let be the result of Mbp for a given model . For every pair of array terms , in , we strengthen with the array equality or disequality , depending on whether holds or not. In the above example, the queries will now be of the form . However, continues to be an interpolant whereas the desired interpolant is . To reduce the dependence on specific integer constants in the learned interpolants, and hence in the may summaries, we modify Itp as follows. Suppose we are computing an interpolant for (as occurs in Conflict). We let where contains all the literals where an integer quantifier is substituted using its interpretation in a model. Using a minimal unsatisfiable subset (MUS) algorithm, we can generalize to such that is unsatisfiable and then obtain . In the above example, for we have , , and . One can show that is simply and the only possible interpolant is . In our implementation, we add such (dis-)equalities on-demand in a lazy fashion. Note that adding such (dis-)equalities to the queries is only a heuristic and may not always help with termination.
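The generalization step relies on finding a minimal unsatisfiable subset of a set of literals. One standard way to do this is the deletion-based algorithm: try removing each literal in turn and keep it removed whenever the remainder is still unsatisfiable. The sketch below shows this algorithm with a brute-force satisfiability check over a small finite domain standing in for the SMT solver. The function names, the encoding of literals as named Python predicates, and the example literals are assumptions for illustration, and we do not claim this is the MUS algorithm used in the implementation.

```python
from itertools import product

def satisfiable(constraints, variables, domain):
    """Brute-force check for an assignment over `domain` satisfying all constraints."""
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(pred(env) for _, pred in constraints):
            return True
    return False

def deletion_mus(constraints, variables, domain):
    """Shrink an unsatisfiable list of (name, predicate) pairs to a minimal one."""
    assert not satisfiable(constraints, variables, domain)
    core = list(constraints)
    for item in list(constraints):
        trial = [c for c in core if c is not item]
        if not satisfiable(trial, variables, domain):
            core = trial  # item is not needed for unsatisfiability
    return core

# Hypothetical literals: the first one pins an index to a model value.
literals = [
    ("i == 3", lambda e: e["i"] == 3),
    ("v > 0", lambda e: e["v"] > 0),
    ("v < 0", lambda e: e["v"] < 0),
    ("i >= 0", lambda e: e["i"] >= 0),
]
core = deletion_mus(literals, ["i", "v"], range(-5, 6))
core_names = [name for name, _ in core]
```

In this toy instance the literal that fixes the index to a specific constant is dropped from the core, which mirrors the goal of reducing the dependence of learned interpolants on specific integer constants.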

IV Experimental Results

As noted in the introduction, the array theory allows us to model heap references accurately. This eliminates the need to inline procedures so that heap-allocated objects are reduced to local variables. We hypothesize that the resulting increase in modularity will allow Spacer to more efficiently verify procedural programs using ArrayMbp, in spite of the potential for divergence due to non-finiteness of the MBP.

We test this hypothesis using a prototype implementation of Spacer with ArrayMbp.2 To verify C programs, we use SeaHorn [seahorn], which uses the LLVM infrastructure to compile and optimize the input program, then encodes the verification conditions as CHCs in the SMT-LIB2 format. SeaHorn can optionally inline procedure calls before encoding, allowing us to test our hypothesis regarding modularity.

For reference, we also compare Spacer to the implementation of GPDR [gpdr] in Z3 [z3]. A key difference between Spacer and GPDR is that the latter does not use must summaries. Z3 also uses MBP, but is limited to equality resolution and the substitution method. As a result, Z3 GPDR is effective only for inlined programs.

We use benchmarks from the software verification competition SVCOMP’15 [svcomp15]. We considered the 215 benchmarks from the Device Drivers category where Z3 GPDR (with inlining) needed more than a minute of runtime or did not terminate within the resource limits of SVCOMP [seahorn_svcomp15]. All experiments have been carried out using a 2.2 GHz AMD Opteron(TM) Processor 6174 and 516GB RAM, running Ubuntu Linux. Our resource limits are 30 minutes and 15GB for each verification task. In the scatter plots that follow, a diamond indicates a time-out, a star indicates a mem-out, and a box indicates an anomaly in the implementation.

Fig. 1: Advantage of inter-procedural encoding using Spacer.
Fig. 2: Spacer vs. Z3 on hard SVCOMP benchmarks with inlining.

The scatter plot in Fig. 1 compares the combined run time for the CHC encoding and verification, when inlining is turned on and off. A clear advantage is seen in the non-inlining case. This shows that Spacer is able to effectively exploit the additional modularity that is made possible by ArrayMbp, and that this advantage outweighs any occurrences of divergence due to non-finite MBP.3 We note that Spacer with only LIA is able to handle only a small fraction of the non-inlined benchmarks. This result confirms our hypothesis.

For reference, we also compare to the performance of Z3 GPDR. We observed that without ArrayMbp, Z3 is very ineffective in the non-inlined case. We should mention, however, that of the 7 unsafe programs verified by Z3, 5 could not be verified by Spacer. Fig. 2 compares Spacer and Z3 with inlining on. This shows an overwhelming advantage for Spacer, which is due to its more effective MBP approach.

V Related Work

There are several SMT-based approaches for sequential program verification that iteratively check satisfiability of formulas corresponding to safety of various unwindings of the program [whale, hsf, ultimate, duality]. However, these monolithic SMT formulas can grow exponentially. In contrast, the Spacer framework [spacer_cav14] we use allows us to do a compositional proof search for safety. Such local proof search is also found in the IC3 algorithm for hardware model checking [ic3] and its extensions to software model checking (e.g., [gpdr]), although Spacer is the first to use under-approximate summaries of procedures for avoiding redundant proof sub-goals. Model-based generalizations have also been used to obtain projections efficiently in decision procedures for quantified formulas [lazy_qe].

VI Conclusion and Future Work

We have presented a procedure for existentially projecting array variables from formulas over combined theories of ARR, LIA, and propositional logic. We have adapted the procedure to a finite MBP for array variables. While existential projection is worst-case exponential, the corresponding MBP is polynomial. However, projecting arrays might introduce new existentially quantified variables (whose sort is the same as the index- or value-sort of the eliminated array). For projecting these variables, a finite MBP need not exist. We described heuristics for obtaining a practical (but not necessarily finite) MBP procedure, obtaining an instantiation of the Spacer framework for verification of safety of sequential heap-manipulating programs. We show that the new variant of Spacer is effective for constructing compositional proofs of Linux Device Drivers. In the future, we plan to extend these ideas for handling more complex heap-manipulating programs that require universal quantifiers in the program invariants.

References

Footnotes

  1. Arguably sub-goal creation in IC3 is a simple MBP for propositional logic.
  2. https://bitbucket.org/spacer/code
  3. Unfortunately, we have no way to distinguish divergence from timeouts.