Exploring Approximations for Floating-Point Arithmetic using UppSAT
Abstract
We consider the problem of solving floating-point constraints obtained from software verification. We present UppSAT, a new implementation of a systematic approximation refinement framework [24] as an abstract SMT solver. Provided with an approximation and a decision procedure (implemented in an off-the-shelf SMT solver), UppSAT yields an approximating SMT solver. Additionally, UppSAT includes a library of predefined approximation components which can be combined and extended to define new encodings, orderings and solving strategies. We propose that UppSAT can be used as a sandbox for easy and flexible exploration of new approximations. To substantiate this, we explore several approximations of floating-point arithmetic. Approximations can be viewed as a composition of an encoding into a target theory, a precision ordering, and a number of strategies for model reconstruction and precision (or approximation) refinement. We present encodings of floating-point arithmetic into reduced-precision floating-point arithmetic, real arithmetic, and fixed-point arithmetic (encoded into the theory of bitvectors in practice). In an experimental evaluation we compare the advantages and disadvantages of approximating solvers obtained by combining various encodings and decision procedures (based on existing, state-of-the-art SMT solvers for floating-point, real, and bitvector arithmetic).
1 Introduction
The construction of satisfying assignments of a formula, or showing that no such assignments exist, is one of the most central tasks in automated reasoning. Although this problem has been addressed extensively in research fields including constraint programming, and more recently in Satisfiability Modulo Theories (SMT), there are still constraint languages and background theories where effective model construction is challenging. Such theories are, in particular, arithmetic domains such as bitvectors, nonlinear real arithmetic (or real-closed fields), and floating-point arithmetic; even when decidable, the high computational complexity of such problems turns model construction into a bottleneck in applications such as model checking, test-case generation, or hybrid systems analysis.
In several recent papers, the notion of approximation has been proposed as a means to speed up the construction of (precise) satisfying assignments. Generally speaking, approximation-based solvers follow a two-tier strategy to find a satisfying assignment of a formula. First, a simplified or approximated version of the formula is solved, resulting in an approximate solution that (hopefully) lies close to a precise solution. Second, a reconstruction procedure is applied to check whether the approximate solution can be turned into a precise solution of the original formula. If no precise solution close to the approximate one can be found, refinement can be used to successively obtain better, more precise, approximations.
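The two-tier strategy can be pictured as a small driver loop. The following Python sketch is only an illustration under stated assumptions: the function `approximating_solve`, its parameter names, and the toy instantiation `toy_solve` are hypothetical and do not reflect UppSAT's actual (Scala) interface. Precision is assumed to increase strictly with every refinement until a top precision is reached, at which point the original formula is solved directly.

```python
def approximating_solve(formula, start, top, encode, solve, decode,
                        reconstruct, refine_model, refine_proof, solve_exact):
    """Generic two-tier loop; every argument is a caller-supplied placeholder."""
    precision = start
    while precision != top:
        approx_model = solve(encode(formula, precision))
        if approx_model is None:
            # The approximation has no model: refine without model information.
            precision = refine_proof(precision)
            continue
        decoded = decode(approx_model)
        candidate = reconstruct(formula, decoded)
        if candidate is not None:
            return candidate                      # precise model found
        precision = refine_model(precision, decoded)
    return solve_exact(formula)                   # top precision: fall back


# Toy instantiation (purely illustrative): the "formula" is a predicate over
# integers, and precision p restricts the search to the range [0, 2**p).
def toy_solve(predicate, start=0, top=5):
    return approximating_solve(
        formula=predicate, start=start, top=top,
        encode=lambda f, p: (f, 2 ** p),
        solve=lambda enc: next((x for x in range(enc[1]) if enc[0](x)), None),
        decode=lambda m: m,
        reconstruct=lambda f, m: m if f(m) else None,
        refine_model=lambda p, m: p + 1,
        refine_proof=lambda p: p + 1,
        solve_exact=lambda f: next((x for x in range(1000) if f(x)), None),
    )
```

In the toy setting, a solution such as 7 is found once the search range has grown to eight values, whereas a solution outside every approximate range is only found by the fallback at top precision.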
This high-level approach opens up a large number of design choices, some of which have been discussed in the literature. The approximations considered have different properties; for instance, they might be over- or under-approximations (in which case they are commonly called abstractions), or be non-conservative and exhibit neither of those properties. The approximated formula can be formulated in the same logic as the original formula, or in some proxy theory that enables more efficient reasoning. The reconstruction of a precise solution from an approximate one can follow various strategies, including simple re-evaluation, precise constraint solving on partially evaluated formulas, or randomised optimisation. Refinement can be performed with the help of approximate assignments, using proofs or unsatisfiable cores, or be skipped altogether.
In this paper, we aim at a uniform description and exploration of the complete design space. We focus on the case of (quantifier-free) floating-point arithmetic (FPA) constraints, a particularly challenging domain that has been studied extensively in the SMT context over the past few years [3, 12, 23, 22, 13, 24]. To enable uniform exploration of approximation, reconstruction, and refinement methods, as well as simple prototyping and comparative studies, we present UppSAT as a general framework for building approximating solvers. UppSAT is implemented in Scala, open-source under GPL licence, and allows the implementation of approximation schemes in a modular and high-level fashion, such that different components can be easily combined with various backends.
With the help of the UppSAT framework, we explore several ways of approximating SMT reasoning for floating-point arithmetic. The contributions of the paper are:

a high-level approach to design, implement, and evaluate approximations, presented using the case of floating-point arithmetic constraints;

a conceptual and experimental comparison of three different forms of FPA approximation, based on the notions of reduced-precision floating-point arithmetic, fixed-point arithmetic, and real arithmetic;

a systematic comparison of different backend solvers for the case of reduced-precision floating-point arithmetic.
1.1 Introductory Example
In this paper we will use the following formula as a running example to illustrate the effects of using different approximations:
Example 1.
Consider a floating-point formula over two variables and :
Note that the formula can be satisfied by the model in both single-precision and double-precision FPA (and a couple of further formats).
We will use the formula to highlight different aspects of the approximations discussed in this paper, in particular approximations using reduced-precision FPA and fixed-point arithmetic.
1.1.1 Reduced-Precision Floating-Point Arithmetic
The first form of approximation uses floating-point operations of reduced precision, i.e., with a reduced number of bits for the significand and exponent. Approximations of this kind have previously been studied in [23, 24], and found to be an effective way to boost the performance of bit-blasting-based SMT solvers, since the size of FPA circuits tends to grow quickly with the bitwidth. The change of the actual formula lies in decreasing the number of bits used for each variable and operator.
Example 2.
We assume reduction to the floating-point format, i.e., the format in which 3 bits are used for the significand, and 3 bits for the exponent. The approximate formula is obtained by replacing the variables and with retyped variants , casting all floating-point literals to the new format, and replacing the addition operator and comparison predicate with the operator and the predicate for reduced-precision arguments:
Even though is satisfiable, its models are not guaranteed to be models of the original formula; they might satisfy only the reduced-precision formula because of over- or underflows and rounding errors when working with only three exponent and three significand bits. For example, ), satisfies because .
This could mean that the current reduced precision does not allow for the representation of the solutions that exist for the full-precision formula. Therefore we need to refine the precision, and a simple strategy is to increase the precision of every node by the same amount, yielding:
which will have model which is also a model of the original problem.
1.1.2 Fixed-Point Arithmetic
As a second relevant case, we consider the use of fixed-point arithmetic as an approximation of FPA. This is done by choosing a fixed number of integral bits and a fixed number of fractional bits, defining the applied fixed-point format, and then recasting each floating-point constraint as a bitvector constraint: each floating-point operation is replaced with a set of bitvector operations implementing the corresponding computation over fixed-point numbers.
Example 3.
For our example, we can initially choose a representation with 5 integral and 5 fractional bits, i.e., in the fixed-point format. We can note that fixed-point addition is exactly implemented by bitvector addition over 10 bits, and fixed-point comparison by signed bitvector comparison over 10 bits, so that the translation becomes relatively straightforward, resulting in the formula :
Constants are interpreted as 2’s complement numbers with 5 fractional and 5 integral bits, e.g., represents the binary number , which is in decimal notation.
It can easily be seen that the constraint is satisfied by the model , which corresponds to the fixed-point solution and , and to the floating-point solution given in Example 1.
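To make the fixed-point reading of bitvectors concrete, the following Python sketch mimics a 10-bit format with 5 integral and 5 fractional bits. It is a minimal illustration, not the encoding used by UppSAT: the helper names (`to_fixed`, `from_fixed`, `bv_add`, `bv_slt`) and the sample values are assumptions chosen for this example, and rounding to the nearest representable value is one of several possible choices.

```python
INT_BITS = 5                      # integral bits (including the sign bit)
FRAC_BITS = 5                     # fractional bits
WIDTH = INT_BITS + FRAC_BITS      # total bitvector width: 10
MASK = (1 << WIDTH) - 1


def to_fixed(value):
    """Encode a number as a WIDTH-bit two's complement bit pattern."""
    return int(round(value * (1 << FRAC_BITS))) & MASK


def to_signed(bits):
    """Interpret a WIDTH-bit pattern as a signed integer."""
    return bits - (1 << WIDTH) if bits & (1 << (WIDTH - 1)) else bits


def from_fixed(bits):
    """Decode a WIDTH-bit pattern back to the number it represents."""
    return to_signed(bits) / (1 << FRAC_BITS)


def bv_add(a, b):
    """Fixed-point addition is plain bitvector addition (wraps around)."""
    return (a + b) & MASK


def bv_slt(a, b):
    """Fixed-point comparison is signed bitvector comparison."""
    return to_signed(a) < to_signed(b)
```

For instance, `to_fixed(1.75)` yields the bit pattern `0b0000111000`, and adding the pattern for 2.0 decodes to 3.75. Adding 1.0 to 15.5 wraps around to -15.5, which illustrates why models of the fixed-point formula still have to be checked against the floating-point semantics.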
2 UppSAT — An Abstract Approximating SMT Solver
UppSAT is an implementation of the systematic approximation refinement framework [24] as an abstract SMT solver. It takes an approximation and a backend SMT solver to yield an approximating SMT solver. UppSAT can implement a broad range of approximations in a simple and modular way, and it easily integrates off-the-shelf SMT solvers.
The theoretical framework [24] is defined in terms of a monolithic auxiliary theory, the approximation theory, which is used to solve the original problem. Instead of solving the problem in the original theory directly, the formula is lifted to a formula of the approximation theory. The lifted formula is solved using a decision procedure for the approximation theory, which enables approximating the original problem and controlling the degree of approximation. The search for the model is guided by a search through the space of approximations expressible in the approximation theory. Lifting the formula introduces precision as the means of characterizing the degree of approximation. For different values of precision, different approximations of the original formula are obtained. The overall goal is to find a model of an approximation that can be translated back to a model of the original formula.
The solving process can be seen as a two-tier search, in which the search for a sufficiently precise approximation guides the actual model search. The search for the approximation tries to capture essential properties of the model and is performed by the abstract solver. The low-level search is entirely unaware of the high-level aspects. It is performed by the backend solver, which seeks the approximate model. The two tiers of search guide each other in turns, until a solution is found or the search space of approximations has been exhausted. In practice there is no need to implement a solver for the monolithic approximation theory. Instead, UppSAT uses an off-the-shelf SMT solver as the backend procedure to solve lifted formulas.
The overall goal for the approximation is to produce constraints that are ‘easier’ to solve than the input constraints. One example, which we consider later, is approximating the theory of FPA using the theory of reals, which is considered ‘simpler’ because it ignores the rounding behavior and special values of FPA semantics.
A bird’s eye view on UppSAT.
This paper focuses on the theory of FPA and presents several approximations suitable for solving FPA formulas. We first discuss the general structure of the approximations from the perspective of UppSAT.
An approximation context contains the following components: 1. an input theory, the language of the problem to solve; 2. an output theory, the language in which we solve lifted formulas; 3. a precision domain, the parameters used to indicate the degree of approximation; and 4. a precision ordering, defining an order among different approximations.
Given an approximation and a backend solver, UppSAT takes a set of constraints of the input theory and produces constraints of the output theory for the backend solver. Precision regulates the encoding from the input to the output theory, and its domain and ordering are of consequence for encoding, approximation refinement and termination.
The approximation context only determines the setting for the approximation, but does not give the complete picture. For example, fixing an input and an output theory does not uniquely determine the encoding; the choice of precision domain and precision ordering is also essential for the expressiveness of the approximation. Given an approximation context, to fully define an approximation we also need to define the following components: 1. encoding of the formula based on precision; 2. decoding the values of the approximate model; 3. model reconstruction strategy; 4. model-based refinement strategy; and 5. proof-based refinement strategy. The flow of data between these components can be seen in Fig. 1.
Encoding of the formula and decoding of the approximate model are the core of the approximation. These operations describe the two directions of moving between the input and the output theory. The encoding aims to retain the essential properties of the problem while making it easier to solve. The goal of decoding is to translate a model for the approximate constraints to an assignment of the input theory. These two operations are of course closely related (and implemented by the Codec trait).
The purpose of the model reconstruction strategy is to transform the decoded model into a model of the original constraints. Sometimes the model of the lifted formula will also be a model of the original formula. However, often this is not the case, and a reconstruction strategy is used to repair the assignment in an attempt to find a satisfying assignment. Reconstruction strategies can range from simple re-evaluation of the constraints to using a constraint solver or an optimization procedure.
The goal of the model-based and proof-based refinement strategies is to select the approximation for the next iteration based on the available information. For example, given an approximate model and a failed model we can infer which parts of the formula to refine. This is expressed as a new precision value, specifically a precision value which is greater than the previous one according to the precision ordering.
In order to preserve completeness and termination, for the case of decidable theories, we assume that every precision domain contains a top element, and that precision domains satisfy the ascending chain condition (every ascending chain is finite) [24]. By convention, approximation in top precision corresponds to solving the original, unapproximated constraint with the help of a fallback solver.
Implementation of these operations can be separated into two layers, a general layer and a theory-specific layer. For example, significant parts of the encoding and decoding are specific to the theories involved in the approximation, while the various strategies are mostly theory-independent, except for a few details. Theory-independent layers are abstracted into templates that provide hook functions for theory-specific details. UppSAT is designed around the mix-and-match principle, to provide a sandbox for testing different approximations with little implementation effort.
Fig. 2 shows the traits (i.e., interfaces) that have to be implemented by approximations in UppSAT. The approximation class takes an object implementing all four traits, and combines them into an approximation to be used by the abstract solver.
Consider implementing an approximation of FPA by real arithmetic in a concise and compact manner. This approximation consists of dropping FPA-specific elements, such as the rounding modes, and replacing FPA operations by the corresponding real arithmetic operations. In certain cases, a combination of operations may be necessary, e.g., in the case of the fused-multiply-add operation. When FPA itself is the output theory, the approximation could instead hard-code one rounding mode for all operations, change the variables and operations to have reduced precision, or just omit some of the constraints.
3 Specifying Approximations in UppSAT
In this section we show how to specify approximations in UppSAT (https://github.com/uuverifiers/uppsat), using the example of reduced-precision FPA [24] from Section 1.1.1. It should be remarked that one of the design goals of UppSAT is the ability to define approximations in a convenient, high-level way; the code we show in this section is mostly identical to the actual implementation in UppSAT, modulo a small number of simplifications for the purpose of presentation. We will first give an intuition for this particular approximation, before breaking it down into the elements that UppSAT requires.
3.1 Approximation using Reduced-Precision FPA
Floating-point numbers are a two-parameter data type. The two parameters are the number of bits used to store the exponent and the significand in memory, respectively. The IEEE-754 standard specifies several distinct combinations of these bitwidths, for example, for single-precision and double-precision floating-point numbers. Indeed, these are the most commonly used data types to represent real-valued data. Solving FPA constraints typically involves encoding them into bitvector arithmetic and subsequently into propositional logic, via a procedure called flattening or bit-blasting. The size and complexity of the propositional formula depends on the size of floating-point numbers in memory. Such an encoding of FPA constraints can become prohibitively large very quickly. However, many key values, e.g., special values, one, and powers of two, can be represented compactly and exist in floating-point representations that contain very few bits. Therefore, reasoning over single- or double-precision floating-point numbers for models that involve mostly (or only) these values can be wasteful. Instead, we solve a reduced-precision version of the formula, i.e., we work with Reduced Precision Floating Points (RPFP). Reducing the precision does not affect the structure of the formula, and only changes the sorts of floating-point variables, predicates and operations. Bit-blasting reduced-precision constraints results in significantly smaller propositional formulas that are still expressive enough to find an approximate solution.
3.2 Reduced-Precision FPA Approximation in UppSAT
An approximation in UppSAT consists of several parts: an approximation context (the “approximation core”), a codec, a model reconstruction strategy, and a refinement strategy for model- and proof-guided refinement. Fig. 4 shows the object RPFPApp implementing the reduced-precision floating-point approximation. The approximation object is implemented using Scala mix-in traits (shown in Fig. 2), which enable the modular mix-and-match approximation design. In the following paragraphs, we show the key points of the reduced-precision floating-point approximation through its component traits.
Approximation context.
An approximation context specifies the input and output theory, a precision domain and a precision ordering. The reduced-precision floating-point approximation encodes floating-point constraints as scaled-down floating-point constraints. Therefore, both the input and the output theory are the quantifier-free floating-point theory (FPTheory). The precision uniformly affects both the significand and the exponent, so a scalar data type Prec = Int is sufficient to represent precision. In particular, we choose a bounded range of integers as the precision domain, with the usual ordering. Fig. 5 shows the specification of RPFPContext, the approximation context object for the reduced-precision floating-point approximation.
Codec.
The essence of approximation takes place in the encoding of the formula, and conversely in how the approximate model is decoded. These two operations are implemented by the RPFPCodec trait, shown in Fig. 6 for the case of the reduced-precision FPA. The reduced-precision floating-point approximation scales down the sort of floating-point variables and operations, while keeping the high-level structure of the formula. Scaling of operations and variables is performed based on precision values, while predicate nodes are scaled to the largest sort among their children. Constant literals and rounding modes remain unaffected by encoding. The discrepancy in sorts due to individual precisions is removed by inserting fp.toFP casts where necessary. The fp.toFP declaration is an SMT-LIB function which casts a floating-point value to a given sort. To ensure internal consistency of the approximate models, all occurrences of a variable share the same precision. Predicate scaling requires that the sorts of the arguments are known, i.e., arguments are already encoded when their parent node is encoded. Therefore, we consider a formula as an abstract syntax tree (AST) and use a post-order visit pattern over the formula. UppSAT provides a template trait for such an encoding called PostOrderCodec. To implement it, the user needs to define two hook functions: encodeNode and decodeNode.
To encode a node, we scale the sort, pad the arguments, reinstantiate the symbol to the new sort and bundle the new symbol with the padded children. These steps are implemented in the encodeNode hook function, shown in Fig. 6. The details of scaling the sort are shown in the scaleSort auxiliary function. The sort scaling is linear and consists of 6 sorts, ranging from the smallest sort up to (and including) the original sort. The cast function adds a floating-point cast fp.toFP between the parent and the child node where necessary. Implementation of the functions cast and encodeSymbol is straightforward and omitted in the interest of brevity.
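A linear sort scaling over six precision levels could look as follows. This Python sketch is not the scaleSort function of UppSAT: the name `scale_sort`, the precision range 0 to 5, the smallest sort with 3 exponent and 3 significand bits (borrowed from Example 2), and rounding up to whole bits are all assumptions made for illustration.

```python
def scale_sort(ebits, sbits, precision, top=5, min_ebits=3, min_sbits=3):
    """Linearly interpolate between an assumed smallest sort and (ebits, sbits).

    Precision 0 yields the smallest sort and precision `top` the original sort.
    """
    if not 0 <= precision <= top:
        raise ValueError("precision out of range")

    def scale(full, smallest):
        if full <= smallest:
            return full                       # nothing to scale down
        extra = (full - smallest) * precision
        return smallest + (extra + top - 1) // top   # round up to whole bits

    return scale(ebits, min_ebits), scale(sbits, min_sbits)
```

Under these assumptions, a single-precision sort with 8 exponent and 24 significand bits is mapped to (3, 3) at precision 0, to (4, 8) at precision 1, and back to (8, 24) at precision 5.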
After the backend solver returns a model of the approximate constraints, it needs to be decoded. Decoding is essentially casting variable assignments to their sort in the original formula. For example, suppose a formula over variables of some full-precision sort is encoded to a reduced-precision formula (as in Ex. 2), yielding a model of the approximate constraints. Decoding will cast the values from this model to the same values in their full-precision sort, resulting in a variable assignment for the original formula. Special values are also decoded by reinstantiating them in the original sort. Other values are decoded by adding the missing bits to their representation. The missing bits in the encoded formula are implicitly set to zero. To decode the significand, the missing zero bits are simply reinserted. Padding the exponent requires some attention due to the details of the IEEE-754 standard. The values of the exponent are stored with an added bias value, which is dependent on the exponent bitwidth. To pad the exponent, we first remove the bias of the exponent in reduced precision, and then add the bias of full-precision FP. (Subnormal floating-point values require more attention.)
PostOrderCodec implements the decodeModel function through the decodeNode hook function. The hook function is applied to all the values in the model of the approximate constraints. The decoding of the values is performed by the decodeFPValue function, all shown in Fig. 6.
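The decoding of a single value can be sketched in a few lines. The Python code below is a hedged illustration and not UppSAT's decodeFPValue: the function names, the representation of a value as a (sign, exponent field, fraction field) triple, and the convention that formats are given as (exponent bits, stored fraction bits without the hidden bit) are assumptions. Subnormal values are deliberately left unhandled.

```python
import struct


def bias(ebits):
    """IEEE-754 exponent bias for a given exponent bitwidth."""
    return (1 << (ebits - 1)) - 1


def decode_fp_value(sign, exp, frac, src, dst):
    """Cast a reduced-precision value to a larger format.

    `src` and `dst` are (exponent bits, stored fraction bits) pairs.
    """
    (src_e, src_f), (dst_e, dst_f) = src, dst
    if dst_e < src_e or dst_f < src_f:
        raise ValueError("target format must not be smaller than the source")
    padded_frac = frac << (dst_f - src_f)      # reinsert missing zero bits
    if exp == (1 << src_e) - 1:                # infinity or NaN
        return sign, (1 << dst_e) - 1, padded_frac
    if exp == 0:
        if frac == 0:
            return sign, 0, 0                  # signed zero
        raise NotImplementedError("subnormals need separate treatment")
    # Normal value: remove the source bias, then add the target bias.
    return sign, exp - bias(src_e) + bias(dst_e), padded_frac


def to_float32(sign, exp, frac):
    """Assemble single-precision fields into a Python float (for checking)."""
    bits = (sign << 31) | (exp << 23) | frac
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For example, in a format with 3 exponent bits and 2 fraction bits, the fields (0, 3, 0b10) denote 1.5; decoding them to single precision with `decode_fp_value(0, 3, 0b10, (3, 2), (8, 23))` preserves that value.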
Model reconstruction strategy
specifies how to obtain a model of the input constraints starting from the decoded model. A simple strategy to obtain a reconstructed model is to satisfy the same Boolean constraints (constraints which are true or false, e.g., equalities, inequalities, predicates) as the approximate model, i.e., to try and satisfy the Boolean structure in the same way. We call those constraints critical atoms. However, due to the difference in semantics, values of the decoded model are not guaranteed to satisfy them. Typically, the rounding error, which is significantly larger in reduced-precision FPA, accumulates and changes the value of critical atoms under the original semantics. Therefore, evaluation of critical atoms under the original semantics is necessary to ensure that the model satisfies the original formula. In fact, rather than evaluating the critical atoms simply as a verification step, evaluation can be used to infer the error-free values under the original semantics. Starting from an empty candidate partial model, the constraints are evaluated in a bottom-up fashion. Thus, the reconstruction can be defined by defining the reconstruction of a single node in the reconstructNode hook function, shown in Fig. 8.
The key to a good reconstruction strategy is propagation. Certain constraints allow more information to be propagated than others. For example, an equality between a variable and a term uniquely determines the value of the variable if the values of the term's arguments are known and the equality is known to hold, whereas an inequality allows for less propagation. The decoded model contains the information about which critical atoms need to be satisfied. The critical atoms, combined with a bottom-up evaluation, allow propagation to take place by applying equality-as-assignment: if 1. the equality is true in the approximate model, 2. its left- or right-hand side is a variable that is currently unassigned in the candidate model, and 3. the value of the other side is defined in the candidate model, then the variable can be assigned the value of the other side in the candidate model. Equality-as-assignment is crucial for the elimination of rounding errors due to the RPFP encoding. Note that this reconstruction strategy can fail if cyclic dependencies exist among the constraints.
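Equality-as-assignment can be illustrated with a small evaluator. The following Python sketch is a simplification and not the reconstructNode implementation: the tuple-based term representation, the function names, and the use of Python's double-precision arithmetic as the "original semantics" are assumptions. Variables that are needed but still unassigned are filled in from the decoded model.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}
PREDS = {"=": operator.eq, "<": operator.lt, "<=": operator.le}


def evaluate(expr, candidate, decoded):
    """Evaluate a term bottom-up under the original (here: double) semantics."""
    if isinstance(expr, str):                  # variable
        if expr not in candidate:
            candidate[expr] = decoded[expr]    # fall back to the decoded value
        return candidate[expr]
    if isinstance(expr, tuple):                # (operator, lhs, rhs)
        op, lhs, rhs = expr
        return OPS[op](evaluate(lhs, candidate, decoded),
                       evaluate(rhs, candidate, decoded))
    return expr                                # numeric literal


def reconstruct(critical_atoms, decoded):
    """Rebuild a model from atoms that are true in the approximate model.

    Returns the candidate model, or None if some atom evaluates to false.
    """
    candidate = {}
    for pred, lhs, rhs in critical_atoms:
        if pred == "=":
            # Equality-as-assignment: define an unassigned variable by
            # evaluating the other side of the equality.
            if isinstance(lhs, str) and lhs not in candidate:
                candidate[lhs] = evaluate(rhs, candidate, decoded)
                continue
            if isinstance(rhs, str) and rhs not in candidate:
                candidate[rhs] = evaluate(lhs, candidate, decoded)
                continue
        if not PREDS[pred](evaluate(lhs, candidate, decoded),
                           evaluate(rhs, candidate, decoded)):
            return None
    return candidate
```

With the atoms `[("=", "y", ("+", "x", 0.1)), ("<", "y", 2.0)]` and a decoded model in which `y` carries a rounding error, the value of `y` is recomputed from `x` instead of being copied, so the rounding error of the approximation does not leak into the candidate model.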
An important aspect of the reconstruction strategy is the order of evaluation. Bottom-up evaluation, bar equality-as-assignment, requires that all the sub-expressions have a value in the candidate model. The base case is that of variables, which might be undefined in the candidate model. If they are undefined in the candidate model when they are needed for evaluation, they are assigned the value from the decoded model. This means that evaluating inequalities ahead of equalities might prevent equality-as-assignment from taking place. We therefore wish to evaluate predicates in an order such that critical atoms that enable equality-as-assignment are evaluated first. To this end, we single out all predicates that equate a variable either with another variable or with an operation or predicate applied to variables. We call these equations definitional. In order to maximise the propagation during the reconstruction, definitional equalities are prioritised over the remaining predicates.
Furthermore, the definitional equalities are sorted based on a topological order of the variables in a graph obtained by viewing definitional equalities as directed edges. An equation between a variable and an operation generates edges from every variable on the right-hand side to the variable on the left-hand side, and an equation between two variables generates one edge in each direction. The topological sorting of variables starts with the variables occurring in definitional equalities that have the lowest input degree. Their values can be safely copied from the decoded model. The resulting order of variables corresponds to a bottom-up propagation through the formula that maximises applications of equality-as-assignment in the reconstruction. Any cyclic dependencies will be broken, with the algorithm picking a variable arbitrarily (we leave it to future work to design a reasonable heuristic). After a topological order of the variables has been established, the equalities are ordered according to the variable ordering.
The model reconstruction performs a bottom-up reconstruction of critical atoms in the topological order of the equalities, followed by the remaining predicates. Ordering the predicates in the described manner increases the likelihood of propagation fixing rounding errors introduced by the RPFP encoding.
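The ordering of definitional equalities can be sketched with a variant of Kahn's algorithm. The Python code below is an illustration under assumptions, not the UppSAT implementation: each definitional equality is represented as a pair of a defined variable and the list of variables on its other side, and cycles are broken by picking the remaining variable with the lowest input degree (ties resolved by name), which is just one possible arbitrary choice.

```python
def variable_order(equalities):
    """Order variables so that inputs precede the variables they define."""
    succs, indeg = {}, {}
    for lhs, rhs_vars in equalities:
        indeg.setdefault(lhs, 0)
        succs.setdefault(lhs, set())
        for var in rhs_vars:
            indeg.setdefault(var, 0)
            succs.setdefault(var, set())
            if var != lhs and lhs not in succs[var]:
                succs[var].add(lhs)            # edge: var -> lhs
                indeg[lhs] += 1
    order, remaining = [], set(indeg)
    while remaining:
        # Input degree 0 in the acyclic case; otherwise this breaks a cycle.
        var = min(remaining, key=lambda v: (indeg[v], v))
        remaining.remove(var)
        order.append(var)
        for succ in succs[var]:
            if succ in remaining:
                indeg[succ] -= 1
    return order


def order_equalities(equalities):
    """Sort definitional equalities by the position of the defined variable."""
    rank = {var: i for i, var in enumerate(variable_order(equalities))}
    return sorted(equalities, key=lambda eq: rank[eq[0]])
```

Given the equalities `[("z", ["y"]), ("y", ["x"])]`, the variable order is `x`, `y`, `z`, so the equality defining `y` is evaluated before the one defining `z`.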
Model-guided refinement strategy
takes place when model reconstruction fails to obtain a model. Model-guided refinement increases the precision of the formula, based on the decoded model and the failed candidate model. The refinement increases the precision of operations, but only so far that a more precise model is obtained in the next iteration. Comparing the evaluation of the formula under the two assignments highlights critical atoms that should be refined. These atoms evaluate to true in the approximate model and to false in the failed candidate model. Since FPA is a numerical domain, it is possible to apply some notion of error to determine which nodes contribute the most to the discrepancies in evaluation and use them to rank the sub-expressions. After ranking, only a portion of them is refined, say 30%. Refinement amounts to increasing precision by some amount, in this case a constant. In general, one could use the error to determine by how much to increase the precision. Since error-based refinement can be applied to any numerical domain, UppSAT implements an error-based refinement strategy, which is instantiated by providing an implementation of the nodeError hook function, shown in Fig. 9.
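An error-based refinement step can be sketched as follows. This Python code is illustrative only and does not reproduce UppSAT's nodeError hook: the relative-error measure, the function names, the default of refining 30% of the nodes with non-zero error (at least one), and the precision range with top element 5 are assumptions.

```python
import math


def node_error(approx, exact):
    """Relative error of a node's value; infinite for incomparable values."""
    if approx == exact:
        return 0.0
    if not (math.isfinite(approx) and math.isfinite(exact)) or exact == 0.0:
        return math.inf
    return abs(approx - exact) / abs(exact)


def refine(precisions, approx_values, exact_values,
           fraction=0.3, step=1, top=5):
    """Increase the precision of the nodes with the largest error.

    `precisions` maps node names to precisions; the two value maps give each
    node's value in the decoded model and in the failed candidate model.
    """
    errors = {node: node_error(approx_values[node], exact_values[node])
              for node in precisions
              if node in approx_values and node in exact_values}
    ranked = sorted((n for n in errors
                     if errors[n] > 0 and precisions[n] < top),
                    key=lambda n: errors[n], reverse=True)
    count = max(1, math.ceil(fraction * len(ranked))) if ranked else 0
    refined = dict(precisions)
    for node in ranked[:count]:
        refined[node] = min(top, precisions[node] + step)
    return refined
```

If no node shows an error, this sketch returns the precisions unchanged; a complete strategy would then need another way to make progress, for example a uniform increase.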
Proof-guided refinement strategy
uses proofs of unsatisfiability to refine the formula. The formula can be refined using unsatisfiable cores when an approximate model is not available. At the moment UppSAT has no support for obtaining cores or proofs from the backend solvers. Instead, a naïve refinement strategy is used, which increases all the precisions by a constant, shown in Fig. 10.
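A uniform refinement of this kind is short enough to sketch in full. The Python code below is an assumption-laden illustration rather than the code of Fig. 10: the function names, the step size, and the top precision 5 are hypothetical. The helper `rounds_until_top` merely demonstrates that repeated uniform refinement reaches the top precision after finitely many rounds.

```python
def uniform_refine(precisions, step=1, top=5):
    """Raise every node's precision by `step`, saturating at `top`."""
    if step <= 0:
        raise ValueError("step must be positive to guarantee progress")
    return {node: min(top, p + step) for node, p in precisions.items()}


def rounds_until_top(precisions, step=1, top=5):
    """Count the refinement rounds needed until all nodes reach `top`."""
    rounds = 0
    while any(p < top for p in precisions.values()):
        precisions = uniform_refine(precisions, step, top)
        rounds += 1
    return rounds
```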
4 Approximations in UppSAT
In this section, we discuss some more general aspects of approximations within the UppSAT framework. In addition to listing alternatives to the components of the RPFP approximation, some implementation details are discussed.
Precision domains
are crucial for both the expressiveness of the encoding and the subtlety of the refinement. Precision can be uniform or compositional in terms of its relationship with the formula. Uniform precision assigns a single precision value to the entire formula, whereas compositional precision associates different values with some or all parts of the formula. As we have seen, the RPFP approximation uses a compositional precision, which is associated with variable and function nodes. Uniform precision is used in the BV and RA approximations, which are presented in the next section.
From the perspective of encoding expressivity, precision can be a scalar value or a vector. While in most cases scalar precision suffices, vectors (or tuples) can be used to elegantly encode more expressive approximations. For instance, a pair of precisions associated with an FPA node allows the significand and the exponent to have independent bitwidths. Choosing a suitable precision domain is important, both for the compactness of the definition of the approximation in UppSAT and for the performance of the resulting approximating solver. Too crude a precision domain might yield a negligible improvement of performance, while too fine a precision domain might lead the solver to spend too much time wandering through the different approximations.
Encoding and decoding
are the heart of the approximation. The two translations are intertwined; a simple, elegant encoding is useless if the model cannot be translated back in a meaningful way. In fact, the encoding often suggests a natural way of implementing the decoding, since the translations are in a sense inverse. In general, an encoding is just an arbitrary translation of a formula of the input theory to a formula of the output theory; in practice, as for the RPFP approximation, the encoding often does not change the overall structure of the formula, but merely adjusts the sorts involved. Other approximations might add global constraints in the encoding, e.g., definitional equalities or range restrictions, or they might add or remove nodes in the formula. For instance, the real-arithmetic approximation RA of FPA will not encode the rounding modes, since they do not have an equivalent in real arithmetic. The decoding of a real model then needs to produce some reasonable values for the rounding modes. This can, for instance, be done by choosing a pre-selected default value.
To maintain information about the relationship between the original and the encoded formula, UppSAT uses labeled abstract syntax trees. During the encoding, the result of the encoding is assigned the label of the source node in the original formula that it encodes. The labels offer a way to keep track of the translation, since the encoding can be ambiguous to decode. All the approximations presented in this paper are context-independent and node-based, i.e., it is sufficient to specify the translation at the node level. UppSAT offers a pattern for this kind of codec, called PostOrderCodec. Overall, UppSAT can handle a broad range of encodings that can be specified succinctly within the framework.
Model reconstruction strategies
take place entirely in the input theory, and as such can be combined with a number of different encodings (they are independent of the chosen output theory). The reconstruction strategy used by the RPFP approximation is simple in the sense that it only evaluates expressions, and it does not pose satisfiability queries to a solver. A different strategy, along similar lines, might start the reconstruction from the difficult (e.g., nonlinear) constraints and then evaluate the remainder of the formula. More complex strategies might use a solver during the reconstruction to search for a model within some distance of a decoded (failing) model. A numeric model lifting strategy was proposed by Ramachandran and Wahl [22]. Their method identifies a subset of the model to be tweaked, and instantiates the formula as a univariate satisfiability check. Except for a chosen variable, all the variables in the formula are substituted by their value in the failing decoded model. This approach often quickly patches the candidate model. UppSAT can express these more advanced strategies, but implementation and experiments in this direction have been left for future work.
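The difference between evaluation-only reconstruction and a patching strategy can be sketched as follows. This is a simplification under stated assumptions: Python floats stand in for full-precision FPA values, formulas are nested tuples, and the hypothetical helper `patch_variable` enumerates an explicit candidate list where the lifting approach described above would pose a univariate satisfiability query to a solver.

```python
import operator

# Binary operators of a tiny expression language; Python floats stand in
# for full-precision (binary64) FPA values.
OPS = {"+": operator.add, "*": operator.mul, "<": operator.lt, "<=": operator.le}


def evaluate(term, model):
    """Evaluate a term given as a variable name, a constant, or (op, lhs, rhs)."""
    if isinstance(term, str):
        return model[term]
    if isinstance(term, tuple):
        op, lhs, rhs = term
        return OPS[op](evaluate(lhs, model), evaluate(rhs, model))
    return float(term)


def reconstruct(formula, decoded_model):
    """Evaluation-only reconstruction: accept the decoded model if it holds."""
    return dict(decoded_model) if evaluate(formula, decoded_model) else None


def patch_variable(formula, decoded_model, variable, candidates):
    """Keep all variables but one fixed and search values for the remaining one."""
    for value in candidates:
        trial = {**decoded_model, variable: value}
        if evaluate(formula, trial):
            return trial
    return None


formula = ("<", ("+", "x", "y"), "z")
failing = {"x": 0.5, "y": 0.25, "z": 0.75}
patched = patch_variable(formula, failing, "z", [0.75, 1.0])
```

Here the decoded model fails under full-precision evaluation because `x + y` equals `z`, and changing only `z` is enough to repair it.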
Refinement strategies
use the information obtained either from the models or from the proof of unsatisfiability to find a better approximation. In cases when information is scarce (e.g., no proofs are available in case of unsatisfiability), or the approximation is very coarse and no useful information can be extracted from a decoded model, a uniform refinement strategy can increase the precision of the entire formula. This is the case with the fixed-point approximation BV and the proof-guided refinement of the RPFP approximation. In the case of numeric domains, a notion of error can be used to determine which terms to refine and by how much [24]. This is the strategy used by the RPFP approximation. In the case of a precision vector for each node in a formula, the error between the decoded and the candidate model can be used to refine either the exponent, if the magnitude of the error is large, or the significand, if the error is very small.
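The last idea, refining a precision pair based on the size of the error, might look roughly like the sketch below. The function name `refine_pair`, the relative-error measure, the threshold `large_error`, and the step size are all assumptions made for illustration; the text above does not fix any of them.

```python
def refine_pair(precision, approx_value, exact_value, max_precision,
                large_error=0.5, step=1):
    """Refine an (exponent bits, significand bits) pair for one node.

    A large relative error suggests the exponent range was too small, so the
    exponent is widened; a small non-zero error suggests lost significand
    bits. Both components are capped at `max_precision`.
    """
    exp_bits, sig_bits = precision
    max_exp, max_sig = max_precision
    if approx_value == exact_value:
        return precision  # no error at this node, nothing to refine
    # The values differ, so at least one is non-zero and scale > 0.
    scale = max(abs(approx_value), abs(exact_value))
    relative_error = abs(approx_value - exact_value) / scale
    if relative_error >= large_error:
        return (min(exp_bits + step, max_exp), sig_bits)
    return (exp_bits, min(sig_bits + step, max_sig))
```

For example, under these assumptions `refine_pair((5, 11), 1.0, 1000.0, (11, 53))` widens the exponent, whereas an error in the last few bits of the value widens the significand instead.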
5 Other Approximations of FPA
We have shown in detail the RPFP approximation of FPA, and discussed different components that can be used in general. In this section we outline two further approximations of FPA that have been implemented in UppSAT: the fixed-point approximation BV (Section 5.1), encoded as bit-vectors, and the real-arithmetic approximation RA (Section 5.2). Both approximations are currently implemented in a more experimental and less refined way than the RPFP approximation, but encouragingly, even simple approximations can give rise to speedups compared to their backend solvers (as shown in Section 7).
5.1 BV — The FixedPoint Approximation of FPA
The idea behind the BV approximation is to avoid the overhead of the rounding semantics and special values of the FPA, by encoding all the FPA values and variables and operations as values and operations of the fixedpoint arithmetic.
The BV context.
The input theory is the theory of FPA, and the intended output theory is the theory of fixedpoint arithmetic. However, since fixedpoint arithmetic is not commonly supported by SMT solvers, we can encode fixedpoint constraints in the theory of fixedwidth bitvectors. The precision determines the number of integer and fractional binary digits in the fixedpoint representation of a number. For simplicity, at this point we do not mix multiple fixedpoint formats in one formula, but instead apply uniform precision in the BV approximation; as a result, all operations in a constraint are encoded using the same fixedpoint sort. As a proof of concept, the precision domain is twodimensional, with the first component in a pair denoting the number of integral, and the second component the number of fractional bits in the encoding, respectively. The precision domain ranges from to , with the maximum element being interpreted as sending the original, unapproximated FPA constraint to Z3 as a fallback solver.
Example 4.
Given a variable of precision , we will have a domain of numbers between and , which when interpreted in two’scomplement notation are numbers between and .
The BV codec.
A codec describes how values can be converted from the input theory to the output theory, and vice versa. In BV, the floating-point operations are encoded as their fixed-point equivalents, which in turn are encoded as bit-vector operations. This process is fairly straightforward, with the exception of the rounding modes and special FPA values. The rounding modes and not-a-number values are omitted by the encoding, while the remaining special values are encoded, with respect to the current precision, either as zero or as the largest or smallest value (in the case of infinities). Translation of literal floating-point constants amounts to a representation as the closest value in the chosen fixed-point sort. The decoding consists of converting a fixed-point number to a rational number, followed by conversion to the closest floating-point number, with some care taken for the special values.
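The translation of constants and the decoding step can be made concrete with a small sketch. It is not UppSAT's BV codec: the function names are hypothetical, Python floats stand in for FPA values, and saturating finite values that fall outside the fixed-point range is an assumption that the description above does not state.

```python
from fractions import Fraction


def encode_fixed(value: float, int_bits: int, frac_bits: int) -> int:
    """Encode a float as a two's-complement fixed-point bit pattern."""
    if value != value:
        raise ValueError("NaN has no fixed-point counterpart")
    width = int_bits + frac_bits
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if value == float("inf"):
        scaled = hi  # infinities map to the largest or smallest value
    elif value == float("-inf"):
        scaled = lo
    else:
        # Closest representable value; out-of-range values saturate (assumption).
        scaled = round(Fraction(value) * (1 << frac_bits))
        scaled = max(lo, min(hi, scaled))
    return scaled % (1 << width)


def decode_fixed(bits: int, int_bits: int, frac_bits: int) -> float:
    """Decode a bit pattern to a rational number, then to the closest float."""
    width = int_bits + frac_bits
    signed = bits - (1 << width) if bits >= (1 << (width - 1)) else bits
    return float(Fraction(signed, 1 << frac_bits))
```

With 4 integral and 4 fractional bits, for instance, the constant 0.1 is encoded as the bit pattern for 0.125, the closest multiple of 1/16, which illustrates why a decoded model may fail to satisfy the original constraint.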
BV reconstruction and refinement.
The BV approximation uses the same model reconstruction strategy as the RPFP approximation. In contrast, the chosen refinement strategy in the BV approximation is currently very simple: since the precision is uniform, the refinement is also uniform, regardless of whether an approximate model is available or not. At each iteration, the precision is increased by 4 in both dimensions, resulting in the addition of 4 bits to both the integral and the fractional part of numbers. [Footnote 2: This means that the approximation does not really leverage the two-dimensional precision, and that the maximal precision of the encoding is reached after at most 5 iterations.]
5.2 RA — The Real Arithmetic Approximation of FPA
The third and possibly most obvious approach to approximating FPA is an encoding into real arithmetic constraints. We present a comparatively simplistic implementation of this kind of approximation, due to the difficulty of refining approximations in real arithmetic in a meaningful way (real arithmetic already corresponds to infinite-precision arithmetic). Ramachandran and Wahl [22] describe a topological notion of refinement that requires a backend solver that handles the combined theory of real arithmetic and FPA. However, solving constraints over this combination of theories is challenging in itself, and efficient SMT solvers are not publicly available, to the best of our knowledge.
RA context.
In the RA approximation, the FPA is the input theory, and the output theory is the theory of (nonlinear) real arithmetic. The precision domain is a uniform binary domain , deciding whether approximation is taking place at all (), or whether the original FPA constraint is sent to a backend solver (for ; again, the fallback solver in this case is Z3). Essentially, this is a hitormiss approximation, which either will work right away or directly resort to the fallback solver.
RA codec.
The encoding is fairly straightforward: the FPA operations are translated to their real counterparts, omitting the rounding modes in the process. While the special values can be encoded, currently they are not supported by the RA approximation. FPA numerals are converted to reals, i.e., in the case of normal FPA numbers the resulting real number is . Decoding will translate a real number to the closest FPA numeral.
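As an illustration of the numeral conversion, the following sketch computes the exact rational value of a normal IEEE 754 binary64 number from its sign, exponent, and significand fields, and maps a rational back to the closest binary64 value. The function names are hypothetical, binary64 is assumed only because it matches Python's `float`, and subnormal and special values are rejected, in line with the remark that RA currently does not support special values.

```python
import struct
from fractions import Fraction


def fp_to_real(x: float) -> Fraction:
    """Exact rational value of a normal binary64 number, from its bit fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    if not 0 < exponent < 0x7FF:
        raise ValueError("only normal numbers are handled in this sketch")
    # Normal numbers have an implicit leading 1 in the significand.
    significand = Fraction((1 << 52) + mantissa, 1 << 52)
    return (-1) ** sign * significand * Fraction(2) ** (exponent - 1023)


def real_to_fp(q: Fraction) -> float:
    """Closest binary64 number; Python rounds this conversion correctly."""
    return float(q)
```

Note that `fp_to_real(0.1)` is not the rational 1/10 but the exact value of the closest binary64 number, which is one reason why a model found over the reals may not survive decoding.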
RA reconstruction
coincides with the RPFP reconstruction.
RA refinement
is achieved by uniform refinement, and results in the full precision after a single iteration. In the case of the topological refinement proposed by Ramachandran and Wahl [22], the precision domain would be the same, but the precision itself would be compositional, i.e., a precision would be associated with each node of the formula. Essentially, the precision would represent a switch, deciding whether a node should be encoded in real arithmetic or floatingpoint arithmetic.
6 Related Work
6.1 Approximations in General
The concept of abstraction (and approximation) is central to software engineering and program verification, and is increasingly employed in general mathematical reasoning and in decision procedures as well. Frequently only under- and over-approximations are considered, i.e., the formula that is solved either implies or is implied by an approximate formula. Counterexample-guided abstraction refinement [7] is a general concept that is applied in many verification tools and decision procedures, even on a relatively low level as in QBF solvers [17], or in model-based quantifier instantiation for SMT [15].
6.2 Decision Procedures for FloatingPoint Arithmetic
The SMT solvers MathSAT [6], Z3 [21], and Sonolar [19] feature bitprecise conversions from FPA to bitvector constraints, known as bitblasting, and represent the currently most commonly used solvers in program verification. As we show in our experiments, bitblasting can be boosted significantly with the help of our approximation approach.
A general framework for decision procedures is Abstract CDCL, introduced by D’Silva et al. [11], which was also instantiated for FPA [12, 2]. This approach relies on the definition of suitable abstract domains (as defined for abstract interpretation [8]) for constraint propagation and learning. In our experimental evaluation (Section 7), we compare to two decision procedures for FPA that are implemented in MathSAT: an instance of ACDCL and an eager translation to bit-vectors. ACDCL can seamlessly be integrated into the UppSAT framework, for instance to solve approximations or to derive an approximation based on abstract domains.
The work presented in this paper builds on previous research on the use of approximations for solving FPA constraints [23, 24]. UppSAT is also close in spirit to the framework presented by Ramachandran and Wahl [22] for efficiently solving FPA constraints based on the notion of ‘proxy’ theories, which correspond to our ‘output theories.’ This framework applies a relatively sophisticated method of reconstruction, by applying a fallback FPA solver to a version of the input constraint in which all but one variable have been substituted by their value in a failing decoded model. Such reconstruction could also be realized in UppSAT, and an implementation in UppSAT and experimental comparison with other reconstruction methods is planned as future work.
A further recent approximationbased solver for FPA is XSat [13]. In XSat, reconstruction of models is implemented with the help of randomized optimization, which results in good performance, but does not give rise to a decision procedure (incorrect sat/unsat results can be produced).
There is a long history of formalization and analysis of FPA concerns using proof assistants, among others in Coq by Melquiond [20] and in HOL Light by Harrison [16]. Coq has also been integrated with a dedicated floatingpoint prover called Gappa by Boldo et al. [1], which is based on interval reasoning and forward error propagation to determine bounds on arithmetic expressions in programs [10]. The ASTRÉE static analyzer [9] features abstract interpretationbased analyses for FPA overflow and divisionbyzero problems in ANSIC programs.
7 Experimental evaluation
In this section we evaluate the effectiveness of the discussed approximations of FPA, when combined with the bit-vector, real, and FPA decision procedures implemented in MathSAT and Z3.
Experimental setup.
The evaluation is done on the satisfiable benchmarks of the QF_FP category of the SMT-LIB. Currently, UppSAT does not extract unsatisfiable cores from the backend, and none of the approximations has a meaningful proof-based refinement strategy, so that performance on unsatisfiable problems is guaranteed to be worse than that of the backend solver. All experiments were done on an AMD Opteron 2220 SE machine, running 64-bit Linux, with memory limited to 1.0 GB and a timeout of one hour.
UppSAT instances.
Table 1 shows the combinations of approximation and backend solver that we evaluate. The UppSAT instances are named in the form APPROXIMATION(backend). Note that the backend needs to implement a decision procedure for the output theory of the approximation. As a consequence, we have three configurations for the RPFP approximation, using the bit-blasting procedures in Z3 and MathSAT and the ACDCL algorithm in MathSAT as decision procedures for FPA. UppSAT currently lacks support for the bit-vector theory in MathSAT, so for the BV approximation only the bit-vector solver in Z3 is used as the backend. The backend for the RA approximation is the nlsat tactic in Z3, since it is the only decision procedure in Z3 and MathSAT to support nonlinear constraints over reals.
Table 1: Evaluated combinations of approximation (rows) and backend (columns).

     | ACDCL | MathSAT | Z3 | nlsat
RPFP |   ✓   |    ✓    | ✓  |
BV   |       |         | ✓  |
RA   |       |         |    |   ✓
Investigated questions.
In previous work, we have observed that the RPFP approximation improves performance of bitblasting implemented in the Z3 SMT solver [24]. Here we seek to reproduce those results, but also to see whether similar behavior can be observed with other implementations and algorithms. We were interested in answering the following research questions:

Is the positive effect of the RPFP approximation on performance of the bitblasting approach for FPA independent of the implementation?

Does the RPFP approximation have a positive effect on the ACDCL algorithm for FPA?

What is the impact of approximations on the stateoftheart for the theory of FPA?
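Table 2: Results for the backends and the UppSAT instances.

                   | acdcl  | mathsat | z3     | BV(z3) | RPFP(acdcl) | RPFP(mathsat) | RPFP(z3) | RA(nlsat)
Solved             | 86     | 99      | 97     | 91     | 78          | 101           | 101      | 90
Timeouts           | 44     | 31      | 33     | 39     | 52          | 29            | 29       | 40
Best               | 65     | 4       | 6      | 9      | 3           | 9             | 9        | 4
Average Iterations | -      | -       | -      | 2.69   | 3.59        | 3.16          | 3.02     | 1.85
Max Precision      | -      | -       | -      | 23     | 2           | 1             | 2        | 110
Average Rank       | 3.81   | 5.40    | 6.33   | 5.42   | 5.32        | 4.38          | 4.53     | 6.60
Total Time (s)     | 10071  | 16748   | 34526  | 11979  | 8448        | 8279          | 14992    | 27169
Average Time (s)   | 117.10 | 169.17  | 355.94 | 131.64 | 108.30      | 81.97         | 148.43   | 301.87
Only solver        | 1      | 0       | 2      | 0      | 0           | 1             | 0        | 0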
To answer these questions, we compare the performance of the backends and the UppSAT instances on 130 nontrivial satisfiable benchmarks of the QF_FP category of the SMT-LIB benchmarks. [Footnote 3: The regression tests in the wintersteiger family were ignored for the evaluation.] On each benchmark, solvers were assigned a rank based on their solving time, i.e., if a solver had the smallest solving time, it was assigned rank 1, the solver with the next smallest solving time rank 2, etc.
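The ranking scheme can be sketched in a few lines. The function names are hypothetical, and two details are assumptions not stated in the text: timeouts (represented as `None`) are ranked after all solved runs, and ties are broken by the order in which the solvers are listed.

```python
def rank_solvers(times):
    """Map each solver to its rank on one benchmark (1 = smallest solving time).

    `times` maps a solver name to its solving time in seconds, or to None
    for a timeout.
    """
    def key(solver):
        t = times[solver]
        return (t is None, t if t is not None else 0.0)

    ordered = sorted(times, key=key)  # stable sort: ties keep listing order
    return {solver: rank for rank, solver in enumerate(ordered, start=1)}


def average_rank(results):
    """Average the per-benchmark ranks of every solver over all benchmarks."""
    totals = {}
    for times in results.values():
        for solver, rank in rank_solvers(times).items():
            totals[solver] = totals.get(solver, 0) + rank
    return {solver: total / len(results) for solver, total in totals.items()}
```

Here `results` maps each benchmark name to the per-solver times of that benchmark, so the average rank of a solver is taken over all benchmarks, including those on which it timed out.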
The results are summarized in Table 2, and a more detailed view of runtimes is provided by the cactus plot shown in Figure 11. Table 2 shows, for each solver, the number of benchmarks solved within the one-hour timeout, the number of timeouts, the number of instances for which the solver was fastest, the average number of refinement iterations on solved problems, the number of benchmarks for which refinement reached maximum precision, the average rank, the total time needed to process all benchmarks (excluding timeouts), the average solving time (excluding timeouts), and the number of instances solved only by the respective solver.
Discussion.
We can observe that the RPFP approximation combined with bit-blasting, either in Z3 or MathSAT, solves the largest number of instances. When comparing the average rank, MathSAT comes out as the marginally better choice of backend. This is expected, based on the performance of the backends themselves. All the configurations shine on at least a few benchmarks, indicating that the approximations do offer an improvement. Furthermore, the ACDCL algorithm outperforms all the other solvers on 65 benchmarks, which is also reflected in its lowest average rank, but it solves fewer benchmarks in total than the bit-blasting approaches.
Looking only at the approximations, we can see that on average the benchmarks are solved using around three iterations. The notable exception is the RA approximation, which performs at most two iterations: one with the RA approximation and one with the full FPA semantics. This indicates that for many of the benchmarks, a full-precision encoding is not really necessary, since the RPFP approximation rarely reaches maximum precision. However, the BV and RA approximations reach maximal precision more often. In their defense, both the BV and the RA approximation are presented as a proof of concept, since neither has tailored reconstruction and refinement strategies.
Virtual portfolios.
To compare the impact of the approximations on the stateoftheart, we compare a virtual portfolio of the backend solvers alone, and a virtual portfolio of both the backends and the UppSAT instances. Table 3 shows the number of benchmarks solved, the number of timeouts, and the total and average solving time. The addition of the UppSAT instances allows only two more benchmarks to be solved, compared to the backend portfolio. However, the total solving time is improved dramatically.
Table 3: Results for the virtual portfolios.

             | Virtual Portfolio (Backend) | Virtual Portfolio (All)
Solved       | 110                         | 112
Timeouts     | 20                          | 18
Total time   | 25135                       | 12516
Average time | 228.50                      | 111.75
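A virtual portfolio takes, for each benchmark, the best time achieved by any of its member solvers. A minimal sketch of this computation follows, assuming timeouts are represented as `None` and excluded from the time totals as in Table 2; the function names and the toy data are hypothetical and unrelated to the measurements above.

```python
def virtual_portfolio(results, members):
    """Best time per benchmark over the member solvers (None if all time out).

    `results` maps a benchmark name to a dict from solver name to solving
    time in seconds, with None for a timeout.
    """
    best = {}
    for benchmark, times in results.items():
        solved = [times[s] for s in members if times.get(s) is not None]
        best[benchmark] = min(solved) if solved else None
    return best


def summarize(best):
    """Aggregate a portfolio into the quantities reported per column."""
    solved = [t for t in best.values() if t is not None]
    return {
        "solved": len(solved),
        "timeouts": len(best) - len(solved),
        "total_time": sum(solved),
        "average_time": sum(solved) / len(solved) if solved else None,
    }


toy = {
    "b1": {"z3": 10.0, "RPFP(z3)": 2.0},
    "b2": {"z3": None, "RPFP(z3)": 4.0},
    "b3": {"z3": None, "RPFP(z3)": None},
}
backend_only = summarize(virtual_portfolio(toy, ["z3"]))
combined = summarize(virtual_portfolio(toy, ["z3", "RPFP(z3)"]))
```

On the toy data, adding the approximating instance both solves one more benchmark and lowers the total time on the benchmark that the backend already solved, which is the same kind of effect the two portfolios in the table are meant to expose.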
Cactus plot.
To complement the aggregated data, the cactus plot in Figure 11 shows on the X axis how many instances can be solved in the amount of time shown on the Y axis, by each of the solvers and the portfolios. The UppSAT instances are shown using full lines, while the backends are presented using dashed lines. The colors denote the same backend, e.g., mathsat and RPFP(mathsat) are both colored green.
The plot corroborates that the ACDCL algorithm is very efficient in solving many benchmarks, solving as many as 68 in less than 10 s; eventually, however, it is overtaken by the other solvers. Looking more closely at the RPFP approximation, we can conclude that it improves the performance of bit-blasting considerably, regardless of the implementation (MathSAT or Z3). On the other hand, RPFP seems to hinder, rather than help, the already very efficient ACDCL algorithm. [Footnote 4: Earlier experiments using the stable version 5.4.1 of MathSAT have shown similar effects of the RPFP approximation to those on the bit-blasting methods. However, overall the performance results were not consistent with the performance of MathSAT in previous publications, and indicated a bug. We thank Alberto Griggio for promptly providing us with a corrected version of MathSAT, which we use in the evaluation.] Furthermore, the virtual portfolios are also shown. While both portfolios solve more instances than any individual solver, the portfolio based on the backend solvers and the UppSAT instances is a clear winner, showing the impact of the presented approximations on the state of the art.