Checking Satisfiability by Dependency Sequents

Abstract

We introduce a new algorithm for checking satisfiability based on a calculus of Dependency sequents (D-sequents). Given a CNF formula $F(X)$, a D-sequent is a record stating that under a partial assignment a set of variables of $X$ is redundant in formula $\exists X[F]$. The D-sequent calculus is based on an operation called join that forms a new D-sequent from two existing ones. The new algorithm solves the quantified version of SAT. That is, given a satisfiable formula $F$, it, in general, does not produce an assignment satisfying $F$. The new algorithm is called DS-QSAT where DS stands for Dependency Sequent and Q for Quantified. Importantly, a DPLL-like procedure is only a special case of DS-QSAT where a very restricted kind of D-sequents is used. We argue that this restriction a) adversely affects the scalability of SAT-solvers and b) is caused by looking for an explicit satisfying assignment rather than just proving satisfiability. We give experimental results substantiating these claims.

1 Introduction

Algorithms for solving the Boolean satisfiability problem are an important part of modern design flows. Despite great progress in the performance of such algorithms achieved recently, the scalability of SAT-solvers still remains a big issue. In this paper, we address this issue by introducing a new method of satisfiability checking that can be viewed as a descendant of the DP procedure [3].

We consider Boolean formulas represented in Conjunctive Normal Form (CNF). Given a CNF formula $F(X)$, one can formulate two different kinds of satisfiability checking problems. We will refer to the problems of the first kind as QSAT where Q stands for quantified. Solving QSAT means just checking if $\exists X[F]$ is true. In particular, if $F$ is satisfiable, a QSAT-solver does not have to produce an assignment satisfying $F$. The problems of the second kind, which we will refer to as just SAT, are a special case of those of the first kind. If $F$ is satisfiable, a SAT-solver has to produce an assignment satisfying $F$.

Intuitively, QSAT should be easier than SAT because a QSAT-solver needs to return only one bit of information. This intuition is substantiated by the fact that checking if an integer number $N$ is prime (i.e. answering the question if non-trivial factors of $N$ exist) is polynomial while finding factors of $N$ explicitly is believed to be hard. However, the situation among practical algorithms defies this intuition. Currently, the field is dominated by procedures based on the DPLL algorithm [2], that is, by SAT-solvers. On the other hand, a classical QSAT-solver, the DP procedure [3], does not have any competitive descendants (although some elements of the DP procedure are used in formula preprocessing performed by SAT-solvers [5]).

In this paper, we introduce a QSAT-solver called DS-QSAT where DS stands for Dependency Sequent. On the one hand, DS-QSAT can be viewed as a descendant of the DP procedure. On the other hand, a DPLL-like procedure with clause learning is a special case of DS-QSAT. Like the DP procedure, DS-QSAT is based on the idea of elimination of redundant variables. A variable $x$ is redundant in $\exists X[F]$ if the latter is equivalent to $\exists X[F \setminus F^{\{x\}}]$ where $F^{\{x\}}$ is the set of all clauses of $F$ with $x$. Note that removal of the clauses of $F^{\{x\}}$ produces a formula that is equisatisfiable rather than functionally equivalent to $F$.

If $F$ is satisfiable, all variables of $F$ are redundant in $\exists X[F]$ because an empty set of clauses is satisfiable. If $F$ is unsatisfiable, one can make the variables of $F$ redundant by deriving an empty clause and adding it to $F$. An empty clause is unsatisfiable, hence all other clauses of $F$ can be dropped. So, from the viewpoint of DS-QSAT, the only difference between satisfiable and unsatisfiable formulas is as follows. If $F$ is satisfiable, its variables are already redundant and one just needs to prove this redundancy. If $F$ is unsatisfiable, one has to make the variables redundant by deriving an empty clause and adding it to $F$.

The DP procedure makes a variable $x$ of $X$ redundant in one step, by adding to $F$ all clauses that can be produced by resolution on $x$. This is extremely inefficient because it generates prohibitively large sets of clauses even for very small formulas. DS-QSAT addresses this problem by using branching. The idea is to prove redundancy of variables in subspaces and then “merge” the obtained results. DS-QSAT records the fact that a set of variables $Z$ is redundant in $\exists X[F]$ in the subspace specified by partial assignment $q$ as $(\exists X[F], s) \rightarrow Z$. Here $s$ is a subset of the assignments of $q$ relevant to redundancy of $Z$. The record $(\exists X[F], s) \rightarrow Z$ is called a dependency sequent (or D-sequent for short). To simplify notation, if $X$ and $F$ are obvious from the context, we record the D-sequent above as just $s \rightarrow Z$.

A remarkable fact is that a resolution-like operation called join can be used to produce a new D-sequent from two D-sequents derived earlier [8, 7]. Suppose, for example, that D-sequents and specify redundancy of variable in different branches of variable . Then D-sequent holds where the left part assignment of this D-sequent is obtained by taking the union of the left part assignments of the two D-sequents above but those to variable . The new D-sequent is said to be obtained by joining the two D-sequents above at variable . The calculus based on the join operation is complete. That is, eventually DS-QSAT derives D-sequent stating unconditional redundancy of the variables of in . If by the time the D-sequent above is derived, contains an empty clause, is unsatisfiable. Otherwise, is satisfiable. Importantly, if is satisfiable, derivation of D-sequent does not require finding an assignment satisfying .

DPLL-based SAT-solvers with clause learning can be viewed as a special case of DS-QSAT where only a particular kind of D-sequents is used. This limitation on D-sequents is caused by the necessity to generate a satisfying assignment as a proof of satisfiability. Importantly, this necessity prevents DPLL-based SAT-solvers from using transformations that preserve equisatisfiability rather than functional equivalence. In turn, this adversely affects the performance of SAT-solvers. We illustrate this point by comparing the performance of DPLL-like SAT-solvers and a version of DS-QSAT on compositional formulas. This version of DS-QSAT uses the strategy of lazy backtracking as opposed to that of eager backtracking employed by DPLL-based procedures. A compositional CNF formula has the form $F_1(X_1) \wedge \ldots \wedge F_k(X_k)$ where $X_i \cap X_j = \emptyset$, $i \neq j$. Subformulas $F_i$ are identical modulo variable renaming/negation. We prove theoretically that the performance of DS-QSAT is linear in $k$. On the other hand, one can argue that the average performance of DPLL-based SAT-solvers with conflict learning should be quadratic in $k$. In Section 8, we describe experiments confirming our theoretical results.

The contribution of this paper is fourfold. First, we use the machinery of D-sequents to explain some problems of DPLL-based SAT-solvers. Second, we describe a new QSAT-solver based on D-sequents called DS-QSAT. Third, we give a theoretical analysis of the behavior of DS-QSAT on compositional formulas. Fourth, we show the promise of DS-QSAT by comparing its performance with that of well-known SAT-solvers on compositional and non-compositional formulas.

This paper is structured as follows. In Section 2, we discuss the complexity of QSAT and SAT. Section 3 gives a brief introduction to DS-QSAT. We recall the D-sequent calculus in Section 4. A detailed description of DS-QSAT is given in Section 5. Section 6 gives some theoretical results on the performance of DS-QSAT. Section 7 describes a modification of DS-QSAT that allows additional pruning of the search tree. Experimental results are given in Section 8. We describe some background of this research in Section 9 and give conclusions in Section 10.

2 Is QSAT Simpler Than SAT?

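gen_sat_assgn($F$) {
1   ans := solve_qsat($F$);
2   if (ans = unsat) return(unsat);
3   $s := \emptyset$; $X'$ := $Vars(F)$;
4   while ($X' \neq \emptyset$) {
5     $x$ := pick_variable($X'$);
6     if (solve_qsat($F_{x=0}$) = sat)
7       val := 0;
8     else val := 1;
9     $F := F_{x=val}$;
10    $s := s \cup \{(x = val)\}$;
11    $X' := X' \setminus \{x\}$; }
12  return($s$); }
Figure 1: SAT-solving by QSAT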

In this section, we make the following point. Both QSAT-solvers and SAT-solvers have exponential complexity on the set of all CNF formulas, unless P = NP. However, this is not true for subsets of CNF formulas. It is possible that a set $K$ of formulas describing, say, properties of a parameterized set of designs can be solved in polynomial time by some QSAT-solver while any SAT-solver has exponential complexity on $K$.

To illustrate the point above, let us consider procedure gen_sat_assgn shown in Figure 1. It finds an assignment satisfying a CNF formula $F$ (if any) by solving a sequence of QSAT problems. First, gen_sat_assgn calls a QSAT-solver solve_qsat to check if $F$ is satisfiable (line 2). If it is, gen_sat_assgn picks a variable $x$ of $F$ (line 5) and calls solve_qsat to find an assignment $x = val$ under which formula $F_{x=val}$ is satisfiable (lines 6-8). Since $F$ is satisfiable, $F_{x=0}$ and/or $F_{x=1}$ has to be satisfiable. Then gen_sat_assgn fixes variable $x$ at the chosen value val and adds ($x$=val) to assignment $s$ (lines 9-10) that was originally empty. The gen_sat_assgn procedure keeps assigning variables of $F$ in the same manner in a loop (lines 5-11) until every variable of $F$ is assigned. At this point, $s$ is a satisfying assignment of $F$.
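The reduction described above can be sketched in a few lines of Python. This is only an illustrative sketch: clauses are assumed to be sets of non-zero integers (a negative integer denotes a negated variable), and `solve_qsat` is a brute-force stand-in for a real QSAT-solver that returns only a yes/no answer. The helper name `restrict` is hypothetical and not taken from the paper.

```python
from itertools import product

def restrict(clauses, var, val):
    """Return the formula under the single assignment var=val (val is 0 or 1)."""
    true_lit = var if val else -var
    result = []
    for clause in clauses:
        if true_lit in clause:
            continue  # clause is satisfied, so it is dropped
        result.append(frozenset(l for l in clause if l != -true_lit))
    return result

def solve_qsat(clauses):
    """Stand-in oracle: brute-force check that returns only True/False."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def gen_sat_assgn(clauses):
    """Build a satisfying assignment using only yes/no satisfiability checks."""
    if not solve_qsat(clauses):
        return None  # formula is unsatisfiable
    assignment = {}
    for x in sorted({abs(l) for c in clauses for l in c}):
        # At least one of the two restrictions must be satisfiable.
        val = 0 if solve_qsat(restrict(clauses, x, 0)) else 1
        clauses = restrict(clauses, x, val)
        assignment[x] = val
    return assignment
```

The sketch makes one oracle call up front and one per variable, so the number of QSAT checks grows linearly with the number of variables; the cost of each check is what the surrounding discussion is about.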

The number of QSAT checks performed by solve_qsat in gen_sat_assgn is linear in the number of variables of $F$. So if there is a QSAT-solver solving all satisfiable CNF formulas in polynomial time, gen_sat_assgn can use this QSAT-solver in its inner loop to find a satisfying assignment for any satisfiable formula in polynomial time. However, this is not true when considering a subset $K$ of all possible CNF formulas. Suppose there is a QSAT-solver solving the formulas of $K$ in polynomial time. Let $F$ be a formula of $K$. Let $F_q$ denote $F$ under partial assignment $q$. The fact that $F \in K$ does not imply $F_q \in K$. So the behavior of gen_sat_assgn using this QSAT-solver in the inner loop may actually be exponential if this QSAT-solver does not perform well on formulas $F_q$.

For example, one can form a subset $K$ of all possible CNF formulas such that a) a formula $F \in K$ describes a check that a number $N$ is composite and b) an assignment satisfying $F$ (if any) specifies two numbers $A$, $B$ such that $A > 1$, $B > 1$ and $N = A \cdot B$. The satisfiability of formulas in $K$ can be checked by a QSAT-solver in polynomial time [14]. At the same time, finding satisfying assignments of formulas from $K$, i.e. factorization of composite numbers, is believed to be hard. In particular, gen_sat_assgn cannot use the QSAT-solver above to find satisfying assignments for formulas of $K$ in polynomial time. The reason is that formula $F_q$ does not specify a check if a number is composite. That is, $F \in K$ does not imply that $F_q \in K$.

Note that a SAT-solver is also limited in the ways of proving unsatisfiability. For a SAT-solver, such a proof is just a failed attempt to build a satisfying assignment explicitly. For example, instead of using the polynomial algorithm of [14], a SAT-solver would prove that a number $N$ is prime by failing to find two non-trivial factors of $N$.

3 Brief Comparison of DPLL-based SAT-solvers and DS-QSAT in Terms of D-sequents

In this section, we use the notion of D-sequents to discuss some limitations of DPLL-based SAT-solvers. We also explain how DS-QSAT (described in Section 5 in detail) overcomes those limitations.

Example 1

Let SAT_ALG be a DPLL-based SAT-solver with clause learning. We assume that the reader is familiar with the basics of such SAT-solvers [15, 16]. Let be a CNF formula of 8 clauses where , , , , , , , . The set of variables of is equal to .

Let SAT_ALG first make assignment . This satisfies clauses ,, and removes literal from . Let SAT_ALG then make assignment . Removing literal from and turn them into unit clauses and respectively. This means that SAT_ALG ran into a conflict. At this point, SAT_ALG generates conflict clause that is obtained by resolving clauses and on and adds to . After that, SAT_ALG erases assignment and the assignment made by SAT_ALG to and runs BCP that assigns to satisfy that is currently unit. In terms of D-sequents, one can view generation of conflict clause and adding it to as derivation of D-sequent equal to . D-sequent says that making assignments falsifying clause renders all unassigned variables redundant. Note that is inactive in the subspace that SAT_ALG enters after assigning 1 to . (We will say that D-sequent    is active in the subspace specified by partial assignment if the assignments of are a subset of those of .) So the variables proved redundant in subspace become non-redundant again.

One may think that reappearance of variables in subspace is “inevitable” but this is not so. Variables , have at least two reasons to be redundant in subspace . First, is falsified in this subspace. Second, the only clauses of containing variables , are ,,,. But and are satisfied by and can be satisfied by an assignment to ,. So ,,, can be removed from in subspace without affecting the satisfiability of . Hence D-sequents and equal to and are true. (In Example 3, we will show how and are derived by DS-QSAT.) Suppose that one replaces the D-sequent above with D-sequents where is equal to . Note that only D-sequent is inactive in subspace . So only variable reappears after changes its value from 0 to 1

The example above illustrates the main difference between SAT_ALG and DS-QSAT in terms of D-sequents. At every moment, SAT_ALG has at most one active D-sequent. This D-sequent is of the form    where is an assignment falsifying a clause of and is the set of all variables that are currently unassigned. DS-QSAT may have a set of active D-sequents where , ,. When SAT_ALG changes the value of variable of , all the variables of reappear as non-redundant. When DS-QSAT changes the value of , variables of reappear only if . So only a subset of variables of reappear.
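The difference in how many variables reappear can be illustrated with a small sketch. It assumes a D-sequent is stored as a pair (left-part assignment as a dict, set of redundant variables); the function names and the sample D-sequents are hypothetical and chosen only for illustration.

```python
def is_active(s, q):
    """A D-sequent with left part s is active in subspace q
    if every assignment of s also appears in q."""
    return all(q.get(var) == val for var, val in s.items())

def reappearing_vars(dseqs, q_old, q_new):
    """Variables whose D-sequents were active in q_old but are not in q_new."""
    out = set()
    for s, redundant in dseqs:
        if is_active(s, q_old) and not is_active(s, q_new):
            out |= redundant
    return out

# One coarse D-sequent (eager backtracking style): everything depends on x2.
coarse = [({1: 0, 2: 0}, {3, 4, 5})]
# Several finer D-sequents: only variable 3 depends on x2.
fine = [({1: 0, 2: 0}, {3}), ({1: 0}, {4}), ({1: 0}, {5})]

q_left = {1: 0, 2: 0}   # left branch of x2
q_right = {1: 0, 2: 1}  # right branch of x2
print(reappearing_vars(coarse, q_left, q_right))  # all three variables
print(reappearing_vars(fine, q_left, q_right))    # only variable 3
```

With the coarse D-sequent, flipping variable 2 brings back all three variables; with the finer set, only the variable whose D-sequent mentions variable 2 comes back.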

To derive D-sequents    above, DS-QSAT goes on branching in the presence of a conflict. Informally, the goal of such branching is to find alternative ways of proving redundancy of variables from . So DS-QSAT uses extra branching to minimize the number of variables reappearing in the right branch (after the left branch has been explored). This should eventually lead to the opposite result i.e. to reducing the amount of branching. Looking for alternative ways to prove redundancy can be justified as follows. A practical formula typically can be represented as . Here are internal variables of and are “communication” variables that may share with some other subformulas , . One can view as describing a “design block” with external variables . Usually, is much smaller than . Let a clause of be falsified by the current assignment due to a conflict. Suppose that at the time of the conflict all variables of of subformula were assigned and their values were specified by assignment . Suppose is consistent for i.e. can be extended by assignments to to satisfy . This means that the variables of are redundant in subspace in where . Then by branching on variables of one can derive D-sequent   . If is inconsistent for , then by branching on variables of one can derive a clause falsified by . Adding to makes the variables of redundant in in subspace . So the existence of many ways to prove variable redundancy is essentially implied by the fact that formula has structure.

The possibility to control the size of right branches gives an algorithm a lot of power. Suppose, for example, that an algorithm guarantees that the number of variables reappearing in the right branch is bounded by a constant $d$. We assume that this applies to the right branch going out of any node of the search tree, including the root node. Then the size of the search tree built by such an algorithm is $O(|X| \cdot 2^{d})$. Here $|X|$ is the maximum depth of a search tree built by branching on variables of $X$ and $2^{d}$ is the number of nodes in a full binary sub-tree over $d$ variables. So the factor $2^{d}$ limits the size of the right branch. The complexity of an algorithm building such a search tree is linear in $|X|$. In Section 6, we show that bounding the size of right branches by a constant is exactly the reason why the complexity of DS-QSAT on compositional formulas is linear in the number of subformulas.

The limitation of D-sequents available to SAT_ALG is consistent with the necessity to produce a satisfying assignment. Although such a limitation cripples the ability of an algorithm to compute the parts of the formula that are redundant in the current subspace, it does not matter much for SAT_ALG. The latter simply cannot use this redundancy because it is formulated with respect to formula $\exists X[F]$ rather than $F$. Hence, discarding the clauses containing redundant variables preserves equisatisfiability rather than functional equivalence. So, an algorithm using such transformations cannot guarantee that a satisfying assignment it found is correct.

4 D-sequent Calculus

In this section, we recall the D-sequent calculus introduced in [8, 7]. In Subsections 4.1 and 4.2 we give basic definitions and describe simple cases of variable redundancy. The notion of D-sequents is introduced in Subsection 4.3. Finally, the operation of joining D-sequents is presented in Subsection 4.4.

4.1 Basic definitions

Definition 1

A literal of a Boolean variable $x$ is either $x$ itself or its negation. A clause is a disjunction of literals. A formula $F$ represented as a conjunction of clauses is said to be the Conjunctive Normal Form (CNF) of $F$. A CNF formula $F$ is also viewed as a set of clauses. Let $q$ be an assignment, $F$ be a CNF formula, and $C$ be a clause. $Vars(q)$ denotes the variables assigned in $q$; $Vars(F)$ denotes the set of variables of $F$; $Vars(C)$ denotes the set of variables of $C$.

Definition 2

Let $q$ be an assignment. Clause $C$ is satisfied by $q$ if a literal of $C$ is set to 1 by $q$. Otherwise, $C$ is falsified by $q$. Assignment $q$ satisfies $F$ if $q$ satisfies every clause of $F$.

Definition 3

Let $F$ be a CNF formula and $q$ be a partial assignment to variables of $F$. Denote by $F_q$ the CNF formula that is obtained from $F$ by a) removing all clauses of $F$ satisfied by $q$; b) removing the literals set to 0 by $q$ from the clauses that are not satisfied by $q$. Notice that if $q = \emptyset$, then $F_q = F$.

Definition 4

Let $F$ be a CNF formula and $Z$ be a subset of $Vars(F)$. Denote by $F^Z$ the set of all clauses of $F$ containing at least one variable of $Z$.

Definition 5

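The variables of $Z$ are redundant in formula $\exists X[F]$ if $\exists X[F] \equiv \exists X[F \setminus F^Z]$. We note that since $F \setminus F^Z$ does not contain any variables of $Z$, we could have written $\exists (X \setminus Z)[F \setminus F^Z]$. To simplify notation, we avoid explicitly using this optimization in the rest of the paper.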
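For a formula in which every variable is quantified, the redundancy condition reduces to comparing two yes/no answers, which the following sketch checks by brute force. It assumes clauses are sets of non-zero integers with negative integers for negated variables; the function names are hypothetical and the exhaustive check is only meant for tiny examples.

```python
from itertools import product

def satisfiable(clauses):
    """Brute-force satisfiability check (exponential, for tiny formulas only)."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def clauses_with(clauses, z):
    """All clauses containing at least one variable of the set z."""
    return [c for c in clauses if any(abs(l) in z for l in c)]

def is_redundant(clauses, z):
    """Check whether dropping every clause that mentions a variable of z
    leaves the truth value of the fully quantified formula unchanged."""
    dropped = clauses_with(clauses, z)
    rest = [c for c in clauses if c not in dropped]
    return satisfiable(clauses) == satisfiable(rest)

sat_formula = [frozenset({1, 2}), frozenset({-2})]
unsat_formula = [frozenset({1}), frozenset({-1}), frozenset({2})]
print(is_redundant(sat_formula, {1}))    # True: formula is satisfiable
print(is_redundant(unsat_formula, {2}))  # True: the conflict on x1 remains
print(is_redundant(unsat_formula, {1}))  # False: removing x1 clauses hides the conflict
```

The last call shows why variables of an unsatisfiable formula are not redundant until an empty clause has been derived and added.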

Definition 6

Let $q$ and $s$ be assignments. The expression $s \leq q$ denotes the fact that $Vars(s) \subseteq Vars(q)$ and each variable of $Vars(s)$ has the same value in $s$ and $q$.

4.2 Simple cases of variable redundancy

There are at least two cases where proving that a variable of $X$ is redundant in $\exists X[F]$ is easy. The first case concerns monotone variables of $F$. A variable $x$ of $F$ is called monotone if all clauses of $F$ containing $x$ have only positive (or only negative) literals of $x$. A monotone variable $x$ is redundant in $\exists X[F]$ because removing the clauses with $x$ from $F$ does not change the satisfiability of $F$. The second case concerns the presence of an empty clause. If $F$ contains such a clause, every variable of $F$ is redundant.
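Both easy cases can be detected syntactically. The sketch below assumes clauses are sets of non-zero integers with negative integers for negated variables; the function names are hypothetical.

```python
def monotone_vars(clauses):
    """Variables that occur in only one polarity across all clauses."""
    positive = {l for c in clauses for l in c if l > 0}
    negative = {-l for c in clauses for l in c if l < 0}
    # Symmetric difference keeps variables seen in exactly one polarity.
    return positive ^ negative

def has_empty_clause(clauses):
    """True if the formula contains an empty (unsatisfiable) clause."""
    return any(len(c) == 0 for c in clauses)

formula = [frozenset({1, 2}), frozenset({1, -2}), frozenset({-3, 2})]
print(monotone_vars(formula))     # variables 1 and 3 are monotone
print(has_empty_clause(formula))  # False
```

Variable 2 occurs in both polarities, so it is not reported as monotone.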

4.3 D-sequents

Definition 7

Let $F$ be a CNF formula. Let $s$ be an assignment to $X$ and $Z$ be a subset of $X \setminus Vars(s)$. A dependency sequent (D-sequent) has the form $(\exists X[F], s) \rightarrow Z$. It states that the variables of $Z$ are redundant in $\exists X[F_s]$. If the formula $\exists X[F]$ for which a D-sequent holds is obvious from the context, we will write this D-sequent in a short notation: $s \rightarrow Z$.

Example 2

Let be a CNF formula of four clauses: , , , . Notice that since clause is satisfied in subspace , variable is monotone in formula . So D-sequent holds. On the other hand, the assignment falsifies clause . So variable is redundant in and D-sequent    holds.

4.4 Join Operation for D-sequents

Proposition 1 ([8])

Let be a CNF formula. Let D-sequents    and    hold, where . Let , have different values for exactly one variable . Let consist of all assignments of , but those to . Then, D-sequent    holds too.

We will say that the D-sequent in the conclusion of Proposition 1 is obtained by joining the two D-sequents of its premise at the variable on which their assignments differ. The join operation is complete [8, 7]. That is, eventually the D-sequent $\emptyset \rightarrow X$ is derived, proving that the variables of the current formula $F$ are redundant. If $F$ contains an empty clause, then $F$ is unsatisfiable. Otherwise, it is satisfiable.
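A minimal sketch of the join operation follows. It assumes a D-sequent is a pair (left-part assignment as a dict, frozenset of redundant variables) and that both D-sequents being joined state redundancy of the same set of variables; the function name and the sample D-sequents are hypothetical.

```python
def join(d1, d2, x):
    """Join two D-sequents at variable x and return the new D-sequent."""
    (s1, z1), (s2, z2) = d1, d2
    if z1 != z2:
        raise ValueError("both D-sequents must state redundancy of the same variables")
    if x not in s1 or x not in s2 or s1[x] == s2[x]:
        raise ValueError("x must be assigned opposite values in the two left parts")
    for v in s1.keys() & s2.keys():
        if v != x and s1[v] != s2[v]:
            raise ValueError("left parts may differ only at x")
    # Union of both left parts with the assignments to x removed.
    s = {v: val for v, val in {**s1, **s2}.items() if v != x}
    return s, z1

left = ({1: 0, 2: 0}, frozenset({9}))
right = ({2: 1, 5: 1}, frozenset({9}))
print(join(left, right, 2))  # left part {1: 0, 5: 1}, redundant set {9}
```

The result no longer mentions the variable that was joined on, which is what allows a redundancy proof obtained in two branches to be lifted to their parent subspace.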

An obvious difference between the D-sequent calculus and resolution is that the former can handle both satisfiable and unsatisfiable formulas. This limitation of resolution is due to the fact that it operates on subspaces where formula is unsatisfiable. One can interpret resolving clauses to produce clause as using the Boolean cubes , where and are unsatisfiable to produce a new Boolean cube where the resolvent is unsatisfiable. On the contrary, the join operation can be performed over parts of the search space where may be satisfiable. When D-sequents    and    are joined, it does not matter whether formulas and are satisfiable. The only thing that matters is that variables are redundant in and .

4.5 Virtual redundancy

Let be a CNF formula and be an assignment to . Let and . The fact that variables of are redundant in , in general, does not mean that they are redundant in . Suppose, for example, that is satisfiable, is unsatisfiable, does not have a clause falsified by and . Then formula has no clauses and so is satisfiable. Hence and so the variables of are not redundant in . On the other hand, since is satisfiable, the variables of are redundant in .

We will say that the variables of are virtually redundant in where if either a) or b) and is satisfiable. In other words, if variables are virtually redundant in , removing the clauses with a variable of from may be wrong but only locally. From the global point of view this mistake does not matter because it occurs only when is satisfiable.

We need a new notion of redundancy because the join operation introduced above preserves virtual redundancy [8] rather than redundancy in terms of Definition 5. Suppose, for example, that the variables of $Z$ are redundant in $\exists X[F_{s'}]$ and $\exists X[F_{s''}]$ in terms of Definition 5 and so D-sequents $s' \rightarrow Z$ and $s'' \rightarrow Z$ hold. Let $s \rightarrow Z$ be the D-sequent obtained by joining the D-sequents above. Then one can guarantee only that the variables of $Z$ are virtually redundant in $\exists X[F_s]$. For that reason, we need to replace the notion of redundancy of Definition 5 with that of virtual redundancy. In the rest of the paper, we will omit the word “virtually”. That is, when we say that variables of $Z$ are redundant in $\exists X[F_s]$ we actually mean that they are virtually redundant in $\exists X[F_s]$.

5 Description of DS-QSAT

In this section, we describe DS-QSAT, a QSAT-solver based on the machinery of D-sequents.

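// $F$ is a CNF formula
// $q$ is an assignment to $X$
// $\Omega$ is a set of active D-sequents
DS-QSAT($F$, $q$, $\Omega$) {
1   if (empty_clause($F$)) exit(unsat);
2   if (a clause of $F$ is falsified by $q$)
3     if (left_branch($q$))
4       $\Omega$ := update_Dseqs($F$, $q$, $\Omega$);
5     else {
6       $\Omega$ := finish_Dseqs($F$, $q$, $\Omega$);
7       return($F$, $\Omega$); }
8   $\Omega$ := monot_vars_Dseqs($F$, $q$, $\Omega$);
9   if (all_vars_assgn_or_redund($q$, $\Omega$))
10    if (no clause of $F$ is falsified by $q$) exit(sat);
11    else return($F$, $\Omega$);
- - - - - - - - - - - - - -
12  $x$ := pick_variable($F$, $q$, $\Omega$);
13  $q_0$ = $q \cup \{(x = 0)\}$;
14  ($F$, $\Omega_0$) := DS-QSAT($F$, $q_0$, $\Omega$);
15  partition $\Omega_0$ into $\Omega^{sym}$ and $\Omega^{asym}$;
16  if ($\Omega^{asym} = \emptyset$) return($F$, $\Omega_0$);
17  recover the variables and clauses made redundant by $\Omega^{asym}$;
18  $q_1$ = $q \cup \{(x = 1)\}$;
19  ($F$, $\Omega_1$) := DS-QSAT($F$, $q_1$, $\Omega^{sym}$);
- - - - - - - - - - - - - -
20  ($F$, $\Omega$) := merge($F$, $q$, $x$, $\Omega_0$, $\Omega_1$);
21  return($F$, $\Omega$); }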

Figure 2: DS-QSAT procedure

5.1 High-level view

Pseudocode of DS-QSAT is given in Figure 2. DS-QSAT accepts a CNF formula $F$, a partial assignment $q$ to $X$ where $X = Vars(F)$, and a set $\Omega$ of active D-sequents stating redundancy of some variables from $X \setminus Vars(q)$ in subspace $q$. DS-QSAT returns a CNF formula $F$ that consists of the clauses of the initial formula plus some resolvent clauses and a set $\Omega$ of D-sequents stating redundancy of every variable of $X \setminus Vars(q)$ in subspace $q$. To check satisfiability of a CNF formula, one needs to call DS-QSAT with $q = \emptyset$, $\Omega = \emptyset$.

DS-QSAT is a branching procedure. If DS-QSAT cannot prove redundancy of some variables in the current subspace, it picks one of these variables and branches on it. So DS-QSAT builds a binary search tree where a node corresponds to a branching variable. We will refer to the first (respectively second) assignment to $x$ as the left (respectively right) branch of $x$. Although Boolean Constraint Propagation (BCP) is not explicitly mentioned in Figure 2, it is included in the pick_variable procedure as follows. Let $q$ be the current partial assignment. Then a) preference is given to branching on variables of unit clauses of $F_q$ (if any); b) if $x$ is a variable of a unit clause $C$ of $F_q$ and $x$ is picked for branching, then the value satisfying $C$ is assigned first.
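One way to fold BCP into variable selection is sketched below. The signature and return convention are assumptions made for illustration (the paper names the procedure pick_variable but does not spell out its interface): clauses are sets of non-zero integers, `q` maps assigned variables to 0 or 1, and `redundant` is the set of variables currently proved redundant.

```python
def pick_variable(clauses, q, redundant):
    """Return (variable, value to try first), or None if nothing is left.

    Clauses containing a redundant variable are ignored. A variable of a
    unit clause is preferred, and the value satisfying that clause is
    tried first. Otherwise the smallest free variable is tried with 0.
    """
    candidates = set()
    for clause in clauses:
        if any(abs(l) in redundant for l in clause):
            continue  # clause is marked redundant in this subspace
        if any(q.get(abs(l)) == (1 if l > 0 else 0) for l in clause):
            continue  # clause is already satisfied by q
        free = [l for l in clause if abs(l) not in q]
        if len(free) == 1:
            lit = free[0]
            return abs(lit), (1 if lit > 0 else 0)
        candidates.update(abs(l) for l in free)
    if not candidates:
        return None
    return min(candidates), 0

formula = [frozenset({1, 2}), frozenset({-1, 3}), frozenset({2, 3, 4})]
print(pick_variable(formula, {1: 1}, set()))  # (3, 1): unit clause forces x3 = 1 first
print(pick_variable(formula, {}, set()))      # (1, 0): no unit clause yet
```

Choosing the smallest free variable when no unit clause exists is an arbitrary tie-break for the sketch, not a heuristic from the paper.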

As soon as a variable $x$ is proved redundant in the current subspace $q$, a D-sequent $s \rightarrow \{x\}$ is recorded where $s$ is a subset of assignments of $q$. All the clauses of $F$ containing variable $x$ are marked as redundant and ignored until $x$ becomes non-redundant again. This happens when a variable of $s$ changes its value, making the D-sequent $s \rightarrow \{x\}$ inactive in the current subspace.

As we mentioned in Section 3, if a clause containing a variable is falsified after an assignment is made to , DS-QSAT keeps making assignments to unassigned non-redundant variables. However, this happens only in the left branch of . If is falsified in the right branch of , DS-QSAT backtracks. A unit clause gets falsified in the left branch only when DS-QSAT tries to satisfy another unit clause such that and have the opposite literals of a variable . We will refer to the node of the search tree corresponding to as a conflict one. The number of conflict nodes DS-QSAT may have is not limited.

DS-QSAT consists of three parts. In Figure 2, they are separated by dashed lines. In the first part, described in Subsections 5.3 and 5.4 in more detail, DS-QSAT checks for termination conditions and builds D-sequents for variables whose redundancy is obvious. In the second part (Subsection 5.5), DS-QSAT picks an unassigned non-redundant variable and splits the current subspace into subspaces and . Finally, DS-QSAT merges the results of branches and (Subsection 5.6).

5.2 Eager and lazy backtracking (DPLL as a special case of DS-QSAT)

Let be the current partial assignment to variables of and variable be the variable assigned in most recently. Let be assigned a first value (left branch). Let be a clause of falsified after is assigned in . In this case, procedure update_Dseqs of DS-QSAT (line 4 of Figure 2), derives a D-sequent   . Here is the smallest subset of assignments of falsifying and is a subset of the current set of the unassigned, non-redundant variables.

The version where i.e. where no D-sequent    is derived by update_Dseqs will be called DS-QSAT with lazy backtracking. In our theoretical and experimental evaluation of DS-QSAT given in Sections 6 and 8 we used the version with lazy backtracking. The version of DS-QSAT where is always equal to will be referred to as DS-QSAT with eager backtracking. DPLL is a special case of DS-QSAT where the latter employs eager backtracking. In this case, all unassigned variables are declared redundant and DS-QSAT immediately backtracks without trying to prove redundancy of variables of in some other ways.

// =; =;
// , if no clause of
// is falsified by , respectively
{
1 for () {
2 if ()
continue;
3 ;
4 ;
5 ;
6 ;}
- - - - - - - - - - - - - - -
7 ;
8 ;
9 if (() and ()) {
10 ;
11 ;
12 ; }
13 else
14 ;
15 return(); }
Figure 3: merge procedure

5.3 Termination conditions

DS-QSAT reports unsatisfiability if the current formula $F$ contains an empty clause (line 1 of Figure 2). DS-QSAT reports satisfiability if no clause of $F$ is falsified by the current assignment $q$ and every variable of $F$ is either assigned in $q$ or proved redundant in subspace $q$ (line 10). Note that DS-QSAT uses a slight optimization here by terminating before the D-sequent $\emptyset \rightarrow X$ is derived stating unconditional redundancy of the variables of $X$ in $F$.
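The two termination checks can be written down directly. The sketch below assumes clauses are sets of non-zero integers, `q` maps assigned variables to 0 or 1, and `redundant` is the set of variables proved redundant in the current subspace; the function names and the string return values are hypothetical.

```python
def falsified(clause, q):
    """A clause is falsified by q if every one of its literals is set to 0."""
    return all(q.get(abs(l)) == (0 if l > 0 else 1) for l in clause)

def termination_status(clauses, variables, q, redundant):
    """Return 'unsat', 'sat', or None if the search has to continue."""
    if any(len(c) == 0 for c in clauses):
        return "unsat"  # an empty clause is present
    all_done = all(v in q or v in redundant for v in variables)
    if all_done and not any(falsified(c, q) for c in clauses):
        return "sat"    # nothing falsified, nothing left to decide
    return None

formula = [frozenset({1, 2}), frozenset({-1})]
print(termination_status(formula, {1, 2}, {1: 0}, {2}))  # 'sat'
print(termination_status(formula, {1, 2}, {}, set()))    # None
```

In the first call, variable 2 is treated as already proved redundant, so satisfiability is reported without assigning it.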

If no termination condition is met but every variable of $F$ is assigned or proved redundant, DS-QSAT ends the current call and returns $F$ and $\Omega$ (lines 7,11). In contrast to operator return, the operator exit used in lines 1,10 eliminates the entire stack of nested calls of DS-QSAT.

5.4 Derivation of atomic D-sequents

Henceforth, for simplicity, we will assume that DS-QSAT  derives D-sequents of the form    i.e. for single variables. A D-sequent    is then represented as different D-sequents   , .

In the two cases below, variable redundancy is obvious. In these cases, DS-QSAT derives D-sequents we will call atomic. The first case is when a clause $C$ of $F$ is falsified by the current assignment $q$. This kind of D-sequents is derived by procedures update_Dseqs (line 4) and finish_Dseqs (line 6). Let $x$ be the variable assigned in $q$ most recently. Let $C$ be a clause of $F$ falsified after the current assignment to $x$ is made. If $x$ is assigned a first value (left branch), then, as we mentioned in Subsection 5.2, for some unassigned variables $x_{i_1},\ldots,x_{i_k}$ that are not proved redundant yet, one can build D-sequents $s \rightarrow \{x_{i_1}\}$,…,$s \rightarrow \{x_{i_k}\}$. Here $s$ is the shortest assignment falsifying $C$. So update_Dseqs may leave some unassigned variables non-redundant. On the contrary, finish_Dseqs is called in the right branch of $x$. In this case, for every unassigned variable $x_i$ that is not proved redundant yet, D-sequent $s \rightarrow \{x_i\}$ is generated. So on exit from finish_Dseqs, every variable of $F$ is either assigned or proved redundant.

D-sequents of monotonic variables are the second case of atomic D-sequents. They are generated by procedure monot_vars_Dseqs (line 8) and by procedure monot_var_Dseq called when DS-QSAT merges results of branches (line 14 of Figure 3). Let be the current partial assignment and be a monotone unassigned variable of . Assume for the sake of clarity, that only clauses with positive polarity of are present in . This means that every clause of with literal is either satisfied by or contains a variable proved redundant in . Then DS-QSAT generates D-sequent    where is formed from assignments of as follows. For every clause of with literal assignment a) contains an assignment satisfying or b) contains all the assignments of such that D-sequent    is active and is a variable of . Informally, contains a set of assignments under which variable becomes monotone.

5.5 Branching in DS-QSAT

When DS-QSAT cannot prove redundancy of some unassigned variables in the current subspace $q$, it picks a non-redundant variable $x$ for branching (line 12 of Figure 2). First, DS-QSAT calls itself with assignment $q_0 = q \cup \{(x=0)\}$. (Figure 2 shows the case when assignment $x=0$ is explored in the left branch but obviously the assignment $x=1$ can be explored before $x=0$.) Then DS-QSAT partitions the returned set of D-sequents $\Omega_0$ into $\Omega^{sym}$ and $\Omega^{asym}$.

The set $\Omega^{sym}$ consists of the D-sequents $s \rightarrow \{x'\}$ of $\Omega_0$ such that $s$ does not contain an assignment to $x$. The D-sequents of $\Omega^{sym}$ remain active in the branch $x=1$. The set $\Omega^{asym}$ consists of the D-sequents $s \rightarrow \{x'\}$ such that $s$ contains assignment $(x=0)$. The D-sequents of $\Omega^{asym}$ are inactive in the subspace $q_1 = q \cup \{(x=1)\}$ and the variables whose redundancy is stated by those D-sequents reappear in the right branch. If $\Omega^{asym} = \emptyset$, there is no reason to explore the right branch. So, DS-QSAT just returns the set of D-sequents $\Omega_0$ (line 16). Otherwise, DS-QSAT recovers the variables and clauses that were marked redundant after D-sequents from $\Omega^{asym}$ were derived (line 17) and calls itself with partial assignment $q_1$.
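The partitioning step after the left branch can be sketched as follows, assuming a D-sequent is a pair (left-part assignment as a dict, set of redundant variables). The function name and the sample D-sequents are hypothetical.

```python
def partition_dseqs(dseqs, x):
    """Split D-sequents by whether their left part assigns the branching variable x.

    Those that do not mention x stay active in the other branch; those that
    do become inactive there, and their variables have to be re-examined.
    """
    symmetric, asymmetric = [], []
    for s, redundant in dseqs:
        (asymmetric if x in s else symmetric).append((s, redundant))
    return symmetric, asymmetric

left_result = [({1: 0}, {4}), ({1: 0, 2: 0}, {3}), ({}, {5})]
sym, asym = partition_dseqs(left_result, 2)
print(sym)   # D-sequents that do not depend on variable 2
print(asym)  # D-sequents that depend on variable 2

# If nothing depends on the branching variable, the right branch can be skipped.
skip_right_branch = len(asym) == 0
```

The smaller the second list, the fewer variables reappear in the right branch, which is the quantity the discussion in Section 3 aims to keep bounded.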

5.6 Merging results of branches

After both branches of variable $x$ have been explored, DS-QSAT merges the results by calling the merge procedure (line 20). The pseudocode of merge is shown in Figure 3. DS-QSAT backtracks only when every unassigned variable is proved redundant in the current subspace. The objective of merge is to maintain this invariant by a) replacing the current D-sequents that depend on the branching variable $x$ with those that are symmetric in $x$; b) building a D-sequent for the branching variable itself.

The merge procedure consists of two parts separated in Figure 3 by the dotted line. In the first part, merge builds D-sequents for the variables of . In the second part, it builds a D-sequent for the branching variable. In the first part, merge iterates over variables . Let be a variable of . If the current D-sequent for (i.e. the D-sequent for from the set returned in the right branch) is symmetric in , then there is no need to build a new D-sequent (line 2). Otherwise, a new D-sequent for that does not depend on is generated as follows. Let and be the D-sequents for variable contained in and respectively (lines 3,4). That is and were generated for variable in branches and . Then D-sequent is produced by joining and at variable (line 5).

Figure 4: Search tree built by DS-QSAT

Generation of a D-sequent for the branching variable itself depends on whether node (i.e. the node of the search tree corresponding to ) is a conflict one. If so, contains clauses and that have variable and are falsified by and respectively. In this case, to make variable redundant, merge generates the resolvent of and on variable and adds to (lines 10, 11). Then D-sequent    is generated where is the shortest assignment falsifying clause (line 12).
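The conflict case can be illustrated with a small sketch that resolves two clauses on a variable and computes the shortest assignment falsifying the resolvent. Clauses are represented here as frozen sets of signed integers (a DIMACS-like convention chosen for the example), and the function names are hypothetical.

```python
from typing import Dict, FrozenSet

Clause = FrozenSet[int]  # literal k means variable k, -k means its negation


def resolve(c_pos: Clause, c_neg: Clause, v: int) -> Clause:
    """Resolvent of c_pos (containing v) and c_neg (containing -v) on v."""
    if v not in c_pos or -v not in c_neg:
        raise ValueError("clauses must contain opposite literals of the variable")
    return (c_pos - {v}) | (c_neg - {-v})


def shortest_falsifying_assignment(clause: Clause) -> Dict[int, int]:
    """Assign every literal of the clause to false and nothing else."""
    return {abs(lit): 0 if lit > 0 else 1 for lit in clause}
```

Resolving `{1, 3}` and `{2, -3}` on variable 3 yields `{1, 2}`, whose shortest falsifying assignment sets variables 1 and 2 to 0.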

Figure 5: D-sequents of Figure 4

If node is not a conflict one, this means that clause and/or clause does not exist. Suppose, for example, that no clause containing variable is falsified by . This means that every clause with the positive literal of is either satisfied by or contains a variable redundant in subspace . In other words, is monotone in after removing the clauses with redundant variables. Then an atomic D-sequent is generated by merge (line 14) as described in Subsection 5.4.
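The monotonicity test described above can be sketched as follows. This is an illustrative check under assumed data structures (signed-integer literals, a dictionary for the partial assignment, a set of redundant variables); the names `is_satisfied` and `is_monotone` are not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable, Set

Clause = FrozenSet[int]  # literal k means variable k, -k means its negation


def is_satisfied(clause: Clause, assignment: Dict[int, int]) -> bool:
    """True if some literal of the clause is made true by the assignment."""
    return any(assignment.get(abs(l)) == (1 if l > 0 else 0) for l in clause)


def is_monotone(
    clauses: Iterable[Clause],
    assignment: Dict[int, int],
    redundant: Set[int],
    v: int,
) -> bool:
    """Check whether v occurs in at most one polarity in the current subspace.

    Satisfied clauses and clauses containing a redundant variable are ignored.
    """
    polarities = set()
    for clause in clauses:
        if is_satisfied(clause, assignment):
            continue
        if any(abs(l) in redundant for l in clause):
            continue
        if v in clause:
            polarities.add(True)
        if -v in clause:
            polarities.add(False)
    return len(polarities) <= 1
```

With clauses `{1, 2}`, `{-1, 3}`, `{1, 4}`, assignment `{2: 1}` and redundant variable 4, only `{-1, 3}` remains, so variable 1 is monotone.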

Example 3

Here we show how DS-QSAT with lazy backtracking operates when solving the CNF formula introduced in Example 1. Formula consists of 8 clauses: , , , , , , , . Figure 4 shows the search tree built by DS-QSAT. The ovals specify the branching nodes labeled by the corresponding branching variables. The label 0 or 1 on the edge connecting two nodes specifies the value assigned to the variable of the higher node. The rectangles specify the leaves of the search tree. The rectangle SAT specifies the leaf where DS-QSAT reported that is satisfiable.

Every edge of the search tree labeled with value 0 (respectively 1) also shows the set of D-sequents (respectively ) derived when the assignment corresponding to this edge was made. The D-sequents produced by DS-QSAT  are denoted in Figure 4 as . The values of are given in Figure 5. When representing and , we use the symbol ’’ to separate D-sequents derived before and after a call of DS-QSAT. Consider for example, the set on the path . The set of D-sequents listed before ’’ is empty in . This means that no D-sequents had been derived when DS-QSAT  was called with . On the exit of this invocation of DS-QSAT, D-sequents were derived. We use ellipsis after symbol ’’ for the calls of DS-QSAT  that were not finished by the time was proved satisfiable.

Below, we use Figures 4 and 5 to illustrate various aspects of the work of DS-QSAT.

Leaf nodes correspond to subspaces where every variable is either assigned or proved redundant. For example, the node on the path is a leaf because are assigned and is proved redundant.

Atomic D-sequents. D-sequents are atomic. For example, the D-sequent is derived in subspace due to becoming monotone. is equal to because only assignments are responsible for the fact that is monotone.

Branching in the presence of a conflict. On the path , clauses and turned into unit clauses and respectively. So no matter how the first assignment to was made, one of these two clauses would get falsified. DS-QSAT made the first assignment and falsified clause . Since this was the left branch of , DS-QSAT proceeded further to branch on variable .

Merging results of branches. When branching on variable , DS-QSAT derived sets and where is equal to and is equal to . DS-QSAT merged the results of branching by joining and at the branching variable . The resulting D-sequent equal to does not depend on .

D-sequents for branching variables. DS-QSAT generated D-sequents for branching variables and . Variable was monotone in subspace because the clauses , containing the positive literal of were not present in this subspace. was satisfied by assignment while contained variable whose redundancy was stated by D-sequent equal to . So the D-sequent equal to was derived.

Variable was not monotone in subspace because, in this subspace, clauses and turned into unit clauses and respectively. So first, DS-QSAT made variable redundant by adding to clause obtained by resolution of and on . Note that is falsified in subspace . So the D-sequent equal to was generated.

Reduction of the size of right branches. In the left branch of node , the set of D-sequents was derived. D-sequent equal to is not symmetric in (i.e. depends on ). On the other hand, and stating redundancy of and are symmetric in . Hence, only D-sequent was inactive in the right branch , and only variable reappeared in this branch while , remained redundant.

Termination. In subspace , every variable of was assigned or redundant and no clause of was falsified by . So DS-QSAT terminated reporting that was satisfiable.

5.7 Correctness of DS-QSAT

The proof of correctness of DS-QSAT can be performed by induction on the number of derived D-sequents. Since such a proof is very similar to the proof of correctness of the quantifier elimination algorithm we gave in [8], we omit it here. Below we just list the facts on which this proof of correctness is based.

  • DS-QSAT derives correct atomic D-sequents.

  • D-sequents obtained by the join operation are correct.

  • DS-QSAT correctly reports satisfiability when every clause is either satisfied or proved redundant in the current subspace because D-sequents stating redundancy of variables are correct.

  • New clauses added to the current formula are obtained by resolution and so are correct. So DS-QSAT correctly reports unsatisfiability when an empty clause is derived.

6 DS-QSAT on Compositional Formulas

In this section, we consider the performance of DS-QSAT on compositional formulas. We will say that a satisfiability checking algorithm is compositional if its complexity is linear in the number of subformulas forming a compositional formula. We prove that DS-QSAT with lazy backtracking is compositional and argue that DPLL-based SAT-solvers are not.

We say that a formula is compositional if it can be represented as where . The motivation for our interest in such formulas is as follows. As we mentioned in Section 3, a practical formula typically can be represented as where are internal variables of and are communication variables. One can view compositional formulas as a degenerate case where and so do not talk to each other. Intuitively, an algorithm that does not scale well even when will not scale well when .

From now on, we narrow down the definition of compositional formulas as follows. We will call formula compositional if and all subformulas , are equivalent modulo variable renaming/negation. That is can be obtained from by renaming some variables of and then negating some variables of the result of variable renaming.
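A compositional formula in this narrower sense can be built mechanically from one base subformula. The sketch below is only an illustration of the definition: clauses are frozen sets of signed integers, the functions `rename_and_negate` and `compose` are hypothetical, and `compose` uses plain renaming with no negation so that the subformulas trivially share no variables.

```python
from typing import Dict, FrozenSet, List, Set

Clause = FrozenSet[int]  # literal k means variable k, -k means its negation


def rename_and_negate(
    clauses: List[Clause], rename: Dict[int, int], negate: Set[int]
) -> List[Clause]:
    """Rename variables, then flip the polarity of the renamed variables in `negate`."""
    def map_lit(lit: int) -> int:
        new_var = rename[abs(lit)]
        sign = 1 if lit > 0 else -1
        if new_var in negate:
            sign = -sign
        return sign * new_var
    return [frozenset(map_lit(l) for l in c) for c in clauses]


def compose(base: List[Clause], n_vars: int, k: int) -> List[Clause]:
    """Conjoin k copies of `base` over pairwise disjoint variable ranges."""
    formula: List[Clause] = []
    for i in range(k):
        rename = {x: x + i * n_vars for x in range(1, n_vars + 1)}
        formula.extend(rename_and_negate(base, rename, set()))
    return formula
```

Here `base` is assumed to use variables numbered 1 to `n_vars`, so copy `i` uses variables `i * n_vars + 1` to `(i + 1) * n_vars`.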

Proposition 2

Let be a compositional formula. Let be the search tree built by DS-QSAT with lazy backtracking when checking the satisfiability of . The size of is linear in no matter how decision variables are chosen. (A variable is a decision one if no clause of that is unit in the current subspace contains .)

Proof

We will call a D-sequent    limited to subformula if . The idea of the proof is to show that every D-sequent derived by DS-QSAT is limited to a subformula . Then the size of is limited by