Checking Satisfiability by Dependency Sequents
Abstract
We introduce a new algorithm for checking satisfiability based on a calculus of Dependency sequents (Dsequents). Given a CNF formula , a Dsequent is a record stating that under a partial assignment a set of variables of is redundant in formula . The Dsequent calculus is based on the join operation, which forms a new Dsequent from two existing ones. The new algorithm solves the quantified version of SAT. That is, given a satisfiable formula , it does not, in general, produce an assignment satisfying . The new algorithm is called DSQSAT where DS stands for Dependency Sequent and Q for Quantified. Importantly, a DPLL-like procedure is only a special case of DSQSAT where a very restricted kind of Dsequents is used. We argue that this restriction a) adversely affects the scalability of SATsolvers and b) is caused by looking for an explicit satisfying assignment rather than just proving satisfiability. We give experimental results substantiating these claims.
1 Introduction
Algorithms for solving the Boolean satisfiability problem are an important part of modern design flows. Despite great progress in the performance of such algorithms achieved recently, the scalability of SATsolvers still remains a big issue. In this paper, we address this issue by introducing a new method of satisfiability checking that can be viewed as a descendant of the DP procedure [3].
We consider Boolean formulas represented in Conjunctive Normal Form (CNF). Given a CNF formula , one can formulate two different kinds of satisfiability checking problems. We will refer to the problems of the first kind as QSAT where Q stands for quantified. Solving QSAT means just checking if is true. In particular, if is satisfiable, a QSATsolver does not have to produce an assignment satisfying . The problems of the second kind that we will refer to as just SAT are a special case of those of the first kind. If is satisfiable, a SATsolver has to produce an assignment satisfying .
Intuitively, QSAT should be easier than SAT because a QSATsolver needs to return only one bit of information. This intuition is substantiated by the fact that checking if an integer number is prime (i.e. answering the question if nontrivial factors of exist) is polynomial while finding factors of explicitly is believed to be hard. However, the situation among practical algorithms defies this intuition. Currently, the field is dominated by procedures based on the DPLL algorithm [2], that is, by SATsolvers. On the other hand, a classical QSATsolver, the DP procedure [3], does not have any competitive descendants (although some elements of the DP procedure are used in formula preprocessing performed by SATsolvers [5]).
In this paper, we introduce a QSATsolver called DSQSAT where DS stands for Dependency Sequent. On the one hand, DSQSAT can be viewed as a descendant of the DP procedure. On the other hand, DPLL-like procedures with clause learning are a special case of DSQSAT. Like the DP procedure, DSQSAT is based on the idea of elimination of redundant variables. A variable is redundant in if the latter is equivalent to where is the set of all clauses of with . Note that removal of clauses of produces a formula that is equisatisfiable rather than functionally equivalent to .
If is satisfiable, all variables of are redundant in because an empty set of clauses is satisfiable. If is unsatisfiable, one can make the variables of redundant by deriving an empty clause and adding it to . An empty clause is unsatisfiable, hence all other clauses of can be dropped. So, from the viewpoint of DSQSAT, the only difference between satisfiable and unsatisfiable formulas is as follows. If is satisfiable, its variables are already redundant and one just needs to prove this redundancy. If is unsatisfiable, one has to make variables redundant by deriving an empty clause and adding it to .
The DP procedure makes a variable of redundant in one step, by adding to all clauses that can be produced by resolution on . This is extremely inefficient due to generation of prohibitively large sets of clauses even for very small formulas. DSQSAT addresses this problem by using branching. The idea is to prove redundancy of variables in subspaces and then “merge” the obtained results. DSQSAT records the fact that a set of variables is redundant in in subspace specified by partial assignment as . Here is a subset of the assignments of relevant to redundancy of . The record is called a dependency sequent (or Dsequent for short). To simplify notation, if and are obvious from the context, we record the Dsequent above as just .
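The one-step variable elimination performed by the DP procedure can be sketched as follows. This is a minimal illustration rather than code from the paper: clauses are assumed to be frozensets of nonzero integers (a negative integer denotes a negated variable), and the name dp_eliminate is hypothetical.

```python
def dp_eliminate(formula, v):
    """Eliminate variable v from a CNF formula by adding all resolvents on v
    and dropping every clause that contains v (one step of a DP-style procedure).

    formula: iterable of frozensets of nonzero ints (assumed encoding).
    """
    pos = [c for c in formula if v in c]       # clauses with literal v
    neg = [c for c in formula if -v in c]      # clauses with literal not-v
    rest = {frozenset(c) for c in formula if v not in c and -v not in c}

    resolvents = set()
    for p in pos:
        for n in neg:
            r = (p - {v}) | (n - {-v})
            # A resolvent containing both x and not-x is a tautology; skip it.
            if not any(-lit in r for lit in r):
                resolvents.add(frozenset(r))
    # Up to len(pos) * len(neg) new clauses appear per eliminated variable.
    return rest | resolvents
```

The number of resolvents is bounded by the product of the numbers of positive and negative occurrences of the variable, so repeated elimination can grow the clause set quickly. This is the blow-up that branching is meant to avoid.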
A remarkable fact is that a resolution-like operation called join can be used to produce a new Dsequent from two Dsequents derived earlier [8, 7]. Suppose, for example, that Dsequents and specify redundancy of variable in different branches of variable . Then Dsequent holds where the left part assignment of this Dsequent is obtained by taking the union of the left part assignments of the two Dsequents above except those to variable . The new Dsequent is said to be obtained by joining the two Dsequents above at variable . The calculus based on the join operation is complete. That is, eventually DSQSAT derives Dsequent stating unconditional redundancy of the variables of in . If by the time the Dsequent above is derived, contains an empty clause, is unsatisfiable. Otherwise, is satisfiable. Importantly, if is satisfiable, derivation of Dsequent does not require finding an assignment satisfying .
DPLL-based SATsolvers with clause learning can be viewed as a special case of DSQSAT where only a particular kind of Dsequents is used. This limitation on Dsequents is caused by the necessity to generate a satisfying assignment as a proof of satisfiability. Importantly, this necessity deprives DPLL-based SATsolvers of using transformations preserving equisatisfiability rather than functional equivalence. In turn, this adversely affects the performance of SATsolvers. We illustrate this point by comparing the performance of DPLL-like SATsolvers and a version of DSQSAT on compositional formulas. This version of DSQSAT uses the strategy of lazy backtracking as opposed to that of eager backtracking employed by DPLL-based procedures. A compositional CNF formula has the form where ,. Subformulas are identical modulo variable renaming/negation. We prove theoretically that performance of DSQSAT is linear in . On the other hand, one can argue that the average performance of DPLL-based SATsolvers with conflict learning should be quadratic in . In Section 8, we describe experiments confirming our theoretical results.
The contribution of this paper is fourfold. First, we use the machinery of Dsequents to explain some problems of DPLL-based SATsolvers. Second, we describe a new QSATsolver based on Dsequents called DSQSAT. Third, we give a theoretical analysis of the behavior of DSQSAT on compositional formulas. Fourth, we show the promise of DSQSAT by comparing its performance with that of well-known SATsolvers on compositional and noncompositional formulas.
This paper is structured as follows. In Section 2 we discuss the complexity of QSAT and SAT. Section 3 gives a brief introduction to DSQSAT. We recall Dsequent calculus in Section 4. A detailed description of DSQSAT is given in Section 5. Section 6 gives some theoretical results on performance of DSQSAT. Section 7 describes a modification of DSQSAT that allows additional pruning of the search tree. Experimental results are given in Section 8. We describe some background of this research in Section 9 and give conclusions in Section 10.
2 Is QSAT Simpler Than SAT?
Figure 1: Pseudocode of procedure gen_sat_assgn
In this section, we make the following point. Both QSATsolvers and SATsolvers have exponential complexity on the set of all CNF formulas, unless P = NP. However, this is not true for subsets of CNF formulas. It is possible that a set of formulas describing, say, properties of a parameterized set of designs can be solved in polynomial time by some QSATsolver while any SATsolver has exponential complexity on .
To illustrate the point above, let us consider procedure gen_sat_assgn shown in Figure 1. It finds an assignment satisfying a CNF formula (if any) by solving a sequence of QSAT problems. First, gen_sat_assgn calls a QSATsolver solve_qsat to check if is satisfiable (line 2). If it is, gen_sat_assgn picks a variable of (line 5) and calls solve_qsat to find assignment under which formula is satisfiable (lines 6-8). Since is satisfiable, and/or has to be satisfiable. Then gen_sat_assgn fixes variable at the chosen value val and adds (=val) to assignment (lines 9-10) that was originally empty. The gen_sat_assgn procedure keeps assigning variables of in the same manner in a loop (lines 5-11) until every variable of is assigned. At this point, is a satisfying assignment of .
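The procedure described above can be sketched in Python as follows. This is a hedged illustration: only the names gen_sat_assgn, solve_qsat and val come from the text, while the clause encoding (sets of nonzero integers), the helper names variables and assign, and the exhaustive-search stand-in for the QSATsolver are assumptions made for the example.

```python
from itertools import product

def variables(formula):
    """Variables occurring in a CNF formula (clauses are sets of nonzero ints)."""
    return {abs(lit) for clause in formula for lit in clause}

def assign(formula, var, val):
    """Formula after fixing var to val (0 or 1): drop satisfied clauses,
    remove the falsified literal from the remaining ones."""
    true_lit = var if val == 1 else -var
    return [c - {-true_lit} for c in formula if true_lit not in c]

def solve_qsat(formula):
    """Stand-in QSAT oracle (exhaustive search); returns one bit of information."""
    vs = sorted(variables(formula))
    for bits in product((0, 1), repeat=len(vs)):
        value = dict(zip(vs, bits))
        if all(any(value[abs(l)] == (1 if l > 0 else 0) for l in c) for c in formula):
            return "sat"
    return "unsat"

def gen_sat_assgn(formula):
    """Build a satisfying assignment using only sat/unsat answers."""
    current = [frozenset(c) for c in formula]
    if solve_qsat(current) == "unsat":
        return "unsat"
    assignment = {}                      # originally empty
    for v in sorted(variables(current)):
        # Since current is satisfiable, at least one value of v keeps it so.
        val = 0 if solve_qsat(assign(current, v, 0)) == "sat" else 1
        current = assign(current, v, val)
        assignment[v] = val
    return assignment
```

With this structure, the oracle is called once per variable plus once at the start, which is why the cost of gen_sat_assgn is dominated by how well solve_qsat behaves on the restricted formulas.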
The number of QSAT checks performed by solve_qsat in gen_sat_assgn is at most . So if there is a QSATsolver solving all satisfiable CNF formulas in polynomial time, gen_sat_assgn can use this QSATsolver in its inner loop to find a satisfying assignment for any satisfiable formula in polynomial time. However, this is not true when considering a subset of all possible CNF formulas. Suppose there is a QSATsolver solving the formulas of in polynomial time. Let be a formula of . Let denote under partial assignment . The fact that does not imply . So the behavior of gen_sat_assgn using this QSATsolver in the inner loop may actually be even exponential if this QSATsolver does not perform well on formulas .
For example, one can form a subset of all possible CNF formulas such that a) a formula describes a check that a number is composite and b) an assignment satisfying (if any) specifies two numbers , such that , and . The satisfiability of formulas in can be checked by a QSATsolver in polynomial time [14]. At the same time, finding satisfying assignments of formulas from , i.e. factorization of composite numbers, is believed to be hard. For instance, gen_sat_assgn cannot use the QSATsolver above to find satisfying assignments for formulas of in polynomial time. The reason is that formula does not specify a check of whether a number is composite. That is, does not imply that .
Note that a SATsolver is also limited in the ways of proving unsatisfiability. For a SATsolver, such a proof is just a failed attempt to build a satisfying assignment explicitly. For example, instead of using the polynomial algorithm of [14], a SATsolver would prove that a number is prime by failing to find two nontrivial factors of .
3 Brief Comparison of DPLL-based SATsolvers and DSQSAT in Terms of Dsequents
In this section, we use the notion of Dsequents to discuss some limitations of DPLLbased SATsolvers. We also explain how DSQSAT (described in Section 5 in detail) overcomes those limitations.
Example 1
Let SAT_ALG be a DPLLbased SATsolver with clause learning. We assume that the reader is familiar with the basics of such SATsolvers [15, 16]. Let be a CNF formula of 8 clauses where , , , , , , , . The set of variables of is equal to .
Let SAT_ALG first make assignment . This satisfies clauses ,, and removes literal from . Let SAT_ALG then make assignment . Removing literal from and turns them into unit clauses and respectively. This means that SAT_ALG ran into a conflict. At this point, SAT_ALG generates conflict clause that is obtained by resolving clauses and on and adds to . After that, SAT_ALG erases assignment and the assignment made by SAT_ALG to and runs BCP that assigns to satisfy that is currently unit. In terms of Dsequents, one can view generation of conflict clause and adding it to as derivation of Dsequent equal to . Dsequent says that making assignments falsifying clause renders all unassigned variables redundant. Note that is inactive in the subspace that SAT_ALG enters after assigning 1 to . (We will say that Dsequent is active in the subspace specified by partial assignment if the assignments of are a subset of those of .) So the variables proved redundant in subspace become nonredundant again.
One may think that reappearance of variables in subspace is “inevitable” but this is not so. Variables , have at least two reasons to be redundant in subspace . First, is falsified in this subspace. Second, the only clauses of containing variables , are ,,,. But and are satisfied by and can be satisfied by an assignment to ,. So ,,, can be removed from in subspace without affecting the satisfiability of . Hence Dsequents and equal to and are true. (In Example 3, we will show how and are derived by DSQSAT.) Suppose that one replaces the Dsequent above with Dsequents where is equal to . Note that only Dsequent is inactive in subspace . So only variable reappears after changes its value from 0 to 1.
The example above illustrates the main difference between SAT_ALG and DSQSAT in terms of Dsequents. At every moment, SAT_ALG has at most one active Dsequent. This Dsequent is of the form where is an assignment falsifying a clause of and is the set of all variables that are currently unassigned. DSQSAT may have a set of active Dsequents where , ,. When SAT_ALG changes the value of variable of , all the variables of reappear as nonredundant. When DSQSAT changes the value of , variables of reappear only if . So only a subset of variables of reappear.
To derive Dsequents above, DSQSAT goes on branching in the presence of a conflict. Informally, the goal of such branching is to find alternative ways of proving redundancy of variables from . So DSQSAT uses extra branching to minimize the number of variables reappearing in the right branch (after the left branch has been explored). This should eventually lead to the opposite result i.e. to reducing the amount of branching. Looking for alternative ways to prove redundancy can be justified as follows. A practical formula typically can be represented as . Here are internal variables of and are “communication” variables that may share with some other subformulas , . One can view as describing a “design block” with external variables . Usually, is much smaller than . Let a clause of be falsified by the current assignment due to a conflict. Suppose that at the time of the conflict all variables of of subformula were assigned and their values were specified by assignment . Suppose is consistent for i.e. can be extended by assignments to to satisfy . This means that the variables of are redundant in subspace in where . Then by branching on variables of one can derive Dsequent . If is inconsistent for , then by branching on variables of one can derive a clause falsified by . Adding to makes the variables of redundant in in subspace . So the existence of many ways to prove variable redundancy is essentially implied by the fact that formula has structure.
The possibility to control the size of right branches gives an algorithm a lot of power. Suppose, for example, that an algorithm guarantees that the number of variables reappearing in the right branch is bounded by a constant . We assume that this applies to the right branch going out of any node of the search tree, including the root node. Then the size of the search tree built by such an algorithm is . Here is the maximum depth of a search tree built by branching on variables of and is the number of nodes in a full binary subtree over variables. So the factor limits the size of the right branch. The complexity of an algorithm building such a search tree is linear in . In Section 6, we show that bounding the size of right branches by a constant is exactly the reason why the complexity of DSQSAT on compositional formulas is linear in the number of subformulas.
The limitation of Dsequents available to SAT_ALG is consistent with the necessity to produce a satisfying assignment. Although such limitation cripples the ability of an algorithm to compute the parts of the formula that are redundant in the current subspace, it does not matter much for SAT_ALG. The latter simply cannot use this redundancy because it is formulated with respect to formula rather than . Hence, discarding the clauses containing redundant variables preserves equisatisfiability rather than functional equivalence. So, an algorithm using such transformations cannot guarantee that a satisfying assignment it found is correct.
4 Dsequent Calculus
In this section, we recall the Dsequent calculus introduced in [8, 7]. In Subsections 4.1 and 4.2 we give basic definitions and describe simple cases of variable redundancy. The notion of Dsequents is introduced in Subsection 4.3. Finally, the operation of joining Dsequents is presented in Subsection 4.4.
4.1 Basic definitions
Definition 1
A literal of a Boolean variable is itself or its negation. A clause is a disjunction of literals. A formula represented as a conjunction of clauses is said to be the Conjunctive Normal Form (CNF) of . A CNF formula is also viewed as a set of clauses. Let be an assignment, be a CNF formula, and be a clause. denotes the variables assigned in ; denotes the set of variables of ; denotes the set of variables of .
Definition 2
Let be an assignment. Clause is satisfied by if a literal of is set to 1 by . Otherwise, is falsified by . Assignment satisfies if satisfies every clause of .
Definition 3
Let be a CNF formula and be a partial assignment to variables of . Denote by that is obtained from by a) removing all clauses of satisfied by ; b) removing the literals set to 0 by from the clauses that are not satisfied by . Notice that if =, then = .
Definition 4
Let be a CNF formula and be a subset of . Denote by the set of all clauses of containing at least one variable of .
Definition 5
The variables of are redundant in formula if . We note that since does not contain any variables, we could have written . To simplify notation, we avoid explicitly using this optimization in the rest of the paper.
Definition 6
Let and be assignments. The expression denotes the fact that and each variable of has the same value in and .
4.2 Simple cases of variable redundancy
There are at least two cases where proving that a variable of is redundant in is easy. The first case concerns monotone variables of . A variable of is called monotone if all clauses of containing have only positive (or only negative) literals of . A monotone variable is redundant in because removing the clauses with from does not change the satisfiability of . The second case concerns the presence of an empty clause. If contains such a clause, every variable of is redundant.
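Both simple cases are cheap syntactic checks, as the following Python sketch suggests. It assumes clauses encoded as sets of nonzero integers (a negative integer is a negated variable); the function names are hypothetical.

```python
def monotone_variables(formula):
    """Variables that occur with only one polarity in the formula."""
    pos, neg = set(), set()
    for clause in formula:
        for lit in clause:
            (pos if lit > 0 else neg).add(abs(lit))
    # A variable is monotone if it appears in exactly one of the two sets.
    return pos ^ neg

def has_empty_clause(formula):
    """True if the formula contains an empty (unsatisfiable) clause."""
    return any(len(clause) == 0 for clause in formula)

def remove_clauses_with(formula, var):
    """Drop every clause containing var; for a monotone var this
    preserves satisfiability."""
    return [c for c in formula if var not in c and -var not in c]
```

For a monotone variable, setting it to the value that satisfies all of its clauses never conflicts with any other clause, which is why those clauses can be removed.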
4.3 Dsequents
Definition 7
Let be a CNF formula. Let be an assignment to and be a subset of . A dependency sequent (Dsequent) has the form . It states that the variables of are redundant in . If formula for which a Dsequent holds is obvious from the context we will write this Dsequent in a short notation: .
Example 2
Let be a CNF formula of four clauses: , , , . Notice that since clause is satisfied in subspace , variable is monotone in formula . So Dsequent holds. On the other hand, the assignment falsifies clause . So variable is redundant in and Dsequent holds.
4.4 Join Operation for Dsequents
Proposition 1 ([8])
Let be a CNF formula. Let Dsequents and hold, where . Let , have different values for exactly one variable . Let consist of all assignments of , but those to . Then, Dsequent holds too.
We will say that the Dsequent of Proposition 1 is obtained by joining Dsequents and at variable . The join operation is complete [8, 7]. That is, eventually, Dsequent is derived proving that the variables of the current formula are redundant. If contains an empty clause, then is unsatisfiable. Otherwise, it is satisfiable.
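The join operation of Proposition 1 can be sketched in Python as follows. The representation is an assumption made for illustration: a Dsequent is a pair consisting of a partial assignment (a dictionary from variables to 0 or 1) and the set of variables it declares redundant, with the formula left implicit. The names Dsequent, cond, redundant and join are hypothetical.

```python
from typing import NamedTuple

class Dsequent(NamedTuple):
    cond: dict            # partial assignment: variable -> 0 or 1
    redundant: frozenset  # variables declared redundant under cond

def join(d1, d2, v):
    """Join two Dsequents at variable v.

    Both must state redundancy of the same variables, and their
    assignments must differ on exactly one variable, namely v.
    """
    if d1.redundant != d2.redundant:
        raise ValueError("Dsequents must have the same redundant set")
    clashing = sorted(x for x in d1.cond
                      if x in d2.cond and d1.cond[x] != d2.cond[x])
    if clashing != [v]:
        raise ValueError("assignments must differ on exactly the join variable")
    # Union of both assignments, except those to v.
    merged = {x: val for x, val in {**d1.cond, **d2.cond}.items() if x != v}
    return Dsequent(merged, d1.redundant)
```

The similarity to resolution is visible in the code: the two assignments play the role of the parent clauses, and the join variable is the one that drops out of the result.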
An obvious difference between the Dsequent calculus and resolution is that the former can handle both satisfiable and unsatisfiable formulas. This limitation of resolution is due to the fact that it operates on subspaces where formula is unsatisfiable. One can interpret resolving clauses to produce clause as using the Boolean cubes , where and are unsatisfiable to produce a new Boolean cube where the resolvent is unsatisfiable. On the contrary, the join operation can be performed over parts of the search space where may be satisfiable. When Dsequents and are joined, it does not matter whether formulas and are satisfiable. The only thing that matters is that variables are redundant in and .
4.5 Virtual redundancy
Let be a CNF formula and be an assignment to . Let and . The fact that variables of are redundant in , in general, does not mean that they are redundant in . Suppose, for example, that is satisfiable, is unsatisfiable, does not have a clause falsified by and . Then formula has no clauses and so is satisfiable. Hence and so the variables of are not redundant in . On the other hand, since is satisfiable, the variables of are redundant in .
We will say that the variables of are virtually redundant in where if either a) or b) and is satisfiable. In other words, if variables are virtually redundant in , removing the clauses with a variable of from may be wrong but only locally. From the global point of view this mistake does not matter because it occurs only when is satisfiable.
We need a new notion of redundancy because the join operation introduced above preserves virtual redundancy [8] rather than redundancy in terms of Definition 5. Suppose, for example, that the variables of are redundant in and in terms of Definition 5 and so Dsequents and hold. Let be the Dsequent obtained by joining the Dsequents above. Then one can guarantee only that the variables of are virtually redundant in . For that reason we need to replace the notion of redundancy by Definition 5 with that of virtual redundancy. In what follows, we will omit the word “virtually”. That is, when we say that variables of are redundant in we actually mean that they are virtually redundant in .
5 Description of DSQSAT
In this section, we describe DSQSAT, a QSATsolver based on the machinery of Dsequents.
5.1 High-level view
Pseudocode of DSQSAT is given in Figure 2. DSQSAT accepts a CNF formula , a partial assignment to where , and a set of active Dsequents stating redundancy of some variables from in subspace . DSQSAT returns CNF formula that consists of the clauses of the initial formula plus some resolvent clauses and a set of Dsequents stating redundancy of every variable of in subspace . To check satisfiability of a CNF formula, one needs to call DSQSAT with , .
DSQSAT is a branching procedure. If DSQSAT cannot prove redundancy of some variables in the current subspace, it picks one of such variables and branches on it. So DSQSAT builds a binary search tree where a node corresponds to a branching variable. We will refer to the first (respectively second) assignment to as the left (respectively right) branch of . Although Boolean Constraint Propagation (BCP) is not explicitly mentioned in Figure 2, it is included into the pick_variable procedure as follows. Let be the current partial assignment. Then a) preference is given to branching on variables of unit clauses of (if any); b) if is a variable of a unit clause of of and is picked for branching, then the value satisfying is assigned first.
As soon as a variable is proved redundant in the current subspace , a Dsequent is recorded where is a subset of assignments of . All the clauses of containing variable are marked as redundant and ignored until becomes nonredundant again. This happens when a variable of changes its value making the Dsequent inactive in the current subspace.
As we mentioned in Section 3, if a clause containing a variable is falsified after an assignment is made to , DSQSAT keeps making assignments to unassigned nonredundant variables. However, this happens only in the left branch of . If is falsified in the right branch of , DSQSAT backtracks. A unit clause gets falsified in the left branch only when DSQSAT tries to satisfy another unit clause such that and have the opposite literals of a variable . We will refer to the node of the search tree corresponding to as a conflict one. The number of conflict nodes DSQSAT may have is not limited.
DSQSAT consists of three parts. In Figure 2, they are separated by dashed lines. In the first part, described in Subsections 5.3 and 5.4 in more detail, DSQSAT checks for termination conditions and builds Dsequents for variables whose redundancy is obvious. In the second part (Subsection 5.5), DSQSAT picks an unassigned nonredundant variable and splits the current subspace into subspaces and . Finally, DSQSAT merges the results of branches and (Subsection 5.6).
5.2 Eager and lazy backtracking (DPLL as a special case of DSQSAT)
Let be the current partial assignment to variables of and variable be the variable assigned in most recently. Let be assigned a first value (left branch). Let be a clause of falsified after is assigned in . In this case, procedure update_Dseqs of DSQSAT (line 4 of Figure 2), derives a Dsequent . Here is the smallest subset of assignments of falsifying and is a subset of the current set of the unassigned, nonredundant variables.
The version where i.e. where no Dsequent is derived by update_Dseqs will be called DSQSAT with lazy backtracking. In our theoretical and experimental evaluation of DSQSAT given in Sections 6 and 8 we used the version with lazy backtracking. The version of DSQSAT where is always equal to will be referred to as DSQSAT with eager backtracking. DPLL is a special case of DSQSAT where the latter employs eager backtracking. In this case, all unassigned variables are declared redundant and DSQSAT immediately backtracks without trying to prove redundancy of variables of in some other ways.
Figure 3: Pseudocode of the merge procedure
5.3 Termination conditions
DSQSAT reports unsatisfiability if the current formula contains an empty clause (line 1 of Figure 2). DSQSAT reports satisfiability if no clause of is falsified by the current assignment and every variable of is either assigned in or proved redundant in subspace (line 10). Note that DSQSAT uses a slight optimization here by terminating before the Dsequent is derived stating unconditional redundancy of variables of in .
If no termination condition is met but every variable of is assigned or proved redundant, DSQSAT ends the current call and returns and (lines 7,11). In contrast to operator return, the operator exit used in lines 1,10 eliminates the entire stack of nested calls of DSQSAT.
5.4 Derivation of atomic Dsequents
Henceforth, for simplicity, we will assume that DSQSAT derives Dsequents of the form i.e. for single variables. A Dsequent is then represented as different Dsequents , .
In the two cases below, variable redundancy is obvious, and DSQSAT derives Dsequents that we will call atomic. The first case is when a clause of is falsified by the current assignment . This kind of Dsequents is derived by procedures update_Dseqs (line 4) and finish_Dseqs (line 6). Let be the variable assigned in most recently. Let be a clause of falsified after the current assignment to is made. If is assigned a first value (left branch), then, as we mentioned in Subsection 5.2, for some unassigned variables that are not proved redundant yet, one can build Dsequents ,…, . Here is the shortest assignment falsifying . So update_Dseqs may leave some unassigned variables nonredundant. On the contrary, finish_Dseqs is called in the right branch of . In this case, for every unassigned variable that is not proved redundant yet, Dsequent is generated. So on exit from finish_Dseqs, every variable of is either assigned or proved redundant.
Dsequents of monotonic variables are the second case of atomic Dsequents. They are generated by procedure monot_vars_Dseqs (line 8) and by procedure monot_var_Dseq called when DSQSAT merges results of branches (line 14 of Figure 3). Let be the current partial assignment and be a monotone unassigned variable of . Assume for the sake of clarity, that only clauses with positive polarity of are present in . This means that every clause of with literal is either satisfied by or contains a variable proved redundant in . Then DSQSAT generates Dsequent where is formed from assignments of as follows. For every clause of with literal assignment a) contains an assignment satisfying or b) contains all the assignments of such that Dsequent is active and is a variable of . Informally, contains a set of assignments under which variable becomes monotone.
5.5 Branching in DSQSAT
When DSQSAT cannot prove redundancy of some unassigned variables in the current subspace , it picks a nonredundant variable for branching (line 12 of Figure 2). First, DSQSAT calls itself with assignment . (Figure 2 shows the case when assignment is explored in the left branch but obviously the assignment can be explored before .) Then DSQSAT partitions the returned set of Dsequents into and .
The set consists of the Dsequents of such that . The Dsequents of remain active in the branch . The set consists of the Dsequents such that contains assignment . The Dsequents of are inactive in the subspace and the variables whose redundancy is stated by those Dsequents reappear in the right branch. If , there is no reason to explore the right branch. So, DSQSAT just returns the set of Dsequents (line 16). Otherwise, DSQSAT recovers the variables and clauses that were marked redundant after Dsequents from were derived (line 17) and calls itself with partial assignment .
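The partitioning step and the notion of an active Dsequent can be sketched in Python as follows. The representation is an assumption: since Dsequents are kept per variable (Subsection 5.4), the set of Dsequents is modeled as a dictionary from a redundant variable to the partial assignment on which its redundancy depends. The names is_active and partition are hypothetical.

```python
def is_active(cond, current):
    """A Dsequent is active in the current subspace if every assignment
    in its left part also appears in the current partial assignment."""
    return all(current.get(x) == val for x, val in cond.items())

def partition(dseqs, v):
    """Split Dsequents by dependence on branching variable v.

    dseqs: dict mapping a redundant variable to the partial assignment
    (variable -> 0 or 1) its redundancy depends on.
    Returns (symmetric, asymmetric): the first group stays active when v
    is flipped, the second group becomes inactive.
    """
    symmetric = {x: c for x, c in dseqs.items() if v not in c}
    asymmetric = {x: c for x, c in dseqs.items() if v in c}
    return symmetric, asymmetric
```

If the asymmetric group comes back empty, every redundancy proof found in the left branch already holds for both values of the branching variable, so the right branch need not be explored.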
5.6 Merging results of branches
After both branches of variable have been explored, DSQSAT merges the results by calling the merge procedure (line 20). The pseudocode of merge is shown in Figure 3. DSQSAT backtracks only when every unassigned variable is proved redundant in the current subspace. The objective of merge is to maintain this invariant by a) replacing the current Dsequents that depend on the branching variable with those that are symmetric in ; b) building a Dsequent for the branching variable itself.
The merge procedure consists of two parts separated in Figure 3 by the dotted line. In the first part, merge builds Dsequents for the variables of . In the second part, it builds a Dsequent for the branching variable. In the first part, merge iterates over variables . Let be a variable of . If the current Dsequent for (i.e. the Dsequent for from the set returned in the right branch) is symmetric in , then there is no need to build a new Dsequent (line 2). Otherwise, a new Dsequent for that does not depend on is generated as follows. Let and be the Dsequents for variable contained in and respectively (lines 3,4). That is and were generated for variable in branches and . Then Dsequent is produced by joining and at variable (line 5).
Generation of a Dsequent for the variable itself depends on whether node (i.e. the node of the search tree corresponding to ) is a conflict one. If so, contains clauses and that have variable and are falsified by and respectively. In this case, to make variable redundant merge generates the resolvent of and on variable and adds to (lines 10,11). Then Dsequent is generated where is the shortest assignment falsifying clause (line 12).
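The conflict case can be illustrated with a small Python sketch that computes the resolvent of two clauses and the shortest assignment falsifying it. Clauses are assumed to be sets of nonzero integers (a negative integer is a negated variable); the function names are hypothetical.

```python
def resolve(c_pos, c_neg, v):
    """Resolvent of two clauses on variable v.
    c_pos must contain literal v and c_neg must contain literal -v."""
    if v not in c_pos or -v not in c_neg:
        raise ValueError("clauses must contain opposite literals of v")
    return frozenset((set(c_pos) - {v}) | (set(c_neg) - {-v}))

def falsifying_assignment(clause):
    """Shortest assignment falsifying a clause: every literal is set to 0,
    so a positive literal forces value 0 and a negative literal value 1."""
    return {abs(lit): (0 if lit > 0 else 1) for lit in clause}
```

For instance, resolving (x1 or x2) with (not x1 or x3) on x1 gives (x2 or x3), and the assignment x2=0, x3=0 falsifies the resolvent. Under that assignment the branching variable is redundant, because the formula extended with the resolvent already contains a falsified clause.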
If node is not a conflict one, this means that clause and/or clause does not exist. Suppose, for example, that no clause containing variable is falsified by . This means that every clause with the positive literal of is either satisfied by or contains a variable redundant in subspace . In other words, is monotone in after removing the clauses with redundant variables. Then an atomic Dsequent is generated by merge (line 14) as described in Subsection 5.4.
Example 3
Here we show how DSQSAT with lazy backtracking operates when solving the CNF formula introduced in Example 1. Formula consists of 8 clauses: , , , , , , , . Figure 4 shows the search tree built by DSQSAT. The ovals specify the branching nodes labeled by the corresponding branching variables. The label 0 or 1 on the edge connecting two nodes specifies the value assigned to the variable of the higher node. The rectangles specify the leaves of the search tree. The rectangle SAT specifies the leaf where DSQSAT reported that is satisfiable.
Every edge of the search tree labeled with value 0 (respectively 1) also shows the set of Dsequents (respectively ) derived when the assignment corresponding to this edge was made. The Dsequents produced by DSQSAT are denoted in Figure 4 as . The values of are given in Figure 5. When representing and , we use the symbol ’’ to separate Dsequents derived before and after a call of DSQSAT. Consider, for example, the set on the path . The set of Dsequents listed before ’’ is empty in . This means that no Dsequents had been derived when DSQSAT was called with . On exit from this invocation of DSQSAT, Dsequents were derived. We use an ellipsis after the symbol ’’ for the calls of DSQSAT that had not finished by the time was proved satisfiable.
Leaf nodes correspond to subspaces where every variable is either assigned or proved redundant. For example, the node on the path is a leaf because are assigned and is proved redundant.
Atomic Dsequents. Dsequents are atomic. For example, the Dsequent is derived in subspace due to becoming monotone. is equal to because only assignments are responsible for the fact that is monotone.
Branching in the presence of a conflict. On the path , clauses and turned into unit clauses and respectively. So no matter how the first assignment to was made, one of these two clauses would be falsified. DSQSAT made the first assignment and falsified clause . Since this was the left branch of , DSQSAT proceeded further to branch on variable .
Merging results of branches. When branching on variable , DSQSAT derived sets and where is equal to and is equal to . DSQSAT merged the results of branching by joining and at the branching variable . The resulting Dsequent equal to does not depend on .
Dsequents for branching variables. DSQSAT generated Dsequents for branching variables and . Variable was monotone in subspace because the clauses , containing the positive literal of were not present in this subspace. was satisfied by assignment while contained variable whose redundancy was stated by Dsequent equal to . So the Dsequent equal to was derived.
Variable was not monotone in subspace because, in this subspace, clauses and turned into unit clauses and respectively. So first, DSQSAT made variable redundant by adding to clause obtained by resolution of and on . Note that is falsified in subspace . So the Dsequent equal to was generated.
Reduction of the size of right branches. In the left branch of node , the set of Dsequents was derived. Dsequent equal to is not symmetric in (i.e. depends on ). On the other hand, and stating redundancy of and are symmetric in . Hence only Dsequent was inactive in the right branch , and only variable reappeared in this branch while , remained redundant.
Termination. In subspace , every variable of was assigned or redundant and no clause of was falsified by . So DSQSAT terminated reporting that was satisfiable.
5.7 Correctness of DSQSAT
The proof of correctness of DSQSAT can be performed by induction on the number of derived Dsequents. Since such a proof is very similar to the proof of correctness of the quantifier elimination algorithm we gave in [8], we omit it here. Below we just list the facts on which this proof of correctness is based.

DSQSAT derives correct atomic Dsequents.

Dsequents obtained by the join operation are correct.

DSQSAT correctly reports satisfiability when every clause is either satisfied or proved redundant in the current subspace because Dsequents stating redundancy of variables are correct.

New clauses added to the current formula are obtained by resolution and so are correct. So DSQSAT correctly reports unsatisfiability when an empty clause is derived.
6 DSQSAT on Compositional Formulas
In this section, we consider the performance of DSQSAT on compositional formulas. We will say that a satisfiability checking algorithm is compositional if its complexity is linear in the number of subformulas forming a compositional formula. We prove that DSQSAT with lazy backtracking is compositional and argue that DPLLbased SATsolvers are not.
We say that a formula is compositional if it can be represented as where . The motivation for our interest in such formulas is as follows. As we mentioned in Section 3, a practical formula typically can be represented as where are internal variables of and are communication variables. One can view compositional formulas as a degenerate case where and so do not talk to each other. Intuitively, an algorithm that does not scale well even when will not scale well when .
From now on, we narrow down the definition of compositional formulas as follows. We will call formula compositional if and all subformulas , are equivalent modulo variable renaming/negation. That is can be obtained from by renaming some variables of and then negating some variables of the result of variable renaming.
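The narrowed definition can be illustrated by building one subformula from another through renaming followed by negation. This is a minimal sketch assuming clauses are sets of DIMACS-style integer literals; the function name `rename_and_negate` and its parameters are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

Clause = FrozenSet[int]  # +v is variable v, -v is its negation


def rename_and_negate(clauses: List[Clause], rename: Dict[int, int],
                      negated: Set[int]) -> List[Clause]:
    """Rename variables, then flip the polarity of the chosen new variables.

    `rename` maps old variable indices to new ones; `negated` lists the
    renamed variables whose literals are flipped.
    """
    result = []
    for clause in clauses:
        new_clause = set()
        for lit in clause:
            var = rename[abs(lit)]
            sign = 1 if lit > 0 else -1
            if var in negated:
                sign = -sign
            new_clause.add(sign * var)
        result.append(frozenset(new_clause))
    return result
```

When `rename` maps onto variables that do not occur in the original clauses, the original and the renamed copy share no variables, so their conjunction is a compositional formula with two subformulas in the sense above.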
Proposition 2
Let be a compositional formula. Let be the search tree built by DSQSAT with lazy backtracking when checking the satisfiability of . The size of is linear in no matter how decision variables are chosen. (A variable is a decision one if no clause of that is unit in the current subspace contains .)
Proof
We will call a Dsequent limited to subformula if . The idea of the proof is to show that every Dsequent derived by DSQSAT is limited to a subformula . Then the size of is limited by