Breaking the PPSZ Barrier for Unique 3SAT
Abstract
The PPSZ algorithm by Paturi, Pudlák, Saks, and Zane (FOCS 1998) is the fastest known algorithm for (Promise) Unique SAT. We give an improved algorithm with exponentially faster bounds for Unique SAT.
For uniquely satisfiable 3CNF formulas, we make the following case distinction: We call a clause critical if exactly one of its literals is satisfied by the unique satisfying assignment. If a formula has many critical clauses, we observe that PPSZ by itself is already faster. If there are only few clauses overall, we use an algorithm by Wahlström (ESA 2005) that is faster than PPSZ in this case. Otherwise we have a formula with few critical and many noncritical clauses. Noncritical clauses have at least two satisfied literals; we show how to exploit this to improve PPSZ.
1 Introduction
The well-known problem SAT is NP-complete for . If P ≠ NP, SAT does not have a polynomial-time algorithm. For a CNF formula over variables, the naive approach of trying all assignments takes time . Especially for , much work has been put into finding so-called “moderately exponential time” algorithms running in time for some . In 1998, Paturi, Pudlák, Saks, and Zane presented a randomized algorithm for 3SAT that runs in time . Given the promise that the formula has at most one satisfying assignment (a problem called Unique 3SAT), a running time of was shown. Both bounds were the best known when published. The running time for general 3SAT has been improved repeatedly (e.g. [8, 5]), until PPSZ itself was shown to run in time for general 3SAT [3].
Any improvement for general 3SAT also improves Unique 3SAT; the Unique 3SAT bound, however, has not been improved upon since the publication of the PPSZ algorithm. In this paper, we present a randomized algorithm for Unique 3SAT with exponentially better bounds than what could be shown for PPSZ. Our algorithm builds on PPSZ and improves it by treating sparse and dense formulas differently.
A key concept of the PPSZ analysis is the so-called critical clause: We call a clause critical for a variable if exactly one of its literals is satisfied by the unique satisfying assignment, and that literal is over that variable. It is not hard to see that the uniqueness of the satisfying assignment implies that every variable has at least one critical clause. If some variables have strictly more than one critical clause, we give a straightforward proof that PPSZ by itself is already faster. Hence the bottleneck of PPSZ is the case where every variable has exactly one critical clause, and in total there are exactly critical clauses.
Given a formula with exactly critical clauses, consider how many other (noncritical) clauses there are. If there are few, we use an algorithm by Wahlström [9] that is faster than PPSZ for formulas with few clauses overall. If there are many noncritical clauses, we use the following fact: A noncritical clause has two or more satisfied literals (w.r.t. the unique satisfying assignment); so after removing a literal, the remaining 2-clause is still satisfied. We will exploit this to improve PPSZ.
A remaining problem is the case where only very few (i.e. sublinearly many) variables have more than one critical clause or appear in many (noncritical) clauses. In this case, we would get only a subexponential improvement. A significant part of our algorithm deals with this problem.
1.1 Notation
We use the notational framework introduced in [11]. Let be a finite set of propositional variables. A literal over is a variable or a negated variable . If , then , the negation of , is defined as . We mostly use for variables and for literals. We assume that all literals are distinct. A clause over is a finite set of literals over pairwise distinct variables from . By we denote the set of variables that occur in , i.e. . is a clause if and it is a clause if . A formula in CNF (Conjunctive Normal Form) over is a finite set of clauses over . We define . is a kCNF formula (a CNF formula) if all clauses of are clauses (clauses). A (truth) assignment on is a function which assigns a Boolean value to each variable. extends to negated variables by letting . A literal is satisfied by if . A clause is satisfied by if it contains a satisfied literal and a formula is satisfied by if all of its clauses are. A formula is satisfiable if there exists a satisfying truth assignment to its variables. A formula that is not satisfiable is called unsatisfiable. Given a CNF formula , we denote by the set of assignments on that satisfy . SAT is the decision problem of deciding if a CNF formula has a satisfying assignment.
If is a CNF formula and , we write (analogously ) for the formula arising from removing all clauses containing and truncating all clauses containing to their remaining literals. This corresponds to assigning to (or ) in and removing trivially satisfied clauses. We call two assignments consistent if they agree on all variables assigned by both. If is an assignment on and , we denote by the assignment on with for . If , we write as a shorthand for , the restriction of F to .
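As a companion to the notation above, the following sketch models it in code. The representation is my own (not from the paper): a literal is a nonzero integer (a variable v or its negation -v), a clause is a frozenset of literals, and a formula is a set of clauses.

```python
# A minimal sketch of the paper's notation in code (my own modeling):
# a literal is a nonzero int (+v or its negation -v), a clause is a
# frozenset of literals, a formula is a set of clauses.

def assign(formula, literal):
    """The formula arising from setting `literal` to true: remove clauses
    containing `literal`, truncate the complementary literal elsewhere."""
    result = set()
    for clause in formula:
        if literal in clause:
            continue                      # trivially satisfied, removed
        result.add(clause - {-literal})   # truncated to remaining literals
    return result

def satisfies(assignment, formula):
    """assignment: dict var -> bool; each clause needs a satisfied literal."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in formula)
```

For example, setting the variable 1 to true in {(x1 ∨ x2 ∨ x3), (¬x1 ∨ x2)} removes the first clause and truncates the second to (x2).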
For a set , we denote by choosing an element u.a.r. (uniformly at random). Unless otherwise stated, all random choices are mutually independent. We denote by the logarithm to the base 2. For the logarithm to the base , we write . By we denote a polynomial factor depending on . We use the following convention if no confusion arises: When is a CNF formula, we denote by its variables and by the number of variables of , i.e. and . By we denote a quantity dependent on going to with .
1.2 Previous Work
Definition 1.1.
(Promise) Unique 3SAT is the following promise problem: Given a CNF with at most one satisfying assignment, decide if it is satisfiable or unsatisfiable.
A randomized algorithm for Unique 3SAT is an algorithm that, for a uniquely satisfiable CNF formula returns the satisfying assignment with probability .
Note that if the formula is not satisfiable, there is no satisfying assignment, and the algorithm cannot erroneously find one. Hence the error is one-sided and we need not worry about unsatisfiable formulas.
The PPSZ algorithm [7] is a randomized algorithm for Unique 3SAT running in time . The precise bound is as follows:
Definition 1.2.
Let .
Theorem 1.3 ([7]).
There exists a randomized algorithm (called PPSZ) for Unique 3SAT running in time .
Note that and .
1.3 Our Contribution
For Unique 3SAT, we get time bounds exponentially better than PPSZ:
Theorem 1.4.
There exists a randomized algorithm for Unique 3SAT running in time where .
2 The PPSZ Algorithm
In this section we review the PPSZ algorithm [7], summarized in Algorithm 1. We need to adapt some statements slightly. For the straightforward but technical proofs we refer the reader to the appendix. The following two definitions are used to state the PPSZ algorithm.
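The prose description can be accompanied by a minimal executable sketch (my own simplification, not the paper's Algorithm 1): clauses are frozensets of signed-integer literals, and implication (Definition 2.1) is checked by brute force over all subformulas of at most `d` clauses, which is far less efficient than the bounded-resolution machinery of [7] but matches the definition.

```python
import random
from itertools import combinations, product

def restrict(formula, literal):
    """Set `literal` to true: drop satisfied clauses, truncate the rest."""
    return {c - {-literal} for c in formula if literal not in c}

def implied(formula, literal, d):
    """Definition 2.1, brute force: does some subformula G of at most d
    clauses force `literal` (every satisfying assignment of G sets it)?
    An unsatisfiable subformula vacuously implies everything."""
    for size in range(1, d + 1):
        for sub in combinations(formula, size):
            vs = sorted({abs(l) for c in sub for l in c})
            if abs(literal) not in vs:
                continue
            forced = True
            for bits in product([False, True], repeat=len(vs)):
                a = dict(zip(vs, bits))
                sat = all(any(a[abs(l)] == (l > 0) for l in c) for c in sub)
                if sat and a[abs(literal)] != (literal > 0):
                    forced = False
                    break
            if forced:
                return True
    return False

def ppsz(formula, variables, d, rng=random):
    """Process the variables in random order; force a value when implied,
    otherwise guess it uniformly at random."""
    assignment, current = {}, formula
    for v in rng.sample(list(variables), len(variables)):
        if implied(current, v, d):
            value = True
        elif implied(current, -v, d):
            value = False
        else:
            value = rng.random() < 0.5            # a guessed variable
        assignment[v] = value
        current = restrict(current, v if value else -v)
    return assignment
```

On the chain formula {(x1), (¬x1 ∨ x2), (¬x2 ∨ x3)} with d = 3, every variable is forced in every order, so the sketch returns the unique satisfying assignment deterministically.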
Definition 2.1.
A CNF formula implies a literal if there exists a subformula with and all satisfying assignments of set to .
In a random permutation, the positions of two elements are not independent. To overcome this, placements were defined. They can be seen as continuous permutations with the nice property that the places of different elements are independent.
Definition 2.2 ([7]).
A placement on is a mapping . A random placement is obtained by choosing for every uniformly at random from , independently.
Observation 2.3.
By symmetry and as ties happen with probability , ordering according to a random placement gives a permutation distributed the same as a permutation drawn uniformly at random from the set of all permutations on .
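Observation 2.3 admits a quick empirical illustration (a sketch; the sample size and seed are arbitrary choices of mine): sorting elements by i.i.d. uniform places yields each of the 3! orderings of three elements about equally often.

```python
import random
from collections import Counter

random.seed(2023)  # arbitrary seed, for reproducibility

def order_by_placement(elements):
    """Draw a place in [0, 1) u.a.r. for each element and sort by it;
    ties occur with probability 0."""
    places = {e: random.random() for e in elements}
    return tuple(sorted(elements, key=places.get))

counts = Counter(order_by_placement("abc") for _ in range(60_000))
# all 6 orderings of {a, b, c} occur, each with frequency close to 1/6
```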
The analysis of PPSZ builds on the concept of forced and guessed variables:
Definition 2.4.
If in PPSZ, is assigned or because of implication, we call forced. Otherwise (if is set to ), we call guessed.
The following lemma from [7] relates the expected number of guessed variables to the success probability (the proof is by an induction argument and Jensen’s inequality).
Lemma 2.5 ([7]).
Let be a satisfiable CNF, let be a satisfying assignment. Let be the expected number of guessed variables conditioned on depending on . Then returns with probability at least .
Remember that , which corresponds to the probability that a variable is guessed. We define where the integral starts from instead of ; this corresponds to the probability that a variable has place at least and is guessed.
Definition 2.6.
Let .
Observation 2.7.
For , .
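The explicit formulas of Definitions 1.2 and 2.6 are elided in this copy. Assuming the standard PPSZ bound for 3SAT, under which a variable at place r is guessed with probability at most (1 − r)/(1 + r), the following sketch numerically recovers the constant 2 ln 2 − 1 ≈ 0.3863 and the familiar base 2^(2 ln 2 − 1) ≈ 1.30704; treat the closed form as my assumption, not a quotation of the paper.

```python
import math

def guess_bound(r):
    # assumed PPSZ bound (k = 3): a variable at place r is
    # guessed with probability at most (1 - r) / (1 + r)
    return (1 - r) / (1 + r)

def S(gamma, steps=200_000):
    """Numerically integrate guess_bound over [gamma, 1] (midpoint rule),
    in the spirit of Definition 2.6 where the integral starts at gamma."""
    h = (1 - gamma) / steps
    return h * sum(guess_bound(gamma + (i + 0.5) * h) for i in range(steps))

closed_form = 2 * math.log(2) - 1   # = S(0), about 0.386294
base = 2 ** closed_form             # about 1.30704, the Unique 3SAT base
```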
In the appendix, we derive from [7] the following:
Corollary A.3.
Let a CNF with unique satisfying assignment . Then in PPSZ() conditioned on , the expected number of guessed variables is at most .
Furthermore, suppose we pick every variable of with probability , independently, and let be the resulting set. Then in PPSZ() conditioned on , the expected number of guessed variables is at most .
By Lemma 2.5, we have the following corollary:
Corollary 2.8.
Let a CNF with unique satisfying assignment . Then the probability that returns is at least .
Furthermore, suppose we pick every variable of with probability , independently, and let be the resulting set. Then the expectation (over the choice of ) of the probability that returns is at least .
The first statement is actually what is shown in [7], and the second statement is a direct consequence. We need this later when we replace PPSZ by a different algorithm on variables with place at most . It is easily seen that for a CNF , runs in time . Hence by a standard repetition argument, PPSZ gives us an algorithm finding an assignment in time and we (re)proved Theorem 1.3.
3 Reducing to One Critical Clause per Variable
In this section we show that an exponential improvement for the case where every variable has exactly one critical clause gives an exponential improvement for Unique 3SAT.
Definition 3.1 ([7]).
Let be a CNF formula satisfied by . We call a clause critical for (w.r.t. ) if satisfies exactly one literal of , and this literal is over .
Definition 3.2.
A 1CUnique CNF is a uniquely satisfiable CNF where every variable has at most one critical clause. Call the corresponding promise problem 1CUnique 3SAT.
All formulas we consider have a unique satisfying assignment; critical clauses are always w.r.t. that assignment. First we show that variables with more than one critical clause are guessed less often, giving an exponential improvement for formulas with a linear number of such variables. A similar statement for shorter critical clauses is required in the next section.
Lemma 3.3.
Let be a CNF uniquely satisfied by . A variable with at least two critical clauses (w.r.t. ) is guessed given with probability at most . Furthermore, a variable with a critical clause is guessed with probability at most .
Proof.
Suppose . Let and be two critical clauses of . If and share no variable besides , then the probability that is forced is at least by the inclusion-exclusion principle. If and share one variable besides , then the probability that is forced is at least (which is smaller). and cannot share two variables besides : in that case , as being a critical clause for w.r.t. predetermines the polarity of the literals. Intuitively, if is small, then is almost twice as large as ; therefore in this area the additional clause helps us and the overall forcing probability increases. For a critical clause the argument is analogous. Here, the probability that is forced given place is at least . The statement now follows by integration using the dominated convergence theorem; see Appendix A.1. ∎
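The inclusion-exclusion step in the proof can be sanity-checked in one line. Here `q` is a hypothetical stand-in for the probability that a single fixed critical clause forces the variable (the paper's exact expression is elided in this copy):

```python
def forced_with_two_clauses(q):
    """Two critical clauses sharing only the variable itself give independent
    forcing events: P(forced) >= 1 - (1-q)^2 = 2q - q^2 by inclusion-exclusion."""
    return 2 * q - q * q

for q in [0.01, 0.1, 0.3, 0.5, 0.9]:
    # strictly better than one clause, and at most twice as good
    assert q < forced_with_two_clauses(q) <= 2 * q

# for small q the bound is almost twice as large, as the intuition says
assert forced_with_two_clauses(0.01) > 1.98 * 0.01
```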
Corollary 3.4.
Let be a CNF formula uniquely satisfied by . If variables of have two critical clauses, PPSZ finds with probability at least
If variables of have a critical clause, PPSZ finds with probability at least
If there are only few variables (less than ) with one critical clause, we can find and guess them by brute force. If we choose small enough, any exponential improvement for 1CUnique 3SAT gives a (diminished) exponential improvement for Unique 3SAT. To bound the number of subsets of size , we define the binary entropy and use a well-known upper bound on the binomial coefficient.
Definition 3.5.
For , ().
Lemma 3.6 (Chapter 10, Corollary 9 of [6]).
If is an integer, then
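The displayed inequality of Lemma 3.6 is elided in this copy; assuming its standard form, which bounds the binomial coefficient by 2^(H(δ)n), the following sketch checks Definition 3.5 and the lemma numerically:

```python
import math

def binary_entropy(x):
    """Definition 3.5, with the usual convention H(0) = H(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

# Lemma 3.6 (standard form): binom(n, delta*n) <= 2^(H(delta)*n)
# whenever delta*n is an integer
for n, delta in [(10, 0.2), (100, 0.1), (200, 0.35)]:
    assert math.comb(n, int(delta * n)) <= 2 ** (binary_entropy(delta) * n)

# the bound is tight up to a polynomial factor
assert math.comb(100, 10) > 2 ** (binary_entropy(0.1) * 100) / 100
```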
We will mainly prove that we obtain some exponential improvement. The claimed numbers are straightforward to check by inserting the values from the following table.
name  value  description

improvement in 1CUnique 3SAT
improvement in Unique 3SAT
threshold fraction of vars. with more than 1 crit. clause
amount of variables distinguishing sparse and dense
exponential savings on repetitions if the formula is sparse
prob. that a var. is assigned using indep. clauses instead of PPSZ
Lemma 3.7.
If there is a randomized algorithm solving 1CUnique 3SAT in time for , then there is a randomized algorithm (Algorithm 2) solving Unique 3SAT in time for some .
Proof.
Let be a CNF uniquely satisfied by . Let be the number of variables of with more than one critical clause. If , PPSZ is faster by Corollary 3.4. If , we can use .
However, what if ? In that case, we get rid of these variables by brute force: For all subsets of variables and for all possible assignments on , we try . For one such , we have satisfiable and ; namely if includes all variables with multiple critical clauses and is compatible with . This is because fixing variables according to does not produce new critical clauses w.r.t. .
There are subsets of size of the variables of , each with possible assignments. As (Lemma 3.6), we invoke at most times. Choosing small enough, this retains an exponential improvement for Unique 3SAT.
∎
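The brute-force wrapper of Lemma 3.7 can be sketched as follows (clauses as frozensets of signed integers; `inner_solver` stands in for the hypothetical 1CUnique algorithm and `delta` for the elided threshold):

```python
from itertools import combinations, product

def restrict(formula, partial):
    """Apply a partial assignment: drop satisfied clauses, truncate the rest."""
    out = set()
    for clause in formula:
        if any(partial.get(abs(l)) == (l > 0) for l in clause):
            continue
        out.add(frozenset(l for l in clause if abs(l) not in partial))
    return out

def solve_via_subsets(formula, variables, delta, inner_solver):
    """Try every subset W of at most delta*n variables and every assignment
    on W; this makes at most binom(n, dn) * 2^(dn) <= 2^((H(d)+d)n) calls
    to inner_solver, as in the proof of Lemma 3.7."""
    n = len(variables)
    for k in range(int(delta * n) + 1):
        for W in combinations(variables, k):
            for bits in product([False, True], repeat=k):
                partial = dict(zip(W, bits))
                rest = [v for v in variables if v not in partial]
                sol = inner_solver(restrict(formula, partial), rest)
                if sol is not None:
                    sol.update(partial)
                    return sol
    return None
```

Plugging in a trivial exhaustive inner solver shows the wrapper returns a satisfying assignment whenever one exists.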
4 Using One Critical Clause per Variable
In this section we give an exponential improvement for 1CUnique 3SAT.
Theorem 4.1.
Given a 1CUnique CNF on variables, OneCC() runs in expected time and finds the satisfying assignment with probability .
Obtaining a randomized algorithm using independent repetitions and Markov’s inequality is straightforward.
Corollary 4.2.
There exists a randomized algorithm for 1CUnique 3SAT running in time .
Together with Lemma 3.7 this immediately implies Theorem 1.4. We obtain the improvement by doing a case distinction into sparse and dense formulas, as defined now:
Definition 4.3.
For a CNF formula and a variable , the degree of in is defined to be the number of clauses in that contain the variable . The 3-clause degree of in is defined to be the number of 3-clauses in that contain the variable .
For a set of variables , denote by the part of independent of that consists of the clauses of that do not contain variables of .
We say that is sparse if there exists a set of at most variables such that has maximum 3-clause degree . We say that is dense otherwise.
We will show that for small enough, we get an improvement for sparse 1CUnique CNF formulas. On the other hand, for any we get an improvement for dense 1CUnique CNF formulas. In the sparse case we can fix by brute force a small set of variables to obtain a formula with few 3-clauses. We then need to deal with the remaining clauses and use an algorithm by Wahlström for CNF formulas with few clauses.
4.1 Dense Case
First we show the improvement for any dense 1CUnique CNF. Density means that even after ignoring all clauses over any variables, a variable with 3-clause degree at least 5 remains. The crucial idea is that for a variable with 3-clause degree at least 5, picking one occurrence u.a.r. and removing it gives a 2-clause that is satisfied (by the unique satisfying assignment) with probability at least . The only way a nonsatisfied 2-clause can arise is if the 3-clause it was obtained from was critical for that variable. However, we assumed that there is at most one critical clause per variable.
Repeating such deletions and ignoring all 3-clauses sharing variables with the produced 2-clauses, as listed in GetInd2Clauses(), gives us the following:
Observation 4.4.
For a dense 1CUnique CNF , GetInd2Clauses() returns a set of independent 2-clauses, each satisfied (by the unique satisfying assignment of ) independently with probability .
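A hedged sketch of GetInd2Clauses as described above (the degree threshold 5 comes from the text; the clause representation and control flow are my own reconstruction, not the paper's pseudocode):

```python
import random

def get_ind_2clauses(three_clauses, rng=random):
    """While some variable still occurs in at least 5 remaining 3-clauses,
    pick one of its occurrences u.a.r. and delete the literal over that
    variable, producing a 2-clause; then discard every 3-clause sharing a
    variable with the produced 2-clause.  At most one of the >= 5
    occurrences is critical, so each 2-clause is satisfied w.p. >= 4/5."""
    pool = set(three_clauses)
    produced = []
    while True:
        degree = {}
        for clause in pool:
            for lit in clause:
                degree[abs(lit)] = degree.get(abs(lit), 0) + 1
        heavy = [v for v, d in degree.items() if d >= 5]
        if not heavy:
            return produced
        v = heavy[0]
        chosen = rng.choice([c for c in pool if any(abs(l) == v for l in c)])
        two = frozenset(l for l in chosen if abs(l) != v)
        produced.append(two)
        keep_away = {abs(l) for l in two}
        pool = {c for c in pool if not ({abs(l) for l in c} & keep_away)}
```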
As a random 2-clause is satisfied with probability by a specific assignment, this set of 2-clauses gives us nontrivial information about the unique satisfying assignment. Now we show how to use these 2-clauses to improve PPSZ:
Lemma 4.5.
Let be a dense 1CUnique CNF for some . Then there exists an algorithm () running in time for and returning the satisfying assignment of with probability .
Proof.
First we give some intuition. For variables that occur late in PPSZ, the probability of being forced is large (being almost in the second half). However for variables that come at the beginning, the probability is very small; a variable at place is forced (in the worst case) with probability for , hence we expect forced variables among the first variables in total.
However, a clause that is satisfied by with probability can be used to guess both variables in a better way than uniformly, giving constant savings in the random bits required. For such clauses, we expect of them to have both variables among the first variables. For each 2-clause we have some nontrivial information; intuitively we save around bits. In total we save bits among the first variables, which is better than PPSZ for small enough .
Formally, let be a random set of variables, where each variable of is added to with probability . On , we replace PPSZ by our improved guessing; on the remaining variables we run PPSZ as usual. Let be the event that the guessing on (to be defined later) finds . Let be the event that PPSZ() finds . Observe that for a fixed , and are independent. Hence we can write the overall probability to find (call it ) as an expectation over :
where in the last two steps we used Jensen’s inequality and linearity of expectation.
By Corollary 2.8, . We now define the guessing and analyze (see Algorithm 5 as a reference): By Observation 4.4 we obtain a set of independent 2-clauses , each satisfied (by ) independently with probability . In the following we assume that has at least a fraction of satisfied 2-clauses, as this happens with constant probability (for a proof, see e.g. [2]) and we only need to show subexponential success probability.
Using the set of 2-clauses , we choose an assignment on as follows: For every clause in completely over , choose an assignment on both of its variables: with probability such that is violated, and with probability each one of the three assignments that satisfy . Afterwards, guess any remaining variable of u.a.r. from .
Given , let be the number of clauses of completely over not satisfied by . Let be the number of clauses of completely over satisfied by . Then
This is seen as follows: For any variable for which no clause in is completely over , we guess uniformly at random and so correctly with probability . For any clause which is completely over , we violate the clause with probability , and choose a nonviolating assignment with probability . For any clause not satisfied by , we hence set both variables according to with probability . For any clause satisfied by , we set both variables according to with probability , as we have to pick the right one of the three assignments that satisfy . As , , , , we have
The inequality follows from the observations and . One can calculate , corresponding to the fact that a four-valued random variable where one value occurs with probability at most has entropy at most .
Hence by our calculations and Observation 2.7 (to evaluate ), we have
This gives an improvement over PPSZ of . The first term corresponds to the savings PPSZ would have, the second term corresponds to the savings we have in our modified guessing. Observe that for small , the integral evaluates to , but the second term is . Hence choosing small enough gives an improvement. ∎
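The entropy calculation behind the modified guessing can be checked numerically. In this sketch `eps`, the probability assigned to the violating assignment of a 2-clause, is a hypothetical stand-in for the elided constant:

```python
import math

def bits_per_2clause(eps):
    """Entropy of the four-valued guess: the violating assignment of a
    2-clause gets probability eps, each of the three satisfying
    assignments (1 - eps) / 3.  Uniform guessing of the two variables
    costs exactly 2 bits."""
    return -eps * math.log2(eps) - (1 - eps) * math.log2((1 - eps) / 3)

# any eps < 1/4 gives entropy strictly below 2 bits -- the constant
# saving per 2-clause that drives the improvement over plain PPSZ
for eps in [0.05, 0.10, 0.20]:
    assert bits_per_2clause(eps) < 2.0
assert abs(bits_per_2clause(0.25) - 2.0) < 1e-12   # eps = 1/4 is uniform
```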
4.2 Sparse Case
Now we show that if is small enough we get an improvement for a sparse 1CUnique CNF. For this, we need the following theorem by Wahlström:
Theorem 4.6 ([9]).
Let be a CNF formula with average degree at most where we count degree as instead. Then satisfiability of can be decided in time . Denote this algorithm by .
Lemma 4.7.
Let be a sparse 1CUnique CNF. For small enough, there exists an algorithm () running in expected time for and finding the satisfying assignment of with probability .
Proof.
Similar to Section 3, we first check by brute force all subsets of variables and all possible assignments on ; by definition of sparse, for some , the part of independent of (i.e. ) has maximum 3-clause degree . If furthermore is compatible with , then is a 1CUnique CNF with maximum 3-clause degree : We observed that critical clauses cannot appear in the process of assigning variables according to ; furthermore any clause of not independent of must either disappear in or become a clause. As earlier, there are at most cases of choosing and . We now analyze what happens for the correct choice of :
We would like to use Wahlström's algorithm on ; however, might contain an arbitrary number of clauses. The plan is to use the fact that either there are many critical clauses, in which case PPSZ is better, or few critical clauses, in which case all other clauses are noncritical and have only satisfied literals.
The algorithm works as follows: We have a 1CUnique CNF on variables; the maximum degree in the 3-clauses is at most 4. First we try PPSZ: if there are critical clauses, this gives a satisfying assignment with probability . Otherwise, if there are less than clauses, the criterion of Theorem 4.6 applies: We invoke with probability ; this runs in expected time and finds a satisfying assignment with probability .
If both approaches fail, we know that has less than critical clauses, but more than clauses overall. Hence at most one third of the clauses are critical. However, a noncritical clause must be a clause with both literals satisfied. Hence choosing a clause of uniformly at random and setting all its literals to true sets two variables correctly with probability at least . That is, we reduce the number of variables by 2 with a better probability than PPSZ overall, and we can repeat the process with the reduced formula. This shows that for the correct , we have expected running time and success probability for . It is important to note that does not depend on . Repeating this process times gives success probability .
Together with the brute-force choice of and , we have an expected running time of . By choosing small enough we are better than PPSZ.
∎
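The random-clause reduction step used in the proof can be sketched as follows (2-clauses as frozensets of signed integers; this is a hedged reconstruction of one iteration, not the full sparse-case algorithm with its PPSZ and Wahlström branches):

```python
import random

def reduce_by_random_2clause(formula, rng=random):
    """Pick a 2-clause u.a.r. and set both of its literals to true.  When
    at most a third of the clauses are critical, the chosen clause has
    both literals satisfied with probability >= 2/3, so both variables
    are set correctly; the reduced formula has two variables fewer."""
    clause = rng.choice(list(formula))
    partial = {abs(l): (l > 0) for l in clause}
    reduced = set()
    for c in formula:
        if any(partial.get(abs(l)) == (l > 0) for l in c):
            continue                  # satisfied by the partial assignment
        reduced.add(frozenset(l for l in c if abs(l) not in partial))
    return partial, reduced
```

Note that 2/3 per pair of variables beats the per-variable guessing probability of plain PPSZ, which is what makes repeating this step worthwhile.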
5 Open Problems
Can we also obtain an improvement for general 3SAT? In general 3SAT, there might be even fewer critical clauses, and critical clauses for some assignments are not always critical for others. We would need to fit our improvement into the framework of [3]. As there is some leeway for multiple assignments, this seems possible, but nontrivial and likely to become very complex.
Another question is whether we can improve (Unique) SAT for larger clause sizes. PPSZ becomes slower as the clause size increases, which makes an improvement easier. However, the guessing in the sparse case relied on the fact that noncritical clauses have all remaining literals satisfied, which is not true for larger clauses.
Suppose Wahlström’s algorithm is improved so that it runs in time on 3CNF formulas with average degree . The sparsification lemma [4] shows that for and , we obtain an algorithm for 3SAT running in time for . Can our approach be extended to a similar sparsification result?
Acknowledgements
I am very grateful to my advisor Emo Welzl for his support in realizing this paper.
References
 [1] R. G. Bartle. The elements of integration and Lebesgue measure, volume 92. Wiley-Interscience, 2011.
 [2] K. Hamza. The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions. Statistics & Probability Letters, 23(1):21–25, 1995.
 [3] T. Hertli. 3-SAT faster and simpler—unique-SAT bounds for PPSZ hold in general. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science—FOCS 2011, pages 277–284. IEEE Computer Soc., Los Alamitos, CA, 2011.
 [4] R. Impagliazzo, R. Paturi, and F. Zane. Which problems have strongly exponential complexity? J. Comput. System Sci., 63(4):512–530, 2001. Special issue on FOCS 98 (Palo Alto, CA).
 [5] K. Iwama and S. Tamaki. Improved upper bounds for 3-SAT. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 328–329 (electronic), New York, 2004. ACM.
 [6] F. J. MacWilliams and N. J. A. Sloane. The Theory of Error-Correcting Codes. North-Holland, Amsterdam, 1977.
 [7] R. Paturi, P. Pudlák, M. E. Saks, and F. Zane. An improved exponential-time algorithm for k-SAT. J. ACM, 52(3):337–364 (electronic), 2005.
 [8] U. Schöning. A probabilistic algorithm for k-SAT and constraint satisfaction problems. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science, pages 410–414. IEEE Computer Society, Los Alamitos, CA, 1999.
 [9] M. Wahlström. An algorithm for the SAT problem for formulae of linear length. In Algorithms—ESA 2005, volume 3669 of Lecture Notes in Comput. Sci., pages 107–118. Springer, Berlin, 2005.
 [10] M. Wahlström. Algorithms, measures and upper bounds for satisfiability and related problems. PhD thesis, Department of Computer and Information Science, Linköpings universitet, Sweden, 2007.
 [11] E. Welzl. Boolean Satisfiability—Combinatorics and Algorithms (Lecture Notes), 2005. www.inf.ethz.ch/~emo/SmallPieces/SAT.ps
Appendix A Omitted Proofs
Theorem A.1.
Let be a CNF on variables with unique satisfying assignment . Let and . In the PPSZ algorithm, conditioned on and , is forced with probability at least , where is a monotone increasing function ( stands for a term that goes to when goes to ).
Proof of Theorem A.1.
This can be derived from [7]. There are two differences: The first is that we define PPSZ with implication instead of bounded resolution. It is easily seen that the critical clause tree construction of [7] also works with implication. We use implication because we think it makes the algorithm easier to understand.
The second difference is that we give a bound for a fixed . We need this to be able to modify PPSZ and analyze it in special situations. However, we can derive this result from [7]:
Let , the “ideal” lower bound of PPSZ that a variable is forced. Remember that . In [7] they give a lower bound on the probability that a variable is forced given with . This bound is shown to integrate to . As the probability that a variable is forced does not decrease if it comes later in PPSZ, the bound can easily be made monotone (if it is not already) by setting ,
For the statement is trivial. Now suppose for some , for some and all . By the above observation, , and . By continuity of and monotonicity of , we find such that for all . But then by monotonicity of and , , a contradiction. ∎
To go from Theorem A.1, where the place of a variable is fixed, to the expectation, we need to integrate (this complicated approach gives us some flexibility later). We need the following special case of the well-known dominated convergence theorem (see e.g. [1]). It essentially states that, in our case, the pointwise error terms integrate to a vanishing term.
Theorem A.2 (Dominated Convergence Theorem).
Let be a continuous function with . Let be integrable with . Then .
Combining Theorem A.1 with Lemma 2.5 and the dominated convergence theorem A.2 gives us the following corollary. Integrability of follows from monotonicity.
Corollary A.3.
Let a CNF with unique satisfying assignment . Then in PPSZ() conditioned on , the expected number of guessed variables is at most .
Furthermore, suppose we pick every variable of with probability , independently, and let be the resulting set. Then in PPSZ() conditioned on , the expected number of guessed variables is at most .