Phase transition for Local Search on planted SAT
The Local Search algorithm (or Hill Climbing, or Iterative Improvement) is one of the simplest heuristics for solving the Satisfiability and Max-Satisfiability problems. It is a part of many satisfiability and max-satisfiability solvers, where it is used to find a good starting point for more sophisticated heuristics, and to improve a candidate solution. In this paper we give an analysis of Local Search on random planted 3-CNF formulas. We show that if there is such that the clause-to-variable ratio is less than ( is the number of variables in a CNF) then Local Search whp does not find a satisfying assignment, and if there is such that the clause-to-variable ratio is greater than then Local Search whp finds a satisfying assignment. As a byproduct we also show that for any constant there is such that Local Search applied to a random (not necessarily planted) 3-CNF with clause-to-variable ratio produces an assignment that satisfies at least clauses less than the maximal number of satisfiable clauses.
A CNF formula over variables is a conjunction of clauses where each clause is a disjunction of one or more literals. A formula is said to be a -CNF if every clause contains exactly literals. In the problem -SAT the question is, given a -CNF, to decide whether it has a satisfying assignment (or to find such an assignment, for the search problem). In the MAX--SAT problem the goal is to find an assignment that satisfies as many clauses as possible. The problem -SAT for is one of the first problems proved to be NP-complete, and it has served as a model problem for many concepts in algorithms and complexity ever since. In particular, Håstad  proved that the MAX--SAT problem is NP-hard to approximate within ratio better than 7/8. These worst case hardness results motivate the study of the typical case complexity of those problems, and a quest for probabilistic or heuristic algorithms with satisfactory performance in the typical case. In this paper we analyze the performance of one of the simplest algorithms for (MAX-)-SAT, the Local Search algorithm, on random planted instances.
Let us start with planted instances. One of the most natural and well studied probability distributions on the set of 3-CNFs is the uniform distribution on the set of 3-CNFs with a given clauses-to-variables ratio . It can be constructed and sampled as follows. Fix the number of 3-clauses as a function of the number of variables. The elements of are 3-CNFs generated by selecting clauses over variables . Clauses are chosen uniformly at random from the set of possible clauses, and so the probability of every 3-CNF from is the same. An important parameter of such CNFs is the clause-to-variable ratio, , or density of the formula. We will use the density of a 3-CNF rather than the number of clauses, and so we write instead of . Density can also be a function of .
However, the typical case complexity for this distribution is not very interesting except for a very narrow range of densities. The reason is that random 3-SAT under this distribution demonstrates a sharp satisfiability threshold in the density . A random 3-CNF with density below the threshold (estimated to be around 4.2) is satisfiable whp (with high probability, meaning that the probability tends to 1 as goes to infinity), and a 3-CNF with density above the threshold is unsatisfiable whp. Therefore the trivial algorithm outputting yes or no by just computing the density of a 3-CNF gives the right answer to 3-SAT whp. For more results on the threshold see [10, 11, 1, 18]. It is also known that, as the density grows, the number of clauses satisfied by a random assignment differs less and less from the maximal number of satisfiable clauses. If density is infinite (meaning it is an unbounded function of ), then whp this difference becomes negligible, i.e. . Therefore, distribution is not very interesting for MAX-3-SAT, at least when density is large, as one can get whp a very good approximation just by checking a random assignment.
A more interesting and useful distribution is obtained from by conditioning on satisfiability: such distribution is uniform and its elements are the satisfiable 3-CNFs. Then the problem is to find or approximate a satisfying assignment knowing it exists. Unfortunately, to date there are no techniques to tackle such problems (see, e.g., [6, 9]), particularly, to sample the satisfiable distribution. A good approximation for such a distribution is the planted distribution , which is obtained from by conditioning on satisfiability by a specific “planted” assignment. To construct an element of a planted distribution we select an assignment of a set of variables and then uniformly at random include clauses satisfied by the assignment selected. Some attempts have been made to define a better approximation of the satisfiable distribution, see, e.g. , however, the analysis of such distributions is difficult and it is not clear if they are closer to the distribution sought.
Another interesting feature of the planted distribution is that there is a hope that it is possible to design an algorithm that solves all planted instances whp. Some candidate algorithms were suggested in [6, 13, 21]. The algorithms from  and  use different approaches to solve planted 3-SAT of high density. Experiments show that the algorithm from  achieves the goal, but a rigorous analysis of this algorithm has not yet been carried out. For a wider survey on SAT algorithms the reader is referred to [23, 7].
The Local Search algorithm (LS) is one of the oldest heuristics for SAT that has been around since the eighties. Numerous variations of this method have been proposed since then, see, e.g., [15, 25]. We study one of the most basic versions of LS, which, given a CNF, starts with a random assignment to its variables, and then on each step chooses at random a variable such that flipping this variable increases the number of satisfied clauses, or stops if such a variable does not exist. Thus LS finds a random local optimum accessible from the initial assignment.
LS has been studied before. The worst-case performance of pure LS is not very good: the only known lower bound for local optima of a -CNF is of clauses satisfied, where is the number of all clauses . In , it is shown that if the density of 3-CNFs is linear, that is, , then LS solves whp a random planted instance. Finally, in , we gave an estimate of how the number of clauses LS typically satisfies depends on the density of the formula.
A visualization of the number of clauses satisfied by an assignment is often useful: assignments can be thought of as points of a landscape, and the elevation of a point corresponds to the number of clauses unsatisfied; the higher the point is, the fewer clauses it satisfies. It is suspected that ‘topographic’ properties of such a landscape are responsible for many complexity properties of satisfiability instances. For example, it is believed that the hardness of random CNFs whose density is close to the satisfiability threshold is due to the geometry of the satisfying assignments. They tend to concentrate around several centers, which makes converging to a solution more difficult [7, 22]. As we shall see, the performance of LS is closely related to geometric properties of the assignments, and so we hope that the study of LS may lead to a better understanding of those properties.
We classify the performance of LS for all densities higher than an arbitrary constant. In particular, we demonstrate that LS has a threshold in its performance. The main result is the following theorem.
(1) Let , and . Then the local search whp finds a solution of an instance from .
(2) Let , a constant, and . Then the local search whp does not find a solution of an instance from .
To prove part (1) of Theorem 1 we show that under those conditions all the local optima of a 3-CNF whp are either satisfying assignments, that is, global optima, or obtained by flipping almost all the values of the planted solution, and so are located on the opposite side of the set of assignments. In the former case LS finds a satisfying assignment, while whp it does not reach the local optima of the second type. We also show that for any constant density there is such that the assignment produced by LS on an instance from or satisfies at least clauses less than the maximal number of satisfiable clauses. Unfortunately, it is somewhat difficult to run computational experiments on CNFs of infinite density, as in order to have sufficiently large must be prohibitively big. However, the experiments we were able to conduct agree with the results.
Another region where LS can find a solution of the random planted 3-CNF is the case of very low density. Methods similar to Lemma 9 and Theorem 11 show that this low density transition happens around . However, we do not go into details here.
Usually the main difficulty in the analysis of algorithms for random SAT is to show that as an algorithm runs, some kind of randomness of the current assignment is kept. This property allows one to use ‘card games’, Wormald’s theorem, and differential equations as in [1, 8], or relatively simple probabilistic constructions, such as martingales, as in . For LS randomness cannot be assumed after just a few iterations of the algorithm, which makes its analysis more difficult. This is why the most difficult part of the proof is to identify to what extent assignments produced by LS as it runs remain random, while most of the probabilistic computations are fairly standard.
The paper is organized as follows. After giving several necessary definitions in Section 2, we prove in Section 3 that above the threshold established in Theorem 1 planted 3-CNFs do not have local optima that can be found by LS, other than satisfying assignments. In Section 4 we show that below the threshold there are many such optima, and that LS necessarily gets stuck in one of them.
A 3-CNF is a conjunction of 3-clauses. As we consider only 3-CNFs, we will always call them just clauses. Depending on the number of negated literals, we distinguish 4 types of clauses: , , and . If is a 3-CNF over variables , an assignment of these variables is a Boolean -tuple , so the value of is . The density of a 3-CNF is the number where is the number of clauses, and is the number of variables in .
The uniform distribution of 3-CNFs of density (density may be a function of ), is the set of all 3-CNFs containing variables and clauses equipped with the uniform probability distribution on this set. To sample a 3-CNF according to one chooses uniformly and independently clauses out of possible clauses. Thus, we allow repetitions of clauses, but not repetitions of variables within a clause. Random 3-SAT is the problem of deciding the satisfiability of a 3-CNF randomly sampled according to . For short, we will call such a random formula a 3-CNF from .
The uniform planted distribution of 3-CNFs of density is constructed as follows. First, choose at random a Boolean -tuple , a planted satisfying assignment. Then let be the uniform probability distribution over the set of all 3-CNFs over variables with density and such that is a satisfying assignment. For our goals we can always assume that is the all-ones tuple, that is, a 3-CNF belongs to if and only if it contains no clauses of the type . We also simplify the notation by . To sample a 3-CNF according to one chooses uniformly and independently clauses out of possible clauses of types , and . Random Planted 3-SAT is the problem of deciding the satisfiability of a 3-CNF from .
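The two sampling procedures just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `sample_3cnf`, the signed-integer encoding of literals (`+i` for a variable, `-i` for its negation), and the use of rejection sampling for the planted case are choices made here for concreteness.

```python
import random

def sample_3cnf(n, m, planted=False, rng=None):
    """Draw m clauses uniformly and independently (repetitions allowed).

    Each clause uses three distinct variables from 1..n. A literal is a
    signed integer: +i stands for variable i, -i for its negation.
    With planted=True, clauses violated by the all-ones assignment
    (all three literals negative) are rejected and redrawn.
    """
    rng = rng or random.Random()
    clauses = []
    while len(clauses) < m:
        # Three distinct variables, sorted to give a canonical clause form.
        variables = sorted(rng.sample(range(1, n + 1), 3))
        # Independent uniform sign for each literal.
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        if planted and all(lit < 0 for lit in clause):
            continue  # not satisfied by the planted all-ones assignment
        clauses.append(clause)
    return clauses
```

For a target density one would call, for example, `sample_3cnf(n, round(density * n), planted=True)`, since the density is the clause-to-variable ratio.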
The problems Random MAX-3-SAT and Random Planted MAX-3-SAT are the optimization versions of Random 3-SAT and Random Planted 3-SAT. The goal in these problems is to find an assignment that satisfies as many clauses as possible. Although the two problems usually are treated as maximization problems, it will be convenient for us to consider them as problems of minimizing the number of unsatisfied clauses. Since we always evaluate the absolute error of our algorithms, not the relative one, such transformation does not affect the results.
A formal description of the Local Search algorithm (LS) is given in Fig. 1.
Input: 3-SAT formula over variables .
Output: Boolean -tuple , which is a local minimum of .
choose uniformly at random a Boolean -tuple
let be the set of all variables such that the number of clauses that can be made satisfied by flipping the value of is strictly greater than the number of those made unsatisfied
while is not empty
    pick uniformly at random a variable from
    change the value of
Observe that LS stops when reaches a local minimum of the number of unsatisfied clauses.
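A direct, unoptimized rendering of this procedure in Python may help fix the details. It is a sketch under assumptions made here: literals are signed integers (`+i` / `-i` for variable `i` and its negation), assignments are lists of booleans, the helper names are hypothetical, and the set of improving variables is recomputed from scratch after every flip.

```python
import random

def num_unsat(clauses, a):
    """Number of clauses with no true literal under assignment a."""
    return sum(
        not any(a[abs(l) - 1] == (l > 0) for l in c) for c in clauses
    )

def improving_vars(clauses, a):
    """Indices whose flip strictly decreases the number of unsatisfied clauses."""
    base = num_unsat(clauses, a)
    out = []
    for i in range(len(a)):
        a[i] = not a[i]            # tentatively flip variable i
        if num_unsat(clauses, a) < base:
            out.append(i)
        a[i] = not a[i]            # undo the flip
    return out

def local_search(clauses, n, rng=None):
    """Start from a uniformly random assignment; flip a random improving
    variable until none exists, then return the local minimum reached."""
    rng = rng or random.Random()
    a = [rng.random() < 0.5 for _ in range(n)]
    while True:
        candidates = improving_vars(clauses, a)
        if not candidates:
            return a
        i = rng.choice(candidates)
        a[i] = not a[i]
```

Each flip strictly decreases the number of unsatisfied clauses, so the loop terminates after at most as many flips as there are clauses. The quadratic cost per step is acceptable only for small experiments.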
Given an assignment and a clause it will be convenient to say that votes for a variable to have value 1 if contains literal and its other two literals are unsatisfied. In other words if either (a) assigns to 0, is not satisfied by , and it will be satisfied if the value of is changed, or (b) the only literal in satisfied by is . Similarly, we say that votes for if contains the negation of and its other two literals are not satisfied. Using this terminology we can define set as the set of all variables such that the number of votes received to change the current value is greater than the number of those to keep it.
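The voting terminology translates directly into code. The sketch below assumes the same hypothetical encoding as elsewhere in these illustrations (signed-integer literals, boolean list for the assignment, variables numbered from 1); the function names `votes` and `wants_flip` are introduced here and are not from the paper.

```python
def votes(clauses, a, v):
    """Return (votes for v to be 1, votes for v to be 0) under assignment a.

    A clause votes for v to be 1 if it contains the positive literal of v
    and its other two literals are unsatisfied; it votes for v to be 0 if
    it contains the negated literal of v and its other two literals are
    unsatisfied.
    """
    for_one = for_zero = 0
    for c in clauses:
        for lit in c:
            if abs(lit) != v:
                continue
            others_unsat = all(
                a[abs(l) - 1] != (l > 0) for l in c if l != lit
            )
            if others_unsat:
                if lit > 0:
                    for_one += 1
                else:
                    for_zero += 1
    return for_one, for_zero

def wants_flip(clauses, a, v):
    """True if more clauses vote to change v's current value than to keep it."""
    for_one, for_zero = votes(clauses, a, v)
    return for_zero > for_one if a[v - 1] else for_one > for_zero
```

For instance, with clauses `(1, 2, 3)` and `(-1, 2, 3)` and the all-zero assignment, variable 1 receives one vote each way and stays put, while variable 2 receives a single vote to become 1.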
Probabilistic tools we use are fairly standard and can be found in the book .
Let be a 3-CNF with variables . The primal graph of is the graph with vertex set and edge set . The hypergraph associated with is a hypergraph, whose vertices are the variables of and the edges are the 3-element sets of variables belonging to the same clause. Note that if , then is a random 3-hypergraph with vertices and edges, but is not a random graph.
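As an illustration, both structures can be built as follows. The sketch assumes the usual definition of the primal graph (two variables are adjacent when they occur together in some clause) and a signed-integer encoding of literals; the helper `induced_average_degree` is a hypothetical convenience for checking average-degree conditions on vertex subsets.

```python
from itertools import combinations

def primal_graph(clauses, n):
    """Adjacency sets: u and w are adjacent if some clause contains both."""
    adj = {v: set() for v in range(1, n + 1)}
    for c in clauses:
        for l1, l2 in combinations(c, 2):
            u, w = abs(l1), abs(l2)
            adj[u].add(w)
            adj[w].add(u)
    return adj

def hyperedges(clauses):
    """The 3-element variable sets of the clauses (signs are ignored)."""
    return {frozenset(abs(l) for l in c) for c in clauses}

def induced_average_degree(adj, subset):
    """Average degree of the subgraph induced by a non-empty vertex subset."""
    s = set(subset)
    return sum(len(adj[v] & s) for v in s) / len(s)
```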
We will need the following properties that a graph of not too high density has.
Let for a certain constant , and let .
(1) For any , whp all the subgraphs of induced by at most vertices have the average degree less than 5.
(2) The probability that has a vertex of degree greater than is .
Proof: (1) This part of the lemma is very similar to Proposition 13 from , and is proved in a similar way. Let be a fixed set of variables with . The number of 3-element sets of variables that include 2 variables from is bounded from above by
For each of them the probability that this set is the set of variables of one of the random clauses chosen for (we ignore the type of the clause) equals
Thus, the probability that of them are included as clauses is at most
Let . Using the union bound, the probability that there exists a required set with at most variables is at most
(2) The probability that the degree of a fixed vertex is at least is bounded from above by
where is the probability that some particular random clauses include , and is the number of -element sets of clauses. Then it is not hard to see that
as goes to infinity.
Several times we need the following corollary from Azuma’s inequality for supermartingales (see Lemma 1 from ).
(1) Let be a supermartingale such that and for some . Then for any .
(2) This inequality implies that if and then the process is a supermartingale and we have the following inequality
The following lemma is a simple corollary of Chernoff bound.
Let be integers, a positive real, and let be some real constants. There are constants and such that we have
for any random variables and such that and for some binomial random variables .
Proof: Let . It is easy to see that event implies occurrence of at least one of the events from the set
Indeed, inequality can be derived from inequalities, opposite to the ones in and .
Application of Chernoff bound gives us inequalities
Thus if we set , then using union bound we can conclude that inequality (2) holds.
3 Success of Local Search
In this section we prove statement (1) of Theorem 1. This will be done as follows. First, we show that if a 3-CNF has high density, that is, greater than for some then whp all the local minima that do not satisfy the CNF (we call such minima proper) concentrate very far from the planted assignment. This is the statement of Proposition 8 below. Then we use Lemma 5 to prove that starting from a random assignment LS whp does not go to that remote region. Therefore the algorithm does not get stuck in a local minimum that is not a solution.
Several times we will need the following observation that can be checked using the inequality. For any , , and with
We need the following two lemmas. Recall that the planted solution is the all-ones one.
Let for some constant , and let constants be such that . Whp any assignment with zeros satisfies more clauses than any assignment with zeros.
Proof: Let be some vectors with and zeros, respectively. Let be a random clause, then (1) with probability all its literals are positive, (2) with probability two literals are positive and, similarly, (3) with probability one literal is positive. The probabilities that the clause is satisfied by in these cases are and , respectively. Hence the total probability of a clause to be satisfied by equals . A similar result holds for . Thus the expectation of the number of clauses satisfied by and in a random formula equals and respectively, thus applying Lemma 4 we conclude that
for some . There are assignments, hence, application of the union bound finishes the proof of the lemma.
Let for some (not necessarily ). There is such that for whp for any proper local minimum of the number of variables assigned to 0 by is either less than , or greater than .
Proof: Let be the set of all variables that assigns to 0. Let be the event “for every the number of clauses voting for to be 1 is less than or equal to the number of clauses voting for to be 0”. Since is a local minimum, is the case for . It is easy to see that event implies the event “the total number of votes given by clauses for variables in to be 1 is less than or equal to the total number of votes given by clauses for variables in to be 0”. To bound the probability of we will bound the probability of .
Let be a random clause. It can contribute from 0 to 3 votes for variables in to be one and 0 or 1 vote for them to remain zero. Let us compute, for example, the probability that it contributes exactly two votes for variables in to become one. It happens if is of type , both its positive variables are in and the negative variable is outside of . Probability of this event is . So the expectation of the number of clauses voting for exactly 2 variables in to be 1 is . The expectations of the numbers of clauses voting for three and one variables to be 1 are and , respectively.
A clause votes for a variable in to remain 0 if its type is , one of its negative literals is not in , and two other literals are in , or if its type is and all the variables in it belong to . Thus the expectation of the number of clauses voting for variables in to remain 0 is .
Hence the expectation of the number of votes for variables in to flip equals
and expectation of the number of votes for variables in to remain 0 equals
Therefore we can apply Lemma 4 to the votes for and against 0s and get the following bound for some . Then we can bound the number of votes for a flip from below by for some constant and we can bound the number of sets of size as
then the union bound implies that whp there is no set such that happens. It is easy to see that for and that is close enough to 1 the above inequality holds, which finishes the proof of the lemma.
Now suppose that is a proper local minimum of . There is a clause that is not satisfied by . Without loss of generality, let the variables in be , and let the variable assigned 0 be . Thus, clause votes for to be flipped to 1. Since is a local minimum there must be a satisfied clause that would become unsatisfied if this variable were flipped. We call such a clause a support clause for the 0 value of . In any support clause the supported variable is negated, and therefore any support clause has the type or . A variable of a CNF is called -isolated if it appears positively in at most clauses of the type . The distance between variables of a CNF is the length of the shortest path in connecting them.
If and then for any integers and for a random whp there are no two -isolated variables within distance from each other.
Proof: Let be some variable. The probability that it is -isolated can be computed as
for any .
By Lemma 2(2), the degree of every vertex of whp does not exceed . Hence, there are at most vertices at distance from . Applying the union bound we can estimate the probability that there is a -isolated vertex at distance from as . Finally, taking into account the probability that itself is -isolated, and applying the union bound over all vertices of we obtain that the probability that two -isolated vertices exist at distance from each other can be bounded from above by
Thus for whp there are no two such vertices.
Let , and . Then whp proper local minima of a 3-CNF from have at most ones.
Proof: Let be a random planted instance. Suppose that is a proper local minimum that has more than ones. We use the following observation. Let be a clause not satisfied by . Then it contains at least one variable that is assigned to zero by . The assignment is a local minimum, so there must be a clause that is satisfied only by . Hence, is a support clause, and contains a variable which is assigned to zero by . Variables and are at distance . Setting and , by Lemma 7, we conclude that one of them is not 11-isolated.
Set , and consider the set of all variables assigned to zero by that are not 11-isolated. By the observation above this set is non-empty. On the other hand, by Lemma 6, is for some . Consider . It appears positively in at least 10 clauses of the type . Each of these clauses is either unsatisfied or contains a variable assigned to 0. Suppose there are unsatisfied clauses among them. Since is a local minimum, to prevent from flipping, must be supported by at least support clauses, each of which contains a variable assigned to 0. Thus, at least 6 neighbors of in are assigned to 0. Any two neighbors of are at distance 2. By Lemma 7 at least 5 of the neighbors assigned to 0 are not 11-isolated, and therefore belong to . Thus the subgraph induced by in has the average degree greater than 5, which is not possible by Lemma 2(1).
Now we are in a position to prove statement (1) of Theorem 1.
Proof: [of Theorem 1(1)] By Lemma 5 for a whp any assignment with variables equal to 1, where , satisfies more clauses than any assignment with equal to 1. Then, whp a random initial assignment for LS assigns between and of all variables to 1. Therefore, whp LS never arrives at a proper local minimum with less than variables equal to 1, and, by Proposition 8, at any proper local minimum.
4 Failure of Local Search
We now prove statement (2) of Theorem 1. The overall strategy is the following. First, we show (Proposition 10) that, in contrast to the previous case, there are many proper local minima in close proximity to the planted assignment. Then we show (Proposition 12) that those local minima are located so that they intercept almost every run of LS, and thus almost every run is unsuccessful.
We start off with a technical lemma. A pair of clauses , is called a cap if are 1-isolated, that is they do not appear in any clause of the type except for and , respectively, and are not 0-isolated (see Figure 2(a)). We denote equality by .
Let , and . There is , , such that whp a random planted CNF contains at least caps.
Proof: The proof is fairly standard, see, e.g. the proof of Theorem 4.4.4 in . We use the second moment method. The result follows from the fact that a cap has properties similar to the properties of strictly balanced graphs, see . Take some , and let be a random variable equal to the number of caps in a 3-CNF . Straightforward calculation shows that the probability that a fixed 5-tuple of variables is a cap is . Therefore .
Let be a fixed 5-tuple of variables, say, , and denote the event that forms a cap. For any other 5-tuple , the similar event is denoted by , and we write if these two events are not independent. By Corollary 4.3.5 of  it suffices to show that
Let . It is not hard to see that the only cases when and are not independent and the probability is significantly different from 0 are: and , or and , or and , or and . Then, as before, it can be found that in each of these cases .
We can choose if , and if for .
Let , and . Then there is , , such that a 3-CNF from whp has at least proper local minima.
Proof: Let , be a cap and an assignment such that , and for all other . It is straightforward that is a proper local minimum. By Lemma 9, there is such that whp the number of such minima is at least .
Before proving Proposition 12, we note that a construction similar to caps helps evaluate the approximation rate of the local search in the case of constant density on planted and also on arbitrary CNFs. A subformula is called a crown if the variables do not appear in any clauses other than (see Fig. 2(b)). The crown is satisfiable, but the all-zero assignment is a proper local minimum. For a CNF and an assignment to its variables, by and we denote the maximal number of simultaneously satisfiable clauses and the number of clauses satisfied by , respectively.
If density is such that for some and , then there is such that whp Local Search on a 3-CNF () returns an assignment such that , where denotes the maximal number of clauses in that can be simultaneously satisfied and denotes the number of clauses satisfied by .
If is constant then is also constant.
Proof: As in the proof of Lemma 9, it can be shown that for that satisfies conditions of this theorem there is such that whp a random [random planted] formula has at least crowns. If is a constant, is also a constant. For a random assignment , whp the variables of at least crowns are assigned zeroes. Such an all-zero assignment of a crown cannot be changed by the local search.
Then we move on to proving Proposition 12.
Let , and . The local search on a 3-CNF from whp ends up in a proper local minimum.
If then Proposition 12 follows from Theorem 11. So in what follows we assume that . The main tool for proving Proposition 12 is a coupling of local search (LS) with the algorithm Straight Descent (SD) that on each step chooses at random a variable assigned to 0 and changes its value to 1. Obviously SD is not a practical algorithm, since to apply it we need to know the solution. For the purposes of our analysis we modify SD as follows. At each step SD chooses a variable at random, and if it is assigned 0 changes its value (see Fig. 3(a)). The algorithm LS is modified in a similar way (see Fig. 3(b)).
(a) Straight Descent (SD)
Input: with the all-ones solution, Boolean tuple ,
Output: The all-ones Boolean tuple.
while there is a variable assigned 0
    pick uniformly at random variable from the set of all variables
    if then set
(b) Local Search (LS), modified
Input: 3-SAT formula , Boolean tuple ,
Output: Boolean tuple , which is a local minimum of .
while is not a local minimum
    pick uniformly at random variable from the set of all variables
    if the number of clauses that can be made satisfied by flipping the value of is strictly greater than the number of those made unsatisfied
        then change the value of
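The modified Straight Descent is simple enough to state in runnable form. The sketch below is an illustration only: the function name, the boolean-list encoding of the tuple, and the returned step counter are assumptions made here, and the planted solution is taken to be all-ones as in the text.

```python
import random

def straight_descent(start, rng=None):
    """Modified SD: at each step pick a variable uniformly at random from
    all variables and, if it is currently 0, set it to 1. The formula is
    never consulted. Returns the final (all-ones) tuple and the number of
    steps taken, including steps that changed nothing."""
    rng = rng or random.Random()
    a = list(start)                 # work on a copy of the input tuple
    zeros = a.count(False)
    steps = 0
    while zeros:
        i = rng.randrange(len(a))   # uniform over ALL variables
        steps += 1
        if not a[i]:
            a[i] = True
            zeros -= 1
    return a, steps
```

Because the variable is drawn from all variables rather than only from those assigned 0, a run behaves like a coupon-collector process, which is what makes its running time easy to bound.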
It is easy to see that the vector obtained by SD at step does not depend on the formula. And since SD treats all variables equally we can make the following
If starts its work at a random vector with ones and after step , , it arrives at a vector with ones, then this vector is selected uniformly at random from all vectors with ones.
Proof: Let us denote the probability that at step SD arrives at vector , conditional on starting from a vector with ones, by . We prove by induction on that for any with ones. We denote this number by . As the starting vector is random, it is obvious for . Then for and any vector with ones we have
where is the number of variables in the formula and goes over all vectors that can be obtained from by flipping a one into zero. It does not depend on a particular vector .
We will frequently use the following two properties of the algorithm SD.
Whp the running time of SD does not exceed .
Proof: For a variable the probability that it is not considered for steps equals . So for this probability equals . Applying the union bound over all variables we obtain the required statement.
Given a 3-CNF and an assignment we say that a variable is -righteous if the number of clauses voting for it to be one is greater by at least than the number of clauses voting for it to be zero. Let and be a Boolean tuple. The ball of radius with the center at is the set of all tuples of the same length as at Hamming distance at most from . Let and be arbitrary functions and be an integer constant. We say that a set of -tuples is -safe, if for any the number of variables that are not -righteous does not exceed . A run of SD is said to be -safe if at each step of this run the ball of radius with the center at the current assignment is -safe.
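To make the righteousness condition concrete, the following sketch lists the variables that fail it for a single assignment. It is an illustration only: the names `vote_margin` and `non_righteous`, the parameter `d` for the required margin, and the signed-integer literal encoding are assumptions made here. Checking safety of a whole ball would repeat this test for every tuple in the ball, which is feasible only for very small radii.

```python
def vote_margin(clauses, a, v):
    """(# clauses voting for v to be 1) - (# clauses voting for v to be 0).

    A clause casts a vote on v only when its other two literals are
    unsatisfied under a; the sign of v's literal decides the direction.
    """
    margin = 0
    for c in clauses:
        for lit in c:
            if abs(lit) != v:
                continue
            others_sat = any(a[abs(l) - 1] == (l > 0) for l in c if l != lit)
            if not others_sat:
                margin += 1 if lit > 0 else -1
    return margin

def non_righteous(clauses, a, d):
    """Variables (1-based) whose vote margin towards 1 is smaller than d."""
    return [
        v for v in range(1, len(a) + 1) if vote_margin(clauses, a, v) < d
    ]
```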
Let for some . For any constants and there is a constant such that, for any , whp a run of SD on is -safe.
Proof: Consider a run of SD on with a random initial assignment. If SD starts its work at a tuple with ones, then at step it has ones. Then by Lemma 13 if at step the current assignment of SD has ones then it is drawn uniformly at random from all vectors with ones. Event Unsafe “run of SD is not -safe” is a union of events “at step of SD’s run the ball of radius with the center at the current assignment is not -safe”. We will use the union bound to show that probability of Unsafe is small.
Let be a Boolean -tuple having positions filled with 1s. Since whp the number of 1s in the initial assignment is at least , for every step the number of 1s is at least . Let be an arbitrary set of variables with . We consider events “every variable is not -righteous” and “the total number of votes given by clauses for variables in to be 1 does not exceed the total number of votes given by clauses for variables in to be 0 plus .”
The same technique as in Lemma 6 can be used to show that the probability of and consequently the probability of is bounded above by for some constant , not dependent on . By inequality (3), there are at most distinct assignments in the -neighborhood of SD and distinct subsets of size . So for close to 1 the union bound implies that whp does not take place for any tuple, any subset of variables at any step which completes the proof of the lemma.
For CNFs we denote by their conjunction.
We will need formulas that are obtained from a random formula by adding some clauses in an ‘adversarial’ manner. Following  we call distributions for such formulas semi-random. However, the type of semi-random distributions we need is different from that in . Let be some constant. A formula is sampled according to semi-random distribution if , where is sampled according to and contains at most clauses and is given by an adversary.
If then for any constants and there is a constant such that for any a run of on is whp