# The Space of Solutions of Coupled XORSAT Formulae

###### Abstract

The XOR-satisfiability (XORSAT) problem deals with a system of $n$ Boolean variables and $m$ clauses. Each clause is a linear Boolean equation (XOR) of a subset of the variables. A $K$-clause is a clause involving $K$ distinct variables. In the random $K$-XORSAT problem a formula is created by choosing $m$ $K$-clauses uniformly at random from the set of all possible clauses on $n$ variables. The set of solutions of a random formula exhibits various geometrical transitions as the ratio $m/n$ varies.

We consider a coupled $K$-XORSAT ensemble, consisting of a chain of random XORSAT models that are spatially coupled across a finite window along the chain direction. We observe that the threshold saturation phenomenon takes place for this ensemble and we characterize various properties of the space of solutions of such coupled formulae.

## I Introduction

Spatial coupling is a technique that starts with a graphical model and a “hard” computational task (e.g., decoding or more generally inference) and creates from this a new graphical model for the same task that has “locally” the same structure but is computationally “easy”. Kudekar, Richardson and Urbanke [1, 2] made the basic observation (in the context of coding theory) that on spatially-coupled graphs, low-complexity (message passing) algorithms suffice to achieve optimal performance. Despite its very recent introduction, spatial coupling has already had significant impact on coding, communications, and compressive sensing (see for example [3]-[9]) and has led to new insights in computer science and statistical physics (see [12]).

We consider the effect of spatial coupling on random XORSAT formulae. The XORSAT problem is the simplest instance among the class of constraint satisfaction problems (CSP). CSPs arise in many branches of science, e.g., in statistical physics (spin glasses), information theory (LDPC codes), and in combinatorial optimization (satisfiability, coloring). These CSPs are believed to share a number of common structural properties, but some models are inherently more difficult to investigate than others. It is therefore natural to start with relatively “simple” CSPs if one wants to learn more about the general behavior of this class of models.

It is relatively simple to capture the same basic properties in the XORSAT problem due to its direct connection with linear algebra. Among such properties, an important one is the geometry of the space of solutions, which, as was already understood a decade ago, displays very interesting phase transitions [14, 15]. Recently, in [16, 17], a fairly complete characterization of this geometry has been provided as a function of the ratio of the number of clauses to the number of variables. In particular, it is shown that for some range of values of this parameter, the space of solutions breaks into many disconnected “clusters”. It is widely believed that such a cluster structure is closely connected to the failure of standard message passing algorithms (e.g., the belief propagation algorithm) to find solutions. In other words, it is believed that there is a strong connection between the “hardness” of the problem and the geometry of the solution space. Therefore we call this regime the hard-SAT regime.

Consider now what happens when we spatially couple such formulae. As we will show in the following, a remarkable phenomenon called threshold saturation takes place: the belief propagation algorithm succeeds in solving the problem in the hard-SAT regime of the original (non-coupled) model. This immediately raises the question of how the space of solutions changes under spatial coupling. In other words, what happens to the clusters? A naive guess is that these clusters become connected. As we will see, the answer is yes!

Our main objective is to provide an explanatory picture of how the geometry of the solution space is altered under spatial coupling. This picture can be helpful in further understanding the mechanism of spatial coupling, as well as in gaining some intuition about the solution space of other coupled CSPs, or in designing efficient algorithms for solving them [12].

The outline of this paper is as follows. In Section I-A we introduce in detail the XORSAT problem and random $K$-XORSAT ensembles. We also briefly explain the related results on the geometry of the solution space of these random formulae. In Section II we introduce the coupled $K$-XORSAT ensemble. Using the results of [13] and [12] we then prove the threshold saturation phenomenon for this ensemble. Finally, we discuss the geometry of the space of solutions of this ensemble by a direct use of the techniques in [16].

### I-A The $K$-XORSAT Ensemble: Basic Setting

An XORSAT formula consists of $n$ Boolean variables $x_i$, $i \in \{1, \dots, n\}$, and a set of $m$ exclusive OR (XOR) constraints $c_1, \dots, c_m$. Each constraint $c_j$, called from now on a clause, is a linear equation consisting of the XOR of some variables being equal to a Boolean value $b_j$. The number of variables involved in a clause is called the length of the clause. A clause of length $K$ is typically called a $K$-clause, and a $K$-XORSAT formula is a formula consisting only of $K$-clauses. In matrix form, a $K$-XORSAT formula can be represented as the linear system

$$H x = b. \tag{1}$$

Here, the matrix $H$ is an $m \times n$ matrix with entries $H_{ji} \in \{0, 1\}$, and $H_{ji}$ is equal to $1$ if and only if clause $c_j$ contains the variable $x_i$. The vector $x$ is an $n$-component vector representing the variables and the vector $b$ is an $m$-component vector representing the clause values $b_j$. All operations are modulo $2$.

It is convenient to represent an XORSAT formula via a bipartite graph $G = (V \cup C, E)$, where we denote the set of variable nodes by $V$ and the set of clause nodes by $C$. We thus have $|V| = n$ and $|C| = m$. There is an edge between a clause $c \in C$ and a variable $x \in V$ if and only if $c$ contains $x$. The set of edges of $G$ is denoted by $E$.

Let us now explain the ensemble of random $K$-XORSAT formulae. Let $m = \alpha n$, where $\alpha$ is a positive real number called the clause density. To choose an instance from the $K$-XORSAT ensemble, we proceed as follows. There are $m$ clauses of length $K$ and $n$ variables. Each clause $c_j$ picks uniformly at random a subset of size $K$ of the variables and flips a fair coin to decide the value of $b_j$. All the above steps are taken independently of each other. In other words, the random $K$-XORSAT ensemble is defined by taking $b$ uniformly at random in $\{0,1\}^m$ and $H$ uniformly at random from the set of all the $m \times n$ matrices with entries in $\{0,1\}$ that have exactly $K$ ones per row.
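The sampling procedure just described can be sketched in a few lines of Python. This is only an illustration: the function names `sample_xorsat` and `is_solution` are hypothetical, clauses are stored as lists of variable indices rather than as rows of a matrix, and the number of clauses is taken to be the integer part of the density times the number of variables.

```python
import random


def sample_xorsat(n, alpha, K, seed=None):
    """Draw one formula: int(alpha * n) clauses, each on K distinct variables."""
    rng = random.Random(seed)
    m = int(alpha * n)
    # Each clause picks a uniformly random subset of K distinct variables ...
    clauses = [sorted(rng.sample(range(n), K)) for _ in range(m)]
    # ... and flips a fair coin for its right-hand side.
    b = [rng.randint(0, 1) for _ in range(m)]
    return clauses, b


def is_solution(clauses, b, x):
    """Check that every clause XORs to its prescribed value under assignment x."""
    return all(sum(x[i] for i in c) % 2 == bj for c, bj in zip(clauses, b))


if __name__ == "__main__":
    clauses, b = sample_xorsat(n=20, alpha=0.5, K=3, seed=1)
    print(len(clauses), clauses[:2], b[:2])
```

Storing each clause as an index list is equivalent to storing the corresponding row of the matrix with exactly $K$ ones, and is more convenient for the graph-based algorithms discussed next.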

One objective of the XORSAT problem is to determine whether a given formula has a solution or not. Standard linear algebraic methods allow us to accomplish this task with complexity $O(n^3)$. Here, we discuss a linear complexity algorithm for solving XORSAT formulae called the peeling algorithm. In our case, this algorithm is known to be equivalent to the belief propagation (BP) algorithm.

### I-B The Peeling Algorithm

We begin with a brief explanation of the algorithm. Let $F$ be an XORSAT formula. As mentioned previously, we can think of $F$ as a bipartite graph $G$. The algorithm starts with $G$ and in each step shortens $G$ until we either reach the empty graph or cannot shorten it any further. Assume now that there exists a variable $x$ in $G$ with degree $0$ or $1$. In the former case, the value of the variable can be chosen freely. In the latter case, assuming $c$ is the check node connected to $x$, it is easy to see that the value of $x$ can be determined after the values of the other variables connected to $c$ are specified. Hence, without loss of generality, we can remove $x$ and its neighboring clause (if any) from $G$ and search for a solution for the residual graph $G \setminus x$. In other words, finding a solution for $G$ is equivalent to finding a solution for $G \setminus x$. As a result, we can peel the variable $x$ from $G$ and repeat the same procedure on $G \setminus x$. We continue this process until the residual graph is empty or has no more variables with degree at most $1$.

The final graph reached by the peeling procedure is called the $2$-core or the maximal stopping set of $G$. We recall that a stopping set of $G$ is a subgraph of $G$ containing a set of clauses and a set of variables where each clause has degree $K$ and all the variables have degree at least $2$. The $2$-core is a stopping set of maximum size. The peeling algorithm determines the $2$-core of a graph $G$. If the $2$-core is empty then the algorithm succeeds and it is easy to see that the solution can be explicitly found by backtracking.
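The peeling and backtracking steps described above can be sketched as follows. This is a minimal illustration rather than an optimized implementation; the names `peel` and `solve_by_peeling` are hypothetical, and clauses are assumed to be lists of distinct variable indices.

```python
def peel(n, clauses):
    """Peel variables of degree 0 or 1.

    Returns (order, core): order lists (variable, clause index or None) in the
    order of removal; core is the set of clause indices that survive peeling.
    """
    adj = [set() for _ in range(n)]          # clauses currently touching each variable
    for j, c in enumerate(clauses):
        for i in c:
            adj[i].add(j)
    core = set(range(len(clauses)))
    removed = [False] * n
    stack = [i for i in range(n) if len(adj[i]) <= 1]
    order = []
    while stack:
        i = stack.pop()
        if removed[i]:
            continue
        removed[i] = True
        j = next(iter(adj[i]), None)         # the unique neighbouring clause, if any
        order.append((i, j))
        if j is not None:
            core.discard(j)
            for k in clauses[j]:
                adj[k].discard(j)
                if not removed[k] and len(adj[k]) <= 1:
                    stack.append(k)
    return order, core


def solve_by_peeling(n, clauses, b):
    """Return a solution found by backtracking, or None if the 2-core is non-empty."""
    order, core = peel(n, clauses)
    if core:
        return None
    x = [0] * n                              # variables peeled with degree 0 are set to 0
    for i, j in reversed(order):
        if j is not None:                    # degree-1 variable: fixed by its clause
            x[i] = (b[j] + sum(x[k] for k in clauses[j] if k != i)) % 2
    return x


if __name__ == "__main__":
    clauses, b = [[0, 1, 2], [1, 2, 3], [2, 3, 4]], [1, 0, 1]
    print(solve_by_peeling(5, clauses, b))
```

Backtracking works because, at the moment a degree-one variable is peeled together with its clause, all other variables of that clause are still present in the graph, so they are assigned earlier when the removal order is traversed in reverse.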

The peeling algorithm has an equivalent message passing (MP) formulation. It can be shown that the message passing rules for the peeling algorithm are also equivalent to the BP update rules. Further, if the formula comes from the $K$-XORSAT ensemble, then one can analyze the behavior of the peeling algorithm in a probabilistic framework called density evolution (DE). The DE equations can be cast into a simple scalar recursion [19]

$$x^{(t+1)} = 1 - \exp\left(-K \alpha \left(x^{(t)}\right)^{K-1}\right), \tag{2}$$

with $x^{(0)} = 1$. Here, $x^{(t)}$ is related to the fraction of edges present in the remaining graph at time $t$. For the peeling algorithm to succeed, the value of $x^{(t)}$ should tend to $0$ as $t$ increases. This is possible if and only if the equation

$$x = 1 - \exp\left(-K \alpha\, x^{K-1}\right) \tag{3}$$

has a unique solution in $[0, 1]$, which is the trivial fixed point $x = 0$. The net result is that the peeling algorithm succeeds with high probability (w.h.p) for $\alpha < \alpha_d$, where $\alpha_d$ is the supremum of the clause densities $\alpha$ for which (3) admits only the trivial solution.
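As a numerical illustration, the scalar recursion for $K$-XORSAT found in [19], $x \mapsto 1 - \exp(-K\alpha x^{K-1})$ started from $x = 1$, can be iterated directly, and the peeling threshold can be estimated by bisection on the clause density. The sketch below assumes this standard form of the recursion and $K \geq 3$; the function names and the grid-based fixed point test are illustrative choices, not part of the original analysis.

```python
import math


def de_iterate(alpha, K, steps=1000):
    """Iterate the scalar DE recursion from x = 1 and return the final value."""
    x = 1.0
    for _ in range(steps):
        x = 1.0 - math.exp(-K * alpha * x ** (K - 1))
    return x


def has_nontrivial_fixed_point(alpha, K, grid=10000):
    """True if x <= 1 - exp(-K*alpha*x^(K-1)) for some grid point x in (0, 1]."""
    for i in range(1, grid + 1):
        x = i / grid
        if x <= 1.0 - math.exp(-K * alpha * x ** (K - 1)):
            return True
    return False


def peeling_threshold(K, lo=0.0, hi=2.0, tol=1e-6):
    """Bisection for the largest density at which only the trivial fixed point exists."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if has_nontrivial_fixed_point(mid, K):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(de_iterate(0.7, 3), de_iterate(0.9, 3))
    print(peeling_threshold(3))
```

For $K = 3$ the estimate is close to $0.818$, the known threshold for the appearance of a $2$-core in random $3$-XORSAT.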

### I-C Phase Transitions and the Space of Solutions

For a random $K$-XORSAT formula with $\alpha < \alpha_d$, where $\alpha_d$ denotes the peeling threshold of Section I-B, the peeling algorithm succeeds w.h.p and hence the formula has a solution. What happens for $\alpha > \alpha_d$? It is easy to see that for $\alpha > 1$ the formula has w.h.p no solution. In fact, there exists a critical density $\alpha_s$ such that when the clause density crosses $\alpha_s$, the $K$-XORSAT ensemble undergoes a phase transition from almost certain solvability to almost certain unsatisfiability. The value $\alpha_s$ is called the SAT/UNSAT threshold and is given as

(6) |

The value of $\alpha_d$ separates two phases. For $\alpha < \alpha_d$ the graph has no $2$-core, whereas for $\alpha_d < \alpha < \alpha_s$ the graph has a large $2$-core and no algorithm is known to find a solution in linear time. These two phases also differ in the structure of their solution space, as we explain now. We assume without loss of generality that the vector $b$ is the all-zero vector. Note here that a non-zero $b$ affects the solution space of the homogeneous system only by a shift and hence does not alter its structure.

The solutions of a formula are members of the Hamming cube $\{0,1\}^n$. For $x, x' \in \{0,1\}^n$ we let $d(x, x')$ denote their Hamming distance. For $\alpha < \alpha_d$, there exists a constant such that w.h.p the following holds [16]. Let . Consider two solutions . Then, there exists a sequence of solutions such that . Thus, for $\alpha < \alpha_d$, the space of solutions can be imagined as a big cluster in which one can walk from one solution to another by a number of steps that are of size at most (sub-linear in $n$).

For $\alpha_d < \alpha < \alpha_s$ the space of solutions shatters into an exponential number of clusters. Each cluster corresponds to a solution of the $2$-core in the following sense. Given an assignment , we denote by its projection onto the core. In other words, is the vector of those entries in that correspond to vertices in the core. Now, for a solution of the core, , we define the cluster associated to as the set of solutions to the whole formula such that . Hence, for each solution of the core, there exists one cluster in the space of solutions of the formula. It can be shown that any two solutions of the core differ in a number of positions that is linear in $n$ [19]. Thus, any two solutions belonging to two different clusters also differ in a linear number of positions. However, each cluster by itself has a connected structure in the sense that for any two solutions belonging to the cluster, there exists a sequence of solutions inside that cluster such that . Figure 1 shows a symbolic picture of the clustering of solutions in the two phases.

## II The Coupled $K$-XORSAT Ensemble

This ensemble represents a chain of coupled underlying ensembles. Figure 2 is a visual aid but gives only a partial view. We consider clause positions and variable positions . At each variable position , we lay down Boolean variables. Also, for each check position , we lay down clauses of length . So in total we have variables and clauses. Let us now specify how the set of edges, , is chosen. Each clause at a position , chooses its variables via the following procedure. We first pick a position with uniformly random in the window , then we pick a variable uniformly at random among all the variables located at position , and finally we connect the clause and the variable. The value of is also chosen by flipping a fair coin.

This ensemble is called the (spatially) coupled $K$-XORSAT ensemble and an instance of it is called a coupled formula.

It is also useful to consider another ensemble of coupled graphs where positions are placed on a ring. This ensemble is called the ring ensemble and is obtained as follows. We consider clause positions and variable positions . At each variable position , we lay down Boolean variables. Also, for each check position , we lay down clauses of length . So in total we have variables and clauses. Each clause at position , chooses its variables via the following procedure. We first pick a position with uniformly random in the window , then we pick a variable node uniformly at random among all the variables located at position , and finally we connect the clause and the variable. The value of is also chosen by flipping a fair coin. It can be easily seen that by picking a random ring formula and removing all of its clauses that are placed at positions we generate a coupled formula.
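The ring construction, and the way a coupled formula is obtained from it by deleting clauses, can be sketched as below. The position ranges are assumptions made only for this illustration: positions are labelled $0, \dots, L-1$ on the ring, a clause at position $z$ draws each of its variables from a position $z + k \bmod L$ with $k$ uniform in a window of size $w$, and "opening" the ring removes the clauses whose window wraps around. The names `sample_ring` and `open_ring` and the parameters `L`, `w` are hypothetical.

```python
import random


def sample_ring(L, w, n, alpha, K, seed=None):
    """Sample a ring formula: n variables and int(alpha * n) clauses per position.

    Each clause is a tuple (position, list of K variable indices, value).
    Variable r at position p has global index p * n + r.
    """
    rng = random.Random(seed)
    m = int(alpha * n)
    clauses = []
    for z in range(L):
        for _ in range(m):
            variables = []
            for _ in range(K):
                pos = (z + rng.randrange(w)) % L      # uniform position in the window
                variables.append(pos * n + rng.randrange(n))
            clauses.append((z, variables, rng.randint(0, 1)))
    return clauses


def open_ring(clauses, L, w):
    """Drop the clauses whose window wraps around the ring (assumed convention)."""
    return [c for c in clauses if c[0] + w - 1 < L]


if __name__ == "__main__":
    ring = sample_ring(L=10, w=3, n=40, alpha=0.5, K=3, seed=0)
    chain = open_ring(ring, L=10, w=3)
    print(len(ring), len(chain))
```

Because each of the $K$ picks is made independently, a clause may occasionally select the same variable twice under this procedure; the sketch keeps such clauses as they are.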

### II-A Threshold Saturation

The peeling algorithm can be used for the coupled and ring formulae in the same manner as explained above. We denote by and the threshold for the emergence w.h.p of a non-empty $2$-core for the coupled and ring ensembles. We also denote the SAT/UNSAT threshold for these ensembles by and , respectively.

Let us first consider the coupled ensemble. A similar message passing analysis as above yields a set of one-dimensional coupled recursions

(7) |

with boundary values for and . This recursion results in the one-dimensional fixed point equations

(8) |

with boundary values for and . We recall that is the highest clause density for which the fixed point equation (8) admits a unique solution that is the all-zero solution.
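To see threshold saturation numerically, one can iterate a coupled recursion of the kind commonly used in the spatial coupling literature. The specific form below is an assumption made for illustration and is not copied from (7): the value at position $z$ is updated as $1 - \exp\bigl(-\tfrac{K\alpha}{w}\sum_{k=0}^{w-1}\bigl(\tfrac{1}{w}\sum_{l=0}^{w-1} x_{z-k+l}\bigr)^{K-1}\bigr)$, with the value $0$ imposed outside the chain. The names `coupled_de` and `scalar_de` are hypothetical.

```python
import math


def coupled_de(alpha, K, L, w, steps):
    """Iterate an assumed coupled DE recursion on positions 0..L-1 (zero outside)."""
    x = [1.0] * L

    def value(z):
        return x[z] if 0 <= z < L else 0.0   # boundary condition

    for _ in range(steps):
        new = []
        for z in range(L):
            total = 0.0
            for k in range(w):
                avg = sum(value(z - k + l) for l in range(w)) / w
                total += avg ** (K - 1)
            new.append(1.0 - math.exp(-K * alpha * total / w))
        x = new
    return x


def scalar_de(alpha, K, steps):
    """Uncoupled scalar recursion, for comparison."""
    x = 1.0
    for _ in range(steps):
        x = 1.0 - math.exp(-K * alpha * x ** (K - 1))
    return x


if __name__ == "__main__":
    alpha, K = 0.85, 3
    print("uncoupled:", scalar_de(alpha, K, 3000))
    print("coupled max:", max(coupled_de(alpha, K, L=20, w=3, steps=3000)))
```

At a density between the peeling threshold and the SAT/UNSAT threshold of the uncoupled ensemble, the scalar recursion stays at a non-trivial fixed point, while under this assumed coupled recursion the boundary drives all positions to zero.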

###### Lemma 1

We have

(9) |

The fact that tends to as grows large, follows from the interpolation arguments of [12]. For the other limit, from (6) it can be shown that corresponds to the potential threshold (defined in [13]) of the scalar recursion (2). Hence, it follows from [13, Theorem 1] that tends to . As a result, as and grow large the peeling algorithm succeeds at densities very close to . Table I contains some numerical predictions of .

Table I: Numerical predictions of the thresholds as a function of $K$.

For the ring ensemble, the fixed point equations for the peeling algorithm become

(10) |

It is easy to see that for , the above set of fixed point equations admits a nontrivial solution of the following form. For , we have , where is the largest solution of the fixed point (FP) equation in (3). For , it is also clear that there is only one solution, which is the all-zero solution. Hence, for the ring ensemble we obtain for any choice of and

(11) |

By combining (11) and (9), one observes the following remarkable phenomenon. Let and be large but finite numbers such that . For these choices of we have from (9) that . Also, let and pick a formula from the ring ensemble. We deduce from (11) that such a formula has a non-trivial $2$-core. Furthermore, it can be shown that the $2$-core has a circular structure and for each position , it has variables and clauses. Now, assume that from this $2$-core we remove all the clauses at positions (i.e., we open the ring, see Figure 3) and run the peeling algorithm on the remaining graph. From (9) we deduce that the peeling algorithm succeeds on the remaining graph in the sense that it continues all the way until it reaches the empty graph. Note here that the ratio of the clauses that we remove from the $2$-core is which vanishes as we choose .

### II-B The Set of Solutions

We now focus on the geometrical properties of the space of solutions of the coupled and ring formulae. Given the fact that for a ring formula has a core, we deduce that for this region of the set of solutions of a ring formula resembles the set of solutions of an uncoupled formula, which was explained in Section I-C. In other words, the space of solutions of a ring formula shatters into exponentially many clusters (see Figure 1). Each cluster corresponds to a unique solution of the $2$-core. Also, each cluster is itself connected and the distance between any two different clusters is linear in the number of variables. Now, assume and are large but finite numbers such that . For these choices of we have from (9) that . Let and pick a formula from the coupled ensemble. Let us denote this formula by and its set of solutions by . This formula w.h.p does not have a core. Also, we keep in mind that a coupled formula can be obtained from a typical ring formula by removing the clauses at the last positions. We denote such a ring formula by and its set of solutions by . We know that shatters into exponentially many clusters. It is easy to see that . As a result contains all the clusters of . Given these facts, what does the space look like? In particular, how are the two spaces and related? We now show that the space is a connected cluster.

###### Theorem 2

Let . Consider a random coupled $K$-XORSAT formula and let be its set of solutions. The set is a connected cluster in the following sense. There exists a such that for any two solutions , there exists a sequence of solutions such that .

Proof sketch: The proof of this theorem essentially mimics the proof of Theorem 2 in [16] except for the last part. For the sake of brevity, we only give a sketch of the proof. The proof goes by showing that the set of solutions of the equation , i.e. the kernel of the matrix , has a sparse basis. In other words, there exist vectors that span the space , and each of the vectors has a low weight, i.e., where denotes the Hamming weight. We call such a basis a sparse basis. It is easy to see that if such a basis exists for the space of solutions, then the result of the theorem holds.

We now proceed by explicitly constructing such a basis. We first show that if the matrix has no core, then the peeling procedure provides us with a natural choice of a basis for . We then show that such a basis is indeed sparse. In this regard, we consider a slightly modified, but equivalent, version of the peeling algorithm called the synchronous peeling algorithm. Given an initial formula (graph) , this algorithm consists of rounds . The residual graph at the end of round is denoted by . We also let . We denote the set of clauses, variables and edges removed at round by . Hence for we have . At each round , the algorithm considers the graph and removes all the variable nodes that have degree $1$ or less together with all the clauses (if any) connected to these variables. It is easy to see that synchronous peeling is, in a sense, a compressed version of the peeling algorithm mentioned in Section I-B. Assuming that the initial graph has no core, the final residual graph is empty.
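The round structure of synchronous peeling can be made concrete with a short sketch. The function name `synchronous_peel` and the representation of clauses as lists of distinct variable indices are assumptions of this illustration.

```python
def synchronous_peel(n, clauses):
    """Run synchronous peeling.

    Returns (rounds, core): rounds[t] is a pair (variables, clauses) removed in
    round t + 1, both as sorted lists; core is the set of surviving clauses.
    """
    adj = [set() for _ in range(n)]
    for j, c in enumerate(clauses):
        for i in c:
            adj[i].add(j)
    present = set(range(n))
    core = set(range(len(clauses)))
    rounds = []
    while True:
        # All variables of degree at most 1 in the current residual graph.
        batch = [i for i in present if len(adj[i]) <= 1]
        if not batch:
            break
        removed_clauses = set()
        for i in batch:
            removed_clauses |= adj[i]
        for j in removed_clauses:
            for k in clauses[j]:
                adj[k].discard(j)
        present -= set(batch)
        core -= removed_clauses
        rounds.append((sorted(batch), sorted(removed_clauses)))
    return rounds, core


if __name__ == "__main__":
    rounds, core = synchronous_peel(7, [[0, 1, 2], [2, 3, 4], [4, 5, 6]])
    for t, (vs, cs) in enumerate(rounds, start=1):
        print("round", t, "variables", vs, "clauses", cs)
    print("core:", core)
```

On this small chain-like example the first round removes the five variables of degree one together with all three clauses, and the second round removes the two variables that have become isolated.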

To ease the analysis, let us re-order the clauses and the variables in the following way. We start from the clauses in and order these clauses (in an arbitrary way) from to . We then consider clauses in and order them (in an arbitrary way) from to and so on. We do the same procedure for the variable nodes but with the following additional ordering. Within each set , the ordering is chosen in such a way that nodes that have degree in appear with a smaller index than the ones that have degree . Now, with such a re-ordering of the nodes in the graph, the matrix has the following fine structure. For the sets and , we let be the sub-matrix of that consists of elements of whose rows are and columns are . The matrix can be partitioned into block matrices where such that for , is the all-zero matrix and the diagonal blocks have a staircase structure. Here, by a staircase structure we mean that the set of columns of can be partitioned into groups such that the columns in are all-zero and the columns in have only their -th entry equal to $1$ and the rest are equal to $0$.

Given such a decomposition of , it is now easy to see how one can find a basis for its kernel. In fact, the matrix has essentially an upper triangular structure. With this structure, one can apply the method of back substitution [16, Lemma 3.4] to solve the equation and find the kernel of . Here, for the sake of brevity we just mention the final result. We partition into a disjoint union in a way that will be our set of independent variables and will be the set of dependent ones (i.e., can be expressed in terms of ). The partition is then constructed by letting and . For each , we construct by using the staircase structure of . We recall that the columns of have the partition . We then construct as , where is constructed from by removing an arbitrary element from it ( is empty if ). In other words, among the variables in we choose one as the dependent variable and let the others be independent variables in . We then let . With the sets and explained as above, let us reorder the variables in as followed by , i.e., we reorder the variables such that we can write . One can show that the columns of the matrix

(12) |

form a basis for the set of solutions. Here, the matrix denotes the identity matrix of size . Also, if then we have , where by we mean the distance between variables in the graph .

It is now easy to show that the Hamming weight of any column of is bounded above by the value , where by we mean the set of variables such that . In the last step, we argue that with high probability

(13) |

where and are finite constants. From (13), [16, Lemma 3.11], and the fact that the coupled ensemble has the same local structure as the un-coupled ensemble, we then deduce that w.h.p , where and are finite constants. It remains to justify (13). Consider the DE equations (7) starting from the initial point for and for at the boundaries. Let be a (very) small constant. It can be shown from [18] that there exists a constant such that for all . In other words, the effect of the boundary (i.e., for and ) propagates towards the positions in the middle in a wave-like manner and with a speed and hence at time all the values are small. Once the value of is sufficiently small, it converges to $0$ doubly exponentially fast. Hence, intuitively, the synchronous peeling algorithm needs w.h.p an extra steps to clear out the whole formula and the total time taken by peeling will be . Of course, this is just an intuitive argument. A formal analysis can be carried out similarly to [16, Lemma 3.11].

### II-C An Intuitive Picture of the Sparse Basis

As we mentioned at the end of Section II-A, a ring formula with density has a core. The core has a circular structure with roughly clauses and variables in each position . Further, any two solutions of the core differ in positions. Now, consider the formula obtained by removing the clauses at the last positions of the core (i.e., positions ). We call such a formula the opened core. We know that the peeling algorithm succeeds on the opened core and from Theorem 2 its solution space is a connected cluster and admits a sparse basis. So the distant solutions of the original core are now connected to each other via the new solutions spanned by this sparse basis. Our objective is now to see, at an intuitive level, what its sparse basis looks like.

All the variables in the opened core have degree at least two except the ones at the two boundaries (we call the first positions and the last positions the boundaries of the chain). Once the synchronous peeling algorithm begins, the effect of these low degree variables at the boundaries starts to propagate like a wave towards the middle of the chain. The algorithm evacuates the positions one-by-one with a constant speed approaching the middle [18]. A simple, albeit not very accurate, analogy is a chain of properly placed domino pieces. Once we topple a boundary piece the whole chain is toppled with roughly a constant speed.

Consider the peeling algorithm explained in Section I-B. This algorithm removes the variables in the graph one-by-one. Each variable that is removed in this algorithm has either degree $0$ or $1$. A variable that, at the time of being peeled, has degree $0$ is called an independent variable. A variable of degree $1$ is called a dependent variable. One can easily see that the definition of an independent (dependent) variable is equivalent to the definition given in the proof of Theorem 2. In Theorem 2 we proved that the opened core has a sparse basis. The number of elements of the basis is equal to the number of independent variables explored during the peeling algorithm. Furthermore, there is a one-to-one correspondence between the independent variables and the elements of the sparse basis, as we explain now.

Consider the synchronous peeling procedure defined in the proof of Theorem 2. The synchronous peeling procedure is a compressed version of the peeling algorithm in the following sense. At any step of synchronous peeling, we peel all the variables in the remaining graph that have degree $0$ or $1$. Let us now denote the graph of the opened core by . Consider an independent variable and assume that the variable is removed at step of the synchronous peeling algorithm. Let be the set of all the variables such that and is peeled at some time before (here, denotes the graph distance between the variables and in the opened core). We also include in any check node (together with its edges) whose variables are all inside . Intuitively, corresponds to the history of the variable with respect to the peeling procedure. Figure 4 illustrates these concepts via a simple example.

As we explained above, the (synchronous) peeling procedure on the opened core propagates like a wave from the boundaries towards the middle of the core, with a constant speed . As a result, if the variable is at a (variable) position , then we have . (Of course, this relation is true for most of the independent variables in the opened core; there is a vanishing fraction of variables for which it takes steps to be peeled off.) As a result, when is large and , then is w.h.p a tree whose leaf nodes are located at one of the boundaries of the opened core (see Figure 4). Let us now see what the basis vector corresponding to the independent variable looks like. One can think of as a sub-graph or a sub-formula of . Also, since we are solving the homogeneous equation , a solution of this sub-formula can naturally be extended (lifted) to a solution of the whole opened core by simply assigning $0$ to the variables outside the sub-formula. Consider a solution of the sub-formula for which the independent variable takes the value $1$. Since the peeling succeeds on and is an independent variable, such a solution exists (one can find such a solution by assigning $1$ to the independent variable and then backtracking on its history sub-graph). Such a solution, when extended to a solution of the opened core, is the corresponding basis element for the variable .
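The correspondence between independent variables and basis elements can be illustrated on a small homogeneous formula with an empty core. The sketch below uses sequential peeling and backtracking over the whole removal order rather than over the history sub-graph only; the resulting vectors form a basis of the kernel, though not necessarily the same basis as the one constructed in the proof. The names `peel` and `kernel_basis_by_peeling` are hypothetical.

```python
def peel(n, clauses):
    """Sequential peeling; returns (order, core) as (variable, clause or None) pairs."""
    adj = [set() for _ in range(n)]
    for j, c in enumerate(clauses):
        for i in c:
            adj[i].add(j)
    core = set(range(len(clauses)))
    removed = [False] * n
    stack = [i for i in range(n) if len(adj[i]) <= 1]
    order = []
    while stack:
        i = stack.pop()
        if removed[i]:
            continue
        removed[i] = True
        j = next(iter(adj[i]), None)
        order.append((i, j))
        if j is not None:
            core.discard(j)
            for k in clauses[j]:
                adj[k].discard(j)
                if not removed[k] and len(adj[k]) <= 1:
                    stack.append(k)
    return order, core


def kernel_basis_by_peeling(n, clauses):
    """One kernel vector per independent variable, or None if a core remains."""
    order, core = peel(n, clauses)
    if core:
        return None
    independent = [i for i, j in order if j is None]   # peeled with degree 0
    basis = []
    for f in independent:
        x = [0] * n
        x[f] = 1                                       # this independent variable is 1, others 0
        for i, j in reversed(order):                   # backtracking
            if j is not None:                          # dependent variable
                x[i] = sum(x[k] for k in clauses[j] if k != i) % 2
        basis.append(x)
    return basis


if __name__ == "__main__":
    for v in kernel_basis_by_peeling(7, [[0, 1, 2], [2, 3, 4], [4, 5, 6]]):
        print(v)
```

Each vector equals $1$ on its own independent variable and $0$ on all other independent variables, so the vectors are linearly independent; when the core is empty their number equals the number of variables minus the number of clauses.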

## References

- [1] S. Kudekar, T. Richardson, R. Urbanke, Threshold saturation via spatial coupling: why convolutional LDPC ensembles perform so well over the BEC, [online] Available: arXiv 1001.1826 [cs.IT].
- [2] S. Kudekar, T. Richardson, R. Urbanke, Spatially Coupled Ensembles Universally Achieve Capacity under Belief Propagation, [online] Available: arXiv:1201.2999 [cs.IT].
- [3] A. Yedla, P. S. Nguyen, H. D. Pfister, K. R. Narayanan, “Universal Codes for the Gaussian MAC via Spatial Coupling,” In proceedings of Allerton 2011, TX, USA (Sept. 2011).
- [4] K. Takeuchi, T. Tanaka, and T. Kawabata, Improvement of BP-Based CDMA Multiuser Detection by Spatial Coupling, [online] Available: lanl.arxiv.org no 1102.3061[cs.IT]
- [5] S. Kudekar, K. Kasai, Spatially Coupled Codes over the Multiple Access Channel, in lanl.arxiv.org no 1102.2856[cs.IT]
- [6] V. Aref, N. Macris, R. Urbanke and M. Vuffray, “Lossy Source Coding via Spatially Coupled LDGM Ensembles”, in proceedings of ISIT 2012, pp. 373-377, (July 2012).
- [7] S. Kudekar and H. D. Pfister, “The Effect of Spatial Coupling on Compressive Sensing,” in Proc. of the Allerton Conf. on Communications, Control, and Computing, Monticello, IL, USA, 2010.
- [8] F. Krzakala, M. Mezard, F. Sausset, Y. Sun, and L. Zdeborova, “Statistical Physics-Based Reconstruction in Compressed Sensing,” CoRR, [online] Available: arXiv:1109.4424 [cs.IT].
- [9] D. Donoho, A. Javanmard, and A. Montanari, “Information-Theoretically Optimal Compressed Sensing via Spatial Coupling and Approximate Message Passing,” [online] Available: arXiv:1112.0708 [cs.IT].
- [10] S. Hamed Hassani, N. Macris, R. Urbanke, Coupled graphical models and their threshold, Information Theory Workshop (ITW) IEEE Dublin (2010); also in lanl.arxiv.org no 1105.0785[cs.IT]
- [11] S. H. Hassani, N. Macris, R. Urbanke, Chains of Mean Field Models, J. Stat. Mech. (2012) P02011.
- [12] S. H. Hassani, N. Macris, R. Urbanke, Threshold Saturation in Spatially Coupled Constraint Satisfaction Problems, J. Stat. Phys, (2012)
- [13] A. Yedla, Y. Jian, P. S. Nguyen and H. D. Pfister, “A Simple Proof of Threshold Saturation for Coupled Scalar Recursions,” In proceedings of ISTC 2012, Gothenburg, Sweden (August 2012).
- [14] O. Dubois and J. Mandler, “The 3-XORSAT Threshold”, In proc. of 43rd IEEE FOCS pp.769-778 (2002).
- [15] M. Mezard, F. Ricci-Tersenghi, R. Zecchina, “Alternative Solutions to Diluted p-spin Models and XORSAT Problems”, J. Stat. Phys, vol 111, pp.505-533 (2003)
- [16] M. Ibrahimi, Y. Kanoria, M. Kraning, and A. Montanari, “The Set of Solutions of Random XORSAT Formulae,” Proc. SODA 2012.
- [17] D. Achlioptas and M. Molloy, “The Solution Space Geometry of Random Linear Equations”, arXiv:1107.5550 [cs.DS].
- [18] S. Kudekar, T. Richardson and R. Urbanke, “Wave-Like Solutions of General One-Dimensional Spatially Coupled Systems,” [online] Available: arXiv:1208.5273 [cs.IT].
- [19] M. Mézard and A. Montanari, “Information, Physics and Computation”, Oxford University Press (2009).