# Analysis of the Min-Sum Algorithm for Packing and Covering Problems via Linear Programming

Guy Even School of Electrical Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel. E-mail: guy@eng.tau.ac.il.    Nissim Halabi School of Electrical Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel. E-mail: nissimh@eng.tau.ac.il.
###### Abstract

Message-passing algorithms based on belief-propagation (BP) are successfully used in many applications including decoding error correcting codes and solving constraint satisfaction and inference problems. BP-based algorithms operate over graph representations, called factor graphs, that are used to model the input. Although in many cases BP-based algorithms exhibit impressive empirical results, not much has been proved when the factor graphs have cycles.

This work deals with packing and covering integer programs in which the constraint matrix is zero-one, the constraint vector is integral, and the variables are subject to box constraints. We study the performance of the min-sum algorithm when applied to the corresponding factor graph models of packing and covering LPs.

We compare the solutions computed by the min-sum algorithm for packing and covering problems to the optimal solutions of the corresponding linear programming (LP) relaxations. In particular, we prove that if the LP has an optimal fractional solution, then for each fractional component, the min-sum algorithm either computes multiple solutions or the solution oscillates below and above the fraction. This implies that the min-sum algorithm computes the optimal integral solution only if the LP has a unique optimal solution that is integral.

The converse is not true in general. For a special case of packing and covering problems, we prove that if the LP has a unique optimal solution that is integral and on the boundary of the box constraints, then the min-sum algorithm computes the optimal solution in pseudo-polynomial time.

Our results unify and extend recent results for the maximum weight matching problem by [Sanghavi et al., 2011] and [Bayati et al., 2011] and for the maximum weight independent set problem [Sanghavi et al., 2009].

## 1 Introduction

We consider optimization problems over the integers called packing and covering problems. Many optimization problems can be formulated as packing problems including maximum weight matchings and maximum weight independent sets. Optimization problems such as minimum weight set-cover and minimum weight dominating set are special cases of covering problems. The input for both types of problems consists of an $m \times n$ zero-one constraint matrix $A$, an integral constraint vector $b \in \mathbb{N}^m$, an upper bound vector $X \in \mathbb{N}^n$, and a weight vector $w \in \mathbb{R}^n$. An integral vector $x \in \mathbb{N}^n$ is an integral packing if $A \cdot x \le b$ and $x \le X$. In a packing problem the goal is to find an integral packing that maximizes $w^T \cdot x$. An integral vector $x \in \mathbb{N}^n$ is an integral covering if $A \cdot x \ge b$ and $x \le X$. In a covering problem the goal is to find an integral covering that minimizes $w^T \cdot x$.

Packing and covering problems generalize problems that are solvable in polynomial time (e.g., maximum matching) and problems that are NP-hard and even NP-hard to approximate (e.g., maximum independent set). The hardness of special cases of these problems implies that general algorithms for packing/covering problems are heuristic in nature. Two heuristics that are used in practice to solve such problems are linear programming (LP) and belief-propagation (BP).

Linear programming deals with optimizing a linear function over polyhedrons (subsets of the Euclidean space) [BT97]. Perhaps the most naive way to utilize linear programming in this setting is to solve the LP relaxation of the integer problem (i.e., relax the restriction that the variables are integers to the restriction that they are reals). If the result happens to be integral, then we are lucky and we have found an optimal integral packing or covering. A great deal of literature deals with characterizing problems for which this method works well (e.g., works on total unimodularity [Sch98]). In fact, LP decoding of error correcting codes works in the same fashion and has been proven to work well on average [FWK05, ADS09].

Belief-propagation is an algorithmic paradigm that deals with inference over graphical models [Pea88]. The graphical model that corresponds to packing/covering problems is a bipartite graph that represents the zero-one matrix $A$. We focus on a common variant of belief-propagation that is called the min-sum algorithm (or the max-product algorithm). In the variant we consider, the initial messages are all zeros and messages are not attenuated.

Our main result is a proof that the min-sum algorithm is not better than the heuristic based on linear programming. This proof holds for every instance of packing and covering problems described above with respect to the min-sum algorithm with zero initialization and no attenuation.

#### Previous Work.

Message-passing algorithms based on belief-propagation (BP) have been invented multiple times (see [Gal63, Vit67, Pea88]). Numerous papers report empirical results that demonstrate the usefulness of these algorithms for decoding error correcting codes, inference with noise, constraint satisfaction problems, and many other applications [Yed11]. It took a while until it was noticed that algorithms for decoding of Turbo codes [BGT93] and LDPC codes [Gal63] are special variants of BP [MMC98, Wib96].

In this paper we focus on a common variant of BP called the min-sum algorithm, and consider the case where messages are initialized to zero. The BP algorithm is a message-passing algorithm in which messages are sent along edges of a graph called the factor graph. The factor graph of packing/covering problems is a bipartite graph that represents the constraint matrix $A$. In essence, the value computed by the min-sum algorithm for $x_i$ equals the outcome of a dynamic programming algorithm over a path-prefix tree rooted at the vertex corresponding to the $i$th column of $A$. Since dynamic programming computes an optimal solution over trees, the min-sum algorithm is optimal when the factor graph is a tree [Pea88, Wib96]. A major open problem is to analyze the performance of BP (or even the min-sum algorithm) when the factor graph is not a tree. Execution of algorithms based on BP over graphs that contain cycles is often referred to as loopy BP.

Recently, a few papers have studied the usefulness of the min-sum algorithm for solving optimization problems compared to linear programming. Such a comparison for the maximum weight matching problem appears in [BSS08, BBCZ11, SMW11] with respect to constraints of the form $\sum_{e \ni v} x_e \le 1$ for every vertex $v$. Loosely speaking, the main result that they show for maximum weighted matching is that the min-sum algorithm is successful if and only if the LP heuristic is successful. The sufficient condition states that if the LP relaxation has a unique optimal solution and that solution is integral, then the min-sum algorithm computes this solution in pseudo-polynomial time. The necessary condition states that if the LP relaxation has a fractional optimal solution, then the min-sum algorithm fails. In [SSW09], the min-sum algorithm for the maximum weighted independent set problem was studied with respect to constraints $x_u + x_v \le 1$ for every edge $(u, v)$. They prove an analogous necessary condition and present a counter-example that proves that the sufficient condition (i.e., unique optimal solution for the LP that is integral) does not imply the success of the min-sum algorithm. The performance of the min-sum algorithm has also been studied for computing shortest $s$-$t$ paths [RT08] and min-cost flows [GSW12, BCMR13].

#### Our Results.

The results in this paper extend and generalize previous necessary conditions for the success of the min-sum algorithm. This necessary condition implies that, compared to the LP heuristic, the min-sum algorithm is not a better heuristic for solving packing and covering problems. Our contributions can be summarized as follows: (1) We consider a unified framework of packing and covering problems. Previous works deal with a zero-one constraint matrix $A$ that has two nonzero entries in each column (for maximum weight matching [BSS08, BBCZ11, SMW11]) or two nonzero entries in each row (for maximum weight independent set [SSW09]). Our results hold with respect to any zero-one constraint matrix $A$. (2) We allow box constraints, namely $0 \le x \le X$. Previous results consider only zero-one variables. (3) Our oscillation results hold also when the LP relaxation has multiple solutions. To obtain such a result, we consider the set of optimal values computed by the min-sum algorithm at each variable (rather than declare failure if there are multiple optimal values). We compare these sets with the optimal solutions of the LP relaxation and show a weak oscillation between even and odd iterations. (4) The analogous result for covering LPs is obtained by a simple reduction (see Claim 3) that applies complementation; this reduction generalizes reductions from maximum matchings to minimum edge covers [SMW11]. (5) We present a unified proof method based on graph covers. This method also enables us to prove convergence of the min-sum algorithm under certain restrictions (see Theorem 17 in Appendix A).

#### Techniques.

The main challenge in comparing the min-sum algorithm and linear programming is in finding a common structure that captures both algorithms. It turns out that graph covers are a common structure. Graph covers have been used previously to analyze iterative message passing algorithms [VK05, HE11]. In the context of optimization problems, $2$-covers have been used in [BBCZ11] to reduce matchings in general graphs to matchings in bipartite graphs [BSS08]. Bayati et al. [BBCZ11] write that their “proof gives a better understanding of the often-noted but poorly understood connection between BP and LP through graph covers.” We further clarify this connection by using higher order covers that capture fractional optimal LP solutions, as suggested by Ruozzi and Tatikonda [RT12]. Graph covers not only capture LP solutions but also solutions computed by the min-sum algorithm. In fact, the min-sum algorithm performs the same computation over any graph cover because it operates over a path-prefix tree of the factor graph. Hence we make the mental experiment in which the min-sum algorithm is executed over a graph cover in which all the basic feasible solutions are integral. We avoid the problems associated with loopy-BP by considering a graph cover, the girth of which is much larger than the number of iterations of the min-sum algorithm. Thus the execution of the min-sum algorithm is equivalent to dynamic programming over subtrees induced by balls in the graph cover. This mental game justifies a dynamic programming interpretation of the outcome of the min-sum algorithm. The dynamic programming algorithm makes a “local” decision based on balls, the radius of which is twice the number of iterations. The LP solution, on the other hand, is a global solution.

The proof proceeds by creating “hybrid” solutions that either refute the (global) optimality of the LP solution or the (local) optimality of the dynamic programming solution over the ball in the graph cover.

## 2 Preliminaries

### 2.1 Graph Terminology and Algebraic Notation

#### Algebraic Notation.

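We denote vectors in bold, e.g., $\mathbf{x}, \mathbf{y}$. We denote the $i$th coordinate of $\mathbf{x}$ by $x_i$, e.g., $\mathbf{x} = (x_1, \ldots, x_n)$. For a vector $\mathbf{x} \in \mathbb{R}^n$, let $\|\mathbf{x}\|_1 \triangleq \sum_i |x_i|$ denote the $\ell_1$ norm of $\mathbf{x}$. The cardinality of a set $S$ is denoted by $|S|$. We denote by $[n]$ the set $\{1, \ldots, n\}$ for $n \in \mathbb{N}_+$. For a set $S \subseteq [n]$, we denote the projection of the vector $\mathbf{x}$ onto indices in $S$ by $\mathbf{x}_S$.

A vector is rational if all its components are rational. Similarly, a vector is integral if all its components are integers. A vector is fractional if it is not integral, i.e., at least one of its components is not an integer.

Let $X \in \mathbb{N}^n$ denote a non-negative integral vector. Denote the Cartesian product $[0, X_1] \times \cdots \times [0, X_n]$ by $\mathrm{RBox}(X)$. Similarly, denote the Cartesian product $\{0, \ldots, X_1\} \times \cdots \times \{0, \ldots, X_n\}$ by $\mathrm{ZBox}(X)$. Note that vectors in $\mathrm{ZBox}(X)$ are integral.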

#### Graph Terminology.

Let $G = (V, E)$ denote an undirected simple graph. Let $N_G(v)$ denote the set of neighbors of vertex $v \in V$ (not including $v$ itself). Let $\deg_G(v)$ denote the edge degree of vertex $v$ in a graph $G$, i.e., $\deg_G(v) \triangleq |N_G(v)|$. For a set $S \subseteq V$ let $N_G(S) \triangleq \bigcup_{v \in S} N_G(v)$. A path $p = (v, \ldots, u)$ in $G$ is a sequence of vertices such that there exists an edge between every two consecutive vertices in the sequence. A backtrack in a path is a subpath that is a loop consisting of two edges traversed in opposite directions, i.e., a subsequence $(u, v, u)$. All paths considered in this paper do not include backtracks. The length of a path is the number of edges in the path. We denote the length of a path $p$ by $|p|$. Let $d_G(r, v)$ denote the distance (i.e., length of a shortest path) between vertex $r$ and $v$ in $G$, and let $\mathrm{girth}(G)$ denote the length of the shortest cycle in $G$. Let $B_G(v, t)$ denote the set of vertices in $G$ with distance at most $t$ from $v$, i.e., $B_G(v, t) \triangleq \{u \in V \mid d_G(v, u) \le t\}$.

The subgraph of $G$ induced by $S \subseteq V$ consists of $S$ and all edges in $E$, both endpoints of which are contained in $S$. Let $G_S$ denote the subgraph of $G$ induced by $S$. A subset $S \subseteq V$ is an independent set if there are no edges in the induced subgraph $G_S$. A graph $G = (V, E)$ is bipartite if $V$ is the union of two disjoint nonempty independent sets.

### 2.2 Covering and Packing Linear Programs

We consider two types of linear programs called covering and packing problems. In both cases the matrices are zero-one matrices and the constraint vectors are positive.

In the sequel we refer to the constraints $x \ge 0$ and $x \le X$ as box constraints.

###### Definition 1.

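Let $A \in \{0,1\}^{m \times n}$ denote a zero-one matrix with $m$ rows and $n$ columns. Let $b \in \mathbb{N}^m$ denote a constraint vector, let $w \in \mathbb{R}^n$ denote a weight vector, and let $X \in \mathbb{N}^n$ denote a domain boundary vector.

1. The integer program $\arg\max\{w^T \cdot x \mid A \cdot x \le b,\ x \in \mathrm{ZBox}(X)\}$ is called a packing IP, and denoted by PIP.

2. The integer program $\arg\min\{w^T \cdot x \mid A \cdot x \ge b,\ x \in \mathrm{ZBox}(X)\}$ is called a covering IP, and denoted by CIP.

3. The linear program $\arg\max\{w^T \cdot x \mid A \cdot x \le b,\ x \in \mathrm{RBox}(X)\}$ is called a packing LP, and denoted by PLP.

4. The linear program $\arg\min\{w^T \cdot x \mid A \cdot x \ge b,\ x \in \mathrm{RBox}(X)\}$ is called a covering LP, and denoted by CLP.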
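For very small instances, Definition 1 can be checked by exhaustive enumeration of $\mathrm{ZBox}(X)$. The following Python sketch is only an illustration; the function name `solve_ip` and the triangle instance are assumptions introduced here, not part of the paper.

```python
from itertools import product

def solve_ip(A, b, w, X, packing=True):
    """Enumerate ZBox(X); return (optimal value, list of optimal points).

    Hypothetical helper: exponential time, meant only for tiny instances.
    """
    best_val, best_pts = None, []
    for x in product(*(range(Xi + 1) for Xi in X)):
        lhs = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        if packing:
            feasible = all(l <= bj for l, bj in zip(lhs, b))   # A x <= b
        else:
            feasible = all(l >= bj for l, bj in zip(lhs, b))   # A x >= b
        if not feasible:
            continue
        val = sum(wi * xi for wi, xi in zip(w, x))
        if best_val is None or (val > best_val if packing else val < best_val):
            best_val, best_pts = val, [x]
        elif val == best_val:
            best_pts.append(x)
    return best_val, best_pts

# Triangle: one variable per vertex, one constraint row per edge.
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
pip = solve_ip(A, [1, 1, 1], [1, 1, 1], [1, 1, 1], packing=True)   # independent set
cip = solve_ip(A, [1, 1, 1], [1, 1, 1], [1, 1, 1], packing=False)  # vertex cover
```

With unit weights, the PIP on the triangle is the maximum independent set problem and the CIP is the minimum vertex cover problem.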

### 2.3 Factor Graph Representation of Packing and Covering LPs

The belief-propagation algorithm and its variant called the min-sum algorithm deal with graphical models known as factor graphs (see, e.g., [KFL01]). In this section we review the definition of factor graphs that are used to model covering and packing problems.

###### Definition 2 (factor graph model of packing problems).

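A quadruple $\langle G, \Psi, \Phi, X \rangle$ is the factor graph model of PIP if:

• $G = (\mathcal{V} \cup \mathcal{C}, E)$ is a bipartite graph that represents the zero-one matrix $A$. The set of variable vertices $\mathcal{V} = \{v_1, \ldots, v_n\}$ corresponds to the columns of $A$, and the set of constraint vertices $\mathcal{C} = \{C_1, \ldots, C_m\}$ corresponds to the rows of $A$. The edge set is defined by $E \triangleq \{(v_i, C_j) \mid A_{ji} = 1\}$.

• The vector $X \in \mathbb{N}^n$ defines the alphabets that are associated with the variable vertices. The alphabet associated with $v_i$ equals $\{0, \ldots, X_i\}$.

• For each constraint vertex $C_j$, we define a packing factor function $\psi_{C_j}$, defined by

$$\psi_{C_j}(y) \triangleq \begin{cases} 0 & \text{if } \sum_{v_i \in N_G(C_j)} y_i \le b_j, \\ -\infty & \text{otherwise.} \end{cases} \tag{1}$$

We denote the set of factor functions by $\Psi \triangleq \{\psi_C \mid C \in \mathcal{C}\}$.

• For each variable vertex $v_i$, we define a variable function $\phi_{v_i}$ defined by $\phi_{v_i}(\beta) \triangleq w_i \cdot \beta$. We denote the set of variable functions by $\Phi \triangleq \{\phi_v \mid v \in \mathcal{V}\}$.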

We note that (1) One could define a factor graph for PLP; the only difference is that the alphabet associated with variable vertex $v_i$ is the real interval $[0, X_i]$, so that each factor function is evaluated on real-valued assignments. (2) The factor functions are local in the sense that each constraint vertex $C_j$ can evaluate the value of $\psi_{C_j}$ based on the values of its neighbors.

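A vector $x \in \mathrm{ZBox}(X)$ is viewed as an assignment to variable vertices in $\mathcal{V}$ where $x_i$ is assigned to vertex $v_i$. To avoid composite indices, we use $x_v$ to denote the value assigned to $v$ by the assignment $x$. An integral assignment $x$ is valid if it satisfies all the constraints, namely, $\sum_{v \in N_G(C)} x_v \le b_C$ for every $C \in \mathcal{C}$ and $x \in \mathrm{ZBox}(X)$.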

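The factor graph model allows for the following equivalent formulation of the packing integer program:

$$\arg\max\Big\{\sum_{v \in \mathcal{V}} \phi_v(x_v) + \sum_{C \in \mathcal{C}} \psi_C(x) \;\Big|\; x \in \mathrm{ZBox}(X)\Big\}. \tag{2}$$

If there exists at least one valid assignment, then this formulation is equivalent to the formulation:

$$\arg\max\Big\{\sum_{v \in \mathcal{V}} \phi_v(x_v) \;\Big|\; x \in \mathrm{ZBox}(X) \text{ is a valid assignment}\Big\}. \tag{3}$$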
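The objective in Equation (2) can be evaluated directly from the matrix $A$. The Python sketch below is an illustration under assumed names (`factor_graph`, `packing_objective`); it represents the factor graph by the neighbor list of each constraint vertex and returns $-\infty$ when some packing factor function is violated.

```python
NEG_INF = float("-inf")

def factor_graph(A):
    """Neighbor lists N(C_j) = {i : A[j][i] == 1}, one list per constraint row."""
    return [[i for i, a in enumerate(row) if a == 1] for row in A]

def packing_objective(A, b, w, X, x):
    """Sum of variable functions plus packing factor functions, as in Eq. (2)."""
    if any(not (0 <= xi <= Xi) for xi, Xi in zip(x, X)):
        raise ValueError("x must lie in ZBox(X)")
    nbrs = factor_graph(A)
    phi = sum(wi * xi for wi, xi in zip(w, x))             # sum of phi_v(x_v)
    psi = sum(0 if sum(x[i] for i in nbrs[j]) <= b[j] else NEG_INF
              for j in range(len(A)))                      # sum of psi_C(x)
    return phi + psi

# Triangle instance (assumed example): one constraint per edge.
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
valid_value = packing_objective(A, [1, 1, 1], [1, 1, 1], [1, 1, 1], (1, 0, 0))
invalid_value = packing_objective(A, [1, 1, 1], [1, 1, 1], [1, 1, 1], (1, 1, 0))
```

A valid assignment contributes only its weight, while an invalid one is pushed to $-\infty$, which is why formulations (2) and (3) agree whenever a valid assignment exists.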

We may define a factor graph model for covering problems in the same manner. The only difference is in the definition of the covering factor functions, namely,

$$\psi_C(y) \triangleq \begin{cases} 0 & \text{if } \sum_{v \in N_G(C)} y_v \ge b_C, \\ \infty & \text{otherwise.} \end{cases} \tag{4}$$

Using this factor model, we can reformulate the covering integer program CIP by

$$\arg\min\Big\{\sum_{v \in \mathcal{V}} \phi_v(x_v) + \sum_{C \in \mathcal{C}} \psi_C(x) \;\Big|\; x \in \mathrm{ZBox}(X)\Big\}. \tag{5}$$

One could define a factor graph model for general LP’s as well. Suppose the goal is to maximize the objective function. Then, for each constraint, the range of the factor function is $\{0, -\infty\}$. If the constraint is satisfied, then the value of the factor function is $0$; otherwise it is $-\infty$.

## 3 Min-Sum Algorithms for Packing and Covering Integer Programs

In this section we present the min-sum algorithm for solving packing and covering integer programs with zero-one constraint matrices. Strictly speaking, the algorithm for PIP is a max-sum algorithm; however, we refer to these algorithms in general as min-sum algorithms. All the results in this section apply to any other equivalent algorithmic representation (e.g., max-product-type formulations). We first define the min-sum algorithms for PIPs and CIPs, and then state our main results.

### 3.1 The Min-Sum Algorithm

The min-sum algorithm for the packing integer program (PIP) is listed as Algorithm 1. The input to algorithm min-sum-packing consists of a factor graph model $\langle G, \Psi, \Phi, X \rangle$ of a PIP instance and a number of iterations $t \in \mathbb{N}$. Each iteration consists of two parts. In the first part, each variable vertex performs a local computation and sends messages to all its neighboring constraint vertices. In the second part, each constraint vertex performs a local computation and sends messages to all its neighboring variable vertices. Hence, in each iteration, two messages are sent along each edge.

Let $\mu^{(t)}_{v \to C}(\beta)$ denote the message sent from a variable vertex $v \in \mathcal{V}$ to an adjacent constraint vertex $C \in N_G(v)$ in iteration $t$ under the assumption that vertex $v$ is assigned the value $\beta \in \{0, \ldots, X_v\}$. Similarly, let $\mu^{(t)}_{C \to v}(\beta)$ denote the message sent from $C$ to $v$ in iteration $t$ assuming that vertex $v$ is assigned the value $\beta$. Denote by $\mu_v(\beta)$ the final value computed by variable vertex $v$ for assignment of $\beta$.

The initial messages (considered as the zeroth iteration) have the value zero and are sent along all the edges from the constraint vertices to the variable vertices. We refer to these initial messages as the zero initialization of the min-sum algorithm.

The algorithm proceeds with $t$ iterations. In Line 2a the message to be sent from $v$ to $C$ is computed by adding the previous incoming messages to $v$ (not including the message from $C$) and adding to it $\phi_v(\beta)$. In Line 2b the message to be sent from $C$ back to $v$ is computed. The constraint vertex $C$ considers all the possible assignments $z$ to its neighbors in which $z_v = \beta$. In fact, only assignments that satisfy the constraint of $C$ and the box constraints of the neighbors of $C$ are considered. The message from $C$ to $v$ equals the maximum sum of the previous incoming messages (not including the message from $v$) among these assignments.

Finally, in Line 3 each vertex $v$ decides locally on its outcome $\hat{x}_v$. The maximum value $\max_\beta \mu_v(\beta)$ is computed, and the set of values $\beta$ that achieve the maximum is computed as well. The decision $\hat{x}_v$ of vertex $v$ equals the minimum or maximum value in this set depending on the parity of $t$. Here, we deviate from previous descriptions of the min-sum algorithm that declare failure if this set contains more than one element.
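The description above can be turned into a short Python sketch. It is reconstructed from the prose only, so details are assumptions: the function name `min_sum_packing`, the message containers, and in particular the tie-breaking rule (maximum of the maximizing set when $t$ is even, minimum when $t$ is odd), which is chosen here to be consistent with the weak oscillation statement. Line 2b enumerates all neighbor assignments, so the sketch is suitable only for small constraint degrees.

```python
from itertools import product

NEG_INF = float("-inf")

def min_sum_packing(A, b, w, X, t):
    """Sketch of min-sum-packing with zero initialization and no attenuation."""
    m, n = len(A), len(A[0])
    nbrs_c = [[i for i in range(n) if A[j][i]] for j in range(m)]  # N(C_j)
    nbrs_v = [[j for j in range(m) if A[j][i]] for i in range(n)]  # N(v_i)
    # Zeroth iteration: all constraint-to-variable messages are zero.
    c2v = {(j, i): [0] * (X[i] + 1) for j in range(m) for i in nbrs_c[j]}
    for _ in range(t):
        # Line 2a: phi_v(beta) plus incoming messages, excluding the one from C.
        v2c = {}
        for i in range(n):
            for j in nbrs_v[i]:
                v2c[(i, j)] = [
                    w[i] * beta + sum(c2v[(k, i)][beta] for k in nbrs_v[i] if k != j)
                    for beta in range(X[i] + 1)
                ]
        # Line 2b: best valid assignment to the other neighbors of C given z_v = beta.
        c2v = {}
        for j in range(m):
            for i in nbrs_c[j]:
                others = [u for u in nbrs_c[j] if u != i]
                msg = []
                for beta in range(X[i] + 1):
                    best = NEG_INF
                    for z in product(*(range(X[u] + 1) for u in others)):
                        if beta + sum(z) <= b[j]:
                            best = max(best, sum(v2c[(u, j)][zu]
                                                 for u, zu in zip(others, z)))
                    msg.append(best)
                c2v[(j, i)] = msg
    # Line 3: local decision at every variable vertex.
    decisions = []
    for i in range(n):
        mu = [w[i] * beta + sum(c2v[(j, i)][beta] for j in nbrs_v[i])
              for beta in range(X[i] + 1)]
        top = max(mu)
        maximizers = [beta for beta, val in enumerate(mu) if val == top]
        # Assumed tie-breaking rule: maximum for even t, minimum for odd t.
        decisions.append(max(maximizers) if t % 2 == 0 else min(maximizers))
    return decisions

# Independent set on a triangle with unit weights (assumed example).
TRIANGLE = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
outcomes = [min_sum_packing(TRIANGLE, [1, 1, 1], [1, 1, 1], [1, 1, 1], t)
            for t in (1, 2, 3)]
```

On the triangle, the LP relaxation has the unique optimal solution $(1/2, 1/2, 1/2)$, and the sketch alternates between the all-zeros outcome for odd $t$ and the all-ones outcome for even $t$.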

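Algorithm min-sum-covering, listed as Algorithm 2, is based on the following reduction of the covering LP to a packing LP. It is also easy to write a direct min-sum formulation of algorithm min-sum-covering.

###### Claim 3.

Let $d \triangleq A \cdot X - b$, then

$$\arg\min\{w^T \cdot z \mid A \cdot z \ge b,\ z \in \mathrm{RBox}(X)\} = X - \arg\max\{w^T \cdot x \mid A \cdot x \le d,\ x \in \mathrm{RBox}(X)\}.$$

###### Proof.

Consider the mapping $x \mapsto X - x$. The mapping is a one-to-one and onto mapping from the set $\{x \in \mathrm{RBox}(X) \mid A \cdot x \le d\}$ to the set $\{z \in \mathrm{RBox}(X) \mid A \cdot z \ge b\}$. Moreover, the mapping satisfies $w^T \cdot (X - x) = w^T \cdot X - w^T \cdot x$, and the claim follows. ∎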
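The complementation in Claim 3 maps integral points to integral points, so it can be sanity-checked by brute force over $\mathrm{ZBox}(X)$. The following Python sketch does this for one small instance; the helper `optimal_points` and the instance itself are assumptions made for illustration, and the check covers integral points only, not the full LP.

```python
from itertools import product

def optimal_points(A, rhs, w, X, packing):
    """Set of optimal integral points of a packing (<=) or covering (>=) program."""
    best_val, best_pts = None, set()
    for x in product(*(range(Xi + 1) for Xi in X)):
        lhs = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        ok = all((l <= r) if packing else (l >= r) for l, r in zip(lhs, rhs))
        if not ok:
            continue
        val = sum(wi * xi for wi, xi in zip(w, x))
        if best_val is None or (val > best_val if packing else val < best_val):
            best_val, best_pts = val, {x}
        elif val == best_val:
            best_pts.add(x)
    return best_pts

A = [[1, 1, 0], [0, 1, 1]]
b, w, X = [2, 1], [3, 1, 2], [2, 1, 1]
# d = A X - b, as in Claim 3.
d = [sum(a * Xi for a, Xi in zip(row, X)) - bj for row, bj in zip(A, b)]
covering = optimal_points(A, b, w, X, packing=False)
packing = optimal_points(A, d, w, X, packing=True)
complemented = {tuple(Xi - xi for Xi, xi in zip(X, x)) for x in packing}
```

The covering optimum coincides with the complement of the packing optimum for the right-hand side $d$.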

### 3.2 Main Results

#### Notation.

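Let $\mathrm{OPT}_{LP}$ denote the set of optimal solutions of the packing LP

$$\mathrm{OPT}_{LP} \triangleq \arg\max\{w^T \cdot x \mid A \cdot x \le b,\ x \in \mathrm{RBox}(X)\}.$$

Let $\langle G, \Psi, \Phi, X \rangle$ denote the factor graph model of this packing LP. Fix a variable vertex $v$ in the factor graph of the packing LP. Let $x_v^{\min} \triangleq \min\{x_v \mid x \in \mathrm{OPT}_{LP}\}$ and let $x_v^{\max} \triangleq \max\{x_v \mid x \in \mathrm{OPT}_{LP}\}$, i.e., $x_v^{\min}$ and $x_v^{\max}$ denote the minimum and maximum values in $\{x_v \mid x \in \mathrm{OPT}_{LP}\}$, respectively.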

The proof of the following theorem appears in Section 5.2.

###### Theorem 4 (weak oscillation).

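Consider an execution of min-sum-packing$(\langle G, \Psi, \Phi, X \rangle, t)$. For every variable vertex $v$ with outcome $\hat{x}_v$ the following holds:

1. If $t$ is even, then $\hat{x}_v \ge x_v^{\max}$.

2. If $t$ is odd, then $\hat{x}_v \le x_v^{\min}$.

###### Corollary 5.

If there exists an optimal solution $x \in \mathrm{OPT}_{LP}$ such that $x_v$ is not an integer, then $\hat{x}_v > x_v$ if the number of iterations $t$ is even, and $\hat{x}_v < x_v$ if $t$ is odd.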

The following corollary implies that if algorithm min-sum-packing outputs the same value for a vertex in two consecutive iterations, then this value is the LP optimal value.

###### Corollary 6.

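Let $t$ denote an even number and $t'$ denote an odd number. Let $\hat{x}_v^{(t)}$ and $\hat{x}_v^{(t')}$ denote the outcomes at $v$ after $t$ and $t'$ iterations, respectively. If $\hat{x}_v^{(t)} = \hat{x}_v^{(t')}$, then $\{x_v \mid x \in \mathrm{OPT}_{LP}\}$ contains a single element $x_v$ such that $x_v = \hat{x}_v^{(t)}$.

###### Proof.

If $\hat{x}_v^{(t)} = \hat{x}_v^{(t')}$, then by Theorem 4, $x_v^{\max} \le \hat{x}_v^{(t)} = \hat{x}_v^{(t')} \le x_v^{\min}$. ∎

Previous works on the min-sum algorithm for optimization problems define the case that the set of values that achieve the maximum contains more than one element as a failure. Under this restricted interpretation, Theorem 4 and Corollary 5 imply a necessary condition for the convergence of the min-sum-packing algorithm. Namely, $\mathrm{OPT}_{LP}$ must contain a unique optimal solution and this optimal solution must be integral. Indeed, if $x_v^{\min} < x_v^{\max}$, then $\hat{x}_v$ oscillates above and below the interval $[x_v^{\min}, x_v^{\max}]$ between even and odd iterations.

Analogous results hold for covering problems. We state only the theorem that is analogous to Theorem 4. Redefine $\mathrm{OPT}_{LP}$ so that it denotes the set of optimal solutions of the covering LP, i.e.,

$$\mathrm{OPT}_{LP} \triangleq \arg\min\{w^T \cdot x \mid A \cdot x \ge b,\ x \in \mathrm{RBox}(X)\}.$$

Let $\langle G, \Psi, \Phi, X \rangle$ denote the factor graph model of the covering LP.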

###### Theorem 7 (weak oscillation).

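Consider an execution of min-sum-covering$(\langle G, \Psi, \Phi, X \rangle, t)$. For every variable vertex $v$ with outcome $\hat{x}_v$ the following holds:

1. If $t$ is even, then $\hat{x}_v \le x_v^{\min}$.

2. If $t$ is odd, then $\hat{x}_v \ge x_v^{\max}$.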

See Appendix A for a discussion of the convergence of the min-sum algorithm.

## 4 Graph Liftings

In this section we briefly review the definition of graph coverings, state a combinatorial characterization based on [RT12], and show how the girth can be arbitrarily increased.

### 4.1 Covering Maps and Liftings

###### Definition 8 (covering map [AL02]).

Let $G = (V, E)$ and $\tilde{G} = (\tilde{V}, \tilde{E})$ denote finite graphs. A graph homomorphism $\pi: \tilde{G} \to G$ is a covering map if for every $\tilde{v} \in \tilde{V}$ the restriction of $\pi$ to neighbors of $\tilde{v}$ is a bijection to the neighbors of $\pi(\tilde{v})$. (The term covering is used both for optimization problems called covering problems and for topological mappings called covering maps.)

We refer only to finite covering maps. The pre-image $\pi^{-1}(v)$ of a vertex $v$ is called the fiber of $v$. It is easy to see that all the fibers have the same cardinality if $G$ is connected. This common cardinality is called the degree or fold number of the covering map. If $\pi: \tilde{G} \to G$ is a covering map, we call $G$ the base graph and $\tilde{G}$ a lift of $G$. In the case where the fold number of the covering map is $M$, we say that $\tilde{G}$ is an $M$-lift of $G$.

If $G$ is connected, then every $M$-lift of $G$ is isomorphic to an $M$-lift that is constructed as follows: (1) The vertex set is simply $\tilde{V} \triangleq V \times [M]$ and the covering map is the projection defined by $\pi((v, i)) \triangleq v$. (2) For every $(u, v) \in E$, the edges in $\tilde{E}$ between the fibers of $u$ and $v$ constitute a matching.

The notion of $M$-lifts in graphs is extended to $M$-lifts of factor graph models in a natural manner. We denote a variable vertex in the fiber of $v$ by $\tilde{v}$ (so $\pi(\tilde{v})$ is denoted by $v$). Each variable vertex $\tilde{v}$ inherits the variable function of $v$, namely, $\phi_{\tilde{v}} \triangleq \phi_v$. Similarly, each constraint vertex $\tilde{C}$ inherits the factor function of $C$. For brevity, we refer to the lifted factor graph model of $\langle G, \Psi, \Phi, X \rangle$ simply as the lift $\tilde{G}$ of a factor graph $G$.

An assignment $x$ to the variable vertices of a factor graph $G$ is extended to an assignment $\tilde{x}$ over the lift $\tilde{G}$ simply by defining $\tilde{x}_{\tilde{v}} \triangleq x_{\pi(\tilde{v})}$. Note that this extension preserves the validity of assignments.

Every assignment $\tilde{x}$ of an $M$-lift induces an assignment of the base graph that we call the average assignment. The average assignment $\mathrm{avg}(\tilde{x})$ is defined by

$$\mathrm{avg}(\tilde{x})_v \triangleq \frac{1}{M} \cdot \sum_{\tilde{v} \in \pi^{-1}(v)} \tilde{x}_{\tilde{v}}. \tag{6}$$

Consider an $M$-lift $\tilde{G}$ of the factor graph $G$ and a valid integral assignment $\tilde{x}$ to $\tilde{G}$. By linearity, $\mathrm{avg}(\tilde{x})$ is a rational valid assignment to $G$. The following theorem deals with the converse situation.

###### Theorem 9 (special case of [RT12, Theorem VII.2]).

For every rational feasible solution $x$ of an LP, there exists an $M$, an $M$-lift $\tilde{G}$, and an integral valid assignment $\tilde{x}$ to $\tilde{G}$ such that $x = \mathrm{avg}(\tilde{x})$.

Note that all the basic feasible solutions (extreme points) of the packing LP and the covering LP are rational.
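A small instance of Theorem 9 can be checked by hand. The Python sketch below is an assumed illustration: it takes the independent-set constraints on a triangle ($x_u + x_v \le 1$ per edge), builds a particular $2$-lift (a hexagon) in which every edge matching swaps the two copies, and averages an integral assignment of the lift according to Equation (6). The names `lift_edges`, `x_lift`, and `avg` are introduced here for illustration.

```python
from fractions import Fraction

base_edges = [(0, 1), (1, 2), (0, 2)]   # triangle; one packing constraint per edge
M = 2

# 2-lift: vertex (v, k) is copy k of v; each base edge lifts to a perfect matching
# between the two fibers that connects copy k of u to copy 1 - k of v.
lift_edges = [((u, k), (v, 1 - k)) for (u, v) in base_edges for k in range(M)]

# Integral assignment on the lift: copy 0 gets value 0, copy 1 gets value 1.
x_lift = {(v, k): k for v in range(3) for k in range(M)}

def avg(x_tilde, n, M):
    """Average assignment of Equation (6): mean of x_tilde over each fiber."""
    return [Fraction(sum(x_tilde[(v, k)] for k in range(M)), M) for v in range(n)]

# Validity on the lift: every lifted edge constraint x_a + x_c <= 1 holds.
valid = all(x_lift[a] + x_lift[c] <= 1 for a, c in lift_edges)
x = avg(x_lift, 3, M)

# Covering-map check: every lifted vertex has degree 2, like its projection.
degree = {}
for a, c in lift_edges:
    degree[a] = degree.get(a, 0) + 1
    degree[c] = degree.get(c, 0) + 1
```

The averaged assignment is the fractional point $(1/2, 1/2, 1/2)$, which is feasible for the triangle LP although no integral assignment of the base graph attains it.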

### 4.2 Increasing Girth

The following proposition deals with obtaining lifts with large girth.

###### Proposition 10.

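There exists a finite lift $\tilde{G}$ of $G$ such that $\mathrm{girth}(\tilde{G}) \ge 2 \cdot \mathrm{girth}(G)$.

###### Proof.

Given a graph $G = (V, E)$, we construct a $2^{|E|}$-lift $\tilde{G}$ as follows. Let $\tilde{G} = (\tilde{V}, \tilde{E})$. The vertices in each fiber of $\tilde{G}$ are indexed by a binary string of length $|E|$. Index the edges in $E$ by $\{e_1, \ldots, e_{|E|}\}$. For an edge $e_i = (u, v)$, the matching between the fiber of $u$ and the fiber of $v$ is induced simply by flipping the $i$’th bit in the index. Namely, each vertex $(u, s)$ is matched to the vertex $(v, s')$, where $s'$ is obtained from $s$ by flipping the $i$’th bit.

Consider a cycle $\tilde{\gamma}$ in $\tilde{G}$ and its projection $\gamma$ in $G$. Each edge in $\gamma$ must appear an even number of times. Otherwise, the $i$’th bit is flipped an odd number of times in $\tilde{\gamma}$, and $\tilde{\gamma}$ can not be a cycle. It follows that $\mathrm{girth}(\tilde{G}) \ge 2 \cdot \mathrm{girth}(G)$. ∎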
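The bit-flipping construction in the proof is easy to simulate. The following Python sketch (function names `adjacency`, `girth`, and `bit_flip_lift` are assumptions) builds the $2^{|E|}$-lift of a triangle and computes girths with a standard breadth-first search from every vertex.

```python
from collections import deque

def adjacency(edges):
    """Adjacency sets of an undirected simple graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def girth(adj):
    """Length of a shortest cycle (infinity for a forest), via BFS from each vertex."""
    best = float("inf")
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v], parent[v] = dist[u] + 1, u
                    queue.append(v)
                elif parent[u] != v:
                    # Non-tree edge closes a cycle through the BFS tree.
                    best = min(best, dist[u] + dist[v] + 1)
    return best

def bit_flip_lift(edges):
    """2^{|E|}-lift: edge number i joins (u, s) to (v, s with bit i flipped)."""
    lifted = []
    for i, (u, v) in enumerate(edges):
        for s in range(2 ** len(edges)):
            lifted.append(((u, s), (v, s ^ (1 << i))))
    return lifted

triangle = [(0, 1), (1, 2), (0, 2)]
base = adjacency(triangle)
lift = adjacency(bit_flip_lift(triangle))
```

For the triangle, the lift has $3 \cdot 2^3 = 24$ vertices, every lifted vertex keeps the degree of its projection, and the girth grows from 3 to 6.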

By applying Proposition 10 repeatedly, we have the following corollary.

###### Corollary 11.

Consider a graph $G$. Then for any finite $\ell$ there exists a finite lift $\tilde{G}$ of $G$ such that $\mathrm{girth}(\tilde{G}) \ge \ell$.

## 5 Proof of Main Results

### 5.1 Min-Sum as a Dynamic Programming on Computation Trees

Given a graph $G = (V, E)$ and a vertex $r \in V$, the path-prefix tree of $G$ rooted at $r$ of height $h$ is defined as follows.

###### Definition 12 (Path-Prefix Tree).

Let $\hat{V}$ denote the set of all paths with length at most $h$ without backtracks that start at vertex $r$. Let $\hat{E} \triangleq \{(p_1, p_2) \in \hat{V} \times \hat{V} \mid p_1 \text{ is a prefix of } p_2,\ |p_1| + 1 = |p_2|\}$. The directed graph $(\hat{V}, \hat{E})$ is called the path-prefix tree of $G$ rooted at vertex $r$ with height $h$, and is denoted by $T_r^h(G)$.

We denote the zero-length path in $\hat{V}$ by $(r)$. The graph $T_r^h(G)$ is obviously acyclic and is an out-tree rooted at $(r)$. Path-prefix trees of $G$ that are rooted in variable vertices are often called computation trees of $G$ or unwrapped trees of $G$.
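Definition 12 can be made concrete by enumerating backtrack-free paths level by level. The Python sketch below is an illustration with an assumed function name `path_prefix_tree`; paths are stored as tuples of vertices, and tree edges connect each path to its one-edge extensions.

```python
def path_prefix_tree(adj, r, h):
    """Vertices (paths) and edges of the path-prefix tree of height h rooted at r."""
    paths = [(r,)]          # the zero-length path (r) is the root
    edges = []
    frontier = [(r,)]
    for _ in range(h):
        nxt = []
        for p in frontier:
            for v in adj[p[-1]]:
                if len(p) >= 2 and v == p[-2]:
                    continue            # skip backtracks (u, v, u)
                q = p + (v,)
                nxt.append(q)
                edges.append((p, q))    # p is a prefix of q with |p| + 1 = |q|
        paths += nxt
        frontier = nxt
    return paths, edges

# Triangle: from any root there are exactly two backtrack-free paths of each length.
triangle_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
paths, edges = path_prefix_tree(triangle_adj, 0, 3)
```

Note that a path may revisit a vertex of $G$ (for example $(0, 1, 2, 0)$ in the triangle); only immediate backtracks are excluded, which is why the tree keeps growing on graphs with cycles.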

We use the following notation. Vertices in $T_r^h(G)$ are paths in $G$, and are denoted by $p$ and $q$ whereas variable vertices in $G$ are denoted by $u, v, r$. For a path $p$, let $t(p)$ denote the last vertex (i.e., target) of path $p$.

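Consider a path-prefix tree $T_r^{2t}(G)$ of a factor graph $G = (\mathcal{V} \cup \mathcal{C}, E)$. We denote the vertex set of $T_r^{2t}(G)$ by $\hat{\mathcal{V}} \cup \hat{\mathcal{C}}$, where $\hat{\mathcal{V}}$ denotes paths that end in a variable vertex, and $\hat{\mathcal{C}}$ denotes paths that end in a constraint vertex. Paths in $\hat{\mathcal{V}}$ are called variable paths, and paths in $\hat{\mathcal{C}}$ are called constraint paths. We attach variable functions $\hat{\phi}_p$ to variable paths $p \in \hat{\mathcal{V}}$, and factor functions $\hat{\psi}_q$ to constraint paths $q \in \hat{\mathcal{C}}$; each vertex inherits the function of its endpoint. The box constraint for a variable path $p$ that ends at vertex $v$ is defined by $\hat{X}_p \triangleq X_v$.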

In the following lemma, the min-sum-packing algorithm is interpreted as a dynamic programming algorithm over the path-prefix trees (see e.g., [GSW12, Section 2]).

###### Lemma 13.

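Consider an execution of min-sum-packing$(\langle G, \Psi, \Phi, X \rangle, t)$. Consider the computation tree $T_r^{2t}(G)$. For every variable vertex $r \in \mathcal{V}$ and $\beta \in \{0, \ldots, X_r\}$,

$$\mu_r(\beta) = \max\Big\{\sum_{p \in \hat{\mathcal{V}}} \hat{\phi}_p(\hat{z}_p) + \sum_{q \in \hat{\mathcal{C}}} \hat{\psi}_q(\hat{z}_{N(q)}) \;\Big|\; \forall p \in \hat{\mathcal{V}}.\ \hat{z}_p \in \mathrm{ZBox}(\hat{X}_p),\ \hat{z}_{(r)} = \beta\Big\}.$$

###### Definition 14 (optimal assignment).

We say that a valid assignment $\hat{z}$ to the variable paths in $T_r^{2t}(G)$ is optimal if it maximizes the objective function $\sum_{p \in \hat{\mathcal{V}}} \hat{\phi}_p(\hat{z}_p)$.

Consider the set of optimal valid assignments to the variable paths in $T_r^{2t}(G)$. By Line 3c in algorithm min-sum-packing, the following corollary holds.

###### Corollary 15.

The outcome $\hat{x}_r$ equals the maximum of $\hat{z}_{(r)}$ over all optimal valid assignments $\hat{z}$ if $t$ is even, and the minimum of $\hat{z}_{(r)}$ over all optimal valid assignments $\hat{z}$ if $t$ is odd.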

### 5.2 Weak Oscillation of min-sum-packing - Proof of Theorem 4

###### Proof of Theorem 4.

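We prove Part (1) of the Theorem. Fix an optimal valid assignment $z$ to the variable paths in $T_v^{2t}(G)$ such that $z_{(v)} = \hat{x}_v$. Fix an optimal solution $x \in \mathrm{OPT}_{LP}$ such that $x_v = x_v^{\max}$. Assume, for the sake of contradiction, that $t$ is even and that $z_{(v)} < x_v$.

Without loss of generality, $x$ is a basic feasible solution. By Theorem 9 and Corollary 11, for some $M$, there exists an $M$-lift $\tilde{G}$ of $G$ such that: (i) there exists an integral valid assignment $\tilde{x}$ for $\tilde{G}$ such that $x = \mathrm{avg}(\tilde{x})$, and (ii) $\mathrm{girth}(\tilde{G}) > 4t$.

The value of $x_v$ equals the average of $\tilde{x}$ over the fiber of $v$. Let $\tilde{v}$ denote a vertex in the fiber of $v$ such that $\tilde{x}_{\tilde{v}} \ge x_v$.

Let $B \triangleq B_{\tilde{G}}(\tilde{v}, 2t)$ denote the ball of radius $2t$ centered at $\tilde{v}$. Denote by $\tilde{G}_B$ the subgraph of $\tilde{G}$ induced by $B$. Because $\mathrm{girth}(\tilde{G}) > 4t$, $\tilde{G}_B$ is a tree. It follows that $\tilde{G}_B$ is isomorphic to the computation tree $T_v^{2t}(G)$. Because $z$ is an optimal valid assignment to $T_v^{2t}(G)$, we can regard $z$ also as an optimal valid assignment to the variable vertices in $\tilde{G}_B$.

Because $z_{\tilde{v}} < x_v \le \tilde{x}_{\tilde{v}}$, the restriction of $\tilde{x}$ to the variable vertices in $\tilde{G}_B$ does not equal $z$. Our goal is to obtain a contradiction to either the optimality of $x$ or the choice of $z$. We show this by constructing “hybrid” integral solutions.

Let $\mathcal{E} \triangleq \{\tilde{u} \in B \mid d_{\tilde{G}}(\tilde{v}, \tilde{u}) \equiv 0 \pmod 4\}$. Similarly, let $\mathcal{O} \triangleq \{\tilde{u} \in B \mid d_{\tilde{G}}(\tilde{v}, \tilde{u}) \equiv 2 \pmod 4\}$. Note that both $\mathcal{E}$ and $\mathcal{O}$ contain only variable vertices. We refer to $\mathcal{E}$ as the even layers and to $\mathcal{O}$ as the odd layers.

Let $\mathcal{F}$ denote the subgraph of $\tilde{G}_B$ that is induced by: (i) the vertices $\tilde{u} \in \mathcal{E}$ such that $z_{\tilde{u}} < \tilde{x}_{\tilde{u}}$, (ii) the vertices $\tilde{u} \in \mathcal{O}$ such that $z_{\tilde{u}} > \tilde{x}_{\tilde{u}}$, and (iii) constraint vertices in $B$. By definition, $\tilde{v}$ is a vertex in $\mathcal{E}$, and by our assumption $z_{\tilde{v}} < \tilde{x}_{\tilde{v}}$. Hence, $\tilde{v} \in \mathcal{F}$. Let $\mathcal{T}$ denote the connected component of $\mathcal{F}$ that contains $\tilde{v}$. We root $\mathcal{T}$ at $\tilde{v}$, and refer to $\mathcal{T}$ as an alternating tree.

A subtree of $\mathcal{T}$ is a skinny tree if each constraint vertex chooses only one child and each variable vertex chooses all its children. Formally, a subtree $\mathcal{T}_S$ of $\mathcal{T}$ is a skinny tree if it is a maximal tree with respect to inclusion among all trees that satisfy (i) $\tilde{v} \in \mathcal{T}_S$, (ii) $\deg_{\mathcal{T}_S}(\tilde{C}) \le 2$ for every constraint vertex $\tilde{C}$ in $\mathcal{T}_S$, and (iii) $\deg_{\mathcal{T}_S}(\tilde{u}) = \deg_{\mathcal{T}}(\tilde{u})$ for every variable vertex $\tilde{u}$ in $\mathcal{T}_S$. We fix a skinny subtree $\mathcal{T}_S$ of $\mathcal{T}$ to obtain a contradiction.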

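For a subset of variable vertices $\tilde{S}$, let $\tilde{w}(\tilde{S}) \triangleq \sum_{\tilde{u} \in \tilde{S}} w_{\pi(\tilde{u})}$ (recall that the weight of a variable vertex $u$ in $G$ is given to each vertex in the fiber of $u$). We claim that

$$\tilde{w}(\mathcal{T}_S \cap \mathcal{E}) \ge \tilde{w}(\mathcal{T}_S \cap \mathcal{O}). \tag{7}$$

To prove Equation (7), define an integral assignment $\tilde{y}$ to variable vertices in $\tilde{G}$ by

$$\tilde{y}_{\tilde{u}} \triangleq \begin{cases} \tilde{x}_{\tilde{u}} - 1 & \text{if } \tilde{u} \in \mathcal{T}_S \cap \mathcal{E}, \\ \tilde{x}_{\tilde{u}} + 1 & \text{if } \tilde{u} \in \mathcal{T}_S \cap \mathcal{O}, \\ \tilde{x}_{\tilde{u}} & \text{otherwise.} \end{cases}$$

Observe that $\tilde{y}$ is a valid assignment for $\tilde{G}$. Indeed, all the box constraints are satisfied because we increment a value compared to $\tilde{x}$ only if $\tilde{x}_{\tilde{u}} < z_{\tilde{u}}$. Similarly, we decrement a value compared to $\tilde{x}$ only if $\tilde{x}_{\tilde{u}} > z_{\tilde{u}}$. In addition, we need to show that every constraint $\tilde{C}$ is satisfied by $\tilde{y}$. Note that a constraint may have at most two neighbors that are variable vertices in the skinny tree; the rest of the neighbors retain the value assigned by $\tilde{x}$. If a constraint $\tilde{C}$ is not a neighbor of a variable vertex in $\mathcal{T}_S$, then it is satisfied because $\tilde{x}$ satisfies it. If a constraint $\tilde{C}$ has two neighbors that are variable vertices in $\mathcal{T}_S$, then one is incremented and the other is decremented. Overall, the constraint remains satisfied. Finally, suppose $\tilde{C}$ has only one neighbor in $\mathcal{T}_S$. Denote this neighbor by $\tilde{u}$. Then $\tilde{u}$ is a parent of $\tilde{C}$. If $\tilde{y}_{\tilde{u}} = \tilde{x}_{\tilde{u}} - 1$, then clearly $\tilde{C}$ is satisfied by $\tilde{y}$. If the value assigned to $\tilde{u}$ is incremented, then $\tilde{u} \in \mathcal{O}$ (i.e., an odd layer). This implies that the children of $\tilde{C}$ are in an even layer, the distance of which to the root is at most $2t$. Hence, the children of $\tilde{C}$ belong to the ball $B_{\tilde{G}}(\tilde{v}, 2t)$. Moreover, these children do not belong to the alternating tree $\mathcal{T}$ (otherwise, one of its children would belong to the skinny tree). Thus, for each child $\tilde{k}$ of $\tilde{C}$ we have $z_{\tilde{k}} \ge \tilde{x}_{\tilde{k}} = \tilde{y}_{\tilde{k}}$. In addition, $z_{\tilde{u}} \ge \tilde{x}_{\tilde{u}} + 1 = \tilde{y}_{\tilde{u}}$. Hence, $\tilde{y} \le z$ when restricted to the neighbors of $\tilde{C}$. Because $z$ satisfies $\tilde{C}$, so does $\tilde{y}$, as required. Because $\tilde{y}$ is a valid assignment, $\mathrm{avg}(\tilde{y})$ is a feasible solution of the packing LP. The optimality of $x$ implies that $w^T \cdot x \ge w^T \cdot \mathrm{avg}(\tilde{y})$. By the definition of $\tilde{y}$, we have

$$w^T \cdot (x - \mathrm{avg}(\tilde{y})) = \frac{1}{M} \cdot \big(\tilde{w}(\mathcal{T}_S \cap \mathcal{E}) - \tilde{w}(\mathcal{T}_S \cap \mathcal{O})\big),$$

and Equation (7) follows.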

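We now define an assignment $\theta$ to variable vertices in $\tilde{G}_B$ by

$$\theta_{\tilde{u}} \triangleq \begin{cases} z_{\tilde{u}} + 1 & \text{if } \tilde{u} \in \mathcal{T}_S \cap \mathcal{E}, \\ z_{\tilde{u}} - 1 & \text{if } \tilde{u} \in \mathcal{T}_S \cap \mathcal{O}, \\ z_{\tilde{u}} & \text{otherwise.} \end{cases}$$

We claim that $\theta$ is a valid integral assignment for $\tilde{G}_B$. The proof is analogous to the proof that $\tilde{y}$ is a valid assignment. By Equation (7), the value of $\theta$ is not less than the value of $z$ since

$$\sum_{\tilde{u}} \phi_{\tilde{u}}(\theta_{\tilde{u}}) = \sum_{\tilde{u}} \phi_{\tilde{u}}(z_{\tilde{u}}) + \tilde{w}(\mathcal{T}_S \cap \mathcal{E}) - \tilde{w}(\mathcal{T}_S \cap \mathcal{O}) \ge \sum_{\tilde{u}} \phi_{\tilde{u}}(z_{\tilde{u}}), \tag{8}$$

where the sums range over the variable vertices $\tilde{u}$ in $\tilde{G}_B$. Therefore, $\theta$ is also an optimal valid assignment. However, $\theta_{\tilde{v}} = z_{\tilde{v}} + 1 > \hat{x}_v$, a contradiction. It follows that $\hat{x}_v \ge x_v^{\max}$ if $t$ is even.

The proof of Part (2) that $\hat{x}_v \le x_v^{\min}$ for an odd $t$ is analogous to the proof that $\hat{x}_v \ge x_v^{\max}$ if $t$ is even. It requires the following modifications. (1) Fix $z$ such that $z_{(v)} = \hat{x}_v$ and $x \in \mathrm{OPT}_{LP}$ such that $x_v = x_v^{\min}$. (2) Assume towards a contradiction that $t$ is odd and $z_{(v)} > x_v$. (3) Pick $\tilde{v}$ so that $\tilde{x}_{\tilde{v}} \le x_v$. (4) The forest $\mathcal{F}$ is induced by the following set of vertices: (i) the vertices $\tilde{u} \in \mathcal{E}$ such that $z_{\tilde{u}} > \tilde{x}_{\tilde{u}}$, (ii) the vertices $\tilde{u} \in \mathcal{O}$ such that $z_{\tilde{u}} < \tilde{x}_{\tilde{u}}$, and (iii) constraint vertices in $B_{\tilde{G}}(\tilde{v}, 2t)$. (5) Prove that the weight of even layers in the skinny tree is not greater than the weight of the odd layers. (6) The assignment $\tilde{y}$ is defined so that it increments even layers and decrements odd layers. (7) The assignment $\theta$ is defined so that it decrements even layers and increments odd layers.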

## Acknowledgments

The authors would like to thank Nicholas Ruozzi and Kamiel Cornelissen for helpful comments.

## References

• [ADS09] S. Arora, C. Daskalakis, and D. Steurer, “Message passing algorithms and improved LP decoding,” in Proc. 41st Annual ACM Symp. Theory of Computing, Bethesda, MD, USA, pp. 3–12, 2009.
• [AL02] A. Amit and N. Linial, “Random graph coverings I: General theory and graph connectivity,” Combinatorica, vol. 22, no. 1, pp. 1–18, 2002.
• [BBCZ11] M. Bayati, C. Borgs, J. Chayes, and R. Zecchina, “Belief propagation for weighted b-matchings on arbitrary graphs and its relation to linear programs with integer solutions,” SIAM Journal on Discrete Mathematics, vol. 25, no. 2, pp. 989–1011, 2011.
• [BCMR13] T. Brunsch, K. Cornelissen, B. Manthey, and H. Röglin, “Smoothed analysis of belief propagation for minimum-cost flow and matching,” in Proc. WALCOM: Algorithms and Computation, pp. 182–193, Springer Berlin Heidelberg, 2013.
• [BGT93] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: Turbo-codes,” in Proc. IEEE Int. Conf. on Communications (ICC’93), Geneva, Switzerland, vol. 2, pp. 1064–1070, 1993.
• [BSS08] M. Bayati, D. Shah, and M. Sharma, “Max-product for maximum weight matching: Convergence, correctness, and LP duality,” IEEE Trans. Inf. Theory, vol. 54, no. 3, pp. 1241–1251, Mar. 2008.
• [BT97] D. Bertsimas and J. N. Tsitsiklis, Introduction to linear optimization. Athena Scientific, 1997.
• [HE11] N. Halabi and G. Even, “On decoding irregular Tanner codes with local-optimality guarantees,” CoRR, http://arxiv.org/abs/1107.2677, Jul. 2011.
• [FWK05] J. Feldman, M. J. Wainwright, and D. R. Karger, “Using linear programming to decode binary linear codes,” IEEE Trans. Inf. Theory, vol. 51, no. 3, pp. 954–972, Mar. 2005.
• [Gal63] R. G. Gallager, Low-Density Parity-Check Codes. MIT Press, Cambridge, MA, 1963.
• [GSW12] D. Gamarnik, D. Shah, and Y. Wei, “Belief propagation for min-cost network flow: Convergence and correctness,” Operations Research, vol. 60, no. 2, pp. 410–428, Mar.-Apr. 2012.
• [KFL01] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.
• [MMC98] R. J. McEliece, D. J. C. MacKay, and J.-F. Cheng, “Turbo decoding as an instance of Pearl’s ‘belief propagation’ algorithm,” IEEE J. Select. Areas Commun., vol. 16, no. 2, pp. 140–152, Feb. 1998.
• [Pea88] J. Pearl, Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 1988.
• [RT08] N. Ruozzi and S. Tatikonda, “s-t paths using the min-sum algorithm,” in Proc. of the 46th Allerton Conference on Communication, Control, and Computing, Monticello, Illinois, USA, pp. 918–921, 2008.
• [RT12] N. Ruozzi and S. Tatikonda, “Message-passing algorithms: Reparameterizations and splittings,” CoRR, abs/1002.3239, v3, Dec. 2012.
• [Sch98] A. Schrijver, Theory of linear and integer programming. Wiley-Interscience, New York, NY, 1998.
• [SMW11] S. Sanghavi, D. Malioutov, and A. Willsky, “Belief propagation and LP relaxation for weighted matching in general graphs,” IEEE Trans. Inf. Theory, vol. 57, no. 4, pp. 2203–2212, Apr. 2011.
• [SSW09] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Trans. Inf. Theory, vol. 55, no. 11, pp. 4822–4834, Nov. 2009.
• [Vit67] Andrew Viterbi, “Error bounds for convolutional codes and an asymptotically optimum decoding algorithm,” IEEE Trans. Inf. Theory, vol. 13, no. 2, pp. 260–269, Apr. 1967.
• [VK05] P. O. Vontobel and R. Koetter, “Graph-cover decoding and finite-length analysis of message-passing iterative decoding of LDPC codes,” CoRR, http://www.arxiv.org/abs/cs.IT/0512078, Dec. 2005.
• [Wib96] N. Wiberg, “Codes and decoding on general graphs,” Ph.D. dissertation, Linköping University, Linköping, Sweden, 1996.
• [Yed11] J. S. Yedidia, “Message-passing algorithms for inference and optimization,” J. Stat. Phys., vol. 145, no. 4, pp. 860–890, Nov. 2011.

## Appendix A On Convergence of the Min-Sum Algorithm for Nonbinary Packing and Covering Problems

In Section 3.2 we showed that if the min-sum-packing algorithm outputs the same result in two consecutive iterations, then this result equals the optimal solution of the LP relaxation (see Corollary 6). On the other hand, even if the LP relaxation has a unique optimal solution and that solution is integral, the min-sum-packing algorithm may fail to converge (see Sanghavi et al. [SSW09] for an example with respect to the maximum weight independent set problem).

Convergence of the min-sum algorithm was proved for the maximum weight -matching and the minimum -edge covering problems by Sanghavi et al. [SMW11] and Bayati et al. [BBCZ11]. They considered the zero-one integer program for maximum weight matching with the constraints , and proved that after a pseudo-polynomial number of iterations, the min-sum algorithm converges to the optimal solution of the LP relaxation provided that it is unique and integral. The parameter that is used to bound the number of iterations is defined as follows.

###### Definition 16 ([SMW11]).

Given a polyhedron $P$ and a cost vector $w$, define $c(P,w)$ by

$$c(P,w) \triangleq \min_{x \in P \setminus \{x^*\}} \frac{w^T \cdot (x^* - x)}{\|x^* - x\|_1},$$

where .

By definition, , and if and only if the LP has a unique optimal solution. On the other hand, , where .
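To make the ratio in Definition 16 concrete, the following Python sketch evaluates $w^T \cdot (x^* - x) / \|x^* - x\|_1$ at a finite list of candidate points. It rests on several assumptions not stated in the text above: $x^*$ is taken to be an optimal solution of a maximization LP, the toy polytope $\{x \ge 0 : x_1 + x_2 \le 1\}$ and the weights are hypothetical, and the function names are illustrative. Since only finitely many points of $P$ are examined, the returned value is an upper bound on $c(P,w)$ rather than the exact minimum.

```python
def gap_ratio(w, x_star, x):
    """Return w^T (x_star - x) / ||x_star - x||_1 for x != x_star."""
    numerator = sum(wi * (a - b) for wi, a, b in zip(w, x_star, x))
    denominator = sum(abs(a - b) for a, b in zip(x_star, x))
    if denominator == 0:
        raise ValueError("x must differ from x_star")
    return numerator / denominator


def c_upper_bound(w, x_star, candidates):
    """Minimum gap ratio over candidate points of P other than x_star.

    This only upper-bounds c(P, w), because the minimum in the
    definition ranges over the whole polyhedron.
    """
    return min(gap_ratio(w, x_star, x)
               for x in candidates if tuple(x) != tuple(x_star))


# Hypothetical toy instance: maximize 3*x1 + 2*x2 over
# {x >= 0 : x1 + x2 <= 1}; the unique optimum is x_star = (1, 0).
w = (3, 2)
x_star = (1, 0)
candidates = [(0, 0), (0, 1), (1, 0), (0.5, 0.5)]

bound = c_upper_bound(w, x_star, candidates)
print(bound)
```

In this instance the candidate $(0,1)$ gives $(3-2)/2$ and $(0,0)$ gives $3/1$, so the smaller ratio comes from the feasible point whose objective value is closest to the optimum relative to its $\ell_1$ distance from $x^*$.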

We generalize the convergence result of Sanghavi et al. [SMW11] to nonbinary packing and covering problems. For the sake of brevity we state the result for packing problems; the analogous result for covering problems holds as well.

###### Theorem 17.

Let denote the polytope . Assume that every column of contains at most two s. Assume that the packing LP