In this paper we aim to define a robust family of sequential algorithms which can be easily adapted to the distributed setting. We then develop new tools to further enhance these algorithms, achieving state of the art results for fundamental problems.
We define a simple class of greedy-like algorithms which we call orderless-local algorithms. We show that given a legal coloring of the graph, every algorithm in this family can be converted into a distributed algorithm running in a number of communication rounds linear in the number of colors in the CONGEST model. We show that this family is indeed robust, as both the method of conditional expectations and the unconstrained submodular maximization algorithm of Buchbinder et al. [BFNS15] can be expressed as orderless-local algorithms for local utility functions, that is, utility functions which have a strong local nature to them.
We use the above algorithms as a base for new distributed approximation algorithms for four fundamental problems: Max Cut, MaxDiCut, Max 2SAT and correlation clustering. We develop algorithms which have the same approximation guarantees as their sequential counterparts, up to an additive factor, while achieving an running time for deterministic algorithms and running time for randomized ones. This improves exponentially upon the currently best known algorithms.
1 Introduction
1.1 Definitions and motivation
A large part of research in the distributed environment aims to develop fast distributed algorithms for problems which have already been studied in the sequential setting. Ideally, we would like to use the power of the distributed environment to achieve a substantial improvement in the running time over the sequential algorithm, and indeed, for many problems distributed algorithms achieve an exponential improvement over the sequential case. When designing a distributed algorithm the sequential algorithm is a natural starting point [CLS17, BCS17, BCGS17, GHS83, BS07]; certain adjustments are then made for the distributed environment in order to achieve a faster running time. In this paper we focus our attention on approximation algorithms for unconstrained optimization problems on graphs. We are given some graph , where each vertex is assigned a variable taking values in some set . We aim to maximize some utility function over these variables (for a formal definition see Section 2). Our distributed model is the CONGEST model of distributed computation, where the network is represented by a graph such that nodes are computational units and edges are communication links. Nodes communicate in synchronous communication rounds, where at each round a node sends and receives messages from all of its neighbors. In the CONGEST model the size of messages sent between nodes is limited to $O(\log n)$ bits, where $n$ is the number of nodes. This is more restrictive than the LOCAL model, where message size is unbounded. Our complexity measure is the number of communication rounds of the algorithm.
Adapting a sequential algorithm of the type described above to the distributed setting means that each node in the communication graph should output an assignment to its variable such that the approximation guarantee is close to that of the sequential algorithm, while minimizing the number of communication rounds of the distributed algorithm. Our goal is to formally define a family of sequential algorithms which can be easily converted to distributed algorithms, and then develop tools to allow these algorithms to run exponentially faster, while achieving almost the same approximation ratio.
Finally, we apply our techniques to the classical problems of Max Cut, MaxDiCut, Max 2SAT and correlation clustering, achieving an exponential improvement over the current state of the art distributed algorithms [CLS17], while losing only an additive factor in the approximation ratio.
1.2 Tools and results
We define a family of utility functions, which we call local utility functions (Formally defined in Section 2). We say that a utility function is a local utility function, if the change to the value of the function upon setting one variable can be computed locally. Intuitively, while optimizing a general utility function in the distributed setting might be difficult for global functions, the local nature of the family of local utility functions makes it a perfect candidate.
We focus on adapting a large family of potentially randomized algorithms to the distributed setting. We consider orderless-local algorithms: algorithms that can traverse the variables in any order and in each iteration apply some local function to decide the value of the variable. By local we mean that the decision only depends on the local environment of the node in the graph, the variables of nodes adjacent to that variable and some randomness only used by that node. This is similar to the family of Priority algorithms first defined in [BNR02]. The goal of [BNR02] was to formally define the notion of a greedy algorithm, and then to explore the limits of these algorithms. Our definition is similar (and can be expressed as a special case of priority algorithms), but the goal is different. While [BNR02] aims to prove lower bounds, we aim to characterize the sequential algorithms that can be easily transformed into fast distributed algorithms.
One might expect that due to the locality of this family of algorithms it can be distributed if the graph is provided with a legal coloring. The distributed algorithm goes over the color classes one after another and executes all nodes in the color class simultaneously. This resolves any conflicts that may arise from executing two neighboring nodes at once, while the orderless property guarantees that this execution is valid. In a sense this argument was already used for specific algorithms (Coloring to MIS [Lin92], MaxIS of [BCGS17], MaxCut of [CLS17]). We provide a more general result, using this classical argument. Specifically, we show that given a legal coloring, any orderless-local algorithm can be distributed in a number of communication rounds proportional to the number of colors in the CONGEST model.
To show that this definition is indeed robust, we show two general applications. The first is adapting the method of conditional expectations (formally defined in Section 2) to the distributed setting. This method is inherently sequential, but we show that if the utility function optimized is a local utility function, then the algorithm is an orderless-local algorithm. A classical application of this technique is for Max Cut, where a 1/2-approximation is achieved in expectation when every node chooses a cut side at random. This can be derandomized using the method of conditional expectations, and adapted to the distributed setting, as the cut function is a local utility function. We note that the exact same approach results in a 1/2-approximation for max-agree correlation clustering on general graphs (see Section 2 for a definition). Because the tools used for MaxCut directly translate to correlation clustering, we focus on MaxCut for the rest of the paper, and only mention correlation clustering at the very end.
The second application is the unconstrained submodular maximization algorithms of [BFNS15], where a deterministic 1/3-approximation algorithm and a randomized expected 1/2-approximation algorithm are presented. We show that both are orderless-local algorithms when provided with a local utility function. This can be applied to the problem of MaxDiCut, as it is an unconstrained submodular function, and also a local utility function. The algorithms of [BFNS15] were already adapted to the distributed setting for the specific problem of MaxDiCut by [CLS17] using similar ideas. The main benefit of our definition is the convenience and generality of adapting these algorithms without the need to consider their analysis or reprove correctness. We conclude that the family of orderless-local algorithms indeed contains robust algorithms for fundamental problems, and especially the method of conditional expectations, which has no analogue in the distributed environment to the best of our knowledge.
Next, we wish to consider the running time of these algorithms. Recall that we expressed the running time of orderless-local algorithms in terms of the number of colors of some legal coloring for the graph. For a general graph, we cannot hope for a legal coloring using fewer than $\Delta + 1$ colors, where $\Delta$ is the maximum degree in the graph. This means that using the distributed version of an orderless-local algorithm as is will have a running time linear in $\Delta$. We show how to overcome this obstacle for Max Cut and MaxDiCut. The general idea is to compute a defective coloring of the graph which uses few colors, drop all monochromatic edges, and call the algorithm for the new graph which now has a legal coloring.
A key tool in our algorithms is a new type of defective coloring we call an defective coloring. The classical defective coloring allows each vertex to have at most monochromatic edges, for some defect parameter . We require a fractional defect: at most fraction of the edges for any vertex are monochromatic. We show that an defective coloring using colors can be computed deterministically in rounds using the defective coloring algorithm of [Kuh09] as a black box.
Although we cannot guarantee a legal coloring with a small number of colors for any graph , we may remove some subset of which will result in a new graph with a low chromatic number. We wish to do so while not removing too many edges, which we prove guarantees the approximation will only be mildly affected for our cut problems. Formally, we show that if we only remove an fraction of the edges, we will incur an additive loss in the approximation ratio of the cut algorithms for . For the randomized algorithm this is easy, simply color each vertex randomly with a color in and drop all monochromatic edges. For the deterministic case, we execute our defective coloring algorithm, and then remove all monochromatic edges. We then execute the relevant cut algorithm on the resulting graph which now has a legal coloring, using a small number of colors. The above results in extremely fast approximation algorithms for Max Cut and MaxDiCut, while having almost the same approximation ratio as their sequential counterpart.
Finally, our techniques can also be applied to the problem of Max 2SAT. To do so we may use the randomized expected 3/4-approximation algorithm presented in [PSWvZ17]. It is based on the algorithm of [BFNS15], and thus is almost identical to the unconstrained submodular maximization algorithm. Because the techniques we use are very similar to the above, we defer the entire proof to the appendix (Section B). We state our results in Table 1.
Problem  Our Approx.  Our Time  Prev Approx.  Prev Time  Notes 

CorrelationClustering*      det.  
MaxCut  [CLS17]  det.  
Max Cut      det.  
MaxDicut  [CLS17]  det.  
MaxDicut  [CLS17]  rand.  
Max 2SAT      rand. 
1.3 Previous research
Cut problems:
An excellent overview of the MaxCut and MaxDiCut problems appears in [CLS17], which we follow in this section. Computing MaxCut exactly is NP-hard, as shown by Karp [Kar72] for the weighted version, and by [GJS76] for the unweighted case. As for approximations, it is impossible to improve upon a 16/17-approximation for MaxCut and a 12/13-approximation for MaxDiCut unless $P = NP$ [TSSW00, Hås01]. If every node chooses a cut side randomly, an expected 1/2-approximation for MaxCut, a 1/4-approximation for MaxDiCut and a approximation is achieved. This can be derandomized using the method of conditional expectations. In the breakthrough paper of Goemans and Williamson [GW95] a 0.878-approximation is achieved using semidefinite programming. This is optimal under the unique games conjecture [KKMO07]. In the same paper a 0.796-approximation for MaxDiCut was presented. This was later improved to 0.863 in [MatuuraM01]. Other results using different techniques are presented in [KS11, Tre12].
In the distributed setting the problem has not received much attention. A node may choose a cut side at random, achieving the same guarantees as above in constant time. In [HRSS14] a distributed algorithm for regular triangle-free graphs which achieves a approximation ratio in a single communication round is presented. The only results for general graphs in the distributed setting are due to [CLS17]. In the CONGEST model they present a deterministic 1/2-approximation for MaxCut, a deterministic 1/3-approximation for MaxDiCut, and a randomized expected 1/2-approximation for MaxDiCut running in communication rounds. The results for MaxDiCut follow from adapting the unconstrained submodular maximization algorithm of [BFNS15] to the distributed setting. Better results are presented for the LOCAL model; we refer the reader to [CLS17] for the full details.
Max 2SAT
The decision version of Max 2SAT is NP-complete [GJS76], and there exist several approximation algorithms [GW95, FG95, LLZ02, MM01], of which the best currently known approximation ratio is 0.9401 [LLZ02]. In [Aus07] it is shown that, assuming the unique games conjecture, the approximation factor of [LLZ02] cannot be improved. Assuming only that $P \neq NP$, it cannot be approximated to within a 21/22 factor [Hås01]. To the best of our knowledge the problem of Max 2SAT (or MaxSAT) was not studied in the distributed model.
Correlation clustering
An excellent overview of correlation clustering (see Section 2 for a definition) appears in [ACG15], which we follow in this section. Correlation clustering was first defined by [BBC02]. Solving the problem exactly is NP-hard, thus we are left with designing approximation algorithms for the problem. Here one can try to approximate max-agree or min-disagree. If the graph is a clique, there exists a PTAS for max-agree [BBC02, GG06], and a 2.06-approximation for min-disagree [CMSY15]. For general (even weighted) graphs there exists a 0.7666-approximation for max-agree [CGW05, Swa04], and an $O(\log n)$-approximation for min-disagree [DEFI06]. A trivial 1/2-approximation for max-agree on general graphs can be achieved by considering two candidate solutions, one putting every node in a separate cluster and one putting all nodes in a single cluster, and taking the more profitable of the two.
In the distributed setting little is known about correlation clustering. In [CHK16] a dynamic distributed MIS algorithm is provided, and it is stated that this achieves a 3-approximation for min-disagree correlation clustering as it simulates the classical algorithm of Ailon et al. [ACN08]. We note that the algorithm of Ailon et al. assumes the graph to be a clique, thus the above result is limited to complete graphs where the edges of the communication graph are taken to be the positive edges, and the non-edges are taken as the negative edges (as indeed for general graphs, the problem is APX-hard, and difficult to approximate better than [DEFI06]). We also note that using only two clusters, where each node chooses a cluster at random, guarantees an expected 1/2-approximation for max-agree on general graphs. We derandomize this approach in this paper.
2 Preliminaries
Sequential algorithms
The main goal of this paper is converting sequential graph algorithms for unconstrained maximization (or minimization) to distributed graph algorithms. Let us first formally define this family of algorithms. The sequential algorithm receives as input a graph , and we associate each vertex with a variable taking values in some finite set . The algorithm outputs a set of assignments . The goal of the algorithm is to maximize some utility function taking in a graph and the set of assignments and outputting some value in . For simplicity we assume that the order of the variables in does not affect , so we use a set notation instead of a vector notation. We somewhat abuse notation, and when assigning a variable we write , meaning that any other assignment to is removed from the set . We also omit as a parameter when it is clear from context.
When considering randomized algorithms we assume the algorithm takes in a vector of random bits denoted by . This way of representing random algorithms is identical to having the algorithm generate the random coins, and we use these two definitions interchangeably. The randomized algorithm aims to maximize the expectation of , where the expectation is taken over the random bits of the algorithm.
Max Cut, MaxDiCut
In this paper we provide fast distributed approximation algorithms for several fundamental problems, which we now define formally. In the Max Cut problem we wish to divide the vertices into two disjoint sets, such that the number of edges between the two sets is maximized. In the MaxDiCut problem the edges are directed and we wish to divide the vertices into two disjoint sets such that the number of edges directed from the first set to the second is maximized.
Max 2SAT
In the Max 2SAT problem we are given a set of unique clauses over some set of variables, where each clause contains at most two literals. Our goal is to maximize the number of satisfied clauses. This problem is more general than the cut problems, so we must define what it means in the distributed context. First, the variables will be node variables as defined before. Second, each node knows all of the clauses in which its variable appears as a literal. Finally, all problems considered here have weighted counterparts, which we do not consider in this paper.
Correlation clustering
We are given a graph in which each edge is also labeled as either positive or negative. Given some partition of the graph into disjoint clusters, we say that an edge agrees with the partition if it is positive and both endpoints are in the same cluster, or it is negative and its endpoints are in different clusters. Otherwise we say it disagrees with the partition. We aim to find a partition, using any number of clusters, such that the number of edges that agree with it (agreements) is maximized (max-agree), or equivalently the number of edges that disagree with it is minimized (min-disagree).
The problem is usually expressed as an LP using edge variables, where each variable indicates whether the nodes are in the same cluster. This allows a solution to use any number of clusters. In this paper we only aim to achieve a 2-approximation for the problem. This can be done rather simply without employing the full power of correlation clustering. Specifically, two clusters are enough for our case as we show that we can deterministically achieve agreements which results in the desired approximation ratio.
Local utility functions
We are interested in a type of utility function which we call a local utility function. Before we continue with the definition let us define an operator on assignments , we define . For convenience, when we pass as parameter to a function, we assume the function also receives the local topology of which we do not write explicitly. We say that a utility function , as defined above, is a local utility function if for every there exists a function s.t . That is, to compute the change in the utility function which is caused by changing from to , we need to only know the local topology of , and the assignment to neighboring node variables. We note that for the cut problems considered in this paper the utility functions are indeed local utility functions. This is proven in the following Lemma (proof deferred to the appendix):
Lemma 1.
The utility functions for Max Cut, MaxDiCut and maxagree correlation clustering with 2 clusters are local utility functions.
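To make the locality property concrete for Max Cut, the following minimal sketch compares a global recomputation of the cut value with a computation that looks only at the neighbors of the flipped vertex. The adjacency-list representation and the function names `cut_value` and `local_flip_gain` are illustrative assumptions, not notation from this paper.

```python
from typing import Dict, Hashable, List

# Undirected graph as adjacency lists (an assumed representation).
Graph = Dict[Hashable, List[Hashable]]


def cut_value(graph: Graph, side: Dict[Hashable, int]) -> int:
    """Global Max Cut utility: number of edges whose endpoints are on different sides."""
    # Every edge is seen from both endpoints, hence the division by two.
    return sum(1 for u in graph for v in graph[u] if side[u] != side[v]) // 2


def local_flip_gain(graph: Graph, side: Dict[Hashable, int], v: Hashable) -> int:
    """Change in the cut value if v switches sides, computed from v's neighbors only."""
    same = sum(1 for u in graph[v] if side[u] == side[v])
    different = len(graph[v]) - same
    # Edges to same-side neighbors become cut; edges to other-side neighbors stop being cut.
    return same - different


graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
side = {0: 0, 1: 0, 2: 0, 3: 1}

before = cut_value(graph, side)
gain = local_flip_gain(graph, side, 2)
side[2] = 1
after = cut_value(graph, side)
```

In this toy instance the locally computed gain coincides with the difference of the two global evaluations, which is the behavior the definition of a local utility function asks for.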
Submodular functions and graph cuts
A family of functions that will be of interest in this paper is the family of submodular functions. A function is called a set function, with ground set . It is said to be submodular if for every it holds that . The functions we are interested in have as their ground set, thus we remain with our original notation, setting and having take in a set of binary assignments as a parameter.
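As a small sanity check, the sketch below verifies by brute force that a directed cut function on a toy graph is submodular. It assumes the standard form of the definition, f(A) + f(B) >= f(A ∪ B) + f(A ∩ B) for all subsets A, B of the ground set; the helper names `dicut_value` and `is_submodular` are hypothetical.

```python
from itertools import combinations


def dicut_value(edges, s):
    """Number of directed edges (u, v) with u inside s and v outside s."""
    return sum(1 for u, v in edges if u in s and v not in s)


def is_submodular(f, ground):
    """Brute-force check of f(A) + f(B) >= f(A | B) + f(A & B) over all pairs of subsets."""
    subsets = [frozenset(c)
               for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    return all(f(a) + f(b) >= f(a | b) + f(a & b)
               for a in subsets for b in subsets)


ground = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 1)]

dicut_is_submodular = is_submodular(lambda s: dicut_value(edges, s), ground)
# A function that is not submodular, for contrast: the squared cardinality.
square_is_submodular = is_submodular(lambda s: len(s) ** 2, ground)
```

The check is exponential in the size of the ground set, so it is only meant for very small examples.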
The method of conditional expectations.
Next, we consider the method of conditional expectations. Let , and let be a vector of random variables taking values in . We wish to be consistent with the previous notation, thus we treat as a set of assignments. If , then there is an assignment of values such that . We describe how to find the vector . We first note that from the law of total expectation it holds that , and therefore for at least some it holds that . We set this value to be . We then repeat this process for the rest of the values in , which results in the set . In order for this method to work we need it to be possible to compute the conditional expectation of .
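The procedure above can be sketched for Max Cut, where each unassigned vertex is assumed to pick a side independently and uniformly at random. Under that assumption the conditional expectation has a closed form: an edge with both endpoints fixed contributes 0 or 1, and any other edge contributes 1/2. The function names are illustrative only.

```python
def expected_cut(edges, side):
    """Expected cut size given a partial assignment; unassigned vertices are uniform."""
    total = 0.0
    for u, v in edges:
        if u in side and v in side:
            total += 1.0 if side[u] != side[v] else 0.0
        else:
            total += 0.5  # at least one endpoint is still a fair coin
    return total


def derandomized_cut(vertices, edges):
    """Fix the vertices one by one, never letting the conditional expectation drop."""
    side = {}
    for v in vertices:  # any traversal order works
        side[v] = max((0, 1), key=lambda s: expected_cut(edges, {**side, v: s}))
    return side


vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
side = derandomized_cut(vertices, edges)
cut = sum(1 for u, v in edges if side[u] != side[v])
```

Since the initial expectation is half the number of edges and no step decreases it, the resulting deterministic cut contains at least half of the edges. For simplicity this sketch recomputes the expectation over all edges; only the edges incident to the current vertex actually matter for the comparison.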
Graph coloring
A coloring for is defined as a function . For simplicity we treat any set of size with some ordering as the set of integers . This simplifies things as we can always consider , which is very convenient. We say that a coloring is a legal coloring if such that it holds that . For any graph there exists a legal coloring, and one can be computed in time sublinear in in the CONGEST model. We present the following theorem of Barenboim [Bar15] which will be of use later (where hides factors polylogarithmic in ):
Theorem 2.
There exists a deterministic distributed algorithm in the CONGEST model which can compute a legal coloring for any graph in communication rounds.
Another important tool in this paper is defective coloring. Let us fix some coloring function . We define the defect of a vertex to be the number of monochromatic edges it has. Formally, . We call a coloring with defect if it holds that . We present the following result by Kuhn [Kuh09].
Theorem 3.
For all an coloring with defect can be computed deterministically in rounds in the CONGEST model.
In this paper we define a new kind of defective coloring. We say that is an defective coloring for if it holds that . In Section 3 we show that this coloring can be computed deterministically in logarithmic time in the CONGEST model.
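The fractional notion of defect can be illustrated with a small checker that, for a given coloring, reports the largest fraction of monochromatic edges at any vertex, together with a helper that drops the monochromatic edges. This is only a verification sketch under an assumed adjacency-list representation; it does not compute such a coloring, and the names are hypothetical.

```python
def max_fractional_defect(graph, color):
    """Largest fraction of monochromatic incident edges over all non-isolated vertices."""
    worst = 0.0
    for v, neighbors in graph.items():
        if not neighbors:
            continue
        mono = sum(1 for u in neighbors if color[u] == color[v])
        worst = max(worst, mono / len(neighbors))
    return worst


def drop_monochromatic_edges(graph, color):
    """Remove every edge whose endpoints share a color; the coloring becomes legal."""
    return {v: [u for u in neighbors if color[u] != color[v]]
            for v, neighbors in graph.items()}


graph = {0: [1, 2, 3, 4], 1: [0, 2, 3, 4], 2: [0, 1], 3: [0, 1], 4: [0, 1]}
color = {0: 0, 1: 0, 2: 1, 3: 2, 4: 1}

defect = max_fractional_defect(graph, color)
pruned = drop_monochromatic_edges(graph, color)
pruned_defect = max_fractional_defect(pruned, color)
```

Here only the edge between vertices 0 and 1 is monochromatic, so each of them loses a quarter of its edges, and the same coloring is legal for the pruned graph.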
3 A fast distributed coloring algorithm
In this section we present a deterministic round distributed algorithm in the CONGEST model which returns an defective coloring using colors. Our coloring algorithm (Algorithm 1) first buckets the vertices according to their degree. Let , where . The th bucket contains all vertices with degree within . Denote the graph induced by the th bucket by . This bucketing defines a family of vertex-disjoint induced graphs (thus, they are also edge-disjoint). We note that the degree of a vertex can only become smaller in the induced graph, so is an upper bound of the degree of .
All buckets calculate a coloring simultaneously, where buckets with execute the legal coloring algorithm of Barenboim, while the rest execute the defective coloring algorithm of Kuhn with defect parameter . This is because the defective coloring algorithm requires the defect parameter to be greater than 1. Finally, the color of vertex is set to be the tuple , where is the bucket which it belongs to, and is the color assigned to it by the defective coloring algorithm. As for order, we use standard lexicographic order for the tuples, and linearize the coloring so the notion of will be properly defined.
The above guarantees that when we go back to the original graph , the defect of the vertex does not increase. This results in an coloring such that the defect of vertex does not exceed . We continue to formally prove the above claims.
Lemma 4.
For every vertex it holds that .
Proof.
For every the induced graph has degree at most . Applying the defective coloring algorithm with parameter results in a coloring with defect at most for the induced graph. When we go back to the original graph, all vertices not in the induced graph have a different color, thus the defect does not change and is bounded by . ∎
As for the running time, we first note that because the induced graphs are edge disjoint, this simultaneous execution does not cause any congestion. Executing the defective coloring algorithm requires rounds, while executing the coloring algorithm of Barenboim on the low degree buckets requires (Here hides factors polylogarithmic in ). We assume to be some constant in order to simplify things. Thus we get the main theorem for this section.
Theorem 5.
There exists a deterministic distributed defective coloring algorithm in the CONGEST model, running in communication rounds for constant values of .
4 Orderless-local algorithms
Next we turn our attention to a large family of (potentially randomized) greedy algorithms. We limit ourselves to graph algorithms such that every node has a variable taking values in some set . We aim to maximize some global utility function . We focus on a class of algorithms we call orderless-local algorithms. These are greedy-like algorithms which may traverse the vertices in any order, and at each step decide upon a value for . This decision is local, meaning that it only depends on the 1-hop topology of and the values of neighboring variables. The decision may be random, but each variable has its own random bits, keeping the decision process local.
The code for a generic algorithm of this family is given in Algorithm 2. The algorithm first initializes the vertex variables. Next it traverses the variables in some order . Each is assigned a value according to some function , which only depends on and some random bits which are only used to set the value for that variable. Finally the assignment to the variables is returned. We are guaranteed that the expected value of is at least for any ordering of the variables. Formally, .
We show that this family of algorithms can be easily distributed using coloring, where the running time of the distributed version depends on the number of colors. The distributed version, OLDist, is presented as Algorithm 5 in the appendix. The variables are all initialized as in the sequential version, then the color classes are executed sequentially, while in each color class the nodes execute simultaneously, and then send the newly assigned value to all neighbors. Decide does not communicate with the neighbors, so the number of communication rounds the algorithm needs is proportional to the number of color classes.
It is easy to see that given the same randomness both the sequential and distributed algorithms output the same result. This is because all decisions of the distributed algorithm only depend on the 1-hop environment of a vertex, and we are provided with a legal coloring. Thus, one round of the distributed algorithm is equivalent to many steps of the sequential algorithm. We prove the following lemma (the proof is deferred to the appendix):
Lemma 6.
For any graph with a legal coloring , there exists an order on the variables s.t it holds that for any .
Finally we show that for any graph with a legal coloring , it holds that . We know from Lemma 6 that for any coloring there exists an ordering s.t for any . The proof is direct from here:
We conclude that any orderless-local algorithm can be distributed, achieving the same performance guarantee on , and requiring communication rounds to finish, given a legal coloring. We state the following theorem:
Theorem 7.
Given some utility function , any sequential orderless-local algorithm for which it holds that , can be converted into a distributed algorithm for which it holds that , where is a legal coloring of the graph. The running time of the distributed algorithm is communication rounds.
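The conversion can be illustrated by a small simulation. The sketch below uses a hypothetical local rule `decide` (join the side opposite to most already-assigned neighbors, breaking ties with the node's private random bit) and compares a sequential run with a run that processes the color classes one after another. The rule, the graph representation and the function names are assumptions made for illustration, not the algorithms referenced in the text.

```python
def decide(v, graph, values, rand_bit):
    """Hypothetical local rule: depends only on v's neighbors and v's own random bit."""
    ones = sum(1 for u in graph[v] if values[u] == 1)
    zeros = sum(1 for u in graph[v] if values[u] == 0)
    if ones == zeros:
        return rand_bit
    return 0 if ones > zeros else 1


def run_sequential(graph, order, bits):
    values = {v: None for v in graph}  # None marks an unassigned variable
    for v in order:
        values[v] = decide(v, graph, values, bits[v])
    return values


def run_by_color_classes(graph, color, bits):
    values = {v: None for v in graph}
    rounds = 0
    for c in sorted(set(color.values())):
        snapshot = dict(values)  # what the nodes know at the start of the round
        for v in graph:
            if color[v] == c:    # all nodes of this color decide simultaneously
                values[v] = decide(v, graph, snapshot, bits[v])
        rounds += 1              # one round to announce the new values
    return values, rounds


# A 5-cycle with a legal 3-coloring and fixed private random bits.
graph = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
color = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
bits = {0: 1, 1: 0, 2: 0, 3: 1, 4: 1}

order = sorted(graph, key=lambda v: color[v])  # an order consistent with the color classes
sequential = run_sequential(graph, order, bits)
distributed, rounds = run_by_color_classes(graph, color, bits)
```

Because nodes of the same color are never adjacent, deciding from a common snapshot gives the same values as deciding one after another, so the two runs agree while the distributed run uses one round per color.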
4.1 Distributed derandomization
We consider the method of conditional expectations in the distributed case for some local utility function , as defined in the preliminaries. Assume that the value of every is set independently at random according to some distribution on which depends only on the local topology of . We are guaranteed that . Thus in the sequential setting we may use the method of conditional expectations to compute a deterministic assignment to the variables with the same guarantee. We show that because is a local utility function, the method of conditional expectations applied on is an orderlesslocal algorithm, and thus can be distributed.
Initially all variables are initialized to some value , meaning the variable is unassigned. Let be some partial assignment to the variables. The method of conditional expectations goes over the variables in any order, and in each iteration sets . This is equivalent to , as the subtracted term is just a constant. With this in mind, we present the pseudocode for the method of conditional expectations in Algorithm 3.
To show that Algorithm 3 is an orderless-local algorithm we need only show that can be computed locally for any . We state the following lemma (proof deferred to the appendix), followed by the main theorem for this section.
Lemma 8.
The value can be computed locally.
Theorem 9.
Let be any graph and a local utility function for which it holds that , where the random assignments to the variables are independent of each other, and depend only on the local topology of the node. There exists a distributed algorithm achieving the same expected value for , running in communication rounds in the CONGEST model, given a legal coloring.
4.2 Submodular Maximization
In this section we consider both the deterministic and randomized algorithms of [BFNS15], achieving 1/3 and 1/2 approximation ratios for unconstrained submodular maximization. We show that both can be expressed as orderless-local algorithms for any local utility function. As the deterministic and randomized algorithms of [BFNS15] are almost identical, we focus on the randomized algorithm achieving a 1/2-approximation in expectation (Algorithm 4), as it is a bit more involved (the deterministic algorithm appears as Algorithm 6 in the appendix). The algorithms of [BFNS15] are defined for any submodular function, but as we are interested only in the case where the ground set is , we will present it as such.
The algorithm maintains two variable assignments , initially , . It iterates over the variables in any order, and at each iteration it considers two nonnegative quantities . These quantities represent the gain of either setting in or setting in . Next a coin is flipped with probability , if we set . If we get heads we set in and otherwise we set it to 0 in . When the algorithm ends it holds that , and this is our solution. The deterministic algorithm is almost identical, except that it allows to take negative values, and instead of flipping a coin it makes the decision greedily by comparing .
We first note that the algorithm does not directly fit into our mold, as each vertex has two variables. We can overcome this by taking to be a binary tuple, where the first coordinate stores its value for , and the other for . Initially it holds that , and our final goal function will only take the first coordinate of the variable. We note that because is a local utility function the values can be computed locally. This results directly from the definition of a local utility function, as we are interested in the change in caused by flipping a single variable. Now we may rewrite the algorithm as an orderless-local algorithm; the pseudocode appears in the appendix as Algorithm 7. Using Theorem 7 we state our main result:
Theorem 10.
For any graph and a local unconstrained submodular function with as its ground set, there exist a randomized distributed 1/2-approximation algorithm and a deterministic 1/3-approximation algorithm running in communication rounds in the CONGEST model, given a legal coloring.
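For reference, the sequential double greedy scheme of [BFNS15] can be sketched as follows for a set function given as a Python callable, here a directed cut function on a toy graph. The set-based formulation, the function names and the brute-force optimum used for comparison are illustrative assumptions; the distributed, tuple-based formulation discussed above is not reproduced here.

```python
import random
from itertools import combinations


def dicut_value(edges, s):
    """Number of directed edges leaving the vertex set s."""
    return sum(1 for u, v in edges if u in s and v not in s)


def double_greedy(vertices, f, rng=None):
    """rng=None: deterministic variant; a random.Random instance: randomized variant."""
    x, y = set(), set(vertices)
    for u in vertices:                  # any traversal order is allowed
        a = f(x | {u}) - f(x)           # gain of adding u to x
        b = f(y - {u}) - f(y)           # gain of removing u from y
        if rng is None:
            take = a >= b               # greedy comparison of the two gains
        else:
            a_pos, b_pos = max(a, 0), max(b, 0)
            p = 1.0 if a_pos + b_pos == 0 else a_pos / (a_pos + b_pos)
            take = rng.random() < p     # biased coin
        if take:
            x.add(u)
        else:
            y.discard(u)
    return x                            # x and y coincide at this point


def brute_force_opt(vertices, f):
    """Exhaustive optimum, only feasible for tiny instances."""
    return max(f(set(c))
               for r in range(len(vertices) + 1)
               for c in combinations(vertices, r))


vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 1)]
f = lambda s: dicut_value(edges, s)

det_solution = double_greedy(vertices, f)
rand_solution = double_greedy(vertices, f, random.Random(0))
opt = brute_force_opt(vertices, f)
```

The 1/3 guarantee of the deterministic variant can be checked directly against the exhaustive optimum on this instance, whereas the 1/2 guarantee of the randomized variant holds only in expectation and is therefore not asserted for a single run.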
4.3 Fast approximations for cut functions
Using the results of the previous sections we can provide fast and simple approximation algorithms for MaxDiCut and Max Cut. Lemma 1 guarantees that the utility functions for these problems are indeed local utility functions. For MaxDiCut we use the algorithms of Buchbinder et al., as this is an unconstrained submodular function. For Max Cut each node choosing a side uniformly at random achieves a 1/2-approximation in expectation, thus we use the results of Section 4.1. Theorem 10 and Theorem 9 immediately guarantee distributed algorithms, running in communication rounds given a legal coloring.
Denote by one of the cut algorithms guaranteed by Theorem 10 or Theorem 9. We present two algorithms: approxCutDet, a deterministic algorithm to be used when is deterministic (Algorithm 9 in the appendix), and approxCutRand, a randomized algorithm (Algorithm 10 in the appendix) for the case when is randomized. approxCutDet works by coloring the graph using an defective coloring and then defining a new graph by dropping all of the monochromatic edges. This means that the coloring is a legal coloring for . Finally we call one of the deterministic cut functions. approxCutRand is identical, apart from the fact that nodes choose a color uniformly at random from .
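The preprocessing step of the randomized variant can be sketched as follows: every node picks a color uniformly at random and all monochromatic edges are dropped, after which the coloring is legal for the remaining graph. The palette size `num_colors` is a hypothetical parameter introduced for illustration, as are the function and variable names.

```python
import random


def random_color_and_prune(vertices, edges, num_colors, rng):
    """Color each vertex uniformly at random and keep only the bichromatic edges."""
    color = {v: rng.randrange(num_colors) for v in vertices}
    kept = [(u, v) for u, v in edges if color[u] != color[v]]
    return color, kept


rng = random.Random(1)
vertices = list(range(6))
edges = [(u, v) for u in vertices for v in vertices if u < v]  # complete graph on 6 nodes

color, kept = random_color_and_prune(vertices, edges, num_colors=4, rng=rng)
removed_fraction = 1 - len(kept) / len(edges)
```

With independent uniform colors each edge is monochromatic with probability 1/num_colors, so the expected fraction of removed edges is 1/num_colors, and a larger palette trades more rounds for a smaller loss.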
For approxCutDet, the running time of the coloring is rounds, returning an defective coloring. The running time of the cut algorithms is the number of colors, thus the total running time of the algorithm is rounds. Using the same reasoning, the running time of approxCutRand is . It is only left to prove the approximation ratio. We prove the following lemma:
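The pipeline described above can be sketched as a sequential simulation. This is a minimal sketch under stated assumptions: the defective coloring is taken as an input dictionary (we do not implement the coloring algorithm itself, and the `v % 3` coloring in the usage example is only a stand-in), and the cut routine is the greedy derandomization of the random assignment for Max Cut. All names are hypothetical.

```python
def drop_monochromatic(edges, color):
    """Keep only edges whose endpoints received different colors."""
    return [(u, v) for u, v in edges if color[u] != color[v]]


def greedy_cut_by_color_class(vertices, edges, color):
    """Process color classes one after another (one round per class).

    Within a class, every vertex decides from the same snapshot of the
    state, as it would in a single synchronous communication round.
    """
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = {}
    for c in sorted(set(color.values())):
        snapshot = dict(side)
        for v in vertices:
            if color[v] != c:
                continue
            ones = sum(1 for w in adj[v] if snapshot.get(w) == 1)
            zeros = sum(1 for w in adj[v] if snapshot.get(w) == 0)
            # Join the side opposite to the majority of decided neighbors.
            side[v] = 0 if ones >= zeros else 1
    return side


def cut_size(edges, side):
    return sum(1 for u, v in edges if side[u] != side[v])


if __name__ == "__main__":
    vertices = list(range(8))
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 0),
             (0, 3), (1, 4), (2, 5), (0, 4)]
    color = {v: v % 3 for v in vertices}  # stand-in for a defective coloring
    kept = drop_monochromatic(edges, color)
    side = greedy_cut_by_color_class(vertices, kept, color)
    print(len(kept), cut_size(kept, side), cut_size(edges, side))
```

Because the coloring is legal on the graph with monochromatic edges removed, every kept edge is examined exactly once, by its later-colored endpoint, so the greedy rule cuts at least half of the kept edges.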
Lemma 11.
Let be any graph, and let be a graph resulting from removing any subset of edges from of size at most . Then for any constant , any approximation for MaxDiCut or Max Cut for is a approximation for .
Proof.
Let be the size of optimal solutions for . It holds that , as any solution for is also a solution for whose value differs by at most (the discarded edges). Using the probabilistic method it is easy to see that for MaxDiCut and Max Cut it holds that . Using all of the above we can say that given a approximate solution for it holds that:
∎
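For the reader's convenience, the chain of inequalities behind Lemma 11 can be sketched as follows. The notation here is our own and is an assumption, not the paper's: $m$ is the number of edges of the original graph $G$, at most $\epsilon m$ edges are removed to obtain $G'$, $OPT$ and $OPT'$ are the optimal values for $G$ and $G'$, $S$ is an $\alpha$-approximate solution for $G'$, and we use the bound $OPT \ge m/4$, which holds for both Max Cut and MaxDiCut by a uniformly random assignment. The constants in the original statement may differ.

```latex
\begin{align*}
\mathrm{val}_{G}(S) \;\ge\; \mathrm{val}_{G'}(S)
  &\;\ge\; \alpha \cdot OPT' \\
  &\;\ge\; \alpha \,(OPT - \epsilon m) \\
  &\;\ge\; \alpha \,(OPT - 4\epsilon \cdot OPT)
   \;=\; \alpha (1 - 4\epsilon)\, OPT .
\end{align*}
```

The first inequality holds because restoring the removed edges can only add cut edges, and the third because any solution for $G$ loses at most the removed edges when evaluated on $G'$.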
Lemma 11 immediately guarantees the approximation ratio for the deterministic algorithm. As for the randomized algorithm, let the random variable be the fraction of edges removed, let be the approximation ratio guaranteed by one of the cut algorithms and let be the approximation ratio achieved by approxCutRand. We know that . Applying the law of total expectation we get that . We state our main theorems for this section.
Theorem 12.
There exists a deterministic approximation algorithm for Max Cut running in communication rounds in the CONGEST model.
Theorem 13.
There exists a deterministic approximation algorithm for MaxDiCut running in communication rounds in the CONGEST model.
Theorem 14.
There exists a randomized distributed expected approximation for MaxDiCut running in communication rounds in the CONGEST model.
Correlation clustering
We note that the same techniques used for Max Cut work directly for max-agree correlation clustering on general graphs. Specifically, we divide the nodes into two clusters, s.t each node selects a cluster uniformly at random. Each edge agrees with the clustering with probability exactly 1/2, thus the expected value of the clustering is , which is a 1/2-approximation. The above can be derandomized in exactly the same manner as Max Cut, meaning this is an orderless-local algorithm. Finally, we apply the defective coloring algorithm, discard all monochromatic edges and execute the deterministic algorithm guaranteed by Theorem 9 with a legal coloring. Because there must exist a clustering which achieves at least agreements, a lemma identical to Lemma 11 can be proved and we are done. We state the following theorem:
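The claim that a uniformly random 2-clustering agrees with half of the edges in expectation can be checked by exhaustive enumeration on a small instance. The sketch below is illustrative only; the edge lists and function names are hypothetical, and edges are assumed to join distinct vertices.

```python
from itertools import product


def agreements(pos_edges, neg_edges, cluster):
    """Positive edges inside a cluster plus negative edges across clusters."""
    agree_pos = sum(1 for u, v in pos_edges if cluster[u] == cluster[v])
    agree_neg = sum(1 for u, v in neg_edges if cluster[u] != cluster[v])
    return agree_pos + agree_neg


def average_agreements(n, pos_edges, neg_edges):
    """Exact expectation over all 2^n equally likely 2-clusterings."""
    total = sum(agreements(pos_edges, neg_edges, bits)
                for bits in product((0, 1), repeat=n))
    return total / 2 ** n


if __name__ == "__main__":
    pos_edges = [(0, 1), (2, 3)]
    neg_edges = [(1, 2), (0, 3), (0, 2)]
    print(average_agreements(4, pos_edges, neg_edges))
```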
Theorem 15.
There exists a deterministic approximation algorithm for max-agree correlation clustering on general graphs, running in communication rounds in the CONGEST model.
Acknowledgements:
We would like to thank Ami Paz, Seri Khoury and Kenichi Kawarabayashi for many fruitful discussions and useful advice.
References
 [ACG15] Kook Jin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, and Anthony Wirth. Correlation clustering in data streams. In ICML, volume 37 of JMLR Workshop and Conference Proceedings, pages 2237–2246. JMLR.org, 2015.
 [ACN08] Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: Ranking and clustering. J. ACM, 55(5):23:1–23:27, 2008.
 [Aus07] Per Austrin. Balanced max 2-sat might not be the hardest. In STOC, pages 189–197. ACM, 2007.
 [Bar15] Leonid Barenboim. Deterministic (Δ + 1)-coloring in sublinear (in Δ) time in static, dynamic and faulty networks. In Chryssis Georgiou and Paul G. Spirakis, editors, Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC 2015, Donostia-San Sebastián, Spain, July 21-23, 2015, pages 345–354. ACM, 2015.
 [BBC02] Nikhil Bansal, Avrim Blum, and Shuchi Chawla. Correlation clustering. In FOCS, page 238. IEEE Computer Society, 2002.
 [BCGS17] Reuven Bar-Yehuda, Keren Censor-Hillel, Mohsen Ghaffari, and Gregory Schwartzman. Distributed approximation of maximum independent set and maximum matching. In PODC, pages 165–174. ACM, 2017.
 [BCS17] Reuven Bar-Yehuda, Keren Censor-Hillel, and Gregory Schwartzman. A distributed (2 + ε)-approximation for vertex cover in O(log Δ / ε log log Δ) rounds. J. ACM, 64(3):23:1–23:11, 2017.
 [BFNS15] Niv Buchbinder, Moran Feldman, Joseph Naor, and Roy Schwartz. A tight linear time (1/2)-approximation for unconstrained submodular maximization. SIAM J. Comput., 44(5):1384–1402, 2015.
 [BNR02] Allan Borodin, Morten N. Nielsen, and Charles Rackoff. (incremental) priority algorithms. In SODA, pages 752–761. ACM/SIAM, 2002.
 [BS07] Surender Baswana and Sandeep Sen. A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs. Random Struct. Algorithms, 30(4):532–563, 2007.
 [CGW05] Moses Charikar, Venkatesan Guruswami, and Anthony Wirth. Clustering with qualitative information. J. Comput. Syst. Sci., 71(3):360–383, 2005.
 [CHK16] Keren Censor-Hillel, Elad Haramaty, and Zohar S. Karnin. Optimal dynamic distributed MIS. In PODC, pages 217–226. ACM, 2016.
 [CLS17] Keren Censor-Hillel, Rina Levy, and Hadas Shachnai. Fast distributed approximation for max-cut. CoRR, abs/1707.08496, 2017.
 [CMSY15] Shuchi Chawla, Konstantin Makarychev, Tselil Schramm, and Grigory Yaroslavtsev. Near optimal LP rounding algorithm for correlation clustering on complete and complete k-partite graphs. In STOC, pages 219–228. ACM, 2015.
 [DEFI06] Erik D. Demaine, Dotan Emanuel, Amos Fiat, and Nicole Immorlica. Correlation clustering in general weighted graphs. Theor. Comput. Sci., 361(2-3):172–187, 2006.
 [FG95] Uriel Feige and Michel X. Goemans. Approximating the value of two prover proof systems, with applications to MAX 2SAT and MAX DICUT. In ISTCS, pages 182–189. IEEE Computer Society, 1995.
 [GG06] Ioannis Giotis and Venkatesan Guruswami. Correlation clustering with a fixed number of clusters. Theory of Computing, 2(13):249–266, 2006.
 [GHS83] Robert G. Gallager, Pierre A. Humblet, and Philip M. Spira. A distributed algorithm for minimum-weight spanning trees. ACM Trans. Program. Lang. Syst., 5(1):66–77, 1983.
 [GJS76] M. R. Garey, David S. Johnson, and Larry J. Stockmeyer. Some simplified NP-complete graph problems. Theor. Comput. Sci., 1(3):237–267, 1976.
 [GW95] Michel X. Goemans and David P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115–1145, 1995.
 [Hås01] Johan Håstad. Some optimal inapproximability results. J. ACM, 48(4):798–859, 2001.
 [HRSS14] Juho Hirvonen, Joel Rybicki, Stefan Schmid, and Jukka Suomela. Large cuts with local algorithms on triangle-free graphs. CoRR, abs/1402.2543, 2014.
 [Kar72] Richard M. Karp. Reducibility among combinatorial problems. In Complexity of Computer Computations, The IBM Research Symposia Series, pages 85–103. Plenum Press, New York, 1972.
 [KKMO07] Subhash Khot, Guy Kindler, Elchanan Mossel, and Ryan O’Donnell. Optimal inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM J. Comput., 37(1):319–357, 2007.
 [KS11] Satyen Kale and C. Seshadhri. Combinatorial approximation algorithms for max-cut using random walks. In ICS, pages 367–388. Tsinghua University Press, 2011.
 [Kuh09] Fabian Kuhn. Weak graph colorings: distributed algorithms and applications. In SPAA, pages 138–144. ACM, 2009.
 [Lin92] Nathan Linial. Locality in distributed graph algorithms. SIAM J. Comput., 21(1):193–201, 1992.
 [LLZ02] Michael Lewin, Dror Livnat, and Uri Zwick. Improved rounding techniques for the MAX 2-SAT and MAX DI-CUT problems. In IPCO, volume 2337 of Lecture Notes in Computer Science, pages 67–82. Springer, 2002.
 [MM01] Shiro Matuura and Tomomi Matsui. 0.863-approximation algorithm for MAX DICUT. In RANDOM-APPROX, volume 2129 of Lecture Notes in Computer Science, pages 138–146. Springer, 2001.
 [PSWvZ17] Matthias Poloczek, Georg Schnitger, David P. Williamson, and Anke van Zuylen. Greedy algorithms for the maximum satisfiability problem: Simple algorithms and inapproximability bounds. SIAM J. Comput., 46(3):1029–1061, 2017.
 [Swa04] Chaitanya Swamy. Correlation clustering: maximizing agreements via semidefinite programming. In SODA, pages 526–527. SIAM, 2004.
 [Tre12] Luca Trevisan. Max cut and the smallest eigenvalue. SIAM J. Comput., 41(6):1769–1786, 2012.
 [TSSW00] Luca Trevisan, Gregory B. Sorkin, Madhu Sudan, and David P. Williamson. Gadgets, approximation, and linear programming. SIAM J. Comput., 29(6):2074–2097, 2000.
Appendix A Omitted Pseudocodes and Proofs
Algorithm 5 is the pseudocode for the distributed version of an orderless-local algorithm. Algorithm 6 is the deterministic 1/3-approximation for unconstrained submodular maximization of [BFNS15]. Algorithm 7 and Algorithm 8 are the pseudocode for the randomized 1/2-approximation algorithm of [BFNS15] expressed as an orderless-local algorithm. Algorithm 9 and Algorithm 10 are the deterministic and randomized approximation algorithms for the cut functions.

The utility functions for Max Cut, MaxDiCut and max-agree correlation clustering with 2 clusters are local utility functions.
Proof.
The utility function for Max Cut is given by where if and 0 otherwise. Thus, if we fix some it holds that . Note that the sum now only depends on vertices , and we are finished.
For the problem of MaxDiCut the utility function is given by , and for max-agree correlation clustering with 2 clusters the utility function is given by (where are the positive and negative edges, respectively), and the proof is exactly the same. ∎
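The locality argument for Max Cut can be illustrated concretely: the change in the cut value caused by flipping one vertex can be read off that vertex's neighborhood alone. The following is a small sketch for simple undirected graphs; the helper names and the example graph are our own.

```python
def cut_value(edges, x):
    """Global Max Cut utility: number of edges with differently labeled endpoints."""
    return sum(1 for u, v in edges if x[u] != x[v])


def local_flip_gain(adj, x, v):
    """Change in the cut value if x[v] is flipped, using only v's neighbors."""
    same = sum(1 for w in adj[v] if x[w] == x[v])
    diff = len(adj[v]) - same
    # Edges to same-side neighbors become cut; edges to other-side ones stop being cut.
    return same - diff


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    adj = {v: [] for v in range(4)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    x = {0: 0, 1: 0, 2: 1, 3: 1}
    for v in range(4):
        flipped = dict(x)
        flipped[v] = 1 - x[v]
        print(v, cut_value(edges, flipped) - cut_value(edges, x),
              local_flip_gain(adj, x, v))
```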

For any graph with a legal coloring , there exists an order on the variables s.t it holds that for any .
Proof.
We prove the claim by induction on the executions of color classes by the distributed algorithm. We note that the execution of the distributed algorithm defines an order on the variables. Let us consider the th color class. Let us denote these variables as , assigning some arbitrary order within the class. The ordering we analyze for the sequential algorithm would be . Now both the distributed and sequential algorithms follow the same order of color classes, thus we allow ourselves to talk about the sequential algorithm finishing an execution of a color class.
Let be the assignments to all variables of the distributed algorithm after the th color class finishes execution, and let be the assignments made by the sequential algorithm following until all variables in the th color class are assigned. Both algorithms initialize the variables identically, so it holds that . Assume that it holds that . The coloring is legal, so for any , s.t it holds that . Thus, when assigning , its neighborhood is not affected by any other assignments done in the color class, the randomness is identical for both algorithms, and by the induction hypothesis all assignments up until this color class were identical. Thus, for all variables in this color class will be executed with the same parameters for both the distributed and sequential algorithms, and all assignments will be identical. ∎
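The equivalence established by this induction can be observed on a toy instance. In the sketch below, `decide` is a hypothetical local rule (it reads only the neighbors' current values and the vertex's own coin), and we compare a run in which each color class decides simultaneously from a common snapshot against a sequential run that follows the color classes in order. None of these names come from the paper.

```python
import random


def decide(v, adj, state, coin):
    """Hypothetical local rule: depends on neighbors' values and v's own coin."""
    ones = sum(1 for w in adj[v] if state[w] == 1)
    zeros = sum(1 for w in adj[v] if state[w] == 0)
    if ones == zeros:
        return coin
    return 0 if ones > zeros else 1


def run_sequential(order, adj, coins):
    state = {v: None for v in adj}
    for v in order:
        state[v] = decide(v, adj, state, coins[v])
    return state


def run_by_color_class(adj, color, coins):
    state = {v: None for v in adj}
    for c in sorted(set(color.values())):
        snapshot = dict(state)  # the whole class reads the same state
        for v in adj:
            if color[v] == c:
                state[v] = decide(v, adj, snapshot, coins[v])
    return state


if __name__ == "__main__":
    edges = [(i, (i + 1) % 6) for i in range(6)] + [(0, 3)]
    adj = {v: [] for v in range(6)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = {v: v % 2 for v in adj}  # a legal coloring of this bipartite graph
    rng = random.Random(1)
    coins = {v: rng.randint(0, 1) for v in adj}
    order = sorted(adj, key=lambda v: (color[v], v))
    print(run_sequential(order, adj, coins) == run_by_color_class(adj, color, coins))
```

The two runs agree because no two vertices of the same color class are adjacent, so reading a snapshot taken at the start of the class is the same as reading the live state.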

The value can be computed locally.
Proof.
It holds that:
Where the first equality is due to the law of total expectation and the fact that . The probability of assigning to some value can be computed locally, so we are only left with the difference between the expectations. To show that this is indeed a local quantity we use the definition of expectation as a weighted summation over all possible assignments to unassigned variables. Let be the set of all possible assignments to unassigned variables in and let be the set of all possible assignments to the rest of the unassigned variables. It holds that:
Where in the first equality we use the definition of expectation and the fact that the variables are set independently of each other. Then we use the definition of a local utility function, and finally the dependence on disappears due to the law of total probability. The final sum can be computed locally, as the probabilities for assigning variables in are known and is local. ∎
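As a sanity check of this locality claim, the sketch below specializes to Max Cut with unassigned variables set independently and uniformly at random (an assumption made for illustration). It compares the difference of conditional expectations, computed by brute force over all completions, with a quantity read only from the assigned neighbors of the vertex. Function names and the instance are hypothetical.

```python
from itertools import product


def expected_cut(edges, vertices, partial):
    """Exact E[cut] when every unassigned vertex is an independent fair coin."""
    free = [v for v in vertices if v not in partial]
    total = 0
    for bits in product((0, 1), repeat=len(free)):
        x = dict(partial)
        x.update(zip(free, bits))
        total += sum(1 for u, w in edges if x[u] != x[w])
    return total / 2 ** len(free)


def local_difference(adj, partial, v):
    """E[cut | x_v = 0] - E[cut | x_v = 1], from v's assigned neighbors only.

    An unassigned neighbor contributes 1/2 to both terms and cancels out.
    """
    return sum(1 if partial[w] == 1 else -1 for w in adj[v] if w in partial)


if __name__ == "__main__":
    vertices = list(range(5))
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (0, 4)]
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)
    partial = {1: 1, 2: 1, 3: 0}  # vertices 0 and 4 are still unassigned
    brute = (expected_cut(edges, vertices, {**partial, 0: 0})
             - expected_cut(edges, vertices, {**partial, 0: 1}))
    print(brute, local_difference(adj, partial, 0))
```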
Appendix B Max 2SAT
In this section we consider the problem of Max 2SAT. We are given a set of clauses over the set of node variables, such that each clause contains at most two literals. We wish to find an assignment maximizing the number of satisfied clauses. As before, we are interested in adapting a sequential algorithm to the distributed setting. The algorithm we shall adapt is the sequential algorithm presented in [PSWvZ17] which achieves a 3/4-approximation in expectation for the problem of weighted MaxSAT. It is based on the results of [BFNS15], thus it is almost identical to Algorithm 4. Before presenting the algorithm we need some preliminary definitions.
We allow the node variables to take on values in where means that the value to has not yet been assigned. We define two utility functions, the first, counts the number of clauses satisfied given the assignment, and the second counts the number of clauses that are unsatisfied given the assignment (all literals are false). We note that until now our functions depended on the node variables and the topology of the graph (which we omitted as a parameter); now they also depend on the clauses. As with the topology, when we pass as a parameter we assume we also pass all of the clauses, while when we pass as a parameter, we assume we also pass all of the clauses that contain as a literal.
For every we define two values, , where if the clause is satisfied by while if the clause is falsified by the assignment. It holds that and . Both functions are local utility functions. We prove the following lemma.
Lemma 16.
Both and are local utility functions.
Proof.
Let us assign each clause to one of the variables which appears in it as a literal (perhaps there is only a single one). Denote for each variable by the set of clauses assigned to it. Because this is a 2SAT instance, it holds that if then and .
It holds that . Thus it holds that:
Where the first equality is because all clauses that do not contain as a literal immediately get removed from the sum, and the final equality is because the clauses are of size at most two, thus every clause that is affected by an assignment to only depends on and s.t . This finishes the proof for . The proof for is identical. ∎
Now we describe the algorithm of [PSWvZ17] (Algorithm 11) for the special case of unweighted Max 2SAT on graphs. Initially all variables are unassigned. The algorithm iterates over the variables in any order. Our aim in each iteration is to maximize the difference between the number of satisfied clauses and the number of unsatisfied clauses given the partial assignment. The algorithm calculates two quantities , which correspond to this difference if we set the current variable to 0 or 1, given the current partial assignment. Finally, we flip a coin and set the variable to each value with a probability proportional to its benefit. If the benefit is negative, this probability is 0.
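The description above can be sketched in a few lines. This is a minimal sketch under stated assumptions and not a transcription of Algorithm 11: we take the benefit of a value to be the change in (satisfied minus falsified clauses) relative to the current partial assignment, clamp negative benefits to zero, and break the tie arbitrarily when both clamped benefits are zero. A literal is encoded as a pair `(variable, wanted_value)`; the encoding and all names are hypothetical.

```python
import random


def sat_minus_unsat(clauses, assignment):
    """(# satisfied clauses) - (# falsified clauses) under a partial assignment."""
    score = 0
    for clause in clauses:
        if any(assignment.get(var) == want for var, want in clause):
            score += 1
        elif all(var in assignment for var, _ in clause):
            score -= 1  # every literal is assigned and false
    return score


def randomized_greedy_2sat(variables, clauses, rng):
    assignment = {}
    for var in variables:
        base = sat_minus_unsat(clauses, assignment)
        gain = {}
        for value in (False, True):
            assignment[var] = value
            gain[value] = sat_minus_unsat(clauses, assignment) - base
        t, f = max(gain[True], 0), max(gain[False], 0)
        # Probability proportional to the (clamped) benefit of each value.
        p_true = 0.5 if t + f == 0 else t / (t + f)
        assignment[var] = rng.random() < p_true
    return assignment


if __name__ == "__main__":
    clauses = [((0, True), (1, False)), ((1, True),), ((0, False), (2, True))]
    result = randomized_greedy_2sat([0, 1, 2], clauses, random.Random(0))
    satisfied = sum(1 for c in clauses
                    if any(result[var] == want for var, want in c))
    print(result, satisfied)
```

Each benefit only involves clauses containing the current variable, which is why the computation fits the local setting.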
Due to Lemma 16 we see that calculating