A Multicolored Separator Theorem

Approximation Schemes for Geometric Coverage Problems

Abstract

In their seminal work, Mustafa and Ray [29] showed that a wide class of geometric set cover (SC) problems admit a PTAS via local search; this is one of the most general approaches known for such problems. Their result applies if a naturally defined “exchange graph” for two feasible solutions is planar, and it is based on subdividing this graph via a planar separator theorem due to Frederickson [17]. Obtaining similar results for the related maximum $k$-coverage problem (MC) seems non-trivial due to the hard cardinality constraint. In fact, while Badanidiyuru, Kleinberg, and Lee [4] have shown (via a different analysis) that local search yields a PTAS for two-dimensional real halfspaces, they only conjectured that the same holds true for dimension three. Interestingly, at this point it was already known that local search provides a PTAS for the corresponding set cover case, and this followed directly from the approach of Mustafa and Ray.

In this work we provide a way to address the above-mentioned issue. First, we propose a color-balanced version of the planar separator theorem. The resulting subdivision approximates, locally in each part, the global distribution of the colors. Second, we show how this roughly balanced subdivision can be employed in a more careful analysis to strictly obey the hard cardinality constraint. More specifically, we obtain a PTAS for any “planarizable” instance of MC and thus essentially for all cases where the corresponding SC instance can be tackled via the approach of Mustafa and Ray. As a corollary, we confirm the conjecture of Badanidiyuru, Kleinberg, and Lee [4] regarding real halfspaces in dimension three. We feel that our ideas could also be helpful in other geometric settings involving a cardinality constraint.

1 Introduction

The Maximum Coverage (MC) problem is one of the classic combinatorial optimization problems and is well studied due to its wealth of applications. Let $U$ be a set of ground elements, $\mathcal{F}$ be a family of subsets of $U$, and $k$ be a positive integer. The Maximum Coverage (MC) problem asks for a $k$-subset $\mathcal{F}'$ of $\mathcal{F}$ such that the number of ground elements covered by $\mathcal{F}'$ is maximized.

Many real-life problems arising from banking [12], social networks, transportation networks [27], databases [21], information retrieval, sensor placement, security (and others) can be framed as instances of the MC problem. For example, the following are easily seen as MC problems: placing sensors to maximize the number of covered customers, finding a set of documents satisfying the information needs of as many users as possible [4], and placing security personnel in a terrain to maximize the number of secured regions.

From the result of Cornuéjols [12], it is well known that the greedy algorithm is a $(1-1/e)$-approximation algorithm for the MC problem. Due to the wide applicability of the problem, whether one can achieve an approximation factor better than $1-1/e$ was a subject of research for a long period of time. From the result of Feige [16], it is known that if there exists a polynomial-time algorithm that approximates maximum coverage within a ratio of $1-1/e+\varepsilon$ for some $\varepsilon>0$, then P = NP. Better results can, however, be obtained for special cases of MC. For example, Ageev and Sviridenko [1] show in their seminal work that their pipage rounding approach gives a factor $1-(1-1/r)^r$ for instances of MC where every element occurs in at most $r$ sets. For constant $r$ this is a strict improvement on $1-1/e$, but this bound is approached if $r$ is unbounded. For example, pipage rounding gives a $3/4$-approximation algorithm for Maximum Vertex Cover (MVC), which asks for a $k$-subset of nodes of a given graph that maximizes the number of edges incident on at least one of the selected nodes. Petrank [30] showed that this special case of MC is APX-hard.
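As a small illustration of the greedy algorithm mentioned above, the following sketch repeatedly picks the set with the largest number of not-yet-covered elements. The function name `greedy_max_coverage` and the representation of the family as a list of Python sets are assumptions made for this example only; ties are broken by the smallest index.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k sets, each time taking the one with the largest marginal gain.

    `sets` is a list of Python sets (an illustrative encoding of the family);
    returns the chosen indices and the set of covered elements.
    """
    covered = set()
    chosen = []
    for _ in range(min(k, len(sets))):
        candidates = [i for i in range(len(sets)) if i not in chosen]
        # max() returns the first maximizer, so ties go to the smallest index.
        best = max(candidates, key=lambda i: len(sets[i] - covered))
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


# Tiny instance on which greedy is not optimal: it covers 5 elements,
# while the two sets {1, 2, 5} and {3, 4, 6} together cover 6.
family = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}]
chosen, covered = greedy_max_coverage(family, 2)
```

On this instance greedy first takes the largest set and then can gain only one more element, which shows concretely why the greedy guarantee is a constant factor rather than optimality.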

In this paper, we study the approximability of MC in geometric settings where elements and sets are represented by geometric objects. Such problems have been considered before and have applications, for example, in information retrieval [4] and in wireless networks [15].

MC is related to the Set Cover problem (SC). For a given set $U$ of ground elements and a family $\mathcal{F}$ of subsets of $U$, this problem asks for a minimum cardinality subset of $\mathcal{F}$ which covers all the ground elements of $U$. This problem plays a central role in combinatorial optimization and in particular in the study of approximation algorithms. The best known approximation algorithm has a ratio of $\ln n$, which is essentially the best possible [16] under a plausible complexity-theoretic assumption. A lot of work has been devoted to beating the logarithmic barrier in the context of geometric set cover problems [7, 32, 8, 28]. Mustafa and Ray [29] introduced a powerful tool which can be used to show that a local search approach provides a PTAS for various geometric SC problems. Their result applies if a naturally defined “exchange graph” (whose nodes are the sets in two feasible solutions) is planar, and it is based on subdividing this graph via a planar separator theorem due to Frederickson [17]. In the same paper [29], they applied this approach to provide a PTAS for the SC problem when the family consists of either a set of halfspaces in $\mathbb{R}^3$, or a set of disks in $\mathbb{R}^2$. Many results have been obtained using this technique for different problems in geometric settings [9, 13, 18, 25]. Some of these works extend to cases where the underlying exchange graph is not planar but admits a small-size separator [3, 19, 20].

Beyond the context of SC, local search has also turned out to be a very powerful tool for other geometric problems, but the analysis of such algorithms is usually non-trivial and highly tailored to the specific setting. Examples are the Euclidean TSP, Euclidean Steiner tree, facility location, and $k$-median [11]. In some very recent breakthroughs, PTASs for the $k$-means problem in finite Euclidean dimension (and more general cases) via local search have been announced [10, 33].

In this paper, we study the effectiveness of local search for geometric MC problems. In the general case, -swap local search is known to yield a tight approximation ratio of [23]. However, for special cases such as geometric MC problems local search is a promising candidate for beating the barrier . It seems, however, non-trivial to obtain such results using the technique of Mustafa and Ray [29]. In their analysis, each part of the subdivided planar exchange graph (see above) corresponds to a feasible candidate swap that replaces some sets of the local optimum with some sets of the global optimum and it is ensured that every element stays covered due to the construction of the exchange graph. It is moreover argued that if the global optimum is sufficiently smaller than the local optimum then one of the considered candidate swaps would actually reduce the size of the solution.

It is possible to construct the same exchange graphs also for the case of MC. However, the hard cardinality constraint given by the input parameter $k$ poses an obstacle. In particular, when considering a swap corresponding to a part of the subdivision, this swap might be infeasible as it may contain (substantially) more sets from the global optimum than from the local optimum. Another issue is that MC has a different objective function than SC. Namely, the goal is to maximize the number of covered elements rather than to minimize the number of used sets. Finally, while for SC all elements are covered by both solutions, in MC we additionally have elements that are covered by none or only one of the two solutions, requiring a more detailed distinction of several types of elements.

In fact, subsequent to the work of Mustafa and Ray on SC [29], Badanidiyuru, Kleinberg, and Lee [4] studied geometric MC. They obtained fixed-parameter approximation schemes for MC instances for the very general case where the family consists of objects with bounded VC dimension, but the running times are exponential in the cardinality bound $k$. They further provided APX-hardness for each of the following cases: set systems of VC-dimension 2, halfspaces in , and axis-parallel rectangles in . Interestingly, while they have shown that for MC instances where the family consists of halfspaces in $\mathbb{R}^2$ local search can be used to provide a PTAS, they only conjecture that local search will provide a PTAS when the family consists of halfspaces in $\mathbb{R}^3$. This underlines the observation that it seems non-trivial to apply the approach of Mustafa and Ray to geometric MC problems, as at that point a PTAS for halfspaces in $\mathbb{R}^3$ for SC was already known via the approach of Mustafa and Ray.

The difficulty of analyzing local search in the presence of a cardinality constraint is also known in other settings. For example, one of the main technical contributions of the recent breakthrough for the Euclidean $k$-means problem [10, 33] is that the authors are able to handle the hard cardinality constraint by the concept of so-called isolated pairs [10]. Prior to these works, approximation schemes had only been known for bicriteria variants where the cardinality constraint may be violated or where there is no constraint but, analogously to SC, the cardinality contributes to the objective function [5].

1.1 Our Contribution

In this paper, we show how to cope with the above-mentioned issue of a cardinality constraint. We are able to achieve a PTAS for many geometric MC problems. At a high level, we follow the framework of Mustafa and Ray, defining a planar (or more generally $f$-separable) exchange graph and subdividing it into a number of small parts, each of them corresponding to a candidate swap. As each part may be (substantially) imbalanced in terms of the number of sets of the global optimum and local optimum, respectively, a natural idea seems to be to swap in only a sufficiently small subset of the globally optimal sets. This idea alone is, however, not sufficient. Consider, for example, the case where each part contains either only sets from the local or only sets from the global optimum, making it impossible to retrieve any feasible swap from considering the single parts. To overcome this difficulty, we prove in a first step a color-balanced version of the planar separator theorem (Theorem 2). In this theorem, the input is a planar (or more generally $f$-separable) graph whose nodes are two-colored arbitrarily. Our separator theorem differs from prior work in that it guarantees that all parts have roughly the same size (rather than simply an upper limit on their size) and that the two colors are represented in each part in roughly the same ratio as in the whole graph. This balancing property allows us to address the issue of the above-mentioned infeasible swaps. In a second step, we are able to employ the only roughly color-balanced subdivision to establish a set of perfectly balanced candidate swaps. We prove by a careful analysis (which turns out to be more intricate than for the SC case) that local search also yields a PTAS for the wide class of $f$-separable MC problems (see Theorem 3).
As an immediate consequence, we obtain PTASs for essentially all cases of geometric MC problems where the corresponding SC problem can be tackled via the approach of Mustafa and Ray (Theorem 4). In particular, this confirms the conjecture of Badanidiyuru, Kleinberg, and Lee [4] regarding halfspaces in $\mathbb{R}^3$. We also immediately obtain PTASs for Maximum Dominating Set and Maximum Vertex Cover on $f$-separable and minor-closed graph classes (see Section 4), which, to the best of our knowledge, were not known before. We feel that our approach has the potential to find further applications in similar cardinality-constrained settings.

2 Color Balanced Divisions

In this section we provide the main tool used to prove our main result (i.e., Theorem 3). We first describe a new subtle specialization (see Lemma 1) of the standard division theorem on -separable graph classes (see Theorem 1). This builds on the concept of -divisions (in the sense of Henzinger et al. [22]) of graphs in an -separable graph class. We then extend this specialized division lemma by suitably aggregating the pieces of the partition to obtain a two-color balanced version (see Theorem 2). This result generalizes to more than two colors. However, as our applications stem from the two-colored version, we defer the generalization to the appendix (see Appendix A). For a number , we use to denote the set .

For a graph , a subset of is an -balanced separator when its removal breaks into two collections of connected components such that each collection contains at most an fraction of where and is a constant. The size of a separator is simply the number of vertices it contains. For a non-decreasing sublinear function , a class of graphs that is closed under taking subgraphs is said to be -separable if there is an such that for any , an -vertex graph in the class has a -balanced separator whose size is at most . Note that, by the Lipton-Tarjan separator theorem [26], planar graphs are a subclass of the -separable graphs. More generally, Alon, Seymour, and Thomas [2] have shown that every graph class characterized by a finite set of forbidden minors is also a subclass of the -separable graphs (here, the constant depends on the size of the largest forbidden minor). In particular, from the graph minors theorem [31], every non-trivial minor closed graph class is a subclass of the -separable graphs (for some constant ). Note that when we discuss -separable graph classes we assume the function has the form for some , i.e., it is both non-decreasing and strongly sublinear.

Frederickson [17] introduced the notion of an -division of an -vertex graph , namely, a cover of by sets each of size where each set has boundary vertices, i.e., vertices in common with the other sets. Frederickson showed that, for any , every planar graph has an -division and that one can be computed in time. This result follows from a recursive application of the Lipton-Tarjan planar separator theorem [26]. This notion was further generalized by Henzinger et al. [22] to -divisions1 where is a function in and each set has at most vertices in common with the other sets. They noted that Frederickson’s proof can easily be adapted to obtain an -division of any graph from a subgraph closed -separable graph class (as formalized in Theorem 1). Note that we use an equivalent but slightly different notation than Frederickson and Henzinger et al. in that we consider the “boundary” vertices as a single separate set apart from the non-boundary vertices in each “region”, i.e., our divisions are actually partitions of the vertex set. This allows us to carefully describe the number of vertices inside each “region”.

Theorem 1 ([17, 22]).

For any subgraph closed -separable class of graphs , there are constants such that every graph in the class has an -division for any . Namely, for any , there is an integer such that can be partitioned into sets where the following properties hold.

  1. for each ,

  2. for each ,

  3. for each (thus, ).

Moreover, such a partition can be found in time where is the time required to find an -separation in .

We specialize the notion of -divisions first to uniform -divisions, and then generalize to two-color uniform -divisions of a two-colored graph (note: the coloring need not be proper in the usual sense). A uniform -division is an -division where the sets have a uniform (i.e., ) amount of internal vertices. A two-color uniform -division of a two-colored graph is a uniform -division where each set additionally has the “same” proportion of each color class (this is formalized in Theorem 2).

It is important to note that while this uniformity condition (i.e., that each region is not too small) has not been needed in the past2, it is essential for our analysis of local search as applied to MC problems in the next section. Moreover, to the best of our knowledge, neither Frederickson’s construction nor more modern constructions (e.g. [24]) of an -division explicitly guarantee that the resulting -division is uniform. To be specific, Frederickson’s approach consists of two steps. The first step recursively applies the separator theorem until each region together with its boundary is “small enough”. In the second step, each region where the boundary is “too large” is further divided. This is accomplished by applying the separator theorem to a weighted version of each such region where the boundary vertices are uniformly weighted and the non-boundary vertices are zero-weighted. Clearly, even a single application of this latter step may result in regions with interior vertices. Modern approaches (e.g. [24]) similarly involve applying the separator theorem to weighted regions where boundary vertices are uniformly weighted and interior vertices are zero-weighted, i.e., regions which are too small are not explicitly avoided.

The remainder of this section is outlined as follows. We will first show for every -separable graph class there is a constant such that every graph in has a uniform -division (see Lemma 1). We then use this result to show that for every -separable graph class there is a constant such that every two-colored graph in has a two-color uniform -division for any – see Theorem 2. Our proofs are constructive and lead to efficient algorithms which produce such divisions when there is a corresponding efficient algorithm to compute an -separation.

To prove the first result, we start from a given -division and “group” the sets carefully so that we obtain the desired uniformity. For the two-colored version, we start from a uniform -division and again regroup the sets via a reformulation of the problem as a partitioning problem on two-dimensional vectors. Namely, we leverage Lemma 2 to perform the regrouping.

Lemma 1.

Let be a -separable graph class and be a sufficiently large -vertex graph in . There are constants (depending only on ) such that for any there is an integer such that can be partitioned into sets where are constants independent of and the following properties are satisfied.

  1. for each ,

  2. for each ,

  3. for each (thus, ).

Moreover, such a partition can be found in time where is the amount of time required to produce an -division of .

Proof.

We start from an -division as given by Theorem 1 where . We then partition into sets such that is a uniform -division where . In order to describe the partitioning, we first observe some useful properties of where, without loss of generality, . Let , and set . Note that:

(1)

From our choice of , the average size of the sets is .

Pick such that it is divisible by 8 and and assume in what follows. Then , i.e., . Now pick . Thus, we have . In particular, the average size of our sets is in .

Notice that . We build the sets such that . This provides .

We build the sets in two steps. In the first step we greedily fill the sets according to the largest unassigned set (formalized as follows). For each from to , we consider an index where and is minimized. If , then we place into , that is, we replace with . Otherwise (there is no such index ), we proceed to step two (below). Before discussing step two, we first consider the state of the sets at the moment when this greedy placement finishes. To this end, let be the index of the first (i.e., the largest) which has not been placed.

Claim 1: If for every , then each set has been merged into some and the ’s satisfy the conditions of the lemma.
First, suppose there is an unallocated set . Since for each , our greedy procedure stopped due to having for each . This contradicts the average size of the ’s being at most . So, every set must have been merged into some . Thus, since and the average of the ’s is , we have that for every , . Moreover, for each , . Thus the ’s satisfy the lemma.

Claim 2: For every , .
Suppose some index has . Notice that, if , then for every , , i.e., contradicting Claim 1. Thus, for each where . For each , let and be the states of and (respectively) directly after index has been added to some set by the greedy algorithm.

We now let be the largest index in , and assume (without loss of generality) that for every , if , then . Intuitively, is the “first” index which attains while still having . Now, since , and , we have . Thus, for every iteration , we have . This means that after iteration , the number of unallocated vertices is strictly less than:

.

In particular, this means that on average each set can grow by less than . However, due to our choice of , we see that for every , . This means that even if we allocate all the remaining vertices, the average size of our sets will be strictly less than , i.e., providing a contradiction and proving Claim 2.

Claim 3: If every is placed into some , the ’s satisfy the conditions of the lemma.
First, note that is at most , i.e., . By Claim 2, we see that for each . Additionally, from the greedy construction, we have that . Thus, .

We now describe the second step. By Claim 3, we assume there are unassigned sets . By Claim 2, for every , . Finally, by Claim 1, there is an index where . Thus, since we have sets which partition at most elements, there must be some index where and , i.e., where is the largest unassigned set. Notice that there are at most indices which can be assigned and all the remaining sets contain at most vertices. If we spread these remaining ’s uniformly throughout our ’s, we will place at most vertices into each . Thus, for each , we have . So, by uniformly assigning these remaining indices, we have , , and , as needed.

We conclude with a brief discussion of the time complexity. First, we generate the -division in time. We then sort the sets (this can be done in time via bucket sort). In the next step we greedily fill the index sets – this takes time. Finally, we place the remaining “small” sets uniformly throughout the ’s – taking again time. Thus, we have time in total. ∎
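The two-step grouping described in the proof can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `group_pieces`, the parameters `num_groups` and `cap`, and the exact stopping rule (stop the greedy phase as soon as the largest unassigned piece no longer fits into the currently lightest group) are hypothetical stand-ins for the thresholds chosen in the proof, and the sketch uses a comparison sort and a heap rather than the bucket sort mentioned above.

```python
import heapq


def group_pieces(piece_sizes, num_groups, cap):
    """Merge pieces into num_groups groups of roughly equal total size.

    Step 1: take pieces from largest to smallest and put each one into the
            currently lightest group, as long as the group stays within `cap`.
    Step 2: spread the remaining (small) pieces round-robin over the groups.
    `cap` is a hypothetical threshold standing in for the bound in the proof.
    """
    order = sorted(range(len(piece_sizes)), key=lambda i: -piece_sizes[i])
    groups = [[] for _ in range(num_groups)]
    heap = [(0, g) for g in range(num_groups)]  # (current load, group index)
    heapq.heapify(heap)

    pos = 0
    while pos < len(order):
        load, g = heap[0]  # lightest group so far
        i = order[pos]
        if load + piece_sizes[i] > cap:
            break  # greedy phase ends; go to step 2
        heapq.heapreplace(heap, (load + piece_sizes[i], g))
        groups[g].append(i)
        pos += 1

    # Step 2: distribute the leftover pieces uniformly.
    for offset, i in enumerate(order[pos:]):
        groups[offset % num_groups].append(i)
    return groups


groups = group_pieces([5, 4, 3, 3, 2, 1], num_groups=2, cap=8)
```

The sketch only demonstrates the mechanics of the regrouping; the size guarantees of the lemma depend on the specific choice of parameters made in the proof.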

We now prove a technical lemma which, together with the previous lemma regarding uniform divisions, provides our uniform two-color balanced divisions (see Theorem 2) as discussed following this lemma.

Lemma 2.

Let and be positive constants, and be a set of -dimensional vectors where for each , and such that . There is a permutation of such that for any , .

Thus for any positive integer , when is sufficiently larger than , there exist numbers and and a partitioning of into subsets such that for each we have:

  1. (thus, ), and

  2. .

Moreover, the permutation and partition can be computed in time.

Proof.

First, we partition into three sets , , and according to whether the weighted difference is positive, negative, or 0 (respectively). Note that, and for each , . We will pick indices one at a time from the sets , , to form the desired permutation.

We now construct a permutation on the indices so that any consecutive subsequence has . For notational convenience, for each , we use to denote . We now pick the ’s so that for each , . We initialize . For each from to we proceed as follows. Assume that . We further assume that any index has been removed from the sets , , and . If is negative, must contain some index since . Moreover, if we set , we have as needed (we also remove the index from at this point). Similarly, if is positive, we pick any index from , remove it from , and set . Finally, when the value is zero, we simply take any index from , remove it from , and set . Thus, in all cases we have .

Notice that, for any , we have (as needed for the first part of the lemma).
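The ordering argument above can be illustrated on plain numbers. The sketch below assumes that each vector has already been reduced to a single signed "weighted difference" and that these differences sum to zero; the name `balanced_order` and this scalar encoding are assumptions for the example. Whenever the running sum is negative a positive entry is taken, whenever it is positive a negative entry is taken, so every prefix sum stays within the largest absolute entry.

```python
def balanced_order(diffs):
    """Order indices so that every prefix sum of diffs stays small.

    Assumes sum(diffs) == 0 (exactly, e.g. integers). Under that assumption
    every prefix sum has absolute value at most max(abs(d) for d in diffs).
    """
    pos = [i for i, d in enumerate(diffs) if d > 0]
    neg = [i for i, d in enumerate(diffs) if d < 0]
    zero = [i for i, d in enumerate(diffs) if d == 0]

    order, running = [], 0
    for _ in range(len(diffs)):
        if running < 0 and pos:
            i = pos.pop()      # pull the running sum back up
        elif running > 0 and neg:
            i = neg.pop()      # pull the running sum back down
        else:
            # Running sum is zero (or the input violated the assumption):
            # any remaining index will do.
            pool = zero or pos or neg
            i = pool.pop()
        order.append(i)
        running += diffs[i]
    return order


diffs = [3, -1, -2, 2, -2, 0, 1, -1]
order = balanced_order(diffs)
```

Since each prefix sum is bounded by the largest absolute entry, the sum over any consecutive block of the ordering, being a difference of two prefix sums, is bounded by twice that value.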

It remains to partition to form the sets . This is accomplished by splitting into consecutive subsequences of almost equal size. Namely, we pick . We further let , and , . From these integers, we make the sets with indices each and the sets with indices each by partitioning into these sets in order. A simple calculation shows that these sets satisfy the conditions of the lemma. Moreover, this construction is clearly performed in time. ∎
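The final splitting step, cutting the ordered indices into consecutive blocks of almost equal size, can be sketched as follows. The function name `split_consecutive` and the parameter `t` for the number of blocks are chosen for this illustration only.

```python
def split_consecutive(order, t):
    """Split a sequence into t consecutive blocks whose sizes differ by at most 1.

    The first (len(order) mod t) blocks get one extra element.
    """
    q, r = divmod(len(order), t)
    blocks, start = [], 0
    for j in range(t):
        size = q + 1 if j < r else q
        blocks.append(list(order[start:start + size]))
        start += size
    return blocks


blocks = split_consecutive(range(10), 3)
# blocks == [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Because the blocks are consecutive in the ordering, any bound on consecutive subsequences of the permutation carries over to each block.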

We will now use Lemmas 1 and 2 to prove Theorem 2. In particular, for a given two-colored graph where belongs to an -separable graph class, we first construct a uniform -division of as in Lemma 1. From this division we can again carefully combine the ’s to make new sets where each has roughly the same size and contains roughly the same proportion of each color class as occurring in . This follows by simply imagining each region of the uniform -division as a two-dimensional vector (according to its coloring) and then applying Lemma 2.

Theorem 2.

Let be an -separable graph class and be a 2-colored -vertex graph in with color classes such that . For any and where is suitably large, there is an integer such that can be partitioned into sets where are constants independent of our parameters and there is an integer all satisfying the following properties.

  1. for each ,

  2. for each ,

  3. for each (thus, ).

Moreover, such a partition can be found in time where is the amount of time required to produce a uniform -division of .

3 PTAS for -Separable Maximum Coverage

In this section we formalize the notion of -separable instances of the MC problem and prove our main result – see Theorem 3.

Definition 1.

A class of instances of MC is called -separable if for any two disjoint feasible solutions and of any instance in there exists an -separable graph with node set with the following exchange property. If there is a ground element that is covered both by and then there exists an edge in with and with .
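To make the exchange property concrete, the following sketch checks it for two given solutions and a candidate bipartite edge set. It assumes the usual reading in which the edge witnessing an element joins a set of the first solution and a set of the second solution that both contain that element; this reading, the function name `has_exchange_property`, and the encoding of edges as index pairs are assumptions of the example. The sketch does not test whether the graph is separable.

```python
def has_exchange_property(first_sets, second_sets, edges):
    """Check that every element covered by both solutions is witnessed by an edge.

    first_sets, second_sets: lists of Python sets (the two feasible solutions).
    edges: iterable of pairs (i, j) meaning first_sets[i] -- second_sets[j].
    """
    edge_set = set(edges)
    covered_by_both = set().union(*first_sets) & set().union(*second_sets)
    for element in covered_by_both:
        witnessed = any(
            element in first_sets[i] and element in second_sets[j]
            for (i, j) in edge_set
        )
        if not witnessed:
            return False
    return True


first = [{1, 2}, {3}]
second = [{2, 3}, {4}]
ok = has_exchange_property(first, second, {(0, 0), (1, 0)})   # True
bad = has_exchange_property(first, second, {(0, 0)})          # False: element 3
```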

Theorem 3.

Let $f$ be a non-decreasing sublinear function. Then, any $f$-separable class of instances of MC that is closed under removing elements and sets admits a PTAS.

Proof.

Our algorithm is based on local search. We fix a positive constant integer . Given an -separable instance of MC, we pick an arbitrary initial solution . We check if it is possible to replace sets in with sets from so that the total number of elements covered is increased. We perform such a replacement (swap) as long as there is one. We stop if there is no profitable swap and output the resulting solution.
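A minimal sketch of this local search is given below. The names `local_search_max_coverage`, `k` (cardinality bound) and `b` (swap size) are chosen for the illustration, and the sketch tries swaps of every size from 1 up to `b`, which is an assumption about how the swap neighborhood is enumerated. It is a brute-force enumeration meant for tiny inputs, not an efficient implementation.

```python
from itertools import combinations


def coverage(sets, chosen):
    """Number of ground elements covered by the chosen set indices."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def local_search_max_coverage(sets, k, b):
    """Swap-based local search: exchange up to b sets while coverage improves."""
    solution = set(range(min(k, len(sets))))  # arbitrary initial solution
    improved = True
    while improved:
        improved = False
        current = coverage(sets, solution)
        outside = [i for i in range(len(sets)) if i not in solution]
        for j in range(1, b + 1):
            for removed in combinations(sorted(solution), j):
                for added in combinations(outside, j):
                    candidate = (solution - set(removed)) | set(added)
                    if coverage(sets, candidate) > current:
                        solution = candidate  # profitable swap found
                        improved = True
                        break
                if improved:
                    break
            if improved:
                break
    return solution


family = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}]
solution = local_search_max_coverage(family, k=2, b=1)
```

Each accepted swap strictly increases the number of covered elements, so the loop terminates after at most as many swaps as there are ground elements.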

In what follows, we show that for sufficiently large the above algorithm yields a -approximate solution and that it runs in polynomial time (for constant ). Here, and are the constants from Theorem 2. This will prove the claim of Theorem 3 by letting sufficiently large. Note that, if , then we see that Theorem 2 also holds for . Similarly, if , then Theorem 2 also holds for . Thus, we can safely assume that .

Since each step increases the number of covered elements, the number of iterations of the above algorithm is at most . Each iteration takes time. Therefore, the total running time of the algorithm is polynomial for constant .

We now analyze the performance guarantee of the algorithm. To this end, let be an optimum solution to the instance and let be the (locally optimal) solution output by the algorithm. Let , denote the number of elements covered by , , respectively.

Suppose that . We want to show that this would imply that there is a profitable swap as this would contradict the local optimality of and hence complete the proof.

We claim that it suffices to consider the case when are disjoint, which is justified as follows. Assume that . We remove the sets in from and all the elements covered by these sets from . Moreover, we decrease by and replace with and with . Since our class of instances is closed under removing sets and elements the resulting instance is still contained in the class. Moreover, . Finally, if we are able to show that there exists a feasible and profitable swap in the reduced instance then the same swap is also feasible and profitable in the original instance (with original solutions and ).

Therefore, we assume from now on that and are disjoint. Since our instance is -separable, there exists an -separable graph with precisely nodes for the two feasible solutions and with the properties stated in Definition 1.

We now apply our two colored separator theorem (Theorem 2) to with color classes and and with parameters and .

Since , the two color classes in are perfectly balanced. Let , , and for any part with of the resulting subdivision of .

We can assume that every set in is contained in for some . This can be achieved by suitably adding edges to while maintaining the necessary properties of the uniform colored subdivision. More precisely, for every of the at most many sets in we add an edge to a set in for some . By Theorem 2, we have and for each . Hence, we have that . Therefore, we can insert edges between the sets in and sets in , so that the neighborhood receives at most many additional nodes for each . Note that the exchange property of Definition 1 still holds as we only added edges. Also the properties of Theorem 2 are still valid except that the bound on the boundary size in Property (iii) has increased to at most since .

The idea of the analysis is to consider for each a feasible candidate swap (called candidate swap ) that replaces in the sets with some suitably chosen sets from . We will show that if then at least one of the candidate swaps is profitable leading to a contradiction.

To accomplish this, we will first show that there exists a profitable swap that replaces with . This swap may be infeasible as may be strictly smaller than . We will, however, show that a feasible and profitable swap can be constructed by adding only some of the sets in .

For technical reasons we are going to define a set of elements that we (temporarily) disregard from our calculations because they will remain covered and thus should not impact our decision which of the sets in we will pick for the feasible swap. More precisely, let be the set of elements that are covered by some but that remain covered even if is removed from .

Let be the set of elements that are “lost” when removing the from . Moreover, let be the set of elements that are “won” when we add all the sets of after removing .

We claim that . To this end, note that and that the family contains pairwise disjoint sets because all elements that are not exclusively covered by a single are contained in and thus removed. On the other hand, we claim that . To see this, note first that every element in contributes 0 to the left hand side and 0 or -1 to the right hand side. Every element covered by but not by contributes at least 1 to the left hand side (because every set in lies in some by our extension of the exchange graph) and precisely 1 to the right hand side. Finally, consider an element that is covered both by and by but does not lie in . This element lies in a set for some . Because of the definition of the exchange graph there is some set with and some set with such that and are adjacent in . We have that , for, otherwise . Because of the separator property of (see Property (i) of Theorem 2) we must have . Moreover lies in because it is not contained in but is covered by . Hence contributes at least 1 to the left hand side and precisely 1 to the right hand side of , which shows the claim.

We have and hence

Hence, we can pick such that

(2)

Recall that and assume that is large enough so that . Then by Properties (iv), (ii), and the modified Property (iii) of Theorem 2 (modified due to the addition of edges to ), we have that , , and . Because of this implies and . Hence

(3)

We are now ready to construct our feasible and profitable swap. To this end let . We inductively define an order on the sets in where we require that

for any where maximizes .

Consider the following process of iteratively building a set starting with . Suppose that we add to the sets in this order ending up with . In doing so, the incremental gain is monotonically decreasing due to the definition of the order on and due to the submodularity of the objective function. Hence, for any prefix of the first sets we have that

(4)
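The ordering by incremental gain can be illustrated with the sketch below. It orders a family of sets greedily by marginal coverage relative to an already covered base set and records the gains. The names `greedy_order` and `base`, and the particular prefix bound checked in the example (the first j of m sets collect at least a j/m fraction of the total gain, which follows from the gains being non-increasing), are assumptions of this illustration and are not taken verbatim from inequality (4).

```python
def greedy_order(sets, base=frozenset()):
    """Order sets by largest marginal gain on top of the elements in `base`.

    Returns the order (list of indices) and the list of incremental gains,
    which is non-increasing because coverage is submodular.
    """
    covered = set(base)
    remaining = list(range(len(sets)))
    order, gains = [], []
    while remaining:
        best = max(remaining, key=lambda i: len(sets[i] - covered))
        gains.append(len(sets[best] - covered))
        covered |= sets[best]
        order.append(best)
        remaining.remove(best)
    return order, gains


family = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
order, gains = greedy_order(family)
# gains == [4, 3, 0, 0]; every prefix of length j has gain >= (j / 4) * 7
```

Since the gains are sorted in non-increasing order, the average gain over any prefix is at least the average over the whole sequence, which is the kind of prefix estimate used when only some of the sets can be swapped in.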

Suppose that (otherwise we can just add all sets in ). Consider the swap where we replace the many sets from the local optimum with at most many sets from .

We now analyze how this swap affects the objective function value. By removing the sets in the objective function value drops by