# Constant Factor Approximation for Capacitated k-Center with Outliers

This work is partially supported by Foundation for Polish Science grant HOMING PLUS/2012-6/2.

Marek Cygan, Institute of Informatics, University of Warsaw, Poland
Tomasz Kociumaka, Institute of Informatics, University of Warsaw, Poland
[cygan, kociumaka]@mimuw.edu.pl
###### Abstract

The $k$-center problem is a classic facility location problem, where given an edge-weighted graph $G=(V,E)$ one is to find a subset of $k$ vertices $S$, such that each vertex in $V$ is “close” to some vertex in $S$. The approximation status of this basic problem is well understood, as a simple $2$-approximation algorithm is known to be tight. Consequently, different extensions have been studied.

In the capacitated version of the problem each vertex is assigned a capacity, which is a strict upper bound on the number of clients a facility can serve, when located at this vertex. A constant factor approximation for the capacitated $k$-center was obtained last year by Cygan, Hajiaghayi and Khuller [FOCS’12], which was recently improved to a $9$-approximation by An, Bhaskara and Svensson [arXiv’13].

In a different generalization of the problem some clients (denoted as outliers) may be disregarded. Here we are additionally given an integer $p$ and the goal is to serve exactly $p$ clients, which the algorithm is free to choose. In 2001 Charikar et al. [SODA’01] presented a $3$-approximation for the $k$-center problem with outliers.

In this paper we consider a common generalization of the two extensions previously studied separately, i.e. we work with the capacitated $k$-center with outliers. We present the first constant factor approximation algorithm, with approximation ratio $25$, even for the case of non-uniform hard capacities.

## 1 Introduction

The $k$-center problem is a classic facility location problem and is defined as follows: given a finite set $V$ and a symmetric distance (cost) function $d: V \times V \to \mathbb{R}_{\ge 0}$ satisfying the triangle inequality, find a subset $S \subseteq V$ of size $k$ such that each vertex in $V$ is “close” to some vertex in $S$. More formally, once we choose $S$, the objective function to be minimized is $\max_{v \in V} \min_{u \in S} d(v, u)$. The vertices of $S$ are called centers or facilities. The problem is known to be NP-hard [12]. Approximation algorithms for the $k$-center problem have been well studied and are known to be optimal [13, 15, 16, 17].

In the capacitated setting, studied for twenty years already, we are additionally given a capacity function $L: V \to \mathbb{N}$ and no more than $L(u)$ vertices (called clients) may be assigned to a chosen center at $u \in V$. For the special case when all the capacities are identical (denoted as the uniform case), a $6$-approximation was developed by Khuller and Sussmann [19], improving the previous bound of $10$ by Bar-Ilan, Kortsarz and Peleg [4]. In the soft capacities version, in contrast to the standard one (hard capacities), we are allowed to open several facilities in a single location, i.e. the facilities may form a multiset. For the uniform soft capacities version the best known approximation ratio equals $5$ [19]. For general hard capacities a constant factor approximation has been obtained only recently [11], somewhat surprisingly by using LP rounding. It was followed by a cleaner and simpler approach of An, Bhaskara and Svensson [1], who gave a $9$-approximation algorithm. From the hardness perspective a $(3-\varepsilon)$ lower bound on the approximation ratio is known [9, 11].

Another natural direction in generalizing the problem is to assume that, instead of serving all the clients, we are given an integer $p$ and we are to select exactly $p$ clients to serve. The disregarded clients are called outliers in the literature. The $k$-center problem with outliers admits a $3$-approximation algorithm, which was obtained by Charikar et al. [8].

In this article we study a common generalization of the two mentioned variants of the $k$-center problem, i.e. involving both capacities and outliers. In order to simplify our algorithms we work with a slight generalization, the Capacitated $k$-supplier with Outliers problem, where vertices are either clients or potential facility locations. These vertices may coincide, so that one may have both a client and a potential facility location at the same point, as in $k$-center. Below we give the formal problem definition.

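Capacitated $k$-supplier with Outliers

Input: Integers $k, p \in \mathbb{N}$, finite sets $C$ and $F$, a symmetric distance (cost) function $d: (C \cup F) \times (C \cup F) \to \mathbb{R}_{\ge 0}$ satisfying the triangle inequality, and a capacity function $L: F \to \mathbb{N}$.

Find: Sets $C_\varphi \subseteq C$, $F_\varphi \subseteq F$, and a function $\varphi: C_\varphi \to F_\varphi$ satisfying $|C_\varphi| = p$, $|F_\varphi| = k$, and $|\varphi^{-1}(u)| \le L(u)$ for each $u \in F_\varphi$.

Minimize: $\max_{v \in C_\varphi} d(v, \varphi(v))$.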
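The definition above can be turned into a small feasibility check. The following is a minimal sketch, not part of the original paper: the function name `evaluate_solution` and the representation of $\varphi$ as a dictionary are assumptions made for illustration.

```python
from collections import Counter

def evaluate_solution(k, p, d, capacity, opened, phi):
    """Return the cost of a candidate solution, or None if it is infeasible.

    opened   -- set of opened facilities (hypothetical name for F_phi)
    phi      -- dict mapping each served client to its facility
    capacity -- dict playing the role of the capacity function L
    d        -- distance function d(v, u)
    """
    # Exactly k facilities are opened and exactly p clients are served.
    if len(opened) != k or len(phi) != p:
        return None
    # Clients may only be assigned to opened facilities.
    if any(u not in opened for u in phi.values()):
        return None
    # No facility serves more clients than its capacity allows.
    load = Counter(phi.values())
    if any(load[u] > capacity[u] for u in opened):
        return None
    # Objective: the largest client-to-facility distance.
    return max((d(v, u) for v, u in phi.items()), default=0)
```

For points on a line with `d = lambda v, u: abs(v - u)`, opening facilities at 1 and 6 and serving clients 0 and 1 from facility 1 and client 5 from facility 6 gives cost 1, provided the capacity of facility 1 is at least 2.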

Again, in the soft capacities version, $F_\varphi$ is allowed to be a multiset, and in the uniform capacities version, the capacity function $L$ is constant.

Existence of an $r$-approximation algorithm for Capacitated $k$-center with Outliers can be shown to be equivalent to existence of an $r$-approximation algorithm for Capacitated $k$-supplier with Outliers (see Appendix B). Interestingly, such an equivalence is not known to hold if we do not allow outliers: the best known approximation factor for the Capacitated $k$-supplier is 11 while for the Capacitated $k$-center it is 9, see [1].

### 1.1 Our results and organization of the paper

The following is the main result of this paper.

###### Theorem 1.

The Capacitated $k$-supplier with Outliers problem, both in hard and soft capacities version, admits a 25-approximation algorithm. The hard uniform capacities version admits a 23-approximation, and the soft uniform capacities version admits a 13-approximation.

Note that taking $C = F = V$ shows that the $k$-supplier problem generalizes the $k$-center problem, and consequently gives the same approximation bounds for the latter.

###### Corollary 2.

The Capacitated $k$-center with Outliers problem, both in hard and soft capacities version, admits a 25-approximation algorithm. The hard uniform capacities version admits a 23-approximation, and the soft uniform capacities version admits a 13-approximation.

It is worth noting that the already known approximation algorithm for the $k$-center problem with outliers relies on the fact that a single vertex can serve all the clients that are its neighbors, i.e. there are no capacity constraints. At the same time the previous approximation algorithms for the capacitated $k$-center problem (both in the uniform and non-uniform case) heavily used the fact that each vertex of the graph is close to some center in any solution. For this reason it was possible to create a path-like [11] or tree-like [1] structure with integrally opened non-leaf vertices, which was the crux of the rounding process. Consequently, none of the algorithms for the two previously independently studied extensions of the basic problem, i.e. capacities and outliers, works for the problem we are interested in.

The first step of our algorithm (Section 3) is the standard thresholding technique, where we reduce a general metric to a distance metric of an unweighted graph. In Section 4 we introduce our main conceptual contribution, i.e. the notion of a skeleton. A skeleton is a set $S \subseteq F$ of vertices, for which there exists an optimum solution $\varphi$, such that each vertex of $S$ can be injectively mapped to a nearby vertex of $F_\varphi$ and moreover each vertex of $F_\varphi$ is close to some vertex of $S$. Intuitively a skeleton is not yet a solution, but it looks similar to at least one optimum solution. If no outliers are allowed, any inclusion-wise maximal subset of $F$ with vertices far enough from each other is a skeleton. In [11] and [1], such a set is then mapped to non-leaf vertices of the structure steering the rounding process.

We use a skeleton in a similar way, but before we are able to do that, we need to bound the integrality gap. Without outliers, it was sufficient to take the standard LP relaxation and decompose the graph into connected components. Although with outliers this is no longer the case, as shown in Section 5, a skeleton lets us both strengthen the LP relaxation, adding an appropriate constraint, and obtain a more granular decomposition of the initial instance into several subinstances, for which the strengthened LP relaxation is feasible and has bounded integrality gap. Further in Section 6 we show how each of these smaller instances can be independently rounded using tools previously applied for the capacitated setting [1]. (Footnote 2: The final rounding step can also be done using the path-like structures of [11]; however, we use the ideas of [1] as they allow a cleaner presentation.) Section 7 contains a wrap-up of the whole algorithm. The improvements in the approximation ratio when soft or uniform capacities are considered are presented in Appendix A.

### 1.2 Related facility location work

The facility location problem is a central problem in operations research and computer science and has been a testbed for many new algorithmic ideas, resulting in a number of different approximation algorithms. In this problem, given a metric (via a weighted graph $G$), a set of nodes called clients, and opening costs on some nodes called facilities, the goal is to open a subset of facilities such that the sum of their opening costs and connection costs of clients to their nearest open facilities is minimized. Up to now, the best known approximation ratio is 1.488, due to Li [21], who used a randomized selection in Byrka’s algorithm [6]. Guha and Khuller [14] showed that this problem is hard to approximate within a factor better than 1.463, assuming $\mathrm{NP} \not\subseteq \mathrm{DTIME}(n^{O(\log \log n)})$.

When the facilities have capacities, the problem is called the capacitated facility location problem. It has also received a great deal of attention in recent years. Two main variants of the problem are soft-capacitated facility location and hard-capacitated facility location: in the latter problem, each facility is either opened at some location or not, whereas in the former, one may specify any integer number of facilities to be opened at that location. Soft capacities make the problem easier and by modifying approximation algorithms for the uncapacitated problems, we can also handle this case [23, 18]. To the best of our knowledge all the existing constant-factor approximation algorithms for the general case of hard capacitated facility location are local search based, and the most recent of them is the $5$-approximation algorithm of Bansal, Garg and Gupta [3]. The only LP-relaxation based approach for this problem is due to Levi, Shmoys and Swamy [20], who gave a 5-approximation algorithm for the special case in which all facility opening costs are equal (otherwise the LP does not have a constant integrality gap). Obtaining an LP based constant factor approximation algorithm for capacitated facility location is considered a major open problem in approximation algorithms [24].

A problem very close to both facility location and $k$-center is the $k$-median problem, in which we want to open at most $k$ facilities and the goal is to minimize the sum of connection costs of clients to their nearest open facilities. Very recently Li and Svensson [22] obtained an LP rounding $(1+\sqrt{3}+\varepsilon)$-approximation algorithm, improving upon the previously best $(3+\varepsilon)$-approximation local search algorithm of Arya et al. [2]. Unfortunately, obtaining a constant factor approximation algorithm for capacitated $k$-median still remains open despite consistent effort. The only previous attempts with constant approximation factors for this problem violate the capacities within a constant factor for the uniform capacity case [7] and the non-uniform capacity case [10], or exceed the number of facilities by a constant factor [5].

## 2 Preliminaries

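For a fixed instance of the Capacitated $k$-supplier with Outliers, we call $(C_\varphi, F_\varphi, \varphi)$ a solution if it satisfies the required conditions. We often identify the solution by $\varphi$ only (considering it as a partial function from $C$ to $F$), using $C_\varphi$ and $F_\varphi$ to refer to the other elements of the triple. If $\varphi$ satisfies $d(v, \varphi(v)) \le r$ for each $v \in C_\varphi$, we say that $\varphi$ is a distance-$r$ solution.

Let $G=(V,E)$ be an undirected graph. By $d_G$ we denote the metric defined by $G$. For sets $A, B \subseteq V$ we define $d_G(A, B) = \min_{a \in A, b \in B} d_G(a, b)$. If $A = \{a\}$ we write $d_G(a, B)$ instead of $d_G(\{a\}, B)$.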

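For a vertex $v \in V$ and an integer $k \ge 0$ we denote $N_G^k(v) = \{u \in V : d_G(u, v) = k\}$ and $N_G^k[v] = \{u \in V : d_G(u, v) \le k\}$. We omit the superscript for $k = 1$ and the subscript if there is no confusion which graph we refer to.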

For a set and an element by we denote .

## 3 Reduction to graphic instances

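As usual when working with a min-max problem, we start with the standard thresholding argument, i.e. we reduce a general metric function to a metric defined by an unweighted graph.

We say that an instance of the $k$-supplier problem is graphic, if $d$ is defined as the distance function of an unweighted bipartite graph $G = (C \cup F, E)$, and the goal is to find a distance-1 solution. An $r$-approximation algorithm is then allowed to either give a distance-$r$ solution, or, only if it finds out that no distance-1 solution exists, a NO answer.

Below we show how to build an $r$-approximation algorithm for Capacitated $k$-supplier with Outliers given an $r$-approximation algorithm $\mathcal{A}$ (in the aforementioned sense) for the graphic instances. The reduction (Algorithm 1) considers the distances $d(v, u)$ for $v \in C$, $u \in F$ in increasing order; in the phase corresponding to a distance $\tau$ it runs $\mathcal{A}$ on the graph $G_\tau = (C \cup F, \{(v, u) : d(v, u) \le \tau\})$ and returns the first solution found. Correctness of the reduction is standard. If an optimal solution exists, then its value $\mathrm{OPT}$ belongs to $\{d(v, u) : v \in C, u \in F\}$. In particular, in the phase corresponding to $\tau = \mathrm{OPT}$, there is a distance-1 solution in $G_\tau$. Thus the algorithm for graphic instances is required to find a solution. Therefore the reduction returns a solution $\varphi$ for the first time at a phase corresponding to some $\tau \le \mathrm{OPT}$. Since $d(v, u) \le \tau \cdot d_{G_\tau}(v, u)$ by the triangle inequality, $\varphi$ is a distance-$(r \cdot \tau)$ solution, hence also a distance-$(r \cdot \mathrm{OPT})$ solution.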
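The thresholding reduction can be sketched in a few lines. This is only an illustration under assumptions: Algorithm 1 is not reproduced in this text, and `solve_graphic` is a hypothetical oracle standing for the approximation algorithm on graphic instances (it returns a solution, or `None` for the NO answer).

```python
def threshold_reduction(clients, facilities, d, solve_graphic):
    """Try candidate optimum values in increasing order.

    solve_graphic(edges) is an assumed oracle for graphic instances: it gets
    the client-facility pairs at distance at most the current threshold and
    returns a solution, or None if it determines no distance-1 solution exists.
    """
    # The optimum value is always one of the client-facility distances.
    thresholds = sorted({d(v, u) for v in clients for u in facilities})
    for tau in thresholds:
        # Unweighted bipartite graph for this phase.
        edges = {(v, u) for v in clients for u in facilities if d(v, u) <= tau}
        solution = solve_graphic(edges)
        if solution is not None:
            # First success happens at a threshold not exceeding the optimum.
            return tau, solution
    return None
```

The loop stops at the first threshold for which the oracle succeeds, which is what yields the bound relative to the optimum in the correctness argument.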

## 4 Finding a skeleton

From now on we work with graphic instances only. Without loss of generality we may assume that $L(u) \le |N(u)|$ for each $u \in F$. Indeed, setting $L(u) := \min(L(u), |N(u)|)$ has no influence on distance-1 solutions, while no additional distance-$r$ solutions are created.

The first phase of the algorithm outputs several subsets of $F$. If a distance-1 solution exists, at least one of them resembles (in a certain sense, to be defined later) a distance-1 solution and can be successfully used by the subsequent phases as a hint for constructing a distance-$r$ solution. We formalize the features of a good hint in the following definition.

###### Definition 3.

A set $S \subseteq F$ is called a skeleton if

• (separation property) for any $s, s' \in S$, $s \ne s'$, we have $d(s, s') \ge 6$,

• there exists a distance-1 solution $\varphi$ such that:

  • (covering property) $d(u, S) \le 4$ for each $u \in F_\varphi$,

  • (injection property) there exists an injection $f: S \to F_\varphi$ satisfying $d(s, f(s)) \le 2$ for each $s \in S$.

If just the separation and injection properties are satisfied, we call $S$ a preskeleton.

In other words a skeleton is a set $S$, each vertex of which can be injectively mapped to a vertex of a distance-1 solution $\varphi$, and at the same time no two vertices of $S$ are close and every vertex of $F_\varphi$ lies within distance $4$ of $S$.

Note that the separation property implies that the sets $N^2[s]$ are pairwise disjoint for $s \in S$, hence any function $f: S \to F_\varphi$ satisfying $d(s, f(s)) \le 2$ is in fact an injection; however, we make it explicit for the sake of presentation.

###### Lemma 4.

Let $S$ be a preskeleton and let $U = \{u \in F : d(u, S) \ge 6\}$. Then $S$ is a skeleton, or $U \ne \emptyset$ and $S \cup \{s\}$ is a preskeleton, where $s$ is a highest-capacity vertex of $U$.

###### Proof.

Let $\varphi$ be a distance-1 solution which witnesses $S$ being a preskeleton, where $f$ satisfies the injection property. If $\varphi$ witnesses $S$ being a skeleton, we are done. Otherwise the covering property is not satisfied, hence there exists $u \in F_\varphi$ such that $d(u, S) > 4$. Since $d$ is a distance function of a bipartite graph, this implies $d(u, S) \ge 6$, so $u \in U$. If $F_\varphi \cap N^2[s] \ne \emptyset$, then $\varphi$ already witnesses $S \cup \{s\}$ being a preskeleton, as one can extend the injection $f$ by mapping $s$ to a vertex of $F_\varphi \cap N^2[s]$. Therefore, we may assume that $F_\varphi \cap N^2[s] = \emptyset$. In particular, this means that the clients in $N(s)$ are not served by any facility of $F_\varphi$.

Let us modify $\varphi$ to obtain $\varphi'$ as follows: close the facility in $u$, opening one in $s$ instead. Let $\ell$ be the number of clients assigned to $u$ in $\varphi$. No longer serve these, instead serve any $\ell$ neighbors of $s$ in $C$ (as we have observed before, they are not served in $\varphi$). Note that $\ell \le L(u) \le L(s) \le |N(s)|$ by the choice of $s$ maximizing the capacity and by the assumption of $L(s)$ being bounded by $|N(s)|$. Consequently, there are enough neighbors of $s$ to serve, and the capacity constraint for $s$ is satisfied. Moreover, the number of open facilities and the number of served clients are preserved. Other open facilities remain unchanged, so $\varphi'$ satisfies the capacity and distance constraints for them, and therefore $\varphi'$ is a distance-1 solution. Finally, consider the function $f' = f \cup \{(s, s)\}$. As $u$ is at distance at least $6$ from $S$, by the injection property for $f$ we know that $u$ does not belong to the image of $f$, hence $f'$ is an injection into $F_{\varphi'}$. Consequently $\varphi'$ and $f'$ ensure that $S \cup \{s\}$ satisfies the injection property. Moreover $s$ is far from $S$, hence $S \cup \{s\}$ is a preskeleton. ∎

With $\emptyset$ being trivially a preskeleton provided that any distance-1 solution exists, Lemma 4 lets us generate a sequence of sets which contains a skeleton (see Algorithm 2). Note that any skeleton, by the injection property, is of size at most $k$.

###### Lemma 5.

If there exists a distance-1 solution, there is at least one skeleton among the sets output by Algorithm 2.
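Since Algorithm 2 is not reproduced in this text, the following is a hedged sketch of how the greedy process suggested by Lemma 4 could look: start from the empty set and repeatedly add a highest-capacity facility at distance at least 6 from the current set, outputting every intermediate set. The names `bfs_dist` and `skeleton_candidates`, and the adjacency-dictionary representation, are assumptions made for illustration.

```python
from collections import deque

def bfs_dist(adj, sources):
    """Multi-source BFS distances in an unweighted graph given as adjacency dict."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def skeleton_candidates(adj, facilities, capacity, k):
    """Return the list of candidate sets; one of them is meant to be a skeleton."""
    candidates, S = [], []
    while True:
        candidates.append(list(S))
        dist = bfs_dist(adj, S)
        # Facilities at distance >= 6 from S (all of them when S is empty).
        far = [u for u in facilities if dist.get(u, float("inf")) >= 6]
        # A skeleton has at most k elements, so larger sets are never needed.
        if not far or len(S) == k:
            return candidates
        S.append(max(far, key=lambda u: capacity[u]))
```

Every returned set satisfies the separation property by construction; whether a given candidate is actually a skeleton is only established indirectly, by the later phases succeeding on it.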

## 5 Clustering

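For a set $S \subseteq F$ define the following linear program $\mathrm{LP}_{k,p}(G, L, S)$, where a variable $y_u$ for $u \in F$ denotes whether we open a facility in $u$ or not, while a variable $x_{uv}$ for $u \in F$, $v \in C$ corresponds to whether $u$ serves $v$ or not.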

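\begin{align}
\sum_{u \in F} y_u &= k \tag{1}\\
\sum_{u \in F, v \in C} x_{uv} &= p \tag{2}\\
x_{uv} &\le y_u && \text{for each } u \in F, v \in C \tag{3}\\
\sum_{v} x_{uv} &\le L(u) \cdot y_u && \text{for each } u \in F \tag{4}\\
\sum_{u} x_{uv} &\le 1 && \text{for each } v \in C \tag{5}\\
\sum_{u \in F \cap N^2[s]} y_u &\ge 1 && \text{for each } s \in S \tag{6}\\
x_{uv} &= 0 && \text{for each } u \in F, v \in C \text{ such that } (v,u) \notin E \tag{7}\\
0 \le x, y &\le 1 \tag{8}
\end{align}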

Constraints (1)-(5), (7) and (8) are the standard constraints for Capacitated $k$-supplier with Outliers, ensuring that we open exactly $k$ facilities (1), serve exactly $p$ clients (2), obey capacity constraints (3)-(5), and serve clients which are close to facilities (7).
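To make the role of each constraint concrete, here is a minimal sketch of a checker that reports which of the constraints (1)-(8) a given point violates. It is an illustration only: the function name, the dictionary encoding of $x$ and $y$, and the `near` argument (listing $F \cap N^2[s]$ for each $s \in S$) are assumptions, and no LP solver is involved.

```python
def violated_constraints(k, p, clients, facilities, edges, capacity, near, x, y, eps=1e-9):
    """Return the sorted list of violated constraint numbers (empty if feasible).

    x maps (u, v) pairs to values (missing pairs count as 0); y maps facilities
    to values; edges holds (v, u) pairs; near[s] lists facilities within
    distance 2 of s.
    """
    xv = lambda u, v: x.get((u, v), 0.0)
    bad = set()
    if abs(sum(y[u] for u in facilities) - k) > eps:
        bad.add(1)
    if abs(sum(xv(u, v) for u in facilities for v in clients) - p) > eps:
        bad.add(2)
    for u in facilities:
        if any(xv(u, v) > y[u] + eps for v in clients):
            bad.add(3)
        if sum(xv(u, v) for v in clients) > capacity[u] * y[u] + eps:
            bad.add(4)
    for v in clients:
        if sum(xv(u, v) for u in facilities) > 1 + eps:
            bad.add(5)
    for ball in near.values():
        if sum(y[u] for u in ball) < 1 - eps:
            bad.add(6)
    for (u, v), val in x.items():
        if val > eps and (v, u) not in edges:
            bad.add(7)
    if any(not (-eps <= val <= 1 + eps) for val in list(x.values()) + list(y.values())):
        bad.add(8)
    return sorted(bad)
```

Feeding it the 0/1 vectors of an integral distance-1 solution should give an empty list whenever each $s \in S$ has an opened facility within distance 2.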

Observe that if $S$ is a skeleton and a distance-1 solution $\varphi$ witnesses that fact, we get a feasible solution of $\mathrm{LP}_{k,p}(G, L, S)$ by setting $y_u = 1$ iff $u \in F_\varphi$ and $x_{uv} = 1$ iff $v \in C_\varphi$ and $\varphi(v) = u$. Indeed, the injection property ensures that constraint (6) is satisfied. However, as usual in a capacitated problem with hard constraints, the integrality gap of this LP is unbounded. Similarly to the standard capacitated $k$-center [11], this issue is addressed by considering the connected components of $G$ separately. When all the clients need to be served, having a connected graph with a feasible solution of the standard LP is enough to round it [1, 11]. However, if we allow outliers, there are still connected instances with arbitrarily large integrality gap (a simple construction is presented in Appendix C). For this reason we use the additional constraint (6) together with the assumption that all the vertices are close to $S$. This way we crucially exploit the covering, injection and separation properties of a skeleton.

In the following we shall prove that any instance with a skeleton can be decomposed into several smaller instances with additional properties. In the next section we will show how to round the obtained smaller instances.

###### Lemma 6.

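Let $S \subseteq F$, let $G_1, \ldots, G_\ell$ be the components of $G$ after all vertices $v$ with $d(v, S) > 5$ are removed, and let $S_i = S \cap V(G_i)$ for $i = 1, \ldots, \ell$.

If $S$ is a skeleton, then in polynomial time one can find partitions $k = k_1 + \ldots + k_\ell$ and $p = p_1 + \ldots + p_\ell$ such that $\mathrm{LP}_{k_i, p_i}(G_i, L, S_i)$ are all feasible.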

###### Proof.

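Observe that if $S$ is a skeleton, then a witness solution opens facilities at distance at most 4 from $S$, and thus serves clients at distance at most 5 from $S$. Consequently all vertices further from $S$ can be safely removed and $S$ remains a skeleton. Then $G$ might contain several connected components $G_1, \ldots, G_\ell$ with $S_i = S \cap V(G_i)$. The witness solution can be partitioned among these components so that we get assignments $\varphi_1, \ldots, \varphi_\ell$ which in total open $k$ facilities to serve $p$ clients. In particular, this means that for some partitions $k = k_1 + \ldots + k_\ell$ and $p = p_1 + \ldots + p_\ell$ the sets $S_i$ are skeletons of the corresponding subinstances, and consequently $\mathrm{LP}_{k_i, p_i}(G_i, L, S_i)$ are feasible. The latter condition can be tested efficiently for any values $k_i$ and $p_i$. While we cannot exhaustively test all partitions of $k$ and $p$, dynamic programming lets us find partitions such that these linear programs are feasible for each $i$.

For $0 \le i \le \ell$, $0 \le k' \le k$ and $0 \le p' \le p$ define a boolean value $D[i, k', p']$, which equals true iff there exist partitions $k' = k_1 + \ldots + k_i$ and $p' = p_1 + \ldots + p_i$ such that $\mathrm{LP}_{k_j, p_j}(G_j, L, S_j)$ are all feasible for $j = 1, \ldots, i$.

Clearly $D[0, 0, 0]$ is true, while $D[0, k', p']$ is false for any other pair $(k', p')$. For $i \ge 1$ the value $D[i, k', p']$ is simply an alternative of $D[i-1, k' - k_i, p' - p_i]$ for every pair $(k_i, p_i)$ such that $\mathrm{LP}_{k_i, p_i}(G_i, L, S_i)$ is feasible, $0 \le k_i \le k'$ and $0 \le p_i \le p'$. Thus in polynomial time one can check whether the desired partitions exist and, provided that together with a true value we also store the witness partitions, also find these partitions. ∎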
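The dynamic programming from the proof can be sketched as follows. This is an illustration under assumptions: `feasible(i, ki, pi)` is a hypothetical oracle answering whether the linear program of the $i$-th component (1-based) is feasible for the given budget, and the function name is not from the paper.

```python
def find_partition(num_components, k, p, feasible):
    """Return a list of (k_i, p_i) pairs summing to (k, p), or None.

    feasible(i, ki, pi) -> bool is an assumed oracle for the i-th LP.
    """
    # reach[i] maps each reachable (k', p') to the pair used for component i.
    reach = [{(0, 0): None}]
    for i in range(1, num_components + 1):
        layer = {}
        for (k_prev, p_prev) in reach[i - 1]:
            for ki in range(k - k_prev + 1):
                for pi in range(p - p_prev + 1):
                    key = (k_prev + ki, p_prev + pi)
                    if key not in layer and feasible(i, ki, pi):
                        layer[key] = (ki, pi)
        reach.append(layer)
    if (k, p) not in reach[num_components]:
        return None
    # Walk back through the stored witnesses to recover the partition.
    parts, state = [], (k, p)
    for i in range(num_components, 0, -1):
        ki, pi = reach[i][state]
        parts.append((ki, pi))
        state = (state[0] - ki, state[1] - pi)
    return parts[::-1]
```

Storing one witness pair per reachable state is what allows the partitions themselves, and not only the yes/no answer, to be recovered.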

## 6 Rounding

In the previous section we have shown how given a skeleton one can partition the initial instance into smaller subinstances with more structural properties. Our main goal in this section is to show that those structural properties are in fact sufficient to construct a solution for each of the subinstances, which is formalized in the following lemma.

###### Lemma 7.

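Let $G = (C \cup F, E)$ with a capacity function $L$ and integers $k$, $p$ be a graphic instance of Capacitated $k$-supplier with Outliers and let $S \subseteq F$. If the following four conditions are satisfied:

1. $G$ is connected,

2. for any $s, s' \in S$, $s \ne s'$, we have $d(s, s') \ge 6$,

3. $d(v, S) \le 5$ for every vertex $v$ of $G$,

4. $\mathrm{LP}_{k,p}(G, L, S)$ admits a feasible solution,

then one can find a distance-25 solution for this instance in polynomial time.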

Before we give a proof of Lemma 7, in Section 6.1 we recall (an adjusted version of) a distance-$r$ transfer, a very useful notion introduced in [1], together with its main properties. Next, in Section 6.2 we prove Lemma 7.

### 6.1 Distance r-transfer

###### Definition 8.

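Given a graph $G = (V, E)$ with $F \subseteq V$, a capacity function $L: F \to \mathbb{N}$ and $y \in \mathbb{R}^F_{\ge 0}$, a vector $y' \in \mathbb{R}^F_{\ge 0}$ is a distance-$r$ transfer of $(G, L, y)$ if

1. $\sum_{u \in F} y'_u = \sum_{u \in F} y_u$ and

2. $\sum_{u \in F : d_G(u, U) \le r} L(u) y'_u \ge \sum_{u \in U} L(u) y_u$ for all $U \subseteq F$.

If $y'$ is a characteristic vector of $F' \subseteq F$, we say that $F'$ is an integral distance-$r$ transfer of $(G, L, y)$.

Less formally, a distance-$r$ transfer is a reassignment where the sum of the $y$-variables is preserved and, locally for any set $U$, the total fractional capacity in a small neighborhood of $U$ does not decrease.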
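For very small instances the two conditions of a transfer, in the form standard for [1] (the sum of the values is preserved, and for every set $U$ the capacity-weighted new value within distance $r$ of $U$ is at least the capacity-weighted old value on $U$), can be checked by brute force over all subsets. The sketch below is exponential and only meant to illustrate the definition; the function name and the encoding of vectors as dictionaries are assumptions.

```python
from itertools import combinations

def is_transfer(F, dist, capacity, y, y_new, r, eps=1e-9):
    """Brute-force check that y_new is a distance-r transfer of y."""
    # Condition 1: the total value is preserved.
    if abs(sum(y_new[u] for u in F) - sum(y[u] for u in F)) > eps:
        return False
    # Condition 2: checked for every non-empty subset U of F.
    for size in range(1, len(F) + 1):
        for U in combinations(F, size):
            near = [u for u in F if min(dist(u, w) for w in U) <= r]
            new_capacity = sum(capacity[u] * y_new[u] for u in near)
            old_capacity = sum(capacity[w] * y[w] for w in U)
            if new_capacity < old_capacity - eps:
                return False
    return True
```

On a path of three facilities, moving two half-units from the endpoints to a middle vertex of larger capacity is a distance-1 transfer, while the same move onto a lower-capacity middle vertex is not.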

Like in [1], an integral distance-$r$ transfer of the fractional solution of the LP already gives a distance-$(r+1)$ solution (in particular point 2 of Definition 8 ensures that Hall’s condition is satisfied). The proof must be modified though, so that it encompasses outliers.

###### Lemma 9.

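Let $G = (C \cup F, E)$ be a bipartite graph with a capacity function $L: F \to \mathbb{N}$. Assume $(x, y)$ is a feasible solution of $\mathrm{LP}_{k,p}(G, L, \emptyset)$ and $F' \subseteq F$ is an integral distance-$r$ transfer of $(G, L, y)$. Then one can find a distance-$(r+1)$ solution in polynomial time.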

###### Proof.

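Let $F' \subseteq F$ denote the given integral distance-$r$ transfer. Consider a bipartite graph $G^r = (C \cup F, E^r)$ with $(v, u) \in E^r$ if $d_G(v, u) \le r + 1$. Modify $G^r$ to obtain $G'$ by removing the vertices of $F \setminus F'$ and duplicating each vertex $u \in F'$ to its capacity, i.e. $L(u)$ times, see also Fig. 1. Observe that cardinality-$p$ matchings in this graph correspond to distance-$(r+1)$ solutions for $G$. Such a matching, if it exists, can clearly be found in polynomial time. We shall prove its existence by checking the deficit version of Hall’s theorem, i.e. that for each $U \subseteq C$ we have

$$\sum_{u \in F' : d_G(u, U) \le r+1} L(u) \ge |U| - |C| + p.$$

First, observe that

$$\sum_{v \in U, u \in F} x_{uv} = \sum_{v \in C, u \in F} x_{uv} - \sum_{v \in C \setminus U, u \in F} x_{uv} \overset{(2),(5)}{\ge} p - \sum_{v \in C \setminus U} 1 = p - |C \setminus U| = |U| - |C| + p.$$

Moreover

$$\sum_{v \in U, u \in F} x_{uv} = \sum_{v \in U, u \in N_G(U)} x_{uv} \le \sum_{u \in N_G(U)} \sum_{v \in C} x_{uv} \overset{(4)}{\le} \sum_{u \in N_G(U)} L(u) y_u \overset{\text{Def. 8, point 2}}{\le} \sum_{u \in F' : d_G(u, N_G(U)) \le r} L(u) = \sum_{u \in F' : d_G(u, U) \le r+1} L(u).$$

Together these inequalities conclude the proof. ∎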
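The matching step of the proof can be illustrated with a simple augmenting-path routine on the graph where every opened facility is duplicated according to its capacity. This is a sketch under assumptions, not the authors' implementation: `allowed(v, u)` is a hypothetical predicate saying whether client `v` is within the permitted distance of facility `u`, and any polynomial-time bipartite matching algorithm could be used instead.

```python
def assign_clients(clients, opened, capacity, allowed, p):
    """Serve p clients from the opened facilities, or return None if impossible."""
    # Duplicate each opened facility once per unit of capacity.
    slots = [u for u in opened for _ in range(capacity[u])]
    owner = [None] * len(slots)  # owner[j] is the client matched to slot j

    def augment(v, seen):
        # Standard augmenting-path search from client v.
        for j, u in enumerate(slots):
            if j not in seen and allowed(v, u):
                seen.add(j)
                if owner[j] is None or augment(owner[j], seen):
                    owner[j] = v
                    return True
        return False

    matched = 0
    for v in clients:
        if matched == p:
            break
        if augment(v, set()):
            matched += 1
    if matched < p:
        return None
    return {v: slots[j] for j, v in enumerate(owner) if v is not None}
```

Clients that remain unmatched when the routine stops are exactly the outliers of the resulting solution.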

We proceed with a pair of simple properties of transfers.

###### Fact 10.

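Let $G$ be a graph with $F \subseteq V(G)$ and a capacity function $L: F \to \mathbb{N}$, and let $y \in \mathbb{R}^F_{\ge 0}$. Assume $y'$ is a distance-$r$ transfer of $(G, L, y)$ and $y''$ is a distance-$r'$ transfer of $(G, L, y')$. Then $y''$ is a distance-$(r + r')$ transfer of $(G, L, y)$.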

###### Fact 11.

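Let $G$ and $H$ be graphs with $F \subseteq V(G)$ and $F \subseteq V(H)$ and a capacity function $L: F \to \mathbb{N}$. Let $y \in \mathbb{R}^F_{\ge 0}$ and let $h$ be a monotonic function such that $d_G(u, v) \le h(d_H(u, v))$ for any $u, v \in F$. Assume $y'$ is a distance-$r$ transfer of $(H, L, y)$. Then $y'$ is a distance-$h(r)$ transfer of $(G, L, y)$.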

The following is the main technical contribution of [1].

###### Lemma 12 ([1]).

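Let $T$ be a tree with a capacity function $L: V(T) \to \mathbb{N}$ and let $y \in [0, 1]^{V(T)}$ be a vector such that $y_v = 1$ for every non-leaf $v$ and $\sum_{v \in V(T)} y_v$ is an integer. Then one can find in polynomial time an integral distance-2 transfer of $(T, L, y)$.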

### 6.2 Final rounding

###### Lemma 13.

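Let $G = (C \cup F, E)$ be a connected bipartite graph and let $S \subseteq F$ be such that $d_G(v, S) \le 5$ for every $v \in V(G)$. There exists an auxiliary tree $T = (S, E_T)$ such that $d_G(s, s') \le 10$ for any $ss' \in E_T$. Moreover, such a tree can be computed in polynomial time.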

###### Proof.

We shall grow a tree adding a leaf in each step. At the beginning we select any $s \in S$ and initialize $T$ with a single-vertex tree. Assume we have already grown a tree with vertex set $S' \subsetneq S$. Choose a shortest path connecting $S'$ to $S \setminus S'$. Such a path exists since $G$ is connected. If its length is at most 10, we add the endpoint in $S \setminus S'$ to the tree, joining it with the other endpoint. For a proof by contradiction assume that a shortest path has length greater than 10. Since $G$ is bipartite, its length needs to be even, and thus at least 12. Choose the midpoint of such a path. Its distance both to $S'$ and to $S \setminus S'$ is at least 6, otherwise the path could be shortened. This vertex contradicts the assumption that $d_G(v, S) \le 5$ for every $v \in V(G)$. ∎
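The growing procedure from the proof can be sketched with a multi-source breadth-first search that remembers from which tree vertex each discovered vertex was reached. The function name and the adjacency-dictionary representation are assumptions for illustration; the sketch returns `None` when a step would need a path longer than 10, which the lemma rules out under its hypotheses.

```python
from collections import deque

def grow_tree(adj, S):
    """Return tree edges (s, t, distance) on S, or None if some step fails."""
    in_tree = {S[0]}
    edges = []
    while len(in_tree) < len(S):
        targets = set(S) - in_tree
        # Multi-source BFS from the current tree vertices.
        origin = {s: s for s in in_tree}
        dist = {s: 0 for s in in_tree}
        queue = deque(in_tree)
        found = None
        while queue and found is None:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    origin[w] = origin[v]
                    if w in targets:
                        found = w  # first target reached is a closest one
                        break
                    queue.append(w)
        if found is None or dist[found] > 10:
            return None
        # Join the new vertex with the tree endpoint of the shortest path.
        edges.append((origin[found], found, dist[found]))
        in_tree.add(found)
    return edges
```

Each round costs one graph traversal, so the whole construction runs in time polynomial in the size of the graph.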

We are ready to prove Lemma 7.

###### proof of Lemma 7.

Since $G$ is connected and every vertex of $G$ is within distance $5$ from $S$, we can use Lemma 13 to construct a tree $T$ on $S$. Let us add a duplicate $s'$ of every $s \in S$ to create a bipartite graph $G' = (C \cup F', E')$, where $S' = \{s' : s \in S\}$ and $F' = F \cup S'$. For each $s \in S$ choose a highest-capacity vertex $u_s \in F \cap N^2[s]$ and set $L(s') = L(u_s)$. Let us create a tree $T'$ with $V(T') = F' \setminus \{u_s : s \in S\}$. We build it in two steps, see also Fig. 2:

1. create a tree with vertex set $S'$ so that $s't'$ is an edge iff $st \in E(T)$,

2. connect each vertex in $F \setminus \{u_s : s \in S\}$ to the closest vertex in $S'$.

Observe that endpoints of the edges created in the first step are at most at distance 10 in $G'$, while endpoints of the edges created in the second step are at most at distance 4. Consequently, $d_{G'}(u, v) \le 10 \cdot d_{T'}(u, v)$ for any $u, v \in V(T')$. Moreover, note that all non-leaves of $T'$ belong to $S'$.

Let $(x, y)$ be a feasible solution of $\mathrm{LP}_{k,p}(G, L, S)$. Note that $y$ can be interpreted as a vector in $\mathbb{R}^{F'}_{\ge 0}$ by extending it with zeroes at $S'$. We shall give an integral distance-24 transfer of $(G', L, y)$. Despite it being formally a transfer in $G'$, the resulting set will be a subset of $F$, i.e. a transfer of $(G, L, y)$ as well.

Recall that by condition 2, the sets $N^2[s]$ are pairwise disjoint and in particular the vertices $u_s$ are pairwise different. This lets us use (6) to gather in $s'$ one unit from $F \cap N^2[s]$ for every $s \in S$ so that the whole value in $u_s$ is transferred to $s'$. Note that $d_{G'}(u, s') \le 2$ for each $u \in F \cap N^2[s]$, so this way we obtain a distance-2 transfer $y'$ of $(G', L, y)$. Additionally, we have made sure that $y'_{u_s} = 0$, so $y'$ can be interpreted as a vector in $\mathbb{R}^{V(T')}_{\ge 0}$, and that $y'_{s'} = 1$, so $y'$ is 1 for all non-leaves of $T'$. This lets us use Lemma 12 to obtain an integral distance-2 transfer of $(T', L, y')$. According to Fact 11 it can be interpreted as a distance-20 transfer of $(G', L, y')$. Finally we move the value from $s'$ to $u_s$ for each $s \in S$. Note that these vertices have equal capacities, so this step can be interpreted as an integral distance-2 transfer.

The final transfer is therefore a composition of a distance-2 transfer, a distance-20 transfer and a distance-2 transfer. Thus, by Fact 10 it is a distance-24 transfer. (Footnote 3: A simpler construction gives a distance-30 transfer, without introducing additional vertices $s'$. It is enough first to gather one unit from $F \cap N^2[s]$ in $u_s$ and build a tree on vertices $F$, where adjacent vertices of the tree are at distance at most $14$ in $G$. By using Lemma 12 one obtains a distance-28 transfer, which together with the initial distance-2 transfer gives an integral distance-30 transfer.) By Lemma 9, having an integral distance-24 transfer is enough to construct a distance-25 solution in polynomial time, which concludes the proof of Lemma 7. ∎

## 7 Wrap-up

With the results of the previous sections, we are ready to prove the main theorem.

###### Theorem 14.

The Capacitated $k$-supplier with Outliers problem admits a 25-approximation algorithm.

###### Proof.

Section 3 with Algorithm 1 provides a (Turing-like) reduction to graphic instances. Algorithm 2 of Section 4, given such an instance, outputs several sets. Provided that a distance-1 solution exists, one of them is guaranteed to be a skeleton. Each of these sets is then processed separately. As described in Section 5, some redundant vertices are removed and the graph is partitioned into connected components. Dynamic programming (Lemma 6) is then used to find a compatible partition of $k$ and $p$, so that each linear program $\mathrm{LP}_{k_i, p_i}(G_i, L, S_i)$ admits a feasible solution. While this procedure might fail in general, it is guaranteed to succeed for a skeleton, hence at least once if a distance-1 solution exists.

Note that if such a partition is found, then for each of the instances on $G_i$ with parameters $k_i$, $p_i$, together with the sets $S_i$, we can use Lemma 7 as all the conditions are satisfied. The union of the solutions for these instances is finally returned as a distance-25 solution for the original graphic instance. ∎

## Acknowledgements

We would like to thank Samir Khuller for suggesting the study of this variant of the $k$-center problem and for helpful discussions.

## References

• [1] Hyung-Chan An, Aditya Bhaskara, and Ola Svensson. Centrality of trees for capacitated $k$-center. CoRR, abs/1304.2983, 2013.
• [2] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala, and Vinayaka Pandit. Local search heuristic for $k$-median and facility location problems. In Jeffrey Scott Vitter, Paul G. Spirakis, and Mihalis Yannakakis, editors, STOC, pages 21–29. ACM, 2001.
• [3] Manisha Bansal, Naveen Garg, and Neelima Gupta. A 5-approximation for capacitated facility location. In Leah Epstein and Paolo Ferragina, editors, ESA, volume 7501 of Lecture Notes in Computer Science, pages 133–144. Springer, 2012.
• [4] Judit Bar-Ilan, Guy Kortsarz, and David Peleg. How to allocate network centers. Journal of Algorithms, 15(3):385–415, 1993.
• [5] Yair Bartal, Moses Charikar, and Danny Raz. Approximating min-sum $k$-clustering in metric spaces. In Jeffrey Scott Vitter, Paul G. Spirakis, and Mihalis Yannakakis, editors, STOC, pages 11–20. ACM, 2001.
• [6] Jaroslaw Byrka and Karen Aardal. An optimal bifactor approximation algorithm for the metric uncapacitated facility location problem. SIAM J. Comput., 39(6):2212–2231, 2010.
• [7] Moses Charikar, Sudipto Guha, Éva Tardos, and David B. Shmoys. A constant-factor approximation algorithm for the $k$-median problem. Journal of Computer and System Sciences, 65(1):129–149, 2002.
• [8] Moses Charikar, Samir Khuller, David M. Mount, and Giri Narasimhan. Algorithms for facility location problems with outliers. In S. Rao Kosaraju, editor, SODA, pages 642–651. ACM/SIAM, 2001.
• [9] Julia Chuzhoy, Sudipto Guha, Eran Halperin, Sanjeev Khanna, Guy Kortsarz, Robert Krauthgamer, and Joseph Naor. Asymmetric $k$-center is $\log^* n$-hard to approximate. Journal of the ACM, 52(4):538–551, 2005.
• [10] Julia Chuzhoy and Yuval Rabani. Approximating $k$-median with non-uniform capacities. In SODA, pages 952–958. SIAM, 2005.
• [11] Marek Cygan, MohammadTaghi Hajiaghayi, and Samir Khuller. LP rounding for $k$-centers with non-uniform hard capacities. In FOCS, pages 273–282. IEEE Computer Society, 2012.
• [12] M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
• [13] Teofilo F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985.
• [14] Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31(1):228–248, 1999.
• [15] Dorit S. Hochbaum and David B. Shmoys. A best possible heuristic for the $k$-center problem. Mathematics of Operations Research, 10:180–184, 1985.
• [16] Dorit S. Hochbaum and David B. Shmoys. A unified approach to approximation algorithms for bottleneck problems. Journal of the ACM, 33(3):533–550, 1986.
• [17] Wen-Lian Hsu and George L. Nemhauser. Easy and hard bottleneck location problems. Discrete Applied Mathematics, 1:209–216, 1979.
• [18] Kamal Jain and Vijay V. Vazirani. Approximation algorithms for metric facility location and $k$-median problems using the primal-dual schema and Lagrangian relaxation. Journal of the ACM, 48(2):274–296, 2001.
• [19] Samir Khuller and Yoram J. Sussmann. The capacitated $k$-center problem. SIAM Journal on Discrete Mathematics, 13(3):403–418, 2000.
• [20] Retsef Levi, David B. Shmoys, and Chaitanya Swamy. LP-based approximation algorithms for capacitated facility location. Mathematical Programming, 131(1-2):365–379, 2012.
• [21] Shi Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. Information and Computation, 222:45–58, 2013.
• [22] Shi Li and Ola Svensson. Approximating $k$-median via pseudo-approximation. In Dan Boneh, Tim Roughgarden, and Joan Feigenbaum, editors, STOC, pages 901–910. ACM, 2013.
• [23] David B. Shmoys, Éva Tardos, and Karen Aardal. Approximation algorithms for facility location problems. In Frank Thomson Leighton and Peter W. Shor, editors, STOC, pages 265–274. ACM, 1997.
• [24] David P. Williamson and David B. Shmoys. The Design of Approximation Algorithms. Cambridge University Press, 2011.

## Appendix A Soft capacities and uniform capacities

### A.1 Soft capacities

A variant of Capacitated $k$-supplier with Outliers with soft capacities can be easily reduced to the original problem, preserving the quality of solutions. It suffices to duplicate each $u \in F$ $k$ times. Opening several facilities in $u$ then corresponds to opening facilities in several copies of $u$.
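The duplication argument can be illustrated directly. In this minimal sketch (the function names and the pair encoding `(u, i)` of the $i$-th copy of `u` are assumptions), each facility is replaced by several copies with the same capacity, and a set of opened copies is translated back into a multiset of locations.

```python
from collections import Counter

def soften(facilities, capacity, k):
    """Replace each facility u by k copies (u, 0), ..., (u, k-1) of equal capacity."""
    copies = [(u, i) for u in facilities for i in range(k)]
    copy_capacity = {c: capacity[c[0]] for c in copies}
    return copies, copy_capacity

def to_multiset(opened_copies):
    """Translate opened copies back to a multiset of original locations."""
    return Counter(u for u, _ in opened_copies)
```

Each copy inherits the distances of the original location, so any hard-capacity solution on the copies has the same cost as the corresponding soft-capacity solution.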

###### Theorem 15.

The Capacitated k-supplier with Outliers problem with soft capacities admits a 25-approximation algorithm.
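The duplication step of this reduction can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the number of copies is left as a parameter `copies`, and the names `duplicate_facilities`, `copy_distance`, and `multiset_to_copies` are hypothetical. A copy is assumed to inherit both the capacity and the distances of the location it was created from.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Copy = Tuple[Hashable, int]  # (original location, copy index)


def duplicate_facilities(
    facilities: List[Hashable],
    capacity: Dict[Hashable, int],
    copies: int,
) -> Tuple[List[Copy], Dict[Copy, int]]:
    """Replace every facility location by `copies` identical copies."""
    new_facilities = [(f, i) for f in facilities for i in range(copies)]
    # Each copy keeps the capacity of its original location.
    new_capacity = {(f, i): capacity[f] for (f, i) in new_facilities}
    return new_facilities, new_capacity


def copy_distance(
    dist: Callable[[Hashable, Hashable], float],
    client: Hashable,
    facility_copy: Copy,
) -> float:
    """A copy is exactly as far from a client as its original location."""
    return dist(client, facility_copy[0])


def multiset_to_copies(opened: Dict[Hashable, int], copies: int) -> List[Copy]:
    """Turn a soft-capacity solution (location -> multiplicity) into a set
    of distinct copies, i.e. a solution of the hard-capacity instance."""
    result: List[Copy] = []
    for f, multiplicity in opened.items():
        if multiplicity > copies:
            raise ValueError("not enough copies of location %r" % (f,))
        result.extend((f, i) for i in range(multiplicity))
    return result
```

Opening a location with multiplicity m in the soft-capacity instance then corresponds to opening m distinct copies in the duplicated instance, and the assignment distances are unchanged.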

### A.2 Uniform capacities

In the special case of Capacitated k-supplier with Outliers where the capacities are uniform, we can obtain a slightly better approximation factor. Namely, in the proof of Lemma 7 we can set and avoid introducing additional vertices , using instead. With this change the third component of the transfer – moving the value from to – is not necessary, thus we get an integral distance-22 transfer. Analogously to Theorem 14, we then obtain the following result.

###### Theorem 16.

The Capacitated k-supplier with Outliers problem with uniform capacities admits a 23-approximation algorithm.

### A.3 Uniform soft capacities

While we could argue as for general soft capacities that in the case of uniform soft capacities we have a 23-approximation algorithm, a tailor-made proof gives a much better factor.

It is easy to verify that the ingredients of the proof of Theorem 14 can be adapted to soft capacities with two changes:

• instead of a set of open facilities, we consider a multiset,

• we drop the requirement in the LP.

Thus, in order to obtain an -approximation algorithm it is enough to compute an integral (again, multisets allowed) distance- transfer of , where is the fractional solution of the LP for an instance satisfying the conditions of Lemma 7.

Again, we shall start with gathering value from in . This time we are allowed to gather more than one unit in , so we gather everything from . A vector defined this way clearly is a distance-2 transfer of . Moreover, by (6) at least one unit is gathered at each . Like in the proof of Lemma 7, the second component relies on the structure of . We connect each to the closest obtaining a tree . This way we have a tree on whose non-leaves belong to , and such that for any . We shall give an integral distance-1 transfer of . Let us make a rooted tree, setting the root at a vertex . For each define as the sum of over all descendants in the subtree rooted at . For each we transfer units from to its parent . Note that is an integer, since , so and the operation is well defined. Observe that for every it holds that

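$$y''_v = y'_v - \delta_v + \sum_{u \,:\, \text{child of } v} \delta_u = \lfloor Y'_v \rfloor - \sum_{u \,:\, \text{child of } v} \lfloor Y'_u \rfloor \in \mathbb{Z}_{\ge 0}.$$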

Also, for any vertex we have . That is because for leaves and for the remaining vertices , since so that . Consequently, for any , setting , we get

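$$\sum_{v \in U'} y''_v = \sum_{v \in U'} \Bigl( y'_v - \delta_v + \sum_{u \,:\, \text{child of } v} \delta_u \Bigr) = \sum_{v \in U'} (y'_v - \delta_v) + \sum_{u \,:\, p(u) \in U'} \delta_u \ge \sum_{v \in U} (y'_v - \delta_v) + \sum_{u \in U} \delta_u = \sum_{v \in U} y'_v,$$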

since for any . Moreover is a constant, so this inequality proves condition 2 of Definition 8, and thus is indeed a distance-1 transfer of . By Fact 11 this defines a distance-10 transfer of , which composed with the previous transfer using Fact 10 gives an integral distance-12 transfer of . Consequently, repeating the proof of Theorem 14 we get the following result.

###### Theorem 17.

The Capacitated k-supplier with Outliers problem with uniform soft capacities admits a 13-approximation algorithm.
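The rounding on the rooted tree used in the argument above can be sketched in Python. This is a minimal illustration under stated assumptions: the amount δ_v pushed from v to its parent is taken to be the fractional part of the subtree sum Y'_v, which is what the displayed identity for y''_v requires, and the function name `round_on_tree` and its input encoding (a parent map plus exact `Fraction` values) are hypothetical.

```python
from fractions import Fraction
from math import floor
from typing import Dict, Hashable, List, Optional, Tuple


def round_on_tree(
    parent: Dict[Hashable, Optional[Hashable]],
    y: Dict[Hashable, Fraction],
) -> Tuple[Dict[Hashable, Fraction], Dict[Hashable, Fraction]]:
    """Return (delta, y_rounded) for a rooted tree given by `parent`
    (the root maps to None) and fractional values `y` with integral total."""
    children: Dict[Hashable, List[Hashable]] = {v: [] for v in parent}
    root = None
    for v, p in parent.items():
        if p is None:
            root = v
        else:
            children[p].append(v)

    # Pre-order traversal; its reverse visits children before parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    # Y[v] is the sum of y over the subtree rooted at v.
    Y: Dict[Hashable, Fraction] = {}
    for v in reversed(order):
        Y[v] = y[v] + sum((Y[u] for u in children[v]), Fraction(0))
    if Y[root].denominator != 1:
        raise ValueError("the total value must be an integer")

    # Assumption: delta_v is the fractional part of Y[v]; it is 0 at the root.
    delta = {v: Y[v] - floor(Y[v]) for v in parent}
    y_rounded = {
        v: y[v] - delta[v] + sum((delta[u] for u in children[v]), Fraction(0))
        for v in parent
    }
    return delta, y_rounded
```

For a root `r` with value 1 and two leaf children with values 1/2 and 3/2, each leaf pushes 1/2 to the root, so the rounded values are 2 at the root and 0 and 1 at the leaves. The total stays 3 and every value moves along at most one tree edge.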

## Appendix B Equivalence of Capacitated k-supplier with Outliers and Capacitated k-center with Outliers

###### Theorem 18.

Assume there exists an -approximation algorithm for Capacitated k-center with Outliers. Then there exists an -approximation algorithm for Capacitated k-supplier with Outliers.

###### Proof.

Let us consider an instance of Capacitated k-supplier with Outliers. Define an instance of Capacitated k-center with Outliers as follows: take where , and for every set . Other values of are taken as the symmetric, transitive closure of those determined explicitly (note that since was symmetric and satisfied the triangle inequality, the closure does not modify any explicitly set value of ). Also, set for , for , , and . Clearly can be constructed in polynomial time from . Thus, it suffices to show that a distance- solution exists in if and only if a distance- solution exists in .

One direction is very simple: assume is a distance- solution in . Observe that defined as for is a distance- solution in .

Now, let us prove the other implication. The construction is similar to the one in the proof of Lemma 9. Assume is a distance- solution in . Note that may contain vertices from . Construct a bipartite graph with if , and modify to obtain by removing vertices from and replicating each according to its capacity, i.e. times. Note that , so a cardinality- matching in gives a distance- solution to . Observe that for any and , it holds that . Consequently, for any we have the following inequality

Therefore

$$\sum_{u \in F \,:\, d(u,U) \le r} L(u) > |U| - |C| + p - 1.$$

Both sides of this inequality are integral, which implies

$$\sum_{u \in F \,:\, d(u,U) \le r} L(u) \ge |U| - |C| + p$$

and, by the deficit version of Hall’s theorem, also guarantees the existence of a cardinality- matching in and a distance- solution to . ∎
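The existence statement obtained from Hall’s theorem can also be checked directly on a concrete instance by computing a maximum matching between clients and capacity-expanded facilities. The following Python sketch uses the standard augmenting-path method. It is only an illustration: the names `max_capacitated_matching` and `has_cardinality_p_matching` and the adjacency-list encoding are hypothetical, with `adj[c]` assumed to list the facilities within the allowed distance of client `c`.

```python
from typing import Dict, Hashable, List, Set, Tuple

Slot = Tuple[Hashable, int]  # (facility, index of one unit of its capacity)


def max_capacitated_matching(
    adj: Dict[Hashable, List[Hashable]],
    capacity: Dict[Hashable, int],
) -> int:
    """Maximum number of clients that can be assigned to adjacent facilities
    without exceeding any facility's capacity."""
    # Replicate each facility once per unit of capacity.
    slots_of = {f: [(f, i) for i in range(c)] for f, c in capacity.items()}
    client_of_slot: Dict[Slot, Hashable] = {}

    def try_assign(client: Hashable, visited: Set[Slot]) -> bool:
        # Search for an augmenting path starting at `client`.
        for f in adj[client]:
            for slot in slots_of.get(f, []):
                if slot in visited:
                    continue
                visited.add(slot)
                if slot not in client_of_slot or try_assign(
                    client_of_slot[slot], visited
                ):
                    client_of_slot[slot] = client
                    return True
        return False

    return sum(1 for client in adj if try_assign(client, set()))


def has_cardinality_p_matching(
    adj: Dict[Hashable, List[Hashable]],
    capacity: Dict[Hashable, int],
    p: int,
) -> bool:
    """True if at least p clients can be served simultaneously."""
    return max_capacitated_matching(adj, capacity) >= p
```

The recursive search is adequate for small instances. For large inputs an iterative or flow-based implementation would be the safer choice.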

## Appendix C Connected instance with arbitrarily large integrality gap

###### Fact 19.

For arbitrarily large there is a graphic instance of Capacitated k-supplier with Outliers and a set , such that all conditions of Lemma 7 except (iii) are satisfied, but does not have a distance- solution.

###### Proof.

Assume and fix . Let consist of the following components (see also Figure 3): a path of vertices with endpoints and inner vertices alternately in and , four vertices (), with adjacent to , and vertices (), with adjacent both to and . For each we set , moreover and . The set is defined as .

Observe that an instance constructed this way satisfies the conditions of Lemma 7 except (iii): clearly is connected, . Consider a solution of with the following non-zero coordinates: for , for , . It is easy to verify that it is a feasible solution.

It remains to show that does not have a distance- solution. For a proof by contradiction, assume that it does, with being the set of open facilities and being the set of clients served. Note that each must serve clients, since and for . Let and for . Observe that and the sum is disjoint. Consequently and for some . However, does not contain for any , so , a contradiction. ∎