Constant Factor Approximation for Capacitated $k$-Center with Outliers
(Footnote 1: This work is partially supported by Foundation for Polish Science grant HOMING PLUS/2012-6/2.)
Abstract
The $k$-center problem is a classic facility location problem, where given an edge-weighted graph $G = (V, E)$ one is to find a subset $S$ of $k$ vertices, such that each vertex in $V$ is “close” to some vertex in $S$. The approximation status of this basic problem is well understood, as a simple 2-approximation algorithm is known to be tight. Consequently different extensions were studied.
In the capacitated version of the problem each vertex is assigned a capacity, which is a strict upper bound on the number of clients a facility can serve, when located at this vertex. A constant factor approximation for the capacitated $k$-center was obtained last year by Cygan, Hajiaghayi and Khuller [FOCS’12], which was recently improved to a 9-approximation by An, Bhaskara and Svensson [arXiv’13].
In a different generalization of the problem some clients (denoted as outliers) may be disregarded. Here we are additionally given an integer $p$ and the goal is to serve exactly $p$ clients, which the algorithm is free to choose. In 2001 Charikar et al. [SODA’01] presented a 3-approximation for the $k$-center problem with outliers.
In this paper we consider a common generalization of the two extensions previously studied separately, i.e. we work with the capacitated $k$-center with outliers. We present the first constant factor approximation algorithm, with approximation ratio of 25, even for the case of non-uniform hard capacities.
1 Introduction
The $k$-center problem is a classic facility location problem and is defined as follows: given a finite set $V$ and a symmetric distance (cost) function $d : V \times V \to \mathbb{R}_{\ge 0}$ satisfying the triangle inequality, find a subset $S \subseteq V$ of size $k$ such that each vertex in $V$ is “close” to some vertex in $S$. More formally, once we choose $S$ the objective function to be minimized is $\max_{v \in V} \min_{u \in S} d(v, u)$. The vertices of $S$ are called centers or facilities. The problem is known to be NP-hard [12]. Approximation algorithms for the $k$-center problem have been well studied and are known to be optimal [13, 15, 16, 17].
In the capacitated setting, studied for twenty years already, we are additionally given a capacity function $L : V \to \mathbb{Z}_{\ge 0}$ and no more than $L(u)$ vertices (called clients) may be assigned to a chosen center at $u \in V$. For the special case when all the capacities are identical (denoted as the uniform case), a 6-approximation was developed by Khuller and Sussmann [19], improving the previous bound of 10 by Bar-Ilan, Kortsarz and Peleg [4]. In the soft capacities version, in contrast to the standard (hard capacities) one, we are allowed to open several facilities in a single location, i.e. the facilities may form a multiset. For the uniform soft capacities version the best known approximation ratio equals 5 [19]. For general hard capacities a constant factor approximation has been obtained only recently [11], somewhat surprisingly by using LP rounding. It was followed by a cleaner and simpler approach of An, Bhaskara and Svensson [1], who gave a 9-approximation algorithm. From the hardness perspective a $(3 - \epsilon)$ lower bound on the approximation ratio is known [9, 11].
Another natural direction in generalizing the problem is an assumption that instead of serving all the clients we are given an integer $p$ and we are to select exactly $p$ clients to serve. The disregarded clients are in the literature called outliers. The $k$-center problem with outliers admits a 3-approximation algorithm, which was obtained by Charikar et al. [8].
In this article we study a common generalization of the two mentioned variants of the $k$-center problem, i.e. involving both capacities and outliers. In order to simplify our algorithms we work with a slight generalization, the Capacitated $k$-supplier with Outliers problem, where vertices are either clients or potential facility locations. These vertices may coincide, so that one may have both a client and a potential facility location at the same point, as in $k$-center. Below we give the formal problem definition.
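Capacitated $k$-supplier with Outliers
Input: Integers $k, p \in \mathbb{Z}_{\ge 0}$, finite sets $C$ and $F$, a symmetric distance (cost) function $d : (C \cup F) \times (C \cup F) \to \mathbb{R}_{\ge 0}$ satisfying the triangle inequality, and a capacity function $L : F \to \mathbb{Z}_{\ge 0}$.
Find: Sets $C' \subseteq C$, $F' \subseteq F$, and a function $\varphi : C' \to F'$ satisfying $|C'| = p$, $|F'| = k$, and $|\varphi^{-1}(u)| \le L(u)$ for each $u \in F'$.
Minimize: $\max_{v \in C'} d(v, \varphi(v))$.
Again, in the soft capacities version, $F'$ is allowed to be a multiset, and in the uniform capacities version, the capacity function $L$ is constant.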
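To make the problem definition concrete, the following minimal Python sketch validates a candidate solution (a set of open facilities and an assignment of served clients) and returns its cost. The function name `solution_cost`, its argument layout and the toy instance on a line are illustrative assumptions, not part of the paper.

```python
from collections import Counter


def solution_cost(k, p, clients, facilities, dist, capacity, opened, assignment):
    """Validate a hard-capacity solution and return its maximum service distance.

    opened     -- the set of open facilities (hypothetical argument name)
    assignment -- dict mapping every served client to the facility serving it
    Raises ValueError if any constraint of the problem is violated.
    """
    opened = set(opened)
    if not opened <= set(facilities) or len(opened) != k:
        raise ValueError("exactly k facilities must be opened")
    if not set(assignment) <= set(clients) or len(assignment) != p:
        raise ValueError("exactly p clients must be served")
    load = Counter(assignment.values())
    for u, served in load.items():
        if u not in opened:
            raise ValueError(f"client assigned to closed facility {u!r}")
        if served > capacity[u]:
            raise ValueError(f"capacity of {u!r} exceeded")
    return max((dist(v, u) for v, u in assignment.items()), default=0)


# Toy instance on a line: every client and facility is identified with a position.
position = {"c1": 0, "c2": 1, "c3": 10, "a": 0, "b": 9}
line_dist = lambda x, y: abs(position[x] - position[y])
cost = solution_cost(
    k=1, p=2,
    clients=["c1", "c2", "c3"], facilities=["a", "b"],
    dist=line_dist, capacity={"a": 2, "b": 1},
    opened={"a"}, assignment={"c1": "a", "c2": "a"},
)
```

In the toy instance the client `c3` plays the role of an outlier: it is simply left out of the assignment.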
Existence of an $r$-approximation algorithm for Capacitated $k$-center with Outliers can be shown to be equivalent to existence of an $r$-approximation algorithm for Capacitated $k$-supplier with Outliers (see Appendix B). Interestingly, such an equivalence is not known to hold if we do not allow outliers: the best known approximation factor for the Capacitated $k$-supplier is 11 while for the Capacitated $k$-center it is 9, see [1].
1.1 Our results and organization of the paper
The following is the main result of this paper.
Theorem 1.
The Capacitated $k$-supplier with Outliers problem, both in hard and soft capacities version, admits a 25-approximation algorithm. The hard uniform capacities version admits a 23-approximation, and the soft uniform capacities version admits a 13-approximation.
Note that taking $C = F = V$ shows that the $k$-supplier problem generalizes the $k$-center problem, and consequently gives the same approximation bounds for the latter.
Corollary 2.
The Capacitated $k$-center with Outliers problem, both in hard and soft capacities version, admits a 25-approximation algorithm. The hard uniform capacities version admits a 23-approximation, and the soft uniform capacities version admits a 13-approximation.
It is worth noting that the already known approximation algorithm for the $k$-center problem with outliers relies on the fact that a single vertex can serve all the clients that are its neighbors, i.e. there are no capacity constraints. At the same time the previous approximation algorithms for the capacitated $k$-center problem (both in the uniform and non-uniform case) heavily used the fact that each vertex of the graph is close to some center in any solution. For this reason it was possible to create a path-like [11] or tree-like [1] structure with integrally opened non-leaf vertices, which was the crux of the rounding process. Consequently none of the algorithms for the two previously independently studied extensions of the basic problem, i.e. capacities and outliers, works for the problem we are interested in.
The first step of our algorithm (Section 3) is the standard thresholding technique, where we reduce a general metric to a distance metric of an unweighted graph. In Section 4 we introduce our main conceptual contribution, i.e. the notion of a skeleton. A skeleton is a set $S$ of vertices for which there exists an optimum solution such that each vertex of $S$ can be injectively mapped to a nearby facility opened by that solution, and moreover each facility opened by that solution is close to some vertex of $S$. Intuitively a skeleton is not yet a solution, but it looks similar to at least one optimum solution. If no outliers are allowed, any inclusion-wise maximal subset of $F$ with vertices far enough from each other is a skeleton. In [11] and [1], such a set is then mapped to non-leaf vertices of the structure steering the rounding process. We use a skeleton in a similar way, but before we are able to do that, we need to bound the integrality gap. Without outliers, it was sufficient to take the standard LP relaxation and decompose the graph into connected components. Although with outliers this is no longer the case, as shown in Section 5, a skeleton lets us both strengthen the LP relaxation, adding an appropriate constraint, and obtain a more granular decomposition of the initial instance into several subinstances, for which the strengthened LP relaxation is feasible and has bounded integrality gap. Further, in Section 6 we show how each of these smaller instances can be independently rounded using tools previously applied for the capacitated setting [1]. (Footnote 2: The final rounding step can also be done using the path-like structures notion of [11], however we use the ideas of [1] as it allows a cleaner presentation.) Section 7 contains a wrap-up of the whole algorithm. The improvements in the approximation ratio when soft or uniform capacities are considered are presented in Appendix A.
1.2 Related facility location work
The facility location problem is a central problem in operations research and computer science and has been a testbed for many new algorithmic ideas, resulting in a number of different approximation algorithms. In this problem, given a metric (via a weighted graph $G$), a set of nodes called clients, and opening costs on some nodes called facilities, the goal is to open a subset of facilities such that the sum of their opening costs and connection costs of clients to their nearest open facilities is minimized. Up to now, the best known approximation ratio is 1.488, due to Li [21] who used a randomized selection in Byrka’s algorithm [6]. Guha and Khuller [14] showed that this problem is hard to approximate within a factor better than 1.463, assuming $\mathrm{NP} \not\subseteq \mathrm{DTIME}(n^{O(\log \log n)})$.
When the facilities have capacities, the problem is called the capacitated facility location problem. It has also received a great deal of attention in recent years. Two main variants of the problem are soft-capacitated facility location and hard-capacitated facility location: in the latter problem, each facility is either opened at some location or not, whereas in the former, one may specify any integer number of facilities to be opened at that location. Soft capacities make the problem easier and by modifying approximation algorithms for the uncapacitated problems, we can also handle this case [23, 18]. To the best of our knowledge all the existing constant-factor approximation algorithms for the general case of hard-capacitated facility location are local search based, and the most recent of them is the 5-approximation algorithm of Bansal, Garg and Gupta [3]. The only LP-relaxation based approach for this problem is due to Levi, Shmoys and Swamy [20] who gave a 5-approximation algorithm for the special case in which all facility opening costs are equal (otherwise the LP does not have a constant integrality gap). Obtaining an LP-based constant factor approximation algorithm for capacitated facility location is considered a major open problem in approximation algorithms [24].
A problem very close to both facility location and $k$-center is the $k$-median problem, in which we want to open at most $k$ facilities and the goal is to minimize the sum of connection costs of clients to their nearest open facilities. Very recently Li and Svensson [22] obtained an LP rounding $(1 + \sqrt{3} + \epsilon)$-approximation algorithm, improving upon the previously best $(3 + \epsilon)$-approximation local search algorithm of Arya et al. [2]. Unfortunately obtaining a constant factor approximation algorithm for capacitated $k$-median still remains open despite consistent effort. The only previous attempts with constant approximation factors for this problem violate the capacities within a constant factor for the uniform capacity case [7] and the non-uniform capacity case [10], or exceed the number of facilities by a constant factor [5].
2 Preliminaries
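For a fixed instance of the Capacitated $k$-supplier with Outliers, we call $(C', F', \varphi)$ a solution if it satisfies the required conditions. We often identify the solution by $\varphi$ only (considering it as a partial function from $C$ to $F$), using $C_\varphi$ and $F_\varphi$ to refer to the other elements of the triple. If $\varphi$ satisfies $d(v, \varphi(v)) \le r$ for each $v \in C_\varphi$, we say that $\varphi$ is a distance-$r$ solution.
Let $G = (V, E)$ be an undirected graph. By $d_G$ we denote the metric defined by $G$. For sets $A, B \subseteq V$ we define $d_G(A, B) = \min_{a \in A, b \in B} d_G(a, b)$. If $A = \{a\}$ we write $d_G(a, B)$ instead of $d_G(\{a\}, B)$.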
For a vertex and an integer we denote and . We omit the superscript for and the subscript if there is no confusion which graph we refer to.
For a set and an element by we denote .
3 Reduction to graphic instances
As usual when working with a min-max problem we start with the standard thresholding argument, i.e. reduce a general metric function to a metric defined by an unweighted graph.
We say that an instance of the $k$-supplier problem is graphic, if $d$ is defined as the distance function of an unweighted bipartite graph $G = (C \cup F, E)$, and the goal is to find a distance-1 solution. An $r$-approximation algorithm is then allowed to either give a distance-$r$ solution, or, only if it finds out that no distance-1 solution exists, a NO answer.
Below we show how to build an $r$-approximation algorithm for Capacitated $k$-supplier with Outliers given an $r$-approximation (in the aforementioned sense) for the graphic instances. The reduction considers the client-facility distances in increasing order and, in the phase corresponding to a threshold $\tau$, runs the algorithm for graphic instances on the graph $G_\tau$ joining $v \in C$ and $u \in F$ iff $d(v, u) \le \tau$. Correctness of the reduction is standard. If an optimal solution exists, then its value $\mathrm{OPT}$ belongs to $\{d(v, u) : v \in C, u \in F\}$. In particular, in the phase corresponding to $\tau = \mathrm{OPT}$, there is a distance-1 solution in $G_\tau$. Thus the algorithm for graphic instances is required to find a solution. Therefore it returns a solution $\varphi$ for the first time in a phase corresponding to some $\tau \le \mathrm{OPT}$. Since $d \le \tau \cdot d_{G_\tau}$, $\varphi$ is a distance-$(r\tau)$ solution, hence also a distance-$(r \cdot \mathrm{OPT})$ solution.
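The thresholding loop described above can be sketched as follows. This is only an illustration under stated assumptions: `solve_graphic` is a hypothetical oracle standing for an approximation algorithm for graphic instances (it receives the adjacency lists of the threshold graph and returns a solution or `None` for a NO answer), and `toy_solver` is a stand-in that ignores capacities and outliers.

```python
def threshold_reduction(clients, facilities, dist, solve_graphic):
    """Try candidate thresholds in increasing order; return (threshold, solution) or None."""
    thresholds = sorted({dist(v, u) for v in clients for u in facilities})
    for tau in thresholds:
        # Threshold graph: client v is adjacent to facility u iff d(v, u) <= tau.
        adj = {v: [u for u in facilities if dist(v, u) <= tau] for v in clients}
        solution = solve_graphic(adj)
        if solution is not None:
            # With a correct oracle the first success happens at some tau <= OPT.
            return tau, solution
    return None


def toy_solver(adj):
    """Stand-in oracle: succeeds iff every client has at least one neighbour."""
    if all(adj[v] for v in adj):
        return {v: adj[v][0] for v in adj}
    return None


position = {"c1": 0, "c2": 4, "a": 1, "b": 7}
dist = lambda x, y: abs(position[x] - position[y])
result = threshold_reduction(["c1", "c2"], ["a", "b"], dist, toy_solver)
```

Only client-facility distances are tried as thresholds, since the optimum value is always one of them.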
4 Finding a skeleton
From now on we work with graphic instances only. Without loss of generality we may assume that $L(u) \le \deg(u)$ for each $u \in F$, where $\deg(u)$ is the number of neighbors of $u$. Indeed, setting $L(u) := \min(L(u), \deg(u))$ has no influence on distance-1 solutions, while no additional distance-$r$ solutions are created.
The first phase of the algorithm outputs several subsets of $F$. If a distance-1 solution exists, at least one of them resembles (in a certain sense, to be defined later) a distance-1 solution and can be successfully used by the subsequent phases as a hint for constructing a distance-$r$ solution. We formalize the features of a good hint in the following definition.
Definition 3.
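A set $S \subseteq F$ is called a skeleton if
1. (separation property) $d(s, s') \ge 6$ for any $s, s' \in S$, $s \ne s'$,
2. there exists a distance-1 solution $\varphi$ such that:
(a) (covering property) $d(u, S) \le 4$ for each $u \in F_\varphi$,
(b) (injection property) there exists an injection $f : S \to F_\varphi$ satisfying $d(s, f(s)) \le 2$ for each $s \in S$.
If just separation and injection properties are satisfied, we call $S$ a preskeleton.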
In other words a skeleton is a set $S$, each vertex of which can be injectively mapped to a vertex of a distance-1 solution $\varphi$, and at the same time no two vertices of $S$ are close and the set of vertices within distance 4 of $S$ contains the whole set $F_\varphi$.
Note that the separation property implies that the sets $\{u \in F : d(s, u) \le 2\}$ are pairwise disjoint for $s \in S$, hence any function $f : S \to F_\varphi$ satisfying $d(s, f(s)) \le 2$ is in fact an injection, however we make it explicit for the sake of presentation.
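The separation property is the only part of the definition that can be checked without knowing a solution. A minimal sketch of such a check on an unweighted graph given by adjacency lists is shown below; the function names and the default bound `min_dist=6` (the separation constant assumed here) are illustrative choices.

```python
from collections import deque


def bfs_dist(adj, source):
    """Breadth-first search distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def is_separated(adj, S, min_dist=6):
    """Return True iff every two distinct vertices of S are at distance >= min_dist."""
    S = list(S)
    for i, s in enumerate(S):
        dist = bfs_dist(adj, s)
        for t in S[i + 1:]:
            # Unreachable vertices are treated as infinitely far apart.
            if dist.get(t, float("inf")) < min_dist:
                return False
    return True


# Path 0 - 1 - ... - 6; think of even vertices as facilities and odd ones as clients.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
```

The covering and injection properties refer to an unknown distance-1 solution, so they cannot be verified this way.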
Lemma 4.
Let $S$ be a preskeleton and let $U = \{u \in F : d(u, S) \ge 6\}$. Then $S$ is a skeleton, or $U \ne \emptyset$ and $S \cup \{s\}$ is a preskeleton, where $s$ is a highest-capacity vertex of $U$.
Proof.
Let $\varphi$ be a distance-1 solution, which witnesses $S$ being a preskeleton, where $f : S \to F_\varphi$ satisfies the injection property. If $\varphi$ witnesses $S$ being a skeleton, we are done. Otherwise the covering property is not satisfied, hence there exists $u \in F_\varphi$ such that $d(u, S) > 4$. Since $d$ is a distance function of a bipartite graph, this implies $d(u, S) \ge 6$, so $u \in U$. If $\varphi$ opens some facility $u'$ with $d(s, u') \le 2$, then $\varphi$ already witnesses $S \cup \{s\}$ being a preskeleton, as one can extend the injection $f$ by mapping $s$ to $u'$. Therefore, we may assume that $\varphi$ opens no facility within distance 2 of $s$. In particular, this means that the clients adjacent to $s$ are not served by any facility of $F_\varphi$.
Let us modify $\varphi$ to obtain $\varphi'$ as follows: close the facility in $u$, opening one in $s$ instead. Let $\ell$ be the number of clients assigned to $u$ in $\varphi$. No longer serve these, instead serve any $\ell$ neighbors of $s$ (as we have observed before, they are not served in $\varphi$). Note that $\ell \le L(u) \le L(s)$ by the choice of $s$ maximizing the capacity, and $L(s) \le \deg(s)$ by the assumption of the capacity being bounded by the number of neighbors. Consequently, there are enough neighbors of $s$ to serve, and the capacity constraint for $s$ is satisfied. Moreover, the number of open facilities and the number of served clients are preserved. Other open facilities remain unchanged, so $\varphi'$ satisfies the capacity and distance constraints for them, and therefore $\varphi'$ is a distance-1 solution. Finally, consider the function $f' = f \cup \{(s, s)\}$. As $s$ is at distance at least 6 from $S$, by the injection property for $f$ we know that $s$ does not belong to the image of $f$, hence $f'$ is an injection. Consequently $\varphi'$ and $f'$ ensure that $S \cup \{s\}$ satisfies the injection property. Moreover $s$ is far from $S$, hence $S \cup \{s\}$ is a preskeleton. ∎
With $\emptyset$ being trivially a preskeleton provided that any distance-1 solution exists, Lemma 4 lets us generate a sequence of sets which contains a skeleton (see Algorithm 2). Note that any skeleton, by the injection property, is of size at most $k$.
Lemma 5.
If there exists a distance-1 solution, there is at least one skeleton among the sets output by Algorithm 2.
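The listing of Algorithm 2 is not reproduced in this text, so the following Python sketch is only an assumption about its shape, derived from Lemma 4: start from the empty set and repeatedly add a highest-capacity facility that is far from the current set, recording every intermediate set as a candidate. The names `skeleton_candidates` and `dist_to_set`, and the default `min_dist=6`, are illustrative.

```python
from collections import deque


def dist_to_set(adj, sources):
    """Multi-source BFS: distance from every reachable vertex to the nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def skeleton_candidates(adj, facilities, capacity, k, min_dist=6):
    """Return the sequence of candidate sets, starting with the empty set."""
    S, candidates = [], [[]]
    while len(S) < k:                      # a skeleton has at most k vertices
        near = dist_to_set(adj, S)
        far = [u for u in facilities if near.get(u, float("inf")) >= min_dist]
        if not far:
            break
        S.append(max(far, key=lambda u: capacity[u]))
        candidates.append(list(S))
    return candidates


# Path 0 - 1 - ... - 12 with facilities at even positions.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 12] for i in range(13)}
facilities = list(range(0, 13, 2))
capacity = {0: 1, 2: 1, 4: 1, 6: 5, 8: 1, 10: 1, 12: 3}
candidates = skeleton_candidates(path, facilities, capacity, k=3)
```

Each candidate is then passed to the later phases; by Lemma 5 at least one of them is a skeleton whenever a distance-1 solution exists.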
5 Clustering
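For a set $S \subseteq F$ define the following linear program $\mathrm{LP}_{k,p}(G, S)$, where a variable $y_u$ for $u \in F$ denotes whether we open a facility in $u$ or not, while a variable $x_{uv}$ for $u \in F$, $v \in C$ corresponds to whether $u$ serves $v$ or not.
$$\begin{aligned}
\sum_{u \in F} y_u &= k && && (1)\\
\sum_{u \in F} \sum_{v \in C} x_{uv} &= p && && (2)\\
x_{uv} &\le y_u && \text{for each } u \in F,\ v \in C && (3)\\
\sum_{v \in C} x_{uv} &\le L(u)\, y_u && \text{for each } u \in F && (4)\\
\sum_{u \in F} x_{uv} &\le 1 && \text{for each } v \in C && (5)\\
\sum_{u \in F :\, d(s, u) \le 2} y_u &\ge 1 && \text{for each } s \in S && (6)\\
x_{uv} &= 0 && \text{for each } u \in F,\ v \in C \text{ such that } d(u, v) > 1 && (7)\\
0 \le x,\ y &\le 1 && && (8)
\end{aligned}$$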
Constraints (1)–(5) and (7) are the standard constraints for Capacitated $k$-supplier with Outliers, ensuring that we open exactly $k$ facilities (1), serve exactly $p$ clients (2), obey capacity constraints (3)–(5), and serve clients which are close to facilities (7).
Observe that if $S$ is a skeleton and a distance-1 solution $\varphi$ witnesses that fact, we get a feasible solution of $\mathrm{LP}_{k,p}(G, S)$ by setting $y_u = 1$ iff $u \in F_\varphi$, and $x_{uv} = 1$ iff $v \in C_\varphi$ and $\varphi(v) = u$. Indeed the injection property ensures that constraint (6) is satisfied. However, as usual in a capacitated problem with hard constraints, the integrality gap of this LP is unbounded. Similarly to the standard capacitated $k$-center [11], this issue is addressed by considering the connected components of $G$ separately. When all the clients need to be served, having a connected graph with a feasible solution of the standard LP is enough to round it [1, 11]. However, if we allow outliers, there are still connected instances with arbitrarily large integrality gap (a simple construction is presented in Appendix C). For this reason we use the additional constraint (6) together with the assumption that all the vertices are close to $S$. This way we crucially exploit the covering, injection and separation properties of a skeleton.
In the following we shall prove that any instance with a skeleton can be decomposed into several smaller instances with additional properties. In the next section we will show how to round the obtained smaller instances.
Lemma 6.
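Let $S \subseteq F$, let $G_1, \dots, G_m$ be the components of $G$ after all vertices $v$ with $d(v, S) > 5$ are removed and let $S_i = S \cap V(G_i)$ for $i = 1, \dots, m$.
If $S$ is a skeleton, then in polynomial time one can find partitions $k = k_1 + \dots + k_m$ and $p = p_1 + \dots + p_m$ such that $\mathrm{LP}_{k_i, p_i}(G_i, S_i)$ are all feasible.
Proof.
Observe that if $S$ is a skeleton, then a witness solution opens facilities at distance at most 4 from $S$, and thus serves clients at distance at most 5 from $S$. Consequently all vertices further from $S$ can be safely removed and $S$ remains a skeleton. Then $G$ might contain several connected components $G_1, \dots, G_m$ with $S_i = S \cap V(G_i)$. The witness solution can be partitioned among these components so that we get assignments which in total open $k$ facilities to serve $p$ clients. In particular, this means that for some partitions $k = k_1 + \dots + k_m$ and $p = p_1 + \dots + p_m$ the sets $S_i$ are skeletons in the respective subinstances, and consequently $\mathrm{LP}_{k_i, p_i}(G_i, S_i)$ are feasible. The latter condition can be tested efficiently for any values $k_i$ and $p_i$. While we cannot exhaustively test all partitions of $k$ and $p$, dynamic programming lets us find partitions such that these linear programs are feasible for each $i$.
For $0 \le i \le m$, $0 \le k' \le k$ and $0 \le p' \le p$ define a boolean value $T[i, k', p']$, which equals true iff there exist partitions $k' = k_1 + \dots + k_i$ and $p' = p_1 + \dots + p_i$ such that $\mathrm{LP}_{k_j, p_j}(G_j, S_j)$ are all feasible for $j = 1, \dots, i$.
Clearly $T[0, 0, 0]$ is true, while $T[0, k', p']$ is false for any other pair $(k', p')$. For $i \ge 1$ the value $T[i, k', p']$ is simply the disjunction of $T[i-1, k' - k_i, p' - p_i]$ over every pair $(k_i, p_i)$ such that $\mathrm{LP}_{k_i, p_i}(G_i, S_i)$ is feasible, $k_i \le k'$ and $p_i \le p'$. Thus in polynomial time one can check whether the desired partitions exist and, provided that together with a true value we also store the witness partitions, also find these partitions. ∎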
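The dynamic programming from the proof can be sketched as below. The LP feasibility test is abstracted into a hypothetical callback `feasible(i, k_i, p_i)` (in practice it would call an LP solver on the $i$-th component); the function name `partition_budgets` and the toy oracle are assumptions made for illustration. Components are numbered from 0 in the code.

```python
def partition_budgets(m, k, p, feasible):
    """Split k and p among m components so that every component is feasible.

    Returns a list [(k_0, p_0), ..., (k_{m-1}, p_{m-1})] or None if no split exists.
    """
    # reach maps a reachable pair (k', p') for the processed components
    # to one witness partition achieving it.
    reach = {(0, 0): []}
    for i in range(m):
        nxt = {}
        for (k_used, p_used), parts in reach.items():
            for k_i in range(k - k_used + 1):
                for p_i in range(p - p_used + 1):
                    key = (k_used + k_i, p_used + p_i)
                    if key not in nxt and feasible(i, k_i, p_i):
                        nxt[key] = parts + [(k_i, p_i)]
        reach = nxt
    return reach.get((k, p))


# Toy oracle: each component needs exactly one facility and can serve
# at most limits[i] clients.
limits = [2, 3]
toy_feasible = lambda i, k_i, p_i: k_i == 1 and p_i <= limits[i]
parts = partition_budgets(2, 2, 4, toy_feasible)
```

The table has $O(m \cdot k \cdot p)$ entries and each is filled by trying $O(k \cdot p)$ pairs, so the number of oracle calls is polynomial; caching the oracle results would avoid repeated LP solves.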
6 Rounding
In the previous section we have shown how given a skeleton one can partition the initial instance into smaller subinstances with more structural properties. Our main goal in this section is to show that those structural properties are in fact sufficient to construct a solution for each of the subinstances, which is formalized in the following lemma.
Lemma 7.
Let $I = (G, L, k, p)$ be an instance of Capacitated $k$-supplier with Outliers and let $S \subseteq F$. If the following four conditions are satisfied:
(i) $G$ is connected,
(ii) for any $s, s' \in S$, $s \ne s'$, we have $d(s, s') \ge 6$,
(iii) $d(v, S) \le 5$ for every vertex $v$ of $G$,
(iv) $\mathrm{LP}_{k,p}(G, S)$ admits a feasible solution,
then one can find a distance-25 solution for $I$ in polynomial time.
Before we give a proof of Lemma 7, in Section 6.1 we recall (an adjusted version of) a distance-$r$ transfer, a very useful notion introduced in [1], together with its main properties. Next, in Section 6.2 we prove Lemma 7.
6.1 Distance transfer
Definition 8.
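Given a graph $G = (V, E)$ with $F \subseteq V$, a capacity function $L : F \to \mathbb{Z}_{\ge 0}$ and $y \in \mathbb{R}_{\ge 0}^F$, a vector $y' \in \mathbb{R}_{\ge 0}^F$ is a distance-$r$ transfer of $(G, L, y)$ if
1. $\sum_{v \in F} y'_v = \sum_{v \in F} y_v$ and
2. $\sum_{v \in F :\, d_G(v, U) \le r} L(v)\, y'_v \ge \sum_{u \in U} L(u)\, y_u$ for all $U \subseteq F$.
If $y'$ is a characteristic vector of $F' \subseteq F$, we say that $F'$ is an integral distance-$r$ transfer of $(G, L, y)$.
Less formally a distance-$r$ transfer is a reassignment, where the sum of variables is preserved and locally for any set $U$ the total fractional capacity in a small neighborhood of $U$ does not decrease.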
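For intuition, the two conditions of a distance-$r$ transfer can be checked by brute force on tiny graphs. The sketch below assumes the reading of the definition in which the total opening is preserved and, for every vertex set $U$, the capacity-weighted new opening within distance $r$ of $U$ is at least the capacity-weighted old opening on $U$. It enumerates all subsets, so it is exponential and meant purely as an illustration; all names are hypothetical and every vertex is treated as a facility.

```python
from collections import deque
from itertools import combinations


def bfs_dist(adj, source):
    """Breadth-first search distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def is_distance_transfer(adj, capacity, y, y_new, r, eps=1e-9):
    """Brute-force check that y_new is a distance-r transfer of y."""
    nodes = list(adj)
    if abs(sum(y_new[v] for v in nodes) - sum(y[v] for v in nodes)) > eps:
        return False                              # condition 1 fails
    dist = {v: bfs_dist(adj, v) for v in nodes}
    for size in range(1, len(nodes) + 1):
        for U in combinations(nodes, size):
            near = [v for v in nodes
                    if any(dist[u].get(v, float("inf")) <= r for u in U)]
            moved = sum(capacity[v] * y_new[v] for v in near)
            needed = sum(capacity[u] * y[u] for u in U)
            if moved + eps < needed:
                return False                      # condition 2 fails for U
    return True


# Path a - b - c: half a unit at each endpoint is merged into the middle vertex.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
capacity = {"a": 1, "b": 2, "c": 1}
y = {"a": 0.5, "b": 0.0, "c": 0.5}
y_new = {"a": 0.0, "b": 1.0, "c": 0.0}
```

In this toy example moving the mass to the middle vertex is a distance-1 transfer but not a distance-0 transfer.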
Like in [1], an integral distance-$r$ transfer of the fractional solution of the LP already gives a distance-$(r+1)$ solution (in particular point 2 of Definition 8 ensures that Hall’s condition is satisfied). The proof must be modified though, so that it encompasses outliers.
Lemma 9.
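Let $G = (C \cup F, E)$ be a bipartite graph with a capacity function $L$. Assume $(x, y)$ is a feasible solution of the linear program of Section 5 for $G$ and $F'$ is an integral distance-$r$ transfer of $(G, L, y)$. Then one can find a distance-$(r+1)$ solution in polynomial time.
Proof.
Consider a bipartite graph $G' = (C \cup F, E')$ with $vu \in E'$ if $d_G(v, u) \le r + 1$. Modify $G'$ to obtain $G''$ by removing the vertices of $F \setminus F'$ and duplicating each vertex $u \in F'$ to its capacity, i.e. $L(u)$ times, see also Fig. 1. Observe that cardinality-$p$ matchings in this graph correspond to distance-$(r+1)$ solutions for $G$. If any, such a matching can clearly be found in polynomial time. We shall prove its existence by checking the deficit version of Hall’s theorem, i.e. that for each $A \subseteq C$ we have
$$|N_{G''}(A)| \ge |A| - (|C| - p).$$
First, observe that, with $U \subseteq F$ denoting the set of facilities adjacent in $G$ to some client of $A$,
$$|N_{G''}(A)| = \sum_{u \in F' :\, d_G(u, A) \le r+1} L(u) \ge \sum_{u \in F' :\, d_G(u, U) \le r} L(u) \ge \sum_{u \in U} L(u)\, y_u \ge \sum_{u \in U} \sum_{v \in C} x_{uv} \ge \sum_{v \in A} \sum_{u \in F} x_{uv}.$$
Moreover
$$\sum_{v \in A} \sum_{u \in F} x_{uv} = p - \sum_{v \in C \setminus A} \sum_{u \in F} x_{uv} \ge p - |C \setminus A| = |A| - (|C| - p).$$
Together these bounds conclude the proof. ∎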
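Once the set of open facilities is fixed, the matching step of the proof can be carried out without explicitly duplicating vertices. The following sketch uses capacity-aware augmenting paths (a variant of Kuhn's bipartite matching algorithm), which is a substitute technique equivalent to matching in the duplicated graph, not necessarily the procedure intended by the authors. The argument `near` (for each client, the facilities within the allowed distance) and the function name are assumptions.

```python
def assign_clients(near, capacity, opened, p):
    """Serve p clients by open facilities respecting capacities, or return None.

    near[v] lists the facilities allowed to serve client v.
    """
    load = {u: [] for u in opened}        # clients currently assigned to u

    def augment(v, seen):
        for u in near[v]:
            if u not in load or u in seen:
                continue
            seen.add(u)
            if len(load[u]) < capacity[u]:
                load[u].append(v)         # free slot: take it
                return True
            for i, w in enumerate(load[u]):
                if augment(w, seen):      # w moved elsewhere, reuse its slot
                    load[u][i] = v
                    return True
        return False

    served = 0
    for v in near:
        if served < p and augment(v, set()):
            served += 1
    if served < p:
        return None
    return {v: u for u, vs in load.items() for v in vs}


near = {1: ["a"], 2: ["a", "b"], 3: ["a"]}
capacity = {"a": 2, "b": 1}
assignment = assign_clients(near, capacity, opened={"a", "b"}, p=3)
```

In the example client 2 is first placed at `a` and later re-routed to `b` so that client 3 can be served.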
We proceed with a pair of simple properties of transfers.
Fact 10.
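Let $G = (V, E)$ be a graph with $F \subseteq V$ and a capacity function $L$, and let $y \in \mathbb{R}_{\ge 0}^F$. Assume $y'$ is a distance-$r$ transfer of $(G, L, y)$ and $y''$ is a distance-$r'$ transfer of $(G, L, y')$. Then $y''$ is a distance-$(r + r')$ transfer of $(G, L, y)$.
Fact 11.
Let $G$ and $H$ be graphs with $F \subseteq V(G)$ and $F \subseteq V(H)$ and a capacity function $L$. Let $y \in \mathbb{R}_{\ge 0}^F$ and let $g$ be a monotonic function such that $d_G(u, v) \le g(d_H(u, v))$ for any $u, v \in F$. Assume $y'$ is a distance-$r$ transfer of $(H, L, y)$. Then $y'$ is a distance-$g(r)$ transfer of $(G, L, y)$.
The following is the main technical contribution of [1].
Lemma 12 ([1]).
Let $T$ be a tree with a capacity function $L$ and let $y$ be a vector such that $y_v = 1$ for every non-leaf $v$ and $\sum_v y_v \in \mathbb{Z}$. Then one can find in polynomial time an integral distance-2 transfer of $(T, L, y)$.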
6.2 Final rounding
Lemma 13.
Let $G = (V, E)$ be a connected bipartite graph and let $S \subseteq V$ be such that $d_G(v, S) \le 5$ for every $v \in V$. There exists an auxiliary tree $T_S$ on the vertex set $S$ such that $d_G(s, s') \le 10$ for any edge $ss'$ of $T_S$. Moreover, such a tree can be computed in polynomial time.
Proof.
We shall grow a tree adding a leaf in each step. At the beginning we select any $s \in S$ and initialize $T_S$ with a single-vertex tree. Assume we have already grown a tree with vertex set $S' \subsetneq S$. Choose a shortest path connecting $S'$ to $S \setminus S'$. Such a path exists since $G$ is connected. If its length is at most 10, we add the endpoint in $S \setminus S'$ to the tree, joining it with the other endpoint. For a proof by contradiction assume that a shortest path has length greater than 10. Since $G$ is bipartite, its length needs to be even, and thus at least 12. Choose the midpoint of such a path. Its distance both to $S'$ and to $S \setminus S'$ is at least 6, otherwise the path could be shortened. This vertex contradicts the assumption that $d_G(v, S) \le 5$ for every $v \in V$. ∎
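The constructive proof translates directly into a Prim-like procedure. The sketch below is an illustration with hypothetical names; it raises an error if the closest pair between the grown tree and the remaining vertices is further apart than `max_len`, which by the lemma cannot happen when every vertex is within distance 5 of the set.

```python
from collections import deque


def bfs_dist(adj, source):
    """Breadth-first search distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def grow_tree(adj, S, max_len=10):
    """Return tree edges on S, each joining vertices at graph distance <= max_len."""
    S = list(S)
    dist = {s: bfs_dist(adj, s) for s in S}
    in_tree, remaining, edges = [S[0]], set(S[1:]), []
    while remaining:
        # Closest pair (t, s) with t already in the tree and s not yet in it.
        t, s = min(((t, s) for t in in_tree for s in remaining),
                   key=lambda e: dist[e[0]].get(e[1], float("inf")))
        if dist[t].get(s, float("inf")) > max_len:
            raise ValueError("some vertex is too far from S")
        edges.append((t, s))
        in_tree.append(s)
        remaining.remove(s)
    return edges


# Path 0 - 1 - ... - 12: every vertex is within distance 3 of {0, 6, 12}.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 12] for i in range(13)}
edges = grow_tree(path, [0, 6, 12])
```

For clarity the sketch runs one BFS per vertex of the set; a single multi-source search per step would also do.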
We are ready to prove Lemma 7.
Proof of Lemma 7.
Since $G$ is connected and every vertex of $G$ is within distance 5 from $S$, we can use Lemma 13 to construct a tree $T_S$ on $S$. Let us add a duplicate $s'$ of every $s \in S$, adjacent to exactly the neighbors of $s$, to create a bipartite graph $G'$, and let $S' = \{s' : s \in S\}$. For each $s \in S$ choose $u_s$ to be a highest-capacity facility within distance 2 of $s$ and set $L(s') = L(u_s)$. Let us create a tree $T$ with $V(T) = S' \cup (F \setminus \{u_s : s \in S\})$. We build it in two steps, see also Fig. 2:
1. create a tree with vertex set $S'$ so that $s_1' s_2'$ is an edge iff $s_1 s_2 \in E(T_S)$,
2. connect each vertex in $F \setminus \{u_s : s \in S\}$ to the closest vertex in $S'$.
Observe that endpoints of the edges created in the first step are at distance at most 10 in $G'$, while endpoints of the edges created in the second step are at distance at most 4. Consequently, $d_{G'}(u, v) \le 10 \cdot d_T(u, v)$ for any $u, v \in V(T)$. Moreover, note that all non-leaves of $T$ belong to $S'$.
Let $(x, y)$ be a feasible solution of $\mathrm{LP}_{k,p}(G, S)$. Note that $y$ can be interpreted as a vector in $\mathbb{R}_{\ge 0}^{F \cup S'}$ by extending it with zeroes at $S'$. We shall give an integral distance-24 transfer $F''$ of $(G', L, y)$. Despite it being formally a transfer in $G'$, $F''$ will be a subset of $F$, i.e. a transfer of $(G, L, y)$ as well.
Recall that by (ii), the sets of facilities within distance 2 of distinct vertices of $S$ are pairwise disjoint and in particular the vertices $u_s$ are pairwise different. This lets us use (6) to gather in $s'$ one unit from the facilities within distance 2 of $s$, for every $s \in S$, so that the whole value in $u_s$ is transferred to $s'$. Note that $L(s') \ge L(u)$ for each facility $u$ within distance 2 of $s$, so this way we obtain a distance-2 transfer $y'$ of $(G', L, y)$. Additionally, we have made sure that $y'_{u_s} = 0$, so $y'$ can be interpreted as a vector indexed by $V(T)$, and that $y'_{s'} = 1$, so $y'$ is 1 for all non-leaves of $T$. This lets us use Lemma 12 to obtain an integral distance-2 transfer of $(T, L, y')$. According to Fact 11 it can be interpreted as a distance-20 transfer of $(G', L, y')$. Finally we move the value from $s'$ to $u_s$ for each $s \in S$. Note that these vertices have equal capacities, so this step can be interpreted as an integral distance-2 transfer.
The final transfer is therefore a composition of a distance-2 transfer, a distance-20 transfer and a distance-2 transfer. Thus, by Fact 10 it is a distance-24 transfer. (Footnote 3: A simpler construction gives a distance-30 transfer, without introducing the additional vertices $s'$. It is enough first to gather one unit around each $s \in S$ in $u_s$ and build a tree on the facilities in which adjacent vertices of the tree are at distance at most 14 in $G$. By using Lemma 12 one obtains a distance-28 transfer, which together with the initial distance-2 transfer gives an integral distance-30 transfer.) By Lemma 9 having an integral distance-24 transfer is enough to construct a distance-25 solution in polynomial time, which concludes the proof of Lemma 7. ∎
7 Wrapup
With the results of the previous sections, we are ready to prove the main theorem.
Theorem 14.
The Capacitated $k$-supplier with Outliers problem admits a 25-approximation algorithm.
Proof.
Section 3 with Algorithm 1 provides a (Turing-like) reduction to graphic instances. Algorithm 2 of Section 4, given such an instance, outputs several sets. Provided that a distance-1 solution exists, one of them is guaranteed to be a skeleton. Each of these sets is then processed separately. As described in Section 5, some redundant vertices are removed and the graph is partitioned into connected components. Dynamic programming (Lemma 6) is then used to find a compatible partition of $k$ and $p$, so that each linear program $\mathrm{LP}_{k_i, p_i}(G_i, S_i)$ admits a feasible solution. While this procedure might fail in general, it is guaranteed to succeed for a skeleton, hence at least once if a distance-1 solution exists.
Note that if such a partition is found, then for each of the instances $G_i$ together with the sets $S_i$, we can use Lemma 7 as all the conditions are satisfied. The union of the solutions for these instances is finally returned as a distance-25 solution for the original graphic instance. ∎
Acknowledgements
We would like to thank Samir Khuller for suggesting the study of this variant of the center problem and helpful discussions.
References
 [1] Hyung-Chan An, Aditya Bhaskara, and Ola Svensson. Centrality of trees for capacitated k-center. CoRR, abs/1304.2983, 2013.
 [2] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala, and Vinayaka Pandit. Local search heuristic for k-median and facility location problems. In Jeffrey Scott Vitter, Paul G. Spirakis, and Mihalis Yannakakis, editors, STOC, pages 21–29. ACM, 2001.
 [3] Manisha Bansal, Naveen Garg, and Neelima Gupta. A 5-approximation for capacitated facility location. In Leah Epstein and Paolo Ferragina, editors, ESA, volume 7501 of Lecture Notes in Computer Science, pages 133–144. Springer, 2012.
 [4] Judit Bar-Ilan, Guy Kortsarz, and David Peleg. How to allocate network centers. Journal of Algorithms, 15(3):385–415, 1993.
 [5] Yair Bartal, Moses Charikar, and Danny Raz. Approximating min-sum k-clustering in metric spaces. In Jeffrey Scott Vitter, Paul G. Spirakis, and Mihalis Yannakakis, editors, STOC, pages 11–20. ACM, 2001.
 [6] Jaroslaw Byrka and Karen Aardal. An optimal bifactor approximation algorithm for the metric uncapacitated facility location problem. SIAM J. Comput., 39(6):2212–2231, 2010.
 [7] Moses Charikar, Sudipto Guha, Éva Tardos, and David B. Shmoys. A constant-factor approximation algorithm for the k-median problem. Journal of Computer and System Sciences, 65(1):129–149, 2002.
 [8] Moses Charikar, Samir Khuller, David M. Mount, and Giri Narasimhan. Algorithms for facility location problems with outliers. In S. Rao Kosaraju, editor, SODA, pages 642–651. ACM/SIAM, 2001.
 [9] Julia Chuzhoy, Sudipto Guha, Eran Halperin, Sanjeev Khanna, Guy Kortsarz, Robert Krauthgamer, and Joseph Naor. Asymmetric k-center is hard to approximate. Journal of the ACM, 52(4):538–551, 2005.
 [10] Julia Chuzhoy and Yuval Rabani. Approximating k-median with non-uniform capacities. In SODA, pages 952–958. SIAM, 2005.
 [11] Marek Cygan, MohammadTaghi Hajiaghayi, and Samir Khuller. LP rounding for k-centers with non-uniform hard capacities. In FOCS, pages 273–282. IEEE Computer Society, 2012.
 [12] M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
 [13] Teofilo F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985.
 [14] Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31(1):228–248, 1999.
 [15] Dorit S. Hochbaum and David B. Shmoys. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10:180–184, 1985.
 [16] Dorit S. Hochbaum and David B. Shmoys. A unified approach to approximation algorithms for bottleneck problems. Journal of the ACM, 33(3):533–550, 1986.
 [17] Wen-Lian Hsu and George L. Nemhauser. Easy and hard bottleneck location problems. Discrete Applied Mathematics, 1:209–216, 1979.
 [18] Kamal Jain and Vijay V. Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation. Journal of the ACM, 48(2):274–296, 2001.
 [19] Samir Khuller and Yoram J. Sussmann. The capacitated k-center problem. SIAM Journal on Discrete Mathematics, 13(3):403–418, 2000.
 [20] Retsef Levi, David B. Shmoys, and Chaitanya Swamy. LP-based approximation algorithms for capacitated facility location. Mathematical Programming, 131(1-2):365–379, 2012.
 [21] Shi Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. Information and Computation, 222:45–58, 2013.
 [22] Shi Li and Ola Svensson. Approximating k-median via pseudo-approximation. In Dan Boneh, Tim Roughgarden, and Joan Feigenbaum, editors, STOC, pages 901–910. ACM, 2013.
 [23] David B. Shmoys, Éva Tardos, and Karen Aardal. Approximation algorithms for facility location problems. In Frank Thomson Leighton and Peter W. Shor, editors, STOC, pages 265–274. ACM, 1997.
 [24] David P. Williamson and David B. Shmoys. The Design of Approximation Algorithms. Cambridge University Press, 2011.
Appendix A Soft capacities and uniform capacities
A.1 Soft capacities
A variant of Capacitated $k$-supplier with Outliers with soft capacities can be easily reduced to the original problem preserving the quality of solutions. It suffices to duplicate each $u \in F$ $k$ times. Opening several facilities in $u$ then corresponds to opening facilities in several copies of $u$.
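The reduction above can be sketched in a few lines. Copies are represented here as pairs `(u, i)`, which is an illustrative encoding rather than anything prescribed by the paper; the distance from a client to a copy is understood to be the distance to the original facility.

```python
from collections import Counter


def duplicate_facilities(facilities, capacity, k):
    """Hard-capacity instance simulating soft capacities: k copies of every facility."""
    copies = [(u, i) for u in facilities for i in range(k)]
    copy_capacity = {c: capacity[c[0]] for c in copies}
    return copies, copy_capacity


def multiset_from_copies(opened_copies):
    """Translate a set of opened copies back into a multiset of original facilities."""
    return Counter(u for u, _ in opened_copies)


copies, copy_capacity = duplicate_facilities(["a", "b"], {"a": 2, "b": 1}, k=3)
multiset = multiset_from_copies([("a", 0), ("a", 2), ("b", 1)])
```

Since at most $k$ facilities are opened in total, $k$ copies per location always suffice.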
Theorem 15.
The Capacitated $k$-supplier with Outliers problem with soft capacities admits a 25-approximation algorithm.
A.2 Uniform capacities
In the special case of Capacitated $k$-supplier with Outliers where the capacities are uniform, we can obtain a slightly better approximation factor. Namely, in the proof of Lemma 7 we can set $u_s = s$ and avoid introducing the additional vertices $s'$, using $s$ instead. With this change the third component of the transfer, i.e. moving the value from $s'$ to $u_s$, is not necessary, thus we get an integral distance-22 transfer. Analogously to Theorem 14, we then obtain the following result.
Theorem 16.
The Capacitated $k$-supplier with Outliers problem with uniform capacities admits a 23-approximation algorithm.
A.3 Uniform soft capacities
While we could argue as for general soft capacities that in the case of uniform soft capacities we have a 23-approximation algorithm, a tailor-made proof gives a much better factor.
It is easy to verify that the ingredients of the proof of Theorem 14 can be adapted to soft capacities with two changes:

instead of a set of open facilities, we consider a multiset,

we drop the requirement in the LP.
Thus, in order to obtain an approximation algorithm it is enough to compute an integral (again, multisets allowed) distance transfer of , where is the fractional solution of the LP for an instance satisfying the conditions of Lemma 7.
Again, we shall start with gathering value from in . This time we are allowed to gather more than one unit in , so we gather everything from . A vector defined this way clearly is a distance-2 transfer of . Moreover, by (6) at least one unit is gathered at each . As in the proof of Lemma 7, the second component relies on the structure of . We connect each to the closest obtaining a tree . This way we have a tree on whose non-leaves belong to , and such that for any . We shall give an integral distance-1 transfer of . Let us make a rooted tree, setting the root at a vertex . For each define as the sum of over all descendants in the subtree rooted at . For each we transfer units from to its parent . Note that is an integer, since , so and the operation is well defined. Observe that for every it holds that
Also, for any vertex we have . That is because for leaves and for the remaining vertices , since so that . Consequently, for any , setting , we get
since for any . Moreover is a constant, so this inequality proves condition 2 of Definition 8, and thus is indeed a distance-1 transfer of . By Fact 11 this defines a distance-10 transfer of , which composed with the previous transfer using Fact 10 gives an integral distance-12 transfer of . Consequently, repeating the proof of Theorem 14 we get the following result.
Theorem 17.
The Capacitated supplier with Outliers problem with uniform soft capacities admits a 13-approximation algorithm.
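The rounding step on the rooted tree used in the proof above can be sketched in Python. The exact amount transferred from a vertex to its parent is not legible in the text, so the sketch assumes it is the fractional part of the subtree sum; under that assumption each vertex ends up with the floor of its subtree sum minus the floors of its children's subtree sums. The function name `round_on_tree` and the dictionary encoding of the tree are hypothetical.

```python
from fractions import Fraction
from math import floor


def round_on_tree(parent, value, root):
    """Push fractional parts of subtree sums towards the root.

    parent: dict vertex -> parent vertex (the root maps to None)
    value:  dict vertex -> non-negative Fraction; the total is assumed integral
    Returns a dict vertex -> int with the same total.
    """
    children = {v: [] for v in parent}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)

    # Preorder traversal: every vertex is listed before its descendants.
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    # Subtree sums, computed with descendants handled before ancestors.
    subtree = dict(value)
    for v in reversed(order):
        p = parent[v]
        if p is not None:
            subtree[p] += subtree[v]

    # Assumed transfer: v sends the fractional part of its subtree sum to
    # its parent, so v keeps floor(S(v)) minus the floors of its children.
    rounded = {}
    for v in order:
        below = sum(floor(subtree[c]) for c in children[v])
        rounded[v] = floor(subtree[v]) - below
    return rounded
```

For example, on a path with root `r`, its child `a`, and grandchild `b` holding values 1/2, 3/4 and 3/4, the subtree sums are 2, 3/2 and 3/4, and the sketch returns 1, 1 and 0 respectively. Value only moves along tree edges towards the root, which is what makes the result a short-distance transfer.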
Appendix B Equivalence of Capacitated supplier with Outliers and Capacitated center with Outliers
Theorem 18.
Assume there exists an approximation algorithm for Capacitated center with Outliers. Then there exists an approximation algorithm for Capacitated supplier with Outliers.
Proof.
Let us consider an instance of Capacitated supplier with Outliers. Define an instance of Capacitated center with Outliers as follows: take where , and for every set . Other values of are taken as the symmetric, transitive closure of those determined explicitly (note that since was symmetric and satisfied the triangle inequality, the closure does not modify any explicitly set value of ). Also, set for , for , , and . Clearly can be constructed in polynomial time from . Thus, it suffices to show that a distance solution exists in if and only if a distance solution exists in .
One direction is very simple: assume is a distance solution in . Observe that defined as for is a distance solution in .
Now, let us prove the other implication. The construction is going to be similar to the one in the proof of Lemma 9. Assume is a distance solution in . Note that may contain vertices from . Construct a bipartite graph with if , and modify to obtain by removing vertices from and replicating each according to its capacity, i.e. times. Note that , so a cardinality matching in gives a distance solution to . Observe that for any and , it holds that . Consequently, for any we have the following inequality
Therefore
Both sides of this inequality are integral, which implies
and, by the deficit version of Hall’s theorem, also guarantees the existence of a cardinality matching in and a distance solution to . ∎
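The deficit version of Hall's theorem invoked in the proof above states that in a bipartite graph with sides A and B, the maximum matching has size |A| minus the largest value of |S| - |N(S)| over subsets S of A. The following Python sketch checks this identity on a tiny example; it is an illustration of the theorem only, not of the paper's construction, and the function names and the adjacency-dictionary encoding are assumptions.

```python
from itertools import combinations


def max_matching(left, adj):
    """Size of a maximum matching, found with augmenting paths."""
    match_of_right = {}

    def try_augment(u, seen):
        for w in adj.get(u, ()):
            if w in seen:
                continue
            seen.add(w)
            # Use w if it is free, or if its current partner can be re-matched.
            if w not in match_of_right or try_augment(match_of_right[w], seen):
                match_of_right[w] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in left)


def max_deficit(left, adj):
    """Largest |S| - |N(S)| over subsets S of left (brute force, tiny inputs)."""
    best = 0  # the empty set has deficit 0
    for size in range(1, len(left) + 1):
        for subset in combinations(left, size):
            neighbours = set()
            for u in subset:
                neighbours.update(adj.get(u, ()))
            best = max(best, len(subset) - len(neighbours))
    return best
```

With `left = ["c1", "c2", "c3"]` and `adj = {"c1": ["f1"], "c2": ["f1"], "c3": ["f1", "f2"]}`, the pair `c1`, `c2` shares a single neighbour, so the maximum deficit is 1 and the maximum matching has size 3 - 1 = 2. In the proof, bounding the deficit from above is what yields a matching of the required cardinality.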
Appendix C Connected instance with arbitrarily large integrality gap
Fact 19.
Proof.
Assume and fix . Let consist of the following components (see also Figure 3): a path of vertices with endpoints and inner vertices alternately in and , four vertices (), with adjacent to , and vertices (), with adjacent both to and . For each we set , moreover and . The set is defined as .
Observe that an instance constructed this way satisfies the conditions of Lemma 7 except (iii): clearly is connected, . Consider a solution of with the following nonzero coordinates: for , for , . It is easy to verify that it is a feasible solution.
It remains to show that does not have a distance solution. For a proof by contradiction, assume that it does, with being the set of open facilities and being the set of clients served. Note that each must serve clients, since and for . Let and for . Observe that and the sum is disjoint. Consequently and for some . However, does not contain for any , so , a contradiction. ∎