Capacitated Center Problems with Two-Sided Bounds and Outliers

Hu Ding (Computer Science and Engineering, Michigan State University, East Lansing, MI, USA), Lunjia Hu, Lingxiao Huang, and Jian Li (Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China)

Abstract

In recent years, capacitated center problems have attracted a lot of research interest. Given a set of vertices, we want to find a subset of vertices, called centers, such that the maximum cluster radius is minimized. Moreover, each center should satisfy some capacity constraint, which could be an upper or lower bound on the number of vertices it can serve. Capacitated $k$-center problems with one-sided bounds (upper or lower) have been well studied in previous work, and constant factor approximations were obtained.

We are the first to study the capacitated center problem with both capacity lower and upper bounds (with or without outliers). We assume each vertex has a uniform lower bound and a non-uniform upper bound. For the case of opening exactly $k$ centers, we note that a generalization of a recent LP approach can achieve constant factor approximation algorithms for our problems. Our main contribution is a simple combinatorial algorithm for the case where there is no cardinality constraint on the number of open centers. Our combinatorial algorithm is simpler and achieves a better constant approximation factor than the LP approach.

1 Introduction

The $k$-center clustering problem is a fundamental problem in theoretical computer science and has numerous applications in a variety of fields. Roughly speaking, given a metric space containing a set of vertices, the $k$-center problem asks for a subset of $k$ vertices, called centers, such that the maximum radius of the induced $k$ clusters is minimized. In fact, $k$-center clustering falls under the umbrella of general facility location problems, which have been extensively studied in the past decades. Many operation and management problems can be modeled as facility location problems, and the input vertices and selected centers are usually called “clients” and “facilities”, respectively. In this paper, we consider a significant generalization of the $k$-center problem, where each vertex is associated with a capacity interval; that is, the cardinality of the resulting cluster centered at the vertex should satisfy the given lower and upper capacity bounds (the formal definition is given in Section 1.2). In addition, we also consider the case where a given number of vertices may be excluded as outliers.

Besides being a natural combinatorial problem in its own right, the $k$-center problem with both capacity upper and lower bounds is also strongly motivated by several realistic issues raised in a variety of application contexts.

  1. In the context of facility location, each open facility may be constrained by the maximum number of clients it can serve. The capacity lower bounds also come naturally, since an open facility needs to serve at least a certain number of clients in order to generate profit.

  2. Several variants of $k$-center clustering have been used in the context of preserving privacy in the publication of sensitive data (see, e.g., (Aggarwal et al., 2010; Li et al., 2010; Sweeney, 2002)). In such applications, it is important to have an appropriate lower bound on the cluster sizes in order to protect privacy to a certain extent (roughly speaking, it is relatively easy for an adversary to identify the clients inside a cluster that is too small).

  3. Consider the scenario where the data is distributed over the nodes of a large network. We would like to choose $k$ nodes as central servers and aggregate the information of the entire network. We need to minimize the delay (i.e., minimize the cluster radius) and at the same time keep the load balanced, since machines receiving too much data could become the bottleneck of the system, while those receiving too little data are not sufficiently energy-efficient (Dick et al., 2015).

Our problem generalizes the classic $k$-center problem as well as many important variants studied by previous authors. The optimal approximation results for the classic $k$-center problem appeared in the 1980s: Gonzalez (1985) and Hochbaum & Shmoys (1985) provided a $2$-approximation in a metric graph; moreover, they proved that any approximation ratio $c < 2$ would imply $P = NP$. The first study of capacitated (with only upper bounds) $k$-center clustering is due to Barilan et al. (1993), who provided a $10$-approximation algorithm for uniform capacities (i.e., all the upper bounds are identical). Khuller & Sussmann (2000) improved the approximation ratio to $6$ and $5$ for hard and soft uniform capacities, respectively. (In the soft capacity version, we can open more than one copy of a facility at the same node; in the hard capacity version, we can open at most one copy.) The recent breakthrough for non-uniform (upper) capacities is due to Cygan et al. (2012), who developed the first constant approximation algorithm based on LP rounding, though their approximation ratio is in the hundreds. Following this work, An et al. (2015) provided an approximation algorithm with the much lower approximation ratio $9$. On the inapproximability side, it is impossible to achieve an approximation ratio lower than $3$ for non-uniform capacities unless $P = NP$ (Cygan et al., 2012).

For the ordinary $k$-center with outliers, a $3$-approximation algorithm was obtained by Charikar et al. (2001). Kociumaka & Cygan (2014) studied $k$-center with non-uniform upper capacities and outliers, and provided a $25$-approximation algorithm.

-center clustering with lower bounds on cluster sizes was first studied in the context of privacy-preserving data management (Sweeney, 2002). Aggarwal et al.  (2010) provided a -approximation and a -approximation for the cases without and with outliers, respectively. Further, Ene et al.  (2013) presented a near linear time -approximation algorithm in constant dimensional Euclidean space. Note that both (Aggarwal et al. , 2010; Ene et al. , 2013) are only for uniform lower bounds. Recently, Ahmadian & Swamy (2016) provided a -approximation and a -approximation for the non-uniform lower bound case without and with outliers.

Our main results. To the best of our knowledge, we are the first to study the $k$-center problem with both capacity lower and upper bounds (with or without outliers). Given a set of vertices, we focus on the case where each vertex $u$ has a uniform capacity lower bound $L$ and a non-uniform capacity upper bound $U_u$. Sometimes, we consider a generalized supplier version where we are only allowed to open centers among a given facility set; see Definition 1 for details. We mainly provide the first constant factor approximation algorithms for the following variants; see Table 1 for other results.

  1. $(L,U,\text{soft-}\emptyset,p)$-Center (Section 2.2): In this problem, both the lower bounds and the upper bounds are uniform, i.e., $L_u = L$ and $U_u = U$ for all $u$. The number of open centers can be arbitrary, i.e., there is no requirement to choose exactly $k$ open centers. Moreover, we allow multiple open centers at a single vertex (i.e., soft capacity). We may exclude outliers. We provide the first polynomial time combinatorial algorithm, which achieves an approximation factor of $5$.

  2. $(L,\{U_u\},\emptyset,p)$-Center (Section 2.3): In this problem, the lower bounds are uniform, i.e., $L_u = L$ for all $u$, but the upper bounds can be non-uniform. The number of open centers can be arbitrary. We may exclude outliers. We provide the first polynomial time combinatorial $11$-approximation for this problem.

  3. $(L,\{U_u\},k)$-Center (Section 3.3): In this problem, we would like to open exactly $k$ centers such that the maximum cluster radius is minimized. All vertices have the same capacity lower bound, i.e., $L_u = L$ for all $u$, but the capacity upper bounds may be non-uniform, i.e., each vertex $u$ has an individual capacity upper bound $U_u$. Moreover, we do not exclude any outlier. We provide the first polynomial time $9$-approximation algorithm for this problem, based on LP rounding.

  4. $(L,\{U_u\},k,p)$-Center (Section 3.3): This problem is the outlier version of the $(L,\{U_u\},k)$-Center problem. The problem setting is exactly the same except that we can exclude vertices as outliers. We provide a polynomial time $25$-approximation algorithm for this problem.

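Problem setting                                Approximation ratio
                                               Center version   Supplier version
Without cardinality constraint:
  $(L, U, \text{soft-}\emptyset, p)$           5                5
  $(L, U, \emptyset, p)$                       10               23
  $(L, \{U_u\}, \text{soft-}\emptyset, p)$     11               11
  $(L, \{U_u\}, \emptyset, p)$                 11               25
With cardinality constraint:
  $(L, U, k)$                                  6                9
  $(L, \{U_u\}, k)$                            9                13
  $(L, U, \text{soft-}k, p)$                   13               13
  $(L, U, k, p)$                               23               23
  $(L, \{U_u\}, \text{soft-}k, p)$             25               25
  $(L, \{U_u\}, k, p)$                         25               25
Table 1: A summary of our results in this paper.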

Our main techniques. In Section 2, we consider the first two variants, which allow opening arbitrarily many centers. We design simple and fast combinatorial algorithms that achieve better constant approximation ratios than the LP approach. For the simpler case with uniform soft capacities, we construct a data structure over all possible open centers, which we call a core-center tree (CCT). Our greedy algorithm mainly consists of two procedures. The first procedure, pass-up, greedily assigns vertices to open centers from the leaves of the CCT to the root. After this procedure, there may exist some unassigned vertices around the root. The second procedure, pass-down, assigns these vertices one by one, finding an exchange route each time. For the more general case with non-uniform hard capacities, our greedy algorithm is similar but somewhat more subtle. We still construct a CCT and run the pass-up procedure, which yields an open center set that may contain redundant centers. However, since we deal with hard capacities and outliers, we need to find a non-redundant open center set that is not “too far” from the original one (see Section 2.3 for details) and has enough total capacity. Then, by a pass-down procedure, we can assign enough vertices to their nearby open centers.

In Section 3, we consider the last two variants, which require opening exactly $k$ centers. We generalize the LP approach developed for $k$-center with only capacity upper bounds (An et al., 2015; Kociumaka & Cygan, 2014) and obtain constant factor approximation algorithms for two-sided capacity bounds. Due to the lack of space, we defer many details and proofs to a full version.

1.1 Other Related Work

The classic $k$-center problem is quite fundamental and has been generalized in many ways to incorporate various constraints motivated by different application scenarios. Recently, Fernandes et al. (2016) also provided constant approximations for fault-tolerant capacitated $k$-center clustering. Chen et al. (2016) studied the matroid center problem, where the selected centers must form an independent set of a given matroid, and provided constant factor approximation algorithms (with or without outliers).

There is a large body of work on approximation algorithms for the facility location and $k$-median problems (see, e.g., (Arya et al., 2004; Charikar & Guha, 2005; Charikar et al., 1999; Guha & Khuller, 1999; Jain et al., 2002; Jain & Vazirani, 2001; Korupolu et al., 2000; Li, 2013; Li & Svensson, 2016)). Moreover, Dick et al. (2015) studied multiple balanced clustering problems with uniform capacity intervals, that is, all the lower (upper) bounds are identical; they also considered the problems under the stability assumption.

1.2 Preliminaries

In this paper, we usually work with the following more general problem, called the capacitated $k$-supplier problem. It is easy to see that it generalizes the capacitated $k$-center problem. The formal definition is as follows.

Definition 1.

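(Capacitated $k$-supplier with two-sided bounds and outliers) Suppose that we have

  1. Two integers $k, p \in \mathbb{Z}_{\geq 0}$;

  2. A finite set $\mathcal{C}$ of clients, and a finite set $\mathcal{F}$ of facilities;

  3. A symmetric distance function $d: (\mathcal{C}\cup\mathcal{F}) \times (\mathcal{C}\cup\mathcal{F}) \rightarrow \mathbb{R}_{\geq 0}$ satisfying the triangle inequality;

  4. A capacity interval $[L_u, U_u]$ for each facility $u \in \mathcal{F}$, where $L_u, U_u \in \mathbb{Z}_{\geq 0}$ and $L_u \leq U_u$.

Our goal is to find a client set $C \subseteq \mathcal{C}$ of size at least $p$, an open facility set $F \subseteq \mathcal{F}$ of size exactly $k$, and a function $\sigma: C \rightarrow F$ satisfying $L_u \leq |\sigma^{-1}(u)| \leq U_u$ for each $u \in F$, which minimize the maximum cluster radius $\max_{v \in C} d(v, \sigma(v))$. If the maximum cluster radius is at most $\rho$, we call the tuple $(C, F, \sigma)$ a distance-$\rho$ solution.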
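To make the feasibility conditions of Definition 1 concrete, the following is a minimal Python sketch of a checker for a candidate solution under hard capacities. The function name, the dictionary-based encoding of distances and capacity intervals, and the convention that the open facilities are exactly those receiving at least one client are assumptions made for illustration; they are not part of the definition.

```python
from collections import Counter


def is_distance_rho_solution(dist, capacity, k, p, sigma, rho):
    """Check a candidate solution of the two-sided capacitated supplier problem.

    dist[(client, facility)] is a distance, capacity[facility] is a pair
    (lower, upper), and sigma maps every served client to its open facility.
    Clients missing from sigma are treated as outliers (illustrative encoding).
    """
    load = Counter(sigma.values())           # number of clients per open facility
    if len(sigma) < p:                       # fewer than p clients are served
        return False
    if len(load) != k:                       # exactly k facilities must be open
        return False
    for facility, served in load.items():
        lower, upper = capacity[facility]
        if not lower <= served <= upper:     # two-sided capacity bound violated
            return False
    # every served client must be within distance rho of its facility
    return all(dist[(c, f)] <= rho for c, f in sigma.items())
```

For example, with two facilities of capacity intervals `(1, 2)` and `(1, 1)`, an assignment that sends two clients to the first and one to the second passes the check for `k = 2`, `p = 3` as long as `rho` is at least the largest assigned distance.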

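We denote the above problem as $(\{L_u\},\{U_u\},k,p)$-Supplier. If the lower bounds are uniform ($L_u = L$ for all $u$), we use $L$ in place of $\{L_u\}$, e.g., $(L,\{U_u\},k)$-Supplier (the parameter $p$ is omitted when no outliers are excluded). Similarly, if the upper bounds are uniform ($U_u = U$ for all $u$), we use $U$ in place of $\{U_u\}$. If there is no constraint on the number of open centers, we use $\emptyset$ to replace $k$, e.g., $(L,\{U_u\},\emptyset,p)$-Supplier. Also note that the capacitated $k$-center problem with two-sided bounds and outliers is the special case in which the client set and the facility set coincide; we denote it the $(\{L_u\},\{U_u\},k,p)$-Center problem.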

By the similar approach of Kociumaka & Cygan (2014), we can reduce the (,,,)-Supplier problem to a simpler case. We first introduce some definitions.

Definition 2.

(Induced distance function) We say the distance function is induced by an undirected unweighted connected graph if

  1. , we have and .

  2. , the distance between and equals to the length of the shortest path from to .
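As a small illustration of Definition 2, the sketch below computes the induced distances from one vertex by breadth-first search, which is the standard way to obtain shortest-path lengths in an unweighted graph. The adjacency-dictionary representation and the function name are assumptions of this example.

```python
from collections import deque


def induced_distances(adj, source):
    """Shortest-path lengths from `source` in an unweighted graph.

    adj maps each vertex to an iterable of its neighbours (hypothetical encoding).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first visit gives the shortest distance
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist
```

On the path `f1 - c1 - f2 - c2`, the two facilities `f1` and `f2` are at induced distance 2, since they share the neighbouring client `c1`.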

Definition 3.

(Induced (,,,)-Supplier instance) An (,,,)-Supplier instance is called an induced (,,,)-Supplier instance if the following properties are satisfied:

  1. The distance function is induced by an undirected connected graph .

  2. The optimal capacitated -supplier value is at most 1.

Moreover, we say this instance is induced by .

When the graph of interest is clear from the context, we will use instead of for convenience. We then show a reduction from solving the generalized (,,,)-Supplier problem to solving induced (,,,)-Supplier instances by Lemma 4. The proof can be found in Appendix A.

Lemma 4.

Suppose we have a polynomial time algorithm that takes as input any induced (,,,)-Supplier instance, and outputs a distance-$\rho$ solution. Then, there exists a $\rho$-approximation algorithm for the (,,,)-Supplier problem with polynomial running time.
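The proof of Lemma 4 is deferred to the appendix and is not reproduced here. A common way to obtain unweighted instances from a metric one, which this sketch assumes (it may differ from the authors' argument), is to guess the optimal radius among the finitely many client-facility distances and keep only the pairs within the guess. The helper names `candidate_radii` and `threshold_graph` are hypothetical, and client and facility labels are assumed to be distinct.

```python
def candidate_radii(dist):
    """All distinct client-facility distances, in increasing order."""
    return sorted(set(dist.values()))


def threshold_graph(dist, tau):
    """Unweighted graph that keeps the client-facility pairs at distance <= tau.

    dist[(client, facility)] is a distance; the result maps every vertex to
    the set of its neighbours.
    """
    adj = {}
    for (client, facility), d in dist.items():
        adj.setdefault(client, set())
        adj.setdefault(facility, set())
        if d <= tau:
            adj[client].add(facility)
            adj[facility].add(client)
    return adj
```

Under this assumption, one would try the candidate radii in increasing order and run the algorithm for induced instances on each connected component of the resulting graph.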

By Lemma 4, we focus on designing an algorithm for different variants of the induced (,,,)-Supplier instances.

2 Capacitated Center with Two-Sided Bounds and Outliers

In this section, we consider the version in which the number of open centers can be arbitrary. By the LP approach in Section 3.3 and by enumerating the number of open centers, we can obtain approximation algorithms for the different variants in this case. However, the resulting approximation factors are not small enough. In this section, we introduce a new greedy approach that achieves better approximation factors. Since our algorithm is combinatorial, it is also easier to implement and faster than the LP approach.

2.1 Core-Center Tree (CCT)

Consider the (,,,)-Supplier problem. By Lemma 4, we only need to consider induced (,,,)-Supplier instances induced by an undirected unweighted connected graph . We first propose a new data structure called core-center tree (CCT) as follows.

Definition 5.

(Core-center tree (CCT)) Given an induced (,,,)-Supplier instance induced by an undirected unweighted connected graph $G$, we call a tree $T$ on the facilities a core-center tree (CCT) if the following properties hold.

  1. For each edge $(u, u')$ of $T$, we have $d(u, u') \leq 2$;

  2. Suppose the root of $T$ is at layer 0. Denote by $I$ the set of vertices in the even layers of $T$. We call $I$ the core-center set of $T$. For any two distinct vertices $u, u' \in I$, we have $d(u, u') \geq 3$.

Lemma 6.

Given an induced (,,,)-Supplier instance induced by an undirected unweighted connected graph , we can construct a CCT in polynomial time.

Proof.

We first construct a graph $H$ on the facility set as follows: for each pair of facilities at distance at most 2, we add an edge in $H$. Observe that $H$ is connected by Definition 2. We then construct a spanning tree $T$ of $H$ such that all facilities in the even layers form an independent set of $H$. It is not hard to verify that such a tree is a CCT: every tree edge joins two facilities at distance at most 2, and any two facilities in even layers are non-adjacent in $H$, hence at distance at least 3. We build $T$ as follows; the independence property follows directly from the construction.

  1. Initially, we arbitrarily pick a facility $r$ as the root of $T$. We then pick all facilities adjacent to $r$ in $H$ as its children (layer 1).

  2. By a modified BFS, we continue to construct layer 2 and layer 3. Each time, we pick a facility $u$ in layer 1. We iteratively pick a facility $v$ adjacent to $u$ in $H$ that has not been scanned, and make it a child of $u$. After we append $v$ to layer 2, we immediately pick all unscanned neighbors of $v$ in $H$ as the children of $v$ (appending them to layer 3).

  3. We then iteratively construct $T$ until all facilities have been scanned. In each iteration, we build two consecutive layers: an even layer followed by an odd layer.
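The construction above can be sketched in a few lines of Python. The sketch assumes the auxiliary graph on the facilities (facilities joined when they are at distance at most 2) is already available as an adjacency dictionary `h_adj` and is connected; the function name and the use of sorted neighbour order (for determinism only) are choices made for this example.

```python
from collections import deque


def build_cct(h_adj, root):
    """Return (parent, depth) of a spanning tree built by the two-layer scan."""
    parent, depth = {root: None}, {root: 0}
    odd_queue = deque()                      # odd-layer nodes waiting to be expanded

    def attach_unscanned_neighbours(core):
        # every still-unscanned neighbour of an even-layer node becomes its child
        for w in sorted(h_adj[core]):
            if w not in depth:
                parent[w], depth[w] = core, depth[core] + 1
                odd_queue.append(w)

    attach_unscanned_neighbours(root)
    while odd_queue:
        u = odd_queue.popleft()
        for v in sorted(h_adj[u]):
            if v not in depth:
                # v joins an even layer; its unscanned neighbours are claimed at once,
                # so no later even-layer node can be adjacent to v
                parent[v], depth[v] = u, depth[u] + 1
                attach_unscanned_neighbours(v)
    return parent, depth
```

The nodes of even depth form the core-center set. Because the neighbours of a new even-layer node are claimed immediately, two even-layer nodes are never adjacent in the auxiliary graph, which is the independence property used in the proof.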

For any , denote to be the collection of all neighbors of . If is also a client, then . W.l.o.g., we assume that for every facility in this section. In fact, we can directly delete all satisfying that from the facility set , since can not be open in any optimal feasible solution. If this deletion causes the induced graph unconnected, similar to Lemma 6 in (Kociumaka & Cygan, 2014), we divide the graph into different connected components, and consider each smaller induced instance based on different connected components. Otherwise if , we set , which has no influence on any optimal feasible solution of the induced (,,,)-Supplier instance. The following lemma gives a useful property of CCT.

Lemma 7.

Given an induced (,,,)-Supplier instance induced by an undirected unweighted connected graph , and a core-center tree , suppose is the core-center set of . Then, we can construct, in polynomial time, a function satisfying the following properties.

  1. For all , we have ;

  2. For all , we have .

Proof.

First, for each pair and , we define . This mapping is well defined since for each pair , we have by Definition 5. For the remaining clients , we define to be an arbitrary facility adjacent to .

By the above construction, the first constraint is satisfied naturally. The second constraint is satisfied by the fact that for all . ∎
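The mapping built in this proof can be sketched as follows, assuming each client comes with the set of facilities adjacent to it. The names `build_eta`, `client_adj`, and `core` are hypothetical, and "arbitrary" is implemented here as "smallest label".

```python
def build_eta(client_adj, core):
    """Map every client to an adjacent facility, preferring core centers.

    client_adj[c] is the set of facilities adjacent to client c and core is
    the core-center set of the tree.
    """
    eta = {}
    for client, facilities in client_adj.items():
        near_core = sorted(f for f in facilities if f in core)
        # distinct core centers are at distance at least 3, so a client can be
        # adjacent to at most one of them and near_core has at most one entry
        eta[client] = near_core[0] if near_core else min(facilities)
    return eta
```

Every client adjacent to a core center is sent to that center, so a core center keeps all of its neighbouring clients.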

2.2 A Simple Case: (,,soft-,)-Supplier

We first consider a simple case where the capacity bounds (upper and lower) are uniform and soft. In this setting, we want to find an open facility set . Note that we allow multiple open centers in . We also need to find an assignment function , representing that we assign every client to facility . The main theorem is as follows.

Theorem 8.

(main theorem) There exists a 5-approximation polynomial time algorithm for the (,,soft-,)-Supplier problem.

By Lemma 4, we only consider induced (,,soft-,)-Supplier instances. Given an induced (,,soft-,)-Supplier instance induced by an undirected unweighted connected graph , recall that we can assume for each . We first construct a CCT rooted at node , and a function satisfying Lemma 7. For a facility set , we let denote the collection of clients assigned to some facility in by .

Our algorithm mainly includes two procedures. The first procedure is called pass-up, which is a greedy algorithm to map clients to facilities from the leaves of to the root. After the ’pass-up’ procedure, we still leave some unassigned clients nearby the root. Then we use a procedure called pass-down to allocate those unassigned clients by iteratively finding an exchange route. In the following, we give the details of both procedures.

Procedure Pass-Up. Assume that for some and . In this procedure, we will find an open facility set of size . We also find an assignment function which assigns clients to some nearby facility in except a client set . Here, is a collection of clients in nearby the root . Our main idea is to open facility centers from the leaves of CCT  to the root iteratively. During opening centers, we assign exactly ’close’ clients to each center. This is the reason that there are unassigned clients after the whole procedure.

We now describe an iteration of pass-up. Assume that is the core-center set of . At the beginning, we find a non-leaf vertex all of whose grandchildren (if any) are leaves. We denote by the collection of all children and all grandchildren of . In the next step, we consider all unscanned clients in (here, unscanned clients are those clients that have not been assigned by before this iteration) and assign them to the facility . We want each center at to serve exactly $L$ clients. However, there may exist one center at serving fewer than $L$ unscanned clients in . We assign some clients in to this center so that it also serves exactly $L$ clients. After this iteration, we delete the subtree rooted at from , except itself.

Finally, the root will become the only remaining node in . We open multiple centers at , each serving exactly clients in , until there are less than unassigned clients. See Algorithm 1 for details. We have the following lemma.

1 Input: an induced (,,soft-,)-Supplier instance induced by , a CCT , and a function ;
2 Initialize , , ;
3 while  do
4       If the root is the only node of , we let . Otherwise, arbitrarily pick a non-leaf vertex in whose all grandchildren (if exists) are leaves of ;
5       Denote the subtree of rooted at by . Denote to be the collection of all facilities in ;
6       Let . Assume that for some and ;
7       Arbitrarily pick clients from to form a set . Let ;
8       Let ;
9       For each center (), assign exactly clients to , i.e., let ;
10       Let , , ;
11       if  then
12             Let . Assume that for some and ;
13             Arbitrarily pick clients from to form a set ;
14             Let ;
15             For each center (), assign exactly clients to , i.e., let ;
16             Let , , ;
17            
18      
19 Output: , and .
Algorithm 1 Pass-Up
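The following Python sketch illustrates the idea of pass-up under simplifying assumptions; it is not a transcription of Algorithm 1. The tree is given as a `children` dictionary, `eta` maps each client to a facility, `lower` is the uniform lower bound, and every core center is assumed to have at least `lower` clients mapped to it. Unlike the description above, the sketch tops up from a center's own clients only when the collected pool is not already a multiple of `lower`. All names are hypothetical.

```python
def pass_up(children, root, eta, lower):
    """Return (copies, sigma, unassigned); each copy is (facility, copy index)."""
    depth, stack = {root: 0}, [root]
    while stack:                                   # depth of every tree node
        u = stack.pop()
        for w in children.get(u, []):
            depth[w] = depth[u] + 1
            stack.append(w)
    pending = {}                                   # facility -> its unassigned clients
    for client, facility in eta.items():
        pending.setdefault(facility, []).append(client)
    copies, sigma = [], {}

    def open_full_copies(u, clients):
        # open len(clients) / lower copies at u, each serving exactly `lower` clients
        for start in range(0, len(clients), lower):
            copy = (u, len([c for c in copies if c[0] == u]))
            copies.append(copy)
            for client in clients[start:start + lower]:
                sigma[client] = copy

    cores = [u for u in depth if depth[u] % 2 == 0]
    for u in sorted(cores, key=lambda node: -depth[node]):   # deepest first, root last
        below = [w for child in children.get(u, [])
                 for w in [child] + list(children.get(child, []))]
        pool = [c for f in below for c in pending.pop(f, [])]
        own = pending.setdefault(u, [])
        shortfall = -len(pool) % lower             # clients missing to fill the last copy
        pool += own[:shortfall]                    # top up from u's own clients
        del own[:shortfall]
        open_full_copies(u, pool)
        if u == root:                              # spend the root's own clients as well
            usable = len(own) - len(own) % lower
            open_full_copies(u, own[:usable])
            del own[:usable]
    return copies, sigma, pending.get(root, [])
```

Processing core centers from the deepest layer upward guarantees that, when a center is handled, its grandchildren have already been collapsed to leaves. Every opened copy serves exactly `lower` clients, and the clients left over at the end are all mapped to the root by `eta`.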
Lemma 9.

Given an induced (,,soft-,)-Supplier instance induced by an undirected unweighted connected graph , assume that for some and . The output of Algorithm 1 satisfies the following properties:

  1. Each open facility satisfies that , and ;

  2. The unassigned client set , and ;

  3. For each facility , we have .

  4. For each client , is either , or the parent of in , or the grandparent of in . Moreover, we have .

Proof.

We first prove the feasibility of Algorithm 1. The feasibility of Line 7 follows from the fact that by Lemma 7. Since by Line 6, we can always pick clients from . The feasibility of Line 9 follows from the fact that . Since we open centers at , it is able to assign exactly clients in to each center.

Then we prove the properties of the output. The first three properties mainly follow from the fact that and we assign exactly clients to each open center. We only need to verify that . By Line 3, we always pick in the last iteration of Algorithm 1. By Line 11-16, this fact is obvious. For each center at , it only serves clients in . By the definition of and , we conclude the first part of the last property. Moreover, we have by Lemma 7 and by Definition 5. By the triangle inequality, we have . ∎

Procedure Pass-Down. After the procedure pass-up, we still leave an unassigned client set of size . However, our goal is to serve at least clients. Therefore, we need to modify the assignment function and serve more clients.

The procedure pass-down handles the remaining clients in one by one, see Algorithm 2 for details. At the beginning of pass-down, we initialize an ’unscanned’ client set , i.e., is the collection of those clients allowing to be reassigned by pass-down. In each iteration, we arbitrarily pick a client and assign it to the root node . However, if each open facility at has already served clients by , assigning to will violate the capacity upper bound. In this case, we actually find an open center such that , i.e., there are less than clients assigned to by . We then construct an exchange route consisting of open facilities in . We first find a sequence of nodes in satisfying that is the grandparent of in the core-center tree for all . Then for each node , we pick a client which has not been reassigned so far. We call such a sequence of clients an exchange route. Our algorithm is as follows: 1) we assign to ; 2) we iteratively reassign to in order ; 3) finally we reassign to . We then mark all clients in the exchange route by removing them from the ’unscanned’ client set . Note that our exchange route only increases the number of clients assigned to by one. We will prove such an exchange route always exists in each iteration. Thus in each iteration, the procedure pass-down assigns one more client to some open facility in . We will argue that there are at least clients served by at the end of pass-down.

1 Input: an induced (,,soft-,)-Supplier instance induced by , a CCT , a function , an open facility set , an unassigned client set , and a function ;
2 Initialize ;
3 while  and ,  do
4       Arbitrarily pick a client and an open facility () satisfying that ;
5       if  then
6             Let , ;
7            
8      else
9             Let be the sequence of nodes in where is the grandparent of for all .
10             Let . For every , arbitrarily pick a client ;
11             for  do
12                   Reassign ;
13                  
14            Reassign ;
15             Let , ;
16             Let , ;
17            
18      
Output: , and ;
Algorithm 2 Pass-Down

Now we prove the following lemma. Note that Theorem 8 can be directly obtained by Lemma 4 and Lemma 10.

Lemma 10.

Algorithm 2 outputs a distance-5 solution of the given induced (,,soft-,)-Supplier instance induced by in polynomial time.

Proof.

We first verify the feasibility of Algorithm 2. The feasibility of Line 8 follows from the fact that by Lemma 9. Then we only need to show that an exchange route described in Line 9-14 must exist in each iteration. On the one hand, we verify that an exchange route in Line 9 always exists. Since we assign one more client in in each iteration, there are at most iterations by Property 2 in Lemma 9. Then the node appears at most times in the sequence in Line 8, and at most clients in are removed from at the end of the algorithm. By Lemma 7, we also have . Thus we always have in Line 12, which proves the existence of an exchange route. On the other hand, for each open facility (), the number of clients served by does not change after Line 12. This is because we reassign to , and remove from it.

We then show that for all , we have at the end of the algorithm. In Line 11, we reassign a client to . Since is the grandchild of , we have by Property 4 in Lemma 9. Combining with , we conclude that . Then by Property 4 in Lemma 9, we finish the proof.

Finally, we show that the output client set is of size at least . In fact, we only need to prove that , i.e., . Since the number of open facility centers in the optimal solution served at least clients is at most , we have . ∎

(,,,)-Center. Consider the (,,,)-Center problem with hard capacities. We first treat a given induced (,,,)-Center instance as an induced (,,soft-,)-Center instance. Then we apply Theorem 8 and obtain a 5-approximation solution . Since the two instances are induced by the same connected graph, the optimal capacitated center value of the induced (,,soft-,)-Center instance is at most the optimal capacitated center value 1 of the induced (,,,)-Center instance. Therefore, we know . Since we have hard capacities, we still need to modify to be a single set. In fact, we can choose arbitrary vertex to replace each as an open center, and assign all vertices in to . Note that the distance between any and is at most and the new is at most . Thus we have the following theorem.

Theorem 11.

There exists a 10-approximation polynomial time algorithm for the (,,,)-Center problem.

2.3 (,,,)-Center

In this subsection, we consider a more complicated case where the capacity upper bounds are non-uniform, and each vertex has a hard capacity. Our main theorem is as follows.

Theorem 12.

(main theorem) There exists an 11-approximation polynomial time algorithm for the (,,,)-Center problem.

By Lemma 4, we only need to consider induced (,,,)-Supplier instances. For an induced (,,,)-Supplier instance induced by an undirected unweighted connected graph , recall that we can assume for every vertex . Recall that we may remove some facilities from such that this assumption is satisfied. Thus, the set may be a subset of . Since we consider the center version, every vertex has an individual capacity interval and can be opened as a center as well. This fact is useful for our following algorithm and is the reason why we do not consider the supplier version in this subsection.

Similar to (,,soft-,)-Center, our algorithm first computes a core-center tree rooted at , a core-center set and a function described as in Lemma 7. Assume that for some and . Note that the procedure pass-up algorithm does not depend on the capacity upper bounds. Therefore, we still use the procedure pass-up to compute an open set , an unassigned set of size , and a function .

However, we cannot apply pass-down directly. On the one hand, since we consider non-uniform capacity upper bounds, the inequality may not be satisfied. We need to choose open centers carefully such that at least vertices can be served. On the other hand, we cannot open multiple facilities at a single vertex under hard capacities. Thus, we need the following lemma to modify the open center set .

Lemma 13.

Given an induced (,,,)-Center instance induced by where for each and an open set computed by pass-up, there exists a polynomial time algorithm that finds another open set satisfying the following properties:

  1. is a single set.

  2. For all , we have .

  3. .

We will prove the above lemma later. By Lemma 13, we are ready to prove Theorem 12.

Proof of Theorem 12. By Lemma 13, we obtain another open set . We first modify to be for all . Then we apply the procedure pass-down according to the modified capacities. By Lemma 10, we obtain a distance-5 solution . Since , at least vertices are served by . Finally, for each vertex and such that , we reassign to , i.e., let . By Lemma 13, we obtain a feasible solution for the given induced (,,,)-Center instance. Since , the capacitated center value of our solution is at most . Combining with Lemma 4, we finish the proof.

Now we only need to prove Lemma 13.

Proof.

We construct an undirected weighted bipartite graph as follows: () if and only if and this edge has weight . We then find a maximum-weight maximum-matching on this graph . We only need to verify that is perfectly matched in and the total weight of is at least . Suppose each is matched to , we finish the proof by letting .

Define to be the optimal open center set. By Hall’s theorem, we first prove the existence of a matching in satisfying that every vertex in is matched. For any subset , we assume by contradiction that , where . Define to be the set of vertices served by in the optimal solution. By the capacity lower bound, we have . Recall that is the unassigned set of size obtained by pass-up. Therefore, we have . On the other hand, for each vertex , we have by Lemma 9. Thus, we conclude that , which implies all . Since each open center only serves vertices by Lemma 9, we have which is a contradiction. So we prove the existence of . Note that the total weight of is at least .

Note that there exists a matching on such that is perfectly matched. We can achieve this property by matching each to an arbitrary vertex such that . Then, by the Hungarian algorithm, we can construct a matching by iteratively finding augmenting paths based on , until all vertices in are matched. Since an augmenting path cannot make a matched vertex unmatched, we conclude that the total weight of is at least . Thus, the maximum-weight maximum-matching on must satisfy that is perfectly matched and the total weight is at least . ∎
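The proof relies on a maximum-weight matching that saturates every open center. For intuition only, the sketch below finds such a matching by brute force on a tiny instance; it is not the Hungarian algorithm mentioned in the proof and does not scale. The names `left`, `allowed`, and `weight` are hypothetical: `allowed[u]` lists the vertices that open center `u` may be matched to, and `weight[v]` is a non-negative weight on vertex `v` (for instance, its capacity upper bound).

```python
from itertools import permutations


def best_saturating_matching(left, allowed, weight):
    """Heaviest matching that matches every left vertex, or (None, -1) if none exists."""
    rights = sorted(weight)
    best, best_total = None, -1
    for choice in permutations(rights, len(left)):      # injective left -> right maps
        if all(v in allowed[u] for u, v in zip(left, choice)):
            total = sum(weight[v] for v in choice)
            if total > best_total:
                best, best_total = dict(zip(left, choice)), total
    return best, best_total
```

With `left = ['a', 'b']`, `allowed = {'a': {'x', 'y'}, 'b': {'y'}}` and weights `x: 3, y: 5, z: 1`, the only saturating matching sends `a` to `x` and `b` to `y`, with total weight 8; greedily giving the heavier vertex `y` to `a` would leave `b` unmatched.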

(,,soft-,)-Supplier. Consider the (,,soft-,)-Supplier problem with soft capacities. Our technique is similar to (,,,)-Center except for a different procedure for choosing in Lemma 13. By Lemma 4, we again consider a given induced (,,soft-,)-Supplier instance induced by an undirected unweighted connected graph . W.l.o.g., we assume that for all facilities . Similarly, we compute and apply pass-up to compute , and . Before applying pass-down, we also need to find another open facility set . Since we have soft capacities, we only need to require to satisfy Properties 2 and 3 in Lemma 13. This is the reason why we can consider the supplier version. By the same technique as in the proof of Theorem 12, we have the following theorem.

Theorem 14.

There exists a polynomial-time algorithm achieving approximation ratio 11 for the (,,soft-,)-Supplier problem.

Proof.

We only need to find another open facility set satisfying Properties 2 and 3 in Lemma 13. In fact, we simply define for all . We only need to verify that . Assume that the optimal open facility set is and the optimal assignment function is . W.l.o.g., we assume that . We only need to find an injection such that .

The injection can be found greedily. Suppose that have been decided for some ; we want to decide . Since each only serves clients by and the unassigned client set is of size , we have the following property by counting:

Arbitrarily pick a client . Define . By the definition of , there exists some such that . Therefore, . Thus, we have by the definition of . The proof is complete. ∎

3 Capacitated -Center with Two-Sided Bounds and Outliers

In this section, we study the capacitated -center problems with two-sided bounds, with or without outliers, and give approximation algorithms. We consider the case where all vertices have a uniform capacity lower bound , while the capacity upper bounds can be either uniform or non-uniform. Our goal is to design algorithms with constant approximation ratios. Similar to (An et al., 2015; Kociumaka & Cygan, 2014), we use the standard LP relaxation and the rounding procedure distance-r transfer. We will first extend the distance-r transfer procedure to two-sided bounds.

3.1 LP Formulation

We first give a natural LP relaxation for (,,,)-Supplier.

Definition 15.

(Distance- relaxation ) Given an (,,,)-Supplier instance, the following feasibility LP fractionally verifies whether there exists a solution that assigns at least clients to an open center within distance at most :

Here is called an assignment variable representing the fractional amount of assignment from client to center , and is called the opening variable of . For convenience, we use to represent and , respectively.

By Definition 3, must have a feasible solution for any induced (,,,)-Supplier instance. Assume that we have a feasible fractional solution of . We want to obtain a distance- solution by rounding . To this end, we recall a rounding procedure called distance-r transfer.

3.2 Distance-r Transfer

We first extend the definition of distance-r transfer proposed in (An et al., 2015; Kociumaka & Cygan, 2014) by adding a third condition. For a vertex and a set , we define .

Definition 16.

Given an (,,,)-Supplier instance and , a vector is a distance-r transfer of if

  1. ;

  2. for all ;

  3. for all .

If is a characteristic vector of , we say that is an integral distance-r transfer of .

The first condition says that a transfer should not change the total number of open centers. By an argument using Hall’s theorem as in (An et al., 2015; Kociumaka & Cygan, 2014), the second condition is what guarantees the capacity upper bounds. In this paper, we add the third condition to guarantee the capacity lower bounds. As in (An et al., 2015; Kociumaka & Cygan, 2014), an integral distance-r transfer of the fractional solution of already gives a distance- solution by the following lemma.

Lemma 17.

Given an (,,,)-Supplier instance, assume is a feasible solution of and is an integral distance-r transfer of . Then one can find a distance- solution in polynomial time.

Proof.

Consider a bipartite graph with () if . Modify to obtain by removing vertices from and duplicating each vertex as many times as its capacity lower bound, i.e., times. We then show that there exists a matching such that every vertex in is matched. By Hall’s theorem, we need to prove that