Local-Search based Approximation Algorithms for Mobile Facility Location Problems
Abstract
We consider the mobile facility location (MFL) problem. We are given a set of facilities and clients located in a common metric space. The goal is to move each facility from its initial location to a destination (in the metric space) and assign each client to the destination of some facility so as to minimize the sum of the movement costs of the facilities and the client-assignment costs. This abstracts facility-location settings where one has the flexibility of moving facilities from their current locations to other destinations so as to serve clients more efficiently by reducing their assignment costs.
We give the first local-search based approximation algorithm for this problem and achieve the best-known approximation guarantee. Our main result is a (3+ε)-approximation for this problem, for any constant ε > 0, using local search. The previous best guarantee for MFL was an 8-approximation algorithm due to [13] based on LP-rounding. Our guarantee matches the best-known approximation guarantee for the k-median problem. Since there is an approximation-preserving reduction from the k-median problem to MFL, any improvement of our result would imply an analogous improvement for the k-median problem. Furthermore, our analysis is tight (up to o(1) factors) since the tight example for the local-search based 3-approximation algorithm for k-median can be easily adapted to show that our local-search algorithm has a tight approximation ratio of 3. One of the chief novelties of the analysis is that in order to generate a suitable collection of local-search moves whose resulting inequalities yield the desired bound on the cost of a local optimum, we define a tree-like structure that (loosely speaking) functions as a "recursion tree", using which we spawn off local-search moves by exploring this tree to a constant depth. Our results extend to the weighted generalization wherein each facility i has a nonnegative weight w_i and the movement cost for i is w_i times the distance traveled by i.
1 Introduction
Facility location problems have been widely studied in the Operations Research and Computer Science communities (see, e.g., [25] and the survey [20]), and have a wide range of applications. In its simplest version, uncapacitated facility location (UFL), we are given a set of facilities or service-providers with opening costs, and a set of clients that require service, and we want to open some facilities and assign clients to open facilities so as to minimize the sum of the facility-opening and client-assignment costs. An oft-cited prototypical example is that of a company wanting to decide where to locate its warehouses/distribution centers so as to serve its customers in a cost-effective manner.
We consider facility-location problems that abstract settings where facilities are mobile and may be relocated to destinations near the clients in order to serve them more efficiently by reducing the client-assignment costs. More precisely, we consider the mobile facility location (MFL) problem introduced by [11, 13], which generalizes the classical k-median problem (see below). We are given a complete graph G = (V, E) with costs {c_{uv}} on the edges, a set D ⊆ V of clients with each client j having d_j units of demand, and a set F ⊆ V of initial facility locations. We use the term facility i to denote the facility whose initial location is i ∈ F. A solution S to MFL moves each facility i to a final location s_i ∈ V (which could be the same as i), incurring a movement cost c(i, s_i), and assigns each client j to a final location in S = {s_i : i ∈ F}, incurring assignment cost equal to d_j times the distance to that location. The total cost of S is the sum of all the movement costs and assignment costs. More formally, noting that each client will be assigned to the location nearest to it in S, we can express the cost of S as

cost(S) = Σ_{i ∈ F} c(i, s_i) + Σ_{j ∈ D} d_j · c(j, s(j)),

where (for any node v) s(v) gives the location in S nearest to v (breaking ties arbitrarily). We assume throughout that the edge costs form a metric. We use the terms nodes and locations interchangeably.
Mobile facility location falls into the genre of movement problems introduced by Demaine et al. [11]. In these problems, we are given an initial configuration in a weighted graph specified by placing "pebbles" on the nodes and/or edges; the goal is to move the pebbles so as to obtain a desired final configuration while minimizing the maximum, or total, pebble movement. MFL was introduced by Demaine et al. as the movement problem where facility- and client-pebbles are placed respectively at the initial locations of the facilities and clients, and in the final configuration every client-pebble should be co-located with some facility-pebble.
Our results.
We give the first local-search based approximation algorithm for this problem and achieve the best-known approximation guarantee. Our main result is a (3+ε)-approximation for this problem, for any constant ε > 0, using a simple local-search algorithm. This improves upon the previous best 8-approximation guarantee for MFL due to Friggstad and Salavatipour [13], which is based on LP-rounding and is not combinatorial.
The local-search algorithm we consider is quite natural and simple. Observe that given the final locations of the facilities, we can find the minimum-cost way of moving facilities from their initial locations to the final locations by solving a minimum-cost perfect-matching problem (and the client assignments are determined by the nearest-location function defined above). Thus, we concentrate on determining a good set of final locations. In our local-search algorithm, at each step, we are allowed to swap in and swap out a fixed number (say p) of locations. Clearly, for any fixed p, we can find the best local move efficiently (since the cost of a set of final locations can be computed in polytime). Note that we do not impose any constraints on how the matching between the initial and final locations may change due to a local move, and a local move might entail moving all facilities. It is important to allow this flexibility, as it is known [13] that the local-search procedure that moves, at each step, a constant number of facilities to chosen destinations has an unbounded approximation ratio.
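To make the evaluation step concrete, here is a minimal sketch (with illustrative names of our own choosing, not part of the original presentation) of how the cost of a candidate set of final locations can be computed: the movement cost is a minimum-cost perfect matching between initial and final locations (solved by brute force over permutations here, which suffices for tiny instances), and each client is assigned to its nearest final location (unit demands assumed).

```python
from itertools import permutations

def solution_cost(dist, facilities, clients, final_locs):
    """Total cost of a candidate set of final locations: the minimum-cost
    perfect matching of initial facility locations to final_locs (found by
    brute force here, fine for tiny instances), plus each client's distance
    to its nearest final location (unit demands)."""
    movement = min(
        sum(dist[i][s] for i, s in zip(facilities, perm))
        for perm in permutations(final_locs)
    )
    assignment = sum(min(dist[j][s] for s in final_locs) for j in clients)
    return movement + assignment
```

For realistic instance sizes the brute-force matching would be replaced by a polynomial-time algorithm such as the Hungarian method; the point here is only that the cost of any fixed set of final locations is computable in polynomial time.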
Our main contribution is a tight analysis of this local-search algorithm (Section 4). Our guarantee matches (up to o(1) terms) the best-known approximation guarantee for the k-median problem. Since there is an approximation-preserving reduction from the k-median problem to MFL [13]—choose arbitrary initial facility locations and give each client a huge demand—any improvement of our result would imply an analogous improvement for the k-median problem. (In this respect, our result is a noteworthy exception to the prevalent state of affairs for various other generalizations of UFL and k-median—e.g., the data placement problem [4], {matroid, red-blue} median [22, 16, 9, 6], k-facility location [12, 15]—where the best approximation ratio for the problem is worse by a noticeable factor (compared to UFL or k-median); [14] is another exception.) Furthermore, our analysis is tight (up to o(1) factors) because by suitably setting the demands in the reduction of [13], we can ensure that our local-search algorithm for MFL coincides with the local-search algorithm for k-median in [3], which has a tight approximation ratio of 3.
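The reduction just mentioned is simple enough to state as code. The sketch below is our own illustrative rendering: `huge` is a stand-in for a sufficiently large demand value, and the initial facility locations are chosen arbitrarily, exactly as the text describes.

```python
def kmedian_to_mfl(clients, k, huge=10**6):
    """Sketch of the approximation-preserving reduction from k-median to
    MFL noted in the text: pick k arbitrary initial facility locations and
    give every client a huge demand, so assignment costs dominate and the
    optimal final locations form an optimal k-median solution."""
    facilities = list(range(k))            # arbitrary initial locations
    demands = {j: huge for j in clients}   # huge demands dwarf movement costs
    return facilities, demands
```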
We also consider a weighted generalization of the problem (Section 5), wherein each facility i has a weight w_i indicating the cost incurred per unit distance moved, so the cost for moving i to s_i is w_i · c(i, s_i). (This can be used to model, for example, the setting where different facilities move at different speeds.) Our analysis is versatile and extends to this weighted generalization to yield the same performance guarantee. For the further generalization of the problem, where the facility-movement costs may be arbitrary and unrelated to the client-assignment costs (for which a 9-approximation can be obtained via LP-rounding; see "Related work"), we show that local search based on multiple swaps has a bad approximation ratio (Section 7).
The analysis leading to the approximation ratio of 3 (as also the simpler analysis in Section 3 yielding a 5-approximation) crucially exploits the fact that we may swap multiple locations in a local-search move. It is natural to wonder, then, whether one can prove any performance guarantees for the local-search algorithm where we may only swap in and swap out a single location in a local move. (Naturally, the single-swap algorithm is easier to implement and thus may be more practical.) In Section 6, we analyze this single-swap algorithm and prove that it also has a constant approximation ratio.
Our techniques.
The analysis of our local-search procedure requires various novel ideas. As is common in the analysis of local-search algorithms, we identify a set of test swaps and use local optimality to generate suitable inequalities from these test swaps, which when combined yield the stated performance guarantee. One of the difficulties involved in adapting standard local-search ideas to MFL is the following artifact: in MFL, the cost of "opening" a set S of final locations is the cost of the min-cost perfect matching of the initial locations to S, which, unlike in other facility-location problems, is a highly non-additive function of S (and, as mentioned above, we need to allow for the matching of initial locations to S to change in non-local ways). In most facility-location problems with opening costs for which local search is known to work, we may always swap in a facility used by the global optimum (by possibly swapping out another facility) and easily bound the resulting change in facility cost, and the main consideration is to decide how to reassign clients following the swap in a cost-effective way; in MFL we do not have this flexibility and need to carefully choose how to swap facilities so as to ensure that there is a good matching of the facilities to their new destinations after a swap and a frugal reassignment of clients.
This leads us to consider long relocation paths to rematch facilities to their new destinations after a swap. Such a path is of the form (s_{i_1}, s*_{i_1}, s_{i_2}, s*_{i_2}, ...), where s_i and s*_i are the locations that facility i is moved to in the local and global optimum respectively, and s_{i_{t+1}} is the location in the local optimum closest to s*_{i_t}. By considering a swap move involving the start and end locations of such a path P, we can obtain a bound on the movement cost of all facilities i on P where s_i is the start of the path or s_i serves a large number of clients. To account for the remaining facilities, we break up P into suitable intervals, each containing a constant number of unaccounted locations, which then participate in a multi-location swap. This interval-swap move does not at first appear to be useful, since we can only bound the cost change due to this move in terms of a significant multiple of (a portion of) the cost of the local optimum! One of the novelties of our analysis is to show how we can amortize the cost of such expensive terms and make their contribution negligible by considering multiple different ways of covering P with intervals and averaging the inequalities obtained for these interval swaps. These ideas lead to the proof of an approximation ratio of 5 for the local-search algorithm (Section 3).
The tighter analysis leading to the 3-approximation guarantee (Section 4) features another noteworthy idea, namely that of using "recursion" (up to bounded depth) to identify a suitable collection of test swaps. We consider the tree-like structure created by the paths used in the 5-approximation analysis, and (loosely speaking) view this as a recursion tree, using which we spawn off interval-swap moves by exploring this tree to a constant depth. To our knowledge, no prior analysis of a local-search algorithm employs the idea of recursion to generate the set of test local moves (used to generate the inequalities that yield the desired performance guarantee). We believe that this technique is a notable contribution to the analysis of local-search algorithms that is of independent interest and will find further application.
Related work.
As mentioned earlier, MFL was introduced by Demaine et al. [11] in the context of movement problems. Friggstad and Salavatipour [13] designed the first approximation algorithm for MFL. They gave an 8-approximation algorithm based on LP-rounding, building upon the LP-rounding algorithm of Charikar et al. [8] for the k-median problem; however, this algorithm works only for the unweighted case. They also observed that there is an approximation-preserving reduction from k-median to MFL. We recently learned that Halper [17] proposed the same local-search algorithm that we analyze. His work focuses on experimental results and leaves open the question of obtaining theoretical guarantees on the performance of local search.
Chakrabarty and Swamy [6] observed that MFL, even with arbitrary movement costs, is a special case of the matroid median problem [22]. Thus, the approximation algorithms devised for matroid median independently by [9] and [6] yield an 8-approximation algorithm for MFL with arbitrary movement costs.
There is a wealth of literature on approximation algorithms for (metric) uncapacitated and capacitated facility location (UFL and CFL), the k-median problem, and their variants; see [27] for a survey on UFL. Whereas constant-factor approximation algorithms for UFL and k-median can be obtained via a variety of techniques such as LP-rounding [28, 23, 8, 9], primal-dual methods [18, 19], and local search [21, 7, 3], all known approximation algorithms for CFL (in its full generality) are based on local search [21, 30, 5]. We now briefly survey the work on local-search algorithms for facility-location problems.
Starting with the work of [21], local-search techniques have been utilized to devise approximation algorithms for various facility-location problems. Korupolu, Plaxton, and Rajaraman [21] devised constant-factor approximation algorithms for UFL, for CFL with uniform capacities, and for k-median (with a blowup in the number of medians). Charikar and Guha [7], and Arya et al. [3] both obtained constant-factor approximations for UFL. The first constant-factor approximation for CFL was obtained by Pál, Tardos, and Wexler [26], and after some improvements, the current-best approximation ratio now stands at 5 [5]. For the special case of uniform capacities, the analysis in [21] was refined by [10], and Aggarwal et al. [1] obtain the current-best 3-approximation. Arya et al. [3] devised a (3+2/p)-approximation algorithm for k-median using at most p swaps, which was also the first constant-factor approximation algorithm for this problem based on local search. Gupta and Tangwongsan [15] (among other results) simplified the analysis in [3]. We build upon some of their ideas in our analysis.
Local-search algorithms with constant approximation ratios have also been devised for various variants of the above three canonical problems. Mahdian and Pál [24], and Svitkina and Tardos [29] consider settings where the opening cost of a facility is a function of the set of clients served by it. In [24], this cost is a nondecreasing function of the number of clients, and in [29] this cost arises from a certain tree defined on the client set. Devanur et al. [12] and [15] consider k-facility location, which is similar to k-median except that facilities also have opening costs. Hajiaghayi et al. [16] consider a special case of the matroid median problem that they call the red-blue median problem. Most recently, [14] considered a problem that they call the k-median forest problem, which generalizes k-median, and obtained a (3+ε)-approximation algorithm.
2 The local-search algorithm
As mentioned earlier, to compute a solution to MFL, we only need to determine the set of final locations of the facilities, since we can then efficiently compute the best movement of facilities from their initial to final locations, and the client assignments. This motivates the following local-search operation. Given a current set S of locations, we can move to any other set S′ of locations such that |S \ S′| ≤ p (and |S′| = |S|), where p is some fixed value. We denote this move by S → S′. The local-search algorithm starts with an arbitrary set of final locations. At each iteration, we choose the local-search move that yields the largest reduction in total cost and update our final-location set accordingly; if no cost-improving move exists, then we terminate. (To obtain polynomial running time, as is standard, we modify the above procedure so that we choose a local-search move only if the cost reduction is at least an inverse-polynomial fraction of the current cost.)
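The procedure just described can be sketched as follows. This is a simplified rendering with names of our own choosing: `cost` is assumed to evaluate a set of final locations as described in the introduction, and the acceptance threshold is one standard way to enforce polynomial termination.

```python
from itertools import combinations

def local_search(nodes, cost, init_locs, p=1, eps=0.25):
    """Best-improvement local search over sets of final locations:
    repeatedly try swapping out at most p current locations for the same
    number of new ones, accept the best move if it improves the cost by an
    inverse-polynomial fraction (the standard termination rule), else stop."""
    S = set(init_locs)
    n = len(nodes)
    while True:
        cur = cost(S)
        best, best_cost = None, cur
        for t in range(1, p + 1):
            for out in combinations(sorted(S), t):
                for inn in combinations([v for v in nodes if v not in S], t):
                    cand = (S - set(out)) | set(inn)
                    c = cost(cand)
                    if c < best_cost:
                        best, best_cost = cand, c
        # accept only significantly improving moves (polynomial termination)
        if best is None or best_cost > (1 - eps / (n ** 2)) * cur:
            return S
        S = best
```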
3 Analysis leading to a 5-approximation
We now analyze the above local-search algorithm and show that it is a 5-approximation algorithm (up to o(1) terms). For notational simplicity, we assume that the local-search algorithm terminates at a local optimum; the modification to ensure polynomial running time degrades the approximation by at most a (1+o(1)) factor (see also Remark 3.8).
Theorem 3.1
Let M* and A* denote respectively the movement and assignment cost of an optimal solution. The total cost of any local optimum is at most 5(M* + A*), up to o(1) terms.
Although this is not the tightest guarantee that we obtain, we present this analysis first since it introduces many of the ideas that we build upon in Section 4 to prove a tight approximation guarantee of 3 (up to o(1) terms) for the local-search algorithm. For notational simplicity, we assume that all demands d_j are 1. All our analyses carry over trivially to the case of non-unit (integer) demands since we can think of a client having demand d_j as d_j co-located unit-demand clients.
Notation and preliminaries.
We use S = {s_i : i ∈ F} to denote the local optimum, where facility i is moved to final location s_i. We use S* = {s*_i : i ∈ F} to denote the (globally) optimal solution, where again facility i is moved to s*_i. Throughout, we use s to index locations in S, and o to index locations in S*. Recall that, for a node v, s(v) is the location in S nearest to v. Similarly, we define o(v) to be the location in S* nearest to v. For notational similarity with facility-location problems, we denote c(i, s_i) by M_i, and c(i, s*_i) by M*_i. (Thus, M_i and M*_i are the movement costs of facility i in S and S* respectively.) Also, we abbreviate c(j, s(j)) to A_j, and c(j, o(j)) to A*_j. Thus, A_j and A*_j are the assignment costs of client j in the local and global optimum respectively. (So the cost of the local optimum is Σ_i M_i + Σ_j A_j.) Let D(s) be the set of clients assigned to the location s ∈ S in the local optimum, and D*(o) the set of clients assigned to o ∈ S* in the global optimum. For a set T of clients, we define A(T) = Σ_{j∈T} A_j; we define A*(T) similarly. Define cap(s) = {o ∈ S* : s(o) = s}. We say that s captures all the locations in cap(s). The following basic lemma will be used repeatedly.
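The quantities above can be computed directly from a distance matrix. The sketch below (function name and representation are our own illustrative choices; locations are indices into `dist`) computes the per-client assignment costs in the local optimum `S` and the global optimum `Sstar`, and, for each local location, the optimal locations it "captures", i.e., those whose nearest local location it is.

```python
def derived_quantities(dist, clients, S, Sstar):
    """Per-client assignment costs in the local optimum S and the global
    optimum Sstar, plus, for each local location s, the optimal locations
    captured by s (those whose nearest location in S is s)."""
    nearest = lambda v, T: min(T, key=lambda s: dist[v][s])
    A = {j: dist[j][nearest(j, S)] for j in clients}          # local assignment costs
    Astar = {j: dist[j][nearest(j, Sstar)] for j in clients}  # optimal assignment costs
    cap = {s: [o for o in Sstar if nearest(o, S) == s] for s in S}
    return A, Astar, cap
```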
Lemma 3.2
For any client j, we have c(j, s(o(j))) ≤ 2A*_j + A_j.
Proof.
Let s = s(o(j)). The lemma clearly holds if s = s(j). Otherwise, c(j, s) ≤ c(j, o(j)) + c(o(j), s) ≤ A*_j + c(o(j), s(j)) ≤ A*_j + c(o(j), j) + c(j, s(j)) = 2A*_j + A_j, where the second inequality follows since s is the closest location to o(j) in S. ∎
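In symbols, with s(v) and o(v) denoting the locations nearest to v in the local and global optimum respectively, and A_j, A*_j the corresponding assignment costs of client j, the chain of inequalities in the proof can be written as:

```latex
\begin{align*}
c\bigl(j,\, s(o(j))\bigr)
  &\le c\bigl(j,\, o(j)\bigr) + c\bigl(o(j),\, s(o(j))\bigr)
      && \text{(triangle inequality)}\\
  &\le c\bigl(j,\, o(j)\bigr) + c\bigl(o(j),\, s(j)\bigr)
      && \text{($s(o(j))$ is the closest location to $o(j)$ in $S$)}\\
  &\le c\bigl(j,\, o(j)\bigr) + c\bigl(o(j),\, j\bigr) + c\bigl(j,\, s(j)\bigr)
      && \text{(triangle inequality)}\\
  &= 2A^*_j + A_j .
\end{align*}
```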
To prove the approximation ratio, we will specify a set of local-search moves for the local optimum, and use the fact that none of these moves improves the cost to obtain some inequalities, which will together yield a bound on the cost of the local optimum. We describe these moves by using the following digraph. Consider the digraph whose nodes are the triples (i, s_i, s*_i) for i ∈ F, with an arc from the triple of facility i to the triple of facility i′ whenever s_{i′} = s(s*_i). We decompose this digraph into a collection of node-disjoint (simple) paths and cycles as follows. Repeatedly, while there is a cycle Z in our current digraph, we add Z to the collection, remove all the nodes of Z, and recurse on the remaining digraph. After this step, every node in the remaining digraph, which is acyclic, has at most one incoming and at most one outgoing arc. Now we repeatedly choose a node with no incoming arcs, include the maximal path starting at that node in the collection, remove all nodes of this path, and recurse on the remaining digraph. Thus, each triple (i, s_i, s*_i) is on a unique path or cycle in the collection. For a triple with an incoming arc, we refer to the triple at the tail of that arc as its predecessor.
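The decomposition into paths and cycles can be carried out mechanically once each node's unique successor is known. The sketch below is our own illustrative code: each triple is represented by a single node id, with `succ[v]` its successor (or `None` if it has no outgoing arc); cycles are peeled off first, and maximal paths are then walked from the remaining sources, mirroring the construction above.

```python
def decompose(succ):
    """Decompose a digraph where every node has at most one outgoing arc
    (succ[v] is v's successor, or None) into node-disjoint cycles plus
    maximal paths: peel off cycles first, then repeatedly walk a maximal
    path from a node with no incoming arc among the remaining nodes."""
    remaining = set(succ)
    cycles, paths = [], []
    # Phase 1: repeatedly find and remove cycles.
    for v in list(succ):
        if v not in remaining:
            continue
        walk, index = [], {}
        u = v
        while u is not None and u in remaining and u not in index:
            index[u] = len(walk)
            walk.append(u)
            u = succ[u]
        if u is not None and u in index:        # the walk closed a cycle
            cycle = walk[index[u]:]
            cycles.append(cycle)
            remaining -= set(cycle)
    # Phase 2: the rest is acyclic; walk maximal paths from sources.
    has_incoming = {succ[v] for v in remaining if succ[v] in remaining}
    for v in sorted(remaining - has_incoming):
        if v not in remaining:
            continue
        path = [v]
        while succ[path[-1]] is not None and succ[path[-1]] in remaining \
                and succ[path[-1]] not in path:
            path.append(succ[path[-1]])
        paths.append(path)
        remaining -= set(path)
    return cycles, paths
```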
We will use and to define our swaps. For a path , define to be and to be . Notice that . For each , let , , and . Note that for any with . For a set , define .
A basic building block in our analysis involves a shift along a subpath of some path or cycle in the collection. This means that we swap out the first location of the subpath and swap in the last. We bound the cost of the resulting matching by rematching each facility on the subpath to the next final location along it. Thus, we obtain the following simple bound on the increase in movement cost due to this operation:
(1) 
The last inequality uses the fact that for all . For a path , we use as a shorthand for .
3.1 The swaps used, and their analysis
We now describe the local moves used in the analysis. We define a set of swaps such that each is swapped in to an extent of at least one, and at most two. We classify each location in as one of three types. Define . We assume that .

: locations with .

: locations with or .

: locations with and .
Also define (so iff and ).
To gain some intuition, notice that it is easy to generate a suitable inequality for a location : we can “delete” (i.e., if , then do ) and reassign each to (i.e., the location in closest to the location serving in ). The cost increase due to this reassignment is at most , and so this yields the inequality . (We do not actually do this since we take care of the locations along with the locations.) We can also generate a suitable inequality for a location (see Lemma 3.4) since we can swap in and swap out . The cost increase by this move can be bounded by and , and the latter quantity can be charged to ; our definition of is tailored precisely so as to enable this latter charging argument. Generating inequalities for the locations is more involved, and requires another building block that we call an interval swap (this will also take care of the locations), which we define after proving Lemma 3.4. We start out by proving a simple bound that one can obtain using a cycle in .
Lemma 3.3
For any cycle , we have .
Proof.
Consider the following matching of the facilities to their final locations: we match each facility on the cycle to the next final location along the cycle. The cost of this new matching should be at least that of the original matching, since the latter is the min-cost way of matching the facilities to S. Rearranging the resulting inequality yields the claim. ∎
Lemma 3.4
Let and , and consider . We have
(2) 
Proof.
We can view this multi-location swap as doing for each and simultaneously. (Notice that no path contains , since .) For each the movement-cost increase is bounded by . For we move the facility , where , to , so the increase in movement cost is at most for every . So since , we have . Thus, the increase in total movement cost is at most
We upper bound the change in assignment cost by reassigning the clients in as follows. We reassign each to . Each is assigned to , if , and otherwise to . Note that : since , and since . The change in assignment cost for each such client is at most by Lemma 3.2. Thus the change in total assignment cost is at most . Combining this with the bound on the movementcost change proves the lemma. ∎
We now define a key ingredient of our analysis, called an interval-swap operation, that is used to bound the movement cost of the  and locations and the assignment cost of the clients they serve. (We build upon this in Section 4 to give a tighter analysis proving a 3-approximation.) Let be a subset of at most locations on a path or cycle in , where is the next location in after . Let where for and is an arbitrary location that appears after (and before ) on the corresponding path or cycle. Consider each . If , choose a random path with probability , and set and . If , set , and . Set and . Note that since for every . Notice that is a random set, but is deterministic. To avoid cumbersome notation, we use to refer to the distribution of swap moves that results by the random choices above, and call this the interval swap corresponding to and . We bound the expected change in cost due to this move below. Let be the indicator function that is 1 if and 0 otherwise.
Lemma 3.5
Let and be as given above. Let , where and if . Consider the interval swap corresponding to and , as defined above. We have
(3) 
Proof.
Let be the path in or cycle in such that .
We first bound the increase in movement cost. The interval swap can be viewed as a collection of simultaneous moves. If for a random path , the movement-cost increase can be broken into two parts. We do a shift along , but move the last initial location on to , and then do shift on from to . So the expected movement-cost change is at most
which is at most . Similarly, if , we can break the movement-cost increase into for all and . Thus, the total increase in movement cost is at most
(4) 
Next, we bound the change in assignment cost by reassigning clients in as follows. We assign each client to . If , then . For every client , observe that either or . To see this, let and . If then ; also , and so . So we assign to if and to otherwise; the change in assignment cost of is at most (Lemma 3.2).
Now suppose , so . For each , we again have or , and we assign to if and to otherwise. We assign every to (recall that ), and overestimate the resulting change in assignment cost by . Finally, note that we reassign a client with probability at most (since with probability at most ). So taking into account all cases, we can bound the change in total assignment cost by
(5) 
In (5), we are double-counting clients in . We are also overestimating the change in assignment cost of a client since we include both the term, and the or terms. Adding (4) and (5) yields the lemma. ∎
Notice that Lemma 3.4 immediately translates to a bound on the assignment cost of the clients in for . In contrast, it is quite unclear how Lemma 3.5 may be useful, since the expression in the RHS of (3) may be as large as (but no more since if ) and it is unclear how to cancel the contribution of on the RHS. One of the novelties of our analysis is that we show how to amortize such expensive terms and make their contribution negligible by considering multiple interval swaps. We cover each path or cycle in different ways using intervals comprising consecutive locations from . We then argue that averaging, over these covering ways, the inequalities obtained from the corresponding interval swaps yields (among other things) a good bound on the movement cost of the locations on and the assignment cost of the clients they serve.
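As a small illustration of the covering idea (with a function name of our own choosing and `ell` denoting the interval length), the sketch below generates the `ell` shifted ways of partitioning a path into intervals of at most `ell` consecutive locations. Every location is covered exactly once per cover, and each location starts a full interval in exactly one of the covers, which is what makes the averaging argument possible.

```python
def interval_covers(path, ell):
    """The ell shifted ways of partitioning a path into intervals of at
    most ell consecutive locations: the r-th cover starts its first full
    interval at offset r, preceded by a shorter stub. Each location is
    covered once per cover, and begins a full interval in exactly one
    cover (the one with r = its index mod ell)."""
    covers = []
    for r in range(ell):
        cover = []
        if r:
            cover.append(path[:r])          # short stub before first full interval
        for t in range(r, len(path), ell):
            cover.append(path[t:t + ell])
        covers.append(cover)
    return covers
```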
Lemma 3.6
Let , , where is the next location on after , and . Let if and otherwise. For ,
(6) 
Proof.
We first define formally an interval of (at most) consecutive locations along . As before, let for . For a path , define for and for . Also define for and for . If is a cycle, we let our indices wrap around and be , i.e., for all (so ).
For , define to be an interval of length at most on . Define . Note that we have if is a path, and if is a cycle. Consider the collection of intervals, . For each , where , we consider the interval swap corresponding to . We add the inequalities (3) for all such . Since each participates in exactly such inequalities, and each is the start of only the interval , we obtain the following.
(7) 
Notice that the locations other than on th