Probabilistic Analysis of Facility Location on Random Shortest Path Metrics
(An extended abstract of this work will appear in the Proceedings of the 15th Conference on Computability in Europe (CiE).)
The facility location problem is an NP-hard optimization problem. Therefore, approximation algorithms are often used to solve large instances. Such algorithms often perform much better than worst-case analysis suggests, and probabilistic analysis is a widely used tool to explain this behavior. Most research on the probabilistic analysis of NP-hard optimization problems involving metric spaces, such as the facility location problem, has focused on Euclidean instances; instances with independent (random) edge lengths, which are non-metric, have also been studied. We would like to extend this knowledge to other, more general, metrics.
We investigate the facility location problem using random shortest path metrics. We analyze some probabilistic properties for a simple greedy heuristic which gives a solution to the facility location problem: opening the cheapest facilities (with only depending on the facility opening costs). If the facility opening costs are such that is not too large, then we show that this heuristic is asymptotically optimal. On the other hand, for large values of , the analysis becomes more difficult, and we provide a closed-form expression as upper bound for the expected approximation ratio. In the special case where all facility opening costs are equal this closed-form expression reduces to or or even if the opening costs are sufficiently small.
Large-scale combinatorial optimization problems, such as the facility location problem, show up in many applications. These problems become computationally intractable as the instances grow. This issue is often tackled, successfully, by using approximation algorithms or ad-hoc heuristics. In practical situations these often simple heuristics perform remarkably well, even though theoretical results about them are far more pessimistic.
Over the last decades, probabilistic analysis has become an important tool to explain this difference. One of the main challenges here is to come up with a good probabilistic model for generating instances: this model should reflect realistic instances, but it should also be sufficiently simple in order to make the probabilistic analysis possible.
Until recently, in almost all cases either instances with independent edge lengths, or instances with Euclidean distances have been used for this purpose [1, 7]. These models are indeed sufficiently simple, but they have shortcomings with respect to reflecting realistic instances: realistic instances are often metric, although not Euclidean, and the independent edge lengths do not even yield a metric space.
In order to overcome this, Bringmann et al. [3] used the following model for generating random metric spaces, which had been proposed by Karp and Steele. Given an undirected complete graph, start by drawing random edge weights for each edge independently and then define the distance between any two vertices as the total weight of the shortest path between them, measured with respect to the random weights. Bringmann et al. called this model random shortest path metrics. This model is also known as first-passage percolation, introduced by Hammersley and Welsh as a model for fluid flow through a (random) porous medium [8, 10].
1.1 Related Work
Although a lot of studies have been conducted on random shortest path metrics, or first-passage percolation (e.g. [5, 9, 11]), systematic research of the behavior of (simple) heuristics and approximation algorithms for optimization problems on random shortest path metrics was initiated only recently by Bringmann et al. [3]. They provide some structural properties of random shortest path metrics, including the existence of a good clustering. These properties are then used for a probabilistic analysis of simple algorithms for several optimization problems, including the minimum-weight perfect matching problem and the $k$-median problem.
For the facility location problem, several sophisticated polynomial-time approximation algorithms exist, the best one currently having a worst-case approximation ratio of . Flaxman et al. conducted a probabilistic analysis for the facility location problem using Euclidean distances . They expected to show that some polynomial-time approximation algorithms would be asymptotically optimal under these circumstances, but found out that this is not the case. On the other hand, they described a trivial heuristic which is asymptotically optimal in the Euclidean model.
1.2 Our Results
This paper aims at extending our knowledge about the probabilistic behavior of (simple) heuristics and approximation algorithms for optimization problems using random shortest path metrics. We will do so by investigating the probabilistic properties of a rather simple heuristic for the facility location problem, which opens the $\kappa$ cheapest facilities (breaking ties arbitrarily), where $\kappa$ only depends on the facility opening costs. Due to the simple structure of this heuristic, our results are more structural than algorithmic in nature.
We show that this heuristic yields a approximation ratio in expectation if the facility opening costs are such that . For the analysis becomes more difficult, and we provide a closed-form expression as upper bound for the expected approximation ratio. We will also show that this closed-form expression is if all facility opening costs are equal. This can be improved to or even when the facility opening costs are sufficiently small. Note that we will focus on the expected approximation ratio and not on the ratio of expectations, since a disadvantage of the latter is that it does not directly compare the performance of the heuristic on specific instances.
We start by giving a mathematical description of random shortest path metrics and the facility location problem (Section 2). After that, we introduce our simple heuristic properly and have a brief look at its behavior (Section 3). Then we present some general technical results (Section 4) and two different bounds for the optimal solution (Section 5) that we will use to prove our main results in Section 6. We conclude with some final remarks (Section 7).
2 Notation and Model
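In this paper, we use $X \sim P$ to denote that a random variable $X$ is distributed according to a probability distribution $P$. We use $\mathrm{Exp}(\lambda)$ to denote the exponential distribution with parameter $\lambda$. In particular, we use $X \sim \sum_{i=1}^{n} \mathrm{Exp}(\lambda_i)$ to denote that $X$ is the sum of $n$ independent exponentially distributed random variables with parameters $\lambda_1, \ldots, \lambda_n$. If $\lambda_1 = \ldots = \lambda_n = \lambda$, then $X$ is a Gamma distributed random variable with parameters $n$ and $\lambda$, denoted by $X \sim \Gamma(n, \lambda)$.
For $n \in \mathbb{N}$, we use $[n]$ as shorthand notation for $\{1, \ldots, n\}$. If $X_1, \ldots, X_m$ are random variables, then $X_{(1)}, \ldots, X_{(m)}$ are the order statistics corresponding to $X_1, \ldots, X_m$ if $X_{(i)}$ is the $i$th smallest value among $X_1, \ldots, X_m$ for all $i \in [m]$. Furthermore we use $H_n$ as shorthand notation for the $n$th harmonic number, i.e., $H_n = \sum_{i=1}^{n} 1/i$. Finally, if a random variable $X$ is stochastically dominated by a random variable $Y$, i.e., we have $F_X(x) \geq F_Y(x)$ for all $x$ (where $X \sim F_X$ and $Y \sim F_Y$), we denote this by $X \precsim Y$.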
Random Shortest Path Metrics.
Given an undirected complete graph $G = (V, E)$ on $n$ vertices, we construct the corresponding random shortest path metric as follows. First, for each edge $e \in E$, we draw a random edge weight $w(e)$ independently from an exponential distribution with parameter 1. (Exponential distributions are technically easiest to handle due to their memorylessness property. A continuous, non-negative probability distribution of a random variable $X$ is said to be memoryless if and only if $\mathbb{P}(X > s + t \mid X > t) = \mathbb{P}(X > s)$ for all $s, t \geq 0$ [17, p. 294].) Given these random edge weights $w(e)$, the distance $d(u, v)$ between each pair of vertices $u, v \in V$ is defined as the minimum total weight of a $u,v$-path in $G$. Note that this definition yields the following properties: $d(v, v) = 0$ for all $v \in V$, $d(u, v) = d(v, u)$ for all $u, v \in V$, and $d(u, v) \leq d(u, s) + d(s, v)$ for all $u, s, v \in V$. We call the complete graph with distances $d$ obtained from this process a random shortest path metric.
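The construction above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `random_shortest_path_metric` and the use of the Floyd-Warshall algorithm to compute shortest paths are choices made for this sketch, not part of the model's definition.

```python
import random

def random_shortest_path_metric(n, rng):
    """Return an n x n distance matrix of a random shortest path metric."""
    # Draw an independent Exp(1) weight for every edge of the complete graph.
    d = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            d[u][v] = d[v][u] = rng.expovariate(1.0)
    # Floyd-Warshall: replace each edge weight by the shortest-path length.
    for s in range(n):
        for u in range(n):
            for v in range(n):
                if d[u][s] + d[s][v] < d[u][v]:
                    d[u][v] = d[u][s] + d[s][v]
    return d

d = random_shortest_path_metric(6, random.Random(0))
```

The resulting matrix has a zero diagonal, is symmetric, and satisfies the triangle inequality up to floating-point rounding.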
Facility Location Problem.
We consider the (uncapacitated) facility location problem, in which we are given a complete undirected graph $G = (V, E)$ on $n$ vertices, distances $d(u, v)$ between each pair of vertices $u, v \in V$, and opening costs $f(v)$ for each vertex $v \in V$. In this paper, the distances are randomly generated, according to the random shortest path metric described above. Moreover, w.l.o.g. we assume that the vertices $v_1, \ldots, v_n$ are numbered in such a way that the opening costs satisfy $f(v_1) \leq f(v_2) \leq \ldots \leq f(v_n)$ and we assume that these costs are predetermined, independent of the random edge weights. We will use $f_i$ as a shorthand notation for $f(v_i)$. Additionally, we assume that the ratios between the opening costs are polynomially bounded, i.e., we assume $f_n / f_1 \leq n^{q}$ for some constant $q$ as $n \to \infty$.
The goal of the facility location problem is to find a nonempty subset $U \subseteq V$ such that the total cost $c(U) := f(U) + \sum_{v \in V} \min_{u \in U} d(u, v)$ is minimal, where $f(U) = \sum_{u \in U} f(u)$ denotes the total opening cost of all facilities in $U$. This problem is NP-hard. We use $\mathsf{OPT}$ to denote the total cost of an optimal solution, i.e.,
$$\mathsf{OPT} = \min_{\emptyset \neq U \subseteq V} c(U).$$
One of the tools we use in our proofs in Section 6 involves fixing the number of facilities that has to be opened. We use $\mathsf{OPT}_k$ to denote the total cost of the best solution to the facility location problem with the additional constraint that exactly $k$ facilities need to be opened, i.e.,
$$\mathsf{OPT}_k = \min_{U \subseteq V,\ |U| = k} c(U).$$
Note that $\mathsf{OPT} = \min_{k \in [n]} \mathsf{OPT}_k$ by these definitions.
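For very small instances, these definitions can be evaluated directly by exhaustive search. The sketch below assumes a distance matrix `d` and a list of opening costs `f` indexed by vertex; the names `total_cost`, `opt_k` and `opt` are hypothetical helpers introduced here for illustration, and the enumeration is exponential in the number of vertices.

```python
from itertools import combinations

def total_cost(d, f, opened):
    """Opening costs of `opened` plus every vertex's distance to its nearest open facility."""
    opening = sum(f[u] for u in opened)
    connection = sum(min(d[u][v] for u in opened) for v in range(len(d)))
    return opening + connection

def opt_k(d, f, k):
    """Best total cost when exactly k facilities are opened (exhaustive search)."""
    return min(total_cost(d, f, U) for U in combinations(range(len(d)), k))

def opt(d, f):
    """Unconstrained optimum, obtained as the minimum over all k."""
    return min(opt_k(d, f, k) for k in range(1, len(d) + 1))

# Toy metric: a path 0 - 1 - 2 with unit edge lengths.
d = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
f = [0.5, 0.5, 3.0]
print(opt_k(d, f, 1), opt_k(d, f, 2), opt(d, f))  # prints 2.5 2.0 2.0
```

On this toy instance a single facility is best placed at the middle vertex, while opening the two cheap facilities gives the overall optimum.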
3 A simple heuristic and some of its properties
In this paper we are interested in a rather simple heuristic that only takes the facility opening costs into account when determining which facilities to open, independently of the metric space. Define $\kappa := \max\{i \in [n] : f_i < 1/(i-1)\}$ (reading $1/0$ as $\infty$). Then our heuristic opens the $\kappa$ cheapest facilities (breaking ties arbitrarily). Note that in the special case where all opening costs are the same, i.e. $f_1 = \ldots = f_n = f$, this corresponds to $\kappa = \min\{n, \lceil 1/f \rceil\}$.
This rather particular value of $\kappa$ originates from the following intuitive argument. Based on the results of Bringmann et al. [3, Lemma 5.1] (see below) we know that the expected cost of the solution that opens the $k$ cheapest facilities is given by $\sum_{i=1}^{k} f_i + H_{n-1} - H_{k-1}$. This convex function decreases as long as $k$ satisfies $f_k < 1/(k-1)$. Therefore, at least intuitively, the value of $\kappa$ that we use is likely to provide a relatively ‘good’ solution.
We will show that this is indeed the case. Our main result will be split into two parts, based on the actual value of . If (i.e. if there are ‘many’ relatively expensive facilities), then we will show that our simple heuristic is asymptotically optimal for any polynomially bounded opening costs (that satisfy ). On the other hand, if , then the analysis becomes more difficult, and we will only provide a closed-form expression that can be used to determine an upper bound for the expected approximation ratio. We will show that this expression yields an approximation ratio in the special case with , and or even if is sufficiently small.
Throughout the remainder of this paper we will use $\mathsf{ALG}$ to denote the value of the solution provided by this heuristic.
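The heuristic itself is short enough to sketch directly. In the sketch below the number of facilities to open is passed in as a parameter `k`, so the code does not commit to any particular rule for deriving it from the opening costs; the function name `cheapest_k_heuristic` and the tie-breaking by vertex index are assumptions of this illustration.

```python
def cheapest_k_heuristic(d, f, k):
    """Open the k cheapest facilities, ignoring the metric; return (opened, cost)."""
    # Sort vertices by opening cost; the sort is stable, so ties keep index order.
    opened = sorted(range(len(f)), key=lambda v: f[v])[:k]
    opening = sum(f[u] for u in opened)
    # Every vertex connects to its nearest open facility.
    connection = sum(min(d[u][v] for u in opened) for v in range(len(d)))
    return opened, opening + connection

# Toy metric: a path 0 - 1 - 2 with unit edge lengths.
d = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
f = [3.0, 0.5, 0.5]
opened, cost = cheapest_k_heuristic(d, f, 2)
```

Note that the choice of facilities never inspects `d`; the metric only enters when the cost of the chosen solution is evaluated.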
Probability distribution of $\mathsf{ALG}$.
In this section we derive the probability distribution of the value of the solution provided by our simple greedy heuristic, $\mathsf{ALG}$, and derive its expectation.
If , then denotes the cost of the solution which opens a facility at every vertex . So, we have , and, in particular, .
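If $\kappa < n$, then the distribution of $\mathsf{ALG}$ is less trivial. In this case, the total opening costs of the $\kappa$ cheapest facilities are given by $\sum_{i=1}^{\kappa} f_i$, whereas the distribution of the connection costs is known and given by $\sum_{i=\kappa}^{n-1} \mathrm{Exp}(i)$ [3, Sect. 5]. This results in $\mathsf{ALG} \sim \sum_{i=1}^{\kappa} f_i + \sum_{i=\kappa}^{n-1} \mathrm{Exp}(i)$.
Using this probability distribution, we can derive the expected value of $\mathsf{ALG}$. If $\kappa = n$, then it follows trivially that $\mathbb{E}[\mathsf{ALG}] = \sum_{i=1}^{n} f_i$. If $\kappa < n$, then we have
$$\mathbb{E}[\mathsf{ALG}] = \sum_{i=1}^{\kappa} f_i + \sum_{i=\kappa}^{n-1} \frac{1}{i} = \sum_{i=1}^{\kappa} f_i + H_{n-1} - H_{\kappa-1}.$$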
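The expected connection costs can be checked empirically with a small Monte Carlo experiment. The sketch below rests on an assumption: that the connection costs of $k$ facilities chosen independently of the metric are distributed as $\sum_{i=k}^{n-1} \mathrm{Exp}(i)$, which is our reading of [3, Sect. 5], so that their expectation is $H_{n-1} - H_{k-1}$. The parameter values, the seed and the helper names are arbitrary choices for illustration.

```python
import random

def connection_cost(n, k, rng):
    """Connection cost when vertices 0..k-1 are open in a freshly drawn random metric."""
    d = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            d[u][v] = d[v][u] = rng.expovariate(1.0)
    for s in range(n):  # Floyd-Warshall shortest paths
        for u in range(n):
            for v in range(n):
                if d[u][s] + d[s][v] < d[u][v]:
                    d[u][v] = d[u][s] + d[s][v]
    return sum(min(d[u][v] for u in range(k)) for v in range(n))

def harmonic(m):
    """The m-th harmonic number (0 for m = 0)."""
    return sum(1.0 / i for i in range(1, m + 1))

n, k, trials = 8, 3, 2000
rng = random.Random(0)
estimate = sum(connection_cost(n, k, rng) for _ in range(trials)) / trials
predicted = harmonic(n - 1) - harmonic(k - 1)
print(estimate, predicted)
```

Because the open facilities are fixed before the metric is drawn, opening vertices `0..k-1` is equivalent here to opening the cheapest ones.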
4 Technical observations
In this section we present some technical lemmas that are used in the proofs of our theorems in Section 6. These lemmas do not provide new structural insights, but are nonetheless very helpful for our proofs.
First of all, we will use the Cauchy-Schwarz inequality to bound the expected approximation ratio of our simple greedy heuristic. For general random variables $X$, $Y$, this inequality states that $|\mathbb{E}[XY]| \leq \sqrt{\mathbb{E}[X^2] \cdot \mathbb{E}[Y^2]}$.
Secondly, we will bound a sum of exponentially distributed random variables by a Gamma distributed random variable. The following lemma enables us to do so.
Lemma 1 ([18, Ex. 1.A.24]).
Let independently, . Moreover, let independently, . Then we have
We will use the following upper bound for the expectation of the maximum of a number of (dependent) random variables.
Lemma 2 ([2, Thm. 2.1]).
Let be a sequence of random variables, each with finite mean and variance. Then it follows that
Let independently, , and let be the order statistics corresponding to . Then, for any ,
where independently, and where “” means equal distribution.
A special case of Rényi’s representation is given by the following corollary.
Let independently, , and let be the order statistics corresponding to . Then, for any ,
Let independently. Using Lemma 3 it follows immediately that
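Rényi's representation can be sanity-checked numerically. The sketch below uses the standard textbook form for unit-rate exponentials, namely that the $i$th smallest of $m$ independent $\mathrm{Exp}(1)$ variables has the same distribution as $\sum_{j=1}^{i} Z_j/(m-j+1)$ with independent $Z_j \sim \mathrm{Exp}(1)$; that this matches the exact formulation used above is an assumption, and only the means are compared.

```python
import random

def order_statistic_mean(m, i, trials, rng):
    """Empirical mean of the i-th smallest of m independent Exp(1) variables."""
    total = 0.0
    for _ in range(trials):
        sample = sorted(rng.expovariate(1.0) for _ in range(m))
        total += sample[i - 1]
    return total / trials

def renyi_mean(m, i):
    """Mean of sum_{j=1}^{i} Z_j / (m - j + 1) for independent Z_j ~ Exp(1)."""
    return sum(1.0 / (m - j + 1) for j in range(1, i + 1))

rng = random.Random(0)
m, i = 5, 3
empirical = order_statistic_mean(m, i, 20000, rng)
print(empirical, renyi_mean(m, i))
```

For $m = 5$ and $i = 3$ the predicted mean is $1/5 + 1/4 + 1/3$, and the empirical value should agree up to sampling error.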
Moreover, we use the following bound for the expected value of the ratio for two dependent nonnegative variables and , conditioned on the event that is relatively small.
Let and be two arbitrary nonnegative random variables and assume that for some . Then, for any that satisfies , we have
The expected value on the left-hand side can be computed and bounded as follows:
Observe that implies or . This observation yields
Since , the second integral vanishes, which leaves us with the desired result.∎
5 Bounds for the optimal solution
Not much is known about the distribution of the value of the optimal solution, , and about the distributions of . Therefore, in this section we derive two bounds for these optimal solutions which we can use in Section 6.
We start with an upper bound for the cumulative distribution function of that works well for relatively small values of (i.e. values close to ).
Let and define . Then, for any given opening costs , we have
Let denote the number of open facilities in the optimal solution (if there are multiple optimal solutions, pick one arbitrarily). If , then we know that for some . Since these cases are disjoint, we can condition as follows:
Recall that is the total opening cost of all facilities in . Using the union bound, we can derive that
since and for all with .
Let for and let denote the corresponding order statistics. Then, using Rényi’s representation (see Corollary 4), we can derive that
Again using the union bound, it follows that
By combining the results above, the desired result follows now immediately.∎
Using the result of Lemma 1 we can also derive a stochastic lower bound for .
Let . Then we have .
If the number of open facilities in a solution is fixed to be , then the total opening costs of the optimal solution are trivially lower bounded by . Moreover, the total connection costs in this case are lower bounded by the total length of the shortest edges in the metric. This in turn can be lower bounded by the total weight of the lightest edge weights used to generate the metric.
Let denote the sum of the lightest edge weights. Since all edge weights are independent and standard exponentially distributed, we have . Using the memorylessness property of the exponential distribution, it follows that , i.e., the second lightest edge weight is equal to the lightest edge weight plus the minimum of standard exponentially distributed random variables. In general, we get . This yields
where the stochastic dominance follows from Lemma 1 by observing that
where the inequality follows from applying the well-known inequality . The desired result follows now immediately.∎
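The step from the lightest edge weights to a sum of independent exponentials, used in the proof above, can be spelled out as follows. This is a sketch in assumed notation: $m$ stands for the number of edges, $r$ for the number of lightest edge weights that are summed, $X_{(j)}$ for the $j$th lightest weight, and $E_i$ for the independent gaps between consecutive order statistics; none of these names are taken from the text.

```latex
\begin{align*}
X_{(j)} &= \sum_{i=0}^{j-1} E_i,
  \qquad E_i \sim \mathrm{Exp}(m-i) \text{ independently},\\
S = \sum_{j=1}^{r} X_{(j)}
  &= \sum_{i=0}^{r-1} (r-i)\, E_i
  \sim \sum_{i=0}^{r-1} \mathrm{Exp}\!\left(\frac{m-i}{r-i}\right).
\end{align*}
```

The first line is the memorylessness argument: after the $i$ lightest weights have been revealed, the next gap is the minimum of $m-i$ standard exponentials. The second line counts that the gap $E_i$ appears in $X_{(j)}$ for every $j > i$, i.e. $r-i$ times, and uses that $c \cdot \mathrm{Exp}(\lambda) \sim \mathrm{Exp}(\lambda / c)$.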
6 Main results
In this section we present our main results. We show that our simple heuristic is asymptotically optimal if (Theorem 8), and we provide a closed-form expression as an upper bound for the expected approximation ratio if (Theorem 16). Finally we will evaluate this expression for the special case where .
Define and assume that . Let denote the total cost of the solution which opens, independently of the metric space, the cheapest facilities (breaking ties arbitrarily), i.e., the facilities with opening costs . Then, it follows that
In order to prove this theorem, we consider the following three cases for the opening cost of the cheapest facility:
and as ;
We start with the rather straightforward proof of Case 3.
Proof of Theorem 8 (Case 3).
For sufficiently large , we have , and thus since . Therefore, using our observations in Section 3, we can derive that for sufficiently large . Moreover, we know that . Using this observation, it follows that
which finishes the proof of this case.∎
In order to prove Case 1 of Theorem 8 we need the following two lemmas.
Let as . For sufficiently large we have
We start by providing a bound for the cumulative distribution function of . Let . By our observations in Section 3 we know this distribution, and since , we can bound it as follows
Now, let independently, , and let denote the corresponding order statistics. Using Rényi’s representation (see Corollary 4), we can now rewrite the last probability as follows:
Applying a union bound to this result, we obtain that
Note that the last inequality becomes an equality whenever .
We can use this result to bound the given integral as follows:
where we used to bound the binomial coefficient. It remains to be shown that . To do so, we start by claiming that the following inequality holds for sufficiently large :
To see this, observe that for sufficiently large we have , and (in all three cases since ). Rearranging the inequality, we get
Upon exponentiation of both sides we obtain the desired result, which finishes this proof.∎
Let be a constant such that , let , take and assume that . For sufficiently large , and for any integer with , we have
Let be sufficiently large. Since is an increasing function of whenever , it follows that
where we also used for all . Next, define and recall that (by construction) for all and thus . Using this, we can see that , from which it follows that and , where the last inequality follows from the definition of . Applying this, we obtain
Since , we have as and , implying . This gives us
Observe that the dominant term between the brackets on the right-hand side is given by , implying that this factor becomes less than whenever is sufficiently large. So, we obtain that
since is a constant. Combining this with the well-known inequality (for ), it follows that
From this inequality, we immediately get
On the other hand, since , and , it also follows that
where the last inequality follows since and .
Combining the two results above yields the desired inequality.∎
Proof of Theorem 8 (Case 1).
Let be sufficiently large. By definition of , it follows that and thus whenever . If , then we have as . So, in any case we have . Now, by our observations in Section 3 we know that . Set and observe that .
Conditioning on the events and yields
We start by bounding the second part. Applying Lemma 5, with , , and , we get
where . The terms of the summation can be bounded by Lemma 10. Using this lemma, we obtain that
since by definition. Moreover, since implies as , we also have as for some constant . This results in
Since we started with and (since ), it follows that
which finishes the proof of this case.∎
In order to prove Case 2 of Theorem 8 we need the following five lemmas.
Let and be two arbitrary events. Then we have .
Let and denote the indicator variables corresponding to and , respectively. Then it follows that and . From this, we deduce that and . Moreover, we can see that . Now, combining this knowledge with the variance-bound for the covariance, we derive
Since , it follows that , which finishes this proof.∎
Lemma 12 ([3, Lemma 3.2]).
Let . Then for any .
Suppose that . Set to shorten notation. Then, for sufficiently large we have
Let be sufficiently large. We start by applying the change of variables