Multiple Traveling Salesmen in Asymmetric Metrics

We consider some generalizations of the Asymmetric Traveling Salesman Path problem. Suppose we have an asymmetric metric with two distinguished nodes . We are also given a positive integer . The goal is to find paths of minimum total cost from to whose union spans all nodes. We call this the -Person Asymmetric Traveling Salesmen Path problem (-ATSPP). Our main result for -ATSPP is a bicriteria approximation that, for some parameter we may choose, finds between and paths of total length times the optimum value of an LP relaxation based on the Held-Karp relaxation for the Traveling Salesman problem. On one extreme this is an -approximation that uses up to paths and on the other it is an -approximation that uses exactly paths.

Next, we consider the case where we have pairs of nodes . The goal is to find an path for every pair such that each node of lies on at least one of these paths. Simple approximation algorithms are presented for the special cases where the metric is symmetric or where for each . We also show that the problem can be approximated within a factor when . On the other hand, we demonstrate that the general problem cannot be approximated within any bounded ratio unless P = NP.

## 1 Introduction

We consider generalizations of the metric Traveling Salesman problem. In most of our settings we are given a complete directed graph with nodes. There are nonnegative arc costs satisfying the directed triangle inequality for any distinct . In general, for so we call such graphs asymmetric metrics. Some well-studied Traveling Salesman variants in asymmetric metrics are to find the minimum cost Hamiltonian cycle or minimum cost Hamiltonian path where some of the endpoints of the path may be specified in advance. All of these problems are NP-hard to approximate within small constant factors [27].

We will mainly study problems concerning finding multiple paths in asymmetric metrics whose union covers all nodes while minimizing the total cost of these paths. This generalizes the Asymmetric Traveling Salesman Path problem (ATSPP). Formally, define -ATSPP to be the following problem. Given an asymmetric metric with arc costs and two distinct nodes , we want to find paths from to in such that every node lies on at least one of these paths. Finally, we define General -ATSPP to be the following generalization of -ATSPP. We are given pairs of nodes in an asymmetric metric and we are to find an path for each so each lies on at least one of these paths. Again, the goal is to find such paths of minimum total cost.

One thing that makes -ATSPP an attractive variant of ATSPP is that the gap between optimum solutions for different values of in an asymmetric metric may be arbitrarily large. For example, the instance in Figure 1 has a solution of cost 0 using 2 paths but any single path has cost at least 1. One way to think about this is that it might be efficient to hire additional salesmen to cover the locations in an asymmetric metric. On the other hand, we do not have this large gap in symmetric metrics (i.e. when for all ) because a single salesman can cover all paths by traveling back and forth between and and cover all locations with no greater cost (if is even then one final step from to makes it an path while adding an extra to the cost).

### 1.1 Related Work

In the well-studied Traveling Salesman problem (TSP), the goal is to find a Hamiltonian cycle of minimum total edge cost in a symmetric metric. A classic result of Christofides [9] is a polynomial-time algorithm for TSP that finds a Hamiltonian cycle with cost at most times the cost of the optimum solution. Hoogeveen [18] adapted this algorithm to the problem of finding minimum cost Hamiltonian paths. He obtains a -approximation if at most one endpoint is fixed in advance and a -approximation if both endpoints are fixed in advance. Recently, An, Kleinberg, and Shmoys have improved the approximability of the case when both endpoints are fixed to [2].

In asymmetric metrics, Frieze, Galbiati, and Maffioli [12] gave the first approximation algorithm for ATSP with an approximation ratio of where . A series of papers improved on this ratio by constant factors [5, 20, 10] with the last being . Finally, Asadpour et al. [3] produced an asymptotically better approximation algorithm for ATSP with ratio .

The variant of finding Hamiltonian paths in asymmetric metrics, namely ATSPP, has only recently been studied from the perspective of approximation algorithms. The first approximation algorithm was an -approximation by Lam and Newman [22]. Following this, Chekuri and Pal [8] brought the ratio down to . Finally, Feige and Singh [10] proved that an -approximation for ATSP implies a -approximation for ATSPP for any constant . Combining their result with the recent ATSP algorithm in [3] yields an -approximation for ATSPP.

There is a linear programming (LP) relaxation for each of these problems based on the Held-Karp relaxation for TSP [17]. For TSP, this relaxation is:

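$$
\begin{array}{lll}
\text{minimize:} & \sum_{uv \in E} d_{uv} x_{uv} & \\
\text{such that:} & x(\delta(v)) = 2 & \forall\, v \in V \\
 & x(\delta(S)) \ge 2 & \forall\, \emptyset \subsetneq S \subsetneq V \\
 & x_{uv} \in [0,1] & \forall\, uv \in E
\end{array}
$$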
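To make the roles of the degree and cut constraints concrete, the following Python sketch checks a candidate vector against this relaxation by brute-force enumeration of all cuts. The function names `held_karp_feasible` and `cycle`, and the encoding of edges as `frozenset` keys, are illustrative assumptions rather than notation from the literature. The enumeration is exponential in the number of nodes, so it is only meant for tiny instances.

```python
from itertools import combinations

def held_karp_feasible(n, x, eps=1e-9):
    """Check x (dict: frozenset({u, v}) -> value) against the Held-Karp constraints.

    Nodes are 0..n-1. Brute force over all cuts, so only sensible for small n.
    """
    def cut_value(S):
        # Total x-value of edges with exactly one endpoint in S.
        return sum(val for e, val in x.items() if len(e & S) == 1)

    # Bounds: 0 <= x_uv <= 1.
    if any(val < -eps or val > 1 + eps for val in x.values()):
        return False
    # Degree constraints: x(delta(v)) = 2 for every node v.
    if any(abs(cut_value(frozenset([v])) - 2) > eps for v in range(n)):
        return False
    # Cut constraints: x(delta(S)) >= 2 for every nonempty proper subset S.
    for size in range(1, n):
        for S in combinations(range(n), size):
            if cut_value(frozenset(S)) < 2 - eps:
                return False
    return True

def cycle(nodes):
    """Incidence vector of the cycle visiting `nodes` in the given order."""
    return {frozenset((nodes[i], nodes[(i + 1) % len(nodes)])): 1.0
            for i in range(len(nodes))}

hamiltonian = cycle([0, 1, 2, 3, 4, 5])
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}
```

In this sketch the Hamiltonian cycle passes every check, while the union of two disjoint triangles satisfies all degree constraints but violates the cut constraint for the node set of either triangle.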

Many of the approximation algorithms mentioned above also bound the integrality gap of the respective Held-Karp LP relaxation. For TSP, Wolsey [32] proved the solutions found by Christofides’ algorithm [9] are within 3/2 of the optimal solution to the above LP relaxation. For ATSP, Williamson [31] proved that the algorithm of Frieze et al. [12] bounds the integrality gap of its respective LP by . The -approximation for ATSP in [3] improved the bound on the gap to the same ratio. For TSP paths, An, Kleinberg and Shmoys [2] first showed that Hoogeveen’s algorithm bounds the integrality gap of a Held-Karp type relaxation for TSP paths by in cases where both endpoints are fixed. In the same paper they argue that their -approximation for this case also bounds the integrality gap by the same factor.

Nagarajan and Ravi [26] first showed that the integrality gap of an LP relaxation for ATSPP, which is the same as LP (1) in this paper when , was . Later Friggstad, Salavatipour, and Svitkina [13] showed a bound of in the integrality gap of this LP relaxation which is currently the best bound. We note that the result of Feige and Singh in [10] that relates the approximability of ATSP and ATSPP does not extend to their integrality gaps in any obvious way.

In the full version of [13], the authors studied extensions of their -approximation for ATSPP to -ATSPP. They demonstrated that -ATSPP can be approximated within and that this bounds the integrality gap of LP (1) by the same factor. Though not stated explicitly, their techniques can also be used to devise a bicriteria approximation for -ATSPP that uses paths of total cost at most times the value of LP (1) in a manner similar to the algorithm in the proof of Theorem 1.3 in [13]. As far as we know, no results are known for General -ATSPP even for the case .

One other problem related to the ones we consider is the following. We are given distinct nodes and in a symmetric metric. We want to find paths whose union spans all nodes. The paths must be such that each node in is the start node of exactly one path and each node in is the end node of exactly one path. Matroid intersection techniques used by Rathinam and Sengupta [28] can be easily adapted to approximate this problem within a factor 2.

### 1.2 Our Results

By the directed triangle inequality, it is easy to see that there is an optimum solution for an instance of -ATSPP where each node in lies on precisely one of the paths. Such an optimum solution corresponds to an integer point in LP (1) of the same cost. So the optimum value of LP (1), say , is a lower bound for the minimum cost -ATSPP solution. Our main result for -ATSPP is the following.

###### Theorem 1.1

For any integer , there is an efficient algorithm for -ATSPP that finds between and paths of total cost at most .

This is an -approximation using precisely paths when and an -approximation using at most paths when .

The algorithm is also easy to implement with the most complicated subroutine being that of finding a minimum weight perfect matching in a bipartite graph. Its running time can easily be seen to be where is the time it takes to compute such a matching in a complete bipartite graph with nodes on each side.

We then proceed to study variants of -ATSPP that vary how the start and/or end locations are specified. Examples are when the start locations are not fixed or when we have a set of start nodes and a set of end nodes and the start and end locations of the paths should establish a bijection between and . We extend our approximation algorithm for -ATSPP to these variants.

Finally, we study General -ATSPP. Our first result is an approximation algorithm for General 2-ATSPP. We also have a 3-approximation for General -ATSPP in symmetric metrics and an -approximation for General -ATSPP when for all . However, we have the following hardness result for General -ATSPP with no further restrictions.

###### Theorem 1.2

It is NP-hard to distinguish between instances of General -ATSPP whose optimum solution has cost 0 and instances whose optimum solution has cost at least 1.

This implies the problem cannot be efficiently approximated within any bounded ratio unless P = NP. While the reduction uses , modifications can be made to prove hardness results (under stronger assumptions) for values of being as small as polylogarithmic in .

To summarize, Section 2 presents the algorithm for -ATSPP, proves Theorem 1.1, and discusses some variations of -ATSPP on how the start and/or end locations are specified. In Section 3 we demonstrate an -approximation for General 2-ATSPP, discuss approximation algorithms for other restrictions of General -ATSPP, and prove Theorem 1.2. Section 4 then concludes this paper by identifying some directions for future work and mentioning some basic results for the alternative goal of minimizing the cost of the most expensive path in either -ATSPP or General -ATSPP.

## 2 A Bicriteria Approximation for k-ATSPP

In this section, we will develop a bicriteria approximation algorithm that finds approximately paths from to in an asymmetric metric whose total cost is within some bounded ratio of the optimum value of LP relaxation (1). The algorithm is parameterized by a positive integer ; different bicriteria approximation guarantees result from different choices of .

### 2.1 Preliminaries

If is a flow between two nodes or a circulation then we let denote the value that assigns to arc . For we let and . For brevity, we let and for . We say is integral if is an integer for each arc . The cost of is . All flows and circulations in this paper will have for each arc .

Our starting point will be to use structures similar to cycle covers from [12] and path/cycle covers from [22] and [13].

###### Definition 2.1

A -path/cycle cover from to is an integral flow such that for each , , and .

Note that in a -path/cycle cover the flow across any arc is either 0 or 1 and . If we regard as a multiset of arcs, then may be decomposed into paths from to and a collection of cycles where every lies on exactly one path or exactly one cycle. We can efficiently find a minimum-cost -path/cycle cover using a standard reduction to minimum weight perfect matching in a bipartite graph with nodes on each side.
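The reduction to bipartite matching can be sketched as follows. We assume the usual construction: the left side holds one "tail" copy of every node other than the two endpoints plus several copies of the start node, the right side holds one "head" copy of every such node plus the same number of copies of the end node, and matching a tail to a head selects that arc. The function name `min_path_cycle_cover` and the variable names `s`, `t`, `k`, `d` are assumptions of this sketch, as is the choice to allow a direct arc from the start node to the end node. For brevity the perfect matching is found by enumerating permutations, which stands in for a polynomial-time matching algorithm and is only usable on tiny instances.

```python
from itertools import permutations

def min_path_cycle_cover(d, s, t, k):
    """Return (cost, arcs) of a cheapest cover with k paths from s to t plus cycles.

    d is a square matrix of arc costs. Brute force over perfect matchings.
    """
    inner = [v for v in range(len(d)) if v != s and v != t]
    tails = [s] * k + inner   # every tail gets exactly one outgoing arc
    heads = [t] * k + inner   # every head gets exactly one incoming arc
    best_cost, best_arcs = float("inf"), None
    for perm in permutations(range(len(heads))):
        arcs = [(tails[i], heads[j]) for i, j in enumerate(perm)]
        if any(u == v for u, v in arcs):
            continue          # a node may not be matched to itself
        cost = sum(d[u][v] for u, v in arcs)
        if cost < best_cost:
            best_cost, best_arcs = cost, arcs
    return best_cost, best_arcs

# Four nodes: s = 0, t = 3, inner nodes 1 and 2 joined by cost-0 arcs both ways.
d = [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [1, 0, 0, 1],
     [1, 1, 1, 0]]
cost, arcs = min_path_cycle_cover(d, s=0, t=3, k=1)
```

On this instance the cheapest cover uses the direct arc from node 0 to node 3 together with the cycle on nodes 1 and 2, for total cost 0, whereas any single path through all four nodes costs 2. This is exactly the kind of cycle that the main loop of the algorithm has to deal with.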

LP (1) is the LP relaxation for -ATSPP we consider. It is similar to the LP relaxation for ATSPP considered in [26] and [13].

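$$
\begin{array}{llll}
\text{minimize:} & \sum_{uv \in A} d_{uv} x_{uv} & & (1) \\
\text{subject to:} & x(\delta^+(v)) = x(\delta^-(v)) = 1 & \forall\, v \in V - \{s,t\} & \\
 & x(\delta^+(s)) = x(\delta^-(t)) = k & & \\
 & x(\delta^-(s)) = x(\delta^+(t)) = 0 & & \\
 & x(\delta^+(S)) \ge 1 & \forall\, \{s\} \subseteq S \subsetneq V & (2) \\
 & x_{uv} \ge 0 & \forall\, uv \in A &
\end{array}
$$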

We will break the presentation of the algorithm into two parts. The first is a loop that runs for iterations. Each iteration will find the cheapest -path/cycle cover on the remaining nodes and discards some nodes from the cycles or, more generally, circulations formed by taking the union of the current -path/cycle cover and paths from previous iterations. These discarded nodes can be added to the final solution by using the circulations to “graft” them into the paths. It is similar to the main loop in the algorithm for ATSPP in [13].

After this first phase we will have paths of cost from to whose union is acyclic and covers all remaining nodes. However, our goal is to use at most paths. The second part of the algorithm will assemble only paths that cover all remaining nodes using arcs from the paths. This will be possible because we will carefully select which nodes to discard in each iteration so that each remaining node lies in almost a -fraction of the paths after the main loop. So, if we regard our paths as a flow then each node supports almost a -fraction of this flow. If we scale the flow by then we have a flow sending units from to where every other remaining node supports units of this flow.

We will show how to round this flow to obtain an acyclic integral flow of roughly the same cost that sends between and units of flow from to so that each remaining node supports exactly 1 unit of this flow. Then a path decomposition of this acyclic integral flow yields the required paths. Finally, we graft the nodes that were discarded in the first phase to these paths using the circulations that were removed in the first phase.
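The path decomposition step mentioned above is standard. A minimal Python sketch, assuming the flow is given as a dictionary from arcs to nonnegative integers, is acyclic, and is conserved at every node other than the two endpoints, could look as follows. The name `decompose_into_paths` and the dictionary encoding are assumptions of this sketch.

```python
from collections import defaultdict

def decompose_into_paths(flow, s, t):
    """Split an acyclic integral s-t flow into a list of s-t paths.

    flow maps arcs (u, v) to nonnegative integers and must be conserved
    at every node other than s and t.
    """
    residual = defaultdict(dict)
    for (u, v), f in flow.items():
        if f > 0:
            residual[u][v] = f
    paths = []
    while residual[s]:                   # some unit of flow still leaves s
        path, u = [s], s
        while u != t:
            v = next(iter(residual[u]))  # follow any arc still carrying flow
            residual[u][v] -= 1
            if residual[u][v] == 0:
                del residual[u][v]
            path.append(v)
            u = v
        paths.append(path)
    return paths

flow = {("s", "a"): 1, ("a", "b"): 1, ("b", "t"): 1,
        ("s", "c"): 1, ("c", "t"): 1}
paths = decompose_into_paths(flow, "s", "t")
```

Conservation guarantees that every walk started at the first endpoint can be continued until it reaches the second, and acyclicity guarantees that it terminates. In the example every intermediate node carries exactly one unit of flow, so each lies on exactly one of the two resulting paths.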

### 2.2 The Algorithm

Let be an integer; this is the in the statement of Theorem 1.1. For notational convenience, we will let be for the remainder of this section. Algorithm 1 is the -ATSPP bicriteria approximation.

The proof of the following lemma is simple and is found in Appendix A.1.

###### Lemma 2.2

In Step 22, is an integral circulation whose support is strongly connected in .

We assume, for now, that Step 20 works as stated. If so, we can prove the following combinatorial analog of Theorem 1.1 that compares the cost of the solution to the optimum -ATSPP cost solution rather than the value of LP (1).

###### Theorem 2.3

Each node lies on exactly one of the paths returned by Algorithm 1 and the total cost of these paths is at most .

###### Proof.

Since is a circulation whose support is strongly connected in , then every node in is visited by the Eulerian circuit. Since the shortcutting did not bypass around either or , then the arc still appears exactly times in and there are no or arcs in for any . So, after removing the occurrences of the arc from , we have a collection of precisely paths from to whose union visits all nodes.

All that is left is to bound the final cost by . We prove, by induction on , that the cost of after iteration is at most . For (before the first iteration) this is clear and we now assume that and that the cost of just before the ’th iteration is at most .

Let be the subset of nodes in the ’th iteration. We can obtain a feasible -path/cycle cover on of cost at most by shortcutting an optimum -ATSPP solution on around nodes in . Thus, the minimum cost -path/cycle cover on has cost at most . After adding to we have that the cost of is, by induction, at most . The rest of the body of the loop simply moves flow between and and shortcuts some flow so the cost of is still bounded by at the end of the ’th iteration.

The cost of the circulation in Step 22 is then at most plus the cost of . Shortcutting past nodes in the Eulerian circuit yields an circuit whose total cost is still at most plus the cost of . The paths are formed by removing the arcs from to which have cost exactly the cost of . Thus, the cost of the returned paths is at most . ∎

### 2.3 Finding the Flow P in Step 20

Consider the acyclic integral flow after the main loop terminates. It is possible to argue that our choice of in Step 11 implies each has so a path decomposition of yields a collection of paths whose union covers all nodes. However, the number of paths is which is much larger than our desired value . The fact that we can find fewer paths is essentially due to the fact that every supports a lot of flow in .

The main object of concern in this step is the following polytope where . In , we have a variable for every arc in the subgraph induced by . The full description of is:

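$$
\begin{array}{rcllr}
z(\delta^+(w)) = z(\delta^-(w)) & = & 1 & \forall\, w \in W - \{s,t\} & (3) \\
z(\delta^+(s)) = z(\delta^-(t)) & = & D & & (4) \\
z(\delta^-(s)) = z(\delta^+(t)) & = & 0 & & (5) \\
0 \le z_{uv} & \le & F_{uv} & \forall \text{ ordered pairs } u,v \in V & (6)
\end{array}
$$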

Since the support of is acyclic and the support of is required to be a subset of the support of , then any integral point corresponds to a flow of the form required in Step 20 with . Notice that Constraints (6) imply that the cost of would be at most the cost of . Thus, our goal is to find a value for which has an integer point. The proof of the following property is standard (e.g., [29]).

###### Lemma 2.4

Every basic point in polytope is integral when is an integer.

So, to prove has an integer point for a given integer it suffices to prove that contains any point. That is, for if there is some point with, perhaps, rational coordinates, then there is certainly a point with integer coordinates since is integral.

The following lemmas are proven in essentially the same way as Lemmas 2.3 and 2.5 in the extended abstract of [13], so we omit their proofs.

###### Lemma 2.5

Throughout the course of the algorithm, for every .

###### Lemma 2.6

When the main loop terminates, for each .

We will require that all nodes in support the same amount of flow in . The following lemma shows how to construct such a flow through simple modifications of .

###### Lemma 2.7

For some there is an acyclic integral flow sending units of flow from to where every has . Furthermore, the cost of is at most the cost of .

###### Proof.

Let and recall that by Lemma 2.5. While some has (cf. Lemma 2.6), shortcut past as follows. Choose any two arcs with . Then subtract 1 from both and and add 1 to . Note that remains integral and acyclic after such an operation and that the cost of does not increase by the triangle inequality. We let be this flow once all have . ∎
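The shortcutting operation used in this proof can be sketched in Python as follows. The flow is stored as a dictionary from arcs to integers. The names `shortcut_past`, `through` and `flow_cost`, and the small cost table `d`, are assumptions of this sketch, and the stopping condition (how much flow each node should finally support) is left to the caller.

```python
def shortcut_past(F, v):
    """Reroute one unit of flow u -> v -> w directly along the arc u -> w."""
    u = next(a for (a, b), f in F.items() if b == v and f > 0)
    w = next(b for (a, b), f in F.items() if a == v and f > 0)
    F[(u, v)] -= 1
    F[(v, w)] -= 1
    F[(u, w)] = F.get((u, w), 0) + 1
    return F

def through(F, v):
    """Amount of flow entering node v."""
    return sum(f for (a, b), f in F.items() if b == v)

def flow_cost(F, d):
    """Total cost of the flow F under arc costs d."""
    return sum(f * d[arc] for arc, f in F.items())

# Node "v" supports two units of flow; the costs obey the triangle inequality.
d = {("s", "v"): 1, ("v", "t"): 1, ("s", "t"): 1}
F = {("s", "v"): 2, ("v", "t"): 2}
cost_before = flow_cost(F, d)
shortcut_past(F, "v")
cost_after = flow_cost(F, d)
```

After one call the node `"v"` supports one unit instead of two, the flow is still integral and acyclic, and by the triangle inequality the cost has not increased (here it drops from 4 to 3).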

From now on, we will assume that for every where is some integer at most . The following lemma is the first step to finding a good integer for which . The value may be fractional, but we will deal with that problem later.

###### Lemma 2.8

for some .

###### Proof.

Let and define a point by . It is easy to verify after noting that and implies . ∎

The main problem is that this value may not be an integer. The following lemma fixes this by mapping a point in to a point in by modifying flow along “sawtooth” paths that alternate between following arcs in the support of in the forward and reverse directions. This can be used to decrease both and while preserving at all other nodes. The fact that such a sawtooth path always exists if is not an integer is a consequence of the fact that the total flow through all nodes in is integral. The full proof appears in Appendix A.2.

###### Lemma 2.9

If , then .

###### Corollary 2.10

There is an integer such that . That is, we can find between and paths from to whose union spans all nodes in . Furthermore, the cost of these paths is at most the cost of the flow after the main loop of Algorithm 1.

###### Proof.

Lemmas 2.8 and 2.9 show for some . The result follows since

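$$
k \le \left\lfloor \frac{kL}{L-\gamma} \right\rfloor \le \frac{k(b+1)\lfloor \log_2 n \rfloor}{b \lfloor \log_2 n \rfloor} = k + \frac{k}{b}.
$$

∎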
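As a quick sanity check of this chain, consider a small numeric instance. We assume here, as the displayed inequality suggests, that $L = (b+1)\lfloor \log_2 n \rfloor$ and $0 \le \gamma \le \lfloor \log_2 n \rfloor$; the particular values $n = 16$, $b = 2$, $k = 3$ are chosen only for illustration.

$$
\lfloor \log_2 16 \rfloor = 4, \qquad L = (2+1)\cdot 4 = 12, \qquad
\left\lfloor \frac{kL}{L-\gamma} \right\rfloor \le \left\lfloor \frac{3 \cdot 12}{12 - 4} \right\rfloor = \lfloor 4.5 \rfloor = 4 \le 3 + \frac{3}{2} = k + \frac{k}{b}.
$$

At the other extreme, $\gamma = 0$ gives $\lfloor kL/L \rfloor = 3 = k$, so under these assumptions the number of paths lies between 3 and 4.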

In the next section, we complete the proof of Theorem 1.1 by showing the cost is actually at most times the value of LP relaxation (1).

### 2.4 Bounding the Integrality Gap

For a subset containing both and , let denote LP relaxation (1) on the asymmetric metric induced by . Consider the LP obtained by removing Constraints (2) from . The resulting LP is integral [29] and its integer points correspond to -path/cycle covers of . The following holds because removing constraints from a minimization LP does not increase its value.

###### Lemma 2.11

For any subset containing and , the minimum cost of a -path/cycle cover of is at most the value of .

The proof of the next lemma is the same as the proof for the case in [13]. It uses splitting off techniques developed by Frank [11] and Jackson [19] for Eulerian graphs. The idea is that we obtain an Eulerian graph by multiplying an optimum basic (thus rational) point in LP (1) by a large enough integer and identifying and . Since we are only concerned with preserving cuts for subsets of that include , then using splitting off techniques to bypass nodes in in this Eulerian graph and then scaling back to a fractional point will preserve in the graph induced by while not increasing the cost.

###### Lemma 2.12

For any subset of containing and , the value of is at most the value of .

Now we can complete the proof of Theorem 1.1.

###### Proof of Theorem 1.1.

By Lemmas 2.11 and 2.12, the cost of each -path/cycle cover found in Step (7) is at most . The proof of Theorem 2.3 shows that the final cost of the paths is at most the sum of the costs of each -path/cycle cover found. So, the total cost of the paths found is at most . ∎

### 2.5 Varying the Endpoints

Consider the following ways to specify the start locations of the paths: each path may start at any node (No Source), all paths start at a common node (Common Source), or there are nodes where each must be the start of some path (Multiple Source). We can also consider analogous ways to specify the end locations of the paths. In Multiple Source/Multiple Sink instances, we only require each path start at some and end at some . It may be that some paths start and end at locations with different indices.

The following theorems are easy to verify and the proofs are only briefly sketched. We let be the cost of the optimum solution using exactly paths for the -ATSPP variant in question.

###### Theorem 2.13

For any integer , there is an approximation algorithm for the No Source/Single Sink variant of -ATSPP that finds at most paths of total cost at most .

###### Proof.

Simply add a new start node and set and for every . Then use the approximation algorithm from Theorem 1.1. ∎

###### Theorem 2.14

For any integer , there is an approximation algorithm for the Multiple Source/Single Sink variant of -ATSPP that finds at most paths of total cost at most .

###### Proof.

Let be the multiple sources. Create a start node and other new nodes . Add cost 0 arcs from to and from to for each . Add a cost arc from to for every . Use Theorem 1.1 on the shortest paths metric of this graph. The intermediate nodes are to ensure that each is the start location of some path. ∎

The constructions in Theorems 2.13 and 2.14 can also be combined to handle different start and end location specifications (e.g., No Source/Multiple Sink).

## 3 General k-ATSPP

Recall that in General -ATSPP, we are given pairs of nodes . The goal is to find an path for each so that every lies on at least one such path. This differs from the Multiple Source/Multiple Sink variant described in Section 2.5 since the path at must end at , rather than merely requiring that the start and end nodes of the paths establish a bijection between and .

### 3.1 Approximating General 2-ATSPP

Let and be pairs of nodes we are to connect. Furthermore, suppose all four of these endpoints are distinct (by creating multiple copies of locations if necessary). This means there is an optimum General 2-ATSPP solution where the two paths are vertex disjoint.

Notice that the optimum Multiple Source/Multiple Sink 2-ATSPP solution for the case with sources and sinks is a lower bound for . The first step of the algorithm is to run an -approximation for this variant of 2-ATSPP. This gives us two paths starting at , respectively. If ends at (equivalently, ends at ), then these paths form a valid solution for the General 2-ATSPP problem with cost at most . Otherwise, we have and paths. We use the following lemma which is proven in Appendix A.3. It gives us a relatively cheap way to adjust these paths to get and paths.

###### Lemma 3.1

There are nodes on and on such that the following hold: a) or is an arc on , b) or is an arc on and c) .

The rest of the General 2-ATSPP algorithm proceeds as follows. Try all guesses for where either or is an arc on and either or is an arc on . For each guess, construct a path by traveling along from to , then using the arc in , and then traveling along from to . Construct in a similar manner by traveling from to on , then using the arc in , and then traveling along from to . It is easy to see that and form a feasible solution for this General 2-ATSPP instance. Since each arc on and is traversed at most once by and , then the total cost of and is at most . Output the cheapest solution found over these guesses.

When the algorithm guesses nodes from Lemma 3.1 we have . It is straightforward to verify that each arc in and is used at most once between and . So the final cost of and is at most . To summarize:

###### Theorem 3.2

If there is an -approximation for the Multiple Source/Multiple Sink variant of 2-ATSPP then there is an -approximation for General 2-ATSPP.

By setting and using Theorem 2.14 composed with an analogous result for Multiple Sink instances, Theorem 3.2 gives us the following.

###### Corollary 3.3

There is an -approximation for General 2-ATSPP.

### 3.2 Approximating Other Restrictions of General k-ATSPP

We briefly mention a couple of variants of General -ATSPP that can be approximated well. We leave their full descriptions to Appendix A.4 since they are simple variants of known algorithms for some TSP variants.

The first variant is when the metric is symmetric. In this case, there is a simple 3-approximation using a tree doubling approach. Next, if for each , then the problem is to find a cycle cover where each cycle contains some root node where we allow cost 0 loops on the nodes (corresponding to the salesman at going directly to ). This version can be approximated within using a modification of the ATSP algorithm by Frieze et al. [12].

### 3.3 Hardness of General k-ATSPP

We will use the following NP-complete problem.

###### Definition 3.4

In the Tripartite Triangle Packing problem, we are given a tripartite graph with where no edge in has both endpoints in a common partition , or . A triangle is a subset of 3 nodes for which any two are adjacent in . The problem is to determine if it is possible to find vertex-disjoint triangles in .

NP-completeness of this problem is essentially shown in [14]. Technically, only related problems are proven to be NP-complete in this book. However, consider the 3D Matching problem that is proven to be NP-complete in Theorem 3.2 on page 50. If each triple is replaced with edges then we obtain an instance of Tripartite Triangle Packing. Careful inspection of the proof of Theorem 3.2 of [14] shows that the only triangles that can be formed must have come from some triple in the 3D Matching reduction. Thus, the Tripartite Triangle Packing problem is also NP-complete.

For our reduction to General -ATSPP, let be an instance of Tripartite Triangle Packing with . Create a directed graph with four layers of nodes where and are disjoint copies of , is a copy of , and is a copy of . For every edge in , there is a unique index such that the endpoints of lie in and . Add this arc to , direct it from to , and set its cost to 0. This is illustrated in Figure 2. Set and consider the General -ATSPP instance on obtained from the shortest paths metric where we set the cost of a arc to be 1 if there is no path in . For each , we have a source/sink pair from the copy of in to the copy of in .

The details of the following claim are simple and can be found in Appendix A.5. The proof of Theorem 1.2 immediately follows.

###### Claim 3.5

There is a Tripartite Triangle Packing solution in if and only if the optimum General -ATSPP solution in has cost 0.
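The reduction can be sketched in Python on toy instances. Since the symbols for the layers are not reproduced here, we assume that layers 1 and 4 are the two copies of the first partition `A`, layer 2 is a copy of `B`, and layer 3 is a copy of `C`, and that all three partitions have the same size. The names `build_reduction` and `has_zero_cost_solution` are hypothetical. The second function relies on an observation that should be checked against Appendix A.5: cost-0 hops only move forward through the layers, so with one path per node of `A`, a cost-0 solution must route each path through exactly one node of `B` and one node of `C`.

```python
from itertools import permutations

def build_reduction(A, B, C, edges):
    """Return (cost, pairs) for the four-layer General k-ATSPP instance."""
    E = {frozenset(e) for e in edges}
    layers = {1: A, 2: B, 3: C, 4: A}    # layers 1 and 4 both copy A
    nodes = [(i, x) for i in layers for x in layers[i]]
    # Cost-0 arcs run from layer i to layer i+1 along edges of the input graph.
    arcs = {((i, x), (i + 1, y))
            for i in (1, 2, 3) for x in layers[i] for y in layers[i + 1]
            if frozenset((x, y)) in E}
    # reach[u] = nodes reachable from u by a nonempty directed path.
    reach = {u: set() for u in nodes}
    for i in (3, 2, 1):                  # process layers back to front
        for u, v in arcs:
            if u[0] == i:
                reach[u] |= {v} | reach[v]

    def cost(u, v):
        # Shortest-path metric: 0 if a directed path exists, 1 otherwise.
        return 0 if v in reach[u] else 1

    pairs = [((1, a), (4, a)) for a in A]
    return cost, pairs

def has_zero_cost_solution(A, B, C, cost):
    """Brute force over assignments of one B-node and one C-node to each pair."""
    for bs in permutations(B):
        for cs in permutations(C):
            if all(cost((1, a), (2, b)) == 0 and cost((2, b), (3, c)) == 0
                   and cost((3, c), (4, a)) == 0
                   for a, b, c in zip(A, bs, cs)):
                return True
    return False

A, B, C = ["a1", "a2"], ["b1", "b2"], ["c1", "c2"]
yes_edges = [("a1", "b1"), ("b1", "c1"), ("c1", "a1"),
             ("a2", "b2"), ("b2", "c2"), ("c2", "a2")]
no_edges = yes_edges[:-1] + [("c2", "a1")]   # a2 now has no neighbour in C
yes_cost, yes_pairs = build_reduction(A, B, C, yes_edges)
no_cost, no_pairs = build_reduction(A, B, C, no_edges)
```

The first toy graph consists of two vertex-disjoint triangles and admits a cost-0 solution. In the second, one node of `A` lies in no triangle, so every solution pays at least 1 for some hop.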

Note that the value is in the above reduction. Similar hardness results for smaller values of (as a function of ) are established in Appendix A.6 down to being polylogarithmic in . The complexity of approximating the case when is a constant at least 3 remains open.

## 4 Future Directions

Our best approximation for -ATSPP that uses exactly paths has an approximation guarantee of . Can the dependence on in the approximation ratio be reduced? Perhaps there is an -approximation for -ATSPP that uses only paths. On the other hand, the problem might be hard to approximate much better than . Also, as far as we know the integrality gap of LP (1) could be .

For General -ATSPP, the case is simply ATSPP and we proved an -approximation for . Is there a more general -approximation for General -ATSPP whose running time is polynomial when is a constant?

Finally, rather than minimizing the total cost of all paths we might want to minimize the cost of the most expensive path. This can be thought of as minimizing the total time it takes for agents moving simultaneously to visit all locations. From Theorem 1.1 and the observation that the total cost of paths is at most times the cost of the most expensive path, we get an -approximation for this variant that uses at most paths. Also, Theorems A.5 and A.6 (Appendix A.7) show that approximation algorithms for Directed Orienteering and Directed -Stroll can be used to obtain other bicriteria approximation algorithms. The current best approximation algorithms for Directed Orienteering [7, 25] and Directed -Stroll [4] imply the following:

###### Corollary 4.1

There is an efficient algorithm that finds paths from to of maximum cost whose union covers all nodes in .

###### Corollary 4.2

There is an -approximation that uses at most paths.

We leave it as an open problem to improve these bounds. In particular, is it possible to obtain a polylogarithmic approximation that uses only paths? We also note that the hardness results for General -ATSPP proven in this paper also hold for the variant where we want to minimize the maximum cost of the paths.

## 5 Acknowledgements

The author would like to thank Anupam Gupta, Mohammad R. Salavatipour, and Zoya Svitkina for insightful discussions on these problems.

## References

• [1] H.-C. An and D. Shmoys. LP-based approximation algorithms for traveling salesman path problems. arXiv manuscript number 1105.2391.
• [2] H.-C. An, R. Kleinberg, and D. Shmoys. Improving Christofides’ Algorithm for the Path TSP. arXiv manuscript number 1110.4604.
• [3] A. Asadpour, M. X. Goemans, A. Madry, S. Oveis Gharan, and A. Saberi. An -approximation algorithm for the asymmetric traveling salesman problem. In Proc. 21st ACM Symp. on Discrete Algorithms, 2010.
• [4] M. Bateni and J. Chuzhoy. Approximation algorithms for the directed -tour and -stroll problems. In Proc. 13th APPROX, 2010.
• [5] M. Blaser. A new approximation algorithm for the asymmetric TSP with triangle inequality. In Proc. 13th ACM Symp. on Discrete Algorithms, 2002.
• [6] M. Charikar, M. X. Goemans, and H. Karloff. On the integrality ratio for asymmetric TSP. In Proc. 45th IEEE Symp. on Foundations of Computer Science, 2004.
• [7] C. Chekuri, N. Korula, and M. Pal. Improved algorithms for orienteering and related problems. In Proc. 19th ACM Symp. on Discrete Algorithms, 2008.
• [8] C. Chekuri and M. Pal. An approximation ratio for the asymmetric traveling salesman path problem. Theory of Computing, 3(1):197–209, 2007.
• [9] N. Christofides. Worst-case analysis of a new heuristic for the traveling salesman problem. Technical report, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh, PA, 1976.
• [10] U. Feige and M. Singh. Improved approximation ratios for traveling salesperson tours and paths in directed graphs. In Proc. 10th APPROX, 2007.
• [11] A. Frank. On connectivity properties of Eulerian digraphs. Ann. Discrete Math., 41:179–194, 1989.
• [12] A. Frieze, G. Galbiati, and F. Maffioli. On the worst-case performance of some algorithms for the asymmetric traveling salesman problem. Networks, 12:23–39, 1982.
• [13] Z. Friggstad, M. R. Salavatipour, and Z. Svitkina. Asymmetric traveling salesman path and directed latency problems. In Proc. SODA, 2010. Additional results appear in arXiv, manuscript number 0907.0726v1.
• [14] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York, 1979.
• [15] M. C. Golumbic. Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57). North-Holland Publishing Co., Amsterdam, The Netherlands, second edition, 2004.
• [16] G. Gutin and A. P. Punnen, editors. Traveling Salesman Problem and Its Variations. Springer, Berlin, 2002.
• [17] M. Held and R. Karp. The traveling salesman problem and minimum spanning trees. Operations Research, 18:1138–1162, 1970.
• [18] J. A. Hoogeveen. Some paths are more difficult than cycles. Oper. Res. Lett., 10:291–295, 1991.
• [19] B. Jackson. Some remarks on arc-connectivity, vertex splitting, and orientation in digraphs. Journal of Graph Theory, 12(3):429–436, 1988.
• [20] H. Kaplan, M. Lewenstein, N. Shafrir, and M. Sviridenko. Approximation algorithms for asymmetric TSP by decomposing directed regular multigraphs. J. ACM, 52(4):602–626, 2005.
• [21] J. Kleinberg and D. P. Williamson. Unpublished note, 1998.
• [22] F. Lam and A. Newman. Traveling salesman path problems. Math. Program., 113(1):39–59, 2008.
• [23] E. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, and D. Shmoys, editors. The Traveling Salesman Problem: A guided tour of combinatorial optimization. John Wiley & Sons Ltd., 1985.
• [24] V. Nagarajan. On the LP relaxation of the asymmetric traveling salesman path problem. Theory of Computing, 4(1):191–193, 2008.
• [25] V. Nagarajan and R. Ravi. Poly-logarithmic approximation algorithms for directed vehicle routing problems. In Proc. 10th APPROX, pages 257–270, 2007.
• [26] V. Nagarajan and R. Ravi. The directed minimum latency problem. In Proc. 11th APPROX, pages 193–206, 2008.
• [27] C. H. Papadimitriou and S. Vempala. On the approximability of the traveling salesman problem. Combinatorica, 26(1):101–120, 2006.
• [28] S. Rathinam and R. Sengupta. Matroid intersection and its applications to multiple depot, multiple TSP. Technical report, University of California, Berkeley, 2006.
• [29] A. Schrijver. Combinatorial Optimization - Polyhedra and Efficiency. Springer-Verlag, New York, 2005.
• [30] D. Shmoys and D. P. Williamson. Analyzing the Held-Karp TSP bound: a monotonicity property with application. Inf. Process. Lett., 35(6):281–285, 1990.
• [31] D. P. Williamson. Analysis of the Held-Karp heuristic for the traveling salesman problem. M.S. Thesis, MIT, 1990.
• [32] L. A. Wolsey. Heuristic analysis, linear programming and branch and bound. Mathematical Programming Study, 13:121–134, 1980.

## Appendix A

### A.1 Proof of Lemma 2.2

###### Proof.

We see that is an integral circulation since it is the sum of an integral circulation , an integral flow of value , and an integral flow of value . To prove its support is strongly connected in it suffices to show that there is a path in the support of for every . Suppose is the set of remaining nodes just after the ’th iteration of the loop (and ). We prove by induction on that every node in has a path to .

Consider the base case . Since only uses arcs in the support of , since is acyclic (from Step 9), and since for (Lemmas 2.5 and 2.6), any walk from any in the support of of maximal length must end at .

Now suppose and that every node in has a path to in the support of . Let . If then, by induction, has a path to . Otherwise, must have been removed in Step 15 of the current iteration for some circulation . But then , the representative of circulation that was chosen to remain in , is in , so it has a path to by induction. Finally, also has a path to in the support of since it can travel first to in and then, by induction, to . ∎
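The reachability claim at the heart of this proof can be checked mechanically on small instances. The following is a minimal sketch, not part of the original algorithm: it assumes the support is available as a list of arcs, and it runs a reverse breadth-first search from the target node to collect every node that has a directed path to it. The function name `nodes_reaching` and the arc-list representation are hypothetical choices made for illustration.

```python
from collections import deque


def nodes_reaching(target, arcs):
    """Return the set of nodes that have a directed path to `target`.

    `arcs` is an iterable of (u, v) pairs, each meaning an arc from u to v
    in the support. This input format is an assumption of the sketch.
    """
    # Reverse adjacency: for each arc (u, v), record u as a predecessor of v.
    preds = {}
    for u, v in arcs:
        preds.setdefault(v, []).append(u)

    # Breadth-first search along reversed arcs, starting from the target.
    seen = {target}
    queue = deque([target])
    while queue:
        v = queue.popleft()
        for u in preds.get(v, []):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen
```

If the returned set contains every node, then every node has a path to the target in the support, which is the property the induction establishes.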

### A.2 Proof of Lemma 2.9

###### Proof.

Suppose $D$ is not an integer; otherwise we are already done. Let be any point in . Form an undirected and weighted bipartite graph where both $W_L$ and $W_R$ are disjoint copies of . We identify an arc of with an edge in with and . That is, for each arc add an edge from to with weight . By constraints (3), (4) and (5), we have that $z(\delta(v))$ (the total $z$-value of all edges in incident to $v$) is an integer for every except for and .

We claim there is a path from to in that only uses edges with . Note that these paths are allowed to take a step from to since is undirected. Such a step corresponds to following an arc in the reverse direction in the original directed graph .

Suppose, for the sake of contradiction, that there is no path from to using only edges with . Let $S$ be the collection of all nodes in that can be reached from using only edges with positive $z$-value; our assumption means the copy of in is not in $S$.

Let $z(E(S))$ denote the total $z$-value of edges with both endpoints in $S$. On one hand, since every node has and since , then

$$z(E(S)) = \sum_{v \in W_R \cap S} z(\delta(v)) = |W_R \cap S|.$$

On the other hand, since every node has and since , then

$$z(E(S)) = \sum_{v \in W_L \cap S} z(\delta(v)) = |(W_L - \{s\}) \cap S| + z(\delta(s)) = |(W_L - \{s\}) \cap S| + D.$$

But contradicts our assumption that $D$ is not an integer. So, it must be that there is a path from to in using only edges with .

Suppose that such a path followed a sequence of edges . Since is bipartite and are on different sides, then must be odd. Let

$$\delta = \min\Big\{ D - \lfloor D \rfloor,\ \min_{1 \le i \le c:\ i \text{ odd}} z_{e_i} \Big\}$$

be the minimum $z$-value of the edges that are followed from to when walking along this path (or the difference between $D$ and $\lfloor D \rfloor$ if this is smaller). Update the values of the edges on this path by:

$$z_{e_i} \leftarrow \begin{cases} z_{e_i} - \delta & i \text{ odd} \\ z_{e_i} + \delta & i \text{ even} \end{cases}$$

We will now argue that the resulting $z$-values fit in the polytope . First, notice that both and , which were originally , decrease by exactly $\delta$. Any other node not on this path does not have the $z$-value of any incident edge changed. Finally, if is an internal node on this path then the $z$-value of one incident edge decreases by $\delta$ and the $z$-value of another incident edge increases by $\delta$. Therefore, we have after this update for every internal node on this path.

By our choice of $\delta$, we maintain for every edge . Now, if the path was a single edge then no edge had its $z$-value increased so the bounds continue to hold for every edge . Otherwise, every edge on this path has either or so must hold after this update. Since the only values that are increased are those in the support of the integral flow , then .

The above process maps a point from to a point in when is not an integer. If was chosen to be , then after this process and we are done. Otherwise, we can repeat the process to map a point from to for some with and so on. Each such step that does not result in a point in the polytope has us remove at least one edge from the support of . Since no edges are introduced to the support of , then this process will terminate in finitely many iterations with a point in . ∎
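One round of the path update used in this proof can be sketched in code. This is a hedged illustration under stated assumptions rather than the procedure as the lemma states it: edges of the bipartite graph are stored as `(left, right)` pairs in a dictionary `z`, exact arithmetic uses `Fraction`, and only nonnegativity and the choice of $\delta$ are modeled (the upper bounds on edge values discussed above are not checked). The names `find_path` and `augment_once` are hypothetical.

```python
from collections import deque
from fractions import Fraction
from math import floor


def find_path(z, source, sink):
    """Breadth-first search over edges with positive z-value, ignoring direction.

    Returns the list of edges on a shortest source-sink path, or None.
    """
    adj = {}
    for (left, right), val in z.items():
        if val > 0:
            adj.setdefault(left, []).append((right, (left, right)))
            adj.setdefault(right, []).append((left, (left, right)))

    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == sink:
            break
        for w, edge in adj.get(u, []):
            if w not in parent:
                parent[w] = (u, edge)
                queue.append(w)
    if sink not in parent:
        return None

    # Walk parent pointers back from the sink to recover the edge sequence.
    path = []
    node = sink
    while parent[node] is not None:
        prev, edge = parent[node]
        path.append(edge)
        node = prev
    path.reverse()
    return path


def augment_once(z, source, sink, D):
    """Apply one update along a positive path; mutates z and returns D - delta."""
    path = find_path(z, source, sink)
    if path is None:
        raise ValueError("no path using only positive edges")
    odd = path[0::2]   # edges e_1, e_3, ... (value decreases)
    even = path[1::2]  # edges e_2, e_4, ... (value increases)
    delta = min(D - floor(D), min(z[e] for e in odd))
    for e in odd:
        z[e] -= delta
    for e in even:
        z[e] += delta
    return D - delta
```

For instance, on a three-edge path whose edges all have value 1/2 and with $D = 1/2$, one call sets the two odd edges to 0, raises the middle edge to 1, and returns 0, so the total value at each internal node is unchanged. Because breadth-first search returns a simple path, no edge appears in both the odd and even positions.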

### A.3 Proof of Lemma 3.1

###### Proof.

Let be the path and be the path in some fixed optimum solution. Also, for nodes we use to indicate that both and appear on the same path in our optimum solution and that appears sometime before on this path.

Let be such that is the first arc on with and . Such an arc exists because starts with a node in and ends in a node in . Let be the first node along such that . We know such a node exists because and . Now, let be the furthest node along but still before such that either or and . Again, such a node exists because and .

Suppose with . If is immediately followed by on then we let . Otherwise, say is the immediate successor of on . By our choice of we have and . In this case, we set . This case is illustrated in Figure 3.

Next, suppose