# Minimum Eccentricity Shortest Path Problem: an Approximation Algorithm and Relation with the k-Laminarity Problem

Étienne Birmelé (MAP5, UMR CNRS 8145, Univ. Sorbonne Paris Cité)

Fabien de Montgolfier (IRIF, UMR CNRS 8243, Univ. Sorbonne Paris Cité)

Léo Planche (MAP5, UMR CNRS 8145, and IRIF, UMR CNRS 8243, Univ. Sorbonne Paris Cité; email: leo_planche@liafa.univ-paris-diderot.fr)

###### Abstract

The Minimum Eccentricity Shortest Path (MESP) Problem consists in determining a shortest path (a path whose length is the distance between its extremities) of minimum eccentricity in a graph. It was introduced by Dragan and Leitert, who described a linear-time algorithm which is an 8-approximation of the problem. In this paper, we study more deeply the double-BFS procedure used in that algorithm and extend it to obtain a linear-time 3-approximation algorithm. We moreover study the link between the MESP problem and the notion of laminarity, introduced by Völkel et al., corresponding to its restriction to a diameter (i.e. a shortest path of maximum length), and show tight bounds between MESP and laminarity parameters.

###### Keywords:
Graph search, Graph theory, Eccentricity, Diameter, BFS, Approximation Algorithms, $k$-Laminar Graph

## 1 Introduction

For both graph classification purposes and applications, it is an important issue to determine to which extent a graph can be summarized by a path. Different path constructions and metrics to characterize how far the graph is from the constructed path can be used, for example path-decompositions and path-width  or path-distance-decompositions and path-distance-width . Another approach, on which we focus in this article, is to characterize the graph by a spine defined by one of its paths.

This problem was first studied in terms of domination, that is, finding a path such that every vertex in the graph belongs to or has a neighbor in the path. Several graph classes were defined in terms of dominating paths: graphs for which the dominating path is a diameter have been studied; dominating pairs, that is, pairs of vertices such that every path linking them is dominating, have been introduced; and graphs such that short dominating paths are present in all induced subgraphs have been characterized. Linear-time algorithms to find dominating paths or dominating vertex pairs were also developed for AT-free graphs [4, 6].

Dominating paths, however, do not exist in every graph, and they have no associated metric to measure the distance from the graph to the path. A natural extension of the notion of domination is the notion of $k$-coverage for a given integer $k$: a path $k$-covers the graph if every vertex is at distance at most $k$ from the path. The smallest $k$ such that a path $k$-covers the graph is then a metric as desired.

In the present paper, we study the latter problem in which the covering path is required to be a shortest path between its end-vertices. It was introduced in  as the Minimum Eccentricity Shortest Path Problem, and shown to be linked to the minimum line distortion problem .

The MESP problem is also closely related to the notion of $k$-laminar graphs introduced by Völkel et al., in which the covering path is required to be a diameter.

The MESP problem, as well as determining if a graph is $k$-laminar for a given $k$, are NP-hard [9, 12]. However, Dragan and Leitert develop a 2-approximation algorithm for MESP of time complexity $O(n^3)$, a 3-approximation algorithm in $O(nm)$ and a linear 8-approximation, where $n$ and $m$ denote the numbers of vertices and edges. The latter is extremely simple as it consists in a double-BFS procedure.

In this paper, we introduce a different analysis of the double-BFS procedure and prove that it is in fact a 5-approximation algorithm, and that the bound is tight. We then develop the idea of this algorithm and reach a 3-approximation, which still runs in linear time. Finally, we establish bounds relating the MESP problem and the notion of laminarity.

### Definitions and Notations

Throughout this paper, $G=(V,E)$ denotes a finite connected undirected graph. A shortest path between two vertices $u$ and $v$ is a path whose length is minimal among all $(u,v)$-paths. This length (counting edges) is the distance $d(u,v)$. Depending on the context, we consider a path either as a sequence or as a set of vertices. The distance $d(v,S)$ between a vertex $v$ and a set $S$ is the smallest distance between $v$ and a vertex of $S$.

The eccentricity $\mathrm{ecc}(S)$ of a set $S$ is the largest distance between $S$ and any vertex of $G$.

The maximal eccentricity of any singleton $\{v\}$, or equivalently the largest distance between two vertices, denoted here $\mathrm{diam}(G)$, is often called the diameter of the graph, but for clarity in this paper a diameter is always a shortest path of maximum length, i.e. a shortest path of length $\mathrm{diam}(G)$, and not its length.
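The definitions above can be made concrete with a short sketch. It assumes a graph stored as an adjacency dictionary (a representation chosen here for illustration, not taken from the paper) and uses a multi-source BFS to obtain the distance from a vertex set to every vertex; the function names `bfs_distances` and `eccentricity` are hypothetical.

```python
from collections import deque

def bfs_distances(adj, sources):
    """Multi-source BFS: distance from the set `sources` to every vertex."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(adj, vertex_set):
    """Largest distance between `vertex_set` and any vertex of the graph."""
    return max(bfs_distances(adj, list(vertex_set)).values())

# Example graph (an assumption for illustration):
# path 0-1-2-3-4 with a pendant vertex 5 attached to vertex 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
print(eccentricity(adj, {0, 1, 2, 3, 4}))  # eccentricity of the path, seen as a set
print(eccentricity(adj, {0}))              # eccentricity of a singleton
```

On this example the path 0-1-2-3-4 has eccentricity 1 (only vertex 5 lies outside it), while the singleton {0} has eccentricity 4.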

## 2 Double-BFS is a 5-Approximation Algorithm

Let us define the problem we are interested in:

###### Definition 1 (Minimum Eccentricity Shortest Path Problem (MESP))

Given a graph $G$, find a shortest path $P$ such that, for every shortest path $Q$, $\mathrm{ecc}(P) \le \mathrm{ecc}(Q)$.

$k(G)$ denotes the eccentricity of a MESP of $G$.
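To illustrate the definition on very small graphs, the following is a brute-force sketch that enumerates every shortest path and keeps one of minimum eccentricity. It is not an algorithm from the paper: it may take exponential time, it assumes a connected graph with at least two vertices given as an adjacency dictionary, and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs(adj, sources):
    """Multi-source BFS distances from the list `sources`."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def shortest_paths(adj, u, v):
    """Yield every shortest path from u to v (possibly exponentially many)."""
    dist_v = bfs(adj, [v])

    def extend(path):
        last = path[-1]
        if last == v:
            yield list(path)
            return
        for w in adj[last]:
            if dist_v[w] == dist_v[last] - 1:  # w is one step closer to v
                yield from extend(path + [w])

    yield from extend([u])

def brute_force_mesp(adj):
    """Return (eccentricity, path) minimising eccentricity over all shortest paths."""
    best = None
    for u, v in combinations(adj, 2):
        for path in shortest_paths(adj, u, v):
            ecc = max(bfs(adj, path).values())
            if best is None or ecc < best[0]:
                best = (ecc, path)
    return best

# Example graph (an assumption): path 0-1-2-3-4 with pendant vertex 5 on vertex 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
k, path = brute_force_mesp(adj)
print(k, path)
```

Single-vertex paths are skipped because, in a graph with at least two vertices, any edge containing the vertex is a shortest path of eccentricity no larger.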

###### Theorem 2.1 (Dragan and Leitert )

Computing $k(G)$ or finding a MESP are NP-complete problems.

It is therefore worth using polynomial-time approximation algorithms. We say that an algorithm is an $\alpha$-approximation of the MESP if every path output by this algorithm is a shortest path of eccentricity at most $\alpha\,k(G)$.

Double-BFS is a widely used tool for approximating $\mathrm{diam}(G)$. It simply consists in the following procedure:

1. Pick an arbitrary vertex $r$.

2. Perform a BFS (Breadth-First Search) starting at $r$ and ending at $x$. $x$ is thus one of the furthest vertices from $r$.

3. Perform a BFS (Breadth-First Search) starting at $x$ and ending at $y$.

The output of the algorithm is the path from $x$ to $y$, called a spread path, while its extremities are called a spread pair. A folklore result is that the distance between $x$ and $y$ 2-approximates the diameter of $G$. As noted by Dragan and Leitert, Double-BFS may also be used for approximating MESP: they have shown that any spread path is an 8-approximation of the MESP problem.
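The three steps of the procedure can be sketched as follows. This is a minimal illustration under the assumption that the graph is connected and given as an adjacency dictionary; the names `bfs_tree` and `double_bfs` are hypothetical, and ties between furthest vertices are broken arbitrarily.

```python
from collections import deque

def bfs_tree(adj, root):
    """Return (dist, parent) maps of a BFS from `root`."""
    dist, parent = {root: 0}, {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w], parent[w] = dist[u] + 1, u
                queue.append(w)
    return dist, parent

def double_bfs(adj, r):
    """Return a spread path: a shortest path from x to y, where x is a
    vertex furthest from r and y is a vertex furthest from x."""
    dist_r, _ = bfs_tree(adj, r)
    x = max(dist_r, key=dist_r.get)       # last BFS layer from r
    dist_x, parent = bfs_tree(adj, x)
    y = max(dist_x, key=dist_x.get)       # last BFS layer from x
    path = [y]
    while parent[path[-1]] is not None:   # walk back from y to x in the BFS tree
        path.append(parent[path[-1]])
    return path[::-1]

# Example graph (an assumption): path 0-1-2-3-4 with pendant vertex 5 on vertex 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
path = double_bfs(adj, 2)
print(path)
```

Both searches visit each edge a constant number of times, which is why the whole procedure runs in linear time.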

The first result of the present paper is that any spread path is in fact a 5-approximation of the MESP problem and that the bound is tight. But before we prove this result (Theorem 2.2), let us give the key lemma used for proving our three theorems:

###### Lemma 1

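Let $G$ be a graph having a shortest path $v_0, v_1, \dots, v_t$ of eccentricity $k$.

Let $P = x_0, x_1, \dots, x_s$ be a shortest path of $G$.

Let $i^{P}_{\min}$ (resp. $i^{P}_{\max}$) be the smallest (resp. largest) integer $i$ such that $v_i$ is at distance at most $k$ of $P$.

For every integer $i$ such that $i^{P}_{\min} \le i \le i^{P}_{\max}$, $v_i$ is then at distance at most $2k$ from $P$.

Subsequently, every vertex of $G$ at distance at most $k$ from the subpath between $v_{i^{P}_{\min}}$ and $v_{i^{P}_{\max}}$ is at distance at most $3k$ of $P$.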

One may think, at first glance, that this lemma looks similar to the following:

###### Lemma 2 (from Dragan et al.)

If $G$ has a shortest path of eccentricity at most $k$ from $s$ to $t$, then every path $Q$ with $s$ in $Q$ and $t$ in $Q$ has eccentricity at most $3k$.

The difference lies in the fact that the $k$ in Lemma 2 is specific to the given pair of vertices $(s,t)$, while the $k$ in Lemma 1 is global. On the other hand, Lemma 2 gives a bound on the eccentricity of a path with respect to the whole graph, while Lemma 1 only guarantees an eccentricity for a defined subgraph.

###### Proof (of Lemma 1)

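The second assertion of the lemma is straightforward given the first one. To prove the latter, we define, for all $l$ between $0$ and $s$, the subpath $P_l = x_0, x_1, \dots, x_l$. For a vertex or a path $X$, $i^{X}_{\min}$ (resp. $i^{X}_{\max}$) denotes the smallest (resp. largest) integer $i$ such that $v_i$ is at distance at most $k$ of $X$.

Let us show by induction on $l$ that for all $i$ between $i^{P_l}_{\min}$ and $i^{P_l}_{\max}$, $v_i$ is at distance at most $2k$ of $P_l$.

For $l = 0$, $P_0 = \{x_0\}$.

Using the triangle inequality:

$$d(v_{i^{P_0}_{\min}}, v_{i^{P_0}_{\max}}) \le d(v_{i^{P_0}_{\min}}, x_0) + d(x_0, v_{i^{P_0}_{\max}}) \le 2k \tag{1}$$

Hence, for all $i$ between $i^{P_0}_{\min}$ and $i^{P_0}_{\max}$,

$$d(v_{i^{P_0}_{\min}}, v_i) \le k \quad\text{or}\quad d(v_{i^{P_0}_{\max}}, v_i) \le k \tag{2}$$

The result is thus verified for $l = 0$.

Let $l$ in $\{1, \dots, s\}$ be such that the property is verified for $l-1$.

For all $i$ between $i^{P_{l-1}}_{\min}$ and $i^{P_{l-1}}_{\max}$, $v_i$ is at distance at most $2k$ of $P_{l-1}$ by the induction hypothesis. Hence, $v_i$ is at distance at most $2k$ of $P_l$.

Moreover,

$$d(v_{i^{P_{l-1}}_{\max}}, v_{i^{P_l}_{\max}}) \le d(v_{i^{x_{l-1}}_{\max}}, v_{i^{x_l}_{\max}}) \tag{3}$$

and by the triangle inequality:

$$d(v_{i^{x_{l-1}}_{\max}}, v_{i^{x_l}_{\max}}) \le d(v_{i^{x_{l-1}}_{\max}}, x_{l-1}) + d(x_{l-1}, x_l) + d(x_l, v_{i^{x_l}_{\max}}) \le 2k+1 \tag{4}$$

As the subpath of $v_0, \dots, v_t$ between $v_{i^{P_{l-1}}_{\max}}$ and $v_{i^{P_l}_{\max}}$ is a shortest path, it follows that for all $i$ between $i^{P_{l-1}}_{\max}$ and $i^{P_l}_{\max}$,

$$d(v_{i^{P_{l-1}}_{\max}}, v_i) \le k \quad\text{or}\quad d(v_{i^{P_l}_{\max}}, v_i) \le k, \tag{5}$$

meaning that $v_i$ is at distance at most $2k$ of $P_{l-1}$ or of $x_l$.

A similar proof shows that for all $i$ between $i^{P_l}_{\min}$ and $i^{P_{l-1}}_{\min}$, $v_i$ is at distance at most $2k$ from $P_{l-1}$ or from $x_l$.

The property is verified by induction, and the lemma follows for $l = s$.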

###### Theorem 2.2

A double-BFS is a linear-time 5-approximation algorithm for the MESP problem.

Before we prove it, notice that Figure 1 shows that this bound is tight.

###### Proof

Let $k$ be $k(G)$, $P = v_0, v_1, \dots, v_t$ be a MESP (its eccentricity is thus $k$), and $Q$ be the result of a double-BFS starting at some arbitrary vertex $r$, then reaching $x$, then reaching $y$. We shall prove that $Q$ is a $5k$-dominating path of $G$.

Let $i$ (resp. $j$) be such that $v_i$ (resp. $v_j$) is at distance at most $k$ of $r$ (resp. $x$). The following inequalities are verified:

$$d(r,x) \ge d(r,v_t) \ge d(v_i,v_t) - d(r,v_i) \ge d(v_i,v_t) - k \tag{6}$$

$$d(r,x) \le d(r,v_i) + d(v_i,v_j) + d(v_j,x) \le d(v_i,v_j) + 2k \tag{7}$$

Combining those inequalities,

$$d(v_i,v_t) - 3k \le d(v_i,v_j) \tag{8}$$

Similarly:

$$d(v_i,v_0) - 3k \le d(v_i,v_j) \tag{9}$$

Therefore $v_j$ is at distance at most $3k$ of $v_0$ or $v_t$. Without loss of generality, assume that $v_j$ is at distance at most $3k$ of $v_0$.
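For readers who want the intermediate step spelled out, the following sketch shows how (6) and (7) combine, written for the case $j \ge i$ (the case $j \le i$ is symmetric, with $v_0$ in place of $v_t$ and (9) in place of (8)). It only assumes that $v_0, \dots, v_t$ is a shortest path, so that distances add up along it.

```latex
\begin{align*}
d(v_i,v_t) - k &\le d(r,x) \le d(v_i,v_j) + 2k
  && \text{by (6) and (7)} \\
d(v_i,v_j) + d(v_j,v_t) - k &\le d(v_i,v_j) + 2k
  && \text{since } i \le j \le t \\
d(v_j,v_t) &\le 3k
\end{align*}
```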

Let $l$ be such that $v_l$ is at distance at most $k$ of $y$. We distinguish two cases:

1. $l \le j$:

Then $y$ is at distance at most $5k$ of $x$. As $y$ is a vertex most distant from $x$, $x$ is a $5k$-dominating vertex of the graph. The theorem is then verified.

2. $l > j$: Applying to $x$ and $y$ (in place of $r$ and $x$) the inequalities established at the beginning of the proof:

$$d(v_j,v_t) - 3k \le d(v_j,v_l) \tag{10}$$

As $j < l$, it follows that:

$$d(v_l,v_t) \le 3k \tag{11}$$

Figure 2 shows the configuration of the graph in that case. The vertices at distance at most $k$ of a vertex $v_i$ such that $i \le j$ (resp. $i \ge l$) are at distance at most $5k$ of $x$ (resp. $y$).

According to Lemma 1, every vertex of $G$ at distance at most $k$ of a vertex $v_i$ such that $i$ is between $j$ and $l$ is at distance at most $3k$ of any shortest path between $x$ and $y$. The theorem is thus verified.

## 3 A 3-Approximation Algorithm

We now show that, by using more BFS runs, we may obtain a 3-approximation of MESP, still in linear time.

Let bestPath and bestEcc be global variables used as return values for the path and its eccentricity. bestPath stores a path and is uninitialized, and bestEcc is an integer initialized with an upper bound on every possible eccentricity.

###### Theorem 3.1

A 3-approximation of the MESP Problem can be computed in linear time by considering a spread pair $(s,l)$ of $G$ and running Algorithm3k($G$, $s$, $l$, $0$).

###### Proof (Correctness)

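Let $G$ be a graph admitting a shortest path $P = v_0, v_1, \dots, v_t$ of eccentricity $k$.

Let $x$ and $y$ be any vertices of $G$, and $Q_{x,y}$ a shortest path between $x$ and $y$. Define $i^{x,y}_{\min}$ (resp. $i^{x,y}_{\max}$) as the smallest (resp. largest) integer $i$ such that $v_i$ is at distance at most $k$ of $x$ or $y$. Then, by Lemma 1,

$$\text{for all } j \text{ such that } i^{x,y}_{\min} - k \le j \le i^{x,y}_{\max} + k, \quad d(Q_{x,y}, v_j) \le 2k \tag{12}$$

Hence, if $i^{x,y}_{\min} \le k$ and $i^{x,y}_{\max} \ge t-k$, every vertex of $P$ is at distance at most $2k$ of $Q_{x,y}$ and, as $P$ is of eccentricity $k$, $Q_{x,y}$ is of eccentricity at most $3k$.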

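Algorithm3k uses this implication to exhibit a pair $(x,y)$ such that $Q_{x,y}$ is of eccentricity at most $3k$. Indeed, in each recursive call, one of the following cases holds:

1. the vertex $z$ selected by the algorithm is at distance at most $3k$ from $Q_{x,y}$. In that case, bestPath will be set to $Q_{x,y}$ unless it already contains a path of even better eccentricity. In any case, the result of the algorithm is a path of eccentricity at most $3k$.

2. the vertex $z$ is at a distance greater than $3k$ of $Q_{x,y}$. Let $i_z$ be such that $v_{i_z}$ is at distance at most $k$ of $z$. Then, according to Equation (12),

$$i_z \le i^{x,y}_{\min} - k \quad\text{or}\quad i_z \ge i^{x,y}_{\max} + k \tag{13}$$

   1. Suppose that $i_z \le i^{x,y}_{\min} - k$. Then, in the case where $v_{i^{x,y}_{\max}}$ is at distance at most $k$ of $y$, we get $i^{z,y}_{\min} \le i^{x,y}_{\min} - k$ and $i^{z,y}_{\max} \ge i^{x,y}_{\max}$. And in the case where $v_{i^{x,y}_{\max}}$ is at distance at most $k$ of $x$, we get $i^{x,z}_{\min} \le i^{x,y}_{\min} - k$ and $i^{x,z}_{\max} \ge i^{x,y}_{\max}$.

   2. A similar reasoning can be applied if $i_z \ge i^{x,y}_{\max} + k$, also yielding $i^{z,y}_{\min} \le i^{x,y}_{\min}$ and $i^{z,y}_{\max} \ge i^{x,y}_{\max} + k$, or $i^{x,z}_{\min} \le i^{x,y}_{\min}$ and $i^{x,z}_{\max} \ge i^{x,y}_{\max} + k$.

Therefore, either the algorithm already found a path of eccentricity at most $3k$, or it makes one of its two new calls with a pair $(x',y')$ such that the interval $[i^{x',y'}_{\min}, i^{x',y'}_{\max}]$ contains $[i^{x,y}_{\min}, i^{x,y}_{\max}]$ but has length increased by at least $k$.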

Consider now a spread pair $(s,l)$ for which Algorithm3k($G$, $s$, $l$, $0$) is run. It follows from cases (i) and (ii) of the proof of Theorem 2.2 that

$$i^{s,l}_{\min} \le 5k \quad\text{and}\quad i^{s,l}_{\max} \ge t - 5k \tag{14}$$

At each of the recursive calls, if no path of eccentricity at most $3k$ has already been discovered, one of the new calls expands the interval length by at least $k$, while containing the previous interval. As the recursive calls are made until $\mathit{step} = 8$, it follows that either a path of eccentricity $3k$ has been discovered, or one of the explored possibilities corresponds to eight extensions of size at least $k$ starting from $[i^{s,l}_{\min}, i^{s,l}_{\max}]$.

In the latter case, Equation (14) implies that the final pair of vertices $(x,y)$ fulfills $i^{x,y}_{\min} \le k$ and $i^{x,y}_{\max} \ge t-k$. Every vertex of $P$ is then at distance at most $2k$ of $Q_{x,y}$ and thus $Q_{x,y}$ is of eccentricity at most $3k$.
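The recursion described in this proof can be sketched as below. This is a reconstruction from the prose only, not the paper's pseudocode: it assumes that each call computes a shortest path $Q_{x,y}$, selects a vertex $z$ most distant from it, updates the best path found so far, and recurses on the pairs $(x,z)$ and $(z,y)$ while the depth counter is below 8. The dictionary `best` plays the role of bestPath and bestEcc (initialised here with infinity, an assumption), and the `calls` counter is added purely for illustration.

```python
from collections import deque

def bfs(adj, sources):
    """Multi-source BFS; returns (dist, parent) for every vertex."""
    dist = {s: 0 for s in sources}
    parent = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w], parent[w] = dist[u] + 1, u
                queue.append(w)
    return dist, parent

def shortest_path(adj, x, y):
    """One shortest path from x to y, read off a BFS tree rooted at x."""
    _, parent = bfs(adj, [x])
    path = [y]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

def algorithm_3k(adj, x, y, step, best, max_steps=8):
    """Hypothetical reconstruction of the recursive search (see lead-in)."""
    best["calls"] += 1
    q = shortest_path(adj, x, y)          # first BFS: the path Q_{x,y}
    dist, _ = bfs(adj, q)                 # second BFS: distances to Q_{x,y}
    z = max(dist, key=dist.get)           # a vertex most distant from Q_{x,y}
    if dist[z] < best["ecc"]:             # dist[z] is the eccentricity of Q_{x,y}
        best["ecc"], best["path"] = dist[z], q
    if step + 1 < max_steps:              # calls happen at depths 0..7
        algorithm_3k(adj, x, z, step + 1, best, max_steps)
        algorithm_3k(adj, z, y, step + 1, best, max_steps)
    return best

# Example graph (an assumption): path 0-1-2-3-4 with pendant vertex 5 on vertex 2.
# (0, 4) is a spread pair of this graph.
adj = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
best = algorithm_3k(adj, 0, 4, 0, {"path": None, "ecc": float("inf"), "calls": 0})
print(best["ecc"], best["path"], best["calls"])
```

A practical variant could stop recursing as soon as a path of eccentricity 0 is found; the sketch keeps the fixed-depth recursion so that the number of calls stays a constant independent of the graph.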

###### Proof (Complexity)

Each call of the algorithm computes two BFS trees, each taking linear time. The rest of the operations is computed in constant time.

The recursion width is 2 and, since the algorithm is first called with $\mathit{step} = 0$, the recursion depth is 8. The algorithm is thus called 255 times. Therefore the total runtime of the algorithm is linear in the size of the graph.
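The call count is a geometric sum over the recursion depths, assuming that calls occur at depths 0 through 7 and that every call above the last depth spawns exactly two children:

```latex
\[
\sum_{d=0}^{7} 2^{d} \;=\; 2^{8} - 1 \;=\; 255
\]
```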

###### Proof (Tightness of the approximation)

Figure 3 shows a graph $G$ for which the algorithm may produce a path of eccentricity $3\,k(G)$ (see caption).

## 4 Bounds between MESP and Laminarity

In this section, we investigate the link between the MESP problem and the notion of laminarity introduced by Völkel et al. The study of the $k$-laminar graph class finds motivation both from a theoretical and a practical point of view. On the theoretical side, AT-free graphs form a well-known graph class introduced half a century ago by Lekkerkerker and Boland, which contains many graph classes such as co-comparability graphs. An AT-free graph admits a diameter to which all other vertices are adjacent. It is then natural to extend this notion of dominating diameter. On the practical side, some large graphs constructed from reads similarity networks of genomic or metagenomic data appear to have a very long diameter with all vertices at short distance from it, and exhibiting the "best" diameter allows one to better understand their structure.

###### Definition 2 (laminarity)

A graph $G$ is

• $l$-laminar if $G$ has a diameter of eccentricity at most $l$.

• $s$-strongly laminar if every diameter has eccentricity at most $s$.

$l(G)$ and $s(G)$ denote the minimal values of $l$ and $s$ such that $G$ is respectively $l$-laminar and $s$-strongly laminar.
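On small graphs, both parameters can be computed by brute force: enumerate every diameter (every shortest path of maximum length), compute its eccentricity, and take the minimum and the maximum. The sketch below does this under the assumption of a connected graph with at least two vertices given as an adjacency dictionary; it may take exponential time and its function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs(adj, sources):
    """Multi-source BFS distances from the list `sources`."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameters(adj):
    """Yield every shortest path whose length equals the largest distance."""
    all_dist = {u: bfs(adj, [u]) for u in adj}
    diam = max(max(d.values()) for d in all_dist.values())

    def extend(path, target):
        last = path[-1]
        if last == target:
            yield list(path)
            return
        for w in adj[last]:
            if all_dist[target][w] == all_dist[target][last] - 1:
                yield from extend(path + [w], target)

    for u, v in combinations(adj, 2):
        if all_dist[u][v] == diam:
            yield from extend([u], v)

def laminarity(adj):
    """Return (l, s): minimum and maximum eccentricity over all diameters."""
    eccs = [max(bfs(adj, p).values()) for p in diameters(adj)]
    return min(eccs), max(eccs)

# Example graph (an assumption): path 0-1-...-6 with a path 3-7-8 of length 2
# attached to its middle vertex 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4, 7], 4: [3, 5],
       5: [4, 6], 6: [5], 7: [3, 8], 8: [7]}
print(laminarity(adj))
```

In this example the path from 0 to 6 is the only diameter and vertex 8 is at distance 2 from it, so both values coincide.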

###### Theorem 4.1

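For every graph $G$,

$$\begin{aligned} k(G) &\le l(G) \le 4k(G) - 2 \\ k(G) &\le s(G) \le 4k(G) \end{aligned}$$

Moreover, there exist three graph sequences $(G_k)$, $(H_k)$ and $(J_k)$ such that, for every $k \ge 1$,

• $k(G_k) = l(G_k) = s(G_k) = k$;

• $k(H_k) = k$ and $s(H_k) = 4k$;

• $k(J_k) = k$ and $l(J_k) = 4k - 2$;

The bounds given by the inequalities are therefore tight.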

###### Proof (k(G)≤l(G) and k(G)≤s(G))

Those inequalities are straightforward as every diameter is by definition a shortest path. The eccentricity of every diameter is therefore always at least $k(G)$.

###### Proof (s(G)≤4k(G))

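Let $D = x_0, x_1, \dots, x_s$ be a diameter of $G$ and $P = v_0, v_1, \dots, v_t$ a shortest path of eccentricity $k = k(G)$. We shall show $\mathrm{ecc}(D) \le 4k$. Let $u$ be any vertex of $G$. Since $\mathrm{ecc}(P) = k$, there exists a vertex $v_i$ of $P$ such that $d(u, v_i) \le k$. Let us distinguish three cases:

Case 1: there exist vertices $v_a$, $v_b$ of $P$ and $x_c$, $x_d$ of $D$ such that $a \le i \le b$ and $d(v_a, x_c) \le k$ and $d(v_b, x_d) \le k$. Then by Lemma 1, $u$ is at distance at most $3k$ from any shortest path between $x_c$ and $x_d$, and thus at distance at most $3k$ of $D$.

Case 2: there exists no vertex $v_j$ of $P$ with $j \le i$ and $d(v_j, D) \le k$.

Case 3: there exists no vertex $v_j$ of $P$ with $j \ge i$ and $d(v_j, D) \le k$.

Without loss of generality we focus on Case 2 (illustrated in Figure 4), which is symmetric with Case 3. Let $l$ (resp. $m$) be such that $v_l$ (resp. $v_m$) is at distance at most $k$ of $x_0$ (resp. $x_s$), and assume $l \le m$:

$$d(v_l,v_m) \ge d(x_0,x_s) - 2k \tag{15}$$

$D$ being a diameter,

$$d(x_0,x_s) \ge d(v_0,v_t) \tag{16}$$

By combining those inequalities,

$$d(v_l,v_m) \ge d(v_0,v_t) - 2k \tag{17}$$

$$d(v_l,v_m) \ge d(v_0,v_i) + d(v_i,v_l) + d(v_l,v_m) + d(v_m,v_t) - 2k \tag{18}$$

$$2k \ge d(v_i,v_l) \tag{19}$$

It follows that $u$ is at distance at most $4k$ of $D$.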
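The final bound can be read off the triangle inequality. In the sketch below, $u$ stands for the arbitrary vertex under consideration (a name introduced here for illustration), $v_i$ is a vertex of the eccentricity-$k$ path within distance $k$ of $u$, the middle term is bounded by (19), and $v_l$ is within distance $k$ of the endpoint $x_0$ of $D$:

```latex
\begin{align*}
d(u, D) &\le d(u, v_i) + d(v_i, v_l) + d(v_l, x_0) \\
        &\le k + 2k + k \;=\; 4k
\end{align*}
```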

###### Proof (l(G)≤4k(G)−2)

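Let $D = x_0, x_1, \dots, x_s$ be a diameter of $G$ and $P = v_0, v_1, \dots, v_t$ a shortest path of eccentricity $k = k(G)$. We shall show that either $\mathrm{ecc}(D) \le 4k-2$ or $G$ contains a diameter of eccentricity at most $3k$. If $P$ is a diameter we are done. Let us suppose from now on that it is of length at most $|D| - 1$.

Let $u$ be any vertex of $G$ and $v_i$ a vertex of $P$ such that $d(u, v_i) \le k$. Let us distinguish the same three cases as in the proof that $s(G) \le 4k(G)$. The first case also leads to $d(u, D) \le 3k$. The second and third being symmetric, let us suppose there exists no vertex $v_j$ of $P$ at distance at most $k$ of $D$ such that $j \le i$.

Let $v_l$ (resp. $v_m$) be a vertex of $P$ at distance at most $k$ from $x_0$ (resp. $x_s$). Clearly,

$$d(v_l,v_m) \ge |D| - 2k. \tag{20}$$

Let us distinguish two subcases:

Case 2.1: $d(v_l,v_m) > |D| - 2k$,

$$d(v_i,v_l) \le d(v_0,v_t) - d(v_l,v_m) \le (|D|-1) - (|D|-2k+1) \le 2k-2 \tag{21}$$

It follows that $u$ is at distance at most $4k-2$ of $D$.

Case 2.2: $d(v_l,v_m) = |D| - 2k$.

In this case, a path $D'$ made of a shortest path from $x_0$ to $v_l$, the subpath of $P$ between $v_l$ and $v_m$, and a shortest path from $v_m$ to $x_s$ is a diameter. Assuming $l \le m$, Equation 19 in the previous proof shows that:

$$d(v_i,v_l) \le 2k \tag{22}$$

and with a symmetrical reasoning,

$$d(v_m,v_t) \le 2k \tag{23}$$

It follows that any vertex of $G$ at distance at most $k$ of a vertex $v_j$ with $j \le l$ (resp. $j \ge m$) is at distance at most $3k$ of $v_l$ (resp. $v_m$), hence at distance at most $3k$ of $D'$. The subpath of $P$ between $v_l$ and $v_m$ being a subpath of $D'$, any vertex of $G$ at distance at most $k$ of a vertex $v_j$ with $j$ between $l$ and $m$ is at distance at most $k$ of $D'$. Finally, any vertex of $G$ is at distance at most $3k$ of $D'$.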

###### Proof (Tightness of the bounds)

Consider the graph $G_k$ reduced to a sufficiently long path $P$ to which a second path of length $k$ is attached in the middle. $P$ is then simultaneously the only diameter and the MESP, and it $k$-covers $G_k$ but doesn't $(k-1)$-cover it. Hence the inequalities $k(G) \le l(G)$ and $k(G) \le s(G)$ are tight.

Figure 5 shows how to build the graph sequence $(H_k)$ (only a few of its graphs are drawn). $H_k$ is a graph with a shortest path of eccentricity $k$ and a diameter of eccentricity $4k$. The inequality $s(G) \le 4k(G)$ is thus tight.

Figure 6 shows how to build the graph sequence $(J_k)$ (only a few of its graphs are drawn). $J_k$ is a graph with a shortest path of eccentricity $k$, while the unique diameter has eccentricity $4k-2$ (one graph of the sequence is a special case with two diameters). The inequality $l(G) \le 4k(G)-2$ is therefore tight.