Search for the end of a path in the
-dimensional grid and
in other graphs
We consider the worst-case query complexity of some variants of certain -complete search problems. Suppose we are given a graph and a vertex . We denote the directed graph obtained from by directing all edges in both directions by . is a directed subgraph of which is unknown to us, except that it consists of vertex-disjoint directed paths and cycles and one of the paths originates in . Our goal is to find an endvertex of a path by using as few queries as possible. A query specifies a vertex , and the answer is the set of the edges of incident to , together with their directions.
We also show lower bounds for the special case when consists of a single path. Our proofs use the theory of graph separators. Finally, we consider the case when the graph is a grid graph. In this case, using the connection with separators, we give asymptotically tight bounds as a function of the size of the grid, if the dimension of the grid is considered as fixed. In order to do this, we prove a separator theorem about grid graphs, which is interesting on its own right.
Keywords: Separator, graph, search, grid.
AMS 2010 Mathematics Subject Classification: 90B40, 05C85.
This paper deals with the following search problem. We are given a simple, undirected, connected graph and a vertex . We denote the directed graph obtained from by directing all edges in both directions by . Let be a directed subgraph of , which is the vertex-disjoint union of a directed path starting at and possibly some other directed paths and cycles. is unknown to us, and our goal is to identify an endvertex of a directed path. We may query a vertex , and as an answer, we learn the edges of incident to together with their directions. In particular, if the answer is only one incoming edge, then we know that is an endvertex. We analyze the minimum number of queries that are necessary in the worst case.
We give lower bounds in the more restrictive model where we know is one directed path. Note that if instead of looking for an endvertex, we look for an ending or a starting vertex of a path (different from ), then this model still gives a lower bound for this easier problem. In Section 4 we mention some additional models.
Denote by the minimum number of queries needed to find an endvertex in the worst case for any . If we know that is one directed path, denote this quantity by .
Biseparators and multiseparators.
To state some of our results we need to define separators of graphs. This notion can be defined in two different ways and both definitions are widely used. Here we distinguish between the two definitions.
Given a graph , a subset is called an -biseparator of if can be divided into two parts, and , such that there are no edges between and , and both have cardinality at most .
Given a graph , a subset is called an -multiseparator of if every connected component of has cardinality at most .
Note that or in the definition of a biseparator can be empty: we do not require to be disconnected. Small biseparators make sense only for .
Given these definitions, when we write separator, it can mean either a biseparator or a multiseparator, as in many cases it makes no difference. In the literature, the notation -separator can also be found, where is an upper bound on the cardinality of in terms of the number of vertices. In this paper it is more straightforward to fix and then look for the smallest -separator. Therefore, we let be the minimum cardinality of an -biseparator in and be the minimum cardinality of an -multiseparator in .
It follows from the definitions that every -biseparator is an -multiseparator, and thus . In many cases they are of the same order of magnitude. In particular, if we have a bound for a class of graphs which is closed under taking subgraphs for some and for some arbitrary , we get the same asymptotic bound on , by iteratively separating one of the components. However, there are cases when multiseparators are much smaller than biseparators. For example, if consists of three disjoint cliques of equal size, all connected to a degree-three vertex, then but . For any tree, but it is not hard to show that for a complete ternary tree, , see Appendix A. Finally, if we consider a class of graphs closed under taking subgraphs, by repeatedly refining the separation, then it is obvious that and have the same order of magnitude for any two constants and .
Our main result establishes a connection between the biseparators and the search complexity for general graphs.
For any connected graph with at least 2 vertices, we have .
We can prove an upper bound of the same order of magnitude, if every subgraph has small multiseparators. Note that when bounding , , the larger of the separators, gives the lower bound and , the smaller one, gives the almost matching upper bound, which implies that indeed for a large class of graphs and have the same order of magnitude.
Let be constants, let be a monotone function, and let be a graph such that any subgraph of has an -multiseparator of size at most . If for all , then
The condition on could be interpreted as having “at least polynomial growth”. The condition is fulfilled by the function if and only if . To put it differently, if and are given, the theorem applies with .
We also study the search problem for the special case of grid graphs.
Let be a positive integer and a sequence of positive integers. The -dimensional grid graph of side length , denoted by , has vertex set , and there is an edge between two vertices if and only if they differ in exactly one coordinate and the difference is . If , then we simply write .
We estimate the search complexity of grid graphs as follows.
As a tool, we will prove a bound on the cardinality of separators of grid graphs, using classic results from the theory of vertex isoperimetric problems and cube slicing.
The smallest -biseparator of the grid graph has cardinality .
We note that when considering grid graphs, one could also study the related problem that the path starting at is monotone, i.e., if and are on the path and (according to the usual partial order of the vectors), then the edge between and (if it exists) is directed towards . In this case the needed number of queries reduces dramatically. Indeed, the trivial algorithm which follows the path uses at most queries. In two dimensions we could improve slightly this upper bound, yet there is a more significant improvement by Xiaoming Sun (personal communication), who proved that queries are enough in two dimensions. From below, at least queries are needed regardless of [6, Lemma 6]. This problem resembles the pyramid-path search problem (but it is not exactly the same), where also a lower bound of is proved for the two-dimensional case .
Hirsch, Papadimitriou and Vavasis  have proved worst-case lower bounds for finding Brouwer fixed points for algorithms using only function evaluation. They showed a lower bound that is exponential in the dimension, disproving the conjecture that Scarf’s algorithm is polynomial. In our language, they have (implicitly) proved that [6, Lemma 16]. Our Theorem 1.5 is an improvement of their result, although we do not use the continuous setting but rather focus only on the discretization of the problem.
Later, Papadimitriou  considered similar complexity search problems in great detail and defined corresponding complexity classes , , etc. In his model, an exponential-size graph is given by a succinct representation, i.e., by the description of a Turing-machine . The vertices of the graph correspond to binary sequences of length and if we input such a sequence to , it outputs all the neighbors of the corresponding vertex in polynomial time (thus the degrees are bounded by a polynomial). Therefore, in his model, instead of considering query cost, one can work with the classical running time of the algorithm that gets as input. If the algorithm uses as a black box, we get back the query-cost model.
Papadimitriou considered the problem when the maximum degree of the graph is , i.e., it consists of vertex disjoint paths and cycles and we are also given, as part of the input, a degree-one vertex, , and our goal is to output another degree-one vertex. This search problem is denoted by LEAF, and the complexity class is defined such that LEAF is complete for . ( stands for “Polynomial Parity Argument”.)
Papadimitriou introduced another variant, where the underlying graph is directed ( outputs both the in- and out-neighbors of its input in this case), the in- and out-degree of every vertex is at most one, and we are given a starting vertex with in-degree zero and out-degree one. Therefore, the resulting digraph is the vertex-disjoint union of a directed path starting at and possibly some other directed paths and cycles, exactly like in the problem that we study. Here our goal can be either to output an in-degree one, out-degree zero vertex (called LEAFDS problem) or an in-degree plus out-degree equals one vertex (called LEAFD problem), which means the end of a path, just like in the problem we study. Thus, the query-cost of LEAFD is exactly .
The complexity classes for which the problems LEAFDS and LEAFD are complete are denoted, respectively, by and . It is easy to see that is contained in both and , while an oracle separation is known for the two latter classes . Nowadays enjoys huge popularity, as several problems, among them finding an -approximate Nash-equilibrium, turned out to be -complete. This is why this paper focuses on , the query-cost version of , though most of our results would also hold for the other variants.
An extensive list of -complete problems can be found on Wikipedia.
2 Upper bounds
Suppose that the connected components of are . If every vertex of has been queried, we know a which contains an endvertex (or that an endvertex is in , hence already identified).
The answers clearly show how many times we enter and leave from each component . If we enter a component more times than we leave it, then must contain an endvertex. If there is no such component, the component containing must contain an endvertex. ∎
This simple observation is crucial for our upper bounds and it does not hold if the answers would contain only the edges leaving the queried vertex.
Proof of Theorem 1.3..
Let us choose an -multiseparator with which cuts into parts , and query all vertices of . By Observation 2.1 we know a part which contains an endvertex. Let be restricted to and choose an -multiseparator of size at most , which cuts into parts .
Then is a separator of , which cuts it into parts . Thus, by again using Observation 2.1 after asking every vertex of we know which part contains an endvertex.
After this we can continue the same way, defining and asking , defining and asking and so on, until an endvertex is in some . As for any , one can easily see that . By the assumptions on , . Altogether at most questions were asked. ∎
A celebrated theorem of Lipton and Tarjan  states that planar graphs have -separators of size at most . Thus we have the following corollary.
If is planar, then .
Now, let us look at -dimensional grid graphs. Miller, Teng and Vavasis  introduced the so-called overlap graphs for every and proved that every member of the class has separator of size . They mention that any subset of the -dimensional infinite grid graph belongs to the class of overlap graphs. The polynomial function satisfies the assumption of Theorem 1.3. Since , this implies that . Here we show that the multiplicative constant is less than 3.
We follow the proof of Theorem 1.3, but the cuts we use are always axis-aligned hyperplanes, which cut the current part into two smaller grid graphs. More precisely, for any let , ; now is a hyperplane perpendicular to the coordinate axis, and it cuts into two parts of size at most . One can easily see that this is possible and , except if , in which case . This means that there are at most
3 Lower bounds
Before proving Theorem 1.6 which claims that any -separator in the grid graph has cardinality , we present a slightly weaker result, as it has a short proof not using results from the theory of isoperimetric problems.
Any -multiseparator in the grid graph has cardinality at least for .
We use induction on . The claim is trivial for . Let us denote by an -multiseparator.
Let us choose an arbitrary axis, and denote by the parallel lines in the grid which go in that direction. Let be the set of those lines which intersect . Note that every other element of contains vertices only from one component of . If , then we are done. Hence we can suppose .
Elements of cover less than points, hence for any component of , the other components together contain at least vertices, which are not covered by elements of . This means that there are at least elements of which contain only vertices not in . Now consider a hyperplane in the grid, orthogonal to the direction of the lines of , and denote by the vertices of that belong to the hyperplane. Clearly, contains at least elements not in , hence is an -multiseparator of (with ) and so we can apply induction on each of these -dimensional hyperplanes.
By induction, there are at least elements of in every such hyperplane, which gives at least elements in total. ∎
Before proving the stronger version of this result, we need to introduce some notations and results.
Let be an arbitrary set of vertices. The set of vertices that are not in , but are connected to some vertex of is called the boundary of , denoted by . Following the notations of Bollobás and Leader , we define an order on the vertices, the simplicial order, by setting if , or and for some we have and for all . This coincides with the lexicographic order according to the vector .
Theorem 3.2 (Bollobás and Leader ).
In , among sets of vertices of a given size, the initial segment of the simplicial order has the smallest boundary.
The special case , i.e., the hypercube, was previously treated by Harper , while the unbounded case of was solved by Wang and Wang . We note that in the paper of Bollobás and Leader the definition of boundary is different: they also include in .
We will also need some results about the volume of slices of a cube, i.e., intersections of the cube with specific hyperplanes. For a contemporary approach to this area we refer to . In the next theorem denotes the following set in the -dimensional unit cube : ; denotes the -dimensional volume of some set of dimension .
Let denote the -th layer of : the set of all vertices in whose coordinates sum to . The layer range from 0 to . We define the size of the “middle-most” layers by
In the even case, we actually know that the middle level is the largest of the three levels in the definition of , as the levels decrease symmetrically in size from the middle to the ends . From discretizing the above theorem, one can obtain the following bound on . Its proof can be found in Appendix B.
For every , there exists a constant such that
Now we are ready to prove Theorem 1.6.
Proof of Theorem 1.6.
We start with the lower bound. Let us denote by a -biseparator which separates the vertex set and (such that ). If we are done. Thus we suppose that . Denote by the vertex set of size which is an initial segment of the simplicial order. By Theorem 3.2 we know that .
By the definition of the simplicial order, is contained in the union of two successive layers and : , where and . First we claim that must be very close to the middlemost layer. More precisely, if is odd, we can assume , and if is even, we can assume or .
We treat only the odd case, the even case being similar. First, we show that must reach at least level . If were disjoint from , we would get
since the last set contains only vertices in the lower half of the levels. This contradicts the requirement fact that must cover at least half of the vertices. Secondly, if would contain vertices of level , it would contain more than the levels which make up half of all vertices. This is again a contradiction to the -biseparator property.
By the definition of , we have now established that each of the two central layers and contains at least points. To conclude the proof, we show that the separator which is contained in the two layers and must have size at least . If a vertex of is not in , then the adjacent vertex defined by must be in unless it is not a point of the grid (i.e., ):
Since the number of vertices of for which is , we obtain
from which the bound follows.
For the upper bound, we simply take the central layer of size as a biseparator. ∎
Now we are ready to prove Theorem 1.2, that .
Proof of Theorem 1.2..
We will use an adversary argument for the lower bound on the number of queries. The adversary will try to answer the queries in such a way that the discovery of the endvertex by the searcher is delayed as much as possible. The adversary need not choose a path in advance, but it is required that the answers remain consistent with some path.
Let denote the vertices that have been queried so far in the search. We will show that the adversary can achieve that after the other end of the path is found, becomes a -biseparator. The adversary maintains a component of , see Figure 1. is the set of vertices which can possibly be the endvertex of the path. (The adversary will follow a greedy strategy of keeping this set as large as possible.) In addition to , the adversary maintains a path between and some vertex , which will be part of the final path and for which . The remaining components of are partitioned into two sets such that both and contain at most vertices and there are no edges between and . Thus we always have a partition into four disjoint sets . The adversary can reveal all these data to the searcher as free additional information. Initially, , and .
The strategy is the following. If the queried vertex is in , the adversary repeats the previous answer for this vertex. If , the adversary answers by reporting the ingoing and outgoing edge of at that vertex. If , then the answer is that “the path does not pass through this vertex.” In these cases, no new information is revealed to the searcher. The vertex , the set , and the path remain unchanged; the only change is that is moved from to .
Let us now look at the case . Let be the partition of into connected components. The adversary chooses a largest component , and will answer in such a way that the new set becomes .
Therefore, if contains , the answer is again “the path does not pass through this vertex,” see Figure 2a. The current endpoint and the path are unchanged. If does not contain (including the case ), then choose to be a neighbor of , see Figure 2b. As was a possible endpoint of the path before this step, there is a path from to which lies in . The adversary uses and the edge to extend the path to a longer path . (This is the only case when the path is updated.) The adversary reports the last arc of as the ingoing arc at and as the outgoing arc.
To maintain the invariant that , we go through the components and add them either to or to (to eventually obtain and ), whichever is smaller. If, for example, , then as are disjoint subsets of . Therefore, the invariant is maintained.
The searcher can only identify , the end of the path, when becomes . By assumption, the graph has at least two vertices and is connected, and therefore . Thus, at this point,
We can now add the singleton set to the smaller of and without exceeding the size bound . The set of queried vertices forms thus a -biseparator. ∎
Theorem 1.5 summarizes the above results. The lower and upper bounds are quite close. Specifically, if we consider as fixed, then the theorem gives exact asymptotics in for the needed number of queries.
4 Concluding Remarks
Here we mention three more variants of the problem.
In the first variant, we consider any directed subgraph of and a vertex with larger out-degree than in-degree. In this version there is a vertex with higher in-degree than out-degree, our goal is to find such a vertex. All of our algorithms work in this case, and obviously the same lower bounds hold.
In the second variant, consists of directed paths and cycles, but we also assume that they cover every vertex. This is a special case of our model, hence the upper bounds hold. However, a lower bound similar to Theorem 1.2 is not plausible, as there are graphs that have only big separators, yet there are only a few valid choices for . For example if contains a vertex of degree one, different from the source, then this vertex must be the endvertex. But in case of grid graphs we can show that the additional assumption on does not make the problem much easier.
Denote by the minimum number of queries needed to find an endvertex in the worst-case for any . Now we show how to give a lower bound for . Let us suppose we are given an grid graph . Then let denote the grid graph.
Let be a grid graph. Then .
The proof of this theorem can be found in Appendix C.
One can easily see that if 4 divides and is the grid graph, then . We need a lower bound on the size of separators in . It is easy to see that if we replace every vertex of by 16 vertices to get , an -separator is replaced by an -separator, hence the same lower bound of , divided by 16, holds for .
In the third variant, is undirected. Our goal is to find another endvertex and the answer to the query is the at most two incident edges. Obviously, this is a harder problem than the directed variant. Hence our lower bounds hold, and one can easily modify our proofs to get the same upper bounds as well. For example, in Observation 2.1, the endvertex is in the component which is connected to by an odd number of edges, counting an extra edge for the component of .
Finally, a straightforward application of our proofs gives the asymptotics to a question recently asked on MathOverflow , which is the following. Given a path from the bottom-left vertex of an grid to its top-right vertex, and another path from its top-left vertex to its bottom-right vertex, how many queries are needed to find a vertex contained in both paths? The proofs of Theorems 1.2 and 2.3 can be adapted to show that queries are necessary and sufficient.
We would like to thank our anonymous referee for the remarks that improved the presentation of the paper.
-  P. Beame, S.A. Cook, J. Edmonds, R. Impagliazzo, T. Pitassi, The Relative Complexity of NP Search Problems, J. Comput. Syst. Sci. 57(1), 3–19 (1998).
-  B. Bollobás, I. Leader, Compressions and lsoperimetric Inequalities, Journal of Combinatorial Theory, Series A 56 (1991), 47–62.
-  N. G. de Bruijn, Ca. van Ebbenhorst Tengbergen, and D. Kruyswijk, On the set of divisors of a number, Nieuw Arch. Wiskunde (2) 23 (1951), 191–193.
-  D. Gerbner, B. Keszegh: Path-search in the pyramid and in other graphs, Journal of Statistical Theory and Practice, 6(2) (2012), 303–314.
-  L. H. Harper, Optimal numberings and isoperimetric problems on graphs, J. Combin. Theory 1 (1966), 385–393.
-  D. Hirsch, C. H. Papadimitriou, S. A. Vavasis: Exponential Lower Bounds for Finding Brouwer Fixed Points, Journal Of Complexity 5 (1989), 379–416.
-  R. J. Lipton, R. E. Tarjan, A separator theorem for planar graphs, SIAM Journal on Applied Mathematics 36 (2) (1979), 177–189.
-  G.L. Miller, S. Teng, S.A. Vavasis: A Unified Geometric Approach to Graph Separators, 32nd Annual Symposium on Foundations of Computer Science Proceedings (1991), 538–547.
-  mathoverflow.net/q/185003, A problem on chains of squares – can one find an easy combinatorial proof?
-  C. Papadimitriou, On the complexity of the parity argument and other inefficient proofs of existence. J. Comput. System Sci. 48 (1994), 498–532.
-  R.P. Stanley, Enumerative Combinatorics. Vol. 1, Corrected reprint of the 1986 original. Cambridge Studies in Advanced Mathematics, 49, Cambridge University Press, Cambridge (1997)
-  G. Pólya, Berechnung eines bestimmten Integrals, Math. Ann. 74 (1913) 204–212.
-  D.-L. Wang and P. Wang, Discrete isoperimetric problems, SIAM J. Appl. Math. 32 (1977), 860–870.
-  Y. Xu, R. Wang, Asymptotic properties of -splines, Eulerian numbers and cube slicing, Journal of Computational and Applied Mathematics 236(5) (2011), 988–995.
Appendix A Biseparators for Ternary Trees
We show that a rooted ternary tree with complete levels has . Any root-to-leaf path is a -biseparator, establishing the upper bound. Let us turn to the lower bound. A complete ternary tree of height has vertices. It is convenient to give each vertex a “weight” of 2. The total weight of the tree becomes , which is very near to a power of 3. In ternary notation, with twos, and the ideal weight for the halves of the biseparator is .
After removing a separating set, any union of components of the complement can be represented as a sum and difference of subtrees. Here, by a subtree we mean a node together with all its descendents. If the separator has nodes, we must be able to group the resulting components into a set that has between and nodes, i.e., weight between and . Each separator node creates at most four new subtrees from which the sum and difference can be formed: its own subtree and the three children subtrees. (These latter ones exist only if the node was not a leaf.) So with separating nodes, we get subtrees from which to form the sum and difference. Each tree has a weight of the form .
If we take a sum and difference of subtrees we must fulfill the inequality
For any number in the range , the ternary representation starts with at least ones. On the other hand, one easily sees by induction that a sum and difference of powers of 3 has at most ones in its ternary representation. We thus get the relation , from which follows. ∎
Appendix B Proof of Corollary 3.4
We show that for any fixed (and then by symmetry for every too), whenever is an integer,
We define , i.e., the volume of the middle slice of the unit hypercube. Setting in Theorem 3.3 establishes the convergence of to .
The layer , for , is a discrete version of a slice of a cube. If we fix the first coordinates, then there is at most one vertex in that has these first coordinates. Thus , where is the projection of along the last axis.
To estimate the size of (and thus of ) take first the middle slice of the continuous unit cube and project it to the first coordinates, yielding the polytope . As the normal vector of the slice is , projecting it to the hyperplane orthogonal to the last axis scales the volume by a factor of :
Now let , i.e., we blow up by a factor . Let be the set of grid points in this . As for fixed , is a factor- blow up of some fixed -dimensional convex polytope, the difference between its volume and the number of grid points in it is (this follows basically from the definition of the volume, for details see e.g., [11, Proposition 4.6.13]), thus
Now we are left to show that . For that it is enough to show that and are . For all of these points the sum of the coordinates is equal to (resp. ) for some . This is points for every , altogether points, which finishes the proof. ∎
Appendix C Proof of Theorem 4.1
Suppose we are given a grid graph and an Algorithm A which finds in in case one path and some cycles cover every vertex. We show an Algorithm B which finds the endvertex in in case there is only a directed path. We can naturally identify every vertex of with a grid in : the vertex corresponds to the axis-parallel rectangle (we call it a block) having vertices, whose two opposite corners are and . We call and the even corners and the two other corners and the odd corners.
Consider a directed path in . We call a system of a directed path and some directed cycles in good if they cover every vertex and the path goes through exactly those blocks which correspond to the vertices of , in the same order.
Now we construct good systems. If a vertex is not on the path, we cover the corresponding block by a cycle. In case of a vertex on the path in , the directed path arrives at the corresponding block in some corner , and goes straight to a neighboring corner , where it leaves. The remaining vertices form a rectangle, which can be covered by a cycle. Finally, when is the very last vertex on the path, we define similarly, and cover the remaining vertices by a path starting in .
Our good systems will satisfy an additional property. If, for a vertex , the coordinate sum is even, then the first vertex of the path in the corresponding block is an even corner, and the last vertex is an odd corner. In case is odd, it is the other way round. Note that if it is true for , it has to be true for every other block as well. Indeed, when the path leaves a block at, for example, an odd corner, it either moves in one of the first two dimensions (then it arrives to an even corner, and does not change), or in another dimension (then it arrives to an odd corner, but the parity of changes).
Note that these properties do not uniquely determine the system. We will incrementally determine the graph as queries arrive.
Now we are ready to define Algorithm B. At every step we call Algorithm A, and then answer such a way that at the end we get a good system. If Algorithm A would query a vertex in , Algorithm B queries the corresponding vertex in instead (i.e., the vertex with ). Using the answer for this query, we choose all the edges incident to vertices of and answer to Algorithm A according to this. If has been asked before, we have already determined the edges in , and answer accordingly. Suppose that has not been queried before. In case the answer is that is not on the path, choose an arbitrary cycle covering the vertices of the corresponding block and answer according to the edges incident to .
In case the answer gives two arcs and , we have to choose the entering vertex and the exit vertex . We will discuss this choice below. This choice will define 5 edges on the path and a cycle of length 12. One edge connects the blocks corresponding to and , leaving the last vertex of the path in and arriving at the first vertex of the path in , i.e., this edge is . Similarly we add the edge . We also add the three edges which connect and . Finally we cover the remaining 12 vertices with a cycle.
We still have to tell which one of the two possible first vertices we use as , and similarly for the possible last vertices. If has already been determined, this fixes the choice of as the vertex adjacent to it. If is parallel to one of the first two axes, this also reduces the choice of the corner to one possibility. Otherwise we pick arbitrarily among the two choices. The exiting vertex is determined analogously.
Even if Algorithm A would know all answers in , it does not give more information than what Algorithm B knows after asking . Algorithm A does not finish before Algorithm B finds the end vertex, thus Algorithm A needs at least as many queries as Algorithm B (on the respective graphs), which finishes the proof. ∎