Combinatorics and Geometry of Transportation Polytopes: An Update
A transportation polytope consists of all multidimensional arrays or tables of non-negative real numbers that satisfy certain sum conditions on subsets of the entries. Transportation polytopes arise naturally in optimization and statistics, and are also of interest in discrete mathematics because permutation matrices, latin squares, and magic squares appear naturally as lattice points of these polytopes.
In this paper we survey advances in the understanding of the combinatorics and geometry of these polyhedra and include some recent unpublished results on the diameter of graphs of these polytopes. In particular, this is a thirty-year update on the status of a list of open questions last visited in the 1984 book by Yemelichev, Kovalev and Kravtsov and the 1986 survey paper of Vlach.
2010 Mathematics Subject Classification: 37F20, 52B05, 90B06, 90C08
Transportation polytopes are well-known objects in mathematical programming and statistics. In the operations research literature, classical transportation problems arise from the problem of transporting goods from a set of factories, each with a given supply, to a set of consumer centers, each with a given demand. Assuming the total supply equals the total demand and that costs are specified for each possible pair (factory, consumer center), one may wish to minimize the cost of transporting goods. Indeed this was the original motivation that led Kantorovich (see ), Hitchcock (see ), and T. C. Koopmans (see ) to look at these problems. They are among the first linear programming problems investigated, and Koopmans received the Nobel Prize in Economics for his work in this area (see  for an interesting historical perspective). Not much later Birkhoff (see ), von Neumann (see ), and Motzkin (see, e.g., ) were key contributors to the topic. The success of combinatorial algorithms such as the Hungarian method (see [7, 21, 72, 73, 102, 107, 108, 118, 138]) depends on the rich combinatorial structure of the convex polyhedra that define the possible solutions, the so-called transportation polytopes.
In statistics, integral transportation tables are widely known as contingency tables. A contingency table represents sample data arranged or tabulated by categories of combined properties. Several questions motivate the study of the geometry of contingency tables. One example is the table entry security problem: given a table (multi-dimensional perhaps) with statistics on private data about individuals, we may wish to release aggregated marginals of such a table without disclosing information about the exact entries of the table. What can a data thief discover about the table from the published marginals? When is a table uniquely identifiable by its margins? This problem has been studied by many researchers (see [34, 42, 46, 47, 66, 67, 68, 74, 98] and the references therein). Another natural problem is whether a given table presents strong evidence of significant relations between the characteristics tabulated (e.g., is cancer related to smoking?). There is a lot of interest among statisticians in testing the significance of independence for variables. Some methods depend on counting all possible contingency tables with given margins (see e.g., [65, 113]). This in turn is an interesting combinatorial geometric problem on the lattice points of transportation polytopes.
In this article we survey the state of the art in the combinatorics and geometry of transportation polytopes and contingency tables. The survey  by Vlach, the 1984 monograph  by Yemelichev, Kovalev, and Kravtsov, and the paper  by Klee and Witzgall summarized the status of transportation polytopes up to the 1980s. Due to recent advances on the topic by the authors and others, we decided to write a new updated survey collecting remaining open problems and presenting recent solutions. We also included details on some unpublished new work on the diameter of the graphs of these polytopes.
2. Classical transportation polytopes (-ways)
We begin by introducing the most well-known subfamily, the classical transportation polytopes in just two indices. We call them -way transportation polytopes and in general -ways refers to the case of variables with indices. Many of these facts are well-known and can be found in , but we repeat them here as we will use them in what follows.
Fix two integers . The transportation polytope of size defined by the vectors and is the convex polytope defined in the variables () satisfying the equations
Since the coordinates of are non-negative, the conditions (2.1) imply is bounded. The vectors and are called marginals or margins. These polytopes are called transportation polytopes because they model the transportation of goods from supply locations (with the th location supplying a quantity of ) to demand locations (with the th location demanding a quantity of ). The feasible points in a transportation polytope model the scenario where a quantity of of goods is transported from the th supply location to the th demand location. See Figure 1.
Let us consider the transportation polytope defined by the marginals and , which corresponds to the transportation problem shown in Figure 1. A point in is shown in Figure 2. The equations in (2.1) are conditions on the row sums and column sums (respectively) of tables .
2.1. Dimension and feasibility
Notice in Example 2.1 that . The condition that the sum of the supply margins equals the sum of the demand margins is not only necessary but also sufficient for a classical transportation polytope to be non-empty:
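Let be the classical transportation polytope defined by the marginals and . The polytope is non-empty if and only if the entries of the supply marginal and the entries of the demand marginal have the same sum (condition (2.2)).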
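The sufficiency direction of this feasibility condition can be made concrete with the classical northwest corner rule, which builds one feasible table whenever the two margins have equal totals. The following is a minimal sketch, not taken from the survey; the function name `northwest_corner` and the example margins are hypothetical choices made for illustration.

```python
from fractions import Fraction


def northwest_corner(supply, demand):
    """Return a non-negative table with row sums `supply` and column sums `demand`.

    Illustrative sketch: the name and interface are assumptions, not the
    survey's notation. Exact rational arithmetic avoids rounding issues.
    """
    if sum(supply) != sum(demand):
        raise ValueError("total supply must equal total demand")
    supply = [Fraction(s) for s in supply]
    demand = [Fraction(d) for d in demand]
    m, n = len(supply), len(demand)
    table = [[Fraction(0)] * n for _ in range(m)]
    i = j = 0
    while i < m and j < n:
        # Ship as much as possible from supply i to demand j.
        shipped = min(supply[i], demand[j])
        table[i][j] = shipped
        supply[i] -= shipped
        demand[j] -= shipped
        # Move down when the supply is exhausted, otherwise move right.
        if supply[i] == 0:
            i += 1
        else:
            j += 1
    return table


# Hypothetical margins with equal totals (5 + 3 = 2 + 4 + 2).
table = northwest_corner([5, 3], [2, 4, 2])
print(table == [[2, 3, 0], [0, 1, 2]])
```

The table produced this way has at most as many positive entries as there are steps in the staircase path, which is why the rule is often used to produce a starting point for pivoting methods.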
The equations (2.1) and the inequalities can be rewritten in the matrix form
with a - matrix of size and a vector called the constraint matrix. The constraint matrix for a transportation polytope is the vertex-edge incidence matrix of the complete bipartite graph .
Let be the constraint matrix of a transportation polytope . Then:
Maximal rank submatrices of correspond to spanning trees on .
Each subdeterminant of is , thus is totally unimodular.
If , its dimension is .
Continuing from Example 2.1, observe , where is the constraint matrix
Up to permutation of rows and columns, the matrix is the unique constraint matrix for classical transportation polytopes. It is a matrix of rank five. Thus, is a four-dimensional polytope described in a nine-dimensional ambient space.
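The rank and dimension count in this example can be checked mechanically. The sketch below builds the vertex-edge incidence matrix of the complete bipartite graph and computes its rank by exact Gaussian elimination; the function names and the row-major ordering of the cells are assumptions made for illustration.

```python
from fractions import Fraction


def constraint_matrix(m, n):
    """Incidence matrix of the complete bipartite graph on m + n nodes.

    Rows: m supply nodes followed by n demand nodes.
    Columns: cells (i, j) in row-major order (an assumed convention).
    """
    rows = []
    for i in range(m):  # row-sum equation of supply node i
        rows.append([1 if cell // n == i else 0 for cell in range(m * n)])
    for j in range(n):  # column-sum equation of demand node j
        rows.append([1 if cell % n == j else 0 for cell in range(m * n)])
    return rows


def rank(matrix):
    """Rank of a matrix, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


A = constraint_matrix(3, 3)
print(rank(A))               # rank of the 6 x 9 constraint matrix
print(3 * 3 - rank(A))       # dimension of the solution space of the equations
```

For the 3 by 3 case this gives rank five and a four-dimensional solution space inside the nine-dimensional ambient space, in agreement with the example.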
Birkhoff polytopes, first introduced by G. Birkhoff in , are an important subclass of transportation polytopes:
The th Birkhoff polytope, denoted by , is the classical transportation polytope with margins .
The Birkhoff polytope is also called the assignment polytope or the polytope of doubly stochastic matrices (see, e.g., ). It is the perfect matching polytope of the complete bipartite graph . We can generalize the definition of the Birkhoff polytope to rectangular arrays:
The central transportation polytope is the classical transportation polytope with and . This polytope is also called the generalized Birkhoff polytope of size .
2.2. Combinatorics of faces and graphs
The study of the faces of transportation polytopes is a nice combinatorial question (see, e.g., ). Unfortunately it is still incomplete, e.g., one does not know the number of -dimensional faces of each dimension other than in a few cases. E.g., in , Pak presented an efficient algorithm for computing the -vector of the generalized Birkhoff polytope of size . Hartfiel (see ) and Dahl (see ) described the supports of certain feasible points in classical transportation polytopes. In this section, we fully describe the vertices and the edges of a -way transportation polytope . The resulting graph has some interesting properties, but there are still open questions about it.
Let be a classical transportation polytope. For a point , define the support set . We also define a bipartite graph , called the support graph of . The graph is the following subgraph of the complete bipartite graph :
Vertices of . The vertices of the graph are the vertices of the complete bipartite graph . We label the supply nodes and the demand nodes .
Edges of . There is an edge if and only if is strictly positive. In other words, the edge set is indexed by .
An important subclass of transportation polytopes are those which are generic. Generic transportation polytopes are easiest to analyze in the proofs which follow and are the ones typically appearing in applications. Generic -way transportation polytopes are those whose vertices have maximal possible non-zero entries. All generic transportation polytopes are simple, but not vice versa.
A classical transportation polytope is generic if
for every non-empty proper subset and non-empty proper subset . (Of course, due to (2.2), we must disallow the case where and .)
The graph properties of provide a useful combinatorial characterization of the vertices of classical transportation polytopes:
Lemma 2.9 (Klee, Witzgall ).
Let be a classical transportation polytope defined by the marginals and , and let . Then the graph is spanning. The point is a vertex of if and only if is a spanning forest. Moreover, if is generic, then is a vertex of if and only if is a spanning tree.
Let be a point in a generic classical transportation polytope . Then is a vertex of if and only if .
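The vertex criterion of Lemma 2.9 is easy to test computationally: one only has to check that the support graph of a feasible table has no cycle. The following sketch uses a union-find structure for this; the function names are hypothetical, and the code assumes its input already satisfies the margin equations (it does not verify them).

```python
def support(table):
    """Cells (i, j) with a strictly positive entry."""
    return [(i, j) for i, row in enumerate(table) for j, x in enumerate(row) if x > 0]


def support_is_forest(table):
    """True if the support graph of `table` is acyclic.

    Nodes 0..m-1 are supply nodes and m..m+n-1 are demand nodes
    (an assumed labelling). By Lemma 2.9, a feasible table passing
    this test is a vertex of its transportation polytope.
    """
    m, n = len(table), len(table[0])
    parent = list(range(m + n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for i, j in support(table):
        a, b = find(i), find(m + j)
        if a == b:  # the edge (i, j) would close a cycle
            return False
        parent[a] = b
    return True


print(support_is_forest([[2, 3, 0], [0, 1, 2]]))  # support is a path
print(support_is_forest([[1, 1], [1, 1]]))        # support is a 4-cycle
```

The second table is the midpoint of two vertices of a 2 by 2 polytope, so it is feasible but not a vertex, and its support graph indeed contains a cycle.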
A vertex of a transportation polytope is non-degenerate if it has positive entries. Otherwise, the vertex is degenerate. A transportation polytope is non-degenerate if all its vertices are non-degenerate. Non-degenerate transportation polytopes are of particular interest, as they have the largest possible number of vertices and largest possible diameter among the graphs of all transportation polytopes of given type and parameters (e.g., , , and ). Indeed, if is a degenerate transportation polytope, by carefully perturbing the marginals that define we can get a non-degenerate polytope . (A careful explanation of how to do the perturbation is given in Lemma 4.6 of Chapter 6 in  on page 281.) The perturbed marginals are obtained by taking a feasible point in , perturbing the entries in the table and using the recomputed sums as the new marginals for . The graph of can be obtained from that of by contracting certain edges, which can increase neither the diameter nor the number of vertices.
Given integral marginals , all vertices of the corresponding transportation polytope are integral.
We now recall a classical characterization of the vertices of the Birkhoff polytope:
Theorem 2.12 (Birkhoff-von Neumann Theorem).
The vertices of the th Birkhoff polytope are the - permutation matrices of size .
In other words, the vertices of the Birkhoff polytope are the permutation matrices, so every doubly stochastic matrix is a convex combination of permutation matrices. This theorem was proved by Birkhoff in  and proved independently by von Neumann (see ). Equivalent results were shown earlier in the thesis  of Steinitz, and the theorem also follows from  and  by Kőnig. For a more complete discussion, see the preface to . See also the papers [25, 26, 27, 28], where various combinatorial and geometric properties of the Birkhoff polytope, such as its graph, were studied. Because of the above theorem, Birkhoff polytopes play an important role in combinatorics and discrete optimization, and the literature about their properties is rather large.
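The convex-combination statement can be illustrated by a greedy decomposition: repeatedly find a permutation supported on the positive entries, subtract as much of it as possible, and continue with what is left. The sketch below is a brute-force illustration for small matrices, not an algorithm from the survey; the function name is hypothetical and the search over all permutations is only practical for small sizes.

```python
from fractions import Fraction
from itertools import permutations


def birkhoff_decomposition(matrix):
    """Write a doubly stochastic matrix as a weighted sum of permutations.

    Returns a list of (weight, perm) pairs, where perm[i] is the column
    of the 1 in row i. Assumes the input is doubly stochastic; otherwise
    the search for a permutation may fail with StopIteration.
    """
    n = len(matrix)
    remaining = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x > 0 for row in remaining for x in row):
        # A permutation inside the positive support exists because the
        # remainder is a positive multiple of a doubly stochastic matrix.
        perm = next(
            p for p in permutations(range(n))
            if all(remaining[i][p[i]] > 0 for i in range(n))
        )
        weight = min(remaining[i][perm[i]] for i in range(n))
        for i in range(n):
            remaining[i][perm[i]] -= weight  # zeroes at least one entry
        terms.append((weight, perm))
    return terms


half = Fraction(1, 2)
example = [[half, half, 0], [half, 0, half], [0, half, half]]
for weight, perm in birkhoff_decomposition(example):
    print(weight, perm)
```

Each round removes at least one positive entry, so the loop stops after finitely many steps, and the weights sum to one because every row of the input sums to one.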
We also want to know how many vertices a transportation polytope can have. In particular there is a visible difference in behavior between generic and non-generic polytopes. What is the maximum number of vertices? The exact formula is complicated but the following result of Bolker in  can serve as a reference:
Lemma 2.13 (Bolker, ).
The maximum possible number of vertices among transportation polytopes is achieved by the central transportation polytope whose marginals are and .
Indeed one can characterize which transportation polytopes reach the largest possible number of vertices. (See results by Yemelichev, Kravtsov and collaborators from the 1970’s mentioned in .)
What are the possible values for the number of vertices of a generic transportation polytope? Are there gaps or do all integer values on an interval occur?
The number of vertices of a non-degenerate classical transportation polytope is divisible by .
|sizes|Distribution of number of vertices in transportation polytopes|
|---|---|
||3 4 5 6|
||4 6 8 10 12|
||5 8 11 12 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30|
||9 12 15 18|
||16 21 24 26 27 29 31 32 34 36 37 39 40 41 42 44 45 46 48 49 50 52 53 54 56 57 58 60 61 62 63 64 66 67 68 70 71 72 74 75 76 78 80 84 90 96|
||108 116 124 128 136 140 144 148 152 156 160 164 168 172 176 180 184 188 192 196 200 204 208 212 216 220 224 228 232 236 240 244 248 252 256 260 264 268 272 276 280 284 288 296 300 304 312 320 340 360|
The support graph associated to a point of the transportation polytope also characterizes edges of classical transportation polytopes. (See Lemma 4.1 in Chapter 6 of .)
Let and be distinct vertices of a classical transportation polytope . Then the vertices and are adjacent if and only if the graph contains a unique cycle.
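This adjacency criterion can also be tested directly. The sketch below assumes that the graph in the statement is the union of the support graphs of the two vertices (the symbol is missing in the text above, so this reading is an assumption), and it uses the fact that a graph contains exactly one cycle precisely when its cyclomatic number, edges minus nodes plus connected components, equals one. All function names are hypothetical.

```python
def support_edges(table):
    """Set of cells (i, j) with a strictly positive entry."""
    return {(i, j) for i, row in enumerate(table) for j, x in enumerate(row) if x > 0}


def cyclomatic_number(edges, m, n):
    """Edges - nodes + components for a bipartite graph on m + n nodes."""
    parent = list(range(m + n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = m + n
    for i, j in edges:
        a, b = find(i), find(m + j)
        if a != b:
            parent[a] = b
            components -= 1
    return len(edges) - (m + n) + components


def adjacent_by_support(x, y):
    """Adjacency test for two distinct vertices x and y of the same polytope.

    Assumption: the relevant graph is the union of the two support graphs.
    """
    m, n = len(x), len(x[0])
    return cyclomatic_number(support_edges(x) | support_edges(y), m, n) == 1


def permutation_matrix(perm):
    """0-1 matrix with a 1 in position (i, perm[i])."""
    return [[1 if perm[i] == j else 0 for j in range(len(perm))] for i in range(len(perm))]


print(adjacent_by_support(permutation_matrix((0, 1)), permutation_matrix((1, 0))))
print(adjacent_by_support(permutation_matrix((0, 1, 2, 3)), permutation_matrix((1, 0, 3, 2))))
```

In the second call the union of the supports consists of two disjoint 4-cycles, so the two permutation matrices are not adjacent in the Birkhoff polytope of size four.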
Let be the transportation polytope () defined by marginals and . Pick integers and . The subset of points of
is a facet of if and only if .
See Figure 4 for an example. From this basic characterization we see:
For and , the possible number of facets of a transportation polytope is a number of the form for and only such integers can occur.
For example, transportation polytopes can have , , , or facets and only these values occur.
Diameter of graphs of transportation polytopes
Now we study a classical question about the graphs of transportation polytopes. Recall that the distance between two vertices of a polytope is the minimal number of edges needed to go from to in the graph of . The diameter of a polytope is the maximum possible distance between pairs of vertices in the graph of the polytope. Though the Hirsch Conjecture was finally shown to be false in general for polytopes (see ), the problem is still unsolved for transportation polytopes, and diameter bounds for this special class of polytopes are very interesting. Dyer and Frieze (see ) gave the first polynomial diameter bound for totally unimodular polytopes which applies to classical transportation polytopes (and more generally to network polytopes), but this was recently improved by Bonifas et al. in .
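For small examples the distance and diameter just defined can be computed by breadth-first search on the graph of the polytope. The sketch below takes the graph as an adjacency dictionary; the function names and the hexagon example are illustrative assumptions, and the graph is assumed to be connected, as graphs of polytopes are.

```python
from collections import deque


def distances_from(graph, source):
    """Breadth-first search distances from `source` in an undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def diameter(graph):
    """Maximum distance between two nodes of a connected graph."""
    return max(max(distances_from(graph, v).values()) for v in graph)


# Graph of a hexagon: a cycle on six vertices.
hexagon = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
print(diameter(hexagon))
```

This brute-force approach runs one search per vertex, which is adequate for checking conjectured bounds on small polytopes but not for large ones, whose vertex counts grow quickly.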
The diameters of classical transportation polytopes and their applications (see, e.g., ) have been studied extensively. In , Balinski proved that the Hirsch Conjecture holds and is tight for dual transportation polyhedra. For the specific case of transportation polytopes Yemelichev, Kovalev, and Kravtsov (see Theorem 4.6 in Chapter 6 of  and the references therein) and Stougie (see ) presented improved polynomial bounds. This was improved to a quadratic bound by van den Heuvel and Stougie in , and further improved to a linear bound:
Theorem 2.19 (Brightwell, van den Heuvel, Stougie ).
The diameter of every transportation polytope is at most .
The bound follows from a crucial lemma which bounds the graph distance between any two vertices and of a transportation polytope , by constructing vertices and of and nodes of such that , , and . In the arguments below, there is an important distinction between vertices of the polytope (which we always denote by or ) and nodes of the support graph of a vertex of (which we always denote by or ).
Theorem 2.20 (Hurkens ).
The diameter of every transportation polytope is at most .
We present a brief sketch of Hurkens’ proof. The result follows immediately from this lemma:
Lemma 2.21 (Hurkens ).
For any two vertices and of a transportation polytope , there is an integer , a vertex of , and nodes of such that:
for , and
The key idea that Hurkens showed is that four pivots are required (on average) to construct a common leaf node. More specifically, Hurkens proved this lemma by showing that for any two vertices and of a transportation polytope , there is a node in (which can be assumed to be a supply) with incident edges in where are all leaf nodes (which are necessarily demands) of . Moreover, the nodes of identified in Hurkens’ algorithm also satisfy the property that if
then there is a vertex of obtained after at most pivots from the vertex of such that and have common leaf nodes.
In the algorithm of Brightwell, van den Heuvel, and Stougie (see ), pivots are applied to vertices and of , resulting in new vertices and of . A key difference in Hurkens’ algorithm in  is that pivots are only applied to one of the two vertices and of . Without loss of generality, pivots are applied to the vertex of and not applied to the vertex of . Thus, we do not describe the vertex further. Other than the property that the demand nodes are leaf nodes in adjacent to the node , the structure of may be arbitrary.
We label the relevant supply and demand nodes participating in pivots. For each let for be the edges in incident to , where . Let be the edges in incident to for where . See Figure 5.
Here we describe the successive pivots applied starting from the vertex of . For each , we do the following:
If is not in the support graph, pivot to add . Then, pivot to add edges of the form for until all edges of the form are removed.
If is not in the support graph, pivot to add . Then, pivot to add edges of the form for until all edges of the form are removed.
Continue in this way for : If is not in the support graph, pivot to add it. Then, pivot to add edges of the form for until all edges of the form are removed.
In the resulting vertex of , the support graph has as leaf nodes adjacent to , which matches the support graph of the vertex of . What remains to show (and we skip it) is that there is a choice of nodes where the number of pivots performed is at most . Instead, we illustrate the idea behind the sequence of prescribed pivots in an example:
Since is not in the support graph of the vertex of , we insert it, and the pivot operation removes the edge . We now apply pivots to the resulting adjacent vertex of as follows: After the pivot, only the edges and are incident to the demand node . These two edges are removed by pivoting to add the edges and , respectively, which causes to be a leaf node adjacent to .
After insertion of the edge the remaining edge of the form is removed the same way. Since is already a leaf node, the insertion of will cause it to be a leaf node adjacent to .
To prove that the Hirsch Conjecture is true for transportation polytopes, one would hope that any pair of vertices that differ in support elements has a pivot step that reduces the number of non-zero variables in which the vertices differ, but Brightwell et al.  noticed that this was not true. We show their counter-example in Figure 7.
Open Problem 2.23.
Prove or disprove the Hirsch Conjecture for -way transportation polytopes.
By Corollary 2.18, this would mean the diameter is less than or equal to . This conjecture holds for many special cases that restrict the margins. For example the conjecture is true for Birkhoff’s polytope and for some special right-hand sides (see e.g., ).
While transportation polytopes seem tame compared to other polytopes, it has been shown that they have some non-trivial topological structure. Diameter bounds for simple -polyhedra can be studied via decomposition properties of related simplicial complexes. Each non-degenerate simple polytope has a polar simplicial complex, a simplicial polytope. Billera and Provan (see ) showed that polytopes whose dual simplicial polytope is weakly vertex decomposable have a linear diameter. However, it has recently been shown (see ) that the infinite family of polars of transportation polytopes for are not weakly vertex-decomposable, the first ever such examples. At the same time, one can prove the Hirsch Conjecture holds for transportation polytopes by proving a stronger statement:
The Hirsch Conjecture holds for all convex polytopes obtained as the intersection of a cube and a hyperplane.
Fix a dimension . Let be the hyperplane determined by the non-zero normal vector and the constant . Let denote the -dimensional cube with - vertices. Then, let denote the polytope obtained as their intersection .
If the dimension of the polytope is less than , then is a face of . In that case, itself is a cube of lower dimension, so we assume that the polytope is of dimension . We may also assume that the polytope is not a facet of the -cube, so that intersects the relative interior of .
Without assuming any genericity, a simple dimension argument shows that the vertices of the polytope are either on the relative interior of an edge of the cube or are vertices of the cube. We assume that is sufficiently generic. Then, no vertex of the cube will be a vertex of . For each vertex of , we define its side signature to be a string of length consisting of the characters , , and by the following rule:
By genericity, it cannot be the case that there are two vertices of with the same side signature. Indeed, if there were two distinct vertices and with the same side signature, then will contain the entire edge of the cube containing them both, and and will not be vertices.
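The observation that every vertex of the intersection lies on an edge of the cube gives a direct way to list the vertices: intersect the hyperplane with each edge of the cube. The sketch below assumes the cube is the unit cube with 0-1 vertices and writes the hyperplane as `normal . x = constant`; these names and the function name `slice_vertices` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def slice_vertices(normal, constant):
    """Vertices of the intersection of the unit cube with a hyperplane.

    The hyperplane is {x : normal . x = constant} with a non-zero normal.
    Every edge of the cube fixes all coordinates but one to 0 or 1.
    """
    d = len(normal)
    a = [Fraction(x) for x in normal]
    b = Fraction(constant)
    vertices = set()
    for free in range(d):
        if a[free] == 0:
            # Edges in this direction are parallel to the hyperplane; any
            # cube vertex on the hyperplane is found from another direction.
            continue
        for fixed in product((Fraction(0), Fraction(1)), repeat=d - 1):
            point = list(fixed[:free]) + [None] + list(fixed[free:])
            rest = sum(a[k] * point[k] for k in range(d) if k != free)
            t = (b - rest) / a[free]  # value of the free coordinate
            if 0 <= t <= 1:
                point[free] = t
                vertices.add(tuple(point))
    return vertices


hexagon = slice_vertices((1, 1, 1), Fraction(3, 2))
print(len(hexagon))
```

For the normal vector (1, 1, 1) and constant 3/2 the slice of the 3-cube is a hexagon, and each of its vertices has exactly one coordinate strictly between 0 and 1, which is the situation described for sufficiently generic hyperplanes.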
Let denote the hyperplane and let denote the hyperplane . If there is an such that the hyperplane does not intersect nor , then we can project to a lower-dimensional face of . Thus, for each , we can assume that intersects at least one of or .
Given two vertices and of , we define the Hamming distance between them based on their side signatures:
Let be defined as above using a sufficiently-generic hyperplane . Let and be two vertices of . Let denote the number of facets of .
If and have the in the same coordinate and does not intersect either of the two facets in that direction, i.e., there is an such that and , then .
By rotating the (combinatorial) cube if necessary, we can assume without loss of generality that the side signature of the vertex is and that the side signature of the vertex is either of the form with trailing ones or of the form with trailing ones.
In the first case, and we have at least “-facets” and “-facets.”
In the second case, we have “-facets”, “-facets” and (unless there is an such that and ) at least one more facet. Thus, , unless we are in the special case, in which case . ∎
Let be defined as above using a sufficiently-generic hyperplane . Let and be two vertices of . Then, there is a pivot from the vertex to a vertex with .
Again by rotating if necessary, without loss of generality, we can assume that and that is either or .
If the side signature of is , performing a pivot on the vertex in any one of the last coordinates reduces the Hamming distance.
Otherwise, the side signature of is . We now describe what can occur when pivoting from the vertex to a new vertex . We claim that at least one of the possible pivots on the vertex does not put a in the first coordinate of the side signature of the new vertex . Otherwise, the hyperplane cuts the polytope as a vertex figure: that is to say, the polytope cuts the corner of the cube. See Figure 8 for a picture.
The remaining kind of pivots on that result in a new vertex give side signatures