On the Inequalities of Projected Volumes and the Constructible Region

Zihan Tan ; Liwei Zeng ; Jian Li
Abstract

We study the following geometry problem: given a $2^n-1$ dimensional vector $\pi = \{\pi_S\}_{S \subseteq [n], S \neq \emptyset}$, is there an object $T \subseteq \mathbb{R}^n$ such that $\log(\mathrm{vol}(T_S)) = \pi_S$ for all $S \subseteq [n]$, where $T_S$ is the projection of $T$ to the subspace spanned by the axes in $S$? If $\pi$ does correspond to an object in $\mathbb{R}^n$, we say that $\pi$ is constructible. We use $\Psi_n$ to denote the constructible region, i.e., the set of all constructible vectors in $\mathbb{R}^{2^n-1}$. In 1995, Bollobás and Thomason showed that $\Psi_n$ is contained in a polyhedral cone, defined by a class of so-called uniform cover inequalities. We propose a new set of natural inequalities, called nonuniform-cover inequalities, which generalize the BT inequalities. We show that any linear inequality that all points in $\Psi_n$ satisfy must be a nonuniform-cover inequality. Based on this result and an example by Bollobás and Thomason, we show that the constructible region $\Psi_n$ is not even convex, and thus cannot be fully characterized by linear inequalities. We further show that some subclasses of the nonuniform-cover inequalities are not correct by various combinatorial constructions, which refutes a previous conjecture about $\Psi_n$. Finally, we conclude with an interesting conjecture regarding the convex hull of $\Psi_n$.

1 Introduction

We use the notation introduced in [3].

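Let $T$ be an object in $\mathbb{R}^n$ and let $\{e_1, \ldots, e_n\}$ be the standard basis of $\mathbb{R}^n$. By an object, we mean a bounded compact subset of $\mathbb{R}^n$. For an index set $S \subseteq [n]$, we let $\mathbb{R}^S$ denote the subspace spanned by $\{e_i : i \in S\}$. Given an index set $S \subseteq [n]$ with $|S| = d$, we denote by $T_S$ the orthogonal projection of $T$ onto $\mathbb{R}^S$, and by $|T_S|$ its $d$-dimensional volume. We use $|T|$ to denote the $n$-dimensional volume of $T$. Given an $n$-dimensional object $T$, define $\pi(T)$ to be the log-projection vector of $T$, which is a $2^n-1$ dimensional vector with entries indexed by subsets of $[n]$ and $\pi(T)_S = \log |T_S|$ for all $S \subseteq [n]$ (we use the convention that ). Whenever we refer to a $2^n-1$ dimensional vector $\pi$, we assume that the entries are indexed by the subsets of $[n]$ (i.e., $\pi_S$ is the entry indexed by $S$). We say that a $2^n-1$ dimensional vector $\pi$ is constructible if $\pi$ is the log-projection vector of some object in $\mathbb{R}^n$. Let us define the constructible region $\Psi_n$, the central subject studied in this paper, to be the set of all constructible vectors:

$$\Psi_n = \{\pi(T) : T \text{ is an object in } \mathbb{R}^n\}.$$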

Having the above definitions, it is natural to ask the following questions:

  1. Given a $2^n-1$ dimensional vector $\pi$, is there an algorithm to decide whether $\pi$ is in the constructible region $\Psi_n$?

  2. What does $\Psi_n$ look like? What properties does $\Psi_n$ have?

In 1995, Bollobás and Thomason [3] proposed a class of inequalities relating the projected volumes. Their result reads as follows. Let $\mathcal{A}$ be a family of subsets of $[n]$. We say $\mathcal{A}$ is a $k$-cover (or $k$-uniform cover) of $[n]$ if each element of $[n]$ appears exactly $k$ times in the multiset induced by $\mathcal{A}$. For example, $\{\{1,2\},\{2,3\},\{1,3\}\}$ is a 2-uniform cover of $\{1,2,3\}$.
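The covering condition is easy to check mechanically. The following is a minimal Python sketch, not taken from [3]; the function name `uniform_cover_multiplicity` and the representation of a family as a list of Python sets are assumptions made for illustration.

```python
from collections import Counter


def uniform_cover_multiplicity(family, ground):
    """Return k if `family` is a k-cover of `ground`, otherwise None.

    `family` is a list of sets (repetitions allowed, so it is a multiset
    of subsets); `ground` is the ground set, e.g. {1, ..., n}.
    Hypothetical helper, written only to illustrate the definition.
    """
    ground = set(ground)
    if not ground:
        return None  # the definition is not meaningful for an empty ground set
    counts = Counter()
    for subset in family:
        subset = set(subset)
        if not subset <= ground:
            return None  # a member of the family leaves the ground set
        counts.update(subset)  # each element of the subset is counted once
    multiplicities = {counts[e] for e in ground}
    # A k-cover requires every element to appear exactly k times.
    return multiplicities.pop() if len(multiplicities) == 1 else None


if __name__ == "__main__":
    print(uniform_cover_multiplicity([{1, 2}, {2, 3}, {1, 3}], {1, 2, 3}))
    print(uniform_cover_multiplicity([{1, 2}, {2, 3}], {1, 2, 3}))
```

The first call reports multiplicity 2 for the three pairs on a 3-element ground set, while the second family is rejected because element 2 appears more often than elements 1 and 3.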

Theorem 1.

(Bollobás-Thomason (BT) uniform-cover inequalities) Suppose $T$ is an object in $\mathbb{R}^n$ and $\mathcal{A}$ is a $k$-cover of $[n]$. Then, we have that

$$|T|^k \le \prod_{A \in \mathcal{A}} |T_A|.$$

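With the above notation, we define the polyhedral cone $\mathsf{BT}_n$ to be the set of all vectors that satisfy every BT inequality (in logarithmic form).

BT inequalities essentially assert that every constructible vector is in $\mathsf{BT}_n$, or equivalently $\Psi_n \subseteq \mathsf{BT}_n$. In the very same paper [3], they also presented a non-constructible point in $\mathsf{BT}_n$, which immediately implies that $\Psi_n \neq \mathsf{BT}_n$. However, the above result does not rule out the possibility that $\Psi_n$ is convex, or even that it can be characterized by a finite set of linear inequalities.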

1.1 Our Results

Apart from the results mentioned above, very little is known about the constructible region, and the main goal of this paper is to deepen our understanding of its structure. First, we propose a new class of natural inequalities, called nonuniform-cover inequalities, which generalize the BT uniform-cover inequalities. We need some notation first.

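Let $\mathcal{A} = \{A_1, A_2, \ldots\}$ and $\mathcal{B} = \{B_1, B_2, \ldots\}$ be two families of subsets of $[n]$, where the $A_i$s and $B_j$s are subsets of $[n]$ (a subset of $[n]$ may appear more than once in $\mathcal{A}$ or $\mathcal{B}$). We say $\mathcal{A}$ covers $\mathcal{B}$ if the following properties hold:

  1. The disjoint union of the $A_i$s is the same as the disjoint union of the $B_j$s. In other words, for every element $e \in [n]$, $|\{i : e \in A_i\}| = |\{j : e \in B_j\}|$.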

  2. Let and , and there is a one-to-one mapping between and such that: for any with , and .
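The first of the two properties above (equal disjoint unions) amounts to comparing element multiplicities in the two families. Here is a minimal Python sketch of that single check; the second property is not covered. The names `element_multiplicities` and `same_disjoint_union` are hypothetical, and families are assumed to be given as lists of sets.

```python
from collections import Counter


def element_multiplicities(family):
    """Count how many members of `family` contain each element.

    `family` is a list of sets; a set may occur more than once.
    """
    counts = Counter()
    for subset in family:
        counts.update(set(subset))  # one increment per element of the subset
    return counts


def same_disjoint_union(family_a, family_b):
    """Check the first property only: both families have the same disjoint union."""
    return element_multiplicities(family_a) == element_multiplicities(family_b)


if __name__ == "__main__":
    # Two copies of {1,2,3} against the three pairs: every element occurs twice.
    print(same_disjoint_union([{1, 2, 3}, {1, 2, 3}],
                              [{1, 2}, {2, 3}, {1, 3}]))
```

Passing this check is necessary but not sufficient for one family to cover the other, since the mapping condition in the second property still has to be verified.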

Definition 1.

(Nonuniform-Cover (NC) inequalities) is a dimensional vector. Suppose covers . A nonuniform-cover inequality is of the following form:

Example 1.

Let and . We can see covers . The corresponding NC inequality is . Here is another example:

When the context is clear, we refer to a linear inequality of the form as an NC inequality as well. It is easy to see that every BT inequality is an NC inequality, but the converse may not be true. For example, (We alert the reader that we do not claim that such inequalities are always true; we discuss this in detail in Section 4.)

Similar to , we define to be the set of all points that satisfy all NC inequalities. Formally, it is the following polyhedral cone:

Our first result states that all correct linear inequalities should be in this class.

Theorem 2.

If all points in satisfy a certain linear inequality , the linear inequality must be an NC inequality, or a positive combination of NC inequalities.

In order to prove the above theorem, we introduce a class of objects called rectangular flowers. We let denote the set of all log-projection vectors that can be generated by rectangular flowers (see the definition in Section 2). We show that for any linear inequality that is not an NC inequality, we can construct a rectangular flower which violates the inequality. It is simple to show that the log-projection vector of a rectangular flower in satisfies all nonuniform-cover inequalities (i.e., it is in ). Moreover, we show that for every point , there is a rectangular flower in whose log-projection vector is . Therefore, we can prove the following theorem.

Theorem 3.

For all , .

Given Theorem 3, it is natural to ask whether . If the answer were yes, would have a compact description, and deciding whether a point is in could be done using linear programming (see Section 2 for the details). However, the answer is not that simple. In fact, using Theorem 2, we can show our next result, which states that is not even convex for . We note that for , , thus convex. For completeness, we provide a proof in Appendix A.

Theorem 4.

(Non-convexity of ) For , is not convex.

Theorem 4 implies that there exists a constructible vector in that violates some NC constraint. In other words, . Thus, it would be interesting to know which NC inequalities are true and which are false (we already know that BT inequalities are true). In Section 4, we provide several methods for constructing counterexamples for different subclasses of NC inequalities. However, we have not been able to disprove all NC inequalities that are not BT inequalities, nor to prove any of them. This leads us to conjecture the following.

Conjecture 1.

If all points in satisfy a certain linear inequality , the linear inequality must be a BT inequality or a positive combination of several BT inequalities. Moreover, , the convex hull of .

At the end of the introduction, we summarize our results in the following chain:

and we conjecture that .

1.2 A Motivating Problem from Databases

Our problem is closely related to the data generation problem [1] studied in the area of databases, which is in fact our initial motivation for studying the problem. Generating synthetic relations under various constraints is a key problem for testing data management systems. A relation is essentially a table, where each row is one record about some entity, and each column is an attribute. One of the most important operations in relational databases is the projection onto a subset of attributes. One can think of the projection of a relation onto a subset $S$ of attributes as the table first restricted to the columns in $S$, and then with duplicates removed. To see the connection between the database problem and geometry, we can think of a relation with $n$ attributes as an $n$-dimensional object $T$ in $\mathbb{R}^n$: a tuple (i.e., a row) can be thought of as a unit cube. Then $T_S$, the projection of $T$ to $S$, corresponds exactly to the projected relation.

Example 2.

The following table shows course registration information. Each row of the table corresponds to a unit cube in the coordinate system. In this way, a table is represented by an object in Euclidean space.

Rank  Name   Course
1     Alice  Math
2     Alice  Physics
3     Alice  Biology
4     Bob    Math
5     Bob    Physics

In the data generation problem with projection constraints, we are given the cardinalities for a set of subsets . The goal is to construct a relation that is consistent with the given cardinalities. We can see that it is a discrete version of our geometry problem. Moreover, if the given cardinalities (after taking logarithms) are not in , or violate any projection inequality, there is obviously no solution to the data generation problem. Therefore, a good understanding of our geometry problem is central for solving the data generation problem.
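To make the correspondence concrete, the following Python sketch computes projection cardinalities of a small relation and the logarithms of those cardinalities. It is an illustration only: the helper names are hypothetical, the rows reuse the Name and Course columns of the registration table, and the Rank column is assumed to be a row number rather than an attribute.

```python
from itertools import combinations
from math import log


def projection_cardinality(rows, columns):
    """Number of distinct rows after restricting to `columns` (duplicates removed)."""
    return len({tuple(row[c] for c in columns) for row in rows})


def log_projection_vector(rows, num_columns):
    """Map each nonempty tuple of column indices to the log of its projection size."""
    vec = {}
    for d in range(1, num_columns + 1):
        for cols in combinations(range(num_columns), d):
            vec[cols] = log(projection_cardinality(rows, cols))
    return vec


# (Name, Course) pairs; Rank is treated as a row number (an assumption).
rows = [
    ("Alice", "Math"),
    ("Alice", "Physics"),
    ("Alice", "Biology"),
    ("Bob", "Math"),
    ("Bob", "Physics"),
]

vec = log_projection_vector(rows, 2)

if __name__ == "__main__":
    for cols, value in vec.items():
        print(cols, round(value, 4))
    # Uniform-cover check with the 1-cover {{Name}, {Course}} of both columns:
    # log|R| <= log|R_Name| + log|R_Course|.
    print(vec[(0, 1)] <= vec[(0,)] + vec[(1,)])
```

Here the relation has 5 rows, 2 distinct names and 3 distinct courses, so the check reduces to 5 <= 2 * 3. A set of target cardinalities that fails such a check cannot be realized by any relation.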

1.3 Other Related Work

Loomis and Whitney proved a class of projection inequalities in [7], allowing one to upper bound the volume of an $n$-dimensional object by the volumes of its $(n-1)$-dimensional projections. Their inequalities are special cases of BT inequalities. BT inequalities and their generalizations also play an essential role in the worst-case optimal join problem in databases (one can obtain an upper bound on the size of a relation knowing the cardinalities of its projections). See, e.g., [8] for some of the most recent results on this problem.
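For reference, the classical Loomis-Whitney inequality can be written as follows. The symbols $T$ for the object, $|\cdot|$ for volume and $T_S$ for the projection onto the axes in $S$ are notational assumptions of this sketch.

```latex
\[
  |T|^{\,n-1} \;\le\; \prod_{i=1}^{n} \bigl| T_{[n]\setminus\{i\}} \bigr| .
\]
% The family { [n] \ {i} : i = 1, ..., n } contains each element of [n]
% exactly n-1 times, so it is an (n-1)-uniform cover of [n]; the display
% is therefore the uniform-cover inequality for this particular cover.
```

For $n = 2$ this reads $|T| \le |T_{\{1\}}| \cdot |T_{\{2\}}|$: the area of a planar object is at most the product of the lengths of its two axis projections.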

There is a large body of literature on the constructible region for joint entropy function over random variables . More specifically, for each joint distribution over , there is a point in , which is a dimensional vector, with the entry indexed by being . Characterizing is one major problem in information theory and has been studied extensively. Many entropy inequalities are known, including Shannon-type inequalities and several non-Shannon-type inequalities. For a comprehensive treatment of this topic, we refer interested readers to the book [9]. There are close connections between entropy inequalities and projection inequalities [4, 3, 5, 2]. In particular, BT inequalities can be easily derived from the well known Shearer’s entropy inequalities [4] (many even regard them as the same).

2 Proof of Theorem 2 and Theorem 3

In this section, we prove Theorem 2 and Theorem 3. We need to introduce a class of special geometric objects, which are crucial to our proofs. We say an -dimensional object is cornered if implies for all (i.e., for all ). An object is said to be an open rectangle if , or a closed rectangle if .

Definition 2.

We say is a rectangular flower if

  1. is cornered,

  2. is an open rectangle in for any .

Figure 1: (i) A 3-dimensional rectangular flower. (ii) The network flow . The dashed line represents the minimum - cut.

See Figure 1 for an example. It is easy to see that a rectangular flower is a union of closed rectangles , each being a closed rectangle in . Moreover, if , for any , the edge length of along axis is no shorter than that of (since is cornered).

We also need to introduce a new class of inequalities, called fractional nonuniform-cover inequalities, which can be seen as the fractional generalization of NC inequalities. We need some notation first. Let , be two families of weighted subsets of , where s and s are subsets of and ( resp.) is the nonnegative weight associated with ( resp.). Construct a network flow instance as follows: Let and be sets of nodes. Let node $s$ be the source and node $t$ be the sink. There is an arc from $s$ to each node with capacity . There is an arc from each node to $t$ with capacity . For each pair of and , there is an arc with infinite capacity from to if and . We say saturates if the following properties hold:

  1. For any , .

  2. The maximum - flow (or equivalently, the minimum - cut) of is .

Definition 3.

(Fractional-Nonuniform-Cover (FNC) inequalities) Suppose is an object in and saturates . A fractional-nonuniform-cover inequality is of the following form:

When the context is clear, we also refer to linear inequalities of the form as FNC inequalities.

Lemma 1.

The set of FNC inequalities (the linear form) is exactly the set of all nonnegative linear combinations of NC inequalities.

Proof.

It is trivial to see that a nonnegative linear combination of NC inequalities is an FNC inequality. Now, we show the other direction. Fix the dimension to be . Consider an arbitrary FNC inequality . If all entries in are rational numbers, the FNC inequality is itself an NC inequality after scaling all coefficients by some integer factor (this is because if all capacities of the network are integral, there is an integral maximum flow). So, we only need to handle the case where some entries of are irrational. Now, we show that every point in satisfies . Suppose to the contrary that there is a point but . However, we claim that there is a sequence of FNC inequalities with rational coefficients such that Hence, we have that, for some sufficiently large , , which yields a contradiction.

Now, we briefly argue why the claimed sequence exists. It is not hard to see that the set of coefficient vectors corresponding to FNC inequalities is a rational polyhedral cone, defined by the linear constraints C1 and the flow constraint C2, which again can be captured by linear constraints (using auxiliary flow variables). So there is a set of rational generating vectors and can be written as a nonnegative combination of these vectors. Suppose , (each column of is a generating vector). Pick an arbitrary sequence of rational nonnegative vectors that approaches , and would be the desired sequence.

The rest follows from Farkas' Lemma: Let be a feasible system of inequalities and be an inequality satisfied by all with . Then, by Farkas' Lemma, is a nonnegative linear combination of the inequalities in (see e.g., [6]). ∎

Proof of Theorem 2. We only need to show that all non-FNC inequalities are wrong. Suppose is an object. Consider an arbitrary non-FNC inequality:

(1)

where does not saturate . We show that we can construct a rectangular flower for which this inequality does not hold.

Consider the network flow instance . Suppose C1 does not hold: for some , . First, if , we can easily see that (1) is false by considering ( and ) ). Now, suppose but for some . W.l.o.g., assume . Let . Again, it is easy to see (1) is false since and ).

Now, suppose C2 is false, that is the value of the minimum - cut of is less than . Suppose the minimum - cut defines the partition of vertices such that and . Let and be defined as above, and , , , . Since the min-cut is less than , none of the above four sets are empty. Clearly, there is no edge from to since otherwise the value of the cut is infinity. In other words, absorbs all outgoing edges from . (See Figure 1(ii)).

Moreover, we can see the value of the min-cut is . Since this value is less than , we have that due to C1. Now, we construct the rectangular flower . Suppose and we use to denote the edge length of the closed rectangle along axis . We only need to specify all s as follows:

Now, we verify that the above rectangular flower violates the given non-FNC inequality. In fact, we can easily see that for any node , there is a node and we have that . Hence,

On the other hand, we have that

which implies that the given inequality is false. This proves Theorem 2. ∎

We denote the set of log-projection vectors generated by rectangular flowers to be

Now, we prove Theorem 3.

Proof of Theorem 3. Clearly, . We only need to show that .

We can see that a given vector is the log-projection vector of some rectangular flower in if the following linear program, denoted as , is feasible (treating as variables):

Hence, It is easy to check that is a convex cone (i.e., if , for any ). In fact, by basic linear programming, is a polyhedral cone. This can be seen as follows: we can write in the standard matrix form . Obviously, is a finitely generated cone (generated by the columns of ), and thus a polyhedral cone. is the intersection of the above cone with the subspace , which is again a polyhedral cone.

It is straightforward to verify that each point in satisfies all NC inequalities (we leave the verification to the reader). So . Suppose for contradiction that there is a point but . So there is a hyperplane separating and (with ). So is not an FNC inequality (since should satisfy all FNC inequalities). In the proof of Theorem 2, we have shown that for any non-FNC inequality, we can construct a rectangular flower that violates the inequality. This contradicts that for all . Hence, . This concludes the proof of the theorem. ∎

At the end of this section, we briefly mention projection inequalities with nonzero constant terms (, for ). If , no such inequality is true, as can be seen by just considering the hypercube. Obviously, if is true for all , for all also. Moreover, if is not an FNC inequality, cannot be true for any , since we can make large enough in the proof of Theorem 2. Conversely, if for some is true for all , it must hold that for all . This is because if , for any . Therefore, it suffices to consider only those inequalities with zero constant term.

3 Proof of Theorem 4: Non-Convexity of the Constructible Region

In this section, we prove Theorem 4: the non-convexity of the constructible region for . Suppose, to the contrary, that is convex. First, we can see that if is convex, it must be a convex cone (this is because if , for ). Hence, each supporting hyperplane of must correspond to an FNC inequality.

Consider

which is the projection of onto the subspace spanned by

where is the axis indexed by . Since is a convex cone, must also be a convex cone. Hence, each linear inequality that defines must be some FNC inequality with terms

Now, we prove that any FNC inequality involving only the above terms is a nonnegative linear combination of the following two BT inequalities:

(2)
(3)

By Lemma 1, it suffices to consider only NC inequalities. In fact, according to the definition of NC, the right hand side can only contain the terms and . Applying a corollary of the Exact Single Cover Theorem (which is discussed in the next section) to this inequality, we can see that it must be a combination of (2) and (3). In other words, is defined by (2) and (3).

Now, we consider the vector , ,

The example is essentially adapted from the example in [3]. It is easy to see that satisfies (2) and (3). Now, we briefly show . Suppose there exists an object with the log-projection vector consistent with . In other words, Note that .

From Theorem in [3], we know that the projection of must be a rectangle . However, since the projection of must be a rectangle . Since , the two boxes are not the same and we arrive at a contradiction. This shows that is not convex and thus completes the proof of Theorem 4.

4 Counterexample Construction for NCBT Inequalities

We have shown that the constructible region cannot be fully characterized by a set of linear inequalities as it is not convex. However, it is still interesting to determine which linear inequalities are correct. Equivalently, we want to figure out the set of linear inequalities that define , the convex hull of .

In this section, we construct counterexamples for several NC but non-BT (denoted as NCBT) inequalities. Since a compact object can be approximated by a union of small cubes, our counterexamples are also unions of cubes.

4.1 Skeleton

In this subsection, we use an -tuple where s are non-negative integers to represent the -dimensional unit hypercube: , i.e., . We define the sum of two sets to be their Minkowski sum, namely . We need the notion of a skeleton, which is important for our construction.

Definition 4.

(Connection Graph) In , consider an FNC inequality (). The connection graph for the above inequality is an undirected graph , where , representing dimensions. The edge if and only if both and appear in some but not in any .

Definition 5.

Let be all cliques (complete subgraphs) in . is a large positive integer. For every , we define

The skeleton for the given NC inequality is defined as

See Figure 2 for an example.

Figure 2: (i) Skeleton. (ii) Connection Graph.

In the figure above, the connection graph is the right one and the corresponding skeleton is the object on the left.

For , let be the size of maximum clique in , the subgraph induced by vertices in . For sufficiently large , we have the following asymptotic estimations:

The following lemma is a direct consequence of the above estimate.

Lemma 2.

If the NC inequality satisfies that

then it is incorrect, i.e., there exists a counterexample for it.

Example 3.

Consider the NC inequality . The connection graph contains two edges and . We have and . Hence, the inequality is not true in general.

4.2 Union of Boxes

By a box we mean a hypercube or a translation of it, i.e., for some positive vector v, where the sum is the Minkowski sum. The examples in this subsection are disjoint unions of two boxes and . Here we require not only that and are disjoint in , but also that their projections onto any subspace are disjoint for any . In particular, we use the following two boxes:

As before, for sufficiently large and s to be determined later, the following asymptotic equations hold:

Note that we can use the absolute value to replace the maximum function, via $\max(a, b) = \frac{1}{2}\left(a + b + |a - b|\right)$, and we obtain the following lemma.

Lemma 3.

If there exists t such that the following inequality is true:

then the corresponding NC inequality is incorrect.

Proof.

Our counterexample is the union of two boxes where is the counterexample for the absolute value inequality. By the above asymptotics and , we conclude that

Hence, the object is a counterexample. ∎
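The asymptotic step behind the two-box construction can be checked numerically. The sketch below assumes that the two disjoint projected boxes have volumes of the form $N^a$ and $N^b$ (the exponents `a`, `b` and the function names are hypothetical and chosen only for illustration); it shows that $\log(N^a + N^b)/\log N$ approaches $\max(a, b)$ as $N$ grows, and that the maximum can be rewritten with an absolute value.

```python
from math import exp, log, log1p


def max_via_abs(a, b):
    """max(a, b) expressed through the absolute value."""
    return (a + b + abs(a - b)) / 2


def normalized_log_sum(a, b, big_n):
    """Compute log(N**a + N**b) / log(N) without overflowing for large N.

    Uses N**a + N**b = N**hi * (1 + N**(lo - hi)) with hi = max(a, b).
    """
    hi, lo = max(a, b), min(a, b)
    return hi + log1p(exp((lo - hi) * log(big_n))) / log(big_n)


if __name__ == "__main__":
    for big_n in (1e2, 1e4, 1e8):
        print(big_n, normalized_log_sum(2.0, 3.0, big_n), max_via_abs(2.0, 3.0))
```

The gap between the two printed values shrinks as `big_n` increases. When `a == b` the convergence is slower, since the correction term is $\log 2 / \log N$.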

Example 4.

Again, consider the NC inequality in Example 3. Let . We can see the condition of Lemma 3 is met and the inequality is incorrect.

Example 5.

Consider the NC inequality and . So, this inequality is incorrect.

4.3 Exact Single Cover Theorem

Using the union of boxes method we can also obtain the following theorem which is a necessary condition for an inequality to be true. Let be the 0/1 indicator vector for set and for , i.e., if and only if .

Theorem 5.

(Exact Single Cover Theorem) If the FNC inequality holds for every , then for all , there exist nonnegative such that for all and

Proof.

Let , . It is immediate that is a convex subset of . If does not include , by separating hyperplane theorem, there exists a vector and real number such that

We still use a union of two boxes to be the counterexample:

It can be seen that

Now, it suffices to show that In the asymptotics shown before, we have that

The last inequality holds since is in . This completes the proof. ∎

Now, we show two simple corollaries.

Corollary 1.

Suppose the following FNC inequality holds for all , and the set indicator vectors are linearly independent. Then this inequality can be written as a nonnegative combination of BT inequalities.

Proof.

Let ( resp.) be the matrix with being the th column ( the th column). Let and . By the definition of FNC, we know that For each , we know for some . So