The convexification effect of Minkowski summation
Let us define for a compact set $A \subset \mathbb{R}^n$ the sequence
$$A(k) = \left\{\frac{a_1+\cdots+a_k}{k} : a_1,\ldots,a_k\in A\right\} = \frac{1}{k}\Big(\underbrace{A+\cdots+A}_{k \text{ times}}\Big).$$
It was independently proved by Shapley, Folkman and Starr (1969) and by Emerson and Greenleaf (1969) that $A(k)$ approaches the convex hull of $A$ in the Hausdorff distance induced by the Euclidean norm as $k$ goes to $\infty$. We explore in this survey how exactly $A(k)$ approaches the convex hull of $A$, and more generally, how a Minkowski sum of possibly different compact sets approaches convexity, as measured by various indices of non-convexity. The non-convexity indices considered include the Hausdorff distance induced by any norm on $\mathbb{R}^n$, the volume deficit (the difference of volumes), a non-convexity index introduced by Schneider (1975), and the effective standard deviation or inner radius. After first clarifying the interrelationships between these various indices of non-convexity, which were previously either unknown or scattered in the literature, we show that the volume deficit of $A(k)$ does not monotonically decrease to 0 in dimension 12 or above, thus falsifying a conjecture of Bobkov et al. (2011), even though their conjecture is proved to be true in dimension 1 and for certain sets $A$ with special structure. On the other hand, Schneider’s index possesses a strong monotonicity property along the sequence $A(k)$, and both the Hausdorff distance and effective standard deviation are eventually monotone (once $k$ exceeds $n$). Along the way, we obtain new inequalities for the volume of the Minkowski sum of compact sets (showing that this is fractionally superadditive but not supermodular in general, but is indeed supermodular when the sets are convex), falsify a conjecture of Dyn and Farkhi (2004), demonstrate applications of our results to combinatorial discrepancy theory, and suggest some questions worthy of further investigation.
2010 Mathematics Subject Classification. Primary 60E15 11B13; Secondary 94A17 60F15.
Keywords. Sumsets, Brunn-Minkowski, convex hull, inner radius, Hausdorff distance, discrepancy.
- 1 Introduction
- 2 Measures of non-convexity
- 3 The behavior of volume deficit
- 4 Volume inequalities for Minkowski sums
- 5 The behavior of Schneider’s non-convexity index
- 6 The behavior of the effective standard deviation
- 7 The behavior of the Hausdorff distance from the convex hull
- 8 Connections to discrepancy theory
- 9 Discussion
1 Introduction
Minkowski summation is a basic and ubiquitous operation on sets. Indeed, the Minkowski sum $A+B=\{a+b : a\in A, b\in B\}$ of sets $A$ and $B$ makes sense as long as $A$ and $B$ are subsets of an ambient set in which the operation + is defined. In particular, this notion makes sense in any group, and there are multiple fields of mathematics that are preoccupied with studying what exactly this operation does. For example, much of classical additive combinatorics studies the cardinality of Minkowski sums (called sumsets in this context) of finite subsets of a group and their interaction with additive structure of the concerned sets, while the study of the Lebesgue measure of Minkowski sums in $\mathbb{R}^n$ is central to much of convex geometry and geometric functional analysis. In this survey paper, which also contains a number of original results, our goal is to understand better the qualitative effect of Minkowski summation in $\mathbb{R}^n$, specifically the “convexifying” effect that it has. Somewhat surprisingly, while the existence of such an effect has long been known, several rather basic questions about its nature do not seem to have been addressed, and we undertake to fill the gap.
The fact that Minkowski summation produces sets that look “more convex” is easy to visualize by drawing a non-convex set (see footnote 1) in the plane and its self-averages $A(k)$ defined by
$$A(k) = \left\{\frac{a_1+\cdots+a_k}{k} : a_1,\ldots,a_k\in A\right\} = \frac{1}{k}\Big(\underbrace{A+\cdots+A}_{k \text{ times}}\Big). \tag{1}$$
Footnote 1. The simplest nontrivial example is three non-collinear points in the plane, so that $A(k)$ is the original set $A$ of vertices of a triangle together with those convex combinations of the vertices formed by rational coefficients with denominator $k$.
This intuition was first made precise in the late 1960’s independently (see footnote 2) by Starr  (see also ), who credited Shapley and Folkman for the main result, and by Emerson and Greenleaf . Denoting by $\mathrm{conv}(A)$ the convex hull of $A$, by $B_2^n$ the $n$-dimensional Euclidean ball of radius $1$, and by $d(A)=\inf\{r>0 : \mathrm{conv}(A)\subset A+rB_2^n\}$ the Hausdorff distance between a set $A$ and its convex hull, it follows from the Shapley-Folkman-Starr theorem that if $A_1,\ldots,A_k$ are compact sets in $\mathbb{R}^n$ contained inside some ball, then
$$d(A_1+\cdots+A_k) = O\big(\sqrt{\min\{k,n\}}\big).$$
Footnote 2. Both the papers of Starr  and Emerson and Greenleaf  were submitted in 1967 and published in 1969, but in very different communities (economics and algebra); so it is not surprising that the authors of these papers were unaware of each other. Perhaps more surprising is that the relationship between these papers does not seem to have ever been noticed in the almost 5 decades since. The fact that $A(k)$ converges to the convex hull of $A$, at an $O(1/k)$ rate in the Hausdorff metric when dimension $n$ is fixed, should perhaps properly be called the Emerson-Folkman-Greenleaf-Shapley-Starr theorem, but in keeping with the old mathematical tradition of not worrying too much about names of theorems (cf., Arnold’s principle), we will simply use the nomenclature that has become standard.
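By considering $A_1=\cdots=A_k=A$, one concludes that $d(A(k))=O\big(\frac{\sqrt{n}}{k}\big)$. In other words, when $A$ is a compact subset of $\mathbb{R}^n$ for fixed dimension $n$, $A(k)$ converges in Hausdorff distance to $\mathrm{conv}(A)$ as $k\to\infty$, at rate at least $O(1/k)$.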
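The $O(1/k)$ rate can be seen by hand in the simplest case. The following is a minimal sketch, assuming the two-point set $A=\{0,1\}\subset\mathbb{R}$ and using the elementary fact that, for a finite set on the line, the Hausdorff distance to the convex hull is half of the largest gap between consecutive points. The helper names `self_average` and `hausdorff_to_hull_1d` are illustrative and do not come from the paper.

```python
from fractions import Fraction

def self_average(points, k):
    """Return A(k) = (A + ... + A)/k (k summands) for a finite set A of rationals."""
    sums = {Fraction(0)}
    for _ in range(k):
        sums = {s + Fraction(a) for s in sums for a in points}
    return sorted(s / k for s in sums)

def hausdorff_to_hull_1d(points):
    """d(A) for a finite set on the line: half of the largest gap (needs >= 2 points)."""
    pts = sorted(points)
    return max((b - a) / 2 for a, b in zip(pts, pts[1:]))

A = [0, 1]  # assumed toy example: two points on the line
distances = [hausdorff_to_hull_1d(self_average(A, k)) for k in range(1, 6)]
for k, dist in enumerate(distances, start=1):
    print(k, dist)
```

In this toy case $A(k)=\{0,1/k,\ldots,1\}$, so the distance to the hull $[0,1]$ is exactly $1/(2k)$, which is consistent with the stated rate.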
Our geometric intuition would suggest that in some sense, as $k$ increases, the set $A(k)$ is getting progressively more convex, or in other words, that the convergence of $A(k)$ to $\mathrm{conv}(A)$ is, in some sense, monotone. The main goal of this paper is to examine this intuition, and explore whether it can be made rigorous.
One motivation for our goal of exploring monotonicity in the Shapley-Folkman-Starr theorem is that it was the key tool allowing Starr  to prove that in an economy with a sufficiently large number of traders, there are (under some natural conditions) configurations arbitrarily close to equilibrium even without making any convexity assumptions on preferences of the traders; thus investigations of monotonicity in this theorem speak to the question of whether these quasi-equilibrium configurations in fact get “closer” to a true equilibrium as the number of traders increases. A related result is the core convergence result of Anderson , which states under very general conditions that the discrepancy between a core allocation and the corresponding competitive equilibrium price vector in a pure exchange economy becomes arbitrarily small as the number of agents gets large. These results are central results in mathematical economics, and continue to attract attention (see, e.g., ).
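Our original motivation, however, came from a conjecture made by Bobkov, Madiman and Wang . To state it, let us introduce the volume deficit $\Delta(A)$ of a compact set $A$ in $\mathbb{R}^n$: $\Delta(A) := \mathrm{Vol}_n(\mathrm{conv}(A)\setminus A) = \mathrm{Vol}_n(\mathrm{conv}(A)) - \mathrm{Vol}_n(A)$, where $\mathrm{Vol}_n$ denotes the Lebesgue measure in $\mathbb{R}^n$.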
Conjecture 1.1 (Bobkov-Madiman-Wang ).
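Let $A$ be a compact set in $\mathbb{R}^n$ for some $n\in\mathbb{N}$, and let $A(k)$ be defined as in (1). Then the sequence $\{\Delta(A(k))\}_{k\geq 1}$ is non-increasing in $k$, or equivalently, $\{\mathrm{Vol}_n(A(k))\}_{k\geq 1}$ is non-decreasing.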
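To make the monotonicity statement concrete in dimension 1, here is a small sketch that computes the volume deficit of $A(k)$ exactly for a finite union of intervals. The set $A=[0,1]\cup[4,5]$ and the helper names (`merge`, `minkowski_sum`, `volume_deficit_of_average`) are assumptions chosen for illustration; the code uses the scaling $\mathrm{Vol}_1(A(k)) = \mathrm{Vol}_1(A+\cdots+A)/k$.

```python
from fractions import Fraction

def merge(intervals):
    """Union of closed intervals, returned as a sorted list of disjoint intervals."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def minkowski_sum(X, Y):
    """Minkowski sum of two finite unions of closed intervals."""
    return merge([(a + c, b + d) for a, b in X for c, d in Y])

def volume_deficit_of_average(A, k):
    """Delta(A(k)) = Vol(conv(A(k))) - Vol(A(k)) for a union of intervals A."""
    S = A
    for _ in range(k - 1):
        S = minkowski_sum(S, A)
    length = sum(hi - lo for lo, hi in S)   # Vol of A + ... + A
    hull = S[-1][1] - S[0][0]               # Vol of its convex hull
    return Fraction(hull - length, k)       # rescale by 1/k

A = merge([(0, 1), (4, 5)])  # assumed toy example
deficits = [volume_deficit_of_average(A, k) for k in range(1, 6)]
print(deficits)
```

For this example the deficits decrease step by step until $A(k)$ fills its convex hull, in line with the one-dimensional case of the conjecture, which the paper reports to be true.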
In fact, the authors of  proposed a number of related conjectures, of which Conjecture 1.1 is the weakest. Indeed, they conjectured a monotonicity property in a probabilistic limit theorem, namely the law of large numbers for random sets due to Z. Artstein and Vitale ; when this conjectured monotonicity property of  is restricted to deterministic (i.e., non-random) sets, one obtains Conjecture 1.1. They showed in turn that this conjectured monotonicity property in the law of large numbers for random sets is implied by the following volume inequality for Minkowski sums. For $k\geq 1$ being an integer, we set $[k]=\{1,\ldots,k\}$.
Conjecture 1.2 (Bobkov-Madiman-Wang ).
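Let $n\geq 1$, $k\geq 2$ be integers and let $A_1,\ldots,A_k$ be $k$ compact sets in $\mathbb{R}^n$. Then
$$\mathrm{Vol}_n\Big(\sum_{i=1}^k A_i\Big)^{\frac{1}{n}} \geq \frac{1}{k-1}\sum_{i=1}^k \mathrm{Vol}_n\Big(\sum_{j\in[k]\setminus\{i\}} A_j\Big)^{\frac{1}{n}}. \tag{2}$$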
Apart from the fact that Conjecture 1.2 implies Conjecture 1.1 (which can be seen simply by applying the former to $A_1=\cdots=A_k=A$, where $A$ is a fixed compact set), Conjecture 1.2 is particularly interesting because of its close connections to an important inequality in Geometry, namely the Brunn-Minkowski inequality, and a fundamental inequality in Information Theory, namely the entropy power inequality. Since the conjectures in  were largely motivated by these connections, we now briefly explain them.
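The Brunn-Minkowski inequality (or strictly speaking, the Brunn-Minkowski-Lyusternik inequality) states that for all compact sets $A, B$ in $\mathbb{R}^n$,
$$\mathrm{Vol}_n(A+B)^{\frac{1}{n}} \geq \mathrm{Vol}_n(A)^{\frac{1}{n}} + \mathrm{Vol}_n(B)^{\frac{1}{n}}. \tag{3}$$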
It is, of course, a cornerstone of Convex Geometry, and has beautiful relations to many areas of Mathematics (see, e.g., [38, 72]). The case $k=2$ of Conjecture 1.2 is exactly the Brunn-Minkowski inequality (3). Whereas Conjecture 1.2 yields the monotonicity described in Conjecture 1.1, the Brunn-Minkowski inequality only allows one to deduce that the subsequence $\{\mathrm{Vol}_n(A(2^k))\}_{k\in\mathbb{N}}$ is non-decreasing (one may also deduce this fact from the trivial inclusion $A\subset\frac{A+A}{2}$).
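As a quick sanity check of the Brunn-Minkowski inequality $\mathrm{Vol}_n(A+B)^{1/n}\geq \mathrm{Vol}_n(A)^{1/n}+\mathrm{Vol}_n(B)^{1/n}$, one can look at axis-parallel boxes, for which the Minkowski sum is again a box whose side lengths add. The sketch below is only an illustration under that assumption; the function names and the chosen side lengths are hypothetical.

```python
import math

def box_volume(sides):
    """Volume of an axis-parallel box with the given side lengths."""
    return math.prod(sides)

def brunn_minkowski_gap(a, b):
    """Vol(A+B)^(1/n) - Vol(A)^(1/n) - Vol(B)^(1/n) for boxes with side lengths a and b."""
    n = len(a)
    sum_sides = [x + y for x, y in zip(a, b)]  # A + B is the box with summed sides
    return (box_volume(sum_sides) ** (1 / n)
            - box_volume(a) ** (1 / n)
            - box_volume(b) ** (1 / n))

# Two boxes of different shape: the inequality is strict.
gap_generic = brunn_minkowski_gap([1.0, 4.0], [4.0, 1.0])
# Two homothetic boxes (B = 3A): the inequality becomes an equality.
gap_homothetic = brunn_minkowski_gap([1.0, 2.0], [3.0, 6.0])
print(gap_generic, gap_homothetic)
```

The second case reflects the classical equality case for homothetic convex sets; boxes are of course a very special family, so this illustrates the inequality rather than testing it in general.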
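The entropy power inequality states that for all independent random vectors $X, Y$ in $\mathbb{R}^n$,
$$N(X+Y) \geq N(X) + N(Y), \tag{4}$$
where
$$N(X) = \frac{1}{2\pi e}\, e^{\frac{2h(X)}{n}}$$
denotes the entropy power of $X$. Let us recall that the entropy of a random vector $X$ with density function $f_X$ (with respect to Lebesgue measure $dx$) is $h(X) = -\int f_X(x)\log f_X(x)\,dx$ if the integral exists and $-\infty$ otherwise (see, e.g., ). As a consequence, one may deduce that for independent and identically distributed random vectors $X_i$, $i\geq 1$, the sequence
$$\left\{N\left(\frac{X_1+\cdots+X_{2^k}}{\sqrt{2^k}}\right)\right\}_{k\in\mathbb{N}}$$
is non-decreasing. S. Artstein, Ball, Barthe and Naor generalized the entropy power inequality (4) by proving that for any independent random vectors $X_1,\ldots,X_k$,
$$N\Big(\sum_{i=1}^k X_i\Big) \geq \frac{1}{k-1}\sum_{i=1}^k N\Big(\sum_{j\in[k]\setminus\{i\}} X_j\Big). \tag{5}$$
In particular, if all $X_i$ in the above inequality are identically distributed, then one may deduce that the sequence
$$\left\{N\left(\frac{X_1+\cdots+X_k}{\sqrt{k}}\right)\right\}_{k\geq 1}$$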
is non-decreasing. This fact is usually referred to as “the monotonicity of entropy in the Central Limit Theorem”, since the sequence of entropies of these normalized sums converges to that of a Gaussian distribution as shown earlier by Barron . Later, simpler proofs of the inequality (5) were given by [49, 86]; more general inequalities were developed in [50, 75, 51].
There is a formal resemblance between inequalities (4) and (3) that was noticed in a pioneering work of Costa and Cover  and later explained by Dembo, Cover and Thomas  (see also [82, 87] for other aspects of this connection). In the last decade, several further developments have been made that link Information Theory to the Brunn-Minkowski theory, including entropy analogues of the Blaschke-Santaló inequality , the reverse Brunn-Minkowski inequality [19, 20], the Rogers-Shephard inequality [22, 53] and the Busemann inequality . Indeed, volume inequalities and entropy inequalities (and also certain small ball inequalities ) can be unified using the framework of Rényi entropies; this framework and the relevant literature is surveyed in . On the other hand, natural analogues in the Brunn-Minkowski theory of Fisher information inequalities hold sometimes but not always [35, 7, 37]. In particular, it is now well understood that the functional $A\mapsto\mathrm{Vol}_n(A)^{1/n}$ in the geometry of compact subsets of $\mathbb{R}^n$, and the functional $f_X\mapsto N(X)$ in probability are analogous to each other in many (but not all) ways. Thus, for example, the monotonicity property desired in Conjecture 1.1 is in a sense analogous to the monotonicity property in the Central Limit Theorem implied by inequality (5), and Conjecture 1.2 from  generalizes the Brunn-Minkowski inequality (3) exactly as inequality (5) generalizes the entropy power inequality (4).
The starting point of this work was the observation that although Conjecture 1.2 holds for certain special classes of sets (namely, one dimensional compact sets, convex sets and their Cartesian product, as shown in subsection 3.1), both Conjecture 1.1 and Conjecture 1.2 fail to hold in general even for moderately high dimension (Theorem 3.4 constructs a counterexample in dimension 12). These results, which consider the question of the monotonicity of $\Delta(A(k))$, are stated and proved in Section 3. We also discuss there the question of when one has convergence of $\Delta(A(k))$ to 0, and at what rate, drawing on the work of  (which seems not to be well known in the contemporary literature on convexity).
Section 4 is devoted to developing some new volume inequalities for Minkowski sums. In particular, we observe in Theorem 4.1 that if the exponents of $1/n$ in Conjecture 1.2 are removed, then the modified inequality is true for general compact sets (though unfortunately one can no longer directly relate this to a law of large numbers for sets). Furthermore, in the case of convex sets, Theorem 4.5 proves an even stronger fact, namely that the volume of the Minkowski sum of convex sets is supermodular. Various other facts surrounding these observations are also discussed in Section 4.
Even though the conjecture about $A(k)$ becoming progressively more convex in the sense of $\Delta$ is false thanks to Theorem 3.4, one can ask the same question when we measure the extent of non-convexity using functionals other than $\Delta$. In Section 2, we survey the existing literature on measures of non-convexity of sets, also making some possibly new observations about these various measures and the relations between them. The functionals we consider include a non-convexity index $c(A)$ introduced by Schneider , the notion of inner radius $r(A)$ introduced by Starr  (and studied in an equivalent form as the effective standard deviation $v(A)$ by Cassels , though the equivalence was only understood later by Wegmann ), and the Hausdorff distance $d(A)$ to the convex hull, which we already introduced when describing the Shapley-Folkman-Starr theorem. We also consider the generalized Hausdorff distance $d^{(K)}(A)$ corresponding to using a non-Euclidean norm whose unit ball is the convex body $K$. The rest of the paper is devoted to the examination of whether $A(k)$ becomes progressively more convex as $k$ increases, when measured through these other functionals.
In Section 5, we develop the main positive result of this paper, Theorem 5.3, which shows that $c(A(k))$ is monotonically (strictly) decreasing in $k$, unless $A(k)$ is already convex. Various other properties of Schneider’s non-convexity index and its behavior for Minkowski sums are also established here, including the optimal $O(1/k)$ convergence rate for $c(A(k))$. We remark that even the question of convergence of $c(A(k))$ to 0 does not seem to have been explored in the literature.
Section 6 considers the behavior of $v(A(k))$ (or equivalently $r(A(k))$). For this sequence, we show that monotonicity holds in dimensions 1 and 2, and in general dimension, monotonicity holds eventually (in particular, once $k$ exceeds $n$). The convergence rate of $r(A(k))$ to 0 was already established in Starr’s original paper ; we review the classical proof of Cassels  of this result.
Section 7 considers the question of monotonicity of $d(A(k))$, as well as its generalizations when we consider $\mathbb{R}^n$ equipped with norms other than the Euclidean norm (indeed, following , we even consider so-called “nonsymmetric norms”). Again here, we show that monotonicity holds in dimensions 1 and 2, and in general dimension, monotonicity holds eventually (in particular, once $k$ exceeds $n$). In fact, more general inequalities are proved that hold for Minkowski sums of different sets. The convergence rate of $d(A(k))$ to 0 was already established in Starr’s original paper ; we review both a classical proof, and also provide a new very simple proof of a rate result that is suboptimal in dimension for the Euclidean norm but sharp in both dimension and number of summands given that it holds for arbitrary norms. In 2004 Dyn and Farkhi  conjectured that $d^2(A+B)\leq d^2(A)+d^2(B)$. We show that this conjecture is false in $\mathbb{R}^n$, $n\geq 3$.
In Section 8, we show that a number of results from combinatorial discrepancy theory can be seen as consequences of the convexifying effect of Minkowski summation. In particular, we obtain a new bound on the discrepancy for finite-dimensional Banach spaces in terms of the Banach-Mazur distance of the space from a Euclidean one.
Finally, in Section 9, we make various additional remarks, including on notions of non-convexity not considered in this paper.
Acknowledgments. Franck Barthe had independently observed that Conjecture 1.2 holds in dimension 1, using the same proof, by 2011. We are indebted to Fedor Nazarov for valuable discussions, in particular for the help in the construction of the counterexamples in Theorem 3.4 and Theorem 7.3. We would like to thank Victor Grinberg for many enlightening discussions on the connections with discrepancy theory, which were an enormous help with putting Section 8 together. We also thank Franck Barthe, Dario Cordero-Erausquin, Uri Grupel, Bo’az Klartag, Joseph Lehec, Paul-Marie Samson, Sreekar Vadlamani, and Murali Vemuri for interesting discussions. Some of the original results developed in this work were announced in ; we are grateful to Gilles Pisier for curating that announcement. Finally we are grateful to the anonymous referee for a careful reading of the paper and constructive comments.
2 Measures of non-convexity
2.1 Preliminaries and Definitions
Throughout this paper, we only deal with compact sets, since several of the measures of non-convexity we consider can have rather unpleasant behavior if we do not make this assumption.
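The convex hull operation interacts nicely with Minkowski summation.
Lemma 2.1. Let $A, B$ be nonempty subsets of $\mathbb{R}^n$. Then,
$$\mathrm{conv}(A+B) = \mathrm{conv}(A)+\mathrm{conv}(B).$$
Proof. Let $x\in\mathrm{conv}(A)+\mathrm{conv}(B)$. Then $x=\sum_{i=1}^k\lambda_i a_i+\sum_{j=1}^l\mu_j b_j$, where $a_i\in A$, $b_j\in B$, $\lambda_i,\mu_j\geq 0$, and $\sum_{i=1}^k\lambda_i=1$, $\sum_{j=1}^l\mu_j=1$. Thus, $x=\sum_{i=1}^k\sum_{j=1}^l\lambda_i\mu_j(a_i+b_j)$. Hence $x\in\mathrm{conv}(A+B)$. The other inclusion is clear.
Remark 2.2. If $A$ is convex then
$$\mathrm{conv}(A+B) = A+\mathrm{conv}(B).$$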
The Shapley-Folkman lemma, which is closely related to the classical Carathéodory theorem, is key to our development.
Lemma 2.3 (Shapley-Folkman).
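Let $A_1,\ldots,A_k$ be nonempty subsets of $\mathbb{R}^n$, with $k\geq n+1$. Let $a\in\sum_{i\in[k]}\mathrm{conv}(A_i)$. Then there exists a set $I\subset[k]$ of cardinality at most $n$ such that
$$a\in\sum_{i\in I}\mathrm{conv}(A_i)+\sum_{i\in[k]\setminus I}A_i.$$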
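We present below a proof taken from Proposition 5.7.1 of . Let $a\in\sum_{i\in[k]}\mathrm{conv}(A_i)$. Then
$$a=\sum_{i\in[k]}a_i=\sum_{i\in[k]}\sum_{j=1}^{t_i}\lambda_{ij}a_{ij},$$
where $\lambda_{ij}\geq 0$, $\sum_{j=1}^{t_i}\lambda_{ij}=1$, and $a_{ij}\in A_i$. Let us consider the following vectors of $\mathbb{R}^{n+k}$,
$$z=(a,1,\ldots,1),\qquad z_{1j}=(a_{1j},1,0,\ldots,0),\ j\in[t_1],\qquad\ldots,\qquad z_{kj}=(a_{kj},0,\ldots,0,1),\ j\in[t_k].$$
Notice that $z=\sum_{i=1}^k\sum_{j=1}^{t_i}\lambda_{ij}z_{ij}$. Using Carathéodory’s theorem in the positive cone generated by $z_{ij}$ in $\mathbb{R}^{n+k}$, one has
$$z=\sum_{i=1}^k\sum_{j=1}^{t_i}\mu_{ij}z_{ij}$$
for some nonnegative scalars $\mu_{ij}$ where at most $n+k$ of them are non zero. This implies that $a=\sum_{i=1}^k\sum_{j=1}^{t_i}\mu_{ij}a_{ij}$ and that $\sum_{j=1}^{t_i}\mu_{ij}=1$, for all $i\in[k]$. Thus for each $i\in[k]$, there exists $j_i\in[t_i]$ such that $\mu_{ij_i}>0$. But at most $n+k$ scalars $\mu_{ij}$ are positive. Hence there are at most $n$ additional $\mu_{ij}$ that are positive. One deduces that there are at least $k-n$ indices $i$ such that $\mu_{i\ell_i}=1$ for some $\ell_i\in[t_i]$, and thus $\mu_{ij}=0$ for $j\neq\ell_i$. For these indices, one has $a_i\in A_i$. The other inclusion is clear.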
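The Shapley-Folkman lemma may alternatively be written as the statement that, for $k\geq n+1$,
$$\mathrm{conv}\Big(\sum_{i\in[k]}A_i\Big)=\bigcup_{I\subset[k]:\,|I|\leq n}\Big[\sum_{i\in I}\mathrm{conv}(A_i)+\sum_{i\in[k]\setminus I}A_i\Big],$$
where $|I|$ denotes the cardinality of $I$. When all the sets involved are identical, and $k>n$, this reduces to the identity
$$k\,\mathrm{conv}(A)=n\,\mathrm{conv}(A)+(k-n)A(k-n).$$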
It should be noted that the Shapley-Folkman lemma lies at the center of a rich vein of investigation in convex analysis and its applications. As explained by Z. Artstein , it may be seen as a discrete manifestation of a key lemma about extreme points that is related to a number of “bang-bang” type results. It also plays an important role in the theory of vector-valued measures; for example, it can be used as an ingredient in the proof of Lyapunov’s theorem on the range of vector measures (see ,  and references therein).
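When all the summands are equal, the Shapley-Folkman lemma says that $k\,\mathrm{conv}(A)$ is already obtained by convexifying only $n$ of the $k$ copies of $A$. The sketch below checks this on the line ($n=1$), where the claim reads $k\,\mathrm{conv}(A)=\mathrm{conv}(A)+A+\cdots+A$ with $k-1$ copies of $A$. The finite set $A=\{0,1,4\}$ and the function names are assumptions made for the illustration.

```python
def sumset(X, Y):
    """Minkowski sum of two finite sets of numbers, as a sorted list."""
    return sorted({x + y for x in X for y in Y})

def hull_plus_copies(A, k):
    """conv(A) + A + ... + A (k-1 copies of A) on the line, as merged intervals."""
    lo, hi = min(A), max(A)
    S = [0]
    for _ in range(k - 1):
        S = sumset(S, A)
    merged = []
    for s in S:  # S is sorted, so the intervals s + [lo, hi] arrive left to right
        left, right = s + lo, s + hi
        if merged and left <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], right)
        else:
            merged.append([left, right])
    return merged

A = [0, 1, 4]  # assumed toy example; conv(A) = [0, 4]
results = {k: hull_plus_copies(A, k) for k in range(2, 6)}
for k, intervals in results.items():
    print(k, intervals)
```

In each case the union collapses to the single interval $[0,4k]=k\,\mathrm{conv}(A)$, even though $A$ itself is far from convex.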
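For a compact set $A$ in $\mathbb{R}^n$, denote by
$$\mathrm{rad}(A)=\min_{x\in\mathbb{R}^n}\max_{a\in A}|x-a|$$
the radius of the smallest ball containing $A$. By Jung’s theorem , this parameter is close to the diameter, namely one has
$$\frac{\mathrm{diam}(A)}{2}\leq\mathrm{rad}(A)\leq\mathrm{diam}(A)\sqrt{\frac{n}{2(n+1)}}\leq\frac{\mathrm{diam}(A)}{\sqrt{2}},$$
where $\mathrm{diam}(A)=\max_{x,y\in A}|x-y|$ is the Euclidean diameter of $A$. We also denote by
$$\mathrm{inr}(A)=\max_{x\in A}\max\{r\geq 0 : x+rB_2^n\subset A\}$$
the inradius of $A$, i.e. the radius of a largest Euclidean ball included in $A$. There are several ways of measuring non-convexity of a set: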
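The Hausdorff distance from the convex hull is perhaps the most obvious measure to consider:
$$d(A)=d_H(A,\mathrm{conv}(A))=\inf\{r>0 : \mathrm{conv}(A)\subset A+rB_2^n\}.$$
A variant of this is to consider the Hausdorff distance when the ambient metric space is equipped with a norm different from the Euclidean norm. If $K$ is the closed unit ball of this norm (i.e., any symmetric, compact, convex set with nonempty interior, where we always use “symmetric” to mean centrally symmetric, i.e., $x\in K$ if and only if $-x\in K$), we define
$$d^{(K)}(A)=\inf\{r>0 : \mathrm{conv}(A)\subset A+rK\}. \tag{8}$$
In fact, the quantity (8) makes sense for any compact convex set containing 0 in its interior; then it is sometimes called the Hausdorff distance with respect to a “nonsymmetric norm”.
Another natural measure of non-convexity is the “volume deficit”:
$$\Delta(A)=\mathrm{Vol}_n(\mathrm{conv}(A)\setminus A)=\mathrm{Vol}_n(\mathrm{conv}(A))-\mathrm{Vol}_n(A).$$
Of course, this notion is interesting only when $\mathrm{Vol}_n(\mathrm{conv}(A))\neq 0$. There are many variants of this that one could consider, such as $\mathrm{Vol}_n(\mathrm{conv}(A))^{1/n}-\mathrm{Vol}_n(A)^{1/n}$, or relative versions such as $\Delta(A)/\mathrm{Vol}_n(\mathrm{conv}(A))$ that are automatically bounded.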
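The “inner radius” of a compact set was defined by Starr  as follows:
$$r(A)=\sup_{x\in\mathrm{conv}(A)}\inf\{\mathrm{rad}(T) : T\subset A,\ x\in\mathrm{conv}(T)\}.$$
The “effective standard deviation” was defined by Cassels . For a random vector $X$ in $\mathbb{R}^n$, let $V(X)$ be the trace of its covariance matrix. Then the effective standard deviation of a compact set $A$ of $\mathbb{R}^n$ is
$$v^2(A)=\sup_{x\in\mathrm{conv}(A)}\inf\{V(X) : \mathrm{supp}(X)\subset A,\ |\mathrm{supp}(X)|<\infty,\ \mathbb{E}[X]=x\}.$$
Let us notice the equivalent geometric definition of $v$:
$$v^2(A)=\sup_{x\in\mathrm{conv}(A)}\inf\Big\{\sum_i p_i|a_i-x|^2 : x=\sum_i p_i a_i;\ p_i>0;\ \sum_i p_i=1;\ a_i\in A\Big\}.$$
In analogy with the effective standard deviation, we define the “effective absolute deviation” by
$$w(A)=\sup_{x\in\mathrm{conv}(A)}\inf\Big\{\sum_i p_i|a_i-x| : x=\sum_i p_i a_i;\ p_i>0;\ \sum_i p_i=1;\ a_i\in A\Big\}.$$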
Another non-convexity measure was defined by Cassels  as follows:
The “non-convexity index” was defined by Schneider  as follows:
$$c(A)=\inf\{\lambda\geq 0 : A+\lambda\,\mathrm{conv}(A)\text{ is convex}\}.$$
2.2 Basic properties of non-convexity measures
All of these functionals are 0 when is a convex set; this justifies calling them “measures of non-convexity”. In fact, we have the following stronger statement since we restrict our attention to compact sets.
Let be a compact set in . Then:
if and only if is convex.
if and only if is convex.
if and only if is convex.
if and only if is convex.
if and only if is convex.
if and only if is convex.
Under the additional assumption that has nonempty interior, if and only if is convex.
Directly from the definition of we get that if is convex (just select ). Now assume that , then is a sequence of compact convex sets, converging in Hausdorff metric to , thus must be convex. Notice that this observation is due to Schneider .
The assertion about follows immediately from the definition and the limiting argument similar to the above one.
If is convex then, clearly , indeed we can always take with . Next, if , then using Theorem 2.15 below we have thus and therefore is convex.
The statements about , and can be deduced from the definitions, but they will also follow immediately from the Theorem 2.15 below.
Assume that is convex, then and . Next, assume that . Assume, towards a contradiction, that . Then there exists and such that . Since is convex and has nonempty interior, there exists a ball and one has
which contradicts .
The following lemmata capture some basic properties of all these measures of non-convexity (note that we need not separately discuss , and henceforth owing to Theorem 2.15). The first lemma concerns the behavior of these functionals on scaling of the argument set.
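Let $A$ be a compact subset of $\mathbb{R}^n$, $x\in\mathbb{R}^n$, and $\lambda\in(0,\infty)$.
$c(\lambda A+x)=c(A)$. In fact, $c$ is affine-invariant.
$d(\lambda A+x)=\lambda\,d(A)$.
$r(\lambda A+x)=\lambda\,r(A)$.
$\Delta(\lambda A+x)=\lambda^n\Delta(A)$. In fact, if $T(x)=Mx+b$, where $M$ is an invertible linear transformation and $b\in\mathbb{R}^n$, then $\Delta(T(A))=|\det(M)|\,\Delta(A)$.
To see that $c$ is affine-invariant, we first notice that $\mathrm{conv}(TA)=T\,\mathrm{conv}(A)$. Moreover writing $Tx=Mx+b$, where $M$ is an invertible linear transformation and $b\in\mathbb{R}^n$, we get that
$$TA+\lambda\,\mathrm{conv}(TA)=M(A+\lambda\,\mathrm{conv}(A))+(1+\lambda)b,$$
which is convex if and only if $A+\lambda\,\mathrm{conv}(A)$ is convex.
It is easy to see from the definitions that $d$, $r$ and $\Delta$ are translation-invariant, and that $d$ and $r$ are 1-homogeneous and $\Delta$ is $n$-homogeneous with respect to dilation.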
The next lemma concerns the monotonicity of non-convexity measures with respect to the inclusion relation.
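Let $A, B$ be compact sets in $\mathbb{R}^n$ such that $A\subset B$ and $\mathrm{conv}(A)=\mathrm{conv}(B)$. Then:
$c(A)\geq c(B)$.
$d(A)\geq d(B)$.
$r(A)\geq r(B)$.
$\Delta(A)\geq\Delta(B)$.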
For the first part, observe that if ,
where in the last equation we used that is convex and Remark 2.2. Hence all relations in the above display must be equalities, and must be convex, which means .
For the second part, observe that
For the third part, observe that
For the fourth part, observe that
As a consequence of Lemma 2.6, we deduce that $A(k)$ is monotone along the subsequence of powers of 2, when measured through all these measures of non-convexity.
Finally we discuss topological aspects of these non-convexity functionals, specifically, whether they have continuity properties with respect to the topology on the class of compact sets induced by Hausdorff distance.
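Suppose $A_k\xrightarrow{d_H}A$, where all the sets involved are compact subsets of $\mathbb{R}^n$. Then:
$\lim_{k\to\infty}d(A_k)=d(A)$, i.e., $d$ is continuous.
$\liminf_{k\to\infty}\Delta(A_k)\geq\Delta(A)$, i.e., $\Delta$ is lower semicontinuous.
$\liminf_{k\to\infty}c(A_k)\geq c(A)$, i.e., $c$ is lower semicontinuous.
$\liminf_{k\to\infty}r(A_k)\geq r(A)$, i.e., $r$ is lower semicontinuous.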
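Let us first observe that for any compact sets $A, B$
$$d_H(\mathrm{conv}(A),\mathrm{conv}(B))\leq d_H(A,B), \tag{9}$$
by applying the convex hull operation to the inclusions
$$A\subset B+\rho B_2^n,\qquad B\subset A+\rho B_2^n,\qquad\text{with }\rho=d_H(A,B),$$
and invoking Lemma 2.1.
Thus $A_k\xrightarrow{d_H}A$ implies $\mathrm{conv}(A_k)\xrightarrow{d_H}\mathrm{conv}(A)$.
1. Observe that by the triangle inequality for the Hausdorff metric, we have the inequality
$$d(A_k)=d_H(A_k,\mathrm{conv}(A_k))\leq d_H(A_k,A)+d_H(A,\mathrm{conv}(A))+d_H(\mathrm{conv}(A),\mathrm{conv}(A_k)).$$
Using (9) one deduces that $d(A_k)\leq d(A)+2d_H(A,A_k)$. Changing the role of $A_k$ and $A$, we get
$$|d(A_k)-d(A)|\leq 2d_H(A,A_k).$$
This proves the continuity of $d$.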
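2. Recall that, with respect to the Hausdorff distance, the volume is upper semicontinuous on the class of compact sets (see, e.g., [73, Theorem 12.3.6]) and continuous on the class of compact convex sets (see, e.g., [72, Theorem 1.8.20]). Thus
$$\limsup_{k\to\infty}\mathrm{Vol}_n(A_k)\leq\mathrm{Vol}_n(A)\qquad\text{and}\qquad\lim_{k\to\infty}\mathrm{Vol}_n(\mathrm{conv}(A_k))=\mathrm{Vol}_n(\mathrm{conv}(A)),$$
so that subtracting the former from the latter yields the desired semicontinuity of $\Delta$.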
3. Observe that by definition,
Thus , which is the desired semicontinuity of .
4. Using we get that is bounded and thus is bounded and there is a convergent subsequence . Our goal is to show that . Let . Then there exists such that . From the definition of we get that there exists such that and . We can select a convergent subsequence , where is compact (see [72, Theorem 1.8.4]), then and and therefore . Thus .
We emphasize that the semicontinuity assertions in Lemma 2.7 are not continuity assertions for a reason and even adding the assumption of nestedness of the sets would not help.
Schneider  observed that is not continuous with respect to the Hausdorff distance, even if restricted to the compact sets with nonempty interior. His example consists of taking a triangle in the plane, and replacing one of its edges by the two segments which join the endpoints of the edge to an interior point (see Figure 1). More precisely, let , , and . Then . But one has since is convex. Moreover one can notice that . Indeed on one hand , which implies that , on the other hand for every the point , thus . Notice also that . Indeed hence and the opposite inequality is not difficult to see since the supremum in the definition of is attained at the point .
To see that there is no continuity for , consider a sequence of discrete nested sets converging in to , more precisely: .
2.3 Special properties of Schneider’s index
All these functionals other than $c$ can be unbounded. The boundedness of $c$ follows from the following nice inequality due to Schneider .
 For any subset $A$ of $\mathbb{R}^n$,
$$c(A)\leq n.$$
Applying the Shapley-Folkman lemma (Lemma 2.3) to $A_1=\cdots=A_{n+1}=A$, where $A\subset\mathbb{R}^n$ is a fixed compact set, one deduces that $(n+1)\,\mathrm{conv}(A)=A+n\,\mathrm{conv}(A)$. Thus $c(A)\leq n$.
Schneider  showed that $c(A)=n$ if and only if $A$ consists of $n+1$ affinely independent points. Schneider also showed that if $A$ is unbounded or connected, one has the sharp bound $c(A)\leq n-1$.
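Schneider’s index can be computed by hand on the line, which gives a feel for both the bound by the dimension and its extremal case. The sketch below assumes the definition $c(A)=\inf\{\lambda\geq 0 : A+\lambda\,\mathrm{conv}(A)\text{ is convex}\}$ and a finite set $A\subset\mathbb{R}$ with at least two points. In that setting $A+\lambda\,\mathrm{conv}(A)$ is a union of translates of an interval of length $\lambda\,\mathrm{diam}(A)$, so it is convex exactly when $\lambda\,\mathrm{diam}(A)$ is at least the largest gap of $A$. The function names are illustrative only.

```python
from fractions import Fraction

def schneider_index_1d(points):
    """c(A) for a finite set on the line (>= 2 points): largest gap / hull length."""
    pts = sorted(set(points))
    largest_gap = max(b - a for a, b in zip(pts, pts[1:]))
    return Fraction(largest_gap) / Fraction(pts[-1] - pts[0])

def self_average(points, k):
    """A(k) = (A + ... + A)/k (k summands) for a finite set A of rationals."""
    sums = {Fraction(0)}
    for _ in range(k):
        sums = {s + Fraction(a) for s in sums for a in points}
    return [s / k for s in sums]

# Two points in dimension n = 1 are affinely independent, so c should equal n = 1.
c_two_points = schneider_index_1d([0, 1])
# Along the self-averages of {0, 1} the index decreases strictly.
indices = [schneider_index_1d(self_average([0, 1], k)) for k in range(1, 6)]
print(c_two_points, indices)
```

For this toy set the index of $A(k)$ equals $1/k$, a strictly decreasing sequence; this is only a one-dimensional illustration and says nothing about higher dimensions.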
Let us note some alternative representations of Schneider’s non-convexity index. First, we would like to recall the definition of the Minkowski functional of a compact convex set $K$ containing zero:
$$\|x\|_K=\inf\{t>0 : x\in tK\},$$
with the usual convention that $\|x\|_K=+\infty$ if $\{t>0 : x\in tK\}=\emptyset$. Note that $K=\{x\in\mathbb{R}^n : \|x\|_K\leq 1\}$ and $\|\cdot\|_K$ is a norm if $K$ is symmetric with nonempty interior.
For any compact set , define
and observe that
Hence, we can express
Rewriting this yet another way, we see that if , then for each , there exists and such that
or equivalently, . In other words, where , which can be written as using the Minkowski functional. Thus
This representation is nice since it allows for comparison with the representation of in the same form but with replaced by the Euclidean unit ball.
Schneider  observed that there are many closed unbounded sets $A\subset\mathbb{R}^n$ that satisfy $c(A)=0$, but are not convex. Examples he gave include the set of integers in $\mathbb{R}$, or a parabola in the plane. This makes it very clear that if we are to use $c$ as a measure of non-convexity, we should restrict attention to compact sets.
2.4 Unconditional relationships
It is natural to ask how these various measures of non-convexity are related. First we note that $d$ and $d^{(K)}$ are equivalent. To prove this we would like to present an elementary but useful observation:
Let be an arbitrary convex body containing in its interior. Consider a convex body such that and . Then for any compact set ,
Hence, . In addition, one has
The next lemma follows immediately from Lemma 2.12:
Let be an arbitrary convex body containing 0 in its interior. For any compact set , one has
where are such that .
It is also interesting to note a special property of :
Let be a compact set in . If , then
If , then
If , then . But,
where we used the fact that by definition of , is convex. Hence, .
If , in addition to the above argument, we also have
Note that the inequality in the above lemma cannot be reversed even with the cost of an additional multiplicative constant. Indeed, take the sets from Example 2.8, then but tends to .
Observe that and have some similarity in definition. Let us introduce the point-wise definitions of above notions: Consider , define
More generally, if is a compact convex set in containing the origin,