The Vapnik-Chervonenkis dimension of cubes in $\mathbb{R}^d$
The Vapnik-Chervonenkis (VC) dimension of a collection of subsets of a set is an important combinatorial concept in settings such as discrete geometry and machine learning. In this paper we prove that the VC dimension of the family of $d$-dimensional cubes in $\mathbb{R}^d$ is $\lfloor (3d+1)/2 \rfloor$.
MSC: 03E05. Keywords: Vapnik-Chervonenkis dimension, discrete geometry
The Vapnik-Chervonenkis (VC) dimension of a collection $\mathcal{C}$ of subsets of a space is one measure of the complexity of $\mathcal{C}$. Introduced by Vapnik and Chervonenkis (1), it has found wide application in areas such as machine learning, where it is used to gauge the capacity of a model to represent sample data (see, e.g., Vapnik (2) and Devroye, Györfi, and Lugosi (3), or Floyd and Warmuth (4) for work on the related problem of sample compression). The VC dimension of many natural collections of subsets of Euclidean space has been determined. For instance, it is a standard result that the VC dimension of balls, half-spaces, and boxes in $\mathbb{R}^d$ is $d+1$, $d+1$, and $2d$, respectively. However, the VC dimension of cubes, or more generally the balls according to the $\ell^p$ norm on $\mathbb{R}^d$ for $p \neq 2$, has not previously been calculated. In this paper we will show that the VC dimension of cubes (the balls of the $\ell^\infty$ norm) in $\mathbb{R}^d$ is $\lfloor (3d+1)/2 \rfloor$. A remaining question is the VC dimension of the balls of other $\ell^p$ norms. While the VC dimension of such collections of balls for sufficiently large $p$ should be at least that of cubes, the precise values of the VC dimension and its behaviour as $p$ changes are not known.
To define the VC dimension, let $X$ be a set and $\mathcal{C}$ a collection of subsets of $X$. If $S$ is a subset of $X$ and $T$ a subset of $S$, then we say that $\mathcal{C}$ carves out $T$ from $S$ if there is a set $C$ in $\mathcal{C}$ such that $C \cap S = T$. The set $S$ is shattered by $\mathcal{C}$ if every subset of $S$ is carved out by $\mathcal{C}$. With these definitions, we have the following.
Definition 1. The VC dimension of a collection $\mathcal{C}$ of subsets of $X$ is the supremum of the cardinalities of the finite subsets of $X$ that are shattered by $\mathcal{C}$.
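The definitions of carving out and shattering can be checked by brute force on a finite ground set. The following is a minimal sketch, not part of the paper: the function names and the toy family of integer intervals are illustrative assumptions, and the family is assumed to be non-empty.

```python
from itertools import combinations

def carves_out(family, S, T):
    """True if some member C of the family satisfies C ∩ S == T."""
    return any(C & S == T for C in family)

def is_shattered(family, S):
    """True if every subset of S is carved out by the family."""
    S = frozenset(S)
    return all(
        carves_out(family, S, frozenset(T))
        for r in range(len(S) + 1)
        for T in combinations(S, r)
    )

def vc_dimension(family, ground):
    """Largest size of a shattered subset of a finite ground set."""
    ground = list(ground)
    best = 0
    for r in range(1, len(ground) + 1):
        if not any(is_shattered(family, S) for S in combinations(ground, r)):
            break  # subsets of shattered sets are shattered, so we can stop
        best = r
    return best

# Toy example (an assumption for illustration): integer intervals in {0, ..., 4}.
ground = range(5)
intervals = [frozenset(range(i, j)) for i in range(6) for j in range(i, 6)]
print(vc_dimension(intervals, ground))  # 2
```

Intervals shatter any two points but never three, since the two outer points cannot be carved out without the middle one, which is why the sketch reports 2.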
To prove that the VC dimension of cubes in $\mathbb{R}^d$ is $\lfloor (3d+1)/2 \rfloor$, we first show that no set of larger cardinality is shattered by cubes, and then we construct sets in $\mathbb{R}^d$ of the appropriate size that are shattered by cubes. Before we proceed, we remark that we will refer to the $i$th coordinate of a point $p$ in $\mathbb{R}^d$ with a superscript, as in $p^i$.
2 Establishing the upper bound on the VC dimension
We now prove that the VC dimension of cubes is at most $\lfloor (3d+1)/2 \rfloor$.
Any set in $\mathbb{R}^d$ that is shattered by cubes has at most $\lfloor (3d+1)/2 \rfloor$ points.
Proof. Let $S$ be a subset of $\mathbb{R}^d$ with $n$ points, and suppose that $S$ is shattered by cubes. In order to establish the upper bound, for every axis $i$ we pick points $m_i$ and $M_i$ in $S$ whose $i$th coordinates are minimal and maximal, respectively, among the $i$th coordinates of points in $S$. Observe that every point in $S$ must appear somewhere in the list $m_1, M_1, \ldots, m_d, M_d$ of pairs of extremal points. For if that were not the case, and a point $p$ in $S$ did not appear, then the subset $S \setminus \{p\}$ could not be carved out by a cube: any cube containing every $m_i$ and $M_i$ contains all of $S$, and in particular $p$.
Now, let $k$ be the number of points in $S$ that appear precisely once in our list. These points are distributed over $k$ positions in the list of $2d$ extrema, leaving $2d - k$ positions in the list to be filled by the $n - k$ points that appear at least twice. So $2(n - k) \le 2d - k$, implying that $n \le d + k/2$.
Finally, assume towards a contradiction that $k \ge d + 2$. By the pigeonhole principle there must be distinct axes $i$ and $j$ such that the points $m_i$, $M_i$, $m_j$, and $M_j$ appear precisely once in the list of extrema. Since we can carve out $S \setminus \{m_i, M_i\}$ with a cube, we can find two closed intervals of the same length such that the first contains the $j$th coordinate of each point in $S$ and the second contains the $i$th coordinate of each point in $S$ except for those of $m_i$ and $M_i$. Thus $M_j^j - m_j^j < M_i^i - m_i^i$. By repeating this argument with the roles of $i$ and $j$ interchanged we obtain the contradictory inequality $M_i^i - m_i^i < M_j^j - m_j^j$. Thus $k \le d + 1$, so $n \le d + (d+1)/2 = (3d+1)/2$, giving the upper bound. ∎
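The counting in this proof can be turned into a quick necessary test for a finite point set. The sketch below is illustrative only: the function names are assumptions, points are given as coordinate tuples, and ties between extremal points are broken arbitrarily by `min` and `max`. A set that fails the test cannot be shattered by cubes; passing it does not imply shattering.

```python
from collections import Counter

def extremal_list(points):
    """For each axis, one point with minimal and one with maximal coordinate."""
    d = len(points[0])
    listing = []
    for i in range(d):
        listing.append(min(points, key=lambda p: p[i]))
        listing.append(max(points, key=lambda p: p[i]))
    return listing

def once_count(points):
    """Number of points that appear exactly once in the list of extrema."""
    counts = Counter(extremal_list(points))
    return sum(1 for p in points if counts[p] == 1)

def passes_counting_test(points):
    """Necessary conditions from the proof: every point is extremal on some
    axis, and at most d + 1 points appear exactly once in the list."""
    d = len(points[0])
    every_point_listed = set(points) <= set(extremal_list(points))
    return every_point_listed and once_count(points) <= d + 1

diamond = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # four points in the plane
three = [(-1, -1), (1, 1), (0, -2)]           # three points in the plane
print(once_count(diamond), passes_counting_test(diamond))  # 4 False
print(once_count(three), passes_counting_test(three))      # 2 True
```

For the four-point example every point appears exactly once, so the count exceeds $d + 1 = 3$ and the set cannot be shattered by squares, in line with the bound of three points in the plane.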
3 Establishing the lower bound on the VC dimension
In order to construct sets in $\mathbb{R}^d$ of size $\lfloor (3d+1)/2 \rfloor$ that can be shattered by cubes, we need some preliminary definitions. Consider the collection $\mathcal{I}$ of sets consisting of $\mathbb{R}$, all closed left-infinite intervals $(-\infty, a]$ with $a \ge 0$, and right-infinite intervals $[a, \infty)$ with $a \le 0$. Call a product $C = I_1 \times \cdots \times I_d$ of intervals in $\mathcal{I}$ a constraint in $\mathbb{R}^d$. We say that $C$ is right-exclusive if $I_d$ is half-infinite to the left (that is, of the form $(-\infty, a]$), left-exclusive if $I_d$ is half-infinite to the right, and inclusive if $I_d$ is $\mathbb{R}$. Observe that given a constraint $C$ and a bounded set $S$, it is possible to carve out $C \cap S$ with arbitrarily large cubes. So if we can shatter a set with constraints, then we may do so with cubes.
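The observation that a constraint can be replaced by a large cube on a bounded set can be sketched as follows. This is an illustration under stated assumptions rather than the paper's construction: a constraint is represented as a list of `(lo, hi)` pairs with at most one finite endpoint per axis, the point set is finite and non-empty, and the helper names and the `slack` parameter are hypothetical.

```python
from math import inf

def in_box(p, intervals):
    """True if the point p lies in the product of the closed intervals."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(p, intervals))

def cube_from_constraint(constraint, points, slack=1.0):
    """Return a cube (equal-length intervals) meeting `points` in the same
    subset as `constraint`; a larger `slack` gives a larger cube."""
    d = len(constraint)
    mins = [min(p[i] for p in points) for i in range(d)]
    maxs = [max(p[i] for p in points) for i in range(d)]
    # Side length needed so that each bounded interval reaches across the set.
    needed = 0.0
    for i, (lo, hi) in enumerate(constraint):
        if lo == -inf and hi == inf:
            needed = max(needed, maxs[i] - mins[i])
        elif lo == -inf:
            needed = max(needed, hi - mins[i])
        else:
            needed = max(needed, maxs[i] - lo)
    side = needed + slack
    cube = []
    for i, (lo, hi) in enumerate(constraint):
        if lo == -inf and hi == inf:
            cube.append((mins[i], mins[i] + side))  # cover the whole set
        elif lo == -inf:
            cube.append((hi - side, hi))            # keep the right endpoint
        else:
            cube.append((lo, lo + side))            # keep the left endpoint
    return cube

pts = [(0, 0), (2, 1), (-1, 3)]
constraint = [(-inf, 1), (-inf, inf)]  # right-bounded on axis 1, free on axis 2
cube = cube_from_constraint(constraint, pts)
print([p for p in pts if in_box(p, cube)])  # [(0, 0), (-1, 3)]
```

Each half-infinite interval keeps its finite endpoint and is cut off on the far side of the point set, so the cube and the constraint select the same points.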
To construct shattered sets of the appropriate size, we will need to construct sets in each dimension that are slightly smaller than required but can be shattered by constraints in a particularly “nice” way. The construction of these sets will be recursive in $d$ (see Lemmas 4 and 5), and the niceness property that is captured in Definition 3 will allow the recursion to continue.
Definition 3. Let $S$ be a subset of $\mathbb{R}^d$. Say that $S$ is accessible if every subset of $S$ may be carved out by an inclusive constraint. Say that $S$ is weakly accessible if every subset of $S$ may be carved out by two constraints, one of which is left-exclusive and the other of which is right-exclusive.
Lemma 4. If $S$ is an accessible subset of $\mathbb{R}^d$ then we may adjoin two points to $S$ to obtain a set $S'$ of $|S| + 2$ points shattered by cubes. We may also embed $S'$ in $\mathbb{R}^{d+1}$ to obtain a weakly accessible subset of $\mathbb{R}^{d+1}$.
Proof. Given such an $S$, let $p$ and $q$ be points in $\mathbb{R}^d$ with $p^d$ sufficiently negative and $q^d$ sufficiently positive, so in particular that $p^d$ and $q^d$ are minimal and maximal, respectively, on the $d$th axis, and with $p^i = q^i = 0$ for $i < d$. Let $S' = S \cup \{p, q\}$. To verify that $S'$ is shattered by cubes, let $T$ be a subset of $S$ and $C = I_1 \times \cdots \times I_d$ an inclusive constraint carving out $T$. Then $C$ carves out $T \cup \{p, q\}$ from $S'$. We may carve out the subsets $T \cup \{p\}$ and $T \cup \{q\}$ by taking the constraint $C$ and replacing $I_d$ with an appropriate half-infinite interval from $\mathcal{I}$. Finally, we may carve $T$ from $S'$ by carving $T$ from $S$ with a small cube, which misses $p$ and $q$ because they lie far out along the $d$th axis.
Now embed $S'$ in $\mathbb{R}^{d+1}$ by taking each point in $S'$ and duplicating its $d$th coordinate to serve as its $(d+1)$st coordinate. The resulting set is weakly accessible; the additional axis allows us to include and exclude the images of $p$ and $q$ from any subset we wish to carve out, and to do so with both left- and right-exclusive constraints. ∎
Lemma 5. If $S$ is a weakly accessible subset of $\mathbb{R}^d$ then we may adjoin one point to $S$ to obtain a set $S'$ that is shattered by constraints, and hence by cubes. We may also embed $S'$ in $\mathbb{R}^{d+1}$ to obtain an accessible subset of $\mathbb{R}^{d+1}$.
Proof. Let $p$ be a point in $\mathbb{R}^d$ whose $d$th coordinate is strictly smaller than those of the points in $S$, and whose $i$th coordinate is $0$ for each $i < d$. Define $S' = S \cup \{p\}$, and consider a subset $T$ of $S$. A left-exclusive constraint $C = I_1 \times \cdots \times I_d$ that carves out $T$ from $S$ (with the left endpoint of $I_d$ not too small) will also carve out $T$ from $S'$, while a right-exclusive constraint that carves out $T$ from $S$ will carve out $T \cup \{p\}$ from $S'$. So $S'$ is shattered by constraints.
As in the previous lemma, embed $S'$ into $\mathbb{R}^{d+1}$ by duplicating the $d$th coordinate of each point in $S'$. Since $S'$ is shattered by constraints, the resulting set in $\mathbb{R}^{d+1}$ is accessible; we do not need to use the $(d+1)$st axis to exclude any points when carving out subsets from the image of $S'$. ∎
For each $d \ge 1$ there is a set in $\mathbb{R}^d$ with $\lfloor (3d+1)/2 \rfloor$ points that is shattered by cubes.
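The alternation between Lemmas 4 and 5 can be explored numerically in low dimensions. The sketch below rests on assumptions that are not taken from the paper: the recursion starts from the empty set in dimension one, the adjoined points have coordinate 0 on all but the last axis, "sufficiently far" is taken to be ten times the current largest coordinate, and all function names are hypothetical. The shattering check is an exhaustive search, so it is only practical for a handful of points.

```python
from itertools import combinations, product
from math import inf

def carved_by_cube(points, keep):
    """True if some closed axis-parallel cube meets `points` exactly in `keep`."""
    inside = [p for p in points if p in keep]
    outside = [p for p in points if p not in keep]
    if not inside:
        return True  # a cube far away from every point carves out the empty set
    d = len(points[0])
    lo = [min(p[i] for p in inside) for i in range(d)]
    hi = [max(p[i] for p in inside) for i in range(d)]
    width = max(h - l for l, h in zip(lo, hi))  # the side length must be >= width
    # Choose, for every excluded point, the axis on which the cube misses it.
    for axes in product(range(d), repeat=len(outside)):
        below, above = [-inf] * d, [inf] * d  # nearest excluded coordinates
        ok = True
        for p, i in zip(outside, axes):
            if p[i] < lo[i]:
                below[i] = max(below[i], p[i])
            elif p[i] > hi[i]:
                above[i] = min(above[i], p[i])
            else:
                ok = False  # p cannot be excluded on axis i
                break
        # A side length in [width, above[i] - below[i]) must exist on every axis.
        if ok and all(a - b > width for a, b in zip(above, below)):
            return True
    return False

def shattered_by_cubes(points):
    """Exhaustively check that every subset of `points` is carved out by a cube."""
    return all(carved_by_cube(points, set(t))
               for r in range(len(points) + 1)
               for t in combinations(points, r))

def build_candidate(d):
    """Alternate the two lemmas: add two far points in odd dimensions and one
    in even dimensions, duplicating the last coordinate between steps."""
    pts = []  # assumption: start from the empty set in dimension one
    for dim in range(1, d + 1):
        if dim > 1:
            pts = [p + (p[-1],) for p in pts]  # embed by duplicating the last axis
        far = 10 * (1 + max((abs(x) for p in pts for x in p), default=0))
        zeros = (0,) * (dim - 1)
        if dim % 2 == 1:
            pts = pts + [zeros + (-far,), zeros + (far,)]  # as in Lemma 4
        else:
            pts = pts + [zeros + (-far,)]                  # as in Lemma 5
    return pts

for d in range(1, 5):
    pts = build_candidate(d)
    print(d, len(pts), shattered_by_cubes(pts))
```

The search in `carved_by_cube` uses the fact that a cube is determined by one window of common length per axis: the window must cover the kept points, and each excluded point must fall outside the window on at least one axis.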
- (1) V. N. Vapnik, A. Y. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability and Its Applications 16 (1971) 264–280.
- (2) V. N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
- (3) L. Devroye, L. Györfi, G. Lugosi, A Probabilistic Theory of Pattern Recognition, Springer-Verlag, New York, 1996.
- (4) S. Floyd, M. Warmuth, Sample compression, learnability, and the Vapnik-Chervonenkis dimension, Machine Learning 21 (1995) 1–36.