A geometrically motivated parametric
model in manifold estimation
José R. Berrendero, Alejandro Cholaquidis,
Antonio Cuevas, Ricardo Fraiman
Departamento de Matemáticas, Universidad Autónoma de Madrid, Spain
Centro de Matemática, Universidad de la República, Uruguay
Departamento de Matemática, Universidad de San Andrés, Argentina
The general aim of manifold estimation is reconstructing, by statistical methods, an -dimensional compact manifold on (with ) or estimating some relevant quantities related to the geometric properties of . We will assume that the sample data are given by the distances to the -dimensional manifold from points randomly chosen on a band surrounding , with and . The point in this paper is to show that, if belongs to a wide class of compact sets (which we call sets with polynomial volume), the proposed statistical model leads to a relatively simple parametric formulation. In this setup, standard methodologies (method of moments, maximum likelihood) can be used to estimate some interesting geometric parameters, including curvatures and Euler characteristic. We will particularly focus on the estimation of the -dimensional boundary measure (in Minkowski’s sense) of .
It turns out, however, that the estimation problem is not straightforward since the standard estimators show a remarkably pathological behavior: while they are consistent and asymptotically normal, their expectations are infinite. The theoretical and practical consequences of this fact are discussed in some detail.
AMS 2010 Subject Classification: 62F10, 62H35.
Key words and phrases: Estimation of boundary length; estimation of curvature; distance to boundary; volume function; remote sensing.
This work has been partially supported by Spanish Grants MTM2010-17366 (Authors 1, 3 and 4) and CCG10-UAM/ESP-5494 (Authors 1 and 3).
Some background: manifold estimation
Let be an -dimensional compact manifold in , that is, a compact subset of in which every point has a neighborhood which is homeomorphic to an open -ball in , where .
For define the -parallel set (or -offset),
where and denotes the Euclidean closed ball of center and radius .
A natural approach to tackle the estimation of is to assume that the sample points are randomly drawn “around” . More formally, these points could arise as noisy versions of observations randomly chosen on , that is , where the are random points chosen on the boundary and are iid random observations from a noise random variable . This is the additive noise model, used (under different assumptions for the noise variables ) by Genovese et al. (2012a, 2012b) and Niyogi et al. (2008).
We will consider here a slightly different clutter noise model which, in its more general formulation [Genovese et al. (2012b)], assumes that the sample observations come from a distribution , where is supported on , is uniformly distributed on a compact set and . To be more specific, we will consider the case of “extreme” noise contamination where and the noise support is the topological closure of an outside band surrounding .
The problem of estimating under such sample models is a relatively new subject, of increasing interest, usually called manifold estimation, which can be included in the broader field of manifold learning; see Dey (2007) for a recent general reference.
Manifold estimation is closely related, in the statement and methodology of the problem, to the theory of set estimation (see Cuevas and Fraiman (2009) for a recent survey) and, more specifically, to boundary estimation: see Cuevas and Rodríguez-Casal (2004). There, the problem is essentially to estimate the boundary of a given set from an iid sample of a probability distribution with support . As sometimes happens in statistical research, the focus of boundary (or set) estimation soon moved from the primary target of estimating the boundary (or set) itself, to other related goals that can be formalized in terms of estimation of appropriate functionals. A relevant example is the estimation of the -dimensional measure of . See Cuevas et al. (2007), Pateiro-López and Rodríguez-Casal (2008), Armendáriz et al. (2009) and Jiménez and Yukich (2011). In all these references, the sample model is somewhat different from the original simple iid situation mentioned above since the available sample information consists of random points drawn inside and outside . In some sense, the present paper goes along similar lines in the problem of manifold estimation as our main concern here will be the estimation of the -dimensional measure of a manifold with dimension .
The manifold problem and the solid problem. Our sampling model(s)
Given a set , we define the -dimensional Minkowski content of its topological boundary by
provided that this limit is finite; here denotes the -dimensional Lebesgue measure.
Likewise, the one-sided (outer) Minkowski content of is defined by
Assuming that has a Lipschitz boundary, it can be proved (see Ambrosio et al., 2008, Theorem 5, for a precise statement) that .
In this paper, we will consider two slightly different problems whose statistical treatment turns out to be essentially identical. First, the estimation of when is a -dimensional, smooth enough, compact manifold (so and ), will be called the manifold problem or the manifold model. Second, the estimation of , when is a -dimensional set (with non-empty interior), will be called the solid problem or the solid model.
As an important difference with respect to the general manifold estimation problem mentioned above, we will assume (in both models) that the sample data consist of the distances to from points uniformly drawn on the parallel set but outside . Therefore, whereas in the manifold problem (where typically ) this amounts to drawing the random points on the whole parallel set , in the solid problem (where ) we will assume that the are drawn on . This distinction makes sense as, in practice, it is reasonable to assume that we are just allowed to observe “from the outside”. As we will see, the mathematical treatment is essentially identical in both cases, with just a few minor differences. For this reason we will denote, with some abuse of notation, in both cases the value of the target parameter. See expressions (10) and (20) below for details.
These sampling models can be motivated in terms of remote sensing: we could think that we are able to measure (with a sonar device, for example) the distance from the outside points to the surface or to the solid .
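As an illustration of the sampling scheme in the solid model, the following minimal sketch simulates the observed distances for one concrete set chosen here only as an example: the closed unit disk in the plane, observed from an outer band of width R. The function name, the choice of the disk and the value of R are assumptions made for this sketch; they do not come from the paper.

```python
import math
import random

def sample_distances_unit_disk(n, R, rng):
    """Distances to the unit disk from n points drawn uniformly
    on the outer band {x : 1 <= |x| <= 1 + R}."""
    distances = []
    for _ in range(n):
        # For a uniform point on an annulus the squared radius is uniform
        # on [1, (1 + R)^2]; the angle is irrelevant for the distance.
        rho = math.sqrt(rng.uniform(1.0, (1.0 + R) ** 2))
        distances.append(rho - 1.0)  # distance from the point to the disk
    return distances

rng = random.Random(0)
R = 1.0
d = sample_distances_unit_disk(100_000, R, rng)
mean_d = sum(d) / len(d)
print(f"sample mean of the distances: {mean_d:.4f}")
```

For this particular set the distribution function of the distance is (2t + t^2)/(2R + R^2) on [0, R], so that for R = 1 the mean distance is 5/9. Only the distances, and not the positions of the sample points, are kept, which mirrors the remote sensing motivation given above.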
The structure of this paper.
Some necessary concepts, related to the structure of the volume function and its geometric and analytic interpretations are reviewed in Section 2.
The basic geometric assumption, as well as the main theoretical results concerning the estimation of the boundary measure , are established in Section 3. To be more specific, we show that, according to the proposed model, the distribution of the random variable belongs to a parametric family indexed by in such a way that the theoretical expressions for the asymptotic distributions of both the maximum likelihood and the moment estimator of can be explicitly obtained. We consider the two-dimensional case (where is a curve in the manifold model and a planar domain in the solid model) and the three-dimensional one (where is a surface or solid body, respectively). In the case , besides the estimation of , we can also tackle the estimation of a parameter, denoted by , which can be interpreted as the integrated mean curvature of .
In Section 4 we show that the standard estimators are, in some sense, pathological. In particular, the moment estimator (in spite of being consistent and asymptotically normal) has an infinite expectation. This entails that the usual mean square error is no longer a suitable criterion to measure the performance of these estimators. Hence, an alternative error criterion is proposed. Other estimation methods, aimed at overcoming the infinite expectation pathology, are also considered.
Section 5 is devoted to a small simulation study.
Section 6 includes some discussion and a few final remarks.
2 Some geometric preliminaries. The volume function
In what follows the volume function plays an outstanding role. It appears in a natural way in different topics related to stochastic geometry and geometric measure theory; see, e.g., Hug et al. (2004), Ambrosio et al. (2008) and Villa (2009) for recent references. In set estimation it also arises as an auxiliary tool to obtain convergence rates with respect to the Hausdorff metric; see, e.g., Walther (1997). An additional statistical application of the volume function will be presented in this paper.
The discussion below involves the use of some classical, though non-trivial, concepts from differential geometry and geometric measure theory. This section is devoted to briefly outline them. We just introduce the main results and concepts, pointing out their intuitive meanings, and refer to some standard references for additional details.
The Steiner formula
The systematic study of the volume function goes back to the nineteenth century. The best known result about this function is perhaps the classical formula of Steiner (1840), whose -dimensional version is as follows: If is a compact convex set, then the corresponding volume function is a polynomial in of degree ,
where denotes the (-dimensional) volume of the Euclidean unit ball in (with ) and the coefficients are the so-called “intrinsic volumes” of . In particular, is the volume of , and .
In the cases and , we will express the Steiner formula with the notations,
respectively. The value in (5) coincides with the “integrated mean curvature” of .
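The planar Steiner formula can be checked numerically on a simple convex set. The sketch below is only an illustration and uses the unit square (area 1, perimeter 4), for which the area of the r-parallel set is 1 + 4r + πr^2. The choice of set, the function names and the Monte Carlo sample size are assumptions of this example.

```python
import math
import random

def dist_to_unit_square(x, y):
    """Euclidean distance from (x, y) to the closed unit square [0, 1]^2."""
    dx = max(-x, 0.0, x - 1.0)
    dy = max(-y, 0.0, y - 1.0)
    return math.hypot(dx, dy)

def parallel_set_area_mc(r, n, rng):
    """Monte Carlo estimate of the area of the r-parallel set of the square."""
    side = 1.0 + 2.0 * r  # the parallel set lies in the box [-r, 1 + r]^2
    hits = 0
    for _ in range(n):
        x = rng.uniform(-r, 1.0 + r)
        y = rng.uniform(-r, 1.0 + r)
        if dist_to_unit_square(x, y) <= r:
            hits += 1
    return side * side * hits / n

def steiner_area(r):
    """Steiner polynomial of the unit square: area + perimeter * r + pi * r^2."""
    return 1.0 + 4.0 * r + math.pi * r * r

rng = random.Random(1)
r = 0.5
estimate = parallel_set_area_mc(r, 200_000, rng)
exact = steiner_area(r)
print(f"Monte Carlo: {estimate:.4f}   Steiner: {exact:.4f}")
```

The three terms have a direct reading: the square itself, four rectangular strips of width r along the sides, and four quarter discs of radius r at the vertices.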
Sets of positive reach. Federer’s volume formula and its geometrical interpretation
The appearance of a measure of curvature in (5) is not by chance. This point was clarified by Federer (1959) in a celebrated paper which, in many respects, can be considered as the pioneering reference in Geometric Measure Theory. In that paper, a generalization of the Steiner formula (5), together with a deep interpretation of the corresponding polynomial coefficients, was established.
Federer’s result is valid for a broad class of sets having a positive reach property. This is a fairly intuitive smoothness condition which does not involve any explicit differentiability assumption. The reach of a (closed) set , , is defined as the largest (possibly ) such that if then contains a unique point nearest to . If then is said to have positive reach.
As a combination of Theorems 5.6 and 5.19 in Federer (1959) we have the following clean and powerful result:
Federer’s Theorem.- If is a compact set with , then there exist unique such that
Moreover, coincides with the so-called Euler characteristic of (see below) which, in particular, is a topological invariant.
It is readily seen that and , where is defined in (2). The meaning of the remaining coefficients is also carefully addressed in Federer (1959) by showing that they can be interpreted as the total curvatures of .
The above theorem is a considerable extension of the Steiner formula. Note that it applies of course to any convex compact set since a closed set is convex if and only if . Moreover, as Federer (1959, Section 4) points out, the class of sets with positive reach “contains (…) all those sets which can be defined locally by means of finitely many equations, , and inequalities, , using real valued continuously differentiable functions, , whose gradients are Lipschitzian and satisfy a certain independence condition”. Note that if fulfills the positive reach condition, then can have “outward peaks” but the “inward” (non-differentiable) peaks are ruled out.
The Euler characteristic
As indicated above, the total curvature in (6) equals the Euler characteristic of . This is an important, integer-valued, quantity which provides useful information on some geometric aspects of a surface or, more in general, of a topological space.
Let us recall that a Riemannian manifold is just a differentiable manifold in which every tangent space is equipped with an inner product with an associated Riemannian metric which varies smoothly from point to point.
The formal relation of the Euler characteristic with the notion of curvature is given by the Gauss-Bonnet theorem. The simplest version of this result states that the total Gaussian curvature of a compact two-dimensional Riemannian manifold without boundary is equal to $2\pi\chi$, where $\chi$ denotes the Euler characteristic of the surface. The result can be extended to even-dimensional manifolds (the Euler characteristic of an odd dimensional compact manifold is zero). This is a striking fact since, in principle, the curvature is a notion that depends on local properties of the surface (relying on differentiability properties) and the Euler characteristic is a global, topological invariant, which means that it does not change by bijective bi-continuous transformations.
Let us now briefly recall some basic facts about the Euler characteristic. A more complete discussion can be found in the book by Hatcher (2002). The simplest definition of $\chi$ can be given for polyhedral surfaces in $\mathbb{R}^3$. In this case $\chi = v - e + f$, where $v$, $e$ and $f$ denote the numbers of vertices, edges and faces, respectively. It is a well-known classical result that if the surface is the boundary of a convex polyhedron then $\chi = 2$. The Euler characteristic can be defined for any subset of $\mathbb{R}^d$ in such a way that it is a topological invariant. As a consequence of this invariance we also have that $\chi = 2$ for the two-dimensional sphere (i.e. the boundary of the three-dimensional ball) and the same holds for any compact orientable surface homeomorphic to the sphere.
In the $d$-dimensional case we have that $\chi(\mathbb{S}^d) = 1 + (-1)^d$, so that it is always 0 or 2.
The Euler characteristic is also related to other invariants. For example, for connected orientable compact surfaces without boundary, we have $\chi = 2 - 2g$, where $g$ is the genus of the surface, which intuitively coincides with the number of “handles”. Thus, $\chi = 0$ for a torus, $\chi = -2$ for a double torus (with two handles) and so on.
The value of Euler’s characteristic is also explicitly known for many other interesting sets in $\mathbb{R}^d$, not necessarily curves or surfaces. For example, it is known that if the set is a “solid” ball, then $\chi = 1$. In fact the same is true for any contractible set (i.e., homotopy equivalent to a point). It follows from the previous discussion that the class of compact sets in $\mathbb{R}^d$ with $\chi = 1$ is extremely wide.
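The combinatorial definition for polyhedral surfaces (vertices minus edges plus faces) and the genus relation for closed orientable surfaces can be illustrated with a few lines of code. The function names and the particular cell decompositions used below are choices made for this sketch.

```python
def euler_characteristic(n_vertices, n_edges, n_faces):
    """Euler characteristic of a polyhedral surface: vertices - edges + faces."""
    return n_vertices - n_edges + n_faces

def torus_grid_counts(k):
    """Cell counts of a k x k quadrilateral grid whose opposite sides are
    identified, which is a polyhedral model of the torus (take k >= 3)."""
    return k * k, 2 * k * k, k * k  # vertices, edges, faces

def genus_from_euler(chi):
    """Genus of a connected closed orientable surface, from chi = 2 - 2g."""
    return (2 - chi) // 2

chi_tetrahedron = euler_characteristic(4, 6, 4)
chi_cube = euler_characteristic(8, 12, 6)
chi_torus = euler_characteristic(*torus_grid_counts(4))

print(chi_tetrahedron, chi_cube, chi_torus, genus_from_euler(chi_torus))
```

The tetrahedron and the cube are boundaries of convex polyhedra, so both give the value 2 of the sphere, whereas the grid model of the torus gives 0, in agreement with genus 1.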
3 Statistical results: parametric estimation of some geometric quantities
The statistical interpretation of the volume function
According to the statistical model(s) established in the introduction, we will always assume that our sample data consist of iid observations from the distance variable so that where are iid random variables with uniform distribution on the band . We will simultaneously consider the manifold model where will be a -dimensional manifold with and the solid model where . In both cases the main target will be to estimate the surface measure .
The following proposition is just a reformulation, in terms of our statistical model, of some results proved by Stachó (1976). It is included here for the sake of completeness.
Proposition 1. Let be a compact set and a fixed constant. Given a random variable uniformly distributed on the band , define the (Euclidean) distance variable . Denote by , for , the distribution function of .
(a) The distribution function is given by
where is the volume function associated with . Moreover, is absolutely continuous and differentiable except for, at most, a countable set of points. In particular, it can be expressed as the integral of its derivative, where a.e. () is the density function of .
(b) For every the left and right hand-side derivatives and do exist. Moreover they are continuous from the left and from the right, respectively and fulfill .
(c) For all there exists the Minkowski measure and
Proof: (a) and (b) Since is uniformly distributed, we have
Now, statements (a) and (b) concerning the absolute continuity and differentiability properties of follow directly from Lemma 2 in Stachó (1976). In fact, these properties are established in general for the so-called functions of Kneser type, and it is shown that the volume function belongs to that class. This means that , for all , .
Result (c) is just Theorem 2 in Stachó (1976) rewritten in our statistical framework.
The basic geometric assumption: sets with polynomial volume
According to Proposition 1, the simpler the structure of the easier the statistical problem stated in the introduction. The discussion in the previous section suggests that, concerning , we cannot expect anything simpler than the polynomial structure given by Steiner’s theorem. However, as we have also pointed out, there is no need to assume that is convex in order to get a polynomial volume function (at least on a given interval).
This leads us in a natural way to the following definition.
We will say that is a set of polynomial volume, of type 1, on the interval if the volume function has an expression of type
where denotes the volume of the unit ball in and are appropriate coefficients. The family of sets in fulfilling this property will be denoted by . More generally, we could also define the class of sets of polynomial volume, of type , by imposing that their volume functions have an expression such as (9) where the term is replaced with .
As a consequence of Steiner’s theorem, the class includes that of compact convex sets in but, from Federer’s theorem, it also includes the much broader class of compact sets with reach and Euler characteristic 1.
Since the class of sets with positive reach is by far the best known class of sets with a polynomial volume on an interval, it is natural to ask whether there exists a simple characterization of those sets that have a polynomial volume but do not fulfill the positive reach property. As far as we know, this is still an open question (see Heveling et al., 2004 for interesting closely related issues). It is easy to construct simple examples of such sets. Thus, the polygonal line joining the points , and belongs to the family with and . The same holds for the non-convex pentagon , where denotes the (open) triangle whose vertices are , and .
The set (b) in Figure 1 is defined as the unit circle in minus the cone with center and angle . A direct calculation shows that in this case the volume function is
Heveling et al. (2004) present a general construction of non-convex sets in () with reach equal to zero and with polynomial volume function for any . Examples of them are those presented in Figure 1, (c) and (d). The first one is just the union of two touching balls, . It can be seen that
The set (d) in Figure 1 can be defined as , where is the union of the closed segment joining the points and with the point . It can be proved that in this case
As a conclusion, the cases (a) and (b) in Figure 1 provide examples of sets in with polynomial volume but not of type 1, that is they are not in since the value in the highest order term of the polynomial volume function is not 1. On the other hand, the cases (c) and (d) correspond to sets with reach 0 but belonging to for all .
Throughout the rest of the paper we shall concentrate on the cases and (though the basic ideas can potentially be extended to general dimensions). So we will deal with the classes and for which the expressions of on the interval are of type (4) and (5), respectively. Let us recall that these classes include all sets with positive reach and Euler’s characteristic 1. The more general cases and can be handled in a similar way, just incorporating as an additional parameter in the estimation procedure (in case it were not known in advance).
The two-dimensional case
Let us first consider the case where . In this case, our only estimation target is . The following result provides two alternative expressions for the distribution of the random variable “distance to the boundary of ”, , defined above. We assume that belongs to the class of sets with polynomial volume given by
Proposition 2. The random variable is absolutely continuous with density function
An alternative expression for this density is
where , is the density function of a random variable , uniform in and is the density function of , where follows a Beta distribution with parameters and .
In order to gain some insight on the geometric meaning of (12), let us consider the simple case of a square . While the distance from those “regular” points in not projecting on any of the vertices of follows a uniform distribution , the density accounts for the remaining points whose projection is one vertex. For more complicated sets one could think that (12) reflects the mixture between “flatness” (the term) and “curvature” (the term) in the boundary of .
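The mixture representation can be made concrete under one explicit assumption, which is ours and not a quotation of (10): that the volume function on [0, R] is μ(S) + L0 r + π r^2, where L0 denotes the boundary measure. The density of the distance is then (L0 + 2πt)/(L0 R + πR^2) on [0, R], and it is a mixture of a uniform density on (0, R), with weight L0/(L0 + πR), and the density of R times a Beta(2, 1) variable, with weight πR/(L0 + πR). The sketch below samples from this mixture for the unit square (L0 = 4) and compares the empirical and theoretical distribution functions; all names and numerical values are hypothetical.

```python
import math
import random

def density(t, L0, R):
    """Assumed density of the distance in the two-dimensional model."""
    return (L0 + 2.0 * math.pi * t) / (L0 * R + math.pi * R * R)

def cdf(t, L0, R):
    """Distribution function associated with the density above."""
    return (L0 * t + math.pi * t * t) / (L0 * R + math.pi * R * R)

def beta_weight(L0, R):
    """Weight of the 'curvature' (Beta) component of the mixture."""
    return math.pi * R / (L0 + math.pi * R)

def sample_mixture(n, L0, R, rng):
    p = beta_weight(L0, R)
    out = []
    for _ in range(n):
        if rng.random() < p:
            # R * Beta(2, 1): a Beta(2, 1) variable is the square root of a uniform.
            out.append(R * math.sqrt(rng.random()))
        else:
            out.append(R * rng.random())  # uniform on (0, R)
    return out

rng = random.Random(2)
L0, R = 4.0, 0.5  # unit square: perimeter 4
sample = sample_mixture(100_000, L0, R, rng)
t0 = 0.25
empirical = sum(1 for x in sample if x <= t0) / len(sample)
print(f"empirical F(t0) = {empirical:.4f}, theoretical F(t0) = {cdf(t0, L0, R):.4f}")
```

In the square example the uniform component corresponds to points projecting on the interior of a side and the Beta component to points projecting on a vertex, so the Beta weight πR/(L0 + πR) is the proportion of the band made up of the four quarter discs.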
We are now ready to consider the estimation of . Let us first analyze the solution provided by the classical method of moments. The following theorem shows that, at first sight, this procedure works reasonably well, in the sense that the expression of the estimator is not too complicated and the asymptotic distribution is easy to identify. However, as we will see in the next section, a rather surprising property comes up.
Under the assumption (10), we have that the estimator of by the method of moments from a sample of is given by
where denotes the sample mean of .
This estimator is asymptotically normal. More precisely, we have
where stands for convergence in law and
Proof: (a) First, we compute the expected distance:
The moment estimator, is defined to be the solution in of the equation
From the Central Limit Theorem applied to we have,
where, after some algebra, it is not difficult to show that
Now observe that , where . It is easy to check that
Thus, and . Notice also that
Therefore, using the standard delta-method for restricted to the interval , [e.g. Lehmann and Casella (1998), Th. 8.12, p. 58] we conclude
which leads to (14).
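The method of moments can be sketched numerically under the same kind of explicit assumption as before (ours, not a quotation of the paper's formulas): if the density of the distance is (L0 + 2πt)/(L0 R + πR^2) on [0, R], then the expected distance is R(3 L0 + 4πR)/(6(L0 + πR)), and solving for L0 with the sample mean m in place of the expectation gives 2πR(2R - 3m)/(3(2m - R)). The code below applies this expression to distances simulated around the unit disk, whose perimeter is 2π; function names and sample sizes are hypothetical.

```python
import math
import random

def expected_distance(L0, R):
    """E(D) under the assumed density (L0 + 2*pi*t) / (L0*R + pi*R^2) on [0, R]."""
    return R * (3.0 * L0 + 4.0 * math.pi * R) / (6.0 * (L0 + math.pi * R))

def moment_estimator(mean_d, R):
    """Value of L0 solving expected_distance(L0, R) = mean_d."""
    return 2.0 * math.pi * R * (2.0 * R - 3.0 * mean_d) / (3.0 * (2.0 * mean_d - R))

def sample_distances_unit_disk(n, R, rng):
    """Distances to the unit disk from uniform points on the outer band."""
    return [math.sqrt(rng.uniform(1.0, (1.0 + R) ** 2)) - 1.0 for _ in range(n)]

rng = random.Random(4)
R = 1.0
distances = sample_distances_unit_disk(200_000, R, rng)
mean_d = sum(distances) / len(distances)
L0_hat = moment_estimator(mean_d, R)
print(f"moment estimate: {L0_hat:.3f}   true perimeter: {2.0 * math.pi:.3f}")
```

The denominator 2m - R vanishes when the sample mean equals R/2, the mean of the uniform component, so the estimator is very sensitive to small changes of the sample mean near that value; this is consistent with the large asymptotic variance and with the anomalous behavior discussed in the next section.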
The next theorem is devoted to analyzing the properties of the maximum likelihood estimator . Unlike the moment estimator, has no explicit expression but, as we will see, it is slightly more efficient.
Under the assumption (10), we have that the maximum likelihood estimator of , , appears as the solution of the likelihood equation
This estimator is asymptotically normal, that is,
coincides with the Fréchet-Cramér-Rao bound (given by the inverse of Fisher’s information measure).
Proof: The likelihood equation (17) follows directly by calculating the derivative with respect to of the log-likelihood,
As for (18), we will use the classical result on the asymptotic normality of maximum likelihood estimators, in the version given in Lehmann and Casella (1998), Th. 3.10, p. 449. According to this result, a conclusion of type (18) can be obtained, for a general one-parameter family given by the (Lebesgue) densities , , under the following regularity conditions:
(i) The parameter space is an open interval (not necessarily finite).
(ii) The support of the distributions in the parametric family does not depend on , so that the set is independent of .
(iii) For every the density is three times differentiable with respect to , and the third derivative is continuous in .
(iv) The integral can be three times differentiated under the integral sign.
(v) The Fisher information fulfills .
(vi) For any given , there exists a positive number and a function (both of which may depend on ) such that
Obviously, in our case , and is given by (11). So conditions (i), (ii) and (iii) are fulfilled. On the other hand, (iv) is also fulfilled since the function in the integrand has three continuous derivatives with respect to .
The validity of condition (v) follows from the direct calculation of the Fisher information quantity which yields
As for condition (vi) let us note that
Now, a function fulfilling condition in a neighborhood of is, for example,
which clearly satisfies .
Finally, as a consequence of the asymptotic normality (and asymptotic efficiency) of the maximum likelihood estimator [Theorem 3.10 in Lehmann and Casella (1998), p. 449] we can conclude
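Since the maximum likelihood estimator has no closed form, it has to be computed numerically. The following sketch assumes (our assumption, not a quotation of (11) or (17)) that the density is (L0 + 2πt)/(L0 R + πR^2) on [0, R], so that the derivative of the average log-likelihood is the mean of 1/(L0 + 2πD_i) minus 1/(L0 + πR), and it locates the root of this function by plain bisection. The search interval, the number of iterations and the function names are arbitrary choices of this example, and bisection is used here only for simplicity.

```python
import math
import random

def sample_distances_unit_disk(n, R, rng):
    """Distances to the unit disk from uniform points on the outer band."""
    return [math.sqrt(rng.uniform(1.0, (1.0 + R) ** 2)) - 1.0 for _ in range(n)]

def score(L0, distances, R):
    """Derivative with respect to L0 of the average log-likelihood
    under the assumed density (L0 + 2*pi*t) / (L0*R + pi*R^2)."""
    avg = math.fsum(1.0 / (L0 + 2.0 * math.pi * d) for d in distances) / len(distances)
    return avg - 1.0 / (L0 + math.pi * R)

def mle_bisection(distances, R, lo=1e-9, hi=1e6, iters=60):
    """Root of the score on [lo, hi]; fails if the score has no sign change there."""
    if not (score(lo, distances, R) > 0.0 > score(hi, distances, R)):
        raise ValueError("the score does not change sign on [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid, distances, R) > 0.0:
            lo = mid  # the likelihood is still increasing at mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(3)
R = 1.0
distances = sample_distances_unit_disk(50_000, R, rng)
L0_mle = mle_bisection(distances, R)
print(f"maximum likelihood estimate: {L0_mle:.3f}   true perimeter: {2.0 * math.pi:.3f}")
```

The explicit check of the sign change matters in practice: for some samples the score may fail to vanish on any bounded interval, in which case no finite solution of the likelihood equation is returned.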
The three-dimensional case
We first establish the basic model to be considered in the inference. This is done in the following result, which is the analog of Proposition 2 for the three-dimensional case. Again, we will provide two alternative expressions for the density of the random variable , the distance to from a random uniformly chosen on . The set is assumed to belong to the class of compact sets in with polynomial volume given by
The above defined random variable “distance to the boundary”, , is absolutely continuous with density function
This density can be alternatively expressed as
and, for , is the density function of a random variable , where is uniform on , has a distribution and is .
Again, expression (22) can be interpreted in geometric terms: if we think, to fix ideas, that is a polyhedron, then , and would represent, respectively, the densities of the distances of those points whose projections are inside a face, on an edge and on a vertex.
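The three-component structure can be derived in a few lines once a cubic volume function is fixed. The display below is a sketch under the assumption, made here for illustration and written in our own notation, that on [0, R] the volume function is $V(r) = \mu(S) + L_0 r + M r^2 + \tfrac{4\pi}{3} r^3$; the symbols $\Delta$, $L_0$ and $M$ are labels introduced for this example.

```latex
% Assumption (not quoted from the paper): on [0, R],
%   V(r) = \mu(S) + L_0 r + M r^2 + \tfrac{4\pi}{3} r^3 .
\begin{align*}
\Delta &= V(R) - V(0) = L_0 R + M R^2 + \tfrac{4\pi}{3} R^3, \\
f(t) &= \frac{V'(t)}{\Delta}
      = \frac{L_0 + 2 M t + 4\pi t^2}{\Delta}, \qquad 0 \le t \le R, \\
f(t) &= \frac{L_0 R}{\Delta}\cdot\frac{1}{R}
      + \frac{M R^2}{\Delta}\cdot\frac{2t}{R^2}
      + \frac{\tfrac{4\pi}{3} R^3}{\Delta}\cdot\frac{3t^2}{R^3}.
\end{align*}
```

Here $1/R$, $2t/R^2$ and $3t^2/R^3$ are the densities on $[0, R]$ of a uniform variable and of $R$ times a Beta(2, 1) and a Beta(3, 1) variable, respectively, and the three weights add up to one. They are all non-negative, so that the expression is a genuine mixture, whenever $M \ge 0$, as is the case for convex bodies.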
Now, the main results concerning the moment estimators of and are summarized in the following statement.
Under the assumption (20), we have that the estimators of and by the method of moments from a sample of the distance variable are
where and are the sample means of and , respectively. Moreover, if we denote
with and , where is the covariance matrix of the vector . The elements of are
Proof: Some elementary calculations lead to
The estimators and are then obtained as the solutions of the system of equations , .
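The system of moment equations is linear in the two unknowns, which makes it easy to solve explicitly. The sketch below assumes, for illustration only and in our own notation, the density (L0 + 2Mt + 4πt^2)/(L0 R + M R^2 + (4π/3)R^3) on [0, R]; multiplying each moment equation by the common denominator yields a 2 x 2 linear system in (L0, M), solved here by Cramer's rule. The unit ball, for which L0 = 4π and M = 4π, is used as a check; all function names are hypothetical.

```python
import math
import random

def population_moments(L0, M, R):
    """First two moments of D under the assumed three-dimensional density."""
    denom = L0 * R + M * R ** 2 + (4.0 * math.pi / 3.0) * R ** 3
    m1 = (L0 * R ** 2 / 2.0 + 2.0 * M * R ** 3 / 3.0 + math.pi * R ** 4) / denom
    m2 = (L0 * R ** 3 / 3.0 + M * R ** 4 / 2.0 + 4.0 * math.pi * R ** 5 / 5.0) / denom
    return m1, m2

def moment_estimators_3d(m1, m2, R):
    """Solve the two moment equations, which are linear in (L0, M)."""
    a11, a12 = m1 * R - R ** 2 / 2.0, m1 * R ** 2 - 2.0 * R ** 3 / 3.0
    a21, a22 = m2 * R - R ** 3 / 3.0, m2 * R ** 2 - R ** 4 / 2.0
    b1 = math.pi * R ** 4 - m1 * (4.0 * math.pi / 3.0) * R ** 3
    b2 = 4.0 * math.pi * R ** 5 / 5.0 - m2 * (4.0 * math.pi / 3.0) * R ** 3
    det = a11 * a22 - a12 * a21
    if det == 0.0:
        raise ValueError("singular moment system")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

# Check with exact moments: unit ball, L0 = M = 4*pi.
R = 1.0
m1, m2 = population_moments(4.0 * math.pi, 4.0 * math.pi, R)
L0_exact, M_exact = moment_estimators_3d(m1, m2, R)

# Same computation from simulated distances to the unit ball.
rng = random.Random(5)
d = [rng.uniform(1.0, (1.0 + R) ** 3) ** (1.0 / 3.0) - 1.0 for _ in range(200_000)]
s1 = sum(d) / len(d)
s2 = sum(x * x for x in d) / len(d)
L0_hat, M_hat = moment_estimators_3d(s1, s2, R)
print(f"exact moments: L0 = {L0_exact:.4f}, M = {M_exact:.4f}")
print(f"sample moments: L0 = {L0_hat:.4f}, M = {M_hat:.4f}")
```

In this example the determinant of the system is small (1/2520 for the unit ball with R = 1), so the estimates obtained from sample moments fluctuate much more than the sample moments themselves.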
With the notation introduced for , we have
Performing a Taylor expansion for at and denoting , we obtain
We only need to show . This follows from the fact that is a function of differentiability class two in a neighborhood of . Indeed, we have
To check the continuity of the second-order derivatives we only have to see that the denominators are not null at