Topological Analysis of Nerves, Reeb Spaces, Mappers, and Multiscale Mappers
Data analysis often concerns not only the space where data come from, but also various types of maps attached to data. In recent years, several related structures have been used to study maps on data, including Reeb spaces, mappers and multiscale mappers. The construction of these structures also relies on the so-called nerve of a cover of the domain.
In this paper, we aim to analyze the topological information encoded in these structures in order to provide better understanding of these structures and facilitate their practical usage.
More specifically, we show that the one-dimensional homology of the nerve complex $N(\mathcal{U})$ of a path-connected cover $\mathcal{U}$ of a domain $X$ cannot be richer than that of the domain $X$ itself. Intuitively, this result means that no new $H_1$-homology class can be “created” under a natural map from $X$ to the nerve complex $N(\mathcal{U})$. Equipping $X$ with a pseudometric $d$, we further refine this result and characterize the $H_1$-homology classes of $X$ that may survive in the nerve complex using the notion of size of the covering elements in $\mathcal{U}$. These fundamental results about nerve complexes then lead to an analysis of the $H_1$-homology of Reeb spaces, mappers and multiscale mappers.
The analysis of $H_1$-homology groups unfortunately does not extend to higher dimensions. Nevertheless, by using a map-induced metric, establishing a Gromov-Hausdorff convergence result between mappers and the domain, and interleaving relevant modules, we can still analyze the persistent homology groups of (multiscale) mappers to establish a connection to Reeb spaces.
Data analysis often concerns not only the space where data come from, but also various types of information attached to data. For example, each node in a road network can contain information about the average traffic flow passing through this point, and a node in a protein-protein interaction network can be associated with biochemical properties of the proteins involved. Such information attached to data can be modeled as maps defined on the domain of interest; note that the maps are not necessarily real-valued, and the co-domain can be a more general space. Hence understanding data benefits from analyzing maps relating two spaces rather than a single space with no map on it.
In recent years, several related structures have been used to study general maps on data, including Reeb spaces [munch, DW13, reeb-space, MW16], mappers (and variants) [CO16, CS14, mapper] and multiscale mappers [DMW16]. More specifically, given a map $f: X \to Z$ defined on a topological space $X$, the Reeb space $R_f$ w.r.t. $f$ (first studied for piecewise-linear maps in [reeb-space]) is a generalization of the so-called Reeb graph for a scalar function, which has been used in various applications [BGSF08]. It is the quotient space of $X$ w.r.t. an equivalence relation that asserts two points of $X$ to be equivalent if they have the same function value and are connected to each other via points of the same function value. All equivalent points are collapsed into a single point in the Reeb space. Hence $R_f$ provides a way to view $X$ from the perspective of $f$.
The Mapper structure, originally introduced in [mapper], can be considered as a further generalization of the Reeb space. Given a map $f: X \to Z$, it also considers a cover $\mathcal{U}$ of the co-domain $Z$ that enables viewing the structure of $f$ at a coarser level. Intuitively, the equivalence relation between points in $X$ is now defined by whether points are within the same connected component of the pre-image of a cover element $U \in \mathcal{U}$. Instead of a quotient space, the mapper takes the nerve complex of the cover of $X$ formed by the connected components of the pre-images of all elements in $\mathcal{U}$ (i.e., the cover formed by those equivalent points). Hence the mapper structure provides a view of $X$ from the perspective of both $f$ and a cover of the co-domain $Z$.
Finally, both the Reeb space and the mapper structures provide a fixed snapshot of the input map $f$. As we vary the cover $\mathcal{U}$ of the co-domain $Z$, we obtain a family of snapshots at different granularities. The multiscale mapper [DMW16] describes the sequence of the mapper structures as one varies the granularity of the cover of $Z$ through a sequence of covers of $Z$ connected via cover maps.
New work. While these structures are meaningful in that they summarize the information contained in data, there has not been any qualitative analysis of the precise information encoded by them, with the only exceptions being [CO16] and [GGP16]. (Carrière and Oudot [CO16] analyzed a certain persistence diagram of mappers induced by a real-valued function, and provided a characterization for it in terms of the persistence diagram of the corresponding Reeb graph. Gasparovic et al. [GGP16] provide a full description of the persistent homology information encoded in the intrinsic Čech complex, a special type of nerve complex, of a metric graph.) In this paper, we aim to analyze the topological information encoded by these structures, so as to provide better understanding of these structures and facilitate their practical usage [EH09, survey]. In particular, the construction of the mapper and multiscale mapper uses the so-called nerve of a cover of the domain. To understand the mappers and multiscale mappers, we first provide a quantitative analysis of the topological information encoded in the nerve of a reasonably well-behaved cover for a domain. Given the generality and importance of the nerve complex in topological studies, this result is of independent interest.
More specifically, in Section 3, we first obtain a general result that relates the one-dimensional homology of the nerve complex $N(\mathcal{U})$ of a path-connected cover $\mathcal{U}$ (where each open set contained is path-connected) of a domain $X$ to that of the domain $X$ itself. Intuitively, this result says that no new $H_1$-homology classes can be “created” under a natural map from $X$ to the nerve complex $N(\mathcal{U})$. Equipping $X$ with a pseudometric $d$, we further refine this result and quantify the $H_1$-homology classes of $X$ that may survive in the nerve complex (Theorem LABEL:H1prop-mapper, Section LABEL:sec:persistentH1). This demarcation is obtained via a notion of size of covering elements in $\mathcal{U}$. These fundamental results about nerve complexes then lead to an analysis of the $H_1$-homology classes in Reeb spaces (Theorem LABEL:RS-thm), mappers and multiscale mappers (Theorem LABEL:H1pers-thm). The analysis of $H_1$-homology groups unfortunately does not extend to higher dimensions. Nevertheless, we can still provide an interesting analysis of the persistent homology groups for these structures (Theorem LABEL:thm:MM-ICinterleave, Section LABEL:sec:highD). Along the way, by using a map-induced metric, we establish a Gromov-Hausdorff convergence between the mapper structure and the domain. This offers an alternative to [MW16] for defining the convergence between mappers and the Reeb space, which may be of independent interest.
2 Topological background and motivation
Space, paths, covers. Let $X$ denote a path-connected topological space. Since $X$ is path connected, there exists a path $\gamma: [0,1] \to X$ connecting every pair of points $x, x' \in X$, where $\gamma(0) = x$ and $\gamma(1) = x'$. Let $\Gamma_X(x, x')$ denote the set of all such paths connecting $x$ and $x'$. These paths play an important role in our definitions and arguments.
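By a cover of $X$ we mean a collection $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ of open sets such that $\bigcup_{\alpha \in A} U_\alpha = X$. A cover $\mathcal{U}$ is path connected if each $U_\alpha$ is path connected. In this paper, we consider only path-connected covers.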
Later, to define maps between $X$ and its nerve complexes, we need $X$ to be paracompact, that is, every open cover $\mathcal{U}$ of $X$ has an open refinement $\mathcal{U}'$ such that each point $x \in X$ has an open neighborhood that intersects only finitely many elements of $\mathcal{U}'$. Such a cover $\mathcal{U}'$ is called locally finite. From now on, we assume $X$ to be compact, which implies that it is paracompact too.
Definition 1 (Simplicial complex and maps).
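A simplicial complex $K$ with a vertex set $V$ is a collection of subsets of $V$ with the condition that if $\sigma \subseteq V$ is in $K$, then all subsets of $\sigma$ are in $K$. We denote the geometric realization of $K$ by $|K|$. Let $K$ and $L$ be two simplicial complexes. A map $\phi: K \to L$ is simplicial if for every simplex $\sigma = \{v_1, v_2, \ldots, v_p\}$ in $K$, the simplex $\phi(\sigma) = \{\phi(v_1), \phi(v_2), \ldots, \phi(v_p)\}$ is in $L$.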
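As a concrete illustration of Definition 1, the following minimal Python sketch checks the two conditions on finite data. The function names `is_simplicial_complex` and `is_simplicial_map` are hypothetical helpers introduced here, a complex is assumed to be given as a finite list of vertex tuples, and only nonempty faces are checked (the empty simplex is left implicit).

```python
from itertools import combinations


def is_simplicial_complex(simplices):
    """Check downward closure: every nonempty face of a simplex is present."""
    K = {frozenset(s) for s in simplices}
    for s in K:
        for size in range(1, len(s)):
            for face in combinations(s, size):
                if frozenset(face) not in K:
                    return False
    return True


def is_simplicial_map(phi, K, L):
    """phi is a dict on vertices; the image of each simplex of K must lie in L."""
    L_set = {frozenset(s) for s in L}
    return all(frozenset(phi[v] for v in s) in L_set for s in K)


# A filled triangle K and a single edge L (hypothetical toy data).
K = [("a",), ("b",), ("c",), ("a", "b"), ("b", "c"), ("a", "c"), ("a", "b", "c")]
L = [("x",), ("y",), ("x", "y")]
# Collapse a and b onto x, and send c to y.
phi = {"a": "x", "b": "x", "c": "y"}
```

Note that a simplicial map may lower dimension: here the triangle {a, b, c} is sent to the edge {x, y}, which is allowed because the image is still a simplex of L.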
Definition 2 (Nerve of a cover).
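Given a cover $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ of $X$, we define the nerve of the cover $\mathcal{U}$ to be the simplicial complex $N(\mathcal{U})$ whose vertex set is the index set $A$, and where a subset $\{\alpha_0, \alpha_1, \ldots, \alpha_k\} \subseteq A$ spans a $k$-simplex in $N(\mathcal{U})$ if and only if $U_{\alpha_0} \cap U_{\alpha_1} \cap \cdots \cap U_{\alpha_k} \neq \emptyset$.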
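The nerve construction of Definition 2 can be sketched in a few lines of Python for a finite cover. This is an illustrative sketch only: the function name `nerve`, the optional `max_dim` cutoff, and the representation of each cover element as a finite set of sample points are assumptions made here, and the discretized circle below is hypothetical toy data.

```python
from itertools import combinations


def nerve(cover, max_dim=None):
    """Nerve of a finite cover given as {index: set_of_points}.

    A set of indices spans a simplex iff the corresponding cover
    elements have a common point. Brute force over index subsets,
    so only suitable for small covers.
    """
    indices = list(cover)
    top = len(indices) if max_dim is None else min(len(indices), max_dim + 1)
    simplices = set()
    for size in range(1, top + 1):
        for combo in combinations(indices, size):
            if set.intersection(*(set(cover[a]) for a in combo)):
                simplices.add(frozenset(combo))
    return simplices


# A circle sampled at 12 points (0..11, cyclically), covered by three arcs.
three_arcs = {
    0: set(range(0, 6)),
    1: set(range(4, 10)),
    2: {8, 9, 10, 11, 0, 1},
}
N = nerve(three_arcs)
```

For this cover, each pair of arcs overlaps but no point lies in all three, so the nerve is the boundary of a triangle (three vertices and three edges, no 2-simplex).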
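Maps between covers. Given two covers $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ and $\mathcal{V} = \{V_\beta\}_{\beta \in B}$ of $X$, a map of covers from $\mathcal{U}$ to $\mathcal{V}$ is a set map $\xi: A \to B$ so that $U_\alpha \subseteq V_{\xi(\alpha)}$ for all $\alpha \in A$. By a slight abuse of notation we also use $\xi$ to indicate the map $\mathcal{U} \to \mathcal{V}$. Given such a map of covers, there is an induced simplicial map $N(\xi): N(\mathcal{U}) \to N(\mathcal{V})$, given on vertices by the map $\xi$. Furthermore, if $\mathcal{U} \xrightarrow{\xi} \mathcal{V} \xrightarrow{\zeta} \mathcal{W}$ are three covers of $X$ with the intervening maps of covers between them, then $N(\zeta \circ \xi) = N(\zeta) \circ N(\xi)$ as well. The following simple result is useful.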
Proposition 3 (Maps of covers induce contiguous simplicial maps [DMW16]).
Let $\zeta, \xi: \mathcal{U} \to \mathcal{V}$ be any two maps of covers. Then, the simplicial maps $N(\zeta)$ and $N(\xi)$ are contiguous.
Recall that two simplicial maps $h_1, h_2: K \to L$ are contiguous if for all $\sigma \in K$ it holds that $h_1(\sigma) \cup h_2(\sigma) \in L$. In particular, contiguous maps induce identical maps at the homology level [munkres]. Let $H_k(\cdot)$ denote the $k$-dimensional homology of the space in its argument. This homology is singular or simplicial depending on whether the argument is a topological space or a simplicial complex, respectively. All homology groups in this paper are defined over a fixed coefficient field. Proposition 3 implies that the map $H_k(N(\mathcal{U})) \to H_k(N(\mathcal{V}))$ arising out of a cover map can be deemed canonical.
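To make the statement of Proposition 3 concrete, the sketch below checks it on a small finite example. Everything here is illustrative: the helper names (`nerve`, `is_map_of_covers`, `induced`, `are_contiguous`) and the two covers `U` and `V` of the point set {0, ..., 7} are assumptions introduced for this example, and cover elements are modeled as finite point sets.

```python
from itertools import combinations


def nerve(cover):
    """Nerve of a finite cover given as {index: set_of_points}."""
    idx = list(cover)
    return {
        frozenset(c)
        for k in range(1, len(idx) + 1)
        for c in combinations(idx, k)
        if set.intersection(*(set(cover[a]) for a in c))
    }


def is_map_of_covers(xi, U, V):
    """xi maps indices of U to indices of V with U[a] contained in V[xi[a]]."""
    return all(U[a] <= V[xi[a]] for a in U)


def induced(xi, simplex):
    """Image of a simplex under the simplicial map induced by xi."""
    return frozenset(xi[a] for a in simplex)


def are_contiguous(xi, zeta, NU, NV):
    """For every simplex s, the union of its two images must be a simplex."""
    return all((induced(xi, s) | induced(zeta, s)) in NV for s in NU)


# A fine cover U and a coarse cover V of the points 0..7 (toy data).
U = {"a": {0, 1, 2}, "b": {2, 3, 4}, "c": {4, 5, 6}, "d": {6, 7}}
V = {"p": {0, 1, 2, 3, 4, 5}, "q": {2, 3, 4, 5, 6, 7}}
# Element b fits inside both p and q, so there are two maps of covers.
xi = {"a": "p", "b": "p", "c": "q", "d": "q"}
zeta = {"a": "p", "b": "q", "c": "q", "d": "q"}
```

The two maps differ only in where they send the index b, and `are_contiguous(xi, zeta, nerve(U), nerve(V))` evaluates to true, in line with the proposition.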
3 Surjectivity in $H_1$-persistence
In this section we first establish a map $\phi_{\mathcal{U}}$ between $X$ and the geometric realization $|N(\mathcal{U})|$ of a nerve complex $N(\mathcal{U})$. This helps us to define a map $\phi_{\mathcal{U}*}$ from the singular homology groups of $X$ to the simplicial homology groups of $N(\mathcal{U})$ (through the singular homology of $|N(\mathcal{U})|$). The famous nerve theorem [borsuk, leray] says that if the elements of $\mathcal{U}$ intersect only in contractible spaces, then $\phi_{\mathcal{U}}$ is a homotopy equivalence and hence $\phi_{\mathcal{U}*}$ leads to an isomorphism between the homology groups of $X$ and those of $N(\mathcal{U})$. The contractibility condition can be weakened to a homology ball condition to retain the isomorphism between the two homology groups [leray]. In the absence of such conditions on the cover, simple examples show that $\phi_{\mathcal{U}*}$ need be neither a monomorphism (injection) nor an epimorphism (surjection). Figure LABEL:non-surject-fig gives an example where $\phi_{\mathcal{U}*}$ is not surjective. However, for one-dimensional homology we show that, for any path-connected cover $\mathcal{U}$, the map $\phi_{\mathcal{U}*}$ is necessarily a surjection. One implication of this is that the simplicial maps arising out of cover maps induce a surjection among the one-dimensional homology groups of two nerve complexes.
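The surjectivity claim for one-dimensional homology can be sanity-checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not the construction used in the proofs: the circle is replaced by 12 cyclically ordered sample points, arcs play the role of path-connected cover elements, coefficients are taken in GF(2) purely for convenience, and the helper names `nerve`, `rank_gf2`, and `betti1_gf2` are hypothetical. Since the circle has first Betti number 1, surjectivity implies that the nerve of any such cover has first Betti number at most 1.

```python
from itertools import combinations


def nerve(cover):
    """Nerve of a finite cover given as {index: set_of_points}."""
    idx = list(cover)
    return {
        frozenset(c)
        for k in range(1, len(idx) + 1)
        for c in combinations(idx, k)
        if set.intersection(*(set(cover[a]) for a in c))
    }


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    pivots = {}  # leading bit position -> row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row
                break
            row ^= pivots[lead]  # eliminate the leading bit
    return len(pivots)


def betti1_gf2(simplices):
    """First Betti number: dim C_1 - rank(boundary_1) - rank(boundary_2)."""
    verts = [v for s in simplices if len(s) == 1 for v in s]
    edges = [s for s in simplices if len(s) == 2]
    tris = [s for s in simplices if len(s) == 3]
    vbit = {v: 1 << i for i, v in enumerate(verts)}
    ebit = {e: 1 << i for i, e in enumerate(edges)}
    d1 = [sum(vbit[v] for v in e) for e in edges]
    d2 = [sum(ebit[frozenset(p)] for p in combinations(t, 2)) for t in tris]
    return len(edges) - rank_gf2(d1) - rank_gf2(d2)


# Two covers of the sampled circle {0, ..., 11} by arcs.
three_arcs = {0: set(range(0, 6)), 1: set(range(4, 10)), 2: {8, 9, 10, 11, 0, 1}}
two_arcs = {0: set(range(0, 7)), 1: {6, 7, 8, 9, 10, 11, 0}}
b1_three = betti1_gf2(nerve(three_arcs))
b1_two = betti1_gf2(nerve(two_arcs))
```

With three arcs the nerve is a triangle boundary and the loop survives; with two arcs the nerve is a single edge and the loop is lost. In neither case does the nerve carry more one-dimensional homology than the circle.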