Progressive Wasserstein Barycenters of Persistence Diagrams
Abstract
This paper presents an efficient algorithm for the progressive approximation of Wasserstein barycenters of persistence diagrams, with applications to the visual analysis of ensemble data. Given a set of scalar fields, our approach enables the computation of a persistence diagram which is representative of the set, and which visually conveys the number, data ranges and saliences of the main features of interest found in the set. Such representative diagrams are obtained by computing explicitly the discrete Wasserstein barycenter of the set of persistence diagrams, a notoriously computationally intensive task. \julienRevision\EDIT, for which we introduce. To do so, we build upon existing work on the computation of Fréchet means and on fast approximations of Wasserstein distances for diagrams to design In particular, we revisit efficient algorithms for Wasserstein distance approximation [Bertsekas81, Kerber2016] to extend previous work on barycenter estimation [Turner2014]. We present a new fast algorithm, which progressively approximates the barycenter by iteratively increasing the computation accuracy as well as the number of persistent features in the output diagram. Such a progressivity drastically improves convergence in practice and allows to design an interruptible algorithm, capable of respecting computation time constraints. This enables the approximation of Wasserstein barycenters within interactive times. We present an application to ensemble clustering where we revisit the means algorithm to exploit our barycenters and compute, within execution time constraints, meaningful clusters of ensemble data along with their barycenter diagram. Extensive experiments on synthetic and reallife data sets report that our algorithm converges to barycenters that are qualitatively meaningful with regard to the applications, and quantitatively comparable to previous techniques, while offering an order of magnitude speedup when run until convergence (without time constraint). Our algorithm can be trivially parallelized to provide additional speedups in practice on standard workstations. We provide a lightweight C++ implementation of our approach that can be used to reproduce our results.
1010 \vgtccategoryResearch \vgtcpapertypeTechnique & Algorithm \ieeedoi10.1109/TVCG.2019.2934256 \authorfooter J. Vidal, J. Budin, J. Tierny are with Sorbonne Université and CNRS (LIP6). Email: {jules.vidal, joseph.budin, julien.tierny}@sorbonneuniversite.fr. \shortauthortitleBiv et al.: This is the title \CCScatlist \CCScatK.6.1Management of Computing and Information SystemsProject and People ManagementLife Cycle; \CCScatK.7.mThe Computing ProfessionMiscellaneousEthics \teaser \vgtcinsertpkg
Introduction
In many fields of science and engineering, measurements and simulations are core tools for the understanding of complex physical systems. However, given the geometrical complexity of the resulting data, interactive exploration and analysis can be challenging for users. This motivates the design of expressive data abstractions, capable of concisely capturing the main features of interest in the data, and of visually conveying that information to the user, quickly and effectively.
In that regard, Topological Data Analysis (TDA) [edelsbrunner09] has demonstrated over the last two decades its utility to support interactive visualization tasks [heine16]. In the applications, it robustly and efficiently captures in a generic way the features of interest in scalar data. Examples of successful applications include turbulent combustion [laney_vis06, bremer_tvcg11, gyulassy_ev14], material sciences [gyulassy_vis07, gyulassy_vis15, favelier16, roy19], nuclear energy [beiNuclear16], fluid dynamics [kasten_tvcg11], medical imaging [beiBrain18], chemistry [chemistry_vis14, harshChemistry] or astrophysics [sousbie11, shivashankar2016felix] to name a few. An appealing aspect of TDA is the ease it offers for the translation of domainspecific descriptions of features of interest into topological terms. Moreover, to distinguish noise from features, concepts from Persistent Homology [edelsbrunner02, edelsbrunner09] provide importance measures, which are both theoretically well established and meaningful in the applications. Among the existing abstractions, such as contour trees [carr00], Reeb graphs [pascucci07, biasotti08, tierny_vis09], or MorseSmale complexes [gyulassy_vis08, gyulassy_vis14, Defl15, gyulassy19], the persistence diagram [edelsbrunner02] has been extensively studied. In particular, its conciseness, stability [cohensteiner05] and expressiveness make it an appealing candidate for data summarization tasks. For instance, its applicability as a concise data descriptor has been well studied in machine learning [ReininghausHBK15, Carriere2017, rieck_topoInVis17]. In visualization, it provides visual hints about the number, data range and salience of the features of interest (Fig. 1), which helps users visually apprehend the complexity of their data.
In practice, modern numerical simulations are subject to a variety of input parameters, related to the initial conditions of the system under study, as well as the configuration of its environment. Given recent advances in hardware computational power, engineers and scientists can now densily sample the space of these input parameters, in order to better quantify the sensitivity of the system. For scalar variables, this means that the data which is considered for visualization and analysis is no longer a single field, but a collection, called “ensemble”, of scalar fields representing the same phenomenon, under distinct input conditions and parameters. In this context, extracting the global trends in terms of features of interest in the ensemble is a major challenge.
Although it is possible to compute a persistence diagram for each member of an ensemble, in particular insitu [insitu, AyachitBGOMFM15], this process only shifts the problem from the analysis of an ensemble of scalar fields to an ensemble of persistence diagrams. Then, given such an ensemble of diagrams, the question of estimating a diagram which is representative of the set naturally arises, as such a representative diagram could visually convey to the users the global trends in the ensemble in terms of features of interest. For this, naive strategies could be considered, such as estimating the persistence diagram of the mean of the ensemble of scalar fields. However, given the additive nature of the pointwise mean, this yields a persistence diagram with an incorrect number of features (Fig. 2), which is thus not representative of any of the diagrams of the input scalar fields. To address this issue, a promising alternative consists in considering the barycenter of a set of diagrams, given a distance metric between them, such as the socalled Wasserstein metric [edelsbrunner09], hence the term Wasserstein barycenter. For this, an algorithm has been proposed by Turner et al. [Turner2014]. However, it is based on an iterative procedure, for which each iteration relies itself on a demanding optimization problem (optimal assignment in a weighted bipartite graph [Munkres1957]), which makes it impractical for reallife datasets.
This paper addresses this problem by introducing a fast algorithm for the approximation of discrete Wasserstein barycenters of ensembles of persistence diagrams. We designed our approach by revisiting the core routines involved in this optimization problem with appropriate heuristics, motivated by practical observations. A unique aspect of our approach is its progressive nature: the computation accuracy and the number of features in the output diagram are progressively increased along the optimization. This specificity has two main practical advantages. First, this progressivity drastically accelerates convergence in practice. Second, it enables to formulate an interruptible algorithm, capable of producing meaningful barycenters while respecting computation time constraints. This latter advance is particularly useful, both in an interactive exploration setting (where users need rapid feedback) and in an insitu context (where computation ressources need to be carefully allocated). Extensive experiments on synthetic and reallife data report that our algorithm converges to barycenters that are qualitatively meaningful for the applications, while offering an order of magnitude speedup over the fastest combinations of existing techniques. Our algorithm is \EDITtriviallyeasily parallelizable, which provides additional speedups on commodity workstations. We illustrate the utility of our approach by introducing a clustering algorithm adapted from means which exploits our progressive Wasserstein barycenters. This application enables a meaningful clustering of the members based on their features of interest, within computation time constraints, and provides informative summarizations of the global trends of features found in the ensemble.
0.1 Related work
The literature related to our work can be classified into three main categories, reviewed in the following: (i) uncertainty visualization, (ii) ensemble visualization, and (iii) persistence diagram processing.
Uncertainty visualization: The analysis and visualization of uncertainty in data is a notoriously challenging problem in the visualization community [uncertainty_survey1, unertainty_survey2, uncertainty1, uncertainty2, uncertainty3, uncertainty4]. In this context, the data variability is explicitly modeled by an estimator of the probability density function (PDF) of a pointwise random variable. Several representations have been proposed to visualize the related data uncertainty, either focusing on the entropy of the random variables [uncertainty_entropy], on their correlation [uncertainty_correlation], or gradient variation [uncertainty_gradient]. When geometrical constructions are extracted from uncertain data, their positional uncertainty has to be assessed. For instance, several approaches have been presented for level sets, under various interpolation schemes and PDF models [uncertainty_isosurface1, uncertainty_isosurface2, uncertainty_isosurface3, uncertainty_isosurface4, uncertainty_isocontour1, uncertainty_nonparam, uncertainty_isocontour2, uncertainty_interp, athwale19]. Other approaches addressed critical point positional uncertainty, either for Gaussian [liebmann1, otto1, otto2, petz] or uniform distributions [gunther, bhatia, szymczak]. In general, visualization methods for uncertain data are specifically designed for a given distribution model of the pointwise random variables (Gaussian, uniform, etc.). This challenges their usage with ensemble data, where PDF estimated from empirical observations can follow an arbitrary, unknown model. Moreover, most of the above techniques do not consider multimodal PDF models, which is a necessity when several distinct trends occur in the ensemble.
Ensemble visualization: Another category of approaches has been studied to specifically visualize the main trends in ensemble data. In this context, the data variability is directly encoded by a series of global empirical observations (i.e. the members of the ensemble). Existing visualization techniques typically construct geometrical objects, such as level sets or streamlines, for each member of the ensemble. Then, given this ensemble of geometrical objects, the question of estimating an object which is representative of the ensemble naturally arises. For this, several methods have been proposed, such as spaghetti plots [spaghettiPlot] in the case of levelset variability in weather data ensemble [Potter2009, Sanyal2010], or boxplots for the variability of contours [whitaker2013] and curves in general [Mirzargar2014]. For the specific purpose of trend variability analysis, Hummel et al. [Hummel2013] developed a Lagrangian framework for classification in flow ensembles. Related to our work, clustering techniques have been used to analyze the main trends in ensembles of streamlines[Ferstl2016] and isocontours [Ferstl2016b]. However, only few techniques have focused on applying this strategy to topological objects. Overlapbased heuristics have been investigated to estimate a representative contour tree from an ensemble [Wu2013ACT, Kraus2010VisualizationOU]. Favelier et al. [favelier2018] introduced an approach to analyze critical point variability in ensembles. It relies on the spectral clustering of the ensemble members according to their persistence map, a scalar field defined on the input geometry which characterizes the spatial layout of persistent critical points. However, the clustering stage of this approach takes as an input a distance matrix between the persistence maps of all the members of the ensemble, which requires to mantain them all in memory, which is not conceivable for largescale ensembles counting a high number of members. Moreover, the clustering itself is performed on the spectral embedding of the persistence maps, where distances are loose approximations of the intrinsic metric between these objects. In contrast, the clustering application described in this paper directly operates on the persistence diagrams of the ensemble members, whose memory footprint is orders of magnitude smaller than the actual ensemble data. This makes our approach more practical, in particular in the perspective of an insitu computation [insitu] of the persistence diagrams. Also, it is built on top of our progressive algorithm for Wasserstein barycenters, which focuses on the Wasserstein metric between diagrams. Optionally, our work can also integrate the spatial layout of critical points by considering a geometrically lifted version of this metric [soler2018].
Persistence diagram processing: To define the barycenter of a set of persistence diagrams, a metric (i) first needs to be introduced to measure distances between them. Then, barycenters (ii) can be formally defined as minimizers of the sum of distances to the set of diagrams.
(i) The estimation of distances between topological abstractions has been a longstudied problem, in particular for similarity estimation tasks. Several heuristical approaches have been documented for the fast estimation of structural similarity [hilaga:sig:2001, thomas14, SaikiaSW14_branch_decomposition_comparison, ThomasN13]. More formal approaches have studied various metrics between topological abstractions, such as Reeb graphs [bauer14, BauerMW15] or merge trees [BeketayevYMWH14]. For persistence diagrams, the Bottleneck [cohensteiner05] and Wasserstein distances [monge81, Kantorovich, edelsbrunner09], have been widely studied, for instance in machine learning [cuturi2013] and adapted for kernel based methods [ReininghausHBK15, Carriere2017, rieck_topoInVis17]. The numerical computation of the Wasserstein distance between two persistence diagrams requires to solve an optimal assignment problem between them. The typical methods for this are the exact Munkres algorithm [Munkres1957], or an auctionbased approach [Bertsekas81] that provides an approximate result with improved time performance. Kerber et al. [Kerber2016] specialized the auction algorithm to the case of persistence diagrams and showed how to significantly improve performances by leveraging adequate data structures for proximity queries. Soler et al. [soler2018] introduced a fast extension of the Munkres approach, also taking advantage of the structure of persistence diagrams, in order to solve the assignment problem in an exact and efficient way, using a reduced, sparse and unbalanced cost matrix.
(ii) Recent advances in optimal transport [cuturi2013] enabled the practical resolution of transportation problems between continuous quantities [cuturi2014, solomon2015]. These methods have been successfully applied to persistence diagrams [lacombe2018]. However, this application requires to represent the input diagrams as heat maps of fixed resolution [Adams2015]. Such a rasterization can be interpreted as a prenormalization of the diagrams, which can be problematic for applications where the considered diagrams have different persistence scales, as typically found with timevarying phenomena for instance. Moreover, this approach does not explicitly produce a persistence diagram as an output, but a heat map of the population of persistence pairs in the barycenter. This challenges its usage for visualization applications, as the features of interest in the barycenter cannot be directly inferred from the barycenter heat map. In contrast, our approach produces explicitly a persistence diagram as an output, from which the geometry of the features (their number, data ranges and salience) can be directly visually inspected. Moreover, such an explicit representation also enables the efficient geometrical lifting of the Wasserstein metric [soler2018], which is relevant for scientific visualization applications but which would require to regularly sample a five dimensional space with heat map based approaches. Turner et al. [Turner2014] introduced an algorithm for the computation of a Fréchet mean of a set of persistence diagrams with regard to the Wasserstein metric. This approach provides explicit barycenters, which makes it appealing for the applications. However, its very high computational cost makes it impractical for reallife data sets. In particular, it is based on an iterative procedure, for which each iteration relies itself on optimal assignment problems [Munkres1957] between persistence diagrams (for the Wasserstein distances), where is the number of members in the ensemble. A naive approach to address this computational bottleneck would be to combine this method with the efficient algorithms for Wasserstein distances mentioned previously [Kerber2016, soler2018]. However, as shown in Sec. 5, such \EDITa naivean approach is still computationally expensive and it can require up to hours of computation on certain data sets. In contrast, our algorithm converges in multithreaded mode in a couple of minutes at most, and its progressive nature additionally allows for its interruption within interactive times, while still providing qualitatively meaningful results.
0.2 Contributions
This paper makes the following new contributions:

[leftmargin=1em]

A progressive algorithm for Wasserstein barycenters of persistence diagrams: \julienRevision\EDITWe present We extend existing work on Fréchet means of persistence diagrams [Turner2014] and revisit the fast approximation of Wasserstein distances [Bertsekas81, Kerber2016] in order to provide We revisit efficient algorithms for Wasserstein distance approximation [Bertsekas81, Kerber2016] in order to extend previous work on barycenter estimation [Turner2014]. In particular, we introduce a new \julienRevisionalgorithmapproach based on a progressive approximation strategy, which iteratively refines both computation accuracy and output details. The persistence pairs of the input diagrams are progressively considered in decreasing order of persistence. This focuses the computation towards the most salient features of the ensemble, while considering noisy persistent pairs last. The returned barycenters are explicit and provide insightful visual hints about the features present in the ensemble. \julienRevisionWe show that ourOur progressive strategy drastically accelerates convergence in practice, resulting in an order of magnitude speedup over the fastest combinations of existing techniques. The algorithm is trivially parallelizable, which provides additional speedups in practice on standard workstations. \julienRevision\EDITAdditionally, the progressivity allows to broaden the above algorithm to support computation time constraints, and produce barycenters accounting for the main features of the data within interactive times. We present an interruptible extension of our algorithm to support computation time constraints. This enables to produce barycenters accounting for the main features of the data within interactive times.

An interruptible algorithm for the clustering of persistence diagrams: We extend the above methods to revisit the means algorithm and introduce an interruptible clustering of persistence diagrams, which is used for the visual analysis of the global feature trends in ensembles.

Implementation: We provide a lightweight C++ implementation of our algorithms that can be used for reproduction purposes.
1 Preliminaries
This section presents the theoretical background of our approach. It contains definitions adapted from the Topology ToolKit [ttk17]. It also provides, for self completeness, concise descriptions of the key algorithms [Turner2014, Kerber2016] that our work extends. We refer the reader to Edelsbrunner and Harer [edelsbrunner09] for an introduction to computational topology.
1.1 Persistence diagrams
The input data is an ensemble of piecewise linear (PL) scalar fields defined on a PL manifold , with in our applications. We note the sublevel set of , namely the preimage of by . When continuously increasing , the topology of can only change at specific locations, called the critical points of . In practice, is enforced to be injective on the vertices of [edelsbrunner90], which are in finite number. This guarantees that the critical points of are isolated and also in finite number. Moreover, they are also enforced to be nondegenerate, which can be easily achieved with local remeshing [edelsbrunner03]. Critical points are classified according to their index : 0 for minima, 1 for 1saddles, for saddles, and for maxima. Each topological feature of (i.e. connected component, independent cycle, void) can be associated with a unique pair of critical points , corresponding to its birth and death. Specifically, the Elder rule [edelsbrunner09] states that critical points can be arranged according to this observation in a set of pairs, such that each critical point appears in only one pair such that and . Intuitively, this rule implies that if two topological features of (e.g. two connected components) meet at a critical point , the youngest feature (i.e. created last) dies, favoring the oldest one (i.e. created first). Critical point pairs can be visually represented by the persistence diagram, noted , which embeds each pair to a single point in the 2D plane at coordinates , which respectively correspond to the birth and death of the associated topological feature. The persistence of a pair, noted , is then given by its height . It describes the lifetime in the range of the corresponding topological feature. The construction of the persistence diagram is often restricted to a specific type of pairs, such as pairs or pairs, respectively capturing features represented by minima or maxima. In the following, we will consider that each persistence diagram is only composed of pairs of a fixed critical type , which will be systematically detailed in the experiments. \julienRevisionNote that in practice, each critical point in the diagram additionally stores its 3D coordinates in to allow geometrical lifting (Sec. 1.2).
In practice, critical point pairs often directly correspond to features of interest in the applications and their persistence has been shown to be a reliable importance measure to discriminate noise from important features. In the diagram, the pairs located in the vicinity of the diagonal represent low amplitude noise while salient features will be associated with persistent pairs, standing out far away from the diagonal\EDIT (Fig. 1) . For instance, in Fig. 1, the two hills are captured by two large pairs, while noise is encoded with smaller pairs near the diagonal. Thus, the persistence diagram has been shown in practice to be a useful, concise visual representation of the population of features of interest in the data, for which the number, range values and salience can be directly and visually read. In addition to its stability properties [cohensteiner05], these observations motivate its usage for data summerization, where it significantly help users distinguish salient features from noise. \julienRevisionIn the following, we review distances between persistence diagrams [cohensteiner05, edelsbrunner09] (Sec. 1.2), efficient algorithms for their approximation [Bertsekas81, Kerber2016] (Sec. 1.3), as well as the reference approach for barycenter estimation [Turner2014] (Sec. 1.4).
1.2 Distances between Persistence diagrams
To compute the barycenter of a set of persistence diagrams, a necessary ingredient is the notion of distance between them. Given two diagrams and , a pointwise distance, noted , inspired from the norm, can be introduced in the 2D birth/death space between two points and , with , as follows :
(1) 
By convention, is set to zero if both and exactly lie on the diagonal ( and ). The Wasserstein distance [Kantorovich, monge81], noted , between and can then be introduced as:
(2) 
where is the set of all possible assignments mapping each point to a point , or to \EDITthe closest diagonal pointits projection onto the diagonal, , which denotes the removal of the corresponding feature from the assignment, with a cost \EDIT(see additional materials for an illustration). The Wasserstein distance can be computed by solving an optimal assignment problem, for which existing algorithms [Munkres1957, Morozov:Dionysus] however often require a balanced setting. To address this, the input diagrams and are typically augmented into and , which are obtained by injecting the diagonal projections of all the points of one diagram into the other:
(3)  
(4) 
In this way, the Wasserstein distance is guaranteed to be preserved by construction, , while making the assignment problem balanced () and thus solvable with traditional assignment algorithms. \julienRevisionNote that wWhen , becomes a worst case assignment distance called the Bottleneck distance [cohensteiner05]\julienRevision., which is often interpreted in practice as less informative, however, than the Wasserstein distance. In the following, we will focus on .
In the applications, it can often be useful to geometrically lift the Wasserstein metric, by also taking into account the geometrical layout of critical points [soler2018]. Let be the critical point pair corresponding the point . Let be their linear combination with coefficient in : . Our experiments (Sec. 5) only deal with extrema\EDIT, and we set to for minima and for maxima (to only consider the extremum’s location). Then, the geometrically lifted pointwise distance can be given as:
(5) 
The parameter quantifies the importance \EDITthat is given to the geometry of critical points and it must be tuned on a per application basis. \EDITWhen , the combination of with becomes the classical Earth Mover’s distance [DBLP:conf/iccv/LevinaB01] between critical points in and thus enables users to simply linearly blend these two metrics.
1.3 Efficient distance computation by Auction
This section briefly describes a fast approximation of Wasserstein distances by the auction algorithm [Bertsekas81, Kerber2016], which is central to our method.
The assignment problem involved in Eq. 2 can be modeled in the form of a weighted bipartite graph, where the points are represented as nodes, connected by edges to nodes representing the points of , with an edge weight given by (Eq. 1). To efficiently estimate the optimal assignment, Bertsekas introduced the auction algorithm [Bertsekas81] (Fig. 3), which replicates the behavior of a reallife auction: the points of are acting as bidders that iteratively make offers for the purchase of the points of , known as the objects. Each bidder makes a benefit for the purchase of an object , which is itself labeled with a price , initially set to . During the iterations of the auction, each bidder tries to purchase the object of highest value . The bidder is then said to be assigned to the object . If was previously assigned, its previous owner becomes unassigned. At this stage, the price of is increased by , where is the absolute difference between the two highest values that the bidder found among the objects , and where is a constant. This bidding procedure is repeated iteratively among the bidders, until all bidders are assigned (which is guaranteed to occur by construction, thanks to the constant). At this point, \EDITwe sayit is said that an auction round has completed: a bijective, possibly suboptimal, assignement exists between and . The overall algorithm will repeat auction rounds, which progressively increases prices under the effect of competing bidders.
The constant plays a central role in the auction algorithm. Let be the approximation of the Wasserstein distance , obtained with the assignment returned by the algorithm. Large values of will drastically accelerate convergence (as they imply fewer iterations for the construction of a bijective assignment within one auction round, Fig. 3), while low values will improve the accuracy of . This observation is a key insight at the basis of our approach. Bertsekas suggests a strategy called scaling, which decreases after each auction round. In particular, if:
(6) 
then it can be shown that [Bertsekas91, Kerber2016]:
(7) 
This result is particularly important, as it enables to estimate the optimal assignment, and thus the Wasserstein distance, with an ondemand accuracy (controlled by the parameter ) by using Eq. 6 as a stopping condition for the overall auction algorithm. For persistence diagrams, Kerber et al. showed how the computation could be accelerated by using space partitioning data structures such as kdtrees [Kerber2016]. In practice, \EDITwe set to be initially is initially set to be equal to of the largest edge weight , \EDITwe divide itand is divided by after each auction round, as recommended by Bertsekas [Bertsekas81]\EDIT, and we. \julienRevisionWe use is set to as suggested by Kerber et al. [Kerber2016].
1.4 Wasserstein barycenters of Persistence diagrams
Let be the space of persistence diagrams. The discrete Wasserstein barycenter of a set of persistence diagrams can be introduced as the Fréchet mean of the set, under the metric . It is the diagram that minimizes its distance to all the diagrams of the set (i.e. minimizer of the socalled Fréchet energy):
(8) 
The computation of Wasserstein barycenters involves a computationally demanding optimization problem, for which the existence of at least one locally optimum solution has been shown by Turner et al. [Turner2014], who also introduced the first algorithm for its computation. This algorithm (Alg. 1) consists in iterating a procedure that we call Relaxation (line 3 to 8), which resembles a Lloyd relaxation [lloyd82], and which is composed itself of two subroutines: (i) Assignment (line 5) and (ii) Update (line 7). Given an initial barycenter candidate randomly chosen among the set , the first step ((i) Assignment) consists in computing an optimal assignment between and each diagram of the set , with regard to Eq. 2. The second step ((ii) Update) consists in updating the candidate to a position in which minimizes the sum of its squared distances to the diagrams of under the current set of assignments . In practice, this last step is achieved by \EDITsimply replacing each point by the arithmetic mean (in the birth/death space) of all its assignments . The overall algorithm continues to iterate the Relaxation procedure until the set of optimal assignments remains identical for two consecutive iterations.
1.5 Overview of our approach
The reference algorithm for Wasserstein barycenters (Alg. 1) reveals impractical for reallife data sets. Its main computational bottleneck is the Assignment step, which involves the computation of Wasserstein distances (Eq. 2). However, as detailed in the result section (Sec. 5), even when combined with efficient algorithms for the Wasserstein distance exact computation [soler2018] or even approximation [Kerber2016] (Sec. 1.3), this overall method can still lead to hours of computation in practice.
Thus, a drastically different approach is needed to improve computation efficiency, especially for applications such as ensemble clustering, which require multiple barycenter estimations per iteration.
In this work, we introduce a novel progressive framework for Wasserstein barycenters, which is based on two key observations. First, from one Relaxation iteration to the next (lines 3 to 8, Alg. 1), the estimated barycenter is likely to vary only slightly. This motivates the design of a faster algorithm for the Assignment step, which would be capable of starting its optimization from a relevant initial assignment, typically given in our setting by the previous Relaxation iteration. Second, in the initial Relaxation iterations, the estimated barycenter can be arbitrarily far from the final, optimized barycenter. Thus, for these iterations, it can be beneficial to relax the level of accuracy of the Assignment step, which is the main bottleneck, and to progressively increase it when it is the most needed, as the barycenter converges to a solution.
2 Progressive Barycenters
This section presents our novel progressive framework for the approximation of Wasserstein barycenters of a set of Persistence diagrams . As mentioned above, the key insights of our approach are twofolds. First, the assignments involved in the Wasserstein distance estimations can be reused as initial conditions along the iterations of the barycenter Relaxation (Sec. 3.2). Second, progressivity can be injected in the process to accelerate convergence, by adequately controlling the level of accuracy in the evaluation of the Wasserstein distances along the Relaxation iterations of the barycenter. This progressivity can be injected at two levels: by controlling the accuracy of the distance estimation itself (Sec. 3.3) and the resolution of the input diagrams (Sec. 3.4). The rest of the section presents a parallelization of our framework (Sec. 3.5) and describes an interruptible algorithm, capable of respecting running time constraints (Sec. 3.6).
3 Progressive Barycenters
This section presents our novel progressive framework for the approximation of Wasserstein barycenters of a set of Persistence diagrams\julienRevision..
3.1 Overview
The key insights of our approach are twofolds. First, in the reference algorithm (Alg. 1), from one Relaxation iteration to the next (lines 3 to 8), the estimated barycenter is likely to vary only slightly. \julienRevisionPrevious assignments can thus Thus, the assignments involved in the Wasserstein distance estimations can be reused as initial conditions along the iterations of the barycenter Relaxation (Sec. 3.2). Second, in the initial Relaxation iterations, the estimated barycenter can be arbitrarily far from the final, optimized barycenter. Thus, for these early iterations, it can be beneficial to relax the level of accuracy of the Assignment step, \EDITwhich is the main bottleneck, and to progressively increase it as the barycenter converges to a solution. Progressivity can be injected at two levels: by controlling the accuracy of the distance estimation itself (Sec. 3.3) and the resolution of the input diagrams (Sec. 3.4). Our framework is easily parallelizable (Sec. 3.5) and the progressivity allows to design an interruptible algorithm, capable of respecting running time constraints (Sec. 3.6).
3.2 Auctions with Price Memorization
The Assignment step of the Wasserstein barycenter computation (line 5, Alg. 1) can be resolved in principle with any of the existing techniques for Wasserstein distance estimation [Munkres1957, Morozov:Dionysus, Kerber2016, soler2018]. Among them, the Auction based approach [Bertsekas81, Kerber2016] (Sec. 1.3) is particularly relevant as it can compute very efficiently approximations with ondemand accuracy.
In the following, we consider that each distance computation involves augmented diagrams \EDIT(Sec. 1.2). Each input diagram is then considered as a set of bidders \EDIT(Sec. 1.3) while the output barycenter contains the objects to purchase. Each input diagram maintains its own list of prices for the purchase of the objects by the bidders . The search by a bidder for the two most valuable objects to purchase is accelerated with space partitioning data structures, by using a kdtree and a lazy heap respectively for the off and ondiagonal points [Kerber2016] (these structures are recomputed for each Relaxation). Thus, the output barycenter maintains only one kdtree and one lazy heap for this purpose. Since Wasserstein distances are only approximated in this strategy, we suggest to relax the overall stopping condition (Alg. 1) and stop the iterations after two successive increases in Fréchet energy (Eq. 8), as commonly done in gradient descent optimizations. In the rest of the paper, we call the above strategy the Auction barycenter algorithm [Turner2014]+[Kerber2016], as it \EDITnaivelyjust combines the algorithms by Turner et al. [Turner2014] and Kerber et al. [Kerber2016].
However, this \EDITnaive usage of the auction algorithm results in a complete reboot of the entire sequence of auction rounds upon each Relaxation, while in practice, for the barycenter problem, the output assignments may be very similar from one Relaxation iteration to the next and thus could be reused as initial solutions. For this, we introduce a mechanism that we call Price Memorization, which \EDITsimply consists in initializing the prices for each bidder to the prices obtained at the previous Relaxation iteration (instead of ). This has the positive effect of encouraging bidders to bid in priority on objects which were previously assigned to them, hence effectively reusing the previous assignments as an initial solution. This memorization makes most of the early auction rounds become unnecessary in practice, which enables to drastically reduce their number, as detailed in the following.
3.3 Accuracydriven progressivity
The reference algorithm for Wasserstein barycenter computation (Alg. 1) can also be interpreted as a variant of gradient descent [Turner2014]. For such methods, it is often observed that approximations of the gradient, instead of exact computations, can be sufficient in practice to reach convergence. This observation is at the basis of our progressive strategy. Indeed, in the early Relaxation iterations, the barycenter can be arbitrarily far from the converged result and achieving a high accuracy in the Assignment step (line 5) for these iterations is often a waste of computation time. Therefore we introduce a mechanism that progressively increases the accuracy of the Assignment step along the Relaxation iterations, \EDITto provide the maximumin order to obtain more accuracy near convergence.
To achieve this, inspired by the internals of the auction algorithm, we apply a global scaling \EDIT(Sec. 1.3), where we progressively decrease the value of , but only at the end of each Relaxation. Combined with Price Memorization \EDIT(Sec. 3.2), this strategy enables us to perform only one auction round per Assignment step. As large values accelerate auction convergence at the price of accuracy, this strategy effectively speeds up the early Relaxation iterations and leads to more and more accurate auctions, and thus assignments, along the Relaxation iterations.
In practice, we divide by after each Relaxation, as suggested by Bertsekas [Bertsekas81] in the case of the regular auction algorithm (Sec. 1.3). Moreover, to guarantee precise final barycenters (obtained for small values), we modify the overall stopping condition to prevent the algorithm from stopping if is larger than of its initial value.
Data set  \pbox10cm Sinkhorn  
[lacombe2018]  \pbox10cm Munkres  
[Turner2014]+[soler2018]  \pbox10cmAuction  
[Turner2014]+[Kerber2016]  Ours  Speedup  
Gaussians (Fig. 7)  100  2,078  7,499.33  H  8,975.60  785.53  11.4 
Vortex Street (Fig. 8)  45  14  54.21  0.14  0.47  0.23  0.6 
Starting Vortex (Fig. 9)  12  36  40.98  0.06  0.67  0.28  0.2 
Isabel (3D) (Progressive Wasserstein Barycenters of Persistence Diagrams)  12  1,337  1,070.57  H  377.42  82.95  4.5 
Sea Surface Height (Fig. 10)  48  1,379  4,565.37  H  949.08  75.90  12.5 
3.4 Persistencedriven progressivity
In practice, the persistence diagrams of reallife data sets often contain a very large number of critical point pairs of low persistence. These numerous small pairs correspond to noise and are often meaningless for the applications. However, although they individually have only little impact on Wasserstein distances (Eq. 2), their overall contributions may be nonnegligible. To account for this, we introduce in this section a persistencedriven progressive mechanism, which progressively inserts in the input diagrams critical point pairs of decreasing persistence. This focuses the early Relaxation iterations on the most salient features of the data, while considering the noisy ones last. In practice, this encourages the optimization to explore more relevant local minima of the Fréchet energy (Eq. 8) that favor persistent features.
Given an input diagram , let be the subset of its points with persistence higher than : . To account for persistencedriven progressivity, we run our barycenter algorithm (with Price Memorization, Sec. 3.2, and accuracydriven progressivity, Sec. 3.3) by initially considering as an input the diagrams . After each Relaxation iteration (Alg. 2, line 10), we decrease such that does not increase by more than 10% (to progress at uniform speed) and such that does not get smaller than (we set to replicate locally Bertsekas’s suggestion for setting, Sec. 1.3). Initially, is set to half of the maximum persistence found in the input diagrams. Along the Relaxation iterations, the input diagrams are progressively populated, which yields the introduction of new points in the barycenter , which we initialize at locations selected uniformly among the newly introduced points of the inputs. This strategy enables to distribute among the inputs the initialization of the new barycenter points. The corresponding prices are initialized with the minimum price found for the objects at the previous iteration.
3.5 Parallelism
Our progressive framework can be trivially parallelized as the most computationally demanding task, the Assignment step (Alg. 2), is independent for each input diagram . The space partitioning data structures used for proximity queries to are accessed independently by each bidder diagram. Thus, we parallelize our approach by running the Assignment step in independent threads.
3.6 Computation time constraints
Our persistencedriven progressivity (Sec. 3.4) focuses the early iterations of the optimization on the most salient features, while considering the noisy ones last. However, as discussed before, low persistence pairs in the input diagrams are often considered as meaningless in the applications. This means that our progressive framework can in principle be interrupted before convergence and still provide a meaningful result.
Let be a user defined time constraint. We first progressively introduce points in the input diagrams and perform the Relaxation iterations for the first of the time constraint , as described in Sec. 3.4. At this point, the optimized barycenter contains only a fraction of the points it would have contained if computed until convergence. To guarantee a precise output barycenter, we found that continuing the Relaxation iterations for the remaining of the time, without introducing new persistence pairs, provided the best results. \julienRevisionNote that iIn practice, in most of our experiments, we observed that this second optimization part fully converged even before reaching of the computation time constraint. Alg. 2 summarizes our overall approach for Wasserstein barycenters of persistence diagrams, with price memorization, progressivity, parallelism and time constraints. \julienRevisionFor clarity, we provide in the additional materials a toy example of the unfolding of our algorithm. Several iterations of our algorithm are illustrated on a toy example in additional material.
4 Application to Ensemble Topological Clustering
This section presents an application of our progressive framework for Wasserstein barycenters of persistence diagrams to the clustering of the members of ensemble data sets. Since it focuses on persistence diagrams, this strategy enables to group together ensemble members that have the same topological profile, hence effectively highlighting the main trends found in the ensemble in terms of features of interest.
The means is a popular algorithm for the clustering of the elements of a set, if distances and barycenters can be estimated on this set. The latter is efficiently computable for persistence diagrams thanks to our novel progressive framework and we detail in the following how to revisit the means algorithm to make it use our progressive barycenters \EDIT as estimates of the centroids of the clusters.
Data set  1 thread  8 threads  Speedup  

Gaussians (Fig. 7)  100  2,078  785.53  117.91  6.6 
Vortex Street (Fig. 8)  45  14  0.23  0.10  2.3 
Starting Vortex (Fig. 9)  12  36  0.28  0.19  1.5 
Isabel (3D) (Progressive Wasserstein Barycenters of Persistence Diagrams)  12  1,337  82.95  31.75  2.6 
Sea Surface Height (Fig. 10)  48  1,379  75.90  19.40  3.9 
Data set  \pbox10cmAuction  

[Turner2014]+[Kerber2016]  Ours  Ratio  
Gaussians (Fig. 7)  100  2,078  39.4  39.0  0.99 
Vortex Street (Fig. 8)  12  36  415.1  412.5  0.99 
Starting Vortex (Fig. 9)  45  14  112,787.0  112,642.0  1.00 
Isabel (3D) (Progressive Wasserstein Barycenters of Persistence Diagrams)  12  1,337  2,395.6  2,337.1  0.98 
Sea Surface Height (Fig. 10)  48  1,379  7.2  7.1  0.99 
The means is an iterative algorithm, which highly resembles barycenter computation algorithms (Sec. 1.4), where each Clustering iteration is composed itself of two subroutines: (i) Assignment and (ii) Update. Initially, cluster centroids () are initialized on diagrams of the input set . For this, in practice, we use the means++ heuristic described by Celebi et al. [celebi13], which aims at maximizing the distance between centroids. Then, the Assignment step consists in assigning each diagram to its closest centroid . This implies the computation, for each diagram of its Wasserstein distance to all the centroids . For this step, we estimate these pairwise distances with the Auction algorithm run until convergence (, Sec. 1.3). In practice, we use the accelerated means strategy described by Elkan [elkan03], which exploits the triangle inequality between centroids to skip, given a diagram , the computation of its distance to centroids which cannot be the closest. Next, the Update step consists in updating each centroid’s location by placing it at the barycenter (with Alg. 2) of its assigned diagrams . The algorithm continues these Clustering iterations until convergence, i.e. until the assignments between the diagrams and the centroids do not evolve anymore, hence yielding the final clustering.
From our experience, the Update step of a Clustering iteration is by far the most computationally expensive. To speed up this stage in practice, we derive a strategy that is similar to our approach for barycenter approximation: we reduce the computation load of each Clustering iteration and progressively increase their accuracy along the optimization. This strategy is motivated by a similar observation: early centroids are quite different from the converged ones, which motivates an accuracy reduction in the early Clustering iterations of the algorithm. Thus, for each Clustering iteration, we use a single round of auction with price memorization (Sec. 3.2), and a single barycenter update (i.e. a single Relaxation iteration, Alg. 2). Overall, only one global scaling (Sec. 3.3) is applied at the end of each Clustering iteration. This enhances the means algorithm with accuracy progressivity. If a diagram migrates from a cluster to a cluster , the prices of the objects of for the bidders of are initialized to and we run the auction algorithm between and until the pairwise value matches the global value, in order to obtain prices for which are comparable to the other diagrams. Also, we apply persistencedriven progressivity (Sec. 3.4) by adding persistence pairs of decreasing persistence in each diagram along the Clustering iterations. Finally, a computation time constraint can also be provided, as described in Sec. 3.6. Results of our clustering scheme are presented in Sec. 5.3.
5 Results
This section presents experimental results obtained on a ècomputer with two Xeon CPUs (3.0 GHz, 2x4 cores), with 64GB of RAM. The input persistence diagrams were computed with \julienRevisionthe algorithm by Gueunet et al. [gueunet_ldav17], which is available in the Topology ToolKit (TTK) [ttk17]. the FTM algorithm [gueunet_ldav17, ttk17]. We implemented our approach in C++, as TTK modules.
Our experiments were performed on a variety of simulated and acquired 2D and 3D ensembles, \julienRevision\EDIT., exactly the same as those used by Favelier et al. [favelier2018]. taken from Favelier et al. [favelier2018]. The Gaussians ensemble contains 100 2D synthetic noisy members, with 3 patterns of Gaussians (Fig. 7). The \julienRevisionconsidered features of interest in this example are the maxima\julienRevision, hence only the diagrams including the persistence pairs are considered.. The Vortex Street ensemble (Fig. 8) includes 45 runs of a 2D simulation of flow turbulence behind an obstacle. The considered scalar field is the curl orthogonal component, for 5 fluids of different viscosity\julienRevision(9 runs per fluid, each run with varying Reynolds numbers).. In this application, salient extrema are typically considered as reliable estimations of the center of vortices. Thus, each run is represented by two diagrams, processed independently by our algorithms: one for the pairs (involving minima) and one for the pairs (involving maxima). The Starting Vortex ensemble (Fig. 9) includes 12 runs of a 2D simulation of the formation of a vortex behind a wing, for 2 distinct wing configurations. The considered data is also the curl orthogonal component and diagrams involving minima and maxima are also considered. The Isabel data set (Progressive Wasserstein Barycenters of Persistence Diagrams) is a volume ensemble of 12 members, showing key time steps (formation, drift and landfall) in the simulation of the Isabel hurricane [scivisIsabel]. In this example, the eyewall of the hurricane is typically characterized by high wind velocities, well captured by velocity maxima. Thus we only consider diagrams involving maxima. Finally, the Sea Surface Height ensemble (Fig. 10) is composed of 48 observations taken in January, April, July and October 2012 (https://ecco.jpl.nasa.gov/products/all/). \julienRevisionTheHere, the features of interest are the center of eddies, which can be reliably estimated with height extrema. Thus, both the diagrams involving the minima and maxima are considered and independently processed by our algorithms. Unless stated otherwise, all results were obtained by considering the Wasserstein metric based on the original pointwise metric (Eq. 1) without geometrical lifting (i.e. , Sec. 1.2).
5.1 Time performance
\julienRevisionWe first evaluate the time performance of our progressive framework when run until convergence (i.e. no computation time constraint). The corresponding timings are reported in Tab. 1. Tab. 1 evaluates the time performance of our progressive framework when run until convergence (i.e. no computation time constraint). This table also provides running times for 3 alternatives. The column, Sinkhorn, provides the timings obtained with a Python CPU implementation kindly provided by Lacombe et al. [lacombe2018], for which we used the recommended parameter values (entropic term: heat map resolution: ). Note that this approach casts the problem as an Eulerian transport optimization under an entropic regularization term. Thus, it optimizes for a convex functional which is considerably different from the Fréchet energy considered in our approach (Eq. 8). Overall, these aspects, in addition to the difference in programming language, challenge direct comparisons and we only report running times for completeness. The columns Munkres, noted [Turner2014]+[soler2018], and Auction, noted [Turner2014]+[Kerber2016], report the running times of our own C++ implementation of Turner’s algorithm [Turner2014] where distances are respectively estimated with the exact method by Soler et al. [soler2018] \julienRevision(available in TTK [ttk17]) and our own C++ implementation of the auctionbased approximation by Kerber et al. [Kerber2016] (with kdtree and lazy heap, run until convergence, ).
As predicted, the cubic time complexity of the Munkres algorithm makes it impractical for barycenter estimation, as the computation completed within 24 hours for only two ensembles\julienRevision(Vortex Street and Starting Vortex), where the diagrams are particularly small.. The Auction approach is more practical but still requires up to hours to converge for the largest data sets. In contrast, our approach converges in sequential in less than 15 minutes at most. The column Speedup reports the gain obtained with our method against the fastest of the two explicit alternatives, Munkres or Auction. For ensembles of realistic size, this speedup is about an order of magnitude. \julienRevisionOur approach can be trivially parallelized by running the Assignment step (Alg. 2) in independent threads (Sec. 3.5). We implemented this strategy with OpenMP and the results are reported in Tab. 2. As reported in Tab. 2, our approach can be trivially parallelized with OpenMP by running the Assignment step (Alg. 2) in independent threads (Sec. 3.5). As the size of the input diagrams may strongly vary within an ensemble, this trivial parallelization may result in load imbalance among the threads, impairing parallel efficiency. In practice, this strategy still provides reasonable speedups, bringing the computation down to a couple of minutes at most.
5.2 Barycenter quality
Tab. 3 compares the Fréchet energy (Eq. 8) of the converged barycenters for our method and the Auction barycenter alternative [Turner2014]+[Kerber2016] \julienRevision(the Wasserstein distances between the results of the two approaches are provided in additional material for further details). \julienRevisionIn particular, tThe Fréchet energy has been precisely evaluated with an estimation of Wasserstein distances based on the Auction algorithm run until convergence (). While the actual values for this energy are not specifically relevant (because of various data ranges), the ratio between the two methods indicates that the local minima approximated by both approaches are of comparable numerical quality, with a variation of in energy at most. Fig. 4 provides a visual comparison of the converged Wasserstein barycenters obtained with the Auction barycenter alternative [Turner2014]+[Kerber2016] and our method, for one cluster of each of our data sets (Sec. 5.3). This figure shows that differences are barely noticeable and only involve pairs with low persistence, which are often of small interest in the applications.
Fig. 5 \julienRevisiondetailscompares the convergence rates of the Auction barycenter [Turner2014]+[Kerber2016](red) \julienRevisionandto three variants of our framework: without (blue) and with (orange) persistence progressivity and with time computation constraints (green\julienRevision, complete computations for increasing time constraints). \julienRevisionFor the interruptible version of our algorithm (green), the energy is reported after each computation, for increasing time constraints. \julienRevisionThis plotIt indicates that our approach based on Price Memorization and single auction round \julienRevision(Secs. 3.2, 3.3, blue (blue) already substantially accelerates convergence (the first iteration\julienRevisionafter initialization, dashed, is performed with a large and thus induces a high energy). Interestingly, persistencedriven progressivity (orange) provides the most important gains in convergence speed. The number of Relaxation iterations is larger for our approach (, orange) than for the Auction barycenter method (, red), which emphasizes the low computational effort of each of our iterations. Finally, when the Auction barycenter method completed its first Relaxation iteration (leftmost red point), our persistencedriven progressive algorithm already achieved 80% of its iterations, resulting in a Fréchet energy almost twice smaller. The quality of the barycenters obtained with the interruptible version of our approach (Sec. 3.6) is illustrated in Figs. Progressive Wasserstein Barycenters of Persistence Diagrams and 6 for varying time constraints. As predicted, features of decreasing persistence progressively appear in the diagrams, while the most salient features are accurately represented for very small constraints, allowing for reliable estimations within interactive times (below a second).
5.3 Ensemble visual analysis with Topological Clustering
In the following, we systematically set a time constraint of 10 seconds. To facilitate the reading of the diagrams, each pair with a persistence smaller than of the function range is shown in transparent white, to help visually discriminate salient features from noise. Fig. 7 shows the clustering of the Gaussians ensemble by our approach. This synthetic ensemble exemplifies the motivation for the geometrical lifting \julienRevisionintroduced in Sec. 1.2(Sec. 1.2). The first and third clusters both contain a single Gaussian, resulting in diagrams with a single persistent feature, but located in drastically different areas of the domain . Thus, the diagrams of these two clusters would be indistinguishable for the clustering algorithm if geometrical lifting was not considered. If feature location is important for the application, our approach can be adjusted thanks to geometrical lifting (Sec. 1.2). For the Gaussians ensemble, this makes our clustering approach compute the correct clustering. Moreover, taking the geometry of the critical points into account allows us to represent in the extrema involved in the Wasserstein barycenters (spheres, scaled by persistence, Fig. 7) which allows user to have a visual feedback in the domain of the features representative of the set of scalar fields. Geometrical lifting is particularly important in applications where feature location bears a meaning, such as the Isabel ensemble (Progressive Wasserstein Barycenters of Persistence Diagrams(f)). For this example, our clustering algorithm with geometrical lifting automatically identifies the right clusters, corresponding to the three states of the hurricane (formation, drift and landfall). For the remaining examples, geometrical lifting was not necessary (). For the Vortex Street ensemble (Fig. 8), our approach manages to automatically identify the correct clusters, precisely corresponding to the distinct viscosity regimes found in the ensemble. Note that the centroids computed by our topological clustering algorithm with a time constraint of 10 seconds (left) are visually indistinguishable from the Wasserstein barycenters of each cluster, computed after the fact with our progressive algorithm run until convergence (right). This indicates that the centroids provided by our topological clustering are reliable and can be used to visually represent the features of interest in the ensemble. In particular, for the Vortex Street example, these centroids enable the clear identification of the number and salience of the vortices: pairs which align horizontally and vertically respectively denote minima and maxima of flow vorticity, which respectively correspond to clockwise and counterclockwise vortices. Fig. 9 presents our results on the Starting Vortex, where our approach also automatically identifies the correct clustering, corresponding to two wing configurations. In this example, the difference in turbulence (number and strength of vortices) can be directly and visually read from the centroids returned by our algorithm (insets). Finally, Fig. 10 shows our results for the Sea Surface Height, where our topological clustering automatically identifies four clusters, corresponding to the four seasons: winter (top left), spring (top right), summer (bottom left), fall (bottom right). As shown in the insets, each season leads to a visually distinct centroid diagram. In this example, as diagrams are larger, differences between the interrupted centroids (left) and the converged barycenters (right) become noticeable. However, these differences only involve pairs of small persistence, whose contribution to the final clustering reveal negligible in practice.
Comparing with the approach from Favelier et al. [favelier2018], on the exact same data, we obtain similar qualitative results, with correct clusterings found on all datasets. Performancewise, our timeconstrained approach finds the correct clustering within ten seconds for a large ensemble dataset such as Sea Surface Height (in addition to 35 seconds to compute the persistence diagrams) against several hundreds of seconds for the parallel persistence atlas approach. Overall, our approach provides the same clustering results than Favelier et al. [favelier2018]: the returned clusterings are correct for both approaches, for all of the above data sets. However, once the input persistence diagrams are available, our algorithm computes within a time constraint of ten seconds only, while the approach by Favelier et al. requires up to hundreds of seconds (on the same hardware) to compute intermediate representations (Persistence Maps) which are not needed in our work.
5.4 Limitations
In our experiments, we focused on persistence diagrams which only involve extrema, as these often directly translate into features of interest in the applications. Although our approach can consider other types of persistence pairs (e.g. saddlesaddle pairs in 3D), from our experience, the interpretation of these structures is not obvious in practice and future work is needed to improve the understanding of these pairs in the applications. Thanks to the assignments computed by our algorithm, the extrema of the output barycenter can be embedded in the original domain (Fig. 7 to 10). However, in practice a given barycenter extremum can be potentially assigned with extrema which are distant from each other in the ensemble members, resulting in its placement at an inbetween location which may not be relevant for the application. Regarding the Fréchet energy, our experiments confirm the proximity of our approximated barycenters to \julienRevisionactual local minima (Fig. 4\julienRevision, Tab. 3). \julienRevisionHowever, theoretical proximity bounds to these minima are difficult to formulate and we leave this for future work. \julienRevisionHowever, aAlso, as it is the case for the original algorithm by Turner et al. [Turner2014], there is no guarantee that \julienRevisionthis solution is a global minimizerour solutions are global minimizers. For the clustering, we observed that the initialization of the means algorithm had a major impact on its outcome but we found that the means++ heuristic [celebi13] provided excellent results in practice. Finally, when the geometrical location of features in the domain has a meaning for the applications, the geometrical lifting coefficient (Sec. 1.2) must be manually adjusted by the user on a per application basis, which involves a trial and error process. However, our interruptible approach greatly helps in this process, as users can perform such adjustments at interactive rates.
6 Conclusion
In this paper, we presented an algorithm for the progressive approximation of Wasserstein barycenters of Persistence diagrams, with applications to the visual analysis of ensemble data. \julienRevision\EDITOur approach takes advantage of previous results on the computation of Fréchet means for persistence diagrams [Turner2014], and revisits the fast approximation of Wasserstein distance [Kerber2016] to induce progressivity. Our approach revisits efficient algorithms for Wasserstein distance approximation [Bertsekas81, Kerber2016] in order to specifically extend previous work on barycenter estimation [Turner2014]. Our experiments showed that our strategy drastically accelerates convergence and reported an order of magnitude speedup against previous work, while providing barycenters which are quantitatively \julienRevisioncomparable and visually similar. and visually comparable. The progressivity of our approach \julienRevisionallowedallows for the definition of an interruptible algorithm, enabling the estimation of reliable barycenters within interactive times. We presented an application to \julienRevisionthe topological clustering of ensemble data, ensemble data clustering, where the obtained centroid diagrams provided key visual insights about the global feature trends of the ensemble.
A natural direction for future work is the extension of our framework to other topological abstractions, such as Reeb graphs or MorseSmale complexes. However, the question of defining a relevant, and importantly, computable metric between these objects is still an active research debate. Our framework provides only approximations of Wasserstein barycenters. In the future, it would be useful to study the convergence of these approximations from a theoretical point of view. Although we have focused on scientific visualization applications, our framework can be used mostly asis for persistence diagrams of more general data, such as filtrations of high dimensional point clouds. \julienRevisionIn that context, other applications than clustering will also be investigated. Moreover, we believe our progressive strategy for Wasserstein barycenters can also be used for more general inputs than persistence diagrams, such as generic point clouds, as long as an importance measure substituting persistence is available, which would significantly enlarge the spectrum of applications of the ideas presented in this paper.