General transport problems with branched minimizers as functionals of 1-currents with prescribed boundary
A prominent model for transportation networks is branched transport, which seeks the optimal transportation scheme to move material from a given initial to a final distribution. The cost of the scheme encodes a higher transport efficiency the more mass is moved together, which automatically leads to optimal transportation networks with a hierarchical branching structure. The two major existing model formulations, either using mass fluxes (vector-valued measures) or patterns (probabilities on the space of particle paths), are rather different. Once their equivalence was established, the analysis of optimal networks could rest on both.
The transportation cost of classical branched transport is a fractional power of the transported mass, and several model properties and proof techniques build on its strict concavity. We generalize the model and its analysis to the most general class of reasonable transportation costs, essentially increasing, subadditive functions. This requires several modifications or new approaches. In particular, for the equivalence between mass flux and pattern formulation it turns out to be advantageous to resort to a description via 1-currents, an intuition that Xia already exploited. In addition, some existing arguments are given a more concise and perhaps simpler form. The analysis includes the well-posedness, a metrization and a length space property of the model cost, the equivalence between the different model formulations, as well as a few network properties.
Keywords: optimal transport, optimal networks, branched transport, irrigation, urban planning, Wasserstein distance, geometric measure theory, currents
2010 MSC: 49Q20, 49Q10, 90B10
The classical theory for cost-efficient transportation of an amount of material from a given initial to a given final mass distribution is the theory of optimal transport, first suggested by Monge in the 18th century and substantially developed by Kantorovich in the mid 20th century. The optimal transport problem assigns the minimum possible transport cost to each pair of initial and final mass distribution. Here, the movement of an amount of mass from a position $x$ to a position $y$ contributes a transportation cost which is proportional to the transported mass and which may depend in a rather general way on $x$ and $y$.
In the setting of classical optimal transport, all mass particles move independently of each other. If, however, the transportation cost is only subadditive in the transported mass, which models an efficiency gain if mass is transported in bulk, then material particles start interacting, and an optimal transportation scheme will move the particles along paths that together form ramified network structures. This can be used to model, for instance, biological networks such as vascular systems in plants and animals. Two such models, in which the transportation cost (per transport distance) is a fractional power $w^\alpha$, $\alpha \in (0,1)$, of the transported mass $w$, are the so-called branched transport models by Xia [Xia03] and by Maddalena, Morel, and Solimini [MSM03]. The model by Maddalena, Morel, and Solimini employs a Lagrangian formulation based on irrigation patterns that describe the position of each mass particle $p$ at time $t$ by $\chi(p,t)$. The model by Xia on the other hand uses an Eulerian formulation in which only the flux of particles is described, discarding its dependence on the time variable $t$. The equivalence between both model formulations was shown in [MS09, MS13]; a comprehensive reference is the monograph [BCM09].
In this work we generalize both branched transport models and their analysis by replacing the transportation cost $w \mapsto w^\alpha$ by a more general subadditive transportation cost $\tau$ as described below. We furthermore provide a model description in terms of 1-currents, which in the special case of branched transport had already been conceived by Xia [Xia04]. The powerful tools of geometric measure theory help to gain a more intuitive understanding of the models and greatly reduce the effort in comparing different model formulations. Of course, this comes at the cost of introducing the geometric-measure-theoretic machinery, building on classical as well as more recent results by White [Whi99a], Smirnov [Smi93], Šilhavý [Š07] or Colombo et al. [CDRMS17]. The motivation for generalizing the choice $\tau(w) = w^\alpha$ comes from work by the current authors [BW16] (and the subsequent studies [BRW16, BW17]) where it is shown that the so-called urban planning model (introduced in [BB05]) can be formulated in the same setting as branched transport, just using a different (no longer strictly concave) transportation cost.
1.1 General transportation costs
We will be concerned with transportation costs of the following form.
Definition 1.1 (Transportation cost).
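The transportation cost is a function $\tau : [0,\infty) \to [0,\infty)$ such that
$\tau(0) = 0$ and $\tau(w) > 0$ for $w > 0$,
$\tau$ is non-decreasing,
$\tau$ is subadditive, that is, $\tau(w_1 + w_2) \le \tau(w_1) + \tau(w_2)$ for all $w_1, w_2 \ge 0$,
$\tau$ is lower semi-continuous.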
The transportation cost $\tau(w)$ has the interpretation of the cost per transport distance for transporting an amount of mass $w$. Note that any non-decreasing concave function $\tau : [0,\infty) \to [0,\infty)$ with $\tau(0) = 0$ (and not identically zero) can be chosen as a transportation cost. As mentioned above, the generalization of the choice $\tau(w) = w^\alpha$ to more general subadditive costs is motivated by [BW16], which is concerned with a different model for transport networks, the urban planning model, and which provides a model formulation analogous to branched transport, just with a different transportation cost. The most well-known choices for $\tau$ are summarized in the following example, and with this article we advocate the use of more general $\tau$ that may be tailored to applications.
Example 1.2 (Transportation cost).
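Classical optimal transport, branched transport, urban planning, and a variant of the Steiner problem can be retrieved using
the Wasserstein cost $\tau(w) = a w$ for some $a > 0$,
the branched transport cost $\tau(w) = w^\alpha$ for some $\alpha \in (0,1)$,
the urban planning cost $\tau(w) = \min(a w, w + \varepsilon)$ for some $a > 1$, $\varepsilon > 0$, or
the discrete cost $\tau(w) = 1$ if $w > 0$ and $\tau(0) = 0$.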
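The defining properties of a transportation cost can be probed numerically on a grid of sample masses. The following is a minimal sketch, assuming the closed forms $aw$, $w^\alpha$, $\min(aw, w+\varepsilon)$ and the 0/1 cost for the four examples; the function names, the parameter values, and the sampling grid are illustrative choices and not part of the article. A sampled check can only refute a property, it cannot prove it.

```python
import itertools

# Assumed closed forms of the four example costs; the parameter values
# (a, alpha, eps) are illustrative choices, not taken from the article.
def wasserstein_cost(w, a=1.0):
    return a * w

def branched_cost(w, alpha=0.5):
    return w ** alpha

def urban_planning_cost(w, a=2.0, eps=0.5):
    return min(a * w, w + eps)

def discrete_cost(w):
    return 1.0 if w > 0 else 0.0

def is_nondecreasing(tau, masses, tol=1e-12):
    """Check tau(u) <= tau(v) for consecutive sample masses u <= v."""
    ordered = sorted(masses)
    return all(tau(u) <= tau(v) + tol for u, v in zip(ordered, ordered[1:]))

def is_subadditive(tau, masses, tol=1e-12):
    """Check tau(u + v) <= tau(u) + tau(v) for all sampled pairs (u, v)."""
    return all(tau(u + v) <= tau(u) + tau(v) + tol
               for u, v in itertools.product(masses, repeat=2))

MASSES = [i / 20 for i in range(61)]  # sample masses 0, 0.05, ..., 3
COSTS = {
    "wasserstein": wasserstein_cost,
    "branched": branched_cost,
    "urban planning": urban_planning_cost,
    "discrete": discrete_cost,
}

for name, tau in COSTS.items():
    ok = (tau(0) == 0
          and is_nondecreasing(tau, MASSES)
          and is_subadditive(tau, MASSES))
    print(f"{name}: passes sampled checks = {ok}")
```

A superadditive function such as $w \mapsto w^2$ fails the sampled subadditivity check, which illustrates why it is excluded as a transportation cost.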
Note that the properties required in Definition 1.1 are dictated by first principles: Transporting mass has a positive cost, where the cost increases with increasing mass. Furthermore, the transport cost only jumps when the maximum capacity of a transportation means is reached (for instance, if a lorry is fully loaded), which implies lower semi-continuity. In addition, the transportation cost is subadditive, since there is always the option of splitting the mass into several parts and transporting those separately. A direct consequence is that the average transportation cost per particle is bounded below as follows.
Lemma 1.3 ([Laa62, Thm. 5 and its proof]).
Let be a transportation cost and define
for . Then for all .
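As a hedged illustration of how subadditivity alone yields a linear lower bound of this kind, consider the following sketch. The notation $\lambda(m)$ and the choice of the interval $[m/2, m]$ are assumptions made for this sketch and need not coincide with the formulation in [Laa62].

```latex
% Sketch: a linear lower bound for a subadditive cost \tau on [0,m].
% \lambda(m) and the interval [m/2, m] are notation assumed for this sketch.
\[
  \lambda(m) := \inf\Bigl\{ \tfrac{\tau(v)}{v} \,:\, v \in [\tfrac{m}{2}, m] \Bigr\},
  \qquad m > 0 .
\]
% Step 1: for 0 < w \le m choose k \in \mathbb{N} with k w \in [m/2, m]:
%   k = 1 if w \ge m/2, and k = \lfloor m/w \rfloor otherwise
%   (then m - w < k w \le m and w < m/2, hence k w > m/2).
% Step 2: subadditivity, applied k-1 times, gives \tau(k w) \le k\,\tau(w).
% Step 3: combine both steps:
\[
  \tau(w) \;\ge\; \frac{\tau(k w)}{k}
          \;=\; \frac{\tau(k w)}{k w}\, w
          \;\ge\; \lambda(m)\, w
  \qquad \text{for all } w \in (0, m],
\]
% and the inequality is trivial for w = 0 since \tau(0) = 0.
```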
In principle, the condition of lower semi-continuity may be dropped; however, in the mass flux-based model formulation this would lead to the same model as taking the lower semi-continuous envelope of $\tau$, while in the pattern-based formulation this would lead to non-existence of optimal networks.
An important feature of our generalization is that now also non-concave and non-strictly subadditive transportation costs are allowed, which will complicate the analysis in parts but covers cases of interest such as urban planning. As such, this work contains a mixture of arguments from [Xia03, MSM03, BCM05, MS09, MS13, BW16], all transferred into this more general setting and in several places streamlined. In particular, any reference to the specific form of the transportation cost (which is exploited in all of the above works) is eliminated.
1.2 Summary of main results
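Given two probability measures $\mu_+$ and $\mu_-$ on $\mathbb{R}^n$, denoting the material source and the sink, the Eulerian model formulation will describe the mass transport from $\mu_+$ to $\mu_-$ via a vector-valued measure $\mathcal{F}$ (a so-called mass flux) with $\operatorname{div}\mathcal{F} = \mu_+ - \mu_-$, which represents the material flux through each point (cf. Definition 2.1). If $\mathcal{F}$ can be represented as a weighted directed graph $G$, whose edge weight $w(e)$ represents the mass flux through edge $e$, then its cost function will be defined as (cf. Definition 2.2)
\[ \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}] = \sum_{e} \tau(w(e))\, l(e), \]
where $\tau$ is the transportation cost and $l(e)$ the length of edge $e$.
Otherwise, $\mathcal{F}$ will be approximated in an appropriate sense by sequences of graphs $G_k$ transporting $\mu_+^k$ to $\mu_-^k$, and its cost will be defined via relaxation as
\[ \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}] = \inf\Bigl\{ \liminf_{k\to\infty} \mathcal{J}^{\tau,\mu_+^k,\mu_-^k}[\mathcal{F}_{G_k}] \,:\, (\mu_+^k, \mu_-^k, \mathcal{F}_{G_k}) \stackrel{*}{\rightharpoonup} (\mu_+, \mu_-, \mathcal{F}) \Bigr\}. \]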
The Lagrangian model on the other hand will describe the mass transport via a so-called irrigation pattern $\chi$ with $\chi(p,t)$ being the position of mass particle $p$ at time $t$ (cf. Definition 3.1). Its cost will essentially be defined as
\[ \mathcal{J}^{\tau,\mu_+,\mu_-}[\chi] = \int\!\!\int \frac{\tau(m_\chi(\chi(p,t)))}{m_\chi(\chi(p,t))}\, |\dot\chi(p,t)| \,\mathrm{d}t\,\mathrm{d}p, \]
where integration is with respect to the Lebesgue measure and $m_\chi(x)$ denotes the total amount of all mass particles travelling through $x$ (cf. Definition 3.3). One is interested in the following.
Problem 1.4 (Flux and pattern optimization problem).
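Given $\mu_+$ and $\mu_-$, the problems of finding an associated optimal mass flux and irrigation pattern are
\[ \min_{\mathcal{F}} \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}] \quad\text{and}\quad \min_{\chi} \mathcal{J}^{\tau,\mu_+,\mu_-}[\chi], \]
where the minimization runs over mass fluxes $\mathcal{F}$ and irrigation patterns $\chi$ between $\mu_+$ and $\mu_-$, respectively, and $\mathcal{J}^{\tau,\mu_+,\mu_-}$ denotes the corresponding cost.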
We show (in different order)
that the metric is induced by shortest paths in the space of probability measures (cf. Corollary 2.27),
that the optimal mass flux decomposes into for a countably -rectifiable set , , and a diffuse part (cf. Proposition 2.32) such that the cost turns into
and that the optimal irrigation pattern has only on a countably -rectifiable set and its cost is (cf. Proposition 3.19)
In several places the key idea is to reinterpret the Eulerian formulation as an optimization problem on 1-currents with prescribed boundary. Along the way some properties of transportation networks and the models themselves are shown; for instance, unlike in branched transport it is in general no longer true that optimal transportation networks have a tree-like structure. Our analysis also serves as a preparation for a further study in which we will introduce yet another model formulation quite different from the ones considered here and very much akin to the original formulation of urban planning.
Throughout the article, we will use the following notation.
$\mathcal{L}^m$ denotes the $m$-dimensional Lebesgue measure.
$\mathcal{H}^m$ denotes the $m$-dimensional Hausdorff measure.
denotes the set of nonnegative finite Borel measures on . Notice that these measures are countably additive and also regular by [Rud87, Thm. 2.18]. The total variation measure of is defined as . The total variation norm of then is .
denotes the set of -valued regular countably additive measures on . The total variation measure of and its total variation norm are and , respectively.
Weak-* convergence of measures is indicated by $\stackrel{*}{\rightharpoonup}$.
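The support $\operatorname{spt}\mu$ of a (nonnegative or vector-valued) measure $\mu$ is the smallest closed set $B$ with $|\mu|(\mathbb{R}^n \setminus B) = 0$.
The restriction $\mu \llcorner A$ of a (nonnegative or vector-valued) measure $\mu$ to a measurable set $A$ is the measure defined by $(\mu \llcorner A)(B) = \mu(A \cap B)$ for all measurable sets $B$.
The pushforward $T_\#\mu$ of a measure $\mu$ on $X$ under a measurable map $T : X \to Y$ is the measure defined by $T_\#\mu(B) = \mu(T^{-1}(B))$ for all measurable sets $B$.
The Dirac mass in $x \in \mathbb{R}^n$ is the measure $\delta_x$ with $\delta_x(A) = 1$ if $x \in A$ and $\delta_x(A) = 0$ else.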
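The Wasserstein-1-metric $W_1(\mu,\nu)$ between two measures $\mu, \nu$ of equal mass is defined as $W_1(\mu,\nu) = \sup\{ \int f \,\mathrm{d}(\mu - \nu) \,:\, f \text{ Lipschitz with constant} \le 1 \}$. It metrizes weak-* convergence on the space of nonnegative finite Borel measures with equal mass.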
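$I = [0,1]$ denotes the unit interval.
$\mathrm{AC}(I)$ denotes the set of absolutely continuous functions $f : I \to \mathbb{R}^n$.
$\mathrm{Lip}(I)$ denotes the set of Lipschitz functions $f : I \to \mathbb{R}^n$.
The characteristic function of a set $A$ is defined as $1_A(x) = 1$ if $x \in A$ and $1_A(x) = 0$ else.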
and denote the vector spaces of -vectors and alternating -linear forms in the vector space , respectively [Fed69, 1.3-4]. In detail, is the quotient space of the -fold tensor product of with respect to the identification for all , and is its dual. If is equipped with an inner product, then an inner product of two -vectors is defined by , which induces a norm on . The corresponding operator norm on is also denoted .
$C^k$ (with varying domain and range specifications) denotes the vector space of $k$ times boundedly and continuously differentiable functions with norm $\|\cdot\|_{C^k}$ being the supremum over the domain of all absolute derivatives up to order $k$. For instance, $C^0$ denotes the space of bounded continuous functions.
$C_c$ and $C_c^\infty$ denote the set of continuous and smooth functions with compact support, respectively.
The article is organized as follows. Section 2 introduces the Eulerian model formulation and examines properties of optimal transportation networks. Furthermore, the metrization and the length space property are shown in that section. Section 3 then introduces the Lagrangian model formulation via irrigation patterns, and the equivalence between Eulerian and Lagrangian formulation is proved in Section 4.
2 Eulerian model for transportation networks
We start by recapitulating the model formulation due to Xia [Xia03] and subsequently analyse its properties in our more general setting. In large parts, though not in all, we can follow the original arguments by Xia.
2.1 Model definition
Here, transportation networks are described with the help of graphs. First only transport between discrete mass distributions is considered, and then general transportation problems are obtained via a relaxation technique.
Definition 2.1 (Mass flux).
A discrete finite mass shall be a nonnegative measure of the form $\mu = \sum_{i=1}^k a_i \delta_{x_i}$ with $k \in \mathbb{N}$, $a_i > 0$, $x_i \in \mathbb{R}^n$.
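Let $\mu_+, \mu_-$ be discrete finite masses with $\mu_+(\mathbb{R}^n) = \mu_-(\mathbb{R}^n)$. A discrete mass flux between $\mu_+$ and $\mu_-$ is a weighted directed graph $G$ with vertices $V(G)$, straight edges $E(G)$, and edge weight function $w : E(G) \to [0,\infty)$ such that the following mass preservation conditions hold,
\[ \mu_+(\{v\}) + \sum_{e \in E(G),\, e^- = v} w(e) = \mu_-(\{v\}) + \sum_{e \in E(G),\, e^+ = v} w(e) \quad \text{for all } v \in V(G) \cup \operatorname{spt}\mu_+ \cup \operatorname{spt}\mu_-, \tag{2.1} \]
where $e^+$ and $e^-$ denote the initial (source) and final (sink) point of edge $e$.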
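The flux associated with a discrete mass flux $G$ is given by
\[ \mathcal{F}_G = \sum_{e \in E(G)} w(e)\, \mu_e, \]
where every edge $e \in E(G)$ with direction $\hat e = \frac{e^- - e^+}{|e^- - e^+|}$ (pointing from its initial point $e^+$ to its final point $e^-$) was identified with the vector measure $\mu_e = (\mathcal{H}^1 \llcorner e)\, \hat e$. Equation (2.1) is equivalent to $\operatorname{div}\mathcal{F}_G = \mu_+ - \mu_-$ (in the distributional sense).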
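A vector measure $\mathcal{F}$ is a mass flux between two nonnegative measures $\mu_+$ and $\mu_-$ (also known as transport path), if there exist sequences of discrete finite masses $\mu_+^k$, $\mu_-^k$ with $\mu_+^k \stackrel{*}{\rightharpoonup} \mu_+$, $\mu_-^k \stackrel{*}{\rightharpoonup} \mu_-$, and a sequence of fluxes $\mathcal{F}_{G_k}$ with $\mathcal{F}_{G_k} \stackrel{*}{\rightharpoonup} \mathcal{F}$, $\operatorname{div}\mathcal{F}_{G_k} = \mu_+^k - \mu_-^k$. A sequence $(\mu_+^k, \mu_-^k, \mathcal{F}_{G_k})$ satisfying these properties is called an approximating graph sequence, and we write $(\mu_+^k, \mu_-^k, \mathcal{F}_{G_k}) \stackrel{*}{\rightharpoonup} (\mu_+, \mu_-, \mathcal{F})$. Note that $\operatorname{div}\mathcal{F} = \mu_+ - \mu_-$ follows by continuity with respect to weak-* convergence.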
In the above, $\mu_+$ has the interpretation of the initial material distribution or mass source, while $\mu_-$ represents the final distribution or sink. The edge weight $w(e)$ indicates the amount of mass flowing along edge $e$ so that (2.1) expresses mass conservation on the way from initial to final distribution. Indeed, (2.1) implies that the Dirac locations of $\mu_+$ and $\mu_-$ form vertices as well and that at every other vertex the total mass influx equals the total outflux. Thus, a mass flux essentially encodes how the mass moves from $\mu_+$ to $\mu_-$, and it can be associated with a cost.
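Mass conservation at the vertices of a discrete mass flux can be checked mechanically. The following is a minimal sketch, assuming a graph stored as a list of `(tail, head, weight)` triples and source and sink stored as dictionaries from points to masses; this representation and the function names are hypothetical choices made for illustration.

```python
from collections import defaultdict

def net_outflux(edges):
    """Return a dict vertex -> (total outflux - total influx)."""
    balance = defaultdict(float)
    for tail, head, weight in edges:
        balance[tail] += weight   # mass leaving the initial point
        balance[head] -= weight   # mass arriving at the final point
    return dict(balance)

def is_discrete_mass_flux(edges, source, sink, tol=1e-12):
    """Check: source mass + influx == sink mass + outflux at every point."""
    balance = net_outflux(edges)
    points = set(balance) | set(source) | set(sink)
    return all(
        abs(balance.get(p, 0.0) - (source.get(p, 0.0) - sink.get(p, 0.0))) <= tol
        for p in points
    )

# Y-shaped network: one source of mass 1, two sinks of mass 1/2 each.
source = {(0.0, 0.0): 1.0}
sink = {(2.0, 1.0): 0.5, (2.0, -1.0): 0.5}
y_edges = [
    ((0.0, 0.0), (1.0, 0.0), 1.0),
    ((1.0, 0.0), (2.0, 1.0), 0.5),
    ((1.0, 0.0), (2.0, -1.0), 0.5),
]
print(is_discrete_mass_flux(y_edges, source, sink))
```

Changing a single edge weight breaks the balance at the branching point, so the check then fails.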
Definition 2.2 (Cost functional).
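Given a transportation cost $\tau$, the cost function of a discrete mass flux $G$ between $\mu_+$ and $\mu_-$ is
\[ \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_G] = \sum_{e \in E(G)} \tau(w(e))\, l(e), \]
where $l(e)$ is the length of edge $e$.
The cost function of a mass flux $\mathcal{F}$ between $\mu_+$ and $\mu_-$ is defined as
\[ \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}] = \inf\Bigl\{ \liminf_{k\to\infty} \mathcal{J}^{\tau,\mu_+^k,\mu_-^k}[\mathcal{F}_{G_k}] \,:\, (\mu_+^k, \mu_-^k, \mathcal{F}_{G_k}) \stackrel{*}{\rightharpoonup} (\mu_+, \mu_-, \mathcal{F}) \Bigr\}. \tag{2.3} \]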
We furthermore abbreviate
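Given $\mu_+, \mu_-$, the transport problem is to find the solution of
\[ d_\tau(\mu_+,\mu_-) = \min_{\mathcal{F}} \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}], \]
where $d_\tau$ is called cost distance.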
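The cost of a discrete mass flux is a finite sum of transportation cost times edge length, which makes the efficiency gain from bulk transport easy to observe. The following is a minimal sketch, assuming edges stored as `(tail, head, weight)` triples and the cost forms $w^{1/2}$ and $w$; the two example networks and all names are hypothetical choices made for illustration.

```python
import math

def flux_cost(edges, tau):
    """Sum of tau(weight) * edge length over all edges of the graph."""
    return sum(tau(weight) * math.dist(tail, head)
               for tail, head, weight in edges)

# Source of mass 1 at the origin, two sinks of mass 1/2 each.
# V shape: two separate straight edges from the source to the sinks.
v_shaped = [
    ((0.0, 0.0), (2.0, 1.0), 0.5),
    ((0.0, 0.0), (2.0, -1.0), 0.5),
]
# Y shape: the mass travels jointly to (1, 0) and branches there.
y_shaped = [
    ((0.0, 0.0), (1.0, 0.0), 1.0),
    ((1.0, 0.0), (2.0, 1.0), 0.5),
    ((1.0, 0.0), (2.0, -1.0), 0.5),
]

def branched(w):
    return w ** 0.5   # strictly concave: bulk transport pays off

def linear(w):
    return w          # cost proportional to mass: no gain from bundling

for name, tau in [("branched", branched), ("linear", linear)]:
    print(name, "V:", flux_cost(v_shaped, tau), "Y:", flux_cost(y_shaped, tau))
```

For the concave cost the Y-shaped network is cheaper although its total edge length is larger, whereas for the linear cost the two straight edges win. The branching point $(1,0)$ is not optimized here; it only serves to exhibit the effect.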
We close this section with two lemmas showing that we may always restrict ourselves to probability measures with support on . Therefore, throughout the article and without loss of generality we will assume any source or sink to lie in
Lemma 2.3 (Mass rescaling).
Let with , and let be a transportation cost and a mass flux. We have
In particular, is a valid transportation cost. Furthermore, in (2.3) we may restrict to approximating graph sequences with .
That represents a valid transportation cost is straightforward to check. Likewise, it is easy to see that there is a one-to-one relation between approximating graph sequences and approximating graph sequences via and (the latter means that all edge weights are divided by ). Furthermore, , which together with the above directly implies the first statement.
As for the last statement, consider an approximating graph sequence and set . Due to the continuity of with respect to weak-* convergence we have as . Now it is straightforward to see that another valid approximating graph sequence is obtained as if and else. However, .∎
Lemma 2.4 (Domain rescaling).
Let with , and let be a transportation cost and a mass flux. We have
Furthermore, in (2.3) we may restrict to approximating graph sequences with .
Again there is a one-to-one relation between approximating graph sequences and approximating graph sequences via and (the latter means that all vertex coordinates and edges are rescaled by ). Furthermore, , which implies the first statement.
As for the last statement, it is straightforward to see that for any approximating graph sequence we may project the Dirac locations of and the vertices of orthogonally onto , resulting in a modified approximating graph sequence with non-greater cost. Indeed, the edge lengths (and thus also the cost functional) are at most decreased, and still holds after the modification. ∎
2.2 Existence of minimizers and their properties
We will see that under certain growth conditions there will always be an optimal mass flux between any two measures with bounded support. To this end we first show that optimal discrete mass fluxes never have cycles.
Lemma 2.5 (Acyclicity of discrete mass fluxes).
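For any discrete mass flux $G$ there exists an acyclic discrete mass flux $\tilde G$ with same initial and final measure and $\mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_{\tilde G}] \le \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_G]$.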
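Suppose that there is a single cycle $C \subset E(G)$, that is, a loop of edges with consistent direction and positive weight. For $\lambda = \min_{e \in C} w(e)$ consider the graph $G_\lambda$ whose edge weights are given by
\[ w_\lambda(e) = \begin{cases} w(e) - \lambda & \text{if } e \in C, \\ w(e) & \text{else.} \end{cases} \]
Note that the initial and final measure of $G_\lambda$ are the same as of $G$ and that $G_\lambda$ no longer contains a cycle, since one edge in $C$ has weight $0$ and can thus be removed. By the monotonicity of $\tau$ we have $\tau(w_\lambda(e)) \le \tau(w(e))$ for all edges $e$ so that
\[ \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_{G_\lambda}] = \sum_{e \in E(G)} \tau(w_\lambda(e))\, l(e) \le \sum_{e \in E(G)} \tau(w(e))\, l(e) = \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_G]. \]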
In case of multiple cycles we just repeat this procedure until all cycles are removed. ∎
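The cycle removal used in this proof can be sketched as a small procedure: find a directed cycle of positive weight, subtract its minimal weight from all its edges, drop the edges that reach weight zero, and repeat. The following is a minimal sketch, assuming a graph stored as a dictionary from `(tail, head)` pairs to weights; the depth-first search and all names are illustrative choices and not taken from the article.

```python
def find_cycle(weights):
    """Return the edges of one directed cycle of positive weight, or None."""
    successors = {}
    for (tail, head), w in weights.items():
        if w > 0:
            successors.setdefault(tail, []).append(head)
    state = {}  # vertex -> "open" (on the current path) or "done"

    def visit(vertex, path):
        state[vertex] = "open"
        path.append(vertex)
        for nxt in successors.get(vertex, []):
            if state.get(nxt) == "open":
                # Back edge found: cut the cycle out of the current path.
                cycle = path[path.index(nxt):] + [nxt]
                return list(zip(cycle, cycle[1:]))
            if nxt not in state:
                found = visit(nxt, path)
                if found:
                    return found
        path.pop()
        state[vertex] = "done"
        return None

    for start in list(successors):
        if start not in state:
            found = visit(start, [])
            if found:
                return found
    return None

def remove_cycles(weights):
    """Subtract the minimal weight along each cycle until none is left."""
    weights = dict(weights)
    while True:
        cycle = find_cycle(weights)
        if cycle is None:
            return {e: w for e, w in weights.items() if w > 0}
        reduction = min(weights[e] for e in cycle)
        for e in cycle:
            weights[e] -= reduction  # one edge drops to exactly zero

graph = {("A", "B"): 1.0, ("B", "C"): 1.5, ("C", "B"): 0.5, ("C", "D"): 1.0}
print(remove_cycles(graph))
```

Every pass removes at least one edge, so the loop terminates. No edge weight is ever increased and the net flux at every vertex is unchanged, so for a non-decreasing transportation cost the cost of the resulting graph cannot exceed that of the original one.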
For completeness, let us at this point also prove a stronger property of optimal discrete mass fluxes in case of concave transportation costs , namely their tree structure. By tree we shall here understand a directed graph such that from any vertex to any other vertex there exists at most one path consistent with the edge orientations. Note that with this convention a tree may be composed of multiple disjoint trees.
Lemma 2.6 (Tree structure of discrete mass fluxes).
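For any discrete mass flux $G$ and concave transportation cost $\tau$ there exists a tree $\tilde G$ with same initial and final measure and $\mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_{\tilde G}] \le \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_G]$.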
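Suppose that there is a subset $L \subset E(G)$ that forms a loop (not necessarily with consistent edge orientation), and choose an orientation. Let $L_+ \subset L$ be the subset of edges with same orientation and $L_- = L \setminus L_+$. Assume that the loop orientation was chosen so that $\sum_{e \in L_+} l(e)\, \tau'(w(e)) \le \sum_{e \in L_-} l(e)\, \tau'(w(e))$ (else reverse the orientation), where $\tau'$ shall denote an element of the supergradient of $\tau$. Next, for $\lambda = \min_{e \in L_-} w(e)$ consider the graph $G_\lambda$ whose multiplicity is given by
\[ w_\lambda(e) = \begin{cases} w(e) + \lambda & \text{if } e \in L_+, \\ w(e) - \lambda & \text{if } e \in L_-, \\ w(e) & \text{else.} \end{cases} \]
Note that the initial and final measure of $G_\lambda$ are the same as of $G$ and that $G_\lambda$ no longer contains the loop, since one edge in $L_-$ has weight $0$ and can thus be removed. By the concavity of $\tau$ we have $\tau(w(e) \pm \lambda) \le \tau(w(e)) \pm \lambda\, \tau'(w(e))$ so that
\[ \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_{G_\lambda}] - \mathcal{J}^{\tau,\mu_+,\mu_-}[\mathcal{F}_G] \le \lambda \Bigl( \sum_{e \in L_+} l(e)\, \tau'(w(e)) - \sum_{e \in L_-} l(e)\, \tau'(w(e)) \Bigr) \le 0. \]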
In case of multiple loops we just repeat this procedure until all loops are removed so that the resulting graph has a tree structure. ∎
Remark 2.7 (Strict concavity).
If $\tau$ is strictly concave, the same proof shows that every optimal discrete mass flux must have a tree structure.
Remark 2.8 (Necessity of concavity).
If is not concave, Lemma 2.6 is false, and optimal discrete mass fluxes with tree structure may not exist. Indeed, for and let
as well as as illustrated in Fig. 1. Note that we choose and such that and there is with (Fig. 1 right). In that case, only two tree topologies are possible, displayed in Fig. 1 left. The first one has cost , while the second one has larger cost if is small enough due to its longer edges and . However, the nontree discrete mass flux has the strictly smaller cost .
As a consequence of the above, the mass flux through each edge of an optimal discrete mass flux can be bounded above.
Lemma 2.9 (Maximal mass flux).
Let $G$ be an acyclic discrete mass flux between $\mu_+$ and $\mu_-$. Then $w(e) \le \mu_+(\mathbb{R}^n)$ for all $e \in E(G)$.
Define the set $E_0$ of edges emanating from a vertex in $V(G)$ without influx. Now inductively define $E_i$, $i = 1, 2, \ldots$, as follows. Given $E_i$, we seek a vertex $v$ such that all incoming edges to $v$ are in $E_i$. All those edges we replace by the outgoing edges of $v$ to obtain $E_{i+1}$. It is straightforward to show by induction that each edge lies in at least one $E_i$ and that the total flux through the edges in $E_i$ is bounded by $\mu_+(\mathbb{R}^n)$ for all $i$. ∎
Now we are in a position to show that either the transport cost between given $\mu_+, \mu_-$ is infinite, or a minimizer exists.
Theorem 2.10 (Existence).
Given with bounded support, the minimization problem
either has a solution, or is infinite.
By Lemma 2.3 and Lemma 2.4 we may assume . Let , , be a minimizing sequence with , and assume the infimum cost to be finite (else there is nothing to show). By Definition 2.1 there exists a triple of measures and a discrete mass flux such that
is uniformly bounded. Furthermore, by Lemma 2.4 we may assume the or to lie inside . Thus, we can extract a weakly-* converging subsequence (still indexed by for simplicity) so that we have for some with and
Under certain growth conditions on $\tau$ (depending on the space dimension $n$) one can always guarantee the existence of a finite cost mass flux and thus existence of minimizers. We will call the corresponding transportation costs admissible.
Definition 2.11 (Admissible transportation costs).
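A transportation cost $\tau$ is called admissible, if it is bounded above by a concave function $\beta : [0,\infty) \to [0,\infty)$ with $\int_0^1 \frac{\beta(w)}{w^{2 - 1/n}} \,\mathrm{d}w < \infty$.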
Remark 2.12 (Invariance under mass rescaling).
The definition of admissibility is invariant under the transformation from Lemma 2.3 and thus independent of the total mass of sources and sinks.
Remark 2.13 (Continuity).
Obviously, admissible transportation costs are continuous at $0$ and thus automatically continuous everywhere by [Kuc09, Thm. 16.2.1].
Example 2.14 (Admissible transportation costs).
The Wasserstein cost and the urban planning cost are admissible.
The branched transport cost $\tau(w) = w^\alpha$ is admissible for $\alpha > 1 - \frac{1}{n}$.
The transportation cost with is admissible.
For proving existence of finite cost mass fluxes we will need to express the admissibility in a different, less compact form.
Lemma 2.15 (Admissible transportation costs).
A transportation cost is admissible if and only if it is bounded above by a concave function with
The function must be non-decreasing (else we could not have ). Thus we have
so that is equivalent to . Now the change of variables yields
We will prove existence of finite cost networks by construction using the following components.
Definition 2.16 (-adic mass fluxes).
For a given measure we define the following.
An elementary -adic mass flux for of scale , centred at , is defined as with
where denotes the straight edge from to .
A -adic mass flux for of levels and scale , centred at , is defined inductively as
where the union of graphs is obtained by taking the union of all vertices and all weighted directed edges. We will write . The th level of is defined as the graph
Let denote the leaves of . The -level approximation of is defined as
An illustration of the mass fluxes in two dimensions is provided in Fig. 2. It is straightforward to see that is a discrete mass flux between and . Likewise, the -adic mass flux is a discrete mass flux between and .
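Why the growth condition on the transportation cost depends on the space dimension can be made plausible by a heuristic count on such hierarchical constructions. The following is a rough sketch under simplifying assumptions that are not taken from the article: the measure is uniform on a unit cube, level $k$ consists of $2^{nk}$ edges, each edge carries mass $2^{-nk}$ and has length $2^{-k}$, and all dimension-dependent constants are ignored. The function names are hypothetical.

```python
def level_cost(k, n, tau):
    """Heuristic cost of level k: 2^(n k) edges of mass 2^(-n k), length 2^(-k)."""
    edge_count = 2 ** (n * k)
    edge_mass = 2.0 ** (-n * k)
    edge_length = 2.0 ** (-k)
    return edge_count * tau(edge_mass) * edge_length

def partial_costs(levels, n, tau):
    """Running total of the level costs for levels 1, ..., levels."""
    total, sums = 0.0, []
    for k in range(1, levels + 1):
        total += level_cost(k, n, tau)
        sums.append(total)
    return sums

n = 2  # space dimension; the critical exponent is then 1 - 1/n = 0.5
for alpha in (0.75, 0.5, 0.25):
    sums = partial_costs(40, n, lambda w: w ** alpha)
    print(f"alpha = {alpha}: cost up to level 10 = {sums[9]:.4f}, "
          f"up to level 40 = {sums[-1]:.4f}")
```

Under these assumptions the level costs for $\tau(w) = w^\alpha$ form a geometric sequence with ratio $2^{\,n-1-n\alpha}$, so the total stays bounded exactly when $\alpha > 1 - \frac{1}{n}$.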
Remark 2.17 (Convergence of -level approximation).
If has support inside , then as . Indeed, we have