Domain Theory and Random Variables

Michael Mislove
Abstract

The aim of this paper is to establish a theory of random variables on domains. Domain theory is a fundamental component of theoretical computer science, providing mathematical models of computational processes. Random variables are the mainstay of probability theory. Since computational models increasingly involve probabilistic aspects, it’s only natural to explore the relationship between these two areas. Our main results show how to cast results about random variables using a domain-theoretic approach. The pay-off is an extension of the results from probability measures to sub-probability measures. We also use our approach to extend the class of domains for which we can classify the domain structure of the space of sub-probability measures.

Department of Computer Science
Tulane University, New Orleans, LA 70118 


Keywords: Domain theory, random variables, Skorohod Representation Theorem

1 Introduction

This paper draws its impetus from a line of work whose goal is to develop a domain-theoretic approach to random variables. (Footnote: In the probability theory literature, random variables are measurable maps from a probability space that take values in the reals, while random elements are measurable mappings from a probability space to an arbitrary measure space. Here we use “random variables” to denote either.) The original motivation was to use random variables to devise models for probabilistic computation that don’t suffer from the well-known problems of the probabilistic power domain [19], a program that began about 10 years ago, and recently has seen some notable successes – more on that below.

In this paper, we shift the focus from constructing monads for probabilistic choice to laying a foundation for a theory of random variables using domains. We show that an important result from the theory of random variables can be recast in the setting of domain theory, where measurable maps can then be approximated by Scott-continuous maps. The result in question is Skorohod’s Theorem [30], one of the basic results in stochastic process theory. In its simplest form, this theorem states that any Borel probability measure on a Polish space $P$ can be realized as the law of a random variable $X\colon[0,1]\to P$. That is, if $\mu$ is a Borel probability measure on a Polish space $P$ and if $\lambda$ denotes Lebesgue measure on the unit interval, then there is a measurable map $X\colon[0,1]\to P$ satisfying $\mu=X_*\lambda$, the push forward of $\lambda$ under $X$. Furthermore, if $\mu_n\to\mu$ in $\mathrm{Prob}(P)$ in the weak topology, then the random variables $X_n$ and $X$ with laws $\mu_n$ and $\mu$, respectively, satisfy $X_n\to X$ almost surely. This result allows one to replace arguments about the convergence of measures in the weak topology with arguments about almost sure convergence of measurable maps from the unit interval to Polish spaces. It led Skorohod to develop the theory of càdlàg functions that play a prominent role in the analysis of stochastic processes.
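For a finitely supported probability measure on the real line, the first part of this statement can be checked by hand: the quantile function of the measure serves as the random variable. The following Python sketch illustrates only this special case; the helper names `quantile_map` and `pushforward_mass` are hypothetical and are not notation from the paper.

```python
from fractions import Fraction

def quantile_map(atoms):
    """Return X: [0,1) -> R for a finitely supported probability measure.

    `atoms` is a list of (point, mass) pairs whose masses sum to 1.
    X sends successive subintervals of [0,1), with lengths equal to the
    masses, to the corresponding points taken in increasing order.
    """
    breaks = []          # (right endpoint of subinterval, point)
    total = Fraction(0)
    for point, mass in sorted(atoms):
        total += Fraction(mass)
        breaks.append((total, point))
    assert total == 1, "masses must sum to 1"

    def X(u):
        for right, point in breaks:
            if u < right:
                return point
        raise ValueError("u must lie in [0, 1)")
    return X, breaks

def pushforward_mass(breaks):
    """Lebesgue measure of the preimage X^{-1}(point) for each point."""
    masses, left = {}, Fraction(0)
    for right, point in breaks:
        masses[point] = masses.get(point, Fraction(0)) + (right - left)
        left = right
    return masses

# Example measure: mass 1/2 at -1, mass 1/4 at 2, mass 1/4 at 5.
mu = [(2.0, Fraction(1, 4)), (-1.0, Fraction(1, 2)), (5.0, Fraction(1, 4))]
X, breaks = quantile_map(mu)
law = pushforward_mass(breaks)   # push forward of Lebesgue measure under X
```

Here `law` recovers the masses of `mu`, which is the assertion that the push forward of Lebesgue measure under `X` is the given measure. The convergence part of the theorem is not illustrated by this sketch.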

Our main results are inspired by Skorohod’s Theorem. Each of our results generalizes from probability measures to sub-probability measures, which are more commonly used in domain theory.

Our first result extending Skorohod’s Theorem involves two new ingredients. First, in moving to the domain setting, we develop an approach to proving Skorohod’s theorem in which the Cantor set, $\mathcal{C}$, replaces the unit interval, and a domain $D$ is the target space. The role of Lebesgue measure is played by $\nu_{\mathcal{C}}$, Haar measure on $\mathcal{C}$ regarded as a countable product of two-point groups. We also show that Skorohod’s Theorem with the unit interval and Lebesgue measure follows as a corollary of our approach.

We introduce the Cantor set into the discussion because it offers a ready-made computational model in the form of the Cantor tree, – the full rooted binary tree whose set of maximal elements is isomorphic to the Cantor set. Achieving our results requires the Cantor tree: if we tried our approach using just the Cantor set, which is a chain in the natural order, then all monotone images also would be chains, which would limit the result. But the Cantor tree proves to be just the right structure to generalize to arbitrary countably-based domains as images. This brings up the second new component of our approach: the use of the transport numbers between simple measures that are fundamental to the Splitting Lemma for simple measures on a domain. They allow us to define a sequence of Scott-continuous maps from the Cantor tree to a target domain that approximate a given measure on the domain.

To continue the discussion, we need some notation: we realize the Cantor tree, as the set of finite and infinite words over . If we endow with the prefix order, then , the convex power domain of is a coherent domain. If we denote by the set of words of length , then we let denote normalized counting measure on , which is Haar measure when is regarded as a finite group. Moreover, we have , where is the canonical projection. In fact, in SProb, the family of sub-probability measures on , regarded as Scott-continuous valuations over and ordered pointwise. If is a domain, then we let denote the family of Scott-continuous maps , where is a Lawson-closed antichain in , with the order iff and where all components are defined. Our first main result is the following:

Theorem 1. (Skorohod’s Theorem for Domains) Let be a countably-based coherent domain, and let be sequence of Borel sub-probability measures satisfying , where the limit is taken in the Lawson topology. Then there are Scott-continuous maps satisfying for each , and in the Lawson topology on .

Skorohod’s Theorem is a corollary of Theorem 1 as follows. Any Polish space $P$ has a computational model, a countably-based bounded complete domain $D$ for which $P$ is homeomorphic to the set $\mathrm{Max}\,D$ of maximal elements endowed with the relative Scott topology. In fact, $\mathrm{Max}\,D$ is a $G_\delta$, hence a Borel subset of $D$. The last piece is provided by the fact that the canonical surjection of the Cantor set onto the unit interval preserves all sups and infs, and so it has a lower adjoint preserving all suprema. Following this lower adjoint by the maps provided in Theorem 1 then yields Skorohod’s original result.

Our second theorem is a special case of the discussion above, when the Polish space is actually totally ordered. In this case, we abandon our indirect approach using the Cantor tree, and instead take a direct approach to considering mappings from the unit interval, but also restricting to the case that is a complete chain. This allows us to prove the following result using direct, domain-theoretic arguments:

Theorem 3. If is a complete chain with as the only compact element, then , the family of sub-probability measures on , and , the family of probability measures on , are continuous lattices.

This result significantly expands our knowledge of the domain structure of the family of sub-probability measures on a domain . Indeed, up to this point, the only domains for which the domain structure of is known are a tree, , for which BCD, the category of bounded complete domains, or a finite reverse tree , in which case is in RB [19].

1.1 Related Work

Previous work related to our results includes Edalat’s extensive history of results devising domain-theoretic approaches to topics such as integration theory [11], stochastic processes [12], dynamical systems and fractals [13], and Brownian motion [4]. His development with Heckmann of the formal ball model [15] provided an approach tailored to modeling metric spaces and Lipschitz maps using domain theory. The concept of a computational model emerged in Edalat’s work on domain models of spaces arising in real analysis using the domain of compact subsets under reverse inclusion, where the target space arises as the set of maximal elements. The first paper formally presenting such a model was [13], where a domain model for locally compact second countable spaces was given. That paper presents a range of applications of the approach, including dynamical systems, iterated function systems and fractals, a computational model for classical measure theory on locally compact spaces, and a computational generalization of Riemann integration. Further discussion of these developments occurs in our discussion of Polish spaces in Section 4 below.

Other related work concerns the program to develop random variable models of probabilistic computational processes. This began with [23], a paper that provided a domain model for finite random variables. Further efforts had limited success until recently. The model proposed in [17] turned out to be flawed, as was initially shown in [24, 25]. But inspired by ideas from [17], Barker [3] devised a monad of random variables that gives an abstract model for randomized algorithms. This line of research was initiated by Scott [29], who showed how the model of the lambda calculus could be extended naturally to support probabilistic choice with the aid of a random variable . Barker’s results abstract Scott’s approach by providing a model of randomized PCF that adds a version of probabilistic choice based on random variables. Finally, the author has devised another monad based on random variables [26] that supports settings in which processes, such as those representing honest participants in a crypto-protocol, have access to distinct sources of randomness, something that Barker’s monad does not support. It is notable that both of these monads leave important Cartesian closed categories of domains invariant – in particular, the category BCD of bounded complete domains, as well as the CCC RB of retracts of bifinite domains – and each enjoys a distributive law with respect to at least one of the nondeterminism monads.

The rest of the paper is organized as follows. In the next section, we review the material we need from a number of areas: domain theory, topology, and probability theory. Section 3 develops results about mappings from the Cantor tree to the space of sub-probability measures on a countably-based coherent domain $D$. Section 4 contains the main results of the paper: it first recalls the development of Polish spaces as computational models, and then presents the main theorems. Section 5 summarizes what’s been proved, and discusses future work.

2 Background

In this section we present the background material we need for our main results.

2.1 Domains

Our results rely fundamentally on domain theory. Most of the results that we quote below can be found in [1] or [16]; we give specific references for those that cannot.

To start, a poset is a partially ordered set. A poset is directed complete if each of its directed subsets has a least upper bound, where a subset $S$ is directed if each finite subset of $S$ has an upper bound in $S$. A directed complete partial order is called a dcpo. The relevant maps between dcpos are the monotone maps that also preserve suprema of directed sets; these maps are usually called Scott continuous.

From a purely topological perspective, a subset $U\subseteq P$ of a poset $P$ is Scott open if (i) $U={\uparrow}U\equiv\{x\in P\mid(\exists u\in U)\ u\le x\}$ is an upper set, and (ii) $\sup S\in U$ implies $S\cap U\neq\emptyset$ for each directed subset $S\subseteq P$. It is routine to show that the family of Scott-open sets forms a topology on any poset; this topology satisfies ${\downarrow}x\equiv\{y\in P\mid y\le x\}=\overline{\{x\}}$ is the closure of a point, so the Scott topology is always $T_0$, but it is $T_1$ iff $P$ is a flat poset. In any case, a mapping between dcpos is Scott continuous in the order-theoretic sense iff it is a monotone map that is continuous with respect to the Scott topologies on its domain and range. We let DCPO denote the category of dcpos and Scott-continuous maps; DCPO is a Cartesian closed category.

If $P$ is a dcpo, and $x,y\in P$, then $x$ approximates $y$ iff for every directed set $S\subseteq P$, if $y\le\sup S$, then there is some $s\in S$ with $x\le s$. In this case, we write $x\ll y$ and we let ${\Downarrow}y=\{x\in P\mid x\ll y\}$. A basis for a poset $P$ is a family $B\subseteq P$ satisfying ${\Downarrow}y\cap B$ is directed and $y=\sup({\Downarrow}y\cap B)$ for each $y\in P$. A continuous poset is one that has a basis, and a dcpo $P$ is a domain if $P$ is a continuous dcpo. An element $k\in P$ is compact if $k\ll k$, and $P$ is algebraic if $KP=\{k\in P\mid k\ll k\}$ forms a basis. Domains are sober spaces in the Scott topology.

We let DOM denote the category of domains and Scott continuous maps; this is a full subcategory of DCPO, but it is not Cartesian closed. Nevertheless, DOM has several Cartesian closed full subcategories. Two of particular interest to us are the full subcategory SDOM of Scott domains, and BCD its continuous analog. Precisely, a Scott domain is an algebraic domain $P$ whose set $KP$ of compact elements is countable and that also satisfies the property that every non-empty subset of $P$ has a greatest lower bound. An equivalent statement to the last condition is that every subset of $P$ with an upper bound has a least upper bound. A domain is bounded complete if it also satisfies this last property that every non-empty subset has a greatest lower bound; BCD denotes the category of bounded complete domains and Scott-continuous maps.

Domains admit a Hausdorff refinement of the Scott topology which will play a role in our work. The weak lower topology on $P$ has the sets of the form $P\setminus{\uparrow}F$ as a basis, where $F\subseteq P$ is a finite subset. The Lawson topology on a domain $P$ is the common refinement of the Scott and weak lower topologies on $P$. This topology has the family

$$\{\,U\setminus{\uparrow}F \mid U \text{ Scott open},\ F\subseteq P \text{ finite}\,\}$$

as a basis. The Lawson topology on a domain is always Hausdorff.

A domain is coherent if its Lawson topology is compact. We denote the closure of a subset of a coherent domain in the Lawson topology by .

Two examples of coherent domains that we need are the Cantor tree and the unit interval. If is a finite set, then denotes the set of finite and infinite words over . In this case, the family is a base for the Lawson topology. The fact that is clopen in the Lawson topology for each compact element implies that the Lawson topology on a coherent algebraic domain is totally disconnected. Of particular interest is , in which case is the binary Cantor tree whose set of maximal elements is the Cantor set in the relative Scott (= relative Lawson) topology.

The other example is the unit interval $[0,1]$, where $x\ll y$ iff $x=0$ or $x<y$. The Scott topology on $[0,1]$ has basic open sets $[0,1]$ together with ${\Uparrow}x=(x,1]$ for $x\in(0,1)$. Since DOM has finite products, $[0,1]^n$ is a domain in the product order, where $x\ll y$ iff $x_i\ll y_i$ for each $i$; a basis of Scott-open sets is formed by the sets ${\Uparrow}x=\{y\mid x\ll y\}$ for $x\in[0,1]^n$ (this last is true in any domain).

The Lawson topology on $[0,1]$ has basic open sets $(x,1]\setminus[y,1]$ for $x<y$ – i.e., sets of the form $(x,y)$ for $x<y$, which is the usual topology. Then, the Lawson topology on $[0,1]^n$ is the product topology from the usual topology on $[0,1]$.

Since $[0,1]$ has a least element, the same results apply for any power $[0,1]^J$ of $[0,1]$, where $x\ll y$ in $[0,1]^J$ iff $x_j=0$ for almost all $j\in J$, and $x_j\ll y_j$ for all $j\in J$. Thus, every power of $[0,1]$ is a coherent domain.

We note that all of these examples – including the last one if $J$ is countable – are countably based domains. That is, each has a countable basis. A result that plays an important role for us is the following:

Lemma 2.1

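If $D$ is a countably based domain, then every $x\in D$ is the supremum of a countable chain $x_1\le x_2\le\cdots$ with $x_n\ll x$ for each $n$.

Proof. Since $D$ has a countable base, there is a countable directed set $B_x$ of elements approximating $x$ with $x=\sup B_x$. If we enumerate $B_x=\{b_1,b_2,\ldots\}$, then we define the desired sequence as follows: $x_1=b_1$, and if $x_1\le\cdots\le x_n$ have been chosen from $B_x$, then we choose $x_{n+1}\in B_x$ with $x_i\le x_{n+1}$ for each $i\le n$ and $b_{n+1}\le x_{n+1}$; such a choice exists because $B_x$ is directed. This extends the sequence, and then a standard maximality argument shows we can choose a countable sequence $\{x_n\}$ with $b_n\le x_n\le x_{n+1}$ for each $n$. Finally, $\sup_n x_n=x$, since $x_n\le x$ for each $n$, but $b_n\le x_n$ for each $n$ implies $x=\sup B_x\le\sup_n x_n$.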

We also need some basic results about Galois adjunctions (cf. Section 0-3 of [16]) in the context of complete lattices. If $L$ and $M$ are complete lattices, a Galois adjunction is a pair of monotone mappings $f\colon L\to M$ and $g\colon M\to L$ satisfying $g\circ f\ge 1_L$ and $f\circ g\le 1_M$; equivalently, $f(x)\le y$ iff $x\le g(y)$ for all $x\in L$ and $y\in M$. In this case, $f$ is the lower adjoint, and $g$ is the upper adjoint. Lower adjoints preserve all suprema, and upper adjoints preserve all infima. In fact, each mapping $f\colon L\to M$ between complete lattices that preserves all suprema is a lower adjoint; its upper adjoint is defined by $g(y)=\sup f^{-1}({\downarrow}y)$. Dually, each mapping $g\colon M\to L$ preserving all infima is an upper adjoint; its lower adjoint is defined by $f(x)=\inf g^{-1}({\uparrow}x)$. The cumulative distribution function of a probability measure on and its upper adjoint given in the introduction are examples we’ll find relevant.
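The formula for the upper adjoint of a supremum-preserving map can be tried out on finite chains, where every supremum is a maximum. The Python sketch below is only an illustration under that assumption; the function name `upper_adjoint` and the example chains are hypothetical.

```python
def upper_adjoint(f, L, M):
    """Upper adjoint g of a sup-preserving map f between finite chains.

    L and M list the elements of two finite chains in increasing order,
    and f is a dict from L to M. g(y) is the largest x in L with
    f(x) <= y, i.e. the supremum of the preimage of the down-set of y.
    """
    # On finite chains, preserving all suprema means: monotone, and the
    # least element (the supremum of the empty set) goes to the least element.
    assert f[L[0]] == M[0], "f must preserve the least element"
    assert all(f[a] <= f[b] for a, b in zip(L, L[1:])), "f must be monotone"
    return {y: max(x for x in L if f[x] <= y) for y in M}

L = [0, 1, 2, 3]
M = [0, 1, 2]
f = {0: 0, 1: 0, 2: 2, 3: 2}
g = upper_adjoint(f, L, M)

# Defining property of a Galois adjunction: f(x) <= y  iff  x <= g(y).
adjoint_ok = all((f[x] <= y) == (x <= g[y]) for x in L for y in M)
```

The final check confirms the equivalence $f(x)\le y$ iff $x\le g(y)$ for every pair of elements in this example.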

2.2 Subprobability measures on domains and the probabilistic power domain

The probability-theoretic approach to measures on a complete metric space $X$ starts by considering the Borel $\sigma$-field generated by the open sets, and then defines a sub-probability measure as a non-negative, countably additive set function $\mu$ satisfying $\mu(X)\le 1$. (Footnote: Most texts confine the discussion to probability measures, but the results we need are valid for sub-probability measures. We provide proofs for the results we need below.) Each such measure then defines an integral $f\mapsto\int f\,d\mu$ for $f\in C_c(X)$, the Banach space of continuous functions of compact support, for example by approximating $f$ using simple functions. The sub-probability measures SProb$(X)$ then can be endowed with the weak topology, in which $\mu_n\to\mu$ iff $\int f\,d\mu_n\to\int f\,d\mu$ for each $f\in C_c(X)$.

There also is a functional-analytic approach, which starts with the continuous bounded functions on a locally compact Hausdorff space , and then considers the dual space of continuous linear functionals . The Riesz Representation Theorem shows there is an isomorphism between the space of measures on and the dual space of . We then can endow with the weak *-topology: iff for all . A functional is positive if for all , and then the isomorphism above restricts to one between the sub-probability measures SProb and the positive linear functionals with norm .

These two approaches coincide – and the weak and weak-* topologies agree – when $X$ is a compact metric space. In particular, they agree for a countably-based coherent domain endowed with the Lawson topology.

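Domain theory traditionally takes yet a third approach to sub-probability measures, one that emphasizes the order structure. In this approach, the sub-probability measures over a domain $D$ are viewed as continuous valuations: mappings $\mu\colon\Sigma(D)\to[0,1]$ from the family $\Sigma(D)$ of Scott-open sets to the interval $[0,1]$ satisfying:

  • (Strictness) $\mu(\emptyset)=0$,

  • (Modularity) $\mu(U\cup V)+\mu(U\cap V)=\mu(U)+\mu(V)$, for $U,V\in\Sigma(D)$,

  • (Scott continuity) If $\{U_i\}_{i\in I}\subseteq\Sigma(D)$ is directed, then $\mu(\bigcup_{i\in I}U_i)=\sup_{i\in I}\mu(U_i)$.

Valuations are ordered pointwise: $\mu\le\nu$ iff $\mu(U)\le\nu(U)$ for all $U\in\Sigma(D)$. We denote the set of valuations over a domain $D$ with this order by $\mathbb{V}(D)$.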
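On a finite poset the Scott-open sets are exactly the upper sets, so the strictness and modularity conditions can be checked exhaustively. The following Python sketch does this for a point-mass valuation on a three-element poset; the poset, the weights, and the helper names `is_upper` and `valuation` are assumptions chosen for illustration, and Scott continuity is automatic in the finite case.

```python
from fractions import Fraction
from itertools import combinations

# A small finite poset: a bottom element 'b' below two incomparable
# elements 'x' and 'y'.
elements = ['b', 'x', 'y']
leq = {('b', 'b'), ('x', 'x'), ('y', 'y'), ('b', 'x'), ('b', 'y')}

def is_upper(U):
    """True if U contains every element above one of its members."""
    return all(v in U for u in U for v in elements if (u, v) in leq)

# In a finite poset the Scott-open sets are exactly the upper sets.
opens = [frozenset(c) for k in range(len(elements) + 1)
         for c in combinations(elements, k) if is_upper(frozenset(c))]

def valuation(weights):
    """Valuation U |-> total weight of the points lying in U."""
    return lambda U: sum((weights.get(p, Fraction(0)) for p in U), Fraction(0))

# Sub-probability weights: total mass 3/4.
mu = valuation({'b': Fraction(1, 2), 'x': Fraction(1, 4)})

strict = mu(frozenset()) == 0
modular = all(mu(U | V) + mu(U & V) == mu(U) + mu(V)
              for U in opens for V in opens)
```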

It is straightforward to show that each Borel sub-probability measure restricts to a unique Scott-continuous valuation on the Scott-open sets. The converse, that each Scott-continuous valuation on a dcpo extends to a unique Borel sub-probability measure, was shown by Alvarez-Manilla, Edalat and Saheb-Djahromi [2].

Linking the order-theoretic approach to valuations and the approaches to SProb outlined above relies on the next result. We recall that a simple sub-probability measure on a space $X$ is a finite convex sum $\mu=\sum_{x\in F}r_x\delta_x$, where $F\subseteq X$ is finite, $r_x\ge 0$ for each $x\in F$, and $\sum_{x\in F}r_x\le 1$. The following is called the Splitting Lemma:

Theorem 2.2

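(Splitting Lemma [18]) Let $D$ be a domain and let $\mu=\sum_{x\in F}r_x\delta_x$ and $\nu=\sum_{y\in G}s_y\delta_y$ be simple sub-probability measures on $D$. Then the following are equivalent:

  1. $\mu\le\nu$,

  2. There is a family $\{t_{x,y}\}_{(x,y)\in F\times G}\subseteq[0,1]$ satisfying:

    • $r_x=\sum_{y\in G}t_{x,y}$ for each $x\in F$,

    • $\sum_{x\in F}t_{x,y}\le s_y$ for each $y\in G$,

    • $t_{x,y}>0\ \Rightarrow\ x\le y$.

Moreover, $\mu\ll\nu$ iff (i) $\sum_{x\in F}t_{x,y}<s_y$ for each $y\in G$ and (ii) $\{t_{x,y}\}$ satisfies $t_{x,y}>0$ implies $x\ll y$ for each $(x,y)\in F\times G$.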
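The three conditions on a family of transport numbers are finite checks, so a candidate family can be verified mechanically. The Python sketch below does this for simple measures on finite binary words in the prefix order; the function name `witnesses_order` and the example data are hypothetical, and the conditions coded are the standard ones for the Splitting Lemma (mass leaving each point of the smaller measure equals its weight, mass arriving at each point of the larger measure is at most its weight, and mass only moves upward).

```python
from fractions import Fraction

def witnesses_order(r, s, t, leq):
    """Check the Splitting Lemma conditions for transport numbers t.

    r: dict point -> mass of the smaller simple measure
    s: dict point -> mass of the larger simple measure
    t: dict (x, y) -> transport number (missing pairs count as 0)
    leq: function deciding the order of the underlying poset
    """
    get = lambda x, y: t.get((x, y), Fraction(0))
    # All the mass at x is sent somewhere.
    out_ok = all(sum(get(x, y) for y in s) == r[x] for x in r)
    # No point y receives more mass than it carries.
    in_ok = all(sum(get(x, y) for x in r) <= s[y] for y in s)
    # Mass is only transported upward in the order.
    order_ok = all(leq(x, y) for (x, y), v in t.items() if v > 0)
    return out_ok and in_ok and order_ok

# Finite binary words in the prefix order.
prefix = lambda x, y: y.startswith(x)

r = {"0": Fraction(1, 2), "1": Fraction(1, 4)}
s = {"00": Fraction(1, 4), "01": Fraction(1, 4), "1": Fraction(1, 2)}

good = {("0", "00"): Fraction(1, 4), ("0", "01"): Fraction(1, 4),
        ("1", "1"): Fraction(1, 4)}
bad = {("0", "00"): Fraction(1, 4), ("0", "1"): Fraction(1, 4),
       ("1", "1"): Fraction(1, 4)}

good_ok = witnesses_order(r, s, good, prefix)
bad_ok = witnesses_order(r, s, bad, prefix)
```

The family `bad` fails only the order condition: it moves mass from the word `0` to the word `1`, which is not above it in the prefix order.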

This result can be used to show that, given a basis for , the family forms a basis for ; in particular, each sub-probability measure is the directed supremum of simple measures way-below it, so is a domain if is one. Moreover, Jung and Tix [19] showed that is a coherent domain if is.

Our interest is in countably-based coherent domains, in which case we can refine the Splitting Lemma 2.2 and Lemma 2.1; here denotes the family of dyadic rationals in the unit interval:

Corollary 2.3

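If $D$ is a coherent domain with countable basis $B_D$, then the domain $\mathbb{V}(D)$ of valuations on $D$ is a countably-based coherent domain with basis

$$\Big\{\,\sum_{x\in F}r_x\delta_x \ \Big|\ F\subseteq B_D \text{ finite},\ r_x \text{ a dyadic rational for each } x\in F,\ \sum_{x\in F}r_x\le 1\,\Big\}.$$

Moreover, if $\mu\le\nu$ are simple measures in this basis, then the family of transport numbers $t_{x,y}$ from the Splitting Lemma 2.2 can be chosen so that $t_{x,y}$ is a dyadic rational for all $x,y$.

Finally, each $\mu\in\mathbb{V}(D)$ is the supremum of a countable chain of measures from this basis.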

Proof. It is shown in [19] that is coherent if is, and the Splitting Lemma 2.2 implies is a basis for .

We next outline the proof of the second point – that the transport numbers between comparable simple measures are all dyadic rationals if the coefficients of the measures are. This follows from the proof of the Splitting Lemma 2.2 as presented in [18]: That proof is an application of the Max Flow – Min Cut Theorem to the directed graph built from the two measures $\mu=\sum_{x\in F}r_x\delta_x$ and $\nu=\sum_{y\in G}s_y\delta_y$, which has a “source node,” $\bot$, connected by an outgoing edge of weight $r_x$ to each “node” $x\in F$, a “sink node,” $\top$, with an incoming edge of weight $s_y$ from each element $y\in G$, and edges from $x$ to $y$ of large weight (say, $1$), if $x\le y$.

A flow is an assignment of non-negative numbers $f(u,v)$ to each edge $(u,v)$ so that $f(u,v)\le w(u,v)$ for nodes $u,v$, where $w(u,v)$ is the weight as defined above, and satisfying $\sum_u f(u,v)=\sum_u f(v,u)$ for each node $v\neq\bot,\top$. The value of a flow is $\sum_{x\in F}f(\bot,x)$, the total amount of flow out of $\bot$ using $f$. A cut is a partition $(S,T)$ of the nodes with $\bot\in S$ and $\top\in T$. The value of a flow across the cut is $\sum_{u\in S,\,v\in T}f(u,v)-\sum_{u\in S,\,v\in T}f(v,u)$.

The Max Flow – Min Cut Theorem asserts that the maximum value of a flow on a directed graph is equal to the minimum capacity of a cut. It is proved by applying the Ford–Fulkerson Algorithm [6]. The algorithm starts by assigning the zero flow to all edges, and then iterates a process of selecting a path from $\bot$ to $\top$ in the residual graph, calculating the residual capacity of each edge in the path, augmenting the flow along the path by the smallest of these residual capacities, and updating the residual graph. The result of the algorithm is the set of flows along edges across the cut, which are the transport numbers in our case. Since the calculations of new edge weights involve only addition, subtraction and taking minima, and since the dyadic rationals are closed under these operations, the resulting transport numbers are dyadic rationals if the coefficients of the input distributions are dyadic.
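The closure argument can be observed directly by running an augmenting-path computation in exact rational arithmetic. The Python sketch below uses the Edmonds–Karp variant of Ford–Fulkerson (shortest augmenting paths found by breadth-first search), which is a substitution made here for simplicity and is not claimed to be the version used in [6] or [18]. The node labels `'src'` and `'snk'`, the function names, and the example measures are hypothetical.

```python
from collections import deque
from fractions import Fraction

def max_flow(cap, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest residual paths.

    cap maps directed edges (u, v) to non-negative Fraction capacities.
    Returns the dict of residual capacities after the last augmentation.
    """
    res = dict(cap)
    for (u, v) in cap:
        res.setdefault((v, u), Fraction(0))   # reverse residual edges
    nodes = {u for e in res for u in e}
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and res.get((u, v), 0) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return res
        # Recover the path and its bottleneck residual capacity.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= push
            res[(v, u)] += push

def transport_numbers(r, s, leq):
    """Flows on the middle edges of the graph built from two simple measures."""
    cap = {}
    for x, rx in r.items():
        cap[('src', ('F', x))] = rx
    for y, sy in s.items():
        cap[(('G', y), 'snk')] = sy
    for x in r:
        for y in s:
            if leq(x, y):
                cap[(('F', x), ('G', y))] = Fraction(1)   # "large" weight
    res = max_flow(cap, 'src', 'snk')
    # Flow on an original edge = capacity minus residual capacity.
    return {(e[0][1], e[1][1]): cap[e] - res[e]
            for e in cap if e[0] != 'src' and e[1] != 'snk'}

def is_dyadic(q):
    d = Fraction(q).denominator
    return d & (d - 1) == 0          # denominator is a power of two

prefix = lambda x, y: y.startswith(x)
r = {"0": Fraction(1, 2), "1": Fraction(1, 4)}
s = {"00": Fraction(1, 4), "01": Fraction(1, 4), "1": Fraction(1, 2)}
t = transport_numbers(r, s, prefix)
```

In this example the total flow equals the total mass of the smaller measure, so the computed numbers witness the order between the two measures, and every entry of `t` has a power of two as denominator.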

The final assertion follows from the fact that is a basis, by an application of Lemma 2.1.

There remains a question of the order structure on that arises from . To clarify this point, we first recall that the real numbers, , are a continuous poset whose Scott topology has the intervals as a basis, and whose Lawson topology is the usual topology.

Proposition 2.4

Let be a coherent domain, and let be sub-probability measures on . Then the following conditions are equivalent:

  1. .

  2. For each Scott-continuous map , .

  3. For each monotone Lawson-continuous , .

Proof. We show the result for simple measures, which then implies it holds for all measures since is a domain – so its partial order is (topologically) closed – in which the simple measures are dense.

So, suppose and are simple measures on .

(i) implies (ii): Suppose that . If , then and . Since , there are guaranteed by the Splitting Lemma 2.2, and so

where the first inequality follows from the facts that implies and is monotone. This shows (i) implies (ii).

(ii) implies (iii): Since monotone Lawson continuous maps are Scott continuous, this is obvious.

(iii) implies (i): Let be a Scott-open subset of , and let . Using the facts that is coherent, so its Lawson topology is compact Hausdorff, and that is finite, we define a family of Scott-open upper sets indexed by , the dyadic numbers in as follows: We let , and for , we recursively choose , the Lawson-closure of . Then define a mapping

by if , and otherwise .


Since the family consists of Scott-open sets satisfying for , this mapping is monotone, and the standard Urysohn Lemma argument (cf. Theorem 33.1 [27]) shows it is Lawson continuous. So, by assumption.

Since and are simple, and . By construction, , so
, and ,
and so , as required.

Remark 2.5

Proposition 2.4 offers added insight into the relationship between SProb and for a countably-based coherent domain , by showing that the domain order from can be defined directly on SProb using the maps from to .

3 Domain Mappings from the Cantor Tree

The Cantor tree $\mathcal{CT}$ is the family of finite and infinite words over $\{0,1\}$ in the prefix order. Equivalently, $\mathcal{CT}$ is the full rooted binary tree; it is directed complete, and since it is countably based, this means it is closed under suprema of increasing chains. $\mathcal{CT}$ will play the role of the unit interval in our approach to generalizing Skorohod’s Theorem to the domain setting. For that, we need some preliminary definitions.

An antichain is a non-empty subset $A$ of the Cantor tree $\mathcal{CT}$ satisfying $a\neq b\in A$ implies $a$ and $b$ do not compare in the prefix order. An extensive study of Lawson-closed antichains in $\mathcal{CT}$ and of measures whose (Lawson) support is such an antichain is given in [24]. The key idea in that work was to associate to a Lawson-closed antichain $A$ its Scott closure, which turns out to be ${\downarrow}A$ because $\mathcal{CT}$ is a tree.

Our interest here is somewhat different. The results obtained in [24] were in the context of probability measures, and we want to extend the treatment in [24] to include sub-probability measures, since they arise naturally in the current context. Our aim also is to define mappings from all of the Cantor tree $\mathcal{CT}$ to the target domain $D$, rather than simply defining them from the Lawson-support of a particular measure. The mappings we seek ultimately can then be realized as restrictions to the Cantor set of maximal elements of $\mathcal{CT}$ of mappings defined on all of $\mathcal{CT}$. These goals will be accomplished by first defining mappings on finite antichains of compact elements in $\mathcal{CT}$, and then extending them canonically to all of $\mathcal{CT}$.

Notation 3.1

For the next result, we need to establish some notation.

  1. We let be the set of -bit words in , which forms an antichain. Recall that there is a well-defined retraction mapping from the Cantor set onto ; this mapping sends each element of to its largest prefix in . In addition, if , then there is a map that sends each -bit word to its -bit prefix.

  2. We can restrict the projection to , the Cantor set of maximal elements in , and its image is then . This projection has a corresponding embedding sending an -bit word to the infinite word all of whose coordinates are . Then and . If are Lawson-closed antichains with , then there is a canonical partial mapping sending each element of to the unique element of below it, iff there is some element of below it. This mapping is continuous in the relative Scott topologies on its domain and .

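  3. If $D$ is a domain, then we let $[\mathcal{CT}\rightharpoonup D]$ denote the family of functions $f\colon A\to D$ where $A$ is a Lawson-closed antichain in the Cantor tree $\mathcal{CT}$ and $f$ is continuous in the relative Scott topology inherited from $\mathcal{CT}$.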

    If , then we define iff and where is defined.

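  4. If $C_n$ is the set of $n$-bit words, we order $C_n$ with the lexicographic order. Then each dyadic rational $k/2^n$ that can be expressed with denominator $2^n$ can also be expressed as an interval in $C_n$, namely, the first $k$ words of $C_n$ in the lexicographic order. Moreover, each sequence of such dyadics whose sum is at most $1$ can be expressed as successive intervals: the first dyadic by an initial interval, the second by the interval that starts immediately after it, etc. We denote each of these subintervals by .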
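The bookkeeping in items 1 and 4 can be made concrete for small word lengths. The Python sketch below lists the $n$-bit words in lexicographic order, pushes normalized counting measure forward along the prefix map, and allocates successive intervals of words to a sequence of dyadic rationals. All function names are hypothetical, and the sketch illustrates the notation only, not the construction in the proofs that follow.

```python
from fractions import Fraction
from itertools import product

def words(n):
    """All n-bit words, listed in lexicographic order."""
    return [''.join(bits) for bits in product('01', repeat=n)]

def counting_measure(n):
    """Normalized counting measure on the n-bit words."""
    return {w: Fraction(1, 2 ** n) for w in words(n)}

def push_to_prefix(measure, m):
    """Push a measure on n-bit words forward along w |-> its m-bit prefix."""
    image = {}
    for w, mass in measure.items():
        image[w[:m]] = image.get(w[:m], Fraction(0)) + mass
    return image

def successive_intervals(dyadics, n):
    """Split the n-bit words into successive lexicographic intervals.

    Each dyadic must be a multiple of 2**-n and the total must be at most 1.
    Returns one list of words per dyadic, plus the unused final interval.
    """
    ws, start, blocks = words(n), 0, []
    for d in dyadics:
        count = Fraction(d) * 2 ** n
        assert count.denominator == 1, "each dyadic must be a multiple of 2**-n"
        blocks.append(ws[start:start + int(count)])
        start += int(count)
    assert start <= len(ws), "masses must sum to at most 1"
    return blocks, ws[start:]

blocks, rest = successive_intervals([Fraction(1, 4), Fraction(1, 2)], 3)
projected = push_to_prefix(counting_measure(3), 2)
```

Each block has counting-measure mass equal to the dyadic it represents, and the leftover words in `rest` carry the mass not used by the sequence.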

Proposition 3.2

Let be a bounded complete domain with countable basis , and for each , let denote the antichain in of -bit words and normalized counting measure on .

If is an increasing sequence of simple sub-probability measures on whose coefficients are dyadic rationals, then there is a corresponding sequence and mappings satisfying implies and .

Proof. Let be a chain of simple sub-probability measures on , and assume with for each . We show by induction that every finite initial chain has a corresponding family satisfying and for each . An appeal to maximality then proves the result.
Basis Step: To begin, recall that means there are transport numbers satisfying the conditions that and for each . Since for all , it follows from Corollary 2.3 that the transport numbers also are dyadic rationals. This implies we can choose so that:

  • Every can be expressed as a dyadic with denominator , and

  • Every and every can be expressed as a dyadic with denominator .

We then define by , for , using the notation from Notation 3.1. (Footnote: To avoid notational clutter, we assume the elements of are given in some total order, and that order is used to enumerate the intervals in .) That follows from the fact that .

Defining takes a bit more work, because the tree structure on complicates the allocation of all the transport numbers for fixed, as varies over . To simplify things, for a fixed , we let denote the sequence of transport numbers for . We also let , for each ; this represents the portion of not needed to accommodate any of the mass from .

Next, we endow with the lexicographic order, based on the implicit order we assumed on each component. Then we enumerate as the sequence of intervals in lexicographic order, followed by the intervals in the order on , and then the final interval , the final subinterval of not needed for any mass in .

We then define by for each and , and we leave undefined on . A simple calculation shows that : the mass in consists of that is transported from the s, and the remaining mass in is .

To show that , note that our use of the lexicographic order implies that, for each , the sequence of intervals is in the upper set in of the interval , and since , this sequence of intervals exhausts the intersection of the upper set of with . This means the sequence represents exactly the mass transported from up to . From this it follows that, if is the natural projection in , then satisfies for some implies . This implies where both are defined, which is the definition of .
Inductive Step: For the inductive step, we assume we are given , and that we have defined , antichains and functions with and for .

We also assume that , , and . We also assume we are given a partition into subintervals, with each interval partitioned into subintervals satisfying for each , and where represents the final subinterval of not needed for the total mass in . We can also decompose into subintervals, where , which is the amount of mass at not needed to accommodate incoming mass from , and is the final interval representing the remaining “mass” in after the mass in is accounted for.

Since , Corollary 2.3 asserts there are transport numbers satisfying and for each , and . Using the larger denominator , we decompose each summand of as and . This gives us a partition

where , and .

We define by . By construction, , and . It is also clear from the construction that iff for some , and iff for some . This implies that , as required.

This shows that we can construct the required sequence of antichains and partial maps for each , and then the standard maximality argument shows that there is such a sequence that works simultaneously for all .

For our next result, we need some information about the weak topology on . The result we need follows from the Portmanteau Theorem (cf., e.g., [5]), and a proof can be found as Corollaries 15 and 16 in [7]:

Theorem 3.3

Let $D$ be a countably based coherent domain endowed with the Borel $\sigma$-algebra. Then the weak topology on SProb$(D)$ is the same as the Lawson topology on SProb$(D)$ when it is viewed as a family of valuations.

Moreover, for a family , the following are equivalent:

  1. Both of the following hold:

    • for all finitely generated upper sets .

    • for all Scott-open sets .

  2. for all Lawson-open sets .

Theorem 3.4

Let be a countably-based coherent domain, let denote the Cantor set of maximal elements in , the Cantor tree, and let denote Haar measure on viewed as a countable product of two-point groups. If , then there is a measurable partial map satisfying .

Proof. Using Corollary 2.3, there is an increasing sequence of simple measures in with dyadic rational coefficients satisfying . Then Proposition 3.2 implies there is a sequence of antichains in and a sequence satisfying and implies where defined. If is the natural projection, then , and so , again where and are defined.

In particular, if we let , then the restriction satisfies . Moreover, if , then . so the family is an increasing sequence of intervals in , each of which is clopen (being the embedded image of a subset of an antichain of compact elements in ). If we let , then is an open, hence Borel, subset of .

We define by .

Claim: is well-defined and measurable.
Proof: If then , so . So we conclude that is an increasing sequence in . Since is a domain, this sequence has a well-defined supremum, so is well-defined for all .

To show measurability, it is enough to show is a Borel subset of for all Scott-closed subsets . If is such a set, then iff , and since is Scott-closed, this holds iff