# Measure Theory through Dynamical Eyes

Vaughn Climenhaga
Department of Mathematics, University of Houston, Houston, TX 77005, USA

Anatole Katok
Department of Mathematics, McAllister Building, Pennsylvania State University, University Park, PA 16802, USA

These notes are a somewhat embellished version of two rather informal evening review sessions given by the second author on July 14 and 15, 2008 at the Bedlewo summer school. They provide a brief overview of some of the basics of measure theory, and of its applications to dynamics, which are foundational to the various courses at the school.

A number of results are quoted without proof, or with at most a bare sketch of a proof; references are given where full proofs may be found. Similarly, most basic definitions are assumed to be known, and we defer their reiteration to the references.

In light of the above, we emphasise that this presentation is not meant to be either comprehensive or self-contained; the reader is assumed to have some knowledge of the basic concepts of measure theory, ergodic theory, and hyperbolic dynamics, which will appear without any formal introduction. The tone is meant to be conversational rather than authoritative, and the goal is to give a general idea of various concepts which should eventually be examined thoroughly in the appropriate references.

For a full presentation of the concepts in §1, which concerns abstract measure theory, we refer the reader to Halmos’ book [Ha] (for more basic facts) and to Rokhlin’s article [Ro1]. The topics in measurable dynamics mentioned in §2 receive a more complete treatment in a later article by Rokhlin [Ro2], and the account in §3 of the relationship between foliations and measures in smooth dynamics draws on Barreira and Pesin’s book [BP], along with two articles by Ledrappier and Young [LY1, LY2].

The first author would like to thank Andrey Gogolev and Misha Guysinsky for providing useful references and comments.

## 1. Abstract Measure Theory

### 1.1. Points, sets, and functions

There are three “lenses” through which we can view measure theory; we may think of it in terms of points, in terms of sets, or in terms of functions. To put that a little more concretely, suppose we have a triple $(X, \mathcal{T}, \mu)$ comprising a measurable space, a σ-algebra, and a measure. Then we may focus our attention either on the space $X$ (and concern ourselves with points), or on the σ-algebra $\mathcal{T}$ (and concern ourselves with sets), or on the space $L^2(X, \mathcal{T}, \mu)$ (and concern ourselves with functions).

All three points of view play an important role in dynamics, and various definitions and results can be given in terms of any of the three. We will see later that the last two are completely equivalent in the greatest generality, but the first requires certain additional, albeit natural, assumptions explained in Section 1.2 (see Theorem 1.5 and Definition 1.6). Of particular interest to us will be the correspondence between partitions of the space $X$, sub-σ-algebras of $\mathcal{T}$, and subspaces of $L^2(X, \mathcal{T}, \mu)$.

First let us consider the set of all partitions of $X$. This is a partially ordered set, with ordering given by refinement; given two partitions $\xi$, $\eta$, we say that $\eta$ is a refinement of $\xi$, written $\xi \le \eta$, if and only if every $C \in \eta$ is contained in some $D \in \xi$. In this case, we also say that $\xi$ is a coarsening of $\eta$. The finest partition (which in this notation may be thought of as the “largest”) is the partition into points, denoted $\varepsilon$, while the coarsest (the “smallest”) is the trivial partition $\{X\}$, denoted $\nu$.

As on any partially ordered set, we have a notion of join and meet, corresponding to least upper bound and greatest lower bound, respectively. Following [Ro2], we shall refer to these as the product and intersection, and we briefly recall their definitions. Given two partitions $\xi$ and $\eta$, their product (join) is

 (1.1) ξ∨η:={C∩D∣C∈ξ, D∈η}.

This is the coarsest partition which refines both $\xi$ and $\eta$, and is also sometimes referred to as the joint partition. The intersection (meet) of $\xi$ and $\eta$ is the finest partition which coarsens both $\xi$ and $\eta$, and is denoted $\xi \wedge \eta$; in general, there is no analogue of (1.1) for $\xi \wedge \eta$.
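On a finite set both operations can be computed directly, which may help in seeing why the meet has no formula as simple as (1.1). The following is a minimal sketch rather than anything taken from the references: the function names `join` and `meet` and the sample partitions are hypothetical, and partitions are modelled as sets of `frozenset`s.

```python
from itertools import product

def join(xi, eta):
    """Product (join): all nonempty intersections C ∩ D, as in (1.1)."""
    return {C & D for C, D in product(xi, eta) if C & D}

def meet(xi, eta):
    """Intersection (meet): merge the elements of xi that are linked by
    some element of eta, repeating until every element of eta lies in
    a single block."""
    blocks = [set(C) for C in xi]
    for D in eta:
        touching = [B for B in blocks if B & D]
        merged = set(D).union(*touching)
        blocks = [B for B in blocks if not (B & D)] + [merged]
    return {frozenset(B) for B in blocks}

# Two hypothetical partitions of X = {0, 1, 2, 3, 4, 5}.
xi = {frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})}
eta = {frozenset({0}), frozenset({1, 2}), frozenset({3}), frozenset({4, 5})}

print(join(xi, eta))
print(meet(xi, eta))
```

The join is a purely local computation (intersect pairs of elements), whereas the meet requires chasing chains of overlapping elements, which is one way to understand the absence of a closed formula.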

So much for partitions; what do these have to do with σ-algebras, or with $L^2$-spaces? Given a partition $\xi$, we may consider the collection of all measurable subsets $A \subset X$ which are unions of elements of $\xi$; this collection forms a sub-σ-algebra of $\mathcal{T}$, which we denote by $\mathcal{B}(\xi)$. We will see later that this correspondence is far from injective; for example, certain partitions whose elements are countable sets are associated with the trivial σ-algebra, see Example 1.9.

Similarly, we may consider the collection of all square integrable functions which are constant on elements of $\xi$; this collection (more precisely, the collection of equivalence classes of such functions) forms the subspace $L^2(X, \mathcal{B}(\xi), \mu) \subset L^2(X, \mathcal{T}, \mu)$.

###### Example 1.1.

Let $X = [0,1]^2$ be the unit square with Lebesgue measure $\lambda$, and $\mathcal{T}$ the Borel σ-algebra. Let $\xi$ be the partition into vertical lines; then $\mathcal{B}(\xi)$ is the sub-σ-algebra consisting of all sets of the form $E \times [0,1]$, where $E \subset [0,1]$ is Borel, and $L^2(X, \mathcal{B}(\xi), \lambda)$ is the space of all square-integrable functions which depend only on the $x$-coordinate. The latter is canonically isomorphic to the space of square-integrable functions on the unit interval.

The class of sub-σ-algebras of $\mathcal{T}$ and the class of subspaces of $L^2(X, \mathcal{T}, \mu)$ are both partially ordered by containment, and this partial ordering is preserved by the correspondences described above. Thus the map $\xi \mapsto \mathcal{B}(\xi)$ is a morphism of partially ordered sets; it is natural to ask whether this morphism is injective (and hence invertible) on a certain class of partitions, and we will return to this question eventually. First, however, we turn to the question of classifying measure spaces, and hence the associated classes of partitions and σ-algebras, since the end result turns out to be relatively simple.

### 1.2. Lebesgue spaces

It is a somewhat serendipitous fact that although one may consider many different measure spaces $(X, \mathcal{T}, \mu)$, which on the face of it are quite different from each other, all of the examples in which we will be interested actually fall into a relatively simple classification.

To elucidate this statement, let us consider the two fundamental examples of measure spaces. The simplest sort of measure space is an atomic space, in which $X = \{x_1, x_2, \dots\}$ is a finite or countable set, $\mathcal{T}$ is the entire power set of $X$, and $\mu$ is defined by the sequence of numbers $\mu(\{x_i\})$. Each of the points $x_i$ is an atom – that is, a measurable set $A$ of positive measure such that every measurable subset $B \subset A$ has either $\mu(B) = 0$ or $\mu(B) = \mu(A)$. Atomic spaces are discrete objects, which belong to combinatorics as much as to measure theory, and do not require the full power of the latter theory.

At the other end of the spectrum stand the non-atomic spaces, in which every set of positive measure can be decomposed into two subsets of smaller positive measure. The easiest example of such a space is the interval $[0,1]$ with Lebesgue measure, where the σ-algebra is the collection of Lebesgue sets. In fact, up to isomorphism and setting aside examples which are for our purposes pathological, this is the only example of such a space.

What does it mean for measure spaces to be isomorphic? The most immediate (and essentially correct) idea is to require the existence of a bijection between the spaces which carries measurable sets into measurable sets both ways and preserves the measure. There are some technicalities related to the different ways of defining σ-algebras of measurable sets (e.g. Borel or Lebesgue on the interval), and to ignoring some “bad” sets of measure zero (e.g. the cardinality of the space may be artificially increased by adding a set of points of measure zero of large cardinality). These inessential problems aside, there are examples of isomorphism which look striking on the surface.

###### Example 1.2.

Consider the unit interval $[0,1]$ and the unit square $[0,1]^2$ with Lebesgue measure. The standard construction of the Peano curve, with division of the interval into basic intervals and of the square into equal squares, provides a continuous surjective map $[0,1] \to [0,1]^2$ which preserves the measure of any union of basic intervals and hence of any measurable set. While this map is obviously not bijective, it is bijective between complements of certain measure zero sets: namely, the union of the endpoints of all basic intervals, and the union of the boundaries of all squares involved in the construction.

To discuss the general case, we first need some definitions.

###### Definition 1.3.

Introduce a pseudo-metric $d_\mu$ on the (measured) σ-algebra $\mathcal{T}$ by the formula

$$d_\mu(A, B) = \mu(A \triangle B),$$

where $A \triangle B$ denotes the symmetric difference $(A \setminus B) \cup (B \setminus A)$. We say that two sets $A, B \in \mathcal{T}$ are equivalent mod zero if $\mu(A \triangle B) = 0$, and write $A = B \pmod 0$. (We say “measured” because we deal with measure spaces, not just measurable spaces: we consider not just the σ-algebra $\mathcal{T}$, but also the measure it carries. This will be implicit in our discussion throughout this section.)

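If we pass to the quotient space of $\mathcal{T}$ by this equivalence relation, we obtain a true metric space; we say that the σ-algebra $\mathcal{T}$ is separable if this metric space is separable. That is, $\mathcal{T}$ is separable if and only if there exists a countable set $\{A_n\} \subset \mathcal{T}$ which is dense in $\mathcal{T}$ with the $d_\mu$ pseudo-metric.

The completion of $\mathcal{T}$ is the σ-algebra generated by $\mathcal{T}$ together with all subsets of null sets (that is, sets $N \in \mathcal{T}$ with $\mu(N) = 0$). We say that two σ-algebras $\mathcal{T}$, $\mathcal{T}'$ are equivalent mod zero if they have the same completion, and write $\mathcal{T} = \mathcal{T}' \pmod 0$.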
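A toy computation may make the difference between a pseudo-metric and a metric tangible. The sketch below assumes a hypothetical three-point atomic space in which one point carries zero mass; the names `mu`, `measure` and `d_mu` are illustrative only.

```python
from fractions import Fraction

# Hypothetical atomic measure: the point "c" is a null set.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}

def measure(A):
    """Measure of a finite set of points."""
    return sum((mu[x] for x in A), Fraction(0))

def d_mu(A, B):
    """Pseudo-metric d_mu(A, B) = mu(A symmetric-difference B)."""
    return measure(set(A) ^ set(B))

A, B, C = {"a"}, {"a", "c"}, {"b"}

print(d_mu(A, B))  # distinct sets at distance zero: equivalent mod zero
print(d_mu(A, C))
```

Here `A` and `B` differ only by a null set, so they are distinct sets at distance zero; passing to the quotient identifies them, which is exactly what turns the pseudo-metric into a metric.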

###### Exercise 1.1.

Recall that if $\mathcal{A}$ is any collection of sets, then $\sigma(\mathcal{A})$, the σ-algebra generated by $\mathcal{A}$, is the smallest σ-algebra that contains $\mathcal{A}$. Show that if $\mathcal{T}$ is equivalent mod zero to a σ-algebra generated by a countable collection of sets, then $\mathcal{T}$ is separable.

In [Ro1], a stronger definition of separable is used, which implies the condition in Exercise 1.1. We follow the definition in [Ha]. When the measure is non-atomic, the two definitions are equivalent.

###### Exercise 1.2.

Show that if $X$ is a separable metric space, $\mathcal{T}$ is the σ-algebra of Borel subsets of $X$, and $\mu$ is any probability measure, then $\mathcal{T}$ is separable as in Definition 1.3.

For the sake of simplicity in what follows, we will always consider sets (of positive measure) and σ-algebras up to equivalence mod zero (in particular, we will not distinguish between a σ-algebra and its completion), and will write $A = B$ in place of $A = B \pmod 0$.

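We say that two σ-algebras $\mathcal{T}$ and $\mathcal{T}'$, on spaces $X$ and $X'$ with measures $\mu$ and $\mu'$, are isomorphic if there exists a bijection $\rho\colon \mathcal{T} \to \mathcal{T}'$ which preserves measure: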

 (1.2) μ′(ρ(E))=μ(E)

and which respects unions and complements (and hence intersections as well)

$$\rho\Bigl(\bigcup_{n=1}^\infty E_n\Bigr) = \bigcup_{n=1}^\infty \rho(E_n), \tag{1.3}$$

$$\rho(X \setminus E) = X' \setminus \rho(E). \tag{1.4}$$

Obviously one way (but not the only way) to produce an isomorphism of σ-algebras is to have an isomorphism between measure spaces as in Example 1.2, and then to let $\rho$ send each measurable set to its image under that isomorphism.

Up to this notion of isomorphism, all the σ-algebras with which we are concerned can be studied by one “master” example (a state of affairs which Tolkien would surely render quite poetically, particularly if we were to adopt a slightly different line of exposition and consider σ-rings instead of σ-algebras):

###### Theorem 1.4.

A separable (measured, complete) σ-algebra with no atoms is isomorphic to the σ-algebra of Lebesgue sets on the unit interval.

###### Proof.

For complete details of the proof, see [Ha, Section 41, Theorem C]. Here we give the main ideas.

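Given a countable collection of sets $A_n \in \mathcal{T}$, write $A_n^0 = A_n$ and $A_n^1 = X \setminus A_n$, and let $\xi_n = \bigvee_{k=1}^n \{A_k^0, A_k^1\}$. This is an increasing sequence of partitions of $X$, and because $\mathcal{T}$ is separable, we may take the sets $A_n$ to be such that $\bigcup_n \mathcal{B}(\xi_n)$ is dense in $\mathcal{T}$. Moreover, $\xi_n$ has $2^n$ elements (some of which may be empty), which can be indexed by words $w = w_1 \cdots w_n \in \{0,1\}^n$ as follows:

$$A_w = A_1^{w_1} \cap A_2^{w_2} \cap \cdots \cap A_n^{w_n}.$$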

We can define a map $\rho$ from $\bigcup_n \mathcal{B}(\xi_n)$ to the σ-algebra of Lebesgue sets on the unit interval as follows.

1. For a fixed $n$, order the elements of $\xi_n$ lexicographically: for example, with $n = 3$ we have

 $$A_{000} \prec A_{001} \prec A_{010} \prec A_{011} \prec \cdots \prec A_{111}.$$
2. Identify these sets with subintervals of $[0,1]$ with the same measure and the same order: thus

 $$\rho(A_{000}) = [0, \mu(A_{000})), \qquad \rho(A_{001}) = [\mu(A_{000}), \mu(A_{000}) + \mu(A_{001})),$$

and so on, as shown in Figure 1. In general we have

$$\rho(A_w) = \Bigl[\sum_{v \prec w} \mu(A_v), \sum_{v \preceq w} \mu(A_v)\Bigr), \tag{1.5}$$

where the sums are over words $v$ of the same length as $w$. Note that the image is empty whenever the sums are equal, which happens exactly when $\mu(A_w) = 0$; Figure 1 includes an instance of this.

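It is clear from the construction that $\rho$ satisfies (1.3) whenever the union is an element of $\bigcup_n \mathcal{B}(\xi_n)$. Furthermore, $\rho$ is an isometry with respect to the metrics $d_\mu$ and $d_\lambda$, where $\lambda$ is Lebesgue measure on the interval, and so it can be extended from the dense set $\bigcup_n \mathcal{B}(\xi_n)$ to all of $\mathcal{T}$. The identities (1.2) and (1.4) hold on $\bigcup_n \mathcal{B}(\xi_n)$, and extend to $\mathcal{T}$ by continuity.

Finally, because $\mu$ is non-atomic, the measure of the sets in $\xi_n$ goes to 0, and so the length of their images under $\rho$ goes to 0 as well, which shows that the image of $\rho$ is the entire σ-algebra of Lebesgue sets. ∎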
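The bookkeeping in (1.5) amounts to laying the sets of a fixed level side by side in lexicographic order. A minimal Python sketch, under the assumption that the measures of the sets indexed by words of length $n$ are given as a dictionary (the function name `rho_intervals` and the sample values are hypothetical):

```python
from fractions import Fraction
from itertools import product

def rho_intervals(mu_of_word, n):
    """Return {w: (left, right)}, where rho(A_w) = [left, right) as in (1.5).

    mu_of_word maps each word w in {0,1}^n (a tuple) to mu(A_w).
    """
    intervals = {}
    left = Fraction(0)
    # itertools.product yields the words in lexicographic order.
    for w in product((0, 1), repeat=n):
        right = left + mu_of_word[w]
        intervals[w] = (left, right)
        left = right
    return intervals

# Hypothetical measures of A_00, A_01, A_10, A_11; note mu(A_01) = 0.
mu2 = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(0),
    (1, 0): Fraction(1, 2),
    (1, 1): Fraction(1, 4),
}
iv = rho_intervals(mu2, 2)
for w, (a, b) in iv.items():
    print(w, a, b)
```

The set of measure zero is sent to an interval with equal endpoints, that is, to the empty set, matching the remark following (1.5).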

###### Exercise 1.3.

Show that the sum appearing in (1.5) can also be written as

$$\sum_{v \prec w} \mu(A_v) = \sum_{\substack{1 \le k \le n \\ w_k = 1}} \mu(A_{w_1 \cdots w_{k-1} 0}),$$

where $n$ is the length of the word $w$.

The proof of Theorem 1.4 describes the construction of a -algebra isomorphism. At a first glance, it may appear that this also creates an isomorphism between the measure spaces themselves, but in fact, the proof as it stands may not produce such an isomorphism. The problem is the potential presence of “holes” in the space distributed among its points in a non-measurable way. In order to give a classification result for measure spaces themselves, we need one further condition in addition to separability which prevents the appearance of such “holes”.

Let $\xi_1 = \{C_1, \dots, C_n\}$ be a finite partition of $X$ into measurable sets, and let $\mathcal{T}_1$ be the σ-algebra which contains all unions of elements of $\xi_1$, so that $\mathcal{T}_1$ contains $2^n$ sets. Partitioning each $C_i$ into $C_{i,1}, \dots, C_{i,k_i}$, we obtain a finer partition $\xi_2$ and a larger σ-algebra $\mathcal{T}_2$ whose elements are unions of none, some, or all of the $C_{i,j}$. Iterating this procedure, we have a sequence of partitions

$$\xi_1 < \xi_2 < \cdots, \tag{1.6}$$

each of which is a refinement of the previous partition, and a sequence of σ-algebras

$$\mathcal{T}_1 \subset \mathcal{T}_2 \subset \cdots. \tag{1.7}$$

This is obviously reminiscent of the construction from the proof of Theorem 1.4. An even more useful image to keep in mind here is the standard picture of the construction of a Cantor set, in which the unit interval is first divided into two pieces, then four, then eight, and so on – these “cylinders” (to use the terminology arising from symbolic dynamics) are the various sets $C_{i_1}$, $C_{i_1,i_2}$, etc.

We may consider the “limit” of the sequence (1.6):

$$\xi = \bigvee_{n=1}^\infty \xi_n. \tag{1.8}$$

Each element of $\xi$ corresponds to a “funnel”

$$C_{i_1} \supset C_{i_1,i_2} \supset C_{i_1,i_2,i_3} \supset \cdots \tag{1.9}$$

of decreasing subsets within the sequence of partitions; the intersection of all the sets in such a funnel is an element of $\xi$.

The sequence (1.6) is a basis if it generates both the σ-algebra $\mathcal{T}$ and the space $X$, as follows:

1. it generates the σ-algebra $\mathcal{T}$; that is, the associated σ-algebras $\mathcal{T}_n$ from (1.7) have the property that $\bigcup_n \mathcal{T}_n$ generates $\mathcal{T}$;

2. it generates the space $X$; that is, every “funnel” as in (1.9) has intersection containing at most one point. Equivalently, any two distinct points $x$ and $y$ are separated by some partition $\xi_n$, and so $\xi = \varepsilon$, the partition into points.

Note that the existence of an increasing sequence of finite or countable partitions satisfying (1) is equivalent to separability of the σ-algebra.

It is often convenient to choose a sequence such that at each stage, each cylinder set is partitioned into exactly two smaller sets. This gives a one-to-one correspondence between sequences in $\{0,1\}^{\mathbb{N}}$ and “funnels” as in (1.9).

###### Exercise 1.4.

Determine the correspondence between the above definition of a basis and the definition given in §1.2 of [Ro2].

Since each “funnel” corresponds to some element of $\xi$ which is either a singleton or empty, we have associated to each Borel subset of $\{0,1\}^{\mathbb{N}}$ an element of $\mathcal{T}$, and so $\mu$ yields a measure on $\{0,1\}^{\mathbb{N}}$. Thus we have a notion of “almost all funnels” – we say that the basis is complete if almost every funnel contains exactly one point. (This is not to be confused with the notion of completeness for σ-algebras.) That is, the set of funnels whose intersection is empty should be measurable, and should have measure zero. Equivalently, a basis defines a map from $X$ to $\{0,1\}^{\mathbb{N}}$ which takes each point to the “funnel” containing it; the basis is complete if the image of this map has full measure.

The existence of a complete basis is the final invariant needed to classify “nice” measure spaces.

###### Theorem 1.5.

If $(X, \mathcal{T}, \mu)$ is separable, non-atomic, and possesses a complete basis, then it is isomorphic to Lebesgue measure on the unit interval.

###### Proof.

Full details can be found in [Ro1]; here we describe the main idea, which is that with the completeness assumption, the argument from the proof of Theorem 1.4 indeed gives an isomorphism of measure spaces. Using the notation from that proof, every infinite intersection $\bigcap_{n \ge 1} A_{x_1 \cdots x_n}$ corresponds to a point in the interval, namely

$$\lim_{n \to \infty} \sum_{v \prec x_1 \cdots x_n} \mu(A_v) = \sum_{\substack{n \ge 1 \\ x_n = 1}} \mu(A_{x_1 \cdots x_{n-1} 0}). \tag{1.10}$$

With the exception of a countable set (the endpoints of basic intervals of various ranks), every point is the image of at most one “funnel”. Completeness guarantees that this correspondence is indeed a bijection between sets of full measure. Measurability follows from the fact that images of the sets from a basis are finite unions of intervals. ∎
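As an illustration of (1.10), consider the hypothetical case where the measure of a cylinder is given by a Bernoulli rule: each symbol 0 contributes a factor $p$ and each symbol 1 a factor $1 - p$. This choice, and the names `cylinder_measure` and `point_of_funnel`, are assumptions made for the sketch, which computes the partial sum on the right-hand side of (1.10) for a finite prefix of a sequence.

```python
from fractions import Fraction

def cylinder_measure(word, p):
    """Assumed Bernoulli measure of A_word: weight p per 0, 1 - p per 1."""
    m = Fraction(1)
    for digit in word:
        m *= p if digit == 0 else 1 - p
    return m

def point_of_funnel(x, p):
    """Partial sum of the right-hand side of (1.10) over the prefix x:
    for each n with x_n = 1, add mu(A_{x_1 ... x_{n-1} 0})."""
    total = Fraction(0)
    for n, digit in enumerate(x):
        if digit == 1:
            total += cylinder_measure(tuple(x[:n]) + (0,), p)
    return total

half = Fraction(1, 2)
print(point_of_funnel((1, 0, 1), half))            # binary expansion 0.101
print(point_of_funnel((1, 1), Fraction(1, 3)))
```

With $p = 1/2$ the sum is just the binary expansion of the point, so the funnel coded by 101 lands at 5/8; for other values of $p$ the same formula gives the left endpoint of the interval assigned to the cylinder by (1.5).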

In fact, all the measure spaces of interest to us are separable and complete, as the following series of exercises shows.

###### Exercise 1.5.

Let $X$ be a metric space and fix $x \in X$. Given $r > 0$, let $\partial B(x, r)$ be the boundary of the ball of radius $r$ centred at $x$. Show that for any probability measure $\mu$ and any fixed $x$, at most countably many of the $\partial B(x, r)$ have positive measure.

###### Exercise 1.6.

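Let $X$ be a separable metric space, $\mathcal{T}$ the σ-algebra of Borel sets, and $\mu$ a probability measure. Use Exercise 1.5 to show that $(X, \mathcal{T}, \mu)$ has a basis $\{\xi_n\}$ such that all boundaries have zero measure – that is, $\mu(\partial C) = 0$ for all $C \in \xi_n$ and $n \ge 1$.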

###### Exercise 1.7.

Let $(X, \mathcal{T}, \mu)$ be as in Exercise 1.6, and suppose that in addition $X$ is complete (as a metric space). Show that the basis constructed in Exercise 1.6 is complete (as a basis).

###### Definition 1.6.

A separable measure space with a complete basis is called a Lebesgue space.

In light of Definition 1.6, we can rephrase Exercise 1.7 as the result that every separable complete metric space equipped with a Borel probability measure is a Lebesgue space. By Theorem 1.5, every Lebesgue space is isomorphic to the union of the unit interval with at most countably many atoms.

It is also worth noting that any separable measure space admits a completion, just as is the case for metric spaces. The procedure is quite simple; take a basis for $X$ which is not complete, and add to $X$ one point corresponding to each empty “funnel”. Thus we need not concern ourselves with non-complete spaces.

###### Exercise 1.8.

Show that every separable measure space which is not complete is isomorphic to a set of outer measure one in a Lebesgue space.

Thus we have elucidated the promised difference between the language of sets and that of points: separability is sufficient for the first to lead to the standard model, while for the second, completeness is also needed. This distinction is important theoretically; in particular, it allows us to separate results which hold for arbitrary separable measure spaces (such as most ergodic theorems) from those which hold in Lebesgue spaces (such as von Neumann’s isomorphism theorem for dynamical systems with pure point spectrum).

However, non-Lebesgue measure spaces are at least as pathological for “normal mathematics” as non-measurable sets or sets of cardinality higher than continuum. In particular, as a consequence of Exercises 1.5–1.7, the measure spaces which arise in conjunction with dynamics are all Lebesgue spaces, so from now on we will restrict our attention to those.

### 1.3. Partitions and σ-algebras

We have already seen a simple procedure for associating to each partition $\xi$ of $X$ a sub-σ-algebra $\mathcal{B}(\xi)$ of $\mathcal{T}$, and it is natural to ask whether there is a natural class of partitions on which the morphism $\xi \mapsto \mathcal{B}(\xi)$ is one-to-one, so that it can be inverted. The answer turns out to be positive if one considers equivalence classes of partitions mod zero.

###### Definition 1.7.

Two partitions $\xi$, $\eta$ of $X$ are equivalent mod zero if there exists a set $E \subset X$ of full measure such that

 {C∩E∣C∈ξ}={D∩E∣D∈η},

in which case we write $\xi = \eta \pmod 0$.

As with sets and σ-algebras, we will always consider partitions up to equivalence mod zero, and will again write $\xi = \eta$ in place of $\xi = \eta \pmod 0$.

###### Theorem 1.8.

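Given a separable measure space $(X, \mathcal{T}, \mu)$ and a sub-σ-algebra $\mathcal{A} \subset \mathcal{T}$, there exists a partition $\xi$ of $X$ into measurable sets such that $\mathcal{A}$ and $\mathcal{B}(\xi)$ are equivalent mod zero.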

###### Proof.

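Without loss of generality, we may assume that $\mathcal{A}$ is non-atomic (if $\mathcal{A}$ contains any atoms, these can be taken as elements of $\xi$, and there can only be countably many disjoint atoms). Since $\mathcal{T}$ is separable, so is $\mathcal{A}$. (The metric space associated to $\mathcal{A}$ is a subspace of the one associated to $\mathcal{T}$.) In particular, we may take a basis $\{\xi_n\}$ for $\mathcal{A}$ and define $\xi$ by (1.8). It remains only to show that $\mathcal{B}(\xi) = \mathcal{A}$, which we leave as an exercise. ∎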

###### Exercise 1.9.

Complete the proof of Theorem 1.8.

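We will denote the partition constructed in Theorem 1.8 by $\Xi(\mathcal{A})$. Analogously to the proofs of Theorems 1.4 and 1.5, the elements of $\Xi(\mathcal{A})$ may be described explicitly as follows: without loss of generality, assume that the partitions $\xi_n$ and sets $A_n \in \mathcal{A}$ are such that each $\xi_n$ has the form $\{A_w \mid w \in \{0,1\}^n\}$, and given $x \in \{0,1\}^{\mathbb{N}}$, let $A_x = \bigcap_{n \ge 1} A_{x_1 \cdots x_n}$. Note that unlike in those proofs, the intersection $A_x$ may contain more than one point – indeed, some intersections must contain more than one point unless $\mathcal{A} = \mathcal{T}$.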

### 1.4. Measurable partitions

We now have a natural way to go from a partition $\xi$ to a σ-algebra $\mathcal{B}(\xi)$, and from a σ-algebra $\mathcal{A}$ to a partition $\Xi(\mathcal{A})$.

The definition of $\Xi$ in Theorem 1.8 guarantees that it is a one-sided inverse to $\mathcal{B}$, in the sense that $\mathcal{B}(\Xi(\mathcal{A})) = \mathcal{A}$ for any σ-algebra $\mathcal{A}$ (up to equivalence mod zero). So we may ask if the same holds for partitions; is it true that $\xi$ and $\Xi(\mathcal{B}(\xi))$ are equivalent in some sense?

We see that since each set in $\Xi(\mathcal{B}(\xi))$ is measurable, we should at least demand that $\xi$ not contain any non-measurable sets. For example, consider the partition $\xi = \{A, B\}$, where $A \subset X$, $B = X \setminus A$: then if $A$ is measurable (and hence $B$ as well), we have

 B(ξ)={∅,A,B,X}

and $\Xi(\mathcal{B}(\xi)) = \xi$, while if $A$ is non-measurable, we have

 B(ξ)={∅,X}

and so $\Xi(\mathcal{B}(\xi)) = \{X\} \ne \xi$. Thus a “good” partition should only contain measurable sets; it turns out, however, that this is not sufficient, and that there are examples where $\xi$ is not equivalent mod zero to $\Xi(\mathcal{B}(\xi))$, even though every set in $\xi$ is measurable.

###### Example 1.9.

Consider the torus $\mathbb{T}^2$ with Lebesgue measure $\lambda$, and let $\xi$ be the partition into orbits of a linear flow $\varphi_t$ with irrational slope $\alpha$; that is, $\varphi_t(x, y) = (x + t, y + \alpha t) \pmod 1$. In order to determine $\mathcal{B}(\xi)$, we must determine which measurable sets are unions of orbits of $\varphi_t$; that is, which measurable sets are invariant. Because this flow is ergodic with respect to $\lambda$, any such set must have measure $0$ or $1$, and so up to sets of measure zero, $\mathcal{B}(\xi)$ is the trivial σ-algebra! It follows that $\Xi(\mathcal{B}(\xi))$ is the trivial partition $\{\mathbb{T}^2\}$.

A discrete-time version of this is the partition of the circle into orbits of an irrational rotation.

###### Definition 1.10.

The partition $\Xi(\mathcal{B}(\xi))$ is known as the measurable hull of $\xi$, and will be denoted by $\mathcal{H}(\xi)$. If $\xi$ is equivalent mod zero to its measurable hull, we say that it is a measurable partition.

In particular (foreshadowing the next section), if $\xi$ is the partition into orbits of some dynamical system, then $\mathcal{H}(\xi)$ is also known as the ergodic decomposition of that system. (It should be noted that because we have not yet talked about conditional measures, one may rightly ask just what about this decomposition is ergodic.)

It is obvious that in general, the measurable hull of $\xi$ is a coarsening of $\xi$; the definition says that if $\xi$ is non-measurable, this is a proper coarsening. (Compare this with the action of the Legendre transform on functions – taking the double Legendre transform of any function returns its convex hull, which lies on or below the original function, with equality if and only if the original function was convex.)

###### Exercise 1.10.

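Show that the measurable hull $\mathcal{H}(\xi)$ is the finest measurable partition which coarsens $\xi$. In particular, show that if $\eta$ is any partition with

$$\mathcal{H}(\xi) < \eta \le \xi,$$

then $\mathcal{H}(\eta) = \mathcal{H}(\xi)$, and hence $\eta$ is non-measurable.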

The map $\mathcal{B}$ sends the class of all partitions to the class of all σ-algebras, and $\Xi$ gives a map in the opposite direction, which is the one-sided inverse of $\mathcal{B}$. We see that the set of measurable partitions is just the image of the map $\Xi$, on which $\Xi \circ \mathcal{B}$ acts as the identity, and on which $\mathcal{B}$ and $\Xi$ are two-sided inverses.

Thus we have a correspondence between measurable partitions and σ-algebras – one may easily verify that the operations $\vee$ and $\wedge$ on measurable partitions correspond directly to the operations $\vee$ and $\cap$ on σ-algebras, and that the relations $\le$ and $\ge$ correspond directly to the relations $\subset$ and $\supset$.

Example 1.9 shows that the orbit partition for an irrational toral flow is non-measurable; in fact, this is true for any ergodic system with more than one orbit, since in this case $\mathcal{B}(\xi)$ is the trivial σ-algebra, whence $\Xi(\mathcal{B}(\xi))$ is the trivial partition and $\Xi(\mathcal{B}(\xi)) \ne \xi$. This sort of phenomenon is widespread in dynamical systems – for example, we will see in §3 that in the context of smooth dynamics, the partition into unstable manifolds is non-measurable whenever entropy is positive.

An alternate characterisation of measurability may be motivated by recalling that in the “toy” example of a partition into two subsets, the corresponding σ-algebra had four elements in the measurable case, and only two in the non-measurable case. In some sense, measurability of the partition corresponds to increased “richness” in the associated σ-algebra. This is made precise in Theorem 1.11. (In [Ro2, p. 4], the property described in Theorem 1.11 is given as the definition of measurable. The result here shows that the two definitions are equivalent.)

###### Theorem 1.11.

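Let $\xi$ be a partition of a Lebesgue space $(X, \mathcal{T}, \mu)$. Then $\xi$ is measurable if and only if there exists a countable set $\{A_n\}_{n \ge 1} \subset \mathcal{B}(\xi)$ such that for almost every pair $C_1, C_2 \in \xi$, we can find some $A_n$ which separates them in the sense that $C_1 \subset A_n$, $C_2 \subset X \setminus A_n$.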

###### Sketch of proof.

The key observation is the fact that such a set $\{A_n\}$ corresponds to a refining sequence of partitions (1.6) defined by

$$\eta_k = \{A_k, X \setminus A_k\}, \qquad \xi_n = \bigvee_{k=1}^n \eta_k.$$

∎

###### Exercise 1.11.

Complete the proof of Theorem 1.11.

It may not immediately be clear what is meant by “almost every pair” in the statement of Theorem 1.11. Recall that the natural projection $\pi\colon X \to \xi$ takes $x$ to the unique partition element containing $x$. (A word on notation. There is a natural correspondence between partitions and equivalence relations; if we use $\xi$ to denote the partition, then $\pi$ takes values in $\xi$, whereas if we use $\sim$ to denote the equivalence relation, then $\pi$ takes values in the quotient space $X/{\sim}$.) Thus $\xi$, which may be thought of as the space of equivalence classes, carries a measure $\mu_\xi$ which is the pushforward of $\mu$ under $\pi$ – given a measurable set $E \subset \xi$, we have

$$\mu_\xi(E) = \mu(\pi^{-1}(E)).$$

This gives a meaning to the notion of “almost every” partition element, and hence to “almost every pair” of partition elements. Another way to parse the statement is to see that we may remove some set $N$ of zero measure from $X$ and pass to the “trimmed-down” partition $\{C \setminus N \mid C \in \xi\}$, for which the statement holds for every pair of elements.
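On a finite space the pushforward can be written out in a few lines. The following sketch assumes a hypothetical six-point space with uniform measure and a partition into three elements; `pi` and `mu_xi` are illustrative names for the projection and the pushforward measure.

```python
from fractions import Fraction

# Hypothetical space X = {0,...,5} with uniform probability measure.
mu = {x: Fraction(1, 6) for x in range(6)}
xi = [frozenset({0, 1, 2}), frozenset({3}), frozenset({4, 5})]

def pi(x):
    """Natural projection: the unique partition element containing x."""
    return next(C for C in xi if x in C)

def mu_xi(E):
    """Pushforward measure: mu_xi(E) = mu(pi^{-1}(E)) for a set E of
    partition elements."""
    return sum((mu[x] for x in mu if pi(x) in E), Fraction(0))

print(mu_xi({xi[0]}))
print(mu_xi({xi[1], xi[2]}))
```

A set of partition elements is thus “null” exactly when the union of those elements is a null set in the original space.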

Aside from finite or countable partitions into measurable sets (which are obviously measurable), a good example of a measurable partition is given by Example 1.1, in which the square with Lebesgue measure is partitioned into vertical lines. In fact, this is in some sense the only measurable partition, just as $[0,1]$ is, up to isomorphism, the only Lebesgue space – the following result states that a measurable partition can be decomposed into a “discrete” part, where each element has positive measure, and a “continuous” part, which is isomorphic to the partition of the square into lines.

###### Theorem 1.12.

Given a measurable partition $\xi$ of a Lebesgue space $(X, \mathcal{T}, \mu)$, there exists a set $E \subset X$ such that

1. Each element of the restriction of $\xi$ to $E$ has positive measure (and hence there are at most countably many such elements).

2. The restriction of $\xi$ to $X \setminus E$ is isomorphic to the partition of the unit square with Lebesgue measure into vertical lines given in Example 1.1.

###### Proof.

We give a complete proof modulo a technical lemma (Lemma 1.13), whose proof we only sketch. Let $E$ be the union of the elements of $\xi$ that have positive measure. To prove the theorem it suffices to restrict our attention to $X \setminus E$, and so from now on we assume that $E$ is empty and all elements of $\xi$ have measure $0$.

The proof is a more sophisticated version of the argument in the proofs of Theorems 1.4 and 1.5. Let $\{A_n\}$ be as in Theorem 1.11, and as in the proof of Theorem 1.4, write $A_w = A_1^{w_1} \cap \cdots \cap A_n^{w_n}$, where $A_k^0 = A_k$ and $A_k^1 = X \setminus A_k$. The idea is that mimicking the proof of Theorem 1.5, we will construct an isomorphism that sends $A_w$ to the vertical strip in $[0,1]^2$ whose horizontal footprint is the interval with length $\mu(A_w)$ and left endpoint at $\sum_{v \prec w} \mu(A_v)$. What remains is to describe the vertical coordinate of the isomorphism.

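Before doing this, first observe that the previous paragraph defines a map $\pi\colon X \to [0,1]$ which is constant on each intersection $\bigcap_{n \ge 1} A_{x_1 \cdots x_n}$, where it takes the value given by (1.10). Since by Theorem 1.11 almost every element of $\xi$ is exactly such an intersection, we see that $\xi$ is equivalent mod zero to

$$\pi^{-1}(\varepsilon_{[0,1]}) = \{\pi^{-1}(a) \mid a \in [0,1]\}, \tag{1.11}$$

the partition into preimages for the map $\pi$.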

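So far we have associated to almost every point of $X$ a sequence $x \in \{0,1\}^{\mathbb{N}}$ that determines in which element of $\xi$ the point lies. Now fix another sequence of sets $B_n \in \mathcal{T}$, this time requiring that they generate the entire σ-algebra: $\bigvee_{n \ge 1} \{B_n^0, B_n^1\} = \varepsilon$, the partition into points, where $B_n^0 = B_n$ and $B_n^1 = X \setminus B_n$. As before, given $w \in \{0,1\}^n$, let $B_w = B_1^{w_1} \cap \cdots \cap B_n^{w_n}$. Thus to almost every point we can also associate $y \in \{0,1\}^{\mathbb{N}}$ with the property that the point is the only element of $\bigcap_{n \ge 1} B_{y_1 \cdots y_n}$.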

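The map $\pi$ only depends on the sequence $x$, and so abusing notation slightly, we will find a function $\phi(x, y)$ of the two sequences such that the map sending a point with associated sequences $x$ and $y$ to $(\pi(x), \phi(x, y))$ is the desired isomorphism. Just as $\pi$ is a sum of measures of partition elements, so too will $\phi$ be a sum of conditional measures of partition elements.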

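Given a word $w \in \{0,1\}^n$, we define functions $\varphi_m^w$ by

$$\varphi_m^w(x) = \frac{\mu(B_w \cap A_{x_1 \cdots x_m})}{\mu(A_{x_1 \cdots x_m})}.$$

Thus $\varphi_m^w(x)$ is the conditional measure of $B_w$ within the partition element $A_{x_1 \cdots x_m}$. Figure 2 illustrates the procedure for varying $m$ and fixed $w$; observe that the total shaded area under the function remains constant within each $A_{x_1 \cdots x_m}$ as $m$ increases.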

###### Lemma 1.13.

There exists a measurable function $\varphi^w$ such that $\varphi_m^w \to \varphi^w$ almost everywhere.

###### Sketch of proof.

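Full details are in [Vi, Lemma 4]. The idea is to show that for every $\alpha < \beta$, the set

$$S_{\alpha,\beta} = \Bigl\{x \Bigm| \liminf_{m \to \infty} \varphi_m^w(x) < \alpha < \beta < \limsup_{m \to \infty} \varphi_m^w(x)\Bigr\}$$

has zero measure, since the set of points without convergence is a countable union of such sets (taking $\alpha$ and $\beta$ rational). To show this, one observes that for every $x \in S_{\alpha,\beta}$ there exist $c_1(x) < d_1(x) < c_2(x) < d_2(x) < \cdots$ such that $\varphi_{c_i(x)}^w(x) < \alpha$ and $\varphi_{d_i(x)}^w(x) > \beta$ for all $i$. Let

$$C_i = \bigcup_{x \in S_{\alpha,\beta}} A_{x_1 \cdots x_{c_i(x)}}, \qquad D_i = \bigcup_{x \in S_{\alpha,\beta}} A_{x_1 \cdots x_{d_i(x)}},$$

so that since $C_i \supset D_i \supset C_{i+1}$ for all $i$, we have

$$\alpha \mu(C_i) > \mu(B_w \cap C_i) > \mu(B_w \cap D_i) > \beta \mu(D_i).$$

Writing $S = \bigcap_i C_i = \bigcap_i D_i \supset S_{\alpha,\beta}$, we have $\alpha \mu(S) \ge \beta \mu(S)$, so $\mu(S) = 0$. ∎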

The function $\varphi^w$ may be interpreted as the conditional measure of the set $B_w$ in the partition element $\pi^{-1}(\pi(x))$, a point which we elaborate on later. For the moment we conclude the proof of Theorem 1.12 by putting

$$\phi(x, y) = \sum_{\{n \mid y_n = 1\}} \varphi^{y_1 \cdots y_{n-1} 0}(x),$$

which takes the place of (1.10). Now the map sending a point with associated sequences $x$ and $y$ to $(\pi(x), \phi(x, y))$ is the desired isomorphism. Figure 3 illustrates the first two steps in the definition of $\phi$, although we point out that the functions $\varphi^w$ need only be measurable, not smooth as in the picture. ∎

In the course of the previous proof, we showed that a measurable partition can be found as the partition into preimages (1.11) associated to a certain map. The following result, whose proof (which uses Theorem 1.11) is left as an exercise, gives conditions under which the converse is true, and thus gives another criterion that can be used to establish measurability of a partition (a further criterion is found in Exercise 2.1).

###### Theorem 1.14.

Let $X$ be a complete metric space, $\mu$ a Borel measure on $X$, $Y$ a second countable topological space, and $\pi\colon X \to Y$ a Borel map (that is, preimages of Borel sets are Borel). Then the partition into preimages of $\pi$, defined as in (1.11), is measurable.

###### Example 1.15.

Let $K \subset [0,1]$ be the usual middle-third Cantor set, which has Lebesgue measure $0$ but contains uncountably many points. Then there is a bijection from $K$ to $[0,1] \setminus K$, and so we may take a partition $\xi$ of $[0,1]$ such that each element of $\xi$ contains exactly two points, one in $K$ and one not in $K$. Using the characterisation in Theorem 1.11, we see that $\xi$ is measurable, since we may take for our countable collection the set of intervals with rational endpoints. Further, this partition is equivalent mod zero to the partition into points.

The situation described in Example 1.15, where a partition is in some sense finer than it appears to be, happens all the time in ergodic theory. A fundamental example is the so-called Fubini’s nightmare, in which a partition which seems to divide the space into curves in fact admits a set of full measure intersecting each partition element exactly once, and hence is equivalent mod zero to the partition into points (we will return to this example in §3).

This sort of behaviour stands in stark contrast to absolute continuity – but in order to make any sense of that notion, we must first discuss conditional measures.

### 1.5. Conditional measures on measurable partitions

If a partition element $C$ carries positive measure (which can only be true of countably many elements), then we can define a conditional measure $\mu_C$ on $C$ by the obvious method; given a measurable set $E \subset C$, the conditional measure of $E$ is

$$\mu_C(E) := \frac{\mu(E)}{\mu(C)}. \tag{1.12}$$

However, for many partitions arising in the study of dynamical systems, such as the partitions into stable and unstable manifolds which will be discussed later, we would also like to be able to define a conditional measure on partition elements of zero measure, and to do so in a way which allows us to reconstruct the original measure.
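When every partition element carries positive measure, the reconstruction just mentioned is elementary: $\mu(E) = \sum_{C} \mu_C(E \cap C)\,\mu(C)$. The following Python sketch checks this on a small finite probability space. The space, the weights, the partition, and all function names are hypothetical and chosen only for illustration; exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction

# Hypothetical finite probability space: points 0..5 with weights summing to 1.
mu = {0: Fraction(1, 12), 1: Fraction(1, 6), 2: Fraction(1, 4),
      3: Fraction(1, 4), 4: Fraction(1, 6), 5: Fraction(1, 12)}

# A partition into three elements, each of positive measure.
xi = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]


def measure(E):
    """mu(E) for a subset E of the space."""
    return sum(mu[x] for x in E)


def conditional(C, E):
    """mu_C(E ∩ C) = mu(E ∩ C) / mu(C), as in (1.12); requires mu(C) > 0."""
    return measure(E & C) / measure(C)


def reconstruct(E):
    """Recover mu(E) as the sum over C of mu_C(E ∩ C) * mu(C)."""
    return sum(conditional(C, E) * measure(C) for C in xi)


E = frozenset({1, 2, 5})
print(measure(E), reconstruct(E))   # both equal 1/2
print(conditional(xi[1], E))        # 3/8
```

The point of the sections that follow is that the same reconstruction survives, with the sum replaced by an integral, when the partition elements have measure zero and (1.12) no longer makes sense.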

The model to keep in mind is the canonical example of a measurable partition, the square partitioned into vertical lines (Example 1.1). Then denoting by $\lambda$, $\lambda_1$, and $\lambda_2$ the Lebesgue measures on the square, the horizontal unit interval, and vertical intervals, respectively, Fubini’s theorem says that for any integrable $f$ we have

$$\int_{[0,1]^2} f(x,y) \, d\lambda(x,y) = \int_{[0,1]} \int_{[0,1]} f(x,y) \, d\lambda_2(y) \, d\lambda_1(x). \tag{1.13}$$

By Theorem 1.12, any measurable partition of a Lebesgue space is isomorphic to the standard example – perhaps with a few elements of positive measure hanging about, but these will not cause any trouble, as we already know how to define conditional measures on them. Taking the pullback of the Lebesgue measures $\lambda_1$ and $\lambda_2$ under this isomorphism, we obtain a factor measure $\mu_\xi$ on the set of partition elements, which corresponds to the horizontal unit interval, and a family of conditional measures $\mu_C$, $C \in \xi$, which correspond to the vertical unit intervals.

Note that the factor measure $\mu_\xi$ is exactly the measure on the space of partition elements which was described in the last section. Note also that although the measure $\lambda_2$ was the same for each vertical line (up to a horizontal translation), we can make no such statement about the measures $\mu_C$, as the geometry is lost in the purely measure theoretic isomorphism between $X$ and the square. The key property of these measures is that for any integrable function $f$, the function

$$\xi \to \mathbb{R}, \qquad C \mapsto \int_C f \, d\mu_C \tag{1.14}$$

is measurable, and we have

$$\int_X f \, d\mu = \int_\xi \int_C f \, d\mu_C \, d\mu_\xi. \tag{1.15}$$

Each $\mu_C$ is “supported” on $C$ in the sense that $\mu_C(C) = 1$, but the reader is cautioned that the measure theoretic support of a measure (which is not uniquely defined) is a different beast than the topological support of a measure, and that $\operatorname{supp} \mu_C$ may not be equal to $C \cap \operatorname{supp} \mu$, as the following example shows.

###### Example 1.16.

Let $A \subset [0,1]$ be such that both it and its complement intersect every interval in a set of positive measure. (Such a set can be constructed, for instance, by repeatedly removing and replacing appropriate Cantor sets of positive measure.) Let $\lambda_1$ be one-dimensional Lebesgue measure, and define a measure $\mu$ on the unit square by

$$\mu(E) = \lambda_1\bigl(E \cap (A \times \{0\})\bigr) + \lambda_1\bigl(E \cap (A^c \times \{1\})\bigr)$$

for each measurable $E \subset [0,1]^2$. Then the topological support of $\mu$ is the union of two horizontal lines, $[0,1] \times \{0\}$ and $[0,1] \times \{1\}$, and intersects each partition element in two points, but the conditional measures are $\delta$-measures supported on a single point.

We cannot in general write a simple formula for the conditional measures, as we could in the case where partition elements carried positive weight, so on what grounds do we say that these conditional measures exist? The justification above relies on the characterisation of measurable partitions given by Theorem 1.12. Related proofs that do not require constructing an isomorphism to the square are presented in Viana’s notes [Vi] (which draw on Rokhlin’s paper [Ro1]) and in Furstenberg’s book [Fu]. These use methods from functional analysis, principally the Riesz representation theorem, made available by defining a topology on .

### 1.6. Measure classes and absolute continuity

Let $X$ be a measurable space, and consider the set of all measures on $X$. This set has various internal structures which may be of importance to us; for the time being, we focus our attention on the fact, guaranteed by the Radon–Nikodym Theorem, that measures come in classes. This theorem addresses the relationship between two measures $\nu$ and $\mu$, and allows us to pass from a qualitative statement to a quantitative one; namely, if $\nu$ is absolutely continuous with respect to $\mu$ (this means that if $\mu(E) = 0$, then $\nu(E) = 0$ as well, a state of affairs which is denoted $\nu \ll \mu$), then there exists a measurable function $\frac{d\nu}{d\mu}$, known as the Radon–Nikodym derivative, which has the property that

$$\nu(E) = \int_E \frac{d\nu}{d\mu}(x) \, d\mu(x)$$

for any measurable set $E$. (As an aside, note that if we change the $\sigma$-algebra, then we also change the Radon–Nikodym derivative, since $\frac{d\nu}{d\mu}$ must be measurable with respect to it. This fact is crucial to the proof of the Birkhoff Ergodic Theorem in [KH].)
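On a finite space the Radon–Nikodym derivative is simply a ratio of point weights, which makes both the defining identity and the dependence on the $\sigma$-algebra easy to see. The Python sketch below is only an illustration under that finite assumption; the two measures, the coarse $\sigma$-algebra (given by its atoms), and the function names are all hypothetical.

```python
from fractions import Fraction

# Two hypothetical measures on the four-point space {a, b, c, d}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(0)}
nu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2), "d": Fraction(0)}


def is_absolutely_continuous(nu, mu):
    """nu << mu: every mu-null point is nu-null."""
    return all(nu[x] == 0 for x in mu if mu[x] == 0)


def density(nu, mu):
    """d(nu)/d(mu) for the full sigma-algebra; set to 0 on mu-null points."""
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}


def coarse_density(nu, mu, atoms):
    """d(nu)/d(mu) for the sigma-algebra generated by `atoms`:
    the derivative must be constant on each atom."""
    d = {}
    for A in atoms:
        m_A = sum(mu[x] for x in A)
        n_A = sum(nu[x] for x in A)
        for x in A:
            d[x] = n_A / m_A if m_A != 0 else Fraction(0)
    return d


def integrate(f, mu, E):
    """Integral of f over E with respect to mu."""
    return sum((f[x] * mu[x] for x in E), Fraction(0))


fine = density(nu, mu)
coarse = coarse_density(nu, mu, [{"a", "b"}, {"c", "d"}])
print(fine["a"], coarse["a"])   # 1/2 versus 2/3
```

Here `fine` recovers $\nu(E)$ for every subset $E$, while `coarse` does so only for unions of the two atoms; the two functions differ, which is the finite shadow of the remark that changing the $\sigma$-algebra changes the derivative.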

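Given a reference measure $\mu$ and any other measure $\nu$, we also have the Radon–Nikodym decomposition of $\nu$; that is, we may write $\nu = \nu_1 + \nu_2$, where $\nu_1 \ll \mu$ and $\nu_2 \perp \mu$ (the latter means that there exists a measurable set $A$ such that $\mu(A) = 0$ and $\nu_2(X \setminus A) = 0$).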

The notion of absolute continuity plays an important role in smooth dynamics, where we have a reference measure class given by the smooth structure of the manifold in question, and are often particularly interested in measures which are absolutely continuous with respect to this measure class.

Given a partition $\xi$, we may also speak of $\nu$ as being absolutely continuous with respect to $\mu$ on the elements of $\xi$ by passing to the conditional measures $\nu_C$ and $\mu_C$ and applying the above definitions. For example, if we fix $x \in [0,1]$ and write $\delta_x$ for the measure on $[0,1]$ with

$$\delta_x(E) = \begin{cases} 0 & x \notin E, \\ 1 & x \in E, \end{cases}$$

and $\lambda$ for Lebesgue measure on $[0,1]$, then the product measure $\delta_x \times \lambda$ fails to be absolutely continuous with respect to Lebesgue measure on $[0,1]^2$, but is absolutely continuous on the elements of the partition into vertical lines. This weaker version of absolute continuity is an important notion in smooth dynamics, where it allows us to ask not just if a measure is absolutely continuous on the manifold as a whole, but if it is absolutely continuous in certain directions, which correspond to the various rates of expansion and contraction given by the Lyapunov exponents. In particular, we are often interested in measures which are absolutely continuous on unstable leaves, so-called SRB measures.

###### Example 1.17.

Let $C$ be the usual middle-thirds Cantor set, and let $\nu$ be the probability measure on $C$ that gives weight $2^{-n}$ to each of the $2^n$ basic intervals at the $n$th stage of the construction (which have length $3^{-n}$). Then the product measure $\nu \times \lambda$, where $\lambda$ is Lebesgue measure on $[0,1]$, is not absolutely continuous with respect to Lebesgue measure on the square, but is absolutely continuous on the elements of the partition into vertical lines.
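The failure of absolute continuity in this example comes entirely from the Cantor factor: assuming the standard construction, in which stage $n$ consists of $2^n$ basic intervals of length $3^{-n}$, each of weight $2^{-n}$, the stage-$n$ intervals carry full mass while their total length $(2/3)^n$ tends to zero. The short Python sketch below tabulates this; the function names are hypothetical.

```python
from fractions import Fraction


def stage_intervals(n):
    """Basic intervals at stage n of the middle-thirds construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))   # keep the left third
            refined.append((b - third, b))   # keep the right third
        intervals = refined
    return intervals


def stage_summary(n):
    """Return (number of intervals, weight of each, total Lebesgue length)."""
    intervals = stage_intervals(n)
    total_length = sum(b - a for a, b in intervals)
    weight = Fraction(1, len(intervals))
    return len(intervals), weight, total_length


for n in range(1, 6):
    count, weight, length = stage_summary(n)
    # Total mass count * weight is always 1, while the length shrinks.
    print(n, count, weight, length)
```

Taking the product with Lebesgue measure in the vertical direction does not repair this: the stage-$n$ vertical strips have full product measure and two-dimensional Lebesgue measure $(2/3)^n$.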

## 2. Measurable Dynamics

### 2.1. Partitions of times past and future

Now we consider not just a set $X$, but a dynamical system $(X, T)$, where $T \colon X \to X$ is a map whose iterates are of interest. Generally speaking $X$ carries some structure – topological, measure-theoretic, metric, manifold – which is preserved by the action of $T$. One of the key notions in dynamics is that of invariance: the map $T$ sends points to points, sets to sets, measures to measures, functions to functions, and we are interested in properties and characterisations of points, sets, measures, functions which are invariant under the action of $T$.

In this section we will assume that $(X, \mu)$ is a measure space as in the previous section, and that $T$ is a measure-preserving transformation – that is, that $\mu(T^{-1}E) = \mu(E)$ for all measurable $E \subset X$. We often express this equality by saying that the measure $\mu$ is invariant under the action of $T$. When $X$ is a metric space or a manifold and $T$ is a continuous or smooth map, it is often (but not always) the case that there are very many invariant measures. For the time being, though, we will only consider a single invariant measure. (Many of the definitions and results here work for any measure-preserving transformation $T$, but some also require $T$ to be invertible with measure-preserving inverse. In this case we also have $\mu(TE) = \mu(E)$ for every measurable $E$, which is not necessarily the case for non-invertible transformations.)

We may consider the property of invariance for partitions as well; we say that a partition $\xi$ is invariant if $T^{-1}(C) \in \xi$ for every $C \in \xi$, that is, if the preimage of a partition element is again a single partition element. (In [Ro2], such a partition is said to be completely invariant, and invariant instead refers to the weaker property that $T^{-1}\xi \leq \xi$, so that the preimage of a partition element is a union of partition elements.) Invariance in our sense is written as $T^{-1}\xi = \xi$, where

$$T^{-1}\xi = \{T^{-1}(C) \mid C \in \xi\}.$$

Given an invariant partition $\xi$, let $\pi$ denote the canonical projection from $X$ to the space of partition elements, as before. Then $T$ induces an action on this space, and the dynamics of $T$ may be viewed as a skew product over this action.

In light of the correspondence between measurable partitions and $\sigma$-algebras discussed in the previous section, we may also consider invariant $\sigma$-algebras, those $\sigma$-algebras $\mathcal{B}$ for which $T^{-1}\mathcal{B} = \mathcal{B}$. It is then reasonable to ask if there is a natural way to associate to an arbitrary partition or $\sigma$-algebra one which is invariant. One obvious way is to take a $\sigma$-algebra $\mathcal{B}$, and consider the sub-$\sigma$-algebra which contains all the $T$-invariant sets in $\mathcal{B}$. (Of course, $\mathcal{B}$ may not contain any non-trivial $T$-invariant sets.) However, there is another important construction, which we now examine.

Now we assume that $T$ is an automorphism, i.e., invertible with measure-preserving inverse. Let $\xi$ be a finite partition of $X$ into measurable sets, and define

$$\xi^T := \bigvee_{n \in \mathbb{Z}} T^n \xi = \lim_{n \to \infty} \bigvee_{j=-n}^{n} T^j \xi.$$

The elements of this partition are given by $\bigcap_{n \in \mathbb{Z}} T^n C_{i_n}$, where $C_{i_n} \in \xi$. Observe that $x \in T^n C$ if and only if $T^{-n} x \in C$, and so knowing which element of $T^n \xi$ the point $x$ lies in corresponds to knowing in which element of $\xi$ the point $T^{-n} x$ lies. This is commonly referred to as the coding of the trajectory of $x$: knowing which element of $\xi^T$ the point $x$ lies in is equivalent to knowing the coding of the entire trajectory of $x$, both forward and backward.
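To make the coding concrete, the following Python sketch computes a finite window of the coding for the baker's transformation of the unit square, with the partition into the left and right halves. This map does not appear in the text above; it is swapped in here as a standard invertible model for which the coding can be computed in exact rational arithmetic. The function names, the sample point, and the window size are assumptions made for illustration.

```python
from fractions import Fraction


def baker(p):
    """Baker's transformation: (x, y) -> (2x mod 1, (y + floor(2x)) / 2)."""
    x, y = p
    d = 1 if 2 * x >= 1 else 0
    return (2 * x - d, (y + d) / 2)


def baker_inverse(p):
    """Inverse map: (x, y) -> ((x + floor(2y)) / 2, 2y mod 1)."""
    x, y = p
    d = 1 if 2 * y >= 1 else 0
    return ((x + d) / 2, 2 * y - d)


def symbol(p):
    """Index of the element of the partition {x < 1/2, x >= 1/2} containing p."""
    return 0 if p[0] < Fraction(1, 2) else 1


def coding(p, N):
    """Symbols of T^n(p) for n = -N, ..., N, listed in increasing order of n."""
    past = []
    q = p
    for _ in range(N):
        q = baker_inverse(q)
        past.append(symbol(q))
    past.reverse()
    future = []
    q = p
    for _ in range(N + 1):
        future.append(symbol(q))
        q = baker(q)
    return past + future


p = (Fraction(1, 3), Fraction(1, 5))
print(coding(p, 4))   # [1, 1, 0, 0, 0, 1, 0, 1, 0]
```

The forward symbols are the binary digits of the first coordinate and the backward symbols are those of the second, so the full coding determines the point. Applying the map shifts the coding by one place: `coding(baker(p), 4)` agrees with `coding(p, 5)[2:]`.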

Because each of the partitions have finitely many elements, all measurable, these partitions themselves are measurable, and we have ; passing to the limit, we see that , so is measurable as well.

###### Exercise 2.1.

Show that a partition $\xi$ is measurable if and only if it is the limit ($\xi = \bigvee_{n} \xi_n$) of an increasing sequence of finite partitions $\xi_n$ into measurable sets. Indeed, show that $\bigvee_{n} \xi_n$ is measurable if the $\xi_n$ are any measurable partitions.

###### Example 2.1.

Let and let be the Bernoulli measure that gives weight to each -cylinder. Let be the partition induced by the equivalence relation iff for all , and let .

Let be the set of all such that the terms are eventually (that is, there exists such that for all ). Then for all , so ; indeed, is an element of . But is the trivial -algebra since all elements are shift-invariant and is ergodic. So is the trivial partition, hence is not measurable. This shows that the counterpart to Exercise 2.1 for decreasing sequences of finite measurable partitions is false.

It follows immediately from the construction of that it is an invariant partition, whose -algebra is very different from the invariant -algebra described above.

###### Example 2.2.

Consider the space of doubly infinite sequences on two symbols,

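$$X = \Sigma_2 = \{0,1\}^{\mathbb{Z}} = \{(x_n)_{n \in \mathbb{Z}} \mid x_n \in \{0,1\}\},$$

and let $T \colon X \to X$ be the shift $(Tx)_n = x_{n+1}$. Equip $X$ with the Bernoulli measure $\mu$ which gives each $n$-cylinder weight $2^{-n}$.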

Geometrically, may be thought of as the direct product of two Cantor sets (each corresponding to the one-sided shift space ). In this picture, acts on each copy of as follows: draw two rectangles of width and height , each of which contains half of the horizontal Cantor set; contract each rectangle in the vertical direction by a factor of ; expand it in the horizontal direction by the same factor; and finally, stack the resulting rectangles one on top of the other, as in Figure 4.

Now let $\xi$ be the partition of $\Sigma_2$ into one-cylinders; that is, $\xi = \{C_0, C_1\}$, where

$$C_i := \{(x_n)_{n \in \mathbb{Z}} \in \Sigma_2 \mid x_0 = i\}.$$

Each one-cylinder corresponds to one of the two vertical rectangles in the above description, and the reader may verify that in this case, $\xi^T$ is the partition into points.

Another important partition, which is not necessarily invariant, is

$$\xi^- := \bigvee_{n=0}^{\infty} T^{-n}\xi.$$

As discussed above, an element of $\xi^T$ corresponds to trajectories with the same coding for both positive and negative $n$ – in other words, trajectories with the same past and future relative to the partition $\xi$. By contrast, $\xi^-$ corresponds to just the infinite future – points whose forward iterates lie in the same elements of $\xi$ may have backwards iterates lying in different elements of $\xi$. (In the next section, we will see that for a smooth dynamical system $\xi^-$ can be interpreted as a partition into local stable manifolds.)

This last statement is just another way of saying that $\xi^-$ is not necessarily invariant under the action of $T$. However, we do have

$$\xi^- = \xi \vee T^{-1}\xi^- \geq T^{-1}\xi^-;$$

that is, $\xi^-$ is an increasing partition. (Note that $\xi^-$ is increasing in the sense that it refines its pre-image; for this to be the case, each individual element must increase in size under $T^{-1}$, and hence decrease in size under $T$. Thus one could also reasonably define increasing partitions as those for which $\xi \geq T\xi$, which is the convention followed in [LY1, LY2].)

###### Example 2.3.

Let $X$, $T$, and $\xi$ be as in Example 2.2; then the elements of $\xi^-$ are the sets

$$C(x) = \{y \in \Sigma_2 \mid y_n = x_n \ \forall n \geq 0\},$$

each of which is a copy of the one-sided shift space, and corresponds in the geometric picture of Figure 4 to a vertical Cantor set

$$\{t\} \times C \subset C \times C \subset [0,1] \times [0,1],$$

where $t \in C$ and $C$ is the Cantor set mentioned previously. Note that each element of the partition $T^{-1}\xi^-$ is a union of two such vertical Cantor sets related by a horizontal translation. Thus we have $\xi^- \neq T^{-1}\xi^-$.

###### Example 2.4.

Let $X$ be the unit circle and $T$ a rotation by an irrational multiple of $2\pi$. Let $\mu$ be Lebesgue measure and $\xi$ be the partition into two semi-circles. Then $\xi^T$ and $\xi^-$ are both the partition into points. In particular, $T^{-1}\xi^- = \xi^-$.

The previous two examples illustrate a dichotomy; either $T^{-1}\xi^- = \xi^-$, and $\xi^-$ is in fact invariant, or $\xi^-$ is a proper refinement of $T^{-1}\xi^-$, and is thus not invariant, as in Example 2.3. As we will soon see, there are fundamental differences between the two cases.

### 2.2. Entropy

What is the difference between the two cases just discussed, between the case $T^{-1}\xi^- = \xi^-$ and the case $T^{-1}\xi^- \neq \xi^-$? The key word here is entropy; recall that the entropy of a transformation $T$ with respect to a partition $\xi$ is defined as

$$h_\mu(T,\xi) := \lim_{n \to \infty} \frac{1}{n} H_\mu\left(\bigvee_{k=0}^{n-1} T^{-k}\xi\right), \tag{2.1}$$

where $H_\mu(\eta)$ is the information content of a (finite or countable) partition $\eta$, given by the following formula (using the convention that $0 \log 0 = 0$):

$$H_\mu(\eta) := -\sum_{C \in \eta} \mu(C) \log \mu(C). \tag{2.2}$$

This may be interpreted as the expected amount of information that we gain if we know which element of $\eta$ a point lies in; similarly, the entropy $h_\mu(T,\xi)$ is the average information we gain per iteration of $T$.
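For a Bernoulli shift on two symbols with the partition into one-cylinders, the join in (2.1) is the partition into $n$-cylinders, whose weights are products of the symbol probabilities, so (2.1) and (2.2) can be evaluated directly. The Python sketch below does this for a Bernoulli measure with symbol weights $(p, 1-p)$; the parameter `p` and the function names are hypothetical, and `math.prod` assumes Python 3.8 or later.

```python
import math
from itertools import product


def partition_entropy(weights):
    """H_mu of a finite partition from its element weights, as in (2.2),
    with the convention 0 log 0 = 0."""
    return -sum(w * math.log(w) for w in weights if w > 0)


def cylinder_weights(p, n):
    """Weights of all n-cylinders for the Bernoulli measure with P(0) = p."""
    probs = (p, 1 - p)
    return [math.prod(probs[s] for s in word)
            for word in product((0, 1), repeat=n)]


def entropy_estimate(p, n):
    """(1/n) * H_mu of the partition into n-cylinders, as in (2.1) before the limit."""
    return partition_entropy(cylinder_weights(p, n)) / n


print(entropy_estimate(0.5, 6), math.log(2))
print(entropy_estimate(0.3, 5), partition_entropy([0.3, 0.7]))
```

In this independent setting the quotient does not depend on $n$: it equals $-p \log p - (1-p)\log(1-p)$ for every $n$, which is $\log 2$ for the measure of Example 2.2. For a general partition the sequence only converges, and the limit in (2.1) is genuinely needed.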

###### Exercise 2.2.

Show that if $\xi^-$ is invariant, then $\xi$ has zero entropy, $h_\mu(T,\xi) = 0$, whereas if $\xi^-$ is a proper refinement of $T^{-1}\xi^-$, then the partition carries positive entropy, $h_\mu(T,\xi) > 0$.

The notion of entropy is intimately connected with one more partition canonically associated with $\xi$, defined as

$$\Pi(\xi) := H\left(\bigwedge_{n=1}^{\infty} T^{-n}\xi^-\right). \tag{2.3}$$

Recall that the intersection $\xi \wedge \eta$ of two partitions is the finest partition which coarsens both $\xi$ and $\eta$; if the partitions are measurable, then this corresponds to taking the intersection of the $\sigma$-algebras.

Observe that $T^{-(n+1)}\xi^- \leq T^{-n}\xi^-$ for every $n$, and so

$$\bigwedge_{n=1}^{N} T^{-n}\xi^- = T^{-N}\xi^- =: \xi^-_N. \tag{2.4}$$

The partition $\xi^-_N$ in (2.4) corresponds to knowing what happens after time $N$ (relative to the partition $\xi$), but having no information on what happens before then.

###### Exercise 2.3.

Let and let be the shift ; this is a simple example of a non-transitive subshift of finite type, and comprises two independent copies of the system in Examples 2.2 and 2.3. Equip with the Bernoulli measure which gives each -cylinder weight , and consider the partition into -cylinders.

Show that is the partition , which separates into two copies of . Generalise this result to an arbitrary non-transitive subshift of finite type.

The three partitions we have constructed from are related as follows:

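$$\xi^T \geq \xi^- \geq \Pi(\xi). \tag{2.5}$$

Using the partitions $\xi^-_N$ from (2.4), we see that $\xi^- = \xi^-_0$, while $\xi^T$ and $\Pi(\xi)$ may be thought of as the limits of $\xi^-_N$ as $N$ goes to $-\infty$ and $+\infty$, respectively.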

If , then for all , and all three partitions