Uniform mixing and completely positive sofic entropy

Abstract

Let be a countable discrete sofic group. We define a concept of uniform mixing for measure-preserving -actions and show that it implies completely positive sofic entropy. When contains an element of infinite order, we use this to produce an uncountable family of pairwise nonisomorphic -actions with completely positive sofic entropy. None of our examples is a factor of a Bernoulli shift.

1 Introduction

Let be a countable discrete sofic group, a standard probability space and a measurable -action preserving . In [2], Lewis Bowen defined the sofic entropy of relative to a sofic approximation under the hypothesis that the action admits a finite generating partition. The definition was extended to general by Kerr and Li in [9] and Kerr gave a more elementary approach in [8]. In [3] Bowen showed that when is amenable, sofic entropy relative to any sofic approximation agrees with the standard Kolmogorov-Sinai entropy. Despite some notable successes such as the proof in [2] that Bernoulli shifts with distinct base-entropies are nonisomorphic, many aspects of the theory of sofic entropy are still relatively undeveloped.

Rather than work with abstract measure-preserving -actions, we will use the formalism of -processes. If is a countable group and is a standard Borel space, we will endow with the right-shift action given by for and . A -process over is a Borel probability measure on which is invariant under this action. Any measure-preserving action of on a standard probability space is measure-theoretically isomorphic to a -process over some standard Borel space . We will assume the state space is finite, which corresponds to the case of measure-preserving actions which admit a finite generating partition. Note that by results of Seward from [12] and [13], the last condition is equivalent to an action admitting a countable generating partition with finite Shannon entropy.

In [1], the first author introduced a modified invariant called model-measure sofic entropy which is a lower bound for Bowen’s sofic entropy. Let be a sofic approximation to . Model-measure sofic entropy is constructed in terms of sequences where is a probability measure on . If these measures replicate the process in an appropriate sense then we say that locally and empirically converges to . We refer the reader to [1] for the precise definitions. We have substituted the phrase ‘local and empirical convergence’ for the phrase ‘quenched convergence’ which appeared in [1]. This has been done to avoid confusion with an alternative use of the word ‘quenched’ in the physics literature. A process is said to have completely positive model-measure sofic entropy if every nontrivial factor has positive model-measure sofic entropy. The goal of this paper is to prove the following theorem, which generalizes the main theorem of [5].

Theorem 1.1.

Let be a countable sofic group containing an element of infinite order. Then there exists an uncountable family of pairwise nonisomorphic -processes each of which has completely positive model-measure sofic entropy (and hence completely positive sofic entropy) with respect to any sofic approximation to . None of these processes is a factor of a Bernoulli shift.

In order to prove Theorem 1.1 we introduce a concept of uniform mixing for sequences of model-measures. This uniform model-mixing will be defined formally in Section 3. It implies completely positive model-measure sofic entropy.

Theorem 1.2.

Let be a countable sofic group and let be a -process with finite state space . Suppose that for some sofic approximation to , there is a uniformly model-mixing sequence which locally and empirically converges to over . Then has completely positive lower model-measure sofic entropy with respect to .

As in [5], the examples we exhibit to establish Theorem 1.1 are produced via a coinduction method for lifting -processes to -processes when . If is an -process then we can construct a corresponding -process as follows. Let be a transversal for the right cosets of in . Identify as a set with and thereby identify with . Set . We call the coinduced process and denote it by . (See page of [7] for more details on this construction.) When this procedure preserves uniform mixing.

Theorem 1.3.

Let be a countable sofic group and let be a uniformly mixing -process with finite state space . Let be a subgroup isomorphic to and identify with . Then for any sofic approximation to , there is a uniformly model-mixing sequence of measures which locally and empirically converges to over .

We remark that it is easy to see that if is a Bernoulli shift (that is to say, is a product measure), then there is a uniformly model-mixing sequence which locally and empirically converges to . Indeed, if for a measure on then the measures on are uniformly model-mixing and locally and empirically converge to . Thus Theorem 1.2 shows that Bernoulli shifts with finite state space have completely positive sofic entropy, giving another proof of this case of the main theorem from [10]. We believe that completely positive sofic entropy for general Bernoulli shifts can be deduced along the same lines, requiring only a few additional estimates, but do not pursue the details here.
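
The entropy computation behind this remark can be spelled out as follows. This is only a sketch: the symbols $\nu$ (the base measure on the finite state space), $\mu = \nu^{G}$ and $\mu_n = \nu^{V_n}$ are labels assumed here for illustration, and the computation assumes that $S \subseteq V_n$ is separated enough, and consists of points good enough, that the sets $\sigma_n^F(s)$ for $s \in S$ are pairwise disjoint and each have exactly $|F|$ elements.

\[
H\bigl(\pi_{\sigma_n^F(S)*}\mu_n\bigr)
  = H\bigl(\nu^{\sigma_n^F(S)}\bigr)
  = |\sigma_n^F(S)|\cdot H(\nu)
  = |S|\cdot|F|\cdot H(\nu)
  = |S|\cdot H(\mu_F).
\]

Under these assumptions the entropy bound in the definition of uniform model-mixing holds with equality, so it is satisfied for every positive error term.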

1.1 Acknowledgements

The first author’s research was partially supported by the Simons Collaboration on Algorithms and Geometry. The second author’s research was partially supported by NSF grants DMS-0968710 and DMS-1464475.

2 Preliminaries

2.1 Notation

The notation we use closely follows that in [1]; we refer the reader to that reference for further discussion. Let be a finite set. For any pair of sets we let be projection onto the -coordinates (thus our notation leaves the larger set implicit). Let be a countable group and let be a -process. For we will write for the -marginal of .

Let be another finite set and let be a measurable function. If we will say that is -local if it factors through . We will say is local if it is -local for some finite . Let be given by and note that is equivariant between the right-shift on and the right-shift on .

Let be a finite set and let be a map from to . For and we write instead of . For and we define

\[ \sigma^F(S) = \{\sigma^g\cdot s : g\in F,\ s\in S\}. \]

For we write for . We write for the map from to given by for and . We write for . With as before, we write for the map from to given by .

If is a finite set and is a probability measure on then denotes the Shannon entropy of , and for we define

\[ \operatorname{cov}_\epsilon(\eta) = \min\{|F| : F\subseteq D \text{ is such that } \eta(F) > 1-\epsilon\}. \]

If is a map to another finite set then we may write in place of . For we let .
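
For readers who want to experiment with these two quantities, the following is a minimal Python sketch. It assumes a distribution on a finite set is stored as a dictionary from points to probabilities and uses natural logarithms; the function names `shannon_entropy` and `cov` are ours and do not come from [1].

```python
import math

def shannon_entropy(eta):
    """Shannon entropy (natural log) of a distribution {point: probability}."""
    return -sum(p * math.log(p) for p in eta.values() if p > 0)

def cov(eta, eps):
    """Smallest |F| with eta(F) > 1 - eps.

    Taking the heaviest atoms first is optimal, since any set of the same
    size has mass at most that of the heaviest atoms.
    """
    total = 0.0
    for count, p in enumerate(sorted(eta.values(), reverse=True), start=1):
        total += p
        if total > 1 - eps:
            return count
    return len(eta)  # fallback, reached only when no prefix passes the threshold

# Hypothetical example distribution on four points.
eta = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
small_cover = cov(eta, 0.3)   # the two heaviest atoms already carry mass 0.75
full_cover = cov(eta, 0.1)    # all four atoms are needed to exceed mass 0.9
```

The greedy choice is only valid because the minimum is over all subsets of a finite set; no structure on the set is used.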

We use the and asymptotic notations with respect to the limit . Given two functions and on , the notation means that there is a positive constant such that for all .

2.2 An information theoretic estimate

Lemma 2.1.

Let be a finite set and let be a sequence of finite sets such that increases to infinity. Let be a probability measure on . We have

\[ \liminf_{n\to\infty}\frac{H(\mu_n)}{|V_n|} \le \sup_{\epsilon>0}\liminf_{n\to\infty}\frac{1}{|V_n|}\log\operatorname{cov}_\epsilon(\mu_n). \]

Proof.

Let be a probability measure on a finite set and let . By conditioning on the partition and then recalling that entropy is maximized by uniform distributions we obtain

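\[
\begin{aligned}
H(\mu) &= \mu(E)\cdot H(\mu(\cdot\mid E)) + \mu(F\setminus E)\cdot H(\mu(\cdot\mid F\setminus E)) + H(\mu(E)) \\
&\le \mu(E)\cdot\log(|E|) + (1-\mu(E))\cdot\log(|F\setminus E|) + H(\mu(E)).
\end{aligned}
\tag{2.1}
\]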

Now let and be as in the statement of the lemma. Let and let be a sequence of sets with and . By applying (2.1) with and we have

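\[
\begin{aligned}
\liminf_{n\to\infty}\frac{H(\mu_n)}{|V_n|}
&\le \liminf_{n\to\infty}\frac{1}{|V_n|}\Bigl(\mu(S_n)\cdot\log(|S_n|) + (1-\mu(S_n))\cdot\log(|A^{V_n}\setminus S_n|) + H(\mu(S_n))\Bigr) \\
&\le \liminf_{n\to\infty}\frac{1}{|V_n|}\Bigl(\log(|S_n|) + \epsilon\cdot\log(|A^{V_n}|) + H(\epsilon)\Bigr) \\
&\le \Bigl(\liminf_{n\to\infty}\frac{1}{|V_n|}\log\operatorname{cov}_\epsilon(\mu_n)\Bigr) + \epsilon\cdot\log(|A|).
\end{aligned}
\]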

Now let tend to zero to obtain the lemma. ∎

3 Metrics on sofic approximations and uniform model-mixing

Let us fix a proper right-invariant metric on : for instance, if is finitely generated then can be a word metric, and more generally we may let be any proper weight function and define to be the resulting weighted word metric. Again let be a finite set and let be a map from to . Let be the graph on with an edge from to if and only if or for some . Define a weight function on the edges of by setting

\[ W(v,w) = \min\{\rho(g,1_G) : \sigma^g\cdot v = w \text{ or } \sigma^g\cdot w = v\}. \]

If and are in the same connected component of let be the -weighted graph distance between and , that is

\[ \rho_\sigma(v,w) = \min\Bigl\{\sum_{i=0}^{k-1} W(p_i,p_{i+1}) : (v=p_0,p_1,\ldots,p_{k-1},p_k=w) \text{ is an } H_\sigma\text{-path from } v \text{ to } w\Bigr\}. \]

Having defined on the connected components of , choose some number much larger than the -distance between any two points in the same connected component. Set for any pair of vertices in distinct connected components of . Note that if is a sofic approximation to then for any fixed once is large enough the map restricts to an isometry from to for most .
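
The construction of the weighted graph distance can be made concrete with a short Python sketch based on Dijkstra's algorithm. Several simplifications are assumptions of ours rather than part of the text: only finitely many group elements are used, the maps are stored as lists indexed by vertex, the weights are assumed positive, and the parameter `far` plays the role of the large number assigned to pairs of vertices in distinct connected components.

```python
import heapq

def rho_sigma(num_vertices, sigma, weight, source, far):
    """Distances from `source` in the weighted graph built from `sigma`.

    sigma:  {g: list}, where sigma[g][v] is the image of vertex v under g
    weight: {g: distance from g to the identity}, assumed positive
    far:    value assigned to vertices in other connected components
    """
    # Undirected edges v -- sigma[g][v], weighted by the length of g.
    adj = [[] for _ in range(num_vertices)]
    for g, images in sigma.items():
        for v, w in enumerate(images):
            adj[v].append((w, weight[g]))
            adj[w].append((v, weight[g]))

    dist = [float("inf")] * num_vertices
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for w, c in adj[v]:
            if d + c < dist[w]:
                dist[w] = d + c
                heapq.heappush(heap, (d + c, w))
    return [d if d != float("inf") else far for d in dist]

# Hypothetical example: a 6-cycle on vertices 0..5 plus an isolated fixed point 6.
example = rho_sigma(7, {"a": [1, 2, 3, 4, 5, 0, 6]}, {"a": 1}, source=0, far=100)
```

Parallel edges are harmless here because Dijkstra's algorithm automatically uses the cheapest one, which matches taking the minimum in the definition of the edge weight.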

In the sequel the sofic approximation will be fixed, and we will abbreviate to . We can now state the main definition of this paper.

Definition 3.1.

Let be a sequence of finite sets with and for each let be a map from to . Let be a finite set. For each let be a probability measure on . We say the sequence is uniformly model-mixing if the following holds. For every finite and every there is some and a sequence of subsets such that

\[ |W_n| = (1-o(1))|V_n| \]

and if is -separated according to the metric then

\[ H\bigl(\pi_{\sigma_n^F(S)*}\mu_n\bigr) \ge |S|\cdot(H(\mu_F)-\epsilon). \]

This definition is motivated by Weiss’ notion of uniform mixing from the special case when is amenable: see [14] and also Section 4 of [5]. Let us quickly recall that notion in the setting of a -process . First, if is finite and is another subset, then is -spread if any distinct elements satisfy . The process is uniformly mixing if, for any finite-valued measurable function and any , there exists a finite subset with the following property: if is another finite subset which is -spread, then

\[ H\bigl((\phi^G_*\mu)_S\bigr) \ge |S|\cdot(H_\mu(\phi)-\epsilon). \]

Beware that we have reversed the order of multiplying and in the definition of ‘-spread’ compared with [5]. This is because we work in terms of observables such as rather than finite partitions of , and shifting an observable by the action of corresponds to shifting the partition that it generates by .

The principal result of [11] is that completely positive entropy implies uniform mixing. The reverse implication also holds: see [6] or Theorem 4.2 in [5]. Thus, uniform mixing is an equivalent characterization of completely positive entropy.

The definition of uniform mixing may be rephrased in terms of our proper metric on as follows. The process is uniformly mixing if and only if, for any finite-valued measurable function and any , there exists an with the following property: if is -separated according to , then

\[ H\bigl((\phi^G_*\mu)_S\bigr) \ge |S|\cdot(H_\mu(\phi)-\epsilon). \]

This is equivalent to the previous definition because a subset is -separated according to if and only if it is -spread. The balls are finite, because is proper, and any other finite subset is contained in for all sufficiently large .

This is the point of view on uniform mixing which motivates Definition 3.1. We use the right-invariant metric rather than the general definition of ‘-spread’ sets because it is more convenient later.

Definition 3.1 is directly compatible with uniform mixing in the following sense. If is amenable and is a Følner sequence for , then the sets may be regarded as a sofic approximation to : an element acts on by translation wherever this stays inside and arbitrarily at points which are too close to the boundary of . If is an ergodic -process, then it follows easily that the sequence of marginals locally and empirically converges to over this Følner-set sofic approximation. If is uniformly mixing, then this sequence of marginals is clearly uniformly model-mixing in the sense of Definition 3.1.

On the other hand, suppose that admits a sofic approximation and a locally and empirically convergent sequence of measures over that sofic approximation which is uniformly model-mixing. Then our Theorem 1.2 shows that has completely positive sofic entropy. If is amenable then sofic entropy always agrees with Kolmogorov-Sinai entropy [3], and this implies that has completely positive entropy and hence is uniformly mixing, by the result of [11].

Thus if is amenable then completely positive entropy and uniform mixing are both equivalent to the existence of a sofic approximation and a locally and empirically convergent sequence of measures over it which is uniformly model-mixing. If these conditions hold, then we expect that one can actually find a locally and empirically convergent and uniformly model-mixing sequence of measures over any sofic approximation to . This should follow using a similar kind of decomposition of the sofic approximants into Følner sets as in Bowen’s proof in [3]. However, we have not explored this argument in detail.

Definition 3.1 applies to a shift-system with a finite state space. It can be transferred to an abstract measure-preserving -action on by fixing a choice of finite measurable partition of . However, in order to study actions which do not admit a finite generating partition, it might be worth looking for an extension of Definition 3.1 to -processes with arbitrary compact metric state spaces, similarly to the setting in [1]. We also do not pursue this generalization here.

4 Proof of Theorem 1.2

We will use basic facts about the Shannon entropy of observables (i.e. random variables with finite range), for which we refer the reader to Chapter of [4]. Let , and be as in the statement of Theorem 1.2. The following is the ‘finitary’ model-measure analog of Lemma in [5].

Lemma 4.1.

Let be finite. Let be a finite set and let be an -local observable. Let be a sequence of sets such that . Then we have

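\[ H(\mu_F) - \frac{1}{|S_n|}H\bigl(\pi_{\sigma_n^F(S_n)*}\mu_n\bigr) \ge H_\mu(\phi) - \frac{1}{|S_n|}H\bigl(\pi_{S_n*}\phi^{\sigma_n}_*\mu_n\bigr) - o(1). \]

Proof of Lemma 4.1.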

Let be a map with . Fix and . Let and let . For let and let . Then we have and . Enumerate and write . All entropies in the following display are computed with respect to . We have

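\[
\begin{aligned}
H(\alpha) &= H(\alpha_1,\ldots,\alpha_m) \\
&= H(\alpha_1) + \sum_{k=1}^{m-1} H(\alpha_{k+1}\mid\alpha_1,\ldots,\alpha_k) \\
&= H(\alpha_1,\beta_1) + \sum_{k=1}^{m-1} H(\alpha_{k+1},\beta_{k+1}\mid\alpha_1,\ldots,\alpha_k) \\
&= H(\beta_1) + H(\alpha_1\mid\beta_1) + \sum_{k=1}^{m-1} H(\beta_{k+1}\mid\alpha_1,\ldots,\alpha_k) + \sum_{k=1}^{m-1} H(\alpha_{k+1}\mid\beta_{k+1},\alpha_1,\ldots,\alpha_k) \\
&\le H(\beta_1) + \sum_{k=1}^{m-1} H(\beta_{k+1}\mid\beta_1,\ldots,\beta_k) + \sum_{k=1}^{m} H(\alpha_k\mid\beta_k) \\
&= H(\beta) + \sum_{k=1}^{m} H(\alpha_k\mid\beta_k).
\end{aligned}
\]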

Let be the identity map on . Then

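\[
\begin{aligned}
|S|\cdot H(\mu_F) - H\bigl(\pi_{\sigma_n^F(S)*}\mu_n\bigr)
&= |S|\cdot H_{\mu_F}(\iota) - H_{\mu_n}(\alpha) \\
&\ge |S|\cdot H_{\mu_F}(\theta) + |S|\cdot H_{\mu_F}(\iota\mid\theta) - H_{\mu_n}(\beta) - \sum_{s\in S} H_{\mu_n}(\alpha_s\mid\beta_s) \\
&= |S|\cdot H_\mu(\phi) - H\bigl(\pi_{S*}\phi^{\sigma_n}_*\mu_n\bigr) + |S|\cdot H_{\mu_F}(\iota\mid\theta) - \sum_{s\in S} H_{\mu_n}(\alpha_s\mid\beta_s).
\end{aligned}
\tag{4.1}
\]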

Now allowing to vary, let be a sequence of sets such that . Write . Let be such that the obvious map from to is injective. Then the function provides an identification of with . This identification sends to and to . When is large the marginal of will resemble for most . Since and are measurable this implies that for most . More precisely, we can find a sequence of sets with

\[ |C_n| = (1-o(1))|S_n| \]

such that

\[ \max_{s\in C_n}\bigl|H_{\mu_F}(\iota\mid\theta) - H_{\nu_n}(\alpha_s\mid\beta_s)\bigr| = o(1). \]

Thus

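\[
\begin{aligned}
\Bigl|\,|S_n|\cdot H_{\mu_F}(\iota\mid\theta) - \sum_{s\in S_n} H_{\nu_n}(\alpha_s\mid\beta_s)\Bigr|
&\le \sum_{s\in C_n}\bigl|H_{\mu_F}(\iota\mid\theta) - H_{\nu_n}(\alpha_s\mid\beta_s)\bigr| + \sum_{s\in S_n\setminus C_n}\bigl|H_{\mu_F}(\iota\mid\theta) - H_{\nu_n}(\alpha_s\mid\beta_s)\bigr| \\
&= o(|S_n|).
\end{aligned}
\]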

The lemma then follows from (4.1) and the above. ∎

Recall that for a measure space and two observables and on the Rokhlin distance between and is defined by

\[ d^{\mathrm{Rok}}_\mu(\alpha,\beta) = H_\mu(\alpha\mid\beta) + H_\mu(\beta\mid\alpha). \]

This is a pseudometric on the space of observables on . An easy computation shows that if and are two families of observables on then

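\[ d^{\mathrm{Rok}}_\mu\bigl((\alpha_1,\ldots,\alpha_n),(\beta_1,\ldots,\beta_n)\bigr) \le \sum_{k=1}^{n} d^{\mathrm{Rok}}_\mu(\alpha_k,\beta_k). \]
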
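
As a concrete illustration of the Rokhlin distance, the following Python sketch computes it from the joint distribution of two observables with finite range. The representation of the joint law as a dictionary keyed by pairs of values, and the helper names, are assumptions made for this example. It uses the identity that conditional entropy is joint entropy minus the entropy of the conditioning observable.

```python
import math
from collections import defaultdict

def entropy(pmf):
    """Shannon entropy (natural log) of a distribution {value: probability}."""
    return -sum(p * math.log(p) for p in pmf.values() if p > 0)

def marginal(joint, index):
    """Marginal of a joint law {(a, b): probability} on coordinate `index`."""
    out = defaultdict(float)
    for pair, p in joint.items():
        out[pair[index]] += p
    return out

def rokhlin_distance(joint):
    """H(alpha | beta) + H(beta | alpha) for the joint law of (alpha, beta)."""
    h_joint = entropy(joint)
    h_alpha = entropy(marginal(joint, 0))
    h_beta = entropy(marginal(joint, 1))
    return (h_joint - h_beta) + (h_joint - h_alpha)

# Two identical fair bits are at distance 0; two independent fair bits
# are at distance 2 * log(2).
identical = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
```

The two example laws show the extremes for a pair of fair bits: the distance vanishes exactly when each observable determines the other.
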
Lemma 4.2.

Let be two local observables. Let be a sequence of sets with . Then we have

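\[ \frac{1}{|S_n|}\Bigl|H\bigl(\pi_{S_n*}\phi^{\sigma_n}_*\mu_n\bigr) - H\bigl(\pi_{S_n*}\psi^{\sigma_n}_*\mu_n\bigr)\Bigr| \le d^{\mathrm{Rok}}_\mu(\phi,\psi) + o(1). \]

Proof.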

Let and let . Let be a finite subset of such that both and are -local. Let be a map such that and let be a map such that . For let so that . Also let . Then we have

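\[
\begin{aligned}
\frac{1}{|S_n|}\Bigl|H\bigl(\pi_{S_n*}\phi^{\sigma_n}_*\mu_n\bigr) - H\bigl(\pi_{S_n*}\psi^{\sigma_n}_*\mu_n\bigr)\Bigr|
&= \frac{1}{|S_n|}\bigl|H_{\mu_n}(\alpha_n) - H_{\mu_n}(\beta_n)\bigr| \\
&\le \frac{1}{|S_n|}\cdot d^{\mathrm{Rok}}_{\mu_n}(\alpha_n,\beta_n) \\
&= \frac{1}{|S_n|}\cdot d^{\mathrm{Rok}}_{\mu_n}\bigl((\alpha_{n,s})_{s\in S_n},(\beta_{n,s})_{s\in S_n}\bigr) \\
&\le \frac{1}{|S_n|}\sum_{s\in S_n} d^{\mathrm{Rok}}_{\mu_n}(\alpha_{n,s},\beta_{n,s}).
\end{aligned}
\tag{4.2}
\]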

If the map is injective on , we can identify with and thereby identify with and with . Note that

\[ d^{\mathrm{Rok}}_{\mu_F}(\theta,\kappa) = d^{\mathrm{Rok}}_\mu(\phi,\psi). \]

It follows that for any we can find a weak star neighborhood of such that if is such that then

\[ \bigl|d^{\mathrm{Rok}}_{\mu_n}(\alpha_{n,s},\beta_{n,s}) - d^{\mathrm{Rok}}_\mu(\phi,\psi)\bigr| < \epsilon. \]

Thus, since locally and empirically converges to , there are sets with such that

\[ \max_{s\in C_n}\bigl|d^{\mathrm{Rok}}_{\mu_n}(\alpha_{n,s},\beta_{n,s}) - d^{\mathrm{Rok}}_\mu(\phi,\psi)\bigr| = o(1). \tag{4.3} \]

The lemma now follows from (4.2) and (4.3). ∎

Corollary 4.1.

Let be a sequence of local observables and let be a local observable. Let be a sequence of sets with . Then if increases to infinity at a slow enough rate we have

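\[ \frac{1}{|S_n|}\Bigl|H\bigl(\pi_{S_n*}\phi^{\sigma_n}_*\mu_n\bigr) - H\bigl(\pi_{S_n*}\phi^{\sigma_n}_{m_n*}\mu_n\bigr)\Bigr| \le d^{\mathrm{Rok}}_\mu(\phi,\phi_{m_n}) + o(1). \]

Proof of Theorem 1.2.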

Let be a finite set and let be an observable with . Let be an AL approximating sequence for rel (see Definition in [1]). Then the sequence converges to in . In particular, is a Cauchy sequence and so we can find so that for all we have

\[ d^{\mathrm{Rok}}_\mu(\phi_m,\phi_M) \le \frac{H_\mu(\psi)}{8}. \tag{4.4} \]

We will also assume is large enough that

\[ H_\mu(\phi_M) \ge \frac{H_\mu(\psi)}{2}. \tag{4.5} \]

Let be a finite subset of such that is -local. Then Definition 3.1 provides an and a sequence of subsets such that and if is -separated then

\[ H(\mu_F) - \frac{1}{|S|}H\bigl(\pi_{\sigma_n^F(S)*}\mu_n\bigr) \le \frac{H_\mu(\phi_M)}{2}. \tag{4.6} \]

Let . Since is a sofic approximation there are sets with such that if then the ball of radius around has cardinality at most . Write and note that we have . For each let be an -separated subset of with maximal cardinality. Then so that

\[ |S_n| \ge \frac{|Y_n|}{K} = (1-o(1))\frac{|V_n|}{K}. \tag{4.7} \]
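
The covering argument behind (4.7) can be illustrated with a small Python sketch. The sketch makes two assumptions that the text does not fix: a set is treated as separated when distinct points are at distance at least `r`, and the subset is built greedily, so it is maximal under inclusion rather than of maximal cardinality. Inclusion-maximality is all the counting bound needs, since every rejected point lies within distance less than `r` of a chosen one.

```python
def maximal_separated_subset(points, dist, r):
    """Greedily build an r-separated subset that cannot be enlarged within `points`."""
    chosen = []
    for v in points:
        # Keep v only if it is at distance >= r from everything chosen so far.
        if all(dist(v, s) >= r for s in chosen):
            chosen.append(v)
    return chosen

# Hypothetical example: the cycle Z/10 with its graph metric.
def cyclic_dist(a, b, n=10):
    return min((a - b) % n, (b - a) % n)

separated = maximal_separated_subset(range(10), cyclic_dist, r=3)
```

In this example every ball of radius less than 3 has 5 points, so the bound guarantees at least 10 / 5 = 2 chosen points, and the greedy procedure finds 3.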

By Lemma 4.1 and (4.6) we have

\[ H_\mu(\phi_M) - \frac{1}{|S_n|}H\bigl(\pi_{S_n*}\phi^{\sigma_n}_{M*}\mu_n\bigr) - o(1) \le \frac{H_\mu(\phi_M)}{2} \]

so that from (4.5) we have

\[ \frac{H_\mu(\psi)}{4} - o(1) \le \frac{1}{|S_n|}H\bigl(\pi_{S_n*}\phi^{\sigma_n}_{M*}\mu_n\bigr). \tag{4.8} \]

By Proposition in [1] if increases to infinity at a slow enough rate then will locally and empirically converge to . Since is finite, by the same argument as for Proposition in [1] we have

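\[
\begin{aligned}
\underline{h}^{q}_{\Sigma}(\psi^G_*\mu)
&\ge \sup_{\epsilon>0}\liminf_{n\to\infty}\frac{1}{|V_n|}\log\operatorname{cov}_\epsilon\bigl((\phi^{\sigma_n}_{m_n})_*\mu_n\bigr) \\
&\ge \liminf_{n\to\infty}\frac{1}{|V_n|}H\bigl((\phi^{\sigma_n}_{m_n})_*\mu_n\bigr)
\end{aligned}
\tag{4.9}
\]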

where the second inequality follows from Lemma 2.1. We also assume that increases slowly enough for Corollary 4.1 to hold. By (4.4) we have

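\[ \Bigl|\frac{1}{|S_n|}H\bigl(\pi_{S_n*}\phi^{\sigma_n}_{M*}\mu_n\bigr) - \frac{1}{|S_n|}H\bigl(\pi_{S_n*}(\phi^{\sigma_n}_{m_n})_*\mu_n\bigr)\Bigr| \le \frac{H_\mu(\psi)}{8} + o(1). \]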

Combining this with (4.8) we see that

\[ \frac{1}{|S_n|}H\bigl(\pi_{S_n*}(\phi^{\sigma_n}_{m_n})_*\mu_n\bigr) \ge \frac{H_\mu(\psi)}{8} - o(1). \]

By the above and (4.7) we have that for all sufficiently large ,

\[ H\bigl((\phi^{\sigma_n}_{m_n})_*\mu_n\bigr) \ge \frac{H_\mu(\psi)}{8K+1}|V_n|. \tag{4.10} \]

Theorem 1.2 now follows from (4.9) and (4.10). ∎

5 Proof of Theorem 1.3

Let be a uniformly mixing -process, and for each positive integer let be the marginal of on . Let be an arbitrary sofic approximation to . Let have infinite order and write . We construct a measure on for each . We will later show that the sequence is uniformly model-mixing and locally and empirically converges to over .

We first construct a measure on for each pair with much smaller than . For a given , the single permutation partitions into a disjoint union of cycles. Since has infinite order and is a sofic approximation, once is large most points will be in very long cycles. In particular we assume that most points are in cycles with length much larger than . Partition the cycles into disjoint paths so that as many of the paths have length as possible, and let be the collection of all length- paths that result (so is not a partition of the whole of , but covers most of it). Fix any element and define a random element by choosing each restriction independently with the distribution of and extending to the rest of according to . Let be the law of this .
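
The combinatorial step of this construction, cutting the cycles of a single permutation into paths of a fixed length, can be sketched in Python as follows. This is an illustration under stated assumptions: the permutation is stored as a list of images, a path of length `l` is taken to mean `l` consecutive points of a cycle, and leftover points at the end of each cycle are simply discarded. The function name is ours.

```python
def length_l_paths(perm, l):
    """Cut each cycle of `perm` into consecutive paths of exactly l points.

    perm[v] is the image of v. Points left over at the end of a cycle, and
    all points of cycles shorter than l, are not covered by any path.
    """
    seen = [False] * len(perm)
    paths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        # Trace the cycle through `start`.
        cycle, v = [], start
        while not seen[v]:
            seen[v] = True
            cycle.append(v)
            v = perm[v]
        # Keep as many full blocks of l consecutive points as fit.
        for i in range(0, len(cycle) - l + 1, l):
            paths.append(cycle[i:i + l])
    return paths

# Hypothetical example: a 7-cycle on 0..6 and a 3-cycle on 7..9.
perm = [1, 2, 3, 4, 5, 6, 0, 8, 9, 7]
paths = length_l_paths(perm, 3)
```

In the example, 9 of the 10 points are covered, which mirrors the requirement that the paths cover most of the vertex set when the cycles are long compared with the path length.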

Now let increase to infinity at a slow enough rate that the following two conditions are satisfied:

1. The number of points of that lie in some member of the family is .

2. Whenever lie in distinct right cosets of , so that for all , we have

\[ \bigl|\{v\in V_n : (\sigma_n^g)^{-1}(\sigma_n^h)^p\sigma_n^{g'}\cdot v = v \text{ for some } p\in\{-l_n,\ldots,l_n\}\}\bigr| = o(|V_n|). \]

Set . We separate the proof that ha