On intrinsic ergodicity of factors of $\mathbb{Z}^d$ subshifts
It is well-known that any $\mathbb{Z}$ subshift with the specification property has the property that every factor is intrinsically ergodic, i.e., every factor has a unique measure of maximal entropy. In recent work, other $\mathbb{Z}$ subshifts have been shown to possess this property as well, including $\beta$-shifts and a class of $S$-gap shifts. We give two results that show that the situation for $\mathbb{Z}^d$ subshifts with $d > 1$ is quite different. First, for any $d > 1$, we show that any $\mathbb{Z}^d$ subshift possessing a certain mixing property must have a factor with positive entropy which is not intrinsically ergodic. In particular, this shows that for $d > 1$, $\mathbb{Z}^d$ subshifts with specification cannot have all factors intrinsically ergodic. We also give an example of a $\mathbb{Z}^2$ shift of finite type, introduced by Hochman, which is not even topologically mixing, but for which every positive entropy factor is intrinsically ergodic.
Key words and phrases: $\mathbb{Z}^d$; shift of finite type; sofic; multidimensional
2010 Mathematics Subject Classification: Primary: 37B50; Secondary: 37B10, 37A15
1. Introduction
The well-known Variational Principle relates the concepts of measure-theoretic and topological entropy for dynamical systems, stating that the topological entropy of any dynamical system is the supremum of the measure-theoretic entropies of all invariant measures on that system. In general, there may be no measures achieving that supremum, but if the system is expansive, then at least one such measure, called a measure of maximal entropy, must exist.
A topological dynamical system is said to be intrinsically ergodic if it has a unique measure of maximal entropy. It is well-known that for $\mathbb{Z}$ (i.e. one-dimensional) subshifts, strong enough topological mixing conditions imply intrinsic ergodicity; for instance, the specification property is known to imply intrinsic ergodicity. The specification property is also clearly preserved under factor maps, which implies that for a $\mathbb{Z}$ subshift with specification, every factor is intrinsically ergodic. In particular, since every topologically mixing $\mathbb{Z}$ shift of finite type has the specification property, every such system is also intrinsically ergodic, along with all of its factors. These facts lead to a natural question, asked by Thomsen, of whether every factor of a $\beta$-shift is intrinsically ergodic. This question was later answered in the affirmative by authors who gave a new sufficient condition for intrinsic ergodicity that is preserved under factor maps. Informally, their condition imposes specification on “most words” in the subshift, in some quantifiable way.
Strictly speaking, for a zero-entropy system, every invariant measure is trivially a measure of maximal entropy, and so for such systems intrinsic ergodicity is equivalent to the existence of a unique invariant measure, also known as unique ergodicity. Since intrinsic ergodicity of zero-entropy systems is therefore a somewhat degenerate case, in this paper we will focus on the question of whether all positive entropy factors of a subshift are intrinsically ergodic. This slight restriction of scope changes none of the context of the work described above, since all subshifts with specification and all subshifts treated in the work answering Thomsen's question (see Proposition 2.4 there) have positive entropy.
In the current work, we study the class of $\mathbb{Z}^d$ subshifts ($d > 1$) for which every positive entropy factor is intrinsically ergodic, proving two results which are somewhat surprising given the results for $d = 1$ summarized above. The first is that any $\mathbb{Z}^d$ subshift with a certain topological mixing property (see Definition 2.15) must have a non-intrinsically ergodic factor, which is antithetical to the previously described results for $d = 1$.
Theorem 1.1. For any $d > 1$ and any $\mathbb{Z}^d$ subshift $X$ that has the D*-condition and does not consist of a single fixed point, there exists a factor map $\phi$ so that $h(\phi(X)) > 0$ and $\phi(X)$ is not intrinsically ergodic.
Our second main result shows that there do exist $\mathbb{Z}^2$ subshifts (in fact shifts of finite type) for which every positive entropy factor is intrinsically ergodic. The subshifts we consider are examples of Hochman and are not topologically mixing; in fact, they have a forced hierarchical structure similar to substitutionally defined SFTs in the literature.
Theorem 1.2. There exist $\mathbb{Z}^2$ shifts of finite type with arbitrarily large entropy for which every factor with positive topological entropy is intrinsically ergodic.
2. Definitions and preliminaries
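Let $\mathcal{A}$ denote a finite set, which we will refer to as an alphabet.
A pattern over $\mathcal{A}$ is a member of $\mathcal{A}^S$ for some $S \subseteq \mathbb{Z}^d$, which is said to have shape $S$. For $d = 1$ and $S$ an interval, patterns are generally called words.
We only consider patterns to be defined up to translation, i.e., if $u \in \mathcal{A}^S$ for a finite $S \subset \mathbb{Z}^d$ and $v \in \mathcal{A}^T$, where $T = S + p$ for some $p \in \mathbb{Z}^d$, then we write $u = v$ to mean that $u(s) = v(s + p)$ for each $s$ in $S$.
For any patterns $v \in \mathcal{A}^S$ and $w \in \mathcal{A}^T$ with $S \cap T = \varnothing$, we define the concatenation $vw$ to be the pattern in $\mathcal{A}^{S \cup T}$ defined by $(vw)|_S = v$ and $(vw)|_T = w$.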
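For any finite alphabet $\mathcal{A}$, the $\mathbb{Z}^d$-shift action on $\mathcal{A}^{\mathbb{Z}^d}$, denoted by $\{\sigma_t\}_{t \in \mathbb{Z}^d}$, is defined by $(\sigma_t x)(s) = x(s + t)$ for $s, t \in \mathbb{Z}^d$.
We always think of $\mathcal{A}^{\mathbb{Z}^d}$ as being endowed with the product topology (with $\mathcal{A}$ carrying the discrete topology), with respect to which it is compact.
A $\mathbb{Z}^d$ subshift is a closed subset of $\mathcal{A}^{\mathbb{Z}^d}$ which is invariant under the $\mathbb{Z}^d$-shift action. A subshift is said to be non-trivial if it contains at least two points.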
The language of a $\mathbb{Z}^d$ subshift $X$, denoted by $\mathcal{L}(X)$, is the set of all patterns with finite shape which appear in points of $X$. For any finite $S \subset \mathbb{Z}^d$, let $\mathcal{L}_S(X) := \mathcal{L}(X) \cap \mathcal{A}^S$, the set of patterns in the language of $X$ with shape $S$.
Any subshift inherits a topology from $\mathcal{A}^{\mathbb{Z}^d}$, with respect to which it is compact. Each $\sigma_t$ is a homeomorphism on any subshift, and so any subshift, when paired with the $\mathbb{Z}^d$-shift action, is a topological dynamical system. For a subshift $X$, we consider the set $\mathcal{M}(X)$ of all Borel probability measures on $X$ that are invariant under all shifts $\sigma_t$. Note that $\mathcal{M}(X)$ is compact in the weak$^*$ topology. For a measure $\mu$ in $\mathcal{M}(X)$ and a pattern $w$ in $\mathcal{L}(X)$, we let $\mu(w) := \mu([w])$, where $[w]$ denotes the cylinder set defined by $w$.
A subshift $X$ is called uniquely ergodic if $|\mathcal{M}(X)| = 1$, i.e., if there is only one invariant Borel probability measure on $X$.
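Any subshift can also be defined in terms of disallowed patterns: for any set $\mathcal{F}$ of patterns over $\mathcal{A}$, one can define the set
$$X(\mathcal{F}) := \{x \in \mathcal{A}^{\mathbb{Z}^d} : x|_S \notin \mathcal{F} \text{ for all finite } S \subset \mathbb{Z}^d\}.$$
It is well known that any $X(\mathcal{F})$ is a subshift, and all subshifts are representable in this way.
A shift of finite type (SFT) is a subshift equal to $X(\mathcal{F})$ for some finite set $\mathcal{F}$ of forbidden patterns.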
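As a small illustration of defining a subshift by forbidden patterns, the following Python sketch enumerates one-dimensional words over a finite alphabet that avoid a finite list of forbidden words. The function name `allowed_words` and the example (alphabet {0, 1} with forbidden word `11`) are illustrative assumptions, not taken from the text. In general, avoiding forbidden patterns inside a finite window only yields a superset of the language of the subshift; for this particular example the two coincide, since any such word can be extended by zeros.

```python
from itertools import product


def allowed_words(alphabet, forbidden, n):
    """Return all length-n words over `alphabet` with no forbidden subword."""
    words = []
    for letters in product(alphabet, repeat=n):
        word = "".join(letters)
        # Keep the word only if none of the forbidden words occurs inside it.
        if not any(f in word for f in forbidden):
            words.append(word)
    return words


# Hypothetical example: forbid "11" over the alphabet {0, 1}.
counts = [len(allowed_words("01", ["11"], n)) for n in range(1, 8)]
print(counts)
```

For this example the counts for lengths 1 through 7 are 2, 3, 5, 8, 13, 21, 34.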
A (topological) factor map is any continuous shift-commuting map $\phi$ from a subshift $X$ onto a subshift $Y$. A bijective factor map is called a topological conjugacy.
It is well-known that any factor map $\phi$ is a so-called sliding block code, i.e. there exists $n$ (called the radius of $\phi$) so that $x|_{t + [-n,n]^d}$ uniquely determines $(\phi(x))(t)$ for any $x \in X$ and $t \in \mathbb{Z}^d$. (The standard proof for $d = 1$ extends to $d > 1$ without changes.) A factor map is $1$-block if it has radius $0$.
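The sliding block code property can be illustrated in one dimension on periodic points, where a point is stored as a single period and indices wrap around. This is only a sketch under assumptions: the helper names and the particular local rule (XOR of the two neighbours, radius 1) are hypothetical choices made for the example.

```python
def sliding_block_code(period, radius, local_rule):
    """Apply a radius-`radius` local rule to a periodic point given by one period."""
    n = len(period)
    return [
        local_rule(tuple(period[(i + j) % n] for j in range(-radius, radius + 1)))
        for i in range(n)
    ]


def shift(period, t=1):
    """Shift a periodic point by t: the new symbol at i is the old symbol at i + t."""
    n = len(period)
    return [period[(i + t) % n] for i in range(n)]


def xor_rule(window):
    """Hypothetical local rule: XOR of the left and right neighbours."""
    return window[0] ^ window[2]


x = [0, 1, 1, 0, 1]
coded_then_shifted = shift(sliding_block_code(x, 1, xor_rule))
shifted_then_coded = sliding_block_code(shift(x), 1, xor_rule)
```

The two results agree, reflecting that a map given by a local rule commutes with the shift.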
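The topological entropy of a $\mathbb{Z}^d$ subshift $X$ is
$$h(X) := \lim_{n \to \infty} \frac{1}{n^d} \log \left|\mathcal{L}_{[1,n]^d}(X)\right|.$$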
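Let $S$ be a finite set, and let $\mu$ be a probability measure on $S$. Then the entropy of $\mu$ is defined as
$$H(\mu) := -\sum_{s \in S} \mu(s) \log \mu(s),$$
where terms with $\mu(s) = 0$ are omitted.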
We will make use of the following basic facts about entropy, which are standard and which we present without proof.
Suppose $S$ is a finite set and $\mu$ is a probability measure on $S$. Then
$$H(\mu) \le \log |S|,$$
with equality if and only if $\mu$ is uniformly distributed on $S$.
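The bound of entropy by the logarithm of the number of outcomes, with equality for the uniform distribution, can be checked numerically. The sketch below is illustrative only; the function name `shannon_entropy` and the two sample distributions are assumptions introduced for the example.

```python
import math


def shannon_entropy(probs):
    """Entropy -sum p log p of a finite probability vector (zero terms omitted)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Two hypothetical distributions on a four-element set.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

h_uniform = shannon_entropy(uniform)
h_skewed = shannon_entropy(skewed)
upper_bound = math.log(4)
```

Here `h_uniform` agrees with `upper_bound` up to floating-point error, while `h_skewed` is strictly smaller.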
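Suppose $A$ and $S$ are finite sets and $\mu$ is a probability measure on $A^S$. For $T \subseteq S$, let $\mu|_T$ denote the projection (marginal) of $\mu$ onto $A^T$. Then for any partition $\{S_i\}$ of $S$, it holds that
$$H(\mu) \le \sum_i H(\mu|_{S_i}).$$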
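For a subshift $X$ and a measure $\mu$ in $\mathcal{M}(X)$, define $H_n(\mu)$ to be the entropy of $\mu$ with respect to the partition given by $\mathcal{L}_{[1,n]^d}(X)$:
$$H_n(\mu) := -\sum_{w \in \mathcal{L}_{[1,n]^d}(X)} \mu(w) \log \mu(w).$$
Then the entropy of the measure $\mu$ is given by
$$h(\mu) := \lim_{n \to \infty} \frac{H_n(\mu)}{n^d}.$$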
For any subshift $X$, the Variational Principle states that the topological entropy $h(X)$ of $X$ is the supremum of the measure-theoretic entropies $h(\mu)$ over all $\mu \in \mathcal{M}(X)$, which motivates the following definition.
For any subshift $X$, any measure $\mu \in \mathcal{M}(X)$ for which $h(\mu) = h(X)$ is called a measure of maximal entropy for $X$.
For general topological systems the supremum may not be achieved; nonetheless, every subshift has at least one measure of maximal entropy. It is natural to wonder when a subshift has a single such measure, which motivates the following definition.
A subshift is said to be intrinsically ergodic if it has exactly one measure of maximal entropy.
We now turn to the mixing condition that appears in Theorem 1.1.
A subshift has the D*-condition if for any there exists with the property that for any , there exists such that and .
The D*-condition was originally introduced as a property of subshifts $X$ which guarantees that any finite-range Gibbs measure on $X$ must be fully supported. It is significantly weaker than so-called uniform mixing conditions such as the uniform filling property and strong irreducibility/specification. The following fact follows almost immediately from Definition 2.15, but it will be expeditious to state it as a lemma.
If has the D*-condition, , and is defined as in Definition 2.15, then for any , any (possibly infinite) collection such that the sets are disjoint for , and any , there exists so that for any , it holds that , and for any .
For finite , we prove the lemma by induction on . The case is just the definition of the D*-condition. Now, suppose the lemma holds for . Consider any and and as in the lemma, and choose any . Then, one can first apply the inductive hypothesis for to get which satisfies the conclusion of the lemma for all . But then, applying the case to and yields satisfying the conclusion of the lemma for itself, completing the proof.
Now, for infinite (but by necessity countable) , we first assume to be without loss of generality. Then, for each , by appeal to the finite case, there exists which has the desired properties for . The sequence has a convergent subsequence by compactness, and its limit has the desired properties for all , completing the proof.
In the proof of Theorem 1.1, we will use the following technical lemma.
If is a nontrivial subshift with the D*-condition, then there exists a pattern so that if we define to be the subshift consisting of points in with no occurrences of , then .
Assume that is such a subshift. Since is nontrivial, the alphabet of contains at least letters. If for every finite shape , then is a periodic orbit of two points, which does not have the D*-condition, a contradiction. Therefore, there exists so that , and since enlarging cannot decrease , we assume without loss of generality that for some . Denote by the guaranteed by Definition 2.15, and choose any distinct patterns .
Begin with an arbitrary point , and use Lemma 2.16 with the set and for every . In other words, we create with a finite equispaced grid of occurrences of , whose centers have separation along each cardinal direction. Define , a pattern in which also contains the entire grid of occurrences of just described.
We claim that if is defined as in the lemma, then . To see this, we define a family of points in in the following way: start with , and use Lemma 2.16 with the set and any choice of for every . In other words, we create points with an infinite equispaced grid filled with independent choices of or , whose centers have separation along each cardinal direction. We claim that all such points are in . Suppose for a contradiction that such a point, call it , has an occurrence of . However, note that no matter how is shifted, it will contain some occurrence of with center in (because contained occurrences of in every coset in .) This occurrence gives a contradiction, since every translate of with center in in is filled with either or , and so not . For every , this yields at least patterns in (from independent choices of or ), and so
3. Proof of Theorem 1.1
The “bad factor” proving Theorem 1.1 will always be of the same type; it will be a shift of finite type based on the lattice Widom-Rowlinson model from statistical physics. We first define this SFT.
For any , the Widom-Rowlinson SFT with interaction distances and , denoted by , is the SFT with alphabet which consists of all satisfying the following local rules: (here and elsewhere, the distance between sites of always refers to the metric)
any pair of nonzero symbols must have distance greater than
any pair of nonzero symbols with opposite signs must have distance greater than
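The two local rules above can be checked mechanically on a finite pattern. The following Python sketch does so under stated assumptions: symbols are encoded as -1, 0, +1, the two interaction distances are called `k` (for all nonzero pairs) and `n` (for opposite-sign pairs), and distances are measured in the Chebyshev (max-coordinate) metric. All of these names and the encoding are hypothetical choices for the example.

```python
from itertools import combinations


def chebyshev(p, q):
    """Max-coordinate distance between two lattice sites (assumed metric)."""
    return max(abs(a - b) for a, b in zip(p, q))


def is_legal(pattern, k, n):
    """Check the two local rules on a finite pattern.

    `pattern` maps sites (tuples of ints) to symbols in {-1, 0, +1}.
    """
    nonzero = [(site, sym) for site, sym in pattern.items() if sym != 0]
    for (p, a), (q, b) in combinations(nonzero, 2):
        dist = chebyshev(p, q)
        if dist <= k:
            return False  # two nonzero symbols too close together
        if a != b and dist <= n:
            return False  # opposite signs too close together
    return True


same_sign = {(0, 0): 1, (0, 3): 1, (1, 1): 0}
opposite_sign = {(0, 0): 1, (0, 3): -1}
```

With `k = 2` and `n = 5`, `same_sign` passes both rules, while `opposite_sign` violates the second rule because the two symbols are at distance 3.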
It seems “well-known” that if is large compared to , then is not intrinsically ergodic (see ). However, technically the cited paper only treats the case where , and so we present a self-contained proof here. We will use a fairly standard Peierls argument, following .
For any and , there exists so that if , then is not intrinsically ergodic.
Fix any . For technical reasons, assume that . For brevity, we will refer to simply as . For any a multiple of , consider patterns on the cube with boundary condition on given by if exactly one of the is and all others are divisible by , and otherwise. In other words, contains equispaced symbols at distance on each face of the boundary and symbols elsewhere on the boundary. This leaves only sites on undefined, and so we can define the measure on which gives equal measure to every so that . For any , we define to be the event that (i.e., set of patterns such that) there is a symbol at . We will give an upper bound on .
To this end, fix and consider any . Then, consider the union of over all at which . This union is nonempty since implies that . It is contained within by the boundary condition and the fact that and symbols must be separated by distance greater than . It may consist of several disjoint connected components; define to be the one containing . Let be the “outermost contour” of , i.e. the set of sites in that are adjacent to a site in but also can be connected to the boundary of by a path of adjacent sites in .
Then define by (for “moat”) the set of sites in within distance of . It should be clear that every site in must be labeled by in ; such a site can’t be a since it’s within distance of a site not in , and it can’t be a since it’s in and thereby within distance of a symbol. We note for future reference that is in fact determined by , because of the following alternate definition of : is the set of all sites within distance of that are “inside” , i.e. which cannot be connected to a site on the boundary of without passing through a site in . We leave it to the reader to verify that this definition of is equivalent to the original one. See Figure 1 for an illustration. We note also, as it will be useful later, that has “thickness” at least in every cardinal direction, i.e. any line segment in a cardinal direction connecting a site inside to a site outside passes through at least consecutive sites of in between.
Define to be the set of all which have a particular set as its “moat.” Clearly is the disjoint union of over all possible . We wish to give an upper bound on each by a simple counting argument. Fix an and corresponding , and define a function as follows: is obtained from by “flipping” (i.e. changing to ) every symbol at a location with the following property: there exists a finite path of sites in where for every , is within distance of , and is within distance of for every . The case is included, i.e. symbols in which are themselves within distance from are flipped. (Figure 2 shows the application of to the pattern from Figure 1.)
We first wish to show that is indeed legal. Since changing to involves only switching of symbols to symbols, the only possible problem would be if there exist with distance less than or equal to for which and one was flipped in the process of changing to while the other was not. We show that such can not exist by considering three cases. First, it’s not possible to have and : by definition of , if and have distance less than or equal to , then , which would imply . (Clearly, the same proof shows that , is impossible.) Second, it’s clearly not possible to have , since then neither site would be flipped. The third case is , but the rules defining imply that if a symbol in is within of another flipped symbol in , then the first symbol must also be flipped, ruling this case out as well.
The map is not necessarily one-to-one on ; in looking at , if one sees a symbol, it is not immediately clear whether that was a present in or a changed to a . In order to determine from , it would suffice to know whether each such was flipped or not. We first note that it’s sufficient to know whether the symbols within distance of a site in were flipped or not. To see this, note that the only symbols in which could have been flipped must be connected to a symbol within distance of by a path of symbols of distances at most . But then, either all symbols on the path were flipped or all were not flipped, since their distances of at most mean that they were all forced to have the same sign in . This implies that the set of all symbols in which were flipped is precisely the set of symbols which can be connected to a flipped symbol within distance of by a path of symbols of distances at most .
We now wish to give an upper bound on the number of ways in which the symbols within distance of could have arisen. To do this, consider , the set of sites in within of a site in . This is a subset of , the set of sites in within of a site in . Then can be written as a union of sets over all . We break each such set into disjoint regions for . Then, for each one of these regions, all symbols inside must either all have been flipped or all have been not flipped, since the diameter of the region is at most . This means that we have an upper bound of on the number of ways in which each symbol in can have status “flipped” or “not flipped,” yielding the following upper bound:
For every , we wish to generate many legal patterns in by changing some of the symbols in to symbols. For this purpose, we note that in , no site in is within of a symbol; any such symbol would have been flipped by definition of . Therefore, when introducing symbols into sites in in , we must only check that we do not create a pair of symbols with distance less than .
By definition of , for every site , there exists a direction so that for all . Choose a fixed so that there is a set , , for which each site in satisfies the above condition for . Define ; clearly this union is disjoint and . Now we use a greedy algorithm to choose a subset so that each pair of sites in is separated by distance more than . Formally speaking, start with , and add sites to in the following way. Choose any site , remove it from , and add it to (making it the only element of for the moment). Then, remove all sites in within distance of . Repeat this procedure until is empty. At each step of this procedure, we increase by exactly one and decrease by less than , and so .
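The greedy selection described above can be sketched as follows. This is a minimal illustration, not the construction itself: the function name, the parameter `min_sep`, and the use of the Chebyshev metric are assumptions. In this metric each step removes at most $(2 \cdot \texttt{min\_sep} + 1)^d$ sites (including the chosen one), which gives the lower bound on the size of the selected set.

```python
def chebyshev(p, q):
    """Max-coordinate distance between two lattice sites (assumed metric)."""
    return max(abs(a - b) for a, b in zip(p, q))


def greedy_separated_subset(sites, min_sep):
    """Greedily pick sites with pairwise distance strictly greater than min_sep."""
    remaining = list(sites)
    chosen = []
    while remaining:
        site = remaining.pop(0)  # choose any remaining site
        chosen.append(site)
        # Discard every remaining site within distance min_sep of the chosen one.
        remaining = [t for t in remaining if chebyshev(site, t) > min_sep]
    return chosen


# Hypothetical example: a 10 x 10 block of sites in dimension d = 2.
grid = [(i, j) for i in range(10) for j in range(10)]
picked = greedy_separated_subset(grid, 2)
```

Every pair of sites in `picked` is separated by distance more than 2, and `len(picked)` is at least `len(grid) / 5**2`.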
Finally, we wish to remove from any sites within distance of , which reduces the size of by less than or equal to . Doing this yields a set with
We finally note that in , since all sites in are separated by more than from each other and none is within of any site outside , we may independently change the s at sites in to in any way to yield a legal pattern in . For given by , the sets of patterns thus obtained will obviously be disjoint, since the only changes are made within , where and both were labeled with all symbols. Therefore, by (3.1) and (3.2), we have
(The last inequality holds since .) This inequality gives that . Then, since determines , we obtain that
where is the number of possible contours of size surrounding the origin. It is well-known that up to translation, the number of connected subsets of sites of with size (the so-called lattice animals) is bounded from above by , and then the number of translates of a contour that could surround the origin is bounded from above by , so that we have . Therefore,
where . (We note that due to the original assumption .) This bound holds for every that is a sufficiently large multiple of and every . Note that if is fixed and we allow to approach infinity, then , yielding an upper bound approaching on that does not depend on or . On the other hand, we claim that regardless of how large is; indeed, one can make legal patterns in by independently choosing sites with all coordinates divisible by to be or and assigning all other sites to be . Now, note that for any measure in , if we define , , and , then
by convexity. The right-most expression in the above display is clearly continuous in and decreases to when decreases to , and so we can choose so that . Then, for any measure of maximal entropy of , we must have (otherwise, by the above computations, , contradicting being a measure of maximal entropy on ).
Now define so that whenever , we have that for every that is a sufficiently large multiple of and every . Then, take any weak limit point of the measures as ; clearly is shift-invariant. We note that for any two patterns in separated by distance greater than , the remainder of can be filled with s to make a point in , which implies that is strongly irreducible as defined in . Therefore, by Proposition 1.12(ii) from that same paper, is a measure of maximal entropy for . However, is a limit of averages of over various and is therefore less than or equal to whenever . We finally note that by the symmetry of the local rules defining , there must be another measure of maximal entropy obtained by simply flipping signs of and symbols for ; more rigorously, for any pattern , let , where is obtained from by flipping every nonzero symbol. Clearly, when , , proving that . Hence is not intrinsically ergodic.
We now show that any subshift with the D*-condition has Widom-Rowlinson SFTs as factors.
For any and any non-trivial subshift which has the D*-condition, there exists so that for any , there is a factor map .
Suppose that is a nontrivial subshift with the D*-condition, with associated sequence as in Definition 2.15. We may clearly assume without loss of generality that is nondecreasing, since replacing any with a larger integer preserves the conclusion of Definition 2.15. First, since , by Lemma 2.17, we may choose and a pattern so that removing from yields a nonempty subshift with .
Now we’ll use some results from  to create some markers. In , for any , a marker on is defined to be any pattern with the property that if and for some , then . Informally, a marker is a word on a square annulus such that two copies may overlap, but not in such a way that one intersects the “interior” of the other. Note that , and so any marker also satisfies this definition when is replaced by . Since , Proposition 3.5 from  guarantees, for any large enough , the existence of a marker for some which can be completed to at least two patterns in (in fact it guarantees much more, but this is all that we’ll need). We apply this to , yielding patterns (the completions of the marker to ) with the following property: if and and are both in for some , then . Clearly this implies that and that and contain no occurrences of .
Now we claim that suffices to prove the theorem. To this end, choose any and where . We define our factor map on as follows: if , then . If and there exists so that , then . If and there exists so that and , then . If and none of the previous three rules applies, then if and if . The reader may check that these rules are not contradictory, and so is a continuous shift-commuting map on . It remains to show that is surjective, i.e. that .
It is easy to see that , since the rules defining force any to satisfy the local rules from Definition 3.1. We now prove the opposite inclusion. Choose any , and we will construct so that (see Figure 3). We begin with an arbitrary . We will use Lemma 2.16 to change letters on in several phases, eventually yielding the desired . We begin by defining for every for which , and for every for which . We may do this by Lemma 2.16 since , and so each distinct pair , where we place and are distance at least apart. From now on, we will call these occurrences of and in “intentional placements.” Then, for every for which is a distance of at least from each of the intentional placements within , we define . Again, we may use Lemma 2.16 to define with the desired occurrences of at the desired locations, since all of the translates of on which we are placing are distance of at least apart. Also note that since we only placed these copies of at translates of which are distance at least from all intentional placements, no letter in any intentional placement is changed during this step. It is obvious that agrees with on all of the nonzero symbols in , so it remains to show that at all for which .
For this purpose, consider any at which , meaning that is not an intentional placement. We assume for a contradiction that . Then we consider two cases. First, assume that does not even overlap an intentional placement. In this case, consider the cube of the form whose center is closest to the center of . Clearly the distance between and the center of is less than or equal to . Since , this means that is contained within and is distance at least from the boundary of . Therefore also has distance of at least from the closest intentional placement. This is greater than since , and so by definition of , we have that . Since and , this means that , a contradiction.
We now deal with the case where does overlap an intentional placement. We first note that since , if overlaps an intentional placement, then every other intentional placement is distance at least from . Given this, define the unique with so that is an intentional placement. (Note that .) Since and have the marker property, we have that . Then regardless of , contains a translate of , which we denote by . As in the previous paragraph, if we define