The entropy of \alpha-continued fractions:numerical results

The entropy of -continued fractions:
numerical results

Carlo Carminati, Stefano Marmi, Alessandro Profeti, Giulio Tiozzo
28 November, 2009

We consider the one-parameter family of interval maps arising from generalized continued fraction expansions known as -continued fractions. For such maps, we perform a numerical study of the behaviour of metric entropy as a function of the parameter. The behaviour of entropy is known to be quite regular for parameters for which a matching condition on the orbits of the endpoints holds. We give a detailed description of the set where this condition is met: it consists of a countable union of open intervals, corresponding to different combinatorial data, which appear to be arranged in a hierarchical structure. Our experimental data suggest that the complement of is a proper subset of the set of bounded-type numbers, hence it has measure zero. Furthermore, we give evidence that the entropy on matching intervals is smooth; on the other hand, we can construct points outside of on which it is not even locally monotone.

1 Introduction

Let . We will study the one-parameter family of one-dimensional maps of the interval

If we let , , , then for every we get the expansion

with which we call -continued fraction. These systems were introduced by Nakada ([11]) and are also known in the literature as Japanese continued fractions.

The algorithm, analogously to Gauss’ map in the classical case, provides rational approximations of real numbers. The convergents are given by

It is known (see [9]) that for each there exists a unique invariant measure absolutely continuous w.r.t. Lebesgue measure.

Figure 1: Graph of

In this paper we will focus on the metric entropy of the ’s, which by Rohlin’s formula ([13]) is given by

Equivalently, entropy can be thought of as the average exponential growth rate of the denominators of convergents: for -a.e. ,

The exact value of has been computed for by Nakada ([11]) and for by Cassa, Marmi and Moussa ([10]).

In [9], Luzzi and Marmi computed numerically the entropy for by approximating the integral in Rohlin’s formula with Birkhoff averages

for a large number of starting points and then averaging over the samples:

Their computations show a rich structure for the behaviour of the entropy as a function of ; it seems that the function is piecewise regular and changes monotonicity on different intervals of regularity.

These features have been confirmed by some results by Nakada and Natsui ([12], thm. 2) which give a matching condition on the orbits of and

which allows to find countable families of intervals where the entropy is increasing, decreasing or constant (see section 3). It is not difficult to check that the numerical data computed via Birkhoff theorem fit extremely well with the matching intervals of [12].

Figure 2: Numerical data vs. matching intervals

In this paper we will study the matching condition in great detail. First of all, we analyze the mechanism which produces it from a group-theoretical point of view and find an algorithm to relate the -continued fraction expansion of and when a matching occurs. This allows us to understand the combinatorics behind the matchings once and for all, without having to resort to specific matrix identities. As an example, we will explicitly construct a family of matching intervals which accumulate on a point different from . In fact we also have numerical evidence that there exist positive values, such as , which are cluster point for intervals of all the three matching types: with , and .

We then describe an algorithm to produce a huge quantity of matching intervals, whose exact endpoints can be found computationally, and analyze the data thus obtained. These data show matching intervals are organized in a hierarchical structure, and we will describe a recursive procedure which should produce such structure.

Let now be the union of all matching intervals. It has been conjectured ([12], sect. 4, pg. 1213) that is an open, dense set of full Lebesgue measure. In fact, the correctness of our scheme would imply the following stronger

Conjecture 1.1.

For any , all elements of have regular continued fraction expansion bounded by .

Since the set of numbers with bounded continued fraction expansion has Lebesgue measure zero, this clearly implies the previous conjecture.

We will then discuss some consequences of these matchings on the shape of the entropy function, coming from a formula in [12]. This formula allows us to recover the behaviour of entropy in a neighbourhood of points where a matching condition is present. First of all, we will use it to prove that entropy has one-sided derivatives at every point belonging to some matching interval, and also to recover the exact value of for . In general, though, to reconstruct the entropy one also has to know the invariant density at one point.

As an example, we shall examine the entropy on an interval on which (by previous experiments, see [9], sect. 3) it was thought to be linearly increasing: we numerically compute the invariant density for a single value of and use it to predict the analytical form of the entropy on , which in fact happens to be not linear. The data produced with this extrapolation method agree with high precision, and much better than any linear fit, with the values of computed via Birkhoff averages.

The paper is structured as follows: in section 2 we will discuss numerical simulations of the entropy and provide some theoretical framework to justify the results; in section 3 we shall analyze the mechanisms which produce the matching intervals and in section 4 we will numerically produce them and study their hierarchical structure; in section 5 we will see how these matching conditions affect the entropy function.


This research was partially supported by the project “Dynamical Systems and Applications” of the Italian Ministry of University and Research111PRIN 2007B3RBEY., and the Centro di Ricerca Matematica “Ennio De Giorgi”.

2 Numerical computation of the entropy

Let us examine more closely the algorithm used in [9] to compute the entropy. A numerical problem in evaluating Birkhoff averages arises from the fact that the orbit of a point can fall very close to the origin: the computer will not distinguish a very small value from zero. In this case we neglect this point, and complete the (pseudo)orbit restarting from a new random seed222Another choice is to throw away the whole orbit and restart; it seems there is not much difference on the final result. As a matter of fact this algorithm produces an approximate value of

where ; of course is an excellent approximation of the entropy , since the difference is of order . To calculate we use the Birkhoff sums

and in [16] the fourth author proves that for large the random variable is distributed around its mean approximately with normal law and standard deviation where

which explains the aforementioned result by Luzzi and Marmi [9].

One of our goals is to study the function , in particular we ask whether it displays some regularity like continuity or semicontinuity. To this aim we pushed the same scheme as in [9] to get higher precision:

  1. We take a sample of values chosen in a particular subinterval ;

  2. For each value we choose a random sample in (the cardinality M of this sample is usually or );

  3. For each () we evaluate as described before (the number of iterates N will be );

  4. Finally, we evaluate the (approximate) entropy and take record of standard deviation as well:

2.1 Central limit theorem

Let us restate more precisely the convergence result for Birkhoff sums proved in [16]. Let us denote by the space of real-valued, -integrable, bounded variation functions of the interval . We will denote by the Birkhoff sum

Lemma 2.1.

Let and be an element of . Then the sequence

converges to a real nonnegative value, which will be denoted by . Moreover, if and only if there exists such that and


The condition given by is the same as in the proof of the central limit theorem for Gauss’ map, and it’s known as cohomological equation. The main statement of the theorem is the following:

Theorem 2.2.

Let and be an element of such that has no solutions. Then, for every we have

Since we know that the invariant density is bounded from below by a nonzero constant, we can show that

Proposition 2.3.

For every real-valued nonconstant , the equation has no solutions. Hence, the central limit theorem holds.

Now, for every the function define in the previous section is of bounded variation, hence the central limit theorem holds and the distribution of the approximate entropy approaches a Gaussian when . As a corollary, for the standard deviation of Birkhoff averages

2.2 Speed of convergence

In terms of numerical simulations it is of primary importance to estimate the difference between the sum computed at the step and the asymptotic value: a semi-explicit bound is given by the following

Theorem 2.4.

For every nonconstant real-valued , there exists such that


It follows from a Berry-Esséen type of inequality. For details see ([1], th.8.1). ∎

2.3 Dependence of standard deviation on

Given these convergence results for the entropy, it is natural to ask how the standard deviation varies with . In this case not a single exact value of is known; by using the fact that natural extensions of are conjugate ([7], [12]), it is straightforward to prove the

Lemma 2.5.

The map is constant for .


See appendix. ∎

The numerical study of this quantity is pretty interesting. We first considered the window , where the entropy is non-monotone. On this interval the standard deviation shows quite a strange behaviour: the values we have recorded do not form a cloud clustering around a continuous line (like for the entropy) but they cluster all above it.

Figure 3: Variance on the interval .

One might guess that this is due to the fact that the map is only semicontinuous, but the same kind of asymmetry takes place also on the interval , where is constant. Indeed, we can observe the same phenomenon also evaluating for a fixed value but taking several different sample sets.

Figure 4: Plot of the standard deviation of the different runs on the Gauss map

On the other hand this strange behaviour cannot be detected for other maps, like the logistic map, and could yet not be explained. Nevertheless, we point out that if you only consider observables, the standard deviation of Birkhoff sums can be proved continuous, at least for . See [16].

3 Matching conditions

In [12], Nakada and Natsui found a condition on the orbits of and which allows one to predict more precisely the behaviour of the entropy. Let us denote for any , , by the matrix such that , i.e.

They proved the following

Theorem 3.1.

([12], thm. 2) Let us suppose that there exist positive integers and such that

Then there exists such that, on , is :

  • strictly increasing if

  • constant if

  • strictly decreasing if

It turns out that conditions (I)-(II)-(III) define a collection of open intervals (called matching intervals); they also proved that each of the cases (i), (ii) and (iii) takes place at least on one infinite family of disjoint matching intervals clustering at the origin, thus proving the non-monotonicity of the entropy function. Moreover, they conjectured that the union of all matching intervals is a dense, open subset of with full Lebesgue measure.

In the following we will analyze more closely the mechanism which leads to the existence of such matchings. As a consequence, we shall see that it looks more natural to drop condition (III) from the previous definition and replace (II) with

(which implies ).

We can now define the matching set as

Note is open, since the symbolic codings of up to step and of up to step are locally constant.

Moreover, we will see that under this condition it is possible to predict the symbolic orbit of given the symbolic orbit of , and viceversa. As an application, we will construct a countable family of intervals which accumulates in a point different from .

Let us point out that our definition of matching produces a set slightly bigger than the union of all matching intervals satisfying condition (I,II,III): in fact the difference is just a countable set of points.

3.1 Structure of PGL(2, )

Let us define , . We have an exact sequence

where the first arrow is the inclusion and the second the determinant; moreover, if we consider the group

and let , then

therefore we have the semidirect product decomposition

Now, it is well known that is the free product


are such that , . Geometrically, represents the function , and if we denote by the element corresponding to the translation , we have .

The matrix projects to a generator of and it satisfies , and in so we get the presentation

3.2 Encoding of matchings

Every step of the algorithm generating -continued fractions consists of an operation of the type:

which corresponds to the matrix with

so if belongs to the cylinder we can express

Now, suppose we have a matching and let belong to the cylinder and belong to the cylinder . One can rewrite the matching condition as

hence it is sufficient to have an equality of the two Möbius transformations

We call such a matching an algebraic matching. Now, numerical evidence shows that, if a matching occurs, then

If we make this assumption we can rewrite the matching condition as

which implies , i.e. . If for instance and , by substituting one has

Since every element of can be written as a product of and in a unique way, one can get a relation between the and . Notice that, since we are interested in , and for every , hence there is no cancellation in the equation above. By counting the number of blocks at the beginning of the word, one has , and by semplifying,


If one has and instead, the matching condition is

which implies , and simplifying still yields equation .

Let us remark that is equivalent to

which is precisely condition : by evaluating both sides on

Moreover, from one has that to every bigger than it corresponds exactly a sequence of of length precisely , and viceversa. More formally, one can give the following algorithm to produce the coding of the orbit of up to step given the coding of the orbit of up to step (under the hypothesis that an algebraic matching occurs, and at least is known).

  1. Write down the coding of from step to , separated by a symbol 

  2. Subtract from every ; if , then leave the space empty instead of writing .

  3. Replace stars with numbers and viceversa (replace the number with consecutive stars, and write the number in place of stars in a row)

  4. Add to every number you find and remove the stars: you’ll get the sequence .


Let us suppose there is a matching with and has initial coding . The steps of the algorithm are:

so the coding of is , and .

3.3 Construction of matchings

Let us now use this knowledge to construct explicitly an infinite family of matching intervals which accumulates on a non-zero value of . For every , let us consider the values of such that belongs to the cylinder with the respect to . Let us compute the endpoints of such a cylinder.

  • The right endpoint is defined by


  • The left endpoint is defined by


By diagonalizing the matrices and computing the powers one can compute these value explicitly. In particular,

The s such that belongs to the cylinder are defined by the equations

for the left endpoint and

for the right endpoint, so the left endpoint corresponds to the periodic point such that



By comparing the first order terms one gets asymptotically

hence the two intervals intersect for infinitely many , producing infinitely many matching intervals which accumulate at the point . The length of such intervals is

4 Numerical production of matchings

In this section we will describe an algorithm to produce a lot of matching intervals (i.e. find out their endpoints exactly), as well as the results we obtained through its implementation. Our first attempt to find matching intervals used the following scheme:

  1. We generate a random seed of values belonging to (or some other interval of interest). When a high precision is needed (we manage to detect intervals of size ) the random seed is composed by algebraic numbers, in order to allow symbolic (i.e. non floating-point) computation.

  2. We find numerically candidates for the values of and (if any) simply by computing the orbits of and of up to some finite number of steps, and numerically checking if holds approximately for some and smaller than some bound.

  3. Given any triplet determined as above, we compute the symbolic orbit of up to step and the orbit of up to step .

  4. We check that the two Möbius transformations associated to these symbolic orbits satisfy condition :

  5. We solve the system of quadratic equations which correspond to imposing that and have the same symbolic orbit as and , respectively.

    Let us remark that this is the heaviest step of the whole procedure since we must solve quadratic inequalities; for this reason the value may be thought of as a measure of the computational cost of the matching interval and will be referred to as order of matching.

Following this scheme, we detected more than matching intervals, whose endpoints are quadratic surds; their union still leaves many gaps, each of which smaller than . A table with a sample of such data is contained in the appendix. 333A more efficient algorithm, which avoids random sampling, will be discussed in subsection 4.1.

In order to detect some patterns in the data, let us plot the size of these intervals (figure 5). For each matching interval , we drew the point of coordinates .

Figure 5: Size of matchings

It seems there is some self-similar pattern: in order to understand better its structure it is useful to identify some “borderline” families of points. The most evident family is the one that appears as the higher line of points in the above figure (which we have highlighted in green): these points correspond to matching intervals which contain the values , and their endpoints are , ; this is the family already exhibited in [12]. Since and , for the points are very close to . This suggests that this family will “straighten” if we replot our data in log-log scale. This is indeed the case, and in fact it seems that there are also other families which get perfectly aligned along parallel lines of slope 3 (see figure 6).

Figure 6: Same picture, in log-log scale.

If we consider the ordinary continued fraction expansion of the elements of these families we realize that they obey to some very simple empirical444Unfortunately we are still not able to prove all these rules. rules:

  1. the endpoints of any matching interval have a purely periodic continued fraction expansion of the type and ; this implies that the rational number corresponding to is a common convergent of both endpoints and is the rational with smallest denominator which falls inside the matching interval;

  2. any endpoint of a matching interval belongs to a family ; in particular this family has a member in each cylinder for , so that each family will cluster at the origin.

  3. other families can be detected in terms of the continued fraction expansion: for instance on each cylinder ( the largest matching interval on which is decreasing has endpoints with expansion and

  4. matching intervals seem to be organized in a binary tree structure, which is related to the Stern-Brocot tree555Sometimes also known as Farey tree. See [3].: one can thus design a bisection algorithm to fill in the gaps between intervals, and what it’s left over is a closed, nowhere dense set. This and the following points will be analyzed extensively in subsection 4.1;

  5. if is the endpoint of some matching interval then with ; this would imply that the values which do not belong to any matching interval must be bounded-type numbers with partial quotients bounded above by ;

  6. it is possible to compute the exponent of a matching from the continued fraction expansion of any one of its endpoints.

Figure 7: A few of the largest matching intervals in the window [1/4, 1/3], and the corresponding nodes of Stern-Brocot tree. The value on the y axis is the logarithm of the size of each interval.

From our data it is also evident that the size of these intervals decreases as increases, and low order matchings tend to disappear as approaches zero. Moreover, as tends to the space covered by members of “old” families of type (ii) encountered decreases, hence new families have to appear. One can quantify this phenomenon from figure 6: since the size of matching intervals in any family decreases as on the interval cylinder (whose size decreases like ): this means that, as n increases, the mass of gets more and more split among a huge number of tiny intervals.

This fact compromises our numerical algorithm: it is clear that choosing floating point values at random becomes a hopeless strategy when approaching zero. Indeed, even if there still are intervals bigger than the double-precision threshold, in most cases the random seed will fall in a really tiny interval corresponding to a very high matching order: this amounts to having very little gain as the result of a really heavy computation.

We still can try to test numerically the conjecture that the matching set has full measure on ; but we must expect that the percentage of space covered by matching intervals (found numerically) will decrease dramatically near the origin, since we only detect intervals with bounded by some threshold. The matching intervals we have found so far cover a portion of of the interval ; this ratio increases to if we restrict to the interval and it reaches restricting to the interval .

Figure 8: Dependence of the order of a matching interval on the left endpoint

The following graph represents the percentage of the interval which is covered by matching intervals of order for different values of 666 Let us point out that for big values of the graph does not take into account all matching intervals of order but only those we have found so far. . It gives an idea of the gain, in terms of the total size covered by matching intervals, one gets when refining the gaps (i.e. considering matching intervals of higher order).

Figure 9: Percentage of covering by matching intervals

Finally, to have a more precise idea of the relationship between order of matching and size of the matching interval it is useful to see the following scattered plot: the red dots correspond to matching intervals found using a random seed, and the green ones to intervals found using the bisection algorithm. The two lines bounding the cloud correspond to matching intervals with very definite patterns: the upper line corresponds to the family (with endpoints of type and ), the lower line corresponds to matching intervals with endpoints of type and . The latter ones converge to , which is the supremum of all values where the entropy is increasing.

Thus numerical evidence shows that, if is an interval with matching order then the size of is bounded below by where and . On the other hand we know for sure that, on the right of 0.0475 (which corresponds to the leftmost matching interval of our list), the biggest gap left by the matching intervals found so far is of order . So, if is a matching interval which still does not belong to our list, either and (see figure 8), or its size must be smaller than and by the forementioned empirical rule, its order must be . Hence, our list should include all matching intervals with .

Figure 10: The order versus the logarithm of the size of the first matching intervals found.

4.1 The matching tree

As mentioned before, it seems that matching intervals are organized in a binary tree structure. To describe such structure, we will provide an algorithm which allows to construct all matching intervals by recursively “filling the gaps” between matching intervals previously obtained, similarly to the way the usual Cantor middle third set is constructed.

In order to do so, let us first notice that every rational value has two (standard) continued fraction expansions:

One can associate to the interval whose endpoints are the two quadratic surds with continued fraction obtained by endless repetition of the two expansions of :

Definition 4.1.

Given with continued fraction expansion as above, we define to be the interval with endpoints

(in any order). The strings and will be said to be conjugate and we will write .

Notice that . It looks like all matching intervals are of type for some rational . On the other hand,

Definition 4.2.

Given an open interval one can define the pseudocenter of as the rational number which has the minimum denominator among all rational numbers contained in .

It is straightforward to prove that the pseudocenter of an interval is unique, and the pseudocenter of is itself.

We are now ready to describe the algorithm:

  1. The rightmost matching interval is ; its complement is the gap .

  2. Suppose we are given a finite set of intervals, called gaps of level , so that their complement is a union of matching intervals. Given each gap , we determine its pseudocenter . Let be the continued fraction expansion of , where is the finite string containing the first common partial quotients, the first partial quotient on which the two values differ, and the rest of the expansion of , respectively. The pseudocenter of will be the rational number with expansions