Limit shape of random permutations

The limit shape of random permutations with polynomially growing cycle weights

Alessandra Cipriani Weierstraß-Institut
Mohrenstraße 39
10117 Berlin
Germany
Alessandra.Cipriani@wias-berlin.de
 and  Dirk Zeindler Sonderforschungsbereich 701
Fakultät für Mathematik
Universität Bielefeld
Postfach 10 01 31
33501 Bielefeld
Germany
zeindler@math.uni-bielefeld.de
July 18, 2019
Abstract.

In this work we consider the behaviour of the limit shape of Young diagrams associated to random permutations on the set $\{1,\dots,n\}$ under a particular class of multiplicative measures with polynomially growing cycle weights. Our method is based on generating functions and complex analysis (the saddle point method). We show that fluctuations near a point behave like a normal random variable and that the joint fluctuations at different points of the limit shape have an unexpected dependence structure. We also compare our approach with the so-called randomization of the cycle counts of permutations, and we study the convergence of the limit shape to a continuous stochastic process.

1. Introduction and main results

The aim of this paper is to study the limit shape of a random permutation under the generalised Ewens measure with polynomially growing cycle weights and the fluctuations at each point of the limit shape. The study of such objects has a long history, which started with the papers of Temperley [24] and Vershik [25]. Later on, Young diagrams were approached from a different direction, as in the independent works [26] and [20], which first derived the limit shape when the underlying measure on partitions is the so-called Plancherel measure. We will not pursue this approach here, even though it has remarkable connections with random matrix theory and random polymers, among others (see for example [10]).

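We first specify what we mean by the limit shape of a permutation. We denote by $\mathfrak{S}_n$ the set of permutations on $n$ elements and write each permutation $\sigma \in \mathfrak{S}_n$ as $\sigma = \sigma_1 \cdots \sigma_\ell$ with disjoint cycles $\sigma_j$ of length $\lambda_j$. Disjoint cycles commute and we can thus assume $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_\ell$. This assigns to each permutation $\sigma \in \mathfrak{S}_n$ in a unique way a partition $\lambda = (\lambda_1, \dots, \lambda_\ell)$ of $n$, and this partition is called the cycle type of $\sigma$. We will indicate that $\lambda$ is such a partition with the notation $\lambda \vdash n$. We define the size $|\lambda| := \sum_{j} \lambda_j$ (so obviously if $\lambda \vdash n$ then $|\lambda| = n$). The partition $\lambda$ features a nice geometric visualisation by its Young diagram $\Upsilon_\lambda$. This is a left- and bottom-justified diagram of $\ell$ rows with the $j$th row consisting of $\lambda_j$ squares, see Figure 1(a).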

(a)  The Young diagram
(b)  The shape function
Figure 1. Illustration of the Young diagram and the shape of
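As a concrete illustration of the cycle type, the following Python sketch decomposes a permutation into disjoint cycles and returns the cycle lengths in non-increasing order. The function name `cycle_type` and the convention that a permutation is stored as a list `perm` with `perm[i]` the image of `i` (0-based) are choices made for this example only.

```python
def cycle_type(perm):
    """Return the cycle lengths of a permutation in non-increasing order.

    perm is a list with perm[i] the image of i, for i in 0..n-1
    (a 0-based convention chosen for this sketch).
    """
    n = len(perm)
    seen = [False] * n
    lengths = []
    for start in range(n):
        if seen[start]:
            continue
        # Follow the cycle through `start` until it closes.
        length = 0
        i = start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)


# Cycles: (0 1 2), (3 4), (5)
print(cycle_type([1, 2, 0, 4, 3, 5]))  # [3, 2, 1]
```

The returned list is a partition of the number of permuted elements, which is exactly the data needed to draw the Young diagram.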

It is clear that the area of the Young diagram of $\lambda$ is $n$ if $\lambda \vdash n$. After introducing a coordinate system as in Figure 1(b), we see that the upper boundary of a Young diagram is a piecewise constant and right continuous function $w_n$ with

(1.1)
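The shape function can be evaluated directly from a partition. The sketch below assumes that the shape function at $x$ counts the rows of the Young diagram of length at least $x$, which is how the upper boundary described above reads in the coordinate system of Figure 1(b). The name `shape` is hypothetical.

```python
def shape(partition, x):
    """Number of parts of `partition` that are >= x (assumed form of the shape function)."""
    return sum(1 for part in partition if part >= x)


lam = [4, 2, 1]
print([shape(lam, x) for x in range(1, 6)])  # [3, 2, 1, 1, 0]
# Summing the heights over unit-width columns recovers the area n = 7.
print(sum(shape(lam, x) for x in range(1, max(lam) + 1)))  # 7
```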

The cycle type of a permutation becomes a random partition if we endow the space $\mathfrak{S}_n$ of permutations with a probability measure $\mathbb{P}_n$. We are then interested in the now random shape function $w_n$ as $n \to \infty$, and more specifically in determining its limit shape. The limit shape with respect to a sequence of probability measures $\mathbb{P}_n$ on $\mathfrak{S}_n$ (and sequences of positive real numbers $n^*$ and $\overline{n}$ with $n^* \cdot \overline{n} = n$) is understood as a function $w_\infty$ such that for each $\epsilon, \delta > 0$

(1.2)

The assumption $n^* \cdot \overline{n} = n$ ensures that the area under the rescaled Young diagram is $1$. One of the most frequent choices is $n^* = \overline{n} = n^{1/2}$, but we will see that it is useful to adjust the choice of $n^*$ and $\overline{n}$ to the measures $\mathbb{P}_n$. Equation (1.2) can be viewed as a law of large numbers for the rescaled process. The next natural question is then whether fluctuations satisfy a central limit theorem, namely whether

converges (after centering and normalization) in distribution to a Gaussian process on the space of càdlàg functions, for example. Of course the role of the probability distribution with which we equip the set of partitions will be crucial to this end.

In this paper, we work with the following measure on :

(1.3)

where is the cycle type of , is a sequence of non-negative weights and is a normalization constant ( is defined to be ). From time to time we will also use introduced as convention.
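For small $n$ the normalization constant can be checked by brute force. The sketch below assumes that (1.3) gives a permutation a probability proportional to the product of the weights of its cycle lengths, with total normalization $h_n \, n!$, as is standard for the generalised Ewens measure. The names `cycle_lengths` and `h_brute_force`, and the weight function `theta`, are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple of images (0-based)."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        i, length = start, 0
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        lengths.append(length)
    return lengths


def h_brute_force(n, theta):
    """h_n = (1/n!) * sum over all permutations of prod_j theta(lambda_j).

    This normalisation is an assumption of the sketch.
    """
    total = Fraction(0)
    for perm in permutations(range(n)):
        weight = Fraction(1)
        for length in cycle_lengths(perm):
            weight *= theta(length)
        total += weight
    return total / factorial(n)


print(h_brute_force(4, lambda m: 1))  # 1  (uniform measure)
print(h_brute_force(4, lambda m: 2))  # 5  (Ewens measure with parameter 2)
```

For constant weights the sum over permutations is a rising factorial, which gives $2 \cdot 3 \cdot 4 \cdot 5 / 4! = 5$ in the second call.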

This measure has recently appeared in mathematical physics in a model of the quantum gas in statistical mechanics and has a possible connection with Bose-Einstein condensation (see e.g. [6] and [12]). Classical cases of this measure are the uniform measure (all weights equal to $1$) and the Ewens measure (all weights equal to a constant $\vartheta > 0$). The uniform measure is well studied and has a long history (see e.g. the first chapter of [3] for a detailed account with references). The Ewens measure originally appeared in population genetics, see [14], but also has various applications through its connection with Kingman’s coalescent process, see [18]. The measure in (1.3) also has some similarities to multiplicative measures for partitions, see for instance [8]. It is clear that we have to make some assumptions on the sequence of weights to be able to study the behaviour as $n \to \infty$. We use in this paper the weights

(1.4)

with some and . We would like to point out that the requirement and the normalisation constant are not essential; they only simplify the notation and the computations. In fact, we can study without further mathematical problems the case , see Remark 4.5. We get

Theorem 1.1.

We define

(1.5)

We then have

  1. The limit shape exists for the process as with respect to and the weights in (1.4) with the scaling and . The limit shape is given by

    where denotes the upper incomplete Gamma function.

  2. The fluctuations at a point of the limit shape behave like

    with

    and .

Remark 1.1.

The expectation of can be expanded asymptotically also to terms of lower order with the same argument. This will actually be important in the proof of Thm. 3.6. For the time being however we want to stress only the leading coefficient of the expansion.

Theorem 1.1 was already obtained in the special case , i.e. , by Erlihson and Granovsky in [13] in the context of Gibbs distributions on integer partitions; we were made aware of their work while writing the present paper. To be precise, one can push forward the measure to a measure on the set of partitions of with

(1.6)

where is a partition of and is the number of parts of length (see Section 2.1). These Gibbs distributions have been treated extensively in the literature ([6], [12] for example). One thus can work with or with . We prefer here to use .

The argument of Erlihson and Granovsky in [13] is stochastic and is based on randomisation: this technique was successfully introduced by [16], used in particular by [8] as a tool to investigate combinatorial structures, and later applied in many contexts. The approach in this paper is slightly different: it is based on complex analysis and uses the saddle-point method as described in Section 4. This method was used in [12] and [22], and an introduction can be found for instance in [15, Section VIII]. Our approach enables us to reprove Theorem 1.1, but with two big advantages. First, our computations are much simpler than those in [13]. Second, we obtain large deviation estimates almost for free. More precisely

Proposition 1.2.

We have for all and

The error terms are absolute.

We prove this by studying the cumulant generating function

(1.7)

This is an important difference to [13]. Erlihson and Granovsky directly consider the distribution of and studying with their method is computationally harder. In fact, we can compute the behaviour of all cumulants.

Theorem 1.3.

Let and

(1.8)

We then have for

(1.9)

with

(1.10)

We give the proofs of Theorem 1.1, Theorem 1.3 and Proposition 1.2 in Section 4.2. Furthermore, we introduce in Section 3 the so-called grand canonical ensemble with and a measure such that (see (3.2)). It is widely expected that the behaviour on grand canonical ensembles agrees with the behaviour on the canonical ensembles, but we will see here that this is only the case for macroscopic properties. More precisely, we will see in Theorem 3.8 that has a limit shape for the grand canonical ensemble and this agrees with the one for the canonical ensemble in Theorem 1.1. However, we will see also in Theorem 3.8 that the fluctuations at the points of the limit shape follow a different central limit theorem than in Theorem 1.1. Notice that we will not deduce Theorem 1.1 (nor any other of our results) from the grand canonical ensemble .

2. Preliminaries

We introduce in this section the notation of the cycle counts and the notation of generating functions.

2.1. Cycle counts

The notation is very useful for the illustration of via its Young diagram, but in the computations it is better to work with the cycle counts . These are defined as

(2.1)

for and the cycle type of . Conventionally . We obviously have for

(2.2)

It is also clear that the cycle type of a permutation (or a partition) is uniquely determined by the vector . The function and the measure in (1.1) and (1.3) can now be written as

(2.3)
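The passage from a partition to its cycle counts, and back to the shape function, can be sketched as follows. The example assumes that the cycle count of $m$ is the number of parts equal to $m$, and that the shape function at $x$ is the sum of the cycle counts over all $m \geq x$. The helper names are hypothetical.

```python
from collections import Counter


def cycle_counts(partition):
    """Map m -> number of parts of `partition` equal to m."""
    return Counter(partition)


def shape_from_counts(counts, x):
    """Number of parts >= x, written as a sum of cycle counts."""
    return sum(c for m, c in counts.items() if m >= x)


lam = [3, 3, 2, 1, 1, 1]
C = cycle_counts(lam)
print(sorted(C.items()))                 # [(1, 3), (2, 1), (3, 2)]
print(sum(m * c for m, c in C.items()))  # 11
print(shape_from_counts(C, 2))           # 3
```

The second printed value is the size of the partition recovered from the cycle counts alone.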

Our aim is to study the behaviour of as . It is thus natural to consider the asymptotic behaviour of with respect to the measure .

Lemma 2.1 ([12], Corollary 2.3).

Under the condition the random variables converge for each in distribution to a Poisson distributed random variable with . More generally for all the following limit in distribution holds:

with independent Poisson random variables with mean .
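The Poisson limit can be made plausible with exact arithmetic for small $n$. The sketch below rests on two assumptions that are standard for weighted permutation measures but are not quoted from the text above: the normalization constants satisfy the recurrence $n h_n = \sum_{k=1}^{n} \theta_k h_{n-k}$ with $h_0 = 1$, and the mean of the $m$-th cycle count equals $(\theta_m / m) \, h_{n-m} / h_n$. The function names are hypothetical.

```python
from fractions import Fraction


def h_sequence(theta, n_max):
    """h_0..h_{n_max} from the assumed recurrence n*h_n = sum_k theta(k)*h_{n-k}."""
    h = [Fraction(1)]
    for n in range(1, n_max + 1):
        h.append(sum(Fraction(theta(k)) * h[n - k] for k in range(1, n + 1)) / n)
    return h


def mean_cycle_count(theta, n, m):
    """Assumed identity: E_n[C_m] = (theta(m)/m) * h_{n-m} / h_n for 1 <= m <= n."""
    h = h_sequence(theta, n)
    return Fraction(theta(m)) / m * h[n - m] / h[n]


# Uniform weights: the mean number of 3-cycles is exactly 1/3.
print(mean_cycle_count(lambda k: 1, 10, 3))
# Constant weight 2: the mean number of fixed points is 20/11, close to the limit 2.
print(mean_cycle_count(lambda k: 2, 10, 1))
```

When the ratio of consecutive normalization constants tends to one, this mean tends to $\theta_m / m$, in line with the Poisson parameter of the lemma.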

One might expect at this point that is close to . Unfortunately we will see in Section 4 that the asymptotic behaviour of is more complicated.

2.2. Generating functions

The (ordinary) generating function of a sequence of complex numbers is defined as the formal power series

(2.4)

As usual, we define the extraction symbol , that is, as the coefficient of in the power series expansion (2.4) of .

A generating function that plays an important role in this paper is

(2.5)

As mentioned in the introduction, we will use , . We stress that generating functions of the type also fall into this category, and for them we will recover the limiting shape as previously done in [13]. We will treat this case in particular in Section 4.

Generating functions are useful because it is often possible to write down a generating function without knowing its coefficients explicitly. In this case one can try to use tools from analysis to extract information about the coefficients of large index from the generating function. It should be noted that there are several variants in the definition of generating functions. However, we will use only the ordinary generating function and thus simply call it the generating function, without risk of confusion.

The following well-known identity is a special case of Pólya’s general enumeration theorem [23, p. 17] and is the main tool in this paper to obtain generating functions.

Lemma 2.2.

Let be a sequence of complex numbers. We then have as formal power series in

where . If one series converges absolutely, so do the others.

We omit the proof of this lemma, but details can be found for instance in [21, p. 5].
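The exponential form of the identity makes the coefficients easy to compute. Writing $G(t) = \exp(g(t))$ with $g(t) = \sum_{k \geq 1} \theta_k t^k / k$ and differentiating gives $G'(t) = g'(t) G(t)$, hence $n h_n = \sum_{k=1}^{n} \theta_k h_{n-k}$ for the coefficients $h_n$ of $G$. The sketch below implements this recurrence with exact rational arithmetic. The function name `h_sequence` and the weight function `theta` are hypothetical.

```python
from fractions import Fraction


def h_sequence(theta, n_max):
    """Coefficients h_0..h_{n_max} of exp(sum_k theta(k) t^k / k).

    Uses n * h_n = sum_{k=1}^{n} theta(k) * h_{n-k} with h_0 = 1.
    """
    h = [Fraction(1)]
    for n in range(1, n_max + 1):
        h.append(sum(Fraction(theta(k)) * h[n - k] for k in range(1, n + 1)) / n)
    return h


# Uniform weights: exp(-log(1 - t)) = 1/(1 - t), so every coefficient is 1.
print([str(x) for x in h_sequence(lambda k: 1, 5)])
# Constant weight 2: 1/(1 - t)^2, so h_n = n + 1.
print([str(x) for x in h_sequence(lambda k: 2, 5)])
```

The recurrence needs $O(n^2)$ operations, compared with the $n!$ terms of a direct sum over permutations.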

2.3. Approximation of sums

We require for our argument the asymptotic behaviour of the generating function as its argument tends to the radius of convergence, which is $1$ in our case.

Lemma 2.3.

Let a sequence of positive numbers with as . We have for all

(2.6)

Here $\zeta(\cdot)$ denotes the Riemann zeta function. Furthermore, we have for

(2.7)

We indicate

This lemma can be proven with the Euler-Maclaurin summation formula or with the Mellin transform. The computations with Euler-Maclaurin summation are straightforward, and the details of the proof with the Mellin transform can be found for instance in [15, Chapter VI.8]. We thus omit the proof.

We require also the behaviour of partial sum as and as . We have

Lemma 2.4 (Approximation of sums).

Let and be given with and for and . We then have for all and all

with the incomplete Gamma function.

Remark 2.1.

One can obtain more error terms by using the Euler-Maclaurin summation formula with more derivatives. We have given in Appendix A a formulation of the Euler-Maclaurin summation formula with non-integer boundaries, which is more suitable for this computation than the usual one. Our primary interest is in the leading coefficient, hence we state the result only up to order . However, the lower order terms cannot be completely ignored. In particular they play an important role for the expectation of in Theorem 1.1 since there are, besides the leading term , also other terms in the asymptotic expansion which are not .

Proof.

In this proof, $B_1(x) = x - \tfrac{1}{2}$ denotes the first Bernoulli polynomial. The proof of the lemma is based on the Euler-Maclaurin summation formula, see [2] or [1, Theorem 3.1]. We use here the following version: let $f$ have a continuous derivative and suppose that $f$ and $f'$ are integrable. Then

(2.8)

We substitute , and notice that and all its derivatives tend to zero exponentially fast as . We begin with the first integral. Now by the change of variables

(2.9)

where we have swapped integral and series expansion of the exponential by Fubini’s theorem. This gives the behaviour of the leading term in (2.8) with . The remaining terms can be estimated by a similar computation, using that is bounded.
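The mechanism of the proof, replacing a sum by an integral that evaluates to an incomplete Gamma function, can be checked numerically in a simple instance. The sketch below is an illustration chosen for this purpose and not the general statement of Lemma 2.4: it compares $\sum_{k \geq x/v} k \, e^{-kv}$ with $v^{-2} \, \Gamma(2, x)$, using the closed form $\Gamma(2, x) = (1 + x) e^{-x}$. The function names and the truncation parameter `terms` are hypothetical.

```python
import math


def tail_sum(v, x, terms=200_000):
    """Sum of k*exp(-k*v) over integers k >= x/v, truncated after `terms` summands."""
    k0 = math.ceil(x / v)
    return sum(k * math.exp(-k * v) for k in range(k0, k0 + terms))


def gamma_approx(v, x):
    """Leading term v**-2 * Gamma(2, x), with Gamma(2, x) = (1 + x) * exp(-x)."""
    return (1.0 + x) * math.exp(-x) / v**2


# The ratio should approach 1 as v decreases to 0.
for v in (0.1, 0.01, 0.001):
    print(v, tail_sum(v, 1.0) / gamma_approx(v, 1.0))
```

The deviation of the ratio from one comes from the boundary and derivative terms of the Euler-Maclaurin formula, which are of lower order in $1/v$.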

3. Randomization

We introduce in this section a probability measure on , where denotes the disjoint union, dependent on a parameter with and consider the asymptotic behaviour of with respect to as .

3.1. Grand canonical ensemble

Computations on can turn out to be difficult and many formulas cannot be used to study the behaviour as . A possible solution to this problem is to adopt a suitable randomization. This technique was successfully introduced by [16], used also by [8] as a tool to investigate combinatorial structures, and later applied in many contexts. The main idea of randomization is to define a one-parameter family of probability measures on for which cycle counts turn out to be independent. Then one is able to study their behaviour more easily, and ultimately the parameter is tuned in such a way that randomized functionals are distributed as in the non-randomized context. Let us see how to apply this in our work. We define

(3.1)

with as in (2.5). If is finite for some , then for each let us define the probability measure

(3.2)

Lemma 2.2 shows that is indeed a probability measure on . The induced distribution on cycle counts can easily be determined.

Lemma 3.1.

Under the ’s are independent and Poisson distributed with

Proof.

From Pólya’s enumeration theorem (Lemma 2.2) we obtain

Analogously one proves the pairwise independence of cycle counts. ∎

Obviously the following conditioning relation holds:

A proof of this fact is easy and can be found for instance in [17, Equation (1)]. We note that is -a.s. finite, since . Now since the conditioning relation holds for all with , one can try to look for satisfying “”, which heuristically means that we choose a parameter for which permutations on carry most of the mass of the measure . We have on

A natural choice for is thus the solution of

(3.3)

which is guaranteed to exist if the series on the right-hand side is divergent at the radius of convergence (we will see this holds true for our particular choice of weights). We write and use Lemma 2.3 in our case to obtain

(3.4)

We will fix this choice for the rest of the section.
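The choice of the parameter can also be made numerically. The sketch below assumes that (3.3) equates $n$ with the mean size under the randomized measure, which equals $\sum_{m \geq 1} \theta_m t^m$ if the cycle counts are independent Poisson variables with means $\theta_m t^m / m$. It solves this equation by bisection on a truncated series. The function names, the truncation level `m_max` and the example weights are hypothetical choices for illustration.

```python
def expected_size(theta, t, m_max=20_000):
    """Truncated series sum_{m>=1} theta(m) * t**m (assumed mean size under P_t)."""
    total, power = 0.0, 1.0
    for m in range(1, m_max + 1):
        power *= t
        total += theta(m) * power
    return total


def solve_t(theta, n, tol=1e-12):
    """Bisection for t in (0, 1) with expected_size(theta, t) = n.

    Reliable only when n is much smaller than the truncated sum of the weights.
    """
    lo, hi = 0.0, 1.0 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_size(theta, mid) < n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Constant weights 1: t / (1 - t) = n, so t = n / (n + 1).
print(solve_t(lambda m: 1.0, 10), 10 / 11)
# Linearly growing weights theta(m) = m: t / (1 - t)^2 = n.
t = solve_t(lambda m: float(m), 100)
print(t, t / (1 - t) ** 2)
```

Bisection is used because the series is increasing in $t$, so no derivative information is needed.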

3.2. Limit shape and mod-convergence

In order to derive our main results from the measure we will use a tool developed in [19], mod-Poisson convergence.

Definition 3.2.

A sequence of random variables converges in the mod-Poisson sense with parameters if the following limit

exists for every , and the convergence is locally uniform. The limiting function is then continuous and .
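A quick numerical check clarifies the normalization. The sketch below assumes the usual form of the definition, in which the characteristic function of the random variable is multiplied by $\exp(\lambda (1 - e^{i\theta}))$, the reciprocal of the characteristic function of a Poisson variable with parameter $\lambda$. For an exactly Poisson variable the normalized quantity is then identically one. The function names and the truncation `k_max` are hypothetical.

```python
import cmath
import math


def poisson_char(lam, theta, k_max=400):
    """E[exp(i*theta*Z)] for Z ~ Poisson(lam), summed directly from the pmf."""
    total = 0j
    log_pmf = -lam  # log P[Z = 0]
    for k in range(k_max + 1):
        total += math.exp(log_pmf) * cmath.exp(1j * theta * k)
        log_pmf += math.log(lam) - math.log(k + 1)  # log P[Z = k + 1]
    return total


def normalised_char(lam, theta):
    """exp(lam * (1 - e^{i*theta})) * E[exp(i*theta*Z)] (assumed mod-Poisson normalisation)."""
    return cmath.exp(lam * (1 - cmath.exp(1j * theta))) * poisson_char(lam, theta)


# For an exactly Poisson variable the result equals 1 up to rounding.
print(abs(normalised_char(5.0, 0.7) - 1))
```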

This type of convergence gives stronger results than a central limit theorem; indeed, it implies a CLT (and other properties we will see below). For the rest of the section let us fix and as in (1.5). We obtain

Proposition 3.3.

Let be arbitrary and . Furthermore, let with as in (3.4). Then the random variables converge in the mod-Poisson sense with parameters

where

(3.5)

is the upper incomplete Gamma function.

Proof.

We have

(3.6)

This is the characteristic function of a Poisson distribution. We thus obviously have mod-Poisson convergence with limiting function . It remains to compute the parameter . Applying Lemma 2.3 for and Lemma 2.4 for together with (3.4) gives

(3.7)

We deduce that . This completes the proof. ∎
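The Poisson parameter in this proof is a tail sum of the Poisson means of the cycle counts. The sketch below assumes that these means are $\theta_m t^m / m$, so that the shape function at $x$, being a sum of independent Poisson variables over $m \geq x$, is Poisson with mean $\sum_{m \geq x} \theta_m t^m / m$. The function name `mean_shape` and the truncation `m_max` are hypothetical.

```python
import math


def mean_shape(theta, t, x, m_max=10_000):
    """Truncated tail sum over m >= x of theta(m) * t**m / m (assumed Poisson parameter)."""
    total = 0.0
    for m in range(max(1, math.ceil(x)), m_max + 1):
        total += theta(m) * t**m / m
    return total


# Constant weights 1 and x = 1: the full series equals -log(1 - t).
print(mean_shape(lambda m: 1.0, 0.5, 1), math.log(2))
# The mean decreases in x, as a shape function should.
print(mean_shape(lambda m: 1.0, 0.5, 3))
```

In the regime of the proposition the parameter $t$ depends on $n$ and approaches the radius of convergence, which is where the incomplete Gamma function enters through Lemma 2.4.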

This yields a number of interesting consequences. In the first place, we can prove a CLT and detect the limit shape accordingly.

Corollary 3.4 (CLT and limit shape for randomization).

With the notation as above, we have as with respect to

(3.8)

Furthermore the limit shape of is given by (with scaling and , see (1.2)). In particular, we can choose in (1.2).

Proof.

The CLT follows immediately from [19, Prop. 2.4], but can also be deduced easily from (3.6) by replacing by . It is also straightforward to show that is the limit shape. For a given , we choose such that for and . It is now easy to see that for each

Thus

(3.9)

It now follows from (3.8) that each summand in (3.9) tends to as . This completes the proof. ∎

Another by-product of mod-Poisson convergence of a sequence is that one can approximate with a Poisson random variable with parameter , see [19, Prop. 2.5]. However in our situation this is trivial since is already Poisson distributed.

As in the next section, we are also interested in the (joint) behaviour of increments.

Proposition 3.5.

For all , , set

Then

(3.10)

as with and with .

Furthermore, and are asymptotically independent.

Remark 3.1.

As we will see, the proof of independence relies on the independence of cycles coming from Lemma 3.1. Therefore it is easy to generalize the above result to more than two points.

Proof.

The proof of (3.10) is almost the same as the proof of (3.8) and we thus omit it. Since

and all are independent, we have that and are independent for each . Thus and are also independent in the limit. ∎

3.3. Functional CLT

The aim of this section is to prove a functional CLT for the profile of the Young diagram. Similar results were obtained in a different framework by [17, 11] on the number of cycle counts not exceeding , and by [5] for Young diagrams confined in a rectangular box. We show

Theorem 3.6.

The process (see (3.8)) converges weakly with respect to as to a continuous process with and independent increments.

The technique we will exploit is quite standard (see [17]). We remark that, unlike in [17], where the Ewens measure is considered, we do not obtain here a Brownian process, as the variance of for is more complicated than in the case of the Wiener measure.

We know from Proposition 3.5 the finite dimensional marginals of the process. More specifically we have for that

(3.11)

where is a diagonal matrix with

Now all we need to show to complete the proof of Theorem 3.6 is the tightness of the process . In order to do so, we will proceed similarly to [17], namely we will show that

Lemma 3.7.

We have for with arbitrary

(3.12)

with , and .

Lemma 3.7 together with [7, Theorem 15.6] implies that the process is tight. This and the marginals in (3.11) prove Theorem 3.6.

Proof of Lemma 3.7.

We define

(3.13)

The independence of the cycle counts leads us to