
On the Scaling Limits of Determinantal Point Processes with Kernels Induced by Sturm–Liouville Operators

(This paper is a contribution to the Special Issue on Asymptotics and Universality in Random Matrices, Random Growth Processes, Integrable Systems and Statistical Physics in honor of Percy Deift and Craig Tracy. The full collection is available at http://www.emis.de/journals/SIGMA/Deift-Tracy.html)

Folkmar BORNEMANN

Zentrum Mathematik – M3, Technische Universität München, 80290 München, Germany
E-mail: bornemann@tum.de
URL: http://www-m3.ma.tum.de/bornemann

Received April 15, 2016, in final form August 16, 2016; Published online August 19, 2016

Abstract. By applying an idea of Borodin and Olshanski [J. Algebra 313 (2007), 40–60], we study various scaling limits of determinantal point processes with trace class projection kernels given by spectral projections of selfadjoint Sturm–Liouville operators. Instead of studying the convergence of the kernels as functions, the method directly addresses the strong convergence of the induced integral operators. We show that, for this notion of convergence, the Dyson, Airy, and Bessel kernels are universal in the bulk, soft-edge, and hard-edge scaling limits. This result allows us to give a short and unified derivation of the known formulae for the scaling limits of the classical random matrix ensembles with unitary invariance, that is, the Gaussian unitary ensemble (GUE), the Wishart or Laguerre unitary ensemble (LUE), and the MANOVA (multivariate analysis of variance) or Jacobi unitary ensemble (JUE).

Keywords: determinantal point processes; Sturm–Liouville operators; scaling limits; strong operator convergence; classical random matrix ensembles; GUE; LUE; JUE; MANOVA

Classification: 15B52; 34B24; 33C45

Dedicated to Percy Deift on the occasion of his 70th birthday.

1 Introduction

We consider determinantal point processes on a (not necessarily bounded) interval $\Lambda \subseteq \mathbb{R}$ with a correlation kernel given by a trace class projection kernel,
$$K_N(x,y) = \sum_{k=0}^{N-1} \phi_k(x)\,\phi_k(y), \qquad (1.1)$$
where $\phi_0, \ldots, \phi_{N-1}$ are orthonormal in $L^2(\Lambda)$; each $\phi_k$ may have some dependence on $N$ that we suppress from the notation. We recall (see, e.g., [2, Section 4.2]) that for such processes the joint probability density of the $N$ points $x_1, \ldots, x_N$ is given by
$$p(x_1, \ldots, x_N) = \frac{1}{N!}\det\big(K_N(x_j, x_k)\big)_{j,k=1}^{N},$$
the mean counting probability is given by the density (note that $\int_\Lambda \rho_N(x)\,dx = 1$)
$$\rho_N(x) = N^{-1} K_N(x, x),$$
and the gap probabilities are given, by the inclusion-exclusion principle, in terms of a Fredholm determinant, namely
$$\mathbb{P}(\text{no point lies in } J) = \det\big(I - K_N|_{L^2(J)}\big).$$
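For orientation, such a Fredholm determinant can be evaluated numerically by discretizing it with a quadrature rule. The sketch below does this for the Dyson sine kernel on a bounded interval; the kernel normalization $\sin(\pi(x-y))/(\pi(x-y))$, the interval $(0,s)$, and the use of Gauss–Legendre quadrature are assumptions of this illustration rather than conventions taken from the text.

```python
import numpy as np

def sine_kernel(x, y):
    # Dyson sine kernel sin(pi(x-y))/(pi(x-y)); note np.sinc(t) = sin(pi t)/(pi t)
    return np.sinc(np.subtract.outer(x, y))

def gap_probability(s, m=60):
    """Approximate det(I - K restricted to (0, s)) on m Gauss-Legendre nodes."""
    nodes, weights = np.polynomial.legendre.leggauss(m)
    x = 0.5 * s * (nodes + 1.0)       # nodes mapped from (-1, 1) to (0, s)
    w = 0.5 * s * weights             # corresponding quadrature weights
    rw = np.sqrt(w)
    # symmetrized discretization I - W^(1/2) K W^(1/2) of the Fredholm determinant
    return np.linalg.det(np.eye(m) - rw[:, None] * sine_kernel(x, x) * rw[None, :])

print(gap_probability(1.0))           # probability of finding no point in (0, 1)
```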

The various scaling limits are usually derived from an appropriate convergence of the kernel by considering the large-$N$ asymptotics of the eigenfunctions $\phi_k$, which can be technically quite involved (based on the two-scale Plancherel–Rotach asymptotics of classical orthogonal polynomials or, methodologically more generally, on the asymptotics of Riemann–Hilbert problems; see, e.g., Tracy and Widom [21, 22], Deift [6], Lubinsky [16], Johnstone [12, 13], Collins [5], Forrester [8], Anderson et al. [2], and Kuijlaars [14]).

Borodin and Olshanski [4] suggested, for discrete point processes, a different, conceptually and technically much simpler approach based on selfadjoint difference operators. We will show that their method, generalized to selfadjoint Sturm–Liouville operators, allows us to give a short and unified derivation of the various scaling limits for the random matrix ensembles with unitary invariance that are based on the classical orthogonal polynomials (Hermite, Laguerre, Jacobi).

The Borodin–Olshanski method

The method proceeds in three steps: First, we identify the induced integral operator as the spectral projection (where we denote by  the characteristic function of a Borel subset and by  the application of that function to the selfadjoint operator in the sense of measurable functional calculus [17, Theorem VIII.6])

of some selfadjoint ordinary differential operator on . Any scaling of the point process by () yields, in turn, the induced rescaled operator

where is a selfadjoint differential operator on , .
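To spell out why the rescaled operator still carries a spectral-projection structure, here is a minimal identity; the affine change of variables $x = \sigma\xi + \mu$ and the unitary $U$ below are notational assumptions of this sketch. Since measurable functional calculus commutes with unitary conjugation,
$$(Uf)(x) = \sigma^{-1/2} f\!\left(\frac{x-\mu}{\sigma}\right), \qquad \chi_{(-\infty,0)}\big(U^{*} L\, U\big) = U^{*}\, \chi_{(-\infty,0)}(L)\, U,$$
so the projection induced by the conjugated differential operator $U^{*} L U$ is again a spectral projection, with integral kernel transformed exactly as described in (1.5) below.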

Second, if with , , we aim for a selfadjoint operator on with a core such that eventually and

(1.2)

The point is that, if the test functions from the core are particularly nice, such a convergence is just a simple consequence of the locally uniform convergence of the coefficients of the differential operators – a convergence that is, typically, an easy calculus exercise. Now, given (1.2), the concept of strong resolvent convergence (see Theorem A.1) immediately yields (here and in the following, the arrow notation denotes the strong convergence of operators acting on the underlying Hilbert space), if ,

Third, we take an interval , eventually satisfying , such that the operator is trace class with kernel (which can be obtained from the generalized eigenfunction expansion of , see Section A.2). Then, we immediately get the strong convergence

Remark 1.1.

Tao [20, Section 3.3] sketches the Borodin–Olshanski method, applied to the bulk and edge scaling of GUE, as a heuristic device. Because of the microlocal methods that he uses to calculate the projection , he puts his sketch under the headline “The Dyson and Airy kernels of GUE via semiclassical analysis”.

Scaling limits and other modes of convergence

Given that one just has to establish the convergence of the coefficients of a differential operator (instead of asymptotics of its eigenfunctions), the Borodin–Olshanski method is an extremely simple device to determine all the scalings that would yield some meaningful limit , namely in the strong operator topology. Other modes of convergence have been studied in the literature, ranging from weak convergence of the $n$-point correlation functions, through convergence of the kernel functions, to the convergence of gap probabilities, that is,

From a probabilistic point of view, the latter convergence is of particular interest and has been shown in at least three ways:

  1. By Hadamard’s inequality, convergence of the determinants follows directly from the locally uniform convergence of the kernels [2, Lemma 3.4.5] and, for unbounded , from additional large deviation estimates [2, Lemma 3.3.2]. This way, the limit gap probabilities in the bulk and soft-edge scaling limits of GUE can be rigorously established (see, e.g., Anderson et al. [2, Sections 3.5 and 3.7]). Johansson [11, Lemma 3.1] gives some general conditions on a scaling of the  such that the determinant converges to that of the soft-edge limit of GUE.

  2. Since is continuous with respect to the trace class norm [18, Theorem 3.4], in trace class norm would generally suffice. Such a convergence can be proved by factorizing the trace class operators into Hilbert–Schmidt operators and obtaining the -convergence of the factorized kernels once more from locally uniform convergence, see the work of Johnstone [12, 13] on the scaling limits of the LUE/Wishart ensembles and on the limits of the JUE/MANOVA ensembles.

  3. Since  and  are selfadjoint and positive semi-definite, yet another way is to observe that convergence in trace class norm is, for continuous kernels, equivalent [18, Theorem 2.20] to the combination of convergence in the weak operator topology and convergence of the traces

    (1.3)

    Once again, these convergences follow from locally uniform convergence of the kernels; see Deift [6, Section 8.1] for an application of this method to the bulk scaling limit of GUE.

Since convergence in the strong operator topology implies convergence in the weak one, the Borodin–Olshanski method would thus establish the convergence of gap probabilities if we were only able to show condition (1.3) by some additional, similarly short and simple argument. Note that, by the ideal property of the trace class, condition (1.3) implies the same condition for all . We have, however, not been able to conceive a proof strategy for condition (1.3) that would be independent of all the laborious proofs of locally uniform convergence of the kernels.

Remark 1.2.

Contrary to the discrete case considered by Borodin and Olshanski, it is also not immediate to infer from the strong convergence of the induced integral operators the pointwise convergence of the kernels. In Section 2 we will need only a single such instance, namely

(1.4)

to prove a limit law for the mean counting probability. Using mollified Dirac deltas, pointwise convergence would generally follow, for continuously differentiable kernels, if we were able to bound, locally uniformly, the gradient of . Then, by dominated convergence, criterion (1.3) would already be satisfied if we established an integrable bound of  on . Since the scaling laws are, however, maneuvering just at the edge between trivial cases (i.e., zero limits) and divergent cases, it is conceivable that a proof of such bounds might not be significantly simpler than a proof of the convergence of the gap probabilities itself.
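One way to make the mollification argument concrete is the following sketch (our notation; it presupposes exactly the locally uniform gradient bound asked for above): writing $K_n$ and $K$ for the rescaled and limit kernels and taking a smooth probability density $\varphi_\varepsilon$ concentrated in an $\varepsilon$-neighborhood of $x$,
$$\big|K_n(x,x) - \langle \varphi_\varepsilon, K_n \varphi_\varepsilon\rangle\big| \leq c\,\varepsilon \sup_{|u-x|,\,|v-x| \leq \varepsilon} \big|\nabla K_n(u,v)\big|, \qquad \langle \varphi_\varepsilon, K_n \varphi_\varepsilon\rangle \to \langle \varphi_\varepsilon, K \varphi_\varepsilon\rangle \quad (n \to \infty),$$
where the second limit, for fixed $\varepsilon$, is just the strong (hence weak) operator convergence tested against $\varphi_\varepsilon$; letting $\varepsilon \to 0$ under a locally uniform gradient bound would then give the pointwise convergence on the diagonal.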

The main result

To prepare, we recall how an integral kernel transforms covariantly under an affine change of coordinates $x = \sigma\xi + \mu$, $\sigma > 0$: by invariance of the underlying form, the transformed kernel is given by
$$K^{\sigma,\mu}(\xi, \eta) = \sigma\, K(\sigma\xi + \mu,\, \sigma\eta + \mu). \qquad (1.5)$$

Using the Borodin–Olshanski method, we will prove the following general result for selfadjoint Sturm–Liouville operators; a result that adds a further class of problems to the universality [14] of the Dyson, Airy, and Bessel kernels (for the definitions of these kernels see (A.3), (A.4) and (A.5)) in the bulk, soft-edge, and hard-edge scaling limits.

Theorem 1.3.

Let  be one of the three domains , , or , and let  be a selfadjoint realization on  of the formally selfadjoint Sturm–Liouville operator (since, in this paper, we always consider a particular selfadjoint realization of a formal differential operator, we use the same letter to denote both)

with coefficients such that for all . Assume that, for and , there are asymptotic expansions

(1.6)

with a remainder that is of order locally uniform in , and exponents normalized by

(1.7)

where  if . Further assume that these expansions can be differentiated (we say that an expansion can be differentiated if the correspondingly differentiated expansion holds as well), that the roots of  are simple, and that the spectral projection  is normalized by

Let a scaling by induce the transformed projection kernel according to (1.5).

Then, depending on particular choices of and , the following three scaling limits hold.

  • Bulk scaling limit: given with , the scaling parameters

    where

    (1.8)

    yield, for a bounded interval , the strong limit

    At , the mean counting probability density transforms to the new variable as

    Under condition (1.4), and if as defined in (1.8) has unit mass on , there is the limit law

  • Soft-edge scaling limit: given with , the scaling parameters

    yield, for and a not necessarily bounded interval , the strong limit

  • Hard-edge scaling limit: given that or with

    (1.9)

    the scaling parameters

    yield, for a bounded interval , the strong limit

    (1.11)

    (Here, if , the selfadjoint realization  is defined by means of the boundary condition (1.10).)
Remark 1.4.

Whether the interval  in the strong operator limit can be chosen unbounded or not depends on whether the limit operator is trace class or not (see the explicit formulae of the traces given in the appendix for each of the three limits): only in the former case do we get a representation of the scaling limit in terms of a particular integral kernel, cf. Theorem A.3. Note that it is impossible to use  since .

Outline of the paper

The proof of Theorem 1.3 is the subject of Section 2. In Section 3 we apply it to the classical orthogonal polynomials, which yields a short and unified derivation of the known formulae for the scaling limits of the classical random matrix ensembles with unitary invariance (GUE, LUE/Wishart, JUE/MANOVA). In fact, by a result of Tricomi, the only input needed is the weight function  of the orthogonal polynomials; from there, in a purely formula-based fashion (by simple manipulations which can easily be coded in any computer algebra system), one first obtains the coefficients  and  as well as the eigenvalues  of the Sturm–Liouville operator , and then, by applying Theorem 1.3, the particular scaling limits.

To emphasize that our main result and its application are largely independent of concretely identifying the limit projection kernel , we postpone this identification to Lemmas A.5, A.7 and A.9: there, using generalized eigenfunction expansions, we calculate the Dyson, Airy, and Bessel kernels directly from the limit differential operator .

2 Proof of the main result for Sturm–Liouville operators

We start the proof of Theorem 1.3 with some preparatory steps before we deal with the particular scaling limits. Since is a selfadjoint realization on of the Sturm–Liouville operator

with and for , we have .

Preparatory Step 1: transformation

The scaling

maps bijectively to . Since such an affine coordinate transform just induces a unitary equivalence of integral and differential operators, the spectral projection relation

is left invariant if the kernel is transformed according to (1.5) and the differential operator  is transformed using as

Since the spectral projection onto the negative part of the spectrum of a differential operator is left invariant if we multiply that operator by some positive constant , , we see that

where the transformed differential operator is given finally by

with coefficients

(2.1)

Preparatory Step 2: strong operator limit

Suppose the transformed domain satisfies , . Then, with  we have that, eventually, . Further, suppose that the coefficients of  converge locally uniformly in  as  (where the limit of  can be differentiated)

such that the limit coefficients and are smooth functions and

(2.2)

defines a Sturm–Liouville operator that is essentially selfadjoint on . Then, by dominated convergence, we get the convergence in for each test function in the core . Hence, by Theorem A.1 we have the strong operator convergence

if and, eventually, . In the particular cases considered in the following limit steps of the proof, the spectrum of is always absolutely continuous, that is, . Finally, by Theorem A.3, under the finite trace condition mentioned already in Remark 1.4, there is an integral kernel such that

which finishes the proof of a strong operator convergence in general.

Preparatory Step 3: Taylor expansions of the coefficients

The case

Suppose that is fixed. The choice is then admissible and we get, if

from (1.6), (1.7), and (2.1) by a Taylor expansion

(2.3)

which holds locally uniformly in  (where the expansion of  can be differentiated).

The case

Suppose that the assumptions in (1.9) are met. If , the choice is admissible and we get from (2.1) by a Taylor expansion

(2.4)

which holds locally uniformly in  (where the expansion of  can be differentiated).

Limit Step 1: bulk scaling limit

If , by inserting

we read off from (2.3) the limit coefficients and , where ; that is, the limit differential operator (2.2) is given by

Note that, for the domains and the values of considered, we have .

Lemma A.5 states that is essentially selfadjoint on and that its unique selfadjoint extension has absolutely continuous spectrum: . Thus, for , the spectral projection is zero. For , the spectral projection can be calculated by a generalized eigenfunction expansion, yielding the Dyson kernel (A.3).

We will see in the next step that the dichotomy between is also reflected in the structure of the support of the limit law .

Limit Step 2: limit law

The result for the bulk scaling limit allows us, in passing, to calculate a limit law of the mean counting probability density : we observe that  transforms the density into

Thus, to get a limit, we have to assume condition (1.4), so that a pointwise rendering of the bulk scaling limit just considered yields (here the Iverson bracket $[S]$ equals $1$ if the statement $S$ is true and $0$ otherwise)

This way we get

Hence, by Helly’s selection theorem, the probability measure converges vaguely to , which is, in general, just a sub-probability measure. If, however, it is checked that has unit mass, the convergence is weak.

Limit Step 3: soft-edge scaling limit

If , by inserting (note that, by the assumption made on the simplicity of the roots of , we have )

we read off from (2.3) the limit coefficients and ; that is, the limit differential operator (2.2) is

Note that, for the domains and the values of considered, we have .

Lemma A.7 states that is essentially selfadjoint on and that its unique selfadjoint extension has absolutely continuous spectrum: . The spectral projection can be calculated by a generalized eigenfunction expansion, yielding the Airy kernel (A.4).

Limit Step 4: hard-edge scaling limit

For or , we take a scaling

with appropriately chosen, to explore the vicinity of the “hard edge” ; note that such a scaling yields . We make the assumptions stated in (1.9). By inserting

we read off from (2.4), using (1.6), the limit coefficients and , where  is defined as in (1.11); that is, the limit differential operator (2.2) is given by

If , Lemma A.9 states that the limit is essentially selfadjoint on and that the spectrum of its unique selfadjoint extension is absolutely continuous: . The spectral projection can be calculated by a generalized eigenfunction expansion, yielding the Bessel kernel (A.5).

Remark 2.1.

The theorem also holds in the case if the particular selfadjoint realization  is defined by the boundary condition (1.10), see Remark A.10.

3 Application to classical orthogonal polynomials

In this section we apply Theorem 1.3 to the kernels associated with the classical orthogonal polynomials, that is, the Hermite, Laguerre, and Jacobi polynomials. In random matrix theory, the thus induced determinantal processes are modeled by the spectra of the Gaussian unitary ensemble (GUE), the Wishart or Laguerre unitary ensemble (LUE), and the MANOVA (multivariate analysis of variance) or Jacobi unitary ensemble (JUE).

To prepare the study of the individual cases, we first discuss their common structure. Let  be the sequence of classical orthogonal polynomials belonging to the weight function on the (not necessarily bounded) interval . We normalize such that , where . The functions form a complete orthogonal set in ; conceptual proofs of the completeness can be found, e.g., in Andrews, Askey and Roy [3] (Section 5.7 for the Jacobi polynomials, Section 6.5 for the Hermite and Laguerre polynomials).

By a result of Tricomi [7, Section 10.7], the satisfy the eigenvalue problem

where  is a quadratic polynomial (with the sign chosen such that  for ) and  a linear polynomial such that

In terms of , a brief calculation shows that

Therefore, by the completeness of the , the formally selfadjoint Sturm–Liouville operator has a particular selfadjoint realization on (which we continue to denote by the letter ) with spectrum

and corresponding eigenfunctions . Hence, if the eigenvalues are, eventually, strictly increasing, the projection kernel (1.1) defines an integral operator with such that, eventually,

Note that this relation remains true if we choose to make some parameters of the weight  (and, therefore, of the functions ) depend on . For the scaling limits of , we are now in the realm of Theorem 1.3: given the weight  as the only input, all the other quantities can be obtained by routine calculations.
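As a minimal illustration of these routine calculations, the sketch below recovers the coefficient pair and the eigenvalues from each classical weight. It assumes the textbook form of the Pearson relation, $(pw)' = qw$ with quadratic $p$ (positive on the interior of the interval) and linear $q$, together with the classical eigenvalue formula $\lambda_n = -n\big(q' + \tfrac{1}{2}(n-1)p''\big)$; both are standard identities, stated here as assumptions since the corresponding displays are not reproduced above.

```python
import sympy as sp

x, n, alpha, beta = sp.symbols('x n alpha beta', positive=True)

# classical weights w together with the quadratic coefficient p,
# chosen positive on the interior of the respective interval
cases = {
    'Hermite  (GUE)':         (sp.exp(-x**2),                   sp.Integer(1)),
    'Laguerre (LUE/Wishart)': (x**alpha * sp.exp(-x),           x),
    'Jacobi   (JUE/MANOVA)':  ((1 - x)**alpha * (1 + x)**beta,  1 - x**2),
}

for name, (w, p) in cases.items():
    q = sp.simplify(sp.diff(p * w, x) / w)      # Pearson relation (p w)' = q w
    lam = sp.expand(-n * (sp.diff(q, x) + sp.Rational(1, 2) * (n - 1) * sp.diff(p, x, 2)))
    print(f'{name}:  p = {p},  q = {q},  lambda_n = {lam}')
```

Up to algebraic rearrangement, this reproduces $q = -2x$, $\lambda_n = 2n$ for Hermite, $q = \alpha + 1 - x$, $\lambda_n = n$ for Laguerre, and $q = \beta - \alpha - (\alpha + \beta + 2)x$, $\lambda_n = n(n + \alpha + \beta + 1)$ for Jacobi, matching the classical second-order differential equations of these polynomials.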

Hermite polynomials

The weight is on ; hence

and, therefore,

Theorem 1.3 is applicable and we directly read off the following well-known scaling limits of the GUE (see, e.g., [2, Chapter 3]); a numerical illustration follows the list:

  • bulk scaling limit: if , the transformation

    induces with a strong limit given by the Dyson kernel;

  • limit law: the transformation induces the mean counting probability density with a weak limit given by the Wigner semicircle law

  • soft-edge scaling limit: the transformation

    induces with a strong limit given by the Airy kernel.
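The numerical illustration announced above: a Monte Carlo check of the semicircle limit law in the normalization corresponding to the Hermite weight $e^{-x^2}$, i.e., matrix density proportional to $e^{-\operatorname{tr} H^2}$ and eigenvalue support approximately $[-\sqrt{2N}, \sqrt{2N}]$. This sampling convention is an assumption of the sketch, not a normalization quoted from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500

# GUE sample with matrix density ~ exp(-tr H^2):
# diagonal entries N(0, 1/2), off-diagonal real/imaginary parts N(0, 1/4)
A = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
H = (A + A.conj().T) / 2
eig = np.linalg.eigvalsh(H)

# empirical mean counting density vs. the semicircle sqrt(2N - x^2)/(pi N)
hist, edges = np.histogram(eig, bins=50, density=True)
mid = (edges[:-1] + edges[1:]) / 2
semicircle = np.sqrt(np.maximum(2 * N - mid**2, 0.0)) / (np.pi * N)
print('max deviation of histogram from semicircle:', np.abs(hist - semicircle).max())
```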

Laguerre polynomials

The weight is on ; hence

In random matrix theory, the corresponding determinantal point process is modeled by the spectra of complex Wishart matrices with a dimension parameter ; the Laguerre parameter is then given by . Of particular interest in statistics [12] is the simultaneous limit with

for which we get

Note that

Theorem 1.3 is applicable and we directly read off the following well-known scaling limits of the Wishart ensemble [12]; a numerical illustration follows the list:

  • bulk scaling limit: if ,

    induces with a strong limit given by the Dyson kernel;

  • limit law: the scaling induces the mean counting probability density with a weak limit given by the Marchenko–Pastur law

  • soft-edge scaling limit: with signs chosen consistently as either or ,

    (3.1)

    induces with a strong limit given by the Airy kernel.
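The numerical illustration announced above, in the complex Wishart convention $W = GG^{*}$ with $G$ an $N \times M$ matrix of independent standard complex Gaussian entries, so that the eigenvalue density carries the Laguerre weight with $\alpha = M - N$; this convention, and the resulting Marchenko–Pastur edges $(\sqrt{M} \pm \sqrt{N})^2$, are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 300, 600                                  # Laguerre parameter alpha = M - N

G = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
W = G @ G.conj().T                               # complex Wishart matrix
eig = np.linalg.eigvalsh(W)

# Marchenko-Pastur density (mean counting density, unit total mass) on [x_minus, x_plus]
x_minus, x_plus = (np.sqrt(M) - np.sqrt(N))**2, (np.sqrt(M) + np.sqrt(N))**2
hist, edges = np.histogram(eig, bins=50, density=True)
mid = (edges[:-1] + edges[1:]) / 2
mp = np.sqrt(np.maximum((x_plus - mid) * (mid - x_minus), 0.0)) / (2 * np.pi * N * mid)
print('max deviation of histogram from Marchenko-Pastur:', np.abs(hist - mp).max())
```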

Remark 3.1.

The scaling (3.1) is better known in the asymptotically equivalent form

which is obtained from (3.1) by replacing with , see [12, p. 305].

In the case , which implies , the lower soft-edge scaling (3.1) breaks down and has to be replaced by a scaling at the hard edge:

  • hard-edge scaling limit: if  is a constant (by Remark 2.1, there is no need to restrict ourselves to : since  with  extending smoothly to , we have, for , ; hence, the selfadjoint realization is compatible with the boundary condition (1.10)),  induces  with a strong limit given by the Bessel kernel .

Jacobi polynomials

The weight is on ; hence

and

In random matrix theory, the corresponding determinantal point process is modeled by the spectra of complex MANOVA matrices with dimension parameters ; the Jacobi parameters , are then given by and . Of particular interest in statistics [13] is the simultaneous limit with

for which we get

Note that

Theorem 1.3 is applicable and we directly read off the following (less well-known) scaling limits of the MANOVA ensemble [5, 13]:

  • bulk scaling limit: if ,

    induces with a strong limit given by the Dyson kernel;

  • limit law: (because of there is no transformation here) the mean counting probability density has a weak limit given by the law [23]

  • soft-edge scaling limit: with signs chosen consistently as either or ,

    (3.2)

    induces with a strong limit given by the Airy kernel.

Remark 3.2.

Johnstone [13, p. 2651] gives the soft-edge scaling in terms of a trigonometric parametrization of and . By putting

we immediately get

and (3.2) becomes

In the case , which is equivalent to , we have and . Hence, the lower and the upper soft-edge scaling (3.2) break down and have to be replaced by a scaling at the hard edges:

  • hard-edge scaling limit: if ,  are constants (for the cases  and , see the justification given in the parenthetical remark accompanying the Laguerre hard-edge scaling above),  induces  with a strong limit given by the Bessel kernel ; by symmetry, the Bessel kernel  is obtained for .

A Appendices

A.1 Generalized strong convergence

The notion of strong resolvent convergence [24, Section 9.3] links the convergence of differential operators, tested for an appropriate class of smooth functions, to the strong convergence of their spectral projections. We recall a slight generalization of that concept, which allows the underlying Hilbert space to vary.

Specifically we consider, on an interval (not necessarily bounded) and on a sequence of subintervals with and , selfadjoint operators

By means of the natural embedding (that is, extension by zero) we take ; the multiplication operator induced by the characteristic function , which we will denote by the same symbol, constitutes the orthogonal projection of  onto . Following Stolz and Weidmann [19, Section 2], we say that  converges to  in the sense of generalized strong convergence (gsc), if for some , and hence, a fortiori, for all such ,

in the strong operator topology of