On the Scaling Limits of Determinantal Point Processes with Kernels Induced by Sturm–Liouville Operators
This paper is a contribution to the Special Issue on Asymptotics and Universality in Random Matrices, Random Growth Processes, Integrable Systems and Statistical Physics in honor of Percy Deift and Craig Tracy. The full collection is available at http://www.emis.de/journals/SIGMA/DeiftTracy.html
Folkmar Bornemann
Zentrum Mathematik – M3, Technische Universität München, 80290 München, Germany
E-mail: bornemann@tum.de
URL: http://wwwm3.ma.tum.de/bornemann
Received April 15, 2016, in final form August 16, 2016; Published online August 19, 2016
By applying an idea of Borodin and Olshanski [J. Algebra 313 (2007), 40–60], we study various scaling limits of determinantal point processes with trace class projection kernels given by spectral projections of selfadjoint Sturm–Liouville operators. Instead of studying the convergence of the kernels as functions, the method directly addresses the strong convergence of the induced integral operators. We show that, for this notion of convergence, the Dyson, Airy, and Bessel kernels are universal in the bulk, soft-edge, and hard-edge scaling limits. This result allows us to give a short and unified derivation of the known formulae for the scaling limits of the classical random matrix ensembles with unitary invariance, that is, the Gaussian unitary ensemble (GUE), the Wishart or Laguerre unitary ensemble (LUE), and the MANOVA (multivariate analysis of variance) or Jacobi unitary ensemble (JUE).
Key words: determinantal point processes; Sturm–Liouville operators; scaling limits; strong operator convergence; classical random matrix ensembles; GUE; LUE; JUE; MANOVA
Mathematics Subject Classification: 15B52; 34B24; 33C45
Dedicated to Percy Deift on the occasion of his 70th birthday.
1 Introduction
We consider determinantal point processes on a (not necessarily bounded) interval with a correlation kernel given by a trace class projection kernel,
(1.1) 
where are orthonormal in ; each may have some dependence on that we suppress from the notation. We recall (see, e.g., [2, Section 4.2]) that for such processes the joint probability density of the points is given by
the mean counting probability is given by the density (note that )
and the gap probabilities are given, by the inclusion–exclusion principle, in terms of a Fredholm determinant, namely
The various scaling limits are usually derived from an appropriate convergence of the kernel by considering the large asymptotic of the eigenfunctions , which can be technically quite involved. [Footnote 1: Based on the two-scale Plancherel–Rotach asymptotics of classical orthogonal polynomials or, methodologically more general, on the asymptotics of Riemann–Hilbert problems; see, e.g., Tracy and Widom [21, 22], Deift [6], Lubinsky [16], Johnstone [12, 13], Collins [5], Forrester [8], Anderson et al. [2], and Kuijlaars [14].]
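The defining properties of a projection kernel of the form (1.1) can be checked numerically in a concrete case. The following Python sketch is an illustration only. It assumes that (1.1) is the usual finite sum of products of orthonormal functions, written here as K_n(x, y) = sum_{j<n} phi_j(x) phi_j(y), and it takes for the phi_j the Hermite functions belonging to the weight exp(-x^2). The function names, the trapezoidal rule, and the truncation of the real line to [-12, 12] are choices made for this example and do not come from the text. The sketch checks that the trace of the kernel equals n and that the kernel reproduces itself under composition, as a projection kernel should.

```python
import math

def hermite_functions(n, x):
    """Return [phi_0(x), ..., phi_{n-1}(x)], orthonormal in L^2(R).

    Three-term recurrence for the Hermite functions of the weight exp(-x^2)
    (an assumed concrete choice of the orthonormal system).
    """
    phis = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n > 1:
        phis.append(math.sqrt(2.0) * x * phis[0])
    for k in range(1, n - 1):
        phis.append(math.sqrt(2.0 / (k + 1)) * x * phis[k]
                    - math.sqrt(k / (k + 1)) * phis[k - 1])
    return phis[:n]

def kernel(n, x, y):
    """Projection kernel K_n(x, y) = sum_{j<n} phi_j(x) phi_j(y)."""
    px, py = hermite_functions(n, x), hermite_functions(n, y)
    return sum(a * b for a, b in zip(px, py))

def trapezoid(f, a, b, m):
    """Composite trapezoidal rule with m subintervals on [a, b]."""
    h = (b - a) / m
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, m)) + 0.5 * f(b))

n = 5
# Trace of the induced integral operator: integral of K_n(x, x) over the line.
trace = trapezoid(lambda x: kernel(n, x, x), -12.0, 12.0, 2400)

# Projection property: integral of K_n(x0, z) K_n(z, y0) dz equals K_n(x0, y0).
x0, y0 = 0.3, -0.7
composed = trapezoid(lambda z: kernel(n, x0, z) * kernel(n, z, y0),
                     -12.0, 12.0, 2400)
direct = kernel(n, x0, y0)

print("trace    :", trace)
print("composed :", composed)
print("direct   :", direct)
```

The trapezoidal rule is used because it converges very quickly for smooth, rapidly decaying integrands; any other quadrature rule would serve the same purpose.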
Borodin and Olshanski [4] suggested, for discrete point processes, a different, conceptually and technically much simpler approach based on selfadjoint difference operators. We will show that their method, generalized to selfadjoint Sturm–Liouville operators, allows us to give a short and unified derivation of the various scaling limits for the random matrix ensembles with unitary invariance that are based on the classical orthogonal polynomials (Hermite, Laguerre, Jacobi).
The Borodin–Olshanski method
The method proceeds in three steps: First, we identify the induced integral operator as the spectral projection (where we denote by the characteristic function of a Borel subset and by the application of that function to the selfadjoint operator in the sense of measurable functional calculus [17, Theorem VIII.6])
of some selfadjoint ordinary differential operator on . Any scaling of the point process by () yields, in turn, the induced rescaled operator
where is a selfadjoint differential operator on , .
Second, if with , , we aim for a selfadjoint operator on with a core such that eventually and
(1.2) 
The point is that, if the test functions from are particularly nice, such a convergence is just a simple consequence of the locally uniform convergence of the coefficients of the differential operators – a convergence that is, typically, an easy calculus exercise. Now, given (1.2), the concept of strong resolvent convergence (see Theorem A.1) immediately yields [Footnote 2: By “” we denote the strong convergence of operators acting on .], if ,
Third, we take an interval , eventually satisfying , such that the operator is trace class with kernel (which can be obtained from the generalized eigenfunction expansion of , see Section A.2). Then, we immediately get the strong convergence
Remark 1.1.
Tao [20, Section 3.3] sketches the Borodin–Olshanski method, applied to the bulk and edge scaling of GUE, as a heuristic device. Because of the microlocal methods that he uses to calculate the projection , he puts his sketch under the headline “The Dyson and Airy kernels of GUE via semiclassical analysis”.
Scaling limits and other modes of convergence
Given that one just has to establish the convergence of the coefficients of a differential operator (instead of an asymptotic of its eigenfunctions), the Borodin–Olshanski method is an extremely simple device to determine all the scalings that would yield some meaningful limit , namely in the strong operator topology. Other modes of convergence have been studied in the literature, ranging from some weak convergence of point correlation functions over convergence of the kernel functions to the convergence of gap probabilities, that is,
From a probabilistic point of view, the latter convergence is of particular interest and has been shown in at least three ways:

By Hadamard’s inequality, convergence of the determinants follows directly from the locally uniform convergence of the kernels [2, Lemma 3.4.5] and, for unbounded , from additional large deviation estimates [2, Lemma 3.3.2]. This way, the limit gap probabilities in the bulk and soft edge scaling limit of GUE can rigorously be established (see, e.g., Anderson et al. [2, Sections 3.5 and 3.7]). Johansson [11, Lemma 3.1] gives some general conditions on a scaling of the such that the determinant converges to the soft edge of GUE.

Since is continuous with respect to the trace class norm [18, Theorem 3.4], in trace class norm would generally suffice. Such a convergence can be proved by factorizing the trace class operators into Hilbert–Schmidt operators and obtaining the convergence of the factorized kernels once more from locally uniform convergence, see the work of Johnstone [12, 13] on the scaling limits of the LUE/Wishart ensembles and on the limits of the JUE/MANOVA ensembles.

Since and are selfadjoint and positive semidefinite, yet another way is by observing that the convergence in trace class norm is, for continuous kernels, equivalent [18, Theorem 2.20] to the combination of both, the convergence in the weak operator topology and the convergence of the traces
(1.3)
Once again, these convergences follow from locally uniform convergence of the kernels; see Deift [6, Section 8.1] for an application of this method to the bulk scaling limit of GUE.
Since convergence in the strong operator topology implies convergence in the weak one, the Borodin–Olshanski method would thus establish the convergence of gap probabilities if we were only able to show condition (1.3) by some additional, similarly short and simple argument. Note that, by the ideal property of the trace class, condition (1.3) implies the same condition for all . We fall short, however, of conceiving a proof strategy for condition (1.3) that would be independent of all the laborious proofs of locally uniform convergence of the kernels.
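The gap probabilities discussed above are Fredholm determinants, and for a smooth kernel on a bounded interval they are easy to approximate numerically. The following Python sketch is an illustration under stated assumptions. It takes the Dyson kernel in the common normalization sin(pi (x - y)) / (pi (x - y)), which is assumed here because the definition (A.3) is not reproduced in this section, and it replaces the integral operator by a finite matrix built from an m-point Gauss–Legendre rule (a Nyström-type discretization). All function names and the choice m = 20 are hypothetical. The result for the interval (0, s) is compared with the first terms 1 - s + pi^2 s^4 / 36 of the small-s expansion of the determinant.

```python
import math

def gauss_legendre(m):
    """Nodes and weights of the m-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, m + 1):
        x = math.cos(math.pi * (i - 0.25) / (m + 0.5))   # initial guess
        for _ in range(100):                             # Newton iteration
            p0, p1 = 1.0, x
            for k in range(2, m + 1):                    # Legendre recurrence
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = m * (x * p1 - p0) / (x * x - 1.0)       # derivative of P_m
            dx = p1 / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, det = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c + 1, n):
                a[r][k] -= f * a[c][k]
    return det

def sine_kernel(x, y):
    """Assumed normalization of the Dyson kernel."""
    u = math.pi * (x - y)
    return 1.0 if u == 0.0 else math.sin(u) / u

def gap_probability(kernel, a, b, m=20):
    """Approximate det(I - K) on L^2(a, b) by an m x m quadrature matrix."""
    t, w = gauss_legendre(m)
    x = [0.5 * (b - a) * ti + 0.5 * (a + b) for ti in t]
    w = [0.5 * (b - a) * wi for wi in w]
    mat = [[(1.0 if i == j else 0.0)
            - math.sqrt(w[i]) * kernel(x[i], x[j]) * math.sqrt(w[j])
            for j in range(m)] for i in range(m)]
    return determinant(mat)

s = 0.1
gap = gap_probability(sine_kernel, 0.0, s)
expansion = 1.0 - s + math.pi ** 2 * s ** 4 / 36.0
print("quadrature approximation:", gap)
print("two-term expansion      :", expansion)
```

The symmetric scaling by the square roots of the weights keeps the matrix symmetric; it does not change the determinant.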
Remark 1.2.
Contrary to the discrete case considered by Borodin and Olshanski, it is also not immediate to infer from the strong convergence of the induced integral operators the pointwise convergence of the kernels. In Section 2 we will need only a single such instance, namely
(1.4) 
to prove a limit law for the mean counting probability. Using mollified Dirac deltas, pointwise convergence would generally follow, for continuously differentiable , if we were able to bound, locally uniformly, the gradient of . Then, by dominated convergence, criterion (1.3) would already be satisfied if we established an integrable bound of on . Since the scaling laws are, however, maneuvering just at the edge between trivial cases (i.e., zero limits) and divergent cases, it is conceivable that a proof of such bounds might not be significantly simpler than a proof of convergence of the gap probabilities themselves.
The main result
To prepare, we recall how an integral kernel is transformed covariantly under an affine coordinate change , : by invariance of the form
the transformed kernel is given by
(1.5) 
Using the Borodin–Olshanski method, we will prove the following general result for selfadjoint Sturm–Liouville operators; a result that adds a further class of problems to the universality [14] of the Dyson, Airy, and Bessel kernels in the bulk, soft-edge, and hard-edge scaling limits. [Footnote 3: For the definitions of the kernels , , see (A.3), (A.4) and (A.5).]
Theorem 1.3.
Let be one of the three domains , , or , and let be a selfadjoint realization on of the formally selfadjoint Sturm–Liouville operator [Footnote 4: Since, in this paper, we always consider a particular selfadjoint realization of a formal differential operator, we will use the same letter to denote both.]
with coefficients such that for all . Assume that, for and , there are asymptotic expansions
(1.6) 
with a remainder that is of order locally uniformly in , and exponents normalized by
(1.7) 
where if . Further assume that these expansions can be differentiated [Footnote 5: We say that an expansion can be differentiated if .], that the roots of are simple, and that the spectral projection is normalized by
Let a scaling by induce the transformed projection kernel according to (1.5).
Then, depending on particular choices of and , the following three scaling limits hold.

Soft-edge scaling limit: given with , the scaling parameters
yield, for and a not necessarily bounded interval , the strong limit

Hard-edge scaling limit: given that or with
(1.9)
the scaling parameters
yield, for a bounded interval , the strong limit [Footnote 6: Here, if , the selfadjoint realization is defined by means of the boundary condition (1.10).]
(1.11)
Remark 1.4.
Whether the interval in the strong operator limit can be chosen unbounded or not depends on whether the limit operator is trace class or not (see the explicit formulae of the traces given in the appendix for each of the three limits): only in the former case do we get a representation of the scaling limit in terms of a particular integral kernel, cf. Theorem A.3. Note that it is impossible to use since .
Outline of the paper
The proof of Theorem 1.3 is the subject of Section 2. In Section 3 we apply it to the classical orthogonal polynomials, which yields a short and unified derivation of the known formulae for the scaling limits for the classical random matrix ensembles with unitary invariance (GUE, LUE/Wishart, JUE/MANOVA). In fact, by a result of Tricomi, the only input needed is the weight function of the orthogonal polynomials; from there one gets in a purely formula-based fashion (by simple manipulations which can easily be coded in any computer algebra system), first, to the coefficients and as well as to the eigenvalues of the Sturm–Liouville operator and next, by applying Theorem 1.3, to the particular scaling limits.
To emphasize that our main result and its application are largely independent of concretely identifying the limit projection kernel , we postpone this identification to Lemmas A.5, A.7 and A.9: there, using generalized eigenfunction expansions, we calculate the Dyson, Airy, and Bessel kernels directly from the limit differential operator .
2 Proof of the main result for Sturm–Liouville operators
We start the proof of Theorem 1.3 with some preparatory steps before we deal with the particular scaling limits. Since is a selfadjoint realization on of the Sturm–Liouville operator
with and for , we have .
Preparatory Step 1: transformation
The scaling
maps bijectively to . Since such an affine coordinate transform just induces a unitary equivalence of integral and differential operators, the spectral projection relation
is left invariant if the kernel is transformed according to (1.5) and the differential operator is transformed using as
Since the spectral projection to the negative part of the spectrum of a differential operator is left invariant if we multiply that operator by some positive constant , , we see that
where the transformed differential operator is given finally by
with coefficients
(2.1) 
Preparatory Step 2: strong operator limit
Suppose the transformed domain satisfies , . Then, with we have that, eventually, . Further, suppose that the coefficients of converge locally uniformly in as (where the limit of can be differentiated)
such that the limit coefficients and are smooth functions and
(2.2) 
defines a Sturm–Liouville operator that is essentially selfadjoint on . Then, by dominated convergence, we get the convergence in for each test function in the core . Hence, by Theorem A.1 we have the strong operator convergence
if and, eventually, . In the particular cases considered in the following limit steps of the proof, the spectrum of is always absolutely continuous, that is, . Finally, by Theorem A.3, under the finite trace condition mentioned already in Remark 1.4, there is an integral kernel such that
which finishes the proof of a strong operator convergence in general.
Preparatory Step 3: Taylor expansions of the coefficients
The case
The case
Limit Step 1: bulk scaling limit
If , by inserting
we read off from (2.3) the limit coefficients and , where ; that is, the limit differential operator (2.2) is given by
Note that, for the domains and the values of considered, we have .
Lemma A.5 states that is essentially selfadjoint on and that its unique selfadjoint extension has absolutely continuous spectrum: . Thus, for , the spectral projection is zero. For , the spectral projection can be calculated by a generalized eigenfunction expansion, yielding the Dyson kernel (A.3).
We will see in the next step that the dichotomy between is also reflected in the structure of the support of the limit law .
Limit Step 2: limit law
The result for the bulk scaling limit allows us, in passing, to calculate a limit law of the mean counting probability density : we observe that transforms the density into
Thus, to get to a limit, we have to assume condition (1.4), so that a pointwise rendering of the bulk scaling limit just considered yields [Footnote 7: The Iverson bracket stands for 1 if the statement is true, and 0 otherwise.]
This way we get
Hence, by Helly’s selection theorem, the probability measure converges vaguely to , which is, in general, just a subprobability measure. If, however, it is checked that has unit mass, the convergence is weak.
Limit Step 3: soft-edge scaling limit
Limit Step 4: hard-edge scaling limit
For or , we take a scaling
with appropriately chosen, to explore the vicinity of the “hard edge” ; note that such a scaling yields . We make the assumptions stated in (1.9). By inserting
we read off from (2.4), using (1.6), the limit coefficients and , where is defined as in (1.11); that is, the limit differential operator (2.2) is given by
If , Lemma A.9 states that the limit is essentially selfadjoint on and that the spectrum of its unique selfadjoint extension is absolutely continuous: . The spectral projection can be calculated by a generalized eigenfunction expansion, yielding the Bessel kernel (A.5).
3 Application to classical orthogonal polynomials
In this section we apply Theorem 1.3 to the kernels associated with the classical orthogonal polynomials, that is, the Hermite, Laguerre, and Jacobi polynomials. In random matrix theory, the thus induced determinantal processes are modeled by the spectra of the Gaussian unitary ensemble (GUE), the Wishart or Laguerre unitary ensemble (LUE), and the MANOVA (multivariate analysis of variance) or Jacobi unitary ensemble (JUE).
To prepare the study of the individual cases, we first discuss their common structure. Let be the sequence of classical orthogonal polynomials belonging to the weight function on the (not necessarily bounded) interval . We normalize such that , where . The functions form a complete orthogonal set in ; conceptual proofs of the completeness can be found, e.g., in Andrews, Askey and Roy [3] (Section 5.7 for the Jacobi polynomials, Section 6.5 for the Hermite and Laguerre polynomials).
By a result of Tricomi [7, Section 10.7], the satisfy the eigenvalue problem
where is a quadratic polynomial [Footnote 9: With the sign chosen such that for .] and a linear polynomial such that
In terms of , a brief calculation shows that
Therefore, by the completeness of the , the formally selfadjoint Sturm–Liouville operator has a particular selfadjoint realization on (which we continue to denote by the letter ) with spectrum
and corresponding eigenfunctions . Hence, if the eigenvalues are, eventually, strictly increasing, the projection kernel (1.1) defines an integral operator with such that, eventually,
Note that this relation remains true if we let some parameters of the weight (and, therefore, of the functions ) depend on . For the scaling limits of , we are now in the realm of Theorem 1.3: given the weight as the only input, all the other quantities can now be obtained by routine calculations.
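The routine calculations mentioned above can be mechanized with exact rational arithmetic. The following Python sketch is a hedged illustration, since the displayed formulas are not reproduced in this section: it assumes that the weight w determines a polynomial p of degree at most two and a linear polynomial r through the relation w'/w = r/p, that the operator acts on polynomials as u -> -(1/w)(p w u')' = -(p u'' + (p' + r) u'), and that the eigenvalue of the degree-n polynomial is -n (r' + (n + 1) p''/2). The three weights exp(-x^2), x^alpha exp(-x) and x^alpha (1 - x)^beta, the parameter values, and all function names are assumptions made for this example. The code constructs the monic polynomial eigenfunction of degree n by back substitution and checks the eigenvalue relation exactly.

```python
from fractions import Fraction as F

def deriv(c):
    """Derivative of a polynomial given by ascending coefficients."""
    return [k * c[k] for k in range(1, len(c))] or [F(0)]

def add(a, b):
    n = max(len(a), len(b))
    a = a + [F(0)] * (n - len(a))
    b = b + [F(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def mul(a, b):
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def apply_operator(p, r, u):
    """Return -(p u'' + (p' + r) u'), i.e. -(1/w)(p w u')' if w'/w = r/p."""
    d1 = deriv(u)
    d2 = deriv(d1)
    return [-c for c in add(mul(p, d2), mul(add(deriv(p), r), d1))]

def coeff(c, k):
    return c[k] if k < len(c) else F(0)

def eigenpair(p, r, n):
    """Eigenvalue and monic polynomial eigenfunction of degree n."""
    # Images of the monomials 1, x, ..., x^n; the operator is triangular.
    cols = [apply_operator(p, r, [F(0)] * k + [F(1)]) for k in range(n + 1)]
    lam = [coeff(cols[k], k) for k in range(n + 1)]
    c = [F(0)] * n + [F(1)]
    for j in range(n - 1, -1, -1):      # back substitution, highest degree first
        s = sum(coeff(cols[k], j) * c[k] for k in range(j + 1, n + 1))
        c[j] = s / (lam[n] - lam[j])
    return lam[n], c

def residual_is_zero(p, r, lam, c):
    lhs = apply_operator(p, r, c)
    size = max(len(lhs), len(c))
    return all(coeff(lhs, k) == lam * coeff(c, k) for k in range(size))

alpha, beta, n = F(1, 2), F(3, 2), 4    # hypothetical parameter values
families = {
    # name: (p, r) with w'/w = r/p, coefficients in ascending order
    "Hermite":  ([F(1)], [F(0), F(-2)]),                        # w = exp(-x^2)
    "Laguerre": ([F(0), F(1)], [alpha, F(-1)]),                 # w = x^a exp(-x)
    "Jacobi":   ([F(0), F(1), F(-1)], [alpha, -(alpha + beta)]),  # w = x^a (1-x)^b
}

results = {}
for name, (p, r) in families.items():
    lam, c = eigenpair(p, r, n)
    predicted = -n * (coeff(r, 1) + (n + 1) * coeff(p, 2))
    results[name] = (lam, predicted, residual_is_zero(p, r, lam, c))
    print(name, lam, predicted, results[name][2])
```

Exact fractions are used instead of floating point so that the eigenvalue relation can be tested for equality rather than up to rounding error.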
Hermite polynomials
The weight is on ; hence
and, therefore,
Theorem 1.3 is applicable and we directly read off the following well-known scaling limits of the GUE (see, e.g., [2, Chapter 3]):

bulk scaling limit: if , the transformation
induces with a strong limit given by the Dyson kernel;

limit law: the transformation induces the mean counting probability density with a weak limit given by the Wigner semicircle law

soft-edge scaling limit: the transformation
induces with a strong limit given by the Airy kernel.
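The bulk scaling limit and the limit law listed above can be illustrated numerically, with two caveats. First, the check below compares pointwise values of kernels, whereas the result of this paper concerns strong convergence of the induced operators; the pointwise statement is the classical one from the literature. Second, the concrete normalizations are assumptions made for this sketch, because the displayed formulas are not reproduced in this section: the weight is taken as exp(-x^2), the bulk scaling is taken at the center of the spectrum with the factor sqrt(2n)/pi, the Dyson kernel is taken as sin(pi u)/(pi u), and the semicircle law is taken as (2/pi) sqrt(1 - t^2) on [-1, 1]. All function names are hypothetical.

```python
import math

def hermite_functions(n, x):
    """Orthonormal Hermite functions phi_0, ..., phi_{n-1} at x (weight exp(-x^2))."""
    phis = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n > 1:
        phis.append(math.sqrt(2.0) * x * phis[0])
    for k in range(1, n - 1):
        phis.append(math.sqrt(2.0 / (k + 1)) * x * phis[k]
                    - math.sqrt(k / (k + 1)) * phis[k - 1])
    return phis[:n]

def kernel(n, x, y):
    """Projection kernel K_n(x, y) = sum_{j<n} phi_j(x) phi_j(y)."""
    px, py = hermite_functions(n, x), hermite_functions(n, y)
    return sum(a * b for a, b in zip(px, py))

def dyson_kernel(u):
    """Assumed normalization sin(pi u) / (pi u) of the Dyson kernel."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def scaled_kernel(n, xi, eta):
    """Kernel after the affine change x = xi / rho at the center of the spectrum."""
    rho = math.sqrt(2.0 * n) / math.pi   # assumed local density factor at 0
    return kernel(n, xi / rho, eta / rho) / rho

def scaled_density(n, t):
    """Mean counting density after the rescaling x = sqrt(2 n) t."""
    s = math.sqrt(2.0 * n)
    return s * kernel(n, s * t, s * t) / n

def semicircle(t):
    """Assumed normalization of the Wigner semicircle law on [-1, 1]."""
    return 2.0 / math.pi * math.sqrt(max(0.0, 1.0 - t * t))

n = 200
xi, eta = 0.3, -0.4
bulk_error = abs(scaled_kernel(n, xi, eta) - dyson_kernel(xi - eta))
law_error = abs(scaled_density(n, 0.5) - semicircle(0.5))
print("bulk scaling, |scaled kernel - Dyson kernel|:", bulk_error)
print("limit law,    |scaled density - semicircle| :", law_error)
```

Both differences are small for n = 200; increasing n in the sketch shows them shrinking further, at the price of longer recurrences.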
Laguerre polynomials
The weight is on ; hence
In random matrix theory, the corresponding determinantal point process is modeled by the spectra of complex Wishart matrices with a dimension parameter ; the Laguerre parameter is then given by . Of particular interest in statistics [12] is the simultaneous limit with
for which we get
Note that
Theorem 1.3 is applicable and we directly read off the following well-known scaling limits of the Wishart ensemble [12]:

bulk scaling limit: if ,
induces with a strong limit given by the Dyson kernel;

limit law: the scaling induces the mean counting probability density with a weak limit given by the Marchenko–Pastur law

soft-edge scaling limit: with signs chosen consistently as either or ,
(3.1)
induces with a strong limit given by the Airy kernel.
Remark 3.1.
In the case , which implies , the lower soft-edge scaling (3.1) breaks down and has to be replaced by a scaling at the hard edge:

hard-edge scaling limit: if is a constant, induces with a strong limit given by the Bessel kernel . [Footnote 10: By Remark 2.1, there is no need to restrict ourselves to : since with extending smoothly to , we have, for , Hence, the selfadjoint realization is compatible with the boundary condition (1.10).]
Jacobi polynomials
The weight is on ; hence
and
In random matrix theory, the corresponding determinantal point process is modeled by the spectra of complex MANOVA matrices with dimension parameters ; the Jacobi parameters , are then given by and . Of particular interest in statistics [13] is the simultaneous limit with
for which we get
Note that
Theorem 1.3 is applicable and we directly read off the following (less well-known) scaling limits of the MANOVA ensemble [5, 13]:

bulk scaling limit: if ,
induces with a strong limit given by the Dyson kernel;

limit law: (because of there is no transformation here) the mean counting probability density has a weak limit given by the law [23]

soft-edge scaling limit: with signs chosen consistently as either or ,
(3.2)
induces with a strong limit given by the Airy kernel.
Remark 3.2.
In the case , which is equivalent to , we have and . Hence, the lower and the upper soft-edge scaling (3.2) break down and have to be replaced by a scaling at the hard edges:

hard-edge scaling limit: if , are constants, induces with a strong limit given by the Bessel kernel ; by symmetry, the Bessel kernel is obtained for . [Footnote 11: For the cases and , see the justification of the limit given in footnote 10.]
A Appendices
A.1 Generalized strong convergence
The notion of strong resolvent convergence [24, Section 9.3] links the convergence of differential operators, tested for an appropriate class of smooth functions, to the strong convergence of their spectral projections. We recall a slight generalization of that concept, which allows the underlying Hilbert space to vary.
Specifically we consider, on an interval (not necessarily bounded) and on a sequence of subintervals with and , selfadjoint operators
By means of the natural embedding (that is, extension by zero) we take ; the multiplication operator induced by the characteristic function , which we will denote by the same symbol, constitutes the orthogonal projection of onto . Following Stolz and Weidmann [19, Section 2], we say that converges to in the sense of generalized strong convergence (gsc), if for some , and hence, a fortiori, for all such ,
in the strong operator topology of