Products of independent elliptic random matrices
For fixed $m > 1$, we study the product of $m$ independent $N \times N$ elliptic random matrices as $N$ tends to infinity. Our main result shows that the empirical spectral distribution of the product converges, with probability $1$, to the $m$-th power of the circular law, regardless of the joint distribution of the mirror entries in each matrix. This leads to a new kind of universality phenomenon: the limit law for the product of independent random matrices is independent of the limit laws for the individual matrices themselves.
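We begin by recalling that the eigenvalues of an $N \times N$ matrix $M$ are the roots in $\mathbb{C}$ of the characteristic polynomial $\det(M - zI)$, where $I$ is the identity matrix. We let $\lambda_1(M), \ldots, \lambda_N(M)$ denote the eigenvalues of $M$. In this case, the empirical spectral measure $\mu_M$ is given by
$$\mu_M := \frac{1}{N} \sum_{i=1}^{N} \delta_{\lambda_i(M)}.$$
The corresponding empirical spectral distribution (ESD) is given by
$$F^{M}(x, y) := \frac{1}{N} \# \left\{ 1 \leq i \leq N : \operatorname{Re}(\lambda_i(M)) \leq x, \ \operatorname{Im}(\lambda_i(M)) \leq y \right\}.$$
Here $\# E$ denotes the cardinality of the set $E$.
If the matrix $M$ is Hermitian, then the eigenvalues $\lambda_1(M), \ldots, \lambda_N(M)$ are real. In this case the ESD is given by
$$F^{M}(x) := \frac{1}{N} \# \left\{ 1 \leq i \leq N : \lambda_i(M) \leq x \right\}.$$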
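The two definitions of the ESD can be made concrete with a short numerical sketch. The helper name `esd` and its calling convention are assumptions made for illustration only; the code simply counts eigenvalues, using NumPy's `eigvals` for the general case and `eigvalsh` for the Hermitian case.

```python
import numpy as np

def esd(M, x, y=None):
    """Empirical spectral distribution of a square matrix M (illustrative helper).

    With y=None the matrix is treated as Hermitian and the one-dimensional
    ESD (fraction of eigenvalues <= x) is returned. Otherwise the fraction of
    eigenvalues with real part <= x and imaginary part <= y is returned.
    """
    M = np.asarray(M)
    if y is None:
        eig = np.linalg.eigvalsh(M)  # real eigenvalues of a Hermitian matrix
        return np.count_nonzero(eig <= x) / eig.size
    eig = np.linalg.eigvals(M)
    return np.count_nonzero((eig.real <= x) & (eig.imag <= y)) / eig.size

# Toy check on a matrix whose eigenvalues are known: 1, i, -1, -i.
D = np.diag([1, 1j, -1, -1j])
two_dim_value = esd(D, 0.5, 0.5)  # only -1 and -i satisfy both inequalities

# Hermitian toy check: eigenvalues 1, 2, 3, 4.
one_dim_value = esd(np.diag([1.0, 2.0, 3.0, 4.0]), 2.5)
```

In both toy cases exactly half of the eigenvalues are counted.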
One of the simplest random matrix ensembles is the class of random matrices with independent and identically distributed (iid) entries.
Definition 1.1 (iid random matrix).
Let be a complex random variable. We say is an iid random matrix with atom variable if the entries of are iid copies of .
When is a standard complex Gaussian random variable, can be viewed as a random matrix drawn from the probability distribution
on the set of complex matrices. Here denotes the Lebesgue measure on the real entries
of . The measure is known as the complex Ginibre ensemble. The real Ginibre ensemble is defined analogously. Following Ginibre , one may compute the joint density of the eigenvalues of a random matrix drawn from the complex Ginibre ensemble. Indeed, has density
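Mehta [35, 36] used the joint density function (1.1) to compute the limiting spectral measure of the complex Ginibre ensemble. In particular, he showed that if $X_N$ is drawn from the complex Ginibre ensemble, then the ESD of $\frac{1}{\sqrt{N}} X_N$ converges to the circular law $F_{\mathrm{circ}}$ as $N \to \infty$, where
$$F_{\mathrm{circ}}(x, y) := \mu_{\mathrm{circ}} \left( \left\{ z \in \mathbb{C} : \operatorname{Re}(z) \leq x, \ \operatorname{Im}(z) \leq y \right\} \right)$$
and $\mu_{\mathrm{circ}}$ is the uniform probability measure on the unit disk in the complex plane. Edelman verified the same limiting distribution for the real Ginibre ensemble.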
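As a rough numerical illustration of the circular law for the complex Ginibre ensemble, the following sketch samples one matrix, rescales it by $1/\sqrt{N}$, and compares the fraction of eigenvalues in a centered disk of radius $r$ with the mass $r^2$ that the uniform measure on the unit disk assigns to it. The size `N = 400`, the seed, and the helper `fraction_within` are arbitrary choices made here, not values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 400  # illustrative matrix size

# Standard complex Gaussian entries: mean zero and E|g|^2 = 1.
G = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
eig = np.linalg.eigvals(G / np.sqrt(N))

def fraction_within(radius):
    """Fraction of the sampled eigenvalues with modulus at most `radius`."""
    return np.mean(np.abs(eig) <= radius)

# Under the circular law a centered disk of radius r <= 1 carries mass r^2.
for r in (0.5, 0.8, 1.0):
    print(r, fraction_within(r), r ** 2)
```

A single sample of moderate size only approximates the limit, so the printed fractions should be close to, but not exactly equal to, $r^2$.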
For the general (non-Gaussian) case, there is no formula for the joint distribution of the eigenvalues and the problem appears much more difficult. The universality phenomenon in random matrix theory asserts that the spectral behavior of an iid random matrix does not depend on the distribution of the atom variable $\xi$ in the limit $N \to \infty$. In other words, one expects that the circular law describes the limiting ESD of a large class of random matrices (not just Gaussian matrices).
An important result was obtained by Girko [21, 22] who related the empirical spectral measure of a non-Hermitian matrix to that of a family of Hermitian matrices. Using this Hermitization technique, Bai [8, 9] gave the first rigorous proof of the circular law for general (non-Gaussian) distributions. He proved the result under a number of moment and smoothness assumptions on the atom variable , and a series of recent improvements were obtained by Götze and Tikhomirov , Pan and Zhou  and Tao and Vu [46, 48]. In particular, Tao and Vu [47, 48] established the law with the minimum assumption that has finite variance.
Theorem 1.2 (Tao-Vu, ).
Let $\xi$ be a complex random variable with mean zero and unit variance. For each $N \geq 1$, let $X_N$ be an $N \times N$ iid random matrix with atom variable $\xi$. Then the ESD of $\frac{1}{\sqrt{N}} X_N$ converges almost surely to the circular law as $N \to \infty$.
More recently, Götze and Tikhomirov considered the ESD of the product of $m$ independent iid random matrices. They showed that, as the sizes of the matrices tend to infinity, the limiting distribution is given by $F_m$, where $F_m$ is supported on the unit disk in the complex plane and has density $f_m$ given by
$$f_m(z) := \frac{1}{m \pi} |z|^{\frac{2}{m} - 2} \mathbf{1}_{\{ |z| \leq 1 \}} \tag{1.2}$$
in the complex plane. It can be verified directly that if $\psi$ is a random variable distributed uniformly on the unit disk in the complex plane, then $\psi^m$ has distribution $F_m$.
Theorem 1.3 (Götze-Tikhomirov, ).
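Let $m \geq 1$ be an integer, and assume $\xi_1, \ldots, \xi_m$ are complex random variables with mean zero and unit variance. For each $N \geq 1$ and $1 \leq k \leq m$, let $X_{N,k}$ be an $N \times N$ iid random matrix with atom variable $\xi_k$, and assume $X_{N,1}, \ldots, X_{N,m}$ are independent. Define the product
$$P_N := N^{-m/2} X_{N,1} X_{N,2} \cdots X_{N,m}.$$
Then the expected ESD of $P_N$ converges to the distribution with density (1.2) as $N \to \infty$.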
The convergence of to in Theorem 1.3 was strengthened to almost sure convergence in [10, 44]. The Gaussian case was originally considered by Burda, Janik, and Waclaw ; see also . We refer the reader to [1, 2, 3, 4, 5, 6, 14, 18, 19] and references therein for many other interesting results concerning products of Gaussian random matrices.
2. New results
Definition 2.1 (Real elliptic random matrix).
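Let $(\xi_1, \xi_2)$ be a random vector in $\mathbb{R}^2$, and let $\zeta$ be a real random variable. We say $Y_N = (y_{ij})_{i,j=1}^{N}$ is a real elliptic random matrix with atom variables $(\xi_1, \xi_2), \zeta$ if the following conditions hold.
(independence) $\{ y_{ii} : 1 \leq i \leq N \} \cup \{ (y_{ij}, y_{ji}) : 1 \leq i < j \leq N \}$ is a collection of independent random elements.
(off-diagonal entries) $\{ (y_{ij}, y_{ji}) : 1 \leq i < j \leq N \}$ is a collection of iid copies of $(\xi_1, \xi_2)$.
(diagonal entries) $\{ y_{ii} : 1 \leq i \leq N \}$ is a collection of iid copies of $\zeta$.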
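Real elliptic random matrices generalize iid random matrices. Indeed, if $\xi_1, \xi_2, \zeta$ are iid, then $Y_N$ is just an iid random matrix. On the other hand, if $\xi_1 = \xi_2$ almost surely, then $Y_N$ is a real symmetric matrix. In this case, the eigenvalues of $Y_N$ are real and $Y_N$ is known as a real symmetric Wigner matrix.
Suppose $\xi_1, \xi_2$ have mean zero and unit variance. Set $\rho := \mathbb{E}[\xi_1 \xi_2]$. When $|\rho| < 1$ and $\zeta$ has mean zero and finite variance, it was shown in  that the ESD of $\frac{1}{\sqrt{N}} Y_N$ converges almost surely to the elliptic law $F_\rho$ as $N \to \infty$, where
$$F_\rho(x, y) := \mu_\rho \left( \left\{ z \in \mathbb{C} : \operatorname{Re}(z) \leq x, \ \operatorname{Im}(z) \leq y \right\} \right)$$
and $\mu_\rho$ is the uniform probability measure on the ellipsoid
$$\mathcal{E}_\rho := \left\{ z \in \mathbb{C} : \frac{\operatorname{Re}(z)^2}{(1 + \rho)^2} + \frac{\operatorname{Im}(z)^2}{(1 - \rho)^2} \leq 1 \right\}.$$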
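To make the definition of a real elliptic random matrix concrete, here is a minimal sampling sketch. It assumes Gaussian atom variables, which is only one admissible choice: each mirror pair is a centered Gaussian pair with unit variances and covariance `rho`, and the diagonal entries are standard Gaussian. The function name `real_elliptic_matrix` is hypothetical.

```python
import numpy as np

def real_elliptic_matrix(N, rho, rng):
    """Sample an N x N real elliptic matrix with Gaussian atom variables (illustrative).

    Mirror pairs (y_ij, y_ji), i < j, are iid centered Gaussian pairs with unit
    variances and covariance rho. Diagonal entries are iid standard Gaussians.
    """
    A = rng.standard_normal((N, N))
    B = rng.standard_normal((N, N))
    upper = np.triu(A, k=1)  # entries y_ij for i < j
    # rho * A + sqrt(1 - rho^2) * B has unit variance and covariance rho with A.
    partner = np.triu(rho * A + np.sqrt(1 - rho ** 2) * B, k=1)
    return upper + partner.T + np.diag(rng.standard_normal(N))

rng = np.random.default_rng(0)
Y = real_elliptic_matrix(300, 0.5, rng)
iu = np.triu_indices(300, k=1)
empirical_rho = np.mean(Y[iu] * Y.T[iu])  # sample average of y_ij * y_ji over i < j

W = real_elliptic_matrix(50, 1.0, rng)  # rho = 1 forces y_ij = y_ji
```

With `rho = 1.0` the two members of each mirror pair coincide and the sample is a real symmetric (Wigner-type) matrix, while `rho = 0.0` gives independent off-diagonal entries.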
In this note, we consider the product of independent real elliptic random matrices. In particular, we assume each real elliptic random matrix has atom variables which satisfy the following conditions.
There exists such that the following conditions hold.
both have mean zero and unit variance.
has mean zero and finite variance.
In our main result below, we show that the limiting distribution (with density given by (1.2)) from Theorem 1.3 for the product of independent iid random matrices is also the limiting distribution for the product of independent elliptic random matrices. In other words, the limit law for the product of independent random matrices is independent of the limit laws for the individual matrices themselves. This type of universality was first considered by Burda, Janik, and Waclaw in  for matrices with Gaussian entries; see also . Figure 2 displays several numerical simulations which illustrate this phenomenon.
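A small simulation in the same spirit can be sketched as follows. It is not the code behind the figure; it assumes Gaussian elliptic matrices built as a weighted sum of a symmetric and an antisymmetric Gaussian matrix, so that mirror entries have unit variance and covariance `rho`, and it normalizes the product of $m$ factors by $N^{-m/2}$. The names `gaussian_elliptic` and `product_spectrum`, the size, and the correlation values are all illustrative assumptions.

```python
import numpy as np

def gaussian_elliptic(N, rho, rng):
    """Gaussian elliptic matrix: mirror entries have unit variance and covariance rho."""
    G, H = rng.standard_normal((N, N)), rng.standard_normal((N, N))
    S = (G + G.T) / np.sqrt(2)  # symmetric part, off-diagonal variance 1
    A = (H - H.T) / np.sqrt(2)  # antisymmetric part, off-diagonal variance 1
    return np.sqrt((1 + rho) / 2) * S + np.sqrt((1 - rho) / 2) * A

def product_spectrum(N, rhos, seed=0):
    """Eigenvalues of N^{-m/2} times a product of m independent elliptic matrices."""
    rng = np.random.default_rng(seed)
    P = np.eye(N)
    for rho in rhos:
        P = P @ gaussian_elliptic(N, rho, rng)
    return np.linalg.eigvals(P / N ** (len(rhos) / 2))

m = 2
eig = product_spectrum(400, rhos=(0.5, -0.3))
# If the limit is the m-th power of the circular law, a centered disk of
# radius r <= 1 should carry mass close to r^(2/m).
for r in (0.25, 0.5, 0.75):
    print(r, np.mean(np.abs(eig) <= r), r ** (2 / m))
```

The two factors use different correlations on purpose: the individual matrices follow different elliptic laws, yet the radial profile of the product should be close to $r^{2/m}$ in both cases.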
More generally, we establish a version of Theorem 2.3 where each elliptic random matrix is perturbed by a deterministic, low rank matrix with small Hilbert-Schmidt norm. In fact, Theorem 2.3 will follow from Theorem 2.4 below. We recall that, for any matrix $M$, the Hilbert-Schmidt norm $\|M\|_2$ is given by the formula
$$\|M\|_2 := \sqrt{\operatorname{tr}(M M^\ast)} = \sqrt{\operatorname{tr}(M^\ast M)}. \tag{2.1}$$
Let be an integer. For each , let be real random elements that satisfy Assumption 2.2. For each and , let be an real elliptic random matrix with atom variables , and assume are independent. For each , let be a deterministic matrix, and assume
for some . Then the ESD of the product
converges almost surely to (with density given by (1.2)) as .
We conjecture that items (2) and (3) from Assumption 2.2 are not required for Theorem 2.4 to hold. Indeed, in view of Theorem 1.2 and , it is natural to conjecture that need only have two finite moments. Also, our proof of Theorem 2.4 can almost be completed under the assumption that . We only require that in Section 5 in order to control the least singular value of matrices of the form , where is a deterministic matrix whose entries are bounded by , for some . See Remark 5.3 and Theorem 2.8 below for further details.
Among other things, the perturbation by in Theorem 2.4 allows one to consider elliptic random matrices with nonzero mean. Indeed, let be a real number, and assume each entry of takes the value . Then is an elliptic random matrix whose atom variables have mean .
As noted above, when , the matrix is known as a real symmetric Wigner matrix. Theorem 2.4 requires that , but in the special case when , we are able to extend our proof to show that the same result holds for the product of two independent real symmetric Wigner matrices.
Let be real random variables with mean zero and unit variance, and which satisfy
for some . For each and , let be an real symmetric matrix whose diagonal entries and upper diagonal entries are iid copies of , and assume and are independent. Then the ESD of the product
converges almost surely to (with density given by (1.2) when ) as .
2.1. Overview and outline
We begin by outlining the proof of Theorem 2.4. Instead of directly considering , we introduce a linearized random matrix, , where and are block matrices of the form
The following theorem gives the limiting distribution of , from which we will deduce our main theorem as a corollary.
Under the assumptions of Theorem 2.4, the ESD of converges almost surely to the circular law as .
In Section 3, we show that Theorem 2.4 is a short corollary of Theorem 2.9. This same linearization trick was used in  to study products of non-Hermitian matrices with iid entries. Similar techniques were also used in [7, 30] to study general self-adjoint polynomials of self-adjoint random matrices.
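The linearization can be checked on small deterministic examples. The sketch below assumes the block matrix places the $k$-th factor in block position $(k, k+1)$, with the last factor wrapping around to position $(m, 1)$; the helper name `linearize` is hypothetical. Under that layout, the $m$-th power of the block matrix is block diagonal and its diagonal blocks are the cyclic products of the factors, all of which share the same eigenvalues.

```python
import numpy as np

def linearize(mats):
    """Block matrix with mats[k] in block position (k, k+1), cyclically, and zeros elsewhere."""
    m, N = len(mats), mats[0].shape[0]
    Y = np.zeros((m * N, m * N), dtype=mats[0].dtype)
    for k, X in enumerate(mats):
        j = (k + 1) % m  # the last factor wraps around to the first block column
        Y[k * N:(k + 1) * N, j * N:(j + 1) * N] = X
    return Y

rng = np.random.default_rng(1)
m, N = 3, 4
mats = [rng.standard_normal((N, N)) for _ in range(m)]
Y = linearize(mats)
Ym = np.linalg.matrix_power(Y, m)

first_block = Ym[:N, :N]             # equals X_1 X_2 X_3
second_block = Ym[N:2 * N, N:2 * N]  # equals X_2 X_3 X_1
```

Because the eigenvalues of the $m$-th power are the $m$-th powers of the eigenvalues, a statement about the spectrum of the linearized matrix transfers to the product.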
Sections 4, 5, and 6 are dedicated to proving Theorem 2.9. Following the ideas of Girko [21, 22], we compute the limiting spectral measure of a non-Hermitian random matrix by employing the method of Hermitization. Given an matrix , we recall that the empirical spectral measure of is given by
where are the eigenvalues of . We let denote the symmetric empirical measure built from the singular values of . That is,
where are the singular values of . In particular,
is the largest singular value of and
is the smallest singular value, both of which will play a key role in our analysis below.
The key observation of Girko [21, 22] relates the empirical spectral measure of a non-Hermitian matrix to that of a Hermitian matrix. To illustrate the connection, consider the Cauchy–Stieltjes transform of the measure , where is an matrix, given by
for . Since is analytic everywhere except at the poles (which are exactly the eigenvalues of ), the real part of determines the eigenvalues. Let denote the imaginary unit, and set . Then we can write the real part of as
where denotes the identity matrix. In other words, the task of studying reduces to studying the measures . The difficulty now is that the $\log$ function has two poles, one at infinity and one at zero. The largest singular value can easily be bounded by a polynomial in . The main difficulty is controlling the least singular value.
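The algebraic identity behind this reduction is that the absolute value of a determinant is both the product of the moduli of the eigenvalues and the product of the singular values. The sketch below verifies numerically that the averaged logarithm of $|\lambda_i - z|$ over the eigenvalues of a matrix equals the averaged logarithm of the singular values of the shifted matrix. The function names are illustrative, and the small test matrix is arbitrary.

```python
import numpy as np

def log_potential_via_eigenvalues(A, z):
    """(1/n) * sum_i log|lambda_i(A) - z|, computed from the eigenvalues of A."""
    eig = np.linalg.eigvals(A)
    return np.mean(np.log(np.abs(eig - z)))

def log_potential_via_singular_values(A, z):
    """(1/n) * sum_i log sigma_i(A - z I), computed from the singular values of A - z I."""
    n = A.shape[0]
    sigma = np.linalg.svd(A - z * np.eye(n), compute_uv=False)
    return np.mean(np.log(sigma))

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6)) / np.sqrt(6)
z = 0.3 + 0.2j
lhs = log_potential_via_eigenvalues(A, z)
rhs = log_potential_via_singular_values(A, z)
```

The second expression is an integral of the logarithm against the singular value distribution, which is why both very large and very small singular values have to be controlled.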
In order to study it is useful to note that it is also the empirical spectral measure for the Hermitization of . The Hermitization of is defined to be
For an matrix, the Stieltjes transform of is also the trace of the Hermitized resolvent. That is, for , we have
Here denotes the Kronecker product of the matrix and the identity matrix .
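A quick numerical check of the Hermitization is sketched below. It assumes the Hermitization of a square matrix $A$ is the block matrix with $A$ and $A^\ast$ on the off-diagonal and zeros on the diagonal, and it normalizes the trace of the resolvent by the dimension $2n$; both conventions, as well as the function names, are assumptions made here for illustration. The eigenvalues of the block matrix are the singular values of $A$ together with their negatives, so its normalized resolvent trace is the Stieltjes transform of the symmetrized singular value measure.

```python
import numpy as np

def hermitization(A):
    """Return the 2n x 2n Hermitian block matrix [[0, A], [A^*, 0]]."""
    n = A.shape[0]
    Z = np.zeros((n, n), dtype=complex)
    return np.block([[Z, A], [A.conj().T, Z]])

def stieltjes_from_resolvent(A, eta):
    """Normalized trace of (H - eta I)^{-1}, where H is the Hermitization of A."""
    n = A.shape[0]
    H = hermitization(A)
    return np.trace(np.linalg.inv(H - eta * np.eye(2 * n))) / (2 * n)

def stieltjes_from_singular_values(A, eta):
    """Stieltjes transform of the measure with equal atoms at plus/minus the singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    atoms = np.concatenate([-s, s])
    return np.mean(1.0 / (atoms - eta))

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
eta = 0.1 + 0.5j
s = np.linalg.svd(A, compute_uv=False)
H_eigs = np.linalg.eigvalsh(hermitization(A))
```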
Typically, in order to estimate the measures , one shows that the Stieltjes transform approximately satisfies a fixed point equation. Then one can show that this Stieltjes transform is close to the Stieltjes transform that exactly solves the fixed point equation. Because of the dependencies between entries in the matrix , directly computing the trace of the resolvent of the Hermitization of is troublesome. To circumvent this issue, in Section 4, instead of taking the trace of the resolvent, we instead take the partial trace and consider a matrix-valued Stieltjes transform. Then we show this partial trace approximately satisfies a matrix-valued fixed point equation.
In Section 5, we deduce a bound for the least singular value of the matrix from the known bounds on the least singular values of the individual matrices . We finally complete the proof of Theorem 2.9 in Section 6.
2.2. A remark from free probability
The fact that the limiting distribution of the product is isotropic when the limiting distributions of the individual matrices are not might be surprising at first. Free probability, which offers a natural way to study limits of random matrices by considering joint distributions of elements from a non-commutative probability space, can shed some light on this. In free probability, the natural distribution of non-normal elements is known as the Brown measure. For an introduction to free probability, we refer the reader to ; see  for further details about $R$-diagonal pairs as well as [12, 29] for computations of Brown measures. The distribution in Theorem 2.4 has also appeared in .
A non-commutative probability space is a unital algebra with a tracial state . We say a collection of elements are free if
whenever are polynomials such that , and .
In free probability, there is a distinguished set of elements known as $R$-diagonal elements. We refer the reader to [32, Section 4.4] for complete details. These operators enjoy several nice properties. When they are non-singular, one such property is that their polar decomposition is $uh$, where $u$ is a Haar unitary operator, $h$ is a positive operator, and $u, h$ are free. As a result of this decomposition, their Brown measure is isotropic. Additionally, the set of $R$-diagonal operators is closed under addition and multiplication of free elements.
In many cases, the Brown measure can be computed using the techniques of [12, 29]; however, for the purposes of this note (and due to discontinuities of the Brown measure), we will instead focus on a purely random matrix approach when computing the limiting distribution.
We conclude this subsection by showing that the product of two elliptical elements is $R$-diagonal. We consider two elements for simplicity; however, the argument easily generalizes to the product of $m$ elliptical elements.
First, we decompose an elliptical operator into the sum of a semicircular element and a circular element that are free from each other: . Since the sum of free $R$-diagonal elements is again $R$-diagonal, it suffices to consider each term in the sum individually and then observe that the terms are free from one another. Each term is of the form , where is either semicircular or circular, with polar decomposition: , , where is a quarter circular element, has distribution 1/2 at -1 and 1/2 at 1, and commutes with , is Haar unitary free from . Then we consider the product:
We begin by introducing a new free Haar unitary . Indeed, has the same distribution as
Then and are Haar unitaries, and one can check they are free from each other and and . Since the product of $R$-diagonal elements remains $R$-diagonal, the product is $R$-diagonal. Repeating this process for each term leads to the sum of free $R$-diagonal operators.
We use asymptotic notation (such as ) under the assumption that . We use , or to denote the bound for all sufficiently large and for some constant . Notations such as and mean that the hidden constant depends on another constant . We always allow the implicit constants in our asymptotic notation to depend on the integer from Theorem 2.4; we will not denote this dependence with a subscript. or means that as .
is the spectral norm of the matrix . denotes the Hilbert-Schmidt norm of (defined in (2.1)). We let denote the identity matrix. Often we will just write for the identity matrix when the size can be deduced from the context.
We write a.s., a.a., and a.e. for almost surely, Lebesgue almost all, and Lebesgue almost everywhere respectively. We use to denote the imaginary unit and reserve as an index. We let denote the indicator function of the event .
We let and denote constants that are non-random and may take on different values from one appearance to the next. The notation means that the constant depends on another parameter . We always allow the constants and to depend on the integer from Theorem 2.4; we will not denote this dependence with a subscript.
3. Proof of Theorem 2.4
Proof of Theorem 2.4.
Let . Then is a block diagonal matrix of the form
where and is the matrix
Let be a bounded and continuous function. Since each has the same eigenvalues as , we have
By Theorem 2.9, we have almost surely
as , where is the unit disk in the complex plane centered at the origin and . Thus, by the transformation $z \mapsto z^m$, we obtain
where the factor of $m$ out front of the integral corresponds to the fact that the transformation $z \mapsto z^m$ maps the complex plane $m$ times onto itself.
Combining the computations above, we conclude that almost surely
as . Since was an arbitrary bounded and continuous function, the proof of Theorem 2.4 is complete. ∎
4. A matrix-valued Stieltjes transform
In this section, we define a matrix-valued Stieltjes transform and introduce the relevant notation and limiting objects. Then we show that this Stieltjes transform concentrates around its expectation and estimate the error between its expectation and the limiting transform.
Here and in the sequel, we will take advantage of the following form for the inverse of a partitioned matrix (see, for instance, [33, Section 0.7.3]):
where and are square matrices.
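For reference, the standard Schur complement form of the inverse of a partitioned matrix can be verified numerically. The sketch below assumes the lower-right block $D$ and the Schur complement $A - B D^{-1} C$ are invertible; the names `block_inverse`, `A`, `B`, `C`, `D` are chosen here for illustration and need not match the notation of the displayed formula.

```python
import numpy as np

def block_inverse(A, B, C, D):
    """Invert [[A, B], [C, D]] through the Schur complement S = A - B D^{-1} C."""
    D_inv = np.linalg.inv(D)
    S_inv = np.linalg.inv(A - B @ D_inv @ C)
    return np.block([
        [S_inv, -S_inv @ B @ D_inv],
        [-D_inv @ C @ S_inv, D_inv + D_inv @ C @ S_inv @ B @ D_inv],
    ])

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 2)) + 5 * np.eye(2)  # shifted to keep the blocks well conditioned
B = rng.standard_normal((2, 3))
C = rng.standard_normal((3, 2))
D = rng.standard_normal((3, 3)) + 5 * np.eye(3)
M = np.block([[A, B], [C, D]])
M_inv = block_inverse(A, B, C, D)
```

The upper-left block of the inverse is the inverse of the Schur complement, which is the piece that isolates a single row and column when the formula is applied to a resolvent.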
Set , and let . Let be the Hermitization of . Define the resolvent
By the Stieltjes inversion formula, can be recovered from . Because of the dependencies between matrix entries, each entry of the resolvent cannot be computed directly by the Schur complement. One possible way to compute resolvent entries is to follow the approach in [44, Section 4.3] and use a decoupling formula to compute matrix entries. See also [37, 40] for computations in the elliptical case. The dependencies introduce more terms to these computations, leading to a system of equations involving diagonal entries of each block of the resolvent. These equations do not seem to admit an obvious solution. Instead, we offer a matrix-valued interpretation of these equations as well as a more direct derivation of the equations.
In order to study the resolvent we will retain the block structure of and view $2mN \times 2mN$ matrices as elements of $2m \times 2m$ matrices tensored with $N \times N$ matrices. Taking this view, the resolvent is a $2m$ by $2m$ matrix with $N$ by $N$ blocks. When we wish to refer to one of these blocks (or more generally any element of a matrix) we will use a superscript for the entry. Instead of considering the full trace of , we instead take the partial trace over the $N \times N$ matrix part of the tensor product and define . That is, is a matrix whose entry is the normalized trace of the block of . In other words, . To compute this partial trace we consider , the matrix whose entry is the entry of the block . Finally, we define the scalar
For each , let denote the matrix with the -th rows and -th columns of replaced by zeroes. Let be the Hermitization of . Define the resolvent
and set .
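The partial trace used above can be sketched in a few lines. The code assumes a matrix is stored as a $k \times k$ array of square blocks of a common size and returns the $k \times k$ matrix of normalized block traces; the helper name `partial_trace` and the test matrices are illustrative.

```python
import numpy as np

def partial_trace(R, block_size):
    """Normalized partial trace over the block_size x block_size factor.

    R is treated as a k x k array of block_size x block_size blocks. Entry (a, b)
    of the result is trace(block_ab) / block_size.
    """
    k = R.shape[0] // block_size
    blocks = R.reshape(k, block_size, k, block_size)  # blocks[a, i, b, j] = R[a*bs + i, b*bs + j]
    return np.einsum("aibi->ab", blocks) / block_size

# For a Kronecker product G (x) I, every block is a multiple of the identity,
# so the partial trace returns G itself.
G = np.arange(9.0).reshape(3, 3)
R_kron = np.kron(G, np.eye(4))
recovered = partial_trace(R_kron, 4)

# For a general matrix, the normalized full trace is the normalized trace of the partial trace.
rng = np.random.default_rng(5)
R_rand = rng.standard_normal((12, 12))
full = np.trace(R_rand) / 12
via_partial = np.trace(partial_trace(R_rand, 4)) / 3
```

Keeping the small matrix of block traces, rather than a single scalar trace, is what allows the fixed point equation to be written in matrix-valued form.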
Let be the matrix whose entry is the column of the block of , with the entry of each vector set to . Note that we use a semi-colon when we refer to matrix entries or columns, in contrast to the comma, which referred to a matrix.
Later in this section we will show that approximately satisfies the fixed point equation
with being a linear operator on matrices defined by:
where and is the matrix that is the block of the matrix . Of course, the choice was arbitrary. More concretely, we define for any to be the column index of nonzero block in the row of . So
where for we define . It is important that leaves diagonal entries of on the diagonal and that .
To describe the limiting matrix-valued Stieltjes transform, , we first define , the Stieltjes transform corresponding to the circular law. That is, for each , is the unique Stieltjes transform that solves the equation
for all ; see [27, Section 3].
Additionally, (4.5) implies the equality
We recall that for a square matrix , the imaginary part of is given by . We say has positive imaginary part if is positive definite. It was shown in  that (4.4) has one solution with positive imaginary part and is therefore a matrix-valued Stieltjes transform. Furthermore, the last two equalities show that is a solution to (4.4).
A good way to see that the solution to (4.4) is of the given form is to note that for large , is approximately . Then by analytic continuation, the entries of that are non-zero must also be non-zero entries of . Finally, this ansatz for the form of the solution is applied to (4.4) and iterated until the non-zero entries of are preserved by (4.4). Through this process one observes that the value of each does not affect the solution.
In this section we show that concentrates around its expectation.
We introduce -nets as a convenient way to discretize a compact set. Let . A set is an -net of a set if for any , there exists such that . The following estimate for the maximum size of an -net is well-known and follows from a standard volume argument (see, for example, [43, Lemma 3.11]).
Lemma 4.1 (Lemma 3.11 from ).
Let be a compact subset of . Then admits an -net of size at most
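The volume argument can be illustrated with an explicit, if not optimal, construction. The sketch below builds a square grid that serves as a net for a closed disk in the complex plane; it is a simple stand-in rather than the construction behind the lemma, its size is of order $(\text{radius}/\varepsilon)^2$ and is not claimed to match the lemma's bound, and the function name `grid_net` is hypothetical.

```python
import math
import random

def grid_net(radius, eps):
    """Grid-based eps-net of the closed disk {|z| <= radius} in the complex plane.

    A square grid with spacing h has covering radius h / sqrt(2), so choosing
    h = eps * sqrt(2) puts every point of the covered square within eps of a grid point.
    """
    h = eps * math.sqrt(2)
    k = math.ceil(radius / h)  # the grid covers the square [-k*h, k*h]^2, which contains the disk
    return [complex(a * h, b * h) for a in range(-k, k + 1) for b in range(-k, k + 1)]

def distance_to_net(z, net):
    """Distance from the point z to the nearest element of the net."""
    return min(abs(z - y) for y in net)

net = grid_net(radius=1.0, eps=0.25)

# Spot check: random points of the unit disk are within eps of the net.
rng = random.Random(0)
points = []
while len(points) < 1000:
    z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    if abs(z) <= 1:
        points.append(z)
worst = max(distance_to_net(z, net) for z in points)
```

In a net argument of this kind, a bound is first proved at each of the finitely many net points and then extended to the whole set using continuity in the parameter.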
Let . Under the assumptions of Theorem 2.4, a.s.
By the Borel-Cantelli lemma, it suffices to show that
for some constant .
Let and be -nets of and respectively. By Lemma 4.1,
Let be the set of all (defined by (4.2)) such that and . Hence . By the resolvent identity,
Thus, by a standard -net argument, it suffices to show that
By the union bound and Markov’s inequality, we have, for any ,
Therefore, it will suffice to show that for some sufficiently large, there exists a constant (depending only on ), such that
for any .
In fact, since is a matrix, we will show that, for every ,
for any and .
Fix . Let denote the conditional expectation with respect to the first rows and columns of each matrix .
We now rewrite as a martingale difference sequence. Indeed,
Since is at most rank , the resolvent identity implies that is at most rank and . This then gives the bound