Products of independent elliptic random matrices
Abstract.
For fixed $m > 1$, we study the product of $m$ independent $N \times N$ elliptic random matrices as $N$ tends to infinity. Our main result shows that the empirical spectral distribution of the product converges, with probability $1$, to the $m$-th power of the circular law, regardless of the joint distribution of the mirror entries in each matrix. This leads to a new kind of universality phenomenon: the limit law for the product of independent random matrices is independent of the limit laws for the individual matrices themselves.
Our result also generalizes earlier results of Götze–Tikhomirov [28] and O’Rourke–Soshnikov [44] concerning the product of independent iid random matrices.
1. Introduction
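We begin by recalling that the eigenvalues of an $N \times N$ matrix $M$ are the roots in $\mathbb{C}$ of the characteristic polynomial $\det(M - zI)$, where $I$ is the identity matrix. We let $\lambda_1(M), \ldots, \lambda_N(M)$ denote the eigenvalues of $M$. In this case, the empirical spectral measure $\mu_M$ is given by
\[ \mu_M := \frac{1}{N} \sum_{i=1}^N \delta_{\lambda_i(M)}. \]
The corresponding empirical spectral distribution (ESD) is given by
\[ F^M(x,y) := \frac{1}{N} \# \left\{ 1 \leq i \leq N : \Re(\lambda_i(M)) \leq x, \ \Im(\lambda_i(M)) \leq y \right\}. \]
Here $\# E$ denotes the cardinality of the set $E$.
If the matrix $M$ is Hermitian, then the eigenvalues $\lambda_1(M), \ldots, \lambda_N(M)$ are real. In this case the ESD is given by
\[ F^M(x) := \frac{1}{N} \# \left\{ 1 \leq i \leq N : \lambda_i(M) \leq x \right\}. \]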
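As a small numerical illustration of these definitions, the sketch below evaluates the two-dimensional ESD of a matrix at a point. The helper name `esd` is hypothetical (it does not come from the paper), and NumPy's general eigenvalue solver is assumed.

```python
import numpy as np

def esd(M, x, y):
    """Fraction of eigenvalues of M with real part <= x and imaginary part <= y."""
    lam = np.linalg.eigvals(M)
    return float(np.mean((lam.real <= x) & (lam.imag <= y)))

# The eigenvalues of this diagonal matrix are 1, 2, 3, 4 (all real).
M = np.diag([1.0, 2.0, 3.0, 4.0])
print(esd(M, 2.5, 0.0))  # two of the four eigenvalues qualify
```

For a Hermitian matrix the imaginary parts vanish, so the one-dimensional ESD is recovered by taking any $y \geq 0$.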
One of the simplest random matrix ensembles is the class of random matrices with independent and identically distributed (iid) entries.
Definition 1.1 (iid random matrix).
Let $\xi$ be a complex random variable. We say $Y_N$ is an $N \times N$ iid random matrix with atom variable $\xi$ if the entries of $Y_N$ are iid copies of $\xi$.
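When $\xi$ is a standard complex Gaussian random variable, $Y_N$ can be viewed as a random matrix drawn from the probability distribution
\[ \mathbb{P}(dM) := \frac{1}{\pi^{N^2}} e^{-\operatorname{tr}(M M^\ast)} \, dM \]
on the set of complex $N \times N$ matrices. Here $dM$ denotes the Lebesgue measure on the $2N^2$ real entries
\[ \{ \Re(m_{ij}) : 1 \leq i,j \leq N \} \cup \{ \Im(m_{ij}) : 1 \leq i,j \leq N \} \]
of $M = (m_{ij})_{i,j=1}^N$. The measure $\mathbb{P}(dM)$ is known as the complex Ginibre ensemble. The real Ginibre ensemble is defined analogously. Following Ginibre [20], one may compute the joint density of the eigenvalues of a random matrix $Y_N$ drawn from the complex Ginibre ensemble. Indeed, $(\lambda_1(Y_N), \ldots, \lambda_N(Y_N))$ has density
\[ p_N(z_1, \ldots, z_N) := \frac{1}{\pi^N \prod_{k=1}^N k!} \exp \left( - \sum_{k=1}^N |z_k|^2 \right) \prod_{1 \leq i < j \leq N} |z_i - z_j|^2. \tag{1.1} \]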
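Mehta [35, 36] used the joint density function (1.1) to compute the limiting spectral measure of the complex Ginibre ensemble. In particular, he showed that if $Y_N$ is drawn from the complex Ginibre ensemble, then the ESD of $\frac{1}{\sqrt{N}} Y_N$ converges to the circular law $F_{\mathrm{circ}}$ as $N \to \infty$, where
\[ F_{\mathrm{circ}}(x,y) := \mu_{\mathrm{circ}} \left( \left\{ z \in \mathbb{C} : \Re(z) \leq x, \ \Im(z) \leq y \right\} \right) \]
and $\mu_{\mathrm{circ}}$ is the uniform probability measure on the unit disk in the complex plane. Edelman [17] verified the same limiting distribution for the real Ginibre ensemble.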
For the general (non-Gaussian) case, there is no formula for the joint distribution of the eigenvalues and the problem appears much more difficult. The universality phenomenon in random matrix theory asserts that the spectral behavior of an iid random matrix does not depend on the distribution of the atom variable $\xi$ in the limit $N \to \infty$. In other words, one expects that the circular law describes the limiting ESD of a large class of random matrices (not just Gaussian matrices).
An important result was obtained by Girko [21, 22], who related the empirical spectral measure of a non-Hermitian matrix to that of a family of Hermitian matrices. Using this Hermitization technique, Bai [8, 9] gave the first rigorous proof of the circular law for general (non-Gaussian) distributions. He proved the result under a number of moment and smoothness assumptions on the atom variable $\xi$, and a series of recent improvements were obtained by Götze and Tikhomirov [27], Pan and Zhou [45] and Tao and Vu [46, 48]. In particular, Tao and Vu [47, 48] established the law with the minimum assumption that $\xi$ has finite variance.
Theorem 1.2 (Tao–Vu, [48]).
Let $\xi$ be a complex random variable with mean zero and unit variance. For each $N \geq 1$, let $Y_N$ be an $N \times N$ iid random matrix with atom variable $\xi$. Then the ESD of $\frac{1}{\sqrt{N}} Y_N$ converges almost surely to the circular law $F_{\mathrm{circ}}$ as $N \to \infty$.
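The circular law can be probed numerically with a decidedly non-Gaussian atom variable. The sketch below is an illustration only, not part of any proof: it assumes a Rademacher ($\pm 1$) atom variable, a fixed seed, and a hypothetical helper `radial_fraction`. It compares the fraction of eigenvalues of the rescaled matrix inside a disk of radius $r$ with the circular-law mass $r^2$.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
N = 400

# Rademacher atom variable: mean zero, unit variance, far from Gaussian.
Y = rng.choice([-1.0, 1.0], size=(N, N))
lam = np.linalg.eigvals(Y / np.sqrt(N))

def radial_fraction(eigs, r):
    """Fraction of eigenvalues with modulus at most r."""
    return float(np.mean(np.abs(eigs) <= r))

for r in (0.25, 0.5, 0.75):
    # Under the circular law, the disk of radius r <= 1 has mass r**2.
    print(f"r={r}: empirical={radial_fraction(lam, r):.3f}  circular law={r**2:.3f}")
```

At this matrix size the agreement is only approximate; the theorem concerns the limit $N \to \infty$.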
More recently, Götze and Tikhomirov [28] considered the ESD of the product of $m$ independent iid random matrices. They showed that, as the sizes of the matrices tend to infinity, the limiting distribution is given by $F_m$, where $F_m$ is supported on the unit disk in the complex plane and has density $f_m$ given by
\[ f_m(z) := \begin{cases} \dfrac{1}{m \pi} |z|^{\frac{2}{m} - 2}, & |z| \leq 1, \\ 0, & |z| > 1 \end{cases} \tag{1.2} \]
in the complex plane. It can be verified directly that if $\psi$ is a random variable distributed uniformly on the unit disk in the complex plane, then $\psi^m$ has distribution $F_m$.
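The claim about the $m$-th power of a uniform point can be checked by simulation. Integrating the density $\frac{1}{m\pi}|z|^{\frac{2}{m}-2}$ in polar coordinates over $\{|z| \leq r\}$ gives mass $r^{2/m}$ for $0 \leq r \leq 1$, and the sketch below compares this with the empirical radial distribution of $\psi^m$. The helper names, sample size, and seed are illustrative assumptions, and only the radial part of the law is tested.

```python
import numpy as np

def sample_uniform_disk(n, rng):
    """Draw n points uniformly from the unit disk (radius ~ sqrt of a uniform)."""
    radius = np.sqrt(rng.uniform(size=n))
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return radius * np.exp(1j * angle)

def radial_cdf(r, m):
    """Mass of {|z| <= r} under the density (1/(m*pi)) |z|^(2/m - 2) on the unit disk."""
    return min(r, 1.0) ** (2.0 / m)

rng = np.random.default_rng(1)
m = 3
psi_m = sample_uniform_disk(200_000, rng) ** m

for r in (0.1, 0.5, 0.9):
    empirical = float(np.mean(np.abs(psi_m) <= r))
    print(f"r={r}: empirical={empirical:.4f}  r^(2/m)={radial_cdf(r, m):.4f}")
```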
Theorem 1.3 (Götze–Tikhomirov, [28]).
Let $m \geq 1$ be an integer, and assume $\xi_1, \ldots, \xi_m$ are complex random variables with mean zero and unit variance. For each $N \geq 1$ and $1 \leq k \leq m$, let $Y_{N,k}$ be an $N \times N$ iid random matrix with atom variable $\xi_k$, and assume $Y_{N,1}, \ldots, Y_{N,m}$ are independent. Define the product
\[ P_N := N^{-m/2} \, Y_{N,1} \cdots Y_{N,m}. \]
Then $\mathbb{E} F^{P_N}$ converges to $F_m$ as $N \to \infty$.
The convergence of $\mathbb{E} F^{P_N}$ to $F_m$ in Theorem 1.3 was strengthened to almost sure convergence in [10, 44]. The Gaussian case was originally considered by Burda, Janik, and Waclaw [13]; see also [15]. We refer the reader to [1, 2, 3, 4, 5, 6, 14, 18, 19] and references therein for many other interesting results concerning products of Gaussian random matrices.
2. New results
In this paper, we generalize Theorem 1.3 by considering products of independent real elliptic random matrices. Elliptic random matrices were originally introduced by Girko [23, 24] in the 1980s.
Definition 2.1 (Real elliptic random matrix).
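Let $(\xi_1, \xi_2)$ be a random vector in $\mathbb{R}^2$, and let $\zeta$ be a real random variable. We say $X_N = (x_{ij})_{i,j=1}^N$ is a real elliptic random matrix with atom variables $(\xi_1, \xi_2), \zeta$ if the following conditions hold.
(1) (independence) $\{x_{ii} : 1 \leq i \leq N\} \cup \{(x_{ij}, x_{ji}) : 1 \leq i < j \leq N\}$ is a collection of independent random elements.
(2) (off-diagonal entries) $\{(x_{ij}, x_{ji}) : 1 \leq i < j \leq N\}$ is a collection of iid copies of $(\xi_1, \xi_2)$.
(3) (diagonal entries) $\{x_{ii} : 1 \leq i \leq N\}$ is a collection of iid copies of $\zeta$.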
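Real elliptic random matrices generalize iid random matrices. Indeed, if $\xi_1, \xi_2, \zeta$ are iid, then $X_N$ is just an iid random matrix. On the other hand, if $\xi_1 = \xi_2$ almost surely, then $X_N$ is a real symmetric matrix. In this case, the eigenvalues of $X_N$ are real and $X_N$ is known as a real symmetric Wigner matrix [50].
Suppose $\xi_1, \xi_2$ have mean zero and unit variance. Set $\rho := \mathbb{E}[\xi_1 \xi_2]$. When $|\rho| < 1$ and $\zeta$ has mean zero and finite variance, it was shown in [40] that the ESD of $\frac{1}{\sqrt{N}} X_N$ converges almost surely to the elliptic law $F_\rho$ as $N \to \infty$, where
\[ F_\rho(x,y) := \mu_\rho \left( \left\{ z \in \mathbb{C} : \Re(z) \leq x, \ \Im(z) \leq y \right\} \right) \]
and $\mu_\rho$ is the uniform probability measure on the ellipse
\[ \mathcal{E}_\rho := \left\{ z \in \mathbb{C} : \frac{\Re(z)^2}{(1+\rho)^2} + \frac{\Im(z)^2}{(1-\rho)^2} \leq 1 \right\}. \]
This is a natural generalization of the circular law (Theorem 1.2). Figure 1 displays a numerical simulation of the eigenvalues of a real elliptic random matrix.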
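A real elliptic random matrix is easy to generate when the atom variables are Gaussian. The sketch below is a minimal illustration under that assumption: it builds each mirror pair $(x_{ij}, x_{ji})$ as a correlated Gaussian pair with correlation $\rho$, then checks the empirical correlation and how many rescaled eigenvalues fall in a slightly enlarged ellipse with semi-axes $1+\rho$ and $1-\rho$. The function name `gaussian_elliptic_matrix`, the 15% enlargement, and the seed are choices made here for illustration.

```python
import numpy as np

def gaussian_elliptic_matrix(N, rho, rng):
    """N x N real matrix whose mirror pairs (x_ij, x_ji), i < j, are iid
    standard Gaussian pairs with correlation rho; the diagonal is iid N(0, 1)."""
    A = np.triu(rng.standard_normal((N, N)), k=1)   # entries x_ij, i < j
    B = np.triu(rng.standard_normal((N, N)), k=1)   # independent noise
    lower = (rho * A + np.sqrt(1.0 - rho**2) * B).T  # entries x_ji, i < j
    return A + lower + np.diag(rng.standard_normal(N))

rng = np.random.default_rng(2)
N, rho = 300, 0.5
X = gaussian_elliptic_matrix(N, rho, rng)

# Empirical correlation between mirror entries.
iu = np.triu_indices(N, k=1)
emp_rho = float(np.corrcoef(X[iu], X.T[iu])[0, 1])

# Fraction of rescaled eigenvalues inside the ellipse, enlarged by 15%
# to allow for finite-size fluctuations at the edge.
lam = np.linalg.eigvals(X / np.sqrt(N))
inside = (lam.real / (1 + rho))**2 + (lam.imag / (1 - rho))**2 <= 1.15**2
print(emp_rho, float(inside.mean()))
```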
In this note, we consider the product of independent real elliptic random matrices. In particular, we assume each real elliptic random matrix has atom variables which satisfy the following conditions.
Assumption 2.2.
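There exists $\tau > 0$ such that the following conditions hold.
(1) $\xi_1, \xi_2$ both have mean zero and unit variance.
(2) $\mathbb{E}|\xi_1|^{2+\tau} + \mathbb{E}|\xi_2|^{2+\tau} < \infty$.
(3) $\rho := \mathbb{E}[\xi_1 \xi_2]$ satisfies $|\rho| < 1$.
(4) $\zeta$ has mean zero and finite variance.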
In our main result below, we show that the limiting distribution $F_m$ (with density given by (1.2)) from Theorem 1.3 for the product of independent iid random matrices is also the limiting distribution for the product of independent elliptic random matrices. In other words, the limit law for the product of independent random matrices is independent of the limit laws for the individual matrices themselves. This type of universality was first considered by Burda, Janik, and Waclaw in [13] for matrices with Gaussian entries; see also [15]. Figure 2 displays several numerical simulations which illustrate this phenomenon.
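Theorem 2.3.
Let $m > 1$ be an integer. For each $1 \leq k \leq m$, let $(\xi_{k,1}, \xi_{k,2}), \zeta_k$ be real random elements that satisfy Assumption 2.2. For each $N \geq 1$ and $1 \leq k \leq m$, let $X_{N,k}$ be an $N \times N$ real elliptic random matrix with atom variables $(\xi_{k,1}, \xi_{k,2}), \zeta_k$, and assume $X_{N,1}, \ldots, X_{N,m}$ are independent. Then the ESD of the product $P_N := N^{-m/2} X_{N,1} \cdots X_{N,m}$ converges almost surely to $F_m$ (with density given by (1.2)) as $N \to \infty$.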
More generally, we establish a version of Theorem 2.3 where each elliptic random matrix is perturbed by a deterministic, low rank matrix with small Hilbert–Schmidt norm. In fact, Theorem 2.3 will follow from Theorem 2.4 below. We recall that, for any matrix $M$, the Hilbert–Schmidt norm $\|M\|_2$ is given by the formula
\[ \|M\|_2 := \sqrt{\operatorname{tr}(M M^\ast)} = \sqrt{\operatorname{tr}(M^\ast M)}. \tag{2.1} \]
Theorem 2.4.
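Let $m > 1$ be an integer. For each $1 \leq k \leq m$, let $(\xi_{k,1}, \xi_{k,2}), \zeta_k$ be real random elements that satisfy Assumption 2.2. For each $N \geq 1$ and $1 \leq k \leq m$, let $X_{N,k}$ be an $N \times N$ real elliptic random matrix with atom variables $(\xi_{k,1}, \xi_{k,2}), \zeta_k$, and assume $X_{N,1}, \ldots, X_{N,m}$ are independent. For each $1 \leq k \leq m$, let $A_{N,k}$ be a deterministic $N \times N$ matrix, and assume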
(2.2) 
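for some $\varepsilon > 0$. Then the ESD of the product
\[ P_N := \prod_{k=1}^m \left( \frac{1}{\sqrt{N}} X_{N,k} + A_{N,k} \right) \tag{2.3} \]
converges almost surely to $F_m$ (with density given by (1.2)) as $N \to \infty$.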
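The universality asserted by Theorem 2.4 can be explored numerically in its simplest setting. The sketch below is an illustration under stated assumptions rather than a verification: it takes no deterministic perturbation, Gaussian atom variables, $m = 2$, and two different correlations $\rho_1 = 0.5$ and $\rho_2 = -0.5$. It compares the radial distribution of the eigenvalues of the rescaled product with $r^{2/m}$, the mass that the density in (1.2) assigns to $\{|z| \leq r\}$. All function names and parameter values are hypothetical choices.

```python
import numpy as np

def gaussian_elliptic_matrix(N, rho, rng):
    """Real elliptic matrix with standard Gaussian atom variables and
    mirror-entry correlation rho (illustrative construction)."""
    A = np.triu(rng.standard_normal((N, N)), k=1)
    B = np.triu(rng.standard_normal((N, N)), k=1)
    lower = (rho * A + np.sqrt(1.0 - rho**2) * B).T
    return A + lower + np.diag(rng.standard_normal(N))

def radial_fraction(eigs, r):
    """Fraction of eigenvalues with modulus at most r."""
    return float(np.mean(np.abs(eigs) <= r))

rng = np.random.default_rng(3)
N, rhos = 400, (0.5, -0.5)
m = len(rhos)

# Product of the rescaled elliptic matrices, one factor per correlation.
P = np.eye(N)
for rho in rhos:
    P = P @ (gaussian_elliptic_matrix(N, rho, rng) / np.sqrt(N))

lam = np.linalg.eigvals(P)
for r in (0.25, 0.5, 0.75):
    print(f"r={r}: empirical={radial_fraction(lam, r):.3f}  limit={r**(2.0/m):.3f}")
```

Although each factor on its own follows an elliptic law with a different eccentricity, the product is expected to look rotationally symmetric; finite-size deviations at $N = 400$ are to be expected.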
Remark 2.5.
We conjecture that items (2) and (3) from Assumption 2.2 are not required for Theorem 2.4 to hold. Indeed, in view of Theorem 1.2 and [40], it is natural to conjecture that need only have two finite moments. Also, our proof of Theorem 2.4 can almost be completed under the assumption that . We only require that in Section 5 in order to control the least singular value of matrices of the form , where is a deterministic matrix whose entries are bounded by , for some . See Remark 5.3 and Theorem 2.8 below for further details.
Remark 2.6.
Among other things, the perturbation by in Theorem 2.4 allows one to consider elliptic random matrices with nonzero mean. Indeed, let be a real number, and assume each entry of takes the value . Then is an elliptic random matrix whose atom variables have mean .
Remark 2.7.
As noted above, when $\rho = 1$, the matrix $X_N$ is known as a real symmetric Wigner matrix. Theorem 2.4 requires that $|\rho| < 1$, but in the special case when $m = 2$, we are able to extend our proof to show that the same result holds for the product of two independent real symmetric Wigner matrices.
Theorem 2.8.
Let be real random variables with mean zero and unit variance, and which satisfy
for some . For each and , let be an real symmetric matrix whose diagonal entries and upper diagonal entries are iid copies of , and assume and are independent. Then the ESD of the product
converges almost surely to $F_2$ (with density given by (1.2) when $m = 2$) as $N \to \infty$.
2.1. Overview and outline
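We begin by outlining the proof of Theorem 2.4. Instead of directly considering $P_N$, we introduce a linearized random matrix, $Y_N := \frac{1}{\sqrt{N}} \mathcal{Y}_N + \mathcal{A}_N$, where $\mathcal{Y}_N$ and $\mathcal{A}_N$ are $mN \times mN$ block matrices of the form
\[ \mathcal{Y}_N := \begin{bmatrix} 0 & X_{N,1} & & & 0 \\ 0 & 0 & X_{N,2} & & 0 \\ & & \ddots & \ddots & \\ 0 & & & 0 & X_{N,m-1} \\ X_{N,m} & & & & 0 \end{bmatrix} \tag{2.4} \]
and
\[ \mathcal{A}_N := \begin{bmatrix} 0 & A_{N,1} & & & 0 \\ 0 & 0 & A_{N,2} & & 0 \\ & & \ddots & \ddots & \\ 0 & & & 0 & A_{N,m-1} \\ A_{N,m} & & & & 0 \end{bmatrix}. \tag{2.5} \]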
The following theorem gives the limiting distribution of $Y_N$, from which we will deduce our main theorem as a corollary.
Theorem 2.9.
Under the assumptions of Theorem 2.4, the ESD of $Y_N$ converges almost surely to the circular law $F_{\mathrm{circ}}$ as $N \to \infty$.
In Section 3, we show that Theorem 2.4 is a short corollary of Theorem 2.9. This same linearization trick was used in [44] to study products of non-Hermitian matrices with iid entries. Similar techniques were also used in [7, 30] to study general self-adjoint polynomials of self-adjoint random matrices.
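The algebra behind the linearization trick can be checked on small matrices. The sketch below assumes a cyclic block layout, with the $k$-th factor in block $(k, k+1)$ and the last factor in block $(m, 1)$; this layout is an assumption made for illustration, and `linearize` is a hypothetical helper. With this layout, the $m$-th power of the block matrix is block diagonal, its first diagonal block is the product of the factors, and the sum of the $m$-th powers of its eigenvalues equals $m$ times the trace of the product.

```python
import numpy as np

def linearize(mats):
    """Place the k-th N x N matrix in block (k, k+1 mod m) of an mN x mN matrix."""
    m, N = len(mats), mats[0].shape[0]
    Y = np.zeros((m * N, m * N))
    for k, X in enumerate(mats):
        j = (k + 1) % m
        Y[k * N:(k + 1) * N, j * N:(j + 1) * N] = X
    return Y

rng = np.random.default_rng(4)
N, m = 5, 3
mats = [rng.standard_normal((N, N)) for _ in range(m)]

Y = linearize(mats)
Ym = np.linalg.matrix_power(Y, m)
P = mats[0] @ mats[1] @ mats[2]

first_block_ok = np.allclose(Ym[:N, :N], P)       # first diagonal block is the product
off_diagonal_ok = np.allclose(Ym[:N, N:], 0.0)    # remaining blocks of that block row vanish
trace_ok = np.isclose(np.sum(np.linalg.eigvals(Y) ** m), m * np.trace(P))
print(first_block_ok, off_diagonal_ok, trace_ok)
```

The other diagonal blocks of the $m$-th power are cyclic rearrangements of the same product, which is why every eigenvalue of the product appears $m$ times among the $m$-th powers of the eigenvalues of the block matrix.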
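Sections 4, 5, and 6 are dedicated to proving Theorem 2.9. Following the ideas of Girko [21, 22], we compute the limiting spectral measure of a non-Hermitian random matrix $M$ by employing the method of Hermitization. Given an $N \times N$ matrix $M$, we recall that the empirical spectral measure of $M$ is given by
\[ \mu_M := \frac{1}{N} \sum_{i=1}^N \delta_{\lambda_i(M)}, \]
where $\lambda_1(M), \ldots, \lambda_N(M)$ are the eigenvalues of $M$. We let $\nu_M$ denote the symmetric empirical measure built from the singular values of $M$. That is,
\[ \nu_M := \frac{1}{2N} \sum_{i=1}^N \left( \delta_{\sigma_i(M)} + \delta_{-\sigma_i(M)} \right), \]
where $\sigma_1(M) \geq \cdots \geq \sigma_N(M) \geq 0$ are the singular values of $M$. In particular,
\[ \sigma_1(M) = \sup_{\|x\| = 1} \|Mx\| \]
is the largest singular value of $M$ and
\[ \sigma_N(M) = \inf_{\|x\| = 1} \|Mx\| \]
is the smallest singular value, both of which will play a key role in our analysis below.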
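The key observation of Girko [21, 22] relates the empirical spectral measure of a non-Hermitian matrix to that of a Hermitian matrix. To illustrate the connection, consider the Cauchy–Stieltjes transform $s_M$ of the measure $\mu_M$, where $M$ is an $N \times N$ matrix, given by
\[ s_M(z) := \frac{1}{N} \sum_{i=1}^N \frac{1}{\lambda_i(M) - z} = \int_{\mathbb{C}} \frac{1}{w - z} \, d\mu_M(w) \]
for $z \in \mathbb{C}$. Since $s_M$ is analytic everywhere except at the poles (which are exactly the eigenvalues of $M$), the real part of $s_M$ determines the eigenvalues. Let $\sqrt{-1}$ denote the imaginary unit, and set $z = s + \sqrt{-1} t$. Then we can write the real part of $s_M$ as
\[ \Re(s_M(z)) = \frac{1}{N} \sum_{i=1}^N \frac{\Re(\lambda_i(M)) - s}{|\lambda_i(M) - z|^2} = -\frac{1}{2N} \sum_{i=1}^N \frac{\partial}{\partial s} \log |\lambda_i(M) - z|^2 = -\frac{1}{2N} \frac{\partial}{\partial s} \log \det \left[ (M - zI)(M - zI)^\ast \right] = -\frac{\partial}{\partial s} \int_{\mathbb{R}} \log |x| \, d\nu_{M - zI}(x), \]
where $I$ denotes the identity matrix and $\nu_{M - zI}$ is the symmetric empirical measure built from the singular values of $M - zI$. In other words, the task of studying $\mu_M$ reduces to studying the measures $\{\nu_{M - zI}\}_{z \in \mathbb{C}}$. The difficulty now is that the function $\log |\cdot|$ has two singularities, one at infinity and one at zero. The largest singular value can easily be bounded by a polynomial in $N$. The main difficulty is controlling the least singular value.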
In order to study $\nu_M$ it is useful to note that it is also the empirical spectral measure for the Hermitization of $M$. The Hermitization of $M$ is defined to be
\[ H := \begin{bmatrix} 0 & M \\ M^\ast & 0 \end{bmatrix}. \]
For an matrix, the Stieltjes transform of is also the trace of the Hermitized resolvent. That is, for , we have
where
Here denotes the Kronecker product of the matrix and the identity matrix .
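The link between singular values and the Hermitization can be checked numerically. The sketch below assumes the standard $2 \times 2$ block form of the Hermitization, with the matrix in the upper-right block and its conjugate transpose in the lower-left block; `hermitization` is a hypothetical helper and the test matrix and shift $z$ are arbitrary. It verifies that the eigenvalues of the Hermitization of $M - zI$ are plus and minus the singular values of $M - zI$, and that the sum of the logarithms of those singular values equals $\sum_i \log |\lambda_i(M) - z|$, which is the identity behind Girko's method.

```python
import numpy as np

def hermitization(M):
    """Return the 2N x 2N Hermitian block matrix [[0, M], [M^*, 0]]."""
    N = M.shape[0]
    Z = np.zeros((N, N), dtype=complex)
    return np.block([[Z, M], [M.conj().T, Z]])

rng = np.random.default_rng(5)
N = 6
M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
z = 0.3 - 0.2j
shifted = M - z * np.eye(N)

# Eigenvalues of the Hermitization are +/- the singular values.
sing = np.linalg.svd(shifted, compute_uv=False)
herm_eigs = np.linalg.eigvalsh(hermitization(shifted))
symmetric_ok = np.allclose(np.sort(herm_eigs), np.sort(np.concatenate([-sing, sing])))

# log|det(M - zI)| computed from singular values and from eigenvalues.
logdet_from_sing = np.sum(np.log(sing))
logdet_from_eigs = np.sum(np.log(np.abs(np.linalg.eigvals(M) - z)))
logdet_ok = np.isclose(logdet_from_sing, logdet_from_eigs)
print(symmetric_ok, logdet_ok)
```

The second identity shows why the smallest singular value matters: the logarithm is unbounded near zero, so a single tiny singular value can dominate the sum.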
Typically, in order to estimate the measures $\nu_{M - zI}$, one shows that the Stieltjes transform approximately satisfies a fixed point equation. Then one can show that this Stieltjes transform is close to the Stieltjes transform that exactly solves the fixed point equation. Because of the dependencies between entries in the matrix $Y_N$, directly computing the trace of the resolvent of the Hermitization of $Y_N$ is troublesome. To circumvent this issue, in Section 4, instead of taking the trace of the resolvent, we take the partial trace and consider a matrix-valued Stieltjes transform. Then we show this partial trace approximately satisfies a matrix-valued fixed point equation.
2.2. A remark from free probability
The fact that the limiting distribution of the product is isotropic when the limiting distributions of the individual matrices are not might be surprising at first. Free probability, which offers a natural way to study limits of random matrices by considering joint distributions of elements from a noncommutative probability space, can shed some light on this. In free probability, the natural distribution of non-normal elements is known as the Brown measure. For an introduction to free probability, we refer the reader to [32]; see [42] for further details about $R$-diagonal pairs as well as [12, 29] for computations of Brown measures. The distribution in Theorem 2.4 has also appeared in [34].
A noncommutative probability space is a unital algebra with a tracial state . We say a collection of elements are free if
whenever are polynomials such that , and .
In free probability, there is a distinguished set of elements known as $R$-diagonal elements. We refer the reader to [32, Section 4.4] for complete details. These operators enjoy several nice properties. When they are nonsingular, one such property is that their polar decomposition is $uh$, where $u$ is a Haar unitary operator, $h$ is a positive operator, and $u, h$ are free. As a result of this decomposition, their Brown measure is isotropic. Additionally, the set of $R$-diagonal operators is closed under addition and multiplication of free elements.
In many cases, the Brown measure can be computed using the techniques of [12, 29]; however, for the purposes of this note (and due to discontinuities of the Brown measure), we will instead focus on a purely random matrix approach when computing the limiting distribution.
We conclude this subsection by showing that the product of two elliptical elements is $R$-diagonal. We consider two elements for simplicity; however, the argument easily generalizes to the product of elliptical elements.
First, we decompose an elliptical operator into the sum of a semicircular and circular elements, that are free from each other: . Since the sum of free $R$-diagonal elements is again $R$-diagonal, it suffices to consider each term in the sum individually and then observe that the terms are free from one another. Each term is of the form , where is either semicircular or circular, with polar decomposition: , , where is a quarter circular element, has distribution 1/2 at 1 and 1/2 at −1, and commutes with , is Haar unitary free from . Then we consider the product:
We begin by introducing a new free Haar unitary . Indeed, has the same distribution as
Then and are Haar unitaries, and one can check they are free from each other and and . Since the product of $R$-diagonal elements remains $R$-diagonal, is $R$-diagonal. Repeating this process for each term leads to the sum of free $R$-diagonal operators.
2.3. Notation
We use asymptotic notation (such as $O$, $o$, $\Omega$) under the assumption that $N \to \infty$. We use $X \ll Y$, $Y \gg X$, $Y = \Omega(X)$, or $X = O(Y)$ to denote the bound $X \leq CY$ for all sufficiently large $N$ and for some constant $C$. Notations such as $X \ll_k Y$ and $X = O_k(Y)$ mean that the hidden constant $C$ depends on another constant $k$. We always allow the implicit constants in our asymptotic notation to depend on the integer $m$ from Theorem 2.4; we will not denote this dependence with a subscript. $X = o(Y)$ or $Y = \omega(X)$ means that $X/Y \to 0$ as $N \to \infty$.
$\|M\|$ is the spectral norm of the matrix $M$. $\|M\|_2$ denotes the Hilbert–Schmidt norm of $M$ (defined in (2.1)). We let $I_N$ denote the $N \times N$ identity matrix. Often we will just write $I$ for the identity matrix when the size can be deduced from the context.
We write a.s., a.a., and a.e. for almost surely, Lebesgue almost all, and Lebesgue almost everywhere respectively. We use $\sqrt{-1}$ to denote the imaginary unit and reserve $i$ as an index. We let $\mathbf{1}_E$ denote the indicator function of the event $E$.
We let $C$ and $K$ denote constants that are non-random and may take on different values from one appearance to the next. The notation $K_p$ means that the constant $K$ depends on another parameter $p$. We always allow the constants $C$ and $K$ to depend on the integer $m$ from Theorem 2.4; we will not denote this dependence with a subscript.
3. Proof of Theorem 2.4
We begin by proving Theorem 2.4 assuming Theorem 2.9. The majority of the paper will then be devoted to proving Theorem 2.9.
Proof of Theorem 2.4.
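Let $Y_N := \frac{1}{\sqrt{N}} \mathcal{Y}_N + \mathcal{A}_N$ be the linearized matrix built from the block matrices (2.4) and (2.5). Then $Y_N^m$ is a block diagonal matrix of the form
\[ Y_N^m = \begin{bmatrix} Z_{N,1} & & 0 \\ & \ddots & \\ 0 & & Z_{N,m} \end{bmatrix}, \]
where $Z_{N,1} := P_N$ and $Z_{N,k}$ is the matrix
\[ Z_{N,k} := \prod_{j=k}^{m} \left( \frac{1}{\sqrt{N}} X_{N,j} + A_{N,j} \right) \prod_{j=1}^{k-1} \left( \frac{1}{\sqrt{N}} X_{N,j} + A_{N,j} \right) \]
for $1 < k \leq m$.
Let $f : \mathbb{C} \to \mathbb{C}$ be a bounded and continuous function. Since each $Z_{N,k}$ has the same eigenvalues as $P_N$, we have
\[ \int_{\mathbb{C}} f(z) \, d\mu_{P_N}(z) = \frac{1}{N} \sum_{i=1}^N f(\lambda_i(P_N)) = \frac{1}{mN} \sum_{i=1}^{mN} f\left( \lambda_i(Y_N)^m \right) = \int_{\mathbb{C}} f(z^m) \, d\mu_{Y_N}(z). \]
By Theorem 2.9, we have almost surely
\[ \int_{\mathbb{C}} f(z^m) \, d\mu_{Y_N}(z) \longrightarrow \frac{1}{\pi} \int_{\mathbb{D}} f(z^m) \, d^2 z \]
as $N \to \infty$, where $\mathbb{D}$ is the unit disk in the complex plane centered at the origin and $d^2 z = d\Re(z) \, d\Im(z)$. Thus, by the transformation $z \mapsto z^m$, we obtain
\[ \frac{1}{\pi} \int_{\mathbb{D}} f(z^m) \, d^2 z = \frac{m}{\pi} \int_{\mathbb{D}} f(z) \, \frac{1}{m^2} |z|^{\frac{2}{m} - 2} \, d^2 z = \frac{1}{m \pi} \int_{\mathbb{D}} f(z) \, |z|^{\frac{2}{m} - 2} \, d^2 z, \]
where the factor of $m$ out front of the integral corresponds to the fact that the transformation maps the complex plane $m$ times onto itself.
Combining the computations above, we conclude that almost surely
\[ \int_{\mathbb{C}} f(z) \, d\mu_{P_N}(z) \longrightarrow \frac{1}{m \pi} \int_{\mathbb{D}} f(z) \, |z|^{\frac{2}{m} - 2} \, d^2 z \]
as $N \to \infty$. Since $f$ was an arbitrary bounded and continuous function, the proof of Theorem 2.4 is complete. ∎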
4. A matrix-valued Stieltjes transform
In this section, we define a matrix-valued Stieltjes transform and introduce the relevant notation and limiting objects. Then we show that this Stieltjes transform concentrates around its expectation and estimate the error between its expectation and the limiting transform.
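Here and in the sequel, we will take advantage of the following form for the inverse of a partitioned matrix (see, for instance, [33, Section 0.7.3]):
\[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} (A - B D^{-1} C)^{-1} & -(A - B D^{-1} C)^{-1} B D^{-1} \\ -D^{-1} C (A - B D^{-1} C)^{-1} & D^{-1} + D^{-1} C (A - B D^{-1} C)^{-1} B D^{-1} \end{bmatrix}, \tag{4.1} \]
where $A$ and $D$ are square matrices.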
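The partitioned-inverse formula can be sanity-checked numerically. The sketch below uses the Schur-complement form of the block inverse (with respect to the lower-right block $D$), which requires $D$ and $A - BD^{-1}C$ to be invertible; the block sizes, seed, and the diagonal shift that keeps the blocks well conditioned are arbitrary choices made here.

```python
import numpy as np

rng = np.random.default_rng(6)
p, q = 3, 4

# Shift the diagonal blocks so that D and the Schur complement are well conditioned.
A = rng.standard_normal((p, p)) + 10.0 * np.eye(p)
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 10.0 * np.eye(q)

D_inv = np.linalg.inv(D)
S = np.linalg.inv(A - B @ D_inv @ C)  # inverse of the Schur complement of D

block_inverse = np.block([
    [S,               -S @ B @ D_inv],
    [-D_inv @ C @ S,  D_inv + D_inv @ C @ S @ B @ D_inv],
])
direct_inverse = np.linalg.inv(np.block([[A, B], [C, D]]))

formula_ok = np.allclose(block_inverse, direct_inverse)
print(formula_ok)
```

The upper-left block of the inverse depends on the rest of the matrix only through the Schur complement, which is what makes the formula convenient for resolvent computations.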
Set , and let . Let be the Hermitization of . Define the resolvent
where
(4.2) 
for .
By the Stieltjes inversion formula, can be recovered from . Because of the dependencies between matrix entries, each entry of the resolvent cannot be computed directly by Schur’s complement. One possible way to compute resolvent entries is to follow the approach in [44, Section 4.3] and use a decoupling formula to compute matrix entries. See also [37, 40] for computations in the elliptical case. The dependencies introduce more terms to these computations, leading to a system of equations involving diagonal entries of each block of the resolvent. These equations do not seem to admit an obvious solution. Instead we offer a matrix-valued interpretation of these equations as well as a more direct derivation of the equations.
In order to study the resolvent we will retain the block structure of and view matrices as elements of matrices tensored with matrices. Taking this view, is by matrix with by blocks. When we wish to refer to one of these blocks (or more generally any element of a matrix) we will use a superscript for the entry. Instead of considering the full trace of , we instead take the partial trace over the matrix part of the tensor product and define . That is, is a matrix whose entry is the normalized trace of the block of . In other words, . To compute this partial trace we consider , the matrix whose entry is the entry of the block . Finally, we define the scalar
For each , let denote the matrix with the th rows and th columns of replaced by zeroes. Let be the Hermitization of . Define the resolvent
(4.3) 
and set .
Let be the matrix whose entry is the column of the block of , with the entry of each vector set to . Note that we use a semicolon when we refer to matrix entries or columns, in contrast to the comma, which referred to a matrix.
Later in this section we will show that approximately satisfies the fixed point equation
(4.4) 
with being a linear operator on matrices defined by:
where and is the matrix that is the block of the matrix , of course, the choice was arbitrary. More concretely, we define for any to be the column index of nonzero block in the row of . So
where for we define . It is important that leaves diagonal entries of on the diagonal and that .
To describe the limiting matrixvalued Stieltjes transform, , we first define , the Stieltjes transform corresponding to the circular law. That is, for each , is the unique Stieltjes transform that solves the equation
(4.5) 
for all ; see [27, Section 3].
We recall that for a square matrix $A$, the imaginary part of $A$ is given by $\Im(A) := \frac{1}{2\sqrt{-1}}(A - A^\ast)$. We say $A$ has positive imaginary part if $\Im(A)$ is positive definite. It was shown in [31] that (4.4) has one solution with positive imaginary part and is therefore a matrix-valued Stieltjes transform. Furthermore, the last two equalities show that is a solution to (4.4).
A good way to see that the solution to (4.4) is of the given form is to note that for large , is approximately . Then by analytic continuation, the entries of that are nonzero must also be nonzero entries of . Finally, this ansatz for the form of the solution is applied to (4.4) and iterated until the nonzero entries of are preserved by (4.4). Through this process one observes that the value of each does not affect the solution.
4.1. Concentration
In this section we show that concentrates around its expectation.
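We introduce $\varepsilon$-nets as a convenient way to discretize a compact set. Let $\varepsilon > 0$. A set $\mathcal{N}$ is an $\varepsilon$-net of a set $D$ if for any $z \in D$, there exists $z' \in \mathcal{N}$ such that $|z - z'| \leq \varepsilon$. The following estimate for the maximum size of an $\varepsilon$-net is well-known and follows from a standard volume argument (see, for example, [43, Lemma 3.11]).
Lemma 4.1 (Lemma 3.11 from [43]).
Let $D$ be a compact subset of $\{z \in \mathbb{C} : |z| \leq M\}$. Then $D$ admits an $\varepsilon$-net of size at most
\[ \left( 1 + \frac{2M}{\varepsilon} \right)^2. \]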
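The volume argument behind such a bound can be made concrete. The sketch below is an illustration under stated assumptions: it works with a finite random sample of the disk $\{|z| \leq M\}$ rather than the full set, and it greedily extracts a maximal $\varepsilon$-separated subset, which is automatically an $\varepsilon$-net of the sample. Since disks of radius $\varepsilon/2$ around the net points are disjoint and lie in a disk of radius $M + \varepsilon/2$, the net has at most $(1 + 2M/\varepsilon)^2$ points. The helper `greedy_net` and all parameter values are hypothetical.

```python
import numpy as np

def greedy_net(points, eps):
    """Greedily build a maximal eps-separated subset of `points`.
    Every rejected point lies within eps of an accepted one, so the
    result is an eps-net of `points`."""
    net = []
    for z in points:
        if all(abs(z - w) > eps for w in net):
            net.append(z)
    return np.array(net)

rng = np.random.default_rng(7)
M, eps, n = 1.0, 0.25, 2000

# Finite sample of the disk {|z| <= M}.
pts = M * np.sqrt(rng.uniform(size=n)) * np.exp(2j * np.pi * rng.uniform(size=n))

net = greedy_net(pts, eps)
covered = all(np.min(np.abs(net - z)) <= eps for z in pts)
volume_bound = (1.0 + 2.0 * M / eps) ** 2
print(len(net), covered, volume_bound)
```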
Lemma 4.2.
Let . Under the assumptions of Theorem 2.4, a.s.
Proof.
By the Borel–Cantelli lemma, it suffices to show that
for some constant .
Let and be nets of and respectively. By Lemma 4.1,
Let be the set of all (defined by (4.2)) such that and . Hence . By the resolvent identity,
Thus, by a standard net argument, it suffices to show that
By the union bound and Markov’s inequality, we have, for any ,
Therefore, it will suffice to show that for some sufficiently large, there exists a constant (depending only on ), such that
for any .
In fact, since is a matrix, we will show that, for every ,
(4.6) 
for any and .
Fix . Let denote the conditional expectation with respect to the first rows and columns of each matrix .
We now rewrite as a martingale difference sequence. Indeed,
Since is at most rank , the resolvent identity implies that is at most rank and . This then gives the bound
(4.7) 
for any . Thus, by the Burkholder inequality [16] (see for example [9, Lemma 2.12] for a complex-valued version of the Burkholder inequality), for any ,