# Spectra of nearly Hermitian random matrices

Sean O’Rourke Department of Mathematics, University of Colorado at Boulder, Boulder, CO 80309  and  Philip Matchett Wood Department of Mathematics, University of Wisconsin-Madison, 480 Lincoln Dr., Madison, WI 53706
###### Abstract.

We consider the eigenvalues and eigenvectors of matrices of the form $M + P$, where $M$ is an $n \times n$ Wigner random matrix and $P$ is an arbitrary $n \times n$ deterministic matrix with low rank. In general, we show that none of the eigenvalues of $M + P$ need be real, even when $P$ has rank one. We also show that, except for a few outlier eigenvalues, most of the eigenvalues of $M + P$ are within $n^{-1}$ of the real line, up to small order corrections. We also prove a new result quantifying the outlier eigenvalues for multiplicative perturbations of the form $S(I + P)$, where $S$ is a sample covariance matrix and $I$ is the identity matrix. We extend our result showing that all eigenvalues except the outliers are close to the real line to this case as well. As an application, we study the critical points of the characteristic polynomials of nearly Hermitian random matrices.

The second author was partially supported by National Security Agency (NSA) Young Investigator Grant number H98230-14-1-0149.

## 1. Introduction and notation

The fundamental question of perturbation theory is the following. How does a function change when its argument is subject to a perturbation? In particular, matrix perturbation theory is concerned with perturbations of matrix functions, such as solutions to linear systems, eigenvalues, and eigenvectors. The purpose of this paper is to analyze the eigenvalues and eigenvectors of large perturbed random matrices.

For an matrix with complex entries, the eigenvalues of are the roots in of the characteristic polynomial , where denotes the identity matrix. Let denote the eigenvalues of counted with multiplicity, and let denote the set of all eigenvalues of . In particular, if is Hermitian (that is, ), then .

The central question in the perturbation theory of eigenvalues is the following. Given a matrix and a perturbation of , how are the spectra and related? Many challenges can arise when trying to answer this question. For instance, different classes of matrices behave differently under perturbation.

The simplest case to consider is when both $M$ and $P$ are Hermitian. Indeed, if $M$ and $P$ are Hermitian $n \times n$ matrices then the eigenvalues of $M$ and $M + P$ are real and the classic Hoffman–Wielandt theorem [27] states that there exists a permutation $\pi$ of $\{1, \ldots, n\}$ such that

$$ \sum_{j=1}^{n} \left| \lambda_{\pi(j)}(M) - \lambda_j(M+P) \right|^2 \leq \|P\|_2^2. \tag{1.1} $$

Here, $\|P\|_2$ denotes the Frobenius norm of $P$ defined by the formula

$$ \|P\|_2 := \sqrt{\operatorname{tr}(P P^*)} = \sqrt{\operatorname{tr}(P^* P)}. \tag{1.2} $$

The classic Hoffman–Wielandt theorem [27] is in fact a bit more general: (1.1) holds assuming only that $M$ and $M + P$ are normal.

However, in this note, we focus on the case when $M$ is Hermitian but $P$ is arbitrary. In this case, the spectrum of $M$ is real, but the eigenvalues of $M + P$ need not be real. Indeed, in the following example, we show that even when $P$ contains only one nonzero entry, all the eigenvalues of $M + P$ can be non-real.

###### Example 1.1.

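Consider the $n \times n$ tridiagonal Toeplitz matrix $T_n$ with zeros on the diagonal and ones on the super- and sub-diagonals. The matrix $T_n$ has eigenvalues $2\cos\left(\frac{k\pi}{n+1}\right)$ for $k = 1, \ldots, n$ (see for example [31]). Let $p_n(z) := \det(T_n - zI)$ denote the characteristic polynomial for $T_n$. Let $P_n$ be the matrix with every entry zero except for the $(n,n)$-entry which is set to $\sqrt{-1}$, the imaginary unit. (Here and in the sequel, we use $\sqrt{-1}$ to denote the imaginary unit and reserve $i$ as an index.) Then the characteristic polynomial of $T_n + P_n$ is $p_n(z) + \sqrt{-1}\, p_{n-1}(z)$, so a real zero $x$ would have to satisfy $p_n(x) = p_{n-1}(x) = 0$. Since $p_{n-1}\left(2\cos\left(\frac{k\pi}{n+1}\right)\right) \neq 0$ for all $k = 1, \ldots, n$, it follows that none of the eigenvalues of $T_n + P_n$ are real.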
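The claim of Example 1.1 is easy to probe numerically. The following is a minimal sketch, assuming NumPy and an illustrative size $n = 8$; the helper name `toeplitz_example` is hypothetical and not part of the paper.

```python
import numpy as np

def toeplitz_example(n):
    """Return (T, T + P): T is the n x n tridiagonal Toeplitz matrix with
    zero diagonal and unit off-diagonals; P has sqrt(-1) in entry (n, n)."""
    T = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    P = np.zeros((n, n), dtype=complex)
    P[-1, -1] = 1j  # the single nonzero entry of the perturbation
    return T, T + P

n = 8  # illustrative size (assumption)
T, TP = toeplitz_example(n)

# Eigenvalues of T versus the closed form 2*cos(k*pi/(n+1)), k = 1..n.
eigs_T = np.sort(np.linalg.eigvalsh(T))
closed_form = np.sort(2 * np.cos(np.arange(1, n + 1) * np.pi / (n + 1)))

# Eigenvalues of the perturbed matrix: none should be real.
eigs_TP = np.linalg.eigvals(TP)

print("max deviation from closed form:", np.max(np.abs(eigs_T - closed_form)))
print("smallest imaginary part of eig(T + P):", eigs_TP.imag.min())
```

Since the trace of $T + P$ equals $\sqrt{-1}$, the imaginary parts must sum to one, which gives a cheap sanity check on the computed spectrum.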

Example 1.1 shows that we cannot guarantee that even one of the eigenvalues of $M + P$ is real. However, this example does raise the following question.

###### Question 1.2.

If $M$ is Hermitian and $P$ is arbitrary, how far are the eigenvalues of $M + P$ from the real line?

Question 1.2 was addressed by Kahan [30]. We summarize Kahan’s results below in Theorem 1.3, but we first fix some notation. For an $n \times n$ matrix $M$, let $\|M\|$ denote the spectral norm of $M$. That is, $\|M\|$ is the largest singular value of $M$. We also denote the real and imaginary parts of $M$ as

$$ \operatorname{Re}(M) := \frac{M + M^*}{2}, \qquad \operatorname{Im}(M) := \frac{M - M^*}{2i}. $$

It follows that $M$ is Hermitian if and only if $\operatorname{Im}(M) = 0$.

###### Theorem 1.3 (Kahan [30]).

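Let $M$ be an $n \times n$ Hermitian matrix with eigenvalues $\lambda_1 \geq \cdots \geq \lambda_n$. Let $P$ be an arbitrary $n \times n$ matrix, and denote the eigenvalues of $M + P$ as $\mu_1 + \sqrt{-1}\,\nu_1, \ldots, \mu_n + \sqrt{-1}\,\nu_n$, where $\mu_1 \geq \cdots \geq \mu_n$. Then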

$$ \sup_{1 \leq k \leq n} |\nu_k| \leq \|\operatorname{Im}(P)\|, \qquad \sum_{k=1}^{n} \nu_k^2 \leq \|\operatorname{Im}(P)\|_2^2, \tag{1.3} $$

and

$$ \sum_{k=1}^{n} \left| (\mu_k + \sqrt{-1}\,\nu_k) - \lambda_k \right|^2 \leq 2 \|P\|_2^2. \tag{1.4} $$
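The bounds (1.3) and (1.4) can be checked on a small random instance. The sketch below assumes NumPy, an illustrative size $n = 6$, and the ordering convention that the eigenvalues of $M$ are sorted in decreasing order and those of $M + P$ in decreasing order of their real parts; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # illustrative size (assumption)

# A Hermitian matrix M and an arbitrary (non-Hermitian) perturbation P.
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = (G + G.conj().T) / 2
P = 0.3 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

lam = np.sort(np.linalg.eigvalsh(M))[::-1]   # lambda_1 >= ... >= lambda_n
z = np.linalg.eigvals(M + P)
z = z[np.argsort(-z.real)]                   # mu_1 >= ... >= mu_n (assumed ordering)
nu = z.imag

im_P = (P - P.conj().T) / 2j                 # Im(P), a Hermitian matrix
spec_im = np.linalg.norm(im_P, 2)            # spectral norm of Im(P)
frob_im = np.linalg.norm(im_P, "fro")        # Frobenius norm of Im(P)
frob_P = np.linalg.norm(P, "fro")            # Frobenius norm of P

lhs_sup = np.max(np.abs(nu))                 # left side of first bound in (1.3)
lhs_sum = np.sum(nu ** 2)                    # left side of second bound in (1.3)
lhs_14 = np.sum(np.abs(z - lam) ** 2)        # left side of (1.4)

print("sup |nu_k|        :", lhs_sup, "<=", spec_im)
print("sum nu_k^2        :", lhs_sum, "<=", frob_im ** 2)
print("sum |z_k - lam_k|^2:", lhs_14, "<=", 2 * frob_P ** 2)
```

For a generic random instance the inequalities hold with room to spare, in line with the remark that sharpness is attained only for special configurations.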
###### Remark 1.4.

In the case that $M$ is only assumed to be normal (instead of Hermitian) but $P$ is arbitrary, Sun [48] proved that there exists a permutation $\pi$ of $\{1, \ldots, n\}$ such that

$$ \sum_{j=1}^{n} \left| \lambda_{\pi(j)}(M) - \lambda_j(M+P) \right|^2 \leq n \|P\|_2^2, \tag{1.5} $$

and [48] also shows that this bound is sharp.

We refer the reader to [47, Section IV.5.1] for a discussion of Kahan’s results as well as a concise proof of Theorem 1.3. The bounds in (1.3) were shown to be sharp in [30]. In the next example, we show that (1.4) is also sharp.

###### Example 1.5.

Consider the matrices and . Then has eigenvalues and has only the eigenvalue with multiplicity two. Since , it follows that

$$ |1 - 0|^2 + |-1 - 0|^2 = 2 \|P\|_2^2, $$

and hence the bound in (1.4) is sharp.

In this note, we address Question 1.2 when $M$ is a Hermitian random matrix and $P$ is a deterministic, low rank perturbation. In particular, our main results (presented in Section 2) show that, in this setting, one can improve upon the results in Theorem 1.3. One might not expect this improvement since the bounds in Theorem 1.3 are sharp; however, the bounds appear to be sharp only for very contrived examples (such as Example 1.5). Intuitively, if we consider a random matrix $M$, we expect with high probability to avoid these worst-case scenarios, and thus, some improvement is expected. Before we present our main results, we describe the ensembles of Hermitian random matrices we will be interested in.

### 1.1. Random matrix models

We consider two important ensembles of Hermitian random matrices. The first ensemble was originally introduced by Wigner [54] in the 1950s to model Hamiltonians of atomic nuclei.

###### Definition 1.6 (Wigner matrix).

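Let $\xi, \zeta$ be real random variables. We say $W = (w_{ij})_{i,j=1}^{n}$ is a real symmetric Wigner matrix of size $n$ with atom variables $\xi$ and $\zeta$ if $W$ is a random real symmetric $n \times n$ matrix that satisfies the following conditions.

• $\{w_{ij} : 1 \leq i \leq j \leq n\}$ is a collection of independent random variables.

• $\{w_{ij} : 1 \leq i < j \leq n\}$ is a collection of independent and identically distributed (iid) copies of $\xi$.

• $\{w_{ii} : 1 \leq i \leq n\}$ is a collection of iid copies of $\zeta$.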

###### Remark 1.7.

One can similarly define Hermitian Wigner matrices where is allowed to be a complex-valued random variable. However, for simplicity, we only focus on the real symmetric model in this note.

The prototypical example of a Wigner real symmetric matrix is the Gaussian orthogonal ensemble (GOE). The GOE is defined as an $n \times n$ Wigner matrix with atom variables $\xi$ and $\zeta$, where $\xi$ is a standard Gaussian random variable and $\zeta$ is a Gaussian random variable with mean zero and variance two. Equivalently, the GOE can be viewed as the probability distribution

$$ \mathbb{P}(dW) = \frac{1}{Z_n} \exp\left( -\frac{1}{4} \operatorname{tr} W^2 \right) dW $$

on the space of $n \times n$ real symmetric matrices, where $dW$ refers to the Lebesgue measure on the $n(n+1)/2$ different elements of the matrix. Here $Z_n$ denotes the normalization constant.

We will also consider an ensemble of sample covariance matrices.

###### Definition 1.8 (Sample covariance matrix).

Let be a real random variable. We say is a sample covariance matrix with atom variable and parameters if , where is a random matrix whose entries are iid copies of .

A fundamental result for Wigner random matrices is Wigner’s semicircle law, which describes the global behavior of the eigenvalues of a Wigner random matrix. Before stating the result, we present some additional notation. For an arbitrary $n \times n$ matrix $A$, we define the empirical spectral measure $\mu_A$ of $A$ as

$$ \mu_A := \frac{1}{n} \sum_{k=1}^{n} \delta_{\lambda_k(A)}. $$

In general, $\mu_A$ is a probability measure on $\mathbb{C}$, but if $A$ is Hermitian, then $\mu_A$ is a probability measure on $\mathbb{R}$. In particular, if $A$ is a random matrix, then $\mu_A$ is a random probability measure on $\mathbb{C}$. Let us also recall what it means for a sequence of random probability measures to converge weakly.

###### Definition 1.9 (Weak convergence of random probability measures).

Let be a topological space (such as or ), and let be its Borel -field. Let be a sequence of random probability measures on , and let be a probability measure on . We say converges weakly to in probability as (and write in probability) if for all bounded continuous and any ,

$$ \lim_{n \to \infty} \mathbb{P}\left( \left| \int f \, d\mu_n - \int f \, d\mu \right| > \varepsilon \right) = 0. $$

In other words, in probability as if and only if in probability for all bounded continuous . Similarly, we say converges weakly to almost surely (or, equivalently, converges weakly to with probability ) as (and write almost surely) if for all bounded continuous ,

$$ \lim_{n \to \infty} \int f \, d\mu_n = \int f \, d\mu $$

almost surely.

Recall that Wigner’s semicircle law is the (non-random) measure $\mu_{sc}$ on $\mathbb{R}$ with density

$$ \rho_{sc}(x) := \begin{cases} \frac{1}{2\pi} \sqrt{4 - x^2}, & \text{if } |x| \leq 2, \\ 0, & \text{otherwise.} \end{cases} \tag{1.6} $$

###### Theorem 1.10 (Wigner’s semicircle law; Theorem 2.5 from [3]).

Let and be real random variables; assume has unit variance. For each , let be an real symmetric Wigner matrix with atom variables and . Then, with probability , the empirical spectral measure of converges weakly on as to the (non-random) measure with density given by (1.6).
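As a rough numerical illustration of the semicircle law, the sketch below samples a GOE-type matrix, rescales it by $1/\sqrt{n}$ (the normalization assumed here), and compares the fraction of eigenvalues in $[-1, 1]$ with $\int_{-1}^{1} \rho_{sc}(x)\,dx = \frac{1}{3} + \frac{\sqrt{3}}{2\pi}$. The helper `goe` and the size $n = 1000$ are illustrative assumptions.

```python
import numpy as np

def goe(n, rng):
    """Sample a real symmetric matrix with N(0, 1) off-diagonal entries
    and N(0, 2) diagonal entries (GOE normalization)."""
    G = rng.standard_normal((n, n))
    return (G + G.T) / np.sqrt(2)

rng = np.random.default_rng(1)
n = 1000  # illustrative size (assumption)
eigs = np.linalg.eigvalsh(goe(n, rng) / np.sqrt(n))

# Mass that the semicircle density (1.6) assigns to the interval [-1, 1].
exact = 1 / 3 + np.sqrt(3) / (2 * np.pi)
empirical = np.mean(np.abs(eigs) <= 1.0)

print("empirical mass of [-1, 1]:", empirical)
print("semicircle mass of [-1, 1]:", exact)
print("empirical second moment (semicircle value is 1):", np.mean(eigs ** 2))
```

The interval $[-1, 1]$ is an arbitrary test set; any other interval works once the corresponding integral of (1.6) is computed.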

For sample covariance matrices, the Marchenko–Pastur law describes the limiting global behavior of the eigenvalues. Recall that the Marchenko–Pastur law is the (non-random) measure with parameter which has density

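$$ \rho_{MP,y}(x) := \begin{cases} \frac{\sqrt{y}}{2 x \pi} \sqrt{(x - \lambda_-)(\lambda_+ - x)}, & \text{if } \lambda_- \leq x \leq \lambda_+, \\ 0, & \text{otherwise} \end{cases} \tag{1.7} $$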

and with point mass at the origin if , where

$$ \lambda_{\pm} := \sqrt{y} \left( 1 \pm \frac{1}{\sqrt{y}} \right)^2. \tag{1.8} $$

###### Theorem 1.11 (Marchenko–Pastur law; Theorem 3.6 from [3]).

Let be a real random variable with mean zero and unit variance. For each , let be a sample covariance matrix with atom variable and parameters , where is a function of such that as . Then, with probability , the empirical spectral measure of converges weakly on as to the (non-random) measure .
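A quick numerical check of the Marchenko–Pastur law is sketched below for the square case only, assuming an $n \times n$ Gaussian data matrix $X$, the normalization $\frac{1}{n} X^{\mathrm{T}} X$, and parameter $y = 1$, for which (1.8) gives $\lambda_- = 0$ and $\lambda_+ = 4$. The first two moments of the limiting law are then $1$ and $2$; the size $n = 500$ and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500  # illustrative size (assumption); square case m = n, so y = 1

X = rng.standard_normal((n, n))          # iid atom variables with mean 0, variance 1
eigs = np.linalg.eigvalsh(X.T @ X / n)   # spectrum of the normalized sample covariance matrix

y = 1.0
lam_minus = np.sqrt(y) * (1 - 1 / np.sqrt(y)) ** 2   # left edge from (1.8)
lam_plus = np.sqrt(y) * (1 + 1 / np.sqrt(y)) ** 2    # right edge from (1.8)

first_moment = eigs.mean()
second_moment = (eigs ** 2).mean()

print("support predicted by (1.8):", (lam_minus, lam_plus))
print("observed range            :", (eigs.min(), eigs.max()))
print("first and second moments  :", first_moment, second_moment)
```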

### 1.2. Notation

We use asymptotic notation (such as ) under the assumption that . We use to denote the bound for all sufficiently large and for some constant . Notation such as means that the hidden constant depends on another constant . The notation or means that as .

For an event , we let denote the indicator function of , and denotes the complement of . We write a.s. and a.e. for almost surely and Lebesgue almost everywhere, respectively. We let denote the imaginary unit and reserve as an index.

For any matrix , we define the Frobenius norm of by (1.2), and we use to denote the spectral norm of . We let denote the identity matrix. Often we will just write for the identity matrix when the size can be deduced from the context.

We let and denote constants that are non-random and may take on different values from one appearance to the next. The notation means that the constant depends on another parameter .

## 2. Main results

Studying the eigenvalues of deterministic perturbations of random matrices has generated much interest. In particular, recent results have shown that adding a low-rank perturbation to a large random matrix barely changes the global behavior of the eigenvalues. However, as illustrated below, some of the eigenvalues can deviate away from the bulk of the spectrum. This behavior, sometimes referred to as the BBP transition, was first studied by Johnstone [28] and Baik, Ben Arous, and Péché [5] for spiked covariance matrices. Similar results have been obtained in [6, 7, 8, 9, 15, 16, 17, 32, 33, 39, 40, 43] for other Hermitian random matrix models. Non-Hermitian models have also been studied, including [14, 42, 49, 50] (iid random matrices), [38] (elliptic random matrices), and [10, 11] (matrices from the single ring theorem).

In this note, we focus on Hermitian random matrix ensembles perturbed by non-Hermitian matrices. This model has recently been explored in [38, 44]. However, the results in [38, 44] address only the “outlier” eigenvalues, and leave Question 1.2 unanswered for the bulk of the eigenvalues. The goal of this paper is to address these bulk eigenvalues. We begin with some examples.

### 2.1. Some example perturbations

In Example 1.1, we gave a deterministic example when is Hermitian, is non-Hermitian, and none of the eigenvalues of are real. We begin this section by giving some random examples where the same phenomenon holds. We first consider a Wigner matrix perturbed by a diagonal matrix.

###### Theorem 2.1.

For , let be a real random variable satisfying

$$ \mathbb{P}(\xi = x) \leq 1 - \mu \quad \text{for all } x \in \mathbb{R}, \tag{2.1} $$

and let be an arbitrary real random variable. Suppose is an Wigner matrix with atom variables and . Let be the diagonal matrix for some with . Then, for any , there exists (depending on and ) such that the following holds with probability at least :

• if , then all eigenvalues of are in the upper half-plane .

• if , then all eigenvalues of are in the lower half-plane .

Moreover, if and are absolutely continuous random variables, then the above holds with probability .

###### Remark 2.2.

The choice for the last coordinates of to take the value is completely arbitrary. Since is invariant under conjugation by a permutation matrix, the same result also holds for , where is any -dimensional standard basis vector.

Figure 1 depicts a numerical simulation of Theorem 2.1 when the entries of are Gaussian. The proof of Theorem 2.1 relies on some recent results due to Tao and Vu [52] and Nguyen, Tao, and Vu [35] concerning gaps between eigenvalues of Wigner matrices.
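A simulation in the spirit of Figure 1 can be sketched as follows. The code assumes Gaussian atom variables, a perturbation of the form $\operatorname{diag}(0, \ldots, 0, \gamma\sqrt{-1})$ as in Lemma 3.2, and an illustrative size $n = 30$; the helper `perturbed_wigner` is hypothetical.

```python
import numpy as np

def perturbed_wigner(n, gamma, rng):
    """Gaussian Wigner matrix plus diag(0, ..., 0, gamma * sqrt(-1))."""
    G = rng.standard_normal((n, n))
    W = (G + G.T) / np.sqrt(2)          # real symmetric, Gaussian entries
    P = np.zeros((n, n), dtype=complex)
    P[-1, -1] = 1j * gamma              # purely imaginary rank-one diagonal perturbation
    return W + P

rng = np.random.default_rng(3)
n = 30  # illustrative size (assumption)

up = np.linalg.eigvals(perturbed_wigner(n, 1.0, rng))     # gamma > 0
down = np.linalg.eigvals(perturbed_wigner(n, -1.0, rng))  # gamma < 0

print("gamma = +1: smallest imaginary part:", up.imag.min())
print("gamma = -1: largest imaginary part :", down.imag.max())
```

Because the trace of the perturbed matrix has imaginary part $\gamma$, the imaginary parts of the eigenvalues sum to $\gamma$; individual imaginary parts can nevertheless be very small, so a printed value close to zero is not by itself a sign of failure.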

The next result is similar to Theorem 2.1 and applies to perturbations of random sample covariance matrices.

###### Theorem 2.3.

Let be an absolutely continuous real random variable. Let be a sample covariance matrix with atom variable and parameters , where and are positive integers, and take . Let be any -dimensional standard basis vector, so one coordinate of equals 1 and the rest are 0. Then, with probability ,

• if , then has eigenvalues with positive imaginary part, and

• if , then has eigenvalues with negative imaginary part.

The remaining eigenvalues of (if any) are all equal to 0.

Figure 2 gives a numerical demonstration of Theorem 2.3. We conjecture that Theorem 2.3 can also be extended to the case when is a discrete random variable which satisfies a non-degeneracy condition, such as (2.1). In order to prove such a result, one would need to extend the results of [35, 52] to the sample covariance case; we do not pursue this matter here.

Below, we give a deterministic result with a similar flavor to Theorem 2.1 and Theorem 2.3 which applies to any Hermitian matrix.

###### Theorem 2.4.

Let be an Hermitian matrix with eigenvalues (including repetitions). Assume that the eigenvalues are distinct. Then there exists a column vector such that shares the eigenvalues with and also the following holds:

• if , then has eigenvalues with positive imaginary part, and

• if , then has eigenvalues with negative imaginary part.

Furthermore, there are many choices for the vector : if are eigenvectors corresponding to the distinct eigenvalues of , then any suffices, so long as the complex numbers for all .

Theorem 2.4 has a natural corollary applying to multiplicative perturbations of the form , where can be any Hermitian matrix, including a sample covariance matrix (see Corollary 3.6). In fact, we will prove Theorem 2.3 essentially by combining a version of Theorem 2.4 (see Lemma 3.5) with a lemma showing the necessary conditions on the eigenvalues and eigenvectors are satisfied with probability 1 (see Lemma 3.7).

### 2.2. Global behavior of the eigenvalues

As Theorem 2.1 shows, it is possible that no single eigenvalue of the sum is real, even when is random and has low rank. However, we can still describe the limiting behavior of the eigenvalues. We do so in Theorem 2.5 and Theorem 2.6 below, both of which are consequences of Theorem 1.3.

Recall that Wigner’s semicircle law is the measure $\mu_{sc}$ on $\mathbb{R}$ with density given in (1.6). Here and in the sequel, we view $\mu_{sc}$ as a measure on $\mathbb{C}$. In particular, Definition 1.9 defines what it means for a sequence of probability measures on $\mathbb{C}$ to converge to $\mu_{sc}$. We observe that, as a measure on $\mathbb{C}$, $\mu_{sc}$ is not absolutely continuous (with respect to Lebesgue measure). In particular, the density given in (1.6) is not the density of $\mu_{sc}$ when viewed as a measure on $\mathbb{C}$. However, if $f : \mathbb{C} \to \mathbb{C}$ is a bounded continuous function, then

$$ \int_{\mathbb{C}} f(z) \, d\mu_{sc}(z) = \int_{\mathbb{R}} f(x) \, d\mu_{sc}(x) = \int_{-2}^{2} f(x) \rho_{sc}(x) \, dx. $$

###### Theorem 2.5 (Perturbed Wigner matrices).

Let and be real random variables, and assume has unit variance. For each , let be an real symmetric Wigner matrix with atom variables and , and let be an arbitrary deterministic matrix. If

$$ \lim_{n \to \infty} \frac{1}{\sqrt{n}} \|P_n\|_2 = 0, \tag{2.2} $$

then, with probability , the empirical spectral measure of converges weakly on as to the (non-random) measure .

In the coming sections, we will typically consider the case when and . In this case, it follows that , and hence (2.2) is satisfied.

We similarly consider the Marchenko–Pastur law as a measure on . For perturbed sample covariance matrices, we can also recover the Marchenko–Pastur law as the limiting distribution.

###### Theorem 2.6 (Perturbed sample covariance matrices).

Let be a real random variable with mean zero, unit variance, and finite fourth moment. For each , let be a sample covariance matrix with atom variable and parameters , where is a function of such that as . Let be an arbitrary deterministic matrix. If (2.2) holds, then, with probability , the empirical spectral measure of converges weakly on as to the (non-random) measure .

### 2.3. More refined behavior of the eigenvalues

While Theorems 2.5 and 2.6 show that all but a vanishing proportion of the eigenvalues of converge to the real line, the results do not quantitatively answer Question 1.2. We now give a more quantitative bound on the imaginary part of the eigenvalues.

We first consider the Wigner case. For any $\delta > 0$, let $\mathcal{E}^{sc}_{\delta}$ denote the $\delta$-neighborhood of the interval $[-2, 2]$ in the complex plane. That is,

$$ \mathcal{E}^{sc}_{\delta} := \left\{ z \in \mathbb{C} : \inf_{x \in [-2,2]} |z - x| \leq \delta \right\}. $$

Here, we work with the interval $[-2, 2]$ as this is the support of the semicircle distribution $\mu_{sc}$. Our main results below are motivated by the following result from [38], which describes the eigenvalues of $\frac{1}{\sqrt{n}} W_n + P_n$ when $W_n$ is a Wigner matrix and $P_n$ is a deterministic matrix with rank $O(1)$. In the numerical analysis literature [30], such perturbations of Hermitian matrices are sometimes referred to as nearly Hermitian matrices.

###### Theorem 2.7 (Theorem 2.4 from [38]).

Let and be real random variables. Assume has mean zero, unit variance, and finite fourth moment; suppose has mean zero and finite variance. For each , let be an Wigner matrix with atom variables and . Let and . For each , let be an deterministic matrix, where and . Suppose for sufficiently large, there are no nonzero eigenvalues of which satisfy

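$$ \lambda_i(P_n) + \frac{1}{\lambda_i(P_n)} \in \mathcal{E}^{sc}_{3\delta} \setminus \mathcal{E}^{sc}_{\delta} \quad \text{with} \quad |\lambda_i(P_n)| > 1, $$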

and there are eigenvalues for some which satisfy

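$$ \lambda_i(P_n) + \frac{1}{\lambda_i(P_n)} \in \mathbb{C} \setminus \mathcal{E}^{sc}_{3\delta} \quad \text{with} \quad |\lambda_i(P_n)| > 1. $$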

Then, almost surely, for sufficiently large, there are exactly eigenvalues of in the region , and after labeling the eigenvalues properly,

$$ \lambda_i\left( \frac{1}{\sqrt{n}} W_n + P_n \right) = \lambda_i(P_n) + \frac{1}{\lambda_i(P_n)} + o(1) \tag{2.3} $$

for each .

See Figure 3 for a numerical example illustrating Theorem 2.7. Recently, Rochet [44] has obtained the rate of convergence for the eigenvalues characterized in (2.3) as well as a description of their fluctuations.
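The outlier prediction (2.3) can be examined with a short simulation. The sketch below assumes a GOE-type Wigner matrix and the rank-one perturbation $P_n$ whose only nonzero entry is $\theta\sqrt{-1}$ in position $(1,1)$, with the illustrative choices $\theta = 2$ and $n = 1000$, so that the predicted outlier is $\theta\sqrt{-1} + 1/(\theta\sqrt{-1}) = 1.5\sqrt{-1}$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, theta = 1000, 2.0  # illustrative parameters (assumptions)

# GOE-type Wigner matrix (assumed choice of atom variables).
G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2)

# Rank-one, non-Hermitian perturbation with single nonzero eigenvalue lam.
lam = 1j * theta
P = np.zeros((n, n), dtype=complex)
P[0, 0] = lam

eigs = np.linalg.eigvals(W / np.sqrt(n) + P)
outlier = eigs[np.argmax(eigs.imag)]   # eigenvalue farthest above the real line
predicted = lam + 1 / lam              # right-hand side of (2.3) without the o(1) term

print("observed outlier :", outlier)
print("predicted outlier:", predicted)
```

At finite $n$ the observed outlier deviates from the prediction by a random amount that shrinks as $n$ grows, so only approximate agreement should be expected.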

While Theorem 2.7 describes the location of the “outlier” eigenvalues, it says nothing substantial about the eigenvalues in . We address this point in the following theorem. We will require that the atom variables and satisfy the following assumptions.

###### Definition 2.8 (Condition C0).

We say the atom variables and satisfy condition C0 if either one of the following conditions holds.

1. and both have mean zero, unit variance, and finite moments of all orders; that is, for every non-negative integer , there exists a constant such that

$$ \mathbb{E}|\xi|^p + \mathbb{E}|\zeta|^p \leq C_p. $$
2. and both have mean zero, has unit variance, has variance , , and both and have sub-exponential tails, that is, there exists such that

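$$ \mathbb{P}(|\xi| \geq t) \leq \vartheta^{-1} \exp(-t^{\vartheta}) \quad \text{and} \quad \mathbb{P}(|\zeta| \geq t) \leq \vartheta^{-1} \exp(-t^{\vartheta}) $$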

for all .

###### Theorem 2.9 (Location of the eigenvalues for Wigner matrices).

Let and be real random variables which satisfy condition C0. For each , let be an Wigner matrix with atom variables and . Let and . For each , let be an deterministic matrix, where and . Suppose for sufficiently large, there are no eigenvalues of which satisfy

$$ \left| 1 - |\lambda_i(P_n)| \right| < \delta $$

and there are eigenvalues for some which satisfy

$$ |\lambda_i(P_n)| \geq 1 + \delta. \tag{2.4} $$

Then, there exists a constant satisfying such that, for every , the following holds. Almost surely, for sufficiently large, there are exactly eigenvalues of in the region , and after labeling the eigenvalues properly,

$$ \lambda_i\left( \frac{1}{\sqrt{n}} W_n + P_n \right) = \lambda_i(P_n) + \frac{1}{\lambda_i(P_n)} + o(1) $$

for each . In addition, almost surely, for sufficiently large, the remaining eigenvalues of satisfy

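$$ \sup_{j+1 \leq i \leq n} \left| \operatorname{Im}\left( \lambda_i\left( \frac{1}{\sqrt{n}} W_n + P_n \right) \right) \right| \leq n^{-1+\varepsilon} $$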

and

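$$ \sup_{j+1 \leq i \leq n} \left| \operatorname{Re}\left( \lambda_i\left( \frac{1}{\sqrt{n}} W_n + P_n \right) \right) \right| \leq 2 + n^{-2/3+\varepsilon}. $$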

In other words, Theorem 2.9 states that, apart from the “outlier” eigenvalues noted in Theorem 2.7, all of the other eigenvalues are within $n^{-1}$ of the real line, up to $n^{\varepsilon}$ multiplicative corrections. In addition, the real parts of the eigenvalues highly concentrate in the region $[-2, 2]$.
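To get a feel for the size of the imaginary parts of the non-outlier eigenvalues, the following sketch removes the single outlier and records the largest imaginary and real parts that remain. It assumes a GOE-type Wigner matrix, a perturbation with one nonzero entry $2\sqrt{-1}$, and two illustrative sizes; the helper `bulk_eigenvalues` is hypothetical, and the experiment is only suggestive of the asymptotic statement.

```python
import numpy as np

def bulk_eigenvalues(n, theta, rng):
    """Eigenvalues of W/sqrt(n) + P with the single outlier removed,
    where P has the one nonzero entry theta * sqrt(-1) at position (1, 1)."""
    G = rng.standard_normal((n, n))
    W = (G + G.T) / np.sqrt(2)
    P = np.zeros((n, n), dtype=complex)
    P[0, 0] = 1j * theta
    eigs = np.linalg.eigvals(W / np.sqrt(n) + P)
    return np.delete(eigs, np.argmax(eigs.imag))  # drop the eigenvalue farthest from the real line

rng = np.random.default_rng(7)
results = {}
for n in (200, 400):  # illustrative sizes (assumption)
    bulk = bulk_eigenvalues(n, 2.0, rng)
    max_im = np.max(np.abs(bulk.imag))
    max_re = np.max(np.abs(bulk.real))
    results[n] = (max_im, max_re)
    print(f"n = {n}: max |Im| = {max_im:.5f}, n * max |Im| = {n * max_im:.3f}, max |Re| = {max_re:.4f}")
```

Printing `n * max |Im|` makes it easy to see whether the largest imaginary part is shrinking at roughly the rate $n^{-1}$, up to the slowly growing factor that the theorem allows.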

Similar results to Theorem 2.9 have also appeared in the mathematical physics literature due to the role random matrix theory plays in describing scattering in chaotic systems. We refer the interested reader to [23, 24, 25, 26, 46] and references therein. In particular, these works focus on the case when is drawn from the GOE or its complex Hermitian relative, the Gaussian unitary ensemble (GUE), and is of the form , where is a positive definite matrix of low rank. We emphasize that the methods used in [23, 24, 25, 26, 46] only apply to the GUE and GOE, while Theorem 2.9 applies to a large class of Wigner matrices.

We now consider the sample covariance case. For simplicity, we will consider sample covariance matrices with parameters , that is, sample covariance matrices of the form , where is . For any , let be the -neighborhood of in the complex plane. That is,

$$ \mathcal{E}^{MP}_{\delta} := \left\{ z \in \mathbb{C} : \inf_{x \in [0,4]} |z - x| \leq \delta \right\}. $$

Here, we work on as this is the support of . Our main result for sample covariance matrices is the following theorem. We assume the atom variable of satisfies the following condition.

###### Definition 2.10 (Condition C1).

We say the atom variable satisfies condition C1 if has mean zero, unit variance, and finite moments of all orders, that is, for every non-negative integer there exists a constant such that

$$ \mathbb{E}|\xi|^p \leq C_p. $$

###### Theorem 2.11 (Location of the eigenvalues for sample covariance matrices).

Let be a real random variable which satisfies condition C1. For each , let be an sample covariance matrix with atom variable and parameters . Let and . For each , let be an deterministic matrix, where and . Suppose for sufficiently large, there are no eigenvalues of which satisfy

$$ \left| 1 - |\lambda_i(P_n)| \right| < \delta \tag{2.5} $$

and there are eigenvalues for some which satisfy

$$ |\lambda_i(P_n)| \geq 1 + \delta. \tag{2.6} $$

Then, there exists a constant satisfying such that, for every , the following holds. Almost surely, for sufficiently large, there are exactly eigenvalues of in the region , and after labeling the eigenvalues properly,

for each . In addition, almost surely, for sufficiently large, the remaining eigenvalues of either lie in the disk or the eigenvalues satisfy

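$$ \sup_{j+1 \leq i \leq n} \left| \operatorname{Im}\left( \lambda_i\left( \frac{1}{n} S_n (I + P_n) \right) \right) \right| \leq n^{-1+\varepsilon}, $$

$$ \inf_{j+1 \leq i \leq n} \operatorname{Re}\left( \lambda_i\left( \frac{1}{n} S_n (I + P_n) \right) \right) > 0, $$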

and

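$$ \sup_{j+1 \leq i \leq n} \operatorname{Re}\left( \lambda_i\left( \frac{1}{n} S_n (I + P_n) \right) \right) \leq 4 + n^{-2/3+\varepsilon}. $$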

See Figure 4 for a numerical demonstration of Theorem 2.11.

### 2.4. Eigenvectors of random perturbations

In this section, we mention a few results concerning the eigenvectors of perturbed Wigner matrices. In particular, using the methods developed by Benaych-Georges and Nadakuditi [8], when the perturbation is random (and independent of ), we can say more about the eigenvectors of . For simplicity, we consider the case when is a rank one normal matrix.

###### Theorem 2.12 (Eigenvectors: Wigner case).

Let and be real random variables. Assume has mean zero, unit variance, and finite fourth moment; suppose has mean zero and finite variance. For each , let be an Wigner matrix with atom variables and . In addition, for each , let be a random vector uniformly distributed on the unit sphere in or, respectively, in ; and let be independent of . Set , where and . Then there exists (depending on ) such that the following holds. Almost surely, for sufficiently large, there is exactly one eigenvalue of outside , and this eigenvalue takes the value . Let be the unit eigenvector corresponding to this eigenvalue. Then, almost surely,

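$$ |u_n^* v_n|^2 \longrightarrow \frac{\left| \int \frac{\rho_{sc}(x)}{x - \tilde{\theta}} \, dx \right|^2}{\int \frac{\rho_{sc}(x)}{|x - \tilde{\theta}|^2} \, dx} $$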

as , where and is the density of the semicircle law given in (1.6).

The first part of Theorem 2.12 follows from Theorem 2.7. The second part, regarding the unit eigenvector, is new. In particular, Theorem 2.12 describes the portion of the vector pointing in direction . In the case when is Hermitian (i.e. ), we recover a special case of [8, Theorem 2.2]. We also have the following version for the sample covariance case.

###### Theorem 2.13 (Eigenvectors: sample covariance case).

Let be a real random variable which satisfies condition C1. For each , let be an sample covariance matrix with atom variable and parameters . In addition, for each , let be a random vector uniformly distributed on the unit sphere in or, respectively, in ; and let be independent of . Set , where and . Then there exists (depending on ) such that the following holds. Almost surely, for sufficiently large, there is exactly one eigenvalue of outside , and this eigenvalue takes the value . Let be the unit eigenvector corresponding to this eigenvalue. Then, almost surely,

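$$ |u_n^* v_n|^2 \longrightarrow \frac{\left| \int \frac{x \rho_{MP,1}(x)}{x - \hat{\theta}} \, dx \right|^2}{\int \frac{x^2 \rho_{MP,1}(x)}{|x - \hat{\theta}|^2} \, dx} $$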

as , where and is the density of the Marchenko-Pastur law given in (1.7) and (1.8) with .

###### Remark 2.14.

Both Theorem 2.12 and Theorem 2.13 can be extended to the case when is non-normal. For instance, Theorem 2.12 holds when , where , where and are unit vectors, where is a random vector uniformly distributed on the unit sphere in or, respectively, in , and where is independent of (but and are not necessarily assumed independent of each other). In this case, has eigenvalue with corresponding unit eigenvector . Since is now random and (possibly) dependent on , it must be additionally assumed that almost surely for some . Similarly, both theorems can also be extended to the case when has rank larger than one. These extensions can be proved by modifying the arguments presented in Section 6.

### 2.5. Critical points of characteristic polynomials

As an application of our main results, we study the critical points of characteristic polynomials of random matrices. Recall that a critical point of a polynomial is a root of its derivative . There are many results concerning the location of critical points of polynomials whose roots are known. For example, the famous Gauss–Lucas theorem offers a geometric connection between the roots of a polynomial and the roots of its derivative.

###### Theorem 2.15 (Gauss–Lucas; Theorem 6.1 from [34]).

If is a non-constant polynomial with complex coefficients, then all zeros of belong to the convex hull of the set of zeros of .
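The Gauss–Lucas theorem can be checked numerically on a random polynomial. The sketch below assumes NumPy and seven illustrative Gaussian roots; the convex hull is computed with Andrew's monotone chain algorithm, and the helper names `cross`, `convex_hull`, and `in_hull` are hypothetical.

```python
import numpy as np

def cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def in_hull(hull, p, tol=1e-9):
    """True if p lies in the convex polygon `hull` (counter-clockwise vertices)."""
    m = len(hull)
    return all(cross(hull[k], hull[(k + 1) % m], p) >= -tol for k in range(m))

rng = np.random.default_rng(5)
roots = rng.standard_normal(7) + 1j * rng.standard_normal(7)  # illustrative zeros

coeffs = np.poly(roots)               # monic polynomial with the given zeros
crit = np.roots(np.polyder(coeffs))   # zeros of the derivative (critical points)

hull = convex_hull([(float(r.real), float(r.imag)) for r in roots])
inside = [in_hull(hull, (float(c.real), float(c.imag))) for c in crit]
print("all critical points inside the convex hull of the zeros:", all(inside))
```

Forming the coefficients with `np.poly` is adequate for low degree; for polynomials of high degree the coefficient representation becomes ill-conditioned and a different method for locating critical points would be preferable.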

Pemantle and Rivin [36] initiated the study of a probabilistic version of the Gauss–Lucas theorem. In particular, they studied the critical points of the polynomial

$$ p_n(z) := (z - X_1) \cdots (z - X_n) \tag{2.7} $$

when are iid complex-valued random variables. Their results were later generalized by Kabluchko [29] to the following.

###### Theorem 2.16 (Theorem 1.1 from [29]).

Let be any probability measure on . Let be a sequence of iid random variables with distribution . For each , let be the degree polynomial given in (2.7). Then the empirical measure constructed from the critical points of converges weakly to in probability as .

Theorem 2.16 was later extended by the first author; indeed, in [37], the critical points of characteristic polynomials for certain classes of random matrices drawn from the compact classical matrix groups are studied. Here, we extend these results to classes of perturbed random matrices. For an matrix , we let denote the characteristic polynomial of . In this way, the empirical spectral measure can be viewed as the empirical measure constructed from the roots of . Let denote the empirical measure constructed from the critical points of . That is,

$$ \mu'_A := \frac{1}{n-1} \sum_{k=1}^{n-1} \delta_{\xi_k}, $$

where are the critical points of counted with multiplicity.

We prove the following result for Wigner random matrices.

###### Theorem 2.17.

Assume , , , and satisfy the assumptions of Theorem 2.9. In addition, for sufficiently large, assume that the eigenvalues of which satisfy (2.4) do not depend on . Then, almost surely, converges weakly on as to the (non-random) measure .

A numerical example of Theorem 2.17 appears in Figure 5.

###### Remark 2.18.

Due to the outlier eigenvalues (namely, those eigenvalues in ) observed in Theorems 2.7 and 2.9, the convex hull of the eigenvalues of can be quite large. In particular, it does not follow from the Gauss–Lucas theorem (Theorem 2.15) that the majority of the critical points converge to the real line. On the other hand, besides describing the limiting distribution, Theorem 2.17 asserts that all but of the critical points converge to the real line.

### 2.6. Outline

The rest of the paper is devoted to the proofs of our main results. Section 3 is devoted to the proofs of Theorem 2.1, Theorem 2.3, and Theorem 2.4. The results in Section 2.2 are proven in Section 4. Section 5 contains the proof of our main result from Section 2.3. Theorem 2.12 and Theorem 2.13 are proven in Section 6. We prove Theorem 2.17 in Section 7. Lastly, the appendix contains a number of auxiliary results.

## 3. Proof of Theorem 2.1, Theorem 2.3, and Theorem 2.4

### 3.1. Proof of Theorem 2.1

We begin this section with a proof of Theorem 2.1. The proof of Theorem 2.4 is in Subsection 3.2, and the proof of Theorem 2.3 is in Subsection 3.3.

Recall that two non-constant polynomials and with real coefficients have weakly interlacing zeros if

• their degrees are equal or differ by one,

• their zeros are all real, and

• there exists an ordering such that

$$ \alpha_1 \leq \beta_1 \leq \alpha_2 \leq \beta_2 \leq \cdots \leq \alpha_s \leq \beta_s \leq \cdots, \tag{3.1} $$

where are the zeros of one polynomial and are those of the other.

If, in the ordering (3.1), no equality sign occurs, then and have strictly interlacing zeros. Analogously, we say two Hermitian matrices have weakly or strictly interlacing eigenvalues if the respective interlacing property holds for the zeros of their characteristic polynomials.

We begin with the following criterion, which completely characterizes when two polynomials have strictly interlacing zeros.

###### Theorem 3.1 (Hermite–Biehler; Theorem 6.3.4 from [41]).

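Let $f$ and $g$ be non-constant polynomials with real coefficients. Then $f$ and $g$ have strictly interlacing zeros if and only if the polynomial $h(z) := f(z) + \sqrt{-1}\, g(z)$ has all its zeros in the upper half-plane $\{z \in \mathbb{C} : \operatorname{Im}(z) > 0\}$ or in the lower half-plane $\{z \in \mathbb{C} : \operatorname{Im}(z) < 0\}$.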

We now extend Theorem 3.1 to the characteristic polynomial of a perturbed Hermitian matrix.

###### Lemma 3.2.

Let $n \geq 2$. Suppose $A$ is an $n \times n$ Hermitian matrix of the form

$$ A = \begin{bmatrix} B & X \\ X^* & d \end{bmatrix}, $$

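where $B$ is an $(n-1) \times (n-1)$ Hermitian matrix, $X \in \mathbb{C}^{n-1}$, and $d \in \mathbb{R}$. Let $P$ be the diagonal matrix $P = \operatorname{diag}(0, \ldots, 0, \gamma\sqrt{-1})$ for some $\gamma \in \mathbb{R}$ with $\gamma \neq 0$. Then all eigenvalues of $A + P$ are in the upper half-plane $\{z \in \mathbb{C} : \operatorname{Im}(z) > 0\}$ or the lower half-plane $\{z \in \mathbb{C} : \operatorname{Im}(z) < 0\}$ if and only if the eigenvalues of $A$ and $B$ strictly interlace.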

###### Proof.

We observe that the characteristic polynomial of can be written as

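$$ \begin{aligned} \det(A + P - zI) &= \det \begin{pmatrix} B - zI & X \\ X^* & d - z + \sqrt{-1}\gamma \end{pmatrix} \\ &= \det \begin{pmatrix} B - zI & X \\ X^* & d - z \end{pmatrix} + \det \begin{pmatrix} B - zI & X \\ 0 & \sqrt{-1}\gamma \end{pmatrix} \\ &= \det(A - zI) + \sqrt{-1}\gamma \det(B - zI) \end{aligned} $$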

by linearity of the determinant in the last row. Since $A$ and $B$ are Hermitian matrices, it follows that their characteristic polynomials $\det(A - zI)$ and $\det(B - zI)$ are non-constant polynomials with real coefficients. Hence, the polynomial $\gamma \det(B - zI)$ is also a non-constant polynomial with real coefficients, and it has the same zeros as $\det(B - zI)$ since $\gamma \neq 0$. Thus, the claim follows from Theorem 3.1. ∎
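The determinant identity used in the proof of Lemma 3.2, together with the conclusion of the lemma, can be verified on a random instance. The sketch below assumes NumPy, an illustrative size $n = 5$, $\gamma = 0.7$, and an arbitrary test point $z$; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n, gamma = 5, 0.7  # illustrative parameters (assumptions)

# Random Hermitian A and its upper-left (n-1) x (n-1) minor B.
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (G + G.conj().T) / 2
B = A[:-1, :-1]

# P = diag(0, ..., 0, gamma * sqrt(-1)).
P = np.zeros((n, n), dtype=complex)
P[-1, -1] = 1j * gamma

# Check det(A + P - zI) = det(A - zI) + sqrt(-1) * gamma * det(B - zI) at a test point.
z = 0.3 - 0.2j
lhs = np.linalg.det(A + P - z * np.eye(n))
rhs = np.linalg.det(A - z * np.eye(n)) + 1j * gamma * np.linalg.det(B - z * np.eye(n - 1))

# Strict interlacing of the eigenvalues of A and B, and the location of eig(A + P).
a = np.sort(np.linalg.eigvalsh(A))
b = np.sort(np.linalg.eigvalsh(B))
strict = bool(np.all(a[:-1] < b) and np.all(b < a[1:]))
eigs = np.linalg.eigvals(A + P)

print("determinant identity holds:", np.isclose(lhs, rhs))
print("eigenvalues of A and B strictly interlace:", strict)
print("smallest imaginary part of eig(A + P):", eigs.imag.min())
```

With $\gamma > 0$ and strict interlacing, every eigenvalue of $A + P$ should have positive imaginary part, matching the equivalence stated in the lemma.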

In order to prove Theorem 2.1, we will also need the following lemma, which is based on the arguments of Tao-Vu [52] and Nguyen-Tao-Vu [35].

###### Lemma 3.3 (Strict eigenvalue interlacing).

Let , and let be a real random variable such that (2.1) holds. Let be an arbitrary real random variable. Suppose is an Wigner matrix with atom variables and . Let be the upper-left minor of . Then, for any , there exists (depending on and ) such that, with probability at least , the eigenvalues of strictly interlace with the eigenvalues of .

Furthermore, if and are absolutely continuous random variables, then the eigenvalues of and strictly interlace with probability .

We present a proof of Lemma 3.3 in Appendix A based on the arguments from [35, 52]. We now complete the proof of Theorem 2.1.

###### Proof of Theorem 2.1.

Let be the upper-left minor of . We decompose