How Many Eigenvalues of a Random Symmetric Tensor are Real?


Paul Breiding
Abstract

This article answers a question posed by Draisma and Horobet [8], who asked for a closed formula for the expected number of real eigenvalues of a random real symmetric tensor drawn from the Gaussian distribution relative to the Bombieri norm. This expected value is equal to the expected number of real critical points on the unit sphere of a Kostlan polynomial. We also derive an exact formula for the expected absolute value of the determinant of a matrix from the Gaussian Orthogonal Ensemble.


Max-Planck Institute for Mathematics in the Sciences, Leipzig, breiding@mis.mpg.de.
Partially supported by DFG research grant BU 1371/2-2.

1 Introduction

The title of this article is a homage to the article [9], in which Edelman, Kostlan and Shub compute the expected number of real eigenvalues of a matrix filled with i.i.d. standard Gaussian random variables. This model of a random matrix is extended to matrices of higher order, called tensors. In [4] the author computed the expected number of real eigenvalues of a random tensor whose entries are i.i.d. standard Gaussian random variables. The content of this article is the computation of the expected number of real eigenvalues of a random symmetric tensor. Unlike symmetric matrices, symmetric tensors may have complex eigenvalues. But, as we will see, symmetric tensors in a sense have more real eigenvalues than non-symmetric tensors.

In this article, a tensor $A = (a_{i_1,\dots,i_d})$ is an array of numbers arranged in $d$ dimensions, where the $j$-th dimension is of length $n_j$. We call $d$ the order of $A$. We are interested in the case when $n_1, \dots, n_d$ are all equal to some $n$. The space of real tensors of order $d$ is denoted by $(\mathbb{R}^n)^{\otimes d}$. For $d \ge 3$ tensors are higher-dimensional analogues of matrices, which form the case $d = 2$. A tensor $A$ is called symmetric, if for all permutations $\pi$ on $d$ elements we have $a_{i_1,\dots,i_d} = a_{i_{\pi(1)},\dots,i_{\pi(d)}}$. We denote the vector space of symmetric tensors by $\mathrm{Sym}_d(\mathbb{R}^n)$. As in [4] we say that a random real tensor $A$ is a Gaussian tensor, if it has density proportional to $e^{-\lVert A \rVert^2/2}$, where $\lVert A \rVert^2 = \sum_{i_1,\dots,i_d} a_{i_1,\dots,i_d}^2$ is the square of the Frobenius norm.

Definition 1.1.

A Gaussian symmetric tensor is a random tensor $A \in \mathrm{Sym}_d(\mathbb{R}^n)$ given by the density proportional to $e^{-\lVert A \rVert^2/2}$ on $\mathrm{Sym}_d(\mathbb{R}^n)$ (Draisma and Horobet [8] call the distribution given by this density the Gaussian distribution with respect to the Bombieri norm).
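The following Python sketch shows one way to sample from this distribution; the function name is ours and not from the paper. It uses the fact that averaging a tensor over all permutations of its slots is the orthogonal projection onto $\mathrm{Sym}_d(\mathbb{R}^n)$, and that the orthogonal projection of a standard Gaussian vector onto a subspace follows the standard Gaussian distribution of that subspace.

\begin{verbatim}
from itertools import permutations
from math import factorial

import numpy as np

def gaussian_symmetric_tensor(n, d, rng):
    # Draw an i.i.d. standard Gaussian tensor and symmetrize it; the
    # symmetrization is the orthogonal projection onto Sym_d(R^n).
    A = rng.standard_normal((n,) * d)
    S = np.zeros_like(A)
    for perm in permutations(range(d)):
        S += np.transpose(A, perm)
    return S / factorial(d)

A = gaussian_symmetric_tensor(3, 3, np.random.default_rng(0))
\end{verbatim}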

The complex number $\lambda \in \mathbb{C}$ is called an eigenvalue of the tensor $A$, if there exists a vector $x \in \mathbb{C}^n$, such that the following equation holds:

(1.1)
\[ A x^{d-1} = \lambda x \quad \text{and} \quad x^T x = 1, \qquad \text{where } (A x^{d-1})_i := \sum_{i_2,\dots,i_d=1}^{n} a_{i,i_2,\dots,i_d}\, x_{i_2} \cdots x_{i_d}. \]

The pair $(\lambda, x)$ is called an eigenpair in this case. For $d = 2$, equation (1.1) is the defining equation of matrix eigenpairs. The condition $x^T x = 1$ serves for selecting a point from each class of eigenpairs: if $(\lambda, x)$ is an eigenpair, then so is $((-1)^d \lambda, -x)$. In particular, if $d$ is odd and if $\lambda$ is an eigenvalue of $A$, then $-\lambda$ is also an eigenvalue of $A$. To take into account this reflection property we make the following definition.
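As a concrete illustration of (1.1), the contraction $A x^{d-1}$ can be computed by contracting the last slot of $A$ with $x$ a total of $d-1$ times. The following minimal sketch (helper names ours) tests whether a given pair is an eigenpair up to numerical tolerance.

\begin{verbatim}
import numpy as np

def tensor_apply(A, x):
    # (A x^{d-1})_i = sum over i_2,...,i_d of a_{i,i_2,...,i_d} x_{i_2}...x_{i_d}
    T = A
    for _ in range(A.ndim - 1):
        T = T @ x          # contract the last index with x
    return T

def is_eigenpair(A, lam, x, tol=1e-10):
    # Checks A x^{d-1} = lam * x together with the normalization x^T x = 1.
    return (np.linalg.norm(tensor_apply(A, x) - lam * x) < tol
            and abs(x @ x - 1) < tol)
\end{verbatim}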

Definition 1.2.

If $d$ is odd we define the number of eigenvalues of $A$ to be the number of solutions of (1.1) divided by two. For even $d$ we define the number of eigenvalues of $A$ to be the number of solutions of (1.1).

For this definition Cartwright and Sturmfels [7] show that the number of complex eigenpairs for the generic tensor is $\frac{(d-1)^n - 1}{d-2}$ (for $d \ge 3$). In the following we use the notation
\[ E^{\mathrm{s}}_{n,d} := \mathbb{E} \, \# \{ \text{real eigenvalues of a Gaussian symmetric tensor in } \mathrm{Sym}_d(\mathbb{R}^n) \}. \]
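In code, the Cartwright and Sturmfels count reads as follows (for $d \ge 3$; the function name is ours):

\begin{verbatim}
def generic_eigenvalue_count(n, d):
    # A generic tensor of order d >= 3 on C^n has ((d-1)^n - 1)/(d-2)
    # eigenvalues in the sense of Definition 1.2.
    return ((d - 1) ** n - 1) // (d - 2)

assert generic_eigenvalue_count(2, 5) == 5   # for n = 2 the count is d
\end{verbatim}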

In Theorem 1.4 below we give an exact formula for $E^{\mathrm{s}}_{n,d}$. This complements our result from [4], where we have given an exact formula for the expected number $E_{n,d}$ of real eigenvalues of a Gaussian (non-symmetric) tensor.

This formula is given in terms of Gauss’ hypergeometric function and the Gamma function (see (2.2) and (2.4) for their definitions):

Furthermore, we have shown the following asymptotic formulas:

Auffinger et al. provide in [2, Theorem 2.17] the following formula:

(1.2)

Comparing this with (1.2) it is fair to say that:

For fixed $d$ and large $n$, and on the average, a real symmetric tensor has more eigenvalues than a real general tensor.

However, the point of view of Auffinger et al. is not eigenvalues of tensors, but critical points of the polynomial $f_A(x) := \langle A, x^{\otimes d} \rangle$ restricted to the unit sphere. If $A$ is Gaussian symmetric, $f_A$ is called a Kostlan polynomial. Indeed, every solution of (1.1) corresponds to a critical point of $f_A$ on the unit sphere and vice versa. This point of view is also taken by Fyodorov, Lerario and Lundberg [10], who argue that each connected component of the zero set of a polynomial contains at least one critical point. Consequently, $E^{\mathrm{s}}_{n,d}$ yields information about the average topology of the zero set of a Kostlan polynomial.
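A hedged numerical illustration of this correspondence (the helper names are ours): for symmetric $A$ the gradient of $f_A$ is $d \cdot A x^{d-1}$, and its projection onto the tangent space of the unit sphere vanishes exactly at the solutions of (1.1).

\begin{verbatim}
import numpy as np

def contract(A, x, k):
    # Contract the last k slots of A with the vector x.
    T = A
    for _ in range(k):
        T = T @ x
    return T

def spherical_gradient(A, x):
    # grad f_A(x) = d * A x^{d-1}; removing the radial component gives the
    # gradient of f_A restricted to the sphere. At a zero of this field,
    # x is an eigenvector with eigenvalue lam = <x, A x^{d-1}>.
    g = A.ndim * contract(A, x, A.ndim - 1)
    return g - (g @ x) * x
\end{verbatim}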

Using the formula from Theorem 1.4 we get the following asymptotic formulas for large :

(1.3)

where and denote the following matrices depending on :

(1.4)

and where

(1.5)

Note that , where are the rational polynomials from Theorem 1.4.

At first glance the formulas in (1.3) don’t provide much insight, and unfortunately we don’t know how to simplify them any further, nor do we know how to compute the leading order as in (1.2). But we have the following interesting corollary:

Corollary 1.3.

For fixed we have

Proof.

For all we have for some ; see, e.g., [14, 43:4:3]. In both formulas in (1.3) there are as many Gamma factors in the numerator as there are in the denominator. ∎

Figure 1.1: The histogram shows the output of the following experiment: for we sampled Gaussian symmetric tensors in and computed the number of real eigenvalues using [6]. Theorem 1.4 predicts , , , and .
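For $n = 2$ such an experiment can be reproduced with elementary means (the paper uses the solver [6]; the sketch below is ours): a direction $x \in S^1$ is an eigenvector of $A \in \mathrm{Sym}_d(\mathbb{R}^2)$ exactly when the degree-$d$ binary form $g(x_1, x_2) = (Ax^{d-1})_1 x_2 - (Ax^{d-1})_2 x_1$ vanishes, so counting real eigenvector directions amounts to counting real roots of $g$ (for even $d$ one may still have to account for the factor of two discussed after Definition 1.2).

\begin{verbatim}
import numpy as np

def apply_tensor(A, x):
    T = A
    for _ in range(A.ndim - 1):
        T = T @ x
    return T

def count_real_eigenvector_directions_n2(A, tol=1e-7):
    # Recover g(1, t) by interpolation at d+1 nodes and count its real
    # roots; a vanishing leading coefficient signals the additional
    # eigenvector direction x = (0, 1) ("root at infinity").
    d = A.ndim
    ts = np.linspace(-1.0, 1.0, d + 1)
    gs = []
    for t in ts:
        y = apply_tensor(A, np.array([1.0, t]))
        gs.append(y[0] * t - y[1])
    coeffs = np.polynomial.polynomial.polyfit(ts, gs, d)
    roots = np.polynomial.polynomial.polyroots(coeffs)
    count = int(np.sum(np.abs(roots.imag) < tol))
    if abs(coeffs[-1]) < tol:
        count += 1
    return count
\end{verbatim}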

Here is our main theorem. We give a proof in Section 3.

Theorem 1.4 (An exact formula for ).

We define the rational polynomial functions

Then, for all we have

Using essentially the same argument as for Corollary 1.3 we can prove the following.

Corollary 1.5.

We have and

Table 1: Formulas for $E^{\mathrm{s}}_{n,d}$ for $n = 2, 3, 4, 5, 6, 7$ (the first column of the table). We used sage [15] to derive the middle column. The source code of the scripts is given in Section A.

1.1 Random matrix theory

Gaussian symmetric tensors of order $d = 2$ are better known under another name: the Gaussian Orthogonal Ensemble (GOE). If $A$ is a matrix from this ensemble, we write $A \sim \mathrm{GOE}(n)$. For $\lambda \in \mathbb{R}$, let us denote

(1.6)
\[ D_n(\lambda) := \mathop{\mathbb{E}}_{A \sim \mathrm{GOE}(n)} \bigl\lvert \det(A - \lambda \mathrm{I}_n) \bigr\rvert, \]

where $\mathrm{I}_n$ is the identity matrix. The proof of Theorem 1.4 is based on the computation of $D_n(\lambda)$. We remark that $\lvert \mathbb{E} \det(A - \lambda \mathrm{I}_n) \rvert \le D_n(\lambda)$ by the triangle inequality. A computation of $\mathbb{E} \det(A - \lambda \mathrm{I}_n)$ can be found in [12, Section 22] and the ideas in this paper are inspired by the computations in this reference. The following result, which is new to the best of our knowledge, shows that $D_n(\lambda)$ can be expressed in terms of the normal distribution function $\Phi$ and a collection of Hermite polynomials.
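A Monte Carlo sanity check of this expectation is straightforward. The sketch below (function names ours) uses the normalization that follows from Definition 1.1 with $d = 2$: unit variance on the diagonal and variance $1/2$ off the diagonal.

\begin{verbatim}
import numpy as np

def goe(n, rng):
    # Density proportional to exp(-||A||_F^2 / 2) on symmetric matrices:
    # diagonal entries N(0, 1), off-diagonal entries N(0, 1/2).
    G = rng.standard_normal((n, n))
    return (G + G.T) / 2

def mc_expected_abs_det(n, lam, samples=100000, seed=0):
    rng = np.random.default_rng(seed)
    I = np.eye(n)
    return np.mean([abs(np.linalg.det(goe(n, rng) - lam * I))
                    for _ in range(samples)])
\end{verbatim}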

Theorem 1.6 (The expected absolute value of the determinant of a GOE matrix).

Let $n$ be fixed. Define the functions via

where $\mathrm{He}_j$ is the $j$-th (probabilist’s) Hermite polynomial; see (2.16). Then, we have

Here, and are the matrices from (1.4).

Remark.

The computation of $E^{\mathrm{s}}_{n,d}$ is based on the formula (3.1) by Draisma and Horobet, which involves the expectation $D_n(\lambda)$ from (1.6). In the recent article [5], together with Khazhgali Kozhasov and Antonio Lerario, we have computed the volume of the set of matrices with repeated eigenvalues, and this computation is also based on $D_n(\lambda)$.

1.2 Organization of the article

In the next section we give some preliminary material. Then, in Section 3 we prove Theorem 1.4. In Section 4 we compute several integrals that are used in the proof of Theorem 1.6, which we prove in Section 5.

1.3 Acknowledgements

The author wants to thank Antonio Lerario for helpful remarks on the structure of this article.

2 Preliminaries

We first fix notation: in what follows $n$ is always a positive integer and $m := \lfloor n/2 \rfloor$; that is, $n = 2m$ if $n$ is even, and $n = 2m+1$ if $n$ is odd. The symbols $x, y, \lambda$ will denote variables or real numbers. By capital calligraphic letters we denote matrices. Two symbols are reserved for the functions defined in (2.16), ${}_1F_1$ and ${}_2F_1$ denote the two hypergeometric functions defined in (2.1) and (2.2) below, and the symbol $\langle \cdot, \cdot \rangle$ always denotes the inner product defined in (2.18).

2.1 Special functions

Throughout the article we use a collection of special functions. We present them in this subsection. The Pochhammer polynomials [14, 18:3:1] are defined by $(a)_k := a(a+1)\cdots(a+k-1)$, where $k$ is a positive integer. If $k = 0$, the definition is $(a)_0 := 1$. Kummer’s confluent hypergeometric function [14, Sec. 47] is defined as

(2.1)
\[ {}_1F_1(a; b; x) := \sum_{k=0}^{\infty} \frac{(a)_k}{(b)_k} \frac{x^k}{k!}, \]

and Gauss’ hypergeometric function [14, Sec. 60] is defined as

(2.2)
\[ {}_2F_1(a, b; c; x) := \sum_{k=0}^{\infty} \frac{(a)_k (b)_k}{(c)_k} \frac{x^k}{k!}, \]

where $b$ and $c$ are not non-positive integers. In general these series need not converge for all $x$. But if either of the numeratorial parameters is a non-positive integer, both ${}_1F_1$ and ${}_2F_1$ reduce to polynomials and hence are defined for all $x$ (and this is the only case we will meet throughout the paper).
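Both functions are available in scipy, so the terminating (polynomial) cases are easy to check numerically; the parameter choices below are arbitrary examples of ours.

\begin{verbatim}
from scipy.special import hyp1f1, hyp2f1

x = 2.5
# 1F1(-2; 1/2; x) terminates: 1 - 4x + (4/3)x^2.
print(hyp1f1(-2, 0.5, x), 1 - 4 * x + (4 / 3) * x ** 2)
# 2F1(-1, 2; 3; x) terminates: 1 - (2/3)x, defined even for |x| > 1.
print(hyp2f1(-1, 2.0, 3.0, x), 1 - (2 / 3) * x)
\end{verbatim}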

Remark.

Other common notations are ${}_1F_1(a;b;x)$ and ${}_2F_1(a,b;c;x)$. This is due to the fact that both Kummer’s and Gauss’ hypergeometric functions are special cases of the general hypergeometric function ${}_pF_q$.

The following will be useful later.

Lemma 2.1.

Let be non-positive integers and . Then

Proof.

Since and are non-negative integers, and are polynomials, whose constant term is equal to . Therefore,

(2.3)

We have According to [14, 18:5:7] the latter is equal to and, moreover, . The claim follows when plugging this into (2.3). ∎

For $x > 0$ the Gamma function [14, Sec. 43] is defined as

(2.4)
\[ \Gamma(x) := \int_0^{\infty} t^{x-1} e^{-t} \, \mathrm{d}t. \]

The cumulative distribution function of the normal distribution [14, 40:14:2] and the error function [14, 40:3:2] are respectively defined as

(2.5)
\[ \Phi(x) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, \mathrm{d}t \qquad\text{and}\qquad \operatorname{erf}(x) := \frac{2}{\sqrt{\pi}} \int_0^{x} e^{-t^2} \, \mathrm{d}t. \]

The error function and $\Phi$ are related by the following equation [14, 40:14:2]:

(2.6)
\[ \Phi(x) = \frac{1}{2} \left( 1 + \operatorname{erf}\!\left( \frac{x}{\sqrt{2}} \right) \right). \]

The error function and Kummer’s confluent hypergeometric function are related by

(2.7)
\[ \operatorname{erf}(x) = \frac{2x}{\sqrt{\pi}} \, {}_1F_1\!\left( \tfrac{1}{2}; \tfrac{3}{2}; -x^2 \right), \]

see [1, 13.6.19].
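Both relations are easily verified numerically; the test point below is arbitrary.

\begin{verbatim}
import numpy as np
from scipy.special import erf, hyp1f1
from scipy.stats import norm

x = 0.7
# (2.6): Phi(x) = (1 + erf(x / sqrt(2))) / 2
print(norm.cdf(x), (1 + erf(x / np.sqrt(2))) / 2)
# (2.7): erf(x) = (2x / sqrt(pi)) * 1F1(1/2; 3/2; -x^2)
print(erf(x), 2 * x / np.sqrt(np.pi) * hyp1f1(0.5, 1.5, -x ** 2))
\end{verbatim}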

2.2 Hermite polynomials

Hermite polynomials are a family of polynomials that are defined as

(2.8)
\[ H_k(x) := (-1)^k \, e^{x^2} \frac{\mathrm{d}^k}{\mathrm{d}x^k} e^{-x^2}, \qquad k = 0, 1, 2, \dots, \]

see [14, 24:3:2]. An alternative Hermite function is defined by

(2.9)
\[ \mathrm{He}_k(x) := (-1)^k \, e^{x^2/2} \frac{\mathrm{d}^k}{\mathrm{d}x^k} e^{-x^2/2}. \]

The two definitions are related by the following equality [14, 24:1:1]

(2.10)
\[ \mathrm{He}_k(x) = 2^{-k/2} \, H_k\!\left( \frac{x}{\sqrt{2}} \right). \]

By [14, 24:5:1] we have that

(2.11)
\[ H_{k+1}(x) = 2x \, H_k(x) - 2k \, H_{k-1}(x). \]
Remark.

In the literature, the polynomials $H_k$ are sometimes called the physicists’ Hermite polynomials and the $\mathrm{He}_k$ are sometimes called the probabilists’ Hermite polynomials. We will refer to both simply as Hermite polynomials and distinguish them by using the respective symbols.

Hermite polynomials can be expressed in terms of Kummer’s confluent hypergeometric function from (2.1):

(2.12)
\[ H_{2k}(x) = (-1)^k \, \frac{(2k)!}{k!} \, {}_1F_1\!\left( -k; \tfrac{1}{2}; x^2 \right), \]
(2.13)
\[ H_{2k+1}(x) = (-1)^k \, \frac{(2k+1)!}{k!} \, 2x \, {}_1F_1\!\left( -k; \tfrac{3}{2}; x^2 \right); \]

see [1, 13.6.17 and 13.6.18].
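numpy implements both families of Hermite polynomials, so the relations (2.10) and (2.13) can be spot-checked; the index and test point below are arbitrary choices of ours.

\begin{verbatim}
import numpy as np
from numpy.polynomial.hermite import hermval      # physicists' H_k
from numpy.polynomial.hermite_e import hermeval   # probabilists' He_k
from scipy.special import hyp1f1
from math import factorial

x, k = 0.9, 3
c_k = [0.0] * k + [1.0]   # coefficient vector selecting the k-th polynomial
# (2.10): He_k(x) = 2^{-k/2} H_k(x / sqrt(2))
print(hermeval(x, c_k), 2 ** (-k / 2) * hermval(x / np.sqrt(2), c_k))
# (2.13): H_{2k+1}(x) = (-1)^k (2k+1)!/k! * 2x * 1F1(-k; 3/2; x^2)
c_7 = [0.0] * (2 * k + 1) + [1.0]
print(hermval(x, c_7),
      (-1) ** k * factorial(2 * k + 1) / factorial(k)
      * 2 * x * hyp1f1(-k, 1.5, x ** 2))
\end{verbatim}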

2.3 Orthogonality relations of Hermite polynomials

The Hermite polynomials satisfy the following orthogonality relations. By [11, 7.374.2] we have

(2.14)
\[ \int_{-\infty}^{\infty} e^{-x^2} H_j(x) H_k(x) \, \mathrm{d}x = \delta_{jk} \, 2^k \, \Gamma(k+1) \, \sqrt{\pi}, \]

where $\Gamma$ is the Gamma function from (2.4). More generally, by [3, p. 289, eq. (12)], if $j + k$ is even, we have for , , that

(2.15)

Here ${}_2F_1$ is Gauss’ hypergeometric function as defined in (2.2). Recall from (2.5) the definition of $\Phi$. In the following we abbreviate

(2.16)

and put

(2.17)

We can express the functions in terms of the .

Lemma 2.2.

We have

  1. For all : .

Proof.

Note that (2) is a direct consequence of (1). For (1) let and write

Thus as desired. ∎

We now fix the following notation: if two functions $f$ and $g$ satisfy $\int_{-\infty}^{\infty} f(x)^2 \, \mathrm{d}x < \infty$ and $\int_{-\infty}^{\infty} g(x)^2 \, \mathrm{d}x < \infty$, we define

(2.18)
\[ \langle f, g \rangle := \int_{-\infty}^{\infty} f(x) \, g(x) \, \mathrm{d}x. \]

The Cauchy-Schwarz inequality implies $\lvert \langle f, g \rangle \rvert < \infty$. The functions and satisfy the following orthogonality relations.
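As a numerical illustration of this inner product and of (2.14) in the normalization recalled above (helper names ours): the weight $e^{-x^2/2}$ on each factor combines to the weight $e^{-x^2}$ of the Hermite orthogonality relation.

\begin{verbatim}
import numpy as np
from scipy.integrate import quad
from numpy.polynomial.hermite import hermval
from math import factorial, sqrt, pi

def inner(f, g):
    # <f, g> = integral of f * g over the real line, as in (2.18).
    return quad(lambda x: f(x) * g(x), -np.inf, np.inf)[0]

def weighted_hermite(k):
    c = [0.0] * k + [1.0]
    return lambda x: np.exp(-x ** 2 / 2) * hermval(x, c)

print(inner(weighted_hermite(3), weighted_hermite(3)),
      2 ** 3 * factorial(3) * sqrt(pi))                  # equal
print(inner(weighted_hermite(3), weighted_hermite(5)))   # approximately 0
\end{verbatim}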

Lemma 2.3.

For all with we have

  1. .

Proof.

For (1) we have

(2.19)

where the fourth equality is due to the transformation and equation (2.11) and the fifth equality is obtained using the transformation . This shows (1) for the case odd. The case even is implied by (2), which we prove next.

Since and are not both zero, by (2.19), we may assume that . In this case, by Lemma 2.2, we have , so that

Combining this equation with (2.14), we have

In particular, for even, which finishes the proof of the first part of this lemma. The second part is proved by replacing and . The case and is a consequence of the case and and the first part of the lemma (we cannot prove this last case simply by plugging in, because might violate the assumption ). This finishes the proof. ∎

2.4 The expected value of Hermite polynomials

In this section we will compute the expected value of the Hermite polynomials when the argument follows a normal distribution.

Lemma 2.4.

For we have .

Proof.

Write

where the second equality is due to the change of variables . Applying [11, 7.373.2] we get

This finishes the proof. ∎
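The integral behind this proof can be spot-checked numerically. The identity below is our rescaled reading of [11, 7.373.2], obtained by the change of variables $x = u/\sqrt{2}$, so its normalization should be treated as an assumption to be verified against the table.

\begin{verbatim}
import numpy as np
from scipy.integrate import quad
from numpy.polynomial.hermite import hermval
from math import factorial, sqrt, pi

# Check E_{u ~ N(0,1)} [ H_{2k}(t u) ] = (2k)!/k! * (2 t^2 - 1)^k.
k, t = 2, 0.8
c = [0.0] * (2 * k) + [1.0]
lhs = quad(lambda u: np.exp(-u ** 2 / 2) / sqrt(2 * pi) * hermval(t * u, c),
           -np.inf, np.inf)[0]
rhs = factorial(2 * k) / factorial(k) * (2 * t ** 2 - 1) ** k
print(lhs, rhs)   # both approximately 0.9408
\end{verbatim}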

Lemma 2.5.

Let and recall from (2.16) the definition of , .

  1. If and is even, we have

  2. For all we have