Matrix factorizations of correlation matrices and applications

Anupam Prakash Nanyang Technological University, Singapore and Centre for Quantum Technologies, Singapore aprakash@ntu.edu.sg  and  Antonios Varvitsiotis Nanyang Technological University, Singapore and Centre for Quantum Technologies, Singapore avarvits@gmail.com
Abstract.

We introduce a notion of matrix valued Gram decompositions for correlation matrices whose study is motivated by quantum information theory. We show that for extremal correlations, the matrices in such a factorization generate a Clifford algebra and thus, their size is exponential in terms of the rank of the correlation matrix. Using this we give a self-contained and succinct proof of the existence of completely positive semidefinite matrices with sub-exponential cpsd-rank, recently derived in [11, 5]. This fact also underlies and generalizes Tsirelson’s seminal lower bound on the local dimension of a quantum system necessary to generate an extreme quantum correlation.

1. Introduction

The correlation matrix for a family of random variables $X_1,\ldots,X_n$ is the $n\times n$ matrix whose $(i,j)$ entry is equal to the correlation between $X_i$ and $X_j$, i.e.,

$\mathrm{corr}(X_i,X_j) = \dfrac{\mathbb{E}\big[(X_i-\mu_i)(X_j-\mu_j)\big]}{\sigma_i\,\sigma_j},$

where $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of $X_i$. Correlation matrices capture the association between random variables and their use is ubiquitous in statistics.

It is easy to verify that correlation matrices are positive semidefinite and have all diagonal entries equal to one. Conversely, any such matrix can be expressed as a correlation matrix for some family of random variables. Thus, the set of correlation matrices coincides with the $n$-dimensional elliptope, denoted by $\mathcal{E}_n$, defined as the set of symmetric positive semidefinite matrices with diagonal entries equal to one, i.e.,

$\mathcal{E}_n := \{X\in\mathcal{S}^n : X\succeq 0 \text{ and } X_{ii} = 1 \text{ for all } i\in[n]\}.$
The elliptope is a spectrahedral set whose structure has been extensively studied (e.g. see [3] and references therein). Its significance is illustrated by the fact that it corresponds to the feasible region of various semidefinite programs that are used to approximate NP-hard combinatorial optimization problems (e.g. MAX-CUT [4]).
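To make the definition concrete, here is a minimal numerical sketch (Python with numpy; the helper name gram_factor is ours, not from the paper) that checks membership in $\mathcal{E}_n$ and recovers a Gram decomposition by unit vectors from an eigendecomposition:

    import numpy as np

    def gram_factor(A, tol=1e-10):
        # A lies in the elliptope iff it is symmetric psd with unit diagonal.
        w, V = np.linalg.eigh(A)
        assert w.min() > -tol and np.allclose(np.diag(A), 1.0)
        # Rows of the returned matrix are vectors x_i with <x_i, x_j> = A_ij.
        return V * np.sqrt(np.clip(w, 0.0, None))

    A = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.5],
                  [0.0, 0.5, 1.0]])
    X = gram_factor(A)
    assert np.allclose(X @ X.T, A)                       # Gram decomposition of A
    assert np.allclose(np.linalg.norm(X, axis=1), 1.0)   # unit vectors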

In this work we mostly consider $\mathcal{E}_{n+m}$, i.e., the elliptope of size $n+m$. Each matrix in $\mathcal{E}_{n+m}$ admits a block decomposition

$X = \begin{pmatrix} A & C\\ C^\top & B\end{pmatrix}$, where $A\in\mathcal{E}_n$, $B\in\mathcal{E}_m$, and $C\in\mathbb{R}^{n\times m}$.

For any $X\in\mathcal{E}_{n+m}$ we write $X = \mathrm{Gram}(x_1,\ldots,x_n,y_1,\ldots,y_m)$ to indicate that the vectors $x_1,\ldots,x_n,y_1,\ldots,y_m$ form a Gram decomposition for $X$, i.e.,

$A_{ij} = \langle x_i,x_j\rangle \ (i,j\in[n]),\qquad B_{kl} = \langle y_k,y_l\rangle \ (k,l\in[m]),\qquad C_{ik} = \langle x_i,y_k\rangle \ (i\in[n],\ k\in[m]).$
As it turns out, the image of the elliptope $\mathcal{E}_{n+m}$ under the projection operator

(1)  $\pi : \mathcal{E}_{n+m}\to\mathbb{R}^{n\times m},\qquad \pi\begin{pmatrix} A & C\\ C^\top & B\end{pmatrix} := C,$

is of central importance to quantum information theory. The set $\mathrm{Cor}(n,m) := \pi(\mathcal{E}_{n+m})$ is known as the set of bipartite correlation matrices. This connection is explained in Section 4.
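As an illustrative sketch (assuming the notation above; any choice of unit vectors works), one can sample elements of $\mathrm{Cor}(n,m)$ by projecting Gram matrices of random unit vectors:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, k = 3, 4, 5
    X = rng.standard_normal((n, k)); X /= np.linalg.norm(X, axis=1, keepdims=True)
    Y = rng.standard_normal((m, k)); Y /= np.linalg.norm(Y, axis=1, keepdims=True)

    G = np.block([[X @ X.T, X @ Y.T],
                  [Y @ X.T, Y @ Y.T]])   # an element of the elliptope E_{n+m}
    C = G[:n, n:]                        # pi(G): a bipartite correlation matrix
    assert np.allclose(np.diag(G), 1.0)
    assert np.linalg.eigvalsh(G).min() > -1e-10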

In this work we introduce and study matrix factorizations of a specific form for correlation matrices. As explained below, such factorizations can be used to study a newly introduced notion of matrix rank relevant to quantum information theory. Informally, our main result is that for extreme points of the set of correlation matrices, the matrices in such a factorization generate a Clifford algebra.

Recall that the rank-$k$ Clifford algebra, denoted by $\mathscr{C}_k$, is the universal $C^*$-algebra generated by Hermitian indeterminates $x_1,\ldots,x_k$ satisfying the following relations:

(2)  $x_ix_j + x_jx_i = 2\delta_{ij}\,\mathbb{1},\qquad\text{for all } i,j\in[k].$

Furthermore, it is well-known that depending on the parity of $k$, the algebra $\mathscr{C}_k$ has either one or two irreducible representations, each of dimension $2^{\lfloor k/2\rfloor}$, e.g., see [4, Chapter 6].

Having introduced Clifford algebras, we now formally state our main technical result:

Theorem 1.1.

Let $A$ be an extreme point of $\mathcal{E}_n$ where $\mathrm{rank}(A) = r$. Consider Hermitian matrices $X_1,\ldots,X_n, K\in\mathcal{H}^d$ satisfying

  • (i) $X_i^2 = \mathbb{1}_d$ for all $i\in[n]$;

  • (ii) $A_{ij} = \mathrm{tr}(X_iX_jK)$ for all $i,j\in[n]$;

  • (iii) $\mathrm{tr}(K) = 1$ and $K$ is positive definite.

Then, the algebra generated by $X_1,\ldots,X_n$ is isomorphic to the rank-$r$ Clifford algebra $\mathscr{C}_r$. In particular, the size $d$ of the matrices is lower bounded by $2^{\lfloor r/2\rfloor}$.

Throughout this work, we refer to any family of matrices $X_1,\ldots,X_n, K$ satisfying conditions (i), (ii) and (iii) above as a matrix factorization of the correlation matrix $A$.

The proof of Theorem 1.1 is given in Section 2.4. As we explain in Section 4, Theorem 1.1 is related to Tsirelson's results concerning the structure of the quantum state and the observables that are necessary to generate an extremal quantum correlation matrix.

A few comments are in order concerning the statement of Theorem 1.1. First, there exists a rank-$r$ extreme point of $\mathcal{E}_n$ if and only if $\binom{r+1}{2}\le n$. This is a consequence of some well-known facts concerning extreme points of the elliptope which we review in Section 2.1, and it will be satisfied whenever we apply Theorem 1.1. Second, although not immediately obvious, we will see in Lemma 2.6 that any correlation matrix admits such a matrix factorization (where we can even always take $K$ to be a multiple of the identity).

Also, it is worth noting that Theorem 1.1 remains valid when condition (i) is replaced with:

$X_i^2 \preceq \mathbb{1}_d$ for all $i\in[n]$,

and also when conditions (ii) and (iii) are replaced with:

  • $A_{ij} = \mathrm{tr}(X_iKX_jK^*)$ for all $i,j\in[n]$;

  • $\mathrm{tr}(KK^*) = 1$ and $K$ is invertible.

Lastly, we note in passing that representations of generalized Clifford algebras (associated with a real zero polynomial) are related to the existence of determinantal representations of (powers of) hyperbolic polynomials, e.g. see [9, 10]. It is an interesting question whether representations of generalized Clifford algebras relate in a similar manner to spectrahedra more general than the elliptope.

Applications to the cpsd-rank.

An $n\times n$ matrix $A$ is called completely positive semidefinite (cpsd) if there exist Hermitian positive semidefinite matrices $P_1,\ldots,P_n\in\mathcal{H}^d_+$ (for some $d\ge1$) satisfying

(3)  $A_{ij} = \mathrm{tr}(P_iP_j),\qquad\text{for all } i,j\in[n].$

The set of $n\times n$ cpsd matrices is a convex cone denoted by $\mathcal{CS}_+^n$. The completely positive semidefinite rank (cpsd-rank) of a matrix $A\in\mathcal{CS}_+^n$, denoted by $\mathrm{cpsd\text{-}rank}(A)$, is defined as the least $d\ge1$ for which there exist Hermitian psd matrices $P_1,\ldots,P_n\in\mathcal{H}^d_+$ such that $A_{ij} = \mathrm{tr}(P_iP_j)$ for all $i,j\in[n]$.
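The definition is easy to verify mechanically. A small sketch (Python/numpy; is_cpsd_factorization is our name) that checks a proposed cpsd-factorization, illustrated with rank-one psd matrices $P_i = v_iv_i^\top$, for which $\mathrm{tr}(P_iP_j) = \langle v_i,v_j\rangle^2$:

    import numpy as np

    def is_cpsd_factorization(A, mats, tol=1e-9):
        for P in mats:
            if not np.allclose(P, P.conj().T, atol=tol):   # Hermitian
                return False
            if np.linalg.eigvalsh(P).min() < -tol:         # psd
                return False
        G = np.array([[np.trace(P @ Q).real for Q in mats] for P in mats])
        return np.allclose(G, A, atol=tol)                 # A_ij = tr(P_i P_j)

    rng = np.random.default_rng(1)
    V = rng.standard_normal((4, 3))
    A = (V @ V.T) ** 2                  # tr(v_i v_i^T v_j v_j^T) = <v_i, v_j>^2
    assert is_cpsd_factorization(A, [np.outer(v, v) for v in V])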

We use Theorem 1.1 to give a succinct and self-contained proof of the existence of cpsd matrices whose cpsd-rank is sub-exponential in terms of their size. Specifically, we recover the following result that was recently shown in [11], and independently in [5].

Theorem 1.2 ([11, 5]).

For any $n\ge1$ there exists a matrix $\mathcal{A}_n\in\mathcal{CS}_+^{2n}$ such that

(4)  $\mathrm{cpsd\text{-}rank}(\mathcal{A}_n) = 2^{\lfloor r/2\rfloor},$

where $r$ is the greatest integer satisfying $\binom{r+1}{2}\le n$, i.e., $r = \left\lfloor\dfrac{\sqrt{8n+1}-1}{2}\right\rfloor$.
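The closed form for $r$ agrees with the defining property, as the following sketch verifies (using integer arithmetic to avoid floating-point issues; the function name is ours):

    import math

    def max_extreme_rank(n):
        # Greatest r with binom(r+1, 2) = r(r+1)/2 <= n.
        r = (math.isqrt(8 * n + 1) - 1) // 2
        assert r * (r + 1) // 2 <= n < (r + 1) * (r + 2) // 2
        return r

    assert [max_extreme_rank(n) for n in (1, 2, 3, 6, 10)] == [1, 1, 2, 3, 4]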

Both proofs of Theorem 1.2 were obtained using Tsirelson's results [15] as a black box, which we replace by Theorem 1.1. The value of this alternative proof is that it bypasses all the "quantum overhead" that was used in the original proofs [11, 5] and, as such, makes the result accessible to the broader mathematical community. Additionally, our new proof puts forward matrix factorizations of the elliptope as a useful mathematical tool.

The study of the cpsd cone and the cpsd-rank is motivated by physical considerations. Specifically, the cpsd cone was recently introduced to provide linear conic formulations for the quantum analogues of various classical graph parameters [7, 12]. Subsequently, it was shown in [13] that the set of quantum behaviors can be expressed as the projection of an affine section of the cpsd cone. Additionally, the cpsd-rank can be used to upper and lower bound the size of a quantum system needed to generate a quantum behavior [11].

Furthermore, from the perspective of conic optimization, the cpsd-rank is a natural non-commutative analogue of the well-studied notion of the completely positive rank [2].

Relation to Tsirelson’s work.

In his seminal work [15], Tsirelson showed that the set of bipartite correlation matrices coincides with the set of correlations that can be obtained by performing local measurements on a bipartite quantum system; for details see Section 4. Furthermore, he studied matrix factorizations of bipartite correlation matrices that have an operational interpretation within the context of quantum information theory.

Among other things, Tsirelson showed in [15] that for extremal correlations, the matrices in such a factorization (in the finite-dimensional case) generate an appropriate Clifford algebra, and as a consequence, their size can be lower bounded.

The proof of Theorem 1.1 uses ideas from Tsirelson’s work but, as we show in Section 4, it strictly generalizes Tsirelson’s structure theorem. Furthermore, Theorem 1.1 provides an alternative interpretation of Tsirelson’s results, by highlighting matrix factorizations of the elliptope as the underlying mathematical object.

1.1. Preliminaries

In this section we introduce the most important definitions, notation and background material that we use throughout this paper.

Linear Algebra. Throughout, we use the shorthand notation $[n] := \{1,\ldots,n\}$ and denote by $\mathbb{1}_d$ the $d\times d$ identity matrix. We denote by $e_1,\ldots,e_n$ the standard basis of $\mathbb{R}^n$. The canonical inner product of two vectors $x,y\in\mathbb{R}^n$ is denoted by $\langle x,y\rangle := x^\top y$. We write $\mathrm{span}(x_1,\ldots,x_k)$ for the linear span of the vectors $x_1,\ldots,x_k$.

We denote by $\mathbb{C}^{d\times d}$ the set of $d\times d$ complex matrices and by $\mathcal{H}^d$ (resp. $\mathcal{S}^n$) the set of $d\times d$ Hermitian (resp. $n\times n$ symmetric) matrices. Given a matrix $X$, its transpose is denoted by $X^\top$ and its conjugate transpose by $X^*$. Furthermore, we denote by $X\otimes Y$ the Kronecker product of $X$ and $Y$. Throughout, we equip $\mathbb{C}^{d\times d}$ with the Hilbert-Schmidt inner product $\langle X,Y\rangle := \mathrm{tr}(X^*Y)$. For a block matrix $\begin{pmatrix} A & C\\ C^* & B\end{pmatrix}$ with $A$ positive definite we use that

(5)  $\begin{pmatrix} A & C\\ C^* & B\end{pmatrix}\succeq 0 \iff B - C^*A^{-1}C\succeq 0.$

A matrix $X\in\mathcal{H}^d$ is called positive semidefinite (psd) if $\psi^*X\psi\ge0$ for all $\psi\in\mathbb{C}^d$. The set of Hermitian psd (resp. symmetric psd) matrices forms a closed convex cone denoted by $\mathcal{H}^d_+$ (resp. $\mathcal{S}^n_+$). We sometimes also write $X\succeq0$ to indicate that $X$ is psd.

The Gram matrix of a family of vectors $x_1,\ldots,x_n$, denoted by $\mathrm{Gram}(x_1,\ldots,x_n)$ or $\mathrm{Gram}(\{x_i\}_{i=1}^n)$, is the matrix whose $(i,j)$ entry is given by $\langle x_i,x_j\rangle$, for all $i,j\in[n]$. It is easy to see that an $n\times n$ matrix $A$ is positive semidefinite if and only if there exist vectors $x_1,\ldots,x_n\in\mathbb{R}^k$ (for some $k\ge1$) such that $A = \mathrm{Gram}(x_1,\ldots,x_n)$. For any Gram matrix we have that $\mathrm{rank}\,\mathrm{Gram}(x_1,\ldots,x_n) = \dim\mathrm{span}(x_1,\ldots,x_n)$. Lastly, if $X,Y,K\in\mathcal{H}^d$ we make use of the following property:

(6)  $\overline{\mathrm{tr}(XYK)} = \mathrm{tr}(YXK).$

We use a well-known correspondence between $\mathbb{C}^{d\times d}$ and $\mathbb{C}^d\otimes\mathbb{C}^d$ given by the map $\mathrm{vec}:\mathbb{C}^{d\times d}\to\mathbb{C}^d\otimes\mathbb{C}^d$, which is given by $\mathrm{vec}(e_ie_j^\top) := e_i\otimes e_j$ on basis vectors and is extended linearly. The map $\mathrm{vec}$ is an isometry, i.e., $\langle\mathrm{vec}(X),\mathrm{vec}(Y)\rangle = \mathrm{tr}(X^*Y)$ for all $X,Y\in\mathbb{C}^{d\times d}$. We also need the following fact:

(7)  $(A\otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(AXB^\top),\qquad\text{for all } A,B,X\in\mathbb{C}^{d\times d}.$
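Both properties are easy to check numerically; with row-major flattening, reshape realizes $\mathrm{vec}$ exactly in the convention above (a sketch with randomly generated matrices):

    import numpy as np

    rng = np.random.default_rng(2)
    d = 3
    A, B, X, Y = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
                  for _ in range(4))
    vec = lambda M: M.reshape(-1)   # row-major: vec(e_i e_j^T) = e_i (x) e_j

    # Isometry: <vec(X), vec(Y)> = tr(X* Y)
    assert np.isclose(np.vdot(vec(X), vec(Y)), np.trace(X.conj().T @ Y))
    # Fact (7): (A (x) B) vec(X) = vec(A X B^T)
    assert np.allclose(np.kron(A, B) @ vec(X), vec(A @ X @ B.T))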

Any nonzero vector $\psi\in\mathbb{C}^d\otimes\mathbb{C}^d$ can be expressed as $\psi = \sum_{i=1}^k\lambda_i\,a_i\otimes b_i$ for some integer $k\ge1$, positive scalars $\lambda_1,\ldots,\lambda_k$, and orthonormal sets $\{a_i\}_{i=1}^k$ and $\{b_i\}_{i=1}^k$ in $\mathbb{C}^d$; the scalars $\lambda_i$ are uniquely determined by $\psi$. An expression of this form is known as a Schmidt decomposition for $\psi$ and is derived from the singular value decomposition of $\mathrm{vec}^{-1}(\psi)$. Note that if $\psi = \sum_{i=1}^k\lambda_i\,a_i\otimes b_i$ is a Schmidt decomposition for $\psi$, then we have that $\mathrm{vec}^{-1}(\psi) = \sum_{i=1}^k\lambda_i\,a_ib_i^\top$.
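A Schmidt decomposition can thus be computed with one SVD; a minimal sketch (the function name schmidt is ours):

    import numpy as np

    def schmidt(psi, d):
        M = psi.reshape(d, d)                 # vec^{-1}(psi), row-major
        U, s, Vh = np.linalg.svd(M)
        k = int(np.sum(s > 1e-12))
        return s[:k], U[:, :k].T, Vh[:k, :]   # lambda_i, a_i, b_i

    d = 3
    rng = np.random.default_rng(3)
    psi = rng.standard_normal(d * d) + 1j * rng.standard_normal(d * d)
    lam, a, b = schmidt(psi, d)
    assert np.allclose(sum(l * np.kron(u, v) for l, u, v in zip(lam, a, b)), psi)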

The Pauli matrices are given by

$\sigma_x = \begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad \sigma_y = \begin{pmatrix}0&-i\\ i&0\end{pmatrix},\qquad \sigma_z = \begin{pmatrix}1&0\\0&-1\end{pmatrix}.$

Note that the Pauli matrices are Hermitian, their trace is equal to zero, they have eigenvalues $\pm1$, and they pairwise anti-commute.
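These four properties can be confirmed directly (a sketch):

    import numpy as np

    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)

    for s in (sx, sy, sz):
        assert np.allclose(s, s.conj().T)                    # Hermitian
        assert np.isclose(np.trace(s), 0)                    # traceless
        assert np.allclose(np.linalg.eigvalsh(s), [-1, 1])   # eigenvalues +-1
    for s, t in ((sx, sy), (sx, sz), (sy, sz)):
        assert np.allclose(s @ t + t @ s, 0)                 # anti-commutation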

Clifford algebras. Throughout this section set $d := 2^{\lfloor k/2\rfloor}$. It is well-known that

(8)  $\mathscr{C}_k \cong \mathbb{C}^{d\times d}$ if $k$ is even, and $\mathscr{C}_k \cong \mathbb{C}^{d\times d}\oplus\mathbb{C}^{d\times d}$ if $k$ is odd.

For a proof of this fact and additional details the reader is referred to [4, Chapter 6]. An explicit representation of $\mathscr{C}_k$ is obtained using the Brauer-Weyl matrices. Specifically, for $k = 2\ell$, the map $\gamma$ given by

(9)  $\gamma(x_{2i-1}) := \sigma_z^{\otimes(i-1)}\otimes\sigma_x\otimes\mathbb{1}_2^{\otimes(\ell-i)},\qquad\text{for all } i\in[\ell],$

and

(10)  $\gamma(x_{2i}) := \sigma_z^{\otimes(i-1)}\otimes\sigma_y\otimes\mathbb{1}_2^{\otimes(\ell-i)},\qquad\text{for all } i\in[\ell],$

is a complex representation of $\mathscr{C}_k$. Furthermore, in the case where $k = 2\ell+1$ we define $\gamma(x_1),\ldots,\gamma(x_{2\ell})$ as described in (9) and (10) and additionally set $\gamma(x_{2\ell+1}) := \sigma_z^{\otimes\ell}$.

Lastly, we collect some properties of the map $\gamma$ which will be crucial for our results in the next section. Specifically, setting $\gamma_i := \gamma(x_i)$ for all $i\in[k]$, we have the following:

(11)  $\gamma_i^* = \gamma_i,\qquad \gamma_i\gamma_j + \gamma_j\gamma_i = 2\delta_{ij}\,\mathbb{1}_d,\qquad \mathrm{tr}(\gamma_i) = 0,\qquad \mathrm{tr}(\gamma_i\gamma_j) = d\,\delta_{ij},\qquad\text{for all } i,j\in[k].$
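The construction (9)-(10) and the properties (11) can be verified mechanically; a sketch for arbitrary $k\ge2$ (brauer_weyl is our helper name):

    import numpy as np
    from functools import reduce

    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)

    def brauer_weyl(k):
        # Generators gamma_1, ..., gamma_k of size d = 2^floor(k/2), per (9)-(10).
        ell = k // 2
        g = []
        for i in range(1, ell + 1):
            for s in (sx, sy):
                g.append(reduce(np.kron, [sz] * (i - 1) + [s] + [np.eye(2)] * (ell - i)))
        if k % 2 == 1:
            g.append(reduce(np.kron, [sz] * ell))
        return g

    k = 5
    g, d = brauer_weyl(k), 2 ** (k // 2)
    for i in range(k):
        assert np.isclose(np.trace(g[i]), 0)
        for j in range(k):
            assert np.allclose(g[i] @ g[j] + g[j] @ g[i], 2 * (i == j) * np.eye(d))
            assert np.isclose(np.trace(g[i] @ g[j]).real, d * (i == j))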

Convexity. A set $C\subseteq\mathbb{R}^k$ is convex if for all $x,y\in C$ and $\lambda\in[0,1]$ we have that $\lambda x + (1-\lambda)y\in C$. A subset $F\subseteq C$ is called a face of $C$ if for all $y,z\in C$ and $\lambda\in(0,1)$, $\lambda y + (1-\lambda)z\in F$ implies that $y,z\in F$. We say that $x$ is an extreme point of the convex set $C$ if the singleton $\{x\}$ is a face of $C$. We denote by $\mathrm{ext}(C)$ the set of extreme points of the convex set $C$. By the Krein-Milman theorem, any compact convex subset of $\mathbb{R}^k$ is equal to the convex hull of its extreme points, e.g. see [1].

2. Correlation matrices, extreme points and matrix factorizations

2.1. Extreme correlation matrices and quadratic maps

An operator valued quadratic map is a function $F:\mathbb{R}^r\to\mathcal{H}^d$ (for some $d\ge1$) such that every entry $F(x)_{kl}$ is a quadratic form in $x\in\mathbb{R}^r$, for all $k,l\in[d]$. The following result, observed already in [15], will be crucial.

Lemma 2.1.

Consider vectors $x_1,\ldots,x_n\in\mathbb{R}^r$ satisfying $\mathrm{span}(x_1,\ldots,x_n) = \mathbb{R}^r$ and

(12)  $\mathrm{span}\{x_ix_i^\top : i\in[n]\} = \mathcal{S}^r.$

Then, any operator valued quadratic map $F:\mathbb{R}^r\to\mathcal{H}^d$ satisfying

(13)  $F(x_i) = 0,\qquad\text{for all } i\in[n],$

is identically zero.
Proof.

First consider the case $d = 1$, i.e., we have a quadratic form $f:\mathbb{R}^r\to\mathbb{R}$. Let $R\in\mathcal{S}^r$ be the symmetric matrix corresponding to the bilinear form associated to $f$, with respect to the standard basis of $\mathbb{R}^r$, so that $f(x) = x^\top Rx$. By assumption (13) we have that $x_i^\top Rx_i = \langle x_ix_i^\top, R\rangle = 0$ for all $i\in[n]$ and thus, (12) implies that $R = 0$. For the case $d\ge2$, since $\mathcal{H}^d$ is a $d^2$-dimensional vector space over the real numbers, we can equivalently view $F$ as a tuple of quadratic forms $(f_1,\ldots,f_{d^2})$, where $f_i:\mathbb{R}^r\to\mathbb{R}$. As all the $f_i$'s are real valued quadratic forms vanishing on each of the $x_i$'s, the proof is concluded by the base case. ∎

Next, we recall the following well-known characterization of the extreme points of $\mathcal{E}_n$.

Theorem 2.2 ([8]).

Consider $A\in\mathcal{E}_n$ with $\mathrm{rank}(A) = r$ and let $A = \mathrm{Gram}(x_1,\ldots,x_n)$ where $x_i\in\mathbb{R}^r$. Then $A\in\mathrm{ext}(\mathcal{E}_n)$ if and only if

(14)  $\mathrm{span}\{x_ix_i^\top : i\in[n]\} = \mathcal{S}^r.$

Equivalently, we have that $A\in\mathrm{ext}(\mathcal{E}_n)$ if and only if

(15)  $\mathrm{rank}(A\circ A) = \binom{r+1}{2},$

where $A\circ A$ denotes the entrywise product of $A$ with itself.
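Theorem 2.2 gives an effective extremality test via (15); a sketch (is_extreme_elliptope is our name):

    import numpy as np
    from math import comb

    def is_extreme_elliptope(A, tol=1e-9):
        r = np.linalg.matrix_rank(A, tol)
        return np.linalg.matrix_rank(A * A, tol) == comb(r + 1, 2)

    # Rank-one correlation matrices A = xx^T with x in {+-1}^n are extreme ...
    x = np.array([1.0, -1.0, 1.0, -1.0])
    assert is_extreme_elliptope(np.outer(x, x))
    # ... while the identity matrix is not extreme for n >= 2.
    assert not is_extreme_elliptope(np.eye(4))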

Combining Lemma 2.1 with Theorem 2.2 we have the following useful result.

Theorem 2.3.

Consider $A\in\mathrm{ext}(\mathcal{E}_n)$ with $\mathrm{rank}(A) = r$ and let $A = \mathrm{Gram}(x_1,\ldots,x_n)$ where $x_i\in\mathbb{R}^r$. For any two operator valued quadratic maps $F,G:\mathbb{R}^r\to\mathcal{H}^d$ (for some $d\ge1$) satisfying $F(x_i) = G(x_i)$ for all $i\in[n]$ we have that $F = G$.

Proof.

Consider the operator valued quadratic map $F - G$. By assumption we have that $(F-G)(x_i) = 0$ for all $i\in[n]$. As $A\in\mathrm{ext}(\mathcal{E}_n)$, by Theorem 2.2 we have that (14) holds; moreover, $\mathrm{span}(x_1,\ldots,x_n) = \mathbb{R}^r$ since $\mathrm{rank}(A) = r$. Lastly, Lemma 2.1 implies that $F - G = 0$. ∎

2.2. Bipartite correlation matrices

By definition of $\mathrm{Cor}(n,m)$ (recall (1)) and $\mathcal{E}_{n+m}$ we have:

(16)  $\mathrm{Cor}(n,m) = \big\{(\langle x_i,y_j\rangle)_{i\in[n],j\in[m]} : x_i,y_j\in\mathbb{R}^k \ (k\ge1),\ \|x_i\| = \|y_j\| = 1\big\}.$

Given a bipartite correlation matrix $C$, any $X\in\mathcal{E}_{n+m}$ with $\pi(X) = C$ is called a completion of $C$. As was shown in [15], all equality constraints in (16) can be relaxed to inequalities, without enlarging the set. For completeness, we give a proof in the Appendix.

Lemma 2.4 ([15]).

For all $n,m\ge1$ we have that

(17)  $\mathrm{Cor}(n,m) = \big\{(\langle x_i,y_j\rangle)_{i\in[n],j\in[m]} : x_i,y_j\in\mathbb{R}^k \ (k\ge1),\ \|x_i\|\le 1,\ \|y_j\|\le 1\big\}.$

Given a bipartite correlation $C\in\mathrm{Cor}(n,m)$, any family of vectors $x_1,\ldots,x_n,y_1,\ldots,y_m$ satisfying $\|x_i\|\le1$, $\|y_j\|\le1$ and $C_{ij} = \langle x_i,y_j\rangle$ for all $i\in[n]$, $j\in[m]$ is called a $C$-system.

In the next result we summarize some basic properties of the set of completions of extreme bipartite correlations. For completeness, we give a short proof in the Appendix.

Lemma 2.5 ([15]).

For any $C\in\mathrm{ext}(\mathrm{Cor}(n,m))$ we have that:

  • All $C$-systems necessarily consist of unit vectors;

  • For any $C$-system $x_1,\ldots,x_n,y_1,\ldots,y_m$ we have that $\mathrm{span}(x_1,\ldots,x_n) = \mathrm{span}(y_1,\ldots,y_m)$;

  • There exists a unique matrix $X\in\mathcal{E}_{n+m}$ satisfying $\pi(X) = C$. Furthermore, we have $X\in\mathrm{ext}(\mathcal{E}_{n+m})$ and $\mathrm{rank}(X) = \mathrm{rank}(C)$.

2.3. Matrix factorizations of correlation matrices

We are finally ready to show that any correlation matrix admits a matrix factorization as defined in Theorem 1.1.

Lemma 2.6.

Consider a real symmetric matrix $A\in\mathcal{S}^n$. The following are equivalent:

  • (i) $A\in\mathcal{E}_n$, i.e., there exist real unit vectors $x_1,\ldots,x_n$ such that $A_{ij} = \langle x_i,x_j\rangle$ for all $i,j\in[n]$;

  • (ii) There exist Hermitian matrices $X_1,\ldots,X_n\in\mathcal{H}^d$ (for some $d\ge1$) such that

    • $X_i^2 = \mathbb{1}_d$ and $A_{ij} = \frac{1}{d}\,\mathrm{tr}(X_iX_j)$ for all $i,j\in[n]$;

  • (iii) There exist Hermitian matrices $X_1,\ldots,X_n,K\in\mathcal{H}^d$ (for some $d\ge1$) such that

    • $X_i^2 = \mathbb{1}_d$ for all $i\in[n]$;

    • $A_{ij} = \mathrm{tr}(X_iX_jK)$ for all $i,j\in[n]$;

    • $\mathrm{tr}(K) = 1$ and $K$ is positive definite.

Proof.

(i) $\Rightarrow$ (ii): Let $A = \mathrm{Gram}(x_1,\ldots,x_n)$ where $x_1,\ldots,x_n\in\mathbb{R}^r$ are unit vectors, and set $X_i := \sum_{k=1}^r (x_i)_k\,\gamma_k$ for all $i\in[n]$. By the properties of the map $\gamma$ (recall (11)) we have that $X_i^2 = \|x_i\|^2\,\mathbb{1}_d = \mathbb{1}_d$ and $\frac{1}{d}\,\mathrm{tr}(X_iX_j) = \langle x_i,x_j\rangle = A_{ij}$ for all $i,j\in[n]$.

(ii) $\Rightarrow$ (iii): Set $K := \frac{1}{d}\,\mathbb{1}_d$, which is positive definite and satisfies $\mathrm{tr}(K) = 1$ and $A_{ij} = \mathrm{tr}(X_iX_jK)$ for all $i,j\in[n]$.

(iii) $\Rightarrow$ (i): For all $i\in[n]$ let $u_i := \mathrm{vec}(X_iK^{1/2})\in\mathbb{C}^{d^2}$ and set $\hat u_i := (\mathrm{Re}(u_i),\mathrm{Im}(u_i))\in\mathbb{R}^{2d^2}$. As the entries of $A$ are real numbers we have that $A_{ij} = \mathrm{Re}\,\mathrm{tr}(X_iX_jK)$. Lastly, note that

$\langle\hat u_i,\hat u_j\rangle = \mathrm{Re}\,\langle u_i,u_j\rangle = \mathrm{Re}\,\mathrm{tr}(K^{1/2}X_iX_jK^{1/2}) = \mathrm{Re}\,\mathrm{tr}(X_iX_jK) = A_{ij},\qquad\text{for all } i,j\in[n].$

Similarly we have that $\|\hat u_i\|^2 = \mathrm{tr}(X_i^2K) = \mathrm{tr}(K) = 1$ for all $i\in[n]$, so the $\hat u_i$'s are real unit vectors with $A = \mathrm{Gram}(\hat u_1,\ldots,\hat u_n)$. ∎
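The implication (i) $\Rightarrow$ (ii) is easy to check numerically. For a rank-3 correlation matrix the Brauer-Weyl matrices are just the three Pauli matrices, so $d = 2$; a sketch (the example matrix is ours):

    import numpy as np

    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    gam = [sx, sy, sz]                       # Brauer-Weyl matrices for r = 3

    A = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.5],
                  [0.0, 0.5, 1.0]])          # a rank-3 correlation matrix
    w, V = np.linalg.eigh(A)
    xs = V * np.sqrt(np.clip(w, 0.0, None))  # rows: unit vectors with Gram matrix A

    # X_i := sum_k (x_i)_k gamma_k satisfies X_i^2 = 1 and A_ij = tr(X_i X_j)/d.
    X, d = [sum(x[k] * gam[k] for k in range(3)) for x in xs], 2
    for i in range(3):
        assert np.allclose(X[i] @ X[i], np.eye(d))
        for j in range(3):
            assert np.isclose(np.trace(X[i] @ X[j]).real / d, A[i, j])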

We refer to any family of matrices satisfying condition (iii) from Lemma 2.6 as a matrix factorization of $A$. As already described in the introduction, our goal is to show that for extreme points of $\mathcal{E}_n$, we can place a lower bound on the size of matrix factorizations which is exponential in terms of $\mathrm{rank}(A)$. We note in passing that using the same arguments we can also lower bound the size of matrix factorizations satisfying condition (ii) from Lemma 2.6. Nevertheless, lower bounds for matrix factorizations of type (iii) are stronger.

2.4. Proof of the lower bound

In this section we give the proof of Theorem 1.1. This will follow as a consequence of the following result.

Lemma 2.7.

Let $A$ be an extreme point of $\mathcal{E}_n$ where $\mathrm{rank}(A) = r$. Consider a family of Hermitian operators $X_1,\ldots,X_n\in\mathcal{H}^d$ satisfying

(18)  $X_i^2 = \mathbb{1}_d$ and $A_{ij} = \mathrm{Re}\,\mathrm{tr}(K^*X_iX_jK),\qquad\text{for all } i,j\in[n],$

where $K$ is an invertible $d\times d$ matrix satisfying $\mathrm{tr}(KK^*) = 1$ (the fact that such a matrix exists follows from (5)). Then we have that

(19)  $X_iX_j + X_jX_i = 2A_{ij}\,\mathbb{1}_d,\qquad\text{for all } i,j\in[n].$

In particular, the algebra generated by $X_1,\ldots,X_n$ is isomorphic to the rank-$r$ Clifford algebra $\mathscr{C}_r$ and thus, the size of the matrices is lower bounded by $2^{\lfloor r/2\rfloor}$.

Proof.

Consider vectors $x_1,\ldots,x_n\in\mathbb{R}^r$ satisfying $A = \mathrm{Gram}(x_1,\ldots,x_n)$; as $\mathrm{rank}(A) = r$, these vectors span $\mathbb{R}^r$. For all $i\in[n]$ set $u_i := \mathrm{vec}(X_iK)$. Using the isometry property of $\mathrm{vec}$ combined with the fact that $X_i^2 = \mathbb{1}_d$, $\mathrm{tr}(KK^*) = 1$ and (18), we get

(20)  $\mathrm{Re}\,\langle u_i,u_j\rangle = A_{ij}$ and $\|u_i\| = 1,\qquad\text{for all } i,j\in[n].$

In particular, regarding $\mathbb{C}^{d\times d}$ as a real inner product space with inner product $\mathrm{Re}\,\langle\cdot,\cdot\rangle$, we have $A = \mathrm{Gram}(u_1,\ldots,u_n)$. Consequently, if $\sum_i c_ix_i = 0$ for some $c\in\mathbb{R}^n$, then $\|\sum_i c_iu_i\|^2 = c^\top Ac = \|\sum_i c_ix_i\|^2 = 0$, and since $K$ is invertible this implies $\sum_i c_iX_i = 0$. Thus, there is a well-defined linear map $T:\mathbb{R}^r\to\mathcal{H}^d$ satisfying $T(x_i) = X_i$ for all $i\in[n]$.

For $x\in\mathbb{R}^r$ we define operator valued quadratic maps $F,G:\mathbb{R}^r\to\mathcal{H}^d$ as

$F(x) := T(x)^2\qquad\text{and}\qquad G(x) := \langle x,x\rangle\,\mathbb{1}_d.$

As the vectors $x_1,\ldots,x_n$ span $\mathbb{R}^r$, the maps $F$ and $G$ are well-defined. Note that the claim (19) is equivalent to $F = G$: indeed, expanding $T(x_i+x_j)^2 = \|x_i+x_j\|^2\,\mathbb{1}_d$ and using $X_i^2 = \mathbb{1}_d$ yields (19). Thus, by Theorem 2.3 it suffices to show that

(21)  $F(x_i) = G(x_i),\qquad\text{for all } i\in[n].$

First, note that $F(x_i) = X_i^2 = \mathbb{1}_d$, where we use that $X_i^2 = \mathbb{1}_d$ for all $i\in[n]$. Furthermore, for all $i\in[n]$ we have $\|x_i\|^2 = A_{ii} = \|u_i\|^2 = 1$, where for the first equality we use that $A = \mathrm{Gram}(x_1,\ldots,x_n)$ and for the last equality (20). Similarly, $G(x_i) = \|x_i\|^2\,\mathbb{1}_d = \mathbb{1}_d$, where we use that $\|x_i\| = 1$ for all $i\in[n]$. Thus, (21) holds, which in turn implies (19).

Lastly, as an immediate consequence of (19) we get that

(22)  $T(x)T(y) + T(y)T(x) = 2\,\langle x,y\rangle\,\mathbb{1}_d,\qquad\text{for all } x,y\in\mathbb{R}^r.$

Let $A = \sum_{k=1}^r\lambda_k\,v_kv_k^\top$ be a spectral decomposition of $A$. By assumption $\mathrm{rank}(A) = r$ and thus $\lambda_k$ is positive for all $k\in[r]$. Setting

$Z_k := \frac{1}{\sqrt{\lambda_k}}\sum_{i=1}^n (v_k)_i\,X_i,\qquad\text{for all } k\in[r],$

we have that

$Z_kZ_l + Z_lZ_k = \frac{2\,v_k^\top A\,v_l}{\sqrt{\lambda_k\lambda_l}}\,\mathbb{1}_d = 2\delta_{kl}\,\mathbb{1}_d,\qquad\text{for all } k,l\in[r],$

and thus the algebra generated by $X_1,\ldots,X_n$ is isomorphic to the rank-$r$ Clifford algebra $\mathscr{C}_r$. ∎
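The conclusion of Lemma 2.7 can be observed concretely on a rank-2 extreme point of $\mathcal{E}_3$ (the example and construction below are ours, chosen to satisfy the hypotheses with $K = \mathbb{1}_2/\sqrt{2}$):

    import numpy as np

    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])

    # Gram matrix of three unit vectors in R^2 at angles 0, 60, 120 degrees:
    # an extreme point of E_3 of rank r = 2.
    ang = np.array([0.0, np.pi / 3, 2 * np.pi / 3])
    xs = np.stack([np.cos(ang), np.sin(ang)], axis=1)
    A = xs @ xs.T

    # A matrix factorization of size d = 2^{floor(r/2)} = 2.
    X = [x[0] * sx + x[1] * sy for x in xs]

    # The operators Z_k built from the spectral decomposition of A
    # satisfy the Clifford relations, as in the last step of the proof.
    lam, V = np.linalg.eigh(A)
    lam, V = lam[1:], V[:, 1:]               # the r = 2 nonzero eigenvalues
    Z = [sum(V[i, k] * X[i] for i in range(3)) / np.sqrt(lam[k]) for k in range(2)]
    for k in range(2):
        for l in range(2):
            assert np.allclose(Z[k] @ Z[l] + Z[l] @ Z[k], 2 * (k == l) * np.eye(2))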

We are now ready to give the proof of Theorem 1.1.

Proof.

(of Theorem 1.1) As $K$ is positive definite with $\mathrm{tr}(K) = 1$, there exists a $d\times d$ matrix $R$ such that $K = RR^*$ and $\mathrm{tr}(RR^*) = 1$; e.g., we may take $R := K^{1/2}$. Since the entries of $A$ are real it follows by (6) that

$A_{ij} = \tfrac{1}{2}\,\mathrm{tr}\big((X_iX_j + X_jX_i)K\big),\qquad\text{for all } i,j\in[n],$

and as $K$ is positive definite (and hence $R$ is invertible) we obtain

(23)  $A_{ij} = \mathrm{Re}\,\mathrm{tr}(R^*X_iX_jR),\qquad\text{for all } i,j\in[n].$

Lastly, define $u_i := \mathrm{vec}(X_iR)$ for all $i\in[n]$. By assumption we have that $X_i^2 = \mathbb{1}_d$, for all $i\in[n]$, which implies that $\|u_i\| = 1$. Furthermore, as $\langle u_i,u_j\rangle = \mathrm{tr}(R^*X_iX_jR)$ for all $i,j\in[n]$, it follows by (23) that

(24)  $\mathrm{Re}\,\langle u_i,u_j\rangle = A_{ij},\qquad\text{for all } i,j\in[n].$

The proof of the theorem is concluded using Lemma 2.7, applied with the matrix $R$ in the place of $K$. ∎

3. Constructing matrices with sub-exponential cpsd-rank

In this section we use Theorem 1.1 to prove Theorem 1.2. The crux of the proof lies in the following result.

Theorem 3.1.

For any $A\in\mathrm{ext}(\mathcal{E}_n)$ with $\mathrm{rank}(A) = r$, the matrix

(25)  $\mathcal{A} := \frac{1}{4}\begin{pmatrix}1+A_{ij} & 1-A_{ij}\\ 1-A_{ij} & 1+A_{ij}\end{pmatrix}_{i,j=1}^{n}\in\mathcal{S}^{2n},$

i.e., the symmetric $2n\times2n$ matrix whose $(i,j)$ block is $\frac{1}{4}\begin{pmatrix}1+A_{ij} & 1-A_{ij}\\ 1-A_{ij} & 1+A_{ij}\end{pmatrix}$,

is cpsd and furthermore, $\mathrm{cpsd\text{-}rank}(\mathcal{A}) = 2^{\lfloor r/2\rfloor}$.

Proof.

Let $A = \mathrm{Gram}(x_1,\ldots,x_n)$ where $x_i\in\mathbb{R}^r$ and set $d := 2^{\lfloor r/2\rfloor}$. As suggested by (25) we think of $\mathcal{A}$ as an $n\times n$ block matrix, where each block has size $2\times2$ and is indexed by $a,b\in\{\pm1\}$.

We first show that $\mathrm{cpsd\text{-}rank}(\mathcal{A})\le d$. For this, set $X_i := \sum_{k=1}^r (x_i)_k\,\gamma_k$ for all $i\in[n]$ and define

(26)  $P_i^a := \frac{1}{2\sqrt{d}}\big(\mathbb{1}_d + aX_i\big),\qquad\text{for all } i\in[n] \text{ and } a\in\{\pm1\},$

and note that by the properties of the map $\gamma$ (recall (11)), these matrices are Hermitian psd. Furthermore, by direct calculation, for all $i,j\in[n]$ and $a,b\in\{\pm1\}$ we have that

(27)  $\mathrm{tr}(P_i^aP_j^b) = \frac{1 + ab\,A_{ij}}{4},$

which shows that the matrices $\{P_i^a\}$ form a cpsd-factorization for $\mathcal{A}$ of size $d$. Next we proceed to show the lower bound.
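The upper-bound construction (26)-(27) can be checked directly; a sketch reusing the rank-2 extreme point from the previous example, for which $d = 2$:

    import numpy as np

    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])

    ang = np.array([0.0, np.pi / 3, 2 * np.pi / 3])
    xs = np.stack([np.cos(ang), np.sin(ang)], axis=1)
    A, d = xs @ xs.T, 2
    X = [x[0] * sx + x[1] * sy for x in xs]   # X_i = sum_k (x_i)_k gamma_k

    # (26): P_i^a = (1 + a X_i) / (2 sqrt(d)) is Hermitian psd, and
    # (27): tr(P_i^a P_j^b) = (1 + ab A_ij) / 4.
    P = {(i, a): (np.eye(d) + a * X[i]) / (2 * np.sqrt(d))
         for i in range(3) for a in (1, -1)}
    for (i, a), Pia in P.items():
        assert np.linalg.eigvalsh(Pia).min() > -1e-12
        for (j, b), Pjb in P.items():
            assert np.isclose(np.trace(Pia @ Pjb).real, (1 + a * b * A[i, j]) / 4)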

Let $\{Q_i^a : i\in[n],\ a\in\{\pm1\}\}\subseteq\mathcal{H}^{d'}_+$ be a size-optimal cpsd-factorization for $\mathcal{A}$, i.e., $d' = \mathrm{cpsd\text{-}rank}(\mathcal{A})$. We now identify some useful properties of these matrices which we use later in the proof. As the entries of $\mathcal{A}$ in each block sum up to one we get

(28)  $\mathrm{tr}\big((Q_i^+ + Q_i^-)(Q_j^+ + Q_j^-)\big) = 1,\qquad\text{for all } i,j\in[n].$

For all $i\in[n]$ set

(29)  $Z := Q_i^+ + Q_i^-,$

which is well-defined (i.e., independent of $i$) by (28); indeed, taking $i = j$ in (28) shows that each $Q_i^+ + Q_i^-$ has unit Frobenius norm, and the equality case of the Cauchy-Schwarz inequality then forces all of them to coincide. Furthermore, note that $Z$ is psd and $\mathrm{tr}(Z^2) = 1$.

Since the cpsd-factorization is size-optimal we may assume without loss of generality that $Z$ is diagonal and positive definite. Indeed, let $Z = U\Lambda U^*$ be its spectral decomposition. Clearly, the matrices $U^*Q_i^aU$ are Hermitian positive semidefinite and as $U$ is unitary, it follows that they form a cpsd-factorization for $\mathcal{A}$. As a consequence, if $\Lambda$ was rank-deficient, by restricting the matrices $U^*Q_i^aU$ onto the support of $\Lambda$, we would get another cpsd-factorization of smaller size. This contradicts the assumption that the factorization was size-optimal.

Our next goal is to use the cpsd-factorization to obtain the matrix factorization to which Theorem 1.1 will be applied. First, note that expanding each block of $\mathcal{A}$ gives $A_{ij} = \mathrm{tr}\big((Q_i^+ - Q_i^-)(Q_j^+ - Q_j^-)\big)$ for all $i,j\in[n]$. As $Z$ is invertible we have that

(30)  $A_{ij} = \mathrm{tr}\big(X_i'\,Z\,X_j'\,Z\big),\qquad\text{for all } i,j\in[n],$

where we define

(31)  $X_i' := Z^{-1/2}\big(Q_i^+ - Q_i^-\big)Z^{-1/2},\qquad\text{for all } i\in[n].$

An easy calculation shows that

(32)  $(X_i')^2 \preceq \mathbb{1}_{d'},\qquad\text{for all } i\in[n].$