Constraints on Multipartite Quantum Entropies

Christian Majenz

christian.majenz@pluto.uni-freiburg.de

Supervisor: Prof. David Gross

Master’s Thesis submitted to

Albert-Ludwigs-Universität Freiburg

February 2014

Abstract

The von Neumann entropy plays a vital role in quantum information theory. Just as the Shannon entropy does in classical information theory, the von Neumann entropy determines the capacities of quantum channels. Quantum entropies of composite quantum systems are important for future quantum network communication, and their characterization is related to the so-called quantum marginal problem. Furthermore, they play a role in quantum thermodynamics. In this thesis the set of quantum entropies of multipartite quantum systems is the main object of interest. The problem of characterizing this set is not new; however, progress has been sparse, indicating that the problem may be considered hard and that new methods might be needed. Here, a variety of different and complementary approaches are taken.

First, I look at global properties. It is known that the von Neumann entropy region – just like its classical counterpart – forms a convex cone. I describe the symmetries of this cone and highlight geometric similarities and differences to the classical entropy cone.

In a different approach, I utilize the local geometric properties of extremal rays of a cone. I show that quantum states whose entropy lies on such an extremal ray of the quantum entropy cone have a very simple structure.

As the set of all quantum states is very complicated, I look at a simple subset called stabilizer states. I improve on previously known results by showing that under a technical condition on the local dimension, entropies of stabilizer states respect an additional class of information inequalities that is valid for random variables from linear codes.

In a last approach I find a representation-theoretic formulation of the classical marginal problem simplifying the comparison with its quantum mechanical counterpart. This novel correspondence yields a simplified formulation of the group characterization of classical entropies (IEEE Trans. Inf. Theory, 48(7):1992–1995, 2002) in purely combinatorial terms.

Summary

The von Neumann entropy plays a central role in quantum information theory. Like the Shannon entropy in classical information theory, the von Neumann entropy characterizes the capacity of quantum channels. Quantum entropies of quantum many-body systems determine the communication rate over a quantum network, and the problem of characterizing them is related to the so-called quantum marginal problem. They also play a role in quantum thermodynamics. In this thesis the main focus is on the set of quantum entropies of quantum states with a fixed number of particles. The characterization problem for this set is not new; so far, however, little progress has been made, which indicates that the problem can be regarded as hard and that new methods are probably needed to get closer to a solution. Here, several different approaches are tried.

First I approach the problem from a “global perspective”. It is known that the region of all von Neumann entropies forms a convex cone, just like its classical counterpart. I describe symmetries of this cone and examine similarities and differences to the classical entropy cone.

A complementary approach is the study of local geometric properties: I show that quantum states whose entropies lie on an extremal ray of the quantum entropy cone have a very simple structure.

Since the set of all quantum states is very complicated, I look at a simple subset: the set of stabilizer states. I improve on previously known results by showing that the entropies of stabilizer states satisfy an additional class of inequalities that hold for random variables from linear codes.

A fourth approach that I consider is the representation-theoretic one. I formulate the classical marginal problem in the language of representation theory, which simplifies the comparison with the quantum marginal problem. This novel connection yields a simplified combinatorial formulation of the group characterization of classical entropies (IEEE Trans. Inf. Theory, 48(7):1992–1995, 2002).

Acknowledgements

First of all I want to thank my supervisor David Gross. Only his support and encouragement as well as our discussions made this thesis possible, and our collaboration was a great pleasure. I want to thank Michael Walter for great discussions. Special thanks go to the whole quantum correlations research group at the University of Freiburg, which has been a splendid environment for the last year. In particular I want to thank Lukas Luft and Rafael Chaves for sharing their perspective on convex geometry and Shannon entropic inequalities. I want to thank Joe Tresadern for proofreading part of this thesis.

I want to thank my parents, Jaqueline and Klaus Majenz, for their support. Furthermore I want to thank Laura König as well as my housemates for their leniency when I missed some elements of reality every now and then due to their low dimensionality.

I acknowledge financial support by the German National Academic Foundation.

Chapter 0 Introduction

0.1 Motivation

The main goal of this thesis is a better understanding of the entropies of multi-particle quantum states. This is an important task from a number of perspectives.

First, there is the information-theoretic perspective. In both classical and quantum information theory, entropy is a key concept which determines the capacity of a communication channel [55, 53]. In simple communication scenarios with one sender and one receiver, it suffices to study the entropies of bipartite systems, i.e. of two random variables or a bipartite quantum state. Bipartite entropies are well understood in both classical and quantum information theory.

In a network scenario, however, where data has to be sent from multiple senders to multiple receivers, relations between joint and marginal entropies of multiple random variables determine the constraints on achievable communication rates [59]. Although little progress was made for almost fifty years, the past fifteen years have finally brought results towards understanding the laws governing the entropies of more than two random variables. In the quantum setting virtually nothing is known. In particular, as the bipartite case shows very strong similarities between quantum and classical entropies, it is promising to search for analogues of the aforementioned recent multivariate classical results.

In this regard the problem of characterizing the region of possible entropy vectors of multipartite quantum states naturally appears as part of one of the overarching programs in quantum information theoretic research: If possible, find quantum analogues to the results and concepts from classical information theory, otherwise shed light on the differences between the two theories.

Sometimes insights from quantum information theory also have an impact on classical information theory [45], providing another motivation to study quantum information problems that might be still far from possible applications.

Another perspective is that of the quantum marginal problem. This is defined more formally in Section 2.2.3, and can be stated as follows: Given a multipartite quantum system and some reduced states, is there a global state of that system that is compatible with the given reductions?

A general solution to this problem would have vast implications for quantum physics and quantum information theory. It would, for example, render the task of finding ground states of lattice systems with nearest neighbor interaction [23] and the calculation of binding energies and other properties of matter [36] computationally tractable. This is unfortunately too optimistic an assumption as the quantum marginal problem turns out to be QMA-complete [43], as are several specialized variants of practical relevance [44, 57]. This is believed to imply that these problems are intractable even for a quantum computer, as QMA is the quantum analogue of the complexity class NP.

Due to the difficulty of the quantum marginal problem there is little hope for a general solution. But this is not the end of the research program: it is natural to study a “coarse-grained” variant. Quantum entropies are functions of the marginals and seem to be amenable to analytic insight.

A third motivation to study multi-particle entropies comes from the very field where researchers defined the first entropies, that is from thermodynamics. The strong subadditivity inequality [39] of the von Neumann entropy, for example, has applications in quantum thermodynamics. In one of these applications it is used to prove that the mean entropy of the equilibrium state of an arbitrary quantum system exists in the thermodynamic limit [38, 56], underpinning the correctness of the mathematical formalism used to explicitly take the latter. The application of information theoretic tools in thermodynamics is possible because the respective notions of entropy are mathematically identical and also physically closely related [37, 19].

The applications of strong subadditivity suggest that further results in the direction of understanding quantum entropies of multi-particle systems could lead to thermodynamic insights as well.

0.2 Goals and Results

The general program pursued in this thesis, i.e. understanding multiparticle quantum entropies, is not new. Several experienced researchers have worked on it before [39, 52, 42, 16, 17, 8, 32, 41, 28, 40]. Progress, however, has been scarce. In that sense, the problem of finding constraints on quantum entropies can be considered “hard” and it would be too much to ask for anything approaching a complete solution. We have therefore pursued a variety of very different approaches to the problem in order to gain partial insights. The overall goal of this thesis is not to solve the problem completely, but to determine which methods may prove useful. As a consequence, the results obtained in this thesis comprise a collection of relatively independent insights, rather than one “final theorem”. For the benefit of the reader, a list of these individual results is given below.

Global perspective.

A quantum state on an $n$-fold tensor product Hilbert space gives rise to $2^n$ entropies, one for each subset of subsystems. Collecting them in a real vector yields a point in the high-dimensional vector space $\mathbb{R}^{2^n}$. It turns out that the set of all such entropy vectors forms a convex cone [52]. The same is known to be true for the classical entropy region defined analogously [60]. In Chapter 3 some global properties of this geometric object are investigated:

  • Proposition 3.1 shows that the quantum entropy cone has a symmetry group that is strictly larger than the known symmetry group of its classical analogue.

  • Corollary 3.3 uses this symmetry to show that some known quantum information inequalities define facets of the quantum entropy cone, i.e. they are independent from all other (known and unknown) quantum information inequalities.

  • The classical entropy cone is known to have the property that all interesting information inequalities satisfy a number of linear relations [12]. Such information inequalities are called balanced. Corollary 3.7 and the preceding discussion clarify the geometric property underlying this result: The dual of the classical entropy cone has a certain direct sum structure. Theorem 3.9 proves a characterization of cones whose duals have this structure. Corollary 3.10 uses this theorem and the facets identified in Corollary 3.3 to show that the quantum entropy cone does not have this simpler structure and that therefore the result from [12] does not have a straightforward quantum analogue.
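The entropy vector described at the start of this paragraph can be made concrete numerically. The following is a minimal sketch and not part of the thesis: it assumes NumPy, measures entropies in bits, and uses hypothetical helper names (`partial_trace`, `von_neumann_entropy`, `entropy_vector`). It collects the von Neumann entropies of all marginals of a three-qubit GHZ state, one for each subset of subsystems.

```python
import itertools

import numpy as np


def partial_trace(rho, dims, keep):
    """Reduced state of `rho` on the subsystems listed in `keep`."""
    n = len(dims)
    keep = sorted(keep)
    traced = [i for i in range(n) if i not in keep]
    # One row index and one column index per subsystem.
    tensor = rho.reshape(list(dims) + list(dims))
    # Order: kept rows, kept columns, traced rows, traced columns.
    order = keep + [n + i for i in keep] + traced + [n + i for i in traced]
    d_keep = int(np.prod([dims[i] for i in keep], dtype=int))
    d_traced = int(np.prod([dims[i] for i in traced], dtype=int))
    tensor = tensor.transpose(order).reshape(d_keep, d_keep, d_traced, d_traced)
    # Sum over the diagonal of the traced-out subsystems.
    return np.einsum("abcc->ab", tensor)


def von_neumann_entropy(rho, tol=1e-12):
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(rho)
    eigenvalues = eigenvalues[eigenvalues > tol]
    return float(-np.sum(eigenvalues * np.log2(eigenvalues)))


def entropy_vector(rho, dims):
    """Map each subset of subsystems (as a tuple) to the entropy of its marginal."""
    n = len(dims)
    return {
        subset: von_neumann_entropy(partial_trace(rho, dims, subset))
        for size in range(n + 1)
        for subset in itertools.combinations(range(n), size)
    }


# Three-qubit GHZ state (|000> + |111>)/sqrt(2).
psi = np.zeros(8)
psi[0] = psi[7] = 1 / np.sqrt(2)
ghz_entropies = entropy_vector(np.outer(psi, psi), dims=(2, 2, 2))
for subset, entropy in ghz_entropies.items():
    print(subset, round(entropy, 6))
```

For this state every one- and two-party marginal is maximally mixed on a two-dimensional subspace, so those entries equal one bit, while the empty set and the full (pure) state have entropy zero up to rounding.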

Local perspective.

The most important points of a convex set are its extremal points. Here we study the local geometry of extremal rays, which are the cone analogues of extremal points. In particular, we characterize quantum states that have an entropy vector that lies on such an extremal ray.

  • Theorem 4.3 proves that all non-trivial states whose entropy vectors lie on an edge of the quantum entropy cone have the property that all marginal spectra are flat, i.e. that the reduced states have only one distinct nonzero eigenvalue. This is a very simple structure and narrows down the search for states on extremal rays tremendously.

  • Theorem 4.10 provides an analogous result for the classical entropy cone.

Variational perspective.

As the characterization of the whole quantum entropy cone for $n \geq 4$ parties has so far proved elusive, and even the classical entropy cone is far from characterized in this case, it seems reasonable to start by finding simpler inner approximations. This can be done by looking at a subset of states that has enough additional structure to allow for a direct algebraic characterization of the possible entropy vectors.

One subset that allows for such an algebraic approach is the set of stabilizer states [40, 28]. In Chapter 5 the results from [28] and [40] are improved:

  • Corollary 5.12 states that under a technical assumption on the local Hilbert space dimension, entropies from stabilizer states satisfy an additional class of linear inequalities that governs the behavior of linear network codes, the linear rank inequalities. The result includes the important qubit case. This partially answers a question raised in [40] and shows that stabilizer codes behave similarly to classical linear codes from an entropic point of view.

Structural perspective.

For the classical entropy cone, [13] provides a remarkable characterization result: for a given entropy vector there exists a group and subgroups thereof such that the entropies are determined by the relative sizes of the subgroups. Considerable research effort has been directed at finding an analogous relation for quantum states [16, 17]. Chapter 6 is concerned with the result from [13] and possible quantum analogues:

  • Theorem 6.4 recasts the main result from [13] purely in terms of certain combinatorial objects known from mathematical statistics called type classes. This characterization is simpler in the sense that type classes are much simpler objects than finite groups.

  • Section 6.2.1 provides a connection between strings and certain representations of the symmetric group called permutation modules.

  • This formalism allows for representation-theoretic proofs of the Shannon-type information inequalities such as the strong subadditivity for the Shannon entropy (Proposition 6.8).

  • Section 6.2.3 gives a novel decomposition of into a direct sum of permutation representations of the direct product of symmetric groups . This allows for a simple argument why Theorem 6.4 does not have a direct quantum analogue.

  • As a byproduct, Theorem 6.9 gives a formula for the decomposition of Weyl modules restricted from the unitary to the symmetric group into irreducible representations of the latter.

0.3 Overview

After this introduction, two chapters introduce the mathematical and the physical and information-theoretic fundamentals, respectively. In Chapter 1 the relevant mathematical background is discussed, that is convex geometry, Lie groups and Lie algebras, the group algebra and representation theory. The section about representation theory is somewhat longer, as deeper results from that field are used in Chapters 5 and 6, whereas the other sections are mostly dedicated to introducing the concepts and fixing notation. Chapter 2 contains an introduction to the information-theoretic and some physical concepts, i.e. classical information theory, quantum information theory and some concepts from quantum mechanics. In this chapter the classical and quantum entropy cones, the main objects of study in this thesis, are introduced.

Chapter 3 is concerned with the convex geometry of entropy cones. Section 3.1 clarifies the symmetries of the quantum entropy cone. Section 3.2 is concerned with investigating the possibility of generalizing a result from classical information theory [12]. The last short section in this chapter, Section 3.3, reviews a class of maps between entropy cones of different dimensions introduced in [31] and presents them in a more accessible way using the cone morphism formalism.

Chapter 4 takes a complementary approach to characterizing the quantum entropy cone: while Chapter 3 investigates global properties by looking at symmetry operations, this chapter is concerned with the local geometry of extremal rays. In Section 4.2 the classical case is investigated with the techniques developed for the quantum case.

Chapter 5 introduces the set of stabilizer states and their description by a finite phase space. The independently obtained result in [40] and [28] is also strengthened by partly answering a question posed in [28].

Chapter 6 is concerned with the representation-theoretic point of view on the quantum marginal problem and quantum information inequalities introduced in [16]. The classical result that inspired the research in this direction, [13], is reviewed and reformulated in a more information-theoretic way using type classes.

A quantum analogue of the construction is attempted, but only succeeds for the trivial case .

0.4 Conventions

The following conventions and notations are used in this thesis:

  • is the set of integers including zero

  • $\log$ is the logarithm with base two, otherwise the base is specified as in $\log_b$; $\ln$ is the natural logarithm

  • In a topological space, given a set I denote its closure by .

  • For a subset the complement of a is denoted by . If is a singleton, I write instead of .

  • or means “define to be equal to ”, means “ is equal to

Chapter 1 Mathematical Background

1.1 Convex Geometry

As already mentioned in the introduction, convex geometry plays an important role in the characterization efforts for classical and quantum joint entropies. In particular, the notion of a convex cone is important when investigating joint entropies, as both the set of Shannon entropy vectors of all n-partite probability distributions and the set of von Neumann entropy vectors of all n-partite quantum states, which I will define in Sections 2.1.2 and 2.2.2 respectively, can be proven to form convex cones up to topological closure. In this chapter I introduce some basic notions of convex geometry. A more careful introduction can be found for example in [4].

Roughly speaking, a body is convex, if it has neither dents nor holes. Mathematically, let us make the following

Definition 1.1 (Convex Set).

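Let $V$ be a real vector space. A subset $B \subset V$ is called convex, if

$$\lambda x + (1 - \lambda) y \in B \quad \text{for all } x, y \in B,\ \lambda \in [0, 1]. \tag{1.1}$$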

The concept that is most important for this thesis among the ones introduced in this section is the (convex) cone. We define a cone to be a convex set that is invariant under positive scaling, i.e.

Definition 1.2 (Cone).

Let $V$ be a real vector space. A convex subset $C \subset V$ is a cone, if

$$\lambda x \in C \quad \text{for all } x \in C,\ \lambda \in \mathbb{R}_{>0}. \tag{1.2}$$

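For an arbitrary subset $A \subset V$ we define the convex hull

$$\mathrm{conv}(A) = \Big\{ \sum_{i=1}^{k} \lambda_i x_i \,\Big|\, k \in \mathbb{N},\ x_i \in A,\ \lambda_i \geq 0,\ \sum_{i=1}^{k} \lambda_i = 1 \Big\} \tag{1.3}$$

as the smallest convex set that contains the original one and analogously the conic hull $\mathrm{cone}(A)$. Simple examples of cones are the open and the closed quadrants in $\mathbb{R}^2$, the open and the closed octants in $\mathbb{R}^3$ or the eponymous one, shown in Figure 1.1.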

A face of a convex set is, roughly speaking, a flat part of its boundary, or, mathematically precisely put,

Definition 1.3 (Face).

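Let $V$ be a real vector space and $B \subset V$ convex. A face of $B$ is a subset $F \subset \overline{B}$ of its closure such that there exists a linear functional $f: V \to \mathbb{R}$ and a number $\alpha \in \mathbb{R}$ with

$$F = \{x \in \overline{B} \mid f(x) = \alpha\} \quad \text{and} \quad f(x) \geq \alpha \ \text{for all } x \in B. \tag{1.4}$$

If the face is a singleton, $F = \{x\}$, then $x$ is called an exposed point. A face $F$ is called proper if $\emptyset \neq F \neq \overline{B}$. If there is no proper face that contains a face $F$ except for $F$ itself, we call $F$ a facet.

Note that for a proper face of a cone one always has $\alpha = 0$.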

A base of a cone is a minimal convex set that generates upon multiplication with , i.e.

Definition 1.4 (Base).

Let be a cone. A base of is a convex set such that for each there exist unique and with .

A ray is a set of the form $\{\lambda v \mid \lambda > 0\}$ for a vector $v \in V \setminus \{0\}$. A cone contains each ray that is generated by one of its elements, and there is a natural bijection between the set of rays and any base. That motivates the definition of extremal rays, which correspond to extremal points of any base:

Definition 1.5 (Extremal Ray).

Let be a convex cone. A ray is called extremal, if for each and each sum decomposition with we have . We denote the set of extremal rays of by .

In convex geometry duality is an important concept. Instead of describing which points are in a convex set, one can give the set of affine inequalities that are fulfilled by all points in the convex set. By inequality we always mean statements involving the non-strict relations $\leq$ and $\geq$. If a certain affine inequality is valid for all elements of a cone, then also its homogeneous, i.e. linear, version holds. By fixing the exclusive usage of either $\leq$ or $\geq$, any linear inequality on a vector space $V$ can be described by an element of the dual space $V^*$. We adopt the convention to use $\geq$ and define the

Definition 1.6 (Dual Cone).

Let $V$ be a real vector space and $V^*$ its dual space. Let $C \subset V$ be a convex cone. The dual cone $C^* \subset V^*$ is defined by

$$C^* = \{f \in V^* \mid f(x) \geq 0 \ \text{for all } x \in C\}. \tag{1.5}$$

If $V = \mathbb{R}^n$ we have $V^* \cong \mathbb{R}^n$ and hence $C^* \subset \mathbb{R}^n$ via the standard inner product in $\mathbb{R}^n$. The extremal rays of the dual cone are exactly the ones corresponding to facets.
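For a cone that is the conic hull of finitely many vectors, membership in the dual cone can be tested on the generators alone, because a linear functional that is non-negative on the generators is non-negative on all of their non-negative combinations. The following is a minimal sketch under that assumption, using NumPy and identifying functionals with vectors via the standard inner product; the function name `in_dual_cone` and the example cone are illustrative choices, not taken from the thesis.

```python
import numpy as np


def in_dual_cone(functional, generators, tol=1e-12):
    """Check f(x) >= 0 on every generator, which suffices by linearity."""
    return bool(np.all(generators @ functional >= -tol))


# Rows generate the cone C = {a*(1,0) + b*(1,1) : a, b >= 0} in R^2.
generators = np.array([[1.0, 0.0], [1.0, 1.0]])

print(in_dual_cone(np.array([0.0, 1.0]), generators))   # non-negative on both generators
print(in_dual_cone(np.array([1.0, -1.0]), generators))  # non-negative on both generators
print(in_dual_cone(np.array([1.0, -2.0]), generators))  # negative on the generator (1, 1)
```

The first two functionals each vanish on one generator, so they describe the two bounding lines of this two-dimensional cone.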

Convex sets come in different shapes, e.g. a disc is convex, as is a triangle. An important difference between the two is that the latter is described by finitely many faces or finitely many extremal points.

Definition 1.7 (Polyhedron, Polytope).

Let $V$ be a real vector space. A convex set $P \subset V$ is called a polyhedron, if it is the intersection of finitely many halfspaces, i.e. there exist a finite number of functionals $f_1, \dots, f_k \in V^*$ and a real number $\alpha_i$ for each functional such that

$$P = \{x \in V \mid f_i(x) \geq \alpha_i,\ i = 1, \dots, k\}. \tag{1.6}$$

If $P$ is, in addition, compact, it is called a polytope.

We call a cone polyhedral, if it is a polyhedron.

As the classical and quantum entropy cones are cones only after taking the topological closure, I reproduce a characterization result for such sets here. Adopting the notions in [52], we say a subset $A \subset V$ of a vector space $V$ with $\dim V < \infty$ is additive, if $A + A \subset A$, and a set $A$ is said to be approximately diluable if for all $\epsilon > 0$ there exists a $\delta > 0$, such that for all $a \in A$ and $0 \leq \lambda \leq \delta$ there exists a $b \in A$ such that $\|\lambda a - b\| \leq \epsilon$. Note that, as we are talking about finite dimensional vector spaces, all norms are equivalent, so we do not have to specify one. It turns out that a set that is additive and approximately diluable turns into a cone after taking the closure:

Proposition 1.8 ([52]).

Let $V$ be a real vector space and $A \subset V$ additive and approximately diluable. Then $\overline{A}$ is a convex cone.

Figure 1.1: A cone.

To investigate relations between different cones and to find their symmetries, we would like to introduce a class of maps between vector spaces containing cones that preserves their structure. The set of maps will be a subset of the homomorphisms of the ambient vector spaces that map cone points to cone points, i.e.

Definition 1.9 (Cone Morphism).

Let , be real vector spaces, , cones. A map is called cone morphism, if

(1.7)

If is injective and , is called a cone isomorphism, in this case and are called isomorphic. The set of all such cone morphisms is denoted by .

Remark 1.10.

The notion of a cone isomorphism introduced here coincides with the notion of an order isomorphism in the theory of ordered vector spaces.

Note that a cone isomorphism is not always a vector space isomorphism. However, if , then a cone isomorphism is also a vector space isomorphism. A cone homomorphism naturally induces a cone homomorphism by pulling back functionals via , i.e. . If we look at an arbitrary linear map we get a new cone in from a cone , that is . Can we express by and ? We calculate

(1.8)

where is the set-valued inverse of the adjoint of .

1.2 Groups and Group Algebras

The following section is dedicated to a concise introduction to Lie groups and group algebras, as they may not be familiar to all readers, and to fixing notation for the subsequent chapters. A reference for a more extensive introduction that is still focused on representation theory is [25].

1.2.1 Lie Groups

A Lie group is, roughly speaking, a Group that also is a -manifold and in which the group structure is smooth with respect to differentiation on the manifold. Recall the definition of a

Definition 1.11 (Manifold).

A -manifold is a topological space (Hausdorff, paracompact) with the following properties:

  1. There exists a dimension such that for all there is an open neighborhood of and a homeomorphism called chart.

  2. For two such maps, and with , the map is or smooth.

In this text all manifolds are . A map between manifolds is called smooth, if is smooth, where are charts on () respectively. With this in mind we can go forward and define a

Definition 1.12 (Lie Group).

A Lie Group is a group with the additional property that is a manifold and the maps and are smooth. Note that the is equipped with the obvious manifold structure.

Lie groups can be characterized by manifold properties such as connected, simply connected or compact, and by group properties such as simple or Abelian. A manifold can be a complicated object, but we can always map its local properties to its tangent space. The same can be done with the group structure of a Lie group. This motivates the definition of a

Definition 1.13 (Lie Algebra).

A Lie algebra is a vector space over a field with characteristic , with a bilinear map called Lie bracket, which fulfills the following properties:

  1. it is alternating, that is

  2. it fulfills the Jacobi identity

A representation of a Lie algebra is vector space homomorphism which maps the Lie bracket to the commutator:

(1.9)

The tangent space of a Lie group has a natural bilinear map of this form. To construct it we write down the conjugation map

(1.10)

and differentiate in both arguments at . The resulting bilinear map makes the tangent space a Lie algebra, as can easily be checked for the special case of being a subgroup of , the only case we will be dealing with. In that case the Lie bracket is the commutator.

The important fact about the Lie algebra of a Lie group is that it contains the essential part of the group structure, in the sense that each representation of a Lie group defines a representation of its Lie algebra, and each representation of its Lie algebra defines a representation of the universal cover of the connected component of the identity. Usually the Lie groups are real manifolds, and therefore they have real Lie algebras, but often it is simpler to have a complex algebra, especially because $\mathbb{C}$ is algebraically closed. A helpful fact is that, given a real Lie algebra $\mathfrak{g}$, a representation of the complexified Lie algebra $\mathfrak{g}_{\mathbb{C}}$ is irreducible if and only if the corresponding representation of the real Lie algebra is irreducible.

The connection between Lie group and Lie algebra is even more explicit. An element of the Lie Algebra generates a one parameter subgroup of : we just find a smooth curve with and and define

(1.11)

In matrix Lie groups/algebras this coincides with the matrix exponential.
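For matrix groups the statement can be checked numerically. The sketch below is an illustration under stated assumptions rather than the construction used in the thesis: it approximates the matrix exponential by a truncated power series (the helper `matrix_exponential` and the truncation order are choices made here), takes the generator of plane rotations as the Lie algebra element, and verifies that $t \mapsto \exp(tX)$ yields rotation matrices and satisfies the group law of a one-parameter subgroup.

```python
import numpy as np


def matrix_exponential(X, terms=40):
    """Truncated power series sum_k X^k / k! (adequate for small matrices of moderate norm)."""
    result = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        result = result + term
    return result


# Generator of rotations in the plane, an element of the Lie algebra so(2).
generator = np.array([[0.0, -1.0], [1.0, 0.0]])


def one_parameter_subgroup(t):
    """Group element exp(t * generator)."""
    return matrix_exponential(t * generator)


s, t = 0.4, 1.1
rotation = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
# exp(t X) is the rotation by the angle t.
print(np.allclose(one_parameter_subgroup(t), rotation))
# Group law: exp((s + t) X) = exp(s X) exp(t X).
print(np.allclose(one_parameter_subgroup(s + t),
                  one_parameter_subgroup(s) @ one_parameter_subgroup(t)))
```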

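An important representation of a Lie algebra is the adjoint representation. A Lie algebra $\mathfrak{g}$ acts on itself by means of the bracket, i.e.

$$\mathrm{ad}: \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g}), \quad X \mapsto \mathrm{ad}_X = [X, \cdot\,]. \tag{1.12}$$

The Jacobi identity ensures that the bracket is preserved under this vector space homomorphism, so it really is a representation of $\mathfrak{g}$.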
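The homomorphism property of the adjoint representation can be verified numerically for matrix Lie algebras, where the bracket is the commutator. The following sketch assumes NumPy and represents the map $Y \mapsto [X, Y]$ as a matrix acting on the row-major flattening of $Y$; the function names and the random test matrices are illustrative choices.

```python
import numpy as np


def commutator(X, Y):
    """Lie bracket of a matrix Lie algebra."""
    return X @ Y - Y @ X


def ad(X):
    """Matrix of Y -> [X, Y] acting on the row-major flattening of Y."""
    identity = np.eye(X.shape[0])
    # Row-major vectorization: vec(X Y) = (X kron I) vec(Y), vec(Y X) = (I kron X^T) vec(Y).
    return np.kron(X, identity) - np.kron(identity, X.T)


rng = np.random.default_rng(0)
X, Y, Z = (rng.standard_normal((3, 3)) for _ in range(3))

# ad(X) really implements the bracket with X.
acts_as_bracket = np.allclose(ad(X) @ Z.flatten(), commutator(X, Z).flatten())
# ad maps the bracket to the commutator: ad([X, Y]) = [ad(X), ad(Y)].
preserves_bracket = np.allclose(ad(commutator(X, Y)), commutator(ad(X), ad(Y)))
print(acts_as_bracket, preserves_bracket)
```

The second check is exactly the Jacobi identity written in terms of the matrices `ad(X)` and `ad(Y)`.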

1.2.2 Group Algebras

Let $G$ be a finite group and $\mathbb{C}G$ the free complex vector space over $G$. Then $\mathbb{C}G$ inherits the multiplication law from $G$, which makes it an associative unital algebra:

$$\Big(\sum_{g \in G} a_g\, g\Big) \Big(\sum_{h \in G} b_h\, h\Big) = \sum_{g, h \in G} a_g b_h\, gh. \tag{1.13}$$

$\mathbb{C}G$ is usually equipped with the standard inner product of $\mathbb{C}^{|G|}$ rescaled by the size of $G$:

$$(a, b) = \frac{1}{|G|} \sum_{g \in G} \overline{a_g}\, b_g. \tag{1.14}$$

The free complex vector space over any set $S$ is nothing else but the vector space of complex functions on $S$, so we can also view elements of the group algebra as complex functions on $G$. A projection in $\mathbb{C}G$ is an element $p$ with $p^2 = p$. A projection is called minimal, if it cannot be decomposed into a sum of two projections. The concept of a group algebra generalizes in a straightforward way to compact Lie groups, where the sums over $G$ have to be replaced by integrals with respect to the invariant Haar measure that assigns the volume 1 to $G$.
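As a small illustration of these notions, the sketch below models the group algebra of the symmetric group $S_3$ in plain Python. It rests on assumptions made here: elements of the algebra are stored as dictionaries from permutations to exact rational coefficients, the multiplication is the bilinear extension of the group law, a projection is taken to be an idempotent element, and the inner product is the standard one divided by the group order. All names (`compose`, `multiply`, `inner`, `e_trivial`, `e_sign`) are hypothetical.

```python
import itertools
from fractions import Fraction

S3 = list(itertools.permutations(range(3)))


def compose(g, h):
    """Product gh of permutations: apply h first, then g."""
    return tuple(g[h[i]] for i in range(len(h)))


def sign(g):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(g[i] > g[j] for i in range(len(g)) for j in range(i + 1, len(g)))
    return -1 if inversions % 2 else 1


def multiply(a, b):
    """Product in the group algebra; elements are dicts {group element: coefficient}."""
    product = {g: Fraction(0) for g in S3}
    for g, a_g in a.items():
        for h, b_h in b.items():
            product[compose(g, h)] += a_g * b_h
    return product


def inner(a, b):
    """Inner product rescaled by 1/|G| (all coefficients here are real)."""
    return sum(a[g] * b[g] for g in S3) / len(S3)


# Normalized sums over the group, weighted by the trivial and the sign character.
e_trivial = {g: Fraction(1, 6) for g in S3}
e_sign = {g: Fraction(sign(g), 6) for g in S3}

print(multiply(e_trivial, e_trivial) == e_trivial)  # e_trivial is idempotent
print(multiply(e_sign, e_sign) == e_sign)           # e_sign is idempotent
print(set(multiply(e_trivial, e_sign).values()))    # coefficients of the mixed product
print(inner(e_trivial, e_sign))
```

Both elements square to themselves, their product vanishes, and they are orthogonal with respect to the rescaled inner product.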

1.3 Representation Theory

The basic results stated in this section can be found in textbooks like [26] and [25]. Given a group $G$ we can investigate homomorphisms $G \to GL(n, \mathbb{F})$ to the general linear group of the $n$-dimensional vector space over some field $\mathbb{F}$, which are called representations of $G$. More generally we write $\rho: G \to GL(V)$ for a representation on an arbitrary vector space $V$. The vector space on which the group acts is called the representation space. The representation is called a complex (real) representation if $\mathbb{F} = \mathbb{C}$ ($\mathbb{F} = \mathbb{R}$). In the context of quantum information theory we are almost exclusively concerned with complex representations, as quantum mechanics takes place in a complex Hilbert space (although Asher Peres once said that “…quantum phenomena do not occur in a Hilbert space, they occur in a laboratory.” [50], page 112). Any representation of a group $G$ can be extended by linearity to a representation of the group algebra $\mathbb{C}G$. Two representations $\rho_1: G \to GL(V_1)$ and $\rho_2: G \to GL(V_2)$ are considered equivalent if there exists a vector space isomorphism $A: V_1 \to V_2$ which acts as an intertwiner for the two representations:

$$A \rho_1(g) = \rho_2(g) A \quad \text{for all } g \in G. \tag{1.15}$$

A representation on $V$ is called irreducible if it has no non-trivial proper invariant subspaces, otherwise it is called reducible. A representation on $V$ is called completely reducible if $V = \bigoplus_i V_i$, where the $V_i$ are invariant subspaces and each $V_i$ is irreducible. All representations of finite groups are completely reducible. This result, which is built on the possibility of averaging over the group, also generalizes to compact Lie groups. In the sequel we do not always distinguish between a representation and its representation space. Given a group and a representation we say is a representation of and write if . An important tool in representation theory is Schur’s lemma, which characterizes the homomorphisms between two representations that commute with the action of the group:

Lemma 1.14 (Schur’s Lemma).

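Let $V$ and $W$ be irreducible representations of a group $G$, and let $\phi: V \to W$ be a vector space homomorphism that commutes with the action of $G$, i.e.

$$\phi(g v) = g\, \phi(v) \quad \text{for all } g \in G,\ v \in V. \tag{1.16}$$

Then either $\phi = 0$ or $\phi$ is an isomorphism. In particular, if $V = W$ is a complex representation then $\phi = \lambda \mathbb{1}$ for some $\lambda \in \mathbb{C}$.

Proof.

Observe that if $v \in \ker \phi$, then $\phi(g v) = g\, \phi(v) = 0$, i.e. $\ker \phi$ is an invariant subspace of $V$ which can, by the irreducibility of $V$, only be zero or $V$. This proves that $\phi$ is either $0$ or injective. Also $\mathrm{im}\, \phi$ is invariant, as $g\, \phi(v) = \phi(g v)$. This shows that $\phi$ is surjective, unless it is $0$. We conclude that $\phi$ is either $0$ or an isomorphism. For $V = W$, $\phi$ is an endomorphism of a vector space over the algebraically closed field $\mathbb{C}$, so it has an eigenvalue $\lambda$. Hence $\ker(\phi - \lambda \mathbb{1}) \neq 0$ and $\phi - \lambda \mathbb{1}$ commutes with the action of $G$, so by the first part of this proof $\phi - \lambda \mathbb{1} = 0$. ∎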

As an important corollary of this lemma, we find the multiplicity of an irreducible representation of a group in some representation being equal to the dimension of the space of -invariant homomorphisms i.e. of the space

(1.17)

Corollary 1.15.

Let be a representation of a finite group . Then

(1.18)

Proof.

Let be any element from . Then has the form

(1.19)

According to Schur’s lemma (Lemma 1.14)

(1.20)

with . Thus we have an obvious isomorphism

(1.21)

and the statement follows. ∎

The unitary representations of a finite group , somewhat surprisingly, provide us with an orthonormal basis of the group algebra. Here we prove a first part of this fact:

Theorem 1.16 (Schur Orthogonality Relations, Part I).

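Let $G$ be a finite group. Let $\alpha$ label the equivalence classes of irreducible representations of $G$ and pick a unitary representative $U^\alpha$ from each class. Then

$$\left(U^\alpha_{ij}, U^\beta_{kl}\right) = \frac{1}{d_\alpha}\, \delta_{\alpha\beta}\, \delta_{ik}\, \delta_{jl} \tag{1.22}$$

where $(\cdot, \cdot)$ is the inner product of the group algebra (1.14) and $d_\alpha$ is the dimension of $U^\alpha$.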

Proof.

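For two fixed unitary irreducible representations $\phi^\alpha$ on $V_\alpha$ and $\phi^\beta$ on $V_\beta$ we define for each linear map $A: V_\beta \to V_\alpha$ an associated element $\tilde{A} \in \mathrm{Hom}_G(V_\beta, V_\alpha)$ by

$$\tilde{A} = \frac{1}{|G|}\sum_{g\in G} \phi^\alpha(g)\, A\, \phi^\beta(g^{-1}). \tag{1.23}$$

If now $\{E_{jl}\}$ is the standard basis of the space of $d_\alpha \times d_\beta$-matrices, i.e. $(E_{jl})_{mn} = \delta_{jm}\delta_{ln}$, then

$$\big(\tilde{E}_{jl}\big)_{ik} = \frac{1}{|G|}\sum_{g\in G}\phi^\alpha_{ij}(g)\,\phi^\beta_{lk}(g^{-1}) = \frac{1}{|G|}\sum_{g\in G}\phi^\alpha_{ij}(g)\,\overline{\phi^\beta_{kl}(g)}, \tag{1.24}$$

where the second equality uses the unitarity of $\phi^\beta$. The right-hand side is the inner product of the matrix elements $\phi^\alpha_{ij}$ and $\phi^\beta_{kl}$.

According to Schur’s lemma (Lemma 1.14) we have $\tilde{A} = 0$ whenever $\alpha \neq \beta$. So with the above equation we already get $\langle \phi^\alpha_{ij}, \phi^\beta_{kl} \rangle = 0$ if $\alpha \neq \beta$.

For $\alpha = \beta$ we have $\tilde{A} = c\,\mathbb{1}$ by Schur’s lemma and $\mathrm{tr}\,\tilde{A} = \mathrm{tr}\,A$, hence $c = \mathrm{tr}\,A / d_\alpha$ and thus

$$\big(\tilde{E}_{jl}\big)_{ik} = \frac{\mathrm{tr}\,E_{jl}}{d_\alpha}\,\delta_{ik} = \frac{1}{d_\alpha}\,\delta_{jl}\,\delta_{ik}. \tag{1.25}$$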

This proves (1.22). ∎
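The orthogonality relations can be checked numerically on a small example. The following is a minimal sketch, assuming the group $S_3$, an inner product normalized by $1/|G|$, and three hand-built real orthogonal irreducible representations (trivial, sign, and the two-dimensional representation obtained by restricting permutation matrices to the complement of $(1,1,1)$). All function and variable names are illustrative and not taken from the thesis.

```python
from itertools import permutations
from math import sqrt

N = 3
G = list(permutations(range(N)))  # S_3; a permutation p maps i to p[i]

def sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(1 for i in range(N) for j in range(i + 1, N) if p[i] > p[j])
    return -1.0 if inversions % 2 else 1.0

# Orthonormal basis of the subspace of R^3 orthogonal to (1, 1, 1).
B = [[1 / sqrt(2), -1 / sqrt(2), 0.0],
     [1 / sqrt(6), 1 / sqrt(6), -2 / sqrt(6)]]

def standard(p):
    """2x2 orthogonal matrix of p: the permutation matrix restricted to span(B)."""
    # The permutation matrix sends e_c to e_{p[c]}, so <b_i, P b_j> = sum_c B[i][p[c]] * B[j][c].
    return [[sum(B[i][p[c]] * B[j][c] for c in range(N)) for j in range(2)]
            for i in range(2)]

IRREPS = {
    "trivial": lambda p: [[1.0]],
    "sign": lambda p: [[sign(p)]],
    "standard": standard,
}
DIM = {"trivial": 1, "sign": 1, "standard": 2}

def inner(a, i, j, b, k, l):
    """(1/|G|) * sum_g phi^a_ij(g) * phi^b_kl(g); all entries are real here."""
    return sum(IRREPS[a](g)[i][j] * IRREPS[b](g)[k][l] for g in G) / len(G)

def max_deviation():
    """Largest deviation from delta_ab * delta_ik * delta_jl / d_a over all indices."""
    worst = 0.0
    for a in IRREPS:
        for b in IRREPS:
            for i in range(DIM[a]):
                for j in range(DIM[a]):
                    for k in range(DIM[b]):
                        for l in range(DIM[b]):
                            expected = 1.0 / DIM[a] if (a, i, j) == (b, k, l) else 0.0
                            worst = max(worst, abs(inner(a, i, j, b, k, l) - expected))
    return worst

print(max_deviation())  # numerically zero if the relations hold
```

Because the chosen representatives are real, no complex conjugation appears in `inner`; for complex representatives one of the two factors would have to be conjugated.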

1.3.1 Restriction and Induction

Given a group $G$ with a representation $V$ and a subgroup $H \subset G$, it is straightforward to define a representation of $H$ on $V$ by restriction. For this representation we write $V\!\downarrow^G_H$. A little less obvious is the construction of a representation of $G$ from a representation of $H$. To define this recipe, called induction, we need the definition of a

Definition 1.17 (Transversal).

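Let $G$ be a group and $H \subset G$ a subgroup. A subset $T \subset G$ is called (left) transversal for $H$, if

  1. $G = \bigsqcup_{t \in T} tH$, i.e. the cosets $tH$, $t \in T$, are pairwise disjoint and cover $G$.

The above definition is equivalent to saying that a transversal for $H$ in $G$ contains exactly one element from each (left) coset. Let us now define the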

Definition 1.18 (Induced Representation).

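Let $G$ be a group, $H \subset G$ a subgroup and $\phi$ a representation of $H$ on $V$. Furthermore set $\phi(g) = 0$ for $g \notin H$ and fix and order a transversal $T = \{t_1, \dots, t_k\}$. Then we define the induced representation $\phi\!\uparrow_H^G$ on $V^{\oplus k}$ by the block matrices

$$\phi\!\uparrow_H^G(g) = \big(\phi(t_i^{-1}\, g\, t_j)\big)_{i,j = 1, \dots, k}. \tag{1.26}$$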

It is straightforward to verify that the induced representation is a representation and that induced representations corresponding to different transversals of the same subgroup are isomorphic. While easily explained in simple terms, the above construction is somewhat unsatisfactory because it first uses a transversal, and it has to be proven afterwards that the construction does not depend on it. This can be circumvented by giving the definition in terms of a generalized notion of tensor products.
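The transversal-based construction can be made concrete on a small case. The sketch below assumes $G = S_3$, the subgroup $H$ generated by one transposition, and the sign representation of $H$; it builds the block matrices with entries $\phi(t_i^{-1} g\, t_j)$ (zero outside $H$) and checks the homomorphism property. The helper names are hypothetical and chosen only for this illustration.

```python
from itertools import permutations

def compose(p, q):
    """Composition 'p after q': i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))           # S_3
H = [(0, 1, 2), (1, 0, 2)]                 # subgroup generated by the transposition (0 1)
PHI = {(0, 1, 2): 1, (1, 0, 2): -1}        # sign representation of H (one-dimensional)

def left_transversal(group, subgroup):
    """Pick one representative from each left coset g*H."""
    covered, reps = set(), []
    for g in group:
        if g not in covered:
            reps.append(g)
            covered.update(compose(g, h) for h in subgroup)
    return reps

T = left_transversal(G, H)

def induced(g):
    """Matrix with entries phi(t_i^{-1} g t_j), set to 0 when the argument is not in H."""
    return [[PHI.get(compose(inverse(ti), compose(g, tj)), 0) for tj in T] for ti in T]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

is_homomorphism = all(
    induced(compose(g, h)) == matmul(induced(g), induced(h)) for g in G for h in G
)
print(len(T), is_homomorphism)
```

Choosing a different transversal changes the matrices by a signed permutation of the blocks, which illustrates why the resulting representations are isomorphic.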

Definition 1.19 (Tensor Product).

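Let $R$ be a ring, $M$ a right $R$-module and $N$ a left $R$-module. Let $F$ be the free Abelian group over the Cartesian product $M \times N$ of $M$ and $N$ and define the subgroup $I \subset F$ generated by the set $A$,

$$A = \{(m + m', n) - (m, n) - (m', n),\ (m, n + n') - (m, n) - (m, n') \mid m, m' \in M,\ n, n' \in N\} \tag{1.27}$$
$$\phantom{A = {}} \cup \{(mr, n) - (m, rn) \mid m \in M,\ n \in N,\ r \in R\}. \tag{1.28}$$

Then

$$M \otimes_R N = F / I \tag{1.29}$$

is the $R$-tensor-product of $M$ and $N$. Whenever $M$ is also a left $S$-module for another ring $S$, $M \otimes_R N$ is a left $S$-module as well, and when $N$ is also a right $S'$-module for yet another ring $S'$, $M \otimes_R N$ is a right $S'$-module as well.

Note that this definition specializes to the usual definition of a tensor product between vector spaces if $R$ is a field.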

We are now in the position to give a transversal independent definition of the induced representation:

Definition 1.20 (Induced Representation, 2nd Definition).

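Let $\mathbb{K}$ be a field, $G$ be a group, $H \subset G$ a subgroup and $V$ a $\mathbb{K}$-representation of $H$. This is equivalent to stating that $V$ is a $\mathbb{K}H$-left-module. Then the induced representation is the $\mathbb{K}G$-left-module

$$V\!\uparrow_H^G = \mathbb{K}G \otimes_{\mathbb{K}H} V. \tag{1.30}$$

Note that $\mathbb{K}G$ is a $\mathbb{K}H$-bimodule for any subgroup $H$. To recover the transversal dependent construction, we choose a left-transversal $T$ and observe that the set $\{t \otimes v_i \mid t \in T\}$ generates $\mathbb{K}G \otimes_{\mathbb{K}H} V$ for any basis $\{v_i\}$ of $V$.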

1.3.2 Character Theory

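Character theory is a powerful means of analyzing group representations. Given a representation $\phi: G \to GL(V)$ of a finite group $G$, we define its character as the map (group algebra element)

$$\chi_V: G \to \mathbb{C}, \qquad \chi_V(g) = \mathrm{tr}\,\phi(g). \tag{1.31}$$

Note that, for the purpose of a clear definition of the character, we have temporarily reintroduced the distinction between the representation (the map $\phi$) and the representation space $V$. The characters are in the center of $\mathbb{C}G$, denoted by $Z(\mathbb{C}G)$, which follows from the fact that they are constant on conjugacy classes:

$$\chi_V(hgh^{-1}) = \mathrm{tr}\big(\phi(h)\,\phi(g)\,\phi(h)^{-1}\big) = \mathrm{tr}\,\phi(g) = \chi_V(g). \tag{1.32}$$

The characters of equivalent representations are identical, as the trace is basis independent and the transition to an equivalent representation can be viewed as a basis change. It follows directly from the Schur orthogonality relations, Theorem 1.16, that the characters $\chi_\alpha$ of the irreducible representations $V_\alpha$ are orthonormal in $\mathbb{C}G$, i.e.

$$\langle \chi_\alpha, \chi_\beta \rangle = \delta_{\alpha\beta}. \tag{1.33}$$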

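This provides us with a way of finding the multiplicity of an irreducible representation in a given representation that is far simpler than Corollary 1.15. If an arbitrary representation $V$ has a decomposition into irreducible representations

$$V \cong \bigoplus_\alpha V_\alpha^{\oplus m_\alpha}, \tag{1.34}$$

then its character is easily determined to be

$$\chi_V = \sum_\alpha m_\alpha\, \chi_\alpha, \tag{1.35}$$

and hence, using (1.33),

$$m_\alpha = \langle \chi_\alpha, \chi_V \rangle. \tag{1.36}$$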

All the above can be summarized by the statement that an equivalence class of representations is uniquely determined by its character and that the irreducible characters are orthonormal.
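As an illustration of how characters determine multiplicities, the following sketch decomposes the permutation representation of $S_3$ on $\mathbb{C}^3$, whose character counts fixed points. It assumes the inner product $\frac{1}{|G|}\sum_g \overline{\chi(g)}\psi(g)$ and uses the standard character table of $S_3$ (trivial, sign, and "number of fixed points minus one"), which is an input not stated in this section. Names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))  # S_3; a permutation p maps i to p[i]

def sign(p):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    """Character of the permutation representation on C^3."""
    return sum(1 for i, image in enumerate(p) if i == image)

# Irreducible characters of S_3 (all real-valued, so no conjugation is needed below).
IRREDUCIBLE_CHARACTERS = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fixed_points(p) - 1,
}

def inner(chi, psi):
    """Normalized inner product of two real class functions on G."""
    return Fraction(sum(chi(g) * psi(g) for g in G), len(G))

def multiplicities(chi_v):
    """Multiplicity of each irreducible character in the character chi_v."""
    return {name: inner(chi, chi_v) for name, chi in IRREDUCIBLE_CHARACTERS.items()}

print(multiplicities(fixed_points))
```

The result reflects that $\mathbb{C}^3$ splits into the invariant line spanned by $(1,1,1)$ and its two-dimensional complement.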

1.3.3 The Regular Representation

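Let $G$ be a group. Consider the action

$$g \cdot h = gh, \qquad g, h \in G, \tag{1.37}$$

extended linearly to the group algebra $\mathbb{C}G$ as a vector space. This is the regular representation. Let for now $G$ be finite. Using the theory of characters introduced in the last subsection we can analyze the regular representation. Its character is

$$\chi_R(g) = |G|\,\delta_{g,e}, \tag{1.38}$$

as $gh \neq h$ for all $h \in G$ if $g \neq e$. Explicitly calculating the inner product of $\chi_R$ with the irreducible characters $\chi_\alpha$ yields

$$\langle \chi_\alpha, \chi_R \rangle = \chi_\alpha(e) = d_\alpha, \tag{1.39}$$

where $d_\alpha = \dim V_\alpha$, which implies that the decomposition of the regular representation into a sum of irreducible representations is

$$\mathbb{C}G \cong \bigoplus_\alpha V_\alpha^{\oplus d_\alpha} \cong \bigoplus_\alpha V_\alpha \otimes V_\alpha^*. \tag{1.40}$$

The last expression reflects the fact that there is, in addition to the left action (1.37), a right action

$$g \cdot h = h g^{-1}, \qquad g, h \in G, \tag{1.41}$$

which commutes with the former. From the decomposition (1.40) we also get an explicit formula for the cardinality of the group in terms of the dimensions of its irreducible representations,

$$|G| = \sum_\alpha d_\alpha^2. \tag{1.42}$$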
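The statements about the regular representation can be verified directly for a small group. This is a sketch under the assumption that $G = S_3$ with its standard character table (an input not given in this section); it computes the regular character by counting the basis elements fixed by left multiplication and then reads off the multiplicities. All names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))  # S_3; a permutation p maps i to p[i]

def compose(p, q):
    """Composition 'p after q': i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if i == image)

IRREDUCIBLE_CHARACTERS = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fixed_points(p) - 1,
}

def regular_character(g):
    """Trace of left multiplication by g on the basis G of the group algebra."""
    return sum(1 for h in G if compose(g, h) == h)

def multiplicity(chi):
    """Normalized inner product of a real irreducible character with the regular character."""
    return Fraction(sum(chi(g) * regular_character(g) for g in G), len(G))

regular_multiplicities = {name: multiplicity(chi) for name, chi in IRREDUCIBLE_CHARACTERS.items()}
print(regular_multiplicities)
print(sum(m * m for m in regular_multiplicities.values()) == len(G))
```

Each irreducible representation appears with multiplicity equal to its dimension, so the squared multiplicities add up to the group order.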

We are now ready to prove part two of Theorem 1.16.

Theorem 1.21 (Schur orthogonality relations, Part II).

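Let $G$ be a finite group. Let $\alpha$ label the equivalence classes of irreducible representations of $G$ and pick a unitary representative $\phi^\alpha$ of dimension $d_\alpha$ from each class. Then the set of matrix elements $\{\phi^\alpha_{ij}\}$ is a basis of $\mathbb{C}G$, and the components don’t mix in the sense that

$$\phi^\alpha_{ij}\, \phi^\beta_{kl} = \frac{|G|}{d_\alpha}\,\delta_{\alpha\beta}\,\delta_{jk}\,\phi^\alpha_{il}. \tag{1.43}$$

Proof.

In Theorem 1.16 we already saw that $\{\phi^\alpha_{ij}\}$ is an orthogonal set and in particular linearly independent. But Equation 1.42 directly implies $|\{\phi^\alpha_{ij}\}| = \sum_\alpha d_\alpha^2 = |G| = \dim \mathbb{C}G$, hence $\{\phi^\alpha_{ij}\}$ is indeed a basis. For the last part of the theorem, let us calculate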

(1.44)

where for the second equality we used the properties of a unitary representation and for the third one we used Theorem 1.16. Using the orthonormal basis property proven above this implies the multiplication law (1.43). ∎

The last result implies that the group algebra of a finite group $G$ is isomorphic to the direct sum of the matrix algebras over the irreducible representation spaces $V_\alpha$,

$$\mathbb{C}G \cong \bigoplus_\alpha \mathrm{End}(V_\alpha), \tag{1.45}$$

for example via the isomorphism

(1.46)

where in the second line we fixed a set of unitary irreducible representations, or, equivalently, a basis for each $V_\alpha$, to choose a definite isomorphism. Can we explicitly find minimal projections of $\mathbb{C}G$ as well as of its center? According to Theorem 1.21 the diagonal elements $\phi^\alpha_{ii}$ of any irreducible unitary representation are proportional to projections, and in view of (1.45) they are also minimal. In view of the decomposition (1.40) they project onto a single copy of the corresponding irreducible representation with respect to the right action (1.41). The isomorphism (1.45) also implies, together with (1.33), that the set of irreducible characters $\chi_\alpha$ forms an orthonormal basis of the center $Z(\mathbb{C}G)$. The multiplication rule (1.43) from Theorem 1.21 implies furthermore that the irreducible characters must square to multiples of themselves; in fact, explicitly exploiting (1.43),

$$\chi_\alpha\, \chi_\beta = \frac{|G|}{d_\alpha}\,\delta_{\alpha\beta}\,\chi_\alpha, \qquad d_\alpha = \chi_\alpha(e), \tag{1.47}$$

and therefore

$$\pi_\alpha = \frac{d_\alpha}{|G|}\,\chi_\alpha \tag{1.48}$$

are the minimal central projections. Now consider an arbitrary representation $V$ of $G$ with decomposition into irreducible representations

$$V \cong \bigoplus_\alpha V_\alpha^{\oplus m_\alpha}. \tag{1.49}$$

As $\pi_\alpha$ in the group algebra projects onto the irreducible component $V_\alpha$, $\pi_\alpha$ acts on $V$ by projecting onto $V_\alpha^{\oplus m_\alpha}$.
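The action of the central projections on a concrete representation can be sketched as follows. The example assumes $G = S_3$ acting on $\mathbb{C}^3$ by permutation matrices and uses the standard (real) character table of $S_3$ as input; it forms $\frac{d}{|G|}\sum_g \chi(g)\,P(g)$ for each irreducible character and checks that the results are complementary projections. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

N = 3
G = list(permutations(range(N)))  # S_3; a permutation p maps i to p[i]

def sign(p):
    inversions = sum(1 for i in range(N) for j in range(i + 1, N) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if i == image)

def perm_matrix(p):
    """Permutation matrix sending the basis vector e_c to e_{p[c]}."""
    return [[1 if p[c] == r else 0 for c in range(N)] for r in range(N)]

# (dimension, character) for each irreducible representation of S_3.
CHARACTERS = {
    "trivial": (1, lambda p: 1),
    "sign": (1, sign),
    "standard": (2, lambda p: fixed_points(p) - 1),
}

def central_projection(name):
    """Matrix of (d/|G|) * sum_g chi(g) P(g) on C^3, with exact rational entries."""
    d, chi = CHARACTERS[name]
    proj = [[Fraction(0) for _ in range(N)] for _ in range(N)]
    for g in G:
        P = perm_matrix(g)
        for r in range(N):
            for c in range(N):
                proj[r][c] += Fraction(d * chi(g) * P[r][c], len(G))
    return proj

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def trace(a):
    return sum(a[i][i] for i in range(N))

for name in CHARACTERS:
    P = central_projection(name)
    print(name, matmul(P, P) == P, trace(P))
```

The trace of each projection equals the multiplicity times the dimension of the corresponding irreducible representation, so the sign representation, which does not occur in $\mathbb{C}^3$, yields the zero matrix.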

1.3.4 Irreducible Representations of $S_n$

We want to identify the irreducible representations of the symmetric group $S_n$, i.e. the permutation group of $n$ elements. To this end, we make use of the regular representation, as it contains all irreducible representations of a finite group. Let us first introduce an important tool called Young diagrams.

Given a partition $\lambda = (\lambda_1, \lambda_2, \dots)$ of $n$ into a sum of non-increasing numbers we can define the corresponding

Definition 1.22 (Young Diagram).

A Young diagram is a subset $\lambda \subset \mathbb{N}^2$ for which the following holds:

$$(i, j) \in \lambda \ \Rightarrow\ (k, l) \in \lambda \quad \text{for all } k \le i,\ l \le j. \tag{1.50}$$

Elements of a Young diagram are called boxes, subsets with constant first component are called columns, those with constant second component rows. For a Young diagram of $n$ boxes we write $\lambda \vdash n$, for a Young diagram of $n$ boxes and at most $d$ rows $\lambda \vdash (n, d)$.

The picture one should have in mind reading this definition is the one obtained by taking an empty box for each element of $\lambda$ and arranging them in a diagram such that the “origin” of $\mathbb{N}^2$ is in the upper left corner:

This example corresponds to a partition of , namely . We also write . A Young diagram filled with numbers is called a Young tableau:

The first kind is called standard, the second semistandard:

Definition 1.23.

A Young tableau of shape $\lambda$ is a Young diagram $\lambda$ with a number in each box. We also write . A standard Young tableau is a Young diagram of $n$ boxes that is filled with the numbers from 1 to $n$ such that numbers increase from left to right along each row and down each column. A semistandard Young tableau is a Young diagram filled with numbers which are nondecreasing along each row and increasing down each column. We write , and for the sets of Young tableaux filled with the numbers to , the standard Young tableaux and the semistandard Young tableaux with entries smaller than or equal to $d$, respectively.

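A standard Young tableau $T$ of shape $\lambda \vdash n$ defines two subgroups of $S_n$, one that permutes the entries of the rows, $R_T$, and one that permutes the entries of the columns, $C_T$. Define the group algebra elements

$$r_T = \sum_{\pi \in R_T} \pi, \qquad c_T = \sum_{\pi \in C_T} \mathrm{sgn}(\pi)\, \pi, \qquad e_T = r_T\, c_T. \tag{1.51}$$

The last one, $e_T$, is proportional to a minimal projection

$$p_T = \frac{d_\lambda}{n!}\, e_T, \tag{1.52}$$

where $d_\lambda$ is the dimension of the corresponding irreducible representation, and is called the Young symmetrizer. As a minimal projection, according to the discussion in Section 1.3.3, it projects onto a single copy of an irreducible representation in the decomposition of the group algebra with respect to the right action of $S_n$.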
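That the Young symmetrizer is proportional to a projection can be checked by direct computation in the group algebra. The following sketch assumes $n = 3$, represents group algebra elements as dictionaries from permutations to coefficients, and takes the symmetrizer to be the row sum times the signed column sum; the order of the two factors is a convention assumed here, and the tableau entries are 0-indexed. All names are illustrative.

```python
from itertools import permutations

N = 3
G = list(permutations(range(N)))  # S_3; a permutation p maps i to p[i]

def compose(p, q):
    """Composition 'p after q': i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    inversions = sum(1 for i in range(N) for j in range(i + 1, N) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def stabilizer(blocks):
    """Permutations that map every block (a set of entries) onto itself."""
    return [p for p in G if all({p[x] for x in block} == set(block) for block in blocks)]

def product(a, b):
    """Product in the group algebra; elements are dicts {permutation: coefficient}."""
    result = {}
    for p, x in a.items():
        for q, y in b.items():
            pq = compose(p, q)
            result[pq] = result.get(pq, 0) + x * y
    return {p: c for p, c in result.items() if c != 0}

def young_symmetrizer(tableau):
    """Row symmetrizer times signed column sum for a tableau given as a list of rows."""
    rows = tableau
    columns = [[row[j] for row in rows if j < len(row)] for j in range(len(rows[0]))]
    row_sum = {p: 1 for p in stabilizer(rows)}
    column_sum = {p: sign(p) for p in stabilizer(columns)}
    return product(row_sum, column_sum)

e_T = young_symmetrizer([[0, 1], [2]])   # shape (2, 1)
square = product(e_T, e_T)
print(square == {p: 3 * c for p, c in e_T.items()})
```

For the shape $(2,1)$ the proportionality factor is $3 = 3!/2$, the group order divided by the dimension of the two-dimensional irreducible representation.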

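Another way of constructing representations of $S_n$ is to stop after symmetrizing the rows of a Young tableau and to look at the corresponding representation, that is

$$M^\lambda = \mathrm{span}_{\mathbb{C}}\{T \mid T \text{ is a row-standard Young tableau of shape } \lambda \text{ with entries } 1, \dots, n\}, \tag{1.53}$$

where the action is defined by permuting the entries and then re-sorting the rows. Tableaux whose rows are ordered but whose columns need not be are called row-standard. This representation is called the permutation module. It can also be constructed in a different way. Each Young diagram $\lambda = (\lambda_1, \lambda_2, \dots) \vdash n$ defines a subgroup of $S_n$, the so-called Young subgroup $S_\lambda = S_{\lambda_1} \times S_{\lambda_2} \times \dots$, where $S_{\lambda_1}$ permutes the first $\lambda_1$ elements, $S_{\lambda_2}$ permutes the next $\lambda_2$ elements, etc. Then it is easy to verify that the permutation module is the representation of $S_n$ induced by the trivial representation of the corresponding Young subgroup $S_\lambda$; as a formula, $M^\lambda \cong \mathbb{1}\!\uparrow_{S_\lambda}^{S_n}$.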

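What is the relation between the permutation modules and the irreducible representations of $S_n$, which are also called Specht modules? The permutation module $M^\lambda$ contains the irreducible representation $S^\lambda$ exactly once, and otherwise only contains irreducible representations $S^\mu$ with $\mu \vdash n$ and $\mu \triangleright \lambda$ in the dominance order, i.e.

$$M^\lambda \cong S^\lambda \oplus \bigoplus_{\mu \triangleright \lambda} \left(S^\mu\right)^{\oplus K_{\mu\lambda}}. \tag{1.54}$$

The multiplicities $K_{\mu\lambda}$ are called Kostka numbers. These numbers have a simple combinatorial description: consider the set of semistandard Young tableaux with shape $\mu$ and content $\lambda$, i.e. the number $i$ appears exactly $\lambda_i$ times in the tableau. Then the Kostka number is the number of such tableaux,

$$K_{\mu\lambda} = \left|\{T \mid T \text{ semistandard Young tableau of shape } \mu \text{ and content } \lambda\}\right|. \tag{1.55}$$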
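The combinatorial description of the Kostka numbers translates into a short brute-force count. The sketch below enumerates all fillings of a given shape with a prescribed content and keeps the semistandard ones; it is meant for small examples only, and the function name `kostka` and its argument order (shape first, content second) are choices made for this illustration.

```python
from itertools import permutations

def kostka(shape, content):
    """Number of semistandard Young tableaux of the given shape and content.

    shape   -- non-increasing tuple of row lengths
    content -- tuple whose i-th entry says how often the number i + 1 appears
    """
    entries = [i + 1 for i, count in enumerate(content) for _ in range(count)]
    assert len(entries) == sum(shape), "shape and content must have the same size"
    total = 0
    for filling in set(permutations(entries)):   # distinct fillings, read row by row
        rows, position = [], 0
        for length in shape:
            rows.append(filling[position:position + length])
            position += length
        rows_ok = all(row[j] <= row[j + 1] for row in rows for j in range(len(row) - 1))
        columns_ok = all(rows[i][j] < rows[i + 1][j]
                         for i in range(len(rows) - 1)
                         for j in range(len(rows[i + 1])))
        if rows_ok and columns_ok:
            total += 1
    return total

# Dimension check for the permutation module with content (2, 1):
# sum over shapes of (Kostka number) * (number of standard tableaux of that shape).
shapes = [(3,), (2, 1), (1, 1, 1)]
dimension = sum(kostka(s, (2, 1)) * kostka(s, (1, 1, 1)) for s in shapes)
print(dimension)
```

With content $(1, \dots, 1)$ the function counts standard Young tableaux, so the final sum reproduces the dimension $3!/(2!\,1!) = 3$ of the permutation module for the partition $(2,1)$.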

1.3.5 Irreducible Representations of the Unitary Group

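The unitary group $U(d)$ is defined as the group of endomorphisms of $\mathbb{C}^d$ that leave the standard inner product invariant. It is a Lie group and we can easily find its Lie algebra. For $U = e^{A}$ with $A^\dagger = -A$, $U^\dagger U = e^{-A} e^{A} = \mathbb{1}$. Also $e^{tA}$ is not unitary for all $t \in \mathbb{R}$ if $A$ is not antihermitian, so the Lie algebra $\mathfrak{u}(d)$ is the set of antihermitian $d \times d$-matrices. If we look at an arbitrary (finite dimensional, unitary) irreducible representation $V$ of $U(d)$, we know that for the restriction to the Abelian subgroup

$$T = \{\mathrm{diag}(u_1, \dots, u_d) \mid u_i \in U(1)\} \cong U(1)^{d} \tag{1.56}$$

of diagonal unitaries (in some fixed basis), a Cartan subgroup, there is a basis of $V$ where it also acts diagonally. The holomorphic irreducible representations of $U(1)$ are just

$$u \mapsto u^{k} \tag{1.57}$$

for some $k \in \mathbb{Z}$, so to each basis vector $v$ there is a $\lambda \in \mathbb{Z}^d$ such that for all $u = \mathrm{diag}(u_1, \dots, u_d) \in T$ we have

$$u\, v = \prod_{i=1}^{d} u_i^{\lambda_i}\, v. \tag{1.58}$$

Such a vector is called a weight vector, and $\lambda$ is called its weight. This translates to a similar property in the Lie algebra. The restriction of the Lie algebra representation defined on $V$ to the Lie subalgebra $\mathfrak{t}$ corresponding to $T$ acts diagonally in the same basis; for $h = \mathrm{diag}(h_1, \dots, h_d) \in \mathfrak{t}$ we get

$$h\, v = \sum_{i=1}^{d} \lambda_i h_i\, v. \tag{1.59}$$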

This procedure of diagonalizing the action of a Cartan subgroup and the corresponding Cartan subalgebra can also be done for the adjoint representation (see Section 1.2.1). The weights that are encountered there are called roots, the vector space they live in is called root space. The structure of the root lattice generated by the roots captures the properties of the underlying Lie group.

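Let us now look at the complexified Lie algebra $\mathfrak{gl}(d) = \mathfrak{u}(d) \otimes \mathbb{C}$. The representations of a real Lie algebra and its complexification are in a one-to-one correspondence, see e.g. [9]. Define the standard basis of $\mathfrak{gl}(d)$ to be the set of matrices $E_{ij}$ which have a one at position $(i, j)$ and are zero elsewhere. This basis is an eigenbasis of the adjoint action defined in (1.12), because for diagonal $h = \mathrm{diag}(h_1, \dots, h_d)$

$$[h, E_{ij}] = (h_i - h_j)\, E_{ij}. \tag{1.60}$$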

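Using the matrices $E_{ij}$, $i \neq j$, which are the multidimensional analogues of the well-known ladder operators of $\mathfrak{su}(2)$, we can reconstruct the whole irreducible representation. Let us consider an ordering on the set of weights, for example the lexicographical order, that is

$$\lambda > \mu \ :\Leftrightarrow\ \lambda_i > \mu_i \ \text{ for the smallest index } i \text{ with } \lambda_i \neq \mu_i. \tag{1.61}$$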

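Note that this ordering is arbitrarily chosen. This is equivalent to choosing an irrational functional in the dual of the root space and thus totally ordering the roots. Looking at an arbitrary representation $V$ again, because $\dim V < \infty$ we can find a highest weight $\lambda$ and at least one corresponding weight vector $v_\lambda$. It turns out that this highest weight vector is unique. But first take a diagonal $h = \mathrm{diag}(h_1, \dots, h_d)$ and calculate

$$h\, E_{ij}\, v_\lambda = \big([h, E_{ij}] + E_{ij}\, h\big)\, v_\lambda = \Big(h_i - h_j + \sum_{k} \lambda_k h_k\Big)\, E_{ij}\, v_\lambda, \tag{1.62}$$

so either $E_{ij} v_\lambda$ is zero, or it is a weight vector for the weight $\lambda + \epsilon_{ij}$ where $(\epsilon_{ij})_k = \delta_{ik} - \delta_{jk}$. Because $\lambda$ is the highest weight, $E_{ij} v_\lambda = 0$ for $i < j$. For fixed $i < j$ look at the three Lie algebra elements

$$E_{ij}, \qquad E_{ji}, \qquad H_{ij} = E_{ii} - E_{jj} = [E_{ij}, E_{ji}]. \tag{1.63}$$

Then $\mathrm{span}\{E_{ij}, E_{ji}, H_{ij}\}$ is a Lie subalgebra isomorphic to the complexification of $\mathfrak{su}(2)$, as $[H_{ij}, E_{ij}] = 2E_{ij}$ and $[H_{ij}, E_{ji}] = -2E_{ji}$, so $v_\lambda$ generates a $(\lambda_i - \lambda_j + 1)$-dimensional representation of $\mathfrak{su}(2)$. In this fashion repeated application of the $E_{ji}$, $i < j$, yields a basis for the irreducible representation $V$.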
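The commutation relations behind the root and ladder-operator structure are easy to verify with explicit matrices. The following is a small sketch in the defining representation, assuming $d = 3$ and an arbitrarily chosen integer diagonal matrix; it checks that each matrix unit is an eigenvector of the adjoint action of a diagonal matrix and that a pair of matrix units together with their commutator satisfies the $\mathfrak{su}(2)$-type relations. Helper names are illustrative.

```python
D = 3

def unit(i, j):
    """Matrix unit with a one at position (i, j) and zeros elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(D)] for r in range(D)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(D)) for c in range(D)] for r in range(D)]

def subtract(a, b):
    return [[a[r][c] - b[r][c] for c in range(D)] for r in range(D)]

def scale(s, a):
    return [[s * a[r][c] for c in range(D)] for r in range(D)]

def commutator(a, b):
    return subtract(matmul(a, b), matmul(b, a))

diagonal_entries = [2, -1, 5]   # arbitrary sample values h_1, h_2, h_3
H = [[diagonal_entries[r] if r == c else 0 for c in range(D)] for r in range(D)]

# Each matrix unit is an eigenvector of ad(H) with eigenvalue h_i - h_j.
roots_ok = all(
    commutator(H, unit(i, j)) == scale(diagonal_entries[i] - diagonal_entries[j], unit(i, j))
    for i in range(D) for j in range(D)
)

def triple_ok(i, j):
    """Check the su(2)-type relations for the raising, lowering and diagonal element."""
    raising, lowering = unit(i, j), unit(j, i)
    diagonal = subtract(unit(i, i), unit(j, j))
    return (commutator(raising, lowering) == diagonal
            and commutator(diagonal, raising) == scale(2, raising)
            and commutator(diagonal, lowering) == scale(-2, lowering))

print(roots_ok, all(triple_ok(i, j) for i in range(D) for j in range(i + 1, D)))
```

Integer entries are used so that all comparisons are exact; the same relations hold for the antihermitian diagonal matrices of the Cartan subalgebra by linearity.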

1.3.6 Schur-Weyl Duality

In this section I briefly explain the Schur-Weyl duality theorem. A good introduction to this topic can, for example, be found in [15]. This will be important when I consider similar constructions in Chapter 6.

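Consider the tensor product space $(\mathbb{C}^d)^{\otimes n}$. The symmetric group $S_n$ has a natural unitary action on that space by permuting the tensor factors, i.e.

$$\pi\, (v_1 \otimes \dots \otimes v_n) = v_{\pi^{-1}(1)} \otimes \dots \otimes v_{\pi^{-1}(n)}. \tag{1.64}$$

And $U(d)$ acts via its tensor representation, i.e. for $U \in U(d)$,

$$U\, (v_1 \otimes \dots \otimes v_n) = U v_1 \otimes \dots \otimes U v_n. \tag{1.65}$$

Obviously the two actions commute. But even more is true: the subalgebras of $\mathrm{End}\big((\mathbb{C}^d)^{\otimes n}\big)$ generated by the two representations are each other’s commutants. The Schur-Weyl duality theorem then states that

$$(\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash (n, d)} S^\lambda \otimes V^d_\lambda, \tag{1.66}$$

where $S^\lambda$ is the representation of $S_n$ projected out by the central projection corresponding to the frame $\lambda$ and $V^d_\lambda$ is the representation of $U(d)$ with highest weight $\lambda$.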

Of course the representation of the group $S_n$ extends by linearity to a representation of the group algebra. Recall the definition of the Young symmetrizer $e_T$ corresponding to a standard Young tableau $T$ of shape $\lambda$. It projects onto a single vector in the representation $S^\lambda$ of $S_n$, so it projects onto a space of dimension equal to the multiplicity of $S^\lambda$ in the representation (1.64). More precisely, because $S^\lambda$ is paired with $V^d_\lambda$ in (1.66), $e_T$ actually projects onto a copy of $V^d_\lambda$.
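A simple consistency check of the Schur-Weyl decomposition is to compare dimensions on both sides. The sketch below relies on two standard facts that are assumed here rather than derived in this section: the dimension of the irreducible representation of the symmetric group for a given shape equals the number of standard Young tableaux of that shape, and the dimension of the irreducible representation of the unitary group equals the number of semistandard Young tableaux with entries at most $d$. The brute-force enumeration is only suitable for small $n$ and $d$, and all names are illustrative.

```python
from itertools import permutations, product

def partitions(n, max_rows, max_part=None):
    """Yield partitions of n as non-increasing tuples with at most max_rows parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    if max_rows == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, max_rows - 1, first):
            yield (first,) + rest

def is_semistandard(shape, filling):
    """Rows nondecreasing, columns strictly increasing; filling is read row by row."""
    rows, position = [], 0
    for length in shape:
        rows.append(filling[position:position + length])
        position += length
    rows_ok = all(row[j] <= row[j + 1] for row in rows for j in range(len(row) - 1))
    columns_ok = all(rows[i][j] < rows[i + 1][j]
                     for i in range(len(rows) - 1)
                     for j in range(len(rows[i + 1])))
    return rows_ok and columns_ok

def count_standard(shape):
    n = sum(shape)
    return sum(1 for f in permutations(range(1, n + 1)) if is_semistandard(shape, f))

def count_semistandard(shape, d):
    n = sum(shape)
    return sum(1 for f in product(range(1, d + 1), repeat=n) if is_semistandard(shape, f))

def schur_weyl_dimension(n, d):
    """Sum over shapes with n boxes and at most d rows of the product of both dimensions."""
    return sum(count_standard(s) * count_semistandard(s, d) for s in partitions(n, d))

for n, d in [(2, 2), (3, 2), (3, 3), (4, 2)]:
    print(n, d, schur_weyl_dimension(n, d), d ** n)
```

The two printed numbers agree in each case, as they must if the tensor power decomposes as stated.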

Chapter 2 Physical and Information Theoretical Background

2.1 Classical Information Theory

In the following chapter I first give a short introduction to the mathematical formalism of classical information theory. In the subsequent sections I introduce the Shannon entropy, investigate its basic properties, and describe the convex geometry framework used to describe joint and marginal entropies of a multipartite random variable. Finally, I give a short example of how characterization results for the entropy cone are useful in applications by elaborating on the connection to network coding.

In classical information theory states are modeled as measurable functions $X: \Omega \to \mathcal{X}$ called random variables, where $(\Omega, \Sigma, P)$ is a probability space and $\mathcal{X}$ is a discrete set called alphabet. Explaining the concept of a probability space at length lies beyond the scope of this thesis; an introduction can be found in [33]. In simple words, $\Omega$ is just a set, $\Sigma$ is a sigma algebra of measurable sets, and $P$ is a measure with the additional requirement that $P(\Omega) = 1$, called probability measure. As most of the following is concerned with random variables on finite alphabets, it is enough to know that $X$ is an object that outputs elements from this alphabet with fixed probabilities given by the corresponding probability distribution