Swiveled Rényi entropies

Frédéric Dupuis Faculty of Informatics, Masaryk University, Brno, Czech Republic    Mark M. Wilde Hearne Institute for Theoretical Physics, Department of Physics and Astronomy, Center for Computation and Technology, Louisiana State University, Baton Rouge, Louisiana 70803, USA
Abstract

This paper introduces “swiveled Rényi entropies” as an alternative to the Rényi entropic quantities put forward in [Berta et al., Phys. Rev. A 91, 022333 (2015)]. What distinguishes the swiveled Rényi entropies from the prior proposal of Berta et al. is an extra degree of freedom: an optimization over unitary rotations with respect to particular fixed bases (swivels). A consequence of this extra degree of freedom is that the swiveled Rényi entropies are ordered, which is an important property of the Rényi family of entropies. The swiveled Rényi entropies are, however, generally discontinuous at $\alpha = 1$ and do not converge to the von Neumann entropy-based measures in the limit as $\alpha \to 1$, instead bounding them from above and below. Particular variants reduce to known Rényi entropies, such as the Rényi relative entropy or the sandwiched Rényi relative entropy, but also lead to ordered Rényi conditional mutual informations and ordered Rényi generalizations of a relative entropy difference. Refinements of entropy inequalities such as monotonicity of quantum relative entropy and strong subadditivity follow as a consequence of the aforementioned properties of the swiveled Rényi entropies. Due to the lack of convergence at $\alpha = 1$, it is unclear whether the swiveled Rényi entropies would be useful in one-shot information theory, so that the present contribution represents partial progress toward this goal.

1 Introduction

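In 1961, Alfred Rényi defined a parametrized family of entropies now bearing his name, by relaxing one of the axioms that singles out the Shannon entropy [Rén61]. This led to both the $\alpha$-Rényi entropy $H_\alpha(p)$ and the $\alpha$-Rényi divergence $D_\alpha(p\|q)$, defined respectively for a parameter $\alpha \in (0,1) \cup (1,\infty)$ and probability distributions $p$ and $q$ as

\[ H_\alpha(p) \equiv \frac{1}{1-\alpha} \log \sum_x [p(x)]^{\alpha}, \tag{1.1} \]
\[ D_\alpha(p\|q) \equiv \frac{1}{\alpha-1} \log \sum_x [p(x)]^{\alpha} [q(x)]^{1-\alpha}, \tag{1.2} \]

where $\log$ denotes the natural logarithm here and throughout the paper. The Shannon entropy and relative entropy are recovered in the limit as $\alpha \to 1$:

\[ \lim_{\alpha \to 1} H_\alpha(p) = H(p) \equiv -\sum_x p(x) \log p(x), \tag{1.3} \]
\[ \lim_{\alpha \to 1} D_\alpha(p\|q) = D(p\|q) \equiv \sum_x p(x) \log \frac{p(x)}{q(x)}. \tag{1.4} \]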
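As a small numerical illustration of the classical definitions and their $\alpha \to 1$ limits, the following Python sketch evaluates the Rényi entropy and divergence of finite distributions. The function names and the example distributions are our own choices for illustration; the sketch assumes the standard forms $H_\alpha(p) = \frac{1}{1-\alpha}\log\sum_x p(x)^\alpha$ and $D_\alpha(p\|q) = \frac{1}{\alpha-1}\log\sum_x p(x)^\alpha q(x)^{1-\alpha}$ with natural logarithms.

```python
import math

def renyi_entropy(p, alpha):
    """H_alpha(p) = log(sum_x p(x)^alpha) / (1 - alpha), for alpha > 0, alpha != 1."""
    return math.log(sum(px ** alpha for px in p if px > 0)) / (1.0 - alpha)

def renyi_divergence(p, q, alpha):
    """D_alpha(p||q) = log(sum_x p(x)^alpha q(x)^(1-alpha)) / (alpha - 1).

    Illustrative restriction: q(x) > 0 wherever p(x) > 0.
    """
    total = sum(px ** alpha * qx ** (1.0 - alpha) for px, qx in zip(p, q) if px > 0)
    return math.log(total) / (alpha - 1.0)

def shannon_entropy(p):
    """H(p) = -sum_x p(x) log p(x)."""
    return -sum(px * math.log(px) for px in p if px > 0)

def relative_entropy(p, q):
    """D(p||q) = sum_x p(x) log(p(x) / q(x))."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

# Hypothetical example distributions.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

# Values on both sides of alpha = 1 approach the Shannon quantities.
for alpha in (0.9, 0.99, 0.999, 1.001, 1.01, 1.1):
    print(alpha, renyi_entropy(p, alpha), renyi_divergence(p, q, alpha))
print("limits:", shannon_entropy(p), relative_entropy(p, q))
```

In this classical setting the printed values approach the last line from both sides of $\alpha = 1$, which is the convergence property that the swiveled quantities discussed later generally lack.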

What began largely as a theoretical exploration ended up having many practical ramifications, especially in the contexts of information theory and statistics. For example, it is now well known that the Rényi entropies play a fundamental role in obtaining a sharpened understanding of the trade-off between communication rate, error probability, and number of resources in communication protocols, such as data compression and channel coding [Csi95, vEH14]. “Smoothing” the Rényi entropies [RW04] has also led to the development of “one-shot” information theory [Ren05, Tom12], with applications to cryptography.

Part of what makes the Rényi entropies so useful in applications is their properties: convergence to the Shannon and relative entropies in the limit as $\alpha \to 1$, monotonicity in the parameter $\alpha$, and additivity, in addition to others. The convergence to the Shannon and relative entropies ensures that, by taking this limit, one recovers asymptotic information-theoretic statements, such as the data compression theorem or the channel capacity theorem, from the more fine-grained statements. Monotonicity in the parameter $\alpha$ ensures that $H_\alpha$ gives more weight to low surprisal events for $\alpha > 1$ and vice versa for $\alpha < 1$, helping to characterize the aforementioned trade-off in information-theoretic settings. The additivity property implies that the Rényi entropies can simplify immensely when evaluated for memoryless stochastic processes.
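The monotonicity and additivity properties can be checked numerically in the classical case. The sketch below is only an illustration under our own choice of distributions and helper names: it verifies that the Rényi entropy is non-increasing in $\alpha$ on a small grid, and that it is additive on a product distribution, which is the mechanism behind the simplification for memoryless sources.

```python
import math

def renyi_entropy(p, alpha):
    """H_alpha(p) = log(sum_x p(x)^alpha) / (1 - alpha), for alpha > 0, alpha != 1."""
    return math.log(sum(px ** alpha for px in p if px > 0)) / (1.0 - alpha)

def product_distribution(p, q):
    """Joint distribution of two independent variables with marginals p and q."""
    return [px * qy for px in p for qy in q]

# Hypothetical example distributions and parameter grid.
p = [0.6, 0.3, 0.1]
q = [0.7, 0.3]
alphas = [0.25, 0.5, 0.75, 1.5, 2.0, 4.0]

# Monotonicity: H_alpha(p) does not increase as alpha grows.
values = [renyi_entropy(p, a) for a in alphas]
is_non_increasing = all(x >= y - 1e-12 for x, y in zip(values, values[1:]))

# Additivity: H_alpha(p x q) = H_alpha(p) + H_alpha(q) for every alpha.
additivity_gaps = [
    abs(renyi_entropy(product_distribution(p, q), a)
        - renyi_entropy(p, a) - renyi_entropy(q, a))
    for a in alphas
]

print(is_non_increasing, max(additivity_gaps))
```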

In light of the progress that the Rényi paradigm has brought to information theory, one is left to wonder if this could happen in more exotic settings, such as quantum information theory and/or for “multipartite” settings (here by multipartite, we mean three or more parties). This line of thought has led to the development of several non-commutative generalizations of the Rényi relative entropy in (1.2), which has in turn led to a sharpened understanding of several quantum information-theoretic tasks (see [CMW14, Tom15] and references therein) and refinements of the uncertainty principle [CBTW15]. As far as we are aware, the development of the multipartite generalization of the Rényi entropy in (1.1) is less explored, with the exception of a recent proposal [BSW15b] for a multipartite quantum generalization.

With the intent of developing either a multipartite classical or quantum generalization of (1.1), one might suggest after a moment’s thought to replace a quantity which features a linear combination of entropies by one with the same linear combination of Rényi entropies. However, this approach is unsatisfactory in at least two regards: properties of the original information measure are not preserved by doing so, and one is not guaranteed to have the powerful monotonicity in $\alpha$ property mentioned above. For example, take the case of the conditional mutual information of a tripartite density operator $\rho_{ABC}$, defined as

\[ I(A;B|C)_\rho \equiv H(AC)_\rho + H(BC)_\rho - H(C)_\rho - H(ABC)_\rho, \tag{1.5} \]

where $H(G)_\sigma \equiv -\operatorname{Tr}\{\sigma_G \log \sigma_G\}$ is the quantum entropy of a density operator $\sigma_G$ on system $G$. One of the most important properties of this quantity is that it is non-negative (known as strong subadditivity of quantum entropy [LR73a, LR73b]), and as a consequence, it is monotone non-increasing with respect to any quantum channel applied to the system $A$ [CW04] (by symmetry, the same is true for one applied to $B$). However, if we define a Rényi generalization of $I(A;B|C)_\rho$ as $H_\alpha(AC)_\rho + H_\alpha(BC)_\rho - H_\alpha(C)_\rho - H_\alpha(ABC)_\rho$, where $H_\alpha(G)_\sigma \equiv \frac{1}{1-\alpha} \log \operatorname{Tr}\{\sigma_G^\alpha\}$, then explicit counterexamples reveal that this Rényi generalization can be negative, monotonicity with respect to quantum channels need not hold, and neither does monotonicity in $\alpha$ [LMW13].
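To make the failure of non-negativity concrete, the following sketch evaluates the naive linear combination of Rényi entropies on a classical example with a trivial system $C$, so that the combination becomes $H_\alpha(A) + H_\alpha(B) - H_\alpha(AB)$. The distribution is our own construction for illustration and is not one of the counterexamples from [LMW13]; the helper names are hypothetical.

```python
import math

def renyi_entropy(p, alpha):
    """H_alpha(p) = log(sum_x p(x)^alpha) / (1 - alpha), for alpha > 0, alpha != 1."""
    return math.log(sum(px ** alpha for px in p if px > 0)) / (1.0 - alpha)

def shannon_entropy(p):
    return -sum(px * math.log(px) for px in p if px > 0)

def marginal(joint, index):
    """Marginal of a joint distribution stored as {(a, b): probability}."""
    out = {}
    for key, prob in joint.items():
        out[key[index]] = out.get(key[index], 0.0) + prob
    return list(out.values())

# Joint distribution of (A, B): half the weight on pairs (0, k), half on (k, 0).
n = 10
joint = {}
for k in range(1, n + 1):
    joint[(0, k)] = 1.0 / (2 * n)
    joint[(k, 0)] = 1.0 / (2 * n)

p_ab = list(joint.values())
p_a, p_b = marginal(joint, 0), marginal(joint, 1)

def naive_renyi_mutual_information(alpha):
    """Same linear combination as the mutual information, with Renyi entropies."""
    return renyi_entropy(p_a, alpha) + renyi_entropy(p_b, alpha) - renyi_entropy(p_ab, alpha)

shannon_mutual_information = (
    shannon_entropy(p_a) + shannon_entropy(p_b) - shannon_entropy(p_ab)
)

# The Shannon quantity is positive, while the naive alpha = 2 version is negative.
print(naive_renyi_mutual_information(2.0), shannon_mutual_information)
```

For this distribution the $\alpha = 2$ value equals $\log(80/121) < 0$, whereas the Shannon mutual information equals $\log 2$.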

To remedy these deficiencies, the authors of [BSW15b] put forward a general prescription for producing a Rényi generalization of a quantum information measure, with the aim of having the properties of the original measure retained while also satisfying the monotonicity in $\alpha$ property. The work in [BSW15b] was only partially successful in this regard. Continuing with our example of conditional mutual information, consider the following Rényi generalization [BSW15a]:

\[ I_\alpha(A;B|C)_\rho \equiv \frac{1}{\alpha-1} \log \operatorname{Tr}\left\{ \rho_{ABC}^{\alpha} \, \rho_{AC}^{(1-\alpha)/2} \rho_{C}^{(\alpha-1)/2} \rho_{BC}^{1-\alpha} \rho_{C}^{(\alpha-1)/2} \rho_{AC}^{(1-\alpha)/2} \right\}. \tag{1.6} \]

For the range of $\alpha$ treated in [BSW15a], the quantity $I_\alpha(A;B|C)_\rho$ is non-negative, monotone non-increasing with respect to quantum channels acting on the $B$ system, converges to $I(A;B|C)_\rho$ in the limit as $\alpha \to 1$, and is conjectured to obey the monotonicity in $\alpha$ property (with some numerical and analytical evidence in favor established) [BSW15a]. However, a proof of the monotonicity in $\alpha$ property for $I_\alpha(A;B|C)_\rho$ is still lacking. It is also an open question to determine whether $I_\alpha(A;B|C)_\rho$ is monotone non-increasing with respect to quantum channels acting on the $A$ system; this partially has to do with the fact that $I_\alpha(A;B|C)_\rho$ is not symmetric with respect to exchange of the $A$ and $B$ systems, unlike the conditional mutual information in (1.5).

2 Summary of results

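In this paper, we modify the recently proposed Rényi generalizations of quantum information measures from [BSW15b] by placing “swivels” in a given chain of operators. (A “swivel” is a coupling placed between two objects in a chain in order to allow for them to “swivel” about a given axis.) As an example of the idea, consider that we can rewrite the quantity in (1.6) in terms of the Schatten 2-norm as follows:

\[ I_\alpha(A;B|C)_\rho = \frac{1}{\alpha-1} \log \left\| \rho_{BC}^{(1-\alpha)/2} \rho_{C}^{(\alpha-1)/2} \rho_{AC}^{(1-\alpha)/2} \rho_{ABC}^{\alpha/2} \right\|_2^2. \tag{2.1} \]

The new idea is to modify this quantity to include swivels as follows:

\[ \frac{1}{\alpha-1} \log \max_{V_{\rho_{AC}} \in \mathbb{V}_{\rho_{AC}},\, V_{\rho_C} \in \mathbb{V}_{\rho_C}} \left\| \rho_{BC}^{(1-\alpha)/2} \rho_{C}^{(\alpha-1)/2} V_{\rho_C} \, \rho_{AC}^{(1-\alpha)/2} V_{\rho_{AC}} \, \rho_{ABC}^{\alpha/2} \right\|_2^2, \tag{2.2} \]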

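where $\mathbb{V}_\omega$ is the compact set of all unitaries $V_\omega$ commuting with the Hermitian operator $\omega$. Thus, the fixed eigenbases of $\rho_{AC}$ and $\rho_C$ act as swivels connecting adjacent operators in the operator chain above, such that the unitary rotations $V_{\rho_{AC}}$ and $V_{\rho_C}$ about these swivels are allowed. Of course, such swivels make no difference when the density operator $\rho_{ABC}$ and its marginals commute with each other (the classical case), or when the system $C$ is trivial, in which case the above quantity reduces to a Rényi mutual information:

\[ \frac{1}{\alpha-1} \log \max_{V_{\rho_A} \in \mathbb{V}_{\rho_A}} \left\| \rho_{B}^{(1-\alpha)/2} \rho_{A}^{(1-\alpha)/2} V_{\rho_A} \, \rho_{AB}^{\alpha/2} \right\|_2^2 \tag{2.3} \]
\[ = \frac{1}{\alpha-1} \log \operatorname{Tr}\left\{ \rho_{AB}^{\alpha} (\rho_A \otimes \rho_B)^{1-\alpha} \right\}. \tag{2.4} \]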

We mention that we were led to the definition in (2.2) as a consequence of the developments in [Wil15], in which similar swivels appeared in refinements of entropy inequalities such as monotonicity of quantum relative entropy and strong subadditivity.

The quantity in (2.2) satisfies some of the properties already established for in [BSW15a], which include non-negativity for and monotonicity with respect to quantum channels acting on the system. However, the extra degree of freedom in (2.2) allows us to prove that this swiveled Rényi conditional mutual information is monotone non-decreasing in for .

The swiveled Rényi entropies are in general discontinuous at $\alpha = 1$ and do not converge to the von Neumann entropy-based measures in the limit as $\alpha \to 1$. Thus, the present paper represents a work in progress toward the general goal of finding Rényi generalizations of quantum information measures that satisfy all of the properties that one would like to have. It remains an open question to find Rényi quantities that meet all of these desiderata.

The rest of the paper proceeds by developing this idea in detail. We review some background material in Section 3, which includes various quantum Rényi entropies and the Hadamard three-line theorem, the latter being the essential tool for establishing monotonicity in $\alpha$ for the swiveled Rényi entropies. We then focus in Section 4 on developing swiveled Rényi generalizations of the quantum relative entropy difference in (3.3), given that many different information measures can be written in terms of this relative entropy difference, including conditional mutual information (see, e.g., the discussions in [SBW15, Wil14, Wil15]). Our main contributions are Theorems 6 and 7, which state that these quantities are monotone non-decreasing in $\alpha$ for particular values. We then briefly discuss how refinements of entropy inequalities follow as a consequence of the properties of the swiveled Rényi entropies. Section 5 discusses swiveled Rényi conditional mutual informations and justifies that they possess the properties stated above. We extend the idea in Section 6 to establish swiveled Rényi generalizations of an arbitrary linear combination of von Neumann entropies with coefficients chosen from the set $\{-1, 0, 1\}$. We finally show in Section 7 how our methods can be used to address an open question posed in [Zha14]. Section 8 concludes with a summary and some open directions.

3 Preliminaries

3.1 Quantum states and channels

A quantum state is described mathematically by a density operator, which is a positive semi-definite operator with trace equal to one. A quantum channel is a linear, trace-preserving, completely positive map. For more background on quantum information theory, we refer to [NC10, Wil13]. Our results apply to finite-dimensional Hilbert spaces. For most developments, we take $\rho$, $\sigma$, and $\mathcal{N}$ to be as given in the following definition:

Definition 1

Let $\rho$ be a density operator acting on a finite-dimensional Hilbert space $\mathcal{H}$, $\sigma$ be a non-zero positive semi-definite operator acting on $\mathcal{H}$, and $\mathcal{N}$ be a quantum channel, taking operators acting on $\mathcal{H}$ to those acting on a finite-dimensional Hilbert space $\mathcal{K}$.

Sometimes we need more restrictions, in which case we take $\rho$, $\sigma$, and $\mathcal{N}$ as follows:

Definition 2

Let $\rho$, $\sigma$, and $\mathcal{N}$ be as given in Definition 1, with the additional restriction that $\rho$ and $\sigma$ are positive definite, and $\mathcal{N}$ is such that $\mathcal{N}(\rho)$ and $\mathcal{N}(\sigma)$ are also positive definite.

We employ the common convention that functions of Hermitian operators are evaluated on their support. In more detail, the support of a Hermitian operator $A$, written as $\operatorname{supp}(A)$, is defined as the vector space spanned by its eigenvectors whose corresponding eigenvalues are non-zero. Let an eigendecomposition of $A$ be given as $A = \sum_i a_i |i\rangle\langle i|$ for eigenvectors $\{|i\rangle\}$. Then $\operatorname{supp}(A) = \operatorname{span}\{|i\rangle : a_i \neq 0\}$. Let $\Pi_A$ denote the projection onto the support of $A$. A function $f$ of an operator $A$ is then defined as $f(A) = \sum_{i : a_i \neq 0} f(a_i) |i\rangle\langle i|$.
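The support convention can be illustrated with a minimal sketch restricted, for simplicity, to real symmetric $2 \times 2$ matrices, where the eigendecomposition has a closed form. The helper names `eig_sym2` and `apply_on_support` are hypothetical, and the numerical tolerance used to decide which eigenvalues count as zero is an assumption of this sketch.

```python
import math

def eig_sym2(a, b, d):
    """Eigenpairs (eigenvalue, unit eigenvector) of the real symmetric matrix [[a, b], [b, d]]."""
    theta = 0.5 * math.atan2(2.0 * b, a - d)   # rotation angle that diagonalises the matrix
    c, s = math.cos(theta), math.sin(theta)
    radius = math.hypot((a - d) / 2.0, b)
    mean = (a + d) / 2.0
    return [(mean + radius, (c, s)), (mean - radius, (-s, c))]

def apply_on_support(f, a, b, d, tol=1e-12):
    """f(A) = sum of f(a_i) |i><i| over the eigenvalues a_i that are non-zero."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for lam, v in eig_sym2(a, b, d):
        if abs(lam) > tol:  # eigenvectors outside the support do not contribute
            for row in range(2):
                for col in range(2):
                    out[row][col] += f(lam) * v[row] * v[col]
    return out

# Rank-one operator A = |+><+| = [[0.5, 0.5], [0.5, 0.5]], with eigenvalues 1 and 0.
inverse_on_support = apply_on_support(lambda x: 1.0 / x, 0.5, 0.5, 0.5)
support_projector = apply_on_support(lambda x: x ** 0, 0.5, 0.5, 0.5)
print(inverse_on_support, support_projector)
```

With this convention the inverse of a singular operator is its inverse on the support, and the zeroth power is the support projection rather than the identity.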

3.2 Entropies and norms

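Let $\rho$, $\sigma$, and $\mathcal{N}$ be as given in Definition 1. The quantum relative entropy [Ume62] is defined as

\[ D(\rho\|\sigma) \equiv \operatorname{Tr}\{\rho [\log \rho - \log \sigma]\} \tag{3.1} \]

whenever $\operatorname{supp}(\rho) \subseteq \operatorname{supp}(\sigma)$, and otherwise, it is defined to be equal to $+\infty$. The quantum relative entropy is monotone non-increasing with respect to quantum channels [Lin75, Uhl77], in the sense that

\[ D(\rho\|\sigma) \geq D(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)). \tag{3.2} \]

Another relevant information measure is the quantum relative entropy difference, defined as

\[ \Delta(\rho,\sigma,\mathcal{N}) \equiv D(\rho\|\sigma) - D(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)). \tag{3.3} \]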
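In the commuting (classical) case, the relative entropy difference can be evaluated directly. The sketch below is an illustration under our own assumptions: distributions play the role of $\rho$ and $\sigma$, a column-stochastic matrix `w` plays the role of the channel, and all names and numbers are hypothetical.

```python
import math

def relative_entropy(p, q):
    """D(p||q) = sum_x p(x) [log p(x) - log q(x)]; +inf if supp(p) is not inside supp(q)."""
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            if qx == 0:
                return math.inf
            total += px * (math.log(px) - math.log(qx))
    return total

def apply_channel(w, p):
    """Classical channel: w[y][x] is the probability of output y given input x."""
    return [sum(w_yx * px for w_yx, px in zip(row, p)) for row in w]

def relative_entropy_difference(p, q, w):
    """Classical analogue of D(rho||sigma) - D(N(rho)||N(sigma))."""
    return relative_entropy(p, q) - relative_entropy(apply_channel(w, p), apply_channel(w, q))

# Hypothetical inputs: two distributions and a noisy coarse-graining from 3 to 2 symbols.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]
w = [[0.9, 0.5, 0.1],
     [0.1, 0.5, 0.9]]
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

print(relative_entropy_difference(p, q, w))         # non-negative by monotonicity
print(relative_entropy_difference(p, q, identity))  # zero for the identity channel
```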

We can use the Schatten norms in order to establish Rényi generalizations of von Neumann entropies, which are more refined information measures for quantum states and channels that reduce to the von Neumann quantities in a limit. The Schatten $\alpha$-norm of an operator $A$ is defined as

\[ \|A\|_\alpha \equiv \left[ \operatorname{Tr}\{ |A|^\alpha \} \right]^{1/\alpha}, \tag{3.4} \]

where $\alpha \geq 1$ and $|A| \equiv \sqrt{A^\dagger A}$ (note that we sometimes use the notation $\|A\|_\alpha$ even for values $\alpha \in (0,1)$ when the quantity on the right-hand side of (3.4) is not a norm). From the above definition, we can see that the following equalities hold for any operators and :

(3.5)
(3.6)

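The quantum Rényi entropy of a state $\rho$ is defined for $\alpha \in (0,1) \cup (1,\infty)$ as

\[ H_\alpha(\rho) \equiv \frac{1}{1-\alpha} \log \operatorname{Tr}\{\rho^\alpha\} = \frac{\alpha}{1-\alpha} \log \|\rho\|_\alpha, \tag{3.7} \]

and reduces to the von Neumann entropy in the limit as $\alpha \to 1$:

\[ \lim_{\alpha \to 1} H_\alpha(\rho) = H(\rho) \equiv -\operatorname{Tr}\{\rho \log \rho\}. \tag{3.8} \]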
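Since $\operatorname{Tr}\{\rho^\alpha\}$ depends only on the eigenvalues of $\rho$, the quantum Rényi entropy of a qubit can be computed from its Bloch vector. The following sketch is an illustration with hypothetical function names; it uses the standard fact that a qubit state $\rho = (I + \vec{r} \cdot \vec{\sigma})/2$ has eigenvalues $(1 \pm |\vec{r}|)/2$, and it assumes the definition $H_\alpha(\rho) = \frac{1}{1-\alpha} \log \operatorname{Tr}\{\rho^\alpha\}$.

```python
import math

def qubit_eigenvalues(rx, ry, rz):
    """Eigenvalues of rho = (I + r . sigma) / 2 for a Bloch vector with |r| <= 1."""
    length = math.sqrt(rx * rx + ry * ry + rz * rz)
    return [(1.0 + length) / 2.0, (1.0 - length) / 2.0]

def quantum_renyi_entropy(eigs, alpha):
    """H_alpha(rho) = log(Tr rho^alpha) / (1 - alpha), evaluated on the support of rho."""
    return math.log(sum(lam ** alpha for lam in eigs if lam > 0)) / (1.0 - alpha)

def von_neumann_entropy(eigs):
    """H(rho) = -Tr(rho log rho), evaluated on the support of rho."""
    return -sum(lam * math.log(lam) for lam in eigs if lam > 0)

pure = qubit_eigenvalues(0.0, 0.0, 1.0)             # pure state: entropy zero
maximally_mixed = qubit_eigenvalues(0.0, 0.0, 0.0)  # entropy log 2 for every alpha
partly_mixed = qubit_eigenvalues(0.6, 0.0, 0.0)     # eigenvalues 0.8 and 0.2

for alpha in (0.5, 0.999, 1.001, 2.0):
    print(alpha, quantum_renyi_entropy(partly_mixed, alpha))
print("von Neumann:", von_neumann_entropy(partly_mixed))
```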

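There are at least two ways to generalize the quantum relative entropy, which we refer to as the Rényi relative entropy [Pet86a] and the sandwiched Rényi relative entropy [MLDS13, WWY14]. They are defined respectively as follows:

\[ D_\alpha(\rho\|\sigma) \equiv \frac{1}{\alpha-1} \log \operatorname{Tr}\{\rho^\alpha \sigma^{1-\alpha}\} \tag{3.9} \]
\[ = \frac{1}{\alpha-1} \log \left\| \sigma^{(1-\alpha)/2} \rho^{\alpha/2} \right\|_2^2, \tag{3.10} \]
\[ \widetilde{D}_\alpha(\rho\|\sigma) \equiv \frac{1}{\alpha-1} \log \operatorname{Tr}\left\{ \left( \sigma^{(1-\alpha)/2\alpha} \rho \, \sigma^{(1-\alpha)/2\alpha} \right)^\alpha \right\} \tag{3.11} \]
\[ = \frac{\alpha}{\alpha-1} \log \left\| \sigma^{(1-\alpha)/2\alpha} \rho^{1/2} \right\|_{2\alpha}^2, \tag{3.12} \]

if $\alpha \in (0,1)$ or if $\alpha \in (1,\infty)$ and $\operatorname{supp}(\rho) \subseteq \operatorname{supp}(\sigma)$. If $\alpha \in (1,\infty)$ and $\operatorname{supp}(\rho) \not\subseteq \operatorname{supp}(\sigma)$, then they are defined to be equal to $+\infty$. The rewritings in (3.10) and (3.12) are helpful for our developments in this paper and follow from (3.5)–(3.6) and the following: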

(3.13)
(3.14)
(3.15)
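As a short worked check of the rewritings in (3.10) and (3.12), assuming the standard definitions $D_\alpha(\rho\|\sigma) = \frac{1}{\alpha-1}\log\operatorname{Tr}\{\rho^\alpha\sigma^{1-\alpha}\}$ and $\widetilde{D}_\alpha(\rho\|\sigma) = \frac{1}{\alpha-1}\log\operatorname{Tr}\{(\sigma^{(1-\alpha)/2\alpha}\rho\,\sigma^{(1-\alpha)/2\alpha})^\alpha\}$, one can argue as follows. The symbol $X$ is introduced here only for this sketch; the argument uses cyclicity of the trace and the fact that $X^\dagger X$ and $X X^\dagger$ have the same non-zero eigenvalues.

```latex
\begin{align*}
\left\| \sigma^{(1-\alpha)/2} \rho^{\alpha/2} \right\|_2^2
  &= \operatorname{Tr}\left\{ \rho^{\alpha/2} \sigma^{1-\alpha} \rho^{\alpha/2} \right\}
   = \operatorname{Tr}\left\{ \rho^{\alpha} \sigma^{1-\alpha} \right\}, \\
\left\| X \right\|_{2\alpha}^{2\alpha}
  &= \operatorname{Tr}\left\{ (X^\dagger X)^{\alpha} \right\}
   = \operatorname{Tr}\left\{ (X X^\dagger)^{\alpha} \right\}
   = \operatorname{Tr}\left\{ \left( \sigma^{(1-\alpha)/2\alpha} \rho \, \sigma^{(1-\alpha)/2\alpha} \right)^{\alpha} \right\},
   \qquad X \equiv \sigma^{(1-\alpha)/2\alpha} \rho^{1/2}, \\
\frac{1}{\alpha-1} \log \left\| X \right\|_{2\alpha}^{2\alpha}
  &= \frac{\alpha}{\alpha-1} \log \left\| X \right\|_{2\alpha}^{2}.
\end{align*}
```

The first line gives the 2-norm form of the Rényi relative entropy, and the last two lines give the $2\alpha$-norm form of the sandwiched Rényi relative entropy.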

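Both Rényi generalizations reduce to the quantum relative entropy in the limit as $\alpha \to 1$ [Pet86a, MLDS13, WWY14]:

\[ \lim_{\alpha \to 1} D_\alpha(\rho\|\sigma) = \lim_{\alpha \to 1} \widetilde{D}_\alpha(\rho\|\sigma) = D(\rho\|\sigma). \tag{3.16} \]

The Rényi relative entropy is monotone non-increasing with respect to quantum channels when $\alpha \in [0,1) \cup (1,2]$ [Pet86a]:

\[ D_\alpha(\rho\|\sigma) \geq D_\alpha(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)), \tag{3.17} \]

and the sandwiched Rényi relative entropy possesses a similar monotonicity property when $\alpha \in [1/2,1) \cup (1,\infty]$ [FL13, Bei13]:

\[ \widetilde{D}_\alpha(\rho\|\sigma) \geq \widetilde{D}_\alpha(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)). \tag{3.18} \]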

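By picking particular values of the Rényi parameter $\alpha$, the quantities above take on special forms and have meaning in operational contexts, being known as the zero-relative entropy [Dat09], the collision relative entropy [DFW15], the min-relative entropy [DKF12], and the max-relative entropy [Dat09], respectively:

\[ D_0(\rho\|\sigma) = -\log \operatorname{Tr}\{\Pi_\rho \sigma\}, \tag{3.19} \]
\[ \widetilde{D}_2(\rho\|\sigma) = \log \operatorname{Tr}\left\{ \left( \sigma^{-1/4} \rho \, \sigma^{-1/4} \right)^2 \right\}, \tag{3.20} \]
\[ \widetilde{D}_{1/2}(\rho\|\sigma) = -\log F(\rho,\sigma), \tag{3.21} \]
\[ \widetilde{D}_\infty(\rho\|\sigma) = \log \left\| \sigma^{-1/2} \rho \, \sigma^{-1/2} \right\|_\infty, \tag{3.22} \]

where $F(\rho,\sigma) \equiv \| \sqrt{\rho} \sqrt{\sigma} \|_1^2$ is the quantum fidelity [Uhl76].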
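These special values can be checked numerically when $\rho$ and $\sigma$ commute, in which case both Rényi relative entropies reduce to the classical Rényi divergence of their eigenvalue distributions. The sketch below is an illustration under that commuting assumption, with hypothetical function names and example distributions; it uses the standard closed forms $-\log F$ at $\alpha = 1/2$, $\log \max_x p(x)/q(x)$ as $\alpha \to \infty$, and $-\log \sum_{x : p(x) > 0} q(x)$ as $\alpha \to 0$.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence; equals both quantum versions when rho and sigma commute."""
    total = sum(px ** alpha * qx ** (1.0 - alpha) for px, qx in zip(p, q) if px > 0)
    return math.log(total) / (alpha - 1.0)

def fidelity(p, q):
    """Classical fidelity F(p, q) = (sum_x sqrt(p(x) q(x)))^2."""
    return sum(math.sqrt(px * qx) for px, qx in zip(p, q)) ** 2

def max_relative_entropy(p, q):
    """log of the largest likelihood ratio p(x) / q(x)."""
    return math.log(max(px / qx for px, qx in zip(p, q) if px > 0))

def zero_relative_entropy(p, q):
    """-log of the weight that q assigns to the support of p."""
    return -math.log(sum(qx for px, qx in zip(p, q) if px > 0))

# Hypothetical example distributions; p_partial has smaller support than q.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
p_partial = [0.5, 0.5, 0.0]

print(renyi_divergence(p, q, 0.5), -math.log(fidelity(p, q)))
print(renyi_divergence(p, q, 200.0), max_relative_entropy(p, q))
print(renyi_divergence(p_partial, q, 1e-6), zero_relative_entropy(p_partial, q))
```

Each printed pair agrees exactly at $\alpha = 1/2$ and approximately at the large and small values of $\alpha$ used to approach the limits.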

3.3 Hadamard three-line theorem

One of the most important technical tools for proving our main result is the operator version of the Hadamard three-line theorem given in [Bei13], in particular, the very slight modification stated in [Dup15]. We note that the theorem below is a variant of the Riesz-Thorin operator interpolation theorem (see, e.g., [BL76, RS75]).

Theorem 3

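Let $S \equiv \{ z \in \mathbb{C} : 0 \leq \operatorname{Re}\{z\} \leq 1 \}$, and let $L(\mathcal{H})$ be the space of bounded linear operators acting on a Hilbert space $\mathcal{H}$. Let $G : S \to L(\mathcal{H})$ be a bounded map that is holomorphic on the interior of $S$ and continuous on the boundary. (A map is holomorphic (continuous, bounded) if the corresponding functions to matrix entries are holomorphic (continuous, bounded).) Let $\theta \in (0,1)$ and define $p_\theta$ by

\[ \frac{1}{p_\theta} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}, \tag{3.23} \]

where $p_0, p_1 \in [1,\infty]$. For $k = 0, 1$ define

\[ M_k = \sup_{t \in \mathbb{R}} \left\| G(k + it) \right\|_{p_k}. \tag{3.24} \]

Then

\[ \left\| G(\theta) \right\|_{p_\theta} \leq M_0^{1-\theta} M_1^{\theta}. \tag{3.25} \]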
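To see the three-line bound at work in the simplest setting, the following sketch takes commuting (diagonal) positive operators $A$ and $B$ and the map $G(z) = A^z B^{1-z}$, for which the bound reduces to a Hölder-type inequality. This is an illustration only, assuming the statement $\|G(\theta)\|_{p_\theta} \leq M_0^{1-\theta} M_1^{\theta}$ with $1/p_\theta = (1-\theta)/p_0 + \theta/p_1$; the matrices, the exponents $p_0 = 4$ and $p_1 = 1$, and the finite grid that approximates the suprema are our own choices.

```python
import math

def schatten_norm_diag(entries, p):
    """Schatten p-norm of a diagonal matrix with the given (possibly complex) entries."""
    if math.isinf(p):
        return max(abs(x) for x in entries)
    return sum(abs(x) ** p for x in entries) ** (1.0 / p)

a = [0.9, 0.4, 0.1]   # diagonal of a positive definite operator A
b = [0.2, 0.7, 0.5]   # diagonal of a positive definite operator B

def G(z):
    """G(z) = A^z B^(1-z); holomorphic and bounded on the strip 0 <= Re z <= 1."""
    return [complex(ai) ** z * complex(bi) ** (1 - z) for ai, bi in zip(a, b)]

p0, p1 = 4.0, 1.0
ts = [k / 10.0 for k in range(-200, 201)]   # finite grid standing in for the supremum over t
M0 = max(schatten_norm_diag(G(complex(0.0, t)), p0) for t in ts)
M1 = max(schatten_norm_diag(G(complex(1.0, t)), p1) for t in ts)

checks = []
for theta in (0.1, 0.25, 0.5, 0.75, 0.9):
    p_theta = 1.0 / ((1.0 - theta) / p0 + theta / p1)
    lhs = schatten_norm_diag(G(theta), p_theta)
    rhs = M0 ** (1.0 - theta) * M1 ** theta
    checks.append((theta, lhs, rhs))
    print(theta, lhs, rhs)
```

Because $|a^{it}| = 1$ for $a > 0$, the boundary norms do not depend on $t$ here, so $M_0 = \|B\|_4$ and $M_1 = \|A\|_1$; in the non-commuting setting of the paper the dependence on $t$ does not cancel in this way.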

3.4 Rényi generalizations of the quantum relative entropy difference

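Let $\rho$, $\sigma$, and $\mathcal{N}$ be as given in Definition 1. In [SBW15], two Rényi generalizations of the relative entropy difference in (3.3) were defined as follows:

\[ \begin{aligned} \Delta_\alpha(\rho,\sigma,\mathcal{N}) &\equiv \frac{1}{\alpha-1} \log \operatorname{Tr}\left\{ \rho^{\alpha} \sigma^{(1-\alpha)/2} \mathcal{N}^\dagger\!\left( [\mathcal{N}(\sigma)]^{(\alpha-1)/2} [\mathcal{N}(\rho)]^{1-\alpha} [\mathcal{N}(\sigma)]^{(\alpha-1)/2} \right) \sigma^{(1-\alpha)/2} \right\}, \\ \widetilde{\Delta}_\alpha(\rho,\sigma,\mathcal{N}) &\equiv \frac{1}{\alpha'} \log \left\| \rho^{1/2} \sigma^{-\alpha'/2} \mathcal{N}^\dagger\!\left( [\mathcal{N}(\sigma)]^{\alpha'/2} [\mathcal{N}(\rho)]^{-\alpha'} [\mathcal{N}(\sigma)]^{\alpha'/2} \right) \sigma^{-\alpha'/2} \rho^{1/2} \right\|_\alpha, \end{aligned} \tag{3.26} \]

where $\alpha' \equiv (\alpha-1)/\alpha$. Let $U$ be an isometric extension of $\mathcal{N}$, so that

\[ \mathcal{N}(\cdot) = \operatorname{Tr}_E\{ U (\cdot) U^\dagger \}. \tag{3.27} \]

We can write the adjoint $\mathcal{N}^\dagger$ in terms of this isometric extension as follows:

\[ \mathcal{N}^\dagger(\cdot) = U^\dagger \left( (\cdot) \otimes I_E \right) U. \tag{3.28} \]

This then allows us to write the definitions above in a simpler form:

\[ \Delta_\alpha(\rho,\sigma,\mathcal{N}) = \frac{1}{\alpha-1} \log \left\| \left( [\mathcal{N}(\rho)]^{(1-\alpha)/2} [\mathcal{N}(\sigma)]^{(\alpha-1)/2} \otimes I_E \right) U \sigma^{(1-\alpha)/2} \rho^{\alpha/2} \right\|_2^2, \tag{3.29} \]
\[ \widetilde{\Delta}_\alpha(\rho,\sigma,\mathcal{N}) = \frac{2}{\alpha'} \log \left\| \left( [\mathcal{N}(\rho)]^{-\alpha'/2} [\mathcal{N}(\sigma)]^{\alpha'/2} \otimes I_E \right) U \sigma^{-\alpha'/2} \rho^{1/2} \right\|_{2\alpha}. \tag{3.30} \]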

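It is known that the following limits hold for $\rho$, $\sigma$, and $\mathcal{N}$ taken as in Definition 2 [SBW15]:

\[ \lim_{\alpha \to 1} \Delta_\alpha(\rho,\sigma,\mathcal{N}) = \lim_{\alpha \to 1} \widetilde{\Delta}_\alpha(\rho,\sigma,\mathcal{N}) = \Delta(\rho,\sigma,\mathcal{N}). \tag{3.31} \]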

The fact that these limits hold for , , and taken as in Definition 1 and subject to follows from [Wil15] and the development in Appendix A. [DW15] proved that for ,

(3.32)

and for :

(3.33)

when , , and are taken as in Definition 2. The latter inequality was refined recently in [Wil15] for and for , , and taken as in Definition 1 and subject to . It remains an open question to determine whether these quantities are non-decreasing in for any non-trivial range of (note that [SBW15] argued that they are non-decreasing in in a neighborhood of ).

4 Swiveled Rényi generalizations of the quantum relative entropy difference

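In the spirit of the discussion in Section 2, we modify the definitions of $\Delta_\alpha$ and $\widetilde{\Delta}_\alpha$ in order to allow for unitary rotations about swivels, i.e., an optimization over unitaries of the form $V_\sigma$ and $V_{\mathcal{N}(\sigma)}$:

Definition 4

Let $\rho$, $\sigma$, and $\mathcal{N}$ be as given in Definition 1. We define swiveled Rényi generalizations of the quantum relative entropy difference in (3.3) as follows:

\[ \Delta'_\alpha(\rho,\sigma,\mathcal{N}) \equiv \frac{1}{\alpha-1} \max_{V_\sigma,\, V_{\mathcal{N}(\sigma)}} \log \left\| \left( [\mathcal{N}(\rho)]^{(1-\alpha)/2} V_{\mathcal{N}(\sigma)} [\mathcal{N}(\sigma)]^{(\alpha-1)/2} \otimes I_E \right) U \sigma^{(1-\alpha)/2} V_\sigma \, \rho^{\alpha/2} \right\|_2^2, \tag{4.1} \]
\[ \widetilde{\Delta}'_\alpha(\rho,\sigma,\mathcal{N}) \equiv \frac{2}{\alpha'} \max_{V_\sigma,\, V_{\mathcal{N}(\sigma)}} \log \left\| \left( [\mathcal{N}(\rho)]^{-\alpha'/2} V_{\mathcal{N}(\sigma)} [\mathcal{N}(\sigma)]^{\alpha'/2} \otimes I_E \right) U \sigma^{-\alpha'/2} V_\sigma \, \rho^{1/2} \right\|_{2\alpha}, \tag{4.2} \]

where $\alpha' \equiv (\alpha-1)/\alpha$, $U$ is an isometric extension of $\mathcal{N}$, and the optimizations are over the compact sets of unitaries $\mathbb{V}_\sigma$ and $\mathbb{V}_{\mathcal{N}(\sigma)}$ commuting with $\sigma$ and $\mathcal{N}(\sigma)$, respectively.

This slight extra degree of freedom allows us to establish that $\Delta'_\alpha$ and $\widetilde{\Delta}'_\alpha$ are monotone non-decreasing in $\alpha$ for particular values (see Theorems 6 and 7).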

4.1 Reduction to Rényi relative entropy

Observe that by choosing , we find that reduces to the Rényi relative entropy whenever :

(4.3)
(4.4)

and to the sandwiched Rényi relative entropy whenever :

(4.5)
(4.6)

just as

(4.7)

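4.2 Behavior around $\alpha = 1$

Here we discuss the behavior of the swiveled quantities in Definition 4 around $\alpha = 1$, with the result being that these quantities are generally discontinuous at $\alpha = 1$:

Proposition 5

Let $\rho$, $\sigma$, and $\mathcal{N}$ be as given in Definition 2. Then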

(4.8)
(4.9)

where

(4.10)

As a consequence, we have that

(4.11)

and there is generally a discontinuity at $\alpha = 1$.

Proof. Let , which we will choose shortly. Define the function as

(4.12)

whenever , and as in (4.10). One can check that

(4.13)

for example by performing Taylor expansions to calculate the limit (see Appendix C for details of this calculation). The function is then continuous in , , and . Furthermore, it fulfills the conditions of Lemma 22 in Appendix B if we choose for any and . Hence, we get that

(4.14)

is continuous on and thus

(4.15)

Repeating the same argument with yields that

(4.16)

is continuous on and thus

(4.17)

Given that , we can conclude the following inequality:

(4.18)

The arguments for the quantity are similar, so we just sketch them briefly. Define the function

(4.19)

for and set . One can then compute (again via Taylor expansions, e.g.) that

(4.20)

The rest of the argument proceeds as above, which leads to the other equalities in (4.8)-(4.9).   

4.3 Monotonicity in the Rényi parameter

This section contains our main result, that both swiveled quantities in Definition 4 are monotone non-decreasing with respect to $\alpha$ for particular values.

Theorem 6

Let , , and be as given in Definition 1. The swiveled Rényi quantity is monotone non-decreasing with respect to , in the sense that for , , and

(4.21)

Proof. The main tool for our proof is Theorem 3. We break the proof of inequality in (4.21) into several cases. We first consider . For some and , pick

(4.22)
(4.23)
(4.24)
(4.25)

which fixes . Then

(4.26)
(4.27)
(4.28)
(4.29)
(4.30)
(4.31)
(4.32)
(4.33)

We then apply Theorem 3 to find that the following inequality holds for all and :

(4.34)

As a consequence, we can take the maximum over all and and apply the definition in (4.1) to establish that

(4.35)

We finally apply a logarithm to arrive at the conclusion that (4.21) holds for all .

To get the monotonicity for the range , we exchange and in (4.22)-(4.25) and apply the same reasoning as in (4.26)-(4.34) to arrive at the following inequality:

(4.36)

Taking a negative logarithm and noting that then gives (4.21) for this range.

We are now left with proving the case and the dual parameter of , such that . Notice that . Let . We pick

(4.37)
(4.38)
(4.39)
(4.40)

so that . Consider that , so that

(4.41)
(4.42)

We then find that

(4.43)
(4.44)
(4.45)
(4.46)

Consider that

(4.47)

Thus, similarly, we have

(4.48)
(4.49)
(4.50)
(4.51)

Applying Theorem 3 gives

(4.52)