Second-order asymptotics for source coding, dense coding and pure-state entanglement conversions


Nilanjana Datta and Felix Leditzky
Statistical Laboratory, Centre for Mathematical Sciences, University of Cambridge,
Wilberforce Road, Cambridge CB3 0WA, United Kingdom
Abstract

We introduce two variants of the information spectrum relative entropy defined by Tomamichel and Hayashi [TH13] which have the particular advantage of satisfying the data-processing inequality, i.e. monotonicity under quantum operations. This property allows us to obtain one-shot bounds for various information-processing tasks in terms of these quantities. Moreover, these relative entropies have a second order asymptotic expansion, which in turn yields tight second order asymptotics for the optimal rates of these tasks in the i.i.d. setting. The tasks studied in this paper are fixed-length quantum source coding, noisy dense coding, entanglement concentration, pure-state entanglement dilution, and transmission of information through a classical-quantum channel. In the latter case, we retrieve the second order asymptotics obtained by Tomamichel and Tan [TT13]. Our results also yield the known second order asymptotics of fixed-length classical source coding derived by Hayashi [Hay08]. The second order asymptotics of entanglement concentration and dilution provide a refinement of the inefficiency of these protocols – a quantity which, in the case of entanglement dilution, was studied by Harrow and Lo [HL04]. We show how the discrepancy between the optimal rates of these two processes in the second order implies the irreversibility of entanglement concentration established by Kumagai and Hayashi [KH13]. In addition, the spectral divergence rates of the Information Spectrum Approach (ISA) can be retrieved from our relative entropies in the asymptotic limit. This enables us to directly obtain the more general results of the ISA from our one-shot bounds.


1 Introduction

Optimal rates of information-processing tasks such as storage and transmission of information, and manipulation of entanglement, are of fundamental importance in Information Theory. These rates were originally evaluated in the so-called asymptotic, i.i.d. setting (here, i.i.d. is the standard acronym for “independent and identically distributed”), in which it is assumed that the underlying resources (sources, channels or entangled states) employed in the tasks are available for asymptotically many uses, and that there are no correlations between their successive uses. The rates in this scenario are given in terms of entropic quantities obtainable from the relative entropy. It is, however, unrealistic to assume the availability of infinitely many copies of the required resources. In practice, we have finite resources, and hence a fundamental problem of both theoretical and practical interest is to determine how quickly the behaviour of a finite system approaches that of its asymptotic limit.

The first step in this direction, from the standpoint of Information Theory, is to determine the second order asymptotics of optimal rates. (The precise meaning of the phrase “second order asymptotics” is elucidated in the following paragraph.) Interest in this was initiated in the classical realm by Strassen [Str62], who evaluated the second order asymptotics for hypothesis testing and channel coding. In the last decade there has been a renewal of interest in the evaluation of second order asymptotics for other classical information theoretic tasks (see e.g. [Hay08, Hay09, KH13a] and references therein) and, more recently, even in third order asymptotics [KV13a]. Moreover, the recent papers by Tomamichel and Hayashi [TH13] and Li [Li14] have introduced the study of second order asymptotics in Quantum Information Theory as well. The achievability parts of the second order asymptotics for the tasks studied in [TH13, Li14] were later also obtained by Beigi and Gohari [BG13] via the collision relative entropy.

Let us explain what exactly is meant by the phrase “second order asymptotics”. Consider the familiar task of fixed-length quantum source coding. Schumacher [Sch95] proved that for a memoryless quantum information source characterized by a state $\rho$, the optimal rate of reliable data compression is given by its von Neumann entropy $S(\rho)$. The criterion of reliability is that the error incurred in the compression-decompression scheme vanishes in the asymptotic limit $n \to \infty$, where $n$ denotes the number of copies (or uses) of the source. Let us now consider instead a finite number $n$ of copies of the source and require that the error incurred in compressing its state is at most $\varepsilon$ (for some $\varepsilon \in (0,1)$). Suppose $\ell(n,\varepsilon)$ denotes the compression length, i.e. the minimum of the logarithm of the dimension of the compressed Hilbert space in this case. This quantity can be expanded as follows:

$$\ell(n,\varepsilon) = a\,n + b\,\sqrt{n} + O(\log n). \qquad (1)$$

Here, the coefficient $a$ of the leading term constitutes the first order asymptotics of the compression length. As expected, it turns out to be the optimal rate in the asymptotic limit, i.e. the von Neumann entropy $S(\rho)$ of the source. The second order asymptotics is given by the coefficient $b$, which is a function of both the error threshold $\varepsilon$ and the state $\rho$. Determining the second order asymptotics comprises the evaluation of the coefficient $b$.

Theorem 5.8(i) of Section 5 gives an explicit expression for the coefficient $b$ in the case of fixed-length (visible) quantum source coding. In Figure 1 we plot the compression rate $\ell(n,\varepsilon)/n$ against $n$ for a memoryless quantum information source which emits two mutually non-orthogonal pure states with equal probability, for three different values of the error threshold $\varepsilon$. This exhibits how the rate of data compression converges to its asymptotically optimal value.

Figure 1: Plot of the compression rate $\ell(n,\varepsilon)/n$ against $n$ for the source described above, for three values of the error threshold $\varepsilon$.

In this paper we study the second order asymptotics of various information-processing tasks: fixed-length quantum source coding, noisy dense coding, entanglement concentration, pure-state entanglement dilution, and transmission of information through a classical-quantum channel. For each task we consider the $n$-copy case, where $n$ denotes the number of copies of the relevant resource (source, entangled state or channel) employed in the protocol, $\varepsilon$ denotes the error threshold, and $T(n,\varepsilon)$ denotes the characteristic quantity of the protocol (e.g. the minimum compression length in the case of source coding). We arrive at an expression of the form (1) for $T(n,\varepsilon)$ – a quantity from which the optimal asymptotic rate of the protocol is obtained through the relation

$$r := \lim_{\varepsilon \to 0} \lim_{n \to \infty} \frac{1}{n}\, T(n,\varepsilon).$$
For each of the tasks studied, the coefficient $a$ turns out to be the entropic quantity characterizing the optimal asymptotic rate. Moreover, the coefficient $b$ is proportional to $\Phi^{-1}(\varepsilon)$, where $\Phi^{-1}$ denotes the inverse of the cumulative distribution function of the standard normal distribution, and the constant of proportionality depends on the underlying resource (this constant is often called the dispersion or information variance). This form of $b$ is a feature of all results on second order asymptotics and stems from the Berry-Esseen theorem [Fel71] – a refinement of the central limit theorem which takes into account the rate of convergence of a distribution to a standard normal distribution.
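As an illustration of these expansions, the following sketch evaluates the two coefficients for a simple qubit source of the kind plotted in Figure 1. It is a minimal sketch which assumes the generic form $b = \sqrt{v(\rho)}\,\Phi^{-1}(\varepsilon)$ with $v(\rho) := \operatorname{Tr}[\rho(\log\rho)^2] - S(\rho)^2$ the entropy variance; the overlap angle and all names are illustrative, and the precise coefficient for each task is the content of the theorems in Section 5.

```python
import numpy as np
from scipy.stats import norm

# Illustrative source: two equiprobable pure qubit states with overlap cos(theta).
theta = np.pi / 8
psi0 = np.array([1.0, 0.0])
psi1 = np.array([np.cos(theta), np.sin(theta)])
rho = 0.5 * (np.outer(psi0, psi0) + np.outer(psi1, psi1))

p = np.linalg.eigvalsh(rho)
p = p[p > 1e-12]
S = -np.sum(p * np.log2(p))                # first order coefficient a = S(rho)
v = np.sum(p * np.log2(p) ** 2) - S ** 2   # entropy variance (assumed dispersion)

for eps in (0.01, 0.05, 0.10):
    b = np.sqrt(v) * norm.ppf(eps)         # assumed form b = sqrt(v) * Phi^{-1}(eps)
    for n in (10**2, 10**4, 10**6):
        print(f"eps={eps:.2f}, n={n:>7}: rate ~ {S + b / np.sqrt(n):.4f} (a={S:.4f})")
```

Since $\Phi^{-1}(\varepsilon) < 0$ for $\varepsilon < 1/2$, the finite-$n$ rate approaches the asymptotic value $S(\rho)$ from below, as in Figure 1.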

Two mathematical quantities play key roles in the derivation of our results. These are variants of the information spectrum relative entropy defined by Tomamichel and Hayashi [TH13], but have the particular advantage of satisfying the data-processing inequality. For a state $\rho$ and a positive semi-definite operator $\sigma$, we denote them as $\underline{D}^{\varepsilon}_{s}(\rho\|\sigma)$ and $\overline{D}^{\varepsilon}_{s}(\rho\|\sigma)$ (where $\varepsilon \in (0,1)$) and refer to them simply as information spectrum relative entropies. (To avoid confusion, we denote the information spectrum relative entropy defined in [TH13] by $D^{\varepsilon}_{s}(\rho\|\sigma)$.) These notations and nomenclatures stem from the fact that for arbitrary sequences of states $\hat{\rho} = \{\rho_n\}_{n \in \mathbb{N}}$ and positive semi-definite operators $\hat{\sigma} = \{\sigma_n\}_{n \in \mathbb{N}}$, where $\rho_n$ and $\sigma_n$ are defined on the Hilbert space $\mathcal{H}^{\otimes n}$ for each $n$, the following relations hold:

$$\underline{D}(\hat{\rho}\,\|\,\hat{\sigma}) = \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \frac{1}{n}\, \underline{D}^{\varepsilon}_{s}(\rho_n \| \sigma_n), \qquad \overline{D}(\hat{\rho}\,\|\,\hat{\sigma}) = \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{1}{n}\, \overline{D}^{\varepsilon}_{s}(\rho_n \| \sigma_n), \qquad (2)$$

where $\underline{D}(\hat{\rho}\,\|\,\hat{\sigma})$ and $\overline{D}(\hat{\rho}\,\|\,\hat{\sigma})$ are the inf- and sup-spectral divergence rates of the so-called quantum Information Spectrum Approach (ISA) (see Definition 4.11). The ISA provides a unifying mathematical framework for obtaining asymptotic rate formulae for various different tasks in information theory. The power of the approach lies in the fact that it does not rely on any particular structure or property of the resources used in the tasks. It was introduced by Han and Verdú [HV93] in classical Information Theory, and generalized to the quantum setting by Hayashi, Nagaoka and Ogawa [Hay06, HN03, NH07, ON00].

The information spectrum relative entropies $\underline{D}^{\varepsilon}_{s}$ and $\overline{D}^{\varepsilon}_{s}$ can also be related to other relative entropies which arise in one-shot information theory (see [Ren05, KRS09, Dat09, Tom12, DH11, MW12] and references therein), e.g. the hypothesis testing relative entropy $D^{\varepsilon}_{H}$ [WR12] and the smooth max-relative entropy $D^{\varepsilon}_{\max}$ [Dat09]. In fact, one can prove that all these relative entropies are equivalent in the sense that upper and lower bounds to any one of them can be obtained in terms of any one of the others, modulo terms which depend only on the (smoothing) parameter $\varepsilon$. These equivalences prove very useful. In particular, the bounds on the information spectrum relative entropies in terms of $D^{\varepsilon}_{H}$ directly yield second order asymptotic expansions for $\underline{D}^{\varepsilon}_{s}$ and $\overline{D}^{\varepsilon}_{s}$ via the second order asymptotic expansion for $D^{\varepsilon}_{H}$, which has been derived by Tomamichel and Hayashi in [TH13].

In addition, as in the case of the usual relative entropy, one can derive other entropic quantities, namely, entropy, conditional entropy and mutual information, from the information spectrum relative entropies. For each of the information-processing tasks considered in this paper, we obtain one-shot bounds in terms of quantities derived from the information spectrum relative entropies. The second order expansions of these quantities then directly yield tight second order asymptotics for optimal rates of the corresponding tasks in the i.i.d. setting.

Finally, the relations (2) enable us to directly obtain the more general results of the ISA, for each of the tasks considered, from our one-shot bounds. For example, our bounds for one-shot fixed-length source coding yield the optimal data compression limit for a general (i.e. not necessarily memoryless) quantum information source.

2 Overview of results

Here we summarize our main contributions and give pointers to the relevant theorems.

  • We define the information spectrum relative entropies in Definition 4.1 as variants of a previously introduced relative entropy [TH13], and furthermore define quantities derived from them in Definition 4.4. These quantities all depend on a parameter $\varepsilon \in (0,1)$.

  • We prove the data-processing inequalities and other properties of the information spectrum relative entropies in Proposition 4.3. We also show in Proposition 4.7 that the information spectrum relative entropies are equivalent to the hypothesis testing relative entropy and the smooth max-relative entropy, in the sense that each of these quantities can be bounded in terms of the others.

  • Using their equivalences with the hypothesis testing relative entropy, and the second order asymptotic expansion for the latter [TH13], we obtain second order asymptotic expansions for the information spectrum relative entropies in Proposition 4.9.

  • In Proposition 4.12 we prove that the information spectrum relative entropies reduce to the spectral divergence rates of the ISA in the asymptotic limit, when the parameter $\varepsilon$ is taken to zero.

  • We obtain one-shot bounds for the following information-processing tasks in terms of quantities derived from the information spectrum relative entropies. In each case the parameter $\varepsilon$ plays the role of the error threshold allowed in the protocol.

    1. Quantum fixed-length source coding [Theorem 5.5].

    2. Noisy dense coding [Theorem 5.10].

    3. Entanglement concentration [Theorem 5.15].

    4. Pure-state entanglement dilution [Theorem 5.18].

    5. Classical-quantum channel coding [Theorem 5.24].

  • We obtain second order asymptotic expansions for the optimal rates of the above tasks in the i.i.d. setting in Theorems 5.8, 5.11, 5.16, 5.19 and Proposition 5.25 respectively.

    In particular, Theorem 5.8(i) gives the second order expansion for fixed-length visible quantum source coding. We also obtain asymptotic upper and lower bounds on the minimal compression length in the blind setting, stated in Theorem 5.8(ii). The distinction between visible and blind source coding is elaborated in Section 5.1.1.

  • Even though the leading order terms of the optimal rates for entanglement concentration and dilution are identical (and given by the entropy of entanglement), there is a difference in their second order terms. Explicit evaluation of these terms leads to a refinement of the inefficiency of these protocols. In the case of entanglement dilution, the latter quantity (studied by Harrow and Lo [HL04]) was introduced as a measure of the amount of entanglement wasted (or lost) in the dilution process. More precisely, in [HL04] it was proved that the number of ebits needed to create $n$ copies of a desired bipartite pure state with entropy of entanglement $E$ was of the form $nE + O(\sqrt{n})$. We prove that the number of ebits can, in fact, be expressed in the form $nE + c\sqrt{n} + o(\sqrt{n})$, and we evaluate the coefficient $c$ explicitly.

    We also show how the irreversibility of entanglement concentration, established by Kumagai and Hayashi [KH13], can be proved using the discrepancy between the asymptotic expansions for distillable entanglement and entanglement cost in the second order, i.e. in the $\sqrt{n}$ term.

  • Finally, in Proposition 6.2, we recover the known expressions for optimal rates in the case of arbitrary resources as obtained by the Information Spectrum Approach.

The paper is organized as follows. In the next section, we introduce necessary notation and definitions. The rest of the paper proceeds in the order of the results mentioned above. We end with a conclusion that summarizes our results and points to open questions for future research.

3 Mathematical preliminaries

For a Hilbert space $\mathcal{H}$, let $\mathcal{B}(\mathcal{H})$ denote the algebra of linear operators acting on $\mathcal{H}$, and let $\mathcal{P}(\mathcal{H})$ denote the set of positive semi-definite operators on $\mathcal{H}$. Further, let $\mathcal{D}(\mathcal{H})$ and $\mathcal{D}_{\leq}(\mathcal{H})$ denote the set of states (density matrices) and subnormalized states on $\mathcal{H}$, respectively. For a state $\rho$, the von Neumann entropy is defined as $S(\rho) := -\operatorname{Tr}(\rho \log \rho)$. Here and henceforth, all logarithms are taken to base $2$. Unless otherwise stated, we assume all Hilbert spaces to be finite-dimensional. Let $\mathbb{1}$ denote the identity operator on $\mathcal{H}$, and $\operatorname{id}$ the identity map on operators on $\mathcal{H}$. For a pure state $|\psi\rangle$, we denote the corresponding projector by $\psi \equiv |\psi\rangle\langle\psi|$. For a completely positive, trace-preserving (CPTP) map $\Lambda$, we also use the shorthand notation $\Lambda\rho \equiv \Lambda(\rho)$.

For self-adjoint operators $A, B \in \mathcal{B}(\mathcal{H})$, let $\{A \geq B\}$ denote the spectral projection of the difference operator $A - B$ corresponding to the interval $[0, \infty)$; the spectral projections $\{A > B\}$, $\{A \leq B\}$ and $\{A < B\}$ are defined in a similar way. We also define $\operatorname{Tr}(A)_+ := \operatorname{Tr}(A\,P_+)$, where $P_+ := \{A \geq 0\}$, for any self-adjoint operator $A$. The following lemmas are used in our proofs:

Lemma 3.1.

[ON00] Let $A, B \in \mathcal{P}(\mathcal{H})$ and let $T$ with $0 \leq T \leq \mathbb{1}$ be an arbitrary operator. Then

$$\operatorname{Tr}\bigl[T(A - B)\bigr] \leq \operatorname{Tr}\bigl[\{A \geq B\}(A - B)\bigr] = \operatorname{Tr}(A - B)_+,$$

and the same assertion holds for $\{A > B\}$.

Lemma 3.2.

[BD06] Let $A, B \in \mathcal{P}(\mathcal{H})$ and let $\Lambda$ be a CPTP map. Then

$$\operatorname{Tr}\bigl(\Lambda(A) - \Lambda(B)\bigr)_+ \leq \operatorname{Tr}(A - B)_+.$$
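A quick numerical sanity check of this monotonicity is possible in a few lines of Python; here the CPTP map is taken to be the partial trace over a subsystem, and all helper names are illustrative:

```python
import numpy as np

def tr_plus(X):
    """Tr(X)_+ : sum of the non-negative eigenvalues of a Hermitian matrix."""
    w = np.linalg.eigvalsh(X)
    return np.sum(w[w > 0])

def random_state(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def partial_trace_B(X, dA, dB):
    """Trace out the second tensor factor -- a simple CPTP map."""
    return np.trace(X.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

rng = np.random.default_rng(0)
dA, dB = 2, 3
A = random_state(dA * dB, rng)
B = random_state(dA * dB, rng)
lhs = tr_plus(partial_trace_B(A, dA, dB) - partial_trace_B(B, dA, dB))
rhs = tr_plus(A - B)
print(lhs <= rhs + 1e-12)   # True, as guaranteed by Lemma 3.2
```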

Lemma 3.3.

[BD06] Let $\rho \in \mathcal{D}_{\leq}(\mathcal{H})$ and $\sigma \in \mathcal{P}(\mathcal{H})$. Then for any $\mu \in \mathbb{R}$,

$$\operatorname{Tr}\bigl(\sigma\,\{\rho \geq 2^{\mu}\sigma\}\bigr) \leq 2^{-\mu}.$$

For the convenience of the reader, we recall the definitions of the most important distance measures and state their relations [TCR10]:

Definition 3.4.

Let $\rho, \sigma \in \mathcal{D}_{\leq}(\mathcal{H})$ be subnormalized states, then:

  1. The generalized fidelity of $\rho$ and $\sigma$ is defined by

    $$\bar{F}(\rho, \sigma) := \bigl\|\sqrt{\rho}\sqrt{\sigma}\bigr\|_1 + \sqrt{(1 - \operatorname{Tr}\rho)(1 - \operatorname{Tr}\sigma)}.$$

  2. The purified distance is defined by

    $$P(\rho, \sigma) := \sqrt{1 - \bar{F}(\rho, \sigma)^2}$$

    and constitutes a metric on $\mathcal{D}_{\leq}(\mathcal{H})$, i.e. it satisfies the triangle inequality.

  3. The generalized trace distance is defined by

    $$\Delta(\rho, \sigma) := \frac{1}{2}\,\|\rho - \sigma\|_1 + \frac{1}{2}\,\bigl|\operatorname{Tr}\rho - \operatorname{Tr}\sigma\bigr|$$

    and constitutes a metric on $\mathcal{D}_{\leq}(\mathcal{H})$.

If at least one of the subnormalized states $\rho$ and $\sigma$ is normalized, the generalized fidelity reduces to the usual fidelity, i.e. $\bar{F}(\rho, \sigma) = F(\rho, \sigma) := \|\sqrt{\rho}\sqrt{\sigma}\|_1$. If both $\rho$ and $\sigma$ are normalized, the generalized trace distance also reduces to the usual trace distance, $\Delta(\rho, \sigma) = \frac{1}{2}\,\|\rho - \sigma\|_1$.
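These distance measures are straightforward to compute numerically. The following is a minimal sketch of Definition 3.4 (illustrative function names), which also checks the bounds of Lemma 3.5 below on a pair of subnormalized qubit states:

```python
import numpy as np
from scipy.linalg import sqrtm

def gen_fidelity(rho, sigma):
    """Generalized fidelity of subnormalized states (Definition 3.4)."""
    trace_norm = np.sum(np.linalg.svd(sqrtm(rho) @ sqrtm(sigma), compute_uv=False))
    return trace_norm + np.sqrt((1 - np.trace(rho).real) * (1 - np.trace(sigma).real))

def purified_distance(rho, sigma):
    return np.sqrt(max(0.0, 1.0 - gen_fidelity(rho, sigma) ** 2))

def gen_trace_distance(rho, sigma):
    w = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * np.sum(np.abs(w)) + 0.5 * abs(np.trace(rho).real - np.trace(sigma).real)

rho = 0.9 * np.diag([0.7, 0.3])      # subnormalized: Tr(rho) = 0.9
sigma = 0.8 * np.diag([0.5, 0.5])    # subnormalized: Tr(sigma) = 0.8
D = gen_trace_distance(rho, sigma)
P = purified_distance(rho, sigma)
print(D <= P <= np.sqrt(2 * D))      # bounds of Lemma 3.5
```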

Lemma 3.5.

[Tom12] For $\rho, \sigma \in \mathcal{D}_{\leq}(\mathcal{H})$ we have the following bounds:

$$\Delta(\rho, \sigma) \leq P(\rho, \sigma) \leq \sqrt{2\,\Delta(\rho, \sigma)}.$$

The following entropic quantities play a key role in second order asymptotic expansions:

Definition 3.6.

For $\rho \in \mathcal{D}(\mathcal{H})$ and $\sigma \in \mathcal{P}(\mathcal{H})$, the quantum relative entropy is defined as

$$D(\rho \| \sigma) := \operatorname{Tr}(\rho \log \rho) - \operatorname{Tr}(\rho \log \sigma).$$

The quantum information variance is defined as

$$V(\rho \| \sigma) := \operatorname{Tr}\bigl[\rho\,(\log \rho - \log \sigma)^2\bigr] - D(\rho \| \sigma)^2,$$

and we set

$$s(\rho \| \sigma) := \sqrt{V(\rho \| \sigma)}. \qquad (3)$$
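For full-rank operators these quantities can be evaluated directly via matrix logarithms; a minimal sketch (illustrative names, base-2 logarithms as throughout the paper):

```python
import numpy as np
from scipy.linalg import logm

def relative_entropy_and_variance(rho, sigma):
    """D(rho||sigma) and V(rho||sigma) of Definition 3.6.
    Assumes rho and sigma are full rank so the matrix logarithms exist."""
    L = (logm(rho) - logm(sigma)) / np.log(2.0)   # log2(rho) - log2(sigma)
    D = np.trace(rho @ L).real
    V = np.trace(rho @ L @ L).real - D ** 2
    return D, V

rho = np.diag([0.8, 0.2])
sigma = np.diag([0.5, 0.5])
D, V = relative_entropy_and_variance(rho, sigma)
print(D, V, np.sqrt(V))   # the last value is s(rho||sigma) from (3)
```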

The inverse of the cumulative distribution function (cdf) $\Phi$ of a standard normal random variable is defined by

$$\Phi^{-1}(\varepsilon) := \sup\{a \in \mathbb{R} : \Phi(a) \leq \varepsilon\},$$

where $\Phi(a) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-x^2/2}\, \mathrm{d}x$. We frequently make use of the following lemma:

Lemma 3.7.

Let $\varepsilon \in (0,1)$ and $\delta > 0$ with $\varepsilon + \delta < 1$, then

$$\Phi^{-1}(\varepsilon \pm \delta) = \Phi^{-1}(\varepsilon) \pm \delta\,\bigl(\Phi^{-1}\bigr)'(\xi)$$

for some $\xi$ with $|\xi - \varepsilon| \leq \delta$.

Proof.

We make the following general observation: let $f$ be a continuously differentiable function. Then by Taylor's theorem we can write

$$f(x \pm \delta) = f(x) \pm \delta\, f'(\xi) \qquad (4)$$

for some $\xi \in [x - \delta, x]$ in the case ‘$-$’ and $\xi \in [x, x + \delta]$ in the case ‘$+$’. Applying (4) to $f = \Phi^{-1}$ yields the claim. ∎
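The derivative appearing here can be made explicit via the inverse function theorem, $(\Phi^{-1})'(x) = 1/\phi(\Phi^{-1}(x))$ with $\phi$ the standard normal density, which gives a quick numerical check of the lemma (a sketch with illustrative parameter values):

```python
import numpy as np
from scipy.stats import norm

eps, delta = 0.05, 1e-3
exact = norm.ppf(eps + delta)             # Phi^{-1}(eps + delta)
deriv = 1.0 / norm.pdf(norm.ppf(eps))     # (Phi^{-1})'(eps) by the inverse function theorem
approx = norm.ppf(eps) + delta * deriv    # first-order Taylor term as in (4)
print(abs(exact - approx))                # small, of order delta**2
```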

4 Information spectrum relative entropies

4.1 Definitions and mathematical properties

The central quantities of this paper are the information spectrum relative entropies $\underline{D}^{\varepsilon}_{s}$ and $\overline{D}^{\varepsilon}_{s}$, which we now define.

Definition 4.1.

For $\rho \in \mathcal{D}(\mathcal{H})$, $\sigma \in \mathcal{P}(\mathcal{H})$ and $\varepsilon \in (0,1)$, the information spectrum relative entropies are defined as

$$\underline{D}^{\varepsilon}_{s}(\rho \| \sigma) := \sup\bigl\{\gamma \in \mathbb{R} : \operatorname{Tr}(\rho - 2^{\gamma}\sigma)_+ \geq 1 - \varepsilon\bigr\},$$

$$\overline{D}^{\varepsilon}_{s}(\rho \| \sigma) := \inf\bigl\{\gamma \in \mathbb{R} : \operatorname{Tr}(\rho - 2^{\gamma}\sigma)_+ \leq \varepsilon\bigr\}.$$

These relative entropies are one-shot generalizations to quantum information theory of the spectral divergences used in the ISA; these generalizations are discussed in detail in Section 4.4.
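Since the function $\gamma \mapsto \operatorname{Tr}(\rho - 2^{\gamma}\sigma)_+$ is continuous and monotonically decreasing (see Lemma 4.2 below), both quantities of Definition 4.1 can be computed by bisection. A minimal sketch, with illustrative function names and search interval:

```python
import numpy as np

def tr_plus(X):
    w = np.linalg.eigvalsh(X)
    return np.sum(w[w > 0])

def g(gamma, rho, sigma):
    return tr_plus(rho - 2.0 ** gamma * sigma)   # continuous, decreasing in gamma

def d_lower(rho, sigma, eps, lo=-50.0, hi=50.0, iters=100):
    """sup{gamma : Tr(rho - 2^gamma sigma)_+ >= 1 - eps} via bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid, rho, sigma) >= 1 - eps else (lo, mid)
    return lo

def d_upper(rho, sigma, eps, lo=-50.0, hi=50.0, iters=100):
    """inf{gamma : Tr(rho - 2^gamma sigma)_+ <= eps} via bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if g(mid, rho, sigma) <= eps else (mid, hi)
    return hi

rho = np.diag([0.8, 0.2])
sigma = np.diag([0.5, 0.5])
print(d_lower(rho, sigma, 0.1), d_upper(rho, sigma, 0.1))
```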

Note, furthermore, that the information spectrum relative entropies of Definition 4.1 are variants of the quantity introduced by Tomamichel and Hayashi [TH13]:

$$D^{\varepsilon}_{s}(\rho \| \sigma) := \sup\bigl\{\gamma \in \mathbb{R} : \operatorname{Tr}\bigl(\rho\,\{\rho \leq 2^{\gamma}\sigma\}\bigr) \leq \varepsilon\bigr\}. \qquad (5)$$

Our definitions have the advantage of satisfying the data processing inequality, as shown in Proposition 4.3(iii). First, we prove the following lemma, which implies that the supremum and infimum in Definition 4.1 are attained.

Lemma 4.2.
  1. For $\gamma \in \mathbb{R}$, let $A(\gamma)$ be a one-parameter family of Hermitian operators such that $\gamma \mapsto A(\gamma)$ is continuous. Then the function $\gamma \mapsto \operatorname{Tr}\,(A(\gamma))_+$ is continuous.

  2. For $\rho \in \mathcal{D}(\mathcal{H})$ and $\sigma \in \mathcal{P}(\mathcal{H})$, the function $g(\gamma) := \operatorname{Tr}(\rho - 2^{\gamma}\sigma)_+$ is continuous and monotonically decreasing, and strictly so if $\operatorname{supp} \rho \subseteq \operatorname{supp} \sigma$.

  3. The supremum and infimum in Definition 4.1 are attained, and unique if $\operatorname{supp} \rho \subseteq \operatorname{supp} \sigma$.

Proof.

(i) First, we observe that the trace of the positive part of a Hermitian operator is the sum of its non-negative eigenvalues, the eigenvalues being the roots of the characteristic polynomial. Since the determinant is a continuous function with respect to the operator norm on $\mathcal{B}(\mathcal{H})$, the coefficients of the characteristic polynomial of $A(\gamma)$ depend continuously on $\gamma$, and by factorizing the polynomial into linear factors we see that this carries over to the roots. Composing the sum over the roots with the continuous function $x \mapsto \max\{x, 0\}$, we obtain that $\gamma \mapsto \operatorname{Tr}\,(A(\gamma))_+$ is a composition of continuous functions and thus continuous itself.

(ii) Since $\gamma \mapsto \rho - 2^{\gamma}\sigma$ is continuous with respect to the operator norm, the function $g(\gamma) = \operatorname{Tr}(\rho - 2^{\gamma}\sigma)_+$ is continuous by (i). To prove that $g$ is decreasing, let $\gamma_1 < \gamma_2$ and observe that

$$\rho - 2^{\gamma_2}\sigma = \rho - 2^{\gamma_1}\sigma - \delta\sigma,$$

where $\delta := 2^{\gamma_2} - 2^{\gamma_1} > 0$. Hence, Weyl's Monotonicity Theorem (see e.g. Section III of [Bha13]) implies that

$$\lambda_k\bigl(\rho - 2^{\gamma_2}\sigma\bigr) \leq \lambda_k\bigl(\rho - 2^{\gamma_1}\sigma\bigr) \quad \text{for all } k, \qquad (6)$$

where for a Hermitian operator $A$ we write $\lambda_k(A)$ to denote the $k$-th largest eigenvalue of $A$. It then follows from (6) that $g(\gamma_2) \leq g(\gamma_1)$.

To prove strict monotonicity, let $\gamma_1 < \gamma_2$. Without loss of generality, we restrict to the support of $\sigma$ (which is possible due to the assumption $\operatorname{supp} \rho \subseteq \operatorname{supp} \sigma$), such that $\sigma$ has strictly positive eigenvalues. Consequently, $\delta\sigma > 0$, and the inequality in (6) is a strict one for all $k$, from which we obtain that $g(\gamma_2) < g(\gamma_1)$.

(iii) Since $\lim_{\gamma \to -\infty} g(\gamma) = \operatorname{Tr}\rho = 1$ and $\lim_{\gamma \to \infty} g(\gamma) = 0$, we infer by (ii) and the Intermediate Value Theorem that the supremum in the definition of $\underline{D}^{\varepsilon}_{s}(\rho\|\sigma)$ as well as the infimum in the definition of $\overline{D}^{\varepsilon}_{s}(\rho\|\sigma)$ are attained. If $\operatorname{supp} \rho \subseteq \operatorname{supp} \sigma$, then they are moreover unique by the strict monotonicity of $g$. ∎

We are now ready to record a series of properties of the information spectrum relative entropies:

Proposition 4.3.

Let $\rho \in \mathcal{D}(\mathcal{H})$, $\sigma \in \mathcal{P}(\mathcal{H})$ and $\varepsilon \in (0,1)$. Then the following properties hold:

  1. Data processing inequality: For any CPTP map $\Lambda$, we have

  2. Monotonicity in : Let , then

  3. Let with , then

  4. Let , then .

  5. Let and with , then

Proof.

(i) Let . Then by the definition of we have

Hence,

since by assumption. Therefore, is feasible for and consequently, from (5) it follows that

which yields the claim.

(ii) Let . Then by the definition of , we have

Hence, is feasible for , and we obtain

Assume now that

for some , i.e. . By the monotonicity of , we have

On the other hand, by definition of . This leads to a contradiction, yielding .

(iii) Let . Then Lemma 3.2 implies that

Hence, is feasible for , and we obtain

Similarly, let . Then

by Lemma 3.2. Hence, is feasible for , and we obtain

(iv) Let , then

Hence, is feasible for , and consequently, .

Similarly, let . Then

and hence, is feasible for , and we obtain .

(v) Let and , then we compute:

where the first inequality follows from and the second inequality follows from Lemma 3.1. Hence, is feasible for and we obtain .

(vi) Let for , then

and hence,

Conversely, let . Then

and hence,

which proves the claim.

(vii) Let and . Then

where the first inequality follows from Lemma 3.1 and the second inequality follows from the fact [Tom12] that

by assumption. This proves the claim. ∎

Remark.

Proposition 4.3(ii) shows that we only need to focus on one of the information spectrum relative entropies. However, given their close relationship to the quantum spectral inf- and sup-divergence rates (Section 4.4), it is useful to keep both alternative definitions of Definition 4.1.

Children entropies of the information spectrum relative entropies

The quantum relative entropy acts as a parent quantity for other entropic quantities:

  • the von Neumann entropy $S(\rho) = -D(\rho \,\|\, \mathbb{1})$,

  • the quantum conditional entropy $H(A|B)_{\rho} = -D(\rho_{AB} \,\|\, \mathbb{1}_A \otimes \rho_B)$,

  • the quantum mutual information $I(A;B)_{\rho} = D(\rho_{AB} \,\|\, \rho_A \otimes \rho_B)$.

This motivates us to define the following information spectrum entropies:

Definition 4.4.

Let $\rho \in \mathcal{D}(\mathcal{H})$ and $\rho_{AB} \in \mathcal{D}(\mathcal{H}_{AB})$ be states. Then we define:

  1. the information spectrum entropies

    $$\underline{H}^{\varepsilon}_{s}(\rho) := -\overline{D}^{\varepsilon}_{s}(\rho \,\|\, \mathbb{1}), \qquad \overline{H}^{\varepsilon}_{s}(\rho) := -\underline{D}^{\varepsilon}_{s}(\rho \,\|\, \mathbb{1});$$

  2. the information spectrum conditional entropies

    $$\underline{H}^{\varepsilon}_{s}(A|B)_{\rho} := -\overline{D}^{\varepsilon}_{s}(\rho_{AB} \,\|\, \mathbb{1}_A \otimes \rho_B), \qquad \overline{H}^{\varepsilon}_{s}(A|B)_{\rho} := -\underline{D}^{\varepsilon}_{s}(\rho_{AB} \,\|\, \mathbb{1}_A \otimes \rho_B);$$

  3. the information spectrum mutual informations

    $$\underline{I}^{\varepsilon}_{s}(A;B)_{\rho} := \underline{D}^{\varepsilon}_{s}(\rho_{AB} \,\|\, \rho_A \otimes \rho_B), \qquad \overline{I}^{\varepsilon}_{s}(A;B)_{\rho} := \overline{D}^{\varepsilon}_{s}(\rho_{AB} \,\|\, \rho_A \otimes \rho_B).$$

Note that in Definition 4.4(i) and (ii), the occurrence of the minus sign is the reason for changing the upper bar to a lower bar and vice versa. In Section 5 these quantities arise in one-shot bounds for operational tasks.
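As an illustration of how such a child quantity is evaluated in practice, the sketch below computes an information spectrum mutual information of a noisy Bell state by inserting $\sigma = \rho_A \otimes \rho_B$ into the bisection routine for Definition 4.1; the state and all names are illustrative assumptions:

```python
import numpy as np

def tr_plus(X):
    w = np.linalg.eigvalsh(X)
    return np.sum(w[w > 0])

def d_lower(rho, sigma, eps, lo=-50.0, hi=50.0, iters=100):
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if tr_plus(rho - 2.0 ** mid * sigma) >= 1 - eps else (lo, mid)
    return lo

# Illustrative two-qubit state: a Bell state mixed with white noise.
phi = np.zeros(4)
phi[0] = phi[3] = 1 / np.sqrt(2)
rho_AB = 0.9 * np.outer(phi, phi) + 0.1 * np.eye(4) / 4

R = rho_AB.reshape(2, 2, 2, 2)
rho_A = np.trace(R, axis1=1, axis2=3)    # partial trace over B
rho_B = np.trace(R, axis1=0, axis2=2)    # partial trace over A

# Mutual-information-type child quantity: sigma = rho_A (x) rho_B in Definition 4.1.
print(d_lower(rho_AB, np.kron(rho_A, rho_B), eps=0.05))
```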

The information spectrum conditional entropies satisfy the following interesting property under local operations and classical communication (LOCC) taking pure states to pure states:

Lemma 4.5.

Let $\varphi_{AB} = \Lambda(\psi_{AB})$, where $\Lambda$ denotes any LOCC operation which takes pure states to pure states, and $\psi_{AB}$ is a bipartite pure state. Then the following inequality holds:

Proof.

By definition, we have and furthermore,

However, by a result of Lo and Popescu [LP01], the action of the LOCC map $\Lambda$ on the pure state $\psi_{AB}$ can be expressed as follows:

$$\Lambda(\psi_{AB}) = \sum_{i} (U_i \otimes M_i)\, \psi_{AB}\, (U_i \otimes M_i)^{\dagger}, \qquad (7)$$

where the $U_i$ are unitary operators on system $A$ and the $M_i$ are operators on system $B$ such that $\sum_i M_i^{\dagger} M_i = \mathbb{1}_B$. Consequently, using the cyclicity of the trace we obtain

(8)

Further, for , we have

which can be seen as follows:

where the second identity follows from the unitarity of the operators , and the last identity follows from (8). Hence,

where the inequality follows from Proposition 4.3(iii). ∎

4.2 Relation to other relative entropies

In this section we prove that the information spectrum relative entropies are equivalent to the hypothesis testing relative entropy and the smooth max-relative entropy, both of which arise in one-shot information theory. This equivalence is in the sense that upper and lower bounds to any one of them can be obtained in terms of any one of the others, modulo terms which depend only on the (smoothing) parameter $\varepsilon$. Let us first recall the definitions of the hypothesis testing relative entropy and the smooth max-relative entropy:

Definition 4.6.
  1. For $\rho \in \mathcal{D}(\mathcal{H})$, $\sigma \in \mathcal{P}(\mathcal{H})$ and $\varepsilon \in (0,1)$, the hypothesis testing relative entropy is defined as

    $$D^{\varepsilon}_{H}(\rho \| \sigma) := -\log \min\bigl\{\operatorname{Tr}(Q\sigma) : 0 \leq Q \leq \mathbb{1},\ \operatorname{Tr}(Q\rho) \geq 1 - \varepsilon\bigr\}.$$

  2. For $\rho \in \mathcal{D}(\mathcal{H})$, $\sigma \in \mathcal{P}(\mathcal{H})$ and $\varepsilon \in (0,1)$, the smooth max-relative entropy is defined as

    $$D^{\varepsilon}_{\max}(\rho \| \sigma) := \min_{\bar{\rho} \in B^{\varepsilon}(\rho)} D_{\max}(\bar{\rho} \| \sigma), \quad \text{where} \quad D_{\max}(\rho \| \sigma) := \inf\{\gamma \in \mathbb{R} : \rho \leq 2^{\gamma}\sigma\}$$

    and $B^{\varepsilon}(\rho) := \{\bar{\rho} \in \mathcal{D}_{\leq}(\mathcal{H}) : P(\bar{\rho}, \rho) \leq \varepsilon\}$ is the $\varepsilon$-ball around $\rho$ with respect to the purified distance $P$.
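The minimization defining $D^{\varepsilon}_{H}$ is a semidefinite program, so it can be evaluated with an off-the-shelf solver. A minimal sketch using the cvxpy package (illustrative function name; the SDP form is the standard one recalled in Definition 4.6(i)):

```python
import numpy as np
import cvxpy as cp

def d_hypothesis(rho, sigma, eps):
    """D_H^eps(rho||sigma) = -log2 min{Tr(Q sigma) : 0 <= Q <= 1, Tr(Q rho) >= 1 - eps}."""
    d = rho.shape[0]
    Q = cp.Variable((d, d), hermitian=True)
    constraints = [Q >> 0,
                   np.eye(d) - Q >> 0,
                   cp.real(cp.trace(Q @ rho)) >= 1 - eps]
    problem = cp.Problem(cp.Minimize(cp.real(cp.trace(Q @ sigma))), constraints)
    problem.solve()
    return -np.log2(problem.value)

rho = np.diag([0.8, 0.2])
sigma = np.diag([0.5, 0.5])
print(d_hypothesis(rho, sigma, eps=0.1))   # approx -log2(0.75) ~ 0.415
```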

We prove the following relations between these relative entropies:

Proposition 4.7.

Let , , and with . Then we obtain the following bounds:

Proof.

(i) To prove the upper bound, let where for some . Since

we see that is feasible for . Hence,

since . By the definition of , we obtain

and hence the upper bound.

Conversely, let be optimal for and set . Then

where the first inequality follows from Lemma 3.1. Hence, is feasible for and we obtain

(ii) We start with the upper bound. Let for some arbitrary and set . We have by definition of , and hence,

Thus, is feasible for . Furthermore,

implying that

Conversely, assume that and let be optimal for , such that

(9)

For arbitrary we have