# The conditional Entropy Power Inequality for quantum additive noise channels

###### Abstract

We prove the quantum conditional Entropy Power Inequality for quantum additive noise channels. This inequality lower bounds the quantum conditional entropy of the output of an additive noise channel in terms of the quantum conditional entropies of the input state and the noise when they are conditionally independent given the memory. We also show that this conditional Entropy Power Inequality is optimal in the sense that we can achieve equality asymptotically by choosing a suitable sequence of Gaussian input states. We apply the conditional Entropy Power Inequality to find an array of information-theoretic inequalities for conditional entropies which are the analogues of inequalities which have already been established in the unconditioned setting. Furthermore, we give a simple proof of the convergence rate of the quantum Ornstein-Uhlenbeck semigroup based on Entropy Power Inequalities.

## 1 Introduction

Additive noise channels are central objects of interest in information theory. A general class of such channels can be modeled by the well-known convolution operation: If $X$ and $Y$ are two independent random variables with values in $\mathbb{R}^n$, the convolution operation combines $X$ and $Y$ into a new random variable $Z = X + Y$, the probability density function of which is given by

$$f_Z(z) = \int_{\mathbb{R}^n} f_X(x)\, f_Y(z - x)\, \mathrm{d}x\,. \tag{1}$$

The convolution is a well-studied operation and it plays a role in many inequalities from functional analysis, such as Young’s Inequality and its sharp version [1, 2] as well as the Entropy Power Inequality [3, 4, 5, 6]. These inequalities have important applications in classical information theory, as they can be used to bound communication capacities, which was originally carried out by Shannon [3]. An extensive overview of the many related inequalities in this area is given in [6].

Central to the work presented here is the Entropy Power Inequality. It deals with the entropy of a linear combination of two independent random variables $X$ and $Y$ with values in $\mathbb{R}^n$,

$$Z = \sqrt{\lambda}\, X + \sqrt{1-\lambda}\, Y\,, \qquad 0 \le \lambda \le 1\,. \tag{2}$$

The statement of the Entropy Power Inequality [3, 4, 5, 6] is

$$e^{\frac{2S(Z)}{n}} \ge \lambda\, e^{\frac{2S(X)}{n}} + (1-\lambda)\, e^{\frac{2S(Y)}{n}}\,, \tag{3}$$

where $S(X)$ is the Shannon differential entropy of the random variable $X$. A conditional version of (3) can easily be derived: If $X$ and $Y$ are conditionally independent given the random variable $M$ (sometimes interpreted as a memory), then

$$e^{\frac{2S(Z|M)}{n}} \ge \lambda\, e^{\frac{2S(X|M)}{n}} + (1-\lambda)\, e^{\frac{2S(Y|M)}{n}}\,. \tag{4}$$
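As a numerical sanity check of the classical inequality (3) in one dimension, the following sketch compares the two sides for a Gaussian and a Laplace input; the grid, the test densities and the value of $\lambda$ are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

# Numerical check of the classical EPI for Z = sqrt(lam)*X + sqrt(1-lam)*Y
# in one dimension. X is a standard Gaussian, Y a Laplace variable; both
# are hypothetical test inputs with arbitrary grid parameters.

lam = 0.3
x = np.linspace(-40.0, 40.0, 2**14 + 1)   # odd point count centers the convolution
dx = x[1] - x[0]

def entropy(pdf):
    """Shannon differential entropy -integral of f ln f via a Riemann sum."""
    p = pdf[pdf > 1e-300]
    return -np.sum(p * np.log(p)) * dx

f_X = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)                 # X ~ N(0, 1)
f_Y = 0.5 * np.exp(-np.abs(x))                               # Y ~ Laplace(0, 1)
f_Xs = np.exp(-x**2 / (2 * lam)) / np.sqrt(2 * np.pi * lam)  # pdf of sqrt(lam)*X
s = np.sqrt(1 - lam)
f_Ys = 0.5 / s * np.exp(-np.abs(x) / s)                      # pdf of sqrt(1-lam)*Y

f_Z = np.convolve(f_Xs, f_Ys, mode="same") * dx              # pdf of the sum

lhs = np.exp(2 * entropy(f_Z))
rhs = lam * np.exp(2 * entropy(f_X)) + (1 - lam) * np.exp(2 * entropy(f_Y))
print(lhs >= rhs)
```

Since $Y$ is not Gaussian, the inequality is strict here; for two Gaussians with identical covariance the two sides coincide.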

In quantum information theory, an analogous operation to the convolution (1) is given by the action of a beamsplitter with transmissivity $0 \le \lambda \le 1$ on a quantum state (i.e., a linear positive operator with unit trace) $\hat{\rho}_{AB}$ which is bipartite on two $n$-mode Gaussian quantum systems $A$ and $B$. This action has the form

$$\mathcal{B}_\lambda(\hat{\rho}_{AB}) = \mathrm{Tr}_B\!\left[U_\lambda\, \hat{\rho}_{AB}\, U_\lambda^\dagger\right], \tag{5}$$

where $A$ is again an $n$-mode quantum system, $U_\lambda$ is the beamsplitter unitary with transmissivity $\lambda$, and $\mathrm{Tr}_B$ denotes the partial trace over the second system. The mathematical motivation of the study of this operation is that in the special case of a product state, that is $\hat{\rho}_{AB} = \hat{\rho}_A \otimes \hat{\rho}_B$, it is formally similar to the convolution described in (1) on the level of Wigner functions. For the beamsplitter (5), several important inequalities in the same spirit as in classical information theory have been established [7, 8, 9, 10, 11]. For instance, the quantum Entropy Power Inequality reads

$$e^{\frac{S(\mathcal{B}_\lambda(\hat{\rho}_A \otimes \hat{\rho}_B))}{n}} \ge \lambda\, e^{\frac{S(\hat{\rho}_A)}{n}} + (1-\lambda)\, e^{\frac{S(\hat{\rho}_B)}{n}}\,, \tag{6}$$

with $S(\hat{\rho}) = -\mathrm{Tr}\!\left[\hat{\rho} \ln \hat{\rho}\right]$ being the von Neumann entropy of a quantum state. Unlike in the classical setting, a conditional Entropy Power Inequality for the operation (5) does not trivially follow from the unconditioned inequality (6). However, it was recently established in [11] that such an inequality holds nonetheless: For a joint quantum state $\hat{\rho}_{ABM}$ such that $A$ and $B$ are conditionally independent given the memory system $M$, we have

$$e^{\frac{S(\mathcal{B}_\lambda(\hat{\rho}_{ABM})\,|\,M)}{n}} \ge \lambda\, e^{\frac{S(A|M)}{n}} + (1-\lambda)\, e^{\frac{S(B|M)}{n}}\,, \tag{7}$$

where $S(A|M) = S(AM) - S(M)$ is the quantum conditional entropy. The conditional independence of $A$ and $B$ given $M$ is expressed with the condition that the quantum conditional mutual information equals zero:

$$I(A{:}B|M) = S(A|M) + S(B|M) - S(AB|M) = 0\,. \tag{8}$$
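The conditional-independence condition has a simple classical analogue that can be checked directly: for discrete variables constructed to be independent once $M$ is fixed, the conditional mutual information vanishes. All probability tables below are made-up illustrations.

```python
import numpy as np

# Toy classical check of condition (8): if X and Y are drawn independently
# once M is fixed, the conditional mutual information I(X:Y|M) vanishes.

p_m = np.array([0.5, 0.5])                     # P(M)
p_x_given_m = np.array([[0.9, 0.1],            # P(X | M=0)
                        [0.2, 0.8]])           # P(X | M=1)
p_y_given_m = np.array([[0.3, 0.7],            # P(Y | M=0)
                        [0.6, 0.4]])           # P(Y | M=1)

# joint P(X, Y, M) built so that X and Y are conditionally independent given M
p = np.einsum('m,mx,my->xym', p_m, p_x_given_m, p_y_given_m)

def H(q):
    """Shannon entropy of a (possibly multi-dimensional) probability table."""
    q = q[q > 0]
    return -np.sum(q * np.log(q))

# I(X:Y|M) = H(X,M) + H(Y,M) - H(X,Y,M) - H(M)
cmi = H(p.sum(axis=1)) + H(p.sum(axis=0)) - H(p) - H(p.sum(axis=(0, 1)))
print(abs(cmi) < 1e-12)
```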

Our work concerns yet another convolution operation, which mixes a probability density function $f$ on phase space $\mathbb{R}^{2n}$ with an $n$-mode quantum state $\hat{\rho}$:

$$f \star \hat{\rho} = \int_{\mathbb{R}^{2n}} f(\xi)\, D(\xi)\, \hat{\rho}\, D(\xi)^\dagger\, \mathrm{d}\xi\,, \tag{9}$$

where $D(\xi)$, $\xi \in \mathbb{R}^{2n}$, are the Weyl displacement operators in phase space. This operation was first introduced by Werner in [12]. Werner established a number of results regarding (9), most notably a Young-type inequality. In [13], more inequalities involving this operation were shown, most prominently the Entropy Power Inequality

$$e^{\frac{S(f \star \hat{\rho})}{n}} \ge e^{\frac{S(f)}{n}} + e^{\frac{S(\hat{\rho})}{n}}\,. \tag{10}$$

In the context of mixing times of semigroups, the authors in [14] have used this convolution extensively and proved various properties which are related to the discussion of the Entropy Power Inequality.
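For intuition about the convolution (9), a standard special case can be checked numerically: Gaussian displacement noise of mean photon number $\bar{n}$ applied to the vacuum produces the thermal state with mean photon number $\bar{n}$. The sketch below uses the identity $D(\alpha)|0\rangle = |\alpha\rangle$ and the Glauber-Sudarshan representation of the thermal state; the Fock cutoff and quadrature resolutions are ad hoc choices.

```python
import numpy as np
from math import factorial

# Truncated-Fock-basis illustration of the convolution (9): Gaussian
# displacement noise with mean photon number nbar applied to the vacuum
# gives the thermal state with mean photon number nbar.

N = 30                       # Fock-space cutoff
nbar = 0.5                   # mean photon number of the noise

fact = np.array([float(factorial(k)) for k in range(N)])

def coherent(alpha):
    """Fock coefficients of the coherent state |alpha> = D(alpha)|0>."""
    k = np.arange(N)
    return np.exp(-abs(alpha)**2 / 2) * alpha**k / np.sqrt(fact)

rs = np.linspace(0.0, 4.0, 160)
ths = np.linspace(0.0, 2 * np.pi, 32, endpoint=False)
rho = np.zeros((N, N), dtype=complex)
for r in rs:
    for th in ths:
        psi = coherent(r * np.exp(1j * th))
        # noise pdf f(alpha) = exp(-|alpha|^2/nbar)/(pi*nbar),
        # polar measure d^2alpha = r dr dtheta
        rho += np.exp(-r**2 / nbar) / (np.pi * nbar) * r * np.outer(psi, psi.conj())
rho *= (rs[1] - rs[0]) * (ths[1] - ths[0])
rho /= np.trace(rho).real

thermal = np.diag((nbar / (1 + nbar)) ** np.arange(N)) / (1 + nbar)
print(np.max(np.abs(rho - thermal)))
```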

### 1.1 Our contribution

Similarly to the work carried out in [11] for the beamsplitter, we prove the conditional version of the Entropy Power Inequality for the convolution given by (9). Let us consider an $n$-mode Gaussian quantum system $A$, a generic quantum system $M$ and a classical system $C$ which “stores” a classical probability density function $f$. Let us further consider the map $\mathcal{N}$ acting as $\mathcal{N}(f \otimes \hat{\rho}) = f \star \hat{\rho}$, linearly extended to generic states as

(11) |

We show in Theorem 5 that the conditional entropy of the output of $\mathcal{N}$ is lower bounded as

$$e^{\frac{S(\mathcal{N}(\hat{\rho}_{CAM})\,|\,M)}{n}} \ge e^{\frac{S(A|M)}{n}} + e^{\frac{S(C|M)}{n}} \tag{12}$$

if $I(A{:}C|M) = 0$, i.e., the systems $A$ and $C$ are conditionally independent given the system $M$. As a special case, this inequality implies useful inequalities about the convolution (9) in the case when the input is uncorrelated with the memory:

(13) |

In the particular case when the classical noise is a Gaussian random variable with probability density function $f$, the inequality becomes

(14) |

The special cases mentioned above are important in various applications, as we will show later.

This conditional Entropy Power Inequality is tight in the sense that it is saturated for any couple of values of the two conditional entropies on the right-hand side by an appropriate sequence of Gaussian input states, which we show in Theorem 6. This behaviour is similar to the case of the beamsplitter. On the way to this inequality, several intermediate results are proven which make up a set of information-theoretic inequalities regarding conditional Fisher information and conditional entropies. To complete the picture of information-theoretic inequalities involving quantum conditional entropies, we apply our results to prove a number of additional inequalities in a similar spirit to the classical case. Among them are the concavity of the quantum conditional entropy along the heat flow (Theorem 8) and an isoperimetric inequality for quantum conditional entropies (Lemma 7). Furthermore, we show in subsection 8.3 how, similarly to the case of the beamsplitter, the conditional Entropy Power Inequality implies a converse bound on the entanglement-assisted classical capacity of a non-Gaussian quantum channel, the classical noise channel defined in (9).

Another part of our work regards the quantum Ornstein-Uhlenbeck (qOU) semigroup. It is the one-parameter semigroup of completely positive and trace-preserving (CPTP) maps $\left(e^{t\mathcal{L}}\right)_{t \ge 0}$ on the one-mode Gaussian quantum system generated by the Liouvillian

$$\mathcal{L} = \lambda^2\, \mathcal{D}_{\hat{a}} + \mu^2\, \mathcal{D}_{\hat{a}^\dagger}\,, \qquad \lambda > \mu \ge 0\,, \tag{15}$$

where

$$\mathcal{D}_{\hat{X}}(\hat{\rho}) = \hat{X}\, \hat{\rho}\, \hat{X}^\dagger - \frac{1}{2}\left(\hat{X}^\dagger \hat{X}\, \hat{\rho} + \hat{\rho}\, \hat{X}^\dagger \hat{X}\right) \tag{16}$$

and $\hat{a}$ is the ladder operator of the mode. This quantum dynamical semigroup has a unique fixed point given by

$$\hat{\omega} = \left(1 - \frac{\mu^2}{\lambda^2}\right) \sum_{k=0}^{\infty} \left(\frac{\mu^2}{\lambda^2}\right)^{k} |k\rangle\langle k|\,, \tag{17}$$

where $\left\{|k\rangle\right\}_{k \in \mathbb{N}}$ is the Fock basis of the mode. It has been shown in [15] using methods of gradient flow that the quantum Ornstein-Uhlenbeck semigroup converges in relative entropy to the fixed point at an exponential rate given by the exponent $\lambda^2 - \mu^2$:

$$S\!\left(e^{t\mathcal{L}}(\hat{\rho})\,\middle\|\,\hat{\omega}\right) \le e^{-\left(\lambda^2 - \mu^2\right)t}\; S\!\left(\hat{\rho}\,\middle\|\,\hat{\omega}\right), \tag{18}$$

where $S(\hat{\rho}\,\|\,\hat{\sigma}) = \mathrm{Tr}\!\left[\hat{\rho}\left(\ln\hat{\rho} - \ln\hat{\sigma}\right)\right]$ is the quantum relative entropy [16].
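The convergence statement (18) can be probed numerically. For an initial state diagonal in the Fock basis, the qOU evolution reduces to a classical birth-death process on the populations; the sketch below integrates this process with a crude Euler scheme (cutoff, rates, initial state and step size are arbitrary choices, and the generator normalization is the one used in (15)) and checks the exponential decay bound.

```python
import numpy as np

# Numerical probe of the decay bound (18). For a state diagonal in the Fock
# basis, the qOU generator reduces to a birth-death process on the
# populations p_k:
#   dp_k/dt = lam2*((k+1) p_{k+1} - k p_k) + mu2*(k p_{k-1} - (k+1) p_k).

lam2, mu2 = 1.0, 0.25            # lambda^2 > mu^2 >= 0
N, dt, T = 60, 1e-4, 2.0

k = np.arange(N)
omega = (1 - mu2 / lam2) * (mu2 / lam2) ** k     # geometric fixed point (17)

p = np.zeros(N)
p[5] = 1.0                                        # start in the Fock state |5>

def rel_ent(p, q):
    m = p > 0
    return np.sum(p[m] * np.log(p[m] / q[m]))

D0 = rel_ent(p, omega)
for _ in range(int(T / dt)):
    dp = np.zeros(N)
    dp[:-1] += lam2 * (k[:-1] + 1) * p[1:] - mu2 * (k[:-1] + 1) * p[:-1]
    dp[1:] += mu2 * k[1:] * p[:-1] - lam2 * k[1:] * p[1:]
    p += dt * dp

ratio = rel_ent(p, omega) / D0
print(ratio <= np.exp(-(lam2 - mu2) * T))   # decay at least as fast as (18)
```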

We show that a simple application of the linear version of the Entropy Power Inequality (6) for the beamsplitter is sufficient to prove this convergence rate. We also show a simple derivation of an analogous result for the case of a bipartite quantum system $AB$, where the system $A$ undergoes a qOU evolution, using the linear conditional Entropy Power Inequality for the beamsplitter recently proven in [11]. Namely, we are going to show in Theorem 9 that

(19) |

which directly implies the statement (18). Finite-dimensional versions of the statement (19) for general semigroups have recently been studied by Bardet [17]. Our argument shows that Entropy Power Inequalities are a useful tool to study the convergence rate of semigroups.

The proof of the unconditioned Entropy Power Inequality (10) given in [13] exhibits certain regularity issues regarding the Fisher information: the Fisher information was defined as the Hessian of a relative entropy, without a proof of well-definedness. Various proofs of the Entropy Power Inequality for the beamsplitter had similar issues [7, 8, 9]. They were settled in [11] by the adoption of a proof technique which starts with an integral version of the quantum Fisher information. We adopt a similar approach here. Since the conditional Entropy Power Inequality reduces to the unconditioned inequality in the case where the memory system $M$ is trivial, this also gives a more rigorous proof of the unconditioned Entropy Power Inequality. As such, our work can be seen as both a completion of the work carried out in [13] and a generalization thereof.

We now sketch the basic structure of the proof of our main result. The main ingredients in proving Entropy Power Inequalities [5, 7, 9, 11, 13] are similar in all proofs, which all use the evolution under the heat semigroup. These ingredients are the Fisher information, de Bruijn’s identity, the Stam inequality, and a result on the asymptotic scaling of the entropy under the heat flow. First we define a “classical-quantum” integral conditional Fisher information, by which we mean a Fisher information of a classical system which is conditioned on a quantum system. We show in Theorem 1 that this quantity satisfies a de Bruijn identity, which links it to the change of the conditional entropy under the heat flow. We show the regularity of the integral conditional Fisher information in Theorem 2 and then prove the conditional Stam inequality in Theorem 3. In the next part, we show in Theorem 4 that the quantum conditional entropy of a classical system undergoing the classical heat flow evolution conditioned on a quantum system satisfies the same universal scaling which was shown for the quantum conditional entropy of a quantum system undergoing the quantum heat flow evolution conditioned on a quantum system. It is crucial for the proof of our conditional Entropy Power Inequality that these two scalings are not only both universal, but also the same. This scaling then implies that asymptotically, the inequality we want to prove becomes an equality. Then it is left to show that it is enough to consider the inequality in the asymptotic limit, i.e., the difference of the two sides of the inequality behaves under the heat flow in a way which only makes the inequality “worse”.

The paper is structured as follows: In section 2 we present bosonic quantum systems and the relevant quantities required for our discussion. In the following section 3 the integral version of the quantum conditional Fisher information is adapted to the convolution (9). Sections 4 and 5 are dedicated to the proof of various inequalities that are central to the proof of Entropy Power Inequalities, such as the Stam inequality and an asymptotic scaling of the conditional entropy. The following section 6 then proves the conditional Entropy Power Inequality for the convolution (9) as our main result. Optimality of the conditional Entropy Power Inequality is shown in section 7. This is followed by the derivation of various related information-theoretic inequalities involving the quantum conditional entropy in section 8. Before concluding, we apply the conditional Entropy Power inequality to bound the convergence rate of bipartite systems where one system undergoes a quantum Ornstein-Uhlenbeck semigroup evolution in section 9.

## 2 Preliminaries

Let us consider an $n$-mode bosonic system with “position” and “momentum” operators $\hat{Q}_k$, $\hat{P}_k$ for each mode $k = 1, \ldots, n$ which satisfy the canonical commutation relations $[\hat{Q}_j, \hat{P}_k] = i\,\delta_{jk}$. If we denote the vector of position and momentum operators by $\hat{R} = (\hat{Q}_1, \hat{P}_1, \ldots, \hat{Q}_n, \hat{P}_n)^T$, the canonical commutation relations become

$$[\hat{R}_j, \hat{R}_k] = i\,\Omega_{jk}\,, \tag{20}$$

where $\Omega = \bigoplus_{k=1}^{n} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ is the symplectic form.

The Weyl displacement operators are defined by

$$D(\xi) = \exp\!\left(i\, \hat{R}^{T} \Omega\, \xi\right), \qquad \xi \in \mathbb{R}^{2n}\,. \tag{21}$$

The displacement operators satisfy the commutation relations

$$D(\xi)\, D(\eta) = e^{-\frac{i}{2}\, \xi^{T} \Omega\, \eta}\; D(\xi + \eta)\,, \tag{22}$$

as well as the “displacement property” on the mode operators

$$D(\xi)^\dagger\, \hat{R}\, D(\xi) = \hat{R} + \xi\,. \tag{23}$$

Given an $n$-mode quantum state $\hat{\rho}$, we define its first moments as

$$r = \mathrm{Tr}\!\left[\hat{\rho}\, \hat{R}\right] \tag{24}$$

and its covariance matrix (for finite first moments) as

$$\sigma_{jk} = \frac{1}{2}\, \mathrm{Tr}\!\left[\hat{\rho}\, \left\{\hat{R}_j - r_j,\; \hat{R}_k - r_k\right\}\right], \tag{25}$$

with the anticommutator $\{\hat{A}, \hat{B}\} = \hat{A}\hat{B} + \hat{B}\hat{A}$.

The aforementioned concepts of displacements and first and second moments are the quantum analogues of the classical concepts. For a probability distribution function $f$ on $\mathbb{R}^{2n}$, we define its displacement by a vector $\xi \in \mathbb{R}^{2n}$ as

$$f_\xi(x) = f(x - \xi)\,. \tag{26}$$

Furthermore, we denote by the energy of the function $f$ the sum of its second moments,

$$E(f) = \int_{\mathbb{R}^{2n}} |x|^2\, f(x)\, \frac{\mathrm{d}^{2n}x}{(2\pi)^n}\,. \tag{27}$$

The quantities $m_j = \int_{\mathbb{R}^{2n}} x_j\, f(x)\, \frac{\mathrm{d}^{2n}x}{(2\pi)^n}$ are called the first moments of $f$, and

$$\Sigma_{jk} = \int_{\mathbb{R}^{2n}} \left(x_j - m_j\right)\left(x_k - m_k\right) f(x)\, \frac{\mathrm{d}^{2n}x}{(2\pi)^n} \tag{28}$$

is called the covariance matrix of $f$. We remark that we have rescaled the Lebesgue measure on $\mathbb{R}^{2n}$ in these definitions, purely for convenience.

###### Definition 1 (Quantum heat semigroup).

The quantum heat semigroup is the following time evolution for any quantum state :

(29) | ||||

(30) |

where is a displacement of the state by .

The quantum heat semigroup has a semigroup structure, that is, for any $t, s \ge 0$, we have

(31) |

We note that if $f_t$ is the probability distribution of a centered Gaussian random variable with covariance matrix proportional to $t$, then we have

(32) |

The quantum heat semigroup is the quantum analogue of the classical heat semigroup, which we recall here. It can be written in an analogous way to the quantum heat semigroup:

###### Definition 2 (Classical heat semigroup).

The classical heat semigroup is the following time evolution defined on a probability density function $f$:

(33) | ||||

(34) |

We also have that for any $t, s \ge 0$

(35) |

We note again that we have

(36) |

where

$$(g * f)(x) = \int_{\mathbb{R}^{2n}} g(y)\, f(x - y)\, \frac{\mathrm{d}^{2n}y}{(2\pi)^n} \tag{37}$$

is the well-known classical convolution of the two functions $g$ and $f$ (with a factor of $(2\pi)^n$ in the Lebesgue measure on $\mathbb{R}^{2n}$, which we introduce purely for convenience).
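On a grid, the classical heat semigroup is simply convolution with a centered Gaussian whose variance grows linearly in time, and the semigroup property can be verified directly. The grid parameters below are arbitrary, and the $(2\pi)^n$ rescaling of the measure is ignored since it does not affect the check.

```python
import numpy as np

# Grid check of the classical heat semigroup: evolving for time t is
# convolution with a centered Gaussian of variance t, and evolving for t
# and then s agrees with a single evolution for t + s.

x = np.linspace(-30.0, 30.0, 2**12 + 1)   # odd point count centers the convolution
dx = x[1] - x[0]

def heat(f, t):
    """Convolve f with the Gaussian kernel of variance t."""
    g = np.exp(-x**2 / (2 * t)) / np.sqrt(2 * np.pi * t)
    return np.convolve(f, g, mode="same") * dx

f0 = 0.5 * np.exp(-np.abs(x))             # Laplace initial density
two_steps = heat(heat(f0, 0.7), 1.3)
one_step = heat(f0, 2.0)
print(np.max(np.abs(two_steps - one_step)))   # small discretization error
```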

The convolution (9) is compatible with displacements and with the heat semigroup evolution in a convenient way, which is stated in the following two Lemmas:

###### Lemma 1 (Compatibility with displacements of the convolution (9)).

[13, Lemma 2] Let $f$ be a probability distribution and $\hat{\rho}$ an $n$-mode quantum state. Then we have for any $\xi, \eta \in \mathbb{R}^{2n}$:

(38) |

where .

###### Remark 1.

Lemma 2 in [13] only states the compatibility for the case where the two displacement vectors are parallel. Nonetheless, the proof given there also works to prove the statement above.

###### Lemma 2 (Compatibility with the heat semigroup of the convolution (9)).

###### Definition 3 (Shannon differential entropy).

For a classical $\mathbb{R}^n$-valued random variable $X$ with a probability density function $f$, we define the Shannon differential entropy as

$$S(X) = -\int_{\mathbb{R}^n} f(x)\, \ln f(x)\, \mathrm{d}x\,. \tag{40}$$
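As a quick check of this definition, a Riemann sum of $-f\ln f$ for a Gaussian reproduces the closed form $\frac{1}{2}\ln(2\pi e s^2)$; the grid choices are arbitrary.

```python
import numpy as np

# Riemann-sum check of the Shannon differential entropy of a Gaussian
# with standard deviation s against the closed form (1/2) ln(2 pi e s^2).

s = 1.7
x = np.linspace(-25.0, 25.0, 2**13)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * s**2)) / np.sqrt(2 * np.pi * s**2)

S_num = -np.sum(f[f > 0] * np.log(f[f > 0])) * dx
S_exact = 0.5 * np.log(2 * np.pi * np.e * s**2)
print(abs(S_num - S_exact))   # small discretization error
```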

We continue with a short review of Gaussian quantum states. An $n$-mode quantum state $\hat{\rho}$ is called Gaussian if it has the following form [16]:

$$\hat{\rho} = \frac{\exp\!\left(-\frac{1}{2}\,(\hat{R} - r)^{T} h\, (\hat{R} - r)\right)}{\mathrm{Tr}\,\exp\!\left(-\frac{1}{2}\,(\hat{R} - r)^{T} h\, (\hat{R} - r)\right)}\,, \tag{41}$$

where $h$ is a positive definite real $2n \times 2n$ matrix and $r$ is the vector of first moments of the state. The entropy of such a Gaussian state is given by

$$S(\hat{\rho}) = \sum_{k=1}^{n} g\!\left(\nu_k - \tfrac{1}{2}\right), \tag{42}$$

where $g(x) = (x+1)\ln(x+1) - x\ln x$ and $\nu_1, \ldots, \nu_n$ are the symplectic eigenvalues of the covariance matrix $\sigma$, i.e., the absolute values of the eigenvalues of $i\,\Omega\,\sigma$.

A Gaussian state is called thermal if its first moments are zero and the matrix $h$ is proportional to the identity. Such thermal states have the special form

$$\hat{\omega}_\beta = \frac{e^{-\beta \hat{H}}}{\mathrm{Tr}\, e^{-\beta \hat{H}}}\,, \qquad \beta > 0\,, \tag{43}$$

for the Hamiltonian $\hat{H} = \sum_{k=1}^{n} \hat{a}_k^\dagger \hat{a}_k$ of $n$ harmonic oscillators. Gaussian states fulfill a special extremality property: among all states with a given average energy, thermal states maximize the von Neumann entropy. Furthermore, among all states with fixed covariance matrix, the Gaussian state is the one with maximal entropy [18, 19].

In our proofs, we are going to require the notion of quantum conditional Fisher information of quantum systems which was introduced in [11]. We repeat the main properties of this quantity here. For a thorough definition and proofs we refer to [11]. Before giving this definition, we clarify the notion of “classical-quantum” states on a system $CM$ if the classical system $C$ is continuous. A state on $CM$ is a probability measure on the values of the classical system which takes values in the trace class operators, i.e., a collection of trace class operators on $M$ with the normalization condition

(44) |

This state “stores” a classical probability distribution $f$ in the classical system $C$ if its marginal on $C$ has $f$ as probability distribution. The marginals of the state are

(45) |

and the conditional states on given the value of are

(46) |

We do not consider the case where the probability measure is not absolutely continuous with respect to the Lebesgue measure, since in this case its Shannon differential entropy is not defined. For a more detailed discussion, we refer to [20, Section III.A.3] and references therein ([21] and [22, Chapter 4.6-4.7]).

We can also define displacements of such a classical-quantum state, in which the classical system is displaced by one phase-space vector and the quantum system by another.

###### Definition 4 (Quantum integral conditional Fisher information).

[11, Definition 6] Let $A$ be an $n$-mode bosonic quantum system, and $M$ a generic quantum system. Let $\hat{\rho}_{AM}$ be a quantum state on $AM$. For any $t \ge 0$, the integral Fisher information of $A$ conditioned on $M$ is given by

(47) | ||||

(48) |

where $Z$ is a classical Gaussian random variable with values in $\mathbb{R}^{2n}$ and probability density function

(49) |

and is the quantum state on such that its marginal on is and for any

(50) |

###### Definition 5 (Quantum conditional Fisher information).

Finally, we are going to require a notion of conditional entropy of a classical system which is conditioned on a quantum system. If the system on which we condition is classical, the conditional entropy is simply

(52) |

where $f$ is the probability distribution of the system on which we condition. This definition is independent of whether the other system is classical or quantum. We now define the conditional entropy of a classical system which is conditioned on a quantum system in a way such that the chain rule for entropies is preserved.

###### Definition 6 (Quantum conditional entropy of classical-quantum systems).

Let $C$ be a classical system, $M$ a quantum system. We define the conditional entropy of $C$ given $M$ as

$$S(C|M) = S(C) + S(M|C) - S(M)\,, \tag{53}$$

whenever the three quantities appearing on the right-hand side are finite.

The case where these quantities are not all finite will not be part of our consideration.

## 3 Quantum integral conditional Fisher information

In this section we consider a generic quantum system $M$ and a classical system $C$. We are going to define the quantum integral conditional Fisher information of $C$ conditioned on $M$ and prove a de Bruijn identity as well as a number of useful properties.

###### Definition 7 (quantum integral conditional Fisher information).

For a quantum state $\hat{\rho}_{CM}$ on $CM$ whose marginal on $C$ is $f$ and $t \ge 0$, define the integral Fisher information of $C$ conditioned on $M$ as

(54) | ||||

(55) |

where is a classical Gaussian random variable with probability density function equal to

(56) |

and is the quantum state on such that its marginal on is equal to , and for any , we have

(57) |

The marginal of on is equal to

(58) |

The marginal on has probability density function .

###### Theorem 1 (Integral conditional de Bruijn identity).

(59) |

###### Proof.

We use the definition of the conditional mutual information as well as the definition of the conditional quantum entropy when the system on which we condition is classical. We calculate

(60) | ||||

(61) | ||||

(62) | ||||

(63) |

The second-to-last step follows because the entropy is invariant under displacements of the classical system. ∎

We now show that the integral conditional Fisher information defined as above, as a function of , is continuous, increasing, and concave. The proof strategy is similar to the proof of regularity for the quantum integral conditional Fisher information given in [11].

###### Lemma 3 (Continuity of the integral conditional Fisher information).

Let $\hat{\rho}_{CM}$ be a state such that the function $x \mapsto \hat{\rho}_M(x)$ is continuous with respect to the trace norm and the marginal on $C$ has finite average energy. Then, the integral conditional Fisher information is a continuous function of $t$.

###### Proof.

From the de Bruijn identity of Theorem 1, it is sufficient to prove that

(64) |

where we have defined for any

(65) |

From the data processing inequality, for any

(66) |

It is then sufficient to prove that

(67) |

We have from the chain rule

(68) |

From [23], Remark 9.3.8, and [24, 25, 26], the Shannon differential entropy is upper semicontinuous on the set of probability measures on absolutely continuous with respect to the Lebesgue measure and with finite average energy, and

(69) |

On the other hand, we have

(70) |

Since the function $x \mapsto \hat{\rho}_M(x)$ is continuous with respect to the trace norm, we have for any

(71) |

Because the relative entropy is nonnegative, we get from Fatou’s lemma

(72) |

Since the relative entropy is lower semicontinuous, we have for any

(73) |

Combining (72), (73) and (70) we get

(74) |

∎

###### Lemma 4.

For any $t, s \ge 0$,

(75) |

###### Proof.

Follows from the semigroup structure of . ∎

###### Lemma 5.

For any $t \ge 0$,

(76) |

###### Proof.

Follows from the data processing inequality for the quantum mutual information. ∎

###### Lemma 6.

For any $t \ge 0$,

(77) | ||||

(78) |

###### Proof.

Follows from Theorem 1. ∎

###### Theorem 2 (regularity of the integral conditional Fisher information).

For any quantum state $\hat{\rho}_{CM}$ on $CM$ such that the conditions of Lemma 3 are fulfilled, the integral conditional Fisher information is a continuous, increasing, and concave function of $t$.

## 4 Quantum conditional Fisher information

###### Definition 8.

For a quantum state $\hat{\rho}_{CM}$ on $CM$ such that the conditions of Lemma 3 are fulfilled, we define the Fisher information of $C$ conditioned on $M$ as

(82) |

This limit always exists because the integral conditional Fisher information is continuous and concave in $t$ by Theorem 2.

###### Proposition 1 (Quantum conditional de Bruijn).

Assume the hypotheses of Theorem 2. Then we have

(83) |

###### Proof.

Follows from the integral conditional de Bruijn identity given in Theorem 1. ∎
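The de Bruijn identity has a classical counterpart, $\frac{\mathrm{d}}{\mathrm{d}t} S(f_t) = \frac{1}{2} J(f_t)$ for the heat flow, with $J(f) = \int (f')^2/f\,\mathrm{d}x$ the classical Fisher information. This can be checked on a grid; all numerical parameters below are arbitrary.

```python
import numpy as np

# Grid check of the classical de Bruijn identity d/dt S(f_t) = (1/2) J(f_t)
# under the heat flow, comparing a finite difference of the entropy with
# the Fisher information J(f) = integral of (f')^2 / f.

x = np.linspace(-30.0, 30.0, 2**13 + 1)
dx = x[1] - x[0]

def heat(f, t):
    g = np.exp(-x**2 / (2 * t)) / np.sqrt(2 * np.pi * t)
    return np.convolve(f, g, mode="same") * dx

def entropy(f):
    m = f > 1e-300
    return -np.sum(f[m] * np.log(f[m])) * dx

def fisher(f):
    df = np.gradient(f, dx)
    m = f > 1e-12
    return np.sum(df[m] ** 2 / f[m]) * dx

f0 = 0.5 * np.exp(-np.abs(x))        # Laplace initial density
t, eps = 1.0, 1e-3
dS = (entropy(heat(f0, t + eps)) - entropy(heat(f0, t - eps))) / (2 * eps)
print(abs(dS - 0.5 * fisher(heat(f0, t))))   # small
```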

### 4.1 Stam inequality

###### Theorem 3.

Let $A$ be an $n$-mode quantum system, $C$ a classical system and $M$ a generic quantum system. Let $\hat{\rho}_{CAM}$ be a quantum state on $CAM$ such that its marginal on $C$ has a probability density function $f$. Let $\hat{\rho}_{CAM}$ further fulfill

(84) |

Let us suppose that $C$ and $A$ are conditionally independent given $M$:

$$I(C{:}A|M) = 0\,. \tag{85}$$

Then the linear conditional Stam inequality holds:

(86) |

where

(87) |

Optimizing over the choice of coefficients, we obtain the conditional Stam inequality

(88) |

###### Proof.

We prove the following:

(89) |

Because the integral conditional Fisher information is increasing and concave, the Stam inequality follows by taking the derivative at $t = 0$.

By definition, we have for any that

(90) |

for an $\mathbb{R}^{2n}$-valued Gaussian random variable with probability density function

(91) |

and has as marginal on and for any , it fulfills

(92) |

We now define the state as the state with marginal on equal to and for any ,

(93) |

i.e., the system is displaced by and the system is displaced by . By compatibility of the convolution (9) with displacements, we have

(94) |

We notice that

(95) | ||||

(96) |

Now we obtain by data processing,

(97) | |||