
# Coding for classical-quantum channels with rate limited side information at the encoder: An information-spectrum approach

Naqueeb Ahmad Warsi and Justin Coon, Department of Engineering Science, University of Oxford. Email: naqueeb.ahmedwarsi@eng.ox.ac.uk, justin.coon@eng.ox.ac.uk
###### Abstract

We study the hybrid classical-quantum version of the channel coding problem for the famous Gel’fand-Pinsker channel. In the classical setting for this channel the conditional distribution of the channel output given the channel input is a function of a random parameter called the channel state. We study this problem when a rate limited version of the channel state is available at the encoder for the classical-quantum Gel’fand-Pinsker channel. We establish the capacity region for this problem in the information-spectrum setting. The capacity region is quantified in terms of spectral-sup classical mutual information rate and spectral-inf quantum mutual information rate.

## 1 Introduction

In the traditional information theory literature it is common to study the underlying problems assuming that the channel characteristics do not change over multiple uses. The proofs appeal to typicality of sequences, or typical subspaces in the quantum setting [1]: the empirical distribution of symbols in a long sequence of trials will, with high probability, be close to the true distribution [2]. However, information theoretic arguments based on typicality or the related Asymptotic Equipartition Property (AEP) assume that the source and channel are stationary and/or ergodic (memoryless), assumptions that are not always valid; for example, in [3] Gray analyzes asymptotically mean stationary sources, which are neither stationary nor ergodic. To overcome such assumptions, Verdú and Han pioneered the technique of information-spectrum methods in their seminal work [4], in which they defined the notions of limit inferior and limit superior in probability. They then used these definitions to establish the capacity of general channels (channels that are not necessarily stationary and/or memoryless). Since this work of Verdú and Han there has been considerable interest in generalizing the results of information theory to the information-spectrum setting; see, for example, [5, 6, 7, 8] and references therein.

This general technique of information-spectrum methods, wherein no assumptions are made on the channels and/or sources, was extended to the quantum case by Hayashi, Nagaoka and Ogawa. Using this method they studied the problem of quantum hypothesis testing [9, 10], derived the classical capacity formula of general quantum channels [11] and established a general formula for the optimal rate of entanglement concentration [12]. Since the work of Hayashi, Nagaoka and Ogawa, the study of various quantum information theoretic protocols in the information-spectrum setting has been one of the most interesting areas of research in theoretical quantum information science. In [13] Bowen and Datta carried this approach further to study various other quantum information theoretic problems. In [14] Datta and Renner showed that there is a close relationship between the information theoretic quantities that arise in the information-spectrum scenario and the smooth Rényi entropies which play a crucial role in one-shot information theory. In [15] Radhakrishnan, Sen and Warsi proved a one-shot version of the Marton inner bound for classical-quantum broadcast channels. They then showed that their one-shot bounds yield the quantum information-spectrum generalization of the Marton inner bound in the asymptotic setting.

In this paper, we carry forward the subject of studying quantum information theoretic protocols in the information-spectrum setting. We study the problem of communication over the sequence of channels $\{\rho^{B^n}_{x^n,s^n} : x^n \in \mathcal{X}^n, s^n \in \mathcal{S}^n\}_{n=1}^{\infty}$ (also called the classical-quantum Gel’fand-Pinsker channel), where $\mathcal{X}$ and $\mathcal{S}$ are the input and state alphabets and $\rho^{B^n}_{x^n,s^n}$ is a positive operator with unit trace acting on the Hilbert space $\mathcal{H}^{B^n}$. We establish the capacity region of this channel when a rate-limited version of the state sequence is available at the encoder. Figure 1 below illustrates this communication scheme.

The classical version of this problem was studied by Heegard and El Gamal (achievability) in [16] in the asymptotic i.i.d. setting. They proved the following:

###### Theorem 1.

Fix a discrete memoryless channel with state characterized by $p_{Y|X,S}$. Let the rate pair $(R, R_S)$ be such that

$$R \leq I[U;Y] - I[U;\tilde{S}], \qquad R_S \geq I[S;\tilde{S}],$$

for some distribution $p_S\,p_{\tilde{S}|S}\,p_{U|\tilde{S}}\,p_{X|U\tilde{S}}$. Then, the rate pair $(R, R_S)$ is achievable.

Furthermore, in [16] Heegard and El Gamal argued that Theorem 1 implies the result of Gel’fand and Pinsker [17], who showed the following:

###### Theorem 2.

Fix a discrete memoryless channel with state characterized by $p_{Y|X,S}$. The capacity of this channel when the state information is directly available non-causally at the encoder is

$$C = \max_{p_{U|S},\; g:\,\mathcal{U}\times\mathcal{S}\to\mathcal{X}} \left(I[U;Y] - I[U;S]\right),$$

where $X = g(U,S)$.

The above formula for the capacity is quite intuitive. If we set $S = \emptyset$ (a channel without state) and $U = X$ in Theorem 2 then we rederive Shannon’s famous channel capacity formula [18]. However, when $I[U;S] > 0$, Theorem 2 implies that there is a loss in the maximum transmission rate per channel use at which Alice can communicate to Bob. This loss in the transmission rate is reflected by the term $I[U;S]$. Thus, $I[U;S]$ can be thought of as the minimum number of bits Alice needs to send to Bob per channel use to help him get some information about the channel state sequence $S^n$. Bob can then use this information about $S^n$ to recover the intended message.
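The gain from precoding against a known state can be checked numerically. The following toy computation (our own illustration, not an example from the paper) evaluates $I[U;Y] - I[U;S]$ for a binary channel with additive interference $Y = X \oplus S$, using the precoding $g(U,S) = U \oplus S$ with $U$ uniform and independent of $S$; the objective then equals the interference-free capacity of one bit per channel use:

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in bits for a joint pmf given as a 2-D numpy array."""
    px = p_xy.sum(axis=1, keepdims=True)
    py = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float((p_xy[mask] * np.log2(p_xy[mask] / (px * py)[mask])).sum())

# Toy instance: S ~ Bernoulli(1/2) known at the encoder, channel Y = X XOR S.
p_s = np.array([0.5, 0.5])
p_u = np.array([0.5, 0.5])                 # U uniform, independent of S

p_us = np.outer(p_u, p_s)                  # joint p(u, s)
p_uy = np.zeros((2, 2))
for u in range(2):
    for s in range(2):
        x = u ^ s                          # precoding g(u, s) = u XOR s
        y = x ^ s                          # channel output: y = u
        p_uy[u, y] += p_u[u] * p_s[s]

rate = mutual_information(p_uy) - mutual_information(p_us)
print(rate)   # 1.0
```

Here the interference is cancelled exactly, so $I[U;Y] = 1$ while $I[U;S] = 0$.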

### Our Result

We establish the capacity region of the classical-quantum Gel’fand-Pinsker channel in the information-spectrum setting when a rate-limited version of the channel state is available at the encoder. In the information-spectrum setting the channel output need not be a tensor product state. Furthermore, the channel state is a sequence of arbitrarily distributed random variables. This extremely general setting is the hallmark of the information-spectrum approach. We prove the following:

###### Theorem 3.

Let $\{\rho^{B^n}_{x^n,s^n} : x^n \in \mathcal{X}^n, s^n \in \mathcal{S}^n\}_{n=1}^{\infty}$ be a sequence of classical-quantum Gel’fand-Pinsker channels. The capacity region for this sequence of channels with a rate-limited version of the channel state available only at the encoder is the set of rate pairs $(R, R_S)$ satisfying the following:

$$R \leq \underline{I}[U;B] - \overline{I}[U;\tilde{S}]; \qquad R_S \geq \overline{I}[S;\tilde{S}].$$

The information theoretic quantities are calculated with respect to the sequence of states $\{\Theta^{S^n\tilde{S}^nU^nX^nB^n}\}_{n=1}^{\infty}$, where for every $n$,

$$\Theta^{S^n\tilde{S}^nU^nX^nB^n} := \sum_{s^n,\tilde{s}^n,u^n,x^n} p_{S^n}(s^n)\, p_{\tilde{S}^n|S^n}(\tilde{s}^n|s^n)\, p_{U^n|\tilde{S}^n}(u^n|\tilde{s}^n)\, p_{X^n|U^n\tilde{S}^n}(x^n|u^n,\tilde{s}^n)\; |s^n\rangle\langle s^n|^{S^n} \otimes |\tilde{s}^n\rangle\langle\tilde{s}^n|^{\tilde{S}^n} \otimes |u^n\rangle\langle u^n|^{U^n} \otimes |x^n\rangle\langle x^n|^{X^n} \otimes \rho^{B^n}_{x^n,s^n}.$$
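For intuition, a state of this form can be assembled explicitly for a small instance. The sketch below (all distributions and the channel are our own toy choices, with $n = 1$ and binary alphabets throughout) builds the corresponding $32 \times 32$ classical-quantum state and checks that it is a valid density operator:

```python
import numpy as np

def proj(i, d=2):
    """Rank-one projector |i><i| in dimension d."""
    v = np.zeros(d); v[i] = 1.0
    return np.outer(v, v)

def kron_all(*ops):
    """Kronecker product of a list of operators, left to right."""
    out = np.array([[1.0]])
    for o in ops:
        out = np.kron(out, o)
    return out

# Toy all-binary instance with n = 1 (illustrative distributions only):
p_s = [0.5, 0.5]
p_st_s = [[0.9, 0.1], [0.1, 0.9]]            # p(s~ | s)
p_u_st = [[0.5, 0.5], [0.5, 0.5]]            # p(u | s~)
p_x_ust = lambda x, u, st: float(x == u)     # X = U deterministically
rho_B = lambda x, s: proj(x ^ s)             # toy cq output state on a qubit

theta = np.zeros((32, 32))
for s in range(2):
    for st in range(2):
        for u in range(2):
            for x in range(2):
                w = p_s[s] * p_st_s[s][st] * p_u_st[st][u] * p_x_ust(x, u, st)
                theta += w * kron_all(proj(s), proj(st), proj(u), proj(x),
                                      rho_B(x, s))

print(round(np.trace(theta), 10))   # 1.0: unit trace, as a density operator must have
```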

An immediate consequence of Theorem 3 is the following corollary:

###### Corollary 1.

(Hayashi and Nagaoka, [11]) The capacity of a sequence of classical-quantum channels $\{\rho^{B^n}_{x^n}\}_{n=1}^{\infty}$ is the following:

$$C = \sup_{\{X^n\}_{n=1}^{\infty}} \underline{I}[X;B].$$

Furthermore, the capacity of a sequence of classical-quantum Gel’fand-Pinsker channels with the channel state directly available at the encoder is the following:

$$C = \sup_{\{\Theta^n\}_{n=1}^{\infty}} \left(\underline{I}[U;B] - \overline{I}[U;S]\right),$$

where for every $n$,

$$\Theta^n = \sum_{s^n,u^n,x^n} p_{S^n}(s^n)\, p_{U^n|S^n}(u^n|s^n)\, p_{X^n|U^nS^n}(x^n|u^n,s^n)\; |s^n\rangle\langle s^n|^{S^n} \otimes |u^n\rangle\langle u^n|^{U^n} \otimes |x^n\rangle\langle x^n|^{X^n} \otimes \rho^{B^n}_{x^n,s^n}.$$
###### Proof.

The proof of the first claim follows by setting $S^n = \tilde{S}^n = \emptyset$ (a channel without state) in Theorem 3.

The proof of the second claim follows by setting $\tilde{S}^n = S^n$ in Theorem 3.

## 2 Definitions

###### Definition 1.

Let $\{(U^n, \tilde{S}^n)\}_{n=1}^{\infty}$ be a sequence of pairs of random variables, where for every $n$, $U^n$ and $\tilde{S}^n$ take values over the set $\mathcal{U}^n \times \tilde{\mathcal{S}}^n$. The spectral-sup mutual information rate between $\{U^n\}$ and $\{\tilde{S}^n\}$ is defined as follows:

$$\overline{I}[U;\tilde{S}] := \inf\left\{a : \lim_{n\to\infty} \Pr\left\{\frac{1}{n}\log\frac{p_{U^n\tilde{S}^n}(U^n,\tilde{S}^n)}{p_{U^n}(U^n)\,p_{\tilde{S}^n}(\tilde{S}^n)} > a + \gamma\right\} = 0\right\}, \tag{1}$$

where $\gamma > 0$ is arbitrary and the probability above is calculated with respect to $p_{U^n\tilde{S}^n}$.
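Definition 1 can be visualised numerically. The sketch below uses an assumed i.i.d. toy model (our own, chosen because in the i.i.d. case the spectrum concentrates, so the spectral-sup rate collapses to the ordinary mutual information): it samples the normalized log-likelihood ratio appearing in (1) for a binary symmetric pair and checks concentration around $I(S;\tilde{S}) = 1 - h(0.1)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed i.i.d. toy model: S ~ Bernoulli(1/2), S~ is S flipped w.p. 0.1.
p, n, trials = 0.1, 2000, 500
s = rng.integers(0, 2, size=(trials, n))
s_t = s ^ (rng.random((trials, n)) < p)

# Per-symbol log-likelihood ratio log2[p(s,s~)/(p(s)p(s~))]: the marginals
# are uniform, and p(s,s~) = (1-p)/2 if s = s~ and p/2 otherwise.
llr = np.where(s == s_t, np.log2(2 * (1 - p)), np.log2(2 * p))
spectrum = llr.mean(axis=1)        # one sample of (1/n) log-ratio per trial

h = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
I_true = 1 - h                     # I(S; S~) for this model, ~0.531 bits
print(spectrum.mean(), I_true)
```

The empirical spectrum clusters tightly around $I_{\text{true}}$, so both the spectral-sup and spectral-inf rates equal the mutual information here; the definitions only start to differ for non-ergodic sequences.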

###### Definition 2.

Let $\{\rho^n\}_{n=1}^{\infty}$ and $\{\sigma^n\}_{n=1}^{\infty}$ be sequences of quantum states, where for every $n$, $\rho^n$ and $\sigma^n$ are density matrices acting on the Hilbert space $\mathcal{H}^{\otimes n}$. The spectral-inf mutual information rate between $\{\rho^n\}$ and $\{\sigma^n\}$ is defined as follows:

$$\underline{I}[\rho;\sigma] := \sup\left\{a : \lim_{n\to\infty} \mathrm{Tr}\left[\left\{\rho^n \succeq 2^{n(a-\gamma)}\sigma^n\right\}\rho^n\right] = 1\right\}, \tag{2}$$

where $\gamma > 0$ is arbitrary and $\{\rho^n \succeq 2^{n(a-\gamma)}\sigma^n\}$ represents the projection operator onto the non-negative eigenspace of the operator $\rho^n - 2^{n(a-\gamma)}\sigma^n$.
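The projector $\{\rho \succeq c\,\sigma\}$ can be computed from an eigendecomposition of $\rho - c\,\sigma$. A minimal single-qubit sketch, with toy diagonal states of our own choosing:

```python
import numpy as np

def positive_projector(rho, c_sigma):
    """Projector onto the non-negative eigenspace of rho - c_sigma."""
    w, v = np.linalg.eigh(rho - c_sigma)
    pos = v[:, w >= 0]               # eigenvectors with eigenvalue >= 0
    return pos @ pos.conj().T

# Toy states: rho - 0.8*sigma = diag(0.5, -0.3), so the projector keeps
# only the |0><0| direction.
rho = np.diag([0.9, 0.1])
sigma = np.diag([0.5, 0.5])
P = positive_projector(rho, 0.8 * sigma)
print(np.trace(P @ rho).real)        # Tr[{rho >= 0.8 sigma} rho] = 0.9
```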

###### Definition 3.

An $(n, M_n, M_{e,n}, \varepsilon_n)$ code for the Gel’fand-Pinsker channel with coded side information available at the encoder consists of

• a state encoding $f_{e,n} : \mathcal{S}^n \to [1:M_{e,n}]$;

• an encoding function (possibly randomized) $f_n : [1:M_n] \times [1:M_{e,n}] \to \mathcal{X}^n$;

• a decoding POVM $\{\beta(m)\}_{m=1}^{M_n}$ such that

$$\frac{1}{M_n}\sum_{m=1}^{M_n}\sum_{s^n} p_{S^n}(s^n)\,\mathrm{Tr}\left[\left(I - \beta(m)\right)\rho^{B^n}_{f_n(m,\,f_{e,n}(s^n)),\,s^n}\right] \leq \varepsilon_n.$$
###### Definition 4.

A rate pair $(R, R_S)$ is achievable if there exists a sequence of $(n, M_n, M_{e,n}, \varepsilon_n)$ codes such that

$$\liminf_{n\to\infty}\frac{1}{n}\log M_n > R, \qquad \limsup_{n\to\infty}\varepsilon_n < \varepsilon, \qquad \limsup_{n\to\infty}\frac{1}{n}\log M_{e,n} \leq R_S.$$

The set of all achievable rate pairs is known as the capacity region.

## 3 Proof of Theorem 3

### 3.1 Achievability

Let,

$$\begin{aligned} \rho^{B^n}_{u^n,\tilde{s}^n} &= \sum_{s^n,x^n} p_{S^n|\tilde{S}^n}(s^n|\tilde{s}^n)\, p_{X^n|U^n\tilde{S}^n}(x^n|u^n,\tilde{s}^n)\, \rho^{B^n}_{x^n,s^n} && (3)\\ \Theta^{U^nB^n} &= \mathrm{Tr}_{S^n\tilde{S}^nX^n}\left[\Theta^{S^n\tilde{S}^nU^nX^nB^n}\right] && (4)\\ \Theta^{U^n} &= \mathrm{Tr}_{B^n}\left[\Theta^{U^nB^n}\right] && (5)\\ \Theta^{B^n} &= \mathrm{Tr}_{U^n}\left[\Theta^{U^nB^n}\right]. && (6) \end{aligned}$$

Let,

$$\Pi^{U^nB^n} := \left\{\Theta^{U^nB^n} \succeq 2^{n(\underline{I}[U;B]-\gamma)}\,\Theta^{U^n}\otimes\Theta^{B^n}\right\}, \tag{7}$$

where $\underline{I}[U;B]$ is calculated with respect to the sequences of states $\{\Theta^{U^nB^n}\}_{n=1}^{\infty}$, $\{\Theta^{U^n}\}_{n=1}^{\infty}$ and $\{\Theta^{B^n}\}_{n=1}^{\infty}$. Further, for every $u^n$ let

$$\Lambda_{u^n} := \mathrm{Tr}_{U^n}\left[\Pi^{U^nB^n}\left(|u^n\rangle\langle u^n| \otimes I\right)\right]. \tag{8}$$

Fix $\varepsilon \in (0,1)$ and $\gamma > 0$. Define the following sets:

$$\begin{aligned} T_n(p_{S^n\tilde{S}^n}) &:= \left\{(s^n,\tilde{s}^n) : \frac{1}{n}\log\frac{p_{S^n\tilde{S}^n}(s^n,\tilde{s}^n)}{p_{S^n}(s^n)\,p_{\tilde{S}^n}(\tilde{s}^n)} \leq \overline{I}[S;\tilde{S}] + \gamma\right\};\\ T_n(p_{U^n\tilde{S}^n}) &:= \left\{(u^n,\tilde{s}^n) : \frac{1}{n}\log\frac{p_{U^n\tilde{S}^n}(u^n,\tilde{s}^n)}{p_{U^n}(u^n)\,p_{\tilde{S}^n}(\tilde{s}^n)} \leq \overline{I}[U;\tilde{S}] + \gamma\right\}. \end{aligned}$$

Furthermore, let $g_1(\tilde{s}^n)$ and $g_2(\tilde{s}^n)$ be defined as follows:

$$g_1(\tilde{s}^n) = \sum_{u^n : (u^n,\tilde{s}^n)\notin T_n(p_{U^n\tilde{S}^n})} p_{U^n|\tilde{S}^n}(u^n|\tilde{s}^n); \tag{9}$$

$$g_2(\tilde{s}^n) = \sum_{u^n : \mathrm{Tr}[\Lambda_{u^n}\rho^{B^n}_{u^n,\tilde{s}^n}] \leq 1-\sqrt{\varepsilon}} p_{U^n|\tilde{S}^n}(u^n|\tilde{s}^n). \tag{10}$$

In what follows we will use the notation $[1:K]$ to represent the set $\{1, 2, \ldots, K\}$.

The codebook: Let $R$ and $R_S$ be as in the statement of the theorem. Let $r > \overline{I}[U;\tilde{S}] + \gamma$, so that $R + r < \underline{I}[U;B]$. Let $U^n(1), \ldots, U^n(2^{n(R+r)})$ be drawn independently according to the distribution $p_{U^n}$. We associate these samples with a row vector having $2^{n(R+r)}$ entries. We then partition this row vector into $2^{nR}$ classes, each containing $2^{nr}$ elements. Every message $m \in [1:2^{nR}]$ is uniquely assigned a class. We will denote the class corresponding to the message $m$ by $\mathcal{C}^{(A)}_n(m)$.
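The partition of codeword indices into message classes is a standard binning construction. A minimal sketch, with small illustrative (non-asymptotic) parameters of our own choosing:

```python
# Binning sketch: 2^{n(R+r)} codeword indices are split into 2^{nR} message
# classes of 2^{nr} indices each; message m owns the contiguous class C_A(m).
n, R, r = 4, 0.5, 0.25            # toy values: 8 indices, 4 classes of 2
total = 2 ** round(n * (R + r))
num_classes = 2 ** round(n * R)
per_class = total // num_classes

def class_of(index):
    """Message class that codeword `index` belongs to."""
    return index // per_class

C_A = {m: list(range(m * per_class, (m + 1) * per_class))
       for m in range(num_classes)}
print(C_A[0], class_of(3))        # [0, 1] 1
```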

Further, let $\tilde{S}^n[1], \ldots, \tilde{S}^n[2^{nR_S}]$ be drawn independently according to the distribution $p_{\tilde{S}^n}$. We will denote this collection of sequences by $\mathcal{C}^{(C)}_n$. The sequences present in $\mathcal{C}^{(C)}_n$ are made known to Alice as well.

Charlie’s encoding strategy: For each $k \in [1:2^{nR_S}]$, let $Z(k)$ be independently and uniformly distributed over $[0,1]$. For a given realisation $s^n$ of the state sequence $S^n$, let $\zeta(k)$ be an indicator random variable defined as follows:

$$\zeta(k) = \begin{cases} 1 & \text{if } Z(k) \leq \dfrac{p_{\tilde{S}^nS^n}(\tilde{s}^n[k], s^n)}{2^{n(\overline{I}[S;\tilde{S}]+\gamma)}\,p_{\tilde{S}^n}(\tilde{s}^n[k])\,p_{S^n}(s^n)};\\ 0 & \text{otherwise}. \end{cases} \tag{11}$$

Further, for a given realisation $s^n$ of the state sequence $S^n$, let $I(k)$ be an indicator random variable defined as follows:

$$I(k) = \begin{cases} 1 & \text{if } \zeta(k) = 1,\ g_1(\tilde{S}^n[k]) < \sqrt{\varepsilon} \text{ and } g_2(\tilde{S}^n[k]) < \varepsilon^{\frac{1}{4}};\\ 0 & \text{otherwise}, \end{cases} \tag{12}$$

where $g_1$ and $g_2$ are defined in (9) and (10). Charlie, on observing the state sequence $s^n$, finds an index $k$ such that $I(k) = 1$. If there is more than one such index then $k$ is set as the smallest one among them. If there is no such index then $k$ is set to $1$. Charlie then sends this index $k$ to Alice.

Alice’s encoding strategy: For each pair $(k, \ell)$, let $\eta(k,\ell)$ be independently and uniformly distributed over $[0,1]$, and let $g(k,\ell)$ be defined as follows:

$$g(k,\ell) := \mathrm{Tr}\left[\Lambda_{u^n[\ell]}\,\rho^{B^n}_{u^n[\ell],\tilde{s}^n[k]}\right], \tag{13}$$

where $\rho^{B^n}_{u^n[\ell],\tilde{s}^n[k]}$ is defined in (3) and $\Lambda_{u^n[\ell]}$ is defined in (8). Let $I(k,\ell)$ be an indicator random variable such that

$$I(k,\ell) = \begin{cases} 1 & \text{if } \eta(k,\ell) \leq \dfrac{p_{\tilde{S}^nU^n}(\tilde{s}^n[k], u^n[\ell])}{2^{n(\overline{I}[U;\tilde{S}]+\gamma)}\,p_{\tilde{S}^n}(\tilde{s}^n[k])\,p_{U^n}(u^n[\ell])};\\ 0 & \text{otherwise}. \end{cases} \tag{14}$$

Further, let $J(k,\ell)$ be an indicator random variable defined as follows:

$$J(k,\ell) = \begin{cases} 1 & \text{if } I(k,\ell) = 1 \text{ and } g(k,\ell) > 1 - \sqrt{\varepsilon};\\ 0 & \text{otherwise}. \end{cases} \tag{15}$$

To send a message $m$, on receiving the index $k$ from Charlie, Alice finds an index $\ell \in \mathcal{C}^{(A)}_n(m)$ such that $J(k,\ell) = 1$. If there is more than one such index then $\ell$ is set as the smallest one among them. If there is no such index then $\ell$ is set to an arbitrary element of $\mathcal{C}^{(A)}_n(m)$. Alice then randomly generates $X^n \sim p_{X^n|U^n\tilde{S}^n}(\cdot \mid u^n[\ell], \tilde{s}^n[k])$ and transmits it over the classical-quantum channel over $n$ channel uses. In the discussions below we will use the notation $\ell(m)$ to highlight the dependence of $\ell$ on $m$. A similar encoding technique was also used by Radhakrishnan, Sen and Warsi in [15].

Bob’s decoding strategy: For each $\ell \in [1:2^{n(R+r)}]$ we have the operators $\Lambda_{u^n(\ell)}$ as defined in (8). Bob will normalize these operators to obtain a POVM. The POVM element corresponding to $\ell$ will be

$$\beta_n(\ell) := \left(\sum_{\ell' \in [1:2^{n(R+r)}]} \Lambda_{u^n(\ell')}\right)^{-\frac{1}{2}} \Lambda_{u^n(\ell)} \left(\sum_{\ell' \in [1:2^{n(R+r)}]} \Lambda_{u^n(\ell')}\right)^{-\frac{1}{2}}. \tag{16}$$

Bob, on receiving the channel output, measures it using these operators. If the measurement outcome is $\tilde{\ell}$ then he outputs $\tilde{m}$ if $\tilde{\ell} \in \mathcal{C}^{(A)}_n(\tilde{m})$. Similar decoding POVM elements were also used by Wang and Renner in [19].
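The normalization in (16) is the square-root ("pretty-good") construction: given any finite collection of positive operators, sandwiching each between inverse square roots of their sum produces a valid POVM on the joint support. A small numerical sketch with two toy rank-one operators of our own choosing:

```python
import numpy as np

def inv_sqrt(M, eps=1e-12):
    """Inverse square root of a PSD matrix, restricted to its support."""
    w, v = np.linalg.eigh(M)
    w_is = np.where(w > eps, 1.0 / np.sqrt(np.clip(w, eps, None)), 0.0)
    return (v * w_is) @ v.conj().T

def pretty_good_povm(lambdas):
    """beta(l) = S^{-1/2} Lambda_l S^{-1/2} with S = sum_l Lambda_l."""
    Sm = inv_sqrt(sum(lambdas))
    return [Sm @ L @ Sm for L in lambdas]

# Two overlapping rank-one operators (toy example, full-rank sum):
v0 = np.array([1.0, 0.0])
v1 = np.array([1.0, 1.0]) / np.sqrt(2)
betas = pretty_good_povm([np.outer(v0, v0), np.outer(v1, v1)])
print(np.allclose(sum(betas), np.eye(2)))   # True: elements sum to identity
```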

Probability of error analysis: Let a message $m$ be transmitted by Alice using the protocol discussed above and suppose it is decoded as $\tilde{m}$ by Bob. We will now show that the probability $\Pr\{\tilde{m} \neq m\}$, averaged over the random choice of the codebook, the state sequence and the local randomness of the encoders, is arbitrarily close to zero. By the symmetry of the code construction it is enough to prove the claim for $m = 1$. There are the following sources of error:

1. Charlie, on observing the state sequence $s^n$, does not find a suitable $k$ such that $I(k) = 1$.

2. Alice, on receiving the index $k$ from Charlie, is not able to find a suitable $\ell \in \mathcal{C}^{(A)}_n(1)$ such that $J(k,\ell) = 1$.

3. Charlie finds a suitable $k$ and Alice finds a suitable $\ell$, but Bob’s measurement is not able to determine the index $\ell$ correctly.

Let $k^\star$ and $\ell^\star$ be the indices chosen by Charlie and Alice. Let us now upper bound the probability of error while decoding the transmitted message. Towards this we first define the following events:

$$\begin{aligned} E_1 &:= \left\{\text{for all } k \in [1:2^{nR_S}] : I(k) = 0\right\};\\ E_2 &:= \left\{\text{for all } \ell \in \mathcal{C}^{(A)}_n(1) : J(k^\star, \ell) = 0\right\}. \end{aligned}$$

We now have the following bound on the error probability:

$$\begin{aligned} \Pr\{\tilde{m} \neq 1\} &\leq \Pr\{\tilde{\ell} \neq \ell^\star\}\\ &\leq \Pr\{E_1\} + \Pr\{E_2\} + \Pr\{E_1^c \cap E_2^c,\ \tilde{\ell} \neq \ell^\star\}\\ &\leq 2\Pr\{E_1\} + \Pr\{E_1^c \cap E_2\} + \Pr\{E_1^c \cap E_2^c,\ \tilde{\ell} \neq \ell^\star\}, \end{aligned} \tag{17}$$

where the first inequality follows from the setting of the protocol discussed above and the remaining inequalities up to (17) follow from the union bound. In what follows we will show that for $n$ large enough we have

$$2\Pr\{E_1\} + \Pr\{E_1^c \cap E_2\} + \Pr\{E_1^c \cap E_2^c,\ \tilde{\ell} \neq \ell^\star\} \leq 6\varepsilon + 3\sqrt{\varepsilon} + 3\varepsilon^{\frac{1}{4}} + 2\sqrt{\varepsilon}\left(1 - \varepsilon - \sqrt{\varepsilon} - \varepsilon^{\frac{1}{4}}\right) + 3\exp\left(-2^{n\gamma}\right),$$

where $\varepsilon$ is arbitrarily close to zero such that $\varepsilon + \sqrt{\varepsilon} + \varepsilon^{\frac{1}{4}} < 1$.

Consider $\Pr\{E_1\}$:

$$\begin{aligned} \Pr\{E_1\} &\overset{(a)}{\leq} \sum_{s^n\in\mathcal{S}^n} p_{S^n}(s^n)\left(1 - \sum_{\substack{\tilde{s}^n : (\tilde{s}^n,s^n)\in T_n(p_{S^n\tilde{S}^n})\\ g_1(\tilde{s}^n)<\sqrt{\varepsilon},\ g_2(\tilde{s}^n)<\varepsilon^{1/4}}} p_{\tilde{S}^n}(\tilde{s}^n)\,\Pr\{I(k)=1 \mid S^n = s^n, \tilde{S}^n[k] = \tilde{s}^n\}\right)^{2^{nR_S}}\\ &\overset{(b)}{\leq} \sum_{s^n\in\mathcal{S}^n} p_{S^n}(s^n)\exp\left(-2^{n\left(R_S - (\overline{I}[S;\tilde{S}]+\gamma)\right)}\sum_{\substack{\tilde{s}^n : (\tilde{s}^n,s^n)\in T_n(p_{S^n\tilde{S}^n})\\ g_1(\tilde{s}^n)<\sqrt{\varepsilon},\ g_2(\tilde{s}^n)<\varepsilon^{1/4}}} p_{\tilde{S}^n|S^n}(\tilde{s}^n|s^n)\right)\\ &\overset{(c)}{=} \sum_{s^n\in\mathcal{S}^n} p_{S^n}(s^n)\exp\left(-2^{n\gamma}\sum_{\substack{\tilde{s}^n : (\tilde{s}^n,s^n)\in T_n(p_{S^n\tilde{S}^n})\\ g_1(\tilde{s}^n)<\sqrt{\varepsilon},\ g_2(\tilde{s}^n)<\varepsilon^{1/4}}} p_{\tilde{S}^n|S^n}(\tilde{s}^n|s^n)\right)\\ &\overset{(d)}{\leq} \Pr\{T_n^c(p_{S^n\tilde{S}^n})\} + \Pr\{g_1(\tilde{S}^n)\geq\sqrt{\varepsilon}\} + \Pr\{g_2(\tilde{S}^n)\geq\varepsilon^{\frac{1}{4}}\} + \exp\left(-2^{n\gamma}\right)\\ &\overset{(e)}{\leq} \varepsilon + \exp\left(-2^{n\gamma}\right) + \Pr\{g_1(\tilde{S}^n)\geq\sqrt{\varepsilon}\} + \Pr\{g_2(\tilde{S}^n)\geq\varepsilon^{\frac{1}{4}}\}, \end{aligned} \tag{18}$$

where $(a)$ follows because $Z(1), \ldots, Z(2^{nR_S})$ are independent and identically distributed and $\tilde{S}^n[1], \ldots, \tilde{S}^n[2^{nR_S}]$ are drawn independently according to the distribution $p_{\tilde{S}^n}$; $(b)$ follows from the definition of $\zeta(k)$ and the inequality $(1-x)^K \leq \exp(-Kx)$; $(c)$ follows by assuming, without loss of generality, that $R_S = \overline{I}[S;\tilde{S}] + 2\gamma$; $(d)$ follows from the inequality $\exp(-xy) \leq 1 - y + \exp(-x)$ (for $x \geq 0$, $y \in [0,1]$) and the union bound; and $(e)$ follows because $n$ is large enough such that $\Pr\{T_n^c(p_{S^n\tilde{S}^n})\} \leq \varepsilon$. Let us now bound each of the last two terms on the R.H.S. of (18) as follows:
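The two elementary bounds invoked in derivations of this kind, $(1-x)^K \leq \exp(-Kx)$ and $\exp(-xy) \leq 1 - y + \exp(-x)$ for $x \geq 0$, $y \in [0,1]$ (stated here in the forms we assume), can be spot-checked numerically:

```python
import numpy as np

rng = np.random.default_rng(1)

# Bound 1: (1 - x)^K <= exp(-K x) for x in [0, 1] and integer K >= 1.
x = rng.random(100000)
K = rng.integers(1, 100, 100000)
gap1 = np.exp(-K * x) - (1 - x) ** K          # should be >= 0 everywhere

# Bound 2: exp(-x*y) <= 1 - y + exp(-x) for x >= 0 and y in [0, 1].
a = 10 * rng.random(100000)
b = rng.random(100000)
gap2 = (1 - b + np.exp(-a)) - np.exp(-a * b)  # should be >= 0 everywhere

print(gap1.min() >= -1e-12, gap2.min() >= -1e-12)
```

The second bound follows from convexity of $e^{-xy}$ in $y$: the function lies below its chord between $y=0$ and $y=1$.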

Consider $\Pr\{g_1(\tilde{S}^n) \geq \sqrt{\varepsilon}\}$:

$$\begin{aligned} \Pr\{g_1(\tilde{S}^n)\geq\sqrt{\varepsilon}\} &\overset{(a)}{\leq} \frac{\mathbb{E}[g_1(\tilde{S}^n)]}{\sqrt{\varepsilon}}\\ &\overset{(b)}{=} \frac{\sum_{\tilde{s}^n}\sum_{u^n : (u^n,\tilde{s}^n)\notin T_n(p_{U^n\tilde{S}^n})} p_{U^n\tilde{S}^n}(u^n,\tilde{s}^n)}{\sqrt{\varepsilon}}\\ &= \frac{\Pr\{(U^n,\tilde{S}^n)\notin T_n(p_{U^n\tilde{S}^n})\}}{\sqrt{\varepsilon}}\\ &\overset{(c)}{\leq} \frac{\varepsilon}{\sqrt{\varepsilon}} = \sqrt{\varepsilon}, \end{aligned} \tag{19}$$

where $(a)$ follows from the Markov inequality; $(b)$ follows from the definition of $g_1$ and taking the expectation over the random variable $\tilde{S}^n$; and $(c)$ follows under the assumption that $n$ is large enough such that $\Pr\{(U^n,\tilde{S}^n)\notin T_n(p_{U^n\tilde{S}^n})\} \leq \varepsilon$.

Consider $\Pr\{g_2(\tilde{S}^n) \geq \varepsilon^{\frac{1}{4}}\}$:

$$\begin{aligned} \Pr\{g_2(\tilde{S}^n)\geq\varepsilon^{\frac{1}{4}}\} &\overset{(a)}{\leq} \frac{\mathbb{E}[g_2(\tilde{S}^n)]}{\varepsilon^{\frac{1}{4}}}\\ &\overset{(b)}{=} \frac{\sum_{\tilde{s}^n}\sum_{u^n : \mathrm{Tr}[\Lambda_{u^n}\rho^{B^n}_{u^n,\tilde{s}^n}]\leq 1-\sqrt{\varepsilon}} p_{U^n\tilde{S}^n}(u^n,\tilde{s}^n)}{\varepsilon^{\frac{1}{4}}}\\ &= \frac{\Pr\left\{\mathrm{Tr}\left[\Lambda_{U^n}\rho^{B^n}_{U^n,\tilde{S}^n}\right]\leq 1-\sqrt{\varepsilon}\right\}}{\varepsilon^{\frac{1}{4}}}\\ &\overset{(c)}{\leq} \frac{\sqrt{\varepsilon}}{\varepsilon^{\frac{1}{4}}} = \varepsilon^{\frac{1}{4}}, \end{aligned} \tag{20}$$

where $(a)$ follows from the Markov inequality; $(b)$ follows from the definition of $g_2$ and by taking the expectation over the random variable $\tilde{S}^n$; and $(c)$ follows because of the following set of inequalities:

$$\begin{aligned} \Pr\left\{\mathrm{Tr}\left[\Lambda_{U^n}\rho^{B^n}_{U^n,\tilde{S}^n}\right]\leq 1-\sqrt{\varepsilon}\right\} &\overset{(a)}{\leq} \frac{1 - \mathbb{E}\,\mathrm{Tr}\left[\Lambda_{U^n}\rho^{B^n}_{U^n,\tilde{S}^n}\right]}{\sqrt{\varepsilon}}\\ &\overset{(b)}{=} \frac{1 - \mathbb{E}\,\mathrm{Tr}\left[\mathrm{Tr}_{U^n}\left[\Pi^{U^nB^n}\left(|U^n\rangle\langle U^n|\otimes I\right)\right]\rho^{B^n}_{U^n,\tilde{S}^n}\right]}{\sqrt{\varepsilon}}\\ &= \frac{1 - \mathrm{Tr}\left[\mathrm{Tr}_{U^n}\left[\Pi^{U^nB^n}\sum_{u^n}p_{U^n}(u^n)\left(|u^n\rangle\langle u^n|\otimes\sum_{\tilde{s}^n}p_{\tilde{S}^n|U^n}(\tilde{s}^n|u^n)\,\rho^{B^n}_{u^n,\tilde{s}^n}\right)\right]\right]}{\sqrt{\varepsilon}}\\ &\overset{(c)}{=} \frac{1 - \mathrm{Tr}\left[\Pi^{U^nB^n}\Theta^{U^nB^n}\right]}{\sqrt{\varepsilon}}\\ &\overset{(d)}{\leq} \sqrt{\varepsilon}, \end{aligned}$$

where $(a)$ follows from the Markov inequality; $(b)$ follows from the definition of $\Lambda_{u^n}$ mentioned in (8); $(c)$ follows from the definition of $\Theta^{U^nB^n}$ mentioned in (4); and $(d)$ follows under the assumption that $n$ is large enough such that $\mathrm{Tr}\left[\Pi^{U^nB^n}\Theta^{U^nB^n}\right] \geq 1 - \varepsilon$. Thus, it now follows from (18), (19) and (20) that

$$\Pr\{E_1\} \leq \varepsilon + \sqrt{\varepsilon} + \varepsilon^{\frac{1}{4}} + \exp\left(-2^{n\gamma}\right). \tag{21}$$

Consider $\Pr\{E_1^c \cap E_2\}$:

$$\begin{aligned} \Pr\{E_1^c\cap E_2\} &= \mathbb{E}_{\mathcal{C}^{(C)}_nS^n}\,\mathbb{I}(E_1^c)\Pr\left\{\forall\,\ell\in\mathcal{C}^{(A)}_n(1) : J(k^\star,\ell)=0\right\}\\ &\overset{(a)}{=} \mathbb{E}_{\mathcal{C}^{(C)}_nS^n}\,\mathbb{I}(E_1^c)\left(1 - \Pr\{J(k^\star,\ell)=1\}\right)^{2^{nr}} \end{aligned}$$