Message Transmission over Classical Quantum Channels with a Jammer with Side Information: Message Transmission Capacity and Resources
Abstract
In this paper we propose a new model for arbitrarily varying classical-quantum channels. In this model a jammer has side information. We consider two scenarios. In the first scenario the jammer knows the channel input, while in the second scenario the jammer knows both the channel input and the message. The transmitter and receiver share a secret random key with a vanishing key rate. We determine the capacity for both average and maximum error criteria for both scenarios. We also establish the strong converse. We show that all these corresponding capacities are equal, which means that additionally revealing the message to the jammer does not change the capacity.
I Introduction
Quantum information theory has developed into a very active field of research in recent years, and its study promises numerous potential advantages. Communication over quantum channels differs significantly from communication over classical channels. Quantum communication allows us to exploit new possibilities for communication applications, to name a few: message transmission, secret message transmission, entanglement transmission, and entanglement generation. Secure communication over quantum channels is one of the first practical applications of quantum communication. In such systems one usually considers active jamming and passive eavesdropping attacks.
Communication models including a jammer who tries to disturb the legal parties’ communication have received a lot of attention in recent years. These publications concentrated on the model of message transmission over an arbitrarily varying channel, where a third channel user, the jammer, may change his input in every channel use. This model captures all possible jamming attacks; the jammer is not restricted to a repetitive probabilistic strategy. The arbitrarily varying channel was introduced in [9]. In the model of message transmission over arbitrarily varying channels it is understood that the sender and the receiver have to select their coding scheme first. In the conventional model it is assumed that this coding scheme is known by the jammer, and he may choose the most advantageous jamming strategy depending on his knowledge, but the jammer has neither knowledge about the transmitted codeword nor knowledge about the message. Ahlswede showed in [2] the surprising result that either the deterministic capacity of an arbitrarily varying channel is zero or it is equal to its random correlated capacity (Ahlswede dichotomy). For this dichotomy it is essential that the average error criterion is used. After that discovery, it remained an open question exactly when the deterministic capacity is nonzero. In [17] Ericson gave a sufficient condition for that, and in [16] Csiszár and Narayan proved that this condition is also necessary. The Ahlswede dichotomy demonstrates the importance of resources (shared randomness) in a very clear form. It is required that both sender and receiver have access to a perfect copy of the outcome of a random experiment, and thus we should assume an additional perfect channel.
The legal channel users’ knowledge about the shared randomness is very helpful for message transmission through an arbitrarily varying channel (random correlated capacity), where we assume that the resource is only known by the legal channel users, since otherwise it will be completely useless (cf. [12]).
In this work we consider classical-quantum channels, i.e., the sender’s inputs are classical symbols and the receiver’s outputs are quantum systems. The capacity of classical-quantum channels under the average error criterion has been determined in [19], [23], and [24]. The capacity of arbitrarily varying classical-quantum channels has been delivered in [5]. An alternative proof of the result of [5] and a proof of the strong converse have been given in [7]. In [4] the Ahlswede dichotomy for arbitrarily varying classical-quantum channels was established, and a sufficient and necessary condition for zero deterministic capacity has been given. In [13] a simplification of this condition was delivered. See also [20] and [21] for a classical-quantum channel model with a benevolent third channel user instead of a jammer. These results are basic tools for secure communication over arbitrarily varying wiretap channels. An arbitrarily varying wiretap channel is a channel with both a jammer and an eavesdropper. Classical arbitrarily varying wiretap channels have been studied extensively in the context of classical information theory. The secrecy capacity of arbitrarily varying wiretap classical-quantum channels has been determined in [12].
As already mentioned, the message transmission capacity of an arbitrarily varying channel depends on the demanded error criterion. The deterministic capacities of classical arbitrarily varying channels under the maximal error criterion and under the average error criterion are, in general, not equal. The deterministic capacity formula of classical arbitrarily varying channels under the average error criterion is already well studied in the context of classical information theory. The deterministic capacity formula of classical arbitrarily varying channels under the maximal error criterion is still an open problem. It has been shown by Ahlswede in [1] that the capacity under the maximal error criterion of certain arbitrarily varying channels can be equal to the zero-error capacity of related discrete memoryless channels. Furthermore, the random correlated capacities of arbitrarily varying quantum-to-quantum channels under the maximal error criterion and under the average error criterion are equal. Interestingly, [13] shows that the deterministic capacities of arbitrarily varying quantum-to-quantum channels under the maximal error criterion and under the average error criterion are equal, since randomness for encoding is available for quantum-to-quantum channels, i.e., quantum encoding is very powerful. By the above facts there is no Ahlswede dichotomy for arbitrarily varying channels under the maximal error criterion: it may occur that the deterministic capacity of a classical arbitrarily varying channel under the maximal error criterion is not zero but, on the other hand, unequal to its random correlated capacity. We will provide an example in Section III.
In all the above-mentioned works it is assumed that the jammer knows the coding scheme, but has neither side information about the codeword nor side information about the message of the legal transmitters. In many applications, especially for secure communications, it is too optimistic to assume this. Thus in this paper we consider two scenarios where the jammer has side information: in the first one the jammer knows both the coding scheme and the input codeword; in the second one the jammer additionally knows the message (cf. Figure 1 and 2). The jammer can make use of this knowledge in each scenario to improve his attacking strategy. We require that information transmission can be guaranteed even in the worst case, when the jammer chooses the most advantageous attacking strategy according to his knowledge. For classical arbitrarily varying channels this was first considered by [22]. In this paper we extend this result to arbitrarily varying classical-quantum channels, where we use techniques different from those used in [22] (cf. Section IV). In this work we consider for both scenarios the random correlated capacities under the average and maximal error criteria. Detailed descriptions of both scenarios are given in Section II. In Section III the message transmission capacities for both scenarios and both error criteria are completely characterized. In Section IV, Section V, and Section VI we deliver proofs of the capacity results for both scenarios and both error criteria. A vanishing key rate is sufficient for our codes, since the resource we use here is only of polynomial size in the code length (cf. Remark 2, and also [13] and [11] for a discussion of the difference between various forms of shared randomness).
II Problem Formulation
A: Basic notations
Throughout the paper random variables will be denoted by capital letters e. g., and their realizations (or values) and domains (or alphabets) will be denoted by corresponding lower case letters e. g., and script letters e.g., , respectively. Random sequences will be denoted by capital boldface letters, whose lengths are understood from the context, e. g., and , and deterministic sequences are written as lower case boldface letters e. g., .
is the distribution of random variable . Joint distributions and conditional distributions of random variables and will be written as , etc and etc, respectively and and are their product distributions i. e., , and . Moreover and are sets of (strongly) typical sequences of the type , joint type and conditional type , respectively. The cardinality of a set will be denoted by . For a positive integer , . “ is a classical channel, or a conditional probability distribution, from set to set ” is abbreviated to “”. “Random variables and form a Markov chain” is abbreviated to “”. will stand for the operator of mathematical expectation.
Throughout the paper dimensions of all Hilbert spaces are finite, and the identity operator in a Hilbert space is denoted by .
Throughout the paper all logarithms are taken to base 2. For a discrete random variable on a finite set and a discrete random variable on a finite set , we denote the Shannon entropy of by and the mutual information between and by . Here is the joint probability distribution function of and , and and are the marginal probability distribution functions of and respectively.
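As a small numerical illustration of these two quantities, the following sketch computes the Shannon entropy and the mutual information in bits from a joint distribution. It assumes the standard identity I(X;Y) = H(X) + H(Y) - H(X,Y); the function names and the example distribution are hypothetical and not taken from the paper.

```python
from math import log2


def shannon_entropy(dist):
    """Shannon entropy in bits of a probability vector (zero entries are skipped)."""
    return -sum(p * log2(p) for p in dist if p > 0)


def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint distribution joint[x][y]."""
    p_x = [sum(row) for row in joint]           # marginal distribution of X
    p_y = [sum(col) for col in zip(*joint)]     # marginal distribution of Y
    p_xy = [p for row in joint for p in row]    # flattened joint distribution
    return shannon_entropy(p_x) + shannon_entropy(p_y) - shannon_entropy(p_xy)


# Hypothetical example: uniform binary input observed through a
# binary symmetric relation with crossover probability 0.1.
joint_bsc = [[0.45, 0.05],
             [0.05, 0.45]]
print(mutual_information(joint_bsc))
```

Independent variables give a mutual information of zero, and a perfectly correlated uniform bit gives one bit.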
Let and be quantum systems. We denote the Hilbert space of and by and , respectively. Let be a bipartite quantum state in . We denote the partial trace over by
where is an orthonormal basis of . We denote the conditional entropy by
Here .
For a finite-dimensional complex Hilbert space , we denote the (convex) set of density operators on by
where is the set of linear operators on , and is the null matrix on . Note that any operator in is bounded.
For finite-dimensional complex Hilbert spaces and a quantum channel : , is represented by a completely positive trace-preserving map which accepts input quantum states in and produces output quantum states in .
B: Code definitions
If the sender wants to transmit a classical message of a finite set to the receiver using a quantum channel , his encoding procedure will include a classical-to-quantum encoder to prepare a quantum message state suitable as an input for the channel. If the sender’s encoding is restricted to transmit an indexed finite set of quantum states , then we can consider the choice of the signal quantum states as a component of the channel. Thus, we obtain a channel with classical inputs and quantum outputs, which we call a classical-quantum channel. This is a map : , which is represented by the set of possible output quantum states , meaning that each classical input of leads to a distinct quantum output . In view of this, we have the following definition.
Definition 1
Let be a finite-dimensional complex Hilbert space. A classical-quantum channel is a mapping , specified by a set of quantum states , indexed by “input letters” in a finite set . and are called input alphabet and output space respectively. We define the th extension of the classical-quantum channel as follows. The channel outputs a quantum state , in the th tensor power of the output space , when an input codeword of length is input into the channel.
Let : be a classical-quantum channel. For , the conditional entropy of the channel for with input distribution is denoted by
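Let be a classical-quantum channel, i.e., a set of quantum states labeled by elements of . For a probability distribution on , the Holevo quantity is defined as the von Neumann entropy of the average output state minus the average von Neumann entropy of the individual output states.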
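The Holevo quantity in its standard form, the entropy of the average output state minus the average entropy of the output states, can be evaluated numerically for small examples. The sketch below is restricted to qubit outputs (2x2 density matrices), where the eigenvalues have a closed form, so that only the standard library is needed. The function names and the example states are illustrative assumptions, not notation from the paper.

```python
from math import log2, sqrt


def qubit_entropy(rho):
    """Von Neumann entropy (bits) of a 2x2 density matrix given as nested lists."""
    a, d = rho[0][0].real, rho[1][1].real
    b = rho[0][1]
    mean = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    # Closed-form eigenvalues of a 2x2 Hermitian matrix.
    eigenvalues = (mean + radius, mean - radius)
    return -sum(l * log2(l) for l in eigenvalues if l > 1e-12)


def holevo_quantity(prior, states):
    """Entropy of the average state minus the average entropy of the states."""
    average_state = [
        [sum(p * s[i][j] for p, s in zip(prior, states)) for j in range(2)]
        for i in range(2)
    ]
    average_entropy = sum(p * qubit_entropy(s) for p, s in zip(prior, states))
    return qubit_entropy(average_state) - average_entropy


# Hypothetical example: the pure states |0> and |+> with equal probability.
ket0 = [[1.0, 0.0], [0.0, 0.0]]
ket_plus = [[0.5, 0.5], [0.5, 0.5]]
print(holevo_quantity([0.5, 0.5], [ket0, ket_plus]))
```

For orthogonal pure states with a uniform input distribution the quantity equals one bit, and for identical output states it is zero, which matches the intuition that indistinguishable outputs carry no information.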
For a probability distribution on a finite set and a positive constant , we denote the set of typical sequences by
where is the number of occurrences of the symbol in the sequence .
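A typicality test based on symbol counts can be sketched as follows. The exact threshold used in the paper is not recoverable from the text here, so this sketch assumes one common convention: every relative frequency must lie within delta of the corresponding probability, and symbols with probability zero must not occur. The function name and this threshold are assumptions.

```python
from collections import Counter


def is_typical(sequence, distribution, delta):
    """Check typicality of a sequence under an assumed convention:
    |N(a|x^n)/n - P(a)| <= delta for every symbol a, and no symbol
    outside the support of P appears in the sequence."""
    n = len(sequence)
    counts = Counter(sequence)  # N(a|x^n): occurrences of each symbol
    for symbol in counts:
        if distribution.get(symbol, 0.0) == 0.0:
            return False
    return all(
        abs(counts.get(symbol, 0) / n - prob) <= delta
        for symbol, prob in distribution.items()
    )


# Hypothetical example over a binary alphabet.
uniform = {"a": 0.5, "b": 0.5}
print(is_typical("aabb", uniform, 0.1))
print(is_typical("aaab", uniform, 0.1))
```

Other conventions scale delta by the alphabet size or by the square root of the block length; the counting step is the same in each case.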
Let be a finite-dimensional complex Hilbert space. Let and . We suppose has the spectral decomposition , its typical subspace is the subspace spanned by , where . The orthogonal projector onto the typical subspace is
Similarly, let be a finite set, and be a finite-dimensional complex Hilbert space. Let : be a classical-quantum channel. For , suppose has the spectral decomposition for a stochastic matrix . The conditional typical subspace of for a typical sequence is the subspace spanned by . Here is an indicator set that selects the indices in the sequence for which the th symbol is equal to . The subspace is often referred to as the conditional typical subspace of the state . The orthogonal projector onto it is defined as
The typical subspace has the following properties:
For and there are positive constants , , and , depending on and tending to zero when such that
(1) 
(2) 
(3) 
For there are positive constants , , and , depending on and tending to zero when such that
(4) 
(5) 
(6) 
For the classical-quantum channel and a probability distribution on we define a quantum state on . For we define an orthogonal subspace projector fulfilling (1), (2), and (3). Let . For there is a positive constant such that the following inequality holds:
(7) 
We give here a sketch of the proof. For a detailed proof please see [26].
Proof
(1) holds because . (2) holds because . (3) holds because for and a positive . (4), (5), and (6) can be obtained in a similar way. (7) follows from the permutation invariance of .
Definition 2
An arbitrarily varying classical-quantum channel (AVCQC) is specified by a set of classical-quantum channels with a common input alphabet and output space , which are indexed by elements in a finite set . Elements usually are called the states of the channel. outputs a quantum state
(8) 
if an input codeword is input into the channel, and the channel is governed by a state sequence , while the state varies from symbol to symbol in an arbitrary manner.
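For a memoryless AVCQC the output for a codeword and a state sequence is usually the tensor product of the single-letter output states, with the state allowed to change at every position. The sketch below builds this product with a plain Kronecker product on nested lists. The dictionary layout `channel[(state, letter)]`, the function names, and the toy channel are hypothetical and only serve the illustration.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def avcqc_output(channel, codeword, state_sequence):
    """Tensor product of the single-letter outputs channel[(s_i, x_i)]."""
    if len(codeword) != len(state_sequence):
        raise ValueError("codeword and state sequence must have equal length")
    output = [[1.0]]  # 1x1 identity, the neutral element of the Kronecker product
    for letter, state in zip(codeword, state_sequence):
        output = kron(output, channel[(state, letter)])
    return output


# Hypothetical toy channel: state 0 is noiseless, state 1 outputs the
# maximally mixed qubit state regardless of the input letter.
toy_channel = {
    (0, "a"): [[1.0, 0.0], [0.0, 0.0]],
    (0, "b"): [[0.0, 0.0], [0.0, 1.0]],
    (1, "a"): [[0.5, 0.0], [0.0, 0.5]],
    (1, "b"): [[0.5, 0.0], [0.0, 0.5]],
}
rho = avcqc_output(toy_channel, "aba", [0, 1, 0])
print(len(rho), sum(rho[i][i] for i in range(len(rho))))
```

The jammer's influence enters only through the choice of the state sequence, which is why the later error criteria take a maximum over state sequences.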
We assume that the channel state is under the control of the jammer. Without loss of generality we also assume that the jammer always chooses the most advantageous attacking strategy according to his knowledge.
Definition 3
A code of length for a classical quantum channel consists of its code book and decoding measurement , where the code book is a subset of input alphabet indexed by messages in the message set , and the decoding measurement is a quantum measurement in the output space that is, for all and .
Definition 4
A random correlated code for an AVCQC is a uniformly distributed random variable taking values in a set of codes with a common message set , where and are the code book and decoding measurement of the th code in the set respectively. is called the key size.
Remark 1
Usually a random correlated code is defined as any random variable taking values in a set of codes. Here we restrict ourselves to uniformly distributed random variables, since this is sufficient for our purpose (cf. [25]).
C: Capacity definitions and basic relations
One of the fundamental tasks of quantum Shannon theory is to characterize performance measures that capture the efficiency of quantum communication. Hence we introduce here the capacity for message transmission and simple relations between different quantities.
As already mentioned, this work concentrates on message transmission over classical-quantum channels with a jammer with additional side information. It is clear that this side information is encoded by the same coding scheme that the legal transmitters use for their communication, and which is known by the jammer by assumption. We assume that the jammer chooses the most advantageous attacking strategy according to his side information. We now distinguish two scenarios depending on the jammer’s knowledge (cf. Figure 1 and 2). We consider for each scenario both average and maximum error criteria.
Scenario 1
In this scenario the jammer knows the coding scheme and the input codeword, but not the message to be sent.
Definition 5
By assuming that the random message is uniformly distributed, we define the average probability of error by
(9) 
This can be also rewritten as
(10) 
The maximum probability of error is defined as
(11) 
Definition 6
A nonnegative number is an achievable rate for the arbitrarily varying classical-quantum channel under random correlated coding in scenario 1 under the average error criterion and under the maximal error criterion if for every and , if is sufficiently large, there is a random correlated code of length such that , and and , respectively.
The supremum of the achievable rates under random correlated coding of under the average error criterion and under the maximal error criterion in scenario 1 is called the random correlated capacity of under the average error criterion and under the maximal error criterion in scenario 1, denoted by and , respectively.
Definition 7
Let . A nonnegative number is an  achievable rate for the arbitrarily varying classical-quantum channel under random correlated coding in scenario 1 under the average error criterion and under the maximal error criterion if for every , if is sufficiently large, there is a random correlated code of length such that , and and , respectively.
The supremum of the  achievable rates under random correlated coding of under the average error criterion and under the maximal error criterion in scenario 1 is called the random correlated  capacity of under the average error criterion and under the maximal error criterion in scenario 1, denoted by and , respectively.
By (5) it is clear that employing a “mixed strategy” does not help the jammer more than using a deterministic strategy. That is, the jammer cannot enlarge the average probability of error if he randomly chooses a state sequence with any conditional distribution , according to the input codeword, instead of choosing a fixed state sequence with the best deterministic strategy, because
for all and all (with ).
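The argument rests on the elementary fact that an average never exceeds a maximum: for a fixed codeword, the error under a randomized choice of state sequences is a convex combination of the errors under the individual state sequences. The following sketch checks this on toy numbers. The error values, the strategy, and the function names are made up for illustration and do not come from the paper.

```python
def error_under_mixed_strategy(error_by_state_seq, strategy):
    """Expected error when the jammer draws a state sequence from `strategy`."""
    return sum(prob * error_by_state_seq[seq] for seq, prob in strategy.items())


def worst_deterministic_error(error_by_state_seq):
    """Error under the best deterministic choice from the jammer's view."""
    return max(error_by_state_seq.values())


# Hypothetical decoding error probabilities for one fixed codeword,
# indexed by the state sequence chosen by the jammer.
errors = {"000": 0.10, "010": 0.40, "111": 0.25}
# Hypothetical randomized jammer strategy (probabilities sum to one).
mixed = {"000": 0.5, "010": 0.3, "111": 0.2}

print(error_under_mixed_strategy(errors, mixed))
print(worst_deterministic_error(errors))
```

Since the bound holds for every codeword separately, it also holds after averaging over messages and over the shared key.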
Scenario 2
Now the jammer has a greater advantage: he can choose the state sequence according to both the input codeword and the message which the sender wants to transmit, or a function .
Definition 8
We define the average probability of error in scenario 2 by
(12) 
The maximum probability of error in scenario 2 is defined as
(13) 
Definition 9
A nonnegative number is an achievable rate for the arbitrarily varying classical-quantum channel under random correlated coding in scenario 2 under the average error criterion and under the maximal error criterion if for every and , if is sufficiently large, there is a random correlated code of length such that , and and , respectively.
The supremum of the achievable rates under random correlated coding of under the average error criterion and under the maximal error criterion in scenario 2 is called the random correlated capacity of under the average error criterion and under the maximal error criterion in scenario 2, denoted by and , respectively.
Definition 10
Let . A nonnegative number is an  achievable rate for the arbitrarily varying classical-quantum channel under random correlated coding in scenario 2 under the average error criterion and under the maximal error criterion if for every , if is sufficiently large, there is a random correlated code of length such that , and and , respectively.
The supremum of the  achievable rates under random correlated coding of under the average error criterion and under the maximal error criterion in scenario 2 is called the random correlated  capacity of under the average error criterion and under the maximal error criterion in scenario 2, denoted by and , respectively.
Obviously
It is easy to show that
because both (5) and (8) are equal to
Moreover, the average probability of error (8) can be rewritten as
Thus, in the standard way, by Markov's inequality one may conclude that the message set of any code with average probability of error in scenario 2 contains a subset such that and
for all . That is,
thus
(14) 
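The expurgation step behind this kind of argument can be sketched as follows. The exact constants of the paper are not recoverable from the text here, so the sketch assumes the usual choice: keep every message whose error is at most twice the average. By Markov's inequality fewer than half of the messages can exceed that threshold, so at least half survive. The function name and the error values are hypothetical.

```python
def expurgate(per_message_error):
    """Return the indices of messages whose error is at most twice the average.

    Assumed constants: threshold 2 * average, which by Markov's inequality
    keeps at least half of the messages.
    """
    average = sum(per_message_error) / len(per_message_error)
    threshold = 2 * average
    return [m for m, err in enumerate(per_message_error) if err <= threshold]


# Hypothetical per-message error probabilities of some code.
errors_per_message = [0.02, 0.01, 0.40, 0.03, 0.02, 0.00]
kept = expurgate(errors_per_message)
print(kept)
```

Dropping at most half of the messages costs at most one bit in the total message length, so the rate loss vanishes as the block length grows.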
III Main Results
For a given AVCQC with set of states , let
(15) 
Theorem 1
(Direct Coding Theorem for Scenario 1) Given an AVCQC and a type , for all , and , there is a , such that for all sufficiently large , there exists a code of length with a rate larger than , average probability of error in scenario 1 smaller than , and key size of the random correlated code smaller than . Moreover, all codewords of the code books in the support set of the random correlated code are in .
Remark 2
In particular, there is a constant (depending only on the AVCQC) such that for any sequence of positive real numbers , lower bounded by for an (depending on ), with , there exists a sequence of random correlated codes with a rate larger than , average probability of error smaller than and the amount of common randomness upper bounded by .
Theorem 2
(Strong Converse Coding Theorem for Scenario 1)
For every we have
(16) 
Let
(17) 
Then obviously
(18) 
The following Example 1 shows that the inequality is strict already for classical arbitrarily varying channels, as a special case of AVCQCs. It was shown that the random correlated capacities of an AVCQC under maximum error probability and average error probability, when the jammer does not know the channel input, are the same and both equal to . Recalling that employing the criterion of average probability of error corresponds to scenario 1 and the criterion of maximum probability of error corresponds to scenario 2, we conclude that knowing the message to be sent cannot help a jammer who only knows the coding scheme to reduce the capacity, if the communicators are allowed to use random correlated codes.
Example 1
Let and . We define a classical arbitrarily varying channel represented by the transmission matrices
The jammer may choose by setting , and . Since
we have
But when the jammer has no knowledge about the channel input, we can always achieve positive capacity, since zero capacity means there is a such that
has rank , which can only be true when
But there is clearly no such since otherwise we would have
Thus when the jammer has no knowledge about the channel input, this channel has a positive deterministic capacity.
Example 1 shows that the jammer really benefits from his knowledge about the channel input.
The following example was first presented at the IEEE International Symposium on Information Theory 2010 in a talk by N. Cai, T. Chen, and A. Grant.
Example 2
Let and . We define a classical arbitrarily varying channel such that , if for . That is the transmission matrices in are
First, the deterministic capacity of under maximum error probability is larger than or equal to , because for all there is a zero-error code of length and therefore a code with criterion of maximum probability of error. Secondly, let be a mapping from for arbitrary sending to such that if and otherwise , for . Then no pair of codewords in a code with criterion of maximum probability of error have the same image under the mapping , because with probability one the decoder cannot separate two codewords with the same image if the jammer properly chooses the state sequence according to the input codeword. Thus the deterministic capacity of under maximum error probability is equal to .
On the other hand, let be an input distribution such that and for . Let and be the input and output random variables for and , the channel in , minimizing . Then . Next, by considering the support sets of conditional distributions, we have and for . Thus and therefore . Moreover, by simple calculation, for . Thus and .
Example 2 shows that the legal transmitters really benefit from the resource even when the deterministic capacity under the maximal error criterion is positive.
Now one may ask the same question in scenario 2. This is answered by the following Theorem, which can be proven by modifying the proof of Theorem 1:
Theorem 3
The same conclusion for scenario 2, as that for scenario 1 in Theorem 1, holds.
The above three Theorems and the facts that
yield the coding theorem:
Corollary 1
For all we have
(19) 
Moreover, both capacities and can be achieved by codes with vanishing key rates.
Thus we conclude that:

Additionally knowing the message to be sent helps a jammer to reduce the capacity neither in the scenario where the jammer knows only the coding scheme nor in the scenario where the jammer knows both the coding scheme and the input codeword.

Knowing the input codeword is more effective than knowing the message for a jammer who knows the coding scheme and wants to attack the communication.
IV Proof of Theorem 1
Although coding for classical arbitrarily varying channels is already a challenging topic with a lot of open problems, coding for AVCQCs is even harder. Due to the non-commutativity of quantum operators, many techniques, concepts and methods of classical information theory, for instance non-standard decoders and list decoding, may not be extended to quantum information theory. Sarwate used list decoding in [22] to prove the coding theorem for classical arbitrarily varying channels when the jammer knows the input codeword. However, since how to apply list decoding to quantum channels is still an open problem, the technique for classical channels in [22] cannot be extended to AVCQCs. We need a different approach for our scenario 1.
If the jammer had some information about the outcome of the random key through the input codeword, to which he has access in scenario 1, he could apply a strategy against the th deterministic code for the AVCQC by choosing the worst state sequence to attack the communication, which we want to avoid. To this end a codeword must be used by “many” outcomes of a random correlated code , if it is used by at least one of . This is the main idea of our proof. We divide the proof into 5 steps. In the first step we derive a useful auxiliary result from known results. Next, with the auxiliary result and the Chernoff bound, we generate a ground set of code books from a typical set . Then our code is constructed from the ground set and analyzed in the third and fourth steps, respectively. To simplify the statement, we do not fix the values of the parameters exactly in the second to fourth steps, but only set up the necessary constraints on them. So finally, in the last step, we have to assign values to the parameters appearing in the proof.
IV-A An Auxiliary Result
We first derive a useful auxiliary result from known projections in previous work.
To construct decoding measurements of codes for classical-quantum compound channels, the authors of [6] and [18] introduced two kinds of projections for a set of classical-quantum channels and input codewords, respectively. Although the two projections are quite different, they share the same properties. We summarize their properties, which will be used in the paper, in the following lemma.
Lemma 1
For a set of classical-quantum channels with a common input alphabet and a common output Hilbert space and any input codeword , there exists a projection in such that,
(i) For all ,
(20) 
for an ;
(ii)
(21) 
for all , and sufficiently large , where
(iii) Moreover, for every permutation on with , remains invariant when the permutation acts on the coordinates of the th tensor power of the Hilbert space .
Let be a finite set of classical-quantum channels, indexed by elements of and let be defined by (15). Then
Corollary 2
Let be the projection in Lemma 1 for , and be randomly and uniformly distributed on , then
(22) 
for all and sufficiently large .
Proof: Let be the joint type of . Let be randomly and uniformly distributed on and be a random variable with uniform distribution on , and independent of . Then by Lemma 1 (ii), we have that