A one-shot quantum joint typicality lemma
A fundamental tool to prove inner bounds in classical network information theory is the so-called ‘conditional joint typicality lemma’. In addition to the lemma, one often uses unions and intersections of typical sets in the inner bound arguments without so much as giving them a second thought. These arguments fail spectacularly in the quantum setting. This bottleneck shows up in the fact that so-called ‘simultaneous decoders’, as opposed to ‘successive cancellation decoders’, are known for very few channels in quantum network information theory. Another manifestation of this bottleneck is the lack of so-called ‘simultaneous smoothing’ theorems for quantum states.
In this paper, we overcome the bottleneck by proving for the first time a one-shot quantum joint typicality lemma with robust union and intersection properties. To do so, we develop two novel tools in quantum information theory which may be of independent interest. The first tool is a simple geometric idea called tilting, which increases the angles between a family of subspaces in orthogonal directions. The second tool, called smoothing, is a way of perturbing a multipartite quantum state such that the partial trace over any subset of registers does not increase the operator norm by much.
Our joint typicality lemma allows us to construct simultaneous quantum decoders for many multiterminal quantum channels. It provides a powerful tool to extend many results in classical network information theory to the one-shot quantum setting.
A fundamental tool to prove inner bounds for communication channels in classical network information theory is the so-called conditional joint typicality lemma [EK12]. Very often, the joint typicality lemma is used together with implicit intersection and union arguments in the inner bound proofs. This is especially so in the construction of so-called simultaneous decoders (as opposed to successive cancellation decoders) for communication problems. In this paper, we investigate what happens when one tries to extend the classical inner bound proofs to the quantum setting. It turns out that the union and intersection arguments present a huge stumbling block in this effort. The main result of this paper is a one-shot quantum joint typicality lemma that takes care of these union and intersection bottlenecks, and allows us to extend many classical inner bound proofs to the quantum setting.
1.1 One-shot inner bound for the classical MAC
Let us illustrate the need for an intersection argument together with a joint typicality lemma by considering the problem of proving inner bounds for arguably the simplest multiterminal communication channel viz. the multiple access channel (MAC). We consider the one-shot classical setting. There are two senders Alice and Bob who would like to send messages , to a receiver Charlie. There is a communication channel with two inputs and one output called the two sender one receiver MAC connecting Alice and Bob to Charlie. The two input alphabets of will be denoted by , and the output alphabet by . Let . On getting message , Alice encodes it as a letter and feeds it to her channel input. Similarly on getting message , Bob encodes it as a letter and feeds it to his channel input. The channel outputs a letter according to the channel probability distribution . Charlie now has to try and guess the message pair from the channel output. We require that the probability of Charlie’s decoding error averaged over the channel behaviour as well as over the uniform distribution on the set of message pairs is at most .
Consider the following randomised construction of a codebook for Alice and Bob. Fix probability distributions , on sets , . For , choose independently according to . Similarly for , choose independently according to .
We now describe the decoding strategy that Charlie follows in order to try and guess the message pair that was actually sent. Let . Let , be two probability distributions on the same sample space . A ‘classical POVM element’ or ‘test’ on is defined to be a function . Intuitively, for a sample point , denotes its probability of acceptance by the test. For two classical POVM elements , on , we can define the ‘intersection’ classical POVM element as follows: . Similarly, we can define the ‘union’ classical POVM element as follows: . Following Wang and Renner [WR12], we define the classical hypothesis testing relative entropy as follows:
where the maximisation is over all classical POVM elements on ‘accepting’ the distribution with probability at least . It is easy to see that the optimising POVM element attains equality in the constraint for , as well as achieves the maximum in objective function for .
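A minimal Python sketch of the classical quantity just defined, assuming the standard Wang-Renner form: maximise minus the logarithm of the acceptance probability under the second distribution, subject to accepting the first distribution with probability at least 1 - eps. Logarithms to base 2, the function name and the dictionary representation of distributions are illustrative assumptions, not the paper's notation. On a finite sample space the optimal test is a Neyman-Pearson likelihood-ratio test with at most one fractional point:

```python
import math

def hypothesis_testing_relative_entropy(p, q, eps):
    """Return (D_H^eps(p||q) in bits, optimal test) for finite distributions.

    p, q: dicts mapping sample points to probabilities on a common set.
    The test maximises -log2 sum_x test[x]*q[x] subject to
    sum_x test[x]*p[x] >= 1 - eps and 0 <= test[x] <= 1.
    """
    target = 1.0 - eps
    # Points with p[x] == 0 add q-mass without helping the constraint: skip them.
    support = [x for x in p if p[x] > 0]
    # Neyman-Pearson: accept points in increasing order of the ratio q/p.
    support.sort(key=lambda x: q.get(x, 0.0) / p[x])
    test = {x: 0.0 for x in p}
    p_mass = 0.0
    q_mass = 0.0
    for x in support:
        if target - p_mass <= 1e-12:
            break  # constraint already met (up to rounding)
        # Accept x fully, or fractionally if that already meets the constraint.
        weight = min(1.0, (target - p_mass) / p[x])
        test[x] = weight
        p_mass += weight * p[x]
        q_mass += weight * q.get(x, 0.0)
    d_h = math.inf if q_mass == 0 else -math.log2(q_mass)
    return d_h, test

# Toy distributions (hypothetical values chosen for illustration).
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.25, "b": 0.75}
d_h, test = hypothesis_testing_relative_entropy(p, q, eps=0.5)
```

In this toy instance the test accepts only the point "a", so the constraint holds with equality, in line with the remark above that the optimising POVM element attains equality in the constraint.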
Define the probability distribution on as for all . With a slight abuse of notation, we shall often use to also denote the distribution on . Similarly, define probability distributions , on in the natural fashion. Define the probability distributions , , , on in the natural manner. Consider classical POVM elements , , achieving the maximum in the definitions of , , respectively. As a shorthand, we will use the hypothesis testing mutual information quantities , , to denote the hypothesis testing relative entropy quantities , , respectively. Now consider the intersection classical POVM element . Charlie’s decoding strategy is as follows. Suppose the channel output is . Then, Charlie uses the following randomised algorithm for decoding.
Toss a coin with probability of HEAD being .
If the coin comes up HEAD, declare as Charlie’s guess and halt.
If the coin comes up TAILS, go to next iteration.
Declare FAIL, if Charlie did not declare any guess above.
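The randomised decoding steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `sequential_decode`, the callback `accept_prob` (standing for the value of the intersection test on the codewords of a message pair together with the channel output) and the toy inputs are hypothetical and do not appear in the original.

```python
import random

def sequential_decode(message_pairs, accept_prob, rng=random):
    """Scan message pairs in order; return the first pair whose coin shows HEAD.

    message_pairs: iterable of (m, n) pairs, e.g. in lexicographic order.
    accept_prob: function (m, n) -> probability of HEAD for that pair.
    Returns the guessed pair, or None to signal FAIL.
    """
    for m, n in message_pairs:
        if rng.random() < accept_prob(m, n):  # coin comes up HEAD
            return (m, n)
        # coin comes up TAILS: go to the next pair
    return None  # FAIL: no guess was declared

# Toy run with a deterministic test that accepts only the pair (1, 2).
pairs = [(m, n) for m in range(2) for n in range(3)]
guess = sequential_decode(pairs, lambda m, n: 1.0 if (m, n) == (1, 2) else 0.0)
no_guess = sequential_decode(pairs, lambda m, n: 0.0)
```

An error occurs in this scheme exactly when a HEAD appears for some pair scanned before the true one, or when the true pair itself produces TAILS, which matches the two error events analysed next.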
We now analyse the expectation, under the choice of a random codebook , of the error probability of Charlie’s decoding algorithm. Suppose the message pair is inputted to the channel. Let denote the lexicographic order on message pairs. Let the channel output be denoted by . Then, a decoding error occurs only if Charlie tosses a HEAD for a pair or if Charlie tosses a TAIL for . The probability of this occurring, for a given codebook , is upper bounded by
The expectation, over the choice of the random codebook , of the decoding error is then upper bounded by
Above, we use the properties that
for all triples . Now define the average decoding error probability under a codebook to be
Then the expectation, under the choice of a random codebook , of the average decoding error probability is upper bounded by
Thus, for any rate pair in the region given by
there is a codebook with average decoding error probability less than .
1.2 Unions and intersections in the classical setting
We now step back and discuss the intersection classical POVM element used in the above proof. The intersection argument was crucial in constructing a simultaneous decoder for Charlie. In the asymptotic iid setting for the multiple access channel, it is possible to avoid simultaneous decoding and instead use successive cancellation decoding combined with time sharing [EK12]. However, in the one-shot setting time sharing does not make sense and successive cancellation gives only a finite set of achievable rate pairs. Thus, in order to get a continuous achievable rate region, we are forced to use simultaneous decoders only. There are also situations even in the asymptotic iid setting, e.g. in Marton’s inner bound with common message for the broadcast channel, where we need to use intersection arguments [EK12]. Similarly, union arguments crop up in some inner bound proofs, e.g. Marton’s inner bound without common message for the broadcast channel, even in the asymptotic iid setting [EK12]. In the one-shot setting union bounds occur more frequently, e.g. in the Han-Kobayashi inner bound for the interference channel [Sen18]. Thus, intersection and union arguments are indispensable in network information theory.
Before we proceed to the quantum setting, we prove for the sake of completeness a ‘one-shot classical joint typicality lemma’. In fact, it is nothing but an application of intersection and union of classical POVM elements.
Fact 1 (Classical joint typicality lemma).
Let , be probability distributions on a set . Let , where , . Then there is a classical POVM element on such that:
For all ,
For all ,
For , , let be the classical POVM element achieving the minimum in the definition of . Define the classical POVM element . Observe that for any ,
It is now easy to see that satisfies the properties claimed above. ∎
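The two elementary facts behind such arguments can be checked numerically. The sketch below assumes that the intersection and union of classical POVM elements are the pointwise minimum and maximum of the two functions. All function names and numerical values are hypothetical and chosen only for illustration.

```python
def intersection(f, g):
    """Pointwise minimum of two classical tests on the same sample space."""
    return {x: min(f[x], g[x]) for x in f}

def union(f, g):
    """Pointwise maximum of two classical tests on the same sample space."""
    return {x: max(f[x], g[x]) for x in f}

def complement(f):
    """Test that accepts exactly when f rejects."""
    return {x: 1.0 - f[x] for x in f}

def accept(p, f):
    """Probability that distribution p is accepted by test f."""
    return sum(p[x] * f[x] for x in p)

# Toy data (hypothetical).
p = {0: 0.5, 1: 0.25, 2: 0.25}
f = {0: 1.0, 1: 1.0, 2: 0.0}
g = {0: 1.0, 1: 0.5, 2: 1.0}

both = intersection(f, g)
either = union(f, g)

# Union bound on rejection: the intersection rejects at most as often as
# f and g reject in total.
lower_bound = 1.0 - (1.0 - accept(p, f)) - (1.0 - accept(p, g))
# The intersection never accepts more often than either test alone.
upper_bound = min(accept(p, f), accept(p, g))
```

Here `accept(p, both)` lies between `lower_bound` and `upper_bound`, and `both` coincides with the complement of the union of the complements of `f` and `g`.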
1.3 Extending unions and intersections to the quantum setting
We now ponder what is required to extend the above inner bound proof for the classical MAC to the setting of the one-shot classical-quantum multiple access channel (cq-MAC). In the cq-MAC, there are two senders Alice and Bob who would like to send messages , to a receiver Charlie. There is a communication channel with two classical inputs and one quantum output connecting Alice and Bob to Charlie. The two input alphabets of will be denoted by , and the output Hilbert space by . If the pair is fed into the channel inputs, the output of the channel is a density matrix in . Let . On getting message , Alice encodes it as a letter and feeds it to her channel input. Similarly on getting message , Bob encodes it as a letter and feeds it to his channel input. Charlie now has to try and guess the message pair from the channel output state . We require that the probability of Charlie’s decoding error averaged over the uniform distribution on the set of message pairs is at most .
One can do a similar randomised construction of a codebook for Alice and Bob as before. One can make use of Wang and Renner’s [WR12] hypothesis testing relative entropy for a pair of quantum states , in the same Hilbert space , which is defined as follows:
where the maximisation is over all POVM elements on (i.e. positive semidefinite operators , ) ‘accepting’ the state with probability at least . Again, it is easy to see that the optimising POVM element attains equality in the constraint for , as well as achieves the maximum in objective function for . One can define the analogous quantum state
and the tensor products of the marginals , , , as well as the hypothesis testing mutual informations , , in the quantum setting too. Thus, we would like to prove that any rate pair in the region given by
is achievable with small average decoding error probability.
Suppose there exists a single POVM element on the Hilbert space that simultaneously satisfies the following properties:
In the classical setting, such a POVM element was constructed by taking the intersections of the three POVM elements , , achieving the maximums in the definitions of the three entropic quantities , , . In the quantum setting, if such an ‘intersection’ POVM element exists it is indeed possible to construct a decoding algorithm for Charlie with average error probability at most by using the ‘pretty good measurement’ [Bel75b] and the Hayashi-Nagaoka operator inequality [HN03], and by mimicking the classical analysis given above for the decoding error.
Let , , be the three POVM elements achieving the maximums in the definitions of the three entropic quantities , , . Let us now look at various simple ideas to generalise intersection of POVM elements to the quantum setting. If the POVM elements were projectors, which can be ensured without loss of generality by embedding the quantum states into a larger Hilbert space , one can try taking the projector onto the intersection of the supports of , , . This indeed ensures that Properties 2, 3, 4 described above hold for . Unfortunately, the intersection can easily be the zero subspace which kills all hope of satisfying Property 1 even approximately. The next simple idea to define the ‘intersection’ POVM element would be to take the product . With this definition, can be shown to satisfy Property 1 with lower bound using the non-commutative union bound [Gao15]. However, it is not at all clear that the remaining three properties can be simultaneously satisfied, even approximately, because , , do not commute in general. To get an idea of the difficulty, consider the expression
that arises if one were to attempt to prove Property 2. It is true that , but it is also possible that
This makes it impossible to show the achievability of the desired rate region (Equation 1) with this definition of . In fact, one of the main technical contributions of this paper is a robust novel notion of intersection of non-commuting POVM elements achieving the maximums in the definitions of the appropriate hypothesis testing mutual information quantities.
As a first step towards defining a robust notion of intersection of non-commuting POVM elements, we address the complementary problem of defining a robust notion of union of non-commuting POVM elements. The ‘intersection’ POVM element can then be defined to be simply the complement of the ‘union’ of the complements. Note that the complement of a POVM element in a Hilbert space is defined to be . Suppose that the POVM elements were projectors, which can be ensured without loss of generality by embedding the states into a larger Hilbert space . One can then naively define the ‘union’ of a family of projectors to be the projector onto the span of the supports (the union of supports is not a vector space in general). However, this does not give us anything new as the span can indeed be the entire space and so Property 1 can fail spectacularly, that is, the lower bound in Property 1 can be as low as zero! The problem with the span idea is captured by the following example. Consider the two dimensional Hilbert space . Let be the one dimensional space spanned by and the one dimensional space spanned by . Consider the quantum state . Now the probability of being accepted by , is and respectively. However, the probability of being accepted by the span of and is one.
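The failure of the span can be reproduced with a small numerical sketch. The specific vectors below are a hypothetical instance (two nearly parallel real lines in a two-dimensional space and a state orthogonal to the first line) and are not necessarily the example the text has in mind. Helper names are illustrative.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt on real vectors; returns an orthonormal basis of their span."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = inner(b, w)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(inner(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def accept_prob(psi, vectors):
    """Probability that the pure state psi is accepted by the projector
    onto the span of the given vectors."""
    return sum(inner(b, psi) ** 2 for b in orthonormal_basis(vectors))

theta = 0.01                              # small angle between the two lines
x = [1.0, 0.0]                            # spans the first subspace
y = [math.cos(theta), math.sin(theta)]    # spans the second subspace
psi = [0.0, 1.0]                          # pure state being tested

p_x = accept_prob(psi, [x])        # zero
p_y = accept_prob(psi, [y])        # sin(theta)**2, about 1e-4
p_span = accept_prob(psi, [x, y])  # one: the span is the whole space
```

The sum of the two individual acceptance probabilities is about 1e-4, yet the span accepts with probability one, so no union bound of the classical kind can hold for the naive span.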
Notice however that the above pathological phenomenon with the span occurs only because the subspaces and have ‘small angles’ between them. We overcome this problem by ‘tilting’ , in orthogonal directions to form new spaces , . The Hilbert space has to be enlarged sufficiently to allow this tilting to be possible. The process of tilting increases the ‘angles’ between the subspaces in orthogonal directions allowing one to recover an upper bound for the span very close to the sum of acceptance probabilities, just like in the classical setting. This ‘tilting’ idea is formalised in Proposition 2. Thus, the ‘tilted span’ is best thought of as a robust notion of union of subspaces satisfying a well behaved union bound.
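A toy numerical illustration of the tilting idea follows. The parametrisation used here is a hypothetical choice made for this sketch and is not taken from Proposition 2: the space is enlarged by one extra copy per subspace, and each spanning vector v is replaced by the normalised vector with v in the original space and alpha times v in its own copy. For two lines, a short calculation with the Gram matrix shows that the tilted span then accepts a state supported on the original space with probability at most the sum of the individual acceptance probabilities divided by alpha squared.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt on real vectors; returns an orthonormal basis of their span."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = inner(b, w)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(inner(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def accept_prob(psi, vectors):
    """Acceptance probability of pure state psi under the projector onto the span."""
    return sum(inner(b, psi) ** 2 for b in orthonormal_basis(vectors))

def embed(v, num_slots):
    """Place v in the original space, followed by num_slots empty copies."""
    return list(v) + [0.0] * (len(v) * num_slots)

def tilt(v, slot, num_slots, alpha):
    """Embed v and add alpha*v in extra copy number `slot`, then normalise."""
    d = len(v)
    out = embed(v, num_slots)
    for i, vi in enumerate(v):
        out[d * (slot + 1) + i] = alpha * vi
    norm = math.sqrt(inner(out, out))
    return [c / norm for c in out]

theta, alpha = 0.01, 1.0
x = [1.0, 0.0]
y = [math.cos(theta), math.sin(theta)]
state = [0.0, 1.0]

p_x = accept_prob(state, [x])
p_y = accept_prob(state, [y])
# Each line is tilted into its own extra copy, i.e. in orthogonal directions.
tilted = [tilt(x, 0, 2, alpha), tilt(y, 1, 2, alpha)]
p_tilted_span = accept_prob(embed(state, 2), tilted)
# Price of tilting: a state inside a subspace is accepted less often.
p_x_on_tilted_x = accept_prob(embed(x, 2), [tilt(x, 0, 2, alpha)])
```

With these values the untilted span accepts the state with probability one, whereas the tilted span accepts it with probability below `p_x + p_y`. The last quantity shows the trade-off in this toy parametrisation: the tilted line accepts a state lying in the original line only with probability 1/(1 + alpha**2).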
Let us define the ‘intersection’ of subspaces to be the complement of the tilted span of the complementary subspaces. Applying this recipe to , , gives us a projector that satisfies Property 1 with lower bound by setting in the upper bound of Proposition 2. However, it is not clear whether satisfies the Properties 2, 3, 4 even approximately. This is because in order to prove, say, Property 2, one has to use the lower bound of Proposition 2 with , which would give an upper bound of at least for Property 2!
Nevertheless, it turns out that the tilting idea can be used as a starting point and further refined, leading finally to a proof of the desired inner bound (Equation 1) for the cq-MAC. We do so with the following sequence of steps. A full proof is given in Section 6 below.
Enlarge the Hilbert space suitably and consider a ‘perturbed’ version of the channel . The channel maps an input pair to a state close to . This gives a state close to . A codebook achieving a certain rate point for channel with a certain error achieves the same rate point for channel with only slightly more error. In fact, is obtained by tilting along a carefully chosen direction;
The channel is constructed in such a way that , , are extremely close to the states , , tilted in approximately orthogonal directions. This is a crucial new idea that we call smoothing of . Smoothing is possible because, though partial trace can increase the operator norm (Schatten $\infty$-norm) of a quantum state in general, the increase is negative or only slightly positive if the state is highly entangled between the traced out part and the untraced out part. The carefully chosen way of tilting to get ensures that is highly entangled across all bipartitions of the systems . Nevertheless, actually achieving this smoothing is technically intricate and requires the use of a completely mixed ancilla state of sufficiently large support. We use the term augmentation to refer to this technique of using a completely mixed ancilla state for the purpose of smoothing;
We then observe that the tilts of , , alluded to in the previous point are more complicated than the simple tilting idea described earlier and analysed in Proposition 2. To analyse these more complicated tilts, we define a tilting matrix and then define an -tilt. We now define the ‘intersection’ POVM element to be the projector onto the complement of the -tilted span of the complements of the supports of , , . We then show in Proposition 3 that an -tilt of subspaces also gives rise to a union bound which, even though not as good as the upper bound of Proposition 2, is still strong enough for the purpose of proving Property 1 with lower bound under the projector ;
Finally, we notice from the construction of that Properties 2, 3, 4 are easily satisfied by for the -tilted versions of , , , with upper bounds exactly the same as in the ideal case. Since the states , , are extremely close to the -tilted versions of , , , we finally conclude that they satisfy Properties 2, 3, 4 with upper bounds almost as good as in the ideal case under ;
Thus, can be used by Charlie in order to construct a decoding algorithm for the original channel that achieves any rate pair in the region described in Equation 1 with average error probability at most .
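The observation underlying the smoothing step, namely that partial trace can raise the operator norm of an unentangled state but lowers it for a highly entangled one, can be seen on two qubits. The sketch below is a toy check and not the paper's construction. The two states, the helper names and the use of power iteration to estimate the largest eigenvalue are all assumptions made for illustration.

```python
import math

def partial_trace_second(rho, d_a, d_b):
    """Trace out the second factor of a (d_a*d_b)-dimensional density matrix."""
    return [[sum(rho[a * d_b + b][a2 * d_b + b] for b in range(d_b))
             for a2 in range(d_a)] for a in range(d_a)]

def operator_norm_psd(rho, iters=200):
    """Largest eigenvalue of a real positive semidefinite matrix, estimated by
    power iteration from the uniform vector (assumes that start vector is not
    orthogonal to the top eigenspace)."""
    n = len(rho)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(rho[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return 0.0
        v = [c / norm for c in w]
    w = [sum(rho[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(vi * wi for vi, wi in zip(v, w))  # Rayleigh quotient

# Product state |0><0| tensor (identity/2): no entanglement.
product = [[0.5, 0.0, 0.0, 0.0],
           [0.0, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]
# Maximally entangled state (|00> + |11>)/sqrt(2).
entangled = [[0.5, 0.0, 0.0, 0.5],
             [0.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 0.0],
             [0.5, 0.0, 0.0, 0.5]]

norm_product = operator_norm_psd(product)                                        # 0.5
norm_product_marginal = operator_norm_psd(partial_trace_second(product, 2, 2))   # 1.0
norm_entangled = operator_norm_psd(entangled)                                    # 1.0
norm_entangled_marginal = operator_norm_psd(partial_trace_second(entangled, 2, 2))  # 0.5
```

For the product state the norm doubles under partial trace, while for the maximally entangled state it halves.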
The strategy outlined above enables us to prove the following one-shot quantum joint typicality lemma.
Let be a -partite quantum system with each isomorphic to a Hilbert space . Let be a quantum state in . For a subset , let denote the systems . Let denote the marginal state on obtained by tracing out the systems in from . Let . Let be a Hilbert space of dimension There exist a state independent of , a state , and a POVM element on where , with the following properties:
For every set , ,
1.4 Related work
The bottleneck of simultaneous decoding was first pointed out by Fawzi et al. [FHS12] in their paper on the quantum interference channel. Subsequently, Dutil [Dut11] pointed out a related bottleneck of ‘simultaneous smoothing’, which was further discussed by Drescher and Fawzi [DF13]. Simultaneous decoders have been recently used in several papers e.g. [QWW17], but the inner bounds obtained there were suboptimal compared to known inner bounds when restricted to the asymptotic iid setting.
1.5 Organisation of the paper
In the next section, we state some preliminary facts which will be useful throughout the paper. Section 3 defines and proves the union properties of the tilted span and -tilted span of subspaces. In Section 4 we prove the so-called one-shot quantum joint typicality lemma, which manages to construct a robust notion of intersection of POVM elements achieving a given set of hypothesis testing mutual information quantities. The meat of the proof of the one-shot quantum joint typicality lemma is encapsulated into a proposition which we prove in Section 5. Next, we formally prove the achievability of the rate region described by Equation 1 in Section 6. Finally, we make some concluding remarks and list some open problems in Section 7.
All Hilbert spaces in this paper are finite dimensional. The symbol $\oplus$ always denotes the orthogonal direct sum of Hilbert spaces. For a subspace of a Hilbert space , let denote the orthogonal projection in onto . When clear from the context, we may use instead of for brevity of notation.
By a quantum state or a density matrix in a Hilbert space $\mathcal{H}$, we mean a Hermitian, positive semidefinite linear operator on $\mathcal{H}$ with trace equal to one. By a POVM element $\Pi$ in $\mathcal{H}$, we mean a Hermitian positive semidefinite linear operator on $\mathcal{H}$ with eigenvalues between $0$ and $1$. Stated in terms of inequalities on Hermitian operators, $0 \leq \Pi \leq \mathbb{1}$, where $0$, $\mathbb{1}$ denote the zero and identity operators on $\mathcal{H}$. In what follows, we shall use several times the Gelfand-Naimark theorem which is stated below for completeness.
Fact 2 (Gelfand-Naimark).
Let be a POVM element in a Hilbert space . For any integer , there exists an orthogonal projection in such that
for all linear operators acting on . Above, is a fixed vector, independent of and , in .
Since quantum probability is a generalisation of classical probability, one can talk of a so-called ‘classical POVM element’. Suppose we have a probability distribution , . A classical POVM element on is a function . The probability of accepting the POVM element is then . One can continue to use the operator formalism for classical probability with the understanding that density matrices and POVM elements are now diagonal matrices.
Let $\|v\|_2$ denote the $\ell_2$-norm of a vector $v$. For an operator $A$ on $\mathcal{H}$, we use $\|A\|_1$ to denote the Schatten $1$-norm, aka trace norm, of $A$, which is nothing but the sum of singular values of $A$. We use $\|A\|_\infty$ to denote the Schatten $\infty$-norm, aka operator norm, of $A$, which is nothing but the largest singular value of $A$. For operators , on , we have the inequality
Let be a finite set. By a classical-quantum state on we mean a quantum state of the form where ranges over computational basis vectors of viewed as a Hilbert space, is a probability distribution on and the operators , are quantum states in . We will also use the terminology that is classical on and quantum on .
For a positive integer , we use to denote the set . If , we define . Let be a non-negative integer and a positive integer. We shall study systems that are classical on and quantum on , where and are treated as disjoint sets. We will use the notation to denote the union keeping in mind that , are disjoint. A subpartition of , denoted by , is a collection of non-empty, pairwise disjoint subsets of . Note that the order of subsets does not matter in defining a subpartition . For a subset , , we use the notation to denote a so-called pseudosubpartition of intersecting non-trivially i.e. for , , and for , . There is a natural partial order called the refinement partial order on the pseudosubpartitions of intersecting non-trivially, where iff there is a function such that for all . Under this partial order, the pseudosubpartitions form a lattice with minimum element being the empty pseudosubpartition with , and maximum element being the full block with and . It is easy to see that the number of pseudosubpartitions of is at most , and the number of pseudosubpartitions of with blocks is at most .
Suppose we have a -partite quantum system . For a non-empty set , let denote the systems . Similarly, if is a quantum state in , will denote its marginal in . Thus, . If is a computational basis vector of , for a subset will denote its restriction to the system . Thus, . We also use to denote computational basis vectors of without reference to the systems in . For a subset and a computational basis vector of , the notation has the analogous meaning. The notation denotes a tensor product only for the coordinates in .
Let , be two quantum states in the same Hilbert space. Let . Then the hypothesis testing relative entropy of with respect to is defined by
where the maximisation is over all POVM elements acting on the Hilbert space.
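In the standard form due to Wang and Renner [WR12], the definition can be written as below. The symbol names are assumptions of this sketch: $\rho$, $\sigma$ stand for the two states, $\epsilon$ for the error parameter and $\Pi$ for the POVM element.

```latex
D_H^{\epsilon}(\rho \,\|\, \sigma)
  \;:=\;
  \max_{\substack{0 \,\leq\, \Pi \,\leq\, \mathbb{1} \\ \mathrm{Tr}[\Pi \rho] \,\geq\, 1 - \epsilon}}
  \; -\log \mathrm{Tr}[\Pi \sigma]
```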
The definition quantifies the minimum probability of ‘accepting’ by a POVM element that ‘accepts’ with probability at least . From the definition, it is easy to see that if , . We will require the following property of the so-called hypothesis testing mutual information.
Let . Let be a quantum state in a bipartite system . Define the hypothesis testing mutual information Then,
Let be a third system and a purification of . Let be the optimising POVM element for Define . Let