# Approximating quantum channels by completely positive maps with small Kraus rank

###### Abstract.

We study the problem of approximating a quantum channel by one with as few Kraus operators as possible (in the sense that, for any input state, the output states of the two channels should be close to one another). Our main result is that any quantum channel mapping states on some input Hilbert space $\mathrm{A}$ to states on some output Hilbert space $\mathrm{B}$ can be compressed into one with order $d\log d$ Kraus operators, where $d=\max(|\mathrm{A}|,|\mathrm{B}|)$, hence much less than $|\mathrm{A}||\mathrm{B}|$. In the case where the channel’s outputs are all very mixed, this can be improved to order $d$. We discuss the optimality of this result as well as some consequences.

## 1. Introduction

Quantum channels are the most general framework for describing the transformations that a quantum system may undergo. They are defined as completely positive and trace preserving (CPTP) maps from the set of bounded operators on some input Hilbert space $\mathrm{A}$ to the set of bounded operators on some output Hilbert space $\mathrm{B}$. Indeed, to be a physically valid evolution in the open system setting, a linear map has to preserve quantum states (i.e. positive semi-definiteness and unit-trace conditions) even when tensorized with the identity map on an auxiliary system.

Let us fix here once and for all some notation that we will use repeatedly in the remainder of the paper: given a Hilbert space $\mathrm{H}$, we shall denote by $\mathcal{L}(\mathrm{H})$ the set of linear operators on $\mathrm{H}$, and by $\mathcal{D}(\mathrm{H})$ the set of density operators (i.e. positive semi-definite and trace $1$ operators) on $\mathrm{H}$. Also, whenever $\mathrm{H}$ is finite dimensional (which will be the case of all the Hilbert spaces we will deal with in the sequel) we shall denote by $|\mathrm{H}|$ its dimension.

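So assume from now on that the Hilbert spaces $\mathrm{A}$ and $\mathrm{B}$ are finite dimensional. Then, we know by Choi’s representation theorem [6] that a CPTP map $\mathcal{N}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ can always be written as

$$ \mathcal{N}: X\in\mathcal{L}(\mathrm{A}) \mapsto \sum_{i=1}^{s} K_i X K_i^{\dagger} \in\mathcal{L}(\mathrm{B}), \tag{1} $$

where the operators $K_i:\mathrm{A}\to\mathrm{B}$, $1\leq i\leq s$, are called the Kraus operators of $\mathcal{N}$ and satisfy the normalization relation $\sum_{i=1}^{s} K_i^{\dagger}K_i=\mathrm{Id}_{\mathrm{A}}$. The minimal $s$ such that $\mathcal{N}$ can be decomposed in the Kraus form (1) is called the Kraus rank of $\mathcal{N}$, which we shall denote by $r_K(\mathcal{N})$. By Stinespring’s dilation theorem [11], an alternative way of characterizing a CPTP map $\mathcal{N}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ is as follows

$$ \mathcal{N}: X\in\mathcal{L}(\mathrm{A}) \mapsto \mathrm{Tr}_{\mathrm{E}}\left(V X V^{\dagger}\right) \in\mathcal{L}(\mathrm{B}), \tag{2} $$

for some environment Hilbert space $\mathrm{E}$ and some isometry $V:\mathrm{A}\hookrightarrow\mathrm{B}\otimes\mathrm{E}$ (i.e. $V^{\dagger}V=\mathrm{Id}_{\mathrm{A}}$). In this picture, $r_K(\mathcal{N})$ is nothing else than the minimal environment dimension $|\mathrm{E}|$ such that $\mathcal{N}$ may be expressed in the Stinespring form (2). It may be worth pointing out that there is a lot of freedom in representation (1): two sets of Kraus operators $\{K_i\}_{1\leq i\leq s}$ and $\{L_i\}_{1\leq i\leq s}$ give rise to the same quantum channel as soon as there exists a unitary $U$ on $\mathbf{C}^s$ such that, for all $1\leq i\leq s$, $L_i=\sum_{j=1}^{s}U_{ij}K_j$. On the contrary, representation (2) is essentially unique, up to the (usually irrelevant) transformation $V\mapsto(\mathrm{Id}\otimes U)V$, for $U$ a unitary on $\mathrm{E}$. That is why we will often prefer working with the latter rather than with the former.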
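The passage between the Kraus form and the Stinespring form can be checked numerically. The following is a minimal sketch, assuming NumPy and taking the qubit amplitude damping channel as an illustrative example (this channel, the parameter `gamma` and the helper names are assumptions of the sketch, not objects from the paper). It stacks the Kraus operators into an isometry and verifies that tracing out the environment reproduces the Kraus form.

```python
import numpy as np

def stinespring_isometry(kraus):
    """Stack Kraus operators K_i : A -> B into V = sum_i K_i (x) |i> : A -> B (x) E."""
    s = len(kraus)
    d_out, d_in = kraus[0].shape
    V = np.zeros((d_out * s, d_in), dtype=complex)
    for i, K in enumerate(kraus):
        e_i = np.zeros((s, 1))
        e_i[i, 0] = 1.0
        V += np.kron(K, e_i)  # the environment index is the fast (inner) index
    return V

def trace_out_environment(M, d_out, s):
    """Partial trace over E of an operator M acting on B (x) E."""
    return np.einsum("aibi->ab", M.reshape(d_out, s, d_out, s))

# Illustrative channel (an assumption of this sketch): qubit amplitude damping.
gamma = 0.3
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]]),
         np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]
rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)

V = stinespring_isometry(kraus)
out_kraus = sum(K @ rho @ K.conj().T for K in kraus)
out_stinespring = trace_out_environment(V @ rho @ V.conj().T, d_out=2, s=2)

is_isometry = np.allclose(V.conj().T @ V, np.eye(2))
forms_agree = np.allclose(out_kraus, out_stinespring)
print(is_isometry, forms_agree)
```

The environment dimension used here equals the number of Kraus operators supplied, which is why the minimal such dimension coincides with the Kraus rank.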

Yet another way of viewing the Kraus rank of a CPTP map $\mathcal{N}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ is as the rank of its associated Choi-Jamiołkowski state. Denoting by $\psi$ the maximally entangled state on $\mathrm{A}\otimes\mathrm{A}$, the latter is defined as the state $\tau(\mathcal{N})=\mathrm{id}\otimes\mathcal{N}(\psi)$ on $\mathrm{A}\otimes\mathrm{B}$. Consequently, any quantum channel from $\mathrm{A}$ to $\mathrm{B}$ has Kraus rank at most $|\mathrm{A}||\mathrm{B}|$. Moreover, the extremal such quantum channels all have Kraus rank at most $|\mathrm{A}|$. In particular, the case $r_K(\mathcal{N})=1$ corresponds to $\mathcal{N}$ being a unitary, hence reversible, evolution, whereas whenever $r_K(\mathcal{N})>1$, one can view $\mathcal{N}$ as a noisy summary of a unitary evolution on a larger system. The Kraus rank of a quantum channel can thus legitimately be seen as a measure of its “complexity”: it quantifies the minimal amount of ancillary resources needed to implement it (or equivalently the number of degrees of freedom in it that one is ignorant of). A natural question in this context is therefore: given any quantum channel, is it possible to reduce its complexity without affecting its action too much, or in other words to find a channel with much smaller Kraus rank which approximately simulates it?
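As a small numerical illustration of the Kraus rank seen as the rank of the Choi-Jamiołkowski state, here is a sketch assuming NumPy. The example channels and the function names `choi_state` and `kraus_rank` are hypothetical choices made for this illustration. It also shows that a redundant Kraus decomposition does not change the rank.

```python
import numpy as np

def apply_channel(kraus, X):
    return sum(K @ X @ K.conj().T for K in kraus)

def choi_state(kraus):
    """tau(N) = (id (x) N)(psi), with psi the maximally entangled state on A (x) A."""
    d_out, d_in = kraus[0].shape
    tau = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            E_ij = np.zeros((d_in, d_in))
            E_ij[i, j] = 1.0
            tau += np.kron(E_ij, apply_channel(kraus, E_ij)) / d_in
    return tau

def kraus_rank(kraus, tol=1e-10):
    return int(np.linalg.matrix_rank(choi_state(kraus), tol=tol))

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)           # a unitary evolution
gamma = 0.3                                            # amplitude damping strength
damping = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]]),
           np.array([[0, np.sqrt(gamma)], [0, 0]])]
paulis = [np.eye(2), np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]), np.array([[1, 0], [0, -1]])]
randomizing = [P / 2 for P in paulis]                  # fully randomizing qubit channel

rank_unitary = kraus_rank([H])
rank_redundant = kraus_rank([H / np.sqrt(2), H / np.sqrt(2)])  # same channel, two operators
rank_damping = kraus_rank(damping)
rank_randomizing = kraus_rank(randomizing)
print(rank_unitary, rank_redundant, rank_damping, rank_randomizing)
```

The fully randomizing qubit channel saturates the general upper bound, namely the product of the input and output dimensions.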

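One last definition we shall need concerning CP maps is the following: the conjugate (or dual) of a CP map $\mathcal{N}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ is the CP map $\mathcal{N}^*:\mathcal{L}(\mathrm{B})\to\mathcal{L}(\mathrm{A})$ defined by

$$ \forall\ X\in\mathcal{L}(\mathrm{A}),\ \forall\ Y\in\mathcal{L}(\mathrm{B}),\ \mathrm{Tr}\left(\mathcal{N}(X)Y\right)=\mathrm{Tr}\left(X\mathcal{N}^*(Y)\right). $$

Equivalently, $\{K_i\}_{1\leq i\leq s}$ is a set of Kraus operators for $\mathcal{N}$ if and only if $\{K_i^{\dagger}\}_{1\leq i\leq s}$ is a set of Kraus operators for $\mathcal{N}^*$. Hence $\mathcal{N}$ and $\mathcal{N}^*$ have the same Kraus rank, while the trace-preservingness condition for $\mathcal{N}$ is equivalent to the unitality condition for $\mathcal{N}^*$.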
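The defining trace identity of the conjugate map, and the equivalence between trace preservation and unitality of the conjugate, can be verified on a random example. This is only a sketch assuming NumPy; the dimensions, the seed and the way the random channel is generated (slicing a random isometry) are choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_a, d_b, s = 2, 3, 4

# Random Kraus operators: slice a random isometry V : A -> B (x) E, with |E| = s.
G = rng.normal(size=(d_b * s, d_a)) + 1j * rng.normal(size=(d_b * s, d_a))
V, _ = np.linalg.qr(G)                       # columns of V are orthonormal
kraus = [V.reshape(d_b, s, d_a)[:, i, :] for i in range(s)]

def channel(X):
    # N(X) = sum_i K_i X K_i^dagger
    return sum(K @ X @ K.conj().T for K in kraus)

def conjugate(Y):
    # N^*(Y) = sum_i K_i^dagger Y K_i
    return sum(K.conj().T @ Y @ K for K in kraus)

X = rng.normal(size=(d_a, d_a)) + 1j * rng.normal(size=(d_a, d_a))
Y = rng.normal(size=(d_b, d_b)) + 1j * rng.normal(size=(d_b, d_b))

duality_gap = abs(np.trace(channel(X) @ Y) - np.trace(X @ conjugate(Y)))
conjugate_is_unital = np.allclose(conjugate(np.eye(d_b)), np.eye(d_a))
print(duality_gap, conjugate_is_unital)
```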

The remainder of this paper is organized as follows. In Section 2 we gather the background on quantum channel approximation needed in the sequel: precise definitions, previous works in this direction, etc. Our main results are then stated and commented on in Section 3, while their proofs are relegated to Section 4. In Section 5 we present several corollaries, which have applications in quantum data hiding and locking, among others. We finally discuss some open questions in Section 6.

## 2. Quantum channel approximation: definitions and already known facts

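Before going any further, we need to specify what we mean by “approximating a quantum channel”, since several definitions of approximation may be considered. In our setting, the most natural one is probably that of approximation in $(1{\to}1)$-norm: given CPTP maps $\mathcal{N},\widehat{\mathcal{N}}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$, we will say that $\widehat{\mathcal{N}}$ is an $\varepsilon$-approximation of $\mathcal{N}$ in $(1{\to}1)$-norm, where $\varepsilon>0$ is some fixed parameter, if

$$ \forall\ \rho\in\mathcal{D}(\mathrm{A}),\ \left\|\widehat{\mathcal{N}}(\rho)-\mathcal{N}(\rho)\right\|_1\leq\varepsilon. \tag{3} $$

At first sight it might appear that an even more natural error quantification in such a context would be in terms of the completely-bounded $(1{\to}1)$-norm (aka diamond norm) [1]. That is, in order to call $\widehat{\mathcal{N}}$ an $\varepsilon$-approximation of $\mathcal{N}$, we would require that, for any Hilbert space $\mathrm{A}'$,

$$ \forall\ \rho\in\mathcal{D}(\mathrm{A}\otimes\mathrm{A}'),\ \left\|\widehat{\mathcal{N}}\otimes\mathrm{id}(\rho)-\mathcal{N}\otimes\mathrm{id}(\rho)\right\|_1\leq\varepsilon. \tag{4} $$

Nevertheless, this notion of approximation is too strong for our purposes. Indeed, if $\mathcal{N}$ and $\widehat{\mathcal{N}}$ satisfy equation (4), it implies in particular that their associated Choi-Jamiołkowski states have to be $\varepsilon$-close in trace-norm. And this, in general, is possible only if $\mathcal{N}$ and $\widehat{\mathcal{N}}$ have the same, or at least comparable, number of Kraus operators, so that no environment dimensionality reduction can be achieved.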

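The question of quantum channel compression has already been studied in one specific case, that of the fully randomizing (or depolarizing) channel. Let us recall what is known there. The fully randomizing channel $\mathcal{R}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{A})$ is the CPTP map with same input and output spaces defined by

$$ \mathcal{R}: X\in\mathcal{L}(\mathrm{A})\mapsto(\mathrm{Tr}\,X)\,\frac{\mathrm{Id}}{|\mathrm{A}|}\in\mathcal{L}(\mathrm{A}), $$

so that, in particular, all input states are sent to the maximally mixed state $\mathrm{Id}/|\mathrm{A}|$. $\mathcal{R}$ has maximal Kraus rank $|\mathrm{A}|^2$ (because $\tau(\mathcal{R})$ is simply $\mathrm{Id}/|\mathrm{A}|^2$, and hence has rank $|\mathrm{A}|^2$). This was of course to be expected, if adhering to the intuitive idea that the bigger the Kraus rank of a channel, the noisier the channel. One possible minimal Kraus decomposition for $\mathcal{R}$ is

$$ \mathcal{R}: X\in\mathcal{L}(\mathrm{A})\mapsto\frac{1}{|\mathrm{A}|^2}\sum_{j,k=1}^{|\mathrm{A}|}V_{jk}XV_{jk}^{\dagger}\in\mathcal{L}(\mathrm{A}), $$

where for each $1\leq j,k\leq|\mathrm{A}|$, $V_{jk}=S^jZ^k$, with $S$ and $Z$ the generalized Pauli shift and phase operators on $\mathrm{A}$. It was initially established in [8] and later improved in [2] that there exist almost randomizing channels with drastically smaller Kraus rank. More specifically, the following was proved: for any $0<\varepsilon<1$, the CPTP map $\mathcal{R}$ can be $\varepsilon$-approximated in $(1{\to}1)$-norm by a CPTP map $\widehat{\mathcal{R}}$ with Kraus rank at most $C|\mathrm{A}|/\varepsilon^2$, where $C>0$ is a universal constant. Actually, something stronger was established, namely

$$ \forall\ \rho\in\mathcal{D}(\mathrm{A}),\ \left\|\widehat{\mathcal{R}}(\rho)-\mathcal{R}(\rho)\right\|_{\infty}\leq\frac{\varepsilon}{|\mathrm{A}|}, $$

which implies that, for any $p\in[1,\infty]$, $\widehat{\mathcal{R}}$ is an $\varepsilon$-approximation of $\mathcal{R}$ in $(1{\to}p)$-norm, in the sense that

$$ \forall\ \rho\in\mathcal{D}(\mathrm{A}),\ \left\|\widehat{\mathcal{R}}(\rho)-\mathcal{R}(\rho)\right\|_p\leq\frac{\varepsilon}{|\mathrm{A}|^{1-1/p}}. \tag{5} $$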
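The Kraus decomposition of the fully randomizing channel in terms of generalized Pauli shift and phase operators can be checked directly. The sketch below assumes NumPy and the usual conventions for these operators (shift sending each basis vector to the next one cyclically, phase multiplying the $j$-th basis vector by the $j$-th power of a primitive root of unity); the dimension and the random input state are arbitrary choices.

```python
import numpy as np

def weyl_operators(d):
    """All d^2 products shift^a phase^b of the generalized Pauli operators on C^d."""
    omega = np.exp(2j * np.pi / d)
    shift = np.roll(np.eye(d), 1, axis=0)       # shift |j> = |j+1 mod d>
    phase = np.diag(omega ** np.arange(d))      # phase |j> = omega^j |j>
    return [np.linalg.matrix_power(shift, a) @ np.linalg.matrix_power(phase, b)
            for a in range(d) for b in range(d)]

def fully_randomizing(rho):
    d = rho.shape[0]
    return sum(W @ rho @ W.conj().T for W in weyl_operators(d)) / d**2

rng = np.random.default_rng(1)
d = 3
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = G @ G.conj().T
rho /= np.trace(rho).real                       # a random density operator

output = fully_randomizing(rho)
sent_to_maximally_mixed = np.allclose(output, np.eye(d) / d)
print(sent_to_maximally_mixed)
```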

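The question we investigate here is whether this kind of statement actually holds true for any channel. Note however that, for a channel which is not the fully randomizing one, the notion of approximation in Schatten $p$-norm appearing in equation (5) is perhaps not the “correct” one. In fact, it would seem more accurate to quantify closeness in terms of relative error. Hence, given a CPTP map $\mathcal{N}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$, we would rather be interested in finding a CPTP map $\widehat{\mathcal{N}}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ with Kraus rank as small as possible, and such that

$$ \forall\ \rho\in\mathcal{D}(\mathrm{A}),\ \left\|\widehat{\mathcal{N}}(\rho)-\mathcal{N}(\rho)\right\|_p\leq\varepsilon\left\|\mathcal{N}(\rho)\right\|_p. \tag{6} $$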

## 3. Statement of the main results

###### Theorem 3.1.

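Fix $0<\varepsilon<1$ and let $\mathcal{N}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ be a CPTP map with Kraus rank $|\mathrm{E}|\geq|\mathrm{A}|,|\mathrm{B}|$. Then, there exists a CP map $\widehat{\mathcal{N}}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ with Kraus rank at most $C\max(|\mathrm{A}|,|\mathrm{B}|)\log(|\mathrm{E}|/\varepsilon)/\varepsilon^2$ (where $C>0$ is a universal constant) and such that

$$ \forall\ \rho\in\mathcal{D}(\mathrm{A}),\ -\varepsilon\left(\mathcal{N}(\rho)+\frac{\mathrm{Id}}{|\mathrm{B}|}\right)\leq\widehat{\mathcal{N}}(\rho)-\mathcal{N}(\rho)\leq\varepsilon\left(\mathcal{N}(\rho)+\frac{\mathrm{Id}}{|\mathrm{B}|}\right). \tag{7} $$

###### Remark 3.2.

Note that if $\widehat{\mathcal{N}}$ satisfies equation (7), then in particular it approximates $\mathcal{N}$ in any Schatten norm in a sense close to that of equation (6), namely

$$ \forall\ p\in[1,\infty],\ \forall\ \rho\in\mathcal{D}(\mathrm{A}),\ \left\|\widehat{\mathcal{N}}(\rho)-\mathcal{N}(\rho)\right\|_p\leq\varepsilon\left(\left\|\mathcal{N}(\rho)\right\|_p+\frac{1}{|\mathrm{B}|^{1-1/p}}\right). $$

In particular, we have the $(1{\to}1)$-norm approximation of $\mathcal{N}$ by $\widehat{\mathcal{N}}$

$$ \forall\ \rho\in\mathcal{D}(\mathrm{A}),\ \left\|\widehat{\mathcal{N}}(\rho)-\mathcal{N}(\rho)\right\|_1\leq 2\varepsilon, $$

in which we can further impose that $\widehat{\mathcal{N}}$ is strictly, and not just up to an error $\varepsilon$, trace preserving (see the proof of Theorem 3.1).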

One important question at this point is that of optimality in Theorem 3.1. A first observation in this direction is the following: if a CP map has Kraus rank $s$, then it necessarily sends rank $1$ inputs to outputs of rank at most $s$. This is of course informative only if $s$ is smaller than the output space dimension. But as we shall see, having this in mind will be useful to prove that certain channels cannot be compressed further than as guaranteed by Theorem 3.1.

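Our constructions will be based on the existence of so-called tight normalized frames. Namely, for any $N,d\in\mathbf{N}$ with $N\geq d$, there exist unit vectors $|\psi_1\rangle,\ldots,|\psi_N\rangle$ in $\mathbf{C}^d$ such that

$$ \frac{1}{N}\sum_{k=1}^{N}|\psi_k\rangle\langle\psi_k|=\frac{\mathrm{Id}}{d}. $$

Denoting by $\{|j\rangle\}_{1\leq j\leq d}$ an orthonormal basis of $\mathbf{C}^d$, a possible way of constructing such vectors is e.g. to make the choice

$$ \forall\ 1\leq k\leq N,\ |\psi_k\rangle=\frac{1}{\sqrt{d}}\sum_{j=1}^{d}e^{2i\pi jk/N}|j\rangle. \tag{8} $$

Note that if this is so, then any basis vector $|j\rangle$, $1\leq j\leq d$, is such that, for each $1\leq k\leq N$, $|\langle\psi_k|j\rangle|^2=1/d$.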
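The tight normalized frame obtained from complex exponentials, as in equation (8), is easy to check numerically. The following sketch assumes NumPy; the values of the number of vectors and of the dimension are arbitrary, and the function name `tight_frame` is a hypothetical helper.

```python
import numpy as np

def tight_frame(N, d):
    """Rows are the vectors |psi_k>, k = 1..N, written in the basis |1>, ..., |d> of C^d."""
    ks = np.arange(1, N + 1)
    js = np.arange(1, d + 1)
    return np.exp(2j * np.pi * np.outer(ks, js) / N) / np.sqrt(d)

N, d = 7, 3
psi = tight_frame(N, d)

# (1/N) sum_k |psi_k><psi_k|, which should equal Id/d whenever N >= d.
frame_op = psi.T @ psi.conj() / N
norms = np.linalg.norm(psi, axis=1)
overlaps = np.abs(psi) ** 2          # |<psi_k|j>|^2 for every k and j

print(np.allclose(frame_op, np.eye(d) / d), np.allclose(overlaps, 1 / d))
```

The off-diagonal entries of the frame operator vanish because they are sums of all powers of a nontrivial $N$-th root of unity, which requires the two basis indices to differ by less than $N$.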

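Let us now come back to our objective. What we want to exhibit here are CPTP maps $\mathcal{N}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ with either one or the other of the following two properties: if a CP map $\widehat{\mathcal{N}}:\mathcal{L}(\mathrm{A})\to\mathcal{L}(\mathrm{B})$ satisfies

$$ \forall\ \rho\in\mathcal{D}(\mathrm{A}),\ -\varepsilon\left(\mathcal{N}(\rho)+\frac{\mathrm{Id}}{|\mathrm{B}|}\right)\leq\widehat{\mathcal{N}}(\rho)-\mathcal{N}(\rho)\leq\varepsilon\left(\mathcal{N}(\rho)+\frac{\mathrm{Id}}{|\mathrm{B}|}\right), \tag{9} $$

then it necessarily has to be such that either $r_K(\widehat{\mathcal{N}})\geq|\mathrm{A}|$ or $r_K(\widehat{\mathcal{N}})\geq|\mathrm{B}|$. Besides, note that the CP maps $\mathcal{N},\widehat{\mathcal{N}}$ fulfilling condition (9) above is equivalent to the conjugate CP maps $\mathcal{N}^*,\widehat{\mathcal{N}}^*$ fulfilling condition (10) below

$$ \forall\ \sigma\in\mathcal{D}(\mathrm{B}),\ -\varepsilon\left(\mathcal{N}^*(\sigma)+\frac{\mathrm{Id}}{|\mathrm{B}|}\right)\leq\widehat{\mathcal{N}}^*(\sigma)-\mathcal{N}^*(\sigma)\leq\varepsilon\left(\mathcal{N}^*(\sigma)+\frac{\mathrm{Id}}{|\mathrm{B}|}\right). \tag{10} $$

Depending on what we want to establish, it will be more convenient to work with either one or the other of these requirements.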

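Assume first of all that $|\mathrm{A}|\leq|\mathrm{B}|$, and consider a so-called quantum-classical channel (aka measurement). More specifically, define the CPTP map

$$ \mathcal{M}:\rho\in\mathcal{D}(\mathrm{A})\mapsto\frac{|\mathrm{A}|}{|\mathrm{B}|}\sum_{x=1}^{|\mathrm{B}|}\langle\psi_x|\rho|\psi_x\rangle\,|x\rangle\langle x|\in\mathcal{D}(\mathrm{B}), \tag{11} $$

where $\{|x\rangle\}_{1\leq x\leq|\mathrm{B}|}$ is an orthonormal basis of $\mathrm{B}$ and $|\psi_1\rangle,\ldots,|\psi_{|\mathrm{B}|}\rangle$ are unit vectors of $\mathrm{A}$, defined in terms of an orthonormal basis $\{|j\rangle\}_{1\leq j\leq|\mathrm{A}|}$ of $\mathrm{A}$ as by equation (8). Note that this tight normalized frame assumption implies that $\left\{\frac{|\mathrm{A}|}{|\mathrm{B}|}|\psi_x\rangle\langle\psi_x|\right\}_{1\leq x\leq|\mathrm{B}|}$ forms a rank-$1$ POVM on $\mathrm{A}$ (which justifies a posteriori calling $\mathcal{M}$ a measurement). Setting, for each $1\leq x\leq|\mathrm{B}|$, $K_x=\sqrt{|\mathrm{A}|/|\mathrm{B}|}\,|x\rangle\langle\psi_x|$, we can clearly re-write $\mathcal{M}(\rho)=\sum_{x=1}^{|\mathrm{B}|}K_x\rho K_x^{\dagger}$, so $r_K(\mathcal{M})\leq|\mathrm{B}|$. And what we actually want to show is that it is even impossible to approximate $\mathcal{M}$ in the sense of Theorem 3.1 with strictly less than $|\mathrm{B}|$ Kraus operators. Observe that by construction, say, $|1\rangle$ is such that, for each $1\leq x\leq|\mathrm{B}|$, $|\langle\psi_x|1\rangle|^2=1/|\mathrm{A}|$, so that $\mathcal{M}(|1\rangle\langle 1|)=\mathrm{Id}/|\mathrm{B}|$. Yet, assume that $\widehat{\mathcal{M}}$ is a CPTP map such that $\mathcal{M},\widehat{\mathcal{M}}$ fulfill equation (9) for some $0<\varepsilon<1/2$. Then, the l.h.s. of equation (9) yields in particular $\widehat{\mathcal{M}}(|1\rangle\langle 1|)\geq(1-2\varepsilon)\,\mathrm{Id}/|\mathrm{B}|$, so that $\widehat{\mathcal{M}}(|1\rangle\langle 1|)$ has to have full rank. And therefore, it cannot be that $r_K(\widehat{\mathcal{M}})<|\mathrm{B}|$.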
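The obstruction described above can be illustrated numerically: a measurement channel built from a tight normalized frame sends a suitable basis state to the maximally mixed state, whereas any CP map with fewer Kraus operators than the output dimension sends a pure state to a rank-deficient output. The sketch below assumes NumPy; the dimensions, the seed and the randomly drawn low-rank CP map are illustrative assumptions.

```python
import numpy as np

d_a, d_b = 2, 5
xs = np.arange(1, d_b + 1)
js = np.arange(1, d_a + 1)
# Row x holds the coordinates of |psi_x> in A, following the construction of equation (8).
psi = np.exp(2j * np.pi * np.outer(xs, js) / d_b) / np.sqrt(d_a)

# Kraus operators K_x = sqrt(|A|/|B|) |x><psi_x| of the measurement channel.
kraus = []
for x in range(d_b):
    K = np.zeros((d_b, d_a), dtype=complex)
    K[x, :] = np.sqrt(d_a / d_b) * psi[x].conj()
    kraus.append(K)

e1 = np.zeros((d_a, d_a))
e1[0, 0] = 1.0                                   # the pure state |1><1| on A
out = sum(K @ e1 @ K.conj().T for K in kraus)
trace_preserving = np.allclose(sum(K.conj().T @ K for K in kraus), np.eye(d_a))

# A generic CP map with only s < |B| Kraus operators (random, for illustration).
rng = np.random.default_rng(2)
s = 3
few = [rng.normal(size=(d_b, d_a)) + 1j * rng.normal(size=(d_b, d_a)) for _ in range(s)]
out_few = sum(K @ e1 @ K.conj().T for K in few)
rank_few = int(np.linalg.matrix_rank(out_few))
min_eig_few = float(np.linalg.eigvalsh(out_few).min())

print(np.allclose(out, np.eye(d_b) / d_b), rank_few, min_eig_few)
```

Since the output of the low-rank map has a zero eigenvalue, it cannot dominate a positive multiple of the identity, which is exactly what the lower bound in the approximation condition would require.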

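Assume now that $|\mathrm{B}|\leq|\mathrm{A}|$, and consider a so-called classical-quantum channel. More specifically, define the CPTP map

$$ \mathcal{N}:\rho\in\mathcal{D}(\mathrm{A})\mapsto\sum_{x=1}^{|\mathrm{A}|}\langle x|\rho|x\rangle\,|\psi_x\rangle\langle\psi_x|\in\mathcal{D}(\mathrm{B}), \tag{12} $$

where $\{|x\rangle\}_{1\leq x\leq|\mathrm{A}|}$ is an orthonormal basis of $\mathrm{A}$ and $|\psi_1\rangle,\ldots,|\psi_{|\mathrm{A}|}\rangle$ are unit vectors in $\mathrm{B}$. Setting, for each $1\leq x\leq|\mathrm{A}|$, $K_x=|\psi_x\rangle\langle x|$, we can clearly re-write $\mathcal{N}(\rho)=\sum_{x=1}^{|\mathrm{A}|}K_x\rho K_x^{\dagger}$, so $r_K(\mathcal{N})\leq|\mathrm{A}|$. Now, we want to show that, at least for certain choices of $|\psi_1\rangle,\ldots,|\psi_{|\mathrm{A}|}\rangle$, it is even impossible to approximate $\mathcal{N}$ in the sense of Theorem 3.1 with strictly less than $|\mathrm{A}|$ Kraus operators. For that, we impose that they are defined in terms of an orthonormal basis of $\mathrm{B}$ as by equation (8). Since the conjugate of $\mathcal{N}$ is the CP unital map

$$ \mathcal{N}^*: X\in\mathcal{L}(\mathrm{B})\mapsto\sum_{x=1}^{|\mathrm{A}|}\langle\psi_x|X|\psi_x\rangle\,|x\rangle\langle x|\in\mathcal{L}(\mathrm{A}), $$

we have in this case that $\frac{|\mathrm{B}|}{|\mathrm{A}|}\mathcal{N}^*$ is precisely of the form (11) (with the roles of $\mathrm{A}$ and $\mathrm{B}$ switched). Hence, as we already showed, if $\widehat{\mathcal{M}}$ is a CPTP map such that $\frac{|\mathrm{B}|}{|\mathrm{A}|}\mathcal{N}^*,\widehat{\mathcal{M}}$ fulfil equation (9) (with the roles of $\mathrm{A}$ and $\mathrm{B}$ switched) for some $0<\varepsilon<1/2$, then it cannot be that $r_K(\widehat{\mathcal{M}})<|\mathrm{A}|$. This means equivalently that if $\widehat{\mathcal{N}}$ is a CP map such that $\mathcal{N},\widehat{\mathcal{N}}$ fulfil equation (10) for some $0<\varepsilon<1/2$, then it cannot be that $r_K(\widehat{\mathcal{N}})<|\mathrm{A}|$.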

Summarizing, we just established that a Kraus rank of order $\max(|\mathrm{A}|,|\mathrm{B}|)$ is for sure necessary in Theorem 3.1. But it is not clear whether or not the factor $\log(|\mathrm{E}|/\varepsilon)$ can be removed. In the case of “well-behaved” channels, whose range is only composed of sufficiently mixed states, we can answer affirmatively, which is the content of Theorem 3.3 below. However, we leave the question open in general.

###### Theorem 3.3.

Fix and let be a CPTP map with Kraus rank . Then, there exists a CP map with Kraus rank at most (where is a universal constant) and such that

Before moving on to the full proofs of Theorems 3.1 and 3.3, let us briefly explain the main ideas in them. These two existence results for maps having some desired properties actually stem from proving that suitably constructed random ones have them with high probability. One thus has to show that, for the random map $\widehat{\mathcal{N}}$, the probability is high that, for every input state $\rho$, $\widehat{\mathcal{N}}(\rho)$ is close to $\mathcal{N}(\rho)$. This is achieved in two steps: establishing first that this holds for a given input state, and second that it in fact holds for all of them simultaneously. The fact that the individual probability of deviating from average is small is a consequence of the concentration of measure phenomenon in high dimensions. Deriving from there that the global deviation probability is also small is done by discretizing the input set and using a union bound. This line of proof is extremely standard in asymptotic geometric analysis (this is for instance how Dvoretzky’s theorem is obtained from Lévy’s lemma). In our case though, the first step requires a careful analysis of the sub-exponential behavior of a certain random variable.
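To give a feeling for the first step of this strategy, here is a hedged numerical sketch, assuming NumPy. It takes as an assumption that the random maps are obtained from a Stinespring isometry by projecting the environment onto independent uniformly distributed unit vectors and rescaling by the environment dimension, so that each sampled map has Kraus rank one and the average over the random vector is the original channel. The dimensions, the seed and the sample size `n` are arbitrary, and in such tiny dimensions `n` is of course not a compression; the point is only to observe the concentration of the empirical average on one fixed input state.

```python
import numpy as np

rng = np.random.default_rng(3)
d_a, d_b, d_e, n = 2, 2, 4, 4000

def complex_gaussian(*shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

# Random Stinespring isometry V : A -> B (x) E, stored with indices (b, e, a).
V, _ = np.linalg.qr(complex_gaussian(d_b * d_e, d_a))
V3 = V.reshape(d_b, d_e, d_a)

def channel(rho):
    # N(rho) = Tr_E(V rho V^dagger)
    return np.einsum("bea,ac,dec->bd", V3, rho, V3.conj())

def sampled_kraus():
    # One Kraus operator sqrt(|E|) (Id (x) <phi|) V, for phi a uniform unit vector in E.
    phi = complex_gaussian(d_e)
    phi /= np.linalg.norm(phi)
    return np.sqrt(d_e) * np.einsum("e,bea->ba", phi.conj(), V3)

kraus_hat = [sampled_kraus() for _ in range(n)]

def channel_hat(rho):
    # Empirical average of n CP maps of Kraus rank one, hence Kraus rank at most n.
    return sum(K @ rho @ K.conj().T for K in kraus_hat) / n

x = complex_gaussian(d_a)
x /= np.linalg.norm(x)
rho = np.outer(x, x.conj())                     # a fixed pure input state

diff = channel_hat(rho) - channel(rho)
trace_norm_error = float(np.abs(np.linalg.eigvalsh(diff)).sum())
print(trace_norm_error)
```

This only probes a single input state. Passing to all input states simultaneously is precisely what the net argument and the union bound are for.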

## 4. Proofs of the main results

As a crucial step in establishing Theorems 3.1 and 3.3, we will need a large deviation inequality for sums of independent $\psi_1$ (aka sub-exponential) random variables. Recall that the $\psi_1$-norm of a random variable $X$ (which quantifies the exponential decay of the tail) may be defined via the growth of moments

$$ \|X\|_{\psi_1}=\sup_{p\in\mathbf{N}}\frac{\left(\mathbf{E}|X|^p\right)^{1/p}}{p}. $$

This definition is more practical than the standard definition through the Orlicz function $x\mapsto e^x-1$, and leads to an equivalent norm (see [5], Corollary 1.1.6). The large deviation inequality for a sum of independent $\psi_1$ random variables is known as Bernstein’s inequality and is quoted below.

###### Theorem 4.1 (Bernstein’s inequality, see e.g. [5], Theorem 1.2.5.).

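Let $X_1,\ldots,X_n$ be $n$ independent centered $\psi_1$ random variables. Setting $M=\max_{1\leq i\leq n}\|X_i\|_{\psi_1}$ and $\sigma^2=\frac{1}{n}\sum_{i=1}^{n}\|X_i\|_{\psi_1}^2$, we have

$$ \forall\ t>0,\ \mathbf{P}\left(\left|\frac{1}{n}\sum_{i=1}^{n}X_i\right|>t\right)\leq 2\exp\left(-c\,n\min\left(\frac{t^2}{\sigma^2},\frac{t}{M}\right)\right), $$

where $c>0$ is a universal constant.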

Our application of Bernstein’s inequality to a suitably chosen sum of independent $\psi_1$ random variables will yield Proposition 4.2 below. Note that in the latter, as well as in several other places in the remainder of the paper, we shall use the following shorthand notation, whenever there is no risk of confusion: given a unit vector $\varphi$ in $\mathrm{H}$, we also denote by $\varphi$ the corresponding pure state $|\varphi\rangle\langle\varphi|$ on $\mathrm{H}$.

###### Proposition 4.2.

Let be a CPTP map with Kraus rank , defined by

(13) |

for some isometry .

For any given unit vector in define next the CP map by

(14) |

Now, fix unit vectors in , in , and pick random unit vectors in , independently and uniformly. Then,

where is a universal constant.

In order to derive this concentration result, we will need first of all an estimate on the -norm of a certain random variable appearing in our construction. This is the content of Lemma 4.3 below.

###### Lemma 4.3.

Fix . Let be a state on and be a unit vector in . Next, for a uniformly distributed unit vector in define the random variable

Then, is a random variable with mean and -norm satisfying

(15) |

###### Proof.

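To begin with, recall that, for any $p\in\mathbf{N}$, we have, for $\varphi$ a uniformly distributed unit vector in $\mathrm{E}$,

$$ \mathbf{E}\,\varphi^{\otimes p}=\binom{|\mathrm{E}|+p-1}{p}^{-1}P_{\mathrm{Sym}^p(\mathrm{E})}, $$

where $P_{\mathrm{Sym}^p(\mathrm{E})}$ denotes the orthogonal projector onto the completely symmetric subspace of $\mathrm{E}^{\otimes p}$.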

Now, setting , positive sub-normalized operator on , we see that . Hence, we clearly have first of all the first statement in equation (15), namely

What is more, for any , . And therefore,

where the last inequality is simply by the rough bounds and .

###### Proof of Proposition 4.2.

Note first of all that we can obviously re-write

Next, for each , define the random variable . By Lemma 4.3, combined with the observation just made above, we know that these are independent random variables with mean and -norm upper bounded by . So by Bernstein’s inequality, recalled as Theorem 4.1, we get that

where is a universal constant. And hence,

which is precisely the result announced in Proposition 4.2. ∎

Having at hand the “fixed ” concentration inequality of Proposition 4.2, we can now get its “for all ” counterparts by a standard net-argument. This appears as the following Propositions 4.4 and 4.5.

###### Proposition 4.4.

###### Proof.

Fix and consider minimal -nets within the unit spheres of , so that by a standard volumetric argument (see e.g. [10], Chapter 4). Then, by Proposition 4.2 and the union bound, we get that, for any ,

(16) |

Now, fix and suppose that is a Hermiticity-preserving map which is such that

(17) |

Assume that additionally satisfies the boundedness property

(18) |

Note that if is Hermiticity-preserving, then for any , and (this is because for any , ). Hence, it will be useful to us later on to keep in mind that assumption (18) is actually equivalent to

Then, for any unit vectors , , we know by definition that there exist , such that , . Hence, first of all

where the second inequality follows from the boundedness property (18) of , combined with the fact that . Then similarly, because ,

Putting together the two previous upper bounds, we see that we actually have

where the second inequality is by assumption (17) on . Now, arguing just as before (using this time that satisfies the boundedness property for any and ), we get

So eventually, what we obtain is

Therefore, choosing (and observing that, by the way is constructed, fulfills condition (18)), it follows from equation (16) that, for any ,

which is exactly what we wanted to show. ∎

###### Proposition 4.5.

###### Proof.

We will argue in a way very similar to what was done in the proof of Proposition 4.4, and hence skip some of the details here. Again, fix and consider minimal -nets within the unit spheres of . Now, fix and suppose that is a Hermiticity-preserving map which is such that,

Then, for any unit vectors , ,

where , are such that , . And consequently, taking supremum over unit vectors , , we get

that is equivalently,

Therefore, choosing