# Asymptotic State Discrimination and a Strict Hierarchy in Distinguishability Norms

Eric Chitambar, Min-Hsiu Hsieh
Department of Physics and Astronomy, Southern Illinois University,
Carbondale, Illinois 62901, USA
Centre for Quantum Computation & Intelligent Systems (QCIS),
Faculty of Engineering and Information Technology (FEIT),
University of Technology Sydney (UTS), NSW 2007, Australia
###### Abstract

In this paper, we consider the problem of discriminating quantum states by local operations and classical communication (LOCC) when an arbitrarily small amount of error is permitted. This paradigm is known as asymptotic state discrimination, and we derive necessary conditions for when two multipartite states of any size can be discriminated perfectly by asymptotic LOCC. We use this new criterion to prove a gap in the LOCC and separable distinguishability norms. We then turn to the operational advantage of using two-way classical communication over one-way communication in LOCC processing. With a simple two-qubit product state ensemble, we demonstrate a strict majorization of the two-way LOCC norm over the one-way norm.

## 1 Introduction

Any realistic scheme for processing information will inevitably encounter some experimental error. Consider, for instance, the task of quantum state identification. In the binary one-shot problem, a quantum system is prepared in some state with probability and another state with probability . A measurement is performed on the system, and based on this result, a guess is made on the state’s identity. The goal is to choose a measurement strategy that optimizes the probability of correctly identifying the state. In practice, each experimental setup implementing this process will have unavoidable imperfections that generate a nonzero probability of error. Hence for state discrimination, the experimental optimum refers to the success probability that can be approached arbitrarily close as the experimental errors are made smaller and smaller. This paradigm is known as asymptotic state discrimination since it involves a sequence of different measurement strategies with respective success probabilities that approach optimality in the limit.

Asymptotic state discrimination is usually not considered on its own since for measurements performed across some quantum system , there is no effective difference between asymptotic and non-asymptotic processes. To decide whether or not some success probability is asymptotically achievable, one need only consider whether it is theoretically possible to obtain exactly. This statement just reflects the mathematical fact that for some fixed number of outcomes, the set of positive operator-valued measures (POVMs) on is compact, and so the limit of any convergent sequence of quantum measurements is itself a quantum measurement.

The situation is quite different when the quantum system consists of different parts, , and the subsystems are split between spatially separated parties. In the state discrimination problem, the parties attempt to identify some globally shared state. Being confined to their own respective laboratories, they can only investigate the state’s identity by performing local measurements and globally communicating the measurement outcomes, a process known as local operations and classical communication (LOCC). Recently, much attention has been given to the mathematical properties of LOCC operations [KKB11, CLMO12, CLM12, CLMO13], and unlike the full set of global measurements on , it turns out that the set of LOCC measurements is not compact [CCL12, CLM12]. As a result, even if quantum theory prohibits LOCC from identifying two states with exactly some success probability , it still may be possible to attain in the asymptotic sense. Consequently, to truly characterize the limitations of realistic LOCC state discrimination, one must look beyond the question of exact discrimination and consider the asymptotic problem on its own.

To be a bit more formal, we say that a success probability can be obtained by asymptotic LOCC if for every , there exists an LOCC protocol that correctly identifies the given state with probability at least . The POVM actually obtaining success probability belongs to , which is the set-closure of all LOCC POVMs [CLM12]. To study , it is often helpful to consider the class of separable operations (SEP). A POVM is -partite separable if each is separable, i.e. it can be expressed as a positive combination of product projectors onto , i.e. with . Any task achievable by can also be achieved by SEP; however, the converse is not true [BDF99]. Thus, to fully understand the operational power of , more fine-grained tools are needed beyond just a relaxation to separable operations.

The class of asymptotic LOCC is also vital to the subject of distinguishability norms [MWW09, RKW11, LW13, AL13]. Let be the set of hermitian operators acting on . Each can be uniquely decomposed as where and are orthogonal non-negative operators, and the trace norm of is given by . Denote a generic two-outcome ordered POVM by , and let be any collection of such POVMs. One can define the real function on by

$$\|M\|_{\Omega} := \sup_{\Pi\in\Omega}\left[\operatorname{tr}(\Pi_R R)+\operatorname{tr}(\Pi_S S)\right].$$

A clear operational interpretation can be given to this function as follows. For any ensemble of states , one defines the operator . Then gives the infimum error probability of distinguishing from among all POVMs belonging to [MWW09]. For most operationally interesting classes of POVMs, it can be shown that is actually a norm on . The well-known Helstrom bound is recovered when is the full set of POVMs acting globally on ; i.e. [Hol73, Hel76]. On the other hand, by choosing to be the set of all LOCC POVMs, one obtains the so-called LOCC norm. Here, however, non-compactness of LOCC means the error probability is obtained by some POVM in and not necessarily LOCC itself.
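For the global case, the Helstrom bound can be evaluated by hand for qubits. The following is a minimal sketch, assuming the standard expression $\frac{1}{2}\left(1-\|p\rho-(1-p)\sigma\|_1\right)$ for the minimum error; the function names and the example states $|0\rangle$ and $|+\rangle$ are illustrative choices, not taken from the references.

```python
import math

def trace_norm_2x2(a, b, d):
    """Trace norm of the Hermitian matrix [[a, b], [conj(b), d]]."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    # The eigenvalues are mean + radius and mean - radius.
    return abs(mean + radius) + abs(mean - radius)

def helstrom_error(p, rho, sigma):
    """Minimum error for rho (prior p) versus sigma (prior 1 - p).

    Each state is passed as (a, b, d), the entries of [[a, b], [conj(b), d]].
    """
    q = 1 - p
    m = tuple(p * r - q * s for r, s in zip(rho, sigma))
    return 0.5 * (1 - trace_norm_2x2(*m))

zero = (1.0, 0.0, 0.0)  # |0><0|
one = (0.0, 0.0, 1.0)   # |1><1|
plus = (0.5, 0.5, 0.5)  # |+><+|

print(helstrom_error(0.5, zero, plus))  # about 0.1464, i.e. (1 - 1/sqrt(2)) / 2
print(helstrom_error(0.5, zero, one))   # orthogonal states: error 0
```

Restricting the supremum to a smaller class of POVMs can only decrease the norm, so the corresponding error can only increase relative to this global value.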

Distinguishability norms have an important connection to the phenomenon of data hiding [TDL01, DLT02, MWW09, LW13, AL13]. For example, in any system, pairs of orthogonal states and exist for which , and yet . This implies the ability to perfectly encode a single bit in the ensemble so that only an arbitrarily small amount of the information can be recovered using LOCC. Distinguishability norms have also emerged as useful concepts in other contexts such as proving faithfulness of the squashed entanglement [BCY11] and analyzing the quantum complexity in deciding separability [MGHW13].

With distinguishability norms, we have a mathematically precise way to compare different classes of POVMs. We say that a class is more powerful than class if for all , and there exists some for which . In this paper, the classes of operations we will focus on are GLOBAL, SEP, LOCC, and 1-LOCC, where the latter refers to LOCC operations under the restriction of one-way classical communication. In terms of distinguishability norms, these satisfy the ordering

$$\|M\|_{1}\geq\|M\|_{\mathrm{SEP}}\geq\|M\|_{\mathrm{LOCC}}\geq\|M\|_{1\text{-}\mathrm{LOCC}}. \tag{1}$$

The data hiding literature offers nice examples of when the first inequality is strict [DLT02, MW09]. At present, it is well known that certain tasks can be accomplished by separable operations but not by LOCC [BDF99, DFXY09, CCL12], and likewise, two-way LOCC offers greater possibilities than one-way LOCC [BDSW96, Coh07, KTYI07, OH08, XD08, CH13, Nat13]. However, to our knowledge neither the second nor the third inequality in Eq. (1) has been shown to be strict. Below we will construct explicit operators for which a separation between the distinguishability norms can be observed.

In general, it is not well-understood when bi-directional communication offers an operational advantage over uni-directional communication in LOCC information processing. For instance, to convert one bipartite pure entangled state to another using LOCC, one-way classical communication suffices [LP01]. Another example even more relevant to the problem at hand involves the minimum error discrimination of any two bipartite pure states. For pure states and of any dimension and any number of parties, the global optimal minimum-error discrimination can be achieved by one-way LOCC [WSHV00, VSPM01]. Thus, with respect to distinguishability norms, we have that

$$\|M\|_{1}=\|M\|_{\mathrm{LOCC}}=\|M\|_{1\text{-}\mathrm{LOCC}} \tag{2}$$

whenever is rank two. Given what we have learned in over 20 years of research into LOCC state discrimination, it is not surprising that the first equality fails to hold in general when has greater rank (one conclusive proof for the rank three case is given in [CDH13]). Below we will show that likewise the second equality fails to hold in general when is rank three. This will be done by considering one pure state and one rank-two mixed state such that

$$\|\psi-\rho\|_{\mathrm{LOCC}}>\|\psi-\rho\|_{1\text{-}\mathrm{LOCC}}. \tag{3}$$

## 2 Conditions for Asymptotic Discrimination

We begin by deriving a general necessary criterion for when two orthogonal states can be perfectly distinguished by . Note that no generality is lost by restricting attention to perfect discrimination of orthogonal states since if and are non-orthogonal, optimal discrimination in the minimum error sense is equivalent to asymptotic perfect discrimination of the two orthogonal (unnormalized) states and , where [VSPM01, CH13].

Our approach here is largely inspired by the work of Kleinmann et al. [KKB11], and we utilize some of the tools introduced in that paper. Let be an ensemble of orthogonal states with any choice of prior probabilities satisfying . We are free to choose since the prior probabilities are irrelevant for perfect discrimination. [Footnote 1: Suppose and can be perfectly distinguished by asymptotic LOCC with nonzero a priori probabilities , and let be any other a priori probabilities. For any there exists an LOCC POVM such that . From this we deduce and . Hence, setting gives .] Let be any finite-round, finite-outcome LOCC protocol that errs in distinguishing and with some probability . As usual, we can envision protocol as a tree where the root node represents the first local measurement performed. The different outcomes establish different branches which successively split into more branches after additional rounds of measurements. At any point in the protocol, we will refer to the bias as the updated state probabilities given all the previous measurement outcomes. We thus represent the bias by a two-component vector where and indicates some particular sequence of outcomes. Let be the POVM element corresponding to outcome sequence , and note that is a product operator. Finally, the node in the protocol tree associated with will be denoted by , and the probability of reaching this node is .

Suppose that are the various subsequent nodes obtained by performing a local POVM at node . Then we say that is a bias-flipped node if and with . We say a node is a guessing node if no more measurements are performed after reaching that node; at which point, the parties make a guess as to the state’s identity. If is a guessing node, then the total error associated with the node is

$$P_{\mathrm{err}}(n_\lambda)=p_\lambda\cdot\min\{p_{\sigma|\lambda},\,p_{\rho|\lambda}\}.$$

If is a non-guessing node, the minimum guessing error attainable from node onward is bounded by the global optimal probability . At node , the a posteriori ensemble is . The trace norm can be related to the Hilbert-Schmidt inner product according to [NC00], and we thus obtain the following bound for non-guessing nodes:

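$$P_{\mathrm{err}}(n_\lambda)\geq\frac{1}{2}\left[p_\lambda-\sqrt{p_\lambda^{2}-4p_\rho p_\sigma\operatorname{tr}(\Pi_\lambda\rho\,\Pi_\lambda\sigma)}\right]. \tag{4}$$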
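To make the two node-error expressions concrete, the sketch below evaluates the guessing-node error and the lower bound of Eq. (4) for scalar inputs. It assumes that the probability of reaching the node is $p_\lambda=p_\rho\operatorname{tr}(\Pi_\lambda\rho)+p_\sigma\operatorname{tr}(\Pi_\lambda\sigma)$; the function and argument names are hypothetical.

```python
import math

def guessing_error(w_rho, w_sigma):
    """Error at a guessing node, p_lambda * min of the two posteriors.

    w_rho   = p_rho   * tr(Pi rho)   (joint weight of rho at the node)
    w_sigma = p_sigma * tr(Pi sigma) (joint weight of sigma at the node)
    """
    return min(w_rho, w_sigma)

def eq4_lower_bound(w_rho, w_sigma, overlap):
    """Right-hand side of Eq. (4).

    overlap = p_rho * p_sigma * tr(Pi rho Pi sigma); the node probability
    p_lambda = w_rho + w_sigma is an assumption stated in the lead-in.
    """
    p_lam = w_rho + w_sigma
    return 0.5 * (p_lam - math.sqrt(p_lam ** 2 - 4 * overlap))

w_rho, w_sigma = 0.3, 0.1
print(guessing_error(w_rho, w_sigma))                    # 0.1
print(eq4_lower_bound(w_rho, w_sigma, 0.0))              # 0.0 when the overlap vanishes
print(eq4_lower_bound(w_rho, w_sigma, w_rho * w_sigma))  # about 0.1
```

Since $\operatorname{tr}(\Pi_\lambda\rho\,\Pi_\lambda\sigma)\leq\operatorname{tr}(\Pi_\lambda\rho)\operatorname{tr}(\Pi_\lambda\sigma)$, the bound never exceeds the guessing error, and it vanishes exactly when the overlap term does.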

With the above terminology fixed, our first task is to modify so that the updated a posteriori probabilities at each bias-flipped node are exactly equal. There are various ways to perform this modification such as decomposing each strong measurement into a sequence of weak measurements so that the bias evolves by arbitrarily small increments throughout the protocol [BDF99, OB05]. Alternatively, Kleinmann et al. describe a “pseudo-weak” approach that simply breaks any measurement into two steps without affecting the overall success probability of the protocol [KKB11]. Either way, the following construction can be achieved.

###### Proposition 1.

Suppose that is some LOCC protocol that distinguishes and with error probability . Then there exists another protocol that also distinguishes with error probability but with all bias-flipped nodes having a bias .

Consider such a protocol . Starting from the root node where , we track the bias throughout the protocol. Every branch will either reach a guessing node without the bias ever flipping (say these nodes belong to set ), or the branch will reach a node with bias (say these nodes belong to set ). Since for all , we have that

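$$P_{\mathrm{err}}(n_\lambda)=p_\lambda\,p_{\rho|\lambda}=p_\rho\,p_{\lambda|\rho}=p_\rho\operatorname{tr}(\Pi_\lambda\rho)\qquad\forall\,n_\lambda\in B_1.$$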

Summing over these nodes gives the bound

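$$\epsilon\geq\sum_{\lambda:\,n_\lambda\in B_1}P_{\mathrm{err}}(n_\lambda)=\sum_{\lambda:\,n_\lambda\in B_1}p_\rho\operatorname{tr}(\Pi_\lambda\rho)=p_\rho\operatorname{tr}(\Pi^{(\epsilon)}\rho),$$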

where . For the nodes in , we use Eq. (4) to obtain the bound

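$$\epsilon\geq\sum_{\lambda:\,n_\lambda\in B_2}\frac{1}{2}\left[p_\lambda-\sqrt{p_\lambda^{2}-4p_\rho p_\sigma\operatorname{tr}(\Pi_\lambda\rho\,\Pi_\lambda\sigma)}\right].$$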

Thus, in order for an LOCC protocol to distinguish and with error probability , there must exist a POVM with a separable operator and each a product operator, such that for all :

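$$p_\rho\operatorname{tr}(\Pi^{(\epsilon)}\rho)\leq\epsilon, \tag{5}$$
$$\sum_{\lambda}\frac{1}{2}\left[p_\lambda-\sqrt{p_\lambda^{2}-4p_\rho p_\sigma\operatorname{tr}(\Pi_\lambda\rho\,\Pi_\lambda\sigma)}\right]\leq\epsilon, \tag{6}$$
$$\operatorname{tr}\left[\Pi_\lambda(p_\rho\rho-p_\sigma\sigma)\right]=0, \tag{7}$$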

where . Note that the error bounds contained in Eqns. (5)–(7) apply to an actual LOCC-implementable POVM.

Unfortunately, the number of satisfying Eqns. (5)–(7) may be unbounded, and to deal with this, we slightly relax the LOCC-implementable condition. This is done via Carathéodory’s Theorem, which allows us to bound the number of .

###### Lemma 1 (Carathéodory’s Theorem [Roc96]).

Let be a subset of and its convex hull. Then any can be expressed as a convex combination of at most elements of .

For any POVM satisfying Eqns. (5)–(7), define so that belongs to , where . The set of hermitian operators acting on represents a -dimensional real vector space (every hermitian matrix is specified by real numbers). Thus by Carathéodory’s Theorem, there exists a set of non-negative numbers with such that and .

Let denote the -fold Cartesian product of hermitian operators acting on . We construct a compact subset as follows. A collection belongs to if: (i) is non-negative separable and are non-negative product operators satisfying Eqns. (5)–(7) for some , (ii) the are non-negative satisfying , and (iii) . Under these conditions, is a closed, bounded, and therefore compact set.

By assumption of asymptotic discrimination, we must be able to find a collection in for every . Compactness then assures the existence of some satisfying Eqns. (5)–(7) for , with . But with , the elements themselves satisfy Eqns. (5)–(7). Note that the sum in (6) vanishes iff for each . Hence we obtain our main result:

###### Theorem 1.

If -partite states and can be perfectly distinguished by asymptotic LOCC, then for each there must exist a POVM such that is a separable operator, each is a product operator, and

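$$\operatorname{tr}(\Pi_0\rho)=0, \tag{8}$$
$$\operatorname{tr}(\Pi_\lambda\rho\,\Pi_\lambda\sigma)=0,\qquad\forall\,1\leq\lambda\leq D, \tag{9}$$
$$\operatorname{tr}\left[\Pi_\lambda\big((1-x)\rho-x\sigma\big)\right]=0,\qquad\forall\,1\leq\lambda\leq D, \tag{10}$$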

where and is the dimension of system .

In the above proof, we began by assuming that , and this corresponds to the choice . The boundary point can be trivially satisfied by the identity. For , note that distinguishability by implies distinguishability by SEP. Hence, there must exist two separable operators and such that , , and . Setting and decomposing into a convex sum of product operators provides the necessary ingredients to satisfy Theorem 1 for .

The case of allows us to immediately draw conclusions about states and whose supports cover the full state space. Namely, when , from the discussion of the previous paragraph it follows that distinguishability by requires and to each possess a product state basis. We can apply this observation to the LOCC distinguishability norm of certain hermitian matrices having full rank. Recall that for some set of POVMs iff the orthogonal ensemble can be perfectly distinguished by some POVM in , where . We refer to the positive (resp. negative) eigenspace of an operator as the subspace spanned by eigenvectors whose corresponding eigenvalues are positive (resp. negative).

###### Lemma 2.

Let be a full rank hermitian operator possessing either a positive or negative eigenspace of dimension two. Then iff an orthonormal product basis exists for both the positive and negative eigenspaces.

###### Proof.

Let be the orthogonal decomposition of where and . Assume that and can be perfectly distinguished by with the POVM . From the above observation, we have that both and must contain a product state basis. If is a tensor product space (i.e. of the form ), then obviously has an orthonormal product basis as well as . Suppose then that is not a tensor product space. Then since any such two-dimensional subspace can possess at most two product states, the product state basis of will be unique. Now suppose that the product states in are not orthogonal and are given by and with . Let denote the projection onto the subspace spanned by . Note that and likewise since and . This implies that the POVM perfectly distinguishes and . The POVM belongs to SEP, and the condition for distinguishing the rank-two elements and by SEP is that contains an orthogonal product basis [CDH13]. This is a contradiction.

On the other hand, suppose that (and therefore also ) contains an orthogonal product basis of the form and (without loss of generality). Then Alice measures in the computational basis. If she obtains outcome , Bob projects into any orthogonal basis containing , and they choose state iff he obtains . If she obtains outcome , Bob projects into any orthogonal basis containing and they choose iff he obtains . ∎

## 3 Separating SEP and LOCC Norms

We will next apply Theorem 1 in a straightforward manner to show a gap between the SEP and LOCC norms. To achieve this, we will return to the original paper that demonstrated the subtle difference between separability and locality in terms of distinguishability [BDF99]. The authors presented nine orthogonal product states spanning the full state space:

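$$\begin{aligned}
|\psi_0\rangle &= |1\rangle\otimes|1\rangle, \\
|\psi_{1\pm}\rangle &= |0\rangle\otimes|0\pm1\rangle, \\
|\psi_{2\pm}\rangle &= |0\pm1\rangle\otimes|2\rangle, \\
|\psi_{3\pm}\rangle &= |1\pm2\rangle\otimes|0\rangle, \\
|\psi_{4\pm}\rangle &= |2\rangle\otimes|1\pm2\rangle.
\end{aligned} \tag{11}$$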

It was shown that as a nine-state ensemble, is unable to perfectly identify a given state while SEP, in contrast, can achieve the task. Here, we wish to prove the stronger result that the following mixtures cannot be perfectly distinguished by :

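$$\sigma=\frac{1}{4}\sum_{i=1}^{4}|\psi_{i+}\rangle\langle\psi_{i+}|,\qquad \rho=\frac{1}{5}\left(|\psi_0\rangle\langle\psi_0|+\sum_{i=1}^{4}|\psi_{i-}\rangle\langle\psi_{i-}|\right). \tag{12}$$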
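The orthogonality structure behind Eqs. (11) and (12) is easy to confirm numerically. The sketch below is plain Python with hypothetical helper names, and it assumes that $|i\pm j\rangle$ denotes the normalized state $(|i\rangle\pm|j\rangle)/\sqrt{2}$. It builds the nine product states, checks that they are orthonormal, and evaluates $\operatorname{tr}(\rho\sigma)$.

```python
import math
from itertools import combinations

def ket(i):
    """Computational basis vector |i> of a qutrit."""
    return [1.0 if j == i else 0.0 for j in range(3)]

def combo(i, j, sign):
    """Normalized superposition (|i> + sign * |j>) / sqrt(2)."""
    return [(a + sign * b) / math.sqrt(2) for a, b in zip(ket(i), ket(j))]

def tensor(u, v):
    """Tensor product of two real vectors, flattened to a list."""
    return [a * b for a in u for b in v]

def inner(u, v):
    """Real inner product."""
    return sum(a * b for a, b in zip(u, v))

def nine_states():
    """The nine product states of Eq. (11), keyed by their labels."""
    states = {"0": tensor(ket(1), ket(1))}
    for sign, tag in ((+1, "+"), (-1, "-")):
        states["1" + tag] = tensor(ket(0), combo(0, 1, sign))
        states["2" + tag] = tensor(combo(0, 1, sign), ket(2))
        states["3" + tag] = tensor(combo(1, 2, sign), ket(0))
        states["4" + tag] = tensor(ket(2), combo(1, 2, sign))
    return states

states = nine_states()
max_overlap = max(abs(inner(states[a], states[b]))
                  for a, b in combinations(states, 2))

# tr(rho sigma) is a weighted sum of squared overlaps between the two supports.
sigma_support = [states[f"{i}+"] for i in range(1, 5)]
rho_support = [states["0"]] + [states[f"{i}-"] for i in range(1, 5)]
tr_rho_sigma = sum(inner(u, v) ** 2
                   for u in sigma_support for v in rho_support) / (4 * 5)

print(len(states), max_overlap, tr_rho_sigma)
```

Nine orthonormal vectors in a nine-dimensional space form a basis, so under these assumptions the supports of $\rho$ and $\sigma$ are orthogonal and together span the full two-qutrit space.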

Let us show that these states cannot be discriminated by . Eq. (9) necessitates that each acts invariantly on . The key observation is that the are the unique product states lying in . To see this, note that any product state in can be represented as a matrix whose minors vanish; this requires all but one to be zero. As a result of this property, up to overall non-negative scalars, must map the set of four states

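$$\begin{aligned}
|\psi_{1+}\rangle &= |0\rangle\otimes|0+1\rangle, \\
|\psi_{2+}\rangle &= |0+1\rangle\otimes|2\rangle, \\
|\psi_{3+}\rangle &= |1+2\rangle\otimes|0\rangle, \\
|\psi_{4+}\rangle &= |2\rangle\otimes|1+2\rangle
\end{aligned}$$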

onto itself. Now the action is not possible for any when and . Applying this fact to both and implies that where .

Suppose now that there are two states and for which both . We can always find another state with such that both and . But since the eigenspaces of (resp. ) are orthogonal, it follows that ; for if (resp. ) were to be in the kernel of (resp. ), then a non-orthogonal state would also lie in the support of (resp. ). Now with and because the local parts of any three are linearly independent, and are necessarily full rank. In this case, both and will have eigenstates , which, by a simple calculation, can be seen as possible only if both and are proportional to the identity.

We have just shown that if is not proportional to the identity, then must eliminate at least three of the . However, by taking in Eq. (10), two conditions are ensured: is not proportional to the identity (since ), and is nonzero for at least one value of (since ). Hence, for in this interval, must eliminate three and only three of the . By again using the fact that Alice and Bob’s parts are linearly independent for any three of the , this is possible only if has rank two and has the form

$$A_\lambda\otimes B_\lambda=c_{i+}|\psi_{i+}\rangle\langle\psi_{i+}|+c_{i-}|\psi_{i-}\rangle\langle\psi_{i-}| \tag{13}$$

with . The fact that again follows from Eq. (10). In summary, each has support on a two-dimensional space spanned by for some . But then discrimination of and is impossible by since Eq. (8) together with the fact that implies that . But will not be contained in .

We have thus demonstrated a gap between the distinguishability norms:

$$\|\rho-\sigma\|_{\mathrm{SEP}}>\|\rho-\sigma\|_{\mathrm{LOCC}}. \tag{14}$$

This relatively simple argument can be applied to more ensembles with the same type of structure (see Ref. [DMS03] for such ensembles).

The example constructed in this section demonstrates the difference between Theorem 1 and the distinguishability criterion of Ref. [KKB11]. The criterion given in Proposition 1 of Ref. [KKB11] will not show the impossibility of discriminating and by ; indeed, a product operator of the form given in Eq. (13) will satisfy that distinguishability criterion. The essential component of Theorem 1 used to eliminate the possibility of discrimination is that the collective supports of must cover the full support of .

## 4 Separating One-Way and Two-Way LOCC Norms

By the “two-way” LOCC norm, we are referring to the general LOCC norm. To show a gap between and , the example ensemble we consider consists of the equiprobable two-qubit states

$$\psi=|00\rangle\langle 00|,\qquad \rho=\frac{1}{2}\big(|{+}{+}\rangle\langle{+}{+}|+|{-}{-}\rangle\langle{-}{-}|\big), \tag{15}$$

where . Distinguishability of this ensemble was analyzed by Koashi et al. within the context of unambiguous discrimination [KTYI07]. However, for the task of minimum error discrimination, this simple-structure ensemble has only been studied in Ref. [CDH13], where it was shown that separable operators are able to obtain the global minimum error probability, while this rate cannot be achieved in finite rounds of LOCC. In terms of norms, we can summarize this result as

$$\|\psi-\rho\|_{1}=\|\psi-\rho\|_{\mathrm{SEP}}>\|\psi-\rho\|_{r\text{-}\mathrm{LOCC}}, \tag{16}$$

where and denotes the set of -round two-outcome LOCC POVMs (acting on two qubits). We emphasize that despite the previous unambiguous discrimination analysis conducted by Koashi et al. on ensemble (15), it is a completely different problem to consider the minimum error discrimination of these states. Indeed, we have recently observed that SEP and LOCC are equally powerful for distinguishing certain states when optimal unambiguous discrimination is the figure of merit, but SEP and LOCC are inequivalent for the same states when minimum error discrimination is taken as the figure of merit [CH13, CDH13].

The goal at hand is to prove that . To do so we will first compute the minimum one-way LOCC error probability; then we will construct a specific two-way protocol that beats it. For both parts, we will need the explicit formula for Helstrom’s bound when distinguishing one pure qubit state from one mixed state. This is given by computing for qubit pure states and . It is relatively straightforward to make this calculation (see Ref. [CH13] for details); one finds the following.

###### Lemma 3.

Consider the weighted states and . The minimum error probability is

$$\frac{1}{2}-\frac{1}{2}\sqrt{|p_0-p_1-p_2|^{2}-4\det\Delta} \tag{17}$$

if , and if , where and

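$$\det\Delta=p_1p_2\left(1-|\langle\psi_1|\psi_2\rangle|^{2}\right)-p_0p_1\left(1-|\langle\psi_0|\psi_1\rangle|^{2}\right)-p_0p_2\left(1-|\langle\psi_0|\psi_2\rangle|^{2}\right). \tag{18}$$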
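Lemma 3 can be cross-checked against a direct eigenvalue computation. The sketch below is restricted to real qubit states and uses hypothetical function names. It assumes that $\Delta=p_0|\psi_0\rangle\langle\psi_0|-p_1|\psi_1\rangle\langle\psi_1|-p_2|\psi_2\rangle\langle\psi_2|$ with $p_0+p_1+p_2=1$, that Eq. (17) applies when $\det\Delta<0$, and that the error is $\min\{p_0,\,p_1+p_2\}$ otherwise. This case split is our reading, based on the eigenvalues of a $2\times2$ Hermitian matrix.

```python
import math

def qubit(theta):
    """Real qubit state cos(theta/2)|0> + sin(theta/2)|1>."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def delta_entries(p, thetas):
    """Entries (a, b, d) of the real symmetric matrix Delta = [[a, b], [b, d]]."""
    a = b = d = 0.0
    for sign, weight, theta in zip((+1, -1, -1), p, thetas):
        x, y = qubit(theta)
        a += sign * weight * x * x
        b += sign * weight * x * y
        d += sign * weight * y * y
    return a, b, d

def det_formula(p, thetas):
    """Right-hand side of Eq. (18)."""
    def gap(i, j):
        overlap = sum(u * v for u, v in zip(qubit(thetas[i]), qubit(thetas[j])))
        return 1 - overlap ** 2
    p0, p1, p2 = p
    return p1 * p2 * gap(1, 2) - p0 * p1 * gap(0, 1) - p0 * p2 * gap(0, 2)

def min_error_direct(p, thetas):
    """(1 - ||Delta||_1) / 2 computed from the two eigenvalues of Delta."""
    a, b, d = delta_entries(p, thetas)
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return 0.5 * (1 - abs(mean + radius) - abs(mean - radius))

def min_error_lemma(p, thetas):
    """Eq. (17) if det(Delta) < 0, else min{p0, p1 + p2} (assumed case split)."""
    p0, p1, p2 = p
    det = det_formula(p, thetas)
    if det < 0:
        return 0.5 - 0.5 * math.sqrt((p0 - p1 - p2) ** 2 - 4 * det)
    return min(p0, p1 + p2)

# |0> with weight 1/2 against |+> and |-> with weight 1/4 each.
p = (0.5, 0.25, 0.25)
thetas = (0.0, math.pi / 2, -math.pi / 2)
print(det_formula(p, thetas))       # -1/16, up to rounding
print(min_error_direct(p, thetas))  # 0.25, up to rounding
print(min_error_lemma(p, thetas))   # 0.25, up to rounding
```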

One-Way Optimal: Consider a one-way measurement scheme that is optimal in the minimum error sense. Without loss of generality, we consider communication from Alice to Bob with Alice’s measurement consisting of rank-one POVMs. The joint measurement is then given by where is a conditional POVM performed by Bob. Note that we are dealing with a real ensemble with each state being -invariant. Consequently, we can simplify the structure of Alice’s POVM according to the following proposition.

###### Proposition 2.

An optimal one-way LOCC scheme consists of Alice’s POVM being

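$$\left\{p_0|\phi_0\rangle\langle\phi_0|,\; p_0\sigma_z|\phi_0\rangle\langle\phi_0|\sigma_z,\; p_1|\phi_1\rangle\langle\phi_1|,\; p_1\sigma_z|\phi_1\rangle\langle\phi_1|\sigma_z\right\}$$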

where .

###### Proof.

Appendix A ∎

With this proposition, it suffices to consider only two measurement outcomes corresponding to POVM elements and for which

$$q_0+q_1=1,\qquad q_0\cos\phi_0+q_1\cos\phi_1=0. \tag{19}$$

The full POVM will then contain the -rotated elements as well, and the total average error probability will be given by . Using the relations

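$$|\langle\phi_\lambda|0\rangle|^{2}=\frac{q_\lambda}{2}(1+\cos\phi_\lambda),\qquad |\langle\phi_\lambda|\pm\rangle|^{2}=\frac{q_\lambda}{2}(1\pm\sin\phi_\lambda),$$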

the unnormalized states that Bob must distinguish given outcome are

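$$\frac{p_{\lambda|0}}{2P(\lambda)}|0\rangle\langle 0|,\qquad \frac{p_{\lambda|+}}{4P(\lambda)}|+\rangle\langle+|+\frac{p_{\lambda|-}}{4P(\lambda)}|-\rangle\langle-|,$$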

where

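$$p_{\lambda|0}=\frac{q_\lambda}{2}(1+\cos\phi_\lambda),\qquad p_{\lambda|\pm}=\frac{q_\lambda}{2}(1\pm\sin\phi_\lambda),\qquad P(\lambda)=\frac{q_\lambda}{2}\left(1+\frac{\cos\phi_\lambda}{2}\right). \tag{20}$$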

For outcome , a direct calculation gives

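$$\left|\frac{p_{\lambda|0}}{2P(\lambda)}-\frac{p_{\lambda|+}}{4P(\lambda)}-\frac{p_{\lambda|-}}{4P(\lambda)}\right|=\frac{q_\lambda}{4P(\lambda)}|\cos\phi_\lambda|,\qquad \det\Delta=\left(\frac{q_\lambda}{8P(\lambda)}\right)^{2}\left[(1-\cos\phi_\lambda)^{2}-3\right]. \tag{21}$$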

Since the average error is given by , using Lemma 3, the error for when both outcomes satisfy is given by

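$$2\sum_{\lambda=0}^{1}\left(\frac{P(\lambda)}{2}-\frac{P(\lambda)}{2}\sqrt{\left(\frac{q_\lambda}{4P(\lambda)}|\cos\phi_\lambda|\right)^{2}-4\left(\frac{q_\lambda}{8P(\lambda)}\right)^{2}\left[(1-\cos\phi_\lambda)^{2}-3\right]}\right)=\frac{1}{2}\left(1-\frac{1}{\sqrt{2}}\left[q_0\sqrt{1+\cos\phi_0}+q_1\sqrt{1+\cos\phi_1}\right]\right). \tag{22}$$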
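As a check on Eq. (22), the radicand can be simplified with the determinant from Eq. (21). The following worked step is a sketch that uses only the quantities defined in Eqs. (19)–(21):

$$\left(\frac{q_\lambda}{4P(\lambda)}\cos\phi_\lambda\right)^{2}-4\left(\frac{q_\lambda}{8P(\lambda)}\right)^{2}\left[(1-\cos\phi_\lambda)^{2}-3\right]=\left(\frac{q_\lambda}{4P(\lambda)}\right)^{2}\cdot 2\,(1+\cos\phi_\lambda),$$

so each summand equals $\frac{P(\lambda)}{2}-\frac{q_\lambda}{4\sqrt{2}}\sqrt{1+\cos\phi_\lambda}$. Since Eqs. (19) and (20) give $P(0)+P(1)=\frac{1}{2}+\frac{1}{4}\left(q_0\cos\phi_0+q_1\cos\phi_1\right)=\frac{1}{2}$, doubling the sum yields the right-hand side of Eq. (22).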

We want to minimize this under the constraints that and . Using concavity of the function , Eq. (22) immediately gives a lower bound of . In fact, this lower bound is saturated by the choice of , and . This corresponds to Alice performing the projective measurement .

The only other possibility is if for outcome , and for outcome . In this case, the average error is given by

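$$2\times\left(\frac{1}{4}-\frac{q_0}{8}|\cos\phi_0|-\frac{q_1}{4\sqrt{2}}\sqrt{1+\cos\phi_1}\right)=\frac{1}{2}\left(1-q_1\left(\frac{\cos\phi_1}{2}+\frac{\sqrt{1+\cos\phi_1}}{\sqrt{2}}\right)\right) \tag{23}$$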

where we have used the relation and the fact that . This is minimized using a Lagrange multiplier, and the calculation is carried out in Appendix B. One finds that two extrema exist for Alice’s measurement: when she measures in the computational basis and when she measures in the Hadamard basis . Measuring in the computational basis leads to a smaller error probability. It corresponds to choosing and so that the optimal one-way LOCC error probability is .
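The one-way analysis can be reproduced numerically without the closed-form expressions. The sketch below is plain Python with hypothetical function names. It assumes Alice's POVM has the four-element form of Proposition 2 with weights $q_0, q_1$ and $|\phi_\lambda\rangle=\cos(\phi_\lambda/2)|0\rangle+\sin(\phi_\lambda/2)|1\rangle$, subject to Eq. (19), and that Bob performs the Helstrom measurement on his conditional operator after each outcome.

```python
import math

def trace_norm_2x2(a, b, d):
    """Trace norm of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return abs(mean + radius) + abs(mean - radius)

def element_norm(q, phi):
    """Trace norm of Bob's conditional operator for Alice's element q|phi><phi|."""
    w0 = 0.5 * q * (1 + math.cos(phi)) / 2   # prior 1/2 times q|<phi|0>|^2
    wp = 0.25 * q * (1 + math.sin(phi)) / 2  # prior 1/4 times q|<phi|+>|^2
    wm = 0.25 * q * (1 - math.sin(phi)) / 2  # prior 1/4 times q|<phi|->|^2
    # Delta = w0|0><0| - wp|+><+| - wm|-><-| in the computational basis.
    a = w0 - (wp + wm) / 2
    b = -(wp - wm) / 2
    d = -(wp + wm) / 2
    return trace_norm_2x2(a, b, d)

def one_way_error(q0, phi0, phi1):
    """Average error; the sigma_z-rotated elements contribute equal norms."""
    q1 = 1 - q0
    total = 2 * (element_norm(q0, phi0) + element_norm(q1, phi1))
    return 0.5 * (1 - total)

def grid_minimum(steps=40):
    """Coarse search over cos(phi0) <= 0 <= cos(phi1), with q0 fixed by Eq. (19)."""
    best = 1.0
    for j in range(steps + 1):
        for k in range(steps + 1):
            c0, c1 = -j / steps, k / steps
            q0 = 0.5 if j == 0 and k == 0 else c1 / (c1 - c0)
            best = min(best, one_way_error(q0, math.acos(c0), math.acos(c1)))
    return best

computational = one_way_error(0.5, math.pi, 0.0)       # Alice measures {|0>, |1>}
hadamard = one_way_error(0.5, math.pi / 2, math.pi / 2)  # Alice measures {|+>, |->}
print(computational)   # 0.125 up to rounding
print(hadamard)        # about 0.1464, i.e. (1 - 1/sqrt(2)) / 2
print(grid_minimum())  # the grid search does not go below 0.125
```

Under these assumptions the computational-basis measurement gives an error of $1/8$ and the Hadamard-basis measurement gives $\frac{1}{2}\left(1-\frac{1}{\sqrt{2}}\right)\approx 0.146$, consistent with the ordering of the two extrema stated above.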

Two-Way Improvement:

Now we construct an improved two-way protocol. We will track the evolution of the three-state ensemble

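$$\left\{\left(\tfrac{1}{2},|00\rangle\right);\left(\tfrac{1}{4},|{+}{+}\rangle\right);\left(\tfrac{1}{4},|{-}{-}\rangle\right)\right\},$$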

keeping in mind that the actual problem includes a mixing over the last two states. Suppose that Alice performs a two-outcome measurement with Kraus operators given by

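$$\begin{aligned}
A_0 &= \sqrt{\tfrac{1}{2}(1+p)}\,|0\rangle\langle 0|+\sqrt{\tfrac{1}{2}(1-p)}\,|1\rangle\langle 1|, \\
A_1 &= \sqrt{\tfrac{1}{2}(1+p)}\,|1\rangle\langle 1|+\sqrt{\tfrac{1}{2}(1-p)}\,|0\rangle\langle 0|.
\end{aligned} \tag{24}$$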
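As a quick consistency check on Eq. (24), the following sketch (plain Python; the function names and the sample value $p=0.5$ are illustrative assumptions) verifies that the two Kraus operators resolve the identity and computes how Alice's outcome 0 reweights the three-state ensemble above.

```python
import math

def kraus_diagonals(p):
    """Diagonal entries of A0 and A1 from Eq. (24) in the basis {|0>, |1>}."""
    hi = math.sqrt((1 + p) / 2)
    lo = math.sqrt((1 - p) / 2)
    return (hi, lo), (lo, hi)

def outcome_weight(diag, amps):
    """<phi|A^dagger A|phi> for a diagonal Kraus operator and real amplitudes."""
    return sum((k * a) ** 2 for k, a in zip(diag, amps))

def posterior_after_a0(p):
    """Probability of outcome 0 and the updated weights of |00>, |++>, |-->."""
    a0, _ = kraus_diagonals(p)
    s = 1 / math.sqrt(2)
    alice_states = [(1.0, 0.0), (s, s), (s, -s)]  # Alice's parts: |0>, |+>, |->
    priors = [0.5, 0.25, 0.25]
    joint = [w * outcome_weight(a0, st) for w, st in zip(priors, alice_states)]
    total = sum(joint)
    return total, [j / total for j in joint]

a0, a1 = kraus_diagonals(0.5)
# Diagonal of A0^dagger A0 + A1^dagger A1; both entries should equal 1.
completeness = [x * x + y * y for x, y in zip(a0, a1)]
prob0, weights = posterior_after_a0(0.5)
print(completeness)
print(prob0)    # (2 + p) / 4 = 0.625 for p = 0.5, up to rounding
print(weights)  # about [0.6, 0.2, 0.2]
```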

Here, parametrizes the strength of Alice’s measurement with corresponding to the optimal one-way measurement described above. First consider outcome that occurs with probability . Alice broadcasts her result and the updated ensemble becomes

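$$\left\{\left(\frac{1+p}{2+p},|00\rangle\right);\left(\frac{1}{2(2+p)},|s_{+}{+}\rangle\right);\left(\frac{1}{2(2+p)},|s_{-}{-}\rangle\right)\right\},$$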

where . Bob now pretends that Alice has completely eliminated the state , i.e. as if she had chosen . That is, he performs the optimal discrimination measurement for the ensemble . This amounts to measuring in the computational basis . If he measures then the state will be perfectly identified. On the other hand, outcome occurs with probability and in that case the updated probabilities of Alice’s ensemble are

$$P(0|B_0)=\frac{1}{P(B_0)}\cdot\frac{1+p}{2+p},\qquad P(s_\pm|B_0)=\frac{1}{2P(B_0)}\cdot\frac{1}{2(2+p)}$$

so that she is left to distinguish the sub-normalized states