# A framework for bounding nonlocality of state discrimination

Andrew M. Childs, Debbie Leung, Laura Mančinska, and Maris Ozols
Department of Combinatorics & Optimization
and Institute for Quantum Computing
University of Waterloo
###### Abstract

We consider the class of protocols that can be implemented by local quantum operations and classical communication (LOCC) between two parties. In particular, we focus on the task of discriminating a known set of quantum states by LOCC. Building on the work in the paper Quantum nonlocality without entanglement [BDF99], we provide a framework for bounding the amount of nonlocality in a given set of bipartite quantum states in terms of a lower bound on the probability of error in any LOCC discrimination protocol. We apply our framework to an orthonormal product basis known as the domino states and obtain an alternative and simplified proof that quantifies its nonlocality. We generalize this result for similar bases in larger dimensions, as well as the “rotated” domino states, resolving a long-standing open question [BDF99].

## 1 Introduction

The 1999 paper Quantum nonlocality without entanglement [BDF99] exhibits an orthonormal basis of product states, known as domino states, shared between two separated parties. When the parties are restricted to perform only local quantum operations and classical communication (LOCC), it is impossible to discriminate the domino states arbitrarily well [BDF99]. In such cases we say that perfect discrimination cannot be achieved with asymptotic LOCC. Moreover, [BDF99] also quantifies the extent to which any LOCC protocol falls short of perfect discrimination of the domino states.

This result spurred interest in state discrimination with LOCC. Several alternative proofs [WH02, GV01, Coh07] of the impossibility of perfect LOCC discrimination of the domino states were given along with many other results concerning perfect state discrimination (e.g., [BDM99, WSHV00, GKR01, GV01, VSPM01, CY01, CY02, WH02, DMS03, CL03, HSSH03, HM03, Fan04, GKRS04, Che04, CL04, JCY05, Wat05, Nat05, NC06, DFJY07, FS09, DFXY09, DXY10]). However, the problem of asymptotic LOCC state discrimination has not received much attention since the initial study of nonlocality without entanglement [BDF99].

The main motivation for our work is to better understand the phenomenon of quantum nonlocality without entanglement. More concretely, our goals are to

• simplify the original proof,

• render the technique applicable to a wider class of sets of bipartite states,

• exhibit new classes of product bases that cannot be asymptotically (as opposed to just perfectly) discriminated with LOCC,

• pin down where exactly the difference between LOCC and separable operations lies, and

• investigate the possibility of larger gaps between the sets of LOCC and separable operations.

In particular, we seek to exhibit quantitative gaps between the classes of LOCC and separable operations. Separable operations often serve as a relaxation of LOCC operations and such gaps show how imprecise this relaxation can be. The rationale behind this relaxation is that separable operations have a clean mathematical description whereas LOCC operations can be much harder to understand.

There is also an operational motivation to quantify the difference between separable measurements and those implemented by asymptotic LOCC: the former are precisely the measurements that cannot generate entangled states, while the latter are those that do not require entanglement to implement [BDF99, KTYI07, Koa09]. Thus, a separable measurement that cannot be implemented by asymptotic LOCC uses entanglement irreversibly.

### Our contributions

In this paper, we develop a framework for obtaining quantitative results on the hardness of quantum state discrimination by LOCC. More precisely, we provide a method for proving a lower bound on the error probability of any LOCC measurement for discriminating states from a given set $S$.

Our first main contribution (Theorem 2) is that any LOCC measurement for discriminating states from a set $S$ of size $n$ errs with probability $p_{\mathrm{error}} \geq \frac{2}{27} \frac{\eta^2}{n^5}$, where $\eta$ is a constant that depends on $S$ (see Definition 6). Intuitively, $\eta$ measures the nonlocality of $S$.

Our second main contribution is a systematic method for bounding the nonlocality constant for a large class of product bases. Together with the above theorem, this lets us quantify the hardness of LOCC discrimination for the following bases of product states:

1. domino states, the original set of nine states in dimensions first considered in [BDF99], have ;

2. domino-type states, a generalization of domino states to higher dimensions corresponding to tilings of a rectangular grid by tiles of size at most two, have , where is a property of the tiling that we call “diameter”;

3. -rotated domino states, a -parameter family that includes the domino states and the standard basis as extreme cases, have (determining whether these states can be discriminated perfectly by LOCC and finding a lower bound on the probability of error were left as open problems in [BDF99]).

The rest of the paper is organized as follows. In Section 2 we introduce notation, give background on LOCC measurements and state discrimination, and summarize related prior work. In Section 3 we introduce our general framework for lower bounding the error probability of LOCC measurements, and in Section 3.5 we prove Theorem 2. In Section 4 we consider the case where is a product basis and propose a method for bounding the nonlocality constant by another quantity that we call “rigidity.” Our approach is based on a description of sets of bipartite states in terms of tilings. In Section 5 we define the three classes of states mentioned above and prove a bound on the rigidity of the domino states; bounds on the rigidity of the domino-type states and the rotated domino states appear in Appendices A and B, respectively. Finally, we discuss limitations of our framework in Section 6 and conclude with a discussion of open problems in Section 7.

## 2 Background

### 2.1 Notation

The following notation is used in this paper. Let be the set of all linear operators from to and let . Next, let be the set of all positive semidefinite operators on . Let denote the largest entry of in absolute value. Finally, for any natural number , let and let be the identity matrix.

### 2.2 Separable and LOCC measurements

A -outcome POVM measurement (or simply a measurement) on an -dimensional state space is a set of operators such that . The operators are called POVM elements. The probability of obtaining outcome upon measuring state is .

When it is necessary to keep track of the post-measurement state, it is more convenient to use a non-destructive measurement. Such a measurement is specified by a set of measurement operators for some finite where . The probability of obtaining outcome upon measuring state is and the -dimensional post-measurement state is .

Note that a non-destructive measurement followed by discarding the post-measurement state corresponds to a POVM measurement with elements .

#### 2.2.1 Separable measurements

###### Definition 1.

A measurement on a bipartite state space is separable if all POVM elements are separable, i.e.,

$$E_i = \sum_j E_j^A \otimes E_j^B \tag{1}$$

for some and .

Note that the above definition is equivalent to saying that is obtained from a measurement with product POVM elements, followed by classical post-processing (coarse graining).

#### 2.2.2 LOCC measurements

Informally, a bipartite -outcome LOCC measurement consists of the two parties taking finitely many turns (called rounds) of applying adaptive non-destructive measurements to their state spaces and exchanging classical messages. This is followed by coarse graining all measurement records into bins, each corresponding to one of the outcomes of .

Let us describe such a protocol more formally, adopting notation similar to that of [BDF99]. Let denote the empty string, corresponding to no message being sent. The protocol begins when one of the parties, say Alice, applies a non-destructive measurement

$$A(\Lambda) = \{A_1(\Lambda), \dots, A_{k(\Lambda)}(\Lambda)\} \tag{2}$$

to her state space and communicates the round 1 measurement outcome to Bob. Then, depending on the value of received, Bob applies a non-destructive measurement

$$B(m_1) = \{B_1(m_1), \dots, B_{k(m_1)}(m_1)\} \tag{3}$$

to his state space and communicates the round 2 measurement outcome to Alice. The protocol proceeds with the two parties taking finitely many alternating turns of a similar form, where the non-destructive measurement applied at round depends on the measurement record accumulated during the previous rounds.

Let $m = (m_1, \dots, m_t)$ be the measurement record after the execution of the first $t$ rounds of the protocol. Then the measurement operator that Alice and Bob have effectively implemented is a product operator $A_m \otimes B_m$, where (assuming for simplicity that $t$ is even; in the odd case the operators $A_m$ and $B_m$ can be defined similarly)

$$A_m := A_{m_{t-1}}(m_1, \dots, m_{t-2}) \cdots A_{m_3}(m_1, m_2) \, A_{m_1}(\Lambda), \tag{4}$$

$$B_m := B_{m_t}(m_1, \dots, m_{t-1}) \cdots B_{m_4}(m_1, m_2, m_3) \, B_{m_2}(m_1). \tag{5}$$

Alice and Bob may choose to terminate the protocol depending on the measurement record obtained. At this point they must output one of the outcomes of the LOCC measurement that they are implementing. If is the set of all terminating measurement records corresponding to outcome , then the th POVM element of is given by

$$E_k := \sum_{m \in L(k)} A_m^\dagger A_m \otimes B_m^\dagger B_m. \tag{6}$$

Since each is separable, any LOCC measurement is separable.

#### 2.2.3 Finite and asymptotic LOCC

We consider two scenarios: when a measurement can be performed in a finite number of rounds or asymptotically.

###### Definition 2.

We say that a measurement can be implemented by (finite) LOCC if there exists a finite-round LOCC protocol that, for any input state, produces the same distribution of measurement outcomes as .

###### Definition 3.

We say that a measurement can be implemented by asymptotic LOCC if there exists a sequence of finite-round LOCC protocols whose output distributions converge to that of .

The exact implementation scenario is not practical since any real-world device is susceptible to errors due to imperfections in implementation. However, proving that a certain task cannot be performed asymptotically is considerably harder than showing that it cannot be done (exactly) by any finite LOCC protocol.

#### 2.2.4 LOCC protocol as a tree

We represent an LOCC measurement protocol as a tree (see Figure 1). The protocol begins at the root and proceeds downward along the edges. Each edge represents a certain measurement outcome obtained at its parent node, and leaves are the nodes where the protocol terminates. The set of all leaves is partitioned into subsets, each corresponding to an outcome of the LOCC measurement being implemented.

A path from the root to a leaf is called a branch. There is a one-to-one correspondence between the branches and the possible courses of execution of the LOCC protocol. Likewise, there is a one-to-one correspondence between the nodes of the tree and the accumulated measurement records.

The measurement at node is the measurement performed by the acting party once the protocol has reached node . In contrast, the measurement operator corresponding to node is the measurement operator that has been implemented upon reaching node . For example, consider the node . The measurement at node is given by the POVM , whereas the measurement operator corresponding to the node is given by . As another example, the measurement operators corresponding to the leaves are exactly the measurement operators of the LOCC protocol prior to coarse graining.

### 2.3 Bipartite state discrimination problem

The goal of this paper is to investigate the limitations of two-party LOCC protocols for the task of bipartite quantum state discrimination, which is as follows:

Let be a known set of quantum states. Suppose that is selected uniformly at random and Alice and Bob are given the corresponding parts of state . Their task is to determine the index by performing a measurement on this state.

A case of special interest is when is an orthonormal product basis, i.e., each for some orthonormal bases and . Such states can be perfectly discriminated by a separable measurement with POVM elements

$$E_i := |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|. \tag{7}$$

However, this measurement cannot always be implemented by finite [WH02, GV01] or even asymptotic LOCC [BDF99]. In such cases we say that possesses nonlocality (without entanglement).

### 2.4 Previous results

The first example of an orthonormal product basis of bipartite quantum states that cannot be perfectly discriminated by (even asymptotic) LOCC was given in [BDF99]. This is a striking illustration of the difference between the power of LOCC and separable operations. Furthermore, [BDF99] quantifies the information deficit of any LOCC protocol for discriminating these states. This result has been a starting point for many other studies on state discrimination by LOCC, with the ultimate goal of understanding LOCC operations and how they differ from separable ones. We briefly describe some of the directions that have been explored. Unless otherwise stated, these results refer to the discrimination of pure states with finite LOCC.

First consider the problem of discriminating two states without any restrictions on their dimension. Surprisingly, any two orthogonal (possibly entangled) pure states can be perfectly discriminated by LOCC, even when they are held by more than two parties [WSHV00]. Furthermore, optimal discrimination of any two multipartite pure states can be achieved with LOCC both in the sense of minimum error probability [VSPM01] and unambiguous discrimination [CY01, CY02, JCY05]. Recently this has been generalized to implementing an arbitrary POVM by LOCC in any -dimensional subspace [Cro12].

Many authors have considered the problem of perfect state discrimination by finite LOCC. In particular, the case where one party holds a small-dimensional system is well understood. Reference [WH02] characterizes when a set of orthogonal (possibly entangled) states in can be perfectly discriminated by LOCC. A similar characterization for sets of orthogonal product states in has been given by [FS09]. In addition, [WH02] characterizes when a set of orthogonal states in can be perfectly discriminated by LOCC when Alice performs the first nontrivial measurement. It is also known that -rotated domino states cannot be perfectly discriminated by LOCC (unless ) [GV01]. Furthermore, the original domino states have inspired a construction of -partite -dimensional product bases that cannot be perfectly discriminated with LOCC [NC06].

The role of entanglement in perfect state discrimination by finite LOCC has also been considered. It is not possible to perfectly discriminate more than two Bell states by LOCC [GKR01]. In fact, the same is true for any set of more than maximally entangled states in [Nat05]. Multipartite states from an orthonormal basis can be perfectly discriminated by LOCC only if it is a product basis [HSSH03]. Also, no basis of the subspace orthogonal to a state with orthogonal Schmidt number 3 or greater can be perfectly discriminated by LOCC [DFXY09]. On the other hand, any three orthogonal maximally entangled states in can be perfectly discriminated by LOCC [Nat05]. In fact, if the number of dimensions is not restricted, one can find arbitrarily large sets of orthogonal maximally entangled states that can be perfectly discriminated by LOCC [Fan04]. Contrary to intuition, states with more entanglement can sometimes be discriminated perfectly with LOCC while their less entangled counterparts cannot [HSSH03]. Generally, however, a set of orthogonal multipartite states can be perfectly discriminated with LOCC only if , where measures the average entanglement of the states in [HMM06].

It is known that local projective measurements are sufficient to discriminate states from an orthonormal product basis with LOCC [DR04, CL04]. Moreover, there is a polynomial-time (cubic in ) algorithm for deciding if states from a given orthonormal product basis of can be perfectly discriminated with LOCC [DR04]. The state discrimination problem for incomplete orthonormal sets (i.e., orthonormal sets of states that do not span the entire space) seems to be harder to analyze. However, unextendible product bases might be an exception (although commonly referred to as “bases” these are in fact incomplete orthonormal sets). It is known that states from an unextendible product basis cannot be perfectly discriminated by finite LOCC [BDM99]. In fact, the same holds for any basis of a subspace spanned by an unextendible product basis in  [DXY10]. Curiously, there are only two families of unextendible product bases in , one of which is closely related to the domino states [DMS03].

The problem of state discrimination with asymptotic LOCC has been studied less. It is known that states from an unextendible orthonormal product set cannot be perfectly discriminated with LOCC even asymptotically [DR04]. Reference [KKB11] gives a necessary condition for perfect asymptotic LOCC discrimination, and also shows that for perfectly discriminating states from an orthonormal product basis, asymptotic LOCC gives no advantage over finite LOCC. The latter result implies that the algorithm from [DR04] also covers the asymptotic case. On the other hand, even in some very basic instances of state discrimination it remains unclear whether asymptotic LOCC is superior to finite LOCC (see [DFXY09, KKB11] for specific sets of states).

Another line of study originating from [BDF99] aims at understanding the difference between the classes of separable and LOCC operations. To this end, [Coh11] constructs an -round LOCC protocol implementing an arbitrary separable measurement whenever such a protocol exists. A different approach is to exhibit quantitative gaps between the two classes. To the best of our knowledge, only two quantitative gaps other than that of [BDF99] are known. References [KTYI07, Koa09] demonstrate a gap between the success probabilities achievable by bipartite separable and LOCC operations for unambiguously discriminating from a fixed rank-2 mixed state. The largest known difference between the two classes is a gap of 0.125 between the achievable success probabilities for tripartite EPR pair distillation [CCL11]. Moreover, as the number of parties grows, the gap approaches 0.37 [CCL11].

At first glance one might think that the phenomenon of nonlocality without entanglement is related to quantum discord. However, the quantum discord value cannot be used to determine whether states from a given ensemble can be discriminated with LOCC [BT10].

Finally, if a set of orthogonal (product or entangled) states cannot be perfectly discriminated by LOCC, one can measure their nonlocality by considering how much entanglement is needed to achieve perfect discrimination [Coh08, BBKW09].

## 3 Framework

In this section we introduce a framework for proving lower bounds on the error probability of any LOCC measurement for discriminating bipartite states from a given set

$$S := \{|\psi_1\rangle, \dots, |\psi_n\rangle\} \subset \mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}. \tag{8}$$

We make no assumptions about the states . In particular, they need not be product states or be mutually orthogonal.

From now on, denotes an arbitrary LOCC protocol for discriminating states from . In rough outline our argument proceeds as follows:

1. We modify so that it can be stopped when a specific amount of information has been obtained (see Section 3.1). This is done by terminating the protocol prematurely and possibly making the last measurement less informative (see Section 3.2).

2. When the information gain is , we lower bound a measure of disturbance (defined in Section 3.3) by for some constant (see Section 3.4).

3. We show that at least two of the possible initial states have become nonorthogonal at this stage of the protocol, and we infer a lower bound on the error probability of (see Section 3.5).

Our framework reuses some ideas of the original approach [BDF99]. However, instead of mutual information, we use error probability to quantify how much an LOCC protocol has learned about the state. This allows us to replace the long mutual information analysis in the original paper with a simple application of Helstrom’s bound. The idea of relating information gain and disturbance also comes from [BDF99]. Here, we analyze this tradeoff using the nonlocality constant (see Definition 6), which can be applied to any set of states. In Section 4 we give a method for lower bounding the nonlocality constant that applies specifically when $S$ is an orthonormal basis of $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$. In Section 5 we apply this method to the domino states and some other related bases.

### 3.1 Interpolated LOCC protocol

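Consider an arbitrary node in the tree representing the protocol. Let $m$ be the corresponding measurement record and let $A \otimes B$ denote the Kraus operator that is applied to the initial state when this node is reached. Note that the output dimensions of operators $A$ and $B$ could be arbitrary.

The initial state $|\psi_k\rangle$ yields measurement record $m$ with probability

$$p(m|\psi_k) := \operatorname{Tr}\bigl[(A \otimes B)^\dagger (A \otimes B) \, |\psi_k\rangle\langle\psi_k|\bigr] = \langle\psi_k|(a \otimes b)|\psi_k\rangle \tag{9}$$

where $a := A^\dagger A$ and $b := B^\dagger B$. Note that we need not concern ourselves with the arbitrary output dimensions of $A$ and $B$ from this point onward. We use Bayes’s rule and the uniformity of the prior probabilities $p(\psi_k) = 1/n$ to obtain the probability that the initial state was $|\psi_k\rangle$, conditioned on the measurement record being $m$:

$$p(\psi_k|m) = \frac{p(\psi_k) \, p(m|\psi_k)}{\sum_{j=1}^n p(\psi_j) \, p(m|\psi_j)} = \frac{\langle\psi_k|(a \otimes b)|\psi_k\rangle}{\sum_{j=1}^n \langle\psi_j|(a \otimes b)|\psi_j\rangle}. \tag{10}$$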
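As a small numerical illustration of Equations (9) and (10), the following sketch computes the posterior distribution over the candidate states for a given pair of operators. The helper names (`kron`, `expectation`, `posterior`) and the toy input (the standard basis of two qubits with a diagonal operator on Alice's side) are assumptions made for illustration and do not come from the paper.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
        for i in range(len(a)) for k in range(len(b))
    ]


def expectation(psi, op):
    """Return <psi| op |psi> for a vector given as a list of numbers."""
    dim = len(psi)
    value = sum(
        complex(psi[r]).conjugate() * op[r][c] * psi[c]
        for r in range(dim) for c in range(dim)
    )
    return value.real


def posterior(states, a, b):
    """Posterior p(psi_k | m) of Equation (10), assuming a uniform prior."""
    ab = kron(a, b)
    weights = [expectation(psi, ab) for psi in states]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy input: the standard basis of C^2 (x) C^2, with Alice
# applying a weak diagonal measurement operator and Bob doing nothing.
states = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
a = [[0.75, 0.0], [0.0, 0.25]]  # a = A^dagger A
b = [[1.0, 0.0], [0.0, 1.0]]    # b = B^dagger B (identity)

post = posterior(states, a, b)
p_max = max(post)  # here 1/4 + 1/8, i.e. epsilon = 1/8 in the paper's notation
```

In this toy case the two states with Alice's first basis vector become three times as likely as the other two, so the posterior has moved away from the uniform value $1/n$.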

At the root, the measurement record is the empty string $\Lambda$ and $p(\psi_k|\Lambda) = 1/n$ for all $k$. As we proceed toward the leaves, these probabilities fluctuate away from $1/n$. For example, if the protocol discriminates the states perfectly, the distribution at each leaf is a Kronecker delta function.

For a given node let us define

$$p_{\max}(m) := \max_{k \in [n]} p(\psi_k|m). \tag{11}$$

Let . Then characterizes the uniformity of the distribution and thus the amount of information learned about the input state. The next theorem shows that we can modify the protocol so that it can be stopped when some but not too much information has been learned. While this idea originates from [BDF99], we use a specific result from [KKB11].

###### Theorem 1 (Kleinmann, Kampermann, Bruß [KKB11]).

Let be an LOCC protocol for discriminating states from a set of size . For any there exists an LOCC protocol that has the same success probability as , but each branch of has a node such that either

$$p_{\max}(m) = \frac{1}{n} + \varepsilon \quad \text{or} \quad p_{\max}(m) < \frac{1}{n} + \varepsilon \text{ and } m \text{ is a leaf of } P. \tag{12}$$

###### Proof idea.

Let be a node in the protocol tree of and let be the children of . Assume that for some we have

$$p_{\max}(u) < \frac{1}{n} + \varepsilon \tag{13}$$

which means that the measurement outcome corresponding to the edge is too informative. To rectify this, we break up the measurement at node into two steps. We represent the outcomes of the first measurement by new nodes while the outcomes of the second measurement lead to the original nodes (see Figure 2).

The first measurement interpolates between a completely uninformative trivial measurement and the original measurement at . The interpolation parameters are chosen so that for all that satisfy Equation (13). The second measurement depends on the outcome of the first measurement. It produces the same set of post-measurement states as the original measurement at . Moreover, the total probability of obtaining each state is the same as in the case of the original measurement. After this we proceed according to the original protocol.

Protocol is obtained from by considering all branches of and performing the above procedure at the closest node to the root that has a child satisfying Equation (13). For more details see [KKB11]. ∎

In the context of state discrimination, the possibility of interpolating a protocol to obtain some but not too much information is what distinguishes LOCC measurements from separable ones. In particular, a separable measurement for a set of states that cannot be distinguished by asymptotic LOCC cannot be divided into two steps, with the first yielding information precisely and the second completing the measurement (further details will be provided in a manuscript currently in preparation).

### 3.2 Stopping condition

To control how much information the protocol has learned, we fix some and stop the execution of when we reach a node that satisfies the conditions in Equation (12).

###### Definition 4.

We say that stage I of the protocol is complete at the earliest point when Equation (12) is satisfied.

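We choose $0 < \varepsilon < \frac{1}{n(n-1)}$ in our analysis. Operationally, this means that none of the states has been eliminated at the end of stage I, since

$$\min_{k \in [n]} p(\psi_k|m) \geq 1 - (n-1) \, p_{\max}(m) \geq \frac{1}{n} - (n-1)\varepsilon > 0. \tag{14}$$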

This allows us to use Helstrom’s bound to lower bound the probability of error (see Section 3.5). It also ensures that the disturbance measure introduced in Section 3.3 is well defined at . All constraints imposed on the distribution are summarized in Figure 3.

Since the error probability of the protocol is a weighted average of error probabilities of individual branches, it suffices to lower bound these individual error probabilities. For any branch that terminates without a node satisfying

$$p_{\max}(m) = \frac{1}{n} + \varepsilon, \tag{15}$$

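we can put a large lower bound on the error probability. In particular, for the optimal choice of $\varepsilon$ in Theorem 2, namely $\varepsilon = \frac{2}{3} \frac{1}{n(n-1)}$,

$$p_{\mathrm{error}}(m) \geq 1 - p_{\max}(m) > 1 - \left(\frac{1}{n} + \varepsilon\right) = 1 - \frac{1}{n} - \frac{2}{3} \frac{1}{n(n-1)} \geq \frac{1}{6}, \tag{16}$$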

which is much higher than the lower bound we obtain for other branches. We now consider the remaining case where stage I ends with a node satisfying Equation (15).

### 3.3 Measure of disturbance

Now we show that at least two possible post-measurement states and are nonorthogonal at the end of stage I, and lower bound their overlap quantitatively. Assuming that the initial state was , the normalized post-measurement state at the node with corresponding measurement operator is

$$|\phi_i\rangle := \frac{(A \otimes B)|\psi_i\rangle}{\sqrt{\langle\psi_i|(a \otimes b)|\psi_i\rangle}} \tag{17}$$

where and . Note that for all because, from Equations (14) and (10), .

###### Definition 5.

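The disturbance caused by the operator $a \otimes b$ on the set of states $S$ is defined as

$$\delta_S(a \otimes b) := \max_{i \neq j} |\langle\phi_i|\phi_j\rangle| = \max_{i \neq j} \frac{|\langle\psi_i|(a \otimes b)|\psi_j\rangle|}{\sqrt{\langle\psi_i|(a \otimes b)|\psi_i\rangle \, \langle\psi_j|(a \otimes b)|\psi_j\rangle}}. \tag{18}$$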

Note that measures the nonorthogonality of the post-measurement states. If the initial states were orthogonal then indeed characterizes the disturbance caused by .

Since can be expressed in terms of the operators and , from now on we no longer explicitly use the measurement operators and .

### 3.4 Disturbance/information gain trade-off

Now we define the nonlocality constant and show that it relates (the disturbance caused at the end of stage I) to (the amount of information learned).

###### Definition 6.

The nonlocality constant of is the supremum over all such that for all and for all satisfying ,

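$$\eta \cdot \left( \frac{\max_{k \in [n]} \langle\psi_k|(a \otimes b)|\psi_k\rangle}{\sum_{j \in [n]} \langle\psi_j|(a \otimes b)|\psi_j\rangle} - \frac{1}{n} \right) \leq \delta_S(a \otimes b). \tag{19}$$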

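Equivalently, if $G_{ij} := \langle\psi_i|(a \otimes b)|\psi_j\rangle$ for $i, j \in [n]$ then

$$\eta := \inf_{a, b} \left\{ \frac{\displaystyle \max_{i \neq j} \frac{|G_{ij}|}{\sqrt{G_{ii} G_{jj}}}}{\displaystyle \frac{\max_k G_{kk}}{\sum_{j=1}^n G_{jj}} - \frac{1}{n}} \right\} \tag{20}$$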

where the infimum is over all and such that for all .
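To make the quotient in Equation (20) concrete, here is a minimal sketch that evaluates it for a single matrix $G$. The function names and the example matrix are hypothetical; in particular, the example $G$ is not claimed to arise from any product operator $a \otimes b$, and the requirement of strictly positive information gain is an assumption made so that the quotient is defined.

```python
import math


def disturbance(G):
    """max over i != j of |G_ij| / sqrt(G_ii * G_jj), as in Equation (18)."""
    n = len(G)
    return max(
        abs(G[i][j]) / math.sqrt(G[i][i].real * G[j][j].real)
        for i in range(n) for j in range(n) if i != j
    )


def information_gain(G):
    """max_k G_kk / sum_j G_jj - 1/n, the bracket in Equation (19)."""
    n = len(G)
    diag = [G[k][k].real for k in range(n)]
    return max(diag) / sum(diag) - 1.0 / n


def ratio(G):
    """Disturbance divided by information gain for one matrix G."""
    gain = information_gain(G)
    if gain <= 0:
        # Assumption: the quotient is only considered when something was learned.
        raise ValueError("ratio is only defined for positive information gain")
    return disturbance(G) / gain


# Hypothetical 2x2 example (n = 2), used only to exercise the formulas.
G = [[0.6, 0.1], [0.1, 0.4]]
r = ratio(G)
```

Since $\eta$ is an infimum of this quotient, the value obtained from any single admissible pair $(a, b)$ can only give an upper bound on $\eta$; a lower bound requires an argument covering all admissible operators, which is what Section 4 develops.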

Recall from Section 3.2 that we stop the LOCC protocol at the end of stage I in a node where the condition in Equation (15) is satisfied for some . Let be the operator corresponding to node and let be the disturbance caused.

###### Lemma 1 (Disturbance/information gain trade-off).

The amount of information learned at the end of stage I lower bounds the disturbance as

$$\eta \varepsilon \leq \delta \tag{21}$$

where $\eta$ is the nonlocality constant of $S$ (see Definition 6).

###### Proof.

This immediately follows from the definitions of and :

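$$\eta \varepsilon = \eta \left( \max_{k \in [n]} p(\psi_k|m) - \frac{1}{n} \right) = \eta \left( \frac{\max_{k \in [n]} \langle\psi_k|(a \otimes b)|\psi_k\rangle}{\sum_{j=1}^n \langle\psi_j|(a \otimes b)|\psi_j\rangle} - \frac{1}{n} \right) \leq \delta \tag{22}$$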

where we have used Equations (15), (10), and (19). ∎

### 3.5 Lower bounding the error probability

In this section we use Lemma 1 to lower bound the error probability of any LOCC measurement for discriminating states from the set .

Note that Equation (21) together with the definition of implies that at the end of stage I there are two distinct post-measurement states and such that

$$|\langle\phi_i|\phi_j\rangle| = \delta \geq \eta \varepsilon. \tag{23}$$

As discussed in Section 3.2, our choice of guarantees that and are both strictly positive. Thus we can use the following result to lower bound the error probability:

###### Fact (Helstrom bound [Hel76, p. 113]).

Suppose we are given state with probability and state with probability . Any measurement trying to discriminate the two cases errs with probability at least

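$$Q(q_0, q_1, \delta) := \frac{1}{2} \left( 1 - \sqrt{1 - 4 q_0 q_1 \delta^2} \right) \geq q_0 q_1 \delta^2, \tag{24}$$

where $\delta$ is the overlap between the two states, and the inequality follows from $\sqrt{1 - x} \leq 1 - x/2$ for $x \in [0, 1]$.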
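The inequality in Equation (24) is easy to check numerically. The sketch below evaluates both sides on a grid of priors and overlaps; the function names and the grid resolution are illustrative choices rather than anything prescribed by the paper.

```python
import math


def helstrom_error(q0, q1, delta):
    """Q(q0, q1, delta) from Equation (24): minimum two-state error probability."""
    # max(0.0, ...) guards against tiny negative values caused by rounding.
    return 0.5 * (1.0 - math.sqrt(max(0.0, 1.0 - 4.0 * q0 * q1 * delta ** 2)))


def simple_lower_bound(q0, q1, delta):
    """The weaker but more convenient bound q0 * q1 * delta^2."""
    return q0 * q1 * delta ** 2


# Check Q >= q0 * q1 * delta^2 on a grid with q0 + q1 = 1 and 0 <= delta <= 1.
grid_ok = all(
    helstrom_error(q0, 1.0 - q0, d) >= simple_lower_bound(q0, 1.0 - q0, d) - 1e-15
    for q0 in (i / 20 for i in range(1, 20))
    for d in (j / 20 for j in range(0, 21))
)
```

The two sides agree to leading order for small overlap, which is the regime relevant for the lower bound derived in this section.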

As increases, the disturbance (thus the overlap between some and ) increases, but the lower bound on the probabilities and decreases. The choice gives a lower bound on the error probability as follows.

###### Theorem 2.

Let be a set of quantum states in of size . Any LOCC measurement for discriminating states drawn uniformly from errs with probability

$$p_{\mathrm{error}} \geq \frac{2}{27} \frac{\eta^2}{n^5} \tag{25}$$

where $\eta$ is the nonlocality constant of $S$ (see Definition 6).

###### Proof.

At the end of stage I there are two post-measurement states and with overlap . Let and be the posterior probabilities of these states. To lower bound the error probability of (thus that of ), we give Alice and Bob extra power at this point:

• if the actual input state does not lead to or , we assume that Alice and Bob succeed with certainty;

• otherwise Alice and Bob are allowed to perform the best joint measurement to discriminate the states and .

For fixed and probabilities and , we can lower bound the error probability by the following expression:

$$P(p_0, p_1, \varepsilon) := (p_0 + p_1) \cdot Q\!\left( \frac{p_0}{p_0 + p_1}, \frac{p_1}{p_0 + p_1}, \delta \right). \tag{26}$$

Using Equation (24) and the inequality from Lemma 1, we get that

$$P(p_0, p_1, \varepsilon) \geq \frac{p_0 p_1}{p_0 + p_1} (\eta \varepsilon)^2. \tag{27}$$

Recall that we stop the protocol at a point where we are guaranteed that and, by Equations (14) and (15),

$$\frac{1}{n} - (n-1)\varepsilon \leq p_i \leq \frac{1}{n} + \varepsilon \tag{28}$$

for all . Given these constraints on and , we can choose the that maximizes and guarantee that the error probability in the branch of the LOCC protocol being considered satisfies

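$$p_{\mathrm{error}} \geq \max_{\varepsilon \in \left(0, \frac{1}{n(n-1)}\right)} \; \min_{p_0, p_1 \in \left[\frac{1}{n} - (n-1)\varepsilon, \, \frac{1}{n} + \varepsilon\right]} P(p_0, p_1, \varepsilon). \tag{29}$$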

From Equation (27) we get

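$$p_{\mathrm{error}} \geq \max_{\varepsilon \in \left(0, \frac{1}{n(n-1)}\right)} \; \min_{p_0, p_1 \in \left[\frac{1}{n} - (n-1)\varepsilon, \, \frac{1}{n} + \varepsilon\right]} \frac{p_0 p_1}{p_0 + p_1} (\eta \varepsilon)^2. \tag{30}$$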

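The minimum is attained when $p_0 = p_1 = \frac{1}{n} - (n-1)\varepsilon$ (i.e., the probabilities are equal and as small as possible), so the problem simplifies to

$$p_{\mathrm{error}} \geq \max_{\varepsilon \in \left(0, \frac{1}{n(n-1)}\right)} \frac{1}{2} \left( \frac{1}{n} - (n-1)\varepsilon \right) (\eta \varepsilon)^2 \geq \frac{2}{27} \frac{\eta^2}{n^3 (n-1)^2} \geq \frac{2}{27} \frac{\eta^2}{n^5} \tag{31}$$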

where the value

$$\varepsilon = \frac{2}{3} \frac{1}{n(n-1)} \tag{32}$$

achieves the maximum. ∎
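The optimization in Equations (31) and (32) can be double-checked numerically. The following sketch evaluates the objective at the stated optimum, compares it with the closed form, and confirms by a coarse grid search that no other $\varepsilon$ in the allowed interval does better. All function names and the grid size are illustrative assumptions.

```python
def branch_bound(eps, n, eta):
    """Objective of Equation (31): (1/2) * (1/n - (n-1)*eps) * (eta*eps)^2."""
    return 0.5 * (1.0 / n - (n - 1) * eps) * (eta * eps) ** 2


def optimal_eps(n):
    """Maximizer from Equation (32)."""
    return 2.0 / (3.0 * n * (n - 1))


def closed_form(n, eta):
    """Value of the maximum: (2/27) * eta^2 / (n^3 * (n-1)^2)."""
    return 2.0 * eta ** 2 / (27.0 * n ** 3 * (n - 1) ** 2)


def grid_maximum(n, eta, steps=10_000):
    """Largest objective value on a uniform grid inside (0, 1/(n(n-1)))."""
    upper = 1.0 / (n * (n - 1))
    return max(branch_bound(upper * i / steps, n, eta) for i in range(1, steps))


# Example with n = 9 (the size of a basis of a 3 x 3 system) and eta = 1.
n, eta = 9, 1.0
at_optimum = branch_bound(optimal_eps(n), n, eta)
```

Setting the derivative of $\left(\frac{1}{n} - (n-1)\varepsilon\right)\varepsilon^2$ to zero gives $\frac{2\varepsilon}{n} = 3(n-1)\varepsilon^2$, which reproduces Equation (32).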

Theorem 2 shows that any LOCC protocol for discriminating states from $S$ errs with probability bounded below by a quantity proportional to $\eta^2$, justifying the name “nonlocality constant.”

## 4 Bounding the nonlocality constant

The framework described in Section 3 reduces the problem of bounding the error probability for discriminating bipartite states by LOCC to that of bounding the nonlocality constant (see Theorem 2). This reduction holds for any set of pure states $S \subset \mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$. In this section we assume that $S$ is an orthonormal basis of $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ and provide tools for bounding the nonlocality constant. In particular, we bound $\eta$ in terms of another quantity that we call “rigidity”.

For the remainder of the paper we represent pure states from $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ using “tiles” in a $d_A \times d_B$ grid. We first introduce some notation related to tilings in Section 4.1. Then we define rigidity and relate it to the nonlocality constant in Section 4.2. Section 4.3 provides a tool, the “pair of tiles” lemma, that we use to bound rigidity for specific sets of states in Section 5.

### 4.1 Definitions

Given a fixed orthonormal basis , define the support of a pure state as

$$\operatorname{supp} |\psi\rangle := \{ i \in [d] : \langle i|\psi\rangle \neq 0 \}. \tag{33}$$

If then . Consider as a rectangular grid of size . Any region that corresponds to a submatrix of this grid is called a tile. More formally, a tile is a subset such that for some and . (Note that a tile is not necessarily a contiguous region of the grid.) We use and to denote the rows and columns of this tile, respectively, and we use to denote the size or the area of . If is a product state, then and thus is a tile, which we call the tile induced by .

We say that an orthonormal set of product states $S$ induces a tiling of a $d_A \times d_B$ grid if the tiles induced by the states in $S$ are either disjoint or identical. Note that if $S$ is an orthonormal basis of $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$, then a tile of area $\ell$ is induced by $\ell$ states that form a basis of that tile. In a domino-type tiling, every tile has area $1$ or $2$.

For a given tiling of a $d_A \times d_B$ grid let us define the corresponding row graph as follows: its vertex set is $[d_A]$ with two vertices $r_1$ and $r_2$ adjacent if and only if there exists a column $c \in [d_B]$ such that $(r_1, c)$ and $(r_2, c)$ belong to the same tile. The column graph of a tiling is defined similarly. We say that a tiling is irreducible if its row graph and its column graph are both connected. The diameter of the tiling is the maximum of the diameters of its row and column graphs. See Figure 4 for an example.
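The row graph, column graph, irreducibility, and diameter can be computed mechanically. The sketch below is one possible implementation with hypothetical function names and 0-based indices. The tile layout `DOMINO_TILES` is an assumption: it is the usual picture of the $3 \times 3$ domino tiling (one central $1 \times 1$ tile surrounded by four dominoes) and is not taken from Figure 4.

```python
from collections import deque

# Assumed layout of the 3x3 domino tiling; cells are (row, column), 0-based.
DOMINO_TILES = [
    {(1, 1)},
    {(0, 0), (0, 1)},
    {(2, 1), (2, 2)},
    {(1, 0), (2, 0)},
    {(0, 2), (1, 2)},
]

def axis_graph(tiles, size, axis):
    """Row graph (axis=0) or column graph (axis=1) as an adjacency dict."""
    adj = {v: set() for v in range(size)}
    other = 1 - axis
    for tile in tiles:
        for cell_1 in tile:
            for cell_2 in tile:
                # Rows r1 != r2 are adjacent if (r1, c) and (r2, c) share a tile
                # for some column c; columns are treated symmetrically.
                if cell_1[other] == cell_2[other] and cell_1[axis] != cell_2[axis]:
                    adj[cell_1[axis]].add(cell_2[axis])
    return adj

def graph_diameter(adj):
    """Diameter via breadth-first search from every vertex; None if disconnected."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(adj):
            return None
        best = max(best, max(dist.values()))
    return best

def tiling_diameter(tiles, n_rows, n_cols):
    """Diameter of an irreducible tiling, or None if the tiling is reducible."""
    d_row = graph_diameter(axis_graph(tiles, n_rows, 0))
    d_col = graph_diameter(axis_graph(tiles, n_cols, 1))
    if d_row is None or d_col is None:
        return None
    return max(d_row, d_col)

if __name__ == "__main__":
    print(tiling_diameter(DOMINO_TILES, 3, 3))
```

Under the assumed layout, both the row graph and the column graph are paths on three vertices, so the tiling is irreducible with diameter 2.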

Without loss of generality we consider only irreducible tilings. Reducible tilings can be broken down into several smaller components without disturbing the underlying states. To do this, both parties simply perform a projective measurement with respect to the subspaces corresponding to the different components of the row and column graphs.

Note that in general, a tiling is not invariant under local unitaries. In particular, the irreducibility of the tiling induced by a given set of states is a basis-dependent property. The most extreme example of this phenomenon is the case of the standard basis. It induces a completely reducible tiling that consists only of $1 \times 1$ tiles. However, if both parties apply a generic local unitary transformation, the resulting tiling consists only of a single tile of maximal size.

### 4.2 Lower bounding the nonlocality constant using rigidity

In this section we assume that $S$ is an orthonormal basis of $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ (so in particular, $n = d_A d_B$) and discuss a particular strategy for lower bounding $\eta$ for such $S$. We apply this strategy to several orthonormal product bases in Section 5.

We bound $\eta$ (quantifying a disturbance/strength tradeoff) by considering a quantitative property of the set $S$ called rigidity. Intuitively, we call a measurement operator strong if it is far from being proportional to the identity matrix; a set of states is rigid if there exists a strong measurement that leaves the set undisturbed. We formalize this as follows (recall that $\|M\|_{\max}$ denotes the largest entry of a matrix $M$ in absolute value):

###### Definition 7.

For an orthonormal basis , if there is a constant such that for all , and for all such that ,

$$\left\| \frac{a \otimes b}{\operatorname{Tr}(a \otimes b)} - \frac{I}{n} \right\|_{\max} \leq c \cdot \delta_S(a \otimes b), \tag{34}$$

we say $S$ is $c$-rigid, or $c$ is an upper bound on the rigidity of $S$.

When $S$ is rigid, the states can remain unchanged despite application of a strong measurement. For example, a tensor product basis is not $c$-rigid for any finite $c$ (i.e., such a basis is arbitrarily rigid). In contrast, if $c$ is small, then any strong measurement disturbs the set $S$, and Equation (34) quantifies how weak a measurement operator must be for the disturbance to be small.

We now relate upper bounds on the rigidity of $S$ to lower bounds on its nonlocality constant:

###### Lemma 2.

Let $S$ be an orthonormal basis of $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$. If $S$ is $c$-rigid then

$$\eta \geq \frac{1}{cL}, \tag{35}$$

where $L$ is the size of the largest tile corresponding to states in $S$.

###### Proof.

If is -rigid, then for any and (such that for all ), we have

$$\frac{a \otimes b}{\operatorname{Tr}(a \otimes b)} - \frac{I}{n} = c M \cdot \delta_S(a \otimes b) \tag{36}$$

for some Hermitian matrix $M$ with $\|M\|_{\max} \leq 1$. From this we get

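\begin{align}
\max_{k \in [n]} \langle\psi_k| \frac{a \otimes b}{\operatorname{Tr}(a \otimes b)} |\psi_k\rangle - \frac{1}{n}
&= c \max_{k \in [n]} \langle\psi_k| M |\psi_k\rangle \cdot \delta_S(a \otimes b) \tag{37} \\
&\leq c L \cdot \delta_S(a \otimes b). \tag{38}
\end{align}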

By the definition of (Equation (19)) and the fact that for any orthonormal basis , we get the desired inequality. ∎
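One way to see the step from (37) to (38) is that a unit vector $|\psi\rangle$ supported on a tile of size $L$ satisfies $|\langle\psi|M|\psi\rangle| \leq \|M\|_{\max} \bigl(\sum_i |\psi_i|\bigr)^2 \leq L \, \|M\|_{\max}$ by the Cauchy–Schwarz inequality. The sketch below checks this numerically. It is an illustration only: the names are hypothetical, and for simplicity it uses real symmetric matrices and real vectors rather than general Hermitian matrices and complex states.

```python
import math
import random

def quadratic_form(psi, M):
    """<psi|M|psi> for a real vector psi and a real symmetric matrix M."""
    dim = len(psi)
    return sum(psi[i] * M[i][j] * psi[j] for i in range(dim) for j in range(dim))

def random_state_on_tile(dim, tile, rng):
    """Random real unit vector whose support lies inside the index set `tile`."""
    psi = [0.0] * dim
    for i in tile:
        psi[i] = rng.uniform(-1.0, 1.0)
    norm = math.sqrt(sum(x * x for x in psi))
    return [x / norm for x in psi]

def random_symmetric(dim, rng):
    """Random real symmetric matrix with all entries in [-1, 1]."""
    M = [[0.0] * dim for _ in range(dim)]
    for i in range(dim):
        for j in range(i, dim):
            M[i][j] = M[j][i] = rng.uniform(-1.0, 1.0)
    return M

if __name__ == "__main__":
    rng = random.Random(0)
    dim, tile = 6, [0, 2, 5]
    worst = max(abs(quadratic_form(random_state_on_tile(dim, tile, rng),
                                   random_symmetric(dim, rng)))
                for _ in range(1000))
    print(worst <= len(tile))
```

The bound is attained by the all-ones matrix together with the uniform superposition over the tile, so the factor $L$ cannot be improved by this argument alone.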

Putting Lemma 2 and Theorem 2 together gives the following:

###### Theorem 3.

Let $S$ be an orthonormal basis of $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$. If $S$ is $c$-rigid then any LOCC measurement for discriminating states from $S$ errs with probability

$$p_{\mathrm{error}} \geq \frac{2}{27} \frac{1}{(cL)^2 n^5} \tag{39}$$

where $L$ is the size of the largest tile of $S$.
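The bound in Theorem 3 is simply the composition of Lemma 2 with Theorem 2, which the following sketch makes explicit using exact fractions. The function names are hypothetical, and the numerical values in the example (in particular the rigidity bound $c = 2$) are placeholders chosen for illustration, not values derived in the text.

```python
from fractions import Fraction

def eta_lower_bound(c, L):
    """Lemma 2: a c-rigid basis with largest tile size L has eta >= 1 / (c L)."""
    return 1 / (Fraction(c) * L)

def error_bound_from_eta(eta, n):
    """Theorem 2, Equation (31): p_error >= (2/27) * eta^2 / n^5."""
    return Fraction(2, 27) * Fraction(eta) ** 2 / n ** 5

def error_bound_from_rigidity(c, L, n):
    """Theorem 3, Equation (39): p_error >= 2 / (27 (c L)^2 n^5)."""
    return Fraction(2, 27) / ((Fraction(c) * L) ** 2 * n ** 5)

if __name__ == "__main__":
    c, L, n = 2, 2, 9   # placeholder rigidity bound; L and n as for a 3x3 domino-type basis
    direct = error_bound_from_rigidity(c, L, n)
    composed = error_bound_from_eta(eta_lower_bound(c, L), n)
    print(direct, float(direct), direct == composed)
```

The bound degrades quadratically in both $c$ and $L$ and as the fifth power of $n$, so it is informative mainly for small systems with a good rigidity bound.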

### 4.3 The “pair of tiles” lemma

In this section we present a lemma that serves as our main tool for bounding rigidity.

###### Lemma 3.

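Let $U$ and $V$ be unitary matrices of sizes $m \times m$ and $n \times n$, respectively, and define $|\varphi_i\rangle := U|i\rangle$ for $i \in [m]$ and $|\psi_j\rangle := V|j\rangle$ for $j \in [n]$. Then for any $M \in \mathbb{C}^{m \times n}$ we have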

$$\sqrt{mn} \cdot \max_{i,j} |\langle\varphi_i| M |\psi_j\rangle| \geq \max_{k,l} |M_{kl}|. \tag{40}$$

The main idea of the proof is that a unitary change of basis can change the largest entry of a vector by at most a multiplicative factor that depends on the dimension of the vector.
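The inequality (40) is easy to test numerically. The sketch below does so in plain Python, taking $U$ and $V$ to be discrete Fourier transform matrices; this choice, like the function names, is an assumption made for illustration and is not part of the lemma. With this choice a matrix $M$ that has a single nonzero entry attains equality, so the factor $\sqrt{mn}$ cannot be improved in general.

```python
import cmath
import math

def dft(dim):
    """Unitary discrete Fourier transform matrix of size dim x dim."""
    w = cmath.exp(2j * cmath.pi / dim)
    return [[w ** (r * c) / math.sqrt(dim) for c in range(dim)] for r in range(dim)]

def rotated_entries(U, M, V):
    """Matrix of <phi_i|M|psi_j> = (U^dagger M V)_{ij}, where |phi_i> = U|i>, |psi_j> = V|j>."""
    m, n = len(U), len(V)
    return [[sum(U[k][i].conjugate() * M[k][l] * V[l][j]
                 for k in range(m) for l in range(n))
             for j in range(n)] for i in range(m)]

def max_abs(A):
    """Largest entry of a matrix in absolute value."""
    return max(abs(x) for row in A for x in row)

def pair_of_tiles_gap(U, M, V):
    """sqrt(mn) * max |<phi_i|M|psi_j>| - max |M_kl|; nonnegative by Equation (40)."""
    m, n = len(U), len(V)
    return math.sqrt(m * n) * max_abs(rotated_entries(U, M, V)) - max_abs(M)

if __name__ == "__main__":
    U, V = dft(2), dft(3)
    single_entry = [[1, 0, 0], [0, 0, 0]]
    generic = [[1, -2, 0.5], [3j, 0, 1]]
    print(pair_of_tiles_gap(U, single_entry, V))
    print(pair_of_tiles_gap(U, generic, V))
```

The first printed gap is zero up to floating-point rounding, and the second is nonnegative, as the lemma requires.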

###### Proof.

Let us define a mapping $\operatorname{vec} \colon \mathbb{C}^{m \times n} \to \mathbb{C}^{mn}$ as

$$\operatorname{vec} \colon |i\rangle\langle j| \mapsto |i\rangle|j\rangle \tag{41}$$

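for $i \in [m]$ and $j \in [n]$ and extend it by linearity over $\mathbb{C}$. One can check that $\operatorname{vec}(AXB) = (A \otimes B^{\mathsf{T}}) \operatorname{vec}(X)$. Using this and basic inequalities between the $2$-norm and the $\infty$-norm, we get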

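\begin{align}
\max_{i,j} |\langle\varphi_i| M |\psi_j\rangle|
&= \Bigl\| \operatorname{vec}\Bigl( \sum_{i,j} \langle\varphi_i| M |\psi_j\rangle \, |i\rangle\langle j| \Bigr) \Bigr\|_\infty \tag{42} \\
&= \Bigl\| \operatorname{vec}\Bigl( \sum_{i,j} \langle i| U^\dagger M V |j\rangle \, |i\rangle\langle j| \Bigr) \Bigr\|_\infty \tag{43} \\
&= \bigl\| \operatorname{vec}(U^\dagger M V) \bigr\|_\infty \tag{44} \\
&= \bigl\| (U^\dagger \otimes V^{\mathsf{T}}) \operatorname{vec}(M) \bigr\|_\infty \tag{45} \\
&\geq \frac{1}{\sqrt{mn}} \bigl\| (U^\dagger \otimes V^{\mathsf{T}}) \operatorname{vec}(M) \bigr\|_2 \tag{46} \\
&= \frac{1}{\sqrt{mn}} \bigl\| \operatorname{vec}(M) \bigr\|_2 \tag{47} \\
&\geq \frac{1}{\sqrt{mn}} \bigl\| \operatorname{vec}(M) \bigr\|_\infty \tag{48}
\end{align}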