# Quantifying entanglement with covariance matrices

###### Abstract

Covariance matrices are a useful tool to investigate correlations and entanglement in quantum systems. They are widely used in continuous variable systems, but recently powerful entanglement criteria in terms of covariance matrices have also been derived for finite dimensional systems. We show how these results can be used for the quantification of entanglement in bipartite systems. To that aim we introduce an entanglement parameter that quantifies the violation of the covariance matrix criterion and can be used to give lower bounds on the concurrence. These lower bounds are easily computable and give entanglement estimates for many weakly entangled states.

###### pacs:

03.67.-a, 03.65.Ud

## I Introduction

Entanglement is a central resource in quantum information processing and many works are devoted to its characterization hororeview (); witreview (); plenioreview (). One line of research is the derivation of entanglement criteria which should detect the entanglement also of weakly entangled states. A different line of research tries to quantify the entanglement via so-called entanglement measures plenioreview (). As most entanglement measures are defined via complex optimization procedures, they are often difficult to compute and therefore one tries to give at least lower bounds on them chenbounds (); otherconvexbounds (); generallowerbounds ().

Covariance matrices (CMs) of local observables are a widely used tool to study correlations in continuous variable systems, such as coupled harmonic oscillators or modes of light loockreview (). Moreover, many entanglement criteria for these systems are formulated as conditions on CMs cmgauss (). Recently, it has been shown that CMs are also a useful tool for the investigation of entanglement in discrete systems, such as polarized photons or trapped ions oldprl (); wir (); wirPRA (). Indeed, in Refs. wir (); wirPRA () a so-called covariance matrix criterion (CMC) has been established, which allows one to detect many weakly entangled states that are not detected by other criteria.

In this paper we show that CMs can be used not only to detect but also to quantify entanglement in discrete composite quantum systems. To that aim, we define an entanglement parameter \mathcal{E}(\varrho), that quantifies the violation of the CMC for discrete systems. Our construction is inspired by a similar definition of an entanglement parameter for continuous variable systems in Ref. giedkecirac (). While it remains unclear to which extent \mathcal{E}(\varrho) is a direct entanglement measure, we will see, however, that \mathcal{E}(\varrho) gives a lower bound on the concurrence, which is a widely used entanglement monotone.

In detail, our paper is organized as follows: In Section II we recall the CMC and define our entanglement parameter \mathcal{E}. We also demonstrate that existing results about the CMC directly give lower bounds on \mathcal{E}. In Section III we consider general properties of the parameter \mathcal{E}. We show that it is convex and invariant under local rotations, but we present an example where \mathcal{E} increases on average under LOCC. The physical reason behind this example is the fact that there are entangled states which cannot be detected by the CMC but are detected after suitable local filtering operations. In Section IV we investigate how \mathcal{E} can be computed for different types of states. We give explicit formulas for pure states, and also show how to compute \mathcal{E} for the family of mixed, Schmidt-correlated states. In Section V we show how \mathcal{E} can be used as a lower bound on the concurrence. This delivers non-trivial bounds on the concurrence for many weakly entangled states. Finally, we conclude the paper and give technical calculations for some of our theorems in the Appendix.

## II Definition of the entanglement parameter

In this section we introduce a function based on the CMC that can be used to estimate the amount of entanglement in a given quantum state. To start, let us fix the notation and introduce the quantities, which we are going to work with. We consider quantum states \varrho over a finite-dimensional bipartite Hilbert space \mathcal{H}_{A}\otimes\mathcal{H}_{B}, where d_{A}=\mbox{dim}(\mathcal{H}_{A}) and d_{B}=\mbox{dim}(\mathcal{H}_{B}) denote dimensions of corresponding local spaces. Physical observables are described by Hermitian operators. For our purpose we choose a complete set of orthogonal observables A_{i} on \mathcal{H}_{A} with i=1,...,d_{A}^{2} and \mbox{Tr}(A_{i}A_{j})=\delta_{ij} and a similar set B_{j} for \mathcal{H}_{B}. We will refer to them as local orthogonal observables, an example for d_{A}=2 are the (appropriately normalized) Pauli matrices and the identity. Then, we can consider observables on \mathcal{H}_{A}\otimes\mathcal{H}_{B} defined by

\{M_{\alpha}\}=\{A_{i}\otimes\mathbbm{1},\mathbbm{1}\otimes B_{j}\},\quad i=1,\dots,d_{A}^{2},\quad j=d_{A}^{2}+1,\dots,d_{A}^{2}+d_{B}^{2}, | (1) |

which then also obey \mbox{Tr}(M_{\alpha}M_{\beta})=\delta_{\alpha\beta}.

The main object of our studies will be covariance matrices (CMs). A CM of a given bipartite state \varrho is defined by the following entries

\gamma(\varrho)_{\alpha\beta}=\frac{1}{2}\langle M_{\alpha}M_{\beta}+M_{\beta}M_{\alpha}\rangle_{\varrho}-\langle M_{\alpha}\rangle_{\varrho}\langle M_{\beta}\rangle_{\varrho}. | (2) |

Choosing the observables as in Eq. (1) one can write the CM in a handy block form

\gamma=\begin{pmatrix}A&C\\ C^{T}&B\end{pmatrix}, | (3) |

where A=\gamma(\varrho_{A},\{A_{i}\}) and B=\gamma(\varrho_{B},\{B_{i}\}) are the CMs of the reduced density matrices and C_{ij}=\langle A_{i}\otimes B_{j}\rangle_{\varrho}-\langle A_{i}\rangle_{\varrho}\langle B_{j}\rangle_{\varrho} denotes the correlations between the two parties.
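The construction of Eqs. (1)-(3) can be illustrated numerically. The following is a minimal sketch in plain Python, not taken from the paper: the helper names (`matmul`, `kron`, `expval`, `covariance_matrix`) are hypothetical, and the particular choice of local orthogonal observables for one qubit (two diagonal projectors plus normalized symmetric and antisymmetric off-diagonal operators) is an assumption made for illustration. It builds the CM of the state \sqrt{\lambda_{1}}|00\rangle+\sqrt{\lambda_{2}}|11\rangle and reads off the blocks A, B and C.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product of two nested-list matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def expval(op, rho):
    """Real part of Tr(op rho)."""
    prod = matmul(op, rho)
    return sum(prod[i][i] for i in range(len(prod))).real

def covariance_matrix(rho, obs):
    """Symmetrized covariance matrix with entries as in Eq. (2)."""
    means = [expval(m, rho) for m in obs]
    n = len(obs)
    return [[0.5 * (expval(matmul(obs[a], obs[b]), rho)
                    + expval(matmul(obs[b], obs[a]), rho))
             - means[a] * means[b]
             for b in range(n)] for a in range(n)]

s = 1 / math.sqrt(2)
ID = [[1, 0], [0, 1]]
# Assumed local orthogonal observables for one qubit, Tr(G_i G_j) = delta_ij
LOCAL = [
    [[1, 0], [0, 0]],             # |0><0|
    [[0, 0], [0, 1]],             # |1><1|
    [[0, s], [s, 0]],             # (|0><1| + |1><0|)/sqrt(2)
    [[0, 1j * s], [-1j * s, 0]],  # (i|0><1| - i|1><0|)/sqrt(2)
]
# Eq. (1): Alice's observables first, then Bob's
M = [kron(g, ID) for g in LOCAL] + [kron(ID, g) for g in LOCAL]

lam1 = 0.8
lam2 = 1 - lam1
psi = [math.sqrt(lam1), 0.0, 0.0, math.sqrt(lam2)]  # basis |00>,|01>,|10>,|11>
rho = [[x * y for y in psi] for x in psi]

gamma = covariance_matrix(rho, M)
A = [row[:4] for row in gamma[:4]]   # block of Alice's reduced state
C = [row[4:] for row in gamma[:4]]   # correlation block
B = [row[4:] for row in gamma[4:]]   # block of Bob's reduced state
```

With this choice of observables the blocks reproduce the structure that is later given explicitly for pure two-qubit states, for instance A_{11}=\lambda_{1}-\lambda_{1}^{2} and correlation entries \pm\sqrt{\lambda_{1}\lambda_{2}} for the off-diagonal observables.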

Before introducing the function that we are going to use for entanglement quantification let us state the covariance matrix criterion (CMC). For that, recall that a state is separable, if it can be written as a convex combination of product states, i.e. \varrho=\sum_{k}p_{k}|a_{k}\rangle\langle a_{k}|\otimes|b_{k}\rangle\langle b_{k}| with some probabilities p_{k}. Then we have:

###### Theorem 1 (Covariance matrix criterion).

Let \varrho be a separable bipartite state. Then there exist pure states |\psi_{k}\rangle\langle\psi_{k}| in \mathcal{H}_{A} and |\phi_{k}\rangle\langle\phi_{k}| in \mathcal{H}_{B} and convex weights p_{k} such that if we define \kappa_{A}=\sum_{k}p_{k}\gamma(|\psi_{k}\rangle\langle\psi_{k}|) and \kappa_{B}=\sum_{k}p_{k}\gamma(|\phi_{k}\rangle\langle\phi_{k}|) the inequality

\gamma(\varrho,\{M_{i}\})\geq\kappa_{A}\oplus\kappa_{B}\Longleftrightarrow\begin{pmatrix}A&C\\ C^{T}&B\end{pmatrix}\geq\begin{pmatrix}\kappa_{A}&0\\ 0&\kappa_{B}\end{pmatrix} | (4) |

holds. This means that the difference between left and right hand side must be positive-semidefinite. If there are no such \kappa_{A,B} then the state \varrho must be entangled.

The proof of this statement can be found in Ref. wir (). The main task for applying the CMC is its evaluation, that is, the characterization of the matrices \kappa_{A} and \kappa_{B}. For this, several corollaries of the CMC have been derived in Refs. wir (); wirPRA (). As we will use them later, we present some of them here, but without any proof. For simplicity, we only consider the case d=d_{A}=d_{B}.

###### Proposition 2 (CMC evaluated from traces).

Let \varrho be a state with CM \gamma as in Eq. (3). Then if \varrho is separable, we have

2\mbox{Tr}(|C|)\leq\Big(\sum_{i=1}^{d^{2}}A_{ii}-d+1\Big)+\Big(\sum_{i=1}^{d^{2}}B_{ii}-d+1\Big)=\big[1-\mbox{Tr}(\varrho_{A}^{2})\big]+\big[1-\mbox{Tr}(\varrho_{B}^{2})\big]. | (5) |

If this inequality is violated, then \varrho must be entangled.

###### Proposition 3 (CMC and the trace norm of C).

Let \varrho be a state with CM \gamma as in Eq. (3). Then if \varrho is separable, we have for the trace norm of C

\|C\|_{\rm tr}^{2}\leq\big[1-\mbox{Tr}(\varrho_{A}^{2})\big]\big[1-\mbox{Tr}(\varrho_{B}^{2})\big]. | (6) |

If this inequality is violated, then \varrho must be entangled.

In order to define our entanglement parameter \mathcal{E}, let us reformulate the CMC in a slightly different way. Imagine some state \varrho is detected as entangled by the CMC. On the one hand there exist no \kappa_{A} and \kappa_{B} as above such that \gamma(\varrho)-\kappa_{A}\oplus\kappa_{B}\geq 0. On the other hand we can surely find \kappa^{e}_{A} and \kappa^{e}_{B} and some number t_{e}\in[0,1] such that \gamma(\varrho)-t_{e}\kappa^{e}_{A}\oplus\kappa^{e}_{B} is again positive semidefinite:

\gamma(\varrho)-t_{e}\kappa^{e}_{A}\oplus\kappa^{e}_{B}\geq 0. | (7) |

In the worst case, we can fulfill this inequality by choosing t_{e}=0. For a state that is not detected by the CMC (e.g. a separable state) the parameter t can be chosen to be at least one, or even larger than that.

Implementing this idea in Theorem 1 results in an alternative formulation of the CMC:

###### Theorem 4 (Parameterized CMC).

Let \varrho be a bipartite state. Assume that we choose pure states |\psi_{k}\rangle\langle\psi_{k}| on \mathcal{H}_{A} and |\phi_{k}\rangle\langle\phi_{k}| on \mathcal{H}_{B} such that \kappa^{o}_{A}=\sum_{k}p_{k}\gamma(|\psi_{k}\rangle\langle\psi_{k}|) and \kappa^{o}_{B}=\sum_{k}p_{k}\gamma(|\phi_{k}\rangle\langle\phi_{k}|) are optimal in the sense that

\gamma-t_{o}\kappa^{o}_{A}\oplus\kappa^{o}_{B}\geq 0, | (8) |

for some 0\leq t_{o}\leq 1, but

\gamma-t\kappa_{A}\oplus\kappa_{B}\ngeq 0,\mbox{ for all }t>t_{o}\mbox{ and all }\kappa_{A},\kappa_{B}. | (9) |

Then if the state \varrho is separable there exist \kappa^{o}_{A} and \kappa^{o}_{B} such that

\max_{t}\{t\leq 1:\gamma-t\kappa^{o}_{A}\oplus\kappa^{o}_{B}\geq 0\}=1, | (10) |

otherwise the state is entangled.

This suggests using the parameter t_{o} as an entanglement parameter for entangled states. More precisely, we can define:

###### Definition 5 (Entanglement parameter).

Let \varrho be a bipartite quantum state with CM \gamma(\varrho). We define a function V(\varrho) as

V(\varrho)=\max_{t,\kappa_{A},\kappa_{B}}\{t\leq 1:\gamma(\varrho)-t\kappa_{A}\oplus\kappa_{B}\geq 0\}. | (11) |

The entanglement parameter \mathcal{E}(\varrho) is then defined as

\mathcal{E}(\varrho)=1-V(\varrho). | (12) |

The parameter \mathcal{E}(\varrho) vanishes for separable states and is larger than zero for all states that are detected by the CMC. This function \mathcal{E}(\varrho) is the main topic of study in this paper and, as we shall see later, can be used to quantify entanglement in quantum states. A similar function has already been used to quantify entanglement in infinite dimensional systems, namely for Gaussian states giedkecirac (). There, this parameter turned out to be an entanglement monotone for special operations on special states.

Interestingly, using the parameterized version of the CMC (Theorem 4) and Propositions 2 and 3 one can immediately give a lower bound on \mathcal{E}(\varrho). We can formulate:

###### Proposition 6 (Bounds on \mathcal{E}(\varrho)).

Assuming that d=d_{A}=d_{B} we have in the situation from above that

\mathcal{E}(\varrho)\geq\frac{\mbox{Tr}(\varrho_{A}^{2})+\mbox{Tr}(\varrho_{B}^{2})+2\mbox{Tr}(|C|)-2}{2d-2} | (13) |

and

\mathcal{E}(\varrho)\geq\frac{1}{d-1}\Big\{\frac{\mbox{Tr}(\varrho_{A}^{2})+\mbox{Tr}(\varrho_{B}^{2})-2}{2}+\sqrt{\tfrac{1}{4}[\mbox{Tr}(\varrho_{A}^{2})-\mbox{Tr}(\varrho_{B}^{2})]^{2}+\|C\|_{\rm tr}^{2}}\Big\}. | (14) |

Proof. For the first case, a calculation as in Ref. wirPRA () gives a parameterized version of Proposition 2 and results in 2\mbox{Tr}(|C|)\leq\mbox{Tr}(A+B-t(\kappa_{A}+\kappa_{B})). Using \mbox{Tr}(\gamma(\varrho))=d-\mbox{Tr}(\varrho^{2}) (see Ref. wirPRA ()), which for the convex combinations of pure-state CMs implies \mbox{Tr}(\kappa_{A})=\mbox{Tr}(\kappa_{B})=d-1, gives

t\leq\frac{2d-\mbox{Tr}(\varrho_{A}^{2})-\mbox{Tr}(\varrho_{B}^{2})-2\mbox{Tr}(|C|)}{2d-2} | (15) |

and finally Eq. (13). Eq. (14) can be also directly derived from Eq. (6) and from the calculations in Ref. wirPRA (). \hfill\blacksquare
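As a quick plausibility check of Proposition 6, the two bounds can be evaluated for a pure two-qubit state \sqrt{\lambda_{1}}|00\rangle+\sqrt{\lambda_{2}}|11\rangle. The sketch below is illustrative only: the function names are hypothetical, and the value \mbox{Tr}(|C|)=2\lambda_{1}\lambda_{2}+2\sqrt{\lambda_{1}\lambda_{2}} is an assumption obtained from the correlation block of such a state as written out in Sec. IV.1.

```python
import math

def bound_trace(purity_a, purity_b, trace_abs_c, d):
    """Right-hand side of Eq. (13)."""
    return (purity_a + purity_b + 2 * trace_abs_c - 2) / (2 * d - 2)

def bound_trace_norm(purity_a, purity_b, trace_norm_c, d):
    """Right-hand side of Eq. (14)."""
    mean_term = (purity_a + purity_b - 2) / 2
    root_term = math.sqrt(0.25 * (purity_a - purity_b) ** 2 + trace_norm_c ** 2)
    return (mean_term + root_term) / (d - 1)

# Pure two-qubit state sqrt(l1)|00> + sqrt(l2)|11>
l1, l2 = 0.8, 0.2
purity = l1 ** 2 + l2 ** 2                   # Tr(rho_A^2) = Tr(rho_B^2)
# Assumed trace norm of the correlation block C for this state
tr_c = 2 * l1 * l2 + 2 * math.sqrt(l1 * l2)

b13 = bound_trace(purity, purity, tr_c, d=2)
b14 = bound_trace_norm(purity, purity, tr_c, d=2)
```

For this family both expressions evaluate to 2\sqrt{\lambda_{1}\lambda_{2}}, so for equal local purities the two bounds coincide.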

## III Properties of the entanglement parameter \mathcal{E}

In this section we investigate general properties of the function \mathcal{E}(\varrho). Since the function \mathcal{E}(\varrho) should be used to quantify entanglement in a given quantum state, two of the properties that have to be fulfilled are that it is convex and does not change under local unitary transformations. Indeed, this is the case:

###### Lemma 7 (Convexity and invariance under local unitary transformations).

The entanglement parameter \mathcal{E}(\varrho) is invariant under local unitary transformations and is convex in the state, that is for \varrho=p\varrho_{1}+(1-p)\varrho_{2} we have that \mathcal{E}(\varrho)\leq p\mathcal{E}(\varrho_{1})+(1-p)\mathcal{E}(\varrho_{2}).

Proof: The invariance under local unitary transformations follows simply from the fact that the CMC is invariant under such transformations wir (); wirPRA (). In more detail, such transformations map a set of local orthogonal observables to another set of local orthogonal observables, and the CMC does not depend on the choice of the observables.

Concerning convexity, it is sufficient to prove the concavity of V(\varrho), i.e. that for any state \varrho=p\varrho_{1}+(1-p)\varrho_{2} the inequality V(\varrho)=\tilde{t}\geq p\tilde{t}_{1}+(1-p)\tilde{t}_{2}\equiv t^{\prime} holds, where \tilde{t}_{1}=V(\varrho_{1}) and \tilde{t}_{2}=V(\varrho_{2}).

To prove this we exploit the connection between the CMC and local uncertainty relations (LURs) lurs (). V(\varrho)=\tilde{t} implies that the parameterized CMC criterion is fulfilled and there exist \kappa_{A}, \kappa_{B} and \tilde{t} such that \gamma(\varrho)-\tilde{t}\kappa_{A}\oplus\kappa_{B}\geq 0. According to the Proposition V.2 in Ref. wirPRA () this means that if we take arbitrary local observables {A}_{k}\otimes\mathbbm{1} and \mathbbm{1}\otimes{B}_{k} on Alice’s and Bob’s side and define positive constants U_{A}=\min_{\varrho}\sum_{k}\delta^{2}({A}_{k}) and U_{B}=\min_{\varrho}\sum_{k}\delta^{2}({B}_{k}) then

\sum_{k}\delta^{2}\left({A}_{k}\otimes\mathbbm{1}+\mathbbm{1}\otimes{B}_{k}\right)_{\varrho}\geq\tilde{t}\left(U_{A}+U_{B}\right). | (16) |

Therefore it suffices to show that t^{\prime} fulfills the last inequality as well. Due to the concavity of the variance we can write

\begin{aligned}
\sum_{k}\delta^{2}\left(A_{k}\otimes\mathbbm{1}+\mathbbm{1}\otimes B_{k}\right)_{\varrho}\geq{}&p\sum_{k}\delta^{2}\left(A_{k}\otimes\mathbbm{1}+\mathbbm{1}\otimes B_{k}\right)_{\varrho_{1}}\\
&+(1-p)\sum_{k}\delta^{2}\left(A_{k}\otimes\mathbbm{1}+\mathbbm{1}\otimes B_{k}\right)_{\varrho_{2}}.
\end{aligned} | (17) |

Since the states \varrho_{1} and \varrho_{2} both fulfill the CMC with the parameters \tilde{t}_{1} and \tilde{t}_{2} we can write

\begin{aligned}
&p\sum_{k}\delta^{2}\left(A_{k}\otimes\mathbbm{1}+\mathbbm{1}\otimes B_{k}\right)_{\varrho_{1}}+(1-p)\sum_{k}\delta^{2}\left(A_{k}\otimes\mathbbm{1}+\mathbbm{1}\otimes B_{k}\right)_{\varrho_{2}}\\
&\qquad\geq\left[\tilde{t}_{1}p+\tilde{t}_{2}(1-p)\right]\left(U_{A}+U_{B}\right).
\end{aligned} | (18) |

Combining (17) and (18) shows that t^{\prime} fulfills Eq. (16). Since \tilde{t} is defined as the maximal value of all possible t, we have \tilde{t}\geq t^{\prime}, which finishes the proof. \hfill\blacksquare

A further important property of entanglement measures is that they do not increase under local operations assisted by classical communication (LOCC). This condition can be demanded in two different forms (see Refs. hororeview (); plenioreview (); loccoa ()): Minimally, one requires that if \hat{\varrho} arises from \varrho via some LOCC transformation, then E(\varrho)\geq E(\hat{\varrho}) holds. Often, however, a stronger condition is required and fulfilled, namely that E(\varrho) should not increase under LOCC operations on average. This means that if an LOCC protocol maps \varrho onto some states \varrho_{i} with probabilities p_{i}, then

E(\varrho)\geq\sum_{i}p_{i}E(\varrho_{i}), | (19) |

should hold.

In the following, we will show by an example that \mathcal{E}(\varrho) can increase on average under LOCC operations. This does not exclude a priori the usability of \mathcal{E}(\varrho) as an entanglement monotone (since the minimal requirement might still hold), however, it is a hint that \mathcal{E}(\varrho) might not be an entanglement measure. As we will see later, however, \mathcal{E}(\varrho) can be very useful to derive lower bounds on the concurrence for mixed states.

###### Lemma 8 (Increasing on average under LOCC).

There exists a two-qubit state \varrho and an LOCC-protocol, such that \mathcal{E} increases on average from zero to a positive value under this protocol.

Proof. We prove the statement by providing an explicit example of a two-qubit state, which can be found numerically. The idea to find such an example is as follows: We consider a family of states that was already intensively investigated in Refs. RudolphCCNC2003pra (); wirPRA (). Within this family one can find pairs of states \varrho and \varrho^{\prime} with the same covariance matrix but where \varrho is entangled, while \varrho^{\prime} is not. Hence, \varrho cannot be detected by the CMC criterion, and \mathcal{E}(\varrho) has to vanish.

It was shown in Refs. wir (); wirPRA (), however, that after an appropriate filtering operation

\varrho\mapsto\varrho_{\rm filt}={F_{A}\otimes F_{B}\varrho F_{A}^{\dagger}\otimes F_{B}^{\dagger}} | (20) |

any entangled two-qubit state can be detected by the CMC. Hence \mathcal{E}(\varrho_{\rm filt})>0 and the filtering operation will give rise to the desired LOCC operation.

To be more concrete, a numerical example of the aforementioned state \varrho is

\varrho=\left(\begin{array}[]{cccc}0.48508&0&0&0.02094\\ 0&0.33&0&0\\ 0&0&0.00067&0\\ 0.02094&0&0&0.18425\end{array}\right), | (21) |

which is not detected by the CMC (see wirPRA ()) but which is clearly NPT and hence entangled. The corresponding filter operations are

F_{A}=\left(\begin{array}{cc}0.16457&0\\ 0&0.98637\end{array}\right),\quad F_{B}=\left(\begin{array}{cc}0.96526&0\\ 0&0.26128\end{array}\right). | (22) |

The final state \varrho_{\rm filt} after filtering will be

\varrho_{\rm filt}=\frac{F_{A}\otimes F_{B}\,\varrho\,F_{A}\otimes F_{B}}{\mbox{Tr}\left(F_{A}\otimes F_{B}\,\varrho\,F_{A}\otimes F_{B}\right)}=\left(\begin{array}{cccc}0.47636&0&0&0.03336\\ 0&0.02375&0&0\\ 0&0&0.02364&0\\ 0.03336&0&0&0.47626\end{array}\right) | (23) |

and is detected by the CMC, hence \mathcal{E}(\varrho_{\rm filt})>0. Since \varrho is not detected, we have \mathcal{E}(\varrho)=0.
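The numbers in Eqs. (21)-(23) can be cross-checked with a few lines of plain Python. This is only a sketch under the assumption that the matrices are written in the basis |00\rangle,|01\rangle,|10\rangle,|11\rangle; the variable names are hypothetical. Because the filters are diagonal, the filtered entries are simply \varrho_{ij}f_{i}f_{j}, and for this X-shaped state the partial transpose can be tested through a single 2\times 2 determinant. The sketch does not evaluate the CMC itself.

```python
# State of Eq. (21), assumed basis ordering |00>, |01>, |10>, |11>
rho = [
    [0.48508, 0.0, 0.0, 0.02094],
    [0.0, 0.33, 0.0, 0.0],
    [0.0, 0.0, 0.00067, 0.0],
    [0.02094, 0.0, 0.0, 0.18425],
]
f_a = [0.16457, 0.98637]   # diagonal of F_A, Eq. (22)
f_b = [0.96526, 0.26128]   # diagonal of F_B, Eq. (22)

# Diagonal of F_A (x) F_B in the same basis ordering
f = [fa * fb for fa in f_a for fb in f_b]

# Unnormalized filtered state: each entry scales as rho_ij * f_i * f_j
unnorm = [[rho[i][j] * f[i] * f[j] for j in range(4)] for i in range(4)]
p_filt = sum(unnorm[i][i] for i in range(4))   # success probability
rho_filt = [[x / p_filt for x in row] for row in unnorm]

# Partial transposition on Bob moves the |00><11| coherence into the
# {|01>, |10>} block; a negative determinant of that block means the
# partial transpose has a negative eigenvalue (NPT).
npt_det = rho[1][1] * rho[2][2] - rho[0][3] ** 2
```

The resulting `rho_filt` agrees with Eq. (23) to the quoted precision, `p_filt` is about 0.0257, and `npt_det` is negative, consistent with the statement that \varrho is NPT.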

Using the filter operations F_{A} and F_{B} we can now construct a POVM type of measurements for Alice and Bob. The complementary operations are given by

F^{c}_{A}=\left(\mathbbm{1}-F_{A}F_{A}\right)^{\frac{1}{2}}=\left(\begin{array}{cc}0.98637&0\\ 0&0.16457\end{array}\right),\quad F^{c}_{B}=\left(\mathbbm{1}-F_{B}F_{B}\right)^{\frac{1}{2}}=\left(\begin{array}{cc}0.26128&0\\ 0&0.96526\end{array}\right). | (24) |

With these operations we establish an LOCC protocol with four different outcomes

\begin{aligned}
\varrho_{1}\equiv\varrho_{\rm filt}&\mbox{ with probability }p_{1}=0.02570,\\
\varrho_{2}&\mbox{ with probability }p_{2}=0.17629,\\
\varrho_{3}&\mbox{ with probability }p_{3}=0.46200,\\
\varrho_{4}&\mbox{ with probability }p_{4}=0.33601.
\end{aligned} | (25) |

Important for us is the fact that applying this protocol to a state with \mathcal{E}(\varrho)=0 we obtain, with non-zero probability, a state \varrho_{\rm filt} with \mathcal{E}(\varrho_{\rm filt})>0. Therefore 0=\mathcal{E}(\varrho)<\sum_{i=1}^{4}p_{i}\mathcal{E}(\varrho_{i}), and \mathcal{E}(\varrho) increases on average under LOCC. \hfill\blacksquare
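The outcome probabilities of Eq. (25) follow from the diagonal of the state in Eq. (21) alone, since all four Kraus operators are diagonal. The sketch below is an illustration with hypothetical names; the assignment of outcomes (p_{2} to F_{A}\otimes F^{c}_{B}, p_{3} to F^{c}_{A}\otimes F_{B}, p_{4} to F^{c}_{A}\otimes F^{c}_{B}) is an assumption that happens to reproduce the quoted values.

```python
import math

rho_diag = [0.48508, 0.33, 0.00067, 0.18425]   # diagonal of Eq. (21)
f_a = [0.16457, 0.98637]                        # diagonal of F_A
f_b = [0.96526, 0.26128]                        # diagonal of F_B
# Complementary filters (1 - F^2)^(1/2), Eq. (24)
fc_a = [math.sqrt(1 - x * x) for x in f_a]
fc_b = [math.sqrt(1 - x * x) for x in f_b]

def outcome_probability(ka, kb):
    """Tr[(K_A (x) K_B) rho (K_A (x) K_B)] for diagonal Kraus operators."""
    weights = [a * a * b * b for a in ka for b in kb]
    return sum(w * r for w, r in zip(weights, rho_diag))

probs = [
    outcome_probability(f_a, f_b),     # p_1: both filters succeed
    outcome_probability(f_a, fc_b),    # p_2 (assumed assignment)
    outcome_probability(fc_a, f_b),    # p_3 (assumed assignment)
    outcome_probability(fc_a, fc_b),   # p_4 (assumed assignment)
]
```

The four probabilities sum to one because F^{2}+(F^{c})^{2}=\mathbbm{1} on each side, so the four operators form a valid measurement.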

Note that for the provided example one can check the separability of the state \tilde{\varrho}=\sum_{i}p_{i}\varrho_{i} as this state has a positive partial transpose and is therefore separable. Consequently, the protocol given is not a counterexample to the LOCC condition of the first kind.

## IV Evaluation of \mathcal{E}(\varrho) for pure and Schmidt-correlated states

In this section we compute \mathcal{E}(\varrho) for pure states and a family of mixed states. We start with the case of two qubits and then generalize it to d-dimensional systems.

### IV.1 Pure states of two qubits

Using the relations that can be found in Appendix A, it is straightforward to calculate the CM of a two-qubit state |\psi\rangle=\sqrt{\lambda_{1}}|00\rangle+\sqrt{\lambda_{2}}|11\rangle with \lambda_{1}+\lambda_{2}=1. The CM will have the familiar block form

\gamma(|\psi\rangle)=\left(\begin{array}[]{cc}A&C\\ C^{T}&B\end{array}\right). | (26) |

with

A=B=\left(\begin{array}{cccc}\lambda_{1}-\lambda_{1}^{2}&-\lambda_{1}\lambda_{2}&0&0\\ -\lambda_{1}\lambda_{2}&\lambda_{2}-\lambda_{2}^{2}&0&0\\ 0&0&\frac{1}{2}&0\\ 0&0&0&\frac{1}{2}\end{array}\right),

C=\left(\begin{array}{cccc}\lambda_{1}-\lambda_{1}^{2}&-\lambda_{1}\lambda_{2}&0&0\\ -\lambda_{1}\lambda_{2}&\lambda_{2}-\lambda_{2}^{2}&0&0\\ 0&0&\sqrt{\lambda_{1}\lambda_{2}}&0\\ 0&0&0&-\sqrt{\lambda_{1}\lambda_{2}}\end{array}\right). | (27) |

The next step in the calculation of the parameter t and therefore of the function \mathcal{E}(\varrho) is to find the optimal \kappa_{A}\oplus\kappa_{B}. In the two-qubit case we first guess the correct solution and then prove its optimality.

To construct the matrix \kappa_{A}\oplus\kappa_{B} we take two product states |00\rangle\langle 00| and |11\rangle\langle 11| and get \kappa_{A}\oplus\kappa_{B}=\frac{1}{2}{\rm diag}\{0,0,1,1,0,0,1,1\}. Then we calculate V(|\psi\rangle) from the condition \gamma-t\kappa_{A}\oplus\kappa_{B}\geq 0. This matrix is positive semidefinite iff 1-t\geq 2\sqrt{\lambda_{1}\lambda_{2}} and therefore for any

t\leq 1-2\sqrt{\lambda_{1}\lambda_{2}} | (28) |

we can find \kappa_{A} and \kappa_{B} such that \gamma-t\kappa_{A}\oplus\kappa_{B}\geq 0 holds.

Note that taking some particular expansion for \kappa_{A}\oplus\kappa_{B}, strictly speaking, does not provide any information about the entanglement, except for the case when we are able to find \kappa_{A}\oplus\kappa_{B} such that \gamma-t\kappa_{A}\oplus\kappa_{B}\geq 0 for some t\geq 1. Then the state is not detected by the CMC and \mathcal{E}(\varrho)=0. However, we can use the Proposition 2 to prove the following:

###### Lemma 9.

The upper bound on the parameter t for two qubits provided in Eq. (28) is tight.

Proof: Directly applying the relation (15) to the two-qubit case we have t\leq 1-2\sqrt{\lambda_{1}\lambda_{2}}, which coincides with (28) and therefore gives an optimal bound on parameter t. Indeed, on the one hand, it follows immediately from (28) that if t\leq 1-2\sqrt{\lambda_{1}\lambda_{2}} then we can find a decomposition \kappa_{A}\oplus\kappa_{B} such that \gamma-t\kappa_{A}\oplus\kappa_{B}\geq 0 holds. On the other hand, the condition (15) implies that for all t, with t>1-2\sqrt{\lambda_{1}\lambda_{2}} and for all \kappa_{A} and \kappa_{B} the relation \gamma-t\kappa_{A}\oplus\kappa_{B}\ngeq 0 holds. \hfill\blacksquare

According to the last Lemma the function \mathcal{E}(|\psi\rangle) can be calculated exactly for two-qubit pure states as

\mathcal{E}(|\psi\rangle)=2\sqrt{\lambda_{1}\lambda_{2}}. | (29) |
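The threshold in Eq. (28) can be made explicit with a small sketch. It relies on an assumption read off from Eq. (27) and the chosen \kappa_{A}\oplus\kappa_{B}: the only blocks of \gamma-t\kappa_{A}\oplus\kappa_{B} that depend on t are 2\times 2 blocks with diagonal entries (1-t)/2 and off-diagonal entries \pm\sqrt{\lambda_{1}\lambda_{2}}, whose smallest eigenvalue is (1-t)/2-\sqrt{\lambda_{1}\lambda_{2}}. The function names are hypothetical.

```python
import math

def min_eigenvalue_xy_block(lam1, t):
    """Smallest eigenvalue of [[(1-t)/2, s], [s, (1-t)/2]] with
    s = sqrt(lam1 * lam2); the sign of s does not affect the result."""
    s = math.sqrt(lam1 * (1 - lam1))
    return (1 - t) / 2 - s

def entanglement_parameter_two_qubits(lam1):
    """Eq. (29): E = 2 sqrt(lam1 * lam2) with lam2 = 1 - lam1."""
    return 2 * math.sqrt(lam1 * (1 - lam1))

lam1 = 0.9
# Largest admissible t, i.e. Eq. (28) taken with equality
t_opt = 1 - entanglement_parameter_two_qubits(lam1)

at_threshold = min_eigenvalue_xy_block(lam1, t_opt)          # approximately 0
above_threshold = min_eigenvalue_xy_block(lam1, t_opt + 0.01)  # negative
below_threshold = min_eigenvalue_xy_block(lam1, t_opt - 0.01)  # positive
```

For \lambda_{1}=0.9 this gives a threshold t of about 0.4, that is \mathcal{E} of about 0.6, and for \lambda_{1}=1/2 the parameter reaches its maximal value 1.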

### IV.2 Pure states of two qudits

To estimate the parameter t for a pure state of two d-level systems, we follow the same strategy as in the two-qubit case and take the states |kk\rangle\langle kk| for the decomposition of \kappa_{A}\oplus\kappa_{B} in order to derive the upper bound on the parameter t. We make the ansatz

\kappa_{A}\oplus\kappa_{B}=\sum_{i=1}^{d}p_{i}\gamma(|ii\rangle) | (30) |

with some probabilities p_{i}.

The positive semi-definiteness of the matrix \gamma-t\kappa_{A}\oplus\kappa_{B} then implies the positive semi-definiteness of 2\times 2 blocks of the type

X^{ij}_{2\times 2}=\left(\begin{array}{cc}\lambda_{i}+\lambda_{j}-t(p_{i}+p_{j})&\pm 2\sqrt{\lambda_{i}\lambda_{j}}\\ \pm 2\sqrt{\lambda_{i}\lambda_{j}}&\lambda_{i}+\lambda_{j}-t(p_{i}+p_{j})\end{array}\right) | (31) |

for all i<j. Therefore, if for all i<j

t\leq\frac{\left(\sqrt{\lambda_{i}}-\sqrt{\lambda_{j}}\right)^{2}}{p_{i}+p_{j}} | (32) |

holds, then we can find \kappa_{A} and \kappa_{B} such that \gamma-t\kappa_{A}\oplus\kappa_{B}\geq 0 holds. To achieve the goal and calculate the function \mathcal{E}(|\psi\rangle) we need to prove that the choice of the expansion of the \kappa_{A}\oplus\kappa_{B} in Eq. (30) was optimal.

###### Lemma 10 (Optimality of the decomposition).

The optimal expansion for \kappa_{A}\oplus\kappa_{B} can always be written in a form of the Eq. (30):

\kappa_{A}^{opt}\oplus\kappa_{B}^{opt}=\sum_{i=1}^{d}p_{i}\gamma(|ii\rangle). | (33) |

Proof. First, we show that for pure states in Schmidt decomposition \gamma(|\psi\rangle)-t\kappa_{A}\oplus\kappa_{B}\geq 0 is equivalent to \gamma(|\psi\rangle)-t\kappa\oplus\kappa\geq 0, for some \kappa, which can be found explicitly. This \kappa can be constructed by choosing the product states in a proper way. Indeed, note that since the CM of a state in Schmidt decomposition is symmetric with respect to the interchange of the parties (A\leftrightarrow B) the relation \gamma(|\psi\rangle)-t\kappa_{B}\oplus\kappa_{A}\geq 0 must hold as well. So let \kappa_{A}=\sum_{k=1}^{K}p_{k}\gamma(|a_{k}\rangle\langle a_{k}|) and \kappa_{B}=\sum_{k=1}^{K}p_{k}\gamma(|b_{k}\rangle\langle b_{k}|). Then we have

\gamma(|\psi\rangle)-\frac{t}{2}\left(\kappa_{A}\oplus\kappa_{B}+\kappa_{B}\oplus\kappa_{A}\right)\geq 0. | (34) |

Since \kappa_{A}\oplus\kappa_{B}+\kappa_{B}\oplus\kappa_{A}=\kappa_{A}\oplus\kappa_{A}+\kappa_{B}\oplus\kappa_{B}, the appropriate choice of the product states is

|\eta_{k}\rangle=\left\{\begin{array}{l}|a_{i}\rangle\otimes|a_{i}\rangle,\;i=1,\dots,K\;\;(\mbox{for }\kappa_{A}\oplus\kappa_{A}),\\ |b_{i}\rangle\otimes|b_{i}\rangle,\;i=K+1,\dots,2K\;\;(\mbox{for }\kappa_{B}\oplus\kappa_{B}).\end{array}\right. | (35) |

Hence we have \gamma(|\psi\rangle)-t\kappa\oplus\kappa\geq 0, with

\kappa=\sum_{k=1}^{2K}\tilde{p}_{k}\gamma(|\eta_{k}\rangle), | (36) |

where \tilde{p}_{k}=\tfrac{1}{2}p_{(k\;{{\rm mod}}\;K)}.

Second, because the blocks D in Eq. (A-9) in the Appendix are the same, we note that all diagonal elements D_{ii} from \kappa must be zero, otherwise only t=0 will satisfy \gamma(\varrho)-t\kappa_{A}\oplus\kappa_{B}\geq 0. This means that the only states, which can appear in the expansion (36) are of the form |\eta_{k}\rangle=|kk\rangle, since the |a_{k}\rangle and |b_{k}\rangle have to be eigenstates of the operators D_{i}=|i\rangle\langle i| (see the Appendix A). \hfill\blacksquare

Having proved the optimality of the expansion of \kappa_{A}\oplus\kappa_{B} in Eq. (30) we can now provide the general formula for the function \mathcal{E}(|\psi\rangle) for pure states in the Schmidt decomposition. The value of the function V(|\psi\rangle) is given by the solution of the following max-min problem

\alpha^{0}=\max_{\mathcal{P}}\min_{i<j}\frac{\left(\sqrt{\lambda_{i}}-\sqrt{\lambda_{j}}\right)^{2}}{p_{i}+p_{j}},\;1\leq i<j\leq d, | (37) |

where the first max is taken over all possible probability distributions \mathcal{P}=\{p_{1},p_{2},...\}. A solution of this problem for the cases d=3 and d=4 is given in the Appendix and we can summarize:

###### Proposition 11 (\mathcal{E} for pure states).

(a) If |\psi\rangle=\sum_{i=1}^{3}\sqrt{\lambda_{i}}|ii\rangle is a pure two-qutrit state, then

\mathcal{E}(|\psi\rangle)=2\sqrt{\lambda_{i_{0}}\lambda_{j_{0}}}+2\sqrt{\lambda_{i_{0}}\lambda_{k_{0}}}-\lambda_{i_{0}}, | (38) |

where i_{0},j_{0},k_{0} are pairwise different and j_{0},k_{0} are such that (\sqrt{\lambda_{j_{0}}}-\sqrt{\lambda_{k_{0}}})^{2}\geq(\sqrt{\lambda_{j}}-\sqrt{\lambda_{k}})^{2} for all j,k.

(b) If |\psi\rangle=\sum_{i=1}^{4}\sqrt{\lambda_{i}}|ii\rangle is a pure state in a 4\times 4-system, then

\mathcal{E}(|\psi\rangle)=\max\{\mathfrak{e}_{1},\mathfrak{e}_{2},\mathfrak{e}_{3}\}, | (39) |

where

\mathfrak{e}_{1}=2\sqrt{\lambda_{1}\lambda_{2}}+2\sqrt{\lambda_{3}\lambda_{4}},\quad\mathfrak{e}_{2}=2\sqrt{\lambda_{1}\lambda_{3}}+2\sqrt{\lambda_{2}\lambda_{4}},\quad\mathfrak{e}_{3}=2\sqrt{\lambda_{1}\lambda_{4}}+2\sqrt{\lambda_{2}\lambda_{3}}. | (40) |

Note that in both cases we have for a maximally entangled state \mathcal{E}(\psi)=1.
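The closed formulas of Proposition 11 are easy to evaluate, and for d=3 they can be compared with a direct numerical treatment of the max-min problem in Eq. (37). The following sketch is an illustration only: the function names are hypothetical, and the coarse grid search over probability distributions is an assumed stand-in for the analytical solution given in the Appendix, so it can only approach the optimum from below.

```python
import math
from itertools import combinations

def e_qutrit(lams):
    """Eq. (38): (j0, k0) is the pair with the largest difference of
    square roots, i0 is the remaining index."""
    r = [math.sqrt(x) for x in lams]
    j0, k0 = max(combinations(range(3), 2),
                 key=lambda jk: (r[jk[0]] - r[jk[1]]) ** 2)
    i0 = 3 - j0 - k0
    return 2 * r[i0] * r[j0] + 2 * r[i0] * r[k0] - lams[i0]

def e_4x4(lams):
    """Eqs. (39)-(40): maximum over the three pairings."""
    r = [math.sqrt(x) for x in lams]
    return max(2 * r[0] * r[1] + 2 * r[2] * r[3],
               2 * r[0] * r[2] + 2 * r[1] * r[3],
               2 * r[0] * r[3] + 2 * r[1] * r[2])

def v_grid_search(lams, steps=200):
    """Brute-force estimate of the max-min problem, Eq. (37), for d = 3."""
    r = [math.sqrt(x) for x in lams]
    best = 0.0
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            p = [a / steps, b / steps, (steps - a - b) / steps]
            ratios = []
            for i, j in combinations(range(3), 2):
                denom = p[i] + p[j]
                if denom > 0:   # a zero denominator imposes no constraint
                    ratios.append((r[i] - r[j]) ** 2 / denom)
            best = max(best, min(ratios))
    return best

lams = [0.5, 0.3, 0.2]
e_formula = e_qutrit(lams)
e_numeric = 1 - v_grid_search(lams)
```

For the Schmidt coefficients (0.5, 0.3, 0.2) both routes give a value close to 0.9645, and for equal Schmidt coefficients both formulas return 1.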

### IV.3 Schmidt-correlated states

To conclude the section we consider a family of mixed states, for which the introduced function \mathcal{E}(\varrho) can be also computed exactly. These states are called Schmidt-correlated (SC) states in the literature scstates (). By definition, SC states are a mixture of states that share the same Schmidt basis

\varrho_{SC}=\sum_{u=1}^{N}q_{u}|\psi_{u}\rangle\langle\psi_{u}|,\mbox{ with} | (41) |

|\psi_{u}\rangle=\sum_{i=1}^{d}\sqrt{\lambda_{i}^{(u)}}|ii\rangle. | (42) |

SC states can be written in computational basis directly as

\varrho_{SC}=\sum_{ij}\varrho_{ij}|ii\rangle\langle jj|,\mbox{ with }\varrho_{ij}=\sum_{u}q_{u}\sqrt{\lambda^{(u)}_{i}\lambda^{(u)}_{j}}. | (43) |

As in the case of pure states, we find for the SC states the optimal decomposition of \kappa_{A}\oplus\kappa_{B}:

###### Lemma 12 (Optimality for SC states).

In the case of SC states the optimal decomposition of \kappa_{A}\oplus\kappa_{B} for the estimation of the parameter \mathcal{E}(\varrho) can always be written in the form of Eq. (30):

\kappa_{A}^{opt}\oplus\kappa_{B}^{opt}=\sum_{i=1}^{d}p_{i}\gamma(|ii\rangle). | (44) |

Proof: There were two essential ingredients in the proof of the Lemma 10. First, we used the fact that the CM of a state, written in Schmidt decomposition, is invariant under interchange of parties. Obviously the same invariance does also hold for SC states. Second, we used the fact, that all blocks D of the CM are the same. Using the formulae of the Appendix A one easily verifies that D^{A}_{SC}=D^{B}_{SC}=D^{C}_{SC}. \hfill\blacksquare

For these states the problem of calculating the function \mathcal{E}(\varrho) reduces to the max-min problem in Eq. (37). This is due to the fact that diagonal elements of the covariance matrix have a pretty simple form for \varrho_{SC}. Indeed, using the formulae from the Appendix A we calculate directly:

\begin{aligned}
&(D^{A/B/C}_{SC})_{ij}=\varrho_{ii}\delta_{ij}-\varrho_{ii}\varrho_{jj},\;\;1\leq i\leq d,\\
&X^{A/B}_{SC}=Y^{A/B}_{SC}=\frac{1}{2}\mbox{diag}\{\varrho_{ii}+\varrho_{kk}\},\;\;1\leq i<k\leq d,\\
&X^{C}_{SC}=-Y^{C}_{SC}=\mbox{diag}\{\varrho_{ik}\},\;\;1\leq i<k\leq d.
\end{aligned} | (45) |

The 2\times 2 blocks in Eq. (31) will then take the form

B^{ij}_{2\times 2}=\left(\begin{array}{cc}\varrho_{ii}+\varrho_{jj}-t(p_{i}+p_{j})&\pm 2\varrho_{ij}\\ \pm 2\varrho_{ij}&\varrho_{ii}+\varrho_{jj}-t(p_{i}+p_{j})\end{array}\right), | (46) |

which leads to the following max-min problem for V(\varrho_{SC})

V(\varrho_{SC})=\max_{\mathcal{P}}\min_{i<j}\frac{\varrho_{ii}+\varrho_{jj}-2\varrho_{ij}}{p_{i}+p_{j}}=\max_{\mathcal{P}}\min_{i<j}\frac{\sum_{k}q_{k}\left(\sqrt{\lambda^{(k)}_{i}}-\sqrt{\lambda^{(k)}_{j}}\right)^{2}}{p_{i}+p_{j}}. | (47) |

This problem can be solved numerically or with the methods of Appendix B, and its solution gives the exact value of the function \mathcal{E}(\varrho_{SC}). For two qubits one finds

\mathcal{E}(\varrho_{SC})=2\sum_{k}q_{k}\sqrt{\lambda^{(k)}_{0}\lambda^{(k)}_{1}} | (48) |

as a closed analytical expression.

## V The entanglement parameter \mathcal{E}(\varrho) as a lower bound on the concurrence

In this section we demonstrate that the function \mathcal{E}(\varrho) can be used to estimate the amount of entanglement in a quantum state. More specifically, we show how it delivers a lower bound on the concurrence, which is a well known measure of bipartite entanglement. For bipartite pure states in a d\times d-system the concurrence is defined as conc1 (); conc2 (); conc3 ():

C(|\psi\rangle)=\sqrt{\frac{d}{d-1}}\sqrt{1-\mbox{Tr}(\varrho_{A}^{2})}. | (49) |

In this definition we have already introduced a prefactor which guarantees that 0\leq C\leq 1; this will turn out to be useful for our purposes.

The concurrence is then extended to mixed states by the convex-roof construction

C(\varrho)=\min_{p_{i},|\psi_{i}\rangle}\sum_{i}p_{i}C(|\psi_{i}\rangle), | (50) |

where the minimization is taken over all possible decompositions of the state \varrho=\sum_{i}p_{i}|\psi_{i}\rangle\langle\psi_{i}|. Of course, this minimization is quite difficult to perform, and a complete solution is known only for two qubits conc2 (). Therefore, it is desirable to have at least some lower bounds on the concurrence.
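As an illustration of the pure-state definition in Eq. (49), the following minimal sketch evaluates the concurrence from the reduced state of party A. The function name `pure_state_concurrence` and the convention of passing the state as a vector of length d^2 are assumptions made here for illustration only.

```python
import numpy as np

def pure_state_concurrence(psi, d):
    """Concurrence of a pure d x d state via Eq. (49); psi has length d*d."""
    coeff = np.asarray(psi, dtype=complex).reshape(d, d)
    coeff = coeff / np.linalg.norm(coeff)
    rho_a = coeff @ coeff.conj().T                    # reduced state of party A
    purity = np.real(np.trace(rho_a @ rho_a))         # Tr(rho_A^2)
    return np.sqrt(d / (d - 1)) * np.sqrt(max(0.0, 1.0 - purity))

d = 3
product = np.kron([1, 0, 0], [0, 1, 0])               # |0>|1>, a product state
max_ent = np.eye(d).reshape(-1) / np.sqrt(d)          # (|00>+|11>+|22>)/sqrt(3)
c_product = pure_state_concurrence(product, d)
c_max_ent = pure_state_concurrence(max_ent, d)
```

With the prefactor of Eq. (49), the product state gives 0 and the maximally entangled state gives 1, which are the two ends of the range 0\leq C\leq 1.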

The idea of obtaining lower bounds on C from \mathcal{E} is as follows: Let us assume that one can prove a lower bound like

C(|\psi\rangle)\geq\alpha\mathcal{E}(|\psi\rangle)+\beta | (51) |

for pure states only, with some constants \alpha,\beta and \alpha\geq 0. Then, since \mathcal{E} is convex, the right hand side of Eq. (51) is convex, too. By definition, the convex roof is the largest convex function which coincides with C on the pure states. Consequently, C(\varrho)\geq\alpha\mathcal{E}(\varrho)+\beta holds for all mixed states, too. This trick has already been employed in several works to obtain lower bounds on entanglement measures chenbounds (); otherconvexbounds (). However, as the CMC detects many bound entangled states where other criteria fail wirPRA (), our results will deliver entanglement estimates for states for which the other methods fail.
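The argument can also be spelled out as a chain of inequalities. In the sketch below, \{p_{i},|\psi_{i}\rangle\} denotes a decomposition of \varrho attaining the minimum in Eq. (50); this notation is introduced here for illustration. The first inequality applies Eq. (51) term by term, the second uses the convexity of \mathcal{E} together with \alpha\geq 0 and \sum_{i}p_{i}=1.

```latex
C(\varrho) = \sum_{i} p_{i}\, C(|\psi_{i}\rangle)
  \geq \sum_{i} p_{i} \left[ \alpha\, \mathcal{E}(|\psi_{i}\rangle) + \beta \right]
  = \alpha \sum_{i} p_{i}\, \mathcal{E}(|\psi_{i}\rangle) + \beta
  \geq \alpha\, \mathcal{E}(\varrho) + \beta .
```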

### V.1 Two qubits

Using the Schmidt decomposition, one can express the concurrence for pure states in terms of Schmidt coefficients as

C(|\psi\rangle)=\sqrt{\frac{2d}{d-1}}\sqrt{\sum_{i<j}\lambda_{i}\lambda_{j}}. | (52) |

Comparing Eq. (52) and Eq. (29) from Section IV we see that the concurrence and the function \mathcal{E}(|\psi\rangle) coincide on two-qubit pure states:

\mathcal{E}(|\psi\rangle)=2\sqrt{\lambda_{1}\lambda_{2}}=C(|\psi\rangle). | (53) |

Consequently, C(\varrho)\geq\mathcal{E}(\varrho) holds for any mixed state. Note, however, that for the special case of two qubits one can calculate the concurrence also directly for mixed states conc2 ().
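For two qubits the bound can be compared with the exact mixed-state concurrence. The sketch below uses Wootters' formula from conc2 () and evaluates it on a two-qubit SC state, for which \mathcal{E} is given by Eq. (48). The parametrization of the SC state as a mixture of states \sqrt{\lambda_{0}}|00\rangle+\sqrt{1-\lambda_{0}}|11\rangle with weights q_{k}, as well as all function names, are assumptions made for this illustration.

```python
import numpy as np

def wootters_concurrence(rho):
    """Two-qubit mixed-state concurrence via Wootters' formula."""
    sy = np.array([[0, -1j], [1j, 0]])
    yy = np.kron(sy, sy)
    rho_tilde = yy @ rho.conj() @ yy                  # spin-flipped state
    ev = np.linalg.eigvals(rho @ rho_tilde)
    mu = np.sort(np.sqrt(np.clip(ev.real, 0.0, None)))[::-1]
    return max(0.0, mu[0] - mu[1] - mu[2] - mu[3])

def sc_state(q, lam0):
    """Mixture of sqrt(l0)|00> + sqrt(1-l0)|11> with weights q (assumed form)."""
    rho = np.zeros((4, 4), dtype=complex)
    for qk, l0 in zip(q, lam0):
        psi = np.zeros(4)
        psi[0], psi[3] = np.sqrt(l0), np.sqrt(1.0 - l0)
        rho += qk * np.outer(psi, psi)
    return rho

q, lam0 = [0.6, 0.4], [0.5, 0.9]
rho = sc_state(q, lam0)
# Right-hand side of Eq. (48) for this mixture
e_sc = 2 * sum(qk * np.sqrt(l0 * (1.0 - l0)) for qk, l0 in zip(q, lam0))
c_sc = wootters_concurrence(rho)
```

For this example both numbers agree, so C(\varrho)\geq\mathcal{E}(\varrho) holds with equality here.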

### V.2 Two qutrits

Using the solution of the problem (37) it is possible to derive a lower bound on the concurrence for pure states of two d-level systems. Before we proceed, note that chenbounds ()

C(|\psi\rangle)=\sqrt{\frac{2d}{d-1}}\sqrt{\sum_{i<j}\lambda_{i}\lambda_{j}}\geq\frac{2}{d-1}\sum_{i<j}\sqrt{\lambda_{i}\lambda_{j}}. | (54) |

This follows from the fact that

\begin{aligned}
\sum_{i<j}\lambda_{i}\lambda_{j} &= \frac{1}{d(d-1)}\sum_{i<j}\sum_{k<l}(\lambda_{i}\lambda_{j}+\lambda_{k}\lambda_{l}) \\
&\geq \frac{2}{d(d-1)}\sum_{i<j}\sum_{k<l}\sqrt{\lambda_{i}\lambda_{j}\lambda_{k}\lambda_{l}}=\frac{2}{d(d-1)}\Big[\sum_{i<j}\sqrt{\lambda_{i}\lambda_{j}}\Big]^{2}.
\end{aligned} | (55) |
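The inequality in Eq. (54) can be spot-checked numerically. The following sketch samples random Schmidt vectors and records the difference between the two sides; the function names and the Dirichlet sampling are choices made here for illustration.

```python
import numpy as np
from itertools import combinations

def concurrence_from_schmidt(lam):
    """Left-hand side of Eq. (54), i.e. Eq. (52)."""
    d = len(lam)
    s = sum(lam[i] * lam[j] for i, j in combinations(range(d), 2))
    return np.sqrt(2 * d / (d - 1)) * np.sqrt(s)

def sqrt_bound(lam):
    """Right-hand side of Eq. (54)."""
    d = len(lam)
    s = sum(np.sqrt(lam[i] * lam[j]) for i, j in combinations(range(d), 2))
    return 2.0 / (d - 1) * s

rng = np.random.default_rng(0)
gaps = []
for d in (2, 3, 4, 5):
    for _ in range(200):
        lam = rng.dirichlet(np.ones(d))               # random Schmidt coefficients
        gaps.append(concurrence_from_schmidt(lam) - sqrt_bound(lam))
min_gap = min(gaps)

uniform = np.full(4, 0.25)                            # maximally entangled, d = 4
uniform_gap = concurrence_from_schmidt(uniform) - sqrt_bound(uniform)
```

The gap stays non-negative in all samples and vanishes for equal Schmidt coefficients, where the arithmetic-geometric mean step in Eq. (55) is an equality.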

For two qutrits \mathcal{E}(|\psi\rangle) is given by Eq. (38). We have that

\begin{aligned}
2\sqrt{\lambda_{i}\lambda_{j}}+2\sqrt{\lambda_{i}\lambda_{k}}-\lambda_{i} &= 2\sqrt{\lambda_{i}\lambda_{j}}+2\sqrt{\lambda_{i}\lambda_{k}}+2\sqrt{\lambda_{j}\lambda_{k}}-2\sqrt{\lambda_{j}\lambda_{k}}-1+\lambda_{j}+\lambda_{k} \\
&\leq 2C(|\psi\rangle)+(\sqrt{\lambda_{j}}-\sqrt{\lambda_{k}})^{2}-1 \\
&\leq 2C(|\psi\rangle).
\end{aligned} | (56) |

Hence we have for mixed two-qutrit states

C(\varrho)\geq\frac{\mathcal{E}(\varrho)}{2}. | (57) |
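The pure-state estimate behind Eq. (57) can be checked numerically. The sketch below takes the left-hand side of Eq. (56), maximized over the index i, as the two-qutrit value of \mathcal{E}(|\psi\rangle); this reading of Eq. (38) is an assumption based on Eq. (56), and the function names are illustrative.

```python
import numpy as np

def concurrence_qutrit(lam):
    """Eq. (52) for d = 3."""
    l1, l2, l3 = lam
    return np.sqrt(3.0) * np.sqrt(l1 * l2 + l1 * l3 + l2 * l3)

def left_side_eq56(lam):
    """max over i of 2 sqrt(l_i l_j) + 2 sqrt(l_i l_k) - l_i."""
    vals = []
    for i in range(3):
        j, k = [m for m in range(3) if m != i]
        vals.append(2 * np.sqrt(lam[i] * lam[j])
                    + 2 * np.sqrt(lam[i] * lam[k]) - lam[i])
    return max(vals)

rng = np.random.default_rng(1)
samples = rng.dirichlet(np.ones(3), size=1000)        # random Schmidt vectors
min_gap = min(2 * concurrence_qutrit(l) - left_side_eq56(l) for l in samples)

uniform = [1 / 3, 1 / 3, 1 / 3]
e_uniform = left_side_eq56(uniform)
c_uniform = concurrence_qutrit(uniform)
```

For equal Schmidt coefficients the sketch gives the value 1 for both quantities, so the factor 1/2 in Eq. (57) leaves a gap for the maximally entangled state.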

Using the results from Proposition 6 we have, for instance,

C(\varrho)\geq\frac{1}{4}\Big\{\frac{\mbox{Tr}(\varrho_{A}^{2})+\mbox{Tr}(\varrho_{B}^{2})-2}{2}+\sqrt{\tfrac{1}{4}[\mbox{Tr}(\varrho_{A}^{2})-\mbox{Tr}(\varrho_{B}^{2})]^{2}+\|C\|_{\rm tr}^{2}}\Big\}, | (58) |

which is an easily computable lower bound that delivers non-trivial estimates for many weakly entangled states.

### V.3 4\times 4 systems

In this case \mathcal{E}(|\psi\rangle) is given by Eq. (39). We can directly estimate:

\begin{aligned}
\mathcal{E}(|\psi\rangle) &= \max\{\mathfrak{e}_{1},\mathfrak{e}_{2},\mathfrak{e}_{3}\} \\
&\leq 2\sqrt{\lambda_{1}\lambda_{3}}+2\sqrt{\lambda_{2}\lambda_{4}}+2\sqrt{\lambda_{1}\lambda_{2}}+2\sqrt{\lambda_{3}\lambda_{4}}+2\sqrt{\lambda_{1}\lambda_{4}}+2\sqrt{\lambda_{2}\lambda_{3}} \\
&\leq 3C(|\psi\rangle)
\end{aligned} | (59) |

and hence for arbitrary mixed states

C(\varrho)\geq\frac{1}{3}\mathcal{E}(\varrho). | (60) |
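A numerical spot check of the pure-state estimate in Eq. (59) is sketched below. It assumes that the three quantities \mathfrak{e}_{k} are the pairings 2\sqrt{\lambda_{a}\lambda_{b}}+2\sqrt{\lambda_{c}\lambda_{d}} that appear in Eq. (59) and in Eq. (B-18); Eq. (39) itself is not reproduced here, so this identification and the function names are assumptions.

```python
import numpy as np

# The three ways to split {1,2,3,4} into two pairs (zero-based indices)
PAIRINGS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]

def e_parameter_4x4(lam):
    """max over pairings of 2 sqrt(l_a l_b) + 2 sqrt(l_c l_d) (assumed form)."""
    return max(2 * np.sqrt(lam[a] * lam[b]) + 2 * np.sqrt(lam[c] * lam[d])
               for (a, b), (c, d) in PAIRINGS)

def concurrence_4x4(lam):
    """Eq. (52) for d = 4."""
    s = sum(lam[i] * lam[j] for i in range(4) for j in range(i + 1, 4))
    return np.sqrt(8.0 / 3.0) * np.sqrt(s)

rng = np.random.default_rng(2)
samples = rng.dirichlet(np.ones(4), size=1000)        # random Schmidt vectors
min_gap = min(3 * concurrence_4x4(l) - e_parameter_4x4(l) for l in samples)

uniform = np.full(4, 0.25)
e_uniform = e_parameter_4x4(uniform)
c_uniform = concurrence_4x4(uniform)
```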

### V.4 Examples

Let us discuss the strength of these lower bounds by considering some examples. We first consider Bell-diagonal two-qubit states. For them, the reduced states \varrho_{A} and \varrho_{B} are maximally mixed, and Proposition 6 delivers the bound C(\varrho)\geq\mbox{Tr}(|C|)-1/2. On the other hand, it is known that for Bell-diagonal states the concurrence is given by C(\varrho)=2\lambda_{\rm max}-1, where \lambda_{\rm max} is the maximal eigenvalue, i.e., the maximal overlap with some Bell state conc1 (). Noting that \lambda_{\rm max}=[1+2\mbox{Tr}(|C|)]/4 (this can be easily seen if the closest Bell state is the singlet state and we take appropriately normalized Pauli matrices as observables in the definition of the matrix C), one finds that our lower bound is tight for Bell-diagonal states.
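The tightness for Bell-diagonal states can be illustrated with a short numerical sketch. It assumes that the matrix C has entries \langle G_{i}\otimes G_{j}\rangle-\langle G_{i}\otimes\mathbbm{1}\rangle\langle\mathbbm{1}\otimes G_{j}\rangle with the observables G_{i} given by the identity and the Pauli matrices divided by \sqrt{2}; the example weights and all names are chosen here for illustration.

```python
import numpy as np

PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]
# Normalised observables with Tr(G_i G_j) = delta_ij
OBS = [np.eye(2, dtype=complex) / np.sqrt(2)] + [s / np.sqrt(2) for s in PAULIS]

def bell_diagonal(weights):
    """Mixture of the four Bell states; weights[0] multiplies the singlet."""
    s = 1 / np.sqrt(2)
    bells = [np.array([0, s, -s, 0]), np.array([0, s, s, 0]),
             np.array([s, 0, 0, s]), np.array([s, 0, 0, -s])]
    rho = np.zeros((4, 4), dtype=complex)
    for w, b in zip(weights, bells):
        rho += w * np.outer(b, b)
    return rho

def correlation_block(rho):
    """C_ij = <G_i x G_j> - <G_i x 1><1 x G_j> (assumed definition)."""
    one = np.eye(2)
    c = np.zeros((4, 4))
    for i, gi in enumerate(OBS):
        for j, gj in enumerate(OBS):
            joint = np.trace(rho @ np.kron(gi, gj)).real
            mean_a = np.trace(rho @ np.kron(gi, one)).real
            mean_b = np.trace(rho @ np.kron(one, gj)).real
            c[i, j] = joint - mean_a * mean_b
    return c

rho = bell_diagonal([0.7, 0.1, 0.15, 0.05])
trace_norm = np.linalg.svd(correlation_block(rho), compute_uv=False).sum()
bound = trace_norm - 0.5                              # Tr(|C|) - 1/2
exact = 2 * np.linalg.eigvalsh(rho).max() - 1         # 2 lambda_max - 1
```

For these weights the singlet is the closest Bell state, and both the bound and the exact value equal 0.4.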

For general two-qubit states, the lower bound cannot be tight, as there are entangled two-qubit states which are not detected by the CMC. On the other hand, any full-rank two-qubit state can be brought to a Bell-diagonal state by filtering operations. Since it is known how the concurrence changes under filtering operations frank (), one could use the filtering and our lower bound to determine the concurrence for arbitrary two-qubit states.

For two qutrits, our bound is not tight for states like |\psi\rangle=(|00\rangle+|11\rangle+|22\rangle)/\sqrt{3} or |\psi\rangle=(|00\rangle+|22\rangle)/\sqrt{2}; for the latter, however, the reason lies in the fact that the bound (54) is not tight. On the other hand, the presented method delivers nontrivial lower bounds for many bound entangled states (such as the family of chessboard states), as many states of this type are detected by the CMC wirPRA (), but not by the PPT or CCNR criterion (which means that the methods from Ref. chenbounds () must fail). Similarly, our methods can be used to estimate the entanglement of bound entangled states for 4\times 4-systems.

## VI Conclusion

In conclusion, we have introduced an entanglement parameter \mathcal{E} that quantifies the violation of the covariance matrix criterion. We have shown that this parameter is convex and invariant under local rotations, but it can increase on average under local operations and classical communication. Most importantly, the parameter \mathcal{E} can be used to deliver lower bounds on the concurrence.

For future work, it would be interesting to connect \mathcal{E} to other entanglement measures, such as the entanglement of formation plenioreview (). Even more interesting would be an extension of the covariance matrix criterion to the multipartite case and a definition of a similar entanglement parameter there. This could help to quantify entanglement in multipartite systems, where much less is known compared to bipartite systems.

We thank Jens Eisert, Bastian Jungnitsch, Matthias Kleinmann, and Sönke Niekamp for discussions. Especially we thank Philipp Hyllus for discussions and comments on the manuscript. This work has been supported by the FWF (START Prize) and the EU (OLAQUI, QICS, SCALA).

## APPENDIX A.

Here we calculate the symmetric block CM of a pure bipartite state written in the Schmidt decomposition |\psi\rangle=\sum_{i}\sqrt{\lambda_{i}}|i_{A}\rangle\otimes|i_{B}\rangle. We consider d_{A}=d_{B}=d. As proven in Ref. wirPRA (), we can choose the basis in the operator spaces \mathcal{B}(\mathcal{H}_{A}) and \mathcal{B}(\mathcal{H}_{B}) arbitrarily when applying the CMC. In this case it is convenient to choose the local orthogonal observables

D_{i}=|i\rangle\langle i|,\quad i=1,\dots,d, | (A-1) |

X_{i,j}=\frac{1}{\sqrt{2}}(|i\rangle\langle j|+|j\rangle\langle i|),\quad 1\leq i<j\leq d, | (A-2) |

Y_{k,l}=\frac{i}{\sqrt{2}}(|k\rangle\langle l|-|l\rangle\langle k|),\quad 1\leq k<l\leq d, | (A-3) |

which satisfy the following anticommutation relations:

\begin{aligned}
\{D_{i},D_{j}\} &= \delta_{ij}\left(|i\rangle\langle j|+|j\rangle\langle i|\right), & \{D_{i},X_{ij}\} &= X_{ij}, \\
\{D_{i},Y_{ij}\} &= Y_{ij}, & \{X_{ij},Y_{ij}\} &= 0, \\
\{X_{ij},X_{ij}\} &= D_{i}+D_{j}, & \{Y_{ij},Y_{ij}\} &= D_{i}+D_{j}.
\end{aligned} | (A-4) |

Note that this is not the complete set of relations; however, the other relations do not contribute to the CM and hence we leave them out here.

The mean values for the state |\psi\rangle are given by

\begin{aligned}
\langle X^{A}_{ij}\otimes\mathbbm{1}\rangle &= \langle\mathbbm{1}\otimes X^{B}_{ij}\rangle=\langle Y^{A}_{ij}\otimes\mathbbm{1}\rangle=\langle\mathbbm{1}\otimes Y^{B}_{ij}\rangle=0, \\
\langle D^{A}_{i}\otimes\mathbbm{1}\rangle &= \langle\mathbbm{1}\otimes D^{B}_{i}\rangle=\lambda_{i}, \\
\langle\{D^{A}_{i},D^{A}_{j}\}\otimes\mathbbm{1}\rangle &= (\lambda_{i}+\lambda_{j})\delta_{ij}.
\end{aligned} | (A-5) |

The blocks A, B and C of \gamma^{S}(|\psi\rangle) can therefore be written as 3\times 3 block matrices. Because of the relations (A-4) and (A-5) many terms in these blocks vanish, and we have the structure

A,B,C=\left(\begin{array}[]{ccc}D^{A/B/C}&0&0\\ 0&X^{A/B/C}&0\\ 0&0&Y^{A/B/C}\end{array}\right). | (A-6) |

Since the off-diagonal terms can be calculated straightforwardly,

\begin{aligned}
\langle D_{i}^{A}\otimes D_{j}^{B}\rangle-\langle D_{i}^{A}\rangle\langle D_{j}^{B}\rangle &= \lambda_{i}\delta_{ij}-\lambda_{i}\lambda_{j}, \\
\langle D_{i}^{A}\otimes X_{qr}^{B}\rangle=\langle D_{i}^{A}\otimes Y_{qr}^{B}\rangle=\langle X_{pq}^{A}\otimes Y_{rs}^{B}\rangle &= 0, \\
\langle X_{pq}^{A}\otimes X_{rs}^{B}\rangle &= \sqrt{\lambda_{p}\lambda_{q}}\,\delta_{pr}\delta_{qs}, \\
\langle Y_{pq}^{A}\otimes Y_{rs}^{B}\rangle &= -\sqrt{\lambda_{p}\lambda_{q}}\,\delta_{pr}\delta_{qs},
\end{aligned} | (A-7) |

we can write the blocks in (A-6) as follows

\begin{aligned}
D=D^{A/B/C}_{ij} &= \lambda_{i}\delta_{ij}-\lambda_{i}\lambda_{j}, \\
X=X^{A/B} &= \frac{1}{2}\mbox{diag}\{\lambda_{i}+\lambda_{k}\}, \quad 1\leq i<k\leq d, \\
Y=Y^{A/B} &= \frac{1}{2}\mbox{diag}\{\lambda_{i}+\lambda_{k}\}, \quad 1\leq i<k\leq d, \\
X^{C} &= \mbox{diag}\{\sqrt{\lambda_{p}\lambda_{q}}\}, \quad 1\leq p<q\leq d, \\
Y^{C} &= \mbox{diag}\{-\sqrt{\lambda_{p}\lambda_{q}}\}, \quad 1\leq p<q\leq d.
\end{aligned} | (A-8) |

Finally, we arrive at the general form of the CM for a pure state as a function of its Schmidt coefficients:

\gamma^{S}(|\psi\rangle)=\left(\begin{array}[]{cccccc}D&0&0&D&0&0\\ 0&X&0&0&X^{C}&0\\ 0&0&Y&0&0&Y^{C}\\ D&0&0&D&0&0\\ 0&X^{C}&0&0&X&0\\ 0&0&Y^{C}&0&0&Y\\ \end{array}\right) | (A-9) |

with the blocks given in Eq. (A-8).
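The block structure derived above can be verified numerically for a given set of Schmidt coefficients. The sketch below builds the observables of Eqs. (A-1)-(A-3), computes the blocks A and C directly from expectation values, and compares them with Eq. (A-8). It assumes that the CM entries within one party are defined with the symmetrized product, \langle\{M_{m},M_{n}\}\rangle/2-\langle M_{m}\rangle\langle M_{n}\rangle; the function names are illustrative.

```python
import numpy as np
from itertools import combinations

def local_observables(d):
    """Return D_i, X_ij, Y_ij of Eqs. (A-1)-(A-3), in that order."""
    e = np.eye(d, dtype=complex)
    pairs = list(combinations(range(d), 2))
    D = [np.outer(e[i], e[i]) for i in range(d)]
    X = [(np.outer(e[i], e[j]) + np.outer(e[j], e[i])) / np.sqrt(2) for i, j in pairs]
    Y = [1j * (np.outer(e[i], e[j]) - np.outer(e[j], e[i])) / np.sqrt(2) for i, j in pairs]
    return D + X + Y, pairs

def cm_blocks(lam):
    """Blocks A and C of the CM for |psi> = sum_i sqrt(lam_i)|ii>."""
    d = len(lam)
    obs, pairs = local_observables(d)
    basis, one = np.eye(d), np.eye(d)
    psi = sum(np.sqrt(l) * np.kron(basis[i], basis[i]) for i, l in enumerate(lam))

    def mean(op):
        return np.real(psi.conj() @ op @ psi)

    mean_a = [mean(np.kron(m, one)) for m in obs]
    mean_b = [mean(np.kron(one, m)) for m in obs]
    n = len(obs)
    A, C = np.zeros((n, n)), np.zeros((n, n))
    for r in range(n):
        for c in range(n):
            sym = (obs[r] @ obs[c] + obs[c] @ obs[r]) / 2   # symmetrised product
            A[r, c] = mean(np.kron(sym, one)) - mean_a[r] * mean_a[c]
            C[r, c] = mean(np.kron(obs[r], obs[c])) - mean_a[r] * mean_b[c]
    return A, C, pairs

def block_diag(*blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    n = sum(b.shape[0] for b in blocks)
    out, pos = np.zeros((n, n)), 0
    for b in blocks:
        k = b.shape[0]
        out[pos:pos + k, pos:pos + k] = b
        pos += k
    return out

def expected_blocks(lam, pairs):
    """Blocks A and C as predicted by Eq. (A-8)."""
    lam = np.asarray(lam, dtype=float)
    D = np.diag(lam) - np.outer(lam, lam)
    X = np.diag([(lam[i] + lam[k]) / 2 for i, k in pairs])
    XC = np.diag([np.sqrt(lam[i] * lam[k]) for i, k in pairs])
    return block_diag(D, X, X), block_diag(D, XC, -XC)

lam = [0.5, 0.3, 0.2]
A, C, pairs = cm_blocks(lam)
A_expected, C_expected = expected_blocks(lam, pairs)
blocks_match = np.allclose(A, A_expected) and np.allclose(C, C_expected)
```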

## APPENDIX B.

In this Appendix we discuss the possible ways of solving the max-min problem:

\tilde{t}=\max_{\mathcal{P}}\min_{i<j}\frac{\left(\sqrt{\lambda_{i}}-\sqrt{\lambda_{j}}\right)^{2}}{p_{i}+p_{j}},\quad 1\leq i<j\leq d. | (B-1) |

We consider the cases d=3 and d=4. We define

b_{ij}\equiv\left(\sqrt{\lambda_{i}}-\sqrt{\lambda_{j}}\right)^{2}, | (B-2) |

\alpha_{ij}\equiv\frac{b_{ij}}{p_{i}+p_{j}}=\alpha_{ji}. | (B-3) |

For d=3 there are only three different \alpha’s that can be arranged in a tableau as in Fig. B-1(a).

The properties of the solution can be summarized as follows:

###### Lemma B-1.

(a) Consider the optimization problem in Eq. (B-1) for d=3 with the only assumption that b_{ij}\geq 0. Let j_{0},k_{0} be such that

b_{j_{0}k_{0}}=\max\{b_{jk}\} | (B-4) |

Then the optimal solution \alpha^{0} is given by

\alpha^{0}=\min\left\{\alpha^{I},\alpha^{II}\right\}, | (B-5) |

where

\alpha^{I}=\frac{1}{2}(b_{12}+b_{13}+b_{23}), | (B-6) |

\alpha^{II}=b_{i_{0},j_{0}}+b_{i_{0},k_{0}} | (B-7) |

with j_{0}\neq i_{0}\neq k_{0}.

(b) For the same problem, if the b_{ij} are given via Eq. (B-2) as functions of Schmidt coefficients and therefore fulfill further restrictions, the optimum is always given by

\alpha^{0}=\alpha^{II}=1+\lambda_{i_{0}}-2\sqrt{\lambda_{i_{0}}\lambda_{j_{0}}}-2\sqrt{\lambda_{i_{0}}\lambda_{k_{0}}}. | (B-8) |

Then we also have that \alpha^{II}=\min_{ijk}\{1+\lambda_{i}-2\sqrt{\lambda_{i}\lambda_{j}}-2\sqrt{% \lambda_{i}\lambda_{k}}\} where the i,j,k are pairwise different.

Proof: (a) Let us first assume only that b_{ij}\geq 0. In the max-min problem (B-1) the maximization is taken over all possible probability distributions. It is convenient to distinguish two cases:

Case 1: The optimal probability distribution does not have any zero elements. Assume \mathcal{P}_{0} is the optimal distribution and p_{i}^{0}\neq 0 \forall i. We often drop the index {}^{0} in the following for simplicity. We show that this optimal distribution necessarily has to be such that \alpha_{12}=\alpha_{13}=\alpha_{23}=\alpha^{0}, otherwise the optimality is violated. Indeed, assume that this is not the case. Then, without loss of generality, we can write \alpha_{12}\leq\alpha_{13}\leq\alpha_{23}, where one of the inequalities must be strict. Now consider some distribution \mathcal{P}^{\prime} such that

p^{\prime}_{1}=p_{1}-2\varepsilon,\quad p^{\prime}_{2}=p_{2}+\varepsilon,\quad p^{\prime}_{3}=p_{3}+\varepsilon, | (B-9) |

with some \varepsilon>0. Under the new distribution \mathcal{P}^{\prime} the coefficients \alpha_{ij} change such that

\alpha^{\prime}_{12}>\alpha_{12},\quad\alpha^{\prime}_{13}>\alpha_{13},\quad\alpha^{\prime}_{23}<\alpha_{23}. | (B-10) |

Since the parameter \varepsilon can be chosen arbitrarily small, the number \alpha^{\prime}_{12} will still be the minimal one, i.e. \alpha^{\prime}_{12}=\min_{i<j}\alpha^{\prime}_{ij}. But \alpha^{\prime}_{12}>\alpha_{12}. Consequently the distribution \mathcal{P}^{\prime} gives a larger minimum of the set \{\alpha_{ij}\} than the distribution \mathcal{P}_{0}, which contradicts the assumption that \mathcal{P}_{0} is optimal. Hence we conclude that \alpha_{12}=\alpha_{23} must hold, which implies \alpha_{12}=\alpha_{13}=\alpha_{23}.

Having established that if \mathcal{P}_{0} is optimal and contains no zero elements, then \alpha_{12}=\alpha_{13}=\alpha_{23}=\alpha^{0} holds, we can calculate \alpha^{0} explicitly. We have

\alpha^{0}=\frac{b_{12}}{p_{1}+p_{2}}=\frac{b_{13}}{p_{1}+p_{3}}=\frac{b_{23}}{p_{2}+p_{3}}. | (B-11) |

Multiplying by the denominators and summing up these equations gives

2\alpha^{0}(p_{1}+p_{2}+p_{3})=b_{12}+b_{13}+b_{23}. | (B-12) |

Because p_{1}+p_{2}+p_{3}=1 we arrive at

\alpha^{I}\equiv\alpha^{0}=\frac{1}{2}(b_{12}+b_{13}+b_{23}). | (B-13) |

Case 2: The optimal probability distribution \mathcal{P}_{0} has at least one zero element. This means that one \alpha_{ij}=b_{ij} independently of the two free parameters of the probability distribution (since p_{i}+p_{j}=1). We can distinguish three cases, and assume for definiteness b_{12}\leq b_{13}\leq b_{23}.

(i) If \alpha_{12}=b_{12} (that is, p_{3}=0), then clearly \alpha_{ij}\geq b_{ij} for i,j=1,3 and i,j=2,3. Then we have \min\{\alpha_{ij}\}=\alpha_{12}. But then decreasing one of p_{1} or p_{2} and correspondingly increasing p_{3} will lead to an increase of \alpha_{12} and a better solution which belongs to case 1. So a solution with \alpha_{12}=b_{12} can never be optimal.

(ii) If \alpha_{13}=b_{13} the optimal probability distribution has to be such that \alpha_{13}\leq\alpha_{12} and \alpha_{13}\leq\alpha_{23}. But as in the case (i) one can directly see that this leads to case 1 and can never be optimal.

(iii) Finally, consider the case \alpha_{23}=b_{23}. Then, as in case 1, one can achieve \alpha_{12}=\alpha_{13} without giving up optimality. More precisely, the optimal probability distribution has to fulfill this from the beginning (if \alpha_{12} and \alpha_{13} are the minima) or it can be achieved (if \alpha_{23} is the minimum).

This leads as in case 1 to the conclusion, that we have

\alpha_{12}=\frac{b_{12}}{p_{2}}=\frac{b_{13}}{p_{3}}\Rightarrow\alpha_{12}=b_{12}+b_{13}, | (B-14) |

and consequently

\alpha^{II}\equiv\alpha_{12}=b_{12}+b_{13}=1+\lambda_{1}-2\sqrt{\lambda_{1}\lambda_{2}}-2\sqrt{\lambda_{1}\lambda_{3}}. | (B-15) |

However, it is not yet clear what the \min\{\alpha_{ij}\} is. Two cases can be distinguished:

(iiia) If \alpha_{12}\geq\alpha_{23}=b_{23} one would take \min\{\alpha_{ij}\}=\alpha_{23}=b_{23}, but then one can improve it further as in the cases (i) and (ii) by going to case 1 and finally taking \alpha^{I} from Eq. (B-13). Note that \alpha^{I}=(\alpha^{II}+b_{23})/2. Therefore, if \alpha^{II}=\alpha_{12}\geq\alpha_{23}=b_{23} one has also that \alpha^{I}\leq\alpha^{II}, so effectively one takes \min\{\alpha^{I},\alpha^{II}\}.

(iiib) If \alpha_{12}<\alpha_{23}=b_{23} we take \min\{\alpha_{ij}\}=\alpha^{II} and going to case 1 does not help. But in this case, we have \alpha^{I}\geq\alpha^{II}, so effectively one takes again \min\{\alpha^{I},\alpha^{II}\}.

Finally, let us briefly discuss the meaning of the choice of j_{0} and k_{0} in Eq. (B-4), as one may also consider \alpha^{II}_{j,k} in Eq. (B-7) with other indices. However, one can directly compute that \alpha^{II}_{j,k}<\alpha^{I} is equivalent to b_{ij}+b_{ik}<b_{jk}, and this can only be true if j and k are chosen as in Eq. (B-4). In other words, the \alpha^{II}_{j,k} for indices other than j_{0},k_{0} can never contribute, and one could alternatively write \alpha^{0}=\min\{\alpha^{I},\alpha^{II}_{12},\alpha^{II}_{13},\alpha^{II}_{23}\}.

(b) Let us now assume that the b_{ij} stem from Schmidt coefficients as in Eq. (B-2). We know from the previous discussion that we have to take \alpha^{II} iff b_{i_{0}j_{0}}+b_{i_{0}k_{0}}\leq b_{j_{0}k_{0}}. In terms of the Schmidt coefficients, this implies that

(\sqrt{\lambda_{j_{0}}}-\sqrt{\lambda_{k_{0}}})^{2}\geq(\sqrt{\lambda_{i_{0}}}-\sqrt{\lambda_{j_{0}}})^{2}+(\sqrt{\lambda_{i_{0}}}-\sqrt{\lambda_{k_{0}}})^{2}. | (B-16) |

This, however, is true for any triple of positive real numbers \sqrt{\lambda_{\nu}} if j_{0} and k_{0} are chosen as in Eq. (B-4). Then it is also clear that the \alpha^{II} chosen is minimal among all the b_{ij}+b_{ik}. \hfill\blacksquare
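Lemma B-1 can be compared with a brute-force search over the probability simplex. The following sketch evaluates the closed form of part (a) and a grid search for the max-min problem (B-1) with d=3; the grid resolution, the example Schmidt coefficients and the function names are choices made here for illustration, and the grid search only approximates the optimum from below.

```python
import numpy as np

PAIRS = [(0, 1), (0, 2), (1, 2)]

def b_matrix(lam):
    """b_ij = (sqrt(lam_i) - sqrt(lam_j))^2, Eq. (B-2)."""
    r = np.sqrt(np.asarray(lam, dtype=float))
    return (r[:, None] - r[None, :]) ** 2

def closed_form_d3(b):
    """min{alpha^I, alpha^II} of Eqs. (B-5)-(B-7)."""
    alpha_1 = 0.5 * sum(b[i, j] for i, j in PAIRS)
    j0, k0 = max(PAIRS, key=lambda p: b[p])           # pair with maximal b, Eq. (B-4)
    i0 = 3 - j0 - k0                                  # the remaining index
    alpha_2 = b[i0, j0] + b[i0, k0]
    return min(alpha_1, alpha_2)

def grid_search_d3(b, n=200):
    """Maximise min_ij b_ij / (p_i + p_j) over a grid on the simplex."""
    best = 0.0
    for a in range(n + 1):
        for c in range(n + 1 - a):
            p = (a / n, c / n, (n - a - c) / n)
            # Guard against a vanishing denominator when one p_k equals 1
            val = min(b[i, j] / max(p[i] + p[j], 1e-300) for i, j in PAIRS)
            best = max(best, val)
    return best

lam = [0.5, 0.3, 0.2]
b = b_matrix(lam)
closed = closed_form_d3(b)
grid = grid_search_d3(b)
```

For these coefficients the closed form selects \alpha^{II} with i_{0} labelling the middle Schmidt coefficient, and the grid value approaches it from below as the resolution is increased.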

Next, we discuss the case d=4. The elements \alpha_{ij} are again arranged in a tableau as in Fig. B-1(b). We begin by studying the properties of the optimal probability distribution \mathcal{P}_{0}. Suppose, as in the case d=3, that the \alpha_{ij}^{0} correspond to the optimal probability distribution \mathcal{P}_{0} and that \alpha_{12}^{0}=\min_{ij}\{\alpha_{ij}^{0}\}. We can formulate:

###### Lemma B-2.

The solution of the max-min problem (37) for d=4 is given by

\alpha^{0}=\min\{\mathfrak{a}^{I},\mathfrak{a}^{II},\mathfrak{a}^{III}\}, | (B-17) |

where

\begin{aligned}
\mathfrak{a}^{I} &= 1-2\sqrt{\lambda_{1}\lambda_{2}}-2\sqrt{\lambda_{3}\lambda_{4}}, \\
\mathfrak{a}^{II} &= 1-2\sqrt{\lambda_{1}\lambda_{3}}-2\sqrt{\lambda_{2}\lambda_{4}}, \\
\mathfrak{a}^{III} &= 1-2\sqrt{\lambda_{1}\lambda_{4}}-2\sqrt{\lambda_{2}\lambda_{3}}.
\end{aligned} | (B-18) |

Proof: The proof proceeds in several steps.

Step 1. Let us first consider optimal probability distributions \mathcal{P}_{0}=\{p_{1},p_{2},p_{3},p_{4}\} where all p_{i} are nonzero. In this case we show that for \alpha^{0}=\min\{\alpha_{ij}\} at least one of the three equations must hold:

\begin{aligned}
\alpha^{0} &= \alpha_{12}=\alpha_{34}, \\
\alpha^{0} &= \alpha_{13}=\alpha_{24}, \\
\alpha^{0} &= \alpha_{14}=\alpha_{23}.
\end{aligned} | (B-19) |

The idea of the proof is similar to the proof of Lemma B-1: we consider small perturbations of the optimal probability distribution \mathcal{P}_{0} that increase the minimal element \alpha^{0} and therefore destroy the optimality if some additional constraints are not fulfilled. As we will see, these constraints will give us the conditions Eq. (B-19).

Let us assume for definiteness that the optimal \alpha^{0} is given by \alpha_{12}. We can consider the following four transformations of the p_{i}:

\begin{aligned}
T_{1}: &\quad p_{1}^{\prime}=p_{1}-3\varepsilon,\quad p_{i}^{\prime}=p_{i}+\varepsilon\mbox{ for }i\neq 1, \\
T_{2}: &\quad p_{2}^{\prime}=p_{2}-3\varepsilon,\quad p_{i}^{\prime}=p_{i}+\varepsilon\mbox{ for }i\neq 2, \\
T_{3}: &\quad p_{3}^{\prime}=p_{3}+3\varepsilon,\quad p_{i}^{\prime}=p_{i}-\varepsilon\mbox{ for }i\neq 3, \\
T_{4}: &\quad p_{4}^{\prime}=p_{4}+3\varepsilon,\quad p_{i}^{\prime}=p_{i}-\varepsilon\mbox{ for }i\neq 4,
\end{aligned} | (B-20) |

where \varepsilon can be chosen arbitrarily small. All the transformations increase \alpha_{12}, but all have to keep the optimality of the probability distribution, so that the minimal \alpha given by \mathcal{P}^{\prime} cannot be larger than \alpha_{12}. From transformation T_{1} it follows that \mathcal{P}_{0} is optimal if and only if \alpha_{12}=\min\{\alpha_{23},\alpha_{24},\alpha_{34}\}, as these entries decrease under the transformation. Similarly, it follows from T_{2} that \alpha_{12}=\min\{\alpha_{13},\alpha_{14},\alpha_{34}\}, from T_{3} that \alpha_{12}=\min\{\alpha_{13},\alpha_{23},\alpha_{34}\}, and finally from T_{4} that \alpha_{12}=\min\{\alpha_{14},\alpha_{24},\alpha_{34}\}. Given this finite number of possibilities, one can directly check that either \alpha_{12}=\alpha_{34} or \alpha_{12}=\alpha_{13}=\alpha_{24} or \alpha_{12}=\alpha_{14}=\alpha_{23} must hold for the optimal probability distribution \mathcal{P}_{0}, which proves the first claim.

From these conditions we see that there are three candidates for the optimal \alpha^{0}:

\begin{aligned}
\alpha^{0}=\alpha_{12}=\alpha_{34} &\;\Rightarrow\; \alpha^{0}=\mathfrak{a}^{I}=b_{12}+b_{34}=1-2\sqrt{\lambda_{1}\lambda_{2}}-2\sqrt{\lambda_{3}\lambda_{4}}, \\
\alpha^{0}=\alpha_{13}=\alpha_{24} &\;\Rightarrow\; \alpha^{0}=\mathfrak{a}^{II}=b_{13}+b_{24}=1-2\sqrt{\lambda_{1}\lambda_{3}}-2\sqrt{\lambda_{2}\lambda_{4}}, \\
\alpha^{0}=\alpha_{14}=\alpha_{23} &\;\Rightarrow\; \alpha^{0}=\mathfrak{a}^{III}=b_{14}+b_{23}=1-2\sqrt{\lambda_{1}\lambda_{4}}-2\sqrt{\lambda_{2}\lambda_{3}}.
\end{aligned} | (B-21) |

Step 2. At this point, we have identified three candidates for \alpha^{0}, but it is not yet clear which one should be taken.

We will show now, however, that only the minimum of these can give a valid solution. For that, assume that one has a probability distribution \mathcal{P}_{1} which has the optimal \alpha^{0}(\mathcal{P}_{1})=\mathfrak{a}^{I}. Then \alpha_{34}^{0}=\alpha_{12}^{0}=\min_{ij}\{\alpha_{ij}\} and hence

\begin{aligned}
\alpha_{12}\leq\alpha_{13} &\;\Rightarrow\; b_{12}(p_{1}+p_{3})\leq b_{13}(p_{1}+p_{2}), \\
\alpha_{34}\leq\alpha_{24} &\;\Rightarrow\; b_{34}(p_{2}+p_{4})\leq b_{24}(p_{3}+p_{4}).
\end{aligned} | (B-22) |

Consequently, b_{12}+b_{34}\leq b_{13}+b_{24} and hence \mathfrak{a}^{I}\leq\mathfrak{a}^{II}. Similarly, it follows that \mathfrak{a}^{I}\leq\mathfrak{a}^{III}. So if one finds a solution, then it has to be the minimum of all \mathfrak{a}^{k}.

This also shows that if there is a second solution \mathcal{P}_{2} with \alpha^{0}(\mathcal{P}_{2})=\mathfrak{a}^{II}, then \alpha^{0}(\mathcal{P}_{1})=\alpha^{0}(\mathcal{P}_{2}) must hold, since \mathfrak{a}^{I}\leq\mathfrak{a}^{II} and \mathfrak{a}^{II}\leq\mathfrak{a}^{I}. Note also that the arguments leading to this did not require the assumption that the probability distributions have nonzero elements.

Summarizing Step 1 and Step 2, we can state that if there is an optimal probability distribution with non-zero elements, then the solution is given by

\alpha^{0}=\min\{\mathfrak{a}^{I},\mathfrak{a}^{II},\mathfrak{a}^{III}\}. | (B-23) |

Step 3. Now we have to consider the cases where the optimal probability distribution has some zero elements. Let us first consider the case that there is exactly one zero element.

There exist two possibilities. The first one arises when the minimum is given by \alpha_{12} and p_{4}=0. Then, the transformations T_{1},T_{2} and T_{4} in Eq. (B-20) can still be applied, but we have to modify T_{3}, since probabilities cannot be negative:

\hat{T}_{3}:\; p_{3}^{\prime}=p_{3}+2\varepsilon,\quad p_{i}^{\prime}=p_{i}-\varepsilon\mbox{ for }i=1,2,\quad p_{4}^{\prime}=p_{4}=0. | (B-24) |

This transformation leads to exactly the same condition as T_{3} above, namely \alpha_{12}=\min\{\alpha_{13},\alpha_{23},\alpha_{34}\}. Therefore, the same conclusion as in Step 1 can be drawn. Similarly, by considering \hat{T}_{4}, one can show that if p_{3}=0, the conclusion from Step 1 still holds.

The second possibility arises, if the minimum is again given by \alpha_{12}, but this time p_{1}=0. Then, only T_{2} in Eq. (B-20) can be applied. We define the modified transformations:

\begin{aligned}
\tilde{T}_{3}: &\quad p_{3}^{\prime}=p_{3}+2\varepsilon,\quad p_{i}^{\prime}=p_{i}-\varepsilon\mbox{ for }i=2,4,\quad p_{1}^{\prime}=p_{1}; \\
\tilde{T}_{4}: &\quad p_{4}^{\prime}=p_{4}+2\varepsilon,\quad p_{i}^{\prime}=p_{i}-\varepsilon\mbox{ for }i=2,3,\quad p_{1}^{\prime}=p_{1}.
\end{aligned} | (B-25) |

Then, repeating the argument from Step 1, one arrives at the same conclusion, apart from the special case in which \alpha_{12}=\alpha_{13}=\alpha_{14}<\alpha_{kl} holds for k,l\in\{2,3,4\}.

In this special case, we have that \alpha^{0}=(b_{12}+b_{13}+b_{14}) and consequently p_{k}=b_{1k}/(b_{12}+b_{13}+b_{14}) for k=2,3,4. Since \alpha_{23}=b_{23}/(p_{2}+p_{3})>\alpha^{0}=(b_{12}+b_{13}+b_{14}) it follows that b_{23}>b_{12}+b_{13}. Generally we have b_{kl}>b_{1k}+b_{1l}, for k,l\in\{2,3,4\}.

Due to the definition of the b_{ij}, it means that the Schmidt coefficients have to fulfill

(\sqrt{\lambda_{k}}-\sqrt{\lambda_{l}})^{2}>(\sqrt{\lambda_{1}}-\sqrt{\lambda_{k}})^{2}+(\sqrt{\lambda_{1}}-\sqrt{\lambda_{l}})^{2} | (B-26) |

for k,l\in\{2,3,4\}. Since the \sqrt{\lambda_{k}} are positive real numbers, this can only hold if \sqrt{\lambda_{1}} is inside the interval [\sqrt{\lambda_{k}};\sqrt{\lambda_{l}}]. As there are three intervals, and two of them intersect in only one point, we must have that \sqrt{\lambda_{1}}=\sqrt{\lambda_{i}} for some i\in\{2,3,4\}, which implies that the corresponding b_{1i}=0 and \alpha_{1i}=0. Since \alpha_{12}=\alpha_{13}=\alpha_{14} all of them must be zero and hence \alpha^{0}=0 and all b_{1k}=0 for any k\in\{2,3,4\}. Physically, this means that all Schmidt coefficients are the same and the state is a maximally entangled one. But then also \mathfrak{a}^{I}=\mathfrak{a}^{II}=\mathfrak{a}^{III}=0, so this special case does not deliver a novel solution.

Step 4. Let us now consider the case, where two or more p_{i} equal zero.

Let us first assume that exactly two p_{i} are zero, namely p_{2}=p_{3}=0. Then \alpha_{14}=b_{14} and \alpha_{23}=\infty are independent of the probability distribution. However, if we make the transformation

\mathfrak{T}:\; p_{i}^{\prime}=p_{i}-\varepsilon,\;\;i\in\{1,4\},\quad p_{k}^{\prime}=p_{k}+\varepsilon,\;\;k\in\{2,3\}, | (B-27) |

the minimal value \alpha^{0} does not decrease (as all \alpha_{12},\alpha_{13},\alpha_{24},\alpha_{34} remain constant and \alpha_{14} increases). Therefore we arrive at a solution, where none of the p_{i} is zero and which is as good as a solution with p_{2}=p_{3}=0. Thus we conclude that solutions given by distributions with two zero elements are contained in solutions characterized in Step 1.

Finally, we have to discuss the case that three p_{i} equal zero and consequently the remaining one equals one. This can be excluded with a similar transformation as in Eq. (B-27) and we leave the details as an exercise to the reader. \hfill\blacksquare
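As for d=3, the statement of Lemma B-2 can be compared with a brute-force search. The sketch below evaluates Eq. (B-17) and a coarse grid search over the four-dimensional simplex; the example Schmidt coefficients, the grid resolution and the function names are choices made here for illustration, and the grid value is only a lower approximation of the optimum.

```python
import numpy as np
from itertools import combinations

PAIRS = list(combinations(range(4), 2))

def closed_form_d4(lam):
    """min{a^I, a^II, a^III} of Eqs. (B-17) and (B-18)."""
    r = np.sqrt(np.asarray(lam, dtype=float))
    return min(1 - 2 * r[0] * r[1] - 2 * r[2] * r[3],
               1 - 2 * r[0] * r[2] - 2 * r[1] * r[3],
               1 - 2 * r[0] * r[3] - 2 * r[1] * r[2])

def grid_search_d4(lam, n=40):
    """Maximise min_ij b_ij / (p_i + p_j) over a grid on the simplex."""
    r = np.sqrt(np.asarray(lam, dtype=float))
    b = {(i, j): (r[i] - r[j]) ** 2 for i, j in PAIRS}
    best = 0.0
    for a in range(n + 1):
        for c in range(n + 1 - a):
            for e in range(n + 1 - a - c):
                p = (a / n, c / n, e / n, (n - a - c - e) / n)
                # Guard against vanishing denominators on the boundary
                val = min(b[i, j] / max(p[i] + p[j], 1e-300) for i, j in PAIRS)
                best = max(best, val)
    return best

lam = [0.4, 0.3, 0.2, 0.1]
closed = closed_form_d4(lam)
grid = grid_search_d4(lam)
```

For these coefficients the minimum in Eq. (B-17) is attained by \mathfrak{a}^{I}, and the grid value lies slightly below it because of the finite resolution.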

## References

- (1) R. Horodecki, P. Horodecki, M. Horodecki, and K. Horodecki, Rev. Mod. Phys. 81, 865 (2009).
- (2) O. Gühne and G. Tóth, Phys. Rep. 474, 1 (2009).
- (3) M. Plenio and S. Virmani, Quant. Inf. Comp. 7, 1 (2007).
- (4) K. Chen, S. Albeverio, and S.-M. Fei, Phys. Rev. Lett. 95, 040504 (2005); K. Chen, S. Albeverio, and S.-M. Fei, Phys. Rev. Lett. 95, 210501 (2005).
- (5) J. I. de Vicente, Phys. Rev. A 75, 052320 (2007); 77, 039903(E) (2008); C.-J. Zhang, Y.-S. Zhang, S. Zhang, and G.-C. Guo, Phys. Rev. A 76, 012334 (2007); J. I. de Vicente, J. Phys. A: Math. Theor. 41, 065309 (2008); L. Li-Guo, T. Cheng-Lin, C. Ping-Xing, and Y. Nai-Chang, Chinese Phys. Lett. 26, 060306 (2009).
- (6) F. Mintert and A. Buchleitner, Phys. Rev. A 72, 012336 (2005); O. Gühne, M. Reimpell and R.F. Werner, Phys. Rev. Lett. 98, 110502 (2007); J. Eisert, F. Brandão, and K. Audenaert, New J. Phys. 9, 46 (2007).
- (7) S.L. Braunstein and P. van Loock, Rev. Mod. Phys. 77, 513 (2005); X.B. Wang, T. Hiroshima, A. Tomita and M. Hayashi, Phys. Rep. 448, 1 (2007).
- (8) R. Simon, Phys. Rev. Lett. 84, 2726 (2000); L.-M. Duan, G. Giedke, J. I. Cirac, and P. Zoller, Phys. Rev. Lett. 84, 2722 (2000); R. F. Werner, and M. M. Wolf, Phys. Rev. Lett. 86, 3658 (2001); G. Giedke, B. Kraus, M. Lewenstein, and J. I. Cirac, Phys. Rev. Lett. 87, 167904 (2001); P. Hyllus and J. Eisert, New J. Phys. 8, 51 (2006).
- (9) O. Gühne, Phys. Rev. Lett. 92, 117903 (2004); J. I. de Vicente, Quantum Inf. Comput. 7, 624 (2007); C.-J. Zhang, Y.-S. Zhang, S. Zhang, and G.-C. Guo, Phys. Rev. A 77, 060301(R) (2008).
- (10) O. Gühne, P. Hyllus, O. Gittsovich, and J. Eisert, Phys. Rev. Lett. 99, 130504 (2007).
- (11) O. Gittsovich, O. Gühne, P. Hyllus, and J. Eisert, Phys. Rev. A 78, 052319 (2008).
- (12) G. Giedke and J. I. Cirac, Phys. Rev. A 66, 032316 (2002).
- (13) H.F. Hofmann and S. Takeuchi, Phys. Rev. A 68, 032103 (2003).
- (14) V. Vedral and M. B. Plenio, Phys. Rev. A 57, 1619 (1998); M. B. Plenio, Phys. Rev. Lett. 95, 090503 (2005).
- (15) O. Rudolph, Phys. Rev. A 67, 032312 (2003).
- (16) E. M. Rains, Phys. Rev. A 60, 179 (1999).
- (17) S. Hill and W. K. Wootters, Phys. Rev. Lett. 78, 5022 (1997).
- (18) W. K. Wootters, Phys. Rev. Lett. 80, 2245 (1998).
- (19) P. Rungta, V. Bužek, C.M. Caves, M. Hillery, and G.J. Milburn, Phys. Rev. A 64, 042315 (2001).
- (20) F. Verstraete, J. Dehaene, and B. De Moor, Phys. Rev. A 64, 010101 (R) (2001).