Using complete measurement statistics for optimal device-independent randomness evaluation

The majority of recent works investigating the link between non-locality and randomness, e.g. in the context of device-independent cryptography, do so with respect to some specific Bell inequality, usually the CHSH inequality. However, the joint probabilities characterizing the measurement outcomes of a Bell test are richer than just the degree of violation of a single Bell inequality. In this work we show how to take this extra information into account in a systematic manner in order to optimally evaluate the randomness that can be certified from non-local correlations. We further show that taking into account the complete set of outcome probabilities is equivalent to optimizing over all possible Bell inequalities, thereby allowing us to determine the optimal Bell inequality for certifying the maximal amount of randomness from a given set of non-local correlations.

1 Introduction

In the context of any non-signaling theory, and in particular in the context of quantum theory, outcomes of measurements on separate systems leading to a Bell violation cannot be completely pre-determined, i.e. the violation of a Bell inequality guarantees the presence of genuine randomness. This link between non-locality [1] and randomness is interesting on the fundamental level [2, 3], but is also the main ingredient behind device-independent randomness generation (DIRG) [4, 5, 7, 6, 8], randomness amplification [9, 10], and device-independent quantum key distribution (DIQKD) [11, 13, 12, 14, 15, 17, 16].

At the basis of such developments lies a quantitative relation between the amount of randomness that is necessarily produced in a Bell experiment and the degree of violation of a certain Bell inequality, such as the CHSH inequality [18, 5], the chained inequality [19, 11, 20, 9], or a Mermin-type inequality [21, 4, 10]. However, the set of data obtained in a Bell experiment is much richer than just the value of the violation of some Bell inequality. For example, in a CHSH experiment the behavior is characterized by eight independent parameters, which are compressed into the single number corresponding to the amount of CHSH violation. Moreover, in [3] it was shown that there exist two-input two-output Bell inequalities that can allow for the certification of more randomness than the CHSH inequality. Similar examples have been provided in [22]. Such results imply that taking into account extra data beyond the value of a single Bell violation can be useful, but leave open the questions of just how useful and of how to do so in a systematic manner.

These questions are especially relevant now that the detection loophole has been closed with entangled photons [23, 24] (albeit while leaving the locality loophole open), opening the door to high-rate DIRG. Nevertheless, there is still work to be done on the theoretical level before we can realize this goal efficiently. In particular, low detection efficiencies necessitate using states of low entanglement (for efficiencies below $2(\sqrt{2}-1) \approx 82.8\%$ the CHSH inequality cannot be violated using maximally entangled two-qubit states [25]), for which the CHSH inequality is not optimal with respect to randomness certification [3].

In this work we show how to evaluate the randomness produced in a Bell test, or, more specifically, how to obtain the device-independent guessing probability (DIGP) by systematically taking into account the complete non-local behavior, rather than just the violation of some pre-specified Bell inequality. We also show that for any set of non-local correlations, there exists a Bell inequality that is optimal for certifying the maximal amount of randomness given these correlations. Regarding this, we note that while the protocols in [5, 7, 6, 14, 15, 17] are general in the sense that they are not formulated with respect to some specific Bell inequality, they do not tell us the optimal Bell inequality to use given the measurement data. We then show how the optimal value of the DIGP and the associated optimal Bell inequality can be computed using the semidefinite programming (SDP) hierarchy introduced in [26]. Finally, we study three numerical examples illustrating the advantage in taking into account the complete non-local behavior, as opposed to taking into account only the violation of a specific Bell inequality.

2 Background: the device-independent guessing probability

We consider the following setting. Alice has access to a pair of quantum devices, or boxes, $A$ and $B$, which she can prevent from communicating at will, and whose internal state may be correlated with a system $E$ in the possession of an adversary Eve (or equivalently with the environment). The joint state of the boxes and Eve's system is described by a quantum state $\rho_{ABE}$. Alice introduces inputs $x$ and $y$, each chosen at random from the finite set $\{1,\ldots,m\}$, into boxes $A$ and $B$ and obtains outputs $a$ and $b$, respectively, each taking one of the values $\{1,\ldots,d\}$. This process is described by a pair of POVMs with elements $M_a^x$ and $M_b^y$, each acting on $A$ and $B$, respectively. The joint probability that the outputs $a$ and $b$ are obtained given the inputs $x$ and $y$ is $p(ab|xy) = \mathrm{tr}\bigl[\rho_{AB}\, M_a^x \otimes M_b^y\bigr]$, where $\rho_{AB} = \mathrm{tr}_E\,\rho_{ABE}$. There are a total of $d^2 m^2$ such joint probabilities, which we view as the components of a vector $\mathbf{p}$. We refer to this vector as the (non-local) behavior characterizing Alice's devices.
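As a concrete illustration (our own, not from the paper), the behavior of an ideal CHSH experiment can be computed from an explicit quantum realization in a few lines; the state, the measurement angles, and the function names below are choices made for this sketch.

```python
import numpy as np

# Pauli matrices and the two-qubit state |phi+> = (|00> + |11>)/sqrt(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(phi, phi.conj())

def obs(theta):
    """Binary observable cos(theta) Z + sin(theta) X, with outcomes +-1."""
    return np.cos(theta) * Z + np.sin(theta) * X

# Tsirelson-optimal settings: A_0 = Z, A_1 = X, B_y rotated by +-pi/4
A = [obs(0), obs(np.pi / 2)]
B = [obs(np.pi / 4), obs(-np.pi / 4)]

def p(a, b, x, y):
    """Joint probability p(ab|xy) via the projectors (1 + a A_x)/2, (1 + b B_y)/2."""
    Pa = (np.eye(2) + a * A[x]) / 2
    Pb = (np.eye(2) + b * B[y]) / 2
    return np.real(np.trace(rho @ np.kron(Pa, Pb)))

# Correlators <A_x B_y> and the resulting CHSH value S = 2*sqrt(2)
E = lambda x, y: sum(a * b * p(a, b, x, y) for a in (1, -1) for b in (1, -1))
S = E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)
```

The sixteen numbers $p(ab|xy)$ produced this way are exactly the components of the behavior vector for this realization.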

We refer to a specific state $\rho_{AB}$ and sets of measurement operators $\{M_a^x\}$ and $\{M_b^y\}$, yielding the behavior $\mathbf{p}$, as a quantum realization of $\mathbf{p}$. We denote by $Q$ the convex set of all behaviors that admit a valid quantum realization. In the following, it will be useful to consider measurements on unnormalized quantum states (i.e. $\mathrm{tr}\,\rho_{AB} \le 1$). We denote the corresponding behaviors by $\tilde{\mathbf{p}}$ and define their norm as $|\tilde{\mathbf{p}}| = \sum_{ab} \tilde{p}(ab|xy)$, which by no-signaling is independent of the inputs $x$ and $y$. We denote by $\tilde{Q}$ the corresponding set of unnormalized quantum behaviors, which is a convex cone.

In general, different quantum realizations are possible for a given behavior $\mathbf{p}$. Our aim is to quantify the randomness generated by the boxes from $\mathbf{p}$ alone, independently of the possible underlying quantum realizations compatible with $\mathbf{p}$. To simplify the notation, we describe in the following how to quantify the local randomness associated with box $A$'s output $a$ when a certain input $x$ is used. The global randomness associated with both boxes' outputs $a$ and $b$ for a given pair of inputs $x$ and $y$ can be treated analogously.

To begin, let us fix a specific quantum realization compatible with $\mathbf{p}$. This quantum realization defines an initial state $\rho_{ABE}$ and sets of projectors $\{M_a^x\}$ and $\{M_b^y\}$ (see footnote 1). After Alice's measurement the correlations between her classical output and the quantum information held by Eve are described by the classical-quantum state $\rho_{AE} = \sum_a p(a|x)\,|a\rangle\langle a| \otimes \rho_E^a$, where $\rho_E^a$ is the reduced state of Eve given that Alice performed measurement $x$ and obtained outcome $a$. The randomness of box $A$'s output given this side information can be quantified by the guessing probability [27, 3]: the average probability that Eve correctly guesses box $A$'s output using an optimal strategy. Such an optimal strategy is described by a $d$-element POVM $\{M_e\}$ that Eve performs on her system; if she obtains the output $e$, which happens with probability $\mathrm{tr}(\rho_E^a M_e)$ when her system is in the reduced state $\rho_E^a$, she guesses that box $A$'s output was $e$. Optimizing over all possible measurements, her average probability of guessing correctly is thus given by

$$ G\bigl(x \,\big|\, \rho_{ABE}, \{M_a^x\}\bigr) = \max_{\{M_e\}} \sum_e p(a = e\,|\,x)\,\mathrm{tr}\bigl(\rho_E^{a=e} M_e\bigr). \qquad (1) $$
The above expression defines the guessing probability, which is related to the quantum min-entropy $H_{\min}$ through $G = 2^{-H_{\min}}$ [27] (see footnote 2). Note that in the above definition we made the dependence on the realization $\bigl(\rho_{ABE}, \{M_a^x\}\bigr)$ explicit to stress that we are considering a given quantum realization of $\mathbf{p}$. Since our aim is to obtain a bound on the randomness of the outputs that depends only on $\mathbf{p}$, but not on a specific quantum realization of $\mathbf{p}$, we must further maximize over all realizations compatible with $\mathbf{p}$:

$$ G(x\,|\,\mathbf{p}) = \max_{\rho_{ABE},\,\{M_a^x\},\,\{M_b^y\}} G\bigl(x \,\big|\, \rho_{ABE}, \{M_a^x\}\bigr) \quad \text{s.t.} \quad \mathrm{tr}\bigl[\rho_{AB}\, M_a^x \otimes M_b^y\bigr] = p(ab|xy). \qquad (2) $$
This defines the DIGP, the quantity which interests us in this work.
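For a fixed realization with binary outputs, Eve's optimal measurement in Eq. (1) reduces to discriminating her two conditional states, for which the optimum is given by the Helstrom bound. The following sketch (our own illustration; the state choices are hypothetical) makes this concrete.

```python
import numpy as np

def guessing_probability(p0, rho0, rho1):
    """Eve's optimal probability of guessing a binary output a, when she holds
    rho0 with probability p0 and rho1 with probability p1 = 1 - p0.
    By the Helstrom bound this equals (1 + || p0 rho0 - p1 rho1 ||_1) / 2."""
    p1 = 1 - p0
    delta = p0 * rho0 - p1 * rho1
    trace_norm = np.sum(np.abs(np.linalg.eigvalsh(delta)))
    return (1 + trace_norm) / 2

# Two example conditional states for Eve (density matrices):
ket0 = np.array([[1, 0], [0, 0]], dtype=float)   # |0><0|
ket1 = np.array([[0, 0], [0, 1]], dtype=float)   # |1><1|
```

When Eve's conditional states are orthogonal she learns the output perfectly ($G = 1$); when they are identical she can only guess the likelier outcome, so no side information helps her beyond the bias of the output itself.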

3 The device-independent guessing probability as a conic linear program

We have expressed the guessing probability as an average over Eve's probabilities conditioned on box $A$'s outputs, but we can also express it, using Bayes' rule, as an average over Alice's probabilities conditioned on Eve's outcomes:

$$ G(x\,|\,\mathbf{p}) = \sum_e p(e)\, p(a = e\,|\,x, e). \qquad (3) $$
Here $p(e)$ is the probability that Eve obtains the outcome $e$ and $p(a|x,e)$ is the probability that box $A$ outputs $a$ conditioned on that event. More generally, conditioning on Eve's outcomes defines a family of behaviors $\mathbf{p}_e$ for boxes $A$ and $B$, or more conveniently of unnormalized behaviors $\tilde{\mathbf{p}}_e = p(e)\,\mathbf{p}_e$. Note that averaging over these behaviors yields back the given behavior characterizing the boxes: $\sum_e \tilde{\mathbf{p}}_e = \mathbf{p}$. Every choice of quantum realization and POVM $\{M_e\}$ for Eve defines a family of quantum behaviors $\tilde{\mathbf{p}}_e \in \tilde{Q}$ satisfying this property. Conversely, it is not difficult to see that any set of behaviors $\tilde{\mathbf{p}}_e \in \tilde{Q}$ satisfying $\sum_e \tilde{\mathbf{p}}_e = \mathbf{p}$ can be interpreted as describing the conditional joint output probabilities of boxes $A$ and $B$ for some quantum realization of $\mathbf{p}$ and some POVM performed by Eve. In terms of the unnormalized behaviors, we can write Eq. (3) as $G(x\,|\,\mathbf{p}) = \sum_e \tilde{p}_e(a = e\,|\,x)$, and thus the DIGP associated with $\mathbf{p}$ is the solution to the following optimization problem

$$ G(x\,|\,\mathbf{p}) = \max_{\{\tilde{\mathbf{p}}_e\}} \; \sum_e \tilde{p}_e(a = e\,|\,x) \quad \text{s.t.} \quad \sum_e \tilde{\mathbf{p}}_e = \mathbf{p}, \quad \tilde{\mathbf{p}}_e \in \tilde{Q}, \qquad (4) $$
where the $\tilde{\mathbf{p}}_e$ are the optimization variables. This is a typical instance of a conic linear program [28], i.e. the optimization of a linear objective function ($\sum_e \tilde{p}_e(a = e\,|\,x)$) subject to linear constraints ($\sum_e \tilde{\mathbf{p}}_e = \mathbf{p}$) and to the constraint that the optimization variables belong to a convex cone ($\tilde{\mathbf{p}}_e \in \tilde{Q}$, since $\tilde{Q}$ is a closed convex cone).

The program Eq. (4) has a simple physical interpretation. Any feasible point $\{\tilde{\mathbf{p}}_e\}$ corresponds to a possible quantum decomposition $\mathbf{p} = \sum_e \tilde{\mathbf{p}}_e$ of the behavior $\mathbf{p}$. From the point of view of an adversary, such a decomposition can be understood as a strategy where with probability $p(e) = |\tilde{\mathbf{p}}_e|$ the adversary guesses that box $A$'s output was $e$ and prepares the quantum behavior $\mathbf{p}_e = \tilde{\mathbf{p}}_e / p(e)$ for the boxes. The probability of correctly guessing box $A$'s output with this strategy is $\sum_e \tilde{p}_e(a = e\,|\,x)$. The program Eq. (4) simply searches for the optimal quantum strategy that maximizes this expression.

4 Dual formulation and optimal Bell expressions

Every conic linear program admits a dual formulation (see, e.g., [28]), which in the case of Eq. (4) is readily seen to be

$$ G(x\,|\,\mathbf{p}) = \min_{\mathbf{f}} \; \mathbf{f} \cdot \mathbf{p} \quad \text{s.t.} \quad \mathbf{f} \cdot \tilde{\mathbf{p}} \ge \tilde{p}(a = e\,|\,x) \;\;\text{for all } e \text{ and all } \tilde{\mathbf{p}} \in \tilde{Q}. \qquad (5) $$
In the above problem the optimization variable is the vector $\mathbf{f} = \{f_{abxy}\}$. It can be interpreted as defining a Bell expression whose expectation value is $\mathbf{f} \cdot \mathbf{p} = \sum_{abxy} f_{abxy}\, p(ab|xy)$. That is, it defines a linear form in the behavior $\mathbf{p}$. The constraint in Eq. (5) means that $\mathbf{f} \cdot \tilde{\mathbf{p}} \ge \tilde{p}(a = e\,|\,x)$ should hold for every $e$ and all $\tilde{\mathbf{p}} \in \tilde{Q}$. Whenever $\mathbf{f}$ satisfies this constraint, the expectation value $\mathbf{f} \cdot \mathbf{p}$ provides an upper-bound on the guessing probability since, for any decomposition $\mathbf{p} = \sum_e \tilde{\mathbf{p}}_e$ achieving the optimum of Eq. (4),

$$ \mathbf{f} \cdot \mathbf{p} = \sum_e \mathbf{f} \cdot \tilde{\mathbf{p}}_e \ge \sum_e \tilde{p}_e(a = e\,|\,x) = G(x\,|\,\mathbf{p}). \qquad (6) $$
In particular, given a fixed Bell expression, such as the CHSH expression $S = \langle A_0 B_0 \rangle + \langle A_0 B_1 \rangle + \langle A_1 B_0 \rangle - \langle A_1 B_1 \rangle$, one can determine coefficients $\mu$ and $\nu$ (effectively defining a new linear form $\mathbf{f}$ with $\mathbf{f} \cdot \mathbf{p} = \mu S + \nu$) such that $\mathbf{f}$ satisfies the constraint of Eq. (5), and thus $G(x\,|\,\mathbf{p}) \le \mu S + \nu$. Such bounds on the DIGP are the ones that are used in most works related to DIRG or DIQKD, see e.g. [5, 7, 6, 8, 29] and [14, 15, 17, 16], respectively. The program Eq. (5) goes further since it does not assume a fixed Bell expression, but determines the linear form $\mathbf{f}$ that yields the lowest upper-bound on the DIGP for a given behavior $\mathbf{p}$.
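For the CHSH expression, the resulting family of linear bounds has a well-known lower envelope [5]: the local guessing probability is at most $1/2 + \tfrac{1}{2}\sqrt{2 - S^2/4}$ as a function of the CHSH value $S$. A quick numerical check of its endpoints (our own sketch):

```python
import numpy as np

def chsh_guessing_bound(S):
    """Upper bound on the local guessing probability from the CHSH value S
    (valid for 2 <= S <= 2*sqrt(2)), cf. [5]:
        G <= 1/2 + (1/2) * sqrt(2 - S**2 / 4).
    The max(..., 0) guards against tiny negative arguments from rounding."""
    return 0.5 + 0.5 * np.sqrt(max(2.0 - S**2 / 4.0, 0.0))

# S = 2 (no violation): nothing is certified (G = 1).
# S = 2*sqrt(2) (maximal violation): a fully random bit (G = 1/2).
```

This curve is the lower envelope, over all valid $(\mu, \nu)$, of the linear bounds $\mu S + \nu$ produced by Eq. (5) when the only data kept is the CHSH value.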

The fact that the dual optimal solution yields an upper bound on the primal optimal solution is a general result that holds between any primal–dual pair of conic linear programs. Provided that one of the two programs admits a strictly feasible solution, it further holds that there is no gap between the primal and dual optimal solutions, i.e. the optima of Eqs. (4) and (5) coincide. This is the case here: the uniform form $\mathbf{f}$ defined by $f_{abxy} = \lambda$ for all $a, b, x, y$, with $\lambda$ sufficiently large, satisfies $\mathbf{f} \cdot \tilde{\mathbf{p}} > \tilde{p}(a = e\,|\,x)$ for all nonzero $\tilde{\mathbf{p}} \in \tilde{Q}$, and so represents a strictly feasible point of the dual problem.
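Strong duality can be observed numerically even in a toy conic program over the nonnegative orthant; the instance below (our own, unrelated to any Bell scenario) solves a primal–dual LP pair with scipy and checks that the two optima coincide.

```python
import numpy as np
from scipy.optimize import linprog

# Primal: max c.x  s.t.  A x = b, x >= 0  (a conic LP over the cone x >= 0)
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 1.0])
c = np.array([1.0, 3.0, 1.0])

primal = linprog(-c, A_eq=A, b_eq=b, bounds=(0, None))   # linprog minimizes
primal_opt = -primal.fun

# Dual: min b.y  s.t.  A^T y >= c  (y free)
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=(None, None))
dual_opt = dual.fun
```

Here the primal optimum and the dual optimum agree, as guaranteed by strict feasibility; the same mechanism underlies the equality of Eqs. (4) and (5).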

The programs Eqs. (4) and (5) are equivalent but have different interpretations. As we have explained above, the feasible points of the primal program correspond to explicit strategies for the adversary. Any such strategy yields a lower-bound on the DIGP. The primal program Eq. (4) searches for the optimal strategy that maximizes the guessing probability. On the other hand, any feasible point of the dual program corresponds to a Bell expression which certifies that a certain amount of randomness is present in the given behavior and yields an upper-bound on the DIGP. The dual program Eq. (5) searches for the Bell expression which certifies the maximal amount of randomness. The duality theorem of conic linear programming tells us that the optimal solutions of both programs are identical, and thus that for every behavior there exists a Bell expression which certifies the full amount of randomness present in the correlations.

5 Semidefinite programming relaxations

The above conic linear programming formulations of the DIGP are in general difficult to implement exactly. However, they can be relaxed using the SDP method introduced in [26, 30]. This method introduces a hierarchy of convex sets $\tilde{Q}_1 \supseteq \tilde{Q}_2 \supseteq \cdots \supseteq \tilde{Q}$, which approximate the quantum set from the outside (see footnote 3). The hierarchy of programs

$$ G_k(x\,|\,\mathbf{p}) = \max_{\{\tilde{\mathbf{p}}_e\}} \; \sum_e \tilde{p}_e(a = e\,|\,x) \quad \text{s.t.} \quad \sum_e \tilde{\mathbf{p}}_e = \mathbf{p}, \quad \tilde{\mathbf{p}}_e \in \tilde{Q}_k \qquad (7) $$
therefore provides a sequence of relaxations of Eq. (4), which yield upper-bounds on the DIGP: $G(x\,|\,\mathbf{p}) \le \cdots \le G_2(x\,|\,\mathbf{p}) \le G_1(x\,|\,\mathbf{p})$. In this approach a behavior $\tilde{\mathbf{p}}$ belongs to $\tilde{Q}_k$ if and only if there exists a positive semidefinite moment matrix satisfying a series of linear constraints that relate its entries to the components of $\tilde{\mathbf{p}}$ (see [30, 33] for details). Since the objective function and the first set of constraints in Eq. (7) are also linear, the problems Eq. (7) can be cast as SDP problems, for which efficient algorithms are available.
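Although running the hierarchy requires an SDP solver, the special case of two-input two-output correlators with unbiased marginals has a known closed-form characterization of the quantum set (the Tsirelson–Landau–Masanes arcsine criterion), which is useful as a quick sanity check on quantum membership. A sketch of that check (our own illustration; `tlm_quantum` is a hypothetical name):

```python
import numpy as np

def tlm_quantum(E, tol=1e-9):
    """Check whether a 2x2 table of correlators E[x][y] (with unbiased
    marginals) admits a quantum realization, via the arcsine criterion:
    for each position (x, y) of the distinguished term,
    |sum_{x'y'} asin E[x'][y'] - 2 asin E[x][y]| <= pi must hold."""
    s = sum(np.arcsin(E[x][y]) for x in (0, 1) for y in (0, 1))
    return all(abs(s - 2 * np.arcsin(E[x][y])) <= np.pi + tol
               for x in (0, 1) for y in (0, 1))

r = 1 / np.sqrt(2)
tsirelson = [[r, r], [r, -r]]   # on the boundary of the quantum set
pr_box = [[1, 1], [1, -1]]      # PR box: non-signaling but not quantum
```

The Tsirelson correlators sit exactly on the boundary of the criterion, while the PR box violates it, consistent with the supra-quantum sets $\tilde{Q}_k$ shrinking towards $\tilde{Q}$.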

This SDP hierarchy can also be understood from the perspective of the dual problem Eq. (5). To see this, we note that the constraint in Eq. (5) is equivalent to requiring $\langle \psi | \hat{F}_e | \psi \rangle \ge 0$ for all possible quantum states $|\psi\rangle$ and all possible operators of the form $\hat{F}_e = \sum_{abxy} f_{abxy}\, M_a^x \otimes M_b^y - M_e^x \otimes \mathbb{1}$, where $\{M_a^x\}$ and $\{M_b^y\}$ are valid sets of measurement operators. This in turn is equivalent to $\hat{F}_e \ge 0$ for all such operators. We say that $\hat{F}_e$ admits a sum of squares (SOS) decomposition of degree $2k$, and write $\hat{F}_e \in \mathrm{SOS}_{2k}$, if there exists a set of polynomials $q_i$ of degree $k$ in the operators $M_a^x, M_b^y$ such that $\hat{F}_e = \sum_i q_i^\dagger q_i$ holds for any sets of valid measurement operators $\{M_a^x\}$ and $\{M_b^y\}$. If this is the case, it clearly follows that $\hat{F}_e \ge 0$. Therefore, the series of problems

$$ G_k(x\,|\,\mathbf{p}) = \min_{\mathbf{f}} \; \mathbf{f} \cdot \mathbf{p} \quad \text{s.t.} \quad \hat{F}_e \in \mathrm{SOS}_{2k} \;\;\text{for all } e \qquad (8) $$
represents a sequence of relaxations of the dual problem Eq. (5) yielding upper-bounds on the DIGP.

It is well known that such an SOS constraint can be represented as an SDP constraint [31], and thus that the relaxations Eq. (8) are SDP problems. Such SDP relaxations turn out to be nothing but the dual formulation of the SDP relaxations Eq. (7) [30, 32] (see [33] for more details on the relation between the primal and dual of the SDP hierarchy).

Even though the primal and dual SDP relaxations Eqs. (7) and (8) are equivalent, like the original programs they have different interpretations. Feasible points of the primal programs correspond to decompositions of $\mathbf{p}$ in terms of supra-quantum behaviors in $\tilde{Q}_k$. They can be understood as characterizing the strategies available to an adversary who is able to prepare supra-quantum behaviors. Such strategies are not necessarily available in a purely quantum setting and thus the associated values $G_k(x\,|\,\mathbf{p})$ represent upper-bounds on the DIGP. The dual programs, on the other hand, return explicit Bell expressions certifying that the DIGP cannot be higher than $G_k(x\,|\,\mathbf{p})$. Such bounds are valid – and optimal – for any strategy in $\tilde{Q}_k$ and thus are also valid – though not necessarily optimal – for any quantum strategy in $\tilde{Q}$. In other words, the SDP relaxations Eqs. (7) and (8) not only give a bound on the DIGP, but also return explicit Bell expressions that can be used in any analysis based on a quantitative relation between the amount of Bell violation and randomness, such as in [14, 17, 16, 5, 7, 6, 8, 9, 10].

6 Numerical examples

In this section we present three numerical examples demonstrating the advantage of taking into account the complete non-local behavior. In the first two examples, we consider a two-input two-output Bell scenario. We introduce the eight expectation values $\langle A_x \rangle$, $\langle B_y \rangle$, $\langle A_x B_y \rangle$, where $x, y \in \{0, 1\}$ and $A_x, B_y = \pm 1$ denote the measurement outcomes. Their knowledge is equivalent to the knowledge of the complete set of probabilities $p(ab|xy)$.

CHSH correlations in the presence of white noise.

We first consider the randomness that can be extracted from a mixture of maximally violating CHSH correlations plus white noise, i.e. correlations of the form $\mathbf{p}_v = v\,\mathbf{p}_{\mathrm{CHSH}} + (1 - v)\,\mathbf{p}_{\mathbb{1}}$, where $\mathbf{p}_{\mathrm{CHSH}}$ are the quantum correlations yielding the maximal CHSH violation of $2\sqrt{2}$ and $\mathbf{p}_{\mathbb{1}}$ denotes completely random correlations for which $\langle A_x \rangle = \langle B_y \rangle = \langle A_x B_y \rangle = 0$ for all $x, y$. As a function of the "visibility" $v$, the CHSH violation is thus given by $S(v) = 2\sqrt{2}\,v$. Naively, one would expect that in such a simple example knowledge of the full non-local behavior is of no greater utility than knowledge of the CHSH violation alone. Surprisingly, Figure 1 shows that this is not the case, although the improvement that we get by considering the full non-local behavior is modest. We have determined numerically the corresponding optimal Bell inequalities as a function of $v$ by solving explicitly the dual programs. We find that these inequalities all have the form


where the coefficients are given in Figure 2. The optimal inequality coincides with the CHSH inequality only in the case of perfect visibility ($v = 1$). This shows that in any real experiment, in which the visibility is necessarily imperfect ($v < 1$), the optimal Bell inequality for randomness certification is not always the CHSH inequality.
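At the level of the eight correlators, the noisy behavior and its CHSH value are straightforward to tabulate; the snippet below (our own illustration) checks the linear dependence $S(v) = 2\sqrt{2}\,v$.

```python
import numpy as np

def noisy_chsh_correlators(v):
    """Two-body correlators <A_x B_y> of p_v = v * p_CHSH + (1 - v) * p_1.
    White noise has all correlators zero, so mixing simply rescales them;
    the single-body marginals <A_x>, <B_y> remain zero throughout."""
    E_chsh = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    return v * E_chsh

def chsh_value(E):
    """CHSH combination of a 2x2 table of correlators."""
    return E[0, 0] + E[0, 1] + E[1, 0] - E[1, 1]
```

Because the CHSH value compresses these correlators into a single number, two different behaviors with the same $S$ can certify different amounts of randomness, which is precisely the gap the full-statistics program exploits.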

Randomness from partially entangled states.

In the second example, we consider the following set of correlations


where $0 \le v \le 1$. For $v = 1$ these correlations are obtained by measuring a partially entangled state of the form $\cos\theta\,|00\rangle + \sin\theta\,|11\rangle$ and give rise to a maximal violation of the corresponding inequality of [3]. A value of $v < 1$ corresponds to a mixture of these correlations with completely white noise in the respective fractions of $v$ and $1 - v$.
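The inequalities of [3] are often written in "tilted-CHSH" form; one common representative is $I_\beta = \beta\langle A_0 \rangle + \langle A_0 B_0 \rangle + \langle A_0 B_1 \rangle + \langle A_1 B_0 \rangle - \langle A_1 B_1 \rangle$, whose local bound $2 + \beta$ can be recovered by brute-force enumeration of deterministic strategies. A sketch under that assumption (the exact parameterization used in [3] may differ):

```python
import itertools

def tilted_chsh(beta, a, b):
    """Tilted-CHSH value beta*<A_0> + <A_0 B_0> + <A_0 B_1> + <A_1 B_0> - <A_1 B_1>
    for deterministic assignments a = (a0, a1), b = (b0, b1) with values +-1."""
    return beta * a[0] + a[0] * b[0] + a[0] * b[1] + a[1] * b[0] - a[1] * b[1]

def local_bound(beta):
    """Maximum over the 16 local deterministic strategies."""
    return max(tilted_chsh(beta, a, b)
               for a in itertools.product((1, -1), repeat=2)
               for b in itertools.product((1, -1), repeat=2))
```

For $\beta = 0$ this reduces to the ordinary CHSH local bound of 2; the tilt rewards the adversary-free certification of randomness from partially entangled states.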

Figure 3 presents bounds on the global DIGP corresponding to the pair of outcomes $(a, b)$ associated with a given pair of measurements, as a function of the visibility $v$ for a fixed value of $\theta$. We see that taking into account complete sets of correlations can provide a very significant advantage, not only as compared with taking into account the violation of a single Bell inequality, but also as compared with the violations of two independent Bell inequalities.

It is interesting to see what the optimal Bell inequalities, obtained via the dual formulation of the SDP programs, look like. The significant advantage obtained in Fig. 3 by taking into account the complete data suggests that the corresponding optimal Bell inequalities are more than mere tweaks of the Bell inequalities that have thus far been investigated for the purposes of DIRG (essentially the inequalities of [3]). This intuition is indeed backed up by the numerics. For example, for one particular choice of parameters we obtain the Bell expression


whose local bound is .

Randomness from entangled qutrits.

As the last example, we consider the two-input, three-output Bell-CGLMP scenario [34]. Specifically, we consider correlations which violate the CGLMP inequality and which arise by performing the measurements specified in [34] on the family of states

$$ |\psi_\gamma\rangle = \frac{1}{\sqrt{2 + \gamma^2}} \bigl( |00\rangle + \gamma\,|11\rangle + |22\rangle \bigr) \qquad (12) $$
with $\gamma \ge 0$. For $\gamma \to \infty$ the state is a product state, for $\gamma = 1$ it is a maximally entangled two-qutrit state, while for $\gamma = 0$ it is a maximally entangled two-qubit state. For $\gamma = (\sqrt{11} - \sqrt{3})/2 \approx 0.792$ the CGLMP inequality is maximally violated [35], while for a range of values of $\gamma$ no violation is obtained using the set of measurements considered. Figure 4 presents bounds on the randomness which can be certified in this scenario, taking into account only the CGLMP violation or the full non-local behavior. Unsurprisingly, at the point of maximal violation of the CGLMP inequality, we can certify one trit of randomness, i.e. $G = 1/3$. However, taking into account the complete behavior, a large interval of values of $\gamma$ yields $G = 1/3$, including values for which the CGLMP violation is small. These results have been obtained using the second order relaxation of the SDP hierarchy. The range of values of $\gamma$ for which $G = 1/3$ may thus turn out to be larger when going to higher order SDP relaxations or using different measurements from those specified in [34].

7 Conclusion

We have shown how the device-independent guessing probability can be evaluated by taking into account in a systematic way the complete non-local behavior characterizing a Bell test, and not only the violation of a pre-specified Bell inequality. We have also shown that for any given non-local correlations, there exists an optimal Bell inequality that certifies the maximal amount of randomness compatible with these correlations. Explicit upper-bounds on the device-independent guessing probability and their associated Bell inequalities can be computed by adapting the SDP hierarchy introduced in [26]. As is often the case with applications of the SDP hierarchy, low-order relaxations already yield the optimal value of the guessing probability in practice.

Our approach can be straightforwardly adapted to quantify randomness in purely non-signaling settings (i.e. without requiring the validity of quantum theory). The corresponding programs are simply the analogues of Eqs. (4) and (5), where the constraints $\tilde{\mathbf{p}}_e \in \tilde{Q}$ and $\mathbf{f} \cdot \tilde{\mathbf{p}} \ge \tilde{p}(a = e\,|\,x)$ for all $\tilde{\mathbf{p}} \in \tilde{Q}$ are replaced by the analogous constraints over the cone $\widetilde{NS}$ of unnormalized non-signaling behaviors. Since $\widetilde{NS}$ is entirely characterized by linear constraints (the no-signaling constraints [36] and the positivity of probabilities), these programs can be solved using linear programming.
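In the simplest two-input two-output case this non-signaling linear program is small enough to write out explicitly. The sketch below (our own implementation; the function names are hypothetical) computes the non-signaling DIGP of box $A$'s output for input $x = 0$ with scipy, recovering the fact that a PR box yields a fully random bit while white noise certifies nothing.

```python
import numpy as np
from scipy.optimize import linprog

def idx(e, a, b, x, y):
    """Flat index of the variable ptilde_e(ab|xy); all labels in {0, 1}."""
    return (((e * 2 + a) * 2 + b) * 2 + x) * 2 + y

def ns_guessing_probability(p):
    """Non-signaling DIGP of box A's output for input x = 0.
    p[a, b, x, y] is the observed behavior. Variables are two unnormalized
    non-signaling behaviors ptilde_e (Eve's guess e in {0, 1}) summing to p."""
    n = 32
    # Objective: maximize sum_e sum_b ptilde_e(a=e, b | x=0, y=0)
    c = np.zeros(n)
    for e in range(2):
        for b in range(2):
            c[idx(e, e, b, 0, 0)] = -1.0          # linprog minimizes
    A_eq, b_eq = [], []
    # Decomposition constraint: sum_e ptilde_e = p
    for a in range(2):
        for b in range(2):
            for x in range(2):
                for y in range(2):
                    row = np.zeros(n)
                    row[idx(0, a, b, x, y)] = 1.0
                    row[idx(1, a, b, x, y)] = 1.0
                    A_eq.append(row)
                    b_eq.append(p[a, b, x, y])
    # No-signaling within each ptilde_e: marginals independent of the remote input
    for e in range(2):
        for a in range(2):
            for x in range(2):
                row = np.zeros(n)
                for b in range(2):
                    row[idx(e, a, b, x, 0)] += 1.0
                    row[idx(e, a, b, x, 1)] -= 1.0
                A_eq.append(row)
                b_eq.append(0.0)
        for b in range(2):
            for y in range(2):
                row = np.zeros(n)
                for a in range(2):
                    row[idx(e, a, b, 0, y)] += 1.0
                    row[idx(e, a, b, 1, y)] -= 1.0
                A_eq.append(row)
                b_eq.append(0.0)
    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=(0, None))
    return -res.fun

# PR box: p(ab|xy) = 1/2 if a XOR b = x AND y, else 0 (extremal non-signaling point)
pr = np.array([[[[0.5 if (a ^ b) == (x & y) else 0.0
                  for y in range(2)] for x in range(2)]
                for b in range(2)] for a in range(2)])

# White noise: a local behavior, so Eve can in principle know the output perfectly
noise = np.full((2, 2, 2, 2), 0.25)
```

Because the PR box is extremal in the non-signaling polytope, any valid decomposition consists of rescaled copies of it, so Eve's guessing probability equals the uniform marginal $1/2$; the white-noise behavior, being local, decomposes into strategies where Eve knows the output, giving $G = 1$.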

We expect that the tools that we have presented will contribute to advancing our fundamental understanding of the relation between non-locality and randomness, and its cryptographic applications. In particular, the simple examples that we have studied (especially Figures 1, 2, and 4) already yield unexpected results that motivate further investigations. Finally, it would be interesting to understand the optimal way to incorporate our method directly into protocols for DIRG and DIQKD, taking into account finite-statistics effects.

Note added.

Similar results to our own have been obtained independently and in parallel by J.D. Bancal, L. Sheridan, and V. Scarani [37].


We acknowledge financial support from the European Union under the projects QCS, QALGO, and DIQIP, and from the F.R.S.-FNRS under the project DIQIP. S.P. acknowledges support from the Brussels-Capital Region through a BB2B grant. J.S. is chargé de recherches du F.R.S.-FNRS. O.N.S. acknowledges support from the F.R.S.-FNRS under a grant from the Fonds pour la Formation à la Recherche dans l’Industrie et l’Agriculture (F.R.I.A.). The Matlab toolboxes YALMIP [38] and SeDuMi [39] were used to solve the SDPs giving rise to the figures in Section 6.

Figure 1: Global randomness as a function of the visibility $v$ for optimally violating CHSH correlations in the presence of white noise. The dashed curve was obtained by taking into account only the CHSH value (i.e. $S(v) = 2\sqrt{2}\,v$), while the solid curve was obtained by taking into account the full non-local behavior. Both curves were obtained using the second order relaxation of the SDP hierarchy and are actually optimal up to the numerical precision used (we have verified optimality by finding explicit states and measurements saturating the bounds given by the SDP programs). Except when $v = 1$, i.e. when there is no noise, we see that there is a small advantage in taking into account the full non-local behavior.
Figure 2: Coefficients of the optimal Bell inequalities Eq. (9) as a function of the visibility $v$. The CHSH inequality is recovered only for perfect visibility $v = 1$.
Figure 3: Global DIGP as a function of the visibility $v$, computed by taking into account partial or complete non-local data. The dashed curve was obtained by constraining only the value of the Bell expression of [3], the dotted curve by constraining only the value of the CHSH expression, the dashed-dotted curve by constraining the values of both expressions, and the solid curve by taking into account the values of all correlators in accordance with Eq. (10). These curves were obtained using the third order relaxation of the SDP hierarchy. The dashed-dotted curve is optimal up to the numerical precision used.
Figure 4: Local DIGP as a function of the parameter $\gamma$ defined in Eq. (12). The dashed curve is obtained by taking into account only the CGLMP value, and the solid one the complete behavior. Both curves were obtained using the second order relaxation of the SDP hierarchy, and the dashed one has been verified to be optimal up to the numerical precision used.


  1. We can always restrict to projectors by increasing the dimension of the Hilbert space. No loss of generality will be incurred by this, since we will be working in device-independent settings.
  2. The guessing probability, or equivalently the min-entropy, is an operational measure of randomness: if $\rho_{AE}$ is a cq-state with guessing probability $G$, then a randomness extractor can be used to map $A$ to a bit string of length roughly $-\log_2 G$ that is close to being uniformly random and uncorrelated to the adversary, i.e. the resulting state is close in trace-distance to the product of a maximally mixed state with Eve's state [27].
  3. The hierarchy as presented in [26, 30] applies to normalized behaviors, but it can be trivially adapted to unnormalized behaviors by removing the normalization constraint on the associated moment matrix in the notation of [26, 30].


  1. N. Brunner et al, arXiv:1303.2849.
  2. A. Valentini, Phys. Lett. A, 297, 273 (2002).
  3. A. Acín, S. Massar, and S. Pironio, Phys. Rev. Lett. 108, 100402 (2012).
  4. R. Colbeck, PhD dissertation, Univ. Cambridge (2007), arXiv:0911.3814; R. Colbeck and A. Kent, J. Phys. A 44, 095305 (2011).
  5. S. Pironio et al., Nature 464, 1021 (2010).
  6. S. Fehr, R. Gelles, and C. Schaffner, Phys. Rev. A 87, 012335 (2013).
  7. S. Pironio and S. Massar, Phys. Rev. A 87, 012336 (2013).
  8. U. Vazirani and T. Vidick, Phil. Trans. R. Soc. A 370, 3432 (2012).
  9. R. Colbeck and R. Renner, Nat. Phys. 8, 450 (2012).
  10. R. Gallego et al., arXiv:1210.6514.
  11. J. Barrett, L. Hardy, and A. Kent, Phys. Rev. Lett. 95, 010503 (2005).
  12. A. Acín et al., Phys. Rev. Lett. 98, 230501 (2007); S. Pironio et al., New J. Phys. 11, 045021 (2009).
  13. D. Mayers and A. Yao, Quantum Inf. Comput. 4, 273 (2004).
  14. Ll. Masanes, S. Pironio, and A. Acín, Nat. Commun. 2, 238 (2011).
  15. E. Hänggi and R. Renner, arXiv:1009.1833.
  16. U. Vazirani and T. Vidick, arXiv:1210.1810.
  17. S. Pironio et al., Phys. Rev. X 3, 031007 (2013).
  18. J.F. Clauser et al., Phys. Rev. Lett. 23, 880 (1969).
  19. S.L. Braunstein and C.M. Caves, Ann. Phys. 202, 22 (1990).
  20. J. Barrett, A. Kent, and S. Pironio, Phys. Rev. Lett. 97, 170409 (2007).
  21. N.D. Mermin, Phys. Rev. Lett. 65, 1838 (1990).
  22. P. Mironowicz and M. Pawlowski, Phys. Rev. A 88, 032319 (2013).
  23. M. Giustina et al., Nature 497, 227 (2013).
  24. B.G. Christensen et al., Phys. Rev. Lett. 111, 130406 (2013).
  25. P. Eberhard, Phys. Rev. A 47, 747 (1993).
  26. M. Navascués, S. Pironio, and A. Acín, Phys. Rev. Lett. 98, 010401 (2007).
  27. R. König, R. Renner, and C. Schaffner, IEEE Trans. Inf. Th. 55, 4337 (2009).
  28. S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
  29. J. Silman, S. Pironio, and S. Massar, Phys. Rev. Lett. 110, 100504 (2013).
  30. M. Navascués, S. Pironio, and A. Acín, New J. Phys. 10, 073013 (2008).
  31. J.W. Helton, Ann. Math. 56, 675 (2002).
  32. A.C. Doherty et al., in Proc. 23rd CCC, (CS Press, 2008), p. 199.
  33. S. Pironio, M. Navascués, and A. Acín, SIAM J. Optim. 20, 2157 (2010).
  34. D. Collins et al., Phys. Rev. Lett. 88, 040404 (2002).
  35. A. Acín et al., Phys. Rev. A 65, 052325 (2002).
  36. J. Barrett et al., Phys. Rev. A 71, 022101 (2005).
  37. J. D. Bancal, L. Sheridan, V. Scarani, arXiv:1309.3894.
  38. J. Löfberg, YALMIP: A Toolbox for Modeling and Optimization in MATLAB.
  39. J.F. Sturm and I. Polik, SeDuMi: a package for conic optimization.