Necessary and Sufficient Conditions for Extended Noncontextuality
in a Broad Class of Quantum Mechanical Systems
Abstract
The notion of (non)contextuality pertains to sets of properties measured one subset (context) at a time. We extend this notion to include so-called inconsistently connected systems, in which the measurements of a given property in different contexts may have different distributions, due to contextual biases in experimental design or physical interactions (signaling): a system of measurements has a maximally noncontextual description if a joint distribution can be imposed on them in which the measurements of any one property in different contexts are equal to each other with the maximal probability allowed by their different distributions. We derive necessary and sufficient conditions for the existence of such a description in a broad class of systems including Klyachko-Can-Binicioğlu-Shumovsky-type (KCBS), EPR-Bell-type, and Leggett-Garg-type systems. Because these conditions allow for inconsistent connectedness, they are applicable to real experiments. We illustrate this by analyzing an experiment by Lapkiewicz and colleagues aimed at testing contextuality in a KCBS-type system.
Keywords: CHSH inequalities; contextuality; criterion of contextuality; Klyachko-Can-Binicioğlu-Shumovsky inequalities; Leggett-Garg inequalities; measurement bias; measurement errors; probabilistic couplings; signaling.
The notion of (non)contextuality in Quantum Mechanics (QM) relates the outcome of a measurement of a physical property to the choice of properties co-measured with it KochenSpecker1967 (). The set of co-measured properties forms a measurement context for each of its members. The traditional understanding of a contextual QM system is that if the measurement of each property $q$ in it is represented by a random variable $R_q$, then the random variables representing all properties in the system do not have a joint distribution.
We use here a different formulation, which, although formally equivalent, lends itself to more productive development Larsson2002 (); DK2013PLOS (); DK2014FooPh (); DK2014PLOS (); DK2015PhSc (); DK2013LNCS (). We label all measurements contextually: this means that a property $q$ is represented by different random variables $R_q^c$ depending on the context $c$. We say that the system has a noncontextual description if there exists a joint distribution of these random variables in which any two of them, $R_q^c$ and $R_q^{c'}$, representing the same property $q$ in different contexts $c, c'$, are equal with probability 1. If no such description exists we say that the system is contextual. Note that the existence of a joint distribution of several random variables is equivalent to the possibility of presenting them as functions of a single, “hidden” variable SuppesZanotti1981 (); Fine1982 (); Larsson2002 (); DK2010 ().
This formulation applies to systems in which the random variables representing a given property in different contexts always have the same distribution. We call such systems consistently connected, because we call the set of all such variables $R_q^c$ for a given $q$ a connection. If the properties forming any given context are spacetime separated, consistent connectedness coincides with the no-signaling condition PopescuRohrlich (). The central aim of this paper is to extend the notion of contextuality to the cases of inconsistent connectedness, where the measurements of a given property may have different distributions in different contexts. This may happen due to a contextually biased measurement design or due to physical influences exerted on $R_q^c$ by elements of context $c$ other than $q$.
The criterion of (necessary and sufficient conditions for) contextuality we derive below is formulated for inconsistently connected systems, treating consistent connectedness as a special case. This makes it applicable to real experimental data. For example, the experiment in Ref. Lapkiewicz2011 () testing the Klyachko-Can-Binicioğlu-Shumovsky (KCBS) inequality Klyachko2008 () exhibits inconsistent connectedness, necessitating a sophisticated workaround to establish contextuality (see Refs. Lapkiewicz2013 (); Ahrens2013 ()). Below, we apply our extended notion to the same data to establish contextuality directly, with no workarounds. Another example is Leggett-Garg (LG) systems LeggettGarg1985 (), where our approach allows for the possibility that later measurements may be affected by previous settings (“signaling in time,” Bacciagaluppi (); Kofler2013 ()). Finally, in EPR-Bell-type systems Bell1964 (); Bell1966 () our approach allows for the possibility that Alice’s measurements are affected by Bob’s settings BaconToner 2003 () when they are timelike separated; and even with spacelike separation, the same effect can be caused by systematic errors AdenierKhrennikov 2007 ().
Earlier treatments.— In the Kochen-Specker theorem KochenSpecker1967 () or its variants Peres1995 (); Cabello1996 (), contexts are chosen so that each property enters in more than one context, and in each context, according to QM, one and only one of the measurements has a nonzero value. The proof of contextuality, using our language, consists in showing that the variables cannot be jointly assigned values consistent with this constraint so that all the variables representing the same property are assigned the same value. An experimental test of contextuality here consists in simply showing that the observables it specifies can be measured in the contexts it specifies, and that the QM constraint in question is satisfied.
There has been recent work translating the value-assignment proofs into probabilistic inequalities (sometimes called Kochen-Specker inequalities) giving necessary conditions for noncontextuality Simon2001 (); Larsson2002 (). Inequalities that do not use value-assignment restrictions but only the assumption of noncontextuality are known as noncontextuality inequalities Cabello2008 (); Klyachko2008 (); Yu2012 (). Bell inequalities Bell1964 (); Bell1966 (); CHSH (); CH1974 (); Fine1982 () and LG inequalities SuppesZanotti1981 (); LeggettGarg1985 () are also established through noncontextuality Mermin1993 (), motivated by specific physical considerations (locality and noninvasive measurement, resp.).
An extension of the notion of (non)contextuality that allows for inconsistent connectedness was suggested in Refs. Larsson2002 (); Winter2014 (). However, the error probability proposed in those papers as a measure of context-dependent change in a random variable cannot be measured experimentally. The suggestion in both Refs. Larsson2002 (); Winter2014 () is to estimate the accuracy of the measurement and from that argue for a particular value of the error probability. For example, Ref. Winter2014 () uses the quantum description of the system for the estimate (quantum tomography), but there is no clear reason why or how the quantum error model would be related to that of the proposed noncontextual description. A noncontextuality test should not mix the two descriptions, as it attempts to show their fundamental differences.
In this paper we generalize the definition of contextuality in a different manner, to allow for inconsistent connectedness while only using directly measurable quantities. We derive a criterion of (non)contextuality for a broad class of systems that includes as special cases the systems intensively studied in the recent literature on contextuality: KCBS, EPR-Bell, and LG systems Klyachko2008 (); Cabello2013 (); Cabello2014 (), with their inconsistently connected versions DK_Bell_LG_K (); DK_Bell_LG ().
Basic Concepts and Definitions.— We begin by formalizing the notation and terminology. Consider a finite set $Q$ of distinct physical properties $q$. These properties are measured in subsets of $Q$ called contexts, $c \subset Q$. Let $C$ denote the set of all contexts, and $C_q$ the set of all contexts containing a given property $q$.
The result of measuring property $q$ in context $c$ is a random variable $R_q^c$. The result of jointly measuring all properties within a given context $c \in C$ is a set of jointly distributed random variables $R^c = \{R_q^c : q \in c\}$.
No two random variables in different contexts, $R_q^c$, $R_{q'}^{c'}$, $c \neq c'$, are jointly distributed; they are stochastically unrelated DK2013LNCS (); DK2015PhSc (). The set of random variables representing the same property $q$ in different contexts, $R_q = \{R_q^c : c \in C_q\}$, is called a connection (for $q$). So the elements of a connection are pairwise stochastically unrelated. If all random variables within each connection are identically distributed, the system is called consistently connected; if it is not necessarily so, it is inconsistently connected. Consistent connectedness is also known in QM as the Gleason property CabelloSeverini2014 (), outside physics as marginal selectivity DK2013LNCS (), and Ref. Cereceda2000 () lists some dozen names for the same notion; a recent addition to the list is the no-disturbance principle Ramanathan2012 (); Kurzynski2014 ().
The set of all properties together with the set of all contexts and the set of all sets of random variables representing contexts is referred to as a system. In the systems we consider here the set of properties is finite (whence the set of contexts is finite too), and each random variable has a finite number of possible values (e.g., spin measurement outcomes).
We introduce next the notion of a (probabilistic) coupling of all the random variables in our system Thor (). Intuitively, this is simply a joint distribution imposed, or “forced” on all of them (recall that they include stochastically unrelated variables from different contexts). Formally, a coupling of $\{R^c : c \in C\}$ is any jointly distributed set of random variables $S = \{S^c : c \in C\}$ such that, for every $c \in C$, $S^c \sim R^c$, where $\sim$ stands for “has the same (joint) distribution as.” One can also speak of a coupling for any subset of these random variables. Thus, fixing a property $q$, a coupling of a connection $R_q$ is any jointly distributed $S_q = \{S_q^c : c \in C_q\}$ such that $S_q^c \sim R_q^c$ for all contexts $c \in C_q$. Note that if $S$ is a coupling of all $R^c$, then every marginal (jointly distributed subset) $S_q$ of $S$ is a coupling of the corresponding connection $R_q$.
Expressed in this language, the traditional approach is to consider a system noncontextual if there is a coupling $S$ of the random variables $R_q^c$, such that for every property $q$ the random variables in $S_q$ are equal to each other with probability 1. That is, for every possible coupling $S$ of the random variables $R_q^c$ and every property $q$ we consider the marginal $S_q$ corresponding to a connection $R_q$, and we compute
$$\Pr\left[S_q^c = S_q^{c'} \text{ for all } c, c' \in C_q\right]. \qquad (1)$$
If there exists a coupling $S$ for which this probability equals 1 for all $q$, this provides a noncontextual description for our system. Otherwise, if in every possible coupling $S$ the probability in question is less than 1 for some properties $q$, the system is considered contextual.
This understanding, however, only involves consistently connected systems. As mentioned in the introduction, a system may be inconsistently connected due to systematic biases or interactions (such as “signaling in time” in LG systems). If for some $q$ and some contexts $c, c' \in C_q$ the distributions of $R_q^c$ and $R_q^{c'}$ are not the same, then $\Pr[S_q^c = S_q^{c'}]$ cannot equal 1 in any coupling $S$. There would be nothing wrong if one chose to say that any such inconsistently connected system is therefore contextual, but contextuality due to systematic measurement errors or signaling is clearly a special, trivial kind of contextuality. One should be interested in whether the system exhibits any contextuality that is not reducible to (or explainable by) the factors that make distributions of random variables within a connection different. For systems in general we therefore propose a different definition.
Definition 1.
A system has a maximally noncontextual description if there is a coupling $S$ of the random variables $R_q^c$, such that for any $q$ the random variables in $S_q$ are equal to each other with the maximum probability allowed by the individual distributions of $R_q^c$.
To explain, consider a connection $R_q$ in isolation, and let $S_q$ be its coupling. Among all such couplings there must be maximal ones, those in which the probability that all variables in $S_q$ are equal to each other is the maximal possible, given the distributions of $R_q^c$. If a connection $R_q$ consists of two dichotomic ($\pm1$) variables $R_q^1$ and $R_q^2$, and $(S_q^1, S_q^2)$ is its coupling (i.e., $S_q^1, S_q^2$ are jointly distributed with $S_q^1 \sim R_q^1$, $S_q^2 \sim R_q^2$), then by Lemma A.3 in Supplementary Material, the maximal possible expectation $\langle S_q^1 S_q^2\rangle$ is $1 - |\langle R_q^1\rangle - \langle R_q^2\rangle|$; a coupling with this expectation is maximal. Now take every possible coupling $S$ of all our random variables $R_q^c$, consider the marginals $S_q$ corresponding to connections $R_q$, and for each of these marginals compute the probability (1). If there is a coupling $S$ in which this probability equals its maximal possible value for every $q$, this provides a maximally noncontextual description for our system. For consistently connected systems Definition 1 reduces to the traditional understanding: the maximal probability with which all variables in $S_q$ can be equal to each other is 1 if all variables in $R_q$ are identically distributed.
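The maximal coupling of a two-element connection can be written down explicitly. The following is a minimal sketch, not the authors' construction: it assumes two $\pm1$-valued variables specified only by their means, and the function names `maximal_coupling` and `product_expectation` are hypothetical. It puts as much probability mass as the marginals allow on the two "equal" cells and checks that the resulting expectation of the product is $1 - |\langle A\rangle - \langle B\rangle|$.

```python
def maximal_coupling(mean_a, mean_b):
    """Joint pmf of two +/-1-valued variables (A, B) with the given means,
    chosen so that Pr[A = B] is as large as the marginals allow.
    Illustrative helper, not taken from the paper."""
    pa = (1 + mean_a) / 2          # Pr[A = +1]
    pb = (1 + mean_b) / 2          # Pr[B = +1]
    p_pp = min(pa, pb)             # largest possible Pr[A = +1, B = +1]
    p_mm = min(1 - pa, 1 - pb)     # largest possible Pr[A = -1, B = -1]
    p_pm = pa - p_pp               # remaining mass of A = +1
    p_mp = pb - p_pp               # remaining mass of B = +1
    return {(1, 1): p_pp, (1, -1): p_pm, (-1, 1): p_mp, (-1, -1): p_mm}


def product_expectation(joint):
    """Expectation <AB> under a joint pmf keyed by (a, b)."""
    return sum(a * b * p for (a, b), p in joint.items())


joint = maximal_coupling(0.4, 0.0)
# For these means the bound 1 - |0.4 - 0.0| = 0.6 is attained.
print(product_expectation(joint))
```

When the two means coincide, the off-diagonal cells vanish and the variables are equal with probability 1, which is the consistently connected case.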
Cyclic systems of dichotomic random variables.— We focus now on systems in which: (S1) each context consists of precisely two distinct properties; (S2) each property belongs to precisely two distinct contexts; and (S3) each random variable representing a property is dichotomic ($\pm1$). As shown in Lemma A.1 (Supplementary Material), a set of properties satisfying S1–S2 can be arranged into one or more distinct cycles $q_1 \to q_2 \to \ldots \to q_k \to q_1$, in which any two successive properties form a context. Without loss of generality we will assume that we deal with a single-cycle arrangement of all the properties $q_1, \ldots, q_n$. The number $n$ is referred to as the rank of the system.
A schematic representation of a cyclic system is shown in Figure 1, where $V_i$ and $W_i$ denote the measurements of $q_i$ in the contexts it shares with $q_{i\oplus1}$ and $q_{i\ominus1}$, respectively ($\oplus$ and $\ominus$ being cyclic addition and subtraction on $1, \ldots, n$). The LG paradigm exemplifies a cyclic system of rank $n=3$, on labeling the observables $q_1, q_2, q_3$ measured chronologically. The contexts here are represented by, respectively, pairs $(V_1, W_2)$, $(V_2, W_3)$, $(V_3, W_1)$ with observed joint distributions, whereas $(V_1, W_1)$, $(V_2, W_2)$, $(V_3, W_3)$ are connections for $q_1, q_2, q_3$, respectively. The EPR-Bell paradigm exemplifies a cyclic system of rank $n=4$, on labeling the observables $q_1, q_3$ for Alice and $q_2, q_4$ for Bob. Cyclic systems of rank $n=5$ are exemplified by the KCBS paradigm, on labeling the vertices of the KCBS pentagram by $q_1, \ldots, q_5$.
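The bookkeeping of a cyclic system can be sketched in a few lines. This is an illustration only: it assumes the labeling in which property $q_i$ is recorded as `Vi` in the context it shares with the next property in the cycle and as `Wi` in the context it shares with the previous one, and the function name `cyclic_system` is hypothetical.

```python
def cyclic_system(n):
    """List the contexts and connections of a rank-n cyclic system.

    Contexts are pairs of variables measured together (joint distribution
    observed); connections are pairs representing the same property in its
    two contexts (no observed joint distribution)."""
    # i % n + 1 is the cyclic successor of i on the labels 1..n
    contexts = [(f"V{i}", f"W{i % n + 1}") for i in range(1, n + 1)]
    connections = [(f"V{i}", f"W{i}") for i in range(1, n + 1)]
    return contexts, connections


contexts, connections = cyclic_system(3)   # Leggett-Garg-type rank
print(contexts)      # [('V1', 'W2'), ('V2', 'W3'), ('V3', 'W1')]
print(connections)   # [('V1', 'W1'), ('V2', 'W2'), ('V3', 'W3')]
```

Each variable appears in exactly one context and one connection, so contexts and connections alternate around a single cycle of $2n$ variables.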
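(Non)Contextuality Criterion.— For any $n$ and any $x_1, \ldots, x_n \in \mathbb{R}$, we define the function
$$s_1(x_1, \ldots, x_n) = \max_{\iota_1, \ldots, \iota_n \in \{-1, 1\},\ \prod_k \iota_k = -1}\ \sum_{k=1}^{n} \iota_k x_k. \qquad (2)$$
The maximum is taken over all combinations of $\pm1$ coefficients containing odd numbers of $-1$’s. The following is our main theorem.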
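The function in (2), the maximum of $\sum_k \pm x_k$ over sign combinations with an odd number of minus signs, is easy to evaluate. The sketch below is illustrative; the names `s1` and `s1_closed_form` are assumptions. It gives a brute-force version that follows the definition literally and an equivalent closed form: start from $\sum_k |x_k|$ and, if the number of strictly negative arguments is even, give up twice the smallest absolute value to restore odd parity.

```python
from itertools import product


def s1(xs):
    """Brute force: max of sum(sign_k * x_k) over +/-1 sign patterns
    containing an odd number of -1's. Requires a non-empty sequence."""
    best = None
    for signs in product((-1, 1), repeat=len(xs)):
        if signs.count(-1) % 2 == 1:
            total = sum(s * x for s, x in zip(signs, xs))
            best = total if best is None else max(best, total)
    return best


def s1_closed_form(xs):
    """Same value in linear time."""
    total = sum(abs(x) for x in xs)
    negatives = sum(1 for x in xs if x < 0)
    if negatives % 2 == 1:
        return total                      # optimal signs already have odd parity
    return total - 2 * min(abs(x) for x in xs)   # flip the cheapest term


print(s1([1, 1, 1]))                      # 1
print(s1_closed_form([0.5, -0.2, 0.7, 0.1]))
```

The brute-force version costs $2^n$ evaluations, which is negligible for the ranks discussed here; the closed form is convenient when the function is called inside a loop.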
Theorem 2.
A cyclic system of rank $n$ with dichotomic random variables (see Figure 1) has a maximally noncontextual description if and only if
$$s_1\left(\langle V_i W_{i\oplus1}\rangle,\ 1 - \left|\langle V_i\rangle - \langle W_i\rangle\right| : i = 1, \ldots, n\right) \le 2n - 2 \qquad (3)$$
($s_1$ here having $2n$ arguments, each entry being taken with $i = 1, \ldots, n$).
See Supplementary Material for the proof. In (3), $\langle V_i W_{i\oplus1}\rangle$ are the quantum correlations observed within contexts, whereas $1 - |\langle V_i\rangle - \langle W_i\rangle|$ are the maximal values for the unobservable correlations $\langle V_i W_i\rangle$ within the couplings of connections. If the system is consistently connected, i.e., $\langle V_i\rangle = \langle W_i\rangle$ for $i = 1, \ldots, n$, then these maximal values equal 1. By Corollary A.10, the criterion (3) then reduces to the formula
$$s_1\left(\langle V_i W_{i\oplus1}\rangle : i = 1, \ldots, n\right) \le n - 2, \qquad (4)$$
well-known for $n=3$ (the LG inequality in the form derived in Ref. SuppesZanotti1981 ()) and for $n=4$ (CHSH inequalities CHSH ()). For $n=5$, (4) contains the KCBS inequality (which by Corollary A.11 is not only necessary but also sufficient for the existence of a maximally noncontextual description). Finally, for any even $n$, inequality (4) contains the chained Bell inequalities studied in Refs. Pearle1970 (); BraunstenCaves1990 (). It is known that for $n \geq 6$ the chained Bell inequalities are not criteria, the latter requiring many more inequalities WW2001a (); WW2001b (); BasoaltoPercival2003 (); DK2012 ().
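The criterion can be checked numerically. The sketch below is an illustration under stated assumptions: it reads criterion (3) as "the odd-sign maximum over the $n$ within-context correlations together with the $n$ quantities $1 - |\langle V_i\rangle - \langle W_i\rangle|$ must not exceed $2n-2$", with $V_i$, $W_i$ denoting the two measurements of the $i$-th property. The function names and all numerical inputs are hypothetical and are not experimental data.

```python
import math


def s1(xs):
    """Max of sum(+/- x_k) over sign patterns with an odd number of minus signs."""
    total = sum(abs(x) for x in xs)
    negatives = sum(1 for x in xs if x < 0)
    return total if negatives % 2 == 1 else total - 2 * min(abs(x) for x in xs)


def has_maximally_noncontextual_description(corr, mean_v, mean_w, tol=1e-12):
    """corr[i] is the correlation observed in the i-th context;
    mean_v[i], mean_w[i] are the means of the two measurements of property i."""
    n = len(corr)
    # Largest correlation each connection admits in any coupling
    conn_max = [1 - abs(v - w) for v, w in zip(mean_v, mean_w)]
    return s1(list(corr) + conn_max) <= 2 * n - 2 + tol


r = 1 / math.sqrt(2)
zero = [0.0] * 4
# Rank 4, consistently connected, correlations at the Tsirelson values:
print(has_maximally_noncontextual_description([r, r, r, -r], zero, zero))          # False
# Same correlations, but each connection differs in mean by 0.25 (hypothetical):
print(has_maximally_noncontextual_description([r, r, r, -r], [0.25] * 4, zero))    # True
```

In the consistently connected case the $n$ connection terms all equal 1 and, for correlations in $[-1, 1]$, the left-hand side equals $n$ plus the odd-sign maximum of the correlations alone, which is how (3) collapses to (4). The second call shows the point of the extended definition: a violation of (4) does not establish contextuality if the differences between the connected means are large enough to account for it.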
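Generally, some of the terms $|\langle V_i\rangle - \langle W_i\rangle|$ in (3) may be nonzero. Thus, in an LG system ($n=3$), if inconsistency is due to “signaling in time” Bacciagaluppi (); Kofler2013 (), these may include $|\langle V_2\rangle - \langle W_2\rangle|$ and $|\langle V_3\rangle - \langle W_3\rangle|$ but not $|\langle V_1\rangle - \langle W_1\rangle|$, because $q_1$ cannot be influenced by later events. However, $|\langle V_1\rangle - \langle W_1\rangle|$ may be nonzero due to contextual biases in design, if something in the procedure of measuring $q_1$ is different depending on whether the next measurement is going to be of $q_2$ or $q_3$.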
An application to experimental data.— To illustrate the applicability of our theory to real experiments, consider the data from the KCBS experiment of Ref. Lapkiewicz2011 (). The experiment uses a single photon in a quantum overlap of three optical modes (paths) as an indivisible quantum system. Readout is performed through single-photon detectors that terminate the three paths. Context is chosen through “activation” of transformations, by rotating a waveplate that precedes each beamsplitter to change the behavior of two out of three paths. Each transformation leaves one path untouched, which serves as justification for consistent connectedness of the corresponding measurements, so that the target inequality is (4) for $n=5$.
Two of the outputs representing the same property are recorded in different experimental setups with zero or four polarizing beamsplitters “activated”. These outputs have significantly different distributions: taking the values in Table 1 of Ref. Lapkiewicz2011 () as means and standard errors of 20 replications, the standard test is significant at 0.1%. Lapkiewicz et al. deal with this by introducing in (4) a correction term. They estimate it by identifying one of these outputs with an output measured in a separate context and in a special manner: instead of photon detections it is measured by blocking two paths early in the setup. While this results in a well-motivated experimental test, this identification involves additional assumptions Ahrens2013 (); Lapkiewicz2013 (). Furthermore, Lapkiewicz et al. have to discount the fact that the assumption of consistent connectedness can also be challenged for other properties: the same test as above for another pair of outputs is significant at 1%. We see that the traditional approach adopted in Ref. Lapkiewicz2011 () encounters considerable experimental and analytic difficulties due to the necessity of avoiding inconsistent connectedness.
Our theory allows one to analyze the data directly as found in the measurement record. It is convenient to do this by using the inequality
$$s_1\left(\langle V_i W_{i\oplus1}\rangle : i = 1, \ldots, n\right) - \sum_{i=1}^{n} \left|\langle V_i\rangle - \langle W_i\rangle\right| \le n - 2, \qquad (5)$$
which, by Corollary A.9, follows from the criterion (3) remark_conjecture (). One way of using it is to construct a conservative confidence interval, at a chosen confidence level, for the left-hand side of (5) with $n=5$ and show that its lower endpoint exceeds $n-2=3$. One can, e.g., construct 10 Bonferroni confidence intervals, one for each of the approximately normally distributed terms $\langle V_i W_{i\oplus1}\rangle$ and $\langle V_i\rangle - \langle W_i\rangle$ ($i = 1, \ldots, 5$), with respective error terms read or computed from Table 1 of Ref. Lapkiewicz2011 (), and then determine the range of (5). Treating each estimated term as the mean of 20 observations, one obtains the corresponding critical value and hence a conservative confidence interval for each term. Using these intervals, we can calculate the conservative confidence interval for (5) as
(6) 
The system is contextual. The conclusion is the same as in Ref. Lapkiewicz2011 (), but we arrive at it by a shorter and more robust route.
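The last step of this procedure, turning per-term intervals into a bound on the left-hand side of (5), can be sketched as follows. This is an illustration, not the authors' computation: it assumes the left-hand side of (5) is the odd-sign maximum over the five context correlations minus the sum of absolute differences of the connected means, it takes the per-term intervals as given (the Bonferroni critical value is not computed here), and the intervals in the example are made-up numbers, not the data of Ref. Lapkiewicz2011 (). The function name is hypothetical. For each admissible sign pattern the signed sum is minimized over the box of intervals, and the best such guarantee is kept; this never overstates the true minimum.

```python
from itertools import product


def lower_bound_lhs5(corr_intervals, diff_intervals):
    """Conservative lower bound on
        s1(correlations) - sum(|mean differences|)
    when each term is only known to lie in an interval (lo, hi)."""
    n = len(corr_intervals)
    best = -float("inf")
    for signs in product((-1, 1), repeat=n):
        if signs.count(-1) % 2 == 1:          # odd number of minus signs
            # Worst case of this signed sum over the box of intervals
            worst = sum(min(s * lo, s * hi)
                        for s, (lo, hi) in zip(signs, corr_intervals))
            best = max(best, worst)
    # Largest value the penalty term can take over its intervals
    max_penalty = sum(max(abs(lo), abs(hi)) for lo, hi in diff_intervals)
    return best - max_penalty


# Hypothetical rank-5 intervals, for illustration only
corr = [(-0.85, -0.75)] * 5
diff = [(-0.02, 0.03)] * 5
bound = lower_bound_lhs5(corr, diff)
print(bound, bound > 5 - 2)
```

If the returned bound exceeds $n-2$, every point in the box violates (5), so under the stated confidence level the system has no maximally noncontextual description.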
Conclusion.— We have derived a criterion of (non)contextuality applicable to cyclic systems of arbitrary ranks. Even for consistently connected systems this criterion has not been previously known for ranks $n \geq 5$ (KCBS and higher-rank systems). However, it is the inclusion of inconsistently connected systems that is of special interest, because it makes the theory applicable to real experiments. A “system” is not just a system of properties being measured, but also a system of measurement procedures being used, with possible contextual biases and unaccounted-for interactions. Our analysis opens the possibility of studying contextuality without attempting to eliminate these first, whether by statistical analysis or by improved experimental procedure.
Acknowledgements.
This work is supported by NSF grant SES-1155956, AFOSR grant FA9550-14-1-0318, A. von Humboldt Foundation, and FQXi through Silicon Valley Community Foundation. We thank J. Acacio de Barros, Gary Oas, Samson Abramsky, Guido Bacciagaluppi, Adán Cabello, Andrei Khrennikov, and Lasse Leskelä for numerous discussions.
References
 (1) S. Kochen and E. P. Specker. The problem of hidden variables in quantum mechanics. Journal of Mathematics and Mechanics, 17:59–87, 1967.
 (2) P. Suppes and M. Zanotti. When are probabilistic explanations possible? Synthese 48:191–199, 1981.
 (3) A. Fine. Hidden variables, joint probability, and the Bell inequalities. Physical Review Letters 48:291–295, 1982.
 (4) P. Kurzynski, R. Ramanathan, and D. Kaszlikowski. Entropic test of quantum contextuality. Physical Review Letters 109:020404, 2012.
 (5) J.Å. Larsson. A Kochen-Specker inequality. Europhysics Letters, 58(6):799–805, 2002.
 (6) E.N. Dzhafarov and J.V. Kujala. All-possible-couplings approach to measuring probabilistic context. PLoS ONE 8(5):e61712. doi:10.1371/journal.pone.0061712, 2013.
 (7) E.N. Dzhafarov and J.V. Kujala. No-Forcing and No-Matching theorems for classical probability applied to quantum mechanics. Foundations of Physics 44:248–265, 2014.
 (8) E.N. Dzhafarov and J.V. Kujala. Embedding quantum into classical: contextualization vs conditionalization. PLoS ONE 9(3):e92818. doi:10.1371/journal.pone.0092818, 2014.
 (9) E.N. Dzhafarov and J.V. Kujala. A qualified Kolmogorovian account of probabilistic contextuality. Lecture Notes in Computer Science 8369:201–212, 2014.
 (10) E.N. Dzhafarov and J.V. Kujala. Contextuality is about identity of random variables. Physica Scripta T163, 014009, 2014 (available as arXiv:1405.2116).
 (11) E.N. Dzhafarov and J.V. Kujala. The Joint Distribution Criterion and the Distance Tests for selective probabilistic causality. Frontiers in Psychology 1:151 doi:10.3389/fpsyg.2010.00151, 2010.
 (12) S. Popescu and D. Rohrlich. Quantum nonlocality as an axiom. Foundations of Physics 24:379–385, 1994.
 (13) J. Bell. On the Einstein-Podolsky-Rosen paradox. Physics 1:195–200, 1964.
 (14) J. Bell. On the problem of hidden variables in quantum mechanics. Reviews of Modern Physics 38:447–453, 1966.
 (15) D. Bacon and B. F. Toner. Bell Inequalities with auxiliary communication. Physical Review Letters 90:157904, 2003.
 (16) G. Adenier and A. Yu. Khrennikov. Is the fair sampling assumption supported by EPR experiments? Journal of Physics B: Atomic, Molecular and Optical Physics 40:131, 2007.
 (17) A.J. Leggett and A. Garg. Quantum mechanics versus macroscopic realism: Is the flux there when nobody looks? Physical Review Letters, 54:857–860, 1985.
 (18) J. Kofler and Č. Brukner. Condition for macroscopic realism beyond the Leggett-Garg inequalities. Physical Review A 87:052115, 2013.
 (19) G. Bacciagaluppi. Leggett-Garg inequalities, pilot waves and contextuality. International Journal of Quantum Foundations 1, 1–17, 2015.
 (20) R. Lapkiewicz, P. Li, C. Schaeff, N. K. Langford, S. Ramelow, M. Wieśniak, and A. Zeilinger. Experimental nonclassicality of an indivisible quantum system. Nature 474: 490–93, 2011.
 (21) A.A. Klyachko, M.A. Can, S. Binicioğlu, and A.S. Shumovsky. Simple test for hidden variables in spin-1 systems. Physical Review Letters, 101(2):020403, 2008.
 (22) J. Ahrens, E. Amselem, A. Cabello, and M. Bourennane. Two fundamental experimental tests of nonclassicality with qutrits. Scientific Reports 3, 2013.
 (23) R. Lapkiewicz, P. Li, C. Schaeff, N. K. Langford, S. Ramelow, M. Wieśniak, and A. Zeilinger. Comment on “Two Fundamental Experimental Tests of Nonclassicality with Qutrits”. arXiv:1305.5529, 2013.
 (24) A. Peres. Quantum Theory: Concepts and Methods, Dordrecht: Kluwer, 1995.
 (25) A. Cabello, J. Estebaranz, and G. García-Alcaine. Bell-Kochen-Specker theorem: A proof with 18 vectors. Physics Letters A 212:183–87, 1996.
 (26) C. Simon, Č. Brukner, and A. Zeilinger. Hidden-variable theorems for real experiments. Physical Review Letters, 86(20):4427–4430, 2001.
 (27) A. Cabello. Experimentally testable state-independent quantum contextuality. Physical Review Letters, 101(21):210401, 2008.
 (28) S. Yu and C.H. Oh. State-independent proof of Kochen-Specker theorem with 13 rays. Physical Review Letters, 108(3):030402, 2012.
 (29) J.F. Clauser, M.A. Horne, A. Shimony, and R.A. Holt. Proposed experiment to test local hiddenvariable theories. Physical Review Letters 23:880–884, 1969.
 (30) J.F. Clauser and M.A. Horne. Experimental consequences of objective local theories. Physical Review D 10:526–535, 1974.
 (31) N. D. Mermin. Hidden variables and the two theorems of John Bell. Reviews of Modern Physics 65:803–815, 1993.
 (32) A. Winter. What does an experimental test of quantum contextuality prove or disprove? Journal of Physics A: Mathematical and Theoretical, 47(42):424031, 2014.
 (33) A. Cabello. Simple explanation of the quantum violation of a fundamental inequality. Physical Review Letters, 110:060402, 2013.
 (34) A. Cabello, S. Severini, and A. Winter. Graph-theoretic approach to quantum correlations. Physical Review Letters, 112:040401, 2014.
 (35) E.N. Dzhafarov and J.V. Kujala. Generalizing Bell-type and Leggett-Garg-type inequalities to systems with signaling. arXiv:1407.2886, 2014.
 (36) E.N. Dzhafarov, J.V. Kujala, and J.Å. Larsson. Contextuality in three types of quantum-mechanical systems. Foundations of Physics 2015, DOI 10.1007/s10701-015-9882-9.
 (37) A. Cabello, S. Severini, and A. Winter. (Non)Contextuality of physical theories as an axiom. Physical Review Letters 112:040401, 2014.
 (38) J. Cereceda. Quantum mechanical probabilities and general probabilistic constraints for Einstein–Podolsky–Rosen–Bohm experiments. Foundations of Physics Letters 13: 427–442, 2000.
 (39) R. Ramanathan, A. Soeda, P. Kurzynski, and D. Kaszlikowski. Physical Review Letters 109:050404, 2012.
 (40) P. Kurzynski, A. Cabello, and D. Kaszlikowski. Fundamental monogamy relation between contextuality and nonlocality. Physical Review Letters 112:100401, 2014.
 (41) H. Thorisson. Coupling, Stationarity, and Regeneration. New York: Springer, 2000.
 (42) This formula is in fact equivalent to (3), as conjectured in Ref. DK_Bell_LG_K () and proved in Ref. conjecture ().
 (43) P. Pearle. Hidden-variable example based upon data rejection. Physical Review D 2, 1418–1425, 1970.
 (44) S. L. Braunstein and C. M. Caves. Wringing out better Bell inequalities. Annals of Physics 202, 22–56, 1990.
 (45) R. F. Werner and M. M. Wolf. All-multipartite Bell-correlation inequalities for two dichotomic observables per site. Physical Review A 64, 032112, 2001.
 (46) R. F. Werner and M. M. Wolf. Bell inequalities and entanglement. Quantum Information and Computation 1, 1–25, 2001.
 (47) R. M. Basoalto and I. C. Percival. BellTest and CHSH experiments with more than two settings. Journal of Physics A: Mathematical & General 36, 7411–7423, 2003.
 (48) E.N. Dzhafarov and J.V. Kujala. Selectivity in probabilistic causality: Where psychology runs into quantum physics. Journal of Mathematical Psychology 56, 54–63, 2012.
 (49) J.V. Kujala and E.N. Dzhafarov. Proof of a conjecture on contextuality in cyclic systems with binary variables. arXiv:1503.02181.
Supplementary Material to
“Necessary and Sufficient Conditions for Extended Noncontextuality
in a Broad Class of Quantum Mechanical Systems.” Proof of the main
criterion and its consequences
The (non)contextuality criterion derived in the main text is a corollary to Theorem A.8 proved below. We first need the following simple result (see properties S1 and S2 formulated in section Cyclic systems of dichotomic random variables):
Lemma A.1.
In a system satisfying S1–S2, the physical properties can be (re)indexed and arranged in one or more non-overlapping cycles
(A.1) 
with and (), such that any two successive properties in each cycle form a context.
Proof.
Apparent from Figure A.1. ∎
Our proof of Theorem A.8 uses the fact that the connections and context representations enter a circular system symmetrically, so that it is possible to view circular systems as a circular arrangement of random variables in which any two successive variables have a joint distribution (see Figure A.2).
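We need some auxiliary results. In addition to $s_1$ defined in the main text, we use the function
$$s_0(x_1, \ldots, x_n) = \max_{\iota_1, \ldots, \iota_n \in \{-1, 1\},\ \prod_k \iota_k = 1}\ \sum_{k=1}^{n} \iota_k x_k, \qquad (A.2)$$
in which the maximum is taken over all combinations of $\pm1$ coefficients containing even numbers of $-1$’s.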
Lemma A.2.
For any ,
(A.3) 
and
(A.4) 
The proof is obvious.
Lemma A.3.
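Jointly distributed $\pm1$-valued random variables $A$ and $B$ with given expectations $\langle A\rangle$, $\langle B\rangle$, $\langle AB\rangle$ exist if and only if
$$-1 \le \langle A\rangle \le 1, \quad -1 \le \langle B\rangle \le 1, \quad \left|\langle A\rangle + \langle B\rangle\right| - 1 \le \langle AB\rangle \le 1 - \left|\langle A\rangle - \langle B\rangle\right|. \qquad (A.5)$$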
Proof.
For jointly distributed , from the table of probabilities

Lemma A.4.
Jointly distributed valued random variables , , and with given expectations , , , , exist if and only if these expectations satisfy Lemma A.3 and
(A.7) 
Proof.
, , and satisfying Lemma A.3 uniquely determine ; and analogously for and . A joint distribution of is determined by 8 probabilities , . It has the given expectations if and only if the probabilities satisfy 7 equations
The statement of the lemma obtains by any algorithm (facet enumeration and reduction) analogous to that described in Text S3 of Ref. DK2013PLOS1 ().∎
Remark A.5.
One can also obtain the proof by using Fine’s theorem Fine19821 (), presenting it as (using Fine’s notation for the random variables)
and then putting , , and .
Lemma A.6.
Jointly distributed arbitrary random variables with given 2marginal distributions of and exist if and only if these 2marginals agree for the distribution of .
Proof.
The necessity is obvious. The sufficiency obtains by the Markov rule
for any possible values of, respectively, . ∎
Corollary A.7 (to Lemma A.6).
Jointly distributed valued random variables with given expectations , exist if and only if these expectations satisfy Lemma A.3.
Theorem A.8.
Jointly distributed valued random variables () with given expectations
(A.8) 
exist if and only if these expectations satisfy Lemma A.3 and
(A.9) 
Proof.
For the statement follows from Lemma A.4.
Assume that the statement holds up to and including some . We will prove that
(i) jointly distributed valued random variables with given expectations
exist if and only if
(ii) these expectations satisfy Lemma A.3 and
(iii) they satisfy
Since Statement (ii) is an obvious consequence of Statement (i), we only need to prove that if Statement (ii) is satisfied, then Statements (i) and (iii) are equivalent. So we assume Statement (ii).
By Lemma A.6, jointly distributed exist if and only if there are jointly distributed and , with one and the same jointly distributed . Hence Statement (i) holds if and only if, for some satisfying Lemma A.3, exists with expectations , and exists with expectations . Therefore, by the induction hypothesis, Statement (i) holds if and only if
Applying now Lemma A.2 to these inequalities and adding the condition of Lemma A.3 for the consistency of with and , we obtain the following system
Statement (i) holds if and only if this system is satisfied, for some real value of . And it is satisfied if and only if
with the inequality holding for any lefthand expression combined with any righthand expression. The inequalities with matching rows are satisfied always: the first two because
for ; the third one due to the fact that
for . This leaves the following six inequalities
They simplify to
and we combine pairs of inequalities using Lemma A.2 to obtain
(A.10)  
(A.11)  
(A.12) 
These three inequalities are satisfied if and only if Statement (i) holds. In particular, Statement (i) implies (A.10), and this completes the proof by induction of the necessity part of the theorem: for any , if are jointly distributed with expectations (A.8) then these expectations satisfy (A.9) (and Lemma A.3).
Now, Corollary A.7 implies that a joint distribution of with expectations , (satisfying Lemma A.3) always exists. If we close this chain into a cycle by introducing a constant variable , we get jointly distributed variables with expectations , . Applying to it the just established necessary part of the theorem, we conclude that (A.11) always holds. Similarly, considering the chain (whose joint distribution always exists) and adding the constant variable to close the chain into a cycle, the necessary condition implies (A.12) with . Thus, (A.12) also holds always, leaving just (A.10) as the equivalent condition for Statement (i). ∎
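Proof of the main criterion (Theorem 2).
From Theorem A.8, contexts and connections with specified expectations $\langle V_i W_{i\oplus1}\rangle$ and $\langle V_i W_i\rangle$ (subject to Lemma A.3) admit a joint distribution if and only if
$$s_1\left(\langle V_i W_{i\oplus1}\rangle,\ \langle V_i W_i\rangle : i = 1, \ldots, n\right) \le 2n - 2. \qquad (A.13)$$
As the variables of the connection $(V_i, W_i)$ are dichotomic, the probability of them being equal can be written as $\Pr[V_i = W_i] = (1 + \langle V_i W_i\rangle)/2$, and so this probability is maximized if and only if the expectation $\langle V_i W_i\rangle$ is maximized. By Lemma A.3, $1 - |\langle V_i\rangle - \langle W_i\rangle|$ is the maximum possible value of $\langle V_i W_i\rangle$ given the distributions of $V_i$ and $W_i$ (determined by $\langle V_i\rangle$ and $\langle W_i\rangle$). The statement of the theorem now follows from Definition 1. ∎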
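Corollary A.9 (to Theorem 2).
A cyclic system of rank $n$ with dichotomic random variables has a maximally noncontextual description only if
$$s_1\left(\langle V_i W_{i\oplus1}\rangle : i = 1, \ldots, n\right) - \sum_{i=1}^{n} \left|\langle V_i\rangle - \langle W_i\rangle\right| \le n - 2. \qquad (A.14)$$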
Proof.
Corollary A.10 (to Theorem 2).
A cyclic consistently connected system of rank $n$ with dichotomic random variables has a maximally noncontextual description if and only if
$$s_1\left(\langle V_i W_{i\oplus1}\rangle : i = 1, \ldots, n\right) \le n - 2. \qquad (A.15)$$
Proof.
For consistently connected systems, the main criterion has the form
$$s_1\left(\langle V_i W_{i\oplus1}\rangle,\ 1 : i = 1, \ldots, n\right) \le 2n - 2. \qquad (A.16)$$
The form (A.15) follows from the easily verifiable general formula
where […] is the Iverson bracket, equal to 1 if the predicate within it is true, and zero otherwise. ∎
Corollary A.11 (to Theorem 2).
A cyclic consistently connected system of rank $n=5$ with dichotomic random variables and with
has a maximally noncontextual description if and only if the original KCBS inequality holds,
(A.17) 
where , .
References
 (50) E.N. Dzhafarov and J.V. Kujala. All-possible-couplings approach to measuring probabilistic context. PLoS ONE 8(5):e61712. doi:10.1371/journal.pone.0061712, 2013.
 (51) A. Fine. Hidden variables, joint probability, and the Bell inequalities. Physical Review Letters 48:291–295, 1982.