# Putting Bell’s inequalities into context by putting context into Bell’s inequalities

###### Abstract

Within the Dempster-Shafer theory of evidence a non-Kolmogorovian kind of epistemic uncertainty arises, which is encoded using multi-valued maps. We analyse the possible implications such non-Kolmogorovian epistemic uncertainty may have for Bell-type inequalities relating to the Einstein-Podolsky-Rosen-Bohm (EPRB) thought experiment. Our analysis leads to a notion of contextuality concerning complexes of physical measurement conditions. The use of multi-valued maps reveals an implicit link between this contextuality and counterfactual outcomes, and results in a formulation wherein the states of measurement devices are explicitly taken into account as part of the probabilistic event space. This reflects a conception of measurement that was advocated by Bell some time ago. It results in context-conditioned measure-theoretic probabilities, which do not obey Bell-type inequalities, but which are nonetheless perfectly compatible with local classical physical models. We give an example of a local classical model that reproduces the quantum mechanical predictions and that fits within the contextual framework.

###### pacs:

03.65.Ud, 02.50.Cw

## I Introduction

Although we have recently celebrated the fiftieth year since Bell’s original work Bell (1964), Bell-type inequalities remain a contentious issue _jo (). From a mathematical perspective Bell-type inequalities merely express constraints on random variables defined over a single Kolmogorov probability space. It is quite remarkable that such seemingly innocuous and straightforward results of probability theory have produced such heated and protracted debates within physics Pitowsky (1989); Khrennikov (2008). In the past Stapp claimed that Bell’s inequality is “the most profound discovery of science” Stapp (1975). On the other hand, more recently Khrennikov has claimed that “The only value of Bell’s arguments was the great stimulation of experimental technologies for entangled photons.” Khrennikov (2009). It seems that those who agree with Stapp tend to be physicists, while those who agree with Khrennikov tend to be probability theorists.

In this paper we will argue that a single Kolmogorov space is very narrow as the basis for the treatment of realistic experiments. This considerably limits the power of Bell-type inequalities to constrain the interpretation of physical models. Our alternative approach utilises concepts from the Dempster-Shafer theory of evidence Shafer (1976); Yager and Liu (2008). Unlike Kolmogorovian probabilities, Dempster-Shafer probabilities are not additive, which makes the Dempster-Shafer theory more general Shafer (1976); Yager and Liu (2008). The main use we find for this theory is in the identification of a non-Kolmogorovian kind of epistemic uncertainty. This uncertainty is associated with a probabilistic event space whenever two or more observables are operationally incompatible. It is encoded using multi-valued maps, which result in the replacement of a single Kolmogorov space with multiple Kolmogorov spaces, each labeled by a distinct measurement context. Within the treatment of EPRB-type experiments, the multi-valued maps allow us to establish a link between measurement contexts and counterfactual outcomes. When hidden variables are considered, the multi-valued maps result in a contextual approach similar to Khrennikov’s Khrennikov (2009, 2014).

We begin in section II by considering Bell-type inequalities involving empirical data from a table. Such inequalities are based on phenomenological arguments alone and make no explicit reference to hidden-variables Peres (1978, 1995). Instead they concern counterfactual outcomes of hypothetical measurements. Using the Dempster-Shafer theory enables us to probabilistically treat counterfactual outcomes differently to factual outcomes. The probabilistic uncertainty associated with the former is non-Kolmogorovian (despite being purely epistemic), while the uncertainty in the latter is of the usual Kolmogorovian variety. In section III we provide a treatment involving hidden-variables wherein multi-valued maps provide contextual observables and context-conditioned probabilities. These quantities are not constrained by Bell-type inequalities. In section IV we relate our contextual approach to conventional approaches both rigorous and heuristic, and assess the significance of Bell’s locality assumption. In section V we give an explicit example of a classical, deterministic, local model of an EPRB-type experiment that fits within our contextual approach, and that reproduces the quantum predictions. Finally, we summarise our findings in section VI.

## II The EPRB experiment

In Bohm’s version of the EPR experiment there are two spin-half particles produced by a common source. These particles travel in opposite directions labeled left and right. The spin of the left particle is measured using a Stern-Gerlach (SG) device aligned along one of two possible directions or , and the right particle is measured by a similar SG device aligned along either or . The generalised Bell-type CHSH inequalities refer to the following theorem Clauser et al. (1969).

Theorem. Let be a Kolmogorov probability space. Let real random variables represent the spin observables of the left particle along directions and respectively, and let represent the spin observables for the right particle along and respectively. The following Clauser-Horne-Shimony-Holt (CHSH) inequalities hold;

(1) |

where the correlations are defined by

(2) |

The proof is almost trivial, but is omitted for brevity.
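For orientation, one standard way of writing the CHSH bound and the argument behind it is sketched below. The symbols $A, A'$ and $B, B'$ for the left and right random variables, $\mu$ for the probability measure on $\Omega$, and the assumption that each variable takes values in $[-1,1]$ are choices made for this sketch only and need not match the notation of (1) and (2).

```latex
\begin{align}
  \langle XY \rangle &= \int_\Omega X(\omega)\,Y(\omega)\,\mathrm{d}\mu(\omega), \\
  \bigl| \langle AB \rangle + \langle AB' \rangle
       + \langle A'B \rangle - \langle A'B' \rangle \bigr| &\le 2 .
\end{align}
% Sketch of the argument (assuming all four variables take values in [-1, 1]):
% for every \omega,
%   | A (B + B') + A' (B - B') | \le |B + B'| + |B - B'| \le 2 ,
% and integrating this pointwise bound against \mu gives the inequality.
```

The essential step is that all four correlations are integrals over the same space with the same measure, so the pointwise bound survives averaging.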

### II.1 Phenomenological treatments and the Dempster-Shafer theory

In a typical EPRB experiment, measurements of the spins of an ensemble of particle pairs result in a table of values such as that given in table 1. We let denote the set of outcomes explicitly appearing in such a table. We denote by the set of known possessed values (KPVs), that is, the values known to have been possessed by the particles at the time each particle was measured. Trivially, any reasonable theory of physics allows us to set .

A standard argument that a local deterministic classical theory cannot violate the CHSH inequalities proceeds as follows. Since, in a deterministic classical theory, the particles are assumed to possess spin values in all directions whether or not they have been measured, one can imagine that the blank entries in table 1 are in fact filled with variables which can take values or . Let us denote by the set of unknown spin values possessed by the particles (UPVs). There are concrete choices that can be made for as a set of values each of which is either or . These different choices are labeled . Each represents a set of numbers called counterfactual outcomes. If, contrary to the actual experiment that we are describing, we supposed that somehow all four spin directions were measured for each particle pair, then we would obtain a complete table where the index takes a fixed value within . Each comprises a complete table of outcomes, and each table can be viewed as a Kolmogorov space in which the probability for an outcome is simply its frequency of occurrence within the table.

| Run / particle pair | Left particle | Right particle | | |
|---|---|---|---|---|
| 1 | ? | ? | | |
| 2 | ? | ? | | |
| 3 | ? | ? | | |
| 4 | ? | ? | | |
| N | | | | |

However, attempting to encode local classical determinism by making a concrete choice for as one of the conflates physical and probabilistic assumptions in a way that may not be justified. Such an assumption is much stronger than classical determinism, because it assumes not only that there exist possessed values, but also that these values are known from having performed measurements. The assumption that particles can simultaneously possess definite spin values in different directions does not by itself mean that we should assign relative frequencies to both KPVs and UPVs, as though both are measurement outcomes. Only KPVs should be treated as measurement outcomes. Put slightly differently, the assumption that possessed values merely exist does not mean that we should necessarily treat the two sets and as part of the same probability space. Such a probabilistic restriction entails a physical constraint which is different to the one we seek to impose.

Within a single Kolmogorov probability space all probabilities whether they are taken to pertain to UPVs or to KPVs are required to satisfy the same rules. Since UPVs are unknown, their probability assignments must be subjective. On the other hand KPVs appearing explicitly in a table are afforded probability assignments in the form of objective relative frequencies. Clearly a single Kolmogorov space is too limited to even allow for the possibility that these two distinct types of probability might be treated differently. However, if we employ the Dempster-Shafer theory, such a distinction becomes possible. Subjective UPV-probabilities and objective KPV-frequencies are generally viewed differently. The uncertainty associated with UPVs is interpreted as uncertainty in the underlying event space and is non-Kolmogorovian, although it is entirely epistemic.

Rather than viewing table 1 as defining Kolmogorov spaces, we instead view it as defining a single Dempster-Shafer probability space in a sense that will become clear in what follows. The basic idea in our approach is to associate a multi-valued map with each experimental context . A context is defined as a particular experimental arrangement of measurement devices, preparation devices etc. Thus, an experiment involving several measurement contexts actually corresponds to several sub-experiments each associated with a different context. If we combine probabilities conditioned on different contexts it is quite possible to violate the CHSH inequalities. Within any one context , only a subset of possible measurements can actually be performed, and we cannot associate definite relative frequencies with outcomes of measurements that cannot be made within . The multi-valued map only allows us to associate upper and lower subjective probabilities with UPVs. In this sense counterfactual outcomes are related to measurement contexts through the multi-valued maps.

In the EPRB setup the different contexts are corresponding to different possible settings of the SG devices. The contexts make no reference to the right particle’s SG device, and likewise do not refer to the left particle’s SG device. The remaining contexts (namely and ) specify an arrangement of both devices simultaneously.

Let us consider first a complete table of known values for the outcomes of measurements of the random variables . The table can be divided into rows and columns . The intersection is simply the singleton set consisting of the ’th table entry. Since every entry has value or we can partition in terms of disjoint subsets as . We can now define useful intersections such as , which is empty if and is equal to otherwise.

Considering only a single particle, we can define the following relative frequencies

(3) |

where , . Here is used to denote the cardinality of the set , and and denote the number of entries in the ’th column with value and value respectively; . Considering both particles together we can define the joint coincidence frequencies

(4) |

where and . The above probabilities can be used to define individual averages and joint correlations as

(5) |

In the case that the table is complete the data necessarily obeys all Bell-type inequalities.
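As a quick numerical check of this statement, the sketch below fills a complete table at random and evaluates a CHSH combination from its relative frequencies. The column order `(a, a2, b, b2)`, the helper name `chsh_complete`, and the particular sign convention of the combination are illustrative assumptions rather than notation from the text.

```python
import random


def chsh_complete(table):
    """CHSH combination computed from a complete table.

    Each row is a tuple (a, a2, b, b2) of +1/-1 values, so every
    correlation is an average over the same N rows.
    """
    n = len(table)

    def corr(i, j):
        # Average of the product of columns i and j over all rows.
        return sum(row[i] * row[j] for row in table) / n

    return corr(0, 2) + corr(0, 3) + corr(1, 2) - corr(1, 3)


random.seed(0)
table = [tuple(random.choice((-1, 1)) for _ in range(4)) for _ in range(1000)]
s = chsh_complete(table)
```

Row by row the quantity `a*(b + b2) + a2*(b - b2)` equals either +2 or -2, so its average over any complete table cannot exceed 2 in magnitude.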

Now consider the case in which the table has missing entries corresponding to UPVs. The total space can be partitioned as before as , but now we also have the disjoint partitioning . We can also form the intersections and . The set is empty if is unknown and is equal to otherwise. Likewise the set is empty if is known and is equal to otherwise.

With each individual context we associate the multi-valued map defined by

(6) |

and with each joint context we associate the multi-valued map defined by

(7) |

These maps give rise to multiple upper and lower sets. For example, for

(8) |

and for

(9) |

We can now define upper and lower probabilities using (3) and (4). Since each multi-valued map is associated with a specific context, so too are the upper and lower probabilities. For example, using we have

(10) |

The difference represents Dempster’s “don’t know” probability associated with . That this quantity is nonzero reflects the fact that we cannot reveal any information about the value of the random variable , within the context . More colloquially, we “don’t know” what the values of are, if what we are measuring is .
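The idea of upper and lower probabilities can be illustrated with a generic version of Dempster's construction applied to a single table column. The sketch assumes that an unknown entry (written `None`) is mapped to the whole outcome set {+1, -1}; the function name `lower_upper` and the sample column are hypothetical and are not taken from the text.

```python
def lower_upper(column, value):
    """Lower and upper probability that an entry of `column` equals `value`.

    Entries are +1, -1, or None (an unknown possessed value). An unknown
    entry is mapped to the full set {+1, -1}, so it counts towards the
    upper probability but not towards the lower one.
    """
    n = len(column)
    lower = sum(1 for x in column if x == value) / n
    upper = sum(1 for x in column if x == value or x is None) / n
    return lower, upper


column = [1, None, -1, None, 1, 1, None, -1]
lo, up = lower_upper(column, +1)
dont_know = up - lo  # fraction of entries whose value is not revealed
```

In this sketch the "don't know" difference is simply the fraction of entries in the column that are unknown, independent of which outcome is asked about.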

In the case of measurements on both particles we have within the context

(11) |

In this case the “don’t know” difference is associated with the cases in which we do not know the value of for the left-particle or we do not know the value of for the right-particle. If we restrict our attention to a single one-particle context , then we cannot meaningfully associate frequencies with the values of . Similarly in the two-particle case restricted to the context we cannot meaningfully associate frequencies with a pair of observables for which or . We can only meaningfully give subjective upper and lower probability intervals in these cases. Combinations of these subjective probabilities are quite capable of violating Bell-type inequalities, though the sense in which such combinations are really meaningful raises delicate questions.

There is, however, a way to violate the CHSH inequalities with meaningful contextual averages, which are naturally obtained from the Dempster-Shafer multi-valued maps. Each context defines an observable whose domain is the set of points for which is single-valued. The domain of such an observable is called a domain of certainty, a notion that plays a central role in the Dempster-Shafer treatment we employ. Within phenomenological treatments of EPRB-type experiments observables with disjoint domains of certainty could be termed incompatible. For example, considering only one particle we define by . In the two-particle case we can also define the product variable by , which denotes the product of the Cartesian components of restricted to the subset . Any two distinct random variables so defined are incompatible. With respect to these variables averages are only defined over domains of certainty:

(12) |

Substituting these expressions into (1) it is clear that (1) can be violated. In fact, the upper bound on the magnitude of the CHSH combination of correlations becomes 4 rather than 2.
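A minimal sketch of such context-restricted averages is given below. Each row of the table has known values only for the pair of directions actually measured, and each correlation is averaged over its own domain of certainty. The column order `(a, a2, b, b2)`, the function names, the sign convention, and the four-row table are assumptions made for illustration.

```python
def context_average(table, i, j):
    """Average of the product of columns i and j over the rows in which
    both entries are known (the domain of certainty of that joint context)."""
    products = [row[i] * row[j] for row in table
                if row[i] is not None and row[j] is not None]
    return sum(products) / len(products)


def chsh_contextual(table):
    """CHSH combination built from context-restricted correlations."""
    return (context_average(table, 0, 2) + context_average(table, 0, 3)
            + context_average(table, 1, 2) - context_average(table, 1, 3))


U = None  # unknown possessed value
table = [
    (1, U, 1, U),   # context (a, b):   product +1
    (1, U, U, 1),   # context (a, b2):  product +1
    (U, 1, 1, U),   # context (a2, b):  product +1
    (U, 1, U, -1),  # context (a2, b2): product -1
]
s = chsh_contextual(table)
```

Because the four correlations are averaged over disjoint sets of rows, nothing ties them together, and the combination reaches the algebraic maximum even though every row could be completed with definite values.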

Along with the above averages we can define context restricted probabilities

(13) |

The first of these represents the frequency with which the outcome is obtained given that the experiment is actually set up to measure , i.e., given that the context is . The second represents the probability that is obtained given that both and are actually simultaneously measured, i.e., given that the context is . The relevance of this type of conditional probability in relation to EPRB-type experiments was first pointed out by A. Fine Fine (1982). In the treatment above context conditioned probabilities and averages generally violate all Bell-type inequalities, but nothing about this more general probabilistic treatment precludes the physical assumptions of classical determinism and locality.

## III Including hidden-variables

Most treatments of EPRB-type experiments, including Bell’s original treatment, start with the assumption that for each particle pair we can use hidden-variables to give a complete, classically deterministic specification of the real experimental state. These hidden-variables are assumed to belong to a single Kolmogorov probability space. Averages are defined over this one space, and the CHSH inequalities (1) necessarily hold. In this section we will relax the latter assumption and define contextual random variables whose domains are the domains of certainty of specific Dempster-Shafer multi-valued maps.

### III.1 Rigorous hidden-variable treatment

First, for comparative purposes, we provide a rigorous formulation of the CHSH inequalities. We consider the standard EPRB setup in which the spins of two particles produced by a common source are measured. We formulate the present treatment within a single Kolmogorov space . We define the variables representing spin observables in the directions for the left particle, and similarly we define the spin observables for the right particle. The directions of these spin observables coincide with the contexts referring to the SG device alignments. We define the sets and where and . We also define the real random vector by . Finally we assume that can be viewed as giving a complete description of the underlying reality within the experiment, i.e., represents a complete ontic state of the total physical system. With everything defined as such we can now give the two-particle probabilities relevant to Bell-type inequalities as

(14) |

More generally we have

(15) |

There are permutations of the outcomes appearing in the above expression, giving the same number of probabilities . These probabilities act as a basis in the sense that they can be used to express any other (absolute) probability. Examples of single-particle and two-particle probabilities are given by

(16) |

Single-particle and two-particle averages are given by

(17) |

Despite the fact that for a single particle we cannot simultaneously attribute known possessed values to both of the observables and , equation (17) does not distinguish between averages like , and meaningful two-particle correlations such as . Similarly the formalism does not itself distinguish between the probabilities in which outcomes are simultaneously associated with all observables, and probabilities like , which only associate outcomes with observables that can be simultaneously measured. In other words, because it has been built using a single Kolmogorov space the above hidden-variable approach treats counterfactual outcomes in the same way as the conventional phenomenological approaches discussed in section II.1. In short, the above approach assumes that counterfactual outcomes can be treated in the same way as factual outcomes. It is convenient to refer to this assumption as simply the counterfactual assumption.

The quantities above can be written in terms of concrete probability densities by defining a Borel measurable chart consisting of hidden-variables that reveal the underlying physical states . If we denote by the Borel subsets of , the triple defines a concrete Kolmogorov space. One can then define the observables over this space, and one can define the joint distribution function by . The probability density associated with is defined by . The number characterises the observer’s epistemic, but purely “Kolmogorovian” uncertainty, as to the underlying physical state . This interpretation is inherited from the interpretation given to the Kolmogorov probability measure . The assumption of classical determinism means the observable is assumed to possess the definite value in the state . Note that the formalism distinguishes between two distinct types of state: an ontic state (or equivalently ), and an epistemic state . A simple example of the above formalism at work is given later on in section IV.1. For now we note that equations (14) and (17) can be written

(18) |

and

(19) |

respectively. Within the framework discussed above the CHSH inequalities (1) necessarily hold.

### III.2 A contextual Dempster-Shafer approach

We now offer an alternative approach to that above for the modeling of EPRB-type experiments. As in section II.1 we achieve this using multi-valued maps. The resulting framework shares many features in common with Khrennikov’s contextual framework Khrennikov (2009). Our starting point is the idea that even if we only considered measuring the spin of a single particle in the different directions , a single Kolmogorov probability space would not offer an adequate description of the experiment being envisioned. Using only a single space makes it impossible to account for the fact that different spin directions are not simultaneously measured. The latter is an operational fact, which must be properly encoded within any theoretical treatment, regardless of whether or not we make particular physical assumptions like determinism or locality. Thus, the theory based on a single Kolmogorov space must be extended in order to properly account for the incompatibility between different experimental contexts. In short, we avoid making the counterfactual assumption.

A single-particle context is assumed to be determined by the alignment of the SG device. As both Bohr and later Bell repeatedly emphasized should be the case, the model of an experiment should describe the entire experiment, rather than just the systems being measured Bell and Aspect (2004). With this in mind we define where are distinct SG device contexts. Each can be decomposed as with the sets corresponding to the possible spin outcomes along the direction .

Just like in section II we associate with each a multi-valued map defined by

(20) |

The uncertainty in the event space within the context , is represented by the set which denotes the complement of . We can also define the joint-context multi-valued maps by

(21) |

As before we define the random variables , whose domains are the domains of certainty of the . We also define the product variables . With respect to these observables single-particle and two-particle probabilities are defined by

(22) |

The expressions in (22) are normalised over the domains of certainty defined by and respectively, and can be interpreted as conditional probabilities. In a similar fashion single-particle averages and two-particle correlations are defined as

(23) |

where again each expression is normalised over a context-specific domain of certainty, and so represents a conditional expectation. If we substitute correlations of the form given in (23) into (1) we obtain, as in section II, an upper bound on the magnitude of the CHSH combination of 4 rather than 2. Thus, conditional probabilities and correlations do not satisfy Bell-type inequalities. Of course, the interpretation of quantum probabilities pertaining to EPRB experiments as classical conditional probabilities is well-known Rédei and Stöltzner (2001); Khrennikov (2014).

As in section III.1 we can map into . However, since in general we expect the epistemic states of an observer to depend on the context, we now associate with each context a “local” (as opposed to global) chart defined over the domain of certainty . With each joint context we associate another chart defined over the domain of certainty . The local charts give rise to measures , epistemic states , and observables defined over . Analogous measures, states and observables are associated with the joint contexts . The set is the “concrete” domain of certainty in that corresponds to the “abstract” domain of certainty . The probabilities in (22) can now be written

(24) |

The notation does not necessarily denote a conventional conditional probability within , rather it indicates that all probabilities pertaining to context are normalised over . In general one need not require that . The normalisation factors in (24) can be absorbed into the definition of the densities to yield absolute probabilities over the domains of certainty. More precisely, defining and , we have

(25) |

These normalised measures allow one to define conditional probabilities within the domains of certainty in the usual way. The averages in (23) can be written

(26) |

Whenever the joint correlations in (III.2) can be expressed in the form given in (III.1), the CHSH inequalities in (1) hold. This requires the existence of a common density corresponding to a global chart , such that every probability is normalised over the total space . Such a description would only generally be appropriate if all measurements within the experiment were performed under the same physical conditions, that is, within the same context. Note that when a global chart is used but the probabilities are normalised only over the domains of certainty, they can be interpreted as conditional probabilities as in (III.2).
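The contrast between a common density and context-dependent densities can be sketched on a finite hidden-variable space. In the toy example below each hidden state fixes all four spin values deterministically; the names `LAMBDAS`, `correlation`, `chsh`, the index convention `(a, a2, b, b2)`, and the specific densities are assumptions chosen for illustration and are not the model discussed in the text.

```python
import itertools

# Finite hidden-variable space: each lambda fixes all four spin values.
LAMBDAS = list(itertools.product((-1, 1), repeat=4))  # (a, a2, b, b2)
CONTEXTS = [(0, 2), (0, 3), (1, 2), (1, 3)]           # joint contexts


def correlation(density, i, j):
    """Expectation of the product of components i and j under `density`,
    a dict mapping hidden states to probabilities."""
    return sum(p * lam[i] * lam[j] for lam, p in density.items())


def chsh(densities):
    """CHSH combination where densities[(i, j)] is used in context (i, j)."""
    return (correlation(densities[(0, 2)], 0, 2)
            + correlation(densities[(0, 3)], 0, 3)
            + correlation(densities[(1, 2)], 1, 2)
            - correlation(densities[(1, 3)], 1, 3))


# Common density: the same uniform distribution in every context.
uniform = {lam: 1 / len(LAMBDAS) for lam in LAMBDAS}
s_common = chsh({c: uniform for c in CONTEXTS})

# Context-dependent densities: one context is concentrated on a
# different hidden state from the other three.
s_contextual = chsh({
    (0, 2): {(1, 1, 1, 1): 1.0},
    (0, 3): {(1, 1, 1, 1): 1.0},
    (1, 2): {(1, 1, 1, 1): 1.0},
    (1, 3): {(1, 1, 1, -1): 1.0},
})
```

With any single density shared by all contexts the combination stays within the bound of 2, whereas letting the density vary with context removes that constraint while each hidden state remains fully deterministic.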

### III.3 Discussion

Two different alignments of an SG device produce altogether different inhomogeneous magnetic fields over the spacetime region within which the spin measurement is performed. In general, the physical conditions under which a measurement is performed are at least partly determined by the state of the measuring device, which therefore influences the observed physical events. As Bell himself puts it: “the results have to be regarded as the joint product of ‘system’ and ‘apparatus’, the complete experimental setup” Bell and Aspect (2004). In a deterministic theory the set of all possessed values for the observables of a physical system can be divided into KPVs and UPVs. The KPVs are what Bell calls “results”, while obviously the UPVs are not “results”. As such UPVs should not be regarded as the joint product of system and apparatus. Rather, UPVs depend on the system alone. Moreover, in EPRB-type experiments the UPVs pertain to spin observables that are operationally incompatible with the spin observables that are actually measured to give the KPVs. Crucially therefore, we cannot assume that the set of KPVs is representative of the set of UPVs. In other words, if, in determining measurement outcomes, measurement devices are active rather than passive, then the counterfactual assumption is not valid whenever operationally incompatible observables are being considered. Thus, we see that Bell’s own conception of measurement cannot generally be reconciled with the counterfactual assumption crucial for proving Bell-type inequalities.

In our approach distinct and incompatible states of a measuring device result in different physical contexts . Each context is associated with a multi-valued map . With respect to a given context , the uncertainty associated with counterfactual events taking place in a different context , is non-Kolmogorovian epistemic uncertainty pertaining to the underlying event space. This leads to the conditioning of densities on measurement contexts, and results in probabilities as in (24). These probabilities are defined in terms of different measures and are normalised over different domains. As a result they are not constrained by Bell-type inequalities. The contextual framework given here offers an explanation as to why violations of Bell-type inequalities have actually been observed in the lab Aspect et al. (1982) without resorting to nonlocality or indeterminism.

#### III.3.1 Free will

In mathematical terms the contextual formulation above uses multiple Kolmogorov probability spaces to model the EPRB experiment. This is deemed appropriate because the experiment actually consists of four incompatible experimental contexts. Thus, rather than using a single measure corresponding to which there is a single density , we use multiple densities , which are labeled by context. The assumption that a single density suffices, so that

(27) |

is called measurement independence in Hall (2010); Barrett and Gisin (2011), and is discussed in _ex (1985). It is argued that this assumption encodes the “free will” of the experimenter in choosing measurement settings. Thus, the use of different Kolmogorov probability spaces to model different and incompatible (sub)experiments is somehow seen as violating the assumption that observers have “free will”.

First we note that even when equation (27) holds, Bell-type inequalities will generally be violated if the probabilities involved are interpreted as conditional probabilities Rédei and Stöltzner (2001); Khrennikov (2014, 2015). Furthermore, absolute relative frequency probabilities are necessarily absolute Kolmogorovian probabilities and therefore they must satisfy Bell-type inequalities. This means that quantum probabilities violating Bell-type inequalities cannot be interpreted as absolute frequency probabilities Rédei and Stöltzner (2001); Khrennikov (2015). A violation of Bell-type inequalities can only be explained through some sort of conditional probability model Rédei and Stöltzner (2001); Khrennikov (2015). The challenge one is faced with in constructing a deterministic hidden-variable model, is the reproduction of the quantum predictions using either conventional event-conditioned classical probabilities, or perhaps using context-conditioned classical probabilities. [Footnote 1: Context-conditioning is different to event-conditioning in that it makes use of multiple probability measures, each associated with a different context. Event-conditioning is the standard conditioning that occurs within a single Kolmogorov space, and is defined using Bayes’ rule (see Khrennikov (2009)).] Regarding the latter, one can simply interpret “measurement dependence” as the assumption that incompatible experimental contexts require separate probability spaces. This is justified through the assumption that experimental results are the product of the system and the apparatus.

Consider, for example, a situation in which a physical system is prepared at time and that this results in some ontic (possibly unknown) state . Suppose that at time the system is to be measured for some duration , with a device whose state is always known and can be controlled by the experimenter. In many cases a reasonable assumption might be that the state of the macroscopic measuring device is stationary .

We can now define a “free will” condition as the assumption that and are independent variables for . This means that which can be chosen at any time is not constrained by the system . However, during the measurement process and will certainly interact, so there is no good reason to assume that will be independent of for . The probability density over ontic states taken at the time of the measurement therefore depends on (as well as and other initial data). We can therefore write for . Thus, if the probability to obtain a particular measurement result is calculated from a density over ontic states taken at the time of the measurement, then the probabilities will be contextual. This does not violate “free will”, but simply indicates that the system and measurement device have interacted, so the result is not “measurement independent”.

The above qualitative analysis implies that the measuring device generally disturbs the system being measured and so it is not passive within the experiment. The assumption of a passive measurement device may often be made when dealing with macroscopic systems in classical physics, but this assumption is not a fundamental postulate of the latter, and it seems ill-justified when considering microscopic systems. Indeed, the assumption that measurement devices are always passive in classical physics introduces a kind of classical measurement problem in which measurement devices are given a privileged role as special systems that do not disturb the systems they interact with.

## IV Relation to conventional approaches

In this section we attempt to assess how the rigorous treatment in III.1, the contextual treatment in III.2, and conventional hidden-variable treatments found throughout the physics literature, are each related to one another. Throughout the physics literature pertaining to Bell-type inequalities, one frequently encounters heuristic expressions for what is known as Bell’s locality assumption written in the form

(28) |

Here and are left and right-particle spin outcomes respectively, and are left and right-SG device settings respectively, and as usual is supposed to offer a complete description of the underlying state of affairs. The labels and in (28) refer to the left and right-particles respectively. This particular notation was borrowed from Fry et al. (2009). Sometimes one sees appearing as a conditioning event as in the following expression, borrowed from a recent article Żukowski and Brukner (2014)

(29) |

In our notation (28) would read

(30) |

though we have not yet offered an interpretation of the conditioning of our probabilities on an underlying physical state . This will be done in what follows.

### IV.1 The role of conditional probabilities

The probabilities on the left-hand-side in (29) are our primary interest as these are the probabilities appearing in Bell-type inequalities. Let us therefore focus on the left-hand-side of (29). Suppose that we treat the device settings and as genuine conditioning events in an event space . This, after all, is what the notations in (28) and (29) suggest we should do. As soon as we choose to do this it becomes impossible to avoid including a description of the SG devices within the probabilistic treatment of the experiment. Proceeding along this line of inquiry, our aim in this section is to relate the expressions in (28) and (29) to the expressions derived in III.2. To this end we interpret the left-hand-side of (29) as being given by (III.2);

(31) |

Now we would like to determine whether or not the right-hand-side of (29) can be understood as being equal to the right-hand-side of (31). The probabilities in (28) and (29) are supposed to be conditioned on the settings of the SG devices. This gives the impression that the states of the measuring devices have been properly taken into account. However, the expressions obtained do not represent true conditional probabilities, because they are not normalised as such. Rather they are normalised by . Perhaps the conditioning is supposed to have been taken care of by the hidden-variables . Let us investigate this possibility.

In the rigorous measure-theoretic formulation of Kolmogorov probability theory (cf. Gatti (2004); Rao and Swift (2006); Klenke (2013)) a probability density is associated with a real random vector , which defines a measure where is the set of Borel subsets of . Note that the domain of implies that denotes the pre-image map, which is well-defined whether or not is invertible. In turn one can define an absolutely continuous distribution where . The associated density is defined as . Formally can be thought of as the probability that , and could be denoted . One can extend the construction to deal with conditional probabilities by defining for the measure . With this one can define a measure over as

(32) |

Then one defines the conditional distribution and the associated density

(33) |

where is the Dirac measure associated with ;

(34) |

According to this definition is normalised to unity indicating that it is a genuine probability density. Equation (33) yields for a subset with , the following expression for a conditional probability written in terms of the corresponding density;

(35) |

Finally, we can define the conditional probability by

(36) |

whenever almost everywhere (except on a set of measure zero). Comparing this expression with (33) yields

(37) |

This result is precisely what we should expect to find, because we are assuming that represents the precise state of affairs within the underlying physical reality. If we are given we can be certain which events will and will not occur. The quantity represents the probability of event given . Thus, must be either 0 or 1. More precisely, the event is certain to occur if and certain not to occur otherwise, hence . Another way to view is as the probability of given that the epistemic state is the delta function , which corresponds to the situation in which we have complete knowledge of the underlying reality. Borrowing terminology from quantum theory, delta-function epistemic states could be termed pure states, with all other epistemic states termed mixed states. These pure states are clearly in one-to-one correspondence with the ontic states . Now, from either (37), or directly from (36), it follows that

(38) |

of which the first equality appears to be quite close to what we see in (29). Before we use the above formalism in analysing the conventional hidden-variable approaches, it may be instructive to see it in action using a simple example.
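Before turning to that example, the structure of the formalism can be checked in a finite setting, where integrals reduce to sums. The following is a minimal sketch under stated assumptions: the three ontic states, the density `rho`, the non-invertible `observable`, and the helper names `preimage`, `dirac`, `prob` and `prob_observable` are all hypothetical and do not appear in the text. It shows that the probability of an event given an ontic state is a Dirac measure taking only the values 0 and 1, that averaging it over the epistemic density recovers the unconditional probability, and that the pre-image of a set of outcomes is well defined even when the observable is not invertible.

```python
from fractions import Fraction

# Hypothetical finite set of ontic states with an epistemic density rho.
rho = {"s1": Fraction(1, 2), "s2": Fraction(1, 4), "s3": Fraction(1, 4)}

# Hypothetical observable; it is not invertible because s1 and s2 share a value.
observable = {"s1": +1, "s2": +1, "s3": -1}


def preimage(values):
    """Set of ontic states that the observable maps into `values`."""
    return {s for s, v in observable.items() if v in values}


def dirac(state, event):
    """Probability of `event` given complete knowledge of the ontic state."""
    return 1 if state in event else 0


def prob(event):
    """Unconditional probability: the Dirac measure averaged over rho."""
    return sum(dirac(s, event) * p for s, p in rho.items())


def prob_observable(values):
    """Probability that the observable takes a value in `values`."""
    return prob(preimage(values))


if __name__ == "__main__":
    print(prob_observable({+1}), prob_observable({-1}))
```

Exact rational arithmetic is used so that normalisation can be verified without rounding error.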

Consider a point particle moving in one-dimensional Euclidean space . With initial conditions the second order dynamics resulting from Newton’s laws defines a well-posed Cauchy problem. A position value and velocity (momentum) value suffice to give a complete physical description of the particle. Thus, the state (event) space is , which denotes the cotangent bundle of . We can turn the manifold into a Kolmogorov space by equipping it with a Kolmogorov measure where is a suitable -algebra. Since is a flat manifold it admits a global coordinate chart , which is associated with some family of observers . A particle state is a phase point with coordinate representation relative to . A particle observable is a suitably well-behaved (e.g. smooth, square-integrable) function , which admits