Operational Mermin non-locality and All-vs-Nothing arguments

Stefano Gogioso
Quantum Group, Department of Computer Science
University of Oxford, UK

Contextuality is a key resource in quantum information and the device-independent security of quantum algorithms. In this work, we show that the recently developed, operational Mermin non-locality arguments of [10] provide a large, novel family of quantum realisable All-vs-Nothing models [2]. In particular, they result in a diverse wealth of quantum realisable models which are maximally contextual (i.e. lie on the faces of the no-signalling polytope with no local elements), and could be used as a resource for the security of a new class of quantum secret sharing algorithms.

1 Introduction

Ever since Bell’s original work [6], contextuality has evolved from spooky phenomenon to fundamental feature of quantum mechanics, with applications to device-independent quantum security [5][11] and a recently proposed role in quantum speed-up [12]. Contrary to non-locality, which has many inequivalent definitions in the different communities, contextuality comes with a reasonably standard definition in terms of measurement contexts and probability distributions on outcomes (known as empirical models), and is rigorously captured by the sheaf-theoretic framework introduced in [3].

Of the many non-locality arguments that followed Bell’s, Mermin’s non-locality argument [16] stands out for its elegance and simplicity, and its $N$-partite generalisations can be directly translated into a family of quantum secret sharing protocols known as HBB CQ [15][14]. Mermin’s original non-locality argument is based on a system of equations, each admitting a solution in $\mathbb{Z}_2$ but without a global solution: the probabilities don’t play any role in the argument, which admits a purely possibilistic treatment. The formulation in terms of equations directly implies a much stronger form of contextuality, and can be generalised to a large class of possibilistic contextuality arguments known as All-vs-Nothing models [2]. There is considerable interest in quantum realisable All-vs-Nothing models (like the one from Mermin’s original argument) because all such models would automatically be maximally contextual, lying on a face of the no-signalling polytope.

The traditional linear-algebraic formulation of quantum mechanics makes it hard to isolate and understand the operational building blocks that lead to quantum advantage in quantum information and computation, as well as non-classicality in quantum foundations. The framework of Categorical Quantum Mechanics [4] has been developed throughout the years to provide a concrete, hands-on language that describes many fundamental structures involved in the theory and applications of quantum mechanics. Mermin’s original non-locality argument was formalised in this language by [7], unearthing a novel connection between contextuality in Mermin’s argument and the structure of phase groups in quantum mechanics, and a treatment of the HBB CQ protocols appears in [17].

A complete characterisation of Mermin non-locality in terms of phase groups recently appeared in [10], leading to a large class of contextuality arguments generalising Mermin’s original argument. Instead of focusing on the possibilistic distribution of outcomes and the system of locally-solvable/globally-unsolvable equations, this new approach focuses on the operational aspects, involving phases and eigenstates of the Pauli observables. In particular, it generalises the single equation $2y = 1$, with no solution in $\mathbb{Z}_2$, which is used in Mermin’s original argument to prove the non-existence of global solutions for the system of equations.

In Section 2, we provide an alternative and more discursive presentation of the material in [10], and we show that all the “operational Mermin non-locality arguments” described therein are quantum realisable.

In Section 3, we draw the connection with the sheaf-theoretic framework and All-vs-Nothing models, and we show that the operational Mermin non-locality arguments provide a new infinite family of quantum realisable All-vs-Nothing models. Furthermore, we show how operational Mermin non-locality arguments can be used to provide a non-collapsing hierarchy of All-vs-Nothing models requiring arbitrarily large finite fields for their formulation.

2 Operational Mermin non-locality

The first part of this paper presents the work of [10] on Mermin non-locality [Footnote 1: In this work, the word non-locality is used in “Mermin non-locality” for historical reasons, but in all technical contexts we will prefer the word contextuality, to take away any residual emphasis on underlying space-time structure carried by the expression “non-locality”.] in a format more easily accessible to the quantum information community, and provides a novel result on quantum realisability. Mermin’s original non-locality argument is summarised, with a particular focus on the role played by phases. Finite-dimensional Hilbert spaces are generalised to finite-dimensional free modules over involutive semirings: GHZ states, phase gates and measurements/decoherence are introduced in this new context. Finally, Mermin’s original non-locality argument is fully generalised to obtain a large family of quantum realisable non-locality arguments: because of the focus on concrete realisation in $\dagger$-symmetric monoidal categories, we shall refer to this more general family as the operational Mermin non-locality arguments.

2.1 Mermin’s original non-locality argument

In the original [16], Mermin considers a 3-qubit GHZ state in the computational basis, the basis of eigenstates of the single-qubit Pauli $Z$ observable, together with the following 4 measurement contexts:

  1. The GHZ state is measured in the observable $X_1 \otimes X_2 \otimes X_3$. [Footnote 2: Where $X_i$ is the single-qubit Pauli $X$ observable on qubits $i = 1, 2, 3$.]

  2. The GHZ state is measured in the observable $X_1 \otimes Y_2 \otimes Y_3$. [Footnote 3: Where $Y_i$ is the single-qubit Pauli $Y$ observable on qubits $i = 1, 2, 3$.]

  3. The GHZ state is measured in the observable $Y_1 \otimes X_2 \otimes Y_3$.

  4. The GHZ state is measured in the observable $Y_1 \otimes Y_2 \otimes X_3$.

Following traditional notation, we denote by $|0\rangle, |1\rangle$ the eigenstates of the single-qubit Pauli $Z$ observable, by $|\pm\rangle$ those of the single-qubit Pauli $X$ observable and by $|\pm i\rangle$ those of the single-qubit Pauli $Y$ observable. We can see measurement outcomes as valued in $\mathbb{Z}_2$ by fixing the following bijections:

  1. $|+\rangle \mapsto 0$ and $|-\rangle \mapsto 1$ for the $X$ observable

  2. $|+i\rangle \mapsto 0$ and $|-i\rangle \mapsto 1$ for the $Y$ observable

Mermin’s argument then proceeds as follows. While the joint measurement outcomes are probabilistic, the sum of the outcomes turns out to be deterministic, yielding the following system of equations:

$$\begin{cases} X_1 \oplus X_2 \oplus X_3 = 0 \\ X_1 \oplus Y_2 \oplus Y_3 = 1 \\ Y_1 \oplus X_2 \oplus Y_3 = 1 \\ Y_1 \oplus Y_2 \oplus X_3 = 1 \end{cases} \tag{2.1}$$

If there were a non-contextual assignment of outcomes for all measurements ($X_1, X_2, X_3$ and $Y_1, Y_2, Y_3$), i.e. if there existed a non-contextual hidden variable model, then the system of equations 2.1 would have a solution in $\mathbb{Z}_2$, and in particular it would have to be consistent. However, the sum of the left hand sides yields $0$ in $\mathbb{Z}_2$:

$$2X_1 \oplus 2X_2 \oplus 2X_3 \oplus 2Y_1 \oplus 2Y_2 \oplus 2Y_3 = 0 \tag{2.2}$$

while the sum of the right hand sides yields $1$ in $\mathbb{Z}_2$. This shows the system to be inconsistent. Equivalently, one could observe that the sum of the LHS from 2.2 can be written as $2y$, where $y := X_1 \oplus X_2 \oplus X_3 \oplus Y_1 \oplus Y_2 \oplus Y_3$, and that the inconsistency of the system of equations is witnessed by the fact that the equation $2y = 1$ has no solution in $\mathbb{Z}_2$. This latter point of view is the key to the operational generalisation of Mermin non-locality, while the All-vs-Nothing generalisation has its focus on inconsistent systems of equations.
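The inconsistency can also be confirmed by exhaustive search. The sketch below assumes the standard parities for the four contexts (0 for $XXX$ and 1 for $XYY$, $YXY$, $YYX$) and uses hypothetical variable names `X1`, ..., `Y3` for the would-be predetermined outcomes; it is an illustration, not part of the original argument.

```python
from itertools import product

# Hypothetical names for the six predetermined outcomes in Z_2.
VARIABLES = ("X1", "X2", "X3", "Y1", "Y2", "Y3")

# Each context: (outcomes that are summed, required parity in Z_2).
# The parities 0, 1, 1, 1 are the assumed right hand sides of the system.
CONTEXTS = (
    (("X1", "X2", "X3"), 0),
    (("X1", "Y2", "Y3"), 1),
    (("Y1", "X2", "Y3"), 1),
    (("Y1", "Y2", "X3"), 1),
)


def satisfying_assignments(contexts=CONTEXTS):
    """Return every Z_2 assignment satisfying all the given contexts."""
    solutions = []
    for bits in product((0, 1), repeat=len(VARIABLES)):
        value = dict(zip(VARIABLES, bits))
        if all(sum(value[name] for name in names) % 2 == parity
               for names, parity in contexts):
            solutions.append(value)
    return solutions


# No assignment satisfies all four equations at once ...
global_solutions = satisfying_assignments()
# ... although each equation taken alone has plenty of solutions.
single_context_counts = [len(satisfying_assignments((c,))) for c in CONTEXTS]

print(len(global_solutions))
print(single_context_counts)
```

Each equation is solvable on its own while the full system is not, which is the locally-solvable/globally-unsolvable pattern described above.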

2.2 The role of phases in Mermin’s argument

To understand the role played by the equation $2y = 1$ in the original Mermin argument, we take a step back. First of all, we observe that the single-qubit Pauli $Y$ measurement can be equivalently obtained as a single-qubit Pauli $X$ measurement preceded by an appropriate unitary. A single-qubit phase gate, in the computational basis (single-qubit Pauli $Z$ observable), is a unitary transformation in the following form:

$$P_\alpha := \begin{pmatrix} 1 & 0 \\ 0 & e^{i\alpha} \end{pmatrix} \tag{2.3}$$

where we used the fact that global phases are irrelevant to set the first diagonal element to 1. Then measuring in the single-qubit $Y$ observable is equivalent to first applying the single-qubit phase gate $P_{\pi/2}$ (up to the sign convention for the phase) and then measuring in the single-qubit Pauli $X$ observable.

Because they pairwise commute, phase gates come with a natural abelian group structure given by composition, resulting in an isomorphism between them and the abelian group $\mathbb{R}/2\pi\mathbb{Z}$ (isomorphic to the circle group $S^1$). Of all the phase gates, $P_0$ (the identity element of the group) and $P_\pi$ stand out because of their well-defined action on the eigenstates $|\pm\rangle$ of the single-qubit Pauli $X$ observable:

$$P_0 |\pm\rangle = |\pm\rangle, \qquad P_\pi |\pm\rangle = |\mp\rangle \tag{2.4}$$

If we see $\mathbb{Z}_2$ as the subgroup $\{0, \pi\} < \mathbb{R}/2\pi\mathbb{Z}$ (corresponding to $\{\pm 1\}$ in the circle group), then Equation 2.4 looks a lot like the regular action of $\mathbb{Z}_2$ on itself. This is not a coincidence. Each phase gate $P_\alpha$ can be (faithfully) associated with the unique phase state $|\alpha\rangle$ obtained from its diagonal, and these phase states can be abstractly characterised in terms of the single-qubit Pauli $Z$ observable, with no reference to phase gates (see the next section for the characterisation of phase states). The phase states inherit the abelian group structure of the phase gates, and their regular action coincides with the action of the group of phase gates on them. In particular, the phase gates $P_0$ and $P_\pi$ have the eigenstates of the single-qubit Pauli $X$ observable as their associated phase states $|+\rangle$ and $|-\rangle$, endowing the outcomes of single-qubit Pauli $X$ measurements with the natural abelian group structure arising [Footnote 4: There is a unique isomorphism $\mathbb{Z}_2 \cong \{0, \pi\}$.] from the inclusion $\{0, \pi\} < \mathbb{R}/2\pi\mathbb{Z}$. We will refer to the group of phase states as the group of $Z$-phases, and to the subgroup $\{0, \pi\}$ as the subgroup of $X$-classical points, which we will also use to label the corresponding measurement outcomes for the single-qubit Pauli $X$ observable. We now show how to re-construct Mermin’s argument from the following statement: the equation $2y = \pi$ has no solution in the subgroup of $X$-classical points, but has a solution $y = \frac{\pi}{2}$ (corresponding to $i$ in the circle group) in the group of $Z$-phases.
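The facts about phase gates used above are easy to check numerically. The following sketch assumes the usual matrix form $\mathrm{diag}(1, e^{i\alpha})$ for a phase gate in the computational basis; the function and constant names are illustrative only.

```python
import numpy as np


def phase_gate(alpha):
    """Single-qubit phase gate diag(1, e^{i alpha}) in the computational basis."""
    return np.diag([1.0, np.exp(1j * alpha)])


PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)
PAULI_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
PLUS = np.array([1, 1], dtype=complex) / np.sqrt(2)
MINUS = np.array([1, -1], dtype=complex) / np.sqrt(2)

P_HALF = phase_gate(np.pi / 2)
P_PI = phase_gate(np.pi)

# Conjugating Pauli X by the pi/2 phase gate gives Pauli Y, so a Y measurement
# is an X measurement preceded by a phase gate (up to the sign convention).
conjugated_x = P_HALF @ PAULI_X @ P_HALF.conj().T

# Phase gates compose by adding their angles modulo 2 pi.
composed = phase_gate(0.7) @ phase_gate(2.1)

print(np.allclose(conjugated_x, PAULI_Y))
print(np.allclose(P_PI @ PLUS, MINUS), np.allclose(P_PI @ MINUS, PLUS))
print(np.allclose(composed, phase_gate(2.8)))
print(np.allclose(P_HALF @ P_HALF, P_PI))
```

The last check is the operational content of the equation $2y = \pi$: applying the $\frac{\pi}{2}$ phase gate twice gives the gate that swaps the two $X$ eigenstates.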

The GHZ state used in Mermin’s argument has a special property, due to strong complementarity, when it comes to phase gates followed by measurements in the single-qubit Pauli $X$ observable [7].

Lemma 2.1.

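If $\alpha \in \mathbb{R}/2\pi\mathbb{Z}$, denote by $X_i^{\alpha}$ the measurement (outcome) on qubit $i$ obtained by first applying phase gate $P_\alpha$ and then measuring in the single-qubit Pauli $X$ observable. If $\alpha_1 + \alpha_2 + \alpha_3 = 0$ or $\pi$ (mod $2\pi$), then $X_1^{\alpha_1} \oplus X_2^{\alpha_2} \oplus X_3^{\alpha_3} = 0$ or $\pi$ respectively.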
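The property stated in the Lemma can be checked by direct simulation. The sketch below is a minimal illustration and makes two assumptions: the GHZ state is $(|0\ldots0\rangle + |1\ldots1\rangle)/\sqrt{2}$, and the $X$ measurement is modelled as a Hadamard gate followed by a computational basis readout, with outcome bits 0 and 1 standing for $|+\rangle$ and $|-\rangle$. All names are illustrative.

```python
import numpy as np
from itertools import product

HADAMARD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def phase_gate(alpha):
    """Single-qubit phase gate diag(1, e^{i alpha})."""
    return np.diag([1.0, np.exp(1j * alpha)])


def ghz_state(n):
    """n-qubit GHZ state (|0...0> + |1...1>) / sqrt(2)."""
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = state[-1] = 1 / np.sqrt(2)
    return state


def outcome_probabilities(alphas):
    """Joint X-outcome distribution after applying one phase gate per qubit."""
    operator = np.array([[1.0 + 0j]])
    for alpha in alphas:
        # Phase gate first, then rotate the X basis onto the computational one.
        operator = np.kron(operator, HADAMARD @ phase_gate(alpha))
    amplitudes = operator @ ghz_state(len(alphas))
    outcomes = product((0, 1), repeat=len(alphas))
    return {bits: abs(amp) ** 2 for bits, amp in zip(outcomes, amplitudes)}


def observed_parities(alphas, tol=1e-9):
    """Set of parities (sum of outcome bits mod 2) with non-zero probability."""
    probabilities = outcome_probabilities(alphas)
    return {sum(bits) % 2 for bits, p in probabilities.items() if p > tol}


print(observed_parities([0, 0, 0]))
print(observed_parities([0, np.pi / 2, np.pi / 2]))
print(observed_parities([np.pi / 2, np.pi / 2, np.pi / 2]))
```

When the phases sum to $0$ or $\pi$ the parity of the outcomes is deterministic, even though the individual outcomes are not; when they sum to anything else (third call) both parities occur.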

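In the particular case of $\alpha_i = 0$ and $\alpha_i = \frac{\pi}{2}$, we obtain the system of equations from 2.1, where now $\oplus$ is the sum in the abelian group $\{0, \pi\} < \mathbb{R}/2\pi\mathbb{Z}$, instead of the original $\mathbb{Z}_2$:

$$\begin{cases} X_1^{0} \oplus X_2^{0} \oplus X_3^{0} = 0 \\ X_1^{0} \oplus X_2^{\pi/2} \oplus X_3^{\pi/2} = \pi \\ X_1^{\pi/2} \oplus X_2^{0} \oplus X_3^{\pi/2} = \pi \\ X_1^{\pi/2} \oplus X_2^{\pi/2} \oplus X_3^{0} = \pi \end{cases} \tag{2.5}$$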

Now back to the equation , which has solution in , but no solution in . Consider an -partite GHZ state (with ), the measurement and its non-trivial cyclic permutations. This yields the following generalised system of equations, where all the right hand sides are because we chose phase gates based on the solution in to the equation :


Adding up (in the abelian group ) all left hand sides gives the following equation:


where we defined . Taking , where is the exponent of the group , makes the right hand side of 2.7 into ; the smallest such is , yielding a 3-partite GHZ state. Adding the left hand side of Equation 2.7 to and the right hand side to leaves us with the equation .

Then the following system of equations, the same system from 2.1 but with phase gate notation, can be seen to be inconsistent by adding up the three variations as equations, then adding (an integer) times the control (an equation) and obtaining the equation , which has no solution in the subgroup of -classical points and thus excludes non-contextual hidden variable models:


We just saw how phase gates, with their role in measurements of the GHZ state, allowed us to reconstruct the Mermin argument from the fact that the equation $2y = \pi$ has solutions in the group of $Z$-phases, allowing the argument to be formulated, but not in the subgroup of $X$-classical points, disallowing the existence of a non-contextual hidden variable model. In the next section we generalise this technique to arbitrary pairs of strongly complementary observables (generalising single-qubit Pauli $X$ and Pauli $Z$), in arbitrary $\dagger$-symmetric monoidal categories (henceforth $\dagger$-SMCs, generalising finite-dimensional Hilbert spaces).

2.3 From Hilbert spaces to modules of semirings

In [7] the original Mermin non-locality argument from the previous section is formalised in the context of Categorical Quantum Mechanics by using strong complementarity. In [10], the argument is fully generalised, and a completely algebraic characterisation of Mermin non-locality, valid in arbitrary $\dagger$-SMCs, is provided. Here we present the work of [10] in a language closer to the one traditionally used in the study of quantum information.

Instead of the field $\mathbb{C}$ of complex numbers, equipped with the involution given by complex conjugation, we will consider the more general case of an involutive commutative semiring $R$. [Footnote 5: We require $0 \neq 1$ in $R$.] We will substitute finite-dimensional Hilbert spaces and linear maps with finite-dimensional free $R$-modules (henceforth spaces) and $R$-linear maps (henceforth morphisms). We will refer to this as a process theory (with superposition). [Footnote 6: Which is easier on the tongue than “dagger symmetric monoidal category distributively enriched in commutative monoids”.]

Given a basis of states of a space [Footnote 7: I.e. elements of the $R$-modules, corresponding to vectors in a vector space. We will equivalently see a state of an $R$-module as the unique morphism given by .], every state of can be written as follows, for a unique family of coefficients in $R$:


We have a on states given as follows, where is any orthonormal basis ( is the involution):


More in general, given orthonormal bases and of two free -modules and respectively, any -linear map can be written as follows, for a unique family of coefficients in :


We have a on -linear maps given by:


Finally, the tensor product sends free -modules with bases and to the free -module over the basis .

Remark 2.2.

From the point of view of [9], this is a $\dagger$-SMC distributively enriched over commutative monoids, where all objects admit some classical structure with enough classical points. In this context, classical structures with enough classical points (and such that the classical points form a finite, normalisable family) always correspond to orthonormal bases. We shall not use this language in the rest of this paper.

2.4 GHZ states

From now on we fix some arbitrary space and work in a finite orthonormal basis , which we shall refer to as the observable, or the basis. We will write for the cardinality of . Furthermore, suppose that for some abelian group structure on there are morphisms and given by:



The internal monoid , together with its adjoint, forms what is known as a quasi-special commutative -Frobenius algebra [9], which we shall refer to as the observable. Although it is not necessarily true that this algebra will have classical points forming a basis, and thus that it can be given the same interpretation as non-degenerate observables in quantum mechanics, we adopt this nomenclature to highlight the fact that the associated GHZ state will play the same role that was played in Mermin’s original argument by the GHZ state in the single-qubit Pauli basis. As a technical requirement, we will ask for the natural number to have a multiplicative inverse as an element of the semiring .

The adjoint of the morphism is given as follows, and will be used to construct the GHZ state:


By composing [Footnote 8: But not tensoring.] together copies of , we can obtain as many different morphisms as there are binary trees with nodes. However, the group addition is associative, and with it and : hence all the morphisms that can be obtained from (by composition only) coincide with the following morphism:


where the -partite generalised GHZ state (with respect to the observable) is given by:


The -partite GHZ state is defined to be the generalised GHZ state at the group element :


The state in Equation 2.18 is expressed in terms of the observable, while the GHZ state is traditionally written in the observable: how is this new expression related to the traditional one? In the case of finite-dimensional Hilbert spaces [Footnote 9: But this can be done in more generality.], the orthogonal basis associated with our observable would take the following form, where is the set (abelian group , in fact) of multiplicative characters of the abelian group :


By the fundamental theorem of finite abelian groups, we can always write for some natural numbers (in fact, prime powers) : in this case, elements of can be written as -indexed vectors, with the -th component valued in , and Equation 2.19 takes the more familiar form:


where we have fixed some isomorphism [Footnote 10: We can because every finite abelian group is isomorphic to its Pontryagin dual, but our choice of iso is, in general, non-canonical.] , bijectively sending to . Equation 2.20 can be inverted to write the basis in terms of the basis:


Then Equation 2.18 can be written as follows in terms of the basis, recovering the traditional definition of GHZ state (generalised from to an arbitrary finite abelian group ):


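where we have used the fact that the sum of exponentials in square brackets evaluates to if , and vanishes otherwise. In the case of single qubits, we recover the usual formulation of the $N$-partite GHZ state (written here up to normalisation):

$$|GHZ^N\rangle \propto |0 \ldots 0\rangle + |1 \ldots 1\rangle$$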

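The change of basis discussed in this section can be checked numerically in the Hilbert space case. The sketch below assumes the cyclic group $\mathbb{Z}_d$ and unnormalised character vectors $\sum_g \omega^{kg} |g\rangle$ with $\omega = e^{2\pi i/d}$; these conventions, the normalisation, and all function names are assumptions made for illustration. It compares the sum of all group-basis tensors whose labels add up to $0$ with the "diagonal" sum over characters.

```python
import numpy as np
from itertools import product


def ghz_group_basis(d, n):
    """Sum of |g_1 ... g_n> over all tuples in Z_d^n with g_1 + ... + g_n = 0."""
    state = np.zeros(d ** n, dtype=complex)
    for labels in product(range(d), repeat=n):
        if sum(labels) % d == 0:
            index = 0
            for g in labels:
                index = index * d + g  # first tensor factor is most significant
            state[index] = 1
    return state


def ghz_character_basis(d, n):
    """(1/d) * sum over characters chi of chi tensored with itself n times."""
    omega = np.exp(2j * np.pi / d)
    total = np.zeros(d ** n, dtype=complex)
    for k in range(d):
        character = omega ** (k * np.arange(d))  # assumed convention for chi_k
        vector = np.array([1.0 + 0j])
        for _ in range(n):
            vector = np.kron(vector, character)
        total += vector
    return total / d


for d, n in [(2, 3), (3, 3), (4, 2)]:
    print(d, n, np.allclose(ghz_group_basis(d, n), ghz_character_basis(d, n)))
```

For $d = 2$ the characters are (up to normalisation) the two vectors with coefficients $(1, 1)$ and $(1, -1)$, so the check reduces to the familiar fact that the qubit GHZ state looks "diagonal" in one basis and "even parity" in the complementary one.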
2.5 Mermin measurement contexts

We now define phase gates for the observable, generalising those of Equation 2.3. A phase state for the observable is a state such that the following holds:

Remark 2.3.

In the case of Hilbert spaces, the orthogonal basis satisfies:

Using points (i) and (ii) above, we obtain the following equation characterising any phase state :


Point (iii) allows us to conclude that phase states are exactly those in the form , with unimodular coefficients (i.e. ) for all .

Finally, we can use phase states to define phase gates for the observable:


These will again form an abelian group under composition, and the set of phase states will inherit this group structure. Using associativity of , it is immediate to see that the group operation and unit on the phase states are given by and respectively: we will refer to this group as the group of -phases, and denote it by . Furthermore, the elements of the basis can be easily checked to be phase states, and they also form a group under with unit : we will refer to the subgroup of given by the elements of the basis as the subgroup of -classical points, and denote it by .
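In the Hilbert space case the group structure on phase states can be made concrete. The sketch below assumes the cyclic group $\mathbb{Z}_d$ and represents a phase state by its vector of unimodular coefficients in the character basis, ignoring normalisation; under that assumption phase gates are diagonal matrices and the group operation is entrywise multiplication. The sign convention $\omega^{kg}$ and all names are illustrative choices.

```python
import numpy as np


def classical_point(g, d):
    """Character-basis coefficients of the basis element labelled g in Z_d."""
    k = np.arange(d)
    return np.exp(2j * np.pi * k * g / d)


def is_phase_state(coefficients, tol=1e-9):
    """A phase state has unimodular coefficients in the character basis."""
    return bool(np.all(np.abs(np.abs(coefficients) - 1) < tol))


def multiply(phase_a, phase_b):
    """Group operation on phase states: entrywise product of coefficients."""
    return phase_a * phase_b


def phase_gate(phase):
    """Phase gate associated with a phase state: diagonal in the character basis."""
    return np.diag(phase)


d = 5
generic_phase = np.exp(1j * np.array([0.0, 0.3, 1.1, 2.0, 4.5]))

print(is_phase_state(generic_phase))
print(all(is_phase_state(classical_point(g, d)) for g in range(d)))
# Classical points are closed under the group operation and behave like Z_d.
print(np.allclose(multiply(classical_point(2, d), classical_point(4, d)),
                  classical_point(1, d)))
# The phase gate of a classical point acts on classical points by translation.
print(np.allclose(phase_gate(classical_point(3, d)) @ classical_point(4, d),
                  classical_point(2, d)))
```

The last two checks illustrate why the basis elements form a subgroup of the phases: multiplying the coefficient vectors of $g$ and $h$ gives the coefficient vector of $g + h$.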

In order to introduce measurements, we have to move from pure states to the mixed state framework. This is a straightforward generalisation of the Hilbert space formalism, where mixed states in are self-adjoint operators , possibly positive and possibly with unit trace:

  1. is self-adjoint, i.e.

  2. is positive, if for all pure states of and some (not necessarily unique). This requirement can be omitted if positivity of mixed states is not a desideratum, e.g. in theories admitting signed probabilities.

  3. has unit trace (i.e. is normalised) if . This requirement can be omitted if normalisation of mixed states is not a desideratum.

As usual, pure states can be identified with the 1-dimensional projectors:


The measurement/decoherence in the observable can then be defined as the following linear transformation of mixed states, eliminating non-diagonal elements in the basis:


Like in the Hilbert space case, measurement in the observable of a positive normalised mixed state always results in a convex combination [Footnote 11: In that case, is a family of positive elements which sums to 1.] of eigenstates of the observable, and can thus be interpreted as a probabilistic mixture. [Footnote 12: Where probabilities are certain positive elements of the semiring , and coincide with in the case .]

Given a family of -phases, we define the associated Mermin measurement context of the -partite GHZ state, which we denote by , as follows:

  1. Phase gates are applied locally to the component systems:

  2. The resulting state is measured locally in the observable:


The following Lemma [7] allows us to recast the outcomes of a Mermin measurement context as the outcomes of measurement in the observable of some appropriate generalised GHZ state. [Footnote 13: In fact, GHZ states can be further generalised from -classical points to arbitrary -phases, and the result still holds.]

Lemma 2.4.

Let be phase states for the observable, and suppose is a -classical point. Defining as in Equation 2.29, one obtains the following equivalent form of the state in 2.30:


Equation 2.28 expresses the joint outcomes of a Mermin measurement context as a mixture of -classical points, and Equation 2.17, together with Lemma 2.4, can be used to explicitly compute the coefficients [Footnote 14: Positive, since .] of each state in the mixture:


In order for to be normalisable to a positive unit trace mixed state, which can in turn be interpreted as a probabilistic mixture of -classical points, the following two requirements must hold:

  1. the size must be invertible (which we already required), and positive, i.e. for some (so that dividing by its inverse turns positive elements into positive elements). This is merely a technical requirement, to ensure that the coefficients in the normalised sum are positive: it can be avoided if positivity is not a desideratum (e.g. in theories admitting signed probabilities).

  2. the -phase must lie in the subgroup of -classical points (so that the set of such that is non-empty). This is a physical requirement, without which the Mermin argument measurement context will fail to be realisable. [Footnote 15: The process is “impossible” in the given theory, i.e. it doesn’t return any outcomes.]

If both requirements above hold, then the Mermin measurement context will result in the following probabilistic combination of -classical points (note that Equation 2.32 takes the form of a possibilistic combination):

Remark 2.5.

A fundamental observation behind the Mermin argument is that the intrinsically non-deterministic outcomes (for ) of any Mermin measurement context can be turned into a (interesting) deterministic outcome by applying a suitable, classical group homomorphism to them. In particular, consider the following deterministic function of -classical points:


Then the group homomorphism applied to the probabilistic mixture of -classical points yields the following deterministic -classical outcome [Footnote 16: An analogous argument holds for the possibilistic version if we use the operation of the semiring of booleans instead of the operation of the semiring .]:


Equations 2.32 and 2.33 show that the outcomes of a Mermin measurement context are entirely characterised [Footnote 17: Both possibilistically and probabilistically.] by the solutions to the following equation:


In order to keep track of both the system and the -phase associated to the system in the Mermin measurement, we will adopt the following notation, generalising the one we previously used in 2.8:


2.6 Operational Mermin non-locality arguments

Now assume that we have a -module equation in the following form, with (i.e. valued in ) and admitting some solution in the group of -phases:


Let be the exponent of , pick some such that and define:


For define -phases as follows:

  1. Let .

  2. Define a function by:

  3. For define

Now we consider the following Mermin measurement scenario , consisting of one control and N variations:


The Mermin measurement contexts above can each be realised in our generalised framework: the result of applying these measurement contexts to distinct GHZ states can be modelled by tensor product, resulting in the following -partite mixture of -classical points:


Now assume that the following deterministic function of -classical points can be realised as a morphism in our generalised framework [Footnote 18: We already have multiplication , but one also needs group inversion, which in categorical terms is the antipode of the strongly complementary structures. The multiplication by (in the abelian group/-module ) can be obviated by adding up independent controls.]:


By applying this to the mixture of Equation 2.42, and using Remark 2.5, we obtain a single deterministic outcome (where we used the fact that ):


Now that we have shown how to realise a generalised scenario, we can tackle the question of locality.

Theorem 2.6.

The mixture admits an -classical probabilistic local hidden variable model if and only if there is a solution in the subgroup of -classical points to Equation 2.38.


We only sketch the main points; the detailed proof can be found in [10].

  1. Suppose that admits a -classical probabilistic non-contextual hidden variable model:


    where we have defined:

    1. for all , and

    2. for (i.e. for the control)

    3. for (i.e. for the variations), and our modular sums are modulo with set of residues (instead of the traditional ).

    Because is a deterministic function of -classical points, and a group homomorphism , Equation 2.44 implies that, for each , we have:


    In particular, is a solution in to Equation 2.38.

  2. In the other direction, assume that there is a solution in to Equation 2.38. Then, by using this solution together with Lemma 2.4, a local hidden variable model for can be obtained as follows:


This method can be generalised from an individual equation in the form of 2.38 to systems of -module equations, constructing a Mermin measurement scenario for the system by considering independent Mermin measurement scenarios for each equation. This leads us to the following algebraic characterisation of Mermin non-locality. [10]

Theorem 2.7.

A process theory is Mermin non-local, i.e. it admits an operational Mermin non-locality argument, if and only if for (i) some space , (ii) some basis on , and (iii) some group structure on the basis realised by some structure , we have that the group of -phases is an algebraically non-trivial extension of the subgroup of -classical points, i.e. that there is some system of -module equations valued in which has solutions in but not in .

For example, qubit stabiliser quantum mechanics is Mermin non-local, because it is possible to formulate the original Mermin non-locality argument in it: the group of -phases has a solution to the equation , which has no solution in the subgroup of -classical points. On the other hand, the process theory given by finite sets and relations between them [Footnote 19: Which is a process theory of free modules over the semiring of the booleans.], a model for non-deterministic classical computation, is Mermin local: all -phase groups are in the form for some abelian group , and thus any solution in to a system of equations valued in can be projected to a solution in . This holds true in the more general case where the structure does not yield a basis. [Footnote 20: In the category of sets and relations, almost all pairs of strongly complementary structures do not yield an basis.]
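The qubit stabiliser example can be spelled out with a few lines of modular arithmetic. The sketch below assumes that the phases available in qubit stabiliser quantum mechanics are the multiples of $\frac{\pi}{2}$, encoded here as $\mathbb{Z}_4$, with the classical points $\{0, \pi\}$ encoded as $\{0, 2\}$; the encoding and the helper name are assumptions made for illustration.

```python
def solve_in(candidates, coefficient, target, modulus):
    """Return every y among candidates with coefficient * y = target (mod modulus)."""
    return [y for y in candidates if (coefficient * y - target) % modulus == 0]


MODULUS = 4                # phases are multiples of pi/2, encoded as Z_4
PHASES = range(MODULUS)    # the assumed group of phases
CLASSICAL_POINTS = (0, 2)  # the copy of {0, pi} inside Z_4
TARGET = 2                 # encodes the classical point pi

# The equation 2y = pi, solved in the phases and in the classical points.
phase_solutions = solve_in(PHASES, 2, TARGET, MODULUS)
classical_solutions = solve_in(CLASSICAL_POINTS, 2, TARGET, MODULUS)

print(phase_solutions)
print(classical_solutions)
```

An equation valued in the classical points that is solvable in the phases but not in the classical points is exactly what makes the extension algebraically non-trivial in the sense of Theorem 2.7.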

2.7 Quantum realisability

Potentially, there are a lot of possible combinations of -phase groups and subgroups of -classical points: it is possible to construct toy theories yielding any individual pair (but we will not do so here). However, the only features of the abelian group required by the argument are that:

  1. contains the subgroup of -classical points

  2. contains the -phases involved in the solution to the system of equations

As a consequence, any process theory providing a phase group satisfying points (i) and (ii) above will allow for a realisation of the argument. The problem of realisability of an operational Mermin non-locality argument in some given process theory can then be formulated as follows:

Given a finite abelian group and a finite consistent system of -valued -module equations with no solutions in , are there appropriate basis and structure (on some system in the given process theory), such that is isomorphic to the subgroup of -classical points, and the group of -phases contains a solution to ?

In the framework above, operational Mermin non-locality arguments can be formulated in any process theory with a suitable strongly complementary pair. In particular, a large family of arguments can be formulated in the category , and we shall refer to these arguments as quantum realisable. Let be a -dimensional Hilbert space, and be an orthonormal basis on it. Let be an abelian group structure on and define the structure by: