Causal categories: relativistically interacting processes

Bob Coecke and Raymond Lal
University of Oxford, Computer Science, Quantum Group,
Wolfson Building, Parks Road, Oxford OX1 3QD, UK.

A symmetric monoidal category naturally arises as the mathematical structure that organizes physical systems, processes, and composition thereof, both sequentially and in parallel. This structure admits a purely graphical calculus. This paper is concerned with the encoding of a fixed causal structure within a symmetric monoidal category: causal dependencies will correspond to topological connectedness in the graphical language. We show that correlations, either classical or quantum, force terminality of the tensor unit. We also show that well-definedness of the concept of a global state forces the monoidal product to be only partially defined, which in turn results in a relativistic covariance theorem. Except for these assumptions, at no stage do we assume anything more than purely compositional symmetric-monoidal categorical structure. We cast these two structural results in terms of a mathematical entity, which we call a causal category. We provide methods of constructing causal categories, and we study the consequences of these methods for the general framework of categorical quantum mechanics.

1 Introduction

This paper is concerned with the causal structure of fundamental theories of physics. We cast causal aspects of both relativity and quantum theory in a unified setting, that is, a single mathematical entity that will be derived from certain phenomenological considerations from each theory. Our starting point is categorical quantum mechanics (CQM) [2], a general framework for physical theories in which type, process, and composition thereof, are the primary concepts [18, 19]. The two modes of composing processes, sequentially and in parallel, already provide an imprint of causal structure, admitting the interpretation of temporal and spatial composition respectively. We seek to make this correspondence more precise, such that one can encode a fixed causal structure within the category. From a dual perspective, we ‘thicken’ a causal structure [35, 50, 43] to a proper category, so that we obtain a category that encodes more than just causal relationships, but which also encodes the processes that may take place along these causal connections.

We draw on earlier work by Markopoulou [42], Blute-Ivanov-Panangaden [10], Hardy [28, 27] and Chiribella-D’Ariano-Perinotti [15, 16]. (Footnote 1: In turn, the work by Hardy in [28, 27] and Chiribella-D’Ariano-Perinotti in [15, 16] is strongly influenced by CQM; in particular, by taking a diagrammatic language for processes as their starting point.) In these papers, with increasing levels of abstraction, one considers a diagram representing causal connections, and decorates it with specific quantum events or processes. We trim the assumptions in this work down to their ‘bare categorical bones’, while retaining the key results:

  • a covariance theorem for global states as in [10, 28];

  • uniqueness of effects of a certain type as in [15].

In particular, unlike the previous work mentioned above, our derivation does not make any reference to measurement or probabilities, just to a very general concept of process, and hence is more primitive. We then observe that at the most basic level of the causal structure, the quantum-mechanical structure itself does not play a crucial role; for example, our structure also captures classical probabilistic processes that take place in relativistic space-time.

The resulting structure is one which organizes processes which can potentially take place within a causal setting, together with their compositional interaction. This allows, for example, the collection of possible physical processes to vary according to the causal structure, for example, taking into account the different capabilities of distinct agents, or differences of a purely physical origin.

Conceptually speaking, the stance of elevating processes to a privileged role in theories of physics was already present in the work of Whitehead in the 1950s [53] and the work of Bohr in the early 1960s [14]. It became more prominent in the work of Bohm in the 1980s, and later in that of Hiley [11, 12, 13], who is still pursuing this line of research [29]. It is an honor to dedicate this paper to Basil Hiley, on the occasion of his 75th birthday.

Plan of the paper.

In Section 2 we give an overview of symmetric monoidal categories and CQM, and we describe the problem that partly motivates this paper, namely how compact structure, interpreted as post-selected quantum teleportation [2, 18], leads to signaling. In Sections 3 and 4 we show how consideration of this problem and other phenomenological issues leads to the causal category structure. In Section 5 we formally define causal categories and describe some of their basic properties, in particular the way in which causal categories are incompatible with some structures of CQM. In Section 6 we define methods of constructing causal categories, and how key features of CQM are retained.

2 Processes as pictures

In this Section we describe the existing framework for doing categorical quantum mechanics (CQM) using symmetric monoidal categories, and we state the problem addressed in this paper.

2.1 Symmetric monoidal categories

Symmetric monoidal categories are mathematical entities with a direct physical interpretation; introductions to the subject are [3, 8, 22]. Their role in CQM is that they provide two modes for composing systems and processes, sequential and concurrent. More precisely put:

1. Entities.

We shall consider a collection of named objects or systems $A, B, C, \ldots$, and morphisms or processes $f, g, h, \ldots$ which may take a system of one kind $A$ into a system of another kind $B$. We call $A \to B$ the type of the process $f: A \to B$, for which $A$ is the domain and $B$ is the codomain. (Footnote 2: The term ‘type’ reflects the application of category theory to theoretical computer science [5].) The set of all processes taking $A$ into $B$ is denoted $\mathbf{C}(A, B)$. States correspond to processes from a special object $\mathrm{I}$ into $A$, where one may interpret $\mathrm{I}$ as the ‘unspecified environment’. From an operational perspective one can think of such a process as a preparation procedure, while from an ontic perspective one can think of it as the unknown process that caused $A$ to be in this state. Moreover, effects correspond to processes from an object $A$ to $\mathrm{I}$, and scalars or weights correspond to processes from $\mathrm{I}$ to $\mathrm{I}$.

2. Graphical language.

We now introduce a graphical language to represent our entities [18]: systems are represented by wires, and processes are represented by boxes with an input wire representing system $A$ and an output wire representing system $B$:

In fact, as we shall discuss below, there may be more than one input and output wire (which could result from composing systems) or indeed no input and/or no output wires, representing ‘no system’, denoted above by $\mathrm{I}$.

3. Composition.

The mathematical content of the formalism is given by the composition of processes. There are two connectives which allow the composition of processes both ‘in parallel’ and ‘sequentially’:

  • The sequential, or dependent, or causal, or connected composition of processes $f: A \to B$ and $g: B \to C$ is $g \circ f: A \to C$, and is depicted as:

  • The parallel, or independent, or acausal, or disconnected composition of processes $f: A \to B$ and $g: C \to D$ is $f \otimes g: A \otimes C \to B \otimes D$, and is depicted as:


Importantly, compoundness of physical systems is now directly apparent in the graphical notation, in that there can be several wires side-by-side:

[Figure: wire diagrams labelled, from left to right, ‘one system’, ‘two systems’, ‘$n$ systems’ and ‘operation on $n$ systems’.]

Note that there is an imprint of causal structure in this formalism (as indicated in the terminology), since one can think of the ‘acausal’ composition as ‘spatially’ separating, while one thinks of the ‘causal’ composition as ‘temporally’ connecting.

We now give the symbolic definition of a symmetric monoidal category. Here we restrict to the case of strict symmetric monoidal categories, since that is what the diagrammatic calculus embodies. More importantly, as argued in [22], physical processes always form strict symmetric monoidal categories. (Footnote 3: On the other hand, their mathematical representations typically form non-strict symmetric monoidal categories; see again [22] for a discussion of this point.)

Definition 1.

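A strict monoidal category is a category $\mathbf{C}$ equipped with a bifunctor $\otimes: \mathbf{C} \times \mathbf{C} \to \mathbf{C}$ and a unit object $\mathrm{I}$ such that $(|\mathbf{C}|, \otimes, \mathrm{I})$ is a monoid, and for all $f, g, h$ in $\mathbf{C}$ we have associativity of acausal composition:

$$f \otimes (g \otimes h) = (f \otimes g) \otimes h. \tag{3}$$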

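The content of $\otimes$ being a bifunctor is that there is an interchange law between $\circ$ and $\otimes$: for all morphisms $f, g, h, k$ (of appropriate type) in a strict monoidal category, we have:

$$(g \circ f) \otimes (k \circ h) = (g \otimes k) \circ (f \otimes h). \tag{4}$$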

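As a concrete sanity check, the interchange law can be tested numerically in one familiar model: matrices, with the matrix product as sequential composition and the Kronecker product as parallel composition. The following is a minimal sketch under that assumption; the helper names `matmul` and `kron` and the particular integer matrices are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Sequential composition a ∘ b (b is applied first), as a matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Parallel composition a ⊗ b, as a Kronecker product."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


# Illustrative morphisms (matrices act on column vectors).
f = [[1, 2], [0, 1], [3, 0]]   # f: A -> B with dim A = 2, dim B = 3
g = [[1, 0, 2], [0, 1, 1]]     # g: B -> C with dim C = 2
h = [[0, 1], [1, 1]]           # h: D -> E with dim D = dim E = 2
k = [[2, 1]]                   # k: E -> F with dim F = 1

# (g ∘ f) ⊗ (k ∘ h) versus (g ⊗ k) ∘ (f ⊗ h)
lhs = kron(matmul(g, f), matmul(k, h))
rhs = matmul(kron(g, k), kron(f, h))
interchange_holds = lhs == rhs
```

Integer entries are used so that the comparison is exact. A single check of this kind does not prove the law; it only illustrates that both sides describe the same overall process in this model.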
Now, recall that an isomorphism between two objects $A$ and $B$ in a category is a morphism $f: A \to B$ for which there exists an inverse $f^{-1}: B \to A$, that is, a morphism which is such that $f^{-1} \circ f = 1_A$ and $f \circ f^{-1} = 1_B$, where $1_A$ denotes the identity morphism on $A$.

Definition 2.

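A symmetric strict monoidal category (SMC) is a monoidal category equipped with a family of symmetry isomorphisms $\{\sigma_{A,B}: A \otimes B \to B \otimes A\}_{A,B}$ which is such that for all $A, B$ we have $\sigma_{A,B}^{-1} = \sigma_{B,A}$, and for all $f: A \to C$ and $g: B \to D$ in $\mathbf{C}$, we have:

$$\sigma_{C,D} \circ (f \otimes g) = (g \otimes f) \circ \sigma_{A,B}. \tag{5}$$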

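The two conditions on the symmetry in Definition 2 (it is self-inverse up to swapping indices, and boxes slide through it) can be illustrated in the matrix model, where the symmetry is a permutation matrix exchanging the two tensor factors. This is a sketch under that assumption; `swap`, `matmul` and `kron` are hypothetical helper names, and the object names in the comments are only labels for dimensions.

```python
def matmul(a, b):
    """Sequential composition a ∘ b (b is applied first), as a matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Parallel composition a ⊗ b, as a Kronecker product."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def swap(m, n):
    """Symmetry A ⊗ B -> B ⊗ A for dim A = m, dim B = n: e_i ⊗ e_j |-> e_j ⊗ e_i."""
    s = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(n):
            s[j * m + i][i * n + j] = 1
    return s


f = [[1, 2], [3, 4], [5, 6]]   # f: A -> C with dim A = 2, dim C = 3
g = [[1, 0, 2], [0, 1, 1]]     # g: B -> D with dim B = 3, dim D = 2

# Naturality: swapping after f ⊗ g equals g ⊗ f after swapping.
naturality_holds = matmul(swap(3, 2), kron(f, g)) == matmul(kron(g, f), swap(2, 3))

# Swapping twice gives the identity on A ⊗ B.
identity6 = [[1 if r == c else 0 for c in range(6)] for r in range(6)]
involutive = matmul(swap(3, 2), swap(2, 3)) == identity6
```

The naturality check is the matrix counterpart of ‘sliding boxes along a wire’ through a crossing.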
Example 1 (Programming languages and proof theory).

Symmetric monoidal categories play a significant role in the theory of programming languages and proof theory, a modern branch of logic. In programming, objects represent data types and a morphism $f: A \to B$ stands for running a program that requires input data of type $A$ and produces output data of type $B$. Sequential composition would then mean first running one program and then using its output as the input for the second program. Parallel composition means running two programs in parallel. In proof theory, objects represent propositions and a morphism $f: A \to B$ stands for a derivation of proposition $B$ given proposition $A$.

Example 2 (Dirac notation).

Morphisms for which the domain and/or the codomain is the tensor unit have a special form in the graphical notation. A generic element or state $\psi: \mathrm{I} \to A$ (cf. a Dirac ‘ket’ $|\psi\rangle$ in quantum-mechanical Dirac notation) and a generic co-element or effect $\pi: A \to \mathrm{I}$ (cf. a Dirac ‘bra’ $\langle\pi|$) are each depicted as a triangle [18]. This shows how the graphical language is a ‘two-dimensional’ extension of Dirac notation; compare the graphical notation for states and effects with Dirac bras and kets:

We notice that a clockwise rotation of the Dirac ket by $90^\circ$ yields the same triangle shape as the graphical notation on the right-hand side; and similarly for bras. Moreover, given a state $\psi: \mathrm{I} \to A$ and an effect $\pi: A \to \mathrm{I}$, we obtain a morphism $\pi \circ \psi$ with the ‘trivial’ type $\mathrm{I} \to \mathrm{I}$. As mentioned above, this is a scalar, and is denoted graphically as having neither input nor output wires, which is again a rotated denotation of $\langle\pi|\psi\rangle$.

The symmetry natural isomorphism is denoted graphically by a crossing, so Eq. (5) is depicted as:


This gives an indication of how the graphical calculus will subsume symbolic equations: Eq. (6) captures Eq. (5) by the intuitive notion of ‘sliding boxes along a wire’. It will therefore be useful to formally distinguish graphical and symbolic representations.

Definition 3 (symbolic language).

By an object formula in the (symbolic) language of an SMC we mean any expression involving objects and the tensor thereof. By a (well-formed) morphism formula in the (symbolic) language of an SMC we mean any expression involving morphisms, sequential composition, and parallel composition thereof, for which sequential composition only occurs for morphisms with matching types.

Consider an object formula in a category , with . We shall be careful to distinguish between and , since there may be other objects, say and , such that , and hence, the object formula contains more information than its corresponding object in the category does, namely, it shows how it was formed. The same applies to morphism formulae, e.g. , which again contains more information than the corresponding morphism in the category. We shall notationally distinguish the object language and objects as follows:

  • Object formulae will be denoted by calligraphic capital letters

  • Objects will be denoted by Roman-font capital letters

and morphism language and morphisms as follows:

  • Morphism formulae will be denoted by calligraphic capital letters

  • Morphisms will be denoted by Roman font

Each morphism formula has an object formula as its input and output, which we specify by writing . We can associate to a corresponding morphism , simply by performing the compositions expressed within the object formulae and morphism formula. We define corresponding objects for object formulae similarly. We write , and , a notational convention which we already use in Eqs. (1) and (2) above, where the right-hand side represents the diagram expressing the composition, while the left-hand side represents the morphism that one obtains when performing the composition. We use the notation to mean that in we have and, . An equation means that the corresponding morphisms are equal, i.e. . The physical intuition behind this is that several ‘physical scenarios’ or ‘experimental protocols’, while distinct in their implementation details, may have exactly the same overall effect.

The graphical elements we have introduced correspond formally to the entities defined by an SMC: we can define a graphical language and calculus [33, 48] in correspondence to the axioms of an SMC; this procedure traces back to Penrose’s work in the 1970s [44]. For each morphism formula there is a corresponding diagram in the graphical language, e.g. for the corresponding diagram is

Surprisingly, a morphism formula still has more information than its corresponding diagram in the graphical language. But from a physical perspective this extra information is in fact redundant. For example, when expressing both sides of Eq. (4) in the graphical language, we obtain the same diagram twice:

since each side represents the same physical ‘scenario’. Hence, the graphical language is, in a sense, superior to the symbolic language, since it renders an equational constraint superfluous.

The true power of the graphical calculus as opposed to the symbolic formalism is made clear by the following theorem due to Joyal and Street [33, 48], which implicitly defines what we actually mean by ‘graphical calculus’.

Theorem 4.

An equation between morphism formulae in the symbolic language of symmetric monoidal categories follows from the axioms of symmetric monoidal categories if and only if we can obtain one picture from the other by displacing the boxes, whilst retaining how wires connect the inputs and outputs of boxes, as well as keeping the overall number of inputs and outputs of the diagram fixed.

So a ‘graphical calculation’ is nothing but a ‘deformation’, for example:


From now on we shall make use of the efficiency of the graphical language in making certain equations superfluous: we will treat morphism formulae up to equivalence in the diagrammatic representation. For example, consider the following mathematical ambiguity about our use of connected vs. disconnected composition as defined symbolically: while parallel composition always leads to topological disconnectedness, sequential composition may lead to either a connected or a disconnected diagram. In particular, when we compose over the tensor unit the two modes of composition coincide:

Our use of diagrammatic equivalence classes resolves this ambiguity, since we can always represent a composition over the tensor unit by a formula that uses ‘$\otimes$’ instead of ‘$\circ$’.
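The coincidence of the two modes of composition over the tensor unit can be spelled out symbolically. The following short derivation is a sketch, assuming an effect $\pi: A \to \mathrm{I}$ and a state $\psi: \mathrm{I} \to B$ (names chosen here for illustration), and using only strictness ($\mathrm{I} \otimes X = X = X \otimes \mathrm{I}$ on objects, and likewise $1_{\mathrm{I}} \otimes h = h = h \otimes 1_{\mathrm{I}}$ on morphisms) together with the interchange law:

```latex
\begin{align*}
\psi \circ \pi
  &= (\psi \otimes 1_{\mathrm{I}}) \circ (1_{\mathrm{I}} \otimes \pi)
     && \text{strictness: } \psi \otimes 1_{\mathrm{I}} = \psi,\ 1_{\mathrm{I}} \otimes \pi = \pi \\
  &= (\psi \circ 1_{\mathrm{I}}) \otimes (1_{\mathrm{I}} \circ \pi)
     && \text{interchange law} \\
  &= \psi \otimes \pi
     && \text{identity laws.}
\end{align*}
```

Both sides have type $A \to B$, and the corresponding diagram consists of two topologically disconnected components.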

We shall also assume that all our morphism formulae contain only ‘atomic’ expressions, that is, those which do not contain, in the corresponding graphical representation, topologically disconnected components. To define this symbolically, we first define a generalized symmetry morphism to be a morphism that is either the identity morphism or is the vertical or parallel composition of symmetry morphisms or identity morphisms.

Definition 5 (Non-trivial parallel composition; atomic morphism).

The non-trivial parallel composition of morphisms and is a morphism , where neither nor is a scalar (i.e. of type ). A morphism is atomic if, for all post or pre–compositions of generalized symmetry morphisms, it cannot be written as a non-trivial parallel composition of other morphisms.

Examples of non-atomic morphisms are:

which in the diagrammatic language always consist of two non-trivially typed (i.e. non-scalar) subcomponents. In Appendix A we consider how these assumptions affect the relationship between the structure of symbolic formulae and topological connectedness in the corresponding diagrammatic language.

Why formulate physical theories using categories?

A category provides not only a description of objects and morphisms, but it also provides an equational theory. Indeed, in the case of an SMC, a category provides both the description of a physical theory, in terms of a language involving systems, processes (cf. evolution) and their (de)composition, as well as the equational laws it is subject to, e.g. when two scenarios or protocols result in the same overall process. A particular case of this, which is the one that one encounters in more traditional formulations of the dynamics of a theory, is when the final states coincide given they take the same input state. Leaving inputs open then means allowing for variable inputs. From a logical perspective this means that it provides both the language, i.e. well-formed formulae (wff), and the axioms, i.e. equations between wff.

By a model one means a concrete realization (e.g. processes described in concrete Hilbert-space quantum theory) which typically will obey some extra axioms and have a more refined language; these can be seen as additional laws and data, which may or may not be redundant. For example, when passing to a theory of quantum gravity one may expect that certain ingredients of the Hilbert-space structure may have to be relinquished (e.g. the continuum [30]).

2.2 Elements of Categorical Quantum Mechanics

There have been many attempts to identify the key underlying structures in quantum theory. For example, the first such attempt, by Birkhoff and von Neumann, used non-distributive lattices to recast quantum theory as quantum logic [52, 9]. Other axiomatic frameworks have variously taken as a starting point algebras of observables—using C*-algebras [45, 26]—or probabilistic structure—using probability spaces [39] or convex structures [37, 38]. But the focus of these approaches has not usually been on causality. Relatedly, a weakness of these approaches is that they lack an elegant conceptual account of the behavior of compound quantum systems. Indeed, for most of these approaches the Hilbert space tensor product does not lift to the level of the languages in which the axiomatic framework was stated. The rise of quantum information and computation, where the tensor product plays a key role, and for which non-local phenomena are exploited for practical applications, might be seen as the fatal blow for many of these approaches.

In contrast, CQM treats composing systems (and processes) as a primitive. This leads to a paradigmatic shift from treating measurement as the basic concept of a theory, as advocated by Birkhoff and von Neumann, to compoundedness, as advocated by Schrödinger [46]. This has led to immediate results, and CQM has established itself as a promising framework for studying the foundations of quantum mechanics, as well as a high-level framework for quantum information and computation. Some milestones and key results of CQM are [47, 51, 23, 1, 20, 21, 25, 24].

In CQM we add expressive power to the formalism described in Subsection 2.1 by using dagger compact categories, which we define diagrammatically as follows:

  • dagger : For each graphical element, including an entire diagram, the one obtained by flipping it upside-down is also a valid graphical element:

  • compactness : for any object there is an object and a Bell state

    such that any two diagrams with matching inputs and outputs are equal if they are equal up to homotopy, that is, only the topology of the diagrams matters.

In combination with the dagger structure, this implies:


These equations are the defining equations of dagger compactness.

Definition 6 ($\dagger$-SMC).

A dagger category is a category equipped with an involutive identity-on-objects contravariant functor , called a dagger functor. A dagger (strict) symmetric monoidal category (-SMC) is a (strict) SMC equipped with a dagger functor such that for all we have , and

and for all in :

Definition 7 (Compact structure).

A compact structure on an object $A$ of an SMC is a quadruple $(A, A^*, \eta_A, \epsilon_A)$ consisting of $A$, its dual object $A^*$, the unit $\eta_A$ and the counit $\epsilon_A$, such that the following diagrams commute:

A compact category is one for which there is a compact structure on each object.

Definition 8.

In a -SMC a Bell state is a compact structure , and a dagger compact category is a -SMC for which each object has a chosen Bell state; these choices are moreover coherent with dagger symmetric monoidal structure.

Example 3.

The category of finite-dimensional Hilbert spaces and linear maps, denoted , is a -SMC for which the dagger functor is given by the linear-algebraic adjoint, and the monoidal product is given by the tensor product of Hilbert spaces. By linearity, states are in bijective correspondence with pure states by the mapping , and the scalars are endomorphisms , i.e. in bijective correspondence with the complex numbers. The terminology of Definition 8 is justified by the fact that the morphism is given by the quantum state .
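To make the role of the Bell state as a compact structure concrete, one can check one of the ‘yanking’ equations numerically for a two-dimensional system. The sketch below assumes the unnormalised Bell state $|00\rangle + |11\rangle$ as the unit (`cup`) and its adjoint as the counit (`cap`); these names, the helper functions and the choice of dimension are hypothetical and serve only as an illustration.

```python
def matmul(a, b):
    """Sequential composition a ∘ b (b is applied first), as a matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Parallel composition a ⊗ b, as a Kronecker product."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


identity2 = [[1, 0], [0, 1]]

# Assumed unit: the unnormalised Bell state |00> + |11>, as a 4x1 column.
cup = [[1], [0], [0], [1]]
# Assumed counit: its adjoint, a 1x4 row (entries are real, so just the transpose).
cap = [[1, 0, 0, 1]]

# Yanking: (cap ⊗ 1) ∘ (1 ⊗ cup) should be the identity on the qubit.
snake = matmul(kron(cap, identity2), kron(identity2, cup))
```

Here `snake` evaluates to the 2x2 identity matrix, which is the linear-algebraic content of straightening a bent wire.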

We now introduce some operational terminology for the object and morphism languages, with a view to their physical interpretation.

Definition 9 (Operational terminology).

A slice is an object formula in the symbolic language of SMCs, and a scenario or protocol is a morphism formula in the symbolic language of SMCs, or equivalently, a diagram in the graphical language of SMCs.

Dagger-compact categorical structure was used in [2] to provide sufficient structure to do a large amount of quantum theory, which justifies the terminology of Definition 9. Dagger compact categories appeared earlier in the work of Baez and Dolan [7], as a particular case of $k$-tuply monoidal $n$-categories; the importance of the particular case of $n = 1$ and $k = 3$ was later acknowledged by Baez in [6]. There also exists an analogous theorem to Theorem 4 for dagger compact categories, which identifies symbolic axioms with a graphical language for which only topology matters [34, 47].

Perhaps even more importantly, the completeness theorem by Selinger [49] states that an equational statement in the language of dagger compact categories is provable in the corresponding language if and only if it is provable in the SMC of Hilbert spaces, linear maps, the tensor product and the linear-algebraic adjoint. Put informally, for an important class of equational statements, derivability in the graphical language is equivalent to validity within Hilbert-space quantum theory. Hence a less dichotomic view on axioms versus models can be obtained by means of the concept of abstraction.

2.3 A pitfall

As discussed in [18], Eq. (8) can be interpreted as post-selected quantum teleportation, that is, quantum teleportation conditioned upon the measurement outcomes, such that no unitary correction is needed. However, naive causal interpretation yields:


The origin of this apparent ability for Alice to ‘signal’ to Bob is post-selection, which is easily seen to be a (virtual) resource that enables signaling. We obtain this even for classical probability theory: if Alice and Bob each have an unknown bit with the promise that they are the same, then if Alice post-selects a particular value, Bob will consequently also have that value. Hence Alice has signaled the bit to Bob. To avoid this, one must consider all possible measurement outcomes together. In the quantum teleportation protocol this requires classical communication:

But to do so in the existing formalism of CQM requires specifying admissible operations, e.g. projective measurements, classical communication, classical control structure etc., and to do this various internal structures must be defined. However, in the structure we develop in this paper, causal categories, postselection will be automatically excluded, as we see below in Section 3.2.
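The classical version of the pitfall can be made explicit with a small calculation. The sketch below assumes a uniformly distributed pair of perfectly correlated bits (the uniform weights and the function names are hypothetical choices for illustration) and contrasts Bob's distribution when all of Alice's outcomes are considered together with his distribution after Alice post-selects an outcome.

```python
from fractions import Fraction

# Assumed joint distribution over (alice_bit, bob_bit): equal bits, uniform.
joint = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}


def bob_marginal(joint):
    """Bob's distribution when all of Alice's outcomes are kept."""
    out = {0: Fraction(0), 1: Fraction(0)}
    for (_, b), p in joint.items():
        out[b] += p
    return out


def bob_postselected(joint, alice_outcome):
    """Bob's distribution conditioned on Alice observing `alice_outcome`."""
    kept = {0: Fraction(0), 1: Fraction(0)}
    for (a, b), p in joint.items():
        if a == alice_outcome:
            kept[b] += p
    total = sum(kept.values())
    return {b: p / total for b, p in kept.items()}


no_signal = bob_marginal(joint)          # independent of anything Alice does
signal = bob_postselected(joint, 1)      # Bob's bit is now determined
```

Without post-selection Bob's bit is uniformly random, whereas after post-selection it is fixed by Alice's chosen outcome; discarding the unwanted runs is what creates the apparent signal.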

3 Terminality of the tensor unit from correlations

In this section we first show how causal structure can be thought of in terms of information flow, and how connectedness captures this.

3.1 Causality as information flow, formalised by connectedness

Causal structure is often conceived as a partially ordered set $(P, \leq)$ where $a \leq b$ stands for $a$ being in the causal past of $b$. The passage to SMCs will involve more than just expressing that there is a causal connection: it will involve specifying the processes that establish this causal connection from $a$ to $b$, e.g. by means of sending a non-void signal.

Now, consider a physical scenario of the kind we discussed in the previous section:

where $A$ (Alice) is not causally preceded by $B$ (Bob). Now, whilst by causality no information can flow from $B$ to $A$ (cf. Eq. (10) and the discussion in the previous section), there does physically exist a composite process, e.g. for the picture on the left:

So in particular, $\mathbf{C}(B, A) \neq \emptyset$. Hence we make a key distinction between:

  • the existence of a physical process, that is, $\mathbf{C}(B, A) \neq \emptyset$; and,

  • the flow of information enabled by such a process: it is whether information flow can occur which in this paper will witness a causality assertion $B \leq A$.

Example 4 (Proof theory).

The passage from an ordered structure to a categorical structure, or one from assertion to witnessing, is exactly what has occurred in logic, specifically in proof theory. While in algebraic logic one asserts that there is a proof which derives predicate from predicate , in categorical logic one also articulates how this can be established by explicitly giving the proofs, a proof then being a morphism in some category of type (see e.g. [36, 3]). So rather than focussing on provability one also takes the structure of the space of proofs into account:

But in proof theory the paradigm connecting the ordered structure and the categorical structure is:

or in words, is derivable from if there exists a proof that does so. The above discussion shows that this proof-theory paradigm cannot be retained on-the-nose, and rather than having existence of a morphism as witnessing partial ordering, we will require the existence of an information-flow-enabling morphism.

We shall now formalize what we mean by information flow.

Remark 1.

In what follows we have categories of deterministic processes in mind, i.e. non-postselected. In CQM terms, this means that the category only contains one scalar, namely $1_{\mathrm{I}}$, representing ‘certainty’.

Definition 10.

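We say that a process $f: A \to B$ is:

  • constant on states iff for all $\psi, \phi: \mathrm{I} \to A$ we have $f \circ \psi = f \circ \phi$;

  • determined by its action on states iff for all $g: A \to B$, $f \circ \psi = g \circ \psi$ for all $\psi: \mathrm{I} \to A$ implies $f = g$.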

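In this paper information flow means a non-constant process $f: B \to A$. Since in this case there exist $\psi, \phi: \mathrm{I} \to B$ with $f \circ \psi \neq f \circ \phi$, in a scenario Bob can choose to ‘feed’ either $\psi$ or $\phi$ into $f$, so that Alice receives $f \circ \psi$ or $f \circ \phi$ respectively: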

Remark 2 (Well-pointedness).

Conditions of ‘well-pointedness’, as used in Definition 10, are sometimes thought to be undesirable, both for mathematical and physical reasons [32, 30]. However, our level of generality will also capture ‘pointless’ objects: we will show that, in a well-pointed situation, the notion of information flow that refers to points can be equivalently stated purely in terms of connectedness, without reference to states. It is then this pointless characterization that we will use throughout the paper.

We shall now establish that information flow from to is captured by topological connectedness in the graphical language:

[Figure: two diagrams, labelled ‘information flow’ (connected) and ‘no information flow’ (disconnected).]

Definition 11.

In an SMC, a morphism $f: A \to B$ is disconnected if it decomposes as $f = \psi \circ \pi$ for some $\pi: A \to \mathrm{I}$ and $\psi: \mathrm{I} \to B$. A morphism is connected if it is not disconnected.

Proposition 12 (Equivalence of constancy and disconnectedness).

If all scalars are equal to $1_{\mathrm{I}}$ and if processes are determined by their action on states, then the following are equivalent:

  • $f$ is constant on states;

  • $f$ is disconnected.


Let be constant on states and be that constant. Then we indeed have for any , since:

for all . Conversely:

where we again used uniqueness of scalars. ∎
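Proposition 12 can be illustrated in the setting of stochastic matrices, where the only effect on a system is the row of ones (marginalisation). The following is a sketch under that assumption, not the construction used in the proof: it takes a hypothetical stochastic map `f` whose columns all coincide, checks that it sends two different input distributions to the same output `p`, and checks that it factors through the one-dimensional system as `p` composed with `discard`.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Sequential composition a ∘ b (b is applied first), as a matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# A column-stochastic map from a 3-outcome system to a 2-outcome system
# whose columns are all equal, so its output ignores its input.
f = [[F(1, 4), F(1, 4), F(1, 4)],
     [F(3, 4), F(3, 4), F(3, 4)]]

discard = [[F(1), F(1), F(1)]]      # effect: sum over the 3 outcomes
p = [[F(1, 4)], [F(3, 4)]]          # state on the 2-outcome system

# Two different input states (column vectors summing to 1).
states = [
    [[F(1)], [F(0)], [F(0)]],
    [[F(0)], [F(1, 2)], [F(1, 2)]],
]

constant_on_states = all(matmul(f, s) == p for s in states)
disconnected = matmul(p, discard) == f   # f = p ∘ discard
```

In this model the two notions agree in the obvious way: a stochastic map is constant on states exactly when all its columns are equal, which is exactly when it is the outer product of a state with the row of ones.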

Hence we have characterized information flow using the structure of an SMC: and are causally related if a process can take place which is not disconnected, and dually, and are not causally related iff all processes are disconnected (i.e. factor through the tensor unit), that is:

where is the unique effect, which is stated in anticipation of the main result of the following section.

3.2 Terminality of as ‘no correlation-induced signaling’

Our notion of causality has been based so far on information flow between distinct locations. In the previous Subsection this was enabled by a process from one location to the other. We shall call this information flow of the 1st kind. However, given a bipartite state $\Psi: \mathrm{I} \to A \otimes B$, there may also be another type of information flow, which we call information flow of the 2nd kind. Diagrammatically, they appear as follows:

[Figure: two diagrams, labelled ‘1st kind info-flow’ and ‘2nd kind info-flow’.]

Example 5 (Quantum entanglement).

In quantum theory a bipartite state may be entangled. In that case, information flow of the 2nd kind corresponds to correlations between measurement outcomes of the two parties that are signaling, which can only happen if we allow post-selection [41].

The following postulate imposes compatibility between information flows of the 2nd kind with those of the 1st kind; in other words, it forbids correlation-induced signaling when systems are not causally related: otherwise 2nd kind information-flow could be used to produce 1st kind information-flow, thus violating causal structure.

Postulate 13 (Causal consistency).

For a bipartite state $\Psi: \mathrm{I} \to A \otimes B$, information flows of the 2nd kind cannot occur when $A$ and $B$ are not causally related.

Remark 3.

Note that we could have made the stronger requirement that entanglement-induced signaling does not occur even for causally related systems, but since these ‘information flows of the 2nd kind across time’ do not cause any inconsistency with causal structure we ignore them.

If all bipartite states are disconnected:

then, by analogy with the disconnected processes that characterize causal independence, there will be no information flow of the 2nd kind, hence Postulate 13 is trivially satisfied. However, the kind of universes that interest us of course do have connected bipartite states, both in quantum theory and for classical probabilistic states, for example, a perfectly correlated bipartite state. The following definition asserts the existence of states of this kind: it states that all processes can be faithfully represented by bipartite states. In the context of quantum theory this corresponds to the Choi-Jamiołkowski isomorphism [17, 31], as described in Example 7 below. Technically, it weakens the definition of compactness (see above) which, as we shall see in Subsection 5.2, cannot be retained.

Definition 14.

In a CJ-universe for all systems there exists another system , not causally related to , as well as a bipartite state , called the CJ-state, such that for all ,

is an injective mapping.

Note that this definition implies in particular that for the case , there is an injection from effects to states .

Definition 15.

By an environment structure we mean a family of effects , one for each system . We call a CJ-universe with an environment structure a CJ-universe.

We can interpret these processes as ‘removing that system from our scope’. In quantum theory this role is played by the partial trace operation.

Definition 16.

A terminal object in a category is an object $T$ for which, for each object $A$, there is a unique morphism from $A$ to $T$.

Note that the uniqueness of $1_{\mathrm{I}}$ as a scalar is implied by terminality of the tensor unit.

Theorem 17.

A CJ-universe obeying Postulate 13 has a terminal tensor unit.


If , then by Definition 14 (with ) we have

which contradicts Postulate 13. Hence there can at most be one effect and its existence is guaranteed by the environment structure. ∎

Example 6 (Classical probability theory).

We define classical probability theory as a subcategory of the category of real matrices : morphisms are stochastic maps, i.e. finite-dimensional real matrices with entries , and whose columns are normalised, i.e. . The monoidal product is the Kronecker product of matrices. States are given by normalised positive-real row vectors, and the environment structure is given by marginalisation of the probability distribution. A CJ-state is then given by a perfectly correlated bipartite probability distribution

which can easily be seen to provide an injective mapping from operations to states.
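The claims of this example can be checked on small matrices. The following is a minimal sketch in pure Python, assuming the convention that a state is a column vector and that the CJ map sends a stochastic map f to (1 ⊗ f) applied to the perfectly correlated state; the helper names (`discard`, `cj_vector`, `recover`) are illustrative and do not come from the paper.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Matrix product a @ b for list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product, playing the role of the monoidal product."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def discard(n):
    """Environment structure (marginalisation): the 1 x n row of ones."""
    return [[F(1)] * n]

def is_stochastic(m):
    """Non-negative entries and every column summing to 1."""
    return (all(x >= 0 for row in m for x in row)
            and all(sum(row[j] for row in m) == 1 for j in range(len(m[0]))))

def correlated_state(n):
    """Perfectly correlated distribution on two copies of an n-point system."""
    return [[F(1, n) if i == j else F(0)] for i in range(n) for j in range(n)]

def cj_vector(f):
    """Assumed CJ map: apply (identity (x) f) to the correlated state."""
    n = len(f[0])
    return matmul(kron(identity(n), f), correlated_state(n))

def recover(vec, n, m):
    """Left inverse of cj_vector for an m x n matrix, witnessing injectivity."""
    return [[n * vec[i * m + j][0] for i in range(n)] for j in range(m)]

# A stochastic map on a 2-point system and one from 2 points to 3 points.
f = [[F(1, 2), F(1, 4)],
     [F(1, 2), F(3, 4)]]
f2 = [[F(1, 2), F(0)],
      [F(1, 4), F(1, 3)],
      [F(1, 4), F(2, 3)]]

# Terminality: discarding after a stochastic map equals discarding before it.
terminal_ok = matmul(discard(3), f2) == discard(2)
# Injectivity: the map can be read back from its CJ vector.
injective_ok = recover(cj_vector(f2), 2, 3) == f2
```

Since a 1 x n matrix is stochastic only if every entry equals 1, `discard(n)` is the only effect on an n-point system in this model, which is the terminality of the tensor unit in concrete form.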

Example 7 (Mixed quantum theory).

is the motivating example of a -SMC in CQM, and one might attempt to define it as a -universe, using the Bell state as the CJ state. However this is problematic for the following reason. The environment structure provides a unique morphism for each object , and the interpretation of this family of morphisms is the partial trace operation (i.e. the operation which sends the system to the environment). But since tracing out a system in quantum theory typically leads to a mixed state when starting with a global pure state (e.g. a maximally entangled state), we should consider the category of mixed operations rather than , whose states are always pure (see also Remark 7 of [24]).

Hence we define a category whose objects are finite-dimensional Hilbert spaces (i.e. the same objects as ), and whose morphisms are completely positive maps for the appropriate domain and codomain: denoting the set of linear operators on as we define

Monoidal structure is again given by the tensor product of Hilbert spaces, and the environment structure for is given by the partial trace.

Now, we define for a fixed orthonormal basis of (i.e. the maximally entangled state for ). Then the CJ state for is the operator , since it supports the Choi-Jamiołkowski isomorphism from completely positive maps to positive operators on , given by

and whose inverse is

If we restrict to the subcategory of whose morphisms are completely-positive trace-preserving maps then we obtain a -universe, i.e. the tensor unit is terminal. Note that the construction of a category of mixed states and operations has been axiomatised in [47], where the CPM construction was defined for any -compact SMC.
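The correspondence between completely positive maps and positive operators can be illustrated concretely. The sketch below is written under assumptions that are not fixed by the text: the map is given by Kraus operators, the Choi operator is built from the unnormalised maximally entangled state with the input factor first, and all function names are hypothetical. It also checks the standard fact that a map is trace-preserving exactly when tracing out the output factor of its Choi operator gives the identity.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(m):
    """Conjugate transpose of a list-of-rows matrix."""
    return [[m[r][c].conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]

def apply_channel(kraus, rho):
    """rho -> sum_k K rho K^dagger for a list of Kraus operators."""
    d_out = len(kraus[0])
    out = [[0] * d_out for _ in range(d_out)]
    for k in kraus:
        term = matmul(matmul(k, rho), dagger(k))
        for a in range(d_out):
            for b in range(d_out):
                out[a][b] += term[a][b]
    return out

def choi(kraus):
    """Choi operator sum_ij |i><j| (x) E(|i><j|), unnormalised convention."""
    d_out, d_in = len(kraus[0]), len(kraus[0][0])
    c = [[0] * (d_in * d_out) for _ in range(d_in * d_out)]
    for i in range(d_in):
        for j in range(d_in):
            unit = [[int((r, s) == (i, j)) for s in range(d_in)]
                    for r in range(d_in)]
            block = apply_channel(kraus, unit)
            for a in range(d_out):
                for b in range(d_out):
                    c[i * d_out + a][j * d_out + b] = block[a][b]
    return c

def trace_out_output(c, d_in, d_out):
    """Partial trace over the output factor of a Choi operator."""
    return [[sum(c[i * d_out + a][j * d_out + a] for a in range(d_out))
             for j in range(d_in)] for i in range(d_in)]

def apply_from_choi(c, rho, d_out):
    """Inverse direction: recover the action of the map from its Choi operator."""
    d_in = len(rho)
    return [[sum(rho[i][j] * c[i * d_out + a][j * d_out + b]
                 for i in range(d_in) for j in range(d_in))
             for b in range(d_out)] for a in range(d_out)]

P0 = [[1, 0], [0, 0]]
P1 = [[0, 0], [0, 1]]
dephasing = [P0, P1]   # completely positive and trace-preserving
projection = [P0]      # completely positive but not trace-preserving
plus = [[0.5, 0.5], [0.5, 0.5]]
```

Only `dephasing` survives the restriction to trace-preserving maps: its Choi operator traces out to the identity, whereas that of `projection` does not.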

This has the following trivial consequences.

Corollary 18.

Under the assumptions of Theorem 17 we have:

  • All scalars are equal to  ;

  • States are ‘normalized’ i.e.  for all  ;

  • For all , we have  ;

  • All bipartite effects are disconnected.

Within this framework we can now show that teleportation without classical communication cannot generate any information flow.

Corollary 19.

Under the assumptions of Theorem 17, the composite of the protocol

must be disconnected, that is, it is of the form:

The diagrammatic proof is as follows:

Similarly we have:

Corollary 20.

Under the assumptions of Theorem 17, in a compact category all identities must be disconnected and hence trivial, and in a dagger category all bipartite states must be disconnected, and each object has at most one state.


The first part is a direct mathematical analogue to Corollary 19 and the second part is straightforward. ∎

A formal account of this is in Section 5.2; in Section 6 we show how we can retain the full power of CQM.

Remark 4 (Time-symmetric quantum mechanics).

The passage from dagger compact categories to causal categories can also be seen as an abstract counterpart to the passage from time-symmetric quantum mechanics (TSQM) [4] to the usual formalism of quantum mechanics. In the formalism of TSQM, not only do we assume the existence of a pre-selected state (i.e. one which has been prepared by measurement), but we also assume that measurement outcomes have been post-selected. This corresponds to how a dagger symmetric monoidal category is used in CQM, because in this setting the dagger imposes a formal symmetry between states and effects. In TSQM, the violation of Postulate 13 has been partially addressed by restricting the formalism to those classes of initial and final (post-selected) states which do not lead to signaling [41]; this is ad hoc, and these classes lack an elegant formal characterisation.

Remark 5.

In fact, when exploiting the input-output duality at both Alice’s and Bob’s sites we can identify two more kinds of information flow:

[Diagrams: third-kind info-flow; fourth-kind info-flow]

Each of these is however excluded by terminality; in the first case since there cannot be two non-equal effects and in the second case since the bipartite effect must itself be .

Earlier work: Chiribella, D’Ariano and Perinotti.

In [15], Chiribella, D’Ariano and Perinotti use the existence of a unique deterministic effect, which they call the causality axiom [15, Definition 25 & Lemma 3], to derive information-theoretic features of quantum theory. In that work, much use is made of the probabilistic structure of measurements and classical outcomes. However, in our framework, we can already derive such features without assuming probabilistic structure, e.g. Corollary 19. By using compositionality, we expose the structural (as opposed to probabilistic) aspects of information flow that follow from requiring causality.

4 Partiality of the tensor from global state

In a monoidal category the tensor product exists for every pair of objects; in particular, a system can be tensored with itself to produce . In that case, consists of all for . The independence of and lacks physical meaning, since if the two s in refer to one and the same system (spatiotemporally) then obviously we have . More generally, the freedom of composing arbitrary states of a system and a system implies that they should be independent in . The contrapositive to this is that for systems that are not independent, we have to restrict composition of the states of subsystems, which can also be achieved simply by constraining the composition of the objects themselves. We will derive this feature in this Section, where we first investigate to which slices (i.e. object formulae) we can meaningfully assign states within the context of arbitrary scenarios or protocols. In this manner we will develop a partial tensor product, which will be the analogue of spacelike hypersurfaces in relativity: we call these objects ‘spatial slices’.

Now, we will assume that in a scenario or protocol each object occurs only once as an input type and once as an output type. This can be achieved without loss of generality simply by annotating an object with its ‘location’ within the scenario or protocol. The reason for this assumption is merely to guarantee that in the following definition an object is considered only once. Recall that we assume that morphism formulae contain only atomic morphisms.

Definition 21.

Let be a morphism formula. A slice is included in if the objects occurring in form a subset of the input and output types of the atomic morphisms that are in .

In terms of the graphical language, a slice is included in a diagram just when it is a subset of the wires in it, which we can denote by putting ticks on the wires:

Definition 22.

A spatial slice is a slice for which every pair of objects that occur in it are disconnected objects, that is, if and are disconnected.

While the slice in picture (11) cannot be spatial, since it involves objects that are explicitly connected within the diagram, the following slice may be spatial:


provided there are no scenarios for which the ticked objects are connected.

Definition 23.

Let be a morphism formula made up of atomic morphisms. Another morphism formula is a sub-formula of , if can be formed from (necessarily included!), , and other morphism formulae. Similarly we define sub-scenario and sub-protocol.

In the graphical language a sub-formula is simply part of a diagram:

The following definition states the conditions under which we can assign a state to a slice included within a protocol or scenario, when an initial state is specified (cf. ‘initial condition’). We denote by the appropriate composite of symmetry isomorphisms that realizes the stated type.

Definition 24.

Let be a scenario (or protocol), let be a slice included in it with , and let be a state. The local state at relative to , where with is a sub-scenario of , is the state .

With and as in picture (12), for subscenario the annotated part of  :

the local state is:

where we relied on eq. (6) to eliminate the symmetry isomorphisms.

Theorem 25 (Existence and uniqueness of local states).

For an SMC with terminal unit object, a slice admits a local state for any scenario in which it is included if and only if it is a spatial slice. Moreover, local states, whenever they exist, do not depend on the choice of the subscenario of Definition 24.


First we show that if a slice is non-spatial, then there exists a scenario for which does not admit any local states. If is non-spatial then there exists and such that some morphism is connected. Setting:

one easily sees that admits no subformula of the type required to yield a local state.

If is spatial, then for any scenario which includes we can construct the causal past of as follows. First, consider all morphisms in for which an object in the output type is in ; if such an object happens to be part of the input type of then we can consider the identity morphism. Then denote by the slice consisting of all the input types of these morphisms, and now repeat this procedure, with playing the role of , until we obtain a slice consisting of objects in the input type of . The causal past then consists of all the morphisms that we encountered in this procedure, together with for and up to symmetry equal to the input type of ; e.g.:

in the case of the example in picture (12). One then obtains the local state by post-composing any object in the output type which is not in with .

We now show independence of the local state on the choice of the subscenario . Any such will include all the morphisms accounted for in the causal past, precisely by the construction of the causal past. We now proceed by induction on the number of morphisms contained in . We can enlarge in two manners, either by means of sequential or parallel composition with some morphism , respectively yielding or , where we omitted symmetry isomorphisms. But since post-composition with and respectively yields and the resulting local state will be the same as for .

Theorem 25 can be understood as arising from the way in which the object and morphism languages interact. Roughly speaking, the morphism-language defines how processes can be composed; the interaction of the morphism-language with the object-language defines how processes or scenarios can be decomposed using a slice; and for this latter structure to allow local states to be defined for each slice, we require a partial monoidal structure.

Remark 6.

If we were to allow to be a disconnected morphism in Theorem 25 then the proof would break down, i.e. we can indeed define a local state. This is possible because, as per our assumption that morphism formulae are equivalent if they correspond to equivalent diagrams in the graphical language, a disconnected morphism can be written as , since this is in the same diagrammatic equivalence class. But this would then provide a subformula , which ensures a local state can be given at the spatial slice . In contrast, in the connected case we were not able to ‘push’ next to to form (as can be done for the disconnected case) to be part of the codomain of .

Remark 7.

Note that from the proof of Theorem 25 it also follows that in Definition 24 we do not always need to specify an initial state for the entire slice , but only for the slice which is included in the causal past of .

Definition 26.

A spatial slice with is total for a scenario in which it is included if decomposes into two subscenarios and .

Total slices allow one to model evolution of a state through a scenario, when considering local states for a ‘propagating’ family of total slices, e.g.:

In this context, a general covariance theorem is one which states that the state of a system does not depend on the particular choice of foliation, i.e. the slice it belongs to.

Corollary 27 (general covariance).

Local states do not depend on the choice of foliation.

We provide a simple example: for the scenario and total spatial slices:

we have:

since by terminality:
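As a numerical illustration of this terminality argument, the following sketch works in the classical model of Example 6 and makes several assumptions of its own: a correlated state on two 2-point systems A and B, one process `f` acting on A and one process `g` acting on B, and the two total spatial slices taken before and after `g`. The concrete matrices and function names are hypothetical.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def discard(n):
    """The unique effect of the classical model: the 1 x n row of ones."""
    return [[F(1)] * n]

# Perfectly correlated state on A (x) B, as a column vector.
psi = [[F(1, 2)], [F(0)], [F(0)], [F(1, 2)]]
f = [[F(3, 4), F(1, 4)],
     [F(1, 4), F(3, 4)]]          # stochastic process on A
g = [[F(1, 2), F(1, 3)],
     [F(1, 2), F(2, 3)]]          # stochastic process on B
g_bad = [[F(1, 2), F(0)],
         [F(0), F(1)]]            # not normalised: first column sums to 1/2

def local_state_early(f, psi):
    """Local state at A' read off the slice A' (x) B (before the process on B)."""
    after = matmul(kron(f, identity(2)), psi)
    return matmul(kron(identity(len(f)), discard(2)), after)

def local_state_late(f, g, psi):
    """Local state at A' read off the slice A' (x) B' (after the process on B)."""
    after = matmul(kron(f, g), psi)
    return matmul(kron(identity(len(f)), discard(len(g))), after)

same_for_normalised = local_state_early(f, psi) == local_state_late(f, g, psi)
same_for_unnormalised = local_state_early(f, psi) == local_state_late(f, g_bad, psi)
```

With the normalised process `g` the two foliations agree on the local state at A', because discarding after `g` equals discarding before it. With `g_bad` the two values differ, which is the kind of foliation dependence that terminality rules out.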

Earlier work: BIP; Markopoulou; and Hardy.

There are three strands of work which are directly relevant to this Section.

  1. Markopoulou [42] defined a quantum causal history (QCH) as a mapping that assigns a Hilbert space to each element of a causal set [50]; it assigns tensor products for an antichain ; and it assigns unitary mappings between antichains. This structure is clearly very similar to the spatial slices we have defined in monoidal categories. However, the assignment of unitaries between antichains and does not take into account the causal structure between elements and . Hence a unitary for which the state at is not a constant function of the state at is allowed even if and are not causally related. Therefore a QCH cannot enforce the fact that teleportation without classical communication cannot provide information flow.

  2. Blute, Ivanov and Panangaden (BIP) developed a mathematical framework for describing the evolution of open quantum systems in [10], that is related to Markopoulou’s work and the causal sets programme. A similar notion of spatial slice exists for that framework, and it is shown there that the states are covariant. However, whereas in a causal category covariance follows immediately from the definition of the category, demonstrating covariance in [10] is much more involved, relying on properties of the concrete model of density matrices on Hilbert spaces.

  3. Hardy has developed an operational framework for describing general probabilistic theories [27]. His framework also uses diagrammatic methods, although its formalization is not explicit. His work uses the way in which inputs and outputs of boxes are connected to define the causal structure of a scenario. This is similar to our work, because we define causal structure (see Subsection 3.1) using the processes available between two slices, i.e. whether or not they are disconnected. However in a causal category the causal structure can be defined globally, whereas in Hardy’s approach the causal structure is defined only for the boxes for which connections are made.

Above we showed that terminality of the tensor unit implies covariance; however, in a CJ-universe the converse is also true, as we now show.

Theorem 28.

In a CJ-universe, general covariance implies terminality of the tensor unit.


When considering

which by injectivity in the definition of CJ-universe requires . ∎

Hence, slices which are not spatial will not allow us to meaningfully describe the local state on some part of the slice. Therefore, to ensure that this is always possible:

  • we will restrict the tensor to causally unrelated (i.e. disconnected) systems.

Our rigorous formal definition is in the next Section. One key consequence of this is that:

  • all systems (i.e. objects) in a causal category correspond to spatial slices,

without any further ado, simply by the compositional structure.

Remark 8 (Crossing slices).

Although we shall restrict tensor composition of objects we will not restrict tensor composition of processes. In contrast to other work in the same vein [42, 10], this will allow for processes to be defined between ‘crossing’ slices. For example, for slices and , with causally preceding while causally precedes , it still makes perfect sense to speak of processes of type , which will all be of the form

with arbitrary in and arbitrary in , graphically:

5 Definition and analysis of causal categories

Given that, in a category , the absence of first-kind information-flow between objects and is implemented by making the hom-set disconnected, we can summarise the results of the previous Section in the table below; it shows the mathematical structure corresponding to the physical properties that we aim to axiomatize.

Physical property | Mathematical structure | Assumptions
No second-kind info-flow | Terminality of tensor unit | Existence of CJ states
Unique local state for each slice | Partial monoidal structure | Terminality of tensor unit

Note that our concrete models will typically be quantum theory or classical probability theory, so the assumption of the existence of CJ states will be satisfied. Since this assumption leads to terminality, we note that terminality is actually not an extra assumption given that we assume the existence of CJ states. This leads to the formal definition of a caucat which we now introduce.

5.1 Definition of a causal category

We use to denote pointwise application. As already indicated above, we will take natural isomorphisms of monoidal structure to be strict.

Definition 29.

A partial functor is a functor , where is a subcategory of ; is called the domain of definition of , written , and is called the domain of , written . A partial bifunctor is a partial functor whose domain is a product category.

Definition 30.

A symmetric strict partial monoidal category is a category , together with a partial bifunctor , for which is a full subcategory of , and such that there exists a unit object , which is the unit of a partial monoid :

  • For all , both and , and

  • ;

  • , iff ,

  • when they exist, , and

  • for any morphisms in , when they exist.

  • for all , there exists a symmetry morphism

    such that .

Remark 9 (Associativity of parallel composition for morphisms).

Since is a full subcategory of , the partial monoidal product of morphisms and exists iff and exist. Then, given a morphism , since exists iff exists, we also have that exists iff exists.

Remark 10 (Bifunctoriality).

For a partial monoidal category bifunctoriality holds just as for a (full) monoidal category, i.e.

This is guaranteed by the fact that the domain of definition for a partial bifunctor is a full subcategory of its domain, so the monoidal product of the composites and always exists.

Example 8.

Any strict monoidal category is a strict partial monoidal category, where , and any category that contains a strict monoidal category as a full subcategory is a strict partial monoidal category.

Remark 11.

The symmetry morphism can be seen as a ‘kinematic’ feature of a caucat, analogous to inversion of a spatial axis in a conventionally formulated physical theory.

Definition 31.

A causal category (or caucat) is a symmetric strict partial monoidal category whose unit object is terminal, i.e. for each object there is a unique morphism , and for which the monoidal product, , exists iff


We also require that each object has at least one element, i.e. .

Proposition 32.

(i) In a caucat, morphisms are ‘normalized’, i.e. . (ii) In a caucat and whenever exists.


Both (i) and (ii) follow straightforwardly by terminality of . ∎

We shall refer to a monoidal or partial monoidal category in which the unit object is terminal, and for which each object has at least one element, as a normalized category (or normcat). The definition of disconnectedness for a partial monoidal category is the same as for a full monoidal category (cf. Defn. 11 and Defn. 22), but we state it explicitly for the reader’s convenience.

Definition 33 (Disconnectedness for partial monoidal category).

In a partial monoidal category, a morphism is disconnected if it decomposes as for some and , and a hom-set is disconnected if it contains only disconnected morphisms. If both and are disconnected then we say that the objects and are disconnected.

Proposition 34.

(i) In Definition 31, Eq. (13) is equivalent to both and being disconnected. (ii) In a caucat, exists iff exists. (iii) Conditions (u1) and (a1) in the definition of partial monoidal category are implied by Eq. (13) together with the condition that if , , exist then also exists.


(i) follows from the fact that by terminality of any disconnected morphism in a caucat is of the form . (ii) follows straightforwardly from the symmetry of Eq. (13). Concerning (iii), since is terminal we have:

  • , and

  • ,

so and always exist. If exists, then for , :

for some , so it follows that exists, and by symmetry also exists, and hence by our additional assumption also exists. ∎

Remark 12.

Item (ii) in Proposition 34 shows that the symmetry morphism can be consistently defined for a caucat.

Definition 35.

In a causal category:

  • If exists then we call and space-like separated.

  • If is connected while is disconnected then causally precedes .

  • If and are connected then and are causally intertwined.
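The three cases of this definition, and the partiality of the tensor, can be mimicked in a toy model. The sketch below is not the construction of the paper: it assumes that objects are finite sets of events of a partially ordered set, that a hom-set counts as connected when some event of the first slice is earlier than or equal to some event of the second, and that the tensor is set union. The class and method names are hypothetical.

```python
class CausalOrder:
    """Toy causal structure: events with a strict 'before' relation."""

    def __init__(self, events, links):
        self.events = set(events)
        self.before = set(links)
        # Close the relation under transitivity.
        changed = True
        while changed:
            changed = False
            for (a, b) in list(self.before):
                for (c, d) in list(self.before):
                    if b == c and (a, d) not in self.before:
                        self.before.add((a, d))
                        changed = True

    def leq(self, a, b):
        return a == b or (a, b) in self.before

    def connected(self, A, B):
        """Assumed stand-in for 'the hom-set from A to B is connected'."""
        return any(self.leq(a, b) for a in A for b in B)

    def tensor(self, A, B):
        """Partial tensor: defined only when neither direction is connected."""
        if self.connected(A, B) or self.connected(B, A):
            return None
        return frozenset(A) | frozenset(B)

    def classify(self, A, B):
        forward, backward = self.connected(A, B), self.connected(B, A)
        if forward and backward:
            return "causally intertwined"
        if forward:
            return "first causally precedes second"
        if backward:
            return "second causally precedes first"
        return "space-like separated"

# Two independent world lines: a1 < a2 < a3 and b1 < b2.
order = CausalOrder(
    ["a1", "a2", "a3", "b1", "b2"],
    [("a1", "a2"), ("a2", "a3"), ("b1", "b2")],
)
```

In this model the empty set acts as the unit, the tensor of a non-empty slice with itself is undefined, and the slices {a1, b2} and {a2, b1} are causally intertwined, which corresponds to the 'crossing' slices of Remark 8.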

Remark 13.

A pair of objects which are causally intertwined each correspond to a spatial slice, but they are the ‘crossed’ spatial slices that we discussed in Remark 8.

Example 9.

Each category induces a caucat by freely adjoining a monoidal unit (and a state for each object); we could call such a degenerate caucat purely temporal. Each monoid induces also another caucat with the monoid as the tensor by freely adjoining a unique morphism for each ordered pair of objects; we could call such a degenerate caucat purely spatial.

Remark 14.

Since in non-degenerate situations identities are connected, the tensor of with itself will typically not exist (see Lemma 37 below).

5.2 Relation to dagger compact categories

We now show that some basic aspects of CQM, involving identical or isomorphic objects (which will allow us to identify systems of the same kind), in particular the use of compactness and dagger structure, are incompatible with the caucat structure! In Subsection 6.4 we shall describe how these aspects can be reinstated.

We first show that isomorphisms cannot be used to represent the property that two systems at different spatiotemporal locations are of the same type (e.g. a qubit).

Proposition 36.

Given a caucat , suppose that causally precedes , or that and are causally unrelated. Then if , it follows that .


If either causally precedes , or and are space-like separated, then is disconnected. Hence for the iso we have, for some ,

Since by terminality of we have , we obtain , and similarly. ∎

Hence the fact that systems at different spacetime locations are of identical types cannot be witnessed in the caucat, but instead in the †-SMC that will be used to construct the caucat; we describe how this can be done in Subsection 6.4.

As discussed in Section 4, the tensor product of a system with itself is not meaningful in a causal setting. This can be formalised as follows.

Lemma 37.

If the identity morphism on an object in a caucat is disconnected, then has only one state. Any morphism between objects for which the identity is disconnected on either the domain or codomain is also disconnected.


If for some , then for any other state we have

Consider a morphism for which . Then

for some morphism . The codomain case proceeds similarly. ∎
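The proof uses only terminality of the unit, so the same behaviour can be observed in the normalized category of stochastic matrices from Example 6 (a normcat with a total tensor, not a caucat). In the sketch below, which relies on that assumption, a stochastic map is disconnected exactly when all of its columns coincide, since it then factors as a state after the discarding effect; the function name is illustrative.

```python
from fractions import Fraction as F

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def is_disconnected(f):
    """A stochastic matrix factors as (state) after (discard) iff its columns agree."""
    columns = [[row[j] for row in f] for j in range(len(f[0]))]
    return all(col == columns[0] for col in columns)

# The identity on a one-point system is disconnected; that system has the
# single state [[1]], and every map out of it is a single column.
trivial_identity = identity(1)
map_from_trivial = [[F(1, 3)], [F(2, 3)]]

# The identity on a two-point system has two distinct columns.
qubit_like_identity = identity(2)
```

The two-point system has more than one state, and correspondingly its identity is connected; only the one-point system, with its single state, has a disconnected identity.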

Proposition 38.

If exists then is disconnected.


If exists then , and is disconnected. ∎

We restate Corollary 20 for completeness, since it shows that there are no non-trivial objects with compact structures.

Proposition 39.

For an object in a caucat with a compact structure is disconnected, and morphisms between objects with a compact structure are disconnected. Hence for a compact subcategory of a caucat all morphisms are disconnected.

Hence compact structure is incompatible with the structure of a caucat. Dagger structure can also not be maintained.

Proposition 40.

In a caucat with a dagger functor every object has only one state, and hence compound objects only have disconnected states.


For a given object , a dagger functor provides a bijection

so since is terminal this can only occur if each object has only one state. ∎

6 Constructing caucats

In this Section we shall describe methods for constructing caucats. The first step consists of normalizing a (†-compact) SMC. Then there are two options: either to ‘carve out’ an appropriate subcategory of a monoidal category (this will represent discarding the unphysical objects in the category, when a causal structure is already given in the category), or to combine it with a causal structure, resulting in a partial monoidal product that exists for pairs of objects which are not causally related.

Finally in Subsection 6.4 we shall describe how to reinstate the power of CQM, given that we showed above that structures such as compactness cannot be retained in caucats.

6.1 Normalizing

Definition 41 (Normalization).

Given a (†-compact) SMC with environment structure, we define a new SMC with the same objects as but with morphisms restricted to normalized ones, and a corresponding inclusion functor .
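A minimal sketch of this restriction, assuming the ambient category is that of non-negative matrices with the row of ones as environment structure, so that a morphism is normalized when discarding after it equals discarding before it. The names `is_normalized` and `normalize` are hypothetical, and `cup` and `cap` stand in for the state and effect of a compact structure.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def discard(n):
    """Environment structure of the assumed ambient category."""
    return [[F(1)] * n]

def is_normalized(f):
    """f is kept iff (discard on codomain) after f equals (discard on domain)."""
    return matmul(discard(len(f)), f) == discard(len(f[0]))

def normalize(morphisms):
    """Keep only the normalized morphisms; objects are left unchanged."""
    return [f for f in morphisms if is_normalized(f)]

f = [[F(1, 2), F(1, 4)],
     [F(1, 2), F(3, 4)]]
g = [[F(1), F(1, 3)],
     [F(0), F(2, 3)]]
cup = [[F(1)], [F(0)], [F(0)], [F(1)]]   # unnormalised bipartite state
cap = [[F(1), F(0), F(0), F(1)]]         # bipartite effect other than discard

kept = normalize([f, g, cup, cap])
```

Here `kept` contains only `f` and `g`. Composites and Kronecker products of normalized morphisms are again normalized, so the restriction still yields an SMC, while `cup` and `cap` are removed, in line with the remark that compact structure does not survive normalization.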

Evidently, if is †-compact, will not be; we therefore retain its connection with the given †-compact SMC via the strict monoidal functor . But while we cannot retain dagger structure and compact structure, we can retain a conjugate functor, which can be constructed from the dagger structure and the compact structure.

Firstly, given a compact category (see Definition 6) we can define a contravariant functor:

that is, in diagrams,


where we used a 180-degree rotation of the box representing the morphism to denote its transpose. If we moreover have a dagger functor we can also define a covariant functor [2]:

that is, in diagrams,


where we use reflection in the vertical axis to denote the conjugate.

Under mild assumptions, the existence of a CJ-state is also retained in a caucat.

Theorem 42.

Let be a dagger compact category with an environment structure for which, for all :

  • the scalar is invertible, and,

  •  ,

then in every object has a CJ-state , and the conjugation functor from is inherited.


For the morphism is normalized by assumption (cc1), so in , and the fact that this is a CJ-state follows straightforwardly from compactness, since the property of being a CJ-state is a weakened form of compactness. Moreover, if is normalized, then