# Towards Quantum Integrated Information Theory

## Abstract

Integrated Information Theory (IIT) has emerged as one of the leading research lines in computational neuroscience to provide a mechanistic and mathematically well-defined description of the neural correlates of consciousness. Integrated Information ($\Phi$) quantifies how much the integrated cause/effect structure of the global neural network fails to be accounted for by any partitioned version of it. The holistic IIT approach is in principle applicable to any information-processing dynamical network regardless of its interpretation in the context of consciousness. In this paper we take the first steps towards a formulation of a general and consistent version of IIT for interacting networks of quantum systems. A variety of different phases, from the dis-integrated ($\Phi=0$) to the holistic one (extensive $\Phi$), can be identified and their cross-overs studied.

## I Introduction

Over the last decade Integrated Information Theory (IIT), developed by G. Tononi and collaborators, has emerged as one of the leading research lines in computational neuroscience. IIT aims at providing a mechanistic and mathematically well-defined description of the neural correlates of consciousness IIT-0 (); IIT-1 (); IIT-2 (); IIT-3 ().

The idea is to quantify the amount of cause/effect power in the neural network that is holistic, in the sense that it goes beyond and above the sum of its parts. This is done in a bottom-up approach by quantifying how arbitrary parts of the network (“mechanisms”), in a given state, influence the future and constrain the past of other arbitrary parts (“purviews”), in a way that is irreducible to the separate (and independent) actions of parts of the mechanism over parts of the purview. Iterated at the global network level this process gives rise to a so-called “conceptual structure”, comprising a family of mechanisms and purviews, where the latter represent the integrated core causes/effects of the former IIT-1 (). A measure of the distance between this conceptual structure and the closest one obtainable from a suitably partitioned network quantifies how much of the cause/effect structure of the dynamical network fails to be reducible to the sum of its parts. This minimal distance is, by definition, the Integrated Information (denoted by $\Phi$) of the network.

In IIT it is then boldly postulated that the larger $\Phi$, the higher is the degree of consciousness of the network in the given state. The irreducibility of the causal information-processing structure of the network measured by $\Phi$ is independent of the specific “wetware” implementing the brain circuitry. It follows that the IIT approach to consciousness seems to lead to a rather controversial scott () panpsychist view of the world IIT-3 ().

Besides, and irrespective of, the applications to consciousness, the IIT approach is in principle applicable to any information-processing network. For example, applications of IIT to Elementary Cellular Automata and Adapting Animats have been discussed Larissa (). Moreover, potential extensions of IIT to more general systems, including quantum ones, have been proposed in Max-1 (); Max-2 () by M. Tegmark (see also Ranchin2015 ()).

In this paper we shall make an attempt to formulate a general and consistent version of IIT for interacting networks of finite-dimensional and non-relativistic quantum systems. Our approach is going to be a quantum information-theoretic one: neural networks are replaced by networks of qudits, probability distributions by non-commutative density matrices, and Markov processes by trace-preserving completely positive (CP) maps. The irreducible cause/effect structure of the global network is encoded by a so-called conceptual structure operator. The minimal distance of the latter from those obtained by factorized versions of the network defines the quantum Integrated Information $\Phi$. We would like to strongly emphasize from the very beginning that:

i) Our goal is not to account for potential quantum features of consciousness. We aim at understanding the role that a suitably designed notion of information integration may play a) in quantum information processing sensu lato, and b) in a novel categorization of the different phases of quantum matter.

ii) The quantum extension of IIT (QIIT) that we are going to discuss is not unique. In fact, exploring new avenues toward QIIT is one of the main goals for further investigations.

Here we deem necessary a word of warning: in the following we will often borrow jargon from classical IIT, e.g., mechanisms, repertoires, purviews, concepts, conceptual structures. These are technical terms (precisely defined in the paper) which may not necessarily be familiar to quantum information experts and should not be confused with the ordinary language usage of the same terms.

## II Setting the stage

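Let $\Lambda$ be a set of cardinality $|\Lambda|$. For each $j\in\Lambda$ there is an associated $d$-dimensional quantum system with Hilbert space $h_j\cong\mathbb{C}^d$. Adopting the IIT jargon we will refer to subsets of $\Lambda$ as mechanisms. Usually $\Lambda$ will be equipped with a distance function $\mathrm{dist}(i,j)$; in this case we define the distance between mechanisms $M$ and $P$ by $\mathrm{dist}(M,P):=\min\{\mathrm{dist}(i,j)\,:\,i\in M,\ j\in P\}$. Given any $\Omega\subset\Lambda$ we define $\mathcal{H}_\Omega:=\bigotimes_{j\in\Omega}h_j$, with dimension $d^{|\Omega|}$. We will denote by $L(\mathcal{H}_\Omega)$ ($\mathcal{S}(\mathcal{H}_\Omega)$) the associated operator-algebra (state-space). One has that $\mathcal{H}_\Lambda\cong\mathcal{H}_\Omega\otimes\mathcal{H}_{\Omega'}$, where $\Omega'$ denotes the complement of $\Omega$ (in $\Lambda$). The network dynamics will be described by a trace preserving unital CP-map $\mathcal{U}: L(\mathcal{H}_\Lambda)\to L(\mathcal{H}_\Lambda)$ with $\mathcal{U}(\mathbb{1})=\mathbb{1}$. This map has to be thought of as the one-step evolution of a discrete time process. If $\mathcal{E}$ is a CP-map its dual $\mathcal{E}^*$ is defined by $\mathrm{Tr}[A\,\mathcal{E}(B)]=\mathrm{Tr}[\mathcal{E}^*(A)\,B]$. Given $\Omega\subset\Lambda$ we define the noising CP-map by $\mathcal{N}_\Omega(X):=\mathrm{Tr}_\Omega(X)\otimes\mathbb{1}_\Omega/d^{|\Omega|}$.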
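As a concrete illustration of the noising map, here is a minimal NumPy sketch. It assumes sites are numbered `0..n-1` and that operators on the network are stored as `d**n × d**n` matrices in the standard tensor-product (Kronecker) ordering; the helper names `partial_trace` and `noise` are hypothetical, chosen only for this example.

```python
import numpy as np

def partial_trace(rho, keep, n, d=2):
    """Trace out all sites not listed in `keep` (sites are numbered 0..n-1)."""
    t = rho.reshape([d] * (2 * n))   # first n axes: row indices, last n: column indices
    k = n                            # number of sites still present in `t`
    for site in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=site, axis2=site + k)
        k -= 1
    return t.reshape(d ** k, d ** k)

def noise(rho, omega, n, d=2):
    """Noising map: replace the sites in `omega` by the maximally mixed state."""
    omega = sorted(omega)
    rest = [s for s in range(n) if s not in omega]
    reduced = partial_trace(rho, rest, n, d)
    full = np.kron(reduced, np.eye(d ** len(omega)) / d ** len(omega))
    # `full` is ordered as (rest, omega); permute the tensor factors back to 0..n-1
    perm = np.argsort(rest + omega)
    t = full.reshape([d] * (2 * n))
    t = t.transpose(list(perm) + [p + n for p in perm])
    return t.reshape(d ** n, d ** n)

# Two qubits prepared in |0><0| (x) |1><1|; noising site 0 leaves 1/2 (x) |1><1|.
psi, phi = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
noised = noise(np.kron(psi, phi), [0], 2)
```

The map is trace preserving and unital by construction, which is easy to confirm numerically on small examples.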

The first step in classical IIT is to consider pairs $(M,P)$ (where the mechanism $P$ is referred to as the purview of $M$) and to quantify how the state, i.e., a probability distribution, of $M$ (with $M'$ being in a maximally random state) at time $t$ conditions (constrains) the state of $P$ (at time $t\pm1$). The quantification is obtained by measuring the distance between the conditioned states (referred to as the effect and cause repertoires of $M$) and the un-conditioned one. It follows that the first step toward a QIIT is to define a quantum version of the cause/effect repertoires of classical IIT IIT-2 (); IIT-3 (). Let us now motivate our choice for the quantum counterparts.

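Effects.– We will denote by $p$ ($p'$) the degrees of freedom (DOFs) associated with the purview $P$ at time $t+1$ (its complement $P'$) and by $m$ ($m'$) the DOFs associated to the mechanism $M$ at time $t$ (its complement $M'$). On purely classical probabilistic grounds one can write

$$
\begin{aligned}
\Pr(p|m) &= \frac{\Pr(p,m)}{\Pr(m)}=\sum_{p',m'}\frac{\Pr(p,p',m,m')}{\Pr(m)} \\
&= \sum_{p',m'}\frac{\Pr(p,p',m,m')}{\Pr(m)\Pr(m')}\Pr(m')=\sum_{p',m'}\Pr(p,p'|m,m')\Pr(m').
\end{aligned}
\tag{1}
$$

Here we have assumed that the prior of $m$ and $m'$ factorizes, i.e., $\Pr(m,m')=\Pr(m)\Pr(m')$. Now quantum mechanics enters in defining the transition probability $\Pr(p,p'|m,m')=\langle p|\langle p'|\,\mathcal{U}(|m\rangle\langle m|\otimes|m'\rangle\langle m'|)\,|p'\rangle|p\rangle$, where $\mathcal{U}$ is the unital CP-map describing the (one-step) dynamics of the network. Inserting this in the equation above one gets

$$
\begin{aligned}
\Pr(p|m) &= \sum_{p'}\langle p|\langle p'|\,\mathcal{U}\Big(|m\rangle\langle m|\otimes\sum_{m'}\Pr(m')|m'\rangle\langle m'|\Big)|p'\rangle|p\rangle \\
&= \langle p|\,\mathrm{Tr}_{P'}\,\mathcal{U}\Big(|m\rangle\langle m|\otimes\frac{\mathbb{1}_{M'}}{d^{|M'|}}\Big)|p\rangle.
\end{aligned}
\tag{2}
$$

Here we have assumed that the prior for $m'$ is the uniform unconstrained one, i.e., $\Pr(m')=d^{-|M'|}$. The last equation shows that the probability for the purview being in the state $p$ at time $t+1$ (conditioned on the mechanism being in the state $m$ at time $t$) is the diagonal element of the reduced density matrix $\mathrm{Tr}_{P'}\,\mathcal{U}(|m\rangle\langle m|\otimes\mathbb{1}_{M'}/d^{|M'|})$.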

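Causes.– We now consider cause repertoires and denote by $p$ ($p'$) the degrees of freedom (DOFs) associated with the purview $P$ at time $t-1$ (its complement $P'$) and by $m$ ($m'$) the DOFs associated to the mechanism $M$ at time $t$ (its complement $M'$). Using Bayes' rule and Eq. (2) (with $p$ and $m$ interchanged) one can write

$$
\begin{aligned}
\Pr(p|m) &= \frac{\Pr(m|p)\Pr(p)}{\Pr(m)}=\langle m|\,\mathrm{Tr}_{M'}\,\mathcal{U}\Big(|p\rangle\langle p|\otimes\frac{\mathbb{1}_{P'}}{d^{|P'|}}\Big)|m\rangle\,\frac{\Pr(p)}{\Pr(m)} \\
&= \mathrm{Tr}\Big[\Big(|m\rangle\langle m|\otimes\frac{\mathbb{1}_{M'}}{d^{|M'|}}\Big)\,\mathcal{U}\big(|p\rangle\langle p|\otimes\mathbb{1}_{P'}\big)\Big].
\end{aligned}
\tag{3}
$$

Here we used the uniform priors $\Pr(p)=d^{-|P|}$ and $\Pr(m)=d^{-|M|}$, together with $|P|+|P'|=|M|+|M'|=|\Lambda|$. Using the Hilbert-Schmidt dual $\mathcal{U}^*$ and the properties of reduced density matrices, the equation above becomes

$$
\begin{aligned}
\Pr(p|m) &= \mathrm{Tr}\Big[\mathcal{U}^*\Big(|m\rangle\langle m|\otimes\frac{\mathbb{1}_{M'}}{d^{|M'|}}\Big)\big(|p\rangle\langle p|\otimes\mathbb{1}_{P'}\big)\Big] \\
&= \langle p|\,\mathrm{Tr}_{P'}\,\mathcal{U}^*\Big(|m\rangle\langle m|\otimes\frac{\mathbb{1}_{M'}}{d^{|M'|}}\Big)|p\rangle.
\end{aligned}
\tag{4}
$$

Again, the last equation shows that the probability for the purview being in the state $p$ at time $t-1$ (conditioned on the mechanism being in the state $m$ at time $t$) is the diagonal element of $\mathrm{Tr}_{P'}\,\mathcal{U}^*(|m\rangle\langle m|\otimes\mathbb{1}_{M'}/d^{|M'|})$.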
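The step from Bayes' rule to Eq. (4) can be checked numerically. The sketch below is only an illustration under stated assumptions: two qubits, mechanism on site 0 and purview on site 1, a unitary map $X\mapsto VXV^\dagger$ with `V` drawn at random (via a QR decomposition, an arbitrary choice), and uniform priors. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random two-qubit unitary from the QR decomposition of a complex Gaussian matrix
# (illustrative choice; any unitary would do).
z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
V, _ = np.linalg.qr(z)

I2 = np.eye(2)

def proj(k):
    """Single-qubit projector |k><k|."""
    e = np.zeros((2, 2))
    e[k, k] = 1.0
    return e

def pr_m_given_p(m, p):
    """Eq. (2) with the roles of p and m interchanged: purview prepared on site 1,
    its complement (site 0) maximally mixed, mechanism read out on site 0."""
    rho = V @ np.kron(I2 / 2, proj(p)) @ V.conj().T
    return np.trace(np.kron(proj(m), I2) @ rho).real

def pr_p_given_m_bayes(p, m):
    """Bayes' rule with the uniform prior Pr(p) = 1/2."""
    pr_m = sum(pr_m_given_p(m, q) * 0.5 for q in (0, 1))
    return pr_m_given_p(m, p) * 0.5 / pr_m

def pr_p_given_m_dual(p, m):
    """Eq. (4): diagonal element of Tr_{P'} U^*(|m><m| (x) 1/2), with U^*(X) = V^dagger X V."""
    sigma = V.conj().T @ np.kron(proj(m), I2 / 2) @ V
    return np.trace(np.kron(I2, proj(p)) @ sigma).real

table = [(p, m, pr_p_given_m_bayes(p, m), pr_p_given_m_dual(p, m))
         for p in (0, 1) for m in (0, 1)]
```

For every pair `(p, m)` the last two entries of `table` agree up to floating-point error, as Eqs. (3) and (4) require.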

## III QIIT

The considerations above naturally lead to a definition for cause/effect repertoires where we consider the full quantum density matrix as opposed to just its diagonal entries. We would like to stress that, given the essential role of entanglement in quantum theory, in our approach we drop the assumption of conditional independence of the repertoires and the associated need of virtualization IIT-2 (). Also, notice that in this paper we restrict ourselves to the unital case, in order to have the unconditioned repertoires equal to the maximally mixed state (see below). This is a simplifying technical assumption, not a key requirement.

Definition 1a: cause/effect Repertoires: Given the unital $\mathcal{U}$, the state $\Psi_\Lambda\in\mathcal{S}(\mathcal{H}_\Lambda)$ and $M,P\subset\Lambda$, we define the effect (e) and cause (c) repertoire of $M$ over the purview $P$, by

$$
\rho^{(x)}(P|M):=\mathrm{Tr}_{P'}\,\mathcal{U}^{(x)}\circ\mathcal{N}_{M'}(\Psi_\Lambda)=\mathrm{Tr}_{P'}\,\mathcal{U}^{(x)}\Big(\Psi_M\otimes\frac{\mathbb{1}_{M'}}{d^{|M'|}}\Big),\quad(x=e,c)
\tag{5}
$$

where $\Psi_M:=\mathrm{Tr}_{M'}\Psi_\Lambda$, $\mathcal{U}^{(e)}:=\mathcal{U}$ and $\mathcal{U}^{(c)}:=\mathcal{U}^*$ (the Hilbert-Schmidt dual of $\mathcal{U}$). The set of density matrices $\rho^{(e)}(P|M)$ ($\rho^{(c)}(P|M)$) encodes how the dynamics constrains the future (past) of $P$, given that the system is initialized in $\Psi_M$ and noised over $M'$ (see Fig. (1)). From a qualitative physical point of view one might think in the following way: $\Lambda$ supports an extended quantum medium that is everywhere at infinite temperature except over the region $M$, where it has been locally “cooled off” to some (possibly pure) quantum state $\Psi_M$. The system is then evolved forward (backward) in time by the map $\mathcal{U}$ ($\mathcal{U}^*$). The quantities (6) quantify the distinguishability of the states obtained in this way from the infinite temperature one if only measurements local to the region $P$ are allowed.

The next step is to define the cause/effect information by the information-theoretic distance between the conditioned and the un-conditioned repertoire $\rho^{(x)}(P|\emptyset)=\mathbb{1}_P/d^{|P|}$. In classical IIT the distance between repertoires is usually taken to be the Wasserstein distance IIT-2 (). In this paper, in view of its salient quantum-information theoretic properties and simplicity, we will adopt the trace distance between density matrices $\rho$ and $\sigma$ as a measure of statistical distinguishability, i.e., $D(\rho,\sigma):=\frac{1}{2}\|\rho-\sigma\|_1$.

Definition 1b: cause/effect Information: The cause/effect information of $M$ over $P$ is given by

$$
xi(P|M):=D\Big(\rho^{(x)}(P|M),\frac{\mathbb{1}_P}{d^{|P|}}\Big),\quad(x=e,c).
\tag{6}
$$
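Definitions 1a and 1b translate directly into a few lines of linear algebra. The following is a minimal sketch, not a reference implementation: it assumes a unitary map $X\mapsto VXV^\dagger$, a fully factorized network state given as a list of single-site density matrices, and sites numbered `0..n-1`. The names `repertoire`, `info` and `partial_trace` are hypothetical. A two-qubit swap is used as the test case.

```python
import numpy as np

def partial_trace(rho, keep, n, d=2):
    """Trace out all sites not listed in `keep` (sites are numbered 0..n-1)."""
    t = rho.reshape([d] * (2 * n))
    k = n
    for site in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=site, axis2=site + k)
        k -= 1
    return t.reshape(d ** k, d ** k)

def repertoire(V, psi_sites, M, P, n, d=2, kind="e"):
    """Eq. (5) for U(X) = V X V^dagger and a factorized state (assumption)."""
    state = np.array([[1.0]])
    for j in range(n):
        # Psi on mechanism sites, maximally mixed state elsewhere
        state = np.kron(state, psi_sites[j] if j in M else np.eye(d) / d)
    if kind == "e":
        out = V @ state @ V.conj().T      # effect: evolve forward with U
    else:
        out = V.conj().T @ state @ V      # cause: evolve with the dual map U^*
    return partial_trace(out, P, n, d)

def trace_distance(a, b):
    return 0.5 * np.abs(np.linalg.eigvalsh(a - b)).sum()

def info(V, psi_sites, M, P, n, d=2, kind="e"):
    """Eq. (6): distance of the repertoire from the maximally mixed state on P."""
    rho = repertoire(V, psi_sites, M, P, n, d, kind)
    dim = d ** len(P)
    return trace_distance(rho, np.eye(dim) / dim)

SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
psi = np.diag([1.0, 0.0])                 # pure single-qubit state |0><0|
ei_other = info(SWAP, [psi, psi], M=[0], P=[1], n=2)   # mechanism acts on the other site
ei_self = info(SWAP, [psi, psi], M=[0], P=[0], n=2)    # mechanism's own site is fully noised
```

For the swap, the information a site carries about the other site is maximal for a qubit (`ei_other` is 1/2), while the information about itself vanishes.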

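A: Repertoires for the Swap operation. For the sake of illustration we will use the case $\Lambda=\{1,2\}$ with $\mathcal{U}(X)=SXS$, where $S$ is a swap operation between two qubits. The initial state is taken in the factorized form $\Psi_\Lambda=\Psi\otimes\Psi$, where $\Psi$ is a pure density matrix over a single site. One can easily check that the non-trivial repertoires (notice that $\rho^{(e)}=\rho^{(c)}$, since $\mathcal{U}^*=\mathcal{U}$) are given by $\rho(2|1)=\rho(1|2)=\Psi$, $\rho(\Lambda|1)=\frac{\mathbb{1}}{2}\otimes\Psi$ and $\rho(\Lambda|2)=\Psi\otimes\frac{\mathbb{1}}{2}$. From this it follows $xi(2|1)=xi(1|2)=xi(\Lambda|1)=xi(\Lambda|2)=1/2$.

At the technical level the following remarks are now useful: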

1) Since pure states have the maximum distance from the maximally mixed state and by distance monotonicity under partial traces

2) For unitary $\mathcal{U}$'s generated by a local Hamiltonian the functions $xi(P|M)$ fulfill a Lieb-Robinson type inequality LR ()

$$
xi(P|M)\le c\,\exp\big(-a\,(\mathrm{dist}(P,M)-v|t|)\big),\quad(x=c,e).
$$

Here $c$ and $a$ are constants and $t$ is the evolution time. Moreover, $v$ is the Lieb-Robinson velocity, which depends on the Hamiltonian (see Appendix for a proof).

3) The average cause/effect information of a map $\mathcal{U}$ is defined by the uniform average of $xi(P|M)$ over all mechanisms/purviews: $XI(\mathcal{U}):=2^{-2|\Lambda|}\sum_{P,M\subset\Lambda}xi(P|M)$.

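4) Using the inequality $D(\rho,\sigma)\le\sqrt{\tfrac{1}{2}S(\rho\|\sigma)}$, where $S(\rho\|\sigma)$ is the quantum relative entropy, one finds $xi(P|M)\le\sqrt{\tfrac{1}{2}\big(\log d^{|P|}-S(\rho^{(x)}(P|M))\big)}$, where $S$ here denotes the von-Neumann entropy. Introducing the $2$-Renyi entropy, i.e., $S_2(\rho):=-\log\|\rho\|_2^2\le S(\rho)$, one gets

$$
xi(P|M)\le\sqrt{\frac{1}{2}\log\big(d^{|P|}\,\|\rho^{(x)}(P|M)\|_2^2\big)},\quad(x=c,e).
\tag{7}
$$

This inequality is useful as the purity is technically easier to handle than the trace-distance.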
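The purity bound of Eq. (7) holds for any density matrix compared with the maximally mixed state, so it can be sanity-checked without constructing repertoires. The sketch below samples generic density matrices (a hypothetical sampling choice, not taken from the paper) and compares both sides, using natural logarithms.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_density_matrix(dim):
    """Generic full-rank state: normalize A A^dagger for a complex Gaussian A."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def distance_to_mixed(rho):
    """Trace distance between rho and the maximally mixed state."""
    dim = rho.shape[0]
    return 0.5 * np.abs(np.linalg.eigvalsh(rho - np.eye(dim) / dim)).sum()

def purity_bound(rho):
    """Right-hand side of Eq. (7): sqrt( (1/2) log(dim * Tr rho^2) )."""
    dim = rho.shape[0]
    purity = np.trace(rho @ rho).real
    return np.sqrt(0.5 * np.log(max(dim * purity, 1.0)))  # guard against rounding below 1

samples = [random_density_matrix(dim) for dim in (2, 4, 8) for _ in range(20)]
gaps = [purity_bound(rho) - distance_to_mixed(rho) for rho in samples]
```

Every entry of `gaps` is non-negative: the bound is never violated, although it is generally not tight.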

Let us illustrate this fact in two ways: the first shows that the conditional repertoires purities have a simple expression in terms of standard multi-point spin correlators; the second shows how, using Eq. (7), one can gain an insight on the behavior of cause-effect power for typical (Haar) random unitaries.

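5) We focus on effect repertoires, as everything in the following holds for cause ones by replacing $\mathcal{U}$ with $\mathcal{U}^*$. Using the notation $\sigma^{(j)}_\alpha$ ($\alpha\in\mathbb{Z}_4$, with $\sigma_0=\mathbb{1}$) for the Pauli operators of the $j$-th spin (tensorized with the identity over $\Lambda\setminus\{j\}$) one finds unp ()

$$
\|\rho^{(e)}(P|M)\|_2^2=\frac{1}{2^{|P|}}\sum_{\alpha\in\mathbb{Z}_4^{|P|}}\Big|\sum_{\beta\in\mathbb{Z}_4^{|M|}}G^{(P|M)}_{\alpha,\beta}\,\lambda_\beta\Big|^2,
\tag{8}
$$

where $\lambda_\beta:=\mathrm{Tr}\big[\Psi_M\prod_{j\in M}\sigma^{(j)}_{\beta_j}\big]$ are the expansion coefficients of the mechanism state in the Pauli basis, and

$$
G^{(P|M)}_{\alpha,\beta}:=\frac{1}{2^{|\Lambda|}}\mathrm{Tr}\Big[\prod_{j\in P}\sigma^{(j)}_{\alpha_j}\,\mathcal{U}\Big(\prod_{j\in M}\sigma^{(j)}_{\beta_j}\Big)\Big],\quad(\alpha\in\mathbb{Z}_4^{|P|},\ \beta\in\mathbb{Z}_4^{|M|})
$$

is a $(|P|+|M|)$-point (infinite temperature) spin-spin correlator for the CP-map $\mathcal{U}$. Similar expressions hold for the cause repertoires.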

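In the special case $|P|=|M|=1$ (i.e., both the mechanism and its purview consist of a single qubit, say the $i$-th and the $j$-th respectively) one has a further simplification. Indeed, since $\mathcal{U}$ is unital and trace preserving, in this case $G_{0,0}=1$ and $G_{\alpha,0}=G_{0,\beta}=0$ for $\alpha,\beta\neq0$, from which it follows $\|\rho^{(e)}(P|M)\|_2^2=\frac{1}{2}\big(1+\|G\boldsymbol{\lambda}\|^2\big)$, where $G$ is the ($3\times3$) two-point spin correlator, $\boldsymbol{\lambda}$ is the Bloch vector of the mechanism state and $\|\cdot\|$ denotes the standard euclidean norm. Moreover the Bloch vector of $\rho^{(e)}(P|M)$ is nothing but $G\boldsymbol{\lambda}$, i.e., $\rho^{(e)}(P|M)=\frac{1}{2}\big(\mathbb{1}+(G\boldsymbol{\lambda})\cdot\boldsymbol{\sigma}\big)$ and $ei(P|M)=\frac{1}{2}\|G\boldsymbol{\lambda}\|$.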

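6) For unitary evolutions and $\Psi_\Lambda$ pure and factorized one can explicitly (Haar) average over $U$'s unp ()

$$
\mathbb{E}_U\big[\|\rho^{(x)}_U(P|M)\|_2^2\big]=\frac{1}{2}\sum_{\alpha=\pm1}\Big(\frac{d^{|\Lambda|}+\alpha\,d^{|M|}}{d^{|\Lambda|}+\alpha}\Big)\Big(\frac{1}{d^{|P|}}+\alpha\frac{1}{d^{|P'|}}\Big),\quad(x=e,c)
\tag{9}
$$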

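This result is the same for cause and effect repertoires (invariance of the Haar measure under $U\to U^\dagger$) and is state independent. If $|\Lambda|\to\infty$ and $|P|/|\Lambda|\to0$ (i.e., the purview is not a finite fraction of $\Lambda$) then from (9) it follows that $d^{|P|}\,\mathbb{E}_U\big[\|\rho^{(x)}_U(P|M)\|_2^2\big]\le1+d^{2|P|-|\Lambda|}$, which in turn, using (7) and concavity, implies $\mathbb{E}_U[xi(P|M)]\le d^{|P|-|\Lambda|/2}/\sqrt{2}$. This bound holds true for any mechanism $M$. For $M=\Lambda$ the physical interpretation is that a typical (Haar) random $U$ will map the initial network state onto a nearly maximally entangled one which locally, for $|P|\ll|\Lambda|$, will look almost indistinguishable from the maximally mixed state, i.e., the unconditional one. This remark may seem to suggest that quantum entanglement plays a sort of “negative” role in the type of QIIT we are here trying to develop (see more about this issue later on).

Examples. The following examples show that the type of causal power defined by Eq. (6) has counter-intuitive aspects and should, therefore, be handled with care. When $\mathcal{U}=\mathbb{1}$ one has that $\rho^{(x)}(P|M)=\mathrm{Tr}_{M\setminus P}(\Psi_M)\otimes\mathbb{1}_{P\cap M'}/d^{|P\cap M'|}$; now if $\Psi_\Lambda$ is pure and factorized one finds $xi(P|M)=1-d^{-|M\cap P|}$. Moreover the XI can be computed using the fact that $|M\cap P|=\sum_{i\in\Lambda}x_M(i)\,x_P(i)=:\langle x_M,x_P\rangle$: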

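$$
XI(\mathbb{1})=\frac{1}{2^{2|\Lambda|}}\sum_{x_P,x_M\in\{0,1\}^{|\Lambda|}}\big(1-d^{-\langle x_M,x_P\rangle}\big)=1-\frac{1}{2^{2|\Lambda|}}\sum_{x_P,x_M\in\{0,1\}^{|\Lambda|}}\prod_{i\in\Lambda}d^{-x_M(i)x_P(i)}=1-\Big(\frac{3d+1}{4d}\Big)^{|\Lambda|}.
\tag{10}
$$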
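The closed form in Eq. (10) can be cross-checked by brute force. The sketch below enumerates all pairs of bit-strings labelling a mechanism and a purview, uses the fact that for the identity map the cause/effect information only depends on the size of their intersection, and compares the average with the closed form in exact rational arithmetic. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def xi_identity_bruteforce(n, d):
    """Average of 1 - d^(-|M ∩ P|) over all subsets M, P of an n-site network."""
    total = Fraction(0)
    for x_m in product((0, 1), repeat=n):
        for x_p in product((0, 1), repeat=n):
            overlap = sum(a * b for a, b in zip(x_m, x_p))   # |M ∩ P|
            total += 1 - Fraction(1, d ** overlap)
    return total / 4 ** n

def xi_identity_closed(n, d):
    """Closed form of Eq. (10)."""
    return 1 - Fraction(3 * d + 1, 4 * d) ** n

one_qubit = xi_identity_closed(1, 2)    # Fraction(1, 8)
two_qubits = xi_identity_closed(2, 2)   # Fraction(15, 64)
```

The two functions agree for every small `n` and `d`; for qubits the values are 1/8 for a single site and 15/64 for two sites.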

Here $x_P,x_M$ are bit-strings of length $|\Lambda|$ which parametrize the sets $P$ and $M$. Notice that the same result holds for any totally factorized unitary $U=\bigotimes_{j\in\Lambda}U_j$. For one qubit one has $XI(\mathbb{1})=1/8$, whereas for two qubits $XI(\mathbb{1})=15/64$. The latter result is identical to the one for $\mathcal{U}$ being the swap between the two qubits (direct computation), showing that the XI of the identity can be equal to the one of a non-trivial (and integrated) transformation. Moreover, for two qubits one finds (direct computation) examples showing that a non-trivial interaction can have less total cause-effect power than the identity (i.e., doing nothing).

The next definition captures quantitatively the notion of irreducibility of c/e repertoires, namely how far the conditional repertoires are from those obtainable from disjoint parts of $M$ independently conditioning disjoint parts of $P$. The idea of IIT is that only irreducible actions are “real” and exist per se IIT-1 ().

Definition 2: Integrated information for mechanisms.– Given the mechanism $M$ and the purview $P$ we consider all possible bi-partitions of them, $M=M_1\cup M_2$ and $P=P_1\cup P_2$, where $M_1\cap M_2=P_1\cap P_2=\emptyset$. We define the (cause/effect) integrated information (ii) of $M$ over $P$ by

$$
\varphi^{(x)}(P|M)=\min_{(P_i,M_i)}D\big[\rho^{(x)}(P|M),\,\rho^{(x)}(P_1|M_1)\otimes\rho^{(x)}(P_2|M_2)\big]\in[0,1],\quad(x=e,c)
\tag{11}
$$

In this definition the minimum is taken over all the possible pairings different from the trivial one $\{(P|M),(\emptyset|\emptyset)\}$, which would make any repertoire factorizable. Notice that, since the $\rho^{(x)}(P_i|M_i)$'s ($i=1,2$) are not the reduced density matrices of $\rho^{(x)}(P|M)$, the factorizability of the latter is a necessary, but not sufficient, condition for the vanishing of $\varphi^{(x)}(P|M)$. If $(P_1,P_2)$ is the partition of $P$ which achieves the minimum, then quantum entanglement of $\rho^{(x)}(P|M)$, measured by its distance from the set of separable states over $\mathcal{H}_{P_1}\otimes\mathcal{H}_{P_2}$, provides a lower-bound to $\varphi^{(x)}(P|M)$. Moreover, cause/effect information gives an upper bound to the integrated information (note that $\rho^{(x)}(\emptyset|M)=1$ by normalization)

$$
\varphi^{(x)}(P|M)\le D\big(\rho^{(x)}(P|M),\,\rho^{(x)}(\emptyset|M)\otimes\rho^{(x)}(P|\emptyset)\big)=xi(P|M),\quad(x=e,c).
\tag{12}
$$

The bound is saturated when $|P|=|M|=1$. In particular, Eq. (12) and remark 2) above imply that integrated information obeys a Lieb-Robinson type of bound for $\mathcal{U}$'s generated by local Hamiltonians. This shows that $\varphi$ obeys locality in the usual sense allowed in non-relativistic quantum theory LR (). Furthermore, Eq. (12) along with the bounds for cause/effect information in 4) above show that for finite purviews and typical (Haar) random unitaries integrated information is exponentially small in the network size.

B: $\varphi$ for the Swap. From Eq. (12) one sees that $\varphi(1|1)=\varphi(2|2)=0$. Moreover, from $\rho(\Lambda|1)=\rho(1|\emptyset)\otimes\rho(2|1)$ and $\rho(\Lambda|2)=\rho(1|2)\otimes\rho(2|\emptyset)$ one has $\varphi(\Lambda|1)=\varphi(\Lambda|2)=0$. Finally, $\varphi(2|1)=\varphi(1|2)=1/2$.

Using Eq. (11) one can now, for each mechanism $M$, identify two purviews over which $M$ has maximal irreducible causal power.

Definition 3: Core causes, effects.– The purview $P^{(x)}_*$ is a core effect/cause of $M$ if $\varphi^{(x)}(P^{(x)}_*|M)=\max_P\varphi^{(x)}(P|M)$. The corresponding value of $\varphi^{(x)}$ will be denoted by $\varphi^{(x)}(M)$. The associated (global) repertoires are given by $\rho^{(x)}(M):=\rho^{(x)}(P^{(x)}_*|M)\otimes\mathbb{1}/d^{|P^{(x)\prime}_*|}$, where $P^{(x)\prime}_*$ is the complement of the core effect/cause of $M$. The integrated cause/effect information of $M$ is given by $\varphi(M):=\min\{\varphi^{(e)}(M),\varphi^{(c)}(M)\}$. If $\varphi(M)=0$, then either $\varphi^{(e)}(M)=0$ or $\varphi^{(c)}(M)=0$. In the first (second) case, the mechanism fails to constrain the future (past) on any purview in an integrated fashion. Either way, such a mechanism is not regarded as an integrated part of the network and it is dropped out of the picture.

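The irreducible causal structure of the network has been so far described at the level of mechanisms. The next definition is instrumental in uplifting the construction to the global network level.

Definition 4: Conceptual Structure operators.– For any mechanism $M$ the triple $(\varphi(M),\rho^{(e)}(M),\rho^{(c)}(M))$ with $\varphi(M)>0$ is called a concept. The totality of concepts forms a conceptual structure (CS) IIT-2 (). Formally one can encode a CS on a positive semi-definite operator acting on the space spanned by the labels $|M\alpha\rangle$ tensored with $\mathcal{H}_\Lambda$, given by

$$
C(\mathcal{U}):=\frac{1}{2}\sum_{M,\alpha}\varphi_{\mathcal{U}}(M)\,|M\alpha\rangle\langle M\alpha|\otimes\rho^{\alpha}_{\mathcal{U}}(M),
\tag{13}
$$

where $\alpha\in\{e,c\}$ and we have made explicit the $\mathcal{U}$-dependence (but kept implicit the $\Psi_\Lambda$ one). A CS can also be seen as a “constellation” of triples $(\varphi(M),\rho^{(e)}(M),\rho^{(c)}(M))\in[0,1]\times\mathcal{S}(\mathcal{H}_\Lambda)^{2}$. The latter compact set may be referred to as the quantum “Qualia Space” IIT-2 ().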

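Given two CS’s, $C_1$ and $C_2$, associated to $\mathcal{U}_1$ and $\mathcal{U}_2$, respectively, we define the distance between them as the (trace-norm) distance between the associated CS operators, $D(C_1,C_2):=\frac{1}{2}\|C(\mathcal{U}_1)-C(\mathcal{U}_2)\|_1$. More explicitly,

$$
D(C_1,C_2)=\frac{1}{4}\sum_{M,\alpha}\big\|\varphi_1(M)\,\rho^{(\alpha)}_1(M)-\varphi_2(M)\,\rho^{(\alpha)}_2(M)\big\|_1.
\tag{14}
$$

In particular, $D(C_1,C_2)=0$ iff for every $M$ and $\alpha$ one has either $\varphi_1(M)=\varphi_2(M)$ and $\rho^{(\alpha)}_1(M)=\rho^{(\alpha)}_2(M)$, or $\varphi_1(M)=\varphi_2(M)=0$. In words: two conceptual structures are the same iff all the core effect/cause repertoires and the associated integrated information coincide for all concepts.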

It is important to notice that if the repertoires depend continuously on some parameter, e.g., through the map $\mathcal{U}$, then $\varphi^{(x)}(P|M)$ will be a continuous function as well. However, core effects/causes may change dis-continuously and this will be reflected by CS operators (13) and functions thereof, e.g., Eq. (14).

C: The CS of the Swap. The network supports just two concepts. The conceptual structure operator is given by $C(S)=\frac{1}{4}\sum_{\alpha=e,c}\big(|1\alpha\rangle\langle1\alpha|\otimes\frac{\mathbb{1}}{2}\otimes\Psi+|2\alpha\rangle\langle2\alpha|\otimes\Psi\otimes\frac{\mathbb{1}}{2}\big)$. The core effect/cause of $1$ ($2$) is $2$ ($1$).

The key idea in IIT is to compare the global cause/effect structure (encoded in our quantum version in (13)) with those of factorized maps associated to bi-partitioned and decoupled networks. In this way one wants to assess how the “whole goes beyond and above the sum of its parts”, i.e., it exists intrinsically. The standard way in classical IIT to produce factorized maps is by bi-partitioning the total set and by “cutting the connections between the two halves by injecting them with noise” IIT-2 (). We adopt here a natural quantum version of this procedure. Given the (non-trivial) partition $\mathcal{P}=\{\Lambda_1,\Lambda_2\}$ of $\Lambda$, one can define

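$$
\mathcal{U}_{\mathcal{P}}=\mathcal{U}_1\otimes\mathcal{U}_2,\qquad\mathcal{U}_i: L(\mathcal{H}_{\Lambda_i})\to L(\mathcal{H}_{\Lambda_i}): X\mapsto\mathcal{U}_i(X):=\mathrm{Tr}_{\Lambda'_i}\,\mathcal{U}\Big(X\otimes\frac{\mathbb{1}_{\Lambda'_i}}{d^{|\Lambda'_i|}}\Big),\quad(i=1,2)
\tag{15}
$$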
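The “cut and noise” construction of Eq. (15) is easy to explore on two qubits. The sketch below assumes a unitary map $X\mapsto VXV^\dagger$ and computes the factor acting on the first qubit; the function name `cut_map_first_site` is hypothetical. For the swap, the resulting single-qubit map is completely depolarizing, which gives a simple example of a cut map that is unital without being unitary.

```python
import numpy as np

I2 = np.eye(2)
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)

def cut_map_first_site(V, X):
    """Eq. (15) with Lambda_1 = first qubit: Tr_2[ V (X (x) 1/2) V^dagger ]."""
    out = V @ np.kron(X, I2 / 2) @ V.conj().T
    # axes after reshape: (row site 1, row site 2, col site 1, col site 2)
    return np.einsum('abcb->ac', out.reshape(2, 2, 2, 2))

X = np.array([[0.7, 0.2], [0.2, 0.3]])           # an arbitrary single-qubit state
after_swap_cut = cut_map_first_site(SWAP, X)      # Tr(X) * 1/2: all information is lost
after_identity_cut = cut_map_first_site(np.eye(4), X)   # X itself: nothing to cut
```

Cutting a swap destroys every correlation it could have built, whereas cutting the identity map changes nothing.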

Notice that the $\mathcal{U}_i$'s, while unital, are not in general unitary even if the unpartitioned map $\mathcal{U}$ is. We are now finally ready to define the fundamental global quantity of the paper: the Integrated Information, denoted by $\Phi$, of the whole network. Qualitatively, $\Phi$ measures how the integrated cause/effect structure of the quantum network fails to be described by any partitioned and decoupled version of it.

Definition 5: Integrated Information.– We define Quantum Integrated Information (II) by

$$
\Phi(\mathcal{U}):=\min_{\mathcal{P}}D\big(C(\mathcal{U}),C(\mathcal{U}_{\mathcal{P}})\big).
\tag{16}
$$

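The minimum here is taken over the set of bi-partitions of $\Lambda$. If $\Phi(\mathcal{U})=0$, we say that the network is dis-integrated. The bi-partition $\mathcal{P}_{\mathrm{MIP}}$ for which the minimum in Eq. (16) occurs is referred to as the Maximally Irreducible Partition (MIP) in classical IIT, i.e., $\Phi(\mathcal{U})=D\big(C(\mathcal{U}),C(\mathcal{U}_{\mathcal{P}_{\mathrm{MIP}}})\big)$. If $\Phi(\mathcal{U})=0$ the network is dis-integrated, namely there exists a “cut and noising” of the network in two halves that does not affect its global (integrated) cause/effect structure. The system does not exist as a whole per se; in a network-intrinsic information-theoretic sense there is no “added value” in combining the two halves. For a completely factorized $\Psi_\Lambda$ and $\mathcal{U}$ one has $\Phi(\mathcal{U})=0$.

D: $\Phi$ of the Swap. We have of course just one partition, $\{\{1\},\{2\}\}$, which dis-integrates both concepts in $C(S)$. Therefore, using (14) and (16) one has

$$
\Phi(S)=2\times\frac{1}{4}\Big(\Big\|\frac{1}{2}\Big(\frac{\mathbb{1}}{2}\otimes\Psi\Big)-0\Big\|_1+\Big\|\frac{1}{2}\Big(\Psi\otimes\frac{\mathbb{1}}{2}\Big)-0\Big\|_1\Big)=\frac{1}{2}\Big(\frac{1}{2}+\frac{1}{2}\Big)=\frac{1}{2}.
$$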
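The arithmetic leading to $\Phi(S)=1/2$ can be reproduced with a small implementation of the distance in Eq. (14). This is only a sketch: the integrated-information values and global repertoires of the two swap concepts are taken as inputs from the text rather than computed, each conceptual structure is stored as a plain dictionary (a hypothetical data layout), and a concept missing from one structure is treated as having zero weight.

```python
import numpy as np

def trace_norm(a):
    """Sum of singular values."""
    return np.linalg.svd(a, compute_uv=False).sum()

def cs_distance(cs1, cs2, dim):
    """Eq. (14). Each CS maps (mechanism, alpha) -> (phi, global repertoire)."""
    zero = (0.0, np.zeros((dim, dim)))
    total = 0.0
    for key in set(cs1) | set(cs2):
        phi1, rho1 = cs1.get(key, zero)
        phi2, rho2 = cs2.get(key, zero)
        total += trace_norm(phi1 * rho1 - phi2 * rho2)
    return total / 4

psi = np.diag([1.0, 0.0])       # pure single-qubit state
mixed = np.eye(2) / 2

# Concepts of the swap as stated in the text: phi = 1/2 for mechanisms 1 and 2.
cs_swap = {
    (1, "e"): (0.5, np.kron(mixed, psi)), (1, "c"): (0.5, np.kron(mixed, psi)),
    (2, "e"): (0.5, np.kron(psi, mixed)), (2, "c"): (0.5, np.kron(psi, mixed)),
}
cs_cut = {}                      # the single bi-partition leaves no concept

phi_swap = cs_distance(cs_swap, cs_cut, dim=4)
```

Each of the four terms contributes a trace norm of 1/2, and the prefactor 1/4 gives `phi_swap` equal to 1/2, in agreement with the value above.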

Several remarks are now in order to shed some light on the nature of the quantum II defined by Eq. (16).

7) $\Phi$ obeys “time-reversal symmetry”  time-reversal () and, for unitary $\mathcal{U}$’s,  group-action ().

8) In spite of the simplified notation, one should not forget that the conceptual structure operator and, therefore, $\Phi$ depend on $\Psi_\Lambda$ as well. In this paper we will focus at first on completely factorized pure states $\Psi_\Lambda=\bigotimes_{j\in\Lambda}|\psi_j\rangle\langle\psi_j|$. In this case factorizability of $\mathcal{U}$ is a sufficient condition for vanishing $\Phi$. In fact, for a given $\Psi_\Lambda$, vanishing $\Phi$ is a weaker property than factorizability of the dynamical map. Take, e.g., any non-factorizable unitary $U$ that is diagonal in a tensor product basis and $\Psi_\Lambda$ to be any basis element. One has that $\mathcal{U}^{(x)}(\Psi_M\otimes\mathbb{1}_{M'})=\Psi_M\otimes\mathbb{1}_{M'}$ for all $M$. The action of $\mathcal{U}$, for this $\Psi_\Lambda$, is the same as that of the identity map and therefore $\Phi=0$.

9) It is essential to stress that different state choices for $\Psi_\Lambda$, e.g., entangled ones, may give dramatically different results. For example, even factorized maps may have non-vanishing $\Phi$. To illustrate this intriguing fact let us consider e.g., One can easily see that there is just one concept (supported by the full $\Lambda$) and one partition from which it follows that unp (). This is an example of what might be dubbed entanglement activated integration, and it shows a sense in which genuinely quantum effects may play a “positive” role in our version of IIT.

10) At the quantum level one might define the minimization (16) over all possible virtual bi-partitions of $\mathcal{H}_\Lambda$ virtual1 (); virtual2 (). This would provide a lower bound to $\Phi$ and a much more stringent, and uniquely quantum, definition of integration. Of course at the computational level this would be a tremendous challenge.

11) If one can consider the reduced network where: if then and If the reduced network is referred to as a complex IIT-0 (); IIT-1 (). A network may have many complexes which represent, in a sense, “local maxima” of $\Phi$.

We are now ready to illustrate the rather complex mathematical framework developed so far by means of physically motivated examples. Let us start with a very simple one.

### III.1 Partial Swap

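Let us consider a basic network with $\Lambda=\{1,2\}$ and $\Psi_\Lambda=\Psi\otimes\Psi$ ($\Psi$ pure state projection), equipped with a “partial swap” map $\mathcal{U}_t(X)=U_tXU_t^\dagger$, $U_t=c_t\mathbb{1}+i\,s_tS$, with $c_t=\cos t$ and $s_t=\sin t$. One has three mechanisms/purviews $\{1\},\{2\},\Lambda$. Direct computation shows unp ():

$$
\begin{aligned}
\rho^{(e/c)}(1|1) &= c_t^2\,\Psi+s_t^2\,\frac{\mathbb{1}}{2},\qquad\rho^{(e/c)}(2|1)=s_t^2\,\Psi+c_t^2\,\frac{\mathbb{1}}{2},\\
\rho^{(e/c)}(\Lambda|1) &= c_t^2\,\Psi\otimes\frac{\mathbb{1}}{2}+s_t^2\,\frac{\mathbb{1}}{2}\otimes\Psi\pm i\,c_ts_t\Big[S,\Psi\otimes\frac{\mathbb{1}}{2}\Big].
\end{aligned}
\tag{17}
$$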

Identical expressions hold for the repertoires $\rho^{(e/c)}(\cdot|2)$ (with $1\leftrightarrow2$). Finally, and One can obtain for each mechanism and . It follows that for small (near to $\pi/2$) $t$'s the core effect/cause of $1$ is itself ($2$) (analogously for $2$), whereas the core effect/cause of $\Lambda$ is itself. Moreover, there is a window around $t=\pi/4$ in which the core effect/cause of $1$ and $2$ delocalize and comprise the full $\Lambda$. The corresponding jumps of $\Phi$ are shown in Fig. 2. The intermediate, high $\Phi$, delocalized phase originates from the commutator term in (17). It can be regarded as a genuine quantum feature, i.e., it would disappear if $\mathcal{U}_t$ were just a probabilistic mixture of identity and swap.
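The structure of Eq. (17), including the commutator term, can be verified numerically. The sketch below assumes the parametrization $U_t=\cos t\,\mathbb{1}+i\sin t\,S$ for the partial swap (a sign convention chosen here so that the upper sign in Eq. (17) belongs to the effect repertoire), two qubits, and mechanism $\{1\}$ prepared in $\Psi=|0\rangle\langle0|$. Function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2)
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)

def repertoires_mechanism_1(t, psi):
    """Effect and cause repertoires of mechanism {1} over the full network."""
    c, s = np.cos(t), np.sin(t)
    V = c * np.eye(4) + 1j * s * SWAP        # assumed partial-swap unitary (S^2 = 1)
    rho0 = np.kron(psi, I2 / 2)              # Psi on site 1, noise on site 2
    effect = V @ rho0 @ V.conj().T
    cause = V.conj().T @ rho0 @ V
    return effect, cause

def reduce_to_site(rho, site):
    """Single-site reduced state of a two-qubit operator (site = 1 or 2)."""
    r = rho.reshape(2, 2, 2, 2)
    return np.einsum('abcb->ac', r) if site == 1 else np.einsum('abac->bc', r)

t = 0.3
psi = np.diag([1.0, 0.0]).astype(complex)
effect, cause = repertoires_mechanism_1(t, psi)

c, s = np.cos(t), np.sin(t)
rho0, swapped = np.kron(psi, I2 / 2), np.kron(I2 / 2, psi)
commutator = SWAP @ rho0 - rho0 @ SWAP
expected_effect = c**2 * rho0 + s**2 * swapped + 1j * c * s * commutator
expected_cause = c**2 * rho0 + s**2 * swapped - 1j * c * s * commutator
```

The commutator term drops out of both single-site reductions, so it only affects the repertoire over the full network; a classical mixture of identity and swap with weights $c_t^2$ and $s_t^2$ would reproduce the single-site repertoires but not this term.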