Optimal stochastic modelling with unitary quantum dynamics
Identifying and extracting the past information relevant to the future behaviour of stochastic processes is a central task in the quantitative sciences. Quantum models offer a promising approach to this, allowing for accurate simulation of future trajectories whilst using less past information than any classical counterpart. Here we introduce a class of phase-enhanced quantum models, representing the most general means of causal simulation with a unitary quantum circuit. We show that the resulting constructions can display advantages over previous state-of-the-art methods – both in the amount of information they need to store about the past, and in the minimal memory dimension they require to store this information. Moreover, we find that these two features are generally competing factors in optimisation – leading to an ambiguity in what constitutes the optimal model – a phenomenon that does not manifest classically. Our results thus simultaneously offer new quantum advantages for stochastic simulation, and illustrate further qualitative differences in behaviour between classical and quantum notions of complexity.
Models of stochastic processes are essential to quantitative science, providing a systematic means for simulating future behaviour based on past observations. Given different models exhibiting statistically identical behaviour, there is a general preference for the simplest models – those which require minimal information about the past. The motivation is two-fold: foundationally, they represent a way of identifying potential causes of future events; and operationally, simulating a process using such models requires less memory – as they need to track less information about the past – leading to a reduction in resource costs.
The field of computational mechanics Crutchfield and Young (1989); Shalizi and Crutchfield (2001); Crutchfield (2012) provides a systematic approach to constructing the provably simplest classical causal model for any given stochastic process. These models, called ε-machines, can produce statistically correct predictions using less memory than any classical alternative. The amount of past information they store has been employed as a measure of structure in diverse contexts Crutchfield and Feldman (1997); Palmer et al. (2000); Varn et al. (2002); Clarke et al. (2003); Park et al. (2007); Li et al. (2008); Haslinger et al. (2010); Kelly et al. (2012), motivated by its interpretation as a fundamental limit on how much information from the past must be tracked in order to predict the future.
Quantum mechanics, however, enables even simpler models that bear statistically identical predictions Gu et al. (2012); Mahoney et al. (2016); Thompson et al. (2017); Aghamohammadi et al. (2018); Binder et al. (2018); Elliott and Gu (2018); Elliott et al. (2018); Riechers et al. (2016); Yang et al. (2018). This advantage, which has been observed experimentally Palsson et al. (2017); Ghafari Jouneghani et al. (2017), can scale without bound Garner et al. (2017); Aghamohammadi et al. (2017); Elliott and Gu (2018); Thompson et al. (2018) and induces significant qualitative classical-quantum divergences in quantifiers of structure Suen et al. (2017); Aghamohammadi et al. (2016). However, while presently-known quantum constructions are provably optimal for some specific cases Suen et al. (2017); Thompson et al. (2018), they are known not to be so in general. This motivates the search for even simpler quantum models that obtain further memory advantages in stochastic simulation, and better characterise quantum notions of structure and complexity.
In this paper, we introduce phase-enhanced quantum models – a sophistication of previous quantum models – that capture all possible methods of causal simulation using unitary quantum circuits. We show that the resulting models can improve upon current state-of-the-art constructions in further reducing the amount of memory they require, according to both entropic and dimensional measures Thompson et al. (2018). Moreover, our new models reveal the origin and highlight the widespread nature of a recently discovered phenomenon Loomis and Crutchfield (2018) – which we term the ambiguity of optimality – wherein optimising for quantum models that track minimal information about the past may sacrifice achieving minimal dimensionality of their memory (and vice versa).
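Classical models. A bi-infinite discrete-time, discrete-event stochastic process Khintchine (1934) is characterised by a sequence of random variables $X_t$ that take values $x_t$ drawn from a finite alphabet $\mathcal{A}$ at each time step $t \in \mathbb{Z}$. The process is defined by a joint probability distribution $P(\overleftarrow{X}, \overrightarrow{X})$, where $\overleftarrow{X} = \ldots X_{-2} X_{-1}$ and $\overrightarrow{X} = X_0 X_1 \ldots$ represent the past and future sequences of the process respectively (we use upper case to denote random variables, and lower case for their variates). A consecutive sequence of length $L$ is denoted by $X_{t:t+L} = X_t X_{t+1} \ldots X_{t+L-1}$. Here, we consider stationary stochastic processes, such that $P(X_{0:L}) = P(X_{t:t+L})$ for all $t \in \mathbb{Z}$ and all lengths $L$.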
An instance of a given stochastic process has a specific past $\overleftarrow{x}$, and possesses a corresponding conditional future $P(\overrightarrow{X}|\overleftarrow{x})$. A causal model of a stochastic process defines an encoding function that maps each possible $\overleftarrow{x}$ to some suitable memory state such that the same systematic action on the memory at each timestep gives rise to future sequences according to this conditional future distribution. Notably, all information about the future that is stored in the memory states may be obtained from observations of the past Crutchfield and Young (1989); Shalizi and Crutchfield (2001); Thompson et al. (2018).
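The field of computational mechanics Crutchfield and Young (1989); Shalizi and Crutchfield (2001) offers a systematic means to construct the simplest classical causal models – ε-machines. These models are defined by encoding past information into causal states $s_j \in \mathcal{S}$, defined by an equivalence relation on the past-future conditional distribution:
$$\overleftarrow{x} \sim_\varepsilon \overleftarrow{x}' \iff P(\overrightarrow{X}|\overleftarrow{x}) = P(\overrightarrow{X}|\overleftarrow{x}'). \qquad (1)$$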
A key property of ε-machines is that they are unifilar Shalizi and Crutchfield (2001) – given an initial causal state $s_j$ and output symbol $x$, the memory transitions into a unique subsequent causal state. We may thus define an update rule $\lambda(j, x)$ that returns the index of the new state Binder et al. (2018).
The memory of an ε-machine is often parameterised according to two metrics Crutchfield and Young (1989): the statistical complexity
$$C_\mu = -\sum_j \pi_j \log_2 \pi_j, \qquad (2)$$
which measures the amount of information stored in the memory, and the topological complexity
$$D_\mu = \log_2 |\mathcal{S}|, \qquad (3)$$
which measures the dimension of the memory. Here, $\pi_j$ denotes the steady-state distribution of the causal states and $|\mathcal{S}|$ the number of causal states. The ε-machine minimises both these metrics over analogous measures for the memory of all other classical causal models. Nevertheless, it still stores information that is not directly relevant for simulating future statistics; $C_\mu$ can be strictly greater than the mutual information between past and future Shalizi and Crutchfield (2001). Operationally, $C_\mu$ and $D_\mu$ correspond to the size of the simulator memory (per simulator), when run in an ensemble or single-shot setting respectively.
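The two classical metrics above can be evaluated numerically for a simple case. The following is a minimal sketch, assuming a Markov chain whose states are themselves the causal states (all rows of the transition matrix distinct); the matrix `P` and the function names are illustrative choices, not taken from the paper.

```python
import numpy as np

def stationary_distribution(P):
    """Steady-state distribution pi of a row-stochastic matrix P (pi @ P == pi)."""
    evals, evecs = np.linalg.eig(P.T)
    # Pick the eigenvector whose eigenvalue is closest to 1 and normalise it.
    v = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    return v / v.sum()

def statistical_complexity(P):
    """Shannon entropy (in bits) of the steady-state distribution over states."""
    pi = stationary_distribution(P)
    pi = pi[pi > 1e-12]
    return float(-np.sum(pi * np.log2(pi)))

def topological_complexity(P):
    """log2 of the number of states."""
    return float(np.log2(P.shape[0]))

# Hypothetical three-state process: P[j, k] = probability of moving from j to k.
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

C_mu = statistical_complexity(P)
D_mu = topological_complexity(P)
print(f"C_mu = {C_mu:.4f} bits, D_mu = {D_mu:.4f} bits")
```

Since the entropy of a distribution over three states is at most $\log_2 3$, the statistical complexity never exceeds the topological complexity in this setting.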
Quantum models. Quantum effects present an opportunity to bypass classical limits, enabling models that require less past information than ε-machines Gu et al. (2012); Mahoney et al. (2016); Thompson et al. (2017); Aghamohammadi et al. (2018); Binder et al. (2018); Elliott and Gu (2018); Elliott et al. (2018). The present state-of-the-art systematic constructions for quantum models can be expressed as a step-wise unitary circuit Aghamohammadi et al. (2018); Binder et al. (2018), where each causal state $s_j$ is assigned to a corresponding quantum memory state $|\sigma_j\rangle$. Future sequences are generated by a unitary operator $U$ that satisfies
$$U |\sigma_j\rangle |0\rangle = \sum_x \sqrt{P(x|j)}\, |\sigma_{\lambda(j,x)}\rangle |x\rangle, \qquad (4)$$
where we have introduced the shorthand notation $P(x|j)$ for the probability that the next output is $x$ given that the current causal state is $s_j$. At each time step $t$, the memory state (first subspace) interacts with a fresh ancilla (second subspace) initialised in $|0\rangle$ [Fig. 1]. Subsequent measurement of the resulting ancilla then yields the correct conditional future statistics at each time step. Such a unitary operation has been proven to exist for any stationary stochastic process Binder et al. (2018).
The memory costs of such a model are quantified by
$$C_q = -\mathrm{Tr}(\rho \log_2 \rho), \qquad D_q = \log_2 \mathrm{rank}(\rho), \qquad (5)$$
where $\rho = \sum_j \pi_j |\sigma_j\rangle\langle\sigma_j|$. These quantities inherit the same operational significance as their corresponding classical counterparts. We refer to them as the quantum statistical memory and quantum topological memory, respectively. These quantities are model-dependent (see note (30) in the reference list).
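As an illustration of these two quantum memory measures, here is a minimal sketch for a Markov process in which the emitted symbol labels the next state. It assumes the explicit memory states $|\sigma_j\rangle = \sum_k \sqrt{P_{jk}}\,|k\rangle$, which reproduce the overlaps required by the step-wise unitary in this Markov setting; the example matrix `P` and all function names are hypothetical.

```python
import numpy as np

def stationary_distribution(P):
    """Steady-state distribution pi of a row-stochastic matrix P."""
    evals, evecs = np.linalg.eig(P.T)
    v = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    return v / v.sum()

def quantum_memories(P, tol=1e-10):
    """Return (C_q, D_q) for memory states |sigma_j> = sum_k sqrt(P[j, k]) |k>."""
    pi = stationary_distribution(P)
    S = np.sqrt(P).astype(complex)          # row j holds the amplitudes of |sigma_j>
    rho = (S.T * pi) @ S.conj()             # rho = sum_j pi_j |sigma_j><sigma_j|
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > tol]              # discard numerically zero eigenvalues
    C_q = float(-np.sum(evals * np.log2(evals)))   # von Neumann entropy in bits
    D_q = float(np.log2(len(evals)))               # log2 of the rank of rho
    return C_q, D_q

# Hypothetical symmetric three-state process.
P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

C_q, D_q = quantum_memories(P)
C_mu = float(np.log2(3))  # the steady state is uniform for this symmetric P
print(f"C_q = {C_q:.4f} bits (classical C_mu = {C_mu:.4f}), D_q = {D_q:.4f}")
```

For this example the memory states overlap strongly, so the entropy falls well below the classical value even though the states still span three dimensions.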
As the memory states are generally not mutually orthogonal, they enable memory savings in terms of both metrics Nielsen and Chuang (2000). In fact, the above constructions saturate bounds on pairwise memory state overlap Suen et al. (2017); Binder et al. (2018). That is, for any quantum model the overlap between quantum memory states cannot exceed the fidelities of their respective conditional future distributions due to information processing inequalities. For the above construction, $\langle\sigma_j|\sigma_k\rangle = \sum_{\overrightarrow{x}} \sqrt{P(\overrightarrow{x}|s_j) P(\overrightarrow{x}|s_k)}$ Binder et al. (2018); Mahoney et al. (2016).
Despite this, the optimality of these models is only proven for specific processes Suen et al. (2017); Thompson et al. (2018), and is known not to hold in general. $C_q$ and $D_q$ are thus not the true quantum analogues of statistical and topological complexity, but rather bound them from above. There is hence a strong motivation to find quantum models whose memories further reduce these measures, in order to both provide a more efficient means of stochastic modelling, and to capture the ultimate limits of quantum models.
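The saturation of the overlap bound discussed above can be checked numerically in a simple case. The sketch below assumes a Markov process whose output symbol labels the next state, so that the memory-state overlap is $\sum_k \sqrt{P_{jk} P_{lk}}$; it compares this with the fidelity between the length-$L$ conditional future distributions. The matrix `P` and the function names are illustrative assumptions.

```python
import itertools
import numpy as np

def future_prob(P, j, word):
    """Probability of emitting `word` when starting in state j (output = next state)."""
    p, state = 1.0, j
    for x in word:
        p *= P[state, x]
        state = x
    return p

def future_fidelity(P, j, k, L):
    """Sum over all length-L words of sqrt(P(word | j) * P(word | k))."""
    n = P.shape[0]
    return sum(np.sqrt(future_prob(P, j, w) * future_prob(P, k, w))
               for w in itertools.product(range(n), repeat=L))

def state_overlap(P, j, k):
    """Overlap <sigma_j|sigma_k> for memory states with amplitudes sqrt(P[j, :])."""
    return float(np.sum(np.sqrt(P[j] * P[k])))

# Hypothetical three-state process.
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

for L in range(1, 5):
    print(L, future_fidelity(P, 0, 1, L), state_overlap(P, 0, 1))
```

For a Markov process the two states' futures coincide after the first symbol, which is why the fidelity does not change with `L` here.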
Phase-enhanced quantum models. We construct our phase-enhanced unitary models by postulating a new set of quantum memory states $\{|\sigma_j^\varphi\rangle\}$ with a corresponding unitary interaction $U_\varphi$ satisfying a generalisation of Eq. (4):
$$U_\varphi |\sigma_j^\varphi\rangle |0\rangle = \sum_x \sqrt{P(x|j)}\, e^{i\varphi_{jx}}\, |\sigma^\varphi_{\lambda(j,x)}\rangle |x\rangle, \qquad (6)$$
where $\varphi_{jx}$ are the additional phase factors that depend both on the initial causal state $s_j$ and the output symbol $x$. Given a set of memory states and unitary operator satisfying this relation, measurements of the second subspace in the computational basis are guaranteed to produce sequences that obey the same statistics as the corresponding non-phase-enhanced model.
Theorem 1: All phase-enhanced models are valid; a corresponding unitary $U_\varphi$ satisfying Eq. (6) exists for any choice of phase factors $\{\varphi_{jx}\}$.
The proof is given in the Supplementary Material.
Theorem 2: The set of phase-enhanced models of a given stochastic process as described above contains the unitary quantum models of the process that minimise each of the quantum statistical and topological memories.
The only possible valid modifications that can be made to Eq. (6) are refinements Shalizi and Crutchfield (2001) of the memory states beyond the causal states. Modifying the transition structure between the memory states in any other manner, or modifying the magnitude of the terms in the action of the unitary, will change the output statistics, and hence change the process being modelled, ruling out such modifications. It has previously been shown that such refinements can only increase the statistical memory Suen et al. (2017); thus, the minimal unitary quantum models must be described by Eq. (6).
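As the quantum memory state overlaps generally differ between different phase choices, the corresponding memory measures will also differ. For any phase-enhanced model we can compute the corresponding quantum statistical and topological memories:
$$C_q^\varphi = -\mathrm{Tr}(\rho_\varphi \log_2 \rho_\varphi), \qquad D_q^\varphi = \log_2 \mathrm{rank}(\rho_\varphi), \qquad (7)$$
where $\rho_\varphi = \sum_j \pi_j |\sigma_j^\varphi\rangle\langle\sigma_j^\varphi|$. Since these quantities depend on the choice of $\{\varphi_{jx}\}$, we define
$$C_q^{\min} = \min_{\{\varphi_{jx}\}} C_q^\varphi \qquad (8)$$
(and similarly $D_q^{\min}$) as the minimal quantum statistical (topological) memory over all possible phase-enhancements. Should these quantities be smaller than those without phase-enhancement, i.e.,
$$C_q^{\min} < C_q \quad \text{or} \quad D_q^{\min} < D_q, \qquad (9)$$
the resulting phase-enhanced models would be more memory efficient.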
The potential for such a memory reduction might at first appear counterintuitive. In the Supplementary Material, we show that the overlaps of the quantum memory states are given by
$$\langle\sigma_j^\varphi|\sigma_k^\varphi\rangle = \lim_{L\to\infty} \sum_{x_{0:L}} \sqrt{P(x_{0:L}|j)\, P(x_{0:L}|k)}\; e^{i(\varphi_{k x_{0:L}} - \varphi_{j x_{0:L}})}, \qquad (10)$$
where $\varphi_{j x_{0:L}}$ is shorthand for the multi-step combination of phases. Therefore, $|\langle\sigma_j^\varphi|\sigma_k^\varphi\rangle|$ is always maximised when all phase factors are zero. Moreover, for most other choices, $|\langle\sigma_j^\varphi|\sigma_k^\varphi\rangle|$ is strictly less than $\langle\sigma_j|\sigma_k\rangle$; phase factors cannot increase pairwise overlaps between memory states. Nevertheless, as we illustrate in the next section, phase-enhancement can indeed lead to simpler quantum models according to both memory metrics. The possibility to reduce topological memory can be understood as the phase factors creating linear dependencies between the memory states. Meanwhile, its potential to reduce statistical memory nicely illustrates that increasing pairwise distinguishability between an ensemble of quantum states can sometimes still reduce higher-order distances between the ensemble that are captured by the von Neumann entropy Jozsa and Schlienz (2000).
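The claim that phases cannot increase pairwise overlap magnitudes can be probed numerically. The sketch below assumes a Markov process whose output labels the next state, with phase-enhanced memory states $|\sigma_j^\varphi\rangle = \sum_k \sqrt{P_{jk}}\, e^{i\varphi_{jk}} |k\rangle$; the matrix `P`, the random seed, and the helper `gram` are illustrative assumptions rather than the paper's own construction.

```python
import numpy as np

def gram(P, phi):
    """Gram matrix G[j, k] = <sigma_j|sigma_k> for amplitudes sqrt(P) * exp(i * phi)."""
    S = np.sqrt(P) * np.exp(1j * phi)
    return S.conj() @ S.T

# Hypothetical three-state process.
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

G0 = np.abs(gram(P, np.zeros((3, 3))))   # overlaps without phase enhancement

rng = np.random.default_rng(0)
max_excess = -np.inf
for _ in range(200):
    phi = rng.uniform(0.0, 2.0 * np.pi, size=(3, 3))
    G = np.abs(gram(P, phi))
    # Largest amount by which any phase-enhanced overlap exceeds the zero-phase one.
    max_excess = max(max_excess, float(np.max(G - G0)))

print("largest excess over zero-phase overlaps:", max_excess)
```

By the triangle inequality each phase-enhanced overlap is a sum of the same magnitudes with unit-modulus weights, so the excess should never be positive beyond rounding error.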
Three-state Markov processes. We illustrate the power of phase-enhancements by systematic study of three-state Markovian processes. The Markov chain for such processes is given in Fig. 2, where $P_{jk}$ is used to denote the transition probability of going from state $s_j$ to state $s_k$ (while emitting $k$). The Markov property allows us to simplify Eq. (6) to
$$U_\varphi |\sigma_j^\varphi\rangle |0\rangle = \sum_k \sqrt{P_{jk}}\, e^{i\varphi_{jk}}\, |\sigma_k^\varphi\rangle |k\rangle. \qquad (11)$$
Theorem 3: Phase-enhancements can reduce the dimension of the memory (i.e., quantum topological memory), providing advantages for single-shot stochastic modelling.
The condition for dimensional reduction is that there exists a linear dependence between the quantum memory states:
for some . We can restrict and to be positive reals through freedom to add phase to the memory states . Moreover, due to global phase invariance, we can set for all without loss of generality. Eq. (12) can be expressed in terms of the transition probabilities:
From this we obtain the following set of inequalities
that must be satisfied for all . The existence of real and positive satisfying these inequalities is a necessary and sufficient condition for a dimensional advantage.
Furthermore, given that satisfy these conditions for a set of transition probabilities, we can determine the phases that collapse the memory to two dimensions:
Thus, for processes satisfying these inequalities the phase-enhanced quantum model has $D_q^\varphi = 1$, in contrast to the non-phase-enhanced model with $D_q = \log_2 3$.
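A concrete instance of such a dimensional collapse can be sketched as follows. The process and the phase choice below are hypothetical and were constructed for illustration (they are not taken from the paper): a symmetric three-state Markov process with memory-state amplitudes $\sqrt{P_{jk}}\, e^{i\varphi_{jk}}$, where the phases $\pm 3\pi/4$ are chosen so that the three memory states sum to zero.

```python
import numpy as np

# Hypothetical symmetric three-state process (output symbol = next state).
P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

t = 3.0 * np.pi / 4.0
# Assumed phase choice: zero on the diagonal, +t / -t on the off-diagonals (circulant).
phi = np.array([[0.0,   t,  -t],
                [ -t, 0.0,   t],
                [  t,  -t, 0.0]])

def rank_of_states(S, tol=1e-10):
    """Number of linearly independent memory states (rows of S)."""
    return int(np.sum(np.linalg.svd(S, compute_uv=False) > tol))

S_plain = np.sqrt(P).astype(complex)
S_phase = np.sqrt(P) * np.exp(1j * phi)

rank_plain = rank_of_states(S_plain)
rank_phase = rank_of_states(S_phase)

# Coefficients c of the linear dependence sum_j c_j |sigma_j> = 0:
# the right singular vector of S^T belonging to its smallest singular value.
_, _, Vh = np.linalg.svd(S_phase.T)
c = Vh[-1].conj()
c = c / c[0]                      # fix the overall scale so that c_0 = 1

print("rank without phases:", rank_plain)
print("rank with phases:   ", rank_phase)
print("dependence coefficients:", np.round(c, 6))
```

In this example the zero-phase states are linearly independent, while the phase-enhanced states span only two dimensions, so the memory fits in a single qubit.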
We performed a numerical sweep over the space of three-state Markov processes (see Supplementary Material), and found that the inequalities Eqs. (14) are satisfied for approximately of such processes when . Expanding the range of and values to we find that the inequalities can be satisfied by at least of cases. Accounting for additional values for the parameters can only increase this number. However, our lower bound already indicates that dimensional advantages, wherein , are relatively commonplace.
Theorem 4: Phase-enhancements can reduce the quantum statistical memory.
Due to certain phase symmetries such as global phase, the quantum memory states and corresponding unitary can be given in their most general form as:
We calculate the statistical memory for this model across the full range of possible phase factors. In Fig. 3(b) we compare with , observing a clear advantage with our phase-enhanced models. We also show the full dependence of on the two phase parameters in Fig. 3(c), where it can be seen that is found when .
Performing a numerical sweep across the space of general three-state Markov processes, however, we find that entropic advantages appear to be quite rare, occurring in less than of cases (see Supplementary Material).
Our numerical results thus indicate that for three-state Markov processes, models that admit a dimensional advantage ($D_q^{\min} < D_q$) are much more common than those with an entropic advantage ($C_q^{\min} < C_q$). This raises the question: what happens to $C_q^\varphi$ for models with dimensional advantages? We find that in many cases for which $D_q^\varphi < D_q$, the corresponding $C_q^\varphi$ is strictly greater than $C_q$. However, since multiple choices of phases can provide a dimensional advantage, one may be tempted to think that another set of phases will show advantages in both metrics. We now study a family of processes that conclusively show that the dichotomy cannot always be resolved in this manner: unlike classical causal models, the optimal quantum model can depend on the choice of memory metric.
Theorem 5: The model that minimises quantum topological memory is not in general that which minimises quantum statistical memory. That is, there is no unique optimal quantum model, leading to an ambiguity of optimality.
A process displaying this phenomenon for models with real phases was recently highlighted Loomis and Crutchfield (2018). Our results here illustrate that this phenomenon is in fact widespread when general complex phase-enhancements are introduced.
Consider a modified three-state quasi-cycle with slippage, as illustrated in Fig. 4(a). Our phase-enhanced models offer dimensional advantages along one line of the parameter space, while there is a large area of the space that permits models that exhibit an entropic advantage [Fig. 4(b)]. Specifically, a dimensional advantage exists iff and satisfy , in which case the inequalities Eqs. (14) are satisfied only for a single pair of values of and given by
Since there is only a single set of values for that offer a linear dependence between the memory states at each point along the aforementioned line, we can be satisfied that this gives the unique optimal model in terms of topological memory. In Fig. 4(c) we plot for this model in the parameter region denoted by the red dashed line, and compare it to and . We see that for certain parameter values , confirming the ambiguity of optimality.
Geometrically, we can understand how such an ambiguity can manifest; reductions in topological memory require linear dependence between the memory states, irrespective of the distance between them, while reductions in statistical memory arise from reductions in the distance between the states. When these two factors are in competition, the ambiguity occurs.
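The competition between the two metrics described above can be seen in a small numerical sketch. It uses a hypothetical symmetric three-state Markov process and an assumed phase choice that makes the memory states linearly dependent (neither is taken from the paper), and compares entropy and dimension with and without the phases.

```python
import numpy as np

def quantum_memories(P, phi, tol=1e-10):
    """Return (entropy in bits, log2 rank) of rho = sum_j pi_j |sigma_j><sigma_j|."""
    # Steady-state distribution of the row-stochastic matrix P.
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    pi = pi / pi.sum()
    S = np.sqrt(P) * np.exp(1j * phi)       # row j: amplitudes of |sigma_j>
    rho = (S.T * pi) @ S.conj()
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > tol]
    return float(-np.sum(lam * np.log2(lam))), float(np.log2(len(lam)))

# Hypothetical symmetric process and assumed dimension-reducing phases.
P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
t = 3.0 * np.pi / 4.0
phi_dim = np.array([[0.0,   t,  -t],
                    [ -t, 0.0,   t],
                    [  t,  -t, 0.0]])

C_plain, D_plain = quantum_memories(P, np.zeros((3, 3)))
C_phase, D_phase = quantum_memories(P, phi_dim)

print(f"no phases:   C = {C_plain:.4f}, D = {D_plain:.4f}")
print(f"with phases: C = {C_phase:.4f}, D = {D_phase:.4f}")
```

For this example the phases lower the dimension from $\log_2 3$ to 1 while raising the entropy from roughly 0.22 to roughly 0.35 bits: making the states linearly dependent pushes them further apart, which is the geometric tension described in the text.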
We have shown that complex phase-based encodings can provide further memory advantages for quantum models of stochastic processes, beyond the previous state-of-the-art constructions. We have provided examples of such enhancements, and through these, demonstrated an ambiguity in which model should be considered optimal based on the measure of memory, a phenomenon not present for classical models. Nevertheless, our resulting models are proven to contain the optimal models among all unitary constructions for each measure of memory.
A natural next step is to explore the prevalence of phase-enhanced models and associated ambiguities in higher dimensions. We expect that such enhancements will become more typical in stochastic processes with larger numbers of causal states – the rationale being that the number of phase parameters that can be tweaked grows quadratically with dimension, allowing more freedom for optimisation.
Acknowledgements. This work was funded by Singapore National Research Foundation Fellowship NRF-NRFF2016-02, the Lee Kuan Yew Endowment Fund (Postdoctoral Fellowship), Singapore Ministry of Education Tier 1 grant RG190/17 and NRF-ANR grant NRF2017-NRF-ANR004 VanQuTe. Q.L., T.J.E. and F.C.B. thank the Centre for Quantum Technologies for their hospitality.
- Crutchfield and Young (1989) J. P. Crutchfield and K. Young, Physical Review Letters 63, 105 (1989).
- Shalizi and Crutchfield (2001) C. R. Shalizi and J. P. Crutchfield, Journal of Statistical Physics 104, 817 (2001).
- Crutchfield (2012) J. P. Crutchfield, Nature Physics 8, 17 (2012).
- Crutchfield and Feldman (1997) J. P. Crutchfield and D. P. Feldman, Physical Review E 55, R1239 (1997).
- Palmer et al. (2000) A. J. Palmer, C. W. Fairall, and W. A. Brewer, IEEE Transactions on Geoscience and Remote Sensing 38, 2056 (2000).
- Varn et al. (2002) D. P. Varn, G. S. Canright, and J. P. Crutchfield, Physical Review B 66, 174110 (2002).
- Clarke et al. (2003) R. W. Clarke, M. P. Freeman, and N. W. Watkins, Physical Review E 67, 016203 (2003).
- Park et al. (2007) J. B. Park, J. W. Lee, J.-S. Yang, H.-H. Jo, and H.-T. Moon, Physica A: Statistical Mechanics and its Applications 379, 179 (2007).
- Li et al. (2008) C.-B. Li, H. Yang, and T. Komatsuzaki, Proceedings of the National Academy of Sciences 105, 536 (2008).
- Haslinger et al. (2010) R. Haslinger, K. L. Klinkner, and C. R. Shalizi, Neural Computation 22, 121 (2010).
- Kelly et al. (2012) D. Kelly, M. Dillingham, A. Hudson, and K. Wiesner, PloS one 7, e29703 (2012).
- Gu et al. (2012) M. Gu, K. Wiesner, E. Rieper, and V. Vedral, Nature Communications 3, 762 (2012).
- Mahoney et al. (2016) J. R. Mahoney, C. Aghamohammadi, and J. P. Crutchfield, Scientific Reports 6, 20495 (2016).
- Thompson et al. (2017) J. Thompson, A. J. P. Garner, V. Vedral, and M. Gu, npj Quantum Information 3, 6 (2017).
- Aghamohammadi et al. (2018) C. Aghamohammadi, S. P. Loomis, J. R. Mahoney, and J. P. Crutchfield, Physical Review X 8, 011025 (2018).
- Binder et al. (2018) F. C. Binder, J. Thompson, and M. Gu, Physical Review Letters 120, 240502 (2018).
- Elliott and Gu (2018) T. J. Elliott and M. Gu, npj Quantum Information 4, 18 (2018).
- Elliott et al. (2018) T. J. Elliott, A. J. P. Garner, and M. Gu, arXiv:1803.05426 (2018).
- Riechers et al. (2016) P. M. Riechers, J. R. Mahoney, C. Aghamohammadi, and J. P. Crutchfield, Physical Review A 93, 052317 (2016).
- Yang et al. (2018) C. Yang, F. C. Binder, V. Narasimhachar, and M. Gu, arXiv:1803.08220 (2018).
- Palsson et al. (2017) M. S. Palsson, M. Gu, J. Ho, H. M. Wiseman, and G. J. Pryde, Science Advances 3, e1601302 (2017).
- Ghafari Jouneghani et al. (2017) F. Ghafari Jouneghani, M. Gu, J. Ho, J. Thompson, W. Y. Suen, H. M. Wiseman, and G. J. Pryde, arXiv:1711.03661 (2017).
- Garner et al. (2017) A. J. P. Garner, Q. Liu, J. Thompson, V. Vedral, et al., New Journal of Physics 19, 103009 (2017).
- Aghamohammadi et al. (2017) C. Aghamohammadi, J. R. Mahoney, and J. P. Crutchfield, Scientific Reports 7 (2017).
- Thompson et al. (2018) J. Thompson, A. J. P. Garner, J. R. Mahoney, J. P. Crutchfield, V. Vedral, and M. Gu, Physical Review X 8, 031013 (2018).
- Suen et al. (2017) W. Y. Suen, J. Thompson, A. J. P. Garner, V. Vedral, and M. Gu, Quantum 1, 25 (2017).
- Aghamohammadi et al. (2016) C. Aghamohammadi, J. R. Mahoney, and J. P. Crutchfield, Physics Letters A 381, 1223 (2016).
- Loomis and Crutchfield (2018) S. Loomis and J. P. Crutchfield, arXiv:1808.08639 (2018).
- Khintchine (1934) A. Khintchine, Mathematische Annalen 109, 604 (1934).
- (30) Note that has sometimes been alternatively referred to as the quantum statistical complexity or quantum machine complexity; we avoid such nomenclature here as the former is incorrect if the model is not minimal, while the latter invites potential confusion with .
- Nielsen and Chuang (2000) M. A. Nielsen and I. Chuang, “Quantum Computation and Quantum Information,” (2000).
- Jozsa and Schlienz (2000) R. Jozsa and J. Schlienz, Physical Review A 62, 012301 (2000).
- Horodecki and Oppenheim (2013) M. Horodecki and J. Oppenheim, Nature Communications 4, 2059 (2013).
Supplementary A: Existence of $U_\varphi$ and overlap of quantum memory states
Here we show that the unitary operator $U_\varphi$ for our phase-enhanced quantum models exists for any choice of the phases $\{\varphi_{jx}\}$, provide an expression for the overlaps of pairs of quantum memory states, and show that the solution to this overlap converges.
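Existence of $U_\varphi$. We introduce the notation $|\psi_j^\varphi\rangle$ to indicate the combined system-ancilla state after applying the unitary circuit:
$$|\psi_j^\varphi\rangle := U_\varphi |\sigma_j^\varphi\rangle |0\rangle = \sum_x \sqrt{P(x|j)}\, e^{i\varphi_{jx}}\, |\sigma^\varphi_{\lambda(j,x)}\rangle |x\rangle. \qquad (S1)$$
Previous work Binder et al. (2018) established the existence of a unitary operation in the non-phase-enhanced case if and only if, for all $j, k$,
$$\langle\sigma_j|\sigma_k\rangle = \sum_x \sqrt{P(x|j)\, P(x|k)}\; \langle\sigma_{\lambda(j,x)}|\sigma_{\lambda(k,x)}\rangle. \qquad (S2)$$
Similarly, for the existence of $U_\varphi$ in our phase-enhanced models we require:
$$\langle\sigma_j^\varphi|\sigma_k^\varphi\rangle = \sum_x \sqrt{P(x|j)\, P(x|k)}\; e^{i(\varphi_{kx} - \varphi_{jx})}\, \langle\sigma^\varphi_{\lambda(j,x)}|\sigma^\varphi_{\lambda(k,x)}\rangle. \qquad (S3)$$
A solution for the inner product of the quantum memory states is as follows:
$$\langle\sigma_j^\varphi|\sigma_k^\varphi\rangle = \lim_{L\to\infty} \sum_{x_{0:L}} \sqrt{P(x_{0:L}|j)\, P(x_{0:L}|k)}\; e^{i(\varphi_{k x_{0:L}} - \varphi_{j x_{0:L}})}, \qquad (S4)$$
which can be verified by insertion into Eq. (S3), thus proving the existence of $U_\varphi$.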
Convergence of . We must now verify that our solution to the memory state overlaps is convergent; that is, , where
Note that to avoid confusion between variables at different timesteps, in this section we do not employ the shorthand introduced in the main text.
We assume that we are dealing with synchronisable processes, such that the memory of the model can be initialised properly given the entire past. Recalling that , this condition can be expressed
and thus for large we can express
for some small that vanishes as . This allows us to divide the possible trajectories into two classes: those where the memory state is (asymptotically) synchronised (); and those where it is not. However, since this uncertainty is finite, the probability of such non-synchronising trajectories occurring must be vanishingly small for consistency with Eq. (S7), and moreover, the total probability of such trajectories must also be vanishingly small. We can therefore devote our attention only to the former class.
For this former class, we can express
for some that again vanishes as . Since each term in the summation is non-negative, we can also constrain each term to satisfy the inequality individually. To satisfy this, we must have that each is either close to 0 or 1. These probabilities must sum to 1, which ensures that for one value of , which we shall label as , the probability is for some small , while the others occur with probability that are each also small, with . In other words, after having produced a sufficiently long sequence of outputs the past of the process almost certainly belongs to causal state , and
Now consider the expansion
For , the left-hand side becomes arbitrarily close to 1 when , and 0 otherwise.
Examining the case , since , for any where we must have . Using Bayes’ rule, and assuming that , we have
implying that for any such that , the probability of such an output trajectory occurring given we started in a past belonging to causal state must be vanishingly small, even relative to the probability of the trajectory occurring at all.
Taken together, these lead us to the conclusion that
for all but a set of output trajectories of vanishingly small probability; that is, for sufficiently large the current causal state is almost certainly determined by the output sequence alone independent of the initial state prior to this sequence. Note that for processes with finite Markov order this statement is tautologically true for any trajectory once is at least as large as the Markov order.
Returning then to Eq. (S5), we see that for sufficiently large that for all but a set of trajectories of vanishingly small probability we may replace . Thus, for sufficiently large , the recursive factor in the expression tends towards unity, and as such as required.
Supplementary B: Numerical sweep search for phase-enhancements
For a general three-state Markov process as depicted in Fig. 2, each state is described by the three output probabilities to each state, defined by two free parameters due to normalisation of probability. These free parameters can be mapped to a point on the positive octant of a unit sphere [Fig. S1], where the square of the distance along a given axis corresponds to the probability of transitioning into the corresponding state. Each process is defined by three such points, one for each state.
In the case of searching for dimensional advantages, we systematically sweep over these surfaces, coarse-grained into grids such that there are 20 evenly-spaced steps along each edge of the sweep areas. For each process we then check whether the inequalities Eqs. (14) are satisfied for any of the combinations of and given in the main text.
When searching for entropic advantages, we instead sample by randomly picking a point on the three surfaces to determine a process, and then systematically sweep over all possible phase angles for the process to seek whether a can be found.
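The sampling-and-sweep procedure for entropic advantages can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: each row of the transition matrix is drawn as a point on the positive octant of the unit sphere (squared coordinates give probabilities), the memory states are taken as $\sum_k \sqrt{P_{jk}}\, e^{i\varphi_{jk}} |k\rangle$, and only four phases are swept on a coarse grid because re-phasing individual memory states or basis states leaves the spectrum of the memory density matrix unchanged. The grid size, seed, and function names are arbitrary choices.

```python
import itertools
import numpy as np

def sample_row(rng):
    """Random point on the positive octant of the unit sphere, squared to probabilities."""
    v = np.abs(rng.normal(size=3))
    v /= np.linalg.norm(v)
    return v ** 2

def stationary_distribution(P):
    evals, evecs = np.linalg.eig(P.T)
    v = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    return v / v.sum()

def statistical_memory(P, pi, phi):
    """Von Neumann entropy (bits) of rho = sum_j pi_j |sigma_j><sigma_j|."""
    S = np.sqrt(P) * np.exp(1j * phi)
    rho = (S.T * pi) @ S.conj()
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log2(lam)))

def sweep(P, n_angles=8):
    """Return (entropy with zero phases, smallest entropy found on the phase grid)."""
    pi = stationary_distribution(P)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    base = statistical_memory(P, pi, np.zeros((3, 3)))
    best = base
    # First row and first column of phi are fixed to zero (gauge freedom).
    for a, b, c, d in itertools.product(angles, repeat=4):
        phi = np.zeros((3, 3))
        phi[1, 1], phi[1, 2], phi[2, 1], phi[2, 2] = a, b, c, d
        best = min(best, statistical_memory(P, pi, phi))
    return base, best

rng = np.random.default_rng(1)
P = np.array([sample_row(rng) for _ in range(3)])
base, best = sweep(P)
print("entropic advantage found on this grid:", best < base - 1e-9)
```

Repeating this over many sampled processes and counting how often an advantage is found would give a rough estimate of its prevalence; a finer phase grid or a local optimiser would be needed to approach the true minimum.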
Our findings are summarised in the table below:
|Advantage|Percentage of three-state processes admitting advantage|
|Dimensional (Multiple ())|