Shorter gate sequences for quantum computing by mixing unitaries

Earl Campbell Department of Physics and Astronomy, University of Sheffield, Sheffield, UK

Fault-tolerant quantum computers compose elements of a discrete gate set in order to approximate a target unitary. The problem of minimising the number of gates is known as gate-synthesis. The approximation error is a form of coherent noise, which can be significantly more damaging than comparable incoherent noise. We show how mixing over different gate sequences can convert this coherent noise into an incoherent form. As measured by diamond distance, the post-mixing noise is quadratically smaller than before mixing, without increasing resource cost upper bounds. Equivalently, we can look for shorter gate sequences that achieve the same precision as unitary gate-synthesis. For a broad class of problems this gives a factor reduction in worst-case resource-costs.

The constraints of fault-tolerant quantum computing mean that the available quantum gates form a discrete set. Such a gate set is said to be universal if it generates a group that gives a dense cover over all unitaries. That is, any target unitary can be approximated to any desired level of precision with a sufficiently long sequence of gates. The Solovay-Kitaev theorem Kitaev et al. (2002); Dawson and Nielsen (2006); Fowler (2011); Pham et al. (2013) ensures that whenever we have a universal gate set, we can achieve a circuit depth that is poly-logarithmic in the inverse precision. The Solovay-Kitaev theorem is a very powerful and general result, but in practice it yields very long gate sequences. Remarkable progress beyond Solovay-Kitaev has been made in recent years by focusing on gate sets that naturally arise in fault-tolerant quantum computing, in particular the Clifford+T gate set, with the flourishing topic becoming known as gate-synthesis Kliuchnikov et al. (2013); Amy et al. (2013); Ross and Selinger (2016); Bocharov et al. (2015).

A common feature of both new and old approaches to gate-synthesis is the approximation of the target unitary with a different unitary. Then the approximation error is a form of coherent noise, which has attracted attention as being especially pernicious to quantum computations Sanders et al. (2015); Kueng et al. (2016). It has, however, been observed several times that mixing over equivalent circuits can average out coherent noise into less damaging incoherent noise Knill (2004); Kern et al. (2005); Kliuchnikov et al. (2016); Wallman and Emerson (2016); Knee and Munro (2015). For instance, when the individual gates suffer from coherent noise, randomized compiling has been shown to quadratically reduce this noise source Wallman and Emerson (2016). In the context of gate synthesis, the approximation error appears even when the components of our gate set are perfect, and so a different approach is required.

Here we give the first general set of tools for mixing out the approximation errors in gate synthesis. Quantifying this noise by the diamond norm, we find our approach reduces noise from $\epsilon$ to $O(\epsilon^2)$, without increasing any worst-case metric of resource cost. To be clear, by worst-case resource cost we mean the tightest available upper bound on resource cost. Alternatively, we can achieve $\epsilon$ noise with reduced worst-case resource cost. If the worst-case resource cost of unitary gate-synthesis scales as , then using quantum channels $\epsilon$ noise can be attained with resource costs upper bounded by in the small $\epsilon$ limit. Many recent gate-synthesis algorithms have scaling, and so in these settings we cut worst-case costs in half. This is an extension of the notion of magic state dilution in Ref. Campbell and O’Gorman (2016), but here applied to synthesis of operations, rather than states. When completing this work, some similar insights were reported by Hastings (17), though without the explicit convex hull finding algorithm provided here.

I Notation

We use $\|\cdot\|$ throughout for the operator norm, so that $\|X\|$ is the largest singular value of $X$. We also make use of the Schatten 1 norm on operators, denoted $\|X\|_1$, which equals the sum of the singular values. Throughout we make use of several norm properties discussed in standard texts Horn and Johnson (2006); Bhatia (2013). For a quantum channel $\mathcal{E}$ we use the diamond norm $\|\mathcal{E}\|_\diamond$ where

$$\|\mathcal{E}\|_\diamond = \sup_{\rho} \|(\mathcal{E}\otimes\mathbb{1})(\rho)\|_1 .$$

The diamond norm induces the diamond distance between two channels and , so that


and is widely used Kitaev (1997) to quantify how well an imperfect channel approximates an ideal, target channel . The diamond distance is well behaved under composition of channels, allowing it to be used in rigorous proofs, including proofs of the threshold theorem for fault-tolerant quantum computing Aharonov and Ben-Or (1997). Despite the average fidelity gaining popularity and being easily measurable by randomised benchmarking Emerson et al. (2005, 2007); Knill et al. (2008); Dankert et al. (2009), various commentators have observed that average fidelity is less meaningful than the diamond distance Sanders et al. (2015).

In inexact gate synthesis, a sequence of available gates is composed to produce some that gives a good approximation to a target unitary . Techniques for gate synthesis typically report the precision of these approximations by taking and evaluating some norm. This prompts us to ask how this notion of precision corresponds to the more versatile diamond distance. Denoting and as the channels corresponding to and , we have


as shown in Refs. Wang et al. (2013); Wang and Sanders (2015). In general, there is no simple lower bound. For instance, if then , but and so . However, these pathologies only arise when is large, and many families of unitaries are well behaved. Consider, for instance, unitaries of the form and : for small we find is very close to the diamond distance (see App. B of Ref. Campbell and O’Gorman (2016) for a more detailed discussion). So while unitary precision and diamond distance are very different measures, they often coincide.
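The comparison between unitary precision and diamond distance can be checked numerically for a single qubit. The sketch below is illustrative only and rests on two assumptions not taken from the text above: the convention that the diamond distance is one half of the diamond norm of the channel difference, and the standard closed form for two single-qubit unitary channels, $\sqrt{1 - |\mathrm{tr}(U^\dagger V)|^2/4}$. The helper names (`rz`, `qubit_diamond_distance`) are hypothetical.

```python
import numpy as np

def operator_norm_distance(U, V):
    """Largest singular value of U - V."""
    return float(np.linalg.norm(U - V, 2))

def qubit_diamond_distance(U, V):
    """(1/2)||U - V||_diamond for two single-qubit unitary channels.

    Assumed closed form, valid for 2x2 unitaries only:
    sqrt(1 - |tr(U^dag V)|^2 / 4).
    """
    overlap = abs(np.trace(U.conj().T @ V)) / 2.0
    return float(np.sqrt(max(0.0, 1.0 - overlap ** 2)))

def rz(theta):
    """Axial rotation exp(i * theta * Z)."""
    return np.diag([np.exp(1j * theta), np.exp(-1j * theta)])

I2 = np.eye(2, dtype=complex)
theta = 0.01

# Small axial rotation: the two measures nearly coincide.
small_op = operator_norm_distance(I2, rz(theta))   # equals 2 sin(theta / 2)
small_dd = qubit_diamond_distance(I2, rz(theta))   # equals sin(theta)

# Global phase of -1: the channels are identical but the unitaries are far apart.
phase_op = operator_norm_distance(I2, -I2)
phase_dd = qubit_diamond_distance(I2, -I2)
```

For the small rotation the two numbers agree to leading order in `theta`, while the global-phase pair shows that a large operator-norm distance need not imply any channel distance at all.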

Throughout we will use to denote the available gate set, and for the associated cost function. To assess the depth of a circuit we would use a constant cost function for all . However, for the Clifford+T gate set the T gates can be significantly more expensive than Clifford gates due to the resource overhead of magic state distillation Bravyi and Kitaev (2005); Bravyi and Haah (2012); Meier et al. (2013); Jones (2013). In this setting, one often takes and for all in the Clifford group. The cost of a gate sequence is then taken to be the numerical sum of the composite gate costs. We also use for the group generated by set . We say a gate set is finite when contains a finite number of elements. Lastly, we will use to denote the convex hull of a set of operators.
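As a small illustration of how a cost function turns a gate sequence into a number, the sketch below sums per-gate costs. The gate labels and the cost table are hypothetical: it assumes the common T-count convention in which a T gate costs 1 and each Clifford gate costs 0, and uses unit cost per gate as a stand-in for depth of a single-qubit sequence.

```python
# Hypothetical cost table (T-count convention): T costs 1, Cliffords cost 0.
T_COUNT_COST = {"T": 1, "H": 0, "S": 0, "X": 0, "Y": 0, "Z": 0}

def sequence_cost(sequence, cost_table=None):
    """Sum of per-gate costs.

    With cost_table=None every gate costs 1, which for a single-qubit
    sequence is simply its length (a constant cost function).
    """
    if cost_table is None:
        return len(sequence)
    return sum(cost_table[gate] for gate in sequence)

seq = ["H", "T", "S", "H", "T", "H", "T"]
depth = sequence_cost(seq)                   # constant cost per gate
t_count = sequence_cost(seq, T_COUNT_COST)   # only T gates are charged
```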

II Results

Here we present the two main results of this paper.

Theorem 1

Let be some dimensional Lie group, which is a subgroup of a unitary group . Let be a finite gate set with cost function , such that is a dense cover of and . Assume we have a unitary synthesis algorithm: for every and all the algorithm outputs a finite sequence , such that


where is the worst case cost of the unitary synthesis algorithm. It follows that we can construct a channel of the form


where all and each have cost upper bounded by , and provided the post-mixing noise satisfies


Therefore, error in the diamond norm.

The simplest setting is that , so , but we also allow for subgroups with . Few gate-synthesis techniques exist for multi-qubit or qudit problems, but our results apply there also. They directly apply to the familiar problem of performing general single-qubit rotations from the Clifford+T gate set. The natural cost function of this gate set is and for all in the Clifford group. For such a cost function, Ross and Selinger Ross and Selinger (2016) showed that efficient gate synthesis of any single qubit gate is possible with . Using quantum channels, and no more gates, we can ensure precision in diamond distance.

We use the terminology axial rotation for single qubit rotations about the $Z$ axis, and denote the group . For such rotations the above findings apply with the function . However, the Ross and Selinger algorithm can generate axial rotations at a slightly lower cost with leading order , and other algorithms have been tailored to this special case. So one might anticipate that resource savings could be made by tailoring our approach to axial rotations. We find this is indeed the case, but we cannot blindly apply the above result to algorithms for axial rotations. Note that Thm. 1 does not apply in this setting since the generated group contains gates outside . That is, with as the Clifford+T set, the generated group has gates outside the axial rotation group, so . However, our techniques are straightforwardly extended to such scenarios.

Theorem 2

Let be the group of axial rotations. Let be a gate set with cost function , containing Pauli $Z$ at zero cost. Assume we have a unitary synthesis algorithm: for every and all the algorithm outputs a finite sequence , such that


where is the worst case cost of the unitary synthesis algorithm. It follows that we can construct a channel of the form


where all and each have cost upper bounded by , and provided the post-mixing noise satisfies


Therefore, error in the diamond norm.

This result has a slightly better instead of , but more importantly benefits from using which gives a smaller resource overhead than for general qubit rotations.

Let us reflect on how this free error suppression can be exchanged for cheaper gate sequences. We instead run our protocol and use gate sequences of cost not exceeding , where is or depending on which theorem we employ. It follows that the post-mixing noise is bounded by , but worst-case resource costs are reduced. However, in a particular instance of a problem the resource cost could be much less than the worst-case cost. As such, whenever a new protocol offers a superior worst-case cost, there is no ironclad promise that the protocol will have a lower resource cost in all problem instances, though such anomalies are probably quite rare. We proceed on the mild assumption that improved worst-case resource costs accurately reflect actual resource savings, and next give a precise account of this saving.

The form of for unitary gate-synthesis is typically up to a small contribution. Our reduced cost is then


Therefore, our resource savings are a factor where


collects the terms in the square bracket of Eq. (12). In the small limit we have . Typically, is very small with many algorithms requiring and so is a reasonable approximation. Convergence toward is shown in Fig. 1, with the speed of convergence dictated by . When proving our theorems we focus on clarity rather than minimising and believe smaller is plausible. Lastly, recall that for single qubit problems known algorithms have , but in other settings different may appear.
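The trade between precision and cost can be made concrete with a short calculation. The sketch below is not the paper's Eq. (12) or Eq. (13), whose exact form is not recoverable here. It assumes a purely logarithmic worst-case cost $C(x) = A \log(1/x)^\gamma$ with no additive term, and assumes that unitary precision $x$ yields post-mixing error $c\,x^2$, so that a target error `eps` needs unitary precision `sqrt(eps / c)`. The names `cost_ratio`, `c` and `gamma` are illustrative.

```python
import math

def cost_ratio(eps, c, gamma=1.0):
    """Ratio C(sqrt(eps / c)) / C(eps) for the assumed cost C(x) = A * log(1/x)**gamma.

    The prefactor A cancels. Requires 0 < eps < 1 and eps < c.
    """
    unitary_precision = math.sqrt(eps / c)
    return (math.log(1.0 / unitary_precision) / math.log(1.0 / eps)) ** gamma

# With gamma = 1 the ratio is (1/2) * (1 + log(c) / log(1/eps)).
ratio_c1 = cost_ratio(1e-10, c=1.0)
ratio_c10 = cost_ratio(1e-10, c=10.0)
ratio_c10_tiny = cost_ratio(1e-100, c=10.0)
```

Under these assumptions the ratio approaches $(1/2)^\gamma$ as `eps` shrinks, and a larger constant `c` slows that approach, which is the qualitative behaviour described for Fig. 1.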

Figure 1: The resource savings of our approach over unitary gate-synthesis is and here we show (see Eq. (13)) for and a range of post-mixing error rates. The different correspond to different constant factors in Eq. (7) and Eq. (11).

III The mixing lemma

Here we prove a Lemma that underpins both Thm. 1 and Thm. 2, and may also enable further extensions.

Lemma 1

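Let $U$ be a target unitary, with associated channel $\mathcal{U}(\rho) = U \rho U^\dagger$. Let $\{V_i\}$ be a set of unitaries, with associated channels $\mathcal{V}_i$, such that

  1. for all $i$ we have $\|V_i - U\| \leq a$;

  2. there exist positive numbers $p_i$ such that $\sum_i p_i = 1$ and $\|(\sum_i p_i V_i) - U\| \leq b$.

It follows that $\mathcal{E} = \sum_i p_i \mathcal{V}_i$ satisfies

$$\|\mathcal{E} - \mathcal{U}\|_\diamond \leq a^2 + 2b .$$

We will find constructions where $a = O(\epsilon)$ and $b = O(\epsilon^2)$, so that the diamond norm is upper bounded by $O(\epsilon^2)$.

For now, we prove the above Lemma. We begin by defining $\delta_i = V_i - U$ so that $V_i = U + \delta_i$. We also have

$$\sum_i p_i V_i = U + \sum_i p_i \delta_i ,$$

with condition (2) of the lemma entailing that $\|\sum_i p_i \delta_i\| \leq b$. The channel acts as

$$\mathcal{E}(\rho) = \sum_i p_i (U + \delta_i)\, \rho\, (U + \delta_i)^\dagger .$$

Since the diamond norm is unitarily invariant, we have $\|\mathcal{E} - \mathcal{U}\|_\diamond = \|\mathcal{E}' - \mathbb{1}\|_\diamond$ where $\mathcal{E}'(\rho) = \mathcal{E}(U^\dagger \rho U)$ acts as

$$\mathcal{E}'(\rho) = \sum_i p_i (\mathbb{1} + \Delta_i)\, \rho\, (\mathbb{1} + \Delta_i)^\dagger ,$$

where $\Delta_i = \delta_i U^\dagger$. Since the operator norm is unitarily invariant, we have $\|\Delta_i\| = \|\delta_i\|$ and $\|\sum_i p_i \Delta_i\| = \|\sum_i p_i \delta_i\|$. Compared to the identity channel $\mathbb{1}$, and using $\sum_i p_i = 1$, we have

$$\mathcal{E}'(\rho) - \rho = \Big(\sum_i p_i \Delta_i\Big)\rho + \rho\Big(\sum_i p_i \Delta_i^\dagger\Big) + \sum_i p_i \Delta_i \rho \Delta_i^\dagger .$$

Taking the 1-norm and using the triangle inequality, we have

$$\|\mathcal{E}'(\rho) - \rho\|_1 \leq \Big\|\Big(\sum_i p_i \Delta_i\Big)\rho\Big\|_1 + \Big\|\rho\Big(\sum_i p_i \Delta_i^\dagger\Big)\Big\|_1 + \sum_i p_i \|\Delta_i \rho \Delta_i^\dagger\|_1 .$$

Using the Hölder inequality and $\|\rho\|_1 = 1$, we have

$$\|\mathcal{E}'(\rho) - \rho\|_1 \leq \Big\|\sum_i p_i \Delta_i\Big\| + \Big\|\sum_i p_i \Delta_i^\dagger\Big\| + \sum_i p_i \|\Delta_i\| \|\Delta_i^\dagger\| .$$

Noting the property $\|A^\dagger\| = \|A\|$ and condition (1) of Lem. 1, we conclude that $\|\Delta_i\| \|\Delta_i^\dagger\| \leq a^2$. Therefore, the last sum of terms is upper bounded by $a^2$. The first two summations are likewise bounded by $b$ by virtue of condition (2). Therefore,

$$\|\mathcal{E}'(\rho) - \rho\|_1 \leq a^2 + 2b ,$$

which is true for all $\rho$. If we tensor the channels with the identity this does not affect the proof except to burden the notation, and so

$$\|(\mathcal{E}' \otimes \mathbb{1})(\rho) - \rho\|_1 \leq a^2 + 2b .$$

Since this is true for all $\rho$ the diamond norm is also upper bounded by $a^2 + 2b$. This completes the proof.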
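The mixing lemma can be sanity-checked numerically. The sketch below assumes the bound takes the form $a^2 + 2b$, where $a$ bounds each $\|V_i - U\|$ and $b$ bounds $\|\sum_i p_i V_i - U\|$, and uses a hypothetical example: an axial target rotation approximated by an equal mixture of one over-rotation and one under-rotation. It does not compute the diamond norm exactly. Instead it samples random pure states on system plus ancilla, which gives a lower estimate of the diamond norm that must still respect the bound.

```python
import numpy as np

def rz(theta):
    """Axial rotation exp(i * theta * Z)."""
    return np.diag([np.exp(1j * theta), np.exp(-1j * theta)])

def apply_mixture(probs, unitaries, rho):
    """Apply sum_i p_i (V_i x 1) rho (V_i x 1)^dag to a two-qubit state."""
    out = np.zeros_like(rho)
    for p, V in zip(probs, unitaries):
        W = np.kron(V, np.eye(2))
        out += p * (W @ rho @ W.conj().T)
    return out

def random_pure_state(dim, rng):
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    psi /= np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

theta, eps = 0.3, 0.01
U = rz(theta)
Vs = [rz(theta + eps), rz(theta - eps)]   # over- and under-rotation
ps = [0.5, 0.5]

a = max(np.linalg.norm(V - U, 2) for V in Vs)
b = np.linalg.norm(sum(p * V for p, V in zip(ps, Vs)) - U, 2)
bound = a ** 2 + 2 * b

rng = np.random.default_rng(1)
worst = 0.0
for _ in range(200):
    rho = random_pure_state(4, rng)
    diff = apply_mixture(ps, Vs, rho) - apply_mixture([1.0], [U], rho)
    worst = max(worst, np.linalg.norm(diff, "nuc"))   # trace norm of the difference
```

Here `a` is of order `eps` while both `bound` and the sampled `worst` are of order `eps**2`, which is the quadratic suppression the lemma is used for.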

Figure 2: The geometric intuition of the convex hull finding algorithm. The cross marks the origin corresponding to . (a) We find a so that is near the origin. (b) We extrapolate from through the origin to a point . (c) We find a close to , so that is near to . (d) We form the convex hull of and and find the point , which is closest to the origin. From here we extrapolate out through the origin to the point . (e) We find a close to , so that is near to . (f) We form the convex hull of and and find the origin lies inside the hull, and so the algorithm terminates. Note that none of the can stray far from the origin.

IV General rotations

We show here that Thm. 1 follows from Lem. 1. First, let be the subset of such that they can be synthesized with cost not exceeding . We have that is an -cover of . That is, for all there exists a with . Since we work with a unitarily invariant norm this can be restated as . We shift to a Hermitian representation and define a such that . Since we can choose to have small norm, which we verify later. Our goal is to not just find a single close to but a whole set that allows us to use the following

Lemma 2

Let be a set of bounded Hermitian operators for all . Assume, the origin lies within the convex hull with convex decomposition . It follows that

  1. for all ;

  2. .

When for some unitary , this can be restated as

  1. for all ;

  2. .

Clearly, such a set of Hermitian operators would allow us to use Lem. 1 with constants related by and , yielding an upper bound of . The lemma is proved by expanding the exponentials into a power series and using standard norm properties, as shown in App. A.
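The effect described by Lem. 2 can be illustrated numerically: when Hermitian operators $H_i$ have a convex combination equal to zero, the first-order terms of $e^{iH_i}$ cancel in the mixture. The sketch below checks this against the elementary bounds $\|e^{iH} - \mathbb{1}\| \leq \|H\|$ and $\|e^{iH} - \mathbb{1} - iH\| \leq \|H\|^2/2$, which are standard facts and not necessarily the constants used in the lemma. The operators, weights and helper names are hypothetical.

```python
import numpy as np

def exp_i(H):
    """exp(iH) for Hermitian H, computed from its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(1j * w)) @ v.conj().T

def random_hermitian(dim, scale, rng):
    """Random Hermitian matrix rescaled to operator norm `scale`."""
    A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    H = (A + A.conj().T) / 2
    return scale * H / np.linalg.norm(H, 2)

rng = np.random.default_rng(0)
dim, scale = 4, 0.05
p = np.array([0.3, 0.3, 0.4])

# Two random points, plus a third chosen so that sum_i p_i H_i = 0,
# i.e. the origin lies in the convex hull of the three points.
H = [random_hermitian(dim, scale, rng) for _ in range(2)]
H.append(-(p[0] * H[0] + p[1] * H[1]) / p[2])

identity = np.eye(dim)
h = max(np.linalg.norm(Hi, 2) for Hi in H)
a = max(np.linalg.norm(exp_i(Hi) - identity, 2) for Hi in H)
b = np.linalg.norm(sum(pi * exp_i(Hi) for pi, Hi in zip(p, H)) - identity, 2)
```

Each individual unitary is a distance of order `h` from the identity, while the mixture of unitaries is only of order `h**2` away.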

The key point is that we seek a set of Hermitian operators, such that the origin is contained within the convex hull of these points. Next, we present an explicit method for finding such a convex decomposition of Hermitian operators. We assume access to an oracle performing the relevant gate-synthesis decompositions. We outline the algorithm for finding a suitable convex set containing the origin.

Convex hull finding algorithm

  1. Call oracle to find such that ;

  2. Find principal such that ;

  3. Set and loop the following

    1. Find with minimum ;

    2. If then EXIT LOOP;

    3. Define where ;

    4. Call oracle to find such that ;

    5. Find principal such that and append to set ;

    6. and return to start of loop.

The calculation in step 3(a) is a convex optimisation problem and can be solved using standard interior-point methods. The whole algorithm has two free parameters and (see step 3b). In our analysis we assume , and for all practical applications this is easily satisfied. We take for simplicity, and the exact constants in our bounds and convergence rates depend on this choice. The algorithm behaves qualitatively the same for different settings, assuming . The algorithm has two important properties that we discuss below, leaving technical details until the appendices. The basic geometric intuition behind the algorithm is illustrated in Fig. 2.

First, for all found by the algorithm we have


which we show in App. B. This provides us with the value to be substituted into Lem. 2, which traced back leads to the diamond norm upper bound


where the last line uses to simplify higher order terms. This gives the upper bound stated in Thm. 1.

The second important property of the algorithm is that it eventually terminates. Each is distinct, and in particular its falls outside the convex hull of previous points (see App. C for proof). If we further assume that there are a finite number of distinct points with bounded resource cost, then there are only a finite number of possible for the algorithm to output. Since each is distinct, the algorithm must terminate in a finite number of steps. The additional assumption of a finite number of suitable points is very mild, and is satisfied both for the Clifford+ gate set and also any gate set where all gates have non-zero cost. Furthermore, below we see that the algorithm need not terminate, but that sufficient iterations will work equally well.

A finite number of steps may still be very many, but we have evidence that the convergence is very fast. First we note that in a -dimensional space, a simplex of will suffice to enclose a nontrivial volume. Though the algorithm is not ensured to converge in steps, it may often do so. Looking at Fig. 2, the analogous setup in Euclidean geometry hints that it will always find an enclosing simplex in iterations, though it is unclear whether this carries over to the topology induced by the operator norm. We can be more quantitative by considering the quantity , which measures the distance from the convex hull. Recall that the convex hull finding algorithm halts when . Further evidence of rapid convergence is that decreases exponentially fast. Specifically, we find there exists a such that


so the convergence toward zero is exponentially fast. Even exponentially small may be nonzero, but once the preceding proofs can be adapted to account for nonzero with negligible influence on the upper bounds. All convergence proof details are given in App. C.

V Axial rotations

We now consider a setting where the target is an axial rotation of a single qubit. The only assumption we make about the generating gate set is that it contains Pauli $Z$ as a free resource. Given a protocol for axial-synthesis, for all such and any there exists at least one such that and where has cost not exceeding for some . Recall that is polylogarithmic in . For instance, the Ross-Selinger algorithm satisfies the worst case bound , and on average. It will prove useful to consider and expand in the Pauli basis


We say is an over-rotation if and an under-rotation if . We require a second unitary such that the pair contains one over-rotation and one under-rotation. We can assume as otherwise the second rotation is not needed. For the second rotation, we will use the Pauli expansion


Gate-synthesis only ensures one unitary such that , but a suitable can be found only slightly further away. Specifically, there must exist a suitable with cost below . To verify this, one first constructs an axial rotation with and . Specifically, using and then the two values


both ensure that . Choosing the sign of to match the sign of , it follows that . Unitary gate synthesis must then provide a within of , such that . Furthermore, within the same cost budget we can synthesize unitaries and , with


Considering the set it follows immediately that they satisfy condition (1) of Lem. 1 with . Next, we assign them weights where will be fixed later. The linear combination is


Subtracting the identity and taking the operator-norm squared,


We now fix to eliminate the second term. Considering the variables , one is positive (an over-rotation) and the other negative (an under-rotation), so zero sits within the convex hull of these variables and suitable can be found. Specifically


satisfies . With the second term cancelled and taking square roots we have


By the triangle inequality and , we have


Inserting , so that , and again using the triangle inequality, we arrive at


From we can infer that


Evaluating the left hand side, we obtain


Unitarity of entails that and after some simplification, we find


From which we infer . Similarly, from we can infer . Substituting into Eq. (35), we have


Therefore, we have demonstrated both the necessary conditions of Lem. 1 with and . Applying the Lemma, our channel satisfies


A smaller factor than 5 is likely to be provable.
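The construction of this section can be illustrated with a small numerical sketch. It assumes the target is $U = e^{i\theta Z}$, takes two hypothetical synthesis outputs (one over-rotation and one under-rotation, each with small off-axis error), conjugates each by the free Pauli $Z$ to cancel the $X$ and $Y$ components, and then picks the weight that cancels the remaining $Z$ component. The specific error values and helper names are made up for illustration, and the final bound uses the assumed form $a^2 + 2b$ rather than the constants derived above.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def exp_i(H):
    """exp(iH) for Hermitian H, computed from its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(1j * w)) @ v.conj().T

def z_component(W):
    """Coefficient z in the Pauli expansion W = c*1 + i(x X + y Y + z Z)."""
    return float(np.imag(np.trace(Z @ W)) / 2)

theta = 0.4
U = exp_i(theta * Z)

# Hypothetical synthesis errors W = V U^dag for an over- and an under-rotation.
W_over = exp_i(0.010 * Z + 0.004 * X - 0.003 * Y)
W_under = exp_i(-0.007 * Z - 0.002 * X + 0.005 * Y)
V_over, V_under = W_over @ U, W_under @ U

# Weight on the over-rotation chosen so the Z components cancel in the mixture.
z1, z2 = z_component(W_over), z_component(W_under)
q = -z2 / (z1 - z2)

# Conjugating by Z flips the sign of the X and Y components.
unitaries = [V_over, Z @ V_over @ Z, V_under, Z @ V_under @ Z]
weights = [q / 2, q / 2, (1 - q) / 2, (1 - q) / 2]

a = max(np.linalg.norm(V - U, 2) for V in unitaries)
b = np.linalg.norm(sum(w * V for w, V in zip(weights, unitaries)) - U, 2)
bound = a ** 2 + 2 * b
```

With all first-order components cancelled, the weighted sum of unitaries is proportional to `U`, so `b` is second order in the synthesis error and the bound on the channel error is much smaller than `a`.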

VI Conclusions

We have seen that worst-case resource costs of fault-tolerant quantum computing can be reduced by switching to a randomised approach to gate-synthesis. It may seem counterintuitive that a randomisation process can be advantageous. However, convexity of the diamond distance naturally entails that mixing over channels of similar noise levels can only reduce the noise.

We presented a convex hull finding algorithm for finding the suitable mixing ratios. While this algorithm is exponentially fast, it is plausible that a constant time algorithm exists. We suspect that a variant of Delaunay triangulation could be used to quickly identify a suitable simplex. However, our literature search on Delaunay triangulation has only found results on Euclidean space and we have yet to ascertain if such tools carry over to the operator norm topology.

This work has only considered mixing over unitary channels, which prompts the question whether more general quantum channels might be useful. Probabilistic quantum circuits with fallback Bocharov et al. (2015) is an approach to gate-synthesis that is not entirely unitary, though it makes use of an ancillary qubit and works very differently to the approach presented here. As remarked earlier, mixing can be useful in preparation of different magic states Campbell and O’Gorman (2016). We ponder whether all these approaches can be understood within a single framework of quantum channel synthesis.

Acknowledgements.- We thank Yuan Su, Vadym Kliuchnikov, Steve Flammia, Jens Eisert, Mark Howard, Luke Heyfron, Neil J. Ross, Mark Pearce and Scott Vinay for related discussions. Several of these discussions were facilitated by the Centro de Ciencias de Benasque. This work was supported by the EPSRC (EP/M024261/1).

Appendix A Convex hull proof

This section will prove Lem. 2. We start by showing another general result that we use in several places. Let be a Hermitian operator with eigenvalues , so that by definition for all . We consider the operator


This can be diagonalised in the eigenbasis of and has eigenvalues . Therefore, we have


On the interval , one can verify that , and so provided we have


Now turning specifically to Lem. 2, we have


Since we always choose the principal , we have and we can use Eq. (43) to find


Recall that in Lem. 2 we defined so that for all , which explains the second inequality. Therefore, . This shows property (1) of Lem. 2. Next we consider the convex sum of unitaries,


which is split into zeroth, first and higher order terms. By assumption the linear terms vanish. Therefore,


Going from the second to the third line, we have again used Eq. (43). This proves Lem. 2.

Appendix B Bounding .

We wish to upper bound in terms of , the precision to which gate synthesis is assessed. The operator is chosen so that provides a certain unitary, , and the eigenvalues are chosen within the interval . Furthermore, on this interval one has that all eigenvalues satisfy . It follows that


Next, we note that for each we have


The case is similar but without the contribution. Combining this with Eq. (49) we have


Assuming this can be simplified to


as reported in the main text. This gives the value of for Lem. 2.

Appendix C Convergence proof

Next we show that each is new by showing the strictly monotonic decrease of . Furthermore, we show exponential decrease of with . We begin by translating the closeness of to into the space of Hermitian operators. We define


and later will find an upper bound on . First we use these operators to construct a point in the new convex hull. Mixing and gives a point in the convex hull, which must have norm no larger than , so that


If we consider when


then it is easy to see and that the square bracket vanishes so that


This iteration begins with . Further progress requires an upper bound on , which we now take a lengthy detour to find.

Adding several terms of the form to , we have


Taking the norm and applying triangle inequality, we get


For the middle term we know , and for the first and last terms we again use Eq. (43), so that


We can again use to bound higher order terms to obtain


Plugging this in Eq. (56), we have


where we have used that . Iterating this argument times we find exponential behaviour


where . Using our earlier assumption that guarantees that . In most instances convergence will be much faster than ensured by this proof, often jumping to within only a few iterations. Last we note that and that , which gives


Since , we have . Combined with we know the square bracket cannot exceed 6, which leads to Eq. (25).


  • Kitaev et al. (2002) A. Y. Kitaev, A. Shen,  and M. N. Vyalyi, Classical and quantum computation, Vol. 47 (American Mathematical Society Providence, 2002).
  • Dawson and Nielsen (2006) C. M. Dawson and M. A. Nielsen, Quantum Info. Comput. 6, 81 (2006).
  • Fowler (2011) A. G. Fowler, Quantum Information and Computation 11, 867 (2011).
  • Pham et al. (2013) T. T. Pham, R. Van Meter,  and C. Horsman, Physical Review A 87, 052332 (2013).
  • Kliuchnikov et al. (2013) V. Kliuchnikov, D. Maslov,  and M. Mosca, Phys. Rev. Lett. 110, 190502 (2013).
  • Amy et al. (2013) M. Amy, D. Maslov, M. Mosca,  and M. Roetteler, Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on 32, 818 (2013).
  • Ross and Selinger (2016) N. J. Ross and P. Selinger, Quantum Information and Computation 16, 901 (2016).
  • Bocharov et al. (2015) A. Bocharov, M. Roetteler,  and K. M. Svore, Phys. Rev. Lett. 114, 080502 (2015).
  • Sanders et al. (2015) Y. R. Sanders, J. J. Wallman,  and B. C. Sanders, New Journal of Physics 18, 012002 (2015).
  • Kueng et al. (2016) R. Kueng, D. M. Long, A. C. Doherty,  and S. T. Flammia, Phys. Rev. Lett. 117, 170502 (2016).
  • Knill (2004) E. Knill, arXiv preprint quant-ph/0404104  (2004).
  • Kern et al. (2005) O. Kern, G. Alber,  and D. L. Shepelyansky, The European Physical Journal D-Atomic, Molecular, Optical and Plasma Physics 32, 153 (2005).
  • Kliuchnikov et al. (2016) V. Kliuchnikov, D. Maslov,  and M. Mosca, IEEE Transactions on Computers 65, 161 (2016).
  • Wallman and Emerson (2016) J. J. Wallman and J. Emerson, Phys. Rev. A 94, 052325 (2016).
  • Knee and Munro (2015) G. C. Knee and W. J. Munro, Phys. Rev. A 91, 052327 (2015).
  • Campbell and O’Gorman (2016) E. T. Campbell and J. O’Gorman, Quantum Science and Technology 1, 015007 (2016).
  • (17) M. B. Hastings, “Turning gate synthesis errors into incoherent errors,” ArXiv:1612.01011v1.
  • Horn and Johnson (2006) R. A. Horn and C. R. Johnson, Matrix Analysis (Cambridge, 2006).
  • Bhatia (2013) R. Bhatia, Matrix analysis, Vol. 169 (Springer Science & Business Media, 2013).
  • Kitaev (1997) A. Y. Kitaev, Russian Mathematical Surveys 52, 1191 (1997).
  • Aharonov and Ben-Or (1997) D. Aharonov and M. Ben-Or, in Proceedings of the twenty-ninth annual ACM symposium on Theory of computing (ACM, 1997) pp. 176–188.
  • Emerson et al. (2005) J. Emerson, R. Alicki,  and K. Życzkowski, Journal of Optics B: Quantum and Semiclassical Optics 7, S347 (2005).
  • Emerson et al. (2007) J. Emerson, M. Silva, O. Moussa, C. Ryan, M. Laforest, J. Baugh, D. G. Cory,  and R. Laflamme, Science 317, 1893 (2007).
  • Knill et al. (2008) E. Knill, D. Leibfried, R. Reichle, J. Britton, R. B. Blakestad, J. D. Jost, C. Langer, R. Ozeri, S. Seidelin,  and D. J. Wineland, Phys. Rev. A 77, 012307 (2008).
  • Dankert et al. (2009) C. Dankert, R. Cleve, J. Emerson,  and E. Livine, Phys. Rev. A 80, 012304 (2009).
  • Wang et al. (2013) D.-S. Wang, D. W. Berry, M. C. de Oliveira,  and B. C. Sanders, Phys. Rev. Lett. 111, 130504 (2013).
  • Wang and Sanders (2015) D.-S. Wang and B. C. Sanders, New Journal of Physics 17, 043004 (2015).
  • Bravyi and Kitaev (2005) S. Bravyi and A. Kitaev, Phys. Rev. A 71, 022316 (2005).
  • Bravyi and Haah (2012) S. Bravyi and J. Haah, Phys. Rev. A 86, 052329 (2012).
  • Meier et al. (2013) A. M. Meier, B. Eastin,  and E. Knill, Quant. Inf. and Comp. 13, 195 (2013).
  • Jones (2013) C. Jones, Phys. Rev. A 87, 042305 (2013).