# Novel constructions for the fault-tolerant Toffoli gate

## Abstract

We present two new constructions for the Toffoli gate which substantially reduce resource costs in fault-tolerant quantum computing. The first contribution is a Toffoli gate requiring Clifford operations plus only four $T$ gates, whereas conventional circuits require seven $T$ gates. An extension of this result is that adding $n$ control inputs to a controlled gate requires $4n$ $T$ gates, whereas the best prior result was $8n$. The second contribution is a quantum circuit for the Toffoli gate which can detect a single $\sigma^z$ error occurring with probability $p$ in any one of eight $T$ gates required to produce the Toffoli. By post-selecting circuits that did not detect an error, the posterior error probability is suppressed to lowest order from $4p$ (or $7p$, without the first contribution) to $28p^2$ for this enhanced construction. In fault-tolerant quantum computing, this construction can reduce the overhead for producing logical Toffoli gates by an order of magnitude.

## I Introduction

Fault-tolerant quantum computing is the effort to design quantum information processors which are resilient to a sufficiently small (but nonzero) probability of failure in any individual component Preskill (1998); Nielsen and Chuang (2000). Enhanced reliability comes at the cost of redundancy, and recent study in this area has focused on minimizing the overhead, or additional resource costs, associated with converting a perfect quantum operation into a form compatible with error correction Isailovic et al. (2008); Jones et al. (2012a); Fowler and Devitt (2012). This work focuses on the Toffoli gate, which appears in both reversible-classical and quantum logic and which may be defined as $|x, y, z\rangle \rightarrow |x, y, z \oplus xy\rangle$, for $x$, $y$, $z$ being binary variables. Unlike many quantum gates, the quantum Toffoli gate has a classical analogue, so it is favored as a building block for importing more complex classical operations, such as binary arithmetic, into quantum algorithms like Shor’s factoring algorithm Shor (1997); Van Meter and Itoh (2005); Jones et al. (2012a) and quantum simulation Clark et al. (2009); Jones et al. (2012a, b). For these reasons, the Toffoli gate is critically important to quantum computing in general, and improvements in the design of the Toffoli gate make the realization of large-scale quantum computation more tractable.
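As a minimal sketch of the classical analogue mentioned above, the Toffoli action on basis states can be written as a reversible function on three bits. The function name `toffoli` and the variable names are illustrative choices, not notation from the references.

```python
from itertools import product


def toffoli(x: int, y: int, z: int) -> tuple[int, int, int]:
    """Flip the target z exactly when both controls x and y are 1."""
    return x, y, z ^ (x & y)


for bits in product((0, 1), repeat=3):
    # Applying the gate twice returns the input, so the map is reversible.
    assert toffoli(*toffoli(*bits)) == bits

# With the target initialised to 0, the third output is the AND of the controls.
and_table = {(x, y): toffoli(x, y, 0)[2] for x, y in product((0, 1), repeat=2)}
print(and_table)
```

Because the target picks up the AND of the controls, chains of such gates are enough to build reversible binary arithmetic.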

Several researchers have studied circuit constructions for the Toffoli gate. The most oft-cited implementation is probably the one on page 182 of Ref. Nielsen and Chuang (2000), which may have been derived from Ref. Barenco et al. (1995). As can be seen in Ref. Nielsen and Chuang (2000), the Toffoli gate is decomposed into smaller quantum gates, each of which can be made fault-tolerant by conventional means Preskill (1998). The most nettlesome of these is the $T$ gate, which is much more expensive in both time and space resources to produce Zhou et al. (2000); Bravyi and Kitaev (2005); Raussendorf et al. (2007); Isailovic et al. (2008); Jones et al. (2012a); Meier et al. (2012); Bravyi and Haah (2012); Jones (2012); notably, the Toffoli circuit in Ref. Nielsen and Chuang (2000) uses seven $T$ gates. In fact, Ref. Barenco et al. (1995) contains a construction nearly identical to one derived here (we use four $T$ gates), except for an undesirable phase on one output state (we show how to correct this with modest effort). However, “complete” implementations of the Toffoli gate without a phase error have used seven $T$ gates in the literature to date. Amy et al. studied classical search methods for decomposing gates like Toffoli into fault-tolerant primitives Amy et al. (2012), and Selinger investigated circuit constructions with particular emphasis on $T$-gate count and depth, where the latter metric allows parallel $T$ gates on different qubits Selinger (2012). We use Selinger’s work as our starting point, as we turn his almost-Toffoli gate into a proper Toffoli gate, using four $T$ gates and some quantum teleportation. Finally, the importance of this topic has attracted the attention of other researchers, and Eastin has independently discovered equivalent results Eastin (2012).

This paper presents two important results. First, Section II describes how to implement the Toffoli gate with only four $T$ gates and Clifford-group operations Gottesman and Chuang (1999); Nielsen and Chuang (2000). Second, Section III introduces a Toffoli construction requiring eight $T$ gates that can detect an error in any single $T$ gate. This new circuit is an important development for fault-tolerant quantum computing, because it relaxes the requirements on high-fidelity $T$ gates that are expensive to produce; however, the circuit is probabilistic, and we discuss its proper usage. Section IV presents some analysis of the resource costs and error rates for these circuits. The paper concludes with a brief discussion of the impact these results have on large-scale quantum computing.

## II Toffoli using just four $T$ gates

In fault-tolerant quantum computing, the most difficult quantum gates to produce are non-Clifford gates. The Hadamard gate $H$, the phase gate $S$, and the CNOT gate are generators for the Clifford group, as any gate in this group can be produced by combinations thereof, up to a global phase that we ignore. However, at least one gate outside the Clifford group is required for universal quantum computing. The $T$ gate is often selected because it is the easiest to produce; however, as we explain below, “easy” is relative, and this gate is still quite expensive in computing resources.

In most quantum codes, including the surface code Fowler et al. (2009), non-Clifford gates are produced using an ancilla state that is injected into the circuit Gottesman and Chuang (1999); Nielsen and Chuang (2000). As this ancilla is produced in a faulty manner, it must be purified through magic state distillation Bravyi and Kitaev (2005); Meier et al. (2012). The handful of rounds of state distillation needed to reach the error rates required for quantum algorithms are expensive, such that a single $T$ gate requires many times the circuit volume (product of qubits and time steps) of a CNOT or other Clifford gate Jones et al. (2012a), making its production the dominant cost among fault-tolerant gate primitives. This poses an issue for quantum computing, as very many $T$ gates in the form of Toffoli gates are required for typical quantum algorithms like integer factoring or quantum simulation. The first Toffoli gate construction we present uses four $T$ gates instead of seven, thereby reducing the overhead due to state distillation.

Let us denote as Toffoli$^*$ the operation in Figure 1a, which requires four $T$ gates and was introduced by Selinger (2012). Toffoli and Toffoli$^*$ differ only by a controlled-$S^\dagger$ gate between the control qubits $x$ and $y$. Beginning with Toffoli$^*$, we need only an ancilla qubit, a phase gate $S$, and teleportation to implement the exact Toffoli gate, as shown in Figure 1b. We first apply the Toffoli$^*$ using the same controls as the desired Toffoli but with an ancilla as target. The erroneous controlled-$S^\dagger$ is corrected by a simple $S$ gate applied to the ancilla. Afterwards, the CNOT and measurement teleport the doubly conditional NOT operation encoded in the ancilla to the target qubit of the desired Toffoli. The measurement result determines whether a corrective controlled-$Z$ gate, which is in the Clifford group, is required to correct a phase resulting from measurement back-action. One can readily verify that only four $T$ gates are required in this procedure Nielsen and Chuang (2000); Gottesman and Chuang (1999). Note that the inverse gate $T^\dagger$ requires the same ancilla-based teleportation circuit as $T$, so these gates are equivalent in state-distillation cost and construction.
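The teleportation argument can be checked numerically with a small state-vector sketch. The code below is not the circuit of Figure 1b gate for gate: it models the four-$T$-gate operation abstractly as a Toffoli followed by a controlled-$S^\dagger$ on the controls (a sign convention assumed here; if the phase is controlled-$S$ instead, the ancilla correction becomes $S^\dagger$), and it assumes the ancilla is measured in the $X$ basis. All function and variable names are illustrative.

```python
import itertools
import numpy as np


def basis_gate(n, rule):
    """Matrix of an n-qubit gate given as rule(bits) -> (new_bits, phase)."""
    U = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for col, bits in enumerate(itertools.product((0, 1), repeat=n)):
        new_bits, phase = rule(bits)
        row = int("".join(str(b) for b in new_bits), 2)
        U[row, col] = phase
    return U


# Qubit order on four qubits: x, y (controls), a (ancilla), t (target).
# Assumed model of the four-T-gate operation: Toffoli times controlled-S^dagger.
TOFFOLI_STAR = basis_gate(
    4, lambda b: ((b[0], b[1], b[2] ^ (b[0] & b[1]), b[3]), (-1j) ** (b[0] & b[1]))
)
S_ON_A = basis_gate(4, lambda b: (b, 1j ** b[2]))
CNOT_A_T = basis_gate(4, lambda b: ((b[0], b[1], b[2], b[3] ^ b[2]), 1))
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
H_ON_A = np.kron(np.kron(np.eye(4), H), np.eye(2))

# Reference gates on the three data qubits x, y, t.
TOFFOLI = basis_gate(3, lambda b: ((b[0], b[1], b[2] ^ (b[0] & b[1])), 1))
CZ_XY = basis_gate(3, lambda b: (b, (-1) ** (b[0] & b[1])))


def teleported_gate(m):
    """Net operation on (x, y, t) when the ancilla measurement yields m."""
    ket0 = np.array([[1.0], [0.0]])
    bra_m = np.eye(2)[[m]]  # row vector <m|
    prepare = np.kron(np.kron(np.eye(4), ket0), np.eye(2))   # insert |0> ancilla
    project = np.kron(np.kron(np.eye(4), bra_m), np.eye(2))  # post-select outcome m
    kraus = project @ H_ON_A @ CNOT_A_T @ S_ON_A @ TOFFOLI_STAR @ prepare
    correction = np.linalg.matrix_power(CZ_XY, m)  # controlled-Z only if m == 1
    return np.sqrt(2) * correction @ kraus  # each outcome has probability 1/2


for m in (0, 1):
    print(m, np.allclose(teleported_gate(m), TOFFOLI))
```

Both measurement outcomes reproduce the exact Toffoli once the conditional controlled-$Z$ is applied, which is the content of the verification claimed in the text.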

The construction in Figure 1b can also be used to add control-qubit inputs to an existing controlled-$U$ gate, where $U$ is any unitary. Replace the CNOT in Figure 1b with controlled-$U$ (targeting however many qubits $U$ acts on), and the result is controlled-controlled-$U$. By iterating this procedure, one can add $n$ controls to controlled-$U$ using $4n$ $T$ gates. The best prior result required $8n$ $T$ gates Selinger (2012).

## III Error-detecting Toffoli circuit

Whereas the previous section reduced the number of $T$ gates needed to make a Toffoli, this section addresses the resource-cost problem differently by making each $T$ gate less expensive. The cost of a $T$ gate, measured in circuit volume (qubits $\times$ gates), scales inversely with the probability $p$ of it having an undetected error. We introduce a new Toffoli gate that can detect a $\sigma^z$ error in any one of eight $T$ gates. As a result, the effective error probability of the Toffoli gate is $28p^2$ instead of $4p$ (we only consider lowest non-vanishing order throughout this paper since $p \ll 1$). Even though twice as many $T$ gates are needed, they can tolerate larger error rates, so they are substantially less expensive to produce than would otherwise be necessary.

The error-detecting Toffoli circuit is rather simple to derive. It consists of two Toffoli$^*$ gates acting on a target qubit which is in a bit-flip code Nielsen and Chuang (2000), as shown in Figure 2. The gate with reversed triangles is the inverse operation (Toffoli$^{*\dagger}$). Importantly, the controlled-$S^\dagger$ and controlled-$S$ gates acting on the same qubits $x$ and $y$ are inverse operations, so they cancel. A logically equivalent decomposition into $T$ gates is shown in Figure 3; this circuit is convenient for analyzing how errors propagate. We assume that preparation, $H$, CNOT, and measurement operations are perfect, because fault-tolerant error correction for these processes is economical compared to $T$ gates. A single error in any of the $T$ gates will necessarily propagate to the syndrome measurement for this bit-flip code, as indicated by the red dashed lines. Upon such an event, all of the qubits are discarded. Note that errors, if present, do not propagate anywhere since they commute with the CNOT gates; they have no effect on the Toffoli gate.
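The detection mechanism can be illustrated at the level of classical bits. The sketch below is a deliberate simplification and not the circuit of Figure 2 or Figure 3: it assumes a two-copy repetition (bit-flip) encoding of the target, models a faulty gate as a bit flip on one encoded copy, and uses a parity check as the syndrome. The function names and the `flips` parameter are hypothetical.

```python
def toffoli_target(x, y, z):
    """Target output of a Toffoli gate on classical bits."""
    return z ^ (x & y)


def error_detecting_toffoli(x, y, z, flips=()):
    """Apply one Toffoli per encoded copy of the target, then check parity.

    flips lists the indices (0 or 1) of encoded copies hit by a bit flip.
    Returns (output, syndrome); output is None when the run is discarded.
    """
    copies = [z, z]  # assumed two-bit repetition encoding of the target
    copies = [toffoli_target(x, y, c) for c in copies]
    for index in flips:
        copies[index] ^= 1  # injected fault
    syndrome = copies[0] ^ copies[1]  # parity check of the bit-flip code
    output = None if syndrome else copies[0]
    return output, syndrome


print(error_detecting_toffoli(1, 1, 0))                # (1, 0): correct, accepted
print(error_detecting_toffoli(1, 1, 0, flips=[0]))     # (None, 1): detected, discarded
print(error_detecting_toffoli(1, 1, 0, flips=[0, 1]))  # (0, 0): wrong, undetected
```

A single fault always trips the parity check, while a pair of faults can pass unnoticed, which is why the accepted circuits fail only at second order in the per-gate error probability.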

The circuit in Figure 3 must be discarded upon a detected error event, which happens with probability $8p$ to lowest order. If this circuit were connected by entanglement to other qubits in a quantum algorithm, all of those qubits would have to be discarded, and the algorithm would fail. To avoid this scenario, one can produce a Toffoli ancilla Nielsen and Chuang (2000). If the circuit fails because of a detected error, then the qubits are discarded, but no far-reaching damage occurs since this faulty circuit is not entangled with any data qubits. Conditioned on the circuit succeeding, the ancilla is teleported into data qubits to enact a Toffoli gate, using only Clifford gates and measurement, as shown in Figure 4. For the representative $T$-gate error rate considered in Section IV, the failure probability for preparing the Toffoli ancilla is modest, which negligibly increases the number of times such preparation circuits must be repeated.
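The lowest-order error estimates used in this section are simple counting arguments, sketched below. The helper names are illustrative, the value of `p` is a hypothetical input rather than a figure from the paper, and the formulas assume independent errors with $p \ll 1$: one error among $n$ gates occurs with probability about $np$, and an undetected pair among eight gates with probability about $\binom{8}{2}p^2 = 28p^2$.

```python
from math import comb


def simple_toffoli_error(p, n_gates=4):
    """Lowest-order error of a Toffoli built from n_gates unprotected T gates."""
    return n_gates * p


def detected_failure_probability(p, n_gates=8):
    """Lowest-order probability that the syndrome flags an error (discard)."""
    return n_gates * p


def posterior_error(p, n_gates=8):
    """Lowest-order error after post-selection: any pair of faulty gates."""
    return comb(n_gates, 2) * p ** 2


def expected_attempts(p, n_gates=8):
    """Average number of preparation attempts per accepted Toffoli ancilla."""
    return 1.0 / (1.0 - detected_failure_probability(p, n_gates))


p = 1e-3  # hypothetical per-gate error probability, for illustration only
print("seven-gate Toffoli error:", simple_toffoli_error(p, 7))
print("four-gate Toffoli error: ", simple_toffoli_error(p, 4))
print("error-detecting Toffoli: ", posterior_error(p))
print("discard probability:     ", detected_failure_probability(p))
print("expected attempts:       ", expected_attempts(p))
```

For small `p` the expected number of attempts stays close to one, so the retry overhead is minor compared with the quadratic suppression of the accepted error rate.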

## IV Resource Analysis

Comparing resource costs between the naive Toffoli gate using seven $T$ gates and our construction using four is straightforward. The latter requires about half the resources of the former, under our assumption that $T$ gates are the dominant cost. There is also a modest improvement in the Toffoli error rate ($7p$ becomes $4p$). However, in fault-tolerant quantum computing, this result is likely overshadowed by the error-detecting construction.

Doubling the number of $T$ gates from four to eight to achieve a Toffoli error rate of $28p^2$ is usually the correct decision. The reason is that this approach is more economical than increasing the accuracy of the $T$ gates through further magic-state distillation (or other fault-tolerant procedures). Bravyi and Haah present a conjecture in the context of magic-state distillation, stating that to produce one magic state with error $p^2$ requires at least two input states with error $p$; hence, the resources needed to increase $T$-gate accuracy to $O(p^2)$ at least double, and in all practical cases known to this author, the overhead factor is larger than two (an example case is considered below). Moreover, there is no known protocol which saturates this bound. Multilevel distillation comes arbitrarily close as $p \rightarrow 0$, but this limiting case is not always relevant for finite $p$, and multilevel protocols require large and complex circuits Jones (2012).
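A rough way to see the break-even point is to write $C(\epsilon)$ for the average number of raw states consumed per $T$ gate of error $\epsilon$ (a symbol introduced here for illustration, not taken from the references) and to read the conjectured bound as $C(p^2) \ge 2\,C(p)$. Under that reading:

$$
\begin{aligned}
\text{error-detecting Toffoli:}\quad & 8\,C(p) \ \text{raw states, Toffoli error } 28p^2,\\
\text{simple Toffoli at error } O(p^2):\quad & 4\,C\!\left(O(p^2)\right) \;\gtrsim\; 4 \cdot 2\,C(p) \;=\; 8\,C(p).
\end{aligned}
$$

This comparison ignores the constant factor inside the error argument, so it is only a heuristic. It suggests the two approaches would tie if a distillation protocol saturated the bound, and that the error-detecting circuit is cheaper whenever the practical overhead factor exceeds two.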

Under conditions relevant to quantum computing, the error-detecting Toffoli in Figure 3 can reach the low error rates required for quantum algorithms with one less round of state distillation, leading to as much as an order-of-magnitude reduction in the resources required to produce a fault-tolerant Toffoli gate. For example, suppose that we wish to produce a Toffoli gate with error probability below . We presume the “raw” $T$-gate ancilla has a $\sigma^z$-error probability of . Using the results in Ref. Meier et al. (2012), the simple Toffoli gate would require four $T$ gates distilled to using a hybrid scheme of one round of Bravyi-Kitaev (BK) distillation and two rounds of Meier-Eastin-Knill (MEK) distillation, at an average total cost of raw states. Conversely, the error-detecting Toffoli would require just one round each of BK and MEK distillation circuits for each of the eight $T$ gates distilled to , for a total average cost of raw states. The resource savings factor is , just in terms of number of undistilled states needed for distillation. In practice, the resource savings is amplified by another factor of 2 because one less round of distillation is needed (fewer gates means smaller circuit volume). Additionally, the state-distillation sub-circuits for the error-detecting Toffoli gate can use weaker error correction (i.e. lower code distance, by about a factor of two) than the same preparation circuits for the simple Toffoli, which translates to fewer qubits and gates at the hardware level Fowler et al. (2012). Relative to the Toffoli circuit using seven $T$ gates, there is an additional savings factor of $7/4$. Therefore, the error-detecting circuit reduces total overhead for non-Clifford gates by up to an order of magnitude in this representative example.

It is also noteworthy that if “raw” $T$ gates can be produced with error rate , then the error-detecting Toffoli has a posterior error probability of approximately . This would enable modest quantum computations using about Toffolis, such as the multiplication of two 1000-bit numbers, without the need for resource-intensive magic-state distillation.

## V Conclusions

The Toffoli gate is a ubiquitous operation in quantum computing, as it plays a key role in many quantum algorithms. However, quantum computers that realize these algorithms are still out of reach. In the meantime, engineering a system capable of large-scale, fault-tolerant quantum computation demands that quantum computer architects minimize computing resource costs in terms of execution time and machine size. The constructions in this paper substantially reduce the circuit volume for the fault-tolerant Toffoli gate when one considers how expensive each non-Clifford gate is to produce. In the case of the error-detecting Toffoli gate, the resource savings is an order of magnitude in the representative example of Section IV. The improved fault-tolerant Toffoli gate brings large-scale quantum computing closer to realization.

###### Acknowledgements.

This work was supported by the Univ. of Tokyo Special Coordination Funds for Promoting Science and Technology, NICT, and the Japan Society for the Promotion of Science (JSPS) through its “Funding Program for World-Leading Innovative R&D on Science and Technology (FIRST Program).”

### References

- John Preskill, “Reliable quantum computers,” Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences 454, 385–410 (1998).
- Michael A. Nielsen and Isaac L. Chuang, Quantum Computation and Quantum Information, 1st ed. (Cambridge University Press, 2000).
- N. Isailovic, M. Whitney, Y. Patel, and J. Kubiatowicz, “Running a quantum circuit at the speed of data,” in 35th International Symposium on Computer Architecture, 2008 (ISCA’08) (2008).
- N. Cody Jones, Rodney Van Meter, Austin G. Fowler, Peter L. McMahon, Jungsang Kim, Thaddeus D. Ladd, and Yoshihisa Yamamoto, “Layered Architecture for Quantum Computing,” Phys. Rev. X 2, 031007 (2012a).
- Austin G. Fowler and Simon J. Devitt, “A bridge to lower overhead quantum computation,” (2012), Preprint arXiv:1209.0510v3.
- Peter W Shor, “Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer,” SIAM J. Comput. 26, 1484–1509 (1997).
- Rodney Van Meter and Kohei M. Itoh, “Fast quantum modular exponentiation,” Phys. Rev. A 71, 052320 (2005).
- Craig R. Clark, Tzvetan S. Metodi, Samuel D. Gasster, and Kenneth R. Brown, “Resource requirements for fault-tolerant quantum simulation: The ground state of the transverse Ising model,” Phys. Rev. A 79, 062314 (2009).
- N. Cody Jones, James D. Whitfield, Peter L. McMahon, Man-Hong Yung, Rodney Van Meter, Alán Aspuru-Guzik, and Yoshihisa Yamamoto, “Faster quantum chemistry simulation on fault-tolerant quantum computers,” New Journal of Physics 14, 115023 (2012b).
- Adriano Barenco, Charles H. Bennett, Richard Cleve, David P. DiVincenzo, Norman Margolus, Peter Shor, Tycho Sleator, John A. Smolin, and Harald Weinfurter, “Elementary gates for quantum computation,” Phys. Rev. A 52, 3457–3467 (1995).
- Xinlan Zhou, Debbie W. Leung, and Isaac L. Chuang, “Methodology for quantum logic gate construction,” Phys. Rev. A 62, 052316 (2000).
- Sergey Bravyi and Alexei Kitaev, “Universal quantum computation with ideal Clifford gates and noisy ancillas,” Phys. Rev. A 71, 022316 (2005).
- R. Raussendorf, J. Harrington, and K. Goyal, “Topological fault-tolerance in cluster state quantum computation,” New Journal of Physics 9, 199 (2007).
- Adam M. Meier, Bryan Eastin, and Emanuel Knill, “Magic-state distillation with the four-qubit code,” (2012), Preprint arXiv:1204.4221v1.
- Sergey Bravyi and Jeongwan Haah, “Magic state distillation with low overhead,” (2012), Preprint arXiv:1209.2426v1.
- Cody Jones, “Multilevel distillation of magic states for quantum computing,” (2012), Preprint arXiv:1210.3388v1.
- Matthew Amy, Dmitri Maslov, Michele Mosca, and Martin Roetteler, “A meet-in-the-middle algorithm for fast synthesis of depth-optimal quantum circuits,” (2012), Preprint arXiv:1206.0758v2.
- Peter Selinger, “Quantum circuits of T-depth one,” (2012), Preprint arXiv:1210.0974v1.
- Bryan Eastin, “Distilling one-qubit magic states into Toffoli states,” (2012), private communication.
- Daniel Gottesman and Isaac L. Chuang, “Demonstrating the viability of universal quantum computation using teleportation and single-qubit operations,” Nature 402, 390–393 (1999).
- Austin G. Fowler, Ashley M. Stephens, and Peter Groszkowski, “High-threshold universal quantum computation on the surface code,” Phys. Rev. A 80, 052312 (2009).
- Austin G. Fowler, Matteo Mariantoni, John M. Martinis, and Andrew N. Cleland, “Surface codes: Towards practical large-scale quantum computation,” Phys. Rev. A 86, 032324 (2012).