# Demonstration of Topological Data Analysis on a Quantum Processor

## Abstract

Topological data analysis offers a robust way to extract useful information from noisy, unstructured data by identifying its underlying structure. Recently, an efficient quantum algorithm was proposed [Lloyd, Garnerone, Zanardi, Nat. Commun. 7, 10138 (2016)] for calculating Betti numbers of data points – topological features that count the number of topological holes of various dimensions in a scatterplot. Here, we implement a proof-of-principle demonstration of this quantum algorithm by employing a six-photon quantum processor to successfully analyze the topological features of Betti numbers of a network including three data points, providing new insights into data analysis in the era of quantum computing.

###### pacs:

03.65.Ud, 03.67.Mn, 42.50.Dv, 42.50.Xa

In exploratory data analysis and data mining, data often encodes extremely valuable information, but is typically large, unstructured, noisy, and incomplete, so that extracting useful information from it is an important yet challenging task. Topological data analysis (TDA) Carlsson (2009) provides a general framework for studying such data in a manner that is insensitive to the particular metric and robust against noise. In particular, persistent homology Edelsbrunner et al. (2002); Zomorodian and Carlsson (2005b) has been well established as a technique for extracting useful information by identifying topological features of data. One essential feature is the number of $k$-dimensional holes and voids in datasets, that is, the $k$-th Betti number $\beta_k$ (a topological invariant). For instance, the first three Betti numbers, $\beta_0$, $\beta_1$, and $\beta_2$, represent respectively the number of connected components, one-dimensional holes, and two-dimensional voids. The Betti numbers abstract away the actual data, reducing it to a purely topological representation, which is valuable for understanding the underlying structure of datasets. The use of topological data analysis to study Betti numbers of data has been growing rapidly in recent years, yielding applications in image recognition Carlsson et al. (2008), signal processing Perea and Harer (2015), network science Petri et al. (2013a, b), sensor analysis De Silva and Ghrist (2007b, a); De Silva and Carlsson (2004); Ghrist and Muhammad (2005), brain connectomics Giusti et al. (2016, 2015), and fMRI data analysis Petri et al. (2014); Lord et al. (2016), to name just a few.

In practice, however, classical topological methods face a formidable computational-complexity problem: a set of $n$ data points possesses $2^n$ potential subsets that could contribute to the topology, quickly overwhelming even the most powerful classical computers, even for modestly sized datasets. So far the best classical algorithm for estimating Betti numbers to all orders with accuracy $\delta$ takes time $O(2^n \log(1/\delta))$ Cohen-Steiner et al. (2007); Basu (1999, 2003, 2008, ); Friedman (1998). Moreover, exact calculation of Betti numbers is known to be PSPACE-hard for some classes of topologies Scheiblechner (2007).

Recently, Lloyd et al. (2014a, 2016) extended methods from quantum machine learning to TDA for efficiently estimating Betti numbers to all orders. Indeed, if the proportion of $k$-simplices generated from a dataset is large enough, the quantum algorithm for calculating Betti numbers to all orders with accuracy $\delta$ has runtime $O(n^5/\delta)$, exponentially faster than the best known classical algorithms. Furthermore, the algorithm does not require a large-scale quantum random access memory (qRAM) Giovannetti et al. (2008): just $O(n^2)$ bits are sufficient for the algorithm to store the information of all pairwise distances between the $n$ data points. The potential computational speedup and its practicality will likely make quantum TDA a promising application for future quantum computers, in addition to Shor’s algorithm Shor (1997); Lu et al. (2007); Lanyon et al. (2007); Huang et al. (2017), quantum simulation Feynman (1982); Lloyd (1996); Lu et al. (2009); Lanyon (2010), solving linear systems Harrow et al. (2009); Cai et al. (2013), and classification of linear vectors Rebentrost et al. (2014); Lloyd et al. (2014b); Cai et al. (2015).

Here we report a proof-of-principle demonstration of the quantum TDA algorithm on a small-scale photonic quantum processor for the first time. The topological features of Betti numbers of three data points are revealed and monitored at two different topological scales in our experiment. Our experiment successfully demonstrates the viability of the algorithm and suggests that data analytics may be an important future application for quantum computing, with widespread applications in our increasingly data-centric world.

To calculate Betti numbers, we first represent data topologically in terms of relationships between data points. Using a cutoff distance $\epsilon$, we group data points into simplices (see Fig. 1(a)), which are fully connected subsets of data points. The set of simplices forms a simplicial complex, the topological structure from which features such as Betti numbers can be extracted. This topological construction is shown in Fig. 1(b-d).

By determining the complete set of Betti numbers over the full range of $\epsilon$, we can then construct the barcode (see Fig. 1(e)) Ghrist (2008), a parameterized version of Betti numbers in a distance-dependent manner. Each bar in the region of $\beta_k$ represents a $k$-dimensional hole, and the length of the bar indicates its persistence in the parameter $\epsilon$. With the barcode, we can qualitatively filter out the short bars as topological noise and capture the long bars as significant features, since the length of bars is indicative of their persistence against changes in distance $\epsilon$. In Fig. 1(e), a bar in the region of $\beta_1$ persists for a long range, leading us to determine that the underlying topological feature of the unstructured data (Fig. 1(b)) is a circle.

In general, the quantum TDA algorithm has two main steps (see Fig. 2(a)). First, one accesses the data to construct the uniform mixture of the $k$-simplices that encode the desired topological structure. The time of this step is exponential in the worst case and in fact depends on the proportion of $k$-simplices. In cases where this fraction is large enough, this step can be implemented efficiently either classically, or using Grover’s algorithm, yielding a further quadratic algorithmic enhancement. In the quantum algorithm, this step can be realized via two smaller steps, namely: (1a) simplicial complex state preparation; (1b) uniform mixed state construction. Second, one implements step (2) to reveal the topological invariants of the structure. This step is realized using the phase-estimation algorithm Nielsen and Chuang (2010), which provides an exponential speedup over known classical procedures. In fact, Lloyd et al. (2014a, 2016) showed that this step can be executed in time $O(n^5/\delta)$, with accuracy $\delta$. The steps of the quantum algorithm are now described in more detail.

Implementing step (1a) constructs the simplicial complex. For a scatterplot including $n$ data points, a $k$-simplex $s_k$ consists of $k+1$ points $j_0, j_1, \ldots, j_k$, together with $k(k+1)/2$ edges, creating a fully connected subset of the data. We can encode a $k$-simplex $s_k$ as an $n$-qubit quantum state $|s_k\rangle$ with 1s at positions $j_0, j_1, \ldots, j_k$ and 0s at the remaining positions.

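The Vietoris-Rips simplicial complex $S^\epsilon$ is the set of $k$-simplices in which all points are within distance $\epsilon$ of each other. In the quantum implementation, we can construct the simplicial complex state $|\psi\rangle_k^\epsilon$ as the uniform superposition of the $k$-simplices in the complex,

$$|\psi\rangle_k^\epsilon = \frac{1}{\sqrt{|S_k^\epsilon|}} \sum_{s_k \in S_k^\epsilon} |s_k\rangle, \tag{1}$$

where $S_k^\epsilon$ denotes the set of $k$-simplices in the complex at scale $\epsilon$.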

Classically verifying whether all points in each $s_k$ are within distance $\epsilon$ of each other allows us to construct the simplicial complex state. Alternatively, we can implement a multi-target Grover’s algorithm Grover (1997) with a membership oracle function $f_k^\epsilon(s_k) = 1$ if $s_k \in S_k^\epsilon$ to verify whether $s_k \in S_k^\epsilon$, yielding a quadratic speedup. Let $H_k^\epsilon$ be the Hilbert space spanned by $|s_k\rangle$ where $s_k \in S_k^\epsilon$. The construction of $|\psi\rangle_k^\epsilon$ also reveals the number of $k$-simplices, $|S_k^\epsilon|$, and takes time $O(n^2 (\zeta_k^\epsilon)^{-1/2})$, where $\zeta_k^\epsilon$ is the proportion of $k$-simplices that are actually in this complex at scale $\epsilon$, and $(\zeta_k^\epsilon)^{-1/2}$ is the number of iterations of the multi-target Grover’s algorithm. When the proportion is too small, the quantum search procedure will fail to find the simplices Lloyd et al. (2014a, 2016).
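The classical route to step (1a) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `rips_simplices` and `encode`, the vertex labelling of the distance matrix, and the use of "distance at most the cutoff" as the membership test are all assumptions, and the Grover iteration count is only the order-of-magnitude estimate (inverse square root of the simplex proportion) with constants dropped.

```python
from itertools import combinations
from math import comb, sqrt, ceil

def rips_simplices(dist, eps, k):
    """Return all k-simplices (tuples of k+1 vertex indices) whose
    pairwise distances are all at most eps."""
    n = len(dist)
    return [
        s for s in combinations(range(n), k + 1)
        if all(dist[a][b] <= eps for a, b in combinations(s, 2))
    ]

def encode(simplex, n):
    """n-bit string with 1s at the simplex's vertex positions, 0s elsewhere."""
    return "".join("1" if j in simplex else "0" for j in range(n))

# Three points with pairwise distances 3, 4 and 5 (vertex labels are assumed).
dist = [[0, 3, 5],
        [3, 0, 4],
        [5, 4, 0]]
n = len(dist)

edges_small = rips_simplices(dist, 3.5, 1)   # one edge present
edges_large = rips_simplices(dist, 4.5, 1)   # two edges present
states_large = [encode(s, n) for s in edges_large]

# Proportion of 1-simplices among all C(n, 2) candidate vertex pairs.
zeta = len(edges_large) / comb(n, 2)
# Rough Grover iteration estimate ~ zeta^(-1/2), constant factors omitted.
grover_iterations = ceil(1 / sqrt(zeta))
```

The bit strings in `states_large` are the computational-basis labels whose uniform superposition forms the simplicial complex state; which physical qubit carries which vertex is a labelling choice made here for illustration.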

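In step (1b), we construct the mixed state,

$$\rho_k^\epsilon = \frac{1}{|S_k^\epsilon|} \sum_{s_k \in S_k^\epsilon} |s_k\rangle\langle s_k|, \tag{2}$$

the uniform mixture over the set of $k$-simplices in the complex. This procedure can be easily realized by adding an $n$-qubit ancillary register, performing controlled-NOT (CNOT) operations to copy $|s_k\rangle$ and thereby construct $\frac{1}{\sqrt{|S_k^\epsilon|}} \sum_{s_k \in S_k^\epsilon} |s_k\rangle \otimes |s_k\rangle$, and finally tracing out the ancillary register to obtain $\rho_k^\epsilon$.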
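The copy-and-trace-out construction of step (1b) can be checked numerically. The sketch below is an assumption-laden illustration: the helper name `uniform_mixture_via_copy` and the two example bit strings are hypothetical, and the CNOT copy is modelled directly by writing the copied amplitudes rather than by simulating gates.

```python
import numpy as np

def uniform_mixture_via_copy(bitstrings):
    """Build sum_s |s>|s> / sqrt(m) on data + ancilla registers, trace out
    the ancilla, and return the reduced density matrix of the data register."""
    n = len(bitstrings[0])
    dim = 2 ** n
    m = len(bitstrings)
    # amp[s, a] is the amplitude of |s> (data) tensor |a> (ancilla).
    amp = np.zeros((dim, dim))
    for bits in bitstrings:
        i = int(bits, 2)
        amp[i, i] = 1 / np.sqrt(m)   # after the CNOT copy the ancilla equals the data
    # Partial trace over the ancilla: rho = amp @ amp^dagger.
    return amp @ amp.conj().T

simplices = ["110", "011"]           # illustrative labels, not taken from the paper
rho = uniform_mixture_via_copy(simplices)

# Expected result: equal classical weights on the simplex basis states.
expected = np.zeros((8, 8))
for bits in simplices:
    expected[int(bits, 2), int(bits, 2)] = 0.5
purity = float(np.trace(rho @ rho))
```

The purity of 1/2 confirms that the coherence between the two simplex states has been removed, which is the point of tracing out the copy.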

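Step (2) acts on the simplicial complex to reveal topological features, and is the core of the exponential speedup in the algorithm. Define the boundary map $\partial_k$ that operates from $H_k^\epsilon$ to $H_{k-1}^\epsilon$ by,

$$\partial_k |s_k\rangle = \sum_{l} (-1)^{l} |s_{k-1}(l)\rangle, \tag{3}$$

where $s_{k-1}(l)$ is obtained from $s_k$ with vertices $j_0, j_1, \ldots, j_k$ by omitting the $l$-th point $j_l$ from $s_k$. The $k$-th Betti number $\beta_k$ is defined as Basu (1999, 2003, 2008, ),

$$\beta_k = \dim\left(\ker \partial_k / \operatorname{Im} \partial_{k+1}\right). \tag{4}$$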

Classical algorithms for calculating Betti numbers to all orders with accuracy $\delta$ require time $O(2^n \log(1/\delta))$ Cohen-Steiner et al. (2007); Basu (1999, 2003, 2008, ); Friedman (1998). In quantum TDA, an exponential speedup is achieved by employing the phase-estimation algorithm. For this purpose, the boundary map $\partial_k$ is embedded into a Hermitian matrix,

$$B_k = \begin{pmatrix} 0 & \partial_k \\ \partial_k^\dagger & 0 \end{pmatrix}. \tag{5}$$

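Now applying phase-estimation to decompose $\rho_k^\epsilon$ in terms of the eigenvectors and eigenvalues of $B_k$, one obtains the probability $p_k$ of projecting onto the kernel by measuring the eigenvalue register. Then the dimension of the kernel of $\partial_k$ can be calculated as $\dim(\ker \partial_k) = p_k\,|S_k^\epsilon|$. When both $\dim(\ker \partial_k)$ and $\dim(\ker \partial_{k+1})$ are determined, we can reconstruct the $k$-th Betti number by,

$$\beta_k = \dim(\ker \partial_k) + \dim(\ker \partial_{k+1}) - |S_{k+1}^\epsilon|. \tag{6}$$

We note that for some special values of $k$, it is trivial to calculate $\dim(\ker \partial_k)$. For example, if a $k$-simplex does not exist, $\dim(\ker \partial_k) = 0$, while $\dim(\ker \partial_0)$ is always equal to the number of points.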
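The readout in step (2) can be emulated classically on small examples. The sketch below is illustrative only and rests on stated assumptions: the boundary map is embedded as the block matrix with the boundary map and its adjoint on the off-diagonal, the kernel dimension is taken as the zero-eigenvalue weight of the uniform mixture times the number of simplices, and the Betti number is reconstructed as the sum of two consecutive kernel dimensions minus the number of simplices one dimension up (rank-nullity). All function names are hypothetical, and an exact diagonalization stands in for quantum phase estimation.

```python
from itertools import combinations
import numpy as np

def rips_simplices(dist, eps, k):
    """k-simplices: (k+1)-subsets whose pairwise distances are all <= eps."""
    n = len(dist)
    return [s for s in combinations(range(n), k + 1)
            if all(dist[a][b] <= eps for a, b in combinations(s, 2))]

def boundary_matrix(simplices_k, simplices_km1):
    """Column = k-simplex, row = (k-1)-simplex, entry (-1)^l when the
    l-th vertex of the k-simplex is omitted."""
    row = {s: i for i, s in enumerate(simplices_km1)}
    d = np.zeros((len(simplices_km1), len(simplices_k)))
    for col, s in enumerate(simplices_k):
        for l in range(len(s)):
            d[row[s[:l] + s[l + 1:]], col] = (-1) ** l
    return d

def kernel_dim(dist, eps, k):
    """dim Ker(boundary_k) from the zero-eigenvalue probability p_k."""
    s_k = rips_simplices(dist, eps, k)
    if not s_k:
        return 0                       # no k-simplices: nothing in the kernel
    if k == 0:
        return len(s_k)                # boundary_0 sends every vertex to zero
    d = boundary_matrix(s_k, rips_simplices(dist, eps, k - 1))
    m, n_k = d.shape
    # Hermitian embedding acting on (k-1)-simplices followed by k-simplices.
    b = np.block([[np.zeros((m, m)), d], [d.T, np.zeros((n_k, n_k))]])
    vals, vecs = np.linalg.eigh(b)
    zero = vecs[:, np.abs(vals) < 1e-9]
    projector = zero @ zero.T
    # Uniform mixture over k-simplices: weight 1/n_k on each k-simplex state.
    p_k = np.trace(projector[m:, m:]) / n_k
    return int(round(p_k * n_k))

def betti(dist, eps, k):
    n_next = len(rips_simplices(dist, eps, k + 1))
    return kernel_dim(dist, eps, k) + kernel_dim(dist, eps, k + 1) - n_next

dist = [[0, 3, 5], [3, 0, 4], [5, 4, 0]]   # distances 3, 4, 5; labels assumed
results = {eps: (betti(dist, eps, 0), betti(dist, eps, 1))
           for eps in (3.5, 4.5, 5.0)}
```

For the three-point dataset this yields two connected components with one edge, a single component with two edges, and a single component with no hole once all three points are joined and the triangle is filled.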

Careful evaluation indicates that step (2) can estimate Betti numbers to all orders with accuracy $\delta$ in time $O(n^5/\delta)$ Lloyd et al. (2014a, 2016). In the worst case, when the proportion of $k$-simplices is too small, step (1) will fail to find them efficiently, and both the classical and the quantum algorithm will take exponential time. There are, however, specific cases in which step (1) can be implemented efficiently, and in these cases the overall quantum algorithm can provide exponential savings. In fact, we have tested a particular case using data points with random distances between them and found that step (1) can indeed be implemented efficiently (see Supplement 1 for details), either by a classical algorithm or, with a further square-root improvement in time, through Grover’s algorithm.

To experimentally demonstrate the quantum TDA algorithm, we choose the simplest meaningful instance: estimating the Betti numbers for three data points at two different scales. Assume the distances between the three points are 3, 4 and 5 (see Fig. 2(b)). For scales in the ranges $3 < \epsilon_1 < 4$ and $4 < \epsilon_2 < 5$, the corresponding states for 1-simplices (the $k$-simplex for $k \geq 2$ doesn’t exist since not all three data points can be connected at $\epsilon_1$ and $\epsilon_2$) are $|\psi\rangle_1^{\epsilon_1}$ (Fig. 2(c)) and $|\psi\rangle_1^{\epsilon_2}$ (Fig. 2(d)) respectively, which means $|S_1^{\epsilon_1}| = 1$ and $|S_1^{\epsilon_2}| = 2$. A simple quantum circuit is designed to prepare $|\psi\rangle_1^{\epsilon_1}$ ($|\psi\rangle_1^{\epsilon_2}$) directly by removing (adding) a Hadamard gate marked by dashed lines at step (1) in Fig. 2(e).

To construct the corresponding uniform mixed states, we don’t actually need to generate a complete copy of $|\psi\rangle_1^{\epsilon_1}$ ($|\psi\rangle_1^{\epsilon_2}$). Instead, we need only perform a CNOT operation between the auxiliary qubit and the second qubit of $|\psi\rangle_1^{\epsilon_1}$ ($|\psi\rangle_1^{\epsilon_2}$) to partially copy the state of simplices. After tracing out the ancillary qubit, the uniform mixed states $\rho_1^{\epsilon_1}$ and $\rho_1^{\epsilon_2}$ are obtained.

Next, we apply quantum phase-estimation to reveal information related to Betti numbers. Since there are only three data points, $k$-dimensional holes for $k \geq 2$ cannot exist. Therefore, only the 0-th and 1-st Betti numbers need to be calculated. We note that the algorithm does not depend on the exact eigenvalue spectrum, only on the probability of detecting the zero eigenvalue in the eigenvalue register. We can exploit this property to reduce the number of qubits required in the eigenvalue register. A particular treatment for boundary matrices is used to greatly simplify the circuit (see Supplement 1 for details): a single CNOT operation between the eigenvalue register, comprising only one qubit, and the first qubit of the simplex register is sufficient for realizing phase-estimation. Finally, the information related to Betti numbers is read out by measuring the eigenvalue register. Note that since the quantum TDA algorithm only depends on how the points are connected, not on the precise distances between points, our circuit works for all nontrivial cases of three points (where one or two edges are present). The cases where zero or three edges are present are trivial, since the Betti numbers are known without calculation when the $n$ points are all disconnected ($\beta_0 = n$, and $\beta_k = 0$ for $k \geq 1$) or all connected ($\beta_0 = 1$, and $\beta_k = 0$ for $k \geq 1$).

Fig. 3 shows the setup of our experiment. We use single photons as qubits, where the logical qubits $|0\rangle$ and $|1\rangle$ are encoded into horizontal ($H$) and vertical ($V$) polarization, respectively. With these settings, the step of simplices state preparation becomes straightforward. $|\psi\rangle_1^{\epsilon_1}$ and $|\psi\rangle_1^{\epsilon_2}$ can be prepared directly by adding or removing the polarizer in path 2 respectively, where the index in denotes the spatial mode. Photons 4 (ancilla) and 5 (eigenvalue register) are both disentangled by polarizers into , and then photons 3 and 6 (trigger) immediately collapse into . Note that the CNOT gates can be simulated using combinations of a polarizing beam splitter (PBS) and a half-wave plate (HWP) Lu et al. (2007), since the target qubits are fixed at . This setup, in principle, suffices to demonstrate the underlying conceptual principles of quantum TDA.

Before running the algorithm, we first characterized the performance of the optical quantum circuit. In the case of $\epsilon_2$, a three-photon entangled state is generated after implementing the CNOT gate in step (1a). We measured the fidelity of the experimentally prepared state (see Supplement 1 for details) as , which exceeds the threshold of 0.5 for the entanglement witness to confirm genuine multi-partite entanglement Gühne and Tóth (2009). To the best of our knowledge, such a high fidelity for three-photon entanglement has not been achieved before Hamel et al. (2014).

After tracing out the ancilla in the Pauli- basis, the uniform mixed states,

(7) |

are created at the scales of $\epsilon_1$ and $\epsilon_2$, respectively. We characterized these states using quantum state tomography to reconstruct the density matrices (see Supplement 1 for details). The fidelity and trace distance between the reconstructed () and ideal () matrices were calculated as , and , respectively. Furthermore, the fidelity and trace distance are related by the inequality Nielsen and Chuang (2010). In our experiment, both and are located in the range of , and close to the lower bound.
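The two figures of merit used for the tomographic reconstruction can be computed as in the sketch below. It assumes the Nielsen and Chuang convention, in which the fidelity is the trace of the square root of (square root of the first state, times the second state, times the square root of the first state), the trace distance is half the trace norm of the difference, and the two are linked by 1 − F ≤ D ≤ √(1 − F²). The density matrices here are made-up single-qubit examples, not the experimental data.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0, None)          # guard against tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def fidelity(rho, sigma):
    """F = tr sqrt( sqrt(rho) sigma sqrt(rho) ) (Nielsen-Chuang convention)."""
    root = psd_sqrt(rho)
    return float(np.real(np.trace(psd_sqrt(root @ sigma @ root))))

def trace_distance(rho, sigma):
    """D = (1/2) tr |rho - sigma|."""
    vals = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * float(np.sum(np.abs(vals)))

# Hypothetical example: ideal uniform mixture versus a biased reconstruction.
ideal = np.diag([0.5, 0.5])
measured = np.diag([0.7, 0.3])

F = fidelity(ideal, measured)
D = trace_distance(ideal, measured)
lower, upper = 1 - F, np.sqrt(1 - F ** 2)
```

If the fidelity is instead reported in its squared form, the bounds change accordingly, so the convention should be checked before comparing against published numbers.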

The final results were read out via 6-fold coincidence events. Figures 4(a,b) show the measurement results of the eigenvalue register at the scales of $\epsilon_1$ and $\epsilon_2$, respectively. In the case of $\epsilon_1$, with a probability of we measure in the eigenvalue register, from which we calculate the dimension of the kernel space as . Since and for , we finally obtain the 0-th Betti number and 1-st Betti number , following Eq. 4, which can be rounded to $\beta_0 = 2$ and $\beta_1 = 0$. In the case of $\epsilon_2$, the probability of measuring in the eigenvalue register is 0.038(9). Using the same approach, we calculate the 0-th and 1-st Betti numbers as and , respectively, which can be rounded to $\beta_0 = 1$ and $\beta_1 = 0$. That is to say, we have revealed and tracked the topological features of the dataset in Fig. 2(b) at two different scales: the numbers of connected components at the scales $\epsilon_1$ and $\epsilon_2$ are 2 and 1, respectively, and no $k$-dimensional holes for $k \geq 1$ exist. From these results, the barcode is constructed as shown in Fig. 4(c).
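For comparison with the measured values, the ideal noise-free arithmetic can be written out. This is a worked sketch under stated assumptions: the Betti number is reconstructed as $\beta_k = \dim(\ker\partial_k) + \dim(\ker\partial_{k+1}) - |S_{k+1}^\epsilon|$ (rank-nullity applied to the definition of $\beta_k$), $\dim(\ker\partial_0) = 3$ for three points, $\dim(\ker\partial_1) = 0$ at both scales because one or two edges contain no closed loop, and there are no 2-simplices.

```latex
\begin{align*}
\beta_0(\epsilon_1) &= \dim(\ker\partial_0) + \dim(\ker\partial_1) - |S_1^{\epsilon_1}| = 3 + 0 - 1 = 2,\\
\beta_0(\epsilon_2) &= \dim(\ker\partial_0) + \dim(\ker\partial_1) - |S_1^{\epsilon_2}| = 3 + 0 - 2 = 1,\\
\beta_1(\epsilon_1) &= \beta_1(\epsilon_2) = \dim(\ker\partial_1) + \dim(\ker\partial_2) - |S_2^{\epsilon}| = 0 + 0 - 0 = 0.
\end{align*}
```

In this ideal case the zero-eigenvalue probability for the 1-simplex register is exactly zero at both scales, so any small nonzero measured probability reflects experimental noise that is removed by rounding.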

To further quantify the experimental performance, we use the similarity measure Fuchs (1996) to characterize the overlap between experimental and theoretical values, where and are the experimental and theoretical output probabilities of the state , respectively. The data in Fig. 4 shows the results as and , indicating near perfect experimental accuracy, confirming that the algorithm is successful.

We note that for the quantum TDA algorithm, the results are read out by measuring the eigenvalues. In general, the eigenvalue register requires only a few qubits for the quantum TDA algorithm (1 qubit in the current work), since we only care about the proportion of zero-eigenvalue outcomes in the eigenvalue register, rather than the exact value of all eigenvalues. Thus, a small number of measurements is sufficient for obtaining reliable results, an important feature for the scalability of the algorithm.

In addition, in theory only the qubits in the eigenvalue register need to be measured for the quantum TDA algorithm, rather than all qubits. In our experiment, because photon generation by spontaneous parametric down-conversion is probabilistic, we need to measure 6-fold coincidence events to ensure that all qubits in the circuit have been generated and that the quantum circuit has been fully implemented. This is a common problem in current linear-optical quantum computing. Fortunately, with the development of deterministic quantum-dot single-photon sources He et al. (2017), and other techniques Kaneda et al. (2015), we believe this problem can eventually be overcome. We anticipate that with more qubits (more photons Wang et al. (2016, 2017) or higher dimensional states Fickler et al. (2012); Wang et al. (2015)), our proposal could be extended to the analysis of much larger datasets in the future.

In summary, we have presented the first proof-of-principle demonstration of quantum TDA on a small-scale photonic quantum processor. The topological features of a dataset comprising three data points are revealed and tracked at two different topological scales, fully reproducing the Betti numbers associated with the topology of the data. Future advances in the field could open up new frontiers in data analysis for quantum computing, including signal and image analysis, astronomy, network and social media analysis, behavioral dynamics, biophysics, oncology and neuroscience.

Acknowledgements: We thank R.-Z. Liu, Michele Cirafici, T. L. for enlightening discussions. This work was supported by the National Natural Science Foundation of China, the Chinese Academy of Sciences, and the National Fundamental Research Program. P.P.R. is funded by an ARC Future Fellowship (project FT160100397).

See Supplement 1 for supporting content.

## Supplemental Material

## I Background and practical applications of Betti number and TDA

Betti numbers are a way to describe the connectivity within a topological space. In simplest terms, the $k$-th Betti number $\beta_k$ counts the number of $k$-dimensional holes in a topological space, for example,

- $\beta_0$ is the number of connected components;

- $\beta_1$ is the number of planar holes (1-dimensional holes);

- $\beta_2$ is the number of two-dimensional voids (2-dimensional holes);

- …

Betti numbers are topological invariants: spaces that are homotopy equivalent have the same Betti numbers Carlsson (2009). To illustrate Betti numbers more vividly, some examples are shown in Fig. 5. A circle has one connected component and one 1-dimensional hole, thus $\beta_0 = 1$ and $\beta_1 = 1$. A circle and a triangle are homotopy equivalent, so their Betti numbers are the same (see Fig. 5(a)); similarly, the two-dimensional hollow sphere is homotopy equivalent to a hollow tetrahedron (see Fig. 5(b)). Thus, Betti numbers can record significant topological features of a shape, which can be used directly in pattern recognition Carlsson (2014), anomaly detection Johannsen and Marchette (2012), and computational linguistics Nilsson and Ekgren (2013). For instance, in a simple shape recognition task such as the recognition of printed letters, the Betti numbers allow us to identify and distinguish the letters “A” and “B” in Fig. 5(c), even in the presence of some deformation.

Now, we briefly introduce some mathematical background for Betti numbers. For more details, one can refer to Nakahara (2003).

We first describe how to use a simplicial complex to formally describe a topological structure.

Simplex: A $k$-simplex is a fully connected set of $k+1$ affine geometric points $j_0, j_1, \ldots, j_k$, together with $k(k+1)/2$ edges (see Fig. 1(a) for some examples), where $k$ is the dimension of the simplex.

Simplicial complex: Roughly speaking, a simplicial complex $K$ is a finite set of simplices (see Fig. 1(d) for an example) such that:

(i) any face of a simplex of $K$ is a simplex of $K$,

(ii) the intersection of any two simplices of $K$ is either empty or a common face of both.

Next, we will introduce the chain group, boundary operator, cycle group and boundary group, and then how to calculate the Betti numbers.

$k$-chain group: A $k$-chain is a formal sum of $k$-simplices with integer coefficients, which can be written as $c = \sum_i c_i\, s_k^{(i)}$ with $c_i \in \mathbb{Z}$, where $\{s_k^{(i)}\}$ is the set of $k$-simplices of $K$. The set of all $k$-chains forms an Abelian group $C_k(K)$.

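$k$-boundary operator: For a $k$-simplex $s_k = [j_0, j_1, \ldots, j_k]$, the boundary map $\partial_k$ is given by

$$\partial_k s_k = \sum_{l=0}^{k} (-1)^{l}\, [j_0, \ldots, \hat{j}_l, \ldots, j_k],$$

where $\hat{j}_l$ indicates that $j_l$ is removed, and $[j_0, \ldots, \hat{j}_l, \ldots, j_k]$ is the $(k-1)$-simplex spanned by all the vertices except $j_l$.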

$k$-boundary group and $k$-cycle group: The $k$-boundary group is defined as $B_k(K) = \operatorname{Im} \partial_{k+1}$, containing elements that are boundaries of $(k+1)$-dimensional objects. The $k$-cycle group is defined as $Z_k(K) = \ker \partial_k$; the elements in the cycle group can be understood as ‘loops’. It can be proved that $B_k(K) \subseteq Z_k(K)$.
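The statement that every boundary is a cycle is equivalent to the composition of two consecutive boundary maps vanishing, which can be verified numerically. The sketch below is an illustration on a filled tetrahedron; the helper name `boundary_matrix` and the sign convention (the sign alternates with the position of the omitted vertex) are assumptions chosen to match the alternating-sum form of the boundary map.

```python
from itertools import combinations
import numpy as np

def boundary_matrix(simplices_k, simplices_km1):
    """Column = k-simplex, row = (k-1)-simplex, entry (-1)^l when the
    l-th vertex of the k-simplex is omitted."""
    row = {s: i for i, s in enumerate(simplices_km1)}
    d = np.zeros((len(simplices_km1), len(simplices_k)))
    for col, s in enumerate(simplices_k):
        for l in range(len(s)):
            d[row[s[:l] + s[l + 1:]], col] = (-1) ** l
    return d

# All simplices of a filled tetrahedron on vertices 0..3, grouped by dimension.
simplices = {k: list(combinations(range(4), k + 1)) for k in range(4)}
d1 = boundary_matrix(simplices[1], simplices[0])   # edges -> vertices
d2 = boundary_matrix(simplices[2], simplices[1])   # triangles -> edges
d3 = boundary_matrix(simplices[3], simplices[2])   # tetrahedron -> triangles

# Boundaries are cycles: consecutive boundary maps compose to zero.
d1_d2 = d1 @ d2
d2_d3 = d2 @ d3

# Betti number beta_1 = dim Ker(d1) - rank(d2) for this complex.
rank_d1 = int(np.linalg.matrix_rank(d1))
rank_d2 = int(np.linalg.matrix_rank(d2))
beta_1 = (d1.shape[1] - rank_d1) - rank_d2
```

Because the tetrahedron is filled, every loop of edges bounds a union of triangles, so the first Betti number comes out as zero.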

Homology group: Let $K$ be an $n$-dimensional simplicial complex. The $k$-th homology group $H_k(K)$ associated with $K$ is defined by $H_k(K) = Z_k(K)/B_k(K)$, which represents those elements of $Z_k(K)$ (loops) that are not boundaries.

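Betti numbers: The $k$-th Betti number is defined by

$$\beta_k = \dim H_k(K) = \dim\left(\ker \partial_k / \operatorname{Im} \partial_{k+1}\right).$$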

Using Betti numbers, we can detect geometric features of high-dimensional objects that are otherwise invisible. Applying Betti numbers to data analysis can help us analyze and exploit the complex topological and geometric structures underlying data. Next, we introduce how to use persistent homology, a sophisticated topological data analysis method, to extract useful information by identifying the topological features (Betti numbers) of data.

From points to simplicial complex: In data analysis, data is usually represented as an unordered sequence of points (see Fig. 1(b)). To analyze the Betti numbers of such data, we therefore need a method to construct a simplicial complex from the points.

To define a simplicial complex, the most obvious way is to use the points as the vertices of a combinatorial graph whose edges are determined by proximity. Using a cutoff distance $\epsilon$, and connecting points within distance $\epsilon$ (see Fig. 1(b-d) for the procedure), we can construct the simplicial complex (see Fig. 1(d)), called a Vietoris-Rips simplicial complex.

Computing Betti numbers: Having constructed the simplicial complex of data points, we use the method above to calculate Betti numbers, finding the topological structure of the data points.

Barcode: Converting data points into a simplicial complex requires a choice of parameter, the cutoff distance $\epsilon$. However, if $\epsilon$ is too small, almost all points are separated, and no overall structure is apparent; if $\epsilon$ is too large, all the points may be connected with each other, the complex is a single high-dimensional simplex, and no topological holes exist. It is challenging to select an appropriate scale for a given dataset. To address this problem, we observe the evolution of topological features over the full range of $\epsilon$, rather than focussing on a particular numeric value, yielding the barcode (see Fig. 1(e)). Each bar in the region of $\beta_k$ of the barcode represents a $k$-dimensional hole, the length of which indicates its persistence in the parameter $\epsilon$. With the barcode, we can qualitatively filter out the short bars as topological noise and capture the long bars as significant, persistent topological features, since the length of bars is indicative of their persistence against changes in distance $\epsilon$. For further details, refer to Zomorodian and Carlsson (2005a).
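The idea of tracking Betti numbers across scales can be illustrated with a small classical sweep. This is a sketch, not a persistent-homology implementation: it simply recomputes the Betti numbers of the Vietoris-Rips complex at a few chosen cutoffs using matrix ranks, and the point set (corners of a unit square) and function names are hypothetical.

```python
from itertools import combinations
from math import dist as euclid
import numpy as np

def rips_simplices(points, eps, k):
    """k-simplices: (k+1)-subsets of points with all pairwise distances <= eps."""
    return [s for s in combinations(range(len(points)), k + 1)
            if all(euclid(points[a], points[b]) <= eps
                   for a, b in combinations(s, 2))]

def boundary_rank(points, eps, k):
    """Rank of the boundary map from k-simplices to (k-1)-simplices."""
    s_k = rips_simplices(points, eps, k)
    if k == 0 or not s_k:
        return 0
    row = {s: i for i, s in enumerate(rips_simplices(points, eps, k - 1))}
    d = np.zeros((len(row), len(s_k)))
    for col, s in enumerate(s_k):
        for l in range(len(s)):
            d[row[s[:l] + s[l + 1:]], col] = (-1) ** l
    return int(np.linalg.matrix_rank(d))

def betti(points, eps, k):
    """beta_k = dim Ker(boundary_k) - rank(boundary_{k+1})."""
    n_k = len(rips_simplices(points, eps, k))
    return n_k - boundary_rank(points, eps, k) - boundary_rank(points, eps, k + 1)

# Corners of a unit square: side length 1, diagonal sqrt(2) ~ 1.414.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
barcode = {eps: (betti(square, eps, 0), betti(square, eps, 1))
           for eps in (0.5, 1.2, 1.5)}
```

The one-dimensional hole appears once the four sides are connected and disappears once the diagonals fill the square in, so its bar spans the interval between the side length and the diagonal length.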

There are many interesting and useful applications of topological data analysis. For instance, in the field of image recognition, Carlsson et al. found that high-contrast 3×3 pixel patches from grayscale digital images concentrate near the surface of a Klein bottle in a higher-dimensional space Carlsson et al. (2008); in the field of signal processing, Perea and Harer found that persistent homology can detect periodicity in time-series data in a way that is robust to noise Perea and Harer (2015), and that is very stable and accurate especially in the presence of damping; in unsupervised machine learning, persistent homology also provides a powerful tool for the analysis of musical data, exploring common features of classical scores Sethares and Budney (2014).

## II Numerical simulation of the proportion of $k$-simplices in some cases

As mentioned in the main text, the efficiency of step (1) depends on the proportion of $k$-simplices. Here, we study the relationship among the proportion of $k$-simplices, the number of data points $n$, the dimension $k$ of the simplices, and the cutoff distance $\epsilon$ by numerical simulation (see Fig. 6).

In our simulations, without loss of generality, we randomly set the distances between different points in the range [0,1]. In Fig. 6(a), we fix the dimension $k$ to simulate the relationship among the proportion of $k$-simplices, the number of data points $n$, and the cutoff distance $\epsilon$. Since the computational complexity of step (1) in quantum TDA is $O(n^2 (\zeta_k^\epsilon)^{-1/2})$, where $\zeta_k^\epsilon$ is the proportion of $k$-simplices, and the computational complexity of step (2) is $O(n^5/\delta)$, where $\delta$ is the accuracy, we can regard step (1) as efficient in quantum TDA if $n^2 (\zeta_k^\epsilon)^{-1/2} \leq n^5$, that is $\zeta_k^\epsilon \geq 1/n^6$. In Fig. 6(a), the blue area represents $\zeta_k^\epsilon < 1/n^6$, and the green area represents $\zeta_k^\epsilon \geq 1/n^6$. We can see that, as $n$ increases, the green area becomes larger and the blue area becomes smaller. Thus, as $n$ increases, step (1) is efficient over a wider range of cutoff distance $\epsilon$.

In Fig. 6(b), we fix the number of data points $n$ to simulate the relationship between the proportion of $k$-simplices, their dimension $k$, and the cutoff distance $\epsilon$. It is clear that the proportion of $k$-simplices gradually becomes smaller at each cutoff distance $\epsilon$ as $k$ becomes larger. As in Fig. 6(a), we let the blue area represent $\zeta_k^\epsilon < 1/n^6$, and the green area represent $\zeta_k^\epsilon \geq 1/n^6$, yielding Fig. 6(c). We can see that even when $k$ reaches its maximum value, the green area still encompasses over 50% of the region. Across all three panels of Fig. 6, the regime in which step (1) can be regarded as efficient is much larger than that regarded as inefficient. That is, step (1) can be implemented efficiently in the cases covered by our numerical simulations.
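A simulation of this kind can be sketched in a few lines. The code below is not the authors' simulation: the sampling model (independent uniform distances, which need not satisfy the triangle inequality), the values of the number of points and the simplex dimension, the random seed, and the efficiency threshold of one over the sixth power of the number of points (obtained by comparing the two step complexities with constants dropped) are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb
import random

def random_distances(n, rng):
    """Symmetric matrix of independent uniform distances in [0, 1)."""
    dist = [[0.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        dist[a][b] = dist[b][a] = rng.random()
    return dist

def simplex_proportion(dist, eps, k):
    """Fraction of the C(n, k+1) candidate subsets that are k-simplices,
    i.e. whose pairwise distances are all <= eps."""
    n = len(dist)
    hits = sum(
        all(dist[a][b] <= eps for a, b in combinations(s, 2))
        for s in combinations(range(n), k + 1)
    )
    return hits / comb(n, k + 1)

rng = random.Random(0)        # fixed seed so the scan is reproducible
n, k = 8, 2                   # hypothetical instance size and dimension
dist = random_distances(n, rng)
threshold = 1 / n ** 6        # assumed efficiency criterion for step (1)

scan = [(eps, simplex_proportion(dist, eps, k))
        for eps in (0.2, 0.4, 0.6, 0.8, 1.0)]
efficient = [(eps, zeta >= threshold) for eps, zeta in scan]
```

Repeating the scan over many random instances and over a grid of sizes or dimensions would give the kind of two-colour efficiency map described for Fig. 6.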

## III Experimental error analysis

In this section, we will analyze errors introduced by experimental noise and provide an error threshold analysis.

The imperfections in our experiment can be attributed to two major causes: higher-order photon emissions, and partial distinguishability of independent photons. To suppress the influence of higher-order photon emissions, we placed two single-photon detectors at each measurement port. This dual-channel setup can partially suppress higher-order events, because both detectors at one measurement port triggering simultaneously indicates the presence of multiple photons. To ensure a high level of indistinguishability between independent photons, all photons are spectrally filtered by 3-nm narrow-band filters.

The final result of the quantum TDA algorithm is decided by the probability of the zero eigenvalue measured in the eigenvalue register. Assume the ideal probability of measuring the zero eigenvalue is $p$; then the dimension of the kernel of $\partial_k$ can be calculated as $p\,|S_k^\epsilon|$. To obtain the correct dimension in the experiment, we need to ensure that $p_{\mathrm{exp}}|S_k^\epsilon|$ rounds to $p\,|S_k^\epsilon|$, that is $|p_{\mathrm{exp}} - p|\,|S_k^\epsilon| < 1/2$ if we use the rounding principle, where $p_{\mathrm{exp}}$ is the probability of the experimentally measured zero eigenvalue. To quantify the experimental error threshold, we define the error as , and then simulate the error threshold that satisfies the constraint condition $|p_{\mathrm{exp}} - p|\,|S_k^\epsilon| < 1/2$. The relationship between the number of $k$-simplices ($x$ axis) and error threshold ($y$ axis) is shown in Fig. 7. As $|S_k^\epsilon|$ increases, the error threshold decreases. Thus, appropriate fault-tolerance mechanisms should be employed when dealing with large-scale datasets.
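The rounding argument can be made concrete with a short sketch. It assumes that the error is the absolute deviation between the measured and ideal zero-eigenvalue probabilities, which may differ from the error measure used for Fig. 7, and the function names are hypothetical. Under that assumption the tolerable deviation is one over twice the number of simplices.

```python
def kernel_dim_estimate(p_measured, num_simplices):
    """Kernel dimension inferred from a measured zero-eigenvalue probability."""
    return round(p_measured * num_simplices)

def error_threshold(num_simplices):
    """Largest absolute deviation in the probability that still rounds to the
    correct integer dimension (assumed error model: absolute deviation)."""
    return 1 / (2 * num_simplices)

# Hypothetical instance: 3 simplices, ideal kernel dimension 1 (p = 1/3).
num_simplices, p_ideal = 3, 1 / 3
within = kernel_dim_estimate(p_ideal + 0.10, num_simplices)   # deviation < 1/6
beyond = kernel_dim_estimate(p_ideal + 0.20, num_simplices)   # deviation > 1/6

thresholds = [error_threshold(s) for s in (1, 2, 5, 10, 100)]
```

The monotonically shrinking thresholds reproduce the qualitative trend described in the text: larger simplex sets leave less room for experimental error in the measured probability.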

Note that, unlike previous quantum algorithms, the quantum TDA algorithm only cares about the probability of the zero eigenvalue, not all the individual values in the eigenvalue register. Thus, the quantum TDA algorithm could, in principle, be more robust to noise than other algorithms, such as Shor’s algorithm Shor (1997) and the HHL algorithm Harrow et al. (2009), which require an exact quantum state as output.

## IV Necessity of constructing the mixed state

In the quantum TDA algorithm, step (1) is used to construct the uniform mixture of the $k$-simplices, which is realized by: (1a) simplicial complex state preparation; (1b) uniform mixed state construction. In fact, the purpose of step (1) is to sample a $k$-simplex uniformly, which is the essential reason for constructing the mixed state.

Next, we will provide the reason why the quantum TDA algorithm can not directly use the pure state generated in step (1a) as the input of step (2). In step (2), we use quantum phase-estimation algorithm to decompose a mixed state in terms of the eigenvectors of the Hermitian matrix , which acts on the space , and find the probability of the zero eigenvalue to compute the dimension of the kernel of . The mixed state is

where the $k$-simplices form the basis, and is a maximally mixed state. According to quantum mechanics, even using another complete basis set, the maximally mixed state is still of the above form. Thus, could be rewritten in terms of the eigenstate set of

Introduce qubits as the eigenvalue register, after the phase-estimation algorithm,

For each eigenstate , the eigenvalue register will output its corresponding eigenvalue . Thus, the probability of measuring the zero eigenvalue in the register is , where is the number of eigenstates in whose eigenvalue is zero, that is, the dimension of the kernel of . However, if we directly used the pure state generated in step (1a) as the input to step (2), then after decomposing the pure state in terms of the eigenvectors of the Hermitian matrix , the probability of the zero eigenvalue in the register would be meaningless due to interference effects. For ease of understanding, we give an example to show that using the pure state as the input of step (2) outputs wrong results.

For the topological structure in Fig. 8, the 1-simplices are , which are denoted as respectively. The 0-simplices are , which are denoted as respectively. The Hermitian operator is

(8) |

where

(9) |

There are only two eigenstates of the Hermitian matrix whose eigenvalue is zero:

Therefore, after the phase-estimation algorithm, the probability of measuring the eigenvalue of zero in the eigenvalue register should be 2/7. However, if we use the pure state,

Obviously, the probability of measuring the eigenvalue of zero is , which is inconsistent with the expectation 2/7. By this counterexample, we can see that the algorithm cannot use the pure state generated in step (1a) as the input to step (2).
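The same interference effect can be reproduced on a smaller complex. The sketch below deliberately uses a different, simpler structure than the one in Fig. 8 (a hollow triangle with three edges and no filled face), so its numbers are not those quoted above; the matrix layout and variable names are assumptions for illustration. It compares the zero-eigenvalue probability for the uniform mixture of edges with that for the uniform superposition of edges.

```python
import numpy as np

# Boundary map of a hollow triangle: columns are the edges (0,1), (0,2), (1,2),
# rows are the vertices 0, 1, 2. Its kernel is spanned by the loop (1, -1, 1).
d1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]], dtype=float)

# Hermitian embedding acting on vertices (first 3 indices) and edges (last 3).
b1 = np.block([[np.zeros((3, 3)), d1],
               [d1.T, np.zeros((3, 3))]])

vals, vecs = np.linalg.eigh(b1)
zero = vecs[:, np.abs(vals) < 1e-9]
projector = zero @ zero.T          # projector onto the zero eigenspace
edge_block = projector[3:, 3:]     # its restriction to the edge register

# Mixed input: uniform mixture of the three edge states.
rho = np.eye(3) / 3
p_mixed = float(np.trace(edge_block @ rho))

# Pure input: uniform superposition of the three edge states.
psi = np.ones(3) / np.sqrt(3)
p_pure = float(psi @ edge_block @ psi)

kernel_dim_from_mixed = round(p_mixed * 3)
```

The mixture gives probability 1/3, which multiplied by the three edges recovers the one-dimensional kernel, whereas the superposition gives 1/9 because the relative signs in the loop vector partially cancel against the all-positive amplitudes.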

## V Circuit details

To implement the algorithm with a limited number of qubits, our designed circuit differs from the original algorithm via several modifications, some of which have already been mentioned in the main text. Here we show the details of the modifications to phase-estimation, the core of the quantum TDA algorithm. Before introducing the modification, we provide two preliminaries:

(i) Let be an arbitrary unitary operator, the eigenvector and eigenvalue sets of which are and , respectively. If we transform the unitary operator into , where is a constant, then the eigenvalue set becomes , and the eigenvector set does not change. We note that if , then , else if , then .

(ii) Suppose is the input of the phase-estimation algorithm, where is an eigenvalue register with qubits, and is an eigenvector of unitary operator with eigenvalue ( with binary representation). The phase-estimation algorithm is designed to output , where is an approximation to the phase with a precision of bits.

Specifically, the Hermitian boundary matrices at scales and are

(10) |

(11) |

The eigenvalue and eigenvector sets of the boundary matrices and , respectively, are

(12) |

To reduce the number of qubits required in the eigenvalue register, we set , so that the eigenvalue spectrum becomes , without changing the eigenvector set. The algorithm depends not on the full spectrum but only on the probability of being detected in the register, so this special treatment is justified. Transforming into the unitary operator then allows us to implement phase-estimation using an eigenvalue register with only one qubit . For the input , we apply the transformation,

(13) |

Similarly, at the scale of , we set and transform into the unitary operator to meet experimental requirements. For the input , the phase-estimation procedure outputs the state , where . Thus, in our experiment, a single CNOT operation between the one-qubit eigenvalue register and the first bit of () suffices to compile the phase-estimation algorithm.
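The idea of a one-qubit eigenvalue register can be illustrated with a generic textbook circuit (Hadamard, controlled unitary, Hadamard). The sketch below is not the compiled CNOT circuit of the experiment. It assumes that the Hermitian matrix is rescaled so that every nonzero eigenvalue maps to the phase π, and it uses a hypothetical 3×3 matrix with spectrum {−1, 0, 1}. The function name and its `scale` parameter are illustrative.

```python
import numpy as np

def one_qubit_phase_estimation_p0(hermitian, scale):
    """Probability that a one-qubit register reads 0 for a maximally mixed input.

    Assumption: U = exp(i * pi * H / scale), so an eigenvalue 0 gives the
    phase 0 and an eigenvalue +/-scale gives the phase pi.
    """
    dim = hermitian.shape[0]
    evals, evecs = np.linalg.eigh(hermitian)
    unitary = evecs @ np.diag(np.exp(1j * np.pi * evals / scale)) @ evecs.conj().T

    # Register qubit is the first tensor factor.
    hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    h_reg = np.kron(hadamard, np.eye(dim))
    ctrl_u = np.block([[np.eye(dim), np.zeros((dim, dim))],
                       [np.zeros((dim, dim)), unitary]])
    circuit = h_reg @ ctrl_u @ h_reg

    # Input: register in |0>, system in the maximally mixed state.
    rho_in = np.kron(np.diag([1.0, 0.0]), np.eye(dim) / dim)
    rho_out = circuit @ rho_in @ circuit.conj().T

    proj0 = np.kron(np.diag([1.0, 0.0]), np.eye(dim))
    return float(np.real(np.trace(proj0 @ rho_out)))

# Hypothetical Hermitian matrix with eigenvalues -1, 0, 1 (not from the experiment).
example = np.array([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])
p_zero = one_qubit_phase_estimation_p0(example, scale=1.0)
print(p_zero)
```

Here the register reads 0 with probability 1/3, the fraction of zero eigenvalues. The trick relies on the rescaled spectrum containing only the phases 0 and π: an eigenvalue equal to an even multiple of `scale` would also map to phase 0 and be miscounted.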

## VI Experimental implementation of the circuit

In the experiment, we use single photons as qubits, where the logical qubits and are encoded into horizontal () and vertical () polarization, respectively. The setup of our experiment is shown in Fig. 3. Photons in paths 1, 2, and 3 are used to construct simplex states. Photons 4 (ancilla) and 5 (eigenvalue register) are both disentangled by polarizers into , and then photons 3 and 6 (trigger) immediately collapse into . Here we describe details of how to experimentally implement the circuit in Fig. 2(b).

In the initialization stage, the photons in our experiment are generated by spontaneous parametric down-conversion using β-barium borate (BBO). Ultraviolet laser pulses pass through a BBO crystal to produce an entangled state (see Fig. 9(a)). If the entangled state is not needed, we can use a polarizer (POL) to disentangle it to or (see Fig. 9(b)).

In the quantum gate operation stage, we need to implement a gate, a gate, and a CNOT gate. The single-qubit quantum gates and can be experimentally realized using half-wave plates (HWPs) oriented at (see Fig. 9(c)) and (see Fig. 9(d)), respectively. Since the target qubit of the CNOT gate in our circuit is , it can be realized using a combination of a polarizing beam splitter (PBS) and a HWP, and by post-selecting the events where there is exactly one photon exiting each output of the PBS Lu et al. (2007) (see Fig. 9(e)).
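How a half-wave plate acts as a single-qubit gate on a polarization qubit can be seen from its Jones matrix. The gate names and plate angles are missing from the text above, so the sketch below assumes the textbook choice rather than the values used in the experiment: a HWP at 22.5° acts as a Hadamard gate and a HWP at 45° acts as a Pauli-X gate, with horizontal and vertical polarization as the two basis states. The helper `hwp` is illustrative.

```python
import numpy as np

def hwp(theta_deg):
    """Jones matrix of an ideal half-wave plate with its fast axis at
    theta_deg from horizontal (global phase dropped)."""
    t = np.deg2rad(2.0 * theta_deg)
    return np.array([[np.cos(t), np.sin(t)],
                     [np.sin(t), -np.cos(t)]])

hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
pauli_x = np.array([[0.0, 1.0], [1.0, 0.0]])

# Assumed angles: 22.5 degrees for Hadamard, 45 degrees for Pauli-X.
is_hadamard = np.allclose(hwp(22.5), hadamard)
is_pauli_x = np.allclose(hwp(45.0), pauli_x)
print(is_hadamard, is_pauli_x)
```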

In the measurement stage, each photon passes through a quarter-wave plate (QWP), a HWP, and a PBS, and is finally detected by a single-photon detector (see Fig. 9(f)). By adjusting the angles of the QWP and HWP, we can measure the photonic qubit in an arbitrary basis.

## VII Photon source

We developed a high-performance source of polarization-entangled photons generated via spontaneous parametric down-conversion (SPDC) using a sandwich-like bulk Wang et al. (2016), which consists of two identically cut 2-mm-thick beam-like type-II β-barium borate (BBO) crystals with one half-wave plate (HWP) inserted between them. The source simultaneously exhibits high brightness (850 Hz/mW), high efficiency (45% collection efficiency with 3 nm bandwidth filters, and 88% collection efficiency without narrowband filtering), and high fidelity (0.98) at a pump power of 240 mW. These three features are crucial for future scalable photonic quantum technologies.

## VIII Characterizing the three-photon entangled state

Here we show the details for determining the fidelity of the three-photon entangled state and verifying genuine multipartite entanglement Seevinck and Uffink (2001) using an entanglement witness. The fidelity is the overlap of the experimentally produced state with the desired state ,

(14) |

For the three-photon entangled state where , and are the Pauli matrices , , respectively. Fig. 10 shows the experimental data. The expectation values of and are 0.987(1) and 0.921(12) respectively. Thus, the state fidelity of can be calculated as , which exceeds the threshold of 0.5 required for the entanglement witness. With high statistical significance (76 standard deviations), genuine three-photon entanglement is confirmed.
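The arithmetic behind the fidelity estimate can be sketched as follows. The decomposition is lost from the text above, so the sketch assumes the standard one for a GHZ-type state, in which the fidelity is the average of a population term and a coherence term, and it assumes that the two quoted expectation values, 0.987(1) and 0.921(12), are those two terms with independent errors. The function and variable names are illustrative.

```python
import math

def ghz_fidelity(population, population_err, coherence, coherence_err):
    """Fidelity and its uncertainty under the ASSUMED decomposition
    F = (population + coherence) / 2 with independent errors."""
    fidelity = 0.5 * (population + coherence)
    err = 0.5 * math.hypot(population_err, coherence_err)
    return fidelity, err

# Values quoted in the text; their roles as population/coherence are assumed.
fidelity, err = ghz_fidelity(0.987, 0.001, 0.921, 0.012)

# Number of standard deviations above the entanglement-witness threshold of 0.5.
significance = (fidelity - 0.5) / err
print(f"F = {fidelity:.3f} +/- {err:.3f}, {significance:.1f} sigma above 0.5")
```

Under these assumptions the rounded inputs give a fidelity of 0.954 and a significance of roughly 75 standard deviations, close to the 76 quoted above. The small difference would be expected from rounding of the published values.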

## IX State reconstructions

The matrix forms of the experimentally reconstructed states and are