Optimization of quantum state tomography in the presence of experimental constraints
In the absence of experimental constraints, optimal measurement schemes for quantum state tomography are well understood. We consider the scenario where the experimenter doesn’t have arbitrary freedom to construct their measurement set, and may therefore not be able to implement a known optimal scheme. We introduce a simple procedure for minimizing the uncertainty in the reconstructed quantum state for an arbitrary tomographic scheme. We do this by defining a figure of merit based on the equally-weighted variance of the measurement statistics. This figure of merit is straightforwardly based on the singular value decomposition of the measurement matrix, making it well-suited for optimization.
The characterization of quantum states is a key step in quantum technologies such as quantum computing Kok2007 (), quantum information Nielsen2010 () and quantum cryptography Bennett1984 (). This can be achieved by performing an informationally complete set of measurements on multiple identically-prepared copies of the state, using the results to reconstruct a representation of the state. Such a procedure is known as quantum state tomography (QST) Vogel1989 (); Leonhardt1995 (); James2001 ().
In principle, as long as the measurement set is informationally complete, the density matrix representing the state can always be reconstructed (Sec. I). However, in practice, certain measurement schemes are more prone to error than others.
The optimality of various tomographic schemes has been explored in works such as those by Scott Scott2006 (), Roy and Scott Roy2007 (), and de Burgh et al. deBurgh2008 (). Roughly speaking, for two-state systems, measurement schemes whose probability operators are distributed symmetrically on the Bloch sphere—such as those based on mutually unbiased bases and platonic solids—minimize the error in the reconstructed density matrix.
In this paper, we consider a scenario where the experimenter lacks the resources to implement an arbitrary measurement set, and may therefore not be able to implement a known optimal scheme. We show how one can determine the measurement settings which minimize the uncertainty in the reconstructed quantum state.
To do so, we introduce a figure of merit based on the equally-weighted variance (EWV) of the measurement statistics, that can be used to quantify the uncertainty in the reconstructed density matrix (Sec II). This figure of merit is based on the singular-value decomposition of the measurement matrix and is trivial to compute on a modern computer, making it ideal for the task of optimization. We demonstrate this figure of merit’s utility (Sec III), by optimizing wave-plate parameters in two non-standard optical tomographic scenarios.
I Tomographic protocol
In a typical tomography experiment, multiple copies of the same quantum state interact with a measurement apparatus. The apparatus determines the number of states that registered the outcome , known as ‘counts’.
This can be modelled with an informationally complete set of probability operators . For our purposes, it is more natural to work directly with measurement statistics rather than probabilities. We therefore consider unnormalized probability operators that form an unnormalized POVM such that , where is the total number of counts for the POVM (for a nice introduction to the POVM formalism, refer to Ch 2.2.6 of Nielsen2010 ()). In the absence of an informationally complete POVM, multiple POVMs can be combined to form an informationally complete set.
The number of counts for a given outcome is given by the expectation value of the corresponding probability operator
where is the density matrix of a -level quantum system. The total number of counts for a single POVM is given by .
Given a set of measurement outcomes , one is interested in inferring the density matrix of the measured quantum state. This may be done by first decomposing the unknown density matrix in terms of an orthogonal basis:
where is related to the identity matrix and are the generalized Pauli matrices—a set of -dimensional traceless, Hermitian matrices that are orthogonal under the Hilbert-Schmidt inner product, given by . The parameters completely characterize the state and the condition ensures that the density matrix is normalized.
This can be written in compact matrix form as
where the vectors
are related by the measurement matrix
If the tomography scheme consists of operators and is informationally complete (i.e. has singular values), the density matrix of an unknown state can be reconstructed by calculating the pseudo-inverse of the measurement matrix to give the parameters in terms of the measurement outcomes :
If , is simply the inverse of and if , the tomography scheme may lead to an overcomplete set of Eqs. for .
A specific example of how the above formalism can be used to implement the familiar six-outcome tomographic protocol for qubits can be found in Appendix A.
We note that the inversion technique described above amounts to performing least-squares estimation which does not always produce physical density matrices. Despite this, we argue in Appendix B that linear inversion provides a reasonable starting point for developing a figure of merit that characterizes the uncertainty in the reconstructed density matrix.
When it comes to actual state reconstruction in practice, one often resorts to using the popular maximum likelihood estimation method Hradil1997 (). Alternatively, one can look to a growing number of exciting new techniques such as the forced purity routine Kaznady2009 (), Baysean mean estimation Blume-Kohout2010 (), hedged maximum likelihood estimation Blume-Kohout2010a (), compressed sensing Gross2010a (), von Neumann entropy maximization Teo2011 (), minimax estimation Ng2012 () and likelihood-free quantum inference Ferrie2013a (), as well as techniques that focus on reconstructing the state with reliable error bars Christandl2011 () and confidence regions Blume-Kohout2012 ().
Ii Deriving a figure of merit
Our goal is to identify the set of operators that minimize the uncertainty in the reconstructed density matrix. To do so, we must first define a figure of merit that quantifies this uncertainty.
In our analysis, we consider statistical uncertainties in measurement outcomes rather than systematic errors, e.g. due to misalignment of optical elements. We therefore neglect error in . We note that a recent study comparing systematic and statistical errors in QST was performed by Langford Langford2013 ().
Under these assumptions, We can relate uncertainties in the detected number of counts to uncertainties in the density matrix by
To assess a tomography scheme’s robustness to noise, we calculate the equally-weighted variance , defined in terms of the equal sum of the variances in the parameters Sabatke2000 ():
From Eq. (9), it follows that
Statistical noise can take the form of signal-independent background noise such as additive Gaussian noise, or signal-dependent noise such as Poisson noise. The latter is endemic to experiments involving particles that are spontaneously generated, such as single photons or neutrons.
Here, we consider Poisson noise, which leads to a figure of merit that is input-state-dependent. An average over all input states gives an expression that corresponds to the result for signal-independent noise.
Fluctuations due to Poisson noise are statistically independent from one measurement to the other Goudail2009 (), simplifying the expression to
For Poisson processes, the variance is given by , thus
From Eq. (4), we see that , giving
In Appendix C, we make use of the singular value decomposition (SVD) (i.e. ) to arrive at the final expression for the equally-weighted variance:
where are the singular values of and
Eq. (18) is an expression for the equally-weighted variance of a tomography scheme undergoing Poisson noise statistics. As expected for signal-dependent noise processes, it is dependent on the target state given by the vector . In the next section, we calculate the average equally-weighted variance.
We emphasize that our figure of merit is applicable for tomographic schemes described by the above formalism. Namely, those where the unknown state is parametrized by a Bloch vector; all Bloch vector elements are of equal interest; and that the probability operators form a POVM. Take for counter-example the scheme for measuring the diagonal elements of the density matrix of a single-mode state of an electromagnetic field introduced by Mogilevtsev Mogilevtsev1998 (). For this scheme, using our notation, we have and , in the Fock basis. The measurement matrix is then given by the Vandermonde matrix. Naive calculation of the EWV from this measurement matrix may not lead to reasonable assessment of the uncertainty in this scheme, since do not form a complete basis and do not satisfy the condition .
ii.1 Average EWV
In many cases, one is interested in how well a tomography scheme performs overall, rather than for one particular state. In this section, we address this question by calculating the average equally-weighted variance for all possible input states:
The average over the first element of is trivial, i.e. . To calculate the average for other values of , i.e. , we follow the approach of Życzkowski et al. Zyczkowski2011 () and assume an average over an ensemble of random states for which the probability measure may be factorized. The distribution of eigenvalues and eigenvectors are therefore independent and, hence, can be averaged independently. While the average over eigenvectors is relatively straightforward, performing the average over eigenvalues can be quite tricky as there is no single probability measure for mixed states. Fortunately, as we show below, for any set of eigenvalues, the average over all possible eigenvectors gives , saving us from the difficulty of averaging over eigenvalues.
To calculate we invert Eq. (2) such that
where we used and made use of the eigenvalue decomposition where and . We can write the average of Eq. (21) over all possible eigenbases for a given set of eigenvalues as
As pointed out in Zyczkowski2011 (), it is natural to assume that the eigenvectors are distributed according to the unique, unitarily invariant, Haar measure on unitary matrices. Different assignments of eigenvectors to eigenvalues are all equally likely since any rearrangement of the eigenvectors could be done by a unitary transformation, and the Haar measure is invariant under unitary transformations. For a given set of eigenvalues , we can therefore decompose the sum over eigenbases into a sum over permutations of as follows
where is the set of all possible maps that permute . We can rewrite the sum over permutations of eigenbases in Eq. (25) as the sum over permutations of eigenvalues as follows
where is the inverse of the map . We note that and that this is independent of the index . We can therefore write
Using the fact that , and that the generalized Pauli matrices are traceless, we find that . We therefore conclude that , and Eq. (20) simplifies to
The operators can be thought of as scaled projectors such that . We use this, as well as the definition , to give
which combined with Eq. (26) gives
Eq. (29) is an expression for the average EWV of a tomography scheme undergoing Poisson noise statistics. This expression is trivial to compute using a modern computer and does not require computational averaging over input states—in contrast to, say, the fidelity deBurgh2008 ()—making it ideal for optimization over tuneable parameters.
If all probability operators are equally weighted, i.e. , Eq. (29) simplifies to
since each column of the unitary matrix has unit length.
We note that the corresponds to the EWV for signal-independent noise, such as additive Gaussian noise.
ii.2 Lower bound for
When evaluating the robustness of a specific tomography scheme to noise, it is useful to make comparisons with the best possible scheme, i.e. one that minimizes .
In this section we derive a lower bound for the average EWV. The approach we use is to find a relationship between the diagonal elements and singular values of the Hermitian matrix . This puts constraints on the parameters which determine . We then use Lagrange multipliers, which provide a strategy for optimizing a function subject to equality constraints, to minimize .
First, consider the singular value decomposition of the matrix , where is defined in Eq. (4).
To calculate a lower bound for the uncertainty, we begin by considering the sum of the diagonal elements of the matrix :
The set of operators
forms an orthonormal basis, in the sense of Hilbert-Schmidt inner product, for the vector space of all Hermitian matrices over real numbers. Recall that are scaled projectors with a trace . We can therefore write the “Hilbert-Schmidt length” of the vector in this vector space as
which simplifies Eq. (33) to
We now make use of the Schur-Horn theorem Schur1923 (); Horn1954 (), which states that there exists an Hermitian matrix with diagonal values and eigenvalues that are both ordered non-increasingly, if and only if,
Bearing in mind that the matrix is Hermitian, we find that
Given the above constraints and the fixed sum , which corresponds to having a fixed total number of counts for a given POVM, we used Lagrange multipliers to show that for a given , the expression is minimal when and . This reduces the problem to a single-variable optimization problem, which yields
Given a completely unknown state, one is interested in a tomographic scheme that performs best “on average”. Symmetric informationally-complete positive operator valued measures (SIC-POVMs) are known to be optimal for quantum state tomography under signal-independent noise Scott2006 (). We find that they are also optimal given Poisson noise processes, by showing that the for SIC-POVMs is equal to (see Appendix D).
The can be used to determine the optimal tomography scheme under a set of practical constraints. We demonstrate this with two examples, using two-state systems, i.e. qubits, encoded in the polarization degree of freedom of single photons. In such systems, the measurement is typically made in the horizontal/vertical basis by counting photons at the output ports of a polarizing beam splitter (PBS). Measurements in different bases are implemented by placing wave plates at different orientations in front of the beam splitter.
In Scenario A, we are restricted to one wave plate per qubit mode, but have the freedom to choose the retardance of each wave plate (Fig. 1 a). We use to optimize over the retardance of the wave plate.
In Scenario B, we consider two wave plates with which we can perform QST on a single qubit mode, however, the wave plates are optimized for a different optical frequency to the one used in the experiment (Fig. 3 a). We minimize to optimize over the orientation of the wave plates about the optical axis.
iii.1 Optimization of wave plate retardance
A single wave plate in front of a PBS can generate a POVM consisting of two operators associated with the horizontal and vertical output modes of each PBS, given by where
for . In fact, these operators correspond to the special case of a projective-value measure (PVM), where . The unitary operator associated with the wave plate,
rotates the operators on the Bloch sphere by an angle , about the vector
where and are unit vectors in Euclidian space (defined by the axes in Fig. 1) and . As the wave plate is rotated about the optical axis, the projector (after appropriate normalization) traces out a figure-eight path on the Bloch sphere, as shown in Fig. 1. The retardance of the wave plate determines the size of the figure eight.
Although is not informationally complete, an informationally-complete set of operators can be constructed with different orientations of the wave plate about the optical axis, given by the angle , as long as the retardance is not equal to an integer multiple of . Here, the total number of projectors is twice the number of PVMs used to construct the complete tomographic protocol. This scenario was considered in the Fourier Transform Tomography (FTT) scheme introduced in Mohammadi2013 ().
The task is to identify the optimal retardance that minimizes the average EWV. We consider a tomographic protocol which consists of a set of PVMs whose elements are distributed along the figure eight with equally-spaced values of . From these operators, we calculate the coefficients . We also construct a measurement matrix according to Eq. (7) and calculate its singular values . For (i.e. six different values of the retardance which corresponds to six different PVMs), the measurement matrix is given by:
Using Mathematica Wolfram-Research2010 (), the singular values were found to be:
for . Knowing and , we can calculate the average EWV according to Eq. (29), which is given by
where is the total number of counts. Fig. 2 shows as a function of the retardance for 6 equally-spaced time bins for a single qubit (i.e. ). This function reaches a minimum of when . For comparison, a protocol that consists of three PVMs (see Appendix A)—where the measurement operators of each PVM are the eigenstates of the Pauli operators—gives , where . The four-outcome SIC-POVM also gives , where here .
iii.2 Optimization of wave plate orientation
The second scenario we consider consists of optimization over wave plate orientation. Given two wave plates—a half-wave plate () and a quarter-wave plate ()—one is able to access the entire surface of the Bloch sphere. A wave plate’s retardance, however, is dependent on the wavelength of the input light; using the wrong wave plate results in a non-standard retardance. Here we consider the situation where the experimenter only has access to two identical wave plates, designed for a different optical wavelength. In such a scenario, access to the entire Bloch sphere may not be possible.
For two wave plates of retardance , the measurement operators associated with the horizontal and vertical output modes of each PBS are given by where
for , where is defined in Eq. (43).
To construct an informationally-complete set of operators, we generate three PVMs from (48) using different realizations of and . We fix for the first PVM and optimize over and for the second and third PVMs. We follow the same procedure as above—constructing a measurement matrix to find its singular values as well as calculating — to compute .
For , optimization over and yields , which occurs at and for the second PVM, and and for the third PVM. The resulting six projectors are shown in Fig 3 b). Fig 4 shows the optimal for different ; showing that there exists a large retardance window that can lead to effectively optimal state reconstruction.
Iv Concluding remarks
We have shown how one can use a simple figure of merit to optimize a tomographic protocol when practical constraints prohibit the implementation of known optimal protocols. In doing so, we considered statistical errors in the form of Poisson noise and assumed no systematic errors. One could in principle follow a similar procedure for noise models appropriate to other experimental scenarios, although whether such a simple figure of merit arrises is to be determined. Systematic errors could also be included by considering uncertainties in the measurement matrix, e.g. .
This procedure could be straightforwardly extended to quantum process tomography Chuang1997a (); Altepeter2003 (); OBrien2004 () where the measurement statistics relate linearly to the unknown density matrix parameters. Extensions to tomographic schemes that consider unknown parameters in both state and process parameters may also be possible Branczyk2012a (); Mogilevtsev2012 (); Merkel2012 (); Blume-Kohout2013 (); Takahashi2013 (). However, we anticipate the nonlinear relationship between measurement statistics and unknown parameters to provide additional challenges.
Our figure of merit is closely related to the trace distance, introduced by Scott Scott2006 (). However, the trace distance is only applicable for signal-independent noise, while our figure of merit accounts for signal-dependent noise. Our figure of merit does not posses the same operational interpretation that might make other figures of merit—such as the fidelity deBurgh2008 ()—more appealing, however it is vastly simpler to work with as it does not require computational averaging over input states.
We anticipate that the method introduced in this paper will help the design of tomographic experiments performed in laboratories with minimal available resources. Furthermore, our figure of merit can be used to compare the performance of different tomographic protocols, in complement to other figures of merit such as the fidelity.
We thank Daniel James for helpful discussions. We also thank Joe Altepeter, Hubert de Guise and Alessandro Fedrizzi for helpful comments on the manuscript. MM acknowledges support from NSERC (USRA) and The Fields Institute. AMB acknowledges support from DARPA (QuBE).
Appendix A Six-measurement example for qubits
Consider a qubit encoded in the polarization of a single photon. One of the simplest measurements one can implement is a two-outcome POVM with elements
In fact, these operators correspond to the special case of a projective-value measure (PVM), where . This measurement is implemented by sending multiple copies of the state through a polarizing beam splitter (which splits the state into horizontal and vertical modes) and counting the number of photons in each output port.
However, this is not enough to reconstruct the quantum state because this PVM isn’t informationally complete. Two additional PVMs can be constructed by inserting a half-wave plate (HWP) followed by a quarter-wave plate (QWP) in front of the PBS. Setting the HWP and QWP at and to the optical axis, respectively, generates the following PVM:
where . Setting them to and gives
where . For all three PVMs, we have assumed equal data collections times and therefore the total number of counts is in all three cases.
Performing the experiment, we get a set of six numbers which correspond to the counts at both output ports of the PBS for each of the three PVMs. These can be written as
We are interested in finding the vector
which characterizes the state of our qubit.
The measurement matrix , which relates to via Eq. (4), is given by
and , , and . The pseudoinverse of is given by
which can be used to determine the vector as follows:
The vector then gives the density matrix according to
Appendix B Discussion of linear inversion and positive semi-definiteness
Use of linear inversion (LI) for state reconstruction yields coefficients that may not necessarily correspond to a physical state, i.e., the estimated density matrices may be unnormalized or negative definite.
We can denote the estimated coefficients obtained from this method by . We also denote coefficients corresponding to the actual state (A) of the system by .
The proposed figure of merit, the EWV, is the expectation value of . As may not correspond to a physical state, the experimenter could solve this problem by, say, looking for a physical state (Ph) that is closest to the one obtained by linear inversion; we denote this state by and characterize the closeness of states by the distance function . In this scenario, we are interested in the expectation value of , as a measure for the error. Since is the closest physical state to , we have . This, along with the triangle inequality, yields .
The error computed while taking the positive semi-definiteness of the density matrix into account would be bounded by a coefficient of the old figure of merit. Therefore, minimizing the proposed figure of merit, EWV, also minimizes an upper bound for the error that takes positive semi-definiteness into account.
We add that the geometry of physical states is especially complicated and there are many open problems—e.g. the existence of SIC-POVMs—associated with it. We therefore consider our approach justified in that it provides a simple procedure that leads to reasonable results.
Appendix C Simplify expression for equally-weighted variance
We define a vector
such that Eq. (59) becomes
We rewrite the term as
where is an diagonal matrix whose diagonal elements are the elements of the th column of , i.e. . We perform a singular value decomposition (i.e. ), where and are and matrices recpectively, to give
where . Since is a diagonal matrix of inverse singular values of , i.e. , we can write