Optimization of quantum state tomography in the presence of experimental constraints

Optimization of quantum state tomography in the presence of experimental constraints

Mohammadreza Mohammadi Department of Physics, University of Toronto, 60 Saint George St., Toronto, Ontario M5S 1A7, Canada    Agata M. Brańczyk abranczyk@perimeterinstitute.ca Department of Physics, University of Toronto, 60 Saint George St., Toronto, Ontario M5S 1A7, Canada Department of Chemistry, University of Toronto, 80 Saint George St., Toronto, Ontario M5S 3H6, Canada Perimeter Institute for Theoretical Physics, Waterloo, Ontario N2L 2Y5, Canada
July 12, 2019

In the absence of experimental constraints, optimal measurement schemes for quantum state tomography are well understood. We consider the scenario where the experimenter doesn’t have arbitrary freedom to construct their measurement set, and may therefore not be able to implement a known optimal scheme. We introduce a simple procedure for minimizing the uncertainty in the reconstructed quantum state for an arbitrary tomographic scheme. We do this by defining a figure of merit based on the equally-weighted variance of the measurement statistics. This figure of merit is straightforwardly based on the singular value decomposition of the measurement matrix, making it well-suited for optimization.

03.65.Wj, 42.50.Ex

The characterization of quantum states is a key step in quantum technologies such as quantum computing Kok2007 (), quantum information Nielsen2010 () and quantum cryptography Bennett1984 (). This can be achieved by performing an informationally complete set of measurements on multiple identically-prepared copies of the state, using the results to reconstruct a representation of the state. Such a procedure is known as quantum state tomography (QST) Vogel1989 (); Leonhardt1995 (); James2001 ().

In principle, as long as the measurement set is informationally complete, the density matrix representing the state can always be reconstructed (Sec. I). However, in practice, certain measurement schemes are more prone to error than others.

The optimality of various tomographic schemes has been explored in works such as those by Scott Scott2006 (), Roy and Scott Roy2007 (), and de Burgh et al. deBurgh2008 (). Roughly speaking, for two-state systems, measurement schemes whose probability operators are distributed symmetrically on the Bloch sphere—such as those based on mutually unbiased bases and platonic solids—minimize the error in the reconstructed density matrix.

In this paper, we consider a scenario where the experimenter lacks the resources to implement an arbitrary measurement set, and may therefore not be able to implement a known optimal scheme. We show how one can determine the measurement settings which minimize the uncertainty in the reconstructed quantum state.

To do so, we introduce a figure of merit based on the equally-weighted variance (EWV) of the measurement statistics, that can be used to quantify the uncertainty in the reconstructed density matrix (Sec II). This figure of merit is based on the singular-value decomposition of the measurement matrix and is trivial to compute on a modern computer, making it ideal for the task of optimization. We demonstrate this figure of merit’s utility (Sec III), by optimizing wave-plate parameters in two non-standard optical tomographic scenarios.

I Tomographic protocol

In a typical tomography experiment, multiple copies of the same quantum state interact with a measurement apparatus. The apparatus determines the number of states that registered the outcome , known as ‘counts’.

This can be modelled with an informationally complete set of probability operators . For our purposes, it is more natural to work directly with measurement statistics rather than probabilities. We therefore consider unnormalized probability operators that form an unnormalized POVM such that , where is the total number of counts for the POVM (for a nice introduction to the POVM formalism, refer to Ch 2.2.6 of Nielsen2010 ()). In the absence of an informationally complete POVM, multiple POVMs can be combined to form an informationally complete set.

The number of counts for a given outcome is given by the expectation value of the corresponding probability operator


where is the density matrix of a -level quantum system. The total number of counts for a single POVM is given by .

Given a set of measurement outcomes , one is interested in inferring the density matrix of the measured quantum state. This may be done by first decomposing the unknown density matrix in terms of an orthogonal basis:


where is related to the identity matrix and are the generalized Pauli matrices—a set of -dimensional traceless, Hermitian matrices that are orthogonal under the Hilbert-Schmidt inner product, given by . The parameters completely characterize the state and the condition ensures that the density matrix is normalized.

Inserting Eq. (2) into Eq. (1), we find a linear relationship between the parameters and the measurement outcomes :


This can be written in compact matrix form as


where the vectors


are related by the measurement matrix


If the tomography scheme consists of operators and is informationally complete (i.e. has singular values), the density matrix of an unknown state can be reconstructed by calculating the pseudo-inverse of the measurement matrix to give the parameters in terms of the measurement outcomes :


If , is simply the inverse of and if , the tomography scheme may lead to an overcomplete set of Eqs. for .

A specific example of how the above formalism can be used to implement the familiar six-outcome tomographic protocol for qubits can be found in Appendix A.

We note that the inversion technique described above amounts to performing least-squares estimation which does not always produce physical density matrices. Despite this, we argue in Appendix B that linear inversion provides a reasonable starting point for developing a figure of merit that characterizes the uncertainty in the reconstructed density matrix.

When it comes to actual state reconstruction in practice, one often resorts to using the popular maximum likelihood estimation method Hradil1997 (). Alternatively, one can look to a growing number of exciting new techniques such as the forced purity routine Kaznady2009 (), Baysean mean estimation Blume-Kohout2010 (), hedged maximum likelihood estimation Blume-Kohout2010a (), compressed sensing Gross2010a (), von Neumann entropy maximization Teo2011 (), minimax estimation Ng2012 () and likelihood-free quantum inference Ferrie2013a (), as well as techniques that focus on reconstructing the state with reliable error bars Christandl2011 () and confidence regions Blume-Kohout2012 ().

Ii Deriving a figure of merit

Our goal is to identify the set of operators that minimize the uncertainty in the reconstructed density matrix. To do so, we must first define a figure of merit that quantifies this uncertainty.

In our analysis, we consider statistical uncertainties in measurement outcomes rather than systematic errors, e.g. due to misalignment of optical elements. We therefore neglect error in . We note that a recent study comparing systematic and statistical errors in QST was performed by Langford Langford2013 ().

Under these assumptions, We can relate uncertainties in the detected number of counts to uncertainties in the density matrix by




To assess a tomography scheme’s robustness to noise, we calculate the equally-weighted variance , defined in terms of the equal sum of the variances in the parameters Sabatke2000 ():


From Eq. (9), it follows that


Statistical noise can take the form of signal-independent background noise such as additive Gaussian noise, or signal-dependent noise such as Poisson noise. The latter is endemic to experiments involving particles that are spontaneously generated, such as single photons or neutrons.

Here, we consider Poisson noise, which leads to a figure of merit that is input-state-dependent. An average over all input states gives an expression that corresponds to the result for signal-independent noise.

Fluctuations due to Poisson noise are statistically independent from one measurement to the other Goudail2009 (), simplifying the expression to


For Poisson processes, the variance is given by , thus


From Eq. (4), we see that , giving


In Appendix C, we make use of the singular value decomposition (SVD) (i.e. ) to arrive at the final expression for the equally-weighted variance:


where are the singular values of and


Eq. (18) is an expression for the equally-weighted variance of a tomography scheme undergoing Poisson noise statistics. As expected for signal-dependent noise processes, it is dependent on the target state given by the vector . In the next section, we calculate the average equally-weighted variance.

We emphasize that our figure of merit is applicable for tomographic schemes described by the above formalism. Namely, those where the unknown state is parametrized by a Bloch vector; all Bloch vector elements are of equal interest; and that the probability operators form a POVM. Take for counter-example the scheme for measuring the diagonal elements of the density matrix of a single-mode state of an electromagnetic field introduced by Mogilevtsev Mogilevtsev1998 (). For this scheme, using our notation, we have and , in the Fock basis. The measurement matrix is then given by the Vandermonde matrix. Naive calculation of the EWV from this measurement matrix may not lead to reasonable assessment of the uncertainty in this scheme, since do not form a complete basis and do not satisfy the condition .

ii.1 Average EWV

In many cases, one is interested in how well a tomography scheme performs overall, rather than for one particular state. In this section, we address this question by calculating the average equally-weighted variance for all possible input states:


The average over the first element of is trivial, i.e. . To calculate the average for other values of , i.e. , we follow the approach of Życzkowski et al. Zyczkowski2011 () and assume an average over an ensemble of random states for which the probability measure may be factorized. The distribution of eigenvalues and eigenvectors are therefore independent and, hence, can be averaged independently. While the average over eigenvectors is relatively straightforward, performing the average over eigenvalues can be quite tricky as there is no single probability measure for mixed states. Fortunately, as we show below, for any set of eigenvalues, the average over all possible eigenvectors gives , saving us from the difficulty of averaging over eigenvalues.

To calculate we invert Eq. (2) such that


where we used and made use of the eigenvalue decomposition where and . We can write the average of Eq. (21) over all possible eigenbases for a given set of eigenvalues as


As pointed out in Zyczkowski2011 (), it is natural to assume that the eigenvectors are distributed according to the unique, unitarily invariant, Haar measure on unitary matrices. Different assignments of eigenvectors to eigenvalues are all equally likely since any rearrangement of the eigenvectors could be done by a unitary transformation, and the Haar measure is invariant under unitary transformations. For a given set of eigenvalues , we can therefore decompose the sum over eigenbases into a sum over permutations of as follows


where is the set of all possible maps that permute . We can rewrite the sum over permutations of eigenbases in Eq. (25) as the sum over permutations of eigenvalues as follows


where is the inverse of the map . We note that and that this is independent of the index . We can therefore write


Using the fact that , and that the generalized Pauli matrices are traceless, we find that . We therefore conclude that , and Eq. (20) simplifies to


Inserting the expression for in Eq. (7) into the expression for in (19), we evaluate the expression for :


The operators can be thought of as scaled projectors such that . We use this, as well as the definition , to give


which combined with Eq. (26) gives


Eq. (29) is an expression for the average EWV of a tomography scheme undergoing Poisson noise statistics. This expression is trivial to compute using a modern computer and does not require computational averaging over input states—in contrast to, say, the fidelity deBurgh2008 ()—making it ideal for optimization over tuneable parameters.

If all probability operators are equally weighted, i.e. , Eq. (29) simplifies to


since each column of the unitary matrix has unit length.

We note that the corresponds to the EWV for signal-independent noise, such as additive Gaussian noise.

ii.2 Lower bound for

When evaluating the robustness of a specific tomography scheme to noise, it is useful to make comparisons with the best possible scheme, i.e. one that minimizes .

In this section we derive a lower bound for the average EWV. The approach we use is to find a relationship between the diagonal elements and singular values of the Hermitian matrix . This puts constraints on the parameters which determine . We then use Lagrange multipliers, which provide a strategy for optimizing a function subject to equality constraints, to minimize .

First, consider the singular value decomposition of the matrix , where is defined in Eq. (4).


where is a orthogonal matrix. Comparing Eqs. (31b) and (31c), we find that the eigenvalues of the matrix are related to the singular values of the matrix by .

To calculate a lower bound for the uncertainty, we begin by considering the sum of the diagonal elements of the matrix :


The set of operators


forms an orthonormal basis, in the sense of Hilbert-Schmidt inner product, for the vector space of all Hermitian matrices over real numbers. Recall that are scaled projectors with a trace . We can therefore write the “Hilbert-Schmidt length” of the vector in this vector space as


which simplifies Eq. (33) to


We now make use of the Schur-Horn theorem Schur1923 (); Horn1954 (), which states that there exists an Hermitian matrix with diagonal values and eigenvalues that are both ordered non-increasingly, if and only if,




Bearing in mind that the matrix is Hermitian, we find that


Given the above constraints and the fixed sum , which corresponds to having a fixed total number of counts for a given POVM, we used Lagrange multipliers to show that for a given , the expression is minimal when and . This reduces the problem to a single-variable optimization problem, which yields


Under these conditions, we can find a lower bound for the average EWV by inserting the expressions for in Eqs. (40) into Eq. (30). The final expression for the lower bound for the average EWV is


Given a completely unknown state, one is interested in a tomographic scheme that performs best “on average”. Symmetric informationally-complete positive operator valued measures (SIC-POVMs) are known to be optimal for quantum state tomography under signal-independent noise Scott2006 (). We find that they are also optimal given Poisson noise processes, by showing that the for SIC-POVMs is equal to (see Appendix D).

Iii Examples

The can be used to determine the optimal tomography scheme under a set of practical constraints. We demonstrate this with two examples, using two-state systems, i.e. qubits, encoded in the polarization degree of freedom of single photons. In such systems, the measurement is typically made in the horizontal/vertical basis by counting photons at the output ports of a polarizing beam splitter (PBS). Measurements in different bases are implemented by placing wave plates at different orientations in front of the beam splitter.

In Scenario A, we are restricted to one wave plate per qubit mode, but have the freedom to choose the retardance of each wave plate (Fig. 1 a). We use to optimize over the retardance of the wave plate.

In Scenario B, we consider two wave plates with which we can perform QST on a single qubit mode, however, the wave plates are optimized for a different optical frequency to the one used in the experiment (Fig. 3 a). We minimize to optimize over the orientation of the wave plates about the optical axis.

Figure 1: (Color online) a) Schematic diagram of the Fourier Transform Tomography set-up with one rotating wave plate Mohammadi2013 (). The angle parameterizes the orientation of the wave plate about the optical axis. b) Path traced out by (blue) and (orange) defined in Equation (42), for: and .

iii.1 Optimization of wave plate retardance

A single wave plate in front of a PBS can generate a POVM consisting of two operators associated with the horizontal and vertical output modes of each PBS, given by where


for . In fact, these operators correspond to the special case of a projective-value measure (PVM), where . The unitary operator associated with the wave plate,


rotates the operators on the Bloch sphere by an angle , about the vector


where and are unit vectors in Euclidian space (defined by the axes in Fig. 1) and . As the wave plate is rotated about the optical axis, the projector (after appropriate normalization) traces out a figure-eight path on the Bloch sphere, as shown in Fig. 1. The retardance of the wave plate determines the size of the figure eight.

Although is not informationally complete, an informationally-complete set of operators can be constructed with different orientations of the wave plate about the optical axis, given by the angle , as long as the retardance is not equal to an integer multiple of . Here, the total number of projectors is twice the number of PVMs used to construct the complete tomographic protocol. This scenario was considered in the Fourier Transform Tomography (FTT) scheme introduced in Mohammadi2013 ().

The task is to identify the optimal retardance that minimizes the average EWV. We consider a tomographic protocol which consists of a set of PVMs whose elements are distributed along the figure eight with equally-spaced values of . From these operators, we calculate the coefficients . We also construct a measurement matrix according to Eq. (7) and calculate its singular values . For (i.e. six different values of the retardance which corresponds to six different PVMs), the measurement matrix is given by:

Figure 2: The average equally weighted variance for tomography with one wave plate, scaled by the total number of counts , as a function of the retardance for . The dotted line shows the lower bound , attainable by optimal tomographic schemes such as those that measure the Pauli matrices or the SIC-POVM Rehacek2004 (); Ling2006 (); Kalev2012a (). Note that the EWV is dimensionless.

Using Mathematica Wolfram-Research2010 (), the singular values were found to be:


for . Knowing and , we can calculate the average EWV according to Eq. (29), which is given by


where is the total number of counts. Fig. 2 shows as a function of the retardance for 6 equally-spaced time bins for a single qubit (i.e. ). This function reaches a minimum of when . For comparison, a protocol that consists of three PVMs (see Appendix A)—where the measurement operators of each PVM are the eigenstates of the Pauli operators—gives , where . The four-outcome SIC-POVM also gives , where here .

Figure 3: (Color online) a) Schematic diagram of a tomography set-up with two wave plates of retardance . The angles and parameterize the orientation of the wave plates about the optical axis. b) The six projectors that minimize the ; where (see Eq. (48)).
Figure 4: a) The optimal average equally weighted variance for tomography with two identical wave plates, scaled by the total number of counts , for different values of the retardance . The dotted line shows the lower bound . Note that the EWV is dimensionless. Note that unlike in Fig. 2, the minimization was performed numerically for discrete values of .

iii.2 Optimization of wave plate orientation

The second scenario we consider consists of optimization over wave plate orientation. Given two wave plates—a half-wave plate () and a quarter-wave plate ()—one is able to access the entire surface of the Bloch sphere. A wave plate’s retardance, however, is dependent on the wavelength of the input light; using the wrong wave plate results in a non-standard retardance. Here we consider the situation where the experimenter only has access to two identical wave plates, designed for a different optical wavelength. In such a scenario, access to the entire Bloch sphere may not be possible.

For two wave plates of retardance , the measurement operators associated with the horizontal and vertical output modes of each PBS are given by where


for , where is defined in Eq. (43).

To construct an informationally-complete set of operators, we generate three PVMs from (48) using different realizations of and . We fix for the first PVM and optimize over and for the second and third PVMs. We follow the same procedure as above—constructing a measurement matrix to find its singular values as well as calculating — to compute .

For , optimization over and yields , which occurs at and for the second PVM, and and for the third PVM. The resulting six projectors are shown in Fig 3 b). Fig 4 shows the optimal for different ; showing that there exists a large retardance window that can lead to effectively optimal state reconstruction.

Iv Concluding remarks

We have shown how one can use a simple figure of merit to optimize a tomographic protocol when practical constraints prohibit the implementation of known optimal protocols. In doing so, we considered statistical errors in the form of Poisson noise and assumed no systematic errors. One could in principle follow a similar procedure for noise models appropriate to other experimental scenarios, although whether such a simple figure of merit arrises is to be determined. Systematic errors could also be included by considering uncertainties in the measurement matrix, e.g. .

This procedure could be straightforwardly extended to quantum process tomography Chuang1997a (); Altepeter2003 (); OBrien2004 () where the measurement statistics relate linearly to the unknown density matrix parameters. Extensions to tomographic schemes that consider unknown parameters in both state and process parameters may also be possible Branczyk2012a (); Mogilevtsev2012 (); Merkel2012 (); Blume-Kohout2013 (); Takahashi2013 (). However, we anticipate the nonlinear relationship between measurement statistics and unknown parameters to provide additional challenges.

Our figure of merit is closely related to the trace distance, introduced by Scott Scott2006 (). However, the trace distance is only applicable for signal-independent noise, while our figure of merit accounts for signal-dependent noise. Our figure of merit does not posses the same operational interpretation that might make other figures of merit—such as the fidelity deBurgh2008 ()—more appealing, however it is vastly simpler to work with as it does not require computational averaging over input states.

We anticipate that the method introduced in this paper will help the design of tomographic experiments performed in laboratories with minimal available resources. Furthermore, our figure of merit can be used to compare the performance of different tomographic protocols, in complement to other figures of merit such as the fidelity.

V Acknowledgements

We thank Daniel James for helpful discussions. We also thank Joe Altepeter, Hubert de Guise and Alessandro Fedrizzi for helpful comments on the manuscript. MM acknowledges support from NSERC (USRA) and The Fields Institute. AMB acknowledges support from DARPA (QuBE).

Appendix A Six-measurement example for qubits

Consider a qubit encoded in the polarization of a single photon. One of the simplest measurements one can implement is a two-outcome POVM with elements


In fact, these operators correspond to the special case of a projective-value measure (PVM), where . This measurement is implemented by sending multiple copies of the state through a polarizing beam splitter (which splits the state into horizontal and vertical modes) and counting the number of photons in each output port.

However, this is not enough to reconstruct the quantum state because this PVM isn’t informationally complete. Two additional PVMs can be constructed by inserting a half-wave plate (HWP) followed by a quarter-wave plate (QWP) in front of the PBS. Setting the HWP and QWP at and to the optical axis, respectively, generates the following PVM:


where . Setting them to and gives


where . For all three PVMs, we have assumed equal data collections times and therefore the total number of counts is in all three cases.

Performing the experiment, we get a set of six numbers which correspond to the counts at both output ports of the PBS for each of the three PVMs. These can be written as


We are interested in finding the vector


which characterizes the state of our qubit.

The measurement matrix , which relates to via Eq. (4), is given by




and , , and . The pseudoinverse of is given by


which can be used to determine the vector as follows:


The vector then gives the density matrix according to


Appendix B Discussion of linear inversion and positive semi-definiteness

Use of linear inversion (LI) for state reconstruction yields coefficients that may not necessarily correspond to a physical state, i.e., the estimated density matrices may be unnormalized or negative definite.

We can denote the estimated coefficients obtained from this method by . We also denote coefficients corresponding to the actual state (A) of the system by .

The proposed figure of merit, the EWV, is the expectation value of . As may not correspond to a physical state, the experimenter could solve this problem by, say, looking for a physical state (Ph) that is closest to the one obtained by linear inversion; we denote this state by and characterize the closeness of states by the distance function . In this scenario, we are interested in the expectation value of , as a measure for the error. Since is the closest physical state to , we have . This, along with the triangle inequality, yields .

The error computed while taking the positive semi-definiteness of the density matrix into account would be bounded by a coefficient of the old figure of merit. Therefore, minimizing the proposed figure of merit, EWV, also minimizes an upper bound for the error that takes positive semi-definiteness into account.

We add that the geometry of physical states is especially complicated and there are many open problems—e.g. the existence of SIC-POVMs—associated with it. We therefore consider our approach justified in that it provides a simple procedure that leads to reasonable results.

Appendix C Simplify expression for equally-weighted variance

In this section, we detail the steps between Eqs. (17) and (18). For reference, we repeat Eq. (17) here:


We define a vector


such that Eq. (59) becomes


We rewrite the term as


where is an diagonal matrix whose diagonal elements are the elements of the th column of , i.e. . We perform a singular value decomposition (i.e. ), where and are and matrices recpectively, to give


where . Since is a diagonal matrix of inverse singular values of , i.e. , we can write


Inserting Eq. (67) into Eq. (61), we now have