A Six-measurement example for qubits

# Optimization of quantum state tomography in the presence of experimental constraints

## Abstract

In the absence of experimental constraints, optimal measurement schemes for quantum state tomography are well understood. We consider the scenario where the experimenter doesn’t have arbitrary freedom to construct their measurement set, and may therefore not be able to implement a known optimal scheme. We introduce a simple procedure for minimizing the uncertainty in the reconstructed quantum state for an arbitrary tomographic scheme. We do this by defining a figure of merit based on the equally-weighted variance of the measurement statistics. This figure of merit is straightforwardly based on the singular value decomposition of the measurement matrix, making it well-suited for optimization.

###### pacs:
03.65.Wj, 42.50.Ex

The characterization of quantum states is a key step in quantum technologies such as quantum computing (1), quantum information (2) and quantum cryptography (3). This can be achieved by performing an informationally complete set of measurements on multiple identically-prepared copies of the state, using the results to reconstruct a representation of the state. Such a procedure is known as quantum state tomography (QST) (4); (5); (6).

In principle, as long as the measurement set is informationally complete, the density matrix representing the state can always be reconstructed (Sec. I). However, in practice, certain measurement schemes are more prone to error than others.

The optimality of various tomographic schemes has been explored in works such as those by Scott (7), Roy and Scott (8), and de Burgh et al. (9). Roughly speaking, for two-state systems, measurement schemes whose probability operators are distributed symmetrically on the Bloch sphere—such as those based on mutually unbiased bases and platonic solids—minimize the error in the reconstructed density matrix.

In this paper, we consider a scenario where the experimenter lacks the resources to implement an arbitrary measurement set, and may therefore not be able to implement a known optimal scheme. We show how one can determine the measurement settings which minimize the uncertainty in the reconstructed quantum state.

To do so, we introduce a figure of merit based on the equally-weighted variance (EWV) of the measurement statistics, that can be used to quantify the uncertainty in the reconstructed density matrix (Sec II). This figure of merit is based on the singular-value decomposition of the measurement matrix and is trivial to compute on a modern computer, making it ideal for the task of optimization. We demonstrate this figure of merit’s utility (Sec III), by optimizing wave-plate parameters in two non-standard optical tomographic scenarios.

## I Tomographic protocol

In a typical tomography experiment, multiple copies of the same quantum state interact with a measurement apparatus. The apparatus determines the number of states that registered the outcome , known as ‘counts’.

This can be modelled with an informationally complete set of probability operators . For our purposes, it is more natural to work directly with measurement statistics rather than probabilities. We therefore consider unnormalized probability operators that form an unnormalized POVM such that , where is the total number of counts for the POVM (for a nice introduction to the POVM formalism, refer to Ch 2.2.6 of (2)). In the absence of an informationally complete POVM, multiple POVMs can be combined to form an informationally complete set.

The number of counts for a given outcome is given by the expectation value of the corresponding probability operator

 nj=⟨^Ej⟩=Tr[^ρ^Ej], (1)

where is the density matrix of a -level quantum system. The total number of counts for a single POVM is given by .

Given a set of measurement outcomes , one is interested in inferring the density matrix of the measured quantum state. This may be done by first decomposing the unknown density matrix in terms of an orthogonal basis:

 ^ρ=1√2dd2−1∑i=0Si^σi, (2)

where is related to the identity matrix and are the generalized Pauli matrices—a set of -dimensional traceless, Hermitian matrices that are orthogonal under the Hilbert-Schmidt inner product, given by . The parameters completely characterize the state and the condition ensures that the density matrix is normalized.

Inserting Eq. (2) into Eq. (1), we find a linear relationship between the parameters and the measurement outcomes :

 Extra open brace or missing close brace (3)

This can be written in compact matrix form as

 n=Ma, (4)

where the vectors

 n= (n1,…,nN)T; (5) a= (S0,S1,…,Sd2−1)T, (6)

are related by the measurement matrix

 Mi,j=1√2dTr[^σj−1^Ei]. (7)

If the tomography scheme consists of operators and is informationally complete (i.e. has singular values), the density matrix of an unknown state can be reconstructed by calculating the pseudo-inverse of the measurement matrix to give the parameters in terms of the measurement outcomes :

 a=M+n. (8)

If , is simply the inverse of and if , the tomography scheme may lead to an overcomplete set of Eqs. for .

A specific example of how the above formalism can be used to implement the familiar six-outcome tomographic protocol for qubits can be found in Appendix A.

We note that the inversion technique described above amounts to performing least-squares estimation which does not always produce physical density matrices. Despite this, we argue in Appendix B that linear inversion provides a reasonable starting point for developing a figure of merit that characterizes the uncertainty in the reconstructed density matrix.

When it comes to actual state reconstruction in practice, one often resorts to using the popular maximum likelihood estimation method (10). Alternatively, one can look to a growing number of exciting new techniques such as the forced purity routine (11), Baysean mean estimation (12), hedged maximum likelihood estimation (13), compressed sensing (14), von Neumann entropy maximization (15), minimax estimation (16) and likelihood-free quantum inference (17), as well as techniques that focus on reconstructing the state with reliable error bars (18) and confidence regions (19).

## Ii Deriving a figure of merit

Our goal is to identify the set of operators that minimize the uncertainty in the reconstructed density matrix. To do so, we must first define a figure of merit that quantifies this uncertainty.

In our analysis, we consider statistical uncertainties in measurement outcomes rather than systematic errors, e.g. due to misalignment of optical elements. We therefore neglect error in . We note that a recent study comparing systematic and statistical errors in QST was performed by Langford (20).

Under these assumptions, We can relate uncertainties in the detected number of counts to uncertainties in the density matrix by

 Δa=M+Δn, (9)

where

 Δn= (Δn1,…,ΔnN)T; (10) Δa= (ΔS0,ΔS1,…,ΔSd2−1)T. (11)

To assess a tomography scheme’s robustness to noise, we calculate the equally-weighted variance , defined in terms of the equal sum of the variances in the parameters (21):

 EWV=d2−1∑i=0ΔS2i=Δa⋅Δa. (12)

From Eq. (9), it follows that

 EWV= (M+Δn)⋅(M+Δn) (13) = d2∑i=1(N∑j=1M+ijΔnj)2. (14)

Statistical noise can take the form of signal-independent background noise such as additive Gaussian noise, or signal-dependent noise such as Poisson noise. The latter is endemic to experiments involving particles that are spontaneously generated, such as single photons or neutrons.

Here, we consider Poisson noise, which leads to a figure of merit that is input-state-dependent. An average over all input states gives an expression that corresponds to the result for signal-independent noise.

Fluctuations due to Poisson noise are statistically independent from one measurement to the other (22), simplifying the expression to

 EWV= d2∑i=1N∑j=1(M+ij)2(Δnj)2. (15)

For Poisson processes, the variance is given by , thus

 EWV= d2∑i=1N∑j=1(M+ij)2nj. (16)

From Eq. (4), we see that , giving

 EWV=d2∑i,k=1N∑j=1(M+ij)2Mjkak. (17)

In Appendix C, we make use of the singular value decomposition (SVD) (i.e. ) to arrive at the final expression for the equally-weighted variance:

 EWV=d2∑i=11μ2id2∑k=1ri,kak, (18)

where are the singular values of and

 ri,k= d2∑j=1(U−1ij)2Mjk=d2∑j=1U2j,iMjk. (19)

Eq. (18) is an expression for the equally-weighted variance of a tomography scheme undergoing Poisson noise statistics. As expected for signal-dependent noise processes, it is dependent on the target state given by the vector . In the next section, we calculate the average equally-weighted variance.

We emphasize that our figure of merit is applicable for tomographic schemes described by the above formalism. Namely, those where the unknown state is parametrized by a Bloch vector; all Bloch vector elements are of equal interest; and that the probability operators form a POVM. Take for counter-example the scheme for measuring the diagonal elements of the density matrix of a single-mode state of an electromagnetic field introduced by Mogilevtsev (23). For this scheme, using our notation, we have and , in the Fock basis. The measurement matrix is then given by the Vandermonde matrix. Naive calculation of the EWV from this measurement matrix may not lead to reasonable assessment of the uncertainty in this scheme, since do not form a complete basis and do not satisfy the condition .

### ii.1 Average EWV

In many cases, one is interested in how well a tomography scheme performs overall, rather than for one particular state. In this section, we address this question by calculating the average equally-weighted variance for all possible input states:

 ⟨EWV⟩=d2∑i=11μ2id2∑k=1ri,k⟨ak⟩. (20)

The average over the first element of is trivial, i.e. . To calculate the average for other values of , i.e. , we follow the approach of Życzkowski et al. (24) and assume an average over an ensemble of random states for which the probability measure may be factorized. The distribution of eigenvalues and eigenvectors are therefore independent and, hence, can be averaged independently. While the average over eigenvectors is relatively straightforward, performing the average over eigenvalues can be quite tricky as there is no single probability measure for mixed states. Fortunately, as we show below, for any set of eigenvalues, the average over all possible eigenvectors gives , saving us from the difficulty of averaging over eigenvalues.

To calculate we invert Eq. (2) such that

 ak+1=Sk=√d2Tr[d∑i=1λi|ψi⟩⟨ψi|ρ^σk], (21)

where we used and made use of the eigenvalue decomposition where and . We can write the average of Eq. (21) over all possible eigenbases for a given set of eigenvalues as

 ⟨Sk⟩b= √d2⟨Tr[d∑i=1λi|ψi⟩⟨ψi|^σk]⟩b. (22)

As pointed out in (24), it is natural to assume that the eigenvectors are distributed according to the unique, unitarily invariant, Haar measure on unitary matrices. Different assignments of eigenvectors to eigenvalues are all equally likely since any rearrangement of the eigenvectors could be done by a unitary transformation, and the Haar measure is invariant under unitary transformations. For a given set of eigenvalues , we can therefore decompose the sum over eigenbases into a sum over permutations of as follows

 ⟨Sk⟩b= √d√2d!⟨Tr[d∑i=1∑π∈Pdλi|ψπ(i)⟩⟨ψπ(i)|^σk]⟩b, (23)

where is the set of all possible maps that permute . We can rewrite the sum over permutations of eigenbases in Eq. (25) as the sum over permutations of eigenvalues as follows

 ⟨Sk⟩b= √d√2d!⟨Tr[d∑i=1⎛⎝∑π∈Pdλπ−1(i)⎞⎠|ψi⟩⟨ψi|^σk]⟩b, (24)

where is the inverse of the map . We note that and that this is independent of the index . We can therefore write

 ⟨Sk⟩b= √d√2d!⎛⎝∑π∈Pdλπ(i)⎞⎠⟨Tr[d∑i=1|ψi⟩⟨ψi|^σk]⟩b. (25)

Using the fact that , and that the generalized Pauli matrices are traceless, we find that . We therefore conclude that , and Eq. (20) simplifies to

 ⟨EWV⟩=d2∑i=11μ2iri,1. (26)

Inserting the expression for in Eq. (7) into the expression for in (19), we evaluate the expression for :

 ri,1= d2∑j=1U2j,iMj,1 (27a) = d2∑j=11√2dU2j,iTr[^σ0^Ej]. (27b)

The operators can be thought of as scaled projectors such that . We use this, as well as the definition , to give

 ri,1= 1dd2∑j=1αjU2j,i, (28)

which combined with Eq. (26) gives

 ⟨EWV⟩= 1dd2∑ij=1αjU2j,iμ2i. (29)

Eq. (29) is an expression for the average EWV of a tomography scheme undergoing Poisson noise statistics. This expression is trivial to compute using a modern computer and does not require computational averaging over input states—in contrast to, say, the fidelity (9)—making it ideal for optimization over tuneable parameters.

If all probability operators are equally weighted, i.e. , Eq. (29) simplifies to

 ⟨EWV⟩= αdd2∑i=11μ2i, (30)

since each column of the unitary matrix has unit length.

We note that the corresponds to the EWV for signal-independent noise, such as additive Gaussian noise.

### ii.2 Lower bound for ⟨EWV⟩

When evaluating the robustness of a specific tomography scheme to noise, it is useful to make comparisons with the best possible scheme, i.e. one that minimizes .

In this section we derive a lower bound for the average EWV. The approach we use is to find a relationship between the diagonal elements and singular values of the Hermitian matrix . This puts constraints on the parameters which determine . We then use Lagrange multipliers, which provide a strategy for optimizing a function subject to equality constraints, to minimize .

First, consider the singular value decomposition of the matrix , where is defined in Eq. (4).

 MTM= (VSTUT)(USVT) (31a) = VSTSVT (31b) = VWVT, (31c)

where is a orthogonal matrix. Comparing Eqs. (31b) and (31c), we find that the eigenvalues of the matrix are related to the singular values of the matrix by .

To calculate a lower bound for the uncertainty, we begin by considering the sum of the diagonal elements of the matrix :

 d2∑j=1dj= N∑i=1d2∑j=1MTj,iMij (32) = 12dN∑i=1d2−1∑j=0Tr[^σj^Ei]2. (33)

The set of operators

 B={^σj√2}  for  j=0,…,d2−1 (34)

forms an orthonormal basis, in the sense of Hilbert-Schmidt inner product, for the vector space of all Hermitian matrices over real numbers. Recall that are scaled projectors with a trace . We can therefore write the “Hilbert-Schmidt length” of the vector in this vector space as

 Tr[^Ei^Ei]=12d2−1∑j=0Tr[^σj^Ei]2=α2i, (35)

which simplifies Eq. (33) to

 d2∑j=1dj= ∑Niα2id. (36)

We now make use of the Schur-Horn theorem (25); (26), which states that there exists an Hermitian matrix with diagonal values and eigenvalues that are both ordered non-increasingly, if and only if,

 l∑i=1di≤l∑i=1λi,∀l∈{1,2,...,m} (37)

and

 m∑i=1di=m∑i=1λi. (38)

Bearing in mind that the matrix is Hermitian, we find that

 d2∑i=1μ2i= d2∑j=1dj=∑Niα2id; (39a) and       μ21= λ1≥∑Niα2id2. (39b)

Given the above constraints and the fixed sum , which corresponds to having a fixed total number of counts for a given POVM, we used Lagrange multipliers to show that for a given , the expression is minimal when and . This reduces the problem to a single-variable optimization problem, which yields

 μ21= Nα2d2; (40a) and       μ22=⋯= μ2d2=Nα2d2(d+1). (40b)

Under these conditions, we can find a lower bound for the average EWV by inserting the expressions for in Eqs. (40) into Eq. (30). The final expression for the lower bound for the average EWV is

 ⟨EWV⟩LB= d2(d2+d−1)Nα. (41a)

Given a completely unknown state, one is interested in a tomographic scheme that performs best “on average”. Symmetric informationally-complete positive operator valued measures (SIC-POVMs) are known to be optimal for quantum state tomography under signal-independent noise (7). We find that they are also optimal given Poisson noise processes, by showing that the for SIC-POVMs is equal to (see Appendix D).

## Iii Examples

The can be used to determine the optimal tomography scheme under a set of practical constraints. We demonstrate this with two examples, using two-state systems, i.e. qubits, encoded in the polarization degree of freedom of single photons. In such systems, the measurement is typically made in the horizontal/vertical basis by counting photons at the output ports of a polarizing beam splitter (PBS). Measurements in different bases are implemented by placing wave plates at different orientations in front of the beam splitter.

In Scenario A, we are restricted to one wave plate per qubit mode, but have the freedom to choose the retardance of each wave plate (Fig. 1 a). We use to optimize over the retardance of the wave plate.

In Scenario B, we consider two wave plates with which we can perform QST on a single qubit mode, however, the wave plates are optimized for a different optical frequency to the one used in the experiment (Fig. 3 a). We minimize to optimize over the orientation of the wave plates about the optical axis.

### iii.1 Optimization of wave plate retardance

A single wave plate in front of a PBS can generate a POVM consisting of two operators associated with the horizontal and vertical output modes of each PBS, given by where

 ^Ea(β,ϕ)= N^U†(β,ϕ)|a⟩⟨a|^U(β,ϕ), (42)

for . In fact, these operators correspond to the special case of a projective-value measure (PVM), where . The unitary operator associated with the wave plate,

 ^U(β,ϕ)=cos(β2)^σ0−isin(β2)→v(ϕ)⋅→σ, (43)

rotates the operators on the Bloch sphere by an angle , about the vector

 →v(ϕ)=cos(ϕ)→k+sin(ϕ)→i, (44)

where and are unit vectors in Euclidian space (defined by the axes in Fig. 1) and . As the wave plate is rotated about the optical axis, the projector (after appropriate normalization) traces out a figure-eight path on the Bloch sphere, as shown in Fig. 1. The retardance of the wave plate determines the size of the figure eight.

Although is not informationally complete, an informationally-complete set of operators can be constructed with different orientations of the wave plate about the optical axis, given by the angle , as long as the retardance is not equal to an integer multiple of . Here, the total number of projectors is twice the number of PVMs used to construct the complete tomographic protocol. This scenario was considered in the Fourier Transform Tomography (FTT) scheme introduced in (27).

The task is to identify the optimal retardance that minimizes the average EWV. We consider a tomographic protocol which consists of a set of PVMs whose elements are distributed along the figure eight with equally-spaced values of . From these operators, we calculate the coefficients . We also construct a measurement matrix according to Eq. (7) and calculate its singular values . For (i.e. six different values of the retardance which corresponds to six different PVMs), the measurement matrix is given by:

 M(12)=N2⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝1−√3(cosβ−1)4−sinβ2(cosβ+3)410−sinβcosβ1√3(cosβ−1)4−sinβ2(cosβ+3)41−√3(cosβ−1)4sinβ2(cosβ+3)410sinβcosβ1√3(cosβ−1)4sinβ2(cosβ+3)41√3(cosβ−1)4sinβ2−(cosβ+3)410sinβ−cosβ1−√3(cosβ−1)4sinβ2−(cosβ+3)41√3(cosβ−1)4−sinβ2−(cosβ+3)410−sinβ−cosβ1−√3(cosβ−1)4−sinβ2−(cosβ+3)4⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠ (45)

Using Mathematica (31), the singular values were found to be:

 μ1= N√N√2; (46a) μ2= N√N4√2√9+4cos(β)+3cos(2β); (46b) μ3= N√N2|sin(β)|; (46c) μ4= N√N2sin2(β2), (46d)

for . Knowing and , we can calculate the average EWV according to Eq. (29), which is given by

 ⟨EWV⟩=2NT(12+csc4(β2)+csc2(β)+89+4cos(β)+3cos(2β)), (47)

where is the total number of counts. Fig. 2 shows as a function of the retardance for 6 equally-spaced time bins for a single qubit (i.e. ). This function reaches a minimum of when . For comparison, a protocol that consists of three PVMs (see Appendix A)—where the measurement operators of each PVM are the eigenstates of the Pauli operators—gives , where . The four-outcome SIC-POVM also gives , where here .

### iii.2 Optimization of wave plate orientation

The second scenario we consider consists of optimization over wave plate orientation. Given two wave plates—a half-wave plate () and a quarter-wave plate ()—one is able to access the entire surface of the Bloch sphere. A wave plate’s retardance, however, is dependent on the wavelength of the input light; using the wrong wave plate results in a non-standard retardance. Here we consider the situation where the experimenter only has access to two identical wave plates, designed for a different optical wavelength. In such a scenario, access to the entire Bloch sphere may not be possible.

For two wave plates of retardance , the measurement operators associated with the horizontal and vertical output modes of each PBS are given by where

 ^Ea(β,ϕ1,ϕ2)=N^U†(β,ϕ2)^U†(β,ϕ1)|a⟩⟨a|^U(β,ϕ1)^U(β,ϕ2), (48)

for , where is defined in Eq. (43).

To construct an informationally-complete set of operators, we generate three PVMs from (48) using different realizations of and . We fix for the first PVM and optimize over and for the second and third PVMs. We follow the same procedure as above—constructing a measurement matrix to find its singular values as well as calculating — to compute .

For , optimization over and yields , which occurs at and for the second PVM, and and for the third PVM. The resulting six projectors are shown in Fig 3 b). Fig 4 shows the optimal for different ; showing that there exists a large retardance window that can lead to effectively optimal state reconstruction.

## Iv Concluding remarks

We have shown how one can use a simple figure of merit to optimize a tomographic protocol when practical constraints prohibit the implementation of known optimal protocols. In doing so, we considered statistical errors in the form of Poisson noise and assumed no systematic errors. One could in principle follow a similar procedure for noise models appropriate to other experimental scenarios, although whether such a simple figure of merit arrises is to be determined. Systematic errors could also be included by considering uncertainties in the measurement matrix, e.g. .

This procedure could be straightforwardly extended to quantum process tomography (32); (33); (34) where the measurement statistics relate linearly to the unknown density matrix parameters. Extensions to tomographic schemes that consider unknown parameters in both state and process parameters may also be possible (35); (36); (37); (38); (39). However, we anticipate the nonlinear relationship between measurement statistics and unknown parameters to provide additional challenges.

Our figure of merit is closely related to the trace distance, introduced by Scott (7). However, the trace distance is only applicable for signal-independent noise, while our figure of merit accounts for signal-dependent noise. Our figure of merit does not posses the same operational interpretation that might make other figures of merit—such as the fidelity (9)—more appealing, however it is vastly simpler to work with as it does not require computational averaging over input states.

We anticipate that the method introduced in this paper will help the design of tomographic experiments performed in laboratories with minimal available resources. Furthermore, our figure of merit can be used to compare the performance of different tomographic protocols, in complement to other figures of merit such as the fidelity.

## V Acknowledgements

We thank Daniel James for helpful discussions. We also thank Joe Altepeter, Hubert de Guise and Alessandro Fedrizzi for helpful comments on the manuscript. MM acknowledges support from NSERC (USRA) and The Fields Institute. AMB acknowledges support from DARPA (QuBE).

## Appendix A Six-measurement example for qubits

Consider a qubit encoded in the polarization of a single photon. One of the simplest measurements one can implement is a two-outcome POVM with elements

 ^E1= N|H⟩⟨H|;   ^E2=N|V⟩⟨V|. (49)

In fact, these operators correspond to the special case of a projective-value measure (PVM), where . This measurement is implemented by sending multiple copies of the state through a polarizing beam splitter (which splits the state into horizontal and vertical modes) and counting the number of photons in each output port.

However, this is not enough to reconstruct the quantum state because this PVM isn’t informationally complete. Two additional PVMs can be constructed by inserting a half-wave plate (HWP) followed by a quarter-wave plate (QWP) in front of the PBS. Setting the HWP and QWP at and to the optical axis, respectively, generates the following PVM:

 ^E3= N|D⟩⟨D|;   ^E4=N|A⟩⟨A|, (50)

where . Setting them to and gives

 ^E5= N|R⟩⟨R|;   ^E6=N|L⟩⟨L|, (51)

where . For all three PVMs, we have assumed equal data collections times and therefore the total number of counts is in all three cases.

Performing the experiment, we get a set of six numbers which correspond to the counts at both output ports of the PBS for each of the three PVMs. These can be written as

 n= (n1,n2,n3,n4,n5,n6)T. (52)

We are interested in finding the vector

 a= (S0,S1,S2,S3)T, (53)

which characterizes the state of our qubit.

The measurement matrix , which relates to via Eq. (4), is given by

 M=N2⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝1001100−111001−100101010−10⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠, (54)

where

 Mi,j=12Tr[^σj−1^Ei], (55)

and , , and . The pseudoinverse of is given by

 M+=1N⎛⎜ ⎜ ⎜ ⎜⎝131313131313001−10000001−11−10000⎞⎟ ⎟ ⎟ ⎟⎠, (56)

which can be used to determine the vector as follows:

 a=M+n=⎛⎜ ⎜ ⎜ ⎜⎝13∑6i=1nin3−n4n5−n6n1−n2⎞⎟ ⎟ ⎟ ⎟⎠. (57)

The vector then gives the density matrix according to

 ^ρ=124∑j=1aj^σj−1. (58)

## Appendix B Discussion of linear inversion and positive semi-definiteness

Use of linear inversion (LI) for state reconstruction yields coefficients that may not necessarily correspond to a physical state, i.e., the estimated density matrices may be unnormalized or negative definite.

We can denote the estimated coefficients obtained from this method by . We also denote coefficients corresponding to the actual state (A) of the system by .

The proposed figure of merit, the EWV, is the expectation value of . As may not correspond to a physical state, the experimenter could solve this problem by, say, looking for a physical state (Ph) that is closest to the one obtained by linear inversion; we denote this state by and characterize the closeness of states by the distance function . In this scenario, we are interested in the expectation value of , as a measure for the error. Since is the closest physical state to , we have . This, along with the triangle inequality, yields .

The error computed while taking the positive semi-definiteness of the density matrix into account would be bounded by a coefficient of the old figure of merit. Therefore, minimizing the proposed figure of merit, EWV, also minimizes an upper bound for the error that takes positive semi-definiteness into account.

We add that the geometry of physical states is especially complicated and there are many open problems—e.g. the existence of SIC-POVMs—associated with it. We therefore consider our approach justified in that it provides a simple procedure that leads to reasonable results.

## Appendix C Simplify expression for equally-weighted variance

In this section, we detail the steps between Eqs. (17) and (18). For reference, we repeat Eq. (17) here:

 EWV=d2∑i,k=1N∑j=1(M+ij)2Mjkak. (59)

We define a vector

 gk= d2∑i=1N∑j=1(M+ij)2Mjk. (60)

such that Eq. (59) becomes

 EWV=d2∑k=1gkak. (61)

We rewrite the term as

 gk≡ Tr[M+Y(k)(M+)T] (62)

where is an diagonal matrix whose diagonal elements are the elements of the th column of , i.e. . We perform a singular value decomposition (i.e. ), where and are and matrices recpectively, to give

 gk= Tr[(U+)T(S+)TV+(V+)TS+U+Y(k)] (63) = Tr[(S+)TS+X] (64)

where . Since is a diagonal matrix of inverse singular values of , i.e.