Measuring Measurement: Theory and Practice

Measuring Measurement: Theory and Practice

A. Feito ab1805@imperial.ac.uk    J. S. Lundeen    H. Coldenstrodt-Ronge    J. Eisert    M. B. Plenio    I. A. Walmsley Clarendon Laboratory, Oxford University, Parks Road, Oxford, OX1 3PU, UK Institute for Mathematical Sciences, Imperial College London, SW7 2PG, UK QOLS, The Blackett Laboratory, Imperial College London, Prince Consort Road, SW7 2BW, UK Physics and Astronomy, University of Potsdam, 14476 Potsdam, Germany
July 12, 2019
Abstract

Recent efforts have applied quantum tomography techniques to the calibration and characterization of complex quantum detectors using minimal assumptions. In this work we provide detail and insight concerning the formalism, the experimental and theoretical challenges and the scope of these tomographical tools. Our focus is on the detection of photons with avalanche photodiodes and photon number resolving detectors and our approach is to fully characterize the quantum operators describing these detectors with a minimal set of well specified assumptions. The formalism is completely general and can be applied to a wide range of detectors.

I Introduction

The quantum properties of nature reveal themselves only to carefully designed measurement techniques Resch (); Pryde (). In addition, most quantum information applications both computational and cryptographic, rely on a certain knowledge of the measurement apparatuses involved. Indeed for these protocols we often need to ensure that we associate detector outcomes with the correct quantum mechanical operation or quantum state. More critically, the assumption of a fully characterized detector completely underlies both quantum state tomography (QST) and quantum process tomography (QPT) PhysRevLett.70.1244 (); PhysRevLett.78.390 (); kwiat (). State tomography has become an important tool for characterizing states, partially due to the realization that non-classical states are a resource for performing tasks such as enhanced precision metrology, quantum communication, and quantum computation. Often taken for granted, measurement also plays a crucial role in these tasks and in some can even eliminate the need for entanglement. In enhanced precision metrology, appropriate measurements alone can give rise to super-resolution Resch () and Heisenberg-limited sensitivity Pryde (). In communication, measurement allows entanglement swapping, and thus, is central to quantum repeaters. And for computing, measurement based schemes enable quantum computation Raussendorf (); Raussendorf-Briegel-prl (). It follows that measurement should also be considered a resource for quantum protocols. In QST a given number of measurements on many copies of an unknown state reveal its density operator PhysRevA.40.2847 (); PhysRevLett.70.1244 (); PhysRevA.60.674 (). Characterising the operators that govern an evolution or a channel – QPT – amounts to applying the process to a set of input states, and subsequently fully characterising the output states chuang-1996 (); PhysRevLett.78.390 (); PhysRevLett.80.5465 (); kwiat (); lvovsky-2008 (). In this paper we study quantum detector tomography (QDT) PRLSoto (); dariano-2004-93 (); PhysRevA.64.024102 (); JMOTomo (); NaturePhysTomo (), in which a detector’s outcome statistics in response to a set of input states determine the operators that describe that detector. State and detector tomography evidently exhibit a dual role: Either the input is well-known and the detector is to be characterised, or the detector is well-known and the state is to be tomographically reconstructed.

Throughout the paper, we focus on examples from optics. However, quantum detector tomography is a general concept, applicable to any quantum detector. Building upon previous theoretical descriptions PRLSoto (); dariano-2004-93 (); PhysRevA.64.024102 (); JMOTomo () we will introduce the concept of detector tomography in the next section. Alongside, we will present examples detailing the reconstruction of simple detectors to introduce the key concepts. Subsequently, based on recent experiments NaturePhysTomo () we will present the methods used in optical detector tomography and the convex optimization methods Convex () which allow an efficient and simple numerical optimization. Finally, in the last section, we will discuss some of the theoretical and experimental challenges involved and how to address them.

Ii Detector tomography

ii.1 Definitions

In quantum mechanics, the operator describing a measurement apparatus is, in its most general form, a positive operator valued measure (POVM). The POVM elements describe the possible outcomes, labeled here by . Particularly, for a projective measurement, the POVM elements are orthogonal and simplify to the familiar form

In quantum optics, an example of a POVM that consists of projectors is that of an eight-port homodyne, .

Now, given an input state , the probability of obtaining output is

(1)

It follows that the POVM elements must be positive semi-definite, , while observing

(2)

if we want probabilities adding up to one. Inverting Eq. (1) to extract subject to the aforementioned conditions is the task behind detector tomography.

ii.2 Assumptions

Detector tomography raises fundamental questions about the kind of information we can extract from Nature. It is reasonable to think that state tomography performed with poorly characterized detectors can lead to unwanted errors. In addition, for quantum cryptography, a mischaracterization of states or detectors may lead to channels through which an eavesdropper may attack invalidating certain security proofs (see for ex. Makarov ()). Indeed, if we misjudge the noise level, then we misjudge the information the eavesdropper possesses. Interestingly, quantum key distribution with totally uncharacterized detectors is possible in principle, based entirely on correlations violating Bell’s inequalities, but at the expense of having much smaller rates Masanes1 (); Masanes2 (). An in depth discussion of our assumptions is therefore necessary to avoid unwanted errors in our detector estimation.

Generally, there are going to be assumptions about the input states produced by the source. In the reconstruction, we will often need to assume we can truncate an infinite dimensional Hilbert space, such as in the case of photon number. On the detector side, a common assumption will be that the detector is memory-less: the previous measurements do not modify the result of future measurements. For example, this assumption fails when detector deadtimes are longer than the time between consecutive measurements. All of these assumptions need to be tested.

More generally, we can ask what the working minimal set of assumptions happens to be. An assumption-free tomography could use a complete black box approach: prepare a collection of unknown states, measure them and try to draw some conclusions about both the detector and the states. Specifically, we could have some classical controls to prepare quantum states characterized by the index and some classical pointer to indicate the possible outcomes, labeled . Minimizing the set of assumptions would constrain us to draw our conclusions exclusively from the joint probability distribution .

To further constrain the problem we can add the standard assumptions: An underlying Hilbert space of fixed dimension, normalized positive density matrices and positive POVM elements satisfying . However, without further assumptions, the relationship between and ’s parameters is completely unknown, as is that between and ’s parameters.

This discussion highlights the inherent difficulties that a fully general inference (or tomographic) scheme entails. Reasonable assumptions are thus needed but the question of general tomography remains an interesting one to be explored. In this direction, some progress has been made in self-testing maps. In this context states are prepared with classical recipes and families of unitary gates are revealed with few assumptions about the quantum states (however known measurements in the computational basis are required) self-testing ().

Figure 1: It is generally possible to divide an experiment into preparation, evolution, and measurement. However, if one corner in the above diagram is unknown or missing, then we need assumptions about the other two.

As shown diagramatically in Fig.  1 we can divide any experiment into preparation, evolution and measurement. If one of the elements of the triad is unknown or missing then we need to previously characterize the other two making assumptions in the process.

ii.3 Practical detector tomography

In state tomography, one must perform a set of measurements spanning the space of the density operator to be reconstructed. If the state is defined on a -dimensional Hilbert space, then it will be fully specified by real parameters (respecting the constraint of unit trace). To fully characterize a quantum detector, we need the data obtained from measuring input states from a well-characterized source. To recover all the POVM elements from the measured statistics the probe states or input states must also be chosen to form a set that is tomographically complete: the span of the operators – which are not necessarily linearly independent – must be the entire space from which is taken. A spanning set forming an operator basis will have at least elements for a outcome detector. In principle this is sufficient to calculate the direct inversion of Eq. (1). However experimental detector tomography carries additional requirements. Clearly the probe states should be previously characterised, and large numbers of them should be easily and reliably generated. In the case of optical detectors, coherent states are ideal candidates since a laser can generate them directly and we can create a tomographically complete set by transforming their amplitude through attenuation (for example with a beam splitter) and a phase-shifter (an optical path delay). Using input states one can then reconstruct the -function of the detector PRLSoto () which is simply proportional to the measured statistics,

(3)

Since of each POVM contains the same information as the element itself, predictions of the detection probabilities for arbitrary input states can then be calculated directly from the Q-function.

ii.4 Simple example: the perfect photocounter

Consider as a simple example the case of a perfect photon number detector This projective measurement is characterized by its POVM elements, . In the simplest of scenarios, using pure number states, , the characterization would be trivial, since the statistics

would immediately characterize our detector. From a more practical perspective, pure number states are very hard to generate, especially for high photon numbers. Using coherent states is therefore a more realistic approach. Assuming a collection of perfect coherent states our statistics would then become

Of course in an experiment we would not know the POVM elements in advance, and we would need to express them with a generic expression such as

suffering only the constraint . Fortunately, these operators can be simplified: interposing a phase shifter in the coherent beam’s path we can check if the statistics are independent of the phase. If they are we can infer the phase independence of the POVM. Our operators then become and the statistics can be expressed as

(4)

[0pt]

Figure 2: Outcome probability distributions for an -outcome detector. Each curve represents the probability of that outcome (zero clicks, one click, etc) happening vs. the value of the intensity of the coherent state arriving at the detector.

For a perfect photon number detector that can discriminate up to eight photons, the outcome probability distributions would be the one shown in Fig. 2.

We now consider how one would estimate the POVM elements from these probability distributions, which form a set of simple simulated data. Our goal is to invert Eq. (4). To do so we can write a matrix version of the equation. Given the set of coherent states , and truncating the number states at a sufficiently large , we can write

(5)

Here, we have taken

So the matrix has entries

entries , and entries . For an -outcome detector, this gives the matrix dimensions of , , and .

Now, to obtain , we can simply solve the convex optimization problem

min
subject to (6)

This is a convex problem, because the norm , defined as for matrix is a convex function and the positivity constraint is semi-definite. The result of a single such minimization is shown in Fig. 3, where we recover the expected POVM elements

(7)

It is remarkable that even introducing some statistical noise in the simulated data the results are just as perfect. Indeed, if instead of using we use, where represent a % random noise, then the results are just as robust as without. This will be discussed in more detail in later sections as it relates to the technical noise of the laser. Let us then move on to more realistic detectors and see the problems that loss and finite photon resolution introduce.

[0pt]

Figure 3: Reconstructed POVM of a theoretical photon number detector with bins. The POVM elements, , are the result of the optimization in Eq. (II.4), and show a perfect result even though a % random noise was added to the data (changing the value of the coherent state amplitudes).

Iii Why detector tomography?

In spite of its first sucessful applications, one could question the need for detector tomography. After all, detectors have been calibrated in the past without tomographic techniques. However, as quantum technology makes striking advances, quantum detectors are becoming more complex and, thus, susceptible to imperfections that are not incorporated in the bottom-up approach of modelling. In contrast, with tomography one can design detectors with a top-down aproach by fully characterizing the final detector operation. For example, photo-detection has seen the advent of single-carbon-nanotube detectors nanotube-detector (), charge integration photon detectors (CIPD) CIDP (), Visible Light Photon Counters (VLPC) VLPC (), quantum dot arrays shields:3673 (), superconducting edge and picosecond sensors miller:791 (); super-pico () or time multiplexing detectors based on commercial Si-APDs loopy (); loopy0 (). Certainly understanding and modelling in full detail the noise, loss and coherence characteristics of these technologies is far from trivial. Detector tomography is an answer to those challenges and, additionally, will allow us to benchmark competing detectors. Such characterized detectors also allow for the preparation of non-Gaussian states in a certified manner, such as photon subtracted states. Hence, such detectors are readily useable in protocols such as entanglement distillation based on non-Gaussian states Dis1 (); Dis (); Dis2 (), in schemes increasing entanglement by means of photon subtraction In (), enhancing the teleportation fidelity Tel1 (); Tel2 (), or in other applications of non-Gaussian states Loophole (); Met (), specifically in the context of quantum metrology.

Let us discuss more precisely how detector tomography can provide an advantage with respect to traditional calibration methods.

iii.1 The limits of calibration

Standard calibration methods require a model of the detector. The parameters of this model are then estimated experimentally using states and standard assumptions. However, for specific applications, this models can become very complicated with a daunting number of parameters. For example, a detector as simple as a Yes/No avalanche photo diode (APD) becomes very hard to model when all noise sources are studied migdall1 (); migdall2 (). Tomography sidesteps this by enabling the operational detector to be measured directly. Another advantage can be to avoid errors or pitfalls from standard calibration. An example is the use of Bell-state detectors which can appear to be working while frequency correlations in the input states obscure the results Broken-Bell-State-Detector (); Ian_royal_soc (). To avoid such pitfalls detectors are often used in the rages where their behaviour is easily calibrated. Detector tomography could extend the range of applicability of existing detectors and help design more complicated ones. Let us now look at other advantages of full detector tomography.

iii.2 Quantitative entanglement verification

Once a detector is fully characterized, it can be used to characterize states in a certified fashion. In this respect, a detector is still useful if it is imperfect in the sense that its POVM elements are not just unit rank projections: One can use them in an estimation problem, or in a setting that unambiguously estimates properties of a state. This is one of the key applications of detector tomography, in that imperfect devices can be used to very reliably perform estimation. A specifically important example is the direct detection of entanglement: Once detectors are characterized, one can perform measurements and then find lower bounds to entanglement measures, based on these measurements. This is an idea presented in Ref. Auden () (see also Refs. Quant (); GuehneShort (); GuehneLong (); Wunder-Plenio ()). Here we discuss the specific issues related to infinite-dimensional systems and photon counters.

Assume that we have performed more than one type of measurement, labeled by , for which we have completely characterized the POVM elements to great accuracy. Using two such devices one can make local measurements on each part of a bipartite state. With the data from

one can then ask for the minimal degree of entanglement consistent with it. This approach does not make any assumptions on the preparation of states, and works even for detectors with a low detection efficiency; This will already be incorporated in the stated bound. The unambiguously minimal degree of entanglement, say in terms of the negativity, is then given by the solution of the optimization problem Auden ():

min
subject to

One easily finds lower bounds to the optimal solution to this problem by considering

such that . Then a lower bound is readily given by Quant ()

The optimal lower bound, based on such an approach, is in turn the solution of the convex optimization problem Convex () in and given by

max
subject to

Some conditions need to be examined to make sure that the solution to this formulation (dual optimal) coincides with the solution of the original formulation (primal optimal). For linear programs or for semi-definite problems (SDP) as the one described these conditions are easily satisfied Convex (). As we will see this type of optimisation method is also useful for detector tomography itself.

Here, if one just makes use of POVM elements with a finite support, as one has in photon counting with an additional phase reference, then such bounds will provide very strong lower bounds. However, it will not provide good bounds for unbounded operators such as in homodyning. Hence, without having detectors of high efficiency, and without assumptions on the preparation of the entangled state, one can use a characterized detector to certify entanglement in quantitative terms.

Iv Modelling photodetectors

As we have seen, the aim of detector tomography is to identify the physical POVM closest to the experimental data with minimal assumptions on the functioning of the detectors. To compare this method with a more traditional calibration method let us study a photodetector modelling example. We will do so for an avalanche photodiode and a photon number resolving detector able to detect up to photons loopy (). In the next section we will compare these models to the experimental results derived without any model.

iv.1 Optical photon number detectors

An important detector in quantum optics is the single-photon counting module based on a silicon avalanche photodiode (APD). It has two detection outcomes, either registering an electronic pulse (-click) or not (-clicks). A loss-free perfect version of it would implement the Kraus operators

(8)

distinguishing between the presence or absence of photons. However some photons are absorbed without triggering a pulse. This loss can be modelled placing a BS in front of the perfect detector Ulf-loss (). The POVMs describing a detector with a BS of transmittivity can then be written as,

no click (9)
click (10)

disregarding after-pulsing or dark counts APD-Hendrik (). Having only two outcomes, this detector cannot distinguish the number of photons present.

A more advanced detector called time multiplexing detector (TMD) does have certain photon-number resolution. It splits the incoming pulse into many temporal bins, making unlikely the presence of more than one photon per bin. All the time bins are then detected with two APDs. Summing the number of -click outcomes from all the bins one can then estimate the probability of having detected a number of incoming photons. This detector is not commercially available but one has been constructed by the Ultrafast Group in Oxford loopy (). It has eight bins in total (four time bins in each of two output fibres) and thus nine outcomes – from zero to eight clicks.

The theoretical description of this detector is a bit more involved since there is what we call the “binning problem”. Indeed, in addition to loss there is a certain probability that all photons will end up in a single time bin, or more generally that incoming photons will result in less than clicks. To account for the details describing these probabilities we use a recursive relation martin-private (); loopy-achilles (). Our goal is to describe the following probability distribution:

: Probability of having -clicks given that there were incident photons and that the detector has -bins (or modes). Let us start with the simplest possible TMD which would consist of an input port, a beam splitter (with reflectivity and transmittivity and ) and two YES/NO detectors. This detector is shown in Fig. 4 and has two bins.

[0pt]

Figure 4: Diagram of a simplified -bin multiplexing detector.

For this simple example, the probability distribution we are after is . We will show how to calculate , and then how to go from to . For the two bin case from Fig. 4 consider a BS with transmittivity and reflectivity . In that case:

  • (if no photons are present we will only register zero clicks).

  • (with probability , photons end up in the lower bin and the same holds for the upper bin with . The probability of a single click is the sum of these independent probabilities).

  • (if then only two events may happen: one click or two. This complementary event has therefore (Probability of click).)

In the case of a -bin detector shown in Fig. 5, incoming photons are distributed to two -bin detectors according to a binomial distribution.

[0pt]

Figure 5: Diagram of a simplified bin multiplexing detector. The first beam splitter distributes photons according to a binomial distribution between the two -bin loopies of the second stage.

Now let us evaluate the probability for the upper -bin detector to register counts if photons entered while registering clicks in the lower 2-bin detector if entered the lower port. This should be weighted by the probability that photons enter the upper branch and the lower one, which is

Now the probability that counts are found overall is found summing the weighted probability over all possible ways that the detectors can find counts (i.e., ) and summing over all possible ways of distributing photons:

[0pt]

Figure 6: Diagram of a -bin multiplexing detector. The first beam splitter distributes photons according to a binomial distribution between the two next -bin stages.

We can extend the same argument to . Imagine we know . Now, for that detector to become a -bin detector all we need is to couple two of them to a beam splitter which will distribute the photons as described above and as shown in Fig. 6. In that same fashion we can then define the recursive relation

(11)

This of course can be generalized to accommodate a different BS reflectivity at each node (adding an index to and to account for its position). Based on this recursion and once we determine all and , we can write a simple and efficient program to generate the corresponding theoretical POVMs. For example, a -outcome detector would have a POVM which can be captured in the following matrix:

where , and . For example, the -click event has a POVM element,

etc. More generally, the measured statistics are related to the incoming photons by

where is the probability of detecting counts and the probability that photons arrived to the TMD loopy0 (). The Matrix introduced above then relates probabilities and density matrices through: .

iv.2 Detector loss

TMD detectors have various sources of loss, meaning that photons are absorbed before triggering a detection event. The major sources of loss are the coupling to the fibres, the absorption and scattering in the delay fibres and the non-unit efficiency of the detectors achilles-2006-97 (). A full description of the effect of losses is certainly complex, since loss occurs at many stages of the detector. One can model loss simply as a beamsplitter diverting photons out of an input state. However, one would have to include a BS before the detector (fibre coupling), a BS at each stage of fibre and a BS in front of each APD, altering Eq. (11) accordingly. Instead we give an effective description modeling loss with a single BS in front of the detector. The matrix capturing the losses has entries that are given by

being the binomial distribution accounting for loss since for all . Now combining both descriptions, we can decouple the loss from the binning, putting a BS coupled to the environment before the -bin TMD resulting in

This relationship expresses how the incoming photons experience loss and then are distributed among the available modes. The model of the TMD described up to this point would be the one needed without detector tomography. One could for example try to measure independently the transmittivities of the inner BS, reconstruct the convolution (or binning) matrix and try to estimate the overall loss to include it in the matrix . This would help us build a model of the TMD sketched in Fig. 7. By contrast, using detector tomography, we do not need to know anything about bins, beam splitters, inner detectors or specific loss mechanisms. Moreover, anything left out of our detector model (e.g. dark counts, afterpulsing, etc.) would be included in a tomographic characterization.

[0pt]

Figure 7: Diagram of a simplified 8 bin time multiplexing detector (TMD). The spirals represent a delay in the optical fibre.

V Experimental reconstruction

We now turn to the description of the experimental realisation. As mentioned earlier, coherent states are ideal probe states with which to characterize optical detectors. This holds true for any optical detector, including polarization detectors, frequency detectors (e.g. spectrometers), and even detectors that discriminate inherently quantum states (e.g. a photonic Bell-state detector). In most of these cases, we are only interested in the detector’s action at a particular input photon number (usually ). Still, we can reconstruct the detector POVM in the full photon number basis and then restrict our attention to a particular subspace. It is interesting that an optical state which can be fully described in a classical theory of electromagnetism can be used to characterize uniquely quantum detectors. To characterize a completely unknown detector (i.e. a black box) one would need to vary all the available parameters of the probe set of coherent states: spatial-temporal mode, polarization, phase, and amplitude (ensuring a tomographically complete set of states is constructed). However, given that frequency, time, position, momentum, and amplitude have infinite range, this is impossible in practice. Consequently, one is required to make realistic assumptions about the range of operation and sensitivity of any unknown detector. One might additionally neglect some of these optical parameters if one is only interested in a particular aspect of the full characterization.

The subjects of our detector tomography, the APD and TMD, possess the unique features of single-photon sensitivity and photon-number resolution, respectively. To characterize these features we vary only the amplitude and phase of probe coherent states, while fixing the spatial-temporal mode. In particular, the input wavepackets have a time extent shorter than the time window of the electronics, and the center wavelength is within range of the detectors. Light is coupled to both types of detector through single-mode optical fiber, eliminating the possibility of any variation in the position or momentum mode of the light. Critically, for the detectors to perform as intended and in order to ensure the detectors are memoryless, the wavepackets must be preceded and followed by time intervals in which there is no input light. The APD is known to have a deadtime of roughly ns; a detection that occurs before the input wavepacket arrives at the detector will make the detector insensitive to the probe coherent state. The TMD splits the incoming wavepacket into time bins spread over ns. The inverse of these two timescales then sets an upper limit on the rate at which we can send probe states to the detectors. We further limit the rate to ensure the detectors do not heat up, which would change their properties over time. These time variant features of the detectors could be illuminated with detector tomography but this would be unnecessarily complicated, as they are quickly evident without use of the full tomography procedure.

When operated with a gain high above their lasing threshold, lasers produce light well approximated by a coherent state. Coherent states (and statistical mixtures thereof) are unique amongst pure optical states in that, at a beamsplitter, the transmitted and reflected states are unentangled. Consequently, through attenuation we can vary the amplitude of a coherent state without changing its nature. In homodyne state tomography, one must use a balanced detection technique to negate technical noise in the laser. In contrast, by attenuating the laser light to the single-photon level in our detector tomography scheme, the resulting coherent state possesses an inherent amplitude uncertainty that renders the technical noise insignificant.

[0pt]

Figure 8: Experimental setup: The amplitude of the probe coherent state is attenuated with a half-waveplate () and a Glan-Thompson polarizer (P). The light is then further attenuated by Frequency independent filters or Neutral Density Filters (NDF) and coupled into a fibre (FC)

We use a cavity dumper (APE Cavity Dumper Kit) to reduce the repetition rate of a pulsed mode-locked Ti:sapph laser to . Long term drift of the laser power over 1 million pulses was . Our laser randomly varies in energy between subsequent pulses with a standard deviation of of . We vary the amplitude of our probe coherent states by rotating their polarization with a half-waveplate () in front of a Glan-Thompson polarizer (P) as shown in Fig. 8. We attenuate the coherent states by reflecting them from a beamsplitter (BS) (=95%) and three neutral density filters (NDF) (i.e. spectrally flat attenuators). Note that for the BS was measured with a relative deviation smaller than . Along with some loss upon coupling into a single-mode fiber, we collect these attenuations together in an overall attenuation factor . We test for any variation in as a function of (e.g. which might be caused by rotating the waveplate if it had a wedge) and find that the variation is less than .

There are, as of yet, no direct techniques to calibrate the power of light at the single-photon level Worsley09 (). In fact, there are no laboratory methods to make an absolute calibration of power at any intensity. Instead, a chain of photo-detectors are calibrated relative to each other. At the beginning of the chain is an absolute calibration system held at national standards institutes. In the case of our power meter (Coherent FieldMaxII-TO), the National Institute for Standards and Technology (NIST) uses a cryogenic bolometer as its absolute standard. This chain results in a 5% systematic uncertainty in our measurements of the laser power (averaged over 0.2s), measured at the transmitted port of the beamsplitter. The magnitude of for the probe state was found via . For each value of we recorded the number of times each detection outcome occurred in trials (i.e. laser pulses), which provides an estimate of . The phase of was allowed to drift freely, during which no change in the was observed. Consequently, we did not actively vary the phase of our probe states.

The 5% uncertainty in is the dominant error in our experiment. For a detector with over 95% efficiency, this error could result in estimates of that are incompatible with any physical detector. For example, this could result in more detector clicks on average than there were photons in the probe state on average. Gain in the detector could achieve this, but at the same time would necessarily introduce noise that would change the distribution of . For detectors with lower efficiency, the systematic error in will simply add or subtract from efficiency of the detector that results from the tomographic characterization.

We choose a center wavelength and a FWHM bandwidth of that are appropriate for each detector. In the case of the APD detector (a Perkin Elmer SPCM-AQR-13-FC) we set nm, nm, and chose the appropriate rate kHz, , and . For the TMD detector we set nm, nm, kHz, , and .

Since has an infinite range, one must set an upper limit on the magnitude of used in our set of probe states. A natural place for this is at the at which the detector behavior saturates, i.e. stays constant as a function of . Since this occurs asymptotically, a somewhat arbitrary degree of constancy must be chosen; we set as our upper limit. We expected the saturation behavior of the TMD would be whereas we found that it was and Already, the measured statistics (which are proportional to ) give a clear signature that our theoretical model of the detector is too simplistic. Moreover, as we increase the beyond our upper limit begins to decrease. The cause for this behavior is a break down of our memoryless detector assumption and highlights how crucial it is to create probe states in the desired spatial-temporal mode. Along with the probe laser pulse, the cavity dumper also out-couples a small fraction of the preceding and following pulses in the Ti:Sapph cavity. These are separated in time from our probe pulse by ns and each contain only the energy of the probe pulse. Consequently, these extra pulses will have an insignificant effect on for most of the range of . However, at , the preceding pulse will be a coherent state with . Consequently, roughly of the time bins in the TMD will be preceded by a photon. If the detector has, for instance, a efficiency then of the time bins will be preceded by a detection event that will not be counted as click by the electronics (due to their time window). More importantly, those bins will subsequently be unavailable to detect photons in our probe pulse, due to the ns deadtime of the APDs inside the TMD. Thus, as we observe, roughly of the bins will not result in a click. This behavior was extraneous to the normal operation of the detector and so we limit to in the tomographic reconstruction. However, it exemplifies the usefulness of even the basic detector tomography procedure, which results in approximate , for rough evaluation of the detector action. Some of these hypothesis or details could be further explored if we knew well some detectors or some states. For example the response of BS and neutral density filters to single photons could be explored (granted good single photons and reliable single photon detectors). The time independence of the POVMs could also be studied with well known states. However, we will see that the excellent fit of the data to the Q-function and of the reconstruction to the model suggests a sufficiently good understanding.

v.1 Results

We now turn to the tomographic reconstruction. To characterize our detector we have measured the outcome probability distributions resulting from sending a tomographically complete set of input states (or probe states). The use of pure coherent states as probe states implies that the probability distribution is proportional to the Q-function, as seen in Eq. (3). In principle the knowledge of the Q-function is then sufficient to predict the measurement probabilities for any incoming state since determine completely.

Figure 9: This plot presents the probability distributions corresponding to two detectors. One with negative “POVM” elements and anotherone with positive ones. Both POVMs observe .

However, a more useful and natural representation for photodetectors is the POVM expanded in the Fock basis. Another argument to find the POVM elements is that due to statistical noise, the Q-function could correspond to a non-physical POVM. Indeed, if we simply make a fit to the noisy measured probability distribution, this fit could correspond to negative “POVMs”. As an example consider Fig. 9 where two Q-functions are displayed together: Even though they are very similar to each other, one corresponds to a detector with a non-positive “POVM” element (and thus negative probabilities) and one corresponds to a physical one. Our goal is therefore to reconstruct the POVM operators which most closely match the data and still observe Quantum Mechanics (and thus are positive). .

Since we adopt a “black box” approach we need not assume any of the properties described in the previous theoretical models. Only the accessible parts of the “black box”  i.e., number of outcomes and control (or lack of control) of the phase will determine the description of our detector. The lack of a phase reference simplifies the experimental procedure, allowing us to solely control the magnitude of (as has been done for tomography of a single photon Lvovsky ()). A detector with no observed phase dependence will be described by POVM elements diagonal in the number basis,

simplifying hence the reconstruction of .

Figure 10: The measured probabilities for different “number-of-clicks” are shown (red dots) as a function of . The plot shows the statistics for the time multiplexed detector (TMD) with time-bins. The statistical error vertically is too small to be seen and the jitter of was estimated to be % of its value. An additional % systematic error in the calibration of the power meter is present but can be absorbed as loss. From the reconstructed POVM elements we generate the corresponding probability distributions (blue curves). These are generated for pure or mixed and for reconstructed with the filter function or without it. For all these options, the probability distributions (blue lines) are so similar that they are indistinguishable on this scale.

Again we can express the relationship between the statistics and our diagonal as,

if we measure different values of , and truncate the number states at a sufficiently large . For an -outcome detector, the matrices will have dimensions , , and . In addition

can easily be rewritten when the input state is a mixed state. This was done indeed to account for the laser’s technical noise (as we will see in the next section) but gave similar results. For such a detector, the physical POVM consistent with the data can be estimated through the following optimisation problem:

(12)

where the -norm ensures it is a convex quadratic problem. Note that we also allow for convex quadratic filter functions which will be discussed in some detail later. These are related to the conditioning of the problem and must not depend on the type of detector. For example, no symmetry or knowledge of the typical POVM structures in photo-detection can be assumed. If any, only general regularization functions that work for any POVM should be chosen. Now, for suitable filter functions (i.e. cuadratic) the whole problem is a convex quadratic optimisation problem, and hence also a semi-definite problem (SDP) which can be efficiently solved numerically Convex (). Moreover, in this case, there exists a dual optimisation problem whose solution coincides with the original problem. Thus, the dual problem provides a certificate of optimality since it provides a lower bound to the primal problem.

Figure 11: The measured probabilities for the “click” and “no-click” envents in the Avalanche Photodiode (APD) are shown as a function of . The statistical error vertically is too small to be seen and the jitter of was estimated to be % of its value. An additional % systematic error in the calibration of the power meter is present but can be absorbed as loss. From the reconstructed POVM elements we generate the corresponding probability distributions (blue curves). These are generated for pure or mixed and for reconstructed with the filter function or without it. For all these options, the probability distributions (blue lines) are so similar that they are indistinguishable on this scale.

Care has to be taken that the optimisation problem is well conditioned in order to find the true POVM of the detector. In finding their number basis representation we are deconvolving a coherent state from our statistics which is intrinsically badly conditioned due to the importance of the wings of the Gaussian. Similar issues of conditioning have been discussed in the context of state and process tomography, see, e.g., Refs. UnstableBoulant (); UnstableHradil (). Due to a large ratio between the largest and smallest singular values of the matrices defining the quadratic problem, small fluctuations in the probability distribution can result in large variations for the reconstructed POVM. This can result in operators that approximate really well the outcome statistics and yet do not exhibit a smooth distribution in photon-number. We will discuss how to treat this problem in the next section.

[0pt]

Figure 12: Reconstructed POVMs for (a) the photon-number resolving TMD and (b) the APD “yes/no” detector. TMD POVMs were obtained up to element (therefore ), but are shown up to for display purposes. APD POVMs are shown in full. Stacked on top of each where is the number of clicks we show in yellow. “rec” stands for reconstructed and (theo) is the theoretical POVM expected from (a) a TMD modelled with beam splitters of measured reflectivities and 52.1% overall loss (b) a theoretical APD with % loss respectively. Note that this result was obtained with a regularized optimisation as explained in next section .

The measured probabilities for each outcome as a function of are displayed in Fig. 11 and Fig. 10. The probability distributions (equivalent modulo to the Q-function of the detector) show smooth profiles and distinct photon number ranges of sensitivity for increasing number of clicks in the detector. Fig. 12 shows the POVMs that result from the optimisation in Eq. (V.1) which we will discuss later. A first remarkable property is that being the POVM for clicks, shows zero amplitude for detecting less than photons. That is, the detector shows essentially no dark counts. It should be noted that this was not assumed at the outset and is purely the result of the optimization. This sharp feature gives the detector its discriminatory power where clicks means at least photons in the input pulse.

To assess the performance of our method we compare it to the model described in the previous section. This time however, the BS used in the model are not 50/50 but its reflectivities () were measured experimentally. This was done measuring the reflected and transmitted beams of a laser with a calibrated power meter. The yellow bars in Fig. 12, show the absolute value of the difference between the theoretical and the reconstructed POVM elements. The magnitude

is shown stacked on top of each coefficient of the POVM elements where (theo) stands for theoretical and (rec) for reconstructed. The small yellow bars reveal a good agreement with the model. We also calculate a form of fidelity finding that

holds for all . Note that to calculate we normalized the POVM elements. This overlap indicates an excellent agreement between the two.

In addition, one can reconstruct a probability distribution: from the found POVMs to fit the data. The reconstruction is plotted as dark blue bars in Fig. 10 and Fig. 11. It is the equivalent of the Q-function had our probe states with suitable complex been strictly pure. In fact, although formally distinct, the probability distribution associated with the reconstructed POVM using mixed or pure states are practically indistinguishable and are plotted together in Fig. 10 for comparison.

v.2 Detector Wigner functions

An alternative representation of the detectors which can give us more insight about their structure comes from the quasi-probability distributions such as the Wigner Function UlfBook (); Schleich-book (). Since the POVM elements are self adjoint positive-semi-definite operators, a Wigner function can be calculated in the standard way from the POVM element :

(13)

where we have, as usual, now identified as phase space coordinates of a single mode with . However, since the POVMs do not have unit trace, this detector Wigner function will not be normalized,

(14)

We should note that the marginals cannot be interpreted as probability distributions but we can still use to calculate probabilities according to

(15)

Figure 13: Wigner function of the first POVM elements of the TMD. Since the detectors have no phase reference, their Wigner functions are rotationally symmetric with respect to their center and a cut contains all the information. The dotted blue curve represents the Wigner function of the reconstructed POVMs from to clicks. In red we can see the theoretical Wigner function for: (a) a theoretical TMD with 52% loss. (b) a theoretical TMD without loss. Paying attention to the scale we observe how dramatic the effect of loss is at damping the ripples in the Wigner function. It is also worth noting that the end ripples in (a) for the “ click” are just an edge effect due to number state cutoff.

Since none of the detectors have phase sensitivity their Wigner functions are rotationally symmetric around the origin. In Fig. 13 we display a cut of the TMD Wigner function for the following POVM elements:

Higher clicks are not displayed because their amplitude is too small to be compared with the rest. The interesting feature about the plot is the comparison with the theoretical TMD Wigner functions one can generate with the model. Indeed, comparing a theoretical loss-less TMD with the measured one we see how the amplitude of the Wigner function decreases rapidly for higher photon numbers. On the other hand, comparison with the lossy theoretical model reveals a good agreement.

Vi Ill Conditioning and regularisation

One of the main problems encountered in the tomographic characterisation of the detectors has to do with the numerical stability of the reconstruction. Such problems are common in tomography UnstableBoulant (); UnstableHradil (). Consider for example the transformations involved in the inverse Radon transform and their inherent instabilities. Note also how going from the Q-function to the P-function is not always well defined Schleich-book (). Multiple tools exist to bridge the link between homodyne tomography and the density matrix description lvovsky-pattern (). One of them involves the use of pattern functions dariano-pattern (); leonardt-pattern (); wunsche-1997 (). That is, finding some functions such that

However, finding the appropriate involves the irregular wave functions (particular unbounded solutions of the Schrödinger equation) and proving them to be appropriate is typically as hard as estimating the error pattern-fun (). The use of maximum likelihood has also been explored and particularly for detector tomography dariano-2004-93 (); PhysRevA.64.024102 (). However, the speed of the convergence is not generally guaranteed to be high, becoming exponential for certain problems. Our approach, following the spirit of maximum-likelihood, translates the problem into a quadratic optimisation one allowing for efficient semi-definite programming (SDP) (cf. Eq. (V.1)). We discuss here the details, approximations and filters that lead to our solution of the problem.

vi.1 Truncating the Hilbert space

The data was measured up to but was truncated at lower values in phase space. This was done to avoid noisy behavior and the emergence of new regimes in the behavior of the detector. Memory effects requiring a larger POVM space were thus avoided as discussed in the experimental section. Notably, the effects related to the detector’s dead time, after-pulsing or the dark counts from possible over heating were avoided staying in a low () photon number regime.

vi.2 Pure vs. mixed

The Q-function of our detector (directly measured) is proportional to

(16)

From Eq. (5) and Eq. (II.4), for a diagonal POVM, we can write the problem as

Figure 14: POVM reconstruction, using only minimisation from Eq. (17). Dark blue: , lighter blue, , etc.
(17)

with the usual constraints and . Using a semi-definite solver such as Yalmip, the obtained POVMs shows irregular dips and a structure quite dissimilar from what a TMD is expected to do.

Figure 15: POVM reconstruction, using only minimisation from Eq. (17). Dark blue: , lighter blue, , etc, displayed up to number layer .

The Fig. 14 shows a typical result, and Fig. 15 shows it for higher photon numbers revealing an even more irregular structure.

vi.2.1 Describing the laser’s amplitude uncertainty

A first meaningful observation is that some level of uncertainty existed in the amplitude of the coherent states. If values of were measured then the real

might actually have been

with some vector of errors . To address the effect of this uncertainty on our minimisation we can artificially introduce noise and then average over many runs of the optimisation. In other words, since in Eq. (17) depends on the measured values of , we can substitute with where are independent and identically distributed random variables. Using we run the optimisation and obtain a family of estimated POVM elements (each element of the family corresponds to a run of the optimisation with a different ) As a first approximation we may use a Gaussian probability distribution with zero mean and . Note that % was the measured variance of the laser amplitude from pulse to pulse as shown in Fig. 18. Subsequently we average over the POVMs obtained with different “jitters”  in realizations, obtaining

iterations of the optimisation with subsequent averaging improves the appearance of the POVMs but barely solves the “dips” observed. Fig. 16 and Fig. 17 are an example of this approach, showing that this kind of averaging does not properly counteract the fluctuations in the reconstructed POVM.

Figure 16: POVM reconstruction, using direct averaging (150 runs with 1% Gaussian noise).
Figure 17: POVM reconstruction, using direct averaging (300 runs with % Gaussian noise).

vi.2.2 Using mixed input states

A key obsevation showing that the previous approach is not the appropriate treatment of uncertainty in is that each probe state would be best described by a mixture of coherent states,

(18)
(19)

Here, would be some distribution centered around in phase space, leading to a mixed Gaussian state in case of a Gaussian classical probability distribution. We can integrate this state over the complex phase since we have no phase reference available and focus solely on the amplitude of the coherent states or mixtures thereof. Measurements reveal that the intensity of the laser varies from pulse to pulse following a distribution that looks like a Lorentzian with a tail (see Fig. 18).

Figure 18: Measurement of the laser’s amplitude variations from pulse to pulse.

A good approximation can however be made using a Gaussian distribution, leading to a Gaussian state, with standard deviation , implying

with . The detection probability for outcome is then

(20)

To simplify these calculations we can write a distribution in ,

(21)

with

In this case has been chosen such that the approximation holds. These subtleties however do barely alter our results and POVMs are as irregular as previously.

To evaluate the difference introduced by the pure () or mixed state () approach we have studied their influence on the reconstructed POVMs. In the regularised optimisation (i.e., for our final results), we have compared the POVMs obtained with each description finding that

(22)

and the largest relative difference between any two coming from a mixed state or a pure state derivation was . Furthermore, the reconstructed probability distributions are so close that they are indistinguishable on the scale of Fig. 10. This reinforces our earlier expectation that technical noise in the laser will be negligible when using single-photon-level coherent states. This differs from homodyne tomography where technical noise can shift a strong local oscillator to a nearly orthogonal state.

However, since the problem of the irregular POVMs is not solved by the mixed state description we need to look further into the origin of these irregularities. One first remarkable (but expected) property is that large variations in the photon number degree of freedom of the POVMs result in minuscule differences in the probability distributions (see Fig. 12). Since one convolutes the photon number distribution with a Gaussian in to obtain the Q-function this behavior has been expected. Conversely this means that small errors or statistical fluctuations in the Q-function can result in large errors in the POVM elements. Consider for example that if instead of

we try to minimise

the SDP solver finds no sensible solution. This is because using the Moore-Penrose pseudo-inverse we find due to its inherent ill conditioning, meaning that the ratio of largest and smallest singular values in is large.

Various methods exist to try and regularise these problems. Whatever the chosen method it should assume as little knowledge as possible about the specific form of the sought POVM. Since has very small values for high photon numbers one could enhance those values while preserving the minimisation target. For example we could run the optimisation

where is a diagonal matrix aimed at regularising the problem. This can be shown to introduce some improvement but does not solve the “dip problem” completely. It is also hard to find the exact form of that yields “good” results without any prior knowledge about the expected POVMs. In addition (roughly speaking) it is hard to find a balance between having good results for low photon numbers and for high photon numbers.

Figure 19: Minimisation using damping method on Eq. (23). Note that the point of view is opposite that of the previous plots. We see some dips around the -th and -th number layer.

Another approach is to introduce a sort of damping or specific penalisation. For example one could define a diagonal matrix such that

and use it to redistribute the weight of each POVM element, avoiding unreasonably large POVM element amplitudes (that compensate for low values in ). The optimisation could be recast as,

(23)

A result of this can be seen on Fig. 19. This method has the same shortcomings as the previous one: it is sensitive to the choice of parameters and the exact form of is hard to determine without detailed prior assumptions.

A more reasonable method is to capture the relative smoothness of the POVM from a lossy detector. This method is also called smoothing regularisation Convex (). In this case one single assumption needs to be made. The POVMs should exhibit a certain degree of “smoothness”.

vi.3 Smooth or not?

Let us first define what we mean by smooth. Smooth will mean in this context that the difference is small for all and . In the optimisation context we will mean that our minimisation is defined as follows:

(24)

with

for some fixed value of y. The smoothing function will be independent of the detector, and will mildly penalize non-smooth POVM elements. This approach is further substantiated by the observation that the resulting POVMs are largely independent of the weight that is given to the smoothness penalty.

As most quantum detectors, especially those disucussed here are lossy, this is a particularly plausible feature. Indeed, if an optical detector has a POVM element with non-zero amplitude in , then if it is lossy, it will have a positive amplitude in , , decreasing with but different from zero. In fact, in general, if the detector has a finite efficiency which can be modelled with a BS, it will impose some smoothness on the distribution . That is because if is the probability of registering photons and is the probability that were present, then the loss process will impose kenny-thesis ():

Consequently, if , then , etc. cannot be zero, but will have some relatively smooth distribution. This simple physical argument makes a certain smoothness plausible (but still should allow sharp transitions for ).

For this detector (and for any photodiode based detector) assuming loss is reasonable and can make the “smoothness” requirement appropriate. Let us however see if, without looking at the specific shape of our POVM, we can find an optimal smoothing coefficient and justify further the use of the smoothing regularisation.

Figure 20: Illustration of the sensitivity to noise for two different minimisation methods ( corresponds to no regularisation, and to an approach using a smoothing regularisation). For each value of and we have run the optimisation times and displayed the results here to illustrate this variation.
Figure 21: Illustration of how sensitive the optimisation is to the specific choice of . This plot shows the relative error with respect to the POVM elements obtained using , as a function of . In red, and only for reference (since it does not change with ), the value of the relative error for (no smoothing) is shown. We vary in the range . It is remarkable that a % variation in results in only a % variation. For the relative error is less than % in .

One way to test this method is to quantify how resilient it is to noise in the data. To do so we introduce additional noise in to the measured data. For example, we can alter in where is again a vector of random variables distributed with a Gaussian distribution with zero mean. This simulates a statistical uncertainty in the measurement of the coherent state. To see its effect on the reconstruction we use the figure of merit . This quantity should evaluate how POVMs differ from the one without noise. It is seen that the additional smoothing penalty makes the optimisation more robust, largely independent of the value of (we can multiply by a and stay in the same regime). Using this smoothing regularisation with noisy data seems therefore a good choice.

We have seen how smoothing makes the optimization more robust against noise but we should also ask how sensitive this optimisation is to the exact choice of . To do so we may use the following procedure: compare the POVM obtained using with that obtained varying over orders of magnitude. In Fig. 21 we plot the relative error . Remarkably doubling the value of results in an overall relative error in the POVM of less than %. Multiplying (or dividing) by 10 gives a variation below % and -fold variation results in a % variation. If we compare how this differs from the case which is % different then we can conclude that the optimisation is quite insensitive to the exact choice of the smoothing parameter . The following table provides some values for reference.

variation relative error
0.0001 x/1000 27.3%
0.001 x/100 12.2%
0.01 x/10 4%
0.05 x/2 1%
0.5 x 5 3%
1 x 10 5%
Table 1: This table illustrates how sensitive the optimisation is to the particular choice of (the reference smoothing strenght is )
Figure 22: Illustration of how too much smoothing can fail to capture the sharp variations of a POVM. We define as the matrix containing the POVM elements of the theoretical POVMs. From them we generate a probability distribution and reconstruct the POVMs with the smoothing regularised optimisation. The dotted lines represent for different values of and for a variety of POVMs (see following plots). The horizontal lines represent that same difference for and are plotted for reference.

vi.4 Sharp and smooth

There is of course a limit to how much we can penalize non-smooth POVMs. Is it possible for the smoothing regularisation to wash out all the sharp features of the POVM, thus smoothing in excess? This of course is a legitimate question that further restricts the reasonable range for . To study that effect we analyse four cases:

  1. A theoretical loss-less TMD, based on the model described in Eq. (11).

  2. A lossy TMD, based on the above with added loss from an BS.

  3. A perfect photon number detector, that is with .

  4. An artificial POVM with sharp variations, containing the POVM elements:

    and observing

To study the smoothing we generate the POVM elements numerically, build a probability distribution and retrieve the using the optimisation from Eq. (24) for an increasing range of . Then we compare these results with the theoretical POVMs we defined in order to generate the PD. All optimisations are done using the mixed-state approach from Eq. (21). Broadly speaking, we find two distinct behaviors: POVMs with terms that decay slowly in photon number need regularisation and are quite insensitive to the precise . For sharp POVMs (without loss) the range preserves their shape quite well, but further smoothing hides their true shape. These properties are further illustrated in the figures that follow.

vi.4.1 Lossy TMD

Figure 23: Smoothing evolution for a lossy TMD detector (loss=52%). We show as an example the evolution of the POVM element as we increase the amount of smoothing (in ). The yellow bars display