Measuring Measurement: Theory and Practice
Abstract
Recent efforts have applied quantum tomography techniques to the calibration and characterization of complex quantum detectors using minimal assumptions. In this work we provide detail and insight concerning the formalism, the experimental and theoretical challenges and the scope of these tomographical tools. Our focus is on the detection of photons with avalanche photodiodes and photon number resolving detectors and our approach is to fully characterize the quantum operators describing these detectors with a minimal set of well specified assumptions. The formalism is completely general and can be applied to a wide range of detectors.
I Introduction
The quantum properties of nature reveal themselves only to carefully designed
measurement techniques Resch (); Pryde (). In addition, most quantum
information applications both computational and cryptographic, rely on a
certain knowledge of the measurement apparatuses involved. Indeed for these
protocols we often need to ensure that we associate detector outcomes with the
correct quantum mechanical operation or quantum state. More critically,
the assumption of a fully characterized detector completely underlies both
quantum state tomography (QST) and quantum process tomography (QPT)
PhysRevLett.70.1244 (); PhysRevLett.78.390 (); kwiat (). State tomography has
become an important tool for characterizing states, partially due to the
realization that nonclassical states are a resource for performing tasks such
as enhanced precision metrology, quantum communication, and quantum
computation. Often taken for granted, measurement also plays a crucial role in
these tasks and in some can even eliminate the need for entanglement. In
enhanced precision metrology, appropriate measurements alone can give rise to
superresolution Resch () and Heisenberglimited sensitivity Pryde ().
In communication, measurement allows entanglement swapping, and thus, is
central to quantum repeaters. And for computing, measurement based schemes
enable quantum computation Raussendorf (); RaussendorfBriegelprl (). It
follows that measurement should also be considered a resource for quantum
protocols. In QST a given number of measurements on many copies of an unknown
state reveal its density operator PhysRevA.40.2847 (); PhysRevLett.70.1244 (); PhysRevA.60.674 (). Characterising the operators that govern an evolution or a
channel – QPT – amounts to applying the process to a set of input states,
and subsequently fully characterising the output states chuang1996 (); PhysRevLett.78.390 (); PhysRevLett.80.5465 (); kwiat (); lvovsky2008 (). In this
paper we study quantum detector tomography (QDT) PRLSoto (); dariano200493 (); PhysRevA.64.024102 (); JMOTomo (); NaturePhysTomo (), in which a
detector’s outcome statistics in response to a set of input states determine
the operators that describe that detector. State and detector tomography
evidently exhibit a dual role: Either the input is wellknown and the detector
is to be characterised, or the detector is wellknown and the state is to be
tomographically reconstructed.
Throughout the paper, we focus on examples from optics. However, quantum detector tomography is a general concept, applicable to any quantum detector. Building upon previous theoretical descriptions PRLSoto (); dariano200493 (); PhysRevA.64.024102 (); JMOTomo () we will introduce the concept of detector tomography in the next section. Alongside, we will present examples detailing the reconstruction of simple detectors to introduce the key concepts. Subsequently, based on recent experiments NaturePhysTomo () we will present the methods used in optical detector tomography and the convex optimization methods Convex () which allow an efficient and simple numerical optimization. Finally, in the last section, we will discuss some of the theoretical and experimental challenges involved and how to address them.
Ii Detector tomography
ii.1 Definitions
In quantum mechanics, the operator describing a measurement apparatus is, in its most general form, a positive operator valued measure (POVM). The POVM elements describe the possible outcomes, labeled here by . Particularly, for a projective measurement, the POVM elements are orthogonal and simplify to the familiar form
In quantum optics, an example of a POVM that consists of projectors is that of an eightport homodyne, .
Now, given an input state , the probability of obtaining output is
(1) 
It follows that the POVM elements must be positive semidefinite, , while observing
(2) 
if we want probabilities adding up to one. Inverting Eq. (1) to extract subject to the aforementioned conditions is the task behind detector tomography.
ii.2 Assumptions
Detector tomography raises fundamental questions about the kind of information we can extract from Nature. It is reasonable to think that state tomography performed with poorly characterized detectors can lead to unwanted errors. In addition, for quantum cryptography, a mischaracterization of states or detectors may lead to channels through which an eavesdropper may attack invalidating certain security proofs (see for ex. Makarov ()). Indeed, if we misjudge the noise level, then we misjudge the information the eavesdropper possesses. Interestingly, quantum key distribution with totally uncharacterized detectors is possible in principle, based entirely on correlations violating Bell’s inequalities, but at the expense of having much smaller rates Masanes1 (); Masanes2 (). An in depth discussion of our assumptions is therefore necessary to avoid unwanted errors in our detector estimation.
Generally, there are going to be assumptions about the input states produced by the source. In the reconstruction, we will often need to assume we can truncate an infinite dimensional Hilbert space, such as in the case of photon number. On the detector side, a common assumption will be that the detector is memoryless: the previous measurements do not modify the result of future measurements. For example, this assumption fails when detector deadtimes are longer than the time between consecutive measurements. All of these assumptions need to be tested.
More generally, we can ask what the working minimal set
of assumptions happens to be. An assumptionfree tomography could use a
complete black box approach: prepare a collection of unknown states, measure
them and try to draw some conclusions about both the detector and the states.
Specifically, we could have some classical controls to prepare quantum states
characterized by the index and some classical
pointer to indicate the possible outcomes, labeled . Minimizing the set of
assumptions would constrain us to draw our conclusions exclusively from the
joint probability distribution .
To
further constrain the problem we can add the standard assumptions: An
underlying Hilbert space of fixed dimension, normalized positive density
matrices and positive POVM elements satisfying . However, without further assumptions, the relationship
between and ’s parameters is completely unknown, as
is that between and ’s parameters.
This discussion highlights the inherent difficulties that a fully general inference (or tomographic) scheme entails. Reasonable assumptions are thus needed but the question of general tomography remains an interesting one to be explored. In this direction, some progress has been made in selftesting maps. In this context states are prepared with classical recipes and families of unitary gates are revealed with few assumptions about the quantum states (however known measurements in the computational basis are required) selftesting ().
As shown diagramatically in Fig. 1 we can divide any experiment into preparation, evolution and measurement. If one of the elements of the triad is unknown or missing then we need to previously characterize the other two making assumptions in the process.
ii.3 Practical detector tomography
In state tomography, one must perform a set of measurements spanning the space of the density operator to be reconstructed. If the state is defined on a dimensional Hilbert space, then it will be fully specified by real parameters (respecting the constraint of unit trace). To fully characterize a quantum detector, we need the data obtained from measuring input states from a wellcharacterized source. To recover all the POVM elements from the measured statistics the probe states or input states must also be chosen to form a set that is tomographically complete: the span of the operators – which are not necessarily linearly independent – must be the entire space from which is taken. A spanning set forming an operator basis will have at least elements for a outcome detector. In principle this is sufficient to calculate the direct inversion of Eq. (1). However experimental detector tomography carries additional requirements. Clearly the probe states should be previously characterised, and large numbers of them should be easily and reliably generated. In the case of optical detectors, coherent states are ideal candidates since a laser can generate them directly and we can create a tomographically complete set by transforming their amplitude through attenuation (for example with a beam splitter) and a phaseshifter (an optical path delay). Using input states one can then reconstruct the function of the detector PRLSoto () which is simply proportional to the measured statistics,
(3) 
Since of each POVM contains the same information as the element itself, predictions of the detection probabilities for arbitrary input states can then be calculated directly from the Qfunction.
ii.4 Simple example: the perfect photocounter
Consider as a simple example the case of a perfect photon number detector This projective measurement is characterized by its POVM elements, . In the simplest of scenarios, using pure number states, , the characterization would be trivial, since the statistics
would immediately characterize our detector. From a more practical perspective, pure number states are very hard to generate, especially for high photon numbers. Using coherent states is therefore a more realistic approach. Assuming a collection of perfect coherent states our statistics would then become
Of course in an experiment we would not know the POVM elements in advance, and we would need to express them with a generic expression such as
suffering only the constraint . Fortunately, these operators can be simplified: interposing a phase shifter in the coherent beam’s path we can check if the statistics are independent of the phase. If they are we can infer the phase independence of the POVM. Our operators then become and the statistics can be expressed as
(4) 
For a perfect photon number detector that can discriminate up to eight photons, the outcome probability distributions would be the one shown in Fig. 2.
We now consider how one would estimate the POVM elements from these probability distributions, which form a set of simple simulated data. Our goal is to invert Eq. (4). To do so we can write a matrix version of the equation. Given the set of coherent states , and truncating the number states at a sufficiently large , we can write
(5) 
Here, we have taken
So the matrix has entries
entries , and entries . For an outcome detector, this gives the matrix dimensions of , , and .
Now, to obtain , we can simply solve the convex optimization problem
min  
subject to  (6) 
This is a convex problem, because the norm , defined as for matrix is a convex function and the positivity constraint is semidefinite. The result of a single such minimization is shown in Fig. 3, where we recover the expected POVM elements
(7) 
It is remarkable that even introducing some statistical noise in the simulated data the results are just as perfect. Indeed, if instead of using we use, where represent a % random noise, then the results are just as robust as without. This will be discussed in more detail in later sections as it relates to the technical noise of the laser. Let us then move on to more realistic detectors and see the problems that loss and finite photon resolution introduce.
Iii Why detector tomography?
In spite of its first sucessful applications, one could question the need for detector tomography. After all, detectors have been calibrated in the past without tomographic techniques. However, as quantum technology makes striking advances, quantum detectors are becoming more complex and, thus, susceptible to imperfections that are not incorporated in the bottomup approach of modelling. In contrast, with tomography one can design detectors with a topdown aproach by fully characterizing the final detector operation. For example, photodetection has seen the advent of singlecarbonnanotube detectors nanotubedetector (), charge integration photon detectors (CIPD) CIDP (), Visible Light Photon Counters (VLPC) VLPC (), quantum dot arrays shields:3673 (), superconducting edge and picosecond sensors miller:791 (); superpico () or time multiplexing detectors based on commercial SiAPDs loopy (); loopy0 (). Certainly understanding and modelling in full detail the noise, loss and coherence characteristics of these technologies is far from trivial. Detector tomography is an answer to those challenges and, additionally, will allow us to benchmark competing detectors. Such characterized detectors also allow for the preparation of nonGaussian states in a certified manner, such as photon subtracted states. Hence, such detectors are readily useable in protocols such as entanglement distillation based on nonGaussian states Dis1 (); Dis (); Dis2 (), in schemes increasing entanglement by means of photon subtraction In (), enhancing the teleportation fidelity Tel1 (); Tel2 (), or in other applications of nonGaussian states Loophole (); Met (), specifically in the context of quantum metrology.
Let us discuss more precisely how detector tomography can provide an advantage with respect to traditional calibration methods.
iii.1 The limits of calibration
Standard calibration methods require a model of the detector. The parameters of this model are then estimated experimentally using states and standard assumptions. However, for specific applications, this models can become very complicated with a daunting number of parameters. For example, a detector as simple as a Yes/No avalanche photo diode (APD) becomes very hard to model when all noise sources are studied migdall1 (); migdall2 (). Tomography sidesteps this by enabling the operational detector to be measured directly. Another advantage can be to avoid errors or pitfalls from standard calibration. An example is the use of Bellstate detectors which can appear to be working while frequency correlations in the input states obscure the results BrokenBellStateDetector (); Ian_royal_soc (). To avoid such pitfalls detectors are often used in the rages where their behaviour is easily calibrated. Detector tomography could extend the range of applicability of existing detectors and help design more complicated ones. Let us now look at other advantages of full detector tomography.
iii.2 Quantitative entanglement verification
Once a detector is fully characterized, it can be used to characterize states in a certified fashion. In this respect, a detector is still useful if it is imperfect in the sense that its POVM elements are not just unit rank projections: One can use them in an estimation problem, or in a setting that unambiguously estimates properties of a state. This is one of the key applications of detector tomography, in that imperfect devices can be used to very reliably perform estimation. A specifically important example is the direct detection of entanglement: Once detectors are characterized, one can perform measurements and then find lower bounds to entanglement measures, based on these measurements. This is an idea presented in Ref. Auden () (see also Refs. Quant (); GuehneShort (); GuehneLong (); WunderPlenio ()). Here we discuss the specific issues related to infinitedimensional systems and photon counters.
Assume that we have performed more than one type of measurement, labeled by , for which we have completely characterized the POVM elements to great accuracy. Using two such devices one can make local measurements on each part of a bipartite state. With the data from
one can then ask for the minimal degree of entanglement consistent with it. This approach does not make any assumptions on the preparation of states, and works even for detectors with a low detection efficiency; This will already be incorporated in the stated bound. The unambiguously minimal degree of entanglement, say in terms of the negativity, is then given by the solution of the optimization problem Auden ():
min  
subject to  
One easily finds lower bounds to the optimal solution to this problem by considering
such that . Then a lower bound is readily given by Quant ()
The optimal lower bound, based on such an approach, is in turn the solution of the convex optimization problem Convex () in and given by
max  
subject to 
Some conditions need to be examined to make sure that the solution to this formulation (dual optimal) coincides with the solution of the original formulation (primal optimal). For linear programs or for semidefinite problems (SDP) as the one described these conditions are easily satisfied Convex (). As we will see this type of optimisation method is also useful for detector tomography itself.
Here, if one just makes use of POVM elements with a finite support, as one has in photon counting with an additional phase reference, then such bounds will provide very strong lower bounds. However, it will not provide good bounds for unbounded operators such as in homodyning. Hence, without having detectors of high efficiency, and without assumptions on the preparation of the entangled state, one can use a characterized detector to certify entanglement in quantitative terms.
Iv Modelling photodetectors
As we have seen, the aim of detector tomography is to identify the physical POVM closest to the experimental data with minimal assumptions on the functioning of the detectors. To compare this method with a more traditional calibration method let us study a photodetector modelling example. We will do so for an avalanche photodiode and a photon number resolving detector able to detect up to photons loopy (). In the next section we will compare these models to the experimental results derived without any model.
iv.1 Optical photon number detectors
An important detector in quantum optics is the singlephoton counting module based on a silicon avalanche photodiode (APD). It has two detection outcomes, either registering an electronic pulse (click) or not (clicks). A lossfree perfect version of it would implement the Kraus operators
(8) 
distinguishing between the presence or absence of photons. However some photons are absorbed without triggering a pulse. This loss can be modelled placing a BS in front of the perfect detector Ulfloss (). The POVMs describing a detector with a BS of transmittivity can then be written as,
no click  (9)  
click  (10) 
disregarding afterpulsing or dark counts APDHendrik (). Having only two outcomes, this detector cannot distinguish the number of photons present.
A more advanced detector called time multiplexing
detector (TMD) does have certain photonnumber resolution. It splits
the incoming pulse into many temporal bins, making unlikely the
presence of more than one photon per bin. All the time bins are then
detected with two APDs. Summing the number of click outcomes from
all the bins one can then estimate the probability of having detected
a number of incoming photons. This detector is not commercially
available but one has been constructed by the Ultrafast Group in Oxford
loopy (). It has eight bins in total (four time bins in each of
two output fibres) and thus nine outcomes – from zero to eight
clicks.
The theoretical description of this detector is a bit more involved since there is what we call the “binning problem”. Indeed, in addition to loss there is a certain probability that all photons will end up in a single time bin, or more generally that incoming photons will result in less than clicks. To account for the details describing these probabilities we use a recursive relation martinprivate (); loopyachilles (). Our goal is to describe the following probability distribution:
: Probability of having clicks given that there were incident photons and that the detector has bins (or modes). Let us start with the simplest possible TMD which would consist of an input port, a beam splitter (with reflectivity and transmittivity and ) and two YES/NO detectors. This detector is shown in Fig. 4 and has two bins.
For this simple example, the probability distribution we are after is . We will show how to calculate , and then how to go from to . For the two bin case from Fig. 4 consider a BS with transmittivity and reflectivity . In that case:

(if no photons are present we will only register zero clicks).

(with probability , photons end up in the lower bin and the same holds for the upper bin with . The probability of a single click is the sum of these independent probabilities).

(if then only two events may happen: one click or two. This complementary event has therefore (Probability of click).)
In the case of a bin detector shown in Fig. 5, incoming photons are distributed to two bin detectors according to a binomial distribution.
Now let us evaluate the probability for the upper bin detector to register counts if photons entered while registering clicks in the lower 2bin detector if entered the lower port. This should be weighted by the probability that photons enter the upper branch and the lower one, which is
Now the probability that counts are found overall is found summing the weighted probability over all possible ways that the detectors can find counts (i.e., ) and summing over all possible ways of distributing photons:
We can extend the same argument to . Imagine we know . Now, for that detector to become a bin detector all we need is to couple two of them to a beam splitter which will distribute the photons as described above and as shown in Fig. 6. In that same fashion we can then define the recursive relation
(11) 
This of course can be generalized to accommodate a different BS reflectivity at each node (adding an index to and to account for its position). Based on this recursion and once we determine all and , we can write a simple and efficient program to generate the corresponding theoretical POVMs. For example, a outcome detector would have a POVM which can be captured in the following matrix:
where , and . For example, the click event has a POVM element,
etc. More generally, the measured statistics are related to the incoming photons by
where is the probability of detecting counts and the probability that photons arrived to the TMD loopy0 (). The Matrix introduced above then relates probabilities and density matrices through: .
iv.2 Detector loss
TMD detectors have various sources of loss, meaning that photons are absorbed before triggering a detection event. The major sources of loss are the coupling to the fibres, the absorption and scattering in the delay fibres and the nonunit efficiency of the detectors achilles200697 (). A full description of the effect of losses is certainly complex, since loss occurs at many stages of the detector. One can model loss simply as a beamsplitter diverting photons out of an input state. However, one would have to include a BS before the detector (fibre coupling), a BS at each stage of fibre and a BS in front of each APD, altering Eq. (11) accordingly. Instead we give an effective description modeling loss with a single BS in front of the detector. The matrix capturing the losses has entries that are given by
being the binomial distribution accounting for loss since for all . Now combining both descriptions, we can decouple the loss from the binning, putting a BS coupled to the environment before the bin TMD resulting in
This relationship expresses how the incoming photons experience loss and then are distributed among the available modes. The model of the TMD described up to this point would be the one needed without detector tomography. One could for example try to measure independently the transmittivities of the inner BS, reconstruct the convolution (or binning) matrix and try to estimate the overall loss to include it in the matrix . This would help us build a model of the TMD sketched in Fig. 7. By contrast, using detector tomography, we do not need to know anything about bins, beam splitters, inner detectors or specific loss mechanisms. Moreover, anything left out of our detector model (e.g. dark counts, afterpulsing, etc.) would be included in a tomographic characterization.
V Experimental reconstruction
We now turn to the description of the experimental realisation. As mentioned earlier, coherent states are ideal probe states with which to characterize optical detectors. This holds true for any optical detector, including polarization detectors, frequency detectors (e.g. spectrometers), and even detectors that discriminate inherently quantum states (e.g. a photonic Bellstate detector). In most of these cases, we are only interested in the detector’s action at a particular input photon number (usually ). Still, we can reconstruct the detector POVM in the full photon number basis and then restrict our attention to a particular subspace. It is interesting that an optical state which can be fully described in a classical theory of electromagnetism can be used to characterize uniquely quantum detectors. To characterize a completely unknown detector (i.e. a black box) one would need to vary all the available parameters of the probe set of coherent states: spatialtemporal mode, polarization, phase, and amplitude (ensuring a tomographically complete set of states is constructed). However, given that frequency, time, position, momentum, and amplitude have infinite range, this is impossible in practice. Consequently, one is required to make realistic assumptions about the range of operation and sensitivity of any unknown detector. One might additionally neglect some of these optical parameters if one is only interested in a particular aspect of the full characterization.
The subjects of our detector tomography, the APD and TMD, possess the unique
features of singlephoton sensitivity and photonnumber resolution,
respectively. To characterize these features we vary only the amplitude and
phase of probe coherent states, while fixing the spatialtemporal mode. In
particular, the input wavepackets have a time extent shorter than the time
window of the electronics, and the center wavelength is within range of the
detectors. Light is coupled to both types of detector through singlemode
optical fiber, eliminating the possibility of any variation in the position or
momentum mode of the light. Critically, for the detectors to perform as
intended and in order to ensure the detectors are memoryless, the wavepackets
must be preceded and followed by time intervals in which there is no input
light. The APD is known to have a deadtime of roughly ns; a detection
that occurs before the input wavepacket arrives at the detector will make the
detector insensitive to the probe coherent state. The TMD splits the incoming
wavepacket into time bins spread over ns. The inverse of these two
timescales then sets an upper limit on the rate at which we can send probe
states to the detectors. We further limit the rate to ensure the detectors do
not heat up, which would change their properties over time. These time variant
features of the detectors could be illuminated with detector tomography but
this would be unnecessarily complicated, as they are quickly evident without
use of the full tomography procedure.
When operated with a
gain high above their lasing threshold, lasers produce light well approximated by a coherent
state. Coherent states (and statistical mixtures thereof) are unique amongst pure
optical states in that, at a beamsplitter, the transmitted and reflected
states are unentangled. Consequently, through attenuation we can vary the
amplitude of a coherent state without changing its nature. In homodyne state
tomography, one must use a balanced detection technique to negate technical
noise in the laser. In contrast, by attenuating the laser light to the
singlephoton level in our detector tomography scheme, the resulting coherent
state possesses an inherent amplitude uncertainty that renders the technical
noise insignificant.
We use a cavity dumper (APE Cavity Dumper Kit) to reduce the repetition rate of a pulsed modelocked Ti:sapph laser to . Long term drift of the laser power over 1 million pulses was . Our laser randomly varies in energy between subsequent pulses with a standard deviation of of . We vary the amplitude of our probe coherent states by rotating their polarization with a halfwaveplate () in front of a GlanThompson polarizer (P) as shown in Fig. 8. We attenuate the coherent states by reflecting them from a beamsplitter (BS) (=95%) and three neutral density filters (NDF) (i.e. spectrally flat attenuators). Note that for the BS was measured with a relative deviation smaller than . Along with some loss upon coupling into a singlemode fiber, we collect these attenuations together in an overall attenuation factor . We test for any variation in as a function of (e.g. which might be caused by rotating the waveplate if it had a wedge) and find that the variation is less than .
There are, as of yet, no direct techniques to calibrate the power of light at the singlephoton level Worsley09 (). In fact, there are no laboratory methods to make an absolute calibration of power at any intensity. Instead, a chain of photodetectors are calibrated relative to each other. At the beginning of the chain is an absolute calibration system held at national standards institutes. In the case of our power meter (Coherent FieldMaxIITO), the National Institute for Standards and Technology (NIST) uses a cryogenic bolometer as its absolute standard. This chain results in a 5% systematic uncertainty in our measurements of the laser power (averaged over 0.2s), measured at the transmitted port of the beamsplitter. The magnitude of for the probe state was found via . For each value of we recorded the number of times each detection outcome occurred in trials (i.e. laser pulses), which provides an estimate of . The phase of was allowed to drift freely, during which no change in the was observed. Consequently, we did not actively vary the phase of our probe states.
The 5% uncertainty in is the dominant error in our experiment. For a detector with over 95% efficiency, this error could result in estimates of that are incompatible with any physical detector. For example, this could result in more detector clicks on average than there were photons in the probe state on average. Gain in the detector could achieve this, but at the same time would necessarily introduce noise that would change the distribution of . For detectors with lower efficiency, the systematic error in will simply add or subtract from efficiency of the detector that results from the tomographic characterization.
We choose a center wavelength and a FWHM bandwidth of that are appropriate for each detector. In the case of the APD detector (a Perkin Elmer SPCMAQR13FC) we set nm, nm, and chose the appropriate rate kHz, , and . For the TMD detector we set nm, nm, kHz, , and .
Since has an infinite range, one must set an upper limit on the magnitude of used in our set of probe states. A natural place for this is at the at which the detector behavior saturates, i.e. stays constant as a function of . Since this occurs asymptotically, a somewhat arbitrary degree of constancy must be chosen; we set as our upper limit. We expected the saturation behavior of the TMD would be whereas we found that it was and Already, the measured statistics (which are proportional to ) give a clear signature that our theoretical model of the detector is too simplistic. Moreover, as we increase the beyond our upper limit begins to decrease. The cause for this behavior is a break down of our memoryless detector assumption and highlights how crucial it is to create probe states in the desired spatialtemporal mode. Along with the probe laser pulse, the cavity dumper also outcouples a small fraction of the preceding and following pulses in the Ti:Sapph cavity. These are separated in time from our probe pulse by ns and each contain only the energy of the probe pulse. Consequently, these extra pulses will have an insignificant effect on for most of the range of . However, at , the preceding pulse will be a coherent state with . Consequently, roughly of the time bins in the TMD will be preceded by a photon. If the detector has, for instance, a efficiency then of the time bins will be preceded by a detection event that will not be counted as click by the electronics (due to their time window). More importantly, those bins will subsequently be unavailable to detect photons in our probe pulse, due to the ns deadtime of the APDs inside the TMD. Thus, as we observe, roughly of the bins will not result in a click. This behavior was extraneous to the normal operation of the detector and so we limit to in the tomographic reconstruction. However, it exemplifies the usefulness of even the basic detector tomography procedure, which results in approximate , for rough evaluation of the detector action. Some of these hypothesis or details could be further explored if we knew well some detectors or some states. For example the response of BS and neutral density filters to single photons could be explored (granted good single photons and reliable single photon detectors). The time independence of the POVMs could also be studied with well known states. However, we will see that the excellent fit of the data to the Qfunction and of the reconstruction to the model suggests a sufficiently good understanding.
v.1 Results
We now turn to the tomographic reconstruction. To characterize our detector we have measured the outcome probability distributions resulting from sending a tomographically complete set of input states (or probe states). The use of pure coherent states as probe states implies that the probability distribution is proportional to the Qfunction, as seen in Eq. (3). In principle the knowledge of the Qfunction is then sufficient to predict the measurement probabilities for any incoming state since determine completely.
However, a more useful and natural representation for
photodetectors is the POVM expanded in the Fock basis. Another argument to
find the POVM elements is that due to statistical noise, the Qfunction could
correspond to a nonphysical POVM. Indeed, if we simply make a fit to the
noisy measured probability distribution, this fit could correspond to negative
“POVMs”. As an example consider
Fig. 9 where two Qfunctions are displayed together:
Even though they are very similar to each other,
one corresponds to a detector with a nonpositive “POVM” element (and thus negative probabilities) and one
corresponds to a physical one.
Our goal is therefore to reconstruct the
POVM operators which most closely match the data and still observe
Quantum Mechanics (and thus are positive).
.
Since we adopt a “black box” approach we
need not assume any of the properties described in the previous theoretical
models. Only the accessible parts of the “black
box” i.e., number of
outcomes and control (or lack of control) of the phase
will determine the description of our detector.
The lack of a phase reference simplifies the
experimental procedure, allowing us to solely control the magnitude of
(as has been done for tomography of a single photon Lvovsky ()).
A detector with no observed phase dependence will be described by POVM
elements diagonal in the number basis,
simplifying hence the reconstruction of .
Again we can express the relationship between the statistics and our diagonal as,
if we measure different values of , and truncate the number states at a sufficiently large . For an outcome detector, the matrices will have dimensions , , and . In addition
can easily be rewritten when the input state is a mixed state. This was done indeed to account for the laser’s technical noise (as we will see in the next section) but gave similar results. For such a detector, the physical POVM consistent with the data can be estimated through the following optimisation problem:
(12) 
where the norm ensures it is a convex quadratic problem. Note that we also allow for convex quadratic filter functions which will be discussed in some detail later. These are related to the conditioning of the problem and must not depend on the type of detector. For example, no symmetry or knowledge of the typical POVM structures in photodetection can be assumed. If any, only general regularization functions that work for any POVM should be chosen. Now, for suitable filter functions (i.e. cuadratic) the whole problem is a convex quadratic optimisation problem, and hence also a semidefinite problem (SDP) which can be efficiently solved numerically Convex (). Moreover, in this case, there exists a dual optimisation problem whose solution coincides with the original problem. Thus, the dual problem provides a certificate of optimality since it provides a lower bound to the primal problem.
Care has to be taken that the optimisation problem is well conditioned in order to find the true POVM of the detector. In finding their number basis representation we are deconvolving a coherent state from our statistics which is intrinsically badly conditioned due to the importance of the wings of the Gaussian. Similar issues of conditioning have been discussed in the context of state and process tomography, see, e.g., Refs. UnstableBoulant (); UnstableHradil (). Due to a large ratio between the largest and smallest singular values of the matrices defining the quadratic problem, small fluctuations in the probability distribution can result in large variations for the reconstructed POVM. This can result in operators that approximate really well the outcome statistics and yet do not exhibit a smooth distribution in photonnumber. We will discuss how to treat this problem in the next section.
The measured probabilities for each outcome as a function of are displayed in Fig. 11 and Fig. 10. The probability distributions (equivalent modulo to the Qfunction of the detector) show smooth profiles and distinct photon number ranges of sensitivity for increasing number of clicks in the detector. Fig. 12 shows the POVMs that result from the optimisation in Eq. (V.1) which we will discuss later. A first remarkable property is that being the POVM for clicks, shows zero amplitude for detecting less than photons. That is, the detector shows essentially no dark counts. It should be noted that this was not assumed at the outset and is purely the result of the optimization. This sharp feature gives the detector its discriminatory power where clicks means at least photons in the input pulse.
To assess the performance of our method we compare it to the model described in the previous section. This time however, the BS used in the model are not 50/50 but its reflectivities () were measured experimentally. This was done measuring the reflected and transmitted beams of a laser with a calibrated power meter. The yellow bars in Fig. 12, show the absolute value of the difference between the theoretical and the reconstructed POVM elements. The magnitude
is shown stacked on top of each coefficient of the POVM elements where (theo) stands for theoretical and (rec) for reconstructed. The small yellow bars reveal a good agreement with the model. We also calculate a form of fidelity finding that
holds for all . Note that to calculate we normalized the POVM elements. This overlap indicates an excellent agreement between the two.
In addition, one can reconstruct a probability distribution: from the found POVMs to fit the data. The reconstruction is plotted as dark blue bars in Fig. 10 and Fig. 11. It is the equivalent of the Qfunction had our probe states with suitable complex been strictly pure. In fact, although formally distinct, the probability distribution associated with the reconstructed POVM using mixed or pure states are practically indistinguishable and are plotted together in Fig. 10 for comparison.
v.2 Detector Wigner functions
An alternative representation of the detectors which can give us more insight about their structure comes from the quasiprobability distributions such as the Wigner Function UlfBook (); Schleichbook (). Since the POVM elements are self adjoint positivesemidefinite operators, a Wigner function can be calculated in the standard way from the POVM element :
(13) 
where we have, as usual, now identified as phase space coordinates of a single mode with . However, since the POVMs do not have unit trace, this detector Wigner function will not be normalized,
(14) 
We should note that the marginals cannot be interpreted as probability distributions but we can still use to calculate probabilities according to
(15) 
Since none of the detectors have phase sensitivity their Wigner functions are rotationally symmetric around the origin. In Fig. 13 we display a cut of the TMD Wigner function for the following POVM elements:
Higher clicks are not displayed because their amplitude is too small to be compared with the rest. The interesting feature about the plot is the comparison with the theoretical TMD Wigner functions one can generate with the model. Indeed, comparing a theoretical lossless TMD with the measured one we see how the amplitude of the Wigner function decreases rapidly for higher photon numbers. On the other hand, comparison with the lossy theoretical model reveals a good agreement.
Vi Ill Conditioning and regularisation
One of the main problems encountered in the tomographic characterisation of the detectors has to do with the numerical stability of the reconstruction. Such problems are common in tomography UnstableBoulant (); UnstableHradil (). Consider for example the transformations involved in the inverse Radon transform and their inherent instabilities. Note also how going from the Qfunction to the Pfunction is not always well defined Schleichbook (). Multiple tools exist to bridge the link between homodyne tomography and the density matrix description lvovskypattern (). One of them involves the use of pattern functions darianopattern (); leonardtpattern (); wunsche1997 (). That is, finding some functions such that
However, finding the appropriate involves the irregular wave functions (particular unbounded solutions of the Schrödinger equation) and proving them to be appropriate is typically as hard as estimating the error patternfun (). The use of maximum likelihood has also been explored and particularly for detector tomography dariano200493 (); PhysRevA.64.024102 (). However, the speed of the convergence is not generally guaranteed to be high, becoming exponential for certain problems. Our approach, following the spirit of maximumlikelihood, translates the problem into a quadratic optimisation one allowing for efficient semidefinite programming (SDP) (cf. Eq. (V.1)). We discuss here the details, approximations and filters that lead to our solution of the problem.
vi.1 Truncating the Hilbert space
The data was measured up to but was truncated at lower values in phase space. This was done to avoid noisy behavior and the emergence of new regimes in the behavior of the detector. Memory effects requiring a larger POVM space were thus avoided as discussed in the experimental section. Notably, the effects related to the detector’s dead time, afterpulsing or the dark counts from possible over heating were avoided staying in a low () photon number regime.
vi.2 Pure vs. mixed
The Qfunction of our detector (directly measured) is proportional to
(16) 
(17) 
with the usual constraints and . Using a semidefinite solver such as Yalmip, the obtained POVMs shows irregular dips and a structure quite dissimilar from what a TMD is expected to do.
The Fig. 14 shows a typical result, and Fig. 15 shows it for higher photon numbers revealing an even more irregular structure.
vi.2.1 Describing the laser’s amplitude uncertainty
A first meaningful observation is that some level of uncertainty existed in the amplitude of the coherent states. If values of were measured then the real
might actually have been
with some vector of errors . To address the effect of this uncertainty on our minimisation we can artificially introduce noise and then average over many runs of the optimisation. In other words, since in Eq. (17) depends on the measured values of , we can substitute with where are independent and identically distributed random variables. Using we run the optimisation and obtain a family of estimated POVM elements (each element of the family corresponds to a run of the optimisation with a different ) As a first approximation we may use a Gaussian probability distribution with zero mean and . Note that % was the measured variance of the laser amplitude from pulse to pulse as shown in Fig. 18. Subsequently we average over the POVMs obtained with different “jitters” in realizations, obtaining
iterations of the optimisation with subsequent averaging improves the appearance of the POVMs but barely solves the “dips” observed. Fig. 16 and Fig. 17 are an example of this approach, showing that this kind of averaging does not properly counteract the fluctuations in the reconstructed POVM.
vi.2.2 Using mixed input states
A key obsevation showing that the previous approach is not the appropriate treatment of uncertainty in is that each probe state would be best described by a mixture of coherent states,
(18)  
(19) 
Here, would be some distribution centered around in phase space, leading to a mixed Gaussian state in case of a Gaussian classical probability distribution. We can integrate this state over the complex phase since we have no phase reference available and focus solely on the amplitude of the coherent states or mixtures thereof. Measurements reveal that the intensity of the laser varies from pulse to pulse following a distribution that looks like a Lorentzian with a tail (see Fig. 18).
A good approximation can however be made using a Gaussian distribution, leading to a Gaussian state, with standard deviation , implying
with . The detection probability for outcome is then
(20) 
To simplify these calculations we can write a distribution in ,
(21) 
with
In this case has been chosen such that the approximation holds. These subtleties however do barely alter our results and POVMs are as irregular as previously.
To evaluate the difference introduced by the pure () or mixed state () approach we have studied their influence on the reconstructed POVMs. In the regularised optimisation (i.e., for our final results), we have compared the POVMs obtained with each description finding that
(22) 
and the largest relative difference between any two coming from a mixed state or a pure state derivation was . Furthermore, the reconstructed probability distributions are so close that they are indistinguishable on the scale of Fig. 10. This reinforces our earlier expectation that technical noise in the laser will be negligible when using singlephotonlevel coherent states. This differs from homodyne tomography where technical noise can shift a strong local oscillator to a nearly orthogonal state.
However, since the problem of the irregular POVMs is not solved by the mixed state description we need to look further into the origin of these irregularities. One first remarkable (but expected) property is that large variations in the photon number degree of freedom of the POVMs result in minuscule differences in the probability distributions (see Fig. 12). Since one convolutes the photon number distribution with a Gaussian in to obtain the Qfunction this behavior has been expected. Conversely this means that small errors or statistical fluctuations in the Qfunction can result in large errors in the POVM elements. Consider for example that if instead of
we try to minimise
the SDP solver finds no sensible solution. This is because using the MoorePenrose pseudoinverse we find due to its inherent ill conditioning, meaning that the ratio of largest and smallest singular values in is large.
Various methods exist to try and regularise these problems. Whatever the chosen method it should assume as little knowledge as possible about the specific form of the sought POVM. Since has very small values for high photon numbers one could enhance those values while preserving the minimisation target. For example we could run the optimisation
where is a diagonal matrix aimed at regularising the problem. This can be shown to introduce some improvement but does not solve the “dip problem” completely. It is also hard to find the exact form of that yields “good” results without any prior knowledge about the expected POVMs. In addition (roughly speaking) it is hard to find a balance between having good results for low photon numbers and for high photon numbers.
Another approach is to introduce a sort of damping or specific penalisation. For example one could define a diagonal matrix such that
and use it to redistribute the weight of each POVM element, avoiding unreasonably large POVM element amplitudes (that compensate for low values in ). The optimisation could be recast as,
(23) 
A result of this can be seen on Fig. 19. This method has the same shortcomings as the previous one: it is sensitive to the choice of parameters and the exact form of is hard to determine without detailed prior assumptions.
A more reasonable method is to capture the relative smoothness of the POVM from a lossy detector. This method is also called smoothing regularisation Convex (). In this case one single assumption needs to be made. The POVMs should exhibit a certain degree of “smoothness”.
vi.3 Smooth or not?
Let us first define what we mean by smooth. Smooth will mean in this context that the difference is small for all and . In the optimisation context we will mean that our minimisation is defined as follows:
(24) 
with
for some fixed value of y. The smoothing function will be independent of the detector, and will mildly penalize nonsmooth POVM elements. This approach is further substantiated by the observation that the resulting POVMs are largely independent of the weight that is given to the smoothness penalty.
As most quantum detectors, especially those disucussed here are lossy, this is a particularly plausible feature. Indeed, if an optical detector has a POVM element with nonzero amplitude in , then if it is lossy, it will have a positive amplitude in , , decreasing with but different from zero. In fact, in general, if the detector has a finite efficiency which can be modelled with a BS, it will impose some smoothness on the distribution . That is because if is the probability of registering photons and is the probability that were present, then the loss process will impose kennythesis ():
Consequently, if , then , etc. cannot be zero, but will have some relatively smooth distribution. This simple physical argument makes a certain smoothness plausible (but still should allow sharp transitions for ).
For this detector (and for any photodiode based detector) assuming loss is reasonable and can make the “smoothness” requirement appropriate. Let us however see if, without looking at the specific shape of our POVM, we can find an optimal smoothing coefficient and justify further the use of the smoothing regularisation.
One way to test this method is to quantify how resilient it is to noise in the data. To do so we introduce additional noise in to the measured data. For example, we can alter in where is again a vector of random variables distributed with a Gaussian distribution with zero mean. This simulates a statistical uncertainty in the measurement of the coherent state. To see its effect on the reconstruction we use the figure of merit . This quantity should evaluate how POVMs differ from the one without noise. It is seen that the additional smoothing penalty makes the optimisation more robust, largely independent of the value of (we can multiply by a and stay in the same regime). Using this smoothing regularisation with noisy data seems therefore a good choice.
We have seen how smoothing makes the optimization more robust against noise but we should also ask how sensitive this optimisation is to the exact choice of . To do so we may use the following procedure: compare the POVM obtained using with that obtained varying over orders of magnitude. In Fig. 21 we plot the relative error . Remarkably doubling the value of results in an overall relative error in the POVM of less than %. Multiplying (or dividing) by 10 gives a variation below % and fold variation results in a % variation. If we compare how this differs from the case which is % different then we can conclude that the optimisation is quite insensitive to the exact choice of the smoothing parameter . The following table provides some values for reference.
variation  relative error  

0.0001  x/1000  27.3% 
0.001  x/100  12.2% 
0.01  x/10  4% 
0.05  x/2  1% 
0.5  x 5  3% 
1  x 10  5% 
vi.4 Sharp and smooth
There is of course a limit to how much we can penalize nonsmooth POVMs. Is it possible for the smoothing regularisation to wash out all the sharp features of the POVM, thus smoothing in excess? This of course is a legitimate question that further restricts the reasonable range for . To study that effect we analyse four cases:

A theoretical lossless TMD, based on the model described in Eq. (11).

A lossy TMD, based on the above with added loss from an BS.

A perfect photon number detector, that is with .

An artificial POVM with sharp variations, containing the POVM elements:
and observing
To study the smoothing we generate the POVM elements numerically, build a probability distribution and retrieve the using the optimisation from Eq. (24) for an increasing range of . Then we compare these results with the theoretical POVMs we defined in order to generate the PD. All optimisations are done using the mixedstate approach from Eq. (21). Broadly speaking, we find two distinct behaviors: POVMs with terms that decay slowly in photon number need regularisation and are quite insensitive to the precise . For sharp POVMs (without loss) the range preserves their shape quite well, but further smoothing hides their true shape. These properties are further illustrated in the figures that follow.