Optimal estimation and discrimination of excess noise
in thermal and amplifier channels
We determine a fundamental upper bound on the performance of any adaptive protocol for discrimination or estimation of a channel which has an unknown parameter encoded in the state of its environment. Since our approach relies on the principle of data processing, the bound applies to a variety of discrimination measures, including quantum relative entropy, hypothesis testing relative entropy, Rényi relative entropy, fidelity, and quantum Fisher information. We apply the upper bound to thermal (amplifier) channels with a known transmissivity (gain) but unknown excess noise. In these cases, we find that the upper bounds are achievable for several discrimination measures of interest, and the method for doing so is non-adaptive, employing a highly squeezed two-mode vacuum state at the input of each channel use. Estimating the excess noise of a thermal channel is of principal interest for the security of quantum key distribution, in the setting where a fiber-optic cable has a known transmissivity but a tampering eavesdropper alters the excess noise on the channel, so that estimating the excess noise as precisely as possible is desirable. Finally, we outline a practical strategy which can be used to achieve these limits.
Introduction—One of the primary goals of quantum information theory is to identify limitations on how well one can process information or estimate an unknown parameter, when allowing for quantum effects book2000mikeandike (); H06book (); H12 (); W15book (). Along with this goal, there is great interest in determining whether it is possible to approach these limits in principle, and furthermore, if this can be done in practice with realistic constraints taken into account, such as time, energy, scalability, etc.
In this paper, we are interested in the fundamental limitations on channel discrimination and estimation for a particular class of quantum channels. Suppose that an unknown parameter is encoded in an environmental state, which subsequently interacts with an input quantum system via a fixed unitary quantum interaction. Suppose further that the unitary interaction has two output quantum systems, one of which is available to the receiver and the other of which is lost or discarded to the environment. The transformation of the input system to the available output system is called a quantum channel. Let us call such channels environment-parametrized channels, given that the unknown parameter is encoded exclusively in the environment and not in the unitary interaction. (These channels were called programmable quantum channels in DP05 (); JWDFY08 (), which is the terminology used for them in the context of quantum computation, the idea being that one could encode a program in a quantum state that could then be executed via a unitary interaction between an input and the program register. This meaning and context are completely different from ours, so we prefer to use the terminology “environment-parametrized channel”.) Important environment-parametrized channels of practical interest are thermal channels with a fixed, known transmissivity and unknown excess noise. Other examples are amplifier channels with a fixed, known gain but unknown excess noise.
We consider two tasks: first, we suppose that the parameter takes one of two values and the goal is to figure out which value it takes. Second, we suppose that the parameter takes a value from a continuum and the goal is to estimate the unknown parameter. The former task is called channel discrimination PhysRevA.71.062340 (); PhysRevA.72.014305 (); W08 (); DFY09 (); H09 (); HHLW10 () and the latter channel estimation CPR00 (); W02 (); FI03 (); JWDFY08 (); DKG12 (), both topics having an extensive literature already. Also, there are strong connections between the two tasks N05 (), as one might suspect. In these tasks, we would like for the error probability or the mean-square error, respectively, to be as small as possible when determining the unknown parameter.
For both tasks, the most general strategy that one could allow for when trying to determine an unknown parameter $\theta$ encoded in a quantum channel $\mathcal{N}^{\theta}_{A\to B}$ is an adaptive strategy (see Figure 1). An adaptive strategy that makes $n$ calls to the channel is specified in terms of an input quantum state $\rho_{R_1A_1}$, a set of adaptive, interleaved channels $\{\mathcal{A}^{(j)}_{R_jB_j\to R_{j+1}A_{j+1}}\}_{j=1}^{n-1}$, and a final quantum measurement $\{\Lambda^{\hat{\theta}}_{R_nB_n}\}_{\hat{\theta}}$ that outputs an estimate $\hat{\theta}$ of the unknown parameter. The strategy begins with the discriminator preparing the input quantum state $\rho_{R_1A_1}$ and sending the system $A_1$ into the channel $\mathcal{N}^{\theta}_{A_1\to B_1}$. The channel outputs the system $B_1$, which is then available to the discriminator. The discriminator adjoins the system $B_1$ to system $R_1$ and applies the channel $\mathcal{A}^{(1)}_{R_1B_1\to R_2A_2}$. We say that the channel $\mathcal{A}^{(1)}$ is adaptive because it can take an action conditioned on information in the system $B_1$, which itself might contain some partial information about the unknown parameter $\theta$. The discriminator then inputs the system $A_2$ into the second use of the channel $\mathcal{N}^{\theta}_{A_2\to B_2}$, which outputs a system $B_2$. This process repeats $n-2$ more times, and at the end, the discriminator has systems $R_n$ and $B_n$. The discriminator finally performs the measurement $\{\Lambda^{\hat{\theta}}_{R_nB_n}\}_{\hat{\theta}}$, which outputs an estimate $\hat{\theta}$ of the unknown parameter $\theta$. The conditional probability for the estimate $\hat{\theta}$ given the unknown parameter $\theta$ is given by the Born rule:
$$p(\hat{\theta}|\theta) = \mathrm{Tr}\!\left\{\Lambda^{\hat{\theta}}_{R_nB_n}\left(\mathcal{N}^{\theta}_{A_n\to B_n}\circ\mathcal{A}^{(n-1)}_{R_{n-1}B_{n-1}\to R_nA_n}\circ\cdots\circ\mathcal{A}^{(1)}_{R_1B_1\to R_2A_2}\circ\mathcal{N}^{\theta}_{A_1\to B_1}\right)(\rho_{R_1A_1})\right\}. \tag{1}$$
Note that such an adaptive strategy contains a non-adaptive strategy as a special case: the system $R_1$ can be arbitrarily large and divided into subsystems, with the only role of the interleaved channels being that they redirect these subsystems to be the inputs of future calls to the channel (as would be the case in any non-adaptive strategy for estimation or discrimination).
Our first main result is a general upper bound on the performance of adaptive discrimination and estimation of environment-parametrized channels. We establish this upper bound for any discrimination measure that satisfies a data-processing inequality (that is, it is monotone non-increasing with respect to the action of a quantum channel). Our result thus holds for all known and useful discrimination measures, given that the data-processing inequality is the most basic requirement needed for any discrimination measure. This includes well known discrimination measures such as quantum relative entropy U62 (), Rényi relative entropy P86 (); MDSFT13 (); WWY13 (), quantum fidelity U76 (), trace distance, Chernoff information PhysRevLett.98.160501 (); ANSV08 (), hypothesis testing relative entropy HP91 (); BD10 (); WR12 (), etc., each of which have operational interpretations for certain information-processing tasks. The essential statement of the upper bound is that one’s ability to discriminate or estimate environment-parametrized channels is limited by how well one can discriminate or estimate the environmental states that encode the unknown parameter.
In our second main result, we show that it is possible to attain this upper bound in principle for a number of the discrimination measures listed above, when estimating excess noise in thermal channels or excess noise in amplifier channels. For these particular channels, the unknown parameter is the mean photon number of an environmental thermal state, while the transmissivity or gain is known in our scenario. We find that the optimal strategy does not involve any adaptation whatsoever and consists solely in sending one share of a highly squeezed two-mode squeezed vacuum state into each use of the channel, followed by a measurement on the output systems. What we find remarkable about this result is that, in the limit of large squeezing, several of the discrimination measures mentioned above depend only on the mean photon number of the environmental thermal state and have no dependence on the transmissivity or gain of the channel. Thus, such a strategy with a highly squeezed two-mode squeezed vacuum state allows for removing the effect of loss or gain in the channel, and we provide a physical interpretation for this phenomenon in what follows.
Our results for estimating excess noise in thermal channels should be useful for the security of quantum key distribution SBCDLP09 (). There, the transmissivity is typically known when the communication medium is a fiber-optic cable, but the excess noise in the channel can be attributed to a tampering eavesdropper. Thus, estimating excess noise in the channel is of primary interest and plays a critical role in security analyses.
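Environment-parametrized channels—We begin by defining an environment-parametrized quantum channel DP05 (); JWDFY08 (). Let $\theta$ be an unknown parameter, and let $\rho_E^{\theta}$ be a quantum state that depends on $\theta$. Let $U_{AE\to BE'}$ be a unitary operator that takes vectors in a tensor-product input Hilbert space $\mathcal{H}_A\otimes\mathcal{H}_E$ to vectors in a tensor-product output Hilbert space $\mathcal{H}_B\otimes\mathcal{H}_{E'}$. Then we define an environment-parametrized channel $\mathcal{N}^{\theta}_{A\to B}$ as follows:
$$\mathcal{N}^{\theta}_{A\to B}(X_A) \equiv \mathrm{Tr}_{E'}\!\left\{U_{AE\to BE'}\left(X_A\otimes\rho_E^{\theta}\right)U_{AE\to BE'}^{\dagger}\right\}, \tag{2}$$
where $X_A$ is an operator acting on $\mathcal{H}_A$ and $\mathrm{Tr}_{E'}$ denotes the partial trace over $E'$. By inspecting the above definition, we see that only the environment state $\rho_E^{\theta}$ depends on the unknown parameter $\theta$, while the unitary interaction $U_{AE\to BE'}$ is fixed and independent of $\theta$. Thus, all of the information that distinguishes one channel $\mathcal{N}^{\theta_1}_{A\to B}$ from another channel $\mathcal{N}^{\theta_2}_{A\to B}$ is encoded in the environment of these channels.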
Particular examples of environment-parametrized channels are thermal channels, noisy amplifier channels, Pauli channels, and erasure channels. We review the first two here and sketch later why the latter two are environment-parametrized. The unitary for a thermal channel is defined from the following Heisenberg input-output relations:
$$\hat{b} = \sqrt{\eta}\,\hat{a} + \sqrt{1-\eta}\,\hat{e}, \tag{3}$$
$$\hat{e}' = -\sqrt{1-\eta}\,\hat{a} + \sqrt{\eta}\,\hat{e}, \tag{4}$$
where $\hat{a}$, $\hat{b}$, $\hat{e}$, and $\hat{e}'$ are the field-mode annihilation operators for the sender’s input, the receiver’s output, the environment’s input, and the environment’s output of these channels, respectively. The environmental mode is prepared in a thermal state $\theta(N_B)$ of mean photon number $N_B\geq 0$, defined as
$$\theta(N_B) \equiv \frac{1}{N_B+1}\sum_{n=0}^{\infty}\left(\frac{N_B}{N_B+1}\right)^{n}|n\rangle\langle n|, \tag{5}$$
where $\{|n\rangle\}_{n=0}^{\infty}$ is the orthonormal, photonic number-state basis. The parameter $N_B$ is the excess noise of the thermal channel. When $N_B=0$, $\theta(N_B)$ reduces to the vacuum state, in which case the resulting channel in (3) is called the pure-loss channel. It is said to be quantum-limited in this case because the environment is injecting the minimum amount of noise allowed by quantum mechanics. The parameter $\eta\in(0,1)$ is the transmissivity of the channel, representing the average fraction of photons making it from the input to the output of the channel. Let $\mathcal{L}_{\eta,N_B}$ denote this channel. In our application, we set the unknown parameter $\theta=N_B$, and we suppose that the transmissivity $\eta$ is known.
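As a small numerical illustration of the thermal state just defined, the following sketch tabulates its photon-number distribution $p_n = N_B^n/(N_B+1)^{n+1}$ and checks that it is normalized with mean $N_B$. The function name and the truncation `cutoff` are hypothetical choices made only for this example.

```python
def thermal_photon_probs(mean_photons: float, cutoff: int) -> list[float]:
    """Photon-number probabilities p_n = N^n / (N + 1)^(n + 1) for n = 0..cutoff-1.

    `cutoff` truncates the infinite number basis; it is an illustration-only
    parameter and must be large enough that the neglected tail is negligible.
    """
    if mean_photons < 0:
        raise ValueError("mean photon number must be non-negative")
    ratio = mean_photons / (mean_photons + 1.0)
    return [ratio**n / (mean_photons + 1.0) for n in range(cutoff)]


probs = thermal_photon_probs(0.5, cutoff=200)
total = sum(probs)                               # close to 1 (normalization)
mean = sum(n * p for n, p in enumerate(probs))   # close to the mean photon number
print(total, mean)
```

Setting the mean photon number to zero returns the vacuum distribution, which is the quantum-limited (pure-loss) case discussed above.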
The unitary for an amplifier channel is defined from the following Heisenberg input-output relations:
$$\hat{b} = \sqrt{G}\,\hat{a} + \sqrt{G-1}\,\hat{e}^{\dagger}, \tag{6}$$
$$\hat{e}' = \sqrt{G-1}\,\hat{a}^{\dagger} + \sqrt{G}\,\hat{e}. \tag{7}$$
The parameter $G>1$ is the gain of the amplifier channel. For this channel, the environment is prepared in the thermal state $\theta(N_B)$. The parameter $N_B$ is the excess noise of the amplifier channel. If $N_B=0$, the amplifier channel is said to be quantum-limited for a similar reason as stated above. Let $\mathcal{A}_{G,N_B}$ denote this channel. The class of amplifier channels that we consider consists of those with a fixed, known gain $G$ and the unknown parameter $\theta=N_B$.
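To make the role of the excess noise concrete, here is a minimal sketch of the mean output photon number implied by beamsplitter-type and two-mode-squeezer-type Heisenberg relations of the standard form $\hat{b}=\sqrt{\eta}\,\hat{a}+\sqrt{1-\eta}\,\hat{e}$ and $\hat{b}=\sqrt{G}\,\hat{a}+\sqrt{G-1}\,\hat{e}^{\dagger}$. It assumes that the input is in a product state with a thermal environment of mean photon number `n_b`, so that all cross terms vanish; the function names are hypothetical.

```python
def thermal_channel_output_photons(n_in: float, eta: float, n_b: float) -> float:
    """Mean output photon number <b^dag b> = eta * n_in + (1 - eta) * n_b."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("transmissivity must lie in [0, 1]")
    return eta * n_in + (1.0 - eta) * n_b


def amplifier_channel_output_photons(n_in: float, gain: float, n_b: float) -> float:
    """Mean output photon number <b^dag b> = G * n_in + (G - 1) * (n_b + 1).

    The '+ 1' is the vacuum contribution <e e^dag> = n_b + 1, which is present
    even for a quantum-limited amplifier (n_b = 0).
    """
    if gain < 1.0:
        raise ValueError("gain must be at least 1")
    return gain * n_in + (gain - 1.0) * (n_b + 1.0)


# Vacuum input: the output photons come entirely from the environment.
print(thermal_channel_output_photons(0.0, eta=0.5, n_b=2.0))
print(amplifier_channel_output_photons(0.0, gain=2.0, n_b=0.0))
```

In both cases the unknown excess noise enters the output only through the environment term, which is the sense in which the parameter is encoded in the environment.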
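General bound from quantum data processing—We now establish our first main result. Let $\mathbf{D}(\rho\|\sigma)$ denote a generalized divergence SW12 (); WWY13 (), which is a function accepting two quantum states $\rho$ and $\sigma$ as input and producing a non-negative real number as its output. The only property that we demand of a generalized divergence is that the following data-processing inequality hold:
$$\mathbf{D}(\rho\|\sigma) \geq \mathbf{D}(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)), \tag{8}$$
where $\mathcal{N}$ is a quantum channel. The inequality in (8) asserts that a generalized divergence $\mathbf{D}(\rho\|\sigma)$, interpreted as a measure of distinguishability of the states $\rho$ and $\sigma$, does not increase under the action of a quantum channel $\mathcal{N}$. Particular examples of generalized divergences include quantum relative entropy U62 (), hypothesis testing relative entropy HP91 (); BD10 (); WR12 (), quantum fidelity U76 (), trace distance, Rényi relative entropy P86 (); MDSFT13 (); WWY13 (), etc. Note that any generalized divergence is unitarily invariant WWY13 (): i.e., the following equality holds for any unitary operator $U$:
$$\mathbf{D}(\rho\|\sigma) = \mathbf{D}(U\rho U^{\dagger}\|U\sigma U^{\dagger}), \tag{9}$$
because $\mathcal{U}(\cdot)\equiv U(\cdot)U^{\dagger}$ and $\mathcal{U}^{\dagger}(\cdot)\equiv U^{\dagger}(\cdot)U$ are quantum channels, $\mathcal{U}^{\dagger}(\mathcal{U}(\rho))=\rho$, and $\mathcal{U}^{\dagger}(\mathcal{U}(\sigma))=\sigma$, so that applying (8) in both directions gives equality. Furthermore, it is invariant with respect to tensoring in the same state $\tau$ WWY13 ():
$$\mathbf{D}(\rho\|\sigma) = \mathbf{D}(\rho\otimes\tau\|\sigma\otimes\tau), \tag{10}$$
because the map $\omega\mapsto\omega\otimes\tau$ is a quantum channel and partial trace is a quantum channel, so that $\mathbf{D}(\rho\|\sigma)\geq\mathbf{D}(\rho\otimes\tau\|\sigma\otimes\tau)$ and $\mathbf{D}(\rho\otimes\tau\|\sigma\otimes\tau)\geq\mathbf{D}(\rho\|\sigma)$.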
Suppose that the discriminator is attempting to distinguish two environment-parametrized channels of the form in (2), where the environmental state is either $\rho_E^{\theta_1}$ or $\rho_E^{\theta_2}$. In such a case, the conditional probability for outputting $\hat{\theta}$ is $p(\hat{\theta}|\theta_i)$ for $i\in\{1,2\}$ as given in (1), whenever the discrimination strategy is the most general adaptive strategy as outlined before. Then our first main result is the following inequality:
$$\mathbf{D}\big(p(\cdot|\theta_1)\,\big\|\,p(\cdot|\theta_2)\big) \leq \mathbf{D}\big((\rho_E^{\theta_1})^{\otimes n}\,\big\|\,(\rho_E^{\theta_2})^{\otimes n}\big). \tag{11}$$
Manifest in the above inequality is the following intuitive statement: the discriminator’s ability to distinguish the two channels, if given $n$ calls to the channel, cannot be any better than if the discriminator were presented with $n$ copies of the environmental state and then asked to decide which one was presented. If the generalized divergence is also additive with respect to tensor-product states, which holds for many examples of divergences as we discuss below, then (11) reduces to
$$\mathbf{D}\big(p(\cdot|\theta_1)\,\big\|\,p(\cdot|\theta_2)\big) \leq n\,\mathbf{D}\big(\rho_E^{\theta_1}\,\big\|\,\rho_E^{\theta_2}\big). \tag{12}$$
We note that results bearing some similarities to (11) have appeared in previous papers JWDFY08 (); DKG12 (), but the previous statements are not given in such generality (i.e., for all generalized divergences) nor were the previous statements argued to apply to the most general adaptive strategy one could consider and instead only argued for non-adaptive strategies.
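We now prove the inequality in (11). For simplicity, let us suppose that the adaptive discrimination strategy consists of two calls to the unknown channel; it will then be easy to see how to generalize the result to get (11). Let $\rho_{R_1A_1}$ be the input state, $\mathcal{A}^{(1)}_{R_1B_1\to R_2A_2}$ the adaptive channel, $\{\Lambda^{\hat{\theta}}_{R_2B_2}\}_{\hat{\theta}}$ the final measurement, $U$ the fixed unitary from (2), and $\rho_E^{\theta_i}$ the environmental state for $i\in\{1,2\}$. Then, in this case,
$$p(\hat{\theta}|\theta_i) = \mathrm{Tr}\!\left\{\Lambda^{\hat{\theta}}_{R_2B_2}\left(\mathcal{N}^{\theta_i}_{A_2\to B_2}\circ\mathcal{A}^{(1)}_{R_1B_1\to R_2A_2}\circ\mathcal{N}^{\theta_i}_{A_1\to B_1}\right)(\rho_{R_1A_1})\right\}, \tag{13}$$
and let us abbreviate the state being measured on the right as $\omega^{i}_{R_2B_2}$. Then
$$\begin{aligned}
\mathbf{D}\big(p(\cdot|\theta_1)\big\|p(\cdot|\theta_2)\big)
&\leq \mathbf{D}\big(\omega^{1}_{R_2B_2}\big\|\omega^{2}_{R_2B_2}\big)\\
&\leq \mathbf{D}\big(U[(\mathcal{A}^{(1)}\circ\mathcal{N}^{\theta_1})(\rho_{R_1A_1})\otimes\rho_E^{\theta_1}]U^{\dagger}\big\|U[(\mathcal{A}^{(1)}\circ\mathcal{N}^{\theta_2})(\rho_{R_1A_1})\otimes\rho_E^{\theta_2}]U^{\dagger}\big)\\
&= \mathbf{D}\big((\mathcal{A}^{(1)}\circ\mathcal{N}^{\theta_1})(\rho_{R_1A_1})\otimes\rho_E^{\theta_1}\big\|(\mathcal{A}^{(1)}\circ\mathcal{N}^{\theta_2})(\rho_{R_1A_1})\otimes\rho_E^{\theta_2}\big)\\
&\leq \mathbf{D}\big(\mathcal{N}^{\theta_1}(\rho_{R_1A_1})\otimes\rho_E^{\theta_1}\big\|\mathcal{N}^{\theta_2}(\rho_{R_1A_1})\otimes\rho_E^{\theta_2}\big)\\
&\leq \mathbf{D}\big(U[\rho_{R_1A_1}\otimes\rho_E^{\theta_1}]U^{\dagger}\otimes\rho_E^{\theta_1}\big\|U[\rho_{R_1A_1}\otimes\rho_E^{\theta_2}]U^{\dagger}\otimes\rho_E^{\theta_2}\big)\\
&= \mathbf{D}\big(\rho_{R_1A_1}\otimes\rho_E^{\theta_1}\otimes\rho_E^{\theta_1}\big\|\rho_{R_1A_1}\otimes\rho_E^{\theta_2}\otimes\rho_E^{\theta_2}\big)\\
&= \mathbf{D}\big((\rho_E^{\theta_1})^{\otimes 2}\big\|(\rho_E^{\theta_2})^{\otimes 2}\big).
\end{aligned} \tag{14}$$
All of the steps given above are a consequence of the data-processing inequality in (8). The first inequality follows because the final measurement can be considered as a quantum channel acting on the states $\omega^{1}_{R_2B_2}$ and $\omega^{2}_{R_2B_2}$ that produces the respective output probability distributions $p(\cdot|\theta_1)$ and $p(\cdot|\theta_2)$. The second inequality follows from the definition of environment-parametrized channels in (2) and because a partial trace is a quantum channel. The first equality follows because any generalized divergence is unitarily invariant, as recalled in (9). The third inequality follows by discarding the adaptive channel $\mathcal{A}^{(1)}$. The next two steps follow from the same reasoning (partial trace, then unitary invariance), and the last equality follows from (10). Thus we establish the inequality in (11) for $n=2$, and repeating the above steps establishes (11) for arbitrary $n$.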
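As a first example of a generalized divergence, consider the hypothesis testing relative entropy HP91 (); BD10 (); WR12 (), defined for $\varepsilon\in[0,1]$ and quantum states $\rho$ and $\sigma$ as
$$D_H^{\varepsilon}(\rho\|\sigma) \equiv -\log\inf_{\Lambda}\mathrm{Tr}\{\Lambda\sigma\}, \tag{15}$$
where the infimum is with respect to all operators $\Lambda$ satisfying $0\leq\Lambda\leq I$ and $\mathrm{Tr}\{\Lambda\rho\}\geq 1-\varepsilon$. The physical interpretation of this quantity is in asymmetric hypothesis testing: if it is desired that the error probability in identifying the state $\rho$ by the measurement $\{\Lambda, I-\Lambda\}$ be no larger than $\varepsilon$, then the optimal value of $\mathrm{Tr}\{\Lambda\sigma\}$ is the minimum error probability that one could have in identifying the state $\sigma$ using the same binary-outcome quantum measurement. The hypothesis testing relative entropy is a quantity of deep interest in quantum information theory because various relevant information measures can be built from it, which are useful in assessing the performance of a variety of information-processing tasks MW12 (); TH12 (); DTW14 (); TWW14 (); WTB16 (). It obeys the data-processing inequality in (8) by its very definition, for the simple reason that applying the same quantum channel to the states $\rho$ and $\sigma$ never decreases the two different error probabilities discussed above WR12 ().
Applying the result in (11) leads to the following bound:
$$D_H^{\varepsilon}\big(p(\cdot|\theta_1)\big\|p(\cdot|\theta_2)\big) \leq D_H^{\varepsilon}\big((\rho_E^{\theta_1})^{\otimes n}\big\|(\rho_E^{\theta_2})^{\otimes n}\big) = n\,D(\rho_E^{\theta_1}\|\rho_E^{\theta_2}) + \sqrt{n\,V(\rho_E^{\theta_1}\|\rho_E^{\theta_2})}\,\Phi^{-1}(\varepsilon) + O(\log n), \tag{16}$$
where, in the last equality, we have used the quantum relative entropy $D$ U62 (), the quantum relative entropy variance $V$ li12 (); TH12 (), the inverse $\Phi^{-1}$ of the cumulative Gaussian distribution function, and an expansion of the hypothesis testing relative entropy that holds for tensor-power states li12 (); TH12 (); DPR15 (). The bound in (16) thus places a fundamental limitation on the performance of any adaptive channel discrimination strategy in the context of asymmetric hypothesis testing.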
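As further examples, consider the Rényi relative entropy and the sandwiched Rényi relative entropy, defined for quantum states $\rho$ and $\sigma$ and for $\alpha\in(0,1)\cup(1,\infty)$ as
$$D_{\alpha}(\rho\|\sigma) \equiv \frac{1}{\alpha-1}\log\mathrm{Tr}\{\rho^{\alpha}\sigma^{1-\alpha}\}, \tag{17}$$
$$\widetilde{D}_{\alpha}(\rho\|\sigma) \equiv \frac{1}{\alpha-1}\log\mathrm{Tr}\Big\{\big(\sigma^{(1-\alpha)/2\alpha}\rho\,\sigma^{(1-\alpha)/2\alpha}\big)^{\alpha}\Big\}. \tag{18}$$
The first one satisfies (8) for $\alpha\in(0,1)\cup(1,2]$ P86 (), and the second one satisfies (8) for $\alpha\in[1/2,1)\cup(1,\infty)$ MDSFT13 (); WWY13 (); B13monotone (); FL13 (); MO15 (). Both are additive with respect to tensor-product states, converge to the quantum relative entropy in the limit as $\alpha\to 1$, and thus satisfy (8) in this limit. Applying (11) we find that
$$D_{\alpha}\big(p(\cdot|\theta_1)\big\|p(\cdot|\theta_2)\big) \leq n\,D_{\alpha}(\rho_E^{\theta_1}\|\rho_E^{\theta_2}), \qquad \widetilde{D}_{\alpha}\big(p(\cdot|\theta_1)\big\|p(\cdot|\theta_2)\big) \leq n\,\widetilde{D}_{\alpha}(\rho_E^{\theta_1}\|\rho_E^{\theta_2}), \tag{19}$$
for the ranges of $\alpha$ for which data processing holds. As these quantities have operational meaning in the context of asymmetric hypothesis testing as error exponents and strong converse exponents in the quantum Hoeffding bound MO15 (), the above inequalities place fundamental limitations on the exponential convergence rate of error probabilities of adaptive channel discrimination strategies in this setting (see also CMW14 () for results on adaptive channel discrimination and Rényi relative entropies).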
An important measure in quantum estimation theory is the quantum Fisher information Hel76 (); H82 (); BC94 (); BCM96 (), related to quantum fidelity and defined for a continuously parametrized set of states as (H06book, Theorem 6.3)
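(See the appendix for a derivation of the second equality.) The importance of the quantum Fisher information, denoted here by $I_F(\theta)$, is that its inverse is a lower bound on the variance of any unbiased estimator $\hat{\theta}$ of $\theta$ Hel76 (); H82 (); BC94 (); BCM96 ():
$$\mathrm{Var}(\hat{\theta}) \geq \frac{1}{I_F(\theta)}. \tag{22}$$
One can apply the same reasoning to adaptive protocols for estimating an unknown parameter encoded in a family of channels, and we find that
$$\mathrm{Var}(\hat{\theta}) \geq \frac{1}{I_F(\theta;p)}, \tag{23}$$
where $I_F(\theta;p)$ is the Fisher information with respect to the conditional probability $p(\hat{\theta}|\theta)$ defined in (1). Applying the bound in (20) and the relation between fidelity and Fisher information in (21), we find that the following lower bound holds when trying to estimate an unknown parameter encoded in a family of environment-parametrized channels of the form in (2), given $n$ calls to the channel:
$$\mathrm{Var}(\hat{\theta}) \geq \frac{1}{n\,I_F(\theta;\{\rho_E^{\theta}\})}, \tag{24}$$
where $I_F(\theta;\{\rho_E^{\theta}\})$ is the quantum Fisher information of the family of environmental states.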
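Application to thermal channels—We now show that several of the above upper bounds are in fact achievable whenever the goal is to determine the excess noise in a thermal channel with known transmissivity. We begin with channel discrimination. Suppose that we are given two thermal channels $\mathcal{L}_{\eta,N_B^1}$ and $\mathcal{L}_{\eta,N_B^2}$, each having a known transmissivity $\eta\in(0,1)$, with excess noise equal to $N_B^1$ or $N_B^2$. (If $\eta=1$ or $N_B^1=N_B^2$, then it is impossible to distinguish the channels and so we do not consider these cases.) In all cases for discrimination or estimation, we find that a non-adaptive strategy involving $n$ copies of a highly squeezed, two-mode squeezed vacuum state attains the upper bounds given above, proving that this non-adaptive strategy suffices for achieving the best possible performance. The two-mode squeezed vacuum state is equivalent to a purification of the thermal state in (5), with mean photon number $N_S$ in each mode, and is defined as
$$|\psi_{\mathrm{TMS}}(N_S)\rangle_{RA} \equiv \frac{1}{\sqrt{N_S+1}}\sum_{n=0}^{\infty}\left(\sqrt{\frac{N_S}{N_S+1}}\right)^{n}|n\rangle_R|n\rangle_A. \tag{25}$$
The strategy we are employing in all cases leads to $n$ copies of the following final pre-measurement state for $i\in\{1,2\}$:
$$\rho^{i}_{RB} \equiv (\mathrm{id}_R\otimes\mathcal{L}_{\eta,N_B^i})\big(|\psi_{\mathrm{TMS}}(N_S)\rangle\langle\psi_{\mathrm{TMS}}(N_S)|_{RA}\big). \tag{26}$$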
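The statement that the two-mode squeezed vacuum purifies a thermal state can be checked directly in the number basis. The sketch below assumes the standard form $\sum_n c_n|n\rangle|n\rangle$ with $c_n=\sqrt{N_S^n/(N_S+1)^{n+1}}$; the function names and the truncation `cutoff` are hypothetical.

```python
import math


def tmsv_amplitudes(n_s: float, cutoff: int) -> list[float]:
    """Amplitudes c_n of sum_n c_n |n>|n> for mean photon number n_s per mode."""
    ratio = n_s / (n_s + 1.0)
    return [math.sqrt(ratio**n / (n_s + 1.0)) for n in range(cutoff)]


def reduced_photon_probs(amplitudes: list[float]) -> list[float]:
    """Tracing out one mode of sum_n c_n |n>|n> leaves a state that is diagonal
    in the number basis, with diagonal entries |c_n|^2."""
    return [abs(c) ** 2 for c in amplitudes]


n_s = 2.0
probs = reduced_photon_probs(tmsv_amplitudes(n_s, cutoff=300))
mean = sum(n * p for n, p in enumerate(probs))
print(sum(probs), mean)  # normalization and mean photon number of one mode
```

The reduced distribution coincides with the thermal distribution of mean `n_s`, so "highly squeezed" in the text corresponds to large `n_s`.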
Starting with quantum relative entropy, we find the following expansion for large and for , by employing a formula for the quantum relative entropy of Gaussian states PhysRevA.71.062320 (); PLOB15 ():
where is a relative entropic generalization of the well known formula for the entropy of a bosonic thermal state (see, e.g., G08thesis ()) and is defined for as
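In fact, as indicated in (28), we find for all $\eta\in(0,1)$ that $\lim_{N_S\to\infty}D(\rho^{1}_{RB}\|\rho^{2}_{RB}) = D(\theta(N_B^1)\|\theta(N_B^2))$, where $\rho^{1}_{RB}$ and $\rho^{2}_{RB}$ are the final pre-measurement states in (26), so that the relative entropy in the limit of high squeezing converges to the classical relative entropy between the two thermal states that distinguish the channels (here we say classical relative entropy because the states $\theta(N_B^1)$ and $\theta(N_B^2)$ commute).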
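Because two thermal states commute, their relative entropy is the classical relative entropy between two geometric photon-number distributions, which can be summed in closed form. The sketch below compares that closed form, derived here directly from the distributions rather than taken from the text, against a truncated numerical sum. Natural logarithms (nats) and the function names are assumptions of this example.

```python
import math


def thermal_relative_entropy(n1: float, n2: float) -> float:
    """D(theta(n1) || theta(n2)) in nats, for n1 >= 0 and n2 > 0."""
    if n1 < 0 or n2 <= 0:
        raise ValueError("need n1 >= 0 and n2 > 0")
    offset = math.log((n2 + 1.0) / (n1 + 1.0))
    if n1 == 0:
        return offset  # vacuum versus thermal: only the n = 0 term contributes
    slope = math.log(n1 * (n2 + 1.0) / (n2 * (n1 + 1.0)))
    return n1 * slope + offset


def thermal_relative_entropy_numeric(n1: float, n2: float, cutoff: int = 2000) -> float:
    """Truncated sum of p_n * (log p_n - log q_n), for n1, n2 > 0."""
    total = 0.0
    for n in range(cutoff):
        log_p = n * math.log(n1 / (n1 + 1.0)) - math.log(n1 + 1.0)
        log_q = n * math.log(n2 / (n2 + 1.0)) - math.log(n2 + 1.0)
        total += math.exp(log_p) * (log_p - log_q)
    return total


print(thermal_relative_entropy(0.3, 1.2), thermal_relative_entropy_numeric(0.3, 1.2))
```

Note that the value depends only on the two mean photon numbers, which is the transmissivity-independent limit described above.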
Similarly, we find the following expansion for the quantum relative entropy variance for large and for , by employing a formula for the quantum relative entropy variance of Gaussian states WTLB16 ():
As indicated in (31), we also find for all that . The formula in (30) is an expression for the relative entropy variance of two thermal states, which generalizes the entropy variance formula from WRG15 () for a thermal state. See the appendix for a derivation.
By the statement in (16), we find the following upper bound on the performance of any adaptive strategy when discriminating the channels $\mathcal{L}_{\eta,N_B^1}$ and $\mathcal{L}_{\eta,N_B^2}$:
$$D_H^{\varepsilon}\big(p(\cdot|N_B^1)\big\|p(\cdot|N_B^2)\big) \leq n\,D(\theta(N_B^1)\|\theta(N_B^2)) + \sqrt{n\,V(\theta(N_B^1)\|\theta(N_B^2))}\,\Phi^{-1}(\varepsilon) + O(\log n). \tag{32}$$
The expansions for large $N_S$ in (28) and (31) establish that the upper bound in (32) is achievable in the limit as $N_S\to\infty$. As a consequence, by using a highly squeezed state as a probe and in the limit of high squeezing, it is as if the loss in the channel has no effect on the transmitted state, and one’s ability to distinguish the channels is as good as one’s ability to distinguish the environmental states $\theta(N_B^1)$ and $\theta(N_B^2)$, which correspond to the excess noise in the channels. We offer an explanation for this phenomenon later on.
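To get a feel for the size of the second-order bound, the sketch below evaluates $nD+\sqrt{nV}\,\Phi^{-1}(\varepsilon)$ for two thermal states, dropping the $O(\log n)$ term. The closed forms for $D$ and $V$ used here are the classical relative entropy and its variance for two geometric distributions, computed for this example in nats; the function names and the sample parameters are hypothetical.

```python
import math
from statistics import NormalDist


def thermal_relative_entropy(n1: float, n2: float) -> float:
    """D(theta(n1) || theta(n2)) in nats, for n1, n2 > 0."""
    slope = math.log(n1 * (n2 + 1.0) / (n2 * (n1 + 1.0)))
    return n1 * slope + math.log((n2 + 1.0) / (n1 + 1.0))


def thermal_relative_entropy_variance(n1: float, n2: float) -> float:
    """V(theta(n1) || theta(n2)): photon-number variance n1 (n1 + 1) times slope^2."""
    slope = math.log(n1 * (n2 + 1.0) / (n2 * (n1 + 1.0)))
    return n1 * (n1 + 1.0) * slope**2


def second_order_bound(num_uses: int, eps: float, n1: float, n2: float) -> float:
    """n D + sqrt(n V) * Phi^{-1}(eps), omitting the O(log n) correction."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    d = thermal_relative_entropy(n1, n2)
    v = thermal_relative_entropy_variance(n1, n2)
    return num_uses * d + math.sqrt(num_uses * v) * NormalDist().inv_cdf(eps)


for eps in (0.01, 0.1, 0.5):
    print(eps, second_order_bound(1000, eps, n1=0.1, n2=0.2))
```

For error tolerances below one half the second-order term is negative, so the bound sits below the first-order value $nD$ at finite $n$.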
Turning to the fidelity, we find similar results. Applying a formula for the fidelity of two-mode Gaussian states MM12 (), we find for that
Consistent with our previous observations and as indicated in (35), we also find for that .
We finally consider the quantum Fisher information as defined in (21). Applying a formula for the fidelity of two-mode Gaussian states MM12 () and expanding about small and large , we find for and that
Thus, by applying (21), we find that the quantum Fisher information in the large $N_S$ limit is equal to
$$\frac{1}{N_B(N_B+1)}, \tag{37}$$
in agreement with (GL14, Eq. (63)). By applying the bound from (24), the fact that the quantum Fisher information of the family of thermal states $\{\theta(N_B)\}_{N_B}$ is also equal to $1/[N_B(N_B+1)]$, and the fact that the quantum Fisher information is achievable in principle by a measurement Hel76 (); H82 (); BC94 (); BCM96 (), we can conclude that there exists a non-adaptive strategy that achieves the ultimate precision possible in the limit of high squeezing. Furthermore, the quantum Fisher information in (37) has an intuitive interpretation: the noisier the state, the lower the Fisher information, and vice versa.
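For a family of thermal states, the eigenvectors do not depend on the mean photon number, so the quantum Fisher information equals the classical Fisher information of the photon-number distribution, which works out to $1/[N_B(N_B+1)]$. The sketch below checks this value numerically and evaluates the resulting Cramér-Rao lower bound for independent uses. The function names and the truncation are hypothetical.

```python
def thermal_fisher_information(n_b: float) -> float:
    """Closed form 1 / (n_b (n_b + 1)) for the thermal family, n_b > 0."""
    return 1.0 / (n_b * (n_b + 1.0))


def thermal_fisher_information_numeric(n_b: float, cutoff: int = 2000) -> float:
    """Truncated sum of p_n * (d/dN log p_n)^2 with p_n = N^n / (N + 1)^(n + 1)."""
    ratio = n_b / (n_b + 1.0)
    total = 0.0
    for n in range(cutoff):
        p = ratio**n / (n_b + 1.0)
        score = n / n_b - (n + 1.0) / (n_b + 1.0)  # derivative of log p_n
        total += p * score**2
    return total


def cramer_rao_variance_bound(n_b: float, num_uses: int) -> float:
    """Lower bound n_b (n_b + 1) / num_uses on the variance of an unbiased estimator."""
    return n_b * (n_b + 1.0) / num_uses


print(thermal_fisher_information(1.5), thermal_fisher_information_numeric(1.5))
print(cramer_rao_variance_bound(1.5, num_uses=10))
```

The bound grows with the excess noise, matching the remark that noisier states carry less Fisher information.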
Concrete Discrimination Strategy—The convergence of all of these quantum discrimination measures to the corresponding measures for two thermal states calls for an intuitive explanation. Here we give such an explanation by establishing a physical relation between a thermal state with mean photon number $N_B$ and the state defined in (26), in the limit as $N_S\to\infty$. At the same time, this explanation leads to a concrete discrimination strategy consisting of applying the unitary transformation given below followed by photodetection.
Consider the following symplectic transformation:
The symplectic transformation is independent of and diagonalizes when . Also, can be realized by a two-mode squeezer, which corresponds to a unitary transformation acting on the tensor-product Hilbert space of the two modes. Applying to with finite , we get
One can physically eliminate the off-diagonal terms by randomizing the two modes (or just by simply treating them separately). Then in the limit as , we find that the above state is equivalent to a product of two thermal states with photon numbers and . So a concrete discrimination strategy consists in applying the above unitary transformation to the output of each channel, tracing over the second mode, and performing photodetection on the first mode, which is the optimal measurement for distinguishing two thermal states.
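Once the measured mode is (approximately) thermal, the classical post-processing of the photon counts is a likelihood-ratio test between two geometric distributions. The sketch below shows that step only. It assumes that the two candidate mean photon numbers `n1` and `n2` of the detected mode are already known from the transformation above, and it does not model the optics; the function names and the threshold convention are hypothetical.

```python
import math


def log_likelihood_ratio(counts: list[int], n1: float, n2: float) -> float:
    """log[P(counts | n1) / P(counts | n2)] for i.i.d. thermal photon counts,
    with P(k | N) = N^k / (N + 1)^(k + 1) and n1, n2 > 0."""
    slope = math.log(n1 * (n2 + 1.0) / (n2 * (n1 + 1.0)))
    offset = math.log((n2 + 1.0) / (n1 + 1.0))
    return sum(counts) * slope + len(counts) * offset


def decide(counts: list[int], n1: float, n2: float, threshold: float = 0.0) -> int:
    """Return 1 if the counts favor mean photon number n1, and 2 otherwise."""
    return 1 if log_likelihood_ratio(counts, n1, n2) >= threshold else 2


print(decide([0, 0, 0], n1=0.2, n2=2.0))  # few photons favor the smaller mean
print(decide([5, 7, 6], n1=0.2, n2=2.0))  # many photons favor the larger mean
```

The statistic depends on the data only through the total photon count, and shifting `threshold` trades one error probability against the other, as in asymmetric hypothesis testing.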
Application to amplifier channels—For quantum amplifier channels with a fixed known gain but unknown excess noise, we find results similar to the ones given above for thermal channels. The upper bound from (11) results in a generalized divergence between two thermal states. Also, the quantum relative entropy, the quantum relative entropy variance, the fidelity, and the quantum Fisher information evaluated for the state converge to the same expressions given above in the limit of high squeezing, having no dependence on the gain of the amplifier channel. There is a similar explanation for the convergences as given above and a resulting concrete discrimination strategy in the limit of high squeezing.
Teleportation method—One can also arrive at our results for thermal and amplifier channels in terms of a technique called teleportation simulation (BDSW96, Section V). In (BDSW96, Section V), the authors showed how any protocol consisting of adaptive operations interleaved between many independent uses of the same channel can be reduced to a non-adaptive protocol if the channel is simulable by teleportation. This method was reviewed recently in PLOB15 () and therein extended to continuous-variable bosonic channels and others as well. Recently, the technique was also applied in the context of channel discrimination and estimation of particular channels PL16 ().
Briefly, the main idea of the teleportation method is to 1) replace every channel in the protocol by its simulation with teleportation and 2) rearrange all of the uses of the channel to the start of the protocol, such that all of the adaptive operations act at the end of the protocol and the resulting protocol no longer has the adaptive form. For the channels considered in PL16 () (limited to Pauli channels or erasure channels), the resulting protocol is such that one feeds in shares of a maximally entangled state to each channel use. Then a final measurement is performed on this state to discriminate two channels in a given class.
In the examples that we consider here, including thermal channels of a fixed transmissivity or amplifier channels of a fixed gain, we can instead use the two-mode squeezed vacuum state and continuous-variable teleportation prl1998braunstein () to effect the teleportation reduction discussed above. One critical aspect of the problem setup is that the channels being discriminated or estimated have the same transmissivity or gain, so that the teleportation correction operations are independent of the particular channel being discriminated or estimated. In order for the teleportation simulation to be perfect, it is necessary to consider the limit of high squeezing, as we have done above, and the result is to recover all of the convergences of quantum discrimination measures discussed previously.
The teleportation simulation approach to understanding our results is interesting, but we think that the data-processing method outlined in this paper is simpler and more powerful when applicable. The data-processing method applies independently of whether a channel is teleportation simulable, and furthermore, we only need a generalized divergence for the argument in (14) to hold, whereas one further requires continuity (albeit a natural property) in order for the teleportation argument to go through in the continuous-variable case. Finally, the data-processing method outlined here recovers all of the results established in PL16 () because all of the channels considered there are in fact environment-parametrized. To see this, for Pauli channels, we can take the environment state in (2) to be and the unitary interaction to be , where the parameter is the probability vector and is a Pauli operator. For erasure channels, we can take the environment state in (2) to be and the unitary interaction to be .
It would be interesting to determine if there are teleportation-simulable channels that are not environment-parametrized. If it were the case, then the teleportation simulation method could be used to analyze adaptive discrimination and estimation protocols, whereas the data-processing method would not necessarily apply.
Conclusion—We have outlined a general method for bounding the performance of adaptive channel discrimination or estimation of environment-parametrized channels, in which an unknown parameter is encoded in the environment of the channel. The method applies to any generalized divergence, a function whose sole property is data processing (monotonicity under the action of a quantum channel). We applied the approach to several discrimination measures that have operational meaning in a variety of contexts. As a concrete example, we considered thermal (amplifier) channels with known transmissivity (gain) and unknown excess noise. We derived limitations on the performance of the most general adaptive discrimination or estimation strategies for these channels, and we also showed that these limits are achievable in principle if highly squeezed states are available.
Going forward from here, it would be interesting to generalize the approach to channels encoding multiple unknown parameters that need to be estimated or discriminated—the results from GL14 (); M13 () should be helpful here, at least in the case of quantum Gaussian channels. We also wonder whether there are other approaches, besides the data-processing method or the teleportation simulation approach, that could be used to simplify adaptive protocols for channel discrimination or estimation.
Acknowledgements—We are grateful to Saikat Guha and Chenglong You for discussions related to the topic of this paper. MT is grateful to the Hearne Institute for Theoretical Physics at Louisiana State University for hosting him during October 2016, when this research was completed. MMW acknowledges support from the NSF under Award No. CCF-1350397.
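We begin by establishing (30). Let
$$\theta(N_B^i) = \frac{1}{N_B^i+1}\sum_{n=0}^{\infty}\left(\frac{N_B^i}{N_B^i+1}\right)^{n}|n\rangle\langle n|, \qquad i\in\{1,2\},$$
be two thermal states with mean photon numbers $N_B^1, N_B^2>0$. The relative entropy variance is defined as
$$V(\rho\|\sigma) \equiv \mathrm{Tr}\big\{\rho\,[\log\rho-\log\sigma-D(\rho\|\sigma)]^{2}\big\}.$$
For the thermal states above, $\log\theta(N_B^i) = \hat{n}\log[N_B^i/(N_B^i+1)] - \log(N_B^i+1)$, where $\hat{n}$ is the photon-number operator. So then, using $\mathrm{Tr}\{\theta(N_B^1)\,\hat{n}\}=N_B^1$ and $D(\theta(N_B^1)\|\theta(N_B^2)) = \mathrm{Tr}\{\theta(N_B^1)[\log\theta(N_B^1)-\log\theta(N_B^2)]\}$,
$$\log\theta(N_B^1)-\log\theta(N_B^2)-D(\theta(N_B^1)\|\theta(N_B^2)) = (\hat{n}-N_B^1)\,\log\!\left[\frac{N_B^1(N_B^2+1)}{N_B^2(N_B^1+1)}\right].$$
Finally, we find that
$$V(\theta(N_B^1)\|\theta(N_B^2)) = N_B^1(N_B^1+1)\left(\log\!\left[\frac{N_B^1(N_B^2+1)}{N_B^2(N_B^1+1)}\right]\right)^{2},$$
because the photon-number variance of the thermal state $\theta(N_B^1)$ is $N_B^1(N_B^1+1)$.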