A Cramér-Rao bound

# Physical limit to concentration sensing amid spurious ligands

## Abstract

To adapt their behaviour in changing environments, cells sense concentrations by binding external ligands to their receptors. However, incorrect ligands may bind nonspecifically to receptors, and when their concentration is large, this binding activity may interfere with the sensing of the ligand of interest. Here, I derive analytically the physical limit to the accuracy of concentration sensing amid a large number of interfering ligands. A scaling transition is found when the mean bound time of correct ligands is twice that of incorrect ligands. I discuss how the physical bound can be approached by a cascade of receptor states generalizing kinetic proof-reading schemes.

Because of their small sizes, biological systems typically operate with only a few copies of the molecules they sense and communicate with. In their pioneering work, Berg and Purcell derived the fundamental bound that the noise arising from these small numbers sets on the accuracy of concentration sensing (1). Experimental progress in the characterization of single-cell variability (2) and sensing precision (3) has fueled a renewed interest in small-number noise and its implications for information processing (4); (5); (6). General or refined bounds on sensing accuracy have been recently derived for single receptors (7); (8); (9), and extended to spatial (10); (11); (12); (13); (14) or temporal (15) gradient sensing, while the metabolic cost and trade-offs of sensing accuracy have been explored (16); (17); (18); (19); (20); (21); (22); (23); (24); (25). Much of this past work has assumed perfect specificity between the biological receptors and their cognate ligands. In realistic biological contexts, large numbers of spurious ligands may bind receptors nonspecifically, interfering with the ligand of interest (26). This is the case in the problem of antigen recognition by T-cell receptors, where cells must react to a small number of specific foreign peptides among a large number of nonspecific self-peptides (27). Biochemical network architectures based on kinetic proofreading (28); (29) have been shown to provide a solution to the discrimination problem, and have been studied in depth theoretically (30); (31); (32); (26). However, no fundamental bound has been derived against which to compare the performance of these solutions, save for Ref. (33) where concepts of statistical decision theory were used to derive the minimal decision time to detect cognate ligands. In this paper I derive the fundamental limit on concentration sensing accuracy and ligand detection error in the presence of a large number of spurious ligands. The maximum likelihood estimate achieving the bound can be implemented biologically by simple networks based on push-pull reactions.

Consider a mixture of two ligands, only one of which the biological system wishes to sense. The ligand of interest (hereafter referred to as the correct ligand) is present in concentration , while the interfering or spurious ligand (called the incorrect ligand) is present in concentration . The biological unit can sense ligands through identical receptors, which can be bound by either ligand with a common rate , where is the molecule diffusivity and the effective receptor size. Receptors can distinguish between the two molecules thanks to their higher affinity to the correct ligand. Physically, this means that the unbinding rate of the correct ligand is smaller than that of the incorrect ligand .

The occupancy of each receptor,

 p=kcr−1+kc′r′−11+kcr−1+kc′r′−1, (1)

depends on both concentrations, and cannot be used alone to determine . The interchangeability of the ratios and in this expression emphasizes the ambiguity between many incorrect ligands and a few correct ones. To discern these two effects, one must use the full temporal record of occupancy of each receptor. The probability distribution for the binding and unbinding events at all receptors during a time interval reads:

 P=e−kctotTun∏i=1(kcre−rti+kc′r′e−r′ti), (2)

with is the total concentration of ligands, is the total unbound time accrued over all receptors, and the durations of the binding events occurring at all receptors during . The log-likelihod can be rewritten as a sum of three independent contributions, , where depends on neither or , and where and pertain to the unbound and bound intervals respectively:

 L1(ctot) = nlnctot−kctotTu (3) L2(x) = n∑i=1ln(1−x+xαe(1−α)r′ti), (4)

where is the fraction of correct ligands, and is the binding constant ratio. As can be seen in the respective dependencies of and upon and , unbound intervals are informative of the total concentration, while the bound intervals are informative of the fractions of ligands.

The maximum likelihood estimate for the total concentration is obtained by the condition , which gives . The error made by this estimate is given in the large time limit by the Cramér-Rao bound, which sets the best possible performance of any estimator (34):

 ⟨δc2tot⟩≈−(∂2L1∂c2tot)−1=c2totn≈ctot4Da(1−p)NT, (5)

with . This result is that obtained in (8) for a single ligand, where the maximum-likelihood error was shown to be half as small as the classical Berg and Purcell bound (1) based on the average receptor occupancy. The reason for this difference is that the maximum likelihood estimate is not affected by the noise due to the stochastic nature of receptor unbinding, as evident in Eq. (3). In the case of a mixture, the receptor occupancy Eq. (1) depends on as well as , and does not even suffice to determine the total concentration.

The fraction of correct ligands can be estimated by maximum likelihood as well, by solving:

 ∂L2∂x∣∣∣x∗=n∑i=1αe(1−α)r′ti−11−x∗+x∗αe(1−α)r′ti=0. (6)

The error can be estimated from the Cramér-Rao bound (App. A.1):

 ⟨δx2⟩≈−(∂2L2∂x2)−1≈f(x,α)nwithf(x,α)−1=∫+∞0due−u(αe(1−α)u−1)21−x+xαe(1−α)u. (7)

The total error in the concentration of the correct ligand is then the sum of the (independent) errors in and from Eqs. (5) and (7): .

It is interesting to consider the limit where the correct ligands are rare, , as in the case of immune recognition. Two scaling regimes, illustrated in Fig. 2, are found depending on the value of the ratio between the two binding constants:

 Unknown environment '% (8)

with and , and . Since , the error in reduces to:

 ⟨δc2⟩≈g(α)ctot4Da(1−p)NTα>1/2,⟨δc2⟩≈h(α)cβc1−βtot4Da(1−p)NTα<1/2. (9)

In the hard discrimination regime (), incorrect ligands dominate the error, which is governed by as in Eq. (5). The prefactor diverges at , as expected when the two ligands have the same binding constant and are thus indistinguishable. By contrast, in the easy discrimination regime () the error is governed by a weighted geometric mean between and . In the limit of very small , corresponding to clearly distinguishable ligands, the error is —precisely the error when no interfering ligand is present (8). For , the maximum-likelihood estimate may infer a small . However, the second derivative of the likelihood diverges at for , indicating that the Cramér-Rao bound (7) fails to give a correct estimate of this error, which instead scales anomalously with the number of events: , hence (App. A.2).

In many situations, it is more useful for the system to determine the presence of the correct ligand rather than its precise concentration, as in the recognition of foreign pathogens by immune receptors. This decision can be made optimally (in the Bayesian sense) by comparing the likelihoods of the two competing hypotheses: presence versus absence of the correct ligand at fraction . The presence of the correct ligand is detected when , where is an adjustable parameter controlling the balance between the false-positive and false-negative error rates and . These errors decay exponentially fast with large numbers of binding events, and can be estimated in that limit using a saddle-point approximation (App. B1):

 FP≈exp[nϕ(λ)−λθ]λ√2πn|ϕ′′(λ)|,FN≈λ1−λeθFP, (10)

where , and where satisfies the saddle-point condition . The receiver operating characteristics (ROC) giving the dependency between and can thus be estimated parametrically by varying . This saddle-point approximation is well verified by numerical simulations (Fig. 3).

As in the case of concentration sensing error, a scaling transition is found in the limit of scarce correct ligands, . When , one obtains

 FP≈e−12λ2nx2/g(α)√2πλ2nx2/g(α), FN≈e−12(1−λ)2nx2/g(α)√2π(1−λ)2nx2/g(α), (11)

while when both error rates decay as , with , and a function of and (App. B.2). The time necessary to make a reliable decision scales as for , and as for . Equivalent scaling laws were obtained in (33) for minimal on-the-fly detection times.

Can biological systems approach the physical bound on concentration sensing given by Eq. (7)? To gain insight into this question, one can expand Eq. (6) at first order in to get an approximation to the maximum likelihood estimate when (for this expansion gives quantities with diverging means and cannot be used):

 x∗≈2α−1(1−α)21nn∑i=1(αe(1−α)r′ti−1). (12)

This estimator, which is subject to the same asymptotic error as in Eq. (8), suggests a simple strategy, where each receptor signals “positively” with a rate that depends on how long it has been bound, , and “negatively” (i.e. with an opposite effect on the readout, see below) through a fixed burst upon binding, so that the net effect of each binding event on the readout molecule concentration is

 ∫ti0dt[α(1−α)r′e(1−α)r′t]+α−1=αe(1−α)r′ti−1, (13)

i.e. exactly the argument of the sum in Eq. (12).

This idea can be implemented biologically by a cascade of receptor conformational states triggered by binding, and proceeding irreversibly from states to , each transition to the next state occurring with rate (Fig. 4). The ligand is free to detach from the receptor at any time, bringing the receptor back to the unbound state . The receptors signal through the production or activation of two molecules and with opposite effects on a push-pull network governing the state of a molecule , which provides the final readout for through its modified state . If one requires that the equilibration of and are fast, and that and are always in excess in the Michaelis-Mentens reactions, then

 dX∗dt=X0N∑j=1(bμ(j)−dμ(j)),X0=const, (14)

where is the state of the receptor, and . For the purpose of this discussion, the internal molecules , and are assumed to be unaffected by biochemical noise, restricting the source of noise to the input alone. In this design increases indefinitely to mimick the sum in Eq. (12) over all events at all receptors. A more a realistic but equivalent scheme would involve a running sum over an effective time , obtained by relaxing to with rate (15).

When the number of states is large and the transitions between them are rapid, can track Eq. (12) with arbitrary precision when . In that case, the receptor state provides an approximation to the time since binding, . Then, for example, receptors signaling positively with rate , and negatively with rate (with an adjustable parameter) would exactly realize Eq. (13) and thus the estimator of Eq. (12) in the limit .

Although such optimal performance is only reached for large and , this network design may still perform well in more general situations. One can optimize the expected error produced by this network over the net signaling rates , with the constraint that the mean effect of binding incorrect ligands on be zero, so that on average (App. C). Fig. (5) shows how the performance of such optimized networks approaches the theoretical bound as the number of states increases. The convergence is significantly worse for at small . In that regime, the estimator of (12) is not valid, suggesting that this network design may not achieve the optimal bound even with an infinite number of states. The output of these networks can also be used to detect ligands. Their performance in doing so is compared to the optimal discrimination errors of Eq. (10) in Fig. (3).

The principle of maximum likelihood not only yields the fundamental bound on the accuracy of discerning cognate ligands from spurious ones, but also suggests biochemical solutions to approach this optimal bound. Such maximum-likelihood inspired designs have been previously proposed in the case of a single ligand (15); (19). The network structure proposed in this study (Fig. 4) is reminiscent of kinetic proofreading schemes and their generalizations, which provide a well-known solution to the ligand discrimination problem (28); (29); (35); (32); (26). An important difference is that here signaling occurs during all steps, albeit at various, fine-tuned rates, and with potentially negative contributions, the role of which is to buffer the effect of wrong ligands. Consistent with this prediction, it was shown that a negative interaction through a diffusible molecule between kinetic-proofreading receptors could mitigate the effects of large numbers of incorrect ligands in a discrimination task (26).

The present results are relevant beyond the particular case of sensing by receptors, and apply to any kind of biochemical signaling in presence of competing ligands or “cross-talk.” This is the case for example in the context of gene regulation, where competing transcription factors may bind regulatory sites unspecifically, a problem particularly acute in metazoans (36).

The scaling transition occurring at the binding constant ratio suggests that different strategies should be employed depending on how hard the discrimination task is. In particular, the approximate but biologically implementable estimator of Eq. (12) curiously breaks down in the easy discrimation regime, . In that regime, the optimal bound is harder to achieve because it is dominated by rare, long binding events that are hard to encode by biochemical solutions. The example of immune recognition falls precisely into that regime, with a binding constant ratio between agonist and nonagonist ligands ranging from one fifth to one third (27). More elaborate network designs, probably with feedback, may be needed to achieve the theoretical bound Eq. (7) in that case. Finally, this study has assumed throughout that the unbinding rates and are priorly known to the system. Complex mixtures of ligands with unknown binding constants would make for interesting generalizations.

I thank A. Walczak for her helpful comments on the manuscript. While this article was under review, a paper treating a similar topic was submitted to the arXiv (37).

## Appendix A Cramér-Rao bound

### a.1 The Cramér-Rao bound is tight: a physicist’s proof

In general the Cramér-Rao bound is a lower bound on the error made by any unbiased estimator, but it is not always certain whether this bound can be achieved. Here the maximum likelihood estimate is shown to approach the Cramér-Rao bound in the limit of large samples.

Assume that the likelihood of the data factorizes over independent datapoints,

 L=lnP=n∑i=1ℓ(x,ti), (15)

where is the model parameter to be estimated, and the series of datapoints. In the specific case of receptors binding to two types of ligands, is the fraction of correct ligands, the duration of binding event , and

 ℓ(x,ti)=lnr′−r′ti+ln(1−x+xαe(1−α)r′ti). (16)

The derivative of with respect to is denoted by . The maximum likelihood estimate satisfies:

 n∑i=1ℓ′(x∗,ti)=0. (17)

This estimator is unbiased: if denotes the true parameter with which the data was generated, then should give back on average. Equivalently,

 ⟨∂L(~x)∂x⟩~x=n∑i=1⟨ℓ′(~x,ti)⟩~x=n∫+∞0dteℓ(~x,t)ℓ′(~x,ti)=n∂∂x∫+∞0dteℓ(x,t)∣∣∣~x=0, (18)

(the last integral is just 1 because of normalization), where denote averages over data generated with the true parameter . In other words, the maximum of is reached at on average. The probability that this maximum be larger than a certain value is:

 P(x∗>x)=P(∂L(x)∂x>0)=⟨Θ(∂L∂x)⟩~x. (19)

The Heaviside function can be replaced by its Fourier representation:

 Θ(x)=∫+∞−∞dω2πωeiωx=∫+i∞−i∞dλ2πiλeλx, (20)

allowing for factorization over datapoints:

 P(x∗>x)=∫+i∞−i∞dλ2πiλn∏i=1∫+∞0dtieℓ(~x,ti)+λℓ′(x,ti)=∫+i∞−i∞dλ2πiλexp[nln∫+∞0dteℓ(~x,t)+λℓ′(x,t)]. (21)

and

 P(x)=−dP(x∗>x)dx==−∫+i∞−i∞dλ2πi[∫+∞0dtℓ′′(x,t)eℓ(~x,t)+λℓ′(x,t)]×exp[(n−1)ln∫+∞0dteℓ(~x,t)+λℓ′(x,t)]. (22)

This integral can be evaluated by a saddle-point approximation in the large limit:

 ∫dλG(λ)enF(λ)≈G(λ∗)√2πn|F′′(λ∗)|enF(λ∗), (23)

with

 G(λ) =∫+∞0dtℓ′′(x,t)eℓ(~x,t)+λℓ′(x,t)∫+∞0dteℓ(~x,t)+λℓ′(x,t), (24) F(λ) =ln∫+∞0dteℓ(~x,t)+λℓ′(x,t). (25)

The saddle is given by the condition that the derivative of the argument of the exponential with respect to be zero:

 ∫+∞0dtℓ′(x,t)eℓ(~x,t)+λℓ′(x,t)=0. (26)

At , this condition is satisfied for . In the limit of large samples, is small and so should the corresponding . One can expand at small and :

 ℓ′(x,t)≈ℓ′(~x,t)+(x−~x)ℓ′′(~x,t), (27)

and

 ∫+∞0dt[(x−~x)ℓ′′(~x,t)+λℓ′(~x,t)2]eℓ(~x,t)=0, (28)

yielding and:

 P(x)≈√nH√2πexp[−n2(x−~x)2H], (29)

with

 H=−∫+∞0dtℓ′′(~x,t)eℓ(~x,t)=∫+∞0dtℓ′(~x,t)2eℓ(~x,t). (30)

A symmetric argument gives the same result for . The resulting distribution of is Gaussian, with mean and variance

 Missing or unrecognized delimiter for \left (31)

In the specific case of Eq. (16),

 H=∫+∞0dtr′e−r′t(αe(1−α)r′t−1)21+x(αe(1−α)r′t−1). (32)

Performing the change of variable yields the result of the main text:

 ⟨δx2⟩≈f(x,α)nwithf(x,α)−1=∫+∞0due−u(αe(1−α)u−1)21−x+xαe(1−α)u. (33)

### a.2 Small x limit

For and small , the integral in Eq. (33) can be approximated by:

 f(x,α)−1≈∫+∞0due−u(αe(1−α)u−1)2=(1−α)22α−1. (34)

For , the function is not integrable, and the denominator of Eq. (33) is necessary to ensure integrability at large , however small is. Thus, the values of governing the behavior of the integral satisfy . This observation suggests the change of variable :

 f(x,α)−1=x−βα11−α11−α∫+∞αxdyy−2−α1−α(y−x)21+y−x, (35)

with . Expanding gives three terms scaling as , and at small , respectively. The last two give diverging integrals as for all , yielding terms of order 1. Only when does the first term give a diverging integral, and thus a term of order 1 in ; in that case, the sum of all three terms gives back the result of Eq. (34). If however, the first term is integrable and thus dominates the expression for , yielding:

 f(x,α)−1=x−βα11−α11−α∫+∞0dyy−α1−α1+y+O(1), (36)

where denotes a term of order 1 at small . The integral can be calculated:

 ∫+∞0dyy−α1−α1+y=πsin(πα1−α), (37)

to finally obtain:

 f(x,α)−1≈x−βα11−α11−απsin(πα1−α). (38)

In the intermediate case , the three terms in the integral of Eq. (35) are of order , and . Again the last two terms diverge in the integral and give contributions of order 1. The first term also diverges, but its contribution reads:

 α11−α11−α(−ln(αx))=12|ln(x/2)|, (39)

so that:

 f(x,α)−1=12|ln(x/2)|+O(1). (40)

When and , , as the large deviation function of becomes nonanalytic. The expansion of in Eq. (27) is no longer integrable when done around , and needs revisiting. The integral in Eq. (21) reads:

 ∫+∞0due−uexp[λ(αe(1−α)u−1)1+x(αe(1−α)u−1)]=1+∫+∞0due−u{exp[λ(αe(1−α)u−1)1+x(αe(1−α)u−1)]−1−λ(αe(1−α)u−1)} (41)

where the same change of variable has been done. Doing a further change of variable to yields:

 1+(αx)11−α1−α∫+∞αxy−2−α1−α{exp[(λ/x)(y−x)1+y−x]−1−(λ/x)(y−x)} (42)

The term is the brackets is of order , as was the case in Eq. (33). Hence, terms in are integrable and dominate the expression, which becomes at leading order in :

 1+(αx)11−α1−α∫+∞0y−2−α1−α{exp[(λ/x)y1+y]−1−(λ/x)y} (43)

With , the saddle-point condition becomes:

 ψ′(~λ,α)=0, (44)

with

 ψ(~λ,α)=−α11−α1−α∫+∞0y−2−α1−α(e~λy1+y−1−~λy) (45)

and the cumulative probability distribution is:

 P(x∗>x)≈1√2πnx11−αψ′′(~λ,α)~λexp[−nx11−αψ(~λ,α)], (46)

where and . Fluctuation of are thus of order .

When , the term of order in the brackets of Eq. (42) dominates and diverges, so that this expression reduces at leading order to:

 1+x22|ln(x/2)|(~λ22−~λ). (47)

The saddle point condition gives and one obtains:

 P(x∗>x)≈1√πnx2|ln(x/2)|exp[−nx2|ln(x/2)|4], (48)

which implies fluctuations of order .

## Appendix B Probability of discrimination error

### b.1 General case

The discrimination between two competing hypotheses—presence versus absence of the correct ligand in fraction —can be performed by a likelihood ratio test:

 lnP(t1,…,tn|x)P(t1,…,tn|x=0)=n∑i=1[ℓ(x,ti)−ℓ(0,t)]>θ, (49)

where is an adjustable parameter. The false-positive and false-negative error rates are defined as the probabilities of detecting the presence of a ligand that is in fact absent, and of missing it where it is there:

 FP=∫n∏i=1[dtieℓ(0,ti)]Θ(n∑i=1[ℓ(x,ti)−ℓ(0,t)]−θ),FN=∫n∏i=1[dtieℓ(x,ti)]Θ(n∑i=1[ℓ(0,ti)−ℓ(x,t)]+θ). (50)

The integral representation of the Heaviside function, Eq. (20) can be used again to obtain:

 FP=∫+i∞−i∞dλ2πiλe−λθ[∫dte(1−λ)ℓ(0,t)+λℓ(x,t)]n,FN=∫+i∞−i∞dλ2πiλeλθ[∫dte(1−λ)ℓ(x,t)+λℓ(0,t)]n. (51)

Substituting in the second equation gives an expression for that looks very similar to :

 ∫dλ2πi(1−λ)e(1−λ)θ[∫dte(1−λ)ℓ(0,t)+λ.ℓ(x,t)]n. (52)

In summary:

 FP=∫+i∞−i∞dλ2πiλenϕ(λ)−λθFN=∫+i∞−i∞dλ2πi(1−λ)enϕ(λ)+(1−λ)θ, (53)

with:

 ϕ(λ)=ln∫dte(1−λ)ℓ(0,t)+λℓ(x,t)=ln∫due−u[1+x(αe(1−α)u−1)]λ. (54)

These two expressions can be evaluated in the large limit using a saddle-point approximation, with the same saddle-point condition for both, yielding:

 FP≈1λ√2πn|ϕ′′(λ)|exp[nϕ(λ)−λθ],FN≈1(1−λ)√2πn|ϕ′′(λ)|exp[nϕ(λ)+(1−λ)θ]. (55)

### b.2 Small x limit

Again two regimes emerge in the limit, depending on whether is smaller or greater than . When , can be expand at small :

 ϕ(λ)≈−12(1−α)22α−1λ(1−λ)x2. (56)

This implies:

 FP≈e−12λ2nx2/g(α)√2πλ2nx2/g(α),FN≈e−12(1−λ)2nx2/g(α)√2π(1−λ)2nx2/g(α), (57)

with

 g(α)=2α−1(1−α)2. (58)

When , one can do the same change of variable as before, , to obtain at leading order:

 ϕ(λ)=(αx)11−α1−α∫+∞0dyy−2−α1−α[(1+y)λ−1−λy]. (59)

The integrand is of order at small , and therefore is integrable. The error rates are then given by:

 FP≈exp[nx11−α(χ(λ)−λχ′(λ))]λ√2πnx11−α|χ′′(λ)|FN≈exp[nx11−α(χ(λ)+(1−λ)χ′(λ))](1−λ)√2πnx11−α|χ′′(λ)|. (60)

where

 χ(λ)=α11−α1−α∫+∞0dyy−2−α1−α[(1+y)λ−1−λy]. (61)

The intermediate case is treated similarly as before, by noting that the integral defining is dominated by the (diverging) term of order . This gives:

 ϕ(λ)≈−x24|ln(x/2)|λ(1−λ). (62)

and therefore:

 FP≈e−14λ2nx2|ln(x/2)|√πλ2nx2|ln(x/2)|,FN≈e−14(1−λ)2nx2|ln(x/2)|√π(1−λ)2nx2|ln(x/2)|. (63)

As a result, the number of binding events necessary to a make reliable decision scales as for ,