Joint Channel Parameter Estimation via Diffusive Molecular Communication

# Joint Channel Parameter Estimation via Diffusive Molecular Communication

## Abstract

The design and analysis of diffusive molecular communication systems generally requires knowledge of the environment’s physical and chemical properties. Furthermore, prospective applications might rely on the timely detection of changes in the local system parameters. This paper studies the local estimation of channel parameters for diffusive molecular communication when a transmitter releases molecules that are observed by a receiver. The Fisher information matrix of the joint parameter estimation problem is derived so that the Cramer-Rao lower bound on the variance of locally unbiased estimation can be found. The joint estimation problem can be reduced to the estimation of any subset of the channel parameters. Maximum likelihood estimation leads to closed-form solutions for some single-parameter estimation problems and can otherwise be determined numerically. Peak-based estimators are proposed for low-complexity estimation of a single unknown parameter.

{IEEEkeywords}

Cramer-Rao lower bound, diffusion, maximum likelihood estimation, molecular communication, parameter estimation.

## 1 Introduction

\PARstart

Molecular communication (MC) is the transmission of information where molecules are used as information carriers. MC is ubiquitous in biological systems. For example, endocrine signaling is the release of hormone molecules that propagate via the bloodstream, paracrine signaling is the release of molecules into extracellular fluid that are detected by local cells, and molecules are also released in the synapses between neurons to relay signals between them; for more details, see [2, Ch. 16]. Despite their widespread use, MC systems in nature are typically designed for the transmission of limited quantities of information, e.g., a message that is a time-varying ON/OFF control signal for a biological process. Recent advances in nanotechnology have motivated interest in synthetic communication networks where the principles of MC are used to deliver arbitrary amounts of information in environments where the deployment of traditional wireless communication networks is unsafe or infeasible. These networks could advance applications in a diverse number of fields, including biological engineering, healthcare, and manufacturing; see [3].

Diffusion-based MC relies on the random motion of information molecules due to collisions with other molecules in the propagation environment. When molecules are released into a diffusive environment, they can be transported from a transmitter to its corresponding receiver without any additional infrastructure or external energy. However, this is an imperfect process that can be best described by an expected channel impulse response, i.e., the number of molecules expected at a receiver when molecules are released at some instant by a transmitter. The expected channel impulse response is a function of the parameters of the diffusive environment, including its geometry, the distance from the transmitter to the receiver, the diffusion coefficient of the molecules, and the time elapsed since the molecules were released. Other phenomena can also impact the status of the diffusing molecules and hence the channel impulse response. These phenomena include chemical reactions that have the molecules of interest as a product or reactant, other sources of those molecules that are not the intended transmitter, and whether there is any bulk fluid flow.

Given that the channel impulse response depends on the environmental parameters, the response can be used as a local noisy observation to estimate the values of those parameters. This is especially true when the expected impulse response can be written in closed form (although simplifying assumptions are generally needed to obtain a closed-form expression). By observing the arrival of molecules from a transmitter, an intelligent receiver might learn about the current local conditions of the propagation environment, which is essential for some prospective MC applications.

For example, consider a healthcare application where a network of microscale sensors are deployed to monitor a patient’s bloodstream. The sensors might need to be mounted at regular intervals along the blood vessel walls, such that they need to estimate the distance separating themselves before mounting. By monitoring the remaining individual channel parameters, changes could be detected and the cause of the change might be inferred. The blood flow velocity could be a proxy for blood pressure. The diffusion coefficient could be a proxy for blood composition and used to identify major changes in blood cell counts, as described in [4]. The chemical kinetics of the information molecules could be a proxy for blood pH; chemical reactivity varies with pH, as discussed in [5, Ch. 10]. In summary, knowledge of the individual channel parameters can be more insightful than knowledge of the expected channel impulse response alone, with the caveat that estimating individual parameters is only feasible if an expression for the channel impulse response as a function of the parameters is available.

We note that there are also macroscale estimation methods that are used to measure channel parameters. For example, there are various experimental methods to measure fluid diffusion coefficients, such as diaphragm cells and Taylor dispersion; see [6, Ch. 5]. However, these methods are appropriate for laboratory environments and might not be suitable for on-going measurements in confined settings where the deployment of an MC system might be less invasive.

In this paper, we study local joint channel parameter estimation in a diffusive MC environment, where in the most general case we assume that we know the form of the expression for the expected channel impulse response but we assume that we know none of the individual parameter values. Specifically, we consider the system model that we studied in [7], where a fixed receiver in an unbounded 3-dimensional environment observes molecules released by a fixed impulsive source. The molecules experience steady uniform flow and can probabilistically degrade. We ignore the presence of other molecule sources. For tractability, the receiver is a passive observer that can perfectly count the number of information molecules within its volume at a given instant. Each count is an observation and one or multiple observations are used to estimate the parameters.

We study an ideal model for two reasons. First, to the best of our knowledge it is the most detailed diffusive MC model for which a closed-form time domain expression of the channel impulse response is available. Second, the characteristics of this model approximate special cases of more realistic environments. For example, an environment that is sufficiently large relative to the distance between communicating devices can be assumed to be infinitely large. Also, an impulsive point source is sufficient to approximate a larger source that releases molecules sufficiently fast relative to the time required for molecules to reach the receiver via diffusion and flow (in fact, we use a non-point source in our simulations with negligible impact on estimation performance). In summary, estimator performance within this ideal model can serve as a bound or benchmark for performance in more realistic environments.

Existing literature on parameter estimation via diffusive MC has been limited to one unknown parameter. The distance between devices has been estimated in [1, 8, 9, 10, 11], whereas the time of transmitter release (i.e., synchronization) has been estimated in [12, 13]. With the exception of our preliminary work in [1], parameter estimation has only been considered in environments with diffusion alone and not with fluid flow or molecule degradation.

In our model, when the transmitter releases an impulse of molecules, the unknown parameters are the time that the molecules are released, the number of molecules released, the distance to the receiver, the diffusion coefficient, the fluid flow vector, and the molecule degradation rate. We are interested in determining the best possible performance of the (classical1) joint estimation of all of these parameters, as a function of the observations made by the receiver. We aim to provide bounds on the performance of any estimation protocol. We do not claim that estimating all parameters simultaneously is practical. Rather, our analysis easily simplifies to the estimation of any subset of the channel parameters. For example, our analysis of distance estimation in [1] is a special case of the complete analysis that we present here. Furthermore, we gain insight into how the knowledge of any one parameter decreases the error in estimating any of the other parameters. The primary contributions of this paper are summarized as follows:

1. We derive the Fisher Information Matrix (FIM) of our joint parameter estimation problem to give the Cramer-Rao lower bound (CRLB) on the variance of estimation error of any locally unbiased estimator as a function of independent observations of a transmitted impulse. Bounds on the unbiased estimation of any subset of the channel parameters can be found by considering only the corresponding elements of the FIM (e.g., if only estimating the distance, as we did in [1], then only 1 of the 28 unique terms in the FIM is needed).

2. We study maximum likelihood (ML) estimation of our joint parameter estimation problem. Closed-form solutions exist for some single-parameter estimation problems with one observation, as we showed for distance estimation in [1]. Otherwise, ML estimates can be determined numerically, via either the Newton-Raphson method or an exhaustive search.

3. We consider the presence of singularities in the FIM, in which case any unbiased estimator will have infinite variance. Dealing with singularities is an open problem in the parameter estimation literature, cf. e.g. [15, 16, 17, 18]. Singularities in the FIM, or being in the “vicinity” of a singularity, can have an impact when estimating one parameter or multiple parameters simultaneously.

4. We propose peak-based estimators for low-complexity estimation of a single parameter. Variants of peak-based distance estimators were originally presented in [9, 10]. We present a comprehensive discussion of how the peak molecule observation and/or the time of the peak number of observed molecules can be used to estimate any single parameter, given knowledge of the other parameters.

We note that we focus on parameter estimation when there is only one device releasing molecules, i.e., the transmitter, and they are observed by the receiver. We coined the term one-way protocols in [1] to refer to estimation protocols using this approach, and to distinguish them from two-way protocols (such as those proposed for distance estimation in [8, 9, 11]), which rely on feedback from the receiver back to the transmitter so that the transmitter makes the estimate. In general, two-way protocols can be no more accurate than one-way protocols, because two-way methods require the subsequent detection of two molecule impulses.

The rest of this paper is organized as follows. In Section 2, we describe the physical environment, review the expected channel impulse response, and review the CRLB and ML estimation. We derive the FIM of the joint estimation problem, from which the CRLB can be found, in Section 3. In Section 4, we apply examples of ML estimation to the joint estimation problem and present the peak-based estimation protocols. We present numerical and simulation results in Section 5. Conclusions are drawn in Section 6.

## 2 System Model and Estimation Preliminaries

In this section, we describe the diffusive environment and the expected channel impulse response. We review the definition of the CRLB for vector parameter estimation. We also review ML estimation and the Newton-Raphson method for numerical evaluation of the ML estimate.

### 2.1 Physical Environment

We consider a 3-dimensional fluid environment as shown in Fig. 1. The environment is unbounded and with uniform temperature and viscosity. There are two fixed devices, which we label the transmitter (TX) and the receiver (RX) because we focus on one-way parameter estimation. The TX is a point that is distance from the center of the RX. The RX is a sphere of radius and volume . The coordinate axes are defined by placing the center of the RX at the origin and the TX at Cartesian coordinates . As we noted in [7], concentrations observed in this environment are equivalent by a factor of two to those in the semi-infinite case where the -plane is an elastic boundary and the RX is a hemisphere; see [19, Eq. (2.7)]. There is a steady uniform flow with components and . is the component of in the direction of a line pointing from the TX towards the RX, and is the component of perpendicular to (the precise direction of is irrelevant due to symmetry). We note that uniform flows are the simplest analytically but do not generally describe the flow in cylindrical environments such as blood vessels, where flows are described as laminar (where successive layers of fluid slide over one another without mixing) or turbulent (where fluid motion is even more chaotic than under diffusion alone), depending on the relative importance of inertial and viscous forces; see [2, Ch. 2].

The TX is a source of molecules, labeled molecules, that can be detected by the RX. The molecules independently diffuse with constant diffusion coefficient , and they can degrade anywhere in the propagation environment via a first-order chemical reaction that can be written as

 Ak→∅, (1)

where is the first-order reaction rate constant in . We do not specify the product of reaction (1), except to say that it is not recognizable by the RX. We ignore the reaction kinetics of the reception process at the RX for tractability (recently, the time domain channel impulse response for a similar but simpler system model was derived in [20]). Instead, the RX is a passive observer that can perfectly count the number of molecules within its volume at any desired time, i.e., the molecules are observed without being bound or consumed.

Given our system model, we can write the expected channel impulse response. We assume that the concentration of molecules expected at the RX due to a release of molecules by the TX is uniform throughout the RX and equal to that expected at the center of the RX. We previously studied the accuracy of this assumption in environments with molecule degradation in [21] and in flowing environments in [22]. In both works, we showed that this assumption is accurate for an RX that is sufficiently small relative to its distance from the TX. Using this assumption, if the TX instantaneously releases molecules at time , then the number of those molecules expected to be observed by the RX at time , , is given by [7, Eq. (12)]

 ¯¯¯¯¯¯¯¯¯¯¯¯NAob(t)= NVRX(4πD(t−t0))3/2 ×exp⎛⎝−k(t−t0)−|→ref|24D(t−t0)⎞⎠, (2)

where is the square of the effective distance from the TX to the RX. For compactness, we define as the elapsed time since the molecules were released, i.e., . We note that (2.1) can be derived as an extension of [23, Eq. (107)] by modifying the underlying differential equation to account for flow and molecule degradation, as we did in [7] and [24], respectively. The actual number of molecules observed by the RX is , and the time-varying mean of is given by (2.1). The only variable in (2.1) that we always assume is known to the RX is its volume . We summarize the remaining channel parameters in Table 1, and we assume that some subset of those parameters are unknown and must be estimated.

### 2.2 The Cramer-Rao Lower Bound

The Cramer-Rao lower bound is a bound on the error variance of any (locally) unbiased estimator; biased estimators, or estimators that are locally biased, can outperform the CRLB. Here, we review the definition of the CRLB for a vector parameter as described in [14, Ch. 3]. The definition easily simplifies in the case of a single unknown parameter.

Assume that we have a vector of observations and a vector of unknown parameters , where is vector transpose. Assume that we know the conditional probability density function (PDF) of the observations, . Under standard regularity conditions (see [25, Ch. 1.7]), and by [14, Th. 3.2], the covariance matrix of any unbiased estimator for , , satisfies

 C^θ−I−1(θ)≥0, (3)

where means that the matrix is positive semi-definite. An estimator is unbiased if . The elements of the Fisher information matrix are given by

 [I(θ)]θi,θj=−E[∂2lnp(s|θ)∂θi∂θj], (4)

where is the expectation taken with respect to , and the derivatives are evaluated at the true value of . For a positive semi-definite matrix, the diagonal elements are non-negative. Thus, from (3) we have

 [C^θ−I−1(θ)]θi,θi≥0, (5)

and

 var(^θi)=[C^θ]θi,θi≥[I−1(θ)]θi,θi, (6)

where is defined as the variance of the estimation error of parameter , i.e.,

 var(^θi)=E[(^θi−E[^θi])2]. (7)

Thus, the CRLB on the error variance of the th parameter, when all parameters are jointly estimated by an unbiased estimator, is given by the th diagonal element of the inverse of the FIM. The elements of the FIM are found using (4).

### 2.3 Maximum Likelihood Estimation

ML estimation is known as a “turn-the-crank” procedure because it can be procedurally implemented for many estimation problems where the observation PDF is known; see [14, Ch. 7] and examples of exceptions in [26, Ch. 6]. It is generally accepted that, in most cases, ML estimation is asymptotically efficient in the sense of the CRLB as the number of observations grows large, i.e., as ; see [26, Ch. 6]. However, we cannot make any general claims about the bias or the relative performance of ML estimation for a finite number of observations.

The ML estimate of vector parameter is given as follows:

 ^θ∣∣ML=\operatornamewithlimitsargmaxθp(s|θ), (8)

i.e., the ML estimate of is the vector that maximizes the observation PDF, given the observation vector . We will find that there are special cases, particularly if there is one observation and one unknown parameter, where we can write the ML estimate in closed form. In general, it can be found numerically. For example, we can consider the Newton-Raphson method to avoid performing an exhaustive search (the latter becomes computationally cumbersome when there are multiple unknown parameters). The Newton-Raphson method begins with an initial estimate . The th estimate is found iteratively as [14, Eq. (7.48)]

 ^θn+1=^θn−[∂2lnp(s|θ)∂θ∂θT]−1∂lnp(s|θ)∂θ∣∣∣θ=^θn, (9)

where

 [∂2lnp(s|θ)∂θ∂θT]i,j=∂2lnp(s|θ)∂θi∂θj∀i,j∈{1,…,L}. (10)

The convenience in implementing the Newton-Raphson method is that the expressions for the derivatives required in (9) and (10) can be found while deriving the elements of the FIM in (4). We will provide examples of this procedure in Section 4.1. We must also recognize the limitations of the Newton-Raphson method, as discussed in [14, Ch. 7]. The method is not guaranteed to converge, or it might converge to a local maximum. The method can quickly diverge if the current estimate results in an FIM that is close to singular. Generally, the ML estimate will be found if the initial estimate is close to the ML estimate and not in the “vicinity” of singularities (we discuss the meaning of being in the vicinity of a singularity in the FIM in further detail in Section 4.1).

## 3 Joint Parameter Estimation Performance

In this section, we first derive the FIM of the joint parameter estimation problem in diffusive MC with steady uniform flow and first-order molecule degradation. Then, we present simple examples of how to use the FIM to find the CRLB (following the methodology in Section 2.2) and comment on situations where the FIM is singular, i.e., where the CRLB does not exist.

### 3.1 Main Result

To derive the FIM, we first need the joint observation PDF for our problem. The TX makes a single release of molecules at time . Our observations are the discrete number of molecules found within at the sampling times, i.e., , where the th observation is made at time . We assume that the time between successive observations is sufficient for each observation to be independent (we discussed the independence of observations in detail in [7]). We will also assume that the individual observations, which are Binomially distributed, can be approximated as Poisson random variables whose means are the expected values of the observations at the corresponding times (this has been shown to be highly accurate in our previous work, including [7, 24], although the Gaussian approximation can become more accurate as it becomes more likely to observe any individual molecule). Thus, the joint PDF is [1, Eq. (9)]

 p(s|θ)=M∏m=1¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)smexp(−¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm))/sm!, (11)

where is as given by (2.1). The logarithm of the joint PDF is

 lnp(s|θ)=M∑m=1[smln¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)−lnsm!−¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)]. (12)

We summarize the channel parameters that we wish to estimate in Table 1. From (12) and (2.1), the FIM can be found. We present the final result in the following theorem:

###### Theorem 1 (FIM of the Joint Estimation Problem)

The elements of the Fisher information matrix for the joint parameter estimation problem are of the form

 [I(θ)]θi,θj=M∑m=1GθiGθj¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm), (13)

where is a unique term for parameter and we note that the ordering of the elements in is arbitrary. The terms for the channel parameters are as follows:

 Gd= 12D(v∥−dtef), (14) Gt0= ⎛⎝32tef+k+v2∥+v2⊥4D−d24Dt2ef⎞⎠, (15) GD= 12D[12D(d2t%ef−2dv∥+tef(v2∥+v2⊥))−3], (16) Gk= −tef, (17) Gv∥= 12D(d−v∥tef), (18) Gv⊥= −v⊥tef2D, (19) GN= 1N, (20)

where here . The diagonal elements of the FIM are presented in Table 1, such that is the diagonal element associated with parameter . The 21 unique off-diagonal elements can be analogously found from (13).

{IEEEproof}

The proof is straightforward by applying the properties of logarithms and exponentials and the rules of differentiation2 to (12) and (2.1), and by noting that (by definition) . It can be shown that the terms come from the derivative of the logarithm of the joint PDF in (12) with respect to , i.e.,

 ∂lnp(s|θ)∂θi=M∑m=1Gθi(sm−¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)). (21)

### 3.2 Examples of the CRLB

The size of the FIM for a specific problem depends on the number of unknown channel parameters, i.e., given that there are unknown parameters, the FIM will be an matrix. The size of the FIM does not depend on the number of parameters that we want to estimate. If we want to estimate parameters, then we should have . Here, we present two basic examples of using the FIM to find the CRLB. We consider estimating the distance , then jointly estimating and the molecule release time , because the distance between any pair of devices in the same environment can be unique, and every device can have its own internal synchronization. Thus, and are arguably the most critical parameters when establishing a communication link between a pair of devices.

The simplest scenario is the estimation of a single parameter when we assume that all other parameters are known. We studied this case for distance estimation in [1]. By (6), we see that we only need to invert the corresponding entry in Table 1, and we can write the lower bound on the variance of any unbiased distance estimator as [1, Th. 1]

 var(^d)≥4D2∑Mm=1(v∥−dtef)2¯NAob(tm). (22)

Similarly, the FIM for any one unknown parameter has a single element and it can be easily inverted to find the CRLB. As we discussed in [1], equations for the CRLB give us insight into the factors that affect the accuracy of an estimate. For example, from (22) we see that a more accurate estimate might be possible if more samples are taken, i.e., by increasing . The same observation can be made for the estimation of any single parameter via inspection of the diagonal elements of the FIM in Table 1. The impact of some parameters, such as on the estimation of , or on the estimation of the degradation rate, are not immediately clear because the parameters are both inside and outside the term in the corresponding FIM element. However, we can see from Table 1 that increasing the number of molecules will also increase the bound on the variance of estimation of , because is only scaled by a factor of .

For , we must perform a matrix inversion to obtain the CRLB. Consider where . The structure of the FIM is then

 I(θ)=[[l][I(θ)]d[I(θ)]d,t0[I(θ)]d,t0[I(θ)]t0], (23)

where , are from Table 1, and can be evaluated from (13) using (14) and (15). The inversion of (23) is straightforward. For brevity, we omit writing the inversion out in full, but we have two comments regarding its use. First, we did not need to specify which parameter(s) we are trying to estimate, i.e., the FIM in (23) applies to estimating or or both, given that both are unknown. Second, it can be shown that the CRLB for either parameter cannot be smaller than if that parameter were the only unknown parameter. These two comments apply to any joint estimation problem (see [14, Ch. 3]); the FIM depends on the unknown parameters and not the parameters being actively estimated, and the CRLB never decreases when more parameters become unknown. We show more examples of these observations when we present our numerical results in Section 5.

### 3.3 On the Nonexistence of the CRLB

Our analysis and discussion of the CRLB would be incomplete if we did not consider the occasions when the CRLB does not exist. By inspection of (13) when there is a single observation, i.e., , we can see that singularities arise when , such that inversion of the FIM is not possible and so the CRLB cannot be found (we do not consider the case where , because we would not expect any meaningful communication if no molecules are expected at the RX). It has been shown in [15, 16] that if the FIM is singular, then there is no unbiased estimator for with finite variance. Furthermore, we must also consider the conditioning of the FIM. The terms associated with different parameters can vary by many orders of magnitude, such that the FIM can be nearly singular.

## 4 Estimation Protocols

In this section, we describe the implementation of estimation protocols for the channel parameter estimation problem. First, we apply examples of ML estimation, as defined in Section 2.3. We consider cases where the ML estimate can be written in analytical closed form. We also consider examples of applying the Newton-Raphson method to find the ML estimate numerically, and comment on comparing ML estimates with the CRLB when the FIM is singular or nearly singular. Then, we propose peak-based estimation protocols as low-complexity methods for finding any one unknown channel parameter.

### 4.1 ML Estimation

#### Analytical ML Estimation

We can try to search for ML estimates analytically by taking the derivative of the logarithm of the joint PDF with respect to the parameter of interest, i.e., (21), and setting it equal to 0. If , i.e., if there is more than one unknown parameter, then we will have to solve a system of equations (each in the form of (21)) to find the critical points that are candidates for the ML estimate. For tractability, we limit our discussion of analytical solutions to the special case of and , and rely on numerical methods for the ML estimation of more than one parameter and/or observation. Furthermore, for ML estimation when and , we use an approach that is more direct than taking the derivative of the logarithm of the joint PDF.

Consider the direct estimation of the expected channel impulse response at time , . It is straightforward to show that the ML estimate of is just the observation at time , i.e., . Then, by the invariance property of ML estimation (see [26, Ch. 3]), the ML estimate of any single parameter in (2.1) can be found by setting in (2.1), substituting with , and re-arranging to solve for the unknown parameter. Analytical solutions for estimating and are not possible using this method because they are found both inside and outside the exponential in (2.1). We can still consider this method numerically for and as an alternative to the numerical maximization of the likelihood function directly, except when .

The single-sample analytical ML estimates are then as follows:

 ^d∣∣ML= v∥tef±√4DtefH(s1)−t2ef(v2⊥+4kD), (24) ^k∣∣ML= −|→ref|24Dt2ef+H(s1)tef, (25) ^v∥∣∣ML= dtef±1tef√4DtefH(s1)−t2ef(v2⊥+4kD), (26) ^v⊥∣∣ML= ±1tef√4DtefH(s1)−4kDt2ef−(d−v∥tef)2, (27) ^N∣∣ML= s1(4πDtef)3/2VRX% exp⎛⎝ktef+|→ref|24Dtef⎞⎠, (28)

where

 H(s1)=ln(NVRXs1(4πDtef)3/2), (29)

we recall that , and here . Some additional comments on these ML estimates are necessary:

1. is a decreasing function of the observation . For a sufficiently large value of , an estimate of can be negative or an estimate of , , or can have an imaginary component. A negative degradation rate is physically meaningful and corresponds to the spontaneous generation of molecules in the propagation environment. Estimates with imaginary components should be ignored.

2. The “” in (24), (26), and (27) mean that there could be multiple valid estimates due to the symmetry of (2.1) about the point . Even if the resulting distance is negative, it still has physical meaning because it represents uncertainty in the position of the TX relative to the RX, e.g., at or if . We could choose between multiple valid estimates by tossing an unbiased coin.

3. If the observation , then and all analytical ML estimates (except for that of ) are infinite. We can avoid infinite estimates by setting if , where .

As with the CRLB, we will find that the accuracy of ML estimation improves with the number of observations . Therefore, in Section 5 we will not focus on assessing the above equations for single-sample analytical ML estimates.

#### Iterative Numerical ML Estimation

Here, we present examples of the structure of the Newton-Raphson method for numerically finding the ML parameter estimate. We consider the same examples that we examined in Section 3.2 because of their importance when establishing a communication link. First, we consider estimation of the distance . Second, we consider the joint estimation of and .

By (9), the distance can be found iteratively as

 ^dn+1=^dn−∂lnp(s|θ)∂d/∂2lnp(s|θ)∂d2∣∣∣d=^dn, (30)

where we have already presented the first derivative of the logarithm of the joint PDF with respect to in (21). The second derivative with respect to can be shown to be

 ∂2lnp(s|θ)∂d2=−M∑m=1⎛⎝sm−¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)2Dtef+G2d¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)⎞⎠, (31)

such that is found iteratively as

 ^dn+1=^dn+∑Mm=1Gd(sm−¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm))∑Mm=1(sm−¯¯¯¯¯¯¯¯¯¯¯NAob(tm)2Dtef+G2d¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)), (32)

where (as defined in (14)) and are evaluated for . Similar iterative expressions can be written for the iterative estimation of the other channel parameters. We see that, for a single observation, i.e., , (32) will converge (such that = ) when the estimate is such that , unless we simultaneously have (in which case the method will diverge).

The joint estimation of distance and synchronization (via ), such that , can be found iteratively as

 Unknown environment '% (33)

where we use when we evaluate the derivatives of the logarithm of the joint PDF. It can be shown that the second derivative of the logarithm of the joint PDF with respect to is

 ∂2lnp(s|θ)∂t02= M∑m=1[sm−¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)2t2ef(3−d2Dtef) −G2t0¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)], (34)

and the cross derivative is

 ∂2lnp(s|θ)∂d∂t0= M∑m=1(d(¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)−sm)2Dt2ef −GdGt0¯¯¯¯¯¯¯¯¯¯¯¯NAob(tm)). (35)

The structure of the Newton-Raphson method can be similarly described for other estimation problems with more than one unknown parameter.

#### ML Estimation and the CRLB

We complete our discussion of ML estimation by commenting on the behavior of ML estimation when the FIM is singular and the CRLB does not exist. Consider the estimation of a single parameter from a single observation so that from (13) the FIM is a single element with no summation. If , then and no unbiased estimator with finite error variance exists. We have observed that we cannot find an analytical ML estimate when estimating one parameter from a single observation when . However, an informative ML estimate still exists; performing a finite grid search and choosing the estimate that maximizes the observation’s log likelihood will result in a finite mean square error. We will see an example of this in Section 5. The ML estimate is still informative because it is now biased (we previously noted in Section 2.3 that we can only claim that ML estimation is efficient in the sense of the CRLB as ).

It is insufficient to limit this discussion to the case where . In fact, ML estimation is biased and better than the CRLB when is in the “vicinity” of 0, i.e., as . Even in the case of estimating multiple parameters from a “small” number of observations, the FIM could be singular or nearly singular (this becomes less likely as more observation are made). Again, ML estimation in such a scenario can be biased and better than the CRLB. More seriously, poor conditioning can also cause convergence problems when implementing the Newton-Raphson method for ML estimation.

Existing literature (see [17, 18]) has sought to define the “neighborhood” of a singularity to determine where the CRLB is not an actual lower bound for ML estimation. However, this is a non-trivial task that has only been studied for some specific problems. A detailed study to determine the parameter values for which the CRLB is not a lower bound on ML estimation is outside the scope of this work.

One might question whether knowledge of the CRLB is meaningful if it is not always a lower bound on ML estimation. We believe that it is relevant to have the CRLB because we are ultimately interested in practical parameter estimation schemes. A practical estimator will be more effective if it collects many observations over time. FIMs that are singular (and therefore have no corresponding CRLB) or close to singular will be less common as more observations are made, as we will observe in Section 5. Furthermore, ML estimation becomes unbiased (such that the CRLB is valid) as more observations are made. Thus, we claim that the CRLB is a useful benchmark.

### 4.2 Peak-Based Estimation

The study of parameter estimation in this paper has focused thus far on optimal performance, i.e., we have asked what is the best possible performance of an unbiased estimator and what is the performance of the maximum likelihood approach. We do not expect to implement a ML estimator as part of a nanoscale device, even if there is only one unknown parameter to estimate. Rather, our intent is to establish theoretical limits that we can use to compare with simpler, more practical estimators. We propose peak-based estimation for finding any one unknown channel parameter. Peak-based estimation has been proposed for distance estimation in [9, 10] and was also considered in our work in [1]. It has been shown to be a relatively simple and accurate method for measuring the distance. By simple, we mean that a peak-based estimator makes multiple observations but uses just one observation to calculate the estimate.

In our simplest variation, the RX measures the time when the peak number of molecules is observed and uses the value of to estimate the unknown parameter. For comparison, we consider more complex variations where the RX measures the peak number of observed molecules , and also where the RX measures both and .

#### Finding the Peak Time

For peak-based estimation we need the time, after molecules are released by the TX, when the maximum number of molecules is expected, i.e., given that . We previously derived for our system model as [1, Eq. (4)]

 ¯¯¯¯¯¯¯¯tmax=(−3+√9+d2η/D)/η, (36)

where

 η=(v2∥+v2⊥)/D+4k=|v|2/D+4k. (37)

In the absence of flow and molecule degradation, i.e., if , then it can be shown that the peak number of molecules would be expected at the RX at time

 ¯¯¯¯¯¯¯¯tmax=d2/(6D). (38)

Peak-based estimation requires the RX to measure either the peak number of observed molecules or the time when the peak number is observed. The simplest method for doing so is to keep track of the number of molecules observed over a “sufficiently” long period of time and then select (either the time or the value of) the peak observation. A more general method, originally proposed in [10], is for the RX to track the upper and lower envelopes of the observations. The “peak” observation is then the peak value of the mean of the two envelopes. We implemented the envelope detector in [1] using what we called a moving maximum filter and a moving minimum filter. Given an odd filter length , the th filtered observation of the moving minimum filter is

 s′m=minw∈{m−W−12,…,m+W−12}sw, (39)

and the moving maximum filter is analogously defined. We note that a filter length is analogous to the simplest method of determining or . We also note that the maximum observation will generally be greater than the expected observation at the time when the maximum observation is expected, even when using the envelope detector. This is discussed in greater detail in [10]. The estimators that follow in the remainder of this section can be implemented with any method of finding (the time or the value of) the peak observation.

#### Estimation from Peak Time

Our simplest variation of peak-based estimation is when the RX estimates a parameter using alone (and not ). If is the unknown parameter, then we assume that the RX is able to calculate from (36) or (38) as appropriate and measure the time when the peak number of molecules is observed. If the observed peak time is , then the RX can immediately estimate as

 ^t0∣∣Peak=tmax−¯¯¯¯¯¯¯¯tmax. (40)

For clarity of exposition in the remainder of this section, we will assume that when it is known and that the RX has adjusted its timer accordingly. Other values of can be accommodated by replacing the observed with .

Estimates for most of the remaining parameters can be derived by re-arranging (36) or (38) as appropriate (the number of molecules released, , does not appear in (36) or (38), so we cannot use this method to estimate ). Generally, if we have flow or molecule degradation, i.e., if , then the remaining peak-based estimators are

 ^d∣∣Peak= Missing or unrecognized delimiter for \right (41) ^D∣∣Peak= d2−tmax2|v|24tmax2k+6tmax, (42) ^k∣∣Peak= d2−6Dtmax−tmax2|v|24Dtmax2, (43) ^|v|∣∣Peak=  ⎷d2−4Dktmax2−6Dtmaxtmax2, (44)

and the estimators (41) and (42) for the distance and the diffusion coefficient, respectively, also apply in the absence of flow and molecule degradation, i.e., if . We note that a two-way version of the distance estimator (41) when was originally proposed as the round-trip time from peak concentration protocol in [9]. Given by (44) and the knowledge of one flow component, we can estimate the unknown flow component using .

#### Estimation from Peak Observation

Our remaining peak-based estimation protocols are adapted from single-sample ML estimation, given that we have the peak observation . Any such parameter estimate will not be the ML estimate given all of the observations that were assessed to identify , but will be the single-sample ML estimate for the largest observation. These protocols must be implemented numerically, except for special cases, and are considered as (potentially) more accurate alternatives to estimation from only the peak time .

If the RX has knowledge of both and , then both of these can be substituted into (12) and the ML estimate can be found numerically ( is substituted for ). We can alternatively apply one of the analytical closed-form ML estimates found in Section 4.1 if the corresponding .

If the RX has knowledge of but not of , then the corresponding formula for (either (36) or (38)) can be substituted for in (12) and the ML estimate can be found numerically. This approach was applied in the implementation of the envelope detector proposed for distance estimation when in [10]. The only analytical ML estimate in Section 4.1 that remains in closed-form for any without requiring a numerical evaluation is that of in (28) because is not a function of the number of released molecules.

## 5 Numerical and Simulation Results

In this section, we present simulation results to assess the performance of the channel parameter estimation protocols discussed in this paper with respect to the corresponding CRLBs. Our simulations were executed in the microscopic stochastic framework that we presented in [24]. The TX is implemented as a spherical source such that the released molecules are initially separated by at least . The molecules are not created at a common point because they cannot physically occupy the same space, and is larger than the size of single atoms and on the order of the size of small organic molecules that might be suitable for signaling; see [2, Ch. 2]. Every molecule released by the TX is treated as an independent point particle whose location is updated every simulation time step . In one time step, a given molecule has a probability of of degrading via reaction (1). All simulation results that we present in this section were averaged over independent simulations.

For clarity of exposition, since we have presented a number of parameter estimators in this paper, and there are many possible combinations of joint parameter estimation problems, we focus on a single set of environmental parameters as summarized in Table 2. The RX has a radius of , which is about the size of a small bacterial cell; see [2, Ch. 1]. The diffusion coefficient of is comparable to that of small molecules in blood plasma; see [27]. The molecule degradation rate of is sufficient, in the absence of flow, for an RX from the TX to expect one less molecule at the expected peak concentration time than if , i.e., instead of .

The flow magnitudes of and are strong relative to the diffusion but do not completely dominate; the Peclet number, which describes the relative dominance of convection versus diffusion and is found here as (see [28, Ch. 5]), is equal to when . Such strong flows are within the range of average capillary blood speed (from to ; see [27]). We do not claim to accurately model capillary flow, where the flow is more complex than the uniform flow that we consider in this work, but such an environment is also one where the flow is relatively stronger than diffusion (without dominating; see [29, Ch. 7]). The strong flows also enable us to observe singularities in the CRLB for distance estimation at sampling times of interest (i.e., near when the maximum number of molecules is expected). The number of molecules released by the TX at one time, , is the number of molecules that would be inside a spherical container of radius with a concentration of , which is at least an order of magnitude lower than the concentration of common ions used for signaling in mammalian cells; see [2, Ch. 12].

Table 2 also lists the minimum and maximum parameter values that we use when performing a grid search of the maximum likelihood estimate of a given channel parameter. By symmetry, we only consider positive distance and positive perpendicular flow . We do not consider TX release times greater than , the time of the first observation, because molecules cannot be observed before they are released. We also only consider non-negative degradation rate , even though negative has physical meaning (i.e., information molecules are spontaneously created). Our constraints on the ranges of parameter values for grid searches enable “best-case” ML estimation; relaxing any of the constraints can only make ML estimation less accurate.

The resulting expected channel impulse response as a function of time, given the parameters listed in Table 2, is presented in Fig. 2 for varying distance from to . We also show the average channel impulse response as generated by independent realizations of our simulator at each distance. Over this range of distances, the time of the expected maximum increases from about to almost , and the number of molecules expected at that time decreases by almost two orders of magnitude. The average simulated responses are generally in agreement with the expected impulse responses, although the expected response tends to slightly underestimate the simulations before the peak time and overestimate the simulations after the peak time (due to the limitation of the assumption that the concentration expected throughout the RX is uniform). Assuming that the TX is a point source even though we simulate a spherical source is also a (negligible) source of inaccuracy.

In the remainder of this section, we present normalized (i.e., dimensionless) results of the CRLB and the performance of the parameter estimators (unless otherwise noted). By normalizing our results, we are able to show the relative accuracy of estimating a given parameter. This is useful when a single parameter can vary over orders of magnitude, or when we want to show the estimation of different parameters on a single plot. We normalize the CRLB of parameter as

 1θ2iRef[I−1(θ)]θi,θi, (45)

such that a CRLB of 1 means that the lower bound on the variance of an unbiased estimator is equal to . Generally, we will set . The one exception is that of because it has a value of . We set so that the normalizing term is on the order of what an accurate estimate would be.

The performance of an estimator of parameter is evaluated by measuring the estimator’s mean square error, which (unless otherwise noted) we normalize as

 mse(^θi)∣∣Norm=E[(^θi−θi)2]/θ2iRef, (46)

and we note that the non-normalized mean square error, i.e., without the scaling factor of , is equivalent to the variance in (7) if and only if the estimator is unbiased. Generally, we aim for the CRLB and the mean square error to be as small as possible, such that the normalized bound and error should be much less than for the estimation to be meaningful.

Unless otherwise noted, the sampling scheme is as follows. When one observation is made, i.e., , then it is taken at (close to the time when the maximum number of molecules are expected at distance ; see Fig. 2). For other values of , the observation times are equally spaced such that the last sample is taken at time . If we had only added new sample times when increasing (without changing the old values of ), then from (3) and (13) the CRLB could never increase. However, since we change the exact sample times for each value of , we will see results where the CRLB can increase with (small values of) increasing .

### 5.1 Optimal Estimation

We begin our discussion of estimator performance by focusing on optimal estimation, i.e., ML estimation and how it compares with the CRLB. All ML performance results presented were obtained via a grid search using the limits specified in Table 2. By performing grid searches instead of using the analytical solutions available when , we do not need to address the exceptional cases described in Section 4.1.1. The performance of the estimation of a single parameter has also been verified via the Newton-Raphson method.

First, we consider estimating the distance when the true value is and we vary the number of observations made and the number of known parameters. We measure the normalized CRLB (given by (45)) and the normalized mean square error (given by (46)) of ML estimation when only is unknown, and then successively remove the knowledge of , , and . The results are shown in Fig. 3. Removing the knowledge of , , or is not as detrimental to distance estimation, so corresponding results are not shown. We will see later in this section that removing the knowledge of does not significantly degrade the estimation of , , or , either. We note that the ML estimate was solved for fewer values of when there are three unknown parameters and for no values of when there are four unknown parameters due to the increasing computational requirements of exhaustive searching. Applying the Newton-Raphson method for three and four unknown parameters was not feasible here due to poor matrix conditioning.

In Fig. 3, we see that there are no steady trends of ML estimation accuracy or its comparison with the CRLB for low values of , i.e., for . This is for two reasons: the sampling times change significantly for each value of and some of these samples are “close” to singular points. For example, a sample taken at about will have a corresponding term with a value of , which is why the CRLB when only is unknown and (i.e., the first sample is at ) is higher than when (i.e., the first sample is at ). We also cannot claim that losing knowledge of parameters will always degrade performance; when or , the ML estimate of when both and are unknown is more accurate than when only is unknown. In these cases, the ML estimate trades accuracy in estimating for accuracy in estimating (later in this section, we will see that estimating is very inaccurate when ). Nevertheless, we can make more general claims as more samples are taken, i.e., for . As more samples are made, the CRLB improves and the ML estimate approaches the CRLB. In this regime, the CRLB and ML performance both degrade as more parameters become unknown. The potential mean square error in the estimation of increases by orders of magnitude as we remove the knowledge of the values of , , and .

In Fig. 4, we observe the performance of ML estimation of each individual channel parameter when only that parameter is unknown. We set , and we measure the normalized error of each parameter as a function of the number of samples . To ease inspection of the normalized error for small values of , this figure is shown in log-log scale. The figure gives us a sense of the relative accuracy to which we can aim to estimate any single channel parameter, and helps us to verify the diagonal elements of the FIM that we list in Table 1. As in Fig. 3, the normalized error as a function of the number of samples begins to stabilize for . We observe that the ML estimation of any single parameter performs very close to the corresponding CRLB as more samples are taken. In a relative sense, we can most accurately estimate the distance , followed (in order) by the flow towards the RX , the perpendicular flow component , the number of molecules released , the diffusion coefficient , the molecule degradation rate , and finally the release time (although the choice of was particularly arbitrary since we could not choose ; the normalized error in the estimation of is comparable to that of if we choose instead of ).

In Fig. 5, we observe the performance of ML estimation of each individual channel parameter when there are two unknown parameters: the distance (whose actual value is still ) and the parameter of interest. We measure the normalized error of each parameter as a function of the number of samples . We can compare Fig. 5 directly with Fig. 4 to see the importance of the knowledge of the distance when estimating the other channel parameters. There is negligible degradation in the estimation of , , and , slight degradation in the estimation of , and significant degradation in the estimation of and . The negligible change in the estimation of is most interesting because the opposite was not observed in Fig. 3, where removing the knowledge of was shown to measurably degrade the estimation of .

The results presented thus far do not give us a very clear sense of the performance of ML estimation in the “neighborhood” of a singularity in the FIM. To do so, we need to consider ML estimation as a function of a varying channel parameter whose domain includes a point where the CRLB is infinite. In Fig. 6, we perform distance estimation as a function of the actual distance for the number of observations . We adjust the sampling times for so that they are taken at and . This adjustment ensures that, for every value of , a sample is taken at time , so that the corresponding is when (recall that is a function of ). Also, in this figure we do not normalize the mean square error or the CRLB because is the only unknown parameter.

The only singularity in the FIM in Fig. 6 is when and . Although ML estimation when is generally not nearly as accurate as the CRLB, it is more accurate than the CRLB over the range . This range is effectively the “vicinity” of the singularity when one sample is taken at time and when ML estimation must be biased. Interestingly, when , there is never an actual singularity in the FIM, but we are still in the vicinity of a singularity when , where ML estimation is more accurate than the CRLB and must be unbiased. We observe this behavior where the term for the observation at is equal to zero, but not where the term for the observation at is equal to zero, i.e., at around . This is because the sample at time is more critical for the estimation of than that at . The relative importance of individual samples is reduced as more samples are taken, such that ML estimation is only slightly more accurate than the CRLB at when , i.e., where the sample is most critical for the estimation of and the corresponding term is . Otherwise, we observe that ML estimation performs close to but not better than the CRLB for larger values of , where ML estimation becomes increasingly unbiased, over the entire range of that we consider.

### 5.2 Peak-Based Estimation

Finally, we consider the performance of the sub-optimal peak-based estimators that we proposed in Section 4.2. We are interested in how well the simplest peak-based protocol (which only measures and can be implemented in closed-form) performs in comparison to the peak-based protocols that generally require a ML search given the value of the peak observation . The ML estimates given are found via a grid search. We are also interested in the impact of the moving minimum (and maximum) filter window length on the performance of each estimator, and whether the relative performance of the different estimators varies when different parameters are being estimated.

In Fig. 7, we compare the performance of the peak-based distance estimators as a function of the actual distance for varying window length