Secret Key Generation Via Localization and Mobility

Abstract

We consider secret key generation from relative localization information of a pair of nodes in a mobile wireless network in the presence of a mobile eavesdropper. Our problem can be categorized under the source models of information theoretic secrecy, where the distance between the legitimate nodes acts as the observed common randomness. We characterize the theoretical limits on the achievable secret key bit rate, in terms of the observation noise variance at the legitimate nodes and the eavesdropper. This work provides a framework that combines information theoretic secrecy and wireless localization, and proves that the localization information provides a significant additional resource for secret key generation in mobile wireless networks.

I Introduction

We consider the generation of a common key in a pair of nodes, which move in continuous space according to a stochastic mobility model. We exploit the reciprocity of the distance between a given pair of locations, view the distance between the legitimate nodes as common randomness shared by these nodes, and utilize it to generate secret key bits using ideas from source models of secrecy [1].

Unlike the many recent studies that focus on wireless channel reciprocity (see Section I-A for a brief list of related papers), localization can rely on a variety of technologies (e.g., ultrasound, infrared, Lidar, Radar, wireless radios), which makes distance reciprocity an additional resource for generating secret key bits. Such versatility makes key generation systems more robust, since different technologies may have capabilities that wireless RF lacks. For instance, the narrow beam width of infrared systems makes them less susceptible to eavesdropping from different angles. Distance reciprocity is highly robust, since the distance measured between any pair of points is identical regardless of the point from which the measurement originates (e.g., when there is no line-of-sight, or when different frequency bands are used each way). Yet, there are various challenges in obtaining reciprocal distance measurements.

In this paper, we analyze the theoretical limits of key generation using localization in the following system. We assume that mobile nodes obtain observations regarding the sequence of distances between them over a period of time as they move in the area. The measurements can be obtained actively through the exchange of wireless radio, ultrasound, or infrared beacons, or passively by processing existing video images, etc. The beacon signal may contain explicit information such as a time stamp, or the receiving node can extract other localization information by analyzing angle of arrival, received signal strength, etc. The nodes perform localization based on the observations of distances and the statistics of the mobility model, and obtain estimates of their relative locations with respect to each other. Then, the nodes communicate over the public channel to agree on a secret key. The generated key bits satisfy the following three quality measures: i) reliability, ii) secrecy, and iii) randomness. For reliability, we show that the probability of mismatch between the keys generated by the legitimate nodes decays to zero with increasing block length. In our attacker model, we consider a passive eavesdropper that overhears the exchanged beacons in the first phase and the public discussion in the third phase, and tries to deduce the generated key based solely on these observations. The attacker can follow various mobility strategies in order to improve its position statistically and reduce the achievable key rate (possibly to zero). We assume that the attacker does not actively interfere with the observation phase, e.g., by injecting jamming signals, in order not to reveal its presence. For secrecy, we consider Wyner’s notion, i.e., the rate at which mutual information on the key leaks to the eavesdropper should be arbitrarily low. For randomness, the generated key bits have to be perfectly compressed, i.e., the entropy of the key should be equal to the number of bits it contains.

We mainly focus on information theoretic limits. Using a source model of secrecy [1], we characterize the achievable secret key bit rate in terms of observation noise parameters at the legitimate nodes and the eavesdropper under two different cases of global location information (GLI): (i) No GLI, in which the nodes do not observe their global locations directly, and (ii) perfect GLI, in which nodes have perfect observation of their global locations, through a GPS device, for example. While the bounds we provide are general for a large set of observation statistics, we further investigate the scenario in which the observation noise is i.i.d. Gaussian for all nodes: We study the observation SNR asymptotics, and show a phase-transition phenomenon for the key rate. In particular, we prove that the secret key rate grows unboundedly as the observation noise variance decays, if the eavesdropper does not obtain the angle of arrival observations. Otherwise, it is not possible to increase the secret key rate beyond a certain limit. Then, we evaluate the theoretical performance numerically for a simple grid-type model, as a function of beacon power. We also evaluate the performance for the case where the eavesdropper strategically changes its location to reduce the secret key rate. Specifically, we consider the strategy where the eavesdropper moves to the middle of its location estimates of the legitimate nodes. We show that with this strategy, the eavesdropper can significantly reduce the secret key rate compared to the case where it follows a random mobility pattern.

In summary, our main contribution is to illustrate that relative localization information can be used as an additional resource for secret key generation (see Section I-A for a comparison with related work).

I-A Related Work

Generation of secret key from relative localization information can be categorized under source model of information theoretic secrecy, which studies generation of secret key bits from common randomness observed by legitimate nodes. In his seminal paper [1], Maurer showed that, if two nodes observe correlated randomness, then they can agree on a secret key through public discussion. He provided upper and lower bounds on the achievable secret key rates. Although the bounds have been improved later [2, 3], the secret key capacity of the source model in general is still an open problem. Despite this fact, the source model has been utilized in several different settings [4, 5, 6].

There is a vast amount of literature on localization (see, e.g., [8, 9] for wireless localization, [15] for infrared localization, and [16] for ultrasound localization). There has been some focus on secure localization and position-based cryptography [10, 11, 12, 13], however, these works either consider key generation in terms of other forms of secrecy (i.e., computational secrecy), or fall short of covering a complete information theoretic analysis.

A similar line of work in wireless network secrecy considers channel identification [14] for secret key generation using wireless radios. Based on the channel reciprocity assumption, nodes at both ends experience the same channel, corrupted by independent noise. Therefore, nodes can use their channel magnitude and phase response observations to generate secret key bits through public discussion. The literature on channel identification based secret key generation is vast. The works [20, 21, 22, 23, 24, 25] study key generation with off-the-shelf devices, on an 802.11 development platform, using a two-way radio signal exchange on the same frequency. In contrast, [27] utilizes the fact that fading is highly correlated at locations that are less than half a wavelength apart, instead of exploiting reciprocity. Therefore, very close nodes can use public radio signals (e.g., FM, TV, WiFi) to generate secret key bits.

In most of these works, the security analysis is based on the assumption that the channel gains are modeled as random processes that are independent of the distances between the nodes, and are independent at locations that are more than a few wavelengths apart. While appropriate for non-line-of-sight and highly dynamic media, these models do not capture wireless propagation in environments where attenuation is a function of the propagation distance. In such environments, an attacker that has some localization capabilities will gain a statistical advantage by estimating the channel gains based on its distance observations. If the key generation process ignores this advantage, part of the key may be recovered by the attacker, and thus the key cannot be perfectly secure. For instance, Jana et al. [21] focus on a scenario in which secret key bits are generated based on the received signal strength (RSSI), and show that an eavesdropper that knows the locations of the legitimate nodes can launch a mobility attack to force the legitimate nodes to generate deterministic key bits, by periodically blocking and unblocking their line-of-sight. Similarly, if the eavesdropper is close (less than a wavelength) to one of the legitimate nodes, then the eavesdropper will obtain correlated information [27]; therefore, the generated key will not be perfectly secure, and secrecy outage occurs. The practical applicability of exploiting channel reciprocity for secret key generation has also been questioned recently in [26]. It is shown that, especially when the nodes have sufficient mobility, the eavesdropper’s and the legitimate receiver’s channels can be significantly correlated depending on the locations, which breaks the secrecy of the initially generated key.

On the other hand, key generation based on locations does not make such independence assumptions. The dependencies among the locations of the legitimate nodes, and between the observations of the attacker and those of the legitimate nodes, are taken into account to provide provable security against a mobile eavesdropper with localization capability. Thus, the insights provided in this paper can also be valuable for the class of studies on key generation based on wireless channel reciprocity, as we show how one should capture a variety of attacker capabilities in finding the correct rate for the key and in designing the appropriate mechanisms to generate a truly secret key.

A word about notation: We use and denotes the L2-norm. A brief list of variables used in the paper can be found in Table I.

II System Model

II-A Mobility Model

We consider a simple network consisting of two mobile legitimate nodes, called user and , and a possibly mobile eavesdropper . We divide time uniformly into discrete slots. Let be the random variable that denotes the coordinates of the location of node in slot , where nodes are restricted to the field . We use the boldface notation , to denote the -tuple location vectors for . The distance between nodes and in slot is . Similarly, and denote the sequence of distances between nodes and nodes respectively. We use the boldface notation , , for the -tuple distance vectors. Note that, in any slot the nodes form a triangle in , as depicted in Figure 1, where , , , denote the angles with respect to some coordinate axis.

Fig. 1: Legitimate nodes and the eavesdropper form a triangle.
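
The triangle geometry described above can be illustrated with a minimal sketch. The node coordinates and all names (`node_1`, `node_2`, `node_e`, `distance`, `angle`) are hypothetical and chosen only for illustration; they do not reproduce the paper's notation. The sketch checks that the distance is reciprocal and that the three pairwise distances are tied together by the cosine law.

```python
import math

def distance(p, q):
    """Euclidean (L2) distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def angle(p, q):
    """Angle of the ray from p to q, measured from the x-axis, in radians."""
    return math.atan2(q[1] - p[1], q[0] - p[0])

# Hypothetical positions of the two legitimate nodes and the eavesdropper.
node_1 = (0.0, 0.0)
node_2 = (4.0, 0.0)
node_e = (1.0, 3.0)

d_12 = distance(node_1, node_2)
d_1e = distance(node_1, node_e)
d_2e = distance(node_2, node_e)

# Interior angle of the triangle at the eavesdropper's corner.
phi_e = angle(node_e, node_1) - angle(node_e, node_2)

# Cosine law: the legitimate distance follows from the eavesdropper's two
# distances and the angle between the legitimate nodes as seen from it.
d_12_from_cosine_law = math.sqrt(
    d_1e ** 2 + d_2e ** 2 - 2.0 * d_1e * d_2e * math.cos(phi_e)
)

print(d_12, d_12_from_cosine_law)
```

This is why the angle observation matters for the eavesdropper: with noiseless distances to both legitimate nodes and a noiseless angle, the legitimate distance would be fully determined.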

We assume that the distances take values in the interval , since the nodes cannot be closer to each other than due to physical restrictions, and they cannot be further than away from each other due to their limited communication range. We assume that the location vectors are ergodic processes. We will use the notation to summarize the state variables related to mobility in the system. Note that for any 1.

var. Description
number of slots
number of steps in public discussion
distance between nodes and
the field where nodes are located
2-D location of node
angle between nodes and
observation of nodes of and
observation of node of and
complete observations of node based on available GLI
location triple
quantized version of location triple,
quantization resolution
uniform quantization function
node ’s estimate of based on all its information
-bit Gray coder
obtained binary key at node before reconciliation
obtained binary key at node after reconciliation
obtained binary key at node after universal compression
final key at node after universal hashing
TABLE I: List of variables

II-B Localization

At each time slot, there is a period in which the legitimate nodes obtain information about their relative position with respect to each other. As discussed in Section I-A, there are various methods to establish the localization information. In this paper, we will not treat these methods separately. We will simply assume that, during measurement period , when node transmits a beacon, nodes and obtain a noisy observation of and respectively. Let these observations be and , respectively. Similarly, when node follows up with a beacon, nodes and obtain the distance observations and , respectively. The nodes may also independently observe their global positions, e.g., through a GPS device. They may also observe the angle they make with respect to each other, if they are equipped with direction-sensitive localizers (e.g., directional antennas in wireless localization). We consider two extreme cases of global location information (GLI):
1) no GLI: The nodes do not have any knowledge of their global location. However, with the observations of both the beacons, the eavesdropper also obtains a noisy observation, , of the angle between the legitimate nodes.
2) perfect GLI: Each node has perfect knowledge of its global location, and a sense of orientation with respect to some coordinate plane as shown in Figure 2. In this case, nodes , obtain noisy observations of the angle . Similarly, node obtains noisy observation of the angles .

Let denote the set of observations of node during slot , and . The observations for each case are provided in Table II. We emphasize that the observations in each slot are obtained solely from the beacons exchanged during that particular slot. The nodes’ final estimates of the distances also depend on the observations during other slots, due to predictable mobility patterns.

Fig. 2: With GLI, the nodes obtain noisy observations of the relative orientation of each other with respect to the x-axis.
No GLI Perfect GLI
TABLE II: Nodes’ Observations

II-C Attacker Model

We assume that there exists a passive eavesdropper , which does not transmit any beacons. However, node can strategically change its location to obtain a geographical advantage against the legitimate nodes. Overall, we consider two strategies:
Random Mobility: The eavesdropper moves randomly, without regard to the locations of the legitimate nodes. We will assume that the eavesdropper adopts random mobility unless otherwise stated.
Mobile Man in the Middle: Node controls its mobility, such that it can move accordingly to obtain a geographic advantage compared to legitimate nodes. We consider the strategy where node moves to the mid-point of its maximum likelihood estimates of the legitimate nodes’ locations. For , let us denote node ’s maximum likelihood estimate of node ’s location at slot , based on its observations up to slot as . Then,

In other words, node and node ’s locations at slot are predicted by node from its observations in the previous slots. Then, at the beginning of each slot , node moves to the mid-point of the estimates, which is .
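
The movement step of this strategy can be sketched as follows. The sketch assumes that the eavesdropper's maximum likelihood estimates of the two legitimate nodes' locations are already available; the estimator itself is not shown, and the names `estimate_node_1`, `estimate_node_2`, and `midpoint` are hypothetical.

```python
import math

def midpoint(estimate_a, estimate_b):
    """Mid-point of two 2-D location estimates."""
    return ((estimate_a[0] + estimate_b[0]) / 2.0,
            (estimate_a[1] + estimate_b[1]) / 2.0)

# Hypothetical location estimates held by the eavesdropper at the start of a
# slot, based on its observations in the previous slots.
estimate_node_1 = (2.0, 6.0)
estimate_node_2 = (8.0, 2.0)

# The eavesdropper moves to the mid-point of the two estimates.
eavesdropper_target = midpoint(estimate_node_1, estimate_node_2)

# At the mid-point, the eavesdropper is equally far from both estimates.
gap_to_1 = math.dist(eavesdropper_target, estimate_node_1)
gap_to_2 = math.dist(eavesdropper_target, estimate_node_2)
print(eavesdropper_target, gap_to_1, gap_to_2)
```

Since the observation noise grows with distance in the model considered later, balancing the two distances is one plausible reason why this position is favorable to the eavesdropper.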

Although we restricted ourselves to a single passive eavesdropper, we also discuss the implications of multiple eavesdroppers. The eavesdroppers may utilize their observations in two possible ways: (i) Non-colluding eavesdroppers do not communicate or share their observations with each other, whereas (ii) colluding eavesdroppers combine their measurements to obtain less noisy measurements. Note that an eavesdropper with multiple location sensors (e.g., multiple antennas in the case of wireless radio-based localization) is a special case of colluding eavesdroppers, as each sensor could be viewed as a separate eavesdropper, with perfect links between them. The theoretical secret key capacity under the colluding eavesdropper scenario is lower, due to the cooperation of the eavesdroppers, as discussed in Section III.
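
As a rough illustration of why collusion helps the eavesdroppers, the sketch below averages the distance observations of several eavesdroppers. It makes a simplifying assumption that is not part of the model in this paper: all eavesdroppers observe the same distance with independent Gaussian noise of equal variance. Under that assumption, averaging K observations reduces the noise variance by a factor of K. All names and parameter values are hypothetical.

```python
import random
import statistics

def simulate(true_distance, noise_std, num_eavesdroppers, num_trials, seed=0):
    """Return the empirical variance of a single observation and of the average."""
    rng = random.Random(seed)
    single, combined = [], []
    for _ in range(num_trials):
        # Assumption: independent, identically distributed Gaussian noise.
        observations = [true_distance + rng.gauss(0.0, noise_std)
                        for _ in range(num_eavesdroppers)]
        single.append(observations[0])                          # non-colluding
        combined.append(sum(observations) / len(observations))  # colluding
    return statistics.pvariance(single), statistics.pvariance(combined)

var_single, var_combined = simulate(
    true_distance=10.0, noise_std=1.0, num_eavesdroppers=4, num_trials=20000
)
print(var_single, var_combined)
```

With four colluding eavesdroppers, the variance of the combined observation should come out close to a quarter of the single-observation variance.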

II-D Notion of security

We consider the typical definition of source model of information theoretic secrecy under a passive eavesdropper: We assume that there exists an authenticated error-free public channel, using which the legitimate nodes can communicate to agree on secret keys, based on the observations of the distances and angles ( and ) obtained during beacon exchange. This process, commonly referred to as public discussion [1], is a step message exchange protocol, where at any step , node sends message , and node replies back with message such that, for ,

(1)
(2)

At the end of the step protocol, node obtains , and node obtains as the secret key, where

(3)
Definition 1

Secret key bits are generated (with respect to the described attacker model) at rate , if, for all and , there exists some such that (1), (2) and (3) are satisfied, and

(4)
(5)
(6)

Here, (4)-(6) correspond to perfect randomness, reliability and security constraints, respectively. The schemes proposed in the literature typically use a random coding structure, where are generated by using a binning strategy [1]-[6]. In Section III, we will make use of these existing results to provide computable theoretical bounds on the achievable key rates.

III Theoretical Performance Limits

In this section, we provide information theoretical bounds on the achievable key rate with perfect reliability. To evaluate these bounds, we assume an idealized system by ignoring the issues associated with quantization, cascade reconciliation protocol, and privacy amplification. Thus, these bounds are valid for any key generation scheme that satisfies Definition 1.

Theorem 1

A lower bound , and an upper bound on the perfectly-reliable key rate achievable through public discussion are

(7)
(8)

respectively, where and are as given in Table II for different possibilities of GLI.

The theorem follows from Theorem 4 in [18], which generalizes Maurer’s results on secret key generation through public discussion [1], to non-i.i.d. settings. Although tighter bounds exist in the literature [2, 3], we use the above bounds since they provide clearer insights into our systems due to their simplicity.

Note that for the special case where the observations are i.i.d., we can safely drop the index , and denote the joint probability density function of observations as . Therefore, the conditioning on the past and future observations in and disappear, and the bounds reduce to

(9)
(10)
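
For intuition on bounds of this type, the following sketch evaluates Maurer's classical bounds [1] for an i.i.d. discrete source: the lower bound max{I(X;Y) - I(X;Z), I(X;Y) - I(Y;Z)} and the upper bound min{I(X;Y), I(X;Y|Z)}, where X and Y are the legitimate observations and Z is the eavesdropper's observation. It is an illustrative assumption that these discrete expressions mirror the i.i.d. bounds above, which are stated for the observation sets of Table II. The toy source (a uniform bit observed through two independent binary symmetric channels) and all function names are hypothetical.

```python
import math
from collections import defaultdict

def marginal(pmf, keep):
    """Marginalize a joint pmf over outcome tuples onto the indices in `keep`."""
    out = defaultdict(float)
    for outcome, p in pmf.items():
        out[tuple(outcome[i] for i in keep)] += p
    return out

def entropy(pmf):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def mutual_information(pmf, a, b, given=()):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C), in bits."""
    h = lambda idx: entropy(marginal(pmf, idx))
    return h(a + given) + h(b + given) - h(a + b + given) - h(given)

def key_rate_bounds(pmf):
    """Maurer-type lower and upper bounds for outcomes ordered as (x, y, z)."""
    x, y, z = (0,), (1,), (2,)
    i_xy = mutual_information(pmf, x, y)
    lower = max(i_xy - mutual_information(pmf, x, z),
                i_xy - mutual_information(pmf, y, z),
                0.0)  # a key rate of zero is always achievable
    upper = min(i_xy, mutual_information(pmf, x, y, given=z))
    return lower, upper

def bsc_joint_pmf(flip_y, flip_z):
    """X is a uniform bit; Y and Z are X flipped independently."""
    pmf = {}
    for x in (0, 1):
        for y in (0, 1):
            for z in (0, 1):
                p = 0.5
                p *= flip_y if y != x else 1.0 - flip_y
                p *= flip_z if z != x else 1.0 - flip_z
                pmf[(x, y, z)] = p
    return pmf

pmf = bsc_joint_pmf(flip_y=0.1, flip_z=0.3)
lower, upper = key_rate_bounds(pmf)
print(lower, upper)
```

In this toy source, Y and Z are conditionally independent given X, so the two bounds coincide. In general they need not, which is why both a lower and an upper bound are reported.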

Also note that Theorem 1 can be extended to provide key rate bounds against multiple eavesdropper models discussed in Section II-C. Consider eavesdroppers, with observations . For the non-colluding eavesdroppers model, since the eavesdroppers are not communicating, we can safely consider the most capable eavesdropper . In other words, in (7), (8) we can replace with for which yields the lowest bounds, and discard the rest of the eavesdroppers. For the colluding eavesdroppers model, we can replace the term in (7), (8) with since the eavesdroppers perfectly communicate with each other. It can be directly observed that the bounds for colluding case are lower with respect to the non-colluding case.

IV Gaussian observations

To obtain more insights from theoretical results in Section III, we focus on the following special case: First, we assume that the node locations are individually Markov processes such that

holds for any , and their joint probability density function is well defined. Secondly, all observations of distance and angle terms are i.i.d. Gaussian processes. This model is typically used in the literature to characterize observation noise [19, 8]. To that end, for no GLI,

(11)
(12)
(13)

are Gaussian noise processes, where is the beacon power. The observation noise variances are increasing functions of the distance, which are modeled by the increasing functions for distance observations and for angle observations. The parameter depends on the capability of the nodes. For instance, in wireless localization, and depend on the path loss exponent, and depends on receiver antenna gain, number of antennas, etc. [19, 8]. For perfect GLI, we additionally assume that for ,

(14)
(15)

Clearly, the achievable key rates depend highly on the functions , and . Note that there may be a bias in these observations due to small-scale fading [19]. The effect of biased observations is studied in Appendix B.
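
A minimal sketch of a Gaussian distance observation of this kind is given below. The specific variance function, distance raised to an exponent and divided by the product of a capability parameter and the beacon power, is an assumption made only for illustration; the paper's actual functions are not reproduced here. The names `noise_variance`, `observe_distance`, `capability`, and `exponent` are hypothetical.

```python
import math
import random

def noise_variance(distance, beacon_power, capability=1.0, exponent=2.0):
    """Assumed variance: increasing in distance, decreasing in beacon power."""
    return distance ** exponent / (capability * beacon_power)

def observe_distance(distance, beacon_power, rng, capability=1.0, exponent=2.0):
    """True distance plus zero-mean Gaussian noise with the assumed variance."""
    std = math.sqrt(noise_variance(distance, beacon_power, capability, exponent))
    return distance + rng.gauss(0.0, std)

rng = random.Random(1)
true_distance = 5.0

# Both legitimate nodes observe the same (reciprocal) distance, each with its
# own independent noise; a higher beacon power tightens both observations.
for power in (1.0, 10.0, 100.0):
    at_node_1 = observe_distance(true_distance, power, rng)
    at_node_2 = observe_distance(true_distance, power, rng)
    print(power, round(at_node_1, 3), round(at_node_2, 3))
```

As the beacon power grows, the two legitimate observations concentrate around the same true distance, which is the common randomness the key is extracted from.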

IV-A Beacon Power Asymptotics

In this part, we analyze the beacon power asymptotics of the system. We show that, if the eavesdropper does not observe the angle, i.e., , then increases unboundedly with the beacon power , which indicates that arbitrarily large secret key rates can be obtained. However, when the eavesdropper observes the angle information, then remains bounded, which indicates that the advantage gained by increasing beacon power is rather limited. To clearly illustrate our insights, we present our results for the no GLI scenario. However, the same conclusion holds for the perfect GLI case as well.

Theorem 2

When the eavesdropper obtains angle information, i.e., ,

(16)

The proof is in Appendix A, where we show that , where

(17)

The parameter remains finite since the distances take on values in some bounded range with probability 1. Therefore, the secret key rate remains bounded.

Theorem 3

When the eavesdropper does not obtain any angle information, i.e., ,

The proof is provided in Appendix A. Theorem 3 implies that, without the angle observation at the eavesdropper, an arbitrarily large key rate can be achieved with sufficiently large beacon power . However, the key rate increases with , which means that increasing the beacon power would provide diminishing returns.

IV-B Numerical Evaluations

We evaluate the theoretical bounds in Section III for the Gaussian observation model, using Monte Carlo simulations.

Setup

We consider a simple discrete 2-D grid, which simulates a city with blocks that covers a square field of area , such that for any , , . Node mobilities are Markov, and characterized by parameter , where

For no GLI, we choose , and

and for perfect GLI, we choose

such that both parameters are strictly increasing functions of the distances. We consider node capability parameters , unless stated otherwise. The theoretical key rates in Section III converge as , therefore they are calculated for large enough , using the forward algorithm.
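
To indicate how a forward algorithm can be used in such an evaluation, the sketch below runs the standard forward recursion on a small hidden Markov model and uses the normalized negative log-likelihood of one long simulated sequence as a Monte Carlo estimate of the entropy rate of the observations. The model is a deliberately simplified stand-in and not the setup of this paper: the hidden state is a lazy random walk on five positions, and each observation is the state plus Gaussian noise. All names and parameter values are hypothetical.

```python
import math
import random

def gaussian_pdf(y, mean, std):
    return math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def lazy_walk_transition(num_states, stay_prob):
    """Random walk on a line; moves that would leave the line stay put."""
    move = (1.0 - stay_prob) / 2.0
    matrix = [[0.0] * num_states for _ in range(num_states)]
    for s in range(num_states):
        matrix[s][s] += stay_prob
        matrix[s][max(s - 1, 0)] += move
        matrix[s][min(s + 1, num_states - 1)] += move
    return matrix

def forward_log_likelihood(observations, initial, transition, means, noise_std):
    """Forward algorithm: log p(y_1, ..., y_n) for Gaussian emissions."""
    num_states = len(initial)
    belief = list(initial)  # p(state_t | y_1, ..., y_{t-1})
    log_likelihood = 0.0
    for y in observations:
        joint = [belief[s] * gaussian_pdf(y, means[s], noise_std)
                 for s in range(num_states)]
        scale = sum(joint)  # p(y_t | y_1, ..., y_{t-1})
        log_likelihood += math.log(scale)
        posterior = [w / scale for w in joint]
        # Predict the next state distribution from the current posterior.
        belief = [sum(posterior[r] * transition[r][s] for r in range(num_states))
                  for s in range(num_states)]
    return log_likelihood

# Simulate one long observation sequence from the toy model.
rng = random.Random(0)
num_states, noise_std, num_slots = 5, 0.5, 2000
transition = lazy_walk_transition(num_states, stay_prob=0.5)
means = [float(s) for s in range(num_states)]
initial = [1.0 / num_states] * num_states

state = rng.randrange(num_states)
observations = []
for _ in range(num_slots):
    observations.append(means[state] + rng.gauss(0.0, noise_std))
    state = rng.choices(range(num_states), weights=transition[state])[0]

log_likelihood = forward_log_likelihood(observations, initial, transition, means, noise_std)
entropy_rate_estimate = -log_likelihood / num_slots  # nats per slot
print(entropy_rate_estimate)
```

Mutual information rates can then be assembled from entropy-rate estimates of this kind for the individual and joint observation sequences, which is one common way to evaluate such bounds numerically.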

Results

Due to computational limitations, we consider examples in which , and . Note that this choice limits the maximum achievable secret key rate.

Fig. 3: Effect of eavesdropper mobility on the key rate
Fig. 4: Bounds for no GLI vs normalized beacon power, for different , ,
Fig. 5: Bounds for perfect GLI vs normalized beacon power, for different , ,

First, we analyze the effect of grid size, field area, and GLI on the theoretical key rates. In Figures 4 and 5, we plot the bounds on the achievable key rate with respect to the normalized beacon power for different grid sizes for the no GLI and perfect GLI cases, respectively. We assumed a constant ratio of field size and grid size, , and considered . We can see that there is a diminishing return in the achievable key rate as the power level increases. Furthermore, we can see that increasing the field area has a negative impact on the key rate despite the increase in , which is due to the fact that the common information of the legitimate nodes decreases as a result of the increase in their observation error variance. Next, in Figures 6 and 7, we plot the bounds with respect to beacon power , for different step sizes for the no GLI and perfect GLI cases, respectively. We assumed for the no GLI case, and for the perfect GLI case, and in both cases, the ratio of field size and grid size is constant, such that . We can clearly see the positive effect of the increased step size on the secret key rate. This is due to the increase in the number of possible distance combinations.

Fig. 6: Upper and lower bounds for no GLI vs normalized beacon power , , for different

Fig. 7: Upper and lower bounds for perfect GLI vs normalized beacon power , , for different

Finally, we analyze the effect of eavesdropper mobility on the achievable key rate. In Figure 3, for , and , we plot the secret key rate bounds versus beacon power for the cases where the eavesdropper i) follows the random mobility pattern described in the setup with parameter , ii) stays at the origin, and iii) follows the man in the middle strategy described in Section II-C, i.e., moves to the mid-point of its location estimates of nodes and . We can see that, compared to following a random mobility pattern, the eavesdropper can reduce the achievable secret key rate significantly by following this strategy. However, the rate still remains positive. We observe that the eavesdropper can also reduce the key rate by simply staying static at a certain favorable location, rather than moving randomly. However, in practice this may not be feasible for the eavesdropper, since by staying put, it will lose connection completely with the legitimate nodes in a large region.

V Conclusion

In this paper, we showed that relative localization information is an additional resource for generating secret key bits in mobile networks. We studied the information theoretic limits of secret key generation, and characterized lower and upper bounds on the key rates for the cases in which the nodes are or are not capable of observing their global locations. Focusing on the special case where the observation noise is i.i.d. Gaussian, we studied the beacon power asymptotics, and observed that, interestingly, when the eavesdropper cannot observe the angle information, the secret key rate grows unboundedly. The following research directions can be further investigated: 1) theoretical performance analysis of secret key generation in large networks, taking into account the recent advances in network information theoretic security, and 2) security analysis of various adversarial models, such as active jamming attacks, or impersonation attacks in unauthenticated networks.

Appendix A Proofs of Theorems in Section IV-A

A-A Proof of Theorem 2

We first provide three lemmas that will be useful when proving the theorem.

Lemma 1

Let and be random variables. Then,

(18)

If and are independent, then .

Proof

When and are not independent,

(19)

where (19) follows from the fact that . When and are independent, , implying the result.

Lemma 2

Let be a random variable such that , where . Let . Then, .

Proof

Assume , where . Let . Let us define , and . Note that,

since the centralized second moment is minimized around the mean. Also,

since , and . Therefore, it suffices to show that ,

(20)
(21)

i) First note that . Therefore, the condition (21) is satisfied for .
ii) For ,

(22)

where is the first derivative of at point . (22) follows from the fact that is a strictly concave function in the interval . Therefore, condition (21) is satisfied for .
iii) Combining the facts that is a strictly concave function of in the interval ; is linear; ; and , we can see that when . Therefore, condition (21) is satisfied for .
iv) When , and , therefore, condition (21) is satisfied. This concludes the proof.

Lemma 3

Let , be random variables. Then,

Proof

Note that

Since for any , ,

is satisfied for any , which completes the proof.

Now, we proceed as follows. Assume without loss of generality that . When ,

(23)
(24)

where (23) follows from the fact that forms a Markov chain, and (24) follows from the fact that all of the random variables have a stationary distribution, denoted as and , respectively. The second term in (24) can be found as

(25)

from the definition of . Now, we bound the first term in (24). Let us define

Then,

(26)

Note that for a given variance, Gaussian distribution maximizes the entropy. Therefore, the entropy of a Gaussian random variable that has a variance identical to that of will be an upper bound for (26). We proceed as follows.

(27)

where (27) follows from the fact that for any dependent random variables and , . We now find an upper bound on the first term of (27). Note that,

(28)

due to Lemma 1, since forms a Markov chain, and the fact that is independent of given . The first term in (28) is equal to

(29)

We bound the second term in (28) as follows. Let us define

Then,

(30)
(31)

where (30) follows due to the definitions of , and . (31) follows due to definition of , and the cosine law . Now we will apply Lemma 2 to bound (31). First, note that

(32)

where (32) follows from the fact that since is zero mean Gaussian, follows a Half-normal distribution with