Robust Wireless Fingerprinting: Generalizing Across Space and Time



Can we distinguish between two wireless transmitters sending exactly the same message, using the same protocol? The opportunity for doing so arises due to subtle nonlinear variations across transmitters, even those made by the same manufacturer. Since these effects are difficult to model explicitly, we investigate learning device fingerprints using complex-valued deep neural networks (DNNs) that take as input the complex baseband signal at the receiver. Such fingerprints should be robust to ID spoofing, and to distribution shifts across days and locations due to clock drift and variations in the wireless channel. In this paper, we point out that, unless proactively discouraged from doing so, DNNs learn these strong confounding features rather than the subtle nonlinear characteristics that are the basis for stable signatures. Thus, a network trained on data collected during one day performs poorly on a different day, and networks allowed access to post-preamble information rely on easily-spoofed ID fields. We propose and evaluate strategies, based on augmentation and estimation, to promote generalization across realizations of these confounding factors, using data from WiFi and ADS-B protocols. We conclude that, while DNN training has the advantage of not requiring explicit signal models, significant modeling insights are required to focus the learning on the effects we wish to capture.



I Introduction

An important tool in wireless security is a “fingerprint” capable of distinguishing between devices that transmit exactly the same message. This is possible due to subtle hardware imperfections that occur even in devices made by the same manufacturer [1]. Such fingerprints can serve as a powerful authentication tool at the physical layer, complementing conventional security schemes in higher layers of the networking stack. In this paper, we seek fingerprints that are robust to confounding factors in data collected over multiple days and locations, including the carrier frequency offset (CFO), which drifts over time, and the wireless channel, which depends on the propagation environment.

In the literature, fingerprints are often extracted via protocol-specific processing of the received wireless signal [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. We focus instead on an approach that is independent of the underlying protocol. Because the wireless signal is complex-valued, we employ convolutional neural networks (CNNs) with complex-valued parameters to learn fingerprints. When compared to prior work using real-valued CNNs [12, 13, 14], these networks have fewer degrees of freedom at the synaptic level, which has been observed to confer generalization benefits [15].

A key message of this paper is that the network learns the easiest set of features that it can in order to accomplish the desired task (in our case, discriminating between transmitters based on the received wireless signal), hence we must be extremely proactive in promoting robustness across effects that we do not want the network to lock on to. For instance, we would like the RF signature for a transmitter to be robust across different days and for different wireless channels. However, if we employ training data collected over a single day, the channel and CFO for a transmitter are relatively constant, and the CNN will lock onto these rather than to subtle nonlinear effects. This gives unreasonably excellent accuracy on test data collected over the same day, but disastrous results for data collected on a different day, when both the channel and the CFO (which drifts substantially over time) can be different. We show that model-based augmentation strategies can significantly improve robustness to these effects.

As another example, if we train a network to process the entire packet, it may choose to focus on fields that convey information regarding the transmitter ID, such as the MAC address in WiFi data, and the ICAO address in ADS-B signals (Automatic Dependent Surveillance-Broadcast, an air traffic control protocol). Since such fields can be easily spoofed by an adversary, we must be vigilant against locking on to them. We demonstrate that networks are indeed vulnerable to such involuntary “cheating”, and then show that restricting attention to just the preamble, which is common to all packets from all transmitters, suffices to obtain good accuracies. Our main contributions are summarized below.

Fig. 1: Block diagram of a wireless communication system. Subtle nonlinearities unique to each device can provide a fingerprint. However, easy-to-learn features such as the CFO and channel are not stable over time and location, affecting generalization.


  • We demonstrate that protocol-agnostic fingerprinting is possible using complex-valued CNNs, comparing design choices for data from two different wireless protocols: WiFi and ADS-B.

  • When making use of post-preamble information, we show that networks artificially inflate accuracies by relying on device ID fields present in these sections. We then focus on learning fingerprints from the preamble, which provides reasonably good performance despite its short length.

  • Using controlled emulations on a clean WiFi dataset, we show major pitfalls in this approach when training and testing on different days, due to the effect of propagation channels and frequency offsets which are far stronger than the nonlinear effects we seek to capture.

  • We develop augmentation strategies based on signal models for these effects, and evaluate performance against compensation techniques that explicitly try to undo them. We find that compensation works well only if the confounding factors are simple enough, like the CFO. For more complex effects, model-driven augmentation is essential for learning robust signatures.

II Background and Related Work

A generic model for a radio frequency (RF) wireless transmitted signal (shown in Fig. 1) is as follows:

$$x_{\mathrm{RF}}(t) = x_c(t)\cos(2\pi f_c t) - x_s(t)\sin(2\pi f_c t)$$

where $f_c$ denotes the carrier frequency, or the frequency of the electromagnetic wave that “carries” the information-bearing waveforms $x_c(t)$ (riding on the cosine of the carrier) and $x_s(t)$ (riding on the sine of the carrier). Typical parameters for WiFi, for example, are $f_c$ of 2.4 or 5.8 GHz, and $x_c(t)$, $x_s(t)$ having bandwidths of 20 MHz.

The receiver strips the carrier away to recover $x_c(t)$ and $x_s(t)$, and then processes them to decode the information bits that they carry. For a typical wireless channel, there are multiple paths from transmitter to receiver, so multiple delayed, attenuated and phase-shifted versions of the transmitted waveform sum up at the receiver. These transformations are best modeled by thinking of the information-bearing waveform as a complex-valued signal, $x(t) = x_c(t) + j\,x_s(t)$, where $j = \sqrt{-1}$. The effect of a wireless channel is then modeled as a complex-valued convolution. The carrier frequency used at the receiver is not precisely the same as at the transmitter, and the impact of such carrier frequency offset is also most conveniently modeled in the complex domain.

While RF processing is designed to produce as little distortion as possible, in practice, there are nonlinearities, typically with some characteristics unique to each transmitter because of manufacturing variations, which can in principle provide RF signatures. Variations in components such as digital-to-analog converters (DACs) and power amplifiers (PAs) are inevitable even for transmitters manufactured using exactly the same process. Transistors, resistors, inductors, and capacitors within a device vary around nominal values, typically within a designed level of tolerance, and the goal is to translate the resulting variations in transmitter characteristics into a device signature. We discuss here some example effects, depicted in Figure 2, that may contribute towards such a signature.

Fig. 2: (a) Example variations of PA nonlinearities across transmitters, (b) Differential nonlinearity caused by DAC, (c) Scatterplots of noisy 4-QAM constellation points with and without I-Q imbalance.
  • I-Q Imbalance: This results from mismatch in the gain and phase of the in-phase (I) and quadrature (Q) signal paths for upconversion. The phase of the cosine and sine of the carriers may not be offset by exactly $\pi/2$, and the path gains along the branches may not be equal.

  • Differential Nonlinearity (DNL) due to DAC: DNL is defined as the discrepancy between the ideal and obtained analog values of two adjacent digital codes due to circuit component non-idealities [16].

  • PA Nonlinearity: Power amplifiers are ideally linear, but start saturating at high input voltages. There is a significant literature on PA modeling [17, 18, 19, 20], as well as on the impact of PA nonlinearities on communication systems with high dynamic range such as OFDM [21, 22]. A common model is a memoryless polynomial fit (typically up to third order) of the form:

    $$v_{\mathrm{out}}(t) = a_1 v_{\mathrm{in}}(t) + a_2 v_{\mathrm{in}}^2(t) + a_3 v_{\mathrm{in}}^3(t)$$

    Recent promising results on wireless fingerprints for PA nonlinearities, extracted using CNNs, are reported in [23].
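To make the impairments listed above concrete, the following is a minimal NumPy sketch of two of them acting on complex baseband samples. It is an illustration under stated assumptions, not the model used to produce Fig. 2: the function names, the default gain and phase mismatch, and the coefficient values are hypothetical. The PA is written in its baseband-equivalent form, in which the third-order term of a passband polynomial appears as $|x|^2 x$ on the complex envelope.

```python
import numpy as np


def iq_imbalance(x, gain_db=0.5, phase_deg=2.0):
    """Apply a simple transmit-side I-Q imbalance to complex baseband samples.

    The Q branch is assumed to have a relative gain `gain_db` and a carrier
    phase error `phase_deg` with respect to the ideal sine carrier.
    Both defaults are illustrative values, not measured ones.
    """
    g = 10.0 ** (gain_db / 20.0)
    phi = np.deg2rad(phase_deg)
    i, q = x.real, x.imag
    # I*cos(wt) - g*Q*sin(wt + phi) rewritten on the ideal cos/sin carriers
    return (i - g * q * np.sin(phi)) + 1j * (g * q * np.cos(phi))


def pa_polynomial(x, a1=1.0, a3=-0.05):
    """Memoryless third-order PA model on the complex envelope.

    a1 is the linear gain; a negative a3 produces compression at high
    amplitudes. The coefficient values are hypothetical.
    """
    return a1 * x + a3 * np.abs(x) ** 2 * x
```

With zero gain and phase mismatch, or with `a3 = 0`, both functions reduce to the identity, which is a convenient sanity check when varying the parameters across emulated devices.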

Of course, we seek to devise DNNs that extract signatures based on a combination of characteristics such as those in Figure 2, while marginalizing over channels and CFOs. It is possible to extract signatures from either the transient (microsecond-length) signals transmitted during the on/off operation of devices, or via the steady-state packet information present in between the start and end transients. We focus here on work that employs the steady-state method since it is of more practical utility [4]. Such prior work can be divided into two categories: (i) approaches that use handcrafted features, and (ii) machine learning based techniques.

Traditional approaches: The first approach to device fingerprinting was in [2], albeit only for wired devices in wide area networks. The feature used in [2] was the clock skew, which was observed to be fairly consistent over time, but varied significantly across devices. This technique was extended by [5] to wireless local area networks where timestamps in IEEE 802.11 frames contain more precise information about the clock skew. However, [6] demonstrated deficiencies of the previous two studies, presenting a spoofing attack based on the clock-skew information generated by a fake access point. Despite the drift in CFO and the relative ease of spoofing it at the physical layer, recent proposals on CFO-based fingerprints include [9] and [11].

Machine learning based approaches: There are many papers over the past decade using machine learning to derive fingerprints. Much of this work involves significant protocol-specific preprocessing, in contrast to the protocol-agnostic approach considered in this paper. Examples include a $k$-nearest neighbor ($k$-NN) classifier in [4] based on spectral analysis of WiFi preambles, a support vector machine (SVM) in [3] based on demodulation error metrics such as frequency offset and I/Q offset, $k$-means clustering of features based on inter-arrival times of ADS-B messages [7], a neural network operating on WiFi inter-arrival times [8], and a real-valued CNN operating on the error signal obtained after subtracting out an estimated ideal signal from frequency-corrected received data [10]. Section V evaluates the robustness of our approach against protocol-specific estimation strategies, showing that estimation works well only for simple variations like the CFO. For more complex effects such as channel variations, the augmentation approach we study has a clear advantage.

Modern CNNs learning directly from I/Q data include [12, 13] for modulation classification, and [14] for device fingerprinting. However, this line of work employs real-valued networks, with real and imaginary parts of complex data treated as different channels. It has been previously noted that the complex-valued networks we use have generalization benefits over real-valued approaches [15], and our results in Section III corroborate this advantage.

Parts of this work previously appeared in our conference paper [24], which, to our knowledge, was the first to employ complex-valued CNNs for wireless fingerprinting. The present paper extends [24] by studying robustness across days and locations due to variations in the CFO and wireless channel. We note that our prior work precedes and is independent of [25], which also uses complex-valued networks, and claims to be “the first […] system able to fingerprint devices using unprocessed raw signals in a range of frequencies of interest.” In [26], channel-resilient fingerprinting was studied by modifying the transmitter using a finite impulse response (FIR) filter. Our work on channel resilience is based solely on modifying DNN training and does not involve transmitter-side hardware alterations.

III Complex-valued Representations

The subtle nonlinear effects discussed in the previous section are difficult to model explicitly, hence deep learning is a natural approach to teasing out transceiver signatures based on them. We explore the use of complex-valued neural networks for this purpose: these are well-matched to the complex baseband received signal. Such networks have previously been used for speech, music and vision tasks [27, 28]. Here, we learn device fingerprints for two different wireless protocols: WiFi and ADS-B.


Fig. 3: ModReLU and CReLU activation functions in the complex plane. ModReLU preserves the phase of all inputs outside a disc whose radius is set by a learned bias, while CReLU distorts all phases outside the first quadrant. Figure adapted from [28].
Fig. 4: Example complex-valued 1D CNN architecture for WiFi signals.

Data: We provide results for the following external databases:

  • WiFi data containing a mix of IEEE 802.11a (5 GHz band) and IEEE 802.11g (2.4 GHz band) packets from 19 commercial-off-the-shelf devices, without channel distortion.

  • ADS-B air traffic control signals (1090 MHz carrier, narrowband) collected in the wild from 100 airplanes.

We use available oversampled data for both protocols, with WiFi signals sampled at 200 MHz and ADS-B at 20 MHz. The length of the preamble is then 3200 samples for WiFi and 320 samples for ADS-B.

Architecture: For complex layers, we explore the following choices of activation functions, shown in Figure 3:

  • ModReLU - This function affects only the magnitude and preserves phase: $\mathrm{ModReLU}(z) = \max(|z| + b,\, 0)\, e^{j\angle z}$. Here $b$ is a learned bias.

  • CReLU - Here, separate ReLUs are applied to the real and imaginary parts of the input: $\mathrm{CReLU}(z) = \mathrm{ReLU}(\Re\{z\}) + j\,\mathrm{ReLU}(\Im\{z\})$. The phase of the output is therefore restricted to $[0, \pi/2]$.

    The loss in phase information can be potentially compensated by using wider filters (i.e. with a larger number of channels) capable of providing phase derotation.
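The two activation functions can be sketched in a few lines of NumPy. This is a minimal illustration using the standard definitions of ModReLU and CReLU; the function names and the small constant guarding against division by zero are assumptions of this sketch, and in a real network the bias `b` would be a trainable parameter rather than a fixed number.

```python
import numpy as np


def modrelu(z, b):
    """ModReLU: shrink or grow the magnitude by the bias b, keep the phase.

    Inputs with |z| + b <= 0 are mapped to zero; for negative b this is a
    disc of radius -b around the origin.
    """
    mag = np.abs(z)
    # 1e-12 avoids 0/0 at z = 0 (assumption of this sketch)
    scale = np.maximum(mag + b, 0.0) / np.maximum(mag, 1e-12)
    return scale * z


def crelu(z):
    """CReLU: independent ReLUs on the real and imaginary parts."""
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)
```

For example, `modrelu` leaves the angle of a large input unchanged while reducing its magnitude by `|b|`, whereas `crelu` maps any input in the second quadrant onto the positive imaginary axis.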

Fig. 5: Evolution of training accuracy over epochs for ModReLU and CReLU networks (ADS-B). ModReLU provides a small gain in train and test accuracies over CReLU, with similar convergence behavior.

Figure 4 depicts the complex-valued 1D CNN we use for WiFi signals, using as input the I/Q data at the receiver, restricted to the preamble. is used midway through the network to convert complex representations to real ones. The network architectures we use are listed below in compact form (similar to the notation in [29]):

  • ADS-B: .

  • WiFi: .

The notation should be read as follows:

where denotes a convolutional layer, a fully connected layer, and Avg a temporal averaging layer.

Complex backpropagation is performed using the framework of [28], taking partial derivatives of the cost with respect to the real and imaginary parts of each parameter. Networks are trained for 200 epochs with a batch size of 100, using the Adam optimizer with default hyperparameters and weight decay constant of . We normalize signals to unit power, and use 200 samples per device for training and 100 for testing for WiFi, and 400 samples per device for both training and testing for ADS-B.

Performance: We find that the ModReLU architecture outperforms CReLU (shown in Fig. 5), without any difference in convergence speed. Using the preamble alone, we obtain 99.62% fingerprinting accuracy for 19 WiFi devices, and 81.66% accuracy for 100 airplanes using the ADS-B protocol.

Dataset   Network type   Accuracy (%)   Total number of real parameters
ADS-B     Complex        81.66          128,400
ADS-B     Real           73.84          78,400
ADS-B     Real (1.4x)    73.25          133,680
ADS-B     Real (2x)      75.00          246,600
WiFi      Complex        99.62          262,719
WiFi      Real           97.50          162,319
WiFi      Real (1.4x)    97.61          278,399
WiFi      Real (2x)      97.94          512,519
TABLE I: Performance comparison between complex-valued and real-valued networks.
Fig. 6: Visualizations of the (a) first and (b) second convolutional layer for ADS-B (ModReLU architecture). Each row shows the input signal that maximizes the activation of a particular filter, computed using gradient ascent starting from random noise (with signals normalized to unit power at each step). Convolutional filters in the first layer span 2 input symbols; filters in the second layer span 6 symbols.

We compare the performance of complex-valued and real-valued networks in Table I. For real networks, we follow the approach of [12, 13, 14] in treating real and imaginary parts of input data as different channels. For a fair comparison, we consider real networks with different scaling factors for the number of channels. This is to account for the fact that a complex filter would contain twice as many parameters as an equivalent real filter. We find that the complex network outperforms all its real counterparts, with a gain over the best real network of 6.6 percentage points for ADS-B and 1.6 percentage points for WiFi.

Figure 6 depicts input signals that strongly activate filters in the first and second layer of the ADS-B architecture. Since device-specific nonlinear effects manifest primarily as short-term transitions of amplitude and phase, the filters in the first layer can capture these effects by spanning a small multiple of the symbol interval (2 symbols).

IV Resilience to ID Spoofing

Fig. 7: ADS-B packet structure. Top: Mode S. Bottom: Mode S Extended. While the first 16 symbols of both packet types are device-independent, the following 24 symbols are highly device-dependent.

This section studies the potential benefits of using the entire packet for fingerprinting. While this can yield averaging gains, we must be proactive against locking on to device ID fields which can be easily spoofed. We expect signatures learnt from subtle nonlinear features to be stable over time, in contrast to ID fields that are localized in time. We focus here on the ADS-B protocol and begin by describing its packet structure.

Fig. 8: Classification accuracies for ADS-B (100 devices) when using the entire packet. Here, we use architecture .

Packet structure: The ADS-B data we use contains packets of differing lengths (shown in Figure 7): 64 symbols for Mode S, and 120 symbols for Mode S Extended. Therefore, we prune all packets to a uniform length of 64 symbols. For both packet types, the first 16 symbols consist of a preamble common to all devices, while symbols 17–40 contain the ICAO address which serves as a unique device identifier. To determine whether networks lock on to this field, we consider the following scenarios with offset data:

  • an offset of zero,

  • a randomly chosen offset, and

  • a fixed offset where we choose the last 64 symbols.

Performance: Fig. 8 reports on results for each scenario. At first glance, using the entire packet appears to yield substantial gains: we obtain 99.29% accuracy when not using any offset. This is a 17 point improvement over the preamble-only accuracy reported in the previous section. However, performance actually drops in the scenarios with offsets, yielding 65.64% and 75.49% accuracy. The picture becomes clearer when we closely examine results for the two packet types. These have identical accuracies without any offset, but in all other scenarios, Mode S dominates performance. This temporal dependence indicates that the network is focusing on device IDs from the payload for Mode S. It is easy to obtain 99% accuracy by restricting attention to the ICAO address, which is a clear indicator of “cheating”.

A natural solution that comes to mind is to delete the symbols corresponding to the device ID. However, we again obtain artificially high accuracies. This is due to the presence of parity bits correlated with the ICAO address: the network is able to reconstruct a device identifier from the combination of parity and preamble sections. Another solution might be to set filters in the first layer of the network to span only 2 symbols, so that we avoid learning the device ID (which spans 24 symbols). This yields an accuracy of 97.28%, which is still substantially higher than the preamble-only scenario. Small kernel sizes in the first layer alone are not sufficient to prevent cheating: one just needs to look at the second layer to see that its filters actually extend over 6 symbols.

These results show that allowing a network access to packet payloads is unwise: networks involuntarily “cheat” whenever given the chance, ignoring device-specific nonlinearities in favor of easily spoofed ID fields. This behavior can be avoided by restricting attention to the preamble, and this is what we choose to do in the rest of this paper. We leave as an open issue the problem of certifiably sanitizing ID information from the remainder of the packet.

V Stability to Variations in Space and Time

In this section, we use the clean WiFi dataset for controlled experiments emulating the effect of frequency drift and channel variations. We show that these fluctuations can have a disastrous effect on performance and study compensation and augmentation strategies to promote robustness.

V-A Carrier Frequency Offset

We first examine robustness to carrier frequency offset (CFO), caused by frequency mismatch in the crystal oscillators at the transmitter and receiver. Since the CFO depends on the transmitter, it could potentially be used as a feature to fingerprint devices [3, 11]. However, this has the following key drawbacks:

  • Oscillator frequencies drift substantially over time, leading to an unstable signature. Training and test data collected over different days could contain different CFOs, which, as we show below, significantly degrades performance.

  • Since the offset depends on both the receiver and the transmitter, data collection with multiple receivers dampens its usefulness as a fingerprint.

  • The CFO can be easily spoofed by an adversary manipulating baseband signals.

Therefore, it is important for a network to avoid involuntary use of the CFO as a fingerprint. We investigate this by artificially inserting offsets in data, emulating an oscillator frequency tolerance of $\pm 20$ parts per million (ppm) as specified in the IEEE 802.11 standard [30]. We begin with an example where only the test data is offset.

Type of data augmentation   CFO in test set
                            None     Bernoulli   Uniform
None                        99.50    4.63        13.58
Bernoulli                   3.32     99.32       13.53
Uniform                     96.21    90.79       95.37
TABLE II: Accuracy (%) when only the test data is offset, with CFOs in the range (-20, 20) ppm. Augmenting training data with uniformly distributed CFOs helps confer robustness.

Offset in test data alone: We find that networks trained on clean data do not generalize to offset data, even when the offset is very small: accuracy drops to 4.6% at an offset of 20 ppm. In order to alleviate this, we augment the training set with randomly chosen CFOs and report results in Table II. We consider two types of random offsets: Bernoulli, drawn from $\{-20, +20\}$ ppm, and uniform, drawn from $(-20, 20)$ ppm, augmenting the size of the training set by 5x in each scenario.

This strategy can significantly help in learning robust fingerprints, but the type of augmentation matters: in particular, it is insufficient to augment with worst-case offsets alone. When we train with Bernoulli offsets, the network becomes robust to Bernoulli test offsets (99.3%), but fails to generalize to any offset smaller than 20 ppm, including an offset of zero. In contrast, when we augment data with uniformly chosen offsets, we obtain resilience (>90%) to all test set offsets in the desired range.
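The two augmentation types compared above can be sketched as follows. This is a minimal illustration, not the pipeline used for Table II: the function names are hypothetical, the carrier frequency of 2.4 GHz is an assumption (the dataset mixes 802.11a and 802.11g packets), and "5x" is interpreted here as five offset copies per packet.

```python
import numpy as np

FS = 200e6  # WiFi sampling rate stated in the text
FC = 2.4e9  # assumed carrier frequency used to convert ppm to Hz


def apply_cfo(x, ppm, fs=FS, fc=FC):
    """Rotate complex baseband samples by a carrier frequency offset in ppm."""
    n = np.arange(x.shape[-1])
    delta_f = ppm * 1e-6 * fc  # offset in Hz
    return x * np.exp(2j * np.pi * delta_f * n / fs)


def sample_cfo_ppm(rng, kind, size, max_ppm=20.0):
    """Draw offsets: worst-case only ("bernoulli") or anywhere in range ("uniform")."""
    if kind == "bernoulli":
        return rng.choice([-max_ppm, max_ppm], size=size)
    if kind == "uniform":
        return rng.uniform(-max_ppm, max_ppm, size=size)
    raise ValueError(f"unknown augmentation type: {kind}")


def augment_cfo(packets, kind, copies=5, seed=0):
    """Return `copies` offset versions of every packet (shape: N*copies x L)."""
    rng = np.random.default_rng(seed)
    out = []
    for x in packets:
        for ppm in sample_cfo_ppm(rng, kind, copies):
            out.append(apply_cfo(x, ppm))
    return np.stack(out)
```

Because the offset is a pure phase rotation, the augmented packets keep the same magnitude profile as the originals; only the phase trajectory across the preamble changes.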

“Different day” scenario: We now emulate collecting training data on one day and testing on another: we insert different “physical” offsets for each device, but fix the offset for all packets from a particular device. The offsets are randomly chosen in the range $(-40, 40)$ ppm (since both the transmitter and receiver oscillators can vary by $\pm 20$ ppm). Oscillator drift across days is realized via different random seeds for training and test offsets.

This setting makes it particularly easy for the network to focus on the CFO as a fingerprint: we obtain artificially high training accuracies (94.2%), but poor test set performance (9.7%) due to frequency drift. We now explore two strategies to restore performance: data augmentation with randomly chosen CFOs, and frequency compensation.

Augmentation: Table III reports on the efficacy of various CFO augmentation strategies, capable of increasing test accuracy to 87.1%. For training data, we find that the best augmentation technique is to use a different augmentation offset for each packet in a class, but the same set of offsets across classes, which discourages the network from learning the CFO as a means of distinguishing between classes. We term this an “orthogonal” strategy: we are trying to train in a direction “orthogonal” to the tendency to lock onto the “physical” CFO as a signature.
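The difference between the “random” and “orthogonal” strategies lies only in how the augmentation offsets are drawn. The sketch below illustrates this under assumptions: the function names are hypothetical, and the augmentation range of (-40, 40) ppm is taken to match the emulated physical offsets rather than stated in the text.

```python
import numpy as np


def random_offsets(rng, n_classes, n_packets, max_ppm=40.0):
    """Independent offset for every packet of every class."""
    return rng.uniform(-max_ppm, max_ppm, size=(n_classes, n_packets))


def orthogonal_offsets(rng, n_classes, n_packets, max_ppm=40.0):
    """One offset per packet index, with the same set reused for every class."""
    shared = rng.uniform(-max_ppm, max_ppm, size=n_packets)
    return np.tile(shared, (n_classes, 1))
```

Since every class sees exactly the same collection of augmentation offsets in the orthogonal case, the inserted offset carries no information about the class label.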

A novel finding is that data augmentation for testing leads to significant performance gains when we add up soft outputs across augmented versions of each test packet. The best result is obtained when we insert a different randomly chosen CFO for each of 100 copies of each test data packet, and then sum up the softmax outputs across the augmented data. We find that averaging of logits also improves performance, but not to the extent of the softmax average. This is due to the following: for nuisance parameter $\theta$ (e.g. CFO, channel) and class $k$, suppose the network outputs likelihood $p(k \mid x, \theta)$ for input $x$. The desired prediction can then be computed by averaging over different realizations of $\theta$:

$$p(k \mid x) = \mathbb{E}_{\theta}\left[\, p(k \mid x, \theta) \,\right] \approx \frac{1}{N} \sum_{i=1}^{N} p(k \mid x, \theta_i) \qquad (1)$$

assuming a uniform prior for $\theta$. This essentially translates to averaging of softmax outputs over a sufficiently large number of augmented copies of test data.
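Test-time augmentation with softmax averaging can be sketched as below. This is a minimal illustration rather than the evaluation code behind Table III: `model_logits` and `augment` are hypothetical callables standing in for the trained network and for a nuisance-parameter sampler (for example, a random CFO or channel).

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def predict_with_tta(model_logits, x, augment, n_aug=100, seed=0):
    """Average softmax outputs over n_aug augmented copies of one test packet.

    model_logits: callable mapping a packet to a vector of class logits.
    augment:      callable (packet, rng) -> packet with a random nuisance applied.
    Returns the predicted class index and the averaged class probabilities.
    """
    rng = np.random.default_rng(seed)
    probs = np.stack(
        [softmax(model_logits(augment(x, rng))) for _ in range(n_aug)]
    )
    avg = probs.mean(axis=0)  # Monte Carlo average over nuisance realizations
    return int(np.argmax(avg)), avg
```

Averaging probabilities rather than logits matches the marginalization over the nuisance parameter described above; averaging logits instead corresponds to a geometric rather than arithmetic mean of the per-copy probabilities.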

Training augmentation        Test time augmentation
Type         Copies          None     5        20       100
None         -               9.68     7.84     8.74     8.47
Random       5               74.21    71.84    74.21    77.37
Random       20              72.79    75.84    78.05    80.05
Orthogonal   5               69.58    75.11    81.05    83.63
Orthogonal   20              82.37    82.32    86.21    87.11
TABLE III: Accuracy (%) in the “different day” CFO setting, with CFOs in the range (-40, 40) ppm. “Random” training augmentation uses a different offset for each packet, while the “orthogonal” type uses the same set of offsets across classes.

Frequency compensation: We can also estimate and correct the offset using knowledge of the periodic structure of the preamble. Consider a periodic signal $x(t)$ with period $T$, and frequency offset $\Delta f$ resulting in $y(t) = x(t)\, e^{j 2\pi \Delta f t}$. Since we know that $x(t + T) = x(t)$, the CFO can be estimated by correlating $y(t)$ with its shifted version:

$$\widehat{\Delta f} = \frac{1}{2\pi T}\, \angle \left( \sum_{t} y^{*}(t)\, y(t + T) \right)$$

We follow a two-step approach [31] involving a coarse estimate from the 802.11 short training sequence ($T = 0.8\ \mu\mathrm{s}$) and then a fine estimate from the long training field ($T = 3.2\ \mu\mathrm{s}$). This method restores accuracy to 97.7%, and, as shown in Fig. 9, does about 6.2% better than augmentation.
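The correlation-based estimator can be sketched in NumPy as follows. This is a minimal single-step illustration on a generic periodic signal, not the two-step 802.11 procedure of [31]; the function names are hypothetical, and the period is expressed in samples.

```python
import numpy as np


def estimate_cfo(y, period, fs):
    """Estimate a frequency offset (Hz) from a signal with known periodicity.

    y:      received complex baseband samples of a periodic waveform
    period: period of the transmitted waveform, in samples
    fs:     sampling rate in Hz
    The estimate is unambiguous only for |offset| < fs / (2 * period).
    """
    # np.vdot conjugates its first argument: sum of conj(y[n]) * y[n + period]
    corr = np.vdot(y[:-period], y[period:])
    return np.angle(corr) * fs / (2 * np.pi * period)


def correct_cfo(y, delta_f, fs):
    """Undo a frequency offset of delta_f Hz."""
    n = np.arange(len(y))
    return y * np.exp(-2j * np.pi * delta_f * n / fs)
```

At the 200 MHz sampling rate used for the WiFi data, periods of 0.8 and 3.2 microseconds correspond to 160 and 640 samples. A shorter period gives a wider unambiguous range but a noisier estimate, which is the motivation for a coarse step followed by a fine one.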

Residual approach: An interesting way to combine the above two strategies is by excising the transmitted message to obtain a residual signal containing device nonlinearities. Using the estimated CFO and known preamble sequence, we can compute an ideal noiseless reconstruction $\hat{y}(t)$ of the received signal $y(t)$. The residual noise $y(t) - \hat{y}(t)$ can then be used as input to the CNN. While this alone is not sufficient to restore performance across days (the noise signal still contains CFO effects), we can use a combination of the residual technique and augmentation to obtain an improvement over pure augmentation, as shown in Fig. 9. Stripping out the message in this manner makes it easier for the network to learn nonlinear signatures.

There is a clear tradeoff between the different approaches considered: CFO estimation is less resource-intensive, but it requires detailed knowledge of the underlying protocol. Augmentation is, in contrast, protocol-agnostic.

Fig. 9: Comparison of frequency compensation and augmentation in the “different day” CFO scenario as we increase the number of training augmentations. The test set is augmented by 100x throughout. The baseline corresponds to a network trained without any augmentation or compensation.

V-B Multipath Channels

Tap $k$                     1      2      3      4      5      6       7
Delay $\tau_k$ (ns)         0      30     70     90     110    190     410
Relative power $P_k$ (dB)   0.0    -1.0   -2.0   -3.0   -8.0   -17.2   -20.8
TABLE IV: Power-delay profile for the EPA multipath fading model. Tap amplitudes are Rayleigh distributed with variance $P_k$.

The wireless channel is another important source of distribution shift between training and test data. Since multipath components in the channel depend on propagation geometry, a network that locks on to the channel will fail to generalize to test data collected on a different day or location. If the training data does not span a sufficiently diverse set of geometries, it could contain channels that are highly correlated with the transmitter ID, necessitating the use of channel augmentation or equalization strategies to improve robustness.

Training augmentation        Test time augmentation
Type         Copies          None     1        5        20       100
None         -               5.74     6.74     7.26     7.21     7.26
Random       5               39.58    39.79    54.05    59.84    62.68
Random       20              54.05    52.84    63.21    67.68    68.47
Orthogonal   5               41.16    42.16    52.89    56.68    58.68
Orthogonal   20              56.16    54.74    66.47    71.00    71.84
TABLE V: Accuracy (%) in the “different day” channel setting when we train on 2 days and test on a third day. “Random” augmentation uses a randomly drawn channel for each packet, while the “orthogonal” type uses the same set of channels across classes.
Fig. 10: Plots showing how test augmentation affects the histogram of softmax outputs (averaged over augmentations) for data from two specific classes (4 and 7), in the “different day” channel setting: (a) no test augmentation, (b) 10 test augmentations, (c) 100 test augmentations. Histograms are normalized to be probability densities. As the number of test augmentations increases, the probability of correct prediction shifts towards 1.

We study the impact of multipath on fingerprinting using a Rayleigh fading model [32] with $K = 7$ multipath components:

$$h(t) = \sum_{k=1}^{K} a_k\, \delta(t - \tau_k)$$

where $a_k \sim \mathcal{CN}(0, P_k)$, and $\delta(\cdot)$ is the Dirac delta function. We use the Extended Pedestrian A (EPA) profile, a well-known statistical channel model used in LTE system testing [33]. As shown in Table IV, this profile quantifies the delays $\tau_k$ and relative powers $P_k$ of the multipath components.
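A random channel realization from the EPA profile of Table IV can be drawn as in the sketch below. It is an illustration under assumptions: the function names are hypothetical, tap delays are rounded to the nearest sample at the 200 MHz WiFi sampling rate, and the tap powers are normalized to unit total average energy, which the text does not specify.

```python
import numpy as np

# Power-delay profile from Table IV
EPA_DELAYS_NS = np.array([0, 30, 70, 90, 110, 190, 410])
EPA_POWERS_DB = np.array([0.0, -1.0, -2.0, -3.0, -8.0, -17.2, -20.8])


def draw_epa_channel(rng, fs=200e6):
    """Draw one discrete-time Rayleigh-fading impulse response from the EPA profile."""
    tap_idx = np.round(EPA_DELAYS_NS * 1e-9 * fs).astype(int)
    power = 10.0 ** (EPA_POWERS_DB / 10.0)
    power = power / power.sum()  # unit average energy (assumption of this sketch)
    # Circularly symmetric complex Gaussian gains with variance power[k]
    gains = np.sqrt(power / 2.0) * (
        rng.standard_normal(len(power)) + 1j * rng.standard_normal(len(power))
    )
    h = np.zeros(tap_idx.max() + 1, dtype=complex)
    h[tap_idx] = gains
    return h


def apply_channel(x, h):
    """Convolve a packet with the channel and keep the original length."""
    return np.convolve(x, h)[: len(x)]
```

At 200 MHz the sample period is 5 ns, so the seven delays fall exactly on sample indices 0, 6, 14, 18, 22, 38 and 82. Emulating a “different day” then amounts to drawing one such channel per device with a different random seed for training and test data.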

“Different day” scenario: We investigate training and testing on different days similar to prior CFO experiments. Using the EPA profile, we emulate different channels across classes but retain the same channel for all packets in a class. With single day training, we get excellent performance when testing on the same day (98%), but very poor accuracy if we test on a different day (5.8%). This clearly indicates a lack of robustness to channel variations, with the network involuntarily locking on to the channel as a means of discriminating between classes.

Fig. 11: Comparison of channel equalization and augmentation as we increase the number of days over which training data is collected (with the size of the training set kept constant). Baseline performance is reported for a network trained without augmentation or equalization.

Augmentation: We find that channel augmentation helps, but accuracy increases only to 47.8% in the “train on one day, test on another” setting. We can boost performance to 71.8% if we are allowed access to training data collected over 2 days (without increasing the size of the training set) and test on a third day, as shown in Table V. Note that accuracy without augmentation is still low. If training data spans 3 days, augmentation improves accuracy even further to 79.7%.

This phenomenon can be understood by modeling channel variations in the frequency domain. Suppose transmitter $i$ sends message $X_i(f)$ over “physical” channel $H_p(f)$,

$$Y(f) = H_p(f) \, X_i(f),$$

and we augment with randomly chosen channels $H_a(f)$:

$$Y_a(f) = H_a(f) \, H_p(f) \, X_i(f).$$

The effective channel $H_a(f) H_p(f)$ will still contain all the nulls of $H_p(f)$, which could potentially be correlated with the transmitter ID. Thus, augmentation alone cannot completely remove the effect of the underlying physical channel. Access to more varied training data, when combined with augmentation, increases the diversity of the overall channel that the network sees.
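The persistence of nulls under augmentation can be checked numerically. The sketch below is illustrative only: the physical channel, the position of its null and the augmentation channels are all synthetic values chosen for the example.

```python
import numpy as np

n_bins = 64
rng = np.random.default_rng(1)

# Hypothetical physical channel response with a null forced at bin 10.
h_phys = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
h_phys[10] = 0.0

# Five random augmentation channels, one row per augmentation.
h_aug = rng.standard_normal((5, n_bins)) + 1j * rng.standard_normal((5, n_bins))

# Cascading two channels multiplies their frequency responses.
h_eff = h_aug * h_phys

# The null of the physical channel is present in every augmented copy.
null_survives = bool(np.all(h_eff[:, 10] == 0))
```

Whatever augmentation channel is drawn, the product is zero wherever the physical channel is zero, so the null location remains available to the network as a potential (unwanted) feature.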

The above results are achieved using 20 training and 100 test augmentations (with soft outputs added up over the 100 copies of each test packet). As before, we find that the “orthogonal” approach works best: using the same set of channels across classes discourages the network from learning to use the channel as a fingerprint. Fig. 10 illustrates the impact of test-time augmentation on the distribution of soft outputs for two sample classes. If we do not augment the test set, many samples from class 4 are misclassified as class 7 (first row of Fig. 10(a)). As the number of test augmentations increases (Fig. 10(b), 10(c)), we get increasingly precise estimates of the desired prediction (1), causing the soft output for the true class to shift towards 1, and that for the competing class towards 0.
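The soft combining step at test time can be sketched as follows. This is a minimal illustration under stated assumptions, not the experimental code: `model` stands for any callable that maps a complex baseband packet to a vector of logits, `toy_model` is a placeholder with no learned parameters, and the random 3-tap channels are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def apply_channel(x, h):
    """Convolve a complex packet with channel taps, keeping the original length."""
    return np.convolve(x, h)[: x.size]


def predict_with_test_augmentation(model, x, channels):
    """Average softmax outputs over augmented copies of one packet."""
    probs = np.stack([softmax(model(apply_channel(x, h))) for h in channels])
    avg = probs.mean(axis=0)
    return int(np.argmax(avg)), avg


def toy_model(x):
    # Placeholder for a trained network: three logits from simple statistics.
    return np.array([np.abs(x).mean(), np.real(x).mean(), np.imag(x).mean()])


rng = np.random.default_rng(3)
packet = rng.standard_normal(128) + 1j * rng.standard_normal(128)
channels = [rng.standard_normal(3) + 1j * rng.standard_normal(3) for _ in range(10)]
label, avg_probs = predict_with_test_augmentation(toy_model, packet, channels)
```

Summing or averaging the soft outputs gives the same decision, since the two differ only by a constant factor.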

Equalization: Another strategy to remove channel influence is to equalize signals using the long training field of the WiFi preamble. We equalize data in the frequency domain and compare results with augmentation in Fig. 11. Each experiment is performed with 5 different seeds, with error bars denoting one standard deviation from the mean. We find that equalization performs much worse than channel augmentation, with a performance gap of 26.5% even with 20 training days. The equalization noise is sufficient to swamp out the nonlinear characteristics that we are interested in.
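For readers unfamiliar with frequency-domain equalization, a generic version can be sketched as below. This is an assumption-laden illustration rather than the equalizer used in the experiments: it uses a per-subcarrier least-squares channel estimate followed by zero-forcing, and a random BPSK sequence stands in for the actual long training field.

```python
import numpy as np


def ls_channel_estimate(rx_training, known_training, eps=1e-12):
    """Per-subcarrier least-squares estimate H = Y / X on occupied subcarriers."""
    h_est = np.zeros(rx_training.shape, dtype=complex)
    used = np.abs(known_training) > eps
    h_est[used] = rx_training[used] / known_training[used]
    return h_est, used


def zero_forcing_equalize(rx_symbol, h_est, used, eps=1e-12):
    """Divide out the estimated channel on subcarriers where it is usable."""
    out = np.zeros(rx_symbol.shape, dtype=complex)
    ok = used & (np.abs(h_est) > eps)
    out[ok] = rx_symbol[ok] / h_est[ok]
    return out


rng = np.random.default_rng(2)
n = 64
known = rng.choice([-1.0, 1.0], size=n).astype(complex)  # stand-in training symbol
data = rng.choice([-1.0, 1.0], size=n).astype(complex)   # stand-in data symbol
h_true = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Noiseless case: the equalizer recovers the data symbol.
h_est, used = ls_channel_estimate(h_true * known, known)
recovered = zero_forcing_equalize(h_true * data, h_est, used)
```

In the noiseless case shown here the data is recovered exactly. With noise in the training symbol, the estimate is perturbed and the division amplifies that error on weak subcarriers, which is the kind of equalization noise discussed above.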

Fig. 12: Performance of training augmentation across days when both the CFO and channel vary. We use orthogonal augmentation for channels and the random method for CFOs.
(a) Effect of increasing training augmentations.
(b) Effect of increasing test augmentations.
Fig. 13: Accuracy as a function of the amount of augmentation when both the CFO and channel fluctuate. We augment the CFO and channel by equal amounts, with the x-axis denoting the number of augmentations for each.

Residual approach: As in the previous section, we compute a noiseless reconstruction of the received signal by convolving the estimated channel with the known preamble, and then subtract it out to obtain residual noise. When combined with augmentation, we obtain accuracies that are competitive with, but not better than, pure augmentation, as shown in Fig. 11. The noise in channel estimation prevents the residual method from offering the clear accuracy advantage that it had in the CFO scenario.
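The residual computation can be sketched as follows. This is a minimal illustration, not the pipeline used in the experiments: the channel is estimated here by time-domain least squares with an assumed number of taps, and the preamble is a random placeholder sequence rather than the WiFi preamble.

```python
import numpy as np


def convolution_matrix(preamble, n_taps):
    """Matrix X with X @ h equal to np.convolve(preamble, h)[:len(preamble)]."""
    n = preamble.size
    X = np.zeros((n, n_taps), dtype=complex)
    for k in range(n_taps):
        X[k:, k] = preamble[: n - k]
    return X


def preamble_residual(rx, preamble, n_taps):
    """Estimate the channel, reconstruct the clean preamble, and subtract it."""
    X = convolution_matrix(preamble, n_taps)
    h_est, *_ = np.linalg.lstsq(X, rx, rcond=None)
    reconstruction = X @ h_est
    return rx - reconstruction, h_est


rng = np.random.default_rng(4)
preamble = rng.standard_normal(160) + 1j * rng.standard_normal(160)  # placeholder
h_true = np.array([1.0 + 0.2j, 0.3 - 0.1j, 0.05j])
rx = np.convolve(preamble, h_true)[: preamble.size]  # ideal linear transmitter

residual, h_est = preamble_residual(rx, preamble, n_taps=3)
```

For an ideal linear transmitter and no noise, the residual is zero. Any transmitter nonlinearity that the linear channel model cannot explain remains in the residual, together with channel estimation error, which is why estimation noise limits this approach.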

Overall, augmentation is the best of the three considered strategies for making networks insensitive to channel effects: with 10 training days, it can restore accuracy to as high as 97.7%.

V-C Combination of Channel and Carrier Offsets

Lastly, we focus on a combination of channel and carrier offsets across different days. This is a harsher and more realistic setting than prior experiments, with test set accuracy no better than random guessing (5%) even if we collect training data over 20 days.

Augmentation: We explore data augmentation with different numbers of augmented CFOs and channels, and report results in Figs. 12 and 13(a). We find that equal numbers of augmented CFOs and channels work well: this improves accuracy from 5% to 90.10% with 20 training days. For test augmentation to yield benefits, the number of augmentations is important: as shown in Fig. 13(b), if we augment test data only 2 times, we observe a drop in accuracy. This is because the Bayesian average (1) requires a large number of realizations of the two nuisance parameters (CFO, channel) in order to be accurate.
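Joint augmentation of the two nuisance parameters can be sketched as below. This is illustrative only: the index-by-index pairing of channels and CFOs, the 20 MHz sampling rate, the CFO range and the 3-tap random channels are assumptions, and the function names are hypothetical.

```python
import numpy as np


def apply_cfo(x, cfo_hz, fs_hz):
    """Rotate the phase of each sample by 2 pi * cfo * n / fs."""
    n = np.arange(x.size)
    return x * np.exp(2j * np.pi * cfo_hz * n / fs_hz)


def augment(x, cfos_hz, channels, fs_hz=20e6):
    """Return one augmented copy per (channel, CFO) pair, paired index by index."""
    copies = []
    for h, cfo in zip(channels, cfos_hz):
        y = np.convolve(x, h)[: x.size]  # emulated multipath channel
        copies.append(apply_cfo(y, cfo, fs_hz))
    return np.stack(copies)


rng = np.random.default_rng(5)
packet = rng.standard_normal(256) + 1j * rng.standard_normal(256)
channels = [rng.standard_normal(3) + 1j * rng.standard_normal(3) for _ in range(8)]
cfos = rng.uniform(-50e3, 50e3, size=8)  # assumed CFO range in Hz
augmented = augment(packet, cfos, channels)
```

Each row of `augmented` is one realization of the (CFO, channel) pair. Averaging the network's soft outputs over these rows approximates the Bayesian average, and the approximation improves as the number of realizations grows.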

Estimation: Table VI compares pure augmentation with the residual approach and with a mix of estimation and augmentation. We find that equalization, when combined with either CFO compensation or augmentation, results in only 10% accuracy, and so we do not include it in the comparison. The best result is obtained by a combination of CFO compensation and channel augmentation for both training and test sets, with competitive performance from pure augmentation when the number of days is large.

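| Training method          | 2 days | 5 days | 10 days | 20 days |
|--------------------------|--------|--------|---------|---------|
| Residual + augmentation  | 19.11  | 26.21  | 67.50   | 78.95   |
| Pure augmentation        | 24.90  | 49.36  | 77.83   | 90.10   |
| CFO comp. + channel aug. | 33.96  | 62.63  | 88.96   | 91.40   |

TABLE VI: Comparison of augmentation, compensation and the residual approach when both the CFO and channel vary. Entries are test accuracy (%) for each number of training days.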

VI Conclusions

While complex-valued CNNs are a promising tool for learning RF signatures, blind adoption of these networks is dangerous due to confounding factors that impede generalization across space and time. We have shown that training augmentation tied to the physical phenomena driving these effects is a critical tool for learning robust signatures. A novel finding is that test augmentation, with soft combining of likelihoods across augmented data, yields substantial performance gains. An alternative to augmentation is to estimate and undo the effects of confounding factors using detailed, protocol-specific models, but our results indicate that residual errors from such a classical approach may be enough to swamp out the weaker nonlinear effects that constitute a stable signature. While a judicious combination of estimation and augmentation can confer robustness, using augmentation alone is attractive because it is a powerful general-purpose strategy which requires minimal protocol modeling. An important open issue for future work is to investigate the fundamental limits of robust fingerprinting, in order to characterize how far we can go in terms of the number of devices that can be reliably distinguished. To this end, it is also of interest to develop provably robust methods of fusing information from preamble and post-preamble sections of data, in order to utilize all the available data.


This work was funded in part by DARPA under the AFRL contract number FA8750-18-C-0149, by ARO under grant W911NF-19-1-0053, and by the National Science Foundation under grant CIF-1909320. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or Air Force Research Laboratory or ARO or the U.S. Government. The authors gratefully acknowledge research discussions with collaborators at Teledyne Scientific, including Mark Peot, Laura Bradway, Karen Zachary and Michael Papazoglou.


  1. K. A. Remley, C. A. Grosvenor, R. T. Johnk, D. R. Novotny, P. D. Hale, M. D. McKinley, A. Karygiannis, and E. Antonakakis, “Electromagnetic signatures of WLAN cards and network security,” in Proceedings of the Fifth IEEE International Symposium on Signal Processing and Information Technology, December 2005, pp. 484–488.
  2. T. Kohno, A. Broido, and K. C. Claffy, “Remote physical device fingerprinting,” IEEE Transactions on Dependable and Secure Computing, vol. 2, no. 2, pp. 93–108, April 2005.
  3. V. Brik, S. Banerjee, M. Gruteser, and S. Oh, “Wireless device identification with radiometric signatures,” in Proceedings of the 14th ACM International Conference on Mobile Computing and Networking, 2008, pp. 116–127.
  4. I. O. Kennedy, P. Scanlon, F. J. Mullany, M. M. Buddhikot, K. E. Nolan, and T. W. Rondeau, “Radio transmitter fingerprinting: A steady state frequency domain approach,” in 2008 IEEE 68th Vehicular Technology Conference, 2008, pp. 1–5.
  5. S. Jana and S. K. Kasera, “On fast and accurate detection of unauthorized wireless access points using clock skews,” IEEE Transactions on Mobile Computing, vol. 9, no. 3, pp. 449–462, 2010.
  6. C. Arackaparambil, S. Bratus, A. Shubina, and D. Kotz, “On the reliability of wireless fingerprinting using clock skews,” in Proceedings of the 3rd ACM Conference on Wireless Network Security, 2010, pp. 169–174.
  7. M. Strohmeier and I. Martinovic, “On passive data link layer fingerprinting of aircraft transponders,” in Proceedings of the First ACM Workshop on Cyber-Physical Systems-Security and/or PrivaCy, 2015, pp. 1–9.
  8. S. V. Radhakrishnan, A. S. Uluagac, and R. Beyah, “GTID: A technique for physical device and device type fingerprinting,” IEEE Transactions on Dependable and Secure Computing, vol. 12, no. 5, pp. 519–532, 2015.
  9. M. Leonardi, L. Di Gregorio, and D. Di Fausto, “Air traffic security: Aircraft classification using ADS-B message’s phase-pattern,” Aerospace, vol. 4, no. 4, p. 51, 2017.
  10. K. Merchant, S. Revay, G. Stantchev, and B. Nousain, “Deep learning for RF device fingerprinting in cognitive communication networks,” IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 160–167, 2018.
  11. J. Hua, H. Sun, Z. Shen, Z. Qian, and S. Zhong, “Accurate and efficient wireless device fingerprinting using channel state information,” in IEEE International Conference on Computer Communications, 2018, pp. 1700–1708.
  12. T. J. O’Shea, J. Corgan, and T. C. Clancy, “Convolutional radio modulation recognition networks,” in International Conference on Engineering Applications of Neural Networks, 2016, pp. 213–226.
  13. T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Transactions on Cognitive Communications and Networking, vol. 3, no. 4, pp. 563–575, 2017.
  14. K. Sankhe, M. Belgiovine, F. Zhou, S. Riyaz, S. Ioannidis, and K. Chowdhury, “ORACLE: Optimized Radio clAssification through Convolutional neuraL nEtworks,” in IEEE International Conference on Computer Communications, 2019.
  15. A. Hirose and S. Yoshida, “Generalization characteristics of complex-valued feedforward neural networks in relation to signal coherence,” IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 4, pp. 541–551, 2012.
  16. K. R. Lakshmikumar, R. A. Hadaway, and M. A. Copeland, “Characterisation and modeling of mismatch in MOS transistors for precision analog design,” IEEE Journal of Solid-State Circuits, vol. 21, no. 6, pp. 1057–1066, December 1986.
  17. A. A. M. Saleh, “Frequency-independent and frequency-dependent nonlinear models of TWT amplifiers,” IEEE Transactions on Communications, vol. 29, no. 11, pp. 1715–1720, November 1981.
  18. A. Zhu and T. J. Brazil, “Behavioral modeling of RF power amplifiers based on pruned Volterra series,” IEEE Microwave and Wireless Components Letters, vol. 14, no. 12, pp. 563–565, December 2004.
  19. H. Ku and J. S. Kenney, “Behavioral modeling of nonlinear RF power amplifiers considering memory effects,” IEEE Transactions on Microwave Theory and Techniques, vol. 51, no. 12, pp. 2495–2504, December 2003.
  20. J. C. Pedro and S. A. Maas, “A comparative overview of microwave and wireless power-amplifier behavioral modeling approaches,” IEEE Transactions on Microwave Theory and Techniques, vol. 53, no. 4, pp. 1150–1163, April 2005.
  21. E. Costa, M. Midrio, and S. Pupolin, “Impact of amplifier nonlinearities on OFDM transmission system performance,” IEEE Communications Letters, vol. 3, no. 2, pp. 37–39, February 1999.
  22. S. Merchan, A. G. Armada, and J. L. Garcia, “OFDM performance in amplifier nonlinearity,” IEEE Transactions on Broadcasting, vol. 44, no. 1, pp. 106–114, March 1998.
  23. S. S. Hanna and D. Cabric, “Deep learning based transmitter identification using power amplifier nonlinearity,” in International Conference on Computing, Networking and Communications (ICNC), February 2019, pp. 674–680.
  24. S. Gopalakrishnan, M. Cekic, and U. Madhow, “Robust wireless fingerprinting via complex-valued neural networks,” in IEEE Global Communications Conference (Globecom), Waikoloa, HI, Dec. 2019. ArXiv:1905.09388.
  25. I. Agadakos, N. Agadakos, J. Polakis, and M. R. Amer, “Deep complex networks for protocol-agnostic radio frequency device fingerprinting in the wild,” arXiv preprint arXiv:1909.08703, 2019.
  26. F. Restuccia, S. D’Oro, A. Al-Shawabka, M. Belgiovine, L. Angioloni, S. Ioannidis, K. Chowdhury, and T. Melodia, “DeepRadioID: Real-time channel-resilient optimization of deep learning-based radio fingerprinting algorithms,” in Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing. ACM, 2019, pp. 51–60.
  27. S. Wisdom, T. Powers, J. Hershey, J. Le Roux, and L. Atlas, “Full-capacity unitary recurrent neural networks,” in Advances in Neural Information Processing Systems, 2016, pp. 4880–4888.
  28. C. Trabelsi, O. Bilaniuk, Y. Zhang, D. Serdyuk, S. Subramanian, J. F. Santos, S. Mehri, N. Rostamzadeh, Y. Bengio, and C. J. Pal, “Deep complex networks,” in International Conference on Learning Representations, 2018.
  29. D. Cireşan, U. Meier, and J. Schmidhuber, “Multi-column deep neural networks for image classification,” in IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 3642–3649.
  30. IEEE Std 802.11a, Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications: High Speed Physical layer in the 5 GHz band, 1999.
  31. E. Sourour, H. El-Ghoroury, and D. McNeill, “Frequency offset estimation and correction in the IEEE 802.11a WLAN,” in IEEE 60th Vehicular Technology Conference, vol. 7. IEEE, 2004, pp. 4923–4927.
  32. T. S. Rappaport et al., Wireless Communications: Principles and Practice, 2nd ed. Prentice Hall PTR, New Jersey, 1996.
  33. 3GPP TS 36.101, LTE; Evolved Universal Terrestrial Radio Access (E-UTRA); User Equipment (UE) Radio Transmission and Reception. Version 11.2.0, release 11, 2012.