Fingerprint template protection
using minutia-pair spectral representations
Storage of biometric data requires some form of template protection in order to preserve the privacy of people enrolled in a biometric database. One approach is to use a Helper Data System. Here it is necessary to transform the raw biometric measurement into a fixed-length representation. In this paper we extend the spectral function approach of Stanko and Škorić, which provides such a fixed-length representation for fingerprints. First, we introduce a new spectral function that captures different information from the minutia orientations. It is complementary to the original spectral function, and we use both of them to extract information from a fingerprint image. Second, we construct a helper data system consisting of zero-leakage quantisation followed by the Code Offset Method. We show empirical data which demonstrates that applying our helper data system causes only a small performance penalty compared to fingerprint authentication based on the unprotected spectral functions.
I-A Biometric template protection
Biometric authentication has become popular because of its convenience. Biometrics cannot be forgotten or left at home. Although biometric data is not exactly secret (we are leaving a trail of fingerprints, DNA etc.), it is important to protect biometric data for privacy reasons. Unprotected storage of biometric data could reveal medical conditions and would allow cross-matching of entries in different databases. Large-scale availability of unprotected biometric data would make it easier for malevolent parties to leave misleading traces at crime scenes (e.g. artificial fingerprints, synthesized DNA). One of the easiest ways to properly protect a biometric database against breaches and insider attacks (scenarios where the attacker has access to decryption keys) is to store biometrics in hashed form, just like passwords. An error-correction step has to be added to get rid of the measurement noise. To prevent critical leakage from the error-correction redundancy data, one uses a Helper Data System (HDS), for instance a Fuzzy Extractor or a Secure Sketch. The best known and simplest HDS scheme is the code-offset method (COM). The COM utilizes a linear binary error-correction code and thus requires a fixed-length representation of the biometric measurement. Such a representation is not straightforward when the measurement noise can cause features of the biometric to appear/disappear. For instance, some minutiae may not be detected in every image captured from the same finger.
A fixed-length representation called spectral minutiae was introduced by Xu et al. For every detected minutia of sufficient quality, the method evaluates a Fourier-like spectral function on a fixed-size two-dimensional grid; the contributions from the different minutiae are added up. Disappearance of minutiae or appearance of new ones does not affect the size of this representation.
One of the drawbacks of Xu et al.’s construction is that phase information is discarded in order to obtain translation invariance. Nandakumar proposed a variant which does not discard the phase information. However, it reveals personalised reliability data, which makes it difficult to use in a privacy-preserving scheme.
A minutia-pair based variant of Xu et al.’s technique was introduced by Stanko and Škorić. It has a more compact grid and reduced computation times. Minutia pairs (and even triplets) were used in [7, 9], but with a different attacker model that allows the existence of encryption keys that are not accessible to the attacker.
I-B Contributions and outline
First we extend the pair-based spectral minutiae method by introducing a new spectral function that captures different information from the minutia orientations. Then we use the spectral functions as the basis for a template protection system. Our HDS consists of two stages. In the first stage, we discretise the analog spectral representation using a zero-leakage HDS [5, 17]. This first HDS reduces quantisation noise, and the helper data reveals no information about the quantised data. Discretisation of the spectral functions typically yields only one bit per grid point. We concatenate the discrete data from all the individual grid points into one long bitstring. In the second stage we apply the Code Offset Method. Our code of choice is a Polar code, because Polar codes are low-complexity capacity-achieving codes with flexible rate.
We present False Accept vs. False Reject tradeoffs at various stages of the data processing. We introduce the ‘superfinger’ enrollment method, in which we average the spectral functions from multiple enrollment images. By combining three enrollment images in this way, and constructing a polar code specifically tuned to the individual bit error rate of each bit position, we achieve an Equal Error Rate around 1% for a high-quality fingerprint database, and around 6% for a low-quality database.
The outline of the paper is as follows. In Section II we introduce notation and briefly review helper data systems, the spectral minutiae representation, and polar codes. In Section III we introduce the new spectral function. In Section IV we explain our experimental approach and motivate certain design choices such as the number of discretisation intervals and the use of a Gaussian approximation, and we introduce two methods for averaging enrollment images. Section V presents the experimental results, and Section VI concludes.
II-A Notation and terminology
We use capitals to represent random variables, and lowercase for their realizations. Sets are denoted in calligraphic font. The mutual information between $X$ and $Y$ is written as $I(X;Y)$. The probability density function (pdf) of the random variable $X$ is written as $f_X$ and its cumulative distribution function (cdf) as $F_X$. We denote the number of minutiae found in a fingerprint by $Z$. The coordinates of the $j$'th minutia are $(x_j,y_j)$ and its orientation is $\theta_j$. We will use the abbreviations FRR = False Reject Rate, FAR = False Accept Rate, EER = Equal Error Rate, ROC = Receiver Operating Characteristic. Bitwise xor of binary strings is denoted as $\oplus$.
II-B Helper Data Systems
A HDS is a cryptographic primitive that allows one to reproducibly extract a secret from a noisy measurement. A HDS consists of two algorithms: Gen (generation) and Rec (reconstruction), see Fig. 1. The Gen algorithm takes a measurement $x$ as input and generates the secret $s$ and helper data $w$. The Rec algorithm has as input a noisy measurement $y$ and the helper data; it outputs an estimator $\hat{s}$. If $y$ is sufficiently close to $x$ then $\hat{s}=s$. The helper data should not reveal much about the secret. Ideally it holds that $I(W;S)=0$. This is known as Zero Leakage helper data.
II-C Two-stage HDS template protection scheme
Fig. 2 shows the two-stage HDS architecture mentioned in Section I-B. The enrollment measurement is transformed to the spectral representation, evaluated on a fixed set of grid points. The first-stage enrollment procedure Gen1 is applied to each grid point value individually, yielding short (mostly one-bit) secrets and zero-leakage helper data. The short secrets are concatenated into a string $s$. Residual noise in $s$ is dealt with by the second-stage HDS (Code Offset Method), whose Gen2 produces a secret and helper data. A salted hash of the second-stage secret is computed; the hash and the salt are stored.
In the verification phase, the noisy measurement is processed as shown in the bottom half of Fig. 2. The reconstructed secret is hashed with the stored salt; the resulting hash is compared to the stored hash.
II-D Minutia-pair spectral representation
Minutiae are special features in a fingerprint, e.g. ridge endings and bifurcations. We briefly describe the minutia-pair spectral representation introduced by Stanko and Škorić. For a pair of minutiae with indices $a$ and $b$, let $d_{ab}$ denote their distance and $\phi_{ab}$ the direction of the line connecting them. The spectral function is a sum over all minutia pairs; each pair contributes a Gaussian weight, with a width parameter $\sigma$, centered on its distance $d_{ab}$, together with a complex phase carrying the invariant pair angle. The spectral function is evaluated on a discrete grid of distances $d$ and integer harmonics $q$; the variable $q$ can be interpreted as the Fourier conjugate of an angular variable. The function is invariant under translations of the fingerprint image. When a rotation of the whole fingerprint image is applied, the spectral function transforms in a simple way.
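As an illustration of the construction just described, the following toy sketch evaluates a pair-based spectral function on a small grid. The Gaussian distance kernel, the choice of invariant angle, and all parameter names are illustrative assumptions, not the paper's exact formula; the sketch does, however, exhibit the translation invariance discussed above.

```python
import numpy as np

def pair_spectral_function(minutiae, q_values, d_values, sigma=3.0):
    """Toy minutia-pair spectral function (sketch, not the paper's exact formula).

    minutiae: iterable of (x, y, theta) triples.
    Returns a complex array of shape (len(q_values), len(d_values)).
    Each pair contributes a Gaussian weight in the pair distance and a
    complex phase carrying the q-th harmonic of an invariant pair angle;
    the result is translation invariant by construction.
    """
    m = np.asarray(minutiae, dtype=float)
    S = np.zeros((len(q_values), len(d_values)), dtype=complex)
    n = len(m)
    for a in range(n):
        for b in range(a + 1, n):
            dx, dy = m[b, 0] - m[a, 0], m[b, 1] - m[a, 1]
            d_ab = np.hypot(dx, dy)              # pair distance (translation invariant)
            phi_ab = np.arctan2(dy, dx)          # direction of the connecting line
            beta = (m[a, 2] - phi_ab) % (2 * np.pi)  # invariant angle of minutia a
            for qi, q in enumerate(q_values):
                for di, d in enumerate(d_values):
                    S[qi, di] += np.exp(-(d - d_ab) ** 2 / (2 * sigma ** 2)) \
                                 * np.exp(1j * q * beta)
    return S
```

Because only pair distances and orientation-relative angles enter, shifting all minutia coordinates by the same offset leaves the output unchanged.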
II-E Zero Leakage Helper Data Systems
We briefly review the ZLHDS developed in [5, 17] for quantisation of an enrollment measurement $X$. The density function of $X$ is $f$, and the cumulative distribution function is $F$. The verification measurement is $Y$. The $X$ and $Y$ are considered to be noisy versions of an underlying ‘true’ value. They have zero mean and variance $\sigma_X^2$, $\sigma_Y^2$, respectively. The correlation between $X$ and $Y$ can be characterised by writing $Y=\lambda X+V$, where $\lambda\in[0,1]$ is the attenuation parameter and $V$ is zero-mean noise independent of $X$, with variance $\sigma_V^2$. It holds that $\sigma_Y^2=\lambda^2\sigma_X^2+\sigma_V^2$. We consider the identical conditions case: the amount of noise is the same during enrollment and reconstruction. In this situation we have $\sigma_X=\sigma_Y$ and $\sigma_V^2=(1-\lambda^2)\sigma_X^2$.
The real axis is divided into $N$ intervals $A_s$, $s\in\{0,\ldots,N-1\}$, where interval $A_s$ has probability mass $p_s=\Pr[X\in A_s]$. The quantisation boundaries are given by $q_s=F^{-1}(p_0+\cdots+p_{s-1})$. The Gen algorithm produces the secret as the index $s$ for which $x\in A_s$, and the helper data as $w=\big(F(x)-F(q_s)\big)/p_s\in[0,1)$. The inverse relation, for computing $x$ as a function of $s$ and $w$, is given by $x=F^{-1}\big(F(q_s)+w\,p_s\big)$.
The Rec algorithm computes the estimator $\hat{s}$ as the value in $\{0,\ldots,N-1\}$ for which $\tau_{\hat{s}-1}(w)\le y<\tau_{\hat{s}}(w)$, where the parameters $\tau_s(w)$ are decision boundaries. In the case of Gaussian noise these boundaries have a closed-form expression in terms of $\lambda$, $\sigma_V$ and the helper data $w$.
Here it is understood that $\tau_{-1}(w)=-\infty$ and $\tau_{N-1}(w)=\infty$, so that the outermost decision regions are half-infinite.
The above scheme ensures that $I(W;S)=0$ and that the reconstruction errors are minimized.
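The zero-leakage quantisation can be sketched as follows for a standard Gaussian source with equiprobable intervals. The reconstruction rule below is a simplified nearest-candidate approximation of the maximum-likelihood decision boundaries, not the exact closed-form expression of the scheme.

```python
import math
import random

def F(x):
    """Standard Gaussian cdf."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def F_inv(p, lo=-10.0, hi=10.0):
    """Inverse cdf by bisection (sufficient accuracy for a sketch)."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def gen(x, N=2):
    """Zero-leakage quantisation with N equiprobable intervals.
    Returns the secret s and helper data w; w is uniform on [0,1)
    and independent of s, hence reveals nothing about s."""
    t = F(x)                        # uniform in [0,1)
    s = min(int(N * t), N - 1)      # secret: interval index
    w = N * t - s                   # helper data: position inside the interval
    return s, w

def rec(y, w, lam, N=2):
    """Reconstruct s from noisy y: pick the candidate x(s, w) whose
    attenuated value lam * x is closest to y (approximate ML decision)."""
    best, best_dist = 0, float("inf")
    for s in range(N):
        x_cand = F_inv((s + w) / N)     # inverse relation x = F^{-1}((s+w)/N)
        dist = abs(y - lam * x_cand)
        if dist < best_dist:
            best, best_dist = s, dist
    return best
```

With no noise the candidate for the correct $s$ reproduces $x$ exactly, so reconstruction never fails; with mild noise, errors occur only near the quantisation boundary.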
II-F The Code Offset Method (COM)
We briefly describe how the COM is used as a Secure Sketch. Let $\mathcal{C}$ be a linear binary error correcting code with message space $\{0,1\}^k$ and codewords in $\{0,1\}^n$. It has an encoding function Enc: $\{0,1\}^k\to\{0,1\}^n$, a syndrome function Syn: $\{0,1\}^n\to\{0,1\}^{n-k}$, and a syndrome decoder SynDec: $\{0,1\}^{n-k}\to\{0,1\}^n$. In Fig. 2, Gen2 computes the helper data as $r=\mathrm{Syn}(s)$, and the secret in Fig. 2 is $s$ itself. Given the noisy string $s'$, Rep2 computes the reconstruction $\hat{s}=s'\oplus\mathrm{SynDec}\big(\mathrm{Syn}(s')\oplus r\big)$.
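The COM secure sketch can be illustrated with a small concrete code. The sketch below uses the [7,4] Hamming code purely for illustration (the paper uses Polar codes); its syndrome decoder corrects a single bit flip.

```python
import numpy as np

# Parity-check matrix of the [7,4] Hamming code; column j is the binary
# representation of j+1, so the syndrome of a single-bit error directly
# encodes the error position.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]], dtype=int)

def syn(v):
    """Syndrome function Syn."""
    return H.dot(v) % 2

def syn_decode(s):
    """Syndrome decoder SynDec: return the (<=1 bit) error pattern with syndrome s."""
    e = np.zeros(7, dtype=int)
    pos = s[0] * 4 + s[1] * 2 + s[2]   # syndrome read as a binary number
    if pos:
        e[pos - 1] = 1
    return e

def com_gen(x):
    """Gen2: helper data r = Syn(x)."""
    return syn(x)

def com_rec(y, r):
    """Rep2: Syn(y) xor r = Syn(x xor y) = Syn(e); correct e and return x_hat."""
    e = syn_decode((syn(y) + r) % 2)
    return (y + e) % 2
```

Any single bit flip between enrollment string and verification string is repaired, while the stored helper data is only the 3-bit syndrome.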
II-G Polar codes
Polar codes, proposed by Arıkan, are a class of linear block codes that get close to the Shannon limit even at small code length. They are based on the repeated application of the polarisation operation on two bits of channel input. Applying this operation creates two virtual channels, one of which is better than the original channel and one worse. For $n$ channel inputs, repeating this procedure in the end yields $k$ near-perfect virtual channels, with $k/n$ close to capacity, and $n-k$ near-useless channels. The $k$-bit message is sent over the good channels, while the bad ones are ‘frozen’, i.e. used to send a fixed string known a priori by the recipient.
The most popular decoder is the Successive Cancellation Decoder (SCD), which sequentially estimates the message bits using the frozen bits and the previously estimated bits. Polar codes have recently been adopted for the next-generation wireless standard (5G), especially for control channels, which have short block lengths.
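The polar encoding map can be sketched in a few lines as multiplication by the Kronecker power $F^{\otimes m}$ of Arıkan's kernel (some definitions additionally apply a bit-reversal permutation, omitted here). The frozen-position set in the example is an illustrative choice; in practice it is derived from the virtual channel qualities. A convenient property is that $F^{\otimes m}$ is its own inverse over GF(2).

```python
import numpy as np

def polar_transform_matrix(m):
    """F^{(x)m} over GF(2), with Arikan's kernel F = [[1,0],[1,1]].
    We omit the optional bit-reversal permutation for simplicity."""
    F = np.array([[1, 0], [1, 1]], dtype=np.int64)
    G = np.array([[1]], dtype=np.int64)
    for _ in range(m):
        G = np.kron(F, G) % 2
    return G

def polar_encode(message, frozen_positions, m):
    """Place message bits on the non-frozen inputs (frozen inputs fixed
    to 0) and apply the polar transform.  Which positions are 'good'
    depends on the channel; here the caller supplies them."""
    n = 1 << m
    u = np.zeros(n, dtype=np.int64)
    info = [i for i in range(n) if i not in frozen_positions]
    assert len(message) == len(info)
    u[info] = message
    return u.dot(polar_transform_matrix(m)) % 2
```

For example, an $[8,4]$ toy code freezes four inputs and carries a 4-bit message on the remaining positions.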
III A new spectral function
Consider Fig. 3 (modified from the original minutia-pair paper). The invariant angle of minutia $a$ is defined as the angle from the orientation of minutia $a$ to the line connecting the two minutiae, taken in the positive direction. (The invariant angle of minutia $b$ is defined analogously.) Modulo $2\pi$, each invariant angle equals the difference between the minutia orientation and the direction of the connecting line. The spectral function (1) uses only the first invariant angle; the second invariant angle is not used. We therefore now introduce a new spectral function which incorporates the second invariant angle.
We will use both spectral functions, as well as their fusion.
IV Experimental approach
We use the MCYT, FVC2000, and FVC2002 databases. The MCYT database contains good-quality images from 100 individuals: 10 fingers per individual and 12 images per finger. FVC2000 and FVC2002 contain low-quality images (only index and middle fingers). Each FVC database contains 100 fingers, 8 images per finger. In FVC2002, images number 3, 4, 5, and 6 have an exceptionally large angular displacement, so they are omitted from the experiments.
We extract the minutia positions and orientations using the VeriFinger software. For MCYT we evaluate the spectral functions on the same grid as in the original minutia-pair work, with the same width parameter (in pixels). For the FVC databases we use the same grid; a different value of the width parameter turns out to be a good choice.
IV-B No image rotation
As mentioned in the original paper, during the reconstruction procedure one can try different rotations of the verification image, but this results only in a minor improvement of the EER. For this reason we do not apply image rotation.
IV-C Quantization methods
Before quantization, all spectral functions are normalized to zero mean and unit variance, where the variance is taken over the real and imaginary parts together. We quantize the real and imaginary parts of the spectral functions separately. We study two methods: ‘hard thresholding’ (without helper data) and the Zero Leakage quantisation of Section II-E. Hard thresholding gives bit value ‘1’ if the (real or imaginary) part is positive and ‘0’ otherwise. We show results for this method mainly to demonstrate the advantages of Zero Leakage quantisation.
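The hard-thresholding baseline can be sketched as follows (the array conventions and pooled normalisation are our own illustrative choices):

```python
import numpy as np

def binarize_hard(S):
    """Hard-threshold quantisation of a complex spectral function:
    normalise to zero mean and unit variance with the real and imaginary
    parts pooled, then take the sign of each part at every grid point."""
    v = S.ravel()
    parts = np.concatenate([v.real, v.imag])
    mu, sd = parts.mean(), parts.std()
    v = (v - mu * (1 + 1j)) / sd          # normalise both parts together
    bits_re = (v.real > 0).astype(int)
    bits_im = (v.imag > 0).astype(int)
    return np.concatenate([bits_re, bits_im])
```

Each grid point thus contributes two bits: one from the real part and one from the imaginary part.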
IV-D Gaussian probability distributions
IV-E Zero leakage quantization
IV-E1 Signal-to-noise ratio; setting $N$
In the ZL HDS of Section II-E, the optimal choice of the parameter $N$ (the number of quantization intervals) depends on the signal-to-noise ratio. Fig. 6 shows a comparison between $N=2$ and $N=4$. At low noise it is obvious that $N=4$ extracts more information from the source than $N=2$. Beyond a certain noise level there is a regime where $N=4$ can extract more in theory, but is hindered in practice by the high bit error rate. At even higher noise $N=2$ ‘wins’ in all respects.
For our data set, we define a signal power for every grid point as the variance of the spectral function value over all images in the database. The noise power is the variance over all available images of the same finger, averaged over all fingers.
Figs. 7 and 8 show the noise-to-signal ratio. Note the large amount of noise: even the best grid points have a substantial noise-to-signal ratio. Fig. 6 tells us that setting $N=2$ is the best option, and this is the choice we make. At $N=2$ we extract two bits per grid point from each spectral function (one from the real part, one from the imaginary part). Hence our bit string (see Fig. 2) derived from a single spectral function has length 640. When we apply fusion of the two spectral functions this becomes 1280.
For $N=2$ the formulas in Section II-E simplify considerably; in particular the two quantisation intervals are the negative and positive half-axes. Since we work with Gaussian distributions, $F$ is the Gaussian cdf.
IV-E2 Enrollment and reconstruction
We have experimented with three different enrollment methods:
E1: A single image is used.
E2: We take the first $j$ images of a finger and calculate the average spectral function. (We take the first images merely to show that the approach works; we are not trying to optimise the choice of images.) We call this the ‘superfinger’ method. In the ZLHDS calculations the signal-to-noise ratio of the averaged spectral function is used.
E3: For each of the $j$ images we calculate an enrollment string. We apply bitwise majority voting on these strings. (This requires odd $j$.) The reconstruction boundaries are calculated based on the superfinger method, i.e. as in E2.
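The E2 and E3 aggregation steps can be sketched as follows (a minimal sketch; the array conventions are illustrative):

```python
import numpy as np

def superfinger(spectra):
    """E2: average the (complex) spectral functions of several enrollment
    images of the same finger, before quantisation."""
    return np.mean(np.stack(spectra), axis=0)

def majority_vote(bitstrings):
    """E3: bitwise majority over an odd number of enrollment bit strings."""
    B = np.stack(bitstrings)
    assert B.shape[0] % 2 == 1, "majority voting needs an odd number of strings"
    return (B.sum(axis=0) > B.shape[0] // 2).astype(int)
```

E2 averages in the analog domain before quantisation, while E3 quantises first and aggregates in the binary domain.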
We study fingerprint authentication with genuine pairs and impostor pairs. For pedagogical reasons we will present results at each stage of the signal processing: (1) spectral function domain, before quantisation; (2) binarized domain, without HDS; (3) with ZLHDS; (4) with ZLHDS and discarding the highest-noise grid points.
In the spectral function domain the fingerprint matching is done via a correlation score. In the binarized domain we look at the Hamming distance between the enrolled string and the reconstructed string. For all cases we show ROC curves in order to visualise the FAR-FRR tradeoff as a function of the decision threshold.
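The two matching scores can be sketched as follows (the exact normalisation of the correlation score in the paper may differ; this is one common choice):

```python
import numpy as np

def correlation_score(S1, S2):
    """Analog-domain similarity: normalised correlation of the flattened
    complex spectral functions (1.0 for identical inputs)."""
    a, b = S1.ravel(), S2.ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(np.real(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hamming_distance(s1, s2):
    """Binary-domain dissimilarity: number of differing bit positions."""
    return int(np.sum(s1 != s2))
```

A decision threshold on either score then yields one (FAR, FRR) operating point; sweeping the threshold traces out the ROC curve.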
Let the number of images per finger be denoted as $m$, and the number of fingers in a database as $F$.
E1: For the spectral domain and the quantization without HDS we compare all genuine pairs, i.e. $\binom{m}{2}$ image pairs per finger, resulting in $F\binom{m}{2}$ data points. For ZLHDS the number is twice as large, since there is an asymmetry between enrollment and reconstruction. For the FVC databases we generate all possible impostor combinations (all images of all impostor fingers), resulting in $\binom{F}{2}m^2$ data points.
For the MCYT database, which is larger, we take only one random image per impostor finger, which greatly reduces the number of impostor data points.
E2+E3: For genuine pairs we compare the superfinger to the remaining $m-j$ images. Thus we have $F(m-j)$ data points per database. Impostor pairs are generated as for E1.
Note: The VeriFinger software was not able to extract information for every image.
V Experimental results
V-A FAR/FRR rates before error correction
For each of the data processing steps/options before application of the Code Offset Method, we investigate the False Accept Rates and False Reject Rates. We identify a number of trends.
Figs. 9 and 10 show ROC curves. All the non-analog curves were made under the implicit assumption that for each decision threshold (number of bit flips) an error-correcting code can be constructed that enforces that threshold, i.e. decoding succeeds only if the number of bit flips is below the threshold. Unsurprisingly, we see in the figures that quantisation causes a performance penalty. Furthermore, the penalty is clearly less severe when the ZLHDS is used. Finally, it is advantageous to discard some grid points that have a bad signal-to-noise ratio. For the curves labeled ‘ZLHDS+reliable components’ only the 512 least noisy bits of the string were kept (1024 in the case of fusion). (Reliability is defined here as a global property of the whole database; the selection of reliable components does not reveal anything about an individual. Nandakumar’s scheme, in contrast, does reveal personalised reliable components and thereby obtains better FA and FR error rates.) Our choice of the number 512 is not entirely arbitrary: as a power of two it fits standard error-correcting codes. Note in Fig. 10 that ZLHDS with reliable component selection performs better than analog spectral functions without reliable component selection. (But not better than analog with selection.)
The E2 and E3 enrollment methods perform better than E1. Furthermore, performance increases with the number of enrollment images $j$. A typical example is shown in Fig. 11.
The two spectral functions individually have roughly the same performance. Fusion yields a noticeable improvement. An example is shown in Fig. 12. (We implemented fusion in the analog domain as addition of the two similarity scores.)
In Table I it may look strange that the EER in the rightmost column is sometimes lower than in the ‘analog’ column. We think this happens because there is no reliable component selection in the ‘analog’ procedure.
Ideally the impostor BER is 50%. In the tables we see that the impostor BER can get lower than 50% when the ZLHDS is used and the enrollment method is E2. On the other hand, it is always around 50% in the ‘No HDS’ case. This seems to contradict the Zero Leakage property of the helper data system. The ZLHDS is supposed not to leak, i.e. the helper data should not help impostors. However, the zero-leakage property is guaranteed to hold only if the variables are independent. In real-life data there are correlations between grid points and correlations between the real and imaginary part of a spectral function.
|#images ($j$)||Analog||No HDS||ZLHDS||ZLHDS+r.c.|
V-B Error correction: Polar codes
The bit error rates in the genuine reconstructed strings are rather high, at least 0.21. In order to apply the Code Offset Method with a decent message size it is necessary to use a code that has a high rate even at small codeword length.
Consider the case of fusion of the two spectral functions. The codeword length is 1280 bits (1024 if reliable component selection is performed). Suppose we need to distinguish between a million users. Then the message length needs to be at least 20 bits, in spite of the high bit error rate. Furthermore, the security of the template protection is determined by the entropy of the data that is input into the hash function (see Fig. 2); it would be preferable to have at least 64 bits of entropy.
We constructed a number of Polar codes tuned to the signal-to-noise ratios of the individual grid points. The codes are designed to find a set of reliable channels, which are then assigned to the information bits. Each code yields a certain FAR (impostor string accidentally decoding correctly) and FRR (genuine reconstruction string failing to decode correctly), and hence can be represented as a point in an ROC plot. This is shown in Fig. 13. For the MCYT database we have constructed a Polar code with message length 25 at an EER around 1.2% (compared to 0.7% before error correction). For the FVC2000 database we have constructed a Polar code with message length 15 at an EER around 6% (compared to 3.3% EER before error correction). Note that the error correction is an indispensable part of the privacy protection and inevitably leads to a performance penalty. However, we see that the penalty is not that bad, especially for high-quality fingerprints.
From our results we also see that even under the best circumstances (high-quality MCYT database) the entropy of the extracted string is severely limited (25 bits). In order to achieve a reasonable security level of the hash, at least two fingers need to be combined. We do not see this as a drawback of our helper data system; given that the EER for one finger is around 1%, which is impractical in real-life applications, it is necessary anyhow to combine multiple fingers.
V-C Error correction: random codebooks
There is a large discrepancy between the message length of the Polar code (25 bits) and the known information content of a fingerprint. According to Ratha et al. the reproducible entropy of a fingerprint image with robustly detectable minutiae should be more than 120 bits. Furthermore, the potential message size that can be carried in a 1024-bit string with a BER of 23% is roughly 227 bits, and 122 bits at a 30% BER (from the Shannon bound $n(1-h(p))$, with $h$ the binary entropy function).
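The quoted capacities follow from the Shannon bound $n(1-h(p))$ and can be checked directly:

```python
import math

def binary_entropy(p):
    """Binary entropy function h(p) in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def achievable_message_bits(n, ber):
    """Shannon bound n*(1 - h(p)): the number of message bits that a
    length-n binary code can carry at bit error rate p."""
    return n * (1 - binary_entropy(ber))
```

For $n = 1024$ this gives roughly 227 bits at a 23% BER and 122 bits at a 30% BER.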
We experimented with random codebooks to see if we could extract more entropy from the data than with Polar codes. At low code rates, a code based on random codewords can be practical to implement. Let the message size be $k$ bits, and the codeword size $n$ bits. A random table of size $2^k\cdot n$ bits needs to be stored, and the process of decoding consists of computing $2^k$ Hamming distances. We split the 1024 reliable bits into 4 groups of 256 bits, for which we generated random codebooks, for various values of $k$. The total message size is $4k$ and the total codeword size is 1024. The results are shown in Fig. 13. In short: random codebooks give hardly any improvement over Polar codes.
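A minimal random-codebook sketch (the group size and seed are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_codebook(k, n):
    """Random codebook: 2^k uniformly random codewords of length n.
    This table (2^k * n bits) must be stored."""
    return rng.integers(0, 2, size=(1 << k, n))

def nearest_codeword(codebook, y):
    """Minimum-Hamming-distance decoding: exhaustive comparison against
    all 2^k codewords; returns the decoded message index."""
    d = (codebook != y).sum(axis=1)
    return int(d.argmin())
```

At low rates this exhaustive decoder is entirely feasible: for one group of 256 bits with $k=8$, decoding costs only 256 Hamming-distance computations, and even a 23% bit error rate leaves the true codeword far closer than any of the other random codewords.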
VI Summary and discussion
A Helper Data System protects privacy but degrades fingerprint recognition performance in the form of an increased EER. We have built a HDS from a spectral function representation of fingerprint data, combined with a Zero Leakage quantisation scheme. It turns out that our HDS causes only a very small EER penalty when the fingerprint quality is high.
The best results were obtained with the ‘superfinger’ enrollment method (E2, taking the average over multiple enrollment images in the spectral function domain), and with fusion of the two spectral functions. The superfinger method performs slightly better than the E3 method and also has the advantage that it is not restricted to an odd number of enrollment captures.
For the high-quality MCYT database, our HDS achieves an EER around 1% and extracts a 1024-bit string with 25 bits of entropy. In practice multiple fingers need to be used in order to obtain an acceptable EER. This automatically increases the entropy of the hashed data. The entropy can be further increased by employing tricks like the Spammed Code Offset Method.
As topics for future work we mention (i) testing the HDS on more databases; (ii) further optimisation of parameter choices such as the number of reliable components, and the number of minutiae used in the computation of the spectral functions; (iii) further tweaking of the Polar codes.
Part of this work was supported by NWO grant 628.001.019 (ESPRESSO), and grant 61701155 from the National Natural Science Foundation of China (NSFC).
-  VeriFinger SDK. Available online, www.neurotechnology.com.
-  E. Arıkan. Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Transactions on Information Theory, 55(7):3051–3073, 2009.
-  R. Canetti, B. Fuller, O. Paneth, L. Reyzin, and A. Smith. Reusable fuzzy extractors for low-entropy distributions. In Eurocrypt 2016, 2016.
-  T.M. Cover and J.A. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., Hoboken, NJ, 2005.
-  J. de Groot, B. Škorić, N. de Vreede, and J.P. Linnartz. Quantization in Zero Leakage Helper Data Schemes. EURASIP Journal on Advances in Signal Processing, 2016. 2016:54.
-  Y. Dodis, R. Ostrovsky, L. Reyzin, and A. Smith. Fuzzy Extractors: how to generate strong keys from biometrics and other noisy data. SIAM J. Comput., 38(1):97–139, 2008.
-  F. Farooq, R.M. Bolle, T.-Y. Jea, and N. Ratha. Anonymous and revocable fingerprint recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1–7. IEEE, 2007.
-  D. Frumkin, A. Wasserstrom, A. Davidson, and A. Grafit. Authentication of forensic DNA samples. FSI Genetics, 4(2):95–103, 2010.
-  Z. Jin, A.B.J. Teoh, T.S. Ong, and C. Tee. Generating revocable fingerprint template using minutiae pair representation. In International Conference on Education Technology and Computer, pages 251–255. IEEE, 2010.
-  A. Juels and M. Wattenberg. A fuzzy commitment scheme. In ACM Conference on Computer and Communications Security (CCS) 1999, pages 28–36, 1999.
-  J.-P. Linnartz and P. Tuyls. New shielding functions to enhance privacy and prevent misuse of biometric templates. In Audio- and Video-Based Biometric Person Authentication. Springer, 2003.
-  D. Maltoni, D. Maio, A.K. Jain, and S. Prabhakar. Handbook of Fingerprint Recognition. Springer, London, 2nd edition, 2009.
-  T. Matsumoto, H. Matsumoto, K. Yamada, and S. Hoshino. Impact of artificial “gummy” fingers on fingerprint systems. In Proc. SPIE, Optical Security and Counterfeit Deterrence Techniques IV, volume 4677, pages 275–289, 2002.
-  K. Nandakumar. A fingerprint cryptosystem based on minutiae phase spectrum. In Workshop on Information Forensics and Security (WIFS), pages 1–6. IEEE, 2010.
-  J. Ortega-Garcia, J. Fierrez-Aguilar, D. Simon, J. Gonzalez, M. Faundez, V. Espinosa, A. Satue, I. Hernaez, J.J. Igarza, C. Vivaracho, D. Escudero, and Q.I. Moro. MCYT baseline corpus: A bimodal biometric database. In Vision, Image and Signal Processing, Special Issue on Biometrics on the Internet, volume 150, pages 395–401. IEEE, 2003.
-  N.K. Ratha, J.H. Connell, and R.M. Bolle. Enhancing security and privacy in biometrics-based authentication systems. IBM Systems Journal, 40:614–634, 2001.
-  T. Stanko, F.N. Andini, and B. Škorić. Optimized quantization in Zero Leakage Helper Data Systems. IEEE Transactions on Information Forensics and Security, 12(8):1957–1966, 2017.
-  T. Stanko and B. Škorić. Minutia-pair spectral representations for fingerprint template protection. In WIFS 2017. arxiv.org/abs/1703.06811.
-  B. Škorić and N. de Vreede. The Spammed Code Offset Method. IEEE Transactions on Information Forensics and Security, 9(5):875–884, 2014.
-  C.I. Watson, M.D. Garris, E. Tabassi, C.L. Wilson, R.M. McCabe, S. Janet, and K. Ko. User’s guide to export controlled distribution of NIST biometric image software, 2004. NISTIR 7391.
-  H. Xu and R.N.J. Veldhuis. Spectral minutiae representations of fingerprints enhanced by quality data. In Int. Conf. on Biometrics: Theory, Applications and Systems (BTAS) 2009. IEEE, 2009.
-  H. Xu and R.N.J. Veldhuis. Spectral representations of fingerprint minutiae subsets. In Image and Signal Processing (CISP) 2009, pages 1–5, 2009.
-  H. Xu and R.N.J. Veldhuis. Complex spectral minutiae representation for fingerprint recognition. In Computer Vision and Pattern Recognition Workshop. IEEE, 2010.
-  H. Xu, R.N.J. Veldhuis, A.M. Bazen, T.A.M. Kevenaar, A.H.M. Akkermans, and B. Gokberk. Fingerprint verification using spectral minutiae representations. IEEE Transactions on Information Forensics and Security, 4(3):397–409, 2009.