A Cryptanalysis of Two Cancelable Biometric Schemes based on Index-of-Max Hashing


Kevin Atighehchi1, Loubna Ghammam2, Koray Karabina3, Patrick Lacharme4
1Université Clermont Auvergne, LIMOS, France; 2ITK Engineering GmbH, Germany; 3Florida Atlantic University, Boca Raton, FL 33431, United States; 4Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France
Abstract

Cancelable biometric schemes generate secure biometric templates by combining user-specific tokens and biometric data. The main objective is to create irreversible, unlinkable, and revocable templates, with high matching accuracy. In this paper, we cryptanalyze two recent cancelable biometric schemes based on a particular locality sensitive hashing function, index-of-max (IoM): Gaussian Random Projection-IoM (GRP-IoM) and Uniformly Random Permutation-IoM (URP-IoM). As originally proposed, these schemes were claimed to be resistant against reversibility, authentication, and linkability attacks under the stolen token scenario. We propose several attacks against GRP-IoM and URP-IoM, and argue that both schemes are severely vulnerable to authentication and linkability attacks. We also propose better, but not yet practical, reversibility attacks against GRP-IoM. The correctness and practical impact of our attacks are verified over the same dataset provided by the authors of these two schemes.

Index Terms: Cancelable biometrics; Locality sensitive hashing; Index-of-Max hashing; Reversibility attack; Authentication attack; Linkability attack

I Introduction

Biometrics has been widely adopted in authentication systems, border control mechanisms, financial services, and healthcare applications. Biometric technologies are promising for providing user-friendly, efficient, and secure solutions to practical problems. In a typical biometric-based authentication scheme, users register their biometric-related information with the system, and they are authenticated based on a similarity score calculated from their enrolled biometric data and the fresh biometric they provide. As a consequence, service providers need to manage biometric databases. This is somewhat analogous to storing and managing user passwords in a password-based authentication scheme. The main difference is that biometric data serves as a long-term and unique personal identifier, and is hence categorized as highly sensitive and private data. This is not the case for passwords: they can be chosen independently of any user-specific characteristics, a single user can create an independent password per application, and passwords can be revoked, changed, and renewed easily at any time. As a result, managing biometric data in applications is more challenging, and it requires more care. As biometric-based technologies are deployed at a larger scale, biometric databases become natural targets in cyber attacks. In 2015, 5.6 million fingerprints were stolen from the U.S. Office of Personnel Management’s database in a cyber attack [OPM-Hack]. More recently, the U.S. Customs and Border Protection (CBP) said in a statement that traveler images collected by CBP were compromised from a subcontractor’s company network by a malicious cyber attack on May 31, 2019 [FaceHack]. Thus, biometric protection schemes become a necessity with the proliferation of biometric applications.

In order to mitigate security and privacy problems in the use of biometrics, several biometric template protection methods have been proposed, including cancelable biometrics, biometric cryptosystems (e.g. fuzzy extractors), keyed biometrics (e.g. homomorphic encryption), and hybrid biometrics. In this paper, we focus on cancelable biometrics (CB), and refer the reader to two surveys [Survey-2015, Survey-2016] for more details on biometric template protection methods.

In CB, a biometric template is computed through a process whose main inputs are the biometric data (e.g. a biometric image, or the extracted feature vector) of a user, and a user-specific token (e.g. a random key, seed, or password). In a nutshell, templates can be revoked, changed, and renewed by changing user-specific tokens. For the security of the system, it is important that the template generation process is non-invertible (irreversible): given the biometric template and/or the token of a user, it should be computationally infeasible to recover any information about the underlying biometric data. Similarly, given a pair of biometric templates and the corresponding tokens, it should be computationally infeasible to distinguish whether the templates were derived from the same user (unlinkability). We should note that even though user-specific tokens in CB may be considered secret, as part of a two-factor authentication scheme, cryptanalyses of CB under stronger adversarial models commonly assume that the attacker knows both the biometric template and the token of a user. This is a plausible assumption in practice because a user token may have low entropy (e.g. a weak password), or it may simply be compromised by an attacker. This scenario is also known as the stolen-token scenario; see [tkl08].

CB was first proposed by Ratha et al. [rcb01] for face recognition. Since then, several CB schemes have been proposed, including the Biohashing algorithm, applied to many modalities such as fingerprints [tng04], face [tgn06], and iris [ctjnl06]. Due to its simple design based on an orthonormal projection matrix, biohashing has been widely studied [kczky06, nl06, ln07, tkl08]. Ratha et al. also proposed three transformations for minutiae-based fingerprint templates in [rccb07]. Another family of CB has been proposed based on Bloom filters; see [rbbb14].

Several attacks on biohashing-type schemes have been proposed [lcm09, LaChRo13, NaNaJa10, fly14, ToKaAzEr16]. As mentioned before, most of these attacks assume that the adversary knows the user-specific token, except for one scenario of [fly14], which instead assumes the knowledge of several biometric templates from several distinct subjects (68 individuals with 105 images per individual for the CMU-PIE face database, and 350 individuals with 40 images per individual for the FRGC face database). In [NaNaJa10, fly14, ToKaAzEr16], the attacks are combined with a masquerade attack. Cryptanalysis efforts have been extended to other types of CB schemes as well. For example, a Bloom filter-based protection scheme is analyzed in [bmr17] with respect to the non-linkability of templates. The schemes presented in [rccb07] were attacked in [qfaf08] using the Attack via Record Multiplicity (ARM) technique, which exploits the knowledge of multiple templates generated from the same original data. ARM is also used by Li and Hu [lh14] to attack several CB schemes designed for minutiae-based fingerprint templates, as proposed in [lctlk07, wh12].

In summary, CB schemes offer several advantages such as efficient implementation, high matching accuracy, and revocability. On the other hand, the security of CB schemes is, in general, not well understood, and security claims are often based on intuitive, heuristic, and informal arguments, as opposed to formal arguments with rigorous proofs.

More recently, Jin et al.  [Jin18Ranking] proposed two cancelable biometric schemes based on a particular locality sensitive hashing function, index-of-max (IoM) (see [Charikar02Similarity] for details on IoM hashing): Gaussian Random Projection-IoM (GRP-IoM) and Uniformly Random Permutation-IoM (URP-IoM). It is shown in [Jin18Ranking] that, for suitably chosen parameters, GRP-IoM and URP-IoM are robust against variation and noise in the measurement of data. It is also claimed in [Jin18Ranking] that, GRP-IoM and URP-IoM are resistant against reversibility, authentication, and linkability attacks under the stolen token scenario.

In this paper, we formalize some security notions under the stolen token scenario and propose several attacks against GRP-IoM and URP-IoM. We argue that both schemes are severely vulnerable to authentication and linkability attacks. We also propose better, but not yet practical, reversibility attacks against GRP-IoM. We utilize linear and geometric programming methods in our attacks. Their correctness and practical impact are verified over the same dataset provided by the authors of these two schemes. In order to be more specific, we state the security claims in [Jin18Ranking] and our cryptanalysis results as follows:

  1. Reversibility attack: In a reversibility attack, an adversary, who already has the knowledge of a user’s specific token, and has at least one biometric template of the same user, tries to recover a feature vector that corresponds to the user’s biometric data.

    Analysis in [Jin18Ranking]

    It is claimed in [Jin18Ranking] that the best template reversing strategy for an adversary is to exhaustively search feature vectors. Based on some entropy analysis of the feature vectors, it is concluded in [Jin18Ranking] that recovering the exact feature vectors from their system implemented over the FVC 2002 DB1 dataset requires operations for both GRP-IoM and URP-IoM; see Section VII.A in [Jin18Ranking]. In fact, the attack cost in [Jin18Ranking] was underestimated as because of underestimating as . A more accurate analysis yields a cost of .

    Our results

    We propose a new reversibility attack against GRP-IoM. The main idea is to reduce the search space by guessing the sign of the components of the feature vectors with high success probability. Our analysis and experiments over the FVC 2002 DB1 dataset suggest that recovering GRP-IoM feature vectors now requires operations. Even though our attack is not practical, it reduces the previously estimated security level for GRP-IoM by bits from -bit to -bit. Furthermore, we relax the exact reversibility notion to the nearby reversibility notion. This relaxation is reasonable given the fact that different measurements of the same user’s biometric produce different feature vectors due to the inherent noise in the measurements. Under this relaxation, we propose successful attack strategies against GRP-IoM. Currently, we do not have any reversing attack strategy against URP-IoM that works better than the naive exhaustive search or random guessing strategies. For more details, please see Section IV-B.

  2. Authentication attack: In an authentication attack, an adversary, who already has the knowledge of a user’s specific token, and has at least one biometric template of the same user, tries to generate a feature vector such that the adversary can now use that feature vector and the stolen token to be (falsely) authenticated by the system as a legitimate user. Note that authentication attacks are weaker than reversibility attacks because feature vectors generated in the attacks are not required to correspond to actual biometrics.

    Analysis in [Jin18Ranking]

    The authors in [Jin18Ranking] analyze several authentication attack strategies (brute force, record multiplicity, false acceptance, birthday) against GRP-IoM and URP-IoM. In particular, the analysis in [Jin18Ranking] yields that authentication attacks against GRP-IoM with parameters , and URP-IoM with parameters require and operations, respectively, when the underlying dataset is FVC 2002 DB1; see Table V in [Jin18Ranking].

    Our results

    We utilize linear and geometric programming methods and propose new and practical authentication attacks against both GRP-IoM and URP-IoM. For example, we verify that our attacks against GRP-IoM and URP-IoM (under the same parameters and dataset as above) run in the order of seconds and can authenticate adversaries successfully. We also show that the cancelability property of both GRP-IoM and URP-IoM is violated in the sense that adversaries can still be (falsely) authenticated by the system even after user templates are revoked and tokens are renewed. For more details, please see Sections IV-A and V-A.

  3. Linkability attack: In a linkability attack, an adversary, who is given a pair of biometric templates, tries to determine whether the templates were generated from two distinct individuals or from the same individual using two distinct tokens.

    Analysis in [Jin18Ranking]

    Based on experimental analysis of the pseudo-genuine and pseudo-imposter score distributions, and the large overlap between the two distributions, it is concluded in [Jin18Ranking] that an adversary cannot be successful in a linkability attack against GRP-IoM and URP-IoM; see Section VII.C in [Jin18Ranking].

    Our results

    Unlinkability claims in [Jin18Ranking] are limited in the sense that the analysis only takes into account attack strategies based on correlating the similarity scores of given templates. Therefore, the analysis in [Jin18Ranking] does not rule out other, potentially better, attack strategies. In our analysis, we exploit the partial reversibility of GRP-IoM and URP-IoM, and propose successful attack strategies (distinguishers) against both schemes. More specifically, the distinguisher for GRP-IoM uses a preimage finder along with a correlation metric that counts the number of identically signed components in the preimages. As a result, our attack can correctly link two templates 97 percent of the time. The distinguisher for URP-IoM uses a preimage finder along with the Pearson correlation metric, and can correctly link two templates 83 percent of the time. For more details, please see Section VI.

Organization

The rest of this paper is organized as follows. We provide background on GRP-IoM and URP-IoM in Section II, where we also formalize some of the concepts needed for a rigorous discussion and analysis of our attacks. We present our attack models and relevant definitions in Section III. Our attacks against GRP-IoM and URP-IoM are described and evaluated in Sections IV, V, and VI. We derive our conclusions in Section VII.

II Formalizing Cancelable Biometric Schemes

Biometric templates in GRP-IoM and URP-IoM are constructed in two steps: (1) Feature extraction: A feature vector is derived from a biometric image; and (2) Transformation: A user specific secret is used to transform the user’s feature vector to a template. In this section, we present formal descriptions of these two steps and show how GRP-IoM and URP-IoM can be seen as concrete instantiations of our formal definitions. Our formalization will later help us to describe security notions, and to present our cryptanalysis of GRP-IoM and URP-IoM in a rigorous manner.

II-A Feature Extraction and Template Generation

In the following, we let and be two metric spaces, where and represent the feature space and template space, respectively; and and are the respective distance functions.

Definition 1

A biometric feature extraction scheme is a pair of deterministic polynomial time algorithms , where

  • is the feature extractor of the system, that takes biometric data as input, and returns a feature vector .

  • is the verifier of the system, that takes two feature vectors , , and a threshold as input, and returns 1 if the distance between them is at most the threshold, and returns 0 otherwise.

Remark 1

is not explicitly used in GRP-IoM and URP-IoM. More specifically, after a feature vector is extracted from a biometric image , a transformation is applied to and a biometric template is derived. Therefore, the feature vector is not used in the protocol. The main reason that we introduce and here is to capture the notion of a vector , that is close to the feature vector . For example, the pair and may represent the feature vectors of the same individual extracted from two different measurements and ; in which case, one would expect to return for relatively small values of . As a second example, may be the feature vector constructed by an attacker to reverse the biometric template of an individual with biometric image . In this case, one may measure the success of the attack as a function of , and the rate of values returned by . A successful attack is expected to result in higher return rates of for relatively small values of .

Remark 2

In this paper, we consider two different methods to quantify the similarity between feature vectors in the GRP-IoM and URP-IoM schemes. The first one is the Euclidean distance, where one computes

d_E(x, x') = \sqrt{\sum_i (x_i - x'_i)^2},

and the verifier returns 1 if the distance is at most the threshold, and returns 0 otherwise. In the second method, one computes the similarity measure of [Jin16Generating], and the verifier returns 1 if the measure is at least the threshold, and returns 0 otherwise. The reason for using the first method is that the Euclidean distance is commonly deployed in biometrics, and the reason for using the second method is that it has recently been argued to be a successful measure in [Jin18Ranking, Jin16Generating].

Definition 2

Let be token (seed) space, representing the set of tokens to be assigned to users. A cancelable biometric scheme is a tuple of deterministic polynomial time algorithms , where

  • is the secret parameter generator of the system, that takes a token (seed) as input, and returns a secret parameter set .

  • is the transformation of the system, that takes a feature vector , and the secret parameter set as input, and returns a biometric template .

  • is the verifier of the system, that takes two biometric templates and a threshold as input, and returns 1 if the distance between them is at most the threshold, and returns 0 otherwise.

II-B GRP-IoM and URP-IoM Schemes

The feature extractor , which is common for both GRP-IoM and URP-IoM, takes fingerprint images as input, and generates feature vectors of length , that is with .

Let denote the set of integers from to . In [Jin18Ranking], GRP-IoM sets , and URP-IoM sets , for some suitable parameters , , and . In the rest of this paper, we unify this notation and use for both GRP-IoM and URP-IoM. In both GRP-IoM and URP-IoM, the distance between two templates, , is defined as the Hamming distance between and . Therefore, in the rest of this paper, we use instead of .

Both GRP-IoM and URP-IoM use an Index-of-Max operation, denoted , in their transformation algorithm . is the smallest index at which attains its maximum value. The algorithms and for GRP-IoM and URP-IoM differ significantly, and we explain them in the following.
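To fix ideas, both of these shared ingredients can be sketched in a few lines of Python (NumPy); the 0-based indexing is our implementation choice:

```python
import numpy as np

def iom(v):
    """Index-of-Max: the smallest index at which v attains its maximum.
    np.argmax already returns the first (i.e., smallest) such index."""
    return int(np.argmax(v))

def template_match_rate(t1, t2):
    """Fraction of positions at which two templates agree, i.e., one
    minus their normalized Hamming distance."""
    t1, t2 = np.asarray(t1), np.asarray(t2)
    return float(np.mean(t1 == t2))
```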

GRP-IoM Instantiation

  • takes the seed as input, and generates random Gaussian -by- matrices , for . The column vectors of the matrices are sampled as standard Gaussian vectors of length-: for . As a result, the secret parameter set consists of the sequence of projections .

  • takes the secret parameter set , and a fingerprint feature vector as input, and computes

    1. ,

    2. ,

    for . The output of is the biometric template .

  • takes two biometric templates and a matching threshold as input; computes their matching score; and returns 1 if the score is at least the threshold, and returns 0 otherwise. Note that the threshold represents the minimum fraction of indices at which the pair of vectors must have the same entry in order to be accepted as a genuine pair.

The GRP-based IoM hashing is depicted in Figure 1.

Fig. 1: Transformation of the GRP-based IoM scheme.
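A minimal Python sketch of this instantiation is given below. The names m (number of matrices), q (Gaussian vectors per matrix), and n (feature length) are our own labels for the parameters described above; the q Gaussian vectors are stored as the rows of each matrix, so that W @ x collects the q inner products, and iom is the helper sketched above:

```python
import numpy as np

def grp_keygen(seed, m, q, n):
    """Secret parameter generation: m random Gaussian q-by-n matrices,
    derived deterministically from the user's token (seed)."""
    rng = np.random.default_rng(seed)
    return [rng.standard_normal((q, n)) for _ in range(m)]

def grp_transform(matrices, x):
    """GRP-IoM transformation: the k-th template entry is the index of
    the maximum component of the projection W_k x."""
    return np.array([iom(W @ x) for W in matrices])
```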

Concrete parameters

In [Jin18Ranking], several experiments are performed to select optimal parameters and . More specifically, the accuracy of the system is analyzed for , and . It is concluded that a large is necessary for better accuracy, and that the effect of on the accuracy is not significant when is sufficiently large. For example, changing the configuration from to changes the equal error rate (EER) of the system from to , a minor improvement of . As a result, the parameter set with is the one commonly referred to in the security and performance analysis of GRP-IoM in [Jin18Ranking]; see Table IV and Table V in [Jin18Ranking]. For convenient comparison of our results, we also use and as the main reference point in our security analysis in this paper.

URP-IoM Instantiation

Let be the symmetric group of all permutations on . Let denote the set of partial permutations for . In other words, permutations in are obtained by restricting permutations in to the first integers . For and , we denote . As an example, for and , restricting the permutation to yields , and we get . Finally, the component-wise (Hadamard) product of two vectors and is denoted by . The secret parameter generation, transformation, and verification operations in URP-IoM are performed as follows:

  • takes the seed as input, and generates partial permutations uniformly at random: for , and . As a result, the secret parameter set consists of the sequence of partial permutations .

  • takes the secret parameter set , and a fingerprint feature vector as input, and computes

    1. for ,

    2. ,

    for . The output of is the biometric template .

  • takes two biometric templates and a matching threshold as input; computes their matching score; and returns 1 if the score is at least the threshold, and returns 0 otherwise. Note that the threshold represents the minimum fraction of indices at which the pair of vectors must have the same entry in order to be accepted as a genuine pair.

An illustration of the URP-based IoM transformation is given in Figure 2.
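The transformation can be sketched as follows. The layout of the secret parameters, m hashes with p partial permutations each, every permutation selecting a window of w out of the n feature positions, reflects our reading of the description above and should be treated as illustrative; iom is again the helper sketched earlier:

```python
import numpy as np

def urp_keygen(seed, m, p, w, n):
    """Secret parameter generation: for each of the m hashes, p partial
    permutations, each selecting w of the n positions in random order."""
    rng = np.random.default_rng(seed)
    return [[rng.permutation(n)[:w] for _ in range(p)] for _ in range(m)]

def urp_transform(perms, x):
    """URP-IoM transformation: the k-th template entry is the index of
    the maximum of the component-wise (Hadamard) product of the p
    partially permuted copies of x."""
    x = np.asarray(x)
    template = []
    for sigmas in perms:
        prod = np.ones(len(sigmas[0]))
        for sigma in sigmas:
            prod = prod * x[sigma]  # Hadamard product of the permuted copies
        template.append(iom(prod))
    return np.array(template)
```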

Concrete parameters

In [Jin18Ranking], several experiments are performed to select optimal parameters , , and . It is reported that the best performance over the FVC 2002 DB1 dataset is achieved when and . This parameter set is also the one referred to in the security and performance analysis of URP-IoM in [Jin18Ranking]; see Table V in [Jin18Ranking]. For convenient comparison of our results, we also use and as the main reference point in our security analysis in this paper.

Fig. 2: Transformation of the URP-based IoM scheme for .

III Stolen Token Attack Models

Let be the set of users of the biometric system. We identify a user with its biometric characteristic, and define a function that takes a biometric characteristic as input, and outputs a digital representation of biometric data ; for instance, the scan of a fingerprint. Note that for two different computations of and (e.g. at different times, or different devices), we may have due to the inherent noise in the measurement of biometric data. Therefore, we model as a probabilistic polynomial time function. We also allow for due to the error rates of recognition systems. In the following, we use to indicate that is chosen from the set uniformly at random.

III-A Reversibility attacks

Let be a feature vector, and let be the template generated from and the secret parameter set . In a reversibility attack, an adversary is given , , and a threshold value , and the adversary tries to find a feature vector such that is exactly the same as , or is close to with respect to the distance function over and the threshold value . In this case, we say that is a -nearby-feature preimage (or simply a nearby-feature preimage, when is clear from the context) of the template . More formally, we have the following definition.

Definition 3

Let be a feature vector, and for some secret parameter set . Let be a threshold value. A nearby-feature preimage of with respect to is a feature vector such that = .

As a result, an adversary in a reversibility attack can be modelled as an algorithm that takes and = as input, and that outputs . We say that the adversary is successful if is a nearby-feature preimage of . More formally, we have the following definition.

Definition 4

Let be a cancelable biometric protection scheme and an adversary for a nearby-feature preimage attack. The success rate of , denoted by , is defined as:

Note that an adversary can follow a naive strategy by simply sampling a user from and returning . Under this strategy, the adversary would be expected to succeed with probability , which is the false accept rate of the system with respect to and as the threshold value for the comparison of the pairs of feature vectors. A weakness of the scheme, with respect to the reversibility notion, would require better attack strategies, and this motivates the following definition.

Definition 5

The protection scheme is said to be reversible with advantage , if there exists an adversary such that . If is negligible for all , then we say that is irreversible in the stolen token scenario.

In particular, a protection scheme is irreversible in the stolen token scenario if the success rate of any adversary is not significantly better than the success rate of the strategy of drawing randomly from .

Remark 3

Definitions 4 and 5 can be generalized to the case where the pair (token, template) is renewed times. The adversary thus takes advantage of pairs of (token, template), with .

III-B Authentication attacks

Let be a feature vector, and let be the template generated from and the secret parameter set . In an authentication attack, an adversary is given , , and a threshold value , and the adversary tries to find a feature vector such that for , is exactly the same as , or is close to with respect to the distance function over and the threshold value . In this case, we say that is a -nearby-template preimage (or simply a nearby-template preimage, when is clear from the context) of the template . More formally, we have the following definition.

Definition 6

Let be a feature vector, and for some secret parameter set . Let be a threshold value. A nearby-template preimage of with respect to is a feature vector such that and .

As a result, an adversary in an authentication attack can be modelled as an algorithm that takes and = as input, and that outputs . We say that the adversary is successful if is a nearby-template preimage of . More formally, we have the following definition.

Definition 7

Let be a cancelable biometric protection scheme and an adversary for finding a nearby-template preimage. The success rate of , denoted by , is defined as:

Note that an adversary can follow a naive strategy by simply sampling a user from and returning . Under this strategy, the adversary would be expected to succeed with probability , which is the false accept rate of the system with respect to and as the threshold value for the comparison of the pairs of templates. This strategy is also commonly known as the false acceptance rate attack in the literature. A weakness of the scheme, with respect to the false authentication notion, would require better attack strategies, and this motivates the following definition.

Definition 8

The protection scheme is said to have false authentication with advantage property, if there exists an adversary such that . If is negligible for all , then we say that does not have false authentication property under the stolen token scenario.

In particular, a protection scheme does not have false authentication property under the stolen token scenario, if the success rate of any adversary is not significantly better than the success rate of the strategy of drawing randomly from ; or in other words, the success rate of any attack is bounded by the false acceptance rate of the system.

Now, suppose that an adversary knows the secret parameter set of a user (), and the template of the user, where . At this point, the user may renew her token, or register to another system with a new token and a freshly acquired feature vector. Suppose now that the adversary knows the user’s new secret parameter set , but not the user’s new template . In such a scenario, the adversary would try to compute a nearby-template preimage of the template , given , , and . Informally, we call such a nearby-template preimage a long-lived nearby-template preimage. More formally, we have the following definition.

Definition 9

Let be a cancelable biometric protection scheme and an adversary for finding a long-lived nearby-template preimage. The success rate of , denoted by , is defined as:

Note that an adversary can follow a naive strategy by simply sampling a user from and returning . Under this strategy, the adversary would be expected to succeed with probability , as explained in the previous authentication attack model. A weakness of the scheme, with respect to the long-lived false authentication notion, would require better attack strategies, and this motivates the following definition.

Definition 10

The protection scheme is said to have long-lived false authentication with advantage property, if there exists an adversary such that . If is negligible for all , then we say that does not have long-lived false authentication property under the stolen token scenario.

In other words, a protection scheme is vulnerable to long-lived nearby-template preimage attacks if an adversary, who knows a user’s previous token and template pair, and the user’s renewed token, can construct a feature vector that can be (falsely) authenticated by the system with some probability greater than the false accept rate of the system.

Remark 4

We should emphasize that in finding long-lived nearby-template preimages, we allow the adversary to know , , and , but we do not allow the adversary to know . Therefore, the problem of finding a long-lived nearby-template preimage is not easier than the problem of finding a nearby-template preimage. This observation also makes sense in practice, as explained in the following. Consider an adversary who has access to an efficient algorithm for finding nearby-template preimages. Such an adversary can be blocked by revoking biometric templates and renewing tokens. On the other hand, an adversary who has access to an efficient algorithm for finding long-lived nearby-template preimages can still be (falsely) authenticated by the system even after user templates are revoked and tokens are renewed. In other words, a successful algorithm for finding long-lived nearby-template preimages would defeat the purpose of the cancelability feature of a system.

Remark 5

Definitions 9 and 10 can be generalized to the case where the pair (token, template) is renewed times. The adversary thus takes advantage of the first leaked pairs of (token, template), along with the ’th token.

III-C Linkability attacks

Let be two feature vectors. Let and be two templates generated from and , and the secret parameters set and . In a linkability attack, an adversary is given , , , and , and the adversary tries to find out whether and are derived from the same user. As a result, an adversary in a linkability attack can be modelled as an algorithm that takes , , , and as input, and that outputs or , where the output indicates that the feature vectors and are extracted from the same user, and the output indicates that the feature vectors and are extracted from two different users. We say that the adversary is successful, if his conclusion (whether the feature vectors are extracted from the same user) is indeed correct. More formally, we have the following definition.

Definition 11

Let be a cancelable biometric protection scheme and an adversary for a linkability attack. The success rate of , denoted by , is defined as:

Note that an adversary can follow a naive strategy by simply sampling a value from uniformly at random. Under this strategy, the adversary would be expected to succeed with probability . This strategy is also known as the guessing attack in the literature. A weakness of the scheme, with respect to the linkability notion, would require better attack strategies, and this motivates the following definition.

Definition 12

The protection scheme is said to be linkable (distinguishable) with advantage , if there exists an adversary such that . If is negligible for all , then we say that is unlinkable (indistinguishable) under the stolen token scenario.

IV Attacks on GRP-IoM

In this section, we propose some concrete attack strategies against GRP-IoM, and evaluate the impact of our attacks through our implementation over one of the datasets as provided in [Jin18Ranking]. More specifically, we use the dataset of features extracted from the fingerprint images of FVC2002-DB1 as in  [Jin18Ranking]. This dataset contains a total of 500 samples: 5 samples per user for 100 users.

As mentioned before, for convenient comparison of our results, we use the GRP-IoM parameters , and as the main reference point in our security analysis, because these parameters are commonly referred to in the security and performance analysis of GRP-IoM; see Table IV and Table V in [Jin18Ranking].

IV-A Authentication attacks on GRP-IoM

Finding nearby-template preimages

As before, let be a feature vector, and let be the template generated from and the secret parameter set . Assume that an adversary knows and . In order to find a nearby-template preimage vector , the adversary proceeds as follows. Since knows , can recover the set of Gaussian random projections in GRP-IoM: Let the rows of be denoted by , , …, . Let denote the inner product between the vectors and . Recall that the template produced by GRP-IoM is a vector comprised of the indices of maximum, i.e.

from which recovers the set of inequalities

(1)

As a result, obtains inequalities in unknowns, and sets to be one of the (arbitrary) solutions of this system (possibly imposing for some positive , for ). By the construction of , we must have , and so , for all . In other words, is expected to get (falsely) authenticated by the server with , or equivalently, . The expected success rate of our attack has been verified in our Python implementation using the CVXOPT library [cvxopt] on a computer running Ubuntu 17.10 with the Xfce environment, an Intel i7-4790K 4 GHz processor, 8 GB of RAM, and a 512 GB SATA SSD. The attack runs in the order of seconds for the parameters , and .
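A sketch of this linear-programming attack with CVXOPT is given below. The margin eps used to enforce the strict inequalities and the [-1, 1] box keeping the LP bounded are our illustrative choices:

```python
import numpy as np
from cvxopt import matrix, solvers

def grp_nearby_template_preimage(matrices, t, eps=1e-3):
    """Find some x' with iom(W_k x') = t_k for every k, by solving the
    feasibility LP:  (w_j - w_{t_k}) . x' <= -eps  for all rows j != t_k."""
    n = matrices[0].shape[1]
    rows, rhs = [], []
    for W, tk in zip(matrices, t):
        for j in range(W.shape[0]):
            if j != tk:
                rows.append(W[j] - W[tk])
                rhs.append(-eps)
    # box constraints -1 <= x'_i <= 1 keep the LP bounded
    rows.extend(np.eye(n)); rhs.extend([1.0] * n)
    rows.extend(-np.eye(n)); rhs.extend([1.0] * n)
    G, h = matrix(np.array(rows)), matrix(np.array(rhs))
    c = matrix(np.zeros(n))  # pure feasibility: zero objective
    sol = solvers.lp(c, G, h)
    return np.array(sol['x']).ravel()
```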

Finding long-lived nearby-template preimages

Let and be two feature vectors of the same user, and and two secret parameter sets. Let and . In finding a long-lived nearby-template preimage of , we assume that the adversary knows , , and . In our proposed attack, follows the previously described strategy to find a nearby-template preimage based on and , and presents this as a candidate nearby-template preimage of .

We evaluate this attack by computing both the average and the minimum matching score, over one hundred users, between and the re-enrolled genuine template . Our experiments yield as the average rate of the number of indices with the same entry in and , and as the minimum rate. Therefore, given the matching score thresholds of as set in [Jin18Ranking], we expect the success rate of the adversary to be . The above attack strategies show that GRP-IoM is severely vulnerable to authentication attacks under the stolen token and template attack model, and also show that adversaries cannot be blocked by renewing templates or tokens. In other words, the cancelability feature of GRP-IoM is violated under the stolen token and template scenario.

Optimizing authentication attacks

Next, we explore whether the attacks can be optimized when a user leaks several token and template pairs. More specifically, assume that an adversary captures (token, template) pairs , for , derived from the same feature vector . In practice, these pairs may correspond to different enrollments of the user for different services using the same biometric image. Assume further that the adversary is in possession of another token , but not the template , from the ’st enrollment of the user with the same feature vector .

Let us denote by the sets of matrices derived from the token . The adversary can either keep all corresponding sets of inequalities, or selectively choose the inequalities of the system to decrease both the memory usage and the running time to refine the solution. In the following, we denote by the attack consisting of using all the constraints, and by the attack where the constraints are selected. The attack proceeds as follows:

  1. First, compute an approximated solution from the pair , and initialize a set of constraints

  2. For , the following computations are performed:

    1. where .

    2. , a vector of differences.

    3. The set of constraints is updated as

    4. is updated subject to the constraints of .

  3. Return .
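The following Python sketch illustrates the variant that keeps all constraints: the inequality systems recovered from every leaked (token, template) pair are accumulated into a single linear program, solved as in our experiments with SciPy's linprog; the function and variable names are ours:

```python
import numpy as np
from scipy.optimize import linprog

def multi_leak_preimage(leaks, n, eps=1e-3):
    """leaks is a list of (matrices, template) pairs, one per leaked
    enrollment; all recovered inequalities are solved as one LP."""
    A_ub, b_ub = [], []
    for matrices, t in leaks:
        for W, tk in zip(matrices, t):
            for j in range(W.shape[0]):
                if j != tk:
                    A_ub.append(W[j] - W[tk])
                    b_ub.append(-eps)
    res = linprog(c=np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(-1, 1)] * n, method='interior-point')
    return res.x
```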

Recall that the dataset in [Jin18Ranking] contains samples (genuine feature vectors) for each user. Therefore, in our experiments, we consider . We use the linear programming solver of the SciPy optimization library in Python. The linprog function is parameterized with the ’interior-point’ solver method, with upper bounds (1) and lower bounds (-1) on the components of the sought solutions, and without an objective function. The experiments yield the results of Table I and Table II, showing an improvement of the matching scores over the previous attacks (for ). Table I reports the matching scores obtained by an attacker using all constraints, and Table II reports the matching scores obtained by an attacker optimizing the number of constraints.

Constr. Number                 9,598    14,098    18,598
GRP Match. Score – Min (%)     27.7     33        38
GRP Match. Score – Avg (%)     50.6     53.4      56
TABLE I: Matching scores using the attack that keeps all constraints.

Constr. Number – Avg           3,281    3,832     4,351
GRP Match. Score – Min (%)     29.3     29.7      38.3
GRP Match. Score – Avg (%)     48.4     50.6      52.8
TABLE II: Matching scores using the attack with selected constraints.

IV-B Reversibility attacks on GRP-IoM

In authentication attacks in the previous section, adversarial strategies focus on finding nearby-template preimages , that are not required to be close to the actual feature vector . In a reversibility attack, an adversary finds a nearby-feature preimage , and the quality of the attack is measured by the closeness of to .

Exact reversibility

The best case for an attacker is to have . In [Jin18Ranking], it is argued that the best strategy for an attacker to find is to exhaustively search (guess) the components of . Given the feature vectors extracted from FVC2002-DB1, it is reported in [Jin18Ranking] that the minimum and maximum values of the feature vector components are and , respectively. Therefore, the search space for a feature component consists of possibilities, including the positive and negative signed components. Moreover, the feature vectors in GRP-IoM are of length . Therefore, it is concluded in [Jin18Ranking] that the attack requires exhausting a search space of size . In the following, we propose a better attack strategy to recover . The main idea is to guess the sign of the components of the feature vector, and shrink the search space accordingly. Given a token and template pair of a user, the adversary computes a nearby-template preimage , and guesses the sign of as the same as the sign of . If all the signs were correctly guessed by the adversary, then the size of the search space would be reduced from to . However, the adversary may guess some signs incorrectly. Based on our experiments, where we compare the signs of the components of the preimage vectors and the actual feature vectors , we estimate that the probability of guessing the sign correctly per component is . Therefore, we estimate the size of the search space for as . Even though our attack is not practical, it reduces the previously estimated security level for GRP-IoM by bits from -bit to -bit.
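The per-component sign-agreement probability quoted above can be estimated with a script of the following form, given a collection of preimages produced by the authentication attack and the corresponding genuine feature vectors:

```python
import numpy as np

def sign_guess_rate(preimages, features):
    """Average, over all (preimage, feature) pairs, of the fraction of
    components whose signs agree."""
    rates = [np.mean(np.sign(xh) == np.sign(x))
             for xh, x in zip(preimages, features)]
    return float(np.mean(rates))
```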

Nearby reversibility

Now, we analyze some attack strategies for finding a nearby-feature preimage of a template under the stolen token attack scenario. The adversary proceeds similarly as in the authentication attacks, except that now we also include some objective functions, and solve a linearly constrained quadratic optimization problem. We consider three cases for which the objective functions are given as follows:

  1. .

  2. where is the average feature vector in the database provided in [Jin18Ranking]. For our experiments, one sample per user is attacked, i.e. one hundred linear programs are solved.

    3. where is a feature vector derived from a fingerprint of the adversary. For our experiments, is picked at random among the samples of one user. These samples are then removed from the database. Among the remaining samples, one sample per user is attacked, for a total of solver runs.
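For the objectives of cases 2 and 3 (and, analogously, case 1), the resulting linearly constrained quadratic program can be sketched with CVXOPT as follows; x_ref stands for the reference vector of the chosen case, and eps is again an illustrative margin:

```python
import numpy as np
from cvxopt import matrix, solvers

def grp_nearby_feature_preimage(matrices, t, x_ref, eps=1e-3):
    """Minimize ||x - x_ref||^2 subject to the same linear constraints
    as in the authentication attack."""
    n = matrices[0].shape[1]
    rows, rhs = [], []
    for W, tk in zip(matrices, t):
        for j in range(W.shape[0]):
            if j != tk:
                rows.append(W[j] - W[tk])
                rhs.append(-eps)
    # ||x - x_ref||^2 = (1/2) x' (2I) x - 2 x_ref' x + const
    P = matrix(2.0 * np.eye(n))
    q = matrix(-2.0 * np.asarray(x_ref, dtype=float))
    G, h = matrix(np.array(rows)), matrix(np.array(rhs))
    sol = solvers.qp(P, q, G, h)
    return np.array(sol['x']).ravel()
```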

In our experiments, we use Python and the CVXOPT package [cvxopt], which provides linearly constrained quadratic programming solvers. We measure the success rate of this attack strategy, and report its advantage over the false accept rate of the system. We compute two reference false accept rate values for the dataset provided in [Jin18Ranking]: one with respect to the Euclidean distance, and one with respect to the similarity measure described in Remark 2. We compute using the Euclidean distance, with the threshold . We estimate using the similarity measure, with the threshold . Our attacks are evaluated in the three cases 1, 2, and 3, where an objective function is used as listed above, and in the case none, where no objective function is used. The results of our experiments, summarized in Table III and Table IV, show that in most cases solving an optimization problem leads to a success rate significantly greater than the false accept rate. We then conclude that GRP-IoM is reversible with a single complete leak, considering both the Euclidean distance and the dedicated matching score of [Jin16Generating].

Objective function (case)     none    1      2      3
                              2       63     77     98
                              0       3      4      27.2
                              0       0      0      5
                              0       0      0      0
TABLE III: Success rate of the reversibility attack against GRP-IoM under a single stolen token and template attack, using the Euclidean distance, for different (decreasing) threshold values; one row per threshold.

Objective function (case)     none    1      2      3
                              100     0      0      100
TABLE IV: Success rate of the reversibility attack against GRP-IoM under a single stolen token and template attack, using the similarity measure with the threshold .

Table III also shows that an adversary’s success rate drops when the Euclidean distance threshold of the system is lowered, as expected. In Table IV, we perform a similar analysis when the similarity measure is used with the threshold . We observe that the best adversarial success rates are obtained when no objective function is used, or when the objective function of case 3 is used. The effect of multiple stolen token and template pairs is evaluated for the four cases mentioned previously, and the results are presented in Tables V and VI. When no function is optimized, we see that the success rate increases with the number of stolen pairs, up to 3 pairs, after which it decreases. This decrease may be due to the variability of the feature vector components at each re-enrollment of the user, i.e., when a renewal of the token is required. Since each system of constraints that we add to the linear program corresponds to a re-enrollment, the amount of error in the constants of the inequalities may exceed the benefit of having more inequalities. Finally, when an objective function is added to the program, our experiments show that there is no value gained from multiple leaks. We should note that our experiments are rather limited due to the sample size. For better and more definitive conclusions, one would need to perform more experiments.

Number of stolen pairs        1      2      3      4      5
                              2      43     68     63     62
                              0      3      3      3      3
                              0      0      0      0      0
TABLE V: Success rate of the reversibility attack against GRP-IoM under stolen token and template attacks when no optimization is performed (case none), using the Euclidean distance, for different (decreasing) threshold values.

Number of stolen pairs        1      2      3      4      5
                              77     71     69     69     69
                              4      3      3      3      3
                              0      0      0      0      0
TABLE VI: Success rate of the reversibility attack against GRP-IoM under stolen token and template attacks when optimization is performed (case 2), using the Euclidean distance, for different (decreasing) threshold values.

V Attacks on URP-IoM

In this section, we propose some concrete attack strategies against URP-IoM, and evaluate the impact of our attacks through our implementation over one of the datasets as provided in [Jin18Ranking]. More specifically, we use the dataset of features extracted from the fingerprint images of FVC2002-DB1 as in  [Jin18Ranking]. This dataset contains a total of 500 samples: 5 samples per user for 100 users.

As mentioned before, for convenient comparison of our results, we use the parameter set , and as the main reference point in our security analysis, because these parameters are commonly referred to in the security and performance analysis of URP-IoM in [Jin18Ranking]; see Table V in [Jin18Ranking].

V-A Authentication attacks on URP-IoM

Finding nearby-template preimages

As before, let be a feature vector, and let be the template generated from and the secret parameter set . Assume that an adversary knows and . In order to find a nearby-template preimage vector , the adversary proceeds as follows.

Since knows , can recover the set of permutations in URP-IoM. The template is a vector comprised of the indices of maximum, i.e.

where is the window size, from which the adversary can recover the set of inequalities

Each of these inequalities can be transformed into linear constraints by taking the logarithm of both sides. The corresponding set of linear constraints can be given as follows:

The logarithm adds a new set of constraints, namely , where is the size of the feature vector. finds a solution of this system, and sets such that , , to be one of the (arbitrary) solutions of this system. By the construction of , we must have , and so , for all . In other words, is expected to get (falsely) authenticated by the server with , or equivalently, .
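A sketch of this linearized attack is given below. It works in the variables y_i = log x_i and returns x = exp(y), so every component of the recovered preimage is positive; eps and the box bound on y are illustrative choices, and the parameter layout matches our URP-IoM sketch in Section II:

```python
import numpy as np
from scipy.optimize import linprog

def urp_nearby_template_preimage(perms, t, n, eps=1e-3, bound=5.0):
    """With y_i = log x_i, each product inequality
    prod_i x[sigma_i[t_k]] >= prod_i x[sigma_i[j]] becomes linear in y."""
    A_ub, b_ub = [], []
    for sigmas, tk in zip(perms, t):
        w = len(sigmas[0])
        for j in range(w):
            if j == tk:
                continue
            row = np.zeros(n)
            for sigma in sigmas:
                row[sigma[j]] += 1.0   # + log x for the challenger product
                row[sigma[tk]] -= 1.0  # - log x for the winning product
            A_ub.append(row)
            b_ub.append(-eps)
    res = linprog(c=np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(-bound, bound)] * n, method='interior-point')
    return np.exp(res.x)
```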

The expected success rate of our attack has been verified in our Python implementation using the CVXOPT library [cvxopt] on a computer running Ubuntu 17.10 with the Xfce environment, an Intel i7-4790K 4 GHz processor, 8 GB of RAM, and a 512 GB SATA SSD. The attack runs in the order of seconds for the parameters , and . We should note that, if an adversary captures more than one token and template pair, then additional constraints can further optimize the attack, as previously discussed for GRP-IoM.

Finding long-lived nearby-template preimages

Let and be two feature vectors of the same user, and and two secret parameter sets. Let and . In finding a long-lived nearby-template preimage of , we assume that the adversary knows , , and . In our proposed attack, follows the previously described strategy to find a nearby-template preimage based on and , and presents this as a candidate nearby-template preimage of .

We evaluate this attack by computing both the average and the minimum matching score, over one hundred users, between and the re-enrolled genuine template . Our experiments yield as the average rate of the number of indices with the same entry in and , and as the minimum rate. Therefore, given the matching score thresholds of as set in [Jin18Ranking], we expect the success rate of the adversary to be , on average. The above attack strategies show that URP-IoM is severely vulnerable to authentication attacks under the stolen token and template attack model, and also show that adversaries cannot be blocked by renewing templates or tokens. In other words, the cancelability feature of URP-IoM is violated under the stolen token and template scenario.

Optimizing authentication attacks

Similar to our GRP-IoM analysis, we now explore whether the attacks can be optimized when a user leaks several token and template pairs. More specifically, assume that an adversary captures (token, template) pairs , for , derived from the same feature vector . Assume further that the adversary is in possession of another token , but not the template , from the ’st enrollment of the user with the same feature vector .

Table VII reports the values when the number of leaks increases, and shows that 2 stolen token and template pairs are sufficient to yield when .

Constraint Number              152,998    229,198    305,398
URP Match. Score – Min (%)     12.8       14.7       14.8
URP Match. Score – Avg (%)     28.2       29.6       31.3
TABLE VII: Matching scores for an increasing number of leaked pairs.

VI Linkability Attacks on GRP-IoM and URP-IoM

Recall that an adversary in a linkability attack can be modelled as an algorithm that takes , , , and as input, and that outputs or , where the output indicates that the feature vectors and are extracted from the same user, and the output indicates that the feature vectors and are extracted from two different users.

Authentication attacks on GRP-IoM and URP-IoM only return a feature vector that enables successful (false) authentication. Reversibility attacks on GRP-IoM allow the construction of nearby-feature preimage vectors that are somewhat close to the actual feature vector. For example, in the exact reversibility attack on GRP-IoM, we were able to guess the sign of a component of the actual feature vector with estimated probability . In our linkability attack on GRP-IoM, we utilize such sign guessing and partial reversibility results. However, we could not obtain nearby-feature preimage vectors for URP-IoM successfully, mainly because, through the use of geometric programming, all components of a preimage must be non-negative, whereas an actual feature vector component may well be negative. As a result, the linkability attack techniques for GRP-IoM do not immediately apply to URP-IoM. However, we show that it is still possible to successfully link URP-IoM templates.

An attack on GRP-IoM

Given , , , and , the adversary computes nearby-feature preimage vectors and as explained before. For some decision threshold value , the adversary computes , where is the number of indices for which and have exactly the same sign. Finally, the adversary outputs , if , indicating that the feature vectors and are extracted from the same user. Otherwise, if , the adversary outputs , indicating that the feature vectors and are extracted from two different users.
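A minimal sketch of this distinguisher, where tau denotes the decision threshold on the number of identically signed components:

```python
import numpy as np

def link_grp(xh1, xh2, tau):
    """Output 1 ('same user') when the preimages agree in sign on at
    least tau components, and 0 otherwise."""
    n_s = int(np.sum(np.sign(xh1) == np.sign(xh2)))
    return 1 if n_s >= tau else 0
```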

In our experiments, we created 500 nearby-feature preimages, derived from the 500 templates along with their 500 seeds. Recall that the templates are the transformations (using distinct random seeds) of the feature vectors provided by the authors of IoM hashing [Jin18Ranking]. Using our dataset of nearby-feature preimages (estimated feature vectors) produced by our attack, we estimate the success rate of our attack using the following script:

  1. for between and :

    1. pick at random two nearby-feature preimage vectors and from the same individual ().

    2. if .

    3. pick at random two nearby-feature preimage vectors and from two different individuals.

    4. if .

  2. return and .
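A Python rendering of this estimation script might look as follows, where distinguisher(a, b) returns 1 for ‘same user’ (e.g., lambda a, b: link_grp(a, b, tau)), preimages is our set of nearby-feature preimages, and labels[i] identifies the individual behind preimages[i]:

```python
import numpy as np

def estimate_linkability_sr(preimages, labels, distinguisher,
                            trials=10_000, seed=0):
    """Estimate the success rates on genuine pairs (same individual)
    and imposter pairs (different individuals)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    genuine_ok = imposter_ok = 0
    for _ in range(trials):
        # genuine pair: two distinct preimages of the same individual
        same = np.flatnonzero(labels == rng.choice(labels))
        i, j = rng.choice(same, size=2, replace=False)
        genuine_ok += distinguisher(preimages[i], preimages[j]) == 1
        # imposter pair: preimages of two different individuals
        i, j = rng.choice(len(labels), size=2, replace=False)
        while labels[i] == labels[j]:
            i, j = rng.choice(len(labels), size=2, replace=False)
        imposter_ok += distinguisher(preimages[i], preimages[j]) == 0
    return genuine_ok / trials, imposter_ok / trials
```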

In our experiments, we set and , and obtained and . Therefore, we estimate , as the average of the success rates over the genuine and imposter pairs. The run time of the attack is dominated by the run time of computing nearby-feature preimages, which takes only a few seconds as mentioned earlier.

An attack on URP-IoM

Given , , , and , the adversary computes nearby-feature preimage vectors \hat{x}_1 and \hat{x}_2 as explained before. For some decision threshold value , the adversary computes the Pearson coefficient of \hat{x}_1 and \hat{x}_2:

\rho(\hat{x}_1, \hat{x}_2) = \frac{\sum_i (\hat{x}_{1,i} - \mu_1)(\hat{x}_{2,i} - \mu_2)}{\sqrt{\sum_i (\hat{x}_{1,i} - \mu_1)^2} \, \sqrt{\sum_i (\hat{x}_{2,i} - \mu_2)^2}},

where \mu_1 and \mu_2 denote the means of the components of \hat{x}_1 and \hat{x}_2, respectively.
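In an implementation, the coefficient can be computed directly, for instance with NumPy:

```python
import numpy as np

def pearson(xh1, xh2):
    """Pearson correlation coefficient of two preimage vectors."""
    return float(np.corrcoef(xh1, xh2)[0, 1])
```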

The adversary outputs , if , indicating that the feature vectors and are extracted from the same user. Otherwise, if , the adversary outputs , indicating that the feature vectors and are extracted from two different users. Following the linkability attack on GRP-IoM, we estimate the success rate of our attack using the following script:

  1. for between