A Cryptanalysis of Two Cancelable Biometric Schemes Based on Index-of-Max Hashing
Abstract
Cancelable biometric schemes generate secure biometric templates by combining user-specific tokens and biometric data. The main objective is to create irreversible, unlinkable, and revocable templates that still allow high matching accuracy. In this paper, we cryptanalyze two recent cancelable biometric schemes based on a particular locality-sensitive hashing function, index-of-max (IoM): Gaussian Random Projection IoM (GRPIoM) and Uniformly Random Permutation IoM (URPIoM). As originally proposed, these schemes were claimed to be resistant against reversibility, authentication, and linkability attacks under the stolen-token scenario. We propose several attacks against GRPIoM and URPIoM, and argue that both schemes are severely vulnerable to authentication and linkability attacks. We also propose better, but not yet practical, reversibility attacks against GRPIoM. The correctness and practical impact of our attacks are verified over the same dataset provided by the authors of these two schemes.
I Introduction
Biometrics has been widely adopted in authentication systems, border control mechanisms, financial services, and healthcare applications. Biometric technologies are very promising for providing user-friendly, efficient, and secure solutions to practical problems. In a typical biometric-based authentication scheme, users register their biometric-related information with the system, and they are authenticated based on a similarity score calculated from their enrolled biometric data and the fresh biometric they provide. As a consequence, service providers need to manage biometric databases. This is somewhat analogous to storing and managing user passwords in a password-based authentication scheme. The main difference is that biometric data serves as a long-term and unique personal identifier, and hence is categorized as highly sensitive and private data. This is not the case for passwords: they can be chosen independently of any user-specific characteristics, a single user can create an independent password per application, and passwords can be revoked, changed, and renewed easily at any time. As a result, managing biometric data in applications is more challenging and requires more care. As biometric-based technologies are deployed at a larger scale, biometric databases become natural targets of cyber attacks. In 2015, 5.6 million fingerprints were stolen from the U.S. Office of Personnel Management’s database in a cyber attack [OPMHack]. More recently, the U.S. Customs and Border Protection (CBP) said in a statement that traveler images collected by CBP were compromised from a subcontractor’s company network by a malicious cyberattack on May 31, 2019 [FaceHack]. Thus, biometric protection schemes become a necessity with the proliferation of biometric applications.
In order to mitigate security and privacy problems in the use of biometrics, several biometric template protection methods have been proposed, including cancelable biometrics, biometric cryptosystems (e.g. fuzzy extractors), keyed biometrics (e.g. homomorphic encryption), and hybrid biometrics. In this paper, we focus on cancelable biometrics (CB), and refer the reader to two surveys [Survey2015, Survey2016] for more details on biometric template protection methods.
In CB, a biometric template is computed through a process whose main inputs are the biometric data (e.g. a biometric image, or the extracted feature vector) of a user, and a user-specific token (e.g. a random key, seed, or password). In a nutshell, templates can be revoked, changed, and renewed by changing user-specific tokens. For the security of the system, it is important that the template generation process is non-invertible (irreversible): given the biometric template and/or the token of a user, it should be computationally infeasible to recover any information about the underlying biometric data. Similarly, given a pair of biometric templates and the corresponding tokens, it should be computationally infeasible to distinguish whether the templates were derived from the same user (unlinkability). We should note that even though user-specific tokens in CB may be considered secret, as part of a two-factor authentication scheme, cryptanalysis of CB under stronger adversarial models commonly assumes that the attacker knows both the biometric template and the token of a user. This is a plausible assumption in practice because a user token may have low entropy (e.g. a weak password), or it may simply be compromised by an attacker. This scenario is also known as the stolen-token scenario; see [tkl08].
CB was first proposed by Ratha et al. [rcb01] for face recognition. Since then, several CB schemes have been proposed, including the biohashing algorithm, applied to many modalities such as fingerprints [tng04], face [tgn06], and iris [ctjnl06]. Due to its simple design based on an orthonormal projection matrix, biohashing has been widely studied [kczky06, nl06, ln07, tkl08]. For example, Ratha et al. proposed three transformations for minutiae-based fingerprint templates in [rccb07]. Another family of CB schemes has been proposed based on Bloom filters; see [rbbb14].
Several attacks on biohashing-type schemes have been proposed [lcm09, LaChRo13, NaNaJa10, fly14, ToKaAzEr16]. As mentioned before, most of these attacks assume that the adversary knows the user-specific token, except for one scenario in [fly14], which assumes the knowledge of several biometric templates from several distinct subjects (68 individuals, 105 images per individual, for the CMU-PIE face database; and 350 individuals, 40 images per individual, for the FRGC face database). In [NaNaJa10, fly14, ToKaAzEr16], the attacks are combined with a masquerade attack. Cryptanalysis efforts have been extended to other types of CB schemes as well. For example, a Bloom filter-based protection scheme is analyzed in [bmr17] with respect to the non-linkability of templates. The schemes presented in [rccb07] have been attacked in [qfaf08] using the Attack via Record Multiplicity (ARM) technique, where the attack uses the knowledge of multiple templates generated from the original data. ARM is also used by Li and Hu [lh14] for attacking several CB schemes designed for minutiae-based fingerprint templates, as proposed in [lctlk07, wh12].
In summary, CB schemes offer several advantages such as efficient implementation, high matching accuracy, and revocability. On the other hand, the security of CB schemes is, in general, not well understood, and security claims are often based on intuitive, heuristic, and informal arguments rather than on formal arguments with rigorous proofs.
More recently, Jin et al. [Jin18Ranking] proposed two cancelable biometric schemes based on a particular locality-sensitive hashing function, index-of-max (IoM) (see [Charikar02Similarity] for details on IoM hashing): Gaussian Random Projection IoM (GRPIoM) and Uniformly Random Permutation IoM (URPIoM). It is shown in [Jin18Ranking] that, for suitably chosen parameters, GRPIoM and URPIoM are robust against variation and noise in the measurement of data. It is also claimed in [Jin18Ranking] that GRPIoM and URPIoM are resistant against reversibility, authentication, and linkability attacks under the stolen token scenario.
In this paper, we formalize some security notions under the stolen token scenario and propose several attacks against GRPIoM and URPIoM. We argue that both schemes are severely vulnerable against authentication and linkability attacks. We also propose better, but not yet practical, reversibility attacks for GRPIoM. We utilize linear and geometric programming methods in our attacks. Their correctness and practical impact are verified over the same dataset provided by the authors of these two schemes. In order to be more specific, we state the security claims in [Jin18Ranking] and our cryptanalysis results as follows:

Reversibility attack: In a reversibility attack, an adversary, who knows a user’s specific token and has at least one biometric template of the same user, tries to recover a feature vector that corresponds to the user’s biometric data.
Analysis in [Jin18Ranking]
It is claimed in [Jin18Ranking] that the best template reversing strategy for an adversary is to exhaustively search feature vectors. Based on some entropy analysis of the feature vectors, it is concluded in [Jin18Ranking] that recovering the exact feature vectors from their system implemented over the FVC 2002 DB1 dataset requires operations for both GRPIoM and URPIoM; see Section VII-A in [Jin18Ranking]. In fact, the attack cost in [Jin18Ranking] was underestimated as because of underestimating as . A more accurate analysis yields a cost of .
Our results
We propose a new reversibility attack against GRPIoM. The main idea is to reduce the search space by guessing the sign of the components of the feature vectors with high success probability. Our analysis and experiments over the FVC 2002 DB1 dataset suggest that recovering GRPIoM feature vectors now requires operations. Even though our attack is not practical, it reduces the previously estimated security level for GRPIoM by bits from bit to bit. Furthermore, we relax the exact reversibility notion to the nearby reversibility notion. This relaxation is reasonable given that different measurements of the same user’s biometric produce different feature vectors due to the inherent noise in the measurements. Under this relaxation, we propose successful attack strategies against GRPIoM. Currently, we do not have any reversing attack strategy against URPIoM that works better than the naive exhaustive search or random guessing strategies. For more details, please see Section IV-B.

Authentication attack: In an authentication attack, an adversary, who knows a user’s specific token and has at least one biometric template of the same user, tries to generate a feature vector such that the adversary can use that feature vector together with the stolen token to be (falsely) authenticated by the system as a legitimate user. Note that authentication attacks are weaker than reversibility attacks because the feature vectors generated in these attacks are not required to correspond to actual biometrics.
Analysis in [Jin18Ranking]
The authors in [Jin18Ranking] analyze several authentication attack strategies (brute force, record multiplicity, false acceptance, birthday) against GRPIoM and URPIoM. In particular, the analysis in [Jin18Ranking] yields that authentication attacks against GRPIoM with parameters , and URPIoM with parameters require and operations, respectively, when the underlying dataset is FVC 2002 DB1; see Table V in [Jin18Ranking].
Our results
We utilize linear and geometric programming methods and propose new and practical authentication attacks against both GRPIoM and URPIoM. For example, we verify that our attacks against GRPIoM and URPIoM (under the same parameters and dataset as above) run in the order of seconds and can authenticate adversaries successfully. We also show that the cancellability property of both GRPIoM and URPIoM is violated, in the sense that adversaries can still be (falsely) authenticated by the system even after user templates are revoked and tokens are renewed. For more details, please see Sections IV-A and V-A.

Linkability attack: In a linkability attack, an adversary, who is given a pair of biometric templates, tries to determine whether the templates were generated from two distinct individuals or from the same individual using two distinct tokens.
Analysis in [Jin18Ranking]
Based on some experimental analysis of the pseudo-genuine and pseudo-impostor score distributions, and the large overlap between the two distributions, it is concluded in [Jin18Ranking] that an adversary cannot be successful in a linkability attack against GRPIoM and URPIoM; see Section VII-C in [Jin18Ranking].
Our results
Unlinkability claims in [Jin18Ranking] are limited in the sense that the analysis only takes into account attack strategies based on correlating the similarity scores of given templates. Therefore, the analysis in [Jin18Ranking] does not rule out other, potentially better, attack strategies. In our analysis, we exploit the partial reversibility of GRPIoM and URPIoM, and propose successful attack strategies (distinguishers) against both schemes. More specifically, the distinguisher for GRPIoM uses a preimage finder along with a correlation metric that counts the number of identically signed components in the preimages. As a result, our attack can correctly link two templates 97 percent of the time. The distinguisher for URPIoM uses a preimage finder along with the Pearson correlation metric, and it can correctly link two templates 83 percent of the time. For more details, please see Section VI.
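As a simple illustration of the first distinguisher's correlation metric, the sketch below counts identically signed components of two candidate preimages. The function names and the decision threshold are our own illustrative choices, not the exact values used later in the paper.

```python
import numpy as np

def sign_match_score(x1, x2):
    """Fraction of components of two candidate preimages with identical sign."""
    return float(np.mean(np.sign(x1) == np.sign(x2)))

def same_user(x1, x2, threshold=0.5):
    """Decide 'same user' when the sign agreement exceeds a threshold.
    The threshold here is illustrative, not a tuned value."""
    return sign_match_score(x1, x2) >= threshold
```

In an actual attack, the two candidate preimages would first be recovered from the templates with a preimage finder; preimages originating from the same user are expected to yield a markedly higher sign-agreement score than preimages from two distinct users.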
Organization
The rest of this paper is organized as follows. We provide some background information on GRPIoM and URPIoM in Section II. In Section II, we also formalize some of the concepts for a more rigorous discussion and analysis of our attacks. We provide our attack models and relevant definitions in Section III. Our attacks against GRPIoM and URPIoM are explicitly described and evaluated in Sections IV, V, and VI. We derive our conclusions in Section VII.
II Formalizing Cancelable Biometric Schemes
Biometric templates in GRPIoM and URPIoM are constructed in two steps: (1) Feature extraction: A feature vector is derived from a biometric image; and (2) Transformation: A user specific secret is used to transform the user’s feature vector to a template. In this section, we present formal descriptions of these two steps and show how GRPIoM and URPIoM can be seen as concrete instantiations of our formal definitions. Our formalization will later help us to describe security notions, and to present our cryptanalysis of GRPIoM and URPIoM in a rigorous manner.
II-A Feature Extraction and Template Generation
In the following, we let and be two metric spaces, where and represent the feature space and template space, respectively; and and are the respective distance functions.
Definition 1
A biometric feature extraction scheme is a pair of deterministic polynomial time algorithms , where

is the feature extractor of the system, that takes biometric data as input, and returns a feature vector .

is the verifier of the system, that takes two feature vectors , , and a threshold as input, and returns if , and returns if .
Remark 1
is not explicitly used in GRPIoM and URPIoM. More specifically, after a feature vector is extracted from a biometric image , a transformation is applied to and a biometric template is derived. Therefore, the feature vector is not used in the protocol. The main reason that we introduce and here is to capture the notion of a vector , that is close to the feature vector . For example, the pair and may represent the feature vectors of the same individual extracted from two different measurements and ; in which case, one would expect to return for relatively small values of . As a second example, may be the feature vector constructed by an attacker to reverse the biometric template of an individual with biometric image . In this case, one may measure the success of the attack as a function of , and the rate of values returned by . A successful attack is expected to result in higher return rates of for relatively small values of .
Remark 2
In this paper, we consider two different methods to quantify the similarity between feature vectors in the GRPIoM and URPIoM schemes. The first one is the Euclidean distance, where one computes
and the verifier returns if , and returns if . In the second method, one computes
and the verifier returns if , and returns if . The reason for using the first method is that Euclidean distance is commonly deployed in biometrics, and the reason for using the second method is that it has been recently argued to be a successful measure in [Jin18Ranking, Jin16Generating].
Definition 2
Let be the token (seed) space, representing the set of tokens to be assigned to users. A cancelable biometric scheme is a tuple of deterministic polynomial time algorithms , where

is the secret parameter generator of the system, that takes a token (seed) as input, and returns a secret parameter set .

is the transformation of the system, that takes a feature vector , and the secret parameter set as input, and returns a biometric template .

is the verifier of the system, that takes two biometric templates = , , and a threshold as input; and returns if , and returns if .
II-B GRPIoM and URPIoM Schemes
The feature extractor , which is common for both GRPIoM and URPIoM, takes fingerprint images as input, and generates feature vectors of length , that is with .
Let denote the set of integers from to . In [Jin18Ranking], GRPIoM sets , and URPIoM sets , for some suitable parameters , , and . In the rest of this paper, we unify this notation and use for both GRPIoM and URPIoM. In both GRPIoM and URPIoM, the distance between two templates, , is defined as the Hamming distance between and . Therefore, in the rest of this paper, we use instead of .
Both GRPIoM and URPIoM use an Index-of-Max operation, denoted , in their transformation algorithm . is the smallest index at which attains its maximum value. The algorithms and for GRPIoM and URPIoM differ significantly, and we explain them in the following.
GRPIoM Instantiation

takes the seed as input, and generates random Gaussian by matrices , for . The column vectors of the matrices are sampled as standard Gaussian vectors of length: for . As a result, the secret parameter set consists of the sequence of projections .

takes the secret parameter set , and a fingerprint feature vector as input, and computes

,

,
for . The output of is the biometric template .


takes two biometric templates , and a matching threshold as input; computes ; and returns if , and returns if . Note that represents the minimum rate of the number of indices with the same entry in the pair of vectors to be accepted as a genuine pair.
The GRP-based IoM hashing is depicted in Figure 1.
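The GRP-IoM transformation described above can be sketched as follows. This is a minimal illustration, assuming that the token seeds a pseudorandom generator from which the Gaussian matrices are drawn; the parameter names and default values (number of hashed entries m, matrix height q) are illustrative, not the scheme's reference values.

```python
import numpy as np

def grp_iom_hash(x, seed, m=300, q=16):
    """Sketch of GRP-IoM hashing. The token is modeled as a PRNG seed
    (an assumption of this sketch); m and q are illustrative parameters."""
    rng = np.random.default_rng(seed)
    template = np.empty(m, dtype=int)
    for i in range(m):
        W = rng.standard_normal((q, len(x)))   # random Gaussian q-by-n matrix
        template[i] = int(np.argmax(W @ x))    # index-of-max of the projection
    return template

def match_score(t1, t2):
    """Matching score: fraction of positions holding identical indices."""
    return float(np.mean(t1 == t2))
```

Hashing the same feature vector under the same token reproduces the template exactly (score 1), while a fresh token yields an essentially uncorrelated template, which is what makes revocation possible.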
Concrete parameters
In [Jin18Ranking], several experiments are performed to select optimal parameters and . More specifically, the accuracy of the system is analyzed for , and . It is concluded that a large is necessary for better accuracy, and that the effect of on the accuracy is not significant when is sufficiently large. For example, changing the configuration from to changes the equal error rate (EER) of the system from to , a minor improvement of . As a result, the parameter set is commonly referred to in the security and performance analysis of GRPIoM in [Jin18Ranking] with ; see Table IV and Table V in [Jin18Ranking]. For convenient comparison of our results, we also use and as the main reference point in our security analysis in this paper.
URPIoM Instantiation
Let be the symmetric group of all permutations on . Let denote the set of partial permutations for . In other words, permutations in are obtained by restricting permutations in to the first integers . For and , we denote . As an example, for and , restricting the permutation to yields , and we get . Finally, the componentwise (Hadamard) product of two vectors and is denoted by . The secret parameter generation, transformation, and verification operations in URPIoM are performed as follows:

takes the seed as input, and generates partial permutations uniformly at random: for , and . As a result, the secret parameter set consists of the sequence of partial permutations .

takes the secret parameter set , and a fingerprint feature vector as input, and computes

for ,

,
for . The output of is the biometric template .


takes two biometric templates , and a matching threshold as input; computes ; and returns if , and returns if . Note that represents the minimum rate of the number of indices with the same entry in the pair of vectors to be accepted as a genuine pair.
An illustration of the URP-based IoM transformation is given in Figure 2.
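Analogously, the URP-IoM transformation can be sketched as follows, again under the assumption that the token seeds a pseudorandom generator from which the partial permutations are drawn; the parameter names m (hashed entries), p (permutations per entry), and k (restriction length) and their defaults are illustrative.

```python
import numpy as np

def urp_iom_hash(x, seed, m=600, p=2, k=16):
    """Sketch of URP-IoM hashing. The token is modeled as a PRNG seed
    (an assumption of this sketch); m, p, and k are illustrative parameters."""
    rng = np.random.default_rng(seed)
    n = len(x)
    template = np.empty(m, dtype=int)
    for i in range(m):
        prod = np.ones(k)
        for _ in range(p):
            partial = rng.permutation(n)[:k]   # partial permutation: first k images of a random permutation
            prod *= x[partial]                 # Hadamard product of the permuted, truncated copies
        template[i] = int(np.argmax(prod))     # index-of-max of the product vector
    return template
```

As with GRP-IoM, the same feature vector and token always reproduce the same template, and each hashed entry is an index in the range covered by the restriction length k.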
Concrete parameters
In [Jin18Ranking], several experiments are performed to select optimal parameters , , and . It is reported that the best performance over the FVC 2002 DB1 dataset is achieved when and . This parameter set is also referred to in the security and performance analysis of URPIoM in [Jin18Ranking]; see Table V in [Jin18Ranking]. For convenient comparison of our results, we also use and as the main reference point in our security analysis in this paper.
III Stolen Token Attack Models
Let be the set of users of the biometric system. We identify a user with its biometric characteristic, and define a function that takes a biometric characteristic as input, and outputs a digital representation of biometric data ; for instance, the scan of a fingerprint. Note that for two different computations of and (e.g. at different times, or different devices), we may have due to the inherent noise in the measurement of biometric data. Therefore, we model as a probabilistic polynomial time function. We also allow for due to the error rates of recognition systems. In the following, we use to indicate that is chosen from the set uniformly at random.
III-A Reversibility attacks
Let be a feature vector, and let be the template generated from and the secret parameter set . In a reversibility attack, an adversary is given , , and a threshold value , and the adversary tries to find a feature vector such that is exactly the same as , or is close to with respect to the distance function over and the threshold value . In this case, we say that is a nearby-feature preimage (or simply a nearby-feature preimage, when is clear from the context) of the template . More formally, we have the following definition.
Definition 3
Let be a feature vector, and for some secret parameter set . Let be a threshold value. A nearby-feature preimage of with respect to is a feature vector such that = .
As a result, an adversary in a reversibility attack can be modelled as an algorithm that takes and = as input, and that outputs . We say that the adversary is successful if is a nearby-feature preimage of . More formally, we have the following definition.
Definition 4
Let be a cancelable biometric protection scheme and an adversary for a nearby-feature preimage attack. The success rate of , denoted by , is defined as:
Note that an adversary can follow a naive strategy by simply sampling a user from and returning . Under this strategy, the adversary would be expected to succeed with probability , which is the false accept rate of the system with respect to and as the threshold value for the comparison of the pairs of feature vectors. A weakness of the scheme, with respect to the reversibility notion, would require better attack strategies, and this motivates the following definition.
Definition 5
The protection scheme is said to be reversible with advantage , if there exists an adversary such that . If is negligible for all , then we say that is irreversible in the stolen token scenario.
In particular, a protection scheme is irreversible in the stolen token scenario if the success rate of any adversary is not significantly better than the success rate of the strategy of drawing randomly from .
III-B Authentication attacks
Let be a feature vector, and let be the template generated from and the secret parameter set . In an authentication attack, an adversary is given , , and a threshold value , and the adversary tries to find a feature vector such that for , is exactly the same as , or is close to with respect to the distance function over and the threshold value . In this case, we say that is a nearby-template preimage (or simply a nearby-template preimage, when is clear from the context) of the template . More formally, we have the following definition.
Definition 6
Let be a feature vector, and for some secret parameter set . Let be a threshold value. A nearby-template preimage of with respect to is a feature vector such that and .
As a result, an adversary in an authentication attack can be modelled as an algorithm that takes and = as input, and that outputs . We say that the adversary is successful if is a nearby-template preimage of . More formally, we have the following definition.
Definition 7
Let be a cancelable biometric protection scheme and an adversary for finding a nearby-template preimage. The success rate of , denoted by , is defined as:
Note that an adversary can follow a naive strategy by simply sampling a user from and returning . Under this strategy, the adversary would be expected to succeed with probability , which is the false accept rate of the system with respect to and as the threshold value for the comparison of the pairs of templates. This strategy is also commonly known as the false acceptance rate attack in the literature. A weakness of the scheme, with respect to the false authentication notion, would require better attack strategies, and this motivates the following definition.
Definition 8
The protection scheme is said to have the false authentication with advantage property, if there exists an adversary such that . If is negligible for all , then we say that does not have the false authentication property under the stolen token scenario.
In particular, a protection scheme does not have the false authentication property under the stolen token scenario if the success rate of any adversary is not significantly better than the success rate of the strategy of drawing randomly from ; in other words, the success rate of any attack is bounded by the false acceptance rate of the system.
Now, suppose that an adversary knows the secret parameter set of a user (), and the template of the user, where . At this point, the user may renew her token, or register to another system with a new token and a freshly acquired feature vector. Suppose now that the adversary knows the user’s new secret parameter set , but the adversary does not know the user’s new template . In such a scenario, the adversary would try to compute a nearby-template preimage of the template , given , , and . Informally, we call such a nearby-template preimage a long-lived nearby-template preimage. More formally, we have the following definition.
Definition 9
Let be a cancelable biometric protection scheme and an adversary for finding a long-lived nearby-template preimage. The success rate of , denoted by , is defined as:
Note that an adversary can follow a naive strategy by simply sampling a user from and returning . Under this strategy, the adversary would be expected to succeed with probability , as explained in the previous authentication attack model. A weakness of the scheme, with respect to the long-lived false authentication notion, would require better attack strategies, and this motivates the following definition.
Definition 10
The protection scheme is said to have the long-lived false authentication with advantage property, if there exists an adversary such that . If is negligible for all , then we say that does not have the long-lived false authentication property under the stolen token scenario.
In other words, a protection scheme is vulnerable to long-lived nearby-template preimage attacks if an adversary, who knows a user’s previous token and template pair, and the user’s renewed token, can construct a feature vector that can be (falsely) authenticated by the system with some probability greater than the false accept rate of the system.
Remark 4
We should emphasize that in finding long-lived nearby-template preimages, we allow the adversary to know , , and , but we do not allow the adversary to know . Therefore, the problem of finding a long-lived nearby-template preimage is not easier than the problem of finding a nearby-template preimage. This observation also makes sense in practice, as explained in the following. Consider an adversary who has access to an efficient algorithm for finding nearby-template preimages. Such an adversary can be blocked by revoking biometric templates and renewing tokens. On the other hand, an adversary who has access to an efficient algorithm for finding long-lived nearby-template preimages can still be (falsely) authenticated by the system even after user templates are revoked and tokens are renewed. In other words, a successful algorithm for finding long-lived nearby-template preimages would defeat the purpose of the cancellability feature of a system.
III-C Linkability attacks
Let be two feature vectors. Let and be two templates generated from and , and the secret parameter sets and . In a linkability attack, an adversary is given , , , and , and the adversary tries to find out whether and are derived from the same user. As a result, an adversary in a linkability attack can be modelled as an algorithm that takes , , , and as input, and that outputs or , where the output indicates that the feature vectors and are extracted from the same user, and the output indicates that the feature vectors and are extracted from two different users. We say that the adversary is successful if this conclusion (whether the feature vectors are extracted from the same user) is indeed correct. More formally, we have the following definition.
Definition 11
Let be a cancelable biometric protection scheme and an adversary for a linkability attack. The success rate of , denoted by , is defined as:
Note that an adversary can follow a naive strategy by simply sampling a value from uniformly at random. Under this strategy, the adversary would be expected to succeed with probability . This strategy is also known as the guessing attack in the literature. A weakness of the scheme, with respect to the linkability notion, would require better attack strategies, and this motivates the following definition.
Definition 12
The protection scheme is said to be linkable (distinguishable) with advantage , if there exists an adversary such that . If is negligible for all , then we say that is unlinkable (indistinguishable) under the stolen token scenario.
IV Attacks on GRPIoM
In this section, we propose some concrete attack strategies against GRPIoM, and evaluate the impact of our attacks through our implementation over one of the datasets provided in [Jin18Ranking]. More specifically, we use the dataset of features extracted from the fingerprint images of FVC 2002 DB1, as in [Jin18Ranking]. This dataset contains a total of 500 samples: 5 samples per user for 100 users.
As mentioned before, for convenient comparison of our results, we use the GRPIoM parameters , and as the main reference point in our security analysis, because these parameters are commonly referred to in the security and performance analysis of GRPIoM; see Table IV and Table V in [Jin18Ranking].
IV-A Authentication attacks on GRPIoM
Finding nearby-template preimages
As before, let be a feature vector, and let be the template generated from and the secret parameter set . Assume that an adversary knows and . In order to find a nearby-template preimage vector , the adversary proceeds as follows. Since knows , can recover the set of Gaussian random projections in GRPIoM: Let the rows of be denoted by , , …, . Let denote the inner product between the vectors and . Recall that the template produced by GRPIoM is a vector consisting of the indices of the maxima, i.e.
from which recovers the set of inequalities
(1) 
As a result, obtains inequalities in unknowns, and sets to be one of the (arbitrary) solutions of this system (possibly imposing for some positive , for ). By the construction of , we must have , and so , for all . In other words, is expected to get (falsely) authenticated by the server with , or equivalently, . The expected success rate of our attack has been verified in our Python implementation using the cvxopt library [cvxopt] on a computer running Ubuntu 17.10 with the Xfce environment, an i7-4790K 4 GHz processor, 8 GB of RAM, and a 512 GB SATA SSD. The attack runs in the order of seconds for the parameters , and .
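A minimal sketch of this attack is given below. It solves the linear feasibility problem with scipy.optimize.linprog (a stand-in for the cvxopt solver used in our implementation), using a small hypothetical margin eps to enforce strict inequalities and box bounds to keep the program bounded; all names and defaults are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def forge_preimage(template, matrices, eps=1e-4):
    """Find a nearby-template preimage by linear programming. For each hashed
    entry i with winning row index t_i, every losing row j of the matrix W_i
    contributes the constraint (W_i[j] - W_i[t_i]) . x <= -eps."""
    n = matrices[0].shape[1]
    A_ub, b_ub = [], []
    for t_i, W in zip(template, matrices):
        for j in range(W.shape[0]):
            if j != t_i:
                A_ub.append(W[j] - W[t_i])
                b_ub.append(-eps)
    # Zero objective: any feasible point works. Box bounds keep the LP bounded.
    res = linprog(c=np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(-1.0, 1.0)] * n, method="highs")
    return res.x if res.success else None
```

Re-hashing the forged vector with the stolen matrices reproduces the template entry by entry, so the forgery is accepted with a perfect matching score.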
Finding long-lived nearby-template preimages
Let and be two feature vectors of the same user, and let and be two secret parameter sets. Let and . To find a long-lived nearby-template preimage of , we assume that the adversary knows , , and . In our proposed attack, follows the previously described strategy to find a nearby-template preimage based on and , and presents it as a candidate long-lived nearby-template preimage of .
We evaluate this attack by computing both the average and the minimum matching score, over one hundred users, between and the re-enrolled genuine template . Our experiments yield as the average rate of indices with the same entry in and , and as the minimum rate. Therefore, given the matching score thresholds set in [Jin18Ranking], we expect the success rate of the adversary to be . These attack strategies show that GRP-IoM is severely vulnerable to authentication attacks under the stolen token and template attack model, and that adversaries cannot be stopped by renewing templates or tokens. In other words, the cancelability feature of GRP-IoM is violated under the stolen token and template scenario.
Optimizing authentication attacks
Next, we explore whether the attacks can be optimized when a user leaks several token and template pairs. More specifically, assume that an adversary captures token and template pairs , for , derived from the same feature vector . In practice, these pairs may correspond to different enrollments of the user for different services using the same biometric image. Assume further that the adversary possesses another token , but not the template , from a subsequent enrollment of the user with the same feature vector .
Let denote the sets of matrices derived from the tokens. The adversary can either keep all corresponding sets of inequalities, or select a subset of them to decrease both the memory usage and the running time needed to refine the solution. In the following, we denote by the attack using all the constraints, and by the attack using selected constraints. The attack proceeds as follows:

1. First, compute an approximated solution from the pair , and initialize a set of constraints .
2. For , perform the following computations:
   (a) , where ;
   (b) , a vector of differences;
   (c) update the set of constraints as ;
   (d) update subject to the constraints of .
3. Return .
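The variant keeping all constraints can be sketched as below, reusing the same toy Gaussian model as before: every leak contributes its full system of inequalities, and the solution is re-solved after each leak. The helper names and parameters are ours, not the authors':

```python
import numpy as np
from scipy.optimize import linprog

def constraints_from_leak(matrices, template, eps=1e-6):
    """Inequalities (w_j - w_t).x <= -eps implied by one (token, template) leak."""
    A, b = [], []
    for W, t in zip(matrices, template):
        for j in range(W.shape[0]):
            if j != t:
                A.append(W[j] - W[t]); b.append(-eps)
    return A, b

def refine_over_leaks(leaks, n):
    """Keep every constraint from every leak and re-solve after each one."""
    A_ub, b_ub, x = [], [], None
    for matrices, template in leaks:
        A, b = constraints_from_leak(matrices, template)
        A_ub += A; b_ub += b
        res = linprog(c=np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      bounds=[(-1.0, 1.0)] * n, method="highs")
        x = res.x if res.success else x
    return x

# Toy setup: k leaks of the same feature vector under independent tokens.
rng = np.random.default_rng(1)
n, q, m, k = 60, 6, 4, 3
feat = rng.normal(size=n)
leaks = []
for _ in range(k):
    mats = [rng.normal(size=(m, n)) for _ in range(q)]
    leaks.append((mats, [int(np.argmax(W @ feat)) for W in mats]))
x_hat = refine_over_leaks(leaks, n)
# The refined solution reproduces every leaked template simultaneously.
for mats, tmpl in leaks:
    assert [int(np.argmax(W @ x_hat)) for W in mats] == tmpl
```

The selective variant would instead add only a subset of the constraints at each step to bound memory and running time; we omit that selection logic here.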
Recall that the dataset in [Jin18Ranking] contains samples (genuine feature vectors) for each user. Therefore, in our experiments, we consider . We use the linear programming solver of the SciPy optimization library in Python. The linprog function is parameterized with the 'interior-point' solver method, with upper bounds (1) and lower bounds (−1) on the components of the sought solutions, and without an objective function. The experiments yield the results in Table I and Table II, showing an improvement of the matching scores over the previous attacks (for ). Table I reports the matching scores obtained by an attacker using all constraints, and Table II reports the matching scores obtained by an attacker optimizing the number of constraints.

Table I (all constraints kept):
Constr. Number              9,598   14,098   18,598
GRP Match. Score – Min (%)  27.7    33       38
GRP Match. Score – Avg (%)  50.6    53.4     56

Table II (selected constraints):
Constr. Number – Avg        3,281   3,832    4,351
GRP Match. Score – Min (%)  29.3    29.7     38.3
GRP Match. Score – Avg (%)  48.4    50.6     52.8
IV-B Reversibility attacks on GRP-IoM
In the authentication attacks of the previous section, adversarial strategies focus on finding nearby-template preimages , which are not required to be close to the actual feature vector . In a reversibility attack, an adversary finds a nearby-feature preimage , and the quality of the attack is measured by the closeness of to .
Exact reversibility
The best case for an attacker is to have . In [Jin18Ranking], it is argued that the best strategy for an attacker to find is to exhaustively search (guess) the components of . Given the feature vectors extracted from FVC2002-DB1, it is reported in [Jin18Ranking] that the minimum and maximum values of the feature vector components are and , respectively. Therefore, the search space for a feature component consists of possibilities, including the positive and negative signed components. Moreover, the feature vectors in GRP-IoM are of length . Therefore, it is concluded in [Jin18Ranking] that the attack requires exhausting a search space of size . In the following, we propose a better attack strategy to recover . The main idea is to guess the signs of the components of the feature vector, and shrink the search space accordingly. Given a token and template pair of a user, the adversary computes a nearby-template preimage , and guesses the sign of each component of to be the sign of the corresponding component of . If all the signs were guessed correctly, the size of the search space would be reduced from to . However, the adversary may guess some signs incorrectly. Based on our experiments, where we compare the signs of the components of the preimage vectors and the actual feature vectors , we estimate that the probability of guessing the sign correctly per component is . Therefore, we estimate the size of the search space for as . Even though our attack is not practical, it reduces the previously estimated security level of GRP-IoM by bits, from bits to bits.
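As a toy illustration of the sign-guessing step, the per-component agreement rate can be measured as below. We use synthetic Gaussian data as a stand-in for the LP preimages and the FVC2002-DB1 features, which are not reproduced here; the noise level is an arbitrary assumption of ours:

```python
import numpy as np

def sign_agreement(x_true, x_pre):
    """Fraction of components whose sign the preimage predicts correctly."""
    return float(np.mean(np.sign(x_true) == np.sign(x_pre)))

rng = np.random.default_rng(2)
x = rng.normal(size=300)                 # stand-in feature vector
x_pre = x + 0.3 * rng.normal(size=300)   # stand-in nearby-template preimage
p = sign_agreement(x, x_pre)
assert 0.5 < p <= 1.0   # a correlated preimage beats a coin flip
```

A per-component success probability above 1/2 is exactly what shrinks the exhaustive search space relative to guessing each sign blindly.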
Nearby reversibility
Now we analyze attack strategies for finding a nearby-feature preimage of a template under the stolen token attack scenario. The adversary proceeds as in the authentication attacks, except that we now also include an objective function, and solve a linearly constrained quadratic optimization problem. We consider three cases, with objective functions given as follows:

1. .
2. , where is the average feature vector in the database provided in [Jin18Ranking]. For our experiments, one sample per user is attacked, i.e., one hundred programs are solved.
3. , where is a feature vector derived from a fingerprint of the adversary. For our experiments, is picked at random among the samples of one user, and these samples are then removed from the database. Among the remaining samples, one sample per user is attacked, for a total of program instances.
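Case 3 above can be sketched as a two-phase solve: first a feasible template preimage via linear programming, then a quadratic refinement pulling the solution towards the adversary's reference vector. We use SciPy's SLSQP in place of CVXOPT's QP solver purely for self-containment, and all names and parameters are ours:

```python
import numpy as np
from scipy.optimize import linprog, minimize

def nearby_feature_preimage(A_ub, b_ub, x_ref, n):
    # Phase 1: any template preimage (linear feasibility, as in Section IV-A).
    lp = linprog(c=np.zeros(n), A_ub=A_ub, b_ub=b_ub,
                 bounds=[(-1.0, 1.0)] * n, method="highs")
    # Phase 2: minimize ||x - x_ref||^2 over the same constraint region.
    cons = [{"type": "ineq", "fun": lambda x: b_ub - A_ub @ x}]
    res = minimize(lambda x: float(np.sum((x - x_ref) ** 2)), x0=lp.x,
                   constraints=cons, bounds=[(-1.0, 1.0)] * n, method="SLSQP")
    return lp.x, res.x

# Toy instance: constraints from one simulated leak.
rng = np.random.default_rng(3)
n, q, m = 30, 4, 3
x_true = rng.normal(size=n)
mats = [rng.normal(size=(m, n)) for _ in range(q)]
A_ub, b_ub = [], []
for W in mats:
    t = int(np.argmax(W @ x_true))
    for j in range(m):
        if j != t:
            A_ub.append(W[j] - W[t]); b_ub.append(-1e-3)
A_ub, b_ub = np.array(A_ub), np.array(b_ub)
x_ref = rng.uniform(-0.5, 0.5, size=n)   # adversary's own feature (case 3)
x_lp, x_qp = nearby_feature_preimage(A_ub, b_ub, x_ref, n)
# The refined preimage is at least as close to x_ref as the plain LP one.
assert np.linalg.norm(x_qp - x_ref) <= np.linalg.norm(x_lp - x_ref) + 1e-6
```

The other objective functions are handled identically, only the quadratic term in phase 2 changes.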
In our experiments, we use Python and the CVXOPT package [cvxopt], which provides linearly constrained quadratic programming solvers. We measure the success rate of this attack strategy, and report its advantage over the false accept rate of the system. We compute two reference false accept rate values for the dataset provided in [Jin18Ranking]: one with respect to the Euclidean distance, and one with respect to the similarity measure described in Remark 2. We compute using the Euclidean distance with the threshold , and we estimate using the similarity measure with the threshold . Our attacks are evaluated in four settings: cases 1, 2, and 3, where an objective function is used in the order mentioned above, and the case none, where no objective function is used. The results of our experiments, summarized in Table III and Table IV, show that in most cases solving an optimization problem leads to a success rate significantly greater than the false accept rate. We conclude that GRP-IoM is reversible with a single complete leak, both with respect to the Euclidean distance and the dedicated matching score of [Jin16Generating].
Table III:
Objective Function (case)  none  1   2   3
,                          2     63  77  98
,                          0     3   4   27.2
,                          0     0   0   5
,                          0     0   0   0

Table IV:
Objective Function (case)  none  1   2   3
,                          100   0   0   100
Table III also shows that, as expected, the adversary's success rate drops when the Euclidean distance threshold of the system is lowered. In Table IV, we perform a similar analysis when the similarity measure is used with the threshold . We observe that the best adversarial success rates are obtained when no objective function is used, or when the objective function of case 3 is used. The effect of multiple stolen token and template pairs is evaluated for the four cases mentioned previously, and the results are presented in Tables V and VI. When no function is optimized, we see that the success rate increases with the number of stolen pairs, up to 3 pairs, after which it decreases. This decrease may be due to the variability of the feature vector components at each re-enrollment of the user, i.e., when a renewal of the token is required. Since each system of constraints added to the linear program corresponds to a re-enrollment, the amount of error in the constants of the inequalities may exceed the benefit of having more inequalities. Finally, when an objective function is added, our experiments show that nothing is gained from multiple leaks. We should note that our experiments are rather limited due to the sample size; better and more definitive conclusions would require more experiments.
Table V:
Number of leaked pairs  1   2   3   4   5
,                       2   43  68  63  62
,                       0   3   3   3   3
,                       0   0   0   0   0

Table VI:
Number of leaked pairs  1   2   3   4   5
,                       77  71  69  69  69
,                       4   3   3   3   3
,                       0   0   0   0   0
V Attacks on URP-IoM
In this section, we propose concrete attack strategies against URP-IoM, and evaluate their impact through our implementation over one of the datasets provided in [Jin18Ranking]. More specifically, we use the dataset of features extracted from the fingerprint images of FVC2002-DB1, as in [Jin18Ranking]. This dataset contains 500 samples in total: 5 samples per user for 100 users.
As mentioned before, for convenient comparison of our results, we use the parameter set as the main reference point in our security analysis, because these parameters are commonly referred to in the security and performance analysis of URP-IoM in [Jin18Ranking]; see Table V in [Jin18Ranking].
V-A Authentication attacks on URP-IoM
Finding nearby-template preimages
As before, let be a feature vector, and let be the template generated from and the secret parameter set . Assume that an adversary knows and . In order to find a nearby-template preimage vector , the adversary proceeds as follows.
Since knows , can recover the set of permutations in URP-IoM. The template is a vector comprised of the indices of maximum, i.e.
where is the window size, from which the adversary can recover the set of inequalities
Each of these inequalities can be transformed into a linear constraint by taking the logarithm of both sides. The corresponding set of linear constraints is given as follows:
The logarithm adds a new set of constraints, namely , where is the size of the feature vector. then finds one of the (arbitrary) solutions of this system, and sets accordingly. By the construction of , we must have , and so , for all . In other words, is expected to be (falsely) authenticated by the server with , or equivalently, .
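The log-linearization can be sketched as below, under a simplified toy model of our own in which each template entry is the argmax over products of disjoint windows of a permuted positive vector (this is our stand-in, not the exact URP-IoM construction). Working in y = log(x) turns the product inequalities into linear ones, which is why the recovered preimage has only positive components:

```python
import numpy as np
from scipy.optimize import linprog

def urp_preimage(perms, template, n, w, eps=1e-6):
    """Recover a positive preimage matching `template` via a log-domain LP."""
    A_ub, b_ub = [], []
    for perm, t in zip(perms, template):
        windows = [perm[k * w:(k + 1) * w] for k in range(len(perm) // w)]
        for j, win in enumerate(windows):
            if j == t:
                continue
            row = np.zeros(n)
            row[win] += 1.0           # + sum of log x over the losing window
            row[windows[t]] -= 1.0    # - sum of log x over the winning window
            A_ub.append(row); b_ub.append(-eps)
    res = linprog(c=np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(-3.0, 3.0)] * n, method="highs")
    return np.exp(res.x) if res.success else None

def urp_hash(perms, x, w):
    out = []
    for perm in perms:
        windows = [perm[k * w:(k + 1) * w] for k in range(len(perm) // w)]
        out.append(int(np.argmax([np.prod(x[win]) for win in windows])))
    return out

rng = np.random.default_rng(4)
n, w, q = 30, 2, 5
x = rng.uniform(0.1, 2.0, size=n)          # positive toy feature vector
perms = [rng.permutation(n) for _ in range(q)]
tmpl = urp_hash(perms, x, w)
x_pre = urp_preimage(perms, tmpl, n, w)
assert urp_hash(perms, x_pre, w) == tmpl   # preimage reproduces the template
```

The box bounds on y correspond to positivity bounds on x itself; any feasible point suffices for false authentication.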
The expected success rate of our attack has been verified in our Python implementation using the CVXOPT library [cvxopt] on the same machine described earlier. The attack runs in the order of seconds for the parameters , and . We note that, if an adversary captures more than one token and template pair, then additional constraints can further optimize the attack, as previously discussed for GRP-IoM.
Finding long-lived nearby-template preimages
Let and be two feature vectors of the same user, and let and be two secret parameter sets. Let and . To find a long-lived nearby-template preimage of , we assume that the adversary knows , , and . In our proposed attack, follows the previously described strategy to find a nearby-template preimage based on and , and presents it as a candidate long-lived nearby-template preimage of .
We evaluate this attack by computing both the average and the minimum matching score, over one hundred users, between and the re-enrolled genuine template . Our experiments yield as the average rate of indices with the same entry in and , and as the minimum rate. Therefore, given the matching score thresholds set in [Jin18Ranking], we expect the success rate of the adversary to be , on average. These attack strategies show that URP-IoM is severely vulnerable to authentication attacks under the stolen token and template attack model, and that adversaries cannot be stopped by renewing templates or tokens. In other words, the cancelability feature of URP-IoM is violated under the stolen token and template scenario.
Optimizing authentication attacks
Similar to our GRP-IoM analysis, we now explore whether the attacks can be optimized when a user leaks several token and template pairs. More specifically, assume that an adversary captures token and template pairs , for , derived from the same feature vector . Assume further that the adversary possesses another token , but not the template , from a subsequent enrollment of the user with the same feature vector .
Table VII reports the values as the number of leaks increases, and shows that 2 stolen token and template pairs are sufficient to yield when .

Table VII:
Constr. Number              152,998  229,198  305,398
URP Match. Score – Min (%)  12.8     14.7     14.8
URP Match. Score – Avg (%)  28.2     29.6     31.3
VI Linkability Attacks on GRP-IoM and URP-IoM
Recall that an adversary in a linkability attack can be modelled as an algorithm that takes , , , and as input, and outputs or , where the output indicates that the feature vectors and are extracted from the same user, and the output indicates that they are extracted from two different users.
Authentication attacks on GRP-IoM and URP-IoM only return a feature vector that enables successful (false) authentication. Reversibility attacks on GRP-IoM allow the construction of nearby-feature preimage vectors that are somewhat close to the actual feature vector. For example, in the exact reversibility attack on GRP-IoM, we were able to guess the sign of a component of the actual feature vector with estimated probability . In our linkability attack on GRP-IoM, we utilize such sign guessing and partial reversibility results. However, we could not successfully obtain nearby-feature preimage vectors in URP-IoM, mainly because, through the use of geometric programming, all components of a preimage must be nonnegative, whereas an actual feature vector component may well be negative. As a result, the linkability attack techniques for GRP-IoM do not immediately apply to URP-IoM. However, we show that it is still possible to successfully link URP-IoM templates.
An attack on GRP-IoM
Given , , , and , the adversary computes nearby-feature preimage vectors and as explained before. For some decision threshold value , the adversary computes , where is the number of indices at which and have exactly the same sign. Finally, the adversary outputs if , indicating that the feature vectors and are extracted from the same user. Otherwise, if , the adversary outputs , indicating that the feature vectors and are extracted from two different users.
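The decision rule can be sketched as below, with synthetic Gaussian stand-ins for the recovered preimages (the threshold and noise level are our illustrative choices, not the paper's experimental values):

```python
import numpy as np

def link_by_signs(x1, x2, threshold):
    """Output 1 (same user) iff the fraction of indices where the two
    recovered preimages agree in sign reaches the decision threshold."""
    score = float(np.mean(np.sign(x1) == np.sign(x2)))
    return 1 if score >= threshold else 0

# Two noisy preimages of the same feature vector vs. an unrelated user's.
rng = np.random.default_rng(5)
x = rng.normal(size=300)
same_a = x + 0.3 * rng.normal(size=300)
same_b = x + 0.3 * rng.normal(size=300)
other = rng.normal(size=300)
assert link_by_signs(same_a, same_b, threshold=0.7) == 1
assert link_by_signs(same_a, other, threshold=0.7) == 0
```

Unrelated preimages agree in sign on roughly half the indices, while preimages of the same user agree far more often, which is what makes the threshold separation work.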
In our experiments, we created 500 nearby-feature preimages, derived from the 500 templates along with their 500 seeds. Recall that the templates are the transformations (using distinct random seeds) of the feature vectors provided by the authors of IoM hashing [Jin18Ranking]. Using our dataset of nearby-feature preimages (estimated feature vectors) produced by our attack, we estimate the success rate of our attack using the following script:


for between and :
    pick at random two nearby-feature preimage vectors and from the same individual (); if , then .
    pick at random two nearby-feature preimage vectors and from two different individuals; if , then .
return and .
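The estimation loop can be sketched as follows. Synthetic Gaussian preimages stand in for the 500 LP-recovered vectors of the actual experiment, and the threshold, noise level, and trial count are our illustrative assumptions:

```python
import numpy as np
rng = np.random.default_rng(6)

def sign_score(x1, x2):
    return float(np.mean(np.sign(x1) == np.sign(x2)))

# 100 users x 5 synthetic preimages per user (stand-ins for the real ones).
base = [rng.normal(size=300) for _ in range(100)]
pre = [[u + 0.3 * rng.normal(size=300) for _ in range(5)] for u in base]

t, N = 0.7, 1000
p_gen = p_imp = 0
for _ in range(N):
    u = int(rng.integers(100))
    i, j = rng.choice(5, size=2, replace=False)
    if sign_score(pre[u][i], pre[u][j]) >= t:
        p_gen += 1                     # genuine pair correctly linked
    u1, u2 = rng.choice(100, size=2, replace=False)
    if sign_score(pre[u1][0], pre[u2][0]) < t:
        p_imp += 1                     # imposter pair correctly separated
p_gen, p_imp = p_gen / N, p_imp / N
assert p_gen > 0.9 and p_imp > 0.9
```

The attack's overall success rate is then the average of the two rates, as in the paper's evaluation.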
In our experiments, we set and , and obtained and . Therefore, we estimate as the average of the success rates over the genuine and imposter pairs. The running time of the attack is dominated by the computation of nearby-feature preimages, which takes only a few seconds, as mentioned earlier.
An attack on URP-IoM
Given , , , and , the adversary computes nearby-feature preimage vectors and as explained before. For some decision threshold value , the adversary computes the Pearson correlation coefficient of the two preimage vectors, denoted here by $u$ and $v$:
\[
\rho(u,v) \;=\; \frac{\sum_{i}(u_i-\bar{u})(v_i-\bar{v})}{\sqrt{\sum_{i}(u_i-\bar{u})^2}\,\sqrt{\sum_{i}(v_i-\bar{v})^2}},
\]
where $\bar{u}$ and $\bar{v}$ denote the component-wise means of $u$ and $v$.
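The statistic can be computed directly and checked against NumPy's built-in np.corrcoef (a small sketch; the variable names are ours):

```python
import numpy as np

def pearson(u, v):
    """Pearson correlation coefficient of two preimage vectors."""
    um, vm = u - u.mean(), v - v.mean()
    return float(np.sum(um * vm) / np.sqrt(np.sum(um**2) * np.sum(vm**2)))

u = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([2.0, 4.1, 5.9, 8.2])
# Agrees with NumPy's reference implementation.
assert abs(pearson(u, v) - np.corrcoef(u, v)[0, 1]) < 1e-12
```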
The adversary outputs if , indicating that the feature vectors and are extracted from the same user. Otherwise, if , the adversary outputs , indicating that the feature vectors and are extracted from two different users. Following the linkability attack on GRP-IoM, we estimate the success rate of our attack using the following script:


for between