Morphing Attack Detection - Database, Evaluation Platform and Benchmarking

The following paper is a pre-print. The article is accepted for publication in IEEE Transactions on Information Forensics and Security (TIFS).


Abstract

Morphing attacks pose a severe threat to Face Recognition Systems (FRS). Despite the advancements reported in recent works, we note serious open issues that are inadequately addressed, such as independent benchmarking, generalizability challenges and considerations of age, gender and ethnicity. Morphing Attack Detection (MAD) algorithms are often prone to generalization challenges as they are database dependent. The existing databases, mostly of semi-public nature, lack diversity in terms of ethnicity, morphing processes and post-processing pipelines. Further, they do not reflect a realistic operational scenario for Automated Border Control (ABC) and do not provide a basis to test MAD on unseen data in order to benchmark the robustness of algorithms. In this work, we present a new sequestered dataset to facilitate the advancement of MAD, where algorithms can be tested on unseen data in an effort to better generalize. The newly constructed dataset consists of facial images from 150 subjects of various ethnicities, age groups and both genders. In order to challenge the existing MAD algorithms, the morphed images are created from the contributing images with careful subject pre-selection, and are further post-processed to remove morphing artifacts. The images are also printed and scanned to remove all digital cues and to simulate a realistic challenge for MAD algorithms. Further, we present a new online evaluation platform to test algorithms on sequestered data. With the platform we can benchmark the morph detection performance and study the generalization ability. This work also presents a detailed analysis of various subsets of the sequestered data and outlines open challenges for future directions in MAD research.

Biometrics, Morphing Attack Detection, Face Recognition, Vulnerability of Biometric Systems

1 Introduction

Morphing attacks pose threats to Face Recognition Systems (FRS) by exploiting their tolerance towards intra-subject variations. Such attacks constitute a vulnerability in various applications like identity management, identity-verified border crossing and visa management [16]. A morphing attack consists of generating a composite image of two closely resembling subjects (for instance of similar age and same ethnicity) and using the composite image to verify both subjects in an access control scenario. The composite image, hereafter referred to as the Morphed Image, should be of sufficient quality to obtain a score above the threshold recommended by an FRS in an automated face comparison system. It should also be of sufficiently high quality to fool a trained border guard when inspected manually [16].

The morphed image can for instance be obtained by a malicious actor colluding with a person having no criminal record, in order to mask the identity of the malicious actor and obtain a new passport. When a malicious actor is granted a valid identity document, he/she can use it for various purposes, posing a risk to national security in the worst possible scenarios. With such an assertion, the initial work demonstrating morphing attacks illustrated that commercial-off-the-shelf (COTS) FRS could be defeated with a given set of morphed images [16]. That study further assessed whether morphing attacks would succeed when presented to border guards. This means morphing attacks pose a threat to FRS and constitute a major security risk to any nation the malicious actor enters.

Initial studies have investigated various aspects of morphing attacks starting from analysing the vulnerability of FRS in detail [43, 46, 50, 23] to providing measures to detect and mitigate the attacks effectively [52, 43, 5, 6, 7, 4, 49, 8]. Further, a number of works have focused on studying various parameters influencing the decisions of morphing attack detection subsystems, while other works have focused on providing the set of metrics to gauge the strengths of Morphing Attack Detection (MAD) mechanisms. The works have also noted the vulnerability of FRS with respect to morphing attacks, when using the digital images and re-digitized images (digitally captured image which is printed and subsequently scanned/re-digitized). In pursuit of the current State Of The Art (SOTA) in MAD, we first review the related work in the next section.

| Work | Morphing Method | Digital (D) / Re-digitized (R) (Print-and-Scan) | Database (# Morphed images) | Detection Approach | Mode (see Section 2.3) |
|---|---|---|---|---|---|
| Ferrara et al. (2014) [16]* | GIMP GAP | D | 12 | - | - |
| Ferrara et al. (2016) [17]* | GIMP GAP | D | 21 | - | - |
| Raghavendra et al. (2016) [43]* | GIMP GAP | D | 450 | Texture + Classifier | S-MAD |
| Scherhag et al. (2017) [50] | GIMP GAP | D & R | 231 | Texture + Classifier | S-MAD |
| Raghavendra et al. (2017) [44]* | GIMP GAP | D & R | 1423 (2) | Texture + Classifier | S-MAD |
| Raghavendra et al. (2017) [46]* | GIMP GAP | D & R | 362 | Deep-CNN | S-MAD |
| Gomez-Barrero et al. (2017) [23]* | - | D | 840 | - | S-MAD |
| Ferrara et al. (2018) [18] | Sqirlz Morph 2.1 | D & R | 100 | Demorphing | D-MAD |
| Damer et al. (2018) [5] | GAN | D | 1000 | GAN Based Detection | S-MAD |
| Raghavendra et al. (2018) [45] | GIMP-GAP | D & R | 2518 | Color Space Texture + Classifier | S-MAD |
| Scherhag et al. (2019) [49]* | OpenCV/dlib, FaceFusion and FaceMorpher | D & R | 964 (3) | PRNU + Classifier | S-MAD |
| Ferrara et al. (2019) [19] | Sqirlz Morph 2.1 | D & R | 100 | Deep Neural Networks | D-MAD |
| Ferrara et al. (2019) [15]* | Triangulation with Dlib-landmarks | D | 560 (36) | - | - |
| Scherhag et al. (2020) [53] | OpenCV/dlib, FaceFusion, FaceMorpher and UBO Morpher | D & R | 791+3246 (3) | Deep Features | D-MAD |
| Venkatesh et al. (2020) [58]* | StyleGAN | D | 2500 | - | S-MAD |

TABLE I: State of the art in Morphing Attack Databases and Vulnerability Reporting (* indicates vulnerability demonstrated using COTS FRS)

2 Related Work in Morphing Attacks on FRS and Databases

In a broad sense, morphing attacks can be conducted in two ways: (i) morphing attacks using digital images and (ii) morphing attacks using re-digitized images (a.k.a. printed-and-scanned images). The former is inspired by the practice of various countries that allow applicants to upload a digital representation of the face image for applications such as passport renewal in the UK [24] and visa application in New Zealand [10]. The latter applies in the many countries where the passport/visa/identity-card applicant is requested to provide a printed image, such as India [36] and most European countries (e.g. The Netherlands [42]); this leaves the opportunity for a malicious actor to morph the facial image before it is printed. The image submitted by the applicant is thereafter re-digitized for digital processing and biometric enrolment. The earlier works have considered both scenarios and studied the impact of both types of attacks [16, 46, 23, 50]. In this section, we review the key aspects of earlier works in both domains. While the literature of recent years is extensive, we focus in this work on the most relevant works with new databases for MAD. The reader is further referred to Scherhag et al. [52] for a detailed survey of the literature.

2.1 Morphing Attacks Using Digital Images

The first work illustrating morphing attacks was reported in 2014 by Ferrara et al. [16], where a set of morphed images was created using the AR Face Database [31]. To study the vulnerability of FRS, 5 pairs of images of male subjects and 5 pairs of images of female subjects were morphed [16]. Further, to supplement the study, one morphed image constituted by one male and one female subject and another morphed image constituted by 3 male subjects were employed. The study specifically investigated the vulnerability of two commercial FRS - Neurotechnology VeriLook SDK 5.4 [34] and Luxand SDK 4.0 [30]. It asserted the success of all morphed images in reaching a match against the probe images of both constituent subjects, thereby illustrating the vulnerability of face recognition systems. In the following work by Raghavendra et al. [43], the authors investigated the vulnerability of the Neurotechnology Verilook SDK [34] on a larger set of grey scale images with 450 morphed samples from 110 different subjects. In the same work, the authors also proposed a first detection approach suitable for morphed images that are processed only in the digital domain. Further, Scherhag et al. [50] conducted a similar analysis using both a commercial SDK and the OpenFace SDK, an open source face recognition SDK. In yet another work, Raghavendra et al. [46] employed a total of 431 morphed images to evaluate MAD mechanisms using deep neural networks. In a complementary work, Gomez-Barrero et al. [23] investigated the vulnerability of FRS to morphing attacks using 840 images from the Multimodal BioSecure Database [37] in the digital domain and also investigated the vulnerability of fingerprint and iris biometric systems against biometric attacks. As an alternative to morphing approaches, Raghavendra et al. [44] presented another concept of averaging facial images and demonstrated the vulnerability of FRS to morphed and averaged images in the digital domain. The vulnerability was reported again using the Neurotechnology Verilook SDK on a newly created database of 580 morphed images and 580 averaged images. In a different paradigm, Damer et al. [5] presented an approach of generating morphed images using Generative Adversarial Networks (GAN) on a set of 1500 images to create 1000 morphed images. The authors compared the results of the MAD mechanism against traditional Landmark Aligned (LMA) morphing approaches. The vulnerability of the generated database was reported using two open source face SDKs based on the VGG Network [56] and OpenFace [2]. The database was used to devise MAD mechanisms on digital images alone in following works [6, 7, 4, 49, 8].

2.2 Morphing Attacks Using Print and Scanned Images

Motivated by the threat of morphed images to FRS, a number of works have also investigated morphing attacks using re-digitized images (printed and scanned). The key assertion behind these works is that the pixel-level information originally introduced by the morphing process is lost in the subsequent printing and scanning, performed with devices of various vendors, and that this loss decreases the MAD capability. Further, the printing and scanning processes introduce additional noise artifacts into the re-digitized morphed images [50, 44, 45, 25, 18].

The works on detecting re-digitized images employ the same techniques to generate morphs, which are then printed and scanned. Raghavendra et al. [44] introduced a print and scanned database of 1423 morphed images using both morphing and averaging of pixels. The images were printed using a RICOH MPC 6003 SP on high-quality photo paper with 300 density and scanned using a HP Photosmart 5520 scanner at 300 dpi for bona fide, morphed and averaged images. The work also illustrated that the vulnerability of COTS FRS to re-digitized images is equal to that for digital domain images, while the MAD performance dropped. The same work was further extended with a database of 2518 morphed images [45]. In a similar direction, Scherhag et al. [49] introduced a printed-scanned morphed face image database generated using the FRGCv2 face dataset. The authors used the Epson DS-50000 Scanner at 300 dpi to print and scan the morphed images generated using three different morphing schemes (OpenCV/dlib, FaceFusion and FaceMorpher) [49]. Ferrara et al. [18] also introduced a printed-scanned database for MAD, specifically to study the demorphing approach where the authors subtract the re-digitized images to detect a face morphing attack. The morphed images were printed and scanned at 600 dpi using a professional quality photoprinter [18].

Fig. 1: An illustration of the D-MAD pipeline

2.3 Classification of MAD

While the aforementioned works have employed various databases, most of them have also reported corresponding MAD mechanisms to mitigate the threats to FRS. The algorithms for MAD can be classified in two classes:

  • Differential-image MAD (D-MAD): A suspected morph image is compared against an image captured in a trusted environment (e.g., ABC gate) to determine if the suspected image is morphed.

  • Single-image MAD (S-MAD): A suspected morph image is investigated (e.g. in a forensic process), in order to determine if the image itself is morphed without using any prior information or another reference image (captured under a trusted acquisition scenario).

We provide a brief review of the relevant algorithms reported in the recent works for both S-MAD and D-MAD.

Differential-image MAD

The general principle behind the D-MAD algorithms relies on the idea that, given a suspected morphed image and a reference image captured in a trusted environment, the difference between the two is obtained. The lower the difference, either in the image space or in the feature space, the larger the probability that the suspected image is accepted as non-morphed (or bona fide image). The first approach to D-MAD was based on inverting the morphing process in a reverse engineered manner, which was termed Demorphing [18]. In a similar manner, a number of works have been reported where the difference of feature vectors from the bona fide image and from the morph image is used to determine if the suspected image is morphed [53, 33]. The deep features from two different networks are employed to determine the difference in features in [53], while features from the 3D shape and the diffuse reflectance component estimated directly from the image were employed to detect a morphing attack in [33]. Another set of works explored the shift in landmarks between bona fide and suspected morph images in the face region to determine the morphing attack [4, 49]. For the sake of simplicity, a generic illustration of the D-MAD working principle is presented in Figure 1.
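The differential principle can be sketched in a few lines of Python. This is a minimal illustration only, assuming that feature vectors for both images have already been produced by some face feature extractor. The function names `dmad_score` and `is_morph`, the use of the Euclidean distance and the threshold value are hypothetical choices for this sketch and are not taken from any of the cited works.

```python
import math

def dmad_score(suspected_features, trusted_features):
    """Euclidean distance between the feature vector of the suspected image
    and that of the trusted live capture (illustrative scoring choice)."""
    if len(suspected_features) != len(trusted_features):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((s - t) ** 2
                         for s, t in zip(suspected_features, trusted_features)))

def is_morph(suspected_features, trusted_features, threshold=0.5):
    """Flag the suspected image as a morph when the difference is large.
    The threshold is an assumption and would have to be tuned on data."""
    return dmad_score(suspected_features, trusted_features) > threshold

# Toy example with made-up 3-dimensional features.
trusted = [0.10, 0.80, 0.30]
close_match = [0.12, 0.79, 0.31]  # small difference: treated as bona fide
far_match = [0.60, 0.20, 0.90]    # large difference: flagged as a morph
```

In practice the decision threshold would be set on a development set to reach a target error rate; the fixed value here only serves the toy example.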

Single-image MAD

S-MAD algorithms largely rely on learning a classifier to distinguish a bona fide image from a morphed image. Given a suspected morph image, the texture information is extracted from the normalized and aligned face. In the earlier works, texture features such as Binarized Statistical Image Features (BSIF) and Local Binary Patterns (LBP) are used to classify the images with a pre-trained SVM classifier [45, 44, 50]. In a very similar direction, the LBP features were also explored in [57, 49]. Extending these works, another approach was proposed to exploit the colour spaces and the scale spaces jointly [45, 47]. With the intent to also address post-processed morphed images, pre-trained deep networks for the extraction of texture features were employed to detect morphing attacks not only in the digital domain, but also in the re-digitized domain (print-scan) [46]. Notably, the earlier works have employed two deep neural networks, VGG19 [56] and AlexNet [29], and perform feature level fusion of the first fully connected layers from both networks [46]. In a continued effort, other deep networks have been investigated for detecting morph attacks [19]. Another approach to detecting morphing attacks was proposed by extracting features from the "Photo Response Non-Uniformity" (PRNU), where the characteristics of the image sensor were employed to determine whether the image was morphed or not [8]. Motivated by the effectiveness of noise modelling, better performing algorithms have been reported where the colour space has been investigated to search for residuals of the morphing process [60], including dedicated context aggregation networks to automatically model the noise [59].
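To make the texture-based S-MAD idea concrete, the following Python sketch computes a basic 3x3 LBP histogram from a grayscale image. It is a simplified illustration under stated assumptions: the image is given as a list of rows of integer intensities, the face is assumed to be already normalized and aligned, and the function name `lbp_histogram` is hypothetical. The classifier step (e.g. an SVM trained on such histograms) and the BSIF features are not shown.

```python
def lbp_histogram(gray):
    """Normalised 256-bin histogram of basic 3x3 LBP codes.

    `gray` is a list of rows of integer intensities. Border pixels are
    skipped because they lack a full 8-neighbourhood.
    """
    # Clockwise neighbour offsets, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(gray), len(gray[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if gray[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy inputs: a flat patch and a single bright peak.
flat_patch = [[5] * 4 for _ in range(4)]
peak_patch = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
```

The resulting histogram would serve as the feature vector passed to a pre-trained classifier; production systems typically use an optimized library implementation rather than nested Python loops.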

2.4 Limitations

As noted from the set of works listed in the previous section and Table I, there is a need for standardized and reproducible testing of MAD mechanisms. The limitations can be further divided in four main categories:

  • Need for cross-dataset evaluation: As different works have used in-house datasets generated using different approaches, the proposed methods are only evaluated on limited sets. Although the proposed MAD approaches perform very well on the in-house datasets, no works have studied the generalizable detection performance, except recent works [59, 57] which attempt a cross-dataset evaluation. What is missing across studies is a validation of the proposed SOTA approaches in terms of generalizable detection performance, which would also indicate directions for future work. To address this aspect, it is necessary to avoid the classical over-fitting problem for MAD mechanisms.

  • Need for sequestered database: To support the reporting of generalizable detection performance, there is a need for sequestered data, to which researchers do not have access for training purposes, for testing the robustness of MAD algorithms. Sequestered data should solely be used for reproducible testing. Such tests on unknown data will establish a reliable benchmark of algorithms and will indicate whether the algorithms are robust to various factors unknown to researchers.

  • Need for independent evaluation: As a third factor, MAD algorithms are often tuned to perform well on known datasets owing to the nature of in-house datasets. Although the datasets are divided into training, testing and validation sets, researchers have full access to inspect individual cases and can thereby improve their own detection performance iteratively. While this enables continuous development and improvement of algorithms, morphing attacks in a real-life border crossing scenario can be compared to biometrics in the wild, where neither morphing generation algorithms, nor the post-processing approaches or printing and scanning mechanisms can be fully controlled. For the algorithms to be ready for operational deployment, there is a need for independent testing using morphed images which are unknown to the developers.

  • Need for evaluation platform: While independent testing is desired, few organizations host such platforms, which limits researchers in devising robust algorithms. Although a similar evaluation effort is carried out by NIST [35], the NIST FRVT MORPH dataset, especially the subset containing post-processed print-scan and operational ABC gate images, is currently limited in size. Therefore, an independent evaluation platform that runs continuously is needed to facilitate algorithmic evaluation and to benchmark the detection performance against other competing algorithms, along the lines of earlier evaluation platforms from the University of Bologna, which has provided a long-standing fingerprint evaluation system [11, 3].

2.5 Contributions of this work

In order to address these four key limitations, in this work we provide three major contributions followed by the benchmarking of SOTA MAD mechanisms.

  • A large scale sequestered database of morphed and bona fide images, collected at three different sites and comprising photographs of 150 subjects, is released along with this article. The database covers various age groups, equal representation of genders and varied ethnicity, making it a unique database for MAD algorithm evaluation. The morphing of images was conducted with 6 different morphing algorithms presenting a wide variety of possible approaches. The database contains 5,748 morphed face images, with subsets consisting of: (1) morphed images without post-processing to remove digital artifacts, (2) morphed images post-processed to remove artifacts induced while morphing, in order to produce passport quality ICAO photos [26], (3) printed and scanned versions of ICAO standard passport images using different combinations of printers and scanners, including the scanners used in federal ID management offices in Europe. The database is accessible through the FVConGoing platform [3] to allow third parties to perform evaluation and benchmarking.

  • An unbiased and independent evaluation of state of the art MAD algorithms against 5,748 morphed face images and 1,396 bona fide face images. A total of 500,200 attempts with bona fide (69,800) and morphed (430,400) face images are evaluated to report the detection performance of current SOTA MAD mechanisms.

  • A new and independent evaluation platform is further presented to facilitate reproducible research where any researcher, governmental agency or private entity can upload SDKs and measure the performance of their MAD algorithm. The platform provides the benchmarking of the MAD performance against all previously submitted algorithms and specifically provides the results for different subsets corresponding to age, gender or ethnicity. Such detailed analysis will enable the researchers to identify the performance limitations of MAD mechanisms and facilitate them to develop more robust algorithms.

In the remainder of this article, Section 3 presents the newly composed database and describes the details of the entire dataset. The new independent evaluation platform is introduced in Section 4. In Section 5, we present the set of SOTA algorithms that are evaluated on the sequestered dataset. A detailed discussion of results and the analysis of MAD performance is reported in Section 6. Finally, in Section 7 we draw conclusions and list current limitations with the intention of facilitating the development of future algorithms.

| | Digital | Printed&Scanned |
|---|---|---|
| Bona fide enrolment | 300 | 1096 |
| Morphed enrolment | 2045 | 3703 |
| Gate (Trusted live capture) | 1500 | - |

TABLE II: Number of images in the database.

| | Enroll (Min) | Enroll (Max) | Gate (Min) | Gate (Max) |
|---|---|---|---|---|
| Original | 833x1111 | 5184x3456 | 383x533 | 1920x2560 |
| Digital | 362x482 | 4140x5323 | 381x508 | 997x1330 |
| P&S | 337x449 | 552x709 | - | - |

TABLE III: Minimum and maximum image size.

3 SOTAMD Database

As noted in the earlier works, the existing MAD efforts by research institutions are largely based on internally created databases, which are often limited in size, diversity of image capture devices, image quality, realistic post-processing, and variability of morphing algorithms. We note that the practice of using different databases, image acquisition set-ups and testing protocols makes it challenging to benchmark MAD algorithms, and thereby makes it next to impossible for an operator to judge the applicability of current MAD for operational deployment. In order to overcome these limitations and provide a new dataset for benchmarking (both S-MAD and D-MAD algorithms) under realistic conditions with high quality images, we created a new dataset, to which we refer as the State of the Art Morphing Detection (SOTAMD) dataset. The dataset consists of:

  1. Enrolment images: bona fide face images taken in a capture set-up that meets the requirements of passport application photo capture (e.g., photographer studio).

  2. Gate images: bona fide face images captured live with a face capture system in an Automated Border Control (ABC) gate.

  3. Chip images: compressed face images stored on an electronic Machine Readable Travel Document (e-MRTD).

  4. Morphed face images: morphed images created from the pool of passport face images. The database contains different kinds of morphed images as listed below:

    1. Digital morphed images: images obtained directly after morphing in the digital domain.

    2. Digital post-processed morphed images: morphed images that are processed (automatically or manually) in the digital domain, to eliminate or hide the artifacts resulting from a morphing process.

    3. Print-scanned morphed images: post-processed morphed images that are printed and scanned to simulate the passport application process.

A number of factors, explained in the subsequent sections, were considered in creating this dataset as a joint effort in an EU funded project - State-Of-The-Art-Morphing-Detection (SOTAMD).
The number of images in the database and their sizes are given in Table II and Table III, respectively. The bona fide enrolment images have been cropped to remove the background and resized in order to follow the same inter-eye distance distribution as the morphed images, so that it is not possible to infer the image class from its size. The details of the various subsets of data, along with the details on morphing methods, print-scan pipeline and compression, are provided in Table XIII and Table XIV. The images from the database are used to test both S-MAD and D-MAD algorithms according to the testing protocols defined in Section 4.2.

| Gender: Male | Gender: Female | Age: A18-A35 | Age: A36-A55 | Age: A56-A75 |
|---|---|---|---|---|
| 86 | 64 | 87 | 47 | 16 |

| Ethnicity: European | Ethnicity: African | Ethnicity: India-Asian | Ethnicity: East-Asian | Ethnicity: Middle-Eastern |
|---|---|---|---|---|
| 96 | 26 | 10 | 9 | 9 |

TABLE IV: Demographics of the SOTAMD database

| | Automated Morphing | Manually post-processed | Total |
|---|---|---|---|
| Digital images | 1475 | 570 | 2045 |
| Printed & Scanned | 1453 | 2250 | 3703 |
| Total | 2928 | 2820 | 5748 |

TABLE V: Total number of images with morphing and manual post-processing.

| Partner | Algorithm description | Automated post-processing method | Manual post-processing method |
|---|---|---|---|
| Hochschule Darmstadt | FaceMorpher [13] | FaceMorpher's internal post-processing + sharpening | No manual post-processing |
| Hochschule Darmstadt | FaceFusion [12] (only used by HDA) | FaceFusion's internal post-processing + sharpening | GIMP retouching [22] |
| Norwegian University of Science and Technology | FaceMorph (OpenCV with Dlib) [28] | The replacement of the eye region is performed in post-processing, to prevent a double iris. | GIMP retouching [22] |
| Norwegian University of Science and Technology | FantaMorph [14] (only used by NTN) | FantaMorph's internal processing | Adobe Photoshop retouching [1] |
| University of Bologna | Triangulation with Dlib-landmarks | Background replacement, edge suppression, colour equalization | Adobe Photoshop retouching [1] |
| University of Twente | Triangulation with STASM-landmarks [32] | Background replacement, Poisson image editing [39] | GIMP retouching [22] |
| University of Bologna | Triangulation with NT-landmarks | Background replacement, edge suppression, colour equalization | GIMP retouching [22] |

TABLE VI: Contributed morphing methods, manual post-processing methods and automated post-processing methods.

3.1 Subject Pre-selection

An important aspect of creating a successful morph attack is subject selection, such that closely resembling pairs of faces are chosen [50]. Following the guidelines of earlier works, the SOTAMD database was created by selecting morph pairing candidates with high similarity, with careful consideration of age, gender and ethnicity. As an additional measure, the selected morph pairing candidates were also validated by observing the comparison scores from two specific commercial-off-the-shelf (COTS) FRS - Neurotechnology Verilook SDK [34] and Cognitec FaceVacs SDK [21]. All the morphed images that did not verify against probe images from both contributing subjects were classified as the low quality morph set in the final database. This labeling makes the SOTAMD database highly relevant for investigating both low quality and high quality morph detection capability. Such elimination and careful selection has led to 75 unique pairs of candidates for morphing from a total of 150 individuals of various ethnicities and age groups. The subjects were selected from university staff, the student body and a casting agency website. Table IV presents the gender, age and ethnicity demographics of the selected subjects for the final SOTAMD database.
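The labelling rule described above can be sketched as follows. This is an illustrative Python sketch, not the procedure actually run with the COTS SDKs: it assumes similarity scores where higher means more similar and a single decision threshold, and the function name `label_morph_quality` and all numeric values are hypothetical.

```python
def label_morph_quality(score_a, score_b, threshold):
    """Return 'high' when the morph verifies against probe images of both
    contributing subjects, otherwise 'low'.

    score_a, score_b: comparison scores of the morph against a probe image
    of contributor A and contributor B (assumed: higher = more similar).
    threshold: verification threshold of the FRS (assumed value).
    """
    verifies_both = score_a >= threshold and score_b >= threshold
    return "high" if verifies_both else "low"

# Made-up scores for two morphs, with an assumed threshold of 0.7.
morph_1 = label_morph_quality(0.90, 0.80, threshold=0.7)
morph_2 = label_morph_quality(0.90, 0.60, threshold=0.7)
```

A morph that matches only one contributor would not allow both subjects to use the same document, which is why it lands in the low quality set under this rule.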

3.2 Bona Fide enrolment images

For each of the 150 subjects in the SOTAMD database, two enrolment images were captured in high quality studio acquisition set-up reflecting the real-life passport photo capture process. Further, the enrolment images are also printed and scanned to have both digital and correspondingly printed and scanned subsets. The print and scan processes are conducted using various printers and scanners to increase the diversity of the dataset.
Given that this work reflects an operational border control scenario, we have exercised care to make sure the images are ICAO compliant [26]. Thus, each of the images in the enrolment set was processed with professional software to comply with ICAO standards for eMRTD images. The processed images were further used for printing and scanning to closely follow the actual production scenario of passports based on the regulations in the Netherlands and Germany under EU member state regulations.
The number of bona fide enrolment images in the new SOTAMD database is 300 in digital format, and 1096 printed and scanned.

3.3 Morphed enrolment images

To simulate the criminal attack, we generated a number of morphed images to be used for enrolment, i.e. to be hypothetically presented during the passport application process. The morphed images have been created starting from the bona fide enrolment images (one for each subject).

Unlike the previous works noted in Table I, the newly created morphed set in the SOTAMD database has a wide variation of employed morphing processes. Specifically, the morphing set consists of an unprocessed image set and a fully-processed image set. To increase the challenging nature of the dataset and in order to simulate realistic data, the post-processed images are printed and scanned using different pipelines. To further increase the diversity, each image pair was morphed using contributing factors (referred to as the alpha factor) of 0.3 and 0.5 for each of the two contributing faces. Examples of two morphed face images are shown in Figure 2.

(a) Bona fide face image (Contributor A)
(b) Morphed face image with alpha factor = 0.5
(c) Morphed face image with alpha factor = 0.3
(d) Bona fide face image (Contributor B)
Fig. 2: Impact of morphing factors (alpha) on morphing.
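The role of the alpha factor can be illustrated with a pixel-wise blend. The sketch below is a deliberate simplification: it assumes that the two face images have already been aligned and warped to a common geometry, whereas the morphing tools listed in Table VI also perform landmark-based warping. The convention that alpha weights contributor B, and the function name `blend`, are assumptions made for this example.

```python
def blend(image_a, image_b, alpha):
    """Pixel-wise weighted average (1 - alpha) * A + alpha * B.

    image_a, image_b: equally sized grayscale images as lists of rows.
    alpha: contribution of image B, between 0 and 1 (assumed convention).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [[(1 - alpha) * a + alpha * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(image_a, image_b)]

# Toy 1x2 "images" with made-up intensities.
contributor_a = [[100, 200]]
contributor_b = [[200, 100]]
equal_morph = blend(contributor_a, contributor_b, 0.5)   # both contribute equally
biased_morph = blend(contributor_a, contributor_b, 0.3)  # closer to contributor A
```

With an alpha factor of 0.5 both contributors are equally represented, while 0.3 keeps the result closer to one of them, which is the effect illustrated in Figure 2.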

Furthermore, the processed images are resized using the OpenCV library [?] to maintain the same inter-eye distance distribution as observed in the morphed images, to avoid any possibility of inferring the image class from its dimensions. Post-processing methods consist of automatic and/or manual methods to conceal visible, and sometimes easy to detect, morphing traces. Due to such variation in algorithms, any MAD algorithm that can achieve significant detection accuracy on the SOTAMD dataset can be deemed robust. Examples of an automatically and manually post-processed digital morphed face image (left), and the same image after printing and scanning (right), are shown in Figure 3.
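The resizing step can be sketched as a scale computation driven by the inter-eye distance. This Python sketch only derives the target dimensions; the actual pixel resampling (for instance with OpenCV) is not shown. It assumes that eye centre coordinates are available from some landmark detector and that a target inter-eye distance has been drawn from the distribution observed in the morphed images. The function names and all numeric values are hypothetical.

```python
import math

def inter_eye_distance(left_eye, right_eye):
    """Euclidean distance in pixels between the two eye centres (x, y)."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])

def resized_dimensions(width, height, left_eye, right_eye, target_ied):
    """Return (new_width, new_height, scale) so that the inter-eye distance
    of the resized image equals target_ied."""
    scale = target_ied / inter_eye_distance(left_eye, right_eye)
    return round(width * scale), round(height * scale), scale

# Made-up example: an 800x1000 image whose eyes are 200 px apart is
# rescaled so that the eyes end up 100 px apart.
new_w, new_h, scale = resized_dimensions(
    800, 1000, left_eye=(300, 400), right_eye=(500, 400), target_ied=100)
```

Matching the inter-eye distance distribution across classes removes image size as a trivial cue for a detector.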

(a) Both automatically and manually post-processed digital morphed face image
(b) Image 3(a) but printed, scanned and compressed
Fig. 3: Illustration of post-processing - Careful processing to remove the artifacts can be noted in the eyelids, iris and nostril regions to eliminate the traces of the morphing process. Refer to Figure 4 for a detailed illustration.

Examples of a morphed face image, before (left) and after (right) manual post-processing, are shown in Figure 4. Morphed face images that were both automatically and manually post-processed compose the most challenging subset. All the enrolment face images (bona fide and morphed) were processed with ICAO compliance [26] testing software before entering into the database. An overview of the basic subsets of morphed face images is shown in Table V.
A detailed account of the morphing methods that were contributed by each partner can be seen in Table VI which provides the various approaches used for automated and manual post-processing pipelines.

(a) Before
(b) After
Fig. 4: Morphed face image from Figure 3 before and after manual post-processing. Only the central part of the face is shown to better appreciate the effect of artifact removal. Careful processing to remove the artifacts can be noted in the eyelids, iris and nostril regions to eliminate the traces of the morphing process.

A subset of the generated morphed images has been printed and scanned using multiple pipelines (in analogy with the bona fide enrolment images); the number of morphed images in the database is therefore 2045 in digital format and 3703 printed and scanned.

3.4 Gate images

The SOTAMD database contains 10 gate images captured from each subject (overall 1500 images) during a single acquisition session at different locations under a simulated ABC gate operational scenario 2.
As an additional measure, the quality of the images captured in the emulated ABC set-up was validated by reading the corresponding eMRTD chip images and verifying them against the captured gate image using a COTS FRS.
The gate images were captured at two different partner facilities (Norwegian University of Science and Technology, referred to as NTN, and Hochschule Darmstadt, referred to as HDA) from 100 subjects, using set-ups that directly correspond to real ABC gates from two different vendors. These probe images, generated with capture devices from two different vendors, represent images that are used in real operational settings. Another set of gate images (from University of Twente, referred to as UTW) was captured from 50 subjects with a custom-built mock ABC gate. Thus, given three different ABC gate set-ups, the probe set provides variation for benchmarking different MAD algorithms, which demands that the algorithms be robust and agnostic to the capture set-up. Examples of the probe images captured with the different set-ups are illustrated in Figure 5.

(a) Mock ABC gate (UTW)
(b) ABC gate (HDA)
(c) IDEMIA’s MFace gate (NTN)
Fig. 5: Examples of probe face images captured from different ABC set-ups.

4 Evaluation Platform

We further present a new independent evaluation framework to measure the robustness of MAD. The MAD benchmarks have been realized following the testing framework of FVC-onGoing [11, 3]. A web-based automated evaluation platform has been designed to track the advances in MAD, through continuously updated independent testing and reporting of performances on given benchmarks. FVC-onGoing benchmarks are grouped into benchmark areas according to the (sub)problem addressed and the evaluation protocol adopted (e.g. Fingerprint Verification, Palmprint Verification, Face Image ISO Compliance Verification, etc.). To maximize trustworthiness of the results, tests are carried out using a strongly supervised approach on a collection of sequestered datasets and results are reported on-line by using well known performance indicators and metrics. We follow the same design principles to evaluate the MAD algorithms in this work.

The evaluation process is fully automated as illustrated in Figure 6 which consists of participant registration, algorithm submission, performance evaluation, and results visualization.

Fig. 6: The figure shows the architecture of the FVC-onGoing evaluation framework and an example of a typical workflow: a given participant, after registering to the Web Site (1), submits some algorithms (2) to one or more of the available benchmarks; the algorithms (binary executable programs compliant to a given protocol) are stored in a specific repository (3). Each algorithm is evaluated by the Test Engine that, after some preliminary checks (4), executes it on the dataset of the corresponding benchmark (5) and processes its outputs (e.g. comparison scores) to generate (6) all the results (e.g. EER, score graphs), which are finally published (7) on the Web Site.

To protect sensitive information (biometric data) and to prevent external attacks, the FVC-onGoing framework is composed of two different modules physically located in two separate servers:

  • The Front-End server containing the web site and the algorithm repository.

  • The Test Engine server containing the test engine and the benchmark datasets.

A firewall protects the Test Engine server by blocking all inbound and outbound connections on public and private networks. Only a few authorized users can access the Test Engine server from a specific terminal using a protected local connection. Moreover, to avoid undesirable behaviour of the submitted algorithms, all of them are first analysed by antivirus software and then executed in a strongly controlled environment with minimal permissions.
Algorithms can be provided in the form of i) a Win32 console application or ii) a Linux dynamically-linked library compliant to NIST FRVT MORPH specifications [35].

| Benchmark area | Benchmark | Format | Morphing factor | Eye distance Min | Eye distance Q25 | Eye distance Q50 | Eye distance Q75 | Eye distance Max | Bona fide attempts | Morph attempts |
|---|---|---|---|---|---|---|---|---|---|---|
| D-MAD | D-MAD-SOTAMD_D-1.0 | Digital | 0.3 and 0.5 | 80 | 156 | 311 | 515 | 1020 | 3000 | 30550 |
| D-MAD | D-MAD-SOTAMD_P&S-1.0 | Printed & Scanned | 0.3 and 0.5 | 80 | 105 | 115 | 140 | 360 | 10960 | 55530 |
| S-MAD | S-MAD-SOTAMD_D-1.0 | Digital | 0.3 and 0.5 | 90 | 326 | 456 | 533 | 1020 | 300 | 2045 |
| S-MAD | S-MAD-SOTAMD_P&S-1.0 | Printed & Scanned | 0.3 and 0.5 | 80 | 105 | 111 | 138 | 170 | 1096 | 3703 |

TABLE VII: D-MAD and S-MAD benchmarks

Two different benchmark areas (D-MAD and S-MAD) have been created to evaluate the accuracy of MAD algorithms in the differential- and single-image scenarios. Table VII provides detailed information on the benchmarks contained in the two benchmark areas. Algorithms submitted to these benchmarks must comply with specific protocols, whose details are given on the FVC-onGoing web site [3].

4.1 Detection performance evaluation

The evaluation platform is designed to report a number of performance metrics for MAD algorithms as detailed in this section. For each experiment bona fide and morphed face images are used to compute the Bona fide Presentation Classification Error Rate (BPCER) and the Attack Presentation Classification Error Rate (APCER). As defined in [27] the BPCER is the proportion of bona fide presentations falsely classified as morphing presentation attacks while the APCER is the proportion of morphing attack presentations falsely classified as bona fide presentations. The following performance indicators are reported:

  • EER (detection Equal-Error-Rate): the error rate for which BPCER and APCER are identical

  • BPCER10: the lowest BPCER for APCER ≤ 10%

  • BPCER20: the lowest BPCER for APCER ≤ 5%

  • BPCER100: the lowest BPCER for APCER ≤ 1%

  • REJNBFRA: Number of bona fide face images that cannot be processed

  • REJNMRA: Number of morphed face images that cannot be processed

  • Bona fide and Morph detection score distributions

  • APCER(t)/BPCER(t) curves, where t is the detection threshold

  • DET(t) curve (the plot of BPCER against APCER)
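The indicators listed above can be derived from two lists of detection scores. The following is a minimal sketch and not the platform's implementation: it assumes that a higher score means "more likely morphed", that an image is flagged as a morph when its score is at or above the threshold t, and that the EER is approximated at the threshold where APCER and BPCER are closest. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def error_rates(bona_fide: Sequence[float], morphs: Sequence[float],
                t: float) -> Tuple[float, float]:
    """Return (APCER, BPCER) at threshold t; score >= t means 'flagged as morph'."""
    apcer = sum(s < t for s in morphs) / len(morphs)         # morphs accepted as bona fide
    bpcer = sum(s >= t for s in bona_fide) / len(bona_fide)  # bona fide flagged as morph
    return apcer, bpcer


def operating_points(bona_fide: Sequence[float],
                     morphs: Sequence[float]) -> List[Tuple[float, float]]:
    """Evaluate (APCER, BPCER) at every distinct score, plus 'reject nothing'."""
    thresholds = sorted(set(bona_fide) | set(morphs)) + [float("inf")]
    return [error_rates(bona_fide, morphs, t) for t in thresholds]


def bpcer_at(points: Sequence[Tuple[float, float]], max_apcer: float) -> float:
    """Lowest BPCER among operating points whose APCER does not exceed max_apcer."""
    return min(b for a, b in points if a <= max_apcer)


def eer(points: Sequence[Tuple[float, float]]) -> float:
    """Approximate EER: mean of APCER and BPCER where their gap is smallest."""
    a, b = min(points, key=lambda p: abs(p[0] - p[1]))
    return (a + b) / 2


# Toy scores, for illustration only (not SOTAMD data).
toy_bona_fide = [0.1, 0.2, 0.3, 0.6]
toy_morphs = [0.4, 0.7, 0.8, 0.9]
toy_points = operating_points(toy_bona_fide, toy_morphs)
print("EER:", eer(toy_points))
print("BPCER10:", bpcer_at(toy_points, 0.10))
print("BPCER20:", bpcer_at(toy_points, 0.05))
print("BPCER100:", bpcer_at(toy_points, 0.01))
```

Plotting the BPCER values of `toy_points` against the APCER values gives the DET curve; images that an algorithm fails to process (REJNBFRA, REJNMRA) would have to be counted separately before the scores are passed in.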

4.2 Protocols for Evaluation

In order to benchmark the MAD algorithms, we defined two specific protocols for D-MAD and S-MAD respectively:

  • D-MAD: in this case, the algorithms receive as input a pair of images (an enrolment image and a gate image) and are requested to estimate the probability that the enrolment image is morphed, based on a differential analysis of the two input images. The enrolment images available in the database are thus compared against the gate images (i.e. trusted live capture) according to the following protocol:

    • Bona fide images: the bona fide enrolment image is compared against the gate images of the same subject;

    • Morphed images (factor 0.3): the morphed enrolment image is compared against the gate images of the subject who contributed least in the morphing (the hidden identity);

    • Morphed images (factor 0.5): the morphed enrolment image is compared against the gate images of both contributing subjects.

  • S-MAD: in this case, the algorithms receive as input a single image and are requested to estimate the probability that the image is morphed (i.e. to report a morphing likelihood score). To this aim, the probe set consists of the whole set of available enrolment images (bona fide and morphed).

The resulting number of attempts for the two benchmarks is provided in Table VII.
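As a worked check of the D-MAD protocol, the attempt counts can be reproduced from the image counts. The sketch below assumes that every enrolment image is compared against all 10 gate images of each relevant subject; the image counts (1096 bona fide, 1853 morphs with factor 0.3, 1850 morphs with factor 0.5) are those of the printed and scanned S-MAD benchmark in Table IX, and the function name is hypothetical.

```python
GATE_IMAGES_PER_SUBJECT = 10  # gate images per subject in the SOTAMD gate set


def dmad_attempts(n_bona_fide: int, n_morph_03: int, n_morph_05: int,
                  gate_images: int = GATE_IMAGES_PER_SUBJECT):
    """Count D-MAD comparisons per category under the protocol described above."""
    bona_fide = n_bona_fide * gate_images     # gate images of the same subject
    morph_03 = n_morph_03 * gate_images       # only the least-contributing subject
    morph_05 = n_morph_05 * 2 * gate_images   # both contributing subjects
    return bona_fide, morph_03, morph_05


bona_fide, m03, m05 = dmad_attempts(n_bona_fide=1096, n_morph_03=1853, n_morph_05=1850)
print(bona_fide, m03, m05, m03 + m05)  # prints: 10960 18530 37000 55530
```

These values match the bona fide and morph attempts reported for D-MAD-SOTAMD_P&S-1.0 in Table VII and the per-factor comparison counts in Table VIII.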

5 MAD Algorithms

A number of existing state of the art MAD algorithms are evaluated on the newly created SOTAMD database using the new evaluation platform. Within the scope of this work, both D-MAD and S-MAD algorithms have been submitted to the corresponding FVC-onGoing benchmarks. In this section, we provide a brief description of the algorithms that were tested on the newly developed database and the evaluation platform.

5.1 D-MAD

A D-MAD algorithm uses additional information from a second image known to be bona fide (e.g. a live image captured in an ABC gate) to detect morphed face images. D-MAD algorithms obtain the differences between the two images using features (textural or deep features) or landmark shifts. We present the set of D-MAD algorithms evaluated on the SOTAMD database in the subsequent sections.

BSIF

It is based on a set of texture features obtained using the Binarized Statistical Independent Features (BSIF) with an 8-bit filter, applied on the normalized and aligned images [51]. Given the histogram feature vectors of the trusted live capture and of the suspected image, their difference is presented to an SVM classifier pre-trained on the bona fide and morphed data from FERET [41] and FRGC [40] images. The approach also considers a number of post-processing steps, such as median filtering, histogram normalization and sharpness processing, on the images before training the SVM classifier for morphs generated from FaceMorpher and OpenCV.

DFR

It utilizes the information of the embeddings (feature vectors) of the ArcFace algorithm [9], a ResNet based face recognition system. The fundamental idea is to use the feature vectors of the face-generating neural network to train an SVM. Since the neural network does not encounter morphed facial images during training, it can be excluded that the feature extraction overfits to artifacts of certain morphing algorithms, which in turn leads to a higher robustness of the resulting MAD algorithm. The ArcFace feature vector has a length of 512 features. The feature vectors of the e-gate live capture and the suspected morph image are subtracted. The resulting difference is used to train an SVM with RBF kernel. The algorithm evaluated in this paper was trained on the bona fide and morphed data from FERET [41] and FRGC [40]. Details of the DFR MAD algorithm can be found in [53].

MBLBP

It consists of pre-processing, calculation of multi-block LBP from both the trusted live capture and the suspected image, followed by classification as a bona fide or morphed image using a pre-trained SVM classifier [51]. In the pre-processing step, the Dlib landmark detector is used to detect the facial area and the facial landmarks, and the face is realigned and normalized to achieve ICAO compliance [26]. The normalised face image is then cropped to the face region, from which the LBP information is extracted using equally sized blocks of the image. Within each block, a fixed-size pixel window is employed to obtain the histograms. Given the histograms of the two images, their difference is given to the SVM classifier to obtain a final decision on whether the suspected image is morphed or bona fide. Details on the MBLBP algorithm can be found in [51].

WL

This method is based on the fact that facial landmarks are usually averaged between two individuals when morphed images are created. Therefore, the distance of a given landmark (e.g., the right corner of the right eye) between two bona fide images of the same subject will be smaller than the distance of that same landmark between a bona fide image of the subject and an image of the subject morphed with another subject. To exploit this idea, a set of 68 facial landmarks is extracted from each input image using dlib. Subsequently, two types of features are computed: Euclidean distances between landmarks, and angles between a pre-defined set of neighbouring landmarks. In order to account for the reliability of the landmark estimation (e.g., the eye corners are more stable than landmarks on the lips), different weights are applied to the distances before they are classified as bona fide or morphed images using an SVM. Details on the computations of the distances and angles can be found in [48, 4].
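The two feature types described for WL can be illustrated with a short sketch. It is not the implementation of [48, 4]: it assumes that both landmark sets are already expressed in a common, normalised coordinate frame, and the weights, the neighbour pairs and all function names are hypothetical. The resulting feature vector would then be passed to a classifier such as an SVM.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def landmark_distances(a: Sequence[Point], b: Sequence[Point],
                       weights: Sequence[float]) -> List[float]:
    """Weighted Euclidean distance between corresponding landmarks of two faces."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("landmark lists and weights must have equal length")
    return [w * math.dist(p, q) for p, q, w in zip(a, b, weights)]


def segment_angle(p: Point, q: Point) -> float:
    """Orientation (radians) of the segment joining two neighbouring landmarks."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def angle_differences(a: Sequence[Point], b: Sequence[Point],
                      neighbour_pairs: Sequence[Tuple[int, int]]) -> List[float]:
    """Absolute change in segment orientation between the two faces, in [0, pi]."""
    diffs = []
    for i, j in neighbour_pairs:
        d = abs(segment_angle(a[i], a[j]) - segment_angle(b[i], b[j]))
        diffs.append(min(d, 2 * math.pi - d))
    return diffs


def wl_features(a, b, weights, neighbour_pairs) -> List[float]:
    """Concatenate weighted distances and angle differences into one vector."""
    return landmark_distances(a, b, weights) + angle_differences(a, b, neighbour_pairs)
```

With 68 dlib landmarks per face, `a` and `b` would each hold 68 points; a stable landmark such as an eye corner would receive a larger weight than a landmark on the lips.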

DR

This method is based on differentiating the bona fide image captured in a trusted environment (e.g., an ABC gate) from the suspected image stored in the electronic Machine-Readable Travel Document (eMRTD) [33]. Both images are decomposed into normal maps and diffuse maps using SfSNet [54], following which the diffuse reconstructed image and a quantized normal map are obtained. From the diffuse map, the features are extracted using the ‘fc7’ activation layer of AlexNet [29]. The features from the normal map are extracted by converting them to quantized spherical angles (quantization is 24-bit). The features are used to train polynomial SVM classifiers, one for each set of features. The classifiers are then used to determine whether the suspected image is morphed, based on the fusion of the scores from the individual classifiers corresponding to the normal map and the diffuse map. Details on the DR D-MAD algorithm can be found in [33].

Face demorphing

The idea of Face Demorphing (FaDe) [18] involves inverting the morphing process in a reverse engineered manner. Given a suspected image, corresponding to the image stored in the ID document, a morphed image is generally assumed to be a linear combination of multiple images, namely the face images of an accomplice and of a criminal. Conversely, for a genuine ID document (with no morphing attack), the stored image can be regarded as a combination of two identical images, both being the bona fide image.

Given the image captured in a trusted environment, the demorphing algorithm computes a difference between the suspected image and the captured image to obtain a demorphed image. When the demorphed image is compared against the captured image using an FRS, a high comparison score indicates no morphing and a lower score indicates a higher probability of morphing. Ferrara et al. [18] employ Dlib for comparing the trusted capture image and the demorphed image as given below:

(1)

where the thresholds are chosen from empirical trials.
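The inversion at the core of demorphing can be sketched on raw pixel values. This is a simplified illustration rather than the FaDe algorithm of [18]: it assumes a purely pixel-wise linear blend of already aligned images (landmark-based warping is omitted), it assumes the blending weight `w` is known, and the names `morph` and `demorph` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grey-level image with values in [0, 255]


def morph(x: Image, y: Image, w: float) -> Image:
    """Pixel-wise linear blend w * x + (1 - w) * y of two aligned images."""
    return [[w * px + (1 - w) * py for px, py in zip(rx, ry)]
            for rx, ry in zip(x, y)]


def demorph(suspected: Image, live: Image, w: float) -> Image:
    """Invert the blend, assuming `live` shows the identity blended with weight 1 - w."""
    if not 0 < w <= 1:
        raise ValueError("w must be in (0, 1]")
    return [[min(255.0, max(0.0, (ps - (1 - w) * pl) / w))
             for ps, pl in zip(rs, rl)]
            for rs, rl in zip(suspected, live)]


# Toy 2x2 "faces", for illustration only.
accomplice = [[10.0, 200.0], [50.0, 120.0]]
criminal = [[90.0, 40.0], [250.0, 60.0]]

morphed = morph(accomplice, criminal, 0.5)
print(demorph(morphed, criminal, 0.5))   # approximates the accomplice
print(demorph(criminal, criminal, 0.5))  # bona fide case: the same face is returned
```

In the morphed case the demorphed image moves away from the live capture, so a face comparison between the two yields a low score; in the bona fide case the demorphed image stays identical to the live capture and the score remains high.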

5.2 S-MAD

An S-MAD algorithm determines whether an image is morphed directly, i.e. without using a trusted reference image. Most of the S-MAD algorithms first extract features from the suspected image using textural descriptors or deep networks, and then learn a classifier. The learnt classifier is used to determine whether the image is morphed or not. We briefly describe the set of S-MAD algorithms evaluated in this work.

PRNU

This algorithm is based on the analysis of Photo Response Non-Uniformity (PRNU). In essence, the PRNU stems from slight variations among individual pixels during the photoelectric conversion in digital image sensors. As a consequence, it is present in all acquired images and can be considered as an inherent part of any sensor’s output. In fact, the PRNU has been successfully used for different forensic tasks, such as device identification or detection of digital forgeries. For the particular purpose of detecting morphed images [49], the PRNU is extracted from the preprocessed facial images and subsequently split into cells. From each cell, the variance of 100-bin histograms of the PRNU is computed. Then, the minimum value among all cells is thresholded to obtain a bona fide vs. morphed image decision. More details on this MAD mechanism can be found in [49].
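The cell-based scoring step described above can be sketched as follows. This is an illustration under several assumptions and not the method of [49]: the PRNU residual is taken as given (its extraction from the image is not shown), the cell grid and the histogram range are free parameters, and the function names are hypothetical. The direction of the final threshold comparison is left to the caller, since it is not specified here.

```python
from statistics import pvariance
from typing import List, Sequence

Matrix = List[List[float]]


def split_cells(residual: Matrix, rows: int, cols: int) -> List[List[float]]:
    """Split a 2-D residual into rows x cols cells and flatten each cell."""
    h, w = len(residual), len(residual[0])
    cells = []
    for r in range(rows):
        for c in range(cols):
            cells.append([residual[y][x]
                          for y in range(r * h // rows, (r + 1) * h // rows)
                          for x in range(c * w // cols, (c + 1) * w // cols)])
    return cells


def histogram(values: Sequence[float], lo: float, hi: float,
              bins: int = 100) -> List[int]:
    """Fixed-range histogram; values outside [lo, hi) are clamped to the end bins."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[min(bins - 1, max(0, idx))] += 1
    return counts


def prnu_score(residual: Matrix, rows: int, cols: int,
               lo: float, hi: float) -> float:
    """Minimum over all cells of the variance of the cell's 100-bin histogram."""
    return min(pvariance(histogram(cell, lo, hi))
               for cell in split_cells(residual, rows, cols))
```

The returned score would then be compared against a threshold tuned on training data to obtain the bona fide vs. morphed decision.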

Scale-Space Ensemble Approach (SSE)

The algorithm is based on an ensemble approach of extracting textural features followed by learning a classifier [45]. With the set of scores obtained from different classifiers learnt from different features, the final decision is made on whether the image is bona fide or morphed. Specifically, the image is decomposed in different color spaces, such as the YCbCr and HSV spaces. For each channel of the color space, the image is decomposed into different scale spaces using a Laplacian pyramid with a 3-level decomposition. Further, different textural features are obtained using Binarized Statistical Independent Features (BSIF), Local Binary Patterns (LBP) and Histogram of Gradients (HOG). The obtained features are then used to learn the Collaborative Representation Classifier (CRC). While the testing is carried out on the SOTAMD dataset, the training was performed on a dataset derived from the FRGC face dataset. More details can be found in [46].

Deep-S-MAD

This algorithm uses well-known pre-trained CNNs to detect morphed images [19]. The pre-trained networks have been fine-tuned using a large set of artificially generated digital images (both bona fide and morphed). Moreover, in order to deal with the print and scan process (P&S), a further fine-tuning step has been performed for the P&S case, exploiting a set of images artificially generated to simulate P&S. The simulation follows a mathematical model that allows control of different image characteristics, related to both image visual quality and low-level signal content. In particular, the main visual effects produced when an image is printed and scanned can be successfully reproduced (blurring, gamma correction, color adjustment or noise).
The AlexNet architecture pre-trained on ImageNet [29] has been used on digital images while the VGG-Face16 [38] architecture pre-trained on the VGG-Face dataset [38] has been used on P&S images.

S-MBLBP

The created classification system extracts multi-block local binary patterns from a face image and uses a support vector machine with a linear kernel to classify it as either morphed or bona fide [51]. The approach optimises the feature extraction process by using uniform LBPs with radius r = 1 (i.e. number of neighbours n = 8) and a histogram layout of 3×3. Before feature extraction, the face is detected and cropped with a HOG-based face detector [28], converted to grey scale and, finally, histogram equalization is applied to enhance image contrast. The histogram layout is realized by splitting the face image by 2 equidistant vertical and 2 equidistant horizontal lines. A single histogram contains 59 feature values, which means that after concatenating the 9 histograms of our layout our feature space has 531 dimensions. The classifier was trained on [40] and [55]. As pre-processing steps, all training images were converted to png format without any compression, to avoid jpg compression artefacts being detected, and resized using nearest neighbour interpolation to the average size of the three training datasets. Additionally, faces were horizontally aligned to make them similar to (ICAO compliant) benchmark images.
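The figure of 59 values per histogram and 531 dimensions overall follows from the definition of uniform LBP. As a short check, assuming the usual convention that a pattern is uniform when its circular bit string has at most two 0/1 transitions and that all non-uniform patterns share one extra bin (function names are hypothetical):

```python
def transitions(pattern: int, bits: int = 8) -> int:
    """Number of 0/1 changes when the bit pattern is read circularly."""
    return sum(((pattern >> i) & 1) != ((pattern >> ((i + 1) % bits)) & 1)
               for i in range(bits))


def uniform_lbp_bins(bits: int = 8) -> int:
    """Histogram bins: one per uniform pattern plus one shared non-uniform bin."""
    uniform = sum(transitions(p, bits) <= 2 for p in range(2 ** bits))
    return uniform + 1


bins_per_block = uniform_lbp_bins(8)
blocks = 3 * 3  # two vertical and two horizontal cuts give a 3x3 layout
print(bins_per_block, blocks * bins_per_block)  # prints: 59 531
```

With n = 8 neighbours there are 58 uniform patterns, hence 59 bins per block and 9 × 59 = 531 features for the concatenated histogram.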

6 Results and Discussion

6.1 Results - D-MAD

The results observed in the Digital Image Benchmark (D-MAD-SOTAMD_D-1.0) are reported in Figure 7 (see also Table X in Appendix for the results on the two subsets with morphing factor 0.3 and 0.5 respectively). In particular, the DET plots in Figure 7 refer to the overall results; additional results are reported in Appendix A.

Fig. 7: DET plots for the D-MAD-SOTAMD_D-1.0.

The detection accuracy of some of the evaluated algorithms is quite modest. Two algorithms perform better than the average, and the algorithm DFR in particular reaches very promising results. The general under-performance of the MAD algorithms with respect to the detection accuracy reported in the original publications could be due to the difficulty of the benchmark dataset and to the over-specialization of said algorithms on the native training sets used previously in the research labs. As to the FaDe approach, its better generalization capability is probably due to the absence in the method of a specific training stage and/or hyperparameter tuning. The good performance of DFR can be attributed to the fact that the ArcFace algorithm used for feature extraction was trained independently of morphed images, and thus the extracted feature vectors are not overfitted to the artifacts of individual morphing algorithms.
Table X reports the performance of the tested MADs on the entire set of images as well as separately for the subsets of images with morphing factor 0.3 and 0.5. The results related to the morphing factor 0.3 are in general slightly better than those obtained on the entire database. A noticeable improvement across all the performance indicators can be observed only for the DFR and FaDe algorithms. The behavior of FaDe is explainable if we consider that the algorithm has been designed to work on asymmetric morphings. The performance gain of DFR can be attributed to the use of the difference vector: if the morphing factor is lower, the difference increases and so does the possibility of detecting the morph.

To better understand which image characteristics affect the MAD performance the most, the results have been analyzed for specific subsets of images, described in Table XII presented in Appendix. The subsets have been selected according to the number of images available (subsets that are too small are therefore discarded).

The degree of influence of each specific subset with respect to the overall performance has been evaluated by computing, for each subset s, the percentage deviation $\Delta_s$ between the EER measured on the specific subset ($EER_s$) and the EER measured on the whole set of images ($EER$):

$$\Delta_s = \frac{EER_s - EER}{EER} \times 100 \qquad (2)$$

A negative deviation indicates that the specific subset is “easier” with respect to the overall set of images (a lower EER value has been observed), while high positive values identify more difficult subsets. The deviation computed for each algorithm, as well as the average deviation ($\overline{\Delta}_s$) for the subset of tests with morphing factor 0.3, are reported in Table XV in Appendix, where the results are sorted by $\overline{\Delta}_s$. Some interesting results can be observed, in relation to the main attributes characterizing the database images:

  • Ethnicity: in general the morphed images produced with Indian-Asian and Middle Eastern subjects are easier to detect for most of the algorithms. The cardinality of these subsets is lower than European/American, and the chance of selecting lookalike subjects for morphing was lower.

  • Automatic or manual post-processing: as expected manual post-processing (i.e., retouching for artefact removal) makes morphing detection more difficult w.r.t. automatic post-processing, even if the difference is just minor here.

  • Manual post-processing technique: significant differences can be observed in relation to the manual post-processing executor, thus confirming the importance of manual retouching aimed at removing small artefacts; while PM03 and PM06 are easier to detect, especially for some algorithms, PM02 and PM05 are more difficult to spot.

  • Subset of Morphs: the subset containing UTW images is more difficult with respect to those from the other partners. In fact, in this case, very similar pairs of subjects were selected, making the resulting morphs more difficult to be detected.

  • Morph quality: as expected high quality morphs (i.e., those accepted by commercial face verification algorithms) are more difficult to detect than low quality morphs (i.e., those already rejected by face verification algorithms).

  • Morphing algorithm: the results over different morphing algorithms are quite different; algorithms C06, C07 and C03 are generally easier to detect, while C02 and C01 are quite hard for most of the D-MAD algorithms.

  • Age: the results on subjects in the range 56-75 are generally much worse than those related to younger subjects; as per the Traits subsets (see below) we argue that the transfer of evident skin characteristics such as wrinkles, freckles or moles, can make the morphed images similar enough to both subjects.

  • Gender: morphing detection in female subjects looks on average more difficult.

  • Traits: the error rate on images with specific traits (moles, freckles) is on average higher than that measured on images without particular facial traits. See the above discussion on Age.

The results reported in Table XV (Appendix) show that, even if a common behaviour can be observed for several subsets, in a number of cases (e.g. Type of Post-processing or Ethnicity) different algorithms provide significantly different performance. This leads us to suppose that the tested D-MADs produce quite independent errors and a combination of such different techniques can lead to a performance improvement.
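The percentage deviation used in the subset analysis above can be illustrated numerically. The sketch assumes the deviation is computed as (EER on the subset − overall EER) / overall EER × 100, which is consistent with the stated sign convention (negative means an easier subset). The sample EERs are the DFR and FaDe values of Table VIII (printed and scanned benchmark, overall vs. morphing factor 0.3) and are used only as example inputs; the function names are hypothetical.

```python
from typing import Sequence


def eer_deviation(eer_subset: float, eer_overall: float) -> float:
    """Percentage deviation of a subset EER from the overall EER."""
    return (eer_subset - eer_overall) / eer_overall * 100.0


def average_deviation(deviations: Sequence[float]) -> float:
    """Mean deviation of one subset across the evaluated algorithms."""
    return sum(deviations) / len(deviations)


# (overall EER, EER on the factor-0.3 subset), in percent, from Table VIII.
sample = {"DFR": (4.62, 2.09), "FaDe": (17.22, 11.25)}
deviations = {name: eer_deviation(sub, overall)
              for name, (overall, sub) in sample.items()}
for name, dev in deviations.items():
    print(f"{name}: {dev:+.2f}%")
print(f"average: {average_deviation(list(deviations.values())):+.2f}%")
```

Both deviations are negative, reflecting that the factor-0.3 subset is easier than the overall set for these two algorithms.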

| Test | Bona fide comparisons | Morphed comparisons | Algorithm | EER | BPCER10 | BPCER20 | BPCER100 | REJNBFRA | REJNMRA |
|---|---|---|---|---|---|---|---|---|---|
| Overall | 10960 | 55530 | BSIF | 51.36% | 95.66% | 98.38% | 99.55% | 1.35% | 1.92% |
| | | | DFR | 4.62% | 1.77% | 4.08% | 19.70% | 1.46% | 2.11% |
| | | | MBLBP | 29.28% | 51.50% | 62.38% | 81.16% | 2.66% | 3.56% |
| | | | WL | 36.17% | 70.37% | 82.75% | 95.58% | 3.47% | 4.19% |
| | | | DR | 50.13% | 90.26% | 95.37% | 99.18% | 0.00% | 0.00% |
| | | | FaDe | 17.22% | 24.82% | 32.37% | 74.61% | 0.16% | 0.25% |
| 0.3 | 10960 | 18530 | BSIF | 50.98% | 95.60% | 98.39% | 99.56% | 1.35% | 1.93% |
| | | | DFR | 2.09% | 1.55% | 1.55% | 12.39% | 1.46% | 2.13% |
| | | | MBLBP | 27.58% | 47.03% | 57.72% | 75.76% | 2.66% | 3.63% |
| | | | WL | 31.83% | 62.40% | 76.43% | 93.49% | 3.47% | 4.26% |
| | | | DR | 50.38% | 90.42% | 95.64% | 99.25% | 0.00% | 0.00% |
| | | | FaDe | 11.25% | 12.74% | 20.56% | 38.38% | 0.16% | 0.23% |
| 0.5 | 10960 | 37000 | BSIF | 51.54% | 95.67% | 98.35% | 99.55% | 1.35% | 1.92% |
| | | | DFR | 5.34% | 2.21% | 5.60% | 23.16% | 1.46% | 2.09% |
| | | | MBLBP | 30.11% | 53.28% | 64.63% | 83.56% | 2.66% | 3.53% |
| | | | WL | 38.15% | 73.32% | 84.90% | 96.51% | 3.47% | 4.15% |
| | | | DR | 49.96% | 90.20% | 95.22% | 99.08% | 0.00% | 0.00% |
| | | | FaDe | 19.68% | 28.55% | 38.46% | 100.00% | 0.16% | 0.27% |

TABLE VIII: Performance indicators measured on the D-MAD-SOTAMD_P&S-1.0 benchmark for the overall set of images and for the subsets of images with morphing factor 0.3 and 0.5.

| Test | Bona fide comparisons | Morphed comparisons | Algorithm | EER | BPCER10 | BPCER20 | BPCER100 | REJNBFRA | REJNMRA |
|---|---|---|---|---|---|---|---|---|---|
| Overall | 1096 | 3703 | PRNU | 48.04% | 85.86% | 97.35% | 100.00% | 0.09% | 0.00% |
| | | | SSE | 54.37% | 94.89% | 98.27% | 99.91% | 0.00% | 0.00% |
| | | | Deep-S-MAD | 37.10% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| | | | S-MBLBP | 43.34% | 100.00% | 100.00% | 100.00% | 0.09% | 0.00% |
| 0.3 | 1096 | 1853 | PRNU | 48.49% | 86.13% | 97.17% | 100.00% | 0.09% | 0.00% |
| | | | SSE | 55.18% | 94.89% | 98.36% | 99.91% | 0.00% | 0.00% |
| | | | Deep-S-MAD | 38.26% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| | | | S-MBLBP | 44.52% | 100.00% | 100.00% | 100.00% | 0.09% | 0.00% |
| 0.5 | 1096 | 1850 | PRNU | 47.29% | 85.86% | 97.45% | 100.00% | 0.09% | 0.00% |
| | | | SSE | 53.74% | 94.80% | 97.99% | 99.91% | 0.00% | 0.00% |
| | | | Deep-S-MAD | 35.43% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| | | | S-MBLBP | 42.15% | 100.00% | 100.00% | 100.00% | 0.09% | 0.00% |

TABLE IX: Performance indicators measured on the S-MAD-SOTAMD_P&S-1.0 benchmark for the overall set of images and for the subsets of images with morphing factor 0.3 and 0.5.

The results obtained on the P&S Image Benchmark (D-MAD-SOTAMD_P&S-1.0) are summarized in Fig. 8. While for the best performing approach (DFR) the detection accuracy on Digital and P&S images is similar, in general a performance drop on Print and Scan images can be observed; for example, for the demorphing method (FaDe) the BPCER values are about 10% higher. Also in this case the influence of the morphing factor on the MAD performance can be observed in Table VIII reporting the results for the overall set of images and for the subsets of images with morphing factor 0.3 and 0.5.

Fig. 8: DET plot for the D-MAD-SOTAMD_P&S-1.0.

6.2 Results - S-MAD

The results of S-MAD algorithms on printed-scanned images are given in Table IX and on digital images in Table XI (Appendix) respectively. In this case the overall performance is quite unsatisfactory in general and very far from the accuracy needed in real operational conditions. No significant differences can be observed between the different test cases: morphing factor 0.3 or 0.5, digital or printed-scanned images. We can conclude that morphing attack detection based on the analysis of the single image is still very complex, particularly in the presence of heterogeneous image sources, different processing pipelines and high quality morphs obtained through a careful selection of subjects and an accurate post-processing aimed at removing all visible artifacts. The results confirm again the importance of cross-database training and testing to improve the robustness of detection algorithms.

6.3 Directions for Future Works

As noted from the results reported in the previous sections, it is evident that the accuracy of MAD does not meet the operational requirements. If we focus on BPCER100, we can see from Tables X and VIII that the result is around 20% for the best performing D-MAD approach. For all S-MAD algorithms (see Table IX and Table XI in Appendix), BPCER100 is higher than 90%. From a practical point of view, this behaviour would cause a considerable number of false alarms and, as a consequence, a high number of false rejections during face verification at ABC gates. This would be unacceptable if we consider that operational face verification systems for ABC gates are expected to work at a False Accept Rate (FAR) of 0.1 per cent with a False Rejection Rate (FRR) not higher than 5% [20].

  • Given the number of covariates impacting the MAD performance such as age, gender and ethnicity, accurate and better algorithms need to be developed to address the complex challenge of morphing attacks. The results presented in this work also suggest that the combination of approaches of different nature could lead to a general performance improvement.

  • As can also be noted from Table VIII, the print and scan process reduces the MAD accuracy to a large extent. Reliable and accurate algorithms need to be developed for detecting morphing attacks specifically when images are processed through the print and scan pipeline.

  • As a complementary direction, the human detection performance should be studied in a standardized manner to understand the key factors in spotting the morphing attacks on FRS.

7 Conclusion and Summary

Given the complex nature of morphing attack detection and its impact on operational FRS, we presented a new evaluation framework and a new database of morphed images in this work. The sequestered morphed dataset being publicly available allows researchers to benchmark their algorithms in a continuous manner and to contribute to the development of morphing attack detection. Further, this work also provides a benchmark of the existing state of the art algorithms to give a clear idea of the limitations of the existing algorithms for MAD.

Acknowledgments

The authors would like to thank the European Commission for supporting this work, funded by the SOTAMD project. The content of this report represents the views of the authors only and is their sole responsibility. The European Commission does not accept any responsibility for use that may be made of the information it contains. Further, we are grateful to our colleagues at the German Federal Office for Information Security (BSI), the Hochschule Bonn-Rhein-Sieg (H-BRS) and to the Norwegian Police for the support in the data acquisition.

Kiran Raja obtained his PhD in Computer Science from the Norwegian University of Science and Technology (NTNU), Norway in 2016. He is a faculty member at the Dept. of Computer Science at NTNU, Norway. His main research interests include statistical pattern recognition, image processing, and machine learning with applications to biometrics, security and privacy protection. He was/is participating in EU projects SOTAMD, iMARS and other national projects. He has authored several papers in his field of interest and serves as a reviewer for a number of journals and conferences. He is a member of EAB and chairs the Academic Special Interest Group at EAB.

Matteo Ferrara is an associate professor at the Department of Computer Science and Engineering of the University of Bologna, Italy. His research interests are in the areas of Pattern Recognition, Computer Vision, Image Processing and Machine Learning. He received his bachelor’s degree cum laude in Computer Science from the University of Bologna in 2004, his Master’s degree cum laude in 2005 and his Ph.D. in 2009. Most of his applied research is in the field of Biometric Systems. He is a member of the Biometric System Laboratory and one of the organizers of the international performance evaluation initiative named “FVC-onGoing”. Moreover, he is one of the authors of the well-known fingerprint recognition algorithm named “Minutia Cylinder-Code” (MCC). Finally, he first proved that face morphing can be exploited to fool Automated Border Control (ABC) systems. He is the author of several scientific papers and has served as a referee for international conferences and journals. He took part in national and European research projects and in consultancy projects between the University of Bologna and foreign universities and companies.

Annalisa Franco is an Assistant Professor at the Department of Computer Science and Engineering, University of Bologna, Italy. In 2004 she received her Ph.D. in Electronics, Computer Science and Telecommunications Engineering at DEIS, University of Bologna, for her work on Multidimensional Indexing Structures and their application in pattern recognition. She is a member of the Biometric System Laboratory at Computer Science - Cesena. She has authored several scientific papers and served as a referee for a number of international journals and conferences. Her research interests include Pattern Recognition, Biometric Systems, Image Databases and Multidimensional Data Structures. Her recent research activity is mainly focused on automatic face recognition.

Luuk Spreeuwers studied Electrical Engineering at the University of Twente, Netherlands. In 1992 he obtained his PhD from the University of Twente. The title of his PhD thesis is: Image Filtering with Neural Networks: Applications and Performance Evaluation. Subsequently, Luuk Spreeuwers worked at the International Institute for Aerospace and Earth Sciences (ITC) in Enschede, Netherlands, at the University of Twente in a SION project on 3-D image analysis of aerial image sequences, and in Budapest at the Hungarian Academy of Sciences in a 3-D textures ERCIM project. From 1999 to 2005 Luuk Spreeuwers worked on 3-D modeling and segmentation of the human heart in MRI at the Image Sciences Institute of the University Medical Centre in Utrecht, the Netherlands. Currently, Luuk is an Associate Professor at the University of Twente, where he manages and is involved in various biometric research projects in the area of 2D and 3D face recognition, face morphing attack detection and finger vein recognition.

Ilias Batskos holds a diploma in Electrical and Computer Engineering from the Aristotle University of Thessaloniki and an MSc degree in Forensic Science from the University of Amsterdam. He is currently working towards a PhD in Computer Science at the University of Twente. At the Database Management and Biometrics (DMB) group he has focused his research on the evaluation of biometric systems and the effect of morphing on biometric system performance.

Florens de Wit is a researcher in biometrics at the University of Twente in the Netherlands. He received an MSc in Applied Physics in 1999 at the Eindhoven University of Technology, and an MSc in Forensic Science in 2006 at the University of Amsterdam. Before joining the Database Management and Biometrics (DMB) group of the faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS) in 2017, he worked as a researcher at the Netherlands Forensic Institute (NFI) and as a lecturer at the Saxion University of Applied Sciences. At the DMB group he has focused his research on the evaluation of biometric systems and the effect of morphing on biometric system performance.

Marta Gomez-Barrero is a Professor for IT-Security and technical data privacy at the Hochschule Ansbach, in Germany. Between 2016 and 2020, she was a postdoctoral researcher at the National Research Center for Applied Cybersecurity (ATHENE) - Hochschule Darmstadt, Germany. Before that, she received her MSc degrees in Computer Science and Mathematics (2011), and her PhD degree in Electrical Engineering (2016), all from Universidad Autonoma de Madrid, Spain. Her current research focuses on security and privacy evaluations of biometric systems, Presentation Attack Detection (PAD) methodologies, and biometric template protection (BTP) schemes. She has co-authored more than 70 publications, chaired special sessions and competitions at international conferences, is an associate editor for the EURASIP Journal on Information Security, and represents the German Institute for Standardization (DIN) in ISO/IEC JTC1 SC37 on biometrics. She has also received a number of distinctions, including: EAB European Biometric Industry Award 2015, Best Ph.D. Thesis Award by Universidad Autonoma de Madrid 2015/16, Siew-Sngiem Best Paper Award at ICB 2015, Archimedes Award for young researchers from Spanish MECD, and Best Poster Award at ICB 2013.

Ulrich Scherhag received his B.Eng degree (Electrical Engineering) in 2012 from the Duale Hochschule Baden-Württemberg, Mannheim. He started studying computer science in 2014 at Hochschule Darmstadt and received the M.Sc degree (Computer Science, IT-Security) in 2016, for which he was granted the CAST Award IT-Security 2016. Since 2016 he has been a Ph.D. Student Member of da/sec at the National Research Center for Applied Cybersecurity (ATHENE). He is a member of the European Association for Biometrics (EAB) and a Reviewer for the International Conference of the Biometrics Special Interest Group (BIOSIG) and IEEE Access. His current research focuses on presentation attack detection and morphed face detection.

Daniel Fischer completed his bachelor study program at University of Applied Sciences Darmstadt in cooperation with Deutsche Telekom AG in 2015, where he received his “Bachelor of Science” degree in Computer Science. He subsequently completed his master's degree study program (“Master of Science” in Computer Science) with a focus on IT-Security and Biometrics at University of Applied Sciences Darmstadt in 2018. Since 2015 he has been a member of the IT-Security group (“da/sec”) at University of Applied Sciences Darmstadt, which is a part of the National Research Center for Applied Cybersecurity (ATHENE).

Sushma Krupa Venkatesh has been a PhD candidate at the Norwegian University of Science and Technology (NTNU), Norway, since 2016. She obtained her bachelor’s degree in computer science in 2008 and her master’s degree in computer science and technology in 2011. Her recent research interests include statistical pattern recognition, image processing, and machine learning with applications to biometrics, privacy and security. She has authored a number of technical papers in various journals and conferences and serves as a reviewer for various scientific publication venues.

Jag Mohan Singh received the BTech (Honors) and MS degrees by research in computer science from the International Institute of Information Technology (IIIT), Hyderabad, in 2005 and 2008, respectively. He worked in the industrial research & development departments of Intel, Samsung, Qualcomm, and Applied Materials in India from 2010 till 2018. He is currently a PhD student at the Norwegian University of Science and Technology (NTNU Gjøvik) in the Norwegian Biometrics Laboratory (NBL). His research interests include presentation attack detection for face and finger modalities.

Guoqiang Li received his bachelor’s degree from JiLin University (2005, China), his master’s degree from Harbin Institute of Technology (2007, China), and his PhD degree in information security from the Norwegian University of Science and Technology in 2016. He is currently working as a senior researcher in the Department of Information Security and Communication Technology at the Norwegian University of Science and Technology (NTNU), Gjøvik, Norway. His research interests are related to fingerprint recognition, face recognition, biometric template protection, behavioural biometrics, and image processing.

Loïc Bergeron obtained his Master’s degree in Computer Security in 2018 at Caen Normandie University. After his participation in a research project on keystroke dynamics at the Norwegian University of Science and Technology (NTNU), he became a member of the SOTAMD project in 2019. He was involved in the different parts of this biometric research project and mainly coordinated the data collection.

Sergey Isadskiy holds a Master's degree in computer science from the Hochschule Darmstadt, Germany. He is a former member of the da/sec - Biometrics and Internet Security research group at Hochschule Darmstadt and the National Research Center for Applied Cybersecurity (ATHENE). His research interests include pattern recognition with a focus on biometrics, in particular face and speaker recognition.

Raghavendra Ramachandra received the bachelor’s degree from the University of Mysore (UOM), Mysore, India, the master’s degree in electronics and communication from Visvesvaraya Technological University, Belgaum, India, and the Ph.D. degree in computer science and technology from UOM and Institute Telecom, and Telecom Sudparis, Évry, France (carried out as a collaborative work). He is currently appointed as a Professor with the Norwegian Biometric Laboratory, Norwegian University of Science and Technology (NTNU), Gjøvik, Norway. He was a Researcher with the Istituto Italiano di Tecnologia, Genoa, Italy. His main research interests include statistical pattern recognition, data fusion schemes and random optimization, with applications to biometrics, multimodal biometric fusion, human behavior analysis, and crowd behavior analysis. He has authored several papers, and is a reviewer for several international conferences and journals. He is also involved in various conference organising and program committees, and is an associate editor for various journals.

Christian Rathgeb is a Senior Researcher with the Faculty of Computer Science, Hochschule Darmstadt (HDA), Germany. He is a Principal Investigator in the National Research Center for Applied Cybersecurity (ATHENE). His research includes pattern recognition, iris and face recognition, security aspects of biometric systems, secure process design and privacy enhancing technologies for biometric systems. He has co-authored over 100 technical papers in the field of biometrics. He is a winner of the EAB - European Biometrics Research Award 2012, the Austrian Award of Excellence 2012, Best Poster Paper Awards (IJCB’11, IJCB’14, ICB’15) and the Best Paper Award Bronze (ICB’18). He is a member of the European Association for Biometrics (EAB), a Program Chair of the International Conference of the Biometrics Special Interest Group (BIOSIG) and an editorial board member of IET Biometrics (IET BMT). He has served on various program committees and as a reviewer for conferences (e.g. ICB, IJCB, BIOSIG, IWBF) and journals (e.g. IEEE TIFS, IEEE TBIOM, IET BMT).

Dinusha Frings is specialised in managing projects related to biometrics, digital identity and travel documents. She currently holds two positions: one at the National Office for Identity Data (NOID) and one at the European Association for Biometrics (EAB). She coordinated the State Of The Art of Morphing Detection (SOTAMD), Known Traveller Digital Identity and Research Live Enrolment projects on behalf of NOID. On behalf of the EAB, she is currently involved in the project iMARS, short for image Manipulation Attack Resolving Solutions. She chairs the EAB Operator Special Interest Group and is part of the organising committee of the EAB Research Project Conference. Dinusha previously worked at IDEMIA in the Government Identity Solutions division and coordinated multiple IT projects related to secure ID documents.

Uwe Seidel received his Ph.D. degree in experimental physics and optics from the University of Jena. After some years in industry, he joined the Forensic Science Institute of the German Federal Criminal Police Office (Bundeskriminalamt) as an ID document expert in 2000 and is now heading the BKA’s IT Forensics and Document section. This assignment also involves overseeing R&D projects for increasing the counterfeit resistance of German official documents. Since 2019, he has been the chairman of ICAO’s New Technology Working Group and Germany’s Alternate Member for the ICAO Technical Advisory Group TAG/TRIP.

Fons Knopjes is Senior Advisor at the National Office for Identity Data (Dutch Ministry of the Interior and Kingdom Relations). He has advised different governments during the development of many (electronic) passports and identity documents. His most recent advice concerned the Dutch travel documents introduced in 2014. He redesigned the identification process for suspects arrested by the police and has advised governments on the restructuring of their identity infrastructure. Fons has developed a model aimed at the dynamic management of the security concept of documents. He has carried out assessments of the identity infrastructure in over 10 countries in Europe, Africa and Asia. High-level managers from over 35 countries all over the globe have participated in Master Classes that Fons has given. He participated in the EU-funded SOTAMD research project. Fons is a member of the core group of experts on identity-related crime of the UNODC (United Nations Office on Drugs and Crime), ICAO’s TAG-TRIP (Traveler Identification Program), the IAI (International Association for Identification) and the board of advisors of the Center for Identity, University of Texas (US).

Raymond Veldhuis graduated from the University of Twente, The Netherlands, in 1981. He received the Ph.D. degree from Nijmegen University on a thesis entitled Adaptive Restoration of Lost Samples in Discrete-Time Signals and Digital Images, in 1988. From 1982 to 1992, he was a Researcher with Philips Research Laboratories, Eindhoven, in various areas of digital signal processing. From 1992 to 2001, he was involved in the field of speech processing. He is currently a Full Professor in Biometric Pattern Recognition with the University of Twente, where he is leading the Data Management and Biometrics group. His main research topics are face recognition (2-D and 3-D), fingerprint recognition, vascular pattern recognition, multibiometric fusion, and biometric template protection. The research is both applied and fundamental.

Davide Maltoni is a Full Professor at University of Bologna (Dept. of Computer Science and Engineering - DISI). His research interests are in the area of Pattern Recognition, Computer Vision, Machine Learning and Computational Neuroscience. Davide Maltoni is co-director of the Biometric Systems Laboratory (BioLab), which is internationally known for its research and publications in the field. Several original techniques have been proposed by BioLab team for fingerprint feature extraction, matching and classification, for hand shape verification, for face location and for performance evaluation of biometric systems. Davide Maltoni is co-author of the Handbook of Fingerprint Recognition published by Springer, 2009 and holds three patents on Fingerprint Recognition. He has been elected IAPR (International Association for Pattern Recognition) Fellow 2010.

Christoph Busch is a member of the Norwegian University of Science and Technology (NTNU), Norway. He holds a joint appointment with Hochschule Darmstadt (HDA), Germany. Further, he has lectured on Biometric Systems at Denmark’s DTU since 2007. On behalf of the German BSI he has been the coordinator for the project series BioIS, BioFace, BioFinger, BioKeyS Pilot-DB, KBEinweg and NFIQ2.0. He was/is a partner of the EU projects 3D-Face, FIDELITY, TURBINE, SOTAMD, RESPECT, TReSPAsS, iMARS and others. He is also a principal investigator in the German National Research Center for Applied Cybersecurity (ATHENE) and is a co-founder of the European Association for Biometrics (EAB). Christoph has co-authored more than 500 technical papers and has been a speaker at international conferences. He is a member of the editorial board of the IET journal on Biometrics and of the IEEE TIFS journal. Furthermore, he chairs the TeleTrusT biometrics working group as well as the German standardization body on Biometrics and is convenor of WG3 in ISO/IEC JTC1 SC37.

Appendix A Additional Analysis - D-MAD Results on overall set of images and for the subsets of images with morphing factor 0.3 and 0.5

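| Test | Bona fide comparisons | Morphed comparisons | Algorithm | EER | BPCER10 | BPCER20 | BPCER100 | REJNBFRA | REJNMRA |
|---|---|---|---|---|---|---|---|---|---|
| Overall | 3000 | 30550 | BSIF | 45.93% | 78.30% | 84.13% | 93.83% | 1.53% | 1.42% |
| Overall | 3000 | 30550 | DFR | 4.54% | 2.00% | 3.93% | 18.87% | 1.67% | 1.55% |
| Overall | 3000 | 30550 | MBLBP | 33.47% | 52.80% | 59.93% | 74.80% | 2.80% | 2.62% |
| Overall | 3000 | 30550 | WL | 37.13% | 71.67% | 83.27% | 95.67% | 3.33% | 3.16% |
| Overall | 3000 | 30550 | DR | 52.03% | 89.70% | 94.70% | 98.57% | 0.00% | 0.00% |
| Overall | 3000 | 30550 | FaDe | 14.17% | 17.20% | 22.77% | 64.57% | 0.20% | 0.19% |
| 0.3 | 3000 | 10350 | BSIF | 46.43% | 78.50% | 85.23% | 94.50% | 1.53% | 1.40% |
| 0.3 | 3000 | 10350 | DFR | 1.96% | 1.67% | 1.67% | 13.23% | 1.67% | 1.54% |
| 0.3 | 3000 | 10350 | MBLBP | 31.57% | 49.67% | 56.47% | 68.00% | 2.80% | 2.60% |
| 0.3 | 3000 | 10350 | WL | 32.87% | 63.90% | 77.73% | 93.97% | 3.33% | 3.07% |
| 0.3 | 3000 | 10350 | DR | 52.60% | 90.00% | 94.77% | 98.57% | 0.00% | 0.00% |
| 0.3 | 3000 | 10350 | FaDe | 8.43% | 7.27% | 12.60% | 27.47% | 0.20% | 0.17% |
| 0.5 | 3000 | 20200 | BSIF | 45.70% | 78.10% | 83.53% | 93.47% | 1.53% | 1.43% |
| 0.5 | 3000 | 20200 | DFR | 5.40% | 2.47% | 5.60% | 21.50% | 1.67% | 1.56% |
| 0.5 | 3000 | 20200 | MBLBP | 34.33% | 54.20% | 61.93% | 77.07% | 2.80% | 2.62% |
| 0.5 | 3000 | 20200 | WL | 39.47% | 74.47% | 84.70% | 96.40% | 3.33% | 3.20% |
| 0.5 | 3000 | 20200 | DR | 51.74% | 89.60% | 94.63% | 98.57% | 0.00% | 0.00% |
| 0.5 | 3000 | 20200 | FaDe | 15.76% | 20.07% | 27.70% | 100.00% | 0.20% | 0.19% |
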
TABLE X: Performance indicators measured on the D-MAD-SOTAMD_D-1.0 benchmark for the overall set of images and for the subsets of images with morphing factor 0.3 and 0.5.
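
As an illustration of how indicators of this kind can be derived from detection scores, the following is a minimal Python sketch and not the evaluation platform's implementation. It assumes that a higher score means "more likely a morph", that BPCER10, BPCER20 and BPCER100 denote the lowest BPCER at APCER not exceeding 10%, 5% and 1% respectively, and that the EER can be approximated at the threshold where APCER and BPCER are closest. All function names and the toy scores are hypothetical.

```python
import numpy as np


def error_rates(bona_fide, morphs, threshold):
    """Return (APCER, BPCER) when scores >= threshold are flagged as morphs."""
    apcer = float(np.mean(np.asarray(morphs) < threshold))      # morphs accepted as bona fide
    bpcer = float(np.mean(np.asarray(bona_fide) >= threshold))  # bona fide flagged as morphs
    return apcer, bpcer


def operating_points(bona_fide, morphs):
    """Evaluate (APCER, BPCER) at every observed score and at +infinity."""
    thresholds = np.unique(np.concatenate([bona_fide, morphs, [np.inf]]))
    return [error_rates(bona_fide, morphs, t) for t in thresholds]


def bpcer_at_apcer(bona_fide, morphs, max_apcer):
    """Lowest BPCER among operating points with APCER <= max_apcer."""
    return min(b for a, b in operating_points(bona_fide, morphs) if a <= max_apcer)


def equal_error_rate(bona_fide, morphs):
    """Approximate the EER at the operating point where APCER and BPCER are closest."""
    apcer, bpcer = min(operating_points(bona_fide, morphs),
                       key=lambda rates: abs(rates[0] - rates[1]))
    return (apcer + bpcer) / 2


# Toy scores, invented for illustration only (not taken from the benchmark).
bona_fide_scores = [0.1, 0.2, 0.3, 0.4, 0.6]
morph_scores = [0.35, 0.5, 0.7, 0.8, 0.9]
print(equal_error_rate(bona_fide_scores, morph_scores))
print(bpcer_at_apcer(bona_fide_scores, morph_scores, 0.10))  # BPCER10 under the assumed definition
```

The sketch ignores rejected comparisons (REJNBFRA, REJNMRA); how rejections enter the reported rates is not modelled here.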

Appendix B Additional Analysis - S-MAD Results on overall set of images and for the subsets of images with morphing factor 0.3 and 0.5

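| Test | Bona fide comparisons | Morphed comparisons | Algorithm | EER | BPCER10 | BPCER20 | BPCER100 | REJNBFRA | REJNMRA |
|---|---|---|---|---|---|---|---|---|---|
| Overall | 300 | 2045 | PRNU | 44.81% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| Overall | 300 | 2045 | SSE | 31.80% | 65.00% | 79.33% | 91.67% | 0.00% | 0.00% |
| Overall | 300 | 2045 | Deep-S-MAD | 38.99% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| Overall | 300 | 2045 | S-MBLBP | 41.38% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| 0.3 | 300 | 1035 | PRNU | 44.81% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| 0.3 | 300 | 1035 | SSE | 32.76% | 68.00% | 81.33% | 90.67% | 0.00% | 0.00% |
| 0.3 | 300 | 1035 | Deep-S-MAD | 39.64% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| 0.3 | 300 | 1035 | S-MBLBP | 42.33% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| 0.5 | 300 | 1010 | PRNU | 44.82% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| 0.5 | 300 | 1010 | SSE | 31.14% | 63.33% | 77.00% | 92.33% | 0.00% | 0.00% |
| 0.5 | 300 | 1010 | Deep-S-MAD | 38.33% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
| 0.5 | 300 | 1010 | S-MBLBP | 40.63% | 100.00% | 100.00% | 100.00% | 0.00% | 0.00% |
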
TABLE XI: Performance indicators measured on the S-MAD-SOTAMD_D-1.0 benchmark for the overall set of images and for the subsets of images with morphing factor 0.3 and 0.5.

Appendix C Additional Analysis - Attributes and subsets used for performance evaluation

| Attribute | Attribute value |
|---|---|
| Gender | Female |
| Gender | Male |
| Ethnicity | African |
| Ethnicity | East-Asian |
| Ethnicity | European/American |
| Ethnicity | Indian-Asian |
| Ethnicity | Middle Eastern |
| Age | 18..35 |
| Age | 36..55 |
| Age | 56..75 |
| Traits | Freckles |
| Traits | Moles |
| Traits | None – no relevant facial traits |
| Partner | HDA |
| Partner | NTNU |
| Partner | UTW |
| Post-processing | Automatic – no manual retouching |
| Post-processing | Manual – with manual retouching |
| Morphing algorithm | See Table VI and Table XIII |
| Manual post-processing | See Table VI and Table XIII |
| Print-scan and compression | See Table XIV |
| Morph quality | Low – the morphed image is rejected at face verification stage by at least one FR SDK between Neurotechnology and Cognitec |
| Morph quality | High – the morphed image is accepted at face verification stage by both Neurotechnology and Cognitec FR SDKs |

TABLE XII: List of attributes and subsets used for performance evaluation.

| Acronym | Algorithm description | Acronym | Automated post-processing method | Acronym | Manual post-processing method |
|---|---|---|---|---|---|
| C01 | FaceMorpher [13] | PA01 | Facemorpher’s internal post-processing + sharpening | PM00 | No manual post-processing |
| C02 | FaceFusion [12] (only used by HDA) | PA02 | FaceFusion’s internal post-processing + sharpening | PM01 | GIMP retouching [22] |
| C03 | FaceMorph [28] (OpenCV with Dlib) | PA03 | The replacement of the eye region is performed in post-processing, to prevent a double iris. | PM02 | GIMP retouching |
| C04 | FantaMorph [14] (only used by NTN) | PA04 | Fantamorph’s internal processing | PM03 | Adobe Photoshop retouching [1] |
| C05 | Triangulation with Dlib-landmarks | PA05 | Background replacement, edge suppression, colour equalization | PM04 | Adobe Photoshop retouching |
| C06 | Triangulation with STASM-landmarks [32] | PA06 | Background replacement, Poisson image editing [39] | PM05 | GIMP retouching |
| C07 | Triangulation with NT-landmarks | PA07 | Background replacement, edge suppression, colour equalization | PM06 | GIMP retouching |

TABLE XIII: Listing of various methods for morphing, automated and manual post-processing

| Acronym | Print-Scan and Compression Pipeline |
|---|---|
| F01 | 1. Printed with printer Dmb DS-RX1HS at 300x300 dpi, matte. 2. Scanned using Epson V600 at 700 dpi. 3. Cropped to 785 x 1047 pixels. 4. Resized to 400 dpi 449x599 pixels. 5. Compressed using JPEG2000, max file size=15kb, RGB_24_BIT. |
| F02 | 1. Printed with printer Dmb DS-RX1HS at 300x600 dpi, matte. 2. Scanned using Epson V600 at 700 dpi. 3. Cropped to 785 x 1047 pixels. 4. Resized to 400 dpi 449x599 pixels. 5. Compressed using JPEG2000, max file size=15kb, RGB_24_BIT. |
| F03 | 1. Printed with printer Dmb DS-RX1HS at 300x300 dpi, matte. 2. Scanned using Epson V600 at 300 dpi. 3. Cropped to 413 x 531 pixels. 4. Compressed using JPEG2000, max file size=15kb, RGB_24_BIT. |
| F04 | 1. Printed with printer Dmb DS-RX1HS at 300x300 dpi, matte. 2. Scanned using ID Document Scanner (Idemia). 3. Retrieved (compressed) image. |
| F05 | 1. Printed with printer Dmb DS-RX1HS at 300x600 dpi, matte. 2. Scanned using ID Document Scanner (Idemia). 3. Retrieved (compressed) image. |
| F11 | 1. Created A4 PDFs with multiple face images in passport size on it. 2. Printed PDF at professional photo laboratory of Fotogena GmbH to A4 image paper. 3. Scanned with Canon Lide 220 at 300 dpi, sharpness filter + denoise low as .png. 4. Recreated the images from the scanned A4 to independent images. 5. Compressed using JPEG2000 (max file size = 15kb, RGB_24_BIT). |
| F12 | 1. Created A4 PDFs with multiple face images in passport size on it. 2. Printed PDF at professional photo laboratory of Fotogena GmbH to A4 image paper. 3. Scanned with Kyocera Ecosys M6035cidn at 300 dpi, sharpness +1, no other optimizations as tiff. 4. Recreated the images from the scanned A4. 5. Compressed using JPEG2000 (max file size = 15kb, RGB_24_BIT). |
| F13 | 1. Printed at professional photo laboratory of Fotogena. 2. Scanned using ID Document Scanner (Idemia). 3. Retrieved (compressed) image JPEG2000, max file size=15kb. |
| F21 | 1. Created A4 PDFs with multiple face images in passport size on it. 2. Printed using DNP DS820. 3. Scanned with Epson XP-860 at 300 dpi. 4. Crop individual face images to 420x540. 5. Compressed using JPEG2000 (max file size = 15kb, RGB_24_BIT). |
| F22 | 1. Created A4 PDFs with multiple face images in passport size on it. 2. Printed using DNP DS820. 3. Scan with Canon TS8251 at 300 dpi. 4. Crop individual face images to 420x540. 5. Compressed using JPEG2000 (max file size = 15kb, RGB_24_BIT). |
| F23 | 1. Created A4 PDFs with multiple face images in passport size on it. 2. Printed using Epson XP-860. 3. Scanned with Epson XP-860 at 300 dpi. 4. Cropped individual face images to 420x540. 5. Compressed using JPEG2000 (max file size = 15kb, RGB_24_BIT). |
| F24 | 1. Created A4 PDFs with multiple face images in passport size on it. 2. Printed using Epson XP-860. 3. Scanned with Canon TS8251 at 300 dpi. 4. Cropped individual face images to 420x540. 5. Compressed using JPEG2000 (max file size = 15kb, RGB_24_BIT). |
| F25 | 1. Printed the face images using DNP DS820. 2. Scanned using ID Document Scanner (Idemia). 3. Retrieved (compressed) image JPEG2000, max file size=15kb. |
| F26 | 1. Printed the face images using Epson XP-860. 2. Scanned using ID Document Scanner (Idemia). 3. Retrieved (compressed) image JPEG2000, max file size=15kb. |

TABLE XIV: Details of print-scan and compression pipeline along with image size
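
The resizing and compression steps of pipeline F01 can be checked with simple arithmetic. The Python sketch below is illustrative only: the helper names are hypothetical, rounding up to whole pixels is an assumption that happens to reproduce the listed 449x599 size, and "15kb" is read as 15,000 bytes.

```python
import math


def rescaled_pixels(pixels, source_dpi, target_dpi):
    """Unrounded pixel count after resampling from source_dpi to target_dpi."""
    return pixels * target_dpi / source_dpi


def required_compression_ratio(width, height, max_bytes, bytes_per_pixel=3):
    """Raw image size divided by the file-size cap (3 bytes per pixel for RGB_24_BIT)."""
    return width * height * bytes_per_pixel / max_bytes


# Pipeline F01: a 785 x 1047 pixel crop of a 700 dpi scan, resized to 400 dpi.
width = math.ceil(rescaled_pixels(785, 700, 400))    # rounding up is an assumption
height = math.ceil(rescaled_pixels(1047, 700, 400))
print(width, height)

# Assumption: "15kb" is interpreted as 15,000 bytes.
print(round(required_compression_ratio(width, height, 15_000), 1))
```

Under these assumptions the file-size cap implies a compression ratio of more than 50:1 relative to the uncompressed RGB image, which gives a sense of how strongly the final JPEG2000 step can suppress fine morphing artifacts.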

| Attribute | Subset | Average | BSIF | DFR | MBLBP | WL | DR | FaDe |
|---|---|---|---|---|---|---|---|---|
| Ethnicity | Middle Eastern | -31.36% | -4.03% | -100.00% | -13.43% | -30.36% | 0.34% | -40.69% |
| Ethnicity | Indian-Asian | -30.76% | -20.87% | -63.27% | -26.96% | -26.41% | -6.39% | -40.69% |
| Partner | HDA | -27.13% | -6.53% | -90.31% | -46.02% | -7.73% | -2.83% | -9.37% |
| Partner | NTNU | -21.30% | -21.77% | -54.08% | -20.81% | -3.86% | 5.00% | -32.27% |
| Age | 18..35 | -15.30% | -4.26% | -56.63% | -11.15% | -6.30% | 1.73% | -15.18% |
| Ethnicity | East-Asian | -13.41% | 5.34% | -100.00% | -13.02% | 18.59% | 9.90% | -1.30% |
| Traits | None | -9.65% | -4.87% | -31.63% | -1.58% | -4.29% | 0.61% | -16.13% |
| Morph quality | Low | -5.96% | 3.17% | -20.92% | -3.61% | -2.56% | 0.40% | -12.22% |
| Morphing algorithm | C07 | -5.76% | 1.77% | -16.84% | -4.34% | -0.67% | -1.33% | -13.17% |
| Morphing algorithm | C06 | -5.32% | -7.60% | 1.02% | -9.98% | -12.69% | -1.83% | -0.83% |
| Post-processing | PM03 | -4.58% | 50.72% | -27.04% | -1.24% | 11.90% | -9.03% | -52.79% |
| Morphing algorithm | C03 | -3.36% | 12.69% | -20.92% | -4.91% | 3.07% | 0.23% | -10.32% |
| Ethnicity | European/American | -2.65% | -0.58% | 8.67% | -7.63% | 2.19% | 1.50% | -20.05% |
| Post-processing | PM01 | -1.82% | -0.78% | -9.69% | 22.11% | 7.45% | -0.95% | -29.06% |
| Gender | Male | -0.63% | -2.54% | 5.61% | -2.41% | 4.17% | 2.28% | -10.91% |
| Post-processing | Automatic | -0.60% | 0.06% | -0.51% | -0.76% | -1.64% | 0.44% | -1.19% |
| Morphing algorithm | C05 | -0.33% | -0.54% | 0.00% | -0.13% | 2.07% | -0.76% | -2.61% |
| Post-processing | Manual | 1.65% | -0.45% | 0.00% | 2.06% | 4.08% | -1.10% | 5.34% |
| Post-processing | PM06 | 3.34% | -42.32% | 103.06% | -39.66% | -23.79% | -3.00% | 25.74% |
| Ethnicity | African | 4.36% | 8.46% | 15.82% | 1.11% | -5.84% | -5.13% | 11.74% |
| Post-processing | PM02 | 6.50% | -6.89% | -5.61% | 22.27% | 10.34% | 5.27% | 13.64% |
| Morphing algorithm | C02 | 6.84% | -5.82% | -7.14% | 21.57% | 10.50% | 8.90% | 13.05% |
| Gender | Female | 11.01% | 1.68% | 71.43% | 5.64% | -3.95% | -4.81% | -3.91% |
| Morphing algorithm | C01 | 13.24% | 0.60% | 62.24% | 1.55% | 1.34% | -1.14% | 14.83% |
| Age | 36..55 | 13.84% | 9.93% | 53.06% | 19.70% | 5.81% | -2.64% | -2.85% |
| Post-processing | PM05 | 14.38% | -2.00% | 69.39% | -1.87% | 9.46% | -1.18% | 12.46% |
| Traits | Freckles | 14.52% | 12.00% | 53.06% | -1.84% | 8.67% | 2.66% | 12.57% |
| Age | 56..75 | 32.74% | 15.74% | 89.29% | 31.83% | 36.26% | -0.65% | 23.96% |
| Partner | UTW | 32.79% | 6.66% | 142.35% | 7.03% | 10.10% | -3.35% | 33.93% |
| Morph quality | High | 41.48% | -10.96% | 204.59% | 11.69% | 10.80% | -1.52% | 34.28% |
| Traits | Moles | 44.52% | 5.15% | 188.27% | 43.87% | 2.22% | 8.99% | 18.62% |

TABLE XV: Subset EER deviation w.r.t. the overall set of digital images with morphing factor 0.3.
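
The first numeric column of Table XV appears to be the arithmetic mean of the six per-algorithm deviations that follow it, and the rows are sorted by that mean. The Python sketch below checks this for one row. It also shows one plausible reading of "deviation w.r.t. the overall set", namely the relative change of the subset EER with respect to the overall EER; this formula is an assumption and is not stated in this section, and the function names are hypothetical.

```python
def relative_deviation(subset_eer, overall_eer):
    """Relative change (in percent) of a subset EER with respect to the overall EER."""
    return 100.0 * ((subset_eer - overall_eer) / overall_eer)


def mean_deviation(deviations):
    """Arithmetic mean of the per-algorithm deviations."""
    return sum(deviations) / len(deviations)


# Per-algorithm deviations of the "Ethnicity / Middle Eastern" row of Table XV (in percent).
middle_eastern = [-4.03, -100.00, -13.43, -30.36, 0.34, -40.69]
print(round(mean_deviation(middle_eastern), 2))

# Under the assumed formula, a subset EER of 0% against a non-zero overall EER
# yields a deviation of -100%, the lower bound seen in the table.
print(relative_deviation(0.0, 1.96))
```

Because the deviation is relative, algorithms with a small overall EER can show very large percentage swings on a subset even when the absolute change in EER is small.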

Footnotes

  1. The following paper is a pre-print. The article is accepted for publication in IEEE Transactions on Information Forensics and Security (TIFS).
  2. Due to operational concerns not to interfere with border control processes, the images were not acquired with operational ABC gates at airport locations. Instead, HDA and NTN used a mock ABC gate setup provided by an ABC manufacturer, whereas UTW created its own mock ABC gate setup.

References

  1. (Website) External Links: Link Cited by: TABLE XIII, TABLE VI.
  2. B. Amos, B. Ludwiczuk and M. Satyanarayanan (2016) OpenFace: a general-purpose face recognition library with mobile applications. Technical report CMU-CS-16-118, CMU School of Computer Science. Cited by: §2.1.
  3. BioLab (2020)(Website) External Links: Link Cited by: 4th item, 1st item, §4, §4.
  4. N. Damer, V. Boller, Y. Wainakh, F. Boutros, P. Terhörst, A. Braun and A. Kuijper (2018) Detecting face morphing attacks by analyzing the directed distances of facial landmarks shifts. In German Conference on Pattern Recognition, pp. 518–534. Cited by: §1, §2.1, §2.3.1, §5.1.4.
  5. N. Damer, A. M. Saladié, A. Braun and A. Kuijper (2018) MorGAN: recognition vulnerability and attack detectability of face morphing attacks created by generative adversarial network. In 2018 IEEE 9th International Conference on Biometrics Theory, Applications and Systems (BTAS), pp. 1–10. Cited by: TABLE I, §1, §2.1.
  6. N. Damer, S. Zienert, Y. Wainakh, A. M. Saladie, F. Kirchbuchner and A. Kuijper (2019) A multi-detector solution towards an accurate and generalized detection of face morphing attacks. In 22nd International Conference on Information Fusion, FUSION, pp. 2–5. Cited by: §1, §2.1.
  7. N. Damer12, A. M. Saladié, S. Zienert, Y. Wainakh, P. Terhörst12, F. Kirchbuchner12 and A. Kuijper12 (2019) To detect or not to detect: the right faces to morph. Cited by: §1, §2.1.
  8. L. Debiasi, U. Scherhag, C. Rathgeb, A. Uhl and C. Busch (2018) PRNU-based detection of morphed face images. In 2018 International Workshop on Biometrics and Forensics (IWBF), pp. 1–7. Cited by: §1, §2.1, §2.3.2.
  9. J. Deng, J. Guo, N. Xue and S. Zafeiriou (2019) Arcface: additive angular margin loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4690–4699. Cited by: §5.1.2.
  10. (Website) External Links: Link Cited by: §2.
  11. B. Dorizzi, R. Cappelli, M. Ferrara, D. Maio, D. Maltoni, N. Houmani, S. Garcia-Salicetti and A. Mayoue (2009) Fingerprint and on-line signature verification competitions at icb 2009. In Advances in Biometrics, M. Tistarelli and M. S. Nixon (Eds.), Berlin, Heidelberg, pp. 725–732. External Links: ISBN 978-3-642-01793-3 Cited by: 4th item, §4.
  12. (Website) External Links: Link Cited by: TABLE XIII, TABLE VI.
  13. (Website) External Links: Link Cited by: TABLE XIII, TABLE VI.
  14. (Website) External Links: Link Cited by: TABLE XIII, TABLE VI.
  15. M. Ferrara, A. Franco and D. Malton (2019) Decoupling texture blending and shape warping in face morphing. 2019 International Conference of the Biometrics Special Interest Group (BIOSIG). Cited by: TABLE I.
  16. M. Ferrara, A. Franco and D. Maltoni (2014) The magic passport. In IEEE International Joint Conference on Biometrics, pp. 1–7. Cited by: TABLE I, §1, §1, §2.1, §2.
  17. M. Ferrara, A. Franco and D. Maltoni (2016) On the effects of image alterations on face recognition accuracy. In Face Recognition Across the Imaging Spectrum, T. Bourlai (Ed.), pp. 195–222. Cited by: TABLE I.
  18. M. Ferrara, A. Franco and D. Maltoni (2018) Face demorphing. IEEE Transactions on Information Forensics and Security 13 (4), pp. 1008–1017. Cited by: TABLE I, §2.2, §2.2, §2.3.1, §5.1.6, §5.1.6.
  19. M. Ferrara, A. Franco and D. Maltoni (2019) Face morphing detection in the presence of printing/scanning and heterogeneous image sources. arXiv preprint arXiv:1901.08811. Cited by: TABLE I, §2.3.2, §5.2.3.
  20. Frontex (2015) Best practice technical guidelines for automated border control (ABC) systems. European Agency for the Management of Operational Cooperation at the …. Cited by: §6.3.
  21. C. S. GmbH (2020)(Website) External Links: Link Cited by: §3.1.
  22. (Website) External Links: Link Cited by: TABLE XIII, TABLE VI.
  23. M. Gomez-Barrero, C. Rathgeb, U. Scherhag and C. Busch (2017) Is your biometric system robust to morphing attacks?. In 2017 5th International Workshop on Biometrics and Forensics (IWBF), pp. 1–6. Cited by: TABLE I, §1, §2.1, §2.
  24. (Website) External Links: Link Cited by: §2.
  25. M. Hildebrandt, T. Neubert, A. Makrushin and J. Dittmann (2017-04) Benchmarking face morphing forgery detection: application of stirtrace for impact simulation of different processing steps. In 5th Intl. Workshop on Biometrics and Forensics (IWBF), Cited by: §2.2.
  26. International Civil Aviation Organization (2015) Machine readable passports – part 9 – deployment of biometric identification and electronic storage of data in eMRTDs. International Civil Aviation Organization (ICAO). Note: \urlhttp://www.icao.int/publications/Documents/9303_p9_cons_en.pdf Cited by: 1st item, §3.2, §3.3, §5.1.3.
  27. ISO/IEC JTC1 SC37 Biometrics (2017) ISO/IEC 30107-3. information technology - biometric presentation attack detection - part 3: testing and reporting. International Organization for Standardization. Cited by: §4.1.
  28. D. King(Website) External Links: Link Cited by: TABLE XIII, TABLE VI, §5.2.4.
  29. A. Krizhevsky, I. Sutskever and G. E. Hinton (2012) Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105. Cited by: §2.3.2, §5.1.5, §5.2.3.
  30. Luxand Inc. (2020)(Website) External Links: Link Cited by: §2.1.
  31. A. Martinez and R. Benavente AR face database, 2000. External Links: Link Cited by: §2.1.
  32. S. Milborrow and F. Nicolls (2014-01) Active shape models with sift descriptors and mars. VISAPP 2014 - Proceedings of the 9th International Conference on Computer Vision Theory and Applications 2, pp. 380–387. Cited by: TABLE XIII, TABLE VI.
  33. J. Mohan Singh, R. Ramachandra, K. B. Raja and C. Busch (2019) Robust morph-detection at automated border control gate using deep decomposed 3d shape and diffuse reflectance. In 2019 15th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Cited by: §2.3.1, §5.1.5.
  34. Neurotechnology (2020) (Website) External Links: Link Cited by: §2.1, §3.1.
  35. NIST (2020) (Website) External Links: Link Cited by: 4th item, §4.
  36. (Website) External Links: Link Cited by: §2.
  37. J. Ortega-Garcia, J. Fierrez, F. Alonso-Fernandez, J. Galbally, M. R. Freire, J. Gonzalez-Rodriguez, C. Garcia-Mateo, J. Alba-Castro, E. Gonzalez-Agulla and E. Otero-Muras (2009) The multiscenario multienvironment biosecure multimodal database (bmdb). IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (6), pp. 1097–1111. Cited by: §2.1.
  38. O. M. Parkhi, A. Vedaldi and A. Zisserman (2015) Deep face recognition. In British Machine Vision Association, Cited by: §5.2.3.
  39. P. Pérez, M. Gangnet and A. Blake (2003) Poisson image editing. In ACM SIGGRAPH 2003 Papers, SIGGRAPH ’03, New York, NY, USA, pp. 313–318. External Links: ISBN 1581137095, Link, Document Cited by: TABLE XIII, TABLE VI.
  40. P. J. Phillips, P. J. Flynn, T. Scruggs, K. W. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min and W. Worek (2005) Overview of the face recognition grand challenge. In 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), Vol. 1, pp. 947–954. Cited by: §5.1.1, §5.1.2, §5.2.4.
  41. P. J. Phillips, H. Wechsler, J. Huang and P. J. Rauss (1998) The feret database and evaluation procedure for face-recognition algorithms. Image and vision computing 16 (5), pp. 295–306. Cited by: §5.1.1, §5.1.2.
  42. (Website) External Links: Link Cited by: §2.
  43. R. Raghavendra, K. B. Raja and C. Busch (2016) Detecting morphed face images. In 2016 IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS), pp. 1–7. Cited by: TABLE I, §1, §2.1.
  44. R. Raghavendra, K. Raja, S. Venkatesh and C. Busch (2017) Face morphing versus face averaging: vulnerability and detection. In 2017 IEEE International Joint Conference on Biometrics (IJCB), pp. 555–563. Cited by: TABLE I, §2.1, §2.2, §2.2, §2.3.2.
  45. R. Raghavendra, S. Venkatesh, K. Raja and C. Busch (2018) Detecting face morphing attacks with collaborative representation of steerable features. In IAPR International Conference on Computer Vision & Image Processing (CVIP-2018), pp. 1–7. Cited by: TABLE I, §2.2, §2.2, §2.3.2, §5.2.2.
  46. R. Raghavendra, K. B. Raja, S. Venkatesh and C. Busch (2017) Transferable deep-cnn features for detecting digital and print-scanned morphed face images. In 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1822–1830. Cited by: TABLE I, §1, §2.1, §2.3.2, §2, §5.2.2.
  47. R. Ramachandra, S. Venkatesh, K. Raja and C. Busch (2019) Towards making morphing attack detection robust using hybrid scale-space colour texture features. In 5th International Conference on Identity, Security, and Behavior Analysis (ISBA), pp. 1–8. Cited by: §2.3.2.
  48. U. Scherhag, D. Budhrani, M. Gomez-Barrero and C. Busch (2018) Detecting morphed face images using facial landmarks. In Proceedings of the 2018 International Conference on Image and Signal Processing (ICISP), pp. 442–452. Cited by: §5.1.4.
  49. U. Scherhag, L. Debiasi, C. Rathgeb, C. Busch and A. Uhl (2019) Detection of face morphing attacks based on PRNU analysis. IEEE Transactions on Biometrics, Behavior, and Identity Science 1 (4), pp. 302–317. Cited by: TABLE I, §1, §2.1, §2.2, §2.3.1, §2.3.2, §5.2.1.
  50. U. Scherhag, R. Raghavendra, K. B. Raja, M. Gomez-Barrero, C. Rathgeb and C. Busch (2017) On the vulnerability of face recognition systems towards morphed face attacks. In 2017 5th International Workshop on Biometrics and Forensics (IWBF), pp. 1–6. Cited by: TABLE I, §1, §2.1, §2.2, §2.3.2, §2, §3.1.
  51. U. Scherhag, C. Rathgeb and C. Busch (2018) Towards detection of morphed face images in electronic travel documents. In 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 187–192. Cited by: §5.1.1, §5.1.3, §5.2.4.
  52. U. Scherhag, C. Rathgeb, J. Merkle, R. Breithaupt and C. Busch (2019) Face recognition systems under morphing attacks: a survey. IEEE Access 7, pp. 23012–23026. Cited by: §1, §2.
  53. U. Scherhag, C. Rathgeb, J. Merkle and C. Busch (2020) Deep face representations for differential morphing attack detection. IEEE Transactions on Information Forensics and Security (TIFS). External Links: 2001.01202 Cited by: TABLE I, §2.3.1, §5.1.2.
  54. S. Sengupta, A. Kanazawa, C. D. Castillo and D. W. Jacobs (2018) SfSNet: learning shape, reflectance and illuminance of faces ‘in the wild’. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6296–6305. Cited by: §5.1.5.
  55. A. Sepas-Moghaddam, V. Chiesa, P.L. Correia, F. Pereira and J. Dugelay (2017) The ist-eurecom light field face database. In 2017 International Workshop on Biometrics and Forensics (IWBF), pp. 1–6. Cited by: §5.2.4.
  56. K. Simonyan and A. Zisserman (2015) Very deep convolutional networks for large-scale image recognition. Proc ICLR. San Diego, CA, USA. Cited by: §2.1, §2.3.2.
  57. L. Spreeuwers, M. Schils and R. Veldhuis (2018) Towards robust evaluation of face morphing detection. In 2018 26th European Signal Processing Conference (EUSIPCO), pp. 1027–1031. Cited by: 1st item, §2.3.2.
  58. S. Venkatesh, Z. Haoyu, R. Ramachandra, K. Raja, N. Damer and C. Busch (2020) Can gan generated morphs threaten face recognition systems equally as landmark based morphs? - vulnerability and detection. In 2020 International Workshop on Biometrics and Forensics (IWBF), pp. 1–6. Cited by: TABLE I.
  59. S. Venkatesh, R. Ramachandra, K. Raja, L. J. Spreeuwers, R. Veldhuis and C. Busch (2020) Detecting morphed face attacks using residual noise from deep multi-scale context aggregation network. In WACV 2020, Cited by: 1st item, §2.3.2.
  60. S. Venkatesh, R. Ramachandra, K. Raja, L. Spreeuwers, R. Veldhuis and C. Busch (2019) Morphed face detection based on deep color residual noise. In Ninth International Conference on Image Processing Theory, Tools and Applications (IPTA), pp. 1–6. Cited by: §2.3.2.