Robust Iris Segmentation Based on Fully Convolutional Networks and Generative Adversarial Networks


Cides S. Bezerra1, Rayson Laroca1, Diego R. Lucio1, Evair Severo1,
Lucas F. Oliveira1, Alceu S. Britto Jr.2 and David Menotti1
1Federal University of Paraná (UFPR), Curitiba, PR, Brazil
2Pontifical Catholic University of Paraná (PUCPR), Curitiba, PR, Brazil
{csbezerra,rblsantos,drlucio,ebsevero,lferrari,menotti}@inf.ufpr.br alceu@ppgia.pucpr.br
Abstract

The iris can be considered one of the most important biometric traits due to its high degree of uniqueness. Iris-based biometric applications depend mainly on iris segmentation, whose robustness is not guaranteed across different environments, such as \gls*nir and \gls*vis ones. In this paper, two approaches for robust iris segmentation, based on \glspl*fcn and \glspl*gan, are described. Similar to a common convolutional network but without the fully connected (i.e., classification) layers, an \gls*fcn combines at its end pooling layers from different convolutional layers. Based on game theory, a \gls*gan is designed as two networks competing with each other to generate the best segmentation. The proposed segmentation networks achieved promising results in all evaluated datasets of \gls*nir images (i.e., \acs*biosec, \acs*casiai3, \acs*casiat4 and \acs*iitd1) and of \gls*vis images (\acs*nice1, \acs*creyeiris and \acs*miche1), in both non-cooperative and cooperative domains, outperforming the baseline techniques, which are the best ones found so far in the literature, i.e., establishing a new state of the art for these datasets. Furthermore, we manually labeled 2,431 images from the \acs*casiat4, \acs*creyeiris and \acs*miche1 datasets, making the masks available for research purposes.


I Introduction

\glsresetall

The identification of individuals based on their biological and behavioral characteristics has a higher degree of reliability compared to other means of identification, such as passwords or access cards. Several characteristics of the human body can be used for person recognition (e.g., face, signature, fingerprints, iris, sclera, retina, voice, etc.) [1]. The characteristics present in the iris make it one of the most representative and secure biometric modalities. This circular diaphragm forming the textured portion of the eye is capable of distinguishing individuals with a high degree of uniqueness [2, 3].

As described in [4], an automated biometric system for iris recognition is composed of four main steps: (i) image acquisition, (ii) iris segmentation, (iii) normalization and (iv) feature extraction and matching. The segmentation consists of locating and isolating the iris from other regions (e.g., the sclera, surrounding skin regions, etc.); it is therefore the most critical and challenging step of the system. Incorrect segmentation usually affects the subsequent steps, impairing the system performance [5].

Over the last decade, many approaches have been employed for iris segmentation, such as those based on edge detection [6], Hough transform [7], active contours [8, 9], integro-differential equation [10], \gls*mrs [11], \glspl*mtm [12], and \glspl*cnn [13, 14] (see Section II for more details).

Leveraging the advent of \glspl*cnn, we propose two approaches for the iris segmentation task. The first is based on an \gls*fcn [15] and the second on a \gls*gan [16]. \glspl*fcn have been used for segmentation in many different tasks, from medical image analysis to aerospace image analysis [17, 18], while \glspl*gan are a more recent approach to semantic segmentation that has outperformed the state of the art [19].

The proposed \gls*fcn and \gls*gan iris segmentation approaches outperform three existing frameworks on the largest benchmark datasets found in the literature. There are two main contributions in this paper: (i) two \gls*cnn-based approaches that work well for \gls*nir and \gls*vis images in both cooperative (highly controlled) and non-cooperative environments; and (ii) new manually labeled masks for images of three existing iris datasets (the new masks are publicly available to the research community at http://web.inf.ufpr.br/vri/databases/iris-segmentation-annotations/; see Section IV-A).

The remainder of this paper is organized as follows: we briefly review related work in Section II. In Section III, the proposed approaches used for iris segmentation are described. Section IV presents the datasets, evaluation protocol and baselines used in the experiments. We report and discuss the results in Section V. Conclusions are given in Section VI.

II Related Work

In this section, we briefly review relevant studies in the context of iris segmentation, ranging from conventional image processing to deep learning techniques. For other studies on iris segmentation, please refer to [20, 21].

Jillela and Ross [22] presented an overview of classical approaches, evaluation methods and challenges related to iris segmentation in both \gls*nir and \gls*vis images. Daugman’s study [23] is considered the pioneer in iris segmentation. The integro-differential operator was used to approximate the inner and outer iris boundaries, yielding the center coordinates and both the pupil and iris radii.

Liu et al. [6] first detected the inner boundary of the iris and then the outer boundary. In addition, noisy pixels were eliminated based on their high/low-intensity level. Proença and Alexandre [7] used the Fuzzy K-means algorithm to classify each pixel as belonging to a group, considering its coordinates and intensity distribution. Then, they applied the Canny edge detector to the image with the grouped pixels, creating an edge map. Finally, the inner and outer iris boundaries were detected by the circular Hough transform.

Shah and Ross [9] performed iris segmentation through Geodesic Active Contours, combining energy minimization with active contours based on curve evolution. The pupil is detected from a binarization and both inner and outer iris boundaries are approximated using the Fourier series coefficients.

The winning approach of the \gls*nice1, proposed by Tan et al. [10], removes the reflection points using adaptive thresholding and bilinear interpolation. Region growing based on clustering and an integro-differential constellation then segments the iris. Podder et al. [11] applied an \gls*mrs technique for noise removal. Moreover, they applied the Canny edge detector and Hough transform to detect the iris boundaries.

Haindl & Krupička [12] detected the iris using Daugman’s operator [23] and removed the eyelids employing third-order polynomial mean and standard deviation estimates. Adaptive thresholding and \gls*mtm were used to remove iris reflections. Ouabida et al. [8] applied the \gls*ocac, which uses the Vander Lugt correlator algorithm, to detect the iris and pupil contours through spatial filtering.

Liu et al. [14] proposed two approaches called \glspl*hcnn and \glspl*mfcn to perform a dense prediction of the pixels using sliding windows, merging shallow and deep layers.

At present, \glspl*cnn are being employed to solve many computer vision problems, with impressive results obtained in several areas such as biometrics, medical imaging and security systems [24, 25, 26]. Teichmann et al. [27] proposed a \gls*cnn architecture, called MultiNet, for joint detection, classification and semantic segmentation. Inspired by the great results reported in their work, we apply the segmentation decoder of MultiNet to the iris segmentation context, as detailed in Section III-B.

III Proposed Approach

This section describes the proposed approach and is divided into two subsections: one for iris detection and one for iris segmentation.

III-A Iris Detection

The datasets used in this work contain images of many different sizes, and simply resizing the images would distort the iris shape. In order to avoid this distortion, we first performed \gls*prd.

YOLO [28] is a real-time object detection system that regards detection as a regression problem. As great advances were recently attained through models inspired by YOLO [29, 26], we decided to fine-tune it for \gls*prd. However, as we want to detect only one class (i.e., the iris), we chose a smaller model, called Fast-YOLO [28], which uses fewer convolutional layers than YOLO and fewer filters in those layers (for training Fast-YOLO we used the weights pre-trained on ImageNet, available at https://pjreddie.com/darknet/yolo/). The Fast-YOLO architecture is shown in Table I.

Layers (in order): conv, max, conv, max, conv, max, conv, max, conv, max, conv, max, conv, conv, conv, detection.
TABLE I: Fast-YOLO network used for iris detection.

The \acs*prd network was trained using the images, without any preprocessing, and the coordinates of the \gls*roi as inputs. The annotations provided by Severo et al. [26] were used as ground truth. We applied a small padding to the detected patch to increase the chance that the iris is entirely within the \gls*roi. Afterward, we enlarged the \acs*roi to a square whose width and height are a power of two.

By default, only objects detected with a confidence above a threshold are returned by Fast-YOLO [28]. In cases where more than one iris region is detected, we consider only the detection with the highest confidence, since there is always exactly one annotated region in the evaluated datasets. If no region is detected, the next stage (iris segmentation) is performed on the image in its original size.
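For concreteness, the \gls*roi post-processing just described can be outlined as follows. This is a minimal sketch, assuming NumPy array images and a hypothetical list of (x, y, w, h, confidence) detections returned by the detector; it is not the exact implementation used in this work, and clipping at the far image borders is simplified.

```python
# Minimal sketch of the ROI post-processing (hypothetical detection format).
import numpy as np

def crop_iris_roi(image, detections, pad=0.1):
    if not detections:
        return image  # fallback: segment the original image in its original size
    # keep only the most confident detection
    x, y, w, h, _ = max(detections, key=lambda d: d[4])
    # small relative padding so the whole iris fits inside the ROI
    x, y = x - pad * w, y - pad * h
    w, h = w * (1 + 2 * pad), h * (1 + 2 * pad)
    # enlarge to a square whose side is the next power of two (assumed rounding rule)
    side = 2 ** int(np.ceil(np.log2(max(w, h))))
    cx, cy = x + w / 2, y + h / 2
    x0 = max(int(round(cx - side / 2)), 0)
    y0 = max(int(round(cy - side / 2)), 0)
    return image[y0:y0 + side, x0:x0 + side]
```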

In our previous work on sclera segmentation [30], this same approach was used for iris detection.

III-B Iris Segmentation

We chose \gls*fcn and \gls*gan for iris segmentation since they presented good results in other segmentation applications [30]. These results can be explained by the fact that an \gls*fcn has no fully connected layers, which generally cause loss of spatial information, while the representations embodied by the pair of networks in a \gls*gan model (the generator and the discriminator) are able to capture the statistical distribution of the training data, reducing the reliance on huge, well-balanced and well-labeled datasets.

III-B1 Fully Convolutional Networks (FCNs)

are deep neural networks in which an image is provided as input and a mask is generated as output. This mask is a binary image (of the same size) in which each pixel is classified as iris or non-iris. Basically, we employed the MultiNet [27] segmentation decoder without the classification and detection decoders. The encoder consists of the first layers of the VGG-16 network [31]. The features extracted from its fifth pooling layer are then used by the segmentation decoder, which follows the \gls*fcn architecture [32] (see Fig. 1).

Fig. 1: \gls*fcn architecture for iris segmentation.

The fully connected layers of the VGG-16 network were transformed into convolutional layers to produce a low-resolution segmentation. Then, three transposed convolution layers were used to perform up-sampling. Finally, high-resolution features were extracted through skip layers from lower layers to improve the up-sampled results.

The segmentation loss function is based on cross-entropy. The VGG-16 weights pre-trained on ImageNet were used to initialize the encoder, the segmentation decoder, and the transposed convolutional layers. The training is based on the Adam optimizer [33], configured with fixed values for the learning rate, the dropout probability, the weight decay, and the standard deviation used to initialize the skip layers.
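To make this encoder-decoder design concrete, the sketch below assumes PyTorch and torchvision (which may differ from the original MultiNet implementation); the layer indices, hyperparameter values and initialization details are illustrative placeholders, not the exact ones used in our experiments. It shows a VGG-16 encoder whose fully connected layers are replaced by 1x1 convolutions, transposed-convolution up-sampling fused with skip layers, and pixel-wise cross-entropy training with Adam.

```python
# Minimal FCN sketch for binary iris segmentation (illustrative; hyperparameters are placeholders).
import torch
import torch.nn as nn
import torchvision

class FCNIris(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        vgg = torchvision.models.vgg16(weights=torchvision.models.VGG16_Weights.IMAGENET1K_V1)
        feats = list(vgg.features.children())
        self.pool3 = nn.Sequential(*feats[:17])    # up to pool3 (1/8 of input resolution)
        self.pool4 = nn.Sequential(*feats[17:24])  # up to pool4 (1/16)
        self.pool5 = nn.Sequential(*feats[24:])    # up to pool5 (1/32)
        self.score5 = nn.Conv2d(512, num_classes, 1)  # FC layers replaced by 1x1 convolutions
        self.score4 = nn.Conv2d(512, num_classes, 1)  # skip layer from pool4
        self.score3 = nn.Conv2d(256, num_classes, 1)  # skip layer from pool3
        self.up2a = nn.ConvTranspose2d(num_classes, num_classes, 4, stride=2, padding=1)
        self.up2b = nn.ConvTranspose2d(num_classes, num_classes, 4, stride=2, padding=1)
        self.up8 = nn.ConvTranspose2d(num_classes, num_classes, 16, stride=8, padding=4)

    def forward(self, x):  # input height/width assumed to be multiples of 32
        p3 = self.pool3(x)
        p4 = self.pool4(p3)
        p5 = self.pool5(p4)
        out = self.up2a(self.score5(p5)) + self.score4(p4)  # fuse with pool4 skip
        out = self.up2b(out) + self.score3(p3)               # fuse with pool3 skip
        return self.up8(out)                                 # back to input resolution (logits)

model = FCNIris()
criterion = nn.CrossEntropyLoss()  # pixel-wise cross-entropy over {iris, non-iris}
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # placeholder learning rate
```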

III-B2 Generative Adversarial Networks (GANs)

are deep neural networks composed of a generator and a discriminator network, pitting one against the other. First, the generator network receives noise as input and generates samples. Then, the discriminator network receives both training samples and samples produced by the generator, and learns to distinguish between the two sources [34]. The \gls*gan architecture for iris segmentation is shown in Fig. 2.

Fig. 2: \gls*gan architecture for iris segmentation.

Basically, the generator network learns to produce more realistic samples with each iteration, while the discriminator network learns to better distinguish the real and synthetic data.

Isola et al. [16] presented the \gls*gan approach used in this work: a \gls*cgan able to learn the relation between an image and its label and, from that, generate a variety of image types, which can be employed in various tasks such as photo generation and semantic segmentation.
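A single training step of this conditional setup can be sketched as below (assuming PyTorch; `generator` and `discriminator` are hypothetical networks, e.g., a U-Net generator and a PatchGAN discriminator as in Isola et al. [16], and the L1 weight is an assumption rather than a value taken from our experiments).

```python
# Minimal pix2pix-style training step: the generator maps an eye image to a mask, the
# discriminator judges (image, mask) pairs as real or generated.
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
LAMBDA_L1 = 100.0  # assumed weight for the L1 term, as suggested in [16]

def train_step(generator, discriminator, g_opt, d_opt, image, mask):
    # discriminator step: real pairs vs. pairs with the generated mask (detached)
    fake_mask = generator(image)
    d_real = discriminator(torch.cat([image, mask], dim=1))
    d_fake = discriminator(torch.cat([image, fake_mask.detach()], dim=1))
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # generator step: fool the discriminator while staying close to the ground-truth mask
    d_fake = discriminator(torch.cat([image, fake_mask], dim=1))
    g_loss = bce(d_fake, torch.ones_like(d_fake)) + LAMBDA_L1 * l1(fake_mask, mask)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```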

IV Experiments

In this section, we present the datasets, evaluation protocol and baselines used in our experiments for the comparison and discussion of results.

IV-A Datasets

The experiments were carried out on well-known and challenging publicly available iris datasets, with both \gls*nir and \gls*vis images of different sizes and characteristics. An overview of the datasets is presented in Table II. The ground truths of the \acs*biosec, \acs*casiai3 and \gls*iitd1 datasets were provided by Hofbauer et al. [35]. In the following, details of the datasets are presented.

Dataset Images Subjects Resolution Wavelength
\acs*biosec [36] (*) \gls*nir
\acs*casiai3 [37] \gls*nir
\acs*casiat4 [38] (*) \gls*nir
\acs*iitd1 [39] \gls*nir
\acs*nice1 [40] n/a \gls*vis
\acs*creyeiris [41] (*) \gls*vis
\acs*miche1 [42] (*) Various \gls*vis
TABLE II: Overview of the iris datasets used in this work, where (*) means that only part of the dataset was used.
\acs*biosec: a multimodal dataset [36] containing fingerprint, frontal face and iris images, as well as voice utterances. The entire dataset contains \gls*nir iris images acquired from several subjects; however, due to the available segmentation masks, we use only an initial subset of the images.

\gls*casiai3: a dataset [37] of \gls*nir iris images with extremely clear iris texture details, acquired from multiple subjects in an indoor environment.

\gls*casiat4: a dataset [38] containing \gls*nir images collected in an indoor environment with different lighting setups. For our experiments, we manually labeled an initial subset of the images.

\gls*iitd1: a dataset [39] of \gls*nir images acquired from male and female subjects. All images have the same resolution and were obtained in an indoor environment.

\gls*creyeiris: a dataset composed of images from multiple subjects [41]. The images were captured with a dual-spectrum sensor (\gls*nir and \gls*vis) and divided into three subsets: iris, masked periocular and ocular images. We manually labeled an initial subset of the \gls*vis images from the iris subset.

\gls*miche1: a dataset [42] of \gls*vis images captured from subjects under uncontrolled settings using three mobile devices: iPhone 5, Galaxy Samsung IV and Galaxy Tablet II, each producing images at a different resolution. We used the ground truth masks made available by Hu et al. [43] and manually labeled additional images to complete our evaluation subset.

\gls*nice1: a subset of the UBIRIS.v2 dataset [44]. The \gls*nice1 [40] subset is composed of images for training and images for testing; however, the test set provided by the organizers of the \gls*nice1 contest contains fewer images. The subjects of the test set were not directly specified.

Fig. 3 shows two samples (\gls*nir and \gls*vis) of the masks we created. We sought to eliminate all noise present in the iris, such as reflections and eyelashes.

Fig. 3: Two examples of the masks created by us: (a) a \gls*nir image (\gls*casiat4) and (b) a \gls*vis image (\gls*miche1).

IV-B Evaluation Protocol

A pixel-to-pixel comparison between the ground truth (manually labeled) and the algorithm prediction (i.e., the mask/segmentation) generates an average segmentation error $E$, computed as a pixel divergence given by the exclusive-or logical operator $\otimes$ (i.e., XOR) [40]:

$$E = \frac{1}{h \times w} \sum_{i=1}^{h} \sum_{j=1}^{w} M(i,j) \otimes G(i,j) \qquad (1)$$

where $i$ and $j$ are the pixel coordinates in the mask $M$ and ground truth $G$ images, and $h$ and $w$ stand for the height and width of the image, respectively. Lower $E$ values represent better results. We also report the \gls*f1 measure, which is the harmonic mean of Precision and Recall [13].
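Both metrics can be computed directly from binary masks, as in the minimal sketch below (assuming NumPy arrays of identical shape, with 1 for iris and 0 for non-iris pixels).

```python
# Segmentation error E of Eq. (1) and the F1 measure for a single image.
import numpy as np

def segmentation_error(pred, gt):
    # fraction of pixels on which prediction and ground truth disagree (pixel-wise XOR)
    return float(np.mean(np.logical_xor(pred.astype(bool), gt.astype(bool))))

def f1_measure(pred, gt, eps=1e-9):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return 2 * precision * recall / (precision + recall + eps)
```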

In order to perform a fair evaluation and comparison of the proposed methodologies with the baselines on all datasets, we randomly divided each dataset into two subsets, one containing a fraction of the images for training and the remainder for evaluation. The stopping criterion for learning was a fixed number of iterations.

As suggested in [27], we initially trained the \gls*fcn with the recommended number of iterations. However, we noticed that the model’s performance improved with more iterations; therefore, we doubled the number of iterations to ensure good convergence. According to our evaluations, this number of iterations was sufficient for all datasets.

IV-C Benchmarks

We selected three baseline frameworks described (and available) in the literature to compare our approaches with: \gls*osiris, \gls*irisseg and Haindl & Krupička [12].

The \gls*osiris [45] framework is composed of four key modules: segmentation, normalization, feature extraction and matching. Nevertheless, we used only the segmentation module to compare it with our method. Although the performance of this framework was only reported in datasets with \gls*nir images, we applied it on both \gls*nir and \gls*vis image datasets. This framework has input parameters such as minimum/maximum iris diameter. For a fair comparison, we tuned the parameters for each dataset in order to obtain the best results.

The \gls*irisseg [46] framework was designed specifically for non-ideal irises and is based on adaptive filtering, following a coarse-to-fine strategy. The authors emphasize that this approach does not require adjustment of parameters for different datasets. As in \gls*osiris, we report the performance of this framework on both \gls*nir and \gls*vis images.

The Haindl & Krupička [12] framework was used to evaluate the results achieved by the proposed approaches on the \gls*vis datasets. This method was developed for colored eye images obtained through mobile devices and was used as the baseline in the MICHE-II [47] contest. We did not report the performance of Haindl & Krupička [12] on \gls*nir image datasets since it was not possible to generate the segmentation masks using the executable provided by the authors.

V Results and Discussions

The experiments were performed using two protocols: the protocol of the \gls*nice1 contest and the one proposed in Section IV-B. Moreover, in order to analyze the robustness of the proposed \gls*fcn and \gls*gan approaches across sensors from the same environment (i.e., \gls*nir or \gls*vis), they were trained using either all \gls*nir or all \gls*vis image datasets and then evaluated in the same scenario. Finally, a visual and qualitative analysis showing some good and poor results is performed.

We report the mean \gls*f1 and $E$ values by averaging the values obtained for each image. For all experiments, we also carried out a paired t-test between pairs of results for the same image, aiming to verify whether the differences between the compared results are statistically significant.
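The per-image comparison can be carried out as in the sketch below (assuming SciPy; `f1_a` and `f1_b` are hypothetical arrays holding the per-image \gls*f1 values of two methods on the same test images, and the significance level is an assumption).

```python
# Paired t-test over per-image metric values of two methods evaluated on the same images.
from scipy import stats

def significantly_different(f1_a, f1_b, alpha=0.05):  # alpha is an assumed significance level
    _, p_value = stats.ttest_rel(f1_a, f1_b)
    return p_value < alpha
```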

V-A The NICE.I Contest

The comparison of the results obtained by our approaches and by the baselines when using the \gls*nice1 contest protocol is shown in Table III. As can be seen, the \gls*irisseg and \gls*osiris frameworks presented the worst results, with low \gls*f1 values on the \gls*nice1 test set. These results might be explained by the fact that these frameworks were developed for \gls*nir images; therefore, their performance is drastically compromised on \gls*vis images. It is noteworthy that the distribution of \gls*f1 values for both frameworks presented a high standard deviation. This occurs because, in some images, the number of \glspl*fp was high for both frameworks, including in images that do not contain an iris, resulting in very poor segmentations.

Dataset Method F1 % E %
\acs*nice1 (\gls*vis)
\acs*osiris [45]
\acs*irisseg [46]
Haindl & Krupička [12]
\acrshort*fcn (proposed)
\acrshort*gan (proposed)
TABLE III: Iris segmentation results using the \gls*nice1 contest protocol.

We expected to obtain good results using the Haindl & Krupička [12] framework, since it was developed for \gls*vis images and was used to generate the reference masks (i.e., the ground truth) of the \gls*miche1 dataset in the MICHE-II recognition contest. However, according to our experiments, its performance was not promising, although it obtained better results than \gls*irisseg and \gls*osiris.

The proposed \gls*fcn and \gls*gan approaches achieved considerably better mean \gls*f1 and $E$ values than the other approaches. We believe these results were attained due to the discriminating power of the deep learning approaches and also because our models were adjusted (i.e., trained) specifically for each dataset. We emphasize that \gls*osiris was also adjusted for each dataset.

Although the \gls*fcn approach presented a higher standard deviation of \gls*f1, the paired t-test has shown that the \gls*gan approach achieved a statistically better \gls*f1 value, whereas the \gls*fcn approach achieved a statistically smaller $E$ value.

V-B Our Protocol

We trained and tested the \gls*fcn and \gls*gan approaches on each dataset to compare them with the benchmarks. Table IV shows the results obtained when using the proposed evaluation protocol (see Section IV-B).

Dataset Method F1 % E %
\acs*biosec (\gls*nir)
\acs*osiris [45]
\gls*irisseg [46]
\acrshort*fcn (proposed)
\acrshort*gan (proposed)
\acs*casiai3 (\gls*nir)
\acs*osiris [45]
\gls*irisseg [46]
\acrshort*fcn (proposed)
\acrshort*gan (proposed)
\acs*casiat4 (\gls*nir)
\acs*osiris [45]
\gls*irisseg [46]
\acrshort*fcn (proposed)
\acrshort*gan (proposed)
\acs*iitd1 (\gls*nir)
\acs*osiris [45]
\gls*irisseg [46]
\acrshort*fcn (proposed)
\acrshort*gan (proposed)
\acs*nice1 (\gls*vis)
\acs*osiris [45]
\gls*irisseg [46]
Haindl & Krupička [12]
\acrshort*fcn (proposed)
\acrshort*gan (proposed)
\acs*creyeiris (\gls*vis)
\acs*osiris [45]
\gls*irisseg [46]
Haindl & Krupička [12]
\acrshort*fcn (proposed)
\acrshort*gan (proposed)
\acs*miche1 (\gls*vis)
\acs*osiris [45]
\gls*irisseg [46]
Haindl & Krupička [12]
\acrshort*fcn (proposed)
\acrshort*gan (proposed)
TABLE IV: Iris segmentation results using the proposed protocol.

Note that both the \gls*irisseg and \gls*osiris frameworks presented good results on the \gls*nir datasets, consistently reaching high \gls*f1 values. Nonetheless, our proposed approaches presented statistically better \gls*f1 values for all datasets, even on the \gls*nir datasets, which are the specific image domain of \gls*irisseg and \gls*osiris. Observe that there are no results for the approach by Haindl & Krupička [12], since it was not developed for \gls*nir images.

Looking at the \gls*vis datasets, the results obtained were slightly worse than on the \gls*nir datasets. This is because \gls*vis images usually contain more noise, e.g., reflections. The best \gls*f1 and $E$ values for the \gls*vis datasets were achieved by the \gls*fcn approach, on the \gls*creyeiris and \gls*miche1 datasets, respectively.

It is worth noting that the \gls*fcn approach is the one with the smallest $E$ values in almost all scenarios. This result can be explained by the fact that the \gls*fcn approach took advantage of transfer learning, while the \gls*gan approach was trained from scratch.

Dataset Method F1 % E %
\acs*biosec: \acs*fcn, \acs*gan
\acs*casiai3: \acs*fcn, \acs*gan
\acs*casiat4: \acs*fcn, \acs*gan
\acs*iitd1: \acs*fcn, \acs*gan
\acs*nir (merged): \acs*fcn, \acs*gan
\acs*nice1: \acs*fcn, \acs*gan
\acs*creyeiris: \acs*fcn, \acs*gan
\acs*miche1: \acs*fcn, \acs*gan
\acs*vis (merged): \acs*fcn, \acs*gan
TABLE V: Suitability (bold lines) for \gls*nir and \gls*vis environments.
Fig. 4: \gls*fcn and \gls*gan qualitative results: good (left) and bad (right) results based on the error $E$. Green and red pixels represent the \acrfullpl*fp and \acrfullpl*fn, respectively. (a)-(b) \acs*biosec; (c)-(d) \acs*casiai3; (e)-(f) \acs*casiat4; (g)-(h) \acs*iitd1; (i)-(j) \acs*nice1; (k)-(l) \acs*creyeiris; (m)-(n) \acs*miche1.

V-C Suitability and Robustness

Here, experiments for evaluating the suitability and robustness of the proposed approaches are presented. By suitability, we mean that models trained with a specific kind of image, i.e., \gls*nir or \gls*vis images, should work as well as models trained on a specific dataset. By robustness, we mean that models trained with all kinds of images (\gls*nir and \gls*vis) should perform as well as models trained on a specific dataset.

In summary, suitability is evaluated by training the models using only \gls*nir or only \gls*vis images (i.e., \gls*fcn and \gls*gan trained on the merged \gls*nir datasets and on the merged \gls*vis datasets), while robustness is evaluated by training the models using all available images (\gls*nir and \gls*vis merged). The results are presented in Tables V and VI, respectively. Note that we also report the results on the separate test subsets to facilitate visual comparison between the tables.

Dataset Method F1 % E %
\acs*biosec: \acs*fcn, \acs*gan
\acs*casiai3: \acs*fcn, \acs*gan
\acs*casiat4: \acs*fcn, \acs*gan
\acs*iitd1: \acs*fcn, \acs*gan
\acs*nir (merged): \acs*fcn, \acs*gan
\acs*nice1: \acs*fcn, \acs*gan
\acs*creyeiris: \acs*fcn, \acs*gan
\acs*miche1: \acs*fcn, \acs*gan
\acs*vis (merged): \acs*fcn, \acs*gan
All (merged): \acs*fcn, \acs*gan
TABLE VI: Robustness (bold lines) of the iris segmentation approaches.

By comparing the values presented in Table V with those reported in Table IV, we can observe that the values vary only slightly; thus, we can state that the proposed approaches are stable in the suitability scenario.

When comparing the results presented in Tables V and VI, we noticed that the obtained \gls*f1 and $E$ values were similar on the \gls*nir datasets, whereas the performance was considerably lower on the \gls*vis datasets. Even so, the proposed approaches remain reasonably robust for both \gls*nir and \gls*vis images; however, the \gls*gan approach presented a decrease in the results, while the \gls*fcn showed little variation.

V-D Visual & Qualitative Analysis

Here we perform a visual and qualitative analysis. First, in Fig. 4, we show examples of good and poor iris segmentation results obtained on each dataset by the \gls*fcn and \gls*gan approaches. Some images were poorly segmented, which explains the high standard deviations obtained.

Then, in Fig. 5, we show iris segmentations produced by both the \gls*fcn and \gls*gan approaches, as well as by the baselines. Due to lack of space, we show only one image from each of the \gls*casiai3 and \gls*creyeiris datasets.

We particularly chose images on which all methods perform fairly well, as well as images on which our methods perform better, which is the case in most situations. One can observe that our approaches performed better on both \gls*nir and \gls*vis images.

Fig. 5: Qualitative results achieved by the \gls*fcn, \gls*gan and baseline approaches. Green and red pixels represent the \glspl*fp and \glspl*fn, respectively. The first and second rows correspond, respectively, to images from the \gls*casiai3 and \gls*creyeiris datasets.

VI Conclusion

This work presented two approaches (\gls*fcn and \gls*gan) for robust iris segmentation in \gls*nir and \gls*vis images, in both cooperative and non-cooperative environments. The proposed approaches were compared with three baseline methods and achieved better results in all test cases. Transfer learning for each domain (or dataset) was essential to achieve these results, since the number of images available for training the \gls*fcn is relatively small; therefore, using models pre-trained on other datasets brings great benefits when learning deep networks. Moreover, specific data augmentation techniques can be applied to improve the performance of the \gls*gan approach.

We also manually labeled 2,431 images for iris segmentation. These masks are publicly available to the research community, assisting the development and evaluation of new iris segmentation approaches.

Despite the outstanding results, our approaches presented high standard deviations in some datasets. Therefore, as future work we intend to (i) evaluate the impact of performing the segmentation in two steps, that is, first perform iris detection and then segment the iris in the detected patch; (ii) create a post-processing stage to refine the prediction, since many images have minor errors (especially at the limbus); and (iii) first classify the sensor or image type and then segment each image with a specific and tailored convolutional network model, in order to design a general approach.

Acknowledgments

This work was supported by grants from the National Council for Scientific and Technological Development (CNPq) (# 428333/2016-8, # 313423/2017-2 and # 307277/2014-3) and the Coordination for the Improvement of Higher Education Personnel (CAPES). The Titan Xp GPU used for this research was donated by the NVIDIA Corporation.

References

  • [1] A. K. Jain, K. Nandakumar, and A. Ross, “50 years of biometric research: Accomplishments, challenges, and opportunities,” Pattern Recognition Letters, vol. 79, pp. 80–105, 2016.
  • [2] R. P. Wildes, “Iris recognition: an emerging biometric technology,” Proceedings of the IEEE, vol. 85, no. 9, pp. 1348–1363, 1997.
  • [3] A. K. Jain, A. Ross, and S. Prabhakar, “An introduction to biometric recognition,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 4–20, Jan 2004.
  • [4] R. Jillela and A. A. Ross, “Segmenting iris images in the visible spectrum with applications in mobile biometrics,” Pattern Recognition Letters, vol. 57, pp. 4–16, 2015.
  • [5] A. Rattani and R. Derakhshani, “Ocular biometrics in the visible spectrum: A survey,” Image and Vision Computing, vol. 59, pp. 1–16, 2017.
  • [6] X. Liu, K. W. Bowyer, and P. J. Flynn, “Experiments with an improved iris segmentation algorithm,” in IEEE AutoID’05, 2005, pp. 118–123.
  • [7] H. Proenca and L. A. Alexandre, “Iris segmentation methodology for non-cooperative recognition,” IEE Proceedings - Vision, Image and Signal Processing, vol. 153, no. 2, pp. 199–205, 2006.
  • [8] E. Ouabida, A. Essadique, and A. Bouzid, “Vander lugt correlator based active contours for iris segmentation and tracking,” Expert Systems with Applications, vol. 71, pp. 383–395, 2017.
  • [9] S. Shah and A. Ross, “Iris segmentation using geodesic active contours,” IEEE Transactions on Information Forensics and Security, vol. 4, no. 4, pp. 824–836, Dec 2009.
  • [10] T. Tan, Z. He, and Z. Sun, “Efficient and robust segmentation of noisy iris images for non-cooperative iris recognition,” Image and Vision Computing, vol. 28, no. 2, pp. 223–230, 2010.
  • [11] P. Podder, T. Z. Khan, M. H. Khan, M. M. Rahman, R. Ahmed, and M. S. Rahman, “An efficient iris segmentation model based on eyelids and eyelashes detection in iris recognition system,” in Int. Conf. on Computer Communication and Informatics, 2015, pp. 1–7.
  • [12] M. Haindl and M. Krupička, “Unsupervised detection of non-iris occlusions,” Pattern Recognition Letters, vol. 57, pp. 60–65, 2015.
  • [13] E. Jalilian, A. Uhl, and R. Kwitt, “Domain adaptation for cnn based iris segmentation,” in Int. Conf. of the Biometrics Special Interest Group (BIOSIG), Sept 2017, pp. 1–6.
  • [14] N. Liu, H. Li, M. Zhang, J. Liu, Z. Sun, and T. Tan, “Accurate iris segmentation in non-cooperative environments using fully convolutional networks,” in Int. Conf. on Biometrics, 2016, pp. 1–8.
  • [15] M. Teichmann, M. Weber, J. M. Zöllner, R. Cipolla, and R. Urtasun, “Multinet: Real-time joint semantic reasoning for autonomous driving,” CoRR, vol. abs/1612.07695, 2016.
  • [16] P. Isola, J. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” CoRR, vol. abs/1611.07004, 2016. [Online]. Available: http://arxiv.org/abs/1611.07004
  • [17] C. Henry, S. M. Azimi, and N. Merkle, “Road segmentation in SAR satellite images with deep fully-convolutional neural networks,” CoRR, vol. abs/1802.01445, 2018.
  • [18] H. R. Roth, H. Oda, X. Zhou, N. Shimizu, Y. Yang, Y. Hayashi, M. Oda, M. Fujiwara, K. Misawa, and K. Mori, “An application of cascaded 3d fully convolutional networks for medical image segmentation,” Computerized Medical Imaging and Graphics, vol. 66, pp. 90–99, jun 2018.
  • [19] P. Luc, C. Couprie, S. Chintala, and J. Verbeek, “Semantic segmentation using adversarial networks,” CoRR, vol. abs/1611.08408, 2016.
  • [20] F. Jan, “Segmentation and localization schemes for non-ideal iris biometric systems,” Signal Processing, vol. 133, pp. 192–212, 2017.
  • [21] M. D. Marsico, M. Nappi, F. Narducci, and H. Proença, “Insights into the results of MICHE I - Mobile Iris CHallenge Evaluation,” Pattern Recognition, vol. 74, pp. 286–304, 2018.
  • [22] R. Jillela and A. A. Ross, Methods for Iris Segmentation.   Springer London, 2016, pp. 137–184.
  • [23] J. G. Daugman, “High confidence visual recognition of persons by a test of statistical independence,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1148–1161, 1993.
  • [24] K. Ahuja, R. Islam, F. A. Barbhuiya, and K. Dey, “Convolutional neural networks for ocular smartphone-based biometrics,” Pattern Recognition Letters, vol. 91, pp. 17–26, 2017.
  • [25] V. Dumoulin and F. Visin, “A guide to convolution arithmetic for deep learning,” arXiv preprint arXiv:1603.07285, 2016.
  • [26] E. Severo, R. Laroca, C. S. Bezerra, L. A. Zanlorensi, D. Weingaertner, G. Moreira, and D. Menotti, “A benchmark for iris location and a deep learning detector evaluation,” CoRR, vol. abs/1803.01250, 2018. [Online]. Available: http://arxiv.org/abs/1803.01250
  • [27] M. Teichmann, M. Weber, M. Zoellner et al., “Multinet: Real-time joint semantic reasoning for autonomous driving,” arXiv preprint arXiv:1612.07695, 2016.
  • [28] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 779–788.
  • [29] R. Laroca, E. Severo, L. A. Zanlorensi, L. S. Oliveira, G. R. Gonçalves, W. R. Schwartz, and D. Menotti, “A robust real-time automatic license plate recognition based on the YOLO detector,” CoRR, vol. abs/1802.09567, 2018. [Online]. Available: http://arxiv.org/abs/1802.09567
  • [30] D. R. Lucio, R. Laroca, E. Severo, A. S. Britto Jr., and D. Menotti, “Fully convolutional networks and generative adversarial networks applied to sclera segmentation,” CoRR, vol. abs/1806.08722, 2018. [Online]. Available: http://arxiv.org/abs/1806.08722
  • [31] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” CoRR, vol. abs/1409.1556, 2014.
  • [32] E. Shelhamer, J. Long, and T. Darrell, “Fully convolutional networks for semantic segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 640–651, April 2015.
  • [33] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” CoRR, vol. abs/1412.6980, 2014.
  • [34] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in NIPS.   Curran Associates, Inc., 2014, pp. 2672–2680.
  • [35] H. Hofbauer, F. Alonso-Fernandez, P. Wild, J. Bigun, and A. Uhl, “A ground truth for iris segmentation,” in 2014 22nd Int. Conf. on Pattern Recognition, Aug 2014, pp. 527–532.
  • [36] J. Fierrez, J. Ortega-Garcia, D. T. Toledano, and J. Gonzalez-Rodriguez, “Biosec baseline corpus: A multimodal biometric database,” Pattern Recognition, vol. 40, no. 4, pp. 1389–1392, 2007.
  • [37] T. Tan and Z. Sun, “CASIA-IrisV3,” Chinese Academy of Sciences Institute of Automation, http://www.cbsr.ia.ac.cn/IrisDatabase.htm, Tech. Rep., 2005.
  • [38] ——, “CASIA-IrisV4,” Chinese Academy of Sciences Institute of Automation, http://biometrics.idealtest.org/dbDetailForUser.do?id=4, Tech. Rep., 2005.
  • [39] A. Kumar and A. Passi, “Comparison and combination of iris matchers for reliable personal authentication,” Pattern Recognition, vol. 43, no. 3, pp. 1016–1026, 2010.
  • [40] H. Proença and L. A. Alexandre, “Toward covert iris biometric recognition: Experimental results from the NICE contests,” IEEE Trans. on Information Forensics and Security, vol. 7, no. 2, pp. 798–808, 2012.
  • [41] A. Sequeira, L. Chen, P. Wild, J. Ferryman, F. Alonso-Fernandez, K. B. Raja, R. Raghavendra, C. Busch, and J. Bigun, “Cross-Eyed - cross-spectral iris/periocular recognition database and competition,” in Int. Conf. of the Biometrics Special Interest Group, Sept 2016, pp. 1–5.
  • [42] M. Marsico, M. Nappi, D. Riccio, and H. Wechsler, “Mobile iris challenge evaluation (MICHE)-I, biometric iris dataset and protocols,” Pattern Recognition Letters, vol. 57, pp. 17–23, 2015.
  • [43] Y. Hu, K. Sirlantzis, and G. Howells, “Improving colour iris segmentation using a model selection technique,” Pattern Recognition Letters, vol. 57, pp. 24–32, 2015.
  • [44] H. Proenca, S. Filipe, R. Santos, J. Oliveira, and L. Alexandre, “The UBIRIS.v2: A database of visible wavelength images captured on-the-move and at-a-distance,” IEEE TPAMI, vol. 32, no. 8, pp. 1529–1535, 2010.
  • [45] N. Othman, B. Dorizzi, and S. Garcia-Salicetti, “OSIRIS: An open source iris recognition software,” Pat. Rec. Letters, vol. 82, pp. 124–131, 2016.
  • [46] A. Gangwar, A. Joshi, A. Singh, F. Alonso-Fernandez, and J. Bigun, “IrisSeg: A fast and robust iris segmentation framework for non-ideal iris images,” in Int. Conf. on Biometrics (ICB), June 2016, pp. 1–8.
  • [47] M. Marsico, M. Nappi, and H. Proença, “Results from MICHE II – Mobile Iris CHallenge Evaluation II,” Pattern Recognition Letters, vol. 91, pp. 3–10, 2017.