GHCLNet: A Generalized Hierarchically tuned Contact Lens detection Network
Iris serves as one of the best biometric modalities owing to its complex, unique and stable structure. However, it can still be spoofed using fabricated eyeballs and contact lenses. Accurate identification of contact lenses is a must for the reliable performance of any biometric authentication system based on this modality. In this paper, we present a novel approach for detecting contact lenses using a Generalized Hierarchically tuned Contact Lens detection Network (GHCLNet). We propose a hierarchical architecture for three-class ocular classification, namely: no lens, soft lens and cosmetic lens. Our network architecture is inspired by the ResNet-50 model. The network works on raw input iris images without any pre-processing or segmentation requirement, and this is one of its prodigious strengths. We have performed extensive experimentation on two publicly available data-sets, namely 1) IIIT-D and 2) ND, and on the IIT-K data-set (not publicly available) to ensure the generalizability of our network. The proposed architecture's results are quite promising and outperform the available state-of-the-art lens detection algorithms.
Security is an important issue for every individual, organization and country seeking to protect its information from unauthorized access. Today, a major portion of all information is stored in the form of digital documents. Password-based security has become futile because of its drawbacks: short passwords can be cracked easily, and strong ones are cumbersome to remember, rendering them ineffective. In such situations, biometric-based authentication systems, which provide a unique personal identification to all, prove to be more reliable. There are many different traits in the field of biometrics, and the selection of a trait is an essential task for increasing the robustness of security. Iris is considered one of the best traits for biometric authentication because its complex patterns are unique, stable, and can easily be captured even from short distances; however, it can still be spoofed using contact lenses.
The important feature of an iris is its textural patterns, which differ from person to person. Studies have proved that these patterns are different even among monozygotic twins. The use of contact lenses, however, can change these textural patterns. Lenses decrease the accuracy of iris recognition because of the change in texture they bring. Therefore, it is of foremost importance to detect the presence of lenses before proceeding with actual iris recognition. Cosmetic lenses, being coloured and textured, are easily detectable as they differ considerably from the texture of a normal iris. Soft lenses, however, being transparent in nature, are very difficult to differentiate from no-lens. A lot of work has been done in the past on lens detection, but we are still far behind in accurately differentiating soft-lens from no-lens.
Related Work: The first iris-based biometric algorithm was pioneered by Daugman in the 1990s. He suggested a frequency-spectrum analysis method to distinguish a real iris image from a fabricated one, and an iris without a lens from one with a contact lens. Zhang et al. proposed a method based on weighted Local Binary Patterns (LBP) encoded with SIFT descriptors for classifying iris images into lens and no-lens categories. Ring et al. detected regions of local distortion within the iris, by analyzing the iris bitcode, to detect contact lenses. Doyle et al. ensembled 14 classifiers to tackle the three-class lens detection problem and achieved an accuracy of 97%. Lovish et al. proposed a method based on Local Phase Quantization (LPQ) and Binary Gabor Patterns (BGP) for detecting cosmetic lenses. Lee et al. proposed a hardware-based solution to distinguish between a real and a fabricated iris image based on Purkinje image formation. Yadav et al. investigated the effects of textured lenses on iris recognition using variants of Local Binary Patterns. Recently, Raghavendra et al. proposed ContlensNet, an architecture based on a deep convolutional neural network for lens detection. It can be concluded from the work done so far that classifying cosmetic lens from no-lens is a well-studied problem, achieving a Correct Classification Rate (CCR%) of up to 99% and above, but accurately differentiating soft lens from no-lens is still a challenging issue.
Contribution: In this paper, we propose a Generalized Hierarchically tuned Contact Lens detection Network (GHCLNet) for three-class ocular classification, namely no-lens, soft-lens and textured-lens. The main contribution of this paper is threefold, as summarized below.
A hierarchical deep convolutional network (GHCLNet) for three-class ocular classification, namely no-lens, soft-lens and cosmetic-lens, has been proposed. The prodigious strength of this network lies in the fact that it works on full holistic contact-lens features without any pre-processing or segmentation prerequisite.
Generalized deep convolutional neural network based architecture has been proposed.
To ensure the generalization ability of the proposed network, multi-sensor and combined-sensor validation has been performed over benchmark databases and compared with the state-of-the-art methods.
The rest of the paper is organized as follows: Section 2 presents the proposed architecture framework, Section 3 discusses the databases and testing protocol, Section 4 presents the experimental results and comparative analysis, and Section 5 concludes the paper.
2 Proposed Network
Our proposed network is a hierarchical network, as shown in Fig. 1, inspired by the ResNet-50 architecture. The first part of the hierarchical network was exclusively trained to classify iris images into "textured" and "non-textured", and the second part was exclusively trained to classify iris images into "lens" and "no-lens". Both parts are ResNet-50 models pre-trained on ImageNet images, with the first part re-trained on textured-lens ("textured") images and no-lens + soft-lens ("non-textured") images, and the second part re-trained on soft-lens ("lens") images and no-lens ("no-lens") images. The ResNet-50 model is a popular deep convolutional neural network made up of five blocks, as explained below:
Block-1 is the initial branch, which receives an input RGB image of size 224×224×3. The input image is convolved with 64 kernels of size 7×7 (stride 2) to give a 112×112×64 feature map, which is then passed to a max-pooling layer to reduce its spatial size to 56×56.
Block-2 comprises three sub-blocks: 2(a), 2(b) and 2(c). The output feature map of block-2 is of size 56×56×256.
Block-3 comprises four sub-blocks: 3(a), 3(b), 3(c) and 3(d). The final output of block-3 is a 28×28×512 feature map.
Block-4 consists of six sub-blocks, namely 4(a), 4(b), 4(c), 4(d), 4(e) and 4(f). Its output feature map is of size 14×14×1024.
Block-5 is the last block and consists of three sub-blocks: 5(a), 5(b) and 5(c). The output of block-5 is a feature map of size 7×7×2048.
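As a sanity check on the block dimensions above, the spatial-size flow through the five blocks can be traced with a few lines of arithmetic (a minimal sketch assuming the standard 224×224 ResNet-50 input; the function name is ours, not from the paper):

```python
# Trace the spatial resolution through the five ResNet-50 blocks for a
# standard 224x224 input. Each stride-2 stage halves the resolution.
def resnet50_spatial_sizes(input_size=224):
    """Return the spatial size after each stage of ResNet-50."""
    sizes = {}
    s = input_size // 2          # Block-1: 7x7 conv, stride 2 -> 112
    sizes["block1_conv"] = s
    s //= 2                      # Block-1: 3x3 max-pool, stride 2 -> 56
    sizes["block2"] = s          # Block-2 keeps 56x56
    s //= 2                      # Block-3 downsamples -> 28
    sizes["block3"] = s
    s //= 2                      # Block-4 downsamples -> 14
    sizes["block4"] = s
    s //= 2                      # Block-5 downsamples -> 7
    sizes["block5"] = s
    return sizes

print(resnet50_spatial_sizes())
# {'block1_conv': 112, 'block2': 56, 'block3': 28, 'block4': 14, 'block5': 7}
```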
[a] ResNet Pruning: ResNet-50 is a very deep network, 50 layers, pre-trained on the ImageNet database. During experimentation, we found that similar performance was achieved by re-training only the 3rd and 5th blocks of ResNet-50 instead of re-training all the blocks, whereas re-training any of the 1st, 2nd and 4th blocks resulted in a drastic drop in performance. After extensive experimentation, it was found that for four of the databases, viz. IITK, Cogent, Vista and ND-II, maximum performance along with minimal training time was achieved when only the 3rd and 5th blocks of the ResNet-50 models were trained. Since the ND-I database is very small compared to the other databases, its training was restricted to sub-block 5(c) in order to avoid over-fitting. The other network parameters, found after extensive experimentation, are summarized in Table 1.
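The block-selective re-training above can be sketched as a simple layer filter. This is an illustrative sketch, not the paper's code; the `res3…`/`res5…` layer-name prefixes follow a common Keras ResNet-50 naming convention and are an assumption:

```python
# Select which layers to re-train, by ResNet block. In a Keras model one
# would then set layer.trainable = (layer.name in selected) before
# compiling; the layer names here are illustrative.
def select_trainable(layer_names, blocks=("3", "5")):
    """Return the layer names belonging to the given ResNet blocks."""
    prefixes = tuple("res" + b for b in blocks)
    return [name for name in layer_names if name.startswith(prefixes)]

layers = ["conv1", "res2a_branch2a", "res3a_branch2a",
          "res4b_branch2b", "res5c_branch2c"]
print(select_trainable(layers))
# ['res3a_branch2a', 'res5c_branch2c']
```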
During testing, the test image is fed into both models. The output of the first part of the hierarchical network is checked first: if it classifies the image as "textured", the image is assigned the textured-lens label. If it classifies the image as "non-textured", the output of the second part of the network is considered: if that part outputs "lens", the image is put in the soft-lens category, and if it outputs "no-lens", the image is put in the no-lens category.
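This two-stage decision rule reads directly as a short cascade function (a sketch under the assumption that each part outputs a probability and that 0.5 is the decision threshold; the paper does not state its thresholds):

```python
# Two-stage cascade: the textured/non-textured model is consulted first,
# and the lens/no-lens model only decides among non-textured images.
def classify(p_textured, p_lens, threshold=0.5):
    """p_textured: P(textured) from part 1; p_lens: P(lens) from part 2."""
    if p_textured >= threshold:
        return "textured-lens"
    return "soft-lens" if p_lens >= threshold else "no-lens"

print(classify(0.9, 0.1))  # textured-lens
print(classify(0.2, 0.8))  # soft-lens
print(classify(0.2, 0.3))  # no-lens
```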
Table 1 (excerpt): mini-batch size = 64.
[b] Network Implementation Details: The proposed generalized hierarchically tuned contact-lens detection network has been implemented in Python using the Keras library with TensorFlow as its backend. All experiments were run on an Intel(R) Xeon(R) CPU E5-2630 v4 at 2.20 GHz with 32 GB RAM and an NVIDIA 1080 Ti GPU with 8 GB RAM.
3 Database and Testing Protocol
In this section, we present the details of the databases and the testing protocols used. We have tested our proposed network on the IIT-Kanpur contact lens iris database (IIT-K) and on two publicly available iris databases, namely the Notre Dame cosmetic contact lens 2013 database (ND) and the IIIT-Delhi contact lens iris database (IIIT-D). Detailed descriptions of these databases are presented in the following sub-sections and in Table 2.
This database consists of a total of iris images corresponding to subjects, captured using the Vista Imaging FA2 sensor. Since iris images are available for both the left and the right eye, there are unique iris instances present in this database. All the soft lenses used in this database are manufactured by Johnson & Johnson and Bausch & Lomb, and all the cosmetic lenses are manufactured by CIBA Vision, Flamymboyout, Oxycolor and FreshLook. This database is provided with an evaluation protocol that uses a set of subjects for training and the remaining subjects for testing.
This database is conceptually divided into two parts, namely ND-I and ND-II. ND-I comprises training-set images and testing-set images, all captured using the IrisGuard AD-100 sensor. ND-II comprises training-set images and testing-set images, all captured using the LG-4000 sensor. CIBA Vision, Johnson & Johnson and Cooper Vision are the three main suppliers of the cosmetic contact lenses in this database. For evaluating the proposed framework, we follow the evaluation protocol recommended for this database.
It consists of a total of iris images corresponding to subjects, captured using two sensors, namely the Cogent dual iris sensor (CIS 202) and the Vista FA2E single iris sensor. All the soft lenses used in this database are manufactured either by CIBA Vision or by Bausch & Lomb. This database is provided with an evaluation protocol that uses a set of subjects for training and the remaining subjects for testing.
4 Experimental Results and Discussion
Our network was trained and tested quantitatively, and results are presented for four different experiments: (a) intra-sensor validation, (b) inter-sensor validation, (c) multi-sensor validation and (d) combined-sensor validation. The quantitative results are reported using the Correct Classification Rate (CCR%); the higher the value, the better the performance.
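For reference, CCR% as used in all the tables below is simply the percentage of correctly labelled test images:

```python
# Correct Classification Rate: percentage of test images whose predicted
# class matches the ground-truth class.
def ccr(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

y_true = ["no-lens", "soft-lens", "textured", "no-lens"]
y_pred = ["no-lens", "no-lens",   "textured", "no-lens"]
print(ccr(y_true, y_pred))  # 75.0
```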
4.1 Intra-sensor Validation
In this testing strategy, training and testing are done on data captured from a single sensor. The results shown in Table 3 indicate the performance of our proposed network for different sensors. The proposed network's results were analyzed against three other state-of-the-art algorithms, namely Statistically Independent Filters, Deep Image Representation and ContlensNet. The following observations were made after the analysis:
The results obtained on the IIITD Cogent database with the proposed network show the best performance, with a total CCR% of 93.71%. The proposed network shows a gain of 6% in CCR% over the previous state-of-the-art algorithm, ContlensNet.
The results on the IIITD Vista database show exceptional performance of the proposed network, with a total CCR% of 95.49%. The results show a gain of 8% in CCR% compared with the second-best ContlensNet model.
The results obtained on the ND-I database are somewhat below the available state-of-the-art results. The main reason can be attributed to the smaller number of training images available in this database.
The results obtained on the ND-II database are comparable to the available state-of-the-art technique.
The results obtained on the IITK database are exceptional, with a total CCR% of 99.67%; the accuracy of every class is above 99% in CCR%. This can be attributed to the fact that the large number of training samples enabled the network to learn better discriminative representations for each class.
4.2 Inter-Sensor Validation
In this testing strategy, the quantitative performance of the proposed network is shown under inter-sensor validation: the network is trained on one sensor and tested on another. Here, we perform pairwise comparisons of IIITD Vista and IIITD Cogent, and of ND-I and ND-II, resulting in four different cases. Table 4 shows the quantitative performance of the proposed network against three state-of-the-art algorithms, namely Statistically Independent Filters, Deep Image Representation and ContlensNet. The following are the prominent observations from these experiments:
When the training data is from the IIITD Vista sensor and the testing data is from the IIITD Cogent sensor, the proposed network obtains an accuracy of 82.61% in CCR%.
When the proposed network is trained on data from the IIITD Cogent sensor and tested on data from the IIITD Vista sensor, an accuracy of 92.01% in CCR% is noted.
When the training data is chosen from the ND-II sensor and the testing data from the ND-I sensor, an excellent accuracy of 91.51% in CCR% is obtained with the proposed network. The results show an increase of 3.5% in CCR% over the previous state-of-the-art algorithm, ContlensNet.
When the training data is chosen from the ND-I sensor and the testing data from the ND-II sensor, the best performance of 90.58% in CCR% is obtained with the proposed network. The results show a gain of 0.13% in CCR% over the previous state-of-the-art algorithm, ContlensNet.
4.3 Multi-sensor Validation
In this testing strategy, data from two or more sensors is combined to form a single database. Here, data from the same parent database is combined to form two separate databases, namely IIITD-combined and ND-combined. The training and testing data are combined separately so that the original train/test split is maintained. Table 5 reports the quantitative performance of the proposed network along with two state-of-the-art methods under multi-sensor validation. The observations made after experimentation are as follows:
Training and testing on the ND-combined data (both ND-I and ND-II) with the proposed network show an excellent accuracy of 95.57% in CCR%. The results show a gain of 3% over the previous state-of-the-art network, ContlensNet.
Training and testing on the IIITD-combined data (both Vista and Cogent) yield an accuracy of 94.82% in CCR% with the proposed network. The results show an improvement of 0.17% in CCR% over the previous state-of-the-art ContlensNet.
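The multi-sensor protocol above amounts to pooling the per-sensor train splits and, separately, the per-sensor test splits, so no test image ever enters training (a sketch with illustrative data, not the paper's code):

```python
# Pool train and test splits across sensors, keeping them disjoint.
def combine(datasets):
    train = [x for d in datasets for x in d["train"]]
    test = [x for d in datasets for x in d["test"]]
    return train, test

cogent = {"train": ["c1", "c2"], "test": ["c3"]}
vista = {"train": ["v1"], "test": ["v2", "v3"]}
train_set, test_set = combine([cogent, vista])
print(train_set)  # ['c1', 'c2', 'v1']
print(test_set)   # ['c3', 'v2', 'v3']
```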
4.4 Combined-sensor Validation
In this testing strategy, the training data of all the databases are combined to form a single large training database, while testing is performed on the individual test databases. The combined database comprises: 1) IITK, 2) IIITD (Cogent and Vista) and 3) ND (ND-I and ND-II). The results obtained are recorded in Table 6. Testing the network on IITK, ND-I, ND-II, IIITD Vista and IIITD Cogent results in very good accuracies of 99.14%, 92.87%, 94.93%, 95.69% and 95.43%, respectively. We are not able to perform any comparative analysis here because this kind of validation has not been done before. The main aim of this validation is to show that our network trains well on images acquired from different kinds of sensors, which depicts its strong generalization ability.
Table 6 (excerpt), all databases used for training (N-N): 99.67, 84.38, 94.52, 94.8, 95.19, 93.71 (CCR%).
4.5 Comparative Analysis
It can be inferred from Tables 3-6 that our proposed architecture (GHCLNet) performs far better in terms of CCR% than the algorithms of the pre-deep-learning era. To the best of our knowledge, ContlensNet, a recent work, is the only other architecture based on a deep convolutional neural network for contact lens detection.
The ContlensNet architecture uses OSIRIS v4.1, a publicly available segmentation tool, for iris segmentation and normalization. This tool has limited performance under occlusion, illumination and other environmental factors; thus, one has to segment a huge number of iris images manually. ContlensNet takes the normalized and segmented iris region, in the form of patches, as training input. Since this architecture does not take the sclera region of the eye into consideration, it is not affected by occlusion due to eyelashes.
The main advantage of our proposed GHCLNet is that it does not use any kind of pre-processing or segmentation and still gives comparable, and in many cases better, results. Our network is trained in such a way that it handles illumination, occlusion and other external environmental factors remarkably well. We lag marginally behind ContlensNet in a few places, as discussed in Sections 4.1 and 4.2. The main reason for this is the poor quality of some raw iris images, as can be seen in Fig. 2: some of the iris images are illuminated to such an extent that their textural patterns are distorted, and some are highly occluded. It is very difficult even for a human being to distinguish between no-lens, soft-lens and cosmetic-lens in such images. As our proposed GHCLNet uses the entire raw input image without segmentation, these factors affect its performance. However, since our network is quite deep, when we combine the data of all the data-sets under consideration we obtain exceptionally high performance, as depicted in Table 6; this indicates the high generalization ability of our network.
We can summarize our network performance for different testing protocols as follows:
Intra-Sensor Validation: GHCLNet performance is quite high in the case of IIITD-Cogent and IIITD-Vista, but lower in the case of ND-I and ND-II, mainly because of the smaller amount of training data available in these datasets; since our network is deep, it requires a large amount of data to predict well.
Multi-Sensor Validation: Owing to the strong generalization ability of our network, our results, as depicted in Table 5, outperform all the available state-of-the-art techniques.
4.6 Layer Specific Feature Analysis
Fig. 3 shows a layer-specific feature analysis of no-lens, soft-lens and cosmetic-lens images. It is clearly evident from Fig. 3 that the initial convolutional layers learn generic features. The main reason is that the initial convolutional layers look directly at the raw pixels, which makes them more interpretable, while as we go deeper, features specific to no-lens, soft-lens and cosmetic-lens are learned. Features learned by the initial layers are very basic, but as we move deeper in the network, more specific learning is performed, such as layers that detect edges and lines. A deeper layer plays a major role in differentiating no-lens images from cosmetic-lens images; in this layer, textural features are learned. Interestingly, we have observed that our network automatically learns Gabor-filter-like features at different orientations. The deepest layers of the network learn high-level aggregated discriminative features, as shown in Fig. 3. In these layers, the resolution drops and the features become mostly an encoding of a few discriminative pieces of intrinsic information.
Iris is considered one of the best traits for biometric authentication, as its complex patterns are unique and stable. However, the use of lenses decreases the accuracy of iris recognition because of the change in texture they bring, so the presence of lenses must be detected before proceeding with actual iris recognition. In this paper, we proposed a novel Generalized Hierarchically tuned Contact Lens detection Network (GHCLNet).
Extensive experimentation has been carried out on two publicly available databases and the IIT-K database using four testing strategies: intra-sensor validation, inter-sensor validation, multi-sensor validation and combined-sensor validation. The consistent CCR% improvements in multi-sensor validation and the strong combined-sensor validation results indicate the generalization ability of our network. To the best of our knowledge, this kind of combined-sensor testing has not been done before. The proposed architecture, with its promising results, substantially outperforms the current state-of-the-art techniques. The main strength of this network lies in the fact that it does not use any kind of pre-processing or iris segmentation and still gives remarkable results. This saves a lot of computational time, and the network can thus be integrated very easily as the first step in any iris recognition system to increase its performance.
-  Bausch & Lomb, Rochester, NY, USA. (2014, Jan.). Bausch & Lomb lenses. Available: http://www.bausch.com.
-  Cooper Vision. (2013, Apr.). Expressions Colors. Available: http://coopervision.com/contact-lenses/expressions-color-contacts.
-  CIBA Vision, Duluth, GA, USA. (2013, Apr.). FreshLook ColorBlends. Available: http://www.freshlookcontacts.com.
-  IrisGuard, Washington, DC, USA. (2013, Apr.). AD100 camera. Available: http://www.Irisguard.com/uploads/AD100ProductSheet.pdf.
-  Johnson & Johnson, Skillman, NJ, USA. (2013, Apr.). Acuvue2 Colours. Available: http://www.acuvue.com/products-acuvue-2-colours.
-  LG, Riyadh, Saudi Arabia. (2011, Oct.). LG 4000 camera. Available: http://www.lgIris.com.
-  F. Chollet et al. Keras. https://github.com/fchollet/keras, 2015.
-  J. Daugman. How iris recognition works. In Proceedings of the 2002 International Conference on Image Processing, ICIP, pages 33–36, 2002.
-  J. Daugman. Demodulation by complex-valued wavelets for stochastic pattern recognition. Int. J. Wavelets, Multiresolution Inf. Process., 1(1):1–17, 2003.
-  J. Doyle, K. Bowyer, and P. Flynn. Variation in accuracy of textured contact lens detection based on sensor and lens pattern. In 6th IEEE International Conference on Biometrics, Technol., Appl., Syst., pages 1–7, 2013.
-  E. C. Lee, K. R. Park, and J. Kim. Fake iris detection by using purkinje image. In International Conference on Biometrics,IAPR, pages 397–403, 2006.
-  Lovish, A. Nigam, B. Kumar, and P. Gupta. Robust contact lens detection using local phase quantization and binary gabor pattern. In 16th International Conference, Computer Analysis of Images and Patterns CAIP, pages 702–714, 2015.
-  M. Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/.
-  R. Raghavendra, K. B. Raja, and C. Busch. Ensemble of statistically independent filters for robust contact lens detection in iris images. In Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing, page 24, 2014.
-  R. Raghavendra, K. B. Raja, and C. Busch. Contlensnet: Robust iris contact lens detection using deep convolutional neural networks. In 2017 IEEE Winter Conference on Applications of Computer Vision,WACV , Santa Rosa, CA, USA, pages 1160–1167, 2017.
-  S. Ring and K. Bowyer. Detection of iris texture distortions by analyzing iris code matching results. In BTAS, pages 1–6, 2008.
-  P. Silva, E. Luz, R. Baeta, H. Pedrini, A. X. Falcao, and D. Menotti. An approach to iris contact lens detection based on deep image representations. In Graphics, Patterns and Images (SIBGRAPI), pages 157–164, 2015.
-  D. Yadav, N. Kohli, J. S. Doyle Jr., R. Singh, M. Vatsa, and K. W. Bowyer. Unraveling the effect of textured contact lenses on iris recognition. IEEE Trans. Information Forensics and Security, 9(5):851–862, 2014.
-  H. Zhang, Z. Sun, and T. Tan. Contact lens detection based on weighted lbp. In 20th International Conference on Pattern Recognition, pages 4279–4282, 2010.