A Survey on Behavioral Biometric Authentication on Smartphones
Recent research has shown the possibility of using smartphones’ sensors and accessories to extract some behavioral attributes such as touch dynamics, keystroke dynamics and gait recognition. These attributes are known as behavioral biometrics and could be used to verify or identify users implicitly and continuously on smartphones. The authentication systems that have been built based on these behavioral biometric traits are known as active or continuous authentication systems.
This paper provides a review of active authentication systems. We present the components and the operating process of active authentication systems in general, followed by an overview of the state-of-the-art behavioral biometric traits that have been used to develop active authentication systems and their evaluation on smartphones. We discuss the issues, strengths and limitations associated with each behavioral biometric trait, and we provide a comparative summary of them. Finally, we present challenges and open research problems in this research field.
Keywords: Behavioral Biometric Authentication, Touch Dynamics, Keystroke Dynamics, Behavioral Profiling, Gait Recognition
With the diversity of sensors and services on smartphones, as shown in Figure 1, smartphones have become smarter and attract both (1) users, who rely on them more than ever to facilitate their daily lives and consequently store more sensitive and private information on them, and (2) attackers, who are increasingly interested in accessing or stealing this sensitive data. These attacks could be carried out by either an insider attacker, someone who knows the user such as a friend or family member, or a stranger attacker, someone who does not know the user Muslukhov:2012 ().
Due to the weaknesses of traditional authentication mechanisms such as PIN, pattern and password, and of biometric-based mechanisms such as fingerprint, face and voice recognition on smartphones, the research community has developed authentication mechanisms based on behavioral biometric traits such as gesture, keystroke and gait. These mechanisms are known as active or continuous authentication mechanisms.
In this paper we present the components and the operating process of active authentication mechanisms in general, followed by the metrics used to evaluate their performance. We also conduct an extensive survey of the state-of-the-art active authentication systems and their evaluation on smartphones. We discuss the issues, strengths and limitations associated with each behavioral biometric trait, and provide a comparative analysis of them. Finally, we identify challenges and open research problems, and provide a set of recommendations in this research field.
The rest of the paper is organized as follows: Section 3 provides an overview of active authentication systems in general. Section 4 presents a set of factors that facilitate the selection of a behavioral biometric trait. Section 5 presents another set of factors that help in designing the biometric authentication system. Section 6 surveys the common behavioral biometric traits. Section 8 presents some limitations, followed by a set of challenges and future trends.
2 Adversary Attacks
The main goal of attackers is to gain physical access to the victim’s device for snooping or data destruction. These attacks could be carried out by either an insider attacker or a stranger attacker Muslukhov:2012 ().
Insider attacker: someone who knows the user, such as a friend or family member. The insider attacker has the opportunity to gain unauthorized access to the victim’s device due to the proximity between them. Usmani et al. Usmani:2017 () characterized social insider attacks and found that existing device protections (i.e., traditional authentication methods such as pattern or password) and Facebook account security measures are ineffective at resisting them.
Stranger attacker: someone who does not know the user. The stranger attacker has no prior knowledge about the victim and may have stolen the legitimate user’s device or found a lost device.
3 The Active Authentication
In this section we define active authentication and give an overview of how an active authentication system works, followed by its modes of operation. Finally, we present the metrics that have been used to evaluate the performance of active authentication systems.
3.1 What is an active authentication system?
An active authentication system is an automated recognition process that verifies or identifies individuals based on detailed information about their body, such as the face, or their behavior, such as how they type or interact with the sensors on a smartphone. Figure 1 shows some sensors and services that can be used to acquire behavioral biometric data. The main goal of the recognition process is to prevent unauthorized access by imposters and to grant access only to the legitimate user. The idea behind the recognition process in an active authentication system is to establish identity based on the "who you are" concept. The details of how the recognition process works for a specific biometric trait are described in the next section.
There are two important characteristics that any active authentication system should achieve:
Continuity: The smartphone verifies the user in a continuous manner; the authentication system keeps authenticating the user for as long as the smartphone is in use. In other words, it is a re-authentication process that is conducted periodically.
Transparency: All authentication processes should be carried out in the background without interrupting the user (i.e., the user is implicitly authenticated without any intervention).
These two characteristics are the cornerstone of any active authentication system and distinguish it from traditional authentication systems. Different biometric techniques can be used to achieve these characteristics. They fall into two groups: physiological biometric mechanisms such as face and voice recognition, and behavioral biometric mechanisms such as touch and keystroke dynamics. In this paper we concentrate on surveying the behavioral ones.
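As a rough illustration of the continuity property, the sketch below re-checks a stream of per-window authentication scores and reports when the session would be locked. The function name `monitor_session`, the idea of one score per observation window, and the threshold value are illustrative assumptions, not part of any surveyed system.

```python
def monitor_session(window_scores, threshold):
    """Periodic re-authentication over a session (illustrative sketch).

    window_scores: authentication scores, one per observation window,
                   assumed to be produced in the background by some classifier.
    Returns the index of the first window whose score does not exceed the
    threshold (the point where the device would be locked), or None if every
    window passes.
    """
    for index, score in enumerate(window_scores):
        if not score > threshold:
            return index
    return None


# Example: the third window (index 2) falls below the assumed threshold of 0.5.
locked_at = monitor_session([0.9, 0.8, 0.3, 0.9], threshold=0.5)
```

Because the check runs on scores gathered in the background, the user is never prompted, which is the transparency property described above.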
3.2 How does the active authentication system work?
An active authentication system works like a biometric recognition system, which contains two main phases, an enrolment phase and a recognition phase, as shown in Figure 2. In the enrolment phase, the system acquires the biometric data, analyzes it and extracts a distinctive feature set, then builds the feature templates (comparable to training a classifier). In the recognition phase, the system similarly acquires biometric data and extracts features, but instead of storing these features in the feature templates, it compares them with the stored ones to verify the user identity.
Any active authentication system should include the following basic modules:
Data acquisition module: this is the first step in the system, where the raw biometric data is collected by one of the sensors in the smartphone, such as the camera or touchscreen sensor (see Figure 1). The quality of the collected data is very important because it affects the subsequent modules of the recognition process. Data quality depends on the sensors used and the environment in which the data was collected Jain:2011:IB:2161587 ().
Feature extraction module: before the distinctive features are extracted, the raw data has to be preprocessed to detect and remove outliers and improve data quality, especially if the data was collected in an uncontrolled environment with uncooperative users. Once the data is cleaned and processed, a set of discriminative features is extracted. The extracted features depend on the type of raw data; for example, if the collected data contains timestamps, temporal features can be extracted.
Feature templates: a repository (database) that contains a concatenation of the extracted feature vectors for a specific user (i.e., the device owner). It is built during the enrolment phase and used during the recognition phase, where it is compared with the captured feature sample to verify the claimed identity.
Matching and decision-making module: it is used only during the recognition phase, where it compares the extracted features against the stored feature templates to generate a matching score and make a decision. The decision validates the claimed identity, i.e., whether the sample comes from the legitimate user or an imposter.
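The two phases and the modules listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the design of any surveyed system: the template is taken to be the per-feature mean of the owner's samples, the matching score is a similarity derived from Euclidean distance, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def extract_features(raw_sample):
    """Feature extraction module (placeholder): a raw sample is assumed to
    already be a sequence of numeric measurements."""
    return [float(value) for value in raw_sample]


def enroll(raw_samples):
    """Enrolment phase: build a template as the per-feature mean of the
    owner's feature vectors (an assumed, deliberately simple template)."""
    vectors = [extract_features(sample) for sample in raw_samples]
    return [mean(column) for column in zip(*vectors)]


def match_score(template, raw_sample):
    """Matching module: map the Euclidean distance between the sample and
    the template to a similarity score in (0, 1]."""
    features = extract_features(raw_sample)
    distance = sqrt(sum((f - t) ** 2 for f, t in zip(features, template)))
    return 1.0 / (1.0 + distance)


def recognize(template, raw_sample, threshold=0.5):
    """Decision-making module: accept when the score exceeds the threshold."""
    return match_score(template, raw_sample) > threshold


# Enrol the owner from two samples, then test a close and a distant sample.
owner_template = enroll([[1.0, 2.0], [3.0, 4.0]])
accepted = recognize(owner_template, [2.0, 3.0])
rejected = recognize(owner_template, [5.0, 7.0])
```

A real system would replace the mean template and distance score with a trained classifier, but the division of labour between the modules stays the same.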
3.3 Modes of operation
A biometric system can operate in one of two modes, verification or identification, depending on the recognition context.
3.3.1 Verification mode
In verification mode, which is a one-to-one matching process, the system verifies the claimed identity by comparing it with the stored one. If the matching score of the claimed identity is greater than a predefined threshold, then the claimed identity is accepted as legitimate; otherwise, it is rejected as an imposter. The authentication process can therefore operate in verification mode and be implemented as a binary classification problem. The decision rule is given by the following formula:
$$\text{decision}(u) = \begin{cases} \text{legitimate}, & \text{if } S(u) > \tau \\ \text{imposter}, & \text{otherwise} \end{cases}$$
where $S(u)$ represents the authentication score for a user $u$, calculated by the classifier, and $\tau$ represents a predefined threshold.
3.3.2 Identification mode
In identification mode, which is a one-to-many matching process, the system recognizes the presented biometric sample by comparing it with all stored templates (i.e., one template per user). The matching algorithm generates one matching score per user and estimates the identity of the sample from the highest score, subject to a designated threshold.
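The difference between the two modes can be made concrete with a short sketch. The function names and the dictionary of per-user scores are illustrative assumptions; the scores themselves are taken to come from some matching module.

```python
def verify(score, threshold):
    """Verification (one-to-one): accept the claimed identity only if its
    matching score is greater than the threshold."""
    return score > threshold


def identify(scores_by_user, threshold):
    """Identification (one-to-many): pick the user with the highest matching
    score, and return None if even that score does not exceed the threshold."""
    if not scores_by_user:
        return None
    best_user = max(scores_by_user, key=scores_by_user.get)
    return best_user if scores_by_user[best_user] > threshold else None


# Hypothetical scores for one presented sample against three stored templates.
scores = {"alice": 0.42, "bob": 0.91, "carol": 0.15}
best_match = identify(scores, threshold=0.5)
```

Verification answers "is this the claimed user?", while identification answers "which enrolled user, if any, is this?"; the threshold guards both against accepting a poor match.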
3.4 Performance metrics
Different metrics can be used to evaluate the performance of an active authentication system. The choice of metrics depends on the type of evaluation, of which there are three:
Technology evaluation: the most common and general type of evaluation. It evaluates a single biometric modality offline, comparing different algorithms for that modality on a fixed dataset.
Scenario evaluation: the main objective of this evaluation type is to test the whole biometric system for a class of applications under real-world conditions, where the dataset is collected from real subjects.
Operational evaluation: similar to scenario evaluation, but it measures a complete biometric system in a specific application environment in real time.
Because our application context is authentication, we describe the metrics used in verification mode rather than identification mode. We assume that each mobile device is used by only one user (i.e., a single-user context) and that the goal is to prevent unauthorized access by differentiating between the legitimate user and an imposter (i.e., a binary classification problem).
The basic metrics used to evaluate an active authentication system are based on error rates. Before describing the verification error rates, we introduce the raw counts from which they are calculated.
The raw counts and their meaning in our problem domain are as follows:
True Accept (TA): the system correctly matches a genuine user to the corresponding template stored within the system.
True Reject (TR): the system correctly rejects an imposter, whose data does not match any template within the system.
False Accept (FA): an imposter is incorrectly matched to a genuine user’s template stored within the system.
False Reject (FR): a genuine user is incorrectly rejected by the system.
The common metrics used in the literature to evaluate the performance of active authentication systems are as follows:
True Accept Rate (TAR): the probability that the system correctly matches a genuine user to the corresponding stored template. It is calculated as follows:
$$\mathrm{TAR} = \frac{TA}{TA + FR}$$
False Accept Rate (FAR): the proportion of imposters that are incorrectly matched to a genuine user’s template stored in the system. It is calculated as follows:
$$\mathrm{FAR} = \frac{FA}{FA + TR}$$
FAR reflects the ability of a non-authorized user to access the system, whether via zero-effort access attempts, deliberate spoofing or any other method of circumvention.
False Reject Rate (FRR): the proportion of genuine users that are incorrectly rejected by the system. It is calculated as follows:
$$\mathrm{FRR} = \frac{FR}{FR + TA}$$
Equal Error Rate (EER): the point at which the genuine and imposter error rates (FRR and FAR) are equal; a lower EER indicates better performance. It summarizes the performance of the authentication system in a single value. It has also been studied with respect to energy consumption: for instance, Sitova et al. Sitova:2013 () proposed an evaluation of active behavioral biometric authentication based on EERs at various levels of energy consumption.
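The error rates above follow directly from the four raw counts. The sketch below computes them from lists of genuine and imposter scores and approximates the EER by scanning candidate thresholds; the function names, the toy scores and the scan-based EER approximation are illustrative assumptions rather than a procedure taken from the surveyed papers.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (TAR, FAR, FRR) at one threshold.

    A sample is accepted when its score is greater than the threshold.
    Both score lists are assumed to be non-empty.
    """
    ta = sum(score > threshold for score in genuine_scores)   # true accepts
    fr = len(genuine_scores) - ta                              # false rejects
    fa = sum(score > threshold for score in impostor_scores)  # false accepts
    tr = len(impostor_scores) - fa                             # true rejects
    tar = ta / (ta + fr)
    far = fa / (fa + tr)
    frr = fr / (fr + ta)
    return tar, far, frr


def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER: try every observed score as a threshold, keep the
    one where |FAR - FRR| is smallest, and report the mean of FAR and FRR
    there together with that threshold."""
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        _, far, frr = error_rates(genuine_scores, impostor_scores, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Toy scores (made up for illustration only).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
tar, far, frr = error_rates(genuine, impostor, threshold=0.5)
eer, eer_threshold = equal_error_rate(genuine, impostor)
```

Note that TAR and FRR always sum to one, so reporting FAR together with either of them fully describes the system at a given threshold.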
In addition, two important metrics describe the authentication performance of the system through a curve:
Receiver Operating Characteristic (ROC) curve: depicts the trade-off between TAR on the y-axis and FAR on the x-axis in a single curve, where points are plotted parametrically as a function of the decision threshold (see Figure 3). The top left corner of the plot represents the ideal point, where TAR equals one and FAR equals zero.
Area Under the Curve (AUC): quantifies the quality of the authentication model as an alternative to accuracy (see Figure 3). It remains useful even when there is a high class imbalance (i.e., one of the classes dominates). In practice, the AUC ranges from 0.5 to 1, where 0.5 corresponds to random guessing and 1 corresponds to the ideal result (i.e., no errors in the system).
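To make the relation between the ROC curve and the AUC concrete, the following sketch sweeps the decision threshold over all observed scores, collects (FAR, TAR) points, and integrates them with the trapezoidal rule. The function names and toy scores are assumptions for illustration; the sweep accepts a sample when its score is at least the threshold so that the curve ends at the point (1, 1).

```python
def roc_points(genuine_scores, impostor_scores):
    """Return ROC points as (FAR, TAR) pairs, ordered by increasing FAR.

    The threshold is lowered step by step over every observed score;
    a sample counts as accepted when score >= threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above all scores: nothing is accepted
    for threshold in thresholds:
        tar = sum(s >= threshold for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        points.append((far, tar))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy scores (made up): one imposter score (0.6) outranks one genuine score (0.4).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
area = auc(roc_points(genuine, impostor))
```

For these toy scores, 15 of the 16 genuine-imposter pairs are ranked correctly, which matches the computed area of 15/16.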
4 The Selection of The Biometric Trait
Different biometric traits can be used in an active authentication system, depending on the application context; a trait that suits one context may not suit another. The following factors can be used to evaluate the suitability of a biometric trait Jain:1998 ():
Universality: every user should have the biometric trait.
Uniqueness: the biometric trait should adequately differentiate between any two users. This helps to generate discriminative features that can differentiate between the legitimate user and an imposter with high accuracy.
Permanence: the biometric trait should be durable (i.e., it should not vary over time).
Collectability: the biometric trait should be easy to collect and measure.
Performance: the accuracy of the biometric trait should be robust and functional for the given environment.
Acceptability: users should accept the biometric trait and be willing to present it.
Circumvention: the biometric trait should not be susceptible to spoofing or any other attacks.
5 Design Factors
The design of a biometric authentication system is influenced by several factors Jain:2011:IB:2161587 (). We describe them with regard to the mobile application environment:
User cooperation: refers to the behavior of the user when interacting with the authentication system. The biometric trait is collected either from a cooperative user, who interacts with the system in a concerted manner, or from an uncooperative user, who does not perform the trait as expected.
Degree of control: the degree to which the deployment environment is controlled or uncontrolled, and whether it is outdoors, indoors, or mixed.
User awareness: whether or not the user is aware of being subjected to biometric recognition.
Habituation: whether or not the user has prior experience interacting with the biometric system.
Open versus closed system: whether a person’s biometric template can be used across multiple applications. If it can, the system is open; otherwise, the system is closed.
6 Common Behavioral Biometric Traits
A behavioral biometric trait is a particular characteristic that can be acquired from user actions such as touch gestures, keystroke dynamics or behavioral profiling (see Figure 4). In this section we review some of the commonly used behavioral biometric traits that have been proposed in the literature to design active authentication systems on smartphones.
6.1 Gesture based authentication
All gesture-based authentication methods are built on the analysis and measurement of touch gestures on smartphones gesture:2016 (). The touch gesture biometric trait is a hand-drawn shape on the smartphone touchscreen that contains one or more strokes. A stroke is a sequence of consecutive timed points, each represented by an ordered pair of numerical coordinates, as shown in Figure 4a. The touchscreen, the main input method on smartphones, is the input data source for gesture-based authentication. Touch gestures can be acquired at the application level motionevent:2017 () or at the operating system level motioneventOS:2017 (). Every touch gesture includes touch mechanics, which describe what the fingers do on the screen, and touch activities, which describe the results of specific gestures gesture:2016 ().
A useful, discriminative set of features can be extracted from the touch gesture biometric trait, but the trait also introduces some challenges. We now discuss how it has been used to develop active authentication systems.
|Study||# of Participants||Dataset (# of samples)||Platform||# of features||Classification||Performance (%)|
|Shahzad et al. Shahzad:2013 ()||50||Private (15009 overall)||Windows||7||SVDE||EER:0.5|
|Zhao et al. Zhao:2013 ()||30||Private (120/user)||Android||Image (100x150)||NC, L1, L2 distance||EER:2.62|
|Serwadda et al. serwadda:2013 ()||190||Private (50/touch type)||Android||28||10 Classifiers||EER:10.5-42.0|
|Xu et al. Xu:2014 ()||42||Private (200/user)||Android||(4,37,42,49)||SVM||EER:10.0|
|Feng et al. Feng:2012 ()||40||Private (-)||Android||53||DT, RF, BN||FAR:4.66, FRR:0.13|
|Frank et al. Frank:2013 ()||41||Private (-)||Android||27||KNN, SVM||EER:4.0|
|Li et al. li:2013 ()||75||Private (400/user)||Android||10||SVM||EER:3.0|
6.1.1 Data collection
Data collection is the first step in an active authentication system; the raw touch data is acquired from the touchscreen sensor. The behavior of touch gesture techniques is determined by a transfer function Quinn:2012 () that converts human input actions, together with parameters such as size, length, speed, velocity, pressure, or direction, into a gesture output effect. Most gesture authentication systems proposed in the literature assume that users perform touch gestures in a way that reflects their behavior. Hence, the parameters associated with the touch input vary from one user to another.
The touch gesture biometric trait has been used in active authentication systems because it offers two important characteristics:
Continuity: the touch gesture biometric trait can be used to continuously authenticate users by monitoring their touch dynamics patterns. Users can be re-authenticated for as long as they are using the smartphone. This is one of its most important advantages, especially compared with traditional authentication methods and physiological biometrics.
Transparency: the touch gesture biometric trait can be acquired without interrupting the user, because touch gestures can be acquired and processed in the background while the smartphone is being used.
As illustrated in Table 1, the majority of studies collected data from 50 or fewer participants Shahzad:2013 (); Zhao:2013 (); Xu:2014 (); Feng:2012 (); Frank:2013 (). A few studies were conducted with a larger number of participants, as in serwadda:2013 (); li:2013 (). The majority of the resulting datasets contain hundreds of samples per user and were collected during lab studies from cooperative users in a controlled environment Teh:2016 ().
Regarding the platform used during data collection, Android was the most common platform for touch gesture acquisition, due to its large market share and because it is easier to customize than iOS or Windows.
6.1.2 Feature extraction
Feature extraction is one of the main modules of an active authentication system, since the classifier distinguishes users based on the feature set. The main goal of feature extraction is to identify and extract a discriminative set of features by analyzing the users’ raw touch gesture data. The commonly extracted features belong to the following categories:
Temporal features: one of the most commonly used feature sets in touch gesture authentication systems. Extracting them relies on time analysis of user touches, where every touch gesture event carries a timestamp. For example, the total time taken to perform a touch gesture can be calculated as the difference between the touch-down and touch-up timestamps and used as a temporal feature.
Spatial features: extracted by analyzing the position, area, and size of the touch gesture, where every touch gesture is performed at a specific position on the touchscreen and represented by coordinates. Touch size, which approximates the screen area being touched during a touch event, can also be used. Another spatial feature is touch pressure, a value that approximates the force exerted on the screen for each touch event.
Dynamic features: extracted from dynamic analysis of the touch event. For example, a touch gesture is detected through the motion of an object (i.e., a finger) on the touchscreen. Analyzing this motion helps to generate a useful set of features that can differentiate between users.
Geometric features: extracted by conducting a geometric analysis of the touch gesture. Since a touch gesture contains one or more strokes and a stroke is a sequence of consecutive timed points, analyzing the relationships between the points, lines, and curves generated by touch gestures yields useful discriminative features.
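The four feature categories above can be illustrated on a single stroke. In this sketch a stroke is assumed to be a list of `(timestamp_ms, x, y, pressure)` tuples; the tuple layout, the function name and the particular features chosen (one or two per category) are illustrative assumptions, not the feature set of any specific study in Table 1.

```python
from math import hypot


def stroke_features(stroke):
    """Compute a few example features from one stroke.

    stroke: list of (timestamp_ms, x, y, pressure) tuples with at least
            two points, ordered by time.
    """
    (t_start, x_start, y_start, _) = stroke[0]
    (t_end, x_end, y_end, _) = stroke[-1]

    # Temporal: time between touch-down and touch-up.
    duration_ms = t_end - t_start

    # Length of the drawn path, summed over consecutive points.
    path_length = sum(
        hypot(x1 - x0, y1 - y0)
        for (_, x0, y0, _), (_, x1, y1, _) in zip(stroke, stroke[1:])
    )

    # Geometric: straight-line distance between the end points, and how
    # close the path is to a straight line (1.0 means perfectly straight).
    end_to_end = hypot(x_end - x_start, y_end - y_start)
    straightness = end_to_end / path_length if path_length > 0 else 0.0

    return {
        "duration_ms": duration_ms,                                    # temporal
        "start_x": x_start,                                            # spatial
        "start_y": y_start,                                            # spatial
        "mean_pressure": sum(p[3] for p in stroke) / len(stroke),      # spatial
        "mean_speed": path_length / duration_ms if duration_ms > 0 else 0.0,  # dynamic
        "path_length": path_length,                                    # geometric
        "straightness": straightness,                                  # geometric
    }


# A made-up straight swipe sampled at three points.
swipe = [(0, 0.0, 0.0, 0.5), (50, 3.0, 4.0, 0.7), (100, 6.0, 8.0, 0.6)]
features = stroke_features(swipe)
```

The resulting dictionary can be flattened into a feature vector and passed to whichever classifier the system uses.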
The evaluation of the touch gesture biometric trait in the literature is based on verification mode (i.e., one-to-one matching) Phillips:2000 (), where the user claims an identity and the system validates the claimed identity. The following components affect the evaluation process:
Dataset: the dataset sample size has a large impact on the accuracy of any proposed authentication method. Based on the literature, the majority of studies collected small samples (i.e., hundreds per user, as shown in Table 1), such as Zhao:2013 (); serwadda:2013 (); Xu:2014 (); li:2013 (). A few studies collected thousands of samples, such as Shahzad:2013 (). Evaluating a proposed active authentication system on a large-scale dataset containing millions of samples would be more realistic than using small datasets.
Classification model: different classification methods have been used in touch gesture authentication. Some are based on probabilistic modelling, such as Bayesian Networks Feng:2012 (). Others use the Support Vector Machine (SVM), which separates the feature space with a hyperplane such that the margin between the two classes is maximized li:2013 (); Frank:2013 (). One of the most commonly used classification techniques in touch gesture authentication systems is K-nearest-neighbors (KNN), a robust and fast classification method that takes every new observation and locates it in feature space with respect to all training observations. Decision Trees have also been used to classify data based on the learned touch patterns, as in Feng:2012 ().
Static and dynamic modes: in the static mode, the identity of a subject is verified based on the input provided by the subject at the first instance of accessing a system. In the dynamic mode, the subject’s identity is continuously verified throughout the active session on the mobile device.
Metrics: Equal Error Rate (EER) is the most commonly used metric in the literature for evaluating touch gesture authentication. Table 1 compares the performance of different proposed active authentication systems. Most of the proposed systems achieve low EER values (below 10%) on their collected datasets. Some proposed systems use other metrics, such as Area Under the Curve (AUC) Shahzad:2013 (), or False Rejection Rate (FRR) and False Acceptance Rate (FAR) Feng:2012 (); see Section 3.4 for details on the most commonly used metrics.
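As an illustration of the KNN idea mentioned under the classification model, the sketch below labels a new feature vector by majority vote among its k nearest training vectors. The function name, the two-dimensional toy features and the labels are assumptions for illustration; the surveyed systems use their own feature sets and tuned implementations.

```python
from collections import Counter
from math import dist


def knn_predict(training_set, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours.

    training_set: list of (feature_vector, label) pairs.
    Distances are Euclidean (math.dist, available from Python 3.8).
    """
    neighbours = sorted(training_set, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up, already normalised touch features for the owner and an imposter.
training = [
    ([0.10, 0.10], "owner"), ([0.20, 0.10], "owner"), ([0.15, 0.20], "owner"),
    ([0.90, 0.80], "imposter"), ([0.80, 0.90], "imposter"), ([0.85, 0.85], "imposter"),
]
label_near_owner = knn_predict(training, [0.12, 0.15])
label_near_imposter = knn_predict(training, [0.88, 0.90])
```

Because every query is compared against all stored observations, the cost of a prediction grows with the size of the training set, which matters on a battery-powered device.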
|Study||# of Subjects||Dataset (# of samples)||Platform||# of features||Classification||Performance (%)|
|Buschek et al. buschek:2015 ()||28||Private (20160 overall)||Android||24||KNN, SVM, NB, LSAD||EER:26.4-36.8|
|Draffin et al. Draffin:2014 ()||13||Private (430000)||Android||6||Neural Networks||FAR:14, FRR:2.2|
|Trojahn et al. trojahn:2013 ()||18||Private (1980)||Android||32||J48 Decision Tree||FAR:2.03, FRR:2.67|
|Feng et al. Feng:2013 ()||40||Private (-)||Android||122||J48, RF, BN||FAR:1.0, FRR:1.0|
|Clarke et al. Clarke:2006 ()||30||Private (30)||-||-||GRNN, RBF, FF MLP||EER:12.8|
|Gunetti et al. gunetti:2005 ()||205||Private (-)||HTML form||-||Mathematical Model||FAR:5.0, FRR:0.5|
6.2 Keystroke dynamics
Keystroke dynamics is one of the oldest behavioral biometric traits and has long been proposed for authenticating users continuously on computers Monrose:2000 (); Bergadano:2002 (); gunetti:2005 (). For smartphones, keystroke dynamics is the process of analyzing the way a user types on the smartphone’s virtual keyboard by monitoring the keyboard inputs, as shown in Figure 5, and attempting to identify the user based on habitual rhythm patterns in their typing. With the diversity of touchscreen smartphones, typing on a smartphone has become easier and more user-friendly. At the same time, the raw data associated with keystroke analysis has become broader, opening the opportunity to collect more data and extract discriminative features that can be used to authenticate smartphone users based on the way they type. Based on the literature, we present how keystroke dynamics data is collected, followed by a description of the extracted feature sets. Then, we discuss the techniques that have been used in the literature to evaluate the performance of active authentication methods built on keystroke dynamics.
6.2.1 Data collection
Most research in the field of keystroke analysis collects data from structured, predefined text. For instance, Clarke et al. Clarke:2006 () asked participants to enter 30 text messages over three sessions. The messages contained a mixture of quotes, lines from movies and typical text messages, and their length varied, with an average of 14 words per message. Similarly, Feng et al. Feng:2013 () developed an Android application to collect keystrokes during a login session, where the user was asked to enter passwords (20 different passwords of length 4), and a post-login session, where the user was asked to enter predefined sentences (ranging from 14 to 53 words, with 23 words on average). Trojahn et al. trojahn:2013 () asked users to enter a specified 11-character sentence (two words with one space in between) ten times. Buschek et al. buschek:2015 () invited participants to two sessions at least one week apart. Each session comprised three main tasks in which the participants typed 6 different passwords in random order, 20 times each. All of the aforementioned techniques rely on a specific context with predefined text.
On the other hand, few studies have collected data across the whole usage context, i.e., without restriction to predefined sentences or passwords. Draffin et al. Draffin:2014 () conducted a real-world field study to collect keystrokes from 13 users over a three-week period. They collected 86000 keypresses across all contexts without any intervention, not just from passwords or controlled phrases.
As with touch gestures, keystroke dynamics can be used in active authentication systems because it offers two important characteristics:
Continuity: keystroke dynamics can be used to continuously authenticate smartphone users by monitoring their way of typing; users can be re-authenticated for as long as they type on the virtual keyboard. This is one of the most important advantages of keystroke dynamics, especially compared with physiological biometric traits.
Transparency: keystroke dynamics can be acquired implicitly while users type, without interrupting them, because keystrokes can be acquired and processed in the background while the smartphone is being used.
6.2.2 Extracted features
Different features can be extracted from keystroke analysis. The most commonly used features in the keystroke analysis literature are the duration and latency of keypresses. The duration is the amount of time between the press and release of a key, and the latency is the time elapsed between the release of the previous key and the press of the current key. Other features are shared with touch gestures, as described in Section 6.1.2, such as spatial features (i.e., features related to the position, area, pressure and size of the keypresses).
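The two timing features defined above can be computed directly from press and release timestamps. In this sketch each key event is assumed to be a `(key, press_ms, release_ms)` tuple; the tuple layout and the function name are illustrative assumptions.

```python
def keystroke_timing_features(events):
    """Compute per-key durations and between-key latencies.

    events: list of (key, press_ms, release_ms) tuples in typing order.
    duration = release - press of the same key.
    latency  = press of the current key - release of the previous key
               (negative if the next key is pressed before the previous
               one is released).
    """
    durations = [release - press for _, press, release in events]
    latencies = [
        current[1] - previous[2]
        for previous, current in zip(events, events[1:])
    ]
    return durations, latencies


# Made-up timestamps (milliseconds) for typing "hi!".
typed = [("h", 0, 90), ("i", 150, 230), ("!", 300, 370)]
durations, latencies = keystroke_timing_features(typed)
```

A sequence of n keypresses yields n durations and n-1 latencies, which are typically aggregated (for example, averaged per key or per key pair) before classification.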
Most of the evaluation methods in the literature are based on verification mode, such as maiorana:2011 (); buschek:2015 (); Feng:2013 (); a few are based on identification mode, such as Nauman:2013 ().
Table 2 compares different proposed active authentication methods based on keystroke dynamics. Buschek et al. buschek:2015 () evaluated their proposed authentication mechanism on a private dataset with 20160 samples. They used two kinds of models for verification: classification models (KNN, NB and SVM) and an anomaly detection model (LSAD quinn:2014 ()). They found that the classification models performed better than the anomaly detection model. Overall, they reduced the EER by 26.4-36.8%. Trojahn et al. trojahn:2013 () built a dataset of 1980 samples collected from 18 users. They used different classification algorithms, including J48 decision tree, MLP, BayesNet and Naive Bayes. The best FAR and FRR were achieved by the J48 classifier: 2.03% and 2.67%, respectively. Feng et al. Feng:2013 () evaluated their system on a dataset collected from 40 subjects using three classifiers: DT, RF and BN. The best FAR, 8.93%, was achieved by RF, and the best FRR, 0.27%, was achieved by BN. Clarke et al. Clarke:2006 () evaluated their system on a dataset of 30 samples collected from 30 subjects over three sessions. Their EER, achieved by neural network classifiers, was 12.8% on average. In contrast to all of the above evaluations, Draffin et al. Draffin:2014 () evaluated their proposed system on an unconstrained dataset (i.e., data collected across all application contexts without the intervention or supervision used in other studies). They built a discriminant algorithm based on neural networks; the best result was achieved over input sessions of 15 keypresses, with a detection rate of 67.7%, an FAR of 14.0% and an FRR of 2.2%.
|Study||# of Subjects||Dataset||Features||Classification||Performance (%)|
|Li et al. Li:2014 ()||22-76||MIT Reality||app name, Tel. number, cell, location, call (duration, time)||Neural Network||FRR:11.45, FAR:4.17|
|Kayacik et al. kayacik:2014 ()||7, 35, 100||GCU, Rice Livelab, MIT Reality||+ wifi, cpu load, light, noise, magnetic field and rotation||-||DR: 53-99|
|Bassu et al. bassu:2013 ()||NA||Private||app usage, time, location, HDI, bandwidth||Bayesian||NA|
|Gupta et al. gupta:2012 ()||37-76||MIT Reality||GPS location, WIFI, bluetooth||Developed model||Precision:85, Recall:91|
|Shi et al. Shi:2011 ()||50||-||SMS, Calls, Browser History, Location||-||-|
6.3 Behavioral profiling
Behavioral profiling captures the way in which the user interacts with the mobile sensors and services. Active authentication systems leverage these interactions to verify the user's identity. Behavioral profiles have long been used to authenticate users; the earlier literature concentrated on network-based approaches, which build a user profile from data such as the user's calling behavior and the service provider network Hall:2005 (). Some host-based approaches, based on application usage and location, were also used Li:2011 (). The intuition behind an authentication system based on behavioral profiling is to build a profile of the user's activities over a period of time and compare it with the current user profile using machine learning approaches.
Some recent active authentication techniques have been developed based on behavioral profiling, such as Li:2014 (); kayacik:2014 (); bassu:2013 (); gupta:2012 (). For these methods, we describe what data is collected and how it is collected. We also discuss the feature extraction process and the evaluation methods.
6.3.1 Data collection
One of the most common public datasets used in the behavioral profiling authentication literature is the MIT Reality dataset eagle:2006 (). It contains rich behavioral profile data for 100 smartphone users from various departments of MIT. The data was collected over a period of 9 months and includes call logs, Bluetooth devices in proximity, cell tower IDs and application usage. Several proposed authentication systems have used the MIT Reality dataset, such as Li:2014 (); kayacik:2014 (); Shin:2012 (). Other proposed systems collected their own data by conducting a user study. In the evaluation by Li et al. Li:2014 () on the MIT Reality dataset, the proposed framework achieved an overall FRR of 11.45% and FAR of 4.17%.
Kayacik et al. kayacik:2014 () proposed a data-driven technique that compares the current user profile with the stored one. If the behaviour deviates sufficiently from the established norm, actions such as explicit authentication can be triggered. They evaluated the proposed system using three datasets: GCU, Rice Livelab and MIT Reality. The GCU dataset was collected from 7 staff and students of Glasgow Caledonian University in 2013 using Android devices, and contains sensor data from wifi networks, cell towers, application use, light and sound levels, and device system stats. The duration of the data varies from 2 weeks to 14 weeks for different users. The Rice Livelab dataset Shepard:2011 () was built from 35 users, all of whom were students at Rice University or Houston Community College. The data was collected from iPhone 3GS devices between 2010 and 2011 and contains sensor data such as application use, wifi networks, cell tower IDs, GPS readings, battery usage and accelerometer output. The duration of the data varies from a few days to less than one year for different users.
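The idea of triggering explicit authentication when the current behaviour deviates from the stored profile can be sketched as follows. This is a simplified illustration and not the model of Kayacik et al.: it assumes that a profile is a per-feature mean and standard deviation, that deviation is the mean absolute z-score, and that the threshold of 3.0 is an arbitrary choice. All names are hypothetical.

```python
import numpy as np


def build_profile(history):
    """Summarise historical feature vectors (one per row) by mean and std."""
    history = np.asarray(history, dtype=float)
    mean = history.mean(axis=0)
    std = history.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return mean, std


def deviation_score(profile, current):
    """Mean absolute z-score of the current observation against the profile."""
    mean, std = profile
    z = (np.asarray(current, dtype=float) - mean) / std
    return float(np.mean(np.abs(z)))


def needs_explicit_auth(profile, current, threshold=3.0):
    """True when behaviour deviates enough to ask for explicit authentication.
    The threshold value is an assumption made for this sketch."""
    return deviation_score(profile, current) > threshold


# Hypothetical features: (app launches per hour, distinct wifi networks seen).
profile = build_profile([[10, 2], [12, 2], [8, 2]])
usual = needs_explicit_auth(profile, [10, 2])
unusual = needs_explicit_auth(profile, [30, 9])
```

In practice the threshold controls the trade-off between interrupting the legitimate user and detecting an intruder late.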
Shi et al. Shi:2011 () developed a data collection application and published it in the Android marketplace. It was downloaded by 276 users, but only 50 users kept it for 12 days or more. They built their dataset from these 50 users and evaluated their proposed algorithm on it. The dataset contains SMS, phone calls, browser history and location. They used only two metrics to evaluate the proposed algorithm: the number of times the legitimate user used the device before a failed authentication, and the number of times the adversary used the device before detection.
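Both metrics used by Shi et al. count device uses up to the first rejection. A minimal sketch, assuming each device use produces one accept or reject decision (the function name and the input encoding are hypothetical):

```python
def uses_before_first_rejection(decisions):
    """Count consecutive accepted uses before the first rejection.

    decisions: iterable of booleans, True meaning the use was accepted.
    For a legitimate user a larger count is better (fewer interruptions);
    for an adversary a smaller count is better (faster detection).
    """
    count = 0
    for accepted in decisions:
        if not accepted:
            return count
        count += 1
    return count


# Hypothetical decision sequences, for illustration only.
legitimate_uses = uses_before_first_rejection([True] * 20 + [False])
adversary_uses = uses_before_first_rejection([True, True, False, True])
```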
Gupta et al. gupta:2012 () conducted several experiments using a large-scale data collection tool kiukkonen:2010 (). They built a dataset that contains GPS location traces and regular scans of the WiFi and Bluetooth radio environments of a large number of users. They developed an algorithm for identifying contexts of interest (CoIs: contexts that are significant to the user). They evaluated their prototype on the collected dataset, achieving a precision of 85% and a recall of 91%.
Bassu et al. bassu:2013 () developed a behavioral profiling authentication technique that combines four essential behavioral elements corresponding to what, where, when and how. App usage constitutes the what, location and pace of movement define the where, clock time captures the when, and gesture or input-output interactions capture the how. They extracted a set of features based on spatial and temporal analysis and then built a classification model based on a Bayesian classifier. Table 3 summarizes the proposed behavioral profiling authentication methods.
The main advantage of behavioral profiling is its ability to provide continuous and transparent authentication while users interact with their mobile devices, since all profile sensor data can be acquired continuously without interrupting the user. However, a major weakness is that performance becomes inconsistent when users interact with their phones in an unusual way.
|Study||# of Subjects||Dataset (# of samples)||Platform||Sensors||Performance (%)|
|Neverova et al. neverova:2016 ()||1500||Abacus(27.62 TB)||Android||Neural Network||-|
|Hoang et al. hoang:2015 ()||38||Public hoang:2013 ()||Android||sequence of prehensile movements||FAR: 0, FRR: 16.18|
|Juefei-Xu et al. Juefei:2012 ()||36||166 ft (50.5 m)||Android||Accelerometer and Gyroscope||VR: 99.4, FAR: 0.1|
|Derawi et al. Derawi:2010 ()||51||37 m||Android||Accelerometer (AK8976A)||EER: 20|
|Mantyjarvi et al. Mantyjarvi:2005 ()||36||108/20 m||-||3-D accelerometer||EER: 7|
6.4 Gait recognition
Gait-based active authentication systems identify users by the way they walk. Gait is one of the few biometric traits that can be used to recognize people. The variety of built-in smartphone sensors, such as the accelerometer and gyroscope, has made the development of authentication systems based on this trait feasible.
Mantyjarvi et al. Mantyjarvi:2005 () proposed an implicit biometric authentication method based on gait dynamics. They collected three-dimensional movement data from 36 users via a body-worn 3-D accelerometer device while the users walked about 20 meters at their normal, fast and slow walking speeds. The experimental results showed that the best EER, 7%, was achieved by a signal correlation method. Rather than using stand-alone accelerometer devices, Derawi et al. Derawi:2010 () leveraged the low-energy accelerometer sensor in the mobile device to collect accelerometer data from 51 subjects. They created a dataset over a walking distance of 37 meters, with 40-50 samples per second for each of the three directions x, y and z. They evaluated their system on the collected dataset and achieved an EER of 20%. In addition to the accelerometer data, Juefei-Xu et al. Juefei:2012 () used gyroscope data to estimate the orientation of the phone in a user’s pocket. They built a dataset of 36 subjects who each walked 166 feet (50.5 meters). They extracted a set of discriminative features based on the Continuous Wavelet Transform (CWT) lang:1998 (). The best result, a verification rate of 99.4% at a FAR of 0.1%, was obtained when matching normal pace against normal pace.
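A signal correlation approach to gait verification can be sketched as below. This is only an illustration of the general idea and not the exact method of Mantyjarvi et al.: it assumes that the enrolled template and the probe are equal-length, already aligned one-dimensional acceleration signals, and the acceptance threshold of 0.8 is an arbitrary assumption.

```python
import numpy as np


def normalized_correlation(a, b):
    """Pearson correlation between two equal-length 1-D signals."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0


def verify_gait(template, probe, threshold=0.8):
    """Accept the probe if it correlates strongly with the enrolled template.
    The threshold is an assumption made for this sketch."""
    return normalized_correlation(template, probe) >= threshold


# Synthetic signals standing in for accelerometer traces (illustration only).
t = np.linspace(0.0, 4.0 * np.pi, 200)
template = np.sin(t)
same_walker = 1.5 * np.sin(t) + 0.2  # same pattern, different gain and offset
other_walker = np.cos(t)             # different pattern
```

Because the correlation is invariant to gain and offset, differences in sensor calibration or phone placement that only rescale the signal do not affect the decision; misaligned or time-warped signals would need an alignment step first.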
Another novel idea was proposed by Hoang et al. hoang:2015 (), who verify the user via a stored key that is biometrically encrypted with gait templates collected from a mobile accelerometer. They also investigated the discriminability of sensor-based gait templates in order to construct an effective gait-based biometric cryptosystem. They created a dataset from 38 participants using a Google Nexus One device and achieved zero FAR with an FRR of approximately 16.18%.
In contrast with all previous studies, Neverova et al. neverova:2016 () created an unsupervised and unconstrained dataset collected from approximately 1500 volunteers who used LG Nexus 5 research phones as their primary devices on a daily basis. It is the largest study of its kind. The motion data was acquired from three sensors: the accelerometer and gyroscope at a 200 Hz sampling rate, and the magnetometer at a 5 Hz sampling rate. They focused on authenticating users based on their natural kinematics, i.e., the motion patterns of the human body. Their results demonstrated that human kinematics convey important information about user identity. Table 4 summarizes the results of all the proposed methods built on motion sensors.
Gait analysis relies on mobile inertial sensors such as accelerometers and gyroscopes for authentication. These sensors are non-contact and unobtrusive, so they could be used to design authentication methods that are resistant to spoofing attacks.
7 Fusing Different Biometric Traits
Fusing different behavioral biometric traits can improve the authentication accuracy and address some of the limitations and problems discussed in Section 8. Fusion can be performed at the following levels Ross:2003 (); Ross:2004 (); Ross:2006 (); Faundez-Zanuy:2005 (); chen:2013 ():
Sensor-level fusion, which combines raw data captured from different sensors for the same biometric trait.
Feature-level fusion, which combines feature vectors extracted from multiple biometric modalities into one new feature vector.
Score-level fusion, which combines the matching scores of the individual authentication modalities.
Decision-level fusion, which combines the decisions of multiple classifiers to make the final decision.
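As an illustration of score-level fusion, the sketch below normalizes each modality's matching score to [0, 1] and combines the results with a weighted sum. Min-max normalization and the weighted sum are one common choice among several; the modality names, score ranges and weights are hypothetical.

```python
def min_max_normalize(score, low, high):
    """Map a raw matcher score onto [0, 1] given its expected range."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (score - low) / (high - low)))


def fuse_scores(scores, ranges, weights):
    """Weighted-sum fusion of normalized per-modality scores.

    scores:  modality name -> raw matching score
    ranges:  modality name -> (low, high) expected score range
    weights: modality name -> non-negative weight
    """
    total_weight = sum(weights[m] for m in scores)
    fused = sum(
        weights[m] * min_max_normalize(scores[m], *ranges[m]) for m in scores
    )
    return fused / total_weight


# Hypothetical values, for illustration only.
scores = {"touch": 0.8, "keystroke": 40.0, "gait": 0.5}
ranges = {"touch": (0.0, 1.0), "keystroke": (0.0, 100.0), "gait": (0.0, 1.0)}
weights = {"touch": 0.5, "keystroke": 0.3, "gait": 0.2}
fused_score = fuse_scores(scores, ranges, weights)
```

The fused score is then compared against a single threshold, so a weak score from one modality can be compensated by strong scores from the others.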
Figure 6 illustrates the combination of multiple behavioral biometric traits based on decision-level fusion, where the local decisions (i.e., the decisions calculated for each biometric trait) are combined using the majority voting method Lam:1997 (). The final decision is the class $k$ that receives the plurality of the local votes, as follows kuncheva2004combining ():
$$\sum_{i=1}^{L} d_{i,k} = \max_{j=1,\dots,c} \sum_{i=1}^{L} d_{i,j}$$
where $d_{i,j} = 1$ if the $i$-th modality votes for class $j$ and $0$ otherwise, $j = 1, \dots, c$ and $c$ represents the number of classes (i.e., in the authentication case there are two classes, accept and reject), and $i = 1, \dots, L$ where $L$ represents the number of modalities’ decisions (i.e., three modalities as shown in Figure 6).
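Decision-level fusion by majority voting can be sketched in a few lines. The sketch assumes that each modality outputs 1 (accept) or 0 (reject); resolving ties by rejecting is a conservative assumption made here and is not prescribed by the cited works.

```python
from collections import Counter


def majority_vote(decisions, tie_label=0):
    """Return the label chosen by the most modalities.

    decisions: non-empty list of local decisions (1 = accept, 0 = reject).
    tie_label: label returned when no single label has the most votes
               (rejecting on a tie is an assumption of this sketch).
    """
    if not decisions:
        raise ValueError("at least one local decision is required")
    counts = Counter(decisions)
    top = max(counts.values())
    winners = [label for label, votes in counts.items() if votes == top]
    return winners[0] if len(winners) == 1 else tie_label


# Three modalities, e.g. touch, keystroke and gait (illustrative values).
final_decision = majority_vote([1, 1, 0])
```

Using an odd number of modalities, as in the three-modality case, avoids ties in the two-class setting.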
8 Limitations and Challenges
Although active authentication systems have several advantages, they also face a number of limitations and problems, which are as follows Ross:2003 (); Ross:2004 (); Ross:2006 (); Khaleghi:2013 (); Serwadda:2016 ():
Noisy data: the data recorded by the sensor devices used in active authentication systems is always affected by some level of measurement imprecision.
Non-universality: the active authentication system may not be able to collect meaningful data. In other words, the collected data might not reflect the user's actual behavior.
Intra-class variations: incorrect interaction with the sensor, or changes in the user's behavioral characteristics over time, introduce variations.
Lack of uniqueness: inter-class similarity between individuals makes it difficult to differentiate between two users.
Vulnerabilities: such as spoofing and robot attacks. For example, Serwadda et al. Serwadda:2016 () developed two Lego-driven robotic attacks on touch-based authentication.
Although we could not cover all of the literature on active authentication, we have covered an important representative subset of the state-of-the-art methods. Based on this representative subset, the following challenges and future trends need to be addressed in active authentication systems:
Maximizing accuracy: the accuracy of active authentication systems built on behavioral biometric traits is still low. Better ways to maximize accuracy are needed.
Domain adaptation capability: user characteristics change over time. For instance, data collected in the enrollment phase may differ from data collected in the recognition phase. The active authentication system should use a domain adaptation method to handle this issue Zhang:2015 ().
Biometric feature extraction and selection: feature extraction and selection are challenging processes by nature, and extracting and selecting an appropriate set of behavioral biometric features for an active authentication system is even more challenging. In-depth analysis of the collected data is needed to select an appropriate set of features.
Datasets: one of the weakest points of the majority of proposed active authentication systems is the lack of real-world datasets. Conducting systematic user studies and experiments that involve more users is very important for evaluating active authentication systems. The majority of active authentication systems are evaluated on small datasets that contain hundreds of samples. Public datasets for active authentication research are also needed Crossler:2013 ().
Usability: the usability of active authentication systems is a very important factor and needs to be addressed Adams:1999 ().
Computation cost and energy consumption: smartphones have lower capabilities than desktop systems, so computational cost and energy consumption should be considered in the design of active authentication systems on smartphones.
We would like to thank our colleagues for their feedback on the earlier version of this paper. The first author would like to thank the Egyptian Mission Sector for the doctoral scholarship.
- (1) V. M. Patel, R. Chellappa, D. Chandra, B. Barbello, Continuous user authentication on mobile devices: Recent progress and remaining challenges, IEEE Signal Processing Magazine 33 (4) (2016) 49–61. doi:10.1109/MSP.2016.2555335.
- (2) I. Muslukhov, Y. Boshmaf, C. Kuo, J. Lester, K. Beznosov, requirements for data protection in smartphones, in: Proceedings of the 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW ’12, IEEE Computer Society, Washington, DC, USA, 2012, pp. 228–235.
- (3) W. A. Usmani, D. Marques, I. Beschastnikh, K. Beznosov, T. Guerreiro, insider attacks on Facebook, in: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, CHI ’17, ACM, New York, NY, USA, 2017.
- (4) A. K. Jain, A. A. Ross, K. Nandakumar, Introduction to Biometrics, Springer Publishing Company, Incorporated, 2011.
- (5) Z. Sitová, J. Šeděnka, Q. Yang, G. Peng, G. Zhou, P. Gasti, K. S. Balagani, HMOG: New behavioral biometric features for continuous authentication of smartphone users, IEEE Transactions on Information Forensics and Security 11 (5) (2016) 877–892. doi:10.1109/TIFS.2015.2506542.
- (6) R. Bolle, S. Pankanti, Biometrics, Personal Identification in Networked Society: Personal Identification in Networked Society, Kluwer Academic Publishers, Norwell, MA, USA, 1998.
- (7) T. Hoang, D. Choi, T. Nguyen, Gait authentication on mobile phone using biometric cryptosystem and fuzzy commitment scheme, International Journal of Information Security (2015) 1–12. doi:10.1007/s10207-015-0273-1.
- (8) D. Buschek, A. De Luca, F. Alt, applicability and usability of keystroke biometrics on mobile touchscreen devices, in: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, CHI ’15, ACM, New York, NY, USA, 2015, pp. 1393–1402.
- (9) Google, Pattern-gesture, November 2016 (2015).
- (10) G. Inc., Framework base (motion event api level), february 2017 (2015).
- (11) G. Inc., Framework base (motion event os level), february 2017 (2015).
- (12) M. Shahzad, A. X. Liu, A. Samuel, Secure unlocking of mobile touch screen devices by simple gestures: You can see it but you can not do it, in: Proceedings of the 19th Annual International Conference on Mobile Computing & Networking, MobiCom ’13, ACM, New York, NY, USA, 2013.
- (13) X. Zhao, T. Feng, W. Shi, Continuous mobile authentication using a novel graphic touch gesture feature, in: Biometrics: Theory, Applications and Systems (BTAS), 2013 IEEE Sixth International Conference on, 2013, pp. 1–6. doi:10.1109/BTAS.2013.6712747.
- (14) A. Serwadda, V. V. Phoha, Z. Wang, Which verifiers work?: A benchmark evaluation of touch-based authentication algorithms, in: 2013 IEEE Sixth International Conference on Biometrics: Theory, Applications and Systems (BTAS), 2013, pp. 1–8. doi:10.1109/BTAS.2013.6712758.
- (15) H. Xu, Y. Zhou, M. R. Lyu, continuous and passive authentication via touch biometrics: An experimental study on smartphones, in: Symposium On Usable Privacy and Security (SOUPS 2014), USENIX Association, Menlo Park, CA, 2014, pp. 187–198.
- (16) T. Feng, Z. Liu, K. A. Kwon, W. Shi, B. Carbunar, Y. Jiang, N. Nguyen, Continuous mobile authentication using touchscreen gestures, in: 2012 IEEE Conference on Technologies for Homeland Security (HST), 2012, pp. 451–456. doi:10.1109/THS.2012.6459891.
- (17) M. Frank, R. Biedert, E. Ma, I. Martinovic, D. Song, Touchalytics: On the applicability of touchscreen input as a behavioral biometric for continuous authentication, IEEE Transactions on Information Forensics and Security 8 (1) (2013) 136–148.
- (18) L. Li, X. Zhao, G. Xue, Unobservable re-authentication for smartphones., in: NDSS, 2013, pp. 1–16.
- (19) P. Quinn, A. Cockburn, G. Casiez, N. Roussel, C. Gutwin, Exposing and understanding scrolling transfer functions, in: Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology, UIST ’12, ACM, New York, NY, USA, 2012, pp. 341–350.
- (20) P. S. Teh, N. Zhang, A. B. J. Teoh, K. Chen, A survey on touch dynamics authentication in mobile devices, Comput. Secur. 59 (C) (2016).
- (21) P. J. Phillips, A. Martin, C. L. Wilson, M. Przybocki, An introduction to evaluating biometric systems, Computer 33 (2) (2000) 56–63.
- (22) B. Draffin, J. Zhu, J. Zhang, Keysens: Passive user authentication through micro-behavior modeling of soft keyboard interaction, in: G. Memmi, U. Blanke (Eds.), Mobile Computing, Applications, and Services, Vol. 130 of Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, Springer International Publishing, 2014, pp. 184–201.
- (23) M. Trojahn, F. Ortmeier, Toward mobile authentication with keystroke dynamics on mobile phones and tablets, in: Proceedings of the 2013 27th International Conference on Advanced Information Networking and Applications Workshops, WAINA ’13, IEEE Computer Society, Washington, DC, USA, 2013, pp. 697–702.
- (24) T. Feng, X. Zhao, B. Carbunar, W. Shi, authentication using virtual key typing biometrics, in: Proceedings of the 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TRUSTCOM ’13, IEEE Computer Society, Washington, DC, USA, 2013, pp. 1547–1552.
- (25) N. L. Clarke, S. M. Furnell, phone users using keystroke analysis, Int. J. Inf. Secur. 6 (1) (2006).
- (26) D. Gunetti, C. Picardi, Keystroke analysis of free text, ACM Trans. Inf. Syst. Secur. 8 (3) (2005) 312–347.
- (27) F. Monrose, A. D. Rubin, Keystroke dynamics as a biometric for authentication, Future Gener. Comput. Syst. 16 (4) (2000).
- (28) F. Bergadano, D. Gunetti, C. Picardi, User authentication through keystroke dynamics, ACM Trans. Inf. Syst. Secur. 5 (4) (2002) 367–397.
- (29) E. Maiorana, P. Campisi, N. González-Carballo, A. Neri, authentication for mobile phones, in: Proceedings of the 2011 ACM Symposium on Applied Computing, SAC ’11, ACM, New York, NY, USA, 2011, pp. 21–26.
- (30) M. Nauman, T. Ali, A. Rauf, Using trusted computing for privacy preserving keystroke-based authentication in smartphones, Telecommun. Syst. 52 (4) (2013) 2149–2161.
- (31) J. A. Quinn, M. Sugiyama, approach to anomaly detection in static and sequential data, Pattern Recogn. Lett. 40 (2014) 36–40.
- (32) F. Li, N. Clarke, M. Papadaki, P. Dowland, Active authentication for mobile devices utilising behaviour profiling, Int. J. Inf. Secur. 13 (3).
- (33) H. G. Kayacik, M. Just, L. Baillie, D. Aspinall, N. Micallef, Data driven authentication: On the effectiveness of user behaviour modelling with mobile device sensors, arXiv preprint arXiv:1410.7743.
- (34) D. Bassu, M. Cochinwala, A. Jain, A new mobile biometric based upon usage context, in: Technologies for Homeland Security (HST), 2013 IEEE International Conference on, IEEE, 2013, pp. 441–446.
- (35) A. Gupta, M. Miettinen, N. Asokan, M. Nagy, Intuitive security policy configuration in mobile devices using context profiling, in: Privacy, Security, Risk and Trust (PASSAT), 2012 International Conference on and 2012 International Conference on Social Computing (SocialCom), IEEE, 2012, pp. 471–480.
- (36) E. Shi, Y. Niu, M. Jakobsson, R. Chow, authentication through learning user behavior, in: Proceedings of the 13th International Conference on Information Security, ISC’10, Springer-Verlag, Berlin, Heidelberg, 2011, pp. 99–113.
- (37) J. Hall, M. Barbeau, E. Kranakis, Anomaly-based intrusion detection using mobility profiles of public transportation users, in: WiMob 2005, IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, Vol. 2, 2005, pp. 17–24. doi:10.1109/WIMOB.2005.1512845.
- (38) F. Li, N. Clarke, M. Papadaki, P. Dowland, Behaviour profiling for transparent authentication for mobile devices, in: European Conference on Information Warfare and Security, Academic Conferences International Limited, 2011, p. 307.
- (39) N. Eagle, A. S. Pentland, Reality mining: sensing complex social systems, Personal and ubiquitous computing 10 (4) (2006) 255–268.
- (40) C. Shin, J.-H. Hong, A. K. Dey, prediction of mobile application usage for smart phones, in: Proceedings of the 2012 ACM Conference on Ubiquitous Computing, UbiComp ’12, ACM, New York, NY, USA, 2012, pp. 173–182.
- (41) C. Shepard, A. Rahmati, C. Tossell, L. Zhong, P. Kortum, Livelab: Measuring wireless networks and smartphone users in the field, SIGMETRICS Perform. Eval. Rev. 38 (3) (2011) 15–20.
- (42) N. Kiukkonen, J. Blom, O. Dousse, D. Gatica-Perez, J. Laurila, Towards rich mobile phone datasets: Lausanne data collection campaign, Proc. ICPS, Berlin.
- (43) N. Neverova, C. Wolf, G. Lacey, L. Fridman, D. Chandra, B. Barbello, G. Taylor, Learning human identity from motion patterns, IEEE Access 4 (2016) 1810–1820.
- (44) T. Hoang, D. Choi, V. Vo, A. Nguyen, T. Nguyen, A lightweight gait authentication on mobile phone regardless of installation error, in: IFIP International Information Security Conference, Springer, 2013, pp. 83–101.
- (45) F. Juefei-Xu, C. Bhagavatula, A. Jaech, U. Prasad, M. Savvides, Gait-id on the move: Pace independent human identification using cell phone accelerometer dynamics, in: 2012 IEEE Fifth International Conference on Biometrics: Theory, Applications and Systems (BTAS), 2012, pp. 8–15. doi:10.1109/BTAS.2012.6374552.
- (46) M. O. Derawi, C. Nickel, P. Bours, C. Busch, user-authentication on mobile phones using biometric gait recognition, in: Proceedings of the 2010 Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP ’10, IEEE Computer Society, Washington, DC, USA, 2010, pp. 306–311.
- (47) J. Mantyjarvi, M. Lindholm, E. Vildjiounaite, S. M. Makela, H. A. Ailisto, Identifying users of portable devices from gait pattern with accelerometers, in: Proceedings. (ICASSP ’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005., Vol. 2, 2005, pp. ii/973–ii/976 Vol. 2. doi:10.1109/ICASSP.2005.1415569.
- (48) W. C. Lang, K. Forinash, Time-frequency analysis with the continuous wavelet transform, American journal of physics 66 (9) (1998) 794–797.
- (49) A. Ross, A. Jain, Information fusion in biometrics, Pattern Recogn. Lett. 24 (13) (2003) 2115–2125.
- (50) A. Ross, A. K. Jain, Multimodal biometrics: An overview, in: Signal Processing Conference, 2004 12th European, 2004, pp. 1221–1224.
- (51) A. A. Ross, A. K. Jain, K. Nandakumar, Decision level fusion, Springer, 2006.
- (52) M. Faundez-Zanuy, Data fusion in biometrics, IEEE Aerospace and Electronic Systems Magazine 20 (1) (2005) 34–38. doi:10.1109/MAES.2005.1396793.
- (53) C. H. Chen, C. Y. Chen, Optimal fusion of multimodal biometric authentication using wavelet probabilistic neural network, in: 2013 IEEE International Symposium on Consumer Electronics (ISCE), 2013, pp. 55–56. doi:10.1109/ISCE.2013.6570127.
- (54) L. Lam, S. Y. Suen, Application of majority voting to pattern recognition: An analysis of its behavior and performance, Trans. Sys. Man Cyber. Part A 27 (5) (1997) 553–568.
- (55) L. I. Kuncheva, Combining pattern classifiers: methods and algorithms, John Wiley & Sons, 2004.
- (56) B. Khaleghi, A. Khamis, F. O. Karray, S. N. Razavi, fusion: A review of the state-of-the-art, Inf. Fusion 14 (1) (2013) 28–44.
- (57) A. Serwadda, V. V. Phoha, Z. Wang, R. Kumar, D. Shukla, Toward robotic robbery on the touch screen, ACM Trans. Inf. Syst. Secur. 18 (4) (2016) 14:1–14:25.
- (58) H. Zhang, V. M. Patel, M. Fathy, R. Chellappa, Touch gesture-based active user authentication using dictionaries, in: Applications of Computer Vision (WACV), 2015 IEEE Winter Conference on, IEEE, 2015, pp. 207–214.
- (59) R. E. Crossler, A. C. Johnston, P. B. Lowry, Q. Hu, M. Warkentin, R. Baskerville, Future directions for behavioral information security research, Comput. Secur. 32.
- (60) A. Adams, M. A. Sasse, Users are not the enemy, Commun. ACM 42 (12) (1999) 40–46.