Machines Learn Appearance Bias in Face Recognition
We seek to determine whether state-of-the-art, black-box face recognition techniques can learn first-impression appearance bias from human annotations. Using FaceNet, a popular face recognition architecture, we train a transfer learning model on human subjects’ first impressions of personality traits in other faces. We measure the extent to which this appearance bias is embedded and benchmark learning performance for six different perceived traits. In particular, we find that our model is better at judging a person’s dominance from their face than other traits like trustworthiness or likeability, even for emotionally neutral faces. We also find that our model predicts trait inferences for deliberately manipulated faces more accurately than for randomly generated faces, much as human subjects do. Our results lend insight into the manner in which appearance biases may be propagated by standard face recognition models.
Researchers have raised concerns about the use of face recognition for, inter alia, police surveillance and job candidate screening (Raji et al., 2020). For example, HireVue’s automated recruiting technology uses candidates’ appearance and facial expressions to judge their fitness for employment (Harwell, 2019). If a surveillance or hiring algorithm learns harmful human biases from annotated training data, it may systematically discriminate against individuals with certain facial features. We investigate whether industry-standard face recognition algorithms can learn to trust or mistrust faces based on human annotators’ perception of personality traits from faces. If off-the-shelf machine learning face recognition models draw trait inferences about the faces they examine, then any application domain using face recognition to make judgments, from surveillance to hiring to self-driving cars, is at risk of propagating harmful prejudices. In human beings, quick trait inferences should not affect important, deliberate decisions (Willis and Todorov, 2006), but unconscious appearance biases that occur during rapid data annotation may embed and amplify appearance discrimination in machines. We show that embeddings from the FaceNet model can be used to predict human annotators’ first-impression appearance biases for six different personality traits.
Because the predictions made by machine learning models depend on both the training data and the annotations used to label them, systematic biases in either source of data could result in biased predictions. For instance, a dataset on employment information designed to predict which job candidates will be successful in the future might contain data regarding mainly European American men. If such a dataset reflects historical injustices, it is likely to unfairly disadvantage African American job candidates. Moreover, annotators could introduce human bias to the dataset by labeling items according to their implicit biases. If annotators for a computer vision task are presented with a photo of two employees, they might label a woman as the employee and the man standing next to her as the employer or boss. Such embedded implicit or sociocultural bias leads to biased and potentially prejudiced outcomes in decision-making systems. Prior research shows that in human-centered data, a priori bias often includes harmful stereotypes and introduces problems of bias or unfairness into subsequent decision-making. In computer vision, models used in face detection or self-driving cars have been shown to exhibit gender and racial biases (Buolamwini and Gebru, 2018; Wilson et al., 2019).
2 Problem Statement
In this research, we hypothesize that computer vision models embed not only racial or gender biases but also other human-like biases, including appearance biases caused by first impressions. Examples of such biases include gender choices made by automated captioning systems and contextual cues used incorrectly by visual question answering systems (Hendricks et al., 2018; Zhao et al., 2017; Manjunatha et al., 2019). As of this writing, these algorithms are actively used in video interview screening of job applicants (Escalante et al., 2017) (Figure 1), self-driving cars (Geiger et al., 2012), surveillance (Ko, 2008), anomaly detection (Mahadevan et al., 2010), military unmanned autonomous vehicles (Nex and Remondino, 2014), and cancer detection (Bejnordi et al., 2017).
But while many biases affecting machine learning systems are explicit and easily detected with error analysis, some “implicit” biases are consciously disavowed and are much more difficult to measure and counteract. In human judgment, these biases often take effect a split second after perception, and they can be quantified by implicit association tests (Greenwald et al., 1998) or other psychological studies (Willis and Todorov, 2006). We investigate whether biases formed during the first impression of a human face become embedded in face recognition models. First impressions are trait inferences drawn from the facial structure and expression of other people (Willis and Todorov, 2006). Here, traits are personality attributes including attractiveness, competence, extroversion, dominance, likeability, and trustworthiness (Hassin and Trope, 2000). Specifically, we propose that standard machine learning techniques, including pre-trained face recognition models, propagate first-impression trait inferences, or appearance biases, based on facial structure. To determine whether machine learning models acquire first-impression appearance bias, we quantify the correlation between our model’s predicted trait scores and human subjects’ actual trait inferences.
3 Related Work
There is a wealth of literature measuring the stereotypes perpetuated by image classifiers and other machine learning models, from search results to automated captioning (Kay et al., 2015; Hendricks et al., 2018; Kleinberg et al., 2017). Previous applications of unsupervised machine learning methods demonstrated the existence of social and cultural biases embedded in the statistical properties of language, but little research has been conducted on biases in transfer learning models for faces or people, and even less attention has been paid to the intersection of machine learning and first-impression appearance bias (Caliskan et al., 2017; Torralba and Efros, 2011). Tangentially, Jacques Junior et al. review the use of computer vision to anticipate personality traits (Jacques Junior et al., 2018). Most notably, Yang and Glaser use a novel long short-term memory (LSTM) approach to predict first impressions of the Big Five personality traits after 15 seconds (Yang and Glaser, 2017). But are these first impressions preserved in the datasets and off-the-shelf models used in transfer learning, and can even more rapid judgments, such as first-impression bias, be replicated?
There are a few relevant psychology studies devoted to measuring cognitive bias associated with human face recognition. In particular, Willis and Todorov (2006) measured the immediate judgments people make about others’ faces on first sight, recording a spectrum of trait inferences, from trustworthiness to aggressiveness, after less than a second of exposure to computer-generated faces (Willis and Todorov, 2006; Todorov, 2017). Oosterhof and Todorov (2008) identified 50 principal components of a statistical space representing face shape and used them to linearly generate face variations that capture a broad range of facial attributes and trait judgments. They further analyzed the facial cues used to make evaluations about trustworthiness and dominance, identifying “approach/avoidance” expressions that signal trustworthiness and features that signal physical strength, or dominance.
To test whether first-impression trait inferences can be learned from facial cues visualized in Figure 2, we aggregated datasets of computer-generated faces used to measure appearance bias in two psychological studies (Oosterhof and Todorov, 2008; Todorov et al., 2013). In each experiment, human participants are shown a face for less than a second and then asked to rate the degree to which it exhibits a given trait on a 9-point scale. Each face is hairless and centered on a black background. These face models were generated with FaceGen, which uses a database of laser-scanned male and female human faces to create new, unique faces. Together, these two sets provide a benchmark for first-impression, appearance-based evaluations of personality traits by human participants.
4.1 Randomly Generated Faces
The first dataset (300 Random Faces) includes 300 computer-generated, emotionally neutral, Caucasian male faces (Figure 2). Male faces were used due to lack of hair; participants tend to categorize bald faces as male (Todorov et al., 2013). In this study, the authors asked 75 Princeton University undergraduates to judge each face from this dataset on attractiveness, competence, extroversion, dominance, likeability, and trustworthiness (Oosterhof and Todorov, 2008; Todorov et al., 2011). Here, the ground-truth labels are the trait scores provided by the study participants. So that the ground-truth labels for both datasets are distributed normally, we standardize each score as z = (x − μ) / σ, where μ is the mean and σ is the standard deviation (SD).
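The standardization step is a plain z-score transform; a minimal sketch with hypothetical ratings on the 9-point scale:

```python
import numpy as np

def standardize(scores):
    """Z-score a vector of raw trait ratings: z = (x - mu) / sigma."""
    scores = np.asarray(scores, dtype=float)
    mu, sigma = scores.mean(), scores.std()
    return (scores - mu) / sigma

ratings = [3, 5, 7, 4, 6]   # hypothetical mean trustworthiness ratings per face
z = standardize(ratings)
# z now has zero mean and unit standard deviation
```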
4.2 Faces Manipulated Along Trait Dimensions
For the second dataset (Maximally Distinct Faces), Todorov et al. (2011) select 25 “maximally distinct” faces from a sample of 1,000 randomly generated faces. Maximally distinct faces are those whose principal components are separated by the maximum Euclidean distance. Using the correlations between a principal-component face representation and empirical trait inference ratings from the human participants, each maximally distinct face is manipulated along each of the six trait dimensions to produce a set of faces intended to elicit trait inferences at −3, −2, −1, 0, 1, 2, and 3 SD from the mean. Manipulations are numerical perturbations in the face representation vector based entirely on the correlations between the vector and human participants’ judgments of the face it represents. Though the perturbations themselves are not psychologically meaningful, these manipulations tend to produce faces that vary noticeably along the trait dimensions (Figure 2). After the faces were produced, the target trait scores were validated by 15 different Princeton University student participants on the same 9-point scale as in the first study (Todorov et al., 2013).
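The manipulation can be illustrated with a toy sketch: a face’s 50-dimensional component vector is shifted along a trait direction in unit-SD steps. The direction below is random; in the actual pipeline it is derived from correlations between face components and human trait judgments, so this is only a structural illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

n_components = 50                      # face-space dimensionality (Oosterhof and Todorov, 2008)
face = rng.normal(size=n_components)   # one randomly generated face vector

# Placeholder trait direction (real one comes from judgment correlations).
trait_direction = rng.normal(size=n_components)
trait_direction /= np.linalg.norm(trait_direction)

sd_levels = [-3, -2, -1, 0, 1, 2, 3]
variants = np.stack([face + k * trait_direction for k in sd_levels])
# variants[3] is the unmanipulated face; the others differ only along trait_direction
```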
To train a regression model to predict trait inferences, we construct a transfer learning pipeline that leverages face representations extracted from a pre-trained, state-of-the-art face recognition model (Figure 3). From the final layer of FaceNet, a popular open-source Inception-ResNet-V1 deep learning architecture, we extract a 128-dimensional feature vector from the pixels of each image in the two sets of labeled faces described in Section 4 (Schroff et al., 2015). For thousands of images, extraction takes minutes. Rather than train FaceNet from scratch, we use a model with weights pre-trained using softmax loss on the MS-Celeb-1M dataset, a common face recognition benchmark (Guo et al., 2016). Pre-training the model for feature extraction allows us to replicate the feature processing commonly used in black-box industry models. The FaceNet model (10k stars on GitHub) and similar architectures such as OpenFace (13.1k stars on GitHub) are used by software developers, researchers, and industry groups (Schroff et al., 2015; Amos et al., 2016).
If widely used black box face recognition models tend to learn appearance biases embedded in datasets, face recognition applications may make biased decisions that inequitably impact users with certain facial features. After feature extraction, we train six random forest regression models to predict appearance bias for each of the six traits measured: attractiveness, competence, dominance, extroversion, likeability, and trustworthiness. The human participants’ trait scores, multiplied by 100, serve as the ground-truth labels. The random forest includes 100 weak learners with no maximum depth, a minimum split size of two, and mean-squared-error split criterion. These hyper-parameters were chosen using a holdout test set, consisting of one image from the twenty-five maximally distinct faces and twelve images from the 300 random faces. Data and code used to produce the figures, tables, and machine learning pipeline (Figure 3) in this work are open-sourced at https://github.com/anonymous/repo. The following section details two methods for training and testing this regression model.
6.1 Cross-Fold Validation
Experiment A: To test how well the random forest regression model learns appearance bias from the labeled faces, we shuffle the image embeddings extracted with FaceNet such that the 300 random faces and maximally distinct faces are mixed. The target labels are the original appearance bias measurements provided by human participants. Splitting the training data into 10 equal folds, we do the following for each fold: 1) train the regressor on the other 9 partitions; 2) record and plot appearance bias predictions for the current partition. Once all 10 partitions are processed, each image has a corresponding vector of predicted appearance bias scores, one for each trait measured.
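The per-fold train/predict loop above corresponds to scikit-learn’s `cross_val_predict`; a sketch with placeholder embeddings and synthetic labels standing in for the shuffled face pool:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(7)

# Placeholder pool standing in for the mixed FaceNet embeddings of the
# 300 random faces and the maximally distinct faces.
X = rng.normal(size=(150, 128))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=150)

model = RandomForestRegressor(n_estimators=100, random_state=0)
# Each sample's prediction comes from the fold in which it was held out.
preds = cross_val_predict(model, X, y, cv=10)
```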
Table 1 displays goodness-of-fit and correlation statistics from the cross-validations for regressions on all six traits measured. Notably, our approach learns appearance bias to a high degree of precision for the maximally distinct faces, but the accuracy drops on randomly generated faces.
6.2 Testing on Randomly Generated Faces
Experiment B: To better assess our model’s performance and investigate the disparity in predictive performance between the maximally distinct faces and the randomly generated faces, we train the regression model on only the maximally distinct faces and test on only the randomly generated faces. Prediction on the randomly generated faces in this experiment has a smaller correlation coefficient. This result may be due in part to the smaller sample size for the randomly generated faces, but is more likely a result of the higher variance in participants’ responses to randomly generated faces (Todorov et al., 2011). Like the human participants, our model tends to agree more about judgments of deliberately manipulated faces than about judgments of randomly generated faces. Our approach learns appearance bias more accurately with respect to judgments of dominance than judgments of other traits, perhaps because dominance judgments have been shown to be less correlated with judgments of other traits (Willis and Todorov, 2006). Also like the human participants, our model is much more accurate at predicting dominance judgments for the randomly generated faces than it is at predicting other trait judgments.
Table 1: Pearson’s correlation coefficient and root mean square error (RMSE) for regression predictions. In Experiment A, a random forest regression is fitted on both sets of faces and predictions are produced by 10-fold cross-validation; in B, the regression is fitted on maximally distinct faces and tested on randomly generated faces. P-values are from the t-test of the null hypothesis that the correlation coefficient is zero.
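The table’s statistics can be computed with SciPy and NumPy; the prediction vectors below are hypothetical stand-ins for a model’s output:

```python
import numpy as np
from scipy.stats import pearsonr

def report(y_true, y_pred):
    """Pearson's r with its two-sided p-value (H0: rho = 0), plus RMSE."""
    r, p = pearsonr(y_true, y_pred)
    diffs = np.asarray(y_true) - np.asarray(y_pred)
    rmse = float(np.sqrt(np.mean(diffs ** 2)))
    return r, p, rmse

y_true = [10.0, -20.0, 35.0, 0.0, -15.0]   # hypothetical trait scores (x 100)
y_pred = [12.0, -18.0, 30.0, 5.0, -10.0]   # hypothetical predictions
r, p, rmse = report(y_true, y_pred)
```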
We show that state-of-the-art face recognition techniques learn appearance biases from human annotators without special tuning or design, suggesting that biases from annotators may be creeping into face recognition applications in the wild. Significantly, we also find that appearance biases are propagated more easily from images which have been artificially manipulated to appear more or less trustworthy to human beings, and that bias about the dominance trait may be easier to learn. The results of this research will be particularly useful to AI and machine learning practitioners wishing to detect and mitigate bias in their systems, psychologists studying prejudice in human perception of faces, and policymakers concerned with fairness and bias in technology. Further work is necessary to determine the extent to which first-impression appearance bias exists in specific machine learning domains. Models using traditional or transfer learning may not be the only methods capable of learning trait inferences; zero-shot, semi-supervised, and multi-modal learning approaches should also be investigated to determine whether they exacerbate or mitigate bias.
Additionally, it is not clear whether and how appearance biases translate to algorithmic decision-making; additional research is necessary to identify appearance biases in commonly used datasets and associate those biases with algorithmic outcomes. An unsupervised approach, like image embeddings, may be useful for formulating even more accurate measures of implicit biases in image classification in the style of the Word Embedding Association Test (Caliskan et al., 2017).
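As a sketch of what such a measure might look like, the WEAT effect size (Caliskan et al., 2017) can be applied directly to image embeddings rather than word vectors; the embedding sets below are random placeholders, not real face or attribute exemplars:

```python
import numpy as np

def cos(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """s(w, A, B): mean cosine similarity to set A minus mean similarity to set B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def effect_size(X, Y, A, B):
    """WEAT-style effect size between target sets X, Y and attribute sets A, B."""
    s = [association(w, A, B) for w in X + Y]
    return (np.mean(s[: len(X)]) - np.mean(s[len(X):])) / np.std(s, ddof=1)

# Placeholder image embeddings: X/Y would be two sets of target faces,
# A/B two sets of attribute exemplars (e.g., trustworthy vs. untrustworthy).
rng = np.random.default_rng(1)
X = [rng.normal(size=128) for _ in range(8)]
Y = [rng.normal(size=128) for _ in range(8)]
A = [rng.normal(size=128) for _ in range(8)]
B = [rng.normal(size=128) for _ in range(8)]
d = effect_size(X, Y, A, B)
```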
As artificial intelligence systems take a greater role in human interactions, a stronger understanding of the biases that can pervade these systems, and of their effects, is necessary to ensure fair and ethical human-AI interactions. We find that machine learning models can easily replicate human beings’ first impressions of personality traits in other faces using state-of-the-art models and data. We develop a method for measuring the extent to which trait impressions are learned in a standard face recognition scenario and establish benchmark learning rates for six perceived social traits. Our model appears to perceive traits using facial features similar to those used by human participants, as demonstrated by its improved performance on the dominance trait for randomly generated faces. Because trait impressions are learned more easily for faces which have been artificially manipulated, our results lend insight into the manner in which biases may be extracted and interpreted by standard face recognition models.
References

- OpenFace: A general-purpose face recognition library with mobile applications. Technical report.
- Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318 (22), pp. 2199–2210.
- Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of Machine Learning Research, S. A. Friedler and C. Wilson (Eds.), Vol. 81, pp. 1–15.
- Semantics Derived Automatically from Language Corpora Contain Human-like Biases. Science 356 (6334).
- Saving Face: Investigating the Ethical Concerns of Facial Recognition Auditing.
- Design of an explainable machine learning challenge for video interviews. In Proceedings of the International Joint Conference on Neural Networks, pp. 3688–3695.
- FaceGen - 3D Human Faces.
- Are we ready for autonomous driving? The KITTI vision benchmark suite. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361.
- Measuring Individual Differences in Implicit Cognition: The Implicit Association Test. Journal of Personality and Social Psychology 74 (6), pp. 1464–1480.
- MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition. Lecture Notes in Computer Science 9907, pp. 87–102.
- A face-scanning algorithm increasingly decides whether you deserve the job. The Washington Post.
- Facing faces: Studies on the cognitive aspects of physiognomy. Journal of Personality and Social Psychology 78 (5), pp. 837–852.
- Women Also Snowboard: Overcoming Bias in Captioning Models. CoRR.
- First Impressions: A Survey on Vision-based Apparent Personality Trait Analysis. arXiv preprint.
- Unequal Representation and Gender Stereotypes in Image Search Results for Occupations. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI ’15), pp. 3819–3828.
- Human Decisions and Machine Predictions. Working Paper 23180, National Bureau of Economic Research.
- A survey on behavior analysis in video surveillance for homeland security applications. In Proceedings of the Applied Imagery Pattern Recognition Workshop.
- Anomaly detection in crowded scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1975–1981.
- Explicit Bias Discovery in Visual Question Answering Models. pp. 9554–9563.
- UAV for 3D mapping applications: A review. Vol. 6, Springer.
- The functional basis of face evaluation. PNAS.
- FaceNet: A Unified Embedding for Face Recognition and Clustering. In IEEE CVPR, pp. 815–823.
- Face Value: The Irresistible Influence of First Impressions. Princeton University Press, Princeton.
- Validation of Data-Driven Computational Models of Social Perception of Faces. Emotion 13 (4), pp. 724–738.
- Data-driven Methods for Modeling Social Perception. Social and Personality Psychology Compass 5 (10), pp. 775–791.
- Unbiased Look at Dataset Bias. In CVPR.
- First Impressions. Psychological Science 17 (7), pp. 592–598.
- Predictive Inequity in Object Detection.
- Prediction of Personality First Impressions With Deep Bimodal LSTM. arXiv preprint.
- Men also like shopping: Reducing gender bias amplification using corpus-level constraints. In Proceedings of EMNLP 2017, pp. 2979–2989.