Measuring social bias in knowledge graph embeddings
It has recently been shown that word embeddings encode social biases, with a harmful impact on downstream tasks. However, to this point there has been no similar work done in the field of graph embeddings. We present the first study on social bias in knowledge graph embeddings, and propose a new metric suitable for measuring such bias. We conduct experiments on Wikidata and Freebase, and show that, as with word embeddings, harmful social biases related to professions are encoded in the embeddings with respect to gender, religion, ethnicity and nationality. For example, graph embeddings encode the information that men are more likely to be bankers, and women more likely to be homekeepers. As graph embeddings become increasingly utilized, we suggest that it is important that the existence of such biases is understood, and that steps are taken to mitigate their impact.
Recent work in the word embeddings literature has shown that embeddings encode gender and racial biases (Bolukbasi et al., 2016; Caliskan et al., 2017; Garg et al., 2017). These biases can have harmful effects in downstream tasks, including coreference resolution (Zhao et al., 2018a) and machine translation (Stanovsky et al., 2019), leading to the development of a range of methods which try to mitigate such biases (Bolukbasi et al., 2016; Zhao et al., 2018b). In an adjacent literature, learning embeddings of knowledge graph (KG) entities and relations is becoming an increasingly common first step in utilizing KGs for a range of tasks, from missing link prediction (Bordes et al., 2013; Trouillon et al., 2016) to more recent methods integrating learned embeddings into language models (Zhang et al., 2019; Logan et al., 2019; Peters et al., 2019).
A natural question to ask is “do graph embeddings encode social biases in a similar fashion to word embeddings?”. We show that existing methods for identifying bias in word embeddings are not suitable for KG embeddings, and present an approach to overcome this using embedding finetuning. We demonstrate (perhaps unsurprisingly) that unequal distributions of people of different genders, ethnicities, religions and nationalities in Freebase and Wikidata result in biases related to professions being encoded in graph embeddings, such as that men are more likely to be bankers and women more likely to be homekeepers.
Such biases are potentially harmful when KG embeddings are used in applications. For example, if embeddings are used in a fact checking task, encoded social biases could make the model systematically more willing to accept candidate facts which conform to harmful stereotypes.
2.1 Graph Embeddings
Graph embeddings are a vector representation of dimension d of all entities and relations in a KG. To learn these representations, we define a score function g, which takes as input the embeddings of a fact in triple form and outputs a score denoting how likely this triple is to be correct:

score = g(e1, r1, e2)

where e1 and e2 are the d-dimensional embeddings of entities 1 and 2, and r1 is the d-dimensional embedding of relation 1. The score function is composed of a transformation, which takes as input one entity embedding and the relation embedding and outputs a vector of the same dimension, and a similarity function, which calculates the similarity or distance between the output of the transformation function and the other entity embedding.
Many transformation functions have been proposed, including TransE (Bordes et al., 2013), ComplEx (Trouillon et al., 2016) and RotatE (Sun et al., 2019). In this paper we use the TransE transformation and the dot product similarity metric, though we emphasize that our approach is applicable to any score function:

g(e1, r1, e2) = (e1 + r1) · e2
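Concretely, this score function can be sketched in a few lines (a minimal NumPy illustration of the TransE transformation with dot-product similarity; the variable names are ours, not from any released codebase):

```python
import numpy as np

def score(e1, r, e2):
    """TransE-style score: translate the lhs entity embedding by the
    relation embedding, then compare to the rhs entity with a dot product."""
    return np.dot(e1 + r, e2)

# toy 2-dimensional example: a higher score means the model judges
# the triple (entity1, relation, entity2) more plausible
s = score(np.array([1.0, 0.0]),   # e1: lhs entity embedding
          np.array([0.0, 1.0]),   # r:  relation embedding
          np.array([1.0, 1.0]))   # e2: rhs entity embedding
# (1, 1) . (1, 1) = 2.0
```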
We use embeddings of dimension 200, and sample 1000 negative triples per positive, by randomly permuting the lhs or rhs entity. We pass the 1000 negatives and single positive through a softmax function, and train using the cross entropy loss. All training is implemented using the PyTorch-BigGraph library Lerer et al. (2019).
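The training objective can be sketched similarly (our own NumPy illustration of the softmax cross-entropy over one positive triple and its sampled negatives; the paper itself relies on the PyTorch-BigGraph implementation):

```python
import numpy as np

def nll_positive(pos_score, neg_scores):
    """Cross-entropy loss that treats the positive triple as the correct
    class among (1 positive + N sampled negatives)."""
    scores = np.concatenate(([pos_score], neg_scores))
    scores = scores - scores.max()                  # numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum())
    return -log_softmax[0]                          # index 0 = the positive

# toy example: one positive triple scored against 1000 random negatives;
# a lower loss means the positive is ranked well above the negatives
rng = np.random.default_rng(0)
loss = nll_positive(5.0, rng.normal(size=1000))
```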
2.2 Defining bias in embeddings
Bias can be thought of as “prejudice in favor or against a person, group, or thing that is considered to be unfair” (Jones, 2019). Because definitions of fairness have changed over time, algorithms which are trained on “real-world” data risk learning and perpetuating prejudices which society no longer considers fair.
2.3 Measuring bias in word embeddings
The first common technique for exposing bias in word embeddings, the “Word Embedding Association Test” (Caliskan et al., 2017), measures the cosine distance between embeddings and the average embeddings of sets of attribute words (e.g. male vs. female). The authors give a range of examples of biases according to this metric, including that science-related words are more associated with “male”, and art-related words with “female”. In a similar vein, Bolukbasi et al. (2016) use the direction between vectors to expose stereotypical analogies, claiming that the direction man::doctor is analogous to that of woman::nurse. Despite Nissim et al. (2019) exposing some technical shortcomings in this approach, it remains the case that distance metrics appear appropriate for at least exposing bias in word embeddings, which has then been shown to clearly propagate to downstream tasks (Zhao et al., 2018a; Stanovsky et al., 2019).
We suggest that distance-based metrics are not suitable for measuring bias in KG embeddings. Figure 1 provides a simple demonstration of this. Visualizing in a two dimensional space, the embedding of person1 is closer to nurse than to doctor. However, graph embedding models do not use distance between two entity embeddings when making predictions, but rather the distance between some transformation of one entity embedding with the relation embedding.
In the simplest case of TransE (Bordes et al., 2013), this transformation is a summation, which could result in a vector positioned at the yellow dot in Figure 1 when making a prediction of the profession of person1. As the transformation function becomes more complicated (Trouillon et al., 2016; Sun et al., 2019), the distance metric becomes even less applicable, as associations in distance space become less and less correlated with associations in score-function space.
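A toy construction (ours, not taken from the paper) makes the disagreement concrete: the nearest profession by Euclidean distance and the highest-scoring profession under a TransE-style score function can differ:

```python
import numpy as np

person1 = np.array([0.0, 0.0])
nurse = np.array([1.0, 0.0])            # close to person1 in space
doctor = np.array([0.0, 3.0])           # farther from person1
has_profession = np.array([0.0, 3.0])   # relation embedding (translation)

def score(e1, r, e2):
    """TransE transformation with dot-product similarity."""
    return np.dot(e1 + r, e2)

# distance space: nurse is the closer entity (1.0 < 3.0)
d_nurse = np.linalg.norm(person1 - nurse)
d_doctor = np.linalg.norm(person1 - doctor)

# score-function space: the translated embedding favours doctor (9.0 > 0.0)
s_doctor = score(person1, has_profession, doctor)
s_nurse = score(person1, has_profession, nurse)
```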
2.4 Score based metric
In light of this, we present an alternative metric based on the score function. We define the sensitive attribute we are interested in, denoted s, and two alternative values of this attribute, denoted a and b. For the purposes of this example we use gender as the sensitive attribute s, and male and female as the alternative values a and b. We take a trained embedding of a human entity, person_i, denoted e_i, and calculate an update to this embedding which increases the score that they have attribute a (male), and decreases the score that they have attribute b (female). In other words, we finetune the embedding to make the person “more male” according to the model’s encoding of masculinity. This is visualized in Figure 2, where we shift person1’s embedding so that the transformation between person1 and the relation has_gender moves closer to male and away from female.
Mathematically, we define the function M as the difference between the score that person i has sensitive attribute a (male) and the score that they have sensitive attribute b (female). We then differentiate M with respect to the embedding of person i, e_i, and update the embedding to increase this score function:

M = g(e_i, r_s, e_a) − g(e_i, r_s, e_b)

e_i' = e_i + α ∂M/∂e_i

where e_i' denotes the new embedding for person i, r_s the embedding of the sensitive relation s (gender), and e_a and e_b the embeddings of attributes a and b (male and female). This is equivalent to providing the model with a batch of two triples, (person_i, has_gender, male) and (person_i, has_gender, female), and taking a step with the basic gradient descent algorithm with learning rate α.
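This finetuning step can be sketched as follows (a NumPy sketch under the TransE + dot-product score used in this paper; function names are ours). Note that for this linear score the relation embedding cancels out of the gradient:

```python
import numpy as np

def score(e1, r, e2):
    """TransE transformation with dot-product similarity."""
    return np.dot(e1 + r, e2)

def finetune_towards(e_person, r_sens, e_a, e_b, lr=0.1):
    """One gradient-ascent step on M = g(e, r_sens, e_a) - g(e, r_sens, e_b).
    For this linear score, dM/de = e_a - e_b: the sensitive-relation
    embedding r_sens cancels in the gradient."""
    return e_person + lr * (e_a - e_b)

# random toy embeddings for a person, the gender relation, male and female
rng = np.random.default_rng(0)
e_p, r_g, e_m, e_f = rng.normal(size=(4, 8))
e_p_new = finetune_towards(e_p, r_g, e_m, e_f)
# e_p_new now scores higher on (person, has_gender, male) relative to female
```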
We then analyse the change in the scores for all professions. That is, we calculate whether, according to the model’s score function, making an entity more male increases or decreases the likelihood that they have a particular profession p:

Δ_{p,i} = g(e_i', r_p, e_p) − g(e_i, r_p, e_p)

where e_p denotes the entity embedding of the profession p, and r_p the embedding of the relation has_profession.
Figure 3 illustrates this. The adjustment to person1’s embedding defined in Figure 2 results in the transformation of person1 and the relation has_profession moving closer to doctor and further away from nurse. That is, the score g(person1, has_profession, doctor) has increased, and the score g(person1, has_profession, nurse) has decreased. In other words, the embeddings in this case encode the bias that doctor is a profession associated with male rather than female entities.
We can then repeat the process for all humans in the KG and calculate the average change, giving a bias score B_p for profession p:

B_p = (1/J) Σ_{i=1}^{J} [ g(e_i', r_p, e_p) − g(e_i, r_p, e_p) ]

where J is the number of human entities in the KG. We calculate this score for each profession, and rank the results.
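Putting the pieces together, the bias score for one profession can be sketched end-to-end (again a NumPy illustration under the TransE + dot-product assumption; all names are ours):

```python
import numpy as np

def score(e1, r, e2):
    """TransE transformation with dot-product similarity."""
    return np.dot(e1 + r, e2)

def profession_bias(people, e_a, e_b, r_prof, e_prof, lr=0.1):
    """Average change in a profession's score after finetuning each
    person's embedding towards attribute a and away from attribute b.
    Positive values mean the profession is associated with attribute a."""
    deltas = []
    for e_p in people:
        e_new = e_p + lr * (e_a - e_b)        # finetuning step (dM/de = a - b)
        deltas.append(score(e_new, r_prof, e_prof)
                      - score(e_p, r_prof, e_prof))
    return float(np.mean(deltas))
```

With this purely linear score the per-person change happens to be identical; with nonlinear transformations such as ComplEx or RotatE the deltas genuinely vary across people, and the averaging over all humans matters.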
We provide results in the main paper for Wikidata using TransE Bordes et al. (2013) embeddings, showing only professions which have at least 20 observations in the KG.
Table 1 presents the results for gender, with attribute a being male and b female. Alongside the score we present the counts of humans in the KG which have this profession, split by attributes a and b. For example, the top row of columns a and b in Table 1 shows that there are 44 male entities in Wikidata with the profession baritone, and 0 female entities with this profession.
|profession||bias score||count a (male)||count b (female)|
|Formula One driver||0.100||681||3|
Whilst the discrepancies in counts are of interest in themselves (Wagner et al., 2015), our main aim in this paper is to show that these differences propagate to the learned embeddings. Table 1 confirms this; although it includes a number of professions which are male by definition, such as “baritone”, there are also many which we may wish to be neutral, such as “banker” and “engineer”. Whilst there is a strong correlation between the counts and the bias score, it is not perfect. For example, there are more male and fewer female priests than there are bankers, but the model gives a higher score to banker than to priest. The interconnected nature of graphs makes diagnosing the reason for this difficult, but there is clearly a difference in the representation of the male entities in the graph who are bankers relative to those who are priests, which plays out along gender lines.
|profession||bias score||count a (female)||count b (male)|
|woman of letters||0.107||165||10|
Table 2 presents the most female professions relative to male for Wikidata (i.e. we reverse attributes a and b from Table 1). As with the most male case, there is a mixture of professions which are female by definition, such as “nun”, and those which we may wish to be neutral, such as “nurse” and “homekeeper”. This story is supported by Tables 9 and 10 in the Appendix, which give the same results for the FB3M dataset.
|profession||bias score||count a||count b|
|film score composer||0.088||10||25|
We can also calculate biases for other sensitive relations, such as ethnicity, religion and nationality. For each of these relations, we choose two attributes to compare. In Table 3, we show the professions most associated with the ethnicity “Jewish” relative to “African American”. As previously, the results include potentially harmful stereotypes, such as the “economist” and “entrepreneur” cases. It is interesting that these stereotypes play out in our measure despite the more balanced nature of the counts.
We have presented the first study on social bias in KG embeddings, and proposed a new metric for measuring such bias. We demonstrated that differences in the distributions of entities in real-world knowledge graphs (there are many more male bankers in Wikidata than female ones) translate into harmful biases related to professions being encoded in the embeddings. Given that KGs are formed of real-world entities, we cannot simply equalize the counts; it is not possible to correct history by creating female US Presidents, etc. In light of this, we suggest that care is needed when applying graph embeddings in NLP pipelines, and that work is needed to develop robust methods to debias such embeddings.
Appendix A Appendices
a.1 Wikidata additional results
We provide a sample of additional results for Wikidata, across ethnicity, religion and nationality. For each case we choose a pair of values (e.g. Catholic and Islam for religion) to compare.
The picture presented is similar to that in the main paper; the bias measure is highly correlated with the raw counts, with some associations being non-controversial, and others demonstrating potentially harmful stereotypes. Table 8 is interesting, as the larger number of US entities in Wikidata (390k) relative to UK entities (131k) means the counts are more balanced, and the correlation between counts and bias measure less strong.
|profession||bias score||count a||count b|
|Canadian football player||0.217||298||0|
|American football player||0.180||1661||1|
|mixed martial artist||0.137||60||0|
|civil rights advocate||0.121||73||0|
|human rights activist||0.125||59||42|
|rugby union player||0.063||195||2554|
|association football referee||0.055||45||159|
|Canadian football player||0.106||2163||1|
|real estate developer||0.097||28||0|
|civil rights advocate||0.095||85||0|
|video game developer||0.084||75||11|
a.2 FB3M results
For comparison, we train TransE embeddings on FB3M of the same dimension, and present the corresponding results tables for gender, religion, ethnicity and nationality. The distribution of entities in FB3M is significantly different to that in Wikidata, resulting in a variety of different professions entering the top twenty counts. However, the broad conclusion is similar; the embeddings encode common and potentially harmful stereotypes related to professions.
|profession||bias score||count a||count b|
|Holy Roman Emperor||0.119||23||0|
|Nordic combined skier||0.102||65||0|
|Visual Effects Animator||0.098||27||2|
|Nude Glamour Model||0.081||1||511|
|Key Hair Stylist||0.047||11||43|
|Key Makeup Artist||0.044||9||29|
|Hair and Makeup Artist||0.042||4||24|
|Talk show host||0.052||18||20|
|film score composer||0.048||106||100|
|music video director||0.039||14||7|
|American football player||0.054||525||8|
|Holy Roman Emperor||0.052||20||0|
|attorney at law||0.031||19||1|
|American football player||0.021||43||3|
|film score composer||0.029||29||36|
|association football player||0.024||67||58|
|field hockey player||0.042||16||16|
|Talk show host||0.045||161||6|
|law enforcement officer||0.042||20||0|
|American football player||0.038||7405||3|
|Certified Public Accountant||0.033||33||0|
|Game Show Host||0.031||60||2|
|attorney at law||0.030||83||0|
a.3 Complex embeddings
Our method is equally applicable to any transformation function. To demonstrate this, we trained embeddings of the same dimension using the ComplEx transformation Trouillon et al. (2016), and provide the results for gender in Tables 17 and 18 below. It would be interesting to carry out a comparison of the differences in how bias is encoded for different transformation functions, which we leave to future work.
|profession||bias score||count a||count b|
|association football manager||0.132||587||5|
|association football player||0.115||13321||227|
|Nude Glamour Model||0.177||511||1|
|Key Hair Stylist||0.157||43||11|
- Where we evaluate the likelihood that a new triple is correct before adding it to a knowledge base.
- Such as news articles or a knowledge graph
- For example, we could consider the encoded relationship between a person’s nationality and their chances of being a CEO etc.
- The balanced counts are themselves due to there being many more entities with ethnicity “African American” in Wikidata (16280) than ethnicity “Jewish” (1588).
- Basta et al. (2019). Evaluating the underlying gender bias in contextualized word embeddings. arXiv:1904.08783.
- Bolukbasi et al. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. arXiv:1607.06520.
- Bordes et al. (2013). Translating embeddings for modeling multi-relational data. In Proceedings of the 26th International Conference on Neural Information Processing Systems (NIPS'13), pp. 2787–2795.
- Bordes et al. (2011). Learning structured embeddings of knowledge bases. In Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI'11), pp. 301–306.
- Bose and Hamilton (2019). Compositional fairness constraints for graph embeddings. arXiv:1905.10674.
- Caliskan et al. (2017). Semantics derived automatically from language corpora contain human-like biases. Science 356, pp. 183–186.
- Garg et al. (2017). Word embeddings quantify 100 years of gender and ethnic stereotypes. arXiv:1711.08412.
- Gonen and Goldberg (2019). Lipstick on a pig: debiasing methods cover up systematic gender biases in word embeddings but do not remove them. arXiv:1903.03862.
- Graham et al. (2014). Uneven geographies of user-generated information: patterns of increasing informational poverty. Annals of the Association of American Geographers 104(4), pp. 746–764.
- Jones (2019).
- Lerer et al. (2019). PyTorch-BigGraph: a large-scale graph embedding system. In Proceedings of the 2nd SysML Conference, Palo Alto, CA, USA.
- Logan et al. (2019). Barack's wife Hillary: using knowledge-graphs for fact-aware language modeling. arXiv:1906.07241.
- Manzini et al. (2019). Black is to criminal as caucasian is to police: detecting and removing multiclass bias in word embeddings. arXiv:1904.04047.
- May et al. (2019). On measuring social biases in sentence encoders. arXiv:1903.10561.
- Nissim et al. (2019). Fair is better than sensational: man is to doctor as woman is to doctor. arXiv:1905.09866.
- Peters et al. (2019). Knowledge enhanced contextual word representations.
- Rudinger et al. (2018). Gender bias in coreference resolution. arXiv:1804.09301.
- Stanovsky et al. (2019). Evaluating gender bias in machine translation. arXiv:1906.00591.
- Sun et al. (2019). RotatE: knowledge graph embedding by relational rotation in complex space. In International Conference on Learning Representations.
- Trouillon et al. (2016). Complex embeddings for simple link prediction. arXiv:1606.06357.
- Wagner et al. (2015). It's a man's Wikipedia? Assessing gender inequality in an online encyclopedia. arXiv:1501.06307.
- Wagner et al. (2016). Women through the glass ceiling: gender asymmetries in Wikipedia. arXiv:1601.04890.
- Williams and Best. Measuring sex stereotypes: a multination study, rev. ed.
- Zhang et al. (2019). ERNIE: enhanced language representation with informative entities. arXiv:1905.07129.
- Zhao et al. (2018a). Gender bias in coreference resolution: evaluation and debiasing methods. arXiv:1804.06876.
- Zhao et al. (2018b). Learning gender-neutral word embeddings. arXiv:1809.01496.
- Zou and Schiebinger (2018). AI can be sexist and racist, and it's time to make it fair. Nature.