Word Embedding Bias¶
Metrics and debiasing methods for bias (such as gender and race) in word embeddings.
Important
The following paper suggests that the current debiasing methods have only a superficial effect on the bias in word embeddings:
Gonen, H., & Goldberg, Y. (2019). Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them. arXiv preprint arXiv:1903.03862.
Important
The following paper criticizes the use of gensim's most_similar() function in the context of word embedding bias and the analogy generation process:
Nissim, M., van Noord, R., van der Goot, R. (2019). Fair is Better than Sensational: Man is to Doctor as Woman is to Doctor.
Therefore, responsibly provides an implementation of most_similar() with an unrestricted argument that does not filter the input words out of the results. A similar argument exists for generate_analogies().
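To illustrate the difference, a minimal sketch (using the word2vec-google-news-300 model from the gensim downloader, as in the Usage section below; outputs omitted):
>>> from gensim import downloader
>>> from responsibly.we.utils import most_similar
>>> w2v_model = downloader.load('word2vec-google-news-300')
>>> # gensim's most_similar never returns the input words,
>>> # so 'doctor' itself cannot appear as the answer:
>>> w2v_model.most_similar(positive=['woman', 'doctor'], negative=['man'], topn=3)
>>> # responsibly's most_similar with unrestricted=True does not filter the results:
>>> most_similar(w2v_model, positive=['woman', 'doctor'], negative=['man'],
...              topn=3, unrestricted=True)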
Currently, three methods are supported:
Bolukbasi et al. (2016) bias measure and debiasing - responsibly.we.bias
WEAT measure - responsibly.we.weat
Gonen et al. (2019) clustering as classification of biased neutral words - responsibly.we.bias.BiasWordEmbedding.plot_most_biased_clustering()
In addition, some of the standard benchmarks for word embeddings are available, primarily to check the impact of debiasing on the embedding's performance.
Refer to the Word Embedding demo for a complete usage example.
For a technical discussion about the various bias metrics, refer to the page Analysis of Word Embedding Bias Metrics.
Check the WEFE (The Word Embeddings Fairness Evaluation Framework) package for additional word embeddings bias measures.
Bolukbasi Bias Measure and Debiasing¶
Measuring and adjusting bias in word embeddings, following Bolukbasi et al. (2016).
- References:
Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. In Advances in Neural Information Processing Systems (pp. 4349-4357).
The code and data are based on the GitHub repository: https://github.com/tolga-b/debiaswe (MIT License).
Gonen, H., & Goldberg, Y. (2019). Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them. arXiv preprint arXiv:1903.03862.
Nissim, M., van Noord, R., van der Goot, R. (2019). Fair is Better than Sensational: Man is to Doctor as Woman is to Doctor.
Usage¶
>>> from responsibly.we import GenderBiasWE
>>> from gensim import downloader
>>> w2v_model = downloader.load('word2vec-google-news-300')
>>> w2v_gender_bias_we = GenderBiasWE(w2v_model)
>>> w2v_gender_bias_we.calc_direct_bias()
0.07307904249481942
>>> w2v_gender_bias_we.debias()
>>> w2v_gender_bias_we.calc_direct_bias()
1.7964246601064155e-09
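The plotting methods documented below fit into the same flow; a minimal sketch (note that debias() above already modified the embedding in place, so the projections plotted here are the debiased ones):
>>> import matplotlib.pyplot as plt
>>> w2v_gender_bias_we.plot_projection_scores()
>>> plt.show()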
Types of Bias¶
Direct Bias¶
- Associations
Words that are closer to one end (e.g., he) than to the other end (e.g., she); for example, occupational stereotypes (page 7). Calculated by calc_direct_bias().
- Analogies
Analogies of the form he:x::she:y; for example, analogies exhibiting stereotypes (page 7). Generated by generate_analogies(), as shown in the sketch below.
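A minimal usage sketch, on a freshly created (non-debiased) GenderBiasWE instance as in the Usage section (the output, a data frame of analogies, is omitted):
>>> w2v_gender_bias_we.generate_analogies(n_analogies=10)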
Indirect Bias¶
The projection of a neutral word onto a direction defined by two other neutral words is explained, to a large extent, by its shared projection on the bias direction.
Calculated by calc_indirect_bias() and generate_closest_words_indirect_bias().
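For example, a minimal sketch (the word pair is an illustrative choice in the spirit of the softball examples in Bolukbasi et al.; output omitted):
>>> w2v_gender_bias_we.calc_indirect_bias('softball', 'receptionist')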
- class responsibly.we.bias.GenderBiasWE(model, only_lower=False, verbose=False, identify_direction='pca', to_normalize=True)[source]¶
Measure and adjust the gender bias in an English word embedding.
- Parameters
model – Word embedding model of gensim.model.KeyedVectors
only_lower (bool) – Whether the word embedding contains only lowercase words
verbose (bool) – Set verbosity
identify_direction (str) – The method for identifying the gender direction: 'single', 'sum' or 'pca'
to_normalize (bool) – Whether to normalize all the vectors (recommended!)
- plot_projection_scores(words='professions', n_extreme=10, ax=None, axis_projection_step=None)[source]¶
Plot the projection scalars of words on the direction.
- plot_dist_projections_on_direction(word_groups='bolukbasi', ax=None)[source]¶
Plot the distribution of projection scalars on the direction.
- Parameters
word_groups (dict) – The word groups to project
- Returns
The ax object of the plot
- classmethod plot_bias_across_word_embeddings(word_embedding_bias_dict, ax=None, scatter_kwargs=None)[source]¶
Plot the projections of the same words from two word embeddings.
- calc_direct_bias(neutral_words='professions', c=None)[source]¶
Calculate the direct bias.
Based on the projection of neutral words on the direction.
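Per Bolukbasi et al. (2016), the direct bias is the average over the neutral words of the absolute cosine similarity with the bias direction, raised to the power c (strictness of the measurement). A standalone NumPy sketch of that formula, not the library's own implementation (the word vectors and the direction are assumed to be given):

import numpy as np

def direct_bias(neutral_vectors, direction, c=1.0):
    # DirectBias_c = mean over neutral words of |cos(w, g)| ** c,
    # where g is the bias direction
    g = direction / np.linalg.norm(direction)
    cosines = [abs(v @ g) / np.linalg.norm(v) for v in neutral_vectors]
    return float(np.mean([cos ** c for cos in cosines]))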
- generate_closest_words_indirect_bias(neutral_positive_end, neutral_negative_end, words='professions', n_extreme=5)[source]¶
Generate the closest words to a neutral direction and their indirect biases.
The direction of the neutral words is used to find the most extreme words. The indirect bias is calculated between the most extreme words and the closest end.
- Parameters
neutral_positive_end (str) – A word that defines the positive side of the neutral direction.
neutral_negative_end (str) – A word that defines the negative side of the neutral direction.
words (list) – List of words to project on the neutral direction.
n_extreme (int) – The number of most extreme words (positive and negative) to show.
- Returns
Data frame of the most extreme words with their projection scores and indirect biases.
- debias(method='hard', neutral_words=None, equality_sets=None, inplace=True)[source]¶
Debias the word embedding.
- Parameters
Warning
After calling debias, all the vectors of the word embedding will be normalized to unit length.
- learn_full_specific_words(seed_specific_words='bolukbasi', max_non_specific_examples=None, debug=None)[source]¶
Learn specific words given a list of seed specific words.
Uses a linear SVM.
- compute_factual_association(factual_properity={'accountant': 61, 'analyst': 41, 'assistant': 85, 'attendant': 76, 'auditor': 61, 'baker': 65, 'carpenter': 2, 'cashier': 73, 'ceo': 39, 'chief': 27, 'cleaner': 89, 'clerk': 72, 'construction_worker': 4, 'cook': 38, 'counselors': 73, 'designers': 54, 'developer': 20, 'driver': 6, 'editor': 52, 'farmer': 22, 'guard': 22, 'hairdressers': 92, 'housekeeper': 89, 'janitor': 34, 'laborer': 4, 'lawyer': 35, 'librarian': 84, 'manager': 43, 'mechanician': 4, 'mover': 18, 'nurse': 90, 'physician': 38, 'receptionist': 90, 'salesperson': 48, 'secretary': 95, 'sewer': 80, 'sheriff': 14, 'supervisor': 44, 'teacher': 78, 'writer': 63})[source]¶
Compute the association of a factual property with the projection.
Inspired by WEFAT (Word Embedding Factual Association Test), but not the same:
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
WEFAT will also be implemented in a future version.
If a word doesn’t exist in the word embedding, then it will be filtered out.
For example, in responsibly.we.bias.GenderBiasWE, the default factual property is the percentage of women in various occupations, from the Labor Force Statistics of the 2017 Current Population Survey, taken from: https://arxiv.org/abs/1804.06876
- Parameters
factual_properity (dict) – Dictionary of words and their factual values.
- Returns
Pearson r, p-value, and the words with their associated factual values and their projections on the bias direction.
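A minimal usage sketch with the default factual property (the return value is kept whole here, since its exact unpacking is not shown in this documentation; output omitted):
>>> result = w2v_gender_bias_we.compute_factual_association()
>>> # result holds the Pearson r, the p-value, and the per-word data,
>>> # as described above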
- plot_factual_association(factual_properity={'accountant': 61, 'analyst': 41, 'assistant': 85, 'attendant': 76, 'auditor': 61, 'baker': 65, 'carpenter': 2, 'cashier': 73, 'ceo': 39, 'chief': 27, 'cleaner': 89, 'clerk': 72, 'construction_worker': 4, 'cook': 38, 'counselors': 73, 'designers': 54, 'developer': 20, 'driver': 6, 'editor': 52, 'farmer': 22, 'guard': 22, 'hairdressers': 92, 'housekeeper': 89, 'janitor': 34, 'laborer': 4, 'lawyer': 35, 'librarian': 84, 'manager': 43, 'mechanician': 4, 'mover': 18, 'nurse': 90, 'physician': 38, 'receptionist': 90, 'salesperson': 48, 'secretary': 95, 'sewer': 80, 'sheriff': 14, 'supervisor': 44, 'teacher': 78, 'writer': 63}, ax=None)[source]¶
Plot the association of a factual property with the projection.
See: BiasWordEmbedding.compute_factual_association()
- Parameters
factual_properity (dict) – Dictionary of words and their factual values.
- class responsibly.we.bias.BiasWordEmbedding(model, only_lower=False, verbose=False, identify_direction=False, to_normalize=True)[source]¶
Measure and adjust a bias in an English word embedding.
- Parameters
- project_on_direction(word)[source]¶
Project the normalized vector of the word on the direction.
- Parameters
word (str) – The word to project
- Return float
The projection scalar
- calc_projection_data(words)[source]¶
Calculate the projection, projected and rejected vectors of a list of words.
- Parameters
words (list) – List of words
- Returns
pandas.DataFrame of the projection, projected and rejected vectors of the words
- plot_projection_scores(words, n_extreme=10, ax=None, axis_projection_step=None)[source]¶
Plot the projection scalars of words on the direction.
- plot_dist_projections_on_direction(word_groups, ax=None)[source]¶
Plot the distribution of projection scalars on the direction.
- Parameters
word_groups (dict) – The word groups to project
- Returns
The ax object of the plot
- classmethod plot_bias_across_word_embeddings(word_embedding_bias_dict, words, ax=None, scatter_kwargs=None)[source]¶
Plot the projections of the same words from two word embeddings.
- generate_analogies(n_analogies=100, seed='ends', multiple=False, delta=1.0, restrict_vocab=30000, unrestricted=False)[source]¶
Generate analogies based on a seed vector.
x - y ~ seed vector, or a:x::b:y when a - b ~ seed vector.
The seed vector can be defined by two word ends or by the bias direction.
delta is used for semantic coherence; the default value of 1 corresponds to an angle of at most pi/3.
There is criticism regarding generating analogies with unrestricted=False while not ignoring analogies whose match column equals False. Bolukbasi et al.'s technique of generating analogies, as implemented in this method, is inherently limited to analogies with x != y, which may force "fake" bias analogies.
See: Nissim, M., van Noord, R., van der Goot, R. (2019). Fair is Better than Sensational: Man is to Doctor as Woman is to Doctor.
- Parameters
seed – The definition of the seed vector: either a tuple of two word ends, 'ends' for the pre-defined ends, or 'direction' for the pre-defined direction vector.
n_analogies (int) – Number of analogies to generate.
multiple (bool) – Whether to allow multiple appearances of a word in the analogies.
delta (float) – Threshold for semantic similarity; the maximal distance between x and y.
restrict_vocab (int) – The vocabulary size to use.
unrestricted (bool) – Whether to validate the generated analogies with an unrestricted most_similar().
- Returns
Data frame of analogies (x, y), their distances, and their cosine similarity scores.
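Given the criticism discussed above, a hedged usage sketch with an explicit seed and unrestricted validation (the output, a data frame, is omitted):
>>> w2v_gender_bias_we.generate_analogies(n_analogies=50,
...                                       seed=('he', 'she'),
...                                       unrestricted=True)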
- calc_direct_bias(neutral_words, c=None)[source]¶
Calculate the direct bias.
Based on the projection of neutral words on the direction.
- calc_indirect_bias(word1, word2)[source]¶
Calculate the indirect bias between two words.
Based on the amount of shared projection of the words on the direction. Also called PairBias.
- Parameters
word1 (str) – First word
word2 (str) – Second word
- Returns
The indirect bias between the two words
- generate_closest_words_indirect_bias(neutral_positive_end, neutral_negative_end, words=None, n_extreme=5)[source]¶
Generate the closest words to a neutral direction and their indirect biases.
The direction of the neutral words is used to find the most extreme words. The indirect bias is calculated between the most extreme words and the closest end.
- Parameters
neutral_positive_end (str) – A word that defines the positive side of the neutral direction.
neutral_negative_end (str) – A word that defines the negative side of the neutral direction.
words (list) – List of words to project on the neutral direction.
n_extreme (int) – The number of most extreme words (positive and negative) to show.
- Returns
Data frame of the most extreme words with their projection scores and indirect biases.
- debias(method='hard', neutral_words=None, equality_sets=None, inplace=True)[source]¶
Debias the word embedding.
- Parameters
Warning
After calling debias, all the vectors of the word embedding will be normalized to unit length.
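A hedged sketch of non-destructive debiasing, assuming a freshly created GenderBiasWE instance as in the Usage section, and assuming that with inplace=False the method returns a new debiased object and leaves the current one untouched:
>>> debiased_we = w2v_gender_bias_we.debias(method='hard', inplace=False)
>>> debiased_we.calc_direct_bias()          # expected to be close to zero
>>> w2v_gender_bias_we.calc_direct_bias()   # unchanged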
- evaluate_word_embedding(kwargs_word_pairs=None, kwargs_word_analogies=None)[source]¶
Evaluate word pairs tasks and word analogies tasks.
- Parameters
kwargs_word_pairs – Kwargs for the evaluate_word_pairs method.
kwargs_word_analogies – Kwargs for the evaluate_word_analogies method.
- Returns
Tuple of pandas.DataFrame objects with the evaluation results.
- learn_full_specific_words(seed_specific_words, max_non_specific_examples=None, debug=None)[source]¶
Learn specific words given a list of seed specific words.
Uses a linear SVM.
- compute_factual_association(factual_properity)[source]¶
Compute the association of a factual property with the projection.
Inspired by WEFAT (Word Embedding Factual Association Test), but not the same:
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
WEFAT will also be implemented in a future version.
If a word doesn't exist in the word embedding, it will be filtered out.
For example, in responsibly.we.bias.GenderBiasWE, the default factual property is the percentage of women in various occupations, from the Labor Force Statistics of the 2017 Current Population Survey, taken from: https://arxiv.org/abs/1804.06876
- Parameters
factual_properity (dict) – Dictionary of words and their factual values.
- Returns
Pearson r, p-value, and the words with their associated factual values and their projections on the bias direction.
- plot_factual_association(factual_properity, ax=None)[source]¶
Plot the association of a factual property with the projection.
See: BiasWordEmbedding.compute_factual_association()
- Parameters
factual_properity (dict) – Dictionary of words and their factual values.
- static plot_most_biased_clustering(biased, debiased, seed='ends', n_extreme=500, random_state=1)[source]¶
Plot clustering as classification of biased neutral words.
- Parameters
biased – Biased word embedding of BiasWordEmbedding.
debiased – Debiased word embedding of BiasWordEmbedding.
seed – The definition of the seed vector: either a tuple of two word ends, 'ends' for the pre-defined ends, or 'direction' for the pre-defined direction vector.
n_extreme – The number of extreme biased neutral words to use.
- Returns
Tuple of list of ax objects of the plot, and a dictionary with the most positive and negative words.
Based on:
Gonen, H., & Goldberg, Y. (2019). Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them. arXiv preprint arXiv:1903.03862.
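A hedged usage sketch, assuming a biased GenderBiasWE instance and the debiased copy from the debias example above (the plots are omitted):
>>> from responsibly.we.bias import BiasWordEmbedding
>>> axes, extreme_words = BiasWordEmbedding.plot_most_biased_clustering(
...     biased=w2v_gender_bias_we, debiased=debiased_we)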
WEAT¶
Compute WEAT score of a Word Embedding.
WEAT is a bias measurement method for word embeddings, inspired by the IAT (Implicit Association Test) for humans. It measures the similarity between two sets of target words (e.g., programmer, engineer, scientist, … and nurse, teacher, librarian, …) and two sets of attribute words (e.g., man, male, … and woman, female, …). A p-value is calculated using a permutation test.
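Concretely, following Caliskan et al. (2017): each target word w gets an association score s(w, A, B), the mean cosine similarity to attribute set A minus the mean cosine similarity to attribute set B; the test statistic sums these scores over one target set and subtracts the sum over the other; the effect size is the difference of the mean associations, normalized by the standard deviation across all target words. A standalone NumPy sketch of these formulas, not responsibly's implementation (X, Y, A, B are lists of word vectors):

import numpy as np

def cosine(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # s(w, A, B): mean similarity to attribute set A minus to attribute set B
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_score(X, Y, A, B):
    # s(X, Y, A, B): the WEAT test statistic over the two target sets
    return (sum(association(x, A, B) for x in X)
            - sum(association(y, A, B) for y in Y))

def weat_effect_size(X, Y, A, B):
    # difference of mean associations, normalized by the std over all targets
    x_assoc = [association(x, A, B) for x in X]
    y_assoc = [association(y, A, B) for y in Y]
    return (np.mean(x_assoc) - np.mean(y_assoc)) / np.std(x_assoc + y_assoc)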
- Reference:
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
Important
The effect size and p-value in the WEAT have an entirely different meaning from those reported in IATs (original findings). Refer to the paper for more details.
Stimulus and original finding from:
[0, 1, 2] A. G. Greenwald, D. E. McGhee, J. L. Schwartz, Measuring individual differences in implicit cognition: the implicit association test., Journal of personality and social psychology 74, 1464 (1998).
[3, 4]: M. Bertrand, S. Mullainathan, Are Emily and Greg more employable than Lakisha and Jamal? a field experiment on labor market discrimination, The American Economic Review 94, 991 (2004).
[5, 6, 9]: B. A. Nosek, M. Banaji, A. G. Greenwald, Harvesting implicit group attitudes and beliefs from a demonstration web site., Group Dynamics: Theory, Research, and Practice 6, 101 (2002).
[7]: B. A. Nosek, M. R. Banaji, A. G. Greenwald, Math=male, me=female, therefore math≠me., Journal of Personality and Social Psychology 83, 44 (2002).
[8] P. D. Turney, P. Pantel, From frequency to meaning: Vector space models of semantics, Journal of Artificial Intelligence Research 37, 141 (2010).
- responsibly.we.weat.calc_single_weat(model, first_target, second_target, first_attribute, second_attribute, with_pvalue=True, pvalue_kwargs=None)[source]¶
Calculate the WEAT result of a word embedding.
- Parameters
model – Word embedding model of gensim.model.KeyedVectors
first_target (dict) – First target words list and its name
second_target (dict) – Second target words list and its name
first_attribute (dict) – First attribute words list and its name
second_attribute (dict) – Second attribute words list and its name
with_pvalue (bool) – Whether to calculate the p-value of the WEAT score (might be computationally expensive)
- Returns
WEAT result (score, effect size, Nt, Na and p-value)
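A hedged usage sketch with tiny illustrative word lists (the {'name': ..., 'words': [...]} dictionary structure is an assumption; see responsibly's built-in WEAT data for the exact format; output omitted):
>>> from responsibly.we.weat import calc_single_weat
>>> calc_single_weat(
...     w2v_model,
...     first_target={'name': 'career', 'words': ['executive', 'management', 'salary']},
...     second_target={'name': 'family', 'words': ['home', 'parents', 'children']},
...     first_attribute={'name': 'male', 'words': ['he', 'man', 'his']},
...     second_attribute={'name': 'female', 'words': ['she', 'woman', 'her']})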
- responsibly.we.weat.calc_weat_pleasant_unpleasant_attribute(model, first_target, second_target, with_pvalue=True, pvalue_kwargs=None)[source]¶
Calculate the WEAT result with pleasant vs. unpleasant attributes.
- Parameters
model – Word embedding model of gensim.model.KeyedVectors
first_target (dict) – First target words list and its name
second_target (dict) – Second target words list and its name
with_pvalue (bool) – Whether to calculate the p-value of the WEAT score (might be computationally expensive)
- Returns
WEAT result (score, effect size, Nt, Na and p-value)
- responsibly.we.weat.calc_all_weat(model, weat_data='caliskan', filter_by='model', with_original_finding=False, with_pvalue=True, pvalue_kwargs=None)[source]¶
Calculate the WEAT results of a word embedding on multiple cases.
Note that the effect size and p-value in the WEAT have an entirely different meaning from those reported in IATs (original findings). Refer to the paper for more details.
- Parameters
model – Word embedding model of gensim.model.KeyedVectors
weat_data – WEAT cases data:
If 'caliskan' (default), then all the experiments from the original paper will be used.
If an integer, then the specific experiment by index from the original paper will be used.
If a collection of integers, then the specific experiments by indices from the original paper will be used.
filter_by (str) – Whether to filter the word lists by the model ('model') or by the remove key in weat_data ('data').
with_original_finding (bool) – Whether to show the original finding.
with_pvalue (bool) – Whether to calculate the p-values of the WEAT results (might be computationally expensive)
- Returns
pandas.DataFrame of WEAT results (score, effect size, Nt, Na and p-value)
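A minimal usage sketch running all the cases from Caliskan et al. (the output, a data frame, is omitted):
>>> from responsibly.we.weat import calc_all_weat
>>> calc_all_weat(w2v_model, weat_data='caliskan', with_original_finding=True)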
Utilities¶
- responsibly.we.utils.cosine_similarity(v, u)[source]¶
Calculate the cosine similarity between two vectors.
- responsibly.we.utils.project_reject_vector(v, u)[source]¶
Project and reject the vector v onto direction u.
- responsibly.we.utils.project_params(u, v)[source]¶
Project and reject the vector v onto direction u, also returning the projection scalar.
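For reference, a standalone sketch of the linear algebra behind these two utilities, not the library code itself: the projection of v onto u is (v·u / u·u) u, and the rejection is the orthogonal remainder.

import numpy as np

def project_reject(v, u):
    # decompose v into a component along u and a component orthogonal to u
    projection = ((v @ u) / (u @ u)) * u
    rejection = v - projection
    return projection, rejection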
- responsibly.we.utils.cosine_similarities_by_words(model, word, words)[source]¶
Compute the cosine similarities between a word and a set of other words.
- responsibly.we.utils.most_similar(model, positive=None, negative=None, topn=10, restrict_vocab=None, indexer=None, unrestricted=True)[source]¶
Find the top-N most similar words.
Positive words contribute positively towards the similarity, negative words negatively.
This function computes cosine similarity between a simple mean of the projection weight vectors of the given words and the vectors for each word in the model. The function corresponds to the word-analogy and distance scripts in the original word2vec implementation.
Based on Gensim implementation.
- Parameters
model – Word embedding model of gensim.model.KeyedVectors.
positive (list) – List of words that contribute positively.
negative (list) – List of words that contribute negatively.
topn (int) – Number of top-N similar words to return.
restrict_vocab (int) – Optional integer which limits the range of vectors which are searched for most-similar values. For example, restrict_vocab=10000 would only check the first 10000 word vectors in the vocabulary order. (This may be meaningful if you've sorted the vocabulary by descending frequency.)
unrestricted (bool) – Whether to allow the results to include words from the positive or negative word lists (if False, such words are filtered out, as in gensim's implementation).
- Returns
Sequence of (word, similarity).
Word Embedding Benchmarks¶
Evaluate word embedding by standard benchmarks.
Word Pairs Tasks¶
The WordSimilarity-353 Test Collection http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/
Rubenstein, H., and Goodenough, J. 1965. Contextual correlates of synonymy https://www.seas.upenn.edu/~hansens/conceptSim/
Stanford Rare Word (RW) Similarity Dataset https://nlp.stanford.edu/~lmthang/morphoNLM/
The Word Relatedness Mturk-771 Test Collection http://www2.mta.ac.il/~gideon/datasets/mturk_771.html
The MEN Test Collection http://clic.cimec.unitn.it/~elia.bruni/MEN.html
SimLex-999 https://fh295.github.io/simlex.html
TR9856 https://www.research.ibm.com/haifa/dept/vst/files/IBM_Debater_(R)_TR9856.v2.zip
Analogies Tasks¶
Google Analogies (subset of WordRep) https://code.google.com/archive/p/word2vec/source
MSR - Syntactic Analogies http://research.microsoft.com/en-us/projects/rnn/
- responsibly.we.benchmark.evaluate_word_pairs(model, kwargs_word_pairs=None)[source]¶
Evaluate word pairs tasks.
- Parameters
model – Word embedding.
kwargs_word_pairs – Kwargs for the evaluate_word_pairs method.
- Returns
pandas.DataFrame of evaluation results.
- responsibly.we.benchmark.evaluate_word_analogies(model, kwargs_word_analogies=None)[source]¶
Evaluate word analogies tasks.
- Parameters
model – Word embedding.
kwargs_word_analogies – Kwargs for evaluate_word_analogies method.
- Returns
pandas.DataFrame of evaluation results.
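A minimal usage sketch on the model from the Usage section (the outputs, two data frames, are omitted):
>>> from responsibly.we.benchmark import evaluate_word_pairs, evaluate_word_analogies
>>> evaluate_word_pairs(w2v_model)
>>> evaluate_word_analogies(w2v_model)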