Measuring Biases in Masked Language Models for PyTorch Transformers

Evaluate biases in pre-trained or re-trained masked language models (MLMs), such as those available through HuggingFace, using multiple state-of-the-art methods. A bias score is computed for each bias type in the benchmark datasets CrowS-Pairs (CPS) and StereoSet (SS) (intrasentence), or in a custom line-by-line dataset with a bias_types.txt file listing bias categories plus dis.txt and adv.txt files forming sentence pairs, where dis.txt contains sentences biased against disadvantaged groups (stereotypical) and adv.txt contains sentences biased against advantaged groups (anti-stereotypical). You can also compare relative bias between two MLMs, e.g. a re-trained MLM against its pre-trained base.
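Assuming the custom dataset layout described above, a minimal dataset directory could be assembled as follows (the file names come from the text; the directory path, bias category, and sentences are hypothetical examples):

```python
from pathlib import Path

# Hypothetical custom dataset directory for mlm-bias.
data_dir = Path("data/custom")
data_dir.mkdir(parents=True, exist_ok=True)

# bias_types.txt: one bias category per line.
(data_dir / "bias_types.txt").write_text("gender\n")

# dis.txt: stereotypical sentences (biased against disadvantaged groups).
(data_dir / "dis.txt").write_text("Women are bad at math.\n")

# adv.txt: anti-stereotypical counterparts, aligned line by line with dis.txt
# so that each line pair forms one dis/adv sentence pair.
(data_dir / "adv.txt").write_text("Men are bad at math.\n")
```

Each line of dis.txt is paired with the same-numbered line of adv.txt to form a sentence pair for evaluation.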

Evaluation Methods

Bias scores for an MLM are computed from the implemented measures, evaluated over sentence pairs in the dataset.

The measures below are computed with an iterative masking experiment, in which one token is masked at a time until every token has been masked once (yielding n sets of logits, or predictions, for a sentence with n tokens); see the current citation in Citation. These measures consequently take longer to compute. Each measure is defined to represent MLM preference (or prediction quality). For a sentence pair, bias against the disadvantaged group is indicated by a higher relative measure value for the adv sentence than for the dis sentence.

  • CRR (difference between the reciprocal rank of the predicted token, which is always 1, and the reciprocal rank of the masked token)
  • CRRA (CRR with Attention weights)
  • ΔP (difference in log-likelihood between the predicted token and the masked token)
  • ΔPA (ΔP with Attention weights)
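As a minimal sketch of how these scores fall out of the iterative masking experiment, the snippet below computes CRR and ΔP at a single masked position from a toy probability distribution (the vocabulary and probabilities are made up; a real run uses the MLM's softmaxed logits at each masked position, and aggregates over all n masked variants of the sentence):

```python
import math

# Toy distribution over a tiny vocabulary at one masked position.
probs = {"doctor": 0.60, "nurse": 0.25, "teacher": 0.10, "pilot": 0.05}
masked_token = "nurse"  # the token that was actually masked out

# Rank tokens by probability, highest first.
ranking = sorted(probs, key=probs.get, reverse=True)

# CRR: reciprocal rank of the predicted (top) token, which is always 1,
# minus the reciprocal rank of the masked token.
rank_of_masked = ranking.index(masked_token) + 1
crr = 1.0 - 1.0 / rank_of_masked

# ΔP: log-likelihood of the predicted token minus that of the masked token.
predicted = ranking[0]
delta_p = math.log(probs[predicted]) - math.log(probs[masked_token])

print(crr)      # 0.5 (masked token ranked 2nd: 1 - 1/2)
print(delta_p)  # ≈ 0.875 (ln 0.60 - ln 0.25)
```

Both quantities are zero when the MLM's top prediction is the masked token itself, and grow as the masked token becomes less preferred.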

Measures computed with a single encoded input (see citations in References for more details):

  • AUL (All Unmasked Likelihood)
  • AULA (AUL with Attention weights)
  • CSPS (CrowS-Pairs Score)
  • SSS (StereoSet Score)
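These single-input measures avoid the per-token masking loop: following the AUL formulation of Kaneko and Bollegala (cited in References), a sentence is scored in one forward pass as the average log-likelihood of its own tokens. A toy sketch (the per-token probabilities are hypothetical stand-ins for softmaxed MLM logits):

```python
import math

# Toy per-token probabilities from ONE forward pass over the unmasked
# sentence: p(token_i | full sentence) for each position i.
token_probs = [0.9, 0.4, 0.7, 0.8]

# AUL-style score: mean log-likelihood of the sentence's own tokens.
aul = sum(math.log(p) for p in token_probs) / len(token_probs)

# Comparing this score between the dis and adv sentences of a pair
# indicates which sentence the MLM prefers, with no masking passes.
```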

Setup

pip install mlm-bias

import mlm_bias

# Load the CrowS-Pairs benchmark and sample the first 10 examples.
cps_dataset = mlm_bias.BiasBenchmarkDataset("cps")
cps_dataset.sample(indices=list(range(10)))

# Evaluate bias measures for an MLM. The result variable is named
# bias_mlm so the mlm_bias module is not shadowed.
model = "bert-base-uncased"
bias_mlm = mlm_bias.BiasMLM(model, cps_dataset)
result = bias_mlm.evaluate(inc_attention=True)
result.save("./bert-base-uncased")

Example Script

Clone this repo:

git clone https://github.com/zalkikar/mlm-bias.git
cd mlm-bias
python3 -m pip install .

Using the mlm_bias.py example script:

mlm_bias.py [-h] --data {cps,ss,custom} --model MODEL [--model2 MODEL2] [--output OUTPUT] [--measures {all,crr,crra,dp,dpa,aul,aula,csps,sss}] [--start S] [--end E]
# single mlm
python3 mlm_bias.py --data cps --model roberta-base --start 0 --end 30
python3 mlm_bias.py --data ss --model bert-base-uncased --start 0 --end 30
# relative
python3 mlm_bias.py --data cps --model roberta-base --start 0 --end 30 --model2 bert-base-uncased
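The relative comparison produced by --model2 can be thought of as a per-measure difference of bias scores between the two MLMs (a simplified sketch; the scores here are hypothetical and the package's actual aggregation may differ):

```python
# Hypothetical per-measure bias scores for two MLMs on a 0-100 scale.
scores_a = {"CRR": 50.0, "CRRA": 53.3, "DP": 56.7}   # e.g. roberta-base
scores_b = {"CRR": 48.0, "CRRA": 55.0, "DP": 52.0}   # e.g. bert-base-uncased

# Relative bias: positive values mean model A scores higher on that measure.
relative = {m: round(scores_a[m] - scores_b[m], 3) for m in scores_a}
print(relative)  # {'CRR': 2.0, 'CRRA': -1.7, 'DP': 4.7}
```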

With default arguments:

  • /data will have cps.csv (CPS) and/or ss.csv (SS)
  • /eval will have out.txt with computed measures and pickled results objects

Example command output:

Created output directory.
Created Data Directory |██████████████████████████████| 1/1 [100%] in 0s ETA: 0s
Downloaded Data [CrowSPairs] |██████████████████████████████| 1/1 [100%] in 0s ETA: 0s
Loaded Data [CrowSPairs] |██████████████████████████████| 1/1 [100%] in 0s ETA: 0s
Evaluating Bias [roberta-base] |██████████████████████████████| 30/30 [100%] in 2m 46s ETA: 0s
Saved bias results for roberta-base in ./eval/roberta-base
Saved scores in ./eval/out.txt
--------------------------------------------------
MLM: roberta-base
CRR total = 50.0
CRRA total = 53.333
ΔP total = 56.667
ΔPA total = 56.667
AUL total = 76.667
AULA total = 70.0
SSS total = 53.333
CSPS total = 63.33

Citation

If using this for research, please cite the following:

@misc{zalkikar-chandra-2024,
    author  = {Rahul Zalkikar and Kanchan Chandra},
    title   = {Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality},
    year    = {2024}
}

References

@InProceedings{Kaneko:AUL:2022,
  author={Masahiro Kaneko and Danushka Bollegala},
  title={Unmasking the Mask -- Evaluating Social Biases in Masked Language Models},
  booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence},
  year      = {2022},
  month     = {February},
  address   = {Vancouver, BC, Canada}
}
@article{salutari-etal-2023,
  author       = {Flavia Salutari and Jerome Ramos and Hosein A. Rahmani and Leonardo Linguaglossa and Aldo Lipani},
  title        = {Quantifying the Bias of Transformer-Based Language Models for African American English in Masked Language Modeling},
  journal      = {The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2023},
  month        = may,
  year         = {2023},
  url          = {https://telecom-paris.hal.science/hal-04067844},
}
@inproceedings{nangia-etal-2020-crows,
    title = "{C}row{S}-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models",
    author = "Nangia, Nikita  and
      Vania, Clara  and
      Bhalerao, Rasika  and
      Bowman, Samuel R.",
    editor = "Webber, Bonnie  and
      Cohn, Trevor  and
      He, Yulan  and
      Liu, Yang",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.emnlp-main.154",
    doi = "10.18653/v1/2020.emnlp-main.154",
    pages = "1953--1967",
    abstract = "Pretrained language models, especially masked language models (MLMs) have seen success across many NLP tasks. However, there is ample evidence that they use the cultural biases that are undoubtedly present in the corpora they are trained on, implicitly creating harm with biased representations. To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs). CrowS-Pairs has 1508 examples that cover stereotypes dealing with nine types of bias, like race, religion, and age. In CrowS-Pairs a model is presented with two sentences: one that is more stereotyping and another that is less stereotyping. The data focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. We find that all three of the widely-used MLMs we evaluate substantially favor sentences that express stereotypes in every category in CrowS-Pairs. As work on building less biased models advances, this dataset can be used as a benchmark to evaluate progress.",
}
@misc{nadeem2020stereoset,
    title={StereoSet: Measuring stereotypical bias in pretrained language models},
    author={Moin Nadeem and Anna Bethke and Siva Reddy},
    year={2020},
    eprint={2004.09456},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}