COFFEE: Counterfactual Fairness for Personalized Text Generation in
Explainable Recommendation
- URL: http://arxiv.org/abs/2210.15500v2
- Date: Sun, 22 Oct 2023 23:12:51 GMT
- Title: COFFEE: Counterfactual Fairness for Personalized Text Generation in
Explainable Recommendation
- Authors: Nan Wang, Qifan Wang, Yi-Chia Wang, Maziar Sanjabi, Jingzhou Liu,
Hamed Firooz, Hongning Wang, Shaoliang Nie
- Abstract summary: Bias inherent in user-written text can associate different levels of linguistic quality with users' protected attributes.
We introduce a general framework to achieve measure-specific counterfactual fairness in explanation generation.
- Score: 56.520470678876656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As language models become increasingly integrated into our digital lives,
Personalized Text Generation (PTG) has emerged as a pivotal component with a
wide range of applications. However, the bias inherent in user-written text,
often used for PTG model training, can inadvertently associate different levels
of linguistic quality with users' protected attributes. The model can inherit
the bias and perpetuate inequality in generating text w.r.t. users' protected
attributes, leading to unfair treatment when serving users. In this work, we
investigate fairness of PTG in the context of personalized explanation
generation for recommendations. We first discuss the biases in generated
explanations and their fairness implications. To promote fairness, we introduce
a general framework to achieve measure-specific counterfactual fairness in
explanation generation. Extensive experiments and human evaluations demonstrate
the effectiveness of our method.
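Concretely, one way to read this fairness notion: for a given user and item, generating the explanation with the protected attribute as observed versus counterfactually flipped should yield the same level of quality under the chosen measure. The sketch below is a toy illustration of that check with hypothetical stand-ins (generate_explanation, quality, protected_attr); it is not the COFFEE framework itself.

```python
def generate_explanation(user_repr: dict, item_id: str) -> str:
    """Hypothetical explanation generator standing in for a trained PTG model."""
    style = "detailed, fluent" if user_repr["protected_attr"] == "A" else "short"
    return f"A {style} explanation of why item {item_id} suits this user."

def quality(text: str) -> float:
    """Toy quality measure (token count); any measure of interest could go here."""
    return float(len(text.split()))

def counterfactual_quality_gap(user_repr: dict, item_id: str, flipped_attr: str) -> float:
    """Gap in the quality measure between factual and counterfactual generations.

    Measure-specific counterfactual fairness asks this gap to be (near) zero in
    expectation: flipping the protected attribute, all else equal, should not
    change the linguistic quality of the explanation the user receives.
    """
    factual = quality(generate_explanation(user_repr, item_id))
    cf_user = dict(user_repr, protected_attr=flipped_attr)  # counterfactual intervention
    counterfactual = quality(generate_explanation(cf_user, item_id))
    return factual - counterfactual

if __name__ == "__main__":
    user = {"id": "u1", "protected_attr": "A"}
    print(counterfactual_quality_gap(user, "item_42", flipped_attr="B"))  # ideally near 0
```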
Related papers
- GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes [2.2162879952427343]
This paper introduces GUS-Net, an innovative approach to bias detection.
GUS-Net focuses on three key types of biases: (G)eneralizations, (U)nfairness, and (S)tereotypes.
Our methodology enhances traditional bias detection methods by incorporating the contextual encodings of pre-trained models.
arXiv Detail & Related papers (2024-10-10T21:51:22Z)
- Distributionally Generative Augmentation for Fair Facial Attribute Classification [69.97710556164698]
Facial Attribute Classification (FAC) holds substantial promise in widespread applications.
FAC models trained by traditional methodologies can be unfair by exhibiting accuracy inconsistencies across varied data subpopulations.
This work proposes a novel, generation-based two-stage framework to train a fair FAC model on biased data without additional annotation.
arXiv Detail & Related papers (2024-03-11T10:50:53Z)
- GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models [83.30078426829627]
Large language models (LLMs) have gained popularity and are being widely adopted by a large user community.
The existing evaluation methods have many constraints, and their results exhibit a limited degree of interpretability.
We propose a bias evaluation framework named GPTBIAS that leverages the high performance of LLMs to assess bias in models.
arXiv Detail & Related papers (2023-12-11T12:02:14Z)
- All Should Be Equal in the Eyes of Language Models: Counterfactually Aware Fair Text Generation [16.016546693767403]
We propose a framework that dynamically compares the model's understanding of diverse demographics to generate more equitable sentences.
CAFIE produces fairer text and strikes the best balance between fairness and language modeling capability.
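As a rough illustration of the counterfactual comparison idea (with made-up prompts and a uniform combination rule rather than CAFIE's actual scoring), one can score the next token under prompts that differ only in the demographic mention and combine the resulting distributions before decoding:

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def combined_next_token_probs(logits_per_group: list) -> np.ndarray:
    """Average next-token distributions computed under counterfactual prompts,
    so no single group's context dominates the sampled continuation."""
    probs = np.stack([softmax(l) for l in logits_per_group])
    return probs.mean(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vocab_size = 8
    # Hypothetical next-token logits for "The {group} worked as a ..." with two groups.
    logits_a = rng.normal(size=vocab_size)
    logits_b = rng.normal(size=vocab_size)
    print(combined_next_token_probs([logits_a, logits_b]).round(3))
```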
arXiv Detail & Related papers (2023-11-09T15:39:40Z)
- Toward Fairness in Text Generation via Mutual Information Minimization based on Importance Sampling [23.317845744611375]
We propose to minimize the mutual information between the semantics in the generated text sentences and their demographic polarity.
In this way, the mention of a demographic group is encouraged to be independent of how it is described in the generated text.
We also propose a distillation mechanism that preserves the language modeling ability of the PLMs after debiasing.
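For intuition on the quantity being driven down (the training-time bound and the importance sampling machinery are omitted), here is a plug-in estimate of the mutual information between which group a generated sentence mentions and a discretized polarity label for how the group is described; the labels are hypothetical:

```python
import math
from collections import Counter

def plug_in_mutual_information(pairs):
    """Plug-in MI (in nats) between group and polarity labels; 0 means independent."""
    n = len(pairs)
    joint = Counter(pairs)
    group_marginal = Counter(g for g, _ in pairs)
    polarity_marginal = Counter(p for _, p in pairs)
    mi = 0.0
    for (g, p), count in joint.items():
        p_joint = count / n
        p_indep = (group_marginal[g] / n) * (polarity_marginal[p] / n)
        mi += p_joint * math.log(p_joint / p_indep)
    return mi

if __name__ == "__main__":
    # Hypothetical (group mentioned, polarity of description) labels from generations.
    samples = [("group_a", "pos"), ("group_a", "neg"), ("group_a", "neg"),
               ("group_b", "pos"), ("group_b", "pos"), ("group_b", "neg")]
    print(f"{plug_in_mutual_information(samples):.4f}")
```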
arXiv Detail & Related papers (2023-02-25T18:29:02Z)
- Fair NLP Models with Differentially Private Text Encoders [1.7434507809930746]
We propose FEDERATE, an approach that combines ideas from differential privacy and adversarial training to learn private text representations.
We empirically evaluate the trade-off between the privacy of the representations and the fairness and accuracy of the downstream model on four NLP datasets.
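A very small sketch of the privacy side of this combination (the adversarial fairness head and any formal privacy accounting are omitted; the clipping bound and noise scale are illustrative, not calibrated parameters):

```python
import numpy as np

def privatize_representation(rep: np.ndarray, clip_norm: float = 1.0,
                             noise_std: float = 0.5, seed: int = 0) -> np.ndarray:
    """Clip the encoder output's norm and add Gaussian noise before it is
    passed to downstream task (and adversarial) heads."""
    rng = np.random.default_rng(seed)
    scale = min(1.0, clip_norm / (np.linalg.norm(rep) + 1e-12))
    return rep * scale + rng.normal(scale=noise_std, size=rep.shape)

if __name__ == "__main__":
    encoder_output = np.random.default_rng(1).normal(size=16)  # stand-in for a text encoder output
    print(privatize_representation(encoder_output)[:4].round(3))
```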
arXiv Detail & Related papers (2022-05-12T14:58:38Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
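As a simplified, finite-difference version of the idea (the actual metric accumulates weighted feature-wise gradients of the prediction), one can accumulate how much a classifier's output moves when a protected input feature is nudged; the toy classifier and feature layout are hypothetical:

```python
import numpy as np

def predict(x: np.ndarray, w: np.ndarray) -> float:
    """Toy logistic-regression classifier."""
    return 1.0 / (1.0 + np.exp(-float(x @ w)))

def accumulated_sensitivity(X: np.ndarray, w: np.ndarray,
                            protected_idx: int, eps: float = 1e-3) -> float:
    """Mean |change in prediction| per unit perturbation of the protected feature."""
    total = 0.0
    for x in X:
        x_pert = x.copy()
        x_pert[protected_idx] += eps
        total += abs(predict(x_pert, w) - predict(x, w)) / eps
    return total / len(X)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 4))
    w = np.array([0.8, -0.3, 0.5, 0.0])  # zero weight on the protected feature -> low sensitivity
    print(f"{accumulated_sensitivity(X, w, protected_idx=3):.6f}")
```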
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- The Authors Matter: Understanding and Mitigating Implicit Bias in Deep Text Classification [36.361778457307636]
Deep text classification models can produce biased outcomes for texts written by authors of certain demographic groups.
In this paper, we first demonstrate that implicit bias exists in different text classification tasks for different demographic groups.
We then build a learning-based interpretation method to deepen our knowledge of implicit bias.
arXiv Detail & Related papers (2021-05-06T16:17:38Z)
- Towards Controllable Biases in Language Generation [87.89632038677912]
We develop a method to induce societal biases in generated text when input prompts contain mentions of specific demographic groups.
We analyze two scenarios: 1) inducing negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between demographics.
arXiv Detail & Related papers (2020-05-01T08:25:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.