Bias in Opinion Summarisation from Pre-training to Adaptation: A Case
Study in Political Bias
- URL: http://arxiv.org/abs/2402.00322v1
- Date: Thu, 1 Feb 2024 04:15:59 GMT
- Title: Bias in Opinion Summarisation from Pre-training to Adaptation: A Case
Study in Political Bias
- Authors: Nannan Huang, Haytham Fayek, Xiuzhen Zhang
- Abstract summary: Opinion summarisation aims to summarise the salient information and opinions presented in documents such as product reviews, discussion forums, and social media texts.
Generating biased summaries risks swaying public opinion.
- Score: 4.964212137957899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Opinion summarisation aims to summarise the salient information and opinions
presented in documents such as product reviews, discussion forums, and social
media texts into short summaries that enable users to effectively understand
the opinions therein. Generating biased summaries risks swaying public
opinion. Previous studies have examined bias in opinion summarisation using
extractive models, but little research has paid attention
to abstractive summarisation models. In this study, using political bias as a
case study, we first establish a methodology to quantify bias in abstractive
models, then trace it from the pre-trained models to the task of summarising
social media opinions using different models and adaptation methods. We find
that most models exhibit intrinsic bias. Using a social media text
summarisation dataset and contrasting various adaptation methods, we find that
models tuned with a smaller number of parameters exhibit less bias than models
adapted through standard fine-tuning; however, the diversity of topics in the
training data used for fine-tuning remains critical.
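The abstract does not spell out the measurement pipeline, but a common way to quantify political bias in generated summaries is to score each text with a political-leaning classifier and compare the leaning distribution of the summaries against that of the source opinions. A minimal sketch of that idea (the keyword `classify` stub and the left/center/right label set are illustrative assumptions, not the authors' setup):

```python
from collections import Counter
import math

LABELS = ("left", "center", "right")

def leaning_distribution(texts, classify):
    """Empirical distribution of predicted political leanings."""
    counts = Counter(classify(t) for t in texts)
    total = sum(counts.values())
    return {label: counts[label] / total for label in LABELS}

def bias_shift(source_texts, summaries, classify):
    """KL divergence between the leaning mix of the sources and of the
    summaries: near zero means the summariser preserves the opinion
    mix; large values signal that it skews it."""
    p = leaning_distribution(source_texts, classify)
    q = leaning_distribution(summaries, classify)
    eps = 1e-9  # smooth empty buckets
    return sum((p[l] + eps) * math.log((p[l] + eps) / (q[l] + eps))
               for l in LABELS)

# Placeholder classifier; in practice this would be a fine-tuned
# political-stance model rather than a keyword rule.
def classify(text):
    t = text.lower()
    if "tax cuts" in t:
        return "right"
    if "climate action" in t:
        return "left"
    return "center"
```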
Related papers
- Leveraging Prototypical Representations for Mitigating Social Bias without Demographic Information [50.29934517930506]
DAFair is a novel approach to address social bias in language models.
We leverage prototypical demographic texts and incorporate a regularization term during the fine-tuning process to mitigate bias.
arXiv Detail & Related papers (2024-03-14T15:58:36Z)
- Revisiting Zero-Shot Abstractive Summarization in the Era of Large Language Models from the Perspective of Position Bias [13.828653029379257]
We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias.
Position bias captures the tendency of a model to unfairly prioritize information from certain parts of the input text over others, leading to undesirable behavior.
Our findings lead to novel insights and discussion on performance and position bias of models for zero-shot summarization tasks.
arXiv Detail & Related papers (2024-01-03T21:38:40Z)
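One way to operationalize position bias, in the spirit of the entry above, is to trace each summary sentence back to its most lexically similar source sentence and check how much of the summary is drawn from the lead of the document. A minimal sketch (the unigram-overlap similarity and the lead-k cutoff are assumptions for illustration, not the paper's metric):

```python
def positions_used(source_sents, summary_sents):
    """Index of the most similar source sentence for each summary
    sentence, using Jaccard unigram overlap as a cheap proxy."""
    def overlap(a, b):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / max(len(wa | wb), 1)
    return [max(range(len(source_sents)),
                key=lambda i: overlap(source_sents[i], s))
            for s in summary_sents]

def lead_fraction(source_sents, summary_sents, k=3):
    """Share of summary sentences traced to the first k source
    sentences; values near 1.0 indicate strong lead bias."""
    idxs = positions_used(source_sents, summary_sents)
    return sum(i < k for i in idxs) / max(len(idxs), 1)
```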
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- Bias in News Summarization: Measures, Pitfalls and Corpora [4.917075909999548]
We introduce definitions for biased behaviours in summarization models, along with practical operationalizations.
We measure gender bias in English summaries generated by both purpose-built summarization models and general purpose chat models.
We find content selection in single document summarization to be largely unaffected by gender bias, while hallucinations exhibit evidence of bias.
arXiv Detail & Related papers (2023-09-14T22:20:27Z)
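A crude way to operationalize gender bias in content selection is to compare the balance of gendered pronouns in a summary against its source document; the paper's measures are more careful than this, so treat the following purely as an illustrative sketch:

```python
import re

FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}

def gender_ratio(text):
    """Fraction of gendered pronouns that are feminine, or None if
    the text contains no gendered pronouns."""
    tokens = re.findall(r"[a-z']+", text.lower())
    f = sum(t in FEMININE for t in tokens)
    m = sum(t in MASCULINE for t in tokens)
    return f / (f + m) if f + m else None

def selection_skew(document, summary):
    """Positive if the summary over-represents feminine mentions
    relative to its source, negative if it under-represents them."""
    doc_r, sum_r = gender_ratio(document), gender_ratio(summary)
    if doc_r is None or sum_r is None:
        return None
    return sum_r - doc_r
```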
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
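The projection step named above has a simple linear-algebra core: estimate bias directions from prompt pairs that differ only in the protected attribute, then map every embedding onto the orthogonal complement of their span. A sketch of that core (the calibrated projection matrix the paper derives is omitted; dimensions and the random data are placeholders):

```python
import numpy as np

def debias_projection(bias_pairs):
    """Build P = I - V^T V, where the rows of V form an orthonormal
    basis of the bias subspace spanned by embedding differences.

    bias_pairs: (embedding_a, embedding_b) tuples for prompts that
    differ only in the biased attribute, e.g. "a photo of a man" /
    "a photo of a woman".
    """
    directions = np.stack([a - b for a, b in bias_pairs])
    _, _, vt = np.linalg.svd(directions, full_matrices=False)
    d = directions.shape[1]
    return np.eye(d) - vt.T @ vt

# Toy usage with random stand-ins for real text embeddings.
rng = np.random.default_rng(0)
pairs = [(rng.normal(size=512), rng.normal(size=512)) for _ in range(8)]
P = debias_projection(pairs)
debiased = P @ rng.normal(size=512)  # apply to any text embedding
```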
- NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias [54.89737992911079]
We propose a new task: generating a neutral summary from multiple news headlines spanning the political spectrum.
One of the most interesting observations is that generation models can hallucinate not only factually inaccurate or unverifiable content, but also politically biased content.
arXiv Detail & Related papers (2022-04-11T07:06:01Z)
- The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and identify potential causes for social bias in downstream tasks.
arXiv Detail & Related papers (2022-03-28T09:28:13Z)
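The exact SAME formulation is not reproduced here; as a stand-in, the following generic cosine-based association score (a WEAT-style mean difference) illustrates what a "cosine based bias score" computes over word embeddings:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association_bias(word_vec, attr_a, attr_b):
    """Mean cosine similarity of a word vector to attribute set A
    minus that to attribute set B: positive values associate the
    word with A, negative values with B."""
    sim_a = np.mean([cosine(word_vec, a) for a in attr_a])
    sim_b = np.mean([cosine(word_vec, b) for b in attr_b])
    return float(sim_a - sim_b)
```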
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author, such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
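Instance reweighting of this kind typically gives each training example an inverse-frequency weight over (label, demographic) cells so that no author group dominates the loss. A minimal sketch of that idea (the paper's exact weighting scheme may differ):

```python
from collections import Counter

def balancing_weights(labels, demographics):
    """Inverse-frequency weight per instance so that every
    (label, demographic) cell contributes equally to the loss."""
    cells = list(zip(labels, demographics))
    counts = Counter(cells)
    n, k = len(cells), len(counts)
    return [n / (k * counts[c]) for c in cells]

# Toy usage: male-authored positive examples dominate, so each one
# gets a smaller weight than the rarer cells.
w = balancing_weights(
    labels=["pos", "pos", "pos", "neg"],
    demographics=["m", "m", "f", "f"])
print(w)  # [0.666..., 0.666..., 1.333..., 1.333...]
```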
- Subjective Bias in Abstractive Summarization [11.675414451656568]
We formulate the differences among the many possible expressions that summarize the same content as subjective bias, and examine the role of this bias in abstractive summarization.
Results of summarization models trained on style-clustered datasets show that there are certain types of styles that lead to better convergence, abstraction and generalization.
arXiv Detail & Related papers (2021-06-18T12:17:55Z)
- RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models [37.98671828283487]
Text representation models are prone to exhibit a range of societal biases.
Recent work has predominantly focused on measuring and mitigating bias in pretrained language models.
We present RedditBias, the first conversational data set grounded in actual human conversations from Reddit.
arXiv Detail & Related papers (2021-06-07T11:22:39Z)
- Inflating Topic Relevance with Ideology: A Case Study of Political Ideology Bias in Social Topic Detection Models [16.279854003220418]
We investigate the impact of political ideology biases in training data.
Our work highlights the susceptibility of large, complex models to propagating the biases from human-selected input.
As a way to mitigate the bias, we propose to learn a text representation that is invariant to political ideology while still judging topic relevance.
arXiv Detail & Related papers (2020-11-29T05:54:03Z)
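Learning a representation that is invariant to political ideology while still predicting topic relevance is commonly done adversarially, e.g. by training an ideology classifier through a gradient-reversal layer; the paper's actual method may differ, so the PyTorch sketch below (with illustrative dimensions and heads) shows only the generic pattern:

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses and scales gradients
    in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

class InvariantTopicModel(nn.Module):
    def __init__(self, dim=768, hidden=256, n_ideologies=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.topic_head = nn.Linear(hidden, 1)               # topic relevance
        self.ideology_head = nn.Linear(hidden, n_ideologies) # adversary

    def forward(self, x, lam=1.0):
        z = self.encoder(x)
        relevance = self.topic_head(z)
        # The adversary tries to recover ideology; the reversed
        # gradient pushes the encoder to discard that information.
        ideology = self.ideology_head(GradReverse.apply(z, lam))
        return relevance, ideology
```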
This list is automatically generated from the titles and abstracts of the papers on this site.