Bias in News Summarization: Measures, Pitfalls and Corpora
- URL: http://arxiv.org/abs/2309.08047v3
- Date: Thu, 6 Jun 2024 11:49:26 GMT
- Title: Bias in News Summarization: Measures, Pitfalls and Corpora
- Authors: Julius Steen, Katja Markert
- Abstract summary: We introduce definitions for biased behaviours in summarization models, along with practical operationalizations.
We measure gender bias in English summaries generated by both purpose-built summarization models and general purpose chat models.
We find content selection in single document summarization to be largely unaffected by gender bias, while hallucinations exhibit evidence of bias.
- Score: 4.917075909999548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Summarization is an important application of large language models (LLMs). Most previous evaluation of summarization models has focused on their content selection, faithfulness, grammaticality and coherence. However, it is well known that LLMs can reproduce and reinforce harmful social biases. This raises the question: Do biases affect model outputs in a constrained setting like summarization? To help answer this question, we first motivate and introduce a number of definitions for biased behaviours in summarization models, along with practical operationalizations. Since we find that biases inherent to input documents can confound bias analysis in summaries, we propose a method to generate input documents with carefully controlled demographic attributes. This allows us to study summarizer behavior in a controlled setting, while still working with realistic input documents. We measure gender bias in English summaries generated by both purpose-built summarization models and general purpose chat models as a case study. We find content selection in single document summarization to be largely unaffected by gender bias, while hallucinations exhibit evidence of bias. To demonstrate the generality of our approach, we additionally investigate racial bias, including intersectional settings.
Related papers
- Mitigating Gender Bias in Contextual Word Embeddings [1.208453901299241]
We propose a novel objective function for MLM (Masked-Language Modeling) which largely mitigates the gender bias in contextual embeddings.
We also propose new methods for debiasing static embeddings and provide empirical proof via extensive analysis and experiments.
arXiv Detail & Related papers (2024-11-18T21:36:44Z)
- On Positional Bias of Faithfulness for Long-form Summarization [83.63283027830657]
Large Language Models (LLMs) often exhibit positional bias in long-context settings, under-attending to information in the middle of inputs.
We investigate the presence of this bias in long-form summarization, its impact on faithfulness, and various techniques to mitigate this bias.
arXiv Detail & Related papers (2024-10-31T03:50:15Z)
- Understanding Position Bias Effects on Fairness in Social Multi-Document Summarization [1.9950682531209158]
We investigate the effect of group ordering in input documents when summarizing tweets from three linguistic communities.
Our results suggest that position bias manifests differently in social multi-document summarization.
arXiv Detail & Related papers (2024-05-03T00:19:31Z)
- Bias in Opinion Summarisation from Pre-training to Adaptation: A Case Study in Political Bias [4.964212137957899]
Opinion summarisation aims to summarise the salient information and opinions presented in documents such as product reviews, discussion forums, and social media texts.
Generating biased summaries risks swaying public opinion.
arXiv Detail & Related papers (2024-02-01T04:15:59Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mitigating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- Fair Abstractive Summarization of Diverse Perspectives [103.08300574459783]
A fair summary should provide a comprehensive coverage of diverse perspectives without underrepresenting certain groups.
We first formally define fairness in abstractive summarization as not underrepresenting perspectives of any groups of people.
We propose four reference-free automatic metrics by measuring the differences between target and source perspectives.
arXiv Detail & Related papers (2023-11-14T03:38:55Z)
- On Context Utilization in Summarization with Large Language Models [83.84459732796302]
Large language models (LLMs) excel in abstractive summarization tasks, delivering fluent and pertinent summaries.
Recent advancements have extended their capabilities to handle long-input contexts, exceeding 100k tokens.
We conduct the first comprehensive study on context utilization and position bias in summarization.
arXiv Detail & Related papers (2023-10-16T16:45:12Z)
- Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias [33.99768156365231]
We introduce a causal formulation for bias measurement in generative language models.
We propose a benchmark called OccuGender, with a bias-measuring procedure to investigate occupational gender bias.
The results show that these models exhibit substantial occupational gender bias.
arXiv Detail & Related papers (2022-12-20T22:41:24Z)
- Correcting Diverse Factual Errors in Abstractive Summarization via Post-Editing and Language Model Infilling [56.70682379371534]
We show that our approach vastly outperforms prior methods in correcting erroneous summaries.
Our model -- FactEdit -- improves factuality scores by over 11 points on CNN/DM and over 31 points on XSum.
arXiv Detail & Related papers (2022-10-22T07:16:19Z)
- The Birth of Bias: A case study on the evolution of gender bias in an English language model [1.6344851071810076]
We use a relatively small language model with an LSTM architecture, trained on an English Wikipedia corpus.
We find that the representation of gender is dynamic and identify different phases during training.
We show that gender information is represented increasingly locally in the input embeddings of the model.
arXiv Detail & Related papers (2022-07-21T00:59:04Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.