A Domain-adaptive Pre-training Approach for Language Bias Detection in
News
- URL: http://arxiv.org/abs/2205.10773v1
- Date: Sun, 22 May 2022 08:18:19 GMT
- Title: A Domain-adaptive Pre-training Approach for Language Bias Detection in
News
- Authors: Jan-David Krieger, Timo Spinde, Terry Ruas, Juhi Kulshrestha, Bela
Gipp
- Abstract summary: We present DA-RoBERTa, a new state-of-the-art transformer-based model adapted to the media bias domain.
We also train DA-BERT and DA-BART, two more transformer models adapted to the bias domain.
Our proposed domain-adapted models outperform prior bias detection approaches on the same data.
- Score: 3.7238620986236373
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Media bias is a multi-faceted construct influencing individual behavior and
collective decision-making. Slanted news reporting is the result of one-sided
and polarized writing which can occur in various forms. In this work, we focus
on an important form of media bias, i.e. bias by word choice. Detecting biased
word choices is a challenging task due to its linguistic complexity and the
lack of representative gold-standard corpora. We present DA-RoBERTa, a new
state-of-the-art transformer-based model adapted to the media bias domain which
identifies sentence-level bias with an F1 score of 0.814. In addition, we
train DA-BERT and DA-BART, two more transformer models adapted to the bias
domain. Our proposed domain-adapted models outperform prior bias detection
approaches on the same data.
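Domain-adaptive pre-training typically continues masked-language-model training on in-domain text (here, news) before fine-tuning on the bias-classification task. Below is a minimal, library-free sketch of the BERT-style 80/10/10 token-masking step at the heart of that MLM stage; the toy vocabulary and sentence are illustrative assumptions, not the paper's setup:

```python
import random

MASK = "[MASK]"
# Hypothetical stand-in vocabulary for random-token replacement.
VOCAB = ["news", "media", "bias", "report", "claims"]

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """BERT-style masking: select ~15% of tokens; of those,
    80% become [MASK], 10% a random token, 10% stay unchanged.
    Returns the corrupted sequence and per-position labels
    (the original token where a prediction is required, else None)."""
    rng = random.Random(seed)
    masked, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok  # model must predict the original token here
            r = rng.random()
            if r < 0.8:
                masked[i] = MASK
            elif r < 0.9:
                masked[i] = rng.choice(VOCAB)
            # else: keep the token unchanged (the 10% identity case)
    return masked, labels

tokens = "the outlet claims the policy failed".split()
masked, labels = mask_tokens(tokens)
print(masked)
print(labels)
```

In practice this corruption is applied on the fly to large in-domain corpora, and the adapted checkpoint is then fine-tuned for sentence-level bias classification.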
Related papers
- Is There a One-Model-Fits-All Approach to Information Extraction? Revisiting Task Definition Biases [62.806300074459116]
Definition bias is a negative phenomenon that can mislead models.
We identify two types of definition bias in IE: bias among information extraction datasets and bias between information extraction datasets and instruction tuning datasets.
We propose a multi-stage framework consisting of definition bias measurement, bias-aware fine-tuning, and task-specific bias mitigation.
arXiv Detail & Related papers (2024-03-25T03:19:20Z)
- Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video [67.24316233946381]
Temporal Sentence Grounding in Video (TSGV) is troubled by dataset bias issue.
We propose the bias-conflict sample synthesis and adversarial removal debias strategy (BSSARD)
arXiv Detail & Related papers (2024-01-15T09:59:43Z)
- Mitigating Bias for Question Answering Models by Tracking Bias Influence [84.66462028537475]
We propose BMBI, an approach to mitigate the bias of multiple-choice QA models.
Based on the intuition that a model would lean to be more biased if it learns from a biased example, we measure the bias level of a query instance.
We show that our method could be applied to multiple QA formulations across multiple bias categories.
arXiv Detail & Related papers (2023-10-13T00:49:09Z)
- Unlocking Bias Detection: Leveraging Transformer-Based Models for Content Analysis [1.8692054990918079]
We present the Contextualized Bi-Directional Dual Transformer (CBDT) classifier.
We have prepared a dataset specifically for training these models to identify and locate biases in texts.
Our evaluations across various datasets demonstrate CBDT's effectiveness in distinguishing biased narratives from neutral ones and identifying specific biased terms.
arXiv Detail & Related papers (2023-09-30T12:06:04Z)
- Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection [24.35462897801079]
We introduce the Media Bias Identification Benchmark (MBIB) to group different types of media bias under a common framework.
After reviewing 115 datasets, we select nine tasks and carefully propose 22 associated datasets for evaluating media bias detection techniques.
Our results suggest that while hate speech, racial bias, and gender bias are easier to detect, models struggle to handle certain bias types, e.g., cognitive and political bias.
arXiv Detail & Related papers (2023-04-25T20:49:55Z)
- Unveiling the Hidden Agenda: Biases in News Reporting and Consumption [59.55900146668931]
We build a six-year dataset on the Italian vaccine debate and adopt a Bayesian latent space model to identify narrative and selection biases.
We found a nonlinear relationship between biases and engagement, with higher engagement for extreme positions.
Analysis of news consumption on Twitter reveals common audiences among news outlets with similar ideological positions.
arXiv Detail & Related papers (2023-01-14T18:58:42Z)
- Exploiting Transformer-based Multitask Learning for the Detection of Media Bias in News Articles [21.960154864540282]
We propose a Transformer-based deep learning architecture trained via Multi-Task Learning to detect media bias.
Our best-performing implementation achieves a macro $F_1$ of 0.776, a performance boost of 3% compared to our baseline, outperforming existing methods.
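The macro $F_1$ reported here averages per-class F1 scores with equal weight per class, so a minority "biased" class counts as much as the majority class. A small self-contained computation of the metric (the example labels are illustrative, not the paper's data):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = biased, 0 = neutral (toy labels)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(macro_f1(y_true, y_pred))  # → 0.75
```

Unlike micro F1, this score drops sharply when the rarer class is handled poorly, which is why it is the usual choice for imbalanced bias-detection benchmarks.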
arXiv Detail & Related papers (2022-11-07T12:22:31Z)
- Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts [24.51774048437496]
This paper presents BABE, a robust and diverse data set for media bias research.
It consists of 3,700 sentences balanced among topics and outlets, containing media bias labels on the word and sentence level.
Based on our data, we also introduce a way to detect bias-inducing sentences in news articles automatically.
arXiv Detail & Related papers (2022-09-29T05:32:55Z)
- An Interdisciplinary Approach for the Automated Detection and Visualization of Media Bias in News Articles [0.0]
I aim to devise data sets and methods to identify media bias.
My vision is to devise a system that helps news readers become aware of media coverage differences caused by bias.
arXiv Detail & Related papers (2021-12-26T10:46:32Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
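Instance reweighting of this kind typically up-weights training examples from under-represented (label, demographic) groups so that each group contributes comparably to the loss. The sketch below uses inverse group-frequency weights; the grouping scheme and toy data are illustrative assumptions, not the paper's exact method:

```python
from collections import Counter

def group_weights(labels, demographics):
    """Inverse-frequency weight per (label, demographic) group,
    normalized so weights average to 1 over the dataset."""
    groups = list(zip(labels, demographics))
    counts = Counter(groups)
    raw = [1.0 / counts[g] for g in groups]
    scale = len(raw) / sum(raw)  # keep the overall loss scale unchanged
    return [w * scale for w in raw]

# Toy data: (pos, A) is over-represented, so it gets down-weighted.
labels = ["pos", "pos", "pos", "neg"]
demos  = ["A",   "A",   "B",   "B"]
print(group_weights(labels, demos))
```

These per-instance weights would then multiply each example's loss term during training, pulling the model away from spurious label-demographic correlations.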
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.