Neural Media Bias Detection Using Distant Supervision With BABE -- Bias
Annotations By Experts
- URL: http://arxiv.org/abs/2209.14557v1
- Date: Thu, 29 Sep 2022 05:32:55 GMT
- Title: Neural Media Bias Detection Using Distant Supervision With BABE -- Bias
Annotations By Experts
- Authors: Timo Spinde, Manuel Plank, Jan-David Krieger, Terry Ruas, Bela Gipp,
Akiko Aizawa
- Abstract summary: This paper presents BABE, a robust and diverse data set for media bias research.
It consists of 3,700 sentences balanced among topics and outlets, containing media bias labels on the word and sentence level.
Based on our data, we also introduce a way to detect bias-inducing sentences in news articles automatically.
- Score: 24.51774048437496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Media coverage has a substantial effect on the public perception of events.
Nevertheless, media outlets are often biased. One way to bias news articles is
by altering the word choice. The automatic identification of bias by word
choice is challenging, primarily due to the lack of a gold standard data set
and high context dependencies. This paper presents BABE, a robust and diverse
data set created by trained experts, for media bias research. We also analyze
why expert labeling is essential within this domain. Our data set offers better
annotation quality and higher inter-annotator agreement than existing work. It
consists of 3,700 sentences balanced among topics and outlets, containing media
bias labels on the word and sentence level. Based on our data, we also
introduce a way to detect bias-inducing sentences in news articles
automatically. Our best-performing BERT-based model is pre-trained on a larger
corpus labeled via distant supervision. Fine-tuning and evaluating the model on
our proposed supervised data set, we achieve a macro F1-score of 0.804,
outperforming existing methods.
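The reported macro F1-score is the unweighted average of the per-class F1 scores, so the minority class (biased sentences) counts as much as the majority class. A minimal sketch of the metric, with illustrative toy labels (1 = biased, 0 = neutral) that are not drawn from BABE:

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Macro F1: unweighted mean of per-class F1 scores."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Toy sentence-level predictions (illustrative only)
truth = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 0, 1]
print(round(macro_f1(truth, pred), 3))  # → 0.667
```

Because both classes are averaged without weighting, a model that labels everything "neutral" scores poorly even on an imbalanced corpus.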
Related papers
- GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models [75.04426753720553]
We propose a framework to identify, quantify, and explain biases in an open set setting.
This pipeline leverages a Large Language Model (LLM) to propose biases starting from a set of captions.
We show two variations of this framework: OpenBias and GradBias.
arXiv Detail & Related papers (2024-08-29T16:51:07Z)
- Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction [56.17020601803071]
Recent research shows that pre-trained language models (PLMs) suffer from "prompt bias" in factual knowledge extraction.
This paper aims to improve the reliability of existing benchmarks by thoroughly investigating and mitigating prompt bias.
arXiv Detail & Related papers (2024-03-15T02:04:35Z)
- Unveiling the Hidden Agenda: Biases in News Reporting and Consumption [59.55900146668931]
We build a six-year dataset on the Italian vaccine debate and adopt a Bayesian latent space model to identify narrative and selection biases.
We found a nonlinear relationship between biases and engagement, with higher engagement for extreme positions.
Analysis of news consumption on Twitter reveals common audiences among news outlets with similar ideological positions.
arXiv Detail & Related papers (2023-01-14T18:58:42Z)
- Exploiting Transformer-based Multitask Learning for the Detection of Media Bias in News Articles [21.960154864540282]
We propose a Transformer-based deep learning architecture trained via Multi-Task Learning to detect media bias.
Our best-performing implementation achieves a macro $F_1$ of 0.776, a performance boost of 3% compared to our baseline, outperforming existing methods.
arXiv Detail & Related papers (2022-11-07T12:22:31Z)
- NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias [54.89737992911079]
We propose a new task: generating a neutral summary from multiple news headlines across the political spectrum.
One of the most interesting observations is that generation models can hallucinate not only factually inaccurate or unverifiable content, but also politically biased content.
arXiv Detail & Related papers (2022-04-11T07:06:01Z)
- An Interdisciplinary Approach for the Automated Detection and Visualization of Media Bias in News Articles [0.0]
I aim to devise data sets and methods to identify media bias.
My vision is a system that helps news readers become aware of media coverage differences caused by bias.
arXiv Detail & Related papers (2021-12-26T10:46:32Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
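Instance reweighting of this kind can be sketched as scaling each training example's contribution inversely to the frequency of its author-demographic group, so correlated majority-group examples do not dominate the loss. The grouping and normalization below are illustrative assumptions, not the paper's exact scheme:

```python
from collections import Counter

def inverse_frequency_weights(attributes):
    """Weight each instance inversely to its group's frequency,
    normalized so the weights average to 1 across the data set."""
    counts = Counter(attributes)
    n, k = len(attributes), len(counts)
    return [n / (k * counts[a]) for a in attributes]

# Toy corpus where one (hypothetical) author group dominates
groups = ["A", "A", "A", "B"]
print(inverse_frequency_weights(groups))  # minority-group instance is upweighted
```

The resulting weights can be passed as per-sample loss weights to any standard classifier, leaving the model and objective otherwise unchanged.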
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- MBIC -- A Media Bias Annotation Dataset Including Annotator Characteristics [0.0]
Media bias, or slanted news coverage, can have a substantial impact on public perception of events.
In this poster, we present a matrix-based methodology to crowdsource such data using a self-developed annotation platform.
We also present MBIC - the first sample of 1,700 statements representing various media bias instances.
arXiv Detail & Related papers (2021-05-20T15:05:17Z)
- Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
- Detecting Media Bias in News Articles using Gaussian Bias Distributions [35.19976910093135]
We study how second-order information about biased statements in an article helps to improve detection effectiveness.
On an existing media bias dataset, we find that the frequency and positions of biased statements strongly impact article-level bias.
Using a standard model for sentence-level bias detection, we provide empirical evidence that article-level bias detectors that use second-order information clearly outperform those without.
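Second-order information of this sort can be illustrated by aggregating per-sentence bias probabilities into article-level features such as the frequency and mean relative position of biased sentences. The feature names and threshold below are illustrative assumptions, not the paper's exact formulation:

```python
def article_bias_features(sent_probs, threshold=0.5):
    """Aggregate per-sentence bias probabilities into two
    article-level features: the fraction of biased sentences
    and their mean relative position in the article (0 = start, 1 = end)."""
    n = len(sent_probs)
    biased = [i for i, p in enumerate(sent_probs) if p >= threshold]
    fraction = len(biased) / n
    mean_pos = sum(i / (n - 1) for i in biased) / len(biased) if biased else 0.0
    return {"biased_fraction": fraction, "mean_position": mean_pos}

# Toy article whose bias is concentrated in the opening sentences
probs = [0.9, 0.8, 0.2, 0.1, 0.3]
print(article_bias_features(probs))  # → {'biased_fraction': 0.4, 'mean_position': 0.125}
```

An article-level classifier trained on such aggregate features sees structure that a purely sentence-by-sentence detector ignores.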
arXiv Detail & Related papers (2020-10-20T22:20:49Z)
- Towards Detection of Subjective Bias using Contextualized Word Embeddings [9.475039534437332]
We perform experiments for detecting subjective bias using BERT-based models on the Wiki Neutrality Corpus (WNC).
The dataset consists of 360k labeled instances drawn from Wikipedia edits that remove various instances of bias.
arXiv Detail & Related papers (2020-02-16T18:39:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.