MBIC -- A Media Bias Annotation Dataset Including Annotator
Characteristics
- URL: http://arxiv.org/abs/2105.11910v1
- Date: Thu, 20 May 2021 15:05:17 GMT
- Title: MBIC -- A Media Bias Annotation Dataset Including Annotator
Characteristics
- Authors: T. Spinde, L. Rudnitckaia, K. Sinha, F. Hamborg, B. Gipp, K. Donnay
- Abstract summary: Media bias, or slanted news coverage, can have a substantial impact on public perception of events.
In this poster, we present a matrix-based methodology to crowdsource such data using a self-developed annotation platform.
We also present MBIC - the first sample of 1,700 statements representing various media bias instances.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many people consider news articles to be a reliable source of information on
current events. However, due to the range of factors influencing news agencies,
such coverage may not always be impartial. Media bias, or slanted news
coverage, can have a substantial impact on public perception of events, and,
accordingly, can potentially alter the beliefs and views of the public. The
main data gap in current research on media bias detection is a robust,
representative, and diverse dataset containing annotations of biased words and
sentences. In particular, existing datasets do not control for the individual
background of annotators, which may affect their assessment and, thus,
represents critical information for contextualizing their annotations. In this
poster, we present a matrix-based methodology to crowdsource such data using a
self-developed annotation platform. We also present MBIC (Media Bias Including
Characteristics) - the first sample of 1,700 statements representing various
media bias instances. The statements were reviewed by ten annotators each and
contain labels for media bias identification both on the word and sentence
level. MBIC is the first available media bias dataset that reports detailed
information on annotator characteristics and their individual backgrounds. The
current dataset already significantly extends existing data in this domain,
providing unique and more reliable insights into the perception of bias. In
the future, we will further extend it both with respect to the number of
articles and the number of annotators per article.
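To make the annotation structure described above concrete, the sketch below shows one plausible way to represent an MBIC-style statement with its per-annotator word- and sentence-level labels plus annotator-background metadata, and to aggregate the ten sentence labels by majority vote. All field names, the example background attribute, and the aggregation choice are illustrative assumptions, not the dataset's actual schema.

```python
# Hypothetical sketch of an MBIC-style record; field names are assumed for
# illustration and do not reflect the dataset's actual schema.
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    annotator_id: str
    political_orientation: str          # example annotator-background attribute
    sentence_label: str                 # e.g. "biased" or "non-biased"
    biased_words: List[str] = field(default_factory=list)


@dataclass
class Statement:
    text: str
    outlet: str
    annotations: List[Annotation] = field(default_factory=list)

    def sentence_label_majority(self) -> str:
        """Aggregate the per-annotator sentence labels by majority vote."""
        counts = Counter(a.sentence_label for a in self.annotations)
        return counts.most_common(1)[0][0]

    def word_label_frequency(self) -> Counter:
        """Count how often each word was marked as bias-inducing."""
        return Counter(w for a in self.annotations for w in a.biased_words)


# Example: two of the (in MBIC, ten) annotators for one statement.
stmt = Statement(
    text="The radical proposal sparked outrage among lawmakers.",
    outlet="example-outlet",
    annotations=[
        Annotation("a01", "conservative", "biased", ["radical", "outrage"]),
        Annotation("a02", "liberal", "non-biased"),
    ],
)
print(stmt.sentence_label_majority())  # majority label (ties resolved arbitrarily here)
print(stmt.word_label_frequency())     # Counter({'radical': 1, 'outrage': 1})
```

Keeping the raw per-annotator labels alongside any aggregate is what allows analyses that condition on annotator background, which is the dataset's distinguishing feature.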
Related papers
- Mapping the Media Landscape: Predicting Factual Reporting and Political Bias Through Web Interactions [0.7249731529275342]
We propose an extension to a recently presented news media reliability estimation method.
We assess the classification performance of four reinforcement learning strategies on a large news media hyperlink graph.
Our experiments, targeting two challenging bias descriptors, factual reporting and political bias, showed a significant performance improvement at the source media level.
arXiv Detail & Related papers (2024-10-23T08:18:26Z)
- Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs [50.40165119718928]
LongPiBench is a benchmark designed to assess positional bias involving multiple pieces of relevant information.
These experiments reveal that while most current models are robust against the "lost in the middle" issue, there exist significant biases related to the spacing of relevant information pieces.
arXiv Detail & Related papers (2024-10-18T17:41:19Z)
- Navigating News Narratives: A Media Bias Analysis Dataset [3.0821115746307672]
"Navigating News Narratives: A Media Bias Analysis dataset" is a comprehensive dataset to address the urgent need for tools to detect and analyze media bias.
This dataset encompasses a broad spectrum of biases, making it a unique and valuable asset in the field of media studies and artificial intelligence.
arXiv Detail & Related papers (2023-11-30T19:59:19Z)
- Towards Corpus-Scale Discovery of Selection Biases in News Coverage: Comparing What Sources Say About Entities as a Start [65.28355014154549]
This paper investigates the challenges of building scalable NLP systems for discovering patterns of media selection biases directly from news content in massive-scale news corpora.
We show the capabilities of the framework through a case study on NELA-2020, a corpus of 1.8M news articles in English from 519 news sources worldwide.
arXiv Detail & Related papers (2023-04-06T23:36:45Z)
- Bias or Diversity? Unraveling Fine-Grained Thematic Discrepancy in U.S. News Headlines [63.52264764099532]
We use a large dataset of 1.8 million news headlines from major U.S. media outlets spanning from 2014 to 2022.
We quantify the fine-grained thematic discrepancy related to four prominent topics - domestic politics, economic issues, social issues, and foreign affairs.
Our findings indicate that on domestic politics and social issues, the discrepancy can be attributed to a certain degree of media bias.
arXiv Detail & Related papers (2023-03-28T03:31:37Z)
- Unveiling the Hidden Agenda: Biases in News Reporting and Consumption [59.55900146668931]
We build a six-year dataset on the Italian vaccine debate and adopt a Bayesian latent space model to identify narrative and selection biases.
We found a nonlinear relationship between biases and engagement, with higher engagement for extreme positions.
Analysis of news consumption on Twitter reveals common audiences among news outlets with similar ideological positions.
arXiv Detail & Related papers (2023-01-14T18:58:42Z)
- Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts [24.51774048437496]
This paper presents BABE, a robust and diverse data set for media bias research.
It consists of 3,700 sentences balanced among topics and outlets, containing media bias labels on the word and sentence level.
Based on our data, we also introduce a way to detect bias-inducing sentences in news articles automatically.
arXiv Detail & Related papers (2022-09-29T05:32:55Z)
- NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias [54.89737992911079]
We propose a new task: generating a neutral summary from multiple news headlines drawn from across the political spectrum.
One of the most interesting observations is that generation models can hallucinate not only factually inaccurate or unverifiable content, but also politically biased content.
arXiv Detail & Related papers (2022-04-11T07:06:01Z)
- Newsalyze: Enabling News Consumers to Understand Media Bias [7.652448987187803]
Knowing a news article's slant and authenticity is of crucial importance in times of "fake news".
We introduce Newsalyze, a bias-aware news reader focusing on a subtle, yet powerful form of media bias, named bias by word choice and labeling (WCL).
WCL bias can alter the assessment of entities reported in the news, e.g., "freedom fighters" vs. "terrorists".
arXiv Detail & Related papers (2021-05-20T11:20:37Z)
- REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets [64.76453161039973]
REVISE (REvealing VIsual biaSEs) is a tool that assists in the investigation of a visual dataset.
It surfaces potential biases along three dimensions: (1) object-based, (2) person-based, and (3) geography-based.
arXiv Detail & Related papers (2020-04-16T23:54:37Z)