Towards Corpus-Scale Discovery of Selection Biases in News Coverage:
Comparing What Sources Say About Entities as a Start
- URL: http://arxiv.org/abs/2304.03414v1
- Date: Thu, 6 Apr 2023 23:36:45 GMT
- Title: Towards Corpus-Scale Discovery of Selection Biases in News Coverage:
Comparing What Sources Say About Entities as a Start
- Authors: Sihao Chen and William Bruno and Dan Roth
- Abstract summary: This paper investigates the challenges of building scalable NLP systems for discovering patterns of media selection biases directly from news content in massive-scale news corpora.
We show the capabilities of the framework through a case study on NELA-2020, a corpus of 1.8M news articles in English from 519 news sources worldwide.
- Score: 65.28355014154549
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: News sources undergo the process of selecting newsworthy information when
covering a certain topic. The process inevitably exhibits selection biases,
i.e. news sources' typical patterns of choosing what information to include in
news coverage, due to their agenda differences. To understand the magnitude and
implications of selection biases, one must first discover (1) on which topics
sources typically have diverging definitions of "newsworthy" information, and
(2) whether the content selection patterns correlate with certain attributes of
the news sources, e.g. ideological leaning.
The goal of the paper is to investigate and discuss the challenges of
building scalable NLP systems for discovering patterns of media selection
biases directly from news content in massive-scale news corpora, without
relying on labeled data. To facilitate research in this domain, we propose and
study a conceptual framework, where we compare how sources typically mention
certain controversial entities, and use these as indicators of the sources'
content selection preferences. We empirically show the capabilities of the
framework through a case study on NELA-2020, a corpus of 1.8M news articles in
English from 519 news sources worldwide. We demonstrate an unsupervised
representation learning method to capture the selection preferences for how
sources typically mention controversial entities. Our experiments show that
the distributional divergence of such representations, when studied
collectively across entities and news sources, serves as a good indicator of an
individual source's ideological leaning. We hope our findings will provide
insights for future research on media selection biases.
Related papers
- Mapping the Media Landscape: Predicting Factual Reporting and Political Bias Through Web Interactions [0.7249731529275342]
We propose an extension to a recently presented news media reliability estimation method.
We assess the classification performance of four reinforcement learning strategies on a large news media hyperlink graph.
Our experiments, targeting two challenging bias descriptors, factual reporting and political bias, showed a significant performance improvement at the source media level.
arXiv Detail & Related papers (2024-10-23T08:18:26Z)
- Knowledge Graph Representation for Political Information Sources [16.959319157216466]
We analyze data collected from two news portals, Breitbart News (BN) and New York Times (NYT).
Our research findings are presented through knowledge graphs, utilizing a dataset spanning 11.5 years gathered from BN and NYT media portals.
arXiv Detail & Related papers (2024-04-04T13:36:01Z) - From Nuisance to News Sense: Augmenting the News with Cross-Document
Evidence and Context [25.870137795858522]
We present NEWSSENSE, a novel sensemaking tool and reading interface designed to collect and integrate information from multiple news articles on a central topic.
NEWSSENSE augments a central, grounding article of the user's choice by linking it to related articles from different sources.
Our pilot study shows that NEWSSENSE has the potential to help users identify key information, verify the credibility of news articles, and explore different perspectives.
arXiv Detail & Related papers (2023-10-06T21:15:11Z) - Identifying Informational Sources in News Articles [109.70475599552523]
We build the largest and widest-ranging annotated dataset of informational sources used in news writing.
We introduce a novel task, source prediction, to study the compositionality of sources in news articles.
arXiv Detail & Related papers (2023-05-24T08:56:35Z) - Unveiling the Hidden Agenda: Biases in News Reporting and Consumption [59.55900146668931]
We build a six-year dataset on the Italian vaccine debate and adopt a Bayesian latent space model to identify narrative and selection biases.
We found a nonlinear relationship between biases and engagement, with higher engagement for extreme positions.
Analysis of news consumption on Twitter reveals common audiences among news outlets with similar ideological positions.
arXiv Detail & Related papers (2023-01-14T18:58:42Z) - Discord Questions: A Computational Approach To Diversity Analysis in
News Coverage [84.55145223950427]
We propose a new framework to assist readers in identifying source differences and gaining an understanding of news coverage diversity.
The framework is based on the generation of Discord Questions: questions with a diverse answer pool.
arXiv Detail & Related papers (2022-11-09T16:37:55Z) - NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias [54.89737992911079]
We propose a new task, a neutral summary generation from multiple news headlines of the varying political spectrum.
One of the most interesting observations is that generation models can hallucinate not only factually inaccurate or unverifiable content, but also politically biased content.
arXiv Detail & Related papers (2022-04-11T07:06:01Z) - How to Effectively Identify and Communicate Person-Targeting Media Bias
in Daily News Consumption? [8.586057042714698]
We present an in-progress system for news recommendation that is the first to automate the manual procedure of content analysis.
Our recommender detects and reveals substantial frames that are actually present in individual news articles.
Our study shows that recommending news articles that differently frame an event significantly improves respondents' awareness of bias.
arXiv Detail & Related papers (2021-10-18T10:13:23Z) - "Don't quote me on that": Finding Mixtures of Sources in News Articles [85.92467549469147]
We construct an ontological labeling system for sources based on each source's textitaffiliation and textitrole
We build a probabilistic model to infer these attributes for named sources and to describe news articles as mixtures of these sources.
arXiv Detail & Related papers (2021-04-19T21:57:11Z)