Identifying Informational Sources in News Articles
- URL: http://arxiv.org/abs/2305.14904v1
- Date: Wed, 24 May 2023 08:56:35 GMT
- Title: Identifying Informational Sources in News Articles
- Authors: Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara
- Abstract summary: We build the largest and widest-ranging annotated dataset of informational sources used in news writing.
We introduce a novel task, source prediction, to study the compositionality of sources in news articles.
- Score: 109.70475599552523
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: News articles are driven by the informational sources journalists use in
reporting. Modeling when, how and why sources get used together in stories can
help us better understand the information we consume and even help journalists
with the task of producing it. In this work, we take steps toward this goal by
constructing the largest and widest-ranging annotated dataset, to date, of
informational sources used in news writing. We show that our dataset can be
used to train high-performing models for information detection and source
attribution. We further introduce a novel task, source prediction, to study the
compositionality of sources in news articles. We show good performance on this
task, which we argue is an important proof for narrative science exploring the
internal structure of news articles and aiding in planning-based language
generation, and an important step towards a source-recommendation system to aid
journalists.
Related papers
- SciNews: From Scholarly Complexities to Public Narratives -- A Dataset for Scientific News Report Generation [20.994565065595232]
We present a new corpus to facilitate the automated generation of scientific news reports.
Our dataset comprises academic publications and their corresponding scientific news reports across nine disciplines.
We benchmark our dataset employing state-of-the-art text generation models.
arXiv Detail & Related papers (2024-03-26T14:54:48Z)
- Envisioning the Applications and Implications of Generative AI for News Media [4.324021238526106]
This article considers the increasing use of algorithmic decision-support systems and synthetic media in the newsroom.
We draw from a taxonomy of tasks associated with news production, and discuss where generative models could appropriately support reporters.
Our essay is relevant to practitioners and researchers as they consider using generative AI systems to support different tasks.
arXiv Detail & Related papers (2024-02-29T03:40:25Z)
- From Nuisance to News Sense: Augmenting the News with Cross-Document Evidence and Context [25.870137795858522]
We present NEWSSENSE, a novel sensemaking tool and reading interface designed to collect and integrate information from multiple news articles on a central topic.
NEWSSENSE augments a central, grounding article of the user's choice by linking it to related articles from different sources.
Our pilot study shows that NEWSSENSE has the potential to help users identify key information, verify the credibility of news articles, and explore different perspectives.
arXiv Detail & Related papers (2023-10-06T21:15:11Z)
- Towards Corpus-Scale Discovery of Selection Biases in News Coverage: Comparing What Sources Say About Entities as a Start [65.28355014154549]
This paper investigates the challenges of building scalable NLP systems for discovering patterns of media selection biases directly from news content in massive-scale news corpora.
We show the capabilities of the framework through a case study on NELA-2020, a corpus of 1.8M news articles in English from 519 news sources worldwide.
arXiv Detail & Related papers (2023-04-06T23:36:45Z)
- Why Do We Click: Visual Impression-aware News Recommendation [108.73539346064386]
This work is inspired by the fact that users make their click decisions mostly based on the visual impression they perceive when browsing news.
We propose to capture such visual impression information with visual-semantic modeling for news recommendation.
In addition, we inspect the impression from a global view and take into account structural information, such as the arrangement of different fields and the spatial position of different words on the impression.
arXiv Detail & Related papers (2021-09-26T16:58:14Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
- Hidden Biases in Unreliable News Detection Datasets [60.71991809782698]
We show that selection bias during data collection leads to undesired artifacts in the datasets.
We observed a significant drop (>10%) in accuracy for all models tested in a clean split with no train/test source overlap.
We suggest future dataset creation include a simple model as a difficulty/bias probe and future model development use a clean non-overlapping site and date split.
arXiv Detail & Related papers (2021-04-20T17:16:41Z)
- "Don't quote me on that": Finding Mixtures of Sources in News Articles [85.92467549469147]
We construct an ontological labeling system for sources based on each source's affiliation and role.
We build a probabilistic model to infer these attributes for named sources and to describe news articles as mixtures of these sources.
arXiv Detail & Related papers (2021-04-19T21:57:11Z)