Tracking the Newsworthiness of Public Documents
- URL: http://arxiv.org/abs/2311.09734v1
- Date: Thu, 16 Nov 2023 10:05:26 GMT
- Title: Tracking the Newsworthiness of Public Documents
- Authors: Alexander Spangher, Emilio Ferrara, Ben Welsh, Nanyun Peng, Serdar Tumgoren, Jonathan May
- Abstract summary: This work focuses on news coverage of local public policy in the San Francisco Bay Area by the San Francisco Chronicle.
First, we gather news articles, public policy documents and meeting recordings and link them using probabilistic relational modeling.
Second, we define a new task: newsworthiness prediction, to predict if a policy item will get covered.
- Score: 107.12303391111014
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Journalists must find stories in huge amounts of textual data (e.g. leaks,
bills, press releases) as part of their jobs: determining when and why text
becomes news can help us understand coverage patterns and help us build
assistive tools. Yet, this is challenging because very few labelled links
exist, language use between corpora is very different, and text may be covered
for a variety of reasons. In this work we focus on news coverage of local
public policy in the San Francisco Bay Area by the San Francisco Chronicle.
First, we gather news articles, public policy documents and meeting recordings
and link them using probabilistic relational modeling, which we show is a
low-annotation linking methodology that outperforms other retrieval-based
baselines. Second, we define a new task: newsworthiness prediction, to predict
if a policy item will get covered. We show that different aspects of public
policy discussion yield different newsworthiness signals. Finally, we perform
human evaluation with expert journalists and show our systems identify policies
they consider newsworthy with 68% F1 and our coverage recommendations are
helpful with an 84% win-rate.
Related papers
- A Multilingual Similarity Dataset for News Article Frame [14.977682986280998]
We introduce an extended version of a large labeled news article dataset with 16,687 new labeled pairs.
Our method removes the need for manual identification of frame classes in traditional news frame analysis studies.
Overall we introduce the most extensive cross-lingual news article similarity dataset available to date with 26,555 labeled news article pairs across 10 languages.
arXiv Detail & Related papers (2024-05-22T01:01:04Z)
- Towards Corpus-Scale Discovery of Selection Biases in News Coverage: Comparing What Sources Say About Entities as a Start [65.28355014154549]
This paper investigates the challenges of building scalable NLP systems for discovering patterns of media selection biases directly from news content in massive-scale news corpora.
We show the capabilities of the framework through a case study on NELA-2020, a corpus of 1.8M news articles in English from 519 news sources worldwide.
arXiv Detail & Related papers (2023-04-06T23:36:45Z)
- Disentangling Structure and Style: Political Bias Detection in News by Inducing Document Hierarchy [8.919312558800573]
We introduce a novel multi-head hierarchical attention model that effectively encodes the structure of long documents through a diverse ensemble of attention heads.
We demonstrate that our method overcomes this domain dependency and outperforms previous approaches for robustness and accuracy.
arXiv Detail & Related papers (2023-04-05T06:35:41Z)
- Designing and Evaluating Interfaces that Highlight News Coverage Diversity Using Discord Questions [84.55145223950427]
This paper shows that navigating large source collections for a news story can be challenging without further guidance.
We design three interfaces -- the Annotated Article, the Recomposed Article, and the Question Grid -- aimed at accompanying news readers in discovering coverage diversity while they read.
arXiv Detail & Related papers (2023-02-17T16:59:31Z)
- Discord Questions: A Computational Approach To Diversity Analysis in News Coverage [84.55145223950427]
We propose a new framework to assist readers in identifying source differences and gaining an understanding of news coverage diversity.
The framework is based on the generation of Discord Questions: questions with a diverse answer pool.
arXiv Detail & Related papers (2022-11-09T16:37:55Z)
- NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias [54.89737992911079]
We propose a new task: neutral summary generation from multiple news headlines across the political spectrum.
One of the most interesting observations is that generation models can hallucinate not only factually inaccurate or unverifiable content, but also politically biased content.
arXiv Detail & Related papers (2022-04-11T07:06:01Z)
- How to Effectively Identify and Communicate Person-Targeting Media Bias in Daily News Consumption? [8.586057042714698]
We present an in-progress system for news recommendation that is the first to automate the manual procedure of content analysis.
Our recommender detects and reveals substantial frames that are actually present in individual news articles.
Our study shows that recommending news articles that differently frame an event significantly improves respondents' awareness of bias.
arXiv Detail & Related papers (2021-10-18T10:13:23Z)
- DEAP-FAKED: Knowledge Graph based Approach for Fake News Detection [0.04834203844100679]
We propose a knowleDgE grAPh FAKe nEws Detection framework for identifying Fake News.
Our approach combines NLP, which we use to encode the news content, with a GNN technique, which we use to encode the Knowledge Graph.
We evaluate our framework using two publicly available datasets containing articles from domains such as politics, business, technology, and healthcare.
arXiv Detail & Related papers (2021-07-04T07:09:59Z)
- Supporting verification of news articles with automated search for semantically similar articles [0.0]
We propose an evidence retrieval approach to handle fake news.
The learning task is formulated as an unsupervised machine learning problem.
We find that our approach is agnostic to concept drifts, i.e. the machine learning task is independent of the hypotheses in a text.
arXiv Detail & Related papers (2021-03-29T12:56:59Z)
- Political audience diversity and news reliability in algorithmic ranking [54.23273310155137]
We propose using the political diversity of a website's audience as a quality signal.
Using news source reliability ratings from domain experts and web browsing data from a diverse sample of 6,890 U.S. citizens, we first show that websites with more extreme and less politically diverse audiences have lower journalistic standards.
arXiv Detail & Related papers (2020-07-16T02:13:55Z)