Specious Sites: Tracking the Spread and Sway of Spurious News Stories at
Scale
- URL: http://arxiv.org/abs/2308.02068v3
- Date: Sat, 3 Feb 2024 01:14:57 GMT
- Title: Specious Sites: Tracking the Spread and Sway of Spurious News Stories at
Scale
- Authors: Hans W. A. Hanley, Deepak Kumar, Zakir Durumeric
- Abstract summary: We identify 52,036 narratives on 1,334 unreliable news websites.
We show how our system can be utilized to detect new narratives originating from unreliable news websites.
- Score: 6.917588580148212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Misinformation, propaganda, and outright lies proliferate on the web, with
some narratives having dangerous real-world consequences on public health,
elections, and individual safety. However, despite the impact of
misinformation, the research community largely lacks automated and programmatic
approaches for tracking news narratives across online platforms. In this work,
utilizing daily scrapes of 1,334 unreliable news websites, the large-language
model MPNet, and DP-Means clustering, we introduce a system to automatically
identify and track the narratives spread within online ecosystems. Identifying
52,036 narratives on these 1,334 websites, we describe the most prevalent
narratives spread in 2022 and identify the most influential websites that
originate and amplify narratives. Finally, we show how our system can be
utilized to detect new narratives originating from unreliable news websites and
to aid fact-checkers in more quickly addressing misinformation. We release code
and data at https://github.com/hanshanley/specious-sites.
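As a rough illustration of the clustering step named in the abstract, the following is a minimal sketch of DP-Means, not the authors' implementation. It assumes squared Euclidean distance and toy 2-D vectors standing in for MPNet embeddings; the function names, the threshold `lam`, and the sample points are all hypothetical. Unlike k-means, DP-Means does not fix the number of clusters in advance: a point that lies farther than the threshold from every existing center starts a new cluster.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dp_means(
    points: List[Vector], lam: float, max_iter: int = 100
) -> Tuple[List[int], List[List[float]]]:
    """Cluster points with DP-Means; lam is the (assumed) new-cluster threshold."""
    # Start with a single cluster centered on the global mean.
    centers = [mean(points)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [squared_distance(p, c) for c in centers]
            best = min(range(len(centers)), key=dists.__getitem__)
            # Too far from every center: open a new cluster at this point.
            if dists[best] > lam:
                centers.append(list(p))
                best = len(centers) - 1
            if labels[i] != best:
                labels[i] = best
                changed = True
        # Recompute centers as cluster means and drop clusters left empty.
        members = [
            [p for p, l in zip(points, labels) if l == k]
            for k in range(len(centers))
        ]
        keep = [k for k, m in enumerate(members) if m]
        remap = {old: new for new, old in enumerate(keep)}
        centers = [mean(members[k]) for k in keep]
        labels = [remap[l] for l in labels]
        if not changed:
            break
    return labels, centers


# Toy stand-ins for article embeddings: two tight groups far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, centers = dp_means(points, lam=1.0)
print(len(centers), labels)
```

In a real pipeline the inputs would be high-dimensional sentence embeddings, and the distance measure and threshold would need tuning; the values above are chosen only so that the two groups separate cleanly.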
Related papers
- Finding Fake News Websites in the Wild [0.0860395700487494]
We propose a novel methodology for identifying websites responsible for creating and disseminating misinformation content.
We validate our approach on Twitter by examining various execution modes and contexts.
arXiv Detail & Related papers (2024-07-09T18:00:12Z)
- SCStory: Self-supervised and Continual Online Story Discovery [53.72745249384159]
SCStory helps people digest rapidly published news article streams in real-time without human annotations.
SCStory employs self-supervised and continual learning with a novel idea of story-indicative adaptive modeling of news article streams.
arXiv Detail & Related papers (2023-11-27T04:50:01Z)
- Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites [5.161088104035108]
We train a DeBERTa-based synthetic news detector and classify over 15.46 million articles from 3,074 misinformation and mainstream news websites.
We find that between January 1, 2022, and May 1, 2023, the relative number of synthetic news articles increased by 57.3% on mainstream websites while increasing by 474% on misinformation sites.
arXiv Detail & Related papers (2023-05-16T21:51:01Z)
- Unveiling the Hidden Agenda: Biases in News Reporting and Consumption [59.55900146668931]
We build a six-year dataset on the Italian vaccine debate and adopt a Bayesian latent space model to identify narrative and selection biases.
We found a nonlinear relationship between biases and engagement, with higher engagement for extreme positions.
Analysis of news consumption on Twitter reveals common audiences among news outlets with similar ideological positions.
arXiv Detail & Related papers (2023-01-14T18:58:42Z)
- Deep Breath: A Machine Learning Browser Extension to Tackle Online Misinformation [0.0]
This paper proposes a novel system for detecting, processing, and warning users about misleading content online.
By training a machine learning model on an existing dataset of 32,000 clickbait news article headlines, the model predicts how sensationalist a headline is.
It interfaces with a web browser extension which constructs a unique content warning notification based on existing design principles.
arXiv Detail & Related papers (2023-01-09T12:43:58Z)
- SOK: Fake News Outbreak 2021: Can We Stop the Viral Spread? [5.64512235559998]
Social networks' omnipresence and ease of use have revolutionized the generation and distribution of information in today's world.
Unlike traditional media channels, social networks facilitate faster and wider spread of disinformation and misinformation.
Viral spread of false information has serious implications for the behaviors, attitudes, and beliefs of the public.
arXiv Detail & Related papers (2021-05-22T09:26:13Z)
- User Preference-aware Fake News Detection [61.86175081368782]
Existing fake news detection algorithms focus on mining news content for deceptive signals.
We propose a new framework, UPFD, which simultaneously captures various signals from user preferences by joint content and graph modeling.
arXiv Detail & Related papers (2021-04-25T21:19:24Z)
- Misinfo Belief Frames: A Case Study on Covid & Climate News [49.979419711713795]
We propose a formalism for understanding how readers perceive the reliability of news and the impact of misinformation.
We introduce the Misinfo Belief Frames (MBF) corpus, a dataset of 66k inferences over 23.5k headlines.
Our results using large-scale language modeling to predict misinformation frames show that machine-generated inferences can influence readers' trust in news headlines.
arXiv Detail & Related papers (2021-04-18T09:50:11Z)
- The Rise and Fall of Fake News sites: A Traffic Analysis [62.51737815926007]
We investigate the online presence of fake news websites and characterize their behavior in comparison to real news websites.
Based on our findings, we build a content-agnostic ML classifier for automatic detection of fake news websites.
arXiv Detail & Related papers (2021-03-16T18:10:22Z)
- Fake News Spreader Detection on Twitter using Character N-Grams. Notebook for PAN at CLEF 2020 [0.0]
This notebook describes our profiling system for the fake news detection task on Twitter.
We apply different feature extraction techniques and run learning experiments from a multilingual perspective.
Our models achieve overall accuracies of 73% and 79% on the official English and Spanish test sets, respectively.
arXiv Detail & Related papers (2020-09-29T08:32:32Z)
- Political audience diversity and news reliability in algorithmic ranking [54.23273310155137]
We propose using the political diversity of a website's audience as a quality signal.
Using news source reliability ratings from domain experts and web browsing data from a diverse sample of 6,890 U.S. citizens, we first show that websites with more extreme and less politically diverse audiences have lower journalistic standards.
arXiv Detail & Related papers (2020-07-16T02:13:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.