Related papers: The Impact of LLMs on Online News Consumption and Production

URL: http://arxiv.org/abs/2512.24968v2
Date: Wed, 07 Jan 2026 04:07:45 GMT
Title: The Impact of LLMs on Online News Consumption and Production
Authors: Hangcheng Zhao, Ron Berman
Abstract summary: Large language models (LLMs) change how consumers acquire information online. LLMs also crawl news publishers' websites for training data and to answer consumer queries. These changes lead to predictions of adverse impact on news publishers.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) change how consumers acquire information online; their bots also crawl news publishers' websites for training data and to answer consumer queries; and they provide tools that can lower the cost of content creation. These changes lead to predictions of adverse impact on news publishers in the form of lowered consumer demand, reduced demand for newsroom employees, and an increase in news "slop." Consequently, some publishers strategically responded by blocking LLM access to their websites using the robots.txt file standard. Using high-frequency granular data, we document four effects related to the predicted shifts in news publishing following the introduction of generative AI (GenAI). First, we find a moderate decline in traffic to news publishers occurring after August 2024. Second, using a difference-in-differences approach, we find that blocking GenAI bots can be associated with a reduction of total website traffic to large publishers compared to not blocking. Third, on the hiring side, we do not find evidence that LLMs are replacing editorial or content-production jobs yet. The share of new editorial and content-production job listings increases over time. Fourth, regarding content production, we find no evidence that large publishers increased text volume; instead, they significantly increased rich content and used more advertising and targeting technologies. Together, these findings provide early evidence of some unforeseen impacts of the introduction of LLMs on news production and consumption.

Related papers

Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models [49.270849415269936]
We propose a novel framework that empowers web content creators to safeguard their web-based IP from unauthorized extraction and redistribution. Our method follows principled motivations and effectively addresses an intractable black-box optimization problem.
arXiv Detail & Related papers (2025-05-19T03:14:08Z)
Against Opacity: Explainable AI and Large Language Models for Effective Digital Advertising [45.512178197258066]
Meta Ads and Googles of the world attract countless advertisers who rely on intuition, with billions of dollars lost on ineffective social media ads. This lack of transparency hinders the advertisers' ability to make informed decisions and necessitates efforts to promote transparency, standardize industry metrics, and strengthen regulatory frameworks. In this work, we propose novel ways to assist marketers in optimizing their advertising strategies via machine learning techniques designed to analyze and evaluate content, in particular, predict the click-through rates (CTR) of novel advertising content.
arXiv Detail & Related papers (2025-04-22T08:46:47Z)
The Influence of Generative AI on Content Platforms: Supply, Demand, and Welfare Impacts in Two-Sided Markets [3.4039202831583903]
This paper explores how generative artificial intelligence affects online platforms where both human creators and AI generate content. We develop a model to understand how generative AI changes supply and demand, impacts traffic distribution, and influences social welfare.
arXiv Detail & Related papers (2024-10-17T00:14:12Z)
Harnessing the Power of LLMs: Evaluating Human-AI Text Co-Creation through the Lens of News Headline Generation [58.31430028519306]
This study explores how humans can best leverage LLMs for writing and how interacting with these models affects feelings of ownership and trust in the writing process. While LLMs alone can generate satisfactory news headlines, on average, human control is needed to fix undesirable model outputs.
arXiv Detail & Related papers (2023-10-16T15:11:01Z)
Fake News Detectors are Biased against Texts Generated by Large Language Models [39.36284616311687]
The spread of fake news has emerged as a critical challenge, undermining trust and posing threats to society. We present a novel paradigm to evaluate fake news detectors in scenarios involving both human-written and LLM-generated misinformation.
arXiv Detail & Related papers (2023-09-15T18:04:40Z)
ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information. To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles. Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z)
Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites [5.161088104035108]
We train a DeBERTa-based synthetic news detector and classify over 15.46 million articles from 3,074 misinformation and mainstream news websites. We find that between January 1, 2022, and May 1, 2023, the relative number of synthetic news articles increased by 57.3% on mainstream websites while increasing by 474% on misinformation sites.
arXiv Detail & Related papers (2023-05-16T21:51:01Z)
Multilingual Disinformation Detection for Digital Advertising [0.9684919127633844]
We make the first step towards quickly detecting and red-flagging websites that potentially manipulate the public with disinformation. We build a machine learning model based on multilingual text embeddings that first determines whether the page mentions a topic of interest, then estimates the likelihood of the content being malicious. Our system empowers internal teams to proactively blacklist unsafe content, thus protecting the reputation of the advertisement provider.
arXiv Detail & Related papers (2022-07-04T10:29:20Z)
Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda. Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles. Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
A Study of Fake News Reading and Annotating in Social Media Context [1.0499611180329804]
We present an eye-tracking study in which we let 44 lay participants casually read through a social media feed containing posts with news articles, some of which were fake. In a second run, we asked the participants to decide on the truthfulness of these articles. We also describe a follow-up qualitative study with a similar scenario but this time with 7 expert fake news annotators.
arXiv Detail & Related papers (2021-09-26T08:11:17Z)
News consumption and social media regulations policy [70.31753171707005]
We analyze two social media that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation. Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content. The lack of clear regulation on Gab results in the tendency of the user to engage with both types of content, showing a slight preference for the questionable ones which may account for a dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.