Related papers: What Signals Really Matter for Misinformation Tasks? Evaluating Fake-News Detection and Virality Prediction under Real-World Constraints

What Signals Really Matter for Misinformation Tasks? Evaluating Fake-News Detection and Virality Prediction under Real-World Constraints

URL: http://arxiv.org/abs/2512.02552v1
Date: Tue, 02 Dec 2025 09:24:16 GMT
Title: What Signals Really Matter for Misinformation Tasks? Evaluating Fake-News Detection and Virality Prediction under Real-World Constraints
Authors: Francesco Paolo Savatteri, Chahan Vidal-Gorène, Florian Cafiero,
Abstract summary: We study two practical tasks regarding online misinformation: fake-news detection and virality prediction.<n>We show that textual content alone is a strong discriminator for fake-news detection.<n>We discuss implications for evaluation design and report constraints that realistically affect the field.
Score: 0.08496348835248901
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present an evaluation-driven study of two practical tasks regarding online misinformation: (i) fake-news detection and (ii) virality prediction in the context of operational settings, with the necessity for rapid reaction. Using the EVONS and FakeNewsNet datasets, we compare textual embeddings (RoBERTa; with a control using Mistral) against lightweight numeric features (timing, follower counts, verification, likes) and sequence models (GRU, gating architectures, Transformer encoders). We show that textual content alone is a strong discriminator for fake-news detection, while numeric-only pipelines remain viable when language models are unavailable or compute is constrained. Virality prediction is markedly harder than fake-news detection and is highly sensitive to label construction; in our setup, a median-based ''viral'' split (<50 likes) is pragmatic but underestimates real-world virality, and time-censoring for engagement features is desirable yet difficult under current API limits. Dimensionality-reduction analyses suggest non-linear structure is more informative for virality than for fake-news detection (t-SNE > PCA on numeric features). Swapping RoBERTa for Mistral embeddings yields only modest deltas, leaving conclusions unchanged. We discuss implications for evaluation design and report reproducibility constraints that realistically affect the field. We release splits and code where possible and provide guidance for metric selection.

Related papers

Deep Learning for Contextualized NetFlow-Based Network Intrusion Detection: Methods, Data, Evaluation and Deployment [5.402853794565817]
This paper synthesizes recent research on context-aware deep learning for flow-based intrusion detection.<n>We organize existing methods into a four-dimensional taxonomy covering temporal context, graph or relational context, multimodal context, and multi-resolution context.<n>We review common failure modes that can inflate reported results, including temporal leakage, data splitting, dataset design flaws, limited dataset diversity, and weak cross-dataset generalization.
arXiv Detail & Related papers (2026-02-05T12:25:18Z)
What's Left Unsaid? Detecting and Correcting Misleading Omissions in Multimodal News Previews [31.373823254968315]
Even when factually correct, social-media news previews can induce interpretation drift.<n>This covert harm is harder to detect than explicit misinformation yet remains underexplored.<n>We develop a multi-stage pipeline that disentangles and simulates preview-based versus context-based understanding.
arXiv Detail & Related papers (2026-01-09T06:29:19Z)
Seeing the Unseen: Towards Zero-Shot Inspection for Wind Turbine Blades using Knowledge-Augmented Vision Language Models [10.230967860299504]
We propose a zero-shot-oriented inspection framework that integrates Retrieval-Augmented Generation with Vision-Language Models.<n>A multimodal knowledge base is constructed, comprising technical documentation, representative reference images, and domain-specific guidelines.<n>We evaluate the framework on 30 labeled blade images covering diverse damage categories.
arXiv Detail & Related papers (2025-10-26T23:19:28Z)
Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series.<n>Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data.<n>This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy.<n>We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z)
Probing Pre-trained Language Models on Code Changes: Insights from ReDef, a High-Confidence Just-in-Time Defect Prediction Dataset [0.0]
We present ReDef, a high-confidence benchmark of function-level modifications curated from 22 large-scale C/C++ projects.<n>Defective cases are anchored by revert commits, while clean cases are validated through post-hoc history checks.<n>This pipeline yields 3,164 defective and 10,268 clean modifications, offering substantially more reliable labels than prior existing resources.
arXiv Detail & Related papers (2025-09-11T07:07:11Z)
Are You Sure You're Positive? Consolidating Chain-of-Thought Agents with Uncertainty Quantification for Aspect-Category Sentiment Analysis [4.14197005718384]
We argue that leveraging large language models in a zero-shot setting is beneficial where the time and resources required for dataset annotation are limited.<n>We propose novel techniques that combine multiple chain-of-thought agents by leveraging large language models' token-level uncertainty scores.
arXiv Detail & Related papers (2025-08-24T08:51:16Z)
Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization [60.899082019130766]
We introduce a frame-level detection network (FDN) and a proposal refinement network (PRN) for audio temporal forgery detection and localization. FDN aims to mine informative inconsistency cues between real and fake frames to obtain discriminative features that are beneficial for roughly indicating forgery regions. PRN is responsible for predicting confidence scores and regression offsets to refine the coarse-grained proposals derived from the FDN.
arXiv Detail & Related papers (2024-07-23T15:07:52Z)
FactCHD: Benchmarking Fact-Conflicting Hallucination Detection [64.4610684475899]
FactCHD is a benchmark designed for the detection of fact-conflicting hallucinations from LLMs. FactCHD features a diverse dataset that spans various factuality patterns, including vanilla, multi-hop, comparison, and set operation. We introduce Truth-Triangulator that synthesizes reflective considerations by tool-enhanced ChatGPT and LoRA-tuning based on Llama2.
arXiv Detail & Related papers (2023-10-18T16:27:49Z)
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space. However, the predictions are often unreliable due to the Aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts. We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arisen from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
Bring Your Own Data! Self-Supervised Evaluation for Large Language Models [52.15056231665816]
We propose a framework for self-supervised evaluation of Large Language Models (LLMs) We demonstrate self-supervised evaluation strategies for measuring closed-book knowledge, toxicity, and long-range context dependence. We find strong correlations between self-supervised and human-supervised evaluations.
arXiv Detail & Related papers (2023-06-23T17:59:09Z)
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video. Recent studies have found that current benchmark datasets may have obvious moment annotation biases. We introduce a new evaluation metric "dR@n,IoU@m" that discounts the basic recall scores to alleviate the inflating evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
Tracking the risk of a deployed model and detecting harmful distribution shifts [105.27463615756733]
In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially. We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate.
arXiv Detail & Related papers (2021-10-12T17:21:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.