MisSpans: Fine-Grained False Span Identification in Cross-Domain Fake News
- URL: http://arxiv.org/abs/2601.04857v1
- Date: Thu, 08 Jan 2026 11:46:30 GMT
- Title: MisSpans: Fine-Grained False Span Identification in Cross-Domain Fake News
- Authors: Zhiwei Liu, Paul Thompson, Jiaqi Rong, Baojie Qu, Runteng Guo, Min Peng, Qianqian Xie, Sophia Ananiadou,
- Abstract summary: MisSpans is a benchmark for span-level misinformation detection and analysis.<n>It consists of paired real and fake news stories.<n>It enables fine-grained localisation, nuanced characterisation beyond true/false and actionable explanations.
- Score: 30.038748916578978
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online misinformation is increasingly pervasive, yet most existing benchmarks and methods evaluate veracity at the level of whole claims or paragraphs using coarse binary labels, obscuring how true and false details often co-exist within single sentences. These simplifications also limit interpretability: global explanations cannot identify which specific segments are misleading or differentiate how a detail is false (e.g., distorted vs. fabricated). To address these gaps, we introduce MisSpans, the first multi-domain, human-annotated benchmark for span-level misinformation detection and analysis, consisting of paired real and fake news stories. MisSpans defines three complementary tasks: MisSpansIdentity for pinpointing false spans within sentences, MisSpansType for categorising false spans by misinformation type, and MisSpansExplanation for providing rationales grounded in identified spans. Together, these tasks enable fine-grained localisation, nuanced characterisation beyond true/false and actionable explanations. Expert annotators were guided by standardised guidelines and consistency checks, leading to high inter-annotator agreement. We evaluate 15 representative LLMs, including reasoning-enhanced and non-reasoning variants, under zero-shot and one-shot settings. Results reveal the challenging nature of fine-grained misinformation identification and analysis, and highlight the need for a deeper understanding of how performance may be influenced by multiple interacting factors, including model size and reasoning capabilities, along with domain-specific textual features. This project will be available at https://github.com/lzw108/MisSpans.
Related papers
- What's Left Unsaid? Detecting and Correcting Misleading Omissions in Multimodal News Previews [31.373823254968315]
Even when factually correct, social-media news previews can induce interpretation drift.<n>This covert harm is harder to detect than explicit misinformation yet remains underexplored.<n>We develop a multi-stage pipeline that disentangles and simulates preview-based versus context-based understanding.
arXiv Detail & Related papers (2026-01-09T06:29:19Z) - Responsible AI in NLP: GUS-Net Span-Level Bias Detection Dataset and Benchmark for Generalizations, Unfairness, and Stereotypes [6.30817290125825]
We introduce the GUS-Net Framework, comprising the GUS dataset and a multi-label token-level detector for span-level analysis of social bias.<n>The GUS dataset contains 3,739 unique snippets across multiple domains, with over 69,000 token-level annotations.<n>We formulate bias detection as multi-label token-level classification and benchmark both encoder-based models and decoder-based large language models.
arXiv Detail & Related papers (2024-10-10T21:51:22Z) - Localizing Factual Inconsistencies in Attributable Text Generation [74.11403803488643]
We introduce QASemConsistency, a new formalism for localizing factual inconsistencies in attributable text generation.<n>We show that QASemConsistency yields factual consistency scores that correlate well with human judgments.
arXiv Detail & Related papers (2024-10-09T22:53:48Z) - Claim Check-Worthiness Detection: How Well do LLMs Grasp Annotation Guidelines? [0.0]
We use zero- and few-shot LLM prompting to identify text segments requiring fact-checking.
We evaluate the LLMs' predictive and calibration accuracy on five CD/CW datasets from diverse domains.
Our results show that optimal prompt verbosity is domain-dependent, adding context does not improve performance.
arXiv Detail & Related papers (2024-04-18T13:31:05Z) - FineFake: A Knowledge-Enriched Dataset for Fine-Grained Multi-Domain Fake News Detection [54.37159298632628]
FineFake is a multi-domain knowledge-enhanced benchmark for fake news detection.
FineFake encompasses 16,909 data samples spanning six semantic topics and eight platforms.
The entire FineFake project is publicly accessible as an open-source repository.
arXiv Detail & Related papers (2024-03-30T14:39:09Z) - Evaluating the Factuality of Zero-shot Summarizers Across Varied Domains [60.5207173547769]
We evaluate zero-shot generated summaries across specialized domains including biomedical articles, and legal bills.
We acquire annotations from domain experts to identify inconsistencies in summaries and systematically categorize these errors.
We release all collected annotations to facilitate additional research toward measuring and realizing factually accurate summarization, beyond news articles.
arXiv Detail & Related papers (2024-02-05T20:51:11Z) - X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across Paragraphs [55.80189506270598]
X-PARADE is the first cross-lingual dataset of paragraph-level information divergences.
Annotators label a paragraph in a target language at the span level and evaluate it with respect to a corresponding paragraph in a source language.
Aligned paragraphs are sourced from Wikipedia pages in different languages.
arXiv Detail & Related papers (2023-09-16T04:34:55Z) - How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study [59.13867562744973]
This work systematically assesses LMs' capabilities for out-of-distribution (OOD) scenarios.
We find that the efficacy of such learning paradigms varies with the type of OOD.
Specifically, while ICL excels for domain shifts, prompt-based fine-tuning surpasses for topic shifts.
arXiv Detail & Related papers (2023-09-15T11:15:47Z) - Interpretable Automatic Fine-grained Inconsistency Detection in Text
Summarization [56.94741578760294]
We propose the task of fine-grained inconsistency detection, the goal of which is to predict the fine-grained types of factual errors in a summary.
Motivated by how humans inspect factual inconsistency in summaries, we propose an interpretable fine-grained inconsistency detection model, FineGrainFact.
arXiv Detail & Related papers (2023-05-23T22:11:47Z) - Contextual information integration for stance detection via
cross-attention [59.662413798388485]
Stance detection deals with identifying an author's stance towards a target.
Most existing stance detection models are limited because they do not consider relevant contextual information.
We propose an approach to integrate contextual information as text.
arXiv Detail & Related papers (2022-11-03T15:04:29Z) - Assessing Effectiveness of Using Internal Signals for Check-Worthy Claim
Identification in Unlabeled Data for Automated Fact-Checking [6.193231258199234]
This paper explores methodology to identify check-worthy claim sentences from fake news articles.
We leverage two internal supervisory signals - headline and the abstractive summary - to rank the sentences.
We show that while the headline has more gisting similarity with how a fact-checking website writes a claim, the summary-based pipeline is the most promising for an end-to-end fact-checking system.
arXiv Detail & Related papers (2021-11-02T16:17:20Z) - Dynamic Semantic Matching and Aggregation Network for Few-shot Intent
Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.