Multimodal Learning for Fake News Detection in Short Videos Using Linguistically Verified Data and Heterogeneous Modality Fusion
- URL: http://arxiv.org/abs/2509.15578v1
- Date: Fri, 19 Sep 2025 04:24:57 GMT
- Title: Multimodal Learning for Fake News Detection in Short Videos Using Linguistically Verified Data and Heterogeneous Modality Fusion
- Authors: Shanghong Li, Chiam Wen Qi Ruth, Hong Xu, Fang Liu
- Abstract summary: Current methods often struggle with the dynamic and multimodal nature of short video content. This paper presents HFN, a novel framework that integrates video, audio, and text data to evaluate the authenticity of short video content.
- Score: 5.850574227112314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid proliferation of short video platforms has necessitated advanced methods for detecting fake news. This need arises from the widespread influence and ease of sharing misinformation, which can lead to significant societal harm. Current methods often struggle with the dynamic and multimodal nature of short video content. This paper presents HFN, Heterogeneous Fusion Net, a novel multimodal framework that integrates video, audio, and text data to evaluate the authenticity of short video content. HFN introduces a Decision Network that dynamically adjusts modality weights during inference and a Weighted Multi-Modal Feature Fusion module to ensure robust performance even with incomplete data. Additionally, we contribute a comprehensive dataset VESV (VEracity on Short Videos) specifically designed for short video fake news detection. Experiments conducted on the FakeTT and newly collected VESV datasets demonstrate improvements of 2.71% and 4.14% in Macro F1 over state-of-the-art methods. This work establishes a robust solution capable of effectively identifying fake news in the complex landscape of short video platforms, paving the way for more reliable and comprehensive approaches in combating misinformation.
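The abstract describes modality weights that adapt at inference time and a fusion step that tolerates missing inputs. The following is only a minimal sketch of that general idea, not the authors' HFN implementation: the function name `fuse_modalities`, the per-modality relevance scores, and the choice of a softmax over the available modalities are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple


def fuse_modalities(
    features: Dict[str, Optional[List[float]]],
    scores: Dict[str, float],
) -> Tuple[List[float], Dict[str, float]]:
    """Hypothetical weighted fusion: softmax-weighted sum of present modalities.

    `features` maps a modality name to its feature vector, or None if missing.
    `scores` maps a modality name to an unnormalized relevance score
    (assumed here to come from some upstream scoring network).
    """
    available = [m for m, f in features.items() if f is not None]
    if not available:
        raise ValueError("at least one modality must be present")

    dim = len(features[available[0]])
    if any(len(features[m]) != dim for m in available):
        raise ValueError("all present modalities must share one feature size")

    # Softmax over the scores of available modalities only, so the weights
    # of the remaining modalities still sum to 1 when one is missing.
    max_score = max(scores[m] for m in available)
    exp_scores = {m: math.exp(scores[m] - max_score) for m in available}
    total = sum(exp_scores.values())
    weights = {m: exp_scores[m] / total for m in available}

    # Weighted sum of the feature vectors.
    fused = [0.0] * dim
    for m in available:
        for i, value in enumerate(features[m]):
            fused[i] += weights[m] * value
    return fused, weights


# Usage: the audio track is missing, so only video and text are fused.
fused, weights = fuse_modalities(
    {"video": [1.0, 0.0], "audio": None, "text": [0.0, 1.0]},
    {"video": 0.0, "audio": 0.0, "text": 0.0},
)
print(fused, weights)
```

Renormalizing over the present modalities is one simple way to keep the fused vector on a comparable scale when an input is absent; the paper may handle incomplete data differently.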
Related papers
- Consolidating Diffusion-Generated Video Detection with Unified Multimodal Forgery Learning [61.3737746844896]
Existing methods primarily focus on image-level forgery detection, leaving generic video-level forgery detection largely underexplored. We propose a consolidated multimodal detection, named MM-Det++, specifically designed for detecting diffusion-generated videos.
arXiv Detail & Related papers (2025-11-22T16:05:12Z) - Enhancing Fake News Video Detection via LLM-Driven Creative Process Simulation [14.79644134032037]
The emergence of fake news on short video platforms has become a new significant societal concern. Current detectors rely on pattern-based features to separate fake news videos from real ones. We propose a data augmentation framework, AgentAug, that generates diverse fake news videos by simulating typical creative processes.
arXiv Detail & Related papers (2025-10-05T04:05:37Z) - ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts [56.75723197779384]
ARC-Hunyuan-Video is a multimodal model that processes visual, audio, and textual signals end-to-end for structured comprehension. Our model is capable of multi-granularity timestamped video captioning and summarization, open-ended video question answering, temporal video grounding, and video reasoning.
arXiv Detail & Related papers (2025-07-28T15:52:36Z) - Enhanced Multimodal Hate Video Detection via Channel-wise and Modality-wise Fusion [7.728348842555291]
The rapid rise of video content on platforms such as TikTok and YouTube has transformed information dissemination. Despite significant efforts to combat hate speech, detecting these videos remains challenging due to their often implicit nature. We present CMFusion, an enhanced multimodal hate video detection model utilizing a novel Channel-wise and Modality-wise Fusion Mechanism.
arXiv Detail & Related papers (2025-05-17T15:24:48Z) - FMNV: A Dataset of Media-Published News Videos for Fake News Detection [10.36393083923778]
We construct FMNV, a novel dataset composed of news videos published by media organizations. We employ Large Language Models (LLMs) to automatically generate content by manipulating authentic media-published news. This work establishes critical benchmarks for detecting high-impact fake news in media ecosystems.
arXiv Detail & Related papers (2025-04-10T12:16:32Z) - External Reliable Information-enhanced Multimodal Contrastive Learning for Fake News Detection [10.575512607941839]
ERIC-FND is an external reliable information-enhanced multimodal contrastive learning framework for fake news detection. Experiments are done on two commonly used datasets in different languages, X (Twitter) and Weibo.
arXiv Detail & Related papers (2025-03-05T02:07:38Z) - VMID: A Multimodal Fusion LLM Framework for Detecting and Identifying Misinformation of Short Videos [14.551693267228345]
This paper presents a novel fake news detection method based on multimodal information, designed to identify misinformation through a multi-level analysis of video content.
The proposed framework successfully integrates multimodal features within videos, significantly enhancing the accuracy and reliability of fake news detection.
arXiv Detail & Related papers (2024-11-15T08:20:26Z) - Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach [56.610806615527885]
A key challenge in text-video retrieval (TVR) is the information asymmetry between video and text. This paper introduces a data-centric framework to bridge this gap by enriching textual representations to better match the richness of video content. We propose a query selection mechanism that identifies the most relevant and diverse queries, reducing computational cost while improving accuracy.
arXiv Detail & Related papers (2024-08-14T01:24:09Z) - AVTENet: A Human-Cognition-Inspired Audio-Visual Transformer-Based Ensemble Network for Video Deepfake Detection [49.81915942821647]
This study introduces the audio-visual transformer-based ensemble network (AVTENet) to detect deepfake videos. For evaluation, we use the recently released benchmark multimodal audio-video FakeAVCeleb dataset. For a detailed analysis, we evaluate AVTENet, its variants, and several existing methods on multiple test sets of the FakeAVCeleb dataset.
arXiv Detail & Related papers (2023-10-19T19:01:26Z) - Causal Video Summarizer for Video Exploration [74.27487067877047]
Causal Video Summarizer (CVS) is proposed to capture the interactive information between the video and query.
Based on the evaluation of the existing multi-modal video summarization dataset, experimental results show that the proposed approach is effective.
arXiv Detail & Related papers (2023-07-04T22:52:16Z) - Multimodal Short Video Rumor Detection System Based on Contrastive Learning [3.4192832062683842]
Short video platforms in China have gradually evolved into fertile grounds for the proliferation of fake news.
Distinguishing short video rumors poses a significant challenge due to the substantial amount of information and shared features.
Our research group proposes a methodology encompassing multimodal feature fusion and the integration of external knowledge.
arXiv Detail & Related papers (2023-04-17T16:07:00Z) - Few-Shot Video Object Detection [70.43402912344327]
We introduce Few-Shot Video Object Detection (FSVOD) with three important contributions.
FSVOD-500 comprises 500 classes with class-balanced videos in each category for few-shot learning.
Our TPN and TMN+ are jointly and end-to-end trained.
arXiv Detail & Related papers (2021-04-30T07:38:04Z) - VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles [63.32111010686954]
We propose the task of Video-based Multimodal Summarization with Multimodal Output (VMSMO).
The main challenge in this task is to jointly model the temporal dependency of the video with the semantic meaning of the article.
We propose a Dual-Interaction-based Multimodal Summarizer (DIMS), consisting of a dual interaction module and multimodal generator.
arXiv Detail & Related papers (2020-10-12T02:19:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.