Not all Fake News is Written: A Dataset and Analysis of Misleading Video
Headlines
- URL: http://arxiv.org/abs/2310.13859v2
- Date: Thu, 14 Dec 2023 20:34:32 GMT
- Title: Not all Fake News is Written: A Dataset and Analysis of Misleading Video
Headlines
- Authors: Yoo Yeon Sung and Jordan Boyd-Graber and Naeemul Hassan
- Abstract summary: We present a dataset that consists of videos and whether annotators believe the headline is representative of the video's contents.
After collecting and annotating this dataset, we analyze multimodal baselines for detecting misleading headlines.
Our annotation process also focuses on why annotators view a video as misleading, allowing us to better understand the interplay of annotators' background and the content of the videos.
- Score: 6.939987423356328
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Polarization and the marketplace for impressions have conspired to make
navigating information online difficult for users, and while there has been a
significant effort to detect false or misleading text, multimodal datasets have
received considerably less attention. To complement existing resources, we
present multimodal Video Misleading Headline (VMH), a dataset that consists of
videos and whether annotators believe the headline is representative of the
video's contents. After collecting and annotating this dataset, we analyze
multimodal baselines for detecting misleading headlines. Our annotation process
also focuses on why annotators view a video as misleading, allowing us to
better understand the interplay of annotators' background and the content of
the videos.
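As a rough illustration of what a multimodal baseline for misleading-headline detection might look like, here is a minimal late-fusion sketch in plain Python. All names, features, and weights are illustrative assumptions, not the paper's actual model:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def misleading_score(video_feat: list[float], headline_feat: list[float],
                     weights: list[float], bias: float = 0.0) -> float:
    """Score in [0, 1]; higher means the headline is less representative
    of the video's contents. Features are fused by simple concatenation."""
    fused = video_feat + headline_feat
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return sigmoid(logit)

# Toy 2-D features for each modality (hypothetical values).
score = misleading_score([0.2, -0.1], [0.5, 0.3],
                         weights=[1.0, 0.5, -0.2, 0.8])
```

Real baselines would replace the hand-set features with learned video and text encoders, but the late-fusion shape is the same.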
Related papers
- Multi-view autoencoders for Fake News Detection [5.863538874435322]
This paper proposes using multi-view autoencoders to generate a joint feature representation for fake news detection.
Experiments on fake news datasets show a significant improvement in classification performance compared to individual views.
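A minimal sketch of the joint-representation idea: each view is projected into a shared latent space and the per-view latents are averaged. The projections here are hand-written and illustrative, not the paper's trained autoencoder:

```python
def encode_view(view: list[float], proj: list[list[float]]) -> list[float]:
    # Linear projection of one view into the shared latent space.
    return [sum(w * x for w, x in zip(row, view)) for row in proj]

def joint_representation(views: dict[str, list[float]],
                         projections: dict[str, list[list[float]]]) -> list[float]:
    # Average the per-view latents to form the joint code.
    latents = [encode_view(v, projections[name]) for name, v in views.items()]
    dim = len(latents[0])
    return [sum(l[i] for l in latents) / len(latents) for i in range(dim)]

# Hypothetical 3-D text view and 2-D image view mapped to a 2-D latent.
views = {"text": [1.0, 0.0, 2.0], "image": [0.5, 0.5]}
projections = {
    "text": [[0.1, 0.2, 0.0], [0.0, 0.1, 0.3]],
    "image": [[1.0, 0.0], [0.0, 1.0]],
}
joint = joint_representation(views, projections)
```

An actual multi-view autoencoder would learn the projections by reconstructing each view from the joint code; this only shows the fusion shape.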
arXiv Detail & Related papers (2025-04-10T19:59:34Z)
- FMNV: A Dataset of Media-Published News Videos for Fake News Detection [10.36393083923778]
We construct FMNV, a novel dataset exclusively composed of news videos published by media organizations.
We employ Large Language Models (LLMs) to automatically generate content by manipulating authentic media-published news videos.
We propose FMNVD, a baseline model featuring a dual-stream architecture integrating CLIP and Faster R-CNN for video feature extraction.
arXiv Detail & Related papers (2025-04-10T12:16:32Z)
- Multimodal Fake News Video Explanation: Dataset, Analysis and Evaluation [13.779579002878918]
We develop a new dataset of 2,672 fake news video posts that can definitively explain four real-life fake news video aspects.
In addition, we propose a Multimodal Relation Graph Transformer (MRGT) based on the architecture of multimodal Transformer to benchmark FakeVE.
arXiv Detail & Related papers (2025-01-15T01:52:54Z)
- MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval [57.891157692501345]
MultiVENT 2.0 is a large-scale, multilingual event-centric video retrieval benchmark.
It features a collection of more than 218,000 news videos and 3,906 queries targeting specific world events.
Preliminary results show that state-of-the-art vision-language models struggle significantly with this task.
arXiv Detail & Related papers (2024-10-15T13:56:34Z)
- Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach [56.610806615527885]
A key challenge in text-video retrieval (TVR) is the information asymmetry between video and text.
This paper introduces a data-centric framework to bridge this gap by enriching textual representations to better match the richness of video content.
We propose a query selection mechanism that identifies the most relevant and diverse queries, reducing computational cost while improving accuracy.
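The described relevance-plus-diversity selection resembles a maximal-marginal-relevance (MMR) style greedy pick; here is a toy sketch under that assumption (queries, scores, and similarities are all hypothetical, not the paper's mechanism):

```python
def greedy_query_selection(queries, relevance, similarity, k, lam=0.5):
    """Greedily pick k queries, trading off relevance against redundancy
    with already-selected queries (MMR-style)."""
    selected: list[int] = []
    candidates = list(range(len(queries)))
    while candidates and len(selected) < k:
        def mmr(i):
            # Penalize similarity to the closest already-selected query.
            redundancy = max((similarity(i, j) for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [queries[i] for i in selected]

# Toy data: the first two queries are near-duplicates.
queries = ["sports final", "sports finale", "election results"]
relevance = [0.9, 0.85, 0.4]
pair_sim = {frozenset({0, 1}): 0.95}
similarity = lambda i, j: pair_sim.get(frozenset({i, j}), 0.0)
picked = greedy_query_selection(queries, relevance, similarity, k=2)
```

With the near-duplicate penalized, the second pick is the less relevant but more diverse query.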
arXiv Detail & Related papers (2024-08-14T01:24:09Z)
- Official-NV: An LLM-Generated News Video Dataset for Multimodal Fake News Detection [9.48705939124715]
We construct a dataset named Official-NV, comprising officially published news videos.
The crawled, officially published videos are augmented through LLM-based generation and manual verification.
The proposed dataset is benchmarked against several baselines to demonstrate its effectiveness in multimodal news detection.
arXiv Detail & Related papers (2024-07-28T13:23:43Z)
- FakingRecipe: Detecting Fake News on Short Video Platforms from the Perspective of Creative Process [19.629705422258905]
We introduce a novel perspective that considers how fake news might be created.
Through the lens of the creative process behind news video production, our empirical analysis uncovers the unique characteristics of fake news videos.
Based on the obtained insights, we design FakingRecipe, a creative process-aware model for detecting fake news short videos.
arXiv Detail & Related papers (2024-07-23T17:39:49Z)
- Multi-modal News Understanding with Professionally Labelled Videos
(ReutersViLNews) [25.78619140103048]
We present a large-scale analysis on an in-house dataset collected by the Reuters News Agency, called Reuters Video-Language News (ReutersViLNews) dataset.
The dataset focuses on high-level video-language understanding with an emphasis on long-form news.
The results suggest that news-oriented videos are a substantial challenge for current video-language understanding algorithms.
arXiv Detail & Related papers (2024-01-23T00:42:04Z)
- Video Summarization: Towards Entity-Aware Captions [73.28063602552741]
We propose the task of summarizing news video directly to entity-aware captions.
We show that our approach generalizes to existing news image captions dataset.
arXiv Detail & Related papers (2023-12-01T23:56:00Z)
- AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting
Multiple Experts for Video Deepfake Detection [53.448283629898214]
The recent proliferation of hyper-realistic deepfake videos has drawn attention to the threat of audio and visual forgeries.
Most previous work on detecting AI-generated fake videos only utilize visual modality or audio modality.
We propose an Audio-Visual Transformer-based Ensemble Network (AVTENet) framework that considers both acoustic manipulation and visual manipulation.
arXiv Detail & Related papers (2023-10-19T19:01:26Z)
- InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding
and Generation [90.71796406228265]
InternVid is a large-scale video-centric multimodal dataset that enables learning powerful and transferable video-text representations.
The InternVid dataset contains over 7 million videos lasting nearly 760K hours, yielding 234M video clips accompanied by detailed descriptions of total 4.1B words.
arXiv Detail & Related papers (2023-07-13T17:58:32Z)
- Labelling unlabelled videos from scratch with multi-modal
self-supervision [82.60652426371936]
Unsupervised labelling of a video dataset does not come for free from strong feature encoders.
We propose a novel clustering method that allows pseudo-labelling of a video dataset without any human annotations.
An extensive analysis shows that the resulting clusters have high semantic overlap to ground truth human labels.
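A toy illustration of pseudo-labelling by clustering, using plain k-means in pure Python; the paper's actual method is multi-modal and more sophisticated, and the data here is hypothetical:

```python
import random

def kmeans_pseudo_labels(feats, k, iters=10, seed=0):
    """Cluster feature vectors and return one pseudo-label per item."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(feats, k)]
    labels = [0] * len(feats)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for n, x in enumerate(feats):
            labels[n] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])))
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [feats[n] for n in range(len(feats)) if labels[n] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Two well-separated toy "video embeddings" blobs.
feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels = kmeans_pseudo_labels(feats, k=2)
```

The cluster indices then serve as pseudo-labels for training without human annotations.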
arXiv Detail & Related papers (2020-06-24T12:28:17Z)
- VIOLIN: A Large-Scale Dataset for Video-and-Language Inference [103.7457132841367]
We introduce a new task, Video-and-Language Inference, for joint multimodal understanding of video and text.
Given a video clip with subtitles aligned as premise, paired with a natural language hypothesis based on the video content, a model needs to infer whether the hypothesis is entailed or contradicted by the given video clip.
A new large-scale dataset, named Violin (VIdeO-and-Language INference), is introduced for this task, which consists of 95,322 video-hypothesis pairs from 15,887 video clips.
arXiv Detail & Related papers (2020-03-25T20:39:05Z)
- BaitWatcher: A lightweight web interface for the detection of
incongruent news headlines [27.29585619643952]
BaitWatcher is a lightweight web interface that guides readers in estimating the likelihood of incongruence in news articles before clicking on the headlines.
BaitWatcher utilizes a hierarchical recurrent encoder that efficiently learns complex textual representations of a news headline and its associated body text.
arXiv Detail & Related papers (2020-03-23T23:43:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.