Multimodal Misinformation Detection using Large Vision-Language Models
- URL: http://arxiv.org/abs/2407.14321v1
- Date: Fri, 19 Jul 2024 13:57:11 GMT
- Title: Multimodal Misinformation Detection using Large Vision-Language Models
- Authors: Sahar Tahmasebi, Eric Müller-Budack, Ralph Ewerth,
- Abstract summary: Large language models (LLMs) have shown remarkable performance in various tasks.
Few approaches consider evidence retrieval as part of misinformation detection.
We propose a novel re-ranking approach for multimodal evidence retrieval.
- Score: 7.505532091249881
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing proliferation of misinformation and its alarming impact have motivated both industry and academia to develop approaches for misinformation detection and fact checking. Recent advances on large language models (LLMs) have shown remarkable performance in various tasks, but whether and how LLMs could help with misinformation detection remains relatively underexplored. Most of existing state-of-the-art approaches either do not consider evidence and solely focus on claim related features or assume the evidence to be provided. Few approaches consider evidence retrieval as part of the misinformation detection but rely on fine-tuning models. In this paper, we investigate the potential of LLMs for misinformation detection in a zero-shot setting. We incorporate an evidence retrieval component into the process as it is crucial to gather pertinent information from various sources to detect the veracity of claims. To this end, we propose a novel re-ranking approach for multimodal evidence retrieval using both LLMs and large vision-language models (LVLM). The retrieved evidence samples (images and texts) serve as the input for an LVLM-based approach for multimodal fact verification (LVLM4FV). To enable a fair evaluation, we address the issue of incomplete ground truth for evidence samples in an existing evidence retrieval dataset by annotating a more complete set of evidence samples for both image and text retrieval. Our experimental results on two datasets demonstrate the superiority of the proposed approach in both evidence retrieval and fact verification tasks and also better generalization capability across dataset compared to the supervised baseline.
Related papers
- Unstructured Evidence Attribution for Long Context Query Focused Summarization [46.713307974729844]
Large language models (LLMs) are capable of generating coherent summaries from very long contexts given a user query.
We show how existing systems struggle to generate and properly cite unstructured evidence from their context.
arXiv Detail & Related papers (2025-02-20T09:57:42Z) - E2LVLM:Evidence-Enhanced Large Vision-Language Model for Multimodal Out-of-Context Misinformation Detection [7.1939657372410375]
We present E2LVLM, a novel evidence-enhanced large vision-language model by adapting textual evidence in two levels.
To address the scarcity of news domain datasets with both judgment and explanation, we generate a novel OOC multimodal instruction-following dataset.
A multitude of experiments demonstrate that E2LVLM achieves superior performance than state-of-the-art methods.
arXiv Detail & Related papers (2025-02-12T04:25:14Z) - MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking [0.283600654802951]
We present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal datasets.
We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths.
Our approach outperforms the SOTA approach by 4.6% in the claim verification task on the MOCHEG dataset.
arXiv Detail & Related papers (2024-07-18T01:33:20Z) - Detect, Investigate, Judge and Determine: A Knowledge-guided Framework for Few-shot Fake News Detection [50.079690200471454]
Few-Shot Fake News Detection (FS-FND) aims to distinguish inaccurate news from real ones in extremely low-resource scenarios.
This task has garnered increased attention due to the widespread dissemination and harmful impact of fake news on social media.
We propose a Dual-perspective Knowledge-guided Fake News Detection (DKFND) model, designed to enhance LLMs from both inside and outside perspectives.
arXiv Detail & Related papers (2024-07-12T03:15:01Z) - Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments [23.639378586798884]
We propose retrieval augmented fact verification through the synthesis of contrasting arguments.
Our method effectively retrieves relevant documents as evidence and evaluates arguments from varying perspectives.
We demonstrate the effectiveness of our method through extensive experiments, where RAFTS can outperform GPT-based methods with a significantly smaller 7B LLM.
arXiv Detail & Related papers (2024-06-14T08:13:34Z) - C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z) - Progressive Evidence Refinement for Open-domain Multimodal Retrieval
Question Answering [20.59485758381809]
Current multimodal retrieval question-answering models face two main challenges.
utilizing compressed evidence features as input to the model results in the loss of fine-grained information within the evidence.
We propose a two-stage framework for evidence retrieval and question-answering to alleviate these issues.
arXiv Detail & Related papers (2023-10-15T01:18:39Z) - Give Me More Details: Improving Fact-Checking with Latent Retrieval [58.706972228039604]
Evidence plays a crucial role in automated fact-checking.
Existing fact-checking systems either assume the evidence sentences are given or use the search snippets returned by the search engine.
We propose to incorporate full text from source documents as evidence and introduce two enriched datasets.
arXiv Detail & Related papers (2023-05-25T15:01:19Z) - GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidences in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z) - A Multi-Level Attention Model for Evidence-Based Fact Checking [58.95413968110558]
We present a simple model that can be trained on sequence structures.
Results on a large-scale dataset for Fact Extraction and VERification show that our model outperforms the graph-based approaches.
arXiv Detail & Related papers (2021-06-02T05:40:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.