Related papers: Multimodal Misinformation Detection using Large Vision-Language Models

Multimodal Misinformation Detection using Large Vision-Language Models

URL: http://arxiv.org/abs/2407.14321v1
Date: Fri, 19 Jul 2024 13:57:11 GMT
Title: Multimodal Misinformation Detection using Large Vision-Language Models
Authors: Sahar Tahmasebi, Eric Müller-Budack, Ralph Ewerth,
Abstract summary: Large language models (LLMs) have shown remarkable performance in various tasks. Few approaches consider evidence retrieval as part of misinformation detection. We propose a novel re-ranking approach for multimodal evidence retrieval.
Score: 7.505532091249881
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The increasing proliferation of misinformation and its alarming impact have motivated both industry and academia to develop approaches for misinformation detection and fact checking. Recent advances on large language models (LLMs) have shown remarkable performance in various tasks, but whether and how LLMs could help with misinformation detection remains relatively underexplored. Most of existing state-of-the-art approaches either do not consider evidence and solely focus on claim related features or assume the evidence to be provided. Few approaches consider evidence retrieval as part of the misinformation detection but rely on fine-tuning models. In this paper, we investigate the potential of LLMs for misinformation detection in a zero-shot setting. We incorporate an evidence retrieval component into the process as it is crucial to gather pertinent information from various sources to detect the veracity of claims. To this end, we propose a novel re-ranking approach for multimodal evidence retrieval using both LLMs and large vision-language models (LVLM). The retrieved evidence samples (images and texts) serve as the input for an LVLM-based approach for multimodal fact verification (LVLM4FV). To enable a fair evaluation, we address the issue of incomplete ground truth for evidence samples in an existing evidence retrieval dataset by annotating a more complete set of evidence samples for both image and text retrieval. Our experimental results on two datasets demonstrate the superiority of the proposed approach in both evidence retrieval and fact verification tasks and also better generalization capability across dataset compared to the supervised baseline.

Related papers

Unstructured Evidence Attribution for Long Context Query Focused Summarization [46.713307974729844]
Large language models (LLMs) are capable of generating coherent summaries from very long contexts given a user query. We show how existing systems struggle to generate and properly cite unstructured evidence from their context.
arXiv Detail & Related papers (2025-02-20T09:57:42Z)
E2LVLM:Evidence-Enhanced Large Vision-Language Model for Multimodal Out-of-Context Misinformation Detection [7.1939657372410375]
We present E2LVLM, a novel evidence-enhanced large vision-language model by adapting textual evidence in two levels. To address the scarcity of news domain datasets with both judgment and explanation, we generate a novel OOC multimodal instruction-following dataset. A multitude of experiments demonstrate that E2LVLM achieves superior performance than state-of-the-art methods.
arXiv Detail & Related papers (2025-02-12T04:25:14Z)
MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking [0.283600654802951]
We present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal datasets. We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths. Our approach outperforms the SOTA approach by 4.6% in the claim verification task on the MOCHEG dataset.
arXiv Detail & Related papers (2024-07-18T01:33:20Z)
Detect, Investigate, Judge and Determine: A Knowledge-guided Framework for Few-shot Fake News Detection [50.079690200471454]
Few-Shot Fake News Detection (FS-FND) aims to distinguish inaccurate news from real ones in extremely low-resource scenarios. This task has garnered increased attention due to the widespread dissemination and harmful impact of fake news on social media. We propose a Dual-perspective Knowledge-guided Fake News Detection (DKFND) model, designed to enhance LLMs from both inside and outside perspectives.
arXiv Detail & Related papers (2024-07-12T03:15:01Z)
Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments [23.639378586798884]
We propose retrieval augmented fact verification through the synthesis of contrasting arguments. Our method effectively retrieves relevant documents as evidence and evaluates arguments from varying perspectives. We demonstrate the effectiveness of our method through extensive experiments, where RAFTS can outperform GPT-based methods with a significantly smaller 7B LLM.
arXiv Detail & Related papers (2024-06-14T08:13:34Z)
Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors [38.75533934195315]
Large Language Models (LLMs) are known for their remarkable reasoning and generative capabilities. We introduce a novel, retrieval-augmented LLMs framework--the first of its kind to automatically and strategically extract key evidence from web sources for claim verification. Our framework ensures the acquisition of sufficient, relevant evidence, thereby enhancing performance.
arXiv Detail & Related papers (2024-03-14T00:35:39Z)
C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations. Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks [91.55895047448249]
This paper presents ReEval, an LLM-based framework using prompt chaining to perturb the original evidence for generating new test cases. We implement ReEval using ChatGPT and evaluate the resulting variants of two popular open-domain QA datasets. Our generated data is human-readable and useful to trigger hallucination in large language models.
arXiv Detail & Related papers (2023-10-19T06:37:32Z)
FactCHD: Benchmarking Fact-Conflicting Hallucination Detection [64.4610684475899]
FactCHD is a benchmark designed for the detection of fact-conflicting hallucinations from LLMs. FactCHD features a diverse dataset that spans various factuality patterns, including vanilla, multi-hop, comparison, and set operation. We introduce Truth-Triangulator that synthesizes reflective considerations by tool-enhanced ChatGPT and LoRA-tuning based on Llama2.
arXiv Detail & Related papers (2023-10-18T16:27:49Z)
Progressive Evidence Refinement for Open-domain Multimodal Retrieval Question Answering [20.59485758381809]
Current multimodal retrieval question-answering models face two main challenges. utilizing compressed evidence features as input to the model results in the loss of fine-grained information within the evidence. We propose a two-stage framework for evidence retrieval and question-answering to alleviate these issues.
arXiv Detail & Related papers (2023-10-15T01:18:39Z)
Give Me More Details: Improving Fact-Checking with Latent Retrieval [58.706972228039604]
Evidence plays a crucial role in automated fact-checking. Existing fact-checking systems either assume the evidence sentences are given or use the search snippets returned by the search engine. We propose to incorporate full text from source documents as evidence and introduce two enriched datasets.
arXiv Detail & Related papers (2023-05-25T15:01:19Z)
GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidences in a generative fashion. The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
A Multi-Level Attention Model for Evidence-Based Fact Checking [58.95413968110558]
We present a simple model that can be trained on sequence structures. Results on a large-scale dataset for Fact Extraction and VERification show that our model outperforms the graph-based approaches.
arXiv Detail & Related papers (2021-06-02T05:40:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.