I'll believe it when I see it: Images increase misinformation sharing in Vision-Language Models
- URL: http://arxiv.org/abs/2505.13302v1
- Date: Mon, 19 May 2025 16:20:54 GMT
- Title: I'll believe it when I see it: Images increase misinformation sharing in Vision-Language Models
- Authors: Alice Plebe, Timothy Douglas, Diana Riazi, R. Maria del Rio-Chanona
- Abstract summary: We present the first study examining how images influence vision-language models' propensity to reshare news content. Experiments across model families reveal that image presence increases resharing rates by 4.8% for true news and 15.0% for false news. Persona conditioning further modulates this effect: Dark Triad traits amplify resharing of false news, whereas Republican-aligned profiles exhibit reduced veracity sensitivity.
- Score: 1.5186937600119894
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models are increasingly integrated into news recommendation systems, raising concerns about their role in spreading misinformation. In humans, visual content is known to boost credibility and shareability of information, yet its effect on vision-language models (VLMs) remains unclear. We present the first study examining how images influence VLMs' propensity to reshare news content, whether this effect varies across model families, and how persona conditioning and content attributes modulate this behavior. To support this analysis, we introduce two methodological contributions: a jailbreaking-inspired prompting strategy that elicits resharing decisions from VLMs while simulating users with antisocial traits and political alignments; and a multimodal dataset of fact-checked political news from PolitiFact, paired with corresponding images and ground-truth veracity labels. Experiments across model families reveal that image presence increases resharing rates by 4.8% for true news and 15.0% for false news. Persona conditioning further modulates this effect: Dark Triad traits amplify resharing of false news, whereas Republican-aligned profiles exhibit reduced veracity sensitivity. Of all the tested models, only Claude-3-Haiku demonstrates robustness to visual misinformation. These findings highlight emerging risks in multimodal model behavior and motivate the development of tailored evaluation frameworks and mitigation strategies for personalized AI systems. Code and dataset are available at: https://github.com/3lis/misinfo_vlm
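To make the described experimental setup concrete, the sketch below shows one way a persona-conditioned resharing decision could be elicited from a VLM with and without an accompanying image. This is a minimal illustration assuming an OpenAI-compatible chat API; the persona text, model name, headline, and image URL are placeholders, not the authors' materials, and the paper's actual jailbreaking-inspired prompting strategy and PolitiFact dataset are available in the linked repository.

```python
# Minimal sketch: eliciting a reshare decision from a VLM in text-only vs. text+image
# conditions. Assumes an OpenAI-compatible chat API; persona, model name, headline,
# and image URL below are illustrative placeholders, not the paper's prompts or data.
from openai import OpenAI

client = OpenAI()

PERSONA = (
    "You are role-playing a social media user with a strong partisan identity "
    "who shares content impulsively."  # placeholder persona, not the paper's prompt
)

def reshare_decision(headline: str, image_url: str | None = None,
                     model: str = "gpt-4o-mini") -> str:
    """Ask the model for a binary reshare decision on a news headline,
    optionally attaching an image of the story."""
    user_content = [
        {"type": "text",
         "text": f"Would you reshare this news post? Answer YES or NO only.\n\n{headline}"}
    ]
    if image_url is not None:
        user_content.append({"type": "image_url", "image_url": {"url": image_url}})

    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": PERSONA},
            {"role": "user", "content": user_content},
        ],
        max_tokens=3,
        temperature=0.0,
    )
    return response.choices[0].message.content.strip().upper()

# Comparing the two conditions for one item; aggregating YES rates over true vs.
# false items would yield the kind of resharing-rate comparison reported above.
headline = "Example headline from a fact-checked political news item"
print(reshare_decision(headline))
print(reshare_decision(headline, image_url="https://example.com/story.jpg"))
```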
Related papers
- Revisiting LLM Value Probing Strategies: Are They Robust and Expressive? [81.49470136653665]
We evaluate the robustness and expressiveness of value representations across three widely used probing strategies. We show that the demographic context has little effect on the free-text generation, and the models' values only weakly correlate with their preference for value-based actions.
arXiv Detail & Related papers (2025-07-17T18:56:41Z) - VIP: Visual Information Protection through Adversarial Attacks on Vision-Language Models [15.158545794377169]
We frame the preservation of privacy in Vision-Language Models as an adversarial attack problem. We propose a novel attack strategy that selectively conceals information within designated Regions of Interest in an image. Experimental results across three state-of-the-art VLMs demonstrate up to a 98% reduction in detection of targeted ROIs.
arXiv Detail & Related papers (2025-07-11T19:34:01Z) - Large Language Model-Informed Feature Discovery Improves Prediction and Interpretation of Credibility Perceptions of Visual Content [0.24999074238880484]
We introduce a Large Language Model (LLM)-informed feature discovery framework to evaluate content credibility and explain its reasoning. We extract and quantify interpretable features using targeted prompts and integrate them into machine learning models to improve credibility predictions. Our method outperformed zero-shot GPT-based predictions by 13% in R² and revealed key features such as information concreteness and image format.
arXiv Detail & Related papers (2025-04-15T05:11:40Z) - Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries [85.909363478929]
In this study, we focus on 19 real-world statistics collected from authoritative sources. We develop a checklist comprising objective and subjective queries to analyze the behavior of large language models. We propose metrics to assess factuality and fairness, and formally prove the inherent trade-off between these two aspects.
arXiv Detail & Related papers (2025-02-09T10:54:11Z) - A Self-Learning Multimodal Approach for Fake News Detection [35.98977478616019]
We introduce a self-learning multimodal model for fake news classification. The model leverages contrastive learning, a robust method for feature extraction that operates without requiring labeled data. Our experimental results on a public dataset demonstrate that the proposed model outperforms several state-of-the-art classification approaches.
arXiv Detail & Related papers (2024-12-08T07:41:44Z) - VHELM: A Holistic Evaluation of Vision Language Models [75.88987277686914]
We present the Holistic Evaluation of Vision Language Models (VHELM).
VHELM aggregates various datasets to cover one or more of the 9 aspects: visual perception, knowledge, reasoning, bias, fairness, multilinguality, robustness, toxicity, and safety.
Our framework is designed to be lightweight and automatic so that evaluation runs are cheap and fast.
arXiv Detail & Related papers (2024-10-09T17:46:34Z) - SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z) - SEPSIS: I Can Catch Your Lies -- A New Paradigm for Deception Detection [9.20397189600732]
This research explores the problem of deception through the lens of psychology.
We propose a novel framework for deception detection leveraging NLP techniques.
We present a novel multi-task learning pipeline that leverages the dataless merging of fine-tuned language models.
arXiv Detail & Related papers (2023-12-01T02:13:25Z) - Fact-checking information from large language models can decrease headline discernment [6.814801748069122]
We investigate the impact of fact-checking information generated by a popular large language model on belief in, and sharing intent of, political news headlines.
We find that this information does not significantly improve participants' ability to discern headline accuracy or share accurate news.
Our findings highlight an important source of potential harm stemming from AI applications.
arXiv Detail & Related papers (2023-08-21T15:47:37Z) - Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
arXiv Detail & Related papers (2022-12-27T16:08:49Z) - Like Article, Like Audience: Enforcing Multimodal Correlations for Disinformation Detection [20.394457328537975]
Correlations between user-generated and user-shared content can be leveraged for detecting disinformation in online news articles.
We develop a multimodal learning algorithm for disinformation detection.
arXiv Detail & Related papers (2021-08-31T14:50:16Z) - Machine Learning Explanations to Prevent Overtrust in Fake News Detection [64.46876057393703]
This research investigates the effects of an Explainable AI assistant embedded in news review platforms for combating the propagation of fake news.
We design a news reviewing and sharing interface, create a dataset of news stories, and train four interpretable fake news detection algorithms.
For a deeper understanding of Explainable AI systems, we discuss interactions between user engagement, mental models, trust, and performance measures in the explanation process.
arXiv Detail & Related papers (2020-07-24T05:42:29Z)