Related papers: Hidden in the Metadata: Stealth Poisoning Attacks on Multimodal Retrieval-Augmented Generation

Hidden in the Metadata: Stealth Poisoning Attacks on Multimodal Retrieval-Augmented Generation

URL: http://arxiv.org/abs/2603.00172v1
Date: Thu, 26 Feb 2026 15:59:45 GMT
Title: Hidden in the Metadata: Stealth Poisoning Attacks on Multimodal Retrieval-Augmented Generation
Authors: Kennedy Edemacu, Mohammad Mahdi Shokri,
Abstract summary: We present MM-MEPA, a multimodal poisoning attack that targets the metadata components of image-text entries while leaving the associated visual content unaltered.<n> MM-MEPA achieves an attack success rate of up to 91% consistently disrupting system behaviors across four retrievers and two multimodal generators.
Score: 0.8103046443444949
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Retrieval-augmented generation (RAG) has emerged as a powerful paradigm for enhancing multimodal large language models by grounding their responses in external, factual knowledge and thus mitigating hallucinations. However, the integration of externally sourced knowledge bases introduces a critical attack surface. Adversaries can inject malicious multimodal content capable of influencing both retrieval and downstream generation. In this work, we present MM-MEPA, a multimodal poisoning attack that targets the metadata components of image-text entries while leaving the associated visual content unaltered. By only manipulating the metadata, MM-MEPA can still steer multimodal retrieval and induce attacker-desired model responses. We evaluate the attack across multiple benchmark settings and demonstrate its severity. MM-MEPA achieves an attack success rate of up to 91\% consistently disrupting system behaviors across four retrievers and two multimodal generators. Additionally, we assess representative defense strategies and find them largely ineffective against this form of metadata-only poisoning. Our findings expose a critical vulnerability in multimodal RAG and underscore the urgent need for more robust, defense-aware retrieval and knowledge integration methods.

Related papers

HV-Attack: Hierarchical Visual Attack for Multimodal Retrieval Augmented Generation [30.52562642784609]
multimodal Retrieval-Augmented Generation (MRAG) techniques have been widely applied to enhance the capabilities of Large Multimodal Models (LMMs)<n>Existing adversarial research has revealed the vulnerability of MRAG systems to knowledge poisoning attacks.<n>We propose visual attack of MRAG by solely adding imperceptible perturbations at the image inputs of users.
arXiv Detail & Related papers (2025-11-19T13:45:24Z)
MrM: Black-Box Membership Inference Attacks against Multimodal RAG Systems [31.53306157650065]
Multimodal retrieval-augmented generation (RAG) systems enhance large vision-language models by integrating cross-modal knowledge.<n>These knowledge databases may contain sensitive information that requires privacy protection.<n>MrM is the first black-box MIA framework targeted at multimodal RAG systems.
arXiv Detail & Related papers (2025-06-09T03:48:50Z)
MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks [85.3303135160762]
MIRAGE is a novel framework that exploits narrative-driven context and role immersion to circumvent safety mechanisms in Multimodal Large Language Models.<n>It achieves state-of-the-art performance, improving attack success rates by up to 17.5% over the best baselines.<n>We demonstrate that role immersion and structured semantic reconstruction can activate inherent model biases, facilitating the model's spontaneous violation of ethical safeguards.
arXiv Detail & Related papers (2025-03-24T20:38:42Z)
Poisoned-MRAG: Knowledge Poisoning Attacks to Multimodal Retrieval Augmented Generation [71.32665836294103]
Multimodal retrieval-augmented generation (RAG) enhances the visual reasoning capability of vision-language models (VLMs)<n>In this work, we introduce textitPoisoned-MRAG, the first knowledge poisoning attack on multimodal RAG systems.
arXiv Detail & Related papers (2025-03-08T15:46:38Z)
MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks [104.50239783909063]
Multimodal large language models with Retrieval Augmented Generation (RAG) have significantly advanced tasks such as multimodal question answering.<n>This reliance on external knowledge poses a critical yet underexplored safety risk: knowledge poisoning attacks.<n>We propose MM-PoisonRAG, the first framework to systematically design knowledge poisoning in multimodal RAG.
arXiv Detail & Related papers (2025-02-25T04:23:59Z)
Retrievals Can Be Detrimental: A Contrastive Backdoor Attack Paradigm on Retrieval-Augmented Diffusion Models [37.66349948811172]
Diffusion models (DMs) have recently demonstrated remarkable generation capability.<n>Recent studies empower DMs with the advanced Retrieval-Augmented Generation (RAG) technique.<n>RAG enhances DMs' generation and generalization ability while significantly reducing model parameters.<n>Despite the great success, RAG may introduce novel security issues that warrant further investigation.
arXiv Detail & Related papers (2025-01-23T02:42:28Z)
Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights [50.89022445197919]
We propose a speech-specific risk taxonomy, covering 8 risk categories under hostility (malicious sarcasm and threats), malicious imitation (age, gender, ethnicity), and stereotypical biases (age, gender, ethnicity) Based on the taxonomy, we create a small-scale dataset for evaluating current LMMs capability in detecting these categories of risk.
arXiv Detail & Related papers (2024-06-25T10:08:45Z)
Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena.<n>We find that we can successfully break latest agents that use black-box frontier LMs, including those that perform reflection and tree search.<n>We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)
White-box Multimodal Jailbreaks Against Large Vision-Language Models [61.97578116584653]
We propose a more comprehensive strategy that jointly attacks both text and image modalities to exploit a broader spectrum of vulnerability within Large Vision-Language Models. Our attack method begins by optimizing an adversarial image prefix from random noise to generate diverse harmful responses in the absence of text input. An adversarial text suffix is integrated and co-optimized with the adversarial image prefix to maximize the probability of eliciting affirmative responses to various harmful instructions.
arXiv Detail & Related papers (2024-05-28T07:13:30Z)
Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources. FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks. We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously. MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.