The Security Threat of Compressed Projectors in Large Vision-Language Models
- URL: http://arxiv.org/abs/2506.00534v2
- Date: Sat, 04 Oct 2025 03:41:35 GMT
- Title: The Security Threat of Compressed Projectors in Large Vision-Language Models
- Authors: Yudong Zhang, Ruobing Xie, Xingwu Sun, Jiansheng Chen, Zhanhui Kang, Di Wang, Yu Wang,
- Abstract summary: The choice of a suitable visual language projector is critical to the successful training of large visual language models.<n>Our evaluation reveals significant differences in their security profiles.
- Score: 55.27670937069075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The choice of a suitable visual language projector (VLP) is critical to the successful training of large visual language models (LVLMs). Mainstream VLPs can be broadly categorized into compressed and uncompressed projectors, and each offers distinct advantages in performance and computational efficiency. However, their security implications have not been thoroughly examined. Our comprehensive evaluation reveals significant differences in their security profiles: compressed projectors exhibit substantial vulnerabilities, allowing adversaries to successfully compromise LVLMs even with minimal knowledge of structure information. In stark contrast, uncompressed projectors demonstrate robust security properties and do not introduce additional vulnerabilities. These findings provide critical guidance for researchers in selecting optimal VLPs that enhance the security and reliability of visual language models. The code is available at https://github.com/btzyd/TCP.
Related papers
- Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility [26.564913442069866]
Vision language models (VLMs) extend the reasoning capabilities of large language models (LLMs) to cross-modal settings.<n>Existing defenses rely on safety fine-tuning or aggressive token manipulations, incurring substantial training costs or significantly degrading utility.<n>We propose Risk Awareness Injection (RAI), a lightweight and training-free framework for safety calibration.
arXiv Detail & Related papers (2026-02-03T11:26:05Z) - CoViPAL: Layer-wise Contextualized Visual Token Pruning for Large Vision-Language Models [75.88232735646018]
Large Vision-Language Models (LVLMs) process multimodal inputs consisting of text tokens and vision tokens extracted from images or videos.<n>Existing methods attempt to prune redundant vision tokens, revealing substantial redundancy in visual representations.<n>We propose CoViPAL, a layer-wise contextualized visual token pruning method that employs a Plug-and-Play Pruning Module (PPM) to predict and remove redundant vision tokens before they are processed by the LVLM.
arXiv Detail & Related papers (2025-08-24T07:47:00Z) - Security Tensors as a Cross-Modal Bridge: Extending Text-Aligned Safety to Vision in LVLM [40.83149588857177]
Large visual-language models (LVLMs) integrate aligned large language models (LLMs) with visual modules to process multimodal inputs.<n>We introduce security tensors - trainable input vectors applied during inference through either the textual or visual modality.
arXiv Detail & Related papers (2025-07-28T16:59:53Z) - Response Wide Shut? Surprising Observations in Basic Vision Language Model Capabilities [54.94982467313341]
Vision-language Models (VLMs) have emerged as general-purpose tools for addressing a variety of complex computer vision problems.<n>We set out to understand the limitations of SoTA VLMs on fundamental visual tasks by constructing a series of tests that probe which components of design, specifically, may be lacking.
arXiv Detail & Related papers (2025-07-10T15:26:41Z) - HoliSafe: Holistic Safety Benchmarking and Modeling for Vision-Language Model [58.12612140992874]
We introduce a holistic safety dataset and benchmark, textbfHoliSafe, that spans all five safe/unsafe image-text combinations.<n>We also propose a novel modular framework for enhancing VLM safety with a visual guard module (VGM) designed to assess the harmfulness of input images.<n> Experiments show that Safe-VLM with VGM, trained on our HoliSafe, achieves state-of-the-art safety performance across multiple VLM benchmarks.
arXiv Detail & Related papers (2025-06-05T07:26:34Z) - Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs [51.90597846977058]
Video-SafetyBench is the first benchmark designed to evaluate the safety of LVLMs under video-text attacks.<n>It comprises 2,264 video-text pairs spanning 48 fine-grained unsafe categories.<n>To generate semantically accurate videos for safety evaluation, we design a controllable pipeline that decomposes video semantics into subject images and motion text.
arXiv Detail & Related papers (2025-05-17T05:06:38Z) - Transferable Adversarial Attacks on Black-Box Vision-Language Models [63.22532779621001]
adversarial attacks can transfer from open-source to proprietary black-box models in text-only and vision-only contexts.<n>We show that attackers can craft perturbations to induce specific attacker-chosen interpretations of visual information.<n>We discover that universal perturbations -- modifications applicable to a wide set of images -- can consistently induce these misinterpretations.
arXiv Detail & Related papers (2025-05-02T06:51:11Z) - Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models? [83.53005932513155]
Multi-modal large language models (MLLMs) have made significant progress, yet their safety alignment remains limited.<n>We propose finetuning MLLMs on a small set of benign instruct-following data with responses replaced by simple, clear rejection sentences.
arXiv Detail & Related papers (2025-04-14T09:03:51Z) - PSA-VLM: Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment [28.008884416277954]
We propose a progressive concept-based alignment strategy, PSA-VLM, to enhance visual modality safety alignment.<n>Our method achieves state-of-the-art results on popular VLM safety benchmark.
arXiv Detail & Related papers (2024-11-18T13:01:57Z) - Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models [15.029014337718849]
Large vision-language models (LVLMs) integrate visual information into large language models, showcasing remarkable multi-modal conversational capabilities.
In general, LVLMs rely on vision encoders to transform images into visual tokens, which are crucial for the language models to perceive image contents effectively.
We propose a non-targeted attack method referred to as VT-Attack, which constructs adversarial examples from multiple perspectives.
arXiv Detail & Related papers (2024-10-09T09:06:56Z) - Safety Alignment for Vision Language Models [21.441662865727448]
We enhance the visual modality safety alignment of Vision Language Models (VLMs) by adding safety modules.
Our method boasts ease of use, high flexibility, and strong controllability, and it enhances safety while having minimal impact on the model's general performance.
arXiv Detail & Related papers (2024-05-22T12:21:27Z) - B-AVIBench: Towards Evaluating the Robustness of Large Vision-Language Model on Black-box Adversarial Visual-Instructions [73.97665608366447]
Large Vision-Language Models (LVLMs) have shown significant progress in responding well to visual-instructions from users.<n>These instructions, encompassing images and text, are susceptible to both intentional and inadvertent attacks.<n>We introduce B-AVIBench, a framework designed to analyze the robustness of LVLMs when facing various Black-box Adrial Visual-Instructions.
arXiv Detail & Related papers (2024-03-14T12:51:07Z) - Adversarial Prompt Tuning for Vision-Language Models [86.5543597406173]
Adversarial Prompt Tuning (AdvPT) is a technique to enhance the adversarial robustness of image encoders in Vision-Language Models (VLMs)
We demonstrate that AdvPT improves resistance against white-box and black-box adversarial attacks and exhibits a synergistic effect when combined with existing image-processing-based defense techniques.
arXiv Detail & Related papers (2023-11-19T07:47:43Z) - On Evaluating Adversarial Robustness of Large Vision-Language Models [64.66104342002882]
We evaluate the robustness of large vision-language models (VLMs) in the most realistic and high-risk setting.
In particular, we first craft targeted adversarial examples against pretrained models such as CLIP and BLIP.
Black-box queries on these VLMs can further improve the effectiveness of targeted evasion.
arXiv Detail & Related papers (2023-05-26T13:49:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.