Related papers: SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models

SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models

URL: http://arxiv.org/abs/2510.26769v1
Date: Thu, 30 Oct 2025 17:52:39 GMT
Title: SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models
Authors: Anushka Sivakumar, Andrew Zhang, Zaber Hakim, Chris Thomas,
Abstract summary: This work introduces SteerVLM, a lightweight steering module for Vision-Language Models (VLMs)<n>Our approach learns from the latent embeddings of paired prompts encoding target and converse behaviors to dynamically adjust activations connecting the language modality with image context.<n>Our steering module requires learning parameters equal to 0.14% of the original VLM's size.
Score: 4.506695482619111
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This work introduces SteerVLM, a lightweight steering module designed to guide Vision-Language Models (VLMs) towards outputs that better adhere to desired instructions. Our approach learns from the latent embeddings of paired prompts encoding target and converse behaviors to dynamically adjust activations connecting the language modality with image context. This allows for fine-grained, inference-time control over complex output semantics without modifying model weights while preserving performance on off-target tasks. Our steering module requires learning parameters equal to 0.14% of the original VLM's size. Our steering module gains model control through dimension-wise activation modulation and adaptive steering across layers without requiring pre-extracted static vectors or manual tuning of intervention points. Furthermore, we introduce VNIA (Visual Narrative Intent Alignment), a multimodal dataset specifically created to facilitate the development and evaluation of VLM steering techniques. Our method outperforms existing intervention techniques on steering and hallucination mitigation benchmarks for VLMs and proposes a robust solution for multimodal model control through activation engineering.

Related papers

From One-to-One to Many-to-Many: Dynamic Cross-Layer Injection for Deep Vision-Language Fusion [91.35078719566472]
Vision-Language Models (VLMs) create a severe visual feature bottleneck by using a crude, asymmetric connection.<n>We introduce Cross-Layer Injection (CLI), a novel and lightweight framework that forges a dynamic many-to-many bridge between the two modalities.
arXiv Detail & Related papers (2026-01-15T18:59:10Z)
VISOR++: Universal Visual Inputs based Steering for Large Vision Language Models [2.8676122062166187]
We introduce universal visual input based steering for output redirection (VISOR++) to achieve behavioral control through optimized visual inputs alone.<n>We demonstrate that a single VISOR++ image can be generated for an ensemble of Vision Language Models (VLMs) to emulate each of their steering vectors.<n>We also show the promise of VISOR++ images in achieving directional behavioral shifts for unseen models including both open-access and closed-access ones.
arXiv Detail & Related papers (2025-09-29T21:43:18Z)
EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering [55.56674028743782]
Large language model (LLM) steering has emerged as a promising paradigm for controlling model behavior at inference time.<n>We present EasySteer, a unified framework for high-performance, LLM steering built on vLLM.
arXiv Detail & Related papers (2025-09-29T17:59:07Z)
Mechanistic interpretability for steering vision-language-action models [0.23371356738437823]
Vision-Language-Action (VLA) models are a promising path to realizing generalist embodied agents.<n>We introduce the first framework for interpreting and steering VLAs via their internal representations.<n>We introduce a general-purpose activation steering method that modulates behavior in real time, without fine-tuning, reward signals, or environment interaction.
arXiv Detail & Related papers (2025-08-30T03:01:57Z)
LF-Steering: Latent Feature Activation Steering for Enhancing Semantic Consistency in Large Language Models [16.37602070339033]
Large Language Models (LLMs) often generate inconsistent responses when prompted with semantically equivalent paraphrased inputs.<n>We propose LF-Steering, a novel activation steering approach to precisely identify latent feature representations responsible for semantic inconsistency.<n>Our method maps the hidden states of the relevant transformer layer into a sparsely activated, high-dimensional feature space based on a sparse autoencoder.
arXiv Detail & Related papers (2025-01-19T13:06:51Z)
LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering [30.51487692912812]
Multimodal Large Language Models (MLLMs) have significantly advanced visual tasks by integrating visual representations into large language models (LLMs)<n>We introduce Modality Linear Representation-Steering (MoReS) to achieve the goal.<n>MoReS effectively re-balances the intrinsic modalities throughout the model, where the key idea is to steer visual representations through linear transformations in the visual subspace across each model layer.
arXiv Detail & Related papers (2024-12-16T21:14:11Z)
Improving Instruction-Following in Language Models through Activation Steering [58.876600545898675]
We derive instruction-specific vector representations from language models and use them to steer models accordingly.<n>We demonstrate how this method can enhance model adherence to constraints such as output format, length, and word inclusion.<n>Our findings demonstrate that activation steering offers a practical and scalable approach for fine-grained control in language generation.
arXiv Detail & Related papers (2024-10-15T08:38:20Z)
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models [70.25499865569353]
We introduce HyperLLaVA, which involves adaptive tuning of the projector and LLM parameters, in conjunction with a dynamic visual expert and language expert. Our solution significantly surpasses LLaVA on existing MLLM benchmarks, including MME, MMBench, SEED-Bench, and LLaVA-Bench.
arXiv Detail & Related papers (2024-03-20T09:42:43Z)
DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving [69.82743399946371]
DriveMLM is a framework that can perform close-loop autonomous driving in realistic simulators. We employ a multi-modal LLM (MLLM) to model the behavior planning module of a module AD system. This model can plug-and-play in existing AD systems such as Apollo for close-loop driving.
arXiv Detail & Related papers (2023-12-14T18:59:05Z)
Goal-Conditioned End-to-End Visuomotor Control for Versatile Skill Primitives [89.34229413345541]
We propose a conditioning scheme which avoids pitfalls by learning the controller and its conditioning in an end-to-end manner. Our model predicts complex action sequences based directly on a dynamic image representation of the robot motion. We report significant improvements in task success over representative MPC and IL baselines.
arXiv Detail & Related papers (2020-03-19T15:04:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.