VISPA: Pluralistic Alignment via Automatic Value Selection and Activation
- URL: http://arxiv.org/abs/2601.12758v1
- Date: Mon, 19 Jan 2026 06:38:52 GMT
- Title: VISPA: Pluralistic Alignment via Automatic Value Selection and Activation
- Authors: Shenyan Zheng, Jiayou Zhong, Anudeex Shetty, Heng Ji, Preslav Nakov, Usman Naseem
- Abstract summary: We introduce VISPA, a training-free pluralistic alignment framework. We show VISPA is performant across all pluralistic alignment modes in healthcare and beyond.
- Score: 82.8405077104797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As large language models are increasingly used in high-stakes domains, it is essential that their outputs reflect not an average human preference but rather a range of varying perspectives. Achieving such pluralism, however, remains challenging. Existing approaches consider only a limited set of values or rely on prompt-level interventions, lacking fine-grained value control and representation. To address this, we introduce VISPA, a training-free pluralistic alignment framework that enables direct control over value expression through dynamic value selection and internal activation steering. Across extensive empirical studies spanning multiple models and evaluation settings, we show that VISPA is performant across all pluralistic alignment modes in healthcare and beyond. Further analysis reveals that VISPA is adaptable across different steering initializations, models, and values. These results suggest that pluralistic alignment can be achieved through internal activation mechanisms, offering a scalable path toward language models that serve all.
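The mechanism the abstract describes (selecting a value-specific direction and adding it to a model's internal activations at inference time) can be illustrated with a minimal sketch. The value names, vectors, and keyword-based selection rule below are illustrative assumptions for exposition, not the paper's actual method; a real implementation would hook a transformer layer's residual stream rather than operate on plain lists.

```python
# Minimal sketch of value-conditioned activation steering.
# All vectors and the selection heuristic are hypothetical.

VALUE_VECTORS = {
    # Per-value steering directions, assumed to be extracted offline.
    "autonomy":    [0.8, -0.1, 0.3],
    "beneficence": [-0.2, 0.9, 0.1],
}

def select_vector(prompt: str) -> list[float]:
    """Toy value selection: pick the first value keyword found in the prompt."""
    for value, vec in VALUE_VECTORS.items():
        if value in prompt.lower():
            return vec
    return [0.0] * 3  # default: no steering

def steer(hidden: list[float], prompt: str, alpha: float = 2.0) -> list[float]:
    """Add alpha * steering_vector to a hidden state (the residual stream)."""
    vec = select_vector(prompt)
    return [h + alpha * v for h, v in zip(hidden, vec)]

steered = steer([0.5, 0.5, 0.5], "Respect patient autonomy in this answer")
```

In a transformer, `steer` would typically be registered as a forward hook on one layer, so the added direction propagates through all subsequent layers during generation.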
Related papers
- Beyond Language Modeling: An Exploration of Multimodal Pretraining [125.34714978184638]
We provide empirical clarity through controlled, from-scratch pretraining experiments. We adopt the Transfusion framework, using next-token prediction for language and diffusion for vision. We demonstrate that the MoE architecture harmonizes this scaling asymmetry by providing the high model capacity required by language.
arXiv Detail & Related papers (2026-03-03T18:58:00Z)
- Dynamic Multimodal Activation Steering for Hallucination Mitigation in Large Vision-Language Models [22.535916867005955]
Large Vision-Language Models (LVLMs) exhibit outstanding performance on vision-language tasks but struggle with hallucination problems. We propose Dynamic Multimodal Activation Steering, a training-free approach for hallucination mitigation.
arXiv Detail & Related papers (2026-02-25T09:10:00Z)
- Alignment among Language, Vision and Action Representations [0.0]
We show that linguistic, visual, and action representations converge toward partially shared semantic structures.
arXiv Detail & Related papers (2026-01-30T13:12:07Z)
- HMVLA: Hyperbolic Multimodal Fusion for Vision-Language-Action Models [4.59200581394731]
HMVLA exploits inherent hierarchical structures in vision and language for comprehensive semantic alignment. Our HMVLA embeds multimodal features in hyperbolic space, enabling more effective modeling of the hierarchical relationships present in image-text data.
arXiv Detail & Related papers (2026-01-28T07:50:30Z)
- When Alignment Fails: Multimodal Adversarial Attacks on Vision-Language-Action Models [75.16145284285456]
We introduce VLA-Fool, a comprehensive study of multimodal adversarial robustness in embodied VLA models under both white-box and black-box settings. We develop the first automatically crafted and semantically guided prompting framework. Experiments on the LIBERO benchmark reveal that even minor multimodal perturbations can cause significant behavioral deviations.
arXiv Detail & Related papers (2025-11-20T10:14:32Z)
- V-SEAM: Visual Semantic Editing and Attention Modulating for Causal Interpretability of Vision-Language Models [10.052877942432783]
We introduce V-SEAM, a novel framework that combines Visual Semantic Editing and Attention Modulating for causal interpretation of vision-language models. V-SEAM identifies attention heads with positive or negative contributions to predictions across three semantic levels. We demonstrate enhanced performance for both LLaVA and InstructBLIP across three diverse VQA benchmarks.
arXiv Detail & Related papers (2025-09-18T10:58:34Z)
- Evaluating and Steering Modality Preferences in Multimodal Large Language Model [42.828461839307174]
Multimodal large language models (MLLMs) have achieved remarkable performance on complex tasks with multimodal context. We show that all 18 tested MLLMs generally demonstrate clear modality bias, and that modality preference can be influenced by external interventions. We propose a probing and steering method based on representation engineering to explicitly control modality preference.
arXiv Detail & Related papers (2025-05-27T10:07:59Z)
- SAE-SSV: Supervised Steering in Sparse Representation Spaces for Reliable Control of Language Models [41.553639748766784]
Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation. This paper introduces a novel supervised steering approach that operates in sparse, interpretable representation spaces.
arXiv Detail & Related papers (2025-05-22T03:46:57Z)
- Improving Multilingual Language Models by Aligning Representations through Steering [10.159957091670883]
This paper investigates how Large Language Models (LLMs) represent non-English tokens. We propose a lightweight intervention method using representation steering, where a learned vector is added to the residual stream at a single model layer to enhance multilingual performance.
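The single-layer intervention described in this summary can be sketched as follows. Deriving the vector as a mean difference between two sets of hidden states is a common representation-engineering heuristic and is an assumption here, not necessarily this paper's learning procedure; hidden states are toy lists for illustration.

```python
# Hedged sketch: build a steering vector from hidden-state statistics,
# then add it to the residual stream at one layer.

def mean_diff_vector(target_states, source_states):
    """Steering vector = mean(target) - mean(source), per dimension."""
    dim = len(target_states[0])
    mean_t = [sum(s[i] for s in target_states) / len(target_states) for i in range(dim)]
    mean_s = [sum(s[i] for s in source_states) / len(source_states) for i in range(dim)]
    return [t - s for t, s in zip(mean_t, mean_s)]

def add_to_residual(hidden, vector, scale=1.0):
    """Apply the intervention at a single layer's residual stream."""
    return [h + scale * v for h, v in zip(hidden, vector)]

# Toy hidden states for the two conditions being contrasted
vec = mean_diff_vector([[1.0, 0.0], [1.0, 2.0]], [[0.0, 0.0], [0.0, 1.0]])
steered = add_to_residual([0.5, 0.5], vec)
```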
arXiv Detail & Related papers (2025-05-19T00:14:43Z)
- Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z)
- Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning [72.46388818127105]
Conditional Language Policy (CLP) is a framework for finetuning language models on multiple objectives.
We show that CLP learns steerable models that effectively trade-off conflicting objectives at inference time.
arXiv Detail & Related papers (2024-07-22T16:13:38Z)
- PaLM-E: An Embodied Multimodal Language Model [101.29116156731762]
We propose embodied language models to incorporate real-world continuous sensor modalities into language models.
We train these encodings end-to-end, in conjunction with a pre-trained large language model, for multiple embodied tasks.
Our largest model, PaLM-E-562B with 562B parameters, is a visual-language generalist with state-of-the-art performance on OK-VQA.
arXiv Detail & Related papers (2023-03-06T18:58:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.