Related papers: Towards Automated Semantic Interpretability in Reinforcement Learning via Vision-Language Models

Related papers

Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning [11.901989132359676]
We introduce Enhanced Semantic Motion Representations (Semore), a new VLM-based framework for visual reinforcement learning (RL). Semore simultaneously extracts semantic and motion representations through a dual-path backbone from the RGB flows. Our method exhibits efficient and adaptive ability compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-12-04T16:54:41Z)
BLIP-FusePPO: A Vision-Language Deep Reinforcement Learning Framework for Lane Keeping in Autonomous Vehicles [0.0]
We propose a novel framework for multimodal reinforcement learning (RL) for autonomous lane-keeping (LK). The proposed method lets the agent learn driving rules that are aware of their surroundings and easy to understand. A hybrid reward function that includes semantic alignment, LK accuracy, obstacle avoidance, and speed regulation helps learning to be more efficient and generalizable.
arXiv Detail & Related papers (2025-10-25T17:27:08Z)
SSL4RL: Revisiting Self-supervised Learning as Intrinsic Reward for Visual-Language Reasoning [88.9014727048442]
SSL4RL is a novel framework that leverages self-supervised learning tasks as a source of verifiable rewards for RL-based fine-tuning. Our approach reformulates SSL objectives, such as predicting image rotation or reconstructing masked patches, into dense, automatic reward signals. Experiments show that SSL4RL substantially improves performance on both vision-centric and vision-language reasoning benchmarks.
arXiv Detail & Related papers (2025-10-18T09:22:40Z)
Executable Analytic Concepts as the Missing Link Between VLM Insight and Precise Manipulation [70.8381970762877]
Vision-Language Models (VLMs) have demonstrated remarkable capabilities in semantic reasoning and task planning. We introduce GRACE, a novel framework that grounds VLM-based reasoning through executable analytic concepts. GRACE provides a unified and interpretable interface between high-level instruction understanding and low-level robot control.
arXiv Detail & Related papers (2025-10-09T09:08:33Z)
Discrete Diffusion for Reflective Vision-Language-Action Models in Autonomous Driving [55.13109926181247]
We introduce ReflectDrive, a learning-based framework that integrates a reflection mechanism for safe trajectory generation via discrete diffusion. Central to our approach is a safety-aware reflection mechanism that performs iterative self-correction without gradient. Our method begins with goal-conditioned trajectory generation to model multi-modal driving behaviors.
arXiv Detail & Related papers (2025-09-24T13:35:15Z)
Mechanistic interpretability for steering vision-language-action models [0.23371356738437823]
Vision-Language-Action (VLA) models are a promising path to realizing generalist embodied agents. We introduce the first framework for interpreting and steering VLAs via their internal representations. We introduce a general-purpose activation steering method that modulates behavior in real time, without fine-tuning, reward signals, or environment interaction.
arXiv Detail & Related papers (2025-08-30T03:01:57Z)
LANGTRAJ: Diffusion Model and Dataset for Language-Conditioned Trajectory Simulation [94.84458417662404]
LangTraj is a language-conditioned scene-diffusion model that simulates the joint behavior of all agents in traffic scenarios. By conditioning on natural language inputs, LangTraj provides flexible and intuitive control over interactive behaviors. LangTraj demonstrates strong performance in realism, language controllability, and language-conditioned safety-critical simulation.
arXiv Detail & Related papers (2025-04-15T17:14:06Z)
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models [50.587868616659826]
Sparse Autoencoders (SAEs) have been shown to enhance interpretability and steerability in Large Language Models (LLMs). In this work, we extend the application of SAEs to Vision-Language Models (VLMs), such as CLIP, and introduce a comprehensive framework for evaluating monosemanticity in vision representations.
arXiv Detail & Related papers (2025-04-03T17:58:35Z)
Proposition of Affordance-Driven Environment Recognition Framework Using Symbol Networks in Large Language Models [1.2430809884830318]
This study proposes a method for automatic affordance acquisition by leveraging large language models (LLMs). Experiments using "apple" as an example demonstrated the method's ability to extract context-dependent affordances with high explainability.
arXiv Detail & Related papers (2025-04-02T11:48:44Z)
MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model [54.14155564592936]
We propose a Mixture of Rule Experts guided by a Large Language Model (MoRE-LLM). MoRE-LLM steers the discovery of local rule-based surrogates during training and their utilization for the classification task. The LLM is responsible for enhancing the domain knowledge alignment of the rules by correcting and contextualizing them.
arXiv Detail & Related papers (2025-03-26T11:09:21Z)
Sparse Autoencoder Features for Classifications and Transferability [11.2185030332009]
We analyze Sparse Autoencoders (SAEs) for interpretable feature extraction from Large Language Models (LLMs). Our framework evaluates (1) model-layer selection and scaling properties, (2) SAE architectural configurations, including width and pooling strategies, and (3) the effect of binarizing continuous SAE activations.
arXiv Detail & Related papers (2025-02-17T02:30:45Z)
RLS3: RL-Based Synthetic Sample Selection to Enhance Spatial Reasoning in Vision-Language Models for Indoor Autonomous Perception [20.01853641155509]
Vision-language model (VLM) fine-tuning for application-specific visual grounding based on natural language instructions has become one of the most popular approaches for learning-enabled autonomous systems. We propose a new generalizable framework to improve VLM fine-tuning by integrating it with a reinforcement learning (RL) agent.
arXiv Detail & Related papers (2025-01-31T04:30:42Z)
Mechanistic understanding and validation of large AI models with SemanticLens [13.712668314238082]
Unlike human-engineered systems such as aeroplanes, the inner workings of AI models remain largely opaque. This paper introduces SemanticLens, a universal explanation method for neural networks that maps hidden knowledge encoded by components.
arXiv Detail & Related papers (2025-01-09T17:47:34Z)
Tokens, the oft-overlooked appetizer: Large language models, the distributional hypothesis, and meaning [31.632816425798108]
Tokenization is a necessary component within the current architecture of many language models. We discuss how tokens and pretraining can act as a backdoor for bias and other unwanted content. We relay evidence that the tokenization algorithm's objective function impacts the large language model's cognition.
arXiv Detail & Related papers (2024-12-14T18:18:52Z)
Language Model as Visual Explainer [72.88137795439407]
We present a systematic approach for interpreting vision models using a tree-structured linguistic explanation. Our method provides human-understandable explanations in the form of attribute-laden trees. To assess the effectiveness of our approach, we introduce new benchmarks and conduct rigorous evaluations.
arXiv Detail & Related papers (2024-12-08T20:46:23Z)
Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability. The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts. As the knowledge learned by LSP is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML). VML constrains the parameter space to be human-interpretable natural language. We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z)
Understanding Large Language Model Behaviors through Interactive Counterfactual Generation and Analysis [22.755345889167934]
We present an interactive visualization system that enables exploration of large language models (LLMs) through counterfactual analysis. Our system features a novel algorithm that generates fluent and semantically meaningful counterfactuals. A user study with LLM practitioners and interviews with experts demonstrate the system's usability and effectiveness.
arXiv Detail & Related papers (2024-04-23T19:57:03Z)
Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach [55.613461060997004]
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks. We propose an innovative metacognitive approach, dubbed CLEAR, to equip LLMs with capabilities for self-aware error identification and correction.
arXiv Detail & Related papers (2024-03-08T19:18:53Z)
Vision-Language Models Provide Promptable Representations for Reinforcement Learning [67.40524195671479]
We propose a novel approach that uses the vast amounts of general and indexable world knowledge encoded in vision-language models (VLMs) pre-trained on Internet-scale data for embodied reinforcement learning (RL). We show that our approach can use chain-of-thought prompting to produce representations of common-sense semantic reasoning, improving policy performance in novel scenes by 1.5 times.
arXiv Detail & Related papers (2024-02-05T00:48:56Z)
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains. The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications. We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
AS-XAI: Self-supervised Automatic Semantic Interpretation for CNN [5.42467030980398]
We propose a self-supervised automatic semantic interpretable artificial intelligence (AS-XAI) framework. It utilizes transparent embedding semantic extraction spaces and row-centered principal component analysis (PCA) for global semantic interpretation of model decisions. The proposed approach offers broad fine-grained practical applications, including shared semantic interpretation under out-of-distribution categories.
arXiv Detail & Related papers (2023-12-02T10:06:54Z)
Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks. The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation. We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
SELF: Self-Evolution with Language Feedback [68.6673019284853]
'SELF' (Self-Evolution with Language Feedback) is a novel approach to advance large language models. It enables LLMs to self-improve through self-reflection, akin to human learning processes. Our experiments in mathematics and general tasks demonstrate that SELF can enhance the capabilities of LLMs without human intervention.
arXiv Detail & Related papers (2023-10-01T00:52:24Z)
Artificial-Spiking Hierarchical Networks for Vision-Language Representation Learning [16.902924543372713]
State-of-the-art methods achieve impressive performance by pre-training on large-scale datasets. We propose an efficient framework for multimodal alignment by introducing a novel visual semantic module. Experiments show that the proposed ASH-Nets achieve competitive results.
arXiv Detail & Related papers (2023-08-18T10:40:25Z)
Planning for Learning Object Properties [117.27898922118946]
We formalize the problem of automatically training a neural network to recognize object properties as a symbolic planning problem. We use planning techniques to produce a strategy for automating the training dataset creation and the learning process. We provide an experimental evaluation in both a simulated and a real environment.
arXiv Detail & Related papers (2023-01-15T09:37:55Z)
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models [127.17675443137064]
We introduce PEVL, which enhances the pre-training and prompt tuning of vision-language models with explicit object position modeling. PEVL reformulates discretized object positions and language in a unified language modeling framework. We show that PEVL enables state-of-the-art performance on position-sensitive tasks such as referring expression comprehension and phrase grounding.
arXiv Detail & Related papers (2022-05-23T10:17:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.