Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels
- URL: http://arxiv.org/abs/2407.15786v1
- Date: Mon, 22 Jul 2024 16:46:33 GMT
- Title: Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels
- Authors: Zhuorui Ye, Stephanie Milani, Geoffrey J. Gordon, Fei Fang
- Abstract summary: We introduce a novel training scheme that enables RL algorithms to efficiently learn a concept-based policy.
Our algorithm, LICORICE, involves three main contributions: interleaving concept learning and RL training, using a concept ensemble to actively select informative data points for labeling, and decorrelating the concept data with a simple strategy.
We show how LICORICE reduces manual labeling efforts to 500 or fewer concept labels in three environments.
- Score: 38.05773318621547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in reinforcement learning (RL) have predominantly leveraged neural network-based policies for decision-making, yet these models often lack interpretability, posing challenges for stakeholder comprehension and trust. Concept bottleneck models offer an interpretable alternative by integrating human-understandable concepts into neural networks. However, a significant limitation in prior work is the assumption that human annotations for these concepts are readily available during training, necessitating continuous real-time input from human annotators. To overcome this limitation, we introduce a novel training scheme that enables RL algorithms to efficiently learn a concept-based policy by only querying humans to label a small set of data, or in the extreme case, without any human labels. Our algorithm, LICORICE, involves three main contributions: interleaving concept learning and RL training, using a concept ensemble to actively select informative data points for labeling, and decorrelating the concept data with a simple strategy. We show how LICORICE reduces manual labeling efforts to 500 or fewer concept labels in three environments. Finally, we present an initial study to explore how we can use powerful vision-language models to infer concepts from raw visual inputs without explicit labels at minimal cost to performance.
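The abstract's idea of using an ensemble to pick informative points for labeling can be illustrated with a generic query-by-committee sketch. This is not LICORICE's actual implementation: the vote-entropy disagreement measure, the function names `vote_entropy` and `select_queries`, and the toy votes are all assumptions made for illustration.

```python
import numpy as np


def vote_entropy(votes: np.ndarray, n_classes: int) -> np.ndarray:
    """Per-sample disagreement among ensemble members.

    votes: integer array of shape (n_members, n_samples), where
    votes[m, i] is the concept class predicted by member m for sample i.
    """
    n_members, n_samples = votes.shape
    counts = np.zeros((n_samples, n_classes))
    for member_votes in votes:
        counts[np.arange(n_samples), member_votes] += 1
    freqs = counts / n_members
    # Treat 0 * log(0) as 0 by only taking logs of positive frequencies.
    logs = np.log(freqs, out=np.zeros_like(freqs), where=freqs > 0)
    return -(freqs * logs).sum(axis=1)


def select_queries(votes: np.ndarray, n_classes: int, budget: int) -> np.ndarray:
    """Indices of the `budget` samples the ensemble disagrees on most."""
    scores = vote_entropy(votes, n_classes)
    # Stable sort so that ties keep their original order.
    order = np.argsort(-scores, kind="stable")
    return order[:budget]


# Hypothetical toy case: 3 ensemble members, 4 candidate states, 3 classes.
votes = np.array([
    [0, 1, 2, 0],
    [0, 1, 1, 0],
    [0, 2, 0, 0],
])
picked = select_queries(votes, n_classes=3, budget=2)
print(picked)  # samples 2 and 1 have the most disagreement
```

Only the selected samples would be sent to an annotator, so the labeling budget goes to states where the concept predictors are least certain.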
Related papers
- Online-PVLM: Advancing Personalized VLMs with Online Concept Learning [19.46716778297505]
Online-PVLM is a framework for online concept learning by leveraging hyperbolic representations.
We develop OP-Eval, a benchmark comprising 1,292 concepts and over 30K high-quality instances with diverse question types.
arXiv Detail & Related papers (2025-11-25T08:25:30Z) - AUVIC: Adversarial Unlearning of Visual Concepts for Multi-modal Large Language Models [63.05306474002547]
Regulatory frameworks mandating the 'right to be forgotten' drive the need for machine unlearning.
We introduce AUVIC, a novel visual concept unlearning framework for MLLMs.
We show that AUVIC achieves state-of-the-art target forgetting rates while incurring minimal performance degradation on non-target concepts.
arXiv Detail & Related papers (2025-11-14T13:35:32Z) - Interpretable Few-Shot Image Classification via Prototypical Concept-Guided Mixture of LoRA Experts [79.18608192761512]
Self-Explainable Models (SEMs) rely on Prototypical Concept Learning (PCL) to make their visual recognition processes more interpretable.
We propose a Few-Shot Prototypical Concept Classification framework that mitigates two key challenges under low-data regimes: parametric imbalance and representation misalignment.
Our approach consistently outperforms existing SEMs by a notable margin, with 4.2%-8.7% relative gains in 5-way 5-shot classification.
arXiv Detail & Related papers (2025-06-05T06:39:43Z) - Towards Automated Semantic Interpretability in Reinforcement Learning via Vision-Language Models [3.757469564056709]
We introduce interpretable Tree-based Reinforcement learning via Automated Concept Extraction (iTRACE).
iTRACE uses pre-trained vision-language models (VLMs) for semantic feature extraction and trains an interpretable tree-based model via RL.
We evaluate iTRACE across three domains: Atari games, grid-world navigation, and driving.
arXiv Detail & Related papers (2025-03-20T21:53:19Z) - General Intelligence Requires Reward-based Pretraining [19.90997698310839]
Large Language Models (LLMs) have demonstrated impressive real-world utility.
But their ability to reason adaptively and robustly remains fragile.
We propose disentangling knowledge and reasoning through three key directions.
arXiv Detail & Related papers (2025-02-26T18:51:12Z) - Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions [7.3784937557132855]
Concept-based models (CBMs) learn interpretable concepts from high-dimensional data, e.g. images, which are used to predict labels.
An important issue in CBMs is concept leakage, i.e., spurious information in the learned concepts, which effectively leads to learning "wrong" concepts.
We describe a framework that provides theoretical guarantees on the correctness of the learned concepts and on the number of required labels.
arXiv Detail & Related papers (2025-02-10T15:01:56Z) - Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies [10.971976066073442]
Speculative decoding (SD) methods offer substantial efficiency gains by generating multiple tokens using a single target forward pass.
Existing SD approaches require the drafter and target models to share the same vocabulary, thus limiting the pool of possible drafters.
We present three new SD methods that remove this shared-vocabulary constraint.
arXiv Detail & Related papers (2025-01-31T19:13:58Z) - CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification [7.641602003136712]
Concept Bottleneck Models (CBMs) tackle the latter by constraining the model output on a set of predefined and human-interpretable concepts.
We propose a simple, yet effective, methodology, CBVLM, which tackles both of the aforementioned challenges.
We validate our approach with extensive experiments across four medical datasets and twelve LVLMs.
arXiv Detail & Related papers (2025-01-21T16:38:04Z) - SHIRE: Enhancing Sample Efficiency using Human Intuition in REinforcement Learning [11.304750795377657]
We propose SHIRE, a framework for encoding human intuition using Probabilistic Graphical Models (PGMs).
SHIRE achieves 25-78% sample efficiency gains across the environments we evaluate at negligible overhead cost.
arXiv Detail & Related papers (2024-09-16T04:46:22Z) - Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of
arXiv Detail & Related papers (2024-07-07T12:19:37Z) - Concept Induction using LLMs: a user experiment for assessment [1.1982127665424676]
This study explores the potential of a Large Language Model (LLM) to generate high-level concepts that are meaningful as explanations for humans.
We compare the concepts generated by the LLM with two other methods: concepts generated by humans and the ECII concept induction system.
Our findings indicate that while human-generated explanations remain superior, concepts derived from GPT-4 are more comprehensible to humans compared to those generated by ECII.
arXiv Detail & Related papers (2024-04-18T03:22:02Z) - Foundation Policies with Hilbert Representations [54.44869979017766]
We propose an unsupervised framework to pre-train generalist policies from unlabeled offline data.
Our key insight is to learn a structured representation that preserves the temporal structure of the underlying environment.
Our experiments show that our unsupervised policies can solve goal-conditioned and general RL tasks in a zero-shot fashion.
arXiv Detail & Related papers (2024-02-23T19:09:10Z) - Vision-Language Models Provide Promptable Representations for Reinforcement Learning [67.40524195671479]
We propose a novel approach that uses the vast amounts of general and indexable world knowledge encoded in vision-language models (VLMs) pre-trained on Internet-scale data for embodied reinforcement learning (RL).
We show that our approach can use chain-of-thought prompting to produce representations of common-sense semantic reasoning, improving policy performance in novel scenes by 1.5 times.
arXiv Detail & Related papers (2024-02-05T00:48:56Z) - Generalized Label-Efficient 3D Scene Parsing via Hierarchical Feature
Aligned Pre-Training and Region-Aware Fine-tuning [55.517000360348725]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited.
To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy.
Experiments with both indoor and outdoor scenes demonstrated the effectiveness of our approach in both data-efficient learning and open-world few-shot learning.
arXiv Detail & Related papers (2023-12-01T15:47:04Z) - Concept Distillation: Leveraging Human-Centered Explanations for Model
Improvement [3.026365073195727]
Concept Activation Vectors (CAVs) estimate a model's sensitivity and possible biases to a given concept.
We extend CAVs from post-hoc analysis to ante-hoc training in order to reduce model bias through fine-tuning.
We show applications of concept-sensitive training to debias several classification problems.
arXiv Detail & Related papers (2023-11-26T14:00:14Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Interpretable Neural-Symbolic Concept Reasoning [7.1904050674791185]
Concept-based models aim to address this issue by learning tasks based on a set of human-understandable concepts.
We propose the Deep Concept Reasoner (DCR), the first interpretable concept-based model that builds upon concept embeddings.
arXiv Detail & Related papers (2023-04-27T09:58:15Z) - A Simple Long-Tailed Recognition Baseline via Vision-Language Model [92.2866546058082]
The visual world naturally exhibits a long-tailed distribution of open classes, which poses great challenges to modern visual systems.
Recent advances in contrastive visual-language pretraining shed light on a new pathway for visual recognition.
We propose BALLAD to leverage contrastive vision-language models for long-tailed recognition.
arXiv Detail & Related papers (2021-11-29T17:49:24Z) - Active Refinement for Multi-Label Learning: A Pseudo-Label Approach [84.52793080276048]
Multi-label learning (MLL) aims to associate a given instance with its relevant labels from a set of concepts.
Previous work on MLL mainly focused on the setting where the concept set is assumed to be fixed.
Many real-world applications require introducing new concepts into the set to meet new demands.
arXiv Detail & Related papers (2021-09-29T19:17:05Z) - Learning Interpretable Concept-Based Models with Human Feedback [36.65337734891338]
We propose an approach for learning a set of transparent concept definitions in high-dimensional data that relies on users labeling concept features.
Our method produces concepts that both align with users' intuitive sense of what a concept means, and facilitate prediction of the downstream label by a transparent machine learning model.
arXiv Detail & Related papers (2020-12-04T23:41:05Z) - Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.