Flexible Concept Bottleneck Model
- URL: http://arxiv.org/abs/2511.06678v1
- Date: Mon, 10 Nov 2025 03:50:57 GMT
- Title: Flexible Concept Bottleneck Model
- Authors: Xingbo Du, Qiantong Dou, Lei Fan, Rui Zhang,
- Abstract summary: Concept bottleneck models (CBMs) improve neural network interpretability by introducing an intermediate layer that maps human-understandable concepts to predictions. We propose the Flexible Concept Bottleneck Model (FCBM), which supports dynamic concept adaptation, including complete replacement of the original concept set. Our method achieves accuracy comparable to state-of-the-art baselines with a similar number of effective concepts.
- Score: 7.3992593868058245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concept bottleneck models (CBMs) improve neural network interpretability by introducing an intermediate layer that maps human-understandable concepts to predictions. Recent work has explored the use of vision-language models (VLMs) to automate concept selection and annotation. However, existing VLM-based CBMs typically require full model retraining when new concepts are involved, which limits their adaptability and flexibility in real-world scenarios, especially considering the rapid evolution of vision-language foundation models. To address these issues, we propose the Flexible Concept Bottleneck Model (FCBM), which supports dynamic concept adaptation, including complete replacement of the original concept set. Specifically, we design a hypernetwork that generates prediction weights based on concept embeddings, allowing seamless integration of new concepts without retraining the entire model. In addition, we introduce a modified sparsemax module with a learnable temperature parameter that dynamically selects the most relevant concepts, enabling the model to focus on the most informative features. Extensive experiments on five public benchmarks demonstrate that our method achieves accuracy comparable to state-of-the-art baselines with a similar number of effective concepts. Moreover, the model generalizes well to unseen concepts with just a single epoch of fine-tuning, demonstrating its strong adaptability and flexibility.
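The abstract names two concrete mechanisms: a hypernetwork that generates the classifier weights from concept embeddings, and a sparsemax with a learnable temperature that selects a sparse subset of concepts. The sketch below is a minimal rendering of both ideas, assuming CLIP-style image-concept similarity scores as input; the dimensions, the hypernetwork architecture, and the way the sparse weights gate the scores are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

def sparsemax(z: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Sparsemax (Martins & Astudillo, 2016): Euclidean projection onto the simplex.
    Unlike softmax, it can assign exactly zero to irrelevant entries."""
    z_sorted, _ = torch.sort(z, dim=dim, descending=True)
    cumsum = z_sorted.cumsum(dim) - 1
    shape = [1] * z.dim()
    shape[dim] = -1
    rng = torch.arange(1, z.size(dim) + 1, device=z.device, dtype=z.dtype).view(shape)
    support = rng * z_sorted > cumsum            # entries kept in the support
    k = support.sum(dim=dim, keepdim=True)       # support size, always >= 1
    tau = cumsum.gather(dim, k - 1) / k.to(z.dtype)
    return torch.clamp(z - tau, min=0)

class FCBMHead(nn.Module):
    """Sketch of a hypernetwork-generated concept classifier (assumed design)."""
    def __init__(self, emb_dim: int, hidden: int, num_classes: int):
        super().__init__()
        # Hypernetwork: one concept embedding in -> one classifier weight row out.
        self.hyper = nn.Sequential(
            nn.Linear(emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )
        self.log_temp = nn.Parameter(torch.zeros(()))  # learnable temperature

    def forward(self, concept_scores: torch.Tensor, concept_embs: torch.Tensor):
        # concept_scores: (batch, K) image-concept similarities (e.g. from CLIP)
        # concept_embs:   (K, emb_dim) embeddings of the K concept texts
        attn = sparsemax(concept_scores / self.log_temp.exp(), dim=-1)
        W = self.hyper(concept_embs)            # (K, num_classes), generated per call
        return (attn * concept_scores) @ W      # logits: (batch, num_classes)
```

Because `W` is recomputed from the concept embeddings on every forward pass, replacing or extending the concept set only changes `concept_embs`; no per-concept weight is baked into the head, which is what makes the single-epoch adaptation claimed in the abstract plausible.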
Related papers
- FaCT: Faithful Concept Traces for Explaining Neural Network Decisions [56.796533084868884]
Deep networks have shown remarkable performance across a wide range of tasks, yet a global, concept-level understanding of how they function remains a key challenge. We emphasize the faithfulness of concept-based explanations and propose a new model with model-inherent, mechanistic concept explanations. Its concepts are shared across classes, and their contribution to the logit and their input visualization can be faithfully traced from any layer.
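The claim that concept contributions to the logit can be faithfully traced is easiest to see for a linear readout, where each concept's share of a logit is just weight times activation. The toy sketch below illustrates that additive decomposition; it is a generic illustration, not the FaCT architecture.

```python
import torch

K, C = 8, 4                 # toy sizes: K concepts, C classes (assumptions)
acts = torch.rand(K)        # concept activations for one input
W = torch.randn(C, K)       # linear head: logits = W @ acts
logits = W @ acts
contrib = W * acts          # (C, K): contribution of concept k to class c's logit
assert torch.allclose(contrib.sum(dim=1), logits)  # contributions sum exactly
```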
arXiv Detail & Related papers (2025-10-29T13:35:46Z) - Post-hoc Stochastic Concept Bottleneck Models [18.935442650741]
Concept Bottleneck Models (CBMs) are interpretable models that predict the target variable through high-level, human-understandable concepts. We introduce Post-hoc Stochastic Concept Bottleneck Models (PSCBMs), a lightweight method that augments any pre-trained CBM with a normal distribution over concepts, without retraining the backbone model. We show that PSCBMs perform much better than CBMs under interventions, while remaining far more efficient than retraining a similar model from scratch.
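A "normal distribution over concepts" on top of a frozen backbone can be sketched with a mean head and a variance head plus the reparameterization trick. The code below is a diagonal-Gaussian illustration under assumed names; the paper's actual parameterization (e.g. whether it models concept covariance) may differ.

```python
import torch
import torch.nn as nn

class GaussianConceptHead(nn.Module):
    """Sketch: predict a mean and a log-variance per concept from frozen features."""
    def __init__(self, feat_dim: int, num_concepts: int):
        super().__init__()
        self.mu = nn.Linear(feat_dim, num_concepts)
        self.log_var = nn.Linear(feat_dim, num_concepts)

    def forward(self, feats: torch.Tensor):
        mu, log_var = self.mu(feats), self.log_var(feats)
        if self.training:
            # Reparameterized sample keeps the head differentiable.
            c = mu + torch.randn_like(mu) * (0.5 * log_var).exp()
        else:
            c = mu
        return c, mu, log_var
```

At intervention time, a human-corrected concept value simply overwrites its mean, and the variance head gives the model a notion of which concept predictions were uncertain in the first place.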
arXiv Detail & Related papers (2025-10-09T13:42:54Z) - Graph Integrated Multimodal Concept Bottleneck Model [24.726638033402747]
MoE-SGT is a framework that augments Concept Bottleneck Models (CBMs) with a structure-injecting Graph Transformer and a Mixture of Experts (MoE) module. We construct answer-concept and answer-question graphs for multimodal inputs to explicitly model the structured relationships among concepts. MoE-SGT achieves higher accuracy than other concept bottleneck networks on multiple datasets.
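The graph construction here is dataset-specific, but the MoE side is easy to make concrete. Below is a generic top-k mixture-of-experts layer, included only to illustrate the routing idea the summary mentions; it is not MoE-SGT's implementation, and all sizes are assumptions.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Generic top-k MoE block: a gate routes each input to a few expert layers."""
    def __init__(self, dim: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.gate(x)                           # (batch, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # route to top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for j in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, j] == e
                if mask.any():
                    out[mask] += weights[mask, j, None] * expert(x[mask])
        return out
```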
arXiv Detail & Related papers (2025-10-01T09:18:38Z) - Interpretable Reward Modeling with Active Concept Bottlenecks [54.00085739303773]
We introduce Concept Bottleneck Reward Models (CB-RM), a reward modeling framework that enables interpretable preference learning. Unlike standard RLHF methods that rely on opaque reward functions, CB-RM decomposes reward prediction into human-interpretable concepts. We formalize an active learning strategy that dynamically acquires the most informative concept labels.
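Decomposing reward prediction into concepts means the reward is a readable function (here: linear) of concept scores, and active learning then decides which concept label to ask for next. The sketch below shows that structure with one plausible uncertainty-based acquisition rule; both the architecture and the rule are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ConceptBottleneckRM(nn.Module):
    """Sketch: reward = linear readout over interpretable concept scores."""
    def __init__(self, feat_dim: int, num_concepts: int):
        super().__init__()
        self.concepts = nn.Linear(feat_dim, num_concepts)  # concept predictors
        self.reward = nn.Linear(num_concepts, 1)           # interpretable readout

    def forward(self, feats: torch.Tensor):
        c = torch.sigmoid(self.concepts(feats))            # concept probabilities
        return self.reward(c).squeeze(-1), c

def pick_concept_to_label(concept_probs: torch.Tensor) -> torch.Tensor:
    """One possible acquisition rule (an assumption, not necessarily CB-RM's):
    query the concept closest to 0.5, i.e. the one the model is least sure about."""
    return (concept_probs - 0.5).abs().argmin(dim=-1)
```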
arXiv Detail & Related papers (2025-07-07T06:26:04Z) - MVP-CBM: Multi-layer Visual Preference-enhanced Concept Bottleneck Model for Explainable Medical Image Classification [17.91330444111181]
The concept bottleneck model (CBM) improves interpretability by linking predictions to human-understandable concepts. We propose the Multi-layer Visual Preference-enhanced Concept Bottleneck Model (MVP-CBM), which comprehensively leverages multi-layer visual information to provide more nuanced and accurate explanations of model decisions.
arXiv Detail & Related papers (2025-06-14T16:52:04Z) - Towards Better Generalization and Interpretability in Unsupervised Concept-Based Models [9.340843984411137]
This paper introduces a novel unsupervised concept-based model for image classification, the Learnable Concept-Based Model (LCBM). We demonstrate that LCBM surpasses existing unsupervised concept-based models in generalization capability and nearly matches the performance of black-box models. Despite the use of concept embeddings, we maintain model interpretability by means of a local linear combination of concepts.
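One way to read "a local linear combination of concepts" is that the class weights are generated per input while the final decision stays linear in the concept activations, so each prediction can still be explained by inspecting its weights. The sketch below shows that structure with assumed names and shapes; it is not the LCBM code.

```python
import torch
import torch.nn as nn

class LocalLinearHead(nn.Module):
    """Sketch: per-input linear weights over concepts keep decisions readable."""
    def __init__(self, ctx_dim: int, num_concepts: int, num_classes: int):
        super().__init__()
        self.weight_net = nn.Linear(ctx_dim, num_concepts * num_classes)
        self.num_concepts, self.num_classes = num_concepts, num_classes

    def forward(self, concept_acts: torch.Tensor, context: torch.Tensor):
        # concept_acts: (batch, K); context: (batch, ctx_dim)
        W = self.weight_net(context).view(-1, self.num_classes, self.num_concepts)
        return torch.bmm(W, concept_acts.unsqueeze(-1)).squeeze(-1)  # (batch, C)
```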
arXiv Detail & Related papers (2025-06-02T16:26:41Z) - Mod-Adapter: Tuning-Free and Versatile Multi-concept Personalization via Modulation Adapter [57.49476151976054]
We propose a tuning-free method for multi-concept personalization that can effectively customize both object and abstract concepts without test-time fine-tuning. Our method achieves state-of-the-art performance in multi-concept personalization, supported by quantitative, qualitative, and human evaluations.
arXiv Detail & Related papers (2025-05-24T09:21:32Z) - Sparse autoencoders reveal selective remapping of visual concepts during adaptation [54.82630842681845]
Adapting foundation models for specific purposes has become a standard approach to building machine learning systems. We develop a new Sparse Autoencoder (SAE) for the CLIP vision transformer, named PatchSAE, to extract interpretable concepts.
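A sparse autoencoder of the kind the summary describes is small enough to sketch in full: an overcomplete dictionary with a non-negative, sparsity-penalized code, trained to reconstruct ViT patch tokens. The dimensions and the L1 coefficient below are assumptions, not PatchSAE's published settings.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Sketch: overcomplete SAE over transformer patch tokens."""
    def __init__(self, token_dim: int, dict_size: int):
        super().__init__()
        self.enc = nn.Linear(token_dim, dict_size)
        self.dec = nn.Linear(dict_size, token_dim, bias=False)

    def forward(self, tokens: torch.Tensor):
        # tokens: (batch, num_patches, token_dim), e.g. CLIP ViT activations
        z = torch.relu(self.enc(tokens))   # sparse, non-negative codes ("concepts")
        recon = self.dec(z)
        loss = (recon - tokens).pow(2).mean() + 1e-3 * z.abs().mean()
        return recon, z, loss
```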
arXiv Detail & Related papers (2024-12-06T18:59:51Z) - How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization? [91.49559116493414]
We propose a novel Concept-Incremental text-to-image Diffusion Model (CIDM)
It can resolve catastrophic forgetting and concept neglect to learn new customization tasks in a concept-incremental manner.
Experiments validate that our CIDM surpasses existing custom diffusion models.
arXiv Detail & Related papers (2024-10-23T06:47:29Z) - Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models [57.86303579812877]
Concept Bottleneck Models (CBMs) ground image classification on human-understandable concepts to allow for interpretable model decisions.
Existing approaches often require numerous human interventions per image to achieve strong performance.
We introduce a trainable concept realignment intervention module, which leverages concept relations to realign concept assignments post-intervention.
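A realignment module of the kind described can be sketched as: overwrite the intervened concepts with the human-given values, then propagate those corrections to the remaining concepts through a learned relation map. The code below is an illustrative guess at that structure, not the paper's module.

```python
import torch
import torch.nn as nn

class ConceptRealigner(nn.Module):
    """Sketch: propagate human concept corrections via learned concept relations."""
    def __init__(self, num_concepts: int):
        super().__init__()
        self.relate = nn.Linear(num_concepts, num_concepts)

    def forward(self, concepts, intervened_mask, intervened_values):
        # concepts: (batch, K) predictions; mask/values: human corrections
        c = torch.where(intervened_mask, intervened_values, concepts)
        realigned = torch.sigmoid(self.relate(c))   # update the other concepts
        # Keep the human-set values fixed; only non-intervened concepts move.
        return torch.where(intervened_mask, intervened_values, realigned)
```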
arXiv Detail & Related papers (2024-05-02T17:59:01Z) - Auxiliary Losses for Learning Generalizable Concept-based Models [5.4066453042367435]
Concept Bottleneck Models (CBMs) have gained popularity since their introduction.
CBMs essentially limit the latent space of a model to human-understandable high-level concepts.
We propose the cooperative-Concept Bottleneck Model (coop-CBM) to overcome the performance trade-off that the concept bottleneck introduces.
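For reference, the bottleneck structure that all of the CBM entries in this list share is tiny: features are mapped to named concepts, and only the concepts reach the label head. A generic sketch with assumed dimensions (not coop-CBM's auxiliary-loss setup):

```python
import torch
import torch.nn as nn

class VanillaCBM(nn.Module):
    """Generic CBM: the label head sees nothing but the concept activations."""
    def __init__(self, feat_dim: int, num_concepts: int, num_classes: int):
        super().__init__()
        self.to_concepts = nn.Linear(feat_dim, num_concepts)    # x -> c
        self.to_labels = nn.Linear(num_concepts, num_classes)   # c -> y

    def forward(self, feats: torch.Tensor):
        concepts = torch.sigmoid(self.to_concepts(feats))
        logits = self.to_labels(concepts)  # linear in concepts, hence inspectable
        return logits, concepts
```

coop-CBM's auxiliary losses would sit on top of a backbone feeding this structure; the sketch only fixes the shared skeleton.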
arXiv Detail & Related papers (2023-11-18T15:50:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.