Interpretable Generative Models through Post-hoc Concept Bottlenecks
- URL: http://arxiv.org/abs/2503.19377v1
- Date: Tue, 25 Mar 2025 06:09:51 GMT
- Title: Interpretable Generative Models through Post-hoc Concept Bottlenecks
- Authors: Akshay Kulkarni, Ge Yan, Chung-En Sun, Tuomas Oikarinen, Tsui-Wei Weng,
- Abstract summary: Concept bottleneck models (CBM) aim to produce inherently interpretable models that rely on human-understandable concepts for their predictions.<n>Existing approaches to design interpretable generative models based on CBMs require expensive generative model training from scratch as well as real images with labor-intensive concept supervision.<n>We present two novel and low-cost methods to build interpretable generative models through post-hoc techniques.
- Score: 14.083735412774812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concept bottleneck models (CBM) aim to produce inherently interpretable models that rely on human-understandable concepts for their predictions. However, existing approaches to design interpretable generative models based on CBMs are not yet efficient and scalable, as they require expensive generative model training from scratch as well as real images with labor-intensive concept supervision. To address these challenges, we present two novel and low-cost methods to build interpretable generative models through post-hoc techniques and we name our approaches: concept-bottleneck autoencoder (CB-AE) and concept controller (CC). Our proposed approaches enable efficient and scalable training without the need of real data and require only minimal to no concept supervision. Additionally, our methods generalize across modern generative model families including generative adversarial networks and diffusion models. We demonstrate the superior interpretability and steerability of our methods on numerous standard datasets like CelebA, CelebA-HQ, and CUB with large improvements (average ~25%) over the prior work, while being 4-15x faster to train. Finally, a large-scale user study is performed to validate the interpretability and steerability of our methods.
Related papers
- AI in a vat: Fundamental limits of efficient world modelling for agent sandboxing and interpretability [84.52205243353761]
Recent work proposes using world models to generate controlled virtual environments in which AI agents can be tested before deployment.
We investigate ways of simplifying world models that remain agnostic to the AI agent under evaluation.
arXiv Detail & Related papers (2025-04-06T20:35:44Z) - CAR: Controllable Autoregressive Modeling for Visual Generation [100.33455832783416]
Controllable AutoRegressive Modeling (CAR) is a novel, plug-and-play framework that integrates conditional control into multi-scale latent variable modeling.
CAR progressively refines and captures control representations, which are injected into each autoregressive step of the pre-trained model to guide the generation process.
Our approach demonstrates excellent controllability across various types of conditions and delivers higher image quality compared to previous methods.
arXiv Detail & Related papers (2024-10-07T00:55:42Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adrial robustness has been conventionally believed as a challenging property to encode for neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - VLG-CBM: Training Concept Bottleneck Models with Vision-Language Guidance [16.16577751549164]
Concept Bottleneck Models (CBMs) provide interpretable prediction.<n>CBMs encode human-understandable concepts to explain models' decision.<n>We propose Vision-Language-Guided Concept Bottleneck Model (VLG-CBM) to enable faithful interpretability.
arXiv Detail & Related papers (2024-07-18T19:44:44Z) - AnyCBMs: How to Turn Any Black Box into a Concept Bottleneck Model [7.674744385997066]
Concept Bottleneck Models enhance the interpretability of neural networks by integrating a layer of human-understandable concepts.
"AnyCBM" transforms any existing trained model into a Concept Bottleneck Model with minimal impact on computational resources.
arXiv Detail & Related papers (2024-05-26T10:19:04Z) - Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models [57.86303579812877]
Concept Bottleneck Models (CBMs) ground image classification on human-understandable concepts to allow for interpretable model decisions.
Existing approaches often require numerous human interventions per image to achieve strong performances.
We introduce a trainable concept realignment intervention module, which leverages concept relations to realign concept assignments post-intervention.
arXiv Detail & Related papers (2024-05-02T17:59:01Z) - Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z) - Auxiliary Losses for Learning Generalizable Concept-based Models [5.4066453042367435]
Concept Bottleneck Models (CBMs) have gained popularity since their introduction.
CBMs essentially limit the latent space of a model to human-understandable high-level concepts.
We propose cooperative-Concept Bottleneck Model (coop-CBM) to overcome the performance trade-off.
arXiv Detail & Related papers (2023-11-18T15:50:07Z) - Learning to Receive Help: Intervention-Aware Concept Embedding Models [44.1307928713715]
Concept Bottleneck Models (CBMs) tackle the opacity of neural architectures by constructing and explaining their predictions using a set of high-level concepts.
Recent work has shown that intervention efficacy can be highly dependent on the order in which concepts are intervened.
We propose Intervention-aware Concept Embedding models (IntCEMs), a novel CBM-based architecture and training paradigm that improves a model's receptiveness to test-time interventions.
arXiv Detail & Related papers (2023-09-29T02:04:24Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST)
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z) - Concept Embedding Models [27.968589555078328]
Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts.
Existing concept bottleneck models are unable to find optimal compromises between high task accuracy, robust concept-based explanations, and effective interventions on concepts.
We propose Concept Embedding Models, a novel family of concept bottleneck models which goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations.
arXiv Detail & Related papers (2022-09-19T14:49:36Z) - Concept Bottleneck Model with Additional Unsupervised Concepts [0.5939410304994348]
We propose a novel interpretable model based on the concept bottleneck model (CBM)
CBM uses concept labels to train an intermediate layer as the additional visible layer.
By seamlessly training these two types of concepts while reducing the amount of computation, we can obtain both supervised and unsupervised concepts simultaneously.
arXiv Detail & Related papers (2022-02-03T08:30:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.