ConceptDistil: Model-Agnostic Distillation of Concept Explanations
- URL: http://arxiv.org/abs/2205.03601v1
- Date: Sat, 7 May 2022 08:58:54 GMT
- Title: ConceptDistil: Model-Agnostic Distillation of Concept Explanations
- Authors: João Bento Sousa, Ricardo Moreira, Vladimir Balayan, Pedro Saleiro, Pedro Bizarro
- Abstract summary: Concept-based explanations aim to fill the model interpretability gap for non-technical humans-in-the-loop.
We propose ConceptDistil, a method to bring concept explanations to any black-box classifier using knowledge distillation.
We validate ConceptDistil in a real-world use case, showing that it is able to optimize both tasks.
- Score: 4.462334751640166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concept-based explanations aim to fill the model interpretability gap
for non-technical humans-in-the-loop. Previous work has focused on providing
concepts for specific models (e.g., neural networks) or data types (e.g.,
images), either by extracting concepts from an already trained network or by
training self-explainable models through multi-task learning. In this work, we
propose ConceptDistil, a method to bring concept explanations to any black-box
classifier using knowledge distillation. ConceptDistil is decomposed into two
components: (1) a concept model that predicts which domain concepts are present
in a given instance, and (2) a distillation model that tries to mimic the
predictions of a black-box model using the concept model's predictions. We
validate ConceptDistil in a real-world use case, showing that it is able to
optimize both tasks, bringing concept-explainability to any black-box model.
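Since the abstract only names the two components, the following is a minimal illustrative sketch of how they might fit together. This is not the authors' code: the class names, dimensions, and the simple linear distillation head are all assumptions made here for illustration.

```python
# Illustrative sketch of ConceptDistil's two components (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptModel(nn.Module):
    """Predicts which domain concepts are present in a given instance."""
    def __init__(self, in_dim: int, n_concepts: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_concepts),
        )

    def forward(self, x):
        # Sigmoid because concept presence is a multi-label prediction.
        return torch.sigmoid(self.net(x))

class DistillationModel(nn.Module):
    """Mimics the black-box score using only the concept predictions."""
    def __init__(self, n_concepts: int):
        super().__init__()
        self.head = nn.Linear(n_concepts, 1)  # linear head: an assumption here

    def forward(self, concepts):
        return torch.sigmoid(self.head(concepts)).squeeze(-1)

def multitask_loss(x, concept_labels, blackbox_scores,
                   concept_model, distill_model, alpha=0.5):
    """Joint objective: concept prediction plus distillation of the black box."""
    c_hat = concept_model(x)
    y_hat = distill_model(c_hat)
    concept_loss = F.binary_cross_entropy(c_hat, concept_labels)
    distill_loss = F.binary_cross_entropy(y_hat, blackbox_scores)
    return alpha * concept_loss + (1 - alpha) * distill_loss
```

Under this sketch, an explanation for an instance is read off the concept model's outputs, while the distillation head indicates how each predicted concept moves the mimicked black-box score.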
Related papers
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z)
- Concept Bottleneck Models Without Predefined Concepts [26.156636891713745]
We introduce an input-dependent concept selection mechanism that ensures only a small subset of concepts is used across all classes.
We show that our approach improves downstream performance and narrows the performance gap to black-box models.
arXiv Detail & Related papers (2024-07-04T13:34:50Z)
- Separable Multi-Concept Erasure from Diffusion Models [52.51972530398691]
We propose a Separable Multi-concept Eraser (SepME) to eliminate unsafe concepts from large-scale diffusion models.
It separates optimizable model weights, making each weight increment correspond to a specific concept erasure.
Extensive experiments indicate the efficacy of our approach in eliminating concepts, preserving model performance, and offering flexibility in the erasure or recovery of various concepts.
arXiv Detail & Related papers (2024-02-03T11:10:57Z)
- Knowledge-Aware Neuron Interpretation for Scene Classification [32.32713349524347]
We propose a knowledge-aware neuron interpretation framework to explain model predictions for image scene classification.
For concept completeness, we present core concepts of a scene based on knowledge graph, ConceptNet, to gauge the completeness of concepts.
For concept fusion, we introduce a knowledge graph-based method known as Concept Filtering, which yields a gain of over 23 percentage points on neuron behaviors for neuron interpretation.
arXiv Detail & Related papers (2024-01-29T01:00:17Z)
- An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z)
- ConcEPT: Concept-Enhanced Pre-Training for Language Models [57.778895980999124]
ConcEPT aims to infuse conceptual knowledge into pre-trained language models.
It exploits external entity concept prediction to predict the concepts of entities mentioned in the pre-training contexts.
Experiments show that ConcEPT acquires improved conceptual knowledge through concept-enhanced pre-training.
arXiv Detail & Related papers (2024-01-11T05:05:01Z)
- SurroCBM: Concept Bottleneck Surrogate Models for Generative Post-hoc Explanation [11.820167569334444]
This paper introduces the Concept Bottleneck Surrogate Models (SurroCBM) to explain black-box models.
SurroCBM identifies shared and unique concepts across various black-box models and employs an explainable surrogate model for post-hoc explanations.
An effective training strategy using self-generated data is proposed to enhance explanation quality continuously.
arXiv Detail & Related papers (2023-10-11T17:46:59Z)
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrate that CG outperforms CAV on both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
- Human-Centered Concept Explanations for Neural Networks [47.71169918421306]
We introduce concept explanations, including the class of Concept Activation Vectors (CAVs); a minimal sketch of the CAV recipe appears after this list.
We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats.
Finally, we discuss some case studies that showcase the utility of such concept-based explanations in synthetic settings and real world applications.
arXiv Detail & Related papers (2022-02-25T01:27:31Z)
- MACE: Model Agnostic Concept Extractor for Explaining Image Classification Networks [10.06397994266945]
We propose MACE: a Model Agnostic Concept Extractor, which can explain the workings of a convolutional network through smaller concepts.
We validate our framework using VGG16 and ResNet50 CNN architectures, and on datasets like Animals With Attributes 2 (AWA2) and Places365.
arXiv Detail & Related papers (2020-11-03T04:40:49Z)
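The Concept Activation Vector referenced in the Human-Centered Concept Explanations entry above is concrete enough to sketch. The code below is a hedged illustration of the standard CAV/TCAV recipe, not any paper's released implementation; `concept_acts`, `random_acts`, and `grads` stand for layer activations and class-logit gradients that a hypothetical model hook would supply.

```python
# Hedged sketch of the CAV/TCAV recipe: train a linear probe separating
# concept examples from random ones in activation space, then test how often
# the class gradient points along the probe's weight vector.
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(concept_acts: np.ndarray, random_acts: np.ndarray) -> np.ndarray:
    """The CAV is the normalized weight vector of a concept-vs-random probe."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    probe = LogisticRegression(max_iter=1000).fit(X, y)
    w = probe.coef_.ravel()
    return w / np.linalg.norm(w)

def tcav_score(grads: np.ndarray, cav: np.ndarray) -> float:
    """Fraction of inputs whose class-logit gradient (taken w.r.t. the layer
    activations) has a positive directional derivative along the CAV."""
    return float(np.mean(grads @ cav > 0))
```

Note that this recipe assumes the concept is linearly separable in the chosen layer's activation space; the Concept Gradient entry above is aimed precisely at dropping that linearity assumption.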