Concept-Centric Transformers: Enhancing Model Interpretability through
Object-Centric Concept Learning within a Shared Global Workspace
- URL: http://arxiv.org/abs/2305.15775v3
- Date: Tue, 7 Nov 2023 23:56:05 GMT
- Title: Concept-Centric Transformers: Enhancing Model Interpretability through
Object-Centric Concept Learning within a Shared Global Workspace
- Authors: Jinyung Hong, Keun Hee Park, Theodore P. Pavlic
- Abstract summary: Concept-Centric Transformers is a simple yet effective configuration of the shared global workspace for interpretability.
We show that our model achieves better classification accuracy than all baselines across all problems.
- Score: 1.6574413179773757
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many interpretable AI approaches have been proposed to provide plausible
explanations for a model's decision-making. However, configuring an explainable
model that effectively communicates among computational modules has received
less attention. A recently proposed shared global workspace theory showed that
networks of distributed modules can benefit from sharing information with a
bottlenecked memory because the communication constraints encourage
specialization, compositionality, and synchronization among the modules.
Inspired by this, we propose Concept-Centric Transformers, a simple yet
effective configuration of the shared global workspace for interpretability,
consisting of: i) an object-centric memory module that extracts semantic
concepts from input features, ii) a cross-attention mechanism between the
learned concepts and the input embeddings, and iii) standard classification and
explanation losses that allow human analysts to directly assess an explanation
for the model's classification reasoning. We test our approach against other
existing concept-based methods on classification tasks across various datasets,
including CIFAR100, CUB-200-2011, and ImageNet, and we show that our model not
only achieves better classification accuracy than all baselines across all
problems but also generates more consistent concept-based explanations of its
classification output.
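To make the three listed components concrete, here is a minimal PyTorch sketch of the pipeline. It is a hedged reconstruction, not the authors' released code: the module names, dimensions, slot initialization, and the binary-cross-entropy explanation loss are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConceptMemory(nn.Module):
    """i) Object-centric memory: a bank of learnable concept slots that
    attend over input tokens to extract semantic concepts (assumed design)."""

    def __init__(self, num_concepts: int, dim: int):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(num_concepts, dim) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim) patch/token embeddings from any backbone.
        attn = torch.softmax(
            self.slots @ x.transpose(1, 2) / x.size(-1) ** 0.5, dim=-1
        )  # (batch, num_concepts, tokens)
        return attn @ x  # (batch, num_concepts, dim) concept embeddings


class ConceptCentricHead(nn.Module):
    def __init__(self, num_concepts: int, dim: int, num_classes: int):
        super().__init__()
        self.memory = ConceptMemory(num_concepts, dim)
        # ii) Cross-attention: input embeddings query the learned concepts.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor):
        concepts = self.memory(x)                           # (B, C, D)
        out, attn = self.cross_attn(x, concepts, concepts)  # attn: (B, T, C)
        logits = self.classifier(out.mean(dim=1))           # pool, then classify
        return logits, attn  # the attention map doubles as the explanation


# iii) Standard classification loss plus an explanation loss that supervises
# the concept-attention map against (assumed) per-image concept annotations.
def total_loss(logits, attn, labels, concept_targets, lam=1.0):
    cls_loss = F.cross_entropy(logits, labels)
    expl_loss = F.binary_cross_entropy(attn.mean(dim=1), concept_targets)
    return cls_loss + lam * expl_loss
```

For example, `ConceptCentricHead(num_concepts=10, dim=64, num_classes=100)` applied to a `(2, 16, 64)` token tensor returns `(2, 100)` class logits together with a `(2, 16, 10)` token-by-concept attention map that an analyst can inspect per image.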
Related papers
- Decompose the model: Mechanistic interpretability in image models with Generalized Integrated Gradients (GIG) [24.02036048242832]
This paper introduces a novel approach that traces the entire pathway from input through all intermediate layers to the final output, across the whole dataset.
We utilize Pointwise Feature Vectors (PFVs) and Effective Receptive Fields (ERFs) to decompose model embeddings into interpretable Concept Vectors.
Then, we calculate the relevance between concept vectors using our Generalized Integrated Gradients (GIG), enabling a comprehensive, dataset-wide analysis of model behavior.
arXiv Detail & Related papers (2024-09-03T05:19:35Z)
- AdaCBM: An Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis [38.16978432272716]
The integration of vision-language models such as CLIP and Concept Bottleneck Models (CBMs) offers a promising approach to explaining deep neural network (DNN) decisions.
While CLIP provides both explainability and zero-shot classification capability, its pre-training on generic image and text data may limit its classification accuracy and applicability to medical image diagnostic tasks.
This paper takes an unconventional approach by re-examining the CBM framework through the lens of its geometrical representation as a simple linear classification system.
arXiv Detail & Related papers (2024-08-04T11:59:09Z)
- Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models [57.86303579812877]
Concept Bottleneck Models (CBMs) ground image classification on human-understandable concepts to allow for interpretable model decisions.
Existing approaches often require numerous human interventions per image to achieve strong performance.
We introduce a trainable concept realignment intervention module, which leverages concept relations to realign concept assignments post-intervention.
arXiv Detail & Related papers (2024-05-02T17:59:01Z)
- MCPNet: An Interpretable Classifier via Multi-Level Concept Prototypes [24.28807025839685]
We argue that explanations lacking insight into the decision processes of low- and mid-level features are neither fully faithful nor useful.
We propose a novel paradigm that learns and aligns multi-level concept prototype distributions for classification via a Class-aware Concept Distribution (CCD) loss.
arXiv Detail & Related papers (2024-04-13T11:13:56Z)
- Self-Supervised Representation Learning with Meta Comprehensive Regularization [11.387994024747842]
We introduce a module called CompMod with Meta Comprehensive Regularization (MCR), embedded into existing self-supervised frameworks.
We update our proposed model through a bi-level optimization mechanism, enabling it to capture comprehensive features.
We provide theoretical support for our proposed method from information-theoretic and causal counterfactual perspectives.
arXiv Detail & Related papers (2024-03-03T15:53:48Z)
- Advancing Ante-Hoc Explainable Models through Generative Adversarial Networks [24.45212348373868]
This paper presents a novel concept learning framework for enhancing model interpretability and performance in visual classification tasks.
Our approach appends an unsupervised explanation generator to the primary classifier network and makes use of adversarial training.
This work presents a significant step towards building inherently interpretable deep vision models with task-aligned concept representations.
arXiv Detail & Related papers (2024-01-09T16:16:16Z)
- Universal Information Extraction as Unified Semantic Matching [54.19974454019611]
We decouple information extraction into two abilities, structuring and conceptualizing, which are shared by different tasks and schemas.
Based on this paradigm, we propose to universally model various IE tasks with the Unified Semantic Matching (USM) framework.
In this way, USM can jointly encode schema and input text, uniformly extract substructures in parallel, and controllably decode target structures on demand.
arXiv Detail & Related papers (2023-01-09T11:51:31Z)
- Translational Concept Embedding for Generalized Compositional Zero-shot Learning [73.60639796305415]
Generalized compositional zero-shot learning aims to learn composed concepts of attribute-object pairs in a zero-shot fashion.
This paper introduces a new approach, termed translational concept embedding, to address two key difficulties of this task in a unified framework.
arXiv Detail & Related papers (2021-12-20T21:27:51Z)
- Visual Concept Reasoning Networks [93.99840807973546]
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.
We propose to exploit this strategy and combine it with our Visual Concept Reasoning Networks (VCRNet) to enable reasoning between high-level visual concepts.
Our proposed model, VCRNet, consistently improves performance while increasing the number of parameters by less than 1%.
arXiv Detail & Related papers (2020-08-26T20:02:40Z)
- Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained to synthesize images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
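To illustrate the closed-form factorization idea in the last entry, the sketch below recovers latent directions from a projection weight by taking its leading right-singular vectors (equivalently, the top eigenvectors of W^T W). The layer choice, shapes, and edit magnitude are assumptions rather than the paper's exact recipe; the weight here is a random stand-in for a real pre-trained GAN layer.

```python
import torch


def closed_form_directions(weight: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Top-k semantic directions from a projection weight of shape
    (out_dim, latent_dim): the leading right-singular vectors of W,
    i.e., the top eigenvectors of W^T W. No data or sampling needed."""
    _, _, vh = torch.linalg.svd(weight, full_matrices=False)
    return vh[:k]  # (k, latent_dim), unit-norm rows


# Usage sketch with a random stand-in for a real pre-trained GAN layer.
latent_dim = 512
weight = torch.randn(1024, latent_dim)      # placeholder projection weight
directions = closed_form_directions(weight, k=3)
z = torch.randn(1, latent_dim)              # a latent code to edit
z_edited = z + 3.0 * directions[0]          # move along one semantic axis
```

Because the directions come straight from the weights, the discovery step needs neither data sampling nor training, which is the point of the closed-form formulation.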
This list is automatically generated from the titles and abstracts of the papers on this site.