Towards Robust Metrics for Concept Representation Evaluation
- URL: http://arxiv.org/abs/2301.10367v1
- Date: Wed, 25 Jan 2023 00:40:19 GMT
- Title: Towards Robust Metrics for Concept Representation Evaluation
- Authors: Mateo Espinosa Zarlenga, Pietro Barbiero, Zohreh Shams, Dmitry
Kazhdan, Umang Bhatt, Adrian Weller, Mateja Jamnik
- Abstract summary: Concept learning models have been shown to be prone to encoding impurities in their representations.
We propose novel metrics for evaluating the purity of concept representations in both concept learning and disentanglement learning.
- Score: 25.549961337814523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on interpretability has focused on concept-based explanations,
where deep learning models are explained in terms of high-level units of
information, referred to as concepts. Concept learning models, however, have
been shown to be prone to encoding impurities in their representations, failing
to fully capture meaningful features of their inputs. While concept learning
lacks metrics to measure such phenomena, the field of disentanglement learning
has explored the related notion of underlying factors of variation in the data,
with plenty of metrics to measure the purity of such factors. In this paper, we
show that such metrics are not appropriate for concept learning and propose
novel metrics for evaluating the purity of concept representations in both
approaches. We show the advantage of these metrics over existing ones and
demonstrate their utility in evaluating the robustness of concept
representations and interventions performed on them. In addition, we show their
utility for benchmarking state-of-the-art methods from both families and find
that, contrary to common assumptions, supervision alone may not be sufficient
for pure concept representations.
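To make the notion of purity concrete, consider a simple probing experiment: if the representation learned for one concept can be used to predict a different ground-truth concept, it is encoding impurities. The sketch below only illustrates this intuition; the metrics proposed in the paper are defined differently and aggregate inter-concept predictability into a single score.

```python
# Illustrative inter-concept impurity probe (a sketch of the intuition, not
# the paper's metrics): the representation of concept i should predict
# concept i well and every other concept j poorly.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def impurity_matrix(concept_reprs, concept_labels):
    """concept_reprs: (n_samples, n_concepts, repr_dim) learned per-concept
    representations; concept_labels: (n_samples, n_concepts) binary ground
    truth. Returns an (n_concepts, n_concepts) matrix of probe AUCs."""
    _, k, _ = concept_reprs.shape
    auc = np.zeros((k, k))
    for i in range(k):        # representation of concept i ...
        for j in range(k):    # ... probed for ground-truth concept j
            probe = LogisticRegression(max_iter=1000)
            probe.fit(concept_reprs[:, i, :], concept_labels[:, j])
            scores = probe.predict_proba(concept_reprs[:, i, :])[:, 1]
            auc[i, j] = roc_auc_score(concept_labels[:, j], scores)
    return auc  # large off-diagonal AUCs signal impure representations
```

A real evaluation would fit the probes on a training split and score them on held-out data; large off-diagonal entries indicate that a concept's representation carries information about unrelated concepts.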
Related papers
- Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces [34.00971641141313]
"Unlearning" certain concepts in large language models (LLMs) has attracted immense attention recently.
Current protocols to evaluate unlearning methods rely on behavioral tests, without monitoring the presence of associated knowledge.
We argue that unlearning should also be evaluated internally, by considering changes in the parametric knowledge traces of the unlearned concepts.
arXiv Detail & Related papers (2024-06-17T15:00:35Z)
- Evaluating Readability and Faithfulness of Concept-based Explanations [35.48852504832633]
Concept-based explanations have emerged as a promising avenue for explaining high-level patterns learned by Large Language Models.
Current methods approach concepts from different perspectives, lacking a unified formalization.
This makes evaluating the core measures of concepts, namely faithfulness and readability, challenging.
arXiv Detail & Related papers (2024-04-29T09:20:25Z)
- ConcEPT: Concept-Enhanced Pre-Training for Language Models [57.778895980999124]
ConcEPT aims to infuse conceptual knowledge into pre-trained language models.
It adds an entity concept prediction objective, predicting the concepts of entities mentioned in the pre-training contexts from external conceptual knowledge.
Experimental results show that ConcEPT acquires improved conceptual knowledge through concept-enhanced pre-training.
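Read literally, this describes an auxiliary classification objective attached to pre-training. A minimal sketch under that reading (the head name, pooling, and loss weighting are illustrative assumptions, not ConcEPT's actual code):

```python
# Hedged sketch of an entity concept prediction objective: entity mention
# representations are asked to predict concept labels drawn from an
# external taxonomy, alongside the usual pre-training loss.
import torch.nn as nn

class ConceptPredictionHead(nn.Module):
    def __init__(self, hidden_dim, n_concepts):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, n_concepts)
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, mention_states, concept_labels):
        # mention_states: (n_mentions, hidden_dim) pooled entity mentions;
        # concept_labels: (n_mentions,) concept ids from an external taxonomy.
        return self.loss_fn(self.classifier(mention_states), concept_labels)

# During pre-training, this auxiliary loss would be added to the main
# objective, e.g.: total_loss = lm_loss + cp_weight * concept_loss
```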
arXiv Detail & Related papers (2024-01-11T05:05:01Z)
- Do Concept Bottleneck Models Respect Localities? [14.77558378567965]
Concept-based methods explain model predictions using human-understandable concepts.
"Localities" involve using only relevant features when predicting a concept's value.
CBMs may not capture localities, even when independent concepts are localised to non-overlapping feature subsets.
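One way to probe this empirically (a hypothetical check, not the paper's protocol) is to perturb only the features a concept should not depend on and measure how much the model's prediction for that concept moves:

```python
# Hypothetical locality probe for a concept predictor: add noise to the
# features *outside* the concept's relevant subset; any change in the
# concept's logits indicates a locality violation.
import torch

def locality_violation(concept_model, x, relevant_mask, n_trials=20, eps=1.0):
    """concept_model: maps inputs (batch, d) to logits for one concept.
    x: (batch, d) inputs. relevant_mask: (d,) bool tensor, True for the
    features the concept is supposed to depend on."""
    with torch.no_grad():
        base = concept_model(x)
        deltas = []
        for _ in range(n_trials):
            noise = eps * torch.randn_like(x)
            noise[:, relevant_mask] = 0.0  # leave relevant features untouched
            deltas.append((concept_model(x + noise) - base).abs().mean())
        # > 0 means the prediction also depends on irrelevant features
        return torch.stack(deltas).mean()
```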
arXiv Detail & Related papers (2024-01-02T16:05:23Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- A Unified Concept-Based System for Local, Global, and Misclassification Explanations [13.321794212377949]
We present a unified concept-based system for unsupervised learning of both local and global concepts.
Our primary objective is to uncover the intrinsic concepts underlying each data category by training surrogate explainer networks.
Our approach facilitates the explanation of both accurate and erroneous predictions.
arXiv Detail & Related papers (2023-06-06T09:28:37Z)
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrate that CG outperforms CAV in both toy examples and real-world datasets.
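For context, the standard CAV recipe (as popularized by the TCAV line of work; this is a generic sketch, not code from this paper) fits a linear separator in activation space and uses its normal as the concept direction, while CG replaces this linear step with the gradient of a general, possibly nonlinear concept function:

```python
# Generic CAV sketch: the CAV is the unit normal of a linear classifier that
# separates activations of concept examples from random counterexamples.
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(acts_concept, acts_random):
    """acts_*: (n, d) activations at a chosen layer. Returns a unit CAV (d,)."""
    X = np.vstack([acts_concept, acts_random])
    y = np.concatenate([np.ones(len(acts_concept)), np.zeros(len(acts_random))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    v = clf.coef_.ravel()
    return v / np.linalg.norm(v)

def concept_sensitivity(grad_output_wrt_acts, cav):
    """Directional derivative of the model output along the CAV; Concept
    Gradient generalizes this step beyond linear concept functions."""
    return grad_output_wrt_acts @ cav
```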
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
- GlanceNets: Interpretabile, Leak-proof Concept-based Models [23.7625973884849]
Concept-based models (CBMs) combine high-performance and interpretability by acquiring and reasoning with a vocabulary of high-level concepts.
We provide a clear definition of interpretability in terms of alignment between the model's representation and an underlying data generation process.
We introduce GlanceNets, a new CBM that exploits techniques from disentangled representation learning and open-set recognition to achieve alignment.
arXiv Detail & Related papers (2022-05-31T08:53:53Z)
- Translational Concept Embedding for Generalized Compositional Zero-shot Learning [73.60639796305415]
Generalized compositional zero-shot learning aims to learn composed concepts of attribute-object pairs in a zero-shot fashion.
This paper introduces a new approach, termed translational concept embedding, to address the difficulties of this task in a unified framework.
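The name suggests a TransE-style composition in which an attribute acts as a translation vector applied to an object embedding; a toy sketch under that assumed reading (not the paper's exact model):

```python
# Toy translational composition: embed an attribute-object pair as
# object embedding + attribute translation (an assumed reading of the
# abstract, not the paper's exact architecture).
import torch.nn as nn

class TranslationalComposer(nn.Module):
    def __init__(self, n_objects, n_attributes, dim=128):
        super().__init__()
        self.obj = nn.Embedding(n_objects, dim)      # object concepts
        self.attr = nn.Embedding(n_attributes, dim)  # attribute translations

    def forward(self, attr_idx, obj_idx):
        # Composed pair embedding = object embedding + attribute translation,
        # so pairs unseen during training can still be composed at test time.
        return self.obj(obj_idx) + self.attr(attr_idx)
```

Recognition would then score an image embedding against these composed pair embeddings, e.g. by nearest-neighbour search in the shared space.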
arXiv Detail & Related papers (2021-12-20T21:27:51Z)
- Separating Skills and Concepts for Novel Visual Question Answering [66.46070380927372]
Generalization to out-of-distribution data has been a problem for Visual Question Answering (VQA) models.
"Skills" are visual tasks, such as counting or attribute recognition, and are applied to "concepts" mentioned in the question.
We present a novel method for learning to compose skills and concepts that separates these two factors implicitly within a model.
arXiv Detail & Related papers (2021-07-19T18:55:10Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
Its main technical goal is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.