Towards Robust Metrics for Concept Representation Evaluation
- URL: http://arxiv.org/abs/2301.10367v1
- Date: Wed, 25 Jan 2023 00:40:19 GMT
- Authors: Mateo Espinosa Zarlenga, Pietro Barbiero, Zohreh Shams, Dmitry
Kazhdan, Umang Bhatt, Adrian Weller, Mateja Jamnik
- Abstract summary: Concept learning models have been shown to be prone to encoding impurities in their representations.
We propose novel metrics for evaluating the purity of concept representations in both approaches.
- Score: 25.549961337814523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on interpretability has focused on concept-based explanations,
where deep learning models are explained in terms of high-level units of
information, referred to as concepts. Concept learning models, however, have
been shown to be prone to encoding impurities in their representations, failing
to fully capture meaningful features of their inputs. While concept learning
lacks metrics to measure such phenomena, the field of disentanglement learning
has explored the related notion of underlying factors of variation in the data,
with plenty of metrics to measure the purity of such factors. In this paper, we
show that such metrics are not appropriate for concept learning and propose
novel metrics for evaluating the purity of concept representations in both
approaches. We show the advantage of these metrics over existing ones and
demonstrate their utility in evaluating the robustness of concept
representations and interventions performed on them. In addition, we show their
utility for benchmarking state-of-the-art methods from both families and find
that, contrary to common assumptions, supervision alone may not be sufficient
for pure concept representations.
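The impurity the abstract refers to — a concept's learned representation unintentionally carrying information about *other* concepts — can be illustrated with a simple linear probe. The sketch below is a hypothetical toy, not the metric proposed in the paper: it scores how accurately each concept's representation predicts every concept's ground-truth label, so high off-diagonal scores flag impure representations.

```python
import numpy as np

def impurity_matrix(reps, labels):
    """Score how well concept i's learned representation linearly
    predicts concept j's ground-truth binary label, for all pairs (i, j).
    High off-diagonal scores suggest impure concept representations.

    reps:   array of shape (n_samples, n_concepts, rep_dim)
    labels: binary array of shape (n_samples, n_concepts)
    """
    n, k, _ = reps.shape
    scores = np.zeros((k, k))
    for i in range(k):
        # Linear probe on concept i's representation (with a bias column).
        X = np.hstack([reps[:, i, :], np.ones((n, 1))])
        for j in range(k):
            y = labels[:, j].astype(float)
            w, *_ = np.linalg.lstsq(X, y, rcond=None)
            pred = (X @ w) > 0.5
            scores[i, j] = (pred == labels[:, j]).mean()  # probe accuracy
    return scores
```

On synthetic data where concept 0's representation also encodes concept 1's label, the probe recovers that leakage as a high score in entry (0, 1) while the reverse direction stays near chance.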
Related papers
- Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces [34.00971641141313]
"Unlearning" certain concepts in large language models has attracted immense attention recently.
Current protocols to evaluate unlearning methods largely rely on behavioral tests.
We argue that unlearning should also be evaluated internally, by considering changes in parametric knowledge traces.
arXiv Detail & Related papers (2024-06-17T15:00:35Z)
- Evaluating Concept-based Explanations of Language Models: A Study on Faithfulness and Readability [35.48852504832633]
We introduce a formal definition of concept generalizable to diverse concept-based explanations.
We quantify faithfulness via the difference in the output upon perturbation.
We then provide an automatic measure for readability, by measuring the coherence of patterns that maximally activate a concept.
arXiv Detail & Related papers (2024-04-29T09:20:25Z)
- ConcEPT: Concept-Enhanced Pre-Training for Language Models [57.778895980999124]
ConcEPT aims to infuse conceptual knowledge into pre-trained language models.
It exploits external entity concept prediction to predict the concepts of entities mentioned in the pre-training contexts.
Results of experiments show that ConcEPT gains improved conceptual knowledge with concept-enhanced pre-training.
arXiv Detail & Related papers (2024-01-11T05:05:01Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- A Unified Concept-Based System for Local, Global, and Misclassification Explanations [13.321794212377949]
We present a unified concept-based system for unsupervised learning of both local and global concepts.
Our primary objective is to uncover the intrinsic concepts underlying each data category by training surrogate explainer networks.
Our approach facilitates the explanation of both accurate and erroneous predictions.
arXiv Detail & Related papers (2023-06-06T09:28:37Z)
- Enriching Disentanglement: From Logical Definitions to Quantitative Metrics [59.12308034729482]
Disentangling the explanatory factors in complex data is a promising approach for generalizable and data-efficient representation learning.
We establish a theoretical connection between logical definitions of disentanglement and quantitative metrics using topos theory and enriched category theory.
We empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations.
arXiv Detail & Related papers (2023-05-19T08:22:23Z)
- Towards explainable evaluation of language models on the semantic similarity of visual concepts [0.0]
We examine the behavior of high-performing pre-trained language models, focusing on the task of semantic similarity for visual vocabularies.
First, we address the need for explainable evaluation metrics, necessary for understanding the conceptual quality of retrieved instances.
Secondly, adversarial interventions on salient query semantics expose vulnerabilities of opaque metrics and highlight patterns in learned linguistic representations.
arXiv Detail & Related papers (2022-09-08T11:40:57Z)
- GlanceNets: Interpretabile, Leak-proof Concept-based Models [23.7625973884849]
Concept-based models (CBMs) combine high-performance and interpretability by acquiring and reasoning with a vocabulary of high-level concepts.
We provide a clear definition of interpretability in terms of alignment between the model's representation and an underlying data generation process.
We introduce GlanceNets, a new CBM that exploits techniques from disentangled representation learning and open-set recognition to achieve alignment.
arXiv Detail & Related papers (2022-05-31T08:53:53Z)
- Translational Concept Embedding for Generalized Compositional Zero-shot Learning [73.60639796305415]
Generalized compositional zero-shot learning means to learn composed concepts of attribute-object pairs in a zero-shot fashion.
This paper introduces a new approach, termed translational concept embedding, to solve these two difficulties in a unified framework.
arXiv Detail & Related papers (2021-12-20T21:27:51Z)
- Separating Skills and Concepts for Novel Visual Question Answering [66.46070380927372]
Generalization to out-of-distribution data has been a problem for Visual Question Answering (VQA) models.
"Skills" are visual tasks, such as counting or attribute recognition, and are applied to "concepts" mentioned in the question.
We present a novel method for learning to compose skills and concepts that separates these two factors implicitly within a model.
arXiv Detail & Related papers (2021-07-19T18:55:10Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.