Towards Robust Metrics for Concept Representation Evaluation
- URL: http://arxiv.org/abs/2301.10367v1
- Date: Wed, 25 Jan 2023 00:40:19 GMT
- Title: Towards Robust Metrics for Concept Representation Evaluation
- Authors: Mateo Espinosa Zarlenga, Pietro Barbiero, Zohreh Shams, Dmitry
Kazhdan, Umang Bhatt, Adrian Weller, Mateja Jamnik
- Abstract summary: Concept learning models have been shown to be prone to encoding impurities in their representations.
We propose novel metrics for evaluating the purity of concept representations in both concept learning and disentanglement learning.
- Score: 25.549961337814523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on interpretability has focused on concept-based explanations,
where deep learning models are explained in terms of high-level units of
information, referred to as concepts. Concept learning models, however, have
been shown to be prone to encoding impurities in their representations, failing
to fully capture meaningful features of their inputs. While concept learning
lacks metrics to measure such phenomena, the field of disentanglement learning
has explored the related notion of underlying factors of variation in the data,
with plenty of metrics to measure the purity of such factors. In this paper, we
show that such metrics are not appropriate for concept learning and propose
novel metrics for evaluating the purity of concept representations in both
approaches. We show the advantage of these metrics over existing ones and
demonstrate their utility in evaluating the robustness of concept
representations and interventions performed on them. In addition, we show their
utility for benchmarking state-of-the-art methods from both families and find
that, contrary to common assumptions, supervision alone may not be sufficient
for pure concept representations.
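To make the notion of purity concrete, consider a simple probing experiment: if the representation learned for one concept can be used to predict a different ground-truth concept, it is encoding impurities. The sketch below only illustrates this intuition; the metrics proposed in the paper are defined differently and aggregate inter-concept predictability into a single score.

```python
# Illustrative inter-concept impurity probe (a sketch of the intuition, not
# the paper's metrics): the representation of concept i should predict
# concept i well and every other concept j poorly.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def impurity_matrix(concept_reprs, concept_labels):
    """concept_reprs: (n_samples, n_concepts, repr_dim) learned per-concept
    representations; concept_labels: (n_samples, n_concepts) binary ground
    truth. Returns an (n_concepts, n_concepts) matrix of probe AUCs."""
    _, k, _ = concept_reprs.shape
    auc = np.zeros((k, k))
    for i in range(k):        # representation of concept i ...
        for j in range(k):    # ... probed for ground-truth concept j
            probe = LogisticRegression(max_iter=1000)
            probe.fit(concept_reprs[:, i, :], concept_labels[:, j])
            scores = probe.predict_proba(concept_reprs[:, i, :])[:, 1]
            auc[i, j] = roc_auc_score(concept_labels[:, j], scores)
    return auc  # large off-diagonal AUCs signal impure representations
```

A real evaluation would fit the probes on a training split and score them on held-out data; large off-diagonal entries indicate that a concept's representation carries information about unrelated concepts.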
Related papers
- Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces [34.00971641141313]
"Unlearning" certain concepts in large language models (LLMs) has attracted immense attention recently.
Current protocols to evaluate unlearning methods rely on behavioral tests, without monitoring the presence of associated knowledge.
We argue that unlearning should also be evaluated internally, by considering changes in the parametric knowledge traces of the unlearned concepts.
arXiv Detail & Related papers (2024-06-17T15:00:35Z)
- Evaluating Readability and Faithfulness of Concept-based Explanations [35.48852504832633]
Concept-based explanations have emerged as a promising avenue for explaining high-level patterns learned by Large Language Models.
Current methods approach concepts from different perspectives, lacking a unified formalization.
This makes evaluating the core measures of concepts, namely faithfulness and readability, challenging.
arXiv Detail & Related papers (2024-04-29T09:20:25Z)
- ConcEPT: Concept-Enhanced Pre-Training for Language Models [57.778895980999124]
ConcEPT aims to infuse conceptual knowledge into pre-trained language models.
It adds an entity concept prediction objective, predicting the concepts of entities mentioned in the pre-training contexts from external conceptual knowledge.
Experimental results show that ConcEPT acquires improved conceptual knowledge through concept-enhanced pre-training.
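Read literally, this describes an auxiliary classification objective attached to pre-training. A minimal sketch under that reading (the head name, pooling, and loss weighting are illustrative assumptions, not ConcEPT's actual code):

```python
# Hedged sketch of an entity concept prediction objective: entity mention
# representations are asked to predict concept labels drawn from an
# external taxonomy, alongside the usual pre-training loss.
import torch.nn as nn

class ConceptPredictionHead(nn.Module):
    def __init__(self, hidden_dim, n_concepts):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, n_concepts)
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, mention_states, concept_labels):
        # mention_states: (n_mentions, hidden_dim) pooled entity mentions;
        # concept_labels: (n_mentions,) concept ids from an external taxonomy.
        return self.loss_fn(self.classifier(mention_states), concept_labels)

# During pre-training, this auxiliary loss would be added to the main
# objective, e.g.: total_loss = lm_loss + cp_weight * concept_loss
```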
arXiv Detail & Related papers (2024-01-11T05:05:01Z)
- Do Concept Bottleneck Models Respect Localities? [14.77558378567965]
Concept-based methods explain model predictions using human-understandable concepts.
"Localities" involve using only relevant features when predicting a concept's value.
CBMs may not capture localities, even when independent concepts are localised to non-overlapping feature subsets.
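One way to probe this empirically (a hypothetical check, not the paper's protocol) is to perturb only the features a concept should not depend on and measure how much the model's prediction for that concept moves:

```python
# Hypothetical locality probe for a concept predictor: add noise to the
# features *outside* the concept's relevant subset; any change in the
# concept's logits indicates a locality violation.
import torch

def locality_violation(concept_model, x, relevant_mask, n_trials=20, eps=1.0):
    """concept_model: maps inputs (batch, d) to logits for one concept.
    x: (batch, d) inputs. relevant_mask: (d,) bool tensor, True for the
    features the concept is supposed to depend on."""
    with torch.no_grad():
        base = concept_model(x)
        deltas = []
        for _ in range(n_trials):
            noise = eps * torch.randn_like(x)
            noise[:, relevant_mask] = 0.0  # leave relevant features untouched
            deltas.append((concept_model(x + noise) - base).abs().mean())
        # > 0 means the prediction also depends on irrelevant features
        return torch.stack(deltas).mean()
```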
arXiv Detail & Related papers (2024-01-02T16:05:23Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- A Unified Concept-Based System for Local, Global, and Misclassification Explanations [13.321794212377949]
We present a unified concept-based system for unsupervised learning of both local and global concepts.
Our primary objective is to uncover the intrinsic concepts underlying each data category by training surrogate explainer networks.
Our approach facilitates the explanation of both accurate and erroneous predictions.
arXiv Detail & Related papers (2023-06-06T09:28:37Z)
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrate that CG outperforms CAV in both toy examples and real-world datasets.
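For context, the standard CAV recipe (as popularized by the TCAV line of work; this is a generic sketch, not code from this paper) fits a linear separator in activation space and uses its normal as the concept direction, while CG replaces this linear step with the gradient of a general, possibly nonlinear concept function:

```python
# Generic CAV sketch: the CAV is the unit normal of a linear classifier that
# separates activations of concept examples from random counterexamples.
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(acts_concept, acts_random):
    """acts_*: (n, d) activations at a chosen layer. Returns a unit CAV (d,)."""
    X = np.vstack([acts_concept, acts_random])
    y = np.concatenate([np.ones(len(acts_concept)), np.zeros(len(acts_random))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    v = clf.coef_.ravel()
    return v / np.linalg.norm(v)

def concept_sensitivity(grad_output_wrt_acts, cav):
    """Directional derivative of the model output along the CAV; Concept
    Gradient generalizes this step beyond linear concept functions."""
    return grad_output_wrt_acts @ cav
```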
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
- GlanceNets: Interpretabile, Leak-proof Concept-based Models [23.7625973884849]
Concept-based models (CBMs) combine high-performance and interpretability by acquiring and reasoning with a vocabulary of high-level concepts.
We provide a clear definition of interpretability in terms of alignment between the model's representation and an underlying data generation process.
We introduce GlanceNets, a new CBM that exploits techniques from disentangled representation learning and open-set recognition to achieve alignment.
arXiv Detail & Related papers (2022-05-31T08:53:53Z)
- Translational Concept Embedding for Generalized Compositional Zero-shot Learning [73.60639796305415]
Generalized compositional zero-shot learning aims to learn composed concepts of attribute-object pairs in a zero-shot fashion.
This paper introduces a new approach, termed translational concept embedding, to address the difficulties of this task in a unified framework.
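The name suggests a TransE-style composition in which an attribute acts as a translation vector applied to an object embedding; a toy sketch under that assumed reading (not the paper's exact model):

```python
# Toy translational composition: embed an attribute-object pair as
# object embedding + attribute translation (an assumed reading of the
# abstract, not the paper's exact architecture).
import torch.nn as nn

class TranslationalComposer(nn.Module):
    def __init__(self, n_objects, n_attributes, dim=128):
        super().__init__()
        self.obj = nn.Embedding(n_objects, dim)      # object concepts
        self.attr = nn.Embedding(n_attributes, dim)  # attribute translations

    def forward(self, attr_idx, obj_idx):
        # Composed pair embedding = object embedding + attribute translation,
        # so pairs unseen during training can still be composed at test time.
        return self.obj(obj_idx) + self.attr(attr_idx)
```

Recognition would then score an image embedding against these composed pair embeddings, e.g. by nearest-neighbour search in the shared space.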
arXiv Detail & Related papers (2021-12-20T21:27:51Z)
- Separating Skills and Concepts for Novel Visual Question Answering [66.46070380927372]
Generalization to out-of-distribution data has been a problem for Visual Question Answering (VQA) models.
"Skills" are visual tasks, such as counting or attribute recognition, and are applied to "concepts" mentioned in the question.
We present a novel method for learning to compose skills and concepts that separates these two factors implicitly within a model.
arXiv Detail & Related papers (2021-07-19T18:55:10Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
Its main technical goal is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.