Analyzing Encoded Concepts in Transformer Language Models
- URL: http://arxiv.org/abs/2206.13289v1
- Date: Mon, 27 Jun 2022 13:32:10 GMT
- Title: Analyzing Encoded Concepts in Transformer Language Models
- Authors: Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Firoj Alam, Abdul Rafae Khan, Jia Xu
- Abstract summary: ConceptX analyses how latent concepts are encoded in representations learned within pre-trained language models.
It uses clustering to discover the encoded concepts and explains them by aligning with a large set of human-defined concepts.
- Score: 21.76062029833023
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose ConceptX, a novel framework to analyze how latent concepts are encoded in representations learned within pre-trained language models. It uses clustering to discover the encoded concepts and explains them by aligning them with a large set of human-defined concepts. Our analysis of seven transformer language models reveals interesting insights: i) the latent space within the learned representations overlaps with different linguistic concepts to varying degrees, ii) the lower layers of the model are dominated by lexical concepts (e.g., affixation), whereas core-linguistic concepts (e.g., morphological or syntactic relations) are better represented in the middle and higher layers, and iii) some encoded concepts are multi-faceted and cannot be adequately explained using the existing human-defined concepts.
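The pipeline the abstract describes (cluster contextual representations, then align clusters with human-defined concepts) can be illustrated compactly. The sketch below is minimal and assumes token vectors extracted from one layer of a pre-trained model, with part-of-speech tags standing in for the human-defined concepts; the cluster count and alignment threshold are illustrative choices, not the paper's exact settings.

```python
# A minimal sketch, assuming `token_vectors` holds contextual vectors from one
# layer of a pre-trained model and `token_labels` holds POS tags; the cluster
# count and alignment threshold are illustrative, not the paper's settings.
import numpy as np
from collections import Counter
from sklearn.cluster import AgglomerativeClustering

def discover_concepts(token_vectors: np.ndarray, n_clusters: int = 50):
    """Group token representations into latent concepts without supervision."""
    return AgglomerativeClustering(n_clusters=n_clusters).fit_predict(token_vectors)

def align_with_human_concepts(cluster_ids, token_labels, threshold=0.9):
    """Tag a cluster with a human-defined concept (e.g., a POS tag) when at
    least `threshold` of its member tokens carry that label."""
    aligned = {}
    for cid in set(cluster_ids):
        labels = [token_labels[i] for i, c in enumerate(cluster_ids) if c == cid]
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= threshold:
            aligned[cid] = label
    return aligned

# Example: 2,000 tokens with 768-dimensional vectors and dummy POS tags.
vecs = np.random.randn(2000, 768)
tags = np.random.choice(["NOUN", "VERB", "ADJ"], size=2000)
clusters = discover_concepts(vecs)
print(align_with_human_concepts(clusters, tags))
```

Clusters that clear the overlap threshold count as "explained" by a human concept; the multi-faceted clusters the abstract mentions are precisely those that no single label captures.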
Related papers
- A Concept-Based Explainability Framework for Large Multimodal Models [52.37626977572413]
We propose a dictionary-learning-based approach applied to token representations.
We show that the resulting concepts are semantically well grounded in both vision and text.
We show that the extracted multimodal concepts are useful to interpret representations of test samples.
arXiv Detail & Related papers (2024-06-12T10:48:53Z)
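As a rough illustration of the dictionary-learning approach summarized above, the sketch below factors token representations into sparse combinations of learned atoms and treats each atom as a candidate concept. The dictionary size, sparsity penalty, and placeholder activations are assumptions for illustration, not the paper's configuration.

```python
# A minimal sketch of dictionary learning over token representations; the
# random `token_vectors` stand in for activations extracted from a
# multimodal model, and all hyperparameters are illustrative.
import numpy as np
from sklearn.decomposition import DictionaryLearning

token_vectors = np.random.randn(1000, 768)  # placeholder activations

# Each dictionary atom plays the role of a candidate "concept"; each token is
# re-expressed as a sparse combination of those atoms.
learner = DictionaryLearning(n_components=64, alpha=1.0, max_iter=200)
codes = learner.fit_transform(token_vectors)   # sparse concept activations
concepts = learner.components_                 # (64, 768) concept directions

# The tokens with the largest coefficient on an atom ground that concept.
top_tokens_for_concept_0 = np.argsort(-codes[:, 0])[:10]
print(top_tokens_for_concept_0)
```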
- Concept Formation and Alignment in Language Models: Bridging Statistical Patterns in Latent Space to Concept Taxonomy [11.232704182001253]
This paper explores concept formation and alignment in language models (LMs).
We propose a mechanism for identifying concepts and their hierarchical organization within the semantic representations learned by various LMs.
arXiv Detail & Related papers (2024-06-08T01:27:19Z)
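One way to make the "hierarchical organization" above concrete is hierarchical clustering over a semantic space: cutting the merge tree at increasing heights yields progressively coarser concept groupings, i.e., a rough taxonomy. The sketch below is an assumption-laden stand-in for the paper's mechanism; the linkage method and cut levels are illustrative.

```python
# A minimal sketch of reading a concept hierarchy out of an embedding space;
# the placeholder vectors, linkage method, and cluster counts are assumptions.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

word_vectors = np.random.randn(200, 300)  # placeholder embeddings

# Ward linkage builds a full merge tree over the embeddings; cutting it at
# different granularities gives nested concept groupings, from specific
# (many small clusters) to general (a few large ones).
tree = linkage(word_vectors, method="ward")
fine_concepts = fcluster(tree, t=20, criterion="maxclust")
coarse_concepts = fcluster(tree, t=5, criterion="maxclust")
```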
- The Hidden Language of Diffusion Models [70.03691458189604]
We present Conceptor, a novel method to interpret the internal representation of a textual concept by a diffusion model.
We find surprising visual connections between concepts that transcend their textual semantics.
We additionally discover concepts that rely on mixtures of exemplars, biases, renowned artistic styles, or a simultaneous fusion of multiple meanings.
arXiv Detail & Related papers (2023-06-01T17:57:08Z)
- Brain encoding models based on multimodal transformers can transfer across language and vision [60.72020004771044]
We used representations from multimodal transformers to train encoding models that can transfer across fMRI responses to stories and movies.
We found that encoding models trained on brain responses to one modality can successfully predict brain responses to the other modality.
arXiv Detail & Related papers (2023-05-20T17:38:44Z)
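The encoding-model recipe summarized above can be sketched with ridge regression: fit a linear map from transformer features to voxel responses for one modality, then test the fitted weights on the other modality. All shapes, the placeholder data, and the regularization strength below are assumptions for illustration.

```python
# A minimal sketch of a voxelwise encoding model with cross-modal transfer;
# all data here are random placeholders and alpha is an arbitrary choice.
import numpy as np
from sklearn.linear_model import Ridge

story_features = np.random.randn(500, 768)   # transformer features (language)
story_bold = np.random.randn(500, 1000)      # fMRI responses, 1000 voxels
movie_features = np.random.randn(300, 768)   # transformer features (vision)
movie_bold = np.random.randn(300, 1000)

# Fit the encoding model on the language modality...
model = Ridge(alpha=10.0)
model.fit(story_features, story_bold)

# ...then evaluate transfer by predicting responses to the other modality.
predicted = model.predict(movie_features)
voxel_corr = [np.corrcoef(predicted[:, v], movie_bold[:, v])[0, 1]
              for v in range(movie_bold.shape[1])]
print(np.mean(voxel_corr))
```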
- ConceptX: A Framework for Latent Concept Analysis [21.760620298330235]
We present ConceptX, a human-in-the-loop framework for interpreting and annotating the latent representational space of pre-trained Language Models (pLMs).
We use an unsupervised method to discover concepts learned in these models and provide a graphical interface for humans to generate explanations for the concepts.
arXiv Detail & Related papers (2022-11-12T11:31:09Z)
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vectors (CAVs) rely on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrate that CG outperforms CAV on both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
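For context, the linear CAV baseline that Concept Gradient generalizes can be sketched in a few lines: a linear classifier separates activations of concept examples from random counterexamples, and its normalized weight vector serves as the concept direction. The activation shapes below are placeholders, and the gradient-based sensitivity function is a simplified illustration.

```python
# A minimal sketch of the CAV baseline; activations are random placeholders
# standing in for a chosen layer's outputs on concept vs. random inputs.
import numpy as np
from sklearn.linear_model import LogisticRegression

concept_acts = np.random.randn(100, 512)  # layer activations, concept inputs
random_acts = np.random.randn(100, 512)   # layer activations, random inputs

X = np.vstack([concept_acts, random_acts])
y = np.array([1] * 100 + [0] * 100)

clf = LogisticRegression(max_iter=1000).fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])  # unit concept direction

def concept_sensitivity(output_gradient: np.ndarray) -> float:
    """Directional derivative of the model output along the CAV, given the
    gradient of the output with respect to the layer activations."""
    return float(output_gradient @ cav)
```

Concept Gradient's point is that this linear separator is only an approximation; when the concept is a nonlinear function of the activations, chaining gradients through a nonlinear concept model replaces the fixed CAV direction.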
- Discovering Latent Concepts Learned in BERT [21.760620298330235]
We study what latent concepts exist in the pre-trained BERT model.
We also release a novel BERT ConceptNet dataset (BCN) consisting of 174 concept labels and 1M annotated instances.
arXiv Detail & Related papers (2022-05-15T09:45:34Z)
- The Conceptual VAE [7.15767183672057]
We present a new model of concepts, based on the framework of variational autoencoders.
The model is inspired by, and closely related to, the Beta-VAE model of concepts.
We show how the model can be used as a concept classifier, and how it can be adapted to learn from fewer labels per instance.
arXiv Detail & Related papers (2022-03-21T17:27:28Z)
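Since the model above builds on Beta-VAE, a compact sketch of that objective may help: a reconstruction term plus a beta-weighted KL divergence that pressures latent dimensions toward interpretable factors. The PyTorch form below is an illustration of the standard Beta-VAE loss, not the Conceptual VAE's exact formulation; the beta value and reconstruction loss are assumptions.

```python
# A minimal sketch of the Beta-VAE objective the Conceptual VAE builds on;
# the choice of MSE reconstruction and beta=4.0 are illustrative.
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta: float = 4.0):
    """ELBO with a beta-weighted KL term: larger beta trades reconstruction
    quality for more disentangled, concept-like latent dimensions."""
    recon = F.mse_loss(x_recon, x, reduction="sum")
    # KL divergence between the diagonal Gaussian posterior and N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```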
- Human-Centered Concept Explanations for Neural Networks [47.71169918421306]
We introduce concept explanations, including the class of Concept Activation Vectors (CAVs).
We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats.
Finally, we discuss case studies that showcase the utility of such concept-based explanations in synthetic settings and real-world applications.
arXiv Detail & Related papers (2022-02-25T01:27:31Z)
- Towards Visual Semantics [17.1623244298824]
We study how humans build mental representations, i.e., concepts, of what they visually perceive.
In this paper we provide a theory and an algorithm that learns substance concepts corresponding to what we call classification concepts.
The experiments, though preliminary, show that the algorithm manages to acquire the notions of Genus and Differentia with reasonable accuracy.
arXiv Detail & Related papers (2021-04-26T07:28:02Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
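The lattice-theoretic side of this last report can be illustrated with formal concept analysis, where a lattice of (extent, intent) pairs is derived from a binary object-attribute context. The toy context below is invented for illustration, and the brute-force enumeration is a sketch; in the report's setting, learned representations would supply the attributes.

```python
# A minimal formal-concept-analysis sketch over an invented toy context;
# a concept is a pair (extent, intent) closed under the Galois connection.
from itertools import combinations

objects = ["sparrow", "penguin", "bat"]
attributes = {"sparrow": {"flies", "feathers"},
              "penguin": {"feathers", "swims"},
              "bat": {"flies"}}
ALL_ATTRS = set().union(*attributes.values())

def common_attributes(objs):
    """Attributes shared by every object in `objs` (all attributes if empty)."""
    sets = [attributes[o] for o in objs]
    return set.intersection(*sets) if sets else set(ALL_ATTRS)

def formal_concepts():
    """Enumerate all (extent, intent) pairs; closing each candidate extent
    guarantees every pair returned is a formal concept."""
    concepts = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            intent = common_attributes(list(objs))
            extent = frozenset(o for o in objects if intent <= attributes[o])
            concepts.add((extent, frozenset(intent)))
    return concepts

for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
    print(set(extent) or "{}", "<->", set(intent) or "{}")
```

Ordering these concepts by inclusion of their extents yields the concept lattice; the report's contribution is grounding such lattices in representations learned from raw data rather than hand-built contexts.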