Analyzing Encoded Concepts in Transformer Language Models
- URL: http://arxiv.org/abs/2206.13289v1
- Date: Mon, 27 Jun 2022 13:32:10 GMT
- Title: Analyzing Encoded Concepts in Transformer Language Models
- Authors: Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Firoj Alam, Abdul Rafae Khan, Jia Xu
- Abstract summary: ConceptX analyses how latent concepts are encoded in representations learned within pre-trained language models.
It uses clustering to discover the encoded concepts and explains them by aligning with a large set of human-defined concepts.
- Score: 21.76062029833023
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose ConceptX, a novel framework to analyze how latent concepts are encoded in representations learned within pre-trained language models. It uses clustering to discover the encoded concepts and explains them by aligning them with a large set of human-defined concepts. Our analysis of seven transformer language models reveals interesting insights: i) the latent space within the learned representations overlaps with different linguistic concepts to varying degrees, ii) the lower layers of the model are dominated by lexical concepts (e.g., affixation), whereas core-linguistic concepts (e.g., morphological or syntactic relations) are better represented in the middle and higher layers, and iii) some encoded concepts are multi-faceted and cannot be adequately explained using the existing human-defined concepts.
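The pipeline the abstract describes (cluster contextual representations, then align clusters with human-defined concepts) can be illustrated compactly. The sketch below is minimal and assumes token vectors extracted from one layer of a pre-trained model, with part-of-speech tags standing in for the human-defined concepts; the cluster count and alignment threshold are illustrative choices, not the paper's exact settings.

```python
# A minimal sketch, assuming `token_vectors` holds contextual vectors from one
# layer of a pre-trained model and `token_labels` holds POS tags; the cluster
# count and alignment threshold are illustrative, not the paper's settings.
import numpy as np
from collections import Counter
from sklearn.cluster import AgglomerativeClustering

def discover_concepts(token_vectors: np.ndarray, n_clusters: int = 50):
    """Group token representations into latent concepts without supervision."""
    return AgglomerativeClustering(n_clusters=n_clusters).fit_predict(token_vectors)

def align_with_human_concepts(cluster_ids, token_labels, threshold=0.9):
    """Tag a cluster with a human-defined concept (e.g., a POS tag) when at
    least `threshold` of its member tokens carry that label."""
    aligned = {}
    for cid in set(cluster_ids):
        labels = [token_labels[i] for i, c in enumerate(cluster_ids) if c == cid]
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= threshold:
            aligned[cid] = label
    return aligned

# Example: 2,000 tokens with 768-dimensional vectors and dummy POS tags.
vecs = np.random.randn(2000, 768)
tags = np.random.choice(["NOUN", "VERB", "ADJ"], size=2000)
clusters = discover_concepts(vecs)
print(align_with_human_concepts(clusters, tags))
```

Clusters that clear the overlap threshold count as "explained" by a human concept; the multi-faceted clusters the abstract mentions are precisely those that no single label captures.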
Related papers
- A Concept-Based Explainability Framework for Large Multimodal Models [52.37626977572413]
We propose a dictionary-learning-based approach applied to token representations.
We show that the resulting concepts are semantically well grounded in both vision and text.
We show that the extracted multimodal concepts are useful to interpret representations of test samples.
arXiv Detail & Related papers (2024-06-12T10:48:53Z)
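As a rough illustration of the dictionary-learning approach summarized above, the sketch below factors token representations into sparse combinations of learned atoms and treats each atom as a candidate concept. The dictionary size, sparsity penalty, and placeholder activations are assumptions for illustration, not the paper's configuration.

```python
# A minimal sketch of dictionary learning over token representations; the
# random `token_vectors` stand in for activations extracted from a
# multimodal model, and all hyperparameters are illustrative.
import numpy as np
from sklearn.decomposition import DictionaryLearning

token_vectors = np.random.randn(1000, 768)  # placeholder activations

# Each dictionary atom plays the role of a candidate "concept"; each token is
# re-expressed as a sparse combination of those atoms.
learner = DictionaryLearning(n_components=64, alpha=1.0, max_iter=200)
codes = learner.fit_transform(token_vectors)   # sparse concept activations
concepts = learner.components_                 # (64, 768) concept directions

# The tokens with the largest coefficient on an atom ground that concept.
top_tokens_for_concept_0 = np.argsort(-codes[:, 0])[:10]
print(top_tokens_for_concept_0)
```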
- Concept Formation and Alignment in Language Models: Bridging Statistical Patterns in Latent Space to Concept Taxonomy [11.232704182001253]
This paper explores concept formation and alignment in language models (LMs).
We propose a mechanism for identifying concepts and their hierarchical organization within the semantic representations learned by various LMs.
arXiv Detail & Related papers (2024-06-08T01:27:19Z)
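One way to make the "hierarchical organization" above concrete is hierarchical clustering over a semantic space: cutting the merge tree at increasing heights yields progressively coarser concept groupings, i.e., a rough taxonomy. The sketch below is an assumption-laden stand-in for the paper's mechanism; the linkage method and cut levels are illustrative.

```python
# A minimal sketch of reading a concept hierarchy out of an embedding space;
# the placeholder vectors, linkage method, and cluster counts are assumptions.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

word_vectors = np.random.randn(200, 300)  # placeholder embeddings

# Ward linkage builds a full merge tree over the embeddings; cutting it at
# different granularities gives nested concept groupings, from specific
# (many small clusters) to general (a few large ones).
tree = linkage(word_vectors, method="ward")
fine_concepts = fcluster(tree, t=20, criterion="maxclust")
coarse_concepts = fcluster(tree, t=5, criterion="maxclust")
```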
- The Hidden Language of Diffusion Models [70.03691458189604]
We present Conceptor, a novel method to interpret the internal representation of a textual concept by a diffusion model.
We find surprising visual connections between concepts that transcend their textual semantics.
We additionally discover concepts that rely on mixtures of exemplars, biases, renowned artistic styles, or a simultaneous fusion of multiple meanings.
arXiv Detail & Related papers (2023-06-01T17:57:08Z)
- Brain encoding models based on multimodal transformers can transfer across language and vision [60.72020004771044]
We used representations from multimodal transformers to train encoding models that can transfer across fMRI responses to stories and movies.
We found that encoding models trained on brain responses to one modality can successfully predict brain responses to the other modality.
arXiv Detail & Related papers (2023-05-20T17:38:44Z)
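The encoding-model recipe summarized above can be sketched with ridge regression: fit a linear map from transformer features to voxel responses for one modality, then test the fitted weights on the other modality. All shapes, the placeholder data, and the regularization strength below are assumptions for illustration.

```python
# A minimal sketch of a voxelwise encoding model with cross-modal transfer;
# all data here are random placeholders and alpha is an arbitrary choice.
import numpy as np
from sklearn.linear_model import Ridge

story_features = np.random.randn(500, 768)   # transformer features (language)
story_bold = np.random.randn(500, 1000)      # fMRI responses, 1000 voxels
movie_features = np.random.randn(300, 768)   # transformer features (vision)
movie_bold = np.random.randn(300, 1000)

# Fit the encoding model on the language modality...
model = Ridge(alpha=10.0)
model.fit(story_features, story_bold)

# ...then evaluate transfer by predicting responses to the other modality.
predicted = model.predict(movie_features)
voxel_corr = [np.corrcoef(predicted[:, v], movie_bold[:, v])[0, 1]
              for v in range(movie_bold.shape[1])]
print(np.mean(voxel_corr))
```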
- ConceptX: A Framework for Latent Concept Analysis [21.760620298330235]
We present ConceptX, a human-in-the-loop framework for interpreting and annotating the latent representational space of pre-trained Language Models (pLMs).
We use an unsupervised method to discover concepts learned in these models and provide a graphical interface for humans to generate explanations for the concepts.
arXiv Detail & Related papers (2022-11-12T11:31:09Z)
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vectors (CAVs) rely on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrate that CG outperforms CAV on both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
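For context, the linear CAV baseline that Concept Gradient generalizes can be sketched in a few lines: a linear classifier separates activations of concept examples from random counterexamples, and its normalized weight vector serves as the concept direction. The activation shapes below are placeholders, and the gradient-based sensitivity function is a simplified illustration.

```python
# A minimal sketch of the CAV baseline; activations are random placeholders
# standing in for a chosen layer's outputs on concept vs. random inputs.
import numpy as np
from sklearn.linear_model import LogisticRegression

concept_acts = np.random.randn(100, 512)  # layer activations, concept inputs
random_acts = np.random.randn(100, 512)   # layer activations, random inputs

X = np.vstack([concept_acts, random_acts])
y = np.array([1] * 100 + [0] * 100)

clf = LogisticRegression(max_iter=1000).fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])  # unit concept direction

def concept_sensitivity(output_gradient: np.ndarray) -> float:
    """Directional derivative of the model output along the CAV, given the
    gradient of the output with respect to the layer activations."""
    return float(output_gradient @ cav)
```

Concept Gradient's point is that this linear separator is only an approximation; when the concept is a nonlinear function of the activations, chaining gradients through a nonlinear concept model replaces the fixed CAV direction.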
- Discovering Latent Concepts Learned in BERT [21.760620298330235]
We study what latent concepts exist in the pre-trained BERT model.
We also release a novel BERT ConceptNet dataset (BCN) consisting of 174 concept labels and 1M annotated instances.
arXiv Detail & Related papers (2022-05-15T09:45:34Z)
- The Conceptual VAE [7.15767183672057]
We present a new model of concepts, based on the framework of variational autoencoders.
The model is inspired by, and closely related to, the Beta-VAE model of concepts.
We show how the model can be used as a concept classifier, and how it can be adapted to learn from fewer labels per instance.
arXiv Detail & Related papers (2022-03-21T17:27:28Z)
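Since the model above builds on Beta-VAE, a compact sketch of that objective may help: a reconstruction term plus a beta-weighted KL divergence that pressures latent dimensions toward interpretable factors. The PyTorch form below is an illustration of the standard Beta-VAE loss, not the Conceptual VAE's exact formulation; the beta value and reconstruction loss are assumptions.

```python
# A minimal sketch of the Beta-VAE objective the Conceptual VAE builds on;
# the choice of MSE reconstruction and beta=4.0 are illustrative.
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta: float = 4.0):
    """ELBO with a beta-weighted KL term: larger beta trades reconstruction
    quality for more disentangled, concept-like latent dimensions."""
    recon = F.mse_loss(x_recon, x, reduction="sum")
    # KL divergence between the diagonal Gaussian posterior and N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```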
- Human-Centered Concept Explanations for Neural Networks [47.71169918421306]
We introduce concept explanations, including the class of Concept Activation Vectors (CAVs).
We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats.
Finally, we discuss case studies that showcase the utility of such concept-based explanations in synthetic settings and real-world applications.
arXiv Detail & Related papers (2022-02-25T01:27:31Z)
- Towards Visual Semantics [17.1623244298824]
We study how humans build mental representations, i.e., concepts, of what they visually perceive.
In this paper we provide a theory and an algorithm that learns substance concepts corresponding to what we call classification concepts.
The experiments, though preliminary, show that the algorithm manages to acquire the notions of Genus and Differentia with reasonable accuracy.
arXiv Detail & Related papers (2021-04-26T07:28:02Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
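The lattice-theoretic side of this last report can be illustrated with formal concept analysis, where a lattice of (extent, intent) pairs is derived from a binary object-attribute context. The toy context below is invented for illustration, and the brute-force enumeration is a sketch; in the report's setting, learned representations would supply the attributes.

```python
# A minimal formal-concept-analysis sketch over an invented toy context;
# a concept is a pair (extent, intent) closed under the Galois connection.
from itertools import combinations

objects = ["sparrow", "penguin", "bat"]
attributes = {"sparrow": {"flies", "feathers"},
              "penguin": {"feathers", "swims"},
              "bat": {"flies"}}
ALL_ATTRS = set().union(*attributes.values())

def common_attributes(objs):
    """Attributes shared by every object in `objs` (all attributes if empty)."""
    sets = [attributes[o] for o in objs]
    return set.intersection(*sets) if sets else set(ALL_ATTRS)

def formal_concepts():
    """Enumerate all (extent, intent) pairs; closing each candidate extent
    guarantees every pair returned is a formal concept."""
    concepts = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            intent = common_attributes(list(objs))
            extent = frozenset(o for o in objects if intent <= attributes[o])
            concepts.add((extent, frozenset(intent)))
    return concepts

for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
    print(set(extent) or "{}", "<->", set(intent) or "{}")
```

Ordering these concepts by inclusion of their extents yields the concept lattice; the report's contribution is grounding such lattices in representations learned from raw data rather than hand-built contexts.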