Distilling Semantic Concept Embeddings from Contrastively Fine-Tuned
Language Models
- URL: http://arxiv.org/abs/2305.09785v1
- Date: Tue, 16 May 2023 20:17:02 GMT
- Authors: Na Li, Hanane Kteich, Zied Bouraoui, Steven Schockaert
- Abstract summary: Current strategies for using language models typically represent a concept by averaging the contextualised representations of its mentions in some corpus.
We propose two contrastive learning strategies, based on the view that whenever two sentences reveal similar properties, the corresponding contextualised vectors should also be similar.
One strategy is fully unsupervised, estimating the properties which are expressed in a sentence from the neighbourhood structure of the contextualised word embeddings.
The second strategy instead relies on a distant supervision signal from ConceptNet.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning vectors that capture the meaning of concepts remains a fundamental
challenge. Somewhat surprisingly, perhaps, pre-trained language models have
thus far only enabled modest improvements to the quality of such concept
embeddings. Current strategies for using language models typically represent a
concept by averaging the contextualised representations of its mentions in some
corpus. This is potentially sub-optimal for at least two reasons. First,
contextualised word vectors have an unusual geometry, which hampers downstream
tasks. Second, concept embeddings should capture the semantic properties of
concepts, whereas contextualised word vectors are also affected by other
factors. To address these issues, we propose two contrastive learning
strategies, based on the view that whenever two sentences reveal similar
properties, the corresponding contextualised vectors should also be similar.
One strategy is fully unsupervised, estimating the properties which are
expressed in a sentence from the neighbourhood structure of the contextualised
word embeddings. The second strategy instead relies on a distant supervision
signal from ConceptNet. Our experimental results show that the resulting
vectors substantially outperform existing concept embeddings in predicting the
semantic properties of concepts, with the ConceptNet-based strategy achieving
the best results. These findings are furthermore confirmed in a clustering task
and in the downstream task of ontology completion.
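The two ideas in the abstract, representing a concept by averaging the contextualised vectors of its mentions, and a contrastive objective that pulls together sentences expressing similar properties, can be sketched as follows. This is a minimal illustration using an InfoNCE-style loss over pre-computed vectors; the function names and the specific loss form are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def concept_embedding(mention_vectors):
    """Baseline strategy: represent a concept by averaging the
    contextualised vectors of its mentions in some corpus."""
    return np.mean(mention_vectors, axis=0)

def info_nce_loss(anchors, positives, temperature=0.07):
    """InfoNCE-style contrastive loss: row i of `anchors` and row i of
    `positives` are contextualised vectors of two sentences assumed to
    express similar properties; all other rows act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature          # pairwise cosine similarities
    # log-softmax over each row; the diagonal holds the positive pairs
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))
```

Fine-tuning a language model under such a loss would draw together the mention vectors of sentences that reveal the same properties before they are averaged into concept embeddings.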
Related papers
- Explaining Explainability: Understanding Concept Activation Vectors [35.37586279472797]
Recent interpretability methods propose using concept-based explanations to translate internal representations of deep learning models into a language that humans are familiar with: concepts.
This requires understanding which concepts are present in the representation space of a neural network.
In this work, we investigate three properties of Concept Activation Vectors (CAVs), which are learnt using a probe dataset of concept exemplars.
We introduce tools designed to detect the presence of these properties, provide insight into how they affect the derived explanations, and provide recommendations to minimise their impact.
arXiv Detail & Related papers (2024-04-04T17:46:20Z) - GCPV: Guided Concept Projection Vectors for the Explainable Inspection
of CNN Feature Spaces [1.0923877073891446]
We introduce the local-to-global Guided Concept Projection Vectors (GCPV) approach.
GCPV generates local concept vectors that each precisely reconstruct a concept segmentation label.
It then generalizes these to global concept and even sub-concept vectors by means of hierarchical clustering.
arXiv Detail & Related papers (2023-11-24T12:22:00Z) - Identifying Linear Relational Concepts in Large Language Models [16.917379272022064]
Transformer language models (LMs) have been shown to represent concepts as directions in the latent space of hidden activations.
We present a technique called linear relational concepts (LRC) for finding concept directions corresponding to human-interpretable concepts.
arXiv Detail & Related papers (2023-11-15T14:01:41Z) - How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z) - Rewrite Caption Semantics: Bridging Semantic Gaps for
Language-Supervised Semantic Segmentation [100.81837601210597]
We propose Concept Curation (CoCu) to bridge the gap between visual and textual semantics in pre-training data.
CoCu achieves strong zero-shot transfer performance and improves the language-supervised segmentation baseline by a large margin.
arXiv Detail & Related papers (2023-09-24T00:05:39Z) - Visual Superordinate Abstraction for Robust Concept Learning [80.15940996821541]
Concept learning constructs visual representations that are connected to linguistic semantics.
We ascribe the bottleneck to a failure of exploring the intrinsic semantic hierarchy of visual concepts.
We propose a visual superordinate abstraction framework for explicitly modeling semantic-aware visual subspaces.
arXiv Detail & Related papers (2022-05-28T14:27:38Z) - Learnable Visual Words for Interpretable Image Recognition [70.85686267987744]
We propose the Learnable Visual Words (LVW) to interpret the model prediction behaviors with two novel modules.
The semantic visual words learning relaxes the category-specific constraint, enabling general visual words to be shared across different categories.
Our experiments on six visual benchmarks demonstrate the superior effectiveness of our proposed LVW in both accuracy and model interpretation.
arXiv Detail & Related papers (2022-05-22T03:24:45Z) - Translational Concept Embedding for Generalized Compositional Zero-shot
Learning [73.60639796305415]
Generalized compositional zero-shot learning aims to learn composed concepts of attribute-object pairs in a zero-shot fashion.
This paper introduces a new approach, termed translational concept embedding, to solve these two difficulties in a unified framework.
arXiv Detail & Related papers (2021-12-20T21:27:51Z) - Concept Embedding for Information Retrieval [0.0]
We present three approaches to building concept vectors from word vectors.
We use a vector-based measure to estimate inter-concept similarity.
This could be used to improve the conceptual indexing process.
arXiv Detail & Related papers (2020-02-01T09:18:56Z)
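The idea in the last entry, building concept vectors from word vectors and comparing them with a vector-based similarity measure, can be sketched as follows. This is a minimal illustration assuming simple averaging and cosine similarity; the paper's three approaches may use different aggregation schemes.

```python
import numpy as np

def concept_vector(word_vectors):
    """Build a concept vector by averaging the vectors of the words
    describing the concept (one possible aggregation scheme)."""
    return np.mean(word_vectors, axis=0)

def concept_similarity(c1, c2):
    """Cosine similarity between two concept vectors, usable as a
    vector-based estimate of inter-concept similarity."""
    return float(np.dot(c1, c2) / (np.linalg.norm(c1) * np.linalg.norm(c2)))
```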
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.