On Isotropy, Contextualization and Learning Dynamics of
Contrastive-based Sentence Representation Learning
- URL: http://arxiv.org/abs/2212.09170v2
- Date: Fri, 26 May 2023 20:40:57 GMT
- Title: On Isotropy, Contextualization and Learning Dynamics of
Contrastive-based Sentence Representation Learning
- Authors: Chenghao Xiao, Yang Long, Noura Al Moubayed
- Abstract summary: It is not well understood why contrastive learning works for learning sentence-level semantics.
We show that contrastive learning brings isotropy and drives high intra-sentence similarity.
We also find that what we formalize as "spurious contextualization" is mitigated for semantically meaningful tokens.
- Score: 8.959800369169798
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Incorporating contrastive learning objectives in sentence representation
learning (SRL) has yielded significant improvements on many sentence-level NLP
tasks. However, it is not well understood why contrastive learning works for
learning sentence-level semantics. In this paper, we aim to help guide future
designs of sentence representation learning methods by taking a closer look at
contrastive SRL through the lens of isotropy, contextualization and learning
dynamics. We interpret its successes through the geometry of the representation
shifts and show that contrastive learning brings isotropy and drives high
intra-sentence similarity: when in the same sentence, tokens converge to
similar positions in the semantic space. We also find that what we formalize as
"spurious contextualization" is mitigated for semantically meaningful tokens,
while amplified for functional ones. We find that the embedding space is
directed towards the origin during training, with more areas now better
defined. We ablate these findings by observing the learning dynamics with
different training temperatures, batch sizes and pooling methods.
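To make these quantities and training knobs concrete, below is a minimal, illustrative sketch (not the authors' released code): it computes intra-sentence similarity as the mean pairwise cosine similarity among a sentence's token embeddings, an anisotropy baseline as the expected cosine similarity between tokens drawn from different sentences, and a SimCSE-style in-batch contrastive loss with mean pooling and a temperature knob. The toy random embeddings, function names, and dimensions are assumptions made for illustration; in practice the token embeddings would come from a transformer encoder.

```python
# Illustrative sketch only: toy tensors stand in for encoder outputs.
import torch
import torch.nn.functional as F


def intra_sentence_similarity(tok_emb: torch.Tensor) -> torch.Tensor:
    """Mean pairwise cosine similarity among the token embeddings of one sentence.

    tok_emb: [num_tokens, hidden_dim] for a single sentence.
    """
    x = F.normalize(tok_emb, dim=-1)
    sim = x @ x.T                                      # [T, T] cosine similarities
    t = sim.size(0)
    off_diag = sim.masked_select(~torch.eye(t, dtype=torch.bool))
    return off_diag.mean()


def anisotropy_baseline(sentences, n_pairs: int = 1000) -> torch.Tensor:
    """Expected cosine similarity between tokens drawn from *different* sentences.

    Values near zero indicate an isotropic space; large positive values indicate
    the anisotropic "cone" typically observed before contrastive tuning.
    """
    sims = []
    for _ in range(n_pairs):
        i, j = torch.randint(len(sentences), (2,)).tolist()
        if i == j:
            continue
        a = sentences[i][torch.randint(sentences[i].size(0), (1,))]
        b = sentences[j][torch.randint(sentences[j].size(0), (1,))]
        sims.append(F.cosine_similarity(a, b).item())
    return torch.tensor(sims).mean()


def in_batch_contrastive_loss(anchor, positive, temperature: float = 0.05):
    """SimCSE-style InfoNCE: row i of `positive` is the positive for row i of
    `anchor`; every other row in the batch serves as a negative."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.T / temperature                     # [B, B]
    labels = torch.arange(a.size(0))
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy stand-ins for encoder outputs: 8 "sentences" of 12 tokens, dim 32.
    sents = [torch.randn(12, 32) for _ in range(8)]
    print("intra-sentence sim :", intra_sentence_similarity(sents[0]).item())
    print("anisotropy baseline:", anisotropy_baseline(sents).item())
    pooled = torch.stack([s.mean(dim=0) for s in sents])   # mean pooling
    noisy = pooled + 0.01 * torch.randn_like(pooled)        # stand-in "positive" view
    print("contrastive loss   :", in_batch_contrastive_loss(pooled, noisy).item())
```

Sweeping `temperature`, the batch size, or the pooling choice in a setup like this mirrors, at a schematic level, the ablations described in the abstract.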
Related papers
- Reframing linguistic bootstrapping as joint inference using visually-grounded grammar induction models [31.006803764376475]
Semantic and syntactic bootstrapping posit that children use their prior knowledge of one linguistic domain, say syntactic relations, to help later acquire another, such as the meanings of new words.
Here, we argue that they are instead both contingent on a more general learning strategy for language acquisition: joint learning.
Using a series of neural visually-grounded grammar induction models, we demonstrate that both syntactic and semantic bootstrapping effects are strongest when syntax and semantics are learnt simultaneously.
arXiv Detail & Related papers (2024-06-17T18:01:06Z) - Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbations such as typos and word-order shuffling, which resonate with human cognitive patterns and allow the perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z) - DenoSent: A Denoising Objective for Self-Supervised Sentence
Representation Learning [59.4644086610381]
We propose a novel denoising objective that works from a different angle, namely the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
arXiv Detail & Related papers (2024-01-24T17:48:45Z) - Subspace Chronicles: How Linguistic Information Emerges, Shifts and
Interacts during Language Model Training [56.74440457571821]
We analyze tasks covering syntax, semantics and reasoning, across 2M pre-training steps and five seeds.
We identify critical learning phases across tasks and time, during which subspaces emerge, share information, and later disentangle to specialize.
Our findings have implications for model interpretability, multi-task learning, and learning from limited data.
arXiv Detail & Related papers (2023-10-25T09:09:55Z) - A Message Passing Perspective on Learning Dynamics of Contrastive
Learning [60.217972614379065]
We show that if we cast a contrastive objective equivalently into the feature space, then its learning dynamics admits an interpretable form.
This perspective also establishes an intriguing connection between contrastive learning and Message Passing Graph Neural Networks (MP-GNNs); a schematic sketch of this gradient-as-message-passing reading appears after this list.
arXiv Detail & Related papers (2023-03-08T08:27:31Z) - Sentence Representation Learning with Generative Objective rather than
Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning achieves sufficiently strong performance improvements and outperforms the current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z) - CMSBERT-CLR: Context-driven Modality Shifting BERT with Contrastive
Learning for linguistic, visual, acoustic Representations [0.7081604594416336]
We present a Context-driven Modality Shifting BERT with Contrastive Learning for linguistic, visual, acoustic Representations (CMSBERT-CLR).
CMSBERT-CLR incorporates the whole context's non-verbal and verbal information and aligns modalities more effectively through contrastive learning.
In our experiments, we demonstrate that our approach achieves state-of-the-art results.
arXiv Detail & Related papers (2022-08-21T08:21:43Z) - Generative or Contrastive? Phrase Reconstruction for Better Sentence
Representation Learning [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning may yield sufficiently powerful sentence representations and achieve performance on Semantic Textual Similarity (STS) tasks on par with contrastive learning.
arXiv Detail & Related papers (2022-04-20T10:00:46Z) - Text Transformations in Contrastive Self-Supervised Learning: A Review [27.25193476131943]
We formalize the contrastive learning framework in the domain of natural language processing.
We describe some challenges and potential directions for learning better text representations using contrastive methods.
arXiv Detail & Related papers (2022-03-22T19:02:43Z)
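As a companion to the message-passing perspective entry above, here is a hedged numerical sketch based on the standard InfoNCE derivation (not necessarily that paper's exact formulation): the gradient of the loss with respect to an anchor embedding equals a softmax-weighted sum of in-batch candidates minus the positive, scaled by 1/temperature, i.e. attention-like "messages" that pull the anchor toward its positive and push it away from the weighted batch average. The variable names and toy tensors are assumptions made for illustration.

```python
# Hedged sketch: verify the "message passing" form of the InfoNCE gradient numerically.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
tau = 0.05
B, D = 8, 16
z = F.normalize(torch.randn(B, D), dim=-1)                  # in-batch candidates
anchor = F.normalize(torch.randn(D), dim=-1).requires_grad_(True)
pos_idx = 0                                                  # index of the positive

# InfoNCE loss for a single anchor.
logits = (z @ anchor) / tau                                  # [B]
loss = F.cross_entropy(logits.unsqueeze(0), torch.tensor([pos_idx]))
loss.backward()

# Closed-form "message passing" view of the same gradient:
# grad = (softmax-weighted sum of candidates - positive) / tau
p = F.softmax(logits.detach(), dim=-1)                       # [B] attention-like weights
closed_form = (p @ z - z[pos_idx]) / tau

print(torch.allclose(anchor.grad, closed_form, atol=1e-5))   # expected: True
```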