Text Transformations in Contrastive Self-Supervised Learning: A Review
- URL: http://arxiv.org/abs/2203.12000v1
- Date: Tue, 22 Mar 2022 19:02:43 GMT
- Title: Text Transformations in Contrastive Self-Supervised Learning: A Review
- Authors: Amrita Bhattacharjee, Mansooreh Karami, Huan Liu
- Abstract summary: We formalize the contrastive learning framework in the domain of natural language processing.
We describe some challenges and potential directions for learning better text representations using contrastive methods.
- Score: 27.25193476131943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive self-supervised learning has become a prominent technique in
representation learning. The main step in these methods is to contrast
semantically similar and dissimilar pairs of samples. In the domain of natural
language, however, designing augmentation methods that create similar pairs
while satisfying the assumptions of contrastive learning is challenging:
even modifying a single word in the input can change the meaning of the
sentence and thereby violate the distributional hypothesis. In this
review paper, we formalize the contrastive learning framework in the domain of
natural language processing. We emphasize the considerations that need to be
addressed in the data transformation step and review the state-of-the-art
methods and evaluations for contrastive representation learning in NLP.
Finally, we describe some challenges and potential directions for learning
better text representations using contrastive methods.
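To make the contrastive step concrete, the following is a minimal sketch (not from the paper) of an InfoNCE-style loss over sentence embeddings, where each sentence and a transformed view of it form a positive pair and the other sentences in the batch serve as negatives; the tensor shapes and temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE-style contrastive loss over a batch of sentence embeddings.

    z1[i] and z2[i] are two views of the same sentence (e.g. the original and a
    transformed version); every other row in the batch serves as a negative.
    """
    z1 = F.normalize(z1, dim=-1)        # unit norm, so dot products are cosine similarities
    z2 = F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / temperature     # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))  # the positive for row i sits in column i
    return F.cross_entropy(sim, targets)

# Toy usage with random vectors; in practice z1 and z2 come from a text encoder
# applied to a sentence and its augmented counterpart.
loss = info_nce_loss(torch.randn(8, 128), torch.randn(8, 128))
```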
Related papers
- HNCSE: Advancing Sentence Embeddings via Hybrid Contrastive Learning with Hard Negatives [17.654412302780557]
HNCSE is a novel contrastive learning framework that extends the leading SimCSE approach.
The hallmark of HNCSE is its use of hard negative samples to strengthen what the model learns from both positive and negative pairs.
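As a rough illustration of how a hard negative enters a SimCSE-style objective, the sketch below appends one explicit hard-negative similarity per anchor to the in-batch similarity matrix; this reflects the general recipe, not HNCSE's actual implementation, and all names and values are assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_loss_with_hard_negatives(anchor, positive, hard_negative, temperature=0.05):
    """SimCSE-style loss where each anchor gets one explicit hard negative
    in addition to the in-batch negatives. All inputs have shape (batch, dim)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    hard_negative = F.normalize(hard_negative, dim=-1)

    pos_sim = anchor @ positive.t()                            # (batch, batch); diagonal holds the positives
    hard_sim = (anchor * hard_negative).sum(-1, keepdim=True)  # (batch, 1); per-anchor hard negative
    logits = torch.cat([pos_sim, hard_sim], dim=1) / temperature
    targets = torch.arange(anchor.size(0))                     # the correct "class" is the diagonal positive
    return F.cross_entropy(logits, targets)

# Toy usage with random embeddings standing in for encoder outputs.
loss = contrastive_loss_with_hard_negatives(torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 64))
```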
arXiv Detail & Related papers (2024-11-19T01:26:20Z)
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods such as typos and word order shuffling, which resonate with human cognitive patterns and allow the perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
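A toy sketch of the kind of surface-level perturbations mentioned above (typo injection and local word-order shuffling), assuming plain string inputs; the paper itself operates on rendered pixel representations, so this is only an illustration of the perturbation idea.

```python
import random

def add_typos(sentence: str, rate: float = 0.05) -> str:
    """Swap adjacent characters at random to simulate typos."""
    chars = list(sentence)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and random.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def shuffle_words(sentence: str, window: int = 3) -> str:
    """Shuffle words within small local windows, keeping global order roughly intact."""
    words = sentence.split()
    for start in range(0, len(words), window):
        chunk = words[start:start + window]
        random.shuffle(chunk)
        words[start:start + window] = chunk
    return " ".join(words)

print(shuffle_words(add_typos("contrastive methods need careful text augmentation")))
```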
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning [59.4644086610381]
We propose a novel denoising objective that approaches the problem from a complementary, intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
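A minimal sketch of the two noise types described above, assuming token-id and embedding tensors as inputs; the helper names, noise rates, and sizes are illustrative assumptions, not the DenoSent implementation.

```python
import random
import torch

def add_discrete_noise(token_ids: torch.Tensor, vocab_size: int, mask_id: int = 0, p: float = 0.15) -> torch.Tensor:
    """Discrete (token-level) noise: randomly mask or replace tokens."""
    noisy = token_ids.clone()
    for i in range(noisy.size(0)):
        if random.random() < p:
            noisy[i] = mask_id if random.random() < 0.5 else random.randrange(vocab_size)
    return noisy

def add_continuous_noise(embeddings: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Continuous (embedding-level) noise: additive Gaussian perturbation."""
    return embeddings + sigma * torch.randn_like(embeddings)

# The denoising objective then trains the model to reconstruct the original
# sentence (or its clean embedding) from the noisy input.
noisy_ids = add_discrete_noise(torch.randint(0, 30000, (12,)), vocab_size=30000)
noisy_emb = add_continuous_noise(torch.randn(12, 768))
```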
arXiv Detail & Related papers (2024-01-24T17:48:45Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
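A minimal sketch of such an element-wise Manhattan distance feature, assuming the text and hypothesis have already been embedded as fixed-size vectors; the function name and dimensionality are illustrative assumptions.

```python
import numpy as np

def manhattan_feature(text_vec: np.ndarray, hypothesis_vec: np.ndarray) -> np.ndarray:
    """Element-wise Manhattan (absolute difference) vector between the text and
    hypothesis embeddings, used as the semantic feature for entailment classification."""
    return np.abs(text_vec - hypothesis_vec)

# Toy usage with random sentence embeddings; a downstream classifier
# (e.g. logistic regression) would consume this feature vector.
feature = manhattan_feature(np.random.rand(300), np.random.rand(300))
```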
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Contextualized language models for semantic change detection: lessons learned [4.436724861363513]
We present a qualitative analysis of the outputs of contextualized embedding-based methods for detecting diachronic semantic change.
Our findings show that contextualized methods can often predict high change scores for words which are not undergoing any real diachronic semantic shift.
Our conclusion is that pre-trained contextualized language models are prone to confound changes in lexicographic senses and changes in contextual variance.
arXiv Detail & Related papers (2022-08-31T23:35:24Z)
- Generative or Contrastive? Phrase Reconstruction for Better Sentence Representation Learning [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative objective can yield sufficiently powerful sentence representations and achieve performance on Semantic Textual Similarity (STS) tasks on par with contrastive learning.
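A toy sketch of a phrase-reconstruction corruption step, assuming whitespace tokenization; the actual objective pairs such a corruption with a decoder that reconstructs the masked phrase, and all names here are assumptions rather than the paper's code.

```python
import random

def mask_phrase(tokens, mask_token="[MASK]", span=3):
    """Replace a random contiguous phrase with mask tokens; the generative
    objective is to reconstruct the masked phrase from the corrupted sentence."""
    tokens = list(tokens)
    start = random.randrange(max(1, len(tokens) - span + 1))
    target = tokens[start:start + span]
    tokens[start:start + span] = [mask_token] * len(target)
    return tokens, target

corrupted, target = mask_phrase("contrastive methods contrast similar and dissimilar pairs".split())
```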
arXiv Detail & Related papers (2022-04-20T10:00:46Z)
- Transductive Learning for Unsupervised Text Style Transfer [60.65782243927698]
Unsupervised style transfer models are mainly based on an inductive learning approach.
We propose a novel transductive learning approach based on a retrieval-based context-aware style representation.
arXiv Detail & Related papers (2021-09-16T08:57:20Z)
- Understanding Synonymous Referring Expressions via Contrastive Features [105.36814858748285]
We develop an end-to-end trainable framework to learn contrastive features on the image and object instance levels.
We conduct extensive experiments to evaluate the proposed algorithm on several benchmark datasets.
arXiv Detail & Related papers (2021-04-20T17:56:24Z)
- Disentangled Contrastive Learning for Learning Robust Textual Representations [13.880693856907037]
We introduce the concept of momentum representation consistency to align features, and leverage power normalization while preserving uniformity.
Our experimental results for the NLP benchmarks demonstrate that our approach can obtain better results compared with the baselines.
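Momentum representation consistency is in the spirit of an exponential-moving-average (momentum) encoder; the sketch below shows such an EMA update under that assumption and is not the paper's implementation.

```python
import copy
import torch
import torch.nn as nn

@torch.no_grad()
def momentum_update(online: nn.Module, target: nn.Module, m: float = 0.999) -> None:
    """EMA update: the momentum (target) encoder slowly tracks the online encoder,
    so the representations it produces stay consistent across training steps."""
    for p, p_t in zip(online.parameters(), target.parameters()):
        p_t.mul_(m).add_(p, alpha=1.0 - m)

# Toy usage: the momentum encoder starts as a copy and is updated each step.
online_encoder = nn.Linear(128, 64)
momentum_encoder = copy.deepcopy(online_encoder)
momentum_update(online_encoder, momentum_encoder)
```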
arXiv Detail & Related papers (2021-04-11T03:32:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.