DenoSent: A Denoising Objective for Self-Supervised Sentence
Representation Learning
- URL: http://arxiv.org/abs/2401.13621v1
- Date: Wed, 24 Jan 2024 17:48:45 GMT
- Title: DenoSent: A Denoising Objective for Self-Supervised Sentence
Representation Learning
- Authors: Xinghao Wang, Junliang He, Pengyu Wang, Yunhua Zhou, Tianxiang Sun,
Xipeng Qiu
- Abstract summary: We propose a novel denoising objective that takes a different perspective, i.e., the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
- Score: 59.4644086610381
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive-learning-based methods have dominated sentence representation
learning. These methods regularize the representation space by pulling similar
sentence representations closer and pushing away the dissimilar ones and have
been proven effective in various NLP tasks, e.g., semantic textual similarity
(STS) tasks. However, it is challenging for these methods to learn fine-grained
semantics as they only learn from the inter-sentence perspective, i.e., their
supervision signal comes from the relationship between data samples. In this
work, we propose a novel denoising objective that takes a different perspective,
i.e., the intra-sentence perspective. By introducing both discrete
and continuous noise, we generate noisy sentences and then train our model to
restore them to their original form. Our empirical evaluations demonstrate that
this approach delivers competitive results on both semantic textual similarity
(STS) and a wide range of transfer tasks, standing up well in comparison to
contrastive-learning-based methods. Notably, the proposed intra-sentence
denoising objective complements existing inter-sentence contrastive
methodologies and can be integrated with them to further enhance performance.
Our code is available at https://github.com/xinghaow99/DenoSent.
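The abstract does not spell out an architecture, so the following is only a minimal sketch of the intra-sentence idea: corrupt a sentence with discrete noise (randomly dropping tokens) and continuous noise (Gaussian perturbation of the token embeddings), then train the model to restore the original tokens. The encoder, vocabulary size, noise rates, and all other hyperparameters below are illustrative placeholders, not the authors' configuration.

```python
# Minimal sketch of an intra-sentence denoising objective (not the authors' code).
# Discrete noise: randomly drop token embeddings; continuous noise: add Gaussian
# noise; the model is trained to reconstruct the original token ids.
import torch
import torch.nn as nn

class DenoisingSketch(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, drop_rate=0.15, sigma=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)
        self.drop_rate, self.sigma = drop_rate, sigma

    def forward(self, input_ids):  # input_ids: (batch, seq_len)
        emb = self.embed(input_ids)
        # Discrete noise: zero out randomly chosen token embeddings
        # (a simple stand-in for token deletion/masking).
        keep = (torch.rand(input_ids.shape, device=input_ids.device)
                > self.drop_rate).float()
        noisy = emb * keep.unsqueeze(-1)
        # Continuous noise: Gaussian perturbation in the embedding space.
        noisy = noisy + self.sigma * torch.randn_like(noisy)
        hidden = self.encoder(noisy)
        logits = self.lm_head(hidden)  # predict the original token ids
        return nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), input_ids.reshape(-1))

# Usage: loss = DenoisingSketch()(torch.randint(0, 30522, (8, 32))); loss.backward()
```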
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbations such as typos and word-order shuffling, which resonate with human cognitive patterns and allow the perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning [32.178931149612644]
In-context learning enables language models to adapt to downstream data or tasks using a few samples as demonstrations within the prompt.
However, the performance of in-context learning can be unstable depending on the quality, format, or order of demonstrations.
We propose a novel approach, "vocabulary-defined semantics".
arXiv Detail & Related papers (2024-01-29T14:29:48Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
Extensive experiments are conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- SimCSE++: Improving Contrastive Learning for Sentence Embeddings from Two Perspectives [32.6620719893457]
This paper improves contrastive learning for sentence embeddings from two perspectives.
First, we identify that the dropout noise from negative pairs affects the model's performance.
Second, we propose a simple yet effective method to deal with this type of noise.
arXiv Detail & Related papers (2023-05-22T16:24:46Z)
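Several of the contrastive baselines above (SimCSE and the SimCSE++ refinement) build positives from dropout noise: the same batch of sentences is encoded twice under different dropout masks and the two views are pulled together against in-batch negatives. The sketch below shows only that generic InfoNCE formulation; it does not implement the specific SimCSE++ fix, and the temperature and encoder interface are assumptions.

```python
# Sketch of dropout-noise contrastive learning (SimCSE-style InfoNCE).
# This illustrates the common baseline, not the SimCSE++ improvements.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.05):
    """z1, z2: (batch, dim) embeddings of the same sentences under two dropout masks."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / temperature          # (batch, batch) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, labels)      # diagonal entries are the positives

# Usage with any encoder that applies dropout at train time:
# z1, z2 = encoder(batch), encoder(batch)    # two stochastic forward passes
# loss = info_nce(z1, z2)
```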
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a Simple method named Self-Contrastive Learning (SSCL) to alleviate the over-smoothing issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models to boost their performance.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning achieves strong performance improvements and outperforms the current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
- Pair-Level Supervised Contrastive Learning for Natural Language Inference [11.858875850000457]
We propose a Pair-level Supervised Contrastive Learning approach (PairSCL).
A contrastive learning objective is designed to distinguish the varied classes of sentence pairs by pulling those in one class together and pushing apart the pairs in other classes.
We evaluate PairSCL on two public NLI datasets, where it outperforms other methods by 2.1% accuracy on average.
arXiv Detail & Related papers (2022-01-26T13:34:52Z)
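The pull-together / push-apart objective described in the PairSCL entry above is, in essence, a supervised contrastive loss over sentence-pair representations labelled with their NLI class. The sketch below shows a generic loss of that form; how PairSCL actually constructs the pair representation (e.g., any cross-attention module) is not reproduced here, and the temperature is a placeholder.

```python
# Sketch of a pair-level supervised contrastive loss: pair representations sharing
# an NLI label are pulled together, pairs from other classes are pushed apart.
import torch
import torch.nn.functional as F

def pair_supcon(pair_reps, labels, temperature=0.1):
    """pair_reps: (batch, dim) sentence-pair representations; labels: (batch,) NLI classes."""
    z = F.normalize(pair_reps, dim=-1)
    sim = z @ z.t() / temperature
    eye = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye   # same-class, non-self
    # Log-softmax over all non-self pairs for each anchor.
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float('-inf')),
                                     dim=1, keepdim=True)
    pos_counts = pos.sum(1)
    has_pos = pos_counts > 0
    # Average negative log-likelihood of the positives for anchors that have any.
    loss = -(log_prob * pos).sum(1)[has_pos] / pos_counts[has_pos]
    return loss.mean()
```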
- Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text.
These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining.
We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
arXiv Detail & Related papers (2021-09-24T07:20:13Z)
- Adversarial Training with Contrastive Learning in NLP [0.0]
We propose adversarial training with contrastive learning (ATCL) to adversarially train models on language processing tasks.
The core idea is to make linear perturbations in the embedding space of the input via fast gradient methods (FGM) and train the model to keep the original and perturbed representations close via contrastive learning.
The results show not only an improvement in the quantitative scores (perplexity and BLEU) compared to the baselines, but also that ATCL achieves good qualitative results at the semantic level for both tasks.
arXiv Detail & Related papers (2021-09-19T07:23:45Z)
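As a rough illustration of the ATCL recipe just described, the sketch below takes a single FGM step along the gradient of the task loss in the input embedding space and adds a contrastive term that keeps the clean and perturbed sentence representations close. The function interfaces, epsilon, and loss weighting are assumptions; in this simplified version the perturbation is computed with respect to the (detached) embedding output rather than the embedding weights.

```python
# Sketch of FGM-style adversarial perturbation plus a contrastive term that ties
# clean and perturbed representations together (ATCL-like, heavily simplified).
import torch
import torch.nn.functional as F

def atcl_step(embed_fn, encode_fn, task_loss_fn, input_ids, epsilon=1.0, alpha=0.5):
    """embed_fn: ids -> token embeddings; encode_fn: embeddings -> (sentence_rep, logits)."""
    emb = embed_fn(input_ids).detach().requires_grad_(True)
    rep_clean, logits = encode_fn(emb)
    loss_task = task_loss_fn(logits)

    # FGM: one linear step along the normalized gradient of the task loss.
    grad, = torch.autograd.grad(loss_task, emb, retain_graph=True)
    delta = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-8)
    rep_adv, _ = encode_fn(emb + delta.detach())

    # Contrastive term (InfoNCE, as in the earlier sketch): keep each sentence's
    # clean and perturbed representations close, in-batch others apart.
    z1, z2 = F.normalize(rep_clean, dim=-1), F.normalize(rep_adv, dim=-1)
    sim = z1 @ z2.t() / 0.05
    labels = torch.arange(z1.size(0), device=z1.device)
    loss_contrast = F.cross_entropy(sim, labels)
    return loss_task + alpha * loss_contrast
```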