Debiased Contrastive Learning of Unsupervised Sentence Representations
- URL: http://arxiv.org/abs/2205.00656v1
- Date: Mon, 2 May 2022 05:07:43 GMT
- Title: Debiased Contrastive Learning of Unsupervised Sentence Representations
- Authors: Kun Zhou, Beichen Zhang, Wayne Xin Zhao and Ji-Rong Wen
- Abstract summary: Contrastive learning is effective in improving pre-trained language models (PLMs) to derive high-quality sentence representations.
Previous works mostly adopt in-batch negatives or sample negatives from the training data at random.
We present a new framework, DCLR, to alleviate the influence of the improper negatives (e.g. false negatives) that such sampling introduces.
- Score: 88.58117410398759
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, contrastive learning has been shown to be effective in improving pre-trained language models (PLMs) to derive high-quality sentence representations. It aims to pull positive examples close to enhance alignment, while pushing irrelevant negatives apart to improve the uniformity of the whole representation space. However, previous works mostly adopt in-batch negatives or sample negatives from the training data at random. This introduces a sampling bias: improper negatives (e.g., false negatives and anisotropic representations) are used to learn sentence representations, which hurts the uniformity of the representation space. To address this, we present a new framework, DCLR (Debiased Contrastive Learning of unsupervised sentence Representations), to alleviate the influence of these improper negatives. In DCLR, we design an instance weighting method to punish false negatives and generate noise-based negatives to guarantee the uniformity of the representation space. Experiments on seven semantic textual similarity tasks show that our approach is more effective than competitive baselines. Our code and data are publicly available at https://github.com/RUCAIBox/DCLR.
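The following is a minimal PyTorch sketch of the two ingredients described in the abstract: an instance-weighting term that suppresses likely false negatives, and noise-based negatives that push the representation space toward uniformity. It is a simplified illustration rather than the authors' implementation (see the linked repository for that): the fixed similarity threshold phi, the untrained Gaussian noise negatives, and the name debiased_contrastive_loss are assumptions made here for brevity.

```python
import torch
import torch.nn.functional as F

def debiased_contrastive_loss(z1, z2, tau=0.05, phi=0.85, num_noise=16):
    """Simplified sketch of a debiased contrastive objective.

    z1, z2: (batch, dim) embeddings of two views of the same sentences;
    matching rows are positives, all other rows are candidate negatives.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)

    cos = z1 @ z2.t()                       # (batch, batch) cosine similarities
    sim = cos / tau                         # temperature-scaled similarities
    pos = sim.diag()                        # positives sit on the diagonal

    # Instance weighting: zero out in-batch negatives whose similarity to the
    # query exceeds a threshold, i.e. likely false negatives.
    with torch.no_grad():
        weights = (cos < phi).float()
        weights.fill_diagonal_(0.0)         # the positive is never a negative

    # Noise-based negatives: random unit vectors appended to the denominator
    # to keep pushing the representation space toward uniformity.
    noise = F.normalize(torch.randn(num_noise, z1.size(1), device=z1.device), dim=-1)
    sim_noise = (z1 @ noise.t()) / tau      # (batch, num_noise)

    denom = pos.exp() + (weights * sim.exp()).sum(-1) + sim_noise.exp().sum(-1)
    return -(pos - denom.log()).mean()
```

In a SimCSE-style setup, z1 and z2 would come from two dropout-perturbed forward passes of the same PLM over the same batch of sentences.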
Related papers
- Contrastive Learning with Negative Sampling Correction [52.990001829393506]
We propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL)
PUCL treats the generated negative samples as unlabeled samples and uses information from positive samples to correct the bias in the contrastive loss.
PUCL can be applied to general contrastive learning problems and outperforms state-of-the-art methods on various image and graph classification tasks.
arXiv Detail & Related papers (2024-01-13T11:18:18Z)
- Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination [62.18768931714238]
We propose a novel False Negative Elimination (FNE) strategy to select negatives via sampling.
The results demonstrate the superiority of our proposed false negative elimination strategy.
arXiv Detail & Related papers (2023-08-08T16:31:43Z)
- Clustering-Aware Negative Sampling for Unsupervised Sentence Representation [24.15096466098421]
ClusterNS is a novel method that incorporates cluster information into contrastive learning for unsupervised sentence representation learning.
We apply a modified K-means clustering algorithm to supply hard negatives and recognize in-batch false negatives during training.
arXiv Detail & Related papers (2023-05-17T02:06:47Z)
- Language Model Pre-training on True Negatives [109.73819321246062]
Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones.
Existing PLMs simply treat all corrupted texts as equally negative, without any examination.
We design enhanced pre-training methods to counteract false negative predictions and encourage pre-training language models on true negatives.
arXiv Detail & Related papers (2022-12-01T12:24:19Z)
- Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding [14.295787044482136]
We present MoCoSE, a momentum contrastive learning model with a negative sample queue for sentence embedding.
We define a maximum traceable distance metric, through which we measure to what extent text contrastive learning benefits from the historical information of negative samples.
Our experiments find that the best results are obtained when the maximum traceable distance is at a certain range, demonstrating that there is an optimal range of historical information for a negative sample queue.
arXiv Detail & Related papers (2022-02-26T08:29:25Z)
- SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples [36.08601841321196]
We propose contrastive learning for unsupervised sentence embedding with soft negative samples.
We show that SNCSE can obtain state-of-the-art performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2022-01-16T06:15:43Z)
- Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval [19.161248757493386]
We propose our TAiloring neGative Sentences with Discrimination and Correction (TAGS-DC) to generate synthetic sentences automatically as negative samples.
To maintain the difficulty of the negatives during training, we mutually improve retrieval and generation through parameter sharing.
In experiments, we verify the effectiveness of our model on MS-COCO and Flickr30K compared with current state-of-the-art models.
arXiv Detail & Related papers (2021-11-05T09:36:41Z)
- AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries [55.059844800514774]
We propose an Adversarial Contrastive (AdCo) model to train representations that are hard to discriminate against positive queries.
Experimental results demonstrate that the proposed Adversarial Contrastive (AdCo) model achieves superior performance.
arXiv Detail & Related papers (2020-11-17T05:45:46Z)
- Contrastive Learning with Hard Negative Samples [80.12117639845678]
We develop a new family of unsupervised sampling methods for selecting hard negative samples.
A limiting case of this sampling results in a representation that tightly clusters each class, and pushes different classes as far apart as possible.
The proposed method improves downstream performance across multiple modalities, requires only a few additional lines of code to implement, and introduces no computational overhead.
arXiv Detail & Related papers (2020-10-09T14:18:53Z)
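The last entry above reports that its hard-negative sampling needs only a few extra lines of code. Below is a hedged sketch of that general idea in the same PyTorch style as the earlier snippet: in-batch negatives are reweighted inside the InfoNCE denominator in proportion to exp(beta * similarity), so harder negatives count more. The temperature tau, the concentration knob beta, and the name hard_negative_nce are assumptions, and the debiasing correction used in that paper is omitted.

```python
import torch
import torch.nn.functional as F

def hard_negative_nce(z1, z2, tau=0.05, beta=1.0):
    """Sketch: InfoNCE with negatives reweighted toward harder (more similar) ones.

    z1, z2: (batch, dim) embeddings of two views; matching rows are positives.
    beta > 0 controls how strongly the sampling concentrates on hard negatives.
    """
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / tau                         # (batch, batch) scaled similarities
    pos = sim.diag()                                # positives on the diagonal

    # Exclude the positive from the negative set by masking the diagonal.
    diag = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    neg = sim.masked_fill(diag, float("-inf"))

    # Importance weights over negatives: proportional to exp(beta * similarity),
    # so harder negatives contribute more to the denominator.
    with torch.no_grad():
        w = torch.softmax(beta * neg, dim=-1)       # rows sum to 1 over true negatives

    n_neg = sim.size(0) - 1                         # keeps the scale of a uniform average
    denom = pos.exp() + n_neg * (w * neg.exp()).sum(-1)
    return -(pos - denom.log()).mean()
```

Because the weights are computed under torch.no_grad(), the reweighting changes which negatives dominate the gradient without itself being differentiated through.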
This list is automatically generated from the titles and abstracts of the papers in this site.