Whitening-based Contrastive Learning of Sentence Embeddings
- URL: http://arxiv.org/abs/2305.17746v2
- Date: Thu, 8 Jun 2023 05:33:55 GMT
- Title: Whitening-based Contrastive Learning of Sentence Embeddings
- Authors: Wenjie Zhuo, Yifan Sun, Xiaohan Wang, Linchao Zhu, Yi Yang
- Abstract summary: This paper presents a whitening-based contrastive learning method for sentence embedding learning (WhitenedCSE).
We find that these two approaches are not totally redundant but actually have some complementarity due to different uniformity mechanisms.
- Score: 61.38955786965527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a whitening-based contrastive learning method for
sentence embedding learning (WhitenedCSE), which combines contrastive learning
with a novel shuffled group whitening. Generally, contrastive learning pulls
distortions of a single sample (i.e., positive samples) close and pushes negative
samples far away, correspondingly facilitating the alignment and uniformity in
the feature space. A popular alternative to the "pushing" operation is
whitening the feature space, which scatters all the samples for uniformity.
Since whitening and contrastive learning have large redundancy w.r.t.
uniformity, they are usually used separately and do not easily work
together. For the first time, this paper integrates whitening into the
contrastive learning scheme and facilitates two benefits. 1) Better uniformity.
We find that these two approaches are not totally redundant but actually have
some complementarity due to different uniformity mechanisms. 2) Better
alignment. We randomly divide the feature into multiple groups along the
channel axis and perform whitening independently within each group. By
shuffling the group division, we derive multiple distortions of a single sample
and thus increase the positive sample diversity. Consequently, using multiple
positive samples with enhanced diversity further improves contrastive learning
due to better alignment. Extensive experiments on seven semantic textual
similarity tasks show our method achieves consistent improvement over the
contrastive learning baseline and sets a new state of the art, e.g., 78.78%
(+2.53% based on BERT-base) Spearman correlation on STS tasks.
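As an illustration of the core idea, below is a minimal, hedged PyTorch sketch of shuffled group whitening used as a positive-view generator for contrastive learning. This is not the authors' released code: the names (shuffled_group_whiten, zca_whiten, contrastive_loss), the ZCA-style per-group whitening, and the group count of 4 are assumptions made for the example.

```python
# Sketch only: shuffled group whitening as described in the abstract, assuming
# a ZCA-style whitening within each randomly shuffled channel group.
import torch
import torch.nn.functional as F

def zca_whiten(x, eps=1e-5):
    """ZCA-whiten a (batch, dim) block so its covariance is roughly identity."""
    x = x - x.mean(dim=0, keepdim=True)
    cov = x.t() @ x / (x.shape[0] - 1)
    eigval, eigvec = torch.linalg.eigh(cov + eps * torch.eye(x.shape[1]))
    w = eigvec @ torch.diag(eigval.clamp_min(eps).rsqrt()) @ eigvec.t()
    return x @ w

def shuffled_group_whiten(feats, num_groups=4):
    """Randomly permute channels, whiten each contiguous group over the batch,
    then undo the permutation; different shuffles give different distortions."""
    n, d = feats.shape
    perm = torch.randperm(d)
    inv = torch.argsort(perm)
    groups = feats[:, perm].chunk(num_groups, dim=1)
    whitened = torch.cat([zca_whiten(g) for g in groups], dim=1)
    return whitened[:, inv]

def contrastive_loss(view_a, view_b, temperature=0.05):
    """InfoNCE over the batch: the two whitened views of a sentence are positives,
    all other in-batch sentences act as negatives."""
    a = F.normalize(view_a, dim=1)
    b = F.normalize(view_b, dim=1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.shape[0])
    return F.cross_entropy(logits, targets)

# Usage: two independent channel shuffles of the same sentence embeddings
# produce two diversified positives, as described in the abstract.
feats = torch.randn(256, 768)   # e.g. [CLS] embeddings from a BERT encoder
view_a = shuffled_group_whiten(feats)
view_b = shuffled_group_whiten(feats)
loss = contrastive_loss(view_a, view_b)
```

In this sketch the whitening statistics are computed per batch within each group, so the scattering (uniformity) comes from whitening while the InfoNCE term handles alignment of the shuffled positive views.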
Related papers
- Rethinking Positive Pairs in Contrastive Learning [19.149235307036324]
We present Hydra, a universal contrastive learning framework for visual representations that extends conventional contrastive learning to accommodate arbitrary pairs.
Our approach is validated using IN1K, where 1K diverse classes compose 500,500 pairs, most of which are distinct.
Our work highlights the value of learning common features of arbitrary pairs and potentially broadens the applicability of contrastive learning techniques to sample pairs with weak relationships.
arXiv Detail & Related papers (2024-10-23T18:07:18Z)
- Contrastive Learning with Negative Sampling Correction [52.990001829393506]
We propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL).
PUCL treats the generated negative samples as unlabeled samples and uses information from positive samples to correct bias in contrastive loss.
PUCL can be applied to general contrastive learning problems and outperforms state-of-the-art methods on various image and graph classification tasks.
arXiv Detail & Related papers (2024-01-13T11:18:18Z)
- Synthetic Hard Negative Samples for Contrastive Learning [8.776888865665024]
This paper proposes a novel feature-level method, namely sampling synthetic hard negative samples for contrastive learning (SSCL).
We generate more and harder negative samples by mixing negative samples, and then sample them by controlling the contrast of the anchor sample with the other negative samples.
Our proposed method improves the classification performance on different image datasets and can be readily integrated into existing methods.
arXiv Detail & Related papers (2023-04-06T09:54:35Z)
- An Investigation into Whitening Loss for Self-supervised Learning [62.157102463386394]
A desirable objective in self-supervised learning (SSL) is to avoid feature collapse.
We propose a framework with an informative indicator to analyze whitening loss.
Based on our analysis, we propose channel whitening with random group partition (CW-RGP).
arXiv Detail & Related papers (2022-10-07T14:43:29Z)
- Debiased Contrastive Learning of Unsupervised Sentence Representations [88.58117410398759]
Contrastive learning is effective in improving pre-trained language models (PLMs) to derive high-quality sentence representations.
Previous works mostly adopt in-batch negatives or sample from training data at random.
We present a new framework, DCLR, to alleviate the influence of these improper negatives.
arXiv Detail & Related papers (2022-05-02T05:07:43Z)
- Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap [64.60460828425502]
We propose a new guarantee on the downstream performance of contrastive learning.
Our new theory hinges on the insight that the support of different intra-class samples will become more overlapped under aggressive data augmentations.
We propose an unsupervised model selection metric ARC that aligns well with downstream accuracy.
arXiv Detail & Related papers (2022-03-25T05:36:26Z)
- SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples [36.08601841321196]
We propose contrastive learning for unsupervised sentence embedding with soft negative samples.
We show that SNCSE can obtain state-of-the-art performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2022-01-16T06:15:43Z)
- Whitening for Self-Supervised Representation Learning [129.57407186848917]
We propose a new loss function for self-supervised representation learning (SSL) based on the whitening of latent-space features.
Our solution does not require asymmetric networks and it is conceptually simple.
arXiv Detail & Related papers (2020-07-13T12:33:25Z)
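For context on the whitening loss summarized above, here is a minimal sketch in the spirit of that line of work, not that paper's official implementation: the names whiten and whitening_mse_loss, the Cholesky-based whitening, and the MSE pairing of views are assumptions made for the example. Batch features are whitened so their covariance is roughly the identity, and positive pairs are pulled together with a simple distance loss, so no explicit negatives are needed.

```python
# Sketch only: a whitening-style SSL objective where whitening provides the
# scattering (uniformity) and an MSE term aligns the two views of each sample.
import torch
import torch.nn.functional as F

def whiten(z, eps=1e-5):
    """Whiten a (batch, dim) matrix with a Cholesky-based transform:
    if inv(cov) = L @ L.T, then (z @ L) has identity covariance."""
    z = z - z.mean(dim=0, keepdim=True)
    cov = z.t() @ z / (z.shape[0] - 1) + eps * torch.eye(z.shape[1])
    w = torch.linalg.cholesky(torch.linalg.inv(cov))
    return z @ w

def whitening_mse_loss(z1, z2):
    """Mean squared distance between whitened, normalized views of the same samples."""
    v1 = F.normalize(whiten(z1), dim=1)
    v2 = F.normalize(whiten(z2), dim=1)
    return (v1 - v2).pow(2).sum(dim=1).mean()

# Usage with random stand-in features for two augmented views of a batch.
loss = whitening_mse_loss(torch.randn(256, 64), torch.randn(256, 64))
```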