SimCSE: Simple Contrastive Learning of Sentence Embeddings
- URL: http://arxiv.org/abs/2104.08821v1
- Date: Sun, 18 Apr 2021 11:27:08 GMT
- Title: SimCSE: Simple Contrastive Learning of Sentence Embeddings
- Authors: Tianyu Gao, Xingcheng Yao, Danqi Chen
- Abstract summary: This paper presents SimCSE, a contrastive learning framework for sentence embeddings.
We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive objective.
We then incorporate annotated pairs from NLI datasets into contrastive learning by using "entailment" pairs as positives and "contradiction" pairs as hard negatives.
- Score: 10.33373737281907
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents SimCSE, a simple contrastive learning framework that
greatly advances the state-of-the-art sentence embeddings. We first describe an
unsupervised approach, which takes an input sentence and predicts itself in a
contrastive objective, with only standard dropout used as noise. This simple
method works surprisingly well, performing on par with previous supervised
counterparts. We hypothesize that dropout acts as minimal data augmentation and
removing it leads to a representation collapse. Then, we draw inspiration from
the recent success of learning sentence embeddings from natural language
inference (NLI) datasets and incorporate annotated pairs from NLI datasets into
contrastive learning by using "entailment" pairs as positives and
"contradiction" pairs as hard negatives. We evaluate SimCSE on standard
semantic textual similarity (STS) tasks, and our unsupervised and supervised
models using BERT-base achieve an average of 74.5% and 81.6% Spearman's
correlation respectively, a 7.9 and 4.6 points improvement compared to previous
best results. We also show that contrastive learning theoretically regularizes
pre-trained embeddings' anisotropic space to be more uniform, and it better
aligns positive pairs when supervised signals are available.
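To make the two objectives concrete, below is a minimal PyTorch sketch of both losses. It is an illustration written for this summary rather than the authors' released code; the temperature value and the batch conventions are assumptions.

```python
import torch
import torch.nn.functional as F

def unsup_simcse_loss(z1, z2, temperature=0.05):
    """Unsupervised SimCSE. z1 and z2 are (N, d) embeddings of the SAME batch
    of sentences from two forward passes of a dropout-active encoder, so
    (z1[i], z2[i]) differ only by dropout noise. Row i of z2 is the positive
    for row i of z1; all other rows act as in-batch negatives."""
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1)  # (N, N)
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim / temperature, labels)

def sup_simcse_loss(anchor, positive, hard_negative, temperature=0.05):
    """Supervised SimCSE. For NLI triples (premise, entailment hypothesis,
    contradiction hypothesis), the entailment embedding is the positive and
    the contradiction embedding is appended as an extra hard negative."""
    candidates = torch.cat([positive, hard_negative], dim=0)  # (2N, d)
    sim = F.cosine_similarity(anchor.unsqueeze(1), candidates.unsqueeze(0), dim=-1)  # (N, 2N)
    labels = torch.arange(anchor.size(0), device=anchor.device)  # positives sit in columns 0..N-1
    return F.cross_entropy(sim / temperature, labels)

# Unsupervised usage: keep the encoder in train mode so dropout stays active,
# then encode the same sentences twice to get two noisy views:
#   z1, z2 = encoder(batch), encoder(batch)
#   loss = unsup_simcse_loss(z1, z2)
```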
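The STS evaluation mentioned above reduces to a simple protocol: embed both sentences of each pair, score the pair by cosine similarity, and report Spearman's correlation against the human ratings. A sketch of the metric alone (the paper runs the full evaluation through standard STS toolkits):

```python
import numpy as np
from scipy.stats import spearmanr

def sts_spearman(emb_a, emb_b, gold):
    """emb_a[i] and emb_b[i] are embeddings of the i-th sentence pair;
    gold[i] is its human similarity rating."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cosine = (a * b).sum(axis=1)
    return spearmanr(cosine, gold).correlation
```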
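The closing claim about anisotropy is commonly quantified with the alignment and uniformity measures of Wang and Isola (2020), which the paper uses in its analysis. A small sketch of the two metrics, assuming L2-normalized embeddings:

```python
import torch

def alignment(z1, z2):
    """Mean squared distance between positive pairs (lower is better);
    z1[i] and z2[i] are L2-normalized embeddings of a positive pair."""
    return (z1 - z2).norm(dim=1).pow(2).mean()

def uniformity(z, t=2.0):
    """Log of the mean pairwise Gaussian potential over the batch
    (lower means embeddings are spread more uniformly on the sphere)."""
    sq_dists = torch.pdist(z, p=2).pow(2)
    return sq_dists.mul(-t).exp().mean().log()
```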
Related papers
- DebCSE: Rethinking Unsupervised Contrastive Sentence Embedding Learning in the Debiasing Perspective [1.351603931922027]
We argue that effectively eliminating the influence of various biases is crucial for learning high-quality sentence embeddings.
We propose a novel contrastive framework for sentence embedding, termed DebCSE, which can eliminate the impact of these biases.
arXiv Detail & Related papers (2023-09-14T02:43:34Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a Simple method named Self-Contrastive Learning (SSCL) to alleviate the over-smoothing issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework [68.04940365847543]
We propose a semi-supervised sentence embedding framework, GenSE, that effectively leverages large-scale unlabeled data.
Our method includes three parts: 1) Generate: a generator/discriminator model is jointly trained to synthesize sentence pairs from an open-domain unlabeled corpus; 2) Discriminate: noisy sentence pairs are filtered out by the discriminator to acquire high-quality positive and negative sentence pairs; 3) Contrast: a prompt-based contrastive approach is presented for sentence representation learning with both annotated and synthesized data.
arXiv Detail & Related papers (2022-10-30T10:15:21Z)
- SDA: Simple Discrete Augmentation for Contrastive Sentence Representation Learning [14.028140579482688]
As reported, SimCSE's dropout noise surprisingly dominates discrete augmentations such as cropping, word deletion, and synonym replacement.
We develop three simple yet effective discrete sentence augmentation schemes: punctuation insertion, modal verbs, and double negation.
Results consistently support the superiority of the proposed methods.
arXiv Detail & Related papers (2022-10-08T08:07:47Z)
- Non-contrastive representation learning for intervals from well logs [58.70164460091879]
The representation learning problem in the oil & gas industry aims to construct a model that provides a representation based on logging data for a well interval.
One of the possible approaches is self-supervised learning (SSL).
We are the first to introduce non-contrastive SSL for well-logging data.
arXiv Detail & Related papers (2022-09-28T13:27:10Z)
- Improving Contrastive Learning of Sentence Embeddings with Case-Augmented Positives and Retrieved Negatives [17.90820242798732]
Unsupervised contrastive learning methods still lag far behind their supervised counterparts.
We propose switch-case augmentation to flip the case of the first letter of randomly selected words in a sentence (a toy sketch appears after this list).
For negative samples, we sample hard negatives from the whole dataset based on a pre-trained language model.
arXiv Detail & Related papers (2022-06-06T09:46:12Z)
- SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples [36.08601841321196]
We propose contrastive learning for unsupervised sentence embedding with soft negative samples.
We show that SNCSE can obtain state-of-the-art performance on semantic textual similarity (STS) tasks.
arXiv Detail & Related papers (2022-01-16T06:15:43Z)
- ESimCSE: Enhanced Sample Building Method for Contrastive Learning of Unsupervised Sentence Embedding [41.09180639504244]
The current state-of-the-art unsupervised method is the unsupervised SimCSE (unsup-SimCSE).
We develop a new sentence embedding method, termed Enhanced Unsup-SimCSE (ESimCSE).
ESimCSE outperforms the state-of-the-art unsup-SimCSE by an average Spearman correlation of 2.02% on BERT-base.
arXiv Detail & Related papers (2021-09-09T16:07:31Z)
- Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss [72.62029620566925]
Recent works in self-supervised learning have advanced the state-of-the-art by relying on the contrastive learning paradigm.
Our work analyzes contrastive learning without assuming conditional independence of positive pairs.
We propose a loss that performs spectral decomposition on the population augmentation graph and can be succinctly written as a contrastive learning objective.
arXiv Detail & Related papers (2021-06-08T07:41:02Z)
- Understanding self-supervised Learning Dynamics without Contrastive Pairs [72.1743263777693]
Contrastive approaches to self-supervised learning (SSL) learn representations by minimizing the distance between two augmented views of the same data point.
BYOL and SimSiam show remarkable performance without negative pairs.
We study the nonlinear learning dynamics of non-contrastive SSL in simple linear networks.
arXiv Detail & Related papers (2021-02-12T22:57:28Z)
- Evaluating Models' Local Decision Boundaries via Contrast Sets [119.38387782979474]
We propose a new annotation paradigm for NLP that helps to close systematic gaps in the test data.
We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets.
Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets.
arXiv Detail & Related papers (2020-04-06T14:47:18Z)
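As a concrete illustration of the switch-case augmentation mentioned in the "Case-Augmented Positives and Retrieved Negatives" entry above, here is a toy sketch; the per-word selection probability is an assumed hyperparameter, not a value taken from that paper.

```python
import random

def switch_case(sentence, p=0.15, seed=None):
    """Flip the case of the first letter of randomly selected words.
    p is an illustrative selection probability, not from the paper."""
    rng = random.Random(seed)
    words = []
    for w in sentence.split():
        if w and w[0].isalpha() and rng.random() < p:
            w = (w[0].lower() if w[0].isupper() else w[0].upper()) + w[1:]
        words.append(w)
    return " ".join(words)

# e.g. switch_case("The quick brown fox jumps") might return "the quick Brown fox jumps"
```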