Self-Guided Contrastive Learning for BERT Sentence Representations
- URL: http://arxiv.org/abs/2106.07345v1
- Date: Thu, 3 Jun 2021 05:52:43 GMT
- Title: Self-Guided Contrastive Learning for BERT Sentence Representations
- Authors: Taeuk Kim, Kang Min Yoo, Sang-goo Lee
- Abstract summary: We propose a contrastive learning method that utilizes self-guidance for improving the quality of BERT sentence representations.
Our method fine-tunes BERT in a self-supervised fashion, does not rely on data augmentation, and enables the usual [CLS] token embeddings to function as sentence vectors.
- Score: 19.205754738851546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although BERT and its variants have reshaped the NLP landscape, it still
remains unclear how best to derive sentence embeddings from such pre-trained
Transformers. In this work, we propose a contrastive learning method that
utilizes self-guidance for improving the quality of BERT sentence
representations. Our method fine-tunes BERT in a self-supervised fashion, does
not rely on data augmentation, and enables the usual [CLS] token embeddings to
function as sentence vectors. Moreover, we redesign the contrastive learning
objective (NT-Xent) and apply it to sentence representation learning. We
demonstrate with extensive experiments that our approach is more effective than
competitive baselines on diverse sentence-related tasks. We also show it is
efficient at inference and robust to domain shifts.
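The abstract does not spell out the training procedure, but the following is a minimal sketch of the general idea under stated assumptions (not the authors' code): a trainable copy of BERT produces [CLS] sentence vectors, a frozen copy of the same model supplies a second view of each sentence for self-guidance (here, a mean-pooled intermediate hidden layer, an assumed choice), and a standard NT-Xent loss pulls the two views of the same sentence together within a batch. The paper's projection heads and its redesigned NT-Xent variant are omitted.

```python
# Minimal sketch of self-guided contrastive fine-tuning (illustrative, not the
# authors' code). Assumption: the frozen copy's layer-6 mean-pooled hidden
# states serve as the "guide" view for each sentence.
import copy
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

def nt_xent(a, b, temperature=0.05):
    """Standard NT-Xent: a[i] and b[i] are positives; other batch items are negatives."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature                      # (batch, batch) cosine similarities
    labels = torch.arange(a.size(0), device=a.device)     # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tuned = AutoModel.from_pretrained("bert-base-uncased")    # copy being fine-tuned
frozen = copy.deepcopy(tuned).eval()                      # fixed copy used for self-guidance
for p in frozen.parameters():
    p.requires_grad_(False)

batch = tokenizer(["A man is playing a guitar.", "A chef is cooking pasta."],
                  padding=True, return_tensors="pt")

cls_view = tuned(**batch).last_hidden_state[:, 0]         # [CLS] vectors from the tuned copy
with torch.no_grad():
    hidden = frozen(**batch, output_hidden_states=True).hidden_states
    guide_view = hidden[6].mean(dim=1)                    # assumed: mean-pool an intermediate layer

loss = nt_xent(cls_view, guide_view)
loss.backward()                                           # only the tuned copy receives gradients
```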
Related papers
- DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning [59.4644086610381]
We propose a novel denoising objective that approaches the problem from a different angle, namely the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
arXiv Detail & Related papers (2024-01-24T17:48:45Z)
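As a rough illustration of the noise-then-restore idea summarized above (a sketch, not the DenoSent implementation): discrete noise can be injected by dropping tokens and continuous noise by perturbing embeddings, after which a decoder would be trained to reconstruct the original sentence from the corrupted input.

```python
# Illustrative noise injection for a denoising objective (hypothetical helpers,
# not the paper's code). The reconstruction model and loss are omitted.
import random
import torch

def discrete_noise(tokens, drop_prob=0.15):
    """Discrete noise: randomly drop tokens from the sentence."""
    kept = [t for t in tokens if random.random() > drop_prob]
    return kept if kept else tokens   # avoid returning an empty sentence

def continuous_noise(token_embeddings, std=0.1):
    """Continuous noise: add Gaussian perturbation to token embeddings."""
    return token_embeddings + std * torch.randn_like(token_embeddings)

tokens = "a man is playing a guitar".split()
noisy_tokens = discrete_noise(tokens)                 # corrupted token sequence
embeddings = torch.randn(len(tokens), 768)            # placeholder token embeddings
noisy_embeddings = continuous_noise(embeddings)       # corrupted continuous input
# Training would then ask the model to restore the original sentence from either view.
```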
- Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning [76.43827771613127]
In this paper, we investigate task-specific preferences between pairs of input texts as a new, alternative form of auxiliary data annotation.
We propose a novel multi-task learning framework, called prefer-to-classify (P2C), which can enjoy the cooperative effect of learning both the given classification task and the auxiliary preferences.
arXiv Detail & Related papers (2023-06-08T04:04:47Z)
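A minimal sketch of how a classification loss and an auxiliary pairwise-preference loss could be combined in a multi-task setup, using an assumed Bradley-Terry-style preference term (the P2C paper's exact formulation may differ):

```python
# Illustrative multi-task objective: classification + pairwise preference
# (assumed combination; not the P2C implementation).
import torch
import torch.nn.functional as F

def multitask_loss(class_logits, class_labels,
                   preferred_scores, rejected_scores, alpha=0.5):
    """Cross-entropy for the main task plus a Bradley-Terry-style preference loss
    that pushes the preferred example's score above the rejected one's."""
    ce = F.cross_entropy(class_logits, class_labels)
    pref = -F.logsigmoid(preferred_scores - rejected_scores).mean()
    return ce + alpha * pref

# Toy tensors standing in for model outputs on a batch of text pairs.
class_logits = torch.randn(4, 3, requires_grad=True)    # 4 examples, 3 classes
class_labels = torch.tensor([0, 2, 1, 1])
preferred = torch.randn(4, requires_grad=True)          # scalar scores for preferred texts
rejected = torch.randn(4)
multitask_loss(class_logits, class_labels, preferred, rejected).backward()
```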
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a Simple method named Self-Contrastive Learning (SSCL) to alleviate this issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Improving Contrastive Learning of Sentence Embeddings from AI Feedback [43.56070504980024]
Supervised contrastive learning can produce more accurate sample pairs with human feedback labels.
Our method utilizes AI feedback from large pre-trained language models to construct sample pairs with fine-grained sample similarity scores.
Experimental results show that our method achieves state-of-the-art performance on several semantic textual similarity tasks.
arXiv Detail & Related papers (2023-05-03T06:26:13Z)
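One simple way to use fine-grained, LLM-provided similarity scores instead of binary positive/negative pairs is to regress the embedding similarity toward the score, sketched below with invented data (the paper's actual objective may differ in form):

```python
# Illustrative use of fine-grained AI-feedback similarity scores
# (invented values; not the paper's objective).
import torch
import torch.nn.functional as F

def soft_similarity_loss(emb_a, emb_b, scores):
    """Fit the cosine similarity of each sentence pair to its AI-provided score in [0, 1]."""
    cos = F.cosine_similarity(emb_a, emb_b, dim=-1)      # in [-1, 1]
    return F.mse_loss((cos + 1) / 2, scores)             # rescaled to [0, 1]

# Placeholder sentence embeddings for two pairs and their (hypothetical) LLM scores.
emb_a = torch.randn(2, 768, requires_grad=True)
emb_b = torch.randn(2, 768)
scores = torch.tensor([0.8, 0.1])                        # e.g. paraphrase vs. unrelated pair
soft_similarity_loss(emb_a, emb_b, scores).backward()
```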
- Generative or Contrastive? Phrase Reconstruction for Better Sentence Representation Learning [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative approach may yield sufficiently powerful sentence representations and achieve performance on Semantic Textual Similarity (STS) tasks on par with contrastive learning.
arXiv Detail & Related papers (2022-04-20T10:00:46Z)
- A Sentence is Worth 128 Pseudo Tokens: A Semantic-Aware Contrastive Learning Framework for Sentence Embeddings [28.046786376565123]
We propose a semantics-aware contrastive learning framework for sentence embeddings, termed Pseudo-Token BERT (PT-BERT).
We exploit the pseudo-token space (i.e., latent semantic space) representation of a sentence while eliminating the impact of superficial features such as sentence length and syntax.
Our model outperforms the state-of-the-art baselines on six standard semantic textual similarity (STS) tasks.
arXiv Detail & Related papers (2022-03-11T12:29:22Z)
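A minimal sketch of mapping a variable-length sentence onto a fixed number of pseudo tokens via cross-attention over learnable queries (an assumed reading of the pseudo-token idea; the PT-BERT architecture likely differs in detail):

```python
# Illustrative fixed-length "pseudo-token" pooling via cross-attention
# (assumed mechanism; not the PT-BERT code).
import torch
import torch.nn as nn

class PseudoTokenPooler(nn.Module):
    def __init__(self, num_pseudo_tokens=128, dim=768, heads=8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_pseudo_tokens, dim))  # learnable pseudo tokens
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, token_states):
        """token_states: (batch, seq_len, dim) contextual embeddings of arbitrary length.
        Returns a (batch, 128, dim) representation independent of sentence length."""
        batch = token_states.size(0)
        q = self.queries.unsqueeze(0).expand(batch, -1, -1)
        pooled, _ = self.attn(q, token_states, token_states)
        return pooled

pooler = PseudoTokenPooler()
states = torch.randn(2, 17, 768)          # e.g. BERT outputs for two sentences of length 17
print(pooler(states).shape)               # torch.Size([2, 128, 768])
```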
- PromptBERT: Improving BERT Sentence Embeddings with Prompts [95.45347849834765]
We propose a prompt-based sentence embedding method that reduces token embedding biases and makes the original BERT layers more effective.
We also propose a novel unsupervised training objective based on template denoising, which substantially narrows the performance gap between the supervised and unsupervised settings.
Our fine-tuned method outperforms the state-of-the-art method SimCSE in both unsupervised and supervised settings.
arXiv Detail & Related papers (2022-01-12T06:54:21Z)
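The prompt-based idea above can be sketched as wrapping each sentence in a template and using the [MASK] position's hidden state as the sentence embedding; the template below is illustrative, and the paper's template-denoising objective is not reproduced here.

```python
# Illustrative prompt-based sentence embedding: wrap the sentence in a template
# and take the hidden state at the [MASK] position (the template is an example,
# not necessarily the one used in the paper; template denoising is omitted).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def prompt_embedding(sentence):
    prompt = f'This sentence : "{sentence}" means {tokenizer.mask_token} .'
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state             # (1, seq_len, 768)
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    return hidden[0, mask_pos]                                  # embedding at the [MASK] slot

emb = prompt_embedding("A man is playing a guitar.")
print(emb.shape)   # torch.Size([768])
```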
- Sentence Embeddings using Supervised Contrastive Learning [0.0]
We propose a new method for building sentence embeddings via supervised contrastive learning.
Our method fine-tunes pretrained BERT on SNLI data, combining a supervised cross-entropy loss with a supervised contrastive loss.
arXiv Detail & Related papers (2021-06-09T03:30:29Z)
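A minimal sketch of jointly optimizing a supervised cross-entropy loss and a (simplified) supervised contrastive loss over a batch of labeled examples, with an assumed weighting between the two terms (not the paper's code):

```python
# Illustrative joint objective: supervised cross-entropy + supervised contrastive
# loss over labeled sentence representations (assumed weighting; simplified SupCon).
import torch
import torch.nn.functional as F

def supervised_contrastive(features, labels, temperature=0.07):
    """Pull together representations that share a label, push apart the rest."""
    feats = F.normalize(features, dim=-1)
    sim = feats @ feats.t() / temperature
    positives = labels.unsqueeze(0) == labels.unsqueeze(1)            # same-label pairs
    positives.fill_diagonal_(False)                                   # exclude self-pairs
    self_mask = torch.eye(len(labels), dtype=torch.bool)
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True)
    return -(log_prob[positives]).mean()    # assumes each label occurs at least twice in the batch

features = torch.randn(8, 768, requires_grad=True)   # sentence representations
logits = torch.randn(8, 3, requires_grad=True)       # e.g. entailment / neutral / contradiction
labels = torch.randint(0, 3, (8,))

loss = F.cross_entropy(logits, labels) + 0.5 * supervised_contrastive(features, labels)
loss.backward()
```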
- Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation [84.64004917951547]
Fine-tuning pre-trained language models such as BERT has become an effective approach in NLP.
In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation.
arXiv Detail & Related papers (2020-02-24T16:17:12Z)
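A minimal sketch of the self-ensemble/self-distillation idea above: keep a moving average of the student's weights as an ensembled "teacher" and add a distillation term that keeps the student's predictions close to the teacher's (assumed EMA form and loss weighting; the paper's exact mechanisms may differ).

```python
# Illustrative self-ensemble (EMA of past weights) + self-distillation during
# fine-tuning (assumed form; not the paper's implementation).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Linear(768, 3)                      # stand-in for a BERT classifier head
teacher = copy.deepcopy(student)                 # self-ensemble: moving average of the student
optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)

def update_teacher(decay=0.999):
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)

features = torch.randn(8, 768)                   # placeholder sentence features
labels = torch.randint(0, 3, (8,))

logits = student(features)
with torch.no_grad():
    teacher_logits = teacher(features)

# Task loss + self-distillation toward the ensembled teacher's soft predictions.
loss = F.cross_entropy(logits, labels) + \
       F.kl_div(F.log_softmax(logits, dim=-1), F.softmax(teacher_logits, dim=-1),
                reduction="batchmean")
loss.backward()
optimizer.step()
update_teacher()
```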