Deep Continuous Prompt for Contrastive Learning of Sentence Embeddings
- URL: http://arxiv.org/abs/2203.06875v1
- Date: Mon, 14 Mar 2022 06:07:44 GMT
- Title: Deep Continuous Prompt for Contrastive Learning of Sentence Embeddings
- Authors: Yuxin Jiang and Wei Wang
- Abstract summary: We present a novel method which freezes the whole language model and optimizes only the prefix deep continuous prompts.
It not only tunes around 0.1% of the original language model's parameters, but also avoids the cumbersome computation of searching for handcrafted prompts.
Our proposed DCPCSE outperforms the state-of-the-art method SimCSE by a large margin.
- Score: 8.70715711885114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of sentence representation has been remarkably improved by
the framework of contrastive learning. However, recent works still require full
fine-tuning, which is quite inefficient for large-scale pre-trained language
models. To this end, we present a novel method which freezes the whole language
model and only optimizes the prefix deep continuous prompts. It not only tunes
around 0.1% of the original language model's parameters, but also avoids the
cumbersome computation of searching for handcrafted prompts. Experimental results
show that our proposed DCPCSE outperforms the state-of-the-art method SimCSE by
a large margin. We raise the performance of unsupervised BERT$_{base}$ and
supervised RoBERTa$_{large}$ by 2.24 and 1.00 points, respectively. Our code is
publicly available at https://github.com/YJiangcm/DCPCSE
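To make the mechanism concrete, here is a minimal sketch of the idea (not the authors' released code): the language model is frozen and only per-layer prefix key/value prompts are trained, with a SimCSE-style InfoNCE objective. It assumes a transformers version whose BertModel accepts per-layer `past_key_values` tuples, as P-tuning v2-style prefix implementations do; `n_prefix`, the initialization scale, and the [CLS] pooling choice are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

for p in model.parameters():          # freeze the whole language model
    p.requires_grad = False

cfg = model.config
n_prefix = 16                         # prefix length per layer (assumed)
head_dim = cfg.hidden_size // cfg.num_attention_heads

# Deep continuous prompts: trainable key/value prefixes for every layer.
prefix = torch.nn.Parameter(
    0.02 * torch.randn(cfg.num_hidden_layers, 2,
                       cfg.num_attention_heads, n_prefix, head_dim))

def encode(sentences):
    batch = tokenizer(sentences, padding=True, return_tensors="pt")
    bsz = batch["input_ids"].size(0)
    past = [(prefix[i, 0].expand(bsz, -1, -1, -1),
             prefix[i, 1].expand(bsz, -1, -1, -1))
            for i in range(cfg.num_hidden_layers)]
    # The attention mask must also cover the prefix positions.
    ones = torch.ones(bsz, n_prefix, dtype=batch["attention_mask"].dtype)
    mask = torch.cat([ones, batch["attention_mask"]], dim=1)
    out = model(input_ids=batch["input_ids"], attention_mask=mask,
                past_key_values=past)
    return out.last_hidden_state[:, 0]        # [CLS] sentence embedding

def infonce(z1, z2, temperature=0.05):
    # Two dropout-augmented views of the same sentences; in-batch negatives.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1)
    return F.cross_entropy(sim / temperature, torch.arange(z1.size(0)))
```

Only `prefix` receives gradients here, which is how the trainable-parameter count stays near 0.1% of the full model.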
Related papers
- Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning [50.26965628047682]
Adapting pre-trained models to open classes is a challenging problem in machine learning.
In this paper, we consider combining the advantages of both and propose a test-time prompt tuning approach.
Our proposed method outperforms all comparison methods on average considering both base and new classes.
arXiv Detail & Related papers (2024-08-29T12:34:01Z)
- CTC-based Non-autoregressive Speech Translation [51.37920141751813]
We investigate the potential of connectionist temporal classification (CTC) for non-autoregressive speech translation.
We develop a model consisting of two encoders that are guided by CTC to predict the source and target texts.
Experiments on the MuST-C benchmarks show that our NAST model achieves an average BLEU score of 29.5 with a speed-up of 5.67$\times$.
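As a rough illustration of CTC supervision over encoder states (a single-encoder sketch; the paper's design uses two CTC-guided encoders for source and target text, and all shapes and names here are assumptions):

```python
import torch
import torch.nn as nn

vocab_size, blank_id = 1000, 0        # assumed vocabulary and blank index
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True),
    num_layers=6)
proj = nn.Linear(256, vocab_size)     # per-frame token logits
ctc = nn.CTCLoss(blank=blank_id, zero_infinity=True)

speech = torch.randn(8, 120, 256)     # (batch, frames, features), dummy input
targets = torch.randint(1, vocab_size, (8, 30))
input_lens = torch.full((8,), 120, dtype=torch.long)
target_lens = torch.full((8,), 30, dtype=torch.long)

log_probs = proj(encoder(speech)).log_softmax(-1)
# nn.CTCLoss expects (time, batch, vocab)
loss = ctc(log_probs.transpose(0, 1), targets, input_lens, target_lens)
```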
arXiv Detail & Related papers (2023-05-27T03:54:09Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a simple method named Self-Contrastive Learning (SSCL) to alleviate the over-smoothing issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Efficient and Flexible Topic Modeling using Pretrained Embeddings and Bag of Sentences [1.8592384822257952]
We propose a novel topic modeling and inference algorithm.
We leverage pre-trained sentence embeddings by combining generative process models and clustering.
The evaluation shows that our method yields state-of-the-art results with relatively low computational demands.
arXiv Detail & Related papers (2023-02-06T20:13:11Z)
- EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models [26.462819114575172]
This work is the first to compare sparsity paradigms in text-to-speech synthesis.
arXiv Detail & Related papers (2022-09-22T09:47:25Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
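A minimal sketch of the noise-stability idea described above: perturb a hidden representation with Gaussian noise and penalize the drift of the downstream output. The noise scale, layer choice, and L2 penalty are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def noise_stability_penalty(model_tail, hidden, sigma=0.01):
    # model_tail: the layers applied after `hidden` (e.g., the top-k layers).
    # hidden: an intermediate representation, shape (batch, seq, dim).
    clean = model_tail(hidden)
    noisy = model_tail(hidden + sigma * torch.randn_like(hidden))
    return F.mse_loss(noisy, clean.detach())   # penalize output drift

# total_loss = task_loss + lambda_reg * noise_stability_penalty(tail, h)
```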
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
- Prompt Consistency for Zero-Shot Task Generalization [118.81196556175797]
In this paper, we explore methods to utilize unlabeled data to improve zero-shot performance.
Specifically, we take advantage of the fact that multiple prompts can be used to specify a single task, and propose to regularize prompt consistency.
Our approach outperforms the state-of-the-art zero-shot learner, T0, on 9 out of 11 datasets across 4 NLP tasks by up to 10.6 absolute points in terms of accuracy.
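One plausible instantiation of prompt-consistency regularization is a symmetric KL penalty between the predictions the model makes for the same unlabeled input under two different prompts (the exact objective here is an assumption):

```python
import torch.nn.functional as F

def prompt_consistency_loss(logits_a, logits_b):
    # Predictions for the same input phrased with two different prompts.
    p = F.log_softmax(logits_a, dim=-1)
    q = F.log_softmax(logits_b, dim=-1)
    # Symmetric KL divergence between the two predictive distributions.
    return 0.5 * (F.kl_div(p, q, log_target=True, reduction="batchmean")
                  + F.kl_div(q, p, log_target=True, reduction="batchmean"))
```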
arXiv Detail & Related papers (2022-04-29T19:18:37Z)
- Contrastive Demonstration Tuning for Pre-trained Language Models [59.90340768724675]
Demonstration examples are crucial to the final performance of prompt-tuning.
The proposed approach can be (i) plugged into any previous prompt-tuning approach and (ii) extended to widespread classification tasks with a large number of categories.
Experimental results on 16 datasets illustrate that our method, integrated with the previous approaches LM-BFF and P-tuning, yields better performance.
arXiv Detail & Related papers (2022-04-09T05:30:48Z)
- Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation [28.432799973328127]
We propose Homomorphic Projective Distillation (HPD) to learn compressed sentence embeddings.
Our method augments a small Transformer encoder model with learnable projection layers to produce compact representations.
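A hedged sketch of that setup: a small student encoder's output and a frozen large teacher's output are passed through learnable projection layers so the compressed embeddings align. Dimensions and the cosine alignment loss are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

teacher_dim, student_dim, k = 1024, 384, 128   # illustrative sizes

student_proj = nn.Linear(student_dim, k)  # compress the small encoder's output
teacher_proj = nn.Linear(teacher_dim, k)  # map the teacher into the same space

def distill_loss(student_emb, teacher_emb):
    # teacher_emb comes from a frozen large model; both projections train
    # so that the compact embedding spaces align.
    s = F.normalize(student_proj(student_emb), dim=-1)
    t = F.normalize(teacher_proj(teacher_emb), dim=-1)
    return 1.0 - (s * t).sum(-1).mean()        # cosine alignment loss
```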
arXiv Detail & Related papers (2022-03-15T07:05:43Z)
- Efficient Constituency Parsing by Pointing [21.395573911155495]
We propose a novel constituency parsing model that casts the parsing problem into a series of pointing tasks.
Our model supports efficient top-down decoding and our learning objective is able to enforce structural consistency without resorting to the expensive CKY inference.
arXiv Detail & Related papers (2020-06-24T08:29:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.