InfoCSE: Information-aggregated Contrastive Learning of Sentence
Embeddings
- URL: http://arxiv.org/abs/2210.06432v1
- Date: Sat, 8 Oct 2022 15:53:19 GMT
- Title: InfoCSE: Information-aggregated Contrastive Learning of Sentence
Embeddings
- Authors: Xing Wu, Chaochen Gao, Zijia Lin, Jizhong Han, Zhongyuan Wang, Songlin
Hu
- Abstract summary: This paper proposes an information-aggregated contrastive learning framework for learning unsupervised sentence embeddings, termed InfoCSE.
We evaluate the proposed InfoCSE on several benchmark datasets with respect to the semantic textual similarity (STS) task.
Experimental results show that InfoCSE outperforms SimCSE by an average Spearman correlation of 2.60% on BERT-base, and 1.77% on BERT-large.
- Score: 61.77760317554826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning has been extensively studied in sentence embedding
learning, which assumes that the embeddings of different views of the same
sentence should be closer to each other than to those of other sentences. The
constraint brought by this assumption is weak, and a
good sentence representation should also be able to reconstruct the original
sentence fragments. Therefore, this paper proposes an information-aggregated
contrastive learning framework for learning unsupervised sentence embeddings,
termed InfoCSE. InfoCSE forces the representation of the [CLS] position to
aggregate denser sentence information by introducing an additional masked
language model (MLM) task and a well-designed network. We evaluate the proposed
InfoCSE on several benchmark datasets with respect to the semantic textual
similarity (STS) task. Experimental results show that InfoCSE outperforms
SimCSE by an average Spearman correlation of 2.60% on BERT-base and 1.77% on
BERT-large, achieving state-of-the-art results among unsupervised sentence
representation learning methods. Our code is available at
https://github.com/caskcsg/sentemb/info
Related papers
- CMLM-CSE: Based on Conditional MLM Contrastive Learning for Sentence Embeddings [16.592691470405683]
We propose CMLM-CSE, an unsupervised contrastive learning framework based on a conditional MLM loss.
An auxiliary network is added that integrates the sentence embedding into the masked-word prediction task, forcing the sentence embedding to capture more masked-word information.
When BERT-base was used as the pretrained language model, we exceeded SimCSE by 0.55 percentage points on average on textual similarity tasks; with RoBERTa-base, we exceeded SimCSE by 0.3 percentage points on average.
arXiv Detail & Related papers (2023-06-16T02:39:45Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
Extensive experiments are conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- Contrastive Learning of Sentence Embeddings from Scratch [26.002876719243464]
We present SynCSE, a contrastive learning framework that trains sentence embeddings with synthesized data.
Specifically, we explore utilizing large language models to synthesize the required data samples for contrastive learning.
Experimental results on sentence similarity and reranking tasks indicate that both SynCSE-partial and SynCSE-scratch greatly outperform unsupervised baselines.
arXiv Detail & Related papers (2023-05-24T11:56:21Z)
- Instance Smoothed Contrastive Learning for Unsupervised Sentence Embedding [16.598732694215137]
We propose IS-CSE (instance smoothing contrastive sentence embedding) to smooth the boundaries of embeddings in the feature space.
We evaluate our method on standard semantic textual similarity (STS) tasks and achieve average Spearman's correlations of 78.30%, 79.47%, 77.73%, and 79.42%.
arXiv Detail & Related papers (2023-05-12T12:46:13Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the prompt texts beyond only closed-set label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings [51.274478128525686]
DiffCSE is an unsupervised contrastive learning framework for learning sentence embeddings.
Our experiments show that DiffCSE achieves state-of-the-art results among unsupervised sentence representation learning methods.
arXiv Detail & Related papers (2022-04-21T17:32:01Z)
- ESimCSE: Enhanced Sample Building Method for Contrastive Learning of Unsupervised Sentence Embedding [41.09180639504244]
The current state-of-the-art unsupervised method is unsupervised SimCSE (unsup-SimCSE).
We develop a new sentence embedding method, termed Enhanced Unsup-SimCSE (ESimCSE).
ESimCSE outperforms the state-of-the-art unsup-SimCSE by an average Spearman correlation of 2.02% on BERT-base.
arXiv Detail & Related papers (2021-09-09T16:07:31Z)
- Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning.
Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-08-12T09:14:44Z)
- Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
arXiv Detail & Related papers (2020-05-27T16:44:01Z)
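Both InfoCSE and several of the related papers above report Spearman correlation on semantic textual similarity (STS) benchmarks. As a rough illustration of that evaluation protocol, the sketch below embeds each sentence pair, scores it with cosine similarity, and correlates the scores with human ratings. It is a simplification: real evaluations typically aggregate over STS12-16, STS-B, and SICK-R via toolkits such as SentEval, and the toy encoder here is only a placeholder.

```python
# Minimal sketch of STS evaluation: cosine similarity of sentence embeddings
# correlated (Spearman) with gold human similarity scores.
import torch
import torch.nn.functional as F
from scipy.stats import spearmanr

def sts_spearman(encode, pairs, gold_scores):
    """encode: callable mapping a list of sentences to an (n, dim) tensor.
    pairs: list of (sentence_a, sentence_b); gold_scores: human ratings."""
    a = encode([p[0] for p in pairs])
    b = encode([p[1] for p in pairs])
    cos = F.cosine_similarity(a, b, dim=-1)   # one similarity score per pair
    corr, _ = spearmanr(cos.tolist(), gold_scores)
    return corr

# Toy usage with a random "encoder" just to show the interface.
def toy_encode(sentences):
    return torch.randn(len(sentences), 768)

pairs = [
    ("a man is playing guitar", "a person plays a guitar"),
    ("a dog runs in the park", "a dog is running outside"),
    ("a dog runs in the park", "the stock market fell today"),
]
print(sts_spearman(toy_encode, pairs, gold_scores=[4.8, 4.2, 0.2]))
```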
This list is automatically generated from the titles and abstracts of the papers on this site.