An Information Minimization Based Contrastive Learning Model for
Unsupervised Sentence Embeddings Learning
- URL: http://arxiv.org/abs/2209.10951v1
- Date: Thu, 22 Sep 2022 12:07:35 GMT
- Title: An Information Minimization Based Contrastive Learning Model for
Unsupervised Sentence Embeddings Learning
- Authors: Shaobin Chen, Jie Zhou, Yuling Sun, and Liang He
- Abstract summary: We present an information minimization based contrastive learning (InforMin-CL) model for unsupervised sentence representation learning.
We find that information minimization can be achieved by simple contrast and reconstruction objectives.
- Score: 19.270283247740664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised sentence embedding learning has recently been dominated by
contrastive learning methods (e.g., SimCSE), which keep positive pairs similar
and push negative pairs apart. The contrast operation aims to keep as much
information as possible by maximizing the mutual information between positive
instances, which leads to redundant information in the sentence embeddings. To
address this problem, we present an information minimization based contrastive
learning (InforMin-CL) model for unsupervised sentence representation learning,
which retains the useful information and discards the redundant information by
simultaneously maximizing the mutual information and minimizing the information
entropy between positive instances. Specifically, we find that information
minimization can be achieved by simple contrast and reconstruction objectives.
The reconstruction operation reconstitutes the positive instance via the other
positive instance to minimize the information entropy between positive
instances. We evaluate our model on fourteen downstream tasks, including both
supervised and unsupervised (semantic textual similarity) tasks. Extensive
experimental results show that our InforMin-CL achieves state-of-the-art
performance.
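The abstract describes two objectives: a SimCSE-style contrast term over positive pairs and a reconstruction term that rebuilds one positive embedding from the other. Below is a minimal sketch of how such a combined loss could look; the reconstructor network, the squared-error reconstruction loss, and the weighting term alpha are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def informin_cl_loss(z1, z2, reconstructor, temperature=0.05, alpha=1.0):
    """z1, z2: (batch, dim) embeddings of the two views of each sentence (positive pairs).
    reconstructor: an assumed small network (e.g., an MLP) mapping one view's embedding
    to a reconstruction of the other view's embedding."""
    z1n, z2n = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)

    # Contrast objective: maximize mutual information between positive instances
    # (InfoNCE over in-batch negatives, as in SimCSE).
    sim = z1n @ z2n.t() / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    loss_contrast = F.cross_entropy(sim, labels)

    # Reconstruction objective: reconstitute one positive instance from the other,
    # which the abstract relates to minimizing the information entropy between
    # positive instances. The MSE form here is an assumption.
    loss_recon = F.mse_loss(reconstructor(z1), z2.detach())

    return loss_contrast + alpha * loss_recon
```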
Related papers
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
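As a rough illustration of the idea (the paper's actual prompt format and task setup are not reproduced here), a c-ICL-style prompt could interleave a correct demonstration with one explicitly marked as incorrect; the task wording and triple format below are assumptions.

```python
def build_cicl_prompt(correct_demo, incorrect_demo, query):
    """Each demo is a (sentence, extraction) pair; names and formats are illustrative only."""
    return (
        "Task: extract (head entity, relation, tail entity) triples from the sentence.\n\n"
        f"Sentence: {correct_demo[0]}\n"
        f"Correct extraction: {correct_demo[1]}\n\n"
        f"Sentence: {incorrect_demo[0]}\n"
        f"Incorrect extraction (avoid outputs like this): {incorrect_demo[1]}\n\n"
        f"Sentence: {query}\n"
        "Extraction:"
    )

prompt = build_cicl_prompt(
    ("Marie Curie was born in Warsaw.", "(Marie Curie, born_in, Warsaw)"),
    ("Marie Curie was born in Warsaw.", "(Warsaw, born_in, Marie Curie)"),
    "Alan Turing was born in London.",
)
```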
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Time-Series Contrastive Learning against False Negatives and Class Imbalance [17.43801009251228]
We conduct theoretical analysis and find that existing methods have overlooked two fundamental issues: false negatives and class imbalance inherent in the InfoNCE loss-based framework.
We introduce a straightforward modification, grounded in the SimCLR framework, that applies universally to models engaged in the instance discrimination task.
We perform semi-supervised consistency classification and enhance the representative ability of minority classes.
arXiv Detail & Related papers (2023-12-19T08:38:03Z)
- Rethinking Minimal Sufficient Representation in Contrastive Learning [28.83450836832452]
We show that contrastive learning models have the risk of over-fitting to the shared information between views.
We propose to increase the mutual information between the representation and input as regularization to approximately introduce more task-relevant information.
It significantly improves the performance of several classic contrastive learning models in downstream tasks.
arXiv Detail & Related papers (2022-03-14T11:17:48Z)
- Robust Contrastive Learning against Noisy Views [79.71880076439297]
We propose a new contrastive loss function that is robust against noisy views.
We show that our approach provides consistent improvements over the state-of-the-art image, video, and graph contrastive learning benchmarks.
arXiv Detail & Related papers (2022-01-12T05:24:29Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Incremental False Negative Detection for Contrastive Learning [95.68120675114878]
We introduce a novel incremental false negative detection method for self-supervised contrastive learning.
We discuss two strategies to explicitly remove the detected false negatives during contrastive learning.
Our proposed method outperforms other self-supervised contrastive learning frameworks on multiple benchmarks with limited compute.
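One standard way detected false negatives can be explicitly removed, shown below as a generic sketch rather than the paper's specific strategies, is to mask them out of the InfoNCE denominator so they no longer act as negatives.

```python
import torch
import torch.nn.functional as F

def contrastive_loss_excluding_false_negatives(anchor, candidates, pos_idx,
                                                false_neg_mask, temperature=0.1):
    """anchor: (dim,) embedding; candidates: (n, dim) embeddings; pos_idx: index of the
    true positive; false_neg_mask: bool tensor (n,) marking detected false negatives
    (assumed not to include pos_idx)."""
    sims = F.normalize(candidates, dim=-1) @ F.normalize(anchor, dim=-1) / temperature
    # Elimination strategy: detected false negatives are dropped from the softmax.
    sims = sims.masked_fill(false_neg_mask, float("-inf"))
    target = torch.tensor([pos_idx], device=sims.device)
    return F.cross_entropy(sims.unsqueeze(0), target)
```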
arXiv Detail & Related papers (2021-06-07T15:29:14Z)
- Adversarial Examples for Unsupervised Machine Learning Models [71.81480647638529]
Adversarial examples causing evasive predictions are widely used to evaluate and improve the robustness of machine learning models.
We propose a framework of generating adversarial examples for unsupervised models and demonstrate novel applications to data augmentation.
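As a generic sketch of the underlying idea, not the paper's actual framework, an adversarial example for an unsupervised encoder can be crafted by perturbing the input so that its embedding moves away from the clean embedding; the FGSM-style single step and the epsilon value below are assumptions.

```python
import torch
import torch.nn.functional as F

def unsupervised_adversarial_example(encoder, x, epsilon=0.03):
    """encoder: any differentiable model mapping inputs to embeddings; x: input batch."""
    with torch.no_grad():
        z_clean = encoder(x)                     # reference embedding of the clean input
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.mse_loss(encoder(x_adv), z_clean)   # distance we want the perturbation to increase
    loss.backward()
    # One FGSM-style step in the direction that pushes the embedding away.
    return (x + epsilon * x_adv.grad.sign()).detach()
```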
arXiv Detail & Related papers (2021-03-02T17:47:58Z)
- On Mutual Information in Contrastive Learning for Visual Representations [19.136685699971864]
Unsupervised "contrastive" learning algorithms in vision have been shown to learn representations that perform remarkably well on transfer tasks.
We show that this family of algorithms maximizes a lower bound on the mutual information between two or more "views" of an image.
We find that the choice of negative samples and views is critical to the success of these algorithms.
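The bound referred to here is the standard InfoNCE result I(v1; v2) >= log N - L_InfoNCE: driving the loss down over N candidate pairs raises a lower bound on the mutual information between the two views. A minimal sketch:

```python
import torch
import torch.nn.functional as F

def infonce_loss(z1, z2, temperature=0.1):
    """z1, z2: (N, dim) paired view embeddings; row i of z1 is the positive for row i of z2."""
    sim = F.normalize(z1, dim=-1) @ F.normalize(z2, dim=-1).t() / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, labels)

z1, z2 = torch.randn(128, 64), torch.randn(128, 64)
loss = infonce_loss(z1, z2)
mi_lower_bound = torch.log(torch.tensor(float(z1.size(0)))) - loss  # I(v1; v2) >= log N - loss
```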
arXiv Detail & Related papers (2020-05-27T04:21:53Z)