Data Augmentation of Contrastive Learning is Estimating Positive-incentive Noise
- URL: http://arxiv.org/abs/2408.09929v1
- Date: Mon, 19 Aug 2024 12:07:42 GMT
- Title: Data Augmentation of Contrastive Learning is Estimating Positive-incentive Noise
- Authors: Hongyuan Zhang, Yanchen Xu, Sida Huang, Xuelong Li
- Abstract summary: We scientifically investigate the connection between contrastive learning and $\pi$-noise.
Inspired by the idea of Positive-incentive Noise (Pi-Noise or $\pi$-Noise), which aims to learn reliable noise that benefits the task, we develop a $\pi$-noise generator.
- Score: 54.24688963649581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by the idea of Positive-incentive Noise (Pi-Noise or $\pi$-Noise), which aims to learn reliable noise that benefits a task, we scientifically investigate the connection between contrastive learning and $\pi$-noise in this paper. By converting the contrastive loss into an auxiliary Gaussian distribution that quantitatively measures the difficulty of a specific contrastive model within the information-theoretic framework, we properly define the task entropy, the core concept of $\pi$-noise, for contrastive learning. We further prove that the predefined data augmentations in the standard contrastive learning paradigm can be regarded as a form of point estimation of $\pi$-noise. Motivated by this theoretical study, we propose a framework that trains a $\pi$-noise generator to learn the beneficial noise (instead of merely estimating it) and uses it as data augmentation for contrast. The framework applies to diverse data types and is fully compatible with existing contrastive models. Visualizations show, somewhat surprisingly, that the proposed method learns effective augmentations.
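In the $\pi$-noise line of work, beneficial noise is roughly noise $\epsilon$ whose mutual information with the task $\mathcal{T}$ is positive, $I(\mathcal{T}; \epsilon) > 0$, i.e., it lowers the task entropy $H(\mathcal{T})$. The sketch below illustrates how a sample-conditioned Gaussian noise generator could replace a hand-crafted augmentation inside a standard InfoNCE objective; the MLP generator, the reparameterized noise, and all hyperparameters are illustrative assumptions, not the authors' exact design.

```python
# A minimal sketch (assumed design, not the authors' exact architecture or
# objective): a pi-noise generator produces sample-conditioned Gaussian noise,
# and the resulting noisy view replaces one hand-crafted augmentation in InfoNCE.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PiNoiseGenerator(nn.Module):
    """Predicts a per-sample mean and log-variance, then reparameterizes noise."""
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mu, log_var = self.net(x).chunk(2, dim=-1)
        eps = torch.randn_like(mu)                      # reparameterization trick
        return x + mu + eps * torch.exp(0.5 * log_var)  # learned noisy "view"

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """Standard InfoNCE with in-batch negatives; positives sit on the diagonal."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    return F.cross_entropy(logits, torch.arange(z1.size(0), device=z1.device))

# Usage: encoder and generator are trained jointly on the same contrastive loss,
# so the generator is pushed toward noise that helps the contrastive task.
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 128))
generator = PiNoiseGenerator(dim=784)
x = torch.randn(32, 784)                                # a batch of flattened inputs
loss = info_nce(encoder(x), encoder(generator(x)))
loss.backward()
```

In practice the learned noise could also be combined with standard augmentations; the paper's exact generator design and training objective should be taken from the source.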
Related papers
- TDCGL: Two-Level Debiased Contrastive Graph Learning for Recommendation [1.5836776102398225]
The long-tailed distribution of knowledge-graph entities and real-world noise cause item-entity relations to deviate from the true characteristics.
We design Two-Level Debiased Contrastive Learning (TDCL) and deploy it on the knowledge graph.
Extensive experiments on open-source datasets demonstrate that our method has excellent anti-noise capability.
arXiv Detail & Related papers (2023-10-01T03:56:38Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
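As a loose illustration of black-box affine tuning of the feature space (an assumed design, not NMTune's actual objective or architecture), one can cache the pretrained model's features and train only a small element-wise affine map plus a linear classifier on top of them:

```python
# A loose sketch (assumed, not NMTune's actual method): treat the pretrained
# model as a black box, cache its features, and train only a lightweight
# element-wise affine map plus a linear classifier for the downstream task.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineHead(nn.Module):
    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))   # learnable affine transform
        self.shift = nn.Parameter(torch.zeros(dim))  # of the frozen feature space
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.classifier(feats * self.scale + self.shift)

# Usage: `feats` stand in for cached black-box features of downstream data.
feats, labels = torch.randn(64, 512), torch.randint(0, 10, (64,))
head = AffineHead(dim=512, num_classes=10)
loss = F.cross_entropy(head(feats), labels)
loss.backward()
```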
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- Advancing Unsupervised Low-light Image Enhancement: Noise Estimation, Illumination Interpolation, and Self-Regulation [55.07472635587852]
Low-Light Image Enhancement (LLIE) techniques have made notable advancements in preserving image details and enhancing contrast.
These approaches encounter persistent challenges in efficiently mitigating dynamic noise and accommodating diverse low-light scenarios.
We first propose a quick and accurate method for estimating the noise level in low-light images.
We then devise a Learnable Illumination Interpolator (LII) to satisfy general constraints between illumination and input.
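For intuition about fast single-image noise estimation, the sketch below uses the classical Immerkaer Laplacian estimator; it is background context, not necessarily the estimator proposed in the paper.

```python
# A quick single-image noise-level estimate in the spirit of the classical
# Immerkaer Laplacian method; background intuition only, not necessarily the
# estimator proposed in the paper.
import numpy as np

def estimate_noise_sigma(gray: np.ndarray) -> float:
    """Estimate additive Gaussian noise std from one grayscale image (H, W)."""
    g = gray.astype(np.float64)
    # Valid convolution with the 3x3 kernel [[1,-2,1],[-2,4,-2],[1,-2,1]],
    # which cancels smooth image structure and keeps mostly noise.
    lap = (g[:-2, :-2] - 2 * g[:-2, 1:-1] + g[:-2, 2:]
           - 2 * g[1:-1, :-2] + 4 * g[1:-1, 1:-1] - 2 * g[1:-1, 2:]
           + g[2:, :-2] - 2 * g[2:, 1:-1] + g[2:, 2:])
    h, w = g.shape
    return float(np.sqrt(np.pi / 2) * np.abs(lap).sum() / (6 * (w - 2) * (h - 2)))

# Usage: a dark synthetic image with known noise level.
img = 0.2 + 0.05 * np.random.randn(256, 256)
print(estimate_noise_sigma(img))   # roughly 0.05
```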
arXiv Detail & Related papers (2023-05-17T13:56:48Z)
- Deep Active Learning with Noise Stability [24.54974925491753]
Uncertainty estimation for unlabeled data is crucial to active learning.
We propose a novel algorithm that leverages noise stability to estimate data uncertainty.
Our method is generally applicable in various tasks, including computer vision, natural language processing, and structural data analysis.
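A rough sketch of the noise-stability idea (the paper's exact perturbation and selection rules may differ): perturb the network weights with small random noise several times and score each unlabeled sample by how much its prediction moves.

```python
# Rough sketch of noise-stability uncertainty for active learning (assumed
# scheme, not the paper's exact algorithm): weight perturbations that barely
# move a sample's prediction indicate a stable, low-uncertainty sample.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def noise_stability_scores(model: nn.Module, x: torch.Tensor,
                           n_perturb: int = 5, sigma: float = 1e-3) -> torch.Tensor:
    model.eval()
    base = F.softmax(model(x), dim=-1)
    drift = torch.zeros(x.size(0), device=x.device)
    for _ in range(n_perturb):
        noisy = copy.deepcopy(model)
        for p in noisy.parameters():
            p.add_(sigma * p.abs().mean() * torch.randn_like(p))
        drift += (F.softmax(noisy(x), dim=-1) - base).norm(dim=-1)
    return drift / n_perturb        # higher drift = less stable = query first

# Usage: rank an unlabeled pool and label the top-k most unstable samples.
net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
pool = torch.randn(100, 20)
topk = noise_stability_scores(net, pool).topk(10).indices
```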
arXiv Detail & Related papers (2022-05-26T13:21:01Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the noise distribution should match the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
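For context, a standard form of the binary NCE objective (the classical formulation, not an equation taken from this paper), with $\nu$ noise samples per data point drawn from a noise distribution $p_n$:

$$ J(\theta) = \mathbb{E}_{x\sim p_d}\big[\log h_\theta(x)\big] + \nu\,\mathbb{E}_{y\sim p_n}\big[\log\big(1-h_\theta(y)\big)\big], \qquad h_\theta(u)=\sigma\big(\log p_\theta(u)-\log \nu p_n(u)\big), $$

so the choice of $p_n$ enters the estimator directly; which $p_n$ is statistically optimal is precisely what the paper analyzes.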
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Robust Contrastive Learning against Noisy Views [79.71880076439297]
We propose a new contrastive loss function that is robust against noisy views.
We show that our approach provides consistent improvements over the state of the art on image, video, and graph contrastive learning benchmarks.
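As an illustration only (the paper defines its own loss; the form below and the hyperparameters $q$, $\lambda$, $\tau$ are assumptions), one way to make InfoNCE robust to a corrupted "positive" pair is to temper the positive term so a single wrong view cannot dominate the gradient:

```python
# Illustration only (NOT the paper's exact loss): a robust variant of InfoNCE
# that tempers the positive term with an exponent q in (0, 1]; as q -> 0 it
# approaches standard InfoNCE up to an additive constant.
import torch
import torch.nn.functional as F

def robust_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                            tau: float = 0.5, q: float = 0.5, lam: float = 0.01):
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / tau            # (B, B); positives on the diagonal
    pos = sim.diag()
    partition = sim.exp().sum(dim=1)   # positive + in-batch negatives
    loss = -(q * pos).exp() / q + (lam * partition).pow(q) / q
    return loss.mean()

# Usage with two augmented (possibly noisy) views of the same batch.
loss = robust_contrastive_loss(torch.randn(32, 128), torch.randn(32, 128))
```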
arXiv Detail & Related papers (2022-01-12T05:24:29Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function.
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
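For reference on the entry above, the standard InfoNCE objective with an explicit number $K$ of negatives (textbook form; $s(\cdot,\cdot)$ is a similarity score and $\tau$ a temperature), where $K$ is the quantity whose optimal value the paper studies:

$$ \mathcal{L}_{\text{InfoNCE}} = -\,\mathbb{E}\left[\log \frac{\exp\big(s(x, x^{+})/\tau\big)}{\exp\big(s(x, x^{+})/\tau\big) + \sum_{i=1}^{K} \exp\big(s(x, x_i^{-})/\tau\big)}\right]. $$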