Contrastive Learning with Orthonormal Anchors (CLOA)
- URL: http://arxiv.org/abs/2403.18699v1
- Date: Wed, 27 Mar 2024 15:48:16 GMT
- Title: Contrastive Learning with Orthonormal Anchors (CLOA)
- Authors: Huanran Li, Daniel Pimentel-Alarcón,
- Abstract summary: This study focuses on addressing the instability issues prevalent in contrastive learning, specifically examining the InfoNCE loss function and its derivatives.
We reveal a critical observation that these loss functions exhibit a restrictive behavior, leading to a convergence phenomenon where embeddings tend to merge into a singular point.
This "over-fusion" effect detrimentally affects classification accuracy in subsequent supervised-learning tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study focuses on addressing the instability issues prevalent in contrastive learning, specifically examining the InfoNCE loss function and its derivatives. We reveal a critical observation that these loss functions exhibit a restrictive behavior, leading to a convergence phenomenon where embeddings tend to merge into a singular point. This "over-fusion" effect detrimentally affects classification accuracy in subsequent supervised-learning tasks. Through theoretical analysis, we demonstrate that embeddings, when equalized or confined to a rank-1 linear subspace, represent a local minimum for InfoNCE. In response to this challenge, our research introduces an innovative strategy that leverages the same or fewer labeled data than typically used in the fine-tuning phase. The loss we proposed, Orthonormal Anchor Regression Loss, is designed to disentangle embedding clusters, significantly enhancing the distinctiveness of each embedding while simultaneously ensuring their aggregation into dense, well-defined clusters. Our method demonstrates remarkable improvements with just a fraction of the conventional label requirements, as evidenced by our results on CIFAR10 and CIFAR100 datasets.
Related papers
- Anti-Collapse Loss for Deep Metric Learning Based on Coding Rate Metric [99.19559537966538]
DML aims to learn a discriminative high-dimensional embedding space for downstream tasks like classification, clustering, and retrieval.
To maintain the structure of embedding space and avoid feature collapse, we propose a novel loss function called Anti-Collapse Loss.
Comprehensive experiments on benchmark datasets demonstrate that our proposed method outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2024-07-03T13:44:20Z) - Bayesian Learning-driven Prototypical Contrastive Loss for Class-Incremental Learning [42.14439854721613]
We propose a prototypical network with a Bayesian learning-driven contrastive loss (BLCL) tailored specifically for class-incremental learning scenarios.
Our approach dynamically adapts the balance between the cross-entropy and contrastive loss functions with a Bayesian learning technique.
arXiv Detail & Related papers (2024-05-17T19:49:02Z) - Learning Prompt-Enhanced Context Features for Weakly-Supervised Video
Anomaly Detection [37.99031842449251]
Video anomaly detection under weak supervision presents significant challenges.
We present a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability.
Our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy.
arXiv Detail & Related papers (2023-06-26T06:45:16Z) - Latent Spectral Regularization for Continual Learning [21.445600749028923]
We study the phenomenon by investigating the geometric characteristics of the learner's latent space.
We propose a geometric regularizer that enforces weak requirements on the Laplacian spectrum of the latent space.
arXiv Detail & Related papers (2023-01-09T13:56:59Z) - Mitigating Forgetting in Online Continual Learning via Contrasting
Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one.
Main challenge comes from the "catastrophic forgetting" issue -- the inability to well remember the learnt knowledge while learning the new ones.
arXiv Detail & Related papers (2022-11-10T05:29:43Z) - Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z) - Unsupervised Learning of Debiased Representations with Pseudo-Attributes [85.5691102676175]
We propose a simple but effective debiasing technique in an unsupervised manner.
We perform clustering on the feature embedding space and identify pseudoattributes by taking advantage of the clustering results.
We then employ a novel cluster-based reweighting scheme for learning debiased representation.
arXiv Detail & Related papers (2021-08-06T05:20:46Z) - Risk Minimization from Adaptively Collected Data: Guarantees for
Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z) - Spatial Contrastive Learning for Few-Shot Classification [9.66840768820136]
We propose a novel attention-based spatial contrastive objective to learn locally discriminative and class-agnostic features.
With extensive experiments, we show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-26T23:39:41Z) - Extreme Memorization via Scale of Initialization [72.78162454173803]
We construct an experimental setup in which changing the scale of initialization strongly impacts the implicit regularization induced by SGD.
We find that the extent and manner in which generalization ability is affected depends on the activation and loss function used.
In the case of the homogeneous ReLU activation, we show that this behavior can be attributed to the loss function.
arXiv Detail & Related papers (2020-08-31T04:53:11Z) - Provably Efficient Causal Reinforcement Learning with Confounded
Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.