Your contrastive learning problem is secretly a distribution alignment problem
- URL: http://arxiv.org/abs/2502.20141v1
- Date: Thu, 27 Feb 2025 14:33:08 GMT
- Title: Your contrastive learning problem is secretly a distribution alignment problem
- Authors: Zihao Chen, Chi-Heng Lin, Ran Liu, Jingyun Xiao, Eva L Dyer
- Abstract summary: We build connections between noise contrastive estimation losses widely used in vision and distribution alignment. By using more information from the distribution of latents, our approach allows a more distribution-aware manipulation of the relationships within augmented sample sets.
- Score: 11.75699373180322
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the success of contrastive learning (CL) in vision and language, its theoretical foundations and mechanisms for building representations remain poorly understood. In this work, we build connections between noise contrastive estimation losses widely used in CL and distribution alignment with entropic optimal transport (OT). This connection allows us to develop a family of different losses and multistep iterative variants for existing CL methods. Intuitively, by using more information from the distribution of latents, our approach allows a more distribution-aware manipulation of the relationships within augmented sample sets. We provide theoretical insights and experimental evidence demonstrating the benefits of our approach for generalized contrastive alignment. Through this framework, it is possible to leverage tools in OT to build unbalanced losses to handle noisy views and customize the representation space by changing the constraints on alignment. By reframing contrastive learning as an alignment problem and leveraging existing optimization tools for OT, our work provides new insights and connections between different self-supervised learning models in addition to new tools that can be more easily adapted to incorporate domain knowledge into learning.
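To make the stated connection concrete, the following is a minimal sketch (not the authors' exact losses or algorithm) of how a batch of pairwise similarities can be turned into an entropic-OT transport plan with a few Sinkhorn iterations and then used as alignment targets in place of the usual one-hot InfoNCE targets. The function names, the entropic regularizer eps, the number of iterations, and the assumption of uniform marginals are all illustrative choices.

```python
import torch
import torch.nn.functional as F

def sinkhorn_plan(sim, eps=0.05, n_iters=3):
    """Entropic-OT transport plan from a similarity matrix via Sinkhorn
    row/column normalization (uniform marginals assumed)."""
    K = torch.exp(sim / eps)                      # Gibbs kernel of the (negative) cost
    for _ in range(n_iters):
        K = K / K.sum(dim=1, keepdim=True)        # scale rows
        K = K / K.sum(dim=0, keepdim=True)        # scale columns
    return K / K.sum()                            # joint plan, entries sum to 1

def alignment_loss(z1, z2, eps=0.05, n_iters=3):
    """Cross-entropy between each view's softmax over the batch and the
    (detached) transport-plan rows, instead of one-hot InfoNCE targets."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t()                             # (N, N) cosine similarities
    with torch.no_grad():
        plan = sinkhorn_plan(sim, eps, n_iters)
        targets = plan / plan.sum(dim=1, keepdim=True)   # row-stochastic targets
    log_p = F.log_softmax(sim / eps, dim=1)
    return -(targets * log_p).sum(dim=1).mean()
```

If the targets are replaced by the identity (each sample matched only to its own augmentation), this reduces to a standard InfoNCE-style cross-entropy; the transport plan instead spreads target mass according to the latent distribution, which is the "distribution-aware" manipulation the abstract describes.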
Related papers
- Explainability and Continual Learning meet Federated Learning at the Network Edge [4.348225679878919]
We discuss novel optimization problems that emerge in distributed learning at the network edge with wirelessly interconnected edge devices.
Specifically, we discuss how multi-objective optimization (MOO) can be used to address the trade-off between predictive accuracy and explainability.
We also discuss the implications of integrating inherently explainable tree-based models into distributed learning settings.
arXiv Detail & Related papers (2025-04-11T13:45:55Z) - Variational Self-Supervised Learning [0.0]
We present Variational Self-Supervised Learning (VSSL), a novel framework that combines variational inference with self-supervised learning.
A momentum-updated teacher network defines a dynamic, data-dependent prior, while the student encoder produces an approximate posterior from augmented views.
Experiments on CIFAR-10, CIFAR-100, and ImageNet-100 show that VSSL achieves competitive or superior performance to leading self-supervised methods.
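As a rough illustration of the teacher-defined prior and student posterior described above (a sketch assuming diagonal-Gaussian heads and a KL term; not the exact VSSL objective), the key pieces could look like:

```python
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Maps backbone features to the (mu, log_var) of a diagonal Gaussian."""
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.mu = nn.Linear(in_dim, latent_dim)
        self.log_var = nn.Linear(in_dim, latent_dim)

    def forward(self, h):
        return self.mu(h), self.log_var(h)

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, averaged over the batch:
    q is the student posterior, p the teacher-defined prior."""
    kl = 0.5 * (logvar_p - logvar_q
                + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp() - 1)
    return kl.sum(dim=1).mean()

@torch.no_grad()
def momentum_update(student, teacher, m=0.99):
    """EMA update of the teacher network that defines the dynamic prior."""
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.mul_(m).add_(ps, alpha=1 - m)

# Training step (sketch): the student head maps one augmented view to q(z|x1),
# the EMA teacher maps the other view to p(z|x2), and the loss includes
# gaussian_kl(mu_q, logvar_q, mu_p, logvar_p).
```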
arXiv Detail & Related papers (2025-04-06T01:28:50Z) - Revisiting Self-Supervised Heterogeneous Graph Learning from Spectral Clustering Perspective [52.662463893268225]
Self-supervised heterogeneous graph learning (SHGL) has shown promising potential in diverse scenarios.
Existing SHGL methods encounter two significant limitations.
We introduce a novel framework enhanced by rank and dual consistency constraints.
arXiv Detail & Related papers (2024-12-01T09:33:20Z) - Dataset Awareness is not Enough: Implementing Sample-level Tail Encouragement in Long-tailed Self-supervised Learning [16.110763554788445]
We introduce pseudo-labels into self-supervised long-tailed learning, utilizing pseudo-label information to drive a dynamic temperature and re-weighting strategy.
We analyze the lack of quantity awareness in the temperature parameter and use re-weighting to compensate for this deficiency, thereby achieving optimal training patterns at the sample level.
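A hedged sketch of how pseudo-labels could drive a sample-level temperature and weight inside an InfoNCE-style loss (the frequency-based rules and hyperparameters below are illustrative, not the paper's exact formulas):

```python
import torch
import torch.nn.functional as F

def reweighted_infonce(z1, z2, pseudo_labels, tau_base=0.2, alpha=0.5):
    """InfoNCE with a per-sample temperature and a per-sample weight derived
    from pseudo-label cluster sizes (tail samples get larger weights)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    counts = torch.bincount(pseudo_labels).float()
    freq = counts[pseudo_labels] / pseudo_labels.numel()   # per-sample cluster frequency
    tau = tau_base * (1.0 + alpha * freq)                  # softer temperature for head clusters
    weight = 1.0 / counts[pseudo_labels]
    weight = weight * len(weight) / weight.sum()           # normalize weights to mean 1
    logits = (z1 @ z2.t()) / tau.unsqueeze(1)              # per-row temperature
    labels = torch.arange(len(z1), device=z1.device)
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    return (weight * per_sample).mean()
```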
arXiv Detail & Related papers (2024-10-30T10:25:22Z) - SimO Loss: Anchor-Free Contrastive Loss for Fine-Grained Supervised Contrastive Learning [0.0]
We introduce a novel anchor-free contrastive learning method leveraging our proposed Similarity-Orthogonality (SimO) loss.
Our approach minimizes a semi-metric discriminative loss function that simultaneously optimizes two key objectives.
We provide visualizations that demonstrate the impact of SimO loss on the embedding space.
arXiv Detail & Related papers (2024-10-07T17:41:10Z) - Relaxed Contrastive Learning for Federated Learning [48.96253206661268]
We propose a novel contrastive learning framework to address the challenges of data heterogeneity in federated learning.
Our framework outperforms all existing federated learning approaches by huge margins on the standard benchmarks.
arXiv Detail & Related papers (2024-01-10T04:55:24Z) - On Task Performance and Model Calibration with Supervised and
Self-Ensembled In-Context Learning [71.44986275228747]
In-context learning (ICL) has become an efficient approach propelled by the recent advancements in large language models (LLMs).
However, both paradigms are prone to overconfidence (i.e., miscalibration).
arXiv Detail & Related papers (2023-12-21T11:55:10Z) - ArCL: Enhancing Contrastive Learning with Augmentation-Robust
Representations [30.745749133759304]
We develop a theoretical framework to analyze the transferability of self-supervised contrastive learning.
We show that contrastive learning fails to learn domain-invariant features, which limits its transferability.
Based on these theoretical insights, we propose a novel method called Augmentation-robust Contrastive Learning (ArCL).
arXiv Detail & Related papers (2023-03-02T09:26:20Z) - Your Contrastive Learning Is Secretly Doing Stochastic Neighbor
Embedding [12.421540007814937]
Self-supervised contrastive learning (SSCL) has achieved great success in extracting powerful features from unlabeled data.
We contribute to the theoretical understanding of SSCL and uncover its connection to the classic data visualization method, stochastic neighbor embedding (SNE); see the sketch below.
We provide a novel analysis of domain-agnostic augmentations, the implicit bias, and the robustness of learned features.
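For intuition on this connection, here is a minimal sketch (standard NT-Xent setup; not the paper's derivation) that reads the contrastive objective as matching an embedding-space neighbor distribution Q to a data-space distribution P that puts all of its mass on the augmented positive, i.e. an SNE-style KL divergence:

```python
import torch
import torch.nn.functional as F

def sne_style_contrastive_loss(z1, z2, tau=0.5):
    """NT-Xent written as KL(P || Q): P is a point mass on each sample's
    augmented positive, Q is the softmax neighbor distribution in embedding space."""
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float("-inf"))                 # a point is not its own neighbor
    log_q = F.log_softmax(sim, dim=1)                 # embedding-space neighbor distribution
    pos = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    idx = torch.arange(2 * n, device=z.device)
    # KL(P || Q) with a point-mass P reduces to -log Q[i, positive(i)]
    return -log_q[idx, pos].mean()
```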
arXiv Detail & Related papers (2022-05-30T02:39:29Z) - Weak Augmentation Guided Relational Self-Supervised Learning [80.0680103295137]
We introduce a novel relational self-supervised learning (ReSSL) framework that learns representations by modeling the relationship between different instances.
Our proposed method employs a sharpened distribution of pairwise similarities among different instances as the relation metric (sketched below).
Experimental results show that our proposed ReSSL substantially outperforms the state-of-the-art methods across different network architectures.
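A minimal sketch of that relation metric (the memory queue, temperatures, and names below are illustrative; see the ReSSL paper for the exact setup): the weakly augmented view defines a sharpened target distribution of similarities over a queue, and the strongly augmented view is trained to match it.

```python
import torch
import torch.nn.functional as F

def relational_loss(z_strong, z_weak, queue, t_student=0.1, t_teacher=0.04):
    """Align the student's similarity distribution (strong augmentation) with
    a sharpened teacher distribution (weak augmentation) over a memory queue."""
    z_strong = F.normalize(z_strong, dim=1)
    z_weak = F.normalize(z_weak, dim=1)
    queue = F.normalize(queue, dim=1)
    with torch.no_grad():
        target = F.softmax(z_weak @ queue.t() / t_teacher, dim=1)   # sharper: lower temperature
    log_pred = F.log_softmax(z_strong @ queue.t() / t_student, dim=1)
    return -(target * log_pred).sum(dim=1).mean()
```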
arXiv Detail & Related papers (2022-03-16T16:14:19Z) - Revisiting Deep Semi-supervised Learning: An Empirical Distribution
Alignment Framework and Its Generalization Bound [97.93945601881407]
We propose a new deep semi-supervised learning framework called Semi-supervised Learning by Empirical Distribution Alignment (SLEDA).
We show that the generalization error of semi-supervised learning can be effectively bounded by minimizing the training error on labeled data and the empirical distribution distance between labeled and unlabeled data.
Building upon our new framework and the theoretical bound, we develop a simple and effective deep semi-supervised learning method called Augmented Distribution Alignment Network (ADA-Net).
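One simple way to instantiate the empirical distribution-distance term is an MMD penalty between labeled and unlabeled features; the snippet below is an illustrative stand-in (ADA-Net's actual alignment mechanism may differ), added on top of the supervised loss.

```python
import torch

def rbf_mmd2(x, y, sigma=1.0):
    """Squared maximum mean discrepancy with an RBF kernel: a simple measure
    of the gap between labeled and unlabeled feature distributions."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Illustrative objective (not ADA-Net's exact loss):
#   total = cross_entropy(logits_labeled, y) + lam * rbf_mmd2(feat_labeled, feat_unlabeled)
```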
arXiv Detail & Related papers (2022-03-13T11:59:52Z) - Learning Diverse Representations for Fast Adaptation to Distribution
Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.