SinSim: Sinkhorn-Regularized SimCLR
- URL: http://arxiv.org/abs/2502.10478v1
- Date: Thu, 13 Feb 2025 19:49:25 GMT
- Title: SinSim: Sinkhorn-Regularized SimCLR
- Authors: M. Hadi Sepanj, Paul Fieguth
- Abstract summary: Self-supervised learning has revolutionized representation learning by eliminating the need for labeled data.
We propose SinSim, a novel extension of SimCLR that integrates Sinkhorn regularization from optimal transport theory to enhance representation structure.
Empirical evaluations on various datasets demonstrate that SinSim outperforms SimCLR and achieves competitive performance against prominent self-supervised methods such as VICReg and Barlow Twins.
- Abstract: Self-supervised learning has revolutionized representation learning by eliminating the need for labeled data. Contrastive learning methods, such as SimCLR, maximize the agreement between augmented views of an image but lack explicit regularization to enforce a globally structured latent space. This limitation often leads to suboptimal generalization. We propose SinSim, a novel extension of SimCLR that integrates Sinkhorn regularization from optimal transport theory to enhance representation structure. The Sinkhorn loss, an entropy-regularized Wasserstein distance, encourages a well-dispersed and geometry-aware feature space, preserving discriminative power. Empirical evaluations on various datasets demonstrate that SinSim outperforms SimCLR and achieves competitive performance against prominent self-supervised methods such as VICReg and Barlow Twins. UMAP visualizations further reveal improved class separability and structured feature distributions. These results indicate that integrating optimal transport regularization into contrastive learning provides a principled and effective mechanism for learning robust, well-structured representations. Our findings open new directions for applying transport-based constraints in self-supervised learning frameworks.
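A minimal, illustrative sketch of the idea described in the abstract: SimCLR's NT-Xent contrastive loss combined with an entropy-regularized optimal-transport (Sinkhorn) term between the embeddings of the two augmented views. This is not the authors' released code; the weight `lam`, the entropic regularizer `epsilon`, the iteration count, and the choice of computing the regularizer between the two view batches are assumptions for illustration, and the paper's exact formulation may differ.

```python
import math
import torch
import torch.nn.functional as F


def nt_xent_loss(z1, z2, temperature=0.5):
    """Standard SimCLR NT-Xent loss over a batch of paired embeddings."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                      # (2N, d)
    sim = z @ z.t() / temperature                       # scaled cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))          # exclude self-pairs
    # positives: row i pairs with row i + n, and vice versa
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)


def sinkhorn_loss(x, y, epsilon=0.1, n_iters=50):
    """Entropy-regularized OT cost between two uniform empirical measures,
    computed with log-domain Sinkhorn iterations (hyperparameters assumed)."""
    cost = torch.cdist(x, y, p=2) ** 2                  # pairwise squared Euclidean cost
    n, m = cost.shape
    log_a = torch.full((n, 1), -math.log(n), device=x.device)
    log_b = torch.full((1, m), -math.log(m), device=x.device)
    f = torch.zeros(n, 1, device=x.device)
    g = torch.zeros(1, m, device=x.device)
    for _ in range(n_iters):                            # alternate dual potential updates
        f = -epsilon * torch.logsumexp((g - cost) / epsilon + log_b, dim=1, keepdim=True)
        g = -epsilon * torch.logsumexp((f - cost) / epsilon + log_a, dim=0, keepdim=True)
    log_plan = (f + g - cost) / epsilon + log_a + log_b  # log of transport plan
    return (log_plan.exp() * cost).sum()                 # <C, P>, the transport cost


def sinsim_loss(z1, z2, lam=0.1):
    """Contrastive objective plus the Sinkhorn regularizer."""
    return nt_xent_loss(z1, z2) + lam * sinkhorn_loss(z1, z2)
```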
Related papers
- Graph Structure Refinement with Energy-based Contrastive Learning [56.957793274727514]
We introduce an unsupervised method based on joint generative and discriminative training to learn graph structure and representation.
We propose an Energy-based Contrastive Learning (ECL) guided Graph Structure Refinement (GSR) framework, denoted as ECL-GSR.
ECL-GSR achieves faster training with fewer samples and less memory than the leading baseline, highlighting its simplicity and efficiency in downstream tasks.
arXiv Detail & Related papers (2024-12-20T04:05:09Z) - SimO Loss: Anchor-Free Contrastive Loss for Fine-Grained Supervised Contrastive Learning [0.0]
We introduce a novel anchor-free contrastive learning (AFCL) method leveraging our proposed Similarity-Orthogonality (SimO) loss.
Our approach minimizes a semi-metric discriminative loss function that simultaneously optimizes two key objectives.
We provide visualizations that demonstrate the impact of SimO loss on the embedding space.
arXiv Detail & Related papers (2024-10-07T17:41:10Z) - Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning [25.514007761856632]
Graph contrastive learning (GCL) has received increasing attention in recommender systems due to its effectiveness in reducing bias caused by data sparsity.
We argue that these methods struggle to balance between semantic invariance and view hardness across the dynamic training process.
We propose a novel GCL-based recommendation framework RGCL, which effectively maintains the semantic invariance of contrastive pairs and dynamically adapts as the model capability evolves.
arXiv Detail & Related papers (2024-07-14T13:03:35Z) - Preventing Collapse in Contrastive Learning with Orthonormal Prototypes (CLOP) [0.0]
CLOP is a novel semi-supervised loss function designed to prevent neural collapse by promoting the formation of linear subspaces among class embeddings.
We show that CLOP enhances performance, providing greater stability across different learning rates and batch sizes.
arXiv Detail & Related papers (2024-03-27T15:48:16Z) - Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
arXiv Detail & Related papers (2024-03-27T09:14:36Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on Federated Learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting them in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - Unifying Graph Contrastive Learning with Flexible Contextual Scopes [57.86762576319638]
We present a self-supervised learning method termed Unifying Graph Contrastive Learning with Flexible Contextual Scopes (UGCL for short).
Our algorithm builds flexible contextual representations with contextual scopes by controlling the power of an adjacency matrix.
Based on representations from both local and contextual scopes, UGCL optimises a very simple contrastive loss function for graph representation learning.
arXiv Detail & Related papers (2022-10-17T07:16:17Z) - Cross-modal Representation Learning for Zero-shot Action Recognition [67.57406812235767]
We present a cross-modal Transformer-based framework, which jointly encodes video data and text labels for zero-shot action recognition (ZSAR).
Our model employs a conceptually new pipeline by which visual representations are learned in conjunction with visual-semantic associations in an end-to-end manner.
Experimental results show our model considerably improves upon the state of the art in ZSAR, reaching encouraging top-1 accuracy on the UCF101, HMDB51, and ActivityNet benchmark datasets.
arXiv Detail & Related papers (2022-05-03T17:39:27Z) - Imposing Consistency for Optical Flow Estimation [73.53204596544472]
Imposing consistency through proxy tasks has been shown to enhance data-driven learning.
This paper introduces novel and effective consistency strategies for optical flow estimation.
arXiv Detail & Related papers (2022-04-14T22:58:30Z) - Learning Aligned Cross-Modal Representation for Generalized Zero-Shot Classification [17.177622259867515]
We propose an innovative autoencoder network that learns Aligned Cross-Modal Representations (dubbed ACMR) for Generalized Zero-Shot Classification (GZSC).
Specifically, we propose a novel Vision-Semantic Alignment (VSA) method to strengthen the alignment of cross-modal latent features on the latent subspaces guided by a learned classifier.
In addition, we propose a novel Information Enhancement Module (IEM) to reduce the possibility of latent variable collapse while encouraging the discriminative ability of the latent variables.
arXiv Detail & Related papers (2021-12-24T03:35:37Z)