Guarding Barlow Twins Against Overfitting with Mixed Samples
- URL: http://arxiv.org/abs/2312.02151v1
- Date: Mon, 4 Dec 2023 18:59:36 GMT
- Title: Guarding Barlow Twins Against Overfitting with Mixed Samples
- Authors: Wele Gedara Chaminda Bandara, Celso M. De Melo, and Vishal M. Patel
- Abstract summary: Self-supervised learning aims to learn transferable feature representations for downstream applications without relying on labeled data.
We introduce Mixed Barlow Twins, which aims to improve sample interaction during Barlow Twins training via linearly interpolated samples.
- Score: 27.7244906436942
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised Learning (SSL) aims to learn transferable feature
representations for downstream applications without relying on labeled data.
The Barlow Twins algorithm, renowned for its widespread adoption and
straightforward implementation compared to its counterparts like contrastive
learning methods, minimizes feature redundancy while maximizing invariance to
common corruptions. Optimizing for the above objective forces the network to
learn useful representations, while avoiding noisy or constant features,
resulting in improved downstream task performance with limited adaptation.
Despite Barlow Twins' proven effectiveness in pre-training, the underlying SSL
objective can inadvertently cause feature overfitting because, unlike
contrastive learning approaches, it lacks strong interaction between samples.
From our experiments, we observe that optimizing for the Barlow Twins objective
doesn't necessarily guarantee sustained improvements in representation quality
beyond a certain pre-training phase, and can potentially degrade downstream
performance on some datasets. To address this challenge, we introduce Mixed
Barlow Twins, which aims to improve sample interaction during Barlow Twins
training via linearly interpolated samples. This results in an additional
regularization term to the original Barlow Twins objective, assuming linear
interpolation in the input space translates to linearly interpolated features
in the feature space. Pre-training with this regularization effectively
mitigates feature overfitting and further enhances the downstream performance
on CIFAR-10, CIFAR-100, TinyImageNet, STL-10, and ImageNet datasets. The code
and checkpoints are available at: https://github.com/wgcban/mix-bt.git
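The following is a minimal sketch, based only on the abstract above, of the standard Barlow Twins objective and the kind of mixed-sample regularizer the paper describes. It assumes a PyTorch setup; function and parameter names (cross_correlation, mixed_barlow_twins_step, lambda_bt, gamma_mix) are illustrative and not taken from the official repository at https://github.com/wgcban/mix-bt.git, and the exact form of the regularization term may differ in the paper.
```python
import torch


def cross_correlation(z_a: torch.Tensor, z_b: torch.Tensor) -> torch.Tensor:
    # Normalize each embedding dimension across the batch, then compute the
    # d x d cross-correlation matrix between the two batches of embeddings.
    n = z_a.size(0)
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-6)
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-6)
    return (z_a.T @ z_b) / n


def barlow_twins_loss(c: torch.Tensor, lambda_bt: float = 5e-3) -> torch.Tensor:
    # Invariance term: diagonal entries pushed toward 1.
    # Redundancy-reduction term: off-diagonal entries pushed toward 0.
    on_diag = (torch.diagonal(c) - 1.0).pow(2).sum()
    off_diag = (c - torch.diag_embed(torch.diagonal(c))).pow(2).sum()
    return on_diag + lambda_bt * off_diag


def mixed_barlow_twins_step(encoder, x_a, x_b, alpha=1.0,
                            lambda_bt=5e-3, gamma_mix=1.0):
    # Standard Barlow Twins loss on two augmented views of the same batch.
    z_a, z_b = encoder(x_a), encoder(x_b)
    loss = barlow_twins_loss(cross_correlation(z_a, z_b), lambda_bt)

    # Mixed samples: linear interpolation between the two views in input space.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x_a + (1.0 - lam) * x_b
    z_mix = encoder(x_mix)

    # Assumption from the abstract: if input-space interpolation carried over
    # linearly to feature space, the cross-correlation of the mixed embedding
    # with each view would itself be a lam-weighted mixture; penalize the
    # deviation from that linear target (one possible reading, not necessarily
    # the paper's exact formulation).
    c_mix_a = cross_correlation(z_mix, z_a)
    c_mix_b = cross_correlation(z_mix, z_b)
    target_a = lam * cross_correlation(z_a, z_a) + (1 - lam) * cross_correlation(z_b, z_a)
    target_b = lam * cross_correlation(z_a, z_b) + (1 - lam) * cross_correlation(z_b, z_b)
    reg = (c_mix_a - target_a).pow(2).sum() + (c_mix_b - target_b).pow(2).sum()

    return loss + gamma_mix * reg
```
In a training loop, `mixed_barlow_twins_step` would be called on each batch with its two augmented views and the result backpropagated through the encoder; whether the interpolation targets are detached and how `gamma_mix` is scheduled are design choices the paper itself settles.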
Related papers
- CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction [77.8576094863446]
We propose a new deCoupled duAl-interactive lineaR attEntion (CARE) mechanism.
We first propose an asymmetrical feature decoupling strategy that asymmetrically decouples the learning process for local inductive bias and long-range dependencies.
By adopting a decoupled learning way and fully exploiting complementarity across features, our method can achieve both high efficiency and accuracy.
arXiv Detail & Related papers (2024-11-25T07:56:13Z) - L-DAWA: Layer-wise Divergence Aware Weight Aggregation in Federated
Self-Supervised Visual Representation Learning [14.888569402903562]
Integration of self-supervised learning (SSL) and federated learning (FL) into one coherent system can potentially offer data privacy guarantees.
We propose a new aggregation strategy termed Layer-wise Divergence Aware Weight Aggregation (L-DAWA) to mitigate the influence of client bias and divergence during FL aggregation.
arXiv Detail & Related papers (2023-07-14T15:07:30Z) - Deep Active Learning Using Barlow Twins [0.0]
The generalisation performance of a convolutional neural network (CNN) is largely determined by the quantity, quality, and diversity of the training images.
The goal of active learning is to draw the most informative samples from the unlabeled pool.
We propose Deep Active Learning using Barlow Twins (DALBT), an active learning method for all the datasets.
arXiv Detail & Related papers (2022-12-30T12:39:55Z) - Learning Compact Features via In-Training Representation Alignment [19.273120635948363]
In each epoch, the true gradient of the loss function is estimated using a mini-batch sampled from the training set.
We propose In-Training Representation Alignment (ITRA) that explicitly aligns feature distributions of two different mini-batches with a matching loss.
We also provide a rigorous analysis of the desirable effects of the matching loss on feature representation learning.
arXiv Detail & Related papers (2022-11-23T22:23:22Z) - Non-contrastive representation learning for intervals from well logs [58.70164460091879]
The representation learning problem in the oil & gas industry aims to construct a model that provides a representation based on logging data for a well interval.
One possible approach is self-supervised learning (SSL).
We are the first to introduce non-contrastive SSL for well-logging data.
arXiv Detail & Related papers (2022-09-28T13:27:10Z) - Siamese Prototypical Contrastive Learning [24.794022951873156]
Contrastive Self-supervised Learning (CSL) is a practical solution that learns meaningful visual representations from massive data in an unsupervised manner.
In this paper, we tackle this problem by introducing a simple but effective contrastive learning framework.
The key insight is to employ siamese-style metric loss to match intra-prototype features, while increasing the distance between inter-prototype features.
arXiv Detail & Related papers (2022-08-18T13:25:30Z) - Interpolation-based Correlation Reduction Network for Semi-Supervised
Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN)
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z) - Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z) - Barlow Twins: Self-Supervised Learning via Redundancy Reduction [31.077182488826963]
Self-supervised learning (SSL) is rapidly closing the gap with supervised methods on large computer vision benchmarks.
We propose an objective function that naturally avoids collapse by measuring the cross-correlation matrix between the outputs of two identical networks.
This causes the representation vectors of distorted versions of a sample to be similar, while minimizing the redundancy between the components of these vectors.
arXiv Detail & Related papers (2021-03-04T18:55:09Z) - Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.