MIO : Mutual Information Optimization using Self-Supervised Binary
Contrastive Learning
- URL: http://arxiv.org/abs/2111.12664v1
- Date: Wed, 24 Nov 2021 17:51:29 GMT
- Title: MIO : Mutual Information Optimization using Self-Supervised Binary
Contrastive Learning
- Authors: Siladittya Manna, Saumik Bhattacharya and Umapada Pal
- Abstract summary: We cast contrastive learning as a binary classification problem: predicting whether a pair is positive or negative.
The proposed method outperforms state-of-the-art algorithms on benchmark datasets such as STL-10, CIFAR-10, and CIFAR-100.
- Score: 19.5917119072985
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Self-supervised contrastive learning is a domain that has
progressed rapidly over the last few years. Most state-of-the-art
self-supervised algorithms rely on a large number of negative samples, momentum
updates, specific architectural modifications, or extensive training to learn
good representations. Such arrangements make the overall training process
complex and difficult to analyze. In this paper, we propose a mutual
information optimization based loss function for contrastive learning, in which
we cast contrastive learning as a binary classification problem: predicting
whether a pair is positive or negative. This formulation not only makes the
problem mathematically tractable but also helps the method outperform existing
algorithms. Unlike existing methods that only maximize the mutual information
in a positive pair, the proposed loss function optimizes the mutual information
in both positive and negative pairs. We also derive expressions for the
parameter gradients flowing into the projector and for the displacement of the
feature vectors in the feature space, which gives mathematical insight into the
working principle of contrastive learning. An additive $L_2$ regularizer
prevents the feature vectors from diverging and improves performance. The
proposed method outperforms state-of-the-art algorithms on benchmark datasets
such as STL-10, CIFAR-10, and CIFAR-100. After only 250 epochs of pre-training,
the proposed model achieves best accuracies of 85.44\%, 60.75\%, and 56.81\% on
the CIFAR-10, STL-10, and CIFAR-100 datasets, respectively.
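As an illustration of this binary-classification view, here is a minimal, hypothetical sketch: pairwise similarities act as logits of a positive/negative pair classifier trained with binary cross-entropy, with an additive $L_2$ penalty on the features. The function name and hyperparameters are assumptions; the paper's exact MIO loss may differ.
```python
# Hypothetical sketch of a binary-classification contrastive loss.
# Pair similarities are treated as logits of a positive/negative classifier,
# so the loss acts on both positive and negative pairs, not positives alone.
import torch
import torch.nn.functional as F

def binary_contrastive_loss(z1, z2, temperature=0.5, l2_weight=0.01):
    """z1, z2: (N, D) projector outputs for two augmented views of N images."""
    z1n, z2n = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1n @ z2n.t() / temperature           # (N, N) pairwise similarities
    labels = torch.eye(len(z1), device=z1.device)  # diagonal pairs are positive
    # BCE pushes positive-pair similarity up and negative-pair similarity down.
    pair_loss = F.binary_cross_entropy_with_logits(logits, labels)
    # Additive L2 regularizer keeps the (pre-normalization) features from diverging.
    reg = l2_weight * (z1.pow(2).sum(dim=1).mean() + z2.pow(2).sum(dim=1).mean())
    return pair_loss + reg
```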
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm that gives better control over the trade-off between the objectives; a hedged sketch of the idea appears after this list.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff over state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
- DynamoRep: Trajectory-Based Population Dynamics for Classification of Black-box Optimization Problems [0.755972004983746]
We propose a feature extraction method that describes the trajectories of optimization algorithms using simple statistics (sketched after this list).
We demonstrate that the proposed DynamoRep features capture enough information to identify the problem class on which the optimization algorithm is running.
arXiv Detail & Related papers (2023-06-08T06:57:07Z)
- Towards Compute-Optimal Transfer Learning [82.88829463290041]
We argue that zero-shot structured pruning of pretrained models lets them increase compute efficiency with minimal reduction in performance; a sketch of this style of pruning appears after the list.
Our results show that pruning the convolutional filters of pretrained models can yield more than a 20% performance improvement in low-compute regimes.
arXiv Detail & Related papers (2023-04-25T21:49:09Z)
- Model Predictive Control with Self-supervised Representation Learning [13.225264876433528]
We propose adding a reconstruction function within the TD-MPC framework so that the agent can reconstruct the original observation (a minimal sketch follows the list).
The proposed additional loss term improves performance on both state- and image-based tasks.
arXiv Detail & Related papers (2023-04-14T16:02:04Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics (a hedged sketch of the inverse-kinematics objective appears after this list).
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Deep Active Ensemble Sampling For Image Classification [8.31483061185317]
Active learning frameworks aim to reduce the cost of data annotation by actively requesting labels for the most informative data points.
Proposed approaches include uncertainty-based techniques, geometric methods, and implicit combinations of the two.
We present an integration of recent progress in uncertainty-based and geometric frameworks to enable an efficient exploration/exploitation trade-off in the sample selection strategy.
Our framework provides two advantages: (1) accurate posterior estimation, and (2) a tunable trade-off between computational overhead and accuracy.
arXiv Detail & Related papers (2022-10-11T20:20:20Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms in the non-parametric streaming setting.
We introduce two regret metrics, derived by minimizing the population loss, that are more suitable for active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance [53.49803579981569]
We consider a global objective for contrastive learning, which contrasts each positive pair with all negative pairs for an anchor point.
Existing methods such as SimCLR require a large batch size to achieve satisfactory results.
We propose a memory-efficient optimization algorithm, named SogCLR, for solving the Global Contrastive Learning of Representations (a hedged sketch of the idea appears after this list).
arXiv Detail & Related papers (2022-02-24T22:16:53Z)
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- Neural Non-Rigid Tracking [26.41847163649205]
We introduce a novel, end-to-end learnable, differentiable non-rigid tracker.
We employ a convolutional neural network to predict dense correspondences and their confidences.
Compared to state-of-the-art approaches, our algorithm shows improved reconstruction performance.
arXiv Detail & Related papers (2020-06-23T18:00:39Z)
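Hedged code sketches for some of the entries above follow. First, for the NGDiff entry: a loose sketch of a normalized-gradient-difference update for two objectives (e.g., a retain loss and a forget loss in unlearning). Only the normalize-then-subtract step is illustrated; the paper's exact rule and its adaptive learning rate are not reproduced, and all names are assumptions.
```python
# Loose sketch of combining two objectives' gradients by normalized difference.
import torch

def ngdiff_direction(grads_retain, grads_forget, eps=1e-12):
    """Each argument: list of per-parameter gradient tensors for one objective."""
    def global_norm(gs):
        return torch.sqrt(sum(g.pow(2).sum() for g in gs)) + eps
    nr, nf = global_norm(grads_retain), global_norm(grads_forget)
    # Normalizing both gradients before taking their difference keeps either
    # objective from dominating the update purely through gradient magnitude.
    return [gr / nr - gf / nf for gr, gf in zip(grads_retain, grads_forget)]
```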
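For the DynamoRep entry: a minimal sketch of trajectory features built from per-iteration descriptive statistics of an optimizer's population. The input format and the choice of statistics are assumptions; the paper may use different or per-coordinate statistics.
```python
# Minimal sketch of DynamoRep-style features: simple statistics of the
# candidate solutions and their fitness values at each iteration.
import numpy as np

def dynamorep_features(trajectory):
    """trajectory: list of (population, fitness) arrays, one pair per iteration."""
    feats = []
    for population, fitness in trajectory:   # population: (pop_size, dim)
        for arr in (np.asarray(population), np.asarray(fitness)):
            feats += [arr.min(), arr.max(), arr.mean(), arr.std()]
    # Fixed-length vector (for a fixed number of iterations) that can be fed
    # to any off-the-shelf classifier to identify the problem class.
    return np.asarray(feats)
```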
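For the compute-optimal transfer learning entry: a hedged sketch of zero-shot structured pruning, dropping whole convolutional filters by a data-free criterion (here, L1 weight norm). The criterion and the single-layer scope are assumptions; the cited paper's method may differ.
```python
# Hedged sketch: data-free structured pruning of one pretrained conv layer.
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float = 0.8) -> nn.Conv2d:
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one score per filter
    k = max(1, int(keep_ratio * conv.out_channels))
    keep = scores.topk(k).indices.sort().values             # filters to retain
    pruned = nn.Conv2d(conv.in_channels, k, conv.kernel_size,
                       conv.stride, conv.padding, bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    # In a full network, the next layer's input channels must be reduced to match.
    return pruned
```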
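For the MPC-with-representation-learning entry: a minimal sketch of adding a reconstruction term to a latent world-model objective. The decoder, `tdmpc_objective`, and the weighting are hypothetical stand-ins, not the paper's architecture.
```python
# Hedged sketch: a reconstruction loss added to a TD-MPC-style latent objective.
import torch.nn.functional as F

def model_loss(encoder, decoder, tdmpc_objective, obs, recon_weight=1.0):
    """encoder/decoder: nn.Modules; tdmpc_objective: callable on the latent."""
    z = encoder(obs)
    # Reconstruction forces the latent state to retain observation details
    # that a purely reward/value-driven objective might discard.
    recon = F.mse_loss(decoder(z), obs)
    return tdmpc_objective(z) + recon_weight * recon
```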
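For the MusIK entry: a hedged sketch of a multi-step inverse kinematics objective, where an encoder is trained to predict the first action taken between an observation and one k steps later, with one head per horizon. MusIK itself also interleaves systematic exploration, which is not shown, and all names here are illustrative.
```python
# Hedged sketch of a multi-step inverse kinematics representation objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InverseKinematicsObjective(nn.Module):
    def __init__(self, encoder, latent_dim, num_actions, max_horizon):
        super().__init__()
        self.encoder = encoder
        # One prediction head per look-ahead horizon k = 1 .. max_horizon.
        self.heads = nn.ModuleList(
            nn.Linear(2 * latent_dim, num_actions) for _ in range(max_horizon))

    def loss(self, obs_t, obs_tk, action_t, k):
        """Predict the first action from the encodings of obs_t and obs_{t+k}."""
        z = torch.cat([self.encoder(obs_t), self.encoder(obs_tk)], dim=1)
        return F.cross_entropy(self.heads[k - 1](z), action_t)
```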
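For the SogCLR entry: a hedged, single-device sketch of the core idea, keeping a per-example moving average of the negative-pair statistic so a small batch can approximate the global contrastive objective; dividing by the detached running average makes the surrogate's gradient approximate that of the log of the global statistic. All details are illustrative, not the paper's exact algorithm.
```python
# Hedged sketch of a SogCLR-style moving-average negative statistic.
import torch
import torch.nn.functional as F

class GlobalContrastiveLoss:
    def __init__(self, num_examples, gamma=0.9, temperature=0.5):
        self.u = torch.zeros(num_examples)  # running per-example negative statistic
        self.gamma, self.t = gamma, temperature

    def __call__(self, idx, z1, z2):
        """idx: (B,) dataset indices; z1, z2: (B, D) projections of two views."""
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        sim = z1 @ z2.t() / self.t
        pos = sim.diag()                                    # positive-pair logits
        mask = 1.0 - torch.eye(len(idx))
        g = (sim.exp() * mask).sum(dim=1) / (len(idx) - 1)  # batch negative statistic
        # Moving average smooths the small-batch estimate toward the global one.
        self.u[idx] = self.gamma * self.u[idx] + (1 - self.gamma) * g.detach()
        # Dividing by the detached running mean makes this surrogate's gradient
        # approximate the gradient of log(global negative statistic).
        return (-pos + g / self.u[idx]).mean()
```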
This list is automatically generated from the titles and abstracts of the papers in this site.