Self-Supervised Learning with an Information Maximization Criterion
- URL: http://arxiv.org/abs/2209.07999v1
- Date: Fri, 16 Sep 2022 15:26:19 GMT
- Title: Self-Supervised Learning with an Information Maximization Criterion
- Authors: Serdar Ozsoy, Shadi Hamdan, Sercan Ö. Arik, Deniz Yuret, Alper T. Erdogan
- Abstract summary: We argue that a straightforward application of information maximization among alternative representations of the same input naturally solves the collapse problem.
We propose a self-supervised learning method, CorInfoMax, that uses a second-order statistics-based mutual information measure.
Numerical experiments demonstrate that CorInfoMax achieves better or competitive performance results relative to the state-of-the-art SSL approaches.
- Score: 5.214806886230471
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning allows AI systems to learn effective representations
from large amounts of data using tasks that do not require costly labeling.
Mode collapse, i.e., the model producing identical representations for all
inputs, is a central problem to many self-supervised learning approaches,
making self-supervised tasks, such as matching distorted variants of the
inputs, ineffective. In this article, we argue that a straightforward
application of information maximization among alternative latent
representations of the same input naturally solves the collapse problem and
achieves competitive empirical results. We propose a self-supervised learning
method, CorInfoMax, that uses a second-order statistics-based mutual
information measure that reflects the level of correlation among its arguments.
Maximizing this correlative information measure between alternative
representations of the same input serves two purposes: (1) it avoids the
collapse problem by generating feature vectors with non-degenerate covariances;
(2) it establishes relevance among alternative representations by increasing
the linear dependence among them. An approximation of the proposed information
maximization objective simplifies to a Euclidean distance-based objective
function regularized by the log-determinant of the feature covariance matrix.
The regularization term acts as a natural barrier against feature space
degeneracy. Consequently, beyond avoiding complete output collapse to a single
point, the proposed approach also prevents dimensional collapse by encouraging
the spread of information across the whole feature space. Numerical experiments
demonstrate that CorInfoMax achieves better or competitive performance results
relative to the state-of-the-art SSL approaches.
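The abstract describes an approximate objective: a Euclidean distance term between paired representations, regularized by the log-determinant of the feature covariance matrix. The snippet below is an illustrative sketch of that described objective in NumPy, not the authors' CorInfoMax implementation; the function name and the `lam` and `eps` parameters are assumptions introduced for illustration.

```python
import numpy as np

def corinfomax_style_loss(z1, z2, eps=1e-6, lam=1.0):
    """Sketch of a log-det-regularized SSL objective.

    z1, z2: (batch, dim) arrays of features for two augmented
    views of the same inputs (hypothetical shapes for this sketch).
    """
    n, d = z1.shape
    # Euclidean alignment term: pulls paired views together.
    align = np.mean(np.sum((z1 - z2) ** 2, axis=1))

    def logdet_cov(z):
        # Log-determinant of the (regularized) feature covariance;
        # eps * I keeps the matrix positive definite.
        zc = z - z.mean(axis=0, keepdims=True)
        cov = zc.T @ zc / n + eps * np.eye(d)
        _, ld = np.linalg.slogdet(cov)
        return ld

    # Minimizing this loss maximizes the log-det terms, which act as a
    # barrier against collapsed (degenerate) feature covariances.
    return lam * align - logdet_cov(z1) - logdet_cov(z2)
```

Under this sketch, collapsed features (all rows identical) drive the covariance toward `eps * I`, making the log-determinant strongly negative and the loss large, which is the barrier behavior the abstract describes.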
Related papers
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Nonlinear Feature Aggregation: Two Algorithms driven by Theory [45.3190496371625]
Real-world machine learning applications are characterized by a huge number of features, leading to computational and memory issues.
We propose a dimensionality reduction algorithm (NonLinCFA) which aggregates non-linear transformations of features with a generic aggregation function.
We also test the algorithms on synthetic and real-world datasets, performing regression and classification tasks, showing competitive performances.
arXiv Detail & Related papers (2023-06-19T19:57:33Z)
- Efficient Alternating Minimization Solvers for Wyner Multi-View Unsupervised Learning [0.0]
We propose two novel formulations that enable the development of computationally efficient solvers based on the alternating minimization principle.
The proposed solvers offer computational efficiency, theoretical convergence guarantees, local-minima complexity that scales with the number of views, and exceptional accuracy compared with state-of-the-art techniques.
arXiv Detail & Related papers (2023-03-28T10:17:51Z)
- Robust Direct Learning for Causal Data Fusion [14.462235940634969]
We provide a framework for integrating multi-source data that separates the treatment effect from other nuisance functions.
We also propose a causal information-aware weighting function motivated by theoretical insights from the semiparametric efficiency theory.
arXiv Detail & Related papers (2022-11-01T03:33:22Z)
- HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction [60.80849503639896]
Unsupervised relation extraction aims to extract the relationship between entities from natural language sentences without prior information on relational scope or distribution.
We propose a novel contrastive learning framework named HiURE, which has the capability to derive hierarchical signals from relational feature space using cross hierarchy attention.
Experimental results on two public datasets demonstrate the advanced effectiveness and robustness of HiURE on unsupervised relation extraction when compared with state-of-the-art models.
arXiv Detail & Related papers (2022-05-04T17:56:48Z)
- Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation [70.2166826794421]
We propose a differentiable geometric warping to conduct unsupervised data augmentation.
We also propose a novel adversarial dual-student framework to improve the Mean-Teacher.
Our solution significantly improves performance, achieving state-of-the-art results on both datasets.
arXiv Detail & Related papers (2022-03-05T17:36:17Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Integrating Information Theory and Adversarial Learning for Cross-modal Retrieval [19.600581093189362]
Accurately matching visual and textual data in cross-modal retrieval has been widely studied in the multimedia community.
We propose integrating Shannon information theory and adversarial learning.
To address the modality gap, we integrate modality classification and information entropy adversarially.
arXiv Detail & Related papers (2021-04-11T11:04:55Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.