Investigating Power laws in Deep Representation Learning
- URL: http://arxiv.org/abs/2202.05808v1
- Date: Fri, 11 Feb 2022 18:11:32 GMT
- Title: Investigating Power laws in Deep Representation Learning
- Authors: Arna Ghosh, Arnab Kumar Mondal, Kumar Krishna Agrawal, Blake Richards
- Abstract summary: We propose a framework to evaluate the quality of representations in unlabelled datasets.
We estimate the coefficient of the power law, $\alpha$, across three key attributes which influence representation learning.
Notably, $alpha$ is computable from the representations without knowledge of any labels, thereby offering a framework to evaluate the quality of representations in unlabelled datasets.
- Score: 4.996066540156903
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Representation learning that leverages large-scale labelled datasets is
central to recent progress in machine learning. Access to task-relevant labels
at scale is often scarce or expensive, motivating the need to learn from
unlabelled datasets with self-supervised learning (SSL). Such large unlabelled
datasets (with data augmentations) often provide a good coverage of the
underlying input distribution. However, evaluating the representations learned
by SSL algorithms still requires task-specific labelled samples in the training
pipeline. Additionally, the generalization of task-specific encoding is often
sensitive to potential distribution shift. Inspired by recent advances in
theoretical machine learning and vision neuroscience, we observe that the
eigenspectrum of the empirical feature covariance matrix often follows a power
law. For visual representations, we estimate the coefficient of the power law,
$\alpha$, across three key attributes which influence representation learning:
learning objective (supervised, SimCLR, Barlow Twins and BYOL), network
architecture (VGG, ResNet and Vision Transformer), and tasks (object and scene
recognition). We observe that, under mild conditions, proximity of $\alpha$ to
1 is strongly correlated with downstream generalization performance.
Furthermore, $\alpha \approx 1$ is a strong indicator of robustness to label
noise during fine-tuning. Notably, $\alpha$ is computable from the
representations without knowledge of any labels, thereby offering a framework
to evaluate the quality of representations in unlabelled datasets.
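As the abstract notes, $\alpha$ can be estimated directly from a matrix of representations without any labels. Below is a minimal sketch of one way to do this with NumPy: form the empirical feature covariance, take its eigenspectrum, and fit the slope of eigenvalue versus rank in log-log space. The function name `estimate_alpha`, the fit range, and the random example data are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def estimate_alpha(features: np.ndarray, fit_range=(10, 1000)) -> float:
    """Estimate alpha by fitting lambda_i ~ i^(-alpha) to the covariance eigenspectrum."""
    # Center the features and form the empirical feature covariance matrix.
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (features.shape[0] - 1)

    # Eigenvalues of the symmetric covariance matrix, sorted in decreasing order.
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    eigvals = eigvals[eigvals > 0]  # keep strictly positive eigenvalues

    # Restrict the fit to an intermediate range of ranks (a common heuristic;
    # the exact range here is an assumption, not taken from the paper).
    lo, hi = fit_range
    hi = min(hi, len(eigvals))
    ranks = np.arange(1, len(eigvals) + 1)

    # Linear fit in log-log space: log(lambda_i) ~ -alpha * log(i) + c.
    slope, _ = np.polyfit(np.log(ranks[lo:hi]), np.log(eigvals[lo:hi]), deg=1)
    return -slope

if __name__ == "__main__":
    # Hypothetical usage with random Gaussian "representations"; in practice,
    # pass features extracted from a trained encoder (e.g. a ResNet penultimate layer).
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((5000, 2048))
    print(f"estimated alpha: {estimate_alpha(feats):.3f}")
```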
Related papers
- Improving Deep Representation Learning via Auxiliary Learnable Target Coding [69.79343510578877]
This paper introduces a novel learnable target coding as an auxiliary regularization of deep representation learning.
Specifically, a margin-based triplet loss and a correlation consistency loss on the proposed target codes are designed to encourage more discriminative representations.
arXiv Detail & Related papers (2023-05-30T01:38:54Z) - A Benchmark Generative Probabilistic Model for Weak Supervised Learning [2.0257616108612373]
Weakly supervised learning approaches have been developed to alleviate the annotation burden.
We show that probabilistic latent variable models (PLVMs) achieve state-of-the-art performance across four datasets.
arXiv Detail & Related papers (2023-03-31T07:06:24Z) - Deep Active Learning Using Barlow Twins [0.0]
The generalisation performance of a convolutional neural network (CNN) is largely determined by the quantity, quality, and diversity of the training images.
The goal of active learning is to draw the most informative samples from the unlabeled pool.
We propose Deep Active Learning using Barlow Twins (DALBT), an active learning method applicable across datasets.
arXiv Detail & Related papers (2022-12-30T12:39:55Z) - On the Informativeness of Supervision Signals [31.418827619510036]
We use information theory to compare how a number of commonly-used supervision signals contribute to representation-learning performance.
Our framework provides theoretical justification for using hard labels in the big-data regime, but richer supervision signals for few-shot learning and out-of-distribution generalization.
arXiv Detail & Related papers (2022-11-02T18:02:31Z) - Improving Model Training via Self-learned Label Representations [5.969349640156469]
We show that more sophisticated label representations are better for classification than the usual one-hot encoding.
We propose Learning with Adaptive Labels (LwAL) algorithm, which simultaneously learns the label representation while training for the classification task.
Our algorithm introduces negligible additional parameters and has a minimal computational overhead.
arXiv Detail & Related papers (2022-09-09T21:10:43Z) - Improving Contrastive Learning on Imbalanced Seed Data via Open-World
Sampling [96.8742582581744]
We present an open-world unlabeled data sampling framework called Model-Aware K-center (MAK).
MAK follows three simple principles: tailness, proximity, and diversity.
We demonstrate that MAK can consistently improve both the overall representation quality and the class balancedness of the learned features.
arXiv Detail & Related papers (2021-11-01T15:09:41Z) - Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z) - SCARF: Self-Supervised Contrastive Learning using Random Feature
Corruption [72.35532598131176]
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features.
We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
arXiv Detail & Related papers (2021-06-29T08:08:33Z) - Knowledge-Guided Multi-Label Few-Shot Learning for General Image
Recognition [75.44233392355711]
The KGGR framework exploits prior knowledge of statistical label correlations with deep neural networks.
It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence.
Then, it introduces the label semantics to guide learning semantic-specific features.
It exploits a graph propagation network to explore graph node interactions.
arXiv Detail & Related papers (2020-09-20T15:05:29Z) - Federated Self-Supervised Learning of Multi-Sensor Representations for
Embedded Intelligence [8.110949636804772]
Smartphones, wearables, and Internet of Things (IoT) devices produce a wealth of data that cannot be accumulated in a centralized repository for learning supervised models.
We propose a self-supervised approach termed scalogram-signal correspondence learning, based on the wavelet transform, to learn useful representations from unlabeled sensor inputs.
We extensively assess the quality of learned features with our multi-view strategy on diverse public datasets, achieving strong performance in all domains.
arXiv Detail & Related papers (2020-07-25T21:59:17Z) - Laplacian Denoising Autoencoder [114.21219514831343]
We propose to learn data representations with a novel type of denoising autoencoder.
The noisy input data is generated by corrupting latent clean data in the gradient domain.
Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach.
arXiv Detail & Related papers (2020-03-30T16:52:39Z)