Usable Information and Evolution of Optimal Representations During
Training
- URL: http://arxiv.org/abs/2010.02459v2
- Date: Sun, 28 Feb 2021 17:51:26 GMT
- Authors: Michael Kleinman, Alessandro Achille, Daksh Idnani, Jonathan C. Kao
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a notion of usable information contained in the representation
learned by a deep network, and use it to study how optimal representations for
the task emerge during training. We show that the implicit regularization
coming from training with Stochastic Gradient Descent with a high learning rate
and small batch size plays an important role in learning minimal sufficient
representations for the task. In the process of arriving at a minimal
sufficient representation, we find that the content of the representation
changes dynamically during training. In particular, we find that semantically
meaningful but ultimately irrelevant information is encoded in the early
transient dynamics of training, before being later discarded. In addition, we
evaluate how perturbing the initial part of training impacts the learning
dynamics and the resulting representations. We show these effects on both
perceptual decision-making tasks inspired by neuroscience literature, as well
as on standard image classification tasks.
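The paper measures usable information as the information a learned decoder can actually extract from a representation. A rough way to illustrate the idea (not the authors' implementation) is to estimate it as the label entropy minus the residual cross-entropy of a probe trained on the representation. Everything below — the synthetic representation, the linear probe, and the hyperparameters — is an assumption for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a learned representation: 200 samples, 5 features,
# with a binary label determined by the first feature.
n, d = 200, 5
Z = rng.normal(size=(n, d))
y = (Z[:, 0] > 0).astype(int)

def probe_cross_entropy(Z, y, steps=2000, lr=0.5):
    """Train a logistic-regression probe by gradient descent and return
    its mean cross-entropy on the data, in nats."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        w -= lr * Z.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    p = np.clip(1.0 / (1.0 + np.exp(-(Z @ w + b))), 1e-8, 1 - 1e-8)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Usable information = label entropy minus the probe's residual cross-entropy:
# how much uncertainty about y the probe can remove by reading Z.
p1 = y.mean()
h_y = -(p1 * np.log(p1) + (1 - p1) * np.log(1 - p1))
usable = h_y - probe_cross_entropy(Z, y)
print(f"usable information: {usable:.3f} nats (label entropy {h_y:.3f})")
```

Because the label here is perfectly decodable from the representation, the estimate approaches the full label entropy; an uninformative representation would yield an estimate near zero. The choice of probe family matters: a more expressive decoder can extract more information, which is exactly the "usable" qualifier in the paper's framing.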
Related papers
- A Quantitative Approach to Predicting Representational Learning and
Performance in Neural Networks [5.544128024203989]
A key property of neural networks is how they learn to represent and manipulate input information in order to solve a task.
We introduce a new pseudo-kernel based tool for analyzing and predicting learned representations.
arXiv Detail & Related papers (2023-07-14T18:39:04Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP)
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Task Formulation Matters When Learning Continually: A Case Study in Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z)
- (Un)likelihood Training for Interpretable Embedding [30.499562324921648]
Cross-modal representation learning has become a new normal for bridging the semantic gap between text and visual data.
We propose two novel training objectives, likelihood and unlikelihood functions, to unroll semantics behind embeddings.
Using both training objectives, we propose a new encoder-decoder network that learns interpretable cross-modal representations for ad-hoc video search.
arXiv Detail & Related papers (2022-07-01T09:15:02Z)
- Probing Representation Forgetting in Supervised and Unsupervised Continual Learning [14.462797749666992]
Catastrophic forgetting is associated with an abrupt loss of knowledge previously learned by a model.
We show that representation forgetting can lead to new insights on the effect of model capacity and loss function used in continual learning.
arXiv Detail & Related papers (2022-03-24T23:06:08Z)
- On Efficient Transformer and Image Pre-training for Low-level Vision [74.22436001426517]
Pre-training has achieved numerous state-of-the-art results in high-level computer vision.
We present an in-depth study of image pre-training.
We find pre-training plays strikingly different roles in low-level tasks.
arXiv Detail & Related papers (2021-12-19T15:50:48Z)
- Adversarial Training Reduces Information and Improves Transferability [81.59364510580738]
Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility.
We show that adversarial training can improve linear transferability to new tasks, giving rise to a trade-off between the transferability of representations and accuracy on the source task.
arXiv Detail & Related papers (2020-07-22T08:30:16Z)
- Complementing Representation Deficiency in Few-shot Image Classification: A Meta-Learning Approach [27.350615059290348]
We propose a meta-learning approach with complemented representations network (MCRNet) for few-shot image classification.
In particular, we embed a latent space, where latent codes are reconstructed with extra representation information to complement the representation deficiency.
Our end-to-end framework achieves the state-of-the-art performance in image classification on three standard few-shot learning datasets.
arXiv Detail & Related papers (2020-07-21T13:25:54Z)
- Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification [53.735029033681435]
Transfer learning is a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains.
In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models.
arXiv Detail & Related papers (2020-07-11T22:48:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.