Reasoning-Modulated Representations
- URL: http://arxiv.org/abs/2107.08881v1
- Date: Mon, 19 Jul 2021 13:57:13 GMT
- Title: Reasoning-Modulated Representations
- Authors: Petar Veličković, Matko Bošnjak, Thomas Kipf, Alexander Lerchner, Raia Hadsell, Razvan Pascanu, Charles Blundell
- Abstract summary: We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning.
- Score: 85.08205744191078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks leverage robust internal representations in order to
generalise. Learning them is difficult, and often requires a large training set
that covers the data distribution densely. We study a common setting where our
task is not purely opaque. Indeed, very often we may have access to information
about the underlying system (e.g. that observations must obey certain laws of
physics) that any "tabula rasa" neural network would need to re-learn from
scratch, penalising data efficiency. We incorporate this information into a
pre-trained reasoning module, and investigate its role in shaping the
discovered representations in diverse self-supervised learning settings from
pixels. Our approach paves the way for a new class of data-efficient
representation learning.
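The abstract's core idea can be sketched as an encode-process-decode pipeline in which the middle "reasoning" component is pre-trained on the known structure of the system (e.g. simple dynamics) and then frozen, so that self-supervised training from pixels only shapes the encoder and decoder around it. The following is a minimal NumPy sketch of that wiring; all names, shapes, and the linear/tanh parameterization are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(pixels, W_enc):
    # Hypothetical encoder: maps raw observations to abstract latents.
    return np.tanh(pixels @ W_enc)

def reasoning_module(z, W_reason):
    # Stands in for a module pre-trained on the underlying system
    # (e.g. its dynamics). It is FROZEN: W_reason receives no updates
    # during representation learning from pixels.
    return np.tanh(z @ W_reason)

def decoder(z, W_dec):
    # Hypothetical decoder: maps latents back to observation space.
    return z @ W_dec

# Illustrative shapes: 64-dim observations, 16-dim latents.
W_enc = rng.normal(scale=0.1, size=(64, 16))     # trainable
W_reason = rng.normal(scale=0.1, size=(16, 16))  # pre-trained, frozen
W_dec = rng.normal(scale=0.1, size=(16, 64))     # trainable

x = rng.normal(size=(8, 64))            # a batch of pixel observations
z = encoder(x, W_enc)                   # abstract latents
z_next = reasoning_module(z, W_reason)  # one latent "reasoning" step
x_pred = decoder(z_next, W_dec)         # predicted observation

# A self-supervised objective would train only W_enc and W_dec,
# forcing the discovered latents to be ones the frozen reasoning
# module can operate on.
loss = np.mean((x_pred - x) ** 2)
```

The point of the sketch is the gradient flow, not the particular loss: because the reasoning module is fixed, the encoder must discover representations compatible with the pre-built knowledge rather than re-learning it from scratch.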
Related papers
- A Spectral Condition for Feature Learning [20.440553685976194]
The key challenge is to scale training so that a network's internal representations evolve nontrivially at all widths.
We show that feature learning is achieved by scaling the spectral norm of weight matrices and their updates.
arXiv Detail & Related papers (2023-10-26T23:17:39Z)
- On information captured by neural networks: connections with memorization and generalization [4.082286997378594]
We study information captured by neural networks during training.
We relate example informativeness to generalization by deriving nonvacuous generalization gap bounds.
Overall, our findings contribute to a deeper understanding of the mechanisms underlying neural network generalization.
arXiv Detail & Related papers (2023-06-28T04:46:59Z)
- Learn, Unlearn and Relearn: An Online Learning Paradigm for Deep Neural Networks [12.525959293825318]
We introduce Learn, Unlearn, and Relearn (LURE), an online learning paradigm for deep neural networks (DNNs).
LURE interchanges between the unlearning phase, which selectively forgets the undesirable information in the model, and the relearning phase, which emphasizes learning on generalizable features.
We show that our training paradigm provides consistent performance gains across datasets in both classification and few-shot settings.
arXiv Detail & Related papers (2023-03-18T16:45:54Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics, and only exploit higher-order statistics later in training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Learning sparse features can lead to overfitting in neural networks [9.2104922520782]
We show that feature learning can perform worse than lazy training.
Although sparsity is known to be essential for learning anisotropic data, it is detrimental when the target function is constant or smooth.
arXiv Detail & Related papers (2022-06-24T14:26:33Z)
- Fair Interpretable Learning via Correction Vectors [68.29997072804537]
We propose a new framework for fair representation learning centered around the learning of "correction vectors"
The corrections are then simply added to the original features, and can therefore be analyzed as an explicit penalty or bonus applied to each feature.
We show experimentally that a fair representation learning problem constrained in such a way does not impact performance.
arXiv Detail & Related papers (2022-01-17T10:59:33Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in an end-to-end learned manner.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Self-supervised Audiovisual Representation Learning for Remote Sensing Data [96.23611272637943]
We propose a self-supervised approach for pre-training deep neural networks in remote sensing.
Pre-training is done in a completely label-free manner by exploiting the correspondence between geo-tagged audio recordings and remote sensing imagery.
We show that our approach outperforms existing pre-training strategies for remote sensing imagery.
arXiv Detail & Related papers (2021-08-02T07:50:50Z)
- Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View [0.0]
We study how homeomorphisms affect the learned representation of a malware traffic dataset.
Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same.
arXiv Detail & Related papers (2020-09-16T15:37:44Z)
- Laplacian Denoising Autoencoder [114.21219514831343]
We propose to learn data representations with a novel type of denoising autoencoder.
The noisy input data is generated by corrupting latent clean data in the gradient domain.
Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach.
arXiv Detail & Related papers (2020-03-30T16:52:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.