Noether Networks: Meta-Learning Useful Conserved Quantities
- URL: http://arxiv.org/abs/2112.03321v1
- Date: Mon, 6 Dec 2021 19:27:43 GMT
- Title: Noether Networks: Meta-Learning Useful Conserved Quantities
- Authors: Ferran Alet, Dylan Doblar, Allan Zhou, Joshua Tenenbaum, Kenji
Kawaguchi, Chelsea Finn
- Abstract summary: We propose Noether Networks: a new type of architecture where a meta-learned conservation loss is optimized inside the prediction function.
We show, theoretically and experimentally, that Noether Networks improve prediction quality, providing a general framework for discovering inductive biases in sequential problems.
- Score: 46.88551280525578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Progress in machine learning (ML) stems from a combination of data
availability, computational resources, and an appropriate encoding of inductive
biases. Useful biases often exploit symmetries in the prediction problem, such
as convolutional networks relying on translation equivariance. Automatically
discovering these useful symmetries holds the potential to greatly improve the
performance of ML systems, but still remains a challenge. In this work, we
focus on sequential prediction problems and take inspiration from Noether's
theorem to reduce the problem of finding inductive biases to meta-learning
useful conserved quantities. We propose Noether Networks: a new type of
architecture where a meta-learned conservation loss is optimized inside the
prediction function. We show, theoretically and experimentally, that Noether
Networks improve prediction quality, providing a general framework for
discovering inductive biases in sequential problems.
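The mechanism is compact enough to sketch. Below is a minimal, hypothetical PyTorch-style illustration of the prediction-time inner loop: a meta-learned embedding g plays the role of the conserved quantity, and a copy of the predictor is fine-tuned for a few gradient steps so that g stays constant along the model's own rollout before the forecast is emitted. All names (noether_predict, predictor, g, horizon) are illustrative rather than the authors' released code, and the outer loop that meta-learns g is omitted.
```python
import copy
import torch

def noether_predict(predictor, g, x0, horizon, n_steps=5, lr=1e-3):
    """Tailor a copy of the predictor at test time so that the
    meta-learned quantity g stays (approximately) constant along
    its own rollout, then return the adapted rollout."""
    tailored = copy.deepcopy(predictor)      # never mutate the base model
    opt = torch.optim.SGD(tailored.parameters(), lr=lr)
    target = g(x0).detach()                  # value g should conserve
    for _ in range(n_steps):
        x, loss = x0, 0.0
        for _ in range(horizon):
            x = tailored(x)                  # autoregressive prediction
            loss = loss + (g(x) - target).pow(2).mean()
        opt.zero_grad()
        loss.backward()                      # conservation loss only
        opt.step()
    with torch.no_grad():                    # final rollout after tailoring
        x, preds = x0, []
        for _ in range(horizon):
            x = tailored(x)
            preds.append(x)
    return torch.stack(preds)
```
In the full method, g itself is trained in an outer loop so that this inner adaptation lowers the downstream prediction error; Noether's theorem motivates the approach, since useful conserved quantities encode symmetries of the dynamics.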
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks (a toy sketch of the normalized update appears after this list).
arXiv Detail & Related papers (2024-10-22T10:19:27Z)
- Characterizing out-of-distribution generalization of neural networks: application to the disordered Su-Schrieffer-Heeger model [38.79241114146971]
We show how interpretability methods can increase trust in the predictions of a neural network trained to classify quantum phases.
In particular, we show that they can ensure better out-of-distribution generalization in this complex classification problem.
This work is an example of how the systematic use of interpretability methods can improve the performance of neural networks in scientific problems.
arXiv Detail & Related papers (2024-06-14T13:24:32Z)
- Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance model accuracy.
As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom holds that neural network predictions are unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
In contrast, we observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Regularization, early-stopping and dreaming: a Hopfield-like setup to address generalization and overfitting [0.0]
We look for optimal network parameters by applying gradient descent to a regularized loss function.
Within this framework, the optimal neuron-interaction matrices correspond to Hebbian kernels revised by a reiterated unlearning protocol.
arXiv Detail & Related papers (2023-08-01T15:04:30Z)
- GIT: Detecting Uncertainty, Out-Of-Distribution and Adversarial Samples using Gradients and Invariance Transformations [77.34726150561087]
We propose a holistic approach for the detection of generalization errors in deep neural networks.
GIT combines gradient information with invariance transformations.
Our experiments demonstrate the superior performance of GIT compared to the state-of-the-art on a variety of network architectures.
arXiv Detail & Related papers (2023-07-05T22:04:38Z)
- Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets [2.824895388993495]
We provide theoretical guarantees for reliable learning under the information-theoretic asymptotic equipartition property (AEP).
We then focus on a highly efficient recurrent neural net (RNN) framework and propose a reduced-entropy algorithm for few-shot learning.
Our experimental results demonstrate significant potential for improving learning models' sample efficiency, generalization, and time complexity.
arXiv Detail & Related papers (2022-09-28T17:33:11Z)
- Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time [34.03150701567508]
Adding auxiliary losses to the main objective function is a general way of encoding biases that can help networks learn better representations.
In this work we take inspiration from transductive learning and note that, after receiving an input, we can fine-tune our networks on any unsupervised loss.
We formulate meta-tailoring, a nested optimization similar to that in meta-learning, and train our models to perform well on the task objective after adapting them using an unsupervised loss (a first-order sketch of this nested loop appears after this list).
arXiv Detail & Related papers (2020-09-22T15:26:24Z)
- Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why deep neural networks perform poorly under adversarial perturbation.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
arXiv Detail & Related papers (2020-08-01T00:58:54Z)
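As noted in the Error Feedback entry above, the contrast between normalized and plain error feedback fits in a few lines. The sketch below is a single-worker toy version under assumed notation: top_k is a standard sparsifying compressor, the g-update follows the EF21-style correction rule, and the only change for the normalized variant is dividing the step by ||g||. It illustrates the idea only; the paper's multi-worker, momentum-augmented algorithms and their stepsize rules are not reproduced here.
```python
import numpy as np

def top_k(v, k):
    """Top-k sparsifier: keep the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def normalized_ef21(grad_fn, x, steps=100, gamma=0.1, k=1):
    """Single-worker EF21-style error feedback with a normalized step.
    g tracks the gradient through compressed corrections; the iterate
    moves along g / ||g||, the normalization that (per the paper's
    analysis) permits larger stepsizes under (L0, L1)-smoothness."""
    g = grad_fn(x)                                        # initial estimate
    for _ in range(steps):
        x = x - gamma * g / (np.linalg.norm(g) + 1e-12)   # normalized step
        g = g + top_k(grad_fn(x) - g, k)                  # EF21 correction
    return x
```
For intuition, try grad_fn = lambda x: 2 * x (the gradient of ||x||^2): even with k = 1, the compressed correction keeps g tracking the true gradient while only one coordinate is communicated per step.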
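And as flagged in the Tailoring entry, here is a minimal sketch of meta-tailoring's nested optimization, with placeholder unsup_loss and task_loss callables: the inner loop adapts a copy of the model on the unsupervised objective for the given input, and the outer loop trains the base parameters so that the adapted model does well on the task. A first-order (MAML-style) approximation is assumed to keep the sketch short; the paper's formulation need not make that simplification.
```python
import copy
import torch

def meta_tailoring_step(model, opt, x, y, unsup_loss, task_loss,
                        inner_steps=1, inner_lr=1e-2):
    """One outer update: tailor a copy on the unsupervised loss, then
    train the base parameters on the tailored model's task loss."""
    tailored = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(tailored.parameters(), lr=inner_lr)
    for _ in range(inner_steps):             # inner loop: prediction-time tuning
        inner_opt.zero_grad()
        unsup_loss(tailored, x).backward()
        inner_opt.step()
    tailored.zero_grad()                     # clear inner-loop gradients
    outer = task_loss(tailored(x), y)        # task objective after adaptation
    outer.backward()
    # First-order approximation: copy tailored gradients onto the base model.
    for p, p_t in zip(model.parameters(), tailored.parameters()):
        p.grad = p_t.grad.clone() if p_t.grad is not None else None
    opt.step()                               # update the base model
    opt.zero_grad()
    return outer.item()
```
At test time only the inner loop runs, which is exactly the mechanism Noether Networks inherit, with the meta-learned conservation loss serving as the unsupervised objective.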
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.