Can we learn gradients by Hamiltonian Neural Networks?
- URL: http://arxiv.org/abs/2111.00565v1
- Date: Sun, 31 Oct 2021 18:35:10 GMT
- Title: Can we learn gradients by Hamiltonian Neural Networks?
- Authors: Aleksandr Timofeev, Andrei Afonin, Yehao Liu
- Abstract summary: We propose a meta-learner based on ODE neural networks that learns gradients.
We demonstrate that our method outperforms a meta-learner based on LSTM for an artificial task and the MNIST dataset with ReLU activations in the optimizee.
- Score: 68.8204255655161
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we propose a meta-learner based on ODE neural networks that
learns gradients. This approach makes the optimizer more flexible, inducing an automatic inductive bias for the given task. Using the simplest Hamiltonian
Neural Network we demonstrate that our method outperforms a meta-learner based
on LSTM for an artificial task and the MNIST dataset with ReLU activations in
the optimizee. Furthermore, it also surpasses the classic optimization methods
for the artificial task and achieves comparable results for MNIST.
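Below is a minimal, self-contained PyTorch sketch of the learning-to-optimize pattern the abstract describes: a simple Hamiltonian Neural Network parameterizes a scalar Hamiltonian H(q, p), and integrating its dynamics produces the optimizee's parameter updates, so the update rule itself is learned. Treating the optimizee parameters as positions q with a learned momentum state p, the class name HamiltonianMetaLearner, the coordinate-wise MLP, the symplectic-Euler step, and the toy quadratic optimizee are all illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn


class HamiltonianMetaLearner(nn.Module):
    """Coordinate-wise scalar Hamiltonian H(q, p) parameterized by a small MLP."""

    def __init__(self, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, q: torch.Tensor, p: torch.Tensor) -> torch.Tensor:
        # q: flattened optimizee parameters ("positions"); p: learned momentum state.
        qp = torch.stack([q, p], dim=-1)   # (n, 2)
        return self.net(qp).sum()          # scalar Hamiltonian

    def step(self, q, p, dt: float = 0.1):
        # One symplectic-Euler step of Hamilton's equations,
        #   dq/dt = dH/dp,   dp/dt = -dH/dq,
        # which plays the role of the hand-designed update rule.
        H = self(q, p)
        dHdq, dHdp = torch.autograd.grad(H, (q, p), create_graph=True)
        p_next = p - dt * dHdq
        q_next = q + dt * dHdp
        return q_next, p_next


# Toy meta-training signal: unroll a few learned-optimizer steps on a small
# quadratic "optimizee" and backpropagate its final loss into the HNN weights.
meta = HamiltonianMetaLearner()
theta = torch.randn(10, requires_grad=True)      # optimizee parameters (q)
momentum = torch.zeros(10, requires_grad=True)   # conjugate momenta (p)
for _ in range(5):
    theta, momentum = meta.step(theta, momentum)
final_loss = (theta ** 2).sum()                  # toy optimizee objective
final_loss.backward()                            # gradients reach meta.net.parameters()
```

In the paper's setting the unrolled objective would come from the actual optimizee (e.g. an MNIST classifier with ReLU activations) rather than this toy quadratic.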
Related papers
- Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers [66.823588073584]
Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performance in various applications.
Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs.
We propose a neural bandit algorithm which replaces the GP in BO by an NN surrogate to optimize instructions for black-box LLMs.
arXiv Detail & Related papers (2023-10-02T02:01:16Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under a norm constraint; a minimal sketch of a gradient-cosine quantity of this kind appears after this list.
Generalized from the sample-wise analysis into the real batch setting, NIO (Neural Initialization Optimization) is able to automatically look for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z)
- Training Neural Networks using SAT solvers [1.0152838128195465]
We propose an algorithm that uses SAT solvers to explore global optimisation for training a neural network.
In the experiments, we demonstrate the effectiveness of our algorithm against the ADAM optimiser in certain tasks like parity learning.
arXiv Detail & Related papers (2022-06-10T01:31:12Z)
- An alternative approach to train neural networks using monotone variational inequality [22.320632565424745]
We propose an alternative approach to neural network training using the monotone vector field.
Our approach can be used for more efficient fine-tuning of a pre-trained neural network.
arXiv Detail & Related papers (2022-02-17T19:24:20Z)
- Meta-learning Spiking Neural Networks with Surrogate Gradient Descent [1.90365714903665]
Bi-level learning, such as meta-learning, is increasingly used in deep learning to overcome limitations.
We show that SNNs meta-trained using MAML match or exceed the performance of conventional ANNs meta-trained with MAML on event-based meta-datasets.
Our results emphasize how meta-learning techniques can become instrumental for deploying neuromorphic learning technologies on real-world problems.
arXiv Detail & Related papers (2022-01-26T06:53:46Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based methods combined with nonconvexity renders learning susceptible to initialization problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
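As referenced in the GradCosine/NIO entry above, the following is a minimal sketch (an assumption, not that paper's exact definition) of a sample-wise gradient-cosine quantity: the mean pairwise cosine similarity between per-sample loss gradients at initialization. The function name grad_cosine, the cross-entropy loss, and the small probe batch are hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def grad_cosine(model: nn.Module, xs: torch.Tensor, ys: torch.Tensor) -> torch.Tensor:
    """Mean pairwise cosine similarity of per-sample loss gradients."""
    per_sample_grads = []
    for x, y in zip(xs, ys):
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, list(model.parameters()))
        per_sample_grads.append(torch.cat([g.flatten() for g in grads]))
    G = F.normalize(torch.stack(per_sample_grads), dim=1)       # (n, num_params)
    sims = G @ G.T                                               # pairwise cosines
    n = G.shape[0]
    return (sims.sum() - sims.diagonal().sum()) / (n * (n - 1))  # mean over i != j


# Hypothetical usage: score a freshly initialized classifier on a small probe batch.
net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
xs, ys = torch.randn(8, 784), torch.randint(0, 10, (8,))
print(grad_cosine(net, xs, ys))
```

In the NIO paper this kind of quantity is maximized under a norm constraint to select a better initialization; its batch-setting generalization may differ from the per-sample form sketched here.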
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.