A Generalist Neural Algorithmic Learner
- URL: http://arxiv.org/abs/2209.11142v1
- Date: Thu, 22 Sep 2022 16:41:33 GMT
- Title: A Generalist Neural Algorithmic Learner
- Authors: Borja Ibarz, Vitaly Kurin, George Papamakarios, Kyriacos Nikiforou,
Mehdi Bennani, Róbert Csordás, Andrew Dudzik, Matko Bošnjak, Alex
Vitvitskyi, Yulia Rubanova, Andreea Deac, Beatrice Bevilacqua, Yaroslav
Ganin, Charles Blundell, Petar Veličković
- Abstract summary: We build a single graph neural network processor capable of learning to execute a wide range of algorithms.
We show that it is possible to effectively learn algorithms in a multi-task manner, so long as we can learn to execute them well in a single-task regime.
- Score: 18.425083543441776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The cornerstone of neural algorithmic reasoning is the ability to solve
algorithmic tasks, especially in a way that generalises out of distribution.
While recent years have seen a surge in methodological improvements in this
area, they mostly focused on building specialist models. Specialist models are
capable of learning to neurally execute either only one algorithm or a
collection of algorithms with identical control-flow backbone. Here, instead,
we focus on constructing a generalist neural algorithmic learner -- a single
graph neural network processor capable of learning to execute a wide range of
algorithms, such as sorting, searching, dynamic programming, path-finding and
geometry. We leverage the CLRS benchmark to empirically show that, much like
recent successes in the domain of perception, generalist algorithmic learners
can be built by "incorporating" knowledge. That is, it is possible to
effectively learn algorithms in a multi-task manner, so long as we can learn to
execute them well in a single-task regime. Motivated by this, we present a
series of improvements to the input representation, training regime and
processor architecture over CLRS, improving average single-task performance by
over 20% from prior art. We then conduct a thorough ablation of multi-task
learners leveraging these improvements. Our results demonstrate a generalist
learner that effectively incorporates knowledge captured by specialist models.
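To make the encode-process-decode recipe concrete, below is a minimal sketch of a single shared message-passing processor with per-task encoders and decoders, trained round-robin across tasks. The toy data, dimensions and task names (sorting, shortest_paths) are illustrative assumptions; this is not the paper's actual Triplet-GMPNN architecture or CLRS training pipeline.

```python
import torch
import torch.nn as nn

class SharedProcessor(nn.Module):
    """One message-passing step, shared across every algorithmic task."""
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)  # message from a (sender, receiver) pair
        self.upd = nn.Linear(2 * dim, dim)  # node update from (state, aggregated message)

    def forward(self, h, adj):
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),   # sender states
                           h.unsqueeze(0).expand(n, n, -1)],  # receiver states
                          dim=-1)
        m = torch.relu(self.msg(pairs)) * adj.unsqueeze(-1)   # mask non-edges
        agg = m.max(dim=0).values                             # max-aggregate over senders
        return torch.relu(self.upd(torch.cat([h, agg], dim=-1)))

dim = 64
tasks = ["sorting", "shortest_paths"]  # illustrative task names, not the full CLRS set
processor = SharedProcessor(dim)       # the single generalist core
encoders = nn.ModuleDict({t: nn.Linear(1, dim) for t in tasks})  # per-task encoders
decoders = nn.ModuleDict({t: nn.Linear(dim, 1) for t in tasks})  # per-task decoders
params = (list(processor.parameters())
          + list(encoders.parameters()) + list(decoders.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(100):
    for t in tasks:                    # round-robin multi-task training
        x = torch.rand(8, 1)           # toy node features; real inputs come from CLRS
        adj = torch.ones(8, 8)         # fully connected toy graph
        y = torch.sort(x, dim=0).values if t == "sorting" else 2 * x  # stand-in targets
        h = encoders[t](x)
        for _ in range(3):             # a few shared processor steps per example
            h = processor(h, adj)
        loss = nn.functional.mse_loss(decoders[t](h), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```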
Related papers
- Training Neural Networks with Internal State, Unconstrained
Connectivity, and Discrete Activations [66.53734987585244]
True intelligence may require the ability of a machine learning model to manage internal state.
We show that we have not yet discovered the most effective algorithms for training such models.
We present one attempt to design such a training algorithm, applied to an architecture with binary activations and only a single matrix of weights.
arXiv Detail & Related papers (2023-12-22T01:19:08Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex than previously thought.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- Neural Algorithmic Reasoning Without Intermediate Supervision [21.852775399735005]
We focus on learning neural algorithmic reasoning from input-output pairs alone, without appealing to intermediate supervision.
We build a self-supervised objective that can regularise intermediate computations of the model without access to the algorithm trajectory.
We demonstrate that our approach is competitive with its trajectory-supervised counterpart on tasks from the CLRS Algorithmic Reasoning Benchmark.
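As a flavour of how intermediate computations can be regularised without trajectories, here is an illustrative sketch that pairs an output-only loss with a consistency term between two stochastic forward passes. This is an assumed stand-in, not the paper's actual self-supervised objective.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the paper's objective: align the intermediate
# states of two dropout-perturbed passes, supervising only the final outputs.
net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Dropout(0.1))
head = nn.Linear(32, 1)
opt = torch.optim.Adam([*net.parameters(), *head.parameters()], lr=1e-3)

x = torch.randn(64, 4)
y = x.sum(dim=1, keepdim=True)  # toy input-output pairs; no algorithm trajectory

for step in range(200):
    h1, h2 = net(x), net(x)                         # two noisy passes over the same input
    out_loss = nn.functional.mse_loss(head(h1), y)  # supervised loss on outputs only
    reg = nn.functional.mse_loss(h1, h2)            # regularise intermediate computations
    loss = out_loss + 0.1 * reg
    opt.zero_grad()
    loss.backward()
    opt.step()
```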
arXiv Detail & Related papers (2023-06-23T09:57:44Z)
- Dual Algorithmic Reasoning [9.701208207491879]
We propose to learn algorithms by exploiting the duality of the underlying algorithmic problem.
We demonstrate that simultaneously learning the dual formulation of the underlying optimisation problem leads to better algorithmic learning.
We then validate the real-world utility of our dual algorithmic reasoner by deploying it on a challenging brain vessel classification task.
arXiv Detail & Related papers (2023-02-09T08:46:23Z)
- Learning with Differentiable Algorithms [6.47243430672461]
This thesis explores combining classic algorithms and machine learning systems like neural networks.
The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm.
In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable sorting gates, and differentiable logic gate networks.
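For intuition, here is a minimal soft odd-even transposition network in the spirit of the differentiable sorting networks above; the sigmoid relaxation and steepness value are illustrative choices, not the thesis's exact construction.

```python
import torch

def soft_cas(a, b, steepness=10.0):
    """Differentiable compare-and-swap: soft (min, max) of a and b."""
    s = torch.sigmoid(steepness * (a - b))  # ~1 when a > b, ~0 otherwise
    return s * b + (1 - s) * a, s * a + (1 - s) * b

def soft_sort(x, steepness=10.0):
    """Odd-even transposition network; n layers sort n elements."""
    vals = list(x)
    n = len(vals)
    for layer in range(n):
        for i in range(layer % 2, n - 1, 2):  # alternate odd/even adjacent pairs
            vals[i], vals[i + 1] = soft_cas(vals[i], vals[i + 1], steepness)
    return torch.stack(vals)

v = torch.tensor([3.0, 1.0, 2.0], requires_grad=True)
out = soft_sort(v)
out.sum().backward()   # gradients flow through the whole sorting network
print(out)             # approximately [1, 2, 3]
```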
arXiv Detail & Related papers (2022-09-01T17:30:00Z)
- The CLRS Algorithmic Reasoning Benchmark [28.789225199559834]
Learning representations of algorithms is an emerging area of machine learning, seeking to bridge concepts from neural networks with classical algorithms.
We propose the CLRS Algorithmic Reasoning Benchmark, covering classical algorithms from the Introduction to Algorithms textbook.
Our benchmark spans a variety of algorithmic reasoning procedures, including sorting, searching, dynamic programming, graph algorithms, string algorithms and geometric algorithms.
arXiv Detail & Related papers (2022-05-31T09:56:44Z)
- The Information Geometry of Unsupervised Reinforcement Learning [133.20816939521941]
Unsupervised skill discovery is a class of algorithms that learn a set of policies without access to a reward function.
We show that unsupervised skill discovery algorithms do not learn skills that are optimal for every possible reward function.
arXiv Detail & Related papers (2021-10-06T13:08:36Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms that generalize well to other classical control tasks, gridworld-type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- Learning to Stop While Learning to Predict [85.7136203122784]
Many algorithm-inspired deep models are restricted to a "fixed depth" for all inputs.
Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances.
In this paper, we tackle this varying depth problem using a steerable architecture.
We show that the learned deep model, together with the stopping policy, improves performance on a diverse set of tasks.
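A minimal sketch of the varying-depth idea, in the spirit of ACT-style halting rather than the paper's exact steerable architecture: a reused block runs until a learned stopping unit's halt probability crosses a threshold, so different inputs receive different depths.

```python
import torch
import torch.nn as nn

class AdaptiveDepthNet(nn.Module):
    """Reuses one block; a learned stopping unit picks the depth per input."""
    def __init__(self, dim, max_depth=10):
        super().__init__()
        self.block = nn.Linear(dim, dim)  # one "layer", applied repeatedly
        self.stop = nn.Linear(dim, 1)     # stopping policy: P(halt | current state)
        self.max_depth = max_depth

    def forward(self, h):
        depth = 0
        for depth in range(1, self.max_depth + 1):
            h = torch.relu(self.block(h))
            if torch.sigmoid(self.stop(h)).mean() > 0.5:  # hard halt (inference-style);
                break                                     # training would mix depths softly
        return h, depth

net = AdaptiveDepthNet(dim=16)
h, depth = net(torch.randn(1, 16))
print("stopped at depth", depth)
```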
arXiv Detail & Related papers (2020-06-09T07:22:01Z)
- AutoML-Zero: Evolving Machine Learning Algorithms From Scratch [76.83052807776276]
We show that it is possible to automatically discover complete machine learning algorithms just using basic mathematical operations as building blocks.
We demonstrate this by introducing a novel framework that significantly reduces human bias through a generic search space.
We believe these preliminary successes in discovering machine learning algorithms from scratch indicate a promising new direction in the field.
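To give a feel for the search space, here is a toy random-search analogue (much simpler than AutoML-Zero's evolutionary search): candidate programs are short sequences of basic arithmetic instructions over a small register file, scored on a toy regression target. The op set and register layout are assumptions for illustration.

```python
import random

OPS = {  # basic mathematical operations as building blocks
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def random_program(length=4, n_regs=3):
    """A program: a list of (op, dst, src1, src2) register instructions."""
    return [(random.choice(list(OPS)), random.randrange(n_regs),
             random.randrange(n_regs), random.randrange(n_regs))
            for _ in range(length)]

def run(program, x, n_regs=3):
    regs = [0.0] * n_regs
    regs[0] = x  # register 0 holds the input, and later the output
    for op, dst, a, b in program:
        regs[dst] = OPS[op](regs[a], regs[b])
    return regs[0]

def score(program, data):
    return sum((run(program, x) - y) ** 2 for x, y in data)

data = [(float(x), float(x * x + x)) for x in range(-5, 6)]  # toy target f(x) = x^2 + x
best = min((random_program() for _ in range(20000)), key=lambda p: score(p, data))
print("best squared-error:", score(best, data), "program:", best)
```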
arXiv Detail & Related papers (2020-03-06T19:00:04Z)