Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with
Recurrent Networks
- URL: http://arxiv.org/abs/2106.04537v1
- Date: Tue, 8 Jun 2021 17:19:48 GMT
- Title: Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with
Recurrent Networks
- Authors: Avi Schwarzschild, Eitan Borgnia, Arjun Gupta, Furong Huang, Uzi
Vishkin, Micah Goldblum, Tom Goldstein
- Abstract summary: We show that recurrent networks trained to solve simple problems can indeed solve much more complex problems simply by performing additional recurrences during inference.
In all three domains, networks trained on simple problem instances are able to extend their reasoning abilities at test time simply by "thinking for longer."
- Score: 47.54459795966417
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks are powerful machines for visual pattern recognition,
but reasoning tasks that are easy for humans may still be difficult for neural
models. Humans possess the ability to extrapolate reasoning strategies learned
on simple problems to solve harder examples, often by thinking for longer. For
example, a person who has learned to solve small mazes can easily extend the
very same search techniques to solve much larger mazes by spending more time.
In computers, this behavior is often achieved through the use of algorithms,
which scale to arbitrarily hard problem instances at the cost of more
computation. In contrast, the sequential computing budget of feed-forward
neural networks is limited by their depth, and networks trained on simple
problems have no way of extending their reasoning to accommodate harder
problems. In this work, we show that recurrent networks trained to solve simple
problems with few recurrent steps can indeed solve much more complex problems
simply by performing additional recurrences during inference. We demonstrate
this algorithmic behavior of recurrent networks on prefix sum computation,
mazes, and chess. In all three domains, networks trained on simple problem
instances are able to extend their reasoning abilities at test time simply by
"thinking for longer."
Related papers
- Adaptive recurrent vision performs zero-shot computation scaling to
unseen difficulty levels [6.053394076324473]
We investigate whether adaptive computation can also enable vision models to extrapolate solutions beyond their training distribution's difficulty level.
We combine convolutional recurrent neural networks (ConvRNNs) with a learnable halting mechanism based on Graves (2016), and evaluate the resulting adaptive recurrent networks (AdRNNs) on two visual reasoning tasks: PathFinder and Mazes.
We show that 1) AdRNNs learn to dynamically halt processing early (or late) to solve easier (or harder) problems, and 2) these RNNs zero-shot generalize to more difficult problem settings not shown during training by dynamically increasing the number of recurrent iterations at test time (a minimal halting sketch appears after this list).
arXiv Detail & Related papers (2023-11-12T21:07:04Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex than expected: even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- Are Deep Neural Networks SMARTer than Second Graders? [85.60342335636341]
We evaluate the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6-8 age group.
Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and its solution requires a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning.
Experiments reveal that while powerful deep models offer reasonable performance on puzzles in a supervised setting, they are no better than random accuracy when analyzed for generalization.
arXiv Detail & Related papers (2022-12-20T04:33:32Z)
- Learning Iterative Reasoning through Energy Minimization [77.33859525900334]
We present a new framework for iterative reasoning with neural networks.
We train a neural network to parameterize an energy landscape over all outputs.
We implement each step of iterative reasoning as an energy minimization step, converging to a minimum-energy solution (a minimal sketch appears after this list).
arXiv Detail & Related papers (2022-06-30T17:44:20Z)
- End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking [52.05847268235338]
We show how machine learning systems can perform logical extrapolation without overthinking problems.
We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten.
We also employ a progressive training routine that prevents the model from learning behaviors that are specific to iteration number and instead pushes it to learn behaviors that can be repeated indefinitely (sketches of both ideas appear after this list).
arXiv Detail & Related papers (2022-02-11T18:43:28Z)
- Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems [51.132938969015825]
We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks that are trained to solve simple mazes with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
arXiv Detail & Related papers (2021-02-22T14:09:20Z)
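For the adaptive recurrent vision entry above, the halting idea can be sketched as a recurrent block paired with a learned per-step halting score, loosely in the spirit of Graves-style adaptive computation time. This is a hypothetical, simplified inference-time sketch: the layer sizes, global pooling, and threshold are assumptions, and the ponder-cost training objective is omitted.

```python
import torch
import torch.nn as nn

class AdaptiveRecurrentNet(nn.Module):
    """Recurrent ConvNet with a learned halting score per iteration."""
    def __init__(self, channels=64, out_classes=2):
        super().__init__()
        self.embed = nn.Conv2d(3, channels, 3, padding=1)
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.halt = nn.Linear(channels, 1)   # scalar halting score per step
        self.head = nn.Conv2d(channels, out_classes, 3, padding=1)

    def forward(self, x, max_iters=50, threshold=0.99):
        h = self.embed(x)
        cum_halt = torch.zeros(x.size(0), device=x.device)
        steps = 0
        for steps in range(1, max_iters + 1):
            h = h + self.block(h)
            # Halting probability from globally pooled features.
            p = torch.sigmoid(self.halt(h.mean(dim=(2, 3)))).squeeze(-1)
            cum_halt = cum_halt + p
            if bool((cum_halt >= threshold).all()):
                break                        # easy inputs halt early, hard ones run longer
        return self.head(h), steps
```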
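For the energy-minimization entry, iterative reasoning can be sketched as gradient descent on a learned energy function over candidate outputs: each reasoning step lowers the energy, and harder problems can simply be given more steps. The network sizes, step counts, and toy vector task below are illustrative assumptions; training of the energy function is omitted.

```python
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Scores (input, candidate output) pairs; lower energy = better answer."""
    def __init__(self, x_dim=16, y_dim=16, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(x_dim + y_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        return self.mlp(torch.cat([x, y], dim=-1)).squeeze(-1)

def iterative_reasoning(energy, x, y_dim, steps=20, lr=0.1):
    """Each reasoning step is one gradient step on the energy landscape."""
    y = torch.zeros(x.size(0), y_dim, requires_grad=True)
    for _ in range(steps):
        e = energy(x, y).sum()
        (grad,) = torch.autograd.grad(e, y)
        y = (y - lr * grad).detach().requires_grad_(True)
    return y.detach()

energy = EnergyNet()
x = torch.randn(4, 16)
answer = iterative_reasoning(energy, x, y_dim=16, steps=50)  # more steps = more "thinking"
```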
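For the recall-architecture entry, the two ideas in its summary can be sketched as (1) a recurrent block that re-reads the raw problem instance at every iteration, so it cannot be forgotten, and (2) a training step that begins from a random number of gradient-free iterations, so learned behavior cannot key on the absolute iteration index. Both are hypothetical sketches of the stated ideas, not the authors' exact architecture or training routine.

```python
import torch
import torch.nn as nn

class RecallRecurrentNet(nn.Module):
    """Recurrent ConvNet whose block sees the raw input at every iteration."""
    def __init__(self, in_ch=3, channels=64, out_classes=2):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, channels, 3, padding=1)
        self.block = nn.Sequential(
            nn.Conv2d(channels + in_ch, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(channels, out_classes, 3, padding=1)

    def step(self, h, x):
        return self.block(torch.cat([h, x], dim=1))  # "recall" the problem instance

    def forward(self, x, iters):
        h = self.embed(x)
        for _ in range(iters):
            h = self.step(h, x)
        return self.head(h)

def progressive_train_step(model, x, target, optimizer, loss_fn, max_iters=30):
    """Start from a random number of gradient-free iterations, then backpropagate
    through a further random number of iterations (an assumed simplification)."""
    n = torch.randint(0, max_iters, (1,)).item()          # gradient-free prefix
    k = torch.randint(1, max_iters - n + 1, (1,)).item()  # tracked suffix
    with torch.no_grad():
        h = model.embed(x)
        for _ in range(n):
            h = model.step(h, x)
    for _ in range(k):
        h = model.step(h, x)
    loss = loss_fn(model.head(h), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```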
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all linked content) and is not responsible for any consequences of its use.