Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems
- URL: http://arxiv.org/abs/2102.11011v1
- Date: Mon, 22 Feb 2021 14:09:20 GMT
- Title: Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems
- Authors: Avi Schwarzschild, Arjun Gupta, Micah Goldblum, Tom Goldstein
- Abstract summary: We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks that are trained to solve simple mazes with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
- Score: 51.132938969015825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks are powerful machines for visual pattern recognition,
but reasoning tasks that are easy for humans may still be difficult for neural
models. Humans can extrapolate simple reasoning strategies to solve difficult
problems using long sequences of abstract manipulations, i.e., harder problems
are solved by thinking for longer. In contrast, the sequential computing budget
of feed-forward networks is limited by their depth, and networks trained on
simple problems have no way of extending their reasoning capabilities without
retraining. In this work, we observe that recurrent networks have the uncanny
ability to closely emulate the behavior of non-recurrent deep models, often
doing so with far fewer parameters, on both image classification and maze
solving tasks. We also explore whether recurrent networks can make the
generalization leap from simple problems to hard problems simply by increasing
the number of recurrent iterations used at test time. To this end, we show that
recurrent networks that are trained to solve simple mazes with few recurrent
steps can indeed solve much more complex problems simply by performing
additional recurrences during inference.
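Because a recurrent network reuses the same weights at every step, its sequential compute can be extended at inference simply by running more iterations. Below is a minimal PyTorch sketch of this setup, assuming a small weight-tied convolutional block for maze inputs; the channel counts, layer sizes, and the `n_iters` argument are illustrative, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class RecurrentMazeSolver(nn.Module):
    """Weight-tied recurrent network: one conv block applied n_iters times.
    Illustrative sketch, not the paper's exact architecture."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)   # maze image -> features
        self.recur = nn.Sequential(                          # shared (weight-tied) block
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.decode = nn.Conv2d(channels, 2, 3, padding=1)   # per-pixel on/off-path logits

    def forward(self, x: torch.Tensor, n_iters: int = 10) -> torch.Tensor:
        h = torch.relu(self.encode(x))
        for _ in range(n_iters):   # more iterations = more sequential compute ("thinking")
            h = self.recur(h)
        return self.decode(h)

model = RecurrentMazeSolver()
easy_maze = torch.randn(1, 3, 32, 32)   # train-time difficulty, few iterations
hard_maze = torch.randn(1, 3, 64, 64)   # harder instance at test time
_ = model(easy_maze, n_iters=10)
_ = model(hard_maze, n_iters=50)        # same weights, just "think for longer"
```

Since the block's parameters are shared across iterations, the parameter count stays fixed while the effective depth grows with `n_iters`, which is what allows harder instances to be handled at test time by thinking for longer.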
Related papers
- Adaptive recurrent vision performs zero-shot computation scaling to unseen difficulty levels [6.053394076324473]
We investigate whether adaptive computation can also enable vision models to extrapolate solutions beyond their training distribution's difficulty level.
We combine convolutional recurrent neural networks (ConvRNNs) with a learnable halting mechanism based on Graves (2016), evaluated on two visual reasoning tasks: PathFinder and Mazes.
We show that 1) AdRNNs learn to dynamically halt processing early (or late) to solve easier (or harder) problems, and 2) these RNNs zero-shot generalize to more difficult problem settings not shown during training by dynamically increasing the number of recurrent iterations at test time.
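For reference, here is a rough sketch of what such a learnable halting mechanism can look like, in the spirit of adaptive computation time (Graves, 2016); the GRU cell, halting head, and threshold below are illustrative assumptions rather than the paper's exact AdRNN.

```python
import torch
import torch.nn as nn

class AdaptiveRecurrentCell(nn.Module):
    """Recurrent step with a learned halting score (ACT-style sketch)."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.step = nn.GRUCell(dim, dim)
        self.halt = nn.Linear(dim, 1)   # per-step halting probability

    def forward(self, x, h, max_steps: int = 20, threshold: float = 0.99):
        cum_halt = torch.zeros(x.shape[0], device=x.device)
        for steps in range(1, max_steps + 1):
            h = self.step(x, h)
            cum_halt = cum_halt + torch.sigmoid(self.halt(h)).squeeze(-1)
            if bool((cum_halt >= threshold).all()):   # easy inputs halt early, hard ones run longer
                break
        return h, steps

cell = AdaptiveRecurrentCell()
x = torch.randn(4, 64)
h0 = torch.zeros(4, 64)
h, used = cell(x, h0)   # at test time, max_steps can simply be raised for harder inputs
```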
arXiv Detail & Related papers (2023-11-12T21:07:04Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex than recovering a single clean mechanism.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- Are Deep Neural Networks SMARTer than Second Graders? [85.60342335636341]
We evaluate the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6-8 age group.
Our dataset consists of 101 unique puzzles; each puzzle comprises a picture-based question, and its solution requires a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning.
Experiments reveal that while powerful deep models offer reasonable performance on puzzles in a supervised setting, they are not better than random accuracy when analyzed for generalization.
arXiv Detail & Related papers (2022-12-20T04:33:32Z)
- Learning Iterative Reasoning through Energy Minimization [77.33859525900334]
We present a new framework for iterative reasoning with neural networks.
We train a neural network to parameterize an energy landscape over all outputs.
We implement each step of the iterative reasoning as an energy minimization step to find a minimal energy solution.
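A hedged sketch of this framework: a network parameterizes an energy E(x, y) over candidate outputs, and each reasoning step is one gradient-descent update on y. The small MLP energy, the step size, and the dimensions below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Scores how well candidate output y fits input x (lower = better)."""
    def __init__(self, x_dim: int = 16, y_dim: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(x_dim + y_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x, y):
        return self.mlp(torch.cat([x, y], dim=-1)).squeeze(-1)

def reason(energy: EnergyNet, x, n_steps: int = 10, step_size: float = 0.1):
    """Each reasoning step = one gradient step on the energy landscape over y."""
    y = torch.zeros(x.shape[0], 8, requires_grad=True)
    for _ in range(n_steps):            # harder problems -> run more minimization steps
        e = energy(x, y).sum()
        (grad,) = torch.autograd.grad(e, y)
        y = (y - step_size * grad).detach().requires_grad_(True)
    return y.detach()

energy = EnergyNet()
x = torch.randn(4, 16)
y_hat = reason(energy, x, n_steps=10)   # at test time, n_steps can be increased
```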
arXiv Detail & Related papers (2022-06-30T17:44:20Z)
- End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking [52.05847268235338]
We show how machine learning systems can perform logical extrapolation without overthinking problems.
We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten.
We also employ a progressive training routine that prevents the model from learning behaviors specific to the number of iterations used during training and instead pushes it to learn behaviors that can be repeated indefinitely.
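A minimal sketch of the recall idea as summarized above, assuming a weight-tied convolutional block: a fresh copy of the problem instance is concatenated to the features at every recurrent step, so extra iterations cannot erase it. The channel counts and shapes are illustrative.

```python
import torch
import torch.nn as nn

class RecallRecurrentBlock(nn.Module):
    """Weight-tied block that re-reads the raw problem instance at every step."""
    def __init__(self, in_ch: int = 3, feat_ch: int = 32):
        super().__init__()
        # the block sees its own features *plus* a fresh copy of the input
        self.block = nn.Sequential(
            nn.Conv2d(feat_ch + in_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )

    def forward(self, x: torch.Tensor, h: torch.Tensor, n_iters: int = 30) -> torch.Tensor:
        for _ in range(n_iters):
            h = self.block(torch.cat([h, x], dim=1))   # "recall": the instance is never forgotten
        return h

block = RecallRecurrentBlock()
x = torch.randn(1, 3, 32, 32)     # problem instance (e.g., a maze image)
h = torch.zeros(1, 32, 32, 32)    # initial features
h = block(x, h, n_iters=100)      # extra iterations at test time do not lose the input
```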
arXiv Detail & Related papers (2022-02-11T18:43:28Z)
- Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks [47.54459795966417]
We show that recurrent networks trained to solve simple problems can indeed solve much more complex problems simply by performing additional recurrences during inference.
In all three domains, networks trained on simple problem instances are able to extend their reasoning abilities at test time simply by "thinking for longer."
arXiv Detail & Related papers (2021-06-08T17:19:48Z)
- Verifying Recurrent Neural Networks using Invariant Inference [0.0]
We propose a novel approach for verifying properties of a widespread variant of neural networks, called recurrent neural networks.
Our approach is based on the inference of invariants, which allow us to reduce the complex problem of verifying recurrent networks into simpler, non-recurrent problems.
arXiv Detail & Related papers (2020-04-06T08:08:24Z)