End-to-end Algorithm Synthesis with Recurrent Networks: Logical
Extrapolation Without Overthinking
- URL: http://arxiv.org/abs/2202.05826v2
- Date: Tue, 15 Feb 2022 14:38:12 GMT
- Title: End-to-end Algorithm Synthesis with Recurrent Networks: Logical
Extrapolation Without Overthinking
- Authors: Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong
Huang, Micah Goldblum, Tom Goldstein
- Abstract summary: We show how machine learning systems can perform logical extrapolation without overthinking problems.
We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten.
We also employ a progressive training routine that prevents the model from learning behaviors that are specific to iteration number and instead pushes it to learn behaviors that can be repeated indefinitely.
- Score: 52.05847268235338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning systems perform well on pattern matching tasks, but their
ability to perform algorithmic or logical reasoning is not well understood. One
important reasoning capability is logical extrapolation, in which models
trained only on small/simple reasoning problems can synthesize complex
algorithms that scale up to large/complex problems at test time. Logical
extrapolation can be achieved through recurrent systems, which can be iterated
many times to solve difficult reasoning problems. We observe that this approach
fails to scale to highly complex problems because behavior degenerates when
many iterations are applied -- an issue we refer to as "overthinking." We
propose a recall architecture that keeps an explicit copy of the problem
instance in memory so that it cannot be forgotten. We also employ a progressive
training routine that prevents the model from learning behaviors that are
specific to iteration number and instead pushes it to learn behaviors that can
be repeated indefinitely. These innovations prevent the overthinking problem,
and enable recurrent systems to solve extremely hard logical extrapolation
tasks, some requiring over 100K convolutional layers, without overthinking.
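To make the recall idea concrete, here is a minimal PyTorch-style sketch (illustrative only, not the authors' released code; layer widths and names are assumptions): the original problem instance is concatenated onto the recurrent features before every iteration, so it can never be forgotten no matter how many iterations are applied.

```python
import torch
import torch.nn as nn

class RecallRecurrentBlock(nn.Module):
    """Recurrent block that re-reads the original input at every iteration.

    Hypothetical sketch of a recall architecture: the raw problem instance x
    is concatenated to the current features before each recurrent step.
    """

    def __init__(self, in_channels: int, width: int = 64):
        super().__init__()
        self.project = nn.Conv2d(in_channels, width, kernel_size=3, padding=1)
        # The recurrent convolution sees [features ; original input].
        self.recur = nn.Sequential(
            nn.Conv2d(width + in_channels, width, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(width, width, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.head = nn.Conv2d(width, in_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor, iterations: int) -> torch.Tensor:
        h = torch.relu(self.project(x))
        for _ in range(iterations):
            # Recall: concatenate the untouched problem instance each step.
            h = self.recur(torch.cat([h, x], dim=1))
        return self.head(h)
```

At test time, harder instances are handled simply by passing a larger `iterations` value than was used during training; the recall connection is what keeps those extra iterations anchored to the original problem.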
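The progressive training routine can be approximated along the following lines (again a hedged sketch reusing the RecallRecurrentBlock above; the paper's exact schedule and loss may differ): a random prefix of iterations is run without gradients, and the loss is applied only to the output produced after the remaining iterations, which discourages behaviors tied to a specific iteration index.

```python
import random
import torch
import torch.nn.functional as F

def progressive_training_step(model, x, target, optimizer, max_iters: int = 30):
    """One training step in the spirit of the progressive routine (the exact
    schedule here is an assumption): a random prefix of recurrent iterations
    runs without gradients, and only the remaining iterations are trained.
    `model` is assumed to be a RecallRecurrentBlock as sketched above.
    """
    n = random.randint(0, max_iters - 1)   # gradient-free warm-up iterations
    k = max_iters - n                      # iterations that receive gradients

    with torch.no_grad():
        h = torch.relu(model.project(x))
        for _ in range(n):
            h = model.recur(torch.cat([h, x], dim=1))

    # Because the warm-up features carry no gradient history, the trained
    # behavior cannot depend on which iteration index it starts from.
    for _ in range(k):
        h = model.recur(torch.cat([h, x], dim=1))
    logits = model.head(h)

    loss = F.cross_entropy(logits, target)  # e.g. per-pixel classification
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```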
Related papers
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of
Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex than expected.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- The Art of SOCRATIC QUESTIONING: Recursive Thinking with Large Language Models [45.01562498702836]
Chain-of-Thought (CoT) prompting enables large language models to solve complex reasoning problems by generating intermediate steps.
We propose SOCRATIC QUESTIONING, a divide-and-conquer style algorithm that mimics the recursive thinking process.
arXiv Detail & Related papers (2023-05-24T10:36:14Z)
- Chaining Simultaneous Thoughts for Numerical Reasoning [92.2007997126144]
Numerical reasoning over text should be an essential skill of AI systems.
Previous work focused on modeling the structures of equations, and has proposed various structured decoders.
We propose CANTOR, a numerical reasoner that models reasoning steps using a directed acyclic graph.
arXiv Detail & Related papers (2022-11-29T18:52:06Z)
- Learning Iterative Reasoning through Energy Minimization [77.33859525900334]
We present a new framework for iterative reasoning with neural networks.
We train a neural network to parameterize an energy landscape over all outputs.
We implement each step of the iterative reasoning as an energy minimization step to find a minimal energy solution; a minimal sketch of this idea appears after this list.
arXiv Detail & Related papers (2022-06-30T17:44:20Z)
- Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks [47.54459795966417]
We show that recurrent networks trained to solve simple problems can indeed solve much more complex problems simply by performing additional recurrences during inference.
In all three domains (prefix sums, mazes, and chess puzzles), networks trained on simple problem instances are able to extend their reasoning abilities at test time simply by "thinking for longer".
arXiv Detail & Related papers (2021-06-08T17:19:48Z)
- Differentiable Logic Machines [38.21461039738474]
We propose a novel neural-logic architecture called the differentiable logic machine (DLM).
DLM can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems.
On RL problems, without requiring an interpretable solution, DLM outperforms other non-interpretable neural-logic RL approaches.
arXiv Detail & Related papers (2021-02-23T07:31:52Z)
- Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems [51.132938969015825]
We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks that are trained to solve simple mazes with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
arXiv Detail & Related papers (2021-02-22T14:09:20Z)
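For the energy-minimization entry above ("Learning Iterative Reasoning through Energy Minimization"), the general idea can be sketched as follows (a hypothetical PyTorch example, not that paper's implementation; the architecture and step size are assumptions): a network assigns an energy to each (problem, candidate answer) pair, and every reasoning step is one gradient-descent update on the answer.

```python
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Scores (problem, candidate answer) pairs; lower energy = better answer.
    Purely illustrative architecture."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return self.score(torch.cat([x, y], dim=-1)).squeeze(-1)


def iterative_reasoning(energy: EnergyNet, x: torch.Tensor,
                        steps: int = 20, lr: float = 0.1) -> torch.Tensor:
    """Each reasoning step is one gradient step on the answer y that lowers
    the learned energy; harder problems can simply be given more steps."""
    y = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        e = energy(x, y).sum()
        (grad,) = torch.autograd.grad(e, y)
        y = (y - lr * grad).detach().requires_grad_(True)
    return y.detach()
```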