End-to-end Algorithm Synthesis with Recurrent Networks: Logical
Extrapolation Without Overthinking
- URL: http://arxiv.org/abs/2202.05826v2
- Date: Tue, 15 Feb 2022 14:38:12 GMT
- Title: End-to-end Algorithm Synthesis with Recurrent Networks: Logical
Extrapolation Without Overthinking
- Authors: Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong
Huang, Micah Goldblum, Tom Goldstein
- Abstract summary: We show how machine learning systems can perform logical extrapolation without overthinking problems.
We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten.
We also employ a progressive training routine that prevents the model from learning behaviors that are specific to iteration number and instead pushes it to learn behaviors that can be repeated indefinitely.
- Score: 52.05847268235338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning systems perform well on pattern matching tasks, but their
ability to perform algorithmic or logical reasoning is not well understood. One
important reasoning capability is logical extrapolation, in which models
trained only on small/simple reasoning problems can synthesize complex
algorithms that scale up to large/complex problems at test time. Logical
extrapolation can be achieved through recurrent systems, which can be iterated
many times to solve difficult reasoning problems. We observe that this approach
fails to scale to highly complex problems because behavior degenerates when
many iterations are applied -- an issue we refer to as "overthinking." We
propose a recall architecture that keeps an explicit copy of the problem
instance in memory so that it cannot be forgotten. We also employ a progressive
training routine that prevents the model from learning behaviors that are
specific to iteration number and instead pushes it to learn behaviors that can
be repeated indefinitely. These innovations prevent the overthinking problem,
and enable recurrent systems to solve extremely hard logical extrapolation
tasks, some requiring over 100K convolutional layers, without overthinking.
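To make the recall idea concrete, here is a minimal PyTorch-style sketch (illustrative only, not the authors' released code; layer widths and names are assumptions): the original problem instance is concatenated onto the recurrent features before every iteration, so it can never be forgotten no matter how many iterations are applied.

```python
import torch
import torch.nn as nn

class RecallRecurrentBlock(nn.Module):
    """Recurrent block that re-reads the original input at every iteration.

    Hypothetical sketch of a recall architecture: the raw problem instance x
    is concatenated to the current features before each recurrent step.
    """

    def __init__(self, in_channels: int, width: int = 64):
        super().__init__()
        self.project = nn.Conv2d(in_channels, width, kernel_size=3, padding=1)
        # The recurrent convolution sees [features ; original input].
        self.recur = nn.Sequential(
            nn.Conv2d(width + in_channels, width, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(width, width, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.head = nn.Conv2d(width, in_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor, iterations: int) -> torch.Tensor:
        h = torch.relu(self.project(x))
        for _ in range(iterations):
            # Recall: concatenate the untouched problem instance each step.
            h = self.recur(torch.cat([h, x], dim=1))
        return self.head(h)
```

At test time, harder instances are handled simply by passing a larger `iterations` value than was used during training; the recall connection is what keeps those extra iterations anchored to the original problem.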
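The progressive training routine can be approximated along the following lines (again a hedged sketch reusing the RecallRecurrentBlock above; the paper's exact schedule and loss may differ): a random prefix of iterations is run without gradients, and the loss is applied only to the output produced after the remaining iterations, which discourages behaviors tied to a specific iteration index.

```python
import random
import torch
import torch.nn.functional as F

def progressive_training_step(model, x, target, optimizer, max_iters: int = 30):
    """One training step in the spirit of the progressive routine (the exact
    schedule here is an assumption): a random prefix of recurrent iterations
    runs without gradients, and only the remaining iterations are trained.
    `model` is assumed to be a RecallRecurrentBlock as sketched above.
    """
    n = random.randint(0, max_iters - 1)   # gradient-free warm-up iterations
    k = max_iters - n                      # iterations that receive gradients

    with torch.no_grad():
        h = torch.relu(model.project(x))
        for _ in range(n):
            h = model.recur(torch.cat([h, x], dim=1))

    # Because the warm-up features carry no gradient history, the trained
    # behavior cannot depend on which iteration index it starts from.
    for _ in range(k):
        h = model.recur(torch.cat([h, x], dim=1))
    logits = model.head(h)

    loss = F.cross_entropy(logits, target)  # e.g. per-pixel classification
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```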
Related papers
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of
Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex than expected.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- The Art of SOCRATIC QUESTIONING: Recursive Thinking with Large Language Models [45.01562498702836]
Chain-of-Thought (CoT) prompting enables large language models to solve complex reasoning problems by generating intermediate steps.
We propose SOCRATIC QUESTIONING, a divide-and-conquer style algorithm that mimics the recursive thinking process.
arXiv Detail & Related papers (2023-05-24T10:36:14Z)
- Chaining Simultaneous Thoughts for Numerical Reasoning [92.2007997126144]
Numerical reasoning over text should be an essential skill of AI systems.
Previous work focused on modeling the structures of equations, and has proposed various structured decoders.
We propose CANTOR, a numerical reasoner that models reasoning steps using a directed acyclic graph.
arXiv Detail & Related papers (2022-11-29T18:52:06Z)
- Learning Iterative Reasoning through Energy Minimization [77.33859525900334]
We present a new framework for iterative reasoning with neural networks.
We train a neural network to parameterize an energy landscape over all outputs.
We implement each step of the iterative reasoning as an energy minimization step to find a minimal energy solution; a minimal sketch of this idea appears after this list.
arXiv Detail & Related papers (2022-06-30T17:44:20Z)
- Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks [47.54459795966417]
We show that recurrent networks trained to solve simple problems can indeed solve much more complex problems simply by performing additional recurrences during inference.
In all three domains (prefix sums, mazes, and chess puzzles), networks trained on simple problem instances are able to extend their reasoning abilities at test time simply by "thinking for longer".
arXiv Detail & Related papers (2021-06-08T17:19:48Z)
- Differentiable Logic Machines [38.21461039738474]
We propose a novel neural-logic architecture called the differentiable logic machine (DLM).
DLM can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems.
On RL problems, without requiring an interpretable solution, DLM outperforms other non-interpretable neural-logic RL approaches.
arXiv Detail & Related papers (2021-02-23T07:31:52Z)
- Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems [51.132938969015825]
We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks that are trained to solve simple mazes with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
arXiv Detail & Related papers (2021-02-22T14:09:20Z)
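For the energy-minimization entry above ("Learning Iterative Reasoning through Energy Minimization"), the general idea can be sketched as follows (a hypothetical PyTorch example, not that paper's implementation; the architecture and step size are assumptions): a network assigns an energy to each (problem, candidate answer) pair, and every reasoning step is one gradient-descent update on the answer.

```python
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Scores (problem, candidate answer) pairs; lower energy = better answer.
    Purely illustrative architecture."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return self.score(torch.cat([x, y], dim=-1)).squeeze(-1)


def iterative_reasoning(energy: EnergyNet, x: torch.Tensor,
                        steps: int = 20, lr: float = 0.1) -> torch.Tensor:
    """Each reasoning step is one gradient step on the answer y that lowers
    the learned energy; harder problems can simply be given more steps."""
    y = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        e = energy(x, y).sum()
        (grad,) = torch.autograd.grad(e, y)
        y = (y - lr * grad).detach().requires_grad_(True)
    return y.detach()
```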