Learning Iterative Reasoning through Energy Minimization
- URL: http://arxiv.org/abs/2206.15448v1
- Date: Thu, 30 Jun 2022 17:44:20 GMT
- Title: Learning Iterative Reasoning through Energy Minimization
- Authors: Yilun Du, Shuang Li, Joshua B. Tenenbaum, Igor Mordatch
- Abstract summary: We present a new framework for iterative reasoning with neural networks.
We train a neural network to parameterize an energy landscape over all outputs.
We implement each step of the iterative reasoning as an energy minimization step to find a minimal energy solution.
- Score: 77.33859525900334
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning has excelled on complex pattern recognition tasks such as image
classification and object recognition. However, it struggles with tasks
requiring nontrivial reasoning, such as algorithmic computation. Humans are
able to solve such tasks through iterative reasoning -- spending more time
thinking about harder tasks. Most existing neural networks, however, exhibit a
fixed computational budget controlled by the neural network architecture,
preventing additional computational processing on harder tasks. In this work,
we present a new framework for iterative reasoning with neural networks. We
train a neural network to parameterize an energy landscape over all outputs,
and implement each step of the iterative reasoning as an energy minimization
step to find a minimal energy solution. By formulating reasoning as an energy
minimization problem, for harder problems that lead to more complex energy
landscapes, we may then adjust our underlying computational budget by running a
more complex optimization procedure. We empirically illustrate that our
iterative reasoning approach yields more accurate and generalizable solutions
to algorithmic reasoning tasks in both graph and continuous domains. Finally,
we show that our approach can recursively solve algorithmic problems requiring
nested reasoning.
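To make the idea concrete, below is a minimal sketch (not the authors' released code) of reasoning as energy minimization in PyTorch: a network assigns an energy to a candidate answer, and each reasoning step is one gradient-descent update of the answer on that learned landscape. The `EnergyNet` architecture, step size, and step counts are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Scores a candidate answer y for a problem x; lower energy means a better answer."""
    def __init__(self, x_dim, y_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + y_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

def reason(energy_fn, x, y_dim, num_steps=20, step_size=0.1):
    """Iterative reasoning: each step is one gradient-descent update of the
    candidate answer y on the learned energy landscape."""
    y = torch.zeros(x.shape[0], y_dim, requires_grad=True)
    for _ in range(num_steps):
        energy = energy_fn(x, y).sum()
        grad, = torch.autograd.grad(energy, y)
        y = (y - step_size * grad).detach().requires_grad_(True)
    return y.detach()

# Usage: more optimization steps allocate more "thinking" to a harder instance.
model = EnergyNet(x_dim=16, y_dim=4)
x = torch.randn(8, 16)
y_easy = reason(model, x, y_dim=4, num_steps=10)
y_hard = reason(model, x, y_dim=4, num_steps=100)
```

Because the answer is produced by an optimization loop rather than a fixed forward pass, the computational budget can be raised at inference time simply by running more minimization steps.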
Related papers
- Learning Iterative Reasoning through Energy Diffusion [90.24765095498392]
We introduce iterative reasoning through energy diffusion (IRED), a novel framework for learning to reason for a variety of tasks.
IRED learns energy functions to represent the constraints between input conditions and desired outputs.
We show IRED outperforms existing methods in continuous-space reasoning, discrete-space reasoning, and planning tasks.
arXiv Detail & Related papers (2024-06-17T03:36:47Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking [52.05847268235338]
We show how machine learning systems can perform logical extrapolation without overthinking problems.
We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten.
We also employ a progressive training routine that prevents the model from learning behaviors that are specific to iteration number and instead pushes it to learn behaviors that can be repeated indefinitely.
arXiv Detail & Related papers (2022-02-11T18:43:28Z)
- Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks [47.54459795966417]
We show that recurrent networks trained to solve simple problems can indeed solve much more complex problems simply by performing additional recurrences during inference.
In all three domains, networks trained on simple problem instances are able to extend their reasoning abilities at test time simply by "thinking for longer" (see the sketch after this list).
arXiv Detail & Related papers (2021-06-08T17:19:48Z)
- Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems [51.132938969015825]
We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks that are trained to solve simple mazes with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
arXiv Detail & Related papers (2021-02-22T14:09:20Z)
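The recurrent-network papers above describe extending reasoning at test time by running additional recurrences. Below is a minimal sketch of that idea, assuming a weight-tied recurrent block; the module sizes, names, and iteration counts are illustrative and do not reproduce any specific paper's architecture.

```python
import torch
import torch.nn as nn

class RecurrentReasoner(nn.Module):
    """A weight-tied recurrent block: the same step is applied repeatedly, so the
    number of iterations (the compute budget) can be raised at test time."""
    def __init__(self, dim=64):
        super().__init__()
        self.encode = nn.Linear(dim, dim)
        self.step = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())  # shared across iterations
        self.decode = nn.Linear(dim, dim)

    def forward(self, x, num_iters=10):
        h = self.encode(x)
        for _ in range(num_iters):
            h = self.step(h)
        return self.decode(h)

model = RecurrentReasoner()
x = torch.randn(4, 64)
y_train_budget = model(x, num_iters=10)  # budget used when training on easy instances
y_test_budget = model(x, num_iters=50)   # "thinking for longer" on harder instances at test time
```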
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.