Learning Iterative Reasoning through Energy Diffusion
- URL: http://arxiv.org/abs/2406.11179v1
- Date: Mon, 17 Jun 2024 03:36:47 GMT
- Title: Learning Iterative Reasoning through Energy Diffusion
- Authors: Yilun Du, Jiayuan Mao, Joshua B. Tenenbaum
- Abstract summary: We introduce iterative reasoning through energy diffusion (IRED), a novel framework for learning to reason for a variety of tasks.
IRED learns energy functions to represent the constraints between input conditions and desired outputs.
We show IRED outperforms existing methods in continuous-space reasoning, discrete-space reasoning, and planning tasks.
- Score: 90.24765095498392
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce iterative reasoning through energy diffusion (IRED), a novel framework for learning to reason for a variety of tasks by formulating reasoning and decision-making problems with energy-based optimization. IRED learns energy functions to represent the constraints between input conditions and desired outputs. After training, IRED adapts the number of optimization steps during inference based on problem difficulty, enabling it to solve problems outside its training distribution -- such as more complex Sudoku puzzles, matrix completion with large value magnitudes, and pathfinding in larger graphs. Key to our method's success are two novel techniques: learning a sequence of annealed energy landscapes for easier inference and a combination of score function and energy landscape supervision for faster and more stable training. Our experiments show that IRED outperforms existing methods in continuous-space reasoning, discrete-space reasoning, and planning tasks, particularly in more challenging scenarios. Code and visualizations are available at https://energy-based-model.github.io/ired/
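The inference idea in the abstract (minimize an energy over candidate outputs, moving through a sequence of annealed landscapes, with a step count that depends on the problem) can be illustrated with a toy sketch. This is not the IRED implementation: IRED learns its energy functions with neural networks, whereas the hand-written polynomial energy, the Gaussian smoothing used to build the annealed landscapes, and the names `energy`, `energy_grad`, and `minimize` are all assumptions made only for illustration.

```python
def energy(x, y):
    # Toy (assumed) energy: two minima in d = y - x, with the lower one near d = -1.04.
    d = y - x
    return (d**2 - 1.0) ** 2 + 0.3 * d

def energy_grad(x, y, sigma):
    # Gradient in y of the Gaussian-smoothed energy E_eps[energy(x, y + sigma * eps)].
    # For this polynomial the smoothing is available in closed form;
    # sigma = 0 recovers the gradient of the unsmoothed energy.
    d = y - x
    return 4.0 * d**3 + (12.0 * sigma**2 - 4.0) * d + 0.3

def minimize(x, y0, sigmas, step=0.02, tol=1e-6, max_steps=5000):
    # Gradient descent through a sequence of landscapes, smoothest first.
    # Each landscape is optimized until its gradient is small, so the total
    # number of steps adapts to how far the start is from a solution.
    y, used = float(y0), 0
    for sigma in sigmas:
        for _ in range(max_steps):
            g = energy_grad(x, y, sigma)
            if abs(g) < tol:
                break
            y -= step * g
            used += 1
    return y, used

# Annealed descent versus descent on the final landscape only.
y_annealed, steps_annealed = minimize(3.0, 5.0, sigmas=[1.0, 0.5, 0.0])
y_direct, steps_direct = minimize(3.0, 5.0, sigmas=[0.0])
print(y_annealed, energy(3.0, y_annealed))
print(y_direct, energy(3.0, y_direct))
```

In this toy setting, descent on the unsmoothed landscape alone stops in the nearer, higher-energy minimum, while the annealed sequence starts from a single-basin landscape and follows its minimum down to the lower-energy solution.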
Related papers
- Can Graph Learning Improve Task Planning? [61.47027387839096]
Task planning is emerging as an important research topic alongside the development of large language models (LLMs).
In this paper, we explore graph learning-based methods for task planning.
Our approach complements prompt engineering and fine-tuning techniques, with performance further enhanced by improved prompts or a fine-tuned model.
arXiv Detail & Related papers (2024-05-29T14:26:24Z)
- SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving [64.38649623473626]
Large Language Models (LLMs) have driven substantial progress in artificial intelligence.
We propose a novel framework called SEquential subGoal Optimization (SEGO) to enhance LLMs' ability to solve mathematical problems.
arXiv Detail & Related papers (2023-10-19T17:56:40Z)
- Energy-frugal and Interpretable AI Hardware Design using Learning Automata [5.514795777097036]
A new machine learning algorithm, called the Tsetlin machine, has been proposed.
In this paper, we investigate methods of energy-frugal artificial intelligence hardware design.
We show that frugal resource allocation can provide decisive energy reduction while also achieving robust and interpretable learning.
arXiv Detail & Related papers (2023-05-19T15:11:18Z)
- Scalable Coupling of Deep Learning with Logical Reasoning [0.0]
We introduce a scalable neural architecture and loss function dedicated to learning the constraints and criteria of NP-hard reasoning problems.
Our loss function solves one of the main limitations of Besag's pseudo-loglikelihood, enabling learning of high energies.
arXiv Detail & Related papers (2023-05-12T17:09:34Z)
- Energy Transformer [64.22957136952725]
Our work combines aspects of three promising paradigms in machine learning, namely, attention mechanism, energy-based models, and associative memory.
We propose a novel architecture, called the Energy Transformer (or ET for short), that uses a sequence of attention layers that are purposely designed to minimize a specifically engineered energy function.
arXiv Detail & Related papers (2023-02-14T18:51:22Z)
- Learning Iterative Reasoning through Energy Minimization [77.33859525900334]
We present a new framework for iterative reasoning with neural networks.
We train a neural network to parameterize an energy landscape over all outputs.
We implement each step of the iterative reasoning as an energy minimization step to find a minimal energy solution.
arXiv Detail & Related papers (2022-06-30T17:44:20Z)
- Learning Energy Networks with Generalized Fenchel-Young Losses [34.46284877812228]
Energy-based models, a.k.a. energy networks, perform inference by optimizing an energy function.
We propose generalized Fenchel-Young losses, a natural loss construction for learning energy networks.
arXiv Detail & Related papers (2022-05-19T14:32:04Z)
- Physical Gradients for Deep Learning [101.36788327318669]
We find that state-of-the-art training techniques are not well-suited to many problems that involve physical processes.
We propose a novel hybrid training approach that combines higher-order optimization methods with machine learning techniques.
arXiv Detail & Related papers (2021-09-30T12:14:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.