Related papers: SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving

SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving

URL: http://arxiv.org/abs/2310.12960v1
Date: Thu, 19 Oct 2023 17:56:40 GMT
Title: SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving
Authors: Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong
Abstract summary: Large Language Models (LLMs) have driven substantial progress in artificial intelligence. We propose a novel framework called textbfSEquential subtextbfGoal textbfOptimization (SEGO) to enhance LLMs' ability to solve mathematical problems.
Score: 64.38649623473626
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large Language Models (LLMs) have driven substantial progress in artificial intelligence in recent years, exhibiting impressive capabilities across a wide range of tasks, including mathematical problem-solving. Inspired by the success of subgoal-based methods, we propose a novel framework called \textbf{SE}quential sub\textbf{G}oal \textbf{O}ptimization (SEGO) to enhance LLMs' ability to solve mathematical problems. By establishing a connection between the subgoal breakdown process and the probability of solving problems, SEGO aims to identify better subgoals with theoretical guarantees. Addressing the challenge of identifying suitable subgoals in a large solution space, our framework generates problem-specific subgoals and adjusts them according to carefully designed criteria. Incorporating these optimized subgoals into the policy model training leads to significant improvements in problem-solving performance. We validate SEGO's efficacy through experiments on two benchmarks, GSM8K and MATH, where our approach outperforms existing methods, highlighting the potential of SEGO in AI-driven mathematical problem-solving. Data and code associated with this paper will be available at https://github.com/zhaoxlpku/SEGO

Related papers

Swarm Intelligence Enhanced Reasoning: A Density-Driven Framework for LLM-Based Multi-Agent Optimization [18.912255448200888]
We propose integrating swarm intelligence into the reasoning process by introducing a novel Agent-based Swarm Intelligence (ASI) paradigm.<n>To avoid swarm intelligence getting trapped in local optima, we develop a Swarm Intelligence Enhancing Reasoning framework.
arXiv Detail & Related papers (2025-05-21T15:48:13Z)
A Survey on Mathematical Reasoning and Optimization with Large Language Models [0.5439020425819]
Recent advancements in Large Language Models (LLMs) have significantly improved AI-driven mathematical reasoning, theorem proving, and optimization techniques. This survey explores the evolution of mathematical problem-solving in AI, from early statistical learning approaches to modern deep learning and transformer-based methodologies.
arXiv Detail & Related papers (2025-03-22T10:49:32Z)
Text2Zinc: A Cross-Domain Dataset for Modeling Optimization and Satisfaction Problems in MiniZinc [2.6217304977339473]
This paper introduces Text2Zinc, a cross-domain dataset for capturing optimization and satisfaction problems specified in natural language text. Our work is distinguished from previous attempts by integrating both satisfaction and optimization problems within a unified dataset. Our findings indicate that LLMs are not yet a push-button technology to model problems from text.
arXiv Detail & Related papers (2025-02-22T04:13:53Z)
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task [49.355810887265925]
We introduce MathFimer, a novel framework for mathematical reasoning step expansion. We develop a specialized model, MathFimer-7B, on our carefully curated NuminaMath-FIM dataset. We then apply these models to enhance existing mathematical reasoning datasets by inserting detailed intermediate steps into their solution chains.
arXiv Detail & Related papers (2025-02-17T11:22:24Z)
Large Language Models for Combinatorial Optimization of Design Structure Matrix [4.513609458468522]
Combinatorial optimization (CO) is essential for improving efficiency and performance in engineering applications. When it comes to real-world engineering problems, algorithms based on pure mathematical reasoning are limited and incapable to capture the contextual nuances necessary for optimization. This study explores the potential of Large Language Models (LLMs) in solving engineering CO problems by leveraging their reasoning power and contextual knowledge.
arXiv Detail & Related papers (2024-11-19T15:39:51Z)
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks. We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model. Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path. The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver. We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problem. We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods. Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem. As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories. We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning. In experiments, the algorithm excels at solving complex, long-horizon decision-making problems outperforming state-of-the-art.
arXiv Detail & Related papers (2023-01-30T15:04:39Z)
A Survey on Influence Maximization: From an ML-Based Combinatorial Optimization [2.9882027965916413]
Influence Maximization (IM) is a classical optimization problem, which can be widely used in mobile networks, social computing, and recommendation systems. Main challenge comes from the NP-hardness of the IM problem and #P-hardness of estimating the influence spread. We focus on summarizing the relevant background knowledge, basic principles, common methods, and applied research.
arXiv Detail & Related papers (2022-11-06T10:13:42Z)
An Efficient Merge Search Matheuristic for Maximising the Net Present Value of Project Schedules [5.10800491975164]
Resource constrained project scheduling is an important optimisation problem with many practical applications. We propose a new math-heuristic algorithm based on Merge Search and parallel computing to solve the resource constrained project scheduling.
arXiv Detail & Related papers (2022-10-20T13:30:23Z)
Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK) Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework. We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
A General Large Neighborhood Search Framework for Solving Integer Linear Programs [46.62993477453986]
We focus on solving integer programs, and ground our approach in the large neighborhood search (SLN) paradigm. We show that our LNS framework can significantly outperform compared to state-of-the-art commercial solvers such as Gurobi.
arXiv Detail & Related papers (2020-03-29T23:08:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.