SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving
- URL: http://arxiv.org/abs/2310.12960v1
- Date: Thu, 19 Oct 2023 17:56:40 GMT
- Title: SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving
- Authors: Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong
- Abstract summary: Large Language Models (LLMs) have driven substantial progress in artificial intelligence.
We propose a novel framework called textbfSEquential subtextbfGoal textbfOptimization (SEGO) to enhance LLMs' ability to solve mathematical problems.
- Score: 64.38649623473626
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large Language Models (LLMs) have driven substantial progress in artificial
intelligence in recent years, exhibiting impressive capabilities across a wide
range of tasks, including mathematical problem-solving. Inspired by the success
of subgoal-based methods, we propose a novel framework called
\textbf{SE}quential sub\textbf{G}oal \textbf{O}ptimization (SEGO) to enhance
LLMs' ability to solve mathematical problems. By establishing a connection
between the subgoal breakdown process and the probability of solving problems,
SEGO aims to identify better subgoals with theoretical guarantees. Addressing
the challenge of identifying suitable subgoals in a large solution space, our
framework generates problem-specific subgoals and adjusts them according to
carefully designed criteria. Incorporating these optimized subgoals into the
policy model training leads to significant improvements in problem-solving
performance. We validate SEGO's efficacy through experiments on two benchmarks,
GSM8K and MATH, where our approach outperforms existing methods, highlighting
the potential of SEGO in AI-driven mathematical problem-solving.
Data and code associated with this paper will be available at
https://github.com/zhaoxlpku/SEGO
Related papers
- MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time [51.5039731721706]
MindStar is a purely inference-based searching method for large language models.
It formulates reasoning tasks as searching problems and proposes two search ideas to identify the optimal reasoning paths.
It significantly enhances the reasoning abilities of open-source models, such as Llama-2-13B and Mistral-7B, and achieves comparable performance to GPT-3.5 and Grok-1.
arXiv Detail & Related papers (2024-05-25T15:07:33Z) - Machine Learning Insides OptVerse AI Solver: Design Principles and
Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problem.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z) - Solution to Advanced Manufacturing Process Problems using Cohort
Intelligence Algorithm with Improved Constraint Handling Approaches [0.07989135005592125]
Cohort Intelligence (CI) algorithm is a socio inspired optimization technique which is successfully applied for solving several unconstrained & constrained real-world problems from the domains such as design, manufacturing, supply chain, healthcare, etc.
arXiv Detail & Related papers (2023-10-16T05:40:23Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories.
We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning.
In experiments, the algorithm excels at solving complex, long-horizon decision-making problems outperforming state-of-the-art.
arXiv Detail & Related papers (2023-01-30T15:04:39Z) - A Survey on Influence Maximization: From an ML-Based Combinatorial
Optimization [2.9882027965916413]
Influence Maximization (IM) is a classical optimization problem, which can be widely used in mobile networks, social computing, and recommendation systems.
Main challenge comes from the NP-hardness of the IM problem and #P-hardness of estimating the influence spread.
We focus on summarizing the relevant background knowledge, basic principles, common methods, and applied research.
arXiv Detail & Related papers (2022-11-06T10:13:42Z) - A Study of Scalarisation Techniques for Multi-Objective QUBO Solving [0.0]
Quantum and quantum-inspired optimisation algorithms have shown promising performance when applied to academic benchmarks as well as real-world problems.
However, QUBO solvers are single objective solvers. To make them more efficient at solving problems with multiple objectives, a decision on how to convert such multi-objective problems to single-objective problems need to be made.
arXiv Detail & Related papers (2022-10-20T14:54:37Z) - An Efficient Merge Search Matheuristic for Maximising the Net Present
Value of Project Schedules [5.10800491975164]
Resource constrained project scheduling is an important optimisation problem with many practical applications.
We propose a new math-heuristic algorithm based on Merge Search and parallel computing to solve the resource constrained project scheduling.
arXiv Detail & Related papers (2022-10-20T13:30:23Z) - Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new algorithm for a policy gradient in TMDPs by a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z) - A General Large Neighborhood Search Framework for Solving Integer Linear
Programs [46.62993477453986]
We focus on solving integer programs, and ground our approach in the large neighborhood search (SLN) paradigm.
We show that our LNS framework can significantly outperform compared to state-of-the-art commercial solvers such as Gurobi.
arXiv Detail & Related papers (2020-03-29T23:08:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.