No Train Still Gain. Unleash Mathematical Reasoning of Large Language
Models with Monte Carlo Tree Search Guided by Energy Function
- URL: http://arxiv.org/abs/2309.03224v3
- Date: Tue, 12 Sep 2023 03:03:00 GMT
- Title: No Train Still Gain. Unleash Mathematical Reasoning of Large Language
Models with Monte Carlo Tree Search Guided by Energy Function
- Authors: Haotian Xu
- Abstract summary: Large language models (LLMs) demonstrate impressive language understanding and contextual learning abilities.
LLMs often struggle to generate correct reasoning steps and answers despite having high probabilities for the solutions.
We propose a method that incorporates Monte Carlo Tree Search (MCTS) and a lightweight energy function to rank decision steps.
- Score: 3.0299876288833345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) demonstrate impressive language understanding
and contextual learning abilities, making them suitable for natural language
processing (NLP) tasks and complex mathematical reasoning. However, when
applied to mathematical reasoning tasks, LLMs often struggle to generate
correct reasoning steps and answers despite having high probabilities for the
solutions. To overcome this limitation and enhance the mathematical reasoning
capabilities of fine-tuned LLMs without additional fine-tuning steps, we
propose a method that incorporates Monte Carlo Tree Search (MCTS) and a
lightweight energy function to rank decision steps and enable immediate
reaction and precise reasoning. Specifically, we re-formulate the fine-tuned
LLMs into a Residual-based Energy Model (Residual-EBM) and employ noise
contrastive estimation to estimate the energy function's parameters. We then
utilize MCTS with the energy function as a path verifier to search the output
space and evaluate the reasoning path. Through extensive experiments on two
mathematical reasoning benchmarks, GSM8k and AQUA-RAT, we demonstrate the
exceptional capabilities of our method, which significantly improves the pass@1
metric of the fine-tuned model without requiring additional fine-tuning or
reinforcement learning with human feedback alignment.
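The pipeline the abstract describes — sample candidate reasoning steps, score partial paths with an energy function, and let MCTS use that score as a path verifier — can be sketched as follows. This is a minimal illustration of the search skeleton only: the `energy` scorer and `propose_steps` generator below are hypothetical stand-ins, whereas the paper's actual scorer is a fine-tuned LLM re-formulated as a Residual-EBM with parameters estimated by noise contrastive estimation.

```python
import math
import random

def energy(path):
    """Hypothetical stand-in scorer: lower energy = more plausible path.
    (The paper's real energy function is a Residual-EBM over LLM outputs.)"""
    return sum(abs(step - 4) for step in path)  # toy: prefers steps near 4

def propose_steps(path):
    """Stand-in for sampling candidate next reasoning steps from an LLM."""
    return [1, 2, 3, 4, 5]

class Node:
    def __init__(self, path, parent=None):
        self.path, self.parent = path, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Unvisited nodes are explored first; otherwise balance mean value
    # against an exploration bonus (standard UCT formula).
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(root_path, depth=3, iters=200):
    root = Node(list(root_path))
    for _ in range(iters):
        # Selection: descend by UCB until an unexpanded node.
        node = root
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: attach candidate next steps if depth allows.
        if len(node.path) < depth:
            node.children = [Node(node.path + [s], node)
                             for s in propose_steps(node.path)]
            node = random.choice(node.children)
        # Evaluation: the energy function acts as the path verifier;
        # negate so that lower energy means higher reward.
        reward = -energy(node.path)
        # Backpropagation: update statistics along the selected path.
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the most-visited first step (robust-child selection).
    best = max(root.children, key=lambda n: n.visits)
    return best.path

random.seed(0)
print(mcts([]))
```

With the toy energy above, the search concentrates its visits on the step with the lowest downstream energy; in the real method the same loop ranks LLM-generated reasoning steps instead of integers.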
Related papers
- Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks [0.8425561594225592]
This study introduces a novel framework for training smaller language models in function calling.
It focuses on specific logical and mathematical reasoning tasks.
The approach aims to improve the performance of small-scale models on these tasks using function calling.
arXiv Detail & Related papers (2024-10-24T16:27:35Z)
- Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs [12.48241058167222]
Large Language Models (LLMs) have demonstrated remarkable efficiency in tackling various tasks based on human instructions.
However, studies reveal that they often struggle with tasks requiring reasoning, such as math or physics.
This raises questions about whether LLMs truly comprehend embedded knowledge or merely learn to replicate the token distribution without a true understanding of the content.
We propose Deconfounded Causal Adaptation (DCA), a novel parameter-efficient fine-tuning (PEFT) method to enhance the model's reasoning capabilities.
arXiv Detail & Related papers (2024-09-04T13:17:09Z)
- Interpreting and Improving Large Language Models in Arithmetic Calculation [72.19753146621429]
Large language models (LLMs) have demonstrated remarkable potential across numerous applications.
In this work, we delve into uncovering a specific mechanism by which LLMs execute calculations.
We investigate the potential benefits of selectively fine-tuning these essential heads/MLPs to boost the LLMs' computational performance.
arXiv Detail & Related papers (2024-09-03T07:01:46Z)
- Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model [15.542737858152053]
We propose Key-Point-Driven Mathematical Reasoning Distillation (KPDD) to mitigate misunderstanding errors.
KPDD enhances the reasoning performance of SLMs by breaking down the problem-solving process into three stages.
Experiments show KPDD-CoT significantly improves reasoning abilities, while KPDD-PoT achieves state-of-the-art performance in mathematical reasoning tasks.
arXiv Detail & Related papers (2024-07-14T11:41:03Z)
- MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time [51.5039731721706]
MindStar is a purely inference-based searching method for large language models.
It formulates reasoning tasks as searching problems and proposes two search ideas to identify the optimal reasoning paths.
It significantly enhances the reasoning abilities of open-source models, such as Llama-2-13B and Mistral-7B, and achieves comparable performance to GPT-3.5 and Grok-1.
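As a rough illustration of MindStar's framing of reasoning as a search problem, the sketch below runs a best-first search over candidate reasoning paths ranked by a scorer. The `propose` step generator and `reward` function are toy stand-ins for the LLM and reward model, not MindStar's actual components.

```python
import heapq

def propose(path):
    """Toy stand-in for LLM step sampling: extend a path by a 0 or a 1."""
    return [path + [d] for d in (0, 1)]

def reward(path):
    """Hypothetical reward model: prefer paths whose steps sum to 3."""
    return -abs(sum(path) - 3)

def best_first_search(depth=4):
    """Always expand the highest-reward partial path on the frontier."""
    heap = [(-reward([]), [])]  # min-heap over negated reward = max-heap
    while heap:
        neg_r, path = heapq.heappop(heap)
        if len(path) == depth:
            return path  # first complete path popped beats the frontier
        for child in propose(path):
            heapq.heappush(heap, (-reward(child), child))

print(best_first_search())
```

The returned path has the maximum toy reward (its steps sum to 3); in an actual inference-time setup the frontier would hold partial chains of thought and the reward would come from a learned verifier.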
arXiv Detail & Related papers (2024-05-25T15:07:33Z)
- Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems.
LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning.
We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning to assess model performance.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)
- Sci-CoT: Leveraging Large Language Models for Enhanced Knowledge Distillation in Small Models for Scientific QA [5.117094291273979]
Large Language Models (LLMs) have shown outstanding performance across a wide range of downstream tasks.
We propose Sci-CoT, a two-stage framework that separates the processes of generating rationales and inferring answers.
Our 80-million-parameter model is able to exceed the performance of BLOOM-176B on the ARC-Easy dataset under the few-shot setting.
arXiv Detail & Related papers (2023-08-09T03:18:07Z)
- Learning Multi-Step Reasoning by Solving Arithmetic Tasks [6.398022050054328]
This work investigates how to equip relatively small language models with multi-step reasoning capabilities.
We propose to inject such abilities by continually pre-training LMs on a synthetic dataset MsAT.
Our experiments on four math word problem datasets show the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-06-02T17:29:22Z)
- Evaluating Language Models for Mathematics through Interactions [116.67206980096513]
We introduce CheckMate, a prototype platform for humans to interact with and evaluate large language models (LLMs).
We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics.
We derive a taxonomy of human behaviours and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness.
arXiv Detail & Related papers (2023-06-02T17:12:25Z)
- SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SATLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
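The declarative idea can be illustrated with a toy sketch: a list of constraints plays the role of the LLM-generated declarative specification, and brute-force enumeration over a finite domain stands in for the off-the-shelf automated theorem prover that SatLM actually calls. The example problem and its encoding below are hypothetical.

```python
from itertools import product

# Toy problem: "x and y are digits, x + y = 10, x - y = 4. What is x?"
# A declarative spec states WHAT must hold, not HOW to compute the answer.
constraints = [
    lambda x, y: x + y == 10,
    lambda x, y: x - y == 4,
]

def solve(constraints):
    """Brute-force stand-in for a theorem prover: enumerate the finite
    domain of digit pairs and return the first satisfying model."""
    for x, y in product(range(10), repeat=2):
        if all(c(x, y) for c in constraints):
            return {"x": x, "y": y}
    return None  # unsatisfiable specification

print(solve(constraints))  # → {'x': 7, 'y': 3}
```

Because the specification is declarative, the solver can be swapped for a real SMT solver without changing the constraints, which is the division of labor the SatLM approach exploits.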
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
- ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT [72.83383437501577]
Large language models (LLMs) have recently demonstrated significant potential in mathematical abilities.
LLMs currently have difficulty in bridging perception, language understanding and reasoning capabilities.
This paper presents a novel method for integrating LLMs into the abductive learning framework.
arXiv Detail & Related papers (2023-04-21T16:23:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.