Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective
- URL: http://arxiv.org/abs/2501.11110v1
- Date: Sun, 19 Jan 2025 16:53:26 GMT
- Title: Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective
- Authors: Yiyao Yu, Yuxiang Zhang, Dongdong Zhang, Xiao Liang, Hengyuan Zhang, Xingxing Zhang, Ziyi Yang, Mahmoud Khademi, Hany Awadalla, Junjie Wang, Yujiu Yang, Furu Wei
- Abstract summary: Chain-of-Reasoning (CoR) is a novel unified framework that integrates multiple reasoning paradigms.
CoR generates multiple potential answers using different reasoning paradigms and synthesizes them into a coherent final solution.
Experimental results demonstrate that CoR-Math-7B significantly outperforms current SOTA models.
- Score: 90.86370957353911
- Abstract: Large Language Models (LLMs) have made notable progress in mathematical reasoning, yet they often rely on single-paradigm reasoning, which limits their effectiveness across diverse tasks. In this paper, we introduce Chain-of-Reasoning (CoR), a novel unified framework that integrates multiple reasoning paradigms, namely Natural Language Reasoning (NLR), Algorithmic Reasoning (AR), and Symbolic Reasoning (SR), to enable synergistic collaboration. CoR generates multiple potential answers using different reasoning paradigms and synthesizes them into a coherent final solution. We propose a Progressive Paradigm Training (PPT) strategy that allows models to progressively master these paradigms, culminating in the development of CoR-Math-7B. Experimental results demonstrate that CoR-Math-7B significantly outperforms current SOTA models, achieving up to a 41.0% absolute improvement over GPT-4 in theorem proving tasks and a 7.9% improvement over RL-based methods in arithmetic tasks. These results showcase the comprehensive mathematical ability of our model, which achieves significant performance gains on specific tasks and generalizes zero-shot across tasks.
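The abstract gives no reference implementation; the following is a minimal sketch of the two-stage CoR inference loop it describes, assuming a hypothetical `generate(prompt)` completion function and illustrative prompt wording for the NLR, AR, and SR paradigms (the prompts and function are assumptions, not the authors' code).

```python
# Minimal sketch of Chain-of-Reasoning (CoR) inference as the abstract
# describes it: one candidate solution per reasoning paradigm, then a
# synthesis pass. `generate` stands in for any LLM completion API.
from typing import Callable

PARADIGM_PROMPTS = {
    "NLR": "Solve the problem step by step in natural language.",
    "AR": "Solve the problem by writing and reasoning through a program.",
    "SR": "Solve the problem using formal symbolic manipulation.",
}

def chain_of_reasoning(problem: str, generate: Callable[[str], str]) -> str:
    # Stage 1: generate one candidate answer per paradigm.
    candidates = {
        name: generate(f"{instruction}\n\nProblem: {problem}")
        for name, instruction in PARADIGM_PROMPTS.items()
    }
    # Stage 2: synthesize the paradigm-specific candidates into one solution.
    joined = "\n\n".join(f"[{name}] {ans}" for name, ans in candidates.items())
    return generate(
        "Candidate solutions from different reasoning paradigms:\n\n"
        f"{joined}\n\nSynthesize them into one coherent final solution."
    )
```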
Related papers
- BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning [78.63421517563056]
Large Language Models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks.
We present a unified probabilistic framework that formalizes LLM reasoning through a novel graphical model.
We introduce the Bootstrapping Reinforced Thinking Process (BRiTE) algorithm, which works in two steps.
arXiv Detail & Related papers (2025-01-31T02:39:07Z)
- Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models [1.4956870931936515]
We introduce RDoLT, a novel framework that significantly boosts Large Language Models reasoning performance.
RDoLT is built on three key innovations: (1) breaking down complex reasoning tasks into sub-tasks of progressive complexity; (2) employing an advanced selection and scoring mechanism to identify the most promising reasoning thoughts; and (3) integrating a knowledge propagation module that mimics human learning.
Our approach was evaluated across multiple benchmarks, including GSM8K, SVAMP, MultiArith, Last Letter Concatenation, and Gaokao2023 Math.
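The summary names three components but no implementation; the sketch below illustrates only the second, a scoring-and-selection step that keeps the most promising candidate thoughts at each stage. The placeholder scorer and the cutoff `keep` are assumptions for illustration.

```python
# Illustrative thought selection in the spirit of RDoLT's scoring mechanism:
# rank candidate reasoning steps and keep only the top few. The scoring
# function here is a toy placeholder, not the paper's actual criteria.
from typing import Callable

def select_thoughts(candidates: list[str],
                    score: Callable[[str], float],
                    keep: int = 3) -> list[str]:
    # Retain the `keep` highest-scoring candidate thoughts.
    return sorted(candidates, key=score, reverse=True)[:keep]

# Toy usage: a scorer that prefers more detailed (longer) thoughts.
thoughts = ["x = 2", "substitute x = 2 into 3x + 1 to get 7", "guess 7"]
print(select_thoughts(thoughts, score=len, keep=2))
```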
arXiv Detail & Related papers (2025-01-03T02:55:44Z)
- Diversity of Thought Elicits Stronger Reasoning Capabilities in Multi-Agent Debate Frameworks [0.0]
Chain-of-thought prompting, self-verification, and multi-agent debate have been proposed to improve the reasoning and factual accuracy of large language models.
We find that multi-agent debate helps at any model scale, and that diversity of thought elicits stronger reasoning in debating LLMs.
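As a rough illustration of the debate setup the summary refers to, here is a minimal round-based loop; the agent count, round count, and prompt wording are assumptions, and `generate` stands in for any LLM API.

```python
# Minimal multi-agent debate loop: each agent answers, then revises after
# reading the other agents' latest answers. Diversity of thought can be
# injected by giving each agent a different system prompt or temperature.
from typing import Callable

def debate(question: str, generate: Callable[[str], str],
           n_agents: int = 3, n_rounds: int = 2) -> list[str]:
    # Round 0: independent answers.
    answers = [generate(f"Answer the question: {question}") for _ in range(n_agents)]
    # Debate rounds: revise in light of the other agents' answers.
    for _ in range(n_rounds):
        answers = [
            generate(
                f"Question: {question}\nOther agents answered:\n"
                + "\n".join(a for j, a in enumerate(answers) if j != i)
                + "\nRevise or defend your answer."
            )
            for i in range(n_agents)
        ]
    return answers  # a judge model or majority vote would pick the final one
```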
arXiv Detail & Related papers (2024-10-10T21:59:01Z)
- Building Math Agents with Multi-Turn Iterative Preference Learning [56.71330214021884]
This paper studies the complementary direct preference learning approach to further improve model performance.
Existing direct preference learning algorithms were originally designed for single-turn chat tasks.
We introduce a multi-turn direct preference learning framework, tailored for this context.
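The summary does not state the objective; one natural reading of a multi-turn direct preference loss is standard DPO applied to whole trajectories, summing log-probabilities over the model's turns. The sketch below takes precomputed log-probs as input; the signature and the turn-summing choice are assumptions, not the paper's exact formulation.

```python
# Sketch of a trajectory-level direct-preference loss: single-turn DPO
# extended by summing per-turn log-probabilities over each trajectory.
# -log(sigmoid(beta * margin)) is computed as log1p(exp(-beta * margin)).
import math

def multi_turn_dpo_loss(chosen_logps: list[float],
                        rejected_logps: list[float],
                        chosen_ref: list[float],
                        rejected_ref: list[float],
                        beta: float = 0.1) -> float:
    # Implicit reward margin between preferred and dispreferred trajectories,
    # each measured against a frozen reference model.
    margin = (sum(chosen_logps) - sum(chosen_ref)) - (
        sum(rejected_logps) - sum(rejected_ref)
    )
    return math.log1p(math.exp(-beta * margin))

# Toy usage with two-turn trajectories (log-probs are made up).
print(round(multi_turn_dpo_loss([-1.0, -2.0], [-2.5, -3.0],
                                [-1.2, -2.1], [-2.4, -2.9]), 4))
```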
arXiv Detail & Related papers (2024-09-04T02:41:04Z)
- iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers [11.819814280565142]
This paper describes our approach for SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense.
The BRAINTEASER task comprises multiple-choice question answering designed to evaluate models' lateral thinking capabilities.
We propose a unique strategy to improve the performance of pre-trained language models in both subtasks.
arXiv Detail & Related papers (2024-05-25T08:50:51Z)
- Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning [79.38140606606126]
We propose an algorithmic framework that fine-tunes vision-language models (VLMs) with reinforcement learning (RL).
Our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning.
We demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks.
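A minimal sketch of the interaction loop the summary describes, assuming a generic `generate` completion function, a gym-like `env.step` interface, and an 'Action: ...' output convention; none of these details come from the paper.

```python
# Sketch of one rollout step: prompt the VLM for chain-of-thought text,
# parse a final action from it, act in the environment, and collect the
# reward that would drive the RL update (the update itself is omitted).
import re
from typing import Callable

def rollout_step(task: str, observation: str,
                 generate: Callable[[str], str], env) -> float:
    cot = generate(
        f"Task: {task}\nObservation: {observation}\n"
        "Think step by step, then finish with 'Action: <action>'."
    )
    match = re.search(r"Action:\s*(.+)", cot)
    action = match.group(1).strip() if match else "noop"
    _obs, reward, _done = env.step(action)  # assumed minimal env interface
    return reward  # fed into a policy-gradient update over the CoT tokens
```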
arXiv Detail & Related papers (2024-05-16T17:50:19Z)
- MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation [60.65820977963331]
We introduce a novel evaluation paradigm for Large Language Models (LLMs).
This paradigm shifts the emphasis from result-oriented assessments, which often neglect the reasoning process, to a more comprehensive evaluation.
By applying this paradigm to the GSM8K dataset, we have developed the MR-GSM8K benchmark.
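To make the shift concrete, here is a minimal sketch of a meta-reasoning style evaluation: the model under test grades a candidate solution rather than producing one, and is scored against a ground-truth verdict. The prompt wording and label format are assumptions, not the benchmark's exact protocol.

```python
# Sketch of a meta-reasoning evaluation: the model judges whether a given
# solution is correct, and accuracy is measured against known verdicts.
from typing import Callable

def meta_reasoning_accuracy(
    items: list[dict],  # each: {"question": str, "solution": str, "is_correct": bool}
    generate: Callable[[str], str],
) -> float:
    hits = 0
    for item in items:
        verdict = generate(
            f"Question: {item['question']}\n"
            f"Candidate solution: {item['solution']}\n"
            "Is this solution correct? Answer 'yes' or 'no'."
        )
        predicted = verdict.strip().lower().startswith("yes")
        hits += predicted == item["is_correct"]
    return hits / len(items)
```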
arXiv Detail & Related papers (2023-12-28T15:49:43Z)
- Retrieval-augmented Multi-modal Chain-of-Thoughts Reasoning for Large Language Models [56.256069117502385]
Chain of Thought (CoT) approaches can be used to enhance the capability of Large Language Models (LLMs) on complex reasoning tasks.
However, the selection of optimal CoT demonstration examples in multi-modal reasoning remains less explored.
We introduce a novel approach that addresses this challenge by using retrieval mechanisms to automatically select demonstration examples.
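A common way to realize such retrieval is nearest-neighbor search over example embeddings; the sketch below assumes a hypothetical `embed(text)` function returning a vector and is an illustration of the general technique, not the paper's specific mechanism.

```python
# Sketch of retrieval-based demonstration selection: embed the query and a
# pool of worked examples, then keep the k most similar examples to build
# the few-shot chain-of-thought prompt.
import math
from typing import Callable

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def select_demonstrations(query: str, pool: list[str],
                          embed: Callable[[str], list[float]],
                          k: int = 4) -> list[str]:
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(embed(ex), q), reverse=True)
    return ranked[:k]  # prepend these worked examples to the prompt
```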
arXiv Detail & Related papers (2023-12-04T08:07:21Z)
- Let's Reinforce Step by Step [10.65244642965387]
We use Reinforcement Learning from Human Feedback to shape model reasoning processes.
Our results show that the fine-grained reward provided by process reward model (PRM)-based methods enhances accuracy on simple mathematical reasoning.
We also show the critical role reward aggregation functions play in model performance.
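The summary highlights reward aggregation but does not name the functions; the sketch below shows three standard ways to collapse a PRM's per-step rewards into a single score, as an illustration rather than the paper's exact choices.

```python
# Three common aggregations of per-step process rewards into one scalar.
def aggregate(step_rewards: list[float], how: str = "min") -> float:
    if how == "min":   # the solution is only as strong as its weakest step
        return min(step_rewards)
    if how == "mean":  # average step quality
        return sum(step_rewards) / len(step_rewards)
    if how == "prod":  # treat rewards as per-step correctness probabilities
        out = 1.0
        for r in step_rewards:
            out *= r
        return out
    raise ValueError(f"unknown aggregation: {how}")

# Toy usage on the same step rewards.
print(aggregate([0.9, 0.8, 0.95], "min"), aggregate([0.9, 0.8, 0.95], "prod"))
```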
arXiv Detail & Related papers (2023-11-10T01:35:51Z)