MetaRuleGPT: Recursive Numerical Reasoning of Language Models Trained with Simple Rules
- URL: http://arxiv.org/abs/2412.13536v1
- Date: Wed, 18 Dec 2024 06:27:10 GMT
- Title: MetaRuleGPT: Recursive Numerical Reasoning of Language Models Trained with Simple Rules
- Authors: Kejie Chen, Lin Wang, Qinghai Zhang, Renjun Xu
- Abstract summary: We introduce MetaRuleGPT, a novel Transformer-based architecture that performs precise numerical calculations.
In contrast with traditional training sets, which are heavily composed of massive raw instance data, MetaRuleGPT is pre-trained on much less abstract datasets.
- Score: 6.988553014376883
- License:
- Abstract: Recent studies have highlighted the limitations of large language models in mathematical reasoning, particularly their inability to capture the underlying logic. Inspired by meta-learning, we propose that models should acquire not only task-specific knowledge but also transferable problem-solving skills. We introduce MetaRuleGPT, a novel Transformer-based architecture that performs precise numerical calculations and complex logical operations by learning and combining different rules. In contrast with traditional training sets, which are heavily composed of massive raw instance data, MetaRuleGPT is pre-trained on much less abstract datasets containing basic, compound, and iterative rules for mathematical reasoning. Extensive experimental results demonstrate that MetaRuleGPT can mimic human rule-following capabilities, break down complexity, and iteratively derive accurate results for complex mathematical problems. These findings prove the potential of rule learning to enhance the numerical reasoning abilities of language models.
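To make the abstract's distinction between basic, compound, and iterative rules concrete, here is a minimal Python sketch of multi-digit addition built only from a single-digit lookup table. It is an illustration under stated assumptions, not the authors' model, training data, or rule format: the names `BASIC_ADD`, `add_digits`, and `add_numbers` are hypothetical, and the mapping of "basic", "compound", and "iterative" onto these three pieces is one possible reading of the abstract.

```python
# Hypothetical illustration; not the authors' code or data format.

# "Basic rules": a lookup table of single-digit sums, e.g. ("7", "8") -> "15".
BASIC_ADD = {(str(a), str(b)): str(a + b) for a in range(10) for b in range(10)}


def add_digits(a: str, b: str, carry: str = "0") -> tuple[str, str]:
    """Compound rule: chain basic lookups to add two digits plus a carry.

    Returns (result_digit, carry_out) using only table lookups on strings.
    """
    partial = BASIC_ADD[(a, b)].zfill(2)             # e.g. "15" or "07"
    total = BASIC_ADD[(partial[1], carry)].zfill(2)  # add carry to the ones digit
    # At most one of the two lookups can overflow, so this is "0" or "1".
    carry_out = BASIC_ADD[(partial[0], total[0])]
    return total[1], carry_out


def add_numbers(x: str, y: str) -> str:
    """Iterative rule: apply the compound rule column by column, right to left."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    digits, carry = [], "0"
    for a, b in zip(reversed(x), reversed(y)):
        digit, carry = add_digits(a, b, carry)
        digits.append(digit)
    if carry != "0":
        digits.append(carry)
    return "".join(reversed(digits))


if __name__ == "__main__":
    print(add_numbers("999", "1"))
```

Because every step is a table lookup on digit strings, the same 100 basic facts cover operands of any length, which is the kind of length generalization that rule-following is meant to provide.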
Related papers
- Large Language Models and Mathematical Reasoning Failures [1.6114012813668932]
This paper investigates the mathematical reasoning capabilities of large language models (LLMs) using 50 newly constructed high-school-level word problems.
We rigorously analyze both final answers and solution steps to identify reasoning failures.
We find that while newer models (e.g., o3-mini, deepseek-r1) achieve higher accuracy, all models exhibit errors in spatial reasoning, strategic planning, and arithmetic.
arXiv Detail & Related papers (2025-02-17T09:07:32Z) - Learning Mathematical Rules with Large Language Models [10.285317818397298]
We study the ability of large language models to learn specific mathematical rules such as distributivity or simplifying equations.
We present an empirical analysis of their ability to generalize these rules, as well as to reuse them in the context of word problems.
arXiv Detail & Related papers (2024-10-22T12:51:51Z) - How Truncating Weights Improves Reasoning in Language Models [49.80959223722325]
We study how certain global associations tend to be stored in specific weight components or Transformer blocks.
We analyze how this arises during training, both empirically and theoretically.
arXiv Detail & Related papers (2024-06-05T08:51:08Z) - Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning [30.40415945003794]
We investigate the compositionality of large language models (LLMs) in mathematical reasoning.
Since problems with logical flaws are quite rare in the real world, these represent "unseen" cases to LLMs.
Our experiments show that while LLMs possess both components of requisite knowledge, they do not spontaneously combine them to handle these novel cases.
arXiv Detail & Related papers (2024-05-05T16:35:30Z) - Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs [87.34281749422756]
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks.
However, their mastery of underlying inferential rules still falls short of human capabilities.
We propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic.
arXiv Detail & Related papers (2024-02-18T03:38:51Z) - ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning [107.61997887260056]
We propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs.
Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs.
To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs.
arXiv Detail & Related papers (2023-09-04T11:38:02Z) - A Hybrid System for Systematic Generalization in Simple Arithmetic Problems [70.91780996370326]
We propose a hybrid system capable of solving arithmetic problems that require compositional and systematic reasoning over sequences of symbols.
We show that the proposed system can accurately solve nested arithmetical expressions even when trained only on a subset including the simplest cases.
arXiv Detail & Related papers (2023-06-29T18:35:41Z) - Reflection of Thought: Inversely Eliciting Numerical Reasoning in Language Models via Solving Linear Systems [42.782260686177395]
We propose a novel method to elicit and exploit the numerical reasoning knowledge hidden in pre-trained language models.
We first leverage simple numbers as anchors to probe the implicitly inferred arithmetic expressions from language models.
We transform and formulate the task as an analytically solvable linear system.
arXiv Detail & Related papers (2022-10-11T00:57:19Z) - JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding [74.12405417718054]
This paper aims to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model (PLM).
Unlike other standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols and formulas in the problem statement.
We design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses.
arXiv Detail & Related papers (2022-06-13T17:03:52Z) - Learning Symbolic Rules for Reasoning in Quasi-Natural Language [74.96601852906328]
We build a rule-based system that can reason with natural language input but without the manual construction of rules.
We propose MetaQNL, a "Quasi-Natural" language that can express both formal logic and natural language sentences.
Our approach achieves state-of-the-art accuracy on multiple reasoning benchmarks.
arXiv Detail & Related papers (2021-11-23T17:49:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.