ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations
- URL: http://arxiv.org/abs/2506.03763v1
- Date: Wed, 04 Jun 2025 09:27:21 GMT
- Title: ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations
- Authors: Quang Hieu Pham, Thuy Duong Nguyen, Tung Pham, Anh Tuan Luu, Dat Quoc Nguyen
- Abstract summary: We propose a new approach named ClozeMath to fine-tune large language models for mathematical reasoning. Our ClozeMath involves a text-infilling task that predicts masked equations from a given solution, analogous to cloze exercises used in human learning. Experiments on GSM8K, MATH, and GSM-Symbolic show that ClozeMath surpasses the strong baseline Masked Thought in performance and robustness.
- Score: 29.51572057789961
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The capabilities of large language models (LLMs) have been enhanced by training on data that reflects human thought processes, such as the Chain-of-Thought format. However, evidence suggests that the conventional scheme of next-word prediction may not fully capture how humans learn to think. Inspired by how humans generalize mathematical reasoning, we propose a new approach named ClozeMath to fine-tune LLMs for mathematical reasoning. Our ClozeMath involves a text-infilling task that predicts masked equations from a given solution, analogous to cloze exercises used in human learning. Experiments on GSM8K, MATH, and GSM-Symbolic show that ClozeMath surpasses the strong baseline Masked Thought in performance and robustness, with two test-time scaling decoding algorithms, Beam Search and Chain-of-Thought decoding. Additionally, we conduct an ablation study to analyze the effects of various architectural and implementation choices on our approach.
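The abstract describes the ClozeMath training signal only at a high level. As a rough illustration (not the authors' released code), the sketch below masks GSM8K-style `<<...>>` equation annotations in a worked solution and pairs the masked text with the equations as fill-in targets; the `<EQN_i>` sentinel format, the regex, and the prompt layout are assumptions made for this example.

```python
import re

# Minimal sketch (assumed, not the authors' implementation): build cloze-style
# training examples by masking the equations in a worked solution, analogous
# to the text-infilling task described in the abstract.

EQUATION_PATTERN = re.compile(r"<<[^>]+>>")  # GSM8K marks equations as <<3*4=12>>

def make_cloze_example(question: str, solution: str):
    """Mask every annotated equation and return (masked prompt, fill-in targets)."""
    targets = EQUATION_PATTERN.findall(solution)
    masked = solution
    for i, equation in enumerate(targets):
        masked = masked.replace(equation, f"<EQN_{i}>", 1)
    prompt = f"Question: {question}\nSolution: {masked}"
    return prompt, targets

question = "Tom has 3 boxes with 4 apples each. How many apples does he have?"
solution = "Tom has 3 * 4 = <<3*4=12>>12 apples in total. The answer is 12."
prompt, targets = make_cloze_example(question, solution)
print(prompt)   # solution text with <EQN_0> in place of the annotated equation
print(targets)  # ['<<3*4=12>>']
```

During fine-tuning, a model trained on such pairs learns to fill each masked slot conditioned on the question and the surrounding solution text, rather than predicting every token left to right.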
Related papers
- MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task [49.355810887265925]
We introduce MathFimer, a novel framework for mathematical reasoning step expansion. We develop a specialized model, MathFimer-7B, on our carefully curated NuminaMath-FIM dataset. We then apply these models to enhance existing mathematical reasoning datasets by inserting detailed intermediate steps into their solution chains.
arXiv Detail & Related papers (2025-02-17T11:22:24Z)
- MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code [38.127313175508746]
We introduce a novel method for generating mathematical code accompanied with corresponding reasoning steps for continued pretraining.
Our approach begins with the construction of a high-quality mathematical continued pretraining dataset.
Appending the generated code to each reasoning step results in data consisting of paired natural language reasoning steps and their corresponding code.
arXiv Detail & Related papers (2024-10-10T17:58:40Z)
- MathScale: Scaling Instruction Tuning for Mathematical Reasoning [70.89605383298331]
Large language models (LLMs) have demonstrated remarkable capabilities in problem-solving.
However, their proficiency in solving mathematical problems remains inadequate.
We propose MathScale, a simple and scalable method to create high-quality mathematical reasoning data.
arXiv Detail & Related papers (2024-03-05T11:42:59Z)
- GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers [68.77382332826167]
Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.
One frequently observed piece of evidence is that when math questions are slightly changed, LLMs can answer incorrectly.
This motivates us to evaluate the robustness of LLMs' math reasoning capability by testing a wide range of question variations.
arXiv Detail & Related papers (2024-02-29T15:26:14Z)
- Brain-Inspired Two-Stage Approach: Enhancing Mathematical Reasoning by Imitating Human Thought Processes [6.512667145063511]
We propose a novel approach, named Brain, to imitate human thought processes to enhance mathematical reasoning abilities.
First, we achieve SOTA performance in comparison with Code LLaMA 7B-based models through this method.
Secondly, we find that plans can be explicitly extracted from natural language, code, or formal language.
arXiv Detail & Related papers (2024-02-23T17:40:31Z)
- InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning [98.53491178426492]
We open-source our math reasoning LLMs, InternLM-Math, which are continually pre-trained from InternLM2.
We unify chain-of-thought reasoning, reward modeling, formal reasoning, data augmentation, and code interpreter in a unified seq2seq format.
Our pre-trained model achieves 30.3 on the MiniF2F test set without fine-tuning.
arXiv Detail & Related papers (2024-02-09T11:22:08Z)
- ATHENA: Mathematical Reasoning with Thought Expansion [3.3727470465639833]
We introduce Attention-based THought Expansion Network Architecture (ATHENA) to tackle the challenges of real-world practices.
Thought expansion recurrently generates candidates carrying possible math expressions derived from the thoughts of the previous step.
arXiv Detail & Related papers (2023-11-02T07:03:25Z)
- MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning [52.97768001837269]
We present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations.
We propose a method of generating novel and high-quality datasets with math problems and their code-based solutions.
This approach yields the MathCoder models, a family of models capable of generating code-based solutions for solving challenging math problems.
arXiv Detail & Related papers (2023-10-05T17:52:09Z)
- A Survey of Deep Learning for Mathematical Reasoning [71.88150173381153]
We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade.
Recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning.
arXiv Detail & Related papers (2022-12-20T18:46:16Z)
- JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding [74.12405417718054]
This paper aims to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model (PLM).
Unlike texts in other standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols, and formulas in the problem statement.
We design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses.
arXiv Detail & Related papers (2022-06-13T17:03:52Z)
- Enhancing Neural Mathematical Reasoning by Abductive Combination with Symbolic Library [5.339286921277565]
This paper demonstrates that some abilities can be achieved through abductive combination with discrete systems that have been programmed with human knowledge.
On a mathematical reasoning dataset, we adopt the recently proposed abductive learning framework and propose the ABL-Sym algorithm, which combines Transformer models with a symbolic mathematics library (a minimal illustration of such a neural-symbolic pairing is sketched after this list).
arXiv Detail & Related papers (2022-03-28T04:19:39Z)
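As referenced in the last entry above, the ABL-Sym line of work pairs a neural generator with a symbolic checker. The sketch below is a rough, hedged illustration of that kind of pairing (not the ABL-Sym algorithm itself): it uses SymPy to filter hypothetical model-generated equations, keeping only those whose two sides agree symbolically. The candidate strings and the equality check are assumptions made for this example.

```python
import sympy as sp

# Minimal sketch (assumed, not the ABL-Sym implementation): keep only the
# candidate equations that a symbolic mathematics library can verify.

def symbolic_check(equation: str) -> bool:
    """Return True if the equation's left- and right-hand sides agree symbolically."""
    lhs_text, rhs_text = equation.split("=", 1)
    lhs = sp.sympify(lhs_text)
    rhs = sp.sympify(rhs_text)
    return sp.simplify(lhs - rhs) == 0

# Hypothetical candidates produced by a neural model.
candidates = ["3*4 = 12", "3*4 = 13", "(x + 1)**2 = x**2 + 2*x + 1"]
accepted = [eq for eq in candidates if symbolic_check(eq)]
print(accepted)  # ['3*4 = 12', '(x + 1)**2 = x**2 + 2*x + 1']
```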