Fugu-MT Paper Translation (Abstract): SMRC: Aligning Large Language Models with Student Reasoning for Mathematical Error Correction

Paper abstract: SMRC: Aligning Large Language Models with Student Reasoning for Mathematical Error Correction

arxiv url: http://arxiv.org/abs/2511.14684v1
Date: Tue, 18 Nov 2025 17:22:37 GMT
Status: Translation complete
Last updated in system: 2025-11-19 16:23:53.230352
Title: SMRC: Aligning Large Language Models with Student Reasoning for Mathematical Error Correction
Title (reference translation): SMRC: Aligning Large Language Models via Student Reasoning for Mathematical Error Correction
Authors: Biaojie Zeng, Min Zhang, Juan Zhou, Fengrui Liu, Ruiyang Huang, Xin Lin
Abstract summary: Large language models (LLMs) often make reasoning errors when solving mathematical problems. We propose SMRC (Student Mathematical Reasoning Correction), a novel method that aligns LLMs with student reasoning.
Reference score (independently computed attention score): 13.864749522667273
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) often make reasoning errors when solving mathematical problems, and how to automatically detect and correct these errors has become an important research direction. However, existing approaches mainly focus on self-correction within the model, which falls short of the "teacher-style" correction required in educational settings, i.e., systematically guiding and revising a student's problem-solving process. To address this gap, we propose SMRC (Student Mathematical Reasoning Correction), a novel method that aligns LLMs with student reasoning. Specifically, SMRC formulates student reasoning as a multi-step sequential decision problem and introduces Monte Carlo Tree Search (MCTS) to explore optimal correction paths. To reduce the cost of annotating process-level rewards, we leverage breadth-first search (BFS) guided by LLMs and final-answer evaluation to generate reward signals, which are then distributed across intermediate reasoning steps via a back-propagation mechanism, enabling fine-grained process supervision. Additionally, we construct a benchmark for high school mathematics, MSEB (Multi-Solution Error Benchmark), consisting of 158 instances that include problem statements, student solutions, and correct reasoning steps. We further propose a dual evaluation protocol centered on solution accuracy and correct-step retention, offering a comprehensive measure of educational applicability. Experiments demonstrate that SMRC significantly outperforms existing methods on two public datasets (ProcessBench and MR-GSM8K) and our MSEB in terms of effectiveness and overall performance. The code and data are available at https://github.com/Mind-Lab-ECNU/SMRC.
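Abstract (reference translation): Large language models (LLMs) often produce reasoning errors when solving mathematical problems, and automatically detecting and correcting these errors has become an important research direction. However, existing approaches, which mainly focus on self-correction within the model, fall short of the "teacher-style" correction needed in educational settings, that is, systematically guiding and revising a student's problem-solving process. To address this gap, we propose SMRC (Student Mathematical Reasoning Correction), a new method that aligns LLMs with student reasoning. Specifically, SMRC formulates student reasoning as a multi-step sequential decision problem and introduces Monte Carlo Tree Search (MCTS) to explore optimal correction paths. To reduce the cost of annotating process-level rewards, we use breadth-first search (BFS) guided by LLMs together with final-answer evaluation to generate reward signals, which are distributed across intermediate reasoning steps through a back-propagation mechanism, enabling fine-grained process supervision. In addition, we construct MSEB (Multi-Solution Error Benchmark), a high school mathematics benchmark of 158 instances that include problem statements, student solutions, and correct reasoning steps. We further propose a dual evaluation protocol centered on solution accuracy and correct-step retention as a comprehensive measure of educational applicability. Experiments show that SMRC significantly outperforms existing methods on two public datasets (ProcessBench and MR-GSM8K) and on MSEB in terms of effectiveness and overall performance. The code and data are available at https://github.com/Mind-Lab-ECNU/SMRC.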
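
Note: the abstract says that reward signals from final-answer evaluation are distributed across intermediate reasoning steps via back-propagation. The snippet below is only a minimal sketch of that general idea, not the authors' implementation. The `StepNode` class, the mean-reward value estimate, the 0/1 reward scale, and the example step texts are all assumptions made for illustration; see the linked repository for the actual method.

```python
from typing import List, Optional


class StepNode:
    """One reasoning step in a search tree (hypothetical structure)."""

    def __init__(self, text: str, parent: Optional["StepNode"] = None) -> None:
        self.text = text
        self.parent = parent
        self.children: List["StepNode"] = []
        self.visits = 0
        self.total_reward = 0.0

    def add_child(self, text: str) -> "StepNode":
        child = StepNode(text, parent=self)
        self.children.append(child)
        return child

    @property
    def value(self) -> float:
        # Assumption: a step's value is the mean reward of the paths through it.
        return self.total_reward / self.visits if self.visits else 0.0


def backpropagate(leaf: StepNode, reward: float) -> None:
    """Push a final-answer reward from a leaf up to every ancestor step."""
    node: Optional[StepNode] = leaf
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent


# Toy tree: one shared first step, two candidate second steps.
root = StepNode("problem statement")
step_a = root.add_child("step 1: keep the student's setup")
good = step_a.add_child("step 2: corrected algebra")
bad = step_a.add_child("step 2: repeats the student's sign error")

# Assumed scale: 1.0 if the final answer is correct, 0.0 otherwise.
backpropagate(good, 1.0)
backpropagate(bad, 0.0)

print(step_a.value, good.value, bad.value)
```

In this toy setup the shared first step receives an intermediate value because only one of the two paths through it reaches a correct final answer, which illustrates how final-answer signals can yield step-level scores without per-step annotation.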


Related papers