Fugu-MT Paper Translation (Abstract): Code-driven Number Sequence Calculation: Enhancing the inductive Reasoning Abilities of Large Language Models

Paper summary: Code-driven Number Sequence Calculation: Enhancing the inductive Reasoning Abilities of Large Language Models

arxiv url: http://arxiv.org/abs/2510.14620v1
Date: Thu, 16 Oct 2025 12:29:40 GMT
Status: Translation complete
Last updated in system: 2025-10-17 21:15:14.848022
Title: Code-driven Number Sequence Calculation: Enhancing the inductive Reasoning Abilities of Large Language Models
Authors: Kedi Chen, Zhikai Lei, Xu Guo, Xuecheng Wu, Siyuan Zeng, Jianghao Yin, Yinqi Zhang, Qin Chen, Jie Zhou, Liang He, Qipeng Guo, Kai Chen, Wei Zhang,
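Abstract summary: CodeSeq is a synthetic post-training dataset built from number sequences. The pipeline generates supervised finetuning data by reflecting on failed test cases and incorporating iterative corrections. Experiments show that models trained with CodeSeq improve on various reasoning tasks while preserving OOD performance.
Reference score (independently computed attention score): 44.17697803306198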
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large language models (LLMs) make remarkable progress in reasoning tasks. Among different reasoning modes, inductive reasoning, due to its better alignment with human learning, attracts increasing interest. However, research on inductive reasoning faces certain challenges. First, existing inductive data mostly focuses on superficial regularities while lacking more complex internal patterns. Second, current works merely prompt LLMs or finetune on simple prompt-response pairs, but do not provide precise thinking processes nor implement difficulty control. Unlike previous work, we address these challenges by introducing CodeSeq, a synthetic post-training dataset built from number sequences. We package number sequences into algorithmic problems to discover their general terms, defining a general term generation (GTG) task correspondingly. Our pipeline generates supervised finetuning data by reflecting on failed test cases and incorporating iterative corrections, thereby teaching LLMs to learn autonomous case generation and self-checking. Additionally, it leverages reinforcement learning with a novel Case-Synergy Solvability Scaling Reward based on both solvability, estimated from the problem pass rate, and the success rate of self-directed case generation, enabling models to learn more effectively from both successes and failures. Experimental results show that the models trained with CodeSeq improve on various reasoning tasks and can preserve the models' OOD performance.
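
The abstract describes the general term generation (GTG) task only at a high level: a number sequence is posed as an algorithmic problem, a candidate general term is checked against test cases, and failed cases drive the next correction. The following is a minimal Python sketch of that checking step, not the authors' pipeline. The function `check_general_term`, the example sequence (triangular numbers), and both candidate formulas are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical helper (not from the paper): run a candidate general term
# against known sequence terms and collect the cases it gets wrong.
def check_general_term(
    candidate: Callable[[int], int],
    known_terms: Dict[int, int],
) -> List[Tuple[int, int, int]]:
    """Return failed cases as (n, expected, got) tuples, ordered by n."""
    failures = []
    for n, expected in sorted(known_terms.items()):
        got = candidate(n)
        if got != expected:
            failures.append((n, expected, got))
    return failures


# Illustrative sequence (an assumption, not data from the paper):
# the 1-indexed triangular numbers 1, 3, 6, 10, 15.
known = {1: 1, 2: 3, 3: 6, 4: 10, 5: 15}


def first_attempt(n: int) -> int:
    # Matches only the first two terms, so later cases fail.
    return 2 * n - 1


def revised_attempt(n: int) -> int:
    # Correction that would be proposed after inspecting the failed cases.
    return n * (n + 1) // 2


first_failures = check_general_term(first_attempt, known)
revised_failures = check_general_term(revised_attempt, known)

print(first_failures)    # [(3, 6, 5), (4, 10, 7), (5, 15, 9)]
print(revised_failures)  # []
```

In this sketch the list of failed cases plays the role of the feedback that an iterative correction step would reflect on. How the paper actually generates cases or computes its reward is not specified in the abstract and is not modeled here.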

Related papers