Fugu-MT paper translation (abstract): CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning

Paper overview: CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning

arxiv url: http://arxiv.org/abs/2604.14768v1
Date: Thu, 16 Apr 2026 08:29:22 GMT
Status: translation complete
Last updated in system: 2026-04-17 21:29:31.803033
Title: CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning
Title (reference translation): CoTEvol: Self-Evolving Chains for Data Synthesis in Mathematical Reasoning
Authors: Zhuo Wang, Zhuo Zhang, Yafu Li, Yu Cheng, Lizhen Qu, Zenglin Xu
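Abstract summary: CoTEvol is a genetic evolutionary framework that casts Chain-of-Thought (CoT) generation as a population-based search over reasoning trajectories. Lightweight, task-aware fitness functions are designed to guide the evolutionary process toward accurate and diverse reasoning. Empirically, CoTEvol improves correct-CoT synthesis success by over 30%, enhances structural diversity, and markedly improves efficiency.
Reference score (independently computed attention score): 56.866286835885994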
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large Language Models (LLMs) exhibit strong mathematical reasoning when trained on high-quality Chain-of-Thought (CoT) that articulates intermediate steps, yet costly CoT curation hinders further progress. While existing remedies such as distillation from stronger LLMs and self-synthesis based on test-time search alleviate this issue, they often suffer from diminishing returns or high computing overhead. In this work, we propose CoTEvol, a genetic evolutionary framework that casts CoT generation as a population-based search over reasoning trajectories. Candidate trajectories are iteratively evolved through reflective global crossover at the trajectory level and local mutation guided by uncertainty at the step level, enabling holistic recombination and fine-grained refinement. Lightweight, task-aware fitness functions are designed to guide the evolutionary process toward accurate and diverse reasoning. Empirically, CoTEvol improves correct-CoT synthesis success by over 30% and enhances structural diversity, with markedly improved efficiency. LLMs trained on these evolutionary CoT data achieve an average gain of 6.6% across eight math benchmarks, outperforming previous distillation and self-synthesis approaches. These results underscore the promise of evolutionary CoT synthesis as a scalable and effective method for mathematical reasoning tasks.
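Abstract (reference translation): Large language models (LLMs) exhibit strong mathematical reasoning when trained on high-quality Chain-of-Thought (CoT) data that articulates intermediate steps, but costly CoT curation hinders further progress. This work proposes CoTEvol, a genetic evolutionary framework that casts CoT generation as a population-based search over reasoning trajectories. Candidates evolve iteratively through reflective global crossover at the trajectory level and uncertainty-guided local mutation at the step level, enabling holistic recombination and fine-grained refinement. Lightweight, task-aware fitness functions are designed to guide the evolutionary process toward accurate and diverse reasoning. Empirically, CoTEvol improves correct-CoT synthesis success by over 30%, enhances structural diversity, and markedly improves efficiency. LLMs trained on this evolutionary CoT data gain 6.6% on average across eight benchmarks, outperforming previous distillation and self-synthesis approaches. These results highlight the potential of evolutionary CoT synthesis as a scalable and effective method for mathematical reasoning tasks.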
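
The abstract names three ingredients: trajectory-level crossover, step-level mutation guided by uncertainty, and a fitness function that rewards accuracy and diversity. The toy sketch below shows how such a population-based loop could fit together. It is not the authors' implementation, since the abstract gives no algorithmic detail. The `Step` type, the prefix/suffix splice used for crossover, the 0.1 novelty weight, and the `is_correct` and `rewrite` callbacks (stand-ins for an answer checker and an LLM rewrite call) are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    text: str
    uncertainty: float  # hypothetical per-step uncertainty in [0, 1]


Trajectory = List[Step]


def fitness(traj: Trajectory, pool: List[Trajectory],
            is_correct: Callable[[Trajectory], bool]) -> float:
    """Assumed fitness: 1.0 for a correct answer plus a small novelty bonus."""
    own = {s.text for s in traj}
    others = {s.text for t in pool if t is not traj for s in t}
    novelty = len(own - others) / max(len(own), 1)
    return (1.0 if is_correct(traj) else 0.0) + 0.1 * novelty


def crossover(a: Trajectory, b: Trajectory, rng: random.Random) -> Trajectory:
    """Trajectory-level recombination: a prefix of `a` joined to a suffix of `b`."""
    shortest = min(len(a), len(b))
    if shortest < 2:
        return list(a)  # too short to splice
    cut = rng.randint(1, shortest - 1)
    return a[:cut] + b[cut:]


def mutate(traj: Trajectory, rewrite: Callable[[Step, random.Random], Step],
           rng: random.Random) -> Trajectory:
    """Step-level mutation: rewrite only the most uncertain step."""
    i = max(range(len(traj)), key=lambda k: traj[k].uncertainty)
    child = list(traj)
    child[i] = rewrite(traj[i], rng)
    return child


def evolve(population: List[Trajectory], is_correct, rewrite,
           generations: int = 5, seed: int = 0) -> List[Trajectory]:
    """Evolve the population, keeping the fittest of parents plus children."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        children = []
        for _ in range(size):
            a, b = rng.sample(population, 2)
            children.append(mutate(crossover(a, b, rng), rewrite, rng))
        pool = population + children
        ranked = sorted(pool, key=lambda t: fitness(t, pool, is_correct),
                        reverse=True)
        population = ranked[:size]
    return population


# Toy usage: the "problem" is 2 + 2 and only the final step is checked.
def is_correct(traj: Trajectory) -> bool:
    return traj[-1].text == "so 2 + 2 = 4"


def rewrite(step: Step, rng: random.Random) -> Step:
    # Stand-in for an LLM rewrite: keeps the text, halves the uncertainty.
    return Step(step.text, step.uncertainty / 2)


population = [
    [Step("add the numbers", 0.2), Step("so 2 + 2 = 4", 0.1)],
    [Step("multiply the numbers", 0.8), Step("so 2 + 2 = 5", 0.9)],
    [Step("count on fingers", 0.5), Step("so 2 + 2 = 22", 0.7)],
]
best = evolve(population, is_correct, rewrite)[0]
```

Because parents compete with children for survival and correctness outweighs the novelty bonus, a correct trajectory is never displaced by an incorrect one in this sketch. A real system would replace `rewrite` with a model call and would need a far richer diversity measure than exact step-text overlap.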


Related papers