Fugu-MT 論文翻訳(概要): CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

論文の概要: CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

arxiv url: http://arxiv.org/abs/2507.06181v1
Date: Tue, 08 Jul 2025 17:03:39 GMT
ステータス: 翻訳完了
システム内更新日: 2025-07-09 16:34:38.357844
Title: CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
Title（参考訳）: CriticLean: 数学的形式化のための批判指導型強化学習
Authors: Zhongyuan Peng, Yifan Yao, Kaijing Ma, Shuyue Guo, Yizhe Li, Yichi Zhang, Chenchen Zhang, Yifan Zhang, Zhouliang Yu, Luming Li, Minghao Liu, Yihang Xia, Jiawei Shen, Yuchen Wu, Yixin Cao, Zhaoxiang Zhang, Wenhao Huang, Jiaheng Liu, Ge Zhang,
Abstract要約: CriticLeanは、新しい批評家による強化学習フレームワークである。教師付き微調整と強化学習を通じてトレーニングされたCriticLeanGPTを導入し、Lean 4の形式化の意味的忠実さを厳格に評価する。そして、CryticLeanBenchというベンチマークを紹介します。これは、モデルが意味論的に正しい形式を識別する能力を測定するために設計されたベンチマークです。
参考スコア（独自算出の注目度）: 48.61754523492116
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Translating natural language mathematical statements into formal, executable code is a fundamental challenge in automated theorem proving. While prior work has focused on generation and compilation success, little attention has been paid to the critic phase-the evaluation of whether generated formalizations truly capture the semantic intent of the original problem. In this paper, we introduce CriticLean, a novel critic-guided reinforcement learning framework that elevates the role of the critic from a passive validator to an active learning component. Specifically, first, we propose the CriticLeanGPT, trained via supervised fine-tuning and reinforcement learning, to rigorously assess the semantic fidelity of Lean 4 formalizations. Then, we introduce CriticLeanBench, a benchmark designed to measure models' ability to distinguish semantically correct from incorrect formalizations, and demonstrate that our trained CriticLeanGPT models can significantly outperform strong open- and closed-source baselines. Building on the CriticLean framework, we construct FineLeanCorpus, a dataset comprising over 285K problems that exhibits rich domain diversity, broad difficulty coverage, and high correctness based on human evaluation. Overall, our findings highlight that optimizing the critic phase is essential for producing reliable formalizations, and we hope our CriticLean will provide valuable insights for future advances in formal mathematical reasoning.
Abstract（参考訳）: 自然言語の数学的ステートメントを形式的で実行可能なコードに変換することは、自動定理証明の基本的な課題である。以前の研究は生成とコンパイルの成功に焦点が当てられていたが、生成された形式化が元の問題の意味的な意図を真に捉えているかどうかを批判する段階にはほとんど関心が向けられていない。本稿では,批判指導型強化学習フレームワークであるCryticLeanを紹介する。具体的には、まず、教師付き微調整と強化学習によって訓練されたCriticLeanGPTを提案し、リーン4の形式化の意味的忠実さを厳格に評価する。次に、CryticLeanBenchという、意味論的に正しい形式とを区別するモデルの能力を測定するためのベンチマークを紹介し、訓練されたCryticLeanGPTモデルが、強力なオープンおよびクローズドソースベースラインを著しく上回ることを示す。 CriticLeanフレームワーク上に構築したFineLeanCorpusは、285K以上の問題からなるデータセットで、豊富なドメインの多様性、幅広い難易度、そして人間の評価に基づく高い正確性を示す。全体として、批判フェーズの最適化は信頼できる形式化を生み出す上で不可欠であり、私たちのCriticLeanは、形式的な数学的推論における将来の進歩に貴重な洞察を提供することを期待しています。

論文の概要: CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

関連論文リスト