Fugu-MT Paper Translation (Abstract): Evaluating the Robustness of Proof Autoformalization in Lean 4

Paper overview: Evaluating the Robustness of Proof Autoformalization in Lean 4

arxiv url: http://arxiv.org/abs/2606.14867v1
Date: Fri, 12 Jun 2026 18:10:21 GMT
Title: Evaluating the Robustness of Proof Autoformalization in Lean 4
Authors: Zhengtao Gui, Sheng Yang, Zhouxing Shi
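Abstract summary: We argue that a robust proof autoformalizer must remain faithful even for informal proofs that diverge from idealized ones. We build a benchmark with both kinds of perturbation on miniF2F and MATH-500. We evaluate seven recent models, all of which are sensitive to global perturbations and mostly fail to remain faithful under local perturbations.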
Reference score (independently computed attention score): 8.029528831501514
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Proof autoformalization aims to translate an informal mathematical proof written in natural language into a formal proof in a formal language such as Lean 4. Several works have developed LLM-based models for proof autoformalization. However, existing evaluations have typically focused on translating well-formed informal proofs from curated datasets. We argue that a robust proof autoformalizer must remain faithful even for informal proofs that diverge from these idealized ones, and we present the first study on the robustness of proof autoformalization models. We formulate two categories of perturbations and evaluate robustness under each. A global perturbation paraphrases the informal proof in a different style, under which the formalization should remain consistent. A local perturbation alters a value, symbol, or proof step, possibly in a counterfactual way, and a robust formalization should faithfully reflect the perturbation rather than reverting to the original one or inferring a different one on its own. We build a benchmark with both perturbations on miniF2F and MATH-500, and automatically measure how stable the correctness of a proof autoformalization is under global perturbations and how faithfully its output reflects local perturbations. We evaluate seven recent models, all of which are sensitive to global perturbations and mostly fail to remain faithful under local perturbations. Code and data are available via https://github.com/ucr-rai/robust-proof-autoformalization.
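To make the notion of a local perturbation concrete, here is a minimal Lean 4 sketch. It is a toy illustration and is not taken from the paper's benchmark: the theorem names, the informal proofs in the comments, and the choice of the `omega` tactic are all assumptions made for this example, and it presumes a recent Lean 4 toolchain in which `omega` is available without extra imports.

```lean
-- Hypothetical toy example; not drawn from the paper's benchmark.

-- Original informal proof:
--   "If x + 2 = 5, then subtracting 2 from both sides gives x = 3."
theorem original (x : Nat) (h : x + 2 = 5) : x = 3 := by
  omega

-- Locally perturbed informal proof (the value 5 is changed to 7):
--   "If x + 2 = 7, then subtracting 2 from both sides gives x = 5."
-- A faithful formalization carries the changed value into the statement
-- instead of reverting to the original hypothesis and conclusion.
theorem perturbed_faithful (x : Nat) (h : x + 2 = 7) : x = 5 := by
  omega
```

In this sketch, an unfaithful output for the perturbed input would be one that simply restates `original`. A global perturbation, by contrast, would only reword the informal proof, so the expected formal statement would stay the same. For a counterfactual perturbation, the faithful statement may not be provable at all; the abstract does not specify how that case is scored.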
