Fugu-MT Paper Translation (Abstract): Optimizing the Cost-Quality Tradeoff of Agentic Theorem Provers in Lean

Paper overview: Optimizing the Cost-Quality Tradeoff of Agentic Theorem Provers in Lean

arxiv url: http://arxiv.org/abs/2606.04883v1
Date: Wed, 03 Jun 2026 13:46:20 GMT
Status: Translation complete
System update date: 2026-06-04 20:44:18.788558
Title: Optimizing the Cost-Quality Tradeoff of Agentic Theorem Provers in Lean
Authors: Kári Rögnvaldsson, Chenhao Sun, Jasper Dekoninck, Martin Vechev
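Abstract summary: Large language models (LLMs) are increasingly used to generate formal proofs in Lean. These workflows can be prohibitively expensive, often spending substantial compute on attempts that ultimately fail. This work addresses the problem with an action routing agent that consists of a data plane and a control plane. The control plane observes previous failed Lean attempts, estimates both the likelihood of success and the cost of another attempt, and decides whether to continue proving the current target or restart from a new breakdown.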
Reference score (independently computed attention score): 3.9323543777759014
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) are increasingly used in workflows for generating formal proofs in Lean. These workflows often decompose problems into smaller lemmas, sample many proof attempts, and use compiler feedback to guide search. However, they can be prohibitively expensive, often spending substantial compute on attempts that ultimately fail. In this work, we address this problem with an action routing agent that consists of a data plane and a control plane. The data plane generates natural-language lemma decompositions, formalizes them in Lean, and samples proof attempts for the resulting theorem and lemma targets. The control plane observes previous failed Lean attempts, estimates both the likelihood of success and cost of another attempt, and decides whether to continue proving the current target or restart from a new breakdown. On a subset of PutnamBench, our agent decreases the cost by $25.8\%$ over a fixed-step baseline on average, preserving performance while using substantially less compute. These results suggest that failed Lean trajectories provide actionable signals for cost-aware resource allocation in agentic theorem proving.
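The abstract does not specify how the control plane turns its two estimates into a decision. As a purely illustrative sketch, and not the authors' method, one simple rule is to compare the expected cost per success of each action, assuming repeated attempts are independent with a fixed success probability. The names `Estimate`, `expected_cost_per_success`, and `route` are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Hypothetical control-plane estimate for one more action of a given kind."""
    p_success: float  # estimated probability that the action yields a verified proof
    cost: float       # estimated cost of one such action (e.g. tokens or dollars)


def expected_cost_per_success(est: Estimate) -> float:
    """Expected total cost until the first success.

    Assumes independent attempts with constant success probability, so the
    expected number of attempts is 1 / p_success (geometric distribution).
    """
    if est.p_success <= 0.0:
        return float("inf")
    return est.cost / est.p_success


def route(continue_est: Estimate, restart_est: Estimate) -> str:
    """Return "continue" or "restart", whichever has the lower expected cost per success."""
    if expected_cost_per_success(continue_est) <= expected_cost_per_success(restart_est):
        return "continue"
    return "restart"


if __name__ == "__main__":
    # Continuing is cheap but unlikely to succeed; restarting costs more per attempt.
    print(route(Estimate(p_success=0.05, cost=1.0), Estimate(p_success=0.20, cost=2.0)))
    # Continuing is both cheap and likely to succeed.
    print(route(Estimate(p_success=0.50, cost=1.0), Estimate(p_success=0.20, cost=2.0)))
```

In the first call the expected cost per success is 20 for continuing and 10 for restarting, so the rule restarts; in the second it is 2 versus 10, so it continues. A real controller would derive these estimates from the failed Lean attempts it has observed, which this sketch leaves out.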
