Fugu-MT Paper Translation (Abstract): Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics

Paper abstract: Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics

arxiv url: http://arxiv.org/abs/2510.12787v1
Date: Tue, 14 Oct 2025 17:57:04 GMT
Status: Translation complete
Last updated in system: 2025-10-15 19:02:32.437587
Title: Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics
Title (reference translation): Ax-Prover: A Deep Reasoning Agent Framework for Theorem Proving in Mathematics and Quantum Physics
Authors: Marco Del Tredici, Jacob McCarran, Benjamin Breen, Javier Aspuru Mijares, Weichen Winston Yin, Jacob M. Taylor, Frank Koppens, Dirk Englund
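Abstract summary: Ax-Prover is a multi-agent system for automated theorem proving in Lean. It can solve problems across diverse scientific domains and operate either autonomously or collaboratively with human experts. We benchmark our approach against frontier LLMs and specialized prover models on two public math benchmarks and on two Lean benchmarks we introduce in the fields of abstract algebra and quantum theory.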
Reference score (independently computed attention metric): 1.2978846076301875
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present Ax-Prover, a multi-agent system for automated theorem proving in Lean that can solve problems across diverse scientific domains and operate either autonomously or collaboratively with human experts. To achieve this, Ax-Prover approaches scientific problem solving through formal proof generation, a process that demands both creative reasoning and strict syntactic rigor. Ax-Prover meets this challenge by equipping Large Language Models (LLMs), which provide knowledge and reasoning, with Lean tools via the Model Context Protocol (MCP), which ensure formal correctness. To evaluate its performance as an autonomous prover, we benchmark our approach against frontier LLMs and specialized prover models on two public math benchmarks and on two Lean benchmarks we introduce in the fields of abstract algebra and quantum theory. On public datasets, Ax-Prover is competitive with state-of-the-art provers, while it largely outperforms them on the new benchmarks. This shows that, unlike specialized systems that struggle to generalize, our tool-based agentic theorem prover approach offers a generalizable methodology for formal verification across diverse scientific domains. Furthermore, we demonstrate Ax-Prover's assistant capabilities in a practical use case, showing how it enabled an expert mathematician to formalize the proof of a complex cryptography theorem.
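Abstract (reference translation): We introduce Ax-Prover, a multi-agent system for automated theorem proving in Lean. To achieve this, Ax-Prover approaches scientific problem solving through formal proof generation, which demands both creative reasoning and strict syntactic rigor. Ax-Prover addresses this challenge by equipping Large Language Models (LLMs), which provide knowledge and reasoning, with Lean tools via the Model Context Protocol (MCP), which ensure formal correctness. To evaluate its performance as an autonomous prover, we benchmarked our approach against frontier LLMs and specialized prover models on two public math benchmarks and on two Lean benchmarks we introduce in the fields of abstract algebra and quantum theory. On public datasets, Ax-Prover is competitive with state-of-the-art provers, while it largely outperforms them on the new benchmarks. This shows that, unlike specialized systems that struggle to generalize, a tool-based agentic theorem-proving approach offers a generalizable methodology for formal verification across diverse scientific domains. Furthermore, we demonstrate Ax-Prover's assistant capabilities in a practical use case, showing that it enabled the formalization of the proof of a complex cryptography theorem.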

Related papers