Fugu-MT 論文翻訳(概要): Learning to Disprove: Formal Counterexample Generation with Large Language Models

論文の概要: Learning to Disprove: Formal Counterexample Generation with Large Language Models

arxiv url: http://arxiv.org/abs/2603.19514v1
Date: Thu, 19 Mar 2026 22:42:49 GMT
ステータス: 翻訳完了
システム内更新日: 2026-03-23 19:48:38.911656
Title: Learning to Disprove: Formal Counterexample Generation with Large Language Models
Title（参考訳）: 解き放つ学習:大規模言語モデルを用いた形式的反例生成
Authors: Zenan Li, Zhaoyu Li, Kaiyu Yang, Xiaoxing Ma, Zhendong Su,
Abstract要約: 数学における現在のAIの取り組みは、ほとんど証明構築にのみ焦点をあてている。我々はこのタスクを形式的な逆例生成として定式化する。多様なトレーニングデータを合成するシンボリックな突然変異戦略を導入する。
参考スコア（独自算出の注目度）: 26.262810717227108
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Mathematical reasoning demands two critical, complementary skills: constructing rigorous proofs for true statements and discovering counterexamples that disprove false ones. However, current AI efforts in mathematics focus almost exclusively on proof construction, often neglecting the equally important task of finding counterexamples. In this paper, we address this gap by fine-tuning large language models (LLMs) to reason about and generate counterexamples. We formalize this task as formal counterexample generation, which requires LLMs not only to propose candidate counterexamples but also to produce formal proofs that can be automatically verified in the Lean 4 theorem prover. To enable effective learning, we introduce a symbolic mutation strategy that synthesizes diverse training data by systematically extracting theorems and discarding selected hypotheses, thereby producing diverse counterexample instances. Together with curated datasets, this strategy enables a multi-reward expert iteration framework that substantially enhances both the effectiveness and efficiency of training LLMs for counterexample generation and theorem proving. Experiments on three newly collected benchmarks validate the advantages of our approach, showing that the mutation strategy and training framework yield significant performance gains.
Abstract（参考訳）: 数学的推論は、真言の厳密な証明を構築し、偽の証明を否定する反例を発見するという、2つの重要な補完的なスキルを要求する。しかし、現在の数学におけるAIの取り組みは、ほとんど証明構築に焦点を合わせており、しばしば反例を見つけるという同様に重要なタスクを無視している。本稿では,大言語モデル(LLM)を微調整し,逆例を生成することで,このギャップに対処する。我々は、このタスクを形式的な反例生成として形式化し、LSMは候補となる反例を提案するだけでなく、Lean 4定理証明器で自動的に検証できる形式的な証明を生成する必要がある。効果的な学習を可能にするために,定理を体系的に抽出し,選択された仮説を捨て,多様な逆例を生成することによって,多様なトレーニングデータを合成するシンボリック突然変異戦略を導入する。この戦略は、キュレートされたデータセットとともに、逆例生成と定理証明のためのLLMのトレーニングの有効性と効率を大幅に向上するマルチリワード専門家の反復フレームワークを実現する。新たに収集された3つのベンチマーク実験により,本手法の利点が検証され,変異戦略とトレーニングフレームワークが大きなパフォーマンス向上をもたらすことが示された。

論文の概要: Learning to Disprove: Formal Counterexample Generation with Large Language Models

関連論文リスト