Fugu-MT 論文翻訳(概要): Intent-aligned Formal Specification Synthesis via Traceable Refinement

論文の概要: Intent-aligned Formal Specification Synthesis via Traceable Refinement

arxiv url: http://arxiv.org/abs/2604.10392v1
Date: Sun, 12 Apr 2026 00:52:27 GMT
ステータス: 翻訳完了
システム内更新日: 2026-04-14 20:13:15.989851
Title: Intent-aligned Formal Specification Synthesis via Traceable Refinement
Title（参考訳）: トレーサブルリファインメントによるインテントアライン形式仕様合成
Authors: Zhe Ye, Aidan Z. H. Yang, Huangyuan Su, Zhenyu Liao, Samuel Tenka, Zhizhen Qin, Udaya Ghai, Dawn Song, Soonho Kong,
Abstract要約: VeriSpecGenは、要求レベルの属性と局所的な修復を通じて、リーンで意図に整合した仕様を合成する、トレーサブルな改善フレームワークです。 VeriSpecGenは自然言語をアトミックな要件に分解し、明示的なトレーサビリティマップで要求対象のテストを生成する。我々は、Claude Opus 4.5を使用してVERINA SpecGenタスクの86.6%を達成し、モデルファミリとスケールで最大31.8ポイントのベースラインを改善する。
参考スコア（独自算出の注目度）: 43.72968325799861
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models are increasingly used to generate code from natural language, but ensuring correctness remains challenging. Formal verification offers a principled way to obtain such guarantees by proving that a program satisfies a formal specification. However, specifications are frequently missing in real-world codebases, and writing high-quality specifications remains expensive and expertise-intensive. We present VeriSpecGen, a traceable refinement framework that synthesizes intent-aligned specifications in Lean through requirement-level attribution and localized repair. VeriSpecGen decomposes natural language into atomic requirements and generates requirement-targeted tests with explicit traceability maps to validate generated specifications. When validation fails, traceability maps attribute failures to specific requirements, enabling targeted clause-level repairs. VeriSpecGen achieve 86.6% on VERINA SpecGen task using Claude Opus 4.5, improving over baselines by up to 31.8 points across different model families and scales. Beyond inference-time gains, we generate 343K training examples from VeriSpecGen refinement trajectories and demonstrate that training on these trajectories substantially improves specification synthesis by 62-106% relative and transfers gains to general reasoning abilities.
Abstract（参考訳）: 大きな言語モデルは、自然言語からコードを生成するためにますます使われていますが、正確性を保証することは依然として困難です。形式検証は、プログラムが形式仕様を満たすことを証明することによって、そのような保証を得るための原則化された方法を提供する。しかし、現実世界のコードベースでは仕様が欠落することが多く、高品質な仕様を書くことは高価で専門知識に重点を置いている。 VeriSpecGenは、要求レベルの属性と局所的な修復を通じて、リーンで意図に整合した仕様を合成する、トレーサブルな改善フレームワークです。 VeriSpecGenは、自然言語をアトミックな要件に分解し、明確なトレーサビリティマップで要求対象のテストを生成して、生成された仕様を検証する。検証が失敗した場合、トレーサビリティーは障害を特定の要件にマッピングし、ターゲットの節レベルの修復を可能にする。 VeriSpecGen は Claude Opus 4.5 を使用して VERINA SpecGen タスクで 86.6% を達成した。推論時間ゲインの他に、VeriSpecGen精錬軌道から343Kのトレーニング例を生成し、これらのトラジェクトリのトレーニングが62-106%の相対的な仕様合成を著しく改善し、一般的な推論能力にゲインを移すことを実証した。

論文の概要: Intent-aligned Formal Specification Synthesis via Traceable Refinement

関連論文リスト