Fugu-MT Paper Translation (Abstract): Automatic Generation of Formal Specification and Verification Annotations Using LLMs and Test Oracles

Paper overview: Automatic Generation of Formal Specification and Verification Annotations Using LLMs and Test Oracles

arxiv url: http://arxiv.org/abs/2601.12845v1
Date: Mon, 19 Jan 2026 08:56:43 GMT
Status: Translation complete
Last updated in system: 2026-01-21 22:47:22.82125
Title: Automatic Generation of Formal Specification and Verification Annotations Using LLMs and Test Oracles
Title (reference translation): Automatic Generation of Formal Specifications and Verification Annotations Using LLMs and Test Oracles
Authors: João Pascoal Faria, Emanuel Trigo, Vinicius Honorato, Rui Abreu
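Abstract summary: In experiments on 110 Dafny programs, a multimodel approach combining Claude Opus 4.5 and GPT-5.2 generated correct annotations for 98.2% of the programs within at most 8 repair iterations. A logistic regression analysis shows that proof-helper annotations contribute disproportionately to problem difficulty for current LLMs.
Reference score (independently computed attention measure): 3.4742046772246837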
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent verification tools aim to make formal verification more accessible to software engineers by automating most of the verification process. However, annotating conventional programs with the formal specification and verification constructs (preconditions, postconditions, loop invariants, auxiliary predicates and functions and proof helpers) required to prove their correctness still demands significant manual effort and expertise. This paper investigates how LLMs can automatically generate such annotations for programs written in Dafny, a verification-aware programming language, starting from conventional code accompanied by natural language specifications (in comments) and test code. In experiments on 110 Dafny programs, a multimodel approach combining Claude Opus 4.5 and GPT-5.2 generated correct annotations for 98.2% of the programs within at most 8 repair iterations, using verifier feedback. A logistic regression analysis shows that proof-helper annotations contribute disproportionately to problem difficulty for current LLMs. Assertions in the test cases served as static oracles to automatically validate the generated pre/postconditions. We also compare generated and manual solutions and present an extension for Visual Studio Code to incorporate automatic generation into the IDE, with encouraging usability feedback.
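Abstract (reference translation): Recent verification tools aim to make formal verification more accessible to software engineers by automating most of the verification process. However, annotating conventional programs with the formal specification and verification constructs (preconditions, postconditions, loop invariants, auxiliary predicates and functions, and proof helpers) still requires considerable manual effort and expertise. This paper examines how LLMs can automatically generate such annotations for programs written in Dafny, a verification-aware programming language, starting from conventional code accompanied by natural language specifications (in comments) and test code. In experiments on 110 Dafny programs, a multimodel approach combining Claude Opus 4.5 and GPT-5.2 used verifier feedback to generate correct annotations for 98.2% of the programs within at most 8 repair iterations. A logistic regression analysis shows that proof-helper annotations contribute disproportionately to problem difficulty for current LLMs. Assertions in the test cases served as static oracles to automatically validate the generated pre/postconditions. The paper also compares generated and manual solutions and presents a Visual Studio Code extension that incorporates automatic generation into the IDE, with encouraging usability feedback.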
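
The generate, verify, and repair cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `generate` and `verify` callables, their signatures, and the policy of trying models one after another are all assumptions made here. The only figure taken from the abstract is the bound of 8 repair iterations. In this sketch, `verify` is assumed to run the verifier over the annotated program together with its test code, so that failing test assertions surface as verifier errors.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
#   generate(program, feedback) -> annotated program text
#   verify(annotated)           -> (verified?, verifier output)
Generator = Callable[[str, Optional[str]], str]
Verifier = Callable[[str], Tuple[bool, str]]

MAX_REPAIRS = 8  # the abstract reports "at most 8 repair iterations"


def annotate_with_repair(
    program: str,
    generators: Sequence[Generator],
    verify: Verifier,
    max_repairs: int = MAX_REPAIRS,
) -> Optional[str]:
    """Return the first annotated program that verifies, or None.

    Models are tried in the given order (an assumed fallback policy).
    Each model gets one initial attempt plus up to `max_repairs`
    repair attempts that receive the previous verifier output.
    """
    for generate in generators:
        feedback: Optional[str] = None  # no feedback on the first attempt
        for _ in range(max_repairs + 1):
            candidate = generate(program, feedback)
            ok, output = verify(candidate)
            if ok:
                return candidate
            feedback = output  # feed verifier errors into the next attempt
    return None
```

In use, each entry in `generators` would wrap a call to one LLM, and `verify` would wrap an invocation of the Dafny verifier. Both wrappers are left out because their details are not given in the abstract.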


Related papers