Fugu-MT Paper Translation (Abstract): Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents

Paper overview: Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents

arxiv url: http://arxiv.org/abs/2603.17150v1
Date: Tue, 17 Mar 2026 21:28:59 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.401361
Title: Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents
Title (reference translation): Intent Formalization: A Grand Challenge for Highly Reliable Coding in the Age of AI Agents
Authors: Shuvendu K. Lahiri
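Abstract summary: Agentic AI systems can now generate code with remarkable fluency. The challenge is to ensure that the generated code actually does what the user intended.
Reference score (attention score computed independently by the site): 7.228124845671868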
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Agentic AI systems can now generate code with remarkable fluency, but a fundamental question remains: does the generated code actually do what the user intended? The gap between informal natural language requirements and precise program behavior -- the "intent gap" -- has always plagued software engineering, but AI-generated code amplifies it to an unprecedented scale. This article argues that intent formalization -- the translation of informal user intent into a set of checkable formal specifications -- is the key challenge that will determine whether AI makes software more reliable or merely more abundant. Intent formalization offers a tradeoff spectrum suitable to the reliability needs of different contexts: from lightweight tests that disambiguate likely misinterpretations, through full functional specifications for formal verification, to domain-specific languages from which correct code is synthesized automatically. The central bottleneck is validating specifications: since there is no oracle for specification correctness other than the user, we need semi-automated metrics that can assess specification quality with or without code, through lightweight user interaction and proxy artifacts such as tests. We survey early research that demonstrates the potential of this approach: interactive test-driven formalization that improves program correctness, AI-generated postconditions that catch real-world bugs missed by prior methods, and end-to-end verified pipelines that produce provably correct code from informal specifications. We outline the open research challenges -- scaling beyond benchmarks, achieving compositionality over changes, metrics for validating specifications, handling rich logics, designing human-AI specification interactions -- that define a research agenda spanning AI, programming languages, formal methods, and human-computer interaction.
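Abstract (reference translation): Agentic AI systems can now produce code with remarkable fluency, yet a fundamental question remains: does the generated code really do what the user intended? The gap between informal natural-language requirements and precise program behavior, called the intent gap, has always troubled software engineering, and AI-generated code amplifies it to an unprecedented scale. The article argues that intent formalization, meaning the translation of informal user intent into a set of checkable formal specifications, is the key challenge that will decide whether AI makes software more reliable or merely more abundant. Intent formalization offers a spectrum of tradeoffs suited to the reliability requirements of different contexts: lightweight tests that disambiguate likely misinterpretations, full functional specifications for formal verification, and domain-specific languages from which correct code is synthesized automatically. The central bottleneck is validating specifications. Because no oracle for specification correctness exists other than the user, semi-automated metrics are needed that can assess specification quality with or without code, through lightweight user interaction and proxy artifacts such as tests. The article surveys early research that demonstrates the potential of this approach: interactive test-driven formalization that improves program correctness, AI-generated postconditions that catch real-world bugs missed by prior methods, and end-to-end verified pipelines that produce provably correct code from informal specifications. It then outlines the open research challenges (scaling beyond benchmarks, achieving compositionality over changes, metrics for validating specifications, handling rich logics, and designing human-AI specification interactions) that define a research agenda spanning AI, programming languages, formal methods, and human-computer interaction.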
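Illustration (not from the paper): the abstract describes a spectrum that starts with lightweight tests that disambiguate likely misinterpretations and moves toward full functional specifications. The Python sketch below is one possible reading of those two points on the spectrum. The informal requirement ("remove duplicates from a list"), the two candidate implementations, and every function name are hypothetical, chosen only to make the idea concrete.

```python
from typing import Callable, List

# Hypothetical informal intent: "remove duplicates from a list".
# The ambiguity: must the order of first occurrences be preserved?


def dedup_sorted(xs: List[int]) -> List[int]:
    """Candidate A (hypothetical): drops duplicates, returns sorted values."""
    return sorted(set(xs))


def dedup_stable(xs: List[int]) -> List[int]:
    """Candidate B (hypothetical): drops duplicates, keeps first-occurrence order."""
    seen = set()
    out: List[int] = []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def disambiguating_test(f: Callable[[List[int]], List[int]]) -> bool:
    """Lightweight test a user could accept or reject: order is preserved."""
    return f([3, 1, 3, 2, 1]) == [3, 1, 2]


def postcondition(xs: List[int], ys: List[int]) -> bool:
    """Checkable specification of one reading of the intent:
    ys has no duplicates, has the same elements as xs,
    and lists them in order of first occurrence in xs."""
    no_dups = len(ys) == len(set(ys))
    same_elems = set(ys) == set(xs)
    first_occurrence_order = sorted(set(xs), key=xs.index)
    return no_dups and same_elems and ys == first_occurrence_order


if __name__ == "__main__":
    sample = [3, 1, 3, 2, 1]
    for candidate in (dedup_sorted, dedup_stable):
        print(
            candidate.__name__,
            "passes test:", disambiguating_test(candidate),
            "| satisfies postcondition:", postcondition(sample, candidate(sample)),
        )
```

In this sketch the single test separates the two candidates, and the postcondition generalizes the same reading of the intent to arbitrary inputs. Whether that reading is the right one is still a question only the user can answer, which is the validation bottleneck the abstract points to.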


List of related papers