Fugu-MT Paper Translation (Abstract): VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation

Paper overview: VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation

arxiv url: http://arxiv.org/abs/2510.05156v1
Date: Fri, 03 Oct 2025 04:11:43 GMT
Status: Translation complete
Last updated in system: 2025-10-08 17:57:07.863964
Title: VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation
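Title (reference translation): VeriGuard: Improving LLM Agent Safety through Verified Code Generation
Authors: Lesly Miculicich, Mihir Parmar, Hamid Palangi, Krishnamurthy Dj Dvijotham, Mirko Montanari, Tomas Pfister, Long T. Le
Abstract summary: Deploying autonomous AI agents in sensitive domains such as healthcare poses serious risks to safety, security, and privacy. We introduce VeriGuard, a new framework that provides formal safety guarantees for LLM-based agents.
Reference score (independently computed attention score): 40.594947933580464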
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The deployment of autonomous AI agents in sensitive domains, such as healthcare, introduces critical risks to safety, security, and privacy. These agents may deviate from user objectives, violate data handling policies, or be compromised by adversarial attacks. Mitigating these dangers necessitates a mechanism to formally guarantee that an agent's actions adhere to predefined safety constraints, a challenge that existing systems do not fully address. We introduce VeriGuard, a novel framework that provides formal safety guarantees for LLM-based agents through a dual-stage architecture designed for robust and verifiable correctness. The initial offline stage involves a comprehensive validation process. It begins by clarifying user intent to establish precise safety specifications. VeriGuard then synthesizes a behavioral policy and subjects it to both testing and formal verification to prove its compliance with these specifications. This iterative process refines the policy until it is deemed correct. Subsequently, the second stage provides online action monitoring, where VeriGuard operates as a runtime monitor to validate each proposed agent action against the pre-verified policy before execution. This separation of the exhaustive offline validation from the lightweight online monitoring allows formal guarantees to be practically applied, providing a robust safeguard that substantially improves the trustworthiness of LLM agents.
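Abstract (reference translation): Deploying autonomous AI agents in sensitive domains such as healthcare poses serious risks to safety, security, and privacy. These agents may deviate from user goals, violate data-handling policies, or be compromised by adversarial attacks. Mitigating these dangers requires formally guaranteeing that an agent's actions comply with predefined safety constraints. We introduce VeriGuard, a novel framework that provides formal safety guarantees for LLM-based agents through a dual-stage architecture designed for robust and verifiable correctness. The first, offline stage involves a comprehensive validation process. It begins by clarifying user intent in order to establish precise safety specifications. VeriGuard then synthesizes a behavioral policy and subjects it to both testing and formal verification to prove that it complies with these specifications. This iterative process refines the policy until it is judged correct. The second stage then provides online action monitoring, in which VeriGuard acts as a runtime monitor that validates each proposed agent action against the pre-verified policy before execution. Separating the exhaustive offline validation from the lightweight online monitoring makes it practical to apply formal guarantees, providing a robust safeguard that substantially improves the trustworthiness of LLM agents.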
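The dual-stage flow described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the `Action` type, the record-sharing specification, the two candidate policies, and every function name here are hypothetical. Formal verification is stood in for by an exhaustive check over a tiny finite action space, and LLM-driven policy synthesis is stood in for by a fixed list of candidates.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, Optional

@dataclass(frozen=True)
class Action:
    tool: str        # hypothetical tool name, e.g. "read_record"
    recipient: str   # hypothetical destination, "internal" or "external"

Policy = Callable[[Action], bool]  # returns True when the action is allowed

TOOLS = ("read_record", "send_record")
RECIPIENTS = ("internal", "external")

def spec_allows(action: Action) -> bool:
    """Assumed safety specification: records are never sent externally."""
    return not (action.tool == "send_record" and action.recipient == "external")

def verify(policy: Policy) -> Optional[Action]:
    """Stand-in for formal verification: enumerate the whole (finite) action
    space and return an action the policy allows but the spec forbids."""
    for tool, recipient in product(TOOLS, RECIPIENTS):
        action = Action(tool, recipient)
        if policy(action) and not spec_allows(action):
            return action
    return None

def offline_stage(candidates: Iterable[Policy], tests) -> Policy:
    """Try successive policy refinements until one passes testing and verification."""
    for policy in candidates:
        if any(policy(action) != expected for action, expected in tests):
            continue  # failed testing, move to the next refinement
        if verify(policy) is None:
            return policy  # no counterexample found
    raise RuntimeError("no candidate policy satisfied the specification")

def online_monitor(policy: Policy, proposed: Action,
                   execute: Callable[[Action], str]) -> str:
    """Check each proposed action against the pre-verified policy before running it."""
    if not policy(proposed):
        return "blocked"
    return execute(proposed)

# Two hypothetical candidates: the first is too permissive, the second is refined.
def permissive(action: Action) -> bool:
    return True

def refined(action: Action) -> bool:
    return action.tool == "read_record" or action.recipient == "internal"

TESTS = [(Action("read_record", "internal"), True)]

def run(action: Action) -> str:
    return f"ran {action.tool}"

policy = offline_stage([permissive, refined], TESTS)
print(online_monitor(policy, Action("send_record", "external"), run))
print(online_monitor(policy, Action("read_record", "internal"), run))
```

In this sketch the expensive work (testing plus the exhaustive check) happens once in `offline_stage`, while `online_monitor` only evaluates a single predicate per action, which mirrors the separation of heavyweight offline validation from lightweight runtime monitoring that the abstract describes.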
