Fugu-MT Paper Translation (Abstract): Evaluating Generalization Mechanisms in Autonomous Cyber Attack Agents

Paper summary: Evaluating Generalization Mechanisms in Autonomous Cyber Attack Agents

arxiv url: http://arxiv.org/abs/2603.10041v1
Date: Fri, 06 Mar 2026 22:24:37 GMT
Status: Translation complete
Last updated in system: 2026-03-12 16:22:32.581347
Title: Evaluating Generalization Mechanisms in Autonomous Cyber Attack Agents
Authors: Ondřej Lukáš, Jihoon Shin, Emilia Rivas, Diego Forni, Maria Rigaki, Carlos Catania, Aritran Piplai, Christopher Kiekintveld, Sebastian Garcia
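Abstract summary: We study how autonomous attack agents fail to transfer beyond the networks on which they are trained. We compare three agent families (traditional RL, adaptation agents, and LLM-based agents) and use action-distribution-based behavioral/XAI analyses to localize failure modes. Prompt-driven LLM agents achieve the highest success on the held-out reassignment, but at the cost of increased inference-time compute, reduced transparency, and practical failure modes such as repetition/invalid-action loops.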
Reference score (independently computed attention score): 0.5611004142746667
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Autonomous offensive agents often fail to transfer beyond the networks on which they are trained. We isolate a minimal but fundamental shift -- unseen host/subnet IP reassignment in an otherwise fixed enterprise scenario -- and evaluate attacker generalization in the NetSecGame environment. Agents are trained on five IP-range variants and tested on a sixth unseen variant; only the meta-learning agent may adapt at test time. We compare three agent families (traditional RL, adaptation agents, and LLM-based agents) and use action-distribution-based behavioral/XAI analyses to localize failure modes. Some adaptation methods show partial transfer but significant degradation under unseen reassignment, indicating that even address-space changes can break long-horizon attack policies. Under our evaluation protocol and agent-specific assumptions, prompt-driven pretrained LLM agents achieve the highest success on the held-out reassignment, but at the cost of increased inference-time compute, reduced transparency, and practical failure modes such as repetition/invalid-action loops.

Related papers