Fugu-MT Paper Translation (Abstract): Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems

Paper overview: Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems

arxiv url: http://arxiv.org/abs/2503.23804v1
Date: Mon, 31 Mar 2025 07:35:40 GMT
Status: Translation complete
Last updated in system: 2025-04-01 19:35:57.303803
Title: Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems
Title (reference translation): Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems
Authors: Shiyi Yang, Zhibo Hu, Chen Wang, Tong Yu, Xiwei Xu, Liming Zhu, Lina Yao
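Abstract summary: Large language model-based agents are increasingly used in recommender systems (Agent4RSs) to achieve personalized behavior modeling. To the best of our knowledge, how robust Agent4RSs are remains unexplored. This paper proposes the first work to attack Agent4RSs by perturbing the agents' memories.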
Reference score (independently computed attention measure): 29.35591074298123
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language model-based agents are increasingly used in recommender systems (Agent4RSs) to achieve personalized behavior modeling. Specifically, Agent4RSs introduce memory mechanisms that enable the agents to autonomously learn and self-evolve from real-world interactions. However, to the best of our knowledge, how robust Agent4RSs are remains unexplored. As such, in this paper, we propose the first work to attack Agent4RSs by perturbing agents' memories, not only to uncover their limitations but also to enhance their security and robustness, ensuring the development of safer and more reliable AI agents. Given the security and privacy concerns, it is more practical to launch attacks under a black-box setting, where accurate knowledge of the victim models cannot be easily obtained. Moreover, practical attacks are often stealthy to maximize their impact. To this end, we propose a novel practical attack framework named DrunkAgent. DrunkAgent consists of a generation module, a strategy module, and a surrogate module. The generation module aims to produce effective and coherent adversarial textual triggers, which can be used to achieve attack objectives such as promoting the target items. The strategy module is designed to 'get the target agents drunk' so that their memories cannot be effectively updated during the interaction process, which lets the triggers take full effect. Both modules are optimized on the surrogate module to improve the transferability and imperceptibility of the attacks. By identifying and analyzing the vulnerabilities, our work provides critical insights that pave the way for building safer and more resilient Agent4RSs. Extensive experiments across various real-world datasets demonstrate the effectiveness of DrunkAgent.
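Abstract (reference translation): Large language model-based agents are increasingly used in recommender systems (Agent4RSs) to achieve personalized behavior modeling. Specifically, Agent4RSs introduce memory mechanisms that allow the agents to learn autonomously and self-evolve from real-world interactions. However, to the best of our knowledge, how robust Agent4RSs are remains unexplored. This paper therefore proposes the first work to attack Agent4RSs by perturbing the agents' memories, not only to reveal their limitations but also to improve their security and robustness, ensuring the development of safer and more reliable AI agents. Because of security and privacy concerns, it is more realistic to launch attacks in a black-box setting, where accurate knowledge of the victim models cannot be easily obtained. In addition, practical attacks are often stealthy in order to maximize their impact. To this end, we propose a new attack framework named DrunkAgent. DrunkAgent consists of a generation module, a strategy module, and a surrogate module. The generation module aims to produce effective and coherent adversarial textual triggers that can be used to achieve attack objectives such as promoting target items. The strategy module is designed to 'get the target agents drunk' so that their memories cannot be effectively updated during the interaction process, which allows the triggers to take full effect. Both modules are optimized on the surrogate module to improve the transferability and imperceptibility of the attacks. By identifying and analyzing the vulnerabilities, our work provides important insights for building safer and more resilient Agent4RSs. Extensive experiments across various real-world datasets demonstrate the effectiveness of DrunkAgent.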


Related papers