Strategic Communication under Threat: Learning Information Trade-offs in Pursuit-Evasion Games
- URL: http://arxiv.org/abs/2510.07813v1
- Date: Thu, 09 Oct 2025 05:44:00 GMT
- Title: Strategic Communication under Threat: Learning Information Trade-offs in Pursuit-Evasion Games
- Authors: Valerio La Gatta, Dolev Mutzari, Sarit Kraus, VS Subrahmanian
- Abstract summary: We formulate a Pursuit-Evasion-Exposure-Concealment Game (PEEC) in which a pursuer agent must decide when to communicate in order to obtain the evader's position. Both agents learn their movement policies via reinforcement learning, while the pursuer additionally learns a communication policy that balances observability and risk. Empirical evaluations show that SHADOW pursuers achieve higher success rates than six competitive baselines.
- Score: 21.58614507029022
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial environments require agents to navigate a key strategic trade-off: acquiring information enhances situational awareness, but may simultaneously expose them to threats. To investigate this tension, we formulate a Pursuit-Evasion-Exposure-Concealment Game (PEEC) in which a pursuer agent must decide when to communicate in order to obtain the evader's position. Each communication reveals the pursuer's location, increasing the risk of being targeted. Both agents learn their movement policies via reinforcement learning, while the pursuer additionally learns a communication policy that balances observability and risk. We propose SHADOW (Strategic-communication Hybrid Action Decision-making under partial Observation for Warfare), a multi-headed sequential reinforcement learning framework that integrates continuous navigation control, discrete communication actions, and opponent modeling for behavior prediction. Empirical evaluations show that SHADOW pursuers achieve higher success rates than six competitive baselines. Our ablation study confirms that temporal sequence modeling and opponent modeling are critical for effective decision-making. Finally, our sensitivity analysis reveals that the learned policies generalize well across varying communication risks and physical asymmetries between agents.
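The hybrid action structure described in the abstract (a continuous navigation action plus a discrete communicate/stay-silent decision from shared features) can be sketched as a multi-headed policy. The layer sizes, weight initialization, and head names below are illustrative assumptions, not the paper's actual SHADOW architecture, which additionally includes temporal sequence modeling and opponent modeling omitted here:

```python
import numpy as np

rng = np.random.default_rng(0)

class HybridPolicy:
    """Illustrative multi-headed policy: one shared encoder feeds a
    continuous movement head and a discrete communication head."""

    def __init__(self, obs_dim, hidden=32, move_dim=2):
        self.W_enc = rng.normal(0, 0.1, (obs_dim, hidden))
        self.W_move = rng.normal(0, 0.1, (hidden, move_dim))  # e.g. heading/speed
        self.W_comm = rng.normal(0, 0.1, (hidden, 1))         # communicate or not

    def act(self, obs):
        h = np.tanh(obs @ self.W_enc)                         # shared features
        move = np.tanh(h @ self.W_move)                       # continuous action in [-1, 1]
        p_comm = 1.0 / (1.0 + np.exp(-(h @ self.W_comm)))     # Bernoulli probability
        communicate = bool(rng.random() < p_comm.item())      # sampled discrete action
        return move, communicate

policy = HybridPolicy(obs_dim=4)
move, communicate = policy.act(np.zeros(4))
print(move.shape, isinstance(communicate, bool))  # (2,) True
```

In training, the movement head would be optimized as a continuous-control policy while the communication head is optimized as a discrete (Bernoulli) policy, with the communication probability trading off the value of observing the evader against the exposure risk the abstract describes.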
Related papers
- Verification Required: The Impact of Information Credibility on AI Persuasion [13.454393198058398]
We introduce MixTalk, a strategic communication game for LLM-to-LLM interaction that models information credibility. We evaluate state-of-the-art LLM agents in large-scale tournaments across three realistic deployment settings. We propose Tournament Oracle Policy Distillation (TOPD), an offline method that distills a tournament oracle policy from interaction logs and deploys it in-context at inference time.
arXiv Detail & Related papers (2026-02-01T02:22:28Z) - Searching for Privacy Risks in LLM Agents via Simulation [61.229785851581504]
We present a search-based framework that alternates between improving attack and defense strategies through the simulation of privacy-critical agent interactions. We find that attack strategies escalate from direct requests to sophisticated tactics, such as impersonation and consent forgery. The discovered attacks and defenses transfer across diverse scenarios and backbone models, demonstrating strong practical utility for building privacy-aware agents.
arXiv Detail & Related papers (2025-08-14T17:49:09Z) - Reinforcement Learning for Decision-Level Interception Prioritization in Drone Swarm Defense [51.736723807086385]
We present a case study demonstrating the practical advantages of reinforcement learning in addressing this challenge. We introduce a high-fidelity simulation environment that captures realistic operational constraints. The agent learns to coordinate multiple effectors for optimal interception prioritization. We evaluate the learned policy against a handcrafted rule-based baseline across hundreds of simulated attack scenarios.
arXiv Detail & Related papers (2025-08-01T13:55:39Z) - Learning to Communicate in Multi-Agent Reinforcement Learning for Autonomous Cyber Defence [4.267944967869789]
We propose a game design where defender agents learn to communicate and defend against imminent cyber threats by playing training games in the Cyber Operations Research Gym. The tactical policies learned by these autonomous agents are akin to those of human experts during incident responses to avert cyber threats.
arXiv Detail & Related papers (2025-07-19T15:16:24Z) - Toward Optimal LLM Alignments Using Two-Player Games [86.39338084862324]
In this paper, we investigate alignment through the lens of two-agent games, involving iterative interactions between an adversarial and a defensive agent.
We theoretically demonstrate that this iterative reinforcement learning optimization converges to a Nash Equilibrium for the game induced by the agents.
Experimental results in safety scenarios demonstrate that learning in such a competitive environment not only fully trains agents but also leads to policies with enhanced generalization capabilities for both adversarial and defensive agents.
arXiv Detail & Related papers (2024-06-16T15:24:50Z) - SUB-PLAY: Adversarial Policies against Partially Observed Multi-Agent Reinforcement Learning Systems [40.91476827978885]
Attackers can rapidly exploit the victim's vulnerabilities, generating adversarial policies that result in the failure of specific tasks.
We propose a novel black-box attack (SUB-PLAY) that incorporates the concept of constructing multiple subgames to mitigate the impact of partial observability.
We evaluate three potential defenses aimed at exploring ways to mitigate security threats posed by adversarial policies.
arXiv Detail & Related papers (2024-02-06T06:18:16Z) - Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games [0.0]
Adversarial policies adopted by an adversary agent can cause a target RL agent to perform poorly in a multi-agent environment.
In existing studies, adversarial policies are directly trained based on experiences of interacting with the victim agent.
We design a new effective adversarial policy learning algorithm that overcomes this shortcoming.
arXiv Detail & Related papers (2022-10-30T18:32:02Z) - Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems [51.6210785955659]
Communication is important in many multi-agent reinforcement learning (MARL) problems for agents to share information and make good decisions.
However, when deploying trained communicative agents in a real-world application where noise and potential attackers exist, the safety of communication-based policies becomes a severe issue that is underexplored.
In this work, we consider an environment with $N$ agents, where the attacker may arbitrarily change the communication from any $C < \frac{N-1}{2}$ agents to a victim agent.
arXiv Detail & Related papers (2022-06-21T07:32:18Z) - Multi-Agent Adversarial Attacks for Multi-Channel Communications [24.576538640840976]
We propose a multi-agent adversary system (MAAS) for modeling and analyzing adversaries in a wireless communication scenario.
By modeling the adversaries as learning agents, we show that the proposed MAAS is able to successfully choose the transmitted channel(s) and their respective allocated power(s) without any prior knowledge of the sender strategy.
arXiv Detail & Related papers (2022-01-22T23:57:00Z) - Robust Reinforcement Learning on State Observations with Learned Optimal Adversary [86.0846119254031]
We study the robustness of reinforcement learning with adversarially perturbed state observations.
With a fixed agent policy, we demonstrate that an optimal adversary to perturb state observations can be found.
For DRL settings, this leads to a novel empirical adversarial attack on RL agents via a learned adversary that is much stronger than previous ones.
arXiv Detail & Related papers (2021-01-21T05:38:52Z) - Adversarial Augmentation Policy Search for Domain and Cross-Lingual Generalization in Reading Comprehension [96.62963688510035]
Reading comprehension models often overfit to nuances of training datasets and fail at adversarial evaluation.
We present several effective adversaries and automated data augmentation policy search methods with the goal of making reading comprehension models more robust to adversarial evaluation.
arXiv Detail & Related papers (2020-04-13T17:20:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences of their use.