Adversarial Search and Tracking with Multiagent Reinforcement Learning
in Sparsely Observable Environment
- URL: http://arxiv.org/abs/2306.11301v2
- Date: Sat, 21 Oct 2023 01:40:24 GMT
- Title: Adversarial Search and Tracking with Multiagent Reinforcement Learning
in Sparsely Observable Environment
- Authors: Zixuan Wu, Sean Ye, Manisha Natarajan, Letian Chen, Rohan Paleja,
Matthew C. Gombolay
- Abstract summary: We study a search and tracking (S&T) problem where a team of dynamic search agents must collaborate to track an adversarial, evasive agent.
This problem is challenging for both model-based searching and reinforcement learning (RL) methods, since the adversary exhibits reactionary and deceptive evasive behaviors in a large space, leading to sparse detections for the search agents.
We propose a novel Multi-Agent RL (MARL) framework that leverages the estimated adversary location from our learnable filtering model.
- Score: 7.195547595036644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a search and tracking (S&T) problem where a team of dynamic search
agents must collaborate to track an adversarial, evasive agent. The
heterogeneous search team may only have access to a limited number of past
adversary trajectories within a large search space. This problem is challenging
for both model-based searching and reinforcement learning (RL) methods, since
the adversary exhibits reactionary and deceptive evasive behaviors in a large
space, leading to sparse detections for the search agents. To address this
challenge, we propose a novel Multi-Agent RL (MARL) framework that leverages
the estimated adversary location from our learnable filtering model. We show
that our MARL architecture outperforms all baselines, achieving a 46%
increase in detection rate.
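As a rough illustration of the filter-then-policy idea described above, the sketch below fuses sparse past detections into a single position estimate and appends it to each searcher's observation. The function names, the exponential-decay fusion rule, and the 2-D coordinate format are all illustrative assumptions; the paper's filtering model is learned, and its architecture is not specified in the abstract.

```python
def filter_estimate(detections, decay=0.7):
    """Fuse sparse past detections into one estimated adversary
    position via exponentially decayed averaging -- a hand-rolled
    stand-in for the paper's learnable filtering model."""
    if not detections:
        return None
    est_x = est_y = total = 0.0
    weight = 1.0
    for x, y in reversed(detections):  # weight recent detections more
        est_x += weight * x
        est_y += weight * y
        total += weight
        weight *= decay
    return (est_x / total, est_y / total)


def augment_observation(agent_obs, detections):
    """Append the filter's estimate to a search agent's local
    observation vector, mirroring how the framework feeds the
    estimated adversary location into the MARL policy."""
    est = filter_estimate(detections)
    if est is None:
        est = (0.0, 0.0)  # no detections yet: pad with a neutral value
    return list(agent_obs) + list(est)
```

The point of the split is that the policy always receives a fixed-length observation, whether or not the adversary has been detected recently.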
Related papers
- ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning [78.42927884000673]
ExACT is an approach that combines test-time search and self-learning to build o1-like models for agentic applications.
We first introduce Reflective Monte Carlo Tree Search (R-MCTS), a novel test time algorithm designed to enhance AI agents' ability to explore decision space on the fly.
Next, we introduce Exploratory Learning, a novel learning strategy to teach agents to search at inference time without relying on any external search algorithms.
arXiv Detail & Related papers (2024-10-02T21:42:35Z)
- FoX: Formation-aware exploration in multi-agent reinforcement learning [10.554220876480297]
We propose a formation-based equivalence relation on the exploration space and aim to reduce the search space by exploring only meaningful states in different formations.
Numerical results show that the proposed FoX framework significantly outperforms the state-of-the-art MARL algorithms on Google Research Football (GRF) and sparse StarCraft II multi-agent challenge (SMAC) tasks.
arXiv Detail & Related papers (2023-08-22T08:39:44Z)
- GUTS: Generalized Uncertainty-Aware Thompson Sampling for Multi-Agent Active Search [5.861092453610268]
The Generalized Uncertainty-aware Thompson Sampling (GUTS) algorithm is suitable for deployment on heterogeneous multi-robot systems for active search in large unstructured environments.
We conduct field tests using our multi-robot system in an unstructured environment with a search area of 75,000 sq. m.
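For context on the sampling strategy GUTS builds on, here is a minimal generic Thompson-sampling step for active search: each cell keeps a Beta posterior over target presence, and the searcher visits the cell with the highest posterior draw. This is a textbook sketch, not GUTS itself; the paper's contribution lies in its generalized, uncertainty-aware extensions for heterogeneous robots, and the dictionary-of-Beta-parameters representation here is an illustrative assumption.

```python
import random


def thompson_pick(cell_beliefs):
    """One Thompson-sampling decision: draw a sample from each
    cell's Beta(alpha, beta) posterior over target presence and
    search the cell with the largest draw.  Cells whose posterior
    is both promising and uncertain get picked often, which is
    what drives the explore/exploit trade-off."""
    best_cell, best_draw = None, -1.0
    for cell, (alpha, beta) in cell_beliefs.items():
        draw = random.betavariate(alpha, beta)
        if draw > best_draw:
            best_cell, best_draw = cell, draw
    return best_cell
```

After a visit, the chosen cell's (alpha, beta) would be updated with the observation, sharpening the posterior for the next round.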
arXiv Detail & Related papers (2023-04-04T18:58:16Z)
- PS-ARM: An End-to-End Attention-aware Relation Mixer Network for Person Search [56.02761592710612]
We propose a novel attention-aware relation mixer (ARM) module for person search.
Our ARM module is generic and does not rely on fine-grained supervision or topological assumptions.
Our PS-ARM achieves state-of-the-art performance on both datasets.
arXiv Detail & Related papers (2022-10-07T10:04:12Z)
- The StarCraft Multi-Agent Challenges+: Learning of Multi-Stage Tasks and Environmental Factors without Precise Reward Functions [14.399479538886064]
We propose a novel benchmark called the StarCraft Multi-Agent Challenges+.
This challenge evaluates the exploration capability of MARL algorithms, which must efficiently learn implicit multi-stage tasks and environmental factors as well as micro-control.
We investigate MARL algorithms under SMAC+ and observe that recent approaches work well in similar settings to the previous challenges, but misbehave in offensive scenarios.
arXiv Detail & Related papers (2022-07-05T12:43:54Z)
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
The goal is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
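The normalized-entropy goal selection mentioned above can be sketched as follows: among several projected (restricted) state spaces, compute the normalized entropy of each space's visit-count distribution and draw shared goals from the least-explored projection. The function names and the "pick the minimum-entropy space" rule are assumptions based only on the one-line summary above, not on CMAE's full algorithm.

```python
import math


def normalized_entropy(counts):
    """Shannon entropy of a visit-count distribution, normalized
    to [0, 1] by the maximum possible entropy (uniform visits)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    h = -sum(p * math.log(p) for p in probs)
    max_h = math.log(len(counts)) if len(counts) > 1 else 1.0
    return h / max_h


def pick_goal_space(projected_visit_counts):
    """Choose the projected state space whose visit counts have the
    LOWEST normalized entropy, i.e. the most unevenly explored
    projection, as the source of shared exploration goals."""
    return min(projected_visit_counts,
               key=lambda k: normalized_entropy(projected_visit_counts[k]))
```

Low-dimensional projections keep the entropy estimate tractable even when the joint state space is far too large to count over directly.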
arXiv Detail & Related papers (2021-07-23T20:06:32Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network [82.20059754270302]
We propose an algorithm based on the idea of reannealing, which aims to encourage exploration only when it is needed.
We perform an illustrative case study showing that it has potential to both accelerate training and obtain a better policy.
arXiv Detail & Related papers (2020-09-29T20:40:00Z)
- AutoOD: Automated Outlier Detection via Curiosity-guided Search and Self-imitation Learning [72.99415402575886]
Outlier detection is an important data mining task with numerous practical applications.
We propose AutoOD, an automated outlier detection framework, which aims to search for an optimal neural network model.
Experimental results on various real-world benchmark datasets demonstrate that the deep model identified by AutoOD achieves the best performance.
arXiv Detail & Related papers (2020-06-19T18:57:51Z)
- Conflict-Based Search for Connected Multi-Agent Path Finding [6.18778092044887]
We study a variant of the multi-agent path finding problem (MAPF) in which agents are required to remain connected to each other and to a designated base.
This problem has applications in search and rescue missions where the entire execution must be monitored by a human operator.
We re-visit the conflict-based search algorithm known for MAPF, and define a variant where conflicts arise from disconnections rather than collisions.
arXiv Detail & Related papers (2020-06-05T08:02:36Z)
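The disconnection conflicts described in the connected-MAPF entry above can be detected with a simple reachability check at each timestep: treat the agents and the base as nodes, link any pair within communication range, and test whether every agent can reach the base through the resulting graph. The `in_range` predicate and tuple-coordinate positions below are illustrative assumptions, not the paper's formulation.

```python
from collections import deque


def connected_to_base(positions, base, in_range):
    """Return True if every agent can reach the base through a
    chain of pairwise communication links, via BFS over the
    connectivity graph.  A False result at any timestep is the
    kind of 'disconnection conflict' the connected-MAPF variant
    branches on, in place of the usual collision conflicts."""
    nodes = list(positions) + [base]
    seen = {base}
    queue = deque([base])
    while queue:
        u = queue.popleft()
        for v in nodes:
            if v not in seen and in_range(u, v):
                seen.add(v)
                queue.append(v)
    return all(p in seen for p in positions)
```

In a conflict-based search loop, a detected disconnection would spawn child nodes constraining the offending agents, mirroring how collision conflicts are resolved in standard CBS.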
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.