Related papers: DyPNIPP: Predicting Environment Dynamics for RL-based Robust Informative Path Planning

URL: http://arxiv.org/abs/2410.17186v1
Date: Tue, 22 Oct 2024 17:07:26 GMT
Title: DyPNIPP: Predicting Environment Dynamics for RL-based Robust Informative Path Planning
Authors: Srujan Deolasee, Siva Kailas, Wenhao Luo, Katia Sycara, Woojun Kim
Abstract summary: DyPNIPP is a robust RL-based IPP framework designed to operate effectively across spatio-temporal environments. Our experiments in a wildfire environment demonstrate that DyPNIPP outperforms existing RL-based IPP algorithms.
Score: 13.462524685985818
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Informative path planning (IPP) is an important planning paradigm for various real-world robotic applications such as environment monitoring. IPP involves planning a path that can learn an accurate belief of the quantity of interest, while adhering to planning constraints. Traditional IPP methods typically require high computation time during execution, giving rise to reinforcement learning (RL) based IPP methods. However, the existing RL-based methods do not consider spatio-temporal environments which involve their own challenges due to variations in environment characteristics. In this paper, we propose DyPNIPP, a robust RL-based IPP framework, designed to operate effectively across spatio-temporal environments with varying dynamics. To achieve this, DyPNIPP incorporates domain randomization to train the agent across diverse environments and introduces a dynamics prediction model to capture and adapt the agent actions to specific environment dynamics. Our extensive experiments in a wildfire environment demonstrate that DyPNIPP outperforms existing RL-based IPP algorithms by significantly improving robustness and performing across diverse environment conditions.

Related papers

Improved particle swarm optimization algorithm: multi-target trajectory optimization for swarm drones [20.531764063763678]
Traditional Particle Swarm Optimization (PSO) methods struggle with premature convergence and latency in real-time scenarios. We propose PE-PSO, an enhanced PSO-based online trajectory planner. We develop a multi-agent framework that combines genetic algorithm (GA)-based task allocation with distributed PE-PSO, supporting scalable and coordinated trajectory generation.
arXiv Detail & Related papers (2025-07-18T04:31:49Z)
Exploration and Adaptation in Non-Stationary Tasks with Diffusion Policies [0.0]
This paper investigates the application of Diffusion Policy in non-stationary, vision-based RL settings, specifically targeting environments where task dynamics and objectives evolve over time. We apply Diffusion Policy -- which leverages iterative denoising to refine latent action representations -- to benchmark environments including Procgen and PointMaze. Our experiments demonstrate that, despite increased computational demands, Diffusion Policy consistently outperforms standard RL methods such as PPO and DQN, achieving higher mean and maximum rewards with reduced variability.
arXiv Detail & Related papers (2025-03-31T23:00:07Z)
Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction [7.918703013303246]
We present Latent Macro Action Planner (L-MAP), which addresses the challenge of learning to make decisions in high-dimensional continuous action spaces. L-MAP learns a set of temporally extended macro-actions through a state-conditional Vector Quantized Variational Autoencoder (VQ-VAE). In offline RL settings, including continuous control tasks, L-MAP efficiently searches over discrete latent actions to yield high expected returns.
arXiv Detail & Related papers (2025-02-28T16:02:23Z)
Survival of the Fittest: Evolutionary Adaptation of Policies for Environmental Shifts [0.15889427269227555]
We develop an adaptive re-training algorithm inspired by evolutionary game theory (EGT). ERPO shows faster policy adaptation, higher average rewards, and reduced computational costs in policy adaptation.
arXiv Detail & Related papers (2024-10-22T09:29:53Z)
OffRIPP: Offline RL-based Informative Path Planning [12.705099730591671]
IPP is a crucial task in robotics, where agents must design paths to gather valuable information about a target environment. We propose an offline RL-based IPP framework that optimizes information gain without requiring real-time interaction during training. We validate the framework through extensive simulations and real-world experiments.
arXiv Detail & Related papers (2024-09-25T11:30:59Z)
Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver. We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications. We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level. We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parametrizing policies for reinforcement learning in high-dimensional continuous action spaces. We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories. We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
arXiv Detail & Related papers (2023-07-20T09:05:46Z)
A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to address the demands of high-performance machine learning models with environmental sustainability. Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance. However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z)
Diverse Policy Optimization for Structured Action Space [59.361076277997704]
We propose Diverse Policy Optimization (DPO) to model the policies in structured action space as the energy-based models (EBM). A novel and powerful generative model, GFlowNet, is introduced as the efficient, diverse EBM-based policy sampler. Experiments on ATSC and Battle benchmarks demonstrate that DPO can efficiently discover surprisingly diverse policies.
arXiv Detail & Related papers (2023-02-23T10:48:09Z)
Hierarchical Policy Blending as Inference for Reactive Robot Control [21.058662668187875]
Motion generation in cluttered, dense, and dynamic environments is a central topic in robotics. We propose a hierarchical motion generation method that combines the benefits of reactive policies and planning. Our experimental study in planar navigation and 6DoF manipulation shows that our proposed hierarchical motion generation method outperforms both myopic reactive controllers and online re-planning methods.
arXiv Detail & Related papers (2022-10-14T15:16:54Z)
Scenic4RL: Programmatic Modeling and Generation of Reinforcement Learning Environments [89.04823188871906]
Generation of diverse realistic scenarios is challenging for real-time strategy (RTS) environments. Most of the existing simulators rely on randomly generating the environments. We introduce the benefits of adopting an existing formal scenario specification language, SCENIC, to assist researchers.
arXiv Detail & Related papers (2021-06-18T21:49:46Z)
Adaptive Informative Path Planning with Multimodal Sensing [36.16721115973077]
AIPPMS (MS for Multimodal Sensing). We frame AIPPMS as a Partially Observable Markov Decision Process (POMDP) and solve it with online planning. We evaluate our method on two domains: a simulated search-and-rescue scenario and a challenging extension to the classic RockSample problem.
arXiv Detail & Related papers (2020-03-21T20:28:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.