Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation
- URL: http://arxiv.org/abs/2410.10766v1
- Date: Mon, 14 Oct 2024 17:42:37 GMT
- Title: Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation
- Authors: Youwei Yu, Junhong Xu, Lantao Liu
- Abstract summary: We introduce the Adaptive Diffusion Terrain Generator (ADTG)
ADTG dynamically expands existing training environments by adding more diverse and complex terrains adaptive to the current policy.
Our experiments show that policies trained with ADTG outperform those trained in procedurally generated and natural environments, as well as popular navigation methods.
- Score: 10.025095580713678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model-free reinforcement learning has emerged as a powerful method for developing robust robot control policies capable of navigating through complex and unstructured terrains. The effectiveness of these methods hinges on two essential elements: (1) the use of massively parallel physics simulations to expedite policy training, and (2) an environment generator tasked with crafting sufficiently challenging yet attainable terrains to facilitate continuous policy improvement. Existing methods of environment generation often rely on heuristics constrained by a set of parameters, limiting their diversity and realism. In this work, we introduce the Adaptive Diffusion Terrain Generator (ADTG), a novel method that leverages Denoising Diffusion Probabilistic Models to dynamically expand existing training environments by adding more diverse and complex terrains adaptive to the current policy. ADTG guides the diffusion model's generation process through initial noise optimization, blending noise-corrupted terrains from existing training environments weighted by the policy's performance in each corresponding environment. By manipulating the noise corruption level, ADTG seamlessly transitions between generating similar terrains for policy fine-tuning and novel ones to expand training diversity. Our experiments show that policies trained with ADTG outperform those trained in procedurally generated and natural environments, as well as popular navigation methods.
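The initial-noise blending step described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the softmax weighting of performance scores, the linear beta schedule, and the tiny 8x8 heightmaps are all assumptions made for the sketch. The idea is that each training terrain is pushed to diffusion step t via the standard DDPM forward process, and the corrupted terrains are averaged with weights given by the policy's per-environment performance; the result seeds the reverse (denoising) process.

```python
import numpy as np

def corrupt(terrain, t, alpha_bar, rng):
    """DDPM forward process: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(terrain.shape)
    return np.sqrt(alpha_bar[t]) * terrain + np.sqrt(1.0 - alpha_bar[t]) * eps

def blend_initial_noise(terrains, scores, t, alpha_bar, rng):
    """Blend noise-corrupted training terrains, weighted by per-environment
    policy performance, into an initial noise sample for reverse diffusion.
    Softmax weighting is an assumption; the paper only states the blend is
    weighted by policy performance."""
    w = np.exp(scores - scores.max())
    w /= w.sum()
    corrupted = np.stack([corrupt(x, t, alpha_bar, rng) for x in terrains])
    return np.tensordot(w, corrupted, axes=1)  # weighted sum over terrains

# Toy usage: 3 hypothetical 8x8 heightmaps with a linear beta schedule.
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 100)
alpha_bar = np.cumprod(1.0 - betas)
terrains = [rng.standard_normal((8, 8)) for _ in range(3)]
scores = np.array([0.2, 0.9, 0.5])  # hypothetical per-environment success rates
x_t = blend_initial_noise(terrains, scores, t=50, alpha_bar=alpha_bar, rng=rng)
```

The corruption level t acts as the knob the abstract mentions: a small t keeps the blend close to the source terrains (fine-tuning on similar terrain), while a large t pushes it toward pure noise, so the denoiser produces more novel terrain.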
Related papers
- Aligning Agentic World Models via Knowledgeable Experience Learning [68.85843641222186]
We introduce WorldMind, a framework that constructs a symbolic World Knowledge Repository by synthesizing environmental feedback. WorldMind achieves superior performance compared to baselines with remarkable cross-model and cross-environment transferability.
arXiv Detail & Related papers (2026-01-19T17:33:31Z) - Grounded Test-Time Adaptation for LLM Agents [75.62784644919803]
Large language model (LLM)-based agents struggle to generalize to novel and complex environments. We propose two strategies for adapting LLM agents by leveraging environment-specific information available during deployment.
arXiv Detail & Related papers (2025-11-06T22:24:35Z) - A Cross-Environment and Cross-Embodiment Path Planning Framework via a Conditional Diffusion Model [6.482051262912219]
This research aims to develop a path planning framework capable of generalizing to unseen environments and new robotic manipulators without retraining. We present GADGET, a diffusion-based planning model that generates joint-space trajectories conditioned on voxelized scene representations. Experimental results show that GADGET achieves high success rates with low collision intensity in spherical-obstacle, bin-picking, and shelf environments.
arXiv Detail & Related papers (2025-10-21T23:30:14Z) - Adaptive Divergence Regularized Policy Optimization for Fine-tuning Generative Models [31.470613363668672]
Adaptive Divergence Regularized Policy Optimization automatically adjusts regularization strength based on advantage estimates. Our implementation with Wasserstein-2 regularization for flow matching generative models achieves remarkable results on text-to-image generation. ADRPO generalizes to KL-regularized fine-tuning of both text-only LLMs and multi-modal reasoning models.
arXiv Detail & Related papers (2025-10-20T19:46:02Z) - AD-NODE: Adaptive Dynamics Learning with Neural ODEs for Mobile Robots Control [17.551574806243853]
Mobile robots are increasingly important in various fields, from logistics to agriculture. These systems require dynamics models capable of responding to environmental variations. We propose an adaptive dynamics model which bypasses the need for direct environmental knowledge.
arXiv Detail & Related papers (2025-10-06T23:14:08Z) - A General Infrastructure and Workflow for Quadrotor Deep Reinforcement Learning and Reality Deployment [48.90852123901697]
We propose a platform that enables seamless transfer of end-to-end deep reinforcement learning (DRL) policies to quadrotors.
Our platform provides rich types of environments including hovering, dynamic obstacle avoidance, trajectory tracking, balloon hitting, and planning in unknown environments.
arXiv Detail & Related papers (2025-04-21T14:25:23Z) - Mind the Gap: Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning [15.619925926862235]
GAP is a generalizable autonomous pentesting framework.
It aims to realize efficient policy training in realistic environments.
It also trains agents capable of drawing inferences about other cases from one instance.
arXiv Detail & Related papers (2024-12-05T11:24:27Z) - Enabling Adaptive Agent Training in Open-Ended Simulators by Targeting Diversity [10.402855891273346]
DIVA is an evolutionary approach for generating diverse training tasks in complex, open-ended simulators.
Our empirical results showcase DIVA's unique ability to overcome complex parameterizations and successfully train adaptive agent behavior.
arXiv Detail & Related papers (2024-11-07T06:27:12Z) - Model-Based Reinforcement Learning for Control of Strongly-Disturbed Unsteady Aerodynamic Flows [0.0]
We propose a model-based reinforcement learning (MBRL) approach by incorporating a novel reduced-order model as a surrogate for the full environment.
The robustness and generalizability of the model is demonstrated in two distinct flow environments.
We demonstrate that the policy learned in the reduced-order environment translates to an effective control strategy in the full CFD environment.
arXiv Detail & Related papers (2024-08-26T23:21:44Z) - A Conservative Approach for Few-Shot Transfer in Off-Dynamics Reinforcement Learning [3.1515473193934778]
Off-dynamics Reinforcement Learning seeks to transfer a policy from a source environment to a target environment characterized by distinct yet similar dynamics.
We propose an innovative approach inspired by recent advancements in Imitation Learning and conservative RL algorithms.
arXiv Detail & Related papers (2023-12-24T13:09:08Z) - Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parametrizing policies for reinforcement learning in high-dimensional continuous action spaces.
We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories.
We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
arXiv Detail & Related papers (2023-07-20T09:05:46Z) - Diverse Policy Optimization for Structured Action Space [59.361076277997704]
We propose Diverse Policy Optimization (DPO) to model the policies in structured action space as energy-based models (EBMs).
A novel and powerful generative model, GFlowNet, is introduced as the efficient, diverse EBM-based policy sampler.
Experiments on ATSC and Battle benchmarks demonstrate that DPO can efficiently discover surprisingly diverse policies.
arXiv Detail & Related papers (2023-02-23T10:48:09Z) - Continuous Trajectory Generation Based on Two-Stage GAN [50.55181727145379]
We propose a novel two-stage generative adversarial framework to generate the continuous trajectory on the road network.
Specifically, we build the generator under the human mobility hypothesis of the A* algorithm to learn the human mobility behavior.
For the discriminator, we combine the sequential reward with the mobility yaw reward to enhance the effectiveness of the generator.
arXiv Detail & Related papers (2023-01-16T09:54:02Z) - A Regularized Implicit Policy for Offline Reinforcement Learning [54.7427227775581]
Offline reinforcement learning enables learning from a fixed dataset, without further interactions with the environment.
We propose a framework that supports learning a flexible yet well-regularized fully-implicit policy.
Experiments and ablation study on the D4RL dataset validate our framework and the effectiveness of our algorithmic designs.
arXiv Detail & Related papers (2022-02-19T20:22:04Z) - Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization [63.75188254377202]
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to discrepancy between source and target environments.
We propose a novel model-free actor-critic algorithm to learn robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z) - Distributionally Robust Policy Learning via Adversarial Environment Generation [3.42658286826597]
We propose DRAGEN - Distributionally Robust policy learning via Adversarial Generation of ENvironments.
We learn a generative model for environments whose latent variables capture cost-predictive and realistic variations in environments.
We demonstrate strong Out-of-Distribution (OoD) generalization in simulation for grasping realistic 2D/3D objects.
arXiv Detail & Related papers (2021-07-13T19:26:34Z) - MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments [30.407700996710023]
This paper proposes a decentralized partially observable multi-agent path planning with evolutionary reinforcement learning (MAPPER) method.
We decompose the long-range navigation task into many easier sub-tasks under the guidance of a global planner.
Our approach models dynamic obstacles' behavior with an image-based representation and trains a policy in mixed dynamic environments without homogeneity assumption.
arXiv Detail & Related papers (2020-07-30T20:14:42Z) - Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.