CMAX++ : Leveraging Experience in Planning and Execution using
Inaccurate Models
- URL: http://arxiv.org/abs/2009.09942v3
- Date: Thu, 15 Oct 2020 18:44:52 GMT
- Title: CMAX++ : Leveraging Experience in Planning and Execution using
Inaccurate Models
- Authors: Anirudh Vemula, J. Andrew Bagnell, Maxim Likhachev
- Abstract summary: CMAX++ is an approach that leverages real-world experience to improve the quality of resulting plans over successive repetitions of a robotic task.
We provide provable guarantees on the completeness and convergence of CMAX++ to the optimal path cost as the number of repetitions increases.
CMAX++ is also shown to outperform baselines in simulated robotic tasks including 3D mobile robot navigation where the track friction is incorrectly modeled, and a 7D pick-and-place task where the mass of the object is unknown.
- Score: 26.674062544226636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given access to accurate dynamical models, modern planning approaches are
effective in computing feasible and optimal plans for repetitive robotic tasks.
However, it is difficult to model the true dynamics of the real world before
execution, especially for tasks requiring interactions with objects whose
parameters are unknown. A recent planning approach, CMAX, tackles this problem
by adapting the planner online during execution to bias the resulting plans
away from inaccurately modeled regions. CMAX, while being provably guaranteed
to reach the goal, requires strong assumptions on the accuracy of the model
used for planning and fails to improve the quality of the solution over
repetitions of the same task. In this paper we propose CMAX++, an approach that
leverages real-world experience to improve the quality of resulting plans over
successive repetitions of a robotic task. CMAX++ achieves this by integrating
model-free learning using acquired experience with model-based planning using
the potentially inaccurate model. We provide provable guarantees on the
completeness and asymptotic convergence of CMAX++ to the optimal path cost as
the number of repetitions increases. CMAX++ is also shown to outperform
baselines in simulated robotic tasks including 3D mobile robot navigation where
the track friction is incorrectly modeled, and a 7D pick-and-place task where
the mass of the object is unknown leading to discrepancy between true and
modeled dynamics.
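As a rough illustration of the core idea, the sketch below (all names hypothetical, not the authors' implementation) combines a one-step model-based backup with a model-free Q estimate: state-action pairs whose modeled outcome has been observed to disagree with the real world fall back to the learned Q-value, so repeated executions can improve plan quality rather than merely steering around inaccurately modeled regions.

```python
# Minimal sketch of a CMAX++-style action selection step.
# Assumptions (illustrative only): `model` is the potentially inaccurate
# dynamics model, `incorrect_set` holds (state, action) pairs where the
# model was observed to be wrong, `q_learned` is a model-free Q estimate
# built from real-world experience, and `value` is a cost-to-go estimate.

def plan_action(state, actions, model, cost, q_learned, incorrect_set, value):
    best_a, best_q = None, float("inf")
    for a in actions:
        if (state, a) in incorrect_set:
            # Model is known to be wrong here: trust the model-free estimate.
            q = q_learned.get((state, a), float("inf"))
        else:
            # Model assumed correct: one-step model-based backup.
            s_next = model(state, a)
            q = cost(state, a) + value.get(s_next, 0.0)
        if q < best_q:
            best_a, best_q = a, q
    return best_a, best_q

# Toy 1-D world: states 0..4, goal at 4, unit action cost.
actions = [1, -1]
model = lambda s, a: max(0, min(4, s + a))    # modeled dynamics
value = {s: float(4 - s) for s in range(5)}   # cost-to-go heuristic
a, q = plan_action(2, actions, model, lambda s, a: 1.0, {}, set(), value)
# a == 1 (move toward the goal), q == 2.0
```

Unlike CMAX, which inflates the cost of inaccurate transitions to push plans away from them, this hybrid backup lets experience accumulated over repetitions tighten the value estimates on those transitions, which is what enables convergence toward the optimal path cost.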
Related papers
- Closing the Train-Test Gap in World Models for Gradient-Based Planning [64.36544881136405]
We propose improved methods for training world models that enable efficient gradient-based planning.
At test time, our approach outperforms or matches the classical gradient-free cross-entropy method.
arXiv Detail & Related papers (2025-12-10T18:59:45Z)
- Keypoint-based Diffusion for Robotic Motion Planning on the NICOL Robot [7.239128729983817]
We propose a novel diffusion-based action model for robotic motion planning.
By leveraging the power of deep learning, we are able to achieve good results in a much smaller runtime.
arXiv Detail & Related papers (2025-09-04T10:11:51Z)
- Planning-Query-Guided Model Generation for Model-Based Deformable Object Manipulation [24.086752654743957]
This paper introduces a method that automatically generates task-specific, spatially adaptive dynamics models.
On a tree-manipulation task, our method doubles planning speed with only a small decrease in task performance over using a full-resolution model.
arXiv Detail & Related papers (2025-08-26T17:03:39Z)
- Action Flow Matching for Continual Robot Learning [57.698553219660376]
Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks.
We introduce a generative framework leveraging flow matching for online robot dynamics model alignment.
We find that by transforming the actions themselves rather than exploring with a misaligned model, the robot collects informative data more efficiently.
arXiv Detail & Related papers (2025-04-25T16:26:15Z)
- Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models [79.2162092822111]
We systematically evaluate reinforcement learning (RL) and control-based methods on a suite of navigation tasks.
We employ a latent dynamics model using the Joint Embedding Predictive Architecture (JEPA) and use it for planning.
Our results show that model-free RL benefits most from large amounts of high-quality data, whereas model-based planning generalizes better to unseen layouts.
arXiv Detail & Related papers (2025-02-20T18:39:41Z)
- DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment [47.273405862634085]
We propose a data-efficient general learning framework based on preference learning and reward-guided action selection.
DeformPAM decomposes long-horizon tasks into multiple action primitives and trains an implicit reward model using human preference data.
Experiments conducted on three challenging real-world long-horizon deformable object manipulation tasks demonstrate the effectiveness of this method.
arXiv Detail & Related papers (2024-10-15T13:19:16Z)
- Solving Motion Planning Tasks with a Scalable Generative Model [15.858076912795621]
We present an efficient solution based on generative models which learns the dynamics of the driving scenes.
Our innovative design allows the model to operate in both full-autoregressive and partial-autoregressive modes.
We conclude that the proposed generative model may serve as a foundation for a variety of motion planning tasks.
arXiv Detail & Related papers (2024-07-03T03:57:05Z)
- Planning with Adaptive World Models for Autonomous Driving [50.4439896514353]
We present nuPlan, a real-world motion planning benchmark that captures multi-agent interactions.
We learn to model such unique behaviors with BehaviorNet, a graph convolutional neural network (GCNN).
We also present AdaptiveDriver, a model-predictive control (MPC) based planner that unrolls different world models conditioned on BehaviorNet's predictions.
arXiv Detail & Related papers (2024-06-15T18:53:45Z)
- MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning [52.101643259906915]
We study the problem of offline pre-training and online fine-tuning for reinforcement learning from high-dimensional observations.
Existing model-based offline RL methods are not suitable for offline-to-online fine-tuning in high-dimensional domains.
We propose an on-policy model-based method that can efficiently reuse prior data through model-based value expansion and policy regularization.
arXiv Detail & Related papers (2024-01-06T21:04:31Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive settings such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
- Model-Based Visual Planning with Self-Supervised Functional Distances [104.83979811803466]
We present a self-supervised method for model-based visual goal reaching.
Our approach learns entirely using offline, unlabeled data.
We find that this approach substantially outperforms both model-free and model-based prior methods.
arXiv Detail & Related papers (2020-12-30T23:59:09Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization without Compounding Errors [10.906666680425754]
We propose a Dyna-style model-based reinforcement learning algorithm, which we call Maximum Entropy Model Rollouts (MEMR).
To eliminate the compounding errors, we only use our model to generate single-step rollouts.
arXiv Detail & Related papers (2020-06-08T21:38:15Z)
- Planning and Execution using Inaccurate Models with Provable Guarantees [23.733488427663396]
We propose CMAX as an approach for interleaving planning and execution.
CMAX adapts its planning strategy online during real-world execution to account for discrepancies in dynamics during planning.
We provide provable guarantees on the completeness and efficiency of the proposed planning and execution framework.
arXiv Detail & Related papers (2020-03-09T20:17:13Z)
- Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.