Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers
- URL: http://arxiv.org/abs/2410.07584v2
- Date: Tue, 25 Mar 2025 13:23:21 GMT
- Title: Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers
- Authors: Jianxin Bi, Kelvin Lim, Kaiqi Chen, Yifei Huang, Harold Soh
- Abstract summary: We propose a plan-then-control framework aimed at improving the action-data efficiency of inverse dynamics controllers. Specifically, we adopt a Deep Koopman Operator framework to model the dynamical system and utilize observation-only trajectories to learn a latent action representation. This latent representation can then be effectively mapped to real high-dimensional continuous actions using a linear action decoder.
- Score: 23.292429025366417
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advances in diffusion-based robot policies have demonstrated significant potential in imitating multi-modal behaviors. However, these approaches typically require large quantities of demonstration data paired with corresponding robot action labels, creating a substantial data collection burden. In this work, we propose a plan-then-control framework aimed at improving the action-data efficiency of inverse dynamics controllers by leveraging observational demonstration data. Specifically, we adopt a Deep Koopman Operator framework to model the dynamical system and utilize observation-only trajectories to learn a latent action representation. This latent representation can then be effectively mapped to real high-dimensional continuous actions using a linear action decoder, requiring minimal action-labeled data. Through experiments on simulated robot manipulation tasks and a real robot experiment with multi-modal expert demonstrations, we demonstrate that our approach significantly enhances action-data efficiency and achieves high task success rates with limited action data.
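The core of the method can be sketched compactly. Below is a minimal, hypothetical PyTorch rendering of the controller side under the structure the abstract describes: an encoder lifts observations into a latent space where dynamics are linear in a latent action (z_{t+1} = A z_t + B u_t), a latent-action head infers u_t from consecutive observations so it trains on observation-only trajectories, and a linear decoder maps u_t to real actions using the small action-labeled subset. Module names, layer sizes, and losses are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class DeepKoopmanController(nn.Module):
    """Sketch of a Deep Koopman controller with a linear action decoder.
    Hypothetical architecture; dimensions and losses are assumptions."""
    def __init__(self, obs_dim, latent_dim, latent_act_dim, act_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
        self.A = nn.Linear(latent_dim, latent_dim, bias=False)      # Koopman matrix
        self.B = nn.Linear(latent_act_dim, latent_dim, bias=False)  # latent-action input map
        # Infers a latent action from consecutive observations; needs no action labels.
        self.latent_action = nn.Sequential(
            nn.Linear(2 * latent_dim, 128), nn.ReLU(), nn.Linear(128, latent_act_dim))
        # Linear decoder to real actions; trained on the small labeled subset.
        self.action_decoder = nn.Linear(latent_act_dim, act_dim)

    def dynamics_loss(self, obs_t, obs_t1):
        # Trained on observation-only trajectories.
        z_t, z_t1 = self.encoder(obs_t), self.encoder(obs_t1)
        u_t = self.latent_action(torch.cat([z_t, z_t1], dim=-1))
        z_pred = self.A(z_t) + self.B(u_t)
        return ((z_pred - z_t1) ** 2).mean()

    def action_loss(self, obs_t, obs_t1, act_t):
        # Trained on the few action-labeled pairs.
        z_t, z_t1 = self.encoder(obs_t), self.encoder(obs_t1)
        u_t = self.latent_action(torch.cat([z_t, z_t1], dim=-1))
        return ((self.action_decoder(u_t) - act_t) ** 2).mean()
```

At test time, a diffusion planner would generate the target observation sequence, and actions are recovered by pushing consecutive planned observations through the latent-action head and the linear decoder.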
Related papers
- Latent Diffusion Planning for Imitation Learning [78.56207566743154]
Latent Diffusion Planning (LDP) is a modular approach consisting of a planner and an inverse dynamics model.
By separating planning from action prediction, LDP can benefit from the denser supervision signals of suboptimal and action-free data.
On simulated visual robotic manipulation tasks, LDP outperforms state-of-the-art imitation learning approaches.
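The planner/inverse-dynamics split that LDP describes admits a short sketch; the planner interface below is an assumed stand-in for illustration, not LDP's actual API.

```python
import torch
import torch.nn as nn

class InverseDynamicsModel(nn.Module):
    """Hypothetical inverse dynamics head: maps consecutive (planned)
    states to the action that connects them."""
    def __init__(self, state_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, 256), nn.ReLU(), nn.Linear(256, act_dim))

    def forward(self, s_t, s_t1):
        return self.net(torch.cat([s_t, s_t1], dim=-1))

def act(planner, idm, s_t, horizon=8):
    # The planner (e.g. a diffusion model trainable on action-free data)
    # forecasts future states; the IDM turns adjacent states into actions.
    plan = planner(s_t, horizon)   # assumed API; returns (horizon, state_dim)
    return idm(s_t, plan[0])       # execute the first step, then replan
```

This split is what lets action-free and suboptimal data supervise the planner while only the small action-labeled set trains the inverse dynamics model.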
arXiv Detail & Related papers (2025-04-23T17:53:34Z)
- Learning Coordinated Bimanual Manipulation Policies using State Diffusion and Inverse Dynamics Models [22.826115023573205]
We infuse the predictive nature of human manipulation strategies into robot imitation learning.
We train a diffusion model to predict future states and compute robot actions that achieve the predicted states.
Our framework consistently outperforms state-of-the-art state-to-action mapping policies.
arXiv Detail & Related papers (2025-03-30T01:25:35Z)
- λ: A Benchmark for Data-Efficiency in Long-Horizon Indoor Mobile Manipulation Robotics [11.901933884058021]
We introduce the LAMBDA benchmark (Long-horizon Actions for Mobile-manipulation Benchmarking of Directed Activities).
This benchmark evaluates the data efficiency of models on language-conditioned, long-horizon, multi-room, multi-floor, pick-and-place tasks.
Our benchmark includes 571 human-collected demonstrations that provide realism and diversity in simulated and real-world settings.
arXiv Detail & Related papers (2024-11-28T19:31:50Z)
- Dynamic Non-Prehensile Object Transport via Model-Predictive Reinforcement Learning [24.079032278280447]
We propose an approach that combines batch reinforcement learning (RL) with model-predictive control (MPC).
We validate the proposed approach through extensive simulated and real-world experiments on a Franka Panda robot performing the robot waiter task.
arXiv Detail & Related papers (2024-11-27T03:33:42Z)
- Reinforcement Learning with Action Sequence for Data-Efficient Robot Learning [62.3886343725955]
We introduce a novel RL algorithm that learns a critic network that outputs Q-values over a sequence of actions.
By explicitly training the value functions to learn the consequence of executing a series of current and future actions, our algorithm allows for learning useful value functions from noisy trajectories.
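Conditioning a critic on a whole action chunk can be as simple as flattening the sequence; the following is an illustrative sketch, not the paper's exact network.

```python
import torch
import torch.nn as nn

class SequenceCritic(nn.Module):
    """Hypothetical critic: Q(s, a_1..a_k) scores the consequence of
    executing k actions in a row rather than a single action."""
    def __init__(self, state_dim, act_dim, seq_len):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + act_dim * seq_len, 256), nn.ReLU(),
            nn.Linear(256, 1))

    def forward(self, state, action_seq):        # action_seq: (B, k, act_dim)
        flat = action_seq.flatten(start_dim=1)   # (B, k * act_dim)
        return self.net(torch.cat([state, flat], dim=-1))
```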
arXiv Detail & Related papers (2024-11-19T01:23:52Z)
- Affordance-based Robot Manipulation with Flow Matching [6.863932324631107]
We present a framework for assistive robot manipulation.
We tackle two challenges: first, efficiently adapting large-scale models to downstream scene affordance understanding tasks, and second, effectively learning robot action trajectories by grounding the visual affordance model.
We learn robot action trajectories guided by affordances using a supervised flow matching method.
arXiv Detail & Related papers (2024-09-02T09:11:28Z)
- SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation [62.58480650443393]
SAM-E leverages Segment Anything (SAM), a vision foundation model, for generalizable scene understanding and sequence imitation.
We develop a novel multi-channel heatmap that enables the prediction of the action sequence in a single pass.
arXiv Detail & Related papers (2024-05-30T00:32:51Z)
- Information-driven Affordance Discovery for Efficient Robotic Manipulation [14.863105174430087]
We argue that well-directed interactions with the environment can mitigate the cost of affordance discovery.
Our method, which we dub IDA, enables the efficient discovery of visual affordances for several action primitives.
We provide a theoretical justification of our approach and empirically validate it on both simulated and real-world tasks.
arXiv Detail & Related papers (2024-05-06T21:25:51Z)
- AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent [75.91274222142079]
In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents.
AdaDemo is a framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset.
arXiv Detail & Related papers (2024-04-11T01:59:29Z)
- Unsupervised Learning of Effective Actions in Robotics [0.9374652839580183]
Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions.
We propose an unsupervised algorithm to discretize a continuous motion space and generate "action prototypes".
We evaluate our method on a simulated stair-climbing reinforcement learning task.
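As a rough illustration of discretizing a continuous motion space, one could cluster motion vectors and treat the centers as prototypes; note that the paper groups actions by their effects on the environment, which this generic k-means stand-in does not capture.

```python
import numpy as np
from sklearn.cluster import KMeans

def action_prototypes(motions, n_prototypes=8):
    """Hypothetical sketch: cluster continuous motion vectors and return
    the cluster centers as discrete 'action prototypes'."""
    km = KMeans(n_clusters=n_prototypes, n_init=10).fit(np.asarray(motions))
    return km.cluster_centers_
```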
arXiv Detail & Related papers (2024-04-03T13:28:52Z)
- Deep Learning for Koopman-based Dynamic Movement Primitives [0.0]
We propose a novel approach that combines Koopman operator theory with Dynamic Movement Primitives for Learning from Demonstration.
Our approach projects nonlinear dynamical systems into linear latent spaces such that a solution reproduces the desired complex motion.
Our results are comparable to Extended Dynamic Mode Decomposition on the LASA Handwriting dataset, but with training on only a small fraction of the letters.
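For reference, the Extended Dynamic Mode Decomposition baseline mentioned above fits a linear operator on lifted states by least squares; a minimal sketch with an assumed polynomial dictionary:

```python
import numpy as np

def edmd_fit(X, Y, lift):
    """Minimal EDMD sketch: given snapshot pairs (x_t, x_{t+1}) stacked
    row-wise in X and Y, lift them with a dictionary of observables and
    solve least squares for the Koopman matrix K (PhiX @ K ~= PhiY)."""
    PhiX = np.stack([lift(x) for x in X])
    PhiY = np.stack([lift(x) for x in Y])
    K, *_ = np.linalg.lstsq(PhiX, PhiY, rcond=None)
    return K

# Assumed dictionary: the state, its squares, and a constant term.
lift = lambda x: np.concatenate([x, x ** 2, [1.0]])
```

The deep variant replaces the fixed dictionary with a learned encoder, which is what connects this line of work to the main paper's Deep Koopman controller.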
arXiv Detail & Related papers (2023-12-06T07:33:22Z)
- Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
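The value-from-generative-model idea can be stated in a few lines: if a model can sample future states from the discounted visitation measure conditioned on s, then V(s) is, up to a 1/(1-gamma) normalization, the expected reward over those samples. The sampler and reward function below are assumed stand-ins, not DVF's interface.

```python
def value_estimate(sample_future_state, reward_fn, s, gamma=0.99, n=256):
    """Hypothetical DVF-style estimate: average reward over states drawn
    from the (normalized) discounted visitation measure, rescaled by
    1 / (1 - gamma) to recover the discounted return."""
    samples = [sample_future_state(s) for _ in range(n)]
    return sum(reward_fn(sp) for sp in samples) / (n * (1.0 - gamma))
```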
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
- AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation [50.737355245505334]
We propose a novel framework for learning high-level cognitive capabilities in robot manipulation tasks.
The resulting dataset, AlphaBlock, consists of 35 comprehensive high-level tasks with multi-step text plans and paired observations.
arXiv Detail & Related papers (2023-05-30T09:54:20Z)
- Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies [37.09286945259353]
We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model.
We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours.
arXiv Detail & Related papers (2021-12-09T17:37:14Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.