X-IL: Exploring the Design Space of Imitation Learning Policies
- URL: http://arxiv.org/abs/2502.12330v2
- Date: Wed, 19 Feb 2025 08:57:34 GMT
- Title: X-IL: Exploring the Design Space of Imitation Learning Policies
- Authors: Xiaogang Jia, Atalay Donat, Xi Huang, Xuan Zhao, Denis Blessing, Hongyi Zhou, Han A. Wang, Hanyi Zhang, Qian Wang, Rudolf Lioutikov, Gerhard Neumann
- Abstract summary: We present X-IL, an open-source framework designed to explore the vast design space for imitation learning policies.
The framework's modular design enables seamless swapping of policy components, such as backbones (e.g., Transformer, Mamba, xLSTM) and policy optimization techniques (e.g., score-matching, flow-matching).
This study serves as both a practical reference for practitioners and a foundation for guiding future research in imitation learning.
- Score: 20.770730972159242
- Abstract: Designing modern imitation learning (IL) policies requires making numerous decisions, including the selection of feature encoding, architecture, policy representation, and more. As the field rapidly advances, the range of available options continues to grow, creating a vast and largely unexplored design space for IL policies. In this work, we present X-IL, an accessible open-source framework designed to systematically explore this design space. The framework's modular design enables seamless swapping of policy components, such as backbones (e.g., Transformer, Mamba, xLSTM) and policy optimization techniques (e.g., Score-matching, Flow-matching). This flexibility facilitates comprehensive experimentation and has led to the discovery of novel policy configurations that outperform existing methods on recent robot learning benchmarks. Our experiments demonstrate not only significant performance gains but also provide valuable insights into the strengths and weaknesses of various design choices. This study serves as both a practical reference for practitioners and a foundation for guiding future research in imitation learning.
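To make the modular design concrete, below is a minimal, hypothetical sketch of a policy with a swappable sequence backbone and a conditional flow-matching training objective, one of the policy representations mentioned in the abstract. The class names, registry, and dimensions are illustrative assumptions and do not correspond to the actual X-IL API.

```python
# Hypothetical sketch (not the X-IL API): a policy whose sequence backbone and
# training objective are swappable, in the spirit of the modular design
# described in the abstract.
import torch
import torch.nn as nn

class TransformerBackbone(nn.Module):
    def __init__(self, dim: int, depth: int = 2, heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):                              # x: (batch, seq, dim)
        return self.encoder(x)

# Additional backbones (e.g., Mamba, xLSTM) could be registered here by name.
BACKBONES = {"transformer": TransformerBackbone}

class FlowMatchingPolicy(nn.Module):
    """Predicts a velocity field for a conditional flow-matching objective."""
    def __init__(self, obs_dim: int, act_dim: int, dim: int = 128, backbone: str = "transformer"):
        super().__init__()
        self.obs_proj = nn.Linear(obs_dim, dim)
        self.act_proj = nn.Linear(act_dim + 1, dim)    # +1 for the flow time t
        self.backbone = BACKBONES[backbone](dim)
        self.head = nn.Linear(dim, act_dim)

    def forward(self, obs, noisy_act, t):
        # obs: (B, T, obs_dim); noisy_act: (B, T, act_dim); t: (B, T, 1)
        tokens = self.obs_proj(obs) + self.act_proj(torch.cat([noisy_act, t], dim=-1))
        return self.head(self.backbone(tokens))        # predicted velocity

def flow_matching_loss(policy, obs, actions):
    """Standard conditional flow-matching loss on expert actions."""
    noise = torch.randn_like(actions)
    t = torch.rand(actions.shape[0], actions.shape[1], 1)
    x_t = (1 - t) * noise + t * actions                # linear interpolation path
    target_velocity = actions - noise
    pred_velocity = policy(obs, x_t, t)
    return nn.functional.mse_loss(pred_velocity, target_velocity)

# Usage: swap the backbone by name; the rest of the pipeline stays unchanged.
policy = FlowMatchingPolicy(obs_dim=10, act_dim=4, backbone="transformer")
obs, actions = torch.randn(8, 16, 10), torch.randn(8, 16, 4)
loss = flow_matching_loss(policy, obs, actions)
loss.backward()
```

The registry pattern is what would let alternative backbones be swapped in by name without touching the training objective or data pipeline, which is the kind of component-level exchange the abstract describes.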
Related papers
- A Survey of Sim-to-Real Methods in RL: Progress, Prospects and Challenges with Foundation Models [7.936554266939555]
Deep Reinforcement Learning (RL) has been explored and verified to be effective in solving decision-making tasks.
However, due to limited real-world data and the severe consequences of taking detrimental actions, RL policies are mostly learned within simulators.
This paper presents the first taxonomy that formally frames sim-to-real techniques in terms of the key elements of the Markov Decision Process.
arXiv Detail & Related papers (2025-02-18T12:57:29Z) - Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z) - Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation [51.06031200728449]
We propose a novel framework called mccHRL to provide different levels of temporal abstraction on listwise recommendation.
Within the hierarchical framework, the high-level agent studies the evolution of user perception, while the low-level agent produces the item selection policy.
Results show a significant performance improvement for our method compared with several well-known baselines.
arXiv Detail & Related papers (2024-09-11T17:01:06Z) - Towards a Unified View of Preference Learning for Large Language Models: A Survey [88.66719962576005]
Large Language Models (LLMs) exhibit remarkably powerful capabilities.
One of the crucial factors behind this success is aligning the LLM's output with human preferences.
We decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm.
arXiv Detail & Related papers (2024-09-04T15:11:55Z) - Dynamic and Adaptive Feature Generation with LLM [10.142660254703225]
We propose a dynamic and adaptive feature generation method that enhances the interpretability of the feature generation process.
Our approach broadens applicability across various data types and tasks and benefits from strategic flexibility.
arXiv Detail & Related papers (2024-06-04T20:32:14Z) - Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images [71.91424164693422]
We introduce an explicit point-based human reconstruction framework called HaP.
Our approach features fully explicit point cloud estimation, manipulation, generation, and refinement in 3D geometric space.
Our results may indicate a paradigm shift back to fully explicit, geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z) - Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks.
Our policies are flexible and can be trained end-to-end without requiring any action primitives.
arXiv Detail & Related papers (2022-03-10T06:44:24Z) - Policy-Based Bayesian Experimental Design for Non-Differentiable Implicit Models [25.00242490764664]
Reinforcement Learning for Deep Adaptive Design (RL-DAD) is a method for simulation-based optimal experimental design for non-differentiable implicit models.
RL-DAD maps prior histories to experiment designs offline and can be quickly deployed during online execution.
arXiv Detail & Related papers (2022-03-08T18:47:01Z) - Assessing Policy, Loss and Planning Combinations in Reinforcement Learning using a New Modular Architecture [0.0]
We propose a new modular software architecture suited for model-based reinforcement learning agents.
We show that the best combination of planning algorithm, policy, and loss function is heavily problem dependent.
arXiv Detail & Related papers (2022-01-08T18:30:25Z) - Attention Option-Critic [56.50123642237106]
We propose an attention-based extension to the option-critic framework.
We show that this leads to behaviorally diverse options which are also capable of state abstraction.
We also demonstrate that the learned options are more efficient, interpretable, and reusable than those learned by option-critic.
arXiv Detail & Related papers (2022-01-07T18:44:28Z) - Context-Specific Representation Abstraction for Deep Option Learning [43.68681795014662]
We introduce Context-Specific Representation Abstraction for Deep Option Learning (CRADOL).
CRADOL is a new framework that considers both temporal abstraction and context-specific representation abstraction to effectively reduce the size of the search over policy space.
Specifically, our method learns a factored belief state representation that enables each option to learn a policy over only a subsection of the state space.
arXiv Detail & Related papers (2021-09-20T22:50:01Z)