Single-Reset Divide & Conquer Imitation Learning
- URL: http://arxiv.org/abs/2402.09355v1
- Date: Wed, 14 Feb 2024 17:59:47 GMT
- Title: Single-Reset Divide & Conquer Imitation Learning
- Authors: Alexandre Chenu, Olivier Serris, Olivier Sigaud, Nicolas Perrin-Gilbert
- Abstract summary: Demonstrations are commonly used to speed up the learning process of Deep Reinforcement Learning algorithms.
Some algorithms have been developed to learn from a single demonstration.
- Score: 49.87201678501027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Demonstrations are commonly used to speed up the learning process of Deep
Reinforcement Learning algorithms. To cope with the difficulty of accessing
multiple demonstrations, some algorithms have been developed to learn from a
single demonstration. In particular, the Divide & Conquer Imitation Learning
algorithms leverage a sequential bias to learn a control policy for complex
robotic tasks using a single state-based demonstration. The latest version,
DCIL-II, demonstrates remarkable sample efficiency. This novel method operates
within an extended Goal-Conditioned Reinforcement Learning framework, ensuring
compatibility between intermediate and subsequent goals extracted from the
demonstration. However, a fundamental limitation arises from the assumption
that the system can be reset to specific states along the demonstrated
trajectory, confining the application to simulated systems. In response, we
introduce an extension called Single-Reset DCIL (SR-DCIL), designed to overcome
this constraint by relying on a single initial state reset rather than
sequential resets. To address this more challenging setting, we integrate two
mechanisms inspired by the Learning from Demonstrations literature, namely a
Demo-Buffer and Value Cloning, to guide the agent toward compatible success
states. In addition, we introduce Approximate Goal Switching to facilitate
training to reach goals distant from the reset state. Our paper makes several
contributions, highlighting the importance of the reset assumption in DCIL-II,
presenting the mechanisms of SR-DCIL variants, and evaluating their performance
in challenging robotic tasks compared to DCIL-II. In summary, this work offers
insights into the significance of reset assumptions in the framework of DCIL
and proposes SR-DCIL, a first step toward a versatile algorithm capable of
learning control policies under a weaker reset assumption.
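To make the mechanisms above concrete, here is a minimal Python sketch of how a Demo-Buffer, Value Cloning, and Approximate Goal Switching could fit together. All names and constants (DemoBuffer, demo_ratio, value_cloning_loss, approximate_goal_switching, GOAL_TOLERANCE) are hypothetical readings of the abstract's description, not the authors' implementation.

```python
# Hypothetical sketch of SR-DCIL's three mechanisms; all names and
# constants are assumptions, not the paper's published interface.
import random
from dataclasses import dataclass

import numpy as np


@dataclass
class Transition:
    state: np.ndarray
    action: np.ndarray
    reward: float
    next_state: np.ndarray
    goal: np.ndarray  # the intermediate demonstration goal being pursued


class DemoBuffer:
    """Replay buffer pre-seeded with transitions reconstructed from the
    single state-based demonstration; demonstration samples are mixed into
    every batch to keep guiding the agent toward compatible success states."""

    def __init__(self, demo_transitions, capacity=100_000):
        self.demo = list(demo_transitions)  # never evicted
        self.online = []                    # agent-collected experience
        self.capacity = capacity

    def add(self, transition):
        self.online.append(transition)
        if len(self.online) > self.capacity:
            self.online.pop(0)

    def sample(self, batch_size, demo_ratio=0.25):
        # A fixed fraction of each batch comes from the demonstration.
        n_demo = min(int(batch_size * demo_ratio), len(self.demo))
        batch = random.sample(self.demo, n_demo)
        n_online = min(batch_size - n_demo, len(self.online))
        return batch + random.sample(self.online, n_online)


def value_cloning_loss(critic_values, demo_value_targets):
    """Regularize the critic toward value targets derived from the
    demonstration (one plausible reading of Value Cloning)."""
    return float(np.mean((critic_values - demo_value_targets) ** 2))


GOAL_TOLERANCE = 0.05  # hypothetical switching radius


def approximate_goal_switching(state, goals, current_idx, tol=GOAL_TOLERANCE):
    """Advance to the next demonstration goal as soon as the agent is merely
    close to the current one, instead of waiting for an exact success; this
    eases reaching goals that are far from the single reset state."""
    close = np.linalg.norm(state - goals[current_idx]) < tol
    if close and current_idx < len(goals) - 1:
        return current_idx + 1
    return current_idx
```

In an off-policy goal-conditioned learner, DemoBuffer.sample would stand in for the usual replay sampling, value_cloning_loss would be added as a weighted term to the critic loss, and approximate_goal_switching would be called at every environment step of a rollout started from the single initial reset.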
Related papers
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z) - Exemplar-free Continual Representation Learning via Learnable Drift Compensation [24.114984920918715]
We propose Learnable Drift Compensation (LDC), which can effectively mitigate drift in any moving backbone.
LDC is fast and straightforward to integrate on top of existing continual learning approaches.
We achieve state-of-the-art performance in both supervised and semi-supervised settings.
arXiv Detail & Related papers (2024-07-11T14:23:08Z) - Sequential Action-Induced Invariant Representation for Reinforcement
Learning [1.2046159151610263]
How to accurately learn task-relevant state representations from high-dimensional observations with visual distractions is a challenging problem in visual reinforcement learning.
We propose a Sequential Action-induced invariant Representation (SAR) method, in which the encoder is optimized by an auxiliary learner to only preserve the components that follow the control signals of sequential actions.
arXiv Detail & Related papers (2023-09-22T05:31:55Z) - USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text
Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z) - Leveraging Sequentiality in Reinforcement Learning from a Single
Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve with unprecedented sample efficiency some challenging simulated tasks such as humanoid locomotion and stand-up.
arXiv Detail & Related papers (2022-11-09T10:28:40Z) - Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z) - Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person
Re-Identification [208.1227090864602]
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.