Related papers: Concurrent Training Improves the Performance of Behavioral Cloning from Observation

Concurrent Training Improves the Performance of Behavioral Cloning from Observation

URL: http://arxiv.org/abs/2008.01205v1
Date: Mon, 3 Aug 2020 21:30:28 GMT
Title: Concurrent Training Improves the Performance of Behavioral Cloning from Observation
Authors: Zachary W. Robertson, Matthew R. Walter
Abstract summary: Learning from demonstration is widely used as an efficient way for robots to acquire new skills. Learning from observation offers a way to utilize unlabeled demonstrations (e.g., video) to perform imitation learning. One approach to this is behavioral cloning from observation (BCO) We present a novel theoretical analysis of BCO, introduce a modification BCO*, and show that in the semi-supervised setting, BCO* can concurrently improve both its estimate for the inverse dynamics model and the expert policy.
Score: 10.939683083130616
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning from demonstration is widely used as an efficient way for robots to acquire new skills. However, it typically requires that demonstrations provide full access to the state and action sequences. In contrast, learning from observation offers a way to utilize unlabeled demonstrations (e.g., video) to perform imitation learning. One approach to this is behavioral cloning from observation (BCO). The original implementation of BCO proceeds by first learning an inverse dynamics model and then using that model to estimate action labels, thereby reducing the problem to behavioral cloning. However, existing approaches to BCO require a large number of initial interactions in the first step. Here, we provide a novel theoretical analysis of BCO, introduce a modification BCO*, and show that in the semi-supervised setting, BCO* can concurrently improve both its estimate for the inverse dynamics model and the expert policy. This result allows us to eliminate the dependence on initial interactions and dramatically improve the sample complexity of BCO. We evaluate the effectiveness of our algorithm through experiments on various benchmark domains. The results demonstrate that concurrent training not only improves over the performance of BCO but also results in performance that is competitive with state-of-the-art imitation learning methods such as GAIL and Value-Dice.

Related papers

UCO: A Multi-Turn Interactive Reinforcement Learning Method for Adaptive Teaching with Large Language Models [59.693733170193944]
Large language models (LLMs) are shifting from answer providers to intelligent tutors in educational settings.<n>Recent reinforcement learning approaches address this limitation but face two critical challenges.<n>We propose the Unidirectional Cognitive Optimization (UCO) method to address these challenges.
arXiv Detail & Related papers (2025-11-12T01:27:02Z)
CURE: Critical-Token-Guided Re-Concatenation for Entropy-Collapse Prevention [24.71056659948577]
We introduce CURE (Critical-token-gUided Re concatenation for Entropy-collapse prevention), a two-stage framework that balances exploration and exploitation.<n>CURE achieves a 5% performance gain across six math benchmarks, establishing state-of-the-art performance in both entropy and accuracy.
arXiv Detail & Related papers (2025-08-14T18:40:34Z)
WSS-CL: Weight Saliency Soft-Guided Contrastive Learning for Efficient Machine Unlearning Image Classification [0.0]
We introduce a new two-phase efficient machine unlearning method for image classification, in terms of weight saliency.<n>Our method is called weight saliency soft-guided contrastive learning for efficient machine unlearning image classification (WSS-CL)<n>Our proposed method yields much-improved unlearning efficacy with negligible performance loss compared to state-of-the-art approaches.
arXiv Detail & Related papers (2025-08-06T10:47:36Z)
Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimate the impact of individual training samples on model parameters without retraining.<n>This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning)<n>We introduce the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z)
Value from Observations: Towards Large-Scale Imitation Learning via Self-Improvement [19.883973457999282]
Imitation Learning from Observation (IfO) offers a powerful way to learn behaviors at large-scale.<n>This paper investigates idealized scenarios with mostly bimodal-quality data distributions and introduces a method to learn from such data.<n>Our method adapts RL-based imitation learning to action-free demonstrations, using a value function to transfer information between expert and non-expert data.
arXiv Detail & Related papers (2025-07-09T09:55:23Z)
COSEE: Consistency-Oriented Signal-Based Early Exiting via Calibrated Sample Weighting Mechanism [32.015402521706825]
Early exiting is an effective paradigm for improving the inference efficiency of pre-trained language models (PLMs) We propose a novel Consistency-Oriented Signal-based Early Exiting (COSEE) framework, which leverages a calibrated sample weighting mechanism. Experiments on the GLUE benchmark demonstrate the effectiveness of our COSEE across multiple exiting signals and backbones, yielding a better trade-off between performance and efficiency.
arXiv Detail & Related papers (2024-12-17T16:24:55Z)
Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations. We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class. Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
Unbiased and Efficient Self-Supervised Incremental Contrastive Learning [31.763904668737304]
We propose a self-supervised Incremental Contrastive Learning (ICL) framework consisting of a novel Incremental InfoNCE (NCE-II) loss function. ICL achieves up to 16.7x training speedup and 16.8x faster convergence with competitive results.
arXiv Detail & Related papers (2023-01-28T06:11:31Z)
Planning for Sample Efficient Imitation Learning [52.44953015011569]
Current imitation algorithms struggle to achieve high performance and high in-environment sample efficiency simultaneously. We propose EfficientImitate, a planning-based imitation learning method that can achieve high in-environment sample efficiency and performance simultaneously. Experimental results show that EI achieves state-of-the-art results in performance and sample efficiency.
arXiv Detail & Related papers (2022-10-18T05:19:26Z)
DBN-Mix: Training Dual Branch Network Using Bilateral Mixup Augmentation for Long-Tailed Visual Recognition [7.94190631530826]
We develop a simple yet effective method to improve the performance of DBN without cumulative learning. We present class-conditional temperature scaling that mitigates bias toward the majority class for the proposed DBN architecture.
arXiv Detail & Related papers (2022-07-05T17:01:27Z)
ReIL: A Framework for Reinforced Intervention-based Imitation Learning [3.0846824529023387]
We introduce Reinforced Intervention-based Learning (ReIL), a framework consisting of a general intervention-based learning algorithm and a multi-task imitation learning model. Experimental results from real world mobile robot navigation challenges indicate that ReIL learns rapidly from sparse supervisor corrections without suffering deterioration in performance.
arXiv Detail & Related papers (2022-03-29T09:30:26Z)
DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation IfO, a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator. Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms. This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk. We propose a more data-efficient IfO algorithm
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
Prototype-centered Attentive Learning (PAL) model composed of two novel components. First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective. Second, PAL integrates a attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness. We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.