Identifying Expert Behavior in Offline Training Datasets Improves
Behavioral Cloning of Robotic Manipulation Policies
- URL: http://arxiv.org/abs/2301.13019v2
- Date: Thu, 21 Sep 2023 10:39:03 GMT
- Title: Identifying Expert Behavior in Offline Training Datasets Improves
Behavioral Cloning of Robotic Manipulation Policies
- Authors: Qiang Wang, Robert McCarthy, David Cordova Bulens, Francisco Roldan
Sanchez, Kevin McGuinness, Noel E. O'Connor, and Stephen J. Redmond
- Abstract summary: This paper presents our solution for the Real Robot Challenge III, a competition featured in the NeurIPS 2022 Competition Track.
The competition aimed to address dexterous robotic manipulation tasks through learning from pre-collected offline data.
- Score: 15.383102120417407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents our solution for the Real Robot Challenge (RRC) III, a
competition featured in the NeurIPS 2022 Competition Track, aimed at addressing
dexterous robotic manipulation tasks through learning from pre-collected
offline data. Participants were provided with two types of datasets for each
task: expert datasets and mixed datasets containing behavior of varying skill levels. When trained on the expert datasets, the simplest offline policy learning algorithm, Behavioral Cloning (BC), performed remarkably well, outperforming even the most advanced offline reinforcement learning (RL) algorithms. However, BC's
performance deteriorated when applied to mixed datasets, and the performance of
offline RL algorithms was also unsatisfactory. Upon examining the mixed
datasets, we observed that they contained a significant amount of expert data,
although this data was unlabeled. To address this issue, we proposed a
semi-supervised learning-based classifier to identify the underlying expert
behavior within mixed datasets, effectively isolating the expert data. To
further enhance BC's performance, we leveraged the geometric symmetry of the
RRC arena to augment the training dataset through mathematical transformations.
In the end, our submission surpassed that of all other participants, even those
who employed complex offline RL algorithms and intricate data processing and
feature engineering techniques.
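The abstract names the two key ingredients, a semi-supervised classifier that isolates expert behavior and symmetry-based data augmentation, but gives no implementation details. The sketch below is one plausible instantiation, not the authors' code: a small PyTorch classifier is trained with expert transitions as positives and randomly drawn mixed-dataset transitions as provisional negatives, and the mixed dataset is then filtered by a confidence threshold before behavioral cloning. All names (ExpertClassifier, keep_threshold, the dict-of-arrays dataset format) are assumptions.

import numpy as np
import torch
import torch.nn as nn

class ExpertClassifier(nn.Module):
    """Scores a (state, action) pair with the logit that it comes from expert behavior."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def train_classifier(expert, mixed, epochs=10, batch=1024, lr=3e-4):
    """expert/mixed are dicts of aligned numpy arrays with keys 'obs' and 'act'.
    Expert transitions are positives; random mixed transitions act as provisional negatives."""
    clf = ExpertClassifier(expert["obs"].shape[1], expert["act"].shape[1])
    opt = torch.optim.Adam(clf.parameters(), lr=lr)
    bce = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for _ in range(max(1, len(mixed["obs"]) // batch)):
            pe = np.random.randint(len(expert["obs"]), size=batch)
            pm = np.random.randint(len(mixed["obs"]), size=batch)
            obs = torch.as_tensor(np.concatenate([expert["obs"][pe], mixed["obs"][pm]]), dtype=torch.float32)
            act = torch.as_tensor(np.concatenate([expert["act"][pe], mixed["act"][pm]]), dtype=torch.float32)
            y = torch.cat([torch.ones(batch), torch.zeros(batch)])
            loss = bce(clf(obs, act), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return clf

def filter_mixed(clf, mixed, keep_threshold=0.9):
    """Keep only the mixed-data transitions the classifier is confident are expert-like."""
    with torch.no_grad():
        p = torch.sigmoid(clf(torch.as_tensor(mixed["obs"], dtype=torch.float32),
                              torch.as_tensor(mixed["act"], dtype=torch.float32)))
    keep = p.numpy() > keep_threshold
    return {k: v[keep] for k, v in mixed.items()}

BC would then be trained on the expert data plus the filtered mixed data; the symmetry-based augmentation mentioned in the abstract (transforming observations and actions according to the arena's rotational symmetry) could be applied to this combined set before training, and iterating the scheme with pseudo-labels on the remaining mixed data would turn the single pass above into a self-training loop.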
Related papers
- Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning [116.87367592920171]
Offline Reinforcement Learning (RL) has shown promising results in learning a task-specific policy from a fixed dataset.
In scenarios where the dataset for a specific task is limited, a natural approach is to improve offline RL with datasets from other tasks.
We propose an uncertainty-based Multi-Task Data Sharing (MTDS) approach that shares the entire dataset without data selection.
arXiv Detail & Related papers (2024-04-30T08:16:52Z)
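The summary above only names the ingredients of the MTDS approach. As a rough, assumed illustration of how uncertainty-based pessimism can be applied to shared multi-task data (not taken from the paper), the TD target can be penalized by the disagreement of a Q-ensemble:

import torch

def pessimistic_target(q_ensemble, rew, next_obs, next_act, done, gamma=0.99, beta=1.0):
    """q_ensemble: iterable of Q-networks mapping (obs, act) -> values; all arguments are tensors."""
    with torch.no_grad():
        qs = torch.stack([q(next_obs, next_act) for q in q_ensemble])  # (ensemble, batch)
        mean, std = qs.mean(dim=0), qs.std(dim=0)
        # Large disagreement (e.g. on data shared from other tasks) yields a more conservative target.
        return rew + gamma * (1.0 - done) * (mean - beta * std)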
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
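As a generic sketch of the self-training pattern referenced here (the incremental details are assumed, not copied from the paper): fit a classifier on the labeled data, then repeatedly pseudo-label the unlabeled pool and adopt only the most confident samples per round.

import numpy as np

def incremental_self_training(clf, x_lab, y_lab, x_unlab, rounds=5, per_round=500):
    """clf is any scikit-learn style classifier exposing fit() and predict_proba()."""
    for _ in range(rounds):
        clf.fit(x_lab, y_lab)
        if len(x_unlab) == 0:
            break
        proba = clf.predict_proba(x_unlab)              # (n_unlabeled, n_classes)
        conf, pseudo = proba.max(axis=1), proba.argmax(axis=1)
        top = np.argsort(-conf)[:per_round]             # indices of the most confident samples
        x_lab = np.concatenate([x_lab, x_unlab[top]])
        y_lab = np.concatenate([y_lab, pseudo[top]])
        x_unlab = np.delete(x_unlab, top, axis=0)
    return clf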
- Robotic Manipulation Datasets for Offline Compositional Reinforcement Learning [25.24958915674457]
This paper provides four offline RL datasets for simulated robotic manipulation created using the 256 tasks from CompoSuite.
We provide training and evaluation settings for assessing an agent's ability to learn compositional task policies.
arXiv Detail & Related papers (2023-07-13T23:36:55Z)
- Personalized Decentralized Multi-Task Learning Over Dynamic Communication Graphs [59.96266198512243]
We propose a decentralized and federated learning algorithm for tasks that are positively and negatively correlated.
Our algorithm uses gradients to calculate the correlations among tasks automatically, and dynamically adjusts the communication graph to connect mutually beneficial tasks and isolate those that may negatively impact each other.
We conduct experiments on a synthetic Gaussian dataset and a large-scale celebrity attributes (CelebA) dataset.
arXiv Detail & Related papers (2022-12-21T18:58:24Z)
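A rough sketch of the gradient-correlation idea described above (the exact rule is assumed): estimate pairwise task affinity as the cosine similarity of per-task gradients and keep only positively correlated edges in the communication graph.

import numpy as np

def task_affinity(grads):
    """grads: (num_tasks, num_params) array of flattened per-task gradients -> cosine similarities."""
    unit = grads / (np.linalg.norm(grads, axis=1, keepdims=True) + 1e-8)
    return unit @ unit.T

def communication_graph(grads, threshold=0.0):
    """Connect tasks whose gradients are positively correlated; isolate conflicting ones."""
    adj = (task_affinity(grads) > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj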
- Offline Robot Reinforcement Learning with Uncertainty-Guided Human Expert Sampling [11.751910133386254]
Recent advances in batch (offline) reinforcement learning have shown promising results in learning from available offline data.
We propose a novel approach that uses uncertainty estimation to trigger the injection of human demonstration data.
Our experiments show that this approach is more sample efficient when compared to a naive way of combining expert data with data collected from a sub-optimal agent.
arXiv Detail & Related papers (2022-12-16T01:41:59Z)
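One way to read "uncertainty estimation triggers the injection of human demonstration data" (an assumed mechanism, not the paper's exact rule): when an ensemble of Q-functions disagrees strongly about the agent's proposed action, draw the next training batch from the expert demonstration buffer instead of the sub-optimal agent's buffer. The buffer objects and their sample() method are hypothetical.

import torch

def next_batch(q_ensemble, policy, obs, expert_buffer, agent_buffer,
               batch_size=256, disagreement_threshold=0.5):
    """Selects which replay buffer supplies the next batch, based on ensemble disagreement."""
    with torch.no_grad():
        act = policy(obs)
        qs = torch.stack([q(obs, act) for q in q_ensemble])   # (ensemble, batch)
        disagreement = qs.std(dim=0).mean().item()             # scalar uncertainty signal
    source = expert_buffer if disagreement > disagreement_threshold else agent_buffer
    return source.sample(batch_size)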
- Implicit Offline Reinforcement Learning via Supervised Learning [83.8241505499762]
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels.
We show how implicit models can leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets.
arXiv Detail & Related papers (2022-10-21T21:59:42Z)
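The summary mentions implicit models that leverage return information. A minimal, assumed sketch of that combination (not the paper's architecture): an energy model scores (state, return-to-go, action) triples, is trained contrastively against random counter-actions, and acts by picking the lowest-energy candidate action.

import torch
import torch.nn as nn

class EnergyPolicy(nn.Module):
    """Lower energy = more plausible action for the given state and target return."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 1 + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, rtg, act):
        return self.net(torch.cat([obs, rtg, act], dim=-1)).squeeze(-1)

def infonce_loss(model, obs, rtg, act, num_neg=64, act_low=-1.0, act_high=1.0):
    """Dataset action should receive lower energy than uniformly sampled counter-actions."""
    batch, act_dim = act.shape
    negs = torch.rand(batch, num_neg, act_dim) * (act_high - act_low) + act_low
    cands = torch.cat([act.unsqueeze(1), negs], dim=1)                 # (B, 1+K, act_dim)
    obs_rep = obs.unsqueeze(1).expand(-1, 1 + num_neg, -1)
    rtg_rep = rtg.unsqueeze(1).expand(-1, 1 + num_neg, -1)
    energies = model(obs_rep, rtg_rep, cands)                          # (B, 1+K)
    labels = torch.zeros(batch, dtype=torch.long)                      # dataset action is index 0
    return nn.functional.cross_entropy(-energies, labels)

@torch.no_grad()
def select_action(model, obs, rtg, act_dim, num_samples=256, act_low=-1.0, act_high=1.0):
    """Pick the lowest-energy action among uniformly sampled candidates."""
    cands = torch.rand(num_samples, act_dim) * (act_high - act_low) + act_low
    energies = model(obs.expand(num_samples, -1), rtg.expand(num_samples, -1), cands)
    return cands[energies.argmin()]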
- Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations [5.760034336327491]
We study the problem of offline Imitation Learning (IL), where an agent aims to learn an optimal expert behavior policy without additional online environment interactions.
We introduce an additional discriminator to distinguish expert and non-expert data.
Our proposed algorithm achieves higher returns and faster training speed compared to baseline algorithms.
arXiv Detail & Related papers (2022-07-20T17:29:04Z)
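A minimal sketch of the weighting idea (the exact form is assumed): a discriminator trained to separate expert from non-expert transitions produces a per-sample weight that scales the behavioral-cloning loss, so cleaner demonstrations dominate without hard filtering. Here discriminator and policy are assumed callables returning a logit and a deterministic action, respectively.

import torch

def weighted_bc_loss(policy, discriminator, obs, act):
    """Expert-probability from the discriminator re-weights the per-sample imitation loss."""
    with torch.no_grad():
        w = torch.sigmoid(discriminator(obs, act))      # per-sample probability of being expert data
    pred = policy(obs)                                   # predicted action
    per_sample = ((pred - act) ** 2).mean(dim=-1)        # MSE imitation loss per transition
    return (w * per_sample).sum() / (w.sum() + 1e-8)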
- When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? [86.43517734716606]
Offline reinforcement learning (RL) algorithms can acquire effective policies by utilizing previously collected experience, without any online interaction.
Behavioral cloning (BC) algorithms mimic a subset of the dataset via supervised learning.
We show that policies trained with offline RL on sufficiently noisy suboptimal data can attain better performance than even BC algorithms trained on expert data.
arXiv Detail & Related papers (2022-04-12T08:25:34Z)
- The Challenges of Exploration for Offline Reinforcement Learning [8.484491887821473]
We study the two processes of reinforcement learning: collecting informative experience and inferring optimal behaviour.
The task-agnostic setting for data collection, where the task is not known a priori, is of particular interest.
We use this decoupled framework to strengthen intuitions about exploration and the data prerequisites for effective offline RL.
arXiv Detail & Related papers (2022-01-27T23:59:56Z)
- Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning [4.819336169151637]
Offline Reinforcement Learning can learn policies from a given dataset without interacting with the environment.
We show how dataset characteristics influence the performance of Offline RL algorithms for discrete action environments.
For datasets with high trajectory quality (TQ), Behavior Cloning outperforms or performs similarly to the best Offline RL algorithms.
arXiv Detail & Related papers (2021-11-08T18:48:43Z)
- D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.