Related papers: IIFL: Implicit Interactive Fleet Learning from Heterogeneous Human Supervisors

URL: http://arxiv.org/abs/2306.15228v2
Date: Fri, 20 Oct 2023 05:43:49 GMT
Title: IIFL: Implicit Interactive Fleet Learning from Heterogeneous Human Supervisors
Authors: Gaurav Datta, Ryan Hoque, Anrui Gu, Eugen Solowjow, Ken Goldberg
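Abstract summary: Implicit Interactive Fleet Learning (IIFL) is an algorithm that builds on Implicit Behavior Cloning (IBC) for interactive imitation learning from multiple heterogeneous human supervisors. Results suggest that IIFL achieves a 2.8x higher success rate in simulation experiments and a 4.5x higher return on human effort in a physical block pushing task, compared with (Explicit) IFL, IBC, and other baselines.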
Score: 20.182639914630514
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Imitation learning has been applied to a range of robotic tasks, but can struggle when robots encounter edge cases that are not represented in the training data (i.e., distribution shift). Interactive fleet learning (IFL) mitigates distribution shift by allowing robots to access remote human supervisors during task execution and learn from them over time, but different supervisors may demonstrate the task in different ways. Recent work proposes Implicit Behavior Cloning (IBC), which is able to represent multimodal demonstrations using energy-based models (EBMs). In this work, we propose Implicit Interactive Fleet Learning (IIFL), an algorithm that builds on IBC for interactive imitation learning from multiple heterogeneous human supervisors. A key insight in IIFL is a novel approach for uncertainty quantification in EBMs using Jeffreys divergence. While IIFL is more computationally expensive than explicit methods, results suggest that IIFL achieves a 2.8x higher success rate in simulation experiments and a 4.5x higher return on human effort in a physical block pushing task over (Explicit) IFL, IBC, and other baselines.
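The abstract names Jeffreys divergence as the basis for uncertainty quantification in EBMs. Jeffreys divergence is the symmetrized KL divergence, J(p, q) = KL(p || q) + KL(q || p). The sketch below only illustrates that definition on small discrete distributions derived from energies. It is not the estimator used in IIFL; the function names, the finite set of candidate actions, and the example energy values are all assumptions made for illustration.

```python
import math


def softmax_neg_energy(energies):
    """Turn energies into probabilities with p(a) proportional to exp(-E(a)).

    Assumption: a small, finite set of candidate actions (hypothetical setup).
    """
    m = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - m)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jeffreys_divergence(p, q):
    """Symmetrized KL divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Two hypothetical energy models that prefer different actions.
p = softmax_neg_energy([0.0, 2.0, 2.0])
q = softmax_neg_energy([2.0, 0.0, 2.0])
disagreement = jeffreys_divergence(p, q)
```

Under these assumptions, a larger value of `disagreement` means the two models assign probability mass to different actions, which is the kind of signal one could read as uncertainty. Unlike plain KL divergence, the value does not depend on the order of the arguments.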

Related papers

Cooperative Multi-Agent Planning with Adaptive Skill Synthesis [16.228784877899976]
Multi-agent systems with reinforcement learning face challenges in sample efficiency, interpretability, and transferability. We present a novel multi-agent architecture that integrates vision-language models (VLMs) with a dynamic skill library and structured communication for decentralized closed-loop decision-making.
arXiv Detail & Related papers (2025-02-14T13:23:18Z)
Low-rank Prompt Interaction for Continual Vision-Language Retrieval [47.323830129786145]
We propose the Low-rank Prompt Interaction to address the problem of multi-modal understanding. Considering that the training parameters scale with the number of layers and tasks, we propose low-rank interaction-augmented decomposition. We also adopt hierarchical low-rank contrastive learning to ensure robust training.
arXiv Detail & Related papers (2025-01-24T10:00:47Z)
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications. FactorLLM achieves comparable performance to the source model, securing up to 85% of model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z)
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning [61.294110816231886]
We introduce a sparse, reusable, and flexible policy, Sparse Diffusion Policy (SDP). SDP selectively activates experts and skills, enabling efficient and task-specific learning without retraining the entire model. Demos and code can be found at https://forrest-110.io/sparse_diffusion_policy/.
arXiv Detail & Related papers (2024-07-01T17:59:56Z)
Variational Offline Multi-agent Skill Discovery [47.924414207796005]
We propose two novel auto-encoder schemes to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills. Our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining. Empirical evaluations on StarCraft tasks indicate that our approach significantly outperforms existing hierarchical multi-agent reinforcement learning (MARL) methods.
arXiv Detail & Related papers (2024-05-26T00:24:46Z)
Large Language Models for Orchestrating Bimanual Robots [19.60907949776435]
We present LAnguage-model-based Bimanual ORchestration (LABOR) to analyze task configurations and devise coordination control policies. We evaluate our method through simulated experiments involving two classes of long-horizon tasks using the NICOL humanoid robot.
arXiv Detail & Related papers (2024-04-02T15:08:35Z)
GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot [27.410618312830497]
In this paper, we propose GeRM (Generalist Robotic Model). We utilize offline reinforcement learning to optimize data utilization strategies. We employ a transformer-based VLA network to process multi-modal inputs and output actions.
arXiv Detail & Related papers (2024-03-20T07:36:43Z)
Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks. We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning [49.92517970237088]
We tackle the problem of training a robot to understand multimodal prompts. This type of task poses a major challenge to robots' capability to understand the interconnection and complementarity between vision and language signals. We introduce an effective framework that learns a policy to perform robot manipulation with multimodal prompts.
arXiv Detail & Related papers (2023-10-14T22:24:58Z)
Expanding Frozen Vision-Language Models without Retraining: Towards Improved Robot Perception [0.0]
Vision-language models (VLMs) have shown powerful capabilities in visual question answering and reasoning tasks. In this paper, we demonstrate a method of aligning the embedding spaces of different modalities to the vision embedding space. We show that using multiple modalities as input improves the VLM's scene understanding and enhances its overall performance in various tasks.
arXiv Detail & Related papers (2023-08-31T06:53:55Z)
Pre-training Language Model as a Multi-perspective Course Learner [103.17674402415582]
This study proposes a multi-perspective course learning (MCL) method for sample-efficient pre-training. In this study, three self-supervision courses are designed to alleviate inherent flaws of "tug-of-war" dynamics. Our method significantly improves ELECTRA's average performance by 2.8% and 3.2% absolute points respectively on GLUE and SQuAD 2.0 benchmarks.
arXiv Detail & Related papers (2023-05-06T09:02:10Z)
Flexible Parallel Learning in Edge Scenarios: Communication, Computational and Energy Cost [20.508003076947848]
Fog- and IoT-based scenarios often require combining data parallelism and model parallelism. We present a framework for flexible parallel learning (FPL), achieving both data and model parallelism. Our experiments, carried out using state-of-the-art deep-network architectures and large-scale datasets, confirm that FPL allows for an excellent trade-off among computational (hence energy) cost, communication overhead, and learning performance.
arXiv Detail & Related papers (2022-01-19T03:47:04Z)
ACNMP: Skill Transfer and Task Extrapolation through Learning from Demonstration and Reinforcement Learning via Representation Sharing [5.06461227260756]
ACNMPs can be used to implement skill transfer between robots having different morphology. We show the real-world suitability of ACNMPs through real robot experiments.
arXiv Detail & Related papers (2020-03-25T11:28:12Z)
On the interaction between supervision and self-play in emergent communication [82.290338507106]
We investigate the relationship between two categories of learning signals with the ultimate goal of improving sample efficiency. We find that first training agents via supervised learning on human data followed by self-play outperforms the converse.
arXiv Detail & Related papers (2020-02-04T02:35:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.