To the Noise and Back: Diffusion for Shared Autonomy
- URL: http://arxiv.org/abs/2302.12244v3
- Date: Thu, 15 Jun 2023 18:06:12 GMT
- Title: To the Noise and Back: Diffusion for Shared Autonomy
- Authors: Takuma Yoneda and Luzhe Sun and Ge Yang and Bradly Stadie and
Matthew Walter
- Abstract summary: We present a new approach to shared autonomy that employs a modulation of the forward and reverse diffusion process of diffusion models.
Our framework learns a distribution over a space of desired behaviors.
It then employs a diffusion model to translate the user's actions to a sample from this distribution.
- Score: 2.341116149201203
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Shared autonomy is an operational concept in which a user and an autonomous
agent collaboratively control a robotic system. It provides a number of
advantages over the extremes of full-teleoperation and full-autonomy in many
settings. Traditional approaches to shared autonomy rely on knowledge of the
environment dynamics, a discrete space of user goals that is known a priori, or
knowledge of the user's policy -- assumptions that are unrealistic in many
domains. Recent works relax some of these assumptions by formulating shared
autonomy with model-free deep reinforcement learning (RL). In particular, they
no longer need knowledge of the goal space (e.g., that the goals are discrete
or constrained) or environment dynamics. However, they need knowledge of a
task-specific reward function to train the policy. Unfortunately, such reward
specification can be a difficult and brittle process. Moreover, these
formulations inherently rely on human-in-the-loop training, which requires
preparing a surrogate policy that mimics users' behavior. In this
paper, we present a new approach to shared autonomy that employs a modulation
of the forward and reverse diffusion process of diffusion models. Our approach
does not assume known environment dynamics or the space of user goals, and in
contrast to previous work, it does not require any reward feedback, nor does it
require access to the user's policy during training. Instead, our framework
learns a distribution over a space of desired behaviors. It then employs a
diffusion model to translate the user's actions to a sample from this
distribution. Crucially, we show that it is possible to carry out this process
in a manner that preserves the user's control authority. We evaluate our
framework on a series of challenging continuous control tasks, and analyze its
ability to effectively correct user actions while maintaining their autonomy.
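To make the mechanism concrete, below is a minimal sketch, assuming a DDPM-style noise schedule and an epsilon-prediction `denoiser` trained on desired behaviors (both illustrative interfaces, not the authors' released code). The user's action is partially diffused toward noise and then denoised back, with the diffusion depth `k` controlling the balance of authority: smaller `k` preserves more of the user's action, while `k = T` effectively replaces it with a fresh sample from the learned distribution.

```python
import torch

@torch.no_grad()
def shared_autonomy_action(user_action, denoiser, betas, k):
    """Minimal sketch of the forward/reverse diffusion modulation.

    `denoiser(x, t)` is an assumed epsilon-prediction network trained on
    desired behaviors (state conditioning omitted for brevity); `betas`
    is a DDPM noise schedule of length T; 1 <= k <= T sets how far the
    user's action is diffused, i.e. how much authority the model gets.
    """
    alphas = 1.0 - betas
    alphas_bar = torch.cumprod(alphas, dim=0)

    # Forward process: partially diffuse the user's action (k of T steps).
    x = torch.sqrt(alphas_bar[k - 1]) * user_action \
        + torch.sqrt(1.0 - alphas_bar[k - 1]) * torch.randn_like(user_action)

    # Reverse process: denoise from step k back to 0, pulling the action
    # toward the learned distribution over desired behaviors.
    for t in reversed(range(k)):
        eps = denoiser(x, t)
        mean = (x - betas[t] / torch.sqrt(1.0 - alphas_bar[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x
```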
Related papers
- Personalisation via Dynamic Policy Fusion [14.948610521764415]
Deep reinforcement learning policies may not align with the personal preferences of human users.
We propose a more practical approach - to adapt the already trained policy to user-specific needs with the help of human feedback.
We empirically demonstrate in a number of environments that our proposed dynamic policy fusion approach consistently achieves the intended task.
arXiv Detail & Related papers (2024-09-30T07:23:47Z)
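The fusion step itself is not spelled out in this summary; one plausible reading, sketched below with invented interfaces, is to blend the frozen task policy's action distribution with an intent distribution learned from human feedback.

```python
import numpy as np

def fused_action(task_logits, intent_logits, beta=0.5, rng=None):
    """Hypothetical sketch of dynamic policy fusion for discrete actions:
    mix a frozen task policy's logits with logits from an intent model
    trained on human feedback, then sample the blended policy. `beta`
    (invented here) trades task performance against personalisation."""
    rng = rng or np.random.default_rng()
    logits = (1.0 - beta) * task_logits + beta * intent_logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)
```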
- Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots [54.55088169443828]
We introduce Cognitive Kernel, an open-source agent system towards the goal of generalist autopilots.
Unlike copilot systems, which primarily rely on users to provide essential state information, autopilot systems must complete tasks independently.
To achieve this, an autopilot system should be capable of understanding user intents, actively gathering necessary information from various real-world sources, and making wise decisions.
arXiv Detail & Related papers (2024-09-16T13:39:05Z)
- Towards Interpretable Foundation Models of Robot Behavior: A Task Specific Policy Generation Approach [1.7205106391379026]
Foundation models are a promising path toward general-purpose and user-friendly robots, but current designs have drawbacks.
In particular, the lack of modularity between tasks means that when model weights are updated, the behavior in other, unrelated tasks may be affected.
We present an alternative approach to the design of robot foundation models, which generates stand-alone, task-specific policies.
arXiv Detail & Related papers (2024-07-10T21:55:44Z)
- Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models [114.69732301904419]
We present an end-to-end, open-set (any environment/scene) autonomous driving approach capable of providing driving decisions from representations queryable by image and text.
Our approach demonstrates unparalleled results in diverse tests while achieving significantly greater robustness in out-of-distribution situations.
arXiv Detail & Related papers (2023-10-26T17:56:35Z)
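A representation "queryable by image and text" can be pictured as CLIP-style similarity scoring; the sketch below assumes patch features and a text embedding that already live in a shared vision-language space (placeholders, not the paper's actual models).

```python
import numpy as np

def query_by_text(patch_features, text_embedding):
    """Illustrative sketch: score (N, D) spatial patch features against a
    (D,) text-query embedding by cosine similarity, yielding a per-patch
    relevance map a downstream driving policy could attend over."""
    f = patch_features / np.linalg.norm(patch_features, axis=1, keepdims=True)
    q = text_embedding / np.linalg.norm(text_embedding)
    return f @ q  # (N,) relevance of each patch to the query
```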
- Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation [20.266695694005943]
Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments.
Data augmentation can increase robustness by making the model invariant to task-irrelevant changes in the agent's observation.
We propose an interactive framework to leverage feedback directly from the user to identify personalized task-irrelevant concepts.
arXiv Detail & Related papers (2023-07-12T17:55:08Z)
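A toy sketch of the augmentation side of this idea: once a user flags a concept as task-irrelevant (say, background colour), training observations are randomized along that dimension so the policy learns to ignore it. The observation layout and samplers below are invented for illustration.

```python
import random

def augment(observation, irrelevant_concepts, samplers):
    """Toy sketch: overwrite each user-flagged task-irrelevant field of a
    dict-valued observation with a random value, so the trained policy
    becomes invariant to it. `samplers` maps concept name -> sampler;
    both the layout and the mapping are hypothetical."""
    obs = dict(observation)
    for concept in irrelevant_concepts:
        obs[concept] = samplers[concept]()
    return obs

# e.g. augment({"agent_pos": (3, 4), "bg_color": "red"}, ["bg_color"],
#              {"bg_color": lambda: random.choice(["red", "green", "blue"])})
```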
- Domain Knowledge Driven Pseudo Labels for Interpretable Goal-Conditioned Interactive Trajectory Prediction [29.701029725302586]
We study the joint trajectory prediction problem with the goal-conditioned framework.
We introduce a conditional-variational-autoencoder-based (CVAE) model to explicitly encode different interaction modes into the latent space.
We propose a novel approach to avoid KL vanishing and induce an interpretable interactive latent space with pseudo labels.
arXiv Detail & Related papers (2022-03-28T21:41:21Z)
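For readers unfamiliar with the CVAE machinery, here is a skeletal version of the idea: an encoder maps a (history, future) pair to a latent interaction mode, and a decoder reconstructs the future from that latent plus the goal. Dimensions are arbitrary, and the paper's pseudo-label trick against KL vanishing is omitted.

```python
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    """Skeletal CVAE for goal-conditioned trajectory prediction
    (dimensions are arbitrary illustrations, not the paper's)."""
    def __init__(self, hist_dim=32, fut_dim=40, goal_dim=2, z_dim=8):
        super().__init__()
        self.enc = nn.Linear(hist_dim + fut_dim, 2 * z_dim)  # -> (mu, logvar)
        self.dec = nn.Sequential(
            nn.Linear(hist_dim + goal_dim + z_dim, 64), nn.ReLU(),
            nn.Linear(64, fut_dim))

    def forward(self, hist, fut, goal):
        mu, logvar = self.enc(torch.cat([hist, fut], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.dec(torch.cat([hist, goal, z], -1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon, kl  # train on reconstruction loss + beta * kl
```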
- Goal-Conditioned Reinforcement Learning with Imagined Subgoals [89.67840168694259]
We propose to incorporate imagined subgoals into policy learning to facilitate learning of complex tasks.
Imagined subgoals are predicted by a separate high-level policy, which is trained simultaneously with the policy and its critic.
We evaluate our approach on complex robotic navigation and manipulation tasks and show that it outperforms existing methods by a large margin.
arXiv Detail & Related papers (2021-07-01T15:30:59Z)
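Schematically, the decomposition looks like the sketch below; the interfaces are assumed, and note the paper may use imagined subgoals as a training-time signal rather than at decision time.

```python
def act(state, goal, high_policy, low_policy):
    """Illustrative decomposition: a high-level policy imagines an
    intermediate subgoal on the way to `goal`, and the goal-conditioned
    low-level policy acts toward it. Both callables are placeholders."""
    subgoal = high_policy(state, goal)  # predicted intermediate state
    return low_policy(state, subgoal)   # action toward the subgoal
```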
- Online Learning Demands in Max-min Fairness [91.37280766977923]
We describe mechanisms for the allocation of a scarce resource among multiple users in a way that is efficient, fair, and strategy-proof.
The mechanism is repeated for multiple rounds and a user's requirements can change on each round.
At the end of each round, users provide feedback about the allocation they received, enabling the mechanism to learn user preferences over time.
arXiv Detail & Related papers (2020-12-15T22:15:20Z)
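The classic max-min fair ("water-filling") allocation that such mechanisms build on can be computed as below; the repeated rounds and the learning of user preferences from feedback are omitted.

```python
def max_min_allocation(demands, capacity):
    """Classic water-filling max-min fair split of `capacity` among users
    with the given demands: repeatedly grant the smallest unsatisfied
    demand its equal share of the remaining capacity."""
    alloc = [0.0] * len(demands)
    remaining = capacity
    active = sorted(range(len(demands)), key=lambda i: demands[i])
    while active:
        share = remaining / len(active)
        i = active[0]
        if demands[i] <= share:
            alloc[i] = demands[i]      # fully satisfiable: grant it
            remaining -= demands[i]
            active.pop(0)
        else:
            for j in active:           # split the rest equally
                alloc[j] = share
            return alloc
    return alloc

# max_min_allocation([2, 4, 10], 12) -> [2, 4, 6]
```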
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
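The core idea, a policy that outputs the parameters of a second-order dynamical system (a DMP-style attractor) rather than raw actions, can be sketched as follows; the gains, phase variable, and forcing term are placeholders, not the paper's exact formulation.

```python
import numpy as np

def rollout_dmp(y0, goal, forcing_weights, steps=50, dt=0.02, alpha=25.0, beta=6.25):
    """Illustrative DMP-style rollout: a network would predict `goal` and
    `forcing_weights`; this integrator turns them into a smooth trajectory,
    so the policy effectively acts in trajectory space rather than emitting
    raw per-step actions. Constants here are conventional placeholders."""
    y = np.asarray(y0, dtype=float).copy()
    yd = np.zeros_like(y)
    traj = [y.copy()]
    for t in range(steps):
        phase = np.exp(-2.0 * t / steps)      # exponentially decaying phase
        forcing = forcing_weights * phase     # toy forcing term
        ydd = alpha * (beta * (goal - y) - yd) + forcing  # spring-damper
        yd += ydd * dt
        y += yd * dt
        traj.append(y.copy())
    return np.stack(traj)
```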
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
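One way to picture the combination, as a sketch under assumed interfaces rather than the authors' exact algorithm: trust the model-based planner where the perception estimate is confident, and hand control to the learned policy inside the uncertain region.

```python
def guided_action(state, pose_estimate, planner, rl_policy, threshold=0.1):
    """Sketch of uncertainty-gated control: follow the model-based planner
    while perception uncertainty is low, and fall back to the learned
    policy where the model cannot be trusted. `pose_estimate` (with .mean
    and .uncertainty), the callables, and the threshold are assumptions."""
    if pose_estimate.uncertainty < threshold:
        return planner(state, pose_estimate.mean)
    return rl_policy(state)
```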
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.