Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for
Test-Time Policy Adaptation
- URL: http://arxiv.org/abs/2307.06333v1
- Date: Wed, 12 Jul 2023 17:55:08 GMT
- Authors: Andi Peng, Aviv Netanyahu, Mark Ho, Tianmin Shu, Andreea Bobu, Julie
Shah, Pulkit Agrawal
- Abstract summary: Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments.
Data augmentation can increase robustness by making the model invariant to task-irrelevant changes in the agent's observation.
We propose an interactive framework to leverage feedback directly from the user to identify personalized task-irrelevant concepts.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Policies often fail due to distribution shift -- changes in the state and
reward that occur when a policy is deployed in new environments. Data
augmentation can increase robustness by making the model invariant to
task-irrelevant changes in the agent's observation. However, designers don't
know which concepts are irrelevant a priori, especially when different end
users have different preferences about how the task is performed. We propose an
interactive framework to leverage feedback directly from the user to identify
personalized task-irrelevant concepts. Our key idea is to generate
counterfactual demonstrations that allow users to quickly identify possible
task-relevant and irrelevant concepts. The knowledge of task-irrelevant
concepts is then used to perform data augmentation and thus obtain a policy
adapted to personalized user objectives. We present experiments validating our
framework on discrete and continuous control tasks with real human users. Our
method (1) enables users to better understand agent failure, (2) reduces the
number of demonstrations required for fine-tuning, and (3) aligns the agent to
individual user task preferences.
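The augmentation step described in the abstract can be illustrated with a minimal sketch. It assumes each state is a dictionary of named concepts and that the user has already marked some concepts as task-irrelevant. The function name `augment_demonstration`, the gridworld-style concepts, and the choice to resample each irrelevant concept once per trajectory are hypothetical illustrations, not the authors' implementation.

```python
import random


def augment_demonstration(demo, irrelevant_values, n_copies, rng):
    """Return copies of `demo` with task-irrelevant concepts resampled.

    demo: list of (state, action) pairs; each state is a dict mapping a
        concept name to its value (an assumed, simplified representation).
    irrelevant_values: dict mapping each concept the user marked as
        task-irrelevant to the list of values it may take.
    n_copies: number of augmented trajectories to generate.
    rng: a random.Random instance, passed in for reproducibility.
    """
    augmented = []
    for _ in range(n_copies):
        # Sample one value per irrelevant concept and hold it fixed for
        # the whole trajectory, so each copy is internally consistent.
        sampled = {c: rng.choice(vals) for c, vals in irrelevant_values.items()}
        # Build new state dicts; actions and relevant concepts are unchanged.
        augmented.append(
            [({**state, **sampled}, action) for state, action in demo]
        )
    return augmented


# Hypothetical demonstration in which the user said goal color does not matter.
demo = [
    ({"agent_pos": (0, 0), "goal_pos": (2, 2), "goal_color": "red"}, "right"),
    ({"agent_pos": (1, 0), "goal_pos": (2, 2), "goal_color": "red"}, "down"),
]
irrelevant = {"goal_color": ["red", "green", "blue"]}
augmented = augment_demonstration(
    demo, irrelevant, n_copies=5, rng=random.Random(0)
)
```

Fine-tuning a policy on `augmented` in addition to `demo` would, under these assumptions, encourage invariance to goal color while leaving the demonstrated actions untouched.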
Related papers
- Tell Me More! Towards Implicit User Intention Understanding of Language
Model Driven Agents [110.25679611755962]
Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions.
We introduce Intention-in-Interaction (IN3), a novel benchmark designed to inspect users' implicit intentions through explicit queries.
We empirically train Mistral-Interact, a powerful model that proactively assesses task vagueness, inquires user intentions, and refines them into actionable goals.
arXiv Detail & Related papers (2024-02-14T14:36:30Z)
- To the Noise and Back: Diffusion for Shared Autonomy [2.341116149201203]
We present a new approach to shared autonomy that employs a modulation of the forward and reverse diffusion process of diffusion models.
Our framework learns a distribution over a space of desired behaviors.
It then employs a diffusion model to translate the user's actions to a sample from this distribution.
arXiv Detail & Related papers (2023-02-23T18:58:36Z)
- PARTNR: Pick and place Ambiguity Resolving by Trustworthy iNteractive
leaRning [5.046831208137847]
We present the PARTNR algorithm that can detect ambiguities in the trained policy by analyzing multiple modalities in the pick and place poses.
PARTNR employs an adaptive, sensitivity-based, gating function that decides if additional user demonstrations are required.
We demonstrate the performance of PARTNR in a table-top pick and place task.
arXiv Detail & Related papers (2022-11-15T17:07:40Z)
- Relative Behavioral Attributes: Filling the Gap between Symbolic Goal
Specification and Reward Learning from Human Preferences [19.70421486855437]
Non-expert users can express complex objectives by expressing preferences over short clips of agent behaviors.
Relative Behavioral Attributes acts as a middle ground between exact goal specification and reward learning purely from preference labels.
We propose two different parametric methods that can potentially encode any kind of behavioral attributes from ordered behavior clips.
arXiv Detail & Related papers (2022-10-28T05:25:23Z)
- Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation
with Large Language Models [116.25562358482962]
State-of-the-art neural language models can be used to solve ad-hoc language tasks without the need for supervised training.
PromptIDE allows users to experiment with prompt variations, visualize prompt performance, and iteratively optimize prompts.
arXiv Detail & Related papers (2022-08-16T17:17:53Z)
- Exploring the Trade-off between Plausibility, Change Intensity and
Adversarial Power in Counterfactual Explanations using Multi-objective
Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- Generative multitask learning mitigates target-causing confounding [61.21582323566118]
We propose a simple and scalable approach to causal representation learning for multitask learning.
The improvement comes from mitigating unobserved confounders that cause the targets, but not the input.
Our results on the Attributes of People and Taskonomy datasets reflect the conceptual improvement in robustness to prior probability shift.
arXiv Detail & Related papers (2022-02-08T20:42:14Z)
- Lifelong Unsupervised Domain Adaptive Person Re-identification with
Coordinated Anti-forgetting and Adaptation [127.6168183074427]
We propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.
This is challenging because it requires the model to continuously adapt to unlabeled data of the target environments.
We design an effective scheme for this task, dubbed CLUDA-ReID, where the anti-forgetting is harmoniously coordinated with the adaptation.
arXiv Detail & Related papers (2021-12-13T13:19:45Z)
- Unsupervised Model Personalization while Preserving Privacy and
Scalability: An Open Problem [55.21502268698577]
This work investigates the task of unsupervised model personalization, adapted to continually evolving, unlabeled local user images.
We provide a novel Dual User-Adaptation framework (DUA) to explore the problem.
This framework flexibly disentangles user-adaptation into model personalization on the server and local data regularization on the user device.
arXiv Detail & Related papers (2020-03-30T09:35:12Z)