Robot Behavior Personalization from Sparse User Feedback
- URL: http://arxiv.org/abs/2410.19219v1
- Date: Fri, 25 Oct 2024 00:08:38 GMT
- Title: Robot Behavior Personalization from Sparse User Feedback
- Authors: Maithili Patel, Sonia Chernova
- Abstract summary: We create the Task Adaptation using Abstract Concepts (TAACo) framework.
TAACo can learn to predict the user's preferred manner of assistance with any given task.
We show that TAACo outperforms GPT-4 by 16% and a rule-based system by 54%, on prediction accuracy.
- Score: 7.51373397483077
- Abstract: As service robots become more general-purpose, they will need to adapt to their users' preferences over a large set of all possible tasks that they can perform. This includes preferences regarding which actions the users prefer to delegate to robots as opposed to doing themselves. Existing personalization approaches require task-specific data for each user. To handle diversity across all household tasks and users, and nuances in user preferences across tasks, we propose to learn a task adaptation function independently, which can be used in tandem with any universal robot policy to customize robot behavior. We create the Task Adaptation using Abstract Concepts (TAACo) framework. TAACo can learn to predict the user's preferred manner of assistance with any given task by mediating reasoning through a representation composed of abstract concepts built from user feedback. TAACo can generalize to an open set of household tasks from a small amount of user feedback and explain its inferences through intuitive concepts. We evaluate our model on a dataset we collected of 5 people's preferences, and show that TAACo outperforms GPT-4 by 16% and a rule-based system by 54% on prediction accuracy, with 40 samples of user feedback.
Related papers
- DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement [53.86523017756224]
We present DegustaBot, an algorithm for visual preference learning that solves household multi-object rearrangement tasks according to personal preference.
We collect a large dataset of naturalistic personal preferences in a simulated table-setting task.
We find that 50% of our model's predictions are likely to be found acceptable by at least 20% of people.
arXiv Detail & Related papers (2024-07-11T21:28:02Z)
- Enhancing Supermarket Robot Interaction: A Multi-Level LLM Conversational Interface for Handling Diverse Customer Intents [46.623273455512106]
This paper presents the design and evaluation of a novel multi-level LLM interface for supermarket robots.
We compare this approach to a specialized GPT model powered by GPT-4 Turbo.
We find statistically significant improvements in four key areas: performance, user satisfaction, user-agent partnership, and self-image enhancement.
arXiv Detail & Related papers (2024-06-16T19:13:01Z)
- Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents [110.25679611755962]
Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions.
We introduce Intention-in-Interaction (IN3), a novel benchmark designed to inspect users' implicit intentions through explicit queries.
We empirically train Mistral-Interact, a powerful model that proactively assesses task vagueness, inquires user intentions, and refines them into actionable goals.
arXiv Detail & Related papers (2024-02-14T14:36:30Z)
- Personalized Language Modeling from Personalized Human Feedback [49.344833339240566]
Reinforcement Learning from Human Feedback (RLHF) is commonly used to fine-tune large language models to better align with human preferences.
In this work, we aim to address this problem by developing methods for building personalized language models.
arXiv Detail & Related papers (2024-02-06T04:18:58Z)
- Factual and Personalized Recommendations using Language Models and Reinforcement Learning [38.96462170594542]
We develop a comPelling, Precise, Personalized, Preference-relevant language model (P4LM).
P4LM recommends items to users while putting emphasis on explaining item characteristics and their relevance.
We develop a joint reward function that measures precision, appeal, and personalization.
arXiv Detail & Related papers (2023-10-09T21:58:55Z)
- Sequence-aware item recommendations for multiply repeated user-item interactions [0.0]
We design a recommender system that induces the temporal dimension in the task of item recommendation.
It considers sequences of item interactions for each user in order to make recommendations.
This method is empirically shown to give highly accurate predictions of user-item interactions for all users in a retail environment.
arXiv Detail & Related papers (2023-04-02T17:06:07Z)
- Personalizing Task-oriented Dialog Systems via Zero-shot Generalizable Reward Function [19.652303125864204]
We propose a novel framework, P-ToD, to personalize task-oriented dialog systems.
P-ToD uses a pre-trained GPT-2 as a backbone model and works in three phases.
Our novel reward function can quantify the quality of the generated responses even for unseen profiles.
arXiv Detail & Related papers (2023-03-24T04:33:40Z)
- Assisting Human Decisions in Document Matching [52.79491990823573]
We devise a proxy matching task that allows us to evaluate which kinds of assistive information improve decision makers' performance.
We find that providing black-box model explanations reduces users' accuracy on the matching task.
On the other hand, custom methods that are designed to closely attend to some task-specific desiderata are found to be effective in improving user performance.
arXiv Detail & Related papers (2023-02-16T17:45:20Z)
- Eliciting User Preferences for Personalized Multi-Objective Decision Making through Comparative Feedback [76.7007545844273]
We propose a multi-objective decision making framework that accommodates different user preferences over objectives.
Our model consists of a Markov decision process with a vector-valued reward function, with each user having an unknown preference vector.
We suggest an algorithm that finds a nearly optimal policy for the user using a small number of comparison queries.
arXiv Detail & Related papers (2023-02-07T23:58:19Z)
- Proactive Detractor Detection Framework Based on Message-Wise Sentiment Analysis Over Customer Support Interactions [60.87845704495664]
We propose a framework relying solely on chat-based customer support interactions for predicting the recommendation decision of individual users.
For our case study, we analyzed a total number of 16.4k users and 48.7k customer support conversations within the financial vertical of a large e-commerce company in Latin America.
Our results show that, with respective feature interpretability, it is possible to predict the likelihood of a user to recommend a product or service, based solely on the message-wise sentiment evolution of their CS conversations in a fully automated way.
arXiv Detail & Related papers (2022-11-08T00:43:36Z)
- Towards Personalized Explanation of Robot Path Planning via User Feedback [1.7231251035416644]
We present a system for generating personalized explanations of robot path planning via user feedback.
The system is capable of detecting and resolving any preference conflict via user interaction.
arXiv Detail & Related papers (2020-11-01T15:10:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.