Adaptive Querying for Reward Learning from Human Feedback
- URL: http://arxiv.org/abs/2412.07990v1
- Date: Wed, 11 Dec 2024 00:02:48 GMT
- Title: Adaptive Querying for Reward Learning from Human Feedback
- Authors: Yashwanthi Anand, Sandhya Saisubramanian,
- Abstract summary: Learning from human feedback is a popular approach to train robots to adapt to user preferences and improve safety.
We examine how to learn a penalty function associated with unsafe behaviors, such as side effects, using multiple forms of human feedback.
We employ an iterative, two-phase approach which first selects critical states for querying, and then uses information gain to select a feedback format for querying.
- Score: 5.587293092389789
- License:
- Abstract: Learning from human feedback is a popular approach to train robots to adapt to user preferences and improve safety. Existing approaches typically consider a single querying (interaction) format when seeking human feedback and do not leverage multiple modes of user interaction with a robot. We examine how to learn a penalty function associated with unsafe behaviors, such as side effects, using multiple forms of human feedback, by optimizing the query state and feedback format. Our framework for adaptive feedback selection enables querying for feedback in critical states in the most informative format, while accounting for the cost and probability of receiving feedback in a certain format. We employ an iterative, two-phase approach which first selects critical states for querying, and then uses information gain to select a feedback format for querying across the sampled critical states. Our evaluation in simulation demonstrates the sample efficiency of our approach.
Related papers
- Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation [67.88747330066049]
Fine-grained feedback captures nuanced distinctions in image quality and prompt-alignment.
We show that demonstrating its superiority to coarse-grained feedback is not automatic.
We identify key challenges in eliciting and utilizing fine-grained feedback.
arXiv Detail & Related papers (2024-06-24T17:19:34Z) - RLVF: Learning from Verbal Feedback without Overgeneralization [94.19501420241188]
We study the problem of incorporating verbal feedback without such overgeneralization.
We develop a new method Contextualized Critiques with Constrained Preference Optimization (C3PO)
Our approach effectively applies verbal feedback to relevant scenarios while preserving existing behaviors for other contexts.
arXiv Detail & Related papers (2024-02-16T18:50:24Z) - UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present textscUltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z) - Learning from Negative User Feedback and Measuring Responsiveness for
Sequential Recommenders [13.762960304406016]
We introduce explicit and implicit negative user feedback into the training objective of sequential recommenders.
We demonstrate the effectiveness of this approach using live experiments on a large-scale industrial recommender system.
arXiv Detail & Related papers (2023-08-23T17:16:07Z) - System-Level Natural Language Feedback [83.24259100437965]
We show how to use feedback to formalize system-level design decisions in a human-in-the-loop-process.
We conduct two case studies of this approach for improving search query and dialog response generation.
We show the combination of system-level and instance-level feedback brings further gains.
arXiv Detail & Related papers (2023-06-23T16:21:40Z) - Simulating Bandit Learning from User Feedback for Extractive Question
Answering [51.97943858898579]
We study learning from user feedback for extractive question answering by simulating feedback using supervised data.
We show that systems initially trained on a small number of examples can dramatically improve given feedback from users on model-predicted answers.
arXiv Detail & Related papers (2022-03-18T17:47:58Z) - Adaptive Summaries: A Personalized Concept-based Summarization Approach
by Learning from Users' Feedback [0.0]
This paper proposes an interactive concept-based summarization model, called Adaptive Summaries.
The system learns from users' provided information gradually while interacting with the system by giving feedback in an iterative loop.
It helps users make high-quality summaries based on their preferences by maximizing the user-desired content in the generated summaries.
arXiv Detail & Related papers (2020-12-24T18:27:50Z) - Dialogue Response Ranking Training with Large-Scale Human Feedback Data [52.12342165926226]
We leverage social media feedback data to build a large-scale training dataset for feedback prediction.
We trained DialogRPT, a set of GPT-2 based models on 133M pairs of human feedback data.
Our ranker outperforms the conventional dialog perplexity baseline with a large margin on predicting Reddit feedback.
arXiv Detail & Related papers (2020-09-15T10:50:05Z) - Large-scale Hybrid Approach for Predicting User Satisfaction with
Conversational Agents [28.668681892786264]
Measuring user satisfaction level is a challenging task, and a critical component in developing large-scale conversational agent systems.
Human annotation based approaches are easier to control, but hard to scale.
A novel alternative approach is to collect user's direct feedback via a feedback elicitation system embedded to the conversational agent system.
arXiv Detail & Related papers (2020-05-29T16:29:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.