System-Level Natural Language Feedback
- URL: http://arxiv.org/abs/2306.13588v3
- Date: Sat, 3 Feb 2024 00:24:11 GMT
- Title: System-Level Natural Language Feedback
- Authors: Weizhe Yuan, Kyunghyun Cho, Jason Weston
- Abstract summary: We show how to use feedback to formalize system-level design decisions in a human-in-the-loop process.
We conduct two case studies of this approach for improving search query and dialog response generation.
We show the combination of system-level and instance-level feedback brings further gains.
- Score: 83.24259100437965
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Natural language (NL) feedback offers rich insights into user experience.
While existing studies focus on an instance-level approach, where feedback is
used to refine specific examples, we introduce a framework for system-level use
of NL feedback. We show how to use feedback to formalize system-level design
decisions in a human-in-the-loop process in order to produce better models. In
particular, this is done through: (i) metric design for tasks; and (ii)
language model prompt design for refining model responses. We conduct two case
studies of this approach for improving search query and dialog response
generation, demonstrating the effectiveness of system-level feedback. We show
the combination of system-level and instance-level feedback brings further
gains, and that human-written instance-level feedback results in more grounded
refinements than GPT-3.5-written ones, underlining the importance of human
feedback for building systems. We release our code and data at
https://github.com/yyy-Apple/Sys-NL-Feedback.
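To make the two system-level uses concrete, here is a minimal, purely illustrative sketch of deriving recurring criteria from clustered feedback and reusing them both as a task metric and inside a refinement prompt. The feedback snippets, theme keywords, and function names are hypothetical; the authors' actual pipeline is in the repository above.

```python
# Illustrative sketch: turning instance-level NL feedback into system-level
# artifacts (a metric and a refinement prompt). A real pipeline would cluster
# feedback embeddings with human-in-the-loop review; keyword matching here is
# a trivial stand-in.
from collections import Counter

# Hypothetical feedback snippets collected from users.
feedback = [
    "the query dropped the location I mentioned",
    "response ignored my location",
    "answer was too verbose",
    "way too long and rambling",
]

# Hypothetical themes a human might name after inspecting feedback clusters.
themes = {
    "preserves location": ["location"],
    "concise": ["verbose", "long"],
}

def derive_criteria(feedback_items):
    counts = Counter()
    for text in feedback_items:
        for theme, keywords in themes.items():
            if any(k in text for k in keywords):
                counts[theme] += 1
    # Keep themes frequent enough to warrant a system-level criterion.
    return [t for t, c in counts.items() if c >= 2]

# Use (i): criteria become a task metric (fraction of criteria satisfied,
# given per-criterion judgments for one response).
def metric(judgments: dict) -> float:
    criteria = derive_criteria(feedback)
    return sum(judgments[c] for c in criteria) / len(criteria)

# Use (ii): the same criteria drive a refinement prompt for a language model.
def refinement_prompt(query: str, response: str) -> str:
    bullets = "\n".join(f"- {c}" for c in derive_criteria(feedback))
    return (f"Rewrite the response so that it satisfies:\n{bullets}\n\n"
            f"Query: {query}\nResponse: {response}\nImproved response:")

print(refinement_prompt("coffee shops near me", "Here is a long essay on coffee..."))
```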
Related papers
- Learning from Naturally Occurring Feedback [25.266461597402056]
We propose a scalable method for extracting feedback that users naturally include when interacting with chat models.
We manually annotated conversation data to confirm the presence of naturally occurring feedback.
We apply our method to over 1M conversations to obtain hundreds of thousands of feedback samples.
arXiv Detail & Related papers (2024-07-15T17:41:34Z)
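As a rough illustration of what extracting such naturally occurring feedback could look like, the sketch below flags user turns that react to the previous assistant turn with a feedback cue. The cue patterns and labels are hypothetical stand-ins, not the taxonomy or method from "Learning from Naturally Occurring Feedback".

```python
# Illustrative sketch of mining naturally occurring feedback from chat logs.
import re

# Hypothetical cue patterns; a real taxonomy would be broader and validated
# against manually annotated conversations.
FEEDBACK_CUES = {
    "negative": [r"\bthat'?s (wrong|not right)\b", r"\bno[,.]? i meant\b"],
    "positive": [r"\bperfect\b", r"\bthanks?, (that|this) (works|helped)\b"],
}

def extract_feedback(conversation):
    """Yield (turn_index, label, text) for user turns that react to the
    assistant's previous turn with an identifiable feedback cue."""
    for i, (role, text) in enumerate(conversation):
        if role != "user" or i == 0:
            continue
        for label, patterns in FEEDBACK_CUES.items():
            if any(re.search(p, text.lower()) for p in patterns):
                yield i, label, text
                break

chat = [
    ("user", "Convert 10 miles to km."),
    ("assistant", "10 miles is about 16.09 km."),
    ("user", "Perfect, thanks!"),
]
print(list(extract_feedback(chat)))
```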
- RLVF: Learning from Verbal Feedback without Overgeneralization [94.19501420241188]
We study the problem of incorporating verbal feedback without overgeneralizing it to contexts where the feedback does not apply.
We develop a new method Contextualized Critiques with Constrained Preference Optimization (C3PO)
Our approach effectively applies verbal feedback to relevant scenarios while preserving existing behaviors for other contexts.
arXiv Detail & Related papers (2024-02-16T18:50:24Z)
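The C3PO summary above names the objective but not its shape. The sketch below illustrates one plausible reading, assuming a preference (DPO-style) loss on prompts where the feedback is relevant plus a penalty anchoring out-of-scope behavior to a reference model; the paper's actual losses, data construction, and hyperparameters differ.

```python
# Shape of a C3PO-style objective (illustrative only): apply a preference loss
# where the feedback is relevant, and anchor the model to its original
# behavior everywhere else.
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Standard DPO: prefer feedback-compliant completions on in-scope prompts.
    margins = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -F.logsigmoid(margins).mean()

def anchor_loss(logp_model, logp_ref):
    # Keep out-of-scope behavior close to the reference model (a simple
    # log-prob matching penalty as a stand-in for the paper's constraint).
    return (logp_model - logp_ref).pow(2).mean()

# Toy per-sequence log-probabilities standing in for model outputs.
in_scope = dpo_loss(torch.tensor([-5.0]), torch.tensor([-7.0]),
                    torch.tensor([-6.0]), torch.tensor([-6.5]))
out_scope = anchor_loss(torch.tensor([-4.2]), torch.tensor([-4.0]))

lam = 0.5  # hypothetical trade-off between applying feedback and preserving behavior
print(float(in_scope + lam * out_scope))
```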
- UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z)
- Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation [68.9440575276396]
This survey aims to provide an overview of the recent research that has leveraged human feedback to improve natural language generation.
First, we introduce an encompassing formalization of feedback, and identify and organize existing research into a taxonomy following this formalization.
Second, we discuss how feedback can be described by its format and objective, and cover the two approaches proposed to use feedback (either for training or decoding): directly using the feedback or training feedback models.
Third, we provide an overview of the nascent field of AI feedback, which exploits large language models to make judgments based on a set of principles and minimize the need for human intervention.
arXiv Detail & Related papers (2023-05-01T17:36:06Z)
- Simulating Bandit Learning from User Feedback for Extractive Question Answering [51.97943858898579]
We study learning from user feedback for extractive question answering by simulating feedback using supervised data.
We show that systems initially trained on a small number of examples can dramatically improve given feedback from users on model-predicted answers.
arXiv Detail & Related papers (2022-03-18T17:47:58Z)
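A toy rendering of the simulation idea in "Simulating Bandit Learning from User Feedback": supervised labels stand in for the user, returning binary reward on a sampled answer, with a REINFORCE-style update. The candidate spans, learning rate, and 3-way softmax policy are hypothetical simplifications of a full QA model.

```python
# Toy simulation of bandit learning from user feedback (illustrative, not the
# paper's setup): a policy samples one of several candidate answer spans, a
# "user" simulated from supervised data returns binary reward, and we apply
# a REINFORCE-style update.
import torch

candidates = ["in 1969", "in 1972", "Neil Armstrong"]
gold = "in 1969"  # supervised label used only to simulate the user

logits = torch.zeros(len(candidates), requires_grad=True)
opt = torch.optim.SGD([logits], lr=0.5)

for step in range(200):
    probs = torch.softmax(logits, dim=0)
    action = torch.multinomial(probs, 1).item()          # sample an answer
    reward = 1.0 if candidates[action] == gold else 0.0  # simulated feedback
    loss = -reward * torch.log(probs[action])            # REINFORCE on binary reward
    opt.zero_grad(); loss.backward(); opt.step()

print(candidates[torch.argmax(logits).item()])  # converges toward the gold span
```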
- Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning [69.42679922160684]
We propose feedback-weighted learning based on importance sampling to improve upon an initial supervised system using binary user feedback.
Our work opens the prospect to exploit interactions with real users and improve conversational systems after deployment.
arXiv Detail & Related papers (2020-11-01T19:50:34Z)
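A minimal sketch of a feedback-weighted loss of this kind, assuming deployment logs record the old system's probability for each served answer together with binary user feedback; the exact weighting and normalization in the paper differ.

```python
# Sketch of feedback-weighted learning with importance sampling (illustrative):
# each deployed prediction y was sampled with probability p_old(y|x) under the
# initial system and received binary user feedback r, so the supervised loss
# on (x, y) is reweighted by r / p_old.
import math

def feedback_weighted_loss(examples, new_logprob):
    """examples: list of (x, y, r, p_old) tuples from deployment logs.
    new_logprob: function giving log p_new(y|x) under the current model."""
    total, norm = 0.0, 0.0
    for x, y, r, p_old in examples:
        w = r / p_old                      # importance weight from binary feedback
        total += w * (-new_logprob(x, y))  # weighted negative log-likelihood
        norm += w
    return total / max(norm, 1e-8)

# Hypothetical deployment logs and a stand-in model.
logs = [("q1", "a1", 1, 0.8), ("q2", "a2", 0, 0.6), ("q3", "a3", 1, 0.2)]
toy_logprob = lambda x, y: math.log(0.5)
print(feedback_weighted_loss(logs, toy_logprob))
```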