The Future of Open Human Feedback
- URL: http://arxiv.org/abs/2408.16961v2
- Date: Wed, 4 Sep 2024 15:39:47 GMT
- Title: The Future of Open Human Feedback
- Authors: Shachar Don-Yehiya, Ben Burtenshaw, Ramon Fernandez Astudillo, Cailean Osborne, Mimansa Jaiswal, Tzu-Sheng Kuo, Wenting Zhao, Idan Shenfeld, Andi Peng, Mikhail Yurochkin, Atoosa Kasirzadeh, Yangsibo Huang, Tatsunori Hashimoto, Yacine Jernite, Daniel Vila-Suero, Omri Abend, Jennifer Ding, Sara Hooker, Hannah Rose Kirk, Leshem Choshen
- Abstract summary: We bring together interdisciplinary experts to assess the opportunities and challenges to realizing an open ecosystem of human feedback for AI.
We first look for successful practices in peer production, open source, and citizen science communities.
We end by envisioning the components needed to underpin a sustainable and open human feedback ecosystem.
- Score: 65.2188596695235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human feedback on conversations with large language models (LLMs) is central to how these systems learn about the world, improve their capabilities, and are steered toward desirable and safe behaviors. However, this feedback is mostly collected by frontier AI labs and kept behind closed doors. In this work, we bring together interdisciplinary experts to assess the opportunities and challenges to realizing an open ecosystem of human feedback for AI. We first look for successful practices in peer production, open source, and citizen science communities. We then characterize the main challenges for open human feedback. For each, we survey current approaches and offer recommendations. We end by envisioning the components needed to underpin a sustainable and open human feedback ecosystem. At the center of this ecosystem are mutually beneficial feedback loops between users and specialized models, incentivizing a diverse stakeholder community of model trainers and feedback providers to support a general open feedback pool.
Related papers
- Source Echo Chamber: Exploring the Escalation of Source Bias in User, Data, and Recommender System Feedback Loop [65.23044868332693]
We investigate the impact of source bias in recommender systems.
We show the prevalence of source bias and reveal a potential digital echo chamber with source bias amplification.
We introduce a black-box debiasing method that maintains model impartiality towards both HGC and AIGC.
arXiv Detail & Related papers (2024-05-28T09:34:50Z) - Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z) - UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z) - Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback [22.89046164459011]
We present a technique called Human Guided Exploration (HuGE), which uses low-quality feedback from non-expert users.
HuGE guides exploration for reinforcement learning not only in simulation but also in the real world, all without meticulous reward specification.
arXiv Detail & Related papers (2023-07-20T17:30:37Z) - Continually Improving Extractive QA via Human Feedback [59.49549491725224]
We study continually improving an extractive question answering (QA) system via human user feedback.
We conduct experiments involving thousands of user interactions under diverse setups to broaden the understanding of learning from feedback over time.
arXiv Detail & Related papers (2023-05-21T14:35:32Z) - Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning [13.64577704565643]
We argue that these models are too simplistic and that RL researchers need to develop more realistic human models to design and evaluate their algorithms.
This paper calls for research from different disciplines to address key questions about how humans provide feedback to AIs and how we can build more robust human-in-the-loop RL systems.
arXiv Detail & Related papers (2022-06-27T13:58:51Z) - Perspectives on Incorporating Expert Feedback into Model Updates [46.99664744930785]
We devise a taxonomy to match expert feedback types with practitioner updates.
A practitioner may receive feedback from an expert at the observation- or domain-level.
We review existing work from ML and human-computer interaction to describe this feedback-update taxonomy.
arXiv Detail & Related papers (2022-05-13T21:46:55Z) - Reinforcement Learning with Feedback from Multiple Humans with Diverse Skills [1.433758865948252]
A promising approach to improving robustness and exploration in reinforcement learning is to collect human feedback.
It is, however, often too expensive to obtain enough feedback of good quality.
We aim to rely on a group of multiple experts with different skill levels to generate enough feedback.
arXiv Detail & Related papers (2021-11-16T16:19:19Z) - Advances and Challenges in Conversational Recommender Systems: A Survey [133.93908165922804]
We provide a systematic review of the techniques used in current conversational recommender systems (CRSs).
We summarize the key challenges of developing CRSs into five directions.
These research directions involve multiple research fields, such as information retrieval (IR), natural language processing (NLP), and human-computer interaction (HCI).
arXiv Detail & Related papers (2021-01-23T08:53:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.