Active teacher selection for reinforcement learning from human feedback
- URL: http://arxiv.org/abs/2310.15288v1
- Date: Mon, 23 Oct 2023 18:54:43 GMT
- Title: Active teacher selection for reinforcement learning from human feedback
- Authors: Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell
- Abstract summary: Reinforcement learning from human feedback (RLHF) enables machine learning systems to learn objectives from human feedback.
We propose the Hidden Utility Bandit framework to model differences in teacher rationality, expertise, and costliness.
We develop a variety of solution algorithms and apply them to two real-world domains: paper recommendation systems and COVID-19 vaccine testing.
- Score: 14.009227941725783
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning from human feedback (RLHF) enables machine learning
systems to learn objectives from human feedback. A core limitation of these
systems is their assumption that all feedback comes from a single human
teacher, despite querying a range of distinct teachers. We propose the Hidden
Utility Bandit (HUB) framework to model differences in teacher rationality,
expertise, and costliness, formalizing the problem of learning from multiple
teachers. We develop a variety of solution algorithms and apply them to two
real-world domains: paper recommendation systems and COVID-19 vaccine testing.
We find that the Active Teacher Selection (ATS) algorithm outperforms baseline
algorithms by actively selecting when and which teacher to query. The HUB
framework and ATS algorithm demonstrate the importance of leveraging
differences between teachers to learn accurate reward models, facilitating
future research on active teacher selection for robust reward modeling.
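To make the setting concrete, here is a minimal runnable sketch of a multi-teacher query loop, assuming Boltzmann-rational teachers with differing rationality (beta) and per-query cost, and a naive value-per-cost selection rule; the teacher model and the selection heuristic are illustrative assumptions, not the paper's exact HUB or ATS formulation.
```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden utilities of two alternatives the learner must rank.
true_utility = {"A": 1.0, "B": 0.4}

# Teachers differ in rationality (beta) and in per-query cost.
teachers = [
    {"name": "expert", "beta": 5.0, "cost": 1.0},
    {"name": "novice", "beta": 0.5, "cost": 0.2},
]

def query(teacher):
    """Boltzmann-rational preference: P(prefer A) = sigmoid(beta * (u_A - u_B))."""
    diff = true_utility["A"] - true_utility["B"]
    p_a = 1.0 / (1.0 + np.exp(-teacher["beta"] * diff))
    return "A" if rng.random() < p_a else "B"

# Naive active selection: rank teachers by a fixed information-per-cost
# proxy (beta / cost) and spend the query budget on the best trade-off.
budget, spent, votes = 10.0, 0.0, {"A": 0, "B": 0}
while spent < budget:
    teacher = max(teachers, key=lambda t: t["beta"] / t["cost"])
    votes[query(teacher)] += 1
    spent += teacher["cost"]

print(votes)  # the more informative teacher's answers dominate the tally
```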
Related papers
- YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data generated by YODA improves SFT with a significant performance gain.
arXiv Detail & Related papers (2024-01-28T14:32:15Z)
- TGRL: An Algorithm for Teacher Guided Reinforcement Learning [45.38447023752256]
It is common to train a policy to maximize a combination of reinforcement and teacher-student learning objectives.
We present a principled approach, along with an approximate implementation, for dynamically and automatically balancing when to follow the teacher and when to use rewards; a toy sketch of such a mixed objective appears below.
arXiv Detail & Related papers (2023-07-06T17:58:40Z)
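As a toy illustration of the kind of mixed objective described above, the sketch below blends a task reward with a teacher-agreement term; the decaying mixing coefficient is a placeholder assumption, not TGRL's actual balancing rule.
```python
import numpy as np

def mixed_objective(task_reward, teacher_agreement, alpha):
    """Weighted blend of environment reward and agreement with the teacher.
    TGRL's contribution is choosing alpha dynamically; the decay schedule
    below is only a placeholder for illustration."""
    return alpha * teacher_agreement + (1.0 - alpha) * task_reward

# Hypothetical schedule: trust the teacher early, lean on rewards later.
for step in (0, 500, 1000):
    alpha = float(np.exp(-step / 400.0))
    print(step, round(mixed_objective(0.6, 0.9, alpha), 3))
```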
- Active Reward Learning from Multiple Teachers [17.10187575303075]
Reward learning algorithms utilize human feedback to infer a reward function, which is then used to train an AI system.
This human feedback is often a preference comparison, in which the human teacher compares several samples of AI behavior and chooses which they believe best accomplishes the objective.
While reward learning typically assumes that all feedback comes from a single teacher, in practice these systems often query multiple teachers to gather sufficient training data; a minimal preference-fitting sketch appears below.
arXiv Detail & Related papers (2023-03-02T01:26:53Z)
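Preference comparisons of this kind are commonly modeled with a Bradley-Terry likelihood; the sketch below fits a scalar reward weight to simulated pairwise choices by gradient ascent. This is a generic construction under stated assumptions, not this paper's specific algorithm.
```python
import numpy as np

rng = np.random.default_rng(1)

# Two behavior samples summarized by one feature each; the teacher's
# choices are noisy reports about the hidden reward weight true_w.
feat_a, feat_b, true_w = 0.9, 0.2, 2.0

# Simulate 200 teacher choices under a Bradley-Terry preference model.
p_a = 1.0 / (1.0 + np.exp(-true_w * (feat_a - feat_b)))
prefers_a = rng.random(200) < p_a  # True means "teacher preferred A"

# Fit the reward weight by gradient ascent on the log-likelihood.
w, lr = 0.0, 0.5
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-w * (feat_a - feat_b)))
    w += lr * float(np.mean(prefers_a - p)) * (feat_a - feat_b)

print(round(w, 2))  # recovers a weight close to true_w
```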
- Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation [70.92135839545314]
We propose dynamic prior knowledge (DPK), which integrates part of the teacher's features as prior knowledge before feature distillation.
Our DPK makes the performance of the student model positively correlated with that of the teacher model, meaning that the student's accuracy can be further boosted by applying larger teachers; an illustrative sketch appears below.
arXiv Detail & Related papers (2022-06-13T11:52:13Z)
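One way to read "integrating part of the teacher's features as prior knowledge" is to splice teacher activations into the student's feature map before computing a distillation loss; the random spatial mask and the 50% ratio below are illustrative assumptions, not DPK's actual scheme.
```python
import numpy as np

rng = np.random.default_rng(2)

# Toy feature maps from teacher and student (channels x height x width).
teacher_feat = rng.normal(size=(8, 4, 4))
student_feat = rng.normal(size=(8, 4, 4))

# Splice teacher features into a random subset of spatial positions of the
# student's map, so the teacher acts as a prior; the 50% ratio is arbitrary.
mask = rng.random((1, 4, 4)) < 0.5  # broadcast over channels
mixed = np.where(mask, teacher_feat, student_feat)

# Standard feature-distillation loss, computed on the mixed features.
distill_loss = float(np.mean((mixed - teacher_feat) ** 2))
print(round(distill_loss, 3))
```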
- Unsupervised Domain Adaptive Person Re-Identification via Human Learning Imitation [67.52229938775294]
In recent years, researchers have proposed using the teacher-student framework to decrease the domain gap between different person re-identification datasets.
Inspired by recent teacher-student framework based methods, we explore imitating the human learning process from several aspects.
arXiv Detail & Related papers (2021-11-28T01:14:29Z)
- A teacher-student framework for online correctional learning [12.980296933051509]
We show that the variance of the student's estimate is reduced with the help of the teacher.
We formulate the online problem, in which the teacher has to decide at each time instant whether or not to alter the observations.
We validate the framework in numerical experiments and compare the optimal online policy with the one from the batch setting; a toy variance illustration appears below.
arXiv Detail & Related papers (2021-11-15T15:01:00Z)
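A toy numeric illustration of the variance-reduction claim, assuming the student estimates a mean from a noisy stream and the teacher's correction amounts to clipping extreme observations; the correction rule is a made-up stand-in for the paper's policy.
```python
import numpy as np

rng = np.random.default_rng(3)
TRUE_MEAN = 1.0

def student_estimate(teacher_corrects: bool) -> float:
    """Student averages a noisy stream; the teacher may clip outliers."""
    obs = rng.normal(TRUE_MEAN, 1.0, size=50)
    if teacher_corrects:
        # Illustrative correction: pull extreme observations back toward
        # a plausible range before the student sees them.
        obs = np.clip(obs, TRUE_MEAN - 1.5, TRUE_MEAN + 1.5)
    return float(np.mean(obs))

for teacher_corrects in (False, True):
    estimates = [student_estimate(teacher_corrects) for _ in range(2000)]
    print(teacher_corrects, round(float(np.var(estimates)), 4))
# The corrected stream yields a smaller variance for the student's estimate.
```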
- Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency.
We propose a gradient-optimization-based teacher-aware learner that can incorporate the teacher's cooperative intention into its likelihood function.
arXiv Detail & Related papers (2021-10-01T00:27:47Z)
- Point Adversarial Self Mining: A Simple Method for Facial Expression Recognition [79.75964372862279]
We propose Point Adversarial Self Mining (PASM) to improve the recognition accuracy in facial expression recognition.
PASM uses a point adversarial attack method and a trained teacher network to locate the most informative position related to the target task.
The adaptive generation of learning materials and the teacher/student updates can be conducted multiple times, improving the network's capability iteratively.
arXiv Detail & Related papers (2020-08-26T06:39:24Z)
- Neural Multi-Task Learning for Teacher Question Detection in Online Classrooms [50.19997675066203]
We build an end-to-end neural framework that automatically detects questions from teachers' audio recordings.
By incorporating multi-task learning techniques, we are able to strengthen the understanding of semantic relations among different types of questions.
arXiv Detail & Related papers (2020-05-16T02:17:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.