Diversity from Human Feedback
- URL: http://arxiv.org/abs/2310.06648v2
- Date: Sun, 10 Dec 2023 13:58:34 GMT
- Title: Diversity from Human Feedback
- Authors: Ren-Jian Wang, Ke Xue, Yutong Wang, Peng Yang, Haobo Fu, Qiang Fu,
Chao Qian
- Abstract summary: We propose the problem of learning a behavior space from human feedback and present a general method called Diversity from Human Feedback (DivHF) to solve it.
DivHF learns a behavior descriptor consistent with human preference by querying human feedback.
We demonstrate the effectiveness of DivHF by integrating it with the Quality-Diversity optimization algorithm MAP-Elites and conducting experiments on the QDax suite.
- Score: 39.05609489642456
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diversity plays a significant role in many problems, such as ensemble
learning, reinforcement learning, and combinatorial optimization. How to define
the diversity measure is a longstanding problem. Many methods rely on expert
experience to define a proper behavior space and then obtain the diversity
measure, which is, however, challenging in many scenarios. In this paper, we
propose the problem of learning a behavior space from human feedback and
present a general method called Diversity from Human Feedback (DivHF) to solve
it. DivHF learns a behavior descriptor consistent with human preference by
querying human feedback. The learned behavior descriptor can be combined with
any distance measure to define a diversity measure. We demonstrate the
effectiveness of DivHF by integrating it with the Quality-Diversity
optimization algorithm MAP-Elites and conducting experiments on the QDax suite.
The results show that DivHF learns a behavior space that aligns better with
human requirements compared to direct data-driven approaches and leads to more
diverse solutions under human preference. Our contributions include formulating
the problem, proposing the DivHF method, and demonstrating its effectiveness
through experiments.
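As a rough illustration of the DivHF idea described above, the following is a minimal sketch, not the authors' implementation: a small network maps trajectories to behavior descriptors, is trained with a Bradley-Terry-style loss on triplet similarity queries answered by a human, and the learned descriptor then indexes a MAP-Elites archive. The names (`BehaviorEncoder`, `preference_loss`, `descriptor_to_cell`), the exact query format, and the network sizes are assumptions made for illustration.
```python
# Minimal, illustrative sketch of the DivHF idea (not the authors' code):
# learn a behavior descriptor from human similarity feedback, then use it
# to index a MAP-Elites archive.
import torch
import torch.nn as nn

class BehaviorEncoder(nn.Module):
    """Maps flattened trajectory data to a low-dimensional behavior descriptor."""
    def __init__(self, traj_dim: int, descriptor_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(traj_dim, 128), nn.ReLU(),
            nn.Linear(128, descriptor_dim),
        )

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        return self.net(traj)

def preference_loss(encoder: BehaviorEncoder,
                    anchor: torch.Tensor,
                    pos: torch.Tensor,
                    neg: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry-style loss on a triplet query: the human judged `pos`
    to behave more similarly to `anchor` than `neg` does, so the descriptor
    distance d(anchor, pos) should be smaller than d(anchor, neg)."""
    d_pos = torch.norm(encoder(anchor) - encoder(pos), dim=-1)
    d_neg = torch.norm(encoder(anchor) - encoder(neg), dim=-1)
    logits = d_neg - d_pos               # model's margin in favor of the human label
    return nn.functional.softplus(-logits).mean()  # equals -log sigmoid(logits)

def descriptor_to_cell(descriptor: torch.Tensor, bins_per_dim: int = 10) -> tuple:
    """Discretize a descriptor in [0, 1]^k into a MAP-Elites archive cell index."""
    idx = (descriptor.clamp(0.0, 1.0) * bins_per_dim).long()
    return tuple(idx.clamp(max=bins_per_dim - 1).tolist())

if __name__ == "__main__":
    traj_dim = 64
    encoder = BehaviorEncoder(traj_dim)
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

    # Stand-ins for trajectories selected for a human similarity query.
    anchor, pos, neg = (torch.randn(32, traj_dim) for _ in range(3))
    loss = preference_loss(encoder, anchor, pos, neg)
    opt.zero_grad(); loss.backward(); opt.step()

    # The learned descriptor (squashed to [0, 1]) defines the behavior space
    # that MAP-Elites uses to place solutions into archive cells.
    cell = descriptor_to_cell(torch.sigmoid(encoder(anchor[:1])[0]))
    print("archive cell:", cell)
```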
Related papers
- Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning [12.742158403867002]
Reinforcement Learning from Human Feedback is a powerful paradigm for aligning foundation models to human values and preferences.
Current RLHF techniques cannot account for the naturally occurring differences in individual human preferences across a diverse population.
We develop a class of multimodal RLHF methods to address the need for pluralistic alignment.
arXiv Detail & Related papers (2024-08-19T15:18:30Z)
- Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment [84.32768080422349]
Alignment with human preference prevents large language models from generating misleading or toxic content.
We propose a new formulation of prompt diversity, implying a linear correlation with the final performance of LLMs after fine-tuning.
arXiv Detail & Related papers (2024-03-17T07:08:55Z)
- Provable Multi-Party Reinforcement Learning with Diverse Human Feedback [63.830731470186855]
Reinforcement learning with human feedback (RLHF) is an emerging paradigm to align models with human preferences.
We show how traditional RLHF approaches can fail since learning a single reward function cannot capture and balance the preferences of multiple individuals.
We incorporate meta-learning to learn multiple preferences and adopt different social welfare functions to aggregate the preferences across multiple parties.
arXiv Detail & Related papers (2024-03-08T03:05:11Z)
- Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL [51.48239006107272]
In this paper, we discuss how to measure and improve the diversity of the demonstrations for text-to-SQL research.
We propose fusing iteratively for demonstrations (Fused) to build a high-diversity demonstration pool.
Our method achieves an average improvement of 3.2% and 5.0% with and without human labeling on several mainstream datasets.
arXiv Detail & Related papers (2024-02-16T13:13:18Z)
- MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences [101.57443597426374]
Reinforcement Learning from Human Feedback (RLHF) aligns language models to human preferences by employing a singular reward model derived from preference data.
We learn a mixture of preference distributions via an expectation-maximization algorithm to better represent diverse human preferences.
Our algorithm achieves an average improvement of more than 16% in win-rates over conventional RLHF algorithms.
arXiv Detail & Related papers (2024-02-14T03:56:27Z)
- Generalized People Diversity: Learning a Human Perception-Aligned Diversity Representation for People Images [11.038712922077458]
We introduce a diverse people image ranking method which more flexibly aligns with human notions of people diversity.
The Perception-Aligned Text-derived Human representation Space (PATHS) aims to capture all or many relevant features of people-related diversity.
arXiv Detail & Related papers (2024-01-25T17:19:22Z)
- Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization [13.436983663467938]
This paper introduces Quality Diversity through Human Feedback (QDHF), a novel approach that progressively infers diversity metrics from human judgments of similarity among solutions.
Empirical studies show that QDHF significantly outperforms state-of-the-art methods in automatic diversity discovery.
In open-ended generative tasks, QDHF substantially enhances the diversity of text-to-image generation from a diffusion model.
arXiv Detail & Related papers (2023-10-18T16:46:16Z)
- AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model [69.12623428463573]
AlignDiff is a novel framework that quantifies human preferences, covering their abstractness, and uses them to guide diffusion planning.
It can accurately match user-customized behaviors and efficiently switch from one to another.
We demonstrate its superior performance on preference matching, switching, and covering compared to other baselines.
arXiv Detail & Related papers (2023-10-03T13:53:08Z)
- Efficient Diversity-Driven Ensemble for Deep Neural Networks [28.070540722925152]
We propose Efficient Diversity-Driven Ensemble (EDDE) to address both the diversity and the efficiency of an ensemble.
Compared with other well-known ensemble methods, EDDE achieves the highest ensemble accuracy with the lowest training cost.
We evaluate EDDE on Computer Vision (CV) and Natural Language Processing (NLP) tasks.
arXiv Detail & Related papers (2021-12-26T04:28:47Z)
- Effective Diversity in Population Based Reinforcement Learning [38.62641968788987]
We introduce an approach to optimize all members of a population simultaneously.
Rather than using pairwise distance, we measure the volume of the entire population in a behavioral manifold.
Our algorithm Diversity via Determinants (DvD) adapts the degree of diversity during training using online learning techniques; a determinant-based sketch of this idea follows the list below.
arXiv Detail & Related papers (2020-02-03T10:09:16Z)
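For the DvD entry above, here is a minimal sketch of scoring population diversity by the log-determinant (volume) of a kernel matrix over behavior embeddings rather than by summed pairwise distances. The RBF kernel and the function names are illustrative assumptions, not the authors' exact implementation.
```python
# Illustrative sketch of determinant-based population diversity (DvD idea):
# embed each policy's behavior, build a kernel matrix over the population,
# and use its log-determinant as a volume-style diversity score.
import numpy as np

def rbf_kernel(embeddings: np.ndarray, lengthscale: float = 1.0) -> np.ndarray:
    """K[i, j] = exp(-||b_i - b_j||^2 / (2 * lengthscale^2))."""
    sq_dists = np.sum((embeddings[:, None, :] - embeddings[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def population_diversity(behavior_embeddings: np.ndarray) -> float:
    """Larger values mean the behaviors span more of the embedding space."""
    K = rbf_kernel(behavior_embeddings)
    _, logdet = np.linalg.slogdet(K + 1e-6 * np.eye(len(K)))  # jitter for stability
    return float(logdet)

# Two spread-out behaviors score higher than two nearly identical ones.
spread = np.array([[0.0, 0.0], [3.0, 3.0]])
clumped = np.array([[0.0, 0.0], [0.1, 0.1]])
assert population_diversity(spread) > population_diversity(clumped)
```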
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.