Data-Centric Human Preference Optimization with Rationales
- URL: http://arxiv.org/abs/2407.14477v3
- Date: Sat, 3 Aug 2024 17:32:08 GMT
- Title: Data-Centric Human Preference Optimization with Rationales
- Authors: Hoang Anh Just, Ming Jin, Anit Sahu, Huy Phan, Ruoxi Jia
- Abstract summary: Reinforcement learning from human feedback plays a crucial role in aligning language models towards human preferences.
This work shifts focus to improving preference learning through a data-centric approach.
We propose enriching existing preference datasets with machine-generated rationales that explain the reasons behind choices.
- Score: 23.243583332894737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning from human feedback plays a crucial role in aligning language models towards human preferences, traditionally represented through comparisons between pairs or sets of responses within a given context. While many studies have enhanced algorithmic techniques to optimize learning from such data, this work shifts focus to improving preference learning through a data-centric approach. Specifically, we propose enriching existing preference datasets with machine-generated rationales that explain the reasons behind choices. We develop a simple and principled framework to augment current preference learning methods with rationale information. Our comprehensive analysis highlights how rationales enhance learning efficiency. Extensive experiments reveal that rationale-enriched preference learning offers multiple advantages: it improves data efficiency, accelerates convergence to higher-performing models, and reduces verbosity bias and hallucination. Furthermore, this framework is versatile enough to integrate with various preference optimization algorithms. Overall, our findings highlight the potential of re-imagining data design for preference learning, demonstrating that even freely available machine-generated rationales can significantly boost performance across multiple dimensions. The code repository is available at https://github.com/reds-lab/preference-learning-with-rationales
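As a rough illustration of the data-centric idea in the abstract (attaching a machine-generated rationale to each existing preference pair), here is a minimal sketch. It is not the authors' implementation: the `PreferencePair` record, the `enrich_with_rationales` helper, and the `stub_rationale` generator are hypothetical names introduced only for this example, and a real pipeline would call a language model where the stub is used.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class PreferencePair:
    """One preference record: a prompt plus a chosen and a rejected response."""
    prompt: str
    chosen: str
    rejected: str
    rationale: Optional[str] = None  # explanation of why `chosen` is preferred


def enrich_with_rationales(
    pairs: List[PreferencePair],
    generate_rationale: Callable[[PreferencePair], str],
) -> List[PreferencePair]:
    """Return copies of `pairs` with the `rationale` field filled in.

    `generate_rationale` is an assumption of this sketch: any callable that
    maps a pair to explanatory text (for example, a wrapper around an LLM).
    """
    return [replace(p, rationale=generate_rationale(p)) for p in pairs]


def stub_rationale(pair: PreferencePair) -> str:
    """Placeholder generator; stands in for a model call in this sketch."""
    return f"[rationale placeholder for prompt: {pair.prompt}]"


if __name__ == "__main__":
    data = [PreferencePair(prompt="Q", chosen="A", rejected="B")]
    for record in enrich_with_rationales(data, stub_rationale):
        print(record)
```

The original records are left unmodified, so the enriched dataset can be compared against the plain one when training with a given preference optimization algorithm.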
Related papers
- Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness [27.43137305486112]
We propose a novel Self-supervised Preference Optimization (SPO) framework, which constructs a self-supervised preference degree loss combined with the alignment loss.
The results demonstrate that SPO can be seamlessly integrated with existing preference optimization methods to achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-09-26T12:37:26Z) - Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning [19.962212551963383]
Active Learning (AL) allows models to learn interactively from user feedback.
This paper introduces a counterfactual data augmentation approach to AL.
arXiv Detail & Related papers (2024-08-07T14:55:04Z) - Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring [16.38771834692938]
We propose a novel framework capable of generating more faithful rationales and, more importantly, matching performance with black-box scoring systems.
We first mimic the human assessment process by querying Large Language Models (LLMs) to generate a thought tree.
We then summarise intermediate assessment decisions from each thought tree path for creating synthetic rationale data and rationale preference data.
arXiv Detail & Related papers (2024-06-28T14:33:05Z) - Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback [110.16220825629749]
Learning from preference feedback has emerged as an essential step for improving the generation quality and performance of modern language models.
In this work, we identify four core aspects of preference-based learning: preference data, learning algorithm, reward model, and policy training prompts.
Our findings indicate that all aspects are important for performance, with better preference data leading to the largest improvements.
arXiv Detail & Related papers (2024-06-13T16:17:21Z) - Aligning Large Language Models with Self-generated Preference Data [72.99676237703099]
We propose a new framework that boosts the alignment of large language models (LLMs) with human preferences.
Our key idea is leveraging the human prior knowledge within the small (seed) data.
We introduce a noise-aware preference learning algorithm to mitigate the risk of low quality within generated preference data.
arXiv Detail & Related papers (2024-06-06T18:01:02Z) - Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO).
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z) - LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface form cues to identify data that exemplifies the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z) - A Data Driven Sequential Learning Framework to Accelerate and Optimize Multi-Objective Manufacturing Decisions [1.5771347525430772]
This paper presents a novel data-driven Bayesian optimization framework that utilizes sequential learning to efficiently optimize complex systems.
The proposed framework is particularly beneficial in practical applications where acquiring data can be expensive and resource intensive.
It implies that the proposed data-driven framework can lead to similar manufacturing decisions with reduced costs and time.
arXiv Detail & Related papers (2023-04-18T20:33:08Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - Training With Data Dependent Dynamic Learning Rates [8.833548357664608]
We propose an optimization framework which accounts for difference in loss function characteristics across instances.
Our framework learns a dynamic learning rate for each instance present in the dataset.
We show that our framework can be used for personalization of a machine learning model towards a known targeted data distribution.
arXiv Detail & Related papers (2021-05-27T21:52:29Z)