Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
- URL: http://arxiv.org/abs/2403.11124v2
- Date: Sat, 30 Mar 2024 16:48:16 GMT
- Title: Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
- Authors: Feifan Song, Bowen Yu, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang, Yongbin Li
- Abstract summary: Alignment with human preference prevents large language models from generating misleading or toxic content.
We propose a new formulation of prompt diversity that implies a linear correlation with the final performance of LLMs after fine-tuning.
- Score: 84.32768080422349
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Alignment with human preference prevents large language models (LLMs) from generating misleading or toxic content, but it requires high-cost human feedback. Assuming that human annotation resources are limited, there are two ways of allocating them: labeling more diverse PROMPTS or more diverse RESPONSES. Nonetheless, a direct comparison of their impact has been absent. In this work, we first control the diversity of each side via the number of fine-tuning samples, which directly reflects their influence. We find that, rather than numerous prompts, more responses paired with fewer prompts better trigger LLMs for human alignment. Additionally, prompt diversity is a more complex concept than response diversity, which is typically quantified by a single number. Consequently, a new formulation of prompt diversity is proposed, which further implies a linear correlation with the final performance of LLMs after fine-tuning. We also leverage it for data augmentation and conduct experiments to show its effect on different algorithms.
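The abstract claims an approximately linear relation between its proposed prompt-diversity measure and post-fine-tuning performance. As a rough, hypothetical illustration of how such a relation could be checked (the paper's actual diversity formulation and numbers are not reproduced here), a minimal NumPy sketch with placeholder values:

```python
import numpy as np

# Illustrative sketch only: the diversity scores and win rates below are
# placeholder values, not results from the paper.
# prompt_diversity stands in for the paper's proposed diversity measure,
# evaluated on several fine-tuning datasets; win_rate stands in for the
# corresponding post-fine-tuning alignment performance.
prompt_diversity = np.array([0.35, 0.52, 0.68, 0.81])
win_rate = np.array([0.41, 0.49, 0.57, 0.66])

# Least-squares fit to probe an (approximately) linear relation between
# prompt diversity and final performance.
slope, intercept = np.polyfit(prompt_diversity, win_rate, deg=1)
pred = slope * prompt_diversity + intercept
r2 = 1.0 - np.sum((win_rate - pred) ** 2) / np.sum((win_rate - win_rate.mean()) ** 2)
print(f"win_rate ~ {slope:.2f} * diversity + {intercept:.2f} (R^2 = {r2:.2f})")
```

A high R^2 on held-out datasets would support the claimed linear correlation; the actual measure and evaluation protocol are defined in the paper itself.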
Related papers
- One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity [2.5975241792179378]
Researchers have proposed using large language models (LLMs) as replacements for humans in behavioral research.
It is debated whether post-training alignment (RLHF or RLAIF) affects models' internal diversity.
We use a new way of measuring the conceptual diversity of synthetically-generated LLM "populations" by relating the internal variability of simulated individuals to the population-level variability.
arXiv Detail & Related papers (2024-11-07T04:38:58Z) - Using LLMs for Explaining Sets of Counterfactual Examples to Final Users [0.0]
In automated decision-making scenarios, causal inference methods can analyze the underlying data-generation process.
Counterfactual examples explore hypothetical scenarios where a minimal number of factors are altered.
We propose a novel multi-step pipeline that uses counterfactuals to generate natural language explanations of actions that will lead to a change in outcome.
arXiv Detail & Related papers (2024-08-27T15:13:06Z) - REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy [93.8400683020273]
Decoding methods for large language models (LLMs) usually struggle with the tradeoff between ensuring factuality and maintaining diversity.
We propose REAL sampling, a decoding method that improves factuality and diversity over nucleus sampling.
arXiv Detail & Related papers (2024-06-11T21:44:49Z) - ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation [29.718851249656172]
Large language models (LLMs) present an appealing alternative due to their potential for consistency, scalability, and cost-efficiency.
We present ACORN, a new dataset of 3,500 free-text explanations and aspect-wise quality ratings.
arXiv Detail & Related papers (2024-05-08T05:36:52Z) - Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data [102.16105233826917]
Learning from preference labels plays a crucial role in fine-tuning large language models.
There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and contrastive learning.
arXiv Detail & Related papers (2024-04-22T17:20:18Z) - Quantifying the Persona Effect in LLM Simulations [25.367927300697424]
Large language models (LLMs) have shown remarkable promise in simulating human language and behavior.
This study investigates how integrating persona variables-demographic, social, and behavioral factors-impacts LLMs' ability to simulate diverse perspectives.
We find that persona variables account for 10% of the variance in annotations in existing subjective NLP datasets.
arXiv Detail & Related papers (2024-02-16T16:35:35Z) - Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL [51.48239006107272]
In this paper, we discuss how to measure and improve the diversity of demonstrations for text-to-SQL research.
We propose fusing iteratively for demonstrations (Fused) to build a high-diversity demonstration pool.
Our method achieves an average improvement of 3.2% and 5.0% with and without human labeling on several mainstream datasets.
arXiv Detail & Related papers (2024-02-16T13:13:18Z) - Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts [95.09994361995389]
Relative Preference Optimization (RPO) is designed to discern between more and less preferred responses derived from both identical and related prompts.
RPO has demonstrated a superior ability to align large language models with user preferences and to improve their adaptability during the training process.
arXiv Detail & Related papers (2024-02-12T22:47:57Z) - Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation [6.273933281069326]
We investigate three text diversity incentive methods well established in crowdsourcing: taboo words, hints by previous outlier solutions, and chaining on previous outlier solutions.
We show that diversity is most increased by taboo words, but downstream model performance is highest with hints.
arXiv Detail & Related papers (2024-01-12T15:46:43Z) - Elastic Weight Removal for Faithful and Abstractive Dialogue Generation [61.40951756070646]
A dialogue system should generate responses that are faithful to the knowledge contained in relevant documents.
Instead, many models generate hallucinated responses that contradict that knowledge or contain unverifiable information.
We show that our method can be extended to simultaneously discourage hallucinations and extractive responses.
arXiv Detail & Related papers (2023-03-30T17:40:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.