Related papers: Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment

Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment

URL: http://arxiv.org/abs/2403.11124v2
Date: Sat, 30 Mar 2024 16:48:16 GMT
Title: Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
Authors: Feifan Song, Bowen Yu, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang, Yongbin Li,
Abstract summary: Alignment with human preference prevents large language models from generating misleading or toxic content. We propose a new formulation of prompt diversity, implying a linear correlation with the final performance of LLMs after fine-tuning.
Score: 84.32768080422349
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Alignment with human preference prevents large language models (LLMs) from generating misleading or toxic content while requiring high-cost human feedback. Assuming resources of human annotation are limited, there are two different ways of allocating considered: more diverse PROMPTS or more diverse RESPONSES to be labeled. Nonetheless, a straightforward comparison between their impact is absent. In this work, we first control the diversity of both sides according to the number of samples for fine-tuning, which can directly reflect their influence. We find that instead of numerous prompts, more responses but fewer prompts better trigger LLMs for human alignment. Additionally, the concept of diversity for prompts can be more complex than responses that are typically quantified by single digits. Consequently, a new formulation of prompt diversity is proposed, further implying a linear correlation with the final performance of LLMs after fine-tuning. We also leverage it on data augmentation and conduct experiments to show its effect on different algorithms.

Related papers

Revisiting LLM Value Probing Strategies: Are They Robust and Expressive? [81.49470136653665]
We evaluate the robustness and expressiveness of value representations across three widely used probing strategies.<n>We show that the demographic context has little effect on the free-text generation, and the models' values only weakly correlate with their preference for value-based actions.
arXiv Detail & Related papers (2025-07-17T18:56:41Z)
Measuring diversity of synthetic prompts and data generated with fine-grained persona prompting [2.773884499834578]
We measure the diversity of persona-driven synthetically generated prompts and responses with a suite of lexical diversity and redundancy metrics.<n>We find that synthetic prompts are significantly less diverse than human-written ones.<n>While persona-prompting does improve lexical diversity (especially with larger models), fine-grained detail in personas doesn't increase diversity noticeably.
arXiv Detail & Related papers (2025-05-23T02:00:00Z)
Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications. Ensuring their alignment with the diverse preferences of individual users has become a critical challenge. We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning [23.456302461693053]
Possibility Exploration Fine-Tuning (PEFT) is a task-agnostic framework that enhances the text diversity of Large Language Models (LLMs) without increasing latency or computational cost. PEFT significantly enhances the diversity of LLM outputs, as evidenced by lower similarity between candidate responses. It can also notably reduce demographic bias in dialogue systems.
arXiv Detail & Related papers (2024-12-04T14:23:16Z)
One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity [2.5975241792179378]
Researchers have proposed using large language models (LLMs) as replacements for humans in behavioral research. It is debated whether post-training alignment (RLHF or RLAIF) affects models' internal diversity. We use a new way of measuring the conceptual diversity of synthetically-generated LLM "populations" by relating the internal variability of simulated individuals to the population-level variability.
arXiv Detail & Related papers (2024-11-07T04:38:58Z)
Using LLMs for Explaining Sets of Counterfactual Examples to Final Users [0.0]
In automated decision-making scenarios, causal inference methods can analyze the underlying data-generation process. Counterfactual examples explore hypothetical scenarios where a minimal number of factors are altered. We propose a novel multi-step pipeline that uses counterfactuals to generate natural language explanations of actions that will lead to a change in outcome.
arXiv Detail & Related papers (2024-08-27T15:13:06Z)
REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy [93.8400683020273]
Decoding methods for large language models (LLMs) usually struggle with the tradeoff between ensuring factuality and maintaining diversity. We propose REAL sampling, a decoding method that improved factuality and diversity over nucleus sampling.
arXiv Detail & Related papers (2024-06-11T21:44:49Z)
ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation [29.718851249656172]
Large language models (LLMs) present an appealing alternative due to their potential for consistency, scalability, and cost-efficiency. We present ACORN, a new dataset of 3,500 free-text explanations and aspect-wise quality ratings.
arXiv Detail & Related papers (2024-05-08T05:36:52Z)
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data [102.16105233826917]
Learning from preference labels plays a crucial role in fine-tuning large language models. There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and contrastive learning.
arXiv Detail & Related papers (2024-04-22T17:20:18Z)
Quantifying the Persona Effect in LLM Simulations [25.367927300697424]
Large language models (LLMs) have shown remarkable promise in simulating human language and behavior. This study investigates how integrating persona variables-demographic, social, and behavioral factors-impacts LLMs' ability to simulate diverse perspectives. We find that persona variables account for 10% variance in annotations in existing subjective NLP datasets.
arXiv Detail & Related papers (2024-02-16T16:35:35Z)
Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL [51.48239006107272]
In this paper, we discuss how to measure and improve the diversity of the demonstrations for text-to-diversity research. We propose fusing iteratively for demonstrations (Fused) to build a high-diversity demonstration pool. Our method achieves an average improvement of 3.2% and 5.0% with and without human labeling on several mainstream datasets.
arXiv Detail & Related papers (2024-02-16T13:13:18Z)
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts [95.09994361995389]
Relative Preference Optimization (RPO) is designed to discern between more and less preferred responses derived from both identical and related prompts. RPO has demonstrated a superior ability to align large language models with user preferences and to improve their adaptability during the training process.
arXiv Detail & Related papers (2024-02-12T22:47:57Z)
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation [6.273933281069326]
We investigate three text diversity incentive methods well established in crowdsourcing: taboo words, hints by previous outlier solutions, and chaining on previous outlier solutions. We show that diversity is most increased by taboo words, but downstream model performance is highest with hints.
arXiv Detail & Related papers (2024-01-12T15:46:43Z)
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation [61.40951756070646]
A dialogue system should generate responses that are faithful to the knowledge contained in relevant documents. Many models generate hallucinated responses instead that contradict it or contain unverifiable information. We show that our method can be extended to simultaneously discourage hallucinations and extractive responses.
arXiv Detail & Related papers (2023-03-30T17:40:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.