Related papers: Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples

URL: http://arxiv.org/abs/2502.09650v1
Date: Tue, 11 Feb 2025 17:01:11 GMT
Title: Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
Authors: Chengqian Gao, Haonan Li, Liu Liu, Zeke Xie, Peilin Zhao, Zhiqiang Xu
Abstract summary: We show that preference data vary in difficulty, and overly difficult examples hinder alignment. We introduce Selective DPO, which filters out overly difficult examples. This simple adjustment improves alignment performance by 9-16% in win rates on the AlpacaEval 2 benchmark.
Score: 38.79705507444374
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The alignment of large language models (LLMs) often assumes that using more clean data yields better outcomes, overlooking the match between model capacity and example difficulty. Challenging this, we propose a new principle: preference data vary in difficulty, and overly difficult examples hinder alignment by exceeding the model's capacity. Through systematic experimentation, we validate this principle with three key findings: (1) preference examples vary in difficulty, as evidenced by consistent learning orders across alignment runs; (2) overly difficult examples significantly degrade performance across four LLMs and two datasets; and (3) the capacity of a model dictates its threshold for handling difficult examples, underscoring a critical relationship between data selection and model capacity. Building on this principle, we introduce Selective DPO, which filters out overly difficult examples. This simple adjustment improves alignment performance by 9-16% in win rates on the AlpacaEval 2 benchmark compared to the DPO baseline, surpassing a series of DPO variants with different algorithmic adjustments. Together, these results illuminate the importance of aligning data difficulty with model capacity, offering a transformative perspective for improving alignment strategies in LLMs. Code is available at https://github.com/glorgao/SelectiveDPO.
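
The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how difficulty is measured or what fraction of the data is kept, so the `difficulty` field, the `select_easiest` helper, and the `keep_fraction` parameter are all hypothetical names chosen for illustration. The sketch only shows the general idea of ranking preference pairs by a precomputed difficulty score and dropping the hardest ones before preference training.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PreferenceExample:
    prompt: str
    chosen: str
    rejected: str
    # Hypothetical precomputed score; higher means harder.
    # How this score is obtained is an assumption left open here.
    difficulty: float


def select_easiest(examples: List[PreferenceExample],
                   keep_fraction: float) -> List[PreferenceExample]:
    """Keep the easiest `keep_fraction` of examples, ranked by difficulty."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Sort from easiest to hardest, then truncate the hard tail.
    ranked = sorted(examples, key=lambda ex: ex.difficulty)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Toy pool of four preference pairs with made-up difficulty scores.
pool = [
    PreferenceExample("p1", "good", "bad", difficulty=0.9),
    PreferenceExample("p2", "good", "bad", difficulty=0.1),
    PreferenceExample("p3", "good", "bad", difficulty=0.5),
    PreferenceExample("p4", "good", "bad", difficulty=0.3),
]
kept = select_easiest(pool, keep_fraction=0.5)
```

The retained subset would then be passed to an ordinary DPO training loop. Per the abstract's third finding, a suitable `keep_fraction` would depend on the capacity of the model being aligned.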

Related papers

SeMi: When Imbalanced Semi-Supervised Learning Meets Mining Hard Examples [54.760757107700755]
Semi-Supervised Learning (SSL) can leverage abundant unlabeled data to boost model performance. The class-imbalanced data distribution in real-world scenarios poses great challenges to SSL, resulting in performance degradation. We propose a method that enhances the performance of Imbalanced Semi-Supervised Learning by Mining Hard Examples (SeMi).
arXiv Detail & Related papers (2025-01-10T14:35:16Z)
On Sampling Strategies for Spectral Model Sharding [7.185534285278903]
In this work, we present two sampling strategies for such sharding. The first produces unbiased estimators of the original weights, while the second aims to minimize the squared approximation error. We demonstrate that both of these methods can lead to improved performance on various commonly used datasets.
arXiv Detail & Related papers (2024-10-31T16:37:25Z)
Improving Data Efficiency via Curating LLM-Driven Rating Systems [30.233724785974143]
We introduce DS2, a Diversity-aware Score curation method for Data Selection. By systematically modeling error patterns through a score transition matrix, DS2 corrects LLM-based scores and promotes diversity in the selected data samples. Our approach shows that a curated subset (just 3.3% of the original dataset) outperforms full-scale datasets (300k samples) across various machine-alignment benchmarks.
arXiv Detail & Related papers (2024-10-09T10:07:55Z)
TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights [73.9088920210495]
We propose a token-level importance sampling DPO objective named TIS-DPO that assigns importance weights to each token based on its reward. TIS-DPO significantly outperforms various baseline methods on harmlessness and helpfulness alignment and summarization tasks.
arXiv Detail & Related papers (2024-10-06T04:03:00Z)
Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios. We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples. Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
In-Context Example Ordering Guided by Label Distributions [34.30216341226014]
We formulate in-context example ordering as an optimization problem. Inspired by the idea of learning from label proportions, we propose two principles for in-context example ordering guided by the model's probability predictions. We demonstrate that our approach outperforms the baselines by improving the classification accuracy, reducing model miscalibration, and selecting better in-context examples.
arXiv Detail & Related papers (2024-02-18T04:08:10Z)
D2 Pruning: Message Passing for Balancing Diversity and Difficulty in Data Pruning [70.98091101459421]
Coreset selection seeks to select a subset of the training data, referred to as a coreset, so as to maximize the performance of models trained on it. We propose a novel pruning algorithm, D2 Pruning, that uses forward and reverse message passing over this dataset graph for coreset selection. Results show that D2 Pruning improves coreset selection over previous state-of-the-art methods for up to 70% pruning rates.
arXiv Detail & Related papers (2023-10-11T23:01:29Z)
Simplicity Bias Leads to Amplified Performance Disparities [8.60453031364566]
We show that SGD-trained models have a bias towards simplicity, leading them to prioritize learning a majority class. A model may prioritize any class or group of the dataset that it finds simple, at the expense of what it finds complex.
arXiv Detail & Related papers (2022-12-13T15:24:41Z)
Robust Optimal Transport with Applications in Generative Modeling and Domain Adaptation [120.69747175899421]
Optimal Transport (OT) distances such as Wasserstein have been used in several areas such as GANs and domain adaptation. We propose a computationally-efficient dual form of the robust OT optimization that is amenable to modern deep learning applications. Our approach can train state-of-the-art GAN models on noisy datasets corrupted with outlier distributions.
arXiv Detail & Related papers (2020-10-12T17:13:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.