Related papers: Dataset Cartography for Large Language Model Alignment: Mapping and Diagnosing Preference Data

Dataset Cartography for Large Language Model Alignment: Mapping and Diagnosing Preference Data

URL: http://arxiv.org/abs/2505.23114v2
Date: Tue, 03 Jun 2025 05:21:36 GMT
Title: Dataset Cartography for Large Language Model Alignment: Mapping and Diagnosing Preference Data
Authors: Seohyeong Lee, Eunwon Kim, Hwaran Lee, Buru Chang,
Abstract summary: Human preference data plays a critical role in aligning large language models with human values.<n>We introduce Alignment Data Map, a GPT-4o-assisted tool for analyzing and diagnosing preference data.
Score: 12.113905795672899
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Human preference data plays a critical role in aligning large language models (LLMs) with human values. However, collecting such data is often expensive and inefficient, posing a significant scalability challenge. To address this, we introduce Alignment Data Map, a GPT-4o-assisted tool for analyzing and diagnosing preference data. Using GPT-4o as a proxy for LLM alignment, we compute alignment scores for LLM-generated responses to instructions from existing preference datasets. These scores are then used to construct an Alignment Data Map based on their mean and variance. Our experiments show that using only 33 percent of the data, specifically samples in the high-mean, low-variance region, achieves performance comparable to or better than using the entire dataset. This finding suggests that the Alignment Data Map can significantly improve data collection efficiency by identifying high-quality samples for LLM alignment without requiring explicit annotations. Moreover, the Alignment Data Map can diagnose existing preference datasets. Our analysis shows that it effectively detects low-impact or potentially misannotated samples. Source code is available online.

Related papers

Efficient Alignment of Large Language Models via Data Sampling [0.4915744683251149]
We propose an information theory-based methodology for efficient alignment by identifying a small high quality subset.<n>We find that the model aligned using our proposed methodology outperforms other sampling methods and performs comparable to the model aligned with the full dataset.
arXiv Detail & Related papers (2024-11-15T19:36:15Z)
Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning [56.795078085234195]
LLM pruning approaches universally rely on the C4 dataset as the calibration data for calculating pruning scores. In this study, we evaluate the choice of calibration data on LLM pruning, across a wide range of datasets. Our results also uncover several subtle and often unexpected findings.
arXiv Detail & Related papers (2024-10-09T22:00:19Z)
Improving Data Efficiency via Curating LLM-Driven Rating Systems [30.233724785974143]
We introduce DS2, a Diversity-aware Score curation method for Data Selection.<n>By systematically modeling error patterns through a score transition matrix, DS2 corrects LLM-based scores and promotes diversity in the selected data samples.<n>Our approach shows that a curated subset (just 3.3% of the original dataset) outperforms full-scale datasets (300k samples) across various machine-alignment benchmarks.
arXiv Detail & Related papers (2024-10-09T10:07:55Z)
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing [48.07915731998946]
We present a self-synthesis method for generating large-scale alignment data named Magpie. We use this method to prompt Llama-3-Instruct and generate 4 million instructions along with their corresponding responses. Our results indicate that in some tasks, models fine-tuned with Magpie perform comparably to the official Llama-3-8B-Instruct.
arXiv Detail & Related papers (2024-06-12T17:52:30Z)
Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment [72.99676237703099]
We propose a new framework that boosts the alignment of large language models with human preferences.<n>Our key idea is leveraging the human prior knowledge within the small (seed) data.<n>We introduce a noise-aware preference learning algorithm to mitigate the risk of low quality within generated preference data.
arXiv Detail & Related papers (2024-06-06T18:01:02Z)
LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection. We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks. Our method goes beyond surface form cues to identify data that the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z)
No More Distractions: an Adaptive Up-Sampling Algorithm to Reduce Data Artifacts [3.9777369380822956]
We analyzed SNLI data and visualized such spurious correlations. We proposed an adaptive up-sampling algorithm to correct the data artifacts. We did an experiment applying the algorithm to fix the data artifacts in SNLI data and the model trained with corrected data performed significantly better than the model trained with raw SNLI data.
arXiv Detail & Related papers (2024-01-25T02:54:53Z)
Data Selection for Language Models via Importance Resampling [90.9263039747723]
We formalize the problem of selecting a subset of a large raw unlabeled dataset to match a desired target distribution. We extend the classic importance resampling approach used in low-dimensions for LM data selection. We instantiate the DSIR framework with hashed n-gram features for efficiency, enabling the selection of 100M documents in 4.5 hours.
arXiv Detail & Related papers (2023-02-06T23:57:56Z)
LiDAR dataset distillation within bayesian active learning framework: Understanding the effect of data augmentation [63.20765930558542]
Active learning (AL) has re-gained attention recently to address reduction of annotation costs and dataset size. This paper performs a principled evaluation of AL based dataset distillation on (1/4th) of the large Semantic-KITTI dataset. We observe that data augmentation achieves full dataset accuracy using only 60% of samples from the selected dataset configuration.
arXiv Detail & Related papers (2022-02-06T00:04:21Z)
Training Dynamic based data filtering may not work for NLP datasets [0.0]
We study the applicability of the Area Under the Margin (AUM) metric to identify mislabelled examples in NLP datasets. We find that mislabelled samples can be filtered using the AUM metric in NLP datasets but it also removes a significant number of correctly labeled points.
arXiv Detail & Related papers (2021-09-19T18:50:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.