Related papers: Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs

Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs

URL: http://arxiv.org/abs/2507.21482v1
Date: Tue, 29 Jul 2025 03:51:00 GMT
Title: Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs
Authors: Abhinav Arabelly, Jagrut Nemade, Robert D Nowak, Jifan Zhang,
Abstract summary: Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse domains, but developing high-performing models for specialized applications often requires substantial human annotation.<n>We address the label-efficient learning problem for supervised finetuning (SFT) by leveraging task-diversity as a fundamental principle for effective data selection.<n>Our approach is based on two key observations: 1) task labels for different prompts are often readily available; 2) pre-trained models have significantly varying levels of confidence across tasks.
Score: 14.531280062127442
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse domains, but developing high-performing models for specialized applications often requires substantial human annotation -- a process that is time-consuming, labor-intensive, and expensive. In this paper, we address the label-efficient learning problem for supervised finetuning (SFT) by leveraging task-diversity as a fundamental principle for effective data selection. This is markedly different from existing methods based on the prompt-diversity. Our approach is based on two key observations: 1) task labels for different prompts are often readily available; 2) pre-trained models have significantly varying levels of confidence across tasks. We combine these facts to devise a simple yet effective sampling strategy: we select examples across tasks using an inverse confidence weighting strategy. This produces models comparable to or better than those trained with more complex sampling procedures, while being significantly easier to implement and less computationally intensive. Notably, our experimental results demonstrate that this method can achieve better accuracy than training on the complete dataset (a 4\% increase in MMLU score). Across various annotation budgets and two instruction finetuning datasets, our algorithm consistently performs at or above the level of the best existing methods, while reducing annotation costs by up to 80\%.

Related papers

Fine-Tuned In-Context Learners for Efficient Adaptation [27.541207591799004]
We propose a unified approach to fine-tuning large language models (LLMs)<n>We fine-tune the model on task-specific data augmented with in-context examples, mimicking the structure of k-shot prompts.<n>This approach, while requiring per-task fine-tuning, combines the sample efficiency of in-context learning with the performance gains of fine-tuning.
arXiv Detail & Related papers (2025-12-22T21:12:02Z)
SPARE: Single-Pass Annotation with Reference-Guided Evaluation for Automatic Process Supervision and Reward Modelling [58.05959902776133]
We introduce Single-Pass.<n>with Reference-Guided Evaluation (SPARE), a novel structured framework that enables efficient per-step annotation.<n>We demonstrate SPARE's effectiveness across four diverse datasets spanning mathematical reasoning (GSM8K, MATH), multi-hop question answering (MuSiQue-Ans), and spatial reasoning (SpaRP)<n>On ProcessBench, SPARE demonstrates data-efficient out-of-distribution generalization, using only $sim$16% of training samples compared to human-labeled and other synthetically trained baselines.
arXiv Detail & Related papers (2025-06-18T14:37:59Z)
Learning What Matters: Prioritized Concept Learning via Relative Error-driven Sample Selection [38.35524024887503]
We propose PRioritized cOncept learninG via Relative Error-driven Sample Selection (PROGRESS)<n>PROGRESS is a data- and compute-efficient framework that enables vision-language models to dynamically select what to learn next.<n>We show that PROGRESS consistently outperforms state-of-the-art baselines with much less data and supervision.
arXiv Detail & Related papers (2025-06-01T17:05:35Z)
Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning [71.2981957820888]
We propose a novel Star-Agents framework, which automates the enhancement of data quality across datasets. The framework initially generates diverse instruction data with multiple LLM agents through a bespoke sampling method. The generated data undergo a rigorous evaluation using a dual-model method that assesses both difficulty and quality.
arXiv Detail & Related papers (2024-11-21T02:30:53Z)
Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios. We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples. Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities. Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool. We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
One-Shot Learning as Instruction Data Prospector for Large Language Models [108.81681547472138]
textscNuggets uses one-shot learning to select high-quality instruction data from extensive datasets. We show that instruction tuning with the top 1% of examples curated by textscNuggets substantially outperforms conventional methods employing the entire dataset.
arXiv Detail & Related papers (2023-12-16T03:33:12Z)
Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data. For the first time, we reveal two major challenges hindering their practical deployments: Task-Distribution Shift ( TDS) and Task-Distribution Corruption (TDC)
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
STG-MTL: Scalable Task Grouping for Multi-Task Learning Using Data Map [4.263847576433289]
Multi-Task Learning (MTL) is a powerful technique that has gained popularity due to its performance improvement over traditional Single-Task Learning (STL) However, MTL is often challenging because there is an exponential number of possible task groupings. We propose a new data-driven method that addresses these challenges and provides a scalable and modular solution for classification task grouping.
arXiv Detail & Related papers (2023-07-07T03:54:26Z)
OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs. Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
Batch Active Learning at Scale [39.26441165274027]
Batch active learning, which adaptively issues batched queries to a labeling oracle, is a common approach for addressing this problem. In this work, we analyze an efficient active learning algorithm, which focuses on the large batch setting. We show that our sampling method, which combines notions of uncertainty and diversity, easily scales to batch sizes (100K-1M) several orders of magnitude larger than used in previous studies.
arXiv Detail & Related papers (2021-07-29T18:14:05Z)
Lifelong Learning Without a Task Oracle [13.331659934508764]
Supervised deep neural networks are known to undergo a sharp decline in the accuracy of older tasks when new tasks are learned. We propose and compare several candidate task-assigning mappers which require very little memory overhead. Best-performing variants only impose an average cost of 1.7% parameter memory increase.
arXiv Detail & Related papers (2020-11-09T21:30:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.