InstructDiff: Domain-Adaptive Data Selection via Differential Entropy for Efficient LLM Fine-Tuning
- URL: http://arxiv.org/abs/2601.23006v1
- Date: Fri, 30 Jan 2026 14:15:44 GMT
- Title: InstructDiff: Domain-Adaptive Data Selection via Differential Entropy for Efficient LLM Fine-Tuning
- Authors: Junyou Su, He Zhu, Xiao Luo, Liyu Zhang, Hong-Yu Zhou, Yun Chen, Peng Li, Yang Liu, Guanhua Chen
- Abstract summary: InstructDiff is a unified framework that operationalizes differential entropy as a domain-adaptive selection criterion. We show that InstructDiff achieves a 17% relative improvement over full-data training on mathematical reasoning and 52% for general instruction-following.
- Score: 35.89674702985539
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised fine-tuning (SFT) is fundamental to adapting large language models, yet training on complete datasets incurs prohibitive costs with diminishing returns. Existing data selection methods suffer from severe domain specificity: techniques optimized for general instruction-following fail on reasoning tasks, and vice versa. We observe that measuring entropy differences between base models and minimally instruction-tuned calibrated models reveals a pattern -- samples with the lowest differential entropy consistently yield optimal performance across domains, yet this principle manifests domain-adaptively: reasoning tasks favor entropy increase (cognitive expansion), while general tasks favor entropy decrease (cognitive compression). We introduce InstructDiff, a unified framework that operationalizes differential entropy as a domain-adaptive selection criterion through warmup calibration, bi-directional NLL filtering, and entropy-based ranking. Extensive experiments show that InstructDiff achieves a 17% relative improvement over full data training on mathematical reasoning and 52% for general instruction-following, outperforming prior baselines while using only 10% of the data.
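The abstract does not include code, but the selection criterion it describes can be illustrated with a short sketch. The snippet below is a minimal, hypothetical reading of differential-entropy ranking: it assumes per-token logits are available for each candidate sample from both the base model and the warmup-calibrated model, averages token entropy over the response, ranks samples by the entropy difference, and keeps a small budget. The sign convention (calibrated minus base), the response mask, and the 10% budget are assumptions, not the authors' specification.

```python
# A minimal sketch (not the authors' code) of differential-entropy-based data
# selection, assuming per-token logits are available for each candidate sample
# from both a base model and a warmup-calibrated model.
import torch


def mean_token_entropy(logits: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Average predictive entropy over the response tokens of one sample.

    logits: [seq_len, vocab] raw scores; mask: [seq_len], 1 for response tokens.
    """
    log_probs = torch.log_softmax(logits, dim=-1)
    token_entropy = -(log_probs.exp() * log_probs).sum(dim=-1)   # [seq_len]
    return (token_entropy * mask).sum() / mask.sum().clamp(min=1)


def select_by_differential_entropy(base_logits, calib_logits, masks,
                                   budget: float = 0.10,
                                   prefer_increase: bool = True):
    """Rank samples by H_calibrated - H_base and keep a small budget.

    prefer_increase=True keeps the samples whose entropy grew the most under
    calibration (the 'cognitive expansion' regime the abstract associates with
    reasoning data); False keeps the largest decreases ('compression').
    """
    diffs = torch.stack([
        mean_token_entropy(c, m) - mean_token_entropy(b, m)
        for b, c, m in zip(base_logits, calib_logits, masks)
    ])
    k = max(1, int(budget * len(diffs)))
    order = torch.argsort(diffs, descending=prefer_increase)
    return order[:k].tolist(), diffs


if __name__ == "__main__":
    torch.manual_seed(0)
    n, seq_len, vocab = 50, 32, 100
    base = [torch.randn(seq_len, vocab) for _ in range(n)]    # stand-in logits
    calib = [torch.randn(seq_len, vocab) for _ in range(n)]
    masks = [torch.ones(seq_len) for _ in range(n)]
    keep, _ = select_by_differential_entropy(base, calib, masks, budget=0.1)
    print("selected sample indices:", keep)
```

In a real pipeline, the base and calibrated logits would come from two forward passes over the same candidate pool, before and after the brief warmup calibration described in the abstract.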
Related papers
- Improving Domain Generalization in Contrastive Learning using Adaptive Temperature Control [6.29137812995328]
Self-supervised pre-training with contrastive learning is a powerful method for learning from sparsely labeled data. Performance can drop considerably when there is a shift in the distribution of data from training to test time. We present a new method for contrastive learning that incorporates domain labels to increase the domain invariance of learned representations.
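As a loose illustration of the idea above (not the cited paper's exact formulation), the sketch below conditions the InfoNCE temperature on domain labels: same-domain and cross-domain pairs are scored with different assumed temperatures `tau_same` and `tau_cross`.

```python
# Speculative sketch: domain-aware temperature control in an InfoNCE-style loss.
# The exact rule in the cited paper may differ; this only illustrates the idea
# of conditioning the temperature on domain labels.
import torch
import torch.nn.functional as F


def domain_adaptive_info_nce(z1, z2, domains, tau_same=0.1, tau_cross=0.5):
    """z1, z2: [n, d] embeddings of two views; domains: [n] integer labels."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t()                                    # [n, n] cosine similarities
    same = domains.unsqueeze(0) == domains.unsqueeze(1)  # [n, n] same-domain mask
    tau = torch.where(same, torch.full_like(sim, tau_same),
                      torch.full_like(sim, tau_cross))
    logits = sim / tau
    targets = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    torch.manual_seed(0)
    z1, z2 = torch.randn(8, 16), torch.randn(8, 16)
    domains = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2])
    print(domain_adaptive_info_nce(z1, z2, domains).item())
```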
arXiv Detail & Related papers (2026-01-12T17:32:24Z) - DONOD: Efficient and Generalizable Instruction Fine-Tuning for LLMs via Model-Intrinsic Dataset Pruning [22.704995231753397]
Ad-hoc instruction fine-tuning of large language models (LLMs) is widely adopted for domain-specific adaptation. We propose DONOD, a lightweight model-intrinsic data pruning method. By filtering out 70% of the whole dataset, we improve target-domain accuracy by 14.90% and cross-domain accuracy by 5.67%.
arXiv Detail & Related papers (2025-04-21T02:25:03Z) - Feasible Learning [78.6167929413604]
We introduce Feasible Learning (FL), a sample-centric learning paradigm where models are trained by solving a feasibility problem that bounds the loss for each training sample. Our empirical analysis, spanning image classification, age regression, and preference optimization in large language models, demonstrates that models trained via FL can learn from data while displaying improved tail behavior compared to ERM, with only a marginal impact on average performance.
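One way to read the feasibility formulation above is as a per-sample constrained problem, loss_i(theta) <= epsilon, handled with one Lagrange multiplier per sample. The sketch below is that generic primal-dual treatment under assumed hyperparameters; it is not necessarily the cited paper's algorithm.

```python
# Generic primal-dual sketch of per-sample loss bounds (assumptions noted):
# the primal step minimizes lambda_i-weighted constraint violations, the dual
# step ascends each lambda_i on its violation loss_i - epsilon.
import torch
from torch import nn


def feasible_learning_step(model, x, y, lambdas, epsilon, opt, dual_lr=0.1):
    per_sample = nn.functional.cross_entropy(model(x), y, reduction="none")
    violation = per_sample - epsilon                 # >0 means the bound is violated
    primal = (lambdas.detach() * violation).mean()   # weighted constraint penalty
    opt.zero_grad()
    primal.backward()
    opt.step()
    with torch.no_grad():                            # dual ascent, keep lambda >= 0
        lambdas.add_(dual_lr * violation).clamp_(min=0.0)
    return per_sample.detach()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Linear(10, 3)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(16, 10), torch.randint(0, 3, (16,))
    lambdas = torch.ones(16)                         # one multiplier per sample
    for _ in range(5):
        losses = feasible_learning_step(model, x, y, lambdas, 0.5, opt)
    print("max per-sample loss:", losses.max().item())
```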
arXiv Detail & Related papers (2025-01-24T20:39:38Z) - APAR: Modeling Irregular Target Functions in Tabular Regression via Arithmetic-Aware Pre-Training and Adaptive-Regularized Fine-Tuning [12.35924469567586]
We propose a novel Arithmetic-Aware Pre-training and Adaptive-Regularized Fine-tuning framework (APAR). In the pre-training phase, APAR introduces an arithmetic-aware pretext objective to capture intricate sample-wise relationships from the perspective of continuous labels. In the fine-tuning phase, a consistency-based adaptive regularization technique is proposed to self-learn appropriate data augmentation.
arXiv Detail & Related papers (2024-12-14T19:33:21Z) - IT$^3$: Idempotent Test-Time Training [95.78053599609044]
Deep learning models often struggle when deployed in real-world settings due to distribution shifts between training and test data. We present Idempotent Test-Time Training (IT$^3$), a novel approach that enables on-the-fly adaptation to distribution shifts using only the current test instance. Our results suggest that idempotence provides a universal principle for test-time adaptation that generalizes across domains and architectures.
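A speculative sketch of idempotent test-time adaptation follows. It assumes a model of the form f(x, y_hat) that refines a current guess, and adapts on the single test instance so that applying f twice changes nothing; the zero placeholder guess, step count, and loss below are assumptions rather than the paper's specification.

```python
# Loose sketch: adapt on one test instance so f(x, f(x, 0)) ~= f(x, 0).
import torch
from torch import nn


class GuessRefiner(nn.Module):
    """Toy f(x, y_hat): refines a guess y_hat for input x."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in + d_out, 64), nn.ReLU(),
                                 nn.Linear(64, d_out))

    def forward(self, x, y_hat):
        return self.net(torch.cat([x, y_hat], dim=-1))


def idempotent_test_time_adapt(model, x, d_out, steps=3, lr=1e-3):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        y0 = model(x, torch.zeros(x.size(0), d_out))  # first pass from a blank guess
        y1 = model(x, y0)                             # second pass from the first output
        loss = (y1 - y0.detach()).pow(2).mean()       # push toward idempotence
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return model(x, torch.zeros(x.size(0), d_out))


if __name__ == "__main__":
    torch.manual_seed(0)
    model = GuessRefiner(d_in=8, d_out=2)
    x = torch.randn(1, 8)                             # a single test instance
    print(idempotent_test_time_adapt(model, x, d_out=2))
```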
arXiv Detail & Related papers (2024-10-05T15:39:51Z) - Gradient Guidance for Diffusion Models: An Optimization Perspective [45.6080199096424]
This paper studies a form of gradient guidance for adapting a pre-trained diffusion model towards optimizing user-specified objectives.
We establish a mathematical framework for guided diffusion to systematically study its optimization theory and algorithmic design.
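The snippet below is a classifier-guidance-style sketch of the idea: each reverse step first denoises, then follows the gradient of a user-specified objective. The paper analyzes the optimization behavior of such guidance; the exact gradient construction it studies may differ from this plain version, and the toy denoiser and objective here are placeholders.

```python
# Guidance-style sketch: nudge each reverse-diffusion step with the gradient of
# a user objective. Denoiser, objective, and scales below are stand-ins.
import torch


def guided_denoise_step(x_t, denoiser, objective, t, step_size=0.05, guidance=1.0):
    """One illustrative reverse step: denoise, then follow the objective's gradient."""
    with torch.no_grad():
        x_next = denoiser(x_t, t)                    # unguided denoised estimate
    x_next = x_next.detach().requires_grad_(True)
    obj = objective(x_next).sum()
    grad = torch.autograd.grad(obj, x_next)[0]       # gradient of the user objective
    return (x_next + guidance * step_size * grad).detach()


if __name__ == "__main__":
    torch.manual_seed(0)
    denoiser = lambda x, t: 0.9 * x                  # stand-in pre-trained denoiser
    objective = lambda x: -(x - 2.0).pow(2).sum(-1)  # prefer samples near 2.0
    x = torch.randn(4, 8)
    for t in reversed(range(10)):
        x = guided_denoise_step(x, denoiser, objective, t)
    print("mean of guided samples:", x.mean().item())
```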
arXiv Detail & Related papers (2024-04-23T04:51:02Z) - Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z) - Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
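A rough sketch of what gradient-based sparse fine-tuning can look like in practice: after backpropagation, keep only the largest-magnitude fraction of each gradient tensor and zero the rest, so only a sparse increment of the pre-trained weights is updated. The keep ratio and per-tensor masking below are assumptions about SIFT's details, not its published recipe.

```python
# Assumed sketch of sparse gradient masking before the optimizer step.
import torch
from torch import nn


def sparsify_gradients(model: nn.Module, keep_ratio: float = 0.05) -> None:
    """Zero all but the top keep_ratio fraction of entries in each gradient."""
    for p in model.parameters():
        if p.grad is None or p.grad.numel() < 2:
            continue
        g = p.grad.abs().flatten()
        k = max(1, int(keep_ratio * g.numel()))
        threshold = torch.topk(g, k).values.min()
        p.grad.mul_((p.grad.abs() >= threshold).to(p.grad.dtype))


if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Linear(100, 10)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(32, 100), torch.randint(0, 10, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    sparsify_gradients(model, keep_ratio=0.05)       # mask before updating
    opt.step()
    nonzero = sum((p.grad != 0).sum().item() for p in model.parameters())
    print("nonzero gradient entries after masking:", nonzero)
```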
arXiv Detail & Related papers (2023-12-19T06:06:30Z) - Domain Generalization by Rejecting Extreme Augmentations [24.321332981669297]
We show that for out-of-domain and domain generalization settings, data augmentation can provide a conspicuous and robust improvement in performance. We propose a simple training procedure: (i) use uniform sampling on standard data augmentation transformations; (ii) increase the strength of transformations to account for the higher data variance expected when working out-of-domain; and (iii) devise a new reward function to reject extreme transformations that can harm training.
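The three steps above can be illustrated with a small sketch. The rejection rule used here (drop an augmented view whose loss exceeds a multiple of the clean-sample loss) is an assumed stand-in for the paper's reward function.

```python
# Illustrative sketch of steps (i)-(iii): sample strong augmentations uniformly
# and reject the extreme ones before they enter the training batch.
import random
import torch
from torch import nn


def filtered_augmented_batch(model, x, y, augmentations, reject_factor=3.0):
    """Apply one uniformly sampled augmentation per sample; keep non-extreme ones."""
    loss_fn = nn.functional.cross_entropy
    with torch.no_grad():
        clean_loss = loss_fn(model(x), y, reduction="none")   # per-sample reference
    kept_x, kept_y = [], []
    for i in range(x.size(0)):
        aug = random.choice(augmentations)                     # (i) uniform sampling
        xi = aug(x[i: i + 1])                                  # (ii) strong transform
        with torch.no_grad():
            li = loss_fn(model(xi), y[i: i + 1])
        if li <= reject_factor * clean_loss[i]:                # (iii) reject extremes
            kept_x.append(xi)
            kept_y.append(y[i: i + 1])
    if not kept_x:
        return x, y                                            # fall back to clean batch
    return torch.cat(kept_x), torch.cat(kept_y)


if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Linear(16, 4)
    x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
    augs = [lambda t: t + 0.1 * torch.randn_like(t),
            lambda t: t * 5.0]                                 # deliberately extreme
    bx, by = filtered_augmented_batch(model, x, y, augs)
    print("kept", bx.size(0), "of", x.size(0), "augmented samples")
```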
arXiv Detail & Related papers (2023-10-10T14:46:22Z) - Fine-grained Retrieval Prompt Tuning [149.9071858259279]
Fine-grained Retrieval Prompt Tuning steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompt and feature adaptation.
Our FRPT, with fewer learnable parameters, achieves state-of-the-art performance on three widely used fine-grained datasets.
arXiv Detail & Related papers (2022-07-29T04:10:04Z) - Test-time Batch Statistics Calibration for Covariate Shift [66.7044675981449]
We propose to adapt deep models to the novel environment during inference.
We present a general formulation, $\alpha$-BN, to calibrate the batch statistics.
We also present a novel loss function to form a unified test-time adaptation framework, Core.
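A minimal sketch of the $\alpha$-BN idea: at inference, normalize with a convex mix of the stored source statistics and the current test batch's statistics. The mixing weight `alpha` and the layer wrapping below are assumptions; the cited paper's exact formulation and the Core loss are not reproduced here.

```python
# Assumed sketch: mix source (running) and test-batch BatchNorm statistics.
import torch
from torch import nn


class AlphaBatchNorm2d(nn.Module):
    """Wraps a trained BatchNorm2d and mixes source/test statistics at inference."""
    def __init__(self, bn: nn.BatchNorm2d, alpha: float = 0.1):
        super().__init__()
        self.bn, self.alpha = bn, alpha

    def forward(self, x):
        mu_t = x.mean(dim=(0, 2, 3))                      # test-batch mean per channel
        var_t = x.var(dim=(0, 2, 3), unbiased=False)      # test-batch variance
        mu = self.alpha * mu_t + (1 - self.alpha) * self.bn.running_mean
        var = self.alpha * var_t + (1 - self.alpha) * self.bn.running_var
        x_hat = (x - mu[None, :, None, None]) / torch.sqrt(
            var[None, :, None, None] + self.bn.eps)
        return (self.bn.weight[None, :, None, None] * x_hat
                + self.bn.bias[None, :, None, None])


if __name__ == "__main__":
    torch.manual_seed(0)
    bn = nn.BatchNorm2d(3)
    bn.eval()
    calibrated = AlphaBatchNorm2d(bn, alpha=0.1)
    x = torch.randn(4, 3, 8, 8) + 2.0                     # shifted test batch
    print("calibrated output mean:", calibrated(x).mean().item())
```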
arXiv Detail & Related papers (2021-10-06T08:45:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.