An Empirical Study on Preference Tuning Generalization and Diversity Under Domain Shift
- URL: http://arxiv.org/abs/2601.05882v1
- Date: Fri, 09 Jan 2026 15:56:55 GMT
- Title: An Empirical Study on Preference Tuning Generalization and Diversity Under Domain Shift
- Authors: Constantinos Karouzos, Xingwei Tan, Nikolaos Aletras
- Abstract summary: Preference tuning aligns pretrained language models to human judgments of quality, helpfulness, or safety. Prior work has shown that preference-tuning degrades performance and reduces helpfulness when evaluated outside the training domain. We show that adaptation strategies based on pseudo-labeling can substantially reduce domain-shift degradation.
- Score: 28.406449942947315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Preference tuning aligns pretrained language models to human judgments of quality, helpfulness, or safety by optimizing over explicit preference signals rather than likelihood alone. Prior work has shown that preference-tuning degrades performance and reduces helpfulness when evaluated outside the training domain. However, the extent to which adaptation strategies mitigate this domain shift remains unexplored. We address this challenge by conducting a comprehensive and systematic study of alignment generalization under domain shift. We compare five popular alignment objectives and various adaptation strategies from source to target, including target-domain supervised fine-tuning and pseudo-labeling, across summarization and question-answering helpfulness tasks. Our findings reveal systematic differences in generalization across alignment objectives under domain shift. We show that adaptation strategies based on pseudo-labeling can substantially reduce domain-shift degradation.
Related papers
- Gradient-Guided Annealing for Domain Generalization [5.124256074746721]
Gradient-Guided Annealing (GGA) algorithm is proposed to improve domain generalization effectiveness. The efficacy of GGA is evaluated on five widely accepted and challenging image classification domain generalization benchmarks.
arXiv Detail & Related papers (2025-02-27T15:01:55Z)
- Randomized Adversarial Style Perturbations for Domain Generalization [49.888364462991234]
We propose a novel domain generalization technique, referred to as Randomized Adversarial Style Perturbation (RASP).
The proposed algorithm perturbs the style of a feature in an adversarial direction towards a randomly selected class, and makes the model learn against being misled by the unexpected styles observed in unseen target domains.
We evaluate the proposed algorithm via extensive experiments on various benchmarks and show that our approach improves domain generalization performance, especially in large-scale benchmarks.
arXiv Detail & Related papers (2023-04-04T17:07:06Z)
- Label Alignment Regularization for Distribution Shift [63.228879525056904]
Recent work has highlighted the label alignment property (LAP) in supervised learning, where the vector of all labels in the dataset is mostly in the span of the top few singular vectors of the data matrix.
We propose a regularization method for unsupervised domain adaptation that encourages alignment between the predictions in the target domain and its top singular vectors.
We report improved performance over domain adaptation baselines in well-known tasks such as MNIST-USPS domain adaptation and cross-lingual sentiment analysis.
arXiv Detail & Related papers (2022-11-27T22:54:48Z)
- A Closer Look at Smoothness in Domain Adversarial Training [37.205372217498656]
We analyze the effect of smoothness enhancing formulations on domain adversarial training.
We find that converging to smooth minima with respect to (w.r.t.) the task loss stabilizes adversarial training, leading to better performance on the target domain.
In contrast to task loss, our analysis shows that converging to smooth minima w.r.t. adversarial loss leads to sub-optimal generalization on the target domain.
arXiv Detail & Related papers (2022-06-16T14:31:38Z)
- Labeling Where Adapting Fails: Cross-Domain Semantic Segmentation with Point Supervision via Active Selection [81.703478548177]
Training models dedicated to semantic segmentation requires a large amount of pixel-wise annotated data.
Unsupervised domain adaptation approaches aim at aligning the feature distributions between the labeled source and the unlabeled target data.
Previous works attempted to include human interactions in this process in the form of sparse single-pixel annotations in the target data.
We propose a new domain adaptation framework for semantic segmentation with annotated points via active selection.
arXiv Detail & Related papers (2022-06-01T01:52:28Z)
- Variational Disentanglement for Domain Generalization [68.85458536180437]
We propose to tackle the problem of domain generalization by delivering an effective framework named Variational Disentanglement Network (VDN).
VDN is capable of disentangling the domain-specific features and task-specific features, where the task-specific features are expected to be better generalized to unseen but related test data.
arXiv Detail & Related papers (2021-09-13T09:55:32Z)
- Domain Adaptation for Semantic Segmentation via Patch-Wise Contrastive Learning [62.7588467386166]
We leverage contrastive learning to bridge the domain gap by aligning the features of structurally similar label patches across domains.
Our approach consistently outperforms state-of-the-art unsupervised and semi-supervised methods on two challenging domain adaptive segmentation tasks.
arXiv Detail & Related papers (2021-04-22T13:39:12Z)
- Gradient Regularized Contrastive Learning for Continual Domain Adaptation [86.02012896014095]
We study the problem of continual domain adaptation, where the model is presented with a labelled source domain and a sequence of unlabelled target domains.
We propose Gradient Regularized Contrastive Learning (GRCL) to solve the obstacles.
Experiments on Digits, DomainNet and Office-Caltech benchmarks demonstrate the strong performance of our approach.
arXiv Detail & Related papers (2021-03-23T04:10:42Z)
- Exploiting Diverse Characteristics and Adversarial Ambivalence for Domain Adaptive Segmentation [20.13548631627542]
Adapting semantic segmentation models to new domains is an important but challenging problem.
We propose a condition-guided adaptation framework that is empowered by a special progressive adversarial training mechanism and a novel self-training policy.
We evaluate our method on various adaptation scenarios where the target images vary in weather conditions.
arXiv Detail & Related papers (2020-12-10T11:50:59Z)
- Adversarial Weighting for Domain Adaptation in Regression [4.34858896385326]
We present a novel instance-based approach to handle regression tasks in the context of supervised domain adaptation.
We develop an adversarial network algorithm which learns both the source weighting scheme and the task in one feed-forward gradient descent.
arXiv Detail & Related papers (2020-06-15T09:44:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.