Related papers: Sensitivity of Stability: Theoretical & Empirical Analysis of Replicability for Adaptive Data Selection in Transfer Learning

Sensitivity of Stability: Theoretical & Empirical Analysis of Replicability for Adaptive Data Selection in Transfer Learning

URL: http://arxiv.org/abs/2508.04901v1
Date: Wed, 06 Aug 2025 21:56:56 GMT
Title: Sensitivity of Stability: Theoretical & Empirical Analysis of Replicability for Adaptive Data Selection in Transfer Learning
Authors: Prabhav Singh, Jessica Sorrell,
Abstract summary: We introduce a mathematical framework that quantifies the fundamental trade-off between adaptation effectiveness and result consistency.<n>We show that highly adaptive strategies like gradient-based and curriculum learning achieve superior task performance but suffer from high replicability failure rates.<n>We also show that source domain pretraining provides a powerful mitigation mechanism, reducing failure rates by up to 30% while preserving performance gains.
Score: 2.5526642520700946
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The widespread adoption of transfer learning has revolutionized machine learning by enabling efficient adaptation of pre-trained models to new domains. However, the reliability of these adaptations remains poorly understood, particularly when using adaptive data selection strategies that dynamically prioritize training examples. We present a comprehensive theoretical and empirical analysis of replicability in transfer learning, introducing a mathematical framework that quantifies the fundamental trade-off between adaptation effectiveness and result consistency. Our key contribution is the formalization of selection sensitivity ($\Delta_Q$), a measure that captures how adaptive selection strategies respond to perturbations in training data. We prove that replicability failure probability: the likelihood that two independent training runs produce models differing in performance by more than a threshold, increases quadratically with selection sensitivity while decreasing exponentially with sample size. Through extensive experiments on the MultiNLI corpus using six adaptive selection strategies - ranging from uniform sampling to gradient-based selection - we demonstrate that this theoretical relationship holds precisely in practice. Our results reveal that highly adaptive strategies like gradient-based and curriculum learning achieve superior task performance but suffer from high replicability failure rates, while less adaptive approaches maintain failure rates below 7%. Crucially, we show that source domain pretraining provides a powerful mitigation mechanism, reducing failure rates by up to 30% while preserving performance gains. These findings establish principled guidelines for practitioners to navigate the performance-replicability trade-off and highlight the need for replicability-aware design in modern transfer learning systems.

Related papers

AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining [12.630306478872043]
We propose textbfAdaLRS, a plug-in-and-play adaptive learning rate search algorithm that conducts online optimal learning rate search.<n>Experiments show that AdaLRS adjusts suboptimal learning rates to the neighborhood of optimum with marked efficiency and effectiveness.
arXiv Detail & Related papers (2025-06-16T09:14:01Z)
Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining [55.262510814326035]
Existing reweighting strategies primarily focus on group-level data importance.<n>We introduce novel algorithms for dynamic, instance-level data reweighting.<n>Our framework allows us to devise reweighting strategies deprioritizing redundant or uninformative data.
arXiv Detail & Related papers (2025-02-10T17:57:15Z)
Model-Robust and Adaptive-Optimal Transfer Learning for Tackling Concept Shifts in Nonparametric Regression [7.243632426715939]
We present a transfer learning procedure that is robust against model misspecification while adaptively attaining optimality.<n>We derive the adaptive convergence rates of the excess risk for specifying Gaussian kernels in a prevalent class of hypothesis transfer learning algorithms.
arXiv Detail & Related papers (2025-01-18T20:33:37Z)
Deep Transfer $Q$-Learning for Offline Non-Stationary Reinforcement Learning [3.2839905453386162]
This paper pioneers the study of transfer learning for dynamic decision scenarios modeled by non-stationary finite-horizon Markov decision processes.<n>We introduce a novel re-weighted targeting procedure'' to construct transferable RL samples'' and propose transfer deep $Q*$-learning''<n>Our analytical techniques for transfer learning in neural network approximation and transition density transfers have broader implications.
arXiv Detail & Related papers (2025-01-08T23:03:18Z)
Super Level Sets and Exponential Decay: A Synergistic Approach to Stable Neural Network Training [0.0]
We develop a dynamic learning rate algorithm that integrates exponential decay and advanced anti-overfitting strategies. We prove that the superlevel sets of the loss function, as influenced by our adaptive learning rate, are always connected.
arXiv Detail & Related papers (2024-09-25T09:27:17Z)
Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents.<n>Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks. Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
Estimation and inference for transfer learning with high-dimensional quantile regression [3.4510296013600374]
We propose a transfer learning procedure in the framework of high-dimensional quantile regression models. We establish error bounds of transfer learning estimator based on delicately selected transferable source domains. By adopting data-splitting technique, we advocate a transferability detection approach that guarantees to circumvent negative transfer.
arXiv Detail & Related papers (2022-11-26T14:40:19Z)
Latent-Optimized Adversarial Neural Transfer for Sarcasm Detection [50.29565896287595]
We apply transfer learning to exploit common datasets for sarcasm detection. We propose a generalized latent optimization strategy that allows different losses to accommodate each other. In particular, we achieve 10.02% absolute performance gain over the previous state of the art on the iSarcasm dataset.
arXiv Detail & Related papers (2021-04-19T13:07:52Z)
ReMP: Rectified Metric Propagation for Few-Shot Learning [67.96021109377809]
A rectified metric space is learned to maintain the metric consistency from training to testing. Numerous analyses indicate that a simple modification of the objective can yield substantial performance gains. The proposed ReMP is effective and efficient, and outperforms the state of the arts on various standard few-shot learning datasets.
arXiv Detail & Related papers (2020-12-02T00:07:53Z)
Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task. We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.