Related papers: Adaptive Sample Aggregation In Transfer Learning

Minimax optimal adaptive structured transfer learning through semi-parametric domain-varying coefficient model [9.091986429838117]
We study a multi-source, single-target transfer learning problem under conditional distributional drift. We develop an adaptive transfer learning estimator that selectively borrows strength from informative source domains.
arXiv Detail & Related papers (2026-02-20T03:53:06Z)
Transfer Learning Through Conditional Quantile Matching [3.86972243789112]
We introduce a transfer learning framework for regression that leverages heterogeneous source domains to improve predictive performance in a data-scarce target domain. Our approach learns a conditional generative model separately for each source domain and calibrates the generated responses to the target domain via conditional quantile matching.
arXiv Detail & Related papers (2026-02-02T17:19:55Z)
Whatever Remains Must Be True: Filtering Drives Reasoning in LLMs, Shaping Diversity [13.211627219720796]
Reinforcement Learning (RL) has become the de facto standard for tuning LLMs to solve tasks involving reasoning. We argue that RL implicitly optimizes the "mode-seeking" or "zero-forcing" Reverse KL to a target distribution, causing the model to concentrate mass on certain high-probability regions of the target while neglecting others. In this work, we instead begin from an explicit target distribution, obtained by filtering out incorrect answers while preserving the relative probabilities of correct ones.
arXiv Detail & Related papers (2025-12-05T18:56:40Z)
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources [113.33902847941941]
Variance-Aware Sampling (VAS) is a data selection strategy guided by Variance Promotion Score (VPS). We release large-scale, carefully curated resources containing 1.6M long CoT cold-start data and 15k RL QA pairs. Experiments across mathematical reasoning benchmarks demonstrate the effectiveness of both the curated data and the proposed VAS.
arXiv Detail & Related papers (2025-09-25T14:58:29Z)
Centralized Adaptive Sampling for Reliable Co-Training of Independent Multi-Agent Policies [3.5253513747455303]
Independent on-policy policy gradient algorithms are widely used for multi-agent reinforcement learning (MARL) in cooperative and no-conflict games. They are known to converge suboptimally when each agent's policy gradient points toward a suboptimal equilibrium. We introduce an adaptive action sampling approach to reduce joint sampling error.
arXiv Detail & Related papers (2025-08-01T20:07:25Z)
Statistical Inference for Conditional Group Distributionally Robust Optimization with Cross-Entropy Loss [9.054486124506521]
We study multi-source unsupervised domain adaptation, where labeled data are drawn from multiple source domains and only unlabeled data from a target domain. We propose a novel Conditional Group Distributionally Robust Optimization (CG-DRO) framework that learns a classifier by minimizing the worst-case cross-entropy loss over the convex combinations of the conditional outcome distributions from the sources. We establish fast statistical convergence rates for the estimator by constructing two surrogate minimax optimization problems that serve as theoretical bridges.
arXiv Detail & Related papers (2025-07-14T04:21:23Z)
Progressive Multi-Source Domain Adaptation for Personalized Facial Expression Recognition [51.61979855488214]
Personalized facial expression recognition (FER) involves adapting a machine learning model using samples from labeled sources and unlabeled target domains. We propose a progressive MSDA approach that gradually introduces information from subjects based on their similarity to the target subject. Our experiments show the effectiveness of our proposed method on pain datasets: Biovid and UNBC-McMaster.
arXiv Detail & Related papers (2025-04-05T19:14:51Z)
Reducing Spurious Correlation for Federated Domain Generalization [15.864230656989854]
In open-world scenarios, global models may struggle to predict well on entirely new domain data captured by certain media. Existing methods still rely on strong statistical correlations between samples and labels to address this issue. We introduce FedCD, an overall optimization framework at both the local and global levels.
arXiv Detail & Related papers (2024-07-27T05:06:31Z)
Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems [61.580419063416734]
A recent stream of structured learning approaches has improved the practical state of the art for a range of optimization problems. The key idea is to exploit the statistical distribution over instances instead of dealing with instances separately. In this article, we investigate methods that smooth the risk by perturbing the policy, which eases optimization and improves the generalization error.
arXiv Detail & Related papers (2024-07-24T12:00:30Z)
S$Ω$I: Score-based O-INFORMATION Estimation [7.399561232927219]
We introduce S$\Omega$I, which makes it possible, for the first time, to compute O-information without restrictive assumptions about the system. Our experiments validate our approach on synthetic data, and demonstrate the effectiveness of S$\Omega$I in the context of a real-world use case.
arXiv Detail & Related papers (2024-02-08T13:38:23Z)
Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions. We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
Deep Anti-Regularized Ensembles provide reliable out-of-distribution uncertainty quantification [4.750521042508541]
Deep ensembles often return overconfident estimates outside the training domain. We show that an ensemble of networks with large weights fitting the training data are likely to meet these two objectives. We derive a theoretical framework for this approach and show that the proposed optimization can be seen as a "water-filling" problem.
arXiv Detail & Related papers (2023-04-08T15:25:12Z)
On the Connection between $L_p$ and Risk Consistency and its Implications on Regularized Kernel Methods [0.0]
The first aim of this paper is to establish the close connection between risk consistency and $L_p$-consistency for a considerably wider class of loss functions. The attempt to transfer this connection to shifted loss functions surprisingly reveals that this shift does not reduce the assumptions needed on the underlying probability measure to the same extent as it does for many other results.
arXiv Detail & Related papers (2023-03-27T13:51:56Z)
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may instead stem from biases in data acquisition. We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training. We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift [127.21287240963859]
We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data. For a large class of linear regression instances, transfer learning with $O(N^2)$ source data is as effective as supervised learning with $N$ target data.
arXiv Detail & Related papers (2022-08-03T05:59:49Z)
On the Generalization for Transfer Learning: An Information-Theoretic Analysis [8.102199960821165]
We give an information-theoretic analysis of the generalization error and excess risk of transfer learning algorithms. Our results suggest, perhaps as expected, that the Kullback-Leibler divergence $D(\mu\|\mu')$ plays an important role in the characterizations. We then generalize the mutual information bound with other divergences such as $\phi$-divergence and Wasserstein distance.
arXiv Detail & Related papers (2022-07-12T08:20:41Z)
A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning [113.75991721607174]
We introduce an interventional prediction module to estimate the probability of two estimated $\hat{z}_i, \hat{z}_j$ belonging to the same environment. We empirically show that $\hat{Z}$ estimated by our method enjoys less redundant information than previous methods.
arXiv Detail & Related papers (2022-06-09T15:01:36Z)
Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result. Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
Linear Speedup in Personalized Collaborative Learning [69.45124829480106]
Personalization in federated learning can improve the accuracy of a model for a user by trading off the model's bias. We formalize the personalized collaborative learning problem as optimization of a user's objective. We explore conditions under which we can optimally trade-off their bias for a reduction in variance.
arXiv Detail & Related papers (2021-11-10T22:12:52Z)
Online Selective Classification with Limited Feedback [82.68009460301585]
We study selective classification in the online learning model, wherein a predictor may abstain from classifying an instance. Two salient aspects of the setting we consider are that the data may be non-realisable, due to which abstention may be a valid long-term action. We construct simple versioning-based schemes for any $\mu \in (0,1]$ that make at most $T^\mu$ mistakes while incurring $\tilde{O}(T^{1-\mu})$ excess abstention against adaptive adversaries.
arXiv Detail & Related papers (2021-10-27T08:00:53Z)
Learning Invariant Representation with Consistency and Diversity for Semi-supervised Source Hypothesis Transfer [46.68586555288172]
We propose a novel task named Semi-supervised Source Hypothesis Transfer (SSHT), which performs domain adaptation based on source trained model, to generalize well in target domain with a few supervisions. We propose Consistency and Diversity Learning (CDL), a simple but effective framework for SSHT by facilitating prediction consistency between two randomly augmented unlabeled data. Experimental results show that our method outperforms existing SSDA methods and unsupervised model adaptation methods on DomainNet, Office-Home and Office-31 datasets.
arXiv Detail & Related papers (2021-07-07T04:14:24Z)
KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications. A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain. We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
Adaptive transfer learning [6.574517227976925]
We introduce a flexible framework for transfer learning in the context of binary classification. We show that the optimal rate can be achieved by an algorithm that adapts to key aspects of the unknown transfer relationship.
arXiv Detail & Related papers (2021-06-08T15:39:43Z)
Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing distance between embeddings/representations of original samples and augmented counterparts is a popular technique for improving robustness of neural networks. In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings. We show that the generic approach we identified (squared $\ell_2$ regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
arXiv Detail & Related papers (2020-11-25T22:40:09Z)
Dimensionality reduction, regularization, and generalization in overparameterized regressions [8.615625517708324]
We show that PCA-OLS, also known as principal component regression, can be avoided with a dimensionality reduction. We show that dimensionality reduction improves robustness while OLS is arbitrarily susceptible to adversarial attacks. We find that methods in which the projection depends on the training data can outperform methods where the projections are chosen independently of the training data.
arXiv Detail & Related papers (2020-11-23T15:38:50Z)
Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA). We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
Self-adaptive Re-weighted Adversarial Domain Adaptation [12.73753413032972]
We present a self-adaptive re-weighted adversarial domain adaptation approach. It tries to enhance domain alignment from the perspective of conditional distribution. Empirical evidence demonstrates that the proposed model outperforms state-of-the-art methods on standard domain adaptation datasets.
arXiv Detail & Related papers (2020-05-30T08:35:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.