Mixture Weight Estimation and Model Prediction in Multi-source Multi-target Domain Adaptation
- URL: http://arxiv.org/abs/2309.10736v2
- Date: Sun, 12 Nov 2023 17:24:25 GMT
- Title: Mixture Weight Estimation and Model Prediction in Multi-source Multi-target Domain Adaptation
- Authors: Yuyang Deng, Ilja Kuzborskij, Mehrdad Mahdavi
- Abstract summary: We consider the problem of learning a model from multiple heterogeneous sources.
The goal of the learner is to mix these data sources in a target-distribution-aware way.
- Score: 22.933419188759707
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of learning a model from multiple heterogeneous sources with the goal of performing well on a new target distribution. The learner's goal is to mix these data sources in a target-distribution-aware way and simultaneously minimize the empirical risk on the mixed source. The literature has made some tangible advancements in establishing a theory of learning on mixture domains, but two problems remain unsolved: first, how to estimate the optimal mixture of sources given a target domain; second, when there are numerous target domains, how to solve empirical risk minimization (ERM) for each target, with its possibly unique mixture of data sources, in a computationally efficient manner. In this paper we address both problems efficiently and with guarantees. We cast the first problem, mixture weight estimation, as a convex-nonconcave compositional minimax problem and propose an efficient stochastic algorithm with provable stationarity guarantees. Next, for the second problem, we identify that in certain regimes, solving ERM for each target domain individually can be avoided: the parameters of a target-optimal model can instead be viewed as a non-linear function over the space of mixture coefficients. Building upon this, we show that in the offline setting, a GD-trained overparameterized neural network can provably learn such a function and thereby predict the model of a target domain instead of solving a designated ERM problem. Finally, we also consider an online setting and propose a label-efficient online algorithm, which predicts parameters for new targets given an arbitrary sequence of mixing coefficients, while enjoying regret guarantees.
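To make the model-prediction idea concrete, here is a minimal sketch, not the authors' implementation: ridge regression stands in for the per-target ERM (so the target-optimal parameters have a closed form to regress against), a small MLP stands in for the GD-trained overparameterized network, and every name and constant (erm_solution, train_alphas, the synthetic sources) is an illustrative assumption.

```python
# Sketch: treat target-optimal parameters as a function of the mixture weights.
# Assumptions (not from the paper): ridge regression as the per-target ERM,
# a small MLP as the parameter predictor, synthetic Gaussian sources.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
K, d, n, lam = 3, 5, 200, 0.1  # num sources, feature dim, samples/source, ridge

# Heterogeneous sources: each has its own ground-truth linear model.
Xs = [rng.normal(size=(n, d)) for _ in range(K)]
ws = [rng.normal(size=d) for _ in range(K)]
ys = [X @ w + 0.1 * rng.normal(size=n) for X, w in zip(Xs, ws)]

def erm_solution(alpha):
    """Closed-form ridge ERM on the alpha-weighted mixture of sources."""
    A, b = lam * np.eye(d), np.zeros(d)
    for a, X, y in zip(alpha, Xs, ys):
        A += a * X.T @ X / n
        b += a * X.T @ y / n
    return np.linalg.solve(A, b)

# Offline phase: solve ERM exactly for a handful of training mixtures ...
train_alphas = rng.dirichlet(np.ones(K), size=64)
train_thetas = np.stack([erm_solution(a) for a in train_alphas])

# ... then fit a network mapping mixture weights -> model parameters.
net = nn.Sequential(nn.Linear(K, 128), nn.ReLU(), nn.Linear(128, d))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
A_t = torch.tensor(train_alphas, dtype=torch.float32)
T_t = torch.tensor(train_thetas, dtype=torch.float32)
for _ in range(2000):
    opt.zero_grad()
    loss = ((net(A_t) - T_t) ** 2).mean()
    loss.backward()
    opt.step()

# A new target mixture now costs one forward pass, not a fresh ERM solve.
alpha_new = rng.dirichlet(np.ones(K))
theta_pred = net(torch.tensor(alpha_new, dtype=torch.float32)).detach().numpy()
print("prediction error:", np.linalg.norm(theta_pred - erm_solution(alpha_new)))
```

The payoff is amortization: with many target domains, each new vector of mixing coefficients maps to predicted parameters in one forward pass.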
Related papers
- Distributed Personalized Empirical Risk Minimization [19.087524494290676]
This paper advocates a new paradigm, Personalized Empirical Risk Minimization (PERM), to facilitate learning from heterogeneous data sources.
We propose a distributed algorithm that replaces the standard model averaging with model shuffling to simultaneously optimize PERM objectives for all devices (a toy sketch of the shuffling idea follows below).
arXiv Detail & Related papers (2023-10-26T20:07:33Z)
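A toy simulation of the model-shuffling idea from the summary above, under strong simplifying assumptions (quadratic local objectives, uniform mixing, synchronous rounds); it is not the PERM algorithm itself:

```python
# Toy model shuffling: instead of averaging models across devices, the server
# rotates them, so each personalized model takes gradient steps on every
# device's data without raw data leaving a device. All constants and the
# quadratic objectives are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
M, d, lr, rounds = 4, 3, 0.1, 100  # devices, dimension, step size, rounds

# Heterogeneous local objectives: f_i(w) = 0.5 * ||w - c_i||^2.
centers = rng.normal(size=(M, d))
models = [np.zeros(d) for _ in range(M)]  # one model per device slot

for _ in range(rounds):
    # Each device takes one local gradient step on the model it holds.
    for i in range(M):
        models[i] -= lr * (models[i] - centers[i])
    # Server shuffles: rotate models one device to the left.
    models = models[1:] + models[:1]

# Every model has now visited every device, so each approximates the optimum
# of the uniformly mixed objective, i.e. the mean of the centers.
print("model 0:", np.round(models[0], 2))
print("mixed-objective optimum:", np.round(centers.mean(axis=0), 2))
```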
- Threshold-aware Learning to Generate Feasible Solutions for Mixed Integer Programs [5.28005598366543]
Neural diving (ND) is one of the learning-based approaches to generating partial discrete variable assignments in Mixed Integer Programs (MIPs).
We introduce a post-hoc method and a learning-based approach for optimizing the coverage.
Experimental results demonstrate that learning a deep neural network to estimate the coverage for finding high-quality feasible solutions achieves state-of-the-art performance on the NeurIPS ML4CO datasets.
arXiv Detail & Related papers (2023-08-01T07:03:16Z)
- Efficient first-order predictor-corrector multiple objective optimization for fair misinformation detection [5.139559672771439]
Multi-objective optimization (MOO) aims to simultaneously optimize multiple conflicting objectives and has found important applications in machine learning.
We propose a Gauss-Newton approximation that scales only linearly and requires only first-order inner products per iteration (a matrix-vector sketch follows below).
These innovations make the predictor-corrector approach feasible for large networks.
arXiv Detail & Related papers (2022-09-15T12:32:15Z)
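A hedged sketch of the linear-scaling, first-order primitive the summary above refers to: a Gauss-Newton matrix-vector product computed with one Jacobian-vector and one vector-Jacobian product. The toy least-squares residual is an illustrative assumption, not the paper's setup.

```python
# Gauss-Newton matvec G v = J^T (J v) using only first-order products.
import torch

torch.manual_seed(0)
X, y = torch.randn(50, 10), torch.randn(50)

def residual(w):
    """Residual r(w) = Xw - y; the Gauss-Newton matrix is G = J^T J = X^T X."""
    return X @ w - y

w = torch.randn(10)
v = torch.randn(10)

# One forward-mode and one reverse-mode product; G is never materialized.
_, Jv = torch.autograd.functional.jvp(residual, w, v)
_, Gv = torch.autograd.functional.vjp(residual, w, Jv)

print(torch.allclose(Gv, X.T @ (X @ v), atol=1e-5))  # matches explicit X^T X v
```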
- Domain-Specific Risk Minimization for Out-of-Distribution Generalization [104.17683265084757]
We first establish a generalization bound that explicitly considers the adaptivity gap.
We propose effective gap estimation methods for guiding the selection of a better hypothesis for the target; one of these minimizes the gap directly by adapting model parameters using online target samples.
arXiv Detail & Related papers (2022-08-18T06:42:49Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (see the sketch below).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
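A minimal sketch of the ATC recipe as summarized above: calibrate a confidence threshold on labeled source data, then score unlabeled target data against it. The synthetic confidences are illustrative assumptions, not a real model's outputs.

```python
# ATC sketch: pick threshold t on held-out labeled source data so the
# fraction of source points above t matches source accuracy, then estimate
# target accuracy as the fraction of unlabeled target points above t.
import numpy as np

rng = np.random.default_rng(2)

# Held-out labeled source data: confidences plus correctness indicators.
src_conf = rng.uniform(0.5, 1.0, size=5000)
src_correct = rng.uniform(size=5000) < src_conf  # toy, roughly calibrated
src_acc = src_correct.mean()

# Threshold so that P_source(conf > t) == source accuracy.
t = np.quantile(src_conf, 1.0 - src_acc)

# Unlabeled, shifted target data: only confidences are observable.
tgt_conf = rng.uniform(0.4, 1.0, size=5000)
est_target_acc = (tgt_conf > t).mean()
print(f"threshold={t:.3f}  estimated target accuracy={est_target_acc:.3f}")
```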
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distribution over the source and the target domains.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples (a simplified minibatch estimate is sketched below).
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
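A simplified stand-in for the minibatch KL estimate mentioned above: rather than the paper's probabilistic representation network, this sketch fits diagonal Gaussians to each minibatch of representations and uses the closed-form Gaussian KL. All shapes and the random "representations" are illustrative assumptions.

```python
# Minibatch KL estimate between source and target representation
# distributions via diagonal-Gaussian fits and the closed-form Gaussian KL.
import numpy as np

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL( N(mu_p, diag var_p) || N(mu_q, diag var_q) ) in closed form."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

rng = np.random.default_rng(3)
z_src = rng.normal(0.0, 1.0, size=(256, 16))  # source minibatch representations
z_tgt = rng.normal(0.5, 1.2, size=(256, 16))  # target minibatch representations

kl = gaussian_kl(z_src.mean(0), z_src.var(0), z_tgt.mean(0), z_tgt.var(0))
print(f"minibatch KL estimate: {kl:.3f}")  # would be added to the task loss
```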
- Multi-resource allocation for federated settings: A non-homogeneous Markov chain model [2.552459629685159]
In a federated setting, agents coordinate with a central agent or a server to solve an optimization problem in which agents do not share their information with each other.
We describe how the basic additive-increase multiplicative-decrease (AIMD) algorithm can be modified in a straightforward manner to solve a class of optimization problems for federated settings with a single shared resource and no inter-agent communication (see the sketch below).
We extend the single-resource algorithm to multiple heterogeneous shared resources that emerge in smart cities, the sharing economy, and many other applications.
arXiv Detail & Related papers (2021-04-26T19:10:00Z)
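A short sketch of the single-resource AIMD loop described above: agents grow their demands additively until the server broadcasts a one-bit capacity signal, then back off multiplicatively, with no inter-agent communication. The constants and the deterministic backoff are illustrative assumptions.

```python
# Single-resource AIMD: additive increase, multiplicative decrease on a
# one-bit capacity event; no agent ever sees another agent's demand.
import numpy as np

N, capacity, alpha, beta, steps = 5, 10.0, 0.1, 0.5, 200
demand = np.zeros(N)

for _ in range(steps):
    if demand.sum() > capacity:
        demand *= beta   # capacity event broadcast: multiplicative decrease
    else:
        demand += alpha  # no congestion: additive increase

print("final shares:", np.round(demand, 2), "| total:", round(demand.sum(), 2))
```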
- Distill and Fine-tune: Effective Adaptation from a Black-box Source Model [138.12678159620248]
Unsupervised domain adaptation (UDA) aims to transfer knowledge from related labeled datasets (source) to a new unlabeled dataset (target).
We propose a novel two-step adaptation framework called Distill and Fine-tune (Dis-tune).
arXiv Detail & Related papers (2021-04-04T05:29:05Z)
- Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
The proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.