A Wasserstein Minimax Framework for Mixed Linear Regression
- URL: http://arxiv.org/abs/2106.07537v2
- Date: Wed, 16 Jun 2021 14:45:42 GMT
- Title: A Wasserstein Minimax Framework for Mixed Linear Regression
- Authors: Theo Diamandis, Yonina C. Eldar, Alireza Fallah, Farzan Farnia, Asuman Ozdaglar
- Abstract summary: Multi-modal distributions are commonly used to model clustered data in learning tasks.
We propose an optimal transport-based framework for Mixed Linear Regression problems.
- Score: 69.40394595795544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal distributions are commonly used to model clustered data in
statistical learning tasks. In this paper, we consider the Mixed Linear
Regression (MLR) problem. We propose an optimal transport-based framework for
MLR problems, Wasserstein Mixed Linear Regression (WMLR), which minimizes the
Wasserstein distance between the learned and target mixture regression models.
Through a model-based duality analysis, WMLR reduces the underlying MLR task to
a nonconvex-concave minimax optimization problem, which can be provably solved
to find a minimax stationary point by the Gradient Descent Ascent (GDA)
algorithm. In the special case of mixtures of two linear regression models, we
show that WMLR enjoys global convergence and generalization guarantees. We
prove that WMLR's sample complexity grows linearly with the dimension of data.
Finally, we discuss the application of WMLR to the federated learning task
where the training samples are collected by multiple agents in a network.
Unlike the Expectation Maximization algorithm, WMLR directly extends to the
distributed, federated learning setting. We support our theoretical results
through several numerical experiments, which highlight our framework's ability
to handle the federated learning setting with mixture models.
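The abstract names Gradient Descent Ascent (GDA) as the solver for the minimax problem. As a rough illustration of the update rule only, here is a minimal sketch on a hypothetical toy objective f(x, y) = x*y - 0.5*y^2, which is concave in y. The objective, the step sizes, and the function name `gda` are assumptions chosen for illustration; this is not the WMLR objective or the authors' implementation.

```python
def gda(grad_x, grad_y, x0, y0, eta_x=0.05, eta_y=0.2, steps=500):
    """Simultaneous gradient descent ascent: descend in x, ascend in y."""
    x, y = x0, y0
    for _ in range(steps):
        # Evaluate both partial derivatives at the current iterate.
        gx, gy = grad_x(x, y), grad_y(x, y)
        # The min player steps against its gradient, the max player along it.
        x, y = x - eta_x * gx, y + eta_y * gy
    return x, y


# Toy objective (an assumption, not from the paper): f(x, y) = x*y - 0.5*y**2
# df/dx = y and df/dy = x - y, so the only stationary point is (0, 0).
x_star, y_star = gda(lambda x, y: y, lambda x, y: x - y, x0=1.0, y0=-1.0)
print(x_star, y_star)
```

With these step sizes the iterates contract toward the stationary point (0, 0). The larger step for the max player is a common choice in two-timescale GDA, though the appropriate ratio depends on the problem.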
Related papers
- Unveiling the Cycloid Trajectory of EM Iterations in Mixed Linear Regression [5.883916678819683]
We study the trajectory of iterations and the convergence rates of the Expectation-Maximization (EM) algorithm for two-component Mixed Linear Regression (2MLR).
Recent results have established the super-linear convergence of EM for 2MLR in the noiseless and high SNR settings.
arXiv Detail & Related papers (2024-05-28T14:46:20Z)
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
- Federated Empirical Risk Minimization via Second-Order Method [18.548661105227488]
We present an interior point method (IPM) to solve a general empirical risk minimization problem under the federated learning setting.
We show that the communication complexity of each iteration of our IPM is $\tilde{O}(d^{3/2})$, where $d$ is the dimension (i.e., number of features) of the dataset.
arXiv Detail & Related papers (2023-05-27T14:23:14Z)
- Federated Latent Class Regression for Hierarchical Data [5.110894308882439]
Federated Learning (FL) allows a number of agents to participate in training a global machine learning model without disclosing locally stored data.
We propose a novel probabilistic model, Hierarchical Latent Class Regression (HLCR), and its extension to Federated Learning, FEDHLCR.
Our inference algorithm, being derived from Bayesian theory, provides strong convergence guarantees and good robustness to overfitting. Experimental results show that FEDHLCR offers fast convergence even in non-IID datasets.
arXiv Detail & Related papers (2022-06-22T00:33:04Z)
- Reinforcement Learning for Adaptive Mesh Refinement [63.7867809197671]
We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning to train refinement policies directly from simulation.
The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations.
arXiv Detail & Related papers (2021-03-01T22:55:48Z)
- On the Minimal Error of Empirical Risk Minimization [90.09093901700754]
We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression.
Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data.
arXiv Detail & Related papers (2021-02-24T04:47:55Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- MLR-SNet: Transferable LR Schedules for Heterogeneous Tasks [56.66010634895913]
The learning rate (LR) is one of the most important hyperparameters in stochastic gradient descent (SGD) training of deep neural networks (DNNs).
In this paper, we propose MLR-SNet, which learns a proper LR schedule for a given task.
We also transfer the learned MLR-SNet to query tasks that differ from the training ones in noise, architecture, data modality, and size, and achieve comparable or even better performance.
arXiv Detail & Related papers (2020-07-29T01:18:58Z)