Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity
- URL: http://arxiv.org/abs/2306.06291v3
- Date: Fri, 13 Dec 2024 00:38:11 GMT
- Title: Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity
- Authors: Xinmeng Huang, Kan Xu, Donghwan Lee, Hamed Hassani, Hamsa Bastani, Edgar Dobriban
- Abstract summary: Multitask learning methods improve efficiency by leveraging commonalities across datasets.
We study multitask linear regression and contextual bandits under sparse heterogeneity.
We show that our methods are minimax optimal by providing a number of lower bounds.
- Score: 41.772562538698395
- License:
- Abstract: Large and complex datasets are often collected from several, possibly heterogeneous sources. Multitask learning methods improve efficiency by leveraging commonalities across datasets while accounting for possible differences among them. Here, we study multitask linear regression and contextual bandits under sparse heterogeneity, where the source/task-associated parameters are equal to a global parameter plus a sparse task-specific term. We propose a novel two-stage estimator called MOLAR that leverages this structure by first constructing a covariate-wise weighted median of the task-wise linear regression estimates and then shrinking the task-wise estimates towards the weighted median. Compared to task-wise least squares estimates, MOLAR improves the dependence of the estimation error on the data dimension. Extensions of MOLAR to generalized linear models and constructing confidence intervals are discussed in the paper. We then apply MOLAR to develop methods for sparsely heterogeneous multitask contextual bandits, obtaining improved regret guarantees over single-task bandit methods. We further show that our methods are minimax optimal by providing a number of lower bounds. Finally, we support the efficiency of our methods by performing experiments on both synthetic data and the PISA dataset on student educational outcomes from heterogeneous countries.
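The abstract describes MOLAR's two-stage structure: per-task least-squares estimates, a covariate-wise weighted median across tasks, and shrinkage of each task estimate toward that median. A minimal illustrative sketch of that pipeline follows, assuming sample-size weights and a soft-thresholding shrinkage step with a hypothetical level `lam`; the paper's actual weighting and tuning may differ.

```python
import numpy as np

def weighted_median(values, weights):
    # Smallest value whose cumulative weight reaches half the total weight.
    order = np.argsort(values)
    cw = np.cumsum(weights[order])
    idx = np.searchsorted(cw, 0.5 * cw[-1])
    return values[order][idx]

def molar_sketch(Xs, ys, lam=0.1):
    """Two-stage MOLAR-style estimator (illustrative sketch only).

    Stage 1: task-wise OLS estimates.
    Stage 2: covariate-wise weighted median of the estimates,
             weighted here by task sample sizes (an assumption).
    Stage 3: shrink each task estimate toward the median by
             soft-thresholding the difference, exploiting the
             sparsity of task-specific deviations.
    """
    betas = np.stack(
        [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(Xs, ys)]
    )
    weights = np.array([X.shape[0] for X in Xs], dtype=float)
    med = np.array(
        [weighted_median(betas[:, j], weights) for j in range(betas.shape[1])]
    )
    diff = betas - med
    shrunk = med + np.sign(diff) * np.maximum(np.abs(diff) - lam, 0.0)
    return med, shrunk
```

When task-specific deviations are sparse, the weighted median is robust to the few tasks that deviate on any given coordinate, and the shrinkage step removes small spurious differences while preserving large genuine ones.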
Related papers
- Analysing Multi-Task Regression via Random Matrix Theory with Application to Time Series Forecasting [16.640336442849282]
We formulate a multi-task optimization problem as a regularization technique to enable single-task models to leverage multi-task learning information.
We derive a closed-form solution for multi-task optimization in the context of linear models.
arXiv Detail & Related papers (2024-06-14T17:59:25Z) - Interpretable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z) - Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z) - Multi-Task Learning for Sparsity Pattern Heterogeneity: Statistical and Computational Perspectives [10.514866749547558]
We consider a problem in Multi-Task Learning (MTL) where multiple linear models are jointly trained on a collection of datasets.
A key novelty of our framework is that it allows the sparsity pattern of regression coefficients and the values of non-zero coefficients to differ across tasks.
Our methods encourage models to share information across tasks through separately encouraging 1) coefficient supports, and/or 2) nonzero coefficient values to be similar.
This allows models to borrow strength during variable selection even when non-zero coefficient values differ across tasks.
arXiv Detail & Related papers (2022-12-16T19:52:25Z) - New Tight Relaxations of Rank Minimization for Multi-Task Learning [161.23314844751556]
We propose two novel multi-task learning formulations based on two regularization terms.
We show that our methods can correctly recover the low-rank structure shared across tasks, and outperform related multi-task learning methods.
arXiv Detail & Related papers (2021-12-09T07:29:57Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Byzantine Resilient Distributed Multi-Task Learning [6.850757447639822]
We show that distributed algorithms for learning relatedness among tasks are not resilient in the presence of Byzantine agents.
We propose an approach for Byzantine resilient distributed multi-task learning.
arXiv Detail & Related papers (2020-10-25T04:32:52Z) - DEMI: Discriminative Estimator of Mutual Information [5.248805627195347]
Estimating mutual information between continuous random variables is often intractable and challenging for high-dimensional data.
Recent progress has leveraged neural networks to optimize variational lower bounds on mutual information.
Our approach is based on training a classifier that provides the probability that a data sample pair is drawn from the joint distribution.
arXiv Detail & Related papers (2020-10-05T04:19:27Z) - Rank-Based Multi-task Learning for Fair Regression [9.95899391250129]
We develop a novel learning approach for multi-task regression models based on a biased dataset.
Our approach uses a popular rank-based non-parametric independence test.
arXiv Detail & Related papers (2020-09-23T22:32:57Z) - An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.