Multi-Task Learning for Sparsity Pattern Heterogeneity: Statistical and Computational Perspectives
- URL: http://arxiv.org/abs/2212.08697v2
- Date: Sun, 9 Jun 2024 01:20:43 GMT
- Title: Multi-Task Learning for Sparsity Pattern Heterogeneity: Statistical and Computational Perspectives
- Authors: Kayhan Behdin, Gabriel Loewinger, Kenneth T. Kishida, Giovanni Parmigiani, Rahul Mazumder,
- Abstract summary: We consider a problem in Multi-Task Learning (MTL) where multiple linear models are jointly trained on a collection of datasets.
A key novelty of our framework is that it allows the sparsity pattern of regression coefficients and the values of non-zero coefficients to differ across tasks.
Our methods encourage models to share information across tasks through separately encouraging 1) coefficient supports, and/or 2) nonzero coefficient values to be similar.
This allows models to borrow strength during variable selection even when non-zero coefficient values differ across tasks.
- Score: 10.514866749547558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a problem in Multi-Task Learning (MTL) where multiple linear models are jointly trained on a collection of datasets ("tasks"). A key novelty of our framework is that it allows the sparsity pattern of regression coefficients and the values of non-zero coefficients to differ across tasks while still leveraging partially shared structure. Our methods encourage models to share information across tasks through separately encouraging 1) coefficient supports, and/or 2) nonzero coefficient values to be similar. This allows models to borrow strength during variable selection even when non-zero coefficient values differ across tasks. We propose a novel mixed-integer programming formulation for our estimator. We develop custom scalable algorithms based on block coordinate descent and combinatorial local search to obtain high-quality (approximate) solutions for our estimator. Additionally, we propose a novel exact optimization algorithm to obtain globally optimal solutions. We investigate the theoretical properties of our estimators. We formally show how our estimators leverage the shared support information across tasks to achieve better variable selection performance. We evaluate the performance of our methods in simulations and two biomedical applications. Our proposed approaches appear to outperform other sparse MTL methods in variable selection and prediction accuracy. We provide the sMTL package on CRAN.
Related papers
- Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the expensive computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
arXiv Detail & Related papers (2024-06-14T07:16:18Z) - Interpetable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z) - MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation [80.47072100963017]
We introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP)
MAP efficiently identifies a set of scaling coefficients for merging multiple models, reflecting the trade-offs involved.
We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.
arXiv Detail & Related papers (2024-06-11T17:55:25Z) - Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning [39.4348419684885]
Multi-task learning (MTL) aims at learning a single model that solves several tasks efficiently.
We introduce a novel gradient aggregation approach using Bayesian inference.
We empirically demonstrate the benefits of our approach in a variety of datasets.
arXiv Detail & Related papers (2024-02-06T14:00:43Z) - Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least square support vector machines (LSSVMs)
arXiv Detail & Related papers (2023-08-30T14:28:26Z) - A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z) - Mixed-Integer Optimization with Constraint Learning [4.462264781248437]
We establish a broad methodological foundation for mixed-integer optimization with learned constraints.
We exploit the mixed-integer optimization-representability of many machine learning methods.
We demonstrate the method in both World Food Programme planning and chemotherapy optimization.
arXiv Detail & Related papers (2021-11-04T20:19:55Z) - Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization in which data collected across different domains help improving the learning performance at each other task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z) - Efficient Robust Optimal Transport with Application to Multi-Label
Classification [12.521494095948068]
We model the feature-feature relationship via a symmetric positive semi-definite Mahalanobis metric in the OT cost function.
We view the resulting optimization problem as a non-linear OT problem, which we solve using the Frank-Wolfe algorithm.
Empirical results on the discriminative learning setting, such as tag prediction and multi-class classification, illustrate the good performance of our approach.
arXiv Detail & Related papers (2020-10-22T16:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.