Cross-Learning from Scarce Data via Multi-Task Constrained Optimization
- URL: http://arxiv.org/abs/2511.13680v1
- Date: Mon, 17 Nov 2025 18:35:59 GMT
- Title: Cross-Learning from Scarce Data via Multi-Task Constrained Optimization
- Authors: Leopoldo Agorio, Juan Cerviño, Miguel Calvo-Fullana, Alejandro Ribeiro, Juan Andrés Bazerque,
- Abstract summary: This paper introduces a multi-task \emph{cross-learning} framework to overcome data scarcity. We formulate this joint estimation as a constrained optimization problem. We show the efficiency of our cross-learning method in applications with real data, including image classification and the propagation of infectious diseases.
- Score: 70.90607489166648
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A learning task, understood as the problem of fitting a parametric model from supervised data, fundamentally requires the dataset to be large enough to be representative of the underlying distribution of the source. When data is limited, the learned models fail to generalize to cases not seen during training. This paper introduces a multi-task \emph{cross-learning} framework to overcome data scarcity by jointly estimating \emph{deterministic} parameters across multiple, related tasks. We formulate this joint estimation as a constrained optimization problem, where the constraints dictate the resulting similarity between the parameters of the different models, allowing the estimated parameters to differ across tasks while still combining information from multiple data sources. This framework enables knowledge transfer from tasks with abundant data to those with scarce data, leading to more accurate and reliable parameter estimates, and providing a solution for scenarios where parameter inference from limited data is critical. We provide theoretical guarantees in a controlled framework with Gaussian data, and show the efficiency of our cross-learning method in applications with real data, including image classification and the propagation of infectious diseases.
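As a minimal illustration of the idea (not the paper's actual algorithm), the similarity-constrained joint estimation can be sketched on two Gaussian mean-estimation tasks, replacing the hard constraint with a quadratic proximity penalty; all task names, sizes, and the penalty weight below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two related Gaussian mean-estimation tasks with close
# but not identical true means. Task "a" has abundant data; task "b" is scarce.
x_a = rng.normal(1.0, 1.0, size=500)
x_b = rng.normal(1.2, 1.0, size=5)

def cross_learn_means(x_a, x_b, lam):
    """Jointly estimate the two task means (t_a, t_b) by minimizing
        n_a/2 * (t_a - mean(x_a))**2 + n_b/2 * (t_b - mean(x_b))**2
            + lam * (t_a - t_b)**2,
    a penalized surrogate for the similarity-constrained problem.
    The objective is quadratic, so setting the gradient to zero gives a
    2x2 linear system for the minimizer."""
    n_a, n_b = len(x_a), len(x_b)
    A = np.array([[n_a + 2 * lam, -2 * lam],
                  [-2 * lam, n_b + 2 * lam]], dtype=float)
    b = np.array([n_a * x_a.mean(), n_b * x_b.mean()])
    return np.linalg.solve(A, b)

t_a, t_b = cross_learn_means(x_a, x_b, lam=10.0)
```

With `lam=0` the tasks decouple into per-task maximum-likelihood estimates; as `lam` grows, both estimates converge to the pooled mean. In between, the scarce task borrows strength from the abundant one while the abundant task's estimate barely moves, which mirrors the knowledge-transfer behavior the abstract describes.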
Related papers
- Federated Online Learning for Heterogeneous Multisource Streaming Data [0.0]
Federated learning has emerged as an essential paradigm for distributed multi-source data analysis under privacy concerns. In this paper, we propose a federated online learning (FOL) method for distributed multi-source streaming data analysis.
arXiv Detail & Related papers (2025-08-08T19:08:53Z) - Late Fusion Multi-task Learning for Semiparametric Inference with Nuisance Parameters [2.6217304977339473]
We introduce a late fusion framework for multi-task learning with semiparametric models. We focus on applications such as heterogeneous treatment effect estimation across multiple data sources. We propose a novel multi-task learning method for nuisance parameter estimation.
arXiv Detail & Related papers (2025-07-10T17:27:04Z) - AdvKT: An Adversarial Multi-Step Training Framework for Knowledge Tracing [64.79967583649407]
Knowledge Tracing (KT) monitors students' knowledge states and simulates their responses to question sequences. Existing KT models typically follow a single-step training paradigm, which leads to significant error accumulation. We propose a novel Adversarial Multi-Step Training Framework for Knowledge Tracing (AdvKT) which focuses on the multi-step KT task.
arXiv Detail & Related papers (2025-04-07T03:31:57Z) - Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks. We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
arXiv Detail & Related papers (2025-02-17T18:04:39Z) - Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications. The dataset includes a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning varying difficulty levels from easy to hard. We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
arXiv Detail & Related papers (2025-01-16T16:19:53Z) - Optimize Incompatible Parameters through Compatibility-aware Knowledge Integration [104.52015641099828]
Existing research excels in removing such incompatible parameters or merging the outputs of multiple different pretrained models. We propose Compatibility-aware Knowledge Integration (CKI), which consists of Deep Assessment and Deep Splicing. The integrated model can be used directly for inference or for further fine-tuning.
arXiv Detail & Related papers (2025-01-10T01:42:43Z) - Analysing Multi-Task Regression via Random Matrix Theory with Application to Time Series Forecasting [16.640336442849282]
We formulate a multi-task optimization problem as a regularization technique to enable single-task models to leverage multi-task learning information.
We derive a closed-form solution for multi-task optimization in the context of linear models.
arXiv Detail & Related papers (2024-06-14T17:59:25Z) - Improving Transferability for Cross-domain Trajectory Prediction via Neural Stochastic Differential Equation [41.09061877498741]
Discrepancies exist among datasets due to external factors and data acquisition strategies. Models trained on large-scale datasets, despite their proficient performance, have limited transferability to other small-size datasets. We propose a method based on the continuous representations of Neural Stochastic Differential Equations (NSDE) to alleviate these discrepancies.
The effectiveness of our method is validated against state-of-the-art trajectory prediction models on the popular benchmark datasets: nuScenes, Argoverse, Lyft, INTERACTION, and Open Motion dataset.
arXiv Detail & Related papers (2023-12-26T06:50:29Z) - Multi-Task Learning with Summary Statistics [4.871473117968554]
We propose a flexible multi-task learning framework utilizing summary statistics from various sources.
We also present an adaptive parameter selection approach based on a variant of Lepski's method.
This work offers a more flexible tool for training related models across various domains, with practical implications in genetic risk prediction.
arXiv Detail & Related papers (2023-07-05T15:55:23Z) - Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
The proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.