Analysing Multi-Task Regression via Random Matrix Theory with Application to Time Series Forecasting
- URL: http://arxiv.org/abs/2406.10327v1
- Date: Fri, 14 Jun 2024 17:59:25 GMT
- Title: Analysing Multi-Task Regression via Random Matrix Theory with Application to Time Series Forecasting
- Authors: Romain Ilbert, Malik Tiomoko, Cosme Louart, Ambroise Odonnat, Vasilii Feofanov, Themis Palpanas, Ievgen Redko
- Abstract summary: We formulate a multi-task optimization problem as a regularization technique to enable single-task models to leverage multi-task learning information.
We derive a closed-form solution for multi-task optimization in the context of linear models.
- Score: 16.640336442849282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce a novel theoretical framework for multi-task regression, applying random matrix theory to provide precise performance estimations under high-dimensional, non-Gaussian data distributions. We formulate a multi-task optimization problem as a regularization technique to enable single-task models to leverage multi-task learning information. We derive a closed-form solution for multi-task optimization in the context of linear models. Our analysis provides valuable insights by linking the multi-task learning performance to various model statistics such as raw data covariances, signal-generating hyperplanes, noise levels, as well as the size and number of datasets. We finally propose a consistent estimation of training and testing errors, thereby offering a robust foundation for hyperparameter optimization in multi-task regression scenarios. Experimental validations on both synthetic and real-world datasets, in regression and multivariate time series forecasting, demonstrate improvements over univariate models when our method is incorporated into the training loss, thus leveraging multivariate information.
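To make the linear-model setting concrete, here is a minimal sketch of multi-task ridge regression with a closed-form solution. It assumes a shared-plus-specific weight decomposition w_t = w0 + v_t with two illustrative penalties lam and gam; the paper's exact objective, notation, and regularization may differ.

```python
import numpy as np

def multitask_ridge(Xs, ys, lam=1.0, gam=1.0):
    """Closed form of
        min_{w0, v_t} sum_t ||y_t - X_t (w0 + v_t)||^2
                      + lam*||w0||^2 + gam*sum_t ||v_t||^2
    returning per-task weights w_t = w0 + v_t."""
    T, d = len(Xs), Xs[0].shape[1]
    A = np.zeros((d * (T + 1), d * (T + 1)))  # normal equations over [w0, v_1..v_T]
    b = np.zeros(d * (T + 1))
    for t, (X, y) in enumerate(zip(Xs, ys)):
        G, r = X.T @ X, X.T @ y
        s = slice((t + 1) * d, (t + 2) * d)
        A[:d, :d] += G                    # shared-shared block
        A[:d, s] += G                     # shared/task cross blocks
        A[s, :d] += G
        A[s, s] = G + gam * np.eye(d)     # task-task block with its own ridge
        b[:d] += r
        b[s] = r
    A[:d, :d] += lam * np.eye(d)          # ridge on the shared direction
    z = np.linalg.solve(A, b)
    return z[:d] + z[d:].reshape(T, d)    # (T, d): one weight vector per task
```

Under this parameterization, tuning lam and gam on held-out data is exactly the hyperparameter step that the paper's consistent train/test error estimates would aim to carry out without a validation set.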
Related papers
- Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a methodology for finding sequences of machine learning models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power.
arXiv Detail & Related papers (2024-03-28T22:45:38Z)
- Meta-Learning with Generalized Ridge Regression: High-dimensional Asymptotics, Optimality and Hyper-covariance Estimation [14.194212772887699]
We consider meta-learning within the framework of high-dimensional random-effects linear models.
We show the precise behavior of the predictive risk for a new test task when the data dimension grows proportionally to the number of samples per task.
We propose and analyze an estimator of the inverse covariance of the random regression coefficients based on data from the training tasks.
arXiv Detail & Related papers (2024-03-27T21:18:43Z)
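As an illustration of the setup above, the sketch below implements a generalized ridge estimator whose penalty matrix stands in for an estimated hyper-covariance of the random coefficients; Omega, lam, and the moment-style estimator are assumptions for the example, not the paper's estimator.

```python
import numpy as np

def generalized_ridge(X, y, Omega, lam=1.0):
    """Closed form of min ||y - X b||^2 + lam * b^T inv(Omega) b,
    i.e. b_hat = (X^T X + lam * inv(Omega))^{-1} X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.linalg.inv(Omega), X.T @ y)

def estimate_hyper_covariance(per_task_betas):
    """Naive second-moment estimate of the coefficient hyper-covariance
    from per-task fits (an illustrative stand-in, not the paper's method)."""
    B = np.stack(per_task_betas)        # (T, d)
    return B.T @ B / len(per_task_betas)
```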
- How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression? [92.90857135952231]
Transformers pretrained on diverse tasks exhibit remarkable in-context learning (ICL) capabilities.
We study ICL in one of its simplest setups: pretraining a linearly parameterized single-layer linear attention model for linear regression.
arXiv Detail & Related papers (2023-10-12T15:01:43Z)
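For intuition on the setup above, here is a hedged sketch of a single-layer linear attention head for in-context linear regression; reducing the prediction to y_hat = x_q^T Gamma (sum_i y_i x_i), i.e. one preconditioned gradient step, is a common simplification in this literature, not necessarily the paper's exact parameterization.

```python
import numpy as np

def linear_attention_predict(Gamma, X_ctx, y_ctx, x_query):
    """One linear-attention readout over an in-context prompt.
    Gamma (d x d) is the trained parameter; the context (X_ctx, y_ctx)
    plays the role of the key/value pairs."""
    h = X_ctx.T @ y_ctx                 # aggregate context: sum_i y_i * x_i
    return x_query @ (Gamma @ h)        # scalar prediction for the query
```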
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
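A minimal sketch of coefficient-based merging in the spirit of task arithmetic follows: theta = theta0 + sum_k lambda_k * (theta_k - theta0). AdaMerging learns the lambda_k (task-wise or layer-wise) without the original training data, e.g. via test-time adaptation on unlabeled data; here the coefficients are simply given.

```python
import numpy as np

def merge_task_vectors(theta0, task_thetas, lambdas):
    """theta0: dict of pretrained weight arrays; task_thetas: list of
    fine-tuned dicts; lambdas: one merging coefficient per task
    (the task-wise variant)."""
    merged = {name: w.copy() for name, w in theta0.items()}
    for lam, theta_k in zip(lambdas, task_thetas):
        for name in merged:
            merged[name] += lam * (theta_k[name] - theta0[name])  # scaled task vector
    return merged
```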
- TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series [57.4208255711412]
Building on copula theory, we propose a simplified objective for the recently introduced transformer-based attentional copulas (TACTiS).
We show that the resulting model has significantly better training dynamics and achieves state-of-the-art performance across diverse real-world forecasting tasks.
arXiv Detail & Related papers (2023-10-02T16:45:19Z)
- Multi-Task Learning with Summary Statistics [4.871473117968554]
We propose a flexible multi-task learning framework utilizing summary statistics from various sources.
We also present an adaptive parameter selection approach based on a variant of Lepski's method.
This work offers a more flexible tool for training related models across various domains, with practical implications in genetic risk prediction.
arXiv Detail & Related papers (2023-07-05T15:55:23Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
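The entry above gives no formulas, so the sketch below only illustrates the standard closed-form uncertainties of a single Normal-Inverse-Gamma head, the building block that MoNIG mixes across modalities; the paper's actual fusion operator is not reproduced here.

```python
def nig_uncertainties(mu, nu, alpha, beta):
    """Standard evidential-regression identities for NIG(mu, nu, alpha, beta),
    valid for alpha > 1: the predictive mean is mu, and uncertainty splits
    into aleatoric E[sigma^2] and epistemic Var[mean]."""
    aleatoric = beta / (alpha - 1.0)
    epistemic = beta / (nu * (alpha - 1.0))
    return mu, aleatoric, epistemic
```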
- Scalable Multi-Task Gaussian Processes with Neural Embedding of Coregionalization [9.873139480223367]
Multi-task regression attempts to exploit the task similarity in order to achieve knowledge transfer across related tasks for performance improvement.
The linear model of coregionalization (LMC) is a well-known MTGP paradigm which exploits the dependency of tasks through linear combination of several independent and diverse GPs.
We develop the neural embedding of coregionalization that transforms the latent GPs into a high-dimensional latent space to induce rich yet diverse behaviors.
arXiv Detail & Related papers (2021-09-20T01:28:14Z)
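To make the LMC construction above concrete, the sketch below samples from an LMC prior in which T task outputs are fixed linear mixes of Q independent latent GPs, f_t(x) = sum_q A[t, q] g_q(x); the shared RBF kernel and the mixing matrix A are illustrative choices.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    return np.exp(-0.5 * sq / lengthscale**2)

def sample_lmc_prior(X, A, seed=0):
    """Draw T correlated task functions at inputs X (n x p) from an LMC
    prior with mixing matrix A (T x Q); task correlations come only from A."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    K = rbf_kernel(X) + 1e-8 * np.eye(n)                  # jitter for stability
    G = rng.multivariate_normal(np.zeros(n), K, size=A.shape[1])  # Q latent draws
    return A @ G                                          # (T, n) task samples
```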
- An Extended Multi-Model Regression Approach for Compressive Strength Prediction and Optimization of a Concrete Mixture [0.0]
A model-based evaluation of concrete compressive strength is of high value, both for the purpose of strength prediction and mixture optimization.
We take a further step towards improving the accuracy of the prediction model via the weighted combination of multiple regression methods.
A genetic algorithm (GA)-based mixture optimization is proposed, building on the obtained multi-regression model.
arXiv Detail & Related papers (2021-06-13T16:10:32Z)
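A hedged sketch of the weighted multi-model combination above follows: fit several base regressors and weight them by inverse validation MSE. The model set and weighting rule are illustrative assumptions, and the GA-based mixture optimization from the entry above is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.svm import SVR

def weighted_multi_model(X_tr, y_tr, X_val, y_val):
    """Return a predictor that convexly combines base models, weighted by
    inverse validation MSE (one simple choice of combination weights)."""
    models = [Ridge(alpha=1.0),
              RandomForestRegressor(n_estimators=200, random_state=0),
              SVR(C=10.0)]
    errs = []
    for m in models:
        m.fit(X_tr, y_tr)
        errs.append(np.mean((m.predict(X_val) - y_val) ** 2))
    w = 1.0 / np.asarray(errs)
    w /= w.sum()                                          # convex weights
    return lambda X: sum(wi * m.predict(X) for wi, m in zip(w, models)), w
```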
- Meta-learning framework with applications to zero-shot time-series forecasting [82.61728230984099]
This work provides positive evidence using a broad meta-learning framework.
Residual connections act as a meta-learning adaptation mechanism.
We show that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining.
arXiv Detail & Related papers (2020-02-07T16:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.