Benign Overfitting in Time Series Linear Models with Over-Parameterization
- URL: http://arxiv.org/abs/2204.08369v3
- Date: Thu, 13 Mar 2025 10:19:36 GMT
- Title: Benign Overfitting in Time Series Linear Models with Over-Parameterization
- Authors: Shogo Nakakita, Masaaki Imaizumi
- Abstract summary: We analyze a linear regression model with dependent time-series data. We develop a theory for the excess risk of the estimator. We show the convergence rate of the risk bound and demonstrate that it is also influenced by the coherence of the temporal covariance.
- Score: 6.9060054915724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of large-scale models in recent years has increased the importance of statistical models with numerous parameters. Several studies have analyzed over-parameterized linear models with high-dimensional data, which may not be sparse; however, existing results rely on the assumption of sample independence. In this study, we analyze a linear regression model with dependent time-series data in an over-parameterized setting. We consider an estimator using interpolation and develop a theory for the excess risk of the estimator. Then, we derive non-asymptotic risk bounds for the estimator for cases with dependent data. This analysis reveals that the coherence of the temporal covariance plays a key role; the risk bound is influenced by the product of temporal covariance matrices at different time steps. Moreover, we show the convergence rate of the risk bound and demonstrate that it is also influenced by the coherence of the temporal covariance. Finally, we provide several examples of specific dependent processes applicable to our setting.
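The setting in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method or experiment. It assumes the interpolating estimator is the minimum-l2-norm interpolator and that the rows of the design matrix follow a stationary Gaussian AR(1) process. The names `ar1_design`, `min_norm_interpolator`, `phi`, and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np


def ar1_design(n, p, phi, rng):
    """Build an n x p design whose rows are temporally dependent.

    Assumption for illustration: each coordinate follows a stationary AR(1)
    recursion x_t = phi * x_{t-1} + sqrt(1 - phi^2) * e_t with standard
    normal innovations, so every row has identity covariance marginally.
    """
    X = np.empty((n, p))
    X[0] = rng.standard_normal(p)
    for t in range(1, n):
        X[t] = phi * X[t - 1] + np.sqrt(1.0 - phi**2) * rng.standard_normal(p)
    return X


def min_norm_interpolator(X, y):
    """Return the minimum-l2-norm solution of X @ theta = y via the pseudoinverse."""
    return np.linalg.pinv(X) @ y


rng = np.random.default_rng(0)
n, p, phi = 50, 500, 0.5  # over-parameterized: p > n (hypothetical sizes)

X = ar1_design(n, p, phi, rng)
theta_star = rng.standard_normal(p) / np.sqrt(p)  # hypothetical true parameter
y = X @ theta_star + 0.1 * rng.standard_normal(n)

theta_hat = min_norm_interpolator(X, y)

# The estimator fits the training data exactly (up to floating-point error).
train_residual = float(np.max(np.abs(X @ theta_hat - y)))

# With identity feature covariance, the excess risk at a fresh independent
# test point equals the squared parameter error.
excess_risk = float(np.sum((theta_hat - theta_star) ** 2))

print(f"max training residual: {train_residual:.2e}")
print(f"excess risk: {excess_risk:.4f}")
```

Varying `phi` changes the strength of the temporal dependence between rows, which gives a rough way to see how the excess risk of the interpolator reacts to dependent samples in this toy setting.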
Related papers
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory across a variety of high dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - Precise analysis of ridge interpolators under heavy correlations -- a Random Duality Theory view [0.0]
We show that Random Duality Theory (RDT) can be utilized to obtain precise closed-form characterizations of all estimator-related optimizing quantities of interest.
arXiv Detail & Related papers (2024-06-13T14:56:52Z) - Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - High-dimensional analysis of ridge regression for non-identically distributed data with a variance profile [0.0]
We study the predictive risk of the ridge estimator for linear regression with a variance profile.
For a certain class of variance profiles, our work highlights the emergence of the well-known double descent phenomenon.
We also investigate the similarities and differences that exist with the standard setting of independent and identically distributed data.
arXiv Detail & Related papers (2024-03-29T14:24:49Z) - High Dimensional Time Series Regression Models: Applications to Statistical Learning Methods [0.0]
These lecture notes provide an overview of existing methodologies and recent developments for estimation and inference with high dimensional time series regression models.
First, we present the main limit theory results for high dimensional dependent data, which are relevant to covariance matrix structures as well as to dependent time series sequences.
arXiv Detail & Related papers (2023-08-27T15:53:31Z) - The Capacity and Robustness Trade-off: Revisiting the Channel Independent Strategy for Multivariate Time Series Forecasting [50.48888534815361]
We show that models trained with the Channel Independent (CI) strategy outperform those trained with the Channel Dependent (CD) strategy.
Our results conclude that the CD approach has higher capacity but often lacks robustness to accurately predict distributionally drifted time series.
We propose a modified CD method called Predict Residuals with Regularization (PRReg) that can surpass the CI strategy.
arXiv Detail & Related papers (2023-04-11T13:15:33Z) - Wasserstein multivariate auto-regressive models for modeling distributional time series [0.0]
We propose a new auto-regressive model for the statistical analysis of multivariate distributional time series.
Results on the existence, uniqueness and stationarity of the solution of such a model are provided.
To shed some light on the benefits of our approach for real data analysis, we also apply this methodology to a data set made of observations from age distribution in different countries.
arXiv Detail & Related papers (2022-07-12T10:18:36Z) - Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations [84.42837346400151]
Estimating counterfactual outcomes over time has the potential to unlock personalized healthcare.
Existing causal inference approaches consider regular, discrete-time intervals between observations and treatment decisions.
We propose a controllable simulation environment based on a model of tumor growth for a range of scenarios.
arXiv Detail & Related papers (2022-06-16T17:15:15Z) - TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z) - Time varying regression with hidden linear dynamics [74.9914602730208]
We revisit a model for time-varying linear regression that assumes the unknown parameters evolve according to a linear dynamical system.
Counterintuitively, we show that when the underlying dynamics are stable the parameters of this model can be estimated from data by combining just two ordinary least squares estimates.
arXiv Detail & Related papers (2021-12-29T23:37:06Z) - Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of the future based on its history.
We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z) - Generative Learning of Heterogeneous Tail Dependence [13.60514494665717]
Our model features heterogeneous and asymmetric tail dependence between all pairs of individual dimensions.
We devise a novel moment learning algorithm to learn the parameters.
Results show that this framework gives better finite-sample performance compared to the copula-based benchmarks.
arXiv Detail & Related papers (2020-11-26T05:34:31Z) - Deep Switching Auto-Regressive Factorization: Application to Time Series Forecasting [16.934920617960085]
DSARF approximates high dimensional data by a product between time dependent weights and spatially dependent factors.
DSARF is different from the state-of-the-art techniques in that it parameterizes the weights in terms of a deep switching vector auto-regressive factorization.
Our experiments attest to the superior performance of DSARF in terms of long- and short-term prediction error, when compared with the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-10T20:15:59Z) - On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z) - Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z) - Multivariate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows [8.859284959951204]
Time series forecasting is fundamental to scientific and engineering problems.
Deep learning methods are well suited for this problem.
We show that it improves over the state-of-the-art for standard metrics on many real-world data sets.
arXiv Detail & Related papers (2020-02-14T16:16:51Z) - Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset.
To estimate the model parameters, an Alternating Least Squares algorithm is developed.
The proposed model is able to outperform benchmark models present in the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)