Modeling High-Dimensional Dependent Data in the Presence of Many Explanatory Variables and Weak Signals
- URL: http://arxiv.org/abs/2412.04736v1
- Date: Fri, 06 Dec 2024 02:54:31 GMT
- Title: Modeling High-Dimensional Dependent Data in the Presence of Many Explanatory Variables and Weak Signals
- Authors: Zhaoxing Gao, Ruey S. Tsay
- Abstract summary: This article considers a novel and widely applicable approach to modeling high-dimensional dependent data when a large number of explanatory variables are available and the signal-to-noise ratio is low.
- Abstract: This article considers a novel and widely applicable approach to modeling high-dimensional dependent data when a large number of explanatory variables are available and the signal-to-noise ratio is low. We postulate that a $p$-dimensional response series is the sum of a linear regression with many observable explanatory variables and an error term driven by some latent common factors and an idiosyncratic noise. The common factors have dynamic dependence whereas the covariance matrix of the idiosyncratic noise can have diverging eigenvalues to handle the situation of low signal-to-noise ratio commonly encountered in applications. The regression coefficient matrix is estimated using penalized methods when the dimensions involved are high. We apply factor modeling to the regression residuals, employ a high-dimensional white noise testing procedure to determine the number of common factors, and adopt a projected Principal Component Analysis when the signal-to-noise ratio is low. We establish asymptotic properties of the proposed method, both for fixed and diverging numbers of regressors, as $p$ and the sample size $T$ approach infinity. Finally, we use simulations and empirical applications to demonstrate the efficacy of the proposed approach in finite samples.
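The approach in the abstract is a three-step pipeline: penalized regression to estimate the coefficient matrix, PCA on the residuals to extract the common factors, and a white-noise criterion to pick the number of factors. Below is a minimal Python sketch of that pipeline, assuming a T x p response matrix `Y` and a T x q regressor matrix `X`. A crude lag-1 autocorrelation cutoff stands in for the paper's high-dimensional white noise test, and the projected PCA refinement for low signal-to-noise settings is omitted; all function names are illustrative, not the authors' code.

```python
import numpy as np
from sklearn.linear_model import Lasso

def estimate_coefficients(Y, X, alpha=0.1):
    """Step 1: penalized (Lasso) regression of each response series on the regressors."""
    p, q = Y.shape[1], X.shape[1]
    B = np.zeros((p, q))
    for j in range(p):
        B[j] = Lasso(alpha=alpha, fit_intercept=False).fit(X, Y[:, j]).coef_
    return B

def extract_factors(E, r):
    """Step 2: classical PCA on the residual matrix E = Y - X B' to recover r factors."""
    T = E.shape[0]
    eigval, eigvec = np.linalg.eigh(E @ E.T / T)   # T x T cross-product matrix
    top = np.argsort(eigval)[::-1][:r]
    F = np.sqrt(T) * eigvec[:, top]                # T x r estimated factors
    L = E.T @ F / T                                # p x r estimated loadings
    return F, L

def max_lag1_autocorr(U):
    """Largest absolute lag-1 autocorrelation across series: a crude stand-in
    for the paper's high-dimensional white noise testing procedure."""
    num = np.abs((U[:-1] * U[1:]).mean(axis=0))
    return (num / (U.var(axis=0) + 1e-12)).max()

def select_num_factors(E, r_max=10):
    """Step 3: grow r until the idiosyncratic residual looks like white noise."""
    cutoff = 4 / np.sqrt(E.shape[0])               # rough lag-1 threshold
    for r in range(1, r_max + 1):
        F, L = extract_factors(E, r)
        U = E - F @ L.T
        if max_lag1_autocorr(U) < cutoff:
            return r, F, L
    return r_max, F, L

# usage on simulated data with 3 relevant regressors and 2 dynamic factors
rng = np.random.default_rng(0)
T, p, q = 200, 30, 50
X = rng.standard_normal((T, q))
B_true = np.zeros((p, q))
B_true[:, :3] = rng.standard_normal((p, 3))
factors = 0.1 * np.cumsum(rng.standard_normal((T, 2)), axis=0)  # dependent factors
Y = X @ B_true.T + factors @ rng.standard_normal((2, p)) + rng.standard_normal((T, p))

B_hat = estimate_coefficients(Y, X)
r_hat, F_hat, L_hat = select_num_factors(Y - X @ B_hat.T)
print("selected number of factors:", r_hat)
```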
Related papers
- Heteroscedastic Double Bayesian Elastic Net [1.1240642213359266]
We propose the Heteroscedastic Double Bayesian Elastic Net (HDBEN), a novel framework that jointly models the mean and log-variance.
Our approach simultaneously induces sparsity and grouping in the regression coefficients and variance parameters, capturing complex variance structures in the data.
arXiv Detail & Related papers (2025-02-04T05:44:19Z)
- RieszBoost: Gradient Boosting for Riesz Regression [49.737777802061984]
We propose a novel gradient boosting algorithm to directly estimate the Riesz representer without requiring its explicit analytical form.
We show that our algorithm performs on par with or better than indirect estimation techniques across a range of functionals.
arXiv Detail & Related papers (2025-01-08T23:04:32Z)
- Method-of-Moments Inference for GLMs and Doubly Robust Functionals under Proportional Asymptotics [30.324051162373973]
We consider the estimation of regression coefficients and signal-to-noise ratio in high-dimensional Generalized Linear Models (GLMs).
We derive Consistent and Asymptotically Normal (CAN) estimators of our targets of inference.
We complement our theoretical results with numerical experiments and comparisons with existing literature.
arXiv Detail & Related papers (2024-08-12T12:43:30Z)
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z)
- Approximate Message Passing for the Matrix Tensor Product Model [8.206394018475708]
We propose and analyze an approximate message passing (AMP) algorithm for the matrix tensor product model.
Building upon a convergence theorem for non-separable functions, we prove a corresponding state evolution for this model.
We leverage this state evolution result to provide necessary and sufficient conditions for recovery of the signal of interest.
arXiv Detail & Related papers (2023-06-27T16:03:56Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- Mixed Regression via Approximate Message Passing [16.91276351457051]
We study the problem of regression in a generalized linear model (GLM) with multiple signals and latent variables.
In mixed linear regression, each observation comes from one of $L$ signal vectors (regressors), but we do not know which one.
In max-affine regression, each observation comes from the maximum of $L$ affine functions, each defined via a different signal vector.
arXiv Detail & Related papers (2023-04-05T04:59:59Z)
- Quantized Low-Rank Multivariate Regression with Random Dithering [23.81618208119832]
Low-rank multivariate regression (LRMR) is an important statistical learning model.
We focus on the estimation of the underlying coefficient matrix.
We employ uniform quantization with random dithering, i.e., we add appropriate random noise to the data before quantization (a short sketch of this step appears after this list).
We propose the constrained Lasso and regularized Lasso estimators, and derive the non-asymptotic error bounds.
arXiv Detail & Related papers (2023-02-22T08:14:24Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients (a short sketch of this clipping scheme also appears after this list).
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
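Two of the entries above describe devices that are easy to illustrate in a few lines. First, the dithered quantization step from "Quantized Low-Rank Multivariate Regression with Random Dithering": a minimal sketch, assuming a uniform quantizer with step size `delta`; the helper name is hypothetical, not the paper's code.

```python
import numpy as np

def dithered_quantize(x, delta, rng):
    """Add uniform dither on [-delta/2, delta/2), then round to the delta-grid.

    The dither makes the quantization error zero-mean regardless of the
    signal, which is the property downstream estimators exploit."""
    dither = rng.uniform(-delta / 2, delta / 2, size=x.shape)
    return delta * np.round((x + dither) / delta)

# usage: the average quantization error is approximately zero
rng = np.random.default_rng(1)
x = rng.standard_normal(100_000)
q = dithered_quantize(x, delta=0.5, rng=rng)
print("mean quantization error:", np.round((q - x).mean(), 4))
```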
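Second, the clipped gradient descent named in "Heavy-tailed Streaming Statistical Estimation", sketched here for streaming mean estimation under heavy-tailed noise; the clipping level and step-size schedule are illustrative assumptions, not the paper's tuning.

```python
import numpy as np

def clipped_sgd_mean(samples, clip=5.0, lr=0.5):
    """Estimate a p-dimensional mean from a stream, clipping each gradient."""
    theta = np.zeros(samples.shape[1])
    for t, x in enumerate(samples, start=1):
        g = theta - x                      # gradient of 0.5 * ||theta - x||^2
        norm = np.linalg.norm(g)
        if norm > clip:
            g *= clip / norm               # shrink heavy-tailed gradients
        theta -= (lr / t) * g              # decaying step size
    return theta

# usage: heavy-tailed t-distributed stream with true mean zero
rng = np.random.default_rng(2)
stream = rng.standard_t(df=2.5, size=(10_000, 5))
print("estimate:", np.round(clipped_sgd_mean(stream), 3))  # roughly the zero vector
```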