Robust High-Dimensional Regression with Coefficient Thresholding and its
Application to Imaging Data Analysis
- URL: http://arxiv.org/abs/2109.14856v1
- Date: Thu, 30 Sep 2021 05:29:54 GMT
- Title: Robust High-Dimensional Regression with Coefficient Thresholding and its
Application to Imaging Data Analysis
- Authors: Bingyuan Liu, Qi Zhang, Lingzhou Xue, Peter X.K. Song, and Jian Kang
- Abstract summary: It is of importance to develop statistical techniques to analyze high-dimensional data in the presence of both complex dependence and possible outliers in real-world imaging data.
- Score: 7.640041402805495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is of importance to develop statistical techniques to analyze
high-dimensional data in the presence of both complex dependence and possible
outliers in real-world applications such as imaging data analyses. We propose a
new robust high-dimensional regression with coefficient thresholding, in which
an efficient nonconvex estimation procedure is proposed through a thresholding
function and the robust Huber loss. The proposed regularization method accounts
for complex dependence structures in predictors and is robust against outliers
in outcomes. Theoretically, we analyze rigorously the landscape of the
population and empirical risk functions for the proposed method. The fine
landscape enables us to establish both statistical consistency and
computational convergence under the high-dimensional setting. The
finite-sample properties of the proposed method are examined by extensive
simulation studies. An illustration of real-world application concerns a
scalar-on-image regression analysis for an association of psychiatric disorder
measured by the general factor of psychopathology with features extracted from
the task functional magnetic resonance imaging data in the Adolescent Brain
Cognitive Development study.
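The two ingredients named in the abstract, a robust Huber loss and a thresholding function applied to the coefficients, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: the abstract does not say which thresholding function is used, so soft-thresholding stands in for it here, and the names `huber_loss`, `soft_threshold`, `empirical_risk`, `lam`, and `delta` are hypothetical.

```python
def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def soft_threshold(coef, lam):
    """Assumed stand-in thresholding: shrink toward zero, zeroing |coef| <= lam."""
    if coef > lam:
        return coef - lam
    if coef < -lam:
        return coef + lam
    return 0.0


def empirical_risk(beta, X, y, lam=0.1, delta=1.0):
    """Mean Huber loss of a linear model whose coefficients are thresholded first."""
    theta = [soft_threshold(b, lam) for b in beta]
    total = 0.0
    for row, y_i in zip(X, y):
        prediction = sum(x_ij * t_j for x_ij, t_j in zip(row, theta))
        total += huber_loss(y_i - prediction, delta)
    return total / len(y)
```

With `delta=1.0`, a residual of 10 contributes 9.5 to the Huber risk, whereas half its square would contribute 50. That linear growth is what limits the influence of outlying outcomes on the fit.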
Related papers
- Provable Risk-Sensitive Distributional Reinforcement Learning with
General Function Approximation [54.61816424792866]
We introduce a general framework on Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL), with static Lipschitz Risk Measures (LRM) and general function approximation.
We design two innovative meta-algorithms: RS-DisRL-M, a model-based strategy for model-based function approximation, and RS-DisRL-V, a model-free approach for general value function approximation.
arXiv Detail & Related papers (2024-02-28T08:43:18Z) - Inference of Dependency Knowledge Graph for Electronic Health Records [13.35941801610195]
We propose a framework for deriving a sparse knowledge graph based on the dynamic log-linear topic model.
Within this model, the KG embeddings are estimated by performing singular value decomposition on the empirical pointwise mutual information matrix.
We then establish entrywise normality for the KG low-rank estimator, enabling the recovery of sparse graph edges with controlled type I error.
arXiv Detail & Related papers (2023-12-25T04:45:36Z) - Understanding Augmentation-based Self-Supervised Representation Learning
via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z) - Errors-in-variables Fréchet Regression with Low-rank Covariate
Approximation [2.1756081703276]
Fréchet regression has emerged as a promising approach for regression analysis involving non-Euclidean response variables.
Our proposed framework combines the concepts of global Fréchet regression and principal component regression, aiming to improve the efficiency and accuracy of the regression estimator.
arXiv Detail & Related papers (2023-05-16T08:37:54Z) - Semiparametric Regression for Spatial Data via Deep Learning [17.63607438860882]
We use a sparsely connected deep neural network with rectified linear unit (ReLU) activation function to estimate the unknown regression function.
Our method can handle large data sets well owing to the gradient descent optimization algorithm.
arXiv Detail & Related papers (2023-01-10T01:55:55Z) - Scalable Intervention Target Estimation in Linear Models [52.60799340056917]
Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets.
This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets.
The proposed algorithm can be used to also update a given observational Markov equivalence class into the interventional Markov equivalence class.
arXiv Detail & Related papers (2021-11-15T03:16:56Z) - Differential privacy and robust statistics in high dimensions [49.50869296871643]
High-dimensional Propose-Test-Release (HPTR) builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism.
We show that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.
arXiv Detail & Related papers (2021-11-12T06:36:40Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - A Nonconvex Framework for Structured Dynamic Covariance Recovery [24.471814126358556]
We propose a flexible yet interpretable model for high-dimensional data with time-varying second order statistics.
Motivated by the literature, we quantify factorization and smooth temporal data.
We show that our approach outperforms existing baselines.
arXiv Detail & Related papers (2020-11-11T07:09:44Z) - Statistical control for spatio-temporal MEG/EEG source imaging with
desparsified multi-task Lasso [102.84915019938413]
Non-invasive techniques like magnetoencephalography (MEG) or electroencephalography (EEG) hold promise for detecting where and when brain regions activate.
The problem of source localization, or source imaging, however, poses a high-dimensional statistical inference challenge.
We propose an ensemble of clustered desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z) - Image Response Regression via Deep Neural Networks [4.646077947295938]
We propose a novel nonparametric approach in the framework of spatially varying coefficient models, where the spatially varying functions are estimated through deep neural networks.
A key idea in our approach is to treat the image voxels as spatial effective samples, which alleviates the limited sample size issue that haunts the majority of medical imaging studies.
We demonstrate the efficacy of the method through intensive simulations, and further illustrate its advantages in analyses of two functional magnetic resonance imaging datasets.
arXiv Detail & Related papers (2020-06-17T14:45:26Z)