A Support Detection and Root Finding Approach for Learning
High-dimensional Generalized Linear Models
- URL: http://arxiv.org/abs/2001.05819v1
- Date: Thu, 16 Jan 2020 14:35:17 GMT
- Title: A Support Detection and Root Finding Approach for Learning
High-dimensional Generalized Linear Models
- Authors: Jian Huang, Yuling Jiao, Lican Kang, Jin Liu, Yanyan Liu, Xiliang Lu
- Abstract summary: We develop a support detection and root finding procedure to learn high-dimensional sparse generalized linear models.
We conduct simulations and real data analysis to illustrate the advantages of our proposed method over several existing methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature selection is important for modeling high-dimensional data, where the
number of variables can be much larger than the sample size. In this paper, we
develop a support detection and root finding procedure to learn
high-dimensional sparse generalized linear models, and denote this method by GSDAR.
Based on the KKT condition for $\ell_0$-penalized maximum likelihood
estimation, GSDAR generates a sequence of estimators iteratively.
Under some restricted invertibility conditions on the maximum likelihood
function and a sparsity assumption on the target coefficients, the error of the
proposed estimate decays exponentially to the optimal order. Moreover, the
oracle estimator can be recovered if the target signal is stronger than the
detectable level.
We conduct simulations and real data analysis to illustrate the advantages of
our proposed method over several existing methods, including Lasso and MCP.
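
The iteration described in the abstract (detect a support, then solve the likelihood equations restricted to it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the model is logistic regression, the sparsity level `T` is fixed in advance, the support is chosen as the `T` largest entries of `|beta + d|` with `d` the negative gradient, and the root-finding step uses plain Newton iterations. The function names are hypothetical, and this is not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme linear predictors.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def neg_loglik_grad(X, y, beta):
    """Gradient of the average logistic negative log-likelihood."""
    return X.T @ (sigmoid(X @ beta) - y) / X.shape[0]

def fit_on_support(X, y, support, n_newton=25, ridge=1e-8):
    """Root finding: solve the score equation restricted to `support` (Newton's method)."""
    Xs = X[:, support]
    n = X.shape[0]
    b = np.zeros(len(support))
    for _ in range(n_newton):
        prob = sigmoid(Xs @ b)
        grad = Xs.T @ (prob - y) / n
        weights = prob * (1.0 - prob)
        # Tiny ridge term (an assumption) keeps the Hessian invertible.
        hess = (Xs * weights[:, None]).T @ Xs / n + ridge * np.eye(len(support))
        step = np.linalg.solve(hess, grad)
        b -= step
        if np.linalg.norm(step) < 1e-10:
            break
    return b

def sdar_logistic_sketch(X, y, T, max_iter=20):
    """Alternate support detection and restricted root finding until the support repeats."""
    p = X.shape[1]
    beta = np.zeros(p)
    support = np.array([], dtype=int)
    for _ in range(max_iter):
        d = -neg_loglik_grad(X, y, beta)  # dual-like variable
        # Support detection: keep the T largest entries of |beta + d|.
        new_support = np.sort(np.argsort(-np.abs(beta + d))[:T])
        if np.array_equal(new_support, support):
            break
        support = new_support
        beta = np.zeros(p)
        beta[support] = fit_on_support(X, y, support)
    return beta, support
```

In this sketch the loop stops once two consecutive iterations select the same support, at which point the restricted fit no longer changes.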
Related papers
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - A Random Matrix Approach to Low-Multilinear-Rank Tensor Approximation [24.558241146742205]
We characterize the large-dimensional spectral behavior of the unfoldings of the data tensor and exhibit relevant signal-to-noise ratios governing the detectability of the principal directions of the signal.
The results allow accurate prediction of the reconstruction performance of truncated multilinear SVD (MLSVD) in the non-trivial regime.
arXiv Detail & Related papers (2024-02-05T16:38:30Z) - Conditional Korhunen-Lo\'{e}ve regression model with Basis Adaptation
for high-dimensional problems: uncertainty quantification and inverse
modeling [62.997667081978825]
We propose a methodology for improving the accuracy of surrogate models of the observable response of physical systems.
We apply the proposed methodology to constructing surrogate models via the Basis Adaptation (BA) method of the stationary hydraulic head response.
arXiv Detail & Related papers (2023-07-05T18:14:38Z) - Approximate Message Passing for the Matrix Tensor Product Model [8.206394018475708]
We propose and analyze an approximate message passing (AMP) algorithm for the matrix tensor product model.
Building upon a convergence theorem for non-separable functions, we prove a state evolution for non-separable functions.
We leverage this state evolution result to provide necessary and sufficient conditions for recovery of the signal of interest.
arXiv Detail & Related papers (2023-06-27T16:03:56Z) - Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made by using plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Generative Principal Component Analysis [47.03792476688768]
We study the problem of principal component analysis with generative modeling assumptions.
The key assumption is that the underlying signal lies near the range of an $L$-Lipschitz continuous generative model with bounded $k$-dimensional inputs.
We propose a quadratic estimator, and show that it enjoys a statistical rate of order $\sqrt{\frac{k \log L}{m}}$, where $m$ is the number of samples.
arXiv Detail & Related papers (2022-03-18T01:48:16Z) - Matrix optimization based Euclidean embedding with outliers [4.219333707563623]
We propose a matrix optimization based embedding model that can produce reliable embeddings and identify the outliers jointly.
Numerical experiments demonstrate that the matrix optimization-based model can produce configurations of high quality and successfully identify outliers even for large networks.
arXiv Detail & Related papers (2020-12-23T16:26:40Z) - Maximum sampled conditional likelihood for informative subsampling [4.708378681950648]
Subsampling is a computationally effective approach to extract information from massive data sets when computing resources are limited.
We propose to use the maximum sampled conditional likelihood estimator (MSCLE) based on the sampled data.
arXiv Detail & Related papers (2020-11-11T16:01:17Z) - Goal-directed Generation of Discrete Structures with Conditional
Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - Improved guarantees and a multiple-descent curve for Column Subset
Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)