Related papers: A Support Detection and Root Finding Approach for Learning High-dimensional Generalized Linear Models

URL: http://arxiv.org/abs/2001.05819v1
Date: Thu, 16 Jan 2020 14:35:17 GMT
Title: A Support Detection and Root Finding Approach for Learning High-dimensional Generalized Linear Models
Authors: Jian Huang, Yuling Jiao, Lican Kang, Jin Liu, Yanyan Liu, Xiliang Lu
Abstract summary: We develop a support detection and root finding procedure to learn high-dimensional sparse generalized linear models. We conduct simulations and real data analysis to illustrate the advantages of our proposed method over several existing methods.
Score: 10.103666349083165
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Feature selection is important for modeling high-dimensional data, where the number of variables can be much larger than the sample size. In this paper, we develop a support detection and root finding procedure to learn high-dimensional sparse generalized linear models and denote this method by GSDAR. Based on the KKT condition for $\ell_0$-penalized maximum likelihood estimation, GSDAR generates a sequence of estimators iteratively. Under some restricted invertibility conditions on the maximum likelihood function and a sparsity assumption on the target coefficients, the errors of the proposed estimate decay exponentially to the optimal order. Moreover, the oracle estimator can be recovered if the target signal is stronger than the detectable level. We conduct simulations and real data analysis to illustrate the advantages of our proposed method over several existing methods, including Lasso and MCP.

Related papers

Robust Spatiotemporal Epidemic Modeling with Integrated Adaptive Outlier Detection [7.5504472850103435]
In epidemic modeling, outliers can distort parameter estimation and lead to misguided public health decisions. We introduce a robust generalized additive model (RST-GAM) to mitigate this distortion. We demonstrate the practical utility of RST-GAM by analyzing county-level COVID-19 infection data in the United States.
arXiv Detail & Related papers (2025-07-12T19:23:25Z)
Bayesian Estimation and Tuning-Free Rank Detection for Probability Mass Function Tensors [17.640500920466984]
This paper presents a novel framework for estimating the joint PMF and automatically inferring its rank from observed data. We derive a deterministic solution based on variational inference (VI) to approximate the posterior distributions of various model parameters. Additionally, we develop a scalable version of the VI-based approach by leveraging stochastic variational inference (SVI). Experiments involving both synthetic data and real movie recommendation data illustrate the advantages of our VI and SVI-based methods in terms of estimation accuracy, automatic rank detection, and computational efficiency.
arXiv Detail & Related papers (2024-10-08T20:07:49Z)
Maximum a Posteriori Estimation for Linear Structural Dynamics Models Using Bayesian Optimization with Rational Polynomial Chaos Expansions [0.01578888899297715]
We propose an extension to an existing sparse Bayesian learning approach for MAP estimation. We introduce a Bayesian optimization approach, which allows the experimental design to be enriched adaptively. By combining the sparsity-inducing learning procedure with the experimental design, we effectively reduce the number of model evaluations.
arXiv Detail & Related papers (2024-08-07T06:11:37Z)
Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation. In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model. We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
A Random Matrix Approach to Low-Multilinear-Rank Tensor Approximation [24.558241146742205]
We characterize the large-dimensional spectral behavior of the unfoldings of the data tensor and exhibit relevant signal-to-noise ratios governing the detectability of the principal directions of the signal. The results allow accurate prediction of the reconstruction performance of truncated multilinear SVD (MLSVD) in the non-trivial regime.
arXiv Detail & Related papers (2024-02-05T16:38:30Z)
Conditional Korhunen-Loéve regression model with Basis Adaptation for high-dimensional problems: uncertainty quantification and inverse modeling [62.997667081978825]
We propose a methodology for improving the accuracy of surrogate models of the observable response of physical systems. We apply the proposed methodology to constructing surrogate models via the Basis Adaptation (BA) method of the stationary hydraulic head response.
arXiv Detail & Related papers (2023-07-05T18:14:38Z)
Approximate Message Passing for the Matrix Tensor Product Model [8.206394018475708]
We propose and analyze an approximate message passing (AMP) algorithm for the matrix tensor product model. Building upon a convergence theorem for non-separable functions, we prove a state evolution for non-separable functions. We leverage this state evolution result to provide necessary and sufficient conditions for recovery of the signal of interest.
arXiv Detail & Related papers (2023-06-27T16:03:56Z)
Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions. Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation. In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression. Minimal prior assumptions on the parameters are made through the use of plug-in empirical Bayes estimates. The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
Generative Principal Component Analysis [47.03792476688768]
We study the problem of principal component analysis with generative modeling assumptions. The key assumption is that the underlying signal lies near the range of an $L$-Lipschitz continuous generative model with bounded $k$-dimensional inputs. We propose a quadratic estimator, and show that it enjoys a statistical rate of order $\sqrt{\frac{k \log L}{m}}$, where $m$ is the number of samples.
arXiv Detail & Related papers (2022-03-18T01:48:16Z)
Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward. We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees. Our approach leads to significantly better bounds for datasets with known rates of singular value decay. We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.