Error Bounds for Generalized Group Sparsity
- URL: http://arxiv.org/abs/2008.04734v1
- Date: Sat, 8 Aug 2020 03:52:05 GMT
- Title: Error Bounds for Generalized Group Sparsity
- Authors: Xinyu Zhang
- Abstract summary: We state one universal theorem from which we obtain consistency and convergence-rate results for different forms of double sparsity regularization.
Our analysis identifies a generalization of the $\epsilon$-norm, which provides a dual formulation for our double sparsity regularization.
- Score: 8.557485533942337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In high-dimensional statistical inference, sparsity regularizations have
shown advantages in consistency and convergence rates for coefficient
estimation. We consider a generalized version of Sparse-Group Lasso which
captures both element-wise sparsity and group-wise sparsity simultaneously. We
state one universal theorem from which we obtain consistency and convergence-rate
results for different forms of double sparsity regularization. The universality
of the results lies in a generalization of various convergence rates for single
regularization cases, such as LASSO and group LASSO, as well as double
regularization cases, such as sparse-group LASSO. Our analysis identifies a
generalization of the $\epsilon$-norm, which provides a dual formulation for our
double sparsity regularization.
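For concreteness, the classical Sparse-Group Lasso penalty that this work generalizes combines an element-wise $\ell_1$ term with a group-wise $\ell_2$ term. The sketch below is a minimal illustration of that standard double sparsity penalty, not the paper's generalized formulation or its $\epsilon$-norm dual; the weights `lam1`, `lam2` and the group partition are illustrative.

```python
import numpy as np

def sparse_group_lasso_penalty(beta, groups, lam1=1.0, lam2=1.0):
    """Classical Sparse-Group Lasso penalty:
        lam1 * ||beta||_1 + lam2 * sum_g sqrt(|g|) * ||beta_g||_2
    `groups` is a list of index arrays partitioning the coefficients.
    The paper studies a generalized family of such double sparsity
    regularizers; this is only the standard special case.
    """
    l1_term = lam1 * np.sum(np.abs(beta))
    group_term = lam2 * sum(
        np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups
    )
    return l1_term + group_term

# Illustrative example: six coefficients split into two groups of three.
beta = np.array([0.0, 1.5, 0.0, -2.0, 0.0, 0.5])
groups = [np.arange(0, 3), np.arange(3, 6)]
print(sparse_group_lasso_penalty(beta, groups, lam1=0.1, lam2=0.2))
```

The $\ell_1$ term drives individual coefficients to zero while the group term can zero out whole groups at once, which is the element-wise plus group-wise sparsity the abstract refers to.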
Related papers
- Theoretical Limitations of Ensembles in the Age of Overparameterization [28.965084623872464]
Recent empirical studies find that modern ensembles of neural networks may not provide any inherent generalization advantage over single but larger neural networks.
We prove that infinite ensembles of overparameterized RF regressors become pointwise equivalent to (single) infinite-width RF regressors.
As a consequence, we can characterize the predictive variance amongst ensemble members, and demonstrate that it quantifies the expected effects of increasing capacity.
arXiv Detail & Related papers (2024-10-21T17:03:20Z) - Leveraging PAC-Bayes Theory and Gibbs Distributions for Generalization
Bounds with Complexity Measures [14.408127155170774]
We leverage the framework of disintegrated PAC-Bayes bounds to derive a general generalization bound instantiable with arbitrary complexity measures.
Our bound stands in probability jointly over the hypothesis and the learning sample, which allows the complexity to be adapted to the generalization gap.
arXiv Detail & Related papers (2024-02-19T10:15:11Z) - The non-overlapping statistical approximation to overlapping group lasso [4.197110761923662]
We propose a separable penalty as an approximation of the overlapping group lasso penalty.
Thanks to the separability, computing the regularization based on our penalty is substantially faster than for the overlapping group lasso.
We show that the estimator based on the proposed separable penalty is statistically equivalent to the one based on the overlapping group lasso penalty (a small numerical sketch of the overlapping penalty appears after this list).
arXiv Detail & Related papers (2022-11-16T21:21:41Z) - A PAC-Bayesian Generalization Bound for Equivariant Networks [15.27608414735815]
We derive norm-based PAC-Bayesian generalization bounds for equivariant networks.
The bound characterizes the impact of group size, as well as the multiplicity and degree of irreducible representations, on the generalization error.
In general, the bound indicates that using a larger group size in the model reduces the generalization error, which is substantiated by extensive numerical experiments.
arXiv Detail & Related papers (2022-10-24T12:07:03Z) - A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized
Linear Models [33.36787620121057]
We prove a new generalization bound that holds for any class of linear predictors in Gaussian space.
We use our finite-sample bound to directly recover the "optimistic rate" of Zhou et al. (2021).
We show that applications of our bound using localized Gaussian width will generally be sharp for empirical risk minimizers.
arXiv Detail & Related papers (2022-10-21T16:16:55Z) - On the Importance of Gradient Norm in PAC-Bayesian Bounds [92.82627080794491]
We propose a new generalization bound that exploits the contractivity of the log-Sobolev inequalities.
We empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
arXiv Detail & Related papers (2022-10-12T12:49:20Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Dimension Free Generalization Bounds for Non Linear Metric Learning [61.193693608166114]
We provide uniform generalization bounds for two regimes: the sparse regime and a non-sparse regime.
We show that by relying on a different, new property of the solutions, it is still possible to provide dimension free generalization guarantees.
arXiv Detail & Related papers (2021-02-07T14:47:00Z) - Posterior Differential Regularization with f-divergence for Improving
Model Robustness [95.05725916287376]
We focus on methods that regularize the model posterior difference between clean and noisy inputs.
We generalize the posterior differential regularization to the family of $f$-divergences.
Our experiments show that regularizing the posterior differential with $f$-divergences can substantially improve model robustness (a minimal KL-based sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-10-23T19:58:01Z)
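Regarding the entry above on the non-overlapping statistical approximation to the overlapping group lasso: the sketch below is a minimal illustration, not the separable penalty proposed in that paper. It only evaluates the overlapping group lasso penalty and highlights why shared coordinates make it non-separable, which is the computational obstacle a separable approximation is designed to avoid.

```python
import numpy as np

def overlapping_group_lasso_penalty(beta, groups, weights=None):
    """Overlapping group lasso penalty: sum_g w_g * ||beta_g||_2.
    Because groups may share coordinates, the penalty does not decompose
    into independent terms over disjoint blocks of `beta`, which is what
    makes a separable approximation computationally attractive.
    """
    if weights is None:
        weights = [np.sqrt(len(g)) for g in groups]
    return sum(w * np.linalg.norm(beta[g]) for w, g in zip(weights, groups))

# Coordinate 2 belongs to both groups, so it contributes to both norms.
beta = np.array([1.0, 0.0, 2.0, 0.0, -1.0])
groups = [np.array([0, 1, 2]), np.array([2, 3, 4])]
print(overlapping_group_lasso_penalty(beta, groups))
```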
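Regarding the entry above on posterior differential regularization with $f$-divergences: the sketch below is a framework-agnostic illustration of the general idea rather than the authors' implementation. It adds a divergence between the model's predictive distributions on a clean input and a perturbed input to the training loss; the KL instance, the weight `alpha`, and the example logits are all illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), one member of the f-divergence family."""
    p, q = np.clip(p, eps, None), np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def regularized_loss(task_loss, logits_clean, logits_noisy, alpha=1.0):
    """Task loss plus a posterior-differential term: a divergence between
    predictive distributions on clean and noisy versions of an input."""
    p_clean = softmax(logits_clean)
    p_noisy = softmax(logits_noisy)
    return task_loss + alpha * kl_divergence(p_clean, p_noisy)

# Illustrative numbers only: logits from the same model on a clean input
# and on a noise-perturbed copy of it.
logits_clean = np.array([2.0, 0.5, -1.0])
logits_noisy = np.array([1.6, 0.9, -0.8])
print(regularized_loss(0.35, logits_clean, logits_noisy, alpha=0.5))
```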