The non-overlapping statistical approximation to overlapping group lasso
- URL: http://arxiv.org/abs/2211.09221v3
- Date: Tue, 20 Feb 2024 23:58:24 GMT
- Title: The non-overlapping statistical approximation to overlapping group lasso
- Authors: Mingyu Qi, Tianxi Li
- Abstract summary: We propose a separable penalty as an approximation of the overlapping group lasso penalty.
Thanks to the separability, the computation of regularization based on our penalty is substantially faster than that of the overlapping group lasso.
We show that the estimator based on the proposed separable penalty is statistically equivalent to the one based on the overlapping group lasso penalty.
- Score: 4.197110761923662
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Group lasso is a commonly used regularization method in statistical learning
in which parameters are eliminated from the model according to predefined
groups. However, when the groups overlap, optimizing the group lasso penalized
objective can be time-consuming on large-scale problems because of the
non-separability induced by the overlapping groups. This bottleneck has
seriously limited the application of overlapping group lasso regularization in
many modern problems, such as gene pathway selection and graphical model
estimation. In this paper, we propose a separable penalty as an approximation
of the overlapping group lasso penalty. Thanks to the separability, the
computation of regularization based on our penalty is substantially faster than
that of the overlapping group lasso, especially for large-scale and
high-dimensional problems. We show that the penalty is the tightest separable
relaxation of the overlapping group lasso norm within the family of
$\ell_{q_1}/\ell_{q_2}$ norms. Moreover, we show that the estimator based on
the proposed separable penalty is statistically equivalent to the one based on
the overlapping group lasso penalty with respect to their error bounds and the
rate-optimal performance under the squared loss. We demonstrate the faster
computational time and statistical equivalence of our method compared with the
overlapping group lasso in simulation examples and a classification problem of
cancer tumors based on gene expression and multiple gene pathways.
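As a rough, deliberately simplified illustration of the separability issue described in the abstract, the Python sketch below contrasts the non-separable overlapping group lasso penalty with a coordinate-wise surrogate. The surrogate's coverage-count weights are an illustrative assumption made only for this sketch; they are not the tightest $\ell_{q_1}/\ell_{q_2}$ relaxation constructed in the paper.

```python
# Illustrative sketch (not the paper's exact construction): contrast the
# non-separable overlapping group lasso penalty with a separable,
# per-coordinate surrogate. The surrogate weights below are an assumption
# chosen only for demonstration.
import numpy as np

def overlapping_group_lasso_penalty(beta, groups):
    """Sum of Euclidean norms over (possibly overlapping) index groups.

    Non-separable whenever groups share coordinates, which is what makes
    the penalized optimization expensive at scale.
    """
    return sum(np.linalg.norm(beta[list(g)]) for g in groups)

def separable_surrogate_penalty(beta, groups):
    """A separable (coordinate-wise) stand-in: weight |beta_j| by the number
    of groups covering coordinate j. This only illustrates separability; it
    is not the tightest relaxation derived in the paper.
    """
    counts = np.zeros(beta.size)
    for g in groups:
        counts[list(g)] += 1.0
    return float(np.sum(counts * np.abs(beta)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta = rng.normal(size=6)
    groups = [{0, 1, 2}, {2, 3, 4}, {4, 5}]  # overlapping groups
    print("overlapping penalty:", overlapping_group_lasso_penalty(beta, groups))
    print("separable surrogate:", separable_surrogate_penalty(beta, groups))
```

Because the surrogate decomposes over coordinates, it can be handled by simple coordinate-wise (proximal) updates, which is the source of the computational speed-up that motivates the paper.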
Related papers
- Maximum a Posteriori Inference for Factor Graphs via Benders' Decomposition [0.38233569758620056]
We present a method for maximum a posteriori (MAP) inference in general Bayesian factor models.
We derive MAP estimation algorithms for the Bayesian Gaussian mixture model and latent Dirichlet allocation.
arXiv Detail & Related papers (2024-10-24T19:57:56Z) - Dual feature reduction for the sparse-group lasso and its adaptive variant [0.49109372384514843]
The sparse-group lasso performs both variable and group selection, making simultaneous use of the strengths of the lasso and group lasso (its standard penalty is written out after this list).
Thanks to its sparse-group penalty, it has found widespread use in genetics, a field that regularly involves the analysis of high-dimensional data.
A novel method, Dual Feature Reduction (DFR), is presented that uses strong screening rules for the sparse-group lasso and the adaptive sparse-group lasso to reduce their input space before optimization.
arXiv Detail & Related papers (2024-05-27T12:10:07Z) - Strong screening rules for group-based SLOPE models [0.49109372384514843]
We develop screening rules for group-based Sorted L-One Penalized Estimation (SLOPE) models.
The developed rules are applicable to the wider family of group-based models, including OSCAR.
arXiv Detail & Related papers (2024-05-24T08:48:06Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups; its standard objective is written out after this list.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as a subroutine.
We show through extensive simulations on synthetic and real-world datasets that the proposed algorithm, BOC, matches the lower bound asymptotically and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z) - Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z) - Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - Fast marginal likelihood estimation of penalties for group-adaptive elastic net [0.0]
Group-adaptive elastic net penalisation learns from co-data to improve prediction.
We present a fast method for marginal likelihood estimation of group-adaptive elastic net penalties for generalised linear models.
We demonstrate the method in a model-based simulation study and an application to cancer genomics.
arXiv Detail & Related papers (2021-01-11T13:30:24Z) - Error Bounds for Generalized Group Sparsity [8.557485533942337]
We state one universal theorem whose proof yields consistency and convergence-rate results for different forms of double sparsity regularization.
Our analysis identifies a generalized $\epsilon$-norm, which provides a dual formulation for our double sparsity regularization.
arXiv Detail & Related papers (2020-08-08T03:52:05Z)
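For the sparse-group lasso entry above, a hedged sketch of the penalty as it is commonly written (the cited DFR paper's exact weighting may differ): with a mixing parameter $\alpha \in [0,1]$, group sizes $p_g$, and within-group coefficients $\beta^{(g)}$ over non-overlapping groups $g = 1, \dots, G$, the penalty is $\lambda \big( \alpha \, \|\beta\|_1 + (1-\alpha) \sum_{g=1}^{G} \sqrt{p_g}\, \|\beta^{(g)}\|_2 \big)$. Setting $\alpha = 1$ recovers the lasso and $\alpha = 0$ recovers the (non-overlapping) group lasso, which is how the method combines variable and group selection.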
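Similarly, for the group DRO entry above, the standard group DRO objective in its common formulation (the Q-Diversity reformulation itself is described in the cited paper) minimizes the worst-case expected loss over pre-defined groups $g = 1, \dots, G$ with group distributions $P_g$: $\min_{\theta}\ \max_{g \in \{1,\dots,G\}}\ \mathbb{E}_{(x,y) \sim P_g}\big[\ell(\theta; (x,y))\big]$.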
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.