Dual feature reduction for the sparse-group lasso and its adaptive variant
- URL: http://arxiv.org/abs/2405.17094v1
- Date: Mon, 27 May 2024 12:10:07 GMT
- Title: Dual feature reduction for the sparse-group lasso and its adaptive variant
- Authors: Fabio Feser, Marina Evangelou
- Abstract summary: The sparse-group lasso performs both variable and group selection, making simultaneous use of the strengths of the lasso and group lasso.
It has found widespread use in genetics, a field that regularly involves the analysis of high-dimensional data, due to its sparse-group penalty.
A novel dual feature reduction method, Dual Feature Reduction (DFR), is presented that uses strong screening rules for the sparse-group lasso and the adaptive sparse-group lasso to reduce their input space before optimization.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The sparse-group lasso performs both variable and group selection, making simultaneous use of the strengths of the lasso and group lasso. It has found widespread use in genetics, a field that regularly involves the analysis of high-dimensional data, due to its sparse-group penalty, which allows it to utilize grouping information. However, the sparse-group lasso can be computationally more expensive than both the lasso and group lasso, due to the added shrinkage complexity, and its additional hyper-parameter that needs tuning. In this paper a novel dual feature reduction method, Dual Feature Reduction (DFR), is presented that uses strong screening rules for the sparse-group lasso and the adaptive sparse-group lasso to reduce their input space before optimization. DFR applies two layers of screening and is based on the dual norms of the sparse-group lasso and adaptive sparse-group lasso. Through synthetic and real numerical studies, it is shown that the proposed feature reduction approach is able to drastically reduce the computational cost in many different scenarios.
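The penalty being screened can be stated concretely. A standard formulation of the sparse-group lasso objective (the common convention with group weights $\sqrt{p_g}$; the paper's exact weighting may differ) is

$$\min_{\beta \in \mathbb{R}^p} \; \frac{1}{2n}\lVert y - X\beta \rVert_2^2 \;+\; \lambda \alpha \lVert \beta \rVert_1 \;+\; \lambda (1-\alpha) \sum_{g=1}^{G} \sqrt{p_g}\,\lVert \beta^{(g)} \rVert_2,$$

where $\alpha \in [0,1]$ balances variable-level ($\ell_1$) and group-level ($\ell_2$) shrinkage, $p_g$ is the size of group $g$, and $\beta^{(g)}$ is the coefficient sub-vector of group $g$.

The two layers of screening mentioned in the abstract can be illustrated with a generic strong-rule style filter: before solving at a new value of $\lambda$, groups whose correlation with the current residual falls below a threshold are discarded, and a variable-level screen is then applied within the surviving groups. The sketch below only shows this general pattern; the thresholds, names, and parameters are illustrative assumptions and are not the paper's DFR conditions.

```python
import numpy as np

def two_layer_screen(X, residual, groups, lam_old, lam_new, alpha, n):
    """Generic two-layer (group, then variable) screening sketch.

    NOTE: illustrative only. The thresholds mimic the classic strong-rule
    form 2*lam_new - lam_old and are NOT the DFR rules from the paper.
    `groups` is a list of integer index arrays, one per group.
    """
    corr = X.T @ residual / n                  # per-variable gradient correlations
    thresh = 2 * lam_new - lam_old             # strong-rule style threshold
    keep = np.zeros(X.shape[1], dtype=bool)
    for idx in groups:
        # Layer 1: group-level screen on the l2 norm of the group's correlations
        if np.linalg.norm(corr[idx]) >= (1 - alpha) * np.sqrt(len(idx)) * thresh:
            # Layer 2: variable-level screen within the surviving groups
            keep[idx] = np.abs(corr[idx]) >= alpha * thresh
    return keep  # boolean mask of variables kept for the optimizer
```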
Related papers
- Using Constraints to Discover Sparse and Alternative Subgroup Descriptions [0.0]
Subgroup-discovery methods allow users to obtain simple descriptions of interesting regions in a dataset.
We focus on two types of constraints: First, we limit the number of features used in subgroup descriptions, making the latter sparse.
Second, we propose the novel optimization problem of finding alternative subgroup descriptions, which cover a similar set of data objects as a given subgroup but use different features.
arXiv Detail & Related papers (2024-06-03T15:10:01Z) - How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance [64.1656365676171]
Group imbalance has been a known problem in empirical risk minimization.
This paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance.
arXiv Detail & Related papers (2024-03-12T04:38:05Z) - Sparsity via Sparse Group $k$-max Regularization [22.05774771336432]
In this paper, we propose a novel and concise regularization, namely the sparse group $k$-max regularization.
We verify the effectiveness and flexibility of the proposed method through numerical experiments on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-02-13T14:41:28Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z) - The non-overlapping statistical approximation to overlapping group lasso [4.197110761923662]
We propose a separable penalty as an approximation of the overlapping group lasso penalty.
Thanks to the separability, the computation of regularization based on our penalty is substantially faster than that of the overlapping group lasso.
We show that the estimator based on the proposed separable penalty is statistically equivalent to the one based on the overlapping group lasso penalty.
arXiv Detail & Related papers (2022-11-16T21:21:41Z) - Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and the target labels can wrongly direct neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z) - Exclusive Group Lasso for Structured Variable Selection [10.86544864007391]
A structured variable selection problem is considered.
A composite norm can be properly designed to promote such exclusive group sparsity patterns.
An active set algorithm is proposed that builds the solution by including structure atoms into the estimated support.
arXiv Detail & Related papers (2021-08-23T16:55:13Z) - Just Train Twice: Improving Group Robustness without Training Group Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO), require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified.
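As a rough sketch of this two-stage recipe (the base classifier, upweight factor, and function name below are illustrative stand-ins; the paper trains neural networks for several epochs rather than a linear model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def just_train_twice(X_train, y_train, upweight=5.0):
    """Sketch of the two-stage JTT idea described above (illustrative only)."""
    # Stage 1: standard ERM model
    stage1 = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    misclassified = stage1.predict(X_train) != y_train  # identification set

    # Stage 2: retrain with the stage-1 errors upweighted
    weights = np.where(misclassified, upweight, 1.0)
    stage2 = LogisticRegression(max_iter=1000).fit(
        X_train, y_train, sample_weight=weights)
    return stage2
```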
arXiv Detail & Related papers (2021-07-19T17:52:32Z) - Feature Grouping and Sparse Principal Component Analysis [23.657672812296518]
Sparse Principal Component Analysis (SPCA) is widely used in data processing and dimension reduction.
FGSPCA allows loadings to belong to disjoint homogeneous groups, with sparsity as a special case.
arXiv Detail & Related papers (2021-06-25T15:08:39Z) - Robust Recursive Partitioning for Heterogeneous Treatment Effects with Uncertainty Quantification [84.53697297858146]
Subgroup analysis of treatment effects plays an important role in applications from medicine to public policy to recommender systems.
Most current methods of subgroup analysis begin with a particular algorithm for estimating individualized treatment effects (ITE).
This paper develops a new method for subgroup analysis, R2P, that addresses the weaknesses of existing approaches.
arXiv Detail & Related papers (2020-06-14T14:50:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.