Sharp Trade-Offs in High-Dimensional Inference via 2-Level SLOPE
- URL: http://arxiv.org/abs/2507.09110v1
- Date: Sat, 12 Jul 2025 01:57:10 GMT
- Title: Sharp Trade-Offs in High-Dimensional Inference via 2-Level SLOPE
- Authors: Zhiqi Bu, Jason M. Klusowski, Cynthia Rush, Ruijia Wu
- Abstract summary: We show that 2-level SLOPE offers a robust, scalable alternative to both LASSO and general SLOPE.
- Score: 20.580487867158364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Among techniques for high-dimensional linear regression, Sorted L-One Penalized Estimation (SLOPE) generalizes the LASSO via an adaptive $\ell_1$ regularization that applies heavier penalties to larger coefficients in the model. To achieve such adaptivity, SLOPE requires the specification of a complex hierarchy of penalties, i.e., a monotone penalty sequence in $\mathbb{R}^p$, in contrast to a single penalty scalar for LASSO. Tuning this sequence when $p$ is large poses a challenge, as brute-force search over a grid of values is computationally prohibitive. In this work, we study the 2-level SLOPE, an important subclass of SLOPE, with only three hyperparameters. We demonstrate both empirically and analytically that 2-level SLOPE not only preserves the advantages of general SLOPE -- such as improved mean squared error and overcoming the Donoho-Tanner power limit -- but also exhibits computational benefits by reducing the penalty hyperparameter space. In particular, we prove that 2-level SLOPE admits a sharp, theoretically tight characterization of the trade-off between true positive proportion (TPP) and false discovery proportion (FDP), in contrast with general SLOPE, where only upper and lower bounds are known. Empirical evaluations further underscore the effectiveness of 2-level SLOPE in settings where predictors exhibit high correlation, when the noise is large, or when the underlying signal is not sparse. Our results suggest that 2-level SLOPE offers a robust, scalable alternative to both LASSO and general SLOPE, making it particularly suited for practical high-dimensional data analysis.
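To make the penalty structure concrete, the sketch below is a minimal illustration (not the authors' implementation) of the sorted-$\ell_1$ penalty, one plausible parameterization of a 2-level penalty sequence by three numbers (two penalty levels and the fraction of coordinates receiving the larger one), and the TPP/FDP metrics that define the trade-off studied in the paper; the function names and the exact parameterization are illustrative assumptions.

```python
# Illustrative sketch only: sorted-L1 (SLOPE) penalty, a 2-level penalty
# sequence, and the TPP/FDP metrics referenced in the abstract.
import numpy as np

def slope_penalty(beta, lam):
    """Sorted-L1 penalty: sum_i lam_i * |beta|_(i), where lam is non-increasing
    and |beta|_(1) >= |beta|_(2) >= ... are the coefficients sorted by magnitude."""
    abs_sorted = np.sort(np.abs(beta))[::-1]      # magnitudes in decreasing order
    lam_sorted = np.sort(lam)[::-1]               # enforce non-increasing penalties
    return float(np.dot(lam_sorted, abs_sorted))

def two_level_sequence(p, lam_high, lam_low, s):
    """Hypothetical 2-level sequence: the largest ceil(s*p) penalties equal
    lam_high, the rest equal lam_low (lam_high >= lam_low >= 0), so only three
    hyperparameters are tuned instead of p of them."""
    k = int(np.ceil(s * p))
    return np.concatenate([np.full(k, lam_high), np.full(p - k, lam_low)])

def tpp_fdp(beta_hat, beta_true, tol=0.0):
    """True positive proportion and false discovery proportion of a fit."""
    discovered = np.abs(beta_hat) > tol
    signal = beta_true != 0
    tpp = discovered[signal].sum() / max(signal.sum(), 1)
    fdp = discovered[~signal].sum() / max(discovered.sum(), 1)
    return tpp, fdp

# Toy usage: a 10-dimensional coefficient vector and a 2-level sequence.
p = 10
lam = two_level_sequence(p, lam_high=2.0, lam_low=0.5, s=0.3)
beta = np.array([3.0, -1.5, 0.0, 0.2, 0, 0, 0, 0, 0, 0])
print(slope_penalty(beta, lam), tpp_fdp(beta, beta))
```

Setting lam_high equal to lam_low recovers a single-penalty, LASSO-style sequence, which is why the 2-level family sits between LASSO and fully general SLOPE.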
Related papers
- Stochastic Weakly Convex Optimization Under Heavy-Tailed Noises [55.43924214633558]
In this paper, we focus on two types of noise: sub-Weibull noise and SsBC noise. Under these two noise assumptions, the in-expectation and high-probability convergence of SFOMs have been studied in the contexts of convex optimization and smooth optimization.
arXiv Detail & Related papers (2025-07-17T16:48:45Z) - NDCG-Consistent Softmax Approximation with Accelerated Convergence [67.10365329542365]
We propose novel loss formulations that align directly with ranking metrics. We integrate the proposed RG losses with the highly efficient Alternating Least Squares (ALS) optimization method. Empirical evaluations on real-world datasets demonstrate that our approach achieves comparable or superior ranking performance.
arXiv Detail & Related papers (2025-06-11T06:59:17Z) - Convergence of Clipped-SGD for Convex $(L_0,L_1)$-Smooth Optimization with Heavy-Tailed Noise [60.17850744118546]
First-order methods with clipping, such as Clip-SGD, exhibit stronger convergence guarantees than SGD under the $(L_0, L_1)$-smoothness assumption. We establish the first high-probability convergence bounds for Clip-SGD applied to convex $(L_0, L_1)$-smooth optimization with heavy-tailed noise.
arXiv Detail & Related papers (2025-05-27T07:23:42Z) - MaxSup: Overcoming Representation Collapse in Label Smoothing [52.66247931969715]
Label Smoothing (LS) is widely adopted to reduce overconfidence in neural network predictions. LS compacts feature representations into overly tight clusters, diluting intra-class diversity. We propose Max Suppression (MaxSup), which applies uniform regularization to both correct and incorrect predictions.
arXiv Detail & Related papers (2025-02-18T20:10:34Z) - Mean-Field Langevin Dynamics for Signed Measures via a Bilevel Approach [4.577104493960515]
Mean-field Langevin dynamics (MFLD) is a class of interacting particle methods that tackle convex optimization over probability measures on a manifold. We show how to extend the MFLD framework to convex optimization problems over signed measures.
arXiv Detail & Related papers (2024-06-24T18:15:12Z) - A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models [33.36787620121057]
We prove a new generalization bound that holds for any class of linear predictors in Gaussian space.
We use our finite-sample bound to directly recover the "optimistic rate" of Zhou et al. (2021).
We show that applying our bound to generalization, via localized Gaussian widths, will generally be sharp for empirical risk minimizers.
arXiv Detail & Related papers (2022-10-21T16:16:55Z) - High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails [55.561406656549686]
We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails.
We show that a combination of gradient clipping, momentum, and normalized gradient descent converges to critical points with high probability and with the best-known iteration complexity for smooth losses.
arXiv Detail & Related papers (2021-06-28T00:17:01Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Characterizing the SLOPE Trade-off: A Variational Perspective and the Donoho-Tanner Limit [29.344264789740894]
Sorted $\ell_1$ regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems.
We show how this technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion (TPP) or, equivalently, between measures of type I error and power.
arXiv Detail & Related papers (2021-05-27T16:56:42Z) - Efficient Designs of SLOPE Penalty Sequences in Finite Dimension [5.787117733071415]
In linear regression, SLOPE is a new convex analysis method that generalizes the LASSO via the sorted $\ell_1$ penalty.
In this paper, we propose two efficient algorithms to design the possibly high-dimensional SLOPE penalty.
arXiv Detail & Related papers (2021-02-14T18:06:56Z)