Related papers: Proximal Operators of Sorted Nonconvex Penalties

URL: http://arxiv.org/abs/2506.15315v1
Date: Wed, 18 Jun 2025 09:44:13 GMT
Title: Proximal Operators of Sorted Nonconvex Penalties
Authors: Anne Gagneux, Mathurin Massias, Emmanuel Soubies
Abstract summary: We show how the Pool Adjacent Violators (PAV) algorithm can exactly compute the proximal operator. We also present new theoretical insights on the minimizers of the nonconvex proximal problem.
Score: 8.121488821215793
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This work studies the problem of sparse signal recovery with automatic grouping of variables. To this end, we investigate sorted nonsmooth penalties as a regularization approach for generalized linear models. We focus on a family of sorted nonconvex penalties which generalizes the Sorted L1 Norm (SLOPE). These penalties are designed to promote clustering of variables due to their sorted nature, while the nonconvexity reduces the shrinkage of coefficients. Our goal is to provide efficient ways to compute their proximal operator, enabling the use of popular proximal algorithms to solve composite optimization problems with this choice of sorted penalties. We distinguish between two classes of problems: the weakly convex case where computing the proximal operator remains a convex problem, and the nonconvex case where computing the proximal operator becomes a challenging nonconvex combinatorial problem. For the weakly convex case (e.g. sorted MCP and SCAD), we explain how the Pool Adjacent Violators (PAV) algorithm can exactly compute the proximal operator. For the nonconvex case (e.g. sorted Lq with q in ]0,1[), we show that a slight modification of this algorithm turns out to be remarkably efficient to tackle the computation of the proximal operator. We also present new theoretical insights on the minimizers of the nonconvex proximal problem. We demonstrate the practical interest of using such penalties on several experiments.
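As an illustration of the PAV-based computation mentioned in the abstract, here is a minimal sketch for the convex special case only, the sorted L1 norm (SLOPE), which the paper's penalties generalize. It is not the authors' implementation and does not cover the sorted MCP, SCAD, or Lq penalties. The function names `pav_nonincreasing` and `prox_sorted_l1` are hypothetical, and the sketch assumes a non-empty input and a non-negative, non-increasing weight vector `lam`.

```python
import numpy as np


def pav_nonincreasing(z):
    """Least-squares projection of z onto non-increasing sequences (PAV)."""
    sums, counts = [], []  # one (sum, size) pair per pooled block
    for value in z:
        sums.append(float(value))
        counts.append(1)
        # Pool adjacent blocks while the newest block mean exceeds the
        # previous one, i.e. while monotonicity is violated.
        while len(sums) > 1 and sums[-1] / counts[-1] > sums[-2] / counts[-2]:
            s, c = sums.pop(), counts.pop()
            sums[-1] += s
            counts[-1] += c
    return np.concatenate([np.full(c, s / c) for s, c in zip(sums, counts)])


def prox_sorted_l1(y, lam):
    """Prox of x -> sum_i lam[i] * |x|_(i), with lam non-increasing (assumed)."""
    y = np.asarray(y, dtype=float)
    lam = np.asarray(lam, dtype=float)
    order = np.argsort(-np.abs(y))            # sort magnitudes in decreasing order
    z = np.abs(y)[order] - lam                # shift by the sorted weights
    x_sorted = np.maximum(pav_nonincreasing(z), 0.0)  # project, then clip at 0
    x = np.empty_like(y)
    x[order] = x_sorted                       # undo the sort
    return np.sign(y) * x                     # restore the signs


clustered = prox_sorted_l1([3.0, -2.9, 0.5], [1.0, 0.5, 0.1])
```

In this example the two largest magnitudes are pooled into a single block, so the output has two entries of equal magnitude, which is the clustering effect the abstract describes. With constant weights the function reduces to ordinary soft thresholding.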

Related papers

Single-loop Algorithms for Stochastic Non-convex Optimization with Weakly-Convex Constraints [49.76332265680669]
This paper examines a crucial subset of problems where both the objective and constraint functions are weakly convex. Existing methods often face limitations, including slow convergence rates or reliance on double-loop designs. We introduce a novel single-loop penalty-based algorithm to overcome these challenges.
arXiv Detail & Related papers (2025-04-21T17:15:48Z)
Pathwise optimization for bridge-type estimators and its applications [49.1574468325115]
Pathwise methods make it possible to efficiently compute the full path for penalized estimators. We apply these algorithms to the penalized estimation of processes observed at discrete times.
arXiv Detail & Related papers (2024-12-05T10:38:29Z)
Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties. We show that TR and ARC methods can simultaneously allow for inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
Learning distributed representations with efficient SoftMax normalization [3.8673630752805437]
We propose a linear-time approximation to compute the normalization constants of $\mathrm{SoftMax}(XY^T)$ for embedding vectors with bounded norms. We show on some pre-trained embedding datasets that the proposed estimation method achieves higher or comparable accuracy with competing methods. The proposed algorithm is interpretable and easily adapted to arbitrary embedding problems.
arXiv Detail & Related papers (2023-03-30T15:48:26Z)
Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective [21.6146047576295]
The top-k operator returns a sparse vector, where the non-zero values correspond to the k largest values of the input. We view the top-k operator as a linear program over the permutahedron, the convex hull of permutations. Our framework is significantly more general than the existing one and makes it possible, for example, to express top-k operators that select values in magnitude.
arXiv Detail & Related papers (2023-02-02T21:32:13Z)
Accelerated First-Order Optimization under Nonlinear Constraints [61.98523595657983]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms. An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization. We provide novel excess risk bounds, both in expectation and with high probability. Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z)
Alternating Mahalanobis Distance Minimization for Stable and Accurate CP Decomposition [4.847980206213335]
We introduce a new formulation for deriving singular values and vectors of a tensor by considering the critical points of a function different from what is used in the previous work. We show that a subsweep of this algorithm can achieve a superlinear convergence rate for exact CPD with known rank. We then view the algorithm as optimizing a Mahalanobis distance with respect to each factor with ground metric dependent on the other factors.
arXiv Detail & Related papers (2022-04-14T19:56:36Z)
Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning of problem coefficients with an optimization problem that uses the predicted coefficients. We show how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piece-wise linear function. We propose a novel divide and conquer algorithm to tackle optimization problems without this restriction and predict its coefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z)
Hybrid Variance-Reduced SGD Algorithms For Nonconvex-Concave Minimax Problems [26.24895953952318]
We develop an algorithm to solve a class of nonconvex-concave minimax problems. It can also work with either single or two mini-batch derivatives.
arXiv Detail & Related papers (2020-06-27T03:05:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.