Coordinate Descent for SLOPE
- URL: http://arxiv.org/abs/2210.14780v1
- Date: Wed, 26 Oct 2022 15:20:30 GMT
- Title: Coordinate Descent for SLOPE
- Authors: Johan Larsson, Quentin Klopfenstein, Mathurin Massias, Jonas Wallin
- Abstract summary: Sorted L-One Penalized Estimation (SLOPE) is a generalization of the lasso with appealing statistical properties.
Current software packages that fit SLOPE rely on algorithms that perform poorly in high dimensions.
We propose a new fast algorithm to solve the SLOPE optimization problem, which combines proximal gradient descent and proximal coordinate descent steps.
- Score: 6.838073951329198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The lasso is the most famous sparse regression and feature selection method.
One reason for its popularity is the speed at which the underlying optimization
problem can be solved. Sorted L-One Penalized Estimation (SLOPE) is a
generalization of the lasso with appealing statistical properties. In spite of
this, the method has not yet reached widespread interest. A major reason for
this is that current software packages that fit SLOPE rely on algorithms that
perform poorly in high dimensions. To tackle this issue, we propose a new fast
algorithm to solve the SLOPE optimization problem, which combines proximal
gradient descent and proximal coordinate descent steps. We provide new results
on the directional derivative of the SLOPE penalty and its related SLOPE
thresholding operator, as well as provide convergence guarantees for our
proposed solver. In extensive benchmarks on simulated and real data, we show
that our method outperforms a long list of competing algorithms.
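As context for the optimization problem described in the abstract, the following is a minimal NumPy sketch, assuming a least-squares data fit: the sorted-L1 (SLOPE) proximal operator computed with the classical stack-based pool-adjacent-violators scheme of Bogdan et al. (2015), wrapped in a plain proximal gradient loop. It is not the hybrid proximal gradient/coordinate descent solver proposed in the paper; the function names and defaults are illustrative.

```python
import numpy as np


def prox_sorted_l1(y, lambdas):
    """Prox of the sorted-L1 (SLOPE) penalty.

    Solves argmin_x 0.5 * ||x - y||^2 + sum_j lambdas[j] * |x|_(j),
    where |x|_(1) >= |x|_(2) >= ... and lambdas is nonincreasing and
    nonnegative. Stack-based pool-adjacent-violators scheme
    (Bogdan et al., 2015).
    """
    sign = np.sign(y)
    y_abs = np.abs(y)
    order = np.argsort(y_abs)[::-1]     # sort |y| in decreasing order
    z = y_abs[order] - lambdas          # block values before averaging

    start, end, block_sum = [], [], []
    for i, zi in enumerate(z):
        start.append(i)
        end.append(i)
        block_sum.append(zi)
        # merge adjacent blocks while their averages violate monotonicity
        while len(block_sum) > 1 and (
            block_sum[-2] / (end[-2] - start[-2] + 1)
            <= block_sum[-1] / (end[-1] - start[-1] + 1)
        ):
            block_sum[-2] += block_sum[-1]
            end[-2] = end[-1]
            start.pop()
            end.pop()
            block_sum.pop()

    x_sorted = np.zeros_like(z)
    for a, b, s in zip(start, end, block_sum):
        x_sorted[a:b + 1] = max(s / (b - a + 1), 0.0)   # clip negative means at zero

    x = np.zeros_like(y, dtype=float)
    x[order] = x_sorted                 # undo the sorting
    return sign * x


def slope_pgd(X, y, lambdas, n_iter=500):
    """Plain proximal gradient descent for 0.5 * ||y - X @ beta||^2 + SLOPE penalty."""
    beta = np.zeros(X.shape[1])
    L = np.linalg.norm(X, ord=2) ** 2   # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        beta = prox_sorted_l1(beta - grad / L, lambdas / L)
    return beta
```

A coordinate-wise method, as proposed in the paper, avoids the full-gradient step above, which is the main source of the reported speedups in high dimensions.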
Related papers
- Sparsity-Constraint Optimization via Splicing Iteration [1.3622424109977902]
We develop an algorithm named Sparsity-Constraint Optimization via sPlicing itEration (SCOPE).
SCOPE converges effectively without tuning parameters.
We apply SCOPE to solve quadratic optimization, learn sparse classifiers, and recover sparse Markov networks for binary variables (a generic sparsity-constrained sketch follows this entry).
Our open-source Python package skscope based on C++ implementation is publicly available on GitHub.
arXiv Detail & Related papers (2024-06-17T18:34:51Z)
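To make the sparsity-constrained setting of the entry above concrete, here is a minimal sketch that solves the same kind of problem with plain iterative hard thresholding. This is a classical baseline, not SCOPE's splicing iteration, and the function names and defaults are illustrative.

```python
import numpy as np


def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-s:]
    out[keep] = x[keep]
    return out


def iht_least_squares(X, y, s, n_iter=200):
    """Iterative hard thresholding for min 0.5 * ||y - X @ beta||^2 s.t. ||beta||_0 <= s."""
    beta = np.zeros(X.shape[1])
    step = 1.0 / np.linalg.norm(X, ord=2) ** 2   # inverse Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        beta = hard_threshold(beta - step * grad, s)
    return beta
```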
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- Strictly Low Rank Constraint Optimization -- An Asymptotically $\mathcal{O}(\frac{1}{t^2})$ Method [5.770309971945476]
We propose a class of non-convex and non-smooth problems with rank regularization to promote sparsity in the optimal solution.
We show that our algorithms are able to achieve a convergence rate of $\mathcal{O}(\frac{1}{t^2})$ for the singular values, which is exactly the same as Nesterov's optimal convergence rate for first-order methods on smooth convex problems.
arXiv Detail & Related papers (2023-07-04T16:55:41Z)
- A Survey of Numerical Algorithms that can Solve the Lasso Problems [2.538209532048867]
In statistics, the least absolute shrinkage and selection operator (Lasso) is a regression method that performs both variable selection and regularization.
We summarize five representative algorithms to optimize the objective function in Lasso; a coordinate descent sketch follows this entry.
arXiv Detail & Related papers (2023-03-07T01:12:59Z)
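As an illustration of the Lasso objective summarized in the survey entry above, here is a minimal sketch of one classical solver, cyclic coordinate descent with soft thresholding. It is a generic textbook implementation, not code from the survey; the function names and iteration counts are illustrative.

```python
import numpy as np


def soft_threshold(x, t):
    """Soft-thresholding operator, the prox of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for 0.5 * ||y - X @ beta||^2 + lam * ||beta||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    residual = y - X @ beta          # kept up to date as coordinates change
    col_sq = (X ** 2).sum(axis=0)    # ||X_j||^2, per-coordinate curvature
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # inner product with the partial residual that excludes coordinate j
            rho = X[:, j] @ residual + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (beta[j] - new_bj)   # update residual in place
            beta[j] = new_bj
    return beta
```

The cheap residual update is what makes coordinate descent fast for the lasso; the SLOPE paper above extends this idea to the sorted-L1 penalty, where the coordinate-wise subproblem is harder.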
- The Curse of Unrolling: Rate of Differentiating Through Optimization [35.327233435055305]
Unrolled differentiation approximates the solution using an iterative solver and differentiates it through the computational path.
We show that we can either 1) choose a large learning rate, leading to fast convergence, but accept that the algorithm may have an arbitrarily long burn-in phase, or 2) choose a smaller learning rate, leading to an immediate but slower convergence.
arXiv Detail & Related papers (2022-09-27T09:27:29Z)
- Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-convex optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to stationary points yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z)
- An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
- Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning.
Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms.
We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient methods.
arXiv Detail & Related papers (2020-07-01T18:43:32Z)
- Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm; a generic zero-order sketch follows this entry.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z)
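To illustrate the zero-order setting of the entry above, here is a minimal sketch of a two-point randomized gradient estimator inside projected gradient descent. It is a standard textbook scheme, not the higher-order-smoothness estimator analyzed in the paper; all names, step sizes, and constants are illustrative assumptions.

```python
import numpy as np


def zero_order_pgd(f, project, x0, n_iter=1000, step=0.1, h=1e-3, rng=None):
    """Projected gradient descent using only function evaluations of f.

    The gradient is replaced by a two-point randomized finite-difference
    estimate along a random direction on the unit sphere.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    d = x.size
    for t in range(1, n_iter + 1):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)                        # random unit direction
        grad_est = d * (f(x + h * u) - f(x - h * u)) / (2 * h) * u
        x = project(x - (step / t) * grad_est)        # projected step, decaying rate
    return x


# Example: minimize ||x - c||^2 over the unit ball using function values only.
c = np.array([2.0, -1.0])
f = lambda x: np.sum((x - c) ** 2)
project = lambda x: x / max(1.0, np.linalg.norm(x))
x_hat = zero_order_pgd(f, project, x0=np.zeros(2))
```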
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.