Smoothing ADMM for Sparse-Penalized Quantile Regression with Non-Convex
Penalties
- URL: http://arxiv.org/abs/2309.03094v1
- Date: Mon, 4 Sep 2023 21:48:51 GMT
- Title: Smoothing ADMM for Sparse-Penalized Quantile Regression with Non-Convex
Penalties
- Authors: Reza Mirzaeifard, Naveen K. D. Venkategowda, Vinay Chakravarthi
Gogineni, Stefan Werner
- Abstract summary: This paper investigates quantile regression in the presence of non-convex and non-smooth sparse penalties, such as the minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD).
We introduce a novel single-loop smoothing ADMM algorithm with an increasing penalty parameter, named SIAD, specifically tailored for sparse-penalized quantile regression.
- Score: 8.294148737585543
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates quantile regression in the presence of non-convex and
non-smooth sparse penalties, such as the minimax concave penalty (MCP) and
smoothly clipped absolute deviation (SCAD). The non-smooth and non-convex
nature of these problems often leads to convergence difficulties for many
algorithms. While iterative techniques like coordinate descent and local linear
approximation can facilitate convergence, the process is often slow. This
sluggish pace is primarily due to the need to run these approximation
techniques until full convergence at each step, a requirement we term a
\emph{secondary convergence iteration}. To accelerate the convergence speed, we
employ the alternating direction method of multipliers (ADMM) and introduce a
novel single-loop smoothing ADMM algorithm with an increasing penalty
parameter, named SIAD, specifically tailored for sparse-penalized quantile
regression. We first delve into the convergence properties of the proposed SIAD
algorithm and establish the necessary conditions for convergence.
Theoretically, we confirm a convergence rate of $o\big({k^{-\frac{1}{4}}}\big)$
for the sub-gradient bound of the augmented Lagrangian. Subsequently, we provide
numerical results to showcase the effectiveness of the SIAD algorithm. Our
findings highlight that the SIAD method outperforms existing approaches,
providing a faster and more stable solution for sparse-penalized quantile
regression.
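To make the setup concrete, here is a minimal, self-contained sketch (not the authors' reference implementation of SIAD) of sparse-penalized quantile regression solved by a single-loop smoothing ADMM with an increasing penalty parameter. The Huber-style smoothing of the check loss, the single gradient step used for the primal update, the MCP proximal step, and the geometric growth schedule for the penalty parameter are all illustrative assumptions.

```python
import numpy as np


def smoothed_check_grad(r, tau, mu):
    """Gradient of a Huber-smoothed quantile (check) loss at residuals r.

    rho_tau(u) = u * (tau - 1{u < 0}); the kink at zero is replaced by a
    quadratic piece of half-width mu (one common smoothing choice, assumed here).
    """
    return np.where(r > mu, tau,
           np.where(r < -mu, tau - 1.0, (tau - 0.5) + r / (2.0 * mu)))


def mcp_prox(z, lam, t, gamma=3.0):
    """Elementwise proximal operator of the MCP penalty with step size t
    (firm thresholding); requires t < gamma."""
    soft = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)
    return np.where(np.abs(z) <= gamma * lam, soft / (1.0 - t / gamma), z)


def smoothing_admm_qr(X, y, tau=0.5, lam=0.1, mu=0.05,
                      rho0=1.0, growth=1.005, n_iter=2000):
    """Single-loop ADMM sketch for
        min_b (1/n) * sum_i rho_tau(y_i - x_i'b) + sum_j MCP(b_j),
    using the splitting b = z, a dual variable u, and a penalty parameter rho
    that increases geometrically across iterations."""
    n, p = X.shape
    beta, z, u = np.zeros(p), np.zeros(p), np.zeros(p)
    rho = rho0
    # Lipschitz bound for the gradient of the smoothed data term.
    lip_data = np.linalg.norm(X, 2) ** 2 / (2.0 * mu * n)

    for _ in range(n_iter):
        # beta-update: a single gradient step on the smoothed loss plus the
        # quadratic coupling term (keeps the loop single-pass, no inner solver).
        grad = -X.T @ smoothed_check_grad(y - X @ beta, tau, mu) / n + u + rho * (beta - z)
        beta = beta - grad / (lip_data + rho)

        # z-update: proximal step on the non-convex penalty.
        z = mcp_prox(beta + u / rho, lam, 1.0 / rho)

        # dual ascent and increasing penalty parameter.
        u = u + rho * (beta - z)
        rho *= growth

    return z  # the penalized (auxiliary) iterate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    beta_true = np.zeros(50)
    beta_true[:5] = 2.0
    y = X @ beta_true + rng.standard_normal(200)
    print(np.round(smoothing_admm_qr(X, y, tau=0.7)[:8], 2))
```

The single-loop structure, one gradient step, one proximal step, and one dual update per iteration, is what avoids the secondary convergence iteration described in the abstract; a SCAD proximal step could be swapped in for `mcp_prox` without changing the loop.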
Related papers
- Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) methods have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Faster One-Sample Stochastic Conditional Gradient Method for Composite
Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the stochastic ExtraGradient (SEG) method with constant step size, and propose variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - A Unified Analysis of First-Order Methods for Smooth Games via Integral
Quadratic Constraints [10.578409461429626]
In this work, we adapt the integral quadratic constraints theory to first-order methods for smooth and strongly-monotone games.
We provide, for the first time, a global convergence rate for the negative momentum method (NM) with an iteration complexity of $\mathcal{O}(\kappa^{1.5})$, which matches its known lower bound.
We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch.
arXiv Detail & Related papers (2020-09-23T20:02:00Z) - Accelerated Gradient Methods for Sparse Statistical Learning with
Nonconvex Penalties [10.913266721195916]
Nesterov's accelerated gradient (AG) method is a popular technique for optimizing objective functions comprising two components: a convex loss and a penalty function.
A recent proposal generalizes Nesterov's AG method to the nonconvex setting but has never been applied to statistical learning problems.
In this article, we consider the application of this nonconvex AG algorithm to high-dimensional and sparse statistical learning problems; a minimal proximal-gradient sketch with a SCAD penalty is given after this list.
arXiv Detail & Related papers (2020-09-22T15:37:09Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point per step.
Our results are expressed in the form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that an attenuating step-size is required for exact convergence against the fact that a constant step-size learns faster, albeit only up to an error.
Rather than fixing the minibatch size and the step-size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Dual Stochastic Natural Gradient Descent and convergence of interior
half-space gradient approximations [0.0]
Multinomial logistic regression (MLR) is widely used in statistics and machine learning.
Stochastic gradient descent (SGD) is the most common approach for determining the parameters of an MLR model in big-data scenarios.
arXiv Detail & Related papers (2020-01-19T00:53:49Z)
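As referenced in the "Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties" entry above, the following is a minimal accelerated proximal-gradient sketch with a SCAD penalty on a least-squares loss. The plain FISTA-style momentum and the closed-form SCAD proximal step are assumptions made for brevity; the generalized AG scheme discussed in that paper adds safeguards for the non-convex setting that are omitted here.

```python
import numpy as np


def scad_prox(z, lam, t, a=3.7):
    """Elementwise proximal operator of the SCAD penalty with step size t.

    Closed form valid for t < a - 1; a = 3.7 is the conventional SCAD shape
    parameter."""
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - t * lam, 0.0)
    mid = ((a - 1.0) * z - np.sign(z) * t * a * lam) / (a - 1.0 - t)
    return np.where(az <= lam * (1.0 + t), soft,
           np.where(az <= a * lam, mid, z))


def accelerated_scad_ls(X, y, lam=0.1, n_iter=500):
    """FISTA-style accelerated proximal gradient for
        min_b (0.5 / n) * ||y - X b||^2 + sum_j SCAD(b_j)."""
    n, p = X.shape
    # Step size 1/L for the smooth part, capped so that the SCAD prox formula
    # above stays valid (step < a - 1).
    step = min(n / np.linalg.norm(X, 2) ** 2, 1.0)
    beta = np.zeros(p)
    beta_prev = beta.copy()
    t_mom = 1.0
    for _ in range(n_iter):
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t_mom ** 2))
        v = beta + (t_mom - 1.0) / t_next * (beta - beta_prev)  # Nesterov extrapolation
        grad = -X.T @ (y - X @ v) / n
        beta_prev, beta = beta, scad_prox(v - step * grad, lam, step)
        t_mom = t_next
    return beta
```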