Untangling Lariats: Subgradient Following of Variationally Penalized Objectives
- URL: http://arxiv.org/abs/2405.04710v1
- Date: Tue, 7 May 2024 23:08:24 GMT
- Title: Untangling Lariats: Subgradient Following of Variationally Penalized Objectives
- Authors: Kai-Chia Mo, Shai Shalev-Shwartz, Nisæl Shártov,
- Abstract summary: We derive, as special cases of our apparatus, known algorithms for the fused lasso and isotonic regression.
Last but not least, we derive a lattice-based subgradient solvers for variational penalties characterized through the output of arbitrary convolutional filters.
- Score: 10.043139484808949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We describe a novel subgradient following apparatus for calculating the optimum of convex problems with variational penalties. In this setting, we receive a sequence $y_i,\ldots,y_n$ and seek a smooth sequence $x_1,\ldots,x_n$. The smooth sequence attains the minimum Bregman divergence to an input sequence with additive variational penalties in the general form of $\sum_i g_i(x_{i+1}-x_i)$. We derive, as special cases of our apparatus, known algorithms for the fused lasso and isotonic regression. Our approach also facilitates new variational penalties such as non-smooth barrier functions. We next derive and analyze multivariate problems in which $\mathbf{x}_i,\mathbf{y}_i\in\mathbb{R}^d$ and variational penalties that depend on $\|\mathbf{x}_{i+1}-\mathbf{x}_i\|$. The norms we consider are $\ell_2$ and $\ell_\infty$ which promote group sparsity. Last but not least, we derive a lattice-based subgradient following for variational penalties characterized through the output of arbitrary convolutional filters. This paradigm yields efficient solvers for problems in which sparse high-order discrete derivatives such as acceleration and jerk are desirable.
Related papers
- Variance-Dependent Regret Bounds for Non-stationary Linear Bandits [52.872628573907434]
We propose algorithms that utilize the variance of the reward distribution as well as the $B_K$, and show that they can achieve tighter regret upper bounds.
We introduce two novel algorithms: Restarted Weighted$textOFUL+$ and Restarted $textSAVE+$.
Notably, when the total variance $V_K$ is much smaller than $K$, our algorithms outperform previous state-of-the-art results on non-stationary linear bandits under different settings.
arXiv Detail & Related papers (2024-03-15T23:36:55Z) - High Probability Guarantees for Random Reshuffling [5.663909018247509]
We consider the gradient method with random reshuffling ($mathsfRR$) for tackling nonmathp optimization problems.
In this work, we first investigate the neural sample complexity for $mathsfRR$s sampling procedure.
Then, we design a random reshuffling method ($mathsfp$mathsfRR$) that involves an additional randomized perturbation procedure stationary points.
arXiv Detail & Related papers (2023-11-20T15:17:20Z) - First and zeroth-order implementations of the regularized Newton method
with lazy approximated Hessians [4.62316736194615]
We develop Lip-order (Hessian-O) and zero-order (derivative-free) implementations of general non-free$ normfree problems.
We also equip our algorithms with the lazy bound update that reuses a previously computed Hessian approximation matrix for several iterations.
arXiv Detail & Related papers (2023-09-05T17:40:54Z) - A Fast Algorithm for Adaptive Private Mean Estimation [5.090363690988394]
We design an $(varepsilon, delta)$-differentially private algorithm that is adaptive to $Sigma$.
The estimator achieves optimal rates of convergence with respect to the induced Mahalanobis norm $||cdot||_Sigma$.
arXiv Detail & Related papers (2023-01-17T18:44:41Z) - Penalized Overdamped and Underdamped Langevin Monte Carlo Algorithms for Constrained Sampling [17.832449046193933]
We consider the constrained sampling problem where the goal is to sample from a target distribution of $pi(x)prop ef(x)$ when $x$ is constrained.
Motivated by penalty methods, we convert the constrained problem into an unconstrained sampling problem by introducing a penalty function for constraint violations.
For PSGLD and PSGULMC, when $tildemathcalO(d/varepsilon18)$ is strongly convex and smooth, we obtain $tildemathcalO(d/varepsilon
arXiv Detail & Related papers (2022-11-29T18:43:22Z) - Sparse Bayesian Lasso via a Variable-Coefficient $\ell_1$ Penalty [0.9176056742068814]
We define learnable penalty weights $lambda_p$ with hyperpriors.
We study the theoretical properties of this variable-co-efficient $ell_$ penalty in the context of penalized likelihood.
We develop a model we call the Sparse Bayesian Lasso which allows for behavior endowed qualitatively like Lasso regression to be applied to arbitrary variational models.
arXiv Detail & Related papers (2022-11-09T18:19:23Z) - Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum
Minimization [52.25843977506935]
We propose an adaptive variance method, called AdaSpider, for $L$-smooth, non-reduction functions with a finitesum structure.
In doing so, we are able to compute an $epsilon-stationary point with $tildeOleft + st/epsilon calls.
arXiv Detail & Related papers (2022-11-03T14:41:46Z) - Private Isotonic Regression [54.32252900997422]
We consider the problem of isotonic regression over a partially ordered set (poset) $mathcalX$ and for any Lipschitz loss function.
We obtain a pure-DP algorithm that has an expected excess empirical risk of roughly $mathrmwidth(mathcalX) cdot log|mathcalX| / n$, where $mathrmwidth(mathcalX)$ is the width of the poset.
We show that the bounds above are essentially the best that can be
arXiv Detail & Related papers (2022-10-27T05:08:07Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present a emphaccelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragrad
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(mathsfOPT1/2)+epsilon$.
For the ReLU activation, our population risk guarantee is $O(mathsfOPT1/2)+epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.