Faster Convex Lipschitz Regression via 2-block ADMM
- URL: http://arxiv.org/abs/2111.01348v1
- Date: Tue, 2 Nov 2021 03:10:41 GMT
- Title: Faster Convex Lipschitz Regression via 2-block ADMM
- Authors: Ali Siahkamari, Durmus Alp Emre Acar, Christopher Liao, Kelly Geyer,
Venkatesh Saligrama, Brian Kulis
- Abstract summary: We show how a broad class of convex function learning problems can be solved via a 2-block ADMM approach.
For the task of convex Lipschitz regression, we establish that our proposed algorithm converges at the rate of $O(n^3 d^{1.5}+n^2 d^{2.5}+n d^3)$ for a dataset $X \in \mathbb{R}^{n\times d}$.
Our method is up to 20 times faster than the existing method and produces results that are comparable to the state of the art.
- Score: 45.217150417108826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of approximating an arbitrary convex function arises in several
learning problems such as convex regression, learning with a difference of
convex (DC) functions, and approximating Bregman divergences. In this paper, we
show how a broad class of convex function learning problems can be solved via a
2-block ADMM approach, where updates for each block can be computed in closed
form. For the task of convex Lipschitz regression, we establish that our
proposed algorithm converges at the rate of $O(n^3 d^{1.5}+n^2 d^{2.5}+n d^3)$
for a dataset $X \in \mathbb{R}^{n\times d}$. This new rate improves the state-of-the-art
$O(n^5 d^2)$ rate available by interior point methods if $d = o(n^4)$. Further,
we provide similar solvers for DC regression and Bregman divergence learning.
Unlike previous approaches, our method is amenable to the use of GPUs. We
demonstrate on regression and metric learning experiments that our approach is
up to 20 times faster than the existing method, and produces results that are
comparable to state-of-the-art.
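
To make the underlying optimization problem concrete, here is a minimal sketch of convex (optionally Lipschitz) regression in its standard form: fit function values and subgradients at the data points subject to first-order convexity constraints, then predict with the max-affine extension. It is solved with a generic off-the-shelf solver (cvxpy), not the paper's 2-block ADMM; the function names, the Lipschitz bound `L`, and the toy data are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch of convex Lipschitz regression in its standard QP/SOCP form,
# solved with a generic solver (cvxpy) rather than the paper's 2-block ADMM.
# Function names, the Lipschitz bound L, and the toy data are illustrative.
import numpy as np
import cvxpy as cp

def fit_convex_regression(X, y, L=None):
    """Fit values theta_i and subgradients xi_i so that the max-affine function
    f(x) = max_i theta_i + <xi_i, x - x_i> is a convex, (optionally)
    L-Lipschitz fit to (X, y)."""
    n, d = X.shape
    theta = cp.Variable(n)       # fitted function values at the data points
    xi = cp.Variable((n, d))     # subgradients at the data points

    constraints = []
    for i in range(n):
        for j in range(n):
            if i != j:
                # convexity: f(x_j) >= f(x_i) + <xi_i, x_j - x_i>
                inner = cp.sum(cp.multiply(xi[i], X[j] - X[i]))
                constraints.append(theta[j] >= theta[i] + inner)
    if L is not None:
        # Lipschitz bound on the fitted function via its subgradients
        constraints += [cp.norm(xi[i], 2) <= L for i in range(n)]

    cp.Problem(cp.Minimize(cp.sum_squares(y - theta)), constraints).solve()
    return theta.value, xi.value

def predict_max_affine(X_train, theta, xi, X_new):
    # Max-affine extension of the fitted convex function:
    #   f(x) = max_i theta_i + <xi_i, x - x_i>
    offsets = theta - np.sum(xi * X_train, axis=1)   # theta_i - <xi_i, x_i>
    return (X_new @ xi.T + offsets[None, :]).max(axis=1)

# Toy usage: recover f(x) = ||x||^2 from noisy samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = np.sum(X**2, axis=1) + 0.1 * rng.normal(size=30)
theta, xi = fit_convex_regression(X, y, L=10.0)
y_hat = predict_max_affine(X, theta, xi, X)
```

A generic interior-point solve of this program is roughly what the abstract's $O(n^5 d^2)$ baseline refers to; the paper instead applies a 2-block ADMM whose block updates are available in closed form, which is what yields the quoted $O(n^3 d^{1.5}+n^2 d^{2.5}+n d^3)$ rate and makes the method amenable to GPUs.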
Related papers
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Mirror Descent Algorithms with Nearly Dimension-Independent Rates for
Differentially-Private Stochastic Saddle-Point Problems [6.431793114484429]
We propose mirror descent algorithms for differentially-private stochastic saddle-point problems in the polyhedral setting.
We show that our algorithms attain a rate of $\sqrt{\log(d)}/\sqrt{n} + \log(d)/[n\varepsilon]^{2/5}$ with constant success probability.
arXiv Detail & Related papers (2024-03-05T12:28:00Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - Optimality of Robust Online Learning [4.21768682940933]
We study an online learning algorithm with a robust loss function $\mathcal{L}_\sigma$ for regression over a reproducing kernel Hilbert space (RKHS).
The proposed algorithm is then a robust alternative for online least squares regression aiming to estimate the conditional mean function.
arXiv Detail & Related papers (2023-04-20T03:00:33Z) - Variance-Aware Confidence Set: Variance-Dependent Bound for Linear
Bandits and Horizon-Free Bound for Linear Mixture MDP [76.94328400919836]
We show how to construct variance-aware confidence sets for linear bandits and linear mixture Markov decision processes (MDPs).
For linear bandits, we obtain an $\widetilde{O}(\mathrm{poly}(d)\sqrt{1 + \sum_{i=1}^{K}\sigma_i^2})$ regret bound, where $d$ is the feature dimension.
For linear mixture MDP, we obtain an $\widetilde{O}(\mathrm{poly}(d)\sqrt{K})$ regret bound, where $K$ is the number of episodes.
arXiv Detail & Related papers (2021-01-29T18:57:52Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Piecewise Linear Regression via a Difference of Convex Functions [50.89452535187813]
We present a new piecewise linear regression methodology that utilizes fitting a difference of convex functions (DC functions) to the data.
We empirically validate the method, showing it to be practically implementable, and to have comparable performance to existing regression/classification methods on real-world datasets.
arXiv Detail & Related papers (2020-07-05T18:58:47Z) - Subgradient Regularized Multivariate Convex Regression at Scale [21.55988846648753]
We present new large-scale algorithms for fitting a subgradient regularized convex regression function to $n$ samples in $d$ dimensions.
We show that our framework can approximately solve instances of the subgradient regularized convex regression problem with $n=10^5$ and $d=10$ within minutes.
arXiv Detail & Related papers (2020-05-23T19:08:39Z) - Efficient algorithms for multivariate shape-constrained convex
regression problems [9.281671380673306]
We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints.
For solving the generally very large-scale convex QP, we design two efficient algorithms: one is the symmetric Gauss-Seidel based alternating direction method of multipliers (sGS-ADMM), and the other is the proximal augmented Lagrangian method (pALM) with the subproblems solved by the semismooth Newton method.
arXiv Detail & Related papers (2020-02-26T11:18:43Z)