Faster Convex Lipschitz Regression via 2-block ADMM
- URL: http://arxiv.org/abs/2111.01348v1
- Date: Tue, 2 Nov 2021 03:10:41 GMT
- Title: Faster Convex Lipschitz Regression via 2-block ADMM
- Authors: Ali Siahkamari, Durmus Alp Emre Acar, Christopher Liao, Kelly Geyer,
Venkatesh Saligrama, Brian Kulis
- Abstract summary: We show how a broad class of convex function learning problems can be solved via a 2-block ADMM approach.
For the task of convex Lipschitz regression, we establish that our proposed algorithm converges at the rate of $O(n^3 d^{1.5}+n^2 d^{2.5}+n d^3)$ for a dataset $X \in \mathbb{R}^{n\times d}$.
Our method is up to 20 times faster than the existing method and produces results that are comparable to the state of the art.
- Score: 45.217150417108826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of approximating an arbitrary convex function arises in several
learning problems such as convex regression, learning with a difference of
convex (DC) functions, and approximating Bregman divergences. In this paper, we
show how a broad class of convex function learning problems can be solved via a
2-block ADMM approach, where updates for each block can be computed in closed
form. For the task of convex Lipschitz regression, we establish that our
proposed algorithm converges at the rate of $O(n^3 d^{1.5}+n^2 d^{2.5}+n d^3)$
for a dataset $X \in \mathbb{R}^{n\times d}$. This new rate improves the state-of-the-art
$O(n^5 d^2)$ rate available by interior point methods if $d = o(n^4)$. Further,
we provide similar solvers for DC regression and Bregman divergence learning.
Unlike previous approaches, our method is amenable to the use of GPUs. We
demonstrate on regression and metric learning experiments that our approach is
up to 20 times faster than the existing method, and produces results that are
comparable to state-of-the-art.
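
To make the underlying optimization problem concrete, here is a minimal sketch of convex (optionally Lipschitz) regression in its standard form: fit function values and subgradients at the data points subject to first-order convexity constraints, then predict with the max-affine extension. It is solved with a generic off-the-shelf solver (cvxpy), not the paper's 2-block ADMM; the function names, the Lipschitz bound `L`, and the toy data are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch of convex Lipschitz regression in its standard QP/SOCP form,
# solved with a generic solver (cvxpy) rather than the paper's 2-block ADMM.
# Function names, the Lipschitz bound L, and the toy data are illustrative.
import numpy as np
import cvxpy as cp

def fit_convex_regression(X, y, L=None):
    """Fit values theta_i and subgradients xi_i so that the max-affine function
    f(x) = max_i theta_i + <xi_i, x - x_i> is a convex, (optionally)
    L-Lipschitz fit to (X, y)."""
    n, d = X.shape
    theta = cp.Variable(n)       # fitted function values at the data points
    xi = cp.Variable((n, d))     # subgradients at the data points

    constraints = []
    for i in range(n):
        for j in range(n):
            if i != j:
                # convexity: f(x_j) >= f(x_i) + <xi_i, x_j - x_i>
                inner = cp.sum(cp.multiply(xi[i], X[j] - X[i]))
                constraints.append(theta[j] >= theta[i] + inner)
    if L is not None:
        # Lipschitz bound on the fitted function via its subgradients
        constraints += [cp.norm(xi[i], 2) <= L for i in range(n)]

    cp.Problem(cp.Minimize(cp.sum_squares(y - theta)), constraints).solve()
    return theta.value, xi.value

def predict_max_affine(X_train, theta, xi, X_new):
    # Max-affine extension of the fitted convex function:
    #   f(x) = max_i theta_i + <xi_i, x - x_i>
    offsets = theta - np.sum(xi * X_train, axis=1)   # theta_i - <xi_i, x_i>
    return (X_new @ xi.T + offsets[None, :]).max(axis=1)

# Toy usage: recover f(x) = ||x||^2 from noisy samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = np.sum(X**2, axis=1) + 0.1 * rng.normal(size=30)
theta, xi = fit_convex_regression(X, y, L=10.0)
y_hat = predict_max_affine(X, theta, xi, X)
```

A generic interior-point solve of this program is roughly what the abstract's $O(n^5 d^2)$ baseline refers to; the paper instead applies a 2-block ADMM whose block updates are available in closed form, which is what yields the quoted $O(n^3 d^{1.5}+n^2 d^{2.5}+n d^3)$ rate and makes the method amenable to GPUs.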
Related papers
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Mirror Descent Algorithms with Nearly Dimension-Independent Rates for
Differentially-Private Stochastic Saddle-Point Problems [6.431793114484429]
We propose mirror descent algorithms for differentially-private stochastic saddle-point problems in the polyhedral setting.
We show that our algorithms attain a rate of $\sqrt{\log(d)}/\sqrt{n} + \log(d)/[n\varepsilon]^{2/5}$ with constant success probability.
arXiv Detail & Related papers (2024-03-05T12:28:00Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - Optimality of Robust Online Learning [4.21768682940933]
We study an online learning algorithm with a robust loss function $\mathcal{L}_\sigma$ for regression over a reproducing kernel Hilbert space (RKHS).
The proposed algorithm is then a robust alternative for online least squares regression aiming to estimate the conditional mean function.
arXiv Detail & Related papers (2023-04-20T03:00:33Z) - Variance-Aware Confidence Set: Variance-Dependent Bound for Linear
Bandits and Horizon-Free Bound for Linear Mixture MDP [76.94328400919836]
We show how to construct variance-aware confidence sets for linear bandits and linear mixture Markov decision processes (MDPs).
For linear bandits, we obtain an $\widetilde{O}(\mathrm{poly}(d)\sqrt{1 + \sum_{i=1}^{K}\sigma_i^2})$ regret bound, where $d$ is the feature dimension.
For linear mixture MDP, we obtain an $\widetilde{O}(\mathrm{poly}(d)\sqrt{K})$ regret bound, where $K$ is the number of episodes.
arXiv Detail & Related papers (2021-01-29T18:57:52Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Piecewise Linear Regression via a Difference of Convex Functions [50.89452535187813]
We present a new piecewise linear regression methodology that utilizes fitting a difference of convex functions (DC functions) to the data.
We empirically validate the method, showing it to be practically implementable, and to have comparable performance to existing regression/classification methods on real-world datasets.
arXiv Detail & Related papers (2020-07-05T18:58:47Z) - Subgradient Regularized Multivariate Convex Regression at Scale [21.55988846648753]
We present new large-scale algorithms for fitting a subgradient regularized convex regression function to $n$ samples in $d$ dimensions.
We show that our framework can approximately solve instances of the subgradient regularized convex regression problem with $n=10^5$ and $d=10$ within minutes.
arXiv Detail & Related papers (2020-05-23T19:08:39Z) - Efficient algorithms for multivariate shape-constrained convex
regression problems [9.281671380673306]
We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints.
For solving the generally very large-scale convex QP, we design two efficient algorithms: one is the symmetric Gauss-Seidel based alternating direction method of multipliers (sGS-ADMM), and the other is the proximal augmented Lagrangian method (pALM) with the subproblems solved by the semismooth Newton method.
arXiv Detail & Related papers (2020-02-26T11:18:43Z)