Subgradient Regularized Multivariate Convex Regression at Scale
- URL: http://arxiv.org/abs/2005.11588v3
- Date: Tue, 5 Dec 2023 02:24:39 GMT
- Title: Subgradient Regularized Multivariate Convex Regression at Scale
- Authors: Wenyu Chen, Rahul Mazumder
- Abstract summary: We present new large-scale algorithms for fitting a subgradient regularized convex regression function to $n$ samples in $d$ dimensions.
We show that our framework can approximately solve instances of the subgradient regularized convex regression problem with $n=10^5$ and $d=10$ within minutes.
- Score: 21.55988846648753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present new large-scale algorithms for fitting a subgradient regularized
multivariate convex regression function to $n$ samples in $d$ dimensions -- a
key problem in shape constrained nonparametric regression with applications in
statistics, engineering and the applied sciences. The infinite-dimensional
learning task can be expressed via a convex quadratic program (QP) with $O(nd)$
decision variables and $O(n^2)$ constraints. While instances with $n$ in the
lower thousands can be addressed with current algorithms within reasonable
runtimes, solving larger problems (e.g., $n\approx 10^4$ or $10^5$) is
computationally challenging. To this end, we present an active set type
algorithm on the dual QP. For computational scalability, we allow for
approximate optimization of the reduced sub-problems; and propose randomized
augmentation rules for expanding the active set. We derive novel computational
guarantees for our algorithms. We demonstrate that our framework can
approximately solve instances of the subgradient regularized convex regression
problem with $n=10^5$ and $d=10$ within minutes, and that it shows strong
computational performance compared to earlier approaches.
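As a concrete illustration of the QP described above, the following CVXPY sketch formulates the primal problem directly: the fitted values and subgradients are the $O(nd)$ decision variables, and the pairwise convexity constraints account for the $O(n^2)$ inequalities. The synthetic data, the penalty weight `lam`, and the squared-$\ell_2$ subgradient regularizer are illustrative assumptions; this is the standard small-scale formulation, not the paper's active-set algorithm on the dual.

```python
# Minimal sketch (synthetic data, ridge-type subgradient penalty assumed) of the
# primal QP for subgradient regularized convex regression; not the paper's dual solver.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d, lam = 50, 3, 0.1                      # small instance: the O(n^2) constraints limit n here
X = rng.standard_normal((n, d))
y = np.sum(X ** 2, axis=1) + 0.1 * rng.standard_normal(n)   # noisy convex target

theta = cp.Variable(n)                      # fitted values phi(x_i)
xi = cp.Variable((n, d))                    # subgradients of phi at each x_i

# Convexity: phi(x_j) >= phi(x_i) + <xi_i, x_j - x_i> for every ordered pair (i, j).
constraints = [theta[j] >= theta[i] + xi[i, :] @ (X[j] - X[i])
               for i in range(n) for j in range(n) if i != j]

# Least squares fit plus the subgradient (ridge-type) regularizer.
objective = cp.Minimize(cp.sum_squares(y - theta) + lam * cp.sum_squares(xi))
cp.Problem(objective, constraints).solve()
print("training MSE:", np.mean((y - theta.value) ** 2))
```

The active-set idea from the abstract can be caricatured by constraint generation on this primal: solve a reduced problem over a working subset of the convexity constraints, then augment the working set with a randomly sampled batch of violated constraints and repeat. The sketch below is only a loose analogue with hypothetical parameters (`batch`, `max_rounds`, `tol`); the paper's algorithm instead works on the dual QP, permits approximate optimization of the reduced sub-problems, and comes with the stated computational guarantees.

```python
def fit_convex_reg(X, y, lam=0.1, batch=200, max_rounds=20, tol=1e-6, seed=0):
    """Constraint-generation caricature of an active-set scheme (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta, xi = cp.Variable(n), cp.Variable((n, d))
    obj = cp.Minimize(cp.sum_squares(y - theta) + lam * cp.sum_squares(xi))
    active = set()                                       # working set of (i, j) pairs
    for _ in range(max_rounds):
        cons = [theta[j] >= theta[i] + xi[i, :] @ (X[j] - X[i]) for i, j in active]
        cp.Problem(obj, cons).solve()
        T, G = theta.value, xi.value
        # Violations of phi(x_j) >= phi(x_i) + <xi_i, x_j - x_i> over all pairs.
        D = X[None, :, :] - X[:, None, :]                # D[i, j] = x_j - x_i
        V = T[:, None] + np.einsum('id,ijd->ij', G, D) - T[None, :]
        np.fill_diagonal(V, -np.inf)                     # ignore i == j
        violated = np.argwhere(V > tol)
        if len(violated) == 0:
            break                                        # all O(n^2) constraints satisfied
        pick = rng.choice(len(violated), size=min(batch, len(violated)), replace=False)
        active.update((int(i), int(j)) for i, j in violated[pick])
    return T, G

theta_hat, xi_hat = fit_convex_reg(X, y)                 # reuse the synthetic data above
```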
Related papers
- Convergence analysis of online algorithms for vector-valued kernel regression [0.42970700836450487]
We consider the problem of approximating the regression function from noisy vector-valued data by an online learning algorithm.
We show that the expected squared error in the RKHS norm can be bounded by $C^2 (m+1)^{-s/(2+s)}$, where $m$ is the current number of processed data.
arXiv Detail & Related papers (2023-09-14T15:10:47Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Provably Efficient Reinforcement Learning via Surprise Bound [66.15308700413814]
We propose a provably efficient reinforcement learning algorithm (both computationally and statistically) with general value function approximations.
Our algorithm achieves reasonable regret bounds when applied to both the linear setting and the sparse high-dimensional linear setting.
arXiv Detail & Related papers (2023-02-22T20:21:25Z) - Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell_2$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z) - Refined Regret for Adversarial MDPs with Linear Function Approximation [50.00022394876222]
We consider learning in an adversarial Markov Decision Process (MDP) where the loss functions can change arbitrarily over $K$ episodes.
This paper provides two algorithms that improve the regret to $\tilde{\mathcal{O}}(K^{2/3})$ in the same setting.
arXiv Detail & Related papers (2023-01-30T14:37:21Z) - Faster Convex Lipschitz Regression via 2-block ADMM [45.217150417108826]
We show how a broad class of convex function learning problems can be solved via a 2-block ADMM approach.
For the task of convex Lipschitz regression, we establish that our proposed algorithm converges at the rate of $O(n^3 d^{1.5} + n^2 d^{2.5} + n d^3)$ for a dataset of $n$ samples in $d$ dimensions.
Our method is up to 20 times faster than existing approaches and produces results comparable to the state of the art.
arXiv Detail & Related papers (2021-11-02T03:10:41Z) - Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player against smooth algorithms deployed by an adversary.
Our algorithm is guaranteed to make monotonic progress (having no limit cycles) and to find an appropriate number of gradient ascent steps.
arXiv Detail & Related papers (2021-06-02T22:03:36Z) - Empirical Risk Minimization in the Non-interactive Local Model of
Differential Privacy [26.69391745812235]
We study the Empirical Risk Minimization (ERM) problem in the noninteractive Local Differential Privacy (LDP) model.
Previous research indicates that the sample complexity, to achieve error $\alpha$, needs to depend exponentially on the dimensionality $p$ for general loss functions.
arXiv Detail & Related papers (2020-11-11T17:48:00Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis
and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z) - Efficient algorithms for multivariate shape-constrained convex
regression problems [9.281671380673306]
We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints.
For solving the generally very large-scale convex QP, we design two efficient algorithms: one is the symmetric Gauss-Seidel based alternating direction method of multipliers (sGS-ADMM), and the other is the proximal augmented Lagrangian method (pALM) with the subproblems solved by the semismooth Newton method.
arXiv Detail & Related papers (2020-02-26T11:18:43Z) - Learning Sparse Classifiers: Continuous and Mixed Integer Optimization
Perspectives [10.291482850329892]
Mixed integer programming (MIP) can be used to solve (to optimality) $\ell_0$-regularized regression problems.
We propose two classes of scalable algorithms: an exact algorithm that can handle $p \approx 50,000$ features in a few minutes, and approximate algorithms that can address instances with $p \approx 10^6$.
In addition, we present new estimation error bounds for $\ell_0$-regularized estimators.
arXiv Detail & Related papers (2020-01-17T18:47:02Z)