Nearly Optimal Algorithms with Sublinear Computational Complexity for
Online Kernel Regression
- URL: http://arxiv.org/abs/2306.08320v1
- Date: Wed, 14 Jun 2023 07:39:09 GMT
- Title: Nearly Optimal Algorithms with Sublinear Computational Complexity for
Online Kernel Regression
- Authors: Junfan Li and Shizhong Liao
- Abstract summary: The trade-off between regret and computational cost is a fundamental problem for online kernel regression.
We propose two new algorithms, AOGD-ALD and NONS-ALD, which can keep nearly optimal regret bounds at a sublinear computational complexity.
- Score: 13.510889339096117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The trade-off between regret and computational cost is a fundamental problem
for online kernel regression, and previous algorithms that address the trade-off
cannot keep optimal regret bounds at a sublinear computational complexity. In
this paper, we propose two new algorithms, AOGD-ALD and NONS-ALD, which keep
nearly optimal regret bounds at a sublinear computational complexity, and we
give sufficient conditions under which our algorithms work. Both algorithms
dynamically maintain a group of nearly orthogonal basis elements used to approximate the
kernel mapping, and keep nearly optimal regret bounds by controlling the
approximation error. The number of basis elements depends on the approximation error and the
decay rate of the eigenvalues of the kernel matrix. If the eigenvalues decay
exponentially, then AOGD-ALD and NONS-ALD achieve regrets of
$O(\sqrt{L(f)})$ and $O(\mathrm{d}_{\mathrm{eff}}(\mu)\ln{T})$, respectively, at a
computational complexity of $O(\ln^2{T})$. If the eigenvalues decay
polynomially with degree $p\geq 1$, then our algorithms keep the same regret
bounds at a computational complexity of $o(T)$ in the cases $p>4$ and $p\geq
10$, respectively. Here $L(f)$ is the cumulative loss of $f$ and
$\mathrm{d}_{\mathrm{eff}}(\mu)$ is the effective dimension of the problem. The
two regret bounds are nearly optimal and are not comparable.
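For intuition, the basis-maintenance step described in the abstract follows the general recipe of approximate-linear-dependence (ALD) dictionary methods: an incoming point is added to the basis only when it is poorly approximated by the span of the current basis, and learning is then carried out in the resulting finite-dimensional approximation of the kernel feature map. The Python sketch below illustrates this style of algorithm for squared loss with online gradient descent; it is a minimal sketch, not a reproduction of AOGD-ALD or NONS-ALD, and the kernel choice, the threshold ald_tol, the learning rate, and all class and function names are illustrative assumptions.

import numpy as np


def gaussian_kernel(x, y, gamma=1.0):
    # Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2); kernel choice is illustrative.
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))


class ALDOnlineKernelRegression:
    # Online kernel regression on an ALD-maintained dictionary (a sketch, not AOGD-ALD itself).
    def __init__(self, kernel=gaussian_kernel, ald_tol=1e-3, lr=0.1):
        self.kernel = kernel
        self.ald_tol = ald_tol    # approximation-error threshold controlling dictionary growth
        self.lr = lr              # online gradient descent step size
        self.dictionary = []      # stored basis points
        self.K_inv = None         # inverse kernel matrix of the dictionary
        self.w = np.zeros(0)      # weights over the dictionary coordinates

    def _k_vec(self, x):
        return np.array([self.kernel(z, x) for z in self.dictionary])

    def predict(self, x):
        return float(self.w @ self._k_vec(x)) if self.dictionary else 0.0

    def update(self, x, y):
        # One online round: grow the dictionary if needed, then take a gradient step on the squared loss.
        if not self.dictionary:
            self.dictionary.append(x)
            self.K_inv = np.array([[1.0 / self.kernel(x, x)]])
            self.w = np.zeros(1)
        k = self._k_vec(x)
        a = self.K_inv @ k
        residual = self.kernel(x, x) - k @ a   # ALD test: projection error of phi(x) onto the basis span
        if residual > self.ald_tol:
            # x is sufficiently novel: add it and update K_inv via block (Schur-complement) inversion.
            self.dictionary.append(x)
            n = len(self.dictionary)
            K_inv_new = np.zeros((n, n))
            K_inv_new[:-1, :-1] = self.K_inv + np.outer(a, a) / residual
            K_inv_new[:-1, -1] = -a / residual
            K_inv_new[-1, :-1] = -a / residual
            K_inv_new[-1, -1] = 1.0 / residual
            self.K_inv = K_inv_new
            self.w = np.append(self.w, 0.0)
            k = self._k_vec(x)
        grad = (self.w @ k - y) * k            # gradient of 0.5 * (prediction - y)^2 in the w coordinates
        self.w -= self.lr * grad
        return len(self.dictionary)            # current number of basis points


# Toy usage on a synthetic 2-D stream (illustrative only):
# model = ALDOnlineKernelRegression(ald_tol=1e-3, lr=0.2)
# for t in range(1000):
#     x = np.random.uniform(-1.0, 1.0, size=2)
#     y = np.sin(3.0 * x[0]) + 0.1 * np.random.randn()
#     loss = (model.predict(x) - y) ** 2       # regret is measured against these per-round losses
#     model.update(x, y)

In this sketch the dictionary size plays the role of the number of basis elements above: the faster the eigenvalues of the kernel matrix decay, the fewer points pass the ALD test, which is the intuition behind the sublinear computational complexity. For reference, the effective dimension appearing in the regret bound is commonly defined as $\mathrm{d}_{\mathrm{eff}}(\mu)=\mathrm{tr}\big(K(K+\mu I)^{-1}\big)$ for the kernel matrix $K$, although the exact normalization may differ in the paper.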
Related papers
- Implicit High-Order Moment Tensor Estimation and Learning Latent Variable Models [39.33814194788341]
We study the task of learning latent-variable models.
Motivated by these applications, we develop a general efficient algorithm for implicit moment computation.
By leveraging our general algorithm, we obtain the first polynomial-time learners for the following models.
arXiv Detail & Related papers (2024-11-23T23:13:24Z) - Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method [10.742859956268655]
This paper aims to recover the intrinsic model parameters given the leverage scores gradient.
We specifically scrutinize the inversion of the leverage score gradient, denoted as $g(x)$.
arXiv Detail & Related papers (2024-08-21T01:39:42Z) - Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback [30.337826346496385]
We study online reinforcement learning in linear Markov decision processes with adversarial losses and bandit feedback.
We introduce two algorithms that achieve improved regret performance compared to existing approaches.
arXiv Detail & Related papers (2023-10-17T19:43:37Z) - Regret-Optimal Federated Transfer Learning for Kernel Regression with Applications in American Option Pricing [8.723136784230906]
We propose an optimal iterative scheme for federated transfer learning, where a central planner has access to datasets.
Our objective is to minimize the cumulative deviation of the generated parameters $\{\theta_i(t)\}_{t=0}^{T}$ across all $T$ iterations.
By leveraging symmetries within the regret-optimal algorithm, we develop a nearly regret-optimal variant that runs with $\mathcal{O}(Np^2)$ fewer elementary operations.
arXiv Detail & Related papers (2023-09-08T19:17:03Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Refined Regret for Adversarial MDPs with Linear Function Approximation [50.00022394876222]
We consider learning in an adversarial Markov Decision Process (MDP) where the loss functions can change arbitrarily over $K$ episodes.
This paper provides two algorithms that improve the regret to $\tilde{\mathcal{O}}(K^{2/3})$ in the same setting.
arXiv Detail & Related papers (2023-01-30T14:37:21Z) - Improved Kernel Alignment Regret Bound for Online Kernel Learning [11.201662566974232]
We propose an algorithm whose regret bound and computational complexity are better than previous results.
If the eigenvalues of the kernel matrix decay exponentially, then our algorithm enjoys a regret of $O(\sqrt{\mathcal{A}_T})$ at a computational complexity of $O(\ln^2{T})$.
We extend our algorithm to batch learning and obtain an $O(\frac{1}{T}\sqrt{\mathbb{E}[\mathcal{A}_T]})$ excess risk bound, which improves the previous bound.
arXiv Detail & Related papers (2022-12-26T02:32:20Z) - Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(\log{T})$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z) - Gaussian Process Bandit Optimization with Few Batches [49.896920704012395]
We introduce a batch algorithm inspired by finite-arm bandit algorithms.
We show that it achieves the cumulative regret upper bound $O^{\ast}(\sqrt{T\gamma_T})$ using $O(\log\log{T})$ batches within time horizon $T$.
In addition, we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches.
arXiv Detail & Related papers (2021-10-15T00:54:04Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Hybrid Stochastic-Deterministic Minibatch Proximal Gradient:
Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
For the considered loss functions, we show that HSDMPG can attain an $\mathcal{O}\big(1/\sqrt{tn}\big)$ generalization bound, which is on the order of the excess error of a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z)