Sharper bounds for online learning of smooth functions of a single
variable
- URL: http://arxiv.org/abs/2105.14648v1
- Date: Sun, 30 May 2021 23:06:21 GMT
- Title: Sharper bounds for online learning of smooth functions of a single
variable
- Authors: Jesse Geneson
- Abstract summary: We show that $opt_{1+\epsilon}(\mathcal{F}_q) = \Theta(\epsilon^{-\frac{1}{2}})$, where the constants in the bound do not depend on $q$.
We also show that $opt_{1+\epsilon}(\mathcal{F}_{\infty}) = \Theta(\epsilon^{-\frac{1}{2}})$.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the generalization of the mistake-bound model to continuous
real-valued single variable functions. Let $\mathcal{F}_q$ be the class of
absolutely continuous functions $f: [0, 1] \rightarrow \mathbb{R}$ with
$||f'||_q \le 1$, and define $opt_p(\mathcal{F}_q)$ as the best possible bound
on the worst-case sum of the $p^{th}$ powers of the absolute prediction errors
over any number of trials. Kimber and Long (Theoretical Computer Science, 1995)
proved for $q \ge 2$ that $opt_p(\mathcal{F}_q) = 1$ when $p \ge 2$ and
$opt_p(\mathcal{F}_q) = \infty$ when $p = 1$. For $1 < p < 2$ with $p =
1+\epsilon$, the only known bound was $opt_p(\mathcal{F}_{q}) =
O(\epsilon^{-1})$ from the same paper. We show for all $\epsilon \in (0, 1)$
and $q \ge 2$ that $opt_{1+\epsilon}(\mathcal{F}_q) =
\Theta(\epsilon^{-\frac{1}{2}})$, where the constants in the bound do not
depend on $q$. We also show that $opt_{1+\epsilon}(\mathcal{F}_{\infty}) =
\Theta(\epsilon^{-\frac{1}{2}})$.
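For concreteness, the quantity defined in prose above can be restated as a minimax expression. The display below is a paraphrase of the abstract's definition, not notation taken from the paper itself: on each trial the adversary reveals a point $x_t \in [0,1]$, the learner predicts $\hat{y}_t$, and the true value $f(x_t)$ is then revealed.

```latex
% Paraphrase of the definition of opt_p(F_q): the best achievable bound on
% the worst-case total p-th-power prediction error over any number of trials.
\[
  opt_p(\mathcal{F}_q)
    \;=\;
    \inf_{\text{learner}}
    \sup_{f \in \mathcal{F}_q}
    \sup_{x_1, x_2, \ldots \,\in\, [0,1]}
    \sum_{t} \bigl|\hat{y}_t - f(x_t)\bigr|^{p}.
\]
```

A minimal simulation of this trial protocol is sketched below. The nearest-neighbor learner is purely illustrative and is not the algorithm analyzed by Kimber and Long or in this paper; all function and variable names are ours.

```python
# Sketch of the online protocol: the adversary reveals x_t, the learner
# predicts, the true value f(x_t) is revealed, and the p-th power of the
# absolute prediction error is added to the running total.

def online_game(f, xs, predict, p):
    history = []          # (x, f(x)) pairs revealed so far
    total_loss = 0.0
    for x in xs:
        y_hat = predict(history, x)
        y = f(x)          # true value, revealed only after the prediction
        total_loss += abs(y_hat - y) ** p
        history.append((x, y))
    return total_loss

def nearest_neighbor_predict(history, x):
    # Illustrative strategy: reuse the value at the closest queried point.
    if not history:
        return 0.0
    _, y0 = min(history, key=lambda pair: abs(pair[0] - x))
    return y0

# f(x) = x lies in F_q for every q, since ||f'||_q = 1 on [0, 1].
loss = online_game(lambda x: x, [0.5, 0.25, 0.75, 0.125],
                   nearest_neighbor_predict, p=1.5)
print(loss)  # total (1+epsilon)-power loss with epsilon = 0.5
```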
Related papers
- Coresets for Multiple $\ell_p$ Regression [47.790126028106734]
We construct coresets of size $\tilde{O}(\varepsilon^{-2}d)$ for $p < 2$ and $\tilde{O}(\varepsilon^{-p}d^{p/2})$ for $p > 2$.
For $1 \le p < 2$, every matrix has a subset of $\tilde{O}(\varepsilon^{-1}k)$ rows which spans a $(1+\varepsilon)$-approximately optimal $k$-dimensional subspace for $\ell_p$ subspace approximation.
arXiv Detail & Related papers (2024-06-04T15:50:42Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits.
For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization [54.29685789885059]
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $\mathbf{A}$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
arXiv Detail & Related papers (2023-06-02T18:55:27Z) - Online Learning of Smooth Functions [0.35534933448684125]
We study the online learning of real-valued functions where the hidden function is known to have certain smoothness properties.
We find new bounds for $\text{opt}_p(\mathcal{F}_q)$ that are sharp up to a constant factor.
In the multi-variable setup, we establish inequalities relating $\text{opt}_p(\mathcal{F}_{q,d})$ to $\text{opt}_p(\mathcal{F}_{q,1})$ and show that $\text{opt}_p(\mathcal{F}$...
arXiv Detail & Related papers (2023-01-04T04:05:58Z) - Low-degree learning and the metric entropy of polynomials [44.99833362998488]
We prove that any (deterministic or randomized) algorithm which learns $\mathscr{F}_{n,d}$ with $L_2$-accuracy $\varepsilon$ requires at least $\Omega((1-\sqrt{\varepsilon})\,2^d\log n)$ queries, and that the metric entropy $\log \mathsf{M}(\mathscr{F}_{n,d},\|\cdot\|_{L_2},\varepsilon)$ satisfies the two-sided estimate $c(1-\varepsilon)2^d\log n \le \ldots$
arXiv Detail & Related papers (2022-03-17T23:52:08Z) - Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products [58.05771390012827]
We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm.
Our main result is an algorithm that uses only $\tilde{O}(k/\sqrt{\epsilon})$ matrix-vector products.
arXiv Detail & Related papers (2022-02-10T16:10:41Z) - Learning low-degree functions from a logarithmic number of random
queries [77.34726150561087]
We prove that for any integer $n \in \mathbb{N}$, $d \in \{1,\ldots,n\}$, and any $\varepsilon,\delta \in (0,1)$, a bounded function $f:\{-1,1\}^n \to [-1,1]$ of degree at most $d$ can be learned.
arXiv Detail & Related papers (2021-09-21T13:19:04Z) - The planted matching problem: Sharp threshold and infinite-order phase
transition [25.41713098167692]
We study the problem of reconstructing a perfect matching $M^*$ hidden in a randomly weighted $n \times n$ bipartite graph.
We show that if $\sqrt{d}\,B(\mathcal{P},\mathcal{Q}) \ge 1+\epsilon$ for an arbitrarily small constant $\epsilon > 0$, the reconstruction error of any estimator is bounded away from $0$.
arXiv Detail & Related papers (2021-03-17T00:59:33Z) - Mathematical comparison of classical and quantum mechanisms in
optimization under local differential privacy [0.0]
We show that $\mathrm{EC}_n(\varepsilon) \neq \mathrm{CQ}_n(\varepsilon)$ for every $n \ge 3$, and estimate the difference between $\mathrm{EC}_n(\varepsilon)$ and $\mathrm{CQ}_n(\varepsilon)$ in a certain manner.
arXiv Detail & Related papers (2020-11-19T17:05:11Z) - An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $\ell \ge 1$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.