Corruption-Robust Lipschitz Contextual Search
- URL: http://arxiv.org/abs/2307.13903v4
- Date: Thu, 1 Feb 2024 08:11:55 GMT
- Title: Corruption-Robust Lipschitz Contextual Search
- Authors: Shiliang Zuo
- Abstract summary: I study the problem of learning a Lipschitz function with corrupted binary signals.
This work introduces the new algorithmic technique \emph{agnostic checking} as well as new analysis techniques.
- Score: 2.296475290901356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: I study the problem of learning a Lipschitz function with corrupted binary
signals. The learner tries to learn an $L$-Lipschitz function $f: [0,1]^d
\rightarrow [0, L]$ that the adversary chooses. There is a total of $T$ rounds.
In each round $t$, the adversary selects a context vector $x_t$ in the input
space, and the learner makes a guess of the true function value $f(x_t)$ and
receives a binary signal indicating whether the guess is high or low. In a
total of $C$ rounds, the signal may be corrupted, though the value of $C$ is
\emph{unknown} to the learner. The learner's goal is to incur a small
cumulative loss. This work introduces the new algorithmic technique
\emph{agnostic checking} as well as new analysis techniques. I design
algorithms with the following guarantees: for the symmetric loss, the learner achieves regret $L\cdot
O(C\log T)$ when $d = 1$ and $L\cdot O_d(C\log T + T^{(d-1)/d})$ when $d > 1$;
for the pricing loss, the learner achieves regret $L\cdot \widetilde{O}
(T^{d/(d+1)} + C\cdot T^{1/(d+1)})$.
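For concreteness, below is a minimal Python sketch of the interaction protocol in the $d = 1$, symmetric-loss case. It is not the paper's agnostic-checking algorithm: the target function, the random contexts, the corruption schedule, and the naive Lipschitz-envelope learner are all illustrative assumptions.

```python
# Illustrative simulation of the protocol only (not the paper's corruption-robust algorithm).
# Assumptions: d = 1, symmetric loss |guess - f(x_t)|, a hypothetical target f, random
# contexts, and a naive learner that keeps pointwise envelopes implied by Lipschitzness.
import random

L, T, C = 1.0, 1000, 20           # Lipschitz constant, number of rounds, corruption budget

def f(x):                          # adversary's hidden L-Lipschitz target (example choice)
    return L * abs(x - 0.3)

lo_pts, hi_pts = [], []            # (x, bound) pairs: reported lower / upper bounds on f
corruptions_left = C
total_loss = 0.0

for t in range(T):
    x = random.random()            # adversary's context (random here, for the demo)
    # Envelopes implied by Lipschitzness and past (possibly corrupted) signals.
    lo = max([0.0] + [b - L * abs(x - xi) for xi, b in lo_pts])
    hi = min([L]   + [b + L * abs(x - xi) for xi, b in hi_pts])
    if lo > hi:                    # corrupted signals can make the envelopes cross
        lo, hi = 0.0, L            # naive reset; the paper's algorithm handles this robustly
    guess = (lo + hi) / 2.0
    total_loss += abs(guess - f(x))

    signal_high = guess >= f(x)    # true binary signal: was the guess high or low?
    if corruptions_left > 0 and random.random() < C / T:
        signal_high = not signal_high   # adversarial corruption (budget C, unknown to learner)
        corruptions_left -= 1
    if signal_high:
        hi_pts.append((x, guess))  # reported high: treat guess as an upper bound on f(x)
    else:
        lo_pts.append((x, guess))  # reported low: treat guess as a lower bound on f(x)

print(f"cumulative symmetric loss over {T} rounds: {total_loss:.2f}")
```

The naive envelope learner breaks down precisely because corrupted signals can poison the constraint set; the paper's contribution is handling this without knowing $C$.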
Related papers
- Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization [54.29685789885059]
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $\mathbf{A}$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
arXiv Detail & Related papers (2023-06-02T18:55:27Z) - Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise [9.378684220920562]
We give the tightest statistical query (SQ) lower bounds for learning halfspaces in the presence of Massart noise.
We show that for arbitrary $\eta \in [0,1/2]$ every SQ algorithm achieving misclassification error better than $\eta$ requires queries of superpolynomial accuracy.
arXiv Detail & Related papers (2022-01-24T17:33:19Z) - Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z) - Learning low-degree functions from a logarithmic number of random
queries [77.34726150561087]
We prove that for any integer $n\in\mathbb{N}$, $d\in\{1,\ldots,n\}$ and any $\varepsilon,\delta\in(0,1)$, a bounded function $f:\{-1,1\}^n\to[-1,1]$ of degree at most $d$ can be learned.
arXiv Detail & Related papers (2021-09-21T13:19:04Z) - Memory-Sample Lower Bounds for Learning Parity with Noise [2.724141845301679]
We show, for the well-studied problem of learning parity under noise, that any learning algorithm requires either a memory of size $\Omega(n^2/\varepsilon)$ or an exponential number of samples.
Our proof is based on adapting the arguments in [Raz'17,GRT'18] to the noisy case.
arXiv Detail & Related papers (2021-07-05T23:34:39Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - $Q$-learning with Logarithmic Regret [60.24952657636464]
We prove that an optimistic $Q$-learning algorithm enjoys a $\mathcal{O}\left(\frac{SA\cdot \mathrm{poly}(H)}{\Delta_{\min}}\log(SAT)\right)$ cumulative regret bound, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $T$ is the total number of steps, and $\Delta_{\min}$ is the minimum sub-optimality gap.
arXiv Detail & Related papers (2020-06-16T13:01:33Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an improved algorithm with lower sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - Learning and Testing Variable Partitions [13.575794982844222]
We show that $\mathcal{O}(k n^2)(\delta + \epsilon)$ can be learned in time $\tilde{\mathcal{O}}(n^2\,\mathrm{poly}(1/\epsilon))$ for any $\epsilon > 0$.
We also show that even two-sided testers require $\Omega(n)$ queries when $k = 2$.
arXiv Detail & Related papers (2020-03-29T10:12:32Z)