Streaming Algorithms for Learning with Experts: Deterministic Versus
Robust
- URL: http://arxiv.org/abs/2303.01709v1
- Date: Fri, 3 Mar 2023 04:39:53 GMT
- Title: Streaming Algorithms for Learning with Experts: Deterministic Versus
Robust
- Authors: David P. Woodruff, Fred Zhang, Samson Zhou
- Abstract summary: In the online learning with experts problem, an algorithm must make a prediction about an outcome on each of $T$ days (or times).
The goal is to make a prediction with the minimum cost, specifically compared to the best expert in the set.
We show a space lower bound of $\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ for any deterministic algorithm that achieves regret $R$ when the best expert makes $M$ mistakes.
- Score: 62.98860182111096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the online learning with experts problem, an algorithm must make a
prediction about an outcome on each of $T$ days (or times), given a set of $n$
experts who make predictions on each day (or time). The algorithm is given
feedback on the outcomes of each day, including the cost of its prediction and
the cost of the expert predictions, and the goal is to make a prediction with
the minimum cost, specifically compared to the best expert in the set. Recent
work by Srinivas, Woodruff, Xu, and Zhou (STOC 2022) introduced the study of
the online learning with experts problem under memory constraints.
However, often the predictions made by experts or algorithms at some time
influence future outcomes, so that the input is adaptively chosen. Whereas
deterministic algorithms would be robust to adaptive inputs, existing
algorithms all crucially use randomization to sample a small number of experts.
In this paper, we study deterministic and robust algorithms for the experts
problem. We first show a space lower bound of
$\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ for any deterministic algorithm
that achieves regret $R$ when the best expert makes $M$ mistakes. Our result
shows that the natural deterministic algorithm, which iterates through pools of
experts until each expert in the pool has erred, is optimal up to
polylogarithmic factors. On the positive side, we give a randomized algorithm
that is robust to adaptive inputs that uses
$\widetilde{O}\left(\frac{n}{R\sqrt{T}}\right)$ space for $M=O\left(\frac{R^2
T}{\log^2 n}\right)$, thereby showing a smooth space-regret trade-off.
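For intuition, here is a minimal sketch of the natural deterministic algorithm described in the abstract: follow a single expert from the current pool, evict pooled experts as they err, and refill the pool once every expert in it has erred. This is an illustrative reconstruction from the abstract, not the paper's exact construction; the binary-outcome setting and all names are assumptions.

```python
# Minimal sketch (reconstructed from the abstract, not the paper's exact
# construction) of the natural deterministic experts algorithm.

def deterministic_pool_experts(expert_predictions, outcomes):
    """expert_predictions[t][i]: 0/1 prediction of expert i on day t;
    outcomes[t]: 0/1 outcome on day t. Returns the algorithm's mistake count."""
    n = len(expert_predictions[0])
    pool = set(range(n))   # experts that have not yet erred in the current pool
    leader = 0             # the expert currently being followed
    mistakes = 0
    for preds, outcome in zip(expert_predictions, outcomes):
        if preds[leader] != outcome:
            mistakes += 1
        pool -= {i for i in pool if preds[i] != outcome}  # evict today's errers
        if not pool:       # every expert in the pool has erred: start a new pool
            pool = set(range(n))
        if leader not in pool:
            leader = min(pool)  # deterministic choice of the next expert
    return mistakes
```

The point of the lower bound is that this simple strategy is essentially the best a deterministic algorithm can do: up to polylogarithmic factors, any deterministic algorithm with regret $R$ must use $\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ space when the best expert makes $M$ mistakes.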
Related papers
- Competitive strategies to use "warm start" algorithms with predictions [12.970501425097645]
We consider the problem of learning and using predictions for "warm start" algorithms with predictions.
In this setting, an algorithm is given an instance of a problem, and a prediction of the solution.
We give competitive guarantees against stronger benchmarks that consider a set of $k$ predictions.
arXiv Detail & Related papers (2024-05-06T17:38:20Z)
- No-Regret Online Prediction with Strategic Experts [16.54912614895861]
We study a generalization of the online binary prediction with expert advice framework where at each round, the learner is allowed to pick $m \geq 1$ experts from a pool of $K$ experts.
We focus on the setting in which experts act strategically and aim to maximize their influence on the algorithm's predictions by potentially misreporting their beliefs.
Our goal is to design algorithms that satisfy the following two requirements: 1) $\textit{Incentive-compatible}$: incentivize the experts to report their beliefs truthfully, and 2) $\textit{No-regret}$: achieve sublinear regret with respect to the best fixed expert in hindsight.
arXiv Detail & Related papers (2023-05-24T16:43:21Z)
- Memory Bounds for the Experts Problem [53.67419690563877]
Online learning with expert advice is a fundamental problem of sequential prediction.
The goal is to process the experts' predictions and make a prediction with the minimum cost.
An algorithm is judged by how well it does compared to the best expert in the set.
arXiv Detail & Related papers (2022-04-21T01:22:18Z)
- Efficient and Optimal Fixed-Time Regret with Two Experts [5.650647159993238]
Prediction with expert advice is a foundational problem in online learning.
In instances with $T$ rounds and $n$ experts, the classical Multiplicative Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is known beforehand.
When the number of experts $n$ is small/fixed, algorithms with better regret guarantees exist.
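For reference, here is a minimal sketch of the classical Multiplicative Weights Update (Hedge) method with the fixed-horizon tuning mentioned above; the exact learning-rate constant behind the $\sqrt{(T/2)\ln n}$ bound is omitted, and all names are illustrative.

```python
import math

def multiplicative_weights(expert_losses, eta=None):
    """Hedge/MWU sketch. expert_losses[t][i] in [0, 1] is expert i's loss on
    round t. Returns the algorithm's expected total loss."""
    T, n = len(expert_losses), len(expert_losses[0])
    if eta is None:
        eta = math.sqrt(math.log(n) / T)  # fixed-horizon tuning, up to constants
    weights = [1.0] * n
    total = 0.0
    for losses in expert_losses:
        W = sum(weights)
        total += sum(w * l for w, l in zip(weights, losses)) / W  # expected loss
        # Exponentially down-weight experts in proportion to their loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return total
```

With a learning rate on the order of $\sqrt{(\ln n)/T}$, the expected regret of this scheme against the best expert is $O(\sqrt{T \ln n})$, matching the classical bound above up to constants.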
arXiv Detail & Related papers (2022-03-15T01:07:09Z)
- Robustification of Online Graph Exploration Methods [59.50307752165016]
We study a learning-augmented variant of the classical, notoriously hard online graph exploration problem.
We propose an algorithm that naturally integrates predictions into the well-known Nearest Neighbor (NN) algorithm.
arXiv Detail & Related papers (2021-12-10T10:02:31Z)
- Randomized Exploration for Reinforcement Learning with General Value Function Approximation [122.70803181751135]
We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm.
Our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. scalar noises.
We complement the theory with an empirical evaluation across known difficult exploration tasks.
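A minimal sketch of the perturbation idea described above, under the assumption of a linear least-squares regression step; this illustrates the general randomized-exploration technique rather than the paper's algorithm, and `numpy` plus all names here are assumptions.

```python
import numpy as np

def perturbed_least_squares(features, targets, noise_std=1.0, reg=1.0, rng=None):
    """Ridge regression on targets perturbed with i.i.d. Gaussian scalar noise.
    Re-fitting with fresh noise yields randomized value estimates, which is
    what drives exploration in RLSVI-style algorithms."""
    rng = rng if rng is not None else np.random.default_rng()
    noisy_targets = targets + rng.normal(0.0, noise_std, size=targets.shape)
    d = features.shape[1]
    gram = features.T @ features + reg * np.eye(d)
    return np.linalg.solve(gram, features.T @ noisy_targets)
```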
arXiv Detail & Related papers (2021-06-15T02:23:07Z)
- Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that our algorithm achieves, for any $k$, an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z)
- Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications [37.6975819766632]
We show that it is possible to achieve regret $O\left(\sqrt{(\ln d)\sum_t \ell_{t,i}^2}\right)$ simultaneously for every expert $i$ in a $T$-round $d$-expert problem.
Our algorithm is based on the Mirror Descent framework with a correction term and a weighted entropy regularizer.
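For orientation, here is a minimal sketch of vanilla mirror descent with the (unweighted) entropy regularizer over the probability simplex, i.e. exponentiated-gradient updates; the paper's algorithm additionally uses a correction term and a weighted entropy regularizer, which this sketch omits, and all names are illustrative.

```python
import math

def entropic_mirror_descent(loss_vectors, eta=0.1):
    """Mirror descent over the d-simplex with the entropy regularizer:
    multiplicative update followed by renormalization (the Bregman projection).
    Returns the sequence of probability vectors played."""
    d = len(loss_vectors[0])
    p = [1.0 / d] * d                 # start at the uniform distribution
    plays = [list(p)]
    for loss in loss_vectors:
        p = [pi * math.exp(-eta * li) for pi, li in zip(p, loss)]
        Z = sum(p)
        p = [pi / Z for pi in p]      # project back onto the simplex
        plays.append(list(p))
    return plays
```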
arXiv Detail & Related papers (2021-02-01T18:34:21Z)
- Malicious Experts versus the multiplicative weights algorithm in online prediction [85.62472761361107]
We consider a prediction problem with two experts and a forecaster.
We assume that one of the experts is honest and makes a correct prediction with probability $\mu$ at each round.
The other expert is malicious: it knows the true outcome at each round and makes predictions so as to maximize the forecaster's loss.
arXiv Detail & Related papers (2020-03-18T20:12:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.