A New Minimax Theorem for Randomized Algorithms
- URL: http://arxiv.org/abs/2002.10802v2
- Date: Thu, 17 Sep 2020 22:20:17 GMT
- Title: A New Minimax Theorem for Randomized Algorithms
- Authors: Shalev Ben-David, Eric Blais
- Abstract summary: We introduce a new type of minimax theorem which can provide a hard distribution $\mu$ that works for all bias levels at once.
We show that this works for randomized query complexity, randomized communication complexity, approximate polynomial degree, and approximate logrank.
We also prove an improved version of Impagliazzo's hardcore lemma.
- Score: 1.2284934135116514
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The celebrated minimax principle of Yao (1977) says that for any
Boolean-valued function $f$ with finite domain, there is a distribution $\mu$
over the domain of $f$ such that computing $f$ to error $\epsilon$ against
inputs from $\mu$ is just as hard as computing $f$ to error $\epsilon$ on
worst-case inputs. Notably, however, the distribution $\mu$ depends on the
target error level $\epsilon$: the hard distribution which is tight for bounded
error might be trivial to solve to small bias, and the hard distribution which
is tight for a small bias level might be far from tight for bounded error
levels.
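In symbols (a standard formulation of Yao's principle, not quoted from the paper): writing $R_\epsilon(f)$ for the worst-case cost of the best randomized algorithm computing $f$ to error $\epsilon$, and $D^\mu_\epsilon(f)$ for the cost of the best deterministic algorithm achieving error $\epsilon$ against inputs drawn from $\mu$, the principle states
$$R_\epsilon(f) \;=\; \max_{\mu} D^\mu_\epsilon(f),$$
where the maximizing distribution $\mu$ may be different for every error level $\epsilon$.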
In this work, we introduce a new type of minimax theorem which can provide a
hard distribution $\mu$ that works for all bias levels at once. We show that
this works for randomized query complexity, randomized communication
complexity, some randomized circuit models, quantum query and communication
complexities, approximate polynomial degree, and approximate logrank. We also
prove an improved version of Impagliazzo's hardcore lemma.
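For context, the classical hardcore lemma (the standard statement; the improved version proved in this paper is not reproduced here) can be phrased as follows: if every circuit of size $s$ errs on at least a $\delta$ fraction of inputs when computing $f\colon \{0,1\}^n \to \{0,1\}$, then there is a "hardcore" set $H$ of density at least $\delta$ on which $f$ looks nearly unpredictable to somewhat smaller circuits, i.e. every circuit of size $s'$ (smaller than $s$ by a factor polynomial in $\epsilon$ and $\delta$) agrees with $f$ on at most a $\tfrac{1}{2} + \epsilon$ fraction of $H$.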
Our proofs rely on two innovations over the classical approach of using von
Neumann's minimax theorem or linear programming duality. First, we use Sion's
minimax theorem to prove a minimax theorem for ratios of bilinear functions
representing the cost and score of algorithms.
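For orientation, Sion's minimax theorem (standard form, not the paper's exact formulation) states that if $X$ is convex and compact, $Y$ is convex, and $g(x, y)$ is quasi-concave and upper semicontinuous in $x$ and quasi-convex and lower semicontinuous in $y$, then
$$\max_{x \in X} \, \inf_{y \in Y} g(x, y) \;=\; \inf_{y \in Y} \, \max_{x \in X} g(x, y).$$
This matters here because a ratio of bilinear forms such as $g(p, q) = \frac{p^{\top} A q}{p^{\top} B q}$ (with a positive denominator) is not bilinear, so von Neumann's theorem does not apply directly, but it is quasi-linear in each argument separately, which is exactly the hypothesis Sion's theorem requires; the specific cost and score functionals used in the paper are not reproduced here.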
Second, we introduce a new way to analyze low-bias randomized algorithms by
viewing them as "forecasting algorithms" evaluated by a proper scoring rule.
The expected score of the forecasting version of a randomized algorithm appears
to be a more fine-grained way of analyzing the bias of the algorithm. We show
that such expected scores have many elegant mathematical properties: for
example, they can be amplified linearly instead of quadratically. We anticipate
forecasting algorithms will find use in future work in which a fine-grained
analysis of small-bias algorithms is required.
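As a toy illustration of the forecasting viewpoint (a minimal sketch, not the paper's construction: the Brier/quadratic rule stands in for whatever proper scoring rule the paper uses, and brier_score, as_forecaster, noisy_parity, and gamma are hypothetical names for this example), a randomized Boolean algorithm can be repackaged as a forecaster that reports a probability that the answer is 1, and that forecast can then be evaluated by a proper scoring rule:

import random

def brier_score(forecast: float, outcome: int) -> float:
    """Quadratic (Brier) proper scoring rule, affinely rescaled so that
    the uninformative forecast 1/2 scores 0 and a confident correct
    forecast (0 or 1) scores 1.  Higher is better."""
    return 1.0 - 4.0 * (forecast - outcome) ** 2

def as_forecaster(randomized_alg, x, runs: int = 1) -> float:
    """View a randomized 0/1-valued algorithm as a forecasting algorithm:
    run it `runs` times and report the empirical probability of output 1."""
    return sum(randomized_alg(x) for _ in range(runs)) / runs

def noisy_parity(x, gamma: float = 0.05) -> int:
    """Hypothetical low-bias algorithm: outputs the parity of x with
    probability 1/2 + gamma, and the wrong answer otherwise."""
    truth = sum(x) % 2
    return truth if random.random() < 0.5 + gamma else 1 - truth

if __name__ == "__main__":
    x = (1, 0, 1, 1)
    truth = sum(x) % 2
    forecast = as_forecaster(noisy_parity, x, runs=200)
    print(f"forecast = {forecast:.2f}, score = {brier_score(forecast, truth):+.3f}")

The abstract's amplification claim concerns the particular scoring rule analyzed in the paper, under which expected scores amplify linearly in the number of independent runs; that property is not demonstrated by this sketch.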
Related papers
- Binary Search with Distributional Predictions [26.193119064206304]
We study algorithms with distributional predictions, where the prediction itself is a distribution.
We show that there are simple distributions for which the classical prediction-based algorithm performs poorly when given any single prediction.
This also yields the first distributionally-robust algorithm for the classical problem of computing an optimal binary search tree.
arXiv Detail & Related papers (2024-11-25T01:15:31Z)
- Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery [45.601316850669406]
We present efficient algorithms for policy evaluation, best policy identification and regret minimization.
For policy evaluation and best policy identification, we show that our algorithms are nearly minimax optimal.
All the proposed algorithms consist of two phases: they first leverage spectral methods to estimate the left and right singular subspaces of the low-rank reward matrix.
arXiv Detail & Related papers (2024-02-24T06:36:08Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z)
- Private estimation algorithms for stochastic block models and mixture models [63.07482515700984]
We provide general tools for designing efficient private estimation algorithms.
We give the first efficient $(\epsilon, \delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z)
- Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z)
- Robust Sparse Mean Estimation via Sum of Squares [42.526664955704746]
We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers.
Our algorithms follow the Sum-of-Squares based proofs-to-algorithms approach.
arXiv Detail & Related papers (2022-06-07T16:49:54Z)
- Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper bounds can be extended to two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z)
- Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
- Optimal $\delta$-Correct Best-Arm Selection for Heavy-Tailed Distributions [2.2940141855172036]
We consider the problem of identifying the distribution with the maximum mean using a $\delta$-correct algorithm.
Lower bounds for $\delta$-correct algorithms are well known.
We propose a $\delta$-correct algorithm that matches the lower bound as $\delta$ reduces to zero.
arXiv Detail & Related papers (2019-08-24T05:31:49Z)