Best-item Learning in Random Utility Models with Subset Choices
- URL: http://arxiv.org/abs/2002.07994v1
- Date: Wed, 19 Feb 2020 03:57:16 GMT
- Title: Best-item Learning in Random Utility Models with Subset Choices
- Authors: Aadirupa Saha and Aditya Gopalan
- Abstract summary: We consider the problem of PAC learning the most valuable item from a pool of $n$ items using sequential, adaptively chosen plays of subsets of $k$ items.
We identify a new property of such a RUM, termed the minimum advantage, that helps in characterizing the complexity of separating pairs of items.
We give a learning algorithm for general RUMs, based on pairwise relative counts of items and hierarchical elimination, along with a new PAC sample complexity guarantee.
- Score: 40.17224226373741
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of PAC learning the most valuable item from a pool of
$n$ items using sequential, adaptively chosen plays of subsets of $k$ items,
when, upon playing a subset, the learner receives relative feedback sampled
according to a general Random Utility Model (RUM) with independent noise
perturbations to the latent item utilities. We identify a new property of such
a RUM, termed the minimum advantage, that helps in characterizing the
complexity of separating pairs of items based on their relative win/loss
empirical counts, and can be bounded as a function of the noise distribution
alone. We give a learning algorithm for general RUMs, based on pairwise
relative counts of items and hierarchical elimination, along with a new PAC
sample complexity guarantee of $O(\frac{n}{c^2\epsilon^2} \log
\frac{k}{\delta})$ rounds to identify an $\epsilon$-optimal item with
confidence $1-\delta$, when the worst case pairwise advantage in the RUM has
sensitivity at least $c$ to the parameter gaps of items. Fundamental lower
bounds on PAC sample complexity show that this is near-optimal in terms of its
dependence on $n,k$ and $c$.
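To make the approach concrete, below is a minimal Python sketch of subset-play-and-eliminate learning under a simulated RUM. This is not the paper's algorithm: the names `mnl_winner`, `pac_best_item`, and `rounds_per_group` are illustrative, the feedback oracle assumes a multinomial-logit (Plackett-Luce) model purely for simulation, the per-group budget only mimics the stated $O(\frac{1}{c^2\epsilon^2} \log \frac{k}{\delta})$ rate without the paper's constants or the advantage parameter $c$, and it tracks per-item win totals rather than the full pairwise relative counts used in the paper.

```python
import math
import random
from typing import Callable, Dict, List, Optional, Sequence


def mnl_winner(subset: Sequence[int], theta: Sequence[float]) -> int:
    """Simulated feedback oracle: draws the winner of a played subset under a
    multinomial-logit (Plackett-Luce) RUM, one concrete instance of the general
    RUM with independent noise assumed in the abstract."""
    weights = [math.exp(theta[i]) for i in subset]
    r, acc = random.random() * sum(weights), 0.0
    for item, w in zip(subset, weights):
        acc += w
        if r <= acc:
            return item
    return subset[-1]


def pac_best_item(n: int, k: int, eps: float, delta: float,
                  play: Callable[[Sequence[int]], int],
                  rounds_per_group: Optional[int] = None) -> int:
    """Hierarchical elimination driven by empirical win counts.

    `rounds_per_group` stands in for a per-phase budget of order
    (1 / eps^2) * log(k / delta); the paper's constants and its advantage
    parameter c are not reproduced here.
    """
    assert k >= 2, "need subsets of at least two items"
    if rounds_per_group is None:
        rounds_per_group = int(4.0 / eps ** 2 * math.log(max(k, 2) / delta)) + 1

    alive: List[int] = list(range(n))
    while len(alive) > 1:
        survivors: List[int] = []
        for g in range(0, len(alive), k):          # split survivors into groups of <= k
            group = alive[g:g + k]
            if len(group) == 1:                    # lone item advances unchallenged
                survivors.extend(group)
                continue
            wins: Dict[int, int] = {i: 0 for i in group}
            for _ in range(rounds_per_group):      # play the same subset repeatedly
                wins[play(group)] += 1
            survivors.append(max(group, key=wins.__getitem__))  # best empirical record advances
        alive = survivors
    return alive[0]


# Example: 20 items with latent utilities 0.0, 0.1, ..., 1.9 (item 19 is best).
theta = [0.1 * i for i in range(20)]
best = pac_best_item(n=20, k=4, eps=0.1, delta=0.05,
                     play=lambda s: mnl_winner(s, theta))
print("estimated best item:", best)
```

Each phase plays every surviving group of at most $k$ items for a fixed budget and promotes the item with the best empirical record, so the total number of plays grows linearly in $n$, in line with the flavor of the abstract's guarantee.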
Related papers
- Information-Computation Tradeoffs for Learning Margin Halfspaces with
Random Classification Noise [50.64137465792738]
We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise.
We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms.
arXiv Detail & Related papers (2023-06-28T16:33:39Z) - On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z) - Sample Complexity Bounds for Robustly Learning Decision Lists against
Evasion Attacks [25.832511407411637]
A fundamental problem in adversarial machine learning is to quantify how much training data is needed in the presence of evasion attacks.
We work with probability distributions on the input data that satisfy a Lipschitz condition: nearby points have similar probability.
For every fixed $k$, the class of $k$-decision lists has polynomial sample complexity against a $\log(n)$-bounded adversary.
arXiv Detail & Related papers (2022-05-12T14:40:18Z) - Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called under-bagging $k$-NN for imbalanced classification problems.
arXiv Detail & Related papers (2021-09-01T14:10:38Z) - An Online Riemannian PCA for Stochastic Canonical Correlation Analysis [37.8212762083567]
We present an efficient algorithm (RSG+) for canonical correlation analysis (CCA) using a reparametrization of the projection matrices.
While the paper primarily focuses on the formulation and technical analysis of its properties, our experiments show that the empirical behavior on common datasets is quite promising.
arXiv Detail & Related papers (2021-06-08T23:38:29Z) - The Sample Complexity of Best-$k$ Items Selection from Pairwise
Comparisons [33.66975831989357]
We study the sample complexity (aka number of comparisons) bounds for the active best-$k$ items selection from pairwise comparisons.
At any time, the learner can adaptively choose a pair of items to compare according to past observations.
We propose two algorithms based on our PAC best items selection algorithms.
arXiv Detail & Related papers (2020-07-06T23:53:09Z) - Sampling from a $k$-DPP without looking at all items [58.30573872035083]
Given a kernel function and a subset size $k$, our goal is to sample $k$ out of $n$ items with probability proportional to the determinant of the kernel matrix induced by the subset (a.k.a. $k$-DPP)
Existing $k$-DPP sampling algorithms require an expensive preprocessing step which involves multiple passes over all $n$ items, making it infeasible for large datasets.
We develop an algorithm which adaptively builds a sufficiently large uniform sample of data that is then used to efficiently generate a smaller set of $k$ items.
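As a point of reference for the target distribution described above, here is a hedged Python sketch of a generic Metropolis swap-chain $k$-DPP sampler; it is not the adaptive algorithm of the cited paper (which avoids preprocessing all $n$ items), and the function name and step budget are illustrative only.

```python
import numpy as np
from typing import List, Optional


def kdpp_mcmc_sample(L: np.ndarray, k: int, steps: int = 5000,
                     rng: Optional[np.random.Generator] = None) -> List[int]:
    """Metropolis swap chain whose stationary law is the k-DPP induced by the
    PSD kernel L: P(S) proportional to det(L[S, S]) over size-k subsets S.
    Generic baseline sketch, not the cited paper's adaptive sampler."""
    rng = rng or np.random.default_rng()
    n = L.shape[0]
    subset = list(rng.choice(n, size=k, replace=False))
    det_cur = np.linalg.det(L[np.ix_(subset, subset)])
    for _ in range(steps):
        out_pos = int(rng.integers(k))                 # position to swap out
        candidates = np.setdiff1d(np.arange(n), subset)
        new_item = int(rng.choice(candidates))         # item to swap in
        proposal = list(subset)
        proposal[out_pos] = new_item
        det_new = np.linalg.det(L[np.ix_(proposal, proposal)])
        # Metropolis acceptance for the symmetric swap proposal:
        # accept with probability min(1, det(L_S') / det(L_S)).
        if det_cur <= 0 or rng.random() < min(1.0, det_new / det_cur):
            subset, det_cur = proposal, det_new
    return sorted(subset)
```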
arXiv Detail & Related papers (2020-06-30T16:40:44Z) - Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.