Query Complexity of k-NN based Mode Estimation
- URL: http://arxiv.org/abs/2010.13491v1
- Date: Mon, 26 Oct 2020 11:32:08 GMT
- Title: Query Complexity of k-NN based Mode Estimation
- Authors: Anirudh Singhal, Subham Pirojiwala and Nikhil Karamchandani
- Abstract summary: We study the problem of identifying the point with the minimum k-th nearest neighbor distance for a given dataset of n points.
We design a sequential learning algorithm, based on the idea of confidence intervals, which adaptively decides which queries to send to the oracle and is able to correctly solve the problem with high probability.
- Score: 4.821253146561989
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by the mode estimation problem of an unknown multivariate
probability density function, we study the problem of identifying the point
with the minimum k-th nearest neighbor distance for a given dataset of n
points. We study the case where the pairwise distances are apriori unknown, but
we have access to an oracle which we can query to get noisy information about
the distance between any pair of points. For two natural oracle models, we
design a sequential learning algorithm, based on the idea of confidence
intervals, which adaptively decides which queries to send to the oracle and is
able to correctly solve the problem with high probability. We derive
instance-dependent upper bounds on the query complexity of our proposed scheme
and also demonstrate significant improvement over the performance of other
baselines via extensive numerical evaluations.
Related papers
- Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We find a generalization-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain a that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - How to Design Robust Algorithms using Noisy Comparison Oracle [12.353002222958605]
Metric based comparison operations are fundamental to studying various clustering techniques.
In this paper, we study various problems that include finding maximum, nearest/farthest neighbor search.
We give robust algorithms for k-center clustering and agglomerative hierarchical clustering.
arXiv Detail & Related papers (2021-05-12T16:58:09Z) - Nearest Neighbor Search Under Uncertainty [19.225091554227948]
Nearest Neighbor Search (NNS) is a central task in knowledge representation, learning, and reasoning.
This paper studies NNS under Uncertainty (NNSU)
arXiv Detail & Related papers (2021-03-08T20:20:01Z) - ThetA -- fast and robust clustering via a distance parameter [3.0020405188885815]
Clustering is a fundamental problem in machine learning where distance-based approaches have dominated the field for many decades.
We propose a new set of distance threshold methods called Theta-based Algorithms (ThetA)
arXiv Detail & Related papers (2021-02-13T23:16:33Z) - Leveraging Reinforcement Learning for evaluating Robustness of KNN
Search Algorithms [0.0]
The problem of finding K-nearest neighbors in the given dataset for a given query point has been worked upon since several years.
In this paper, we survey some novel K-Nearest Neighbor Search approaches that tackles the problem of Search from the perspectives of computations.
In order to evaluate the robustness of a KNNS approach against adversarial points, we propose a generic Reinforcement Learning based framework for the same.
arXiv Detail & Related papers (2021-02-10T16:10:58Z) - Greedy k-Center from Noisy Distance Samples [10.363116234985515]
We study a variant of the canonical k-center problem over a set of vertices in a metric space, where the underlying distances are apriori unknown.
We consider two oracle models: Dimension Sampling where each query to the oracle returns the distance between a pair of points in one dimension; and Noisy Distance Sampling where the oracle returns the true distance corrupted by noise.
We propose active algorithms, based on ideas such as UCB, Thompson Sampling and Track-and-Stop developed in the closely related Multi-Armed Bandit problem.
arXiv Detail & Related papers (2020-11-03T19:37:02Z) - Stochastic Saddle-Point Optimization for Wasserstein Barycenters [69.68068088508505]
We consider the populationimation barycenter problem for random probability measures supported on a finite set of points and generated by an online stream of data.
We employ the structure of the problem and obtain a convex-concave saddle-point reformulation of this problem.
In the setting when the distribution of random probability measures is discrete, we propose an optimization algorithm and estimate its complexity.
arXiv Detail & Related papers (2020-06-11T19:40:38Z) - Neural Methods for Point-wise Dependency Estimation [129.93860669802046]
We focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur.
We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) cross-modal retrieval task.
arXiv Detail & Related papers (2020-06-09T23:26:15Z) - High-Dimensional Robust Mean Estimation via Gradient Descent [73.61354272612752]
We show that the problem of robust mean estimation in the presence of a constant adversarial fraction can be solved by gradient descent.
Our work establishes an intriguing connection between the near non-lemma estimation and robust statistics.
arXiv Detail & Related papers (2020-05-04T10:48:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.