Related papers: Greedy k-Center from Noisy Distance Samples

Greedy k-Center from Noisy Distance Samples

URL: http://arxiv.org/abs/2011.01973v3
Date: Mon, 4 Apr 2022 08:03:30 GMT
Title: Greedy k-Center from Noisy Distance Samples
Authors: Neharika Jali, Nikhil Karamchandani, and Sharayu Moharir
Abstract summary: We study a variant of the canonical k-center problem over a set of vertices in a metric space, where the underlying distances are apriori unknown. We consider two oracle models: Dimension Sampling where each query to the oracle returns the distance between a pair of points in one dimension; and Noisy Distance Sampling where the oracle returns the true distance corrupted by noise. We propose active algorithms, based on ideas such as UCB, Thompson Sampling and Track-and-Stop developed in the closely related Multi-Armed Bandit problem.
Score: 10.363116234985515
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study a variant of the canonical k-center problem over a set of vertices in a metric space, where the underlying distances are apriori unknown. Instead, we can query an oracle which provides noisy/incomplete estimates of the distance between any pair of vertices. We consider two oracle models: Dimension Sampling where each query to the oracle returns the distance between a pair of points in one dimension; and Noisy Distance Sampling where the oracle returns the true distance corrupted by noise. We propose active algorithms, based on ideas such as UCB, Thompson Sampling and Track-and-Stop developed in the closely related Multi-Armed Bandit problem, which adaptively decide which queries to send to the oracle and are able to solve the k-center problem within an approximation ratio of two with high probability. We analytically characterize instance-dependent query complexity of our algorithms and also demonstrate significant improvements over naive implementations via numerical evaluations on two real-world datasets (Tiny ImageNet and UT Zappos50K).

Related papers

A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we present a robust phase retrieval problem where the task is to recover an unknown signal. Our proposed oracle avoids the need for computationally spectral descent, using a simple gradient step and outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z)
Query-Efficient Correlation Clustering with Noisy Oracle [17.11782578276788]
We introduce two novel formulations of online learning problems rooted in the paradigm of Pure Exploration in Combinatorial Multi-Armed Bandits (PE-CMAB) We design algorithms that combine a sampling strategy with a classic approximation algorithm for correlation and study their theoretical guarantees. Our results are the first examples of clustering-time algorithms that work for the case of PE-CMAB in which the underlying offline optimization problem is NP-hard.
arXiv Detail & Related papers (2024-02-02T13:31:24Z)
Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms. In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z)
Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We find a generalization-time algorithm that finds the RUM that best approximates the given distribution on average. Our theoretical result can also be made practical: we obtain a that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z)
How to Design Robust Algorithms using Noisy Comparison Oracle [12.353002222958605]
Metric based comparison operations are fundamental to studying various clustering techniques. In this paper, we study various problems that include finding maximum, nearest/farthest neighbor search. We give robust algorithms for k-center clustering and agglomerative hierarchical clustering.
arXiv Detail & Related papers (2021-05-12T16:58:09Z)
Finding Geometric Models by Clustering in the Consensus Space [61.65661010039768]
We propose a new algorithm for finding an unknown number of geometric models, e.g., homographies. We present a number of applications where the use of multiple geometric models improves accuracy. These include pose estimation from multiple generalized homographies; trajectory estimation of fast-moving objects.
arXiv Detail & Related papers (2021-03-25T14:35:07Z)
ThetA -- fast and robust clustering via a distance parameter [3.0020405188885815]
Clustering is a fundamental problem in machine learning where distance-based approaches have dominated the field for many decades. We propose a new set of distance threshold methods called Theta-based Algorithms (ThetA)
arXiv Detail & Related papers (2021-02-13T23:16:33Z)
(k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time Warping [57.316437798033974]
In this work we consider the problem of center-based clustering of trajectories. We propose the usage of a continuous version of DTW as distance measure, which we call continuous dynamic time warping (CDTW) We show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it.
arXiv Detail & Related papers (2020-12-01T13:17:27Z)
Cross-Domain Generalization Through Memorization: A Study of Nearest Neighbors in Neural Duplicate Question Detection [72.01292864036087]
Duplicate question detection (DQD) is important to increase efficiency of community and automatic question answering systems. We leverage neural representations and study nearest neighbors for cross-domain generalization in DQD. We observe robust performance of this method in different cross-domain scenarios of StackExchange, Spring and Quora datasets.
arXiv Detail & Related papers (2020-11-22T19:19:33Z)
Query Complexity of k-NN based Mode Estimation [4.821253146561989]
We study the problem of identifying the point with the minimum k-th nearest neighbor distance for a given dataset of n points. We design a sequential learning algorithm, based on the idea of confidence intervals, which adaptively decides which queries to send to the oracle and is able to correctly solve the problem with high probability.
arXiv Detail & Related papers (2020-10-26T11:32:08Z)
Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design, and has been widely applied into economic markets, labor markets, etc. Existing exact and approximate algorithms usually fail in such settings due to either requiring intolerable running time or too much computation resource. We propose textttNeuSearcher which leverages the knowledge learned from previously instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.