Fast Interactive Search with a Scale-Free Comparison Oracle
- URL: http://arxiv.org/abs/2306.01814v1
- Date: Fri, 2 Jun 2023 09:33:19 GMT
- Title: Fast Interactive Search with a Scale-Free Comparison Oracle
- Authors: Daniyar Chumbalov, Lars Klein, Lucas Maystre, Matthias Grossglauser
- Abstract summary: A comparison-based search algorithm lets a user find a target item $t$ in a database by answering queries of the form.
We propose a scale-free probabilistic oracle model called $gamma$-CKL for such similarity triplets $(i,j;t)$.
We develop a search algorithm with provably exponential rate of convergence under the $gamma$-CKL oracle, thanks to a backtracking strategy.
- Score: 17.38671584773247
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A comparison-based search algorithm lets a user find a target item $t$ in a
database by answering queries of the form, ``Which of items $i$ and $j$ is
closer to $t$?'' Instead of formulating an explicit query (such as one or
several keywords), the user navigates towards the target via a sequence of such
(typically noisy) queries.
We propose a scale-free probabilistic oracle model called $\gamma$-CKL for
such similarity triplets $(i,j;t)$, which generalizes the CKL triplet model
proposed in the literature. The generalization affords independent control over
the discriminating power of the oracle and the dimension of the feature space
containing the items.
We develop a search algorithm with provably exponential rate of convergence
under the $\gamma$-CKL oracle, thanks to a backtracking strategy that deals
with the unavoidable errors in updating the belief region around the target.
We evaluate the performance of the algorithm both over the posited oracle and
over several real-world triplet datasets. We also report on a comprehensive
user study, where human subjects navigate a database of face portraits.
Related papers
- FLARE: Faithful Logic-Aided Reasoning and Exploration [50.9814063216852]
We introduce a novel approach for traversing the problem space using task decompositions.
We use the Large Language Models to plan a solution, soft-formalise the query into facts and predicates using a logic programming code.
Our method allows us to compute the faithfulness of the reasoning process w.r.t. the generated code and analyse the steps of the multi-hop search without relying on external solvers.
arXiv Detail & Related papers (2024-10-14T19:39:11Z) - Accelerated quantum search using partial oracles and Grover's algorithm [0.0]
Grover's algorithm, orginally conceived as a means of searching an unordered database, can also be used to extract solutions from the result sets generated by quantum computations.
We explore the idea of associating a separate oracle with each bit of the matching condition, obtaining multiple partial oracle functions which can be tested independently.
The algorithm is validated against the simplest kind of search scenario, where the incoming index bits are scrambled using a permutation operation.
arXiv Detail & Related papers (2024-03-19T11:32:02Z) - JoinGym: An Efficient Query Optimization Environment for Reinforcement
Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a query optimization environment for bushy reinforcement learning (RL)
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z) - Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We find a generalization-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain a that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z) - How to Query An Oracle? Efficient Strategies to Label Data [59.89900843097016]
We consider the basic problem of querying an expert oracle for labeling a dataset in machine learning.
We present a randomized batch algorithm that operates on a round-by-round basis to label the samples and achieves a query rate of $O(fracNk2)$.
In addition, we present an adaptive greedy query scheme, which achieves an average rate of $approx 0.2N$ queries per sample with triplet queries.
arXiv Detail & Related papers (2021-10-05T20:15:35Z) - Towards a Query-Optimal and Time-Efficient Algorithm for Clustering with
a Faulty Oracle [7.449644976563424]
We propose an elegant theoretical model for studying clustering with a faulty oracle.
It was left as an open question whether one can obtain a query-optimal, time-efficient algorithm for the general case of $k$ clusters.
We provide a time-efficient algorithm with nearly-optimal query complexity (up to a factor of $O(log2 n)$) for all constant $k$ and any $delta$ in the regime when information-theoretic recovery is possible.
arXiv Detail & Related papers (2021-06-18T22:20:12Z) - Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z) - Query-Efficient Correlation Clustering [13.085439249887713]
Correlation clustering is arguably the most natural formulation of clustering.
A main drawback of correlation clustering is that it requires as input the $Theta(n2)$ pairwise similarities.
We devise a correlation clustering algorithm that attains a solution whose expected number of disagreements is at most $3cdot OPT + O(fracn3Q)$.
arXiv Detail & Related papers (2020-02-26T15:18:20Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.