Related papers: Binary Classification with XOR Queries: Fundamental Limits and An Efficient Algorithm

Binary Classification with XOR Queries: Fundamental Limits and An Efficient Algorithm

URL: http://arxiv.org/abs/2001.11775v2
Date: Fri, 30 Apr 2021 05:39:42 GMT
Title: Binary Classification with XOR Queries: Fundamental Limits and An Efficient Algorithm
Authors: Daesung Kim and Hye Won Chung
Abstract summary: We consider a query-based data acquisition problem for binary classification of unknown labels. We consider an effective query type that asks "group attribute" of a chosen subset of objects. We propose an efficient inference algorithm that achieves this limit even when the noise parameters are unknown.
Score: 15.127728811011245
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider a query-based data acquisition problem for binary classification of unknown labels, which has diverse applications in communications, crowdsourcing, recommender systems and active learning. To ensure reliable recovery of unknown labels with as few number of queries as possible, we consider an effective query type that asks "group attribute" of a chosen subset of objects. In particular, we consider the problem of classifying $m$ binary labels with XOR queries that ask whether the number of objects having a given attribute in the chosen subset of size $d$ is even or odd. The subset size $d$, which we call query degree, can be varying over queries. We consider a general noise model where the accuracy of answers on queries changes depending both on the worker (the data provider) and query degree $d$. For this general model, we characterize the information-theoretic limit on the optimal number of queries to reliably recover $m$ labels in terms of a given combination of degree-$d$ queries and noise parameters. Further, we propose an efficient inference algorithm that achieves this limit even when the noise parameters are unknown.

Related papers

Envy-Free Allocation of Indivisible Goods via Noisy Queries [66.16311857301167]
We introduce a problem of fairly allocating indivisible goods in which the agents' valuations cannot be observed directly.<n>We derive upper and lower bounds on the required number of queries for finding an envy-free allocation.<n>Our upper bound is based on non-adaptive queries and a simple thresholding-based allocation algorithm that runs in computation time.
arXiv Detail & Related papers (2026-02-06T03:44:40Z)
Computing Voting Rules with Elicited Incomplete Votes [11.550634521005968]
We study voting rules that are computable by querying voters about $t m$ candidates. For scoring rules that are computable with limited-sized queries, we give parameterized upper and lower bounds on the number of such queries.
arXiv Detail & Related papers (2024-02-16T22:17:01Z)
Query2Particles: Knowledge Graph Reasoning with Particle Embeddings [49.64006979045662]
We propose a query embedding method to answer complex logical queries on knowledge graphs with missing edges. The answer entities are selected according to the similarities between the entity embeddings and the query embedding. A complex KG query answering method, Q2P, is proposed to retrieve diverse answers from different areas over the embedding space.
arXiv Detail & Related papers (2022-04-27T11:16:08Z)
Knowledge Base Question Answering by Case-based Reasoning over Subgraphs [81.22050011503933]
We show that our model answers queries requiring complex reasoning patterns more effectively than existing KG completion algorithms. The proposed model outperforms or performs competitively with state-of-the-art models on several KBQA benchmarks.
arXiv Detail & Related papers (2022-02-22T01:34:35Z)
Active clustering for labeling training data [0.8029049649310211]
We propose a setting for training data gathering where the human experts perform the comparatively cheap task of answering pairwise queries. We analyze the algorithms that minimize the average number of queries required to cluster the items and analyze their complexity.
arXiv Detail & Related papers (2021-10-27T15:35:58Z)
How to Query An Oracle? Efficient Strategies to Label Data [59.89900843097016]
We consider the basic problem of querying an expert oracle for labeling a dataset in machine learning. We present a randomized batch algorithm that operates on a round-by-round basis to label the samples and achieves a query rate of $O(fracNk2)$. In addition, we present an adaptive greedy query scheme, which achieves an average rate of $approx 0.2N$ queries per sample with triplet queries.
arXiv Detail & Related papers (2021-10-05T20:15:35Z)
Online Learning of Optimally Diverse Rankings [63.62764375279861]
We propose an algorithm that efficiently learns the optimal list based on users' feedback only. We show that after $T$ queries, the regret of LDR scales as $O((N-L)log(T))$ where $N$ is the number of all items.
arXiv Detail & Related papers (2021-09-13T12:13:20Z)
Active Covering [37.525977525895605]
We analyze the problem of active covering, where the learner is given an unlabeled dataset and can sequentially label query examples. The objective is to label query all of the positive examples in the fewest number of total label queries.
arXiv Detail & Related papers (2021-06-04T15:32:39Z)
Query2box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings [84.0206612938464]
query2box is an embedding-based framework for reasoning over arbitrary queries on incomplete knowledge graphs. We show that query2box is capable of handling arbitrary logical queries with $wedge$, $vee$, $exists$ in a scalable manner. We demonstrate the effectiveness of query2box on three large KGs and show that query2box achieves up to 25% relative improvement over the state of the art.
arXiv Detail & Related papers (2020-02-14T11:20:10Z)
Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback. We devise an algorithm with a minimal cluster recovery error rate. For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.