HITSnDIFFs: From Truth Discovery to Ability Discovery by Recovering
Matrices with the Consecutive Ones Property
- URL: http://arxiv.org/abs/2401.00013v1
- Date: Thu, 21 Dec 2023 18:47:17 GMT
- Title: HITSnDIFFs: From Truth Discovery to Ability Discovery by Recovering
Matrices with the Consecutive Ones Property
- Authors: Zixuan Chen, Subhodeep Mitra, R Ravi, Wolfgang Gatterbauer
- Abstract summary: We analyze a general problem in a crowd-sourced setting where one user asks a question (also called item) and other users return answers (also called labels) for this question.
We call this problem "ability discovery" to emphasize the connection to and duality with the more well-studied problem of "truth discovery"
Our experiments show that our novel variant of HITS produces user rankings with robustly high accuracy compared to state-of-the-art truth discovery methods.
- Score: 11.332742187228524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyze a general problem in a crowd-sourced setting where one user asks a
question (also called item) and other users return answers (also called labels)
for this question. Different from existing crowd sourcing work which focuses on
finding the most appropriate label for the question (the "truth"), our problem
is to determine a ranking of the users based on their ability to answer
questions. We call this problem "ability discovery" to emphasize the connection
to and duality with the more well-studied problem of "truth discovery".
To model items and their labels in a principled way, we draw upon Item
Response Theory (IRT) which is the widely accepted theory behind standardized
tests such as SAT and GRE. We start from an idealized setting where the
relative performance of users is consistent across items and better users
choose better fitting labels for each item. We posit that a principled
algorithmic solution to our more general problem should solve this ideal
setting correctly and observe that the response matrices in this setting obey
the Consecutive Ones Property (C1P). While C1P is well understood
algorithmically with various discrete algorithms, we devise a novel variant of
the HITS algorithm which we call "HITSNDIFFS" (or HND), and prove that it can
recover the ideal C1P-permutation in case it exists. Unlike fast combinatorial
algorithms for finding the consecutive ones permutation (if it exists), HND
also returns an ordering when such a permutation does not exist. Thus it
provides a principled heuristic for our problem that is guaranteed to return
the correct answer in the ideal setting. Our experiments show that HND produces
user rankings with robustly high accuracy compared to state-of-the-art truth
discovery methods. We also show that our novel variant of HITS scales better in
the number of users than ABH, the only prior spectral C1P reconstruction
algorithm.
Related papers
- Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models [70.07661254213181]
We propose two principled algorithms for the test-time compute of large language models.
We prove theoretically that the failure probability of one algorithm decays to zero exponentially as its test-time compute grows.
arXiv Detail & Related papers (2024-11-29T05:29:47Z) - Dual-Directed Algorithm Design for Efficient Pure Exploration [9.728332815218181]
We consider pure-exploration problems in the context of sequential adaptive experiments with a finite set of alternatives.
We formulate the problem complexity measure as a maximin optimization problem for the static fixed-budget, fixed-confidence, and posterior convergence rate settings.
Our algorithm attains optimality in $varepsilon$-best-arm identification (or ranking and selection with a probability of good selection guarantee) and thresholding bandits.
arXiv Detail & Related papers (2023-10-30T07:29:17Z) - Improved theoretical guarantee for rank aggregation via spectral method [1.0152838128195467]
Given pairwise comparisons between multiple items, how to rank them so that the ranking matches the observations?
This problem, known as rank aggregation, has found many applications in sports, recommendation systems, and other web applications.
Here, each pairwise comparison is a corrupted copy of the true score difference.
arXiv Detail & Related papers (2023-09-07T16:01:47Z) - Learning to Select and Rank from Choice-Based Feedback: A Simple Nested Approach [10.293894471295205]
We study a ranking and selection problem of learning from choice-based feedback with dynamic assortments.
We present novel and simple algorithms for both learning goals.
arXiv Detail & Related papers (2023-07-13T05:05:30Z) - HARRIS: Hybrid Ranking and Regression Forests for Algorithm Selection [75.84584400866254]
We propose a new algorithm selector leveraging special forests, combining the strengths of both approaches while alleviating their weaknesses.
HARRIS' decisions are based on a forest model, whose trees are created based on optimized on a hybrid ranking and regression loss function.
arXiv Detail & Related papers (2022-10-31T14:06:11Z) - Learning to Hash Robustly, with Guarantees [79.68057056103014]
In this paper, we design an NNS algorithm for the Hamming space that has worst-case guarantees essentially matching that of theoretical algorithms.
We evaluate the algorithm's ability to optimize for a given dataset both theoretically and practically.
Our algorithm has a 1.8x and 2.1x better recall on the worst-performing queries to the MNIST and ImageNet datasets.
arXiv Detail & Related papers (2021-08-11T20:21:30Z) - Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that our algorithm achieves for any k an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z) - Performance Analysis of Meta-heuristic Algorithms for a Quadratic
Assignment Problem [6.555180412600522]
A quadratic assignment problem (QAP) is an optimization problem that belongs to the class of NP-hard ones.
Heuristics and meta-heuristics algorithm are prevalent solution methods for this problem.
This paper is one of comparative studies to apply different metaheuristic algorithms for solving the QAP.
arXiv Detail & Related papers (2020-07-29T15:02:07Z) - Corruption-Tolerant Gaussian Process Bandit Optimization [130.60115798580136]
We consider the problem of optimizing an unknown (typically non-producing) function with a bounded norm.
We introduce an algorithm based on Fast-Slow GP-UCB based on "fast but non-robust" and "slow"
We argue that certain dependencies cannot be required depending on the corruption level.
arXiv Detail & Related papers (2020-03-04T09:46:58Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.