Certified Error Control of Candidate Set Pruning for Two-Stage Relevance
Ranking
- URL: http://arxiv.org/abs/2205.09638v1
- Date: Thu, 19 May 2022 16:00:13 GMT
- Title: Certified Error Control of Candidate Set Pruning for Two-Stage Relevance
Ranking
- Authors: Minghan Li, Xinyu Zhang, Ji Xin, Hongyang Zhang, Jimmy Lin
- Abstract summary: We propose the concept of certified error control of candidate set pruning for relevance ranking.
Our method successfully prunes the first-stage retrieved candidate sets to improve the second-stage reranking speed.
- Score: 57.42241521034744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In information retrieval (IR), candidate set pruning is commonly used
to speed up two-stage relevance ranking. However, this approach lacks
accurate error control: it typically trades accuracy for computational
efficiency empirically, without theoretical guarantees. In this
paper, we propose the concept of certified error control of candidate set
pruning for relevance ranking, which means that the test error after pruning is
guaranteed to be controlled under a user-specified threshold with high
probability. Both in-domain and out-of-domain experiments show that our method
successfully prunes the first-stage retrieved candidate sets to improve the
second-stage reranking speed while satisfying the pre-specified accuracy
constraints in both settings. For example, on MS MARCO Passage v1, our method
yields an average candidate set size of 27 out of 1,000, which increases the
reranking speed by about 37 times, while the MRR@10 stays above a
pre-specified value of 0.38 with about 90% empirical coverage; the empirical
baselines fail to provide such a guarantee. Code and data are available at:
https://github.com/alexlimh/CEC-Ranking.
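To make the idea of a "certified" pruning cutoff concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes a held-out calibration set of queries and a simple 0/1 loss (the pruned top-k list misses the first relevant candidate) rather than MRR@10. It uses a Hoeffding bound with a union bound over a fixed grid of cutoffs. All names (`hoeffding_ucb`, `certify_cutoff`, `first_rel_ranks`) are hypothetical.

```python
import math


def hoeffding_ucb(mean_loss, n, delta):
    """Upper confidence bound on the true mean of a [0, 1]-bounded loss.

    By Hoeffding's inequality the bound holds with probability >= 1 - delta.
    """
    return mean_loss + math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def certify_cutoff(first_rel_ranks, cutoffs, alpha, delta):
    """Return the smallest cutoff k whose miss rate is certified <= alpha.

    first_rel_ranks: per calibration query, the 1-based first-stage rank of
        the first relevant candidate, or None if none was retrieved.
    cutoffs: candidate pruning sizes to consider (assumed fixed in advance).
    alpha: user-specified error threshold.
    delta: allowed failure probability, split evenly across the cutoffs
        (union bound) so the guarantee holds for whichever one is selected.
    """
    n = len(first_rel_ranks)
    per_cutoff_delta = delta / len(cutoffs)
    for k in sorted(cutoffs):
        # A query is a "miss" if pruning to the top k drops its relevant hit.
        misses = sum(1 for r in first_rel_ranks if r is None or r > k)
        if hoeffding_ucb(misses / n, n, per_cutoff_delta) <= alpha:
            return k
    return None  # no cutoff can be certified; fall back to no pruning


if __name__ == "__main__":
    # Synthetic calibration data: every query has its relevant hit at rank 5.
    ranks = [5] * 2000
    print(certify_cutoff(ranks, [1, 5, 10, 100], alpha=0.1, delta=0.1))
```

With few calibration queries the confidence term dominates and the function returns `None`, which shows the trade-off between calibration set size and how aggressively the candidate set can be pruned under a guarantee.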
Related papers
- Optimizing Metamorphic Testing: Prioritizing Relations Through Execution Profile Dissimilarity [2.6749261270690434]
An oracle determines whether the output of a program for executed test cases is correct.
For machine learning programs, such an oracle is often unavailable or impractical to apply.
Prioritizing metamorphic relations (MRs) enhances fault detection effectiveness and improves testing efficiency.
arXiv Detail & Related papers (2024-11-14T04:14:30Z)
- A Self-boosted Framework for Calibrated Ranking [7.4291851609176645]
Calibrated Ranking is a scale-calibrated ranking system that pursues accurate ranking quality and calibrated probabilistic predictions simultaneously.
Previous methods need to aggregate the full candidate list within a single mini-batch to compute the ranking loss.
We propose a Self-Boosted framework for Calibrated Ranking (SBCR).
arXiv Detail & Related papers (2024-06-12T09:00:49Z)
- Early Time Classification with Accumulated Accuracy Gap Control [34.77841988415891]
Early time classification algorithms aim to label a stream of features without processing the full input stream.
We introduce a statistical framework that can be applied to any sequential classifier, formulating a calibrated stopping rule.
We show that our proposed early stopping mechanism reduces up to 94% of timesteps used for classification while achieving rigorous accuracy gap control.
arXiv Detail & Related papers (2024-02-01T18:54:34Z)
- Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z)
- (Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy [8.010528849585937]
We derive an (almost) guaranteed upper bound on the error of deep neural networks under distribution shift using unlabeled test data.
In particular, our bound requires a simple, intuitive condition which is well justified by prior empirical works.
We expect this loss can serve as a drop-in replacement for future methods which require maximizing multiclass disagreement.
arXiv Detail & Related papers (2023-06-01T03:22:15Z)
- Accurate and Reliable Methods for 5G UAV Jamming Identification With Calibrated Uncertainty [3.4208659698673127]
Only increasing accuracy without considering uncertainty may negatively impact Deep Neural Network (DNN) decision-making.
This paper proposes five combined preprocessing and post-processing methods for time-series binary classification problems.
arXiv Detail & Related papers (2022-11-05T15:04:45Z)
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A post-hoc approach to compensate for neural networks being wrong is to perform temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
- Input-Specific Robustness Certification for Randomized Smoothing [76.76115360719837]
We propose Input-Specific Sampling (ISS) acceleration to achieve cost-effectiveness for robustness certification.
ISS can speed up the certification by more than three times at a limited cost of 0.05 certified radius.
arXiv Detail & Related papers (2021-12-21T12:16:03Z)
- Distribution-free uncertainty quantification for classification under label shift [105.27463615756733]
We focus on uncertainty quantification (UQ) for classification problems via two avenues.
We first argue that label shift hurts UQ, by showing degradation in coverage and calibration.
We examine these techniques theoretically in a distribution-free framework and demonstrate their excellent practical performance.
arXiv Detail & Related papers (2021-03-04T20:51:03Z)
- Privacy Preserving Recalibration under Domain Shift [119.21243107946555]
We introduce a framework that abstracts out the properties of recalibration problems under differential privacy constraints.
We also design a novel recalibration algorithm, accuracy temperature scaling, that outperforms prior work on private datasets.
arXiv Detail & Related papers (2020-08-21T18:43:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.