Optimizing Partial Area Under the Top-k Curve: Theory and Practice
- URL: http://arxiv.org/abs/2209.01398v1
- Date: Sat, 3 Sep 2022 11:09:13 GMT
- Title: Optimizing Partial Area Under the Top-k Curve: Theory and Practice
- Authors: Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang
- Abstract summary: We develop a novel metric named partial Area Under the top-k Curve (AUTKC).
AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-K ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
- Score: 151.5072746015253
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Top-k error has become a popular metric for large-scale classification
benchmarks due to the inevitable semantic ambiguity among classes. Existing
literature on top-k optimization generally focuses on the optimization method
of the top-k objective, while ignoring the limitations of the metric itself. In
this paper, we point out that the top-k objective lacks enough discrimination
such that the induced predictions may give a totally irrelevant label a top
rank. To fix this issue, we develop a novel metric named partial Area Under the
top-k Curve (AUTKC). Theoretical analysis shows that AUTKC has a better
discrimination ability, and its Bayes optimal score function could give a
correct top-K ranking with respect to the conditional probability. This shows
that AUTKC does not allow irrelevant labels to appear in the top list.
Furthermore, we present an empirical surrogate risk minimization framework to
optimize the proposed metric. Theoretically, we present (1) a sufficient
condition for Fisher consistency of the Bayes optimal score function; (2) a
generalization upper bound which is insensitive to the number of classes under
a simple hyperparameter setting. Finally, the experimental results on four
benchmark datasets validate the effectiveness of our proposed framework.
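As a rough illustration of the metric (not the paper's formal definition), AUTKC can be read as an average of top-j hit rates for j = 1..K; the `autkc` helper below is an assumed sketch along those lines, with all names and the toy data being our own:

```python
import numpy as np

def topj_hit(scores, label, j):
    """1 if the true label is among the j highest-scoring classes."""
    return int(label in np.argsort(scores)[::-1][:j])

def autkc(score_matrix, labels, K):
    """Rough reading of AUTKC: the average of top-j accuracies
    for j = 1..K (an illustration, not the paper's exact estimator)."""
    topj_accs = [
        np.mean([topj_hit(s, y, j) for s, y in zip(score_matrix, labels)])
        for j in range(1, K + 1)
    ]
    return float(np.mean(topj_accs))

# toy example: 3 samples, 4 classes
scores = np.array([[0.5, 0.3, 0.15, 0.05],
                   [0.05, 0.6, 0.25, 0.1],
                   [0.4, 0.3, 0.2, 0.1]])
labels = [0, 2, 3]
print(autkc(scores, labels, K=2))  # 0.5
```

Under this reading, a score function that puts an irrelevant label at rank 1 is penalized at every j, which matches the discrimination argument made in the abstract.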
Related papers
- Lower-Left Partial AUC: An Effective and Efficient Optimization Metric for Recommendation [52.45394284415614]
We propose a new optimization metric, Lower-Left Partial AUC (LLPAUC), which is computationally efficient like AUC but strongly correlates with Top-K ranking metrics.
LLPAUC considers only the partial area under the ROC curve in the Lower-Left corner to push the optimization focus on Top-K.
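A minimal empirical sketch of the lower-left restriction, assuming a simple step-function ROC estimate (the function name, the clipping strategy, and the alpha * beta normalization are our choices, not necessarily the paper's):

```python
import numpy as np

def llpauc(scores, labels, alpha, beta):
    """Illustrative empirical Lower-Left Partial AUC: area under the
    ROC curve inside the box FPR <= alpha, TPR <= beta."""
    order = np.argsort(scores)[::-1]          # rank by decreasing score
    y = np.asarray(labels)[order]
    P, N = y.sum(), len(y) - y.sum()
    tpr = np.concatenate([[0.0], np.cumsum(y) / P])
    fpr = np.concatenate([[0.0], np.cumsum(1 - y) / N])
    fpr_c = np.minimum(fpr, alpha)            # ignore the high-FPR region
    tpr_c = np.minimum(tpr, beta)             # cap credit at TPR = beta
    area = np.sum((fpr_c[1:] - fpr_c[:-1]) * (tpr_c[1:] + tpr_c[:-1]) / 2)
    return float(area / (alpha * beta))       # 1.0 for a perfect ranking
```

Because only the lower-left corner counts, a model earns credit exactly where the Top-K portion of the ranking lives, which is the intuition stated above.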
arXiv Detail & Related papers (2024-02-29T13:58:33Z)
- Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems [33.46891569350896]
Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems.
Previous works on learning-to-rank usually focus on letting the model learn the complete order or top-k order.
We name this method the Adaptive Neural Ranking Framework (ARF).
arXiv Detail & Related papers (2023-10-16T14:43:02Z)
- AUC-based Selective Classification [5.406386303264086]
We propose a model-agnostic approach to associate a selection function to a given binary classifier.
We provide both theoretical justifications and a novel algorithm, called $AUCross$, to achieve such a goal.
Experiments show that $AUCross$ succeeds in trading off coverage for AUC, improving over existing selective classification methods targeted at optimizing accuracy.
arXiv Detail & Related papers (2022-10-19T16:29:50Z)
- Joint Optimization of Ranking and Calibration with Contextualized Hybrid Model [24.66016187602343]
We propose an approach that can jointly optimize the Ranking and Calibration abilities (JRC for short).
JRC improves the ranking ability by contrasting the logit value for the sample with different labels and constrains the predicted probability to be a function of the logit subtraction.
JRC has been deployed on the display advertising platform of Alibaba and has obtained significant performance improvements.
arXiv Detail & Related papers (2022-08-12T08:32:13Z)
- Ranking-Based Siamese Visual Tracking [31.2428211299895]
Siamese-based trackers mainly formulate visual tracking as two independent subtasks: classification and localization.
This paper proposes a ranking-based optimization algorithm to explore the relationship among different proposals.
The proposed two ranking losses are compatible with most Siamese trackers and incur no additional computation for inference.
arXiv Detail & Related papers (2022-05-24T03:46:40Z)
- Large-scale Optimization of Partial AUC in a Range of False Positive Rates [51.12047280149546]
The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning.
We develop an efficient approximated gradient descent method based on a recent practical envelope smoothing technique.
Our proposed algorithm can also be used to minimize the sum of ranked range loss, which also lacks efficient solvers.
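The quantity being optimized here can be illustrated with a naive empirical estimator that restricts the ROC integral to an FPR window (function name, clipping trick, and window normalization are our assumptions; the paper's contribution is the efficient optimization, not this evaluation):

```python
import numpy as np

def pauc_fpr_range(scores, labels, fpr_lo, fpr_hi):
    """Illustrative empirical partial AUC over FPR in [fpr_lo, fpr_hi],
    normalized so a perfect ranking scores 1.0."""
    order = np.argsort(scores)[::-1]
    y = np.asarray(labels)[order]
    P, N = y.sum(), len(y) - y.sum()
    tpr = np.concatenate([[0.0], np.cumsum(y) / P])
    fpr = np.concatenate([[0.0], np.cumsum(1 - y) / N])
    # clip FPR to the window; horizontal ROC steps make this exact
    fpr_c = np.clip(fpr, fpr_lo, fpr_hi)
    area = np.sum((fpr_c[1:] - fpr_c[:-1]) * (tpr[1:] + tpr[:-1]) / 2)
    return float(area / (fpr_hi - fpr_lo))
```

Such an estimator is non-differentiable in the scores, which is exactly why a smoothed surrogate and an approximated gradient method are needed for training.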
arXiv Detail & Related papers (2022-03-03T03:46:18Z)
- Efficient and Differentiable Conformal Prediction with General Function Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Tune it the Right Way: Unsupervised Validation of Domain Adaptation via
Soft Neighborhood Density [125.64297244986552]
We propose an unsupervised validation criterion that measures the density of soft neighborhoods by computing the entropy of the similarity distribution between points.
Our criterion is simpler than competing validation methods, yet more effective.
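One plausible reading of this criterion, sketched below: softmax-normalize pairwise similarities of L2-normalized features and average the per-point entropy (the temperature value, the exclusion of self-similarity, and the function name are our assumptions):

```python
import numpy as np

def soft_neighborhood_density(features, temperature=0.05):
    """Illustrative SND-style score: mean entropy of the softmax
    distribution over pairwise cosine similarities."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)               # drop self-similarity
    logits = sim / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    ent = -(p * np.log(p + 1e-12)).sum(axis=1)   # per-point entropy
    return float(ent.mean())
```

Under this reading, the score would be used to compare checkpoints or hyperparameters without target labels: denser soft neighborhoods yield higher entropy.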
arXiv Detail & Related papers (2021-08-24T17:41:45Z) - Stochastic Optimization of Areas Under Precision-Recall Curves with
Provable Convergence [66.83161885378192]
Area under ROC (AUROC) and precision-recall curves (AUPRC) are common metrics for evaluating classification performance for imbalanced problems.
We propose a technical method to optimize AUPRC for deep learning.
arXiv Detail & Related papers (2021-04-18T06:22:21Z) - Trade-offs in Top-k Classification Accuracies on Losses for Deep
Learning [0.0]
Cross entropy (CE) is not guaranteed to optimize top-k prediction without infinite training data and model complexities.
Our loss is essentially CE modified by grouping the classes that temporarily rank in the top k into a single class.
It has been found to provide better top-k accuracies than CE for k larger than 10.
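One plausible reading of the grouping idea, sketched below (our interpretation, not the paper's exact loss): merge the probability mass of the currently top-k classes when the true class sits inside that group, and fall back to standard CE otherwise.

```python
import numpy as np

def grouped_topk_ce(logits, label, k):
    """Illustrative top-k variant of cross entropy: the k classes that
    currently score highest are merged into one group, and the loss is
    the negative log of the group's total probability when the true
    class falls inside it."""
    z = logits - logits.max()                 # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum()
    topk = np.argsort(logits)[::-1][:k]
    if label in topk:
        return float(-np.log(p[topk].sum()))  # reward any top-k placement
    return float(-np.log(p[label]))           # otherwise plain CE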
arXiv Detail & Related papers (2020-07-30T10:18:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.