Trade-offs in Top-k Classification Accuracies on Losses for Deep
Learning
- URL: http://arxiv.org/abs/2007.15359v1
- Date: Thu, 30 Jul 2020 10:18:57 GMT
- Title: Trade-offs in Top-k Classification Accuracies on Losses for Deep
Learning
- Authors: Azusa Sawada, Eiji Kaneko, Kazutoshi Sagi
- Abstract summary: Cross entropy (CE) is not guaranteed to optimize top-k prediction without infinite training data and model complexity.
Our novel loss is CE modified by grouping the temporary top-k classes into a single class.
Our loss provides better top-k accuracies than CE for k larger than 10.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an experimental analysis of trade-offs in top-k
classification accuracies among losses for deep learning, and proposes a novel
top-k loss. The commonly used cross entropy (CE) is not guaranteed to optimize
top-k prediction without infinite training data and model complexity. Our
objective is to clarify when CE sacrifices top-k accuracy to optimize top-1
prediction, and to design a loss that improves top-k accuracy under such
conditions. Our novel loss is CE modified by grouping the temporary top-k
classes into a single class. To obtain a robust decision boundary, we introduce
an adaptive transition from normal CE to our loss, and therefore call it the
top-k transition loss. Our experiments demonstrate that CE is not always the
best choice for learning top-k prediction. First, we explore trade-offs between
top-1 and top-k (k=2) accuracies on synthetic datasets, and find that CE fails
to optimize top-k prediction when the data distribution is too complex for the
given model to represent the optimal top-1 prediction. Second, we compare top-k
accuracies on the CIFAR-100 dataset, targeting top-5 prediction in deep
learning. While CE performs best in top-1 accuracy, our loss outperforms CE in
top-5 accuracy in all but one experimental setup. Moreover, our loss provides
better top-k accuracies than CE for k larger than 10. As a result, a ResNet18
model trained with our loss reaches 99% accuracy with k=25 candidates, 8 fewer
candidates than CE requires.
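The abstract describes the loss only at a high level, so the PyTorch sketch below is one plausible reading rather than the authors' implementation: the softmax probabilities of the model's current (temporary) top-k classes, together with the true class, are merged into a single group, the grouped term is the negative log of that group's total probability, and a weight `alpha` blends it with normal CE to mimic the adaptive transition. The function name `topk_transition_loss`, the exact grouping rule, and `alpha` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def topk_transition_loss(logits, targets, k=5, alpha=1.0):
    """Hedged sketch of a top-k 'transition' loss (not the authors' code).

    The model's current top-k classes, plus the true class, are grouped and
    treated as a single class; the grouped term is -log of the total
    probability assigned to that group. `alpha` blends plain CE (alpha=1)
    with the grouped term (alpha=0) to mimic an adaptive transition.
    """
    log_probs = F.log_softmax(logits, dim=1)          # (N, C)
    ce = F.nll_loss(log_probs, targets)               # standard cross entropy

    # Indices of the temporarily predicted top-k classes per sample.
    topk_idx = logits.topk(k, dim=1).indices          # (N, k)
    # Force the true class into the group so the grouped term is well defined.
    group_idx = torch.cat([topk_idx, targets.unsqueeze(1)], dim=1)
    group_mask = torch.zeros_like(logits, dtype=torch.bool)
    group_mask.scatter_(1, group_idx, True)

    # -log of the summed probability of the grouped classes.
    group_log_prob = torch.logsumexp(
        log_probs.masked_fill(~group_mask, float("-inf")), dim=1)
    grouped = -group_log_prob.mean()

    return alpha * ce + (1.0 - alpha) * grouped
```

A short usage example with hypothetical shapes, including the kind of top-k accuracy check used for the CIFAR-100 top-5 evaluation:

```python
logits = torch.randn(8, 100)                  # batch of 8, 100 classes
targets = torch.randint(0, 100, (8,))
loss = topk_transition_loss(logits, targets, k=5, alpha=0.5)

topk_pred = logits.topk(5, dim=1).indices     # top-5 predicted classes
top5_acc = (topk_pred == targets.unsqueeze(1)).any(dim=1).float().mean()
```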
Related papers
- Aligning GPTRec with Beyond-Accuracy Goals with Reinforcement Learning [67.71952251641545]
GPTRec is an alternative to the Top-K model for item-by-item recommendations.
Experiments on two datasets show that GPTRec's Next-K generation approach offers a better trade-off between accuracy and secondary metrics than classic greedy re-ranking techniques.
arXiv Detail & Related papers (2024-03-07T19:47:48Z) - Bridging Precision and Confidence: A Train-Time Loss for Calibrating
Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error in both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z) - A Case Study on the Classification of Lost Circulation Events During
Drilling using Machine Learning Techniques on an Imbalanced Large Dataset [0.0]
We utilize a dataset of more than 65,000 records with a class imbalance problem, collected from Azadegan oilfield formations in Iran.
Eleven of the dataset's seventeen parameters are chosen to be used in the classification of five lost circulation events.
To generate classification models, we used six basic machine learning algorithms and four ensemble learning methods.
arXiv Detail & Related papers (2022-09-04T12:28:40Z) - Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named the partial Area Under the top-k Curve (AUTKC).
AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-K ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z) - Differentiable Top-k Classification Learning [29.75063301688965]
We optimize the model for multiple k simultaneously instead of using a single k.
We find that relaxing k not only produces better top-5 accuracies but also improves top-1 accuracy.
arXiv Detail & Related papers (2022-06-15T04:13:59Z) - ADT-SSL: Adaptive Dual-Threshold for Semi-Supervised Learning [68.53717108812297]
Semi-Supervised Learning (SSL) has advanced classification tasks by inputting both labeled and unlabeled data to train a model jointly.
This paper proposes an Adaptive Dual-Threshold method for Semi-Supervised Learning (ADT-SSL).
Experimental results show that the proposed ADT-SSL achieves state-of-the-art classification accuracy.
arXiv Detail & Related papers (2022-05-21T11:52:08Z) - GSC Loss: A Gaussian Score Calibrating Loss for Deep Learning [16.260520216972854]
We propose a general Gaussian Score Calibrating (GSC) loss to calibrate the predicted scores produced by deep neural networks (DNNs).
Extensive experiments on over 10 benchmark datasets demonstrate that the proposed GSC loss can yield consistent and significant performance boosts in a variety of visual tasks.
arXiv Detail & Related papers (2022-03-02T02:52:23Z) - Stochastic smoothing of the top-K calibrated hinge loss for deep
imbalanced classification [8.189630642296416]
We introduce a top-K hinge loss inspired by recent developments on top-K losses.
Our proposal is based on smoothing the top-K operator, building on the flexible "perturbed" framework.
We show that our loss function performs very well in the case of balanced datasets, while benefiting from a significantly lower computational time.
arXiv Detail & Related papers (2022-02-04T15:39:32Z) - Semi-supervised Contrastive Learning with Similarity Co-calibration [72.38187308270135]
We propose a novel training strategy, termed Semi-supervised Contrastive Learning (SsCL).
SsCL combines the well-known contrastive loss in self-supervised learning with the cross entropy loss in semi-supervised learning.
We show that SsCL produces more discriminative representations and is beneficial to few-shot learning.
arXiv Detail & Related papers (2021-05-16T09:13:56Z) - Advanced Dropout: A Model-free Methodology for Bayesian Dropout
Optimization [62.8384110757689]
Overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs).
The advanced dropout technique applies a model-free, easily implemented distribution with a parametric prior, and adaptively adjusts the dropout rate.
We evaluate the effectiveness of the advanced dropout against nine dropout techniques on seven computer vision datasets.
arXiv Detail & Related papers (2020-10-11T13:19:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.