Worst Case Matters for Few-Shot Recognition
- URL: http://arxiv.org/abs/2203.06574v1
- Date: Sun, 13 Mar 2022 05:39:40 GMT
- Title: Worst Case Matters for Few-Shot Recognition
- Authors: Minghao Fu, Yun-Hao Cao and Jianxin Wu
- Abstract summary: Few-shot recognition learns a recognition model with very few (e.g., 1 or 5) images per category.
Current few-shot learning methods focus on improving the average accuracy over many episodes.
We argue that in real-world applications we may often only try one episode instead of many, and hence maximizing the worst-case accuracy is more important than maximizing the average accuracy.
- Score: 27.023352955311502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot recognition learns a recognition model with very few (e.g., 1 or 5)
images per category, and current few-shot learning methods focus on improving
the average accuracy over many episodes. We argue that in real-world
applications we may often only try one episode instead of many, and hence
maximizing the worst-case accuracy is more important than maximizing the
average accuracy. We empirically show that a high average accuracy does not
necessarily mean a high worst-case accuracy. Since this objective is not
accessible, we propose to reduce the standard deviation and increase the
average accuracy simultaneously. In turn, we devise two strategies from the
bias-variance tradeoff perspective to implicitly reach this goal: a simple yet
effective stability regularization (SR) loss together with model ensemble to
reduce variance during fine-tuning, and an adaptability calibration mechanism
to reduce the bias. Extensive experiments on benchmark datasets demonstrate the
effectiveness of the proposed strategies, which outperform current
state-of-the-art methods by a significant margin in terms of not only
average but also worst-case accuracy.
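The abstract does not include code, so the following is a minimal PyTorch-style sketch of the variance-reduction side of the method. It assumes the stability regularization (SR) term is an L2 penalty keeping fine-tuned embeddings close to those of a frozen copy of the pre-trained backbone, and it uses periodic snapshots as the model ensemble; the function names, the snapshot schedule, and the exact SR form are illustrative assumptions rather than the authors' released implementation, and the adaptability calibration step is omitted.

```python
import copy
import torch
import torch.nn.functional as F

def finetune_with_sr(backbone, classifier, support_loader, epochs=50,
                     lr=1e-3, sr_weight=1.0, snapshot_every=10):
    """Episode fine-tuning with a stability-regularization (SR) term and a
    snapshot ensemble.  The SR form here (an L2 penalty towards the frozen
    pre-trained embeddings) is an assumption for illustration."""
    frozen = copy.deepcopy(backbone).eval()            # pre-trained reference
    for p in frozen.parameters():
        p.requires_grad_(False)

    params = list(backbone.parameters()) + list(classifier.parameters())
    opt = torch.optim.SGD(params, lr=lr, momentum=0.9)
    snapshots = []

    for epoch in range(epochs):
        for x, y in support_loader:
            feat = backbone(x)                         # current embeddings
            with torch.no_grad():
                ref = frozen(x)                        # stable reference embeddings
            ce = F.cross_entropy(classifier(feat), y)
            sr = F.mse_loss(feat, ref)                 # stability regularization
            loss = ce + sr_weight * sr
            opt.zero_grad()
            loss.backward()
            opt.step()
        if (epoch + 1) % snapshot_every == 0:          # collect ensemble members
            snapshots.append((copy.deepcopy(backbone).eval(),
                              copy.deepcopy(classifier).eval()))
    return snapshots

@torch.no_grad()
def ensemble_predict(snapshots, x):
    """Average the snapshots' softmax outputs to reduce variance at test time."""
    probs = [F.softmax(clf(bb(x)), dim=1) for bb, clf in snapshots]
    return torch.stack(probs).mean(0).argmax(dim=1)
```

Averaging snapshot probabilities at test time is one cheap way to lower the variance of a single episode without training extra models.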
Related papers
- FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation [29.606646251624923]
Fine-tuning is still far from satisfactory trustworthiness due to "tuning-induced mis-calibration".
We propose Efficient Trustworthy Distillation (FIRST), which utilizes a small portion of the teacher's knowledge to obtain a reliable language model in a cost-efficient way.
Experimental results demonstrate the effectiveness of our method, where better accuracy (+2.3%) and less miscalibration (-10%) are achieved.
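As a rough illustration of "a small portion of the teacher's knowledge", here is a hedged sketch of a top-k logit distillation loss; the top-k selection rule, the temperature, and the truncated-KL form are assumptions made for this example, not FIRST's exact recipe.

```python
import torch
import torch.nn.functional as F

def topk_distillation_loss(student_logits, teacher_logits, k=5, temperature=2.0):
    """Distill only the teacher's top-k probabilities per position; the
    selection rule and truncated-KL form are illustrative assumptions."""
    t_prob = F.softmax(teacher_logits / temperature, dim=-1)
    topk_prob, topk_idx = t_prob.topk(k, dim=-1)                  # keep k classes
    topk_prob = topk_prob / topk_prob.sum(dim=-1, keepdim=True)   # renormalize
    s_logprob = F.log_softmax(student_logits / temperature, dim=-1)
    s_topk = s_logprob.gather(-1, topk_idx)                       # student on the same classes
    # truncated KL(teacher_topk || student), averaged over positions
    return (topk_prob * (topk_prob.log() - s_topk)).sum(-1).mean()
```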
arXiv Detail & Related papers (2024-08-22T07:31:00Z)
- Accurate and Reliable Predictions with Mutual-Transport Ensemble [46.368395985214875]
We propose a co-trained auxiliary model and adaptively regularize the cross-entropy loss using the Kullback-Leibler (KL) divergence.
We show that MTE can simultaneously enhance both accuracy and uncertainty calibration.
For example, on the CIFAR-100 dataset, our MTE method on ResNet34/50 achieves significant improvements over the previous state-of-the-art method.
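A hedged sketch of what a co-trained auxiliary model with KL regularization could look like; the symmetric formulation and the fixed weight `lam` are assumptions for illustration, and MTE's adaptive weighting of the KL term is omitted.

```python
import torch.nn.functional as F

def mutual_regularized_loss(main_model, aux_model, x, y, lam=1.0):
    """Cross-entropy for both models plus a symmetric KL term pulling their
    predictions together; the adaptive weighting used by MTE is omitted."""
    log_p_main = F.log_softmax(main_model(x), dim=1)
    log_p_aux = F.log_softmax(aux_model(x), dim=1)
    ce = F.nll_loss(log_p_main, y) + F.nll_loss(log_p_aux, y)
    kl = (F.kl_div(log_p_main, log_p_aux.exp().detach(), reduction="batchmean")
          + F.kl_div(log_p_aux, log_p_main.exp().detach(), reduction="batchmean"))
    return ce + lam * kl
```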
arXiv Detail & Related papers (2024-05-30T03:15:59Z)
- PUMA: margin-based data pruning [51.12154122266251]
We focus on data pruning, where some training samples are removed based on their distance to the model's classification boundary (i.e., the margin).
We propose PUMA, a new data pruning strategy that computes the margin using DeepFool.
We show that PUMA can be used on top of the current state-of-the-art methodology in robustness and, unlike existing data pruning strategies, significantly improves model performance.
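A hedged sketch of margin-based pruning: for brevity the margin is approximated here by the top-2 logit gap rather than the DeepFool perturbation PUMA actually computes, and the `keep` flag is a placeholder since the summary does not state which end of the ranking is pruned.

```python
import torch
from torch.utils.data import DataLoader

@torch.no_grad()
def margin_prune(model, dataset, prune_frac=0.2, keep="largest"):
    """Rank training samples by an estimated margin and drop a fraction.
    The top-2 logit gap is only a cheap proxy for the DeepFool distance."""
    margins = []
    for x, _ in DataLoader(dataset, batch_size=256):
        top2 = model(x).topk(2, dim=1).values
        margins.append(top2[:, 0] - top2[:, 1])        # distance-to-boundary proxy
    margins = torch.cat(margins)
    n_keep = int(len(margins) * (1 - prune_frac))
    order = margins.argsort(descending=(keep == "largest"))
    return order[:n_keep]                              # indices of retained samples
```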
arXiv Detail & Related papers (2024-05-10T08:02:20Z)
- Reviving Undersampling for Long-Tailed Learning [16.054442161144603]
We aim to enhance the accuracy of the worst-performing categories and utilize the harmonic mean and geometric mean to assess the model's performance.
We devise a straightforward model ensemble strategy that incurs no additional overhead and achieves improved harmonic and geometric means.
We validate the effectiveness of our approach on widely utilized benchmark datasets for long-tailed learning.
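The harmonic and geometric means used here are easy to reproduce; the small sketch below shows how both are pulled down by the worst-performing categories, unlike the arithmetic mean.

```python
import numpy as np

def class_mean_metrics(per_class_acc):
    """Harmonic and geometric means of per-class accuracies; both are
    dominated by the worst-performing categories."""
    acc = np.asarray(per_class_acc, dtype=float)
    eps = 1e-12                                   # guard against zero accuracies
    hmean = len(acc) / np.sum(1.0 / (acc + eps))
    gmean = float(np.exp(np.mean(np.log(acc + eps))))
    return hmean, gmean

# class_mean_metrics([0.9, 0.8, 0.1]) -> (~0.24, ~0.42), while the arithmetic mean is 0.6
```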
arXiv Detail & Related papers (2024-01-30T08:15:13Z)
- Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision-language models.
We show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms of ID data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
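A hedged sketch of a contrastive objective with a smallest-singular-value term; which feature matrix is constrained and how the constraint enters the loss are simplifications of the paper's constrained formulation, and `tau` and `lam` are placeholder hyperparameters.

```python
import torch
import torch.nn.functional as F

def contrastive_with_sigma_min(img_feat, txt_feat, tau=0.07, lam=0.1):
    """CLIP-style contrastive loss plus a reward for a larger smallest
    singular value of the normalized image-feature matrix (a simplification
    of the paper's constrained multimodal contrastive loss)."""
    img = F.normalize(img_feat, dim=1)
    txt = F.normalize(txt_feat, dim=1)
    logits = img @ txt.t() / tau
    labels = torch.arange(img.size(0), device=img.device)
    contrastive = 0.5 * (F.cross_entropy(logits, labels)
                         + F.cross_entropy(logits.t(), labels))
    sigma_min = torch.linalg.svdvals(img).min()        # smallest singular value
    return contrastive - lam * sigma_min               # encourage a larger sigma_min
```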
arXiv Detail & Related papers (2023-11-03T05:41:25Z)
- EXACT: How to Train Your Accuracy [6.144680854063938]
We propose a new optimization framework that introduces stochasticity into a model's output and optimizes the expected accuracy.
Experiments on linear models and deep image classification show that the proposed optimization method is a powerful alternative to widely used classification losses.
arXiv Detail & Related papers (2022-05-19T15:13:00Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
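ATC is simple enough to state in a few lines; the sketch below assumes confidences and correctness indicators are NumPy arrays, with confidence taken as the maximum softmax probability (the paper also considers negative entropy).

```python
import numpy as np

def atc_predict_accuracy(source_conf, source_correct, target_conf):
    """Average Thresholded Confidence: choose the threshold so that the share
    of source examples above it equals source accuracy, then predict target
    accuracy as the share of target confidences above that threshold."""
    src_acc = float(np.mean(source_correct))
    threshold = np.quantile(source_conf, 1.0 - src_acc)   # acc-fraction lies above
    return float(np.mean(target_conf > threshold))

# source_conf / target_conf: e.g. max softmax probability per example;
# source_correct: 0/1 array marking whether each source prediction was right.
```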
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Newer is not always better: Rethinking transferability metrics, their peculiarities, stability and performance [5.650647159993238]
Fine-tuning of large pre-trained image and language models on small customized datasets has become increasingly popular.
We show that the statistical problems with covariance estimation drive the poor performance of H-score.
We propose a correction and recommend measuring correlation performance against relative accuracy in such settings.
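Assuming the correction amounts to a shrinkage estimate of the feature covariance (the summary blames the empirical covariance for H-score's poor performance), a hedged sketch using scikit-learn's Ledoit-Wolf estimator could look like this; the exact estimator and normalization the authors adopt may differ.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def shrinkage_hscore(features, labels):
    """H-score with a shrinkage (Ledoit-Wolf) estimate of the feature
    covariance in place of the empirical one."""
    f = features - features.mean(axis=0)
    cov = LedoitWolf().fit(f).covariance_            # regularized feature covariance
    g = np.zeros_like(f)                             # class-conditional mean features
    for c in np.unique(labels):
        idx = labels == c
        g[idx] = f[idx].mean(axis=0)
    cov_between = np.cov(g, rowvar=False)
    return float(np.trace(np.linalg.pinv(cov) @ cov_between))
```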
arXiv Detail & Related papers (2021-10-13T17:24:12Z)
- FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose FasterPose, a design paradigm for a cost-effective network with low-resolution (LR) representation for efficient pose estimation.
We study the training behavior of FasterPose, and formulate a novel regressive cross-entropy (RCE) loss function for accelerating the convergence.
Compared with the previously dominant network for pose estimation, our method reduces FLOPs by 58% while improving accuracy by 1.3%.
arXiv Detail & Related papers (2021-07-07T13:39:08Z)
- Transferable Calibration with Lower Bias and Variance in Domain Adaptation [139.4332115349543]
Domain Adaptation (DA) enables transferring a learning machine from a labeled source domain to an unlabeled target one.
How to estimate the predictive uncertainty of DA models is vital for decision-making in safety-critical scenarios.
TransCal can be easily applied to recalibrate existing DA methods.
arXiv Detail & Related papers (2020-07-16T11:09:36Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We study what we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
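Prediction-time batch normalization amounts to normalizing with the test batch's own statistics rather than the training-set running averages; a minimal PyTorch sketch, assuming a model with standard BatchNorm layers:

```python
import copy
import torch
import torch.nn as nn

@torch.no_grad()
def predict_with_test_batch_stats(model, x):
    """Normalize with the current test batch's statistics instead of the
    running averages accumulated on the training distribution."""
    m = copy.deepcopy(model).eval()        # leave the original model untouched
    for layer in m.modules():
        if isinstance(layer, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            layer.train()                  # train-mode BN uses batch statistics
    return m(x)
```

Copying the model keeps the original running statistics intact, since train-mode BatchNorm would otherwise update them during the forward pass.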
arXiv Detail & Related papers (2020-06-19T05:08:43Z)