Class Adaptive Network Calibration
- URL: http://arxiv.org/abs/2211.15088v2
- Date: Wed, 12 Apr 2023 06:51:01 GMT
- Title: Class Adaptive Network Calibration
- Authors: Bingyuan Liu, Jérôme Rony, Adrian Galdran, Jose Dolz, Ismail Ben Ayed
- Abstract summary: We propose Class Adaptive Label Smoothing (CALS) for calibrating deep networks.
Our method builds on a general Augmented Lagrangian approach, a well-established technique in constrained optimization.
- Score: 19.80805957502909
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have revealed that, beyond conventional accuracy, calibration
should also be considered for training modern deep neural networks. To address
miscalibration during learning, some methods have explored different penalty
functions as part of the learning objective, alongside a standard
classification loss, with a hyper-parameter controlling the relative
contribution of each term. Nevertheless, these methods share two major
drawbacks: 1) the scalar balancing weight is the same for all classes,
hindering the ability to address different intrinsic difficulties or imbalance
among classes; and 2) the balancing weight is usually fixed without an adaptive
strategy, which may prevent reaching the best compromise between accuracy
and calibration, and requires a hyper-parameter search for each application. We
propose Class Adaptive Label Smoothing (CALS) for calibrating deep networks,
which learns class-wise multipliers during training, yielding a
powerful alternative to common label smoothing penalties. Our method builds on
a general Augmented Lagrangian approach, a well-established technique in
constrained optimization, but we introduce several modifications to tailor it
for large-scale, class-adaptive training. Comprehensive evaluation and multiple
comparisons on a variety of benchmarks, including standard and long-tailed
image classification, semantic segmentation, and text classification,
demonstrate the superiority of the proposed method. The code is available at
https://github.com/by-liu/CALS.
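To make the class-adaptive multiplier idea concrete, a minimal PyTorch-style sketch is given below. It assumes a margin constraint on the logit gaps and a simple per-class multiplier ascent step once per epoch; the class names, the margin form, and the update rule are illustrative assumptions rather than the exact CALS formulation (see the repository above for the authors' implementation).

```python
import torch
import torch.nn.functional as F


class ClassAdaptivePenalty:
    """Class-wise multipliers updated in an Augmented-Lagrangian style.

    Illustrative assumptions (not taken verbatim from the paper): the
    constraint bounds the gap between the max logit and every other logit
    by a fixed margin, and the multipliers are increased in proportion to
    the average per-class violation once per epoch.
    """

    def __init__(self, num_classes: int, margin: float = 10.0, rho: float = 0.1):
        self.margin = margin                      # allowed logit gap
        self.rho = rho                            # multiplier ascent step size
        self.lambdas = torch.ones(num_classes)    # one multiplier per class
        self._violation_sum = torch.zeros(num_classes)
        self._num_batches = 0

    def loss(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Standard classification term.
        ce = F.cross_entropy(logits, targets)
        # Constraint violation: how far each logit falls below (max - margin).
        max_logit = logits.max(dim=1, keepdim=True).values
        violation = F.relu(max_logit - logits - self.margin)  # (batch, classes)
        # Penalty weighted by the current class-wise multipliers.
        penalty = (self.lambdas.to(logits.device) * violation).sum(dim=1).mean()
        # Track per-class violations for the next multiplier update.
        self._violation_sum += violation.detach().mean(dim=0).cpu()
        self._num_batches += 1
        return ce + penalty

    def update_multipliers(self) -> None:
        # Ascent step: raise the multipliers of classes whose constraint is
        # violated on average; multipliers stay non-negative.
        avg_violation = self._violation_sum / max(self._num_batches, 1)
        self.lambdas = (self.lambdas + self.rho * avg_violation).clamp(min=0.0)
        self._violation_sum.zero_()
        self._num_batches = 0
```

In a training loop, `loss()` would be called on every batch and `update_multipliers()` at the end of each epoch, so classes whose constraint is violated more often receive a larger penalty weight.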
Related papers
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Long-Tailed Learning as Multi-Objective Optimization [29.012779934262973]
We argue that the seesaw dilemma stems from the gradient imbalance between different classes.
We propose a Gradient-Balancing Grouping (GBG) strategy to gather the classes with similar gradient directions.
arXiv Detail & Related papers (2023-10-31T14:30:31Z)
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- Scaling of Class-wise Training Losses for Post-hoc Calibration [6.0632746602205865]
We propose a new calibration method to synchronize the class-wise training losses.
We design a new training loss that reduces the variance of class-wise training losses by using multiple class-wise scaling factors.
We validate the proposed framework by employing it in the various post-hoc calibration methods.
arXiv Detail & Related papers (2023-06-19T14:59:37Z)
- Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights are different among the different branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy on two challenging datasets (a toy sketch of this multi-head weighting scheme follows the list below).
arXiv Detail & Related papers (2023-03-02T09:32:32Z)
- Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification [57.36281142038042]
We present a new adaptive block called Contextual Squeeze-and-Excitation (CaSE) that adjusts a pretrained neural network on a new task to significantly improve performance.
We also present a new training protocol based on Coordinate-Descent called UpperCaSE that exploits meta-trained CaSE blocks and fine-tuning routines for efficient adaptation.
arXiv Detail & Related papers (2022-06-20T15:25:08Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Instance-based Label Smoothing For Better Calibrated Classification Networks [3.388509725285237]
Label smoothing is widely used in deep neural networks for multi-class classification.
We take inspiration from both label smoothing and self-distillation.
We propose two novel instance-based label smoothing approaches.
arXiv Detail & Related papers (2021-10-11T15:33:23Z)
- Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization [118.50301177912381]
We show that Adam can converge to different solutions of the objective with provably different errors, even with weight decay regularization.
We show that if the objective is convex and weight decay regularization is employed, any optimization algorithm, including Adam, will converge to the same solution.
arXiv Detail & Related papers (2021-08-25T17:58:21Z)
- Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z)
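The Multi-Head Multi-Loss Model Calibration entry above describes training several heads with differently weighted cross-entropy losses and averaging their predictions. The toy sketch below illustrates that idea; the random per-head class weights, module names, and head count are assumptions for illustration, not the authors' configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadWeightedCE(nn.Module):
    """Toy sketch: a shared backbone with several classification heads, each
    trained with a differently weighted cross-entropy loss; the averaged
    softmax of the heads is used at inference. The per-head class weights
    below are random and purely illustrative."""

    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int, num_heads: int = 3):
        super().__init__()
        self.backbone = backbone
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, num_classes) for _ in range(num_heads)
        )
        # One class-weight vector per head (assumption: drawn at random here).
        self.register_buffer("class_weights", torch.rand(num_heads, num_classes) + 0.5)

    def forward(self, x: torch.Tensor):
        feats = self.backbone(x)
        return [head(feats) for head in self.heads]  # list of per-head logits

    def training_loss(self, x: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Each head minimizes its own class-weighted cross-entropy.
        losses = [
            F.cross_entropy(logits, targets, weight=w)
            for logits, w in zip(self.forward(x), self.class_weights)
        ]
        return torch.stack(losses).mean()

    @torch.no_grad()
    def predict_proba(self, x: torch.Tensor) -> torch.Tensor:
        # Average the per-head softmax outputs for the final prediction.
        probs = [F.softmax(logits, dim=1) for logits in self.forward(x)]
        return torch.stack(probs).mean(dim=0)
```

Averaging the per-head probabilities at inference acts as a cheap ensemble, which is where the calibration benefit is expected to come from.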