Shift Happens: Adjusting Classifiers
- URL: http://arxiv.org/abs/2111.02529v1
- Date: Wed, 3 Nov 2021 21:27:27 GMT
- Title: Shift Happens: Adjusting Classifiers
- Authors: Theodore James Thibault Heiser, Mari-Liis Allikivi, Meelis Kull
- Abstract summary: Minimizing expected loss measured by a proper scoring rule, such as Brier score or log-loss (cross-entropy), is a common objective while training a probabilistic classifier.
We propose methods that transform all predictions to (re)equalize the average prediction and the class distribution.
We demonstrate experimentally that, when in practice the class distribution is known only approximately, there is often still a reduction in loss depending on the amount of shift and the precision to which the class distribution is known.
- Score: 2.8682942808330703
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Minimizing expected loss measured by a proper scoring rule, such as Brier
score or log-loss (cross-entropy), is a common objective while training a
probabilistic classifier. If the data have experienced dataset shift where the
class distributions change post-training, then often the model's performance
will decrease, over-estimating the probabilities of some classes while
under-estimating the others on average. We propose unbounded and bounded
general adjustment (UGA and BGA) methods that transform all predictions to
(re-)equalize the average prediction and the class distribution. These methods
act differently depending on which proper scoring rule is to be minimized, and
we have a theoretical guarantee of reducing loss on test data, if the exact
class distribution is known. We also demonstrate experimentally that, when in
practice the class distribution is known only approximately, there is often
still a reduction in loss depending on the amount of shift and the precision to
which the class distribution is known.
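The kind of adjustment the abstract describes can be illustrated with the classic multiplicative prior-shift correction (in the style of Saerens et al.), a standard baseline rather than the paper's UGA/BGA methods: scale each predicted class probability by the ratio of test to training priors and renormalize. Unlike UGA/BGA, this only approximately re-equalizes the average prediction with the new class distribution. The function name and the example priors below are illustrative assumptions, not from the paper.

```python
import numpy as np

def prior_shift_adjust(probs, train_priors, test_priors):
    """Reweight predicted class probabilities for a known shift in class priors.

    Each class probability is scaled by test_prior / train_prior, and each
    row is renormalized so the adjusted probabilities still sum to 1.
    """
    probs = np.asarray(probs, dtype=float)
    ratio = np.asarray(test_priors, dtype=float) / np.asarray(train_priors, dtype=float)
    adjusted = probs * ratio
    return adjusted / adjusted.sum(axis=1, keepdims=True)

# Example: a classifier trained on balanced classes, deployed where
# class 1 is three times as common as class 0.
preds = np.array([[0.8, 0.2],
                  [0.5, 0.5]])
adjusted = prior_shift_adjust(preds, train_priors=[0.5, 0.5], test_priors=[0.25, 0.75])
# Rows still sum to 1; probability mass has shifted toward class 1.
```

The mean of the adjusted predictions moves toward the test class distribution, but in general does not match it exactly; the paper's methods are designed to equalize the two exactly for a chosen proper scoring rule.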
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain only a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- When Noisy Labels Meet Long Tail Dilemmas: A Representation Calibration Method [40.25499257944916]
Real-world datasets are both noisily labeled and class-imbalanced.
We propose a representation calibration method RCAL.
We derive theoretical results to discuss the effectiveness of our representation calibration.
arXiv Detail & Related papers (2022-11-20T11:36:48Z)
- Learnable Distribution Calibration for Few-Shot Class-Incremental Learning [122.2241120474278]
Few-shot class-incremental learning (FSCIL) faces challenges of memorizing old class distributions and estimating new class distributions given few training samples.
We propose a learnable distribution calibration (LDC) approach, with the aim to systematically solve these two challenges using a unified framework.
arXiv Detail & Related papers (2022-10-01T09:40:26Z)
- Throwing Away Data Improves Worst-Class Error in Imbalanced Classification [36.91428748713018]
Class imbalances pervade classification problems, yet their treatment differs in theory and practice.
We take on the challenge of developing learning theory able to describe the worst-class error of classifiers over linearly-separable data.
arXiv Detail & Related papers (2022-05-23T23:43:18Z)
- Realistic Evaluation of Transductive Few-Shot Learning [41.06192162435249]
Transductive inference is widely used in few-shot learning.
We study the effect of arbitrary class distributions within the query sets of few-shot tasks at inference.
We evaluate experimentally state-of-the-art transductive methods over 3 widely used data sets.
arXiv Detail & Related papers (2022-04-24T03:35:06Z)
- Unimodal-Concentrated Loss: Fully Adaptive Label Distribution Learning for Ordinal Regression [32.35098925000738]
We argue that existing ALDL algorithms do not fully exploit the intrinsic properties of ordinal regression.
We propose a novel loss function for fully adaptive label distribution learning, namely unimodal-concentrated loss.
arXiv Detail & Related papers (2022-04-01T09:40:11Z)
- Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
Normalization methods can increase a network's vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
- Towards optimally abstaining from prediction [22.937799541125607]
A common challenge across all areas of machine learning is that training data is not distributed like test data.
We consider a model where one may abstain from predicting, at a fixed cost.
Our work builds on a recent abstention algorithm of Goldwasser, Kalais, and Montasser (2020) for transductive binary classification.
arXiv Detail & Related papers (2021-05-28T21:44:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.