Shift Happens: Adjusting Classifiers
- URL: http://arxiv.org/abs/2111.02529v1
- Date: Wed, 3 Nov 2021 21:27:27 GMT
- Title: Shift Happens: Adjusting Classifiers
- Authors: Theodore James Thibault Heiser, Mari-Liis Allikivi, Meelis Kull
- Abstract summary: Minimizing expected loss measured by a proper scoring rule, such as Brier score or log-loss (cross-entropy), is a common objective while training a probabilistic classifier.
We propose methods that transform all predictions to (re)equalize the average prediction and the class distribution.
We demonstrate experimentally that, when in practice the class distribution is known only approximately, there is often still a reduction in loss depending on the amount of shift and the precision to which the class distribution is known.
- Score: 2.8682942808330703
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Minimizing expected loss measured by a proper scoring rule, such as Brier
score or log-loss (cross-entropy), is a common objective while training a
probabilistic classifier. If the data have experienced dataset shift where the
class distributions change post-training, then often the model's performance
will decrease, over-estimating the probabilities of some classes while
under-estimating the others on average. We propose unbounded and bounded
general adjustment (UGA and BGA) methods that transform all predictions to
(re-)equalize the average prediction and the class distribution. These methods
act differently depending on which proper scoring rule is to be minimized, and
we have a theoretical guarantee of reducing loss on test data, if the exact
class distribution is known. We also demonstrate experimentally that, when in
practice the class distribution is known only approximately, there is often
still a reduction in loss depending on the amount of shift and the precision to
which the class distribution is known.
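The kind of adjustment the abstract describes can be illustrated with the classic multiplicative prior-shift correction (in the style of Saerens et al.), a standard baseline rather than the paper's UGA/BGA methods: scale each predicted class probability by the ratio of test to training priors and renormalize. Unlike UGA/BGA, this only approximately re-equalizes the average prediction with the new class distribution. The function name and the example priors below are illustrative assumptions, not from the paper.

```python
import numpy as np

def prior_shift_adjust(probs, train_priors, test_priors):
    """Reweight predicted class probabilities for a known shift in class priors.

    Each class probability is scaled by test_prior / train_prior, and each
    row is renormalized so the adjusted probabilities still sum to 1.
    """
    probs = np.asarray(probs, dtype=float)
    ratio = np.asarray(test_priors, dtype=float) / np.asarray(train_priors, dtype=float)
    adjusted = probs * ratio
    return adjusted / adjusted.sum(axis=1, keepdims=True)

# Example: a classifier trained on balanced classes, deployed where
# class 1 is three times as common as class 0.
preds = np.array([[0.8, 0.2],
                  [0.5, 0.5]])
adjusted = prior_shift_adjust(preds, train_priors=[0.5, 0.5], test_priors=[0.25, 0.75])
# Rows still sum to 1; probability mass has shifted toward class 1.
```

The mean of the adjusted predictions moves toward the test class distribution, but in general does not match it exactly; the paper's methods are designed to equalize the two exactly for a chosen proper scoring rule.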
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain only a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- When Noisy Labels Meet Long Tail Dilemmas: A Representation Calibration Method [40.25499257944916]
Real-world datasets are both noisily labeled and class-imbalanced.
We propose a representation calibration method RCAL.
We derive theoretical results to discuss the effectiveness of our representation calibration.
arXiv Detail & Related papers (2022-11-20T11:36:48Z)
- Learnable Distribution Calibration for Few-Shot Class-Incremental Learning [122.2241120474278]
Few-shot class-incremental learning (FSCIL) faces challenges of memorizing old class distributions and estimating new class distributions given few training samples.
We propose a learnable distribution calibration (LDC) approach, with the aim to systematically solve these two challenges using a unified framework.
arXiv Detail & Related papers (2022-10-01T09:40:26Z)
- Throwing Away Data Improves Worst-Class Error in Imbalanced Classification [36.91428748713018]
Class imbalances pervade classification problems, yet their treatment differs in theory and practice.
We take on the challenge of developing learning theory able to describe the worst-class error of classifiers over linearly-separable data.
arXiv Detail & Related papers (2022-05-23T23:43:18Z)
- Realistic Evaluation of Transductive Few-Shot Learning [41.06192162435249]
Transductive inference is widely used in few-shot learning.
We study the effect of arbitrary class distributions within the query sets of few-shot tasks at inference.
We evaluate experimentally state-of-the-art transductive methods over 3 widely used data sets.
arXiv Detail & Related papers (2022-04-24T03:35:06Z)
- Unimodal-Concentrated Loss: Fully Adaptive Label Distribution Learning for Ordinal Regression [32.35098925000738]
We argue that existing ALDL algorithms do not fully exploit the intrinsic properties of ordinal regression.
We propose a novel loss function for fully adaptive label distribution learning, namely unimodal-concentrated loss.
arXiv Detail & Related papers (2022-04-01T09:40:11Z)
- Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
Normalization methods can increase a network's vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
- Towards optimally abstaining from prediction [22.937799541125607]
A common challenge across all areas of machine learning is that training data is not distributed like test data.
We consider a model where one may abstain from predicting, at a fixed cost.
Our work builds on a recent abstention algorithm of Goldwasser, Kalais, and Montasser (2020) for transductive binary classification.
arXiv Detail & Related papers (2021-05-28T21:44:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.