Fair Comparison: Quantifying Variance in Results for Fine-grained Visual Categorization
- URL: http://arxiv.org/abs/2109.03156v2
- Date: Wed, 8 Sep 2021 01:23:28 GMT
- Title: Fair Comparison: Quantifying Variance in Results for Fine-grained Visual Categorization
- Authors: Matthew Gwilliam, Adam Teuscher, Connor Anderson, Ryan Farrell
- Abstract summary: Average categorization accuracy is often used in isolation.
As the number of classes increases, the amount of information conveyed by average accuracy alone dwindles.
While its most glaring weakness is its failure to describe the model's performance on a class-by-class basis, average accuracy also fails to describe how performance may vary from one trained model of the same architecture to another.
- Score: 0.5735035463793008
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For the task of image classification, researchers work arduously to develop
the next state-of-the-art (SOTA) model, each benchmarking their own
performance against that of their predecessors and of their peers.
Unfortunately, the metric used most frequently to describe a model's
performance, average categorization accuracy, is often used in isolation. As
the number of classes increases, such as in fine-grained visual categorization
(FGVC), the amount of information conveyed by average accuracy alone dwindles.
While its most glaring weakness is its failure to describe the model's
performance on a class-by-class basis, average accuracy also fails to describe
how performance may vary from one trained model of the same architecture, on
the same dataset, to another (both averaged across all categories and at the
per-class level). We first demonstrate the magnitude of these variations across
models and across class distributions based on attributes of the data,
comparing results on different visual domains and different per-class image
distributions, including long-tailed distributions and few-shot subsets. We
then analyze the impact various FGVC methods have on overall and per-class
variance. From this analysis, we both highlight the importance of reporting and
comparing methods based on information beyond overall accuracy, as well as
point out techniques that mitigate variance in FGVC results.
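As an illustration of the bookkeeping the abstract argues for, here is a minimal numpy sketch (not the authors' released code) that computes per-class accuracy and then mean and standard deviation, both overall and per class, across repeated training runs of the same architecture:

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes):
    """Accuracy for each class separately, not just the overall average."""
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = (y_pred[mask] == c).mean()
    return acc

def run_statistics(runs, num_classes):
    """Summarize several training runs of the same architecture.
    `runs` is a list of (y_true, y_pred) array pairs, one per trained model."""
    per_run = np.stack([per_class_accuracy(t, p, num_classes) for t, p in runs])
    overall = np.nanmean(per_run, axis=1)           # one headline number per run
    return {
        "overall_mean": overall.mean(),             # what papers usually report
        "overall_std": overall.std(),               # run-to-run variation
        "class_mean": np.nanmean(per_run, axis=0),  # average accuracy per class
        "class_std": np.nanstd(per_run, axis=0),    # per-class variation
    }
```

Reporting `overall_std` and `class_std` alongside `overall_mean` exposes exactly the run-to-run and per-class variation that the paper argues average accuracy hides.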
Related papers
- Understanding the Detrimental Class-level Effects of Data Augmentation [63.1733767714073]
Achieving optimal average accuracy can come at the cost of significantly hurting individual class accuracy, by as much as 20% on ImageNet.
We present a framework for understanding how DA interacts with class-level learning dynamics.
We show that simple class-conditional augmentation strategies improve performance on the negatively affected classes.
arXiv Detail & Related papers (2023-12-07T18:37:43Z)
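The paper's own augmentation policies are not reproduced here; as a hedged sketch of the general idea of class-conditional augmentation, the toy rule below gates a single transform on the class label (`flip_classes` is a hypothetical allow-list that would be chosen by auditing per-class accuracy with the transform on and off):

```python
import numpy as np

def class_conditional_flip(image, label, flip_classes, rng):
    """Apply a horizontal flip only for classes in `flip_classes`
    (a hypothetical allow-list); other classes pass through untouched."""
    if label in flip_classes and rng.random() < 0.5:
        return image[:, ::-1, :].copy()  # flip the width axis of an HWC array
    return image

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))          # dummy HWC image
augmented = class_conditional_flip(image, label=3, flip_classes={3, 7}, rng=rng)
```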
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a given model's accuracy and invariance are linearly correlated across different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
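One simple way to operationalize "invariance as prediction consistency" is sketched below; this is an illustration under assumptions, not the paper's exact metric, and `predict` and `transform` are placeholders for a trained model and an image transform:

```python
import numpy as np

def prediction_invariance(predict, images, transform):
    """Fraction of inputs whose predicted label is unchanged by a
    transformation -- a simple prediction-consistency score."""
    base = predict(images)
    moved = predict(np.stack([transform(x) for x in images]))
    return float((base == moved).mean())
```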
- Studying Generalization Through Data Averaging [0.0]
We study train and test performance, as well as the generalization gap, given by the mean of their difference over different dataset samples.
We predict some aspects about how the generalization gap and model train and test performance vary as a function of SGD noise.
arXiv Detail & Related papers (2022-06-28T00:03:40Z)
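A minimal sketch of the averaged quantity described above, the generalization gap taken as a mean over independent dataset draws (the accuracy pairs below are made-up placeholders):

```python
import numpy as np

def mean_generalization_gap(results):
    """Average the train/test gap over independently drawn dataset samples.
    `results` holds (train_acc, test_acc) pairs, one per draw."""
    gaps = np.array([train - test for train, test in results])
    return gaps.mean(), gaps.std()

# made-up accuracies for three dataset draws
print(mean_generalization_gap([(0.99, 0.91), (0.98, 0.90), (0.99, 0.93)]))
```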
- IMACS: Image Model Attribution Comparison Summaries [16.80986701058596]
We introduce IMACS, a method that combines gradient-based model attributions with aggregation and visualization techniques.
IMACS extracts salient input features from an evaluation dataset, clusters them based on similarity, then visualizes differences in model attributions for similar input features.
We show how our technique can uncover behavioral differences caused by domain shift between two models trained on satellite images.
arXiv Detail & Related papers (2022-01-26T21:35:14Z)
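The gradient-attribution ingredient IMACS starts from can be sketched in a few lines of PyTorch; the aggregation, clustering, and visualization stages that constitute the actual method are omitted here:

```python
import torch

def saliency(model, x):
    """Per-pixel magnitude of d(top class score)/d(input) -- the raw
    gradient attribution that would then be aggregated and clustered."""
    x = x.clone().requires_grad_(True)
    scores = model(x)                         # (N, num_classes)
    scores.max(dim=1).values.sum().backward()
    return x.grad.abs().amax(dim=1)           # collapse channels -> (N, H, W)
```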
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains, such as medicine, information retrieval, cybersecurity, and social media, datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
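A hedged numpy sketch of the simplest such strategy, random oversampling, which duplicates minority-class rows until every class matches the majority-class count (libraries such as imbalanced-learn implement more sophisticated variants like SMOTE):

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows until every class matches the
    majority-class count, then shuffle."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c, n in zip(classes, counts):
        rows = np.flatnonzero(y == c)
        extra = rng.choice(rows, size=target - n, replace=True)
        idx.append(np.concatenate([rows, extra]))
    idx = rng.permutation(np.concatenate(idx))
    return X[idx], y[idx]
```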
- Score-Based Generative Classifiers [9.063815952852783]
Generative models have been used as adversarially robust classifiers on simple datasets such as MNIST.
Previous results have suggested a trade-off between the likelihood of the data and classification accuracy.
We show that score-based generative models are closing the gap in classification accuracy compared to standard discriminative models.
arXiv Detail & Related papers (2021-10-01T15:05:33Z)
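Generative classification reduces to Bayes' rule over class-conditional densities; in this minimal sketch a Gaussian stands in for the learned density, whereas a score-based model would estimate p(x|c) far more flexibly:

```python
import numpy as np
from scipy.stats import multivariate_normal

def generative_predict(x, means, covs, priors):
    """Bayes rule over class-conditional densities:
    argmax_c [log p(x | c) + log p(c)]."""
    log_post = [multivariate_normal.logpdf(x, mean=m, cov=S) + np.log(p)
                for m, S, p in zip(means, covs, priors)]
    return int(np.argmax(log_post))
```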
- A Compositional Feature Embedding and Similarity Metric for Ultra-Fine-Grained Visual Categorization [16.843126268445726]
Fine-grained visual categorization (FGVC) aims at classifying objects with small inter-class variances.
This paper proposes a novel compositional feature embedding and similarity metric (CECS) for ultra-fine-grained visual categorization.
Experimental results on two ultra-FGVC datasets and one FGVC dataset, compared against recent benchmark methods, consistently demonstrate that the proposed CECS method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-09-25T15:05:25Z)
- Calibrating Class Activation Maps for Long-Tailed Visual Recognition [60.77124328049557]
We present two effective modifications of CNNs to improve network learning from long-tailed distributions.
First, we present a Class Activation Map Calibration (CAMC) module to improve the learning and prediction of network classifiers.
Second, we investigate the use of normalized classifiers for representation learning in long-tailed problems.
arXiv Detail & Related papers (2021-08-29T05:45:03Z)
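The "normalized classifier" idea can be sketched directly: L2-normalize both features and per-class weight vectors so logits depend on direction rather than magnitude, which keeps tail classes from being swamped by large weight norms. This is a generic sketch, not the paper's exact formulation, and the `scale` value is an assumption:

```python
import numpy as np

def cosine_logits(features, weights, scale=16.0):
    """Normalized ('cosine') classifier: logits from direction only, so
    tail-class weights are not swamped by magnitude differences."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    return scale * f @ w.T
```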
- Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z)
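Measuring that correlation is a one-liner once per-model accuracies are collected; the numbers below are hypothetical placeholders, not results from the paper:

```python
import numpy as np

# hypothetical per-model accuracies on an in-distribution test set
# (e.g. CIFAR-10) and an out-of-distribution variant
acc_id = np.array([0.91, 0.88, 0.95, 0.82, 0.93])
acc_ood = np.array([0.78, 0.73, 0.84, 0.65, 0.81])

r = np.corrcoef(acc_id, acc_ood)[0, 1]
print(f"ID/OOD accuracy correlation: r = {r:.3f}")
```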
- Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition from a Domain Adaptation Perspective [98.70226503904402]
Object frequency in the real world often follows a power law, leading to a mismatch between the long-tailed class distributions a model is trained on and our expectation that it perform well on all classes.
We propose to augment the classic class-balanced learning by explicitly estimating the differences between the class-conditioned distributions with a meta-learning approach.
arXiv Detail & Related papers (2020-03-24T11:28:42Z)
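The "classic class-balanced learning" being augmented is commonly instantiated as effective-number re-weighting (Cui et al., CVPR 2019); the sketch below shows that baseline only, as the paper's meta-learning estimate of class-conditioned distribution differences is beyond this snippet:

```python
import numpy as np

def class_balanced_weights(counts, beta=0.999):
    """Effective-number re-weighting: weight class c by 1 / E_c with
    E_c = (1 - beta**n_c) / (1 - beta), then normalize."""
    counts = np.asarray(counts, dtype=float)
    effective = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    w = 1.0 / effective
    return w * len(counts) / w.sum()

print(class_balanced_weights([5000, 500, 50]))  # tail classes get larger weights
```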