Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image
Classification
- URL: http://arxiv.org/abs/2212.08649v2
- Date: Fri, 22 Sep 2023 05:44:37 GMT
- Title: Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image
Classification
- Authors: Ming-Chang Chiu, Pin-Yu Chen, Xuezhe Ma
- Abstract summary: We investigate how natural background colors play a role as spurious features by annotating the test sets of CIFAR10 and CIFAR100 into subgroups based on the background color of each image.
We find that overall human-level accuracy does not guarantee consistent subgroup performances, and the phenomenon remains even on models pre-trained on ImageNet or after data augmentation (DA).
Experimental results show that FlowAug achieves more consistent subgroup results than other types of DA methods on CIFAR10/100 and on CIFAR10/100-C.
- Score: 73.87160347728314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we provide 20,000 non-trivial human annotations on
popular datasets as a first step toward bridging the gap in studying how
natural semantic spurious features affect image classification, since prior
works often study datasets that mix in low-level features due to limited
access to realistic datasets. We investigate how natural background colors act
as spurious
features by annotating the test sets of CIFAR10 and CIFAR100 into subgroups
based on the background color of each image. We name our datasets
\textbf{CIFAR10-B} and \textbf{CIFAR100-B} and integrate them with CIFAR-Cs.
We find that overall human-level accuracy does not guarantee consistent
subgroup performances, and the phenomenon remains even on models pre-trained on
ImageNet or after data augmentation (DA). To alleviate this issue, we propose
\textbf{FlowAug}, a \emph{semantic} DA that leverages decoupled semantic
representations captured by a pre-trained generative flow. Experimental results
show that FlowAug achieves more consistent subgroup results than other types of
DA methods on CIFAR10/100 and on CIFAR10/100-C. Additionally, it shows better
generalization performance.
Furthermore, we propose a generic metric, \emph{MacroStd}, for studying model
robustness to spurious correlations, where we take a macro average of the
weighted standard deviations across different classes. We show that
\emph{MacroStd} is more predictive of better performance; per this metric,
FlowAug demonstrates improvements on subgroup discrepancy. Although the metric
is proposed for studying our curated datasets, it applies to any dataset with
subgroups or subclasses. Lastly, we also show superior out-of-distribution
results on CIFAR10.1.
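The abstract specifies \emph{MacroStd} only in words. A minimal sketch of one plausible reading, assuming subgroup accuracies are weighted by subgroup sample counts within each class and the per-class weighted standard deviations are then macro-averaged (both assumptions, not the paper's exact definition):

```python
import numpy as np

def macro_std(subgroup_acc, subgroup_count):
    """Hypothetical MacroStd: for each class, the standard deviation of its
    subgroup accuracies weighted by subgroup size, macro-averaged over classes.

    subgroup_acc:   dict mapping class -> per-subgroup accuracies
    subgroup_count: dict mapping class -> per-subgroup sample counts
    """
    per_class_std = []
    for cls, acc in subgroup_acc.items():
        acc = np.asarray(acc, dtype=float)
        w = np.asarray(subgroup_count[cls], dtype=float)
        w = w / w.sum()                          # normalize weights
        mean = np.sum(w * acc)                   # weighted mean accuracy
        var = np.sum(w * (acc - mean) ** 2)      # weighted variance
        per_class_std.append(np.sqrt(var))
    return float(np.mean(per_class_std))         # macro average over classes

# Toy example: two classes, three background-color subgroups each.
acc = {"cat": [0.95, 0.80, 0.70], "dog": [0.90, 0.88, 0.86]}
cnt = {"cat": [500, 300, 200], "dog": [400, 400, 200]}
print(macro_std(acc, cnt))  # lower means more consistent subgroup performance
```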
Related papers
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- A soft nearest-neighbor framework for continual semi-supervised learning [35.957577587090604]
We propose an approach for continual semi-supervised learning where not all the data samples are labeled.
We leverage the power of nearest-neighbors to nonlinearly partition the feature space and flexibly model the underlying data distribution.
Our method works well on both low and high resolution images and scales seamlessly to more complex datasets.
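The summary does not spell out the soft nearest-neighbor rule. A generic illustration of soft nearest-neighbor classification, assuming a memory of labeled features and softmax-weighted voting (not necessarily the paper's exact formulation):

```python
import numpy as np

def soft_nn_predict(query, memory_feats, memory_labels, num_classes, tau=0.1):
    """Generic soft nearest-neighbor classifier: class scores are a
    softmax-weighted vote over stored labeled exemplars (illustrative only)."""
    # Cosine similarity between the query and every stored feature.
    q = query / np.linalg.norm(query)
    m = memory_feats / np.linalg.norm(memory_feats, axis=1, keepdims=True)
    sim = m @ q                         # similarity to each exemplar
    w = np.exp(sim / tau)
    w = w / w.sum()                     # soft neighbor weights
    scores = np.zeros(num_classes)
    for weight, label in zip(w, memory_labels):
        scores[label] += weight         # weighted class vote
    return scores.argmax()

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))
labels = rng.integers(0, 5, size=100)
print(soft_nn_predict(rng.normal(size=16), feats, labels, num_classes=5))
```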
arXiv Detail & Related papers (2022-12-09T20:03:59Z)
- AU-Aware Vision Transformers for Biased Facial Expression Recognition [17.00557858587472]
We experimentally show that the naive joint training of multiple FER datasets is harmful to the FER performance of individual datasets.
We propose a simple yet conceptually new framework, AU-aware Vision Transformer (AU-ViT).
Our AU-ViT achieves state-of-the-art performance on three popular datasets, namely 91.10% on RAF-DB, 65.59% on AffectNet, and 90.15% on FERPlus.
arXiv Detail & Related papers (2022-11-12T08:58:54Z)
- Assessing Dataset Bias in Computer Vision [0.0]
Biases tend to propagate to the models trained on them, often leading to poor performance on the minority class.
We apply several techniques to a sample of the UTKFace dataset, such as undersampling, geometric transformations, variational autoencoders (VAEs), and generative adversarial networks (GANs).
We show that our model achieves better overall performance and consistency on age and ethnicity classification across multiple datasets when compared with the FairFace model.
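As an illustration of the geometric transformations mentioned above, a minimal torchvision pipeline; the exact transforms and parameters used on UTKFace are assumptions here:

```python
from torchvision import transforms

# Illustrative geometric augmentations of the kind the study mentions;
# the specific choices below (flip, small rotation, crop) are assumptions.
geometric_aug = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.RandomResizedCrop(size=200, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
# Usage: augmented = geometric_aug(pil_image)
```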
arXiv Detail & Related papers (2022-05-03T22:45:49Z)
- Improving Contrastive Learning on Imbalanced Seed Data via Open-World Sampling [96.8742582581744]
We present an open-world unlabeled data sampling framework called Model-Aware K-center (MAK).
MAK follows three simple principles: tailness, proximity, and diversity.
We demonstrate that MAK can consistently improve both the overall representation quality and the class balancedness of the learned features.
arXiv Detail & Related papers (2021-11-01T15:09:41Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm, Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
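A rough sketch of the virtual-representation step described above, assuming per-class Gaussians fit to penultimate-layer features; CCVR's federated aggregation of per-client statistics is omitted:

```python
import numpy as np

def virtual_representations(feats, labels, num_classes, n_virtual=100):
    """Sketch of the virtual-representation idea: fit a Gaussian per class
    over features and sample virtual features from it. This is a single-node
    simplification, not the paper's federated formulation."""
    rng = np.random.default_rng(0)
    xs, ys = [], []
    for c in range(num_classes):
        fc = feats[labels == c]
        mu = fc.mean(axis=0)
        cov = np.cov(fc, rowvar=False) + 1e-4 * np.eye(feats.shape[1])  # regularize
        xs.append(rng.multivariate_normal(mu, cov, size=n_virtual))
        ys.append(np.full(n_virtual, c))
    # The classifier head would then be recalibrated (e.g., via logistic
    # regression) on the virtual samples instead of raw client data.
    return np.concatenate(xs), np.concatenate(ys)
```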
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles [66.15398165275926]
We propose a method that can automatically detect and ignore dataset-specific patterns, which we call dataset biases.
Our method trains a lower capacity model in an ensemble with a higher capacity model.
We show improvement in all settings, including a 10 point gain on the visual question answering dataset.
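One common way to realize such a low/high-capacity ensemble is a product-of-experts objective; a hedged sketch follows (this combination rule is an assumption, not confirmed as the paper's exact formulation):

```python
import torch.nn.functional as F

def ensemble_loss(high_logits, low_logits, targets):
    """Sketch of a mixed-capacity ensemble objective: train the combined
    prediction (product of experts, i.e. summed log-probabilities), so the
    high-capacity model need not re-learn what the low-capacity, bias-prone
    model already captures."""
    combined = F.log_softmax(high_logits, dim=1) + F.log_softmax(low_logits, dim=1)
    # cross_entropy renormalizes the combined scores, giving the
    # normalized product-of-experts distribution.
    return F.cross_entropy(combined, targets)

# At test time, only the high-capacity model's logits would be used,
# so dataset-specific shortcuts absorbed by the small model are ignored.
```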
arXiv Detail & Related papers (2020-11-07T22:20:03Z)
- SCAN: Learning to Classify Images without Labels [73.69513783788622]
We advocate a two-step approach where feature learning and clustering are decoupled.
A self-supervised task from representation learning is employed to obtain semantically meaningful features.
We obtain promising results on ImageNet, and outperform several semi-supervised learning methods in the low-data regime.
arXiv Detail & Related papers (2020-05-25T18:12:33Z)
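As a minimal stand-in for the two-step decoupling described in the SCAN entry: step one produces self-supervised features (assumed precomputed below), step two clusters them. SCAN's actual second step trains a clustering head with a nearest-neighbor consistency loss rather than running k-means:

```python
import numpy as np
from sklearn.cluster import KMeans

# Step 1 (assumed done elsewhere): features from a self-supervised model,
# saved to a hypothetical file.
features = np.load("selfsup_features.npy")

# Step 2: cluster the frozen features; k-means is a simple stand-in for
# SCAN's learned clustering head.
clusters = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(features)
```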