Background Splitting: Finding Rare Classes in a Sea of Background
- URL: http://arxiv.org/abs/2008.12873v1
- Date: Fri, 28 Aug 2020 23:05:15 GMT
- Title: Background Splitting: Finding Rare Classes in a Sea of Background
- Authors: Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian
- Abstract summary: We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories.
In these scenarios, almost all images belong to the background category in the dataset (>95% of the dataset is background).
We demonstrate that both standard fine-tuning approaches and state-of-the-art approaches for training on imbalanced datasets do not produce accurate deep models in the presence of this extreme imbalance.
- Score: 55.03789745276442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We focus on the real-world problem of training accurate deep models for image
classification of a small number of rare categories. In these scenarios, almost
all images belong to the background category in the dataset (>95% of the
dataset is background). We demonstrate that both standard fine-tuning
approaches and state-of-the-art approaches for training on imbalanced datasets
do not produce accurate deep models in the presence of this extreme imbalance.
Our key observation is that the extreme imbalance due to the background
category can be drastically reduced by leveraging visual knowledge from an
existing pre-trained model. Specifically, the background category is "split"
into smaller and more coherent pseudo-categories during training using a
pre-trained model. We incorporate background splitting into an image
classification model by adding an auxiliary loss that learns to mimic the
predictions of the existing, pre-trained image classification model. Note that
this process is automatic and requires no additional manual labels. The
auxiliary loss regularizes the feature representation of the shared network
trunk by requiring it to discriminate between previously homogeneous background
instances and reduces overfitting to the small number of rare category
positives. We also show that BG splitting can be combined with other background
imbalance methods to further improve performance. We evaluate our method on a
modified version of the iNaturalist dataset where only a small subset of rare
category labels are available during training (all other images are labeled as
background). By jointly learning to recognize ImageNet categories and selected
iNaturalist categories, our approach yields performance that is 42.3 mAP points
higher than a fine-tuning baseline when 99.98% of the data is background, and
8.3 mAP points higher than SotA baselines when 98.30% of the data is
background.
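
The training setup described above is straightforward to prototype. Below is a minimal sketch of the two-head architecture and joint loss, assuming PyTorch with a ResNet-50 trunk; the names `BackgroundSplitModel` and `background_split_loss`, the KL-divergence distillation form of the auxiliary loss, and the `aux_weight` and `temperature` hyperparameters are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class BackgroundSplitModel(nn.Module):
    """Shared trunk with two heads: a main head over the rare
    categories plus background, and an auxiliary head trained to
    mimic a frozen pre-trained classifier (e.g., over ImageNet)."""

    def __init__(self, num_rare_classes: int, num_aux_classes: int = 1000):
        super().__init__()
        trunk = models.resnet50(weights="IMAGENET1K_V1")
        feat_dim = trunk.fc.in_features
        trunk.fc = nn.Identity()  # expose the shared feature trunk
        self.trunk = trunk
        # +1 output for the single background category
        self.main_head = nn.Linear(feat_dim, num_rare_classes + 1)
        self.aux_head = nn.Linear(feat_dim, num_aux_classes)

    def forward(self, x: torch.Tensor):
        feats = self.trunk(x)
        return self.main_head(feats), self.aux_head(feats)


def background_split_loss(main_logits, aux_logits, labels, teacher_logits,
                          aux_weight: float = 1.0, temperature: float = 1.0):
    """Main classification loss plus a distillation-style auxiliary loss
    that 'splits' the background by matching the teacher's predictions."""
    main_loss = F.cross_entropy(main_logits, labels)
    t = temperature
    aux_loss = F.kl_div(
        F.log_softmax(aux_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * (t * t)
    return main_loss + aux_weight * aux_loss
```

In training, `teacher_logits` would come from a frozen copy of the pre-trained model run on the same batch, so no additional manual labels are required; the auxiliary term forces the shared trunk to keep discriminating among background images that the main head would otherwise collapse into a single category.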
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases [8.455991178281469]
We present Spawrious-{O2O, M2M}-{Easy, Medium, Hard}, an image classification benchmark suite containing spurious correlations between classes and backgrounds.
The resulting dataset is of high quality and contains approximately 152k images.
arXiv Detail & Related papers (2023-03-09T18:22:12Z)
- Co-training $2^L$ Submodels for Visual Recognition [67.02999567435626]
Submodel co-training is a regularization method related to co-training, self-distillation and stochastic depth.
We show that submodel co-training is effective for training backbones for recognition tasks such as image classification and semantic segmentation.
arXiv Detail & Related papers (2022-12-09T14:38:09Z)
- Invariant Learning via Diffusion Dreamed Distribution Shifts [121.71383835729848]
We propose a dataset called Diffusion Dreamed Distribution Shifts (D3S).
D3S consists of synthetic images generated through Stable Diffusion using text prompts and image guides obtained by pasting a sample foreground image onto a background template image.
Due to the incredible photorealism of the diffusion model, our images are much closer to natural images than previous synthetic datasets.
arXiv Detail & Related papers (2022-11-18T17:07:43Z)
- Few-shot Open-set Recognition Using Background as Unknowns [58.04165813493666]
Few-shot open-set recognition aims to classify both seen and novel images given only limited training data of seen classes.
Our proposed method not only outperforms multiple baselines but also sets new results on three popular benchmarks.
arXiv Detail & Related papers (2022-07-19T04:19:29Z)
- Towards General Deep Leakage in Federated Learning [13.643899029738474]
Federated learning (FL) improves the performance of the global model by sharing and aggregating local models rather than local data, protecting users' privacy.
Some research has demonstrated that an attacker can still recover private data based on the shared gradient information.
We propose methods that can reconstruct the training data from shared gradients or weights, corresponding to the FedSGD and FedAvg usage scenarios.
arXiv Detail & Related papers (2021-10-18T07:49:52Z)
- Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel Few-Shot Learning framework, to automatically figure out foreground objects at both the pretraining and evaluation stages.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- One-Shot Image Classification by Learning to Restore Prototypes [11.448423413463916]
One-shot image classification aims to train image classifiers with only one image per category.
In this setting, existing metric learning approaches suffer poor performance because the single training image may not be representative of the class.
We propose a simple yet effective regression model, denoted by RestoreNet, which learns a class transformation on the image feature to move the image closer to the class center in the feature space.
arXiv Detail & Related papers (2020-05-04T02:11:30Z)