BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors
- URL: http://arxiv.org/abs/2503.17717v1
- Date: Sat, 22 Mar 2025 10:23:11 GMT
- Title: BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors
- Authors: Yu Wang, Junxian Mu, Hongzhi Huang, Qilong Wang, Pengfei Zhu, Qinghua Hu
- Abstract summary: Open set recognition (OSR) requires models to classify known samples while detecting unknown samples for real-world applications. Existing studies show impressive progress using unknown samples from auxiliary datasets to regularize OSR models, but they have proved sensitive to the selection of such known outliers. We propose a new method, Background Mix (BackMix), that mixes the foreground of an image with different backgrounds to remove the underlying fore-background priors.
- Score: 50.09148454840245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open set recognition (OSR) requires models to classify known samples while detecting unknown samples for real-world applications. Existing studies show impressive progress using unknown samples from auxiliary datasets to regularize OSR models, but they have proved sensitive to the selection of such known outliers. In this paper, we discuss the aforementioned problem from a new perspective: can we regularize OSR models without elaborately selecting auxiliary known outliers? We first empirically and theoretically explore the role of foregrounds and backgrounds in open set recognition and disclose that: 1) backgrounds that correlate with foregrounds can mislead the model and cause failures when it encounters 'partially' known images; 2) backgrounds unrelated to foregrounds can serve as auxiliary known outliers and provide regularization via global average pooling. Based on these insights, we propose a new method, Background Mix (BackMix), that mixes the foreground of an image with different backgrounds to remove the underlying fore-background priors. Specifically, BackMix first estimates the foreground with class activation maps (CAMs), then randomly replaces image patches with backgrounds from other images to obtain mixed images for training. With backgrounds de-correlated from foregrounds, open set recognition performance is significantly improved. The proposed method is simple to implement, requires no extra operations at inference, and can be seamlessly integrated into almost all existing frameworks. The code is released at https://github.com/Vanixxz/BackMix.
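The abstract describes the BackMix procedure only at a high level (CAM-based foreground estimation followed by patch-wise background replacement). The PyTorch sketch below is a minimal, hypothetical rendering of that idea; the function names, patch size, foreground threshold, mixing probability, and donor-patch sampling are illustrative assumptions, not the authors' implementation, which is available at the GitHub link above.

```python
import torch
import torch.nn.functional as F


def compute_cam(features, fc_weight, label):
    """Class activation map for one image.

    features:  (C, h, w) feature maps from the last conv layer
    fc_weight: (num_classes, C) weights of the linear head that follows GAP
    label:     ground-truth class index of the image
    """
    cam = torch.einsum("c,chw->hw", fc_weight[label], features)
    cam = F.relu(cam)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # values in [0, 1]


def backmix(images, features, fc_weight, labels, patch=32, fg_thresh=0.5, p=0.5):
    """Replace low-activation (likely background) patches of each image with
    patches taken from another image in the batch (assumes batch size >= 2).

    The patch size, threshold, mixing probability, and the choice to copy the
    donor patch from the same spatial location are assumptions of this sketch,
    not details taken from the paper.
    """
    B, _, H, W = images.shape
    mixed = images.clone()
    for i in range(B):
        cam = compute_cam(features[i], fc_weight, labels[i])
        cam = F.interpolate(cam[None, None], size=(H, W), mode="bilinear",
                            align_corners=False)[0, 0]
        donor = (i + torch.randint(1, B, (1,)).item()) % B  # any other image
        for y in range(0, H, patch):
            for x in range(0, W, patch):
                is_bg = cam[y:y + patch, x:x + patch].mean() < fg_thresh
                if is_bg and torch.rand(1).item() < p:
                    mixed[i, :, y:y + patch, x:x + patch] = \
                        images[donor, :, y:y + patch, x:x + patch]
    return mixed
```

In a full training loop, `mixed` would simply replace `images` in the forward pass; per the abstract, the de-correlated background patches are what provide the regularization through global average pooling, and no extra operation is needed at inference.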
Related papers
- Is Foreground Prototype Sufficient? Few-Shot Medical Image Segmentation with Background-Fused Prototype [40.062825908232185]
Few-shot semantic segmentation (FSS) aims to adapt a pre-trained model to new classes with as few as a single labeled training sample per class.
We present a new pluggable Background-fused prototype (Bro) for FSS in medical images.
Bro incorporates the background with two pivotal designs. Specifically, Feature Similarity (FeaC) initially reduces noise in the support image by employing feature cross-attention with the query image.
We achieve this with a channel-group-based attention mechanism, in which an adversarial structure encourages coarse-to-fine fusion.
arXiv Detail & Related papers (2024-12-04T02:51:22Z)
- BackMix: Mitigating Shortcut Learning in Echocardiography with Minimal Supervision [1.3708815960776262]
We propose a simple, yet effective random background augmentation method called BackMix.
By enforcing the background to be uncorrelated with the outcome, the model learns to focus on the data within the ultrasound sector.
We extend our method to a semi-supervised setting, finding that the positive effects of BackMix are maintained with as few as 5% of segmentation labels.
arXiv Detail & Related papers (2024-06-27T13:06:47Z)
- MAEDAY: MAE for few and zero shot AnomalY-Detection [44.99483220711847]
We propose using a Masked Auto-Encoder (MAE), a transformer model trained in a self-supervised manner on image inpainting, for anomaly detection (AD).
MAEDAY is the first image-reconstruction-based anomaly detection method that utilizes a pre-trained model.
arXiv Detail & Related papers (2022-11-25T18:59:46Z)
- Boosting Few-shot Fine-grained Recognition with Background Suppression and Foreground Alignment [53.401889855278704]
Few-shot fine-grained recognition (FS-FGR) aims to recognize novel fine-grained categories with the help of limited available samples.
We propose a two-stage background suppression and foreground alignment framework, which is composed of a background activation suppression (BAS) module, a foreground object alignment (FOA) module, and a local to local (L2L) similarity metric.
Experiments conducted on multiple popular fine-grained benchmarks demonstrate that our method outperforms the existing state-of-the-art by a large margin.
arXiv Detail & Related papers (2022-10-04T07:54:40Z)
- Learning to Detect Every Thing in an Open World [139.78830329914135]
We propose a simple yet surprisingly powerful data augmentation and training scheme we call Learning to Detect Every Thing (LDET).
To avoid suppressing hidden objects (background objects that are visible but unlabeled), we paste annotated objects onto a background image sampled from a small region of the original image.
LDET leads to significant improvements on many datasets in the open world instance segmentation task.
arXiv Detail & Related papers (2021-12-03T03:56:06Z)
- Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel Few-Shot Learning framework, to automatically figure out foreground objects at both the pretraining and evaluation stages.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- The Blessings of Unlabeled Background in Untrimmed Videos [66.99259967869065]
Weakly-supervised Temporal Action Localization (WTAL) aims to detect the intervals of action instances with only video-level action labels available during training.
The key challenge is how to distinguish the segments of interest from the background segments, which are unlabelled even at the video level.
We propose a Temporal Smoothing PCA-based (TS-PCA) deconfounder, which exploits the unlabelled background to model an observed substitute for the confounder.
arXiv Detail & Related papers (2021-03-24T13:34:42Z)
- Noise or Signal: The Role of Image Backgrounds in Object Recognition [93.55720207356603]
We create a toolkit for disentangling foreground and background signal on ImageNet images.
We find that (a) models can achieve non-trivial accuracy by relying on the background alone, and (b) models often misclassify images even in the presence of correctly classified foregrounds (a minimal sketch of such a background-only evaluation follows this list).
arXiv Detail & Related papers (2020-06-17T16:54:43Z)
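Related to the last entry above ("Noise or Signal: The Role of Image Backgrounds in Object Recognition"), here is a hedged sketch of a background-only evaluation that exposes how much a classifier relies on backgrounds. The data format (a loader yielding images, labels, and foreground masks) and the erase-by-zeroing choice are assumptions for illustration; the original work builds dedicated foreground/background variants of ImageNet rather than masking on the fly.

```python
import torch


@torch.no_grad()
def background_only_accuracy(model, loader, device="cuda"):
    """Top-1 accuracy when the foreground is erased from every image.

    `loader` is assumed to yield (images, labels, fg_masks), where fg_masks is
    a (B, H, W) binary foreground mask -- a stand-in for the foreground
    annotations the original toolkit provides.
    """
    model.eval()
    correct = total = 0
    for images, labels, fg_masks in loader:
        images, labels = images.to(device), labels.to(device)
        bg_only = images * (1.0 - fg_masks.to(device).unsqueeze(1).float())
        preds = model(bg_only).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total
```

A model that scores well above chance here is leaning on background cues, which is exactly the kind of fore-background prior that BackMix aims to remove.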