Related papers: Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection

URL: http://arxiv.org/abs/2305.08069v1
Date: Sun, 14 May 2023 04:53:05 GMT
Title: Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection
Authors: Burhaneddin Yaman, Tanvir Mahmud, Chun-Hao Liu
Abstract summary: Imbalanced datasets in real-world object detection often suffer from a large disparity in the number of instances for each class. We propose IRFS, which unifies instance and image counts in the re-sampling process so that it accounts for different perspectives of the imbalance. Our method shows promising results on the challenging LVIS v1.0 benchmark dataset.
Score: 3.4913694429616022
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose an embarrassingly simple method, instance-aware repeat factor sampling (IRFS), to address the problem of imbalanced data in long-tailed object detection. Imbalanced datasets in real-world object detection often suffer from a large disparity in the number of instances for each class. To improve the generalization performance of object detection models on rare classes, various data sampling techniques have been proposed. Repeat factor sampling (RFS) has shown promise due to its simplicity and effectiveness. Despite its efficiency, RFS completely neglects instance counts and relies solely on the image count during the re-sampling process. However, instance counts may vary immensely across classes with similar image counts. Such variation highlights the importance of both image and instance counts for addressing long-tail distributions. Thus, we propose IRFS, which unifies instance and image counts in the re-sampling process so that it is aware of different perspectives of the imbalance in long-tailed datasets. Our method shows promising results on the challenging LVIS v1.0 benchmark dataset over various architectures and backbones, demonstrating its effectiveness in improving the performance of object detection models on rare classes with a relative $+50\%$ average precision (AP) improvement over its RFS counterpart. IRFS can serve as a strong baseline and be easily incorporated into existing long-tailed frameworks.
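The difference between the two sampling schemes described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `repeat_factors`, the threshold `t`, and the input layout (one list of class labels per image) are hypothetical, and combining image and instance frequencies through a geometric mean is an assumption, since the abstract only states that IRFS "unifies" the two counts.

```python
import math
from collections import Counter


def repeat_factors(images, t=0.001, instance_aware=True):
    """Return one repeat factor per image.

    images: list of label lists, one label per annotated instance.
    t: frequency threshold (hypothetical default); classes rarer than t
       get a repeat factor above 1.
    instance_aware: False gives plain image-count RFS; True also uses
       instance counts (assumed here to be mixed in by a geometric mean).
    """
    num_images = len(images)
    total_instances = sum(len(labels) for labels in images)

    image_count = Counter()     # number of images containing each class
    instance_count = Counter()  # number of annotated instances per class
    for labels in images:
        instance_count.update(labels)
        image_count.update(set(labels))

    class_rf = {}
    for c in image_count:
        f_img = image_count[c] / num_images
        if instance_aware:
            f_inst = instance_count[c] / total_instances
            f = math.sqrt(f_img * f_inst)  # assumption: geometric mean
        else:
            f = f_img
        class_rf[c] = max(1.0, math.sqrt(t / f))

    # An image is repeated according to its rarest class.
    return [max((class_rf[c] for c in labels), default=1.0) for labels in images]
```

With two classes that appear in the same number of images but with different instance counts, the image-count-only variant assigns both the same repeat factor, while the instance-aware variant separates them. That is the situation the abstract points to when it notes that instance counts can vary widely across classes with similar image counts.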

Related papers

Exponentially Weighted Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection Model Training in Unmanned Aerial Vehicles Surveillance Scenarios [7.807810158327325]
This work introduces Exponentially Weighted Instance-Aware Repeat Factor Sampling (E-IRFS). E-IRFS applies exponential scaling to better differentiate between rare and frequent classes. We evaluate E-IRFS on a dataset derived from the Fireman-UAV-RGBT dataset.
arXiv Detail & Related papers (2025-03-27T18:09:37Z)
Efficient Feature Fusion for UAV Object Detection [9.632727117779178]
Small objects, in particular, occupy small portions of images, making their accurate detection difficult. Existing multi-scale feature fusion methods address these challenges by aggregating features across different resolutions. We propose a novel feature fusion framework specifically designed for UAV object detection tasks.
arXiv Detail & Related papers (2025-01-29T20:39:16Z)
Long-Tailed Object Detection Pre-training: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction [28.359463356384463]
We introduce a novel pre-training framework for object detection, called Dynamic Rebalancing Contrastive Learning with Dual Reconstruction (2DRCL). Our method builds on a Holistic-Local Contrastive Learning mechanism, which aligns pre-training with object detection by capturing both global contextual semantics and detailed local patterns. Experiments on COCO and LVIS v1.0 datasets demonstrate the effectiveness of our method, particularly in improving the mAP/AP scores for tail classes.
arXiv Detail & Related papers (2024-11-14T13:59:01Z)
LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations [4.680881326162484]
Contrastive instance discrimination methods outperform supervised learning in downstream tasks such as image classification and object detection. A common augmentation technique in contrastive learning is random cropping followed by resizing. We introduce LeOCLR, a framework that employs a novel instance discrimination approach and an adapted loss function.
arXiv Detail & Related papers (2024-03-11T15:33:32Z)
Advancing Image Retrieval with Few-Shot Learning and Relevance Feedback [5.770351255180495]
Image Retrieval with Relevance Feedback (IRRF) involves iterative human interaction during the retrieval process. We propose a new scheme based on a hyper-network that is tailored to the task and facilitates swift adjustment to user feedback. We show that our method can attain SoTA results in few-shot one-class classification and reach comparable results in the binary classification task of few-shot open-set recognition.
arXiv Detail & Related papers (2023-12-18T10:20:28Z)
Feature Generation for Long-tail Classification [36.186909933006675]
We show how to generate meaningful features by estimating the tail category's distribution. We also present a qualitative analysis of generated features using t-SNE visualizations and analyze the nearest neighbors used to calibrate the tail class distributions.
arXiv Detail & Related papers (2021-11-10T21:34:29Z)
Rethinking Sampling Strategies for Unsupervised Person Re-identification [59.47536050785886]
We analyze the reasons for the performance differences between various sampling strategies under the same framework and loss function. Group sampling is proposed, which gathers samples from the same class into groups. Experiments on Market-1501, DukeMTMC-reID and MSMT17 show that group sampling achieves performance comparable to state-of-the-art methods.
arXiv Detail & Related papers (2021-07-07T05:39:58Z)
FASA: Feature Augmentation and Sampling Adaptation for Long-Tailed Instance Segmentation [91.129039760095]
Recent methods for long-tailed instance segmentation still struggle on rare object classes with little training data. We propose a simple yet effective method, Feature Augmentation and Sampling Adaptation (FASA). FASA is a fast, generic method that can be easily plugged into standard or long-tailed segmentation frameworks.
arXiv Detail & Related papers (2021-02-25T14:07:23Z)
Towards Better Object Detection in Scale Variation with Adaptive Feature Selection [3.5352273012717044]
We propose a novel adaptive feature selection module (AFSM) to automatically learn the way to fuse multi-level representations in the channel dimension. It significantly improves the performance of the detectors that have a feature pyramid structure. A class-aware sampling mechanism (CASM) is proposed to tackle the class imbalance problem.
arXiv Detail & Related papers (2020-12-06T13:41:20Z)
The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation [93.17367076148348]
We investigate the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset. We unveil that a major cause is the inaccurate classification of object proposals. We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
arXiv Detail & Related papers (2020-07-23T12:49:07Z)
MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution [63.02785017714131]
Video super-resolution (VSR) aims to utilize multiple low-resolution frames to generate a high-resolution prediction for each frame. Inter- and intra-frames are the key sources for exploiting temporal and spatial information. We build an effective multi-correspondence aggregation network (MuCAN) for VSR.
arXiv Detail & Related papers (2020-07-23T05:41:27Z)
Multi-Scale Positive Sample Refinement for Few-Shot Object Detection [61.60255654558682]
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances. We propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD. MPSR generates multi-scale positive samples as object pyramids and refines the prediction at various scales.
arXiv Detail & Related papers (2020-07-18T09:48:29Z)
Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax [88.11979569564427]
We provide the first systematic analysis of the underperformance of state-of-the-art models when faced with long-tail distributions. We propose a novel balanced group softmax (BAGS) module for balancing the classifiers within the detection frameworks through group-wise training. Extensive experiments on the very recent long-tail large vocabulary object recognition benchmark LVIS show that our proposed BAGS significantly improves the performance of detectors.
arXiv Detail & Related papers (2020-06-18T10:24:26Z)
One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module. We also propose novel training strategies that effectively improve detection performance. Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.