Improving Long-Tailed Object Detection with Balanced Group Softmax and Metric Learning
- URL: http://arxiv.org/abs/2511.16619v1
- Date: Tue, 02 Sep 2025 00:38:13 GMT
- Title: Improving Long-Tailed Object Detection with Balanced Group Softmax and Metric Learning
- Authors: Satyam Gaba,
- Abstract summary: We tackle the problem of long-tailed 2D object detection using the LVISv1 dataset.<n>We employ a two-stage Faster R-CNN architecture and propose enhancements to the Balanced Group Softmax framework.<n>Our approach achieves a new state-of-the-art performance with a mean Average Precision (mAP) of 24.5%, surpassing the previous benchmark of 24.0%.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object detection has been widely explored for class-balanced datasets such as COCO. However, real-world scenarios introduce the challenge of long-tailed distributions, where numerous categories contain only a few instances. This inherent class imbalance biases detection models towards the more frequent classes, degrading performance on rare categories. In this paper, we tackle the problem of long-tailed 2D object detection using the LVISv1 dataset, which consists of 1,203 categories and 164,000 images. We employ a two-stage Faster R-CNN architecture and propose enhancements to the Balanced Group Softmax (BAGS) framework to mitigate class imbalance. Our approach achieves a new state-of-the-art performance with a mean Average Precision (mAP) of 24.5%, surpassing the previous benchmark of 24.0%. Additionally, we hypothesize that tail class features may form smaller, denser clusters within the feature space of head classes, making classification challenging for regression-based classifiers. To address this issue, we explore metric learning to produce feature embeddings that are both well-separated across classes and tightly clustered within each class. For inference, we utilize a k-Nearest Neighbors (k-NN) approach to improve classification performance, particularly for rare classes. Our results demonstrate the effectiveness of these methods in advancing long-tailed object detection.
Related papers
- Learning from Neighbors: Category Extrapolation for Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.<n>We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.<n>To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z) - Scaling Up Deep Clustering Methods Beyond ImageNet-1K [0.9437165725355702]
In this work, we explore the performance of feature-based deep clustering approaches on large-scale benchmarks.
Our experimental analysis reveals that feature-based $k$-means is often unfairly evaluated on balanced datasets.
Deep clustering methods outperform $k$-means across most large-scale benchmarks.
arXiv Detail & Related papers (2024-06-03T11:13:27Z) - DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot
Object Detection [39.937724871284665]
Generalized few-shot object detection aims to achieve precise detection on both base classes with abundant annotations and novel classes with limited training data.
Existing approaches enhance few-shot generalization with the sacrifice of base-class performance.
We propose a new training framework, DiGeo, to learn Geometry-aware features of inter-class separation and intra-class compactness.
arXiv Detail & Related papers (2023-03-16T22:37:09Z) - Ranking hierarchical multi-label classification results with mLPRs [4.869182515096001]
We focus on the less attended second-stage question while adhering to the given class hierarchy.<n>We introduce a new objective function, called CATCH, to ensure reasonable classification performance.<n>Our approach was evaluated on a synthetic dataset and two real datasets.
arXiv Detail & Related papers (2022-05-16T17:43:35Z) - Exploring Classification Equilibrium in Long-Tailed Object Detection [29.069986049436157]
We propose to use the mean classification score to indicate the classification accuracy for each category during training.
We balance the classification via an Equilibrium Loss (EBL) and a Memory-augmented Feature Sampling (MFS) method.
It improves the detection performance of tail classes by 15.6 AP, and outperforms the most recent long-tailed object detectors by more than 1 AP.
arXiv Detail & Related papers (2021-08-17T08:39:04Z) - Adaptive Class Suppression Loss for Long-Tail Object Detection [49.7273558444966]
We devise a novel Adaptive Class Suppression Loss (ACSL) to improve the detection performance of tail categories.
Our ACSL achieves 5.18% and 5.2% improvements with ResNet50-FPN, and sets a new state of the art.
arXiv Detail & Related papers (2021-04-02T05:12:31Z) - Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z) - The Devil is in Classification: A Simple Framework for Long-tail Object
Detection and Instance Segmentation [93.17367076148348]
We investigate performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
arXiv Detail & Related papers (2020-07-23T12:49:07Z) - Overcoming Classifier Imbalance for Long-tail Object Detection with
Balanced Group Softmax [88.11979569564427]
We provide the first systematic analysis on the underperformance of state-of-the-art models in front of long-tail distribution.
We propose a novel balanced group softmax (BAGS) module for balancing the classifiers within the detection frameworks through group-wise training.
Extensive experiments on the very recent long-tail large vocabulary object recognition benchmark LVIS show that our proposed BAGS significantly improves the performance of detectors.
arXiv Detail & Related papers (2020-06-18T10:24:26Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.