On Model Calibration for Long-Tailed Object Detection and Instance
Segmentation
- URL: http://arxiv.org/abs/2107.02170v1
- Date: Mon, 5 Jul 2021 17:57:20 GMT
- Title: On Model Calibration for Long-Tailed Object Detection and Instance
Segmentation
- Authors: Tai-Yu Pan, Cheng Zhang, Yandong Li, Hexiang Hu, Dong Xuan, Soravit
Changpinyo, Boqing Gong, Wei-Lun Chao
- Abstract summary: We propose NorCal, Normalized Calibration for long-tailed object detection and instance segmentation.
We show that separately handling the background class and normalizing the scores over classes for each proposal are keys to achieving superior performance.
- Score: 56.82077636126353
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Vanilla models for object detection and instance segmentation suffer from the
heavy bias toward detecting frequent objects in the long-tailed setting.
Existing methods address this issue mostly during training, e.g., by
re-sampling or re-weighting. In this paper, we investigate a largely overlooked
approach -- post-processing calibration of confidence scores. We propose
NorCal, Normalized Calibration for long-tailed object detection and instance
segmentation, a simple and straightforward recipe that reweighs the predicted
scores of each class by its training sample size. We show that separately
handling the background class and normalizing the scores over classes for each
proposal are keys to achieving superior performance. On the LVIS dataset,
NorCal can effectively improve nearly all the baseline models not only on rare
classes but also on common and frequent classes. Finally, we conduct extensive
analysis and ablation studies to offer insights into various modeling choices
and mechanisms of our approach.
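The recipe described above is pure post-processing over predicted scores. As a rough illustration, assuming per-proposal softmax scores with a trailing background column, a NorCal-style calibration could look like the sketch below; the function name, the `gamma` exponent, and its default are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def norcal(scores, class_sizes, gamma=1.0):
    """NorCal-style post-hoc calibration (sketch, not the official code).

    scores:      (num_proposals, C+1) softmax scores; last column = background.
    class_sizes: (C,) training sample counts per foreground class.
    gamma:       exponent controlling calibration strength (assumed name).
    """
    # Down-weight each foreground class by its training sample size
    fg = scores[:, :-1] / (class_sizes[None, :] ** gamma)
    # Handle the background class separately (left uncalibrated here)
    bg = scores[:, -1:]
    calibrated = np.concatenate([fg, bg], axis=1)
    # Normalize over classes for each proposal so scores stay comparable
    return calibrated / calibrated.sum(axis=1, keepdims=True)
```

The effect is that a rare class with few training samples is divided by a small number and so gains relative score mass, while frequent classes are suppressed, without retraining the detector.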
Related papers
- Enabling Calibration In The Zero-Shot Inference of Large Vision-Language
Models [58.720142291102135]
We measure calibration across relevant variables like prompt, dataset, and architecture, and find that zero-shot inference with CLIP is miscalibrated.
A single learned temperature generalizes for each specific CLIP model across inference dataset and prompt choice.
arXiv Detail & Related papers (2023-03-11T17:14:04Z)
- A Gating Model for Bias Calibration in Generalized Zero-shot Learning [18.32369721322249]
Generalized zero-shot learning (GZSL) aims at training a model that can generalize to unseen class data by only using auxiliary information.
One of the main challenges in GZSL is a biased model prediction toward seen classes caused by overfitting on only available seen class data during training.
We propose a two-stream autoencoder-based gating model for GZSL.
arXiv Detail & Related papers (2022-03-08T16:41:06Z)
- Closing the Generalization Gap in One-Shot Object Detection [92.82028853413516]
We show that the key to strong few-shot detection models may not lie in sophisticated metric learning approaches, but instead in scaling the number of categories.
Future data annotation efforts should therefore focus on wider datasets and annotate a larger number of categories.
arXiv Detail & Related papers (2020-11-09T09:31:17Z)
- The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation [93.17367076148348]
We investigate performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
arXiv Detail & Related papers (2020-07-23T12:49:07Z)
- Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax [88.11979569564427]
We provide the first systematic analysis on the underperformance of state-of-the-art models in front of long-tail distribution.
We propose a novel balanced group softmax (BAGS) module for balancing the classifiers within the detection frameworks through group-wise training.
Extensive experiments on the very recent long-tail large vocabulary object recognition benchmark LVIS show that our proposed BAGS significantly improves the performance of detectors.
arXiv Detail & Related papers (2020-06-18T10:24:26Z)
- UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large-scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.