The Devil is in Classification: A Simple Framework for Long-tail Object
Detection and Instance Segmentation
- URL: http://arxiv.org/abs/2007.11978v5
- Date: Tue, 3 Nov 2020 04:11:23 GMT
- Title: The Devil is in Classification: A Simple Framework for Long-tail Object
Detection and Instance Segmentation
- Authors: Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang,
Steven Hoi, Jiashi Feng
- Abstract summary: We investigate the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
- Score: 93.17367076148348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing object instance detection and segmentation models only work
well on fairly balanced benchmarks where per-category training sample numbers
are comparable, such as COCO. They tend to suffer a performance drop on realistic
datasets that are usually long-tailed. This work aims to study and address such
open challenges. Specifically, we systematically investigate the performance drop
of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the
recent long-tail LVIS dataset, and unveil that a major cause is the inaccurate
classification of object proposals. Based on such an observation, we first
consider various techniques for improving long-tail classification performance
which indeed enhance instance segmentation results. We then propose a simple
calibration framework to more effectively alleviate classification head bias
with a bi-level class balanced sampling approach. Without bells and whistles,
it significantly boosts the performance of instance segmentation for tail
classes on the recent LVIS dataset and our sampled COCO-LT dataset. Our
analysis provides useful insights for solving long-tail instance detection and
segmentation problems, and the straightforward SimCal method can serve
as a simple but strong baseline. With this method, we won the 2019 LVIS
challenge. Code and models are available at https://github.com/twangnh/SimCal.
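The abstract names a bi-level class balanced sampling approach but does not spell out its steps. The following is a minimal Python sketch of one plausible reading, assuming the two levels are (1) drawing classes uniformly and then an image containing each drawn class, and (2) keeping only proposals of that class or background when calibrating the classification head. The function names, the `annotations` layout, and the `background_id` value are hypothetical and are not taken from the SimCal repository.

```python
import random
from collections import defaultdict


def build_class_index(annotations):
    """Map each class id to the sorted list of image ids that contain it."""
    index = defaultdict(set)
    for image_id, class_ids in annotations.items():
        for class_id in class_ids:
            index[class_id].add(image_id)
    return {c: sorted(imgs) for c, imgs in index.items()}


def bi_level_sample(annotations, num_samples, rng):
    """Return (class_id, image_id) pairs drawn with two levels of sampling.

    Level 1 picks a class uniformly, so rare classes are drawn as often
    as frequent ones. Level 2 picks one image that contains that class.
    """
    class_index = build_class_index(annotations)
    classes = sorted(class_index)
    batch = []
    for _ in range(num_samples):
        class_id = rng.choice(classes)
        image_id = rng.choice(class_index[class_id])
        batch.append((class_id, image_id))
    return batch


def select_instances(proposal_labels, class_id, background_id=0):
    """Keep indices of proposals labeled with the sampled class or background.

    background_id=0 is an assumed convention, not a documented one.
    """
    keep = (class_id, background_id)
    return [i for i, label in enumerate(proposal_labels) if label in keep]


if __name__ == "__main__":
    # Toy data: class 1 appears in two images, class 2 in only one.
    toy_annotations = {"img1": [1, 1, 1], "img2": [1], "img3": [2]}
    pairs = bi_level_sample(toy_annotations, 4, random.Random(0))
    print(pairs)
    print(select_instances([0, 1, 2, 1], class_id=1))
```

Under these assumptions, each class is drawn with equal probability regardless of how many instances it has, which is the property a class balanced sampler is meant to provide.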
Related papers
- On Model Calibration for Long-Tailed Object Detection and Instance
Segmentation [56.82077636126353]
We propose NorCal, Normalized Calibration for long-tailed object detection and instance segmentation.
We show that separately handling the background class and normalizing the scores over classes for each proposal are keys to achieving superior performance.
arXiv Detail & Related papers (2021-07-05T17:57:20Z)
- Large-scale Unsupervised Semantic Segmentation [163.3568726730319]
We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress.
Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
arXiv Detail & Related papers (2021-06-06T15:02:11Z)
- The Little W-Net That Could: State-of-the-Art Retinal Vessel
Segmentation with Minimalistic Models [19.089445797922316]
We show that a minimalistic version of a standard U-Net with several orders of magnitude fewer parameters closely approximates the performance of current best techniques.
We also propose a simple extension, dubbed W-Net, which reaches outstanding performance on several popular datasets.
We also test our approach on the Artery/Vein segmentation problem, where we again achieve results well-aligned with the state-of-the-art.
arXiv Detail & Related papers (2020-09-03T19:59:51Z)
- Overcoming Classifier Imbalance for Long-tail Object Detection with
Balanced Group Softmax [88.11979569564427]
We provide the first systematic analysis of the underperformance of state-of-the-art models on long-tail distributions.
We propose a novel balanced group softmax (BAGS) module for balancing the classifiers within the detection frameworks through group-wise training.
Extensive experiments on the very recent long-tail large vocabulary object recognition benchmark LVIS show that our proposed BAGS significantly improves the performance of detectors.
arXiv Detail & Related papers (2020-06-18T10:24:26Z)
- UniT: Unified Knowledge Transfer for Any-shot Object Detection and
Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
- Learning Fast and Robust Target Models for Video Object Segmentation [83.3382606349118]
Video object segmentation (VOS) is a highly challenging problem since the initial mask, defining the target object, is only given at test-time.
Most previous approaches fine-tune segmentation networks on the first frame, resulting in impractical frame-rates and risk of overfitting.
We propose a novel VOS architecture consisting of two network components.
arXiv Detail & Related papers (2020-02-27T21:58:06Z)
- Reinforced active learning for image segmentation [34.096237671643145]
We present a new active learning strategy for semantic segmentation based on deep reinforcement learning (RL).
An agent learns a policy to select a subset of small informative image regions (as opposed to entire images) to be labeled from a pool of unlabeled data.
Our method proposes a new modification of the deep Q-network (DQN) formulation for active learning, adapting it to the large-scale nature of semantic segmentation problems.
arXiv Detail & Related papers (2020-02-16T14:03:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.