Object Detection Made Simpler by Eliminating Heuristic NMS
- URL: http://arxiv.org/abs/2101.11782v1
- Date: Thu, 28 Jan 2021 02:38:29 GMT
- Title: Object Detection Made Simpler by Eliminating Heuristic NMS
- Authors: Qiang Zhou and Chaohui Yu and Chunhua Shen and Zhibin Wang and Hao Li
- Abstract summary: We show a simple NMS-free, end-to-end object detection framework.
We attain on par or even improved detection accuracy compared with the original one-stage detector.
- Score: 70.93004137521946
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We show a simple NMS-free, end-to-end object detection framework, of which
the network is a minimal modification to a one-stage object detector such as
the FCOS detection model [Tian et al. 2019]. We attain on par or even improved
detection accuracy compared with the original one-stage detector. It performs
detection at almost the same inference speed, while being even simpler in that
now the post-processing NMS (non-maximum suppression) is eliminated during
inference. If the network is capable of identifying only one positive sample
for prediction for each ground-truth object instance in an image, then NMS
would become unnecessary. This is made possible by attaching a compact PSS head
for automatic selection of the single positive sample for each instance (see
Fig. 1). As the learning objective involves both one-to-many and one-to-one
label assignments, there is a conflict in the labels of some training examples,
making the learning challenging. We show that by employing a stop-gradient
operation, we can successfully tackle this issue and train the detector. On the
COCO dataset, our simple design achieves superior performance compared to both
the FCOS baseline detector with NMS post-processing and the recent end-to-end
NMS-free detectors. Our extensive ablation studies justify the rationale of the
design choices.
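To make the role of NMS concrete, here is a minimal pure-Python sketch, not the paper's code. It shows the greedy NMS step that a one-stage detector normally runs after inference, next to the plain score threshold that is enough once each object has only one high-scoring prediction. The function names, the box format `(x1, y1, x2, y2)`, the thresholds, and the toy scores are all illustrative assumptions. The PSS head and the stop-gradient training described in the abstract are not modelled here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thresh=0.6):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than iou_thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


def threshold_only(scores, score_thresh=0.5):
    """NMS-free post-processing: keep every prediction above a threshold."""
    return [i for i, s in enumerate(scores) if s > score_thresh]


# Two near-duplicate boxes on one object, plus a separate second object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]

# One-to-many training (assumed toy scores): duplicates both score high,
# so NMS is needed to remove the second box.
dense_scores = [0.9, 0.8, 0.7]
kept_with_nms = greedy_nms(boxes, dense_scores)

# One-to-one selection (assumed toy scores): the duplicate is already
# scored low, so a threshold alone yields the same result.
sparse_scores = [0.9, 0.05, 0.7]
kept_without_nms = threshold_only(sparse_scores)

print(kept_with_nms, kept_without_nms)
```

Both calls keep the same two predictions (indices 0 and 2) in this toy setup, which is the situation the abstract describes: when only one positive sample is selected per instance, the NMS step has nothing left to remove.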
Related papers
- MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection [36.478530086163744]
We propose a novel Mutually optimizing pre-training framework for remote sensing object Detection, dubbed as MutDet.
MutDet fuses the object embeddings and detector features bidirectionally in the last encoder layer, enhancing their information interaction.
Experiments on various settings show new state-of-the-art transfer performance.
arXiv Detail & Related papers (2024-07-13T15:28:15Z)
- Label-Efficient Object Detection via Region Proposal Network Pre-Training [58.50615557874024]
We propose a simple pretext task that provides an effective pre-training for the region proposal network (RPN).
In comparison with multi-stage detectors without RPN pre-training, our approach is able to consistently improve downstream task performance.
arXiv Detail & Related papers (2022-11-16T16:28:18Z)
- W2N:Switching From Weak Supervision to Noisy Supervision for Object Detection [64.10643170523414]
We propose a novel WSOD framework with a new paradigm that switches from weak supervision to noisy supervision (W2N).
In the localization adaptation module, we propose a regularization loss to reduce the proportion of discriminative parts in original pseudo ground-truths.
Our W2N outperforms all existing pure WSOD methods and transfer learning methods.
arXiv Detail & Related papers (2022-07-25T12:13:48Z)
- SIOD: Single Instance Annotated Per Category Per Image for Object Detection [67.64774488115299]
We propose the Single Instance annotated Object Detection (SIOD), requiring only one instance annotation for each existing category in an image.
Degraded from inter-task (WSOD) or inter-image (SSOD) discrepancies to the intra-image discrepancy, SIOD provides more reliable and rich prior knowledge for mining the rest of unlabeled instances.
Under the SIOD setting, we propose a simple yet effective framework, termed Dual-Mining (DMiner), which consists of a Similarity-based Pseudo Label Generating module (SPLG) and a Pixel-level Group Contrastive Learning module (PGCL).
arXiv Detail & Related papers (2022-03-29T08:49:51Z)
- End-to-End Object Detection with Fully Convolutional Network [71.56728221604158]
We introduce a Prediction-aware One-To-One (POTO) label assignment for classification to enable end-to-end detection.
A simple 3D Max Filtering (3DMF) is proposed to utilize the multi-scale features and improve the discriminability of convolutions in the local region.
Our end-to-end framework achieves competitive performance against many state-of-the-art detectors with NMS on COCO and CrowdHuman datasets.
arXiv Detail & Related papers (2020-12-07T09:14:55Z)
- Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection [29.683119976550007]
We propose a simple but effective mechanism, called Co-mining, for sparsely annotated object detection.
In our Co-mining, two branches of a Siamese network predict the pseudo-label sets for each other.
Experiments are performed on the MS COCO dataset with three different sparsely annotated settings.
arXiv Detail & Related papers (2020-12-03T14:23:43Z)
- FCOS: A simple and strong anchor-free object detector [111.87691210818194]
We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction fashion.
Almost all state-of-the-art object detectors such as RetinaNet, SSD, YOLOv3, and Faster R-CNN rely on pre-defined anchor boxes.
In contrast, our proposed detector FCOS is anchor box free, as well as proposal free.
arXiv Detail & Related papers (2020-06-14T01:03:39Z)