PDNet: Towards Better One-stage Object Detection with Prediction
Decoupling
- URL: http://arxiv.org/abs/2104.13876v1
- Date: Wed, 28 Apr 2021 16:48:04 GMT
- Title: PDNet: Towards Better One-stage Object Detection with Prediction
Decoupling
- Authors: Li Yang, Yan Xu, Shaoru Wang, Chunfeng Yuan, Ziqi Zhang, Bing Li,
Weiming Hu
- Abstract summary: We propose a prediction-target-decoupled detector named PDNet to establish a more flexible detection paradigm.
With a single ResNeXt-64x4d-101 as the backbone, our detector achieves 48.7 AP with single-scale testing.
- Score: 37.83405509385431
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent one-stage object detectors follow a per-pixel prediction approach that
predicts both the object category scores and boundary positions from every
single grid location. However, the most suitable positions for inferring
different targets, i.e., the object category and boundaries, are generally
different. Predicting all these targets from the same grid location thus may
lead to sub-optimal results. In this paper, we analyze the suitable inference
positions for object category and boundaries, and propose a
prediction-target-decoupled detector named PDNet to establish a more flexible
detection paradigm. Our PDNet with the prediction decoupling mechanism encodes
different targets separately in different locations. A learnable prediction
collection module is devised with two sets of dynamic points, i.e., dynamic
boundary points and semantic points, to collect and aggregate the predictions
from the favorable regions for localization and classification. We adopt a
two-step strategy to learn these dynamic point positions, where the prior
positions are estimated for different targets first, and the network further
predicts residual offsets to the positions with better perceptions of the
object properties. Extensive experiments on the MS COCO benchmark demonstrate
the effectiveness and efficiency of our method. With a single ResNeXt-64x4d-101
as the backbone, our detector achieves 48.7 AP with single-scale testing, which
outperforms the state-of-the-art methods by an appreciable margin under the
same experimental settings. Moreover, our detector is highly efficient as a
one-stage framework. Our code will be public.
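The two-step dynamic point idea in the abstract (estimate prior positions, add predicted residual offsets, then collect predictions at the resulting points) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the choice of box-side midpoints as prior boundary points, and the use of bilinear sampling to read predictions at fractional positions are all assumptions made for illustration only.

```python
import numpy as np


def bilinear_sample(fmap, x, y):
    """Bilinearly sample a 2-D map fmap[H, W] at a fractional (x, y) position."""
    h, w = fmap.shape
    # Clamp the query position to the valid map area.
    x = float(np.clip(x, 0, w - 1))
    y = float(np.clip(y, 0, h - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * fmap[y0, x0] + ax * fmap[y0, x1]
    bottom = (1 - ax) * fmap[y1, x0] + ax * fmap[y1, x1]
    return (1 - ay) * top + ay * bottom


def prior_boundary_points(box):
    """Step 1 (assumed prior): midpoints of the left, top, right, bottom sides.

    box is (x1, y1, x2, y2) of a coarse box estimate; the result has shape (4, 2).
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return np.array([[x1, cy], [cx, y1], [x2, cy], [cx, y2]], dtype=float)


def refine_points(prior_points, residual_offsets):
    """Step 2: shift the prior points by residual offsets predicted by the network."""
    return prior_points + residual_offsets


def collect_boundaries(reg_maps, points):
    """Read one boundary prediction per side from its own dynamic point.

    reg_maps has shape (4, H, W): one hypothetical per-pixel regression map
    for each of the left, top, right and bottom boundaries.
    """
    return np.array(
        [bilinear_sample(reg_maps[i], px, py) for i, (px, py) in enumerate(points)]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reg_maps = rng.random((4, 8, 8))            # stand-in for network outputs
    priors = prior_boundary_points((1.0, 2.0, 5.0, 6.0))
    offsets = rng.normal(scale=0.3, size=(4, 2))  # stand-in for predicted residuals
    points = refine_points(priors, offsets)
    print(collect_boundaries(reg_maps, points))
```

The point of the sketch is only that each boundary is read from a different location rather than from the single grid cell that produced the coarse estimate; a classification branch could aggregate scores over a separate set of semantic points in the same way.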
Related papers
- Parallel Reasoning Network for Human-Object Interaction Detection [53.422076419484945]
We propose a new transformer-based method named Parallel Reasoning Network (PR-Net).
PR-Net constructs two independent predictors for instance-level localization and relation-level understanding.
Our PR-Net has achieved competitive results on HICO-DET and V-COCO benchmarks.
arXiv Detail & Related papers (2023-01-09T17:00:34Z)
- ConfMix: Unsupervised Domain Adaptation for Object Detection via Confidence-based Mixing [32.679280923208715]
Unsupervised Domain Adaptation (UDA) for object detection aims to adapt a model trained on a source domain to detect instances from a new target domain for which annotations are not available.
We propose ConfMix, the first method that introduces a sample mixing strategy based on region-level detection confidence for adaptive object detector learning.
arXiv Detail & Related papers (2022-10-20T19:16:39Z)
- Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection [107.52026281057343]
We introduce a Frequency Spectrum Augmentation Consistency (FSAC) framework with four different low-frequency filter operations.
In the first stage, we utilize all the original and augmented source data to train an object detector.
In the second stage, augmented source and target data with pseudo labels are adopted to perform the self-training for prediction consistency.
arXiv Detail & Related papers (2021-12-16T04:07:01Z)
- Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework [59.578339075658995]
We propose a purely point-based framework for joint crowd counting and individual localization.
We design an intuitive solution under this framework, called the Point to Point Network (P2PNet).
arXiv Detail & Related papers (2021-07-27T11:41:50Z)
- EPP-Net: Extreme-Point-Prediction-Based Object Detection [9.270523894683278]
We present a new anchor-free dense object detector, which regresses the relative displacement vector between each pixel and the four extreme points.
We also propose a new metric to measure the similarity between two groups of extreme points, namely, Extreme Intersection over Union (EIoU).
On the MS-COCO dataset, our method achieves an average precision (AP) of 39.3% with ResNet-50 and an AP of 48.3% with ResNeXt-101-DCN.
arXiv Detail & Related papers (2021-04-29T01:01:50Z)
- Reformulating HOI Detection as Adaptive Set Prediction [25.44630995307787]
We reformulate HOI detection as an adaptive set prediction problem.
We propose an Adaptive Set-based one-stage framework (AS-Net) with parallel instance and interaction branches.
Our method outperforms previous state-of-the-art methods without any extra human pose and language features.
arXiv Detail & Related papers (2021-03-10T10:40:33Z)
- Corner Proposal Network for Anchor-free, Two-stage Object Detection [174.59360147041673]
The goal of object detection is to determine the class and location of objects in an image.
This paper proposes a novel anchor-free, two-stage framework which first extracts a number of object proposals.
We demonstrate that these two stages are effective solutions for improving recall and precision.
arXiv Detail & Related papers (2020-07-27T19:04:57Z)
- Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)