Disentangle Your Dense Object Detector
- URL: http://arxiv.org/abs/2107.02963v1
- Date: Wed, 7 Jul 2021 00:52:16 GMT
- Title: Disentangle Your Dense Object Detector
- Authors: Zehui Chen, Chenhongyi Yang, Qiaofei Li, Feng Zhao, Zhengjun Zha, Feng Wu
- Abstract summary: Deep learning-based dense object detectors have achieved great success in the past few years and have been applied to numerous multimedia applications such as video understanding.
However, the current training pipeline for dense detectors is compromised by many conjunctions that may not hold.
We propose the Disentangled Dense Object Detector (DDOD), in which simple and effective disentanglement mechanisms are designed and integrated into current state-of-the-art detectors.
- Score: 82.22771433419727
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning-based dense object detectors have achieved great success in the past few years and have been applied to numerous multimedia applications such as video understanding. However, the current training pipeline for dense detectors is compromised by many conjunctions that may not hold. In this paper, we investigate three such important conjunctions: 1) only samples assigned as positives for the classification head are used to train the regression head; 2) classification and regression share the same input features and computational fields defined by the parallel head architecture; and 3) samples distributed across different feature pyramid layers are treated equally when computing the loss. We first carry out a series of pilot experiments showing that disentangling such conjunctions leads to persistent performance improvements. Based on these findings, we propose the Disentangled Dense Object Detector (DDOD), in which simple and effective disentanglement mechanisms are designed and integrated into current state-of-the-art dense object detectors. Extensive experiments on the MS COCO benchmark show that our approach yields absolute improvements of 2.0 mAP, 2.4 mAP, and 2.2 mAP over the RetinaNet, FCOS, and ATSS baselines with negligible extra overhead. Notably, our best model reaches 55.0 mAP on the COCO test-dev set and 93.5 AP on the hard subset of WIDER FACE, achieving new state-of-the-art performance on these two competitive benchmarks. Code is available at https://github.com/zehuichen123/DDOD.
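To make the first and third conjunctions above concrete, here is a minimal PyTorch sketch of what decoupled label assignment and pyramid-level re-weighting can look like. The IoU thresholds and the count-based level weights are illustrative assumptions made for this sketch, not the scheme used in DDOD; the authors' implementation is in the repository linked above.

```python
# Illustrative sketch only; thresholds and the re-weighting rule are assumptions,
# not the DDOD recipe.
import torch

def pairwise_iou(boxes_a: torch.Tensor, boxes_b: torch.Tensor) -> torch.Tensor:
    """IoU between two sets of axis-aligned boxes in (x1, y1, x2, y2) format."""
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    lt = torch.max(boxes_a[:, None, :2], boxes_b[None, :, :2])
    rb = torch.min(boxes_a[:, None, 2:], boxes_b[None, :, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-6)

def disentangled_assignment(anchors, gt_boxes, cls_thr=0.5, reg_thr=0.4):
    """Label assignment decoupled between heads: the regression head may train on
    anchors that the classification head treats as negatives."""
    iou = pairwise_iou(anchors, gt_boxes)      # (num_anchors, num_gt)
    best_iou, best_gt = iou.max(dim=1)
    cls_pos = best_iou >= cls_thr              # positives for classification
    reg_pos = best_iou >= reg_thr              # a looser positive set for regression
    return cls_pos, reg_pos, best_gt

def fpn_level_weights(num_pos_per_level):
    """Re-weight pyramid levels instead of treating them equally: levels with
    fewer positives get a larger weight so they are not dominated in the loss."""
    counts = torch.tensor(num_pos_per_level, dtype=torch.float32).clamp(min=1)
    w = counts.sum() / (len(counts) * counts)
    return w / w.mean()                        # normalized around 1.0

if __name__ == "__main__":
    anchors = torch.tensor([[0., 0., 10., 10.], [5., 5., 15., 15.], [20., 20., 30., 30.]])
    gts = torch.tensor([[1., 1., 11., 11.]])
    cls_pos, reg_pos, _ = disentangled_assignment(anchors, gts)
    print("cls positives:", cls_pos.tolist(), "reg positives:", reg_pos.tolist())
    print("level weights:", fpn_level_weights([120, 40, 10]).tolist())
```

The point of the decoupled assignment is simply that the regression head may be supervised on a different set of anchors than the classification head, breaking the first conjunction; the level weights replace the equal treatment of pyramid layers in the third.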
Related papers
- Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark [101.23684938489413]
Anomaly detection (AD) is often focused on detecting anomalies for industrial quality inspection and medical lesion examination.
This work first constructs a large-scale and general-purpose COCO-AD dataset by extending COCO to the AD field.
Inspired by the metrics in the segmentation field, we propose several more practical threshold-dependent AD-specific metrics.
arXiv Detail & Related papers (2024-04-16T17:38:26Z)
- Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods.
In the above experiments, we find that existing loss functions are usually specialized for some metrics but report inferior results on the others.
We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
arXiv Detail & Related papers (2022-02-07T03:43:16Z)
- The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss that can achieve trend-level alignment with the SkewIoU loss.
Specifically, we model the objects as Gaussian distributions and adopt a Kalman filter to inherently mimic the mechanism of SkewIoU.
The resulting new loss, called KFIoU, is easier to implement and works better than the exact SkewIoU loss.
arXiv Detail & Related papers (2022-01-29T10:54:57Z)
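As a rough illustration of the Gaussian-plus-Kalman-filter mechanism summarized in the KFIoU entry above, the following PyTorch sketch models each rotated box as a 2-D Gaussian and fuses two such Gaussians to obtain an overlap surrogate. The covariance construction and the square-root-of-determinant "area" are conventions assumed here for illustration; the center-distance term of the full loss is omitted, and the official KFIoU implementation may normalize differently.

```python
# Sketch based on the summary only; constants and normalization are assumptions.
import math
import torch

def rbox_to_gaussian(rbox: torch.Tensor):
    """Rotated box (cx, cy, w, h, angle_rad) -> mean (2,) and covariance (2, 2)."""
    cx, cy, w, h, a = rbox.unbind(-1)
    cos, sin = torch.cos(a), torch.sin(a)
    rot = torch.stack([torch.stack([cos, -sin]), torch.stack([sin, cos])])
    diag = torch.diag(torch.stack([(w / 2) ** 2, (h / 2) ** 2]))
    return torch.stack([cx, cy]), rot @ diag @ rot.T

def kfiou(rbox_a: torch.Tensor, rbox_b: torch.Tensor) -> torch.Tensor:
    """The 'intersection' Gaussian comes from a Kalman-filter style fusion of the
    two box Gaussians; areas are taken as sqrt(det(cov))."""
    _, cov_a = rbox_to_gaussian(rbox_a)
    _, cov_b = rbox_to_gaussian(rbox_b)
    cov_int = cov_a @ torch.linalg.inv(cov_a + cov_b) @ cov_b
    v_a, v_b = torch.sqrt(torch.det(cov_a)), torch.sqrt(torch.det(cov_b))
    v_int = torch.sqrt(torch.det(cov_int))
    return v_int / (v_a + v_b - v_int)

if __name__ == "__main__":
    a = torch.tensor([0.0, 0.0, 4.0, 2.0, 0.0])
    b = torch.tensor([0.0, 0.0, 4.0, 2.0, math.pi / 6])  # same box, rotated 30 degrees
    print(f"KFIoU-style overlap: {kfiou(a, b).item():.3f}")  # decreases as the angle grows
```

Note that with this formulation two identical boxes score 1/3 rather than 1, which fits the summary's point that the goal is trend-level alignment with SkewIoU rather than exact equivalence.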
- Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images [15.404024559652534]
We present an effective Dynamic Enhancement Anchor (DEA) network to construct a novel training sample generator.
Our method achieves state-of-the-art accuracy with moderate inference speed and training overhead.
arXiv Detail & Related papers (2021-12-13T14:37:20Z)
- Decoupling Object Detection from Human-Object Interaction Recognition [37.133695677465376]
DEFR is a DEtection-FRee method that recognizes Human-Object Interactions (HOI) at the image level without using object location or human pose.
We present two findings that boost the performance of the detection-free approach, which significantly outperforms detection-assisted state-of-the-art methods.
arXiv Detail & Related papers (2021-12-13T03:01:49Z)
- G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation [49.421099172544196]
We propose a novel semantic-guided feature imitation technique, which automatically performs soft matching between feature pairs across all pyramid levels.
We also introduce contrastive distillation to effectively capture the information encoded in the relationship between different feature regions.
Our method consistently outperforms existing detection KD techniques and works whether its components are used separately or in conjunction.
arXiv Detail & Related papers (2021-08-17T07:44:27Z)
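The soft matching mentioned in the G-DetKD entry above can be pictured with a short sketch in which each student region feature imitates a similarity-weighted mixture of teacher features taken from every FPN level. The cosine-similarity matching rule, the temperature, and the tensor shapes are assumptions made for illustration and are not taken from the paper; the contrastive distillation component is not shown.

```python
# Illustrative sketch of level-wise soft matching for feature imitation.
import torch
import torch.nn.functional as F

def soft_matched_imitation(student_feat, teacher_feats_per_level, temperature=0.5):
    """student_feat: (N, C); teacher_feats_per_level: (L, N, C) for L pyramid levels."""
    # Cosine similarity between each student feature and its teacher counterpart at
    # every level -> (L, N); softmax over levels gives the soft matching weights.
    sims = F.cosine_similarity(teacher_feats_per_level, student_feat.unsqueeze(0), dim=-1)
    weights = torch.softmax(sims / temperature, dim=0)                     # (L, N)
    target = (weights.unsqueeze(-1) * teacher_feats_per_level).sum(dim=0)  # (N, C)
    return F.mse_loss(student_feat, target)

if __name__ == "__main__":
    torch.manual_seed(0)
    student = torch.randn(8, 256)      # 8 region features from the student
    teacher = torch.randn(4, 8, 256)   # the same 8 regions at 4 teacher FPN levels
    print("imitation loss:", soft_matched_imitation(student, teacher).item())
```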
- Modulating Localization and Classification for Harmonized Object Detection [40.82723262074911]
We propose a mutual learning framework to modulate the two tasks.
In particular, the two tasks are forced to learn from each other with a novel mutual labeling strategy.
We achieve a significant performance gain over the baseline detectors on the COCO dataset.
arXiv Detail & Related papers (2021-03-16T10:36:02Z)
- VarifocalNet: An IoU-aware Dense Object Detector [11.580759212782812]
We learn an IoU-aware Classification Score (IACS) as a joint representation of object presence confidence and localization accuracy.
We show that dense object detectors can achieve a more accurate ranking of candidate detections based on the IACS.
We build an IoU-aware dense object detector based on the FCOS+ATSS architecture, which we call VarifocalNet or VFNet for short.
arXiv Detail & Related papers (2020-08-31T05:12:21Z)
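Based only on the VarifocalNet summary above, here is a compact sketch of an IoU-aware classification target together with a varifocal-style weighting: a positive anchor's target score for its ground-truth class is the IoU of its predicted box, while negatives are down-weighted focal-loss style. The alpha/gamma values and the exact target construction are assumptions for illustration; see the VarifocalNet paper and code for the actual formulation.

```python
# Illustrative sketch; weighting constants and target construction are assumptions.
import torch

def iacs_targets(pred_ious, labels, num_classes):
    """Dense targets: for a positive anchor, the target score of its ground-truth
    class is the IoU of its predicted box; every other entry stays 0."""
    targets = torch.zeros(labels.numel(), num_classes)
    pos = labels >= 0                              # convention here: -1 = background
    targets[pos, labels[pos]] = pred_ious[pos]
    return targets

def varifocal_style_loss(pred_logits, targets, alpha=0.75, gamma=2.0):
    """Positives are weighted by their continuous IoU target; negatives are
    down-weighted by alpha * p**gamma, focal-loss style."""
    p = pred_logits.sigmoid()
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        pred_logits, targets, reduction="none")
    weight = torch.where(targets > 0, targets, alpha * p.pow(gamma))
    return (weight * bce).sum() / max((targets > 0).sum().item(), 1)

if __name__ == "__main__":
    torch.manual_seed(0)
    labels = torch.tensor([2, -1, 0])              # the second anchor is background
    ious = torch.tensor([0.8, 0.0, 0.6])           # IoU of each predicted box with its GT
    t = iacs_targets(ious, labels, num_classes=3)
    logits = torch.randn(3, 3)
    print("IACS targets:\n", t)
    print("loss:", varifocal_style_loss(logits, t).item())
```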
- EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully- and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.