Modulating Localization and Classification for Harmonized Object
Detection
- URL: http://arxiv.org/abs/2103.08958v1
- Date: Tue, 16 Mar 2021 10:36:02 GMT
- Title: Modulating Localization and Classification for Harmonized Object
Detection
- Authors: Taiheng Zhang, Qiaoyong Zhong, Shiliang Pu, Di Xie
- Abstract summary: We propose a mutual learning framework to modulate the two tasks.
In particular, the two tasks are forced to learn from each other with a novel mutual labeling strategy.
We achieve a significant performance gain over the baseline detectors on the COCO dataset.
- Score: 40.82723262074911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection involves two sub-tasks, i.e. localizing objects in an image
and classifying them into various categories. For existing CNN-based detectors,
we notice the widespread divergence between localization and classification,
which leads to degradation in performance. In this work, we propose a mutual
learning framework to modulate the two tasks. In particular, the two tasks are
forced to learn from each other with a novel mutual labeling strategy. Besides,
we introduce a simple yet effective IoU rescoring scheme, which further reduces
the divergence. Moreover, we define a Spearman rank correlation-based metric to
quantify the divergence, which correlates well with the detection performance.
The proposed approach is general-purpose and can be easily injected into
existing detectors such as FCOS and RetinaNet. We achieve a significant
performance gain over the baseline detectors on the COCO dataset.
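The abstract does not spell out the Spearman-based divergence metric. As a rough illustration only, the sketch below assumes the idea is to compare how detections rank by classification score with how they rank by IoU against the ground truth. The function names `average_ranks` and `spearman`, and the toy scores, are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Hypothetical toy data: one classification score and one IoU per detection.
cls_scores = [0.9, 0.8, 0.6, 0.3]
ious_aligned = [0.85, 0.75, 0.65, 0.40]   # same ordering as the scores
ious_diverged = [0.5, 0.9, 0.7, 0.6]      # ordering disagrees with the scores

print(spearman(cls_scores, ious_aligned))
print(spearman(cls_scores, ious_diverged))
```

Under this assumption, a correlation near 1 means the two tasks rank detections consistently, while lower or negative values indicate divergence between classification confidence and localization quality.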
Related papers
- Decoupled DETR: Spatially Disentangling Localization and Classification
for Improved End-to-End Object Detection [48.429555904690595]
We introduce spatially decoupled DETR, which includes a task-aware query generation module and a disentangled feature learning process.
We demonstrate that our approach achieves a significant improvement on the MSCOCO dataset compared to previous work.
arXiv Detail & Related papers (2023-10-24T15:54:11Z)
- Meta-DETR: Image-Level Few-Shot Detection with Inter-Class Correlation
Exploitation [100.87407396364137]
We design Meta-DETR, which (i) is the first image-level few-shot detector, and (ii) introduces a novel inter-class correlational meta-learning strategy.
Experiments over multiple few-shot object detection benchmarks show that the proposed Meta-DETR outperforms state-of-the-art methods by large margins.
arXiv Detail & Related papers (2022-07-30T13:46:07Z)
- Exploiting Domain Transferability for Collaborative Inter-level Domain
Adaptive Object Detection [17.61278045720336]
Domain adaptation for object detection (DAOD) has recently drawn much attention owing to its capability of detecting target objects without any annotations.
Previous works focus on aligning features extracted from partial levels in a two-stage detector via adversarial training.
We introduce a novel framework for DAOD with three proposed components: Multi-scale-aware Uncertainty Attention (MUA), Transferable Region Proposal Network (TRPN), and Dynamic Instance Sampling (DIS).
arXiv Detail & Related papers (2022-07-20T01:50:26Z)
- The Overlooked Classifier in Human-Object Interaction Recognition [82.20671129356037]
We encode the semantic correlation among classes into the classification head by initializing the weights with language embeddings of HOIs.
We propose a new loss named LSE-Sign to enhance multi-label learning on a long-tailed dataset.
Our simple yet effective method enables detection-free HOI classification, outperforming state-of-the-art methods that require object detection and human pose by a clear margin.
arXiv Detail & Related papers (2022-03-10T23:35:00Z)
- Multi-object Tracking with a Hierarchical Single-branch Network [31.680667324595557]
We propose an online multi-object tracking framework based on a hierarchical single-branch network.
Our novel iHOIM loss function unifies the objectives of the two sub-tasks and encourages better detection performance.
Experimental results on MOT16 and MOT20 datasets show that we can achieve state-of-the-art tracking performance.
arXiv Detail & Related papers (2021-01-06T12:14:58Z)
- AFD-Net: Adaptive Fully-Dual Network for Few-Shot Object Detection [8.39479809973967]
Few-shot object detection (FSOD) aims to learn a detector that can quickly adapt to previously unseen objects from scarce examples.
Existing methods solve this problem by performing subtasks of classification and localization utilizing a shared component.
We argue that a general few-shot detector should explicitly decompose the two subtasks and leverage information from both of them to enhance feature representations.
arXiv Detail & Related papers (2020-11-30T10:21:32Z)
- Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector named ScopeNet, which models the anchors at each location as mutually dependent.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first-stage Matching-FCOS network and a second-stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.