Task-Specific Context Decoupling for Object Detection
- URL: http://arxiv.org/abs/2303.01047v1
- Date: Thu, 2 Mar 2023 08:02:14 GMT
- Title: Task-Specific Context Decoupling for Object Detection
- Authors: Jiayuan Zhuang, Zheng Qin, Hao Yu, Xucan Chen
- Abstract summary: Existing methods usually leverage disentangled heads to learn different feature context for each task.
We propose a novel Task-Specific COntext DEcoupling (TSCODE) head which further disentangles the feature encoding for two tasks.
Our method stably improves different detectors by over 1.0 AP with less computational cost.
- Score: 27.078743716924752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classification and localization are two main sub-tasks in object detection.
Nonetheless, these two tasks have inconsistent preferences for feature context,
i.e., localization expects more boundary-aware features to accurately regress
the bounding box, while more semantic context is preferred for object
classification. Existing methods usually leverage disentangled heads to learn
different feature context for each task. However, the heads are still applied
on the same input features, which leads to an imperfect balance between
classification and localization. In this work, we propose a novel Task-Specific
COntext DEcoupling (TSCODE) head which further disentangles the feature
encoding for two tasks. For classification, we generate spatially-coarse but
semantically-strong feature encoding. For localization, we provide a
high-resolution feature map containing more edge information to better regress
object boundaries. TSCODE is plug-and-play and can be easily incorporated into
existing detection pipelines. Extensive experiments demonstrate that our method
stably improves different detectors by over 1.0 AP with less computational
cost. Our code and models will be publicly released.
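The abstract's central idea, feeding each head a differently encoded input, can be illustrated with a small NumPy sketch. This is not the authors' released implementation: the three-level pyramid, the function names, the use of average pooling and nearest-neighbour upsampling, and the choice to concatenate for classification and sum for localization are all assumptions made for illustration.

```python
import numpy as np


def avg_pool2x(x):
    """Halve the spatial resolution of a (C, H, W) map with 2x2 average pooling."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def upsample2x(x):
    """Double the spatial resolution of a (C, H, W) map by nearest-neighbour repeat."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def classification_input(p_l, p_coarser):
    """Spatially coarse, semantically strong encoding (assumed design):
    pool the current level down to the coarser level's resolution and
    stack the two maps along the channel axis."""
    return np.concatenate([avg_pool2x(p_l), p_coarser], axis=0)


def localization_input(p_finer, p_l, p_coarser):
    """Detail-preserving encoding (assumed design): stay at the current
    level's resolution, adding the pooled finer level (edge detail) and
    the upsampled coarser level."""
    return avg_pool2x(p_finer) + p_l + upsample2x(p_coarser)


# Hypothetical three-level feature pyramid with 8 channels per level.
rng = np.random.default_rng(0)
p_finer = rng.standard_normal((8, 32, 32))
p_l = rng.standard_normal((8, 16, 16))
p_coarser = rng.standard_normal((8, 8, 8))

cls_feat = classification_input(p_l, p_coarser)
loc_feat = localization_input(p_finer, p_l, p_coarser)
print(cls_feat.shape)  # (16, 8, 8): coarser map for the classification head
print(loc_feat.shape)  # (8, 16, 16): higher-resolution map for the localization head
```

In this sketch the two heads no longer share one input tensor: the classification branch sees a map at half the resolution of the localization branch, which mirrors the coarse-versus-detailed split described above.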
Related papers
- Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection [48.429555904690595]
We introduce spatially decoupled DETR, which includes a task-aware query generation module and a disentangled feature learning process.
We demonstrate that our approach achieves a significant improvement on the MSCOCO dataset compared to previous work.
arXiv Detail & Related papers (2023-10-24T15:54:11Z)
- Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we only require sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z)
- DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition [94.90258603217008]
The MultiCoNER II shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios.
Previous top systems in MultiCoNER I incorporate either knowledge bases or gazetteers.
We propose a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER.
arXiv Detail & Related papers (2023-05-05T16:59:26Z)
- Task-specific Inconsistency Alignment for Domain Adaptive Object Detection [38.027790951157705]
Detectors trained with massive labeled data often exhibit dramatic performance degradation in scenarios with a data distribution gap.
We propose Task-specific Inconsistency Alignment (TIA), by developing a new alignment mechanism in separate task spaces.
TIA demonstrates results superior to previous state-of-the-art methods across various scenarios.
arXiv Detail & Related papers (2022-03-29T08:36:33Z)
- Modulating Localization and Classification for Harmonized Object Detection [40.82723262074911]
We propose a mutual learning framework to modulate the two tasks.
In particular, the two tasks are forced to learn from each other with a novel mutual labeling strategy.
We achieve a significant performance gain over the baseline detectors on the COCO dataset.
arXiv Detail & Related papers (2021-03-16T10:36:02Z)
- Inter-Image Communication for Weakly Supervised Localization [77.2171924626778]
Weakly supervised localization aims at finding target object regions using only image-level supervision.
We propose to leverage pixel-level similarities across different objects for learning more accurate object locations.
Our method achieves the Top-1 localization error rate of 45.17% on the ILSVRC validation set.
arXiv Detail & Related papers (2020-08-12T04:14:11Z)
- Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector called ScopeNet, which models the anchors at each location as mutually dependent.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
- OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features [14.115782214599015]
One-shot object detection consists in detecting objects defined by a single demonstration.
We build a one-stage system that performs localization and recognition jointly.
Experimental evaluation on several challenging domains shows that our method can detect unseen classes.
arXiv Detail & Related papers (2020-03-15T11:39:47Z)
- iFAN: Image-Instance Full Alignment Networks for Adaptive Object Detection [48.83883375118966]
iFAN aims to precisely align feature distributions on both image and instance levels.
It outperforms state-of-the-art methods with a boost of 10%+ AP over the source-only baseline.
arXiv Detail & Related papers (2020-03-09T13:27:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.