TOOD: Task-aligned One-stage Object Detection
- URL: http://arxiv.org/abs/2108.07755v2
- Date: Wed, 18 Aug 2021 01:44:23 GMT
- Title: TOOD: Task-aligned One-stage Object Detection
- Authors: Chengjian Feng, Yujie Zhong, Yu Gao, Matthew R. Scott and Weilin Huang
- Abstract summary: One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization.
We propose a Task-aligned One-stage Object Detection (TOOD) that explicitly aligns the two tasks in a learning-based manner.
Experiments are conducted on MS-COCO, where TOOD achieves a 51.1 AP at single-model single-scale testing.
- Score: 41.43371563426291
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One-stage object detection is commonly implemented by optimizing two
sub-tasks: object classification and localization, using heads with two
parallel branches, which might lead to a certain level of spatial misalignment
in predictions between the two tasks. In this work, we propose a Task-aligned
One-stage Object Detection (TOOD) that explicitly aligns the two tasks in a
learning-based manner. First, we design a novel Task-aligned Head (T-Head)
which offers a better balance between learning task-interactive and
task-specific features, as well as a greater flexibility to learn the alignment
via a task-aligned predictor. Second, we propose Task Alignment Learning (TAL)
to explicitly pull closer (or even unify) the optimal anchors for the two tasks
during training via a designed sample assignment scheme and a task-aligned
loss. Extensive experiments are conducted on MS-COCO, where TOOD achieves a
51.1 AP at single-model single-scale testing. This surpasses the recent
one-stage detectors by a large margin, such as ATSS (47.7 AP), GFL (48.2 AP),
and PAA (49.0 AP), with fewer parameters and FLOPs. Qualitative results also
demonstrate the effectiveness of TOOD for better aligning the tasks of object
classification and localization. Code is available at
https://github.com/fcjian/TOOD.
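The abstract names the two ingredients of TAL (a sample assignment scheme and a task-aligned loss) without spelling them out. As a rough sketch of the underlying idea, the snippet below combines the per-anchor classification score s and the IoU u of the predicted box into the alignment metric t = s^α · u^β used in the TOOD paper, picks positive anchors by this metric, and re-normalizes it into a soft classification target. The hyper-parameter values, the top-k selection, and the normalization are illustrative defaults, not a verbatim reproduction of the authors' implementation (see the linked repository for that).

```python
import torch

def task_alignment_metric(cls_scores, ious, alpha=1.0, beta=6.0):
    """Alignment metric t = s^alpha * u^beta for one ground-truth object.

    cls_scores: (num_anchors,) predicted scores for the GT class.
    ious:       (num_anchors,) IoU between each predicted box and the GT box.
    """
    return cls_scores.pow(alpha) * ious.pow(beta)

def assign_and_reweight(cls_scores, ious, topk=13, alpha=1.0, beta=6.0):
    """Pick the top-k most aligned anchors as positives and turn the metric
    into a soft classification target (illustrative, not the official code)."""
    t = task_alignment_metric(cls_scores, ious, alpha, beta)
    pos = torch.topk(t, k=min(topk, t.numel())).indices
    pos_mask = torch.zeros_like(t, dtype=torch.bool)
    pos_mask[pos] = True
    # Normalize so the best-aligned anchor's target equals the best IoU,
    # which couples the classification target to localization quality.
    soft_target = torch.zeros_like(t)
    if pos_mask.any() and t[pos_mask].max() > 0:
        soft_target[pos_mask] = t[pos_mask] / t[pos_mask].max() * ious[pos_mask].max()
    return pos_mask, soft_target

# Toy usage for a single ground-truth object.
scores = torch.rand(100)   # per-anchor scores for the GT class
ious = torch.rand(100)     # per-anchor IoUs with the GT box
pos_mask, target = assign_and_reweight(scores, ious)
```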
Related papers
- EMPL: A novel Efficient Meta Prompt Learning Framework for Few-shot Unsupervised Domain Adaptation [22.586094394391747]
We propose a novel Efficient Meta Prompt Learning Framework for FS-UDA.
Within this framework, we use a pre-trained CLIP model as the feature-learning base model.
Our method improves on state-of-the-art methods by at least 15.4% on 5-way 1-shot and 8.7% on 5-way 5-shot tasks.
arXiv Detail & Related papers (2024-07-04T17:13:06Z)
- Class Incremental Learning via Likelihood Ratio Based Task Prediction [20.145128455767587]
An emerging theory-guided approach is to train a task-specific model for each task in a shared network for all tasks.
This paper argues that using a traditional OOD detector for task-id prediction is sub-optimal because additional information can be exploited.
We call the new method TPL (Task-id Prediction based on Likelihood Ratio).
It markedly outperforms strong CIL baselines and has negligible catastrophic forgetting.
arXiv Detail & Related papers (2023-09-26T16:25:57Z)
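The TPL entry above only names the likelihood-ratio idea behind task-id prediction. As a purely illustrative sketch (not the paper's estimator), the snippet below scores a test feature under a per-task density and a density for the remaining tasks, and predicts the task with the largest log-likelihood ratio; the Gaussian densities are toy assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def task_id_by_likelihood_ratio(x, task_models, other_models):
    """Illustrative likelihood-ratio task-id prediction (not the TPL estimator).

    For each task k, score(x) = log p_k(x) - log q_k(x), where p_k models
    features of task k and q_k models features of all other tasks; the task
    with the largest ratio is predicted.
    """
    scores = [p_k.logpdf(x) - q_k.logpdf(x)
              for p_k, q_k in zip(task_models, other_models)]
    return int(np.argmax(scores))

# Toy usage with two tasks and unit-covariance Gaussian density estimates.
rng = np.random.default_rng(0)
task_models = [multivariate_normal(mean=[0, 0]), multivariate_normal(mean=[3, 3])]
other_models = [multivariate_normal(mean=[3, 3]), multivariate_normal(mean=[0, 0])]
x = rng.normal(loc=[3, 3], scale=0.5)
print(task_id_by_likelihood_ratio(x, task_models, other_models))  # likely 1
```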
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
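The MTSGI entry above characterizes a task by a subtask graph. As a minimal illustration of what such a representation looks like (the graph below is a made-up example, and no inference from training tasks is performed), the snippet encodes subtasks with precondition edges and lists which subtasks are currently eligible to execute.

```python
# Toy subtask graph: each subtask lists the subtasks that must be completed first.
subtask_graph = {
    "get_wood":  [],
    "get_stone": [],
    "make_axe":  ["get_wood", "get_stone"],
    "chop_tree": ["make_axe"],
}

def eligible_subtasks(graph, completed):
    """Subtasks whose preconditions are all satisfied and that are not yet done."""
    return [s for s, pre in graph.items()
            if s not in completed and all(p in completed for p in pre)]

print(eligible_subtasks(subtask_graph, completed=set()))            # ['get_wood', 'get_stone']
print(eligible_subtasks(subtask_graph, {"get_wood", "get_stone"}))  # ['make_axe']
```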
- Selecting task with optimal transport self-supervised learning for few-shot classification [15.088213168796772]
Few-shot classification aims to solve problems in which only a few samples are available during training.
We propose a novel task-selection algorithm, named Optimal Transport Task Selecting (OTTS), to construct a training set by selecting similar tasks for few-shot learning.
OTTS measures the task similarity by calculating the optimal transport distance and completes the model training via a self-supervised strategy.
arXiv Detail & Related papers (2022-04-01T08:45:29Z)
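The OTTS entry above measures task similarity with an optimal transport distance. The snippet below is a generic entropic-regularized (Sinkhorn) optimal transport distance between two tasks represented as sets of feature vectors; the feature representation, cost function, and regularization are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np

def sinkhorn_task_distance(feats_a, feats_b, reg_scale=0.05, n_iters=200):
    """Entropic-regularized OT distance between two tasks' feature sets
    (a generic Sinkhorn sketch, not OTTS's exact formulation).

    feats_a: (n, d) features sampled from task A; feats_b: (m, d) from task B.
    """
    n, m = len(feats_a), len(feats_b)
    # Squared Euclidean cost between every pair of feature vectors.
    cost = ((feats_a[:, None, :] - feats_b[None, :, :]) ** 2).sum(-1)
    reg = reg_scale * cost.mean()        # scale regularization to the cost magnitude
    K = np.exp(-cost / reg)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):             # Sinkhorn-Knopp iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]   # approximate transport plan
    return float((plan * cost).sum())

# Toy check: a task drawn from a nearby distribution is closer than a distant one.
rng = np.random.default_rng(0)
task1 = rng.normal(0.0, 1.0, size=(32, 8))
task2 = rng.normal(0.2, 1.0, size=(32, 8))
task3 = rng.normal(5.0, 1.0, size=(32, 8))
print(sinkhorn_task_distance(task1, task2) < sinkhorn_task_distance(task1, task3))  # True
```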
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
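The TAPS entry above tunes only a small, task-specific subset of layers. A generic way to express that idea is to give each layer a gate that blends the frozen base weights with a task-specific residual and to penalize open gates, so most layers stay unmodified; the parameterization below is an assumption for illustration, not the TAPS formulation.

```python
import torch
import torch.nn as nn

class GatedTaskLinear(nn.Module):
    """Frozen base linear layer plus a gated, task-specific weight residual.

    effective_weight = W_base + sigmoid(gate) * delta_task
    A penalty on sigmoid(gate) pushes most gates toward zero, so only a small
    subset of layers ends up being modified for the new task.
    """

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        for p in self.base.parameters():      # pretrained weights stay frozen
            p.requires_grad_(False)
        self.delta = nn.Parameter(torch.zeros(out_dim, in_dim))
        self.gate = nn.Parameter(torch.tensor(-2.0))  # starts mostly "off"

    def forward(self, x):
        w = self.base.weight + torch.sigmoid(self.gate) * self.delta
        return nn.functional.linear(x, w, self.base.bias)

    def gate_penalty(self):
        return torch.sigmoid(self.gate)

# Toy usage: task loss plus a sparsity penalty on the gates.
net = nn.Sequential(GatedTaskLinear(16, 32), nn.ReLU(), GatedTaskLinear(32, 4))
x = torch.randn(8, 16)
out = net(x)
penalty = sum(m.gate_penalty() for m in net if isinstance(m, GatedTaskLinear))
loss = out.pow(2).mean() + 0.01 * penalty
loss.backward()
```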
- Rethinking the Aligned and Misaligned Features in One-stage Object Detection [9.270523894683278]
One-stage object detectors rely on point features to predict detection results.
We propose a simple and plug-in operator that could generate aligned and disentangled features for each task.
Based on the object-aligned and task-disentangled operator (OAT), we propose OAT-Net, which explicitly exploits point-set features for more accurate detection results.
arXiv Detail & Related papers (2021-08-27T08:40:37Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- Exploring Relational Context for Multi-Task Dense Prediction [76.86090370115]
We consider a multi-task environment for dense prediction tasks, represented by a common backbone and independent task-specific heads.
We explore various attention-based contexts, such as global and local, in the multi-task setting.
We propose an Adaptive Task-Relational Context module, which samples the pool of all available contexts for each task pair.
arXiv Detail & Related papers (2021-04-28T16:45:56Z)
- Conditional Channel Gated Networks for Task-Aware Continual Learning [44.894710899300435]
Convolutional Neural Networks experience catastrophic forgetting when optimized on a sequence of learning problems.
We introduce a novel framework to tackle this problem with conditional computation.
We validate our proposal on four continual learning datasets.
arXiv Detail & Related papers (2020-03-31T19:35:07Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)