Rethinking of Feature Interaction for Multi-task Learning on Dense
Prediction
- URL: http://arxiv.org/abs/2312.13514v1
- Date: Thu, 21 Dec 2023 01:30:44 GMT
- Title: Rethinking of Feature Interaction for Multi-task Learning on Dense
Prediction
- Authors: Jingdong Zhang, Jiayuan Fan, Peng Ye, Bo Zhang, Hancheng Ye, Baopu Li,
Yancheng Cai, Tao Chen
- Abstract summary: We observe that low-level representations with rich details and high-level representations with abundant task information are not both involved in the multi-task interaction process.
Low-quality and low-efficiency issues also exist in current multi-task learning architectures.
We propose a novel Bridge-Feature-Centric Interaction (BRFI) method to learn a comprehensive intermediate feature globally from both task-generic and task-specific features.
- Score: 30.30105024946622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing works generally adopt the encoder-decoder structure for Multi-task
Dense Prediction, where the encoder extracts the task-generic features, and
multiple decoders generate task-specific features for predictions. We observe
that low-level representations with rich details and high-level representations
with abundant task information are not both involved in the multi-task
interaction process. Additionally, low-quality and low-efficiency issues also
exist in current multi-task learning architectures. In this work, we propose to
learn a comprehensive intermediate feature globally from both task-generic and
task-specific features, and we reveal that this intermediate feature, namely the
bridge feature, is a good solution to the above issues. Based on this, we
propose a novel Bridge-Feature-Centric Interaction (BRFI)
method. A Bridge Feature Extractor (BFE) is designed for the generation of
strong bridge features, and Task Pattern Propagation (TPP) is applied to ensure
high-quality task interaction participants. Then a Task-Feature Refiner (TFR)
is developed to refine final task predictions with the well-learned knowledge
from the bridge features. Extensive experiments are conducted on NYUD-v2 and
PASCAL Context benchmarks, and the superior performance shows the proposed
architecture is effective and powerful in promoting different dense prediction
tasks simultaneously.
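To make the pipeline described in the abstract more concrete, below is a minimal, hypothetical PyTorch-style sketch of a bridge-feature-centric interaction step. The module names (BridgeFeatureExtractor, TaskFeatureRefiner), tensor shapes, and fusion choices are illustrative assumptions rather than the authors' implementation, and the Task Pattern Propagation (TPP) step is omitted for brevity.

```python
# Hypothetical sketch of bridge-feature-centric interaction (not the authors' code).
# Assumed design: a shared encoder feature plus per-task decoder features are fused
# into a single "bridge" feature, which is then used to refine every task feature.
import torch
import torch.nn as nn


class BridgeFeatureExtractor(nn.Module):
    """Fuses the task-generic feature with all task-specific features (assumption)."""

    def __init__(self, channels: int, num_tasks: int):
        super().__init__()
        self.fuse = nn.Conv2d(channels * (num_tasks + 1), channels, kernel_size=1)

    def forward(self, generic_feat, task_feats):
        # Concatenate along channels and project back to a single bridge feature.
        return self.fuse(torch.cat([generic_feat] + task_feats, dim=1))


class TaskFeatureRefiner(nn.Module):
    """Refines one task feature with knowledge from the bridge feature (assumption)."""

    def __init__(self, channels: int):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, task_feat, bridge_feat):
        # Residual refinement conditioned on the globally fused bridge feature.
        return task_feat + self.refine(torch.cat([task_feat, bridge_feat], dim=1))


if __name__ == "__main__":
    B, C, H, W, T = 2, 64, 32, 32, 3  # batch, channels, height, width, tasks
    generic = torch.randn(B, C, H, W)                          # encoder (task-generic) feature
    task_feats = [torch.randn(B, C, H, W) for _ in range(T)]   # decoder (task-specific) features

    bfe = BridgeFeatureExtractor(C, T)
    tfr = TaskFeatureRefiner(C)

    bridge = bfe(generic, task_feats)                # global intermediate (bridge) feature
    refined = [tfr(f, bridge) for f in task_feats]   # per-task refinement
    print([f.shape for f in refined])
```

In this reading, the bridge feature serves as the single hub through which low-level detail and high-level task information interact, instead of pairwise exchanges between every pair of task branches.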
Related papers
- Low-rank Prompt Interaction for Continual Vision-Language Retrieval [47.323830129786145]
We propose the Low-rank Prompt Interaction to address the problem of multi-modal understanding.
Considering that the number of training parameters scales with the number of layers and tasks, we propose a low-rank interaction-augmented decomposition.
We also adopt hierarchical low-rank contrastive learning to ensure robust training.
arXiv Detail & Related papers (2025-01-24T10:00:47Z)
- Task Indicating Transformer for Task-conditional Dense Predictions [16.92067246179703]
We introduce a novel task-conditional framework called Task Indicating Transformer (TIT) to tackle this challenge.
Our approach designs a Mix Task Adapter module within the transformer block, which incorporates a Task Indicating Matrix through matrix decomposition.
We also propose a Task Gate Decoder module that harnesses a Task Indicating Vector and gating mechanism to facilitate adaptive multi-scale feature refinement.
arXiv Detail & Related papers (2024-03-01T07:06:57Z)
- ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt [67.8934749027315]
We propose a unified framework for graph hybrid pre-training which injects the task identification and position identification into GNNs.
We also propose a novel pre-training paradigm based on a group of $k$-nearest neighbors.
arXiv Detail & Related papers (2023-10-23T12:11:13Z)
- Contrastive Multi-Task Dense Prediction [11.227696986100447]
A core design objective is to effectively model cross-task interactions so as to achieve comprehensive improvements across different tasks.
We introduce feature-wise contrastive consistency into modeling the cross-task interactions for multi-task dense prediction.
We propose a novel multi-task contrastive regularization method based on this consistency to effectively boost the representation learning of the different sub-tasks (a generic sketch of such a contrastive consistency term appears after this list).
arXiv Detail & Related papers (2023-07-16T03:54:01Z)
- A Dynamic Feature Interaction Framework for Multi-task Visual Perception [100.98434079696268]
We devise an efficient unified framework to solve multiple common perception tasks.
These tasks include instance segmentation, semantic segmentation, monocular 3D detection, and depth estimation.
Our proposed framework, termed D2BNet, demonstrates a unique approach to parameter-efficient predictions for multi-task perception.
arXiv Detail & Related papers (2023-06-08T09:24:46Z)
- A Hierarchical Interactive Network for Joint Span-based Aspect-Sentiment Analysis [34.1489054082536]
We propose a hierarchical interactive network (HI-ASA) to model two-way interactions between two tasks appropriately.
We use a cross-stitch mechanism to selectively combine the different task-specific features as the input, ensuring proper two-way interactions (see the cross-stitch sketch after this list).
Experiments on three real-world datasets demonstrate HI-ASA's superiority over baselines.
arXiv Detail & Related papers (2022-08-24T03:03:49Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Exploring Relational Context for Multi-Task Dense Prediction [76.86090370115]
We consider a multi-task environment for dense prediction tasks, represented by a common backbone and independent task-specific heads.
We explore various attention-based contexts, such as global and local, in the multi-task setting.
We propose an Adaptive Task-Relational Context module, which samples the pool of all available contexts for each task pair.
arXiv Detail & Related papers (2021-04-28T16:45:56Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results w.r.t. performance, computations and/or memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
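As referenced in the Contrastive Multi-Task Dense Prediction entry above, the following is a minimal, hypothetical sketch of a feature-wise contrastive consistency loss between two task branches. The pooling, pairing, and temperature choices are assumptions for illustration and are not taken from that paper.

```python
# Hypothetical feature-wise contrastive consistency between two task branches
# (illustrative only; not the implementation from the cited paper).
import torch
import torch.nn.functional as F


def contrastive_consistency(feat_a: torch.Tensor, feat_b: torch.Tensor,
                            temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss pulling together features at the same spatial location
    across two task branches and pushing apart features at different locations.

    feat_a, feat_b: (B, C, H, W) feature maps from two task-specific decoders.
    """
    b, c, h, w = feat_a.shape
    # Flatten spatial positions and L2-normalize the channel vectors.
    za = F.normalize(feat_a.flatten(2).transpose(1, 2), dim=-1)  # (B, HW, C)
    zb = F.normalize(feat_b.flatten(2).transpose(1, 2), dim=-1)  # (B, HW, C)
    logits = torch.bmm(za, zb.transpose(1, 2)) / temperature      # (B, HW, HW)
    # Positive pairs lie on the diagonal: position i in branch A matches position i in branch B.
    targets = torch.arange(h * w, device=feat_a.device).expand(b, -1)
    return F.cross_entropy(logits.reshape(b * h * w, h * w), targets.reshape(-1))


if __name__ == "__main__":
    fa, fb = torch.randn(2, 64, 8, 8), torch.randn(2, 64, 8, 8)
    print(contrastive_consistency(fa, fb).item())
```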
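And as noted in the HI-ASA entry, here is a minimal sketch of a generic cross-stitch unit in the style of Misra et al. (2016), which learns a 2x2 mixing of two task-specific features. The initialization and shapes are illustrative assumptions rather than HI-ASA's exact configuration.

```python
# Hypothetical cross-stitch unit mixing two task-specific features
# (generic formulation; not HI-ASA's exact configuration).
import torch
import torch.nn as nn


class CrossStitchUnit(nn.Module):
    """Learns x_a' = a_aa * x_a + a_ab * x_b and x_b' = a_ba * x_a + a_bb * x_b."""

    def __init__(self):
        super().__init__()
        # Initialize close to identity so each task starts mostly from its own feature.
        self.alpha = nn.Parameter(torch.tensor([[0.9, 0.1],
                                                [0.1, 0.9]]))

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor):
        x_a_new = self.alpha[0, 0] * x_a + self.alpha[0, 1] * x_b
        x_b_new = self.alpha[1, 0] * x_a + self.alpha[1, 1] * x_b
        return x_a_new, x_b_new


if __name__ == "__main__":
    unit = CrossStitchUnit()
    fa, fb = torch.randn(2, 64, 16, 16), torch.randn(2, 64, 16, 16)
    out_a, out_b = unit(fa, fb)
    print(out_a.shape, out_b.shape)
```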