Multi-task Learning with Coarse Priors for Robust Part-aware Person
Re-identification
- URL: http://arxiv.org/abs/2003.08069v3
- Date: Fri, 7 May 2021 07:39:23 GMT
- Title: Multi-task Learning with Coarse Priors for Robust Part-aware Person
Re-identification
- Authors: Changxing Ding, Kan Wang, Pengfei Wang, and Dacheng Tao
- Abstract summary: The Multi-task Part-aware Network (MPN) is designed to extract semantically aligned part-level features from pedestrian images.
MPN solves the body part misalignment problem via multi-task learning (MTL) in the training stage.
MPN consistently outperforms state-of-the-art approaches by significant margins.
- Score: 79.33809815035127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Part-level representations are important for robust person re-identification
(ReID), but in practice feature quality suffers due to the body part
misalignment problem. In this paper, we present a robust, compact, and
easy-to-use method called the Multi-task Part-aware Network (MPN), which is
designed to extract semantically aligned part-level features from pedestrian
images. MPN solves the body part misalignment problem via multi-task learning
(MTL) in the training stage. More specifically, it builds one main task (MT)
and one auxiliary task (AT) for each body part on top of the same backbone
model. The ATs are equipped with a coarse prior of the body part locations for
training images. ATs then transfer the concept of the body parts to the MTs via
optimizing the MT parameters to identify part-relevant channels from the
backbone model. Concept transfer is accomplished by means of two novel
alignment strategies: namely, parameter space alignment via hard parameter
sharing and feature space alignment in a class-wise manner. With the aid of the
learned high-quality parameters, MTs can independently extract semantically
aligned part-level features from relevant channels in the testing stage. MPN
has three key advantages: 1) it does not need to conduct body part detection in
the inference stage; 2) its model is very compact and efficient for both
training and testing; 3) in the training stage, it requires only coarse priors
of body part locations, which are easy to obtain. Systematic experiments on
four large-scale ReID databases demonstrate that MPN consistently outperforms
state-of-the-art approaches by significant margins. Code is available at
https://github.com/WangKan0128/MPN.
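
The training-time structure described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not taken from the authors' repository. It assumes that the coarse prior is a split of the feature map into equal horizontal stripes, that each part's MT and AT share one projection matrix (hard parameter sharing), and that class-wise feature alignment is a squared distance between per-identity mean features. The names `stripe_bounds`, `pool`, `project`, `part_features`, and `classwise_alignment_loss` are hypothetical.

```python
def stripe_bounds(height, num_parts):
    """Assumed coarse prior: equal horizontal stripes, one per body part.

    Requires height >= num_parts so that no stripe is empty.
    """
    return [(k * height // num_parts, (k + 1) * height // num_parts)
            for k in range(num_parts)]


def pool(feature_map, top, bottom):
    """Average each channel over rows top..bottom-1; one value per channel."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel[top:bottom] for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def project(weights, vector):
    """Apply a D x C weight matrix to a length-C vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def part_features(feature_map, part_weights):
    """Return (mt, at): one feature vector per part for main and auxiliary tasks.

    feature_map is a list of C channels, each a list of H rows of W floats.
    part_weights holds one D x C matrix per part.
    """
    height = len(feature_map[0])
    bounds = stripe_bounds(height, len(part_weights))
    mt, at = [], []
    for weights, (top, bottom) in zip(part_weights, bounds):
        # Main task: pools the whole map, so it needs no part location at test time.
        mt.append(project(weights, pool(feature_map, 0, height)))
        # Auxiliary task: same weights (hard sharing), pooled inside the prior region.
        at.append(project(weights, pool(feature_map, top, bottom)))
    return mt, at


def classwise_alignment_loss(mt_batch, at_batch, labels):
    """Squared distance between per-identity mean MT and AT features,
    averaged over the identities present in the batch."""
    sums = {}
    for mt, at, label in zip(mt_batch, at_batch, labels):
        flat_mt = [v for part in mt for v in part]
        flat_at = [v for part in at for v in part]
        diff = [m - a for m, a in zip(flat_mt, flat_at)]
        total, count = sums.get(label, ([0.0] * len(diff), 0))
        sums[label] = ([t + d for t, d in zip(total, diff)], count + 1)
    loss = 0.0
    for total, count in sums.values():
        loss += sum((t / count) ** 2 for t in total)
    return loss / len(sums)


# Toy input: 2 channels, 4 rows, 1 column; 2 parts with identity projections.
feature_map = [[[1.0], [1.0], [3.0], [3.0]],
               [[0.0], [0.0], [0.0], [0.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
mt, at = part_features(feature_map, [identity, identity])
loss = classwise_alignment_loss([mt], [at], [0])
```

In this toy input the MT features are identical for both parts because the projections are identical; in training, minimizing such a loss would push each part's shared weights toward channels that respond to that part, which is the role the abstract assigns to the alignment strategies.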
Related papers
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
However, transferring the pretrained models to downstream tasks may encounter task discrepancy, because pretraining is formulated as image classification or object discrimination.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z)
- S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching
for Autonomous Driving [40.305452898732774]
S$^3$M-Net is a novel joint learning framework developed to perform semantic segmentation and stereo matching simultaneously.
S$^3$M-Net shares the features extracted from RGB images between both tasks, resulting in an improved overall scene understanding capability.
arXiv Detail & Related papers (2023-12-14T08:25:04Z)
- VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense
Scene Understanding [6.816428690763012]
A standard approach to leverage large-scale pre-trained models is to fine-tune all model parameters for downstream tasks.
We propose VMT-Adapter, which shares knowledge from multiple tasks to enhance cross-task interaction.
We also propose VMT-Adapter-Lite, which further reduces the trainable parameters by learning shared parameters between down- and up-projections.
arXiv Detail & Related papers (2023-04-22T09:30:50Z)
- Single-stage Multi-human Parsing via Point Sets and Center-based Offsets [28.70266615856546]
We present a high-performance Single-stage Multi-human Parsing architecture that decouples the multi-human parsing problem into two fine-grained sub-problems.
The proposed method requires fewer training epochs and a less complex model architecture.
arXiv Detail & Related papers (2023-02-25T04:30:11Z)
- Frequency Disentangled Learning for Segmentation of Midbrain Structures
from Quantitative Susceptibility Mapping Data [1.9150304734969674]
Deep models tend to fit the target function from low to high frequencies.
One often lacks sufficient samples for training deep segmentation models.
We propose a new training method based on frequency-domain disentanglement.
arXiv Detail & Related papers (2021-04-16T15:16:09Z)
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh
Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space.
We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
- Decoupled and Memory-Reinforced Networks: Towards Effective Feature
Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network.
There are two major challenges in the current one-step approaches.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2020-09-21T09:04:13Z)
- Batch Coherence-Driven Network for Part-aware Person Re-Identification [79.33809815035127]
Existing part-aware person re-identification methods typically employ two separate steps: namely, body part detection and part-level feature extraction.
We propose BCDNet, which bypasses body part detection during both the training and testing phases while still learning semantically aligned features.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)
- MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT
Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.