Learning to Disentangle Scenes for Person Re-identification
- URL: http://arxiv.org/abs/2111.05476v1
- Date: Wed, 10 Nov 2021 01:17:10 GMT
- Title: Learning to Disentangle Scenes for Person Re-identification
- Authors: Xianghao Zang, Ge Li, Wei Gao, Xiujun Shu
- Abstract summary: This paper proposes to divide-and-conquer the person re-identification (ReID) task.
We employ several self-supervision operations to simulate different challenging problems and handle each challenging problem using different networks.
A general multi-branch network, including one master branch and two servant branches, is introduced to handle different scenes.
- Score: 15.378033331385312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There are many challenging problems in the person re-identification (ReID)
task, such as occlusion and scale variation. Existing works usually try
to solve them by employing a one-branch network. This one-branch network needs
to be robust to various challenging problems, which makes this network
overburdened. This paper proposes to divide-and-conquer the ReID task. For this
purpose, we employ several self-supervision operations to simulate different
challenging problems and handle each challenging problem using different
networks. Concretely, we use the random erasing operation and propose a novel
random scaling operation to generate new images with controllable
characteristics. A general multi-branch network, including one master branch
and two servant branches, is introduced to handle different scenes. These
branches learn collaboratively and achieve different perceptive abilities. In
this way, the complex scenes in the ReID task are effectively disentangled, and
the burden of each branch is relieved. The results from extensive experiments
demonstrate that the proposed method achieves state-of-the-art performance on
three ReID benchmarks and two occluded ReID benchmarks. An ablation study also
shows that the proposed scheme and operations significantly improve
performance in various scenes.
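The two self-supervision operations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify either operation, so the function names `random_erase` and `random_scale`, all parameter ranges, and the choice to implement random scaling as "shrink the image and paste it onto a blank canvas of the original size" are assumptions made purely for illustration.

```python
import numpy as np


def random_erase(img, rng, area_frac=(0.02, 0.4), aspect=(0.3, 3.3)):
    """Overwrite one random rectangle with noise to mimic occlusion.

    Parameter ranges are illustrative assumptions, not values from the paper.
    """
    h, w = img.shape[:2]
    out = img.copy()
    for _ in range(10):  # retry until a sampled rectangle fits in the image
        area = rng.uniform(*area_frac) * h * w
        ratio = rng.uniform(*aspect)
        eh = int(round(np.sqrt(area * ratio)))
        ew = int(round(np.sqrt(area / ratio)))
        if 0 < eh < h and 0 < ew < w:
            top = int(rng.integers(0, h - eh + 1))
            left = int(rng.integers(0, w - ew + 1))
            noise = rng.integers(0, 256, size=(eh, ew) + img.shape[2:])
            out[top:top + eh, left:left + ew] = noise.astype(img.dtype)
            return out
    return out  # no rectangle fit: return an unmodified copy


def random_scale(img, rng, scale_range=(0.5, 1.0)):
    """Shrink the image by a random factor and pad back to the input size.

    Assumed interpretation of "random scaling": the person appears smaller,
    while the tensor shape fed to the network stays fixed.
    """
    h, w = img.shape[:2]
    s = rng.uniform(*scale_range)
    nh = max(1, int(round(h * s)))
    nw = max(1, int(round(w * s)))
    rows = np.arange(nh) * h // nh  # nearest-neighbour source rows
    cols = np.arange(nw) * w // nw  # nearest-neighbour source columns
    small = img[rows][:, cols]
    out = np.zeros_like(img)
    top = int(rng.integers(0, h - nh + 1))
    left = int(rng.integers(0, w - nw + 1))
    out[top:top + nh, left:left + nw] = small
    return out


# Usage: build one view per branch from the same input image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 128, 3)).astype(np.uint8)
master_view = image                      # unmodified image
erased_view = random_erase(image, rng)   # simulated occlusion
scaled_view = random_scale(image, rng)   # simulated scale variation
```

Under this reading, each view would be routed to its own branch (the original image to the master branch, the erased and scaled images to the two servant branches), which is one plausible way to realize the "controllable characteristics" mentioned in the abstract.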
Related papers
- RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception [64.80760846124858]
This paper proposes a novel unified representation, RepVF, which harmonizes the representation of various perception tasks.
RepVF characterizes the structure of different targets in the scene through a vector field, enabling a single-head, multi-task learning model.
Building upon RepVF, we introduce RFTR, a network designed to exploit the inherent connections between different tasks.
arXiv Detail & Related papers (2024-07-15T16:25:07Z)
- A Versatile Framework for Multi-scene Person Re-identification [30.74494316484783]
Person Re-identification (ReID) has been extensively developed for a decade in order to learn the association of images of the same person across non-overlapping camera views.
Despite the impressive performance of many ReID variants, these variants typically function distinctly and cannot be applied to other challenges.
This work contributes to the first attempt at learning a versatile ReID model to solve such a problem.
arXiv Detail & Related papers (2024-03-17T07:04:09Z)
- A Dynamic Feature Interaction Framework for Multi-task Visual Perception [100.98434079696268]
We devise an efficient unified framework to solve multiple common perception tasks.
These tasks include instance segmentation, semantic segmentation, monocular 3D detection, and depth estimation.
Our proposed framework, termed D2BNet, demonstrates a unique approach to parameter-efficient predictions for multi-task perception.
arXiv Detail & Related papers (2023-06-08T09:24:46Z)
- Single-branch Network for Multimodal Training [19.690844799632327]
We propose a novel single-branch network capable of learning discriminative representation of unimodal as well as multimodal tasks without changing the network.
We evaluate our proposed single-branch network on the challenging multimodal problem (face-voice association) for cross-modal verification and matching tasks with various loss formulations.
arXiv Detail & Related papers (2023-03-10T18:48:40Z)
- Transformer Based Multi-Grained Features for Unsupervised Person Re-Identification [9.874360118638918]
We build a dual-branch network architecture based upon a modified Vision Transformer (ViT).
Local tokens output in each branch are reshaped and then uniformly partitioned into multiple stripes to generate part-level features.
Global tokens of two branches are averaged to produce a global feature.
arXiv Detail & Related papers (2022-11-22T13:51:17Z)
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
First, we propose a new suite of benchmarks aimed at compositional tasks, MultiRavens, which allows defining custom task combinations.
Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- Semantic Consistency and Identity Mapping Multi-Component Generative Adversarial Network for Person Re-Identification [39.605062525247135]
We propose a semantic consistency and identity mapping multi-component generative adversarial network (SC-IMGAN) which provides style adaptation from one to many domains.
Our proposed method outperforms state-of-the-art techniques on six challenging person Re-ID datasets.
arXiv Detail & Related papers (2021-04-28T14:12:29Z)
- Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network.
There are two major challenges in the current one-step approaches.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
- Recurrent Multi-view Alignment Network for Unsupervised Surface Registration [79.72086524370819]
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
arXiv Detail & Related papers (2020-11-24T14:22:42Z)
- Challenge-Aware RGBT Tracking [32.88141817679821]
We propose a novel challenge-aware neural network to handle the modality-shared challenges and the modality-specific ones.
We show that our method operates at a real-time speed while performing well against the state-of-the-art methods on three benchmark datasets.
arXiv Detail & Related papers (2020-07-26T15:11:44Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which has been shown to significantly degrade the single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.