Learning Disentangled Representation Implicitly via Transformer for
Occluded Person Re-Identification
- URL: http://arxiv.org/abs/2107.02380v1
- Date: Tue, 6 Jul 2021 04:24:10 GMT
- Title: Learning Disentangled Representation Implicitly via Transformer for
Occluded Person Re-Identification
- Authors: Mengxi Jia, Xinhua Cheng, Shijian Lu and Jian Zhang
- Abstract summary: DRL-Net is a representation learning network that handles occluded re-ID without requiring strict person image alignment or any additional supervision.
It measures image similarity by automatically disentangling the representation of undefined semantic components.
The DRL-Net achieves superior re-ID performance consistently and outperforms the state-of-the-art by large margins for Occluded-DukeMTMC.
- Score: 35.40162083252931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Person re-identification (re-ID) under various occlusions has been a
long-standing challenge as person images with different types of occlusions
often suffer from misalignment in image matching and ranking. Most existing
methods tackle this challenge by aligning spatial features of body parts
according to external semantic cues or feature similarities but this alignment
approach is complicated and sensitive to noise. We design DRL-Net, a
disentangled representation learning network that handles occluded re-ID
without requiring strict person image alignment or any additional supervision.
Leveraging transformer architectures, DRL-Net achieves alignment-free re-ID via
global reasoning of local features of occluded person images. It measures image
similarity by automatically disentangling the representation of undefined
semantic components, e.g., human body parts or obstacles, under the guidance of
semantic preference object queries in the transformer. In addition, we design a
decorrelation constraint in the transformer decoder and impose it over object
queries for better focus on different semantic components. To better eliminate
interference from occlusions, we design a contrast feature learning technique
(CFL) for better separation of occlusion features and discriminative ID
features. Extensive experiments over occluded and holistic re-ID benchmarks
(Occluded-DukeMTMC, Market1501 and DukeMTMC) show that the DRL-Net achieves
superior re-ID performance consistently and outperforms the state-of-the-art by
large margins for Occluded-DukeMTMC.
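The abstract mentions a decorrelation constraint imposed over the object queries but does not give its form. As a rough illustration only, the sketch below assumes one common choice: penalizing the squared cosine similarity between every pair of distinct query vectors, so that queries are pushed toward different directions. The function name `decorrelation_penalty` and this particular formula are assumptions made for illustration, not the paper's definition.

```python
import math
from typing import Sequence


def decorrelation_penalty(queries: Sequence[Sequence[float]]) -> float:
    """Mean squared cosine similarity over all ordered pairs of distinct queries.

    Hypothetical form of a decorrelation constraint (an assumption, not taken
    from the paper). Returns 0.0 for mutually orthogonal queries and 1.0 when
    all queries point in the same direction.
    """
    n = len(queries)
    if n < 2:
        # A single query has nothing to be decorrelated from.
        return 0.0

    # L2-normalize each query so dot products become cosine similarities.
    unit = []
    for q in queries:
        norm = math.sqrt(sum(x * x for x in q)) or 1.0  # guard all-zero vectors
        unit.append([x / norm for x in q])

    # Accumulate squared cosine similarity over off-diagonal pairs only.
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                cos = sum(a * b for a, b in zip(unit[i], unit[j]))
                total += cos * cos
    return total / (n * (n - 1))
```

In a training setup this value would typically be added to the identity loss with a small weight; that weighting is likewise an assumption here and is not stated in the abstract.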
Related papers
- Exploring Stronger Transformer Representation Learning for Occluded Person Re-Identification [2.552131151698595]
We propose SSSC-TransReID, a novel transformer-based person re-identification framework that combines self-supervision and supervision.
We design a self-supervised contrastive learning branch, which can enhance the feature representation for person re-identification without negative samples or additional pre-training.
Our proposed model obtains superior Re-ID performance consistently and outperforms the state-of-the-art ReID methods by large margins on mean average precision (mAP) and Rank-1 accuracy.
arXiv Detail & Related papers (2024-10-21T03:17:25Z)
- Robust Ensemble Person Re-Identification via Orthogonal Fusion with Occlusion Handling [4.431087385310259]
Occlusion remains one of the major challenges in person re-identification (ReID).
We propose a deep ensemble model that harnesses both CNN and Transformer architectures to generate robust feature representations.
arXiv Detail & Related papers (2024-03-29T18:38:59Z)
- Dynamic Patch-aware Enrichment Transformer for Occluded Person Re-Identification [14.219232629274186]
We present an end-to-end solution known as the Dynamic Patch-aware Enrichment Transformer (DPEFormer).
This model effectively distinguishes human body information from occlusions automatically and dynamically.
To ensure that DPSM and the entire DPEFormer can effectively learn with only identity labels, we also propose a Realistic Occlusion Augmentation (ROA) strategy.
arXiv Detail & Related papers (2024-02-16T03:53:30Z)
- Part Representation Learning with Teacher-Student Decoder for Occluded Person Re-identification [65.63180725319906]
We propose a Teacher-Student Decoder (TSD) framework for occluded person ReID.
Our proposed TSD consists of a Parsing-aware Teacher Decoder (PTD) and a Standard Student Decoder (SSD).
arXiv Detail & Related papers (2023-12-15T13:54:48Z)
- Divided Attention: Unsupervised Multi-Object Discovery with Contextually Separated Slots [78.23772771485635]
We introduce a method to segment the visual field into independently moving regions, trained with no ground truth or supervision.
It consists of an adversarial conditional encoder-decoder architecture based on Slot Attention.
arXiv Detail & Related papers (2023-04-04T00:26:13Z)
- Feature Completion Transformer for Occluded Person Re-identification [25.159974510754992]
Occluded person re-identification (Re-ID) is a challenging problem because occluders destroy part of the person's visual information.
We propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.
FCFormer achieves superior performance and outperforms the state-of-the-art methods by significant margins on occluded datasets.
arXiv Detail & Related papers (2023-03-03T01:12:57Z)
- Learning Feature Recovery Transformer for Occluded Person Re-identification [71.18476220969647]
We propose a new approach called Feature Recovery Transformer (FRT) to address the two challenges simultaneously.
To reduce the interference of the noise during feature matching, we mainly focus on visible regions that appear in both images and develop a visibility graph to calculate the similarity.
In terms of the second challenge, based on the developed graph similarity, for each query image, we propose a recovery transformer that exploits the feature sets of its $k$-nearest neighbors in the gallery to recover the complete features.
arXiv Detail & Related papers (2023-01-05T02:36:16Z)
- Occluded Person Re-Identification via Relational Adaptive Feature Correction Learning [8.015703163954639]
Occluded person re-identification (Re-ID) in images captured by multiple cameras is challenging because the target person is occluded by pedestrians or objects.
Most existing methods rely on off-the-shelf pose or parsing networks to produce pseudo labels, which are prone to error.
We propose a novel Occlusion Correction Network (OCNet) that corrects features through relational-weight learning and obtains diverse and representative features without using external networks.
arXiv Detail & Related papers (2022-12-09T07:48:47Z)
- Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two pieces of self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
- Dual Spoof Disentanglement Generation for Face Anti-spoofing with Depth Uncertainty Learning [54.15303628138665]
Face anti-spoofing (FAS) plays a vital role in preventing face recognition systems from presentation attacks.
Existing face anti-spoofing datasets lack diversity due to insufficient identities and insignificant variance.
We propose a Dual Spoof Disentanglement Generation framework to tackle this challenge by "anti-spoofing via generation".
arXiv Detail & Related papers (2021-12-01T15:36:59Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)