Pose-guided Feature Disentangling for Occluded Person Re-identification Based on Transformer
- URL: http://arxiv.org/abs/2112.02466v1
- Date: Sun, 5 Dec 2021 03:23:31 GMT
- Title: Pose-guided Feature Disentangling for Occluded Person Re-identification Based on Transformer
- Authors: Tao Wang, Hong Liu, Pinhao Song, Tianyu Guo, Wei Shi
- Abstract summary: Occluded person re-identification is a challenging task as human body parts could be occluded by some obstacles.
Some existing pose-guided methods address this problem by aligning body parts via graph matching.
We propose a transformer-based Pose-guided Feature Disentangling (PFD) method by utilizing pose information to clearly disentangle semantic components.
- Score: 15.839842504357144
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Occluded person re-identification is a challenging task as human body parts
could be occluded by some obstacles (e.g. trees, cars, and pedestrians) in
certain scenes. Some existing pose-guided methods address this problem by
aligning body parts via graph matching, but these graph-based methods are
complicated and unintuitive. Therefore, we propose a transformer-based
Pose-guided Feature Disentangling (PFD) method that utilizes pose information
to clearly disentangle semantic components (e.g. human body or joint parts)
and selectively match the corresponding non-occluded parts. First, a Vision
Transformer (ViT) is used to extract patch features, leveraging its strong
representation capability. Second, to preliminarily disentangle pose
information from patch information, a matching and distributing mechanism is
leveraged in the Pose-guided Feature Aggregation (PFA) module. Third, a set of
learnable semantic views is introduced in the transformer decoder to
implicitly enhance the disentangled body-part features. However, those
semantic views are not guaranteed to relate to the body without additional
supervision, so a Pose-View Matching (PVM) module is proposed to explicitly
match visible body parts and automatically separate occlusion features.
Fourth, to further suppress interference from occlusions, we design a
Pose-guided Push Loss that emphasizes the features of visible body parts.
Extensive experiments on five challenging datasets for two tasks (occluded
and holistic Re-ID) demonstrate that our proposed PFD performs favorably
against state-of-the-art methods.
Code is available at https://github.com/WangTaoAs/PFD_Net
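The "matching and distributing mechanism" in the PFA module suggests heatmap-weighted pooling of ViT patch tokens. Below is a minimal PyTorch sketch of that idea; the function name, tensor shapes, and normalization are illustrative assumptions, not the authors' implementation (see the linked repository for that).
```python
# Hypothetical sketch of pose-guided aggregation: pool ViT patch tokens
# with keypoint heatmaps. Names and shapes are assumptions for illustration.
import torch
import torch.nn.functional as F

def pose_guided_aggregation(patch_feats, heatmaps, grid_hw):
    """patch_feats: (B, N, D) ViT patch tokens (class token removed).
    heatmaps: (B, K, H, W) keypoint confidence maps from a pose estimator.
    grid_hw: (h, w) patch grid with h * w == N.
    Returns (B, K, D): one pose-guided feature per keypoint."""
    h, w = grid_hw
    # Resize each keypoint heatmap to the patch grid and flatten it.
    hm = F.interpolate(heatmaps, size=(h, w), mode="bilinear", align_corners=False)
    hm = hm.flatten(2)                                   # (B, K, N)
    hm = hm / (hm.sum(dim=-1, keepdim=True) + 1e-6)      # normalize per keypoint
    # "Distribute" patch features to keypoints via heatmap-weighted pooling.
    return torch.bmm(hm, patch_feats)                    # (B, K, D)
```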
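The learnable "semantic views" in the transformer decoder read like DETR-style learnable queries that cross-attend to the encoder's patch tokens. A sketch under that assumption follows; the module name, number of views, and layer counts are illustrative.
```python
# Hypothetical sketch: learnable semantic-view queries in a transformer
# decoder, in the spirit of DETR-style object queries. Not the authors' module.
import torch
import torch.nn as nn

class SemanticViewDecoder(nn.Module):
    def __init__(self, d_model=768, n_views=17, n_heads=8, n_layers=2):
        super().__init__()
        # One learnable query vector per semantic view.
        self.views = nn.Parameter(torch.randn(n_views, d_model))
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)

    def forward(self, patch_feats):
        # patch_feats: (B, N, D) encoder output; returns (B, n_views, D).
        queries = self.views.unsqueeze(0).expand(patch_feats.size(0), -1, -1)
        return self.decoder(queries, patch_feats)
```
Without extra supervision such queries need not correspond to body parts, which is exactly the gap the PVM module and the push loss are said to close.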
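The abstract does not give the Pose-guided Push Loss formula. One plausible reading is a margin loss that pulls visible-part features toward the identity-level feature and pushes likely-occluded ones away; the sketch below encodes that guess, with the visibility weighting (e.g. keypoint confidences) and the margin value as assumptions.
```python
# Speculative sketch of a "push"-style loss; not the paper's exact definition.
import torch
import torch.nn.functional as F

def pose_guided_push_loss(part_feats, global_feat, visibility, margin=0.3):
    """part_feats: (B, K, D) decoder part features.
    global_feat: (B, D) identity-level feature.
    visibility: (B, K) scores in [0, 1], e.g. keypoint confidences.
    Pulls visible parts toward the global feature and pushes
    low-visibility (likely occluded) parts below a similarity margin."""
    sim = F.cosine_similarity(part_feats, global_feat.unsqueeze(1), dim=-1)  # (B, K)
    pull = (visibility * (1.0 - sim)).mean()                   # attract visible parts
    push = ((1.0 - visibility) * F.relu(sim - margin)).mean()  # repel occluded parts
    return pull + push
```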
Related papers
- PAFormer: Part Aware Transformer for Person Re-identification [3.8004980982852214]
We introduce the Part Aware Transformer (PAFormer), a pose-estimation-based ReID model which can perform precise part-to-part comparison.
Our method outperforms existing approaches on well-known ReID benchmark datasets.
arXiv Detail & Related papers (2024-08-12T04:46:55Z)
- Feature Completion Transformer for Occluded Person Re-identification [25.159974510754992]
Occluded person re-identification (Re-ID) is a challenging problem because occluders destroy parts of the person's appearance.
We propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.
FCFormer achieves superior performance and outperforms the state-of-the-art methods by significant margins on occluded datasets.
arXiv Detail & Related papers (2023-03-03T01:12:57Z)
- Body Part-Based Representation Learning for Occluded Person Re-Identification [102.27216744301356]
Occluded person re-identification (ReID) is a person retrieval task which aims at matching occluded person images with holistic ones.
Part-based methods have been shown beneficial as they offer fine-grained information and are well suited to represent partially visible human bodies.
We propose BPBreID, a body part-based ReID model for solving the above issues.
arXiv Detail & Related papers (2022-11-07T16:48:41Z)
- UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery Detection [52.91782218300844]
We propose a novel Unsupervised Inconsistency-Aware method based on Vision Transformer, called UIA-ViT.
Due to the self-attention mechanism, the attention map among patch embeddings naturally represents the consistency relation, making the vision Transformer suitable for the consistency representation learning.
arXiv Detail & Related papers (2022-10-23T15:24:47Z)
- Motion-Aware Transformer For Occluded Person Re-identification [1.9899263094148867]
We propose a self-supervised deep learning method that improves the localization of human body parts for occluded person Re-ID.
Unlike previous works, we find that motion information derived from the photos of various human postures can help identify major human body components.
arXiv Detail & Related papers (2022-02-09T02:53:10Z)
- Pose-Guided Feature Learning with Knowledge Distillation for Occluded Person Re-Identification [137.8810152620818]
We propose a network named Pose-Guided Feature Learning with Knowledge Distillation (PGFL-KD).
The PGFL-KD consists of a main branch (MB) and two pose-guided branches, i.e., a foreground-enhanced branch (FEB) and a body part semantics aligned branch (SAB).
Experiments on occluded, partial, and holistic ReID tasks show the effectiveness of our proposed network.
arXiv Detail & Related papers (2021-07-31T03:34:27Z)
- AAformer: Auto-Aligned Transformer for Person Re-Identification [82.45385078624301]
We introduce an alignment scheme into the transformer architecture for the first time.
We propose the Auto-Aligned Transformer (AAformer) to automatically locate both human parts and non-human ones at the patch level.
AAformer integrates part alignment into the self-attention, and the output [PART] tokens can be directly used as part features for retrieval.
arXiv Detail & Related papers (2021-04-02T08:00:25Z)
- AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild [77.43884383743872]
We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
arXiv Detail & Related papers (2020-10-26T03:19:46Z)
- Pose-guided Visible Part Matching for Occluded Person ReID [80.81748252960843]
We propose a Pose-guided Visible Part Matching (PVPM) method that jointly learns the discriminative features with pose-guided attention and self-mines the part visibility.
Experimental results on three occluded benchmarks show that the proposed method achieves performance competitive with state-of-the-art methods.
arXiv Detail & Related papers (2020-04-01T04:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.