PersonMAE: Person Re-Identification Pre-Training with Masked
AutoEncoders
- URL: http://arxiv.org/abs/2311.04496v1
- Date: Wed, 8 Nov 2023 07:02:27 GMT
- Title: PersonMAE: Person Re-Identification Pre-Training with Masked
AutoEncoders
- Authors: Hezhen Hu, Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Lu Yuan, Dong
Chen, Houqiang Li
- Abstract summary: Pre-training is playing an increasingly important role in learning generic feature representation for Person Re-identification (ReID).
We propose PersonMAE, which incorporates two core designs into masked autoencoders to better serve the task of Person Re-ID.
PersonMAE with a ViT-B backbone achieves 79.8% and 69.5% mAP on the MSMT17 and OccDuke datasets, respectively.
- Score: 132.60355401780407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-training is playing an increasingly important role in learning generic
feature representation for Person Re-identification (ReID). We argue that a
high-quality ReID representation should have three properties, namely,
multi-level awareness, occlusion robustness, and cross-region invariance. To
this end, we propose a simple yet effective pre-training framework, namely
PersonMAE, which incorporates two core designs into masked autoencoders to better
serve the task of Person Re-ID. 1) PersonMAE generates two regions from the
given image, with RegionA as the input and RegionB as the prediction
target. RegionA is corrupted with block-wise masking to mimic common occlusion
in ReID, and its remaining visible parts are fed into the encoder. 2) PersonMAE
then predicts the whole RegionB at both the pixel level and the semantic
feature level. This encourages the pre-trained feature representations to exhibit the
three properties mentioned above. These properties make PersonMAE compatible
with downstream Person ReID tasks, leading to state-of-the-art performance on
four downstream ReID tasks, i.e., supervised (holistic and occluded setting),
and unsupervised (UDA and USL setting). Notably, on the commonly adopted
supervised setting, PersonMAE with ViT-B backbone achieves 79.8% and 69.5% mAP
on the MSMT17 and OccDuke datasets, surpassing the previous state-of-the-art by
a large margin of +8.0 mAP, and +5.3 mAP, respectively.
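The two-region pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the crop sizes, patch grid, block size, and mask ratio below are illustrative assumptions chosen for a typical 256x128 person image.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_two_regions(img, region_h=128, region_w=64):
    """Crop two (possibly overlapping) regions from a person image.
    RegionA serves as the encoder input, RegionB as the prediction target."""
    H, W, _ = img.shape
    def crop():
        top = rng.integers(0, H - region_h + 1)
        left = rng.integers(0, W - region_w + 1)
        return img[top:top + region_h, left:left + region_w]
    return crop(), crop()

def blockwise_mask(h_patches, w_patches, mask_ratio=0.5, block=2):
    """Boolean patch mask (True = masked) built from contiguous square
    blocks, mimicking the block-like occlusions common in ReID imagery."""
    mask = np.zeros((h_patches, w_patches), dtype=bool)
    target = int(mask_ratio * h_patches * w_patches)
    while mask.sum() < target:
        top = rng.integers(0, h_patches - block + 1)
        left = rng.integers(0, w_patches - block + 1)
        mask[top:top + block, left:left + block] = True
    return mask

img = rng.random((256, 128, 3))           # stand-in for a person image
region_a, region_b = sample_two_regions(img)
mask = blockwise_mask(16, 8, mask_ratio=0.5)
# Only the visible (unmasked) patches of RegionA would be fed to the
# encoder; the decoder then predicts the whole RegionB at both the
# pixel level and the semantic feature level.
visible_frac = 1.0 - mask.mean()
```

Block-wise masking (rather than random per-patch masking) is what makes the corruption resemble real occlusions, since occluders in surveillance footage cover contiguous areas of the body.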
Related papers
- Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning [116.75939193785143]
Contrastive learning (CL) for Vision Transformers (ViTs) in image domains has achieved performance comparable to CL for traditional convolutional backbones.
In 3D point cloud pretraining with ViTs, masked autoencoder (MAE) modeling remains dominant.
arXiv Detail & Related papers (2024-07-08T12:28:56Z)
- HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception [97.55089867970874]
We introduce masked image modeling (MIM) as a pre-training approach for this task.
Motivated by this insight, we incorporate an intuitive human structure prior - human parts - into pre-training.
This encourages the model to concentrate more on body structure information during pre-training, yielding substantial benefits across a range of human-centric perception tasks.
arXiv Detail & Related papers (2023-10-31T17:56:11Z)
- Body Part-Based Representation Learning for Occluded Person Re-Identification [102.27216744301356]
Occluded person re-identification (ReID) is a person retrieval task which aims at matching occluded person images with holistic ones.
Part-based methods have been shown beneficial as they offer fine-grained information and are well suited to represent partially visible human bodies.
We propose BPBreID, a body part-based ReID model for solving the above issues.
arXiv Detail & Related papers (2022-11-07T16:48:41Z)
- Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
- Learning Feature Fusion for Unsupervised Domain Adaptive Person Re-identification [5.203329540700176]
We propose a Learning Feature Fusion (LF2) framework for adaptively learning to fuse global and local features.
Experiments show that our proposed LF2 framework outperforms the state-of-the-art with 73.5% mAP and 83.7% Rank1 on Market1501 to DukeMTMC-ReID.
arXiv Detail & Related papers (2022-05-19T12:04:21Z)
- Unleashing the Potential of Unsupervised Pre-Training with Intra-Identity Regularization for Person Re-Identification [10.045028405219641]
We design an Unsupervised Pre-training framework for ReID based on the contrastive learning (CL) pipeline, dubbed UP-ReID.
We introduce an intra-identity (I^2) regularization in UP-ReID, which is instantiated as two constraints coming from the global image aspect and the local patch aspect.
Our UP-ReID pre-trained model can significantly benefit the downstream ReID fine-tuning and achieve state-of-the-art performance.
arXiv Detail & Related papers (2021-12-01T07:16:37Z)
- Integrating Coarse Granularity Part-level Features with Supervised Global-level Features for Person Re-identification [3.4758712821739426]
The Coarse Granularity Part-level Network (CGPN) integrates supervised global-level features for both holistic and partial person images.
CGPN learns to extract effective body part features for both holistic and partial person images.
A single model trained on three Re-ID datasets, Market-1501, DukeMTMC-reID, and CUHK03, achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-15T11:49:20Z)
- Robust Person Re-Identification through Contextual Mutual Boosting [77.1976737965566]
We propose the Contextual Mutual Boosting Network (CMBN), which localizes pedestrians and recalibrates features by effectively exploiting contextual information and statistical inference.
Experiments on the benchmarks demonstrate the superiority of the architecture compared to the state-of-the-art.
arXiv Detail & Related papers (2020-09-16T06:33:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.