Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification
- URL: http://arxiv.org/abs/2308.08887v1
- Date: Thu, 17 Aug 2023 09:46:27 GMT
- Title: Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification
- Authors: Zhaopeng Dou, Zhongdao Wang, Yali Li, and Shengjin Wang
- Abstract summary: Prior DG ReID methods employ limited labeled data for training due to the high cost of annotation.
We propose an Identity-seeking Self-supervised Representation learning (ISR) method.
ISR constructs positive pairs from inter-frame images by modeling the instance association as a maximum-weight bipartite matching problem.
ISR achieves 87.0% Rank-1 on Market-1501 and 56.4% Rank-1 on MSMT17, outperforming the best supervised domain-generalizable method by 5.0% and 19.5%, respectively.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper aims to learn a domain-generalizable (DG) person re-identification
(ReID) representation from large-scale videos \textbf{without any annotation}.
Prior DG ReID methods employ limited labeled data for training due to the high
cost of annotation, which restricts further advances. To overcome the barriers
of data and annotation, we propose to utilize large-scale unsupervised data for
training. The key issue lies in how to mine identity information. To this end,
we propose an Identity-seeking Self-supervised Representation learning (ISR)
method. ISR constructs positive pairs from inter-frame images by modeling the
instance association as a maximum-weight bipartite matching problem. A
reliability-guided contrastive loss is further presented to suppress the
adverse impact of noisy positive pairs, ensuring that reliable positive pairs
dominate the learning process. The training cost of ISR scales approximately
linearly with the data size, making it feasible to utilize large-scale data for
training. The learned representation exhibits superior generalization ability.
\textbf{Without human annotation and fine-tuning, ISR achieves 87.0\% Rank-1 on
Market-1501 and 56.4\% Rank-1 on MSMT17}, outperforming the best supervised
domain-generalizable method by 5.0\% and 19.5\%, respectively. In the
pre-training$\rightarrow$fine-tuning scenario, ISR achieves state-of-the-art
performance, with 88.4\% Rank-1 on MSMT17. The code is at
\url{https://github.com/dcp15/ISR_ICCV2023_Oral}.
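As a rough illustration of the two components described in the abstract, the toy sketch below mines positive pairs by maximum-weight bipartite matching between two frames and applies a reliability-weighted contrastive loss. This is our own hedged reconstruction, not the authors' implementation: the brute-force matcher, the use of matched similarity scores as reliability weights, and the placeholder embeddings are all simplifying assumptions.

```python
# Illustrative sketch only (not the authors' released code): the two core
# pieces of ISR as described in the abstract, run on toy data.
#
# 1) Positive-pair mining: maximum-weight bipartite matching between the
#    instances of two video frames. We brute-force over permutations here;
#    a real implementation would use e.g. the Hungarian algorithm.
# 2) Reliability-guided contrastive loss: a weighted InfoNCE in which each
#    matched pair's term is scaled by an assumed reliability weight.
from itertools import permutations

import numpy as np

def max_weight_matching(sim):
    """Return (pairs, score) maximizing total similarity.

    sim: (n, m) similarity matrix between frame-A and frame-B instances,
    with n <= m assumed for simplicity.
    """
    n, m = sim.shape
    best_score, best_pairs = -np.inf, None
    for perm in permutations(range(m), n):   # each row gets a distinct column
        score = sum(sim[i, j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(perm))
    return best_pairs, best_score

def reliability_weighted_infonce(z_a, z_b, weights, tau=0.1):
    """InfoNCE over matched pairs, down-weighting unreliable ones.

    z_a, z_b: (n, d) L2-normalized embeddings, row i of z_a matched to
    row i of z_b; weights: (n,) reliability scores in [0, 1].
    """
    logits = z_a @ z_b.T / tau                                  # (n, n)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_pair = -np.diag(log_prob)           # cross-entropy, diagonal targets
    return float((weights * per_pair).sum() / weights.sum())

# Toy cosine-similarity matrix: 3 detections in frame A vs 3 in frame B.
sim = np.array([[0.9, 0.1, 0.2],
                [0.2, 0.8, 0.1],
                [0.1, 0.3, 0.7]])
pairs, score = max_weight_matching(sim)
print(pairs)  # → [(0, 0), (1, 1), (2, 2)]

# Use the matched similarities themselves as (assumed) reliability weights.
weights = np.array([sim[i, j] for i, j in pairs])
z = np.eye(3)  # placeholder embeddings for the matched instances
loss = reliability_weighted_infonce(z, z, weights, tau=0.1)
```

At realistic scale the permutation search is infeasible; `scipy.optimize.linear_sum_assignment` with `maximize=True` solves the same assignment problem in polynomial time.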
Related papers
- Debiased Learning for Remote Sensing Data
We propose a highly effective semi-supervised approach tailored specifically to remote sensing data.
First, we adapt the FixMatch framework to remote sensing data by designing robust strong and weak augmentations suitable for this domain.
Second, we develop an effective semi-supervised learning method by removing bias in imbalanced training data resulting from both actual labels and pseudo-labels predicted by the model.
arXiv Detail & Related papers (2023-12-24T03:33:30Z)
- Continual Contrastive Finetuning Improves Low-Resource Relation Extraction
Relation extraction has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE by self-supervised learning.
We propose to pretrain and finetune the RE model using consistent objectives of contrastive learning.
arXiv Detail & Related papers (2022-12-21T07:30:22Z)
- Generalizable Re-Identification from Videos with Cycle Association
We propose Cycle Association (CycAs) as a scalable self-supervised learning method for re-ID with low training complexity.
We construct a large-scale unlabeled re-ID dataset named LMP-video, tailored for the proposed method.
CycAs learns re-ID features by enforcing cycle consistency of instance association between temporally successive video frame pairs.
arXiv Detail & Related papers (2022-11-07T16:21:57Z)
- Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation
We propose a differentiable geometric warping to conduct unsupervised data augmentation.
We also propose a novel adversarial dual-student framework to improve the Mean-Teacher.
Our solution significantly improves performance and achieves state-of-the-art results on both datasets.
arXiv Detail & Related papers (2022-03-05T17:36:17Z)
- Unleashing the Potential of Unsupervised Pre-Training with Intra-Identity Regularization for Person Re-Identification
We design an Unsupervised Pre-training framework for ReID based on the contrastive learning (CL) pipeline, dubbed UP-ReID.
We introduce an intra-identity (I$2$-)regularization in UP-ReID, instantiated as two constraints derived from the global image and the local patch perspectives.
Our UP-ReID pre-trained model can significantly benefit the downstream ReID fine-tuning and achieve state-of-the-art performance.
arXiv Detail & Related papers (2021-12-01T07:16:37Z)
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification
Transformer-based supervised pre-training achieves strong performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
- Unsupervised Pre-training for Person Re-identification
We present "LUPerson", a large-scale unlabeled person re-identification (Re-ID) dataset.
We make the first attempt of performing unsupervised pre-training for improving the generalization ability of the learned person Re-ID feature representation.
arXiv Detail & Related papers (2020-12-07T14:48:26Z)
- Improving Semantic Segmentation via Self-Training
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.