A Versatile Framework for Multi-scene Person Re-identification
- URL: http://arxiv.org/abs/2403.11121v1
- Date: Sun, 17 Mar 2024 07:04:09 GMT
- Title: A Versatile Framework for Multi-scene Person Re-identification
- Authors: Wei-Shi Zheng, Junkai Yan, Yi-Xing Peng,
- Abstract summary: Person Re-identification (ReID) has been extensively developed for a decade in order to learn the association of images of the same person across non-overlapping camera views.
Despite the impressive performance of many ReID variants, these variants typically function distinctly and cannot be applied to other challenges.
This work contributes to the first attempt at learning a versatile ReID model to solve such a problem.
- Score: 30.74494316484783
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Person Re-identification (ReID) has been extensively developed for a decade in order to learn the association of images of the same person across non-overlapping camera views. To overcome significant variations between images across camera views, mountains of variants of ReID models were developed for solving a number of challenges, such as resolution change, clothing change, occlusion, modality change, and so on. Despite the impressive performance of many ReID variants, these variants typically function distinctly and cannot be applied to other challenges. To our best knowledge, there is no versatile ReID model that can handle various ReID challenges at the same time. This work contributes to the first attempt at learning a versatile ReID model to solve such a problem. Our main idea is to form a two-stage prompt-based twin modeling framework called VersReID. Our VersReID firstly leverages the scene label to train a ReID Bank that contains abundant knowledge for handling various scenes, where several groups of scene-specific prompts are used to encode different scene-specific knowledge. In the second stage, we distill a V-Branch model with versatile prompts from the ReID Bank for adaptively solving the ReID of different scenes, eliminating the demand for scene labels during the inference stage. To facilitate training VersReID, we further introduce the multi-scene properties into self-supervised learning of ReID via a multi-scene prioris data augmentation (MPDA) strategy. Through extensive experiments, we demonstrate the success of learning an effective and versatile ReID model for handling ReID tasks under multi-scene conditions without manual assignment of scene labels in the inference stage, including general, low-resolution, clothing change, occlusion, and cross-modality scenes. Codes and models are available at https://github.com/iSEE-Laboratory/VersReID.
Related papers
- All in One Framework for Multimodal Re-identification in the Wild [58.380708329455466]
multimodal learning paradigm for ReID introduced, referred to as All-in-One (AIO)
AIO harnesses a frozen pre-trained big model as an encoder, enabling effective multimodal retrieval without additional fine-tuning.
Experiments on cross-modal and multimodal ReID reveal that AIO not only adeptly handles various modal data but also excels in challenging contexts.
arXiv Detail & Related papers (2024-05-08T01:04:36Z) - Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification [64.36210786350568]
We propose a novel learning framework named textbfEDITOR to select diverse tokens from vision Transformers for multi-modal object ReID.
Our framework can generate more discriminative features for multi-modal object ReID.
arXiv Detail & Related papers (2024-03-15T12:44:35Z) - Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration [58.11518043688793]
MPerceiver is a novel approach to enhance adaptiveness, generalizability and fidelity for all-in-one image restoration.
MPerceiver is trained on 9 tasks for all-in-one IR and outperforms state-of-the-art task-specific methods across most tasks.
arXiv Detail & Related papers (2023-12-05T17:47:11Z) - Unleashing the Potential of Unsupervised Pre-Training with
Intra-Identity Regularization for Person Re-Identification [10.045028405219641]
We design an Unsupervised Pre-training framework for ReID based on the contrastive learning (CL) pipeline, dubbed UP-ReID.
We introduce an intra-identity (I$2$-)regularization in the UP-ReID, which is instantiated as two constraints coming from global image aspect and local patch aspect.
Our UP-ReID pre-trained model can significantly benefit the downstream ReID fine-tuning and achieve state-of-the-art performance.
arXiv Detail & Related papers (2021-12-01T07:16:37Z) - Learning to Disentangle Scenes for Person Re-identification [15.378033331385312]
This paper proposes to divide-and-conquer the person re-identification (ReID) task.
We employ several self-supervision operations to simulate different challenging problems and handle each challenging problem using different networks.
A general multi-branch network, including one master branch and two servant branches, is introduced to handle different scenes.
arXiv Detail & Related papers (2021-11-10T01:17:10Z) - Apparel-invariant Feature Learning for Apparel-changed Person
Re-identification [70.16040194572406]
Most public ReID datasets are collected in a short time window in which persons' appearance rarely changes.
In real-world applications such as in a shopping mall, the same person's clothing may change, and different persons may wearing similar clothes.
It is critical to learn an apparel-invariant person representation under cases like cloth changing or several persons wearing similar clothes.
arXiv Detail & Related papers (2020-08-14T03:49:14Z) - Cross-Resolution Adversarial Dual Network for Person Re-Identification
and Beyond [59.149653740463435]
Person re-identification (re-ID) aims at matching images of the same person across camera views.
Due to varying distances between cameras and persons of interest, resolution mismatch can be expected.
We propose a novel generative adversarial network to address cross-resolution person re-ID.
arXiv Detail & Related papers (2020-02-19T07:21:38Z) - Uncertainty-Aware Multi-Shot Knowledge Distillation for Image-Based
Object Re-Identification [93.39253443415392]
We propose exploiting the multi-shots of the same identity to guide the feature learning of each individual image.
It consists of a teacher network (T-net) that learns the comprehensive features from multiple images of the same object, and a student network (S-net) that takes a single image as input.
We validate the effectiveness of our approach on the popular vehicle re-id and person re-id datasets.
arXiv Detail & Related papers (2020-01-15T09:39:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.