See What You Seek: Semantic Contextual Integration for Cloth-Changing Person Re-Identification
- URL: http://arxiv.org/abs/2412.01345v1
- Date: Mon, 02 Dec 2024 10:11:16 GMT
- Title: See What You Seek: Semantic Contextual Integration for Cloth-Changing Person Re-Identification
- Authors: Xiyu Han, Xian Zhong, Wenxin Huang, Xuemei Jia, Wenxuan Liu, Xiaohan Yu, Alex Chichung Kot
- Abstract summary: Cloth-changing person re-identification (CC-ReID) aims to match individuals across multiple surveillance cameras despite variations in clothing.
Existing methods typically focus on mitigating the effects of clothing changes or enhancing ID-relevant features.
We propose a novel prompt learning framework, Semantic Contextual Integration (SCI), for CC-ReID.
- Score: 16.845045499676793
- Abstract: Cloth-changing person re-identification (CC-ReID) aims to match individuals across multiple surveillance cameras despite variations in clothing. Existing methods typically focus on mitigating the effects of clothing changes or enhancing ID-relevant features but often struggle to capture complex semantic information. In this paper, we propose a novel prompt learning framework, Semantic Contextual Integration (SCI), for CC-ReID, which leverages the visual-text representation capabilities of CLIP to minimize the impact of clothing changes and enhance ID-relevant features. Specifically, we introduce a Semantic Separation Enhancement (SSE) module, which uses dual learnable text tokens to separately capture confounding and clothing-related semantic information, effectively isolating ID-relevant features from distracting clothing semantics. Additionally, we develop a Semantic-Guided Interaction Module (SIM) that uses orthogonalized text features to guide visual representations, sharpening the model's focus on distinctive ID characteristics. This integration enhances the model's discriminative power and enriches the visual context with high-dimensional semantic insights. Extensive experiments on three CC-ReID datasets demonstrate that our method outperforms state-of-the-art techniques. The code will be released on GitHub.
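The abstract does not say how the text features are orthogonalized or how they guide the visual representation. As a rough geometric intuition only, the sketch below assumes Gram-Schmidt orthogonalization of two text vectors (stand-ins for the confounding and clothing-related tokens) and then removes their directions from a visual vector. The function names, the toy 4-dimensional vectors, and the choice to suppress those directions are all illustrative assumptions, not the authors' SSE or SIM modules, which are learned components built on CLIP.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v, eps=1e-12):
    """Scale v to unit length; reject (near-)zero vectors."""
    norm = math.sqrt(dot(v, v))
    if norm < eps:
        raise ValueError("cannot normalize a (near-)zero vector")
    return [a / norm for a in v]


def orthogonalize(text_features):
    """Gram-Schmidt (assumed here): orthonormal basis for the input vectors."""
    basis = []
    for vec in text_features:
        residual = list(vec)
        for b in basis:
            coeff = dot(residual, b)
            residual = [r - coeff * x for r, x in zip(residual, b)]
        basis.append(normalize(residual))
    return basis


def suppress_directions(visual, basis):
    """Remove from `visual` its components along each orthonormal basis vector."""
    out = list(visual)
    for b in basis:
        coeff = dot(out, b)
        out = [o - coeff * x for o, x in zip(out, b)]
    return out


# Hypothetical toy vectors; real CLIP embeddings are much higher-dimensional.
confounder_text = [1.0, 1.0, 0.0, 0.0]
clothing_text = [1.0, 0.0, 1.0, 0.0]
visual = [0.5, 0.2, 0.9, 0.7]

basis = orthogonalize([confounder_text, clothing_text])
guided = suppress_directions(visual, basis)
```

After this step, `guided` has zero inner product with both basis vectors, so whatever the two text directions encode no longer contributes to similarity scores computed from it. A learned interaction module would typically do something softer than this hard projection.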
Related papers
- CLIP-Driven Cloth-Agnostic Feature Learning for Cloth-Changing Person Re-Identification [47.948622774810296]
We propose a novel framework called CLIP-Driven Cloth-Agnostic Feature Learning (CCAF) for Cloth-Changing Person Re-Identification (CC-ReID).
Two modules were custom-designed: the Invariant Feature Prompting (IFP) and the Clothes Feature Minimization (CFM).
Experiments have demonstrated the effectiveness of the proposed CCAF, achieving new state-of-the-art performance on several popular CC-ReID benchmarks without any additional inference time.
arXiv Detail & Related papers (2024-06-13T14:56:07Z)
- Attend and Enrich: Enhanced Visual Prompt for Zero-Shot Learning [114.59476118365266]
We propose AENet, which endows the visual prompt with semantic information to distill a semantic-enhanced prompt for visual representation enrichment.
AENet comprises two key steps: 1) exploring the concept-harmonized tokens for the visual and attribute modalities, grounded on the modal-sharing token that represents consistent visual-semantic concepts; and 2) yielding a semantic-enhanced prompt via the visual residual refinement unit with attribute consistency supervision.
arXiv Detail & Related papers (2024-06-05T07:59:48Z)
- Content and Salient Semantics Collaboration for Cloth-Changing Person Re-Identification [74.10897798660314]
Cloth-changing person Re-IDentification aims at recognizing the same person with clothing changes across non-overlapping cameras.
We propose the Content and Salient Semantics Collaboration framework, facilitating cross-parallel semantics interaction and refinement.
Our framework is simple yet effective, and the vital design is the Semantics Mining and Refinement (SMR) module.
arXiv Detail & Related papers (2024-05-26T15:17:28Z)
- Identity-aware Dual-constraint Network for Cloth-Changing Person Re-identification [13.709863134725335]
Cloth-Changing Person Re-Identification (CC-ReID) aims to accurately identify the target person in more realistic surveillance scenarios, where pedestrians usually change their clothing.
Despite great progress, limited cloth-changing training samples in existing CC-ReID datasets still prevent the model from adequately learning cloth-irrelevant features.
We propose an Identity-aware Dual-constraint Network (IDNet) for the CC-ReID task.
arXiv Detail & Related papers (2024-03-13T05:46:36Z)
- CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification [39.262536758248245]
Cross-modality identity matching poses significant challenges in VIReID.
We propose a CLIP-Driven Semantic Discovery Network (CSDN) that consists of Modality-specific Prompt Learner, Semantic Information Integration, and High-level Semantic Embedding.
arXiv Detail & Related papers (2024-01-11T10:20:13Z)
- Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification [78.52704557647438]
We propose a novel FIne-grained Representation and Recomposition (FIRe$^{2}$) framework to tackle both limitations without any auxiliary annotation or data.
Experiments demonstrate that FIRe$^{2}$ can achieve state-of-the-art performance on five widely-used cloth-changing person Re-ID benchmarks.
arXiv Detail & Related papers (2023-08-21T12:59:48Z)
- Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification [90.39454748065558]
Body shape is one of the significant modality-shared cues for VI-ReID.
We propose a shape-erased feature learning paradigm that decorrelates modality-shared features in two subspaces.
Experiments on SYSU-MM01, RegDB, and HITSZ-VCM datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-04-09T10:22:10Z)
- A Semantic-aware Attention and Visual Shielding Network for Cloth-changing Person Re-identification [29.026249268566303]
Cloth-changing person reidentification (ReID) is a newly emerging research topic that aims to retrieve pedestrians whose clothes are changed.
Since the human appearance with different clothes exhibits large variations, it is very difficult for existing approaches to extract discriminative and robust feature representations.
This work proposes a novel semantic-aware attention and visual shielding network for cloth-changing person ReID.
arXiv Detail & Related papers (2022-07-18T05:38:37Z)
- CRIS: CLIP-Driven Referring Image Segmentation [71.56466057776086]
We propose an end-to-end CLIP-Driven Referring Image Segmentation framework (CRIS).
CRIS resorts to vision-language decoding and contrastive learning for achieving the text-to-pixel alignment.
Our proposed framework significantly outperforms the state-of-the-art performance without any post-processing.
arXiv Detail & Related papers (2021-11-30T07:29:08Z)