What Makes You Unique? Attribute Prompt Composition for Object Re-Identification
- URL: http://arxiv.org/abs/2509.18715v1
- Date: Tue, 23 Sep 2025 07:03:08 GMT
- Title: What Makes You Unique? Attribute Prompt Composition for Object Re-Identification
- Authors: Yingquan Wang, Pingping Zhang, Chong Sun, Dong Wang, Huchuan Lu
- Abstract summary: Object Re-IDentification aims to recognize individuals across non-overlapping camera views. Single-domain models tend to overfit to domain-specific features, whereas cross-domain models often rely on diverse normalization strategies. We propose an Attribute Prompt Composition framework, which exploits textual semantics to jointly enhance discrimination and generalization.
- Score: 70.67907354506278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object Re-IDentification (ReID) aims to recognize individuals across non-overlapping camera views. While recent advances have achieved remarkable progress, most existing models are constrained to either single-domain or cross-domain scenarios, limiting their real-world applicability. Single-domain models tend to overfit to domain-specific features, whereas cross-domain models often rely on diverse normalization strategies that may inadvertently suppress identity-specific discriminative cues. To address these limitations, we propose an Attribute Prompt Composition (APC) framework, which exploits textual semantics to jointly enhance discrimination and generalization. Specifically, we design an Attribute Prompt Generator (APG) consisting of a Semantic Attribute Dictionary (SAD) and a Prompt Composition Module (PCM). SAD is an over-complete attribute dictionary that provides rich semantic descriptions, while PCM adaptively composes relevant attributes from SAD to generate discriminative attribute-aware features. In addition, motivated by the strong generalization ability of Vision-Language Models (VLM), we propose a Fast-Slow Training Strategy (FSTS) to balance ReID-specific discrimination and generalizable representation learning. Specifically, FSTS adopts a Fast Update Stream (FUS) to rapidly acquire ReID-specific discriminative knowledge and a Slow Update Stream (SUS) to retain the generalizable knowledge inherited from the pre-trained VLM. Through mutual interaction, the framework effectively focuses on ReID-relevant features while mitigating overfitting. Extensive experiments on both conventional and Domain Generalized (DG) ReID datasets demonstrate that our framework surpasses state-of-the-art methods, exhibiting superior performance in terms of both discrimination and generalization. The source code is available at https://github.com/AWangYQ/APC.
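The two mechanisms in the abstract can be sketched concretely: the PCM scores an image feature against an over-complete attribute dictionary and composes the relevant entries, while the FSTS keeps a slow stream that only drifts gradually toward the fast stream. The sketch below is a minimal, framework-agnostic NumPy illustration under stated assumptions; the class names, dimensions, and random stand-ins for the text-encoded attribute dictionary are hypothetical, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class PromptCompositionModule:
    """Toy PCM: soft attention over an over-complete attribute dictionary (SAD).

    In the paper the dictionary entries would come from a VLM text encoder;
    random initialization here is only a stand-in.
    """
    def __init__(self, num_attributes=64, dim=128):
        self.sad = rng.normal(0.0, 0.02, (num_attributes, dim))  # SAD entries
        self.w_q = rng.normal(0.0, 0.02, (dim, dim))             # query projection

    def __call__(self, image_feat):
        q = image_feat @ self.w_q                              # (B, dim)
        scores = q @ self.sad.T / np.sqrt(self.sad.shape[1])   # (B, A) relevance
        weights = softmax(scores)                              # which attributes apply
        return weights @ self.sad                              # (B, dim) composed feature

def slow_stream_update(fast, slow, momentum=0.999):
    """EMA-style update: the slow stream retains most of its (VLM-inherited)
    parameters and drifts only slightly toward the fast stream each step."""
    for name in ("sad", "w_q"):
        blended = momentum * getattr(slow, name) + (1.0 - momentum) * getattr(fast, name)
        setattr(slow, name, blended)

# usage: one forward pass through the fast stream, then a slow-stream update
fast = PromptCompositionModule()
slow = PromptCompositionModule()
feats = rng.normal(size=(4, 128))
composed = fast(feats)
slow_stream_update(fast, slow)
print(composed.shape)  # (4, 128)
```

The momentum value 0.999 is a common EMA choice, not a figure from the paper; the key design point is only that the slow stream's update rate is much lower than the fast stream's.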
Related papers
- Identity Clue Refinement and Enhancement for Visible-Infrared Person Re-Identification [20.544872117860915]
Visible-Infrared Person Re-Identification (VI-ReID) is a challenging cross-modal matching task due to significant modality discrepancies. We propose a novel Identity Clue Refinement and Enhancement (ICRE) network to mine and utilize the implicit discriminative knowledge inherent in modality-specific attributes.
arXiv Detail & Related papers (2025-12-04T07:13:38Z)
- LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification [78.73711446918814]
We propose a novel framework named LATex for AG-ReID, which adopts prompt-tuning strategies to leverage attribute-based text knowledge. Our framework can fully leverage attribute-based text knowledge to improve AG-ReID performance.
arXiv Detail & Related papers (2025-03-31T04:47:05Z)
- Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability [54.420663939897686]
We propose the Attribute-formed Language Bottleneck Model (ALBM) to achieve interpretable image recognition. ALBM organizes concepts in the attribute-formed class-specific space, where concepts are descriptions of specific attributes for specific classes. To further improve interpretability, we propose Visual Attribute Prompt Learning (VAPL) to extract visual features on fine-grained attributes.
arXiv Detail & Related papers (2025-03-26T07:59:04Z)
- Hybrid Discriminative Attribute-Object Embedding Network for Compositional Zero-Shot Learning [83.10178754323955]
The Hybrid Discriminative Attribute-Object Embedding (HDA-OE) network is proposed to solve the problem of complex interactions between attributes and object visual representations. To increase the variability of training data, HDA-OE introduces an attribute-driven data synthesis (ADDS) module. To further improve the discriminative ability of the model, HDA-OE introduces the subclass-driven discriminative embedding (SDDE) module. The proposed model has been evaluated on three benchmark datasets, and the results verify its effectiveness and reliability.
arXiv Detail & Related papers (2024-11-28T09:50:25Z)
- Domain Consistency Representation Learning for Lifelong Person Re-Identification [31.076769754593098]
Lifelong person re-identification (LReID) exhibits a contradictory relationship between intra-domain discrimination and inter-domain gaps when learning from continuous data. We propose a novel domain consistency representation learning (DCR) model that explores global and attribute-wise representations to balance intra-domain discrimination and inter-domain gaps. Our DCR achieves superior performance compared to state-of-the-art LReID methods.
arXiv Detail & Related papers (2024-09-30T05:19:09Z) - Multi-Prompts Learning with Cross-Modal Alignment for Attribute-based
Person Re-Identification [18.01407937934588]
We present a new framework called Multi-Prompts ReID (MP-ReID) based on prompt learning and language models.
MP-ReID learns to hallucinate diverse, informative, and promptable sentences for describing the query images.
Explicit prompts are obtained by ensembling generation models, such as ChatGPT and VQA models.
arXiv Detail & Related papers (2023-12-28T03:00:19Z) - Calibrated Feature Decomposition for Generalizable Person
Re-Identification [82.64133819313186]
The Calibrated Feature Decomposition (CFD) module focuses on improving the generalization capacity for person re-identification.
A calibrated-and-standardized batch normalization (CSBN) is designed to learn calibrated person representations.
arXiv Detail & Related papers (2021-11-27T17:12:43Z) - Style Normalization and Restitution for Generalizable Person
Re-identification [89.482638433932]
We design a generalizable person ReID framework which trains a model on source domains yet is able to generalize and perform well on target domains.
We propose a simple yet effective Style Normalization and Restitution (SNR) module.
Our models empowered by the SNR modules significantly outperform the state-of-the-art domain generalization approaches on multiple widely-used person ReID benchmarks.
arXiv Detail & Related papers (2020-05-22T07:15:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.