Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification
- URL: http://arxiv.org/abs/2308.10692v2
- Date: Thu, 20 Jun 2024 11:47:21 GMT
- Title: Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification
- Authors: Qizao Wang, Xuelin Qian, Bin Li, Xiangyang Xue, Yanwei Fu,
- Abstract summary: We propose a novel FIne-grained Representation and Recomposition (FIRe$2$) framework to tackle both limitations without any auxiliary annotation or data.
Experiments demonstrate that FIRe$2$ can achieve state-of-the-art performance on five widely-used cloth-changing person Re-ID benchmarks.
- Score: 78.52704557647438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloth-changing person Re-IDentification (Re-ID) is a particularly challenging task, suffering from two limitations of inferior discriminative features and limited training samples. Existing methods mainly leverage auxiliary information to facilitate identity-relevant feature learning, including soft-biometrics features of shapes or gaits, and additional labels of clothing. However, this information may be unavailable in real-world applications. In this paper, we propose a novel FIne-grained Representation and Recomposition (FIRe$^{2}$) framework to tackle both limitations without any auxiliary annotation or data. Specifically, we first design a Fine-grained Feature Mining (FFM) module to separately cluster images of each person. Images with similar so-called fine-grained attributes (e.g., clothes and viewpoints) are encouraged to cluster together. An attribute-aware classification loss is introduced to perform fine-grained learning based on cluster labels, which are not shared among different people, promoting the model to learn identity-relevant features. Furthermore, to take full advantage of fine-grained attributes, we present a Fine-grained Attribute Recomposition (FAR) module by recomposing image features with different attributes in the latent space. It significantly enhances robust feature learning. Extensive experiments demonstrate that FIRe$^{2}$ can achieve state-of-the-art performance on five widely-used cloth-changing person Re-ID benchmarks. The code is available at https://github.com/QizaoWang/FIRe-CCReID.
Related papers
- Multi-Prompts Learning with Cross-Modal Alignment for Attribute-based
Person Re-Identification [18.01407937934588]
We present a new framework called Multi-Prompts ReID (MP-ReID) based on prompt learning and language models.
MP-ReID learns to hallucinate diverse, informative, and promptable sentences for describing the query images.
Explicit prompts are obtained by ensembling generation models, such as ChatGPT and VQA models.
arXiv Detail & Related papers (2023-12-28T03:00:19Z) - Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale
Fine-Grained Image Retrieval [65.43522019468976]
We propose attribute-aware hashing networks with self-consistency for generating attribute-aware hash codes.
We develop an encoder-decoder structure network of a reconstruction task to unsupervisedly distill high-level attribute-specific vectors.
Our models are equipped with a feature decorrelation constraint upon these attribute vectors to strengthen their representative abilities.
arXiv Detail & Related papers (2023-11-21T08:20:38Z) - Dual Feature Augmentation Network for Generalized Zero-shot Learning [14.410978100610489]
Zero-shot learning (ZSL) aims to infer novel classes without training samples by transferring knowledge from seen classes.
Existing embedding-based approaches for ZSL typically employ attention mechanisms to locate attributes on an image.
We propose a novel Dual Feature Augmentation Network (DFAN), which comprises two feature augmentation modules.
arXiv Detail & Related papers (2023-09-25T02:37:52Z) - Hierarchical Visual Primitive Experts for Compositional Zero-Shot
Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize compositions with prior knowledge of known primitives (attribute and object)
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
arXiv Detail & Related papers (2023-08-08T03:24:21Z) - Exploiting Semantic Attributes for Transductive Zero-Shot Learning [97.61371730534258]
Zero-shot learning aims to recognize unseen classes by generalizing the relation between visual features and semantic attributes learned from the seen classes.
We present a novel transductive ZSL method that produces semantic attributes of the unseen data and imposes them on the generative process.
Experiments on five standard benchmarks show that our method yields state-of-the-art results for zero-shot learning.
arXiv Detail & Related papers (2023-03-17T09:09:48Z) - Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
arXiv Detail & Related papers (2022-04-04T02:25:40Z) - Compositional Fine-Grained Low-Shot Learning [58.53111180904687]
We develop a novel compositional generative model for zero- and few-shot learning to recognize fine-grained classes with a few or no training samples.
We propose a feature composition framework that learns to extract attribute features from training samples and combines them to construct fine-grained features for rare and unseen classes.
arXiv Detail & Related papers (2021-05-21T16:18:24Z) - Graph-based Person Signature for Person Re-Identifications [17.181807593574764]
We propose a new method to effectively aggregate detailed person descriptions (attributes labels) and visual features (body parts and global features) into a graph.
The graph is integrated into a multi-branch multi-task framework for person re-identification.
Our approach achieves competitive results among the state of the art and outperforms other attribute-based or mask-guided methods.
arXiv Detail & Related papers (2021-04-14T10:54:36Z) - Attributes-Guided and Pure-Visual Attention Alignment for Few-Shot
Recognition [27.0842107128122]
We devise an attributes-guided attention module (AGAM) to utilize human-annotated attributes and learn more discriminative features.
Our proposed module can significantly improve simple metric-based approaches to achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-09-10T08:38:32Z) - An Attention-Based Deep Learning Model for Multiple Pedestrian
Attributes Recognition [4.6898263272139795]
This paper provides a novel solution to the problem of automatic characterization of pedestrians in surveillance footage.
We propose a multi-task deep model that uses an element-wise multiplication layer to extract more comprehensive feature representations.
Our experiments were performed on two well-known datasets (RAP and PETA) and point for the superiority of the proposed method with respect to the state-of-the-art.
arXiv Detail & Related papers (2020-04-02T16:21:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.