DC-Former: Diverse and Compact Transformer for Person Re-Identification
- URL: http://arxiv.org/abs/2302.14335v1
- Date: Tue, 28 Feb 2023 06:03:42 GMT
- Title: DC-Former: Diverse and Compact Transformer for Person Re-Identification
- Authors: Wen Li, Cheng Zou, Meng Wang, Furong Xu, Jianan Zhao, Ruobing Zheng,
Yuan Cheng, Wei Chu
- Abstract summary: In the person re-identification (re-ID) task, learning discriminative representations with deep learning remains challenging because of limited data.
We propose a Diverse and Compact Transformer (DC-Former) that achieves an effect similar to adding training data by splitting the embedding space into multiple subspaces.
- Score: 38.12558570608426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the person re-identification (re-ID) task, it is still challenging to learn
discriminative representations with deep learning because of limited data. Generally
speaking, a model performs better as the amount of training data increases.
Adding similar classes strengthens the classifier's ability
to distinguish similar identities, thereby improving the discriminability of the
representation. In this paper, we propose a Diverse and Compact Transformer
(DC-Former) that achieves a similar effect by splitting the embedding space into
multiple diverse and compact subspaces. A compact embedding subspace helps the model
learn more robust and discriminative embeddings for identifying similar classes, and
the fusion of these diverse embeddings, which contain more fine-grained information,
further improves re-ID performance. Specifically, multiple class tokens
are used in a vision transformer to represent multiple embedding spaces. Then, a
self-diverse constraint (SDC) is applied to these spaces to push them away from
each other, which makes each embedding space diverse and compact. Further, a
dynamic weight controller (DWC) is designed to balance the relative
importance among them during training. The experimental results of our method
are promising: it surpasses previous state-of-the-art methods on several
commonly used person re-ID benchmarks.
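The abstract does not give the formula for the self-diverse constraint, so the following is only a minimal sketch of one plausible reading: penalize the mean pairwise cosine similarity between the embeddings produced by the class tokens, so that minimizing the penalty pushes them apart. The function names, the choice of cosine similarity, and the toy 2-D vectors are all assumptions for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def diversity_penalty(class_tokens):
    """Mean cosine similarity over all distinct pairs of token embeddings.

    Assumed stand-in for SDC: a lower value means the embeddings
    point in more different directions.
    """
    pairs = list(combinations(class_tokens, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical 2-D embeddings from three class tokens for one image.
aligned = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]   # all tokens agree
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]   # tokens point apart

print(diversity_penalty(aligned))
print(diversity_penalty(spread))
```

In this toy case the identical embeddings receive the larger penalty, so a training loss that includes such a term would favor the spread configuration. How the real method weights this term against the identity losses is what the DWC is described as handling, and is not modeled here.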
Related papers
- Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification [64.36210786350568]
We propose a novel learning framework named EDITOR to select diverse tokens from vision Transformers for multi-modal object ReID.
Our framework can generate more discriminative features for multi-modal object ReID.
arXiv Detail & Related papers (2024-03-15T12:44:35Z)
- Contributing Dimension Structure of Deep Feature for Coreset Selection [26.759457501199822]
Coreset selection seeks to choose a subset of crucial training samples for efficient learning.
Sample selection hinges on two main aspects: a sample's representation in enhancing performance and the role of sample diversity in averting overfitting.
Existing methods typically measure both the representation and diversity of data based on similarity metrics.
arXiv Detail & Related papers (2024-01-29T14:47:26Z)
- Detail Reinforcement Diffusion Model: Augmentation Fine-Grained Visual Categorization in Few-Shot Conditions [11.121652649243119]
Diffusion models have been widely adopted in data augmentation due to their outstanding diversity in data generation.
We propose a novel approach termed the detail reinforcement diffusion model (DRDM).
It leverages the rich knowledge of large models for fine-grained data augmentation and comprises two key components: discriminative semantic recombination (DSR) and spatial knowledge reference (SKR).
arXiv Detail & Related papers (2023-09-15T01:28:59Z)
- Rethinking Person Re-identification from a Projection-on-Prototypes Perspective [84.24742313520811]
Person Re-IDentification (Re-ID), as a retrieval task, has achieved tremendous development over the past decade.
We propose a new baseline ProNet, which innovatively reserves the function of the classifier at the inference stage.
Experiments on four benchmarks demonstrate that our proposed ProNet is simple yet effective, and significantly beats previous baselines.
arXiv Detail & Related papers (2023-08-21T13:38:10Z)
- Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification [78.52704557647438]
We propose a novel FIne-grained Representation and Recomposition (FIRe$^{2}$) framework to tackle both limitations without any auxiliary annotation or data.
Experiments demonstrate that FIRe$^{2}$ can achieve state-of-the-art performance on five widely-used cloth-changing person Re-ID benchmarks.
arXiv Detail & Related papers (2023-08-21T12:59:48Z)
- Generalizable Low-Resource Activity Recognition with Diverse and Discriminative Representation Learning [24.36351102003414]
Human activity recognition (HAR) is a time series classification task that focuses on identifying the motion patterns from human sensor readings.
We propose a novel approach called Diverse and Discriminative representation Learning (DDLearn) for generalizable low-resource HAR.
Our method significantly outperforms state-of-the-art methods by an average accuracy improvement of 9.5%.
arXiv Detail & Related papers (2023-05-25T08:24:22Z)
- Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification [61.411869453639845]
We introduce a bi-reconstruction mechanism that can simultaneously accommodate inter-class and intra-class variations.
This design effectively helps the model to explore more subtle and discriminative features.
Experimental results on three widely used fine-grained image classification datasets consistently show considerable improvements.
arXiv Detail & Related papers (2022-11-30T16:55:14Z)
- Improving Deep Metric Learning by Divide and Conquer [11.380358587116683]
Deep metric learning (DML) is a cornerstone of many computer vision applications.
It aims to learn a mapping from the input domain to an embedding space in which semantically similar objects are located close together and dissimilar objects far from one another.
We propose to build a more expressive representation by splitting the embedding space and the data hierarchically into smaller sub-parts.
arXiv Detail & Related papers (2021-09-09T02:57:34Z)
- MCL-GAN: Generative Adversarial Networks with Multiple Specialized Discriminators [47.19216713803009]
We propose a framework of generative adversarial networks with multiple discriminators.
We guide each discriminator to have expertise in a subset of the entire data.
Despite the use of multiple discriminators, the backbone networks are shared across the discriminators.
arXiv Detail & Related papers (2021-07-15T11:35:08Z)
- Camera-aware Proxies for Unsupervised Person Re-Identification [60.26031011794513]
This paper tackles the purely unsupervised person re-identification (Re-ID) problem that requires no annotations.
We propose to split each cluster into multiple proxies, where each proxy represents the instances coming from the same camera.
Based on the camera-aware proxies, we design both intra- and inter-camera contrastive learning components for our Re-ID model.
arXiv Detail & Related papers (2020-12-19T12:37:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.