Cross-modality Person re-identification with Shared-Specific Feature
Transfer
- URL: http://arxiv.org/abs/2002.12489v3
- Date: Thu, 12 Mar 2020 08:52:17 GMT
- Title: Cross-modality Person re-identification with Shared-Specific Feature
Transfer
- Authors: Yan Lu, Yue Wu, Bin Liu, Tianzhu Zhang, Baopu Li, Qi Chu and Nenghai
Yu
- Abstract summary: Cross-modality person re-identification (cm-ReID) is a challenging but key technology for intelligent video analysis.
We propose a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modality-specific characteristics.
- Score: 112.60513494602337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-modality person re-identification (cm-ReID) is a challenging but key
technology for intelligent video analysis. Existing works mainly focus on
learning a common representation by embedding different modalities into the
same feature space. However, learning only the common characteristics discards
substantial information, lowering the upper bound of feature distinctiveness.
In this
paper, we tackle the above limitation by proposing a novel cross-modality
shared-specific feature transfer algorithm (termed cm-SSFT) to explore the
potential of both the modality-shared information and the modality-specific
characteristics to boost the re-identification performance. We model the
affinities of different modality samples according to the shared features and
then transfer both shared and specific features among and across modalities. We
also propose a complementary feature learning strategy including modality
adaptation, project adversarial learning and reconstruction enhancement to learn
discriminative and complementary shared and specific features of each modality,
respectively. The entire cm-SSFT algorithm can be trained in an end-to-end
manner. We conducted comprehensive experiments to validate the superiority of
the overall algorithm and the effectiveness of each component. The proposed
algorithm significantly outperforms the state of the art by 22.5% and 19.3% mAP on
the two mainstream benchmark datasets SYSU-MM01 and RegDB, respectively.
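As a rough, hypothetical sketch of the transfer step described in the abstract (not the authors' implementation; the batch layout, top-k neighbourhood, and normalization are assumptions), the following PyTorch snippet models affinities on modality-shared features and propagates concatenated shared and specific features across a mixed-modality batch:

```python
import torch
import torch.nn.functional as F

def shared_specific_transfer(shared, specific, k=5):
    """Propagate shared + specific features along a shared-feature
    affinity graph (hypothetical sketch of the cm-SSFT transfer step).

    shared:   (N, Ds) modality-shared features for a mixed batch
    specific: (N, Dp) modality-specific features for the same batch
    """
    # Affinities are modeled on the shared features only, since they
    # are comparable across modalities.
    sim = F.normalize(shared, dim=1) @ F.normalize(shared, dim=1).t()  # (N, N)

    # Keep each sample's top-k neighbours and row-normalize, so every
    # sample aggregates information from its most related samples,
    # both within and across modalities.
    topk = sim.topk(k, dim=1)
    affinity = torch.zeros_like(sim).scatter_(1, topk.indices, topk.values)
    affinity = affinity / affinity.sum(dim=1, keepdim=True).clamp(min=1e-6)

    # Transfer both kinds of features along the graph.
    fused = torch.cat([shared, specific], dim=1)  # (N, Ds + Dp)
    return affinity @ fused                       # (N, Ds + Dp)

# Example: 4 RGB + 4 IR samples, 256-d shared and 128-d specific features.
shared = torch.randn(8, 256)
specific = torch.randn(8, 128)
print(shared_specific_transfer(shared, specific).shape)  # torch.Size([8, 384])
```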
Related papers
- GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse Function to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric encapsulates two formats: diagonal and block-diagonal terms.
Experiments on cross-modal and two extra uni-modal retrieval tasks have validated its superiority and flexibility.
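The diagonal and block-diagonal terms above suggest a structured learned metric. Below is a generic, hypothetical sketch of such a metric (a PSD block-diagonal quadratic form; the parametrization and block size are assumptions, not GSSF's actual formulation):

```python
import torch

class StructuredMetric(torch.nn.Module):
    """Squared distance d(x, y) = (x - y)^T W (x - y) with W constrained
    to be block-diagonal (diagonal is the special case block_size=1).
    Hypothetical sketch of a diagonal / block-diagonal structured metric."""

    def __init__(self, dim, block_size=1):
        super().__init__()
        assert dim % block_size == 0
        self.num_blocks = dim // block_size
        self.block_size = block_size
        # One learnable matrix per block; W = B^T B keeps each block PSD.
        self.blocks = torch.nn.Parameter(
            torch.randn(self.num_blocks, block_size, block_size) * 0.1
        )

    def forward(self, x, y):
        d = (x - y).view(-1, self.num_blocks, self.block_size)  # (N, nb, bs)
        W = self.blocks.transpose(1, 2) @ self.blocks            # PSD blocks
        # Quadratic form per block, summed over blocks.
        return torch.einsum('nbi,bij,nbj->n', d, W, d)

metric = StructuredMetric(dim=8, block_size=2)
x, y = torch.randn(4, 8), torch.randn(4, 8)
print(metric(x, y).shape)  # torch.Size([4])
```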
arXiv Detail & Related papers (2024-10-20T03:45:50Z)
- Cross-Modality Gait Recognition: Bridging LiDAR and Camera Modalities for Human Identification [8.976513021790984]
We present CrossGait, which is capable of cross-modal retrieval across diverse data modalities.
We propose a Prototypical Modality-shared Attention Module that learns modality-shared features from two modality-specific features.
We also design a Cross-modality Feature Adapter that transforms the learned modality-specific features into a unified feature space.
arXiv Detail & Related papers (2024-04-04T10:12:55Z)
- Deep Common Feature Mining for Efficient Video Semantic Segmentation [29.054945307605816]
We present Deep Common Feature Mining (DCFM) for video semantic segmentation.
DCFM explicitly decomposes features into two complementary components.
We show that our method has a superior balance between accuracy and efficiency.
arXiv Detail & Related papers (2024-03-05T06:17:59Z)
- On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification [66.58450185833479]
In this paper, we exploit Pose Estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework.
By jointly training these two tasks in a mutually beneficial manner, our model learns higher quality modality-shared and ID-related features.
Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently outperforms state-of-the-art methods by significant margins.
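A minimal sketch of the joint-training pattern this entry describes, with pose estimation as an auxiliary head on a shared backbone (the toy backbone, heads, and loss weight are assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class ReIDWithPoseAux(nn.Module):
    """Shared backbone with a ReID head and an auxiliary pose head;
    hypothetical sketch of auxiliary-task training for VI-ReID."""

    def __init__(self, feat_dim=512, num_ids=395, num_joints=17):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        self.id_head = nn.Linear(feat_dim, num_ids)           # identity logits
        self.pose_head = nn.Linear(feat_dim, num_joints * 2)  # (x, y) per joint

    def forward(self, images):
        feat = self.backbone(images)
        return self.id_head(feat), self.pose_head(feat)

model = ReIDWithPoseAux()
images = torch.randn(4, 3, 64, 32)
id_logits, joints = model(images)
ids = torch.randint(0, 395, (4,))
gt_joints = torch.randn(4, 17 * 2)

# Jointly optimize both tasks; lambda_pose balances the auxiliary loss.
lambda_pose = 0.5
loss = nn.functional.cross_entropy(id_logits, ids) \
       + lambda_pose * nn.functional.mse_loss(joints, gt_joints)
loss.backward()
```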
arXiv Detail & Related papers (2022-01-11T09:44:00Z)
- A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition [7.80238628278552]
We propose a novel cross-modal fusion network based on self-attention and residual structure (CFN-SR) for multimodal emotion recognition.
To verify the effectiveness of the proposed method, we conduct experiments on the RAVDESS dataset.
The experimental results show that the proposed CFN-SR achieves state-of-the-art performance, obtaining 75.76% accuracy with 26.30M parameters.
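As a generic illustration of attention-plus-residual cross-modal fusion (a cross-attention variant; the dimensions, single block, and module names are assumptions, not CFN-SR's actual design):

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse audio and video features with multi-head attention and a
    residual connection; hypothetical sketch of the CFN-SR idea."""

    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, audio, video):
        # Audio tokens attend to video tokens; the residual path keeps
        # the original audio information intact.
        fused, _ = self.attn(query=audio, key=video, value=video)
        return self.norm(audio + fused)

audio = torch.randn(2, 10, 128)  # (batch, tokens, dim)
video = torch.randn(2, 16, 128)
print(AttentionFusion()(audio, video).shape)  # torch.Size([2, 10, 128])
```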
arXiv Detail & Related papers (2021-11-03T12:24:03Z)
- MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification [35.97494894205023]
The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize images of the same identity across the visible and infrared modalities.
Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space.
We present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space.
arXiv Detail & Related papers (2021-10-21T16:45:23Z)
- Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection.
Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps.
Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
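A toy sketch of the two-specific-plus-one-shared layout described above (the stub networks and the additive fusion are assumptions; the real SP-Net components are far more elaborate):

```python
import torch
import torch.nn as nn

def tiny_net():
    # Toy stand-in for a full saliency branch.
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 1, 1))

rgb_net, depth_net, shared_net = tiny_net(), tiny_net(), tiny_net()

rgb = torch.randn(1, 3, 64, 64)
depth = torch.randn(1, 3, 64, 64)  # depth replicated to 3 channels

# Modality-specific maps preserve each modality's own cues...
rgb_map, depth_map = rgb_net(rgb), depth_net(depth)
# ...while the shared network sees both inputs and yields a shared map.
shared_map = shared_net(rgb) + shared_net(depth)

# A simple fusion of individual and shared predictions.
saliency = torch.sigmoid(rgb_map + depth_map + shared_map)
print(saliency.shape)  # torch.Size([1, 1, 64, 64])
```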
arXiv Detail & Related papers (2021-08-18T14:14:22Z)
- Exploring Modality-shared Appearance Features and Modality-invariant Relation Features for Cross-modality Person Re-Identification [72.95858515157603]
Cross-modality person re-identification works rely on discriminative modality-shared features.
Despite some initial success, such modality-shared appearance features cannot capture enough modality-invariant information.
A novel cross-modality quadruplet loss is proposed to further reduce the cross-modality variations.
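To show the general shape of a quadruplet objective (this is the standard formulation, not the paper's cross-modality variant; the margins and sampling scheme are assumptions):

```python
import torch
import torch.nn.functional as F

def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1=0.3, margin2=0.15):
    """Generic quadruplet loss: besides the usual anchor/positive/negative
    triplet term, a second term pushes apart two negatives that belong to
    different identities. Hypothetical sketch, not the paper's exact loss."""
    d_ap = F.pairwise_distance(anchor, positive)
    d_an = F.pairwise_distance(anchor, negative1)
    d_nn = F.pairwise_distance(negative1, negative2)
    triplet_term = F.relu(d_ap - d_an + margin1)
    quadruplet_term = F.relu(d_ap - d_nn + margin2)
    return (triplet_term + quadruplet_term).mean()

feats = [torch.randn(8, 256) for _ in range(4)]
print(quadruplet_loss(*feats))
```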
arXiv Detail & Related papers (2021-04-23T11:14:07Z)
- Domain Private and Agnostic Feature for Modality Adaptive Face Recognition [10.497190559654245]
This paper proposes a Feature Aggregation Network (FAN), which includes a disentangled representation module (DRM), a feature fusion module (FFM) and a metric penalty learning session.
First, in the DRM, two networks, i.e., a domain-private network and a domain-agnostic network, are specially designed for learning modality features and identity features, respectively.
Second, in FFM, the identity features are fused with domain features to achieve cross-modal bi-directional identity feature transformation.
Third, considering that a distribution imbalance between easy and hard pairs exists in cross-modal datasets, the identity-preserving guided metric learning with adaptive
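The entry above is truncated mid-sentence; as a hypothetical sketch of the disentangle-then-fuse pattern its first two steps describe (all module shapes and names are assumptions):

```python
import torch
import torch.nn as nn

# Hypothetical sketch of disentangle-then-fuse: separate private (modality)
# and agnostic (identity) encoders, then fuse identity features with
# domain features for a cross-modal identity feature transformation.
feat_dim, private_dim, agnostic_dim = 256, 64, 128

domain_private = nn.Linear(feat_dim, private_dim)    # modality features
domain_agnostic = nn.Linear(feat_dim, agnostic_dim)  # identity features
fusion = nn.Linear(private_dim + agnostic_dim, agnostic_dim)

x_nir = torch.randn(4, feat_dim)  # e.g. near-infrared face features
x_vis = torch.randn(4, feat_dim)  # visible-light face features

identity_nir = domain_agnostic(x_nir)
modality_vis = domain_private(x_vis)

# Fusing NIR identity features with VIS domain features approximates the
# NIR -> VIS direction of the bi-directional transformation.
identity_as_vis = fusion(torch.cat([modality_vis, identity_nir], dim=1))
print(identity_as_vis.shape)  # torch.Size([4, 128])
```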
arXiv Detail & Related papers (2020-08-10T00:59:42Z)
- Modality Compensation Network: Cross-Modal Adaptation for Action Recognition [77.24983234113957]
We propose a Modality Compensation Network (MCN) to explore the relationships of different modalities.
Our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning.
Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.
arXiv Detail & Related papers (2020-01-31T04:51:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.