Domain Private and Agnostic Feature for Modality Adaptive Face
Recognition
- URL: http://arxiv.org/abs/2008.03848v1
- Date: Mon, 10 Aug 2020 00:59:42 GMT
- Title: Domain Private and Agnostic Feature for Modality Adaptive Face
Recognition
- Authors: Yingguo Xu, Lei Zhang, Qingyan Duan
- Abstract summary: This paper proposes a Feature Aggregation Network (FAN), which includes a disentangled representation module (DRM), a feature fusion module (FFM) and an adaptive penalty metric (APM) learning session.
First, in DRM, two subnetworks, i.e., a domain-private network and a domain-agnostic network, are specially designed for learning modality features and identity features, respectively.
Second, in FFM, the identity features are fused with domain features to achieve cross-modal bi-directional identity feature transformation.
Third, considering that a distribution imbalance between easy and hard pairs exists in cross-modal datasets, identity-preserving guided metric learning with adaptive hard-pair penalization is proposed.
- Score: 10.497190559654245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Heterogeneous face recognition is a challenging task due to the large
modality discrepancy and insufficient cross-modal samples. Most existing works
focus on discriminative feature transformation, metric learning and cross-modal
face synthesis. However, the fact that cross-modal faces are always coupled by
domain (modality) and identity information has received little attention.
Therefore, how to learn and utilize the domain-private feature and
domain-agnostic feature for modality adaptive face recognition is the focus of
this work. Specifically, this paper proposes a Feature Aggregation Network
(FAN), which includes disentangled representation module (DRM), feature fusion
module (FFM) and adaptive penalty metric (APM) learning session. First, in DRM,
two subnetworks, i.e., a domain-private network and a domain-agnostic network, are
specially designed for learning modality features and identity features,
respectively. Second, in FFM, the identity features are fused with domain
features to achieve cross-modal bi-directional identity feature transformation,
which, to a large extent, further disentangles the modality information and
identity information. Third, considering that the distribution imbalance
between easy and hard pairs exists in cross-modal datasets, which increases the
risk of model bias, identity-preserving guided metric learning with
adaptive hard-pair penalization is proposed in our FAN. The proposed APM also
guarantees the cross-modality intra-class compactness and inter-class
separation. Extensive experiments on benchmark cross-modal face datasets show
that our FAN outperforms SOTA methods.
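The abstract specifies the DRM/FFM design only at block level. As a point of reference, here is a minimal PyTorch sketch of the disentangle-then-fuse idea; all module names, layer sizes, and the concatenation-based fusion are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DRM(nn.Module):
    """Disentangled representation sketch: split a face embedding into a
    domain-agnostic (identity) code and a domain-private (modality) code.
    Dimensions are hypothetical; the paper does not specify them."""
    def __init__(self, in_dim=512, id_dim=256, dom_dim=64):
        super().__init__()
        self.domain_agnostic = nn.Sequential(  # identity branch
            nn.Linear(in_dim, id_dim), nn.ReLU(), nn.Linear(id_dim, id_dim))
        self.domain_private = nn.Sequential(   # modality branch
            nn.Linear(in_dim, dom_dim), nn.ReLU(), nn.Linear(dom_dim, dom_dim))

    def forward(self, x):
        return self.domain_agnostic(x), self.domain_private(x)

class FFM(nn.Module):
    """Feature fusion sketch: recombine an identity code with a domain code
    so an identity can be rendered in either modality (bi-directional)."""
    def __init__(self, id_dim=256, dom_dim=64, out_dim=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(id_dim + dom_dim, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim))

    def forward(self, identity_code, domain_code):
        return self.fuse(torch.cat([identity_code, domain_code], dim=1))

# Cross-modal swap: pair a VIS identity code with the NIR domain code and
# vice versa, so fused features should match when identities match.
drm, ffm = DRM(), FFM()
vis, nir = torch.randn(8, 512), torch.randn(8, 512)  # backbone embeddings
id_v, dom_v = drm(vis)
id_n, dom_n = drm(nir)
vis_as_nir = ffm(id_v, dom_n)  # identity from VIS, modality from NIR
nir_as_vis = ffm(id_n, dom_v)  # identity from NIR, modality from VIS
```

The cross-modal swap at the end is the point of the FFM: if the disentanglement works, `vis_as_nir` should lie close to a genuine NIR embedding of the same person.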
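The APM is likewise described only at a high level: hard cross-modal pairs should be penalized more strongly so that abundant easy pairs do not dominate training. One common way to realize such adaptive penalization is to up-weight each pair by how badly it violates a margin; the sketch below shows that generic idea under assumed margins and weighting, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def adaptive_pair_loss(feat_a, feat_b, same_id, margin=0.3, gamma=2.0):
    """Margin-based pair loss with adaptive hard-pair penalization
    (generic illustration; margin, gamma, and the weighting are assumptions).
    Positive pairs are pulled together (intra-class compactness) and
    negative pairs pushed at least `margin` apart (inter-class separation);
    pairs that violate the margin more receive a larger weight, so scarce
    hard pairs are not drowned out by abundant easy ones.

    feat_a, feat_b: L2-normalized features from the two modalities
    same_id: bool tensor, True where the pair shares an identity
    """
    dist = F.pairwise_distance(feat_a, feat_b)        # cross-modal distances
    violation = torch.where(same_id, dist, F.relu(margin - dist))
    weight = (1.0 + violation).pow(gamma).detach()    # harder -> larger weight
    return (weight * violation).mean()

# toy usage with random features
a = F.normalize(torch.randn(16, 256), dim=1)
b = F.normalize(torch.randn(16, 256), dim=1)
labels = torch.randint(0, 2, (16,)).bool()
loss = adaptive_pair_loss(a, b, labels)
```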
Related papers
- Modality Prompts for Arbitrary Modality Salient Object Detection [57.610000247519196]
This paper delves into the task of arbitrary modality salient object detection (AM SOD).
It aims to detect salient objects from arbitrary modalities, e.g., RGB images, RGB-D images, and RGB-D-T images.
A novel modality-adaptive Transformer (MAT) is proposed to investigate two fundamental challenges of AM SOD.
arXiv Detail & Related papers (2024-05-06T11:02:02Z)
- Modality Unifying Network for Visible-Infrared Person Re-Identification [24.186989535051623]
Visible-infrared person re-identification (VI-ReID) is a challenging task due to large cross-modality discrepancies and intra-class variations.
Existing methods mainly focus on learning modality-shared representations by embedding different modalities into the same feature space.
We propose a novel Modality Unifying Network (MUN) to explore a robust auxiliary modality for VI-ReID.
arXiv Detail & Related papers (2023-09-12T14:22:22Z)
- Learning Cross-modality Information Bottleneck Representation for Heterogeneous Person Re-Identification [61.49219876388174]
Visible-Infrared person re-identification (VI-ReID) is an important and challenging task in intelligent video surveillance.
Existing methods mainly focus on learning a shared feature space to reduce the modality discrepancy between visible and infrared modalities.
We present a novel mutual information and modality consensus network, namely CMInfoNet, to extract modality-invariant identity features.
arXiv Detail & Related papers (2023-08-29T06:55:42Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach that mines cross-modal semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- CLIP-Driven Fine-grained Text-Image Person Re-identification [50.94827165464813]
Text-image person re-identification (TIReID) aims to retrieve the image corresponding to a given text query from a pool of candidate images.
We propose a CLIP-driven Fine-grained information excavation framework (CFine) to fully utilize the powerful knowledge of CLIP for TIReID (a plain CLIP retrieval baseline is sketched after this list).
arXiv Detail & Related papers (2022-10-19T03:43:12Z)
- A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition [7.80238628278552]
We propose a novel cross-modal fusion network based on self-attention and residual structure (CFN-SR) for multimodal emotion recognition (a generic sketch of this attention-plus-residual fusion pattern appears after this list).
To verify the effectiveness of the proposed method, we conduct experiments on the RAVDESS dataset.
The experimental results show that the proposed CFN-SR achieves the state-of-the-art and obtains 75.76% accuracy with 26.30M parameters.
arXiv Detail & Related papers (2021-11-03T12:24:03Z)
- MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification [35.97494894205023]
The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize images of the same identity between the visible modality and the infrared modality.
Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space.
We present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space.
arXiv Detail & Related papers (2021-10-21T16:45:23Z)
- CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification [38.96033760300123]
A cross-modality transformer-based method (CMTR) is proposed for the visible-infrared person re-identification task.
We design novel modality embeddings, which are fused with token embeddings to encode modality information (a minimal sketch of this idea appears after this list).
Our proposed CMTR model significantly outperforms existing CNN-based methods.
arXiv Detail & Related papers (2021-10-18T03:12:59Z)
- Exploring Modality-shared Appearance Features and Modality-invariant Relation Features for Cross-modality Person Re-Identification [72.95858515157603]
Cross-modality person re-identification works rely on discriminative modality-shared features.
Despite some initial success, such modality-shared appearance features cannot capture enough modality-invariant information.
A novel cross-modality quadruplet loss is proposed to further reduce the cross-modality variations (a sketch of the generic quadruplet-loss formulation appears after this list).
arXiv Detail & Related papers (2021-04-23T11:14:07Z)
- DF^2AM: Dual-level Feature Fusion and Affinity Modeling for RGB-Infrared Cross-modality Person Re-identification [18.152310122348393]
RGB-infrared person re-identification is a challenging task due to the intra-class variations and cross-modality discrepancy.
We propose a Dual-level (i.e., local and global) Feature Fusion (DF2) module that learns attention for discriminative features in a local-to-global manner.
To further mine the relationships between global features from person images, we propose an Affinity Modeling (AM) module.
arXiv Detail & Related papers (2021-04-01T03:12:56Z)
- Cross-modality Person re-identification with Shared-Specific Feature Transfer [112.60513494602337]
Cross-modality person re-identification (cm-ReID) is a challenging but key technology for intelligent video analysis.
We propose a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modality-specific characteristics.
arXiv Detail & Related papers (2020-02-28T00:18:45Z)
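For the CLIP-driven TIReID entry above, the sketch below shows a plain CLIP text-to-image retrieval baseline using the Hugging Face `transformers` API; CFine builds fine-grained modules on top of CLIP, which are not reproduced here.

```python
import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Plain CLIP retrieval baseline: rank candidate images by similarity to a
# text query. Dummy random images stand in for a real gallery.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

query = "a woman wearing a red coat and carrying a black backpack"
gallery = [Image.fromarray(np.uint8(np.random.rand(224, 224, 3) * 255))
           for _ in range(2)]

inputs = processor(text=[query], images=gallery,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)
scores = out.logits_per_text.squeeze(0)    # similarity of query to each image
ranking = scores.argsort(descending=True)  # retrieval order over the gallery
```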
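For the CFN-SR entry, cross-modal fusion through self-attention plus a residual connection is a recurring pattern across these papers. Below is a generic PyTorch illustration of that pattern with assumed dimensions; it is not the CFN-SR architecture itself.

```python
import torch
import torch.nn as nn

class AttnResidualFusion(nn.Module):
    """Generic cross-modal fusion: one modality attends to the other, and a
    residual connection preserves the original stream (assumed dims)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_feats, context_feats):
        # query_feats: (B, Tq, D), e.g. audio tokens; context_feats: (B, Tk, D)
        attended, _ = self.attn(query_feats, context_feats, context_feats)
        return self.norm(query_feats + attended)  # residual + norm

fusion = AttnResidualFusion()
audio = torch.randn(2, 10, 256)   # toy audio token sequence
video = torch.randn(2, 16, 256)   # toy visual token sequence
fused = fusion(audio, video)      # (2, 10, 256)
```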
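For the CMTR entry, fusing learned modality embeddings with token embeddings works much like positional embeddings. A minimal sketch of that general idea follows; the additive fusion and sizes are assumptions, not CMTR's exact design.

```python
import torch
import torch.nn as nn

class ModalityEmbedding(nn.Module):
    """Add a learned per-modality vector to every token embedding so the
    encoder can tell which modality a token came from (generic sketch)."""
    def __init__(self, num_modalities=2, dim=256):
        super().__init__()
        self.embed = nn.Embedding(num_modalities, dim)

    def forward(self, tokens, modality_id):
        # tokens: (B, T, D); modality_id: (B,) with 0=visible, 1=infrared
        return tokens + self.embed(modality_id)[:, None, :]  # broadcast over T

me = ModalityEmbedding()
vis_tokens = torch.randn(4, 49, 256)  # e.g. 7x7 patch tokens per image
out = me(vis_tokens, torch.zeros(4, dtype=torch.long))
```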
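For the cross-modality quadruplet loss entry, the sketch below implements the generic quadruplet loss of Chen et al. (CVPR 2017), which extends the triplet loss with a second term over a pair of negatives; the cross-modality variant in that paper differs in how pairs are sampled across modalities.

```python
import torch
import torch.nn.functional as F

def quadruplet_loss(anchor, positive, neg1, neg2, m1=0.3, m2=0.15):
    """Generic quadruplet loss (Chen et al., CVPR 2017 formulation):
    the usual triplet term plus a term pushing a pair of unrelated
    negatives farther apart than the positive pair. Margins are assumed.
    anchor/positive share an identity; neg1 and neg2 have two different
    identities, both different from the anchor's."""
    d_ap = F.pairwise_distance(anchor, positive)
    d_an = F.pairwise_distance(anchor, neg1)
    d_nn = F.pairwise_distance(neg1, neg2)
    triplet_term = F.relu(m1 + d_ap - d_an)  # anchor vs its own negative
    cross_term = F.relu(m2 + d_ap - d_nn)    # negatives among themselves
    return (triplet_term + cross_term).mean()

# toy usage: in the cross-modality setting, anchor and positive would come
# from different modalities (e.g., RGB anchor, infrared positive)
a, p, n1, n2 = (torch.randn(8, 256) for _ in range(4))
loss = quadruplet_loss(a, p, n1, n2)
```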