Modality Agnostic Heterogeneous Face Recognition with Switch Style Modulators
- URL: http://arxiv.org/abs/2407.08640v1
- Date: Thu, 11 Jul 2024 16:21:48 GMT
- Title: Modality Agnostic Heterogeneous Face Recognition with Switch Style Modulators
- Authors: Anjith George, Sebastien Marcel
- Abstract summary: We introduce a novel framework designed to train a modality-agnostic HFR method capable of handling multiple modalities during inference.
We achieve this by implementing a computationally efficient automatic routing mechanism called Switch Style Modulation Blocks (SSMB).
Our proposed SSMB can be trained end-to-end and seamlessly integrated into pre-trained face recognition models, transforming them into modality-agnostic HFR models.
- Score: 4.910937238451485
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Heterogeneous Face Recognition (HFR) systems aim to enhance the capability of face recognition in challenging cross-modal authentication scenarios. However, the significant domain gap between the source and target modalities poses a considerable challenge for cross-domain matching. Existing literature primarily focuses on developing HFR approaches for specific pairs of face modalities, necessitating the explicit training of models for each source-target combination. In this work, we introduce a novel framework designed to train a modality-agnostic HFR method capable of handling multiple modalities during inference, all without explicit knowledge of the target modality labels. We achieve this by implementing a computationally efficient automatic routing mechanism called Switch Style Modulation Blocks (SSMB) that trains various domain expert modulators which transform the feature maps adaptively reducing the domain gap. Our proposed SSMB can be trained end-to-end and seamlessly integrated into pre-trained face recognition models, transforming them into modality-agnostic HFR models. We have performed extensive evaluations on HFR benchmark datasets to demonstrate its effectiveness. The source code and protocols will be made publicly available.
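The abstract describes SSMB as an automatic routing mechanism that picks among trained domain-expert modulators to transform feature maps, without needing target-modality labels. The following is a minimal pure-Python sketch of that idea only; the gating rule, the affine form of the experts, and all names here are our assumptions, not the paper's implementation:

```python
import math
import random

random.seed(0)

def feature_stats(fmap):
    """Instance statistics (mean, std) of a flat feature map."""
    mean = sum(fmap) / len(fmap)
    var = sum((x - mean) ** 2 for x in fmap) / len(fmap)
    return mean, math.sqrt(var + 1e-5)

class SwitchStyleBlock:
    """Hypothetical sketch: route each input to one of several affine
    'expert' modulators based on its style statistics alone, so no
    modality label is ever required at inference time."""

    def __init__(self, n_experts=3):
        # Each expert is a (scale, shift) pair; randomly initialized here,
        # learned end-to-end in the actual method.
        self.experts = [(random.uniform(0.5, 1.5), random.uniform(-0.5, 0.5))
                        for _ in range(n_experts)]
        # Gating weights score the 2-D style vector (mean, std) per expert.
        self.gate = [(random.uniform(-1, 1), random.uniform(-1, 1))
                     for _ in range(n_experts)]

    def route(self, fmap):
        mu, sigma = feature_stats(fmap)
        scores = [w_mu * mu + w_sigma * sigma for w_mu, w_sigma in self.gate]
        return scores.index(max(scores))  # hard switch: argmax expert

    def forward(self, fmap):
        scale, shift = self.experts[self.route(fmap)]
        mu, sigma = feature_stats(fmap)
        # Normalize away the input's own style, then re-style it with
        # the chosen expert's affine parameters.
        return [scale * (x - mu) / sigma + shift for x in fmap]

block = SwitchStyleBlock()
out = block.forward([0.2, 1.4, -0.7, 0.9])
```

Because the router depends only on the feature statistics of the input itself, the same block can serve sketches, thermal, or NIR probes without being told which modality it is seeing.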
Related papers
- Modality Prompts for Arbitrary Modality Salient Object Detection [57.610000247519196]
This paper delves into the task of arbitrary modality salient object detection (AM SOD).
It aims to detect salient objects from arbitrary modalities, e.g., RGB images, RGB-D images, and RGB-D-T images.
A novel modality-adaptive Transformer (MAT) is proposed to investigate two fundamental challenges of AM SOD.
arXiv Detail & Related papers (2024-05-06T11:02:02Z) - Heterogeneous Face Recognition Using Domain Invariant Units [4.910937238451485]
We leverage a pretrained face recognition model as a teacher network to learn domain-invariant network layers called Domain-Invariant Units (DIU).
The proposed DIU can be trained effectively even with a limited amount of paired training data, in a contrastive distillation framework.
This proposed approach has the potential to enhance pretrained models, making them more adaptable to a wider range of variations in data.
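The DIU summary mentions a contrastive distillation framework that works with limited paired data. As a rough illustration of that training signal only (the embedding dimensions, temperature, and loss form below are our assumptions, not the paper's), an InfoNCE-style distillation loss pulls each student embedding toward its paired teacher embedding and pushes it away from the other pairs in the batch:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-8)

def contrastive_distillation_loss(student_embs, teacher_embs, tau=0.1):
    """Hypothetical sketch: each student embedding (e.g. from a thermal
    probe) should match its paired teacher embedding (from the visible
    image) and repel the other pairs in the batch."""
    loss = 0.0
    for i, s in enumerate(student_embs):
        sims = [cosine(s, t) / tau for t in teacher_embs]
        log_denom = math.log(sum(math.exp(x) for x in sims))
        loss += -(sims[i] - log_denom)  # -log softmax of the true pair
    return loss / len(student_embs)

# Toy batch of 2-D embeddings for three identity pairs.
students = [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.2]]
teachers = [[0.9, 0.0], [0.1, 1.1], [-1.0, 0.1]]
loss = contrastive_distillation_loss(students, teachers)
```

Only the student-side layers need gradients under such a loss, which is consistent with the summary's point that a small amount of paired data can adapt a frozen pretrained teacher.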
arXiv Detail & Related papers (2024-04-22T16:58:37Z) - From Modalities to Styles: Rethinking the Domain Gap in Heterogeneous Face Recognition [4.910937238451485]
We present a new Conditional Adaptive Instance Modulation (CAIM) module that seamlessly fits into existing Face Recognition networks.
The CAIM block modulates intermediate feature maps, efficiently adapting to the style of the source modality and bridging the domain gap.
We extensively evaluate the proposed approach on various challenging HFR benchmarks, showing that it outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-04-22T15:00:51Z) - MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z) - Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework, the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z) - Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Bridging the Gap: Heterogeneous Face Recognition with Conditional Adaptive Instance Modulation [7.665392786787577]
We introduce a novel Conditional Adaptive Instance Modulation (CAIM) module that can be integrated into pre-trained Face Recognition networks.
The CAIM block modulates intermediate feature maps to adapt to the style of the target modality, effectively bridging the domain gap.
Our proposed method allows for end-to-end training with a minimal number of paired samples.
arXiv Detail & Related papers (2023-07-13T19:17:04Z) - Deep face recognition with clustering based domain adaptation [57.29464116557734]
We propose a new clustering-based domain adaptation method designed for face recognition task in which the source and target domain do not share any classes.
Our method effectively learns discriminative target features by aligning the feature domain globally while, at the same time, distinguishing the target clusters locally.
arXiv Detail & Related papers (2022-05-27T12:29:11Z) - Domain Private and Agnostic Feature for Modality Adaptive Face Recognition [10.497190559654245]
This paper proposes a Feature Aggregation Network (FAN), which includes disentangled representation module (DRM), feature fusion module (FFM) and metric penalty learning session.
First, in the DRM, two networks, i.e., a domain-private network and a domain-agnostic network, are specially designed for learning modality features and identity features.
Second, in FFM, the identity features are fused with domain features to achieve cross-modal bi-directional identity feature transformation.
Third, considering that the distribution imbalance between easy and hard pairs exists in cross-modal datasets, the identity preserving guided metric learning with adaptive
arXiv Detail & Related papers (2020-08-10T00:59:42Z) - Learning Meta Face Recognition in Unseen Domains [74.69681594452125]
We propose a novel face recognition method via meta-learning named Meta Face Recognition (MFR).
MFR synthesizes the source/target domain shift with a meta-optimization objective.
We propose two benchmarks for generalized face recognition evaluation.
arXiv Detail & Related papers (2020-03-17T14:10:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.