Trading-off Mutual Information on Feature Aggregation for Face
Recognition
- URL: http://arxiv.org/abs/2309.13137v1
- Date: Fri, 22 Sep 2023 18:48:38 GMT
- Authors: Mohammad Akyash, Ali Zafari, Nasser M. Nasrabadi
- Abstract summary: We propose a technique to aggregate the outputs of two state-of-the-art (SOTA) deep Face Recognition (FR) models.
In our approach, we leverage the transformer attention mechanism to exploit the relationship between different parts of two feature maps.
To evaluate the effectiveness of our proposed method, we conducted experiments on popular benchmarks and compared our results with state-of-the-art algorithms.
- Score: 12.803514943105657
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the advances in the field of Face Recognition (FR), the precision of
these methods is not yet sufficient. To improve the FR performance, this paper
proposes a technique to aggregate the outputs of two state-of-the-art (SOTA)
deep FR models, namely ArcFace and AdaFace. In our approach, we leverage the
transformer attention mechanism to exploit the relationship between different
parts of two feature maps. By doing so, we aim to enhance the overall
discriminative power of the FR system. One of the challenges in feature
aggregation is the effective modeling of both local and global dependencies.
Conventional transformers are known for their ability to capture long-range
dependencies, but they often struggle with modeling local dependencies
accurately. To address this limitation, we augment the self-attention mechanism
to capture both local and global dependencies effectively. This allows our
model to take advantage of the overlapping receptive fields present in
corresponding locations of the feature maps. However, fusing two feature maps
from different FR models might introduce redundancies to the face embedding.
Since these models often share identical backbone architectures, the resulting
feature maps may contain overlapping information, which can mislead the
training process. To overcome this problem, we leverage the principle of
Information Bottleneck to obtain a maximally informative facial representation.
This ensures that the aggregated features retain the most relevant and
discriminative information while minimizing redundant or misleading details. To
evaluate the effectiveness of our proposed method, we conducted experiments on
popular benchmarks and compared our results with state-of-the-art algorithms.
The consistent improvement we observed in these benchmarks demonstrates the
efficacy of our approach in enhancing FR performance.
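The abstract describes fusing two feature maps with attention that captures both local and global dependencies. As a rough illustration only, and not the authors' architecture, the NumPy sketch below cross-attends one feature map to another and averages a global attention map with a locally masked one. The function names (`fuse`, `local_mask`), the window `radius`, the 0.5 mixing weight, and the absence of learned projections are all assumptions made for this example; the Information Bottleneck term from the abstract is not modeled here.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; -inf entries become exactly zero weight.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def local_mask(h, w, radius):
    # Boolean (h*w, h*w) mask: True where two grid positions are within
    # `radius` of each other in Chebyshev distance (hypothetical local window).
    ys, xs = np.divmod(np.arange(h * w), w)
    dy = np.abs(ys[:, None] - ys[None, :])
    dx = np.abs(xs[:, None] - xs[None, :])
    return np.maximum(dy, dx) <= radius


def fuse(feat_a, feat_b, radius=1):
    # feat_a, feat_b: (h, w, c) feature maps from two hypothetical backbones.
    # Queries come from feat_a; keys and values come from feat_b.
    h, w, c = feat_a.shape
    q = feat_a.reshape(h * w, c)
    k = v = feat_b.reshape(h * w, c)

    scores = q @ k.T / np.sqrt(c)
    global_attn = softmax(scores)  # every position attends everywhere
    local_attn = softmax(np.where(local_mask(h, w, radius), scores, -np.inf))

    # Assumed equal mix of global and local attention, plus a residual.
    attn = 0.5 * (global_attn + local_attn)
    fused = q + attn @ v
    return fused.reshape(h, w, c), attn


rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4, 8))
b = rng.normal(size=(4, 4, 8))
fused, attn = fuse(a, b)
print(fused.shape, attn.shape)
```

In a trained model the queries, keys, and values would pass through learned linear projections and the mixing weight would likely be learned as well; the fixed values here only make the local-versus-global distinction visible.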
Related papers
- FedDr+: Stabilizing Dot-regression with Global Feature Distillation for Federated Learning [27.782676760198697]
Federated Learning (FL) has emerged as a pivotal framework for the development of effective global models.
A key challenge in FL is client drift, where data heterogeneity impedes the aggregation of scattered knowledge.
We introduce a novel algorithm named FedDr+, which empowers local model alignment using dot-regression loss.
arXiv Detail & Related papers (2024-06-04T14:34:13Z)
- Diffusion Models Without Attention [110.5623058129782]
Diffusion State Space Model (DiffuSSM) is an architecture that supplants attention mechanisms with a more scalable state space model backbone.
Our focus on FLOP-efficient architectures in diffusion training marks a significant step forward.
arXiv Detail & Related papers (2023-11-30T05:15:35Z)
- Feature Decoupling-Recycling Network for Fast Interactive Segmentation [79.22497777645806]
Recent interactive segmentation methods iteratively take the source image, user guidance, and the previously predicted mask as input.
We propose the Feature Decoupling-Recycling Network (FDRN), which decouples the modeling components based on their intrinsic discrepancies.
arXiv Detail & Related papers (2023-08-07T12:26:34Z)
- CoNAN: Conditional Neural Aggregation Network For Unconstrained Face Feature Fusion [11.059590443280726]
We propose a feature distribution conditioning approach called CoNAN for template aggregation.
Specifically, our method aims to learn a context vector conditioned over the distribution information of the incoming feature set.
The proposed method produces state-of-the-art results on long-range unconstrained face recognition datasets.
arXiv Detail & Related papers (2023-07-16T09:47:21Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks [3.7384509727711923]
We introduce a pairwise feature for deep stereo matching networks, named LSP (Local Similarity Pattern).
By explicitly revealing neighbor relationships, LSP captures rich structural information that can be leveraged for more discriminative feature description.
Secondly, we design a dynamic self-reassembling refinement strategy and apply it to the cost distribution and the disparity map respectively.
arXiv Detail & Related papers (2021-12-02T06:52:54Z)
- Light Field Saliency Detection with Dual Local Graph Learning and Reciprocative Guidance [148.9832328803202]
We model the information fusion within the focal stack via graph networks.
We build a novel dual graph model to guide the focal stack fusion process using all-focus patterns.
arXiv Detail & Related papers (2021-10-02T00:54:39Z)
- Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition [80.35852245488043]
We propose a CNN based architecture enhanced with multiple branches formed by radial basis function (RBF) units.
RBF units capture local patterns shared by similar instances using an intermediate representation.
We show that it is the incorporation of local information that makes the proposed model competitive.
arXiv Detail & Related papers (2021-09-07T21:05:56Z)
- Video Salient Object Detection via Adaptive Local-Global Refinement [7.723369608197167]
Video salient object detection (VSOD) is an important task in many vision applications.
We propose an adaptive local-global refinement framework for VSOD.
We show that our weighting methodology can further exploit the feature correlations, thus driving the network to learn more discriminative feature representation.
arXiv Detail & Related papers (2021-04-29T14:14:11Z)
- Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we aim to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z)
- Relational Deep Feature Learning for Heterogeneous Face Recognition [17.494718795454055]
We propose a graph-structured module called Graph Module (NIR) that extracts global relational information in addition to general facial features.
The proposed method outperforms other state-of-the-art methods on five Heterogeneous Face Recognition (HFR) databases.
arXiv Detail & Related papers (2020-03-02T07:35:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.