CoNAN: Conditional Neural Aggregation Network For Unconstrained Face
Feature Fusion
- URL: http://arxiv.org/abs/2307.10237v1
- Date: Sun, 16 Jul 2023 09:47:21 GMT
- Title: CoNAN: Conditional Neural Aggregation Network For Unconstrained Face
Feature Fusion
- Authors: Bhavin Jawade, Deen Dayal Mohan, Dennis Fedorishin, Srirangaraj
Setlur, Venu Govindaraju
- Abstract summary: We propose a feature distribution conditioning approach called CoNAN for template aggregation.
Specifically, our method aims to learn a context vector conditioned over the distribution information of the incoming feature set.
The proposed method produces state-of-the-art results on long-range unconstrained face recognition datasets.
- Score: 11.059590443280726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face recognition from image sets acquired under unregulated and uncontrolled
settings, such as at large distances, low resolutions, varying viewpoints,
illumination, pose, and atmospheric conditions, is challenging. Face feature
aggregation, which involves aggregating a set of N feature representations
present in a template into a single global representation, plays a pivotal role
in such recognition systems. Existing works in traditional face feature
aggregation either utilize metadata or high-dimensional intermediate feature
representations to estimate feature quality for aggregation. However,
generating high-quality metadata or style information is not feasible for
extremely low-resolution faces captured in long-range and high altitude
settings. To overcome these limitations, we propose a feature distribution
conditioning approach called CoNAN for template aggregation. Specifically, our
method aims to learn a context vector conditioned over the distribution
information of the incoming feature set, which is utilized to weigh the
features based on their estimated informativeness. The proposed method produces
state-of-the-art results on long-range unconstrained face recognition datasets
such as BTS, and DroneSURF, validating the advantages of such an aggregation
strategy.
Related papers
- High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity [69.32473738284374]
We propose DiffDIS, a diffusion-driven segmentation model that taps into the potential of the pre-trained U-Net within diffusion models.
By leveraging the robust generalization capabilities and rich, versatile image representation prior to the SD models, we significantly reduce the inference time while preserving high-fidelity, detailed generation.
Experiments on the DIS5K dataset demonstrate the superiority of DiffDIS, achieving state-of-the-art results through a streamlined inference process.
arXiv Detail & Related papers (2024-10-14T02:49:23Z) - Multi-Head Attention Residual Unfolded Network for Model-Based Pansharpening [2.874893537471256]
Unfolding fusion methods integrate the powerful representation capabilities of deep learning with the robustness of model-based approaches.
In this paper, we propose a model-based deep unfolded method for satellite image fusion.
Experimental results on PRISMA, Quickbird, and WorldView2 datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2024-09-04T13:05:00Z) - Trading-off Mutual Information on Feature Aggregation for Face
Recognition [12.803514943105657]
We propose a technique to aggregate the outputs of two state-of-the-art (SOTA) deep Face Recognition (FR) models.
In our approach, we leverage the transformer attention mechanism to exploit the relationship between different parts of two feature maps.
To evaluate the effectiveness of our proposed method, we conducted experiments on popular benchmarks and compared our results with state-of-the-art algorithms.
arXiv Detail & Related papers (2023-09-22T18:48:38Z) - SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form
Layout-to-Image Generation [68.42476385214785]
We propose a novel Spatial-Semantic Map Guided (SSMG) diffusion model that adopts the feature map, derived from the layout, as guidance.
SSMG achieves superior generation quality with sufficient spatial and semantic controllability compared to previous works.
We also propose the Relation-Sensitive Attention (RSA) and Location-Sensitive Attention (LSA) mechanisms.
arXiv Detail & Related papers (2023-08-20T04:09:12Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - ShaRP: Shape-Regularized Multidimensional Projections [71.30697308446064]
We present a novel projection technique - ShaRP - that provides users explicit control over the visual signature of the created scatterplot.
ShaRP scales well with dimensionality and dataset size, and generically handles any quantitative dataset.
arXiv Detail & Related papers (2023-06-01T11:16:58Z) - DuAT: Dual-Aggregation Transformer Network for Medical Image
Segmentation [21.717520350930705]
Transformer-based models have been widely demonstrated to be successful in computer vision tasks.
However, they are often dominated by features of large patterns leading to the loss of local details.
We propose a Dual-Aggregation Transformer Network called DuAT, which is characterized by two innovative designs.
Our proposed model outperforms state-of-the-art methods in the segmentation of skin lesion images, and polyps in colonoscopy images.
arXiv Detail & Related papers (2022-12-21T07:54:02Z) - Learning to Aggregate Multi-Scale Context for Instance Segmentation in
Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process.
The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid ( SCP), and hierarchical region of interest extractor (HRoIE)
arXiv Detail & Related papers (2021-11-22T08:55:25Z) - Hierarchical Deep CNN Feature Set-Based Representation Learning for
Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we desire to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z) - Multi-Person Pose Estimation with Enhanced Feature Aggregation and
Selection [33.15192824888279]
We propose a novel Enhanced Feature Aggregation and Selection network (EFASNet) for multi-person 2D human pose estimation.
Our method can well handle crowded, cluttered and occluded scenes.
Comprehensive experiments demonstrate that the proposed approach outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2020-03-20T08:33:25Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.