2D-3D Attention and Entropy for Pose Robust 2D Facial Recognition
- URL: http://arxiv.org/abs/2505.09073v1
- Date: Wed, 14 May 2025 02:17:53 GMT
- Title: 2D-3D Attention and Entropy for Pose Robust 2D Facial Recognition
- Authors: J. Brennan Peace, Shuowen Hu, Benjamin S. Riggan
- Abstract summary: We propose a novel domain adaptive framework to facilitate improved performance across large discrepancies in pose by enabling image-based (2D) representations to infer properties of pose-invariant point cloud (3D) representations. Our proposed framework achieves profile TAR @ 1% FAR improvements of at least 7.1% by leveraging shared attention between the 2D and 3D representations.
- Score: 3.1632426898254224
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Despite recent advances in facial recognition, there remains a fundamental issue concerning degradations in performance due to substantial perspective (pose) differences between enrollment and query (probe) imagery. Therefore, we propose a novel domain adaptive framework to facilitate improved performance across large discrepancies in pose by enabling image-based (2D) representations to infer properties of inherently pose invariant point cloud (3D) representations. Specifically, our proposed framework achieves better pose invariance by using (1) a shared (joint) attention mapping to emphasize common patterns that are most correlated between 2D facial images and 3D facial data and (2) a joint entropy regularizing loss to promote better consistency, enhancing correlations among the intersecting 2D and 3D representations, by leveraging both attention maps. This framework is evaluated on the FaceScape and ARL-VTF datasets, where it outperforms competitive methods by achieving profile (90°+) TAR @ 1% FAR improvements of at least 7.1% and 1.57%, respectively.
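Both components are easy to prototype. Below is a minimal PyTorch sketch, not the authors' implementation: the token shapes, the mean-pooled attention maps, and the reading of the joint entropy loss as the entropy of the softmax-normalized 2D x 3D correlation matrix are all illustrative assumptions.

```python
import torch

def shared_attention(f2d: torch.Tensor, f3d: torch.Tensor):
    """f2d: (B, N, D) 2D image tokens; f3d: (B, M, D) 3D point-cloud tokens."""
    # Scaled correlation between every 2D token and every 3D token.
    corr = torch.einsum("bnd,bmd->bnm", f2d, f3d) / f2d.shape[-1] ** 0.5
    attn_2d = corr.softmax(dim=2).mean(dim=2)  # (B, N): emphasis per 2D token
    attn_3d = corr.softmax(dim=1).mean(dim=1)  # (B, M): emphasis per 3D token
    return corr, attn_2d, attn_3d

def joint_entropy_loss(corr: torch.Tensor) -> torch.Tensor:
    # Treat the normalized correlations as a joint distribution over
    # (2D token, 3D token) pairs; low entropy means sharp, consistent
    # cross-modal correlations.
    p = corr.flatten(1).softmax(dim=1)  # (B, N*M)
    return -(p * (p + 1e-8).log()).sum(dim=1).mean()

# Usage with random stand-in features from hypothetical 2D/3D encoders.
f2d, f3d = torch.randn(4, 49, 256), torch.randn(4, 128, 256)
corr, a2d, a3d = shared_attention(f2d, f3d)
loss = joint_entropy_loss(corr)
```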
Related papers
- DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image [98.29284902879652]
We present DICE, the first end-to-end method for Deformation-aware hand-face Interaction reCovEry from a single image.
It disentangles the regression of local deformation fields and global mesh locations into two network branches.
It achieves state-of-the-art performance on a standard benchmark and on in-the-wild data in terms of accuracy and physical plausibility.
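A hedged sketch of the stated two-branch design follows: one branch regresses a local per-vertex deformation field, the other global mesh locations. The backbone, layer sizes, and vertex count are assumptions, not DICE's actual architecture.

```python
import torch
import torch.nn as nn

class TwoBranchRecovery(nn.Module):
    """Disentangles local deformation from global placement, per the summary."""
    def __init__(self, feat_dim: int = 512, n_verts: int = 778):
        super().__init__()
        self.n_verts = n_verts
        self.backbone = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())
        self.deform_branch = nn.Linear(feat_dim, n_verts * 3)  # per-vertex offsets
        self.global_branch = nn.Linear(feat_dim, n_verts * 3)  # coarse mesh positions

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        h = self.backbone(image_feats.flatten(1))
        deform = self.deform_branch(h).view(-1, self.n_verts, 3)
        coarse = self.global_branch(h).view(-1, self.n_verts, 3)
        return coarse + deform  # deformation-aware mesh

mesh = TwoBranchRecovery()(torch.randn(2, 2048))  # (2, 778, 3)
```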
arXiv Detail & Related papers (2024-06-26T00:08:29Z) - PostoMETRO: Pose Token Enhanced Mesh Transformer for Robust 3D Human Mesh Recovery [20.763457281944834]
We present PostoMETRO, which integrates 2D pose representation into transformers in a token-wise manner.
We are able to produce more precise 3D coordinates, even under extreme scenarios like occlusion.
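Token-wise integration of a 2D pose can be sketched as embedding each detected joint as an extra token and letting a standard transformer attend across image and pose tokens; all sizes below are assumptions rather than PostoMETRO's configuration.

```python
import torch
import torch.nn as nn

n_joints, d = 17, 256
pose_embed = nn.Linear(2, d)  # one token per detected 2D joint
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=8, batch_first=True),
    num_layers=4,
)

img_tokens = torch.randn(2, 196, d)      # e.g. 14x14 patch features
joints_2d = torch.rand(2, n_joints, 2)   # detector output, normalized coords
tokens = torch.cat([img_tokens, pose_embed(joints_2d)], dim=1)
out = encoder(tokens)  # pose tokens and image tokens attend to each other
```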
arXiv Detail & Related papers (2024-03-19T06:18:25Z) - Learning Naturally Aggregated Appearance for Efficient 3D Editing [90.57414218888536]
We learn the color field as an explicit 2D appearance aggregation, also called a canonical image.
We complement the canonical image with a projection field that maps 3D points onto 2D pixels for texture query.
Our approach demonstrates remarkable efficiency, being at least 20 times faster per edit than existing NeRF-based editing methods.
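The canonical-image idea fits in a few lines: appearance lives in a single 2D image, and a projection field maps each 3D point to a UV location where the color is sampled. The toy projection below (dropping z) stands in for the learned field and is purely an assumption.

```python
import torch
import torch.nn.functional as F

canonical = torch.rand(1, 3, 256, 256)  # explicit 2D appearance aggregation

def query_color(points_3d: torch.Tensor, project) -> torch.Tensor:
    """points_3d: (N, 3); project maps (N, 3) -> (N, 2) UV coords in [-1, 1]."""
    uv = project(points_3d).view(1, -1, 1, 2)
    color = F.grid_sample(canonical, uv, align_corners=True)  # texture query
    return color.view(3, -1).t()  # (N, 3) RGB per 3D point

# Toy projection field: orthographic drop of the z coordinate.
colors = query_color(torch.rand(100, 3) * 2 - 1, lambda p: p[:, :2])
```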
arXiv Detail & Related papers (2023-12-11T18:59:31Z) - Two Views Are Better than One: Monocular 3D Pose Estimation with Multiview Consistency [0.493599216374976]
We introduce a novel loss function, consistency loss, which operates on two synchronized views.
Our consistency loss substantially improves performance for fine-tuning without requiring 3D data.
We show that using our consistency loss can yield state-of-the-art performance when training models from scratch in a semi-supervised manner.
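In spirit, such a consistency loss only needs the relative camera pose between the two synchronized views: 3D predictions made independently from each view should coincide once expressed in a common frame. The sketch below is an assumption about the form of the loss, not the paper's exact formulation.

```python
import torch

def consistency_loss(pose_a: torch.Tensor, pose_b: torch.Tensor,
                     R: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """pose_a, pose_b: (J, 3) predictions from views A and B; R, t map B to A."""
    pose_b_in_a = pose_b @ R.t() + t
    return ((pose_a - pose_b_in_a) ** 2).mean()  # agreement needs no 3D labels

loss = consistency_loss(torch.randn(17, 3), torch.randn(17, 3),
                        torch.eye(3), torch.zeros(3))
```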
arXiv Detail & Related papers (2023-11-21T08:21:55Z) - A Single 2D Pose with Context is Worth Hundreds for 3D Human Pose Estimation [18.72362803593654]
The dominant paradigm in 3D human pose estimation that lifts a 2D pose sequence to 3D heavily relies on long-term temporal clues.
This can be attributed to their inherent inability to perceive spatial context, as plain 2D joint coordinates carry no visual cues.
We propose a straightforward yet powerful solution: leveraging the readily available intermediate visual representations produced by off-the-shelf (pre-trained) 2D pose detectors.
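One way to realize this is to sample the detector's intermediate feature map at each joint location and lift the enriched per-joint vectors instead of bare coordinates. The shapes and the two-layer lifter below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

feat = torch.randn(1, 64, 64, 64)         # intermediate features of a 2D detector
joints_2d = torch.rand(1, 17, 2) * 2 - 1  # detected joints in [-1, 1] coords

# Sample a 64-dim visual descriptor at every joint location.
sampled = F.grid_sample(feat, joints_2d.view(1, 17, 1, 2), align_corners=True)
per_joint = torch.cat([joints_2d, sampled.view(1, 64, 17).transpose(1, 2)], dim=2)

lifter = nn.Sequential(nn.Linear(66, 256), nn.ReLU(), nn.Linear(256, 3))
pose_3d = lifter(per_joint)  # (1, 17, 3): coordinates plus visual context
```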
arXiv Detail & Related papers (2023-11-06T18:04:13Z) - Unpaired Multi-domain Attribute Translation of 3D Facial Shapes with a Square and Symmetric Geometric Map [23.461476902880584]
We propose a learning framework for 3D facial attribute translation.
We use a novel geometric map for 3D shape representation and embed it in an end-to-end generative adversarial network.
We employ a unified and unpaired learning framework for multi-domain attribute translation.
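The recipe reduces to ordinary image-to-image translation once the face is expressed as a square geometry image, i.e. a 2D grid whose channels store xyz coordinates. The toy residual generator below is an assumption; the square, symmetric geometric map itself is the paper's contribution and is taken as given here.

```python
import torch
import torch.nn as nn

geom_image = torch.randn(1, 3, 64, 64)  # xyz coordinates per grid cell

generator = nn.Sequential(              # toy translator; a real GAN adds a discriminator
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
translated = geom_image + generator(geom_image)  # residual attribute edit on the shape
```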
arXiv Detail & Related papers (2023-08-25T08:37:55Z) - Learning from Abstract Images: on the Importance of Occlusion in a Minimalist Encoding of Human Poses [0.0]
2D-to-3D representations suffer from poor performance in cross-dataset benchmarks.
We propose a novel representation that incorporates 3D information in its encoding.
The result allows us to predict poses that are completely independent of camera viewpoint.
arXiv Detail & Related papers (2023-07-19T10:45:49Z) - MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling [59.74064212110042]
MPM can handle multiple tasks including 3D human pose estimation, 3D pose estimation from occluded 2D pose, and 3D pose completion in a single framework.
We conduct extensive experiments and ablation studies on several widely used human pose datasets and achieve state-of-the-art performance on MPI-INF-3DHP.
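Masked pose modeling can be sketched as BERT-style masking over joint tokens: hide a subset of joints, encode, and reconstruct them, so occluded-pose estimation and pose completion share one objective. The mask ratio and sizes are assumptions.

```python
import torch
import torch.nn as nn

d, n_joints = 256, 17
embed = nn.Linear(3, d)                  # 2D inputs can be zero-padded to 3D
mask_token = nn.Parameter(torch.zeros(1, 1, d))
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=8, batch_first=True), num_layers=4)
head = nn.Linear(d, 3)

pose = torch.randn(2, n_joints, 3)
mask = torch.rand(2, n_joints, 1) < 0.4  # hide ~40% of the joints
tokens = torch.where(mask, mask_token.expand(2, n_joints, d), embed(pose))
recon = head(encoder(tokens))
loss = ((recon - pose)[mask.expand_as(pose)] ** 2).mean()  # reconstruct masked joints
```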
arXiv Detail & Related papers (2023-06-29T10:30:00Z) - CheckerPose: Progressive Dense Keypoint Localization for Object Pose
Estimation with Graph Neural Network [66.24726878647543]
Estimating the 6-DoF pose of a rigid object from a single RGB image is a crucial yet challenging task.
Recent studies have shown the great potential of dense correspondence-based solutions.
We propose a novel pose estimation algorithm named CheckerPose, which improves on three main aspects.
arXiv Detail & Related papers (2023-03-29T17:30:53Z) - RiCS: A 2D Self-Occlusion Map for Harmonizing Volumetric Objects [68.85305626324694]
Ray-marching in Camera Space (RiCS) is a new method that represents the self-occlusions of 3D foreground objects as a 2D self-occlusion map.
We show that our representation map not only allows us to enhance the image quality but also to model temporally coherent complex shadow effects.
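The map can be sketched by marching each camera ray from the camera toward the known surface depth and accumulating how much of the object's own volume lies in between. The occupancy function and step count below are assumptions.

```python
import torch

def self_occlusion_map(depth: torch.Tensor, occupancy, n_steps: int = 32):
    """depth: (H, W) per-pixel surface depth; occupancy(z) -> (H, W) in [0, 1]."""
    occ = torch.zeros_like(depth)
    for i in range(1, n_steps + 1):
        z = depth * i / (n_steps + 1)   # sample points strictly nearer than the surface
        occ = occ + occupancy(z)
    return (occ / n_steps).clamp(0, 1)  # fraction of the ray blocked by the object

# Toy occupancy: a slab between depths 0.3 and 0.5 everywhere.
occ_map = self_occlusion_map(torch.full((64, 64), 1.0),
                             lambda z: ((z > 0.3) & (z < 0.5)).float())
```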
arXiv Detail & Related papers (2022-05-14T05:35:35Z) - KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative
Spatial Encoding of Keypoints [28.234772596912165]
We propose a highly effective approach to modeling high-fidelity volumetric avatars from sparse views.
One of the key ideas is to encode relative spatial 3D information via sparse 3D keypoints.
Our experiments show that a majority of errors in prior work stem from an inappropriate choice of spatial encoding.
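Relative spatial encoding is simple to write down: describe every query point by its offsets (and distances) to a sparse set of 3D keypoints rather than by raw world coordinates, which makes the code invariant to global placement. Shapes below are assumptions.

```python
import torch

def relative_encoding(query: torch.Tensor, keypoints: torch.Tensor) -> torch.Tensor:
    """query: (N, 3) sample points; keypoints: (K, 3), e.g. facial landmarks."""
    rel = query[:, None, :] - keypoints[None, :, :]   # (N, K, 3) offsets
    dist = rel.norm(dim=-1, keepdim=True)             # (N, K, 1) distances
    return torch.cat([rel, dist], dim=-1).flatten(1)  # (N, 4K) placement-free code

code = relative_encoding(torch.randn(1024, 3), torch.randn(13, 3))
```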
arXiv Detail & Related papers (2022-05-10T15:57:03Z) - Non-Local Latent Relation Distillation for Self-Adaptive 3D Human Pose
Estimation [63.199549837604444]
3D human pose estimation approaches leverage different forms of strong (2D/3D pose) or weak (multi-view or depth) paired supervision.
We cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target.
We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-05T03:52:57Z) - Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A
Geometric Approach [76.10879433430466]
We propose to estimate 3D human pose from multi-view images and a few IMUs attached to a person's limbs.
It operates by first detecting 2D poses from the two signals and then lifting them to 3D space.
The simple two-step approach reduces the error of the state-of-the-art by a large margin on a public dataset.
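For the lifting step, the simplest instance with calibrated views is per-joint linear triangulation (DLT); the paper additionally fuses IMU cues, which are omitted in this hedged sketch.

```python
import numpy as np

def triangulate(points_2d, projections):
    """points_2d: list of (u, v) pixels; projections: list of (3, 4) camera matrices."""
    rows = []
    for (u, v), P in zip(points_2d, projections):
        rows += [u * P[2] - P[0], v * P[2] - P[1]]  # DLT constraints per view
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]                                      # null vector of the system
    return X[:3] / X[3]                             # 3D joint position

P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X = triangulate([(0.25, 0.25), (0.0, 0.25)], [P1, P2])  # recovers (1, 1, 4)
```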
arXiv Detail & Related papers (2020-03-25T00:26:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.