Deep and Shallow Covariance Feature Quantization for 3D Facial
Expression Recognition
- URL: http://arxiv.org/abs/2105.05708v1
- Date: Wed, 12 May 2021 14:48:39 GMT
- Title: Deep and Shallow Covariance Feature Quantization for 3D Facial
Expression Recognition
- Authors: Walid Hariri, Nadir Farah, Dinesh Kumar Vishwakarma
- Abstract summary: We propose a multi-modal 2D + 3D feature-based method for facial expression recognition.
We extract shallow features from the 3D images, and deep features using Convolutional Neural Networks (CNN) from the transformed 2D images.
High classification performances have been achieved on the BU-3DFE and Bosphorus datasets.
- Score: 7.773399781313892
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial expressions recognition (FER) of 3D face scans has received a
significant amount of attention in recent years. Most of the facial expression
recognition methods have been proposed using mainly 2D images. These methods
suffer from several issues like illumination changes and pose variations.
Moreover, 2D mapping from 3D images may lack some geometric and topological
characteristics of the face. Hence, to overcome this problem, a multi-modal 2D
+ 3D feature-based method is proposed. We extract shallow features from the 3D
images, and deep features using Convolutional Neural Networks (CNN) from the
transformed 2D images. Combining these features into a compact representation
uses covariance matrices as descriptors for both features instead of
single-handedly descriptors. A covariance matrix learning is used as a manifold
layer to reduce the deep covariance matrices size and enhance their
discrimination power while preserving their manifold structure. We then use the
Bag-of-Features (BoF) paradigm to quantize the covariance matrices after
flattening. Accordingly, we obtained two codebooks using shallow and deep
features. The global codebook is then used to feed an SVM classifier. High
classification performances have been achieved on the BU-3DFE and Bosphorus
datasets compared to the state-of-the-art methods.
Related papers
- Large Spatial Model: End-to-end Unposed Images to Semantic 3D [79.94479633598102]
Large Spatial Model (LSM) processes unposed RGB images directly into semantic radiance fields.
LSM simultaneously estimates geometry, appearance, and semantics in a single feed-forward operation.
It can generate versatile label maps by interacting with language at novel viewpoints.
arXiv Detail & Related papers (2024-10-24T17:54:42Z) - Robust 3D Point Clouds Classification based on Declarative Defenders [18.51700931775295]
3D point clouds are unstructured and sparse, while 2D images are structured and dense.
In this paper, we explore three distinct algorithms for mapping 3D point clouds into 2D images.
The proposed approaches demonstrate superior accuracy and robustness against adversarial attacks.
arXiv Detail & Related papers (2024-10-13T01:32:38Z) - RAFaRe: Learning Robust and Accurate Non-parametric 3D Face
Reconstruction from Pseudo 2D&3D Pairs [13.11105614044699]
We propose a robust and accurate non-parametric method for single-view 3D face reconstruction (SVFR)
A large-scale pseudo 2D&3D dataset is created by first rendering the detailed 3D faces, then swapping the face in the wild images with the rendered face.
Our model outperforms previous methods on FaceScape-wild/lab and MICC benchmarks.
arXiv Detail & Related papers (2023-02-10T19:40:26Z) - Learning 3D Representations from 2D Pre-trained Models via
Image-to-Point Masked Autoencoders [52.91248611338202]
We propose an alternative to obtain superior 3D representations from 2D pre-trained models via Image-to-Point Masked Autoencoders, named as I2P-MAE.
By self-supervised pre-training, we leverage the well learned 2D knowledge to guide 3D masked autoencoding.
I2P-MAE attains the state-of-the-art 90.11% accuracy, +3.68% to the second-best, demonstrating superior transferable capacity.
arXiv Detail & Related papers (2022-12-13T17:59:20Z) - Deep-MDS Framework for Recovering the 3D Shape of 2D Landmarks from a
Single Image [8.368476827165114]
This paper proposes a framework to recover the 3D shape of 2D landmarks on a human face, in a single input image.
A deep neural network learns the pairwise dissimilarity among 2D landmarks, used by NMDS approach.
arXiv Detail & Related papers (2022-10-27T06:20:10Z) - PointMCD: Boosting Deep Point Cloud Encoders via Multi-view Cross-modal
Distillation for 3D Shape Recognition [55.38462937452363]
We propose a unified multi-view cross-modal distillation architecture, including a pretrained deep image encoder as the teacher and a deep point encoder as the student.
By pair-wise aligning multi-view visual and geometric descriptors, we can obtain more powerful deep point encoders without exhausting and complicated network modification.
arXiv Detail & Related papers (2022-07-07T07:23:20Z) - End-to-End Learning of Multi-category 3D Pose and Shape Estimation [128.881857704338]
We propose an end-to-end method that simultaneously detects 2D keypoints from an image and lifts them to 3D.
The proposed method learns both 2D detection and 3D lifting only from 2D keypoints annotations.
In addition to being end-to-end in image to 3D learning, our method also handles objects from multiple categories using a single neural network.
arXiv Detail & Related papers (2021-12-19T17:10:40Z) - Implicit Neural Deformation for Multi-View Face Reconstruction [43.88676778013593]
We present a new method for 3D face reconstruction from multi-view RGB images.
Unlike previous methods which are built upon 3D morphable models, our method leverages an implicit representation to encode rich geometric features.
Our experimental results on several benchmark datasets demonstrate that our approach outperforms alternative baselines and achieves superior face reconstruction results compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-12-05T07:02:53Z) - Facial Depth and Normal Estimation using Single Dual-Pixel Camera [81.02680586859105]
We introduce a DP-oriented Depth/Normal network that reconstructs the 3D facial geometry.
It contains the corresponding ground-truth 3D models including depth map and surface normal in metric scale.
It achieves state-of-the-art performances over recent DP-based depth/normal estimation methods.
arXiv Detail & Related papers (2021-11-25T05:59:27Z) - Hard Example Generation by Texture Synthesis for Cross-domain Shape
Similarity Learning [97.56893524594703]
Image-based 3D shape retrieval (IBSR) aims to find the corresponding 3D shape of a given 2D image from a large 3D shape database.
metric learning with some adaptation techniques seems to be a natural solution to shape similarity learning.
We develop a geometry-focused multi-view metric learning framework empowered by texture synthesis.
arXiv Detail & Related papers (2020-10-23T08:52:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.