Deep and Shallow Covariance Feature Quantization for 3D Facial
Expression Recognition
- URL: http://arxiv.org/abs/2105.05708v1
- Date: Wed, 12 May 2021 14:48:39 GMT
- Title: Deep and Shallow Covariance Feature Quantization for 3D Facial
Expression Recognition
- Authors: Walid Hariri, Nadir Farah, Dinesh Kumar Vishwakarma
- Abstract summary: We propose a multi-modal 2D + 3D feature-based method for facial expression recognition.
We extract shallow features from the 3D images, and deep features using Convolutional Neural Networks (CNN) from the transformed 2D images.
High classification performances have been achieved on the BU-3DFE and Bosphorus datasets.
- Score: 7.773399781313892
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial expression recognition (FER) of 3D face scans has received a
significant amount of attention in recent years. Most of the facial expression
recognition methods have been proposed using mainly 2D images. These methods
suffer from several issues like illumination changes and pose variations.
Moreover, 2D mapping from 3D images may lack some geometric and topological
characteristics of the face. Hence, to overcome this problem, a multi-modal 2D
+ 3D feature-based method is proposed. We extract shallow features from the 3D
images, and deep features using Convolutional Neural Networks (CNN) from the
transformed 2D images. To combine these features into a compact representation, we use covariance matrices as descriptors for both feature types rather than the raw descriptors individually. Covariance matrix learning is used as a manifold layer to reduce the size of the deep covariance matrices and enhance their discriminative power while preserving their manifold structure. We then use the
Bag-of-Features (BoF) paradigm to quantize the covariance matrices after
flattening. Accordingly, we obtained two codebooks using shallow and deep
features. The global codebook is then used to feed an SVM classifier. High
classification performances have been achieved on the BU-3DFE and Bosphorus
datasets compared to the state-of-the-art methods.
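The pipeline the abstract describes (covariance matrices as descriptors, flattening, Bag-of-Features quantization against a codebook) can be sketched compactly. This is a minimal NumPy illustration under assumed shapes, not the authors' implementation; the function names and the Euclidean nearest-codeword assignment are hypothetical simplifications of the paper's manifold-aware approach:

```python
import numpy as np

def covariance_descriptor(features):
    # features: (n_samples, d) local feature vectors from one face scan.
    # The d x d covariance matrix serves as a compact region descriptor.
    return np.cov(features, rowvar=False)

def flatten_spd(cov):
    # Flatten the symmetric matrix by keeping its upper triangle,
    # so each descriptor becomes a fixed-length vector before quantization.
    iu = np.triu_indices(cov.shape[0])
    return cov[iu]

def bof_histogram(descriptors, codebook):
    # Assign each flattened covariance descriptor to its nearest codeword
    # and accumulate a normalized Bag-of-Features histogram; the resulting
    # histogram is what would feed the SVM classifier.
    hist = np.zeros(len(codebook))
    for d in descriptors:
        idx = np.argmin(np.linalg.norm(codebook - d, axis=1))
        hist[idx] += 1
    return hist / max(hist.sum(), 1)
```

In the paper, one such codebook is built from shallow 3D features and one from deep CNN features; the global (concatenated) representation is classified with an SVM.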
Related papers
- FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model [81.03553265684184]
We introduce FDGaussian, a novel two-stage framework for single-image 3D reconstruction.
Recent methods typically utilize pre-trained 2D diffusion models to generate plausible novel views from the input image.
We demonstrate that FDGaussian generates images with high consistency across different views and reconstructs high-quality 3D objects.
arXiv Detail & Related papers (2024-03-15T12:24:36Z)
- Parametric Depth Based Feature Representation Learning for Object Detection and Segmentation in Bird's Eye View [44.78243406441798]
This paper focuses on leveraging geometry information, such as depth, to model such feature transformation.
We first lift the 2D image features to the 3D space defined for the ego vehicle via a predicted parametric depth distribution for each pixel in each view.
We then aggregate the 3D feature volume based on the 3D space occupancy derived from depth to the BEV frame.
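The two steps above (lifting pixel features along view rays via a predicted categorical depth distribution, then aggregating into a BEV grid) can be sketched as follows. This is a minimal NumPy illustration under assumed shapes, not the paper's implementation; all names and the simple cell-binning aggregation are hypothetical:

```python
import numpy as np

def lift_features_to_bev(feats, depth_probs, depth_bins, rays, grid=(8, 8), cell=1.0):
    # feats: (P, C) per-pixel image features; depth_probs: (P, D) predicted
    # categorical depth distribution per pixel; depth_bins: (D,) candidate
    # depths; rays: (P, 3) per-pixel view rays in the ego frame.
    H, W = grid
    bev = np.zeros((H, W, feats.shape[1]))
    for p in range(feats.shape[0]):
        for d in range(len(depth_bins)):
            # Place the pixel feature at each candidate depth, weighted by
            # the predicted probability of that depth, then bin into BEV cells.
            x, y, _ = rays[p] * depth_bins[d]
            i, j = int(x // cell), int(y // cell)
            if 0 <= i < H and 0 <= j < W:
                bev[i, j] += depth_probs[p, d] * feats[p]
    return bev
```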
arXiv Detail & Related papers (2023-07-09T06:07:22Z)
- RAFaRe: Learning Robust and Accurate Non-parametric 3D Face Reconstruction from Pseudo 2D&3D Pairs [13.11105614044699]
We propose a robust and accurate non-parametric method for single-view 3D face reconstruction (SVFR).
A large-scale pseudo 2D&3D dataset is created by first rendering the detailed 3D faces, then swapping the face in the wild images with the rendered face.
Our model outperforms previous methods on FaceScape-wild/lab and MICC benchmarks.
arXiv Detail & Related papers (2023-02-10T19:40:26Z)
- Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders [52.91248611338202]
We propose an alternative to obtain superior 3D representations from 2D pre-trained models via Image-to-Point Masked Autoencoders, named as I2P-MAE.
By self-supervised pre-training, we leverage the well learned 2D knowledge to guide 3D masked autoencoding.
I2P-MAE attains state-of-the-art accuracy of 90.11%, +3.68% over the second-best method, demonstrating superior transfer capability.
arXiv Detail & Related papers (2022-12-13T17:59:20Z)
- Deep-MDS Framework for Recovering the 3D Shape of 2D Landmarks from a Single Image [8.368476827165114]
This paper proposes a framework to recover the 3D shape of 2D landmarks on a human face from a single input image.
A deep neural network learns the pairwise dissimilarities among the 2D landmarks, which are then used by a non-metric multidimensional scaling (NMDS) approach.
arXiv Detail & Related papers (2022-10-27T06:20:10Z)
- PointMCD: Boosting Deep Point Cloud Encoders via Multi-view Cross-modal Distillation for 3D Shape Recognition [55.38462937452363]
We propose a unified multi-view cross-modal distillation architecture, including a pretrained deep image encoder as the teacher and a deep point encoder as the student.
By pair-wise aligning multi-view visual and geometric descriptors, we can obtain more powerful deep point encoders without exhausting and complicated network modification.
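The pair-wise alignment of visual (teacher) and geometric (student) descriptors typically reduces to a per-view distillation loss. A minimal sketch of one plausible form, an L2 loss between normalized descriptors, which is an assumption for illustration and not PointMCD's exact objective:

```python
import numpy as np

def pairwise_alignment_loss(student_feats, teacher_feats):
    # Mean squared error between L2-normalized student (point encoder) and
    # teacher (image encoder) descriptors, aligned view by view.
    s = student_feats / np.linalg.norm(student_feats, axis=1, keepdims=True)
    t = teacher_feats / np.linalg.norm(teacher_feats, axis=1, keepdims=True)
    return float(np.mean(np.sum((s - t) ** 2, axis=1)))
```

Minimizing this drives the point encoder's descriptors toward the frozen image encoder's, transferring 2D knowledge without modifying the student network's architecture.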
arXiv Detail & Related papers (2022-07-07T07:23:20Z)
- End-to-End Learning of Multi-category 3D Pose and Shape Estimation [128.881857704338]
We propose an end-to-end method that simultaneously detects 2D keypoints from an image and lifts them to 3D.
The proposed method learns both 2D detection and 3D lifting only from 2D keypoints annotations.
In addition to being end-to-end in image to 3D learning, our method also handles objects from multiple categories using a single neural network.
arXiv Detail & Related papers (2021-12-19T17:10:40Z)
- Implicit Neural Deformation for Multi-View Face Reconstruction [43.88676778013593]
We present a new method for 3D face reconstruction from multi-view RGB images.
Unlike previous methods which are built upon 3D morphable models, our method leverages an implicit representation to encode rich geometric features.
Our experimental results on several benchmark datasets demonstrate that our approach outperforms alternative baselines and achieves superior face reconstruction results compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-12-05T07:02:53Z)
- Facial Depth and Normal Estimation using Single Dual-Pixel Camera [81.02680586859105]
We introduce a DP-oriented Depth/Normal network that reconstructs the 3D facial geometry.
It contains the corresponding ground-truth 3D models including depth map and surface normal in metric scale.
It achieves state-of-the-art performances over recent DP-based depth/normal estimation methods.
arXiv Detail & Related papers (2021-11-25T05:59:27Z)
- Hard Example Generation by Texture Synthesis for Cross-domain Shape Similarity Learning [97.56893524594703]
Image-based 3D shape retrieval (IBSR) aims to find the corresponding 3D shape of a given 2D image from a large 3D shape database.
Metric learning with some adaptation techniques seems to be a natural solution to shape similarity learning.
We develop a geometry-focused multi-view metric learning framework empowered by texture synthesis.
arXiv Detail & Related papers (2020-10-23T08:52:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.