Towards Reading Beyond Faces for Sparsity-Aware 4D Affect Recognition
- URL: http://arxiv.org/abs/2002.03157v4
- Date: Wed, 19 Aug 2020 11:02:47 GMT
- Title: Towards Reading Beyond Faces for Sparsity-Aware 4D Affect Recognition
- Authors: Muzammil Behzad, Nhat Vo, Xiaobai Li, Guoying Zhao
- Abstract summary: We present a sparsity-aware deep network for automatic 4D facial expression recognition (FER).
We first propose a novel augmentation method to combat the data limitation problem for deep learning.
We then present a sparsity-aware deep network to compute the sparse representations of convolutional features over multi-views.
- Score: 55.15661254072032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a sparsity-aware deep network for automatic 4D
facial expression recognition (FER). Given 4D data, we first propose a novel
augmentation method to combat the data limitation problem for deep learning.
This is achieved by projecting the input data into RGB and depth map images and
then iteratively performing randomized channel concatenation. We also introduce an effective way to capture the facial
muscle movements encoded in the given 3D landmarks from three orthogonal planes (TOP): the TOP-landmarks over
multi-views. Importantly, we then present a sparsity-aware deep network to
compute the sparse representations of convolutional features over multi-views.
This not only improves recognition accuracy but is also
computationally efficient. For training, the TOP-landmarks and sparse
representations are used to train a long short-term memory (LSTM) network, and refined predictions are obtained by
combining the learned features across the multi-views. Extensive experiments on the BU-4DFE dataset show that our
method outperforms state-of-the-art methods, reaching a promising accuracy of 99.69% for 4D FER.
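
For a concrete picture of the augmentation step, below is a minimal Python/NumPy sketch of one plausible reading of the randomized channel concatenation: it assumes the RGB and depth-map projections of a 4D frame are already available, and the function name, sampling strategy, and parameters are illustrative assumptions rather than the authors' code.

```python
# Hypothetical sketch of the randomized channel-concatenation augmentation
# described in the abstract: channels from the projected RGB image and the
# depth-map projection are mixed to synthesize extra training images.
# Names and the sampling scheme are illustrative, not the authors' code.
import numpy as np

def augment_by_channel_concat(rgb, depth, num_samples=10, rng=None):
    """rgb: (H, W, 3) RGB projection; depth: (H, W) depth-map projection."""
    rng = rng or np.random.default_rng()
    # Pool of available single channels: R, G, B, and depth.
    channels = [rgb[..., c] for c in range(rgb.shape[-1])] + [depth]
    augmented = []
    for _ in range(num_samples):
        # Randomly pick 3 channels (with replacement is one possible
        # reading of "randomized channel concatenation").
        idx = rng.choice(len(channels), size=3, replace=True)
        augmented.append(np.stack([channels[i] for i in idx], axis=-1))
    return augmented
```

Each synthesized image mixes texture and geometry channels, which is one way such an augmentation could expand a limited 4D dataset before the multi-view convolutional features and TOP-landmarks are fed to the LSTM.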
Related papers
- Representing 3D sparse map points and lines for camera relocalization [1.2974519529978974]
We show how a lightweight neural network can learn to represent both 3D point and line features.
In tests, our method delivers a significant improvement over state-of-the-art learning-based methods.
arXiv Detail & Related papers (2024-02-28T03:07:05Z)
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- Implicit Shape and Appearance Priors for Few-Shot Full Head Reconstruction [17.254539604491303]
In this paper, we address the problem of few-shot full 3D head reconstruction.
We accomplish this by incorporating a probabilistic shape and appearance prior into coordinate-based representations.
We extend the H3DS dataset, which now comprises 60 high-resolution 3D full head scans and their corresponding posed images and masks.
arXiv Detail & Related papers (2023-10-12T07:35:30Z)
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves the state-of-the-art performance, especially when compared in the case of using only a few propagation steps.
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
- 360 Depth Estimation in the Wild -- The Depth360 Dataset and the SegFuse Network [35.03201732370496]
Single-view depth estimation from omnidirectional images has gained popularity owing to its wide range of applications, such as autonomous driving and scene reconstruction.
In this work, we first establish a large-scale dataset with varied settings called Depth360 to tackle the training data problem.
We then propose an end-to-end two-branch multi-task learning network, SegFuse, that mimics the human eye to effectively learn from the dataset.
arXiv Detail & Related papers (2022-02-16T11:56:31Z)
- Magnifying Subtle Facial Motions for Effective 4D Expression Recognition [56.806738404887824]
The flow of 3D faces is first analyzed to capture the spatial deformations.
The obtained temporal evolution of these deformations is fed into a magnification method.
The latter, the main contribution of this paper, reveals subtle (hidden) deformations that enhance the emotion classification performance.
arXiv Detail & Related papers (2021-05-05T20:47:43Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding [107.02479689909164]
In this work, we aim at facilitating research on 3D representation learning.
We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
arXiv Detail & Related papers (2020-07-21T17:59:22Z)
- SL-DML: Signal Level Deep Metric Learning for Multimodal One-Shot Action Recognition [0.0]
We propose a metric learning approach to reduce the action recognition problem to a nearest neighbor search in embedding space.
We encode signals into images and extract features using a deep residual CNN.
The resulting encoder transforms features into an embedding space in which closer distances encode similar actions while higher distances encode different actions.
arXiv Detail & Related papers (2020-04-23T11:28:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.