GeoConv: Geodesic Guided Convolution for Facial Action Unit Recognition
- URL: http://arxiv.org/abs/2003.03055v1
- Date: Fri, 6 Mar 2020 07:05:46 GMT
- Title: GeoConv: Geodesic Guided Convolution for Facial Action Unit Recognition
- Authors: Yuedong Chen, Guoxian Song, Zhiwen Shao, Jianfei Cai, Tat-Jen Cham,
Jianming Zheng
- Abstract summary: We propose a novel geodesic guided convolution (GeoConv) for AU recognition.
We further develop an end-to-end trainable framework named GeoCNN for AU recognition.
- Score: 43.22337514214676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic facial action unit (AU) recognition has attracted great attention
but still remains a challenging task, as subtle changes of local facial muscles
are difficult to thoroughly capture. Most existing AU recognition approaches
leverage geometry information in a straightforward 2D or 3D manner, which
either ignore 3D manifold information or suffer from high computational costs.
In this paper, we propose a novel geodesic guided convolution (GeoConv) for AU
recognition by embedding 3D manifold information into 2D convolutions.
Specifically, the kernel of GeoConv is weighted by our introduced geodesic
weights, which are negatively correlated to geodesic distances on a coarsely
reconstructed 3D face model. Moreover, based on GeoConv, we further develop an
end-to-end trainable framework named GeoCNN for AU recognition. Extensive
experiments on BP4D and DISFA benchmarks show that our approach significantly
outperforms the state-of-the-art AU recognition methods.
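The abstract's core idea, reweighting each tap of a 2D convolution kernel by a weight that decays with geodesic distance on a coarse 3D face model, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the exponential decay and the `sigma` parameter are assumptions (the abstract only states that the weights are negatively correlated with geodesic distance), and `geoconv_patch` is a hypothetical helper applied at a single spatial location.

```python
import numpy as np

def geodesic_weights(geo_dist, sigma=1.0):
    """Map geodesic distances to kernel weights that decay with distance.

    Exponential decay is one plausible choice; the paper only states the
    weights are negatively correlated with geodesic distance.
    """
    return np.exp(-geo_dist / sigma)

def geoconv_patch(patch, kernel, geo_dist, sigma=1.0):
    """Apply a geodesic-guided convolution at one output location.

    patch:    (k, k) image patch centred on the output pixel
    kernel:   (k, k) learned convolution kernel
    geo_dist: (k, k) geodesic distances from the centre pixel to each
              neighbour, measured on a coarse 3D face model (assumed given)
    """
    w = geodesic_weights(geo_dist, sigma)
    return float(np.sum(patch * kernel * w))

# Toy example: a 3x3 patch where geodesically distant taps contribute less.
patch = np.ones((3, 3))
kernel = np.full((3, 3), 1.0 / 9.0)
geo = np.array([[1.4, 1.0, 1.4],
                [1.0, 0.0, 1.0],
                [1.4, 1.0, 1.4]])
out = geoconv_patch(patch, kernel, geo, sigma=1.0)
```

Compared with a plain convolution (which would return 1.0 here), the output is attenuated because off-centre taps are down-weighted by their geodesic distance.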
Related papers
- GRACE: Estimating Geometry-level 3D Human-Scene Contact from 2D Images [54.602947113980655]
Estimating geometry-level human-scene contact aims to ground specific contact surface points on 3D human geometries. GRACE (Geometry-level Reasoning for 3D Human-scene Contact Estimation) is a new paradigm for 3D human contact estimation. It incorporates a point cloud encoder-decoder architecture along with a hierarchical feature extraction and fusion module.
arXiv Detail & Related papers (2025-05-10T09:25:46Z)
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
- NeuroGF: A Neural Representation for Fast Geodesic Distance and Path Queries [77.04220651098723]
This paper presents the first attempt to represent geodesics on 3D mesh models using neural implicit functions.
Specifically, we introduce neural geodesic fields (NeuroGFs), which are learned to represent the all-pairs geodesics of a given mesh.
NeuroGFs exhibit exceptional performance in solving the single-source all-destination (SSAD) and point-to-point geodesics.
arXiv Detail & Related papers (2023-06-01T13:32:21Z)
- Cross-Dimensional Refined Learning for Real-Time 3D Visual Perception from Monocular Video [2.2299983745857896]
We present a novel real-time capable learning method that jointly perceives a 3D scene's geometry structure and semantic labels.
We propose an end-to-end cross-dimensional refinement neural network (CDRNet) to extract both 3D mesh and 3D semantic labeling in real time.
arXiv Detail & Related papers (2023-03-16T11:53:29Z)
- Learning Continuous Depth Representation via Geometric Spatial Aggregator [47.1698365486215]
We propose a novel continuous depth representation for depth map super-resolution (DSR).
The heart of this representation is our proposed Geometric Spatial Aggregator (GSA), which exploits a distance field modulated by arbitrarily upsampled target gridding.
We also present a transformer-style backbone named GeoDSR, which possesses a principled way to construct the functional mapping between local coordinates.
arXiv Detail & Related papers (2022-12-07T07:48:23Z)
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolution neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves the state-of-the-art performance, especially when compared in the case of using only a few propagation steps.
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
- Joint stereo 3D object detection and implicit surface reconstruction [39.30458073540617]
We present a new learning-based framework S-3D-RCNN that can recover accurate object orientation in SO(3) and simultaneously predict implicit rigid shapes from stereo RGB images.
For orientation estimation, in contrast to previous studies that map local appearance to observation angles, we propose a progressive approach by extracting meaningful Intermediate Geometrical Representations (IGRs).
This approach features a deep model that transforms perceived intensities from one or two views to object part coordinates to achieve direct egocentric object orientation estimation in the camera coordinate system.
To further achieve a finer description inside 3D bounding boxes, we investigate the implicit shape estimation problem from stereo images.
arXiv Detail & Related papers (2021-11-25T05:52:30Z)
- Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Geodesic-HOF: 3D Reconstruction Without Cutting Corners [42.4960665928525]
Single-view 3D object reconstruction is a challenging fundamental problem in computer vision.
We learn an image-conditioned mapping function from a canonical sampling domain to a high dimensional space.
We find that this learned geodesic embedding space provides useful information for applications such as unsupervised object decomposition.
arXiv Detail & Related papers (2020-06-14T18:59:06Z)
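Several entries above (GeoConv, NeuroGF, Geodesic-HOF) build on geodesic distances over a 3D surface. A standard baseline, assumed here for illustration rather than taken from any of these papers, is Dijkstra's algorithm over the mesh edge graph, which approximates true surface geodesics from above (exact solvers such as MMP or the heat method are used when higher accuracy is needed):

```python
import heapq

def mesh_geodesics(num_vertices, edges, source):
    """Approximate single-source geodesic distances on a mesh by running
    Dijkstra over its edge graph.

    edges: iterable of (u, v, length) tuples, one per mesh edge.
    Returns a list of distances from `source` to every vertex.
    """
    adj = [[] for _ in range(num_vertices)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [float("inf")] * num_vertices
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# A unit square split into two triangles: vertices 0-1-2-3, diagonal 0-2.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0), (0, 2, 2 ** 0.5)]
d = mesh_geodesics(4, edges, source=0)
```

On this toy mesh the diagonal edge (length sqrt(2)) beats the two-edge path of length 2, so the opposite corner gets the shorter distance, as a surface geodesic would.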
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.