Real-time 3D Facial Tracking via Cascaded Compositional Learning
- URL: http://arxiv.org/abs/2009.00935v1
- Date: Wed, 2 Sep 2020 10:27:36 GMT
- Title: Real-time 3D Facial Tracking via Cascaded Compositional Learning
- Authors: Jianwen Lou, Xiaoxu Cai, Junyu Dong and Hui Yu
- Abstract summary: We learn a cascade of globally-optimized modular boosted ferns (GoMBF) to solve multi-modal facial motion regression for real-time 3D facial tracking from a monocular RGB camera.
GoMBF is a deep composition of multiple regression models, each of which is a boosted fern initially trained to predict partial motion parameters of the same modality.
- Score: 30.660564667452118
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose to learn a cascade of globally-optimized modular boosted ferns
(GoMBF) to solve multi-modal facial motion regression for real-time 3D facial
tracking from a monocular RGB camera. GoMBF is a deep composition of multiple
regression models, each of which is a boosted fern initially trained to predict
partial motion parameters of the same modality; the modules are then concatenated
via a global optimization step to form a single strong boosted fern that can
effectively handle the whole regression target. It explicitly copes with the
modality variety in the output variables, while showing increased fitting power
and a faster learning speed compared with conventional boosted ferns.
By further cascading a sequence of GoMBFs (GoMBF-Cascade) to regress facial
motion parameters, we achieve tracking performance on a variety of in-the-wild
videos that is competitive with state-of-the-art methods which require much more
training data or have higher computational complexity. The method provides a
robust and elegant solution to real-time 3D facial tracking from a small set of
training data, which makes it more practical for real-world applications.
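The abstract only sketches the method at a high level. As a rough illustration of the idea, the Python sketch below implements a toy version of cascaded modular boosted-fern regression. It is not the authors' code: the `Fern` and `ModularBoostedFerns` classes, the feature dimensions, the per-modality slices, and the least-squares re-weighting used as a stand-in for the paper's global optimization step are all assumptions made for illustration.

```python
# Illustrative sketch only (not the authors' implementation). The fern
# structure, feature dimensions, modality slices and the least-squares
# "global optimization" stand-in below are assumptions for illustration.
import numpy as np


class Fern:
    """A single random fern: n_tests thresholded feature tests index
    2**n_tests bins; each bin stores an additive output increment."""

    def __init__(self, n_tests=5, rng=None):
        self.n_tests = n_tests
        self.rng = rng or np.random.default_rng()

    def fit(self, feats, residuals):
        d = feats.shape[1]
        self.idx = self.rng.choice(d, self.n_tests, replace=False)
        lo, hi = feats[:, self.idx].min(0), feats[:, self.idx].max(0)
        self.thr = self.rng.uniform(lo, hi)
        bins = self._bin(feats)
        self.table = np.zeros((2 ** self.n_tests, residuals.shape[1]))
        for b in range(2 ** self.n_tests):
            mask = bins == b
            if mask.any():
                self.table[b] = residuals[mask].mean(0)
        return self

    def _bin(self, feats):
        bits = (feats[:, self.idx] > self.thr).astype(int)
        return bits @ (2 ** np.arange(self.n_tests))

    def predict(self, feats):
        return self.table[self._bin(feats)]


class ModularBoostedFerns:
    """One GoMBF-like stage: a boosted-fern module per output modality
    (e.g. rigid pose vs. expression), followed by a joint linear
    re-weighting of all module outputs, used here as a simplified
    stand-in for the paper's global optimization step."""

    def __init__(self, modality_slices, n_ferns=50):
        self.slices = modality_slices   # dict: modality name -> column slice
        self.n_ferns = n_ferns

    def fit(self, feats, targets):
        self.modules = {}
        for name, sl in self.slices.items():
            residual = targets[:, sl].copy()
            ferns = []
            for _ in range(self.n_ferns):   # greedy boosting per modality
                fern = Fern().fit(feats, residual)
                residual -= fern.predict(feats)
                ferns.append(fern)
            self.modules[name] = ferns
        # "Global" step: jointly re-fit the concatenated module outputs
        # to the full target by least squares.
        raw = self._raw_predict(feats)
        self.W, *_ = np.linalg.lstsq(raw, targets, rcond=None)
        return self

    def _raw_predict(self, feats):
        return np.hstack([sum(f.predict(feats) for f in self.modules[name])
                          for name in self.slices])

    def predict(self, feats):
        return self._raw_predict(feats) @ self.W


# Toy usage: 16 motion parameters split into two "modalities"
# (a 6-D rigid pose and a 10-D expression vector), random features.
# A real tracker would re-extract pose-indexed image features between
# cascade stages; that step is omitted here.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))      # image-derived feature vectors
Y = rng.normal(size=(500, 16))      # ground-truth motion parameters
slices = {"pose": slice(0, 6), "expr": slice(6, 16)}

cascade, residual = [], Y.copy()
for _ in range(3):                  # GoMBF-Cascade with 3 stages
    stage = ModularBoostedFerns(slices, n_ferns=20).fit(X, residual)
    residual = residual - stage.predict(X)
    cascade.append(stage)

pred = sum(stage.predict(X) for stage in cascade)
print("training RMSE:", float(np.sqrt(((Y - pred) ** 2).mean())))
```

In this sketch each cascade stage regresses whatever residual the previous stages left, mirroring the paper's idea of stacking GoMBFs; the per-modality modules keep pose and expression parameters separate before the joint re-weighting combines them.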
Related papers
- MultiViPerFrOG: A Globally Optimized Multi-Viewpoint Perception Framework for Camera Motion and Tissue Deformation [18.261678529996104]
We propose a framework that can flexibly integrate the output of low-level perception modules with kinematic and scene-modeling priors.
Overall, our method shows robustness to combined noisy input measures and can process hundreds of points in a few milliseconds.
arXiv Detail & Related papers (2024-08-08T10:55:55Z) - GGRt: Towards Pose-free Generalizable 3D Gaussian Splatting in Real-time [112.32349668385635]
GGRt is a novel approach to generalizable novel view synthesis that alleviates the need for real camera poses.
As the first pose-free generalizable 3D-GS framework, GGRt achieves inference at $\ge$ 5 FPS and real-time rendering at $\ge$ 100 FPS.
arXiv Detail & Related papers (2024-03-15T09:47:35Z) - Federated Multi-View Synthesizing for Metaverse [52.59476179535153]
The metaverse is expected to provide immersive entertainment, education, and business applications.
Virtual reality (VR) transmission over wireless networks is data- and computation-intensive.
We have developed a novel multi-view synthesizing framework that can efficiently provide synthesizing, storage, and communication resources for wireless content delivery in the metaverse.
arXiv Detail & Related papers (2023-12-18T13:51:56Z) - Asynchronous Hybrid Reinforcement Learning for Latency and Reliability
Optimization in the Metaverse over Wireless Communications [8.513938423514636]
Real-time digital twinning of real-world scenes is increasingly in demand.
The disparity in transmitted scene dimension (2D as opposed to 3D) leads to asymmetric data sizes in the uplink (UL) and downlink (DL).
We design a novel multi-agent reinforcement learning algorithm structure, namely Asynchronous Actors Hybrid Critic (AAHC).
arXiv Detail & Related papers (2022-12-30T14:40:00Z) - Progressive Multi-view Human Mesh Recovery with Self-Supervision [68.60019434498703]
Existing solutions typically suffer from poor generalization performance to new settings.
We propose a novel simulation-based training pipeline for multi-view human mesh recovery.
arXiv Detail & Related papers (2022-12-10T06:28:29Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - JNMR: Joint Non-linear Motion Regression for Video Frame Interpolation [47.123769305867775]
Video frame interpolation (VFI) aims to generate frames by warping learnable motions from bidirectional historical references.
We reformulate VFI as a Joint Non-linear Motion Regression (JNMR) strategy to model complicated inter-frame motions.
We show the effectiveness and significant improvement of joint motion regression compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T02:47:29Z) - Long-Short Temporal Contrastive Learning of Video Transformers [62.71874976426988]
Self-supervised pretraining of video transformers on video-only datasets can lead to action recognition results on par or better than those obtained with supervised pretraining on large-scale image datasets.
Our approach, named Long-Short Temporal Contrastive Learning, enables video transformers to learn an effective clip-level representation by predicting temporal context captured from a longer temporal extent.
arXiv Detail & Related papers (2021-06-17T02:30:26Z) - Monocular Real-time Full Body Capture with Inter-part Correlations [66.22835689189237]
We present the first method for real-time full body capture that estimates shape and motion of body and hands together with a dynamic 3D face model from a single color image.
Our approach uses a new neural network architecture that exploits correlations between body and hands at high computational efficiency.
arXiv Detail & Related papers (2020-12-11T02:37:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.