Gaussian Vector: An Efficient Solution for Facial Landmark Detection
- URL: http://arxiv.org/abs/2010.01318v1
- Date: Sat, 3 Oct 2020 10:15:41 GMT
- Title: Gaussian Vector: An Efficient Solution for Facial Landmark Detection
- Authors: Yilin Xiong, Zijian Zhou, Yuhao Dou and Zhizhong Su
- Abstract summary: This paper proposes a new solution, Gaussian Vector, to preserve the spatial information as well as reduce the output size and simplify the post-processing.
We evaluate our method on 300W, COFW, WFLW and JD-landmark.
- Score: 3.058685580689605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Significant progress has been made in facial landmark detection with the
development of Convolutional Neural Networks. The widely-used algorithms can be
classified into coordinate regression methods and heatmap based methods.
However, the former loses spatial information, resulting in poor performance
while the latter suffers from large output size or high post-processing
complexity. This paper proposes a new solution, Gaussian Vector, to preserve
the spatial information as well as reduce the output size and simplify the
post-processing. Our method provides novel vector supervision and introduces
Band Pooling Module to convert heatmap into a pair of vectors for each
landmark. This is a plug-and-play component which is simple and effective.
Moreover, Beyond Box Strategy is proposed to handle the landmarks out of the
face bounding box. We evaluate our method on 300W, COFW, WFLW and JD-landmark.
The results significantly surpass previous works, demonstrating the
effectiveness of our approach.
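The abstract says a heatmap is converted into a pair of vectors per landmark but does not say how. The following is only a minimal sketch of that idea, not the paper's Band Pooling Module. It assumes each band is one full column (for the x vector) or one full row (for the y vector), that pooling is a plain average, and that the landmark is decoded by argmax. The helper names `gaussian_heatmap`, `band_pool` and `decode` are hypothetical.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=1.5):
    """Build a 2D Gaussian heatmap peaked at column cx, row cy."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def band_pool(heatmap):
    """Collapse a heatmap into (x_vector, y_vector) by averaging.

    Assumption: each band spans one full column (x) or one full row (y);
    the band sizes and pooling operator used in the paper may differ.
    """
    height, width = len(heatmap), len(heatmap[0])
    x_vector = [sum(heatmap[y][x] for y in range(height)) / height
                for x in range(width)]
    y_vector = [sum(row) / width for row in heatmap]
    return x_vector, y_vector


def decode(x_vector, y_vector):
    """Read the landmark back as the argmax of each vector."""
    x = max(range(len(x_vector)), key=x_vector.__getitem__)
    y = max(range(len(y_vector)), key=y_vector.__getitem__)
    return x, y


heatmap = gaussian_heatmap(64, 64, cx=20, cy=45)
x_vec, y_vec = band_pool(heatmap)
landmark = decode(x_vec, y_vec)
```

In this toy setting the output per landmark shrinks from 64 x 64 = 4096 heatmap values to 64 + 64 = 128 vector values, and decoding needs two 1D argmax passes instead of a 2D search. This illustrates the output-size and post-processing savings the abstract describes.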
Related papers
- Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement [29.675650285351768]
Machine unlearning (MU) has emerged to enhance the privacy and trustworthiness of deep neural networks.
Approximate MU is a practical method for large-scale models.
We propose a fast-slow parameter update strategy to implicitly approximate the up-to-date salient unlearning direction.
arXiv Detail & Related papers (2024-09-29T15:17:33Z)
- ESOD: Efficient Small Object Detection on High-Resolution Images [36.80623357577051]
Small objects are usually sparsely distributed and locally clustered.
Massive feature extraction computations are wasted on the non-target background area of images.
We propose to reuse the detector's backbone to conduct feature-level object-seeking and patch-slicing.
arXiv Detail & Related papers (2024-07-23T12:21:23Z)
- GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce GS-SLAM that first utilizes 3D Gaussian representation in the Simultaneous Localization and Mapping system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica and TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z)
- Laplacian Canonization: A Minimalist Approach to Sign and Basis Invariant Spectral Embedding [36.61907023057978]
Spectral embedding is a powerful graph computation technique that has received a lot of attention recently due to its effectiveness on Graph Transformers.
Previous methods developed costly approaches to learn new invariants and suffer from high complexity.
In this work, we explore a minimal approach that resolves the ambiguity issues by directly finding canonical directions for the eigenvectors.
arXiv Detail & Related papers (2023-10-28T14:35:10Z)
- OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields [10.862993171454685]
OReX is a method for 3D shape reconstruction from slices alone, featuring Neural Field gradients as the prior.
A modest neural network is trained on the input planes to return an inside/outside estimate for a given 3D coordinate, yielding a powerful prior that induces smoothness and self-similarities.
We offer an iterative estimation architecture and a hierarchical input sampling scheme that encourage coarse-to-fine training, allowing the training process to focus on high frequencies at later stages.
arXiv Detail & Related papers (2022-11-23T11:44:35Z)
- Detecting Rotated Objects as Gaussian Distributions and Its 3-D Generalization [81.29406957201458]
Existing detection methods commonly use a parameterized bounding box (BBox) to model and detect (horizontal) objects.
We argue that such a mechanism has fundamental limitations in building an effective regression loss for rotation detection.
We propose to model the rotated objects as Gaussian distributions.
We extend our approach from 2-D to 3-D with a tailored algorithm design to handle the heading estimation.
arXiv Detail & Related papers (2022-09-22T07:50:48Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- Rethinking Spatial Invariance of Convolutional Networks for Object Counting [119.83017534355842]
We try to use locally connected Gaussian kernels to replace the original convolution filter to estimate the spatial position in the density map.
Inspired by previous work, we propose a low-rank approximation accompanied with translation invariance to favorably implement the approximation of massive Gaussian convolution.
Our methods significantly outperform other state-of-the-art methods and achieve promising learning of the spatial position of objects.
arXiv Detail & Related papers (2022-06-10T17:51:25Z)
- Improving Point Cloud Based Place Recognition with Ranking-based Loss and Large Batch Training [1.116812194101501]
The paper presents a simple and effective learning-based method for computing a discriminative 3D point cloud descriptor.
We employ recent advances in image retrieval and propose a modified version of a loss function based on a differentiable average precision approximation.
arXiv Detail & Related papers (2022-03-02T09:29:28Z)
- Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
- Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild [104.61677518999976]
We propose Pixel-in-Pixel Net (PIPNet) for facial landmark detection.
The proposed model is equipped with a novel detection head based on heatmap regression.
To further improve the cross-domain generalization capability of PIPNet, we propose self-training with curriculum.
arXiv Detail & Related papers (2020-03-08T12:23:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.