KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired
True-Range Multilateration
- URL: http://arxiv.org/abs/2305.16437v4
- Date: Sat, 23 Sep 2023 15:02:51 GMT
- Title: KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired
True-Range Multilateration
- Authors: Xu Bao, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang,
Jingdong Sun, Hanbing Liu, Wei Liu, Bin Luo, Yifeng Geng, Xuansong Xie
- Abstract summary: KeyPoint Positioning System (KeyPosS) is the first framework to deduce exact landmark coordinates by triangulating distances between points of interest and anchor points predicted by a fully convolutional network.
Experiments on four datasets demonstrate state-of-the-art performance, with KeyPosS outperforming existing methods in low-resolution settings despite minimal computational overhead.
- Score: 28.96448680048584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate facial landmark detection is critical for facial analysis tasks, yet
prevailing heatmap and coordinate regression methods grapple with prohibitive
computational costs and quantization errors. Through comprehensive theoretical
analysis and experimentation, we identify and elucidate the limitations of
existing techniques. To overcome these challenges, we pioneer the application
of True-Range Multilateration, originally devised for GPS localization, to
facial landmark detection. We propose KeyPoint Positioning System (KeyPosS) -
the first framework to deduce exact landmark coordinates by triangulating
distances between points of interest and anchor points predicted by a fully
convolutional network. A key advantage of KeyPosS is its plug-and-play nature,
enabling flexible integration into diverse decoding pipelines. Extensive
experiments on four datasets demonstrate state-of-the-art performance, with
KeyPosS outperforming existing methods in low-resolution settings despite
minimal computational overhead. By spearheading the integration of
Multilateration with facial analysis, KeyPosS marks a paradigm shift in facial
landmark detection. The code is available at https://github.com/zhiqic/KeyPosS.
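To make the multilateration idea in the abstract concrete, the following is a minimal sketch of how a 2D point can be recovered from its distances to several known anchor points. It is a generic linearized least-squares formulation and is not taken from the KeyPosS code base; the function name `multilaterate`, the anchor grid, and the distances are hypothetical, and in KeyPosS the distances would be predicted by the network rather than computed from a known point.

```python
import math


def multilaterate(anchors, distances):
    """Estimate (x, y) from distances to known 2D anchors.

    Illustrative sketch only (hypothetical helper, not the KeyPosS API).
    Each anchor gives a circle equation (x - xi)^2 + (y - yi)^2 = di^2.
    Subtracting the first equation from the others removes the quadratic
    terms and leaves a linear system, solved here by least squares.
    """
    if len(anchors) != len(distances) or len(anchors) < 3:
        raise ValueError("need at least three anchors with matching distances")

    (x0, y0), d0 = anchors[0], distances[0]
    k0 = x0 * x0 + y0 * y0 - d0 * d0

    # Accumulate the 2x2 normal equations (A^T A) p = A^T c.
    saa = sab = sbb = sac = sbc = 0.0
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = (xi * xi + yi * yi - di * di) - k0
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c

    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")

    x = (sbb * sac - sab * sbc) / det
    y = (saa * sbc - sab * sac) / det
    return x, y


if __name__ == "__main__":
    # Hypothetical anchors: four neighbouring cells of a coarse heatmap grid.
    grid = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
    target = (10.3, 10.7)  # sub-pixel location assumed for the demo
    dists = [math.dist(target, a) for a in grid]
    print(multilaterate(grid, dists))
```

With exact distances the sketch recovers the sub-pixel location up to floating-point error; with noisy distances, using more than three anchors lets the least-squares step average out some of the error.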
Related papers
- Learning to Make Keypoints Sub-Pixel Accurate [80.55676599677824]
This work addresses the challenge of sub-pixel accuracy in detecting 2D local features.
We propose a novel network that enhances any detector with sub-pixel precision by learning an offset vector for detected features.
arXiv Detail & Related papers (2024-07-16T12:39:56Z)
- X-Pose: Detecting Any Keypoints [28.274913140048003]
X-Pose is a novel framework for multi-object keypoint detection in images.
UniKPT is a large-scale dataset that unifies multiple keypoint detection datasets.
X-Pose achieves notable improvements over state-of-the-art non-promptable, visual prompt-based, and textual prompt-based methods.
arXiv Detail & Related papers (2023-10-12T17:22:58Z)
- DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching [14.837075102089]
Keypoint detection is a pivotal step in 3D reconstruction, whereby sets of (up to) K points are detected in each view of a scene.
Previous learning-based methods typically learn descriptors with keypoints, and treat the keypoint detection as a binary classification task on mutual nearest neighbours.
In this work, we learn keypoints directly from 3D consistency. To this end, we derive a semi-supervised two-view detection objective to expand this set to a desired number of detections.
Results show that our approach, DeDoDe, achieves significant gains on multiple geometry benchmarks.
arXiv Detail & Related papers (2023-08-16T16:37:02Z)
- COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection [56.7599217711363]
Most face forgery recognition methods can only process one face at a time.
We propose COMICS, an end-to-end framework for multi-face forgery detection.
arXiv Detail & Related papers (2023-08-03T03:37:13Z)
- Towards Accurate Facial Landmark Detection via Cascaded Transformers [14.74021483826222]
We propose an accurate facial landmark detector based on cascaded transformers.
With self-attention in transformers, our model can inherently exploit the structured relationships between landmarks.
During cascaded refinement, our model is able to extract the most relevant image features around the target landmark for coordinate prediction.
arXiv Detail & Related papers (2022-08-23T08:42:13Z)
- From Keypoints to Object Landmarks via Self-Training Correspondence: A novel approach to Unsupervised Landmark Discovery [37.78933209094847]
This paper proposes a novel paradigm for the unsupervised learning of object landmark detectors.
We validate our method on a variety of difficult datasets, including LS3D, BBCPose, Human3.6M and PennAction.
arXiv Detail & Related papers (2022-05-31T15:44:29Z)
- Self-Supervised Equivariant Learning for Oriented Keypoint Detection [35.94215211409985]
We introduce a self-supervised learning framework using rotation-equivariant CNNs to learn to detect robust oriented keypoints.
We propose a dense orientation alignment loss by an image pair generated by synthetic transformations for training a histogram-based orientation map.
Our method outperforms the previous methods on an image matching benchmark and a camera pose estimation benchmark.
arXiv Detail & Related papers (2022-04-19T02:26:07Z)
- SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA proves effective at identifying valuable points related to foreground objects and at improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z)
- Pretrained equivariant features improve unsupervised landmark discovery [69.02115180674885]
We formulate a two-step unsupervised approach that overcomes this challenge by first learning powerful pixel-based features.
Our method produces state-of-the-art results in several challenging landmark detection datasets.
arXiv Detail & Related papers (2021-04-07T05:42:11Z)
- Robust Facial Landmark Detection by Cross-order Cross-semantic Deep Network [58.843211405385205]
We propose a cross-order cross-semantic deep network (CCDN) to boost semantic feature learning for robust facial landmark detection.
Specifically, a cross-order two-squeeze multi-excitation (CTM) module is proposed to introduce cross-order channel correlations for more discriminative representation learning.
A novel cross-order cross-semantic (COCS) regularizer is designed to drive the network to learn cross-order cross-semantic features from different activations for facial landmark detection.
arXiv Detail & Related papers (2020-11-16T08:19:26Z)
- Multi-View Optimization of Local Feature Geometry [70.18863787469805]
We address the problem of refining the geometry of local image features from multiple views without known scene or camera geometry.
Our proposed method naturally complements the traditional feature extraction and matching paradigm.
We show that our method consistently improves the triangulation and camera localization performance for both hand-crafted and learned local features.
arXiv Detail & Related papers (2020-03-18T17:22:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.