Pixels2Points: Fusing 2D and 3D Features for Facial Skin Segmentation
- URL: http://arxiv.org/abs/2504.19718v1
- Date: Mon, 28 Apr 2025 12:13:12 GMT
- Title: Pixels2Points: Fusing 2D and 3D Features for Facial Skin Segmentation
- Authors: Victoria Yue Chen, Daoye Wang, Stephan Garbin, Sebastian Winberg, Timo Bolkart, Thabo Beeler
- Abstract summary: Face registration deforms a template mesh to closely fit a 3D face scan, the quality of which commonly degrades in non-skin regions. Existing image-based (2D) or scan-based (3D) segmentation methods, however, perform poorly. We introduce a novel method that accurately separates skin from non-skin geometry on 3D human head scans.
- Score: 12.2258243244703
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Face registration deforms a template mesh to closely fit a 3D face scan, and the quality of the fit commonly degrades in non-skin regions (e.g., hair, beard, accessories) because the optimized template-to-scan distance pulls the template mesh towards the noisy scan surface. Improving registration quality requires a clean separation of skin and non-skin regions on the scan mesh. Existing image-based (2D) or scan-based (3D) segmentation methods, however, perform poorly. Image-based segmentation outputs multi-view-inconsistent masks and cannot account for scan inaccuracies or scan-image misalignment, while scan-based methods suffer from lower spatial resolution compared to images. In this work, we introduce a novel method that accurately separates skin from non-skin geometry on 3D human head scans. For this, our method extracts features from multi-view images using a frozen image foundation model and aggregates these features in 3D. These lifted 2D features are then fused with 3D geometric features extracted from the scan mesh to predict a segmentation mask directly on the scan mesh. We show that our segmentations improve the registration accuracy over pure 2D or 3D segmentation methods by 8.89% and 14.3%, respectively. Although trained only on synthetic data, our model generalizes well to real data.
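To make the described pipeline concrete, the following is a minimal PyTorch sketch of the core idea: project scan vertices into each view, sample frozen 2D backbone features, average them across views, and fuse them with per-vertex geometric features to predict skin logits. The Conv2d stand-in for the frozen foundation model, the mean-pooled view aggregation, and the small MLP head are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkinSegmenter(nn.Module):
    def __init__(self, feat2d_dim=64, feat3d_dim=32):
        super().__init__()
        # Stand-in for the frozen image foundation model (kept frozen, as in the paper).
        self.backbone2d = nn.Conv2d(3, feat2d_dim, kernel_size=7, stride=4, padding=3)
        for p in self.backbone2d.parameters():
            p.requires_grad = False
        # Per-vertex geometric encoder: position + normal -> 3D features.
        self.geom_encoder = nn.Sequential(nn.Linear(6, feat3d_dim), nn.ReLU())
        # Fusion head: lifted 2D features + 3D features -> skin/non-skin logit.
        self.head = nn.Sequential(
            nn.Linear(feat2d_dim + feat3d_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def lift_2d_features(self, images, verts, cams):
        """Project vertices into each view, sample features, average over views.
        images: (V, 3, H, W); verts: (N, 3); cams: (V, 3, 4) pixel-space projections.
        """
        V, _, H, W = images.shape
        feats = self.backbone2d(images)                            # (V, C, h, w)
        homog = torch.cat([verts, torch.ones(len(verts), 1, device=verts.device)], dim=1)
        per_view = []
        for f, P in zip(feats, cams):
            uvw = homog @ P.T                                      # (N, 3)
            uv = uvw[:, :2] / uvw[:, 2:].clamp(min=1e-6)           # pixel coordinates
            norm = uv / torch.tensor([W - 1.0, H - 1.0]) * 2 - 1   # map to [-1, 1]
            sampled = F.grid_sample(f[None], norm.view(1, 1, -1, 2),
                                    align_corners=True)            # (1, C, 1, N)
            per_view.append(sampled[0, :, 0].T)                    # (N, C)
        return torch.stack(per_view).mean(dim=0)                   # aggregate in 3D

    def forward(self, images, verts, normals, cams):
        f2d = self.lift_2d_features(images, verts, cams)
        f3d = self.geom_encoder(torch.cat([verts, normals], dim=1))
        return self.head(torch.cat([f2d, f3d], dim=1)).squeeze(-1) # per-vertex logits
```

A forward pass over V calibrated views and an N-vertex scan mesh yields N logits; thresholding them gives a per-vertex skin mask that a registration pipeline can use to down-weight non-skin regions.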
Related papers
- RadSAM: Segmenting 3D radiological images with a 2D promptable model [4.9000940389224885]
We propose RadSAM, a novel method for segmenting 3D objects with a 2D model from a single prompt.
We introduce a benchmark to evaluate the model's ability to segment 3D objects in CT images from a single prompt.
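The summary leaves the propagation mechanism implicit; one common way to drive a 2D promptable segmenter through a CT volume from a single prompt, sketched below under that assumption, is to segment the prompted slice and reuse each predicted mask as the prompt for the adjacent slice. `segment_slice` is a hypothetical placeholder standing in for the 2D model, not RadSAM's actual interface.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def segment_slice(slice_2d: np.ndarray, prompt_mask: np.ndarray) -> np.ndarray:
    """Hypothetical 2D promptable model: threshold within a dilated prompt
    region so the sketch runs end to end (a real system would call the model)."""
    region = binary_dilation(prompt_mask, iterations=5)
    return region & (slice_2d > slice_2d.mean())

def segment_volume(volume: np.ndarray, z0: int, seed_mask: np.ndarray) -> np.ndarray:
    """Segment a (Z, H, W) volume from a single prompt on slice z0 by
    propagating each predicted mask as the prompt for the neighboring slice."""
    out = np.zeros(volume.shape, dtype=bool)
    out[z0] = segment_slice(volume[z0], seed_mask)
    for z in range(z0 + 1, volume.shape[0]):      # sweep upward through the stack
        if not out[z - 1].any():                  # object has ended
            break
        out[z] = segment_slice(volume[z], out[z - 1])
    for z in range(z0 - 1, -1, -1):               # sweep downward
        if not out[z + 1].any():
            break
        out[z] = segment_slice(volume[z], out[z + 1])
    return out
```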
arXiv Detail & Related papers (2025-04-29T15:00:25Z) - FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis [51.193297565630886]
The challenge of accurately inferring texture remains, particularly in obscured areas such as the back of a person in frontal-view images.
This limitation in texture prediction largely stems from the scarcity of large-scale and diverse 3D datasets.
We propose leveraging extensive 2D fashion datasets to enhance both texture and shape prediction in 3D human digitization.
arXiv Detail & Related papers (2024-10-13T01:25:05Z) - MV2Cyl: Reconstructing 3D Extrusion Cylinders from Multi-View Images [13.255044855902408]
We present MV2Cyl, a novel method for reconstructing 3D extrusion cylinders from 2D multi-view images.
We achieve the best reconstruction accuracy in both 2D sketch and extrusion parameter estimation.
arXiv Detail & Related papers (2024-06-16T08:54:38Z) - Instant Multi-View Head Capture through Learnable Registration [62.70443641907766]
Existing methods for capturing datasets of 3D heads in dense semantic correspondence are slow.
We introduce TEMPEH to directly infer 3D heads in dense correspondence from calibrated multi-view images.
Predicting one head takes about 0.3 seconds with a median reconstruction error of 0.26 mm, 64% lower than the current state-of-the-art.
arXiv Detail & Related papers (2023-06-12T21:45:18Z) - ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections [71.46546520120162]
Estimating 3D articulated shapes like animal bodies from monocular images is inherently challenging.
We propose ARTIC3D, a self-supervised framework to reconstruct per-instance 3D shapes from a sparse image collection in-the-wild.
We produce realistic animations by fine-tuning the rendered shape and texture under rigid part transformations.
arXiv Detail & Related papers (2023-06-07T17:47:50Z) - Automatic 3D Registration of Dental CBCT and Face Scan Data using 2D Projection Images [0.9226931037259524]
This paper presents a fully automatic registration method of dental cone-beam computed tomography (CBCT) and face scan data.
It can serve as the basis of a digital platform for 3D jaw-teeth-face models in a variety of applications, including 3D digital treatment planning and orthognathic surgery.
arXiv Detail & Related papers (2023-05-17T11:26:43Z) - Detailed Avatar Recovery from Single Image [50.82102098057822]
This paper presents a novel framework to recover a detailed avatar from a single image.
We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation framework.
Our method can restore detailed human body shapes with complete textures beyond skinned models.
arXiv Detail & Related papers (2021-08-06T03:51:26Z) - Spatial Context-Aware Self-Attention Model For Multi-Organ Segmentation [18.76436457395804]
Multi-organ segmentation is one of the most successful applications of deep learning in medical image analysis.
Deep convolutional neural nets (CNNs) have shown great promise in achieving clinically applicable image segmentation performance on CT or MRI images.
We propose a new framework for combining 3D and 2D models, in which the segmentation is realized through high-resolution 2D convolutions.
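As a concrete reading of that 2D/3D combination, the sketch below feeds coarse 3D context features into a full-resolution per-slice 2D segmentation head. The layer sizes and fusion by concatenation are illustrative assumptions; the paper itself uses a spatial context-aware self-attention design rather than this simplified convolution-only variant.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Hybrid2D3DSeg(nn.Module):
    def __init__(self, n_classes=4, c3d=8, c2d=16):
        super().__init__()
        self.ctx3d = nn.Conv3d(1, c3d, 3, stride=2, padding=1)   # coarse 3D context branch
        self.seg2d = nn.Sequential(                              # high-resolution 2D head
            nn.Conv2d(1 + c3d, c2d, 3, padding=1), nn.ReLU(),
            nn.Conv2d(c2d, n_classes, 1))

    def forward(self, volume):                    # volume: (B, 1, Z, H, W)
        ctx = self.ctx3d(volume)                  # (B, c3d, Z/2, H/2, W/2)
        ctx = F.interpolate(ctx, size=volume.shape[2:], mode="trilinear",
                            align_corners=False)  # upsample context to full resolution
        logits = []
        for z in range(volume.shape[2]):          # segmentation via per-slice 2D convs
            x = torch.cat([volume[:, :, z], ctx[:, :, z]], dim=1)
            logits.append(self.seg2d(x))
        return torch.stack(logits, dim=2)         # (B, n_classes, Z, H, W)
```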
arXiv Detail & Related papers (2020-12-16T21:39:53Z) - 3DBooSTeR: 3D Body Shape and Texture Recovery [76.91542440942189]
3DBooSTeR is a novel method to recover a textured 3D body mesh from a partial 3D scan.
The proposed approach decouples the shape and texture completion into two sequential tasks.
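That decoupling maps onto a simple sequential pipeline in which texture completion is conditioned on the completed shape; the sketch below expresses just that control flow, with both stage functions as hypothetical placeholders rather than 3DBooSTeR's actual modules.

```python
from typing import Tuple
import numpy as np

def complete_shape(partial_verts: np.ndarray) -> np.ndarray:
    """Stage 1 (placeholder): a real model would regress the full body mesh."""
    return partial_verts

def complete_texture(full_verts: np.ndarray, partial_tex: np.ndarray) -> np.ndarray:
    """Stage 2 (placeholder): a real model would inpaint the texture map."""
    return partial_tex

def recover(partial_verts, partial_tex) -> Tuple[np.ndarray, np.ndarray]:
    # Sequential decoupling: texture completion consumes the completed
    # shape, never the other way around.
    full_verts = complete_shape(partial_verts)
    full_tex = complete_texture(full_verts, partial_tex)
    return full_verts, full_tex
```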
arXiv Detail & Related papers (2020-10-23T21:07:59Z) - AutoSweep: Recovering 3D Editable Objects from a Single Photograph [54.701098964773756]
We aim to recover 3D objects that have semantic parts and can be directly edited.
Our work makes an attempt towards recovering two types of primitive-shaped objects, namely, generalized cuboids and generalized cylinders.
Our algorithm can recover high-quality 3D models and outperforms existing methods in both instance segmentation and 3D reconstruction.
arXiv Detail & Related papers (2020-05-27T12:16:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.