Robust Human Registration with Body Part Segmentation on Noisy Point Clouds
- URL: http://arxiv.org/abs/2504.03602v1
- Date: Fri, 04 Apr 2025 17:17:33 GMT
- Title: Robust Human Registration with Body Part Segmentation on Noisy Point Clouds
- Authors: Kai Lascheit, Daniel Barath, Marc Pollefeys, Leonidas Guibas, Francis Engelmann
- Abstract summary: We introduce a hybrid approach that incorporates body-part segmentation into the mesh fitting process. Our method first assigns body part labels to individual points, which then guide a two-step SMPL-X fitting. We demonstrate that the fitted human mesh can refine body part labels, leading to improved segmentation.
- Score: 73.00876572870787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Registering human meshes to 3D point clouds is essential for applications such as augmented reality and human-robot interaction but often yields imprecise results due to noise and background clutter in real-world data. We introduce a hybrid approach that incorporates body-part segmentation into the mesh fitting process, enhancing both human pose estimation and segmentation accuracy. Our method first assigns body part labels to individual points, which then guide a two-step SMPL-X fitting: initial pose and orientation estimation using body part centroids, followed by global refinement of the point cloud alignment. Additionally, we demonstrate that the fitted human mesh can refine body part labels, leading to improved segmentation. Evaluations on the cluttered and noisy real-world datasets InterCap, EgoBody, and BEHAVE show that our approach significantly outperforms prior methods in both pose estimation and segmentation accuracy. Code and results are available on our project website: https://segfit.github.io
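The two ideas named in the abstract, body part centroids as a coarse summary of the labeled point cloud and the fitted mesh feeding back into the labels, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `part_centroids` and `refine_labels`, the `max_dist` threshold, and the toy coordinates are all assumptions made for illustration, and the SMPL-X fitting itself is not shown (the "mesh" here is just a hand-written list of labeled vertices).

```python
from collections import defaultdict
from math import dist


def part_centroids(points, labels):
    """Return the mean 3D position of the points assigned to each part label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    counts = defaultdict(int)
    for (x, y, z), label in zip(points, labels):
        acc = sums[label]
        acc[0] += x
        acc[1] += y
        acc[2] += z
        counts[label] += 1
    return {
        label: tuple(component / counts[label] for component in acc)
        for label, acc in sums.items()
    }


def refine_labels(points, mesh_vertices, mesh_labels, max_dist):
    """Relabel each point with the label of its nearest mesh vertex.

    Points farther than max_dist (a hypothetical threshold) from every
    vertex are marked None, i.e. treated as background clutter.
    """
    refined = []
    for p in points:
        nearest = min(
            range(len(mesh_vertices)),
            key=lambda i: dist(p, mesh_vertices[i]),
        )
        close_enough = dist(p, mesh_vertices[nearest]) <= max_dist
        refined.append(mesh_labels[nearest] if close_enough else None)
    return refined


# Toy input: four body points plus one clutter point mislabeled as "arm".
points = [
    (0.0, 0.0, 1.7),
    (0.0, 0.0, 1.5),
    (0.3, 0.0, 1.0),
    (0.5, 0.0, 1.0),
    (3.0, 3.0, 0.0),
]
labels = ["head", "head", "arm", "arm", "arm"]

centroids = part_centroids(points, labels)

# Stand-in for a fitted mesh: two labeled vertices.
mesh_vertices = [(0.0, 0.0, 1.6), (0.4, 0.0, 1.0)]
mesh_labels = ["head", "arm"]
refined = refine_labels(points, mesh_vertices, mesh_labels, max_dist=0.5)
```

In this toy case the clutter point drags the "arm" centroid away from the body, which is the kind of error that a later refinement stage has to absorb; after relabeling against the stand-in mesh, that point falls outside `max_dist` and is marked as background.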
Related papers
- CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image [86.75098349480014]
This paper tackles category-level pose estimation of articulated objects in robotic manipulation tasks.
We propose a single-stage Network, CAP-Net, for estimating the 6D poses and sizes of Categorical Articulated Parts.
We introduce the RGBD-Art dataset, the largest RGB-D articulated dataset to date, featuring RGB images and depth noise simulated from real sensors.
arXiv Detail & Related papers (2025-04-15T14:30:26Z) - Semantic Segmentation and Scene Reconstruction of RGB-D Image Frames: An End-to-End Modular Pipeline for Robotic Applications [0.7951977175758216]
Traditional RGB-D processing pipelines focus primarily on geometric reconstruction.
We introduce a novel end-to-end modular pipeline that integrates semantic segmentation, human tracking, point-cloud fusion, and scene reconstruction.
We validate our approach on benchmark datasets and real-world Kinect RGB-D data, demonstrating improved efficiency, accuracy, and usability.
arXiv Detail & Related papers (2024-10-23T16:01:31Z) - Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking [57.942404069484134]
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered. Previous research employed interactive perception for manipulating articulated objects, but typically, open-loop approaches often suffer from overlooking the interaction dynamics. We present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds.
arXiv Detail & Related papers (2024-09-24T17:59:56Z) - PWISeg: Point-based Weakly-supervised Instance Segmentation for Surgical Instruments [27.89003436883652]
We propose a weakly-supervised surgical instrument segmentation approach, named Point-based Weakly-supervised Instance Segmentation (PWISeg).
PWISeg adopts an FCN-based architecture with point-to-box and point-to-mask branches to model the relationships between feature points and bounding boxes.
Based on this, we propose a key pixel association loss and a key pixel distribution loss, driving the point-to-mask branch to generate more accurate segmentation predictions.
arXiv Detail & Related papers (2023-11-16T11:48:29Z) - Sampling is Matter: Point-guided 3D Human Mesh Reconstruction [0.0]
This paper presents a simple yet powerful method for 3D human mesh reconstruction from a single RGB image.
Experimental results on benchmark datasets show that the proposed method efficiently improves the performance of 3D human mesh reconstruction.
arXiv Detail & Related papers (2023-04-19T08:45:26Z) - AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time [47.19339667836196]
We present AlphaPose, a system that can perform accurate whole-body pose estimation and tracking jointly while running in realtime.
We show a significant improvement over current state-of-the-art methods in both speed and accuracy on COCO-wholebody, COCO, PoseTrack, and our proposed Halpe-FullBody pose estimation dataset.
arXiv Detail & Related papers (2022-11-07T09:15:38Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z) - VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds [32.72878775887121]
We make the first attempt to reconstruct reliable 3D human shapes from single-frame partial point clouds.
We propose an end-to-end learnable method, named VoteHMR.
The proposed method achieves state-of-the-art performances on two large-scale datasets.
arXiv Detail & Related papers (2021-10-17T05:42:04Z) - Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion [38.05362492645094]
Real point cloud scenes can intuitively capture complex surroundings in the real world, but due to 3D data's raw nature, it is very challenging for machine perception.
We concentrate on the essential visual task, semantic segmentation, for large-scale point cloud data collected in reality.
By comparing with state-of-the-art networks on three different benchmarks, we demonstrate the effectiveness of our network.
arXiv Detail & Related papers (2021-03-12T04:13:20Z) - Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation [85.96410825961966]
We argue that the image features extracted at a central point contain limited information for predicting distant keypoints or bounding box boundaries.
To facilitate inference, we propose to instead perform regression from a set of points placed at more advantageous positions.
We apply this proposed framework, called Point-Set Anchors, to object detection, instance segmentation, and human pose estimation.
arXiv Detail & Related papers (2020-07-06T15:59:56Z) - HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose and Shape Estimation [60.35776484235304]
This work attempts to address the uncertainty of lifting the detected 2D joints to the 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets).
The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part.
A Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression.
arXiv Detail & Related papers (2020-03-10T04:03:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.