Related papers: GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting

GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting

URL: http://arxiv.org/abs/2411.03807v2
Date: Thu, 07 Nov 2024 07:32:33 GMT
Title: GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting
Authors: Jilan Mei, Junbo Li, Cai Meng,
Abstract summary: This paper proposes a new method for accurate and robust 6D pose estimation of novel objects, named GS2Pose. By introducing 3D Gaussian splatting, GS2Pose can utilize the reconstruction results without requiring a high-quality CAD model. The code for GS2Pose will soon be released on GitHub.
Score: 4.465134753953128
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This paper proposes a new method for accurate and robust 6D pose estimation of novel objects, named GS2Pose. By introducing 3D Gaussian splatting, GS2Pose can utilize the reconstruction results without requiring a high-quality CAD model, which means it only requires segmented RGBD images as input. Specifically, GS2Pose employs a two-stage structure consisting of coarse estimation followed by refined estimation. In the coarse stage, a lightweight U-Net network with a polarization attention mechanism, called Pose-Net, is designed. By using the 3DGS model for supervised training, Pose-Net can generate NOCS images to compute a coarse pose. In the refinement stage, GS2Pose formulates a pose regression algorithm following the idea of reprojection or Bundle Adjustment (BA), referred to as GS-Refiner. By leveraging Lie algebra to extend 3DGS, GS-Refiner obtains a pose-differentiable rendering pipeline that refines the coarse pose by comparing the input images with the rendered images. GS-Refiner also selectively updates parameters in the 3DGS model to achieve environmental adaptation, thereby enhancing the algorithm's robustness and flexibility to illuminative variation, occlusion, and other challenging disruptive factors. GS2Pose was evaluated through experiments conducted on the LineMod dataset, where it was compared with similar algorithms, yielding highly competitive results. The code for GS2Pose will soon be released on GitHub.

Related papers

No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images. Our model achieves real-time 3D Gaussian reconstruction during inference. This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices. Our framework capitalizes on fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z)
GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization [1.4466437171584356]
We propose a two-stage procedure that integrates dense and robust keypoint descriptors from the lightweight XFeat feature extractor into 3DGS. In the second stage, the initial pose estimate is refined by minimizing the rendering-based photometric warp loss. Benchmarking on widely used indoor and outdoor datasets demonstrates improvements over recent neural rendering-based localization methods.
arXiv Detail & Related papers (2024-09-24T23:18:32Z)
GS-Net: Generalizable Plug-and-Play 3D Gaussian Splatting Module [19.97023389064118]
We propose GS-Net, a plug-and-play 3DGS module that densifies Gaussian ellipsoids from sparse SfM point clouds. Experiments demonstrate that applying GS-Net to 3DGS yields a PSNR improvement of 2.08 dB for conventional viewpoints and 1.86 dB for novel viewpoints.
arXiv Detail & Related papers (2024-09-17T16:03:19Z)
GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting [25.780452115246245]
We propose a novel test-time camera pose refinement framework, GSLoc. This framework enhances the localization accuracy of state-of-the-art absolute pose regression and scene coordinate regression methods. GSLoc obviates the need for training feature extractors or descriptors by operating directly on RGB images.
arXiv Detail & Related papers (2024-08-20T17:58:23Z)
R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction [53.19869886963333]
3D Gaussian splatting (3DGS) has shown promising results in rendering image and surface reconstruction. This paper introduces R2$-Gaussian, the first 3DGS-based framework for sparse-view tomographic reconstruction.
arXiv Detail & Related papers (2024-05-31T08:39:02Z)
LP-3DGS: Learning to Prune 3D Gaussian Splatting [71.97762528812187]
We propose learning-to-prune 3DGS, where a trainable binary mask is applied to the importance score that can find optimal pruning ratio automatically. Experiments have shown that LP-3DGS consistently produces a good balance that is both efficient and high quality.
arXiv Detail & Related papers (2024-05-29T05:58:34Z)
GS-Pose: Generalizable Segmentation-based 6D Object Pose Estimation with 3D Gaussian Splatting [23.724077890247834]
GS-Pose is a framework for localizing and estimating the 6D pose of novel objects. It operates sequentially by locating the object in the input image, estimating its initial 6D pose, and refining the pose with a render-and-compare method. Off-the-shelf toolchains and commodity hardware, such as mobile phones, can be used to capture new objects to be added to the database.
arXiv Detail & Related papers (2024-03-15T21:06:14Z)
GS-IR: 3D Gaussian Splatting for Inverse Rendering [71.14234327414086]
We propose GS-IR, a novel inverse rendering approach based on 3D Gaussian Splatting (GS) We extend GS, a top-performance representation for novel view synthesis, to estimate scene geometry, surface material, and environment illumination from multi-view images captured under unknown lighting conditions. The flexible and expressive GS representation allows us to achieve fast and compact geometry reconstruction, photorealistic novel view synthesis, and effective physically-based rendering.
arXiv Detail & Related papers (2023-11-26T02:35:09Z)
Green Steganalyzer: A Green Learning Approach to Image Steganalysis [30.486433532000344]
Green Steganalyzer (GS) is a learning solution to image steganalysis based on the green learning paradigm. GS consists of three modules: pixel-based anomaly prediction, 2) embedding location detection, and 3) decision fusion for image-level detection.
arXiv Detail & Related papers (2023-06-06T20:43:07Z)
CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network [66.24726878647543]
Estimating the 6-DoF pose of a rigid object from a single RGB image is a crucial yet challenging task. Recent studies have shown the great potential of dense correspondence-based solutions. We propose a novel pose estimation algorithm named CheckerPose, which improves on three main aspects.
arXiv Detail & Related papers (2023-03-29T17:30:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.