Object-Centric 2D Gaussian Splatting: Background Removal and Occlusion-Aware Pruning for Compact Object Models
- URL: http://arxiv.org/abs/2501.08174v2
- Date: Thu, 03 Apr 2025 14:01:02 GMT
- Title: Object-Centric 2D Gaussian Splatting: Background Removal and Occlusion-Aware Pruning for Compact Object Models
- Authors: Marcel Rogge, Didier Stricker
- Abstract summary: We propose a novel approach that leverages object masks to enable targeted reconstruction, resulting in object-centric models. Our method reconstructs compact object models, yielding object-centric Gaussian and mesh representations that are up to 96% smaller and up to 71% faster to train compared to the baseline. These representations are immediately usable for downstream applications such as appearance editing and physics simulation without additional processing.
- Score: 14.555667193538879
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current Gaussian Splatting approaches are effective for reconstructing entire scenes but lack the option to target specific objects, making them computationally expensive and unsuitable for object-specific applications. We propose a novel approach that leverages object masks to enable targeted reconstruction, resulting in object-centric models. Additionally, we introduce an occlusion-aware pruning strategy to minimize the number of Gaussians without compromising quality. Our method reconstructs compact object models, yielding object-centric Gaussian and mesh representations that are up to 96% smaller and up to 71% faster to train compared to the baseline while retaining competitive quality. These representations are immediately usable for downstream applications such as appearance editing and physics simulation without additional processing.
Related papers
- Object-IR: Leveraging Object Consistency and Mesh Deformation for Self-Supervised Image Retargeting [18.51504816209345]
This paper presents Object-IR, a self-supervised architecture that reformulates image retargeting as a learning-based mesh warping optimization problem. We mitigate a uniform rigid mesh at a target aspect ratio and use a convolutional neural network to predict the motion of each mesh grid and obtain the deformed mesh. The framework efficiently processes arbitrary input resolutions while maintaining real-time performance on consumer-grade GPUs.
arXiv Detail & Related papers (2025-10-31T06:57:10Z)
- NOCTIS: Novel Object Cyclic Threshold based Instance Segmentation [47.32364120562497]
Novel Object Cyclic Threshold based Instance Segmentation (NOCTIS) is a framework for designing a model general enough to be employed for novel objects. We show that NOCTIS outperforms the best RGB and RGB-D methods on the seven core datasets of the BOP 2023 challenge for the "Model-based 2D segmentation of unseen objects" task.
arXiv Detail & Related papers (2025-07-02T08:23:14Z)
- Cut-and-Splat: Leveraging Gaussian Splatting for Synthetic Data Generation [0.7864304771129751]
We develop a synthetic data pipeline for generating context-aware instance segmentation training data for specific objects.
We train a Gaussian Splatting model of the target object and automatically extract the object from the video.
We then render the object on a random background image, and monocular depth estimation is employed to place the object in a believable pose.
arXiv Detail & Related papers (2025-04-11T12:04:49Z)
- Reasoning and Learning a Perceptual Metric for Self-Training of Reflective Objects in Bin-Picking with a Low-cost Camera [10.976379239028455]
Bin-picking of metal objects using low-cost RGB-D cameras often suffers from sparse depth information and reflective surface textures.
We propose a two-stage framework consisting of a metric learning stage and a self-training stage.
Our approach outperforms several state-of-the-art methods on both the ROBI dataset and our newly introduced Self-ROBI dataset.
arXiv Detail & Related papers (2025-03-26T04:03:51Z)
- ProtoGS: Efficient and High-Quality Rendering with 3D Gaussian Prototypes [81.48624894781257]
3D Gaussian Splatting (3DGS) has made significant strides in novel view synthesis but is limited by the substantial number of Gaussian primitives required.
Recent methods address this issue by compressing the storage size of densified Gaussians, yet fail to preserve rendering quality and efficiency.
We propose ProtoGS to learn Gaussian prototypes to represent Gaussian primitives, significantly reducing the total Gaussian amount without sacrificing visual quality.
arXiv Detail & Related papers (2025-03-21T18:55:14Z)
- DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting [6.736949053673975]
We propose a novel object-SLAM system that seamlessly integrates object pose estimation and reconstruction.
DQO-MAP achieves outstanding performance in terms of precision, reconstruction quality, and computational efficiency.
arXiv Detail & Related papers (2025-03-04T02:55:07Z)
- TSGaussian: Semantic and Depth-Guided Target-Specific Gaussian Splatting from Sparse Views [18.050257821756148]
TSGaussian is a novel framework that combines semantic constraints with depth priors to avoid geometry degradation in novel view synthesis tasks. Our approach prioritizes computational resources on designated targets while minimizing background allocation. Extensive experiments demonstrate that TSGaussian outperforms state-of-the-art methods on three standard datasets.
arXiv Detail & Related papers (2024-12-13T11:26:38Z)
- A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation [10.461109095311546]
Low-shot object counters estimate the number of objects in an image using few or no annotated exemplars. The existing approaches often lead to overgeneralization and false positive detections. We introduce GeCo, a novel low-shot counter that achieves accurate object detection, segmentation, and count estimation.
arXiv Detail & Related papers (2024-09-27T12:20:29Z)
- SMORE: Simulataneous Map and Object REconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR. We take a holistic perspective and optimize a compositional model of a dynamic scene that decomposes the world into rigidly-moving objects and the background.
arXiv Detail & Related papers (2024-06-19T23:53:31Z)
- GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting [82.29476781526752]
Reconstructing and rendering 3D objects from highly sparse views is of critical importance for promoting applications of 3D vision techniques.
GaussianObject is a framework to represent and render the 3D object with Gaussian splatting that achieves high rendering quality with only 4 input images.
GaussianObject is evaluated on several challenging datasets, including MipNeRF360, OmniObject3D, OpenIllumination, and the authors' own collection of unposed images.
arXiv Detail & Related papers (2024-02-15T18:42:33Z)
- Leveraging Positional Encoding for Robust Multi-Reference-Based Object 6D Pose Estimation [21.900422840817726]
Accurately estimating the pose of an object is a crucial task in computer vision and robotics.
In this paper, we analyze these limitations and propose new strategies to overcome them.
Our experiments on Linemod, Linemod-Occlusion, and YCB-Video datasets demonstrate that our approach outperforms existing methods.
arXiv Detail & Related papers (2024-01-29T16:42:15Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a pointcloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- IFOR: Iterative Flow Minimization for Robotic Object Rearrangement [92.97142696891727]
IFOR, Iterative Flow Minimization for Robotic Object Rearrangement, is an end-to-end method for the problem of object rearrangement for unknown objects.
We show that our method applies to cluttered scenes, and in the real world, while training only on synthetic data.
arXiv Detail & Related papers (2022-02-01T20:03:56Z)
- You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computations by separating objects from background using a very lightweight first stage.
The resulting image proposals are then processed in the second stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.