Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints
- URL: http://arxiv.org/abs/2512.05079v1
- Date: Thu, 04 Dec 2025 18:45:14 GMT
- Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints
- Authors: Minghan Zhu, Zhiyi Wang, Qihang Sun, Maani Ghaffari, Michael Posa
- Abstract summary: In this paper, we leverage two extra sources of information to reduce the ambiguity of vision signals. First, generative models learn priors of the shapes of commonly seen objects, allowing us to make reasonable guesses of the unseen part of geometry. Second, contact information, which can be obtained from videos and physical interactions, provides sparse constraints on the boundary of the geometry.
- Score: 20.702086497025494
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object geometry is key information for robot manipulation. Yet, object reconstruction is a challenging task because cameras only capture partial observations of objects, especially when occlusion occurs. In this paper, we leverage two extra sources of information to reduce the ambiguity of vision signals. First, generative models learn priors of the shapes of commonly seen objects, allowing us to make reasonable guesses of the unseen part of geometry. Second, contact information, which can be obtained from videos and physical interactions, provides sparse constraints on the boundary of the geometry. We combine the two sources of information through contact-guided 3D generation. The guidance formulation is inspired by drag-based editing in generative models. Experiments on synthetic and real-world data show that our approach improves the reconstruction compared to pure 3D generation and contact-based optimization.
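The contact-guided generation described in the abstract can be pictured as a guidance-style update: at each denoising step of a pretrained 3D shape prior, the gradient of a contact loss (penalizing the signed distance of known contact points from the generated surface) steers the sample, in the spirit of drag-based editing. The sketch below is a minimal illustration of that general recipe under stated assumptions, not the authors' implementation; `denoiser`, `sdf_net`, the latent dimension, and all parameter values are hypothetical placeholders.

```python
import torch

def contact_loss(sdf_net, latent, contact_points):
    """Contact points observed on the object boundary should lie on the
    predicted surface, i.e. have (approximately) zero signed distance.

    sdf_net(latent, points) -> per-point signed distance (hypothetical interface).
    contact_points: (N, 3) tensor of contact locations from physical interaction.
    """
    sdf_vals = sdf_net(latent, contact_points)
    return (sdf_vals ** 2).mean()

def contact_guided_sampling(denoiser, sdf_net, contact_points,
                            steps=50, guidance_scale=1.0, latent_dim=256):
    """Minimal sketch of contact-guided generation via guided sampling.

    At each denoising step, the gradient of the contact loss with respect to the
    current latent nudges the sample toward shapes consistent with the contacts,
    analogous to drag-based editing guidance.
    """
    latent = torch.randn(1, latent_dim)  # latent shape code (dimension is illustrative)
    for t in reversed(range(steps)):
        with torch.no_grad():
            latent = denoiser(latent, t)  # one step of the pretrained shape prior

        # Guidance: differentiate the contact loss w.r.t. the current latent.
        latent = latent.detach().requires_grad_(True)
        loss = contact_loss(sdf_net, latent, contact_points)
        grad = torch.autograd.grad(loss, latent)[0]

        # Nudge the sample toward satisfying the sparse contact constraints.
        latent = (latent - guidance_scale * grad).detach()
    return latent
```

A full pipeline would also involve the diffusion noise schedule and conditioning on the partial visual observation, both omitted here for brevity.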
Related papers
- TeHOR: Text-Guided 3D Human and Object Reconstruction with Textures [53.21603129469796]
Joint reconstruction of 3D human and object from a single image is an active research area, with pivotal applications in robotics and digital content creation. Existing approaches rely heavily on physical contact information, which inherently cannot capture non-contact human-object interactions. We introduce TeHOR, a framework built upon two core designs. First, beyond contact information, our framework leverages text descriptions of human-object interactions to enforce semantic alignment. Second, we incorporate appearance cues of the 3D human and object into the alignment process to capture holistic contextual information.
arXiv Detail & Related papers (2026-02-23T10:22:52Z) - ByteLoom: Weaving Geometry-Consistent Human-Object Interactions through Progressive Curriculum Learning [19.292101162897975]
We introduce ByteLoom, a framework that generates realistic HOI videos with geometrically consistent object illustration. We first propose an RCM-cache mechanism that leverages Relative Coordinate Maps (RCM) as a universal representation to maintain the object's geometry consistency. We then design a training curriculum that enhances model capabilities in a progressive style and relaxes the demand for hand meshes.
arXiv Detail & Related papers (2025-12-28T09:38:36Z) - VideoArtGS: Building Digital Twins of Articulated Objects from Monocular Video [60.63575135514847]
Building digital twins of articulated objects from monocular video presents an essential challenge in computer vision. We introduce VideoArtGS, a novel approach that reconstructs high-fidelity digital twins of articulated objects from monocular video. VideoArtGS demonstrates state-of-the-art performance in articulation and mesh reconstruction, reducing the reconstruction error by about two orders of magnitude compared to existing methods.
arXiv Detail & Related papers (2025-09-22T11:52:02Z) - Online 3D Scene Reconstruction Using Neural Object Priors [83.14204014687938]
This paper addresses the problem of reconstructing a scene online at the level of objects given an RGB-D video sequence. We propose a feature grid mechanism to continuously update object-centric neural implicit representations as new object parts are revealed. Our approach outperforms state-of-the-art neural implicit models for this task in terms of reconstruction accuracy and completeness.
arXiv Detail & Related papers (2025-03-24T17:09:36Z) - GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding [53.42728468191711]
Open-Vocabulary 3D object affordance grounding aims to anticipate "action possibilities" regions on 3D objects with arbitrary instructions. We propose GREAT (GeometRy-intEntion collAboraTive inference) for Open-Vocabulary 3D Object Affordance Grounding.
arXiv Detail & Related papers (2024-11-29T11:23:15Z) - GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions [22.077366472693395]
We introduce a new generative approach for synthesizing 3D geometry and images from single-view collections.
Existing methods that employ volumetric rendering with neural radiance fields inherit a key limitation: the generated geometry is noisy and unconstrained.
We propose GeoGen, a new SDF-based 3D generative model trained in an end-to-end manner.
arXiv Detail & Related papers (2024-06-06T17:00:10Z) - REACTO: Reconstructing Articulated Objects from a Single Video [64.89760223391573]
We propose a novel deformation model that enhances the rigidity of each part while maintaining flexible deformation of the joints.
Our method outperforms previous works in producing higher-fidelity 3D reconstructions of general articulated objects.
arXiv Detail & Related papers (2024-04-17T08:01:55Z) - Learning Explicit Contact for Implicit Reconstruction of Hand-held Objects from Monocular Images [59.49985837246644]
We show how to model contacts in an explicit way to benefit the implicit reconstruction of hand-held objects.
In the first part, we propose a new subtask of directly estimating 3D hand-object contacts from a single image.
In the second part, we introduce a novel method to diffuse estimated contact states from the hand mesh surface to nearby 3D space (see the sketch after this list).
arXiv Detail & Related papers (2023-05-31T17:59:26Z)
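The last entry above describes diffusing estimated contact states from the hand mesh surface into nearby 3D space. As a rough illustration of that general idea, and not the paper's actual method, the sketch below propagates per-vertex contact probabilities to arbitrary 3D query points with a distance-weighted (Gaussian) scheme; the function name and the `sigma` value are assumptions.

```python
import torch

def diffuse_contact_to_space(mesh_vertices, contact_probs, query_points, sigma=0.02):
    """Propagate per-vertex contact probabilities from a mesh surface to 3D query points.

    Each query point aggregates contact evidence from nearby mesh vertices using a
    Gaussian distance falloff, so points close to contacting regions of the hand
    surface receive high contact scores.

    mesh_vertices: (V, 3), contact_probs: (V,), query_points: (Q, 3)
    Returns: (Q,) contact score per query point.
    """
    dists = torch.cdist(query_points, mesh_vertices)           # (Q, V) pairwise distances
    weights = torch.softmax(-dists ** 2 / (2 * sigma ** 2), dim=1)
    return weights @ contact_probs                              # distance-weighted average
```

The softmax makes this a normalized weighted average; a learned variant could instead regress contact fields from image and mesh features, which is closer to what such methods typically train end to end.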