HPR3D: Hierarchical Proxy Representation for High-Fidelity 3D Reconstruction and Controllable Editing
- URL: http://arxiv.org/abs/2507.11971v1
- Date: Wed, 16 Jul 2025 07:09:05 GMT
- Title: HPR3D: Hierarchical Proxy Representation for High-Fidelity 3D Reconstruction and Controllable Editing
- Authors: Tielong Wang, Yuxuan Xiong, Jinfan Liu, Zhifan Zhang, Ye Chen, Yue Shi, Bingbing Ni
- Abstract summary: 3D representations like meshes, voxels, point clouds, and NeRF-based neural implicit fields exhibit significant limitations. We introduce a novel 3D Hierarchical Proxy Node representation. Our method's expressive efficiency, high-fidelity rendering quality, and superior editability are shown.
- Score: 33.554101922316946
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current 3D representations like meshes, voxels, point clouds, and NeRF-based neural implicit fields exhibit significant limitations: they are often task-specific, lacking universal applicability across reconstruction, generation, editing, and driving. While meshes offer high precision, their dense vertex data complicates editing; NeRFs deliver excellent rendering but suffer from structural ambiguity, hindering animation and manipulation; all representations inherently struggle with the trade-off between data complexity and fidelity. To overcome these issues, we introduce a novel 3D Hierarchical Proxy Node representation. Its core innovation lies in representing an object's shape and texture via a sparse set of hierarchically organized (tree-structured) proxy nodes distributed on its surface and interior. Each node stores local shape and texture information (implicitly encoded by a small MLP) within its neighborhood. Querying any 3D coordinate's properties involves efficient neural interpolation and lightweight decoding from relevant nearby and parent nodes. This framework yields a highly compact representation where nodes align with local semantics, enabling direct drag-and-edit manipulation, and offers scalable quality-complexity control. Extensive experiments across 3D reconstruction and editing demonstrate our method's expressive efficiency, high-fidelity rendering quality, and superior editability.
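The query step described in the abstract (interpolating from nearby nodes and their parents, then decoding) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two-level tree, the names `node_feature_with_ancestors` and `query`, the inverse-distance weighting that stands in for the paper's neural interpolation, the single shared decoder that stands in for the per-node MLPs, and the choice of outputs (a signed-distance-like scalar and an RGB color).

```python
import numpy as np

rng = np.random.default_rng(0)
FEATURE_DIM = 8  # hypothetical size of each node's latent feature

# Hypothetical two-level tree: 4 coarse nodes, each with 4 fine children.
coarse_pos = rng.uniform(-1.0, 1.0, size=(4, 3))
fine_pos = np.repeat(coarse_pos, 4, axis=0) + rng.normal(0.0, 0.1, size=(16, 3))
positions = np.concatenate([coarse_pos, fine_pos])                       # (20, 3)
parents = np.concatenate([np.full(4, -1), np.repeat(np.arange(4), 4)])   # -1 = no parent
features = rng.normal(size=(20, FEATURE_DIM))

# Tiny decoder with random weights (assumption: shared by all nodes for brevity).
W1 = rng.normal(scale=0.5, size=(FEATURE_DIM + 3, 16))
b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 4))
b2 = np.zeros(4)


def node_feature_with_ancestors(idx):
    """Sum a node's feature with the features of all its ancestors."""
    total = np.zeros(FEATURE_DIM)
    while idx != -1:
        total += features[idx]
        idx = parents[idx]
    return total


def query(point, k=3):
    """Return (signed-distance-like value, RGB in [0, 1]) at a 3D point."""
    # In this two-level tree, the finest nodes are exactly those with a parent.
    leaf_ids = np.flatnonzero(parents != -1)
    dists = np.linalg.norm(positions[leaf_ids] - point, axis=1)
    nearest = np.argsort(dists)[:k]
    # Inverse-distance weights, normalized to sum to 1.
    weights = 1.0 / (dists[nearest] + 1e-8)
    weights /= weights.sum()
    blended = sum(
        w * node_feature_with_ancestors(leaf_ids[i])
        for w, i in zip(weights, nearest)
    )
    # Lightweight decoding: one ReLU hidden layer, then a linear output layer.
    hidden = np.maximum(0.0, np.concatenate([blended, point]) @ W1 + b1)
    out = hidden @ W2 + b2
    rgb = 1.0 / (1.0 + np.exp(-out[1:]))  # sigmoid keeps color in [0, 1]
    return out[0], rgb


sdf, rgb = query(np.array([0.1, -0.2, 0.3]))
print(sdf, rgb)
```

In a sketch like this, moving an entry of `positions` shifts the region that node influences, which is one plausible reading of the drag-and-edit manipulation the abstract mentions; the weights are random, so the printed values carry no meaning.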
Related papers
- Self-Attention Based Multi-Scale Graph Auto-Encoder Network of 3D Meshes [1.573038298640368]
3D Geometric Mesh Network (3DGeoMeshNet) is a novel GCN-based framework that uses anisotropic convolution layers to learn both global and local features directly in the spatial domain. Our architecture features a multi-scale encoder-decoder structure, where separate global and local pathways capture both large-scale geometric structures and fine-grained local details.
arXiv Detail & Related papers (2025-07-07T07:36:03Z)
- PrITTI: Primitive-based Generation of Controllable and Editable 3D Semantic Scenes [30.417675568919552]
Large-scale 3D semantic scene generation has predominantly relied on voxel-based representations. Primitives represent semantic entities using compact, coarse 3D structures that are easy to manipulate and compose. PrITTI is a latent diffusion-based framework that leverages primitives as the main foundational elements for generating compositional, controllable, and editable scene layouts.
arXiv Detail & Related papers (2025-06-23T20:47:18Z)
- SplatMesh: Interactive 3D Segmentation and Editing Using Mesh-Based Gaussian Splatting [86.50200613220674]
A key challenge in 3D-based interactive editing is the absence of an efficient representation that balances diverse modifications with high-quality view synthesis under a given memory constraint. We introduce SplatMesh, a novel fine-grained interactive 3D segmentation and editing algorithm that integrates 3D Gaussian Splatting with a precomputed mesh. By segmenting and editing the simplified mesh, we can effectively edit the Gaussian splats as well, as shown in extensive experiments on real and synthetic datasets.
arXiv Detail & Related papers (2023-12-26T02:50:42Z)
- Neural Impostor: Editing Neural Radiance Fields with Explicit Shape Manipulation [49.852533321916844]
We introduce Neural Impostor, a hybrid representation incorporating an explicit tetrahedral mesh alongside a multigrid implicit field.
Our framework bridges the explicit shape manipulation and the geometric editing of implicit fields by utilizing multigrid barycentric coordinate encoding.
We show the robustness and adaptability of our system through diverse examples and experiments, including the editing of both synthetic objects and real captured data.
arXiv Detail & Related papers (2023-10-09T04:07:00Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels.
Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
arXiv Detail & Related papers (2021-11-08T05:29:35Z)
- Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes [77.6741486264257]
We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs.
We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works.
arXiv Detail & Related papers (2021-01-26T18:50:22Z)
- Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv).
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
- Convolutional Occupancy Networks [88.48287716452002]
We propose Convolutional Occupancy Networks, a more flexible implicit representation for detailed reconstruction of objects and 3D scenes.
By combining convolutional encoders with implicit occupancy decoders, our model incorporates inductive biases, enabling structured reasoning in 3D space.
We empirically find that our method enables the fine-grained implicit 3D reconstruction of single objects, scales to large indoor scenes, and generalizes well from synthetic to real data.
arXiv Detail & Related papers (2020-03-10T10:17:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.