Category-level Meta-learned NeRF Priors for Efficient Object Mapping
- URL: http://arxiv.org/abs/2503.01582v3
- Date: Tue, 29 Jul 2025 14:15:39 GMT
- Title: Category-level Meta-learned NeRF Priors for Efficient Object Mapping
- Authors: Saad Ejaz, Hriday Bavle, Laura Ribeiro, Holger Voos, Jose Luis Sanchez-Lopez
- Abstract summary: PRENOM is a Prior-based Efficient Neural Object Mapper. It integrates category-level priors with object-level NeRFs to enhance reconstruction efficiency and enable canonical object pose estimation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In 3D object mapping, category-level priors enable efficient object reconstruction and canonical pose estimation, requiring only a single prior per semantic category (e.g., chair, book, laptop, etc.). DeepSDF has been used predominantly as a category-level shape prior, but it struggles to reconstruct sharp geometry and is computationally expensive. In contrast, NeRFs capture fine details but have yet to be effectively integrated with category-level priors in a real-time multi-object mapping framework. To bridge this gap, we introduce PRENOM, a Prior-based Efficient Neural Object Mapper that integrates category-level priors with object-level NeRFs to enhance reconstruction efficiency and enable canonical object pose estimation. PRENOM gets to know objects on a first-name basis by meta-learning on synthetic reconstruction tasks generated from open-source shape datasets. To account for object category variations, it employs a multi-objective genetic algorithm to optimize the NeRF architecture for each category, balancing reconstruction quality and training time. Additionally, prior-based probabilistic ray sampling directs sampling toward expected object regions, accelerating convergence and improving reconstruction quality under constrained resources. Experimental results highlight the ability of PRENOM to achieve high-quality reconstructions while maintaining computational feasibility. Specifically, comparisons with prior-free NeRF-based approaches on a synthetic dataset show a 21% lower Chamfer distance. Furthermore, evaluations against other approaches using shape priors on a noisy real-world dataset indicate a 13% improvement averaged across all reconstruction metrics, and comparable pose and size estimation accuracy, while being trained for 5× less time. Code available at: https://github.com/snt-arg/PRENOM
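To make the prior-based probabilistic ray sampling mentioned in the abstract concrete, the following minimal sketch biases per-ray depth samples toward bins that a category prior marks as likely occupied. The function name, input shapes, and the inverse-CDF scheme are illustrative assumptions, not the released PRENOM implementation (see the linked repository for the actual code).

```python
import numpy as np

def prior_guided_ray_samples(bin_edges, prior_occupancy, n_samples, rng=None):
    """Draw depth samples along a ray, biased toward bins the category prior
    marks as likely occupied (illustrative sketch only).

    bin_edges:       (B+1,) monotonically increasing depths partitioning the ray.
    prior_occupancy: (B,)   non-negative coarse occupancy weights from the prior.
    """
    rng = rng or np.random.default_rng()
    weights = prior_occupancy + 1e-5            # keep every bin reachable
    pdf = weights / weights.sum()
    cdf = np.concatenate([[0.0], np.cumsum(pdf)])

    u = rng.uniform(size=n_samples)             # inverse-CDF sampling over bins
    idx = np.clip(np.searchsorted(cdf, u, side="right") - 1, 0, len(pdf) - 1)

    # place each sample uniformly inside its selected bin
    lo, hi = bin_edges[idx], bin_edges[idx + 1]
    t = (u - cdf[idx]) / np.maximum(cdf[idx + 1] - cdf[idx], 1e-8)
    return np.sort(lo + t * (hi - lo))

# toy usage: a prior expecting the object roughly 2-3 m along the ray
edges = np.linspace(0.5, 5.0, 33)
occ = np.exp(-((0.5 * (edges[:-1] + edges[1:]) - 2.5) ** 2) / 0.1)
depths = prior_guided_ray_samples(edges, occ, n_samples=32)
```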
Related papers
- Online 3D Scene Reconstruction Using Neural Object Priors [83.14204014687938]
This paper addresses the problem of reconstructing a scene online at the level of objects given an RGB-D video sequence.
We propose a feature grid mechanism to continuously update object-centric neural implicit representations as new object parts are revealed.
Our approach outperforms state-of-the-art neural implicit models for this task in terms of reconstruction accuracy and completeness.
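A common way to realize a feature-grid mechanism like the one summarized above is to store per-object latent features on a dense 3D grid and query them by trilinear interpolation as new observations arrive. The sketch below shows only that query step; the shapes, the [-1, 1]^3 convention, and the function name are assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def query_feature_grid(grid, points):
    """Trilinearly interpolate per-object features at 3D query points.

    grid:   (C, D, H, W) feature volume over the object's normalized box.
    points: (N, 3) query coordinates in [-1, 1]^3, ordered (x, y, z),
            which grid_sample maps to the (W, H, D) axes.
    """
    vol = grid.unsqueeze(0)                      # (1, C, D, H, W)
    coords = points.view(1, -1, 1, 1, 3)         # (1, N, 1, 1, 3)
    feats = F.grid_sample(vol, coords, mode="bilinear", align_corners=True)
    return feats.view(grid.shape[0], -1).t()     # (N, C)

feature_grid = torch.randn(16, 32, 32, 32)          # toy 16-channel grid
queries = torch.rand(128, 3) * 2 - 1                # points in [-1, 1]^3
features = query_feature_grid(feature_grid, queries)  # (128, 16)
```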
arXiv Detail & Related papers (2025-03-24T17:09:36Z) - DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting [6.736949053673975]
We propose a novel object-SLAM system that seamlessly integrates object pose estimation and reconstruction.
DQO-MAP achieves outstanding performance in terms of precision, reconstruction quality, and computational efficiency.
arXiv Detail & Related papers (2025-03-04T02:55:07Z) - OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at nearly 2x the FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
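As an illustration of predicting occupancy with a set of learnable queries, the sketch below decodes each query embedding into a 3D location and class logits after one cross-attention step over scene features. All dimensions, layer choices, and names are assumptions and do not reproduce the OPUS architecture.

```python
import torch
import torch.nn as nn

class SparseOccupancyHead(nn.Module):
    """Illustrative query-set head: each learnable query predicts one occupied
    point location plus class logits (not the OPUS architecture itself)."""

    def __init__(self, num_queries=600, dim=256, num_classes=17):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)   # learnable query set
        self.point_head = nn.Linear(dim, 3)             # (x, y, z)
        self.class_head = nn.Linear(dim, num_classes)   # semantic logits

    def forward(self, scene_feats):
        """scene_feats: (B, T, dim) tokens from some image/scene encoder."""
        q = self.queries.weight.unsqueeze(0).expand(scene_feats.size(0), -1, -1)
        attn = torch.softmax(q @ scene_feats.transpose(1, 2) / q.size(-1) ** 0.5, dim=-1)
        q = q + attn @ scene_feats                      # one cross-attention step
        return self.point_head(q), self.class_head(q)

points, logits = SparseOccupancyHead()(torch.randn(2, 100, 256))
```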
arXiv Detail & Related papers (2024-09-14T07:44:22Z) - VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis [73.50359502037232]
VoxNeRF is a novel approach to enhance the quality and efficiency of neural indoor reconstruction and novel view synthesis.
We propose an efficient voxel-guided sampling technique that selectively allocates computational resources to the most relevant segments of rays.
Our approach is validated with extensive experiments on ScanNet and ScanNet++.
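A minimal sketch of voxel-guided sampling, assuming each ray has already been intersected with a coarse occupancy grid: the per-ray sample budget is spent only inside segments that cross occupied voxels. The input layout and fallback behaviour are assumptions, not VoxNeRF's actual data structures.

```python
import numpy as np

def voxel_guided_samples(seg_bounds, seg_occupied, budget, rng=None):
    """Distribute a per-ray sample budget over occupied voxel segments only.

    seg_bounds:   (S, 2) entry/exit depths of the ray through each voxel.
    seg_occupied: (S,)   boolean occupancy of those voxels.
    budget:       approximate number of point samples for this ray.
    """
    rng = rng or np.random.default_rng()
    active = np.flatnonzero(seg_occupied)
    if active.size == 0:                        # nothing occupied: fall back to uniform
        lo, hi = seg_bounds[0, 0], seg_bounds[-1, 1]
        return np.sort(rng.uniform(lo, hi, budget))

    lengths = seg_bounds[active, 1] - seg_bounds[active, 0]
    counts = np.maximum(1, np.round(budget * lengths / lengths.sum()).astype(int))
    samples = [rng.uniform(*seg_bounds[i], size=n) for i, n in zip(active, counts)]
    return np.sort(np.concatenate(samples))

# toy usage: 8 unit-length segments, three of them occupied
bounds = np.column_stack([np.arange(8, dtype=float), np.arange(1, 9, dtype=float)])
occ = np.array([0, 0, 1, 1, 0, 1, 0, 0], dtype=bool)
depths = voxel_guided_samples(bounds, occ, budget=24)
```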
arXiv Detail & Related papers (2023-11-09T11:32:49Z) - Variable Radiance Field for Real-World Category-Specific Reconstruction from Single Image [25.44715538841181]
Reconstructing category-specific objects using Neural Radiance Field (NeRF) from a single image is a promising yet challenging task.
We propose Variable Radiance Field (VRF), a novel framework capable of efficiently reconstructing category-specific objects.
VRF achieves state-of-the-art performance in both reconstruction quality and computational efficiency.
arXiv Detail & Related papers (2023-06-08T12:12:02Z) - DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator [53.57431705309919]
ImageNet pre-trained deep neural networks (DNNs) show notable transferability for building effective image quality assessment (IQA) models.
We develop a novel full-reference IQA (FR-IQA) model based exclusively on pre-trained DNN features.
We conduct comprehensive experiments to demonstrate the superiority of the proposed quality model on five standard IQA datasets.
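The core quantity behind such a distance-correlation quality model is the empirical distance correlation between two stacks of deep features; the sketch below computes that statistic directly. How DeepDC extracts and aggregates features across layers is not shown, and the variable names are illustrative.

```python
import numpy as np

def distance_correlation(x, y):
    """Empirical distance correlation between feature stacks x: (n, p), y: (n, q)
    sharing the same number of rows n (e.g., spatial positions or patches)."""
    def centered_dist(z):
        d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
        return d - d.mean(0, keepdims=True) - d.mean(1, keepdims=True) + d.mean()

    a, b = centered_dist(x), centered_dist(y)
    dcov2_xy = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2_xy, 0.0) / denom))

# toy usage: features of a reference and a distorted image from some DNN layer
ref_feats = np.random.randn(64, 512)
dist_feats = ref_feats + 0.3 * np.random.randn(64, 512)
score = distance_correlation(ref_feats, dist_feats)   # closer to 1 = more similar
```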
arXiv Detail & Related papers (2022-11-09T14:57:27Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
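One way to see the generalization claim is that applying a structured sparsity mask to a dense convolution kernel recovers the familiar efficient layers as special cases; the masks below are illustrative and are not the paper's actual SSC parameterization.

```python
import torch
import torch.nn.functional as F

def masked_conv2d(x, weight, mask, bias=None):
    """Dense convolution with a structured sparsity mask on its kernel."""
    return F.conv2d(x, weight * mask, bias, padding=weight.shape[-1] // 2)

c, k = 8, 3
x = torch.randn(1, c, 16, 16)
w = torch.randn(c, c, k, k)

# depthwise-like mask: output channel o only sees input channel o
depthwise_mask = torch.eye(c).view(c, c, 1, 1).expand(c, c, k, k)
# pointwise-like mask: only the kernel centre is kept (1x1 channel mixing)
pointwise_mask = torch.zeros(c, c, k, k)
pointwise_mask[:, :, k // 2, k // 2] = 1.0

y_dw = masked_conv2d(x, w, depthwise_mask)
# sanity check: identical to a dedicated depthwise (grouped) convolution
w_diag = w[torch.arange(c), torch.arange(c)].unsqueeze(1)      # (c, 1, k, k)
assert torch.allclose(y_dw, F.conv2d(x, w_diag, padding=k // 2, groups=c), atol=1e-5)
y_pw = masked_conv2d(x, w, pointwise_mask)                     # behaves like a 1x1 conv
```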
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Learning to Complete Object Shapes for Object-level Mapping in Dynamic Scenes [30.500198859451434]
We propose a novel object-level mapping system that can simultaneously segment, track, and reconstruct objects in dynamic scenes.
It can further predict and complete their full geometries by conditioning on reconstructions from depth inputs and a category-level shape prior.
We evaluate its effectiveness by quantitatively and qualitatively testing it in both synthetic and real-world sequences.
arXiv Detail & Related papers (2022-08-09T22:56:33Z) - CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement [52.41884119329864]
CATRE, a category-level object pose and size refiner, iteratively enhances pose estimates from point clouds to produce accurate results.
Our approach remarkably outperforms state-of-the-art methods on the REAL275, CAMERA25, and LM benchmarks while running at up to 85.32 Hz.
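Iterative refinement of this kind typically follows a predict-and-compose loop; the sketch below shows only that loop structure around a placeholder correction network (`predict_delta` is a stand-in, not CATRE's model).

```python
import numpy as np

def refine_pose(predict_delta, obs_points, model_points, pose, scale, iters=4):
    """Generic iterative refinement loop: a network predicts a small pose/scale
    correction from the two point clouds, which is composed onto the estimate.

    pose:  4x4 homogeneous object-to-camera transform estimate.
    scale: 3-vector object size estimate.
    """
    for _ in range(iters):
        # express the model points under the current estimate
        aligned = (pose[:3, :3] @ (model_points * scale).T).T + pose[:3, 3]
        d_rot, d_trans, d_scale = predict_delta(obs_points, aligned)
        new_pose = np.eye(4)
        new_pose[:3, :3] = d_rot @ pose[:3, :3]
        new_pose[:3, 3] = pose[:3, 3] + d_trans
        pose, scale = new_pose, scale * d_scale
    return pose, scale

# dummy "network" predicting no correction, just to show the interface
identity_delta = lambda obs, mdl: (np.eye(3), np.zeros(3), np.ones(3))
pts = np.random.rand(512, 3)
pose, scale = refine_pose(identity_delta, pts, pts, np.eye(4), np.ones(3))
```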
arXiv Detail & Related papers (2022-07-17T05:55:00Z) - Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z) - Next-best-view Regression using a 3D Convolutional Neural Network [0.9449650062296823]
We propose a data-driven approach to address the next-best-view problem.
The proposed approach trains a 3D convolutional neural network with previous reconstructions in order to regress the position of the next-best-view.
We validate the proposed approach using two groups of experiments.
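A minimal sketch of regressing the next-best-view with a 3D CNN: a small network maps a partial occupancy grid to a predicted sensor position. The layer sizes, input resolution, and output parameterization are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class NBVRegressor(nn.Module):
    """Minimal 3D CNN mapping a partial reconstruction (occupancy grid) to a
    next-best-view position; layer sizes are illustrative assumptions only."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(64, 3)     # regress a 3D sensor position

    def forward(self, occupancy):        # occupancy: (B, 1, D, H, W)
        return self.head(self.features(occupancy).flatten(1))

grid = torch.rand(2, 1, 32, 32, 32)      # toy partial reconstructions
next_view = NBVRegressor()(grid)         # (2, 3) predicted view positions
```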
arXiv Detail & Related papers (2021-01-23T01:50:26Z) - Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
However, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)