MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis
- URL: http://arxiv.org/abs/2311.02778v2
- Date: Tue, 19 Mar 2024 15:05:22 GMT
- Title: MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis
- Authors: Xuqian Ren, Wenjia Wang, Dingding Cai, Tuuli Tuominen, Juho Kannala, Esa Rahtu
- Abstract summary: We propose a real-world Multi-Sensor Hybrid Room dataset (MuSHRoom).
Our dataset presents exciting challenges and requires state-of-the-art methods to be cost-effective, robust to noisy data and devices.
We benchmark several famous pipelines on our dataset for joint 3D mesh reconstruction and novel view synthesis.
- Score: 26.710960922302124
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Metaverse technologies demand accurate, real-time, and immersive modeling on consumer-grade hardware for both non-human perception (e.g., drone/robot/autonomous car navigation) and immersive technologies like AR/VR, requiring both structural accuracy and photorealism. However, there exists a knowledge gap in how to apply geometric reconstruction and photorealism modeling (novel view synthesis) in a unified framework. To address this gap and promote the development of robust and immersive modeling and rendering with consumer-grade devices, we propose a real-world Multi-Sensor Hybrid Room Dataset (MuSHRoom). Our dataset presents exciting challenges and requires state-of-the-art methods to be cost-effective, robust to noisy data and devices, and able to jointly learn 3D reconstruction and novel view synthesis instead of treating them as separate tasks, making them ideal for real-world applications. We benchmark several famous pipelines on our dataset for joint 3D mesh reconstruction and novel view synthesis. Our dataset and benchmark show great potential in promoting improvements in fusing 3D reconstruction and high-quality rendering in a robust and computationally efficient end-to-end fashion. The dataset and code are available at the project website: https://xuqianren.github.io/publications/MuSHRoom/.
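As a concrete illustration of what joint evaluation of the two tasks involves, the sketch below computes a photometric novel-view-synthesis score (PSNR) together with a symmetric Chamfer distance between point sets sampled from a predicted and a reference mesh. This is a minimal, hypothetical NumPy example: the function names, array shapes, and random stand-in data are placeholders and are not taken from the actual MuSHRoom benchmark code.

```python
# Minimal sketch of joint evaluation: PSNR for novel view synthesis and a
# symmetric Chamfer distance for 3D reconstruction. Hypothetical example only;
# names, shapes, and random stand-in data are not from the MuSHRoom benchmark.
import numpy as np


def psnr(rendered: np.ndarray, reference: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio between a rendered view and a captured image."""
    mse = np.mean((rendered.astype(np.float64) - reference.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)


def chamfer_distance(pred_pts: np.ndarray, gt_pts: np.ndarray) -> float:
    """Symmetric Chamfer distance between two (N, 3) point sets (brute force)."""
    d2 = np.sum((pred_pts[:, None, :] - gt_pts[None, :, :]) ** 2, axis=-1)
    return float(np.sqrt(d2.min(axis=1)).mean() + np.sqrt(d2.min(axis=0)).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for a rendered test view and the held-out captured frame.
    rendered = rng.random((240, 320, 3))
    reference = np.clip(rendered + 0.01 * rng.standard_normal((240, 320, 3)), 0.0, 1.0)
    # Stand-ins for points sampled from the predicted and reference meshes.
    pred_pts = rng.random((1024, 3))
    gt_pts = rng.random((1024, 3))
    print(f"PSNR: {psnr(rendered, reference):.2f} dB")
    print(f"Chamfer distance: {chamfer_distance(pred_pts, gt_pts):.4f}")
```

In an actual benchmark the random stand-ins would be replaced by rendered test views and points sampled from a reconstructed mesh and the ground-truth scan; the metric definitions themselves stay the same.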
Related papers
- WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild [53.288327629960364]
We present a data-driven pipeline for efficient multi-hand reconstruction in the wild.
The proposed pipeline is composed of two components: a real-time fully convolutional hand localization and a high-fidelity transformer-based 3D hand reconstruction model.
Our approach outperforms previous methods in both efficiency and accuracy on popular 2D and 3D benchmarks.
arXiv Detail & Related papers (2024-09-18T18:46:51Z)
- Coral Model Generation from Single Images for Virtual Reality Applications [22.18438294137604]
This paper introduces a deep-learning framework that generates high-precision 3D coral models from a single image.
The project incorporates Explainable AI (XAI) to transform AI-generated models into interactive "artworks".
arXiv Detail & Related papers (2024-09-04T01:54:20Z)
- HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions [68.28684509445529]
We present HandBooster, a new approach to uplift the data diversity and boost the 3D hand-mesh reconstruction performance.
First, we construct versatile content-aware conditions to guide a diffusion model to produce realistic images with diverse hand appearances, poses, views, and backgrounds.
Then, we design a novel condition creator based on our similarity-aware distribution sampling strategies to deliberately find novel and realistic interaction poses that are distinctive from the training set.
arXiv Detail & Related papers (2024-03-27T13:56:08Z)
- VR-based generation of photorealistic synthetic data for training hand-object tracking models [0.0]
"blender-hoisynth" is an interactive synthetic data generator based on the Blender software.
It is possible for users to interact with objects via virtual hands using standard Virtual Reality hardware.
We replace large parts of the training data in the well-known DexYCB dataset with hoisynth data and train a state-of-the-art HOI reconstruction model with it.
arXiv Detail & Related papers (2024-01-31T14:32:56Z)
- EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction on Mobile Devices [53.28220984270622]
We present an implicit textured $\textbf{Surf}$ace reconstruction method on mobile devices.
Our method can reconstruct high-quality appearance and accurate mesh on both synthetic and real-world datasets.
Our method can be trained in just 1-2 hours using a single GPU and run on mobile devices at over 40 FPS (Frames Per Second).
arXiv Detail & Related papers (2023-11-16T11:30:56Z)
- NSLF-OL: Online Learning of Neural Surface Light Fields alongside Real-time Incremental 3D Reconstruction [0.76146285961466]
The paper proposes a novel Neural Surface Light Fields model that copes with the small range of view directions while producing a good result in unseen directions.
Our model learns Neural Surface Light Fields (NSLF) online alongside real-time 3D reconstruction, with a sequential data stream as the shared input.
In addition to online training, our model also provides real-time rendering after completing the data stream for visualization.
arXiv Detail & Related papers (2023-04-29T15:41:15Z)
- GINA-3D: Learning to Generate Implicit Neural Assets in the Wild [38.51391650845503]
GINA-3D is a generative model that uses real-world driving data from camera and LiDAR sensors to create 3D implicit neural assets of diverse vehicles and pedestrians.
We construct a large-scale object-centric dataset containing over 1.2M images of vehicles and pedestrians.
We demonstrate that it achieves state-of-the-art performance in quality and diversity for both generated images and geometries.
arXiv Detail & Related papers (2023-04-04T23:41:20Z)
- Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via a neural feature renderer (a minimal depth-unprojection sketch appears at the end of this page).
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
arXiv Detail & Related papers (2022-04-22T03:17:35Z)
- Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z)
- UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-body Decoupling 3D Model [58.70130563417079]
We introduce a new 3D human-body model with a series of decoupled parameters that could freely control the generation of the body.
Compared to the existing manually annotated DensePose-COCO dataset, the synthetic UltraPose has ultra dense image-to-surface correspondences without annotation cost or error.
arXiv Detail & Related papers (2021-10-28T16:24:55Z)
- UnrealROX+: An Improved Tool for Acquiring Synthetic Data from Virtual 3D Environments [14.453602631430508]
We present an improved version of UnrealROX, a tool for acquiring synthetic data from virtual 3D environments for robot vision tasks.
UnrealROX+ includes new features such as albedo generation and a Python API for interacting with the virtual environment from Deep Learning frameworks.
arXiv Detail & Related papers (2021-04-23T18:45:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.
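For the depth-based pipelines referenced above (for example, the "Learning Dynamic View Synthesis With Few RGBD Cameras" entry), the back-projection of an RGB-D frame into a colored point cloud can be illustrated as follows. This is a generic, hypothetical sketch assuming a simple pinhole camera model; the intrinsics and frame sizes are made-up values and the code is not taken from any of the papers listed.

```python
# Minimal sketch: back-project an RGB-D frame into a colored point cloud using
# pinhole intrinsics. All values are hypothetical; not code from the papers above.
import numpy as np


def unproject_rgbd(depth: np.ndarray, rgb: np.ndarray, K: np.ndarray):
    """Return (N, 3) points and (N, 3) colors for pixels with valid depth."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column/row indices
    valid = depth > 0
    z = depth
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    points = np.stack([x[valid], y[valid], z[valid]], axis=-1)
    colors = rgb[valid]
    return points, colors


if __name__ == "__main__":
    K = np.array([[525.0, 0.0, 160.0],    # assumed focal lengths and principal point
                  [0.0, 525.0, 120.0],
                  [0.0, 0.0, 1.0]])
    depth = np.full((240, 320), 1.5)       # synthetic flat depth plane at 1.5 m
    rgb = np.zeros((240, 320, 3), dtype=np.uint8)
    pts, cols = unproject_rgbd(depth, rgb, K)
    print(pts.shape, cols.shape)            # -> (76800, 3) (76800, 3)
```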