Related papers: Windowed-FourierMixer: Enhancing Clutter-Free Room Modeling with Fourier Transform

URL: http://arxiv.org/abs/2402.18287v1
Date: Wed, 28 Feb 2024 12:27:28 GMT
Title: Windowed-FourierMixer: Enhancing Clutter-Free Room Modeling with Fourier Transform
Authors: Bruno Henriques, Benjamin Allaert, Jean-Philippe Vandeborre
Abstract summary: Inpainting indoor environments from a single image plays a crucial role in modeling the internal structure of interior spaces. We propose an innovative approach based on a U-Former architecture and a new Windowed-FourierMixer block. This new architecture proves advantageous for tasks involving indoor scenes where symmetry is prevalent.
Score: 3.864321514889099
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: With the growing demand for immersive digital applications, the need to understand and reconstruct 3D scenes has significantly increased. In this context, inpainting indoor environments from a single image plays a crucial role in modeling the internal structure of interior spaces as it enables the creation of textured and clutter-free reconstructions. While recent methods have shown significant progress in room modeling, they rely on constraining layout estimators to guide the reconstruction process. These methods are highly dependent on the performance of the structure estimator and its generative ability in heavily occluded environments. In response to these issues, we propose an innovative approach based on a U-Former architecture and a new Windowed-FourierMixer block, resulting in a unified, single-phase network capable of effectively handling human-made periodic structures such as indoor spaces. This new architecture proves advantageous for tasks involving indoor scenes where symmetry is prevalent, allowing the model to effectively capture features such as horizon/ceiling height lines and cuboid-shaped rooms. Experiments show the proposed approach outperforms current state-of-the-art methods on the Structured3D dataset, demonstrating superior performance in both quantitative metrics and qualitative results. Code and models will be made publicly available.

Related papers

DreamPolish: Domain Score Distillation With Progressive Geometry Generation [66.94803919328815]
We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures. In the geometry construction phase, our approach leverages multiple neural representations to enhance the stability of the synthesis process. In the texture generation phase, we introduce a novel score distillation objective, namely domain score distillation (DSD), to guide neural representations toward such a domain.
arXiv Detail & Related papers (2024-11-03T15:15:01Z)
Exploiting Semantic Scene Reconstruction for Estimating Building Envelope Characteristics [6.382787013075262]
We propose BuildNet3D, a novel framework to estimate geometric building characteristics from 2D image inputs. Our framework is evaluated on a range of complex building structures, demonstrating high accuracy and generalizability in estimating window-to-wall ratio and building footprint.
arXiv Detail & Related papers (2024-10-29T13:29:01Z)
Exploring the design space of deep-learning-based weather forecasting systems [56.129148006412855]
This paper systematically analyzes the impact of different design choices on deep-learning-based weather forecasting systems. We study fixed-grid architectures such as UNet, fully convolutional architectures, and transformer-based models. We propose a hybrid system that combines the strong performance of fixed-grid models with the flexibility of grid-invariant architectures.
arXiv Detail & Related papers (2024-10-09T22:25:50Z)
VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis [51.49008959209671]
We introduce VoxNeRF, a novel approach that leverages volumetric representations to enhance the quality and efficiency of indoor view synthesis. We employ multi-resolution hash grids to adaptively capture spatial features, effectively managing occlusions and the intricate geometry of indoor scenes. We validate our approach against three public indoor datasets and demonstrate that VoxNeRF outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-11-09T11:32:49Z)
Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images [78.56114271538061]
We introduce an explicit point-based human reconstruction framework called HaP. Our approach features fully explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space. Our results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z)
Exploiting Multiple Priors for Neural 3D Indoor Reconstruction [15.282699095607594]
We propose a novel neural implicit modeling method that leverages multiple regularization strategies to achieve better reconstructions of large indoor environments. Experimental results show that our approach produces state-of-the-art 3D reconstructions in challenging indoor scenarios.
arXiv Detail & Related papers (2023-09-13T15:23:43Z)
Neural 3D Reconstruction in the Wild [86.6264706256377]
We introduce a new method that enables efficient and accurate surface reconstruction from Internet photo collections. We present a new benchmark and protocol for evaluating reconstruction performance on such in-the-wild scenes.
arXiv Detail & Related papers (2022-05-25T17:59:53Z)
Reconstructing Compact Building Models from Point Clouds Using Deep Implicit Fields [4.683612295430956]
We present a novel framework for reconstructing compact, watertight, polygonal building models from point clouds. Experiments on both synthetic and real-world point clouds have demonstrated that, with our neural-guided strategy, high-quality building models can be obtained with significant advantages in fidelity, compactness, and computational efficiency.
arXiv Detail & Related papers (2021-12-24T21:32:32Z)
PanoDR: Spherical Panorama Diminished Reality for Indoor Scenes [0.0]
Diminished Reality (DR) fulfills the requirement of such applications, to remove existing objects in the scene. To preserve the 'reality' in indoor (re-)planning applications, preserving the scene's structure is crucial. We propose a model that initially predicts the structure of an indoor scene and then uses it to guide the reconstruction of an empty -- background only -- representation of the same scene.
arXiv Detail & Related papers (2021-06-01T12:56:53Z)
Enhanced 3D Human Pose Estimation from Videos by using Attention-Based Neural Network with Dilated Convolutions [12.900524511984798]
We show a systematic design for how conventional networks and other forms of constraints can be incorporated into the attention framework. We achieve this by adapting the temporal receptive field via a multi-scale structure of dilated convolutions. Our method achieves state-of-the-art performance and outperforms existing methods by reducing the mean per joint position error to 33.4 mm on the Human3.6M dataset.
arXiv Detail & Related papers (2021-03-04T17:26:51Z)
S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structures that are capable of simultaneously exploiting both modular and temporal structures. We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)
Vid2Curve: Simultaneous Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video [90.93141123721713]
Thin structures, such as wire-frame sculptures, fences, cables, power lines, and tree branches, are common in the real world. It is extremely challenging to acquire their 3D digital models using traditional image-based or depth-based reconstruction methods because thin structures often lack distinct point features and have severe self-occlusion. We propose the first approach that simultaneously estimates camera motion and reconstructs the geometry of complex 3D thin structures in high quality from a color video captured by a handheld camera.
arXiv Detail & Related papers (2020-05-07T10:39:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.