Related papers: MCTS with Refinement for Proposals Selection Games in Scene Understanding

URL: http://arxiv.org/abs/2207.03204v1
Date: Thu, 7 Jul 2022 10:15:54 GMT
Title: MCTS with Refinement for Proposals Selection Games in Scene Understanding
Authors: Sinisa Stekovic, Mahdi Rad, Alireza Moradi, Friedrich Fraundorfer, and Vincent Lepetit
Abstract summary: We propose a novel method applicable in many scene understanding problems that adapts the Monte Carlo Tree Search (MCTS) algorithm. From a generated pool of proposals, our method jointly selects and optimizes proposals that maximize the objective term. Our method shows high performance on the Matterport3D-Layout dataset without introducing hard constraints on room layout configurations.
Score: 32.92475660892122
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a novel method applicable in many scene understanding problems that adapts the Monte Carlo Tree Search (MCTS) algorithm, originally designed to learn to play games of high-state complexity. From a generated pool of proposals, our method jointly selects and optimizes proposals that maximize the objective term. In our first application for floor plan reconstruction from point clouds, our method selects and refines the room proposals, modelled as 2D polygons, by optimizing an objective function combining the fitness as predicted by a deep network and regularizing terms on the room shapes. We also introduce a novel differentiable method for rendering the polygonal shapes of these proposals. Our evaluations on the recent and challenging Structured3D and Floor-SP datasets show significant improvements over the state-of-the-art, without imposing hard constraints or assumptions on the floor plan configurations. In our second application, we extend our approach to reconstruct general 3D room layouts from a color image and obtain accurate room layouts. We also show that our differentiable renderer can easily be extended for rendering 3D planar polygons and polygon embeddings. Our method shows high performance on the Matterport3D-Layout dataset, without introducing hard constraints on room layout configurations.
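
Illustrative sketch (not from the paper): the idea of treating proposal selection as a game searched by MCTS can be shown with a toy example. Everything here is an assumption made for illustration: the proposals are 1D intervals standing in for 2D room polygons, the fitness values are hand-picked rather than predicted by a network, the objective is a fitness sum minus an overlap penalty treated as a score to be maximized (as in the abstract summary), and the names `PROPOSALS`, `objective`, and `search` are hypothetical. The refinement step and the differentiable renderer are left out entirely.

```python
import math
import random

# Hypothetical toy proposals: (start, end, fitness) intervals standing in for room polygons.
PROPOSALS = [(0, 4, 0.9), (3, 7, 0.6), (4, 8, 0.8), (8, 10, 0.5), (0, 10, 1.0)]


def overlap(a, b):
    """Length of the intersection of two intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def objective(selected):
    """Assumed objective: summed fitness minus an overlap penalty (to be maximized)."""
    chosen = [PROPOSALS[i] for i in selected]
    fitness = sum(p[2] for p in chosen)
    penalty = sum(overlap(a, b) for k, a in enumerate(chosen) for b in chosen[k + 1:])
    return fitness - 0.5 * penalty


class Node:
    """Tree node at depth d: proposals 0..d-1 have been decided (taken or skipped)."""

    def __init__(self, depth, selected, parent=None):
        self.depth = depth
        self.selected = selected      # tuple of indices taken so far
        self.parent = parent
        self.children = {}            # decision (True = take, False = skip) -> Node
        self.visits = 0
        self.total = 0.0

    def is_terminal(self):
        return self.depth == len(PROPOSALS)


def ucb_child(node, c=1.0):
    """Pick the child with the highest UCB1 value."""
    return max(
        node.children.values(),
        key=lambda ch: ch.total / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def search(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node(0, ())
    best_sel, best_score = (), objective(())
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.is_terminal() and len(node.children) == 2:
            node = ucb_child(node)
        # Expansion: add one untried decision for the next proposal.
        if not node.is_terminal():
            take = rng.choice([d for d in (True, False) if d not in node.children])
            sel = node.selected + ((node.depth,) if take else ())
            node.children[take] = Node(node.depth + 1, sel, node)
            node = node.children[take]
        # Simulation: decide the remaining proposals at random.
        sel = list(node.selected)
        for i in range(node.depth, len(PROPOSALS)):
            if rng.random() < 0.5:
                sel.append(i)
        score = objective(sel)
        if score > best_score:
            best_sel, best_score = tuple(sel), score
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += score
            node = node.parent
    return best_sel, best_score
```

Calling `selection, score = search()` returns the best complete selection seen during the search together with its objective value. In this toy setup the overlap penalty plays the role of a soft regularizer: overlapping proposals are discouraged by the score rather than forbidden by a hard constraint.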

Related papers

Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model [15.892685514932323]
We introduce Plane-DUSt3R, a novel method for multi-view room layout estimation. Plane-DUSt3R incorporates the DUSt3R framework and fine-tunes on a room layout dataset (Structured3D) with a modified objective to estimate structural planes. By generating uniform and parsimonious results, Plane-DUSt3R enables room layout estimation with only a single post-processing step and 2D detection results.
arXiv Detail & Related papers (2025-02-24T02:14:19Z)
Prim2Room: Layout-Controllable Room Mesh Generation from Primitives [90.5012354166981]
Prim2Room is a framework for controllable room mesh generation leveraging 2D layout conditions and 3D primitive retrieval. We introduce an adaptive viewpoint selection algorithm that allows the system to generate the furniture texture and geometry from more favorable views. Our method not only enhances the accuracy and aesthetic appeal of generated 3D scenes but also provides a user-friendly platform for detailed room design.
arXiv Detail & Related papers (2024-09-09T07:25:47Z)
FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation [18.157827697752317]
We introduce a novel method called FRI-Net for 2D floorplan reconstruction from 3D point cloud. By incorporating geometric priors of room layouts in floorplans into our training strategy, the generated room polygons are more geometrically regular. Our method demonstrates improved performance compared to state-of-the-art methods, validating the effectiveness of our proposed representation for floorplan reconstruction.
arXiv Detail & Related papers (2024-07-15T13:01:44Z)
360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results. We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics. We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations. Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries [27.564355569013706]
We develop a novel Transformer architecture that generates polygons of multiple rooms in parallel. Our method achieves a new state-of-the-art for two challenging datasets, Structured3D and SceneCAD. It can readily be extended to predict additional information, i.e., semantic room types and architectural elements like doors and windows.
arXiv Detail & Related papers (2022-11-28T18:59:09Z)
Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes [50.317223783035075]
We present a new framework to reconstruct holistic 3D indoor scenes from single-view images. We propose an instance-aligned implicit function (InstPIFu) for detailed object reconstruction. Our code and model will be made publicly available.
arXiv Detail & Related papers (2022-07-18T14:54:57Z)
Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images. Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods. The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama. We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable. Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans [41.31546857809168]
We propose a novel method for reconstructing floor plans from noisy 3D point clouds. Our main contribution is a principled approach that relies on the Monte Carlo Tree Search (MCTS) algorithm. We evaluate our method on the recent and challenging Structured3D and Floor-SP datasets.
arXiv Detail & Related papers (2021-03-20T11:36:49Z)
General 3D Room Layout from a Single View by Render-and-Compare [36.94817376590415]
We present a novel method to reconstruct the 3D layout of a room from a single perspective view. Our dataset consists of 293 images from ScanNet, which we annotated with precise 3D layouts.
arXiv Detail & Related papers (2020-01-07T16:14:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.