CasaGPT: Cuboid Arrangement and Scene Assembly for Interior Design
- URL: http://arxiv.org/abs/2504.19478v1
- Date: Mon, 28 Apr 2025 04:35:04 GMT
- Title: CasaGPT: Cuboid Arrangement and Scene Assembly for Interior Design
- Authors: Weitao Feng, Hang Zhou, Jing Liao, Li Cheng, Wenbo Zhou,
- Abstract summary: We present a novel approach for indoor scene synthesis, which learns to arrange decomposed cuboid primitives to represent 3D objects within a scene. Our approach, coined CasaGPT for Cuboid Arrangement and Scene Assembly, employs an autoregressive model to sequentially arrange cuboids, producing physically plausible scenes.
- Score: 35.11283253765395
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present a novel approach for indoor scene synthesis that learns to arrange decomposed cuboid primitives to represent 3D objects within a scene. Unlike conventional methods that use bounding boxes to determine the placement and scale of 3D objects, our approach leverages cuboids as a straightforward yet highly effective alternative for modeling objects. This allows for compact scene generation while minimizing object intersections. Our approach, coined CasaGPT for Cuboid Arrangement and Scene Assembly, employs an autoregressive model to sequentially arrange cuboids, producing physically plausible scenes. By applying rejection sampling during the fine-tuning stage to filter out scenes with object collisions, our model further reduces intersections and enhances scene quality. Additionally, we introduce a refined dataset, 3DFRONT-NC, which eliminates significant noise present in the original 3D-FRONT dataset. Extensive experiments on 3D-FRONT as well as our dataset demonstrate that our approach consistently outperforms state-of-the-art methods, enhancing the realism of generated scenes and providing a promising direction for 3D scene synthesis.
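The rejection-sampling step described in the abstract hinges on detecting collisions between cuboids. As a rough illustration only, here is a minimal sketch of such a filter, assuming axis-aligned cuboids; all names (Cuboid, overlap_volume, accept_scene) are hypothetical, and CasaGPT's cuboids may well be oriented, which would call for a separating-axis test rather than this simple interval check.

```python
# Minimal sketch of a collision-based rejection filter over axis-aligned
# cuboids. Hypothetical names throughout; not the paper's implementation.
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Cuboid:
    center: tuple[float, float, float]  # (x, y, z) of the cuboid center
    size: tuple[float, float, float]    # full extents along each axis

def axis_overlap(c1: Cuboid, c2: Cuboid, axis: int) -> float:
    """Length of the overlap between two cuboids along one axis."""
    lo = max(c1.center[axis] - c1.size[axis] / 2,
             c2.center[axis] - c2.size[axis] / 2)
    hi = min(c1.center[axis] + c1.size[axis] / 2,
             c2.center[axis] + c2.size[axis] / 2)
    return max(0.0, hi - lo)

def overlap_volume(c1: Cuboid, c2: Cuboid) -> float:
    """Intersection volume of two axis-aligned cuboids (0 if disjoint)."""
    v = 1.0
    for axis in range(3):
        v *= axis_overlap(c1, c2, axis)
    return v

def accept_scene(cuboids: list[Cuboid], tol: float = 1e-6) -> bool:
    """Reject a sampled scene if any pair of cuboids intersects."""
    return all(overlap_volume(a, b) <= tol
               for a, b in combinations(cuboids, 2))
```

In the spirit of the abstract's rejection-sampling step, sampled scenes that fail a check like accept_scene would simply be discarded before being used during fine-tuning.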
Related papers
- Mixed Diffusion for 3D Indoor Scene Synthesis [55.94569112629208]
We present MiDiffusion, a novel mixed discrete-continuous diffusion model designed to synthesize plausible 3D indoor scenes. We show it outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis.
arXiv Detail & Related papers (2024-05-31T17:54:52Z)
- Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild [22.82439286651921]
We propose a unified regression model that integrates segmentation and reconstruction, specifically designed for 3D shape reconstruction. We also introduce a scalable data synthesis pipeline that simulates a wide range of variations in objects, occluders, and backgrounds. Training on our synthetic data enables the proposed model to achieve state-of-the-art zero-shot results on real-world images.
arXiv Detail & Related papers (2024-03-21T16:40:10Z)
- SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
- Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives [70.32817882783608]
We present an approach that produces a simple, compact, and actionable 3D world representation by means of 3D primitives.
Unlike existing primitive decomposition methods that rely on 3D input data, our approach operates directly on images.
We show that the resulting textured primitives faithfully reconstruct the input images and accurately model the visible 3D points.
arXiv Detail & Related papers (2023-07-11T17:58:31Z)
- DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis [44.521452102413534]
We present DiffuScene for indoor 3D scene synthesis based on a novel scene configuration denoising diffusion model.
It generates 3D instance properties stored in an unordered object set and retrieves the most similar geometry for each object configuration.
arXiv Detail & Related papers (2023-03-24T18:00:15Z)
- Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes [50.317223783035075]
We present a new framework to reconstruct holistic 3D indoor scenes from single-view images.
We propose an instance-aligned implicit function (InstPIFu) for detailed object reconstruction.
Our code and model will be made publicly available.
arXiv Detail & Related papers (2022-07-18T14:54:57Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of synthetic datasets, which consist of CAD object models, to boost learning on real datasets.
Recent work on 3D pre-training exhibits failure when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step (a toy sketch of this refinement loop follows the list below).
This requires designing a policy that only receives a reward after several steps, so we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
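The last entry above describes refining a 3D box one parameter per step under a learned policy. As a rough, non-learned stand-in that illustrates only the loop structure, here is a greedy sketch; refine and score_fn are hypothetical names, and the actual method optimizes the policy with reinforcement learning rather than exhaustive greedy search.

```python
# Toy illustration of one-parameter-per-step 3D box refinement.
# A real system would learn the policy with RL; this greedy stand-in
# just picks whichever single-parameter nudge most improves a score.
from typing import Callable, List

def refine(box: List[float],
           score_fn: Callable[[List[float]], float],
           step: float = 0.05,
           n_steps: int = 50) -> List[float]:
    """Greedily adjust exactly one box parameter per iteration."""
    box = list(box)  # e.g. [x, y, z, w, h, l, yaw]
    for _ in range(n_steps):
        best_score, best_edit = score_fn(box), None
        for i in range(len(box)):
            for delta in (-step, step):
                cand = box.copy()
                cand[i] += delta
                s = score_fn(cand)
                if s > best_score:
                    best_score, best_edit = s, (i, delta)
        if best_edit is None:   # no single-parameter change helps; stop
            break
        i, delta = best_edit
        box[i] += delta         # apply exactly one change per step
    return box
```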
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.