Generative Blocks World: Moving Things Around in Pictures
- URL: http://arxiv.org/abs/2506.20703v1
- Date: Wed, 25 Jun 2025 17:59:55 GMT
- Title: Generative Blocks World: Moving Things Around in Pictures
- Authors: Vaibhav Vavilala, Seemandhar Jain, Rahul Vasanth, D. A. Forsyth, Anand Bhattad
- Abstract summary: Our method represents scenes as assemblies of convex 3D primitives. The same scene can be represented by different numbers of primitives, allowing an editor to move either whole structures or small details. Our texture hint takes the modified 3D primitives into account, exceeding the texture consistency provided by existing key-value caching techniques.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We describe Generative Blocks World, a system for interacting with the scene of a generated image by manipulating simple geometric abstractions. Our method represents scenes as assemblies of convex 3D primitives, and the same scene can be represented by different numbers of primitives, allowing an editor to move either whole structures or small details. Once the scene geometry has been edited, the image is generated by a flow-based method conditioned on depth and a texture hint. Our texture hint takes the modified 3D primitives into account, exceeding the texture consistency provided by existing key-value caching techniques. These texture hints (a) allow accurate object and camera moves and (b) largely preserve the identity of the objects depicted. Quantitative and qualitative experiments demonstrate that our approach outperforms prior work in visual fidelity, editability, and compositional generalization.
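To make the geometric idea concrete, here is a minimal, hypothetical sketch of the editing loop the abstract describes: a scene held as convex 3D primitives, an edit applied as a rigid transform to one primitive, and a depth map re-rendered from the edited geometry to condition the image generator. The class name, the point-hull representation, and the toy z-buffer renderer are illustrative assumptions, not the paper's actual code or API.

```python
# Hypothetical sketch (not the authors' implementation): a scene as convex
# primitives, an edit as a rigid transform, and a toy depth render that
# plays the role of the geometric conditioning signal.
import numpy as np

class ConvexPrimitive:
    """A convex primitive stored as the 3D points whose hull defines it."""
    def __init__(self, points):
        self.points = np.asarray(points, dtype=float)  # shape (N, 3)

    def transformed(self, R, t):
        """Return a rigidly moved copy: x -> R @ x + t."""
        return ConvexPrimitive(self.points @ np.asarray(R).T + np.asarray(t))

def render_depth(primitives, K, size=(64, 64)):
    """Toy z-buffer: project each primitive's defining points with pinhole
    intrinsics K and keep the nearest depth per pixel. A real system would
    rasterize the convex hulls; points suffice to illustrate the idea."""
    h, w = size
    depth = np.full((h, w), np.inf)
    for prim in primitives:
        pts = prim.points
        proj = pts @ K.T                          # pinhole projection
        u = (proj[:, 0] / proj[:, 2]).astype(int)
        v = (proj[:, 1] / proj[:, 2]).astype(int)
        z = pts[:, 2]
        keep = (z > 0) & (0 <= u) & (u < w) & (0 <= v) & (v < h)
        for ui, vi, zi in zip(u[keep], v[keep], z[keep]):
            depth[vi, ui] = min(depth[vi, ui], zi)
    return depth

# Editing = move one primitive, then re-render depth for the generator.
K = np.array([[50.0, 0.0, 32.0],
              [0.0, 50.0, 32.0],
              [0.0, 0.0, 1.0]])
rng = np.random.default_rng(0)
box = ConvexPrimitive(rng.random((30, 3)) + [0.0, 0.0, 3.0])
moved = box.transformed(np.eye(3), [0.5, 0.0, 0.0])
scene_depth = render_depth([box, moved], K)
print("nearest visible depth:", scene_depth[np.isfinite(scene_depth)].min())
```

In the paper's pipeline, the re-rendered depth is paired with a texture hint computed from the moved primitives before being fed to the flow-based generator; this sketch stops at the depth map, the geometric half of that conditioning.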
Related papers
- EASI-Tex: Edge-Aware Mesh Texturing from Single Image [12.942796503696194]
We present a novel approach to single-image mesh texturing, which employs a diffusion model with conditioning to seamlessly transfer an object's texture to a given 3D mesh object.
We do not assume that the two objects belong to the same category, and even if they do, there can be discrepancies in their geometry and part proportions.
arXiv Detail & Related papers (2024-05-27T17:46:22Z) - TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion [64.49276500129092]
TextureDreamer is an image-guided texture synthesis method.
It can transfer relightable textures from a small number of input images to target 3D shapes across arbitrary categories.
arXiv Detail & Related papers (2024-01-17T18:55:49Z) - CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting [57.14748263512924]
CG3D is a method for compositionally generating scalable 3D assets.
Gaussian radiance fields, parameterized to allow for compositions of objects, enable semantically and physically consistent scenes.
arXiv Detail & Related papers (2023-11-29T18:55:38Z) - Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives [70.32817882783608]
We present an approach that produces a simple, compact, and actionable 3D world representation by means of 3D primitives.
Unlike existing primitive decomposition methods that rely on 3D input data, our approach operates directly on images.
We show that the resulting textured primitives faithfully reconstruct the input images and accurately model the visible 3D points.
arXiv Detail & Related papers (2023-07-11T17:58:31Z) - Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves upon photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z) - TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering [54.35405028643051]
We present a new pipeline for acquiring a textured mesh in the wild with a single smartphone.
Our method first introduces an RGBD-aided structure-from-motion stage, which can yield filtered depth maps.
We then adopt a neural implicit surface reconstruction method, which allows for high-quality meshes.
arXiv Detail & Related papers (2023-03-27T10:07:52Z) - Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models [21.622420436349245]
We present Text2Room, a method for generating room-scale textured 3D meshes from a given text prompt as input.
We leverage pre-trained 2D text-to-image models to synthesize a sequence of images from different poses.
In order to lift these outputs into a consistent 3D scene representation, we combine monocular depth estimation with a text-conditioned inpainting model.
arXiv Detail & Related papers (2023-03-21T16:21:02Z) - TEXTure: Text-Guided Texturing of 3D Shapes [71.13116133846084]
We present TEXTure, a novel method for text-guided generation, editing, and transfer of textures for 3D shapes.
We define a trimap partitioning process that generates seamless 3D textures without requiring explicit surface textures.
arXiv Detail & Related papers (2023-02-03T13:18:45Z)