2D Instance Editing in 3D Space
- URL: http://arxiv.org/abs/2507.05819v1
- Date: Tue, 08 Jul 2025 09:38:39 GMT
- Title: 2D Instance Editing in 3D Space
- Authors: Yuhuan Xie, Aoxuan Pan, Ming-Xian Lin, Wei Huang, Yi-Hua Huang, Xiaojuan Qi,
- Abstract summary: We introduce a novel "2D-3D-2D" framework for 2D image editing.<n>Our approach begins by lifting 2D objects into 3D representation, enabling edits within a physically plausible, rigidity-constrained 3D environment.<n>In contrast to existing 2D editing methods, such as DragGAN and DragDiffusion, our method directly manipulates objects in a 3D environment.
- Score: 39.53225056350435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative models have achieved significant progress in advancing 2D image editing, demonstrating exceptional precision and realism. However, they often struggle with consistency and object identity preservation due to their inherent pixel-manipulation nature. To address this limitation, we introduce a novel "2D-3D-2D" framework. Our approach begins by lifting 2D objects into 3D representation, enabling edits within a physically plausible, rigidity-constrained 3D environment. The edited 3D objects are then reprojected and seamlessly inpainted back into the original 2D image. In contrast to existing 2D editing methods, such as DragGAN and DragDiffusion, our method directly manipulates objects in a 3D environment. Extensive experiments highlight that our framework surpasses previous methods in general performance, delivering highly consistent edits while robustly preserving object identity.
Related papers
- DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
arXiv Detail & Related papers (2024-05-09T14:34:05Z) - DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing [72.54566271694654]
We consider the problem of editing 3D objects and scenes based on open-ended language instructions.<n>A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process.<n>This process is often inefficient due to the need for iterative updates of costly 3D representations.
arXiv Detail & Related papers (2024-04-29T17:59:30Z) - Image Sculpting: Precise Object Editing with 3D Geometry Control [33.9777412846583]
Image Sculpting is a new framework for editing 2D images by incorporating tools from 3D geometry and graphics.
It supports precise, quantifiable, and physically-plausible editing options such as pose editing, rotation, translation, 3D composition, carving, and serial addition.
arXiv Detail & Related papers (2024-01-02T18:59:35Z) - BlobGAN-3D: A Spatially-Disentangled 3D-Aware Generative Model for
Indoor Scenes [42.61598924639625]
We propose BlobGAN-3D, which is a 3D-aware improvement of the original 2D BlobGAN.
We enable explicit camera pose control while maintaining the disentanglement for individual objects in the scene.
We show that our method can achieve comparable image quality compared to the 2D BlobGAN and other 3D-aware GAN baselines.
arXiv Detail & Related papers (2023-03-26T12:23:11Z) - Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing.
A corresponding generic 3D GAN inversion framework is still missing, limiting the applications of 3D face reconstruction and semantic editing.
We study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.
arXiv Detail & Related papers (2022-12-14T18:49:50Z) - XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space.
The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing.
We show both quantitatively and qualitatively that our method is highly effective at various tasks such as 3D shape generation, single view reconstruction and shape manipulation, while being significantly faster and more flexible compared to recent 3D generative models.
arXiv Detail & Related papers (2022-10-06T15:54:01Z) - Cross-Modal 3D Shape Generation and Manipulation [62.50628361920725]
We propose a generic multi-modal generative model that couples the 2D modalities and implicit 3D representations through shared latent spaces.
We evaluate our framework on two representative 2D modalities of grayscale line sketches and rendered color images.
arXiv Detail & Related papers (2022-07-24T19:22:57Z) - Do 2D GANs Know 3D Shape? Unsupervised 3D shape reconstruction from 2D
Image GANs [156.1209884183522]
State-of-the-art 2D generative models like GANs show unprecedented quality in modeling the natural image manifold.
We present the first attempt to directly mine 3D geometric cues from an off-the-shelf 2D GAN that is trained on RGB images only.
arXiv Detail & Related papers (2020-11-02T09:38:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.